Reference for ultralytics/data/split.py
Note
This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/split.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
ultralytics.data.split.split_classify_dataset
split_classify_dataset(
source_dir: Union[str, Path], train_ratio: float = 0.8
) -> Path
Split a classification dataset into train and val subsets written to a new directory.
Creates a new directory '{source_dir}_split' containing train/ and val/ subdirectories that preserve the original class structure. The default split is 80/20.
Directory structure
Before:
    caltech/
    ├── class1/
    │   ├── img1.jpg
    │   ├── img2.jpg
    │   └── ...
    ├── class2/
    │   ├── img1.jpg
    │   └── ...
    └── ...
After:
    caltech_split/
    ├── train/
    │   ├── class1/
    │   │   ├── img1.jpg
    │   │   └── ...
    │   ├── class2/
    │   │   ├── img1.jpg
    │   │   └── ...
    │   └── ...
    └── val/
        ├── class1/
        │   ├── img2.jpg
        │   └── ...
        ├── class2/
        │   └── ...
        └── ...
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source_dir | str \| Path | Path to classification dataset root directory. | required |
train_ratio | float | Ratio for train split, between 0 and 1. | 0.8 |
Returns:
Type | Description |
---|---|
Path | Path to the created split directory. |
Examples:
Split dataset with default 80/20 ratio
>>> from ultralytics.data.split import split_classify_dataset
>>> split_classify_dataset("path/to/caltech")
Split with custom ratio
>>> split_classify_dataset("path/to/caltech", 0.75)
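To make the directory layout above concrete, here is a minimal standard-library sketch of a per-class train/val split. It is not the Ultralytics implementation: the function name `sketch_split_classify`, the `seed` parameter, and the choice to copy files with `shutil.copy2` are assumptions made purely for illustration.

```python
import random
import shutil
from pathlib import Path


def sketch_split_classify(source_dir, train_ratio=0.8, seed=0):
    """Hypothetical helper: copy each class folder's files into train/ and val/."""
    source = Path(source_dir)
    split_root = Path(f"{source}_split")  # mirrors the '{source_dir}_split' naming
    rng = random.Random(seed)  # assumption: seeded so the sketch is reproducible

    for class_dir in sorted(p for p in source.iterdir() if p.is_dir()):
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(files)
        n_train = int(len(files) * train_ratio)  # the remaining files go to val
        for subset, chunk in (("train", files[:n_train]), ("val", files[n_train:])):
            target = split_root / subset / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for f in chunk:
                shutil.copy2(f, target / f.name)  # assumption: copy, not link
    return split_root
```

Because the ratio is applied per class, each class keeps roughly the same train/val proportion, which is what preserves the class structure in both subsets.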
Source code in ultralytics/data/split.py (lines 12 to 95)
ultralytics.data.split.autosplit
autosplit(
path: Path = DATASETS_DIR / "coco8/images",
weights: Tuple[float, float, float] = (0.9, 0.1, 0.0),
annotated_only: bool = False,
) -> None
Automatically split a dataset into train/val/test splits and save the resulting splits into autosplit_*.txt files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | Path | Path to images directory. | DATASETS_DIR / 'coco8/images' |
weights | tuple | Train, validation, and test split fractions. | (0.9, 0.1, 0.0) |
annotated_only | bool | If True, only images with an associated txt file are used. | False |
Examples:
Split images with default weights
>>> from ultralytics.data.split import autosplit
>>> autosplit()
Split with custom weights and annotated images only
>>> autosplit(path="path/to/images", weights=(0.8, 0.15, 0.05), annotated_only=True)
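One way to picture a weighted train/val/test assignment is to draw a split index per image with `random.choices` and append the image path to the matching text file. The sketch below is illustrative only and is not the library's code: the function name `sketch_autosplit`, the `seed` parameter, the exact output file names, the set of image suffixes, and the rule that a label is a `.txt` file next to the image are all assumptions.

```python
import random
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumption: illustrative subset


def sketch_autosplit(path, weights=(0.9, 0.1, 0.0), annotated_only=False, seed=0):
    """Hypothetical helper: randomly assign images to train/val/test by weight."""
    path = Path(path)
    images = sorted(p for p in path.rglob("*") if p.suffix.lower() in IMAGE_SUFFIXES)
    # Assumed file names following the autosplit_*.txt pattern.
    names = ["autosplit_train.txt", "autosplit_val.txt", "autosplit_test.txt"]
    outputs = [path.parent / n for n in names]
    for out in outputs:
        out.unlink(missing_ok=True)  # remove results from any earlier run

    rng = random.Random(seed)
    picks = rng.choices([0, 1, 2], weights=weights, k=len(images))
    counts = [0, 0, 0]
    for idx, img in zip(picks, images):
        # Assumption: the label is a .txt file beside the image with the same stem.
        if annotated_only and not img.with_suffix(".txt").exists():
            continue
        with outputs[idx].open("a") as fh:
            fh.write(f"{img.relative_to(path.parent).as_posix()}\n")
        counts[idx] += 1
    return counts  # number of images written to each split file
```

Since each image is assigned independently, the realised split sizes only approximate the requested fractions, and a weight of 0.0 leaves the corresponding split empty.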
Source code in ultralytics/data/split.py (lines 98 to 134)