Reference for hub_sdk/modules/datasets.py
Note
This file is available at https://github.com/ultralytics/hub-sdk/blob/main/hub_sdk/modules/datasets.py. If you spot a problem, please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
hub_sdk.modules.datasets.Datasets
Datasets(
dataset_id: Optional[str] = None, headers: Optional[Dict[str, Any]] = None
)
Bases: CRUDClient
A class representing a client for interacting with Datasets through CRUD operations.
This class extends the CRUDClient class and provides specific methods for working with Datasets.
Attributes:
Name | Type | Description |
---|---|---|
`hub_client` | `DatasetUpload` | An instance of DatasetUpload used for interacting with dataset uploads. |
`id` | `str \| None` | The unique identifier of the dataset, if available. |
`data` | `Dict` | A dictionary to store dataset data. |
Note
The 'id' attribute is set during initialization and can be used to uniquely identify a dataset. The 'data' attribute is used to store dataset data fetched from the API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`dataset_id` | `str` | Unique id of the dataset. | `None` |
`headers` | `Dict` | Headers to include in HTTP requests. | `None` |
Source code in hub_sdk/modules/datasets.py
create_dataset
create_dataset(dataset_data: Dict) -> None
Create a new dataset with the provided data and set the dataset ID for the current instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`dataset_data` | `Dict` | A dictionary containing the data for creating the dataset. | required |
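Because `create_dataset` returns `None` and stores the new ID on the instance, a caller typically reads `id` afterwards. The sketch below illustrates that flow under stated assumptions: the helper name `create_and_get_id` is hypothetical, and `dataset` is assumed to be a `Datasets` instance (or any object exposing the same `create_dataset` method and `id` attribute). The payload schema is not documented on this page, so the dictionary is passed through unchanged.

```python
from typing import Any, Dict, Optional


def create_and_get_id(dataset: Any, dataset_data: Dict[str, Any]) -> Optional[str]:
    """Create a dataset via a Datasets-like client and return the ID it stored.

    Hypothetical helper: `dataset` is assumed to expose `create_dataset()` and
    an `id` attribute, as documented for hub_sdk.modules.datasets.Datasets.
    """
    # create_dataset() returns None; the new ID is set on the instance.
    dataset.create_dataset(dataset_data)
    # Read the ID back; it may still be None if creation did not succeed.
    return getattr(dataset, "id", None)
```

With the real client this might be called as `create_and_get_id(Datasets(headers=headers), payload)`, where `headers` and `payload` are whatever your setup and the API require.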
delete
delete(hard: bool = False) -> Optional[Response]
Delete the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`hard` | `bool` | If True, perform a hard delete. | `False` |
Note
The 'hard' parameter determines whether to perform a soft delete (default) or a hard delete. In a soft delete, the dataset might be marked as deleted but retained in the system. In a hard delete, the dataset is permanently removed from the system.
Returns:
Type | Description |
---|---|
`Optional[Response]` | Response object from the delete request, or None if delete fails. |
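Since a failed delete is reported as `None` rather than an exception, callers may want to check the return value explicitly. The following is a minimal sketch of that check; the helper name `delete_succeeded` is hypothetical, and `dataset` is assumed to behave like a `Datasets` instance.

```python
from typing import Any


def delete_succeeded(dataset: Any, hard: bool = False) -> bool:
    """Delete a dataset and report whether a response came back.

    Hypothetical helper: `dataset` is assumed to expose `delete(hard=...)`
    returning a response object, or None when the delete fails.
    """
    # hard=False requests a soft delete (the documented default).
    response = dataset.delete(hard=hard)
    # None is the documented failure signal.
    return response is not None
```

This only checks that a response was returned. Inspecting the response itself (for example its status) is left to the caller, since the response type is not detailed on this page.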
get_data
get_data() -> None
Retrieve data for the current dataset instance.
If a valid dataset ID has been set, it sends a request to fetch the dataset data and stores it in the instance. If no dataset ID has been set, it logs an error message.
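Note that `get_data` returns `None`: the fetched data lands in the instance's `data` attribute. A minimal sketch of reading it back follows; the helper name `fetch_data` is hypothetical, and `dataset` is assumed to behave like a `Datasets` instance.

```python
from typing import Any, Dict


def fetch_data(dataset: Any) -> Dict[str, Any]:
    """Refresh a Datasets-like client and return a copy of its stored data.

    Hypothetical helper: `dataset` is assumed to expose `get_data()` and a
    `data` dictionary, as documented for hub_sdk.modules.datasets.Datasets.
    """
    # get_data() returns None; results are stored on the instance.
    dataset.get_data()
    # Copy so callers cannot mutate the client's internal dictionary.
    return dict(dataset.data or {})
```

If no dataset ID has been set, the call only logs an error, so this helper returns whatever `data` already held, which may be an empty dictionary.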
get_download_link
get_download_link() -> Optional[str]
Get dataset download link.
Returns:
Type | Description |
---|---|
`Optional[str]` | The download link, or None if the link is not available. |
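Because the return value is optional, code that needs a link should handle the `None` case. One possible approach, sketched under assumptions, is to convert `None` into an exception; the helper name `require_download_link` is hypothetical, and `dataset` is assumed to behave like a `Datasets` instance.

```python
from typing import Any


def require_download_link(dataset: Any) -> str:
    """Return the dataset's download link, or raise if none is available.

    Hypothetical helper: `dataset` is assumed to expose `get_download_link()`
    returning a string or None.
    """
    link = dataset.get_download_link()
    if link is None:
        # Turn the documented "not available" signal into an explicit error.
        raise ValueError("No download link is available for this dataset.")
    return link
```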
update
update(data: Dict) -> Optional[Response]
Update the dataset resource represented by this instance.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`data` | `Dict` | The updated data for the dataset resource. | required |
Returns:
Type | Description |
---|---|
`Optional[Response]` | Response object from the update request, or None if update fails. |
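This page does not say whether `update` also refreshes the locally stored `data`, so the sketch below takes the cautious route and calls `get_data` explicitly after a successful update. The helper name `update_and_refresh` is hypothetical, and `dataset` is assumed to behave like a `Datasets` instance.

```python
from typing import Any, Dict


def update_and_refresh(dataset: Any, changes: Dict[str, Any]) -> bool:
    """Apply an update and, if it succeeded, re-fetch the dataset data.

    Hypothetical helper: `dataset` is assumed to expose `update(data)` and
    `get_data()`, as documented for hub_sdk.modules.datasets.Datasets.
    """
    response = dataset.update(changes)
    if response is None:
        # None is the documented failure signal; leave local data untouched.
        return False
    # Assumption: the local copy may be stale after an update, so refresh it.
    dataset.get_data()
    return True
```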
upload_dataset
upload_dataset(file: str = None) -> Optional[Response]
Upload a dataset file to the hub.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`file` | `str` | The path to the dataset file to upload. | `None` |
Returns:
Type | Description |
---|---|
`Optional[Response]` | Response object from the upload request, or None if upload fails. |
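Since `file` is a path string, it can be useful to confirm that the file exists before attempting the upload, so that a missing file is not confused with a failed request. The sketch below is one way to do this; the helper name `upload_existing_file` is hypothetical, and `dataset` is assumed to behave like a `Datasets` instance.

```python
from pathlib import Path
from typing import Any, Optional


def upload_existing_file(dataset: Any, file_path: str) -> Optional[Any]:
    """Upload a dataset file after confirming that it exists locally.

    Hypothetical helper: `dataset` is assumed to expose `upload_dataset(file=...)`
    returning a response object, or None when the upload fails.
    """
    path = Path(file_path)
    if not path.is_file():
        # Fail early and distinctly, instead of returning None like a failed upload.
        raise FileNotFoundError(f"Dataset file not found: {file_path}")
    # Pass the path as a string, matching the documented parameter type.
    return dataset.upload_dataset(file=str(path))
```

Raising for a missing file keeps `None` reserved for the documented meaning, a failed upload request.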
hub_sdk.modules.datasets.DatasetList
DatasetList(page_size=None, public=None, headers=None)
Bases: PaginatedList
A class for managing a paginated list of datasets from the Ultralytics Hub API.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`page_size` | `int` | The number of items to request per page. | `None` |
`public` | `bool` | Whether the items should be publicly accessible. | `None` |
`headers` | `Dict` | Headers to be included in API requests. | `None` |