Reference for `ultralytics/models/rtdetr/val.py`

Note

This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/rtdetr/val.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!

ultralytics.models.rtdetr.val.RTDETRDataset

RTDETRDataset(*args, data=None, **kwargs)

Bases: YOLODataset

Real-Time DEtection TRansformer (RT-DETR) dataset class extending the base YOLODataset class.

This specialized dataset class is designed for use with the RT-DETR object detection model and is intended for real-time detection tasks.

Attributes:

Name	Type	Description
`augment`	`bool`	Whether to apply data augmentation.
`rect`	`bool`	Whether to use rectangular training.
`use_segments`	`bool`	Whether to use segmentation masks.
`use_keypoints`	`bool`	Whether to use keypoint annotations.
`imgsz`	`int`	Target image size for training.

Methods:

Name	Description
`load_image`	Load one image from dataset index.
`build_transforms`	Build transformation pipeline for the dataset.

Examples:

Initialize an RT-DETR dataset

>>> dataset = RTDETRDataset(img_path="path/to/images", imgsz=640)
>>> image, hw = dataset.load_image(0)

This constructor sets up a dataset for the RT-DETR (Real-Time DEtection TRansformer) model, building upon the base YOLODataset functionality.

Parameters:

Name	Type	Description	Default
`*args`	`Any`	Variable length argument list passed to the parent YOLODataset class.	`()`
`data`	`dict \| None`	Dictionary containing dataset information. If None, default values will be used.	`None`
`**kwargs`	`Any`	Additional keyword arguments passed to the parent YOLODataset class.	`{}`

Source code in ultralytics/models/rtdetr/val.py

def __init__(self, *args, data=None, **kwargs):
    """
    Initialize the RTDETRDataset class by inheriting from the YOLODataset class.

    This constructor sets up a dataset for the RT-DETR (Real-Time DEtection TRansformer) model, building upon
    the base YOLODataset functionality.

    Args:
        *args (Any): Variable length argument list passed to the parent YOLODataset class.
        data (dict | None): Dictionary containing dataset information. If None, default values will be used.
        **kwargs (Any): Additional keyword arguments passed to the parent YOLODataset class.
    """
    super().__init__(*args, data=data, **kwargs)

build_transforms

build_transforms(hyp=None)

Build transformation pipeline for the dataset.

Parameters:

Name	Type	Description	Default
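`hyp`	`dict`	Hyperparameters for transformations. Although the default is `None`, the method reads `hyp.mask_ratio` and `hyp.overlap_mask` unconditionally, so an object exposing these as attributes must be supplied.	`None`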

Returns:

Type	Description
`Compose`	Composition of transformation functions.

Source code in ultralytics/models/rtdetr/val.py

def build_transforms(self, hyp=None):
    """
    Build transformation pipeline for the dataset.

    Args:
        hyp (dict, optional): Hyperparameters for transformations.

    Returns:
        (Compose): Composition of transformation functions.
    """
    if self.augment:
        hyp.mosaic = hyp.mosaic if self.augment and not self.rect else 0.0
        hyp.mixup = hyp.mixup if self.augment and not self.rect else 0.0
        hyp.cutmix = hyp.cutmix if self.augment and not self.rect else 0.0
        transforms = v8_transforms(self, self.imgsz, hyp, stretch=True)
    else:
        # transforms = Compose([LetterBox(new_shape=(self.imgsz, self.imgsz), auto=False, scale_fill=True)])
        transforms = Compose([])
    transforms.append(
        Format(
            bbox_format="xywh",
            normalize=True,
            return_mask=self.use_segments,
            return_keypoint=self.use_keypoints,
            batch_idx=True,
            mask_ratio=hyp.mask_ratio,
            mask_overlap=hyp.overlap_mask,
        )
    )
    return transforms
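
The gating of `mosaic`, `mixup`, and `cutmix` in the augment branch can be illustrated in isolation. The following is a minimal sketch, not the library's code: `gate_augmentations` is a hypothetical helper, and `SimpleNamespace` is an assumed stand-in for whatever attribute-style hyperparameter object is actually passed as `hyp`.

```python
from types import SimpleNamespace


def gate_augmentations(hyp, augment, rect):
    """Return a copy of hyp with multi-image augmentations zeroed when they cannot apply.

    Hypothetical helper that mirrors the conditional used in build_transforms.
    """
    gated = SimpleNamespace(**vars(hyp))  # copy so the caller's object is untouched
    enabled = augment and not rect
    for name in ("mosaic", "mixup", "cutmix"):
        setattr(gated, name, getattr(hyp, name) if enabled else 0.0)
    return gated


# Illustrative values only; these are not the library defaults.
hyp = SimpleNamespace(mosaic=1.0, mixup=0.1, cutmix=0.2, mask_ratio=4, overlap_mask=True)
print(gate_augmentations(hyp, augment=True, rect=False))  # probabilities kept
print(gate_augmentations(hyp, augment=True, rect=True))  # probabilities zeroed
```

Unlike this sketch, the method shown above assigns the gated values back onto `hyp` itself, so the caller's object is modified in place.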

load_image

load_image(i, rect_mode=False)

Load one image from dataset index 'i'.

Parameters:

Name	Type	Description	Default
`i`	`int`	Index of the image to load.	required
`rect_mode`	`bool`	Whether to use rectangular mode for batch inference.	`False`

Returns:

Name	Type	Description
`im`	`Tensor`	The loaded image.
`resized_hw`	`tuple`	Height and width of the resized image with shape (2,).

Examples:

Load an image from the dataset

>>> dataset = RTDETRDataset(img_path="path/to/images")
>>> image, hw = dataset.load_image(0)

Source code in ultralytics/models/rtdetr/val.py

def load_image(self, i, rect_mode=False):
    """
    Load one image from dataset index 'i'.

    Args:
        i (int): Index of the image to load.
        rect_mode (bool, optional): Whether to use rectangular mode for batch inference.

    Returns:
        im (torch.Tensor): The loaded image.
        resized_hw (tuple): Height and width of the resized image with shape (2,).

    Examples:
        Load an image from the dataset
        >>> dataset = RTDETRDataset(img_path="path/to/images")
        >>> image, hw = dataset.load_image(0)
    """
    return super().load_image(i=i, rect_mode=rect_mode)

ultralytics.models.rtdetr.val.RTDETRValidator

RTDETRValidator(dataloader=None, save_dir=None, args=None, _callbacks=None)

Bases: DetectionValidator

RTDETRValidator extends the DetectionValidator class to provide validation capabilities specifically tailored for the RT-DETR (Real-Time DETR) object detection model.

The class builds an RT-DETR-specific dataset for validation, post-processes predictions by confidence thresholding (the `postprocess` source on this page performs no non-maximum suppression), and updates evaluation metrics accordingly.

Attributes:

Name	Type	Description
`args`	`Namespace`	Configuration arguments for validation.
`data`	`dict`	Dataset configuration dictionary.

Methods:

Name	Description
`build_dataset`	Build an RTDETR Dataset for validation.
`postprocess`	Convert raw predictions to pixel-space boxes and filter them by confidence.

Examples:

Initialize and run RT-DETR validation

>>> from ultralytics.models.rtdetr import RTDETRValidator
>>> args = dict(model="rtdetr-l.pt", data="coco8.yaml")
>>> validator = RTDETRValidator(args=args)
>>> validator()

Notes

For further details on the attributes and methods, refer to the parent DetectionValidator class.

Source code in ultralytics/models/yolo/detect/val.py

def __init__(self, dataloader=None, save_dir=None, args=None, _callbacks=None) -> None:
    """
    Initialize detection validator with necessary variables and settings.

    Args:
        dataloader (torch.utils.data.DataLoader, optional): Dataloader to use for validation.
        save_dir (Path, optional): Directory to save results.
        args (Dict[str, Any], optional): Arguments for the validator.
        _callbacks (List[Any], optional): List of callback functions.
    """
    super().__init__(dataloader, save_dir, args, _callbacks)
    self.is_coco = False
    self.is_lvis = False
    self.class_map = None
    self.args.task = "detect"
    self.iouv = torch.linspace(0.5, 0.95, 10)  # IoU vector for mAP@0.5:0.95
    self.niou = self.iouv.numel()
    self.metrics = DetMetrics()
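
The `iouv` attribute set above holds the ten IoU thresholds over which mAP@0.5:0.95 is averaged. As a rough illustration, the same values can be reproduced without PyTorch; the variable names below are assumptions made for this sketch.

```python
# Pure-Python equivalent of torch.linspace(0.5, 0.95, 10), for illustration only.
start, stop, steps = 0.5, 0.95, 10
iou_thresholds = [round(start + i * (stop - start) / (steps - 1), 2) for i in range(steps)]

print(iou_thresholds)  # evenly spaced thresholds from 0.5 to 0.95
print(len(iou_thresholds))  # corresponds to self.niou
```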

build_dataset

build_dataset(img_path, mode='val', batch=None)

Build an RTDETR Dataset.

Parameters:

Name	Type	Description	Default
`img_path`	`str`	Path to the folder containing images.	required
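`mode`	`str`	Either `train` or `val`. In this method it only sets the log prefix; augmentation is disabled regardless.	`'val'`
`batch`	`int`	Batch size, forwarded to the dataset as `batch_size`. It matters for rectangular batching, which is disabled here (`rect=False`).	`None`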

Returns:

Type	Description
`RTDETRDataset`	Dataset configured for RT-DETR validation.

Source code in ultralytics/models/rtdetr/val.py

def build_dataset(self, img_path, mode="val", batch=None):
    """
    Build an RTDETR Dataset.

    Args:
        img_path (str): Path to the folder containing images.
        mode (str, optional): Either `train` or `val`. Here it only sets the log prefix; augmentation is always
            disabled.
        batch (int, optional): Batch size, forwarded as `batch_size`. Only relevant to `rect`, which is disabled.

    Returns:
        (RTDETRDataset): Dataset configured for RT-DETR validation.
    """
    return RTDETRDataset(
        img_path=img_path,
        imgsz=self.args.imgsz,
        batch_size=batch,
        augment=False,  # no augmentation
        hyp=self.args,
        rect=False,  # no rect
        cache=self.args.cache or None,
        prefix=colorstr(f"{mode}: "),
        data=self.data,
    )

postprocess

postprocess(
    preds: Union[Tensor, List[Tensor], Tuple[Tensor]],
) -> List[Dict[str, torch.Tensor]]

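Convert raw predictions into per-image detections filtered by confidence. No non-maximum suppression is applied: boxes are scaled to pixel coordinates, the top-scoring class is taken for each prediction, results are sorted by confidence, and entries with a score not above `args.conf` are dropped.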

Parameters:

Name	Type	Description	Default
`preds`	`Tensor \| List \| Tuple`	Raw predictions from the model. If tensor, should have shape (batch_size, num_predictions, num_classes + 4) where last dimension contains bbox coords and class scores.	required

Returns:

Type	Description
`List[Dict[str, Tensor]]`	One dictionary per image with three keys: `bboxes`, a tensor of shape (N, 4) with bounding box coordinates; `conf`, a tensor of shape (N,) with confidence scores; `cls`, a tensor of shape (N,) with class indices.

Source code in ultralytics/models/rtdetr/val.py

def postprocess(
    self, preds: Union[torch.Tensor, List[torch.Tensor], Tuple[torch.Tensor]]
) -> List[Dict[str, torch.Tensor]]:
    """
    Convert raw predictions into per-image detections filtered by confidence (no non-maximum suppression).

    Args:
        preds (torch.Tensor | List | Tuple): Raw predictions from the model. If tensor, should have shape
            (batch_size, num_predictions, num_classes + 4) where last dimension contains bbox coords and class scores.

    Returns:
        (List[Dict[str, torch.Tensor]]): List of dictionaries for each image, each containing:
            - 'bboxes': Tensor of shape (N, 4) with bounding box coordinates
            - 'conf': Tensor of shape (N,) with confidence scores
            - 'cls': Tensor of shape (N,) with class indices
    """
    if not isinstance(preds, (list, tuple)):  # PyTorch inference yields a list/tuple; export inference a bare Tensor
        preds = [preds, None]

    bs, _, nd = preds[0].shape
    bboxes, scores = preds[0].split((4, nd - 4), dim=-1)
    bboxes *= self.args.imgsz
    outputs = [torch.zeros((0, 6), device=bboxes.device)] * bs
    for i, bbox in enumerate(bboxes):  # (300, 4)
        bbox = ops.xywh2xyxy(bbox)
        score, cls = scores[i].max(-1)  # (300, )
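        pred = torch.cat([bbox, score[..., None], cls[..., None]], dim=-1)  # (300, 6): xyxy, conf, cls
        # Sort by confidence to correctly get internal metrics
        pred = pred[score.argsort(descending=True)]
        # Threshold on the sorted confidence column so the mask lines up with the sorted rows
        outputs[i] = pred[pred[:, 4] > self.args.conf]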

    return [{"bboxes": x[:, :4], "conf": x[:, 4], "cls": x[:, 5]} for x in outputs]
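
The per-image steps of this method (scale, convert `xywh` to `xyxy`, take the best class, sort, threshold) can be traced on plain Python lists. This is a simplified sketch under stated assumptions, not the library implementation: `xywh_to_xyxy` and `postprocess_one` are hypothetical names, the input rows are made-up values, and lists replace tensors.

```python
def xywh_to_xyxy(box):
    """Convert (center_x, center_y, width, height) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]


def postprocess_one(pred_rows, imgsz, conf):
    """Post-process the predictions of a single image.

    pred_rows: rows of [cx, cy, w, h, score_0, ..., score_{nc-1}] with normalized boxes.
    Returns a dict with 'bboxes', 'conf', and 'cls', sorted by descending confidence.
    """
    detections = []
    for row in pred_rows:
        box = xywh_to_xyxy([v * imgsz for v in row[:4]])  # scale to pixels, then convert
        scores = row[4:]
        cls = max(range(len(scores)), key=scores.__getitem__)  # index of the best class
        detections.append((box, scores[cls], cls))
    detections.sort(key=lambda d: d[1], reverse=True)  # highest confidence first
    kept = [d for d in detections if d[1] > conf]  # threshold after sorting
    return {
        "bboxes": [d[0] for d in kept],
        "conf": [d[1] for d in kept],
        "cls": [d[2] for d in kept],
    }


# Three made-up predictions for a two-class problem.
rows = [
    [0.5, 0.5, 0.2, 0.4, 0.10, 0.70],
    [0.25, 0.25, 0.5, 0.5, 0.90, 0.05],
    [0.8, 0.8, 0.1, 0.1, 0.01, 0.02],
]
result = postprocess_one(rows, imgsz=640, conf=0.25)
print(result["cls"], result["conf"])
```

Because the threshold is applied to the already-sorted detections, each kept score stays paired with its own box and class index.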
