Make predictions with ONNX on computer vision models from AutoML

2025-03-11

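Applies to: Python SDK azure-ai-ml v2 (current)

In this article, you learn how to use Open Neural Network Exchange (ONNX) to make predictions on computer vision models generated by automated machine learning (AutoML) in Azure Machine Learning.

To use ONNX for predictions, you need to:

Download ONNX model files from an AutoML training run.
Understand the inputs and outputs of an ONNX model.
Preprocess your data so that it's in the format required for input images.
Perform inference with ONNX Runtime for Python.
Visualize predictions for object detection and instance segmentation tasks.

ONNX is an open standard for machine learning and deep learning models. It enables model import and export (interoperability) across popular AI frameworks. For more details, explore the ONNX GitHub project.

ONNX Runtime is an open-source project that supports cross-platform inference. It provides APIs in several programming languages, including Python, C++, C#, C, Java, and JavaScript. After a model is exported to ONNX format, you can use these APIs to perform inference on input images from whichever language your project needs.

In this guide, you learn how to use the Python APIs for ONNX Runtime to make predictions on images for popular vision tasks. You can use these ONNX exported models across languages.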

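Prerequisites

Get an AutoML-trained computer vision model for any of the supported image tasks: classification, object detection, or instance segmentation. Learn more about AutoML support for computer vision tasks.
Install the onnxruntime package. The methods in this article have been tested with versions 1.3.0 to 1.8.0.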

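Download ONNX model files

You can download ONNX model files from AutoML runs by using the Azure Machine Learning studio UI or the Azure Machine Learning Python SDK. We recommend downloading through the SDK with the experiment name and parent run ID.

Azure Machine Learning studio

In Azure Machine Learning studio, go to your experiment by using the hyperlink to the experiment generated in the training notebook, or by using the Experiments tab under Assets. Then select the best child run.

Within the best child run, go to Outputs+logs > train_artifacts. Use the Download button to manually download the following files:

labels.json: File that contains all the classes or labels in the training dataset.
model.onnx: Model in ONNX format.

Screenshot that shows selections for downloading ONNX model files.

Save the downloaded model files to a directory. The example in this article uses the ./automl_models directory.

Azure Machine Learning Python SDK

With the SDK, you can select the best child run (by primary metric) by using the experiment name and parent run ID. Then you can download the labels.json and model.onnx files.

The following code returns the best child run based on the relevant primary metric.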

import mlflow
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from mlflow.tracking.client import MlflowClient

credential = DefaultAzureCredential()
ml_client = None
try:
    ml_client = MLClient.from_config(credential)
except Exception as ex:
    print(ex)
    # Enter details of your Azure Machine Learning workspace
    subscription_id = ''
    resource_group = ''
    workspace_name = ''
    ml_client = MLClient(credential, subscription_id, resource_group, workspace_name)

# Obtain the tracking URL from MLClient
MLFLOW_TRACKING_URI = ml_client.workspaces.get(
    name=ml_client.workspace_name
).mlflow_tracking_uri

mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)

# Create the MLflow client only after the tracking URI is set,
# so that it points at the workspace
mlflow_client = MlflowClient()

# Specify the job name
job_name = ''

# Get the parent run
mlflow_parent_run = mlflow_client.get_run(job_name)
best_child_run_id = mlflow_parent_run.data.tags['automl_best_child_run_id']
# get the best child run
best_run = mlflow_client.get_run(best_child_run_id)

Download the labels.json file, which contains all the classes and labels in the training dataset.

import os

local_dir = './automl_models'
# Create the download directory if it doesn't exist yet
os.makedirs(local_dir, exist_ok=True)

labels_file = mlflow_client.download_artifacts(
    best_run.info.run_id, 'train_artifacts/labels.json', local_dir
)

Download the model.onnx file.

onnx_model_path = mlflow_client.download_artifacts(
    best_run.info.run_id, 'train_artifacts/model.onnx', local_dir
)

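For batch inference on object detection and instance segmentation with ONNX models, see the section Model generation for batch scoring.

Model generation for batch scoring

By default, AutoML for Images supports batch scoring for classification. However, object detection and instance segmentation ONNX models don't support batch inference. For batch inference on object detection and instance segmentation, use the following procedure to generate an ONNX model for the required batch size. A model generated for a specific batch size doesn't work for other batch sizes.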
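
Because a generated model accepts only the batch size it was built for, a trailing partial batch needs some handling before inference. The following is a minimal sketch, not part of AutoML: the helper name `make_fixed_batches`, the file names, and the choice to pad the last batch by repeating its final item are all assumptions made for illustration.

```python
from typing import List, Tuple


def make_fixed_batches(items: List[str], batch_size: int) -> List[Tuple[List[str], int]]:
    """Split items into batches of exactly batch_size.

    Returns (batch, n_valid) pairs. The last batch is padded by repeating
    its final item; n_valid says how many leading entries are real, so the
    predictions for the padded entries can be discarded.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        n_valid = len(chunk)
        # Pad a short final chunk so the model still sees batch_size inputs.
        chunk = chunk + [chunk[-1]] * (batch_size - n_valid)
        batches.append((chunk, n_valid))
    return batches


# Hypothetical file names, used only to show the shape of the result.
files = [f"img_{i}.jpg" for i in range(10)]
batches = make_fixed_batches(files, batch_size=8)
```

With 10 files and a batch size of 8, the second batch holds 2 real entries followed by 6 padded copies, and only the first 2 predictions from that batch should be kept.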

Download the conda environment file and create an environment object to be used with the command job.

#  Download conda file and define the environment

conda_file = mlflow_client.download_artifacts(
    best_run.info.run_id, "outputs/conda_env_v_1_0_0.yml", local_dir
)
from azure.ai.ml.entities import Environment
env = Environment(
    name="automl-images-env-onnx",
    description="environment for automl images ONNX batch model generation",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.1-cudnn8-ubuntu18.04",
    conda_file=conda_file,
)

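Use the following model-specific arguments to submit the script. For more details on the arguments, see model-specific hyperparameters. For supported object detection model names, see the supported model architectures section.

To get the argument values needed to create the batch scoring model, refer to the scoring scripts generated under the outputs folder of the AutoML training runs. Use the hyperparameter values available in the model settings variable inside the scoring file for the best child run.

For multi-class image classification, the ONNX model generated for the best child run supports batch scoring by default. Therefore, no model-specific arguments are needed for this task type, and you can skip to the Load the labels and ONNX model files section.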

Faster R-CNN or RetinaNet:

inputs = {'model_name': 'fasterrcnn_resnet34_fpn',  # enter the faster rcnn or retinanet model name
         'batch_size': 8,  # enter the batch size of your choice
         'height_onnx': 600,  # enter the height of input to ONNX model
         'width_onnx': 800,  # enter the width of input to ONNX model
         'job_name': job_name,
         'task_type': 'image-object-detection',
         'min_size': 600,  # minimum size of the image to be rescaled before feeding it to the backbone
         'max_size': 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
         'box_score_thresh': 0.3,  # threshold to return proposals with a classification score > box_score_thresh
         'box_nms_thresh': 0.5,  # NMS threshold for the prediction head
         'box_detections_per_img': 100   # maximum number of detections per image, for all classes
         }

YOLO:

inputs = {'model_name': 'yolov5',  # enter the yolo model name
          'batch_size': 8,  # enter the batch size of your choice
          'height_onnx': 640,  # enter the height of input to ONNX model
          'width_onnx': 640,  # enter the width of input to ONNX model
          'job_name': job_name,
          'task_type': 'image-object-detection',
          'img_size': 640,  # image size for inference
          'model_size': 'small',  # size of the yolo model
          'box_score_thresh': 0.1,  # threshold to return proposals with a classification score > box_score_thresh
          'box_iou_thresh': 0.5
        }

Mask R-CNN:

inputs = {'model_name': 'maskrcnn_resnet50_fpn',  # enter the maskrcnn model name
         'batch_size': 8,  # enter the batch size of your choice
         'height_onnx': 600,  # enter the height of input to ONNX model
         'width_onnx': 800,  # enter the width of input to ONNX model
         'job_name': job_name,
         'task_type': 'image-instance-segmentation',
         'min_size': 600,  # minimum size of the image to be rescaled before feeding it to the backbone
         'max_size': 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
         'box_score_thresh': 0.3,  # threshold to return proposals with a classification score > box_score_thresh
         'box_nms_thresh': 0.5,  # NMS threshold for the prediction head
         'box_detections_per_img': 100  # maximum number of detections per image, for all classes
         }

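To submit the script, download the ONNX_batch_model_generator_automl_for_images.py file and keep it in a local directory; the code that follows passes ./onnx_generator_files as the code path. Use the following command job to submit the script and generate an ONNX model of a specific batch size. In the following code, the trained model environment is used to submit this script, which generates the ONNX model and saves it to the outputs directory.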

Faster R-CNN or RetinaNet:

from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --min_size ${{inputs.min_size}} --max_size ${{inputs.max_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_nms_thresh ${{inputs.box_nms_thresh}} --box_detections_per_img ${{inputs.box_detections_per_img}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,
    display_name="ONNX-batch-model-generation-rcnn",
    description="Use the PyTorch to generate ONNX batch scoring model.",
)
returned_job = ml_client.create_or_update(job)
ml_client.jobs.stream(returned_job.name)

YOLO:

from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --img_size ${{inputs.img_size}} --model_size ${{inputs.model_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_iou_thresh ${{inputs.box_iou_thresh}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,
    display_name="ONNX-batch-model-generation",
    description="Use the PyTorch to generate ONNX batch scoring model.",
)
returned_job = ml_client.create_or_update(job)
ml_client.jobs.stream(returned_job.name)

Mask R-CNN:

from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --min_size ${{inputs.min_size}} --max_size ${{inputs.max_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_nms_thresh ${{inputs.box_nms_thresh}} --box_detections_per_img ${{inputs.box_detections_per_img}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,
    display_name="ONNX-batch-model-generation-maskrcnn",
    description="Use the PyTorch to generate ONNX batch scoring model.",
)
returned_job = ml_client.create_or_update(job)
ml_client.jobs.stream(returned_job.name)
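
Each command string above pairs every key of the inputs dictionary with a `--name ${{inputs.name}}` placeholder. As a rough sketch of that pattern, the following hypothetical helper (`build_command` is not part of the Azure Machine Learning SDK) derives the string from the dictionary keys, which keeps the arguments and the inputs from drifting apart.

```python
from typing import Dict


def build_command(script: str, inputs: Dict[str, object]) -> str:
    """Build a command string with one --name ${{inputs.name}} pair per input key."""
    # Doubled braces in the f-string produce the literal ${{ ... }} placeholder.
    args = " ".join(f"--{name} ${{{{inputs.{name}}}}}" for name in inputs)
    return f"python {script} {args}"


# A shortened inputs dictionary, for illustration only.
example_inputs = {"model_name": "yolov5", "batch_size": 8}
command_string = build_command(
    "ONNX_batch_model_generator_automl_for_images.py", example_inputs
)
```

The resulting string could be passed as the command argument in place of the hand-written one, provided that the inputs dictionary contains exactly the arguments that the script expects.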

After the batch model is generated, either download it manually from Outputs+logs > outputs through the UI, or use the following method:

batch_size = 8  # use the batch size used to generate the model
returned_job_run = mlflow_client.get_run(returned_job.name)

# Download run's artifacts/outputs
onnx_model_path = mlflow_client.download_artifacts(
    returned_job_run.info.run_id, 'outputs/model_'+str(batch_size)+'.onnx', local_dir
)

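After the model download step, use the ONNX Runtime Python package to perform inference with the model.onnx file. For demonstration purposes, this article uses the datasets from How to prepare image datasets for each vision task.

We trained models for all vision tasks with their respective datasets to demonstrate ONNX model inference.

Load the labels and ONNX model files

The following code snippet loads labels.json, where class names are ordered. That is, if the ONNX model predicts a label ID of 2, it corresponds to the label name at index 2 (the third entry) in the labels.json file.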

import json
import onnxruntime

labels_file = "automl_models/labels.json"
with open(labels_file) as f:
    classes = json.load(f)
print(classes)
try:
    session = onnxruntime.InferenceSession(onnx_model_path)
    print("ONNX model loaded...")
except Exception as e: 
    print("Error loading ONNX file: ", str(e))
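
To make the label lookup concrete, here is a small sketch of how a predicted label ID maps to a class name. The class names below are hypothetical stand-ins for the contents of labels.json, which is assumed here to be a JSON list of strings.

```python
import json

# Hypothetical contents of labels.json: a JSON list of class names.
labels_json = '["can", "carton", "milk_bottle", "water_bottle"]'
classes = json.loads(labels_json)

# A label ID predicted by the model is a zero-based index into the list.
predicted_label_id = 2
predicted_name = classes[predicted_label_id]
```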

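Get expected input and output details for an ONNX model

When you have the model, it's important to know some model-specific and task-specific details. These details include the number of inputs and outputs, the expected input shape or format for preprocessing the image, and the output shape, which tells you whether the outputs are model-specific or task-specific.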

sess_input = session.get_inputs()
sess_output = session.get_outputs()
print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")

# Iterate over the input and output metadata objects directly
for idx, model_input in enumerate(sess_input):
    print(f"{idx} Input name : {model_input.name}, Input shape : {model_input.shape}, "
          f"Input type  : {model_input.type}")

for idx, model_output in enumerate(sess_output):
    print(f"{idx} Output name : {model_output.name}, Output shape : {model_output.shape}, "
          f"Output type  : {model_output.type}")

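Expected input and output formats for the ONNX model

Every ONNX model has a predefined set of input and output formats.

This example applies the model trained on the fridgeObjects dataset with 134 images and 4 classes/labels to explain ONNX model inference. For more information on training an image classification task, see the multi-class image classification notebook.

Input format

The input is a preprocessed image.

Input name	Input shape	Input type	Description
input1	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image with the shape `(1, 3, 224, 224)` for a batch size of 1 and a height and width of 224. These numbers correspond to the values used for `crop_size` in the training example.

Output format

The output is an array of logits for all the classes/labels.

Output name	Output shape	Output type	Description
output1	`(batch_size, num_classes)`	ndarray(float)	Model returns logits (without `softmax`). For instance, for a batch size of 1 and 4 classes, it returns `(1, 4)`.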
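
Because the model returns raw logits, a softmax step is needed to read them as class probabilities. The following is a minimal pure-Python sketch of that step for one row of the output; the logit values are made up for illustration.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert one row of logits into probabilities that sum to 1."""
    # Subtract the maximum first for numerical stability.
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical logits for one image and four classes (one row of a (1, 4) output).
logits = [1.2, -0.3, 3.1, 0.4]
probabilities = softmax(logits)
predicted_class_id = probabilities.index(max(probabilities))
```

The predicted class is the index with the highest probability, which is also the index of the largest logit.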

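This example uses the model trained on the multi-label fridgeObjects dataset with 128 images and 4 classes/labels to explain ONNX model inference. For more information on model training for multi-label image classification, see the multi-label image classification notebook.

Input format

The input is a preprocessed image.

Input name	Input shape	Input type	Description
input1	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image with the shape `(1, 3, 224, 224)` for a batch size of 1 and a height and width of 224. These numbers correspond to the values used for `crop_size` in the training example.

Output format

The output is an array of logits for all the classes/labels.

Output name	Output shape	Output type	Description
output1	`(batch_size, num_classes)`	ndarray(float)	Model returns logits (without `sigmoid`). For instance, for a batch size of 1 and 4 classes, it returns `(1, 4)`.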
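
For multi-label output, each logit is passed through a sigmoid independently, and every class above a cutoff is reported. The sketch below illustrates this; the 0.5 threshold and the logit values are assumptions, and the helper names are hypothetical.

```python
import math
from typing import List


def sigmoid(value: float) -> float:
    """Map one logit to an independent probability in (0, 1)."""
    # Use the form that keeps the exponent non-positive to avoid overflow.
    if value >= 0:
        return 1.0 / (1.0 + math.exp(-value))
    exp_value = math.exp(value)
    return exp_value / (1.0 + exp_value)


def predict_labels(logits: List[float], threshold: float = 0.5) -> List[int]:
    """Return the class indices whose sigmoid probability reaches the threshold."""
    return [idx for idx, value in enumerate(logits) if sigmoid(value) >= threshold]


# Hypothetical logits for one image and four classes.
logits = [2.0, -1.5, 0.3, -0.2]
predicted = predict_labels(logits)
```

Unlike softmax, the probabilities here don't sum to 1, so an image can receive several labels or none.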

This object detection example uses the model trained on the fridgeObjects detection dataset of 128 images and 4 classes/labels to explain ONNX model inference. This example trains Faster R-CNN models to demonstrate inference steps. For more information on training object detection models, see the object detection notebook.

Input format

The input is a preprocessed image.

Input name	Input shape	Input type	Description
input	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image with the shape `(1, 3, 600, 800)` for a batch size of 1, a height of 600, and a width of 800.

Output format

The output is a tuple of output_names and predictions. Here, output_names and predictions are lists, each with a length of 3*batch_size. For Faster R-CNN, the order of outputs is boxes, labels, and scores, whereas for RetinaNet the order is boxes, scores, and labels.

Output name	Output shape	Output type	Description
`output_names`	`(3*batch_size)`	List of keys	For a batch size of 2, `output_names` is `['boxes_0', 'labels_0', 'scores_0', 'boxes_1', 'labels_1', 'scores_1']`.
`predictions`	`(3*batch_size)`	List of ndarray(float)	For a batch size of 2, `predictions` takes the shape `[(n1_boxes, 4), (n1_boxes), (n1_boxes), (n2_boxes, 4), (n2_boxes), (n2_boxes)]`. Here, the value at each index corresponds to the same index in `output_names`.

The following table describes the boxes, labels, and scores returned for each sample in the batch of images.

Name	Shape	Type	Description
Boxes	`(n_boxes, 4)`, where each box has `x_min, y_min, x_max, y_max`	ndarray(float)	Model returns n boxes with their top-left and bottom-right coordinates.
Labels	`(n_boxes)`	ndarray(float)	Label or class ID of an object in each box.
Scores	`(n_boxes)`	ndarray(float)	Confidence score of an object in each box.
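
Since Faster R-CNN and RetinaNet order their outputs differently, it can be safer to look values up by output name than by position. The following sketch groups the flat lists into one dictionary per image; `group_by_image` is a hypothetical helper, and plain lists stand in for the ndarrays.

```python
from typing import Dict, List, Sequence


def group_by_image(
    output_names: Sequence[str], predictions: Sequence[object]
) -> List[Dict[str, object]]:
    """Turn flat outputs such as boxes_0, labels_0, scores_0, boxes_1, ... into one dict per image."""
    per_image: Dict[int, Dict[str, object]] = {}
    for name, value in zip(output_names, predictions):
        # "boxes_0" -> key "boxes", image index 0
        key, _, index = name.rpartition("_")
        per_image.setdefault(int(index), {})[key] = value
    return [per_image[i] for i in sorted(per_image)]


# Hypothetical RetinaNet-style ordering (boxes, scores, labels) for a batch of 2.
names = ["boxes_0", "scores_0", "labels_0", "boxes_1", "scores_1", "labels_1"]
values = [[[10, 20, 50, 60]], [0.9], [1], [[5, 5, 30, 40]], [0.7], [3]]
grouped = group_by_image(names, values)
```

The same call works unchanged for the Faster R-CNN ordering, because each value is matched to its name rather than to a fixed position.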

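This object detection example uses the model trained on the fridgeObjects detection dataset of 128 images and 4 classes/labels to explain ONNX model inference. This example trains YOLO models to demonstrate inference steps. For more information on training object detection models, see the object detection notebook.

Input format

The input is a preprocessed image with the shape (1, 3, 640, 640) for a batch size of 1 and a height and width of 640. These numbers correspond to the values used in the training example.

Input name	Input shape	Input type	Description
input	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image with the shape `(1, 3, 640, 640)` for a batch size of 1, a height of 640, and a width of 640.

Output format

ONNX model predictions contain multiple outputs. The first output is needed to perform non-max suppression (NMS) for detections. For ease of use, automated ML displays the output format after the NMS postprocessing step. The output after NMS is a list of boxes, labels, and scores for each sample in the batch.

Output name	Output shape	Output type	Description
output	`(batch_size)`	List of ndarray(float)	Model returns box detections for each sample in the batch.

Each cell in the list indicates the box detections of a sample with shape (n_boxes, 6), where each box has x_min, y_min, x_max, y_max, confidence_score, class_id.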
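
To give a feel for what the NMS step does, here is a generic greedy, per-class NMS sketch over rows in the `(x_min, y_min, x_max, y_max, confidence_score, class_id)` layout. It is an illustration only, not the postprocessing code that automated ML uses; the function names, the 0.5 IoU threshold, and the sample boxes are assumptions.

```python
from typing import List, Sequence

Box = Sequence[float]  # x_min, y_min, x_max, y_max, confidence_score, class_id


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes given as corner coordinates."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], iou_thresh: float = 0.5) -> List[Box]:
    """Keep the highest-scoring boxes and drop same-class boxes that overlap them too much."""
    remaining = sorted(boxes, key=lambda box: box[4], reverse=True)
    kept: List[Box] = []
    for candidate in remaining:
        overlaps = any(
            candidate[5] == winner[5] and iou(candidate, winner) > iou_thresh
            for winner in kept
        )
        if not overlaps:
            kept.append(candidate)
    return kept


detections = [
    [10.0, 10.0, 110.0, 110.0, 0.90, 1.0],
    [12.0, 12.0, 112.0, 112.0, 0.75, 1.0],  # near-duplicate of the first box
    [200.0, 50.0, 260.0, 120.0, 0.60, 3.0],
]
kept = greedy_nms(detections)
```

The near-duplicate box is suppressed because it shares a class with a higher-scoring box and overlaps it heavily, while the box of a different class survives.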

For this instance segmentation example, you use the Mask R-CNN model that has been trained on the fridgeObjects dataset with 128 images and 4 classes/labels to explain ONNX model inference. For more information on training the instance segmentation model, see the instance segmentation notebook.

Important

Only Mask R-CNN is supported for instance segmentation tasks. The input and output formats are based on Mask R-CNN only.

Input format

The input is a preprocessed image. The ONNX model for Mask R-CNN has been exported to work with images of different shapes. For better performance, we recommend that you resize them to a fixed size that's consistent with the training image sizes.

Input name	Input shape	Input type	Description
input	`(batch_size, num_channels, height, width)`	ndarray(float)	Input is a preprocessed image with the shape `(1, 3, input_image_height, input_image_width)` for a batch size of 1, and a height and width similar to an input image.

Output format

The output is a tuple of output_names and predictions. Here, output_names and predictions are lists, each with a length of 4*batch_size.

Output name	Output shape	Output type	Description
`output_names`	`(4*batch_size)`	List of keys	For a batch size of 2, `output_names` is `['boxes_0', 'labels_0', 'scores_0', 'masks_0', 'boxes_1', 'labels_1', 'scores_1', 'masks_1']`.
`predictions`	`(4*batch_size)`	List of ndarray(float)	For a batch size of 2, `predictions` takes the shape `[(n1_boxes, 4), (n1_boxes), (n1_boxes), (n1_boxes, 1, height_onnx, width_onnx), (n2_boxes, 4), (n2_boxes), (n2_boxes), (n2_boxes, 1, height_onnx, width_onnx)]`. Here, the value at each index corresponds to the same index in `output_names`.

Name	Shape	Type	Description
Boxes	`(n_boxes, 4)`, where each box has `x_min, y_min, x_max, y_max`	ndarray(float)	Model returns n boxes with their top-left and bottom-right coordinates.
Labels	`(n_boxes)`	ndarray(float)	Label or class ID of an object in each box.
Scores	`(n_boxes)`	ndarray(float)	Confidence score of an object in each box.
Masks	`(n_boxes, 1, height_onnx, width_onnx)`	ndarray(float)	Masks (polygons) of detected objects, with the height and width of an input image.
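
The masks come back as float arrays, so a common follow-up step is to turn each one into a binary mask. The sketch below assumes that mask values are per-pixel scores in the range 0 to 1 and that 0.5 is a sensible cutoff; both are assumptions rather than documented behavior, and a nested list stands in for one `(height, width)` slice of the ndarray.

```python
from typing import List


def binarize_mask(mask: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Mark each pixel 1 if its score reaches the threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in mask]


def mask_area(binary_mask: List[List[int]]) -> int:
    """Count the pixels that belong to the object."""
    return sum(sum(row) for row in binary_mask)


# Hypothetical 2x3 mask for one detected object.
soft_mask = [
    [0.1, 0.7, 0.9],
    [0.5, 0.2, 0.8],
]
binary = binarize_mask(soft_mask)
area = mask_area(binary)
```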

Preprocessing

Perform the following preprocessing steps for ONNX model inference:

Convert the image to RGB.
Resize the image to a square of valid_resize_size by valid_resize_size, the value used in the transformation of the validation dataset during training. The default value for valid_resize_size is 256.
Center crop the image to height_onnx_crop_size and width_onnx_crop_size. This corresponds to valid_crop_size, with a default value of 224.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

If you chose different values for the hyperparameters valid_resize_size and valid_crop_size during training, then use those values.
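
As a worked example of the crop and normalization arithmetic in the steps above, the following sketch computes the center-crop box for the default sizes and normalizes a single RGB pixel. The helper names and the sample pixel are hypothetical.

```python
from typing import Tuple

IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def center_crop_box(resize_size: int, crop_size: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) for a centered square crop."""
    offset = (resize_size - crop_size) // 2
    return (offset, offset, offset + crop_size, offset + crop_size)


def normalize_pixel(rgb: Tuple[int, int, int]) -> Tuple[float, ...]:
    """Scale 0-255 channel values to [0, 1], then standardize per channel."""
    return tuple(
        (value / 255 - mean) / std
        for value, mean, std in zip(rgb, IMAGENET_MEAN, IMAGENET_STD)
    )


# Default sizes from the steps above: resize to 256, crop to 224.
box = center_crop_box(256, 224)
# A hypothetical pixel with a saturated red channel and an empty green channel.
pixel = normalize_pixel((255, 0, 128))
```

For the defaults, the crop removes a 16-pixel border on every side, so the box spans 16 to 240 on both axes.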

Get the input shape needed for the ONNX model.

batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size

Without PyTorch

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    #  center  crop
    left = (resize_size - crop_size_onnx)/2
    top = (resize_size - crop_size_onnx)/2
    right = (resize_size + crop_size_onnx)/2
    bottom = (resize_size + crop_size_onnx)/2
    image = image.crop((left, top, right, bottom))

    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:]/255 - mean_vec[i])/std_vec[i]
             
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_cls/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

With PyTorch

import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

def _make_3d_tensor(x) -> torch.Tensor:
    """This function is for images that have less channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    
    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    return img_data

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_cls/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

Perform the following preprocessing steps for ONNX model inference. These steps are the same for multi-class image classification.

Convert the image to RGB.
Resize the image to a square of valid_resize_size by valid_resize_size, the value used in the transformation of the validation dataset during training. The default value for valid_resize_size is 256.
Center crop the image to height_onnx_crop_size and width_onnx_crop_size. This corresponds to valid_crop_size, with a default value of 224.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

If you chose different values for the hyperparameters valid_resize_size and valid_crop_size during training, then use those values.

Get the input shape needed for the ONNX model.

batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size

Without PyTorch

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    # center  crop
    left = (resize_size - crop_size_onnx)/2
    top = (resize_size - crop_size_onnx)/2
    right = (resize_size + crop_size_onnx)/2
    bottom = (resize_size + crop_size_onnx)/2
    image = image.crop((left, top, right, bottom))

    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW

    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:] / 255 - mean_vec[i]) / std_vec[i]    
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_label/test_images_dir/*" # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

With PyTorch

import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms

def _make_3d_tensor(x) -> torch.Tensor:
    """This function is for images that have less channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    
    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    
    return img_data

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_multi_label/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size 

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

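For object detection with the Faster R-CNN architecture, follow the same preprocessing steps as image classification, except for image cropping. You can resize the image to a height of 600 and a width of 800. You can get the expected input height and width with the following code.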

batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx

Then, perform the preprocessing steps.

import glob
import numpy as np
from PIL import Image

def preprocess(image, height_onnx, width_onnx):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param height_onnx: expected height of an input image in onnx model
    :type height_onnx: Int
    :param width_onnx: expected width of an input image in onnx model
    :type width_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """

    image = image.convert('RGB')
    image = image.resize((width_onnx, height_onnx))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1) # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:] / 255 - mean_vec[i]) / std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0) # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images

test_images_path = "automl_models_od/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

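For object detection with the YOLO architecture, follow the same preprocessing steps as image classification, except for image cropping. You can resize the image to a height of 600 and a width of 800, and get the expected input height and width with the following code.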

batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx

For the preprocessing required for YOLO, refer to yolo_onnx_preprocessing_utils.py.

import glob
import numpy as np
from yolo_onnx_preprocessing_utils import preprocess

# use height and width based on the generated model
test_images_path = "automl_models_od_yolo/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
pad_list = []
for i in range(batch_size):
    img_processed, pad = preprocess(image_files[i])
    img_processed_list.append(img_processed)
    pad_list.append(pad)
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

Important

Only Mask R-CNN is supported for instance segmentation tasks. The preprocessing steps are based on Mask R-CNN only.

Perform the following preprocessing steps for ONNX model inference:

Convert the image to RGB.
Resize the image.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

For resize_height and resize_width, you can also use the values that you used during training, bounded by the min_size and max_size hyperparameters for Mask R-CNN.

import glob
import numpy as np
from PIL import Image

def preprocess(image, resize_height, resize_width):
    """Perform pre-processing on raw input image
    
    :param image: raw input image
    :type image: PIL image
    :param resize_height: resize height of an input image
    :type resize_height: Int
    :param resize_width: resize width of an input image
    :type resize_width: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray of shape 1xCxHxW
    """

    image = image.convert('RGB')
    image = image.resize((resize_width, resize_height))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1)  # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i,:,:] = (np_image[i,:,:]/255 - mean_vec[i])/std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0)  # 1xCxHxW
    return np_image

# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
# use height and width based on the trained model
# use height and width based on the generated model
test_images_path = "automl_models_is/test_images_dir/*" # replace with path to images
image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))
    
if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]

Inference with ONNX Runtime

Inferencing with ONNX Runtime differs for each computer vision task.

def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: scores with shapes
            (1, No. of classes in training dataset) 
    :rtype: numpy array
    """

    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")    
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    scores = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    
    return scores[0]

scores = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX Runtime (Faster R-CNN or RetinaNet object detection)
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores 
            (No. of boxes, 4) (No. of boxes,) (No. of boxes,)
    :rtype: tuple
    """

    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})

    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)

def get_predictions_from_ONNX(onnx_session,img_data):
    """Perform predictions with ONNX Runtime for a YOLO object detection model
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: first model output (raw predictions, before non-max suppression)
    :rtype: numpy array
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    pred = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    return pred[0]

result = get_predictions_from_ONNX(session, img_data)

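An instance segmentation model predicts boxes, labels, scores, and masks. ONNX outputs a predicted mask per instance, along with the corresponding bounding boxes and class confidence scores. You might need to convert the binary masks to polygons if necessary.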


def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime
    
    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels , scores , masks with shapes
            (No. of instances, 4) (No. of instances,) (No. of instances,)
            (No. of instances, 1, HEIGHT, WIDTH))  
    :rtype: tuple
    """
    
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [ output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,\
                                               input_feed={sess_input[0].name: img_data})
    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)

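Postprocessing

Apply softmax() to the predicted values to get classification confidence scores (probabilities) for each class. The prediction is then the class with the highest probability.

Without PyTorch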

def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

conf_scores = softmax(scores)
class_preds = np.argmax(conf_scores, axis=1)
print("predicted classes:", ([(class_idx, classes[class_idx]) for class_idx in class_preds]))

With PyTorch

conf_scores = torch.nn.functional.softmax(torch.from_numpy(scores), dim=1)
class_preds = torch.argmax(conf_scores, dim=1)
print("predicted classes:", ([(class_idx.item(), classes[class_idx]) for class_idx in class_preds]))

This step differs from multi-class classification. To get confidence scores for multi-label image classification, apply sigmoid to the logits (the ONNX output).

Without PyTorch

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
conf_scores = sigmoid(scores)
image_wise_preds = np.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))

With PyTorch

# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
conf_scores = torch.sigmoid(torch.from_numpy(scores))
image_wise_preds = torch.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))
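
To make the difference between the two postprocessing steps concrete, here is a small illustrative sketch that applies both functions to the same made-up logits. The logits are invented for this example and do not come from any model; softmax and sigmoid are the NumPy versions shown above.

```python
import numpy as np

def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# made-up logits for one image and three classes
logits = np.array([[2.0, 2.0, -3.0]])

multiclass_scores = softmax(logits)   # each row sums to 1
multilabel_scores = sigmoid(logits)   # each class is scored independently

# multi-class: exactly one class index per image
print(np.argmax(multiclass_scores, axis=1))   # [0]
# multi-label: every class index whose score exceeds the threshold
print(np.where(multilabel_scores > 0.5)[1])   # [0 1]
```

With softmax the two equally strong classes split the probability mass, so only one of them is reported; with sigmoid both pass the 0.5 threshold.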

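For multi-class and multi-label classification, you can follow the same steps described earlier for all the model architectures supported in AutoML.

For object detection, predictions are automatically on the scale of height_onnx, width_onnx. To transform the predicted box coordinates to the original dimensions, you can implement the following calculations.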

Xmin * original_width/width_onnx
Ymin * original_height/height_onnx
Xmax * original_width/width_onnx
Ymax * original_height/height_onnx
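
The four calculations above can be wrapped in a small function. This is a sketch under the assumption that a box is given as (xmin, ymin, xmax, ymax) and that sizes are passed as (height, width) tuples; the name rescale_box and the example numbers are hypothetical.

```python
def rescale_box(box, onnx_size, original_size):
    """Map (xmin, ymin, xmax, ymax) from the ONNX input scale to the original image scale.

    onnx_size and original_size are (height, width) tuples.
    """
    height_onnx, width_onnx = onnx_size
    original_height, original_width = original_size
    x_scale = original_width / width_onnx
    y_scale = original_height / height_onnx
    xmin, ymin, xmax, ymax = box
    return (xmin * x_scale, ymin * y_scale, xmax * x_scale, ymax * y_scale)

# a box predicted on a 600x800 (HxW) model input, for an image that is 1200x2000 originally
print(rescale_box((80, 60, 400, 300), onnx_size=(600, 800), original_size=(1200, 2000)))
# (200.0, 120.0, 1000.0, 600.0)
```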

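Another option is to use the following code to scale the box coordinates to the range [0, 1]. You can then multiply each normalized coordinate by the original image width or height (as described in the Visualize predictions section) to get boxes in the original image dimensions.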

def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(boxes, labels, scores, image_shape, classes):
    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

# Filter the results with threshold.
# Please replace the threshold for your test scenario.
score_threshold = 0.8
filtered_boxes_batch = []
for batch_sample in range(0, batch_size*3, 3):
    # in case of retinanet change the order of boxes, labels, scores to boxes, scores, labels
    # confirm the same from order of boxes, labels, scores output_names 
    boxes, labels, scores = predictions[batch_sample], predictions[batch_sample + 1], predictions[batch_sample + 2]
    bounding_boxes = _get_prediction(boxes, labels, scores, (height_onnx, width_onnx), classes)
    filtered_bounding_boxes = [box for box in bounding_boxes if box['score'] >= score_threshold]
    filtered_boxes_batch.append(filtered_bounding_boxes)
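
The comments in the loop above note that the output order differs between architectures and should be confirmed from output_names. One way to avoid depending on the order is to look each array up by name. The sketch below assumes, without confirmation from this article, that every output name starts with "boxes", "labels", or "scores" (for example "boxes" or "boxes_0"); check the names returned by your own model first. The helper group_outputs_by_name is hypothetical.

```python
import numpy as np

def group_outputs_by_name(output_names, predictions, outputs_per_image=3):
    """Return one dict per image keyed by 'boxes', 'labels', and 'scores'.

    Assumption: each output name starts with one of those three keys.
    """
    keys = ("boxes", "labels", "scores")
    grouped = []
    for start in range(0, len(predictions), outputs_per_image):
        sample = {}
        names = output_names[start:start + outputs_per_image]
        values = predictions[start:start + outputs_per_image]
        for name, value in zip(names, values):
            key = next((k for k in keys if name.startswith(k)), None)
            if key is None:
                raise ValueError(f"unexpected output name: {name}")
            sample[key] = value
        grouped.append(sample)
    return grouped

# toy data in boxes, scores, labels order for a single image
names = ["boxes", "scores", "labels"]
preds = [np.array([[10.0, 20.0, 110.0, 220.0]]), np.array([0.9]), np.array([1])]
sample = group_outputs_by_name(names, preds)[0]
boxes, labels, scores = sample["boxes"], sample["labels"], sample["scores"]
print(labels, scores)  # [1] [0.9]
```

The resulting boxes, labels, and scores can then be passed to _get_prediction in the same way as in the loop above.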

The following code creates boxes, labels, and scores. Use these bounding box details to perform the same postprocessing steps as you did for the Faster R-CNN model.

from yolo_onnx_preprocessing_utils import non_max_suppression, _convert_to_rcnn_output

result_final = non_max_suppression(
    torch.from_numpy(result),
    conf_thres=0.1,
    iou_thres=0.5)

def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(label, image_shape, classes):
    
    boxes = np.array(label["boxes"])
    labels = np.array(label["labels"])
    labels = [label[0] for label in labels]
    scores = np.array(label["scores"])
    scores = [score[0] for score in scores]

    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

bounding_boxes_batch = []
for result_i, pad in zip(result_final, pad_list):
    label, image_shape = _convert_to_rcnn_output(result_i, height_onnx, width_onnx, pad)
    bounding_boxes_batch.append(_get_prediction(label, image_shape, classes))
print(json.dumps(bounding_boxes_batch, indent=1))
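
The non_max_suppression helper used above is imported from yolo_onnx_preprocessing_utils, and its implementation isn't shown in this article. To illustrate the general idea behind the conf_thres and iou_thres parameters, here is a generic greedy non-max suppression sketch in NumPy. It is not the helper's code; the names iou and simple_nms and the sample boxes are hypothetical, and boxes are assumed to have positive area in (xmin, ymin, xmax, ymax) form.

```python
import numpy as np

def iou(box, others):
    """IoU between one box and an array of boxes, all as (xmin, ymin, xmax, ymax)."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)

def simple_nms(boxes, scores, conf_thres=0.1, iou_thres=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    # drop low-confidence boxes, then sort the rest by descending score
    candidates = np.where(scores >= conf_thres)[0]
    order = candidates[np.argsort(-scores[candidates])]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        # discard remaining boxes that overlap the kept box too much
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thres]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60], [0, 0, 5, 5]], dtype=float)
scores = np.array([0.9, 0.8, 0.7, 0.05])
print(simple_nms(boxes, scores))  # [0, 2]
```

In this toy input, the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the last box is dropped because its score is below conf_thres.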

Visualize predictions

Visualize an input image with its label (multi-class classification).

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0 # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])

img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig,ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)

label = class_preds[sample_image_index]
if torch.is_tensor(label):
    label = label.item()
    
conf_score = conf_scores[sample_image_index]
if torch.is_tensor(conf_score):
    conf_score = np.max(conf_score.tolist())
else:
    conf_score = np.max(conf_score)

display_text = '{} ({})'.format(label, round(conf_score, 3))
print(display_text)

color = 'red'
plt.text(30, 30, display_text, color=color, fontsize=30)

plt.show()

Visualize an input image with its labels (multi-label classification).

import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0 # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig,ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)
# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
label_offset_x = 30
label_offset_y = 30
if torch.is_tensor(conf_scores):
    sample_image_scores = conf_scores[sample_image_index].tolist()
else:
    sample_image_scores = conf_scores[sample_image_index]
    
for index, score in enumerate(sample_image_scores):
    if score > score_threshold:
        label = classes[index]
        display_text = '{} ({})'.format(label, round(score, 3))
        print(display_text)

        color = 'red'
        plt.text(label_offset_x, label_offset_y, display_text, color=color, fontsize=30)
        label_offset_y += 30

plt.show()

Visualize an input image with boxes and labels.

import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = filtered_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig,ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection 
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax =  box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))
    rect = patches.Rectangle((topleft_x, topleft_y), width, height, 
                             linewidth=1, edgecolor='green', facecolor='none')

    ax.add_patch(rect)
    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()

Visualize an input image with boxes and labels (using bounding_boxes_batch from the YOLO postprocessing step).

import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = bounding_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig,ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection 
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax =  box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))
    rect = patches.Rectangle((topleft_x, topleft_y), width, height, 
                             linewidth=1, edgecolor='green', facecolor='none')

    ax.add_patch(rect)
    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()

Visualize a sample input image with masks and labels.

import cv2  # OpenCV, used below to resize and blend masks
import matplotlib.patches as patches
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

def display_detections(image, boxes, labels, scores, masks, resize_height, 
                       resize_width, classes, score_threshold):
    """Visualize boxes and masks
    
    :param image: raw image
    :type image: PIL image
    :param boxes: box with shape (No. of instances, 4) 
    :type boxes: ndarray 
    :param labels: classes with shape (No. of instances,) 
    :type labels: ndarray
    :param scores: scores with shape (No. of instances,)
    :type scores: ndarray
    :param masks: masks with shape (No. of instances, 1, HEIGHT, WIDTH) 
    :type masks:  ndarray
    :param resize_height: expected height of an input image in onnx model
    :type resize_height: Int
    :param resize_width: expected width of an input image in onnx model
    :type resize_width: Int
    :param classes: classes with shape (No. of classes) 
    :type classes:  list
    :param score_threshold: threshold on scores in the range of 0-1
    :type score_threshold: float
    :return: None
    """

    _, ax = plt.subplots(1, figsize=(12,9))

    image = np.array(image)
    original_height = image.shape[0]
    original_width = image.shape[1]

    for mask, box, label, score in zip(masks, boxes, labels, scores):        
        if score <= score_threshold:
            continue
        mask = mask[0, :, :, None]        
        # resize boxes to original raw input size
        box = [box[0]*original_width/resize_width, 
               box[1]*original_height/resize_height, 
               box[2]*original_width/resize_width, 
               box[3]*original_height/resize_height]
        
        mask = cv2.resize(mask, (image.shape[1], image.shape[0]), 0, 0, interpolation = cv2.INTER_NEAREST)
        # mask is a matrix with values in the range of [0,1]
        # higher values indicate presence of object and vice versa
        # select threshold or cut-off value to get objects present       
        mask = mask > score_threshold
        image_masked = image.copy()
        image_masked[mask] = (0, 255, 255)
        alpha = 0.5  # alpha blending with range 0 to 1
        cv2.addWeighted(image_masked, alpha, image, 1 - alpha,0, image)
        rect = patches.Rectangle((box[0], box[1]), box[2] - box[0], box[3] - box[1],\
                                 linewidth=1, edgecolor='b', facecolor='none')
        ax.annotate(classes[label] + ':' + str(np.round(score, 2)), (box[0], box[1]),\
                    color='w', fontsize=12)
        ax.add_patch(rect)
        
    ax.imshow(image)
    plt.show()

score_threshold = 0.5
img = Image.open(image_files[1])  # replace with desired image index
image_boxes = filtered_boxes_batch[1]  # replace with desired image index
boxes, labels, scores, masks = predictions[4:8]  # replace with desired image index
display_detections(img, boxes.copy(), labels, scores, masks.copy(), 
                   height_onnx, width_onnx, classes, score_threshold)
