How to deploy pipelines with batch endpoints

2024-12-19

Applies to: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

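You can deploy pipeline components under a batch endpoint, which gives you a convenient way to operationalize them in Azure Machine Learning. This article shows how to create a batch deployment that contains a simple pipeline. You'll learn how to:

Create and register a pipeline component
Create a batch endpoint and deploy a pipeline component
Test the deployment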

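About this example

In this example, we deploy a pipeline component that consists of a simple command job that prints "hello world!". The component requires no inputs or outputs and is the simplest pipeline deployment scenario.

The example in this article is based on code samples in the azureml-examples repository. To run the commands locally without copying or pasting YAML and other files, clone the repository and change to the folder for your language: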

Azure CLI:

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/cli

Python:

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/sdk/python

The files for this example are in the following folder:

cd endpoints/batch/deploy-pipelines/hello-batch

Follow along in a Jupyter notebook

You can follow the Python SDK version of this example by opening the sdk-deploy-and-test.ipynb notebook in the cloned repository.

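Prerequisites

An Azure subscription. If you don't have an Azure subscription, create a free account before you begin.
An Azure Machine Learning workspace. To create a workspace, see Manage Azure Machine Learning workspaces.
The following permissions in the Azure Machine Learning workspace:
- To create or manage batch endpoints and deployments: use an Owner, Contributor, or custom role that has been assigned the Microsoft.MachineLearningServices/workspaces/batchEndpoints/* permissions.
- To create Azure Resource Manager deployments in the workspace resource group: use an Owner, Contributor, or custom role that has been assigned the Microsoft.Resources/deployments/write permission in the resource group where the workspace is deployed.
The Azure Machine Learning CLI or the Azure Machine Learning SDK for Python:
Azure CLI: With the Azure CLI already installed, run the following command to add the ml extension for Azure Machine Learning:
```
az extension add -n ml
```
Pipeline component deployments for batch endpoints were introduced in version 2.7 of the ml extension for the Azure CLI. Use the az extension update --name ml command to get the latest version.
Python: Run the following command to install the Azure Machine Learning SDK for Python:
```
pip install azure-ai-ml
```
The ModelBatchDeployment and PipelineComponentBatchDeployment classes were introduced in version 1.7.0 of the SDK. Use the pip install -U azure-ai-ml command to get the latest version.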

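Connect to your workspace

The workspace is the top-level resource for Azure Machine Learning. It provides a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, you connect to the workspace where you'll perform your deployment tasks.

Azure CLI:

In the following commands, enter your subscription ID, workspace name, resource group name, and location:

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Python:

Import the required libraries: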

from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.entities import BatchEndpoint, ModelBatchDeployment, ModelBatchDeploymentSettings, PipelineComponentBatchDeployment, Model, AmlCompute, Data, BatchRetrySettings, CodeConfiguration, Environment
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

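Configure the workspace details and get a handle to the workspace. In the following code, enter your subscription ID, resource group name, and workspace name: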

subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

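Create the pipeline component

Batch endpoints can deploy either models or pipeline components. Pipeline components are reusable, and you can move them from one workspace to another by using shared registries, which streamlines your MLOps practice.

The pipeline component in this example contains a single step that only prints a "hello world" message to the logs. It requires no inputs or outputs.

The hello-component/hello.yml file contains the configuration for the pipeline component:

hello-component/hello.yml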

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponent.schema.json
name: hello_batch
display_name: Hello Batch component
version: 1
type: pipeline
jobs:
  main_job:
    type: command
    component:
      code: src
      environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest
      command: >-
        python hello.py
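
The component's command runs `python hello.py` from the `src` folder, but the script itself isn't shown here. The following is a minimal sketch of what such a script could look like, assuming it only needs to write one line to the job's logs. The `main` function name is an illustrative choice, and the actual file in the repository may differ.

```python
# src/hello.py (hypothetical contents; the repository's file may differ)


def main() -> None:
    # Standard output of a command job is captured in the job's logs.
    print("hello world!")


if __name__ == "__main__":
    main()
```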

Register the component:

Azure CLI:

az ml component create -f hello-component/hello.yml

Python:

hello_batch = load_component(source="hello-component/hello.yml")
hello_batch_registered = ml_client.components.create_or_update(hello_batch)

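Create a batch endpoint

Provide a name for the endpoint. A batch endpoint's name must be unique within each region, because the name is used to construct the invocation URI. To ensure uniqueness, append trailing characters of your choice to the name specified in the following code.
Azure CLI:
```
ENDPOINT_NAME="hello-batch"
```
Python:
```
endpoint_name = "hello-batch"
```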
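
One way to append trailing characters is to generate a short random suffix. The sketch below assumes a lowercase alphanumeric suffix of five characters is acceptable for your naming conventions; the helper name `unique_endpoint_name` and the suffix length are illustrative choices, not part of the SDK.

```python
import random
import string


def unique_endpoint_name(prefix: str = "hello-batch", suffix_length: int = 5) -> str:
    """Return the prefix followed by a hyphen and a random lowercase alphanumeric suffix."""
    allowed_chars = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(allowed_chars) for _ in range(suffix_length))
    return f"{prefix}-{suffix}"


# Hypothetical usage: replaces the fixed name with a randomized one.
endpoint_name = unique_endpoint_name()
print(endpoint_name)
```

A random suffix makes a collision unlikely but doesn't rule it out, so the create call can still fail if the name is already taken in the region.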

Configure the endpoint:

Azure CLI:

The endpoint.yml file contains the endpoint's configuration.

endpoint.yml

$schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
name: hello-batch
description: A hello world endpoint for component deployments.
auth_mode: aad_token

Python:

endpoint = BatchEndpoint(
    name=endpoint_name,
    description="A hello world endpoint for component deployments",
)

Create the endpoint:

Azure CLI:

az ml batch-endpoint create --name $ENDPOINT_NAME  -f endpoint.yml

Python:

ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

Query the endpoint URI:

Azure CLI:

az ml batch-endpoint show --name $ENDPOINT_NAME

Python:

endpoint = ml_client.batch_endpoints.get(name=endpoint_name)
print(endpoint)

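Deploy the pipeline component

To deploy the pipeline component, you have to create a batch deployment. A deployment is a set of resources that hosts the asset that does the actual work.

Create a compute cluster. Batch endpoints and deployments run on compute clusters. They can run on any Azure Machine Learning compute cluster that already exists in the workspace, so multiple batch deployments can share the same compute infrastructure. This example uses an Azure Machine Learning compute cluster named batch-cluster. Verify that the compute exists in the workspace, and create it if it doesn't.

Azure CLI:

az ml compute create -n batch-cluster --type amlcompute --min-instances 0 --max-instances 5

Python:

compute_name = "batch-cluster"
# Create the cluster only if no compute target with this name exists yet.
if not any(c.name == compute_name for c in ml_client.compute.list()):
    compute_cluster = AmlCompute(
        name=compute_name,
        description="Batch endpoints compute cluster",
        min_instances=0,
        max_instances=5,
    )
    ml_client.begin_create_or_update(compute_cluster).result()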
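
If you run the check-then-create step from several scripts, it can help to wrap it in a small function. The following is a sketch under stated assumptions: `ensure_compute` and its `make_cluster` parameter are hypothetical names introduced here, and the function only relies on the client exposing `compute.list()` and `compute.begin_create_or_update()`.

```python
from typing import Any, Callable


def ensure_compute(ml_client: Any, name: str, make_cluster: Callable[[], Any]) -> bool:
    """Create the cluster only if no compute target with this name exists.

    Returns True if a creation request was issued, False if the cluster
    was already present.
    """
    if any(c.name == name for c in ml_client.compute.list()):
        return False
    # begin_create_or_update returns a poller; result() blocks until it finishes.
    ml_client.compute.begin_create_or_update(make_cluster()).result()
    return True
```

With the objects from this article, a call might look like `ensure_compute(ml_client, "batch-cluster", lambda: AmlCompute(name="batch-cluster", min_instances=0, max_instances=5))`. Passing a factory instead of a ready-made object means the cluster definition is only built when it's actually needed.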

Configure the deployment:

Azure CLI:

The deployment.yml file contains the deployment's configuration. For additional properties, see the full batch endpoint YAML schema.

deployment.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponentBatchDeployment.schema.json
name: hello-batch-dpl
endpoint_name: hello-pipeline-batch
type: pipeline
component: azureml:hello_batch@latest
settings:
    default_compute: batch-cluster

Python:

deployment = PipelineComponentBatchDeployment(
    name="hello-batch-dpl",
    description="A hello world deployment with a single step.",
    endpoint_name=endpoint.name,
    component=hello_batch,
    settings={"continue_on_step_failure": False, "default_compute": compute_name},
)

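Create the deployment:
Azure CLI: Run the following code to create a batch deployment under the batch endpoint and set it as the default deployment.
```
az ml batch-deployment create --endpoint $ENDPOINT_NAME -f deployment.yml --set-default
```
Tip

The --set-default flag indicates that this new deployment is now the default deployment.
Python: The begin_create_or_update call starts the deployment creation and returns a poller while creation continues. Calling .result() on the poller blocks until the deployment is created.
```
ml_client.batch_deployments.begin_create_or_update(deployment).result()
```
After it's created, configure the new deployment as the default:
```
endpoint = ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
```
The deployment is now ready to use.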
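
Before invoking the endpoint, you may want to confirm that the default deployment is the one you just created. The sketch below assumes the endpoint object returned by `batch_endpoints.get` exposes `defaults.deployment_name`, as used in the step above; the helper name `is_default_deployment` is hypothetical.

```python
from typing import Any


def is_default_deployment(ml_client: Any, endpoint_name: str, deployment_name: str) -> bool:
    """Return True if the endpoint's default deployment matches deployment_name."""
    endpoint = ml_client.batch_endpoints.get(endpoint_name)
    # Guard against an endpoint that has no defaults configured yet.
    defaults = getattr(endpoint, "defaults", None)
    return defaults is not None and defaults.deployment_name == deployment_name
```

For this article's names, the check would be `is_default_deployment(ml_client, endpoint_name, "hello-batch-dpl")`.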

Test the deployment

Once the deployment is created, it's ready to receive jobs. You can invoke the default deployment as follows:

Azure CLI:

JOB_NAME=$(az ml batch-endpoint invoke -n $ENDPOINT_NAME --query name -o tsv)

Python:

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
)

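Tip

In this example, the pipeline has no inputs or outputs. If the pipeline component requires them, you can specify them at invocation time. To learn how to specify inputs and outputs, see "Create jobs and input data for batch endpoints" or the tutorial "How to deploy a pipeline to perform batch scoring with preprocessing (preview)".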

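You can monitor the job's progress and stream its logs as follows:

Azure CLI:

az ml job stream -n $JOB_NAME

Python:

Retrieve the job to check its current state:

ml_client.jobs.get(job.name)

To stream the logs and wait for the job to finish, run the following code:

ml_client.jobs.stream(name=job.name)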
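
If you'd rather wait without streaming logs, for example in an automated script, you can poll the job's status instead. This is a sketch under assumptions: the helper `wait_for_job` is hypothetical, and the set of terminal status strings ("Completed", "Failed", "Canceled") is assumed here and should be checked against the SDK version you use.

```python
import time
from typing import Any

# Assumed terminal states; verify against your azure-ai-ml version.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_job(
    ml_client: Any,
    job_name: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
) -> str:
    """Poll the job until it reaches a terminal state and return that state."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = ml_client.jobs.get(job_name).status
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_name} still in state {status!r} after timeout")
        time.sleep(poll_seconds)
```

With the job from this article, `wait_for_job(ml_client, job.name)` returns the final state, which lets a script fail fast when the result isn't "Completed".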

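Clean up resources

When you're done, delete the associated resources from the workspace:

Azure CLI:

Run the following code to delete the batch endpoint and its underlying deployments. The --yes flag confirms the deletion.

az ml batch-endpoint delete -n $ENDPOINT_NAME --yes

Python:

Delete the endpoint:

ml_client.batch_endpoints.begin_delete(endpoint_name).result()

(Optional) Delete the compute cluster, unless you plan to reuse it for later deployments.

Azure CLI:

az ml compute delete -n batch-cluster

Python:

ml_client.compute.begin_delete(name="batch-cluster")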
