How to deploy and run inference on a managed compute deployment with code

2025-06-05

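The Azure AI Foundry portal model catalog offers over 1,600 models. A common way to deploy these models is the managed compute deployment option, also called a managed online deployment.

Deploying a large language model (LLM) makes it available for use in a website, an application, or another production environment. Deployment typically involves hosting the model on a server or in the cloud and creating an API or other interface through which users interact with the model. You can invoke the deployment for real-time inference in generative AI applications such as chat and copilot.

This article shows how to deploy a model with the Azure Machine Learning SDK and how to run inference on the deployed model.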

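Prerequisites

An Azure subscription with a valid payment method. Free or trial Azure subscriptions don't work. If you don't have an Azure subscription, create a paid Azure account to begin.
A hub-based project. If you don't have one, create a hub-based project.
Azure Marketplace purchases enabled for your Azure subscription.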

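Get the model ID

You can deploy managed compute models with the Azure Machine Learning SDK. First, browse the model catalog and get the model ID you need for deployment.

Tip

Because you can customize the left pane in the Azure AI Foundry portal, you might see different items than shown in these steps. If you don't see what you're looking for, select ... More at the bottom of the left pane.

Sign in to Azure AI Foundry and go to the [Home] page.
Select [Model catalog] from the left sidebar.
In the [Deployment options] filter, select [Managed compute].
Select a model.
Copy the model ID from the details page of the model you selected. It looks like this: azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16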
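Before pasting a copied ID into deployment code, it can help to check that it has the expected shape. The following is a minimal sketch that assumes every catalog ID follows the pattern of the example above (registry, model name, version); `parse_model_id` is a hypothetical helper, not part of the Azure SDK.

```python
import re

# Pattern assumed from the example ID in this article:
# azureml://registries/<registry>/models/<model>/versions/<version>
_MODEL_ID_PATTERN = re.compile(
    r"^azureml://registries/(?P<registry>[^/]+)"
    r"/models/(?P<name>[^/]+)"
    r"/versions/(?P<version>[^/]+)$"
)


def parse_model_id(model_id: str) -> dict:
    """Split a registry model ID into registry, name, and version."""
    match = _MODEL_ID_PATTERN.match(model_id.strip())
    if match is None:
        raise ValueError(f"Unexpected model ID format: {model_id!r}")
    return match.groupdict()


parts = parse_model_id(
    "azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16"
)
print(parts["registry"], parts["name"], parts["version"])
```

A failed parse usually means the ID was truncated or copied with extra characters.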

Deploy the model

Install the Azure Machine Learning SDK.
```
pip install azure-ai-ml
pip install azure-identity
```

Authenticate with Azure Machine Learning and create a client object. Replace the placeholders with your subscription ID, resource group name, and Azure AI Foundry project name.

```
from azure.ai.ml import MLClient
from azure.identity import InteractiveBrowserCredential

# Pass a credential instance, not the class itself
workspace_ml_client = MLClient(
    credential=InteractiveBrowserCredential(),
    subscription_id="your subscription ID goes here",
    resource_group_name="your resource group name goes here",
    workspace_name="your project name goes here",
)
```
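The client expects a subscription ID, not a subscription display name, and the two are easy to confuse. As a rough sanity check, the sketch below tests whether a string is a canonical GUID, which is the form Azure subscription IDs take. `is_subscription_id` is a hypothetical helper; it only checks the format and does not confirm that the subscription exists.

```python
import uuid


def is_subscription_id(value) -> bool:
    """Return True if value is a GUID in canonical hyphenated form."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # Reject alternative spellings (no hyphens, braces) that uuid.UUID also accepts.
    return str(parsed) == value.lower()


print(is_subscription_id("00000000-0000-0000-0000-000000000000"))  # True
print(is_subscription_id("your subscription name goes here"))      # False
```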

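Create an endpoint. The managed compute deployment option requires an endpoint before you create a model deployment. Think of the endpoint as a container that can house multiple model deployments. Endpoint names must be unique within a region, so this example uses a timestamp to create a unique endpoint name.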
```
import time, sys
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    ProbeSettings,
)

# Make the endpoint name unique
timestamp = int(time.time())
online_endpoint_name = "customize your endpoint name here" + str(timestamp)

# Create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    auth_mode="key",
)
workspace_ml_client.online_endpoints.begin_create_or_update(endpoint).wait()
```
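A timestamp suffix makes the name unique, but the result still has to satisfy the service's naming rules. The sketch below assumes those rules are: start with a letter, use only letters, digits, and hyphens, and stay within 32 characters. Confirm them against the current Azure Machine Learning limits before relying on this. `make_endpoint_name` and the `ep-` fallback prefix are hypothetical.

```python
import re
import time
from typing import Optional

MAX_NAME_LENGTH = 32  # assumed limit, not taken from this article


def make_endpoint_name(prefix: str, timestamp: Optional[int] = None) -> str:
    """Build '<prefix>-<timestamp>' that fits the assumed naming rules."""
    if timestamp is None:
        timestamp = int(time.time())
    suffix = f"-{timestamp}"
    # Replace runs of disallowed characters (spaces, underscores, ...) with a hyphen.
    cleaned = re.sub(r"[^A-Za-z0-9-]+", "-", prefix).strip("-")
    # The name is assumed to need a leading letter.
    if not cleaned or not cleaned[0].isalpha():
        cleaned = "ep-" + cleaned
    # Truncate the prefix so that prefix plus suffix stays within the limit.
    cleaned = cleaned[: MAX_NAME_LENGTH - len(suffix)].rstrip("-")
    return cleaned + suffix


print(make_endpoint_name("customize your endpoint name here"))
```

Truncating the prefix rather than the timestamp keeps the part that guarantees uniqueness intact.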

Create a deployment. Replace the model ID in the following code with the model ID that you copied from the details page of the model you selected in the "Get the model ID" section.

```
# Model ID copied from the model's details page in the catalog
model_name = "azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16"

demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=model_name,
    instance_type="Standard_DS3_v2",
    instance_count=2,
    liveness_probe=ProbeSettings(
        failure_threshold=30,
        success_threshold=1,
        timeout=2,
        period=10,
        initial_delay=1000,
    ),
    readiness_probe=ProbeSettings(
        failure_threshold=10,
        success_threshold=1,
        timeout=10,
        period=10,
        initial_delay=1000,
    ),
)
# Create the deployment and wait for the operation to finish
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
# Route all endpoint traffic to the new deployment
endpoint.traffic = {"demo": 100}
workspace_ml_client.online_endpoints.begin_create_or_update(endpoint).result()
```
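The probe settings determine how long a slow-starting model is given before it is treated as failed. As a rough illustration, assuming the usual probe semantics (wait `initial_delay` seconds, then probe every `period` seconds and give up after `failure_threshold` consecutive failures), the time budget can be estimated as follows. `ProbeTiming` is a hypothetical helper that mirrors three `ProbeSettings` fields, and the estimate ignores the per-probe `timeout`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeTiming:
    """Hypothetical stand-in for the timing fields of a probe."""

    initial_delay: int      # seconds before the first probe
    period: int             # seconds between probes
    failure_threshold: int  # consecutive failures tolerated

    def seconds_until_failure(self) -> int:
        # One initial wait, then failure_threshold probes spaced period seconds apart.
        return self.initial_delay + self.failure_threshold * self.period


liveness = ProbeTiming(initial_delay=1000, period=10, failure_threshold=30)
readiness = ProbeTiming(initial_delay=1000, period=10, failure_threshold=10)

print(liveness.seconds_until_failure())   # 1300
print(readiness.seconds_until_failure())  # 1100
```

Under these assumptions, the values in this article give the container roughly 18 to 22 minutes to load the model, which is why `initial_delay` is set so high.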

Run inference on the deployment

You need some sample JSON data to test inference. Create sample_score.json with the following example.

```
{
  "inputs": {
    "question": [
      "Where do I live?",
      "Where do I live?",
      "What's my name?",
      "Which name is also used to describe the Amazon rainforest in English?"
    ],
    "context": [
      "My name is Wolfgang and I live in Berlin",
      "My name is Sarah and I live in London",
      "My name is Clara and I live in Berkeley.",
      "The Amazon rainforest (Portuguese: Floresta Amaz\u00f4nica or Amaz\u00f4nia; Spanish: Selva Amaz\u00f3nica, Amazon\u00eda or usually Amazonia; French: For\u00eat amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia or the Amazon Jungle, is a moist broadleaf forest that covers most of the Amazon basin of South America. This basin encompasses 7,000,000 square kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres (2,100,000 sq mi) are covered by the rainforest. This region includes territory belonging to nine nations. The majority of the forest is contained within Brazil, with 60% of the rainforest, followed by Peru with 13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador, Bolivia, Guyana, Suriname and French Guiana. States or departments in four nations contain \"Amazonas\" in their names. The Amazon represents over half of the planet's remaining rainforests, and comprises the largest and most biodiverse tract of tropical rainforest in the world, with an estimated 390 billion individual trees divided into 16,000 species."
    ]
  }
}
```
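The sample pairs each question with the context at the same position in the list. If you build your own file, a small script can enforce that pairing. This is a sketch under that assumption; `build_payload` and `write_payload` are hypothetical helpers, and the payload shape is copied from the sample above rather than from a published schema.

```python
import json
from pathlib import Path


def build_payload(questions, contexts):
    """Pair each question with the context at the same index."""
    questions, contexts = list(questions), list(contexts)
    if len(questions) != len(contexts):
        raise ValueError(
            f"{len(questions)} questions but {len(contexts)} contexts; "
            "the lists are assumed to pair up by position"
        )
    return {"inputs": {"question": questions, "context": contexts}}


def write_payload(payload, path):
    """Write the payload as UTF-8 JSON and return the path."""
    target = Path(path)
    target.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return target


payload = build_payload(
    ["Where do I live?", "What's my name?"],
    [
        "My name is Wolfgang and I live in Berlin",
        "My name is Clara and I live in Berkeley.",
    ],
)
print(json.dumps(payload, indent=2))
# To save it next to your script: write_payload(payload, "./sample_score.json")
```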

Run inference with sample_score.json. Change the location of the scoring file in the following code to match where you saved your sample JSON file.

```
import json

scoring_file = "./sample_score.json"
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file=scoring_file,
)
# invoke returns the response body as a JSON string
response_json = json.loads(response)
print(json.dumps(response_json, indent=2))
```
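Applications that don't use the SDK can call the same endpoint over HTTPS, because the endpoint was created with key authentication. The sketch below only builds the request and does not send it. It assumes you have already obtained the scoring URI and a key (in azure-ai-ml these are typically available from `online_endpoints.get(...).scoring_uri` and `online_endpoints.get_keys(...).primary_key`). `build_scoring_request` and the example URL are hypothetical.

```python
import json
import urllib.request
from typing import Optional


def build_scoring_request(
    scoring_uri: str,
    api_key: str,
    payload: dict,
    deployment_name: Optional[str] = None,
) -> urllib.request.Request:
    """Build a POST request for a key-authenticated online endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    if deployment_name:
        # Target one deployment instead of following the endpoint's traffic split.
        headers["azureml-model-deployment"] = deployment_name
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


request = build_scoring_request(
    "https://example.invalid/score",  # placeholder, not a real endpoint
    "<your-endpoint-key>",
    {"inputs": {"question": ["Where do I live?"], "context": ["I live in Berlin"]}},
    deployment_name="demo",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).read()
```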

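Configure autoscale

To configure autoscale for a deployment, go to the Azure portal, find the Azure resource of type Machine learning online deployment in the resource group of the AI project, and use the [Scaling] menu under [Settings]. For more information about autoscaling, see "Autoscale online endpoints" in the Azure Machine Learning documentation.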

Delete the deployment endpoint

To delete a deployment in the Azure AI Foundry portal, select the [Delete] button on the top panel of the deployment details page.
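If you created the endpoint from code, you might prefer to clean it up from code as well. The following is a minimal sketch that assumes an `MLClient` like the one created earlier and uses its `online_endpoints.begin_delete` operation. `delete_endpoint` is a hypothetical wrapper, and deleting an endpoint also removes the deployments under it.

```python
def delete_endpoint(ml_client, endpoint_name: str) -> None:
    """Delete an online endpoint and wait for the operation to finish."""
    # begin_delete starts a long-running operation and returns a poller.
    poller = ml_client.online_endpoints.begin_delete(name=endpoint_name)
    poller.wait()  # block until the delete completes


# Usage, with the objects defined earlier in this article:
# delete_endpoint(workspace_ml_client, online_endpoint_name)
```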

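Quota considerations

Deploying and running inference with real-time endpoints consumes the virtual machine (VM) core quota assigned to your subscription on a per-region basis. When you sign up for Azure AI Foundry, you receive a default VM quota for several VM families available in the region. You can keep creating deployments until you reach your quota limit. After that, you can request a quota increase.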
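To estimate how much core quota a deployment consumes, multiply the vCPU count of the VM size by the instance count. The sketch below does this for the deployment in this article, where Standard_DS3_v2 has 4 vCPUs. The helper names are hypothetical, and the result is best read as a lower bound, since the service may hold back additional capacity for operations such as upgrades.

```python
# vCPU count per VM size. Standard_DS3_v2 (used in this article) has 4 vCPUs;
# add other sizes from the Azure VM size documentation as needed.
VCPUS_BY_SIZE = {"Standard_DS3_v2": 4}


def cores_required(instance_type: str, instance_count: int) -> int:
    """Return the VM cores consumed by a deployment of this size and count."""
    if instance_type not in VCPUS_BY_SIZE:
        raise ValueError(f"Unknown VM size: {instance_type!r}")
    return VCPUS_BY_SIZE[instance_type] * instance_count


def fits_quota(required: int, used: int, limit: int) -> bool:
    """Return True if the deployment fits in the remaining core quota."""
    return used + required <= limit


needed = cores_required("Standard_DS3_v2", 2)
print(needed)                               # 8
print(fits_quota(needed, used=0, limit=6))  # False
```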

Related content

Learn more about what you can do in Azure AI Foundry
Get answers to frequently asked questions in the Azure AI FAQ article
