Share via


DocumentAnalysisClient class

A client for interacting with the Form Recognizer service's analysis features.

Examples:

The Form Recognizer service and clients support two means of authentication:

Azure Active Directory

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

API Key (Subscription Key)

import { AzureKeyCredential, DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new AzureKeyCredential("<API key>");
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

Constructors

DocumentAnalysisClient(string, KeyCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and a static API key (KeyCredential),

Example:

import { AzureKeyCredential, DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new AzureKeyCredential("<API key>");
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);
DocumentAnalysisClient(string, TokenCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and a an Azure Identity TokenCredential.

See the @azure/identity package for more information about authenticating with Azure Active Directory.

Example:

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

Methods

beginAnalyzeDocument(string, FormRecognizerRequestBody, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice", or to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model that is used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Examples

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.JS ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { createReadStream } from "node:fs";
import { PrebuiltReceiptModel } from "../samples-dev/prebuilt/prebuilt-receipt.js";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const path = "<path to a document>";
const readStream = createReadStream(path);

// The PrebuiltReceiptModel `DocumentModel` instance encodes both the model ID and a stronger return type for the operation
const poller = await client.beginAnalyzeDocument(PrebuiltReceiptModel, readStream, {
  onProgress: ({ status }) => {
    console.log(`status: ${status}`);
  },
});

const {
  documents: [receiptDocument],
} = await poller.pollUntilDone();

// The fields of the document constitute the extracted receipt data.
const receipt = receiptDocument.fields;

if (receipt === undefined) {
  throw new Error("Expected at least one receipt in analysis result.");
}

console.log(`Receipt data (${receiptDocument.docType})`);
console.log("  Merchant Name:", receipt.merchantName?.value);

// The items of the receipt are an example of a `DocumentArrayValue`
if (receipt.items !== undefined) {
  console.log("Items:");
  for (const { properties: item } of receipt.items.values) {
    console.log("- Description:", item.description?.value);
    console.log("  Total Price:", item.totalPrice?.value);
  }
}

console.log("  Total:", receipt.total?.value);
beginAnalyzeDocument<Result>(DocumentModel<Result>, FormRecognizerRequestBody, AnalyzeDocumentOptions<Result>)

Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model that is used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Examples

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.JS ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

If the input provided is a string, it will be treated as a URL to the ___location of a document to be analyzed. See the beginAnalyzeDocumentFromUrl method for more information. Use of that method is preferred when using URLs, and URL support is only provided in this method for backwards compatibility.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { createReadStream } from "node:fs";
import { PrebuiltReceiptModel } from "../samples-dev/prebuilt/prebuilt-receipt.js";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const path = "<path to a document>";
const readStream = createReadStream(path);

// The PrebuiltReceiptModel `DocumentModel` instance encodes both the model ID and a stronger return type for the operation
const poller = await client.beginAnalyzeDocument(PrebuiltReceiptModel, readStream, {
  onProgress: ({ status }) => {
    console.log(`status: ${status}`);
  },
});

const {
  documents: [receiptDocument],
} = await poller.pollUntilDone();

// The fields of the document constitute the extracted receipt data.
const receipt = receiptDocument.fields;

if (receipt === undefined) {
  throw new Error("Expected at least one receipt in analysis result.");
}

console.log(`Receipt data (${receiptDocument.docType})`);
console.log("  Merchant Name:", receipt.merchantName?.value);

// The items of the receipt are an example of a `DocumentArrayValue`
if (receipt.items !== undefined) {
  console.log("Items:");
  for (const { properties: item } of receipt.items.values) {
    console.log("- Description:", item.description?.value);
    console.log("  Total Price:", item.totalPrice?.value);
  }
}

console.log("  Total:", receipt.total?.value);
beginAnalyzeDocumentFromUrl(string, string, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice", or to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model that is used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Examples

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download a file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.

import { DefaultAzureCredential } from "@azure/identity";
import {
  DocumentAnalysisClient,
  DocumentStringField,
  DocumentArrayField,
  DocumentObjectField,
} from "@azure/ai-form-recognizer";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const poller = await client.beginAnalyzeDocumentFromUrl(
  "prebuilt-receipt",
  // The Document Intelligence service will access the following URL to a receipt image and extract data from it
  "https://raw.githubusercontent.com/Azure/azure-sdk-for-js/main/sdk/formrecognizer/ai-form-recognizer/assets/receipt/contoso-receipt.png",
);
poller.onProgress((state) => console.log("Operation:", state.modelId, state.status));

const { documents } = await poller.pollUntilDone();

const result = documents && documents[0];
if (result) {
  const receipt = result.fields;
  console.log("=== Receipt Information ===");
  console.log("Type:", result.docType);
  console.log("Merchant:", (receipt["MerchantName"] as DocumentStringField).value);

  console.log("Items:");
  for (const { properties: item } of ((receipt["Items"] as DocumentArrayField).values ||
    []) as DocumentObjectField[]) {
    console.log("- Description:", (item["Description"] as DocumentStringField).value);
    console.log("  Total Price:", (item["TotalPrice"] as DocumentStringField).value);
  }
} else {
  throw new Error("Expected at least one receipt in the result.");
}
beginAnalyzeDocumentFromUrl<Result>(DocumentModel<Result>, string, AnalyzeDocumentOptions<Result>)

Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model that is used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Examples

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download a file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { PrebuiltReceiptModel } from "../samples-dev/prebuilt/prebuilt-receipt.js";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const poller = await client.beginAnalyzeDocumentFromUrl(
  PrebuiltReceiptModel,
  // The Document Intelligence service will access the following URL to a receipt image and extract data from it
  "https://raw.githubusercontent.com/Azure/azure-sdk-for-js/main/sdk/formrecognizer/ai-form-recognizer/assets/receipt/contoso-receipt.png",
);

const {
  documents: [document],
} = await poller.pollUntilDone();

// Use of PrebuiltModels.Receipt above (rather than the raw model ID), as it adds strong typing of the model's output
if (document) {
  const { merchantName, items, total } = document.fields;

  console.log("=== Receipt Information ===");
  console.log("Type:", document.docType);
  console.log("Merchant:", merchantName && merchantName.value);

  console.log("Items:");
  for (const item of (items && items.values) || []) {
    const { description, totalPrice } = item.properties;

    console.log("- Description:", description && description.value);
    console.log("  Total Price:", totalPrice && totalPrice.value);
  }

  console.log("Total:", total && total.value);
} else {
  throw new Error("Expected at least one receipt in the result.");
}
beginClassifyDocument(string, FormRecognizerRequestBody, ClassifyDocumentOptions)

Classify a document using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type as beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will only contain a small subset of its fields. Only the documents field and pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType that they were classified as.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.JS ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { createReadStream } from "node:fs";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const path = "<path to a document>";
const readStream = createReadStream(path);

const poller = await client.beginClassifyDocument("<classifier id>", readStream);

const result = await poller.pollUntilDone();

if (result?.documents?.length === 0) {
  throw new Error("Failed to extract any documents.");
}

for (const document of result.documents) {
  console.log(
    `Extracted a document with type '${document.docType}' on page ${document.boundingRegions?.[0].pageNumber} (confidence: ${document.confidence})`,
  );
}
beginClassifyDocumentFromUrl(string, string, ClassifyDocumentOptions)

Classify a document from a URL using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type as beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will only contain a small subset of its fields. Only the documents field and pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType that they were classified as.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download a file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const documentUrl =
  "https://raw.githubusercontent.com/Azure/azure-sdk-for-js/main/sdk/formrecognizer/ai-form-recognizer/assets/invoice/Invoice_1.pdf";

const poller = await client.beginClassifyDocumentFromUrl("<classifier id>", documentUrl);

const result = await poller.pollUntilDone();

if (result?.documents?.length === 0) {
  throw new Error("Failed to extract any documents.");
}

for (const document of result.documents) {
  console.log(
    `Extracted a document with type '${document.docType}' on page ${document.boundingRegions?.[0].pageNumber} (confidence: ${document.confidence})`,
  );
}

Constructor Details

DocumentAnalysisClient(string, KeyCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and a static API key (KeyCredential),

Example:

import { AzureKeyCredential, DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new AzureKeyCredential("<API key>");
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);
new DocumentAnalysisClient(endpoint: string, credential: KeyCredential, options?: DocumentAnalysisClientOptions)

Parameters

endpoint

string

the endpoint URL of an Azure Cognitive Services instance

credential
KeyCredential

a KeyCredential containing the Cognitive Services instance subscription key

options
DocumentAnalysisClientOptions

optional settings for configuring all methods in the client

DocumentAnalysisClient(string, TokenCredential, DocumentAnalysisClientOptions)

Create a DocumentAnalysisClient instance from a resource endpoint and a an Azure Identity TokenCredential.

See the @azure/identity package for more information about authenticating with Azure Active Directory.

Example:

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);
new DocumentAnalysisClient(endpoint: string, credential: TokenCredential, options?: DocumentAnalysisClientOptions)

Parameters

endpoint

string

the endpoint URL of an Azure Cognitive Services instance

credential
TokenCredential

a TokenCredential instance from the @azure/identity package

options
DocumentAnalysisClientOptions

optional settings for configuring all methods in the client

Method Details

beginAnalyzeDocument(string, FormRecognizerRequestBody, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice", or to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model that is used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Examples

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.JS ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { createReadStream } from "node:fs";
import { PrebuiltReceiptModel } from "../samples-dev/prebuilt/prebuilt-receipt.js";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const path = "<path to a document>";
const readStream = createReadStream(path);

// The PrebuiltReceiptModel `DocumentModel` instance encodes both the model ID and a stronger return type for the operation
const poller = await client.beginAnalyzeDocument(PrebuiltReceiptModel, readStream, {
  onProgress: ({ status }) => {
    console.log(`status: ${status}`);
  },
});

const {
  documents: [receiptDocument],
} = await poller.pollUntilDone();

// The fields of the document constitute the extracted receipt data.
const receipt = receiptDocument.fields;

if (receipt === undefined) {
  throw new Error("Expected at least one receipt in analysis result.");
}

console.log(`Receipt data (${receiptDocument.docType})`);
console.log("  Merchant Name:", receipt.merchantName?.value);

// The items of the receipt are an example of a `DocumentArrayValue`
if (receipt.items !== undefined) {
  console.log("Items:");
  for (const { properties: item } of receipt.items.values) {
    console.log("- Description:", item.description?.value);
    console.log("  Total Price:", item.totalPrice?.value);
  }
}

console.log("  Total:", receipt.total?.value);
function beginAnalyzeDocument(modelId: string, document: FormRecognizerRequestBody, options?: AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

modelId

string

the unique ID (name) of the model within this client's resource

document
FormRecognizerRequestBody

a FormRecognizerRequestBody that will be uploaded with the request

options

AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>

optional settings for the analysis operation and poller

Returns

a long-running operation (poller) that will eventually produce an AnalyzeResult

beginAnalyzeDocument<Result>(DocumentModel<Result>, FormRecognizerRequestBody, AnalyzeDocumentOptions<Result>)

Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model that is used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Examples

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.JS ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

If the input provided is a string, it will be treated as a URL to the ___location of a document to be analyzed. See the beginAnalyzeDocumentFromUrl method for more information. Use of that method is preferred when using URLs, and URL support is only provided in this method for backwards compatibility.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { createReadStream } from "node:fs";
import { PrebuiltReceiptModel } from "../samples-dev/prebuilt/prebuilt-receipt.js";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const path = "<path to a document>";
const readStream = createReadStream(path);

// The PrebuiltReceiptModel `DocumentModel` instance encodes both the model ID and a stronger return type for the operation
const poller = await client.beginAnalyzeDocument(PrebuiltReceiptModel, readStream, {
  onProgress: ({ status }) => {
    console.log(`status: ${status}`);
  },
});

const {
  documents: [receiptDocument],
} = await poller.pollUntilDone();

// The fields of the document constitute the extracted receipt data.
const receipt = receiptDocument.fields;

if (receipt === undefined) {
  throw new Error("Expected at least one receipt in analysis result.");
}

console.log(`Receipt data (${receiptDocument.docType})`);
console.log("  Merchant Name:", receipt.merchantName?.value);

// The items of the receipt are an example of a `DocumentArrayValue`
if (receipt.items !== undefined) {
  console.log("Items:");
  for (const { properties: item } of receipt.items.values) {
    console.log("- Description:", item.description?.value);
    console.log("  Total Price:", item.totalPrice?.value);
  }
}

console.log("  Total:", receipt.total?.value);
function beginAnalyzeDocument<Result>(model: DocumentModel<Result>, document: FormRecognizerRequestBody, options?: AnalyzeDocumentOptions<Result>): Promise<AnalysisPoller<Result>>

Parameters

model

DocumentModel<Result>

a DocumentModel representing the model to use for analysis and the expected output type

document
FormRecognizerRequestBody

a FormRecognizerRequestBody that will be uploaded with the request

options

AnalyzeDocumentOptions<Result>

optional settings for the analysis operation and poller

Returns

Promise<AnalysisPoller<Result>>

a long-running operation (poller) that will eventually produce an AnalyzeResult with documents that have the result type associated with the input model

beginAnalyzeDocumentFromUrl(string, string, AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>)

Extract data from an input using a model given by its unique ID.

This operation supports custom as well as prebuilt models. For example, to use the prebuilt invoice model, provide the model ID "prebuilt-invoice", or to use the simpler prebuilt layout model, provide the model ID "prebuilt-layout".

The fields produced in the AnalyzeResult depend on the model that is used for analysis, and the values in any extracted documents' fields depend on the document types in the model (if any) and their corresponding field schemas.

Examples

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download a file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.

import { DefaultAzureCredential } from "@azure/identity";
import {
  DocumentAnalysisClient,
  DocumentStringField,
  DocumentArrayField,
  DocumentObjectField,
} from "@azure/ai-form-recognizer";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const poller = await client.beginAnalyzeDocumentFromUrl(
  "prebuilt-receipt",
  // The Document Intelligence service will access the following URL to a receipt image and extract data from it
  "https://raw.githubusercontent.com/Azure/azure-sdk-for-js/main/sdk/formrecognizer/ai-form-recognizer/assets/receipt/contoso-receipt.png",
);
poller.onProgress((state) => console.log("Operation:", state.modelId, state.status));

const { documents } = await poller.pollUntilDone();

const result = documents && documents[0];
if (result) {
  const receipt = result.fields;
  console.log("=== Receipt Information ===");
  console.log("Type:", result.docType);
  console.log("Merchant:", (receipt["MerchantName"] as DocumentStringField).value);

  console.log("Items:");
  for (const { properties: item } of ((receipt["Items"] as DocumentArrayField).values ||
    []) as DocumentObjectField[]) {
    console.log("- Description:", (item["Description"] as DocumentStringField).value);
    console.log("  Total Price:", (item["TotalPrice"] as DocumentStringField).value);
  }
} else {
  throw new Error("Expected at least one receipt in the result.");
}
function beginAnalyzeDocumentFromUrl(modelId: string, documentUrl: string, options?: AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

modelId

string

the unique ID (name) of the model within this client's resource

documentUrl

string

a URL (string) to an input document accessible from the public internet

options

AnalyzeDocumentOptions<AnalyzeResult<AnalyzedDocument>>

optional settings for the analysis operation and poller

Returns

a long-running operation (poller) that will eventually produce an AnalyzeResult

beginAnalyzeDocumentFromUrl<Result>(DocumentModel<Result>, string, AnalyzeDocumentOptions<Result>)

Extract data from an input using a model that has a known, strongly-typed document schema (a DocumentModel).

The fields produced in the AnalyzeResult depend on the model that is used for analysis. In TypeScript, the type of the result for this method overload is inferred from the type of the input DocumentModel.

Examples

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download a file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { PrebuiltReceiptModel } from "../samples-dev/prebuilt/prebuilt-receipt.js";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const poller = await client.beginAnalyzeDocumentFromUrl(
  PrebuiltReceiptModel,
  // The Document Intelligence service will access the following URL to a receipt image and extract data from it
  "https://raw.githubusercontent.com/Azure/azure-sdk-for-js/main/sdk/formrecognizer/ai-form-recognizer/assets/receipt/contoso-receipt.png",
);

const {
  documents: [document],
} = await poller.pollUntilDone();

// Use of PrebuiltModels.Receipt above (rather than the raw model ID), as it adds strong typing of the model's output
if (document) {
  const { merchantName, items, total } = document.fields;

  console.log("=== Receipt Information ===");
  console.log("Type:", document.docType);
  console.log("Merchant:", merchantName && merchantName.value);

  console.log("Items:");
  for (const item of (items && items.values) || []) {
    const { description, totalPrice } = item.properties;

    console.log("- Description:", description && description.value);
    console.log("  Total Price:", totalPrice && totalPrice.value);
  }

  console.log("Total:", total && total.value);
} else {
  throw new Error("Expected at least one receipt in the result.");
}
function beginAnalyzeDocumentFromUrl<Result>(model: DocumentModel<Result>, documentUrl: string, options?: AnalyzeDocumentOptions<Result>): Promise<AnalysisPoller<Result>>

Parameters

model

DocumentModel<Result>

a DocumentModel representing the model to use for analysis and the expected output type

documentUrl

string

a URL (string) to an input document accessible from the public internet

options

AnalyzeDocumentOptions<Result>

optional settings for the analysis operation and poller

Returns

Promise<AnalysisPoller<Result>>

a long-running operation (poller) that will eventually produce an AnalyzeResult

beginClassifyDocument(string, FormRecognizerRequestBody, ClassifyDocumentOptions)

Classify a document using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type as beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will only contain a small subset of its fields. Only the documents field and pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType that they were classified as.

Example

This method supports streamable request bodies (FormRecognizerRequestBody) such as Node.JS ReadableStream objects, browser Blobs, and ArrayBuffers. The contents of the body will be uploaded to the service for analysis.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";
import { createReadStream } from "node:fs";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const path = "<path to a document>";
const readStream = createReadStream(path);

const poller = await client.beginClassifyDocument("<classifier id>", readStream);

const result = await poller.pollUntilDone();

if (result?.documents?.length === 0) {
  throw new Error("Failed to extract any documents.");
}

for (const document of result.documents) {
  console.log(
    `Extracted a document with type '${document.docType}' on page ${document.boundingRegions?.[0].pageNumber} (confidence: ${document.confidence})`,
  );
}
function beginClassifyDocument(classifierId: string, document: FormRecognizerRequestBody, options?: ClassifyDocumentOptions): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

classifierId

string

the ID of the custom classifier to use for analysis

document
FormRecognizerRequestBody

the document to classify

options
ClassifyDocumentOptions

options for the classification operation

Returns

a long-running operation (poller) that will eventually produce an AnalyzeResult

beginClassifyDocumentFromUrl(string, string, ClassifyDocumentOptions)

Classify a document from a URL using a custom classifier given by its ID.

This method produces a long-running operation (poller) that will eventually produce an AnalyzeResult. This is the same type as beginAnalyzeDocument and beginAnalyzeDocumentFromUrl, but the result will only contain a small subset of its fields. Only the documents field and pages field will be populated, and only minimal page information will be returned. The documents field will contain information about all the identified documents and the docType that they were classified as.

Example

This method supports extracting data from a file at a given URL. The Form Recognizer service will attempt to download a file using the submitted URL, so the URL must be accessible from the public internet. For example, a SAS token can be used to grant read access to a blob in Azure Storage, and the service will use the SAS-encoded URL to request the file.

import { DefaultAzureCredential } from "@azure/identity";
import { DocumentAnalysisClient } from "@azure/ai-form-recognizer";

const credential = new DefaultAzureCredential();
const client = new DocumentAnalysisClient(
  "https://<resource name>.cognitiveservices.azure.com",
  credential,
);

const documentUrl =
  "https://raw.githubusercontent.com/Azure/azure-sdk-for-js/main/sdk/formrecognizer/ai-form-recognizer/assets/invoice/Invoice_1.pdf";

const poller = await client.beginClassifyDocumentFromUrl("<classifier id>", documentUrl);

const result = await poller.pollUntilDone();

if (result?.documents?.length === 0) {
  throw new Error("Failed to extract any documents.");
}

for (const document of result.documents) {
  console.log(
    `Extracted a document with type '${document.docType}' on page ${document.boundingRegions?.[0].pageNumber} (confidence: ${document.confidence})`,
  );
}
function beginClassifyDocumentFromUrl(classifierId: string, documentUrl: string, options?: ClassifyDocumentOptions): Promise<AnalysisPoller<AnalyzeResult<AnalyzedDocument>>>

Parameters

classifierId

string

the ID of the custom classifier to use for analysis

documentUrl

string

the URL of the document to classify

Returns