Quickstart: Azure AI Content Understanding REST APIs
Start using the latest preview version of the Azure AI Content Understanding REST API (2024-12-01-preview).
Azure AI Content Understanding is a new generative AI-based Azure AI Service that analyzes files of any modality (documents, images, videos, and audio) and extracts structured output in user-defined field formats.
Integrate the Content Understanding service into your workflows and applications easily by calling our REST APIs.
This quickstart guides you through using the Content Understanding REST API to create a custom analyzer and extract content and fields from your input.
Prerequisites
To get started, you need an active Azure subscription. If you don't have an Azure account, you can create a free subscription.
Once you have your Azure subscription, create an Azure AI Services resource in the Azure portal. This multi-service resource enables access to multiple Azure AI services with a single set of credentials.
This resource is listed under Azure AI services → Azure AI services in the portal.
Important
Azure provides more than one resource type named Azure AI services. Make certain that you select the one listed under Azure AI services → Azure AI services as depicted in the following image. For more information, see Create an Azure AI Services resource.
In this quickstart, we use the cURL command-line tool. If it isn't installed, download a version for your development environment.
Create a custom analyzer
To create a custom analyzer, you need to define a field schema that describes the structured data you want to extract. In the following example, we define a schema for extracting basic information from an invoice document.
First, create a JSON file named request_body.json with the following content:
{
  "description": "Sample invoice analyzer",
  "scenario": "document",
  "config": {
    "returnDetails": true
  },
  "fieldSchema": {
    "fields": {
      "VendorName": {
        "type": "string",
        "method": "extract",
        "description": "Vendor issuing the invoice"
      },
      "Items": {
        "type": "array",
        "method": "extract",
        "items": {
          "type": "object",
          "properties": {
            "Description": {
              "type": "string",
              "method": "extract",
              "description": "Description of the item"
            },
            "Amount": {
              "type": "number",
              "method": "extract",
              "description": "Amount of the item"
            }
          }
        }
      }
    }
  }
}
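If you would rather generate the request body from a script than write it by hand, the same schema can be built as a plain dictionary and serialized with Python's standard json module. The following is a minimal sketch, not part of the service tooling: the helper names field, build_invoice_analyzer, and write_request_body are hypothetical, and only the field names and values come from the JSON above.

```python
import json


def field(field_type, description):
    """Return one extractable field definition (hypothetical helper)."""
    return {"type": field_type, "method": "extract", "description": description}


def build_invoice_analyzer():
    """Build the same analyzer definition as the request_body.json shown above."""
    return {
        "description": "Sample invoice analyzer",
        "scenario": "document",
        "config": {"returnDetails": True},
        "fieldSchema": {
            "fields": {
                "VendorName": field("string", "Vendor issuing the invoice"),
                "Items": {
                    "type": "array",
                    "method": "extract",
                    "items": {
                        "type": "object",
                        "properties": {
                            "Description": field("string", "Description of the item"),
                            "Amount": field("number", "Amount of the item"),
                        },
                    },
                },
            }
        },
    }


def write_request_body(path="request_body.json"):
    """Serialize the analyzer definition to the file that cURL reads with -d @file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_invoice_analyzer(), handle, indent=2)
```

Calling write_request_body() produces a file equivalent to the one above; adding another field then only requires one more field(...) entry.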
Before running the following cURL commands, make the following changes to the HTTP request:
- Replace {endpoint} and {key} with the endpoint and key values from your Azure AI Services instance in the Azure portal.
- Replace {analyzerId} with the name of the new analyzer to create, such as myInvoice.
PUT request
curl -i -X PUT "{endpoint}/contentunderstanding/analyzers/{analyzerId}?api-version=2024-12-01-preview" \
-H "Ocp-Apim-Subscription-Key: {key}" \
-H "Content-Type: application/json" \
-d @request_body.json
PUT response
The 201 (Created) response includes an Operation-Location header containing a URL that you can use to track the status of this asynchronous creation operation.
201 Created
Operation-Location: {endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2024-12-01-preview
Upon completion, performing an HTTP GET on the URL returns "status": "succeeded".
curl -i -X GET "{endpoint}/contentunderstanding/analyzers/{analyzerId}/operations/{operationId}?api-version=2024-12-01-preview" \
-H "Ocp-Apim-Subscription-Key: {key}"
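The PUT call above can also be issued from a script. The sketch below uses only Python's standard urllib.request module and mirrors the URL and headers of the cURL command. The function names build_create_request and send are hypothetical, and the code assumes the endpoint value has the same form as the {endpoint} placeholder.

```python
import json
import urllib.request

API_VERSION = "2024-12-01-preview"


def build_create_request(endpoint, key, analyzer_id, body):
    """Build (but do not send) the PUT request that creates an analyzer."""
    url = (
        f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
        f"{analyzer_id}?api-version={API_VERSION}"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="PUT",
    )


def send(request):
    """Send a request and return (status code, Operation-Location header or None).

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(request) as response:
        return response.status, response.headers.get("Operation-Location")
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers before any network call is made.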
Analyze a file
You can analyze files using the custom analyzer you created to extract the fields defined in the schema.
Before running the cURL command, make the following changes to the HTTP request:
- Replace {endpoint} and {key} with the endpoint and key values from your Azure AI Services instance in the Azure portal.
- Replace {analyzerId} with the name of the custom analyzer created earlier.
- Replace {fileUrl} with a publicly accessible URL of the file to analyze, such as a path to an Azure Storage Blob with a shared access signature (SAS) or the sample URL https://github.com/Azure-Samples/cognitive-services-REST-api-samples/raw/master/curl/form-recognizer/rest-api/invoice.pdf.
POST request
curl -i -X POST "{endpoint}/contentunderstanding/analyzers/{analyzerId}:analyze?stringEncoding=codePoint&api-version=2024-12-01-preview" \
-H "Ocp-Apim-Subscription-Key: {key}" \
-H "Content-Type: application/json" \
-d "{\"url\":\"{fileUrl}\"}"
POST response
The 202 (Accepted) response includes an Operation-Location header containing a URL that you can use to track the status of this asynchronous analyze operation.
202 Accepted
Operation-Location: {endpoint}/contentunderstanding/analyzers/{analyzerId}/results/{resultId}?api-version=2024-12-01-preview
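When scripting this step, two details are easy to get wrong: escaping the JSON body by hand, and pulling the result ID out of the Operation-Location header. The sketch below lets json.dumps handle the escaping and reads the ID from the URL path. The function names are hypothetical, and the parser assumes the header has the .../results/{resultId}?api-version=... shape shown above.

```python
import json
import urllib.request
from urllib.parse import urlsplit

API_VERSION = "2024-12-01-preview"


def build_analyze_request(endpoint, key, analyzer_id, file_url):
    """Build (but do not send) the POST request that starts an analysis."""
    url = (
        f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
        f"{analyzer_id}:analyze?stringEncoding=codePoint&api-version={API_VERSION}"
    )
    # json.dumps escapes quotes and backslashes in the file URL for us.
    body = json.dumps({"url": file_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def result_id_from_operation_location(operation_location):
    """Return the {resultId} path segment of an Operation-Location URL."""
    segments = urlsplit(operation_location).path.rstrip("/").split("/")
    if len(segments) < 2 or segments[-2] != "results":
        raise ValueError(f"unexpected Operation-Location: {operation_location!r}")
    return segments[-1]
```

Because the Operation-Location value is already a complete URL, a script can also issue the follow-up GET against it directly instead of rebuilding the URL from the result ID.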
Get analyze result
Use the resultId from the Operation-Location header returned by the previous POST response to retrieve the result of the analysis. Before running the cURL command, make the following changes to the HTTP request:
- Replace {endpoint} and {key} with the endpoint and key values from your Azure AI Services instance in the Azure portal.
- Replace {analyzerId} with the name of the custom analyzer created earlier.
- Replace {resultId} with the resultId returned from the POST request.
GET request
curl -i -X GET "{endpoint}/contentunderstanding/analyzers/{analyzerId}/results/{resultId}?api-version=2024-12-01-preview" \
-H "Ocp-Apim-Subscription-Key: {key}"
GET response
The 200 (OK) JSON response includes a status field indicating the status of the operation. If the operation isn't complete, the value of status is running or notStarted. In such cases, you should call the API again, either manually or through a script. Wait an interval of one second or more between calls.
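The "call again through a script" step can be sketched as a small polling loop. This is one possible approach under stated assumptions: poll_until_done is a hypothetical helper, fetch_result is any function you supply that performs the GET request and returns the parsed JSON body, and status values are compared case-insensitively because the casing differs between examples in this quickstart. Any status other than running or notStarted is treated as final.

```python
import time

# The two non-final statuses named above, lowercased for comparison.
PENDING_STATUSES = {"running", "notstarted"}


def poll_until_done(fetch_result, interval_seconds=1.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_result() until its "status" is no longer pending.

    fetch_result: zero-argument callable returning the GET response body as a dict.
    sleep: injectable for testing; defaults to time.sleep.
    """
    for attempt in range(max_attempts):
        result = fetch_result()
        status = str(result.get("status", "")).lower()
        if status not in PENDING_STATUSES:
            return result
        # Wait before the next call, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"operation still pending after {max_attempts} attempts")
```

The default interval of one second matches the minimum wait recommended above; the cap of 60 attempts is an arbitrary choice for this sketch and should be tuned to the size of your files.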
Sample response
{
  "id": "bcf8c7c7-03ab-4204-b22c-2b34203ef5db",
  "status": "Succeeded",
  "result": {
    "analyzerId": "sample_invoice_analyzer",
    "apiVersion": "2024-12-01-preview",
    "createdAt": "2024-11-13T07:15:46Z",
    "warnings": [],
    "contents": [
      {
        "markdown": "CONTOSO LTD.\n\n\n# INVOICE\n\nContoso Headquarters...",
        "fields": {
          "VendorName": {
            "type": "string",
            "valueString": "CONTOSO LTD.",
            "spans": [ { "offset": 0, "length": 12 } ],
            "confidence": 0.941,
            "source": "D(1,0.5729,0.6582,2.3353,0.6582,2.3353,0.8957,0.5729,0.8957)"
          },
          "Items": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Description": {
                    "type": "string",
                    "valueString": "Consulting Services",
                    "spans": [ { "offset": 909, "length": 19 } ],
                    "confidence": 0.971,
                    "source": "D(1,2.3264,5.673,3.6413,5.673,3.6413,5.8402,2.3264,5.8402)"
                  },
                  "Amount": {
                    "type": "number",
                    "valueNumber": 60,
                    "spans": [ { "offset": 995, "length": 6 } ],
                    "confidence": 0.989,
                    "source": "D(1,7.4507,5.6684,7.9245,5.6684,7.9245,5.8323,7.4507,5.8323)"
                  }
                }
              }, ...
            ]
          }
        },
        "kind": "document",
        "startPageNumber": 1,
        "endPageNumber": 1,
        "unit": "inch",
        "pages": [
          {
            "pageNumber": 1,
            "angle": -0.0039,
            "width": 8.5,
            "height": 11,
            "spans": [ { "offset": 0, "length": 1650 } ],
            "words": [
              {
                "content": "CONTOSO",
                "span": { "offset": 0, "length": 7 },
                "confidence": 0.997,
                "source": "D(1,0.5739,0.6582,1.7446,0.6595,1.7434,0.8952,0.5729,0.8915)"
              }, ...
            ],
            "lines": [
              {
                "content": "CONTOSO LTD.",
                "source": "D(1,0.5734,0.6563,2.335,0.6601,2.3345,0.8933,0.5729,0.8895)",
                "span": { "offset": 0, "length": 12 }
              }, ...
            ]
          }
        ],
        "paragraphs": [
          {
            "content": "CONTOSO LTD.",
            "source": "D(1,0.5734,0.6563,2.335,0.6601,2.3345,0.8933,0.5729,0.8895)",
            "span": { "offset": 0, "length": 12 }
          }, ...
        ],
        "sections": [
          {
            "span": { "offset": 0, "length": 1649 },
            "elements": [ "/sections/1", "/sections/2" ]
          },
          {
            "span": { "offset": 0, "length": 12 },
            "elements": [ "/paragraphs/0" ]
          }, ...
        ],
        "tables": [
          {
            "rowCount": 2,
            "columnCount": 6,
            "cells": [
              {
                "kind": "columnHeader",
                "rowIndex": 0,
                "columnIndex": 0,
                "rowSpan": 1,
                "columnSpan": 1,
                "content": "SALESPERSON",
                "source": "D(1,0.5389,4.5514,1.7505,4.5514,1.7505,4.8364,0.5389,4.8364)",
                "span": { "offset": 512, "length": 11 },
                "elements": [ "/paragraphs/19" ]
              }, ...
            ],
            "source": "D(1,0.4885,4.5543,8.0163,4.5539,8.015,5.1207,0.4879,5.1209)",
            "span": { "offset": 495, "length": 228 }
          }, ...
        ]
      }
    ]
  }
}
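The extracted fields in the response are nested under typed keys (valueString, valueNumber, valueArray, valueObject), which is verbose to consume directly. The sketch below flattens them into plain Python values and shows how a span can be mapped back to the markdown text. The function names are hypothetical. Two assumptions go beyond the sample: that other scalar types follow the same value<Type> naming, and that span offsets index into the markdown string in Unicode code points (consistent with stringEncoding=codePoint in the POST request and with the VendorName span in the sample).

```python
def field_value(field):
    """Convert one field from the analyze result into a plain Python value."""
    kind = field.get("type")
    if not kind:
        return None
    if kind == "array":
        return [field_value(item) for item in field.get("valueArray", [])]
    if kind == "object":
        return {
            name: field_value(child)
            for name, child in field.get("valueObject", {}).items()
        }
    # Assumption: scalar values live under "value" + capitalized type name,
    # as with valueString and valueNumber in the sample response.
    return field.get("value" + kind[:1].upper() + kind[1:])


def extract_fields(response):
    """Return one {field name: value} dictionary per entry in result.contents."""
    contents = response["result"]["contents"]
    return [
        {name: field_value(field) for name, field in content.get("fields", {}).items()}
        for content in contents
    ]


def span_text(markdown, span):
    """Return the slice of the markdown that a span's offset and length cover."""
    start = span["offset"]
    return markdown[start:start + span["length"]]
```

Python strings are indexed by code point, so span_text needs no conversion under that assumption; in a language with UTF-16 strings the offsets would need a different stringEncoding value or a manual conversion.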
Next steps
- In this quickstart, you learned how to call the REST API to create a custom analyzer. For a user experience, try Azure AI Foundry.