FormTrainingClient Class
FormTrainingClient is the Form Recognizer interface to use for creating and managing custom models. It provides methods for training models on the forms you provide, as well as methods for viewing and deleting models, accessing account properties, copying models to another Form Recognizer resource, and composing models from a collection of existing models trained with labels.
Note
FormTrainingClient should be used with API versions <=v2.1.
To use API versions 2022-08-31 and up, instantiate a DocumentModelAdministrationClient.
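The version split in the note can be expressed as a small lookup. This is a minimal sketch, not part of the SDK: `client_class_name` is a hypothetical helper, and it assumes legacy versions are written as "2.0" or "2.1" (optionally prefixed with "v") while newer versions are date strings such as "2022-08-31".

```python
import re

# Hypothetical helper (not part of azure-ai-formrecognizer): maps an API
# version string to the name of the client class recommended in the note.
LEGACY_VERSIONS = {"2.0", "2.1"}  # assumption: the versions covered by "<=v2.1"
DATE_VERSION = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def client_class_name(api_version: str) -> str:
    """Return the name of the administration client class for an API version."""
    version = api_version.strip().lower().lstrip("v")
    if version in LEGACY_VERSIONS:
        return "FormTrainingClient"
    # ISO-style dates compare correctly as plain strings.
    if DATE_VERSION.match(version) and version >= "2022-08-31":
        return "DocumentModelAdministrationClient"
    raise ValueError(f"Unrecognized API version: {api_version!r}")


print(client_class_name("v2.1"))        # FormTrainingClient
print(client_class_name("2022-08-31"))  # DocumentModelAdministrationClient
```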
Inheritance
azure.ai.formrecognizer._form_base_client.FormRecognizerClientBase
FormTrainingClient
Constructor
FormTrainingClient(endpoint: str, credential: AzureKeyCredential | TokenCredential, **kwargs: Any)
Parameters
Name | Description |
---|---|
endpoint (Required) | Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus2.api.cognitive.microsoft.com). |
credential (Required) | Credentials needed for the client to connect to Azure. This is an instance of AzureKeyCredential if using an API key, or a token credential from azure.identity. |
Keyword-Only Parameters
Name | Description |
---|---|
api_version | The API version of the service to use for requests. It defaults to API version v2.1. Setting to an older version may result in reduced feature compatibility. To use the latest supported API version and features, instantiate a DocumentModelAdministrationClient instead. |
Examples
Creating the FormTrainingClient with an endpoint and API key.
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import FormTrainingClient
endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]
form_training_client = FormTrainingClient(endpoint, AzureKeyCredential(key))
Creating the FormTrainingClient with a token credential.
"""DefaultAzureCredential will use the values from these environment
variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET
"""
import os
from azure.ai.formrecognizer import FormTrainingClient
from azure.identity import DefaultAzureCredential
endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
credential = DefaultAzureCredential()
form_training_client = FormTrainingClient(endpoint, credential)
Methods
begin_copy_model |
Copy a custom model stored in this resource (the source) to the user specified target Form Recognizer resource. This should be called with the source Form Recognizer resource (with the model that is intended to be copied). The target parameter should be supplied from the target resource's output from calling the get_copy_authorization method. |
begin_create_composed_model |
Creates a composed model from a collection of existing models that were trained with labels. A composed model allows multiple models to be called with a single model ID. When a document is submitted to be analyzed with a composed model ID, a classification step is first performed to route it to the correct custom model. New in version v2.1: The begin_create_composed_model client method |
begin_training |
Create and train a custom model. The request must include a training_files_url parameter that is an externally accessible Azure storage blob container URI (preferably a Shared Access Signature URI). A container URI without a SAS is accepted only when the container is public or has a managed identity configured. For more about configuring managed identities to work with Form Recognizer, see https://docs.microsoft.com/azure/applied-ai-services/form-recognizer/managed-identities. Models are trained using documents of the following content types: 'application/pdf', 'image/jpeg', 'image/png', 'image/tiff', or 'image/bmp'. Other types of content in the container are ignored. New in version v2.1: The model_name keyword argument |
close |
Close the FormTrainingClient session. |
delete_model |
Mark model for deletion. Model artifacts will be permanently removed within a predetermined period. |
get_account_properties |
Get information about the models on the Form Recognizer account. |
get_copy_authorization |
Generate authorization for copying a custom model into the target Form Recognizer resource. This should be called by the target resource (where the model will be copied to) and the output can be passed as the target parameter into begin_copy_model. |
get_custom_model |
Get a description of a custom model, including the types of forms it can recognize, and the fields it will extract for each form type. |
get_form_recognizer_client |
Get an instance of a FormRecognizerClient from FormTrainingClient. |
list_custom_models |
List information for each model, including model id, model status, and when it was created and last modified. |
send_request |
Runs a network request using the client's existing pipeline. The request URL can be relative to the base URL. The service API version used for the request is the same as the client's unless otherwise specified. Overriding the client's configured API version in a relative URL is supported on clients with API version 2022-08-31 and later. Overriding in an absolute URL is supported on clients with any API version. This method does not raise if the response is an error; to raise an exception, call raise_for_status() on the returned response object. For more information about how to send custom requests with this method, see https://aka.ms/azsdk/dpcodegen/python/send_request. |
begin_copy_model
Copy a custom model stored in this resource (the source) to the user specified target Form Recognizer resource. This should be called with the source Form Recognizer resource (with the model that is intended to be copied). The target parameter should be supplied from the target resource's output from calling the get_copy_authorization method.
begin_copy_model(model_id: str, target: Dict[str, str | int], **kwargs: Any) -> LROPoller[CustomFormModelInfo]
Parameters
Name | Description |
---|---|
model_id (Required) | Model identifier of the model to copy to target resource. |
target (Required) | The copy authorization generated from the target resource's call to get_copy_authorization. |
Keyword-Only Parameters
Name | Description |
---|---|
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
An instance of an LROPoller. Call result() on the poller object to return a CustomFormModelInfo. |
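The continuation_token keyword lets a copy that is already in progress be picked up again later, for example from another process. A minimal sketch under stated assumptions: `save_poller_state` and `resume_copy` are hypothetical helper names, the token comes from the poller's `continuation_token()` method (provided by azure-core's LROPoller), and the required positional arguments are passed again when resuming because the signature still requires them.

```python
def save_poller_state(poller) -> str:
    """Return an opaque token that captures the poller's current state."""
    return poller.continuation_token()


def resume_copy(client, model_id, target, token):
    """Rebuild a poller for a copy operation that was started earlier.

    `client` is expected to be a FormTrainingClient for the source resource.
    """
    return client.begin_copy_model(model_id, target, continuation_token=token)
```

The token is a plain string, so it can be stored wherever is convenient and handed to `resume_copy` later; calling `result()` on the returned poller then waits for the original operation.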
Exceptions
Type | Description |
---|---|
Examples
Copy a model from the source resource to the target resource
source_client = FormTrainingClient(endpoint=source_endpoint, credential=AzureKeyCredential(source_key))
poller = source_client.begin_copy_model(
    model_id=source_model_id,
    target=target  # output from target client's call to get_copy_authorization()
)
copied_over_model = poller.result()
print("Model ID: {}".format(copied_over_model.model_id))
print("Status: {}".format(copied_over_model.status))
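The copy involves two resources, so it can help to see both halves in one place. The following sketch wraps the two calls described in this reference; `copy_model_between_resources` is a hypothetical helper name, and both client arguments are assumed to be FormTrainingClient instances (one per resource).

```python
def copy_model_between_resources(source_client, target_client, model_id,
                                 target_resource_id, target_region):
    """Copy a custom model from the source resource to the target resource."""
    # Step 1: the *target* resource issues the copy authorization.
    target = target_client.get_copy_authorization(
        resource_id=target_resource_id,
        resource_region=target_region,
    )
    # Step 2: the *source* resource starts the copy; result() waits for it.
    poller = source_client.begin_copy_model(model_id=model_id, target=target)
    return poller.result()
```

The returned object is whatever `poller.result()` yields, which this reference documents as a CustomFormModelInfo.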
begin_create_composed_model
Creates a composed model from a collection of existing models that were trained with labels.
A composed model allows multiple models to be called with a single model ID. When a document is submitted to be analyzed with a composed model ID, a classification step is first performed to route it to the correct custom model.
New in version v2.1: The begin_create_composed_model client method
begin_create_composed_model(model_ids: List[str], **kwargs: Any) -> LROPoller[CustomFormModel]
Parameters
Name | Description |
---|---|
model_ids (Required) | List of model IDs to use in the composed model. |
Keyword-Only Parameters
Name | Description |
---|---|
model_name | An optional, user-defined name to associate with your model. |
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
An instance of an LROPoller. Call result() on the poller object to return a CustomFormModel. |
Exceptions
Type | Description |
---|---|
Examples
Create a composed model
import os
from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import FormTrainingClient
endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]
po_supplies = os.environ['PURCHASE_ORDER_OFFICE_SUPPLIES_SAS_URL_V2']
po_equipment = os.environ['PURCHASE_ORDER_OFFICE_EQUIPMENT_SAS_URL_V2']
po_furniture = os.environ['PURCHASE_ORDER_OFFICE_FURNITURE_SAS_URL_V2']
po_cleaning_supplies = os.environ['PURCHASE_ORDER_OFFICE_CLEANING_SUPPLIES_SAS_URL_V2']
form_training_client = FormTrainingClient(endpoint=endpoint, credential=AzureKeyCredential(key))
# Start all four training operations before waiting on any of them
supplies_poller = form_training_client.begin_training(
    po_supplies, use_training_labels=True, model_name="Purchase order - Office supplies"
)
equipment_poller = form_training_client.begin_training(
    po_equipment, use_training_labels=True, model_name="Purchase order - Office Equipment"
)
furniture_poller = form_training_client.begin_training(
    po_furniture, use_training_labels=True, model_name="Purchase order - Furniture"
)
cleaning_supplies_poller = form_training_client.begin_training(
    po_cleaning_supplies, use_training_labels=True, model_name="Purchase order - Cleaning Supplies"
)
supplies_model = supplies_poller.result()
equipment_model = equipment_poller.result()
furniture_model = furniture_poller.result()
cleaning_supplies_model = cleaning_supplies_poller.result()
models_trained_with_labels = [
    supplies_model.model_id,
    equipment_model.model_id,
    furniture_model.model_id,
    cleaning_supplies_model.model_id
]
poller = form_training_client.begin_create_composed_model(
    models_trained_with_labels, model_name="Office Supplies Composed Model"
)
model = poller.result()
print("Office Supplies Composed Model Info:")
print("Model ID: {}".format(model.model_id))
print("Model name: {}".format(model.model_name))
print("Is this a composed model?: {}".format(model.properties.is_composed_model))
print("Status: {}".format(model.status))
print("Composed model creation started on: {}".format(model.training_started_on))
print("Creation completed on: {}".format(model.training_completed_on))
begin_training
Create and train a custom model. The request must include a training_files_url parameter that is an externally accessible Azure storage blob container URI (preferably a Shared Access Signature URI). A container URI without a SAS is accepted only when the container is public or has a managed identity configured. For more about configuring managed identities to work with Form Recognizer, see https://docs.microsoft.com/azure/applied-ai-services/form-recognizer/managed-identities. Models are trained using documents of the following content types: 'application/pdf', 'image/jpeg', 'image/png', 'image/tiff', or 'image/bmp'. Other types of content in the container are ignored.
New in version v2.1: The model_name keyword argument
begin_training(training_files_url: str, use_training_labels: bool, **kwargs: Any) -> LROPoller[CustomFormModel]
Parameters
Name | Description |
---|---|
training_files_url (Required) | An Azure Storage blob container's SAS URI. A container URI (without SAS) can be used if the container is public or has a managed identity configured. For more information on setting up a training data set, see: https://aka.ms/azsdk/formrecognizer/buildtrainingset. |
use_training_labels (Required) | Whether to train with labels or not. Corresponding labeled files must exist in the blob container if set to True. |
Keyword-Only Parameters
Name | Description |
---|---|
prefix | A case-sensitive prefix string to filter documents in the source path for training. For example, when using an Azure storage blob URI, use the prefix to restrict sub folders for training. |
include_subfolders | A flag to indicate if subfolders within the set of prefix folders will also need to be included when searching for content to be preprocessed. Not supported if training with labels. |
model_name | An optional, user-defined name to associate with your model. |
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
An instance of an LROPoller. Call result() on the poller object to return a CustomFormModel. |
Exceptions
Type | Description |
---|---|
Note that if the training fails, the exception is raised, but a model with an "invalid" status is still created. You can delete this model by calling |
|
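Because a failed training run can leave behind a model with an "invalid" status, a periodic cleanup pass may be useful. The sketch below is one possible approach, not an SDK feature: `delete_invalid_models` is a hypothetical helper that relies only on list_custom_models and delete_model as described in this reference, and `client` is assumed to be a FormTrainingClient.

```python
def delete_invalid_models(client) -> list:
    """Delete every model whose status is "invalid" and return their IDs."""
    # Collect the IDs first so that nothing is deleted while the paged
    # listing is still being iterated.
    invalid_ids = [
        model_info.model_id
        for model_info in client.list_custom_models()
        if model_info.status == "invalid"
    ]
    for model_id in invalid_ids:
        client.delete_model(model_id=model_id)
    return invalid_ids
```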
Examples
Training a model (without labels) with your custom forms.
import os
from azure.ai.formrecognizer import FormTrainingClient
from azure.core.credentials import AzureKeyCredential
endpoint = os.environ["AZURE_FORM_RECOGNIZER_ENDPOINT"]
key = os.environ["AZURE_FORM_RECOGNIZER_KEY"]
container_sas_url = os.environ["CONTAINER_SAS_URL_V2"]
form_training_client = FormTrainingClient(endpoint, AzureKeyCredential(key))
poller = form_training_client.begin_training(container_sas_url, use_training_labels=False)
model = poller.result()
# Custom model information
print("Model ID: {}".format(model.model_id))
print("Status: {}".format(model.status))
print("Model name: {}".format(model.model_name))
print("Training started on: {}".format(model.training_started_on))
print("Training completed on: {}".format(model.training_completed_on))
print("Recognized fields:")
# Loop through the submodels, which contain the fields they were trained on
for submodel in model.submodels:
    print("...The submodel has form type '{}'".format(submodel.form_type))
    for name, field in submodel.fields.items():
        print("...The model found field '{}' to have label '{}'".format(
            name, field.label
        ))
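Since begin_training only uses documents of the listed content types, it can be handy to check a container listing before training. This is a rough local approximation only: `partition_training_blobs` is a hypothetical helper, and the mapping from file extensions to content types is an assumption (the service is documented in terms of content types, not file names). Auxiliary files used when training with labels are outside the scope of this sketch.

```python
from pathlib import PurePosixPath

# Assumed extension-to-content-type map for the content types listed
# in the begin_training description.
SUPPORTED_TRAINING_TYPES = {
    ".pdf": "application/pdf",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".bmp": "image/bmp",
}


def partition_training_blobs(blob_names):
    """Split blob names into (likely used, likely ignored) by extension."""
    likely_used, likely_ignored = [], []
    for name in blob_names:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in SUPPORTED_TRAINING_TYPES:
            likely_used.append(name)
        else:
            likely_ignored.append(name)
    return likely_used, likely_ignored


used, ignored = partition_training_blobs(["forms/a.PDF", "forms/b.png", "notes.txt"])
print(used)     # ['forms/a.PDF', 'forms/b.png']
print(ignored)  # ['notes.txt']
```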
close
Close the FormTrainingClient session.
close() -> None
Exceptions
Type | Description |
---|---|
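To make sure close() is called even when an operation fails, the client can be wrapped with the standard library's contextlib.closing, which only requires an object with a close() method. A minimal sketch; `run_and_close` is a hypothetical helper name and `action` is any callable that takes the client.

```python
from contextlib import closing


def run_and_close(client, action):
    """Run action(client), then close the client, even if action raises."""
    with closing(client):
        return action(client)
```

For example, `run_and_close(form_training_client, lambda c: c.get_account_properties())` would fetch the account properties and then close the session.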
delete_model
Mark model for deletion. Model artifacts will be permanently removed within a predetermined period.
delete_model(model_id: str, **kwargs: Any) -> None
Parameters
Name | Description |
---|---|
model_id (Required) | Model identifier. |
Keyword-Only Parameters
Name | Description |
---|---|
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
Exceptions
Type | Description |
---|---|
Examples
Delete a custom model.
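from azure.core.exceptions import ResourceNotFoundError
form_training_client.delete_model(model_id=custom_model.model_id)
# Fetching the deleted model should now fail with ResourceNotFoundError
try:
    form_training_client.get_custom_model(model_id=custom_model.model_id)
except ResourceNotFoundError:
    print("Successfully deleted model with id {}".format(custom_model.model_id))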
get_account_properties
Get information about the models on the Form Recognizer account.
get_account_properties(**kwargs: Any) -> AccountProperties
Keyword-Only Parameters
Name | Description |
---|---|
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
Summary of models on account - custom model count, custom model limit. |
Exceptions
Type | Description |
---|---|
Examples
Get properties for the Form Recognizer account.
form_training_client = FormTrainingClient(endpoint=endpoint, credential=AzureKeyCredential(key))
# First, we see how many custom models we have, and what our limit is
account_properties = form_training_client.get_account_properties()
print("Our account has {} custom models, and we can have at most {} custom models\n".format(
    account_properties.custom_model_count, account_properties.custom_model_limit
))
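The two values returned by get_account_properties can be combined to check how much room is left before training more models. A small sketch: `remaining_model_capacity` is a hypothetical helper, and the sample object is a stand-in with made-up numbers, used only so the snippet runs without a service call.

```python
from types import SimpleNamespace


def remaining_model_capacity(account_properties) -> int:
    """Return how many more custom models the account can hold."""
    remaining = (
        account_properties.custom_model_limit
        - account_properties.custom_model_count
    )
    return max(remaining, 0)


# Stand-in for the object returned by get_account_properties();
# the numbers are illustrative, not real service limits.
sample = SimpleNamespace(custom_model_count=3, custom_model_limit=250)
print(remaining_model_capacity(sample))  # 247
```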
get_copy_authorization
Generate authorization for copying a custom model into the target Form Recognizer resource. This should be called by the target resource (where the model will be copied to) and the output can be passed as the target parameter into begin_copy_model.
get_copy_authorization(resource_id: str, resource_region: str, **kwargs: Any) -> Dict[str, str | int]
Parameters
Name | Description |
---|---|
resource_id (Required) | Azure Resource Id of the target Form Recognizer resource where the model will be copied to. |
resource_region (Required) | Location of the target Form Recognizer resource. A valid Azure region name supported by Cognitive Services. For example, 'westus', 'eastus' etc. See https://azure.microsoft.com/global-infrastructure/services/?products=cognitive-services for the regional availability of Cognitive Services. |
Keyword-Only Parameters
Name | Description |
---|---|
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
A dictionary with values for the copy authorization - "modelId", "accessToken", "resourceId", "resourceRegion", and "expirationDateTimeTicks". |
Exceptions
Type | Description |
---|---|
Examples
Authorize the target resource to receive the copied model
target_client = FormTrainingClient(endpoint=target_endpoint, credential=AzureKeyCredential(target_key))
target = target_client.get_copy_authorization(
    resource_region=target_region,
    resource_id=target_resource_id
)
# model ID that target client will use to access the model once copy is complete
print("Model ID: {}".format(target["modelId"]))
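When the authorization dictionary is passed between processes (for example, serialized and sent to whoever owns the source resource), a quick sanity check of its keys can catch truncated payloads early. A minimal sketch based on the keys listed under Returns; `missing_copy_authorization_keys` is a hypothetical helper and performs no validation of the values themselves.

```python
# Keys listed in the Returns description of get_copy_authorization.
COPY_AUTHORIZATION_KEYS = (
    "modelId",
    "accessToken",
    "resourceId",
    "resourceRegion",
    "expirationDateTimeTicks",
)


def missing_copy_authorization_keys(target: dict) -> list:
    """Return the expected keys that are absent from a copy authorization."""
    return [key for key in COPY_AUTHORIZATION_KEYS if key not in target]


print(missing_copy_authorization_keys({"modelId": "abc", "accessToken": "..."}))
# ['resourceId', 'resourceRegion', 'expirationDateTimeTicks']
```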
get_custom_model
Get a description of a custom model, including the types of forms it can recognize, and the fields it will extract for each form type.
get_custom_model(model_id: str, **kwargs: Any) -> CustomFormModel
Parameters
Name | Description |
---|---|
model_id (Required) | Model identifier. |
Keyword-Only Parameters
Name | Description |
---|---|
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
CustomFormModel |
Exceptions
Type | Description |
---|---|
Examples
Get a custom model with a model ID.
custom_model = form_training_client.get_custom_model(model_id=model.model_id)
print("\nModel ID: {}".format(custom_model.model_id))
print("Status: {}".format(custom_model.status))
print("Model name: {}".format(custom_model.model_name))
print("Is this a composed model?: {}".format(custom_model.properties.is_composed_model))
print("Training started on: {}".format(custom_model.training_started_on))
print("Training completed on: {}".format(custom_model.training_completed_on))
get_form_recognizer_client
Get an instance of a FormRecognizerClient from FormTrainingClient.
get_form_recognizer_client(**kwargs: Any) -> FormRecognizerClient
Keyword-Only Parameters
Name | Description |
---|---|
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
A FormRecognizerClient |
Exceptions
Type | Description |
---|---|
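A typical reason to call get_form_recognizer_client is to analyze a form with a model that was just trained, without constructing a second client by hand. The sketch below assumes that workflow: `recognize_with_custom_model` is a hypothetical helper, and begin_recognize_custom_forms_from_url is a FormRecognizerClient method from the same package that is not described in this section.

```python
def recognize_with_custom_model(form_training_client, model_id, form_url):
    """Analyze the form at form_url with a custom model and return the result."""
    # Derive the recognizer client from the training client.
    form_recognizer_client = form_training_client.get_form_recognizer_client()
    poller = form_recognizer_client.begin_recognize_custom_forms_from_url(
        model_id=model_id, form_url=form_url
    )
    return poller.result()
```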
list_custom_models
List information for each model, including model id, model status, and when it was created and last modified.
list_custom_models(**kwargs: Any) -> ItemPaged[CustomFormModelInfo]
Keyword-Only Parameters
Name | Description |
---|---|
continuation_token | A continuation token to restart a poller from a saved state. |
Returns
Type | Description |
---|---|
ItemPaged[CustomFormModelInfo] |
Exceptions
Type | Description |
---|---|
Examples
List model information for each model on the account.
custom_models = form_training_client.list_custom_models()
print("We have models with the following IDs:")
for model_info in custom_models:
    print(model_info.model_id)
send_request
Runs a network request using the client's existing pipeline.
The request URL can be relative to the base URL. The service API version used for the request is the same as the client's unless otherwise specified. Overriding the client's configured API version in a relative URL is supported on clients with API version 2022-08-31 and later. Overriding in an absolute URL is supported on clients with any API version. This method does not raise if the response is an error; to raise an exception, call raise_for_status() on the returned response object. For more information about how to send custom requests with this method, see https://aka.ms/azsdk/dpcodegen/python/send_request.
send_request(request: HttpRequest, *, stream: bool = False, **kwargs) -> HttpResponse
Parameters
Name | Description |
---|---|
request (Required) | The network request you want to make. |
Keyword-Only Parameters
Name | Description |
---|---|
stream | Whether the response payload will be streamed. Defaults to False. |
Returns
Type | Description |
---|---|
The response of your network call. Does not do error handling on your response. |
Exceptions
Type | Description |
---|---|
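Because send_request does not raise on error responses, callers who want exceptions need to opt in explicitly. A minimal sketch of that pattern: `send_checked` is a hypothetical wrapper, `client` is assumed to be a FormTrainingClient, and `request` is assumed to be an HttpRequest built by the caller.

```python
def send_checked(client, request, **kwargs):
    """Send a request through the client's pipeline and raise on error."""
    response = client.send_request(request, **kwargs)
    # send_request itself does not raise for error responses,
    # so the status check is done explicitly here.
    response.raise_for_status()
    return response
```

Keyword arguments such as `stream=True` are forwarded unchanged to send_request.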