Get started: Document Intelligence Studio

This content applies to: v4.0 (preview) | Previous versions: v3.1 (GA), v3.0 (GA)

Document Intelligence Studio is an online tool for visually exploring, understanding, and integrating features from the Document Intelligence service into your applications. You can get started by exploring the pretrained models with sample documents or your own. You can also create projects to build custom template models and reference the models in your applications using the Python SDK and other quickstarts.

Prerequisites for new users

To use Document Intelligence Studio, you need the following assets and settings:

Tip

Create an Azure AI services resource if you plan to access multiple Azure AI services under a single endpoint/key. For Document Intelligence access only, create a Document Intelligence resource. If you intend to use Microsoft Entra authentication, you need a single-service resource.

Document Intelligence now supports Microsoft Entra ID (formerly Azure Active Directory, AAD) token authentication in addition to local (key-based) authentication when accessing Document Intelligence resources and storage accounts. Follow the instructions below to set up the correct access roles, especially if the DisableLocalAuth policy is applied to your resources.

  • Properly scoped Azure role assignments. For document analysis and prebuilt models, the following role assignments are required for different scenarios.

    • Basic ✔️ Cognitive Services User: you need this role on the Document Intelligence or Azure AI services resource to open the analyze page.

    • Advanced ✔️ Contributor: you need this role to create a resource group, Document Intelligence service, or Azure AI services resource.

      For more information on authorization, see Document Intelligence Studio authorization policies.

      Note

      If local (key-based) authentication is disabled for your Document Intelligence service resource, be sure to obtain the Cognitive Services User role so that your Microsoft Entra (AAD) token is used to authenticate requests in Document Intelligence Studio. The Contributor role only allows you to list keys; it doesn't give you permission to use the resource when key access is disabled.

  • Once your resource is configured, you can try the different models offered by Document Intelligence Studio. From the front page, select any Document Intelligence model to try it with a no-code approach.

  • To test any of the document analysis or prebuilt models, select the model and use one of the sample documents or upload your own document to analyze. The analysis result is displayed at the right in the content-result-code window.

  • Custom models need to be trained on your documents. See custom models overview for an overview of custom models.

Authentication

Navigate to the Document Intelligence Studio. The first time you sign in, a pop-up window prompts you to configure your service resource. Depending on your organization's policy, you have one or two options:

  • Microsoft Entra authentication: access by Resource (recommended).

    • Choose your existing subscription.

    • Select an existing resource group within your subscription or create a new one.

    • Select your existing Document Intelligence or Azure AI services resource.

      Screenshot of configure service resource form from the Document Intelligence Studio.

  • Local authentication: access by API endpoint and key.

    • Retrieve your endpoint and key from the Azure portal.

    • Go to the overview page for your resource and select Keys and Endpoint from the left navigation bar.

    • Enter the values in the appropriate fields.

      Screenshot of the keys and endpoint page in the Azure portal.

  • After validating the scenario in the Document Intelligence Studio, use the C#, Java, JavaScript, or Python client libraries or the REST API to get started incorporating Document Intelligence models into your own applications.

To learn more about each model, see our concept pages.
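The two authentication options map to two different request headers when you call the REST API from your own code. The following is a minimal sketch, not an official client: it assumes the v3.1 (GA) REST route and API version `2023-07-31` (the route and version differ for v4.0), and the helper name `build_analyze_request`, the `contoso` endpoint, and the document URL are placeholders invented for illustration. It only builds the request and doesn't send it.

```python
import json
import urllib.request

# Assumption: v3.1 (GA) REST route and API version; adjust both for v4.0.
API_VERSION = "2023-07-31"


def build_analyze_request(endpoint, model_id, document_url, *, key=None, token=None):
    """Build (but do not send) an analyze request for a document URL.

    Hypothetical helper: pass key= for local authentication, or token= for a
    Microsoft Entra bearer token that was obtained elsewhere.
    """
    if (key is None) == (token is None):
        raise ValueError("Pass exactly one of key= or token=")

    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    headers = {"Content-Type": "application/json"}
    if key is not None:
        # Local (key-based) authentication.
        headers["Ocp-Apim-Subscription-Key"] = key
    else:
        # Microsoft Entra authentication.
        headers["Authorization"] = f"Bearer {token}"

    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_analyze_request(
    "https://contoso.cognitiveservices.azure.com",  # placeholder endpoint
    "prebuilt-layout",
    "https://example.com/sample.pdf",  # placeholder document URL
    key="<your-key>",
)
print(request.full_url)
```

If key access is disabled on the resource, only the `token=` path applies. In practice the client libraries handle these details for you.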

View resource details

To view resource details such as name and pricing tier, select the Settings icon in the top-right corner of the Document Intelligence Studio home page and select the Resource tab. If you have access to other resources, you can switch resources as well.

Added prerequisites for custom projects

In addition to the Azure account and a Document Intelligence or Azure AI services resource, you need:

Azure Blob Storage container

A standard performance Azure Blob Storage account. You create containers to store and organize your training documents within your storage account. If you don't know how to create an Azure storage account with a container, follow these quickstarts:

  • Create a storage account. When creating your storage account, make sure to select Standard performance in the Instance details → Performance field.
  • Create a container. When creating your container, set the Public access level field to Container (anonymous read access for containers and blobs) in the New Container window.

Azure role assignments

For custom projects, the following role assignments are required for different scenarios.

  • Basic

    • Cognitive Services User: You need this role on the Document Intelligence or Azure AI services resource to train the custom model or do analysis with trained models.
    • Storage Blob Data Contributor: You need this role for the Storage Account to create a project and label data.
  • Advanced

    • Storage Account Contributor: You need this role for the Storage Account to set up CORS settings (this action is a one-time effort if the same storage account is reused).
    • Contributor: You need this role to create a resource group and resources.

    Note

    If local (key-based) authentication is disabled for your Document Intelligence service resource and storage account, be sure to obtain the Cognitive Services User and Storage Blob Data Contributor roles respectively, so that you have enough permissions to use Document Intelligence Studio. The Storage Account Contributor and Contributor roles only allow you to list keys; they don't give you permission to use the resources when key access is disabled.
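The role requirements above can be summarized as a lookup from task to role. The following is a small illustrative sketch only: the task names and the `missing_roles` helper are hypothetical labels made up for this example, and the code doesn't query Azure. It shows which roles a user would still need for a set of tasks.

```python
# Roles required per task, transcribed from the list above.
# The task names are hypothetical labels chosen for this sketch.
REQUIRED_ROLES = {
    "train_or_analyze": {"Cognitive Services User"},
    "create_project_and_label": {"Storage Blob Data Contributor"},
    "configure_cors": {"Storage Account Contributor"},
    "create_resources": {"Contributor"},
}


def missing_roles(tasks, assigned_roles):
    """Return the roles still needed to perform the given tasks, sorted."""
    needed = set()
    for task in tasks:
        needed |= REQUIRED_ROLES[task]
    return sorted(needed - set(assigned_roles))


# A user who holds only Contributor can't train models or label data yet.
print(missing_roles(
    ["train_or_analyze", "create_project_and_label"],
    ["Contributor"],
))
# ['Cognitive Services User', 'Storage Blob Data Contributor']
```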

Configure CORS

CORS (Cross Origin Resource Sharing) needs to be configured on your Azure storage account for it to be accessible from the Document Intelligence Studio. To configure CORS in the Azure portal, you need access to the CORS tab of your storage account.

  1. Select the CORS tab for the storage account.

    Screenshot of the CORS setting menu in the Azure portal.

  2. Start by creating a new CORS entry in the Blob service.

  3. Set the Allowed origins to https://documentintelligence.ai.azure.com.

    Screenshot that shows CORS configuration for a storage account.

    Tip

    You can use the wildcard character '*' rather than a specified domain to allow all origin domains to make requests via CORS.

  4. Select all 8 available options for Allowed methods.

  5. Approve all Allowed headers and Exposed headers by entering an * in each field.

  6. Set the Max Age to 120 seconds or any acceptable value.

  7. To save the changes, select the save button at the top of the page.

CORS should now be configured to use the storage account from Document Intelligence Studio.
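To make the effect of these settings concrete, here's a minimal sketch of the resulting rule and how an origin and method are matched against it. The `CorsRule` class is a hypothetical stand-in written for this example, not a type from an Azure SDK, and the list of eight method names is an assumption about what the portal offers.

```python
from dataclasses import dataclass

STUDIO_ORIGIN = "https://documentintelligence.ai.azure.com"

# Assumption: the eight methods offered in the portal's Allowed methods list.
ALL_METHODS = ("DELETE", "GET", "HEAD", "MERGE", "POST", "OPTIONS", "PUT", "PATCH")


@dataclass(frozen=True)
class CorsRule:
    """Hypothetical stand-in for one Blob service CORS entry."""
    allowed_origins: tuple = (STUDIO_ORIGIN,)
    allowed_methods: tuple = ALL_METHODS
    allowed_headers: tuple = ("*",)
    exposed_headers: tuple = ("*",)
    max_age_seconds: int = 120

    def allows(self, origin, method):
        """True if a request from `origin` using `method` matches this rule."""
        origin_ok = "*" in self.allowed_origins or origin in self.allowed_origins
        return origin_ok and method.upper() in self.allowed_methods


rule = CorsRule()
print(rule.allows(STUDIO_ORIGIN, "PUT"))          # True
print(rule.allows("https://example.com", "GET"))  # False
```

Replacing the allowed origin with `*` would make the second check pass as well, which is why a specific origin is the narrower choice.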

Sample documents set

  1. Sign in to the Azure portal and navigate to Your storage account > Data storage > Containers.

    Screenshot of Data storage menu in the Azure portal.

  2. Select a container from the list.

  3. Select Upload from the menu at the top of the page.

    Screenshot of container upload button in the Azure portal.

  4. The Upload blob window appears.

  5. Select your files to upload.

    Screenshot of upload blob window in the Azure portal.

Note

By default, the Studio uses documents located at the root of your container. However, you can use data organized in folders by specifying the folder path in the custom form project creation steps. See Organize your data in subfolders.

Use Document Intelligence Studio features

Auto label documents with prebuilt models or one of your own models

  • On the custom extraction model labeling page, you can now auto label your documents using one of the Document Intelligence service prebuilt models or your own trained models.

    Animated screenshot showing auto labeling in Studio.

  • For some documents, auto labeling can produce duplicate labels. Afterward, modify the labels so that no duplicates remain on the labeling page.

    Screenshot showing duplicate label warning after auto labeling.

Auto label tables

  • On the custom extraction model labeling page, you can now auto label the tables in the document without having to label them manually.

    Animated screenshot showing auto table labeling in Studio.

Add test files directly to your training dataset

  • Once you train a custom extraction model, use the test page to improve your model quality by uploading test documents to the training dataset if needed.

  • If a low confidence score is returned for some labels, check that your content is labeled correctly. If it isn't, add the test documents to the training dataset and relabel them to improve the model quality.

    Animated screenshot showing how to add test files to training dataset.
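If you run the same test outside the Studio, you can scan the analysis result for low-confidence fields to decide which documents to add to the training dataset. The following is a minimal sketch, assuming the result is a dictionary with a `documents` list whose `fields` entries carry a `confidence` value. The `low_confidence_fields` helper, the `0.8` threshold, and the sample data are illustrative choices, not recommended values.

```python
def low_confidence_fields(analyze_result, threshold=0.8):
    """Return (document_index, field_name, confidence) below the threshold."""
    flagged = []
    for index, document in enumerate(analyze_result.get("documents", [])):
        for name, field in document.get("fields", {}).items():
            confidence = field.get("confidence")
            if confidence is not None and confidence < threshold:
                flagged.append((index, name, confidence))
    return flagged


# Made-up sample shaped like the "documents" part of an analysis result.
sample = {
    "documents": [
        {"fields": {
            "VendorName": {"content": "Contoso", "confidence": 0.97},
            "InvoiceTotal": {"content": "110.00", "confidence": 0.42},
        }}
    ]
}
print(low_confidence_fields(sample))  # [(0, 'InvoiceTotal', 0.42)]
```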

Make use of the document list options and filters in custom projects

  • Use the custom extraction model labeling page to navigate through your training documents with ease by using the search, filter, and sort-by features.

  • Utilize the grid view to preview documents or use the list view to scroll through the documents more easily.

    Screenshot of document list view options and filters.

Project sharing

Share custom extraction projects with ease. For more information, see Project sharing with custom models.

Next steps

Get started with the Document Intelligence Studio.