Content Understanding document solutions (preview)

Important

  • Azure AI Content Understanding is available in preview. Public preview releases provide early access to features that are in active development.
  • Features, approaches, and processes might change or have constrained capabilities before general availability (GA).
  • For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Content Understanding is a cloud-based Azure AI service that extracts content and structured fields from documents and forms. It provides a suite of APIs and an intuitive user experience.

Content Understanding enables organizations to streamline data collection and processing, enhance operational efficiency, optimize data-driven decision making, and empower innovation. With customizable analyzers, Content Understanding allows for easy extraction of content or fields from documents and forms, tailored to specific business needs.

Business use cases

Document analyzers can process complex documents in various formats and templates:

  • Contract lifecycle management: Extract key fields, clauses, and obligations from various contract types.
  • Loan and mortgage applications: Automate processing to enable quicker handling by banks, lenders, and government entities.
  • Financial services: Analyze complex documents like financial reports and asset management reports.
  • Expense management: Parse receipts and invoices from various retailers to validate expenses across different formats and templates.

Document analyzer capabilities

Screenshot of document extraction flow.

Content extraction captures both printed and handwritten text from forms and documents, delivering content that is ready to use or to adapt for further development within your organization.

Add-on capabilities

Enhance your document extraction with optional add-on features, which can incur added costs. These features can be enabled or disabled based on your needs. Currently supported add-ons include:

  • Layout: Extracts layout information such as paragraphs, sections, tables, and more.
  • Barcode: Identifies and decodes all barcodes in the documents.
  • Formula: Recognizes mathematical equations in the documents.
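
As an illustration of enabling or disabling add-ons per analyzer, the following minimal Python sketch builds a configuration dictionary with one flag per add-on. The key names (`enableLayout`, `enableBarcode`, `enableFormula`) and the `build_addon_config` helper are assumptions made for this example, not confirmed fields of the service API; consult the API reference for the actual request shape.

```python
# Hypothetical sketch: the key names below are illustrative assumptions,
# not confirmed Content Understanding API fields.
SUPPORTED_ADDONS = ("layout", "barcode", "formula")


def build_addon_config(enabled):
    """Return a dict with one boolean flag per supported add-on."""
    enabled = set(enabled)
    unknown = enabled - set(SUPPORTED_ADDONS)
    if unknown:
        raise ValueError(f"Unsupported add-ons: {sorted(unknown)}")
    # Add-ons can incur added costs, so anything not requested stays off.
    return {f"enable{name.capitalize()}": name in enabled for name in SUPPORTED_ADDONS}


config = build_addon_config(["layout", "barcode"])
```

Defaulting every flag to off keeps the cost impact explicit: an add-on is only active when it is named in the request.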

Field extraction

Field extraction retrieves structured data from forms and documents, tailored to your specific needs. For instance, you can extract customer names, billing addresses, and line items from invoices, or parties, renewal dates, and payment clauses from contracts. You can start field extraction right after defining the schema, or label more sample documents to improve extraction quality.
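
To make the schema idea concrete, here is a minimal Python sketch of what an invoice field schema might look like, along with a check of an extracted result against it. The schema layout, the field names, the sample values, and the `missing_fields` helper are all hypothetical; the service defines its own schema format.

```python
# Hypothetical schema shape for illustration only; the service's real
# schema format may differ.
invoice_schema = {
    "CustomerName": {"type": "string", "description": "Name of the billed customer"},
    "BillingAddress": {"type": "string", "description": "Address the invoice is billed to"},
    "LineItems": {
        "type": "array",
        "description": "One entry per purchased item",
        "items": {"Description": "string", "Amount": "number"},
    },
}


def missing_fields(schema, extracted):
    """List schema fields that are absent or empty in an extracted result."""
    return [name for name in schema if extracted.get(name) in (None, "", [])]


# Made-up extraction result used only to exercise the helper.
sample_result = {
    "CustomerName": "Contoso Ltd.",
    "BillingAddress": "",
    "LineItems": [{"Description": "Widget", "Amount": 12.5}],
}
print(missing_fields(invoice_schema, sample_result))  # prints ['BillingAddress']
```

A check like this can flag documents where extraction came back incomplete, which is one signal that labeling more samples might help.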

Key benefits

  • Accuracy and reliability: Ensure precise data extraction, reducing errors and boosting efficiency.
  • Scalability: Seamlessly scale out document processing to meet business demands.
  • Customizable: Adapt document analyzers to fit specific workflows.
  • Grounding source: Locate each extracted value in the source document to support human review workflows.
  • Confidence scores: Enhance automation with estimated confidence scores to maximize efficiency and minimize costs.
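
One way confidence scores and grounding can work together is sketched below in Python: fields above a threshold are accepted automatically, and the rest are routed to a human reviewer along with the page where the value was found. The `ExtractedField` type, the `route_fields` helper, the sample values, and the 0.8 threshold are assumptions for illustration, not part of the service.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # estimated confidence in [0, 1]
    page: int          # grounding: page where the value was found


def route_fields(fields, threshold=0.8):
    """Split fields into auto-accepted and needs-review lists."""
    accepted = [f for f in fields if f.confidence >= threshold]
    review = [f for f in fields if f.confidence < threshold]
    return accepted, review


# Made-up sample data; the 0.8 default threshold is an arbitrary choice.
fields = [
    ExtractedField("CustomerName", "Contoso Ltd.", 0.97, page=1),
    ExtractedField("RenewalDate", "2025-01-31", 0.62, page=3),
]
accepted, review = route_fields(fields)
for field in review:
    print(f"Review {field.name} on page {field.page}")
```

The threshold is a trade-off: raising it sends more fields to review and lowers the risk of accepting a wrong value, while lowering it increases automation.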

Input requirements

For detailed information on supported input document formats, refer to our Service quotas and limits page.

Supported languages and regions

For a detailed list of supported languages and regions, visit our Language and region support page.

Data privacy and security

Developers using Content Understanding should review Microsoft's policies on customer data. For more information, visit our Data, protection, and privacy page.
