Run Analysis Custom extraction model

Pavith Vickneswararajah 0 Reputation points
2025-01-26T23:22:04.55+00:00

Hello,
When we train a custom extraction model, we can click the Run analysis button in the Document Intelligence UI to generate the OCR file for the PDF, which we can then download from the Azure Blob Storage manager.

Is it possible to generate the OCR files via code in C#? I know we can use
var operation = await client.AnalyzeDocumentFromUriAsync(WaitUntil.Completed, "prebuilt-layout", blobUri);

to get the information from the analyzed PDF, but is there a method that generates an OCR file and stores it in blob storage?

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. Azar 26,015 Reputation points MVP
    2025-01-27T06:48:00.1066667+00:00

    Hi Pavith Vickneswararajah,

    Thanks for using the Q&A platform.

    Yes, it is possible, though not in a single call. The AnalyzeDocumentFromUriAsync method returns the analyzed data for a document; it does not generate an OCR file itself. You can, however, extract the text from the analysis result and save it as a file in your blob storage. After running the analysis with prebuilt-layout, iterate through the Pages in the AnalyzeResult and collect the text from Lines or Words. Once you have the extracted text, use the Azure Blob Storage SDK to create and upload a file (e.g., a .txt file) to your container.
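
    The steps above can be sketched roughly as follows. This is a minimal illustration, not a confirmed recipe: it assumes the Azure.AI.FormRecognizer (4.x) and Azure.Storage.Blobs packages, and the environment variable names, the `Require` helper, and the output blob name `analysis-output.txt` are all hypothetical placeholders. It also assumes the source PDF URI is readable by the service (for example, through a SAS token).

    ```csharp
    // Sketch only: assumes the Azure.AI.FormRecognizer (4.x) and Azure.Storage.Blobs NuGet packages.
    using System;
    using System.Text;
    using Azure;
    using Azure.AI.FormRecognizer.DocumentAnalysis;
    using Azure.Storage.Blobs;

    // Hypothetical helper: reads a required setting from an environment variable.
    string Require(string name) =>
        Environment.GetEnvironmentVariable(name)
        ?? throw new InvalidOperationException($"Missing environment variable {name}");

    // Placeholder setting names; substitute your own configuration source.
    var client = new DocumentAnalysisClient(
        new Uri(Require("DOCINTEL_ENDPOINT")),
        new AzureKeyCredential(Require("DOCINTEL_KEY")));

    // The PDF must be reachable by the service, e.g. a blob URL with a SAS token.
    var blobUri = new Uri(Require("SOURCE_PDF_URI"));

    // Run the layout analysis and wait for it to finish.
    AnalyzeDocumentOperation operation = await client.AnalyzeDocumentFromUriAsync(
        WaitUntil.Completed, "prebuilt-layout", blobUri);
    AnalyzeResult result = operation.Value;

    // Collect the recognized text line by line, page by page.
    var text = new StringBuilder();
    foreach (DocumentPage page in result.Pages)
    {
        foreach (DocumentLine line in page.Lines)
        {
            text.AppendLine(line.Content);
        }
    }

    // Upload the collected text as a .txt blob, overwriting any existing blob of that name.
    var container = new BlobContainerClient(
        Require("STORAGE_CONNECTION_STRING"),
        Require("OUTPUT_CONTAINER"));
    BlobClient outputBlob = container.GetBlobClient("analysis-output.txt");
    await outputBlob.UploadAsync(BinaryData.FromString(text.ToString()), overwrite: true);
    ```

    Note that this produces a plain-text file, which is not necessarily the same format as the file the Run analysis button writes. If you need that exact format, compare the output against a file generated by the UI before relying on it.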

    If this helps, kindly accept the answer. Thanks very much.
