Can Azure Document Intelligence Studio be used to create a searchable PDF?

Shreyas Rastogi 225 Reputation points
2024-03-20T14:43:06.9233333+00:00

Hi,

I am looking at creating a searchable PDF using Azure Document Intelligence.

I found this article describing a possible implementation:

https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/generate-searchable-pdfs-with-azure-form-recognizer/ba-p/3652024

Is there an API endpoint in Document Intelligence that takes a PDF as input and returns a searchable PDF as output?

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

Accepted answer
  1. Konstantinos Passadis 19,376 Reputation points MVP
    2024-03-20T14:54:13.46+00:00

    Hello @Shreyas Rastogi

    Azure Document Intelligence doesn't provide a direct API endpoint for converting PDFs into searchable PDFs. The process, as described in the article, involves using Azure Form Recognizer along with a custom Python script.

    To achieve this, you need an Azure AI Search index, and you need to create a vector index to search within the PDFs:

    https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-create-index?tabs=config-2023-11-01%2Crest-2023-11-01%2Cpush%2Cportal-check-index

    --

    I hope this helps!

    Kindly mark the answer as Accepted and Upvote in case it helped!

    Regards


1 additional answer

  1. Anatoly Ponomarev 0 Reputation points Microsoft Employee
    2025-01-30T16:48:25.8166667+00:00

    Hello @Shreyas Rastogi ,

    Azure Document Intelligence now supports conversion to searchable PDFs in the latest version, 2024-11-30 (v4.0, General Availability).

    Here is the link to the documentation:
    https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/concept/add-on-capabilities?view=doc-intel-4.0.0&tabs=rest-api#searchable-pdf
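
For readers who want to call the service from code rather than the Studio, the REST flow described in the linked documentation can be sketched roughly as below. This is a minimal sketch, not an official sample: it assumes key-based authentication, the `prebuilt-read` model, and the `output=pdf` query parameter from the documentation. The helper names (`build_analyze_url`, `build_pdf_url`, `make_searchable_pdf`) and the two-second polling interval are made up for illustration, and the endpoint and key are placeholders for your own resource.

```python
import json
import time
import urllib.parse
import urllib.request

API_VERSION = "2024-11-30"  # version named in the answer above
MODEL_ID = "prebuilt-read"  # the OCR/Read model


def build_analyze_url(endpoint: str) -> str:
    """Build the URL that starts an analysis and requests searchable-PDF output."""
    query = urllib.parse.urlencode({"api-version": API_VERSION, "output": "pdf"})
    base = endpoint.rstrip("/")
    return f"{base}/documentintelligence/documentModels/{MODEL_ID}:analyze?{query}"


def build_pdf_url(operation_location: str) -> str:
    """Derive the PDF download URL by appending /pdf to the result URL path."""
    parts = urllib.parse.urlsplit(operation_location)
    pdf_path = parts.path.rstrip("/") + "/pdf"
    return urllib.parse.urlunsplit(parts._replace(path=pdf_path))


def make_searchable_pdf(endpoint: str, key: str, document: bytes) -> bytes:
    """Hypothetical helper: submit a document, wait, and return the searchable PDF."""
    auth = {"Ocp-Apim-Subscription-Key": key}

    # 1. Start the analysis; the result URL comes back in a response header.
    start = urllib.request.Request(
        build_analyze_url(endpoint),
        data=document,
        method="POST",
        headers={**auth, "Content-Type": "application/octet-stream"},
    )
    with urllib.request.urlopen(start) as resp:
        operation_location = resp.headers["Operation-Location"]

    # 2. Poll the result URL until the analysis is no longer in progress.
    while True:
        poll = urllib.request.Request(operation_location, headers=auth)
        with urllib.request.urlopen(poll) as resp:
            status = json.load(resp)["status"]
        if status == "succeeded":
            break
        if status not in ("notStarted", "running"):
            raise RuntimeError(f"Analysis ended with status: {status}")
        time.sleep(2)  # assumed polling interval

    # 3. Download the searchable PDF for this result.
    download = urllib.request.Request(build_pdf_url(operation_location), headers=auth)
    with urllib.request.urlopen(download) as resp:
        return resp.read()
```

To use it, read the input file as bytes, call `make_searchable_pdf(endpoint, key, data)` with your own resource endpoint and key, and write the returned bytes to a `.pdf` file. Check the documentation linked above for the currently supported models and input formats before relying on this.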

    You can also try it in the Document Intelligence Studio (OCR/Read model):
    https://documentintelligence.ai.azure.com/studio/read

    Click "Analyze options" and select the optional output "Searchable PDF":
    [Screenshot: Analyze options dialog]

    Click "Run analysis" and, when the results are ready, click the download button to download the searchable PDF:
    [Screenshot: Download Searchable PDF button]

    Best Regards,
    =Anatoly=

