Using document_intelligence_client.begin_analyze_document returns Session.request() got an unexpected keyword argument 'body'

Maxime Jack Raymond Jens Cara

When I'm trying to use the following script:

    # Open local file in binary mode
    with open(file_path, "rb") as f:
        poller = document_intelligence_client.begin_analyze_document(
            model_id="prebuilt-layout",
            body=f
        )

I get the error:
Session.request() got an unexpected keyword argument 'body'

I've checked the installed package version (Version: 1.0.0) and, just in case, tried updating it with pip. It still returns the same error.
The key and endpoint are correct.

In essence, I want to use local PDF files and use document intelligence to retrieve the text and tables, maybe even figures later if I can figure this out first. I have a custom extraction model, but I'm just trying to test this first to see if everything is in order.

I've tried changing

body=f -> document=f

but the error just changes to

Session.request() got an unexpected keyword argument 'document'

Answer 1

Hi @Maxime Jack Raymond Jens Cara,

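Apologies for the delay in response. The error message indicates that the keyword you passed (body=f or document=f) is not a parameter that begin_analyze_document() recognizes in your installed SDK, so it is forwarded to the underlying HTTP session, which rejects it. In the code below, the document is instead passed through the analyze_request parameter, as either a URL (urlSource) or a Base64-encoded file (base64Source).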
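
Because the accepted keyword depends on the installed SDK, it may help to check which parameters the method actually declares before changing your code. The following is only a minimal diagnostic sketch using the Python standard library; installed_version, accepted_parameters, and demo_analyze are hypothetical helper names, and demo_analyze is a stand-in, not the real client method.

```python
import inspect
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def accepted_parameters(func):
    """Return the parameter names a callable declares explicitly.

    Catch-all parameters (*args, **kwargs) are skipped, because any keyword
    that only lands in **kwargs may be forwarded to a lower layer.
    """
    skipped = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind not in skipped
    ]


def demo_analyze(model_id, analyze_request=None, **kwargs):
    """Hypothetical stand-in for an SDK method (assumption, for illustration)."""
    return model_id, analyze_request, kwargs


if __name__ == "__main__":
    # Prints the version string, or None if the package is not installed.
    print(installed_version("azure-ai-documentintelligence"))
    # Lists only the explicitly declared parameter names.
    print(accepted_parameters(demo_analyze))
```

In your own environment you could call accepted_parameters(DocumentIntelligenceClient.begin_analyze_document). If the keyword you are passing is not in the returned list, that would be consistent with the error you are seeing.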

Since you're working with local PDF files, you'll need to encode the file in Base64 before passing it to the API. Here's the correct approach:

import os
import base64
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeResult

# Set your Azure endpoint and key
endpoint = "<END_POINT>"  # Replace with your endpoint
key = "YOUR_KEY"  # Replace with your key

def analyze_layout(local_file_path):
    document_intelligence_client = DocumentIntelligenceClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

    # Read the local PDF and encode it to Base64
    with open(local_file_path, "rb") as file_stream:
        base64_data = base64.b64encode(file_stream.read()).decode("utf-8")

    # Send request with base64Source
    poller = document_intelligence_client.begin_analyze_document(
        model_id="prebuilt-layout", analyze_request={"base64Source": base64_data}
    )

    result: AnalyzeResult = poller.result()

    for page in result.pages:
        print(f"----Analyzing layout from page #{page.page_number}----")
        print(f"Page dimensions: {page.width} x {page.height} {page.unit}")

        if page.lines:
            for line in page.lines:
                print(f"Line: '{line.content}'")

    if result.tables:
        for table in result.tables:
            print(f"Table with {table.row_count} rows and {table.column_count} columns")
            for cell in table.cells:
                print(f"Cell[{cell.row_index}][{cell.column_index}]: '{cell.content}'")

if __name__ == "__main__":
    local_file_path = r"FILE_PATH"  # Change this to your local file path
    analyze_layout(local_file_path)
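
If you want to verify the encoding step separately from the service call, the Base64 part can be isolated as below. This is just a sketch under the same assumptions as the script above: the helper names are hypothetical, and the only thing taken from the script is the {"base64Source": ...} payload shape.

```python
import base64
from pathlib import Path


def build_base64_request_from_bytes(raw_bytes):
    """Wrap raw document bytes in a base64Source payload (as used above)."""
    encoded = base64.b64encode(raw_bytes).decode("utf-8")
    return {"base64Source": encoded}


def build_base64_request(local_file_path):
    """Read a local file in binary mode and build the payload from its bytes."""
    return build_base64_request_from_bytes(Path(local_file_path).read_bytes())


def decode_base64_request(request):
    """Recover the original bytes from a payload (sanity check only)."""
    return base64.b64decode(request["base64Source"])
```

Decoding the payload and comparing it with the original file bytes is a quick way to confirm that nothing was altered before the request is sent.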

Here is the output: [image]

For more info: Get started with Document Intelligence.

This should resolve the error and allow you to process local PDFs correctly. Let me know if you need any further assistance!
