Hello Pile, Joshua,
Welcome to Microsoft Q&A, and thank you for posting your question here.
I understand that you are having an issue with automatic document splitting in the prebuilt bank statement extractor in Azure AI Document Intelligence.
Below is code intended to resolve the TypeError and ensure proper splitting of multi-statement PDFs.
Start by making sure you are using the latest version of azure-ai-documentintelligence. To update it, run: pip install --upgrade azure-ai-documentintelligence
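If it helps, here is a small sketch for confirming which version is actually installed in the environment your script runs in. It only uses the standard library; the helper name installed_version is my own and not part of the SDK.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "azure-ai-documentintelligence"):
    """Return the installed version string, or None if the package is missing.

    installed_version is a hypothetical helper name, not an SDK function.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string, or None when the package is not installed.
print(installed_version())
```

Run it with the same interpreter that runs your analysis script, since a different virtual environment may have a different version installed.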
Second, the code below corrects the previous version and includes split_mode in the request body (an AnalyzeDocumentRequest object).
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient
from azure.ai.documentintelligence.models import AnalyzeDocumentRequest, SplitMode

# Note: parameter and field names may differ between SDK releases;
# check them against the version you have installed.

# Initialize the client
document_intelligence_client = DocumentIntelligenceClient(
    endpoint="your_endpoint",
    credential=AzureKeyCredential("your_key")
)

# Create a request with the file bytes and split mode
request = AnalyzeDocumentRequest(
    base64_source=file_bytes,  # Ensure file_bytes is properly encoded
    split_mode=SplitMode.AUTO
)

# Analyze the document
poller = document_intelligence_client.begin_analyze_document(
    model_id="prebuilt-bankStatement.us",
    analyze_request=request  # Pass the request object here
)

# Get results (will return multiple documents if split)
bank_statements = poller.result()
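To check whether the PDF was actually split, you can look at the documents list on the result. The following is only a sketch: summarize_documents is a helper name I made up, and it assumes each returned document exposes doc_type, confidence, and bounding_regions with a page_number. The stand-in object at the bottom uses made-up values so the snippet runs without calling the service.

```python
from types import SimpleNamespace


def summarize_documents(result):
    """Return one summary dict per document found in an analysis result.

    Hypothetical helper; assumes the attribute names described above.
    """
    summaries = []
    for index, document in enumerate(getattr(result, "documents", None) or []):
        regions = getattr(document, "bounding_regions", None) or []
        # Collect the distinct pages this document spans, in order.
        pages = sorted({region.page_number for region in regions})
        summaries.append({
            "index": index,
            "doc_type": document.doc_type,
            "confidence": document.confidence,
            "pages": pages,
        })
    return summaries


# Stand-in for a real result, with made-up values, so this runs offline.
fake_result = SimpleNamespace(documents=[
    SimpleNamespace(
        doc_type="statement",
        confidence=0.9,
        bounding_regions=[SimpleNamespace(page_number=1),
                          SimpleNamespace(page_number=2)],
    ),
    SimpleNamespace(
        doc_type="statement",
        confidence=0.8,
        bounding_regions=[SimpleNamespace(page_number=3)],
    ),
])

for summary in summarize_documents(fake_result):
    print(summary)
```

With a real response you would call summarize_documents(bank_statements). More than one entry, each covering a different page range, would suggest the statements were separated.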
NOTE THAT:
- Use AnalyzeDocumentRequest to encapsulate parameters like split_mode.
- Ensure file_bytes is a base64-encoded string (use base64.b64encode(file_bytes).decode('utf-8') if needed).
- The newer SDK (azure-ai-documentintelligence) uses base64_source or url_source in the request body.
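For the encoding step mentioned in the notes, a minimal sketch using only the standard library could look like this. The helper name to_base64_text is hypothetical, and the sample bytes stand in for the contents of your PDF.

```python
import base64


def to_base64_text(data: bytes) -> str:
    """Encode raw bytes as a base64 string (hypothetical helper name)."""
    return base64.b64encode(data).decode("utf-8")


# Stand-in bytes; in practice, read them from your PDF opened in "rb" mode.
sample = b"%PDF-1.7 example"
encoded = to_base64_text(sample)

# Decoding gives back the original bytes, so nothing is lost in the round trip.
assert base64.b64decode(encoded) == sample
print(encoded)
```

Whether the request expects raw bytes or an already encoded string may depend on the SDK version, so it is worth checking this against the version you have installed.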
To read more, use the following links:
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
Please don't forget to close the thread by upvoting this and accepting it as the answer if it is helpful.