Hi Chiew Lerk Qing,
Thanks for the question and welcome to Microsoft Q&A.
It looks like you're trying to extract tabular data where the header (MM-yyyy) is dynamic, and you're running into issues with labeling the data. Here are a few approaches that may help you get the results you expect:
Use "Key-Value Pair Extraction" Instead of Labeling Data:
- Instead of relying on labeled data, use Azure AI Document Intelligence's key-value pair extraction to dynamically detect column headers and map them.
- Reference: 🔗 Extract tables and key-value pairs
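As a rough sketch of the "map them" step, the snippet below turns a flat list of table cells into one record per row, keyed by whatever header text appears in the first row. The cell shape (dicts with row_index, column_index, and content) is an assumption loosely modeled on the table cells returned by the layout analysis; the function name rows_from_cells is hypothetical, and the code assumes the header sits in row 0.

```python
from collections import defaultdict


def rows_from_cells(cells):
    """Turn a flat list of table cells into dicts keyed by the header text in row 0.

    Each cell is assumed to be a dict with "row_index", "column_index",
    and "content" keys (an illustrative shape, not a guaranteed API contract).
    """
    # Group cell contents by row, then by column.
    grid = defaultdict(dict)
    for cell in cells:
        grid[cell["row_index"]][cell["column_index"]] = cell["content"]

    # Row 0 is assumed to hold the (dynamic) column headers.
    headers = grid.get(0, {})

    records = []
    for row_index in sorted(grid):
        if row_index == 0:
            continue
        row = grid[row_index]
        # Missing cells become empty strings so every record has the same keys.
        records.append({headers[col]: row.get(col, "") for col in sorted(headers)})
    return records
```

Because the keys come from the document itself, a new month column such as 03-2024 shows up in the records without any change to labels or code.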
Region-Based Extraction (But Define Specific Anchors):
- Since "region" didn’t work well, try defining fixed reference points in the document, like a specific static text near your table, to ensure consistent extraction.
- Reference: 🔗 Region-based form recognition
Dynamic Table Header Handling (Regex for MM-yyyy Format):
- Since headers change dynamically, implement a custom pre-processing step to detect and normalize headers before passing to the AI model.
- Example: Use the regex \b(0[1-9]|1[0-2])-\d{4}\b to extract MM-yyyy headers programmatically before AI processing.
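Here is a minimal sketch of that pre-processing step in Python, using only the standard re module. The helper names find_month_headers and normalize_header are hypothetical, and converting to a yyyy-MM key is just one possible normalization, chosen here because it sorts chronologically.

```python
import re

# Same pattern as above: month 01-12, a hyphen, then a four-digit year.
MONTH_HEADER = re.compile(r"\b(0[1-9]|1[0-2])-(\d{4})\b")


def find_month_headers(text):
    """Return every MM-yyyy token found in text, in order of appearance."""
    return [m.group(0) for m in MONTH_HEADER.finditer(text)]


def normalize_header(header):
    """Map an MM-yyyy header to a yyyy-MM key; return None for other headers."""
    m = MONTH_HEADER.fullmatch(header.strip())
    if m is None:
        return None
    month, year = m.groups()
    return f"{year}-{month}"
```

For example, find_month_headers could be run over the header row text to discover which month columns a given document contains, and normalize_header could then give each column a stable key regardless of which months appear.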
Custom Model Training for Complex Tables:
- If standard models fail, train a custom Document Intelligence model with labeled datasets covering various formats.
- Reference: 🔗 Train a custom model
Regarding the Subscription & Support: Yes, if you subscribe to a paid Azure support plan, you get access to technical support beyond just this community forum. Azure offers paid plans such as Developer, Standard, and Professional Direct, which provide progressively faster response times and technical assistance via Azure Support. More details: 🔗 Azure Support Plans
Let me know if you need further guidance! If you find this answer useful, please accept it and vote yes to help the community.
Regards,
Chakravarthi Rangarajan Bhargavi