Document Intelligence custom extraction models assign one label per field. If you need multiple PII types, you’ll have to define separate fields for each type or use additional processing to further classify the extracted text.
Can you create PII detection using Document Intelligence Custom Model?
I was trying to extract PII with a Document Intelligence custom extraction model in order to build a PII detection model, but I couldn't get the expected result. It does not label the information correctly: it gives only one label for each field. Maybe I am using it wrong. Can someone correct me on this?
2 answers
SriLakshmi C 3,015 Reputation points Microsoft External Staff
2025-03-03T06:40:39.17+00:00
Hello Sujit Kumar Choudhary,
Greetings and Welcome to Microsoft Q&A!
In addition to the above response, Azure Document Intelligence Custom Model is primarily designed for structured data extraction and may not be ideal for detecting Personally Identifiable Information (PII). Since it assigns only one label per field, it does not support identifying multiple types of sensitive data within the same field.
For effective PII detection, it is recommended to use Azure AI Language's PII Detection API, which is specifically built to identify and redact sensitive information such as names, addresses, and financial details. Alternatively, you can first extract text from documents using Document Intelligence and then process the extracted content through the PII Detection API for more accurate results.
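The two-step approach described above (extract text with Document Intelligence, then pass it to the PII Detection API) could be sketched roughly as follows. This is a minimal sketch, not an official sample: it assumes the `azure-ai-documentintelligence` and `azure-ai-textanalytics` Python packages, the `prebuilt-read` model, and a `body=` parameter on `begin_analyze_document` (the name used in the 1.0 release of the SDK; earlier previews differ). The helper names (`chunk_text`, `extract_text`, `detect_pii`, `build_clients`, `PiiHit`) are hypothetical, and the 5,000-character chunk size and batch size of 5 are assumptions chosen to stay under the service's per-request limits, which you should verify against the current documentation.

```python
from dataclasses import dataclass


@dataclass
class PiiHit:
    text: str
    category: str
    offset: int   # position within the full extracted text
    score: float


def chunk_text(text, size=5000):
    """Split text into (start_offset, piece) pairs of at most `size` characters.

    Naive fixed-size splitting: an entity that straddles a chunk boundary
    may be missed. Splitting on paragraph breaks would be more robust.
    """
    return [(i, text[i:i + size]) for i in range(0, len(text), size)]


def extract_text(di_client, path, model_id="prebuilt-read"):
    """Step 1: read a local document and return its plain-text content."""
    with open(path, "rb") as f:
        poller = di_client.begin_analyze_document(model_id, body=f)
        return poller.result().content


def detect_pii(pii_client, text, chunk_size=5000, batch_size=5, language="en"):
    """Step 2: send the extracted text to PII detection in small batches."""
    hits = []
    chunks = chunk_text(text, chunk_size)
    for i in range(0, len(chunks), batch_size):
        batch = chunks[i:i + batch_size]
        results = pii_client.recognize_pii_entities(
            [piece for _, piece in batch], language=language
        )
        # Results come back in the same order as the input documents.
        for (start, _), result in zip(batch, results):
            if result.is_error:
                continue  # skip failed documents in this sketch
            for entity in result.entities:
                hits.append(PiiHit(entity.text, entity.category,
                                   start + entity.offset,
                                   entity.confidence_score))
    return hits


def build_clients(di_endpoint, di_key, language_endpoint, language_key):
    """Create both SDK clients (imports are local so the helpers above
    can be used or tested without the Azure packages installed)."""
    from azure.core.credentials import AzureKeyCredential
    from azure.ai.documentintelligence import DocumentIntelligenceClient
    from azure.ai.textanalytics import TextAnalyticsClient

    di_client = DocumentIntelligenceClient(di_endpoint, AzureKeyCredential(di_key))
    pii_client = TextAnalyticsClient(language_endpoint, AzureKeyCredential(language_key))
    return di_client, pii_client
```

With real endpoints and keys, usage would look like `di, pii = build_clients(...)`, then `detect_pii(pii, extract_text(di, "form.pdf"))`, where `form.pdf` is a placeholder file name. Unlike a custom extraction field, each returned hit carries its own category, so several PII types found in the same region of text are reported separately.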
For reference, kindly see these articles:
What is Personally Identifiable Information (PII) detection in Azure AI Language
How to detect and redact Personally Identifying Information (PII)
I hope this helps. If you have any further queries, do let us know.
Thank you!