AI Language PII detection only works for plain text

GHIJ Labs 0 Reputation points
2025-02-22T15:56:38.31+00:00

AI Language PII detection only works for plain text. If the input is an HTML table that contains formatting along with PII data, detection fails.

Example:

Header 1 | Header 2
John doe | 123.45.67.8
example@example.com | 9820098200
Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.

1 answer

  1. Pavankumar Purilla 3,715 Reputation points Microsoft External Staff
    2025-02-24T17:58:06.3266667+00:00

    Hi GHIJ Labs,
    Greetings & Welcome to the Microsoft Q&A forum! Thank you for sharing your query.
    It sounds like you're encountering issues with detecting Personally Identifiable Information (PII) in HTML tables. This is a common challenge because PII detection tools often focus on plain text and may not handle HTML formatting well.

    Convert the HTML content into plain text before running PII detection. This can be done using libraries that strip HTML tags and extract text content.
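As a rough illustration of that conversion step, here is a minimal sketch that uses only Python's standard-library `html.parser` module. The class name `TableTextExtractor`, the helper `html_to_text`, and the sample markup are hypothetical: the markup is reconstructed from the example in the question, not taken from the asker's actual HTML. It assumes the input contains no `<script>` or `<style>` blocks, whose contents would otherwise be kept as text.

```python
from html.parser import HTMLParser


class TableTextExtractor(HTMLParser):
    """Collects text content, separating table cells with spaces and rows with newlines."""

    def __init__(self):
        super().__init__()
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Keep cell and row boundaries so values from adjacent cells do not run together.
        if tag in ("td", "th"):
            self._parts.append(" ")
        elif tag in ("tr", "br", "p", "div"):
            self._parts.append("\n")

    def handle_data(self, data):
        self._parts.append(data)

    def text(self):
        raw = "".join(self._parts)
        # Collapse runs of whitespace inside each line and drop empty lines.
        lines = (" ".join(line.split()) for line in raw.splitlines())
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    """Return the plain text of an HTML fragment (hypothetical helper)."""
    parser = TableTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical markup modeled on the example table in the question.
sample_html = """
<table style="border: 1px solid">
  <tr><th>Header 1</th><th>Header 2</th></tr>
  <tr><td><b>John doe</b></td><td>123.45.67.8</td></tr>
  <tr><td>example@example.com</td><td>9820098200</td></tr>
</table>
"""

plain_text = html_to_text(sample_html)
print(plain_text)
```

The resulting string can then be sent to PII detection as an ordinary plain-text document, for example through `TextAnalyticsClient.recognize_pii_entities` in the `azure-ai-textanalytics` Python package. Note that character offsets in the results will refer to the stripped text, not to the original HTML.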

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".

