Welcome to the Microsoft Q&A forum.
Creating a custom Sensitive Information Type (SIT) for detecting dates of birth in Microsoft Purview can be tricky. Let's refine your regex and ensure it aligns with the requirements for DLP detection.
Core DOB Regex
Your current regex seems to have some issues. Here's a refined version that should capture the various date formats more accurately:
\b(?:\d{2}[\/-]\d{2}[\/-]\d{4}|\d{4}[\/-]\d{2}[\/-]\d{2})\b
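This regex checks digit counts and separators only, so it cannot tell MM/DD from DD/MM, does not reject impossible values such as 99/99/2024, and does not match single-digit days or months such as 1/2/1990. It captures: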
- MM/DD/YYYY or MM-DD-YYYY
- YYYY/MM/DD or YYYY-MM-DD
- DD/MM/YYYY or DD-MM-YYYY
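Before adding the pattern to a SIT, it can help to try it locally. The following is a minimal Python sketch, not a Purview API call. The helper name `find_dates` and the sample strings are made up for illustration, and Python's `re` engine may differ in small ways from the engine Purview uses.

```python
import re

# The core pattern from this answer, copied unchanged.
DOB_PATTERN = re.compile(
    r"\b(?:\d{2}[\/-]\d{2}[\/-]\d{4}|\d{4}[\/-]\d{2}[\/-]\d{2})\b"
)


def find_dates(text: str) -> list[str]:
    """Return every substring that the core pattern matches."""
    return DOB_PATTERN.findall(text)


# Made-up sample strings, for illustration only.
samples = [
    "Born 03/15/1990 in Ohio",  # two-digit month and day, slash separators
    "dob 1990-03-15",           # year-first layout with dashes
    "1/2/1990",                 # single-digit month and day
    "99/99/2024",               # right shape, impossible date
]

for sample in samples:
    print(f"{sample!r} -> {find_dates(sample)}")
```

In this sketch `1/2/1990` produces no match while `99/99/2024` does, which shows that the pattern checks shape only and leaves calendar validity to a separate step such as the date validator.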
Contextual Keywords
For contextual keywords, the following regex matches "DOB", "Date of Birth", or "Born on" followed by a colon and a date, for example "DOB: 01/02/1990":
\b(?:DOB|Date of Birth|Born on)\s*:\s*(?:\d{2}[\/-]\d{2}[\/-]\d{4}|\d{4}[\/-]\d{2}[\/-]\d{2})\b
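The keyword pattern can be tried locally in the same way. This is a minimal Python sketch with made-up sample strings. `has_dob` is a hypothetical helper, and `KEYWORD_DOB_OPTIONAL_COLON` is a hypothetical variant that is not part of the pattern above; it is included only to show the effect of making the colon optional.

```python
import re

# Date portion shared by both patterns below.
DATE = r"(?:\d{2}[\/-]\d{2}[\/-]\d{4}|\d{4}[\/-]\d{2}[\/-]\d{2})"

# The keyword pattern from this answer: a colon is required.
KEYWORD_DOB = re.compile(
    r"\b(?:DOB|Date of Birth|Born on)\s*:\s*" + DATE + r"\b"
)

# Hypothetical variant (an assumption, not from the answer): colon optional.
KEYWORD_DOB_OPTIONAL_COLON = re.compile(
    r"\b(?:DOB|Date of Birth|Born on)\s*:?\s*" + DATE + r"\b"
)


def has_dob(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


# Made-up sample strings, for illustration only.
samples = [
    "DOB: 01/02/1990",
    "Date of Birth : 1990-01-02",
    "Born on 01-02-1990",  # no colon
    "dob: 01/02/1990",     # lowercase keyword
]

for sample in samples:
    print(
        f"{sample!r}: original={has_dob(KEYWORD_DOB, sample)}, "
        f"optional colon={has_dob(KEYWORD_DOB_OPTIONAL_COLON, sample)}"
    )
```

In Python, where matching is case-sensitive unless a flag is set, the lowercase `dob:` sample matches neither pattern, and the sample without a colon matches only the hypothetical variant. Whether either behavior matters depends on how the keyword appears in your documents.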
- Testing: Ensure you test your regex thoroughly with various date formats and contexts to confirm it works as expected.
- Validators: Use the date validator in Microsoft Purview to ensure the dates match the expected formats.
- Documentation: Refer to the Microsoft documentation on creating custom SITs and using regex in DLP policies for additional guidance.
1: Sensitive information type REGEX validators and additional checks
2: Create custom sensitive information types
3: Learn about using regular expressions (regex) in DLP policies
I hope the steps above resolve the issue. Please let us know if it persists. Thank you.