Best Practices for Securing Azure OpenAI with Confidential Data

Saurabh Arjun Sawant 0 Reputation points Microsoft Employee
2025-02-10T19:16:40.9566667+00:00

Hi,

I am developing an internal application that leverages Azure OpenAI Service and interacts with confidential data, including internal documents, table metadata of highly sensitive data, and more. I need guidance on the following security and compliance aspects:

  1. Infrastructure Security: Is setting up a private VNet (something like this) sufficient to secure the solution? What are the key factors and best practices to ensure security, considering the use of Azure Storage, Azure AI Search, Azure Monitor (logs), and Azure OpenAI?
  2. Inference & Compliance: Can the application process confidential data, such as metadata, documents, or even highly confidential content, without violating Azure security and compliance policies? Are there specific restrictions or guidelines to follow?
  3. Validation & POC: Is there a way to validate that my application complies with security and compliance policies before deployment? Are there Microsoft-recommended processes or tools for this?
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

2 answers

  1. Suwarna S Kale 471 Reputation points
    2025-02-10T21:04:16.72+00:00

    Hello Saurabh Arjun Sawant,

    Thank you for posting your question in the Microsoft Q&A forum.

    Setting up a private Virtual Network (VNet) is a strong foundational step for securing your Azure OpenAI solution; however, it is not sufficient on its own. To ensure comprehensive security and compliance, especially when using Azure Storage, Azure AI Search, Azure Monitor, and Azure OpenAI, you need to adopt a multi-layered security approach. Below, I’ll address your questions in detail and cover the key factors, best practices, and validation methods that help ensure your solution is secure and compliant.

    1. Infrastructure Security: To answer your question, a private VNet is a critical component for securing your solution, but it must be complemented with additional security measures. I found this link very helpful: https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/infrastructure/security?source=recommendations

    Key Factors for Infrastructure Security

    • Private Endpoints: Use private endpoints for Azure Storage, Azure AI Search, and Azure OpenAI to ensure all communication occurs within the VNet and is not exposed to the public internet (see the sketch after this list).
    • Network Security Groups (NSGs): Configure NSGs to restrict traffic to and from your resources. For example, allow only specific IP ranges or subnets to access Azure OpenAI.
    • Azure Firewall: Implement Azure Firewall to filter and monitor traffic between your VNet and external networks.
    • Service Endpoints: Enable service endpoints for Azure Storage and Azure AI Search to restrict access to your VNet.
    • DDoS Protection: Enable Azure DDoS Protection to safeguard against distributed denial-of-service attacks.
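    The following is a minimal, hedged sketch of the private-endpoint point above, using the Azure SDK for Python (azure-identity and azure-mgmt-network). The subscription, resource group, resource names, and region are placeholders, not values from this thread:

```python
# Hedged sketch: creating a private endpoint for an Azure OpenAI account with
# the Azure SDK for Python (azure-identity, azure-mgmt-network). All IDs,
# names, and the region below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient

subscription_id = "<subscription-id>"
resource_group = "rg-openai-confidential"  # placeholder resource group
openai_resource_id = (
    f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
    "/providers/Microsoft.CognitiveServices/accounts/<openai-account-name>"
)
subnet_id = (
    f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
    "/providers/Microsoft.Network/virtualNetworks/<vnet-name>/subnets/<subnet-name>"
)

network_client = NetworkManagementClient(DefaultAzureCredential(), subscription_id)

poller = network_client.private_endpoints.begin_create_or_update(
    resource_group,
    "pe-openai",
    {
        "location": "eastus",  # placeholder region
        "subnet": {"id": subnet_id},
        "private_link_service_connections": [
            {
                "name": "openai-connection",
                "private_link_service_id": openai_resource_id,
                "group_ids": ["account"],  # sub-resource name for Azure OpenAI / Azure AI services
            }
        ],
    },
)
print(poller.result().provisioning_state)
```

    Once the endpoint is approved, you can disable public network access on the Azure OpenAI account so the private endpoint becomes the only path in, and repeat the same pattern for the Storage and AI Search resources.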

    Best Practices for Infrastructure Security

    • Isolate Resources: Place Azure OpenAI, Azure Storage, and Azure AI Search in separate subnets within the VNet to minimize the attack surface.
    • Zero Trust Architecture: Adopt a zero-trust approach by verifying every request, regardless of its origin (see the identity-based authentication sketch after this list).
    • Encryption: Ensure all data is encrypted at rest (using Azure Storage encryption) and in transit (using TLS 1.2 or higher).
    • Private DNS Zones: Use Azure Private DNS Zones to resolve private endpoints within your VNet securely.
    • Logging and Monitoring: Enable Azure Monitor and Log Analytics to track and analyze network traffic and resource usage.
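    To illustrate the zero-trust and encryption-in-transit points above, here is a hedged sketch that authenticates to Azure OpenAI with Microsoft Entra ID (no shared API keys) using the openai and azure-identity packages; the endpoint, deployment name, and API version are assumptions to adapt to your environment:

```python
# Hedged sketch: calling Azure OpenAI with Microsoft Entra ID instead of API
# keys. The endpoint, deployment name, and API version are placeholders.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    # Inside the VNet this hostname resolves to the private endpoint via the
    # Private DNS Zone, so traffic never leaves the private network.
    azure_endpoint="https://<your-openai-account>.openai.azure.com",
    azure_ad_token_provider=token_provider,  # no shared keys in the application
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="<deployment-name>",  # your model deployment
    messages=[{"role": "user", "content": "Summarize the attached policy document."}],
)
print(response.choices[0].message.content)
```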


    2. Inference & Compliance: Yes, your application can process confidential data, but you must adhere to Azure security and compliance policies and follow specific guidelines to avoid violations.

    Key Factors for Compliance

    • Data Classification: Classify your data (e.g., public, confidential, highly confidential) and apply appropriate security controls.
    • Data Residency: Ensure that data is stored and processed in regions compliant with your organization’s data residency requirements.
    • Regulatory Compliance: Verify that Azure OpenAI and related services comply with regulations like GDPR, HIPAA, ISO 27001, and SOC 2.

    Best Practices for Processing Confidential Data

    • Data Anonymization: Anonymize or pseudonymize sensitive data before sending it to Azure OpenAI for inference (see the redaction sketch after this list).
    • Access Controls: Use role-based access control (RBAC) and Microsoft Entra ID (formerly Azure AD) to restrict access to confidential data.
    • Data Masking: Mask sensitive information in logs and outputs.
    • Audit Logs: Enable audit logs for Azure Storage, Azure AI Search, and Azure OpenAI to track data access and usage.
    • Content Filtering: Use Azure OpenAI’s built-in content filtering to prevent the generation of inappropriate or harmful content.
    • Define Security and Compliance Requirements:
      • Identify the regulations and standards your application must comply with (e.g., GDPR, HIPAA).
      • Define security controls (e.g., encryption, access controls).
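    As one possible implementation of the anonymization point above, the sketch below uses Azure AI Language PII detection (azure-ai-textanalytics) to redact personal data before it is sent to Azure OpenAI. The endpoint and key are placeholders; Microsoft Presidio or custom masking logic are alternatives:

```python
# Hedged sketch: redacting PII with Azure AI Language (azure-ai-textanalytics)
# before text is sent to Azure OpenAI. Endpoint and key are placeholders.
from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

language_client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com",
    credential=AzureKeyCredential("<language-key>"),
)

documents = ["Contact Jane Doe at jane.doe@contoso.com about account 1234-5678."]
results = language_client.recognize_pii_entities(documents, language="en")

for doc in results:
    if not doc.is_error:
        # redacted_text replaces detected PII with asterisks; forward this,
        # not the raw text, to the Azure OpenAI prompt.
        print(doc.redacted_text)
```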

    Specific Restrictions and Guidelines

    • Data Retention: Define and enforce data retention policies to ensure data is not stored longer than necessary.
    • Data Deletion: Use Azure’s data deletion capabilities to permanently remove data when it is no longer needed (a retention clean-up sketch follows this list).
    • Third-Party Integrations: Ensure third-party services comply with the same security and compliance standards.
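    For the retention and deletion points above, here is a minimal sketch, assuming documents live in a blob container and a 30-day window applies; Azure Storage lifecycle management rules are the declarative alternative to code like this:

```python
# Hedged sketch: deleting blobs older than a retention window. Account URL,
# container name, and the 30-day window are placeholders; Azure Storage
# lifecycle management policies can do this declaratively instead.
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

RETENTION_DAYS = 30
cutoff = datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)

service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)
container = service.get_container_client("confidential-documents")  # placeholder

for blob in container.list_blobs():
    if blob.last_modified < cutoff:
        # Subject to the account's soft-delete settings.
        container.delete_blob(blob.name)
```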


    3. Validation & POC: Microsoft provides several tools and processes to validate that your application complies with security and compliance policies before deployment.

    Microsoft-Recommended Tools and Processes

    Microsoft Defender for Cloud (formerly Azure Security Center):

    • Provides a secure score that evaluates your security posture.
    • Offers tailored recommendations for securing your resources.
    • Enables regulatory compliance assessments (e.g., GDPR, HIPAA).

    Azure Policy:

    • Enforce compliance with organizational standards and regulations.
    • Use built-in policies for Azure Storage, Azure AI Search, and Azure OpenAI to ensure secure configurations (a sketch for reviewing the assignments in scope follows below).
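    As a small, hedged example (the subscription ID is a placeholder), you can enumerate the policy assignments currently in scope to confirm that the expected controls are actually enforced before deployment:

```python
# Hedged sketch: listing the Azure Policy assignments in scope for a
# subscription (azure-mgmt-resource). The subscription ID is a placeholder.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import PolicyClient

policy_client = PolicyClient(DefaultAzureCredential(), "<subscription-id>")

# Confirm that the expected network, encryption, and logging policies are
# actually assigned before you deploy.
for assignment in policy_client.policy_assignments.list():
    print(assignment.display_name or assignment.name)
```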

    Microsoft Purview:

    • Use Purview to classify and label sensitive data.
    • Monitor data usage and ensure compliance with data protection regulations.

    Azure Blueprints:

    • Create repeatable deployment templates that enforce security and compliance standards.
    • Use pre-built blueprints for common compliance frameworks.

    Penetration Testing:

    • Conduct penetration testing to identify vulnerabilities in your application and infrastructure.
    • Use Microsoft’s Penetration Testing Rules of Engagement for guidance.

    Microsoft Purview Compliance Manager:

    • Use Compliance Manager (in the Microsoft Purview compliance portal, formerly the Microsoft 365 compliance center) to assess your compliance with regulatory requirements.
    • Track and manage compliance activities.


    If the above answer helped, please do not forget to "Accept Answer", as this may help other community members facing a similar issue.


  2. Suwarna S Kale 471 Reputation points
    2025-02-11T00:10:05.4933333+00:00

    Hello Saurabh Arjun Sawant,

    I am glad that the detailed response helped. Regarding your follow-up questions, below is the information you may want to explore.

    Yes, it is explicitly permitted to pass highly confidential data in the prompt to Azure OpenAI, even if anonymization or masking is not feasible, provided you adhere to Azure's security and compliance policies. Microsoft has designed Azure OpenAI Service to handle sensitive and confidential data securely, but there are important considerations and best practices to follow to ensure compliance and mitigate risks. I have listed some key considerations, with related Microsoft documentation you should explore:

    • Data usage policy - https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy

    • Compliance certifications - https://www.microsoft.com/en-us/trust-center/compliance

    • Content filtering - https://learn.microsoft.com/en-us/azure/cognitive-services/openai/concepts/content-filter

    • Best practices for securing Azure AI infrastructure with confidential data - https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/ai/infrastructure/security?source=recommendations

    Data Classification and Metadata:

    Azure OpenAI does not require specific headers or metadata for data classification. However, you can implement the following best practices to ensure proper handling of sensitive data:

    • Custom Metadata: If your application requires it, you can include custom metadata in your requests to classify data (e.g., {"classification": "highly_confidential"}). This metadata can be logged and monitored for compliance purposes.
    • Logging and Monitoring: Use Azure Monitor and Log Analytics to track requests and responses. You can enrich logs with custom metadata to ensure sensitive data is handled appropriately (see the sketch after this list).
    • Data Loss Prevention (DLP): Integrate Azure OpenAI with Azure Purview or other DLP tools to classify and protect sensitive data before it is sent to the service.
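    As an illustration of the custom metadata and logging points above, here is a hedged sketch of a small wrapper that logs a classification label and a correlation ID with every Azure OpenAI call; the wrapper name, labels, endpoint, and deployment are illustrative assumptions, not an Azure OpenAI requirement:

```python
# Hedged sketch: attaching a classification label and correlation ID to every
# Azure OpenAI call and writing them to application logs that Azure Monitor
# can collect. The wrapper, labels, endpoint, and deployment are illustrative
# assumptions, not an Azure OpenAI requirement.
import logging
import uuid

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("openai_audit")

client = AzureOpenAI(
    azure_endpoint="https://<your-openai-account>.openai.azure.com",
    azure_ad_token_provider=get_bearer_token_provider(
        DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
    ),
    api_version="2024-06-01",
)

def classified_completion(prompt: str, classification: str) -> str:
    """Send a prompt and log its data classification with a correlation ID."""
    request_id = str(uuid.uuid4())
    logger.info("request_id=%s classification=%s", request_id, classification)
    response = client.chat.completions.create(
        model="<deployment-name>",
        messages=[{"role": "user", "content": prompt}],
        user=request_id,  # optional end-user/correlation identifier
    )
    return response.choices[0].message.content

print(classified_completion("Summarize the metadata of table X.", "highly_confidential"))
```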

    Moreover, if anonymization or masking is not feasible in your use case, you can still use Azure OpenAI securely by:

    • Ensuring that the data is processed in a compliant region.

    • Using private endpoints and encryption to secure the connection.

    • Regularly auditing and monitoring usage to ensure compliance with internal and external policies.

     

    Designated representative or team support: Always refer to Microsoft’s official documentation, and consult your Microsoft account team or Azure support if you have specific concerns.

    You may engage Microsoft’s FastTrack team or Microsoft Consulting Services (MCS) for validation. Use tools like Microsoft Defender for Cloud, Compliance Manager, and Azure Policy to ensure compliance before deployment.

    https://learn.microsoft.com/en-us/azure/fasttrack/

    https://www.microsoft.com/en-us/industry/services/consulting

    You may also want to explore Azure Security and Compliance Workshops to understand latest updates.

    If the above answer helped, please do not forget to "Accept Answer", as this may help other community members facing a similar issue.

