Security and Data Integrity
1. Authentication
Azure OpenAI (AOAI) supports two forms of authentication.
Microsoft Entra ID: AOAI supports Microsoft Entra authentication with managed identities or an Entra application. With a managed identity, keyless authentication can be achieved between the consumer and the AOAI service. With Entra application-based authentication, each consumer needs to maintain a client ID and a secret for the Entra app that has access to the AOAI resource.
API Key: A secret key, provided by AOAI, can be passed by the consumer to authenticate itself with AOAI. To use API key-based authentication, the consumer needs access to this secret key. When a consumer needs to distribute requests across multiple AOAI endpoints, it must manage each API key independently. Entrusting consumers with API keys not only imposes the burden of key management on them but also heightens the risk of a security breach if a key is compromised at the consumer's end. Furthermore, this approach complicates other security best practices, such as key rotation and the ability to block requests from specific consumers.
The Gateway might also interface with endpoints that are not AOAI; in those situations, different endpoints could use different authentication mechanisms.
One suggested approach is to offload authentication against AOAI (or other GenAI endpoints) to the GenAI gateway and terminate consumer authentication at the Gateway level. Decoupling GenAI endpoint authentication from the end consumers allows them to employ a uniform, enterprise-wide authentication mechanism such as OAuth when authenticating against the GenAI Gateway, and it mitigates the risks stated above. For AOAI endpoints, the GenAI gateway can itself use a managed identity to authenticate. Note that when authentication is offloaded, the GenAI resource cannot distinguish individual consumers, because the Gateway uses its own credentials for all requests.
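As a minimal sketch of this pattern, the following Python snippet shows how a gateway could authenticate to an AOAI endpoint with its managed identity using the Azure Identity and OpenAI client libraries. The endpoint URL, API version, and deployment name below are placeholders, not values prescribed by this document.

```python
# Sketch: gateway-side keyless authentication to AOAI with a managed identity.
# Assumes the gateway runs on an Azure host whose managed identity has been
# granted an appropriate Azure OpenAI role on the AOAI resource.
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# DefaultAzureCredential picks up the managed identity at runtime; the token
# provider refreshes Entra ID tokens for the Cognitive Services scope.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default",
)

client = AzureOpenAI(
    azure_endpoint="https://<your-aoai-resource>.openai.azure.com",  # placeholder
    azure_ad_token_provider=token_provider,
    api_version="2024-02-01",  # example API version
)

# The gateway forwards consumer requests under its own identity; the AOAI
# resource sees the gateway, not the individual consumer.
response = client.chat.completions.create(
    model="<deployment-name>",  # placeholder deployment name
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```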
2. Personally Identifiable Information (PII) and Data Masking
The GenAI gateway acts as a broker between the consumer and backend AOAI services. Using the GenAI gateway for PII detection and data masking becomes a critical part of the architecture. This setup enables the following:
- Centralized handling of sensitive data
- Ensuring that any personal information is identified and appropriately managed before being processed by AOAI
A centralized approach also presents an opportunity to standardize PII handling practices across multiple consumer applications, leading to more consistent and maintainable data privacy protocols.
Automated processes can be implemented at the Gateway level to intercept requests and detect PII before it's processed by Azure OpenAI services. Once detected, PII can either be redacted or replaced with generic placeholders.
Detecting PII
Services such as Azure AI Language can be used to identify and categorize PII in text data. Other Azure services, such as Azure Purview, can also help detect and surface PII.
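As an illustrative sketch, the gateway could call Azure AI Language's PII detection through the Text Analytics client library and forward the returned redacted text instead of the original prompt. The endpoint and key below are placeholders.

```python
# Sketch: PII detection with Azure AI Language (Text Analytics client library).
# Endpoint and key are placeholders; the gateway could also authenticate with
# a managed identity instead of a key.
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient

client = TextAnalyticsClient(
    endpoint="https://<your-language-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

documents = ["My name is Jane Doe and my SSN is 859-98-0987."]
results = client.recognize_pii_entities(documents, language="en")

for doc in results:
    if doc.is_error:
        continue
    # redacted_text contains the input with detected entities masked out.
    print(doc.redacted_text)
    for entity in doc.entities:
        print(entity.category, entity.text, entity.confidence_score)
```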
Microsoft Presidio can also be used for fine-grained control over the identification and anonymization of PII data.
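A minimal Presidio sketch is shown below; the replacement placeholder is an illustrative choice, not a prescribed value.

```python
# Sketch: fine-grained PII identification and anonymization with Microsoft Presidio.
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
from presidio_anonymizer.entities import OperatorConfig

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

text = "Contact John Smith at john.smith@contoso.com or 212-555-0199."

# Identify PII entities (names, email addresses, phone numbers, ...) in the prompt.
findings = analyzer.analyze(text=text, language="en")

# Replace each finding with a generic placeholder before forwarding to AOAI.
anonymized = anonymizer.anonymize(
    text=text,
    analyzer_results=findings,
    operators={"DEFAULT": OperatorConfig("replace", {"new_value": "<REDACTED>"})},
)
print(anonymized.text)
```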
For more specific or customized PII detection, a custom, domain-specific ML model can be trained with the Azure Machine Learning service and exposed as a REST endpoint for the Gateway to call.
Orchestration: The GenAI gateway can employ an orchestration layer that uses Azure services to automate the workflow. The workflow can include steps for PII detection and data masking before a request reaches AOAI, and it could be implemented with an Azure Function or a Logic App, as sketched below.
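The following is a hedged sketch of such an orchestration step as an Azure Function (Python v2 programming model). The route name and the mask_pii helper are hypothetical; the helper could be backed by Presidio or Azure AI Language as shown above, and the masked request would then be forwarded to AOAI.

```python
# Sketch: an Azure Function that intercepts a chat request, masks PII, and
# hands the sanitized payload to the AOAI forwarding logic. The mask_pii()
# helper and the /chat route are hypothetical illustrations.
import json

import azure.functions as func

app = func.FunctionApp()


def mask_pii(text: str) -> str:
    # Hypothetical helper: plug in Presidio or Azure AI Language here.
    return text


@app.route(route="chat", auth_level=func.AuthLevel.FUNCTION)
def chat(req: func.HttpRequest) -> func.HttpResponse:
    body = req.get_json()
    masked_messages = [
        {**m, "content": mask_pii(m.get("content", ""))}
        for m in body.get("messages", [])
    ]
    # A real implementation would now forward masked_messages to AOAI
    # (for example, with the managed-identity client sketched earlier).
    return func.HttpResponse(
        json.dumps({"messages": masked_messages}),
        mimetype="application/json",
    )
```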
However, integrating an extra layer for PII detection and masking can increase the overall response latency for the consumers. This factor must be balanced against the need for data privacy and compliance when designing the system.
3. Data Sovereignty
Data sovereignty in the context of AOAI refers to the legal and regulatory requirements related to storing and processing data within the geographic boundaries of a specific country or region. Planning for data sovereignty is critical for a business, because non-compliance with local data protection laws can result in hefty fines.
The GenAI gateway can play a crucial role in data sovereignty by applying region affinity based on the consumer's location: it can intelligently route traffic so that requests are processed by backend AOAI instances and other cloud services located in regions that comply with the relevant data residency and sovereignty laws, as sketched below. In a hybrid setup that combines on-premises custom Large Language Models (LLMs) with AOAI, it is essential to ensure that the hybrid system also meets multi-region availability requirements to support consumer affinity.
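As a purely illustrative sketch, region affinity can start as a lookup from the consumer's declared region to a compliant AOAI endpoint. The region keys, endpoint URLs, and resolve_endpoint helper below are hypothetical and not part of any Azure API.

```python
# Sketch: hypothetical region-affinity routing table inside the gateway.
# Region keys and endpoint URLs are placeholders.
REGION_ENDPOINTS = {
    "eu": "https://aoai-westeurope.openai.azure.com",        # EU data stays in EU regions
    "us": "https://aoai-eastus.openai.azure.com",
    "apac": "https://aoai-australiaeast.openai.azure.com",
}


def resolve_endpoint(consumer_region: str) -> str:
    """Return an AOAI endpoint in a region that satisfies the consumer's data
    residency requirements; fail closed if no compliant region is configured."""
    try:
        return REGION_ENDPOINTS[consumer_region]
    except KeyError:
        raise ValueError(f"No compliant AOAI region configured for '{consumer_region}'")


print(resolve_endpoint("eu"))
```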
4. Content Moderation
With the rise of LLM-based chat applications, organizations must prevent users from disclosing sensitive data to externally hosted LLMs. Similarly, the response data from LLMs must be screened to exclude any profanity.
With the GenAI gateway design, enterprises can implement a centralized content moderation strategy for their GenAI applications. For Azure OpenAI, default content filtering occurs within Azure, and enterprises can configure the level of content moderation there.
For additional content moderation needs in GenAI services, the GenAI gateway can be integrated with a content moderation service that validates content before the response is sent to the application. Here are some external content moderation services that can be integrated:
- Azure Content Moderator Service: The Content Moderator service is an AI-powered API that runs on Azure. It can scan text, image, and video content for potentially risky, offensive, or undesirable aspects.
- Azure AI Content Safety: Azure AI Content Safety detects harmful user-generated and AI-generated content in applications and services. It includes text and image APIs for detecting harmful material, offers advanced AI features and enhanced performance, and is enabled by default when an Azure OpenAI model is deployed.
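The following is a hedged sketch of screening text with Azure AI Content Safety from the gateway. The endpoint and key are placeholders, and the severity threshold is an illustrative policy choice rather than a recommended value.

```python
# Sketch: screening LLM output with Azure AI Content Safety before returning
# it to the consumer. Endpoint, key, and the severity threshold are
# placeholders / illustrative policy choices.
from azure.core.credentials import AzureKeyCredential
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions

client = ContentSafetyClient(
    endpoint="https://<your-content-safety-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)


def is_safe(text: str, max_severity: int = 2) -> bool:
    """Return False if any harm category exceeds the allowed severity."""
    result = client.analyze_text(AnalyzeTextOptions(text=text))
    return all((c.severity or 0) <= max_severity for c in result.categories_analysis)


llm_response = "Example model output to screen."
if is_safe(llm_response):
    print(llm_response)
else:
    print("Response blocked by content moderation policy.")
```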
To decide between these services, refer to this document for details. Note that integrating an additional content moderation service will increase overall response latency.
Refer to this section for a high-level design overview of the solution.