How to configure content filters (preview) for models in Azure AI services

Important

Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

The content filtering system integrated into Azure AI Services runs alongside the core models. It uses an ensemble of multi-class classification models to detect four categories of harmful content (violence, hate, sexual, and self-harm) at four severity levels (safe, low, medium, and high). It also offers optional binary classifiers for detecting jailbreak risk and for detecting existing text and code in public repositories. To learn more about content categories, severity levels, and the behavior of the content filtering system, see the content filtering documentation.

The default content filtering configuration filters at the medium severity threshold for all four content harm categories, for both prompts and completions. Content detected at severity level medium or high is filtered, while content detected at severity level low or safe isn't.

Content filters can be configured at the resource level and associated with one or more deployments.

Prerequisites

To complete this article, you need:

  • Install the Azure CLI. It's used to run the Bicep deployment later in this article.

  • Identify the following information:

    • Your Azure subscription ID.

    • Your Azure AI Services resource name.

    • The resource group where the Azure AI Services resource is deployed.

    • The model name, provider, version, and SKU you would like to deploy. You can use the Azure AI Foundry portal or the Azure CLI to identify it (see the example after this list). In this example, we deploy the following model:

      • Model name: Phi-3.5-vision-instruct
      • Provider: Microsoft
      • Version: 2
      • Deployment type: Global standard
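
If you prefer the Azure CLI for that lookup, a query similar to the following can list the models available to your resource. This is an illustrative sketch: it assumes the az cognitiveservices account list-models command is available in your CLI version, and the placeholder values aren't real resource names.

# List the models you can deploy in the Azure AI Services resource, so you can
# copy the model name, format (provider), and version you want to use.
az cognitiveservices account list-models \
    --resource-group "<resource-group-name>" \
    --name "<azure-ai-services-resource-name>" \
    --output table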

Create a custom content filter

Follow these steps to create a custom content filter:

  1. Go to Azure AI Foundry portal.

  2. Select Safety + security.

  3. Select the Content filters tab, and then select Create content filter.

  4. Under Basic information, give the content filter a name.

  5. Under Connection, select the connection to the Azure AI Services resource that is connected to your project.

  6. Under Input filter, configure the filter depending on your requirements. This configuration is applied before the request reaches the model itself.

  7. Under Output filter, configure the filter depending on your requirements. This configuration is applied after the model is executed and content is generated.

  8. Select Next.

  9. Optionally, associate the new content filter with one or more deployments. You can change the associated model deployments at any time.

  10. Once the operation completes, the new content filter is applied to the associated model deployments.


Add a model deployment with custom content filtering

We recommend creating content filters either in the Azure AI Foundry portal or in code by using Bicep. Creating custom content filters, or applying them to deployments, isn't supported from the Azure CLI. The following steps create the content filter policy and the model deployment by using Bicep templates.



  1. Use the template ai-services-content-filter-template.bicep to describe the content filter policy:

    ai-services-content-filter-template.bicep

    @description('Name of the Azure AI Services account where the policy will be created')
    param accountName string
    
    @description('Name of the policy to be created')
    param policyName string
    
    @allowed(['Asynchronous_filter', 'Blocking', 'Default', 'Deferred'])
    param mode string = 'Default'
    
    @description('Base policy to be used for the new policy')
    param basePolicyName string = 'Microsoft.DefaultV2'
    
    param contentFilters array = [
      {
          name: 'Violence'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Hate'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Sexual'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Selfharm'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Jailbreak'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Indirect Attack'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Profanity'
          blocking: true
          enabled: true
          source: 'Prompt'
      }
      {
          name: 'Violence'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Hate'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Sexual'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Selfharm'
          severityThreshold: 'Medium'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Protected Material Text'
          blocking: true
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Protected Material Code'
          blocking: false
          enabled: true
          source: 'Completion'
      }
      {
          name: 'Profanity'
          blocking: true
          enabled: true
          source: 'Completion'
      }
    ]
    
    resource raiPolicy 'Microsoft.CognitiveServices/accounts/raiPolicies@2024-06-01-preview' = {
        name: '${accountName}/${policyName}'
        properties: {
            mode: mode
            basePolicyName: basePolicyName
            contentFilters: contentFilters
        }
    }
    
  2. Use the template ai-services-deployment-template.bicep to describe model deployments:

    ai-services-deployment-template.bicep

    @description('Name of the Azure AI services account')
    param accountName string
    
    @description('Name of the model to deploy')
    param modelName string
    
    @description('Version of the model to deploy')
    param modelVersion string
    
    @allowed([
      'AI21 Labs'
      'Cohere'
      'Core42'
      'Meta'
      'Microsoft'
      'Mistral AI'
      'OpenAI'
    ])
    @description('Model provider')
    param modelPublisherFormat string
    
    @allowed([
        'GlobalStandard'
        'Standard'
        'GlobalProvisioned'
        'Provisioned'
    ])
    @description('Model deployment SKU name')
    param skuName string = 'GlobalStandard'
    
    @description('Content filter policy name')
    param contentFilterPolicyName string = 'Microsoft.DefaultV2'
    
    @description('Model deployment capacity')
    param capacity int = 1
    
    resource modelDeployment 'Microsoft.CognitiveServices/accounts/deployments@2024-04-01-preview' = {
      name: '${accountName}/${modelName}'
      sku: {
        name: skuName
        capacity: capacity
      }
      properties: {
        model: {
          format: modelPublisherFormat
          name: modelName
          version: modelVersion
        }
        raiPolicyName: contentFilterPolicyName == null ? 'Microsoft.Nill' : contentFilterPolicyName
      }
    }
    
  3. Create the main deployment definition:

    main.bicep

    param accountName string
    param modelName string
    param modelVersion string
    param modelPublisherFormat string
    param contentFilterPolicyName string
    
    // Both modules deploy into the resource group targeted by the deployment,
    // so no explicit scope is required.
    module raiPolicy 'ai-services-content-filter-template.bicep' = {
      name: 'raiPolicy'
      params: {
        accountName: accountName
        policyName: contentFilterPolicyName
      }
    }
    
    module modelDeployment 'ai-services-deployment-template.bicep' = {
      name: 'modelDeployment'
      params: {
        accountName: accountName
        modelName: modelName
        modelVersion: modelVersion
        modelPublisherFormat: modelPublisherFormat
        contentFilterPolicyName: contentFilterPolicyName
      }
      dependsOn: [
        raiPolicy
      ]
    }
    
  4. Run the deployment (an optional verification example follows these steps):

    RESOURCE_GROUP="<resource-group-name>"
    ACCOUNT_NAME="<azure-ai-model-inference-name>"
    MODEL_NAME="Phi-3.5-vision-instruct"
    PROVIDER="Microsoft"
    VERSION=2
    RAI_POLICY_NAME="custom-policy"
    
    az deployment group create \
        --resource-group $RESOURCE_GROUP \
        --template-file main.bicep \
        --parameters accountName=$ACCOUNT_NAME contentFilterPolicyName=$RAI_POLICY_NAME modelName=$MODEL_NAME modelVersion=$VERSION modelPublisherFormat=$PROVIDER
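
Once the deployment finishes, you can optionally confirm that the policy is attached to the new model deployment. The following query is only an illustrative sketch: it reuses the variables defined in the previous step and assumes the az cognitiveservices account deployment show command is available in your CLI version and that the deployment name matches the model name used above.

# Read back the content filter (RAI) policy associated with the deployment.
az cognitiveservices account deployment show \
    --resource-group $RESOURCE_GROUP \
    --name $ACCOUNT_NAME \
    --deployment-name $MODEL_NAME \
    --query "properties.raiPolicyName" \
    --output tsv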
    

Account for content filtering in your code

Once content filtering is applied to your model deployment, the service can intercept requests based on their inputs and outputs. When a content filter is triggered, the service returns a 400 error code along with a description of the rule that was triggered.
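
The error payload follows the standard Azure error format, with a code and a message describing the rule that was triggered. The body shown here is only an illustrative sketch; the content_filter code and the message text are assumptions, and the actual response can include additional fields.

{
    "error": {
        "code": "content_filter",
        "message": "The prompt was filtered due to triggering the content management policy."
    }
}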

Install the package azure-ai-inference using your package manager, like pip:

pip install "azure-ai-inference>=1.0.0b5"

Warning

The Azure AI Services resource requires version azure-ai-inference>=1.0.0b5 of the Python package.
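
If you're not sure which version is installed, you can check it with pip:

pip show azure-ai-inference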

Then, you can use the package to consume the model. The following example shows how to create a client to consume chat completions:

import os
from azure.ai.inference import ChatCompletionsClient
from azure.core.credentials import AzureKeyCredential

# Create a client for the chat completions endpoint of your resource.
# Replace <resource> with your resource name; the key is read from an environment variable.
model = ChatCompletionsClient(
    endpoint="https://<resource>.services.ai.azure.com/models",
    credential=AzureKeyCredential(os.environ["AZUREAI_ENDPOINT_KEY"]),
)

Explore our samples and read the API reference documentation to get started.

The following example shows how to handle the error response for a chat completions request that triggers the content filter.

import json

from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.exceptions import HttpResponseError

try:
    response = model.complete(
        messages=[
            SystemMessage(content="You are an AI assistant that helps people find information."),
            UserMessage(content="Chopping tomatoes and cutting them into cubes or wedges are great ways to practice your knife skills."),
        ]
    )

    print(response.choices[0].message.content)

except HttpResponseError as ex:
    if ex.status_code == 400:
        # The service returns the content filter details in the error body.
        response = json.loads(ex.response._content.decode('utf-8'))
        if isinstance(response, dict) and "error" in response:
            print(f"Your request triggered a {response['error']['code']} error:\n\t {response['error']['message']}")
        else:
            raise ex
    else:
        raise ex

Follow best practices

We recommend informing your content filtering configuration decisions through an iterative identification (for example, red team testing, stress-testing, and analysis) and measurement process to address the potential harms that are relevant for a specific model, application, and deployment scenario. After you implement mitigations such as content filtering, repeat measurement to test effectiveness.

Recommendations and best practices for Responsible AI for Azure OpenAI, grounded in the Microsoft Responsible AI Standard, can be found in the Responsible AI Overview for Azure OpenAI.

Next steps