How to trace your application with Azure AI Inference SDK
Important
Items marked (preview) in this article are currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
In this article, you learn how to trace your application with the Azure AI Inference SDK using Python, JavaScript, or C#. The Azure AI Inference client library supports tracing with OpenTelemetry.
Enable trace in your application
Prerequisites
- An Azure Subscription.
- An Azure AI project, see Create a project in Azure AI Foundry portal.
- An AI model supporting the Azure AI model inference API deployed through Azure AI Foundry.
- If using Python, you need Python 3.8 or later installed, including pip.
- If using JavaScript, the supported environments are LTS versions of Node.js.
Installation
Install the azure-ai-inference package with its opentelemetry extra using your package manager, such as pip:
pip install azure-ai-inference[opentelemetry]
Install the Azure Core OpenTelemetry Tracing plugin, OpenTelemetry, and the OTLP exporter for sending telemetry to your observability backend. To install the necessary packages for Python, use the following pip commands:
pip install opentelemetry-sdk
pip install opentelemetry-exporter-otlp
To learn more about Azure AI Inference SDK for Python and observability, see Tracing via Inference SDK for Python.
To learn more, see the Inference SDK reference.
Configuration
Add the following configuration settings as needed for your use case:
To capture prompt and completion contents, set the AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED environment variable to true (case insensitive). By default, prompts, completions, function names, parameters, or outputs aren't recorded.
To enable Azure SDK tracing, set the AZURE_SDK_TRACING_IMPLEMENTATION environment variable to opentelemetry. Alternatively, you can configure it in code with the following snippet:
from azure.core.settings import settings
settings.tracing_implementation = "opentelemetry"
To learn more, see Azure Core Tracing OpenTelemetry client library for Python.
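If you prefer to set both environment variables from within the Python process rather than in the shell, a minimal sketch could look like the following. The helper name configure_tracing_env is hypothetical and not part of any SDK, and the sketch assumes it runs before instrumentation is enabled.

```python
import os


def configure_tracing_env(record_content: bool = False) -> None:
    """Set the two tracing environment variables for the current process.

    Hypothetical helper for illustration; call it before enabling instrumentation.
    """
    # Route Azure SDK tracing through OpenTelemetry.
    os.environ["AZURE_SDK_TRACING_IMPLEMENTATION"] = "opentelemetry"
    # Prompt and completion contents are recorded only when explicitly enabled.
    os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = (
        "true" if record_content else "false"
    )


configure_tracing_env(record_content=True)
```

Keeping content recording off by default in such a helper mirrors the default described above, so prompts and completions are captured only when you opt in.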
Enable instrumentation
The final step is to enable Azure AI Inference instrumentation with the following code snippet:
from azure.ai.inference.tracing import AIInferenceInstrumentor
# Instrument AI Inference API
AIInferenceInstrumentor().instrument()
You can also uninstrument the Azure AI Inference API by calling uninstrument. After this call, the Azure AI Inference API no longer emits traces until instrument is called again:
AIInferenceInstrumentor().uninstrument()
Tracing your own functions
To trace your own custom functions, instrument your code with the OpenTelemetry SDK. This involves setting up a tracer provider and creating spans around the code you want to trace. Each span represents a unit of work, and spans can be nested to form a trace tree. You can add attributes to spans to enrich the trace data with additional context. Once your code is instrumented, configure an exporter to send the trace data to a backend for analysis and visualization. Tracing your custom functions this way lets you monitor their performance and gain insight into their execution. For detailed instructions and advanced usage, refer to the OpenTelemetry documentation.
Attach user feedback to traces
To attach user feedback to traces and visualize it in Azure AI Foundry portal, instrument your application to enable tracing and log user feedback using OpenTelemetry's semantic conventions. By correlating feedback traces with their respective chat request traces through the response ID, you can view and manage these traces in Azure AI Foundry portal. Because the trace data follows the OpenTelemetry specification, it is standardized and can be analyzed in Azure AI Foundry portal for performance optimization and user experience insights.
Related content
- Python samples containing fully runnable Python code for tracing using synchronous and asynchronous clients.
- JavaScript samples containing fully runnable JavaScript code for tracing using synchronous and asynchronous clients.
- C# samples containing fully runnable C# code for doing inference using synchronous and asynchronous methods.