Does the azure-openai-emit-token-metric policy in Azure API Management support cached tokens?
Azure API Management recently updated the azure-openai-emit-token-metric policy to support the GPT-4o model.
Does this policy support recording cached tokens? According to the official documentation on Microsoft Learn, this does not appear to be supported:
https://learn.microsoft.com/en-us/azure/api-management/azure-openai-emit-token-metric-policy
The documentation lists only these token count metrics: Total Tokens, Prompt Tokens, and Completion Tokens.
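For context, here is a minimal sketch of the policy configuration I am referring to, based on the example in the linked documentation (the namespace and dimension values are illustrative placeholders):

```xml
<policies>
    <inbound>
        <!-- Emits Total Tokens, Prompt Tokens, and Completion Tokens
             as custom metrics to Application Insights under the given namespace -->
        <azure-openai-emit-token-metric namespace="openai-metrics">
            <dimension name="API ID" />
            <dimension name="Client IP" value="@(context.Request.IpAddress)" />
        </azure-openai-emit-token-metric>
    </inbound>
    <outbound />
</policies>
```

As far as I can tell, there is no documented attribute or dimension for emitting cached token counts, which is what prompted this question.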
However, I'm unsure whether this is because the documentation hasn't been updated or because cached tokens are genuinely not supported.