Deployed model using unclear tokens

Question

Deployed model using unclear tokens

Matej Jakubčík 85

I have deployed a gpt4 model in Azure foundry. The problem is that sometimes in the metrics it is displaying that the model is using tokens but I did not do anything with it. How is it possible? What could be the cause of that? What can I do in order to control the number of tokens this model process in order to not be surprised with costly bill?

Alekhya Vaddepally 0 Reputation points Microsoft External Staff

2025-03-06T10:11:46.4466667+00:00

Hi Matej Jakubčík,

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet. In case if you have any resolution, please do share that same with the community as it can be helpful to others. Otherwise, will respond with more details and we will try to help
Alekhya Vaddepally 0 Reputation points Microsoft External Staff

2025-03-07T10:13:42.6966667+00:00

Hi Matej Jakubčík,

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet. In case if you have any resolution, please do share that same with the community as it can be helpful to others. Otherwise, will respond with more details and we will try to help

1 answer

Your answer

Alekhya Vaddepally 0 Reputation points Microsoft External Staff

2025-03-06T10:11:46.4466667+00:00

Hi Matej Jakubčík,

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet. In case if you have any resolution, please do share that same with the community as it can be helpful to others. Otherwise, will respond with more details and we will try to help
Alekhya Vaddepally 0 Reputation points Microsoft External Staff

2025-03-07T10:13:42.6966667+00:00

Hi Matej Jakubčík,

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet. In case if you have any resolution, please do share that same with the community as it can be helpful to others. Otherwise, will respond with more details and we will try to help

Answer 1

Hi Matej Jakubčík,

When you deploy a model like GPT-4 in Azure, the token usage can occur even if you are not actively making requests.

these are the possible reasons for why you are seeing token usage:

System Activity or Background Processes may send automated requests to consume tokens

If Retained Sessions application sessions open, interactions might still be happening without you realizing it, and also other applications, scripts, or users with access may be making requests.

To control the number of tokens processed and avoid unexpected costs, consider the following steps:

Use Azure Monitor and Application Insights to track requests being made to the model. This will help identify the source of token usage.

You can also define token limits per request and set quotas using Azure OpenAI service configuration.

check only authorized applications and users can send requests. Rotate keys if needed.

Reduce unnecessary token usage by refining your prompt structures and limiting response length

Set up budget alerts in Azure Cost Management to avoid unexpected charges.

By implementing these strategies, you can gain more control over token usage and manage costs effectively.

https://learn.microsoft.com/en-us/azure/ai-foundry/model-inference/how-to/manage-costs#understand-model-inference-billing-model

https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/manage-costs#understand-the-azure-openai-full-billing-model

https://learn.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview

https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits

If the answer is helpful, please click Accept Answer and kindly upvote it so that other people who faces similar issue may get benefitted from it.

Let me know if you have any further Queries.

Share via

Deployed model using unclear tokens

1 answer

Your answer