Hi @Jeremy Lau,
Thank you for reaching out to the Microsoft Q&A forum!
When you purchase an Azure OpenAI provisioned service for $260/month in Australia East with a Shared scope, you are reserving a set number of Provisioned Throughput Units (PTUs), which directly determine your Tokens Per Minute (TPM) quota. The exact number of PTUs you receive for this cost depends on Azure’s pricing and allocation policies for the region. PTUs define how much capacity you have for generating and processing tokens, and each model (e.g., GPT-4-Turbo, GPT-3.5-Turbo) has its own PTU-to-TPM mapping. For example, a single PTU might correspond to a few thousand tokens per minute, but this varies with the model and Azure’s internal provisioning. Since your purchase falls under Provisioned Managed Global, your service is likely pooled with others, meaning your TPM allocation may be dynamically managed rather than fixed.
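To make the PTU-to-TPM relationship concrete, here is a minimal Python sketch of the arithmetic. The model names and per-PTU figures in `ASSUMED_TPM_PER_PTU` are hypothetical placeholders, not Azure's published values, so please substitute the numbers shown for your model and region.

```python
# Hypothetical per-PTU throughput (tokens per minute) for illustration only.
# These are NOT Azure's published figures; replace them with the values
# documented for your model and region.
ASSUMED_TPM_PER_PTU = {
    "model-a": 2_500,
    "model-b": 3_700,
}


def estimate_tpm(model: str, ptus: int) -> int:
    """Estimate total TPM as (assumed TPM per PTU) * (number of PTUs)."""
    if ptus < 0:
        raise ValueError("ptus must be non-negative")
    return ASSUMED_TPM_PER_PTU[model] * ptus


if __name__ == "__main__":
    # 50 PTUs at an assumed 2,500 TPM per PTU
    print(estimate_tpm("model-a", 50))  # 125000
```

This only shows the linear relationship described above. If your allocation is pooled and managed dynamically, the throughput you actually observe may differ from this estimate.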
To estimate provisioned capacity using request-level data, open the capacity planner in Azure AI Foundry. It is under Shared resources > Model Quota > Azure OpenAI Provisioned.
The Provisioned option and the capacity planner are only available in certain regions within the Quota pane. If you don't see this option, setting the quota region to Sweden Central will make it available. For more info, please look into this page.
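If you would like a rough idea of what the capacity planner does before opening it, the sketch below estimates the PTUs needed from request-level data. It is a simplification under stated assumptions: the function name and the `assumed_tpm_per_ptu` parameter are hypothetical, and the real planner may weight prompt and completion tokens differently and apply minimum deployment sizes that are not modeled here.

```python
import math


def estimate_required_ptus(
    calls_per_minute: int,
    prompt_tokens: int,
    completion_tokens: int,
    assumed_tpm_per_ptu: int,
) -> tuple[int, int]:
    """Return (required TPM, estimated PTUs) for a steady workload.

    Assumption: every token counts equally and throughput scales
    linearly with PTUs. The actual capacity planner may differ.
    """
    if assumed_tpm_per_ptu <= 0:
        raise ValueError("assumed_tpm_per_ptu must be positive")
    # Tokens processed per minute = calls per minute * tokens per call
    required_tpm = calls_per_minute * (prompt_tokens + completion_tokens)
    # Round up, because a fraction of a PTU cannot be purchased
    ptus = math.ceil(required_tpm / assumed_tpm_per_ptu)
    return required_tpm, ptus


if __name__ == "__main__":
    # 60 calls/min, 1,000 prompt + 200 completion tokens per call,
    # at an assumed (hypothetical) 2,500 TPM per PTU
    print(estimate_required_ptus(60, 1000, 200, 2500))  # (72000, 29)
```

Please treat the result as a starting point only and confirm the figure with the capacity planner itself.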
To determine the exact TPM quota for your purchase, you can check the Azure OpenAI Quota page. If your TPM is lower than needed, you may be able to increase PTUs by requesting a quota adjustment, though availability depends on regional capacity.
For more info, please refer to:
Let me know if you need help finding specific PTU-to-TPM mappings for your region and model.