Hi @Mohammad
Quota is assigned to your subscription on a per-region, per-model basis in units of Tokens-per-Minute (TPM). When you onboard a subscription to Azure OpenAI, you'll receive default quota for most available models.
Your TPM quota allocation is not related to a model's maximum input token limit. Input token limits are defined per model in the models table and are not affected by changes to TPM.
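To illustrate the difference, here is a minimal Python sketch. It is not an Azure API: the `Deployment` class, the helper functions, and all numbers are hypothetical, and the real service's rate limiting is more nuanced than this simple per-minute sum. It only shows that the per-request input limit and the TPM quota are two independent checks.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    # Hypothetical values for illustration only; real limits come from
    # the models table and from your quota settings.
    max_input_tokens: int  # per-request limit, fixed by the model
    tpm_quota: int         # tokens per minute assigned via quota


def fits_model_limit(dep: Deployment, request_tokens: int) -> bool:
    """A single request must fit the model's input limit."""
    return request_tokens <= dep.max_input_tokens


def fits_tpm_budget(dep: Deployment, used_this_minute: int, request_tokens: int) -> bool:
    """Simplified assumption: total tokens in the current minute must stay within TPM."""
    return used_this_minute + request_tokens <= dep.tpm_quota


small_quota = Deployment(max_input_tokens=8_000, tpm_quota=30_000)
large_quota = Deployment(max_input_tokens=8_000, tpm_quota=300_000)

# Raising TPM does not make an oversized prompt acceptable.
print(fits_model_limit(small_quota, 10_000))
print(fits_model_limit(large_quota, 10_000))

# A prompt within the model limit can still be throttled by TPM.
print(fits_tpm_budget(small_quota, 25_000, 6_000))
print(fits_tpm_budget(large_quota, 25_000, 6_000))
```

In this sketch the 10,000-token prompt fails the model limit under both quotas, while the 6,000-token prompt is only blocked when the smaller TPM quota is nearly used up.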
To view your quota or request a quota increase, please see the documentation:
https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/quota?tabs=rest
Please review the documentation above and check whether it resolves your issue. If you need any further assistance, please comment below. Happy to help!
If this answer was useful to you, please accept it.
Regards,
Janarthanan S