How to achieve 50,000 Azure OpenAI API requests in 5 hours, each containing a minimum of 10k input tokens and 7k output tokens?

Johnimmanuel Vetri 20 Reputation points
2024-11-21T09:16:00.56+00:00

For my use case, I want to process text data using Azure OpenAI. The model I'm using is gpt-4o-mini. I need to process 15,000 * 50,000 tokens in 8 hours, so the rate limit on API calls per minute is restricting my solution. Is there a better way to do this?

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

2 answers

  1. Shikha Ghildiyal 695 Reputation points Microsoft Employee
    2024-11-22T06:36:07.8566667+00:00

    Azure OpenAI applies token and request limits to each model deployment, including gpt-4o-mini, so the first step is to confirm that your workload fits within the quota assigned to your deployment. Quota is expressed in tokens per minute (TPM), with a matching requests-per-minute (RPM) limit, and it varies by region, model, and subscription.

    Your target of 15,000 * 50,000 tokens is 750 million tokens. Spread over 8 hours (480 minutes), that is about 1.56 million tokens per minute and about 104 API calls per minute, with each call carrying roughly 10,000 input tokens and 7,000 output tokens. The request count is modest; the sustained token rate is what will run into the rate limit. This is only a theoretical calculation, and it may not be possible to sustain this rate in practice.
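A quick way to size the workload is to compute the required rates directly from the figures in the question (50,000 calls, 15,000 tokens per call, 8 hours). This is only a sketch: the function name `required_rates` is illustrative, and the result says nothing about the quota a given deployment actually has.

```python
# Required throughput for the workload described in the question.
# All inputs are taken from the question; nothing here queries Azure.

def required_rates(calls: int, tokens_per_call: int, hours: float) -> tuple[float, float]:
    """Return (requests per minute, tokens per minute) needed to finish on time."""
    minutes = hours * 60
    rpm = calls / minutes
    tpm = calls * tokens_per_call / minutes
    return rpm, tpm


rpm, tpm = required_rates(calls=50_000, tokens_per_call=15_000, hours=8)
print(f"{rpm:.0f} requests/min, {tpm:,.0f} tokens/min")
# prints "104 requests/min, 1,562,500 tokens/min"
```

Running the same function with the title's figures (17,000 tokens per call, 5 hours) gives a noticeably higher token rate, so it is worth deciding which target applies before comparing against the deployment's TPM quota.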

    To optimize the performance of your solution, you can try the following techniques:

    Implement retry logic in your application to handle any errors or timeouts.

    Avoid sharp changes in the workload. Increase the workload gradually.

    Test different load increase patterns to find the optimal rate limit for your solution.

    Create another OpenAI service resource in the same or different regions, and distribute the workload among them.
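Two of the techniques above, retry logic and spreading calls across several resources, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Azure SDK's API: `RateLimited`, `call_with_retry`, `round_robin`, and the resource names are all hypothetical, and `send` stands in for whatever function actually issues the request.

```python
import itertools
import random
import time
from typing import Callable, Iterable, Iterator


class RateLimited(Exception):
    """Hypothetical stand-in for a throttling (HTTP 429) response."""


def call_with_retry(
    send: Callable[[str], str],
    endpoint: str,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send(endpoint), backing off exponentially with jitter on RateLimited."""
    for attempt in range(max_attempts):
        try:
            return send(endpoint)
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The base delay doubles each attempt; jitter spreads out retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    raise ValueError("max_attempts must be at least 1")


def round_robin(endpoints: Iterable[str]) -> Iterator[str]:
    """Yield endpoints in rotation so the load is spread evenly across them."""
    return itertools.cycle(list(endpoints))


if __name__ == "__main__":
    # Fake sender for demonstration: the first call is throttled, the rest succeed.
    state = {"calls": 0}

    def fake_send(endpoint: str) -> str:
        state["calls"] += 1
        if state["calls"] == 1:
            raise RateLimited()
        return f"ok from {endpoint}"

    targets = round_robin(["resource-eastus", "resource-westus"])  # hypothetical names
    for _ in range(3):
        print(call_with_retry(fake_send, next(targets), sleep=lambda s: None))
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting. In a real client, `send` would wrap the actual request and raise on a throttling response.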

    Please do not forget to “Accept the answer” and “up-vote” wherever the information provided helps you; this can be beneficial to other community members.

    I hope this helps! Let me know if you have any further questions.


