Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-08-01-preview have exceeded token rate limit of your current AIServices S0 pricing tier

Marco Moroni 0 Reputation points
2025-02-24T10:57:11.2433333+00:00

Hi all,

We get the following error during the AI chat in Tungsten TotalAgility 8.1:

"The execution of a locator method failed. Class = "Cargo Manifest", Locator = "AE_datiCM", Original error message:

Web service failure: error code=0x803d0013

The server returned a fault:

Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-08-01-preview have exceeded token rate limit of your current AIServices S0 pricing tier. Please retry after 86400 seconds. Please contact Azure support service if you would like to further increase the default rate limit"

We need to increase the rate limit of the S0 tier on our subscription in order to use the chat.

Many thanks in advance.
******@lynxspa.com, ******@lynxspa.com

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. Manas Mohanty 945 Reputation points Microsoft Vendor
    2025-02-25T16:41:28.1433333+00:00

    Hi Marco Moroni

    A rate-limit error indicates that, at some point during inference, your requests exceeded the estimated maximum number of tokens processed per minute for your deployment.

    You might be sending long queries, generating long outputs, or searching a very large index.

    Possible solutions:

    1. Increase the tokens-per-minute (TPM) rate limit of your model deployment, if your quota allows it.
    2. Keep your prompts short, precise, and clear.
    3. Adjust the system message so that answers stay short.
    4. Implement a retry mechanism with a sleep time between attempts.
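
A retry mechanism with a sleep time could look roughly like the sketch below. It is only an illustration under stated assumptions: `RateLimited` and `call_with_retry` are hypothetical names, not part of any Azure or Tungsten SDK, and `send` stands for whatever function issues the chat request. The sketch backs off exponentially, honors a server-suggested wait when one is given, and gives up when the suggested wait is very long (such as the 86400 seconds in the error above), since sleeping does not help in that case.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the client's HTTP 429 (rate limit) error."""

    def __init__(self, retry_after=None):
        super().__init__("rate limit exceeded")
        # Seconds the server suggests waiting, if it provided a value.
        self.retry_after = retry_after


def call_with_retry(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                    sleep=time.sleep):
    """Call send() and retry on RateLimited with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error
            if err.retry_after is not None and err.retry_after > max_delay:
                raise  # wait is too long to sleep through; quota must change
            # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
            backoff = min(max_delay, base_delay * 2 ** attempt)
            delay = max(backoff, err.retry_after or 0)
            # Small random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, 0.1 * delay))
```

In practice, `send` would wrap the real request and translate the client's rate-limit error into `RateLimited`, copying the suggested wait time from the response if it is available.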

    Reference thread.

    Thank you.
