I'm getting Server Error from Azure OpenAI realtime API in 'response.done'

Question

AOAI server error

I'm encountering a server_error when trying to access the Azure /openai/realtime API. I've deployed the gpt-4o-realtime-preview model in the Sweden Central region. Initially, the response.done status was "Completed," but after a few successful trials, it started failing and returning a server_error message. FYI, I'm utilizing the VoiceRAG approach as suggested in https://github.com/Azure-Samples/aisearch-openai-rag-audio

I’d appreciate any guidance on resolving this issue. Thanks in advance!

Answer

Hello Henok Birru,

Welcome to Microsoft Q&A, and thank you for posting your question here.

I understand that you are getting a server error from the Azure OpenAI realtime API in 'response.done'.

Before debugging the application, ensure there are no known issues with the Azure OpenAI service in Sweden Central:

- Check the Azure status page: https://status.azure.com
- Check the OpenAI service quotas in the Azure Portal: navigate to Azure Portal > OpenAI Resource > Usage and Quotas, and look for rate limits, concurrent requests, and token limits.

Since the error appeared after multiple successful requests, it may be due to exceeding rate limits, so run the following Azure CLI command to check rate limits:

az cognitiveservices account list-usage --resource-group <resource-group> --name <resource-name>

If you are exceeding the limits, try reducing the request frequency or requesting a quota increase through Azure support.

Also, you can modify your API request handling to capture full error details:

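   import openai  # legacy SDK (openai<1.0); openai.ChatCompletion and openai.error were removed in 1.0

   response = None
   try:
       response = openai.ChatCompletion.create(
           model="gpt-4o-realtime-preview",
           messages=[{"role": "user", "content": "Hello, how are you?"}]
       )
   except openai.error.OpenAIError as e:
       # http_status tells a 429 apart from a 500/503; json_body holds the parsed
       # error payload when the service returned JSON, else fall back to the raw body
       print("Status:", e.http_status)
       print("Error:", e.json_body or e.http_body)
   if response:
       print("Response:", response)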

If the error includes a 429 status, you are being rate limited. A 500 or 503 status indicates a problem on the service side rather than in your request.
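One caveat on the status codes above: the /openai/realtime endpoint is a WebSocket API, so a failure reported in 'response.done' arrives as a JSON event on the socket, not as an HTTP exception. Below is a minimal sketch of pulling the error details out of such an event. It assumes the event carries response.status and response.status_details.error as in the realtime event reference; the helper name describe_response_done and the sample payload are hypothetical.

```python
import json
from typing import Optional


def describe_response_done(raw_message: str) -> Optional[dict]:
    """Return error details if the message is a failed 'response.done' event, else None."""
    event = json.loads(raw_message)
    if event.get("type") != "response.done":
        return None  # Some other realtime event
    response = event.get("response") or {}
    if response.get("status") != "failed":
        return None  # Completed, cancelled, or incomplete responses are not handled here
    # Assumed shape: status_details = {"type": "failed", "error": {...}}
    error = (response.get("status_details") or {}).get("error") or {}
    return {
        "response_id": response.get("id"),
        "error_type": error.get("type"),
        "error_code": error.get("code"),
        "message": error.get("message"),
    }


# Hypothetical payload, for illustration only
sample = json.dumps({
    "type": "response.done",
    "response": {
        "id": "resp_example",
        "status": "failed",
        "status_details": {
            "type": "failed",
            "error": {"type": "server_error", "code": None, "message": "example message"},
        },
    },
})
print(describe_response_done(sample))
```

Logging this output for every failed response, together with a timestamp, gives you something concrete to compare against your quota usage and to include in a support request.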

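Since the issue may be region-dependent, try deploying the same model in another region, such as West Europe or East US if the model is offered there, to see if the problem persists. A deployment inherits its region from the Azure OpenAI resource it belongs to, so create (or reuse) a resource in the target region and deploy the model to it:

az cognitiveservices account deployment create --resource-group <resource-group> --name <resource-name> --deployment-name <deployment-name> --model-name gpt-4o-realtime-preview --model-version <model-version> --model-format OpenAI --sku-name <sku-name> --sku-capacity <capacity>

If it works in another region but fails in Sweden Central, the issue is likely region-specific instability.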

Alternatively, if the API is intermittently unstable, implement exponential backoff to retry after failures:

   import time
   import openai  # legacy SDK (openai<1.0)

   def call_openai_with_retry():
       retries = 5
       delay = 2  # Start with a 2-second delay
       for attempt in range(retries):
           try:
               return openai.ChatCompletion.create(
                   model="gpt-4o-realtime-preview",
                   messages=[{"role": "user", "content": "Hello"}]
               )
           except openai.error.OpenAIError as e:
               if attempt == retries - 1:
                   # No point sleeping after the final attempt
                   print(f"Error: {e}. Giving up after {retries} attempts.")
                   break
               print(f"Error: {e}. Retrying in {delay} seconds...")
               time.sleep(delay)
               delay *= 2  # Double the delay each time
       return None  # All retries failed

   response = call_openai_with_retry()
   if response:
       print("Success:", response)
   else:
       print("API failed after retries.")
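The loop above retries on every error. A variation worth considering retries only on throttling or server-side statuses and adds random jitter, so that several clients do not all retry at the same moment. This is a sketch that does not depend on any SDK: retry_with_backoff, ApiError, and the set of retryable statuses are illustrative assumptions, not part of the openai package.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

RETRYABLE_STATUSES = {429, 500, 503}  # Assumption: statuses treated as transient


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP-like status code."""

    def __init__(self, status: int, message: str = ""):
        super().__init__(message or f"status {status}")
        self.status = status


def retry_with_backoff(
    call: Callable[[], T],
    retries: int = 5,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying transient ApiErrors with exponential backoff plus jitter."""
    for attempt in range(retries):
        try:
            return call()
        except ApiError as e:
            last_attempt = attempt == retries - 1
            if e.status not in RETRYABLE_STATUSES or last_attempt:
                raise  # Non-transient error, or no attempts left
            # Delay grows as base, 2*base, 4*base, ... plus up to 1 second of jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            sleep(delay)
    raise RuntimeError("retries must be at least 1")
```

Wrapping the real request in a small function that raises ApiError with the observed status lets the same helper serve both HTTP calls and failed realtime responses. The sleep parameter exists so the delay can be replaced in tests.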

If the issue persists and none of the above resolves it, contact Azure Support: https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade

I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.

Please don't forget to close the thread by upvoting this and accepting it as the answer if it was helpful.
