Thanks for posting your question on Microsoft Q&A.
This behavior could be related to the rate limits and quotas set for Azure OpenAI Service. The service evaluates the rate of incoming requests over short periods, typically 1 or 10 seconds, rather than only over a full minute. If the number of requests in one of those periods exceeds the expected rate, new requests may receive a timeout or a rate limit response (HTTP 429, Too Many Requests), even when your per-minute total is still under quota.
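As a rough illustration of why short bursts matter, the sketch below converts a per-minute quota into the budget available in a 1- or 10-second window, assuming the service expects requests to be spread evenly across the minute. The 600 requests-per-minute figure and the function name are hypothetical, and the exact enforcement logic is internal to the service, so treat the numbers as an approximation and check your own deployment's quota.

```python
def window_budget(requests_per_minute: int, window_seconds: int) -> float:
    """Requests that fit in one window if the per-minute quota is spread evenly."""
    return requests_per_minute * window_seconds / 60


if __name__ == "__main__":
    rpm = 600  # hypothetical quota, not taken from any real deployment
    for window in (1, 10):
        print(f"{window:>2}s window: about {window_budget(rpm, window):.0f} requests")
```

Under that assumption, sending 50 requests in a single second could be throttled even though 50 is far below the per-minute quota.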
To address this issue, consider the following steps:
- Review Rate Limits: Ensure that your request rate is within the allowed limits. You can find more details in the documentation on managing Azure OpenAI Service quotas.
- Optimize Request Distribution: Spread your requests evenly over time instead of sending them in bursts, so that no short evaluation period exceeds the rate limits.
- Increase Quotas: If necessary, you can request an increase in your service quotas through the Azure portal.
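On the client side, the second step might look like the minimal sketch below: one helper that paces calls at a fixed interval, and one that retries with exponential backoff when a call is rate limited. It is deliberately independent of any SDK. `RateLimited`, `paced`, and `call_with_backoff` are hypothetical names, and it assumes your own request code raises `RateLimited` when it sees an HTTP 429, passing along a retry-after value in seconds if the response provides one.

```python
import random
import time
from typing import Callable, Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error your request code raises on an HTTP 429 response."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds to wait, if the response gave a hint


def paced(
    items: Iterable[T],
    requests_per_minute: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items with an even gap between them instead of in a burst."""
    interval = 60 / requests_per_minute
    for index, item in enumerate(items):
        if index > 0:
            sleep(interval)  # no wait before the first item
        yield item


def call_with_backoff(
    send: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(); on RateLimited, wait and retry (max_attempts must be >= 1)."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the error to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # prefer the server's hint when present
            else:
                delay = min(max_delay, base_delay * 2 ** attempt)
                delay += random.uniform(0, delay * 0.1)  # small jitter
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

For example, `for prompt in paced(prompts, requests_per_minute=60): call_with_backoff(lambda: send_request(prompt))` would issue about one request per second and back off whenever a request is throttled, where `send_request` stands in for your own request function.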
For more detailed guidance, you can refer to the Azure OpenAI Service quota management documentation.
If the reply was helpful, please don't forget to upvote and/or accept it as an answer. Let me know if you have any other queries.
Thank you!