Hello Tomás Novo,
Welcome to Microsoft Q&A, and thank you for posting your question here.
From your description and your plan to use auto-scaling, I understand that you want to scale the app to sustain 9000 simultaneous users without bottlenecks or loss of quality.
From a solution architecture perspective, here is how I would handle the new requirement:
- For Azure AI Search, upgrade to the Standard S3 tier. This tier supports up to 36 search units (SUs), where SUs = replicas × partitions, with at most 12 replicas and at most 12 partitions (for example, 12 replicas × 3 partitions, not 12 × 12). Replicas are what add query throughput, so scale replicas first for high query volumes - https://learn.microsoft.com/en-us/azure/search/search-limits-quotas-capacity and https://learn.microsoft.com/en-us/azure/search/search-sku-tier
- For Azure OpenAI, check whether the rate limit of 450,000 tokens per minute (TPM) is sufficient. Spread evenly across 9000 simultaneous users, that is only 50 tokens per user per minute, which is low if most users send requests at the same time. If it is not sufficient, request a quota increase or scale horizontally by adding more deployments of the GPT-4o model and balancing requests across them - https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models and https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits
- For Azure Cosmos DB, move to the standard provisioned throughput model with autoscale enabled, so that throughput scales automatically between 10% of the maximum RU/s you configure and that maximum - https://learn.microsoft.com/en-us/azure/cosmos-db/free-tier and https://azure.microsoft.com/en-us/pricing/details/cosmos-db/autoscale-provisioned
- For App Service, upgrade to the Standard S2 tier. This tier provides more CPU and memory and supports scaling out to a maximum of 10 instances. You will likely need to run multiple instances to handle the load - https://learn.microsoft.com/en-us/azure/app-service/manage-scale-up and https://azure.microsoft.com/en-us/pricing/details/app-service/windows/
- For auto-scaling, implement autoscale rules for every service that supports them, so that resources are allocated dynamically. Azure's built-in autoscaling mechanisms can automatically add or remove resources based on predefined rules or real-time metrics - https://learn.microsoft.com/en-us/azure/architecture/best-practices/auto-scaling and https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-autoscale-overview
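To sanity-check the numbers above before committing to a tier, a rough calculation like the following may help. It is only a sketch: the function names are made up for illustration, and the example workload (one request per user every two minutes, 1,500 tokens per request) is an assumption that you should replace with measurements from your own app. The search check covers only the replica, partition, and search-unit ceilings mentioned above and does not validate every rule of the service.

```python
# Back-of-the-envelope capacity checks for the sizing discussed above.
# Workload figures passed to these helpers are assumptions, not measurements.

MAX_SEARCH_UNITS = 36  # ceiling on replicas x partitions for the Standard tier
MAX_REPLICAS = 12
MAX_PARTITIONS = 12


def tokens_per_user_per_minute(tpm_limit: int, concurrent_users: int) -> float:
    """Token budget per user per minute if every user is active at once."""
    return tpm_limit / concurrent_users


def required_tpm(concurrent_users: int,
                 requests_per_user_per_minute: float,
                 tokens_per_request: int) -> float:
    """Estimated tokens per minute (prompt + completion) for a workload."""
    return concurrent_users * requests_per_user_per_minute * tokens_per_request


def fits_search_unit_limit(replicas: int, partitions: int) -> bool:
    """True if the layout respects the replica, partition, and SU ceilings."""
    return (
        1 <= replicas <= MAX_REPLICAS
        and 1 <= partitions <= MAX_PARTITIONS
        and replicas * partitions <= MAX_SEARCH_UNITS
    )


if __name__ == "__main__":
    # Budget per user under the 450,000 TPM limit with 9000 users.
    print(tokens_per_user_per_minute(450_000, 9_000))  # 50.0

    # Hypothetical workload: 0.5 requests/user/minute, 1,500 tokens each.
    demand = required_tpm(9_000, 0.5, 1_500)
    print(demand, demand <= 450_000)  # 6750000.0 False

    print(fits_search_unit_limit(12, 3))   # True  (36 SUs)
    print(fits_search_unit_limit(12, 12))  # False (144 SUs exceeds 36)
```

If the estimated demand comes out well above the limit, as it does with these assumed figures, that is the signal to plan for additional quota or deployments rather than relying on a single one.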
NOTE:
Be mindful of the cost implications of upgrading these services. Use the Azure Pricing Calculator to estimate the costs based on your specific usage patterns and requirements.
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
Please don't forget to close out the thread by upvoting this response and accepting it as the answer if it is helpful.