Hello Aryan Behal,
Welcome to Microsoft Q&A, and thank you for posting your question here.
I understand that you would like to know how to achieve better load balancing since your VMSS max CPU usage is 100% while average usage is less than 40%.
While the solution provided by @Sai Krishna Katakam covers many important aspects, here are a few additional recommendations to further optimize load balancing:
- Make sure that your autoscaling policies are not only based on CPU usage but also consider other metrics like memory usage, disk I/O, and network traffic. This can provide a more holistic approach to scaling. - https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-autoscale-overview
- Use Azure Traffic Analytics to gain deeper insights into traffic patterns and identify any anomalies or bottlenecks. - https://learn.microsoft.com/en-us/azure/network-watcher/traffic-analytics
- If session affinity (also known as sticky sessions) is enabled, it can lead to uneven load distribution. Consider disabling it if your application can handle it. - https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-troubleshoot-backend-traffic
- Investigate if there are any application-level bottlenecks causing uneven load distribution. This could involve optimizing code, database queries, or other resources.
- Evaluate if scaling up (using larger VM sizes) might be more effective than scaling out (adding more instances), depending on your application's characteristics.
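Before changing anything, it helps to confirm which instances are actually running hot. The snippet below is only a minimal sketch of that check: it assumes you have already exported per-instance CPU percentages (for example from Azure Monitor) into a plain dictionary. The function name `find_hot_instances`, the instance names, the sample values, and the 90% threshold are all hypothetical and not part of any Azure API.

```python
from statistics import mean


def find_hot_instances(cpu_by_instance, hot_threshold=90.0):
    """Return (average CPU, max CPU, sorted list of hot instance names).

    cpu_by_instance maps an instance name to its CPU percentage.
    hot_threshold is an illustrative cut-off, not an Azure default.
    """
    if not cpu_by_instance:
        raise ValueError("cpu_by_instance must not be empty")
    values = list(cpu_by_instance.values())
    avg = mean(values)
    peak = max(values)
    hot = sorted(name for name, cpu in cpu_by_instance.items() if cpu >= hot_threshold)
    return avg, peak, hot


# Hypothetical readings that mirror the symptom: one saturated instance, low average.
samples = {"vmss_0": 100.0, "vmss_1": 20.0, "vmss_2": 18.0, "vmss_3": 14.0}
avg, peak, hot = find_hot_instances(samples)
print(f"average={avg:.1f}% max={peak:.1f}% hot={hot}")
```

If the same one or two instances keep showing up as hot while the rest stay idle, that points toward session affinity or an application-level hotspot rather than a lack of capacity, and adding more instances alone is unlikely to help.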
I hope this is helpful! By implementing these additional recommendations along with the provided solution, you should be able to achieve better load balancing and reduce latency issues in your Azure VMSS deployment. Do not hesitate to let me know if you have any other questions.
Please don't forget to close out the thread by upvoting this answer and accepting it if it was helpful.