VMSS max CPU usage is 100% while average usage is less than 40%. How to achieve better load balancing?

Question

Aryan Behal 20 Microsoft Employee

Hi team,

I have a service running on an Azure Virtual Machine Scale Set that is facing constant latency issues. On checking the CPU usage, I found that the MAX CPU usage is always 100%. I have already raised the VMSS max instance count to 100 (the max possible value), but I am still seeing latency issues.

But if I check the AVG CPU load and MIN CPU load, they are as low as 40% and 2.6%, respectively. Why is the load not being distributed evenly?

Answer 1

Hello Aryan Behal,

Welcome to the Microsoft Q&A and thank you for posting your questions here.

I understand that you would like to know how to achieve better load balancing since your VMSS max CPU usage is 100% while average usage is less than 40%.

While the provided solution by @Sai Krishna Katakam covers many important aspects, here are a few additional recommendations to further optimize load balancing:

Make sure that your autoscaling policies are not only based on CPU usage but also consider other metrics like memory usage, disk I/O, and network traffic. This can provide a more holistic approach to scaling. - https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-autoscale-overview
Use Azure Traffic Analytics to gain deeper insights into traffic patterns and identify any anomalies or bottlenecks. - https://learn.microsoft.com/en-us/azure/network-watcher/traffic-analytics
If session affinity (also known as sticky sessions) is enabled, it can lead to uneven load distribution. Consider disabling it if your application can handle it. - https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-troubleshoot-backend-traffic
Investigate if there are any application-level bottlenecks causing uneven load distribution. This could involve optimizing code, database queries, or other resources.
Evaluate if scaling up (using larger VM sizes) might be more effective than scaling out (adding more instances), depending on your application's characteristics.
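
To illustrate the first recommendation, here is a minimal sketch of a scaling decision that looks at more than CPU. The `Metrics` type, the threshold values, and the `scale_decision` function are hypothetical and only model the idea; they are not the Azure autoscale API. The sketch assumes the commonly documented behaviour that a scale-out happens when any rule is met, while a scale-in needs all rules to be met.

```python
from dataclasses import dataclass, astuple


@dataclass
class Metrics:
    """Average values across the scale set (hypothetical fields)."""
    cpu_pct: float
    memory_pct: float
    network_mbps: float


def scale_decision(avg: Metrics, out_above: Metrics, in_below: Metrics) -> str:
    """Return 'scale_out', 'scale_in', or 'hold' for the given averages."""
    values = astuple(avg)
    # Scale out as soon as ANY metric exceeds its scale-out threshold.
    if any(v > limit for v, limit in zip(values, astuple(out_above))):
        return "scale_out"
    # Scale in only when ALL metrics are below their scale-in thresholds.
    if all(v < limit for v, limit in zip(values, astuple(in_below))):
        return "scale_in"
    return "hold"


# Assumed thresholds, for illustration only.
OUT_ABOVE = Metrics(cpu_pct=70, memory_pct=80, network_mbps=800)
IN_BELOW = Metrics(cpu_pct=30, memory_pct=40, network_mbps=200)

# CPU looks fine on average, but memory pressure still triggers a scale-out.
print(scale_decision(Metrics(40, 85, 100), OUT_ABOVE, IN_BELOW))
```

Note that rules like these act on averages, so they will not react to a single instance pinned at 100% while the average stays near 40%. That case points to the distribution issues in the remaining items.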

I hope this is helpful! By implementing these additional recommendations along with the provided solution, you should be able to achieve better load balancing and reduce latency issues in your Azure VMSS deployment. Do not hesitate to let me know if you have any other questions.

Please don't forget to close the thread by upvoting this answer and accepting it if it is helpful.

Answer 2

Hi Aryan Behal,

Welcome to the Microsoft Q&A Platform! Thank you for asking your question here.

Uneven CPU load in your Azure Virtual Machine Scale Set (VMSS), where max CPU usage hits 100% while average usage remains low, suggests an imbalance in traffic distribution. This can be caused by ineffective load balancing, session persistence, or application bottlenecks.

Make sure your Azure Load Balancer or Application Gateway is configured correctly. On Azure Load Balancer, keep the default 5-tuple hash distribution mode and avoid session persistence (source IP affinity) unless your application requires it. On Application Gateway, leave cookie-based affinity disabled unless it is needed. Verify health probe configurations so that traffic is not routed to unhealthy instances. Adjust autoscaling rules to scale on multiple metrics, such as CPU and memory, with appropriate cooldown periods to avoid scaling too frequently.
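
To show why session persistence can skew load, here is a small simulation that compares a hash over the full 5-tuple with a hash over source and destination IP only. This is a sketch under stated assumptions: SHA-256 is a stand-in and not the hash Azure Load Balancer actually uses, and the client IPs, backend count, and function names are hypothetical.

```python
import hashlib
from collections import Counter
from typing import Iterable, Tuple

# src IP, src port, dst IP, dst port, protocol
Flow = Tuple[str, int, str, int, str]


def pick_backend(key: tuple, backend_count: int) -> int:
    """Map a hash key to a backend index (stand-in hash, not Azure's)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % backend_count


def distribute(flows: Iterable[Flow], backend_count: int, sticky: bool = False) -> Counter:
    """Count connections per backend for the chosen hash key."""
    counts: Counter = Counter()
    for src_ip, src_port, dst_ip, dst_port, proto in flows:
        # Source IP affinity ignores ports, so one client always hits one backend.
        key = (src_ip, dst_ip) if sticky else (src_ip, src_port, dst_ip, dst_port, proto)
        counts[pick_backend(key, backend_count)] += 1
    return counts


# Hypothetical traffic: 3 client IPs (for example, NAT gateways), 1000 connections each.
flows = [
    (f"203.0.113.{client}", 10000 + port, "10.0.0.4", 443, "tcp")
    for client in range(1, 4)
    for port in range(1000)
]

five_tuple = distribute(flows, backend_count=10)
sticky = distribute(flows, backend_count=10, sticky=True)
print("backends used with 5-tuple hash:", len(five_tuple))
print("backends used with source IP affinity:", len(sticky))
```

With only a few distinct client IPs, affinity can never use more backends than there are clients, so a handful of instances carry all the traffic while the rest sit idle. Keep in mind that even the 5-tuple hash balances connections, not requests, so a few long-lived or very busy connections can still overload one instance.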

Use Azure Monitor and Application Insights to analyze traffic distribution and application performance for hotspots. If traffic spans regions, consider Azure Front Door or Traffic Manager for improved global traffic distribution. For workloads with bursty traffic, explore VMSS Flexible Orchestration Mode for better workload management.
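
Once you have per-instance CPU figures (for example, exported from Azure Monitor), a quick check like the one below can point out the hot instances. This is only a sketch: the instance names and values are made up to mirror the numbers in the question, and `find_hot_instances` with its `factor` parameter is a hypothetical helper, not part of any Azure SDK.

```python
from statistics import mean
from typing import Dict, List


def find_hot_instances(cpu_by_instance: Dict[str, float], factor: float = 2.0) -> List[str]:
    """Return instances whose CPU is at least `factor` times the fleet average."""
    average = mean(cpu_by_instance.values())
    return sorted(
        name for name, cpu in cpu_by_instance.items() if cpu >= factor * average
    )


# Hypothetical per-instance CPU percentages (max 100%, min 2.6%, average about 40%).
sample = {"vm_0": 100.0, "vm_1": 35.0, "vm_2": 2.6, "vm_3": 22.4}

print("average CPU:", round(mean(sample.values()), 1))
print("hot instances:", find_hot_instances(sample))
```

If the same few instances show up as hot over time, look at what is pinned to them (sticky clients, long-lived connections, or background jobs) before adding more instances.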

Please refer to the documentation below:
High Availability Ports
Overview of autoscale with Azure Virtual Machine Scale Sets
What is Azure Front Door

If an answer has been helpful, please consider clicking "Accept Answer" and "Upvote" to help increase the visibility of this question for other members of the Microsoft Q&A community.

Sai Krishna Katakam 1,725 Reputation points Microsoft External Staff

2024-11-27T11:00:35.52+00:00

Hello Aryan Behal,

Just checking in to see if you had a chance to review my answer on your question. Please let us know if it was helpful and feel free to reach out if you have any further queries.

If you found the information useful, please click "Accept Answer" and "Upvote" on the post to let us know.

Thank You.
