VMSS max CPU usage is 100% while average usage is less than 40%. How to achieve better load balancing?

Aryan Behal 20 Reputation points Microsoft Employee
2024-11-25T09:38:37.9166667+00:00

Hi team,

I have a service running on an Azure Virtual Machine Scale Set (VMSS) that is facing a constant latency issue. On checking CPU usage, I found that the max CPU usage is always 100%. I have already raised the VMSS maximum instance count to 100 (the maximum possible value) but am still seeing the latency issue.

But if I check AVG CPU load and MIN CPU load, they are as low as 40% and 2.6%. Why is load not getting distributed evenly?

Azure Service Fabric
An Azure service that is used to develop microservices and orchestrate containers on Windows and Linux.
Azure Virtual Machine Scale Sets
Azure compute resources that are used to create and manage groups of heterogeneous load-balanced virtual machines.

Accepted answer
  1. Sina Salam 14,551 Reputation points
    2024-11-25T15:31:39.18+00:00

    Hello Aryan Behal,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you would like to know how to achieve better load balancing since your VMSS max CPU usage is 100% while average usage is less than 40%.

    While the provided solution by @Sai Krishna Katakam covers many important aspects, here are a few additional recommendations to further optimize load balancing:

    1. Make sure that your autoscaling policies are not only based on CPU usage but also consider other metrics like memory usage, disk I/O, and network traffic. This can provide a more holistic approach to scaling. - https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-autoscale-overview
    2. Use Azure Traffic Analytics to gain deeper insights into traffic patterns and identify any anomalies or bottlenecks. - https://learn.microsoft.com/en-us/azure/network-watcher/traffic-analytics
    3. If session affinity (also known as sticky sessions) is enabled, it can lead to uneven load distribution. Consider disabling it if your application can handle it. - https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-troubleshoot-backend-traffic
    4. Investigate if there are any application-level bottlenecks causing uneven load distribution. This could involve optimizing code, database queries, or other resources.
    5. Evaluate if scaling up (using larger VM sizes) might be more effective than scaling out (adding more instances), depending on your application's characteristics.
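To illustrate the first recommendation, here is a minimal sketch of why a scale-out rule keyed on the average CPU across instances can stay quiet while one instance is saturated. It is plain Python, not the Azure SDK; the function name `scale_out_needed`, the 70% threshold, and the per-instance readings are hypothetical values chosen to resemble the numbers in the question.

```python
from statistics import mean


def scale_out_needed(cpu_by_instance, threshold, statistic="average"):
    """Return True if the chosen cross-instance statistic exceeds the threshold.

    cpu_by_instance maps a (hypothetical) instance name to its CPU percentage.
    """
    values = list(cpu_by_instance.values())
    if not values:
        return False
    aggregators = {"average": mean, "max": max, "min": min}
    return aggregators[statistic](values) > threshold


# Hypothetical snapshot: one hot instance, the rest mostly idle.
cpu = {"vm_0": 100.0, "vm_1": 2.6, "vm_2": 35.0, "vm_3": 22.4}

print(scale_out_needed(cpu, 70, "average"))  # False: the average is 40.0
print(scale_out_needed(cpu, 70, "max"))      # True: one instance is at 100
```

Note that in this situation adding instances only lowers the average further; the hot instance stays hot unless traffic is actually redistributed, which is why the affinity and application-level checks in items 3 and 4 matter.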

    I hope this is helpful! By implementing these additional recommendations along with the provided solution, you should be able to achieve better load balancing and reduce latency in your Azure VMSS deployment. Do not hesitate to let me know if you have any other questions.


    Please don't forget to close out the thread by upvoting this answer and accepting it if it was helpful.


1 additional answer

  1. Sai Krishna Katakam 1,510 Reputation points Microsoft Vendor
    2024-11-25T13:31:06.25+00:00

    Hi Aryan Behal,

    Welcome to the Microsoft Q&A Platform! Thank you for asking your question here.

    Uneven CPU load in your Azure Virtual Machine Scale Set (VMSS), where max CPU usage hits 100% while average usage remains low, suggests a traffic distribution imbalance. This can be caused by ineffective load balancing, session persistence, or application bottlenecks.

    Make sure your Azure Load Balancer or Application Gateway is configured correctly. For Azure Load Balancer, keep the default 5-tuple hash distribution mode (source IP, source port, destination IP, destination port, protocol) and avoid session persistence unless necessary; for Application Gateway, avoid cookie-based affinity unless your application needs it. Verify health probe configurations to prevent traffic from being routed to unhealthy instances. Adjust autoscaling rules to scale based on multiple metrics such as CPU and memory, with appropriate cooldown periods to avoid frequent scaling.
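The effect of the hash mode on distribution can be sketched with a small simulation. This is an illustration only: SHA-256 stands in for whatever hash the load balancer really uses, and the addresses, port numbers, and backend count are made up. It compares hashing on the full 5-tuple with hashing on source and destination IP only (the session persistence case) for traffic coming from just four client IPs.

```python
import hashlib
from collections import Counter


def pick_backend(key_fields, backend_count):
    """Map a tuple of flow fields to a backend index using a stable hash."""
    raw = "|".join(str(field) for field in key_fields).encode()
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest[:8], "big") % backend_count


def distribute(flows, backend_count, mode):
    """Count how many flows land on each backend for a given hash mode."""
    counts = Counter()
    for src_ip, src_port, dst_ip, dst_port, proto in flows:
        if mode == "five_tuple":
            key = (src_ip, src_port, dst_ip, dst_port, proto)
        elif mode == "source_ip":
            key = (src_ip, dst_ip)  # 2-tuple affinity: ports are ignored
        else:
            raise ValueError(f"unknown mode: {mode}")
        counts[pick_backend(key, backend_count)] += 1
    return counts


# 1000 hypothetical flows from only 4 client IPs, each with its own source port.
flows = [(f"10.0.0.{i % 4}", 40000 + i, "20.0.0.1", 443, "tcp") for i in range(1000)]

five_tuple = distribute(flows, 10, "five_tuple")
sticky = distribute(flows, 10, "source_ip")

print("5-tuple busiest backend:", max(five_tuple.values()))
print("source-IP busiest backend:", max(sticky.values()))
print("backends used with source-IP affinity:", len(sticky))
```

With source-IP affinity, four clients can reach at most four of the ten backends, so the busiest one carries at least a quarter of all flows. Also keep in mind that the hash is applied per flow, so a few long-lived, heavy connections can still pin load to one instance even in 5-tuple mode.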

    Use Azure Monitor and Application Insights to analyze traffic distribution and application performance for hotspots. If traffic spans regions, consider Azure Front Door or Traffic Manager for improved global traffic distribution. For workloads with bursty traffic, explore VMSS Flexible Orchestration Mode for better workload management.
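Once per-instance CPU figures have been exported from your monitoring tool, spotting hotspots is a simple comparison against the fleet average. The sketch below assumes the data is already available as a dictionary; the helper name `find_hot_instances`, the instance names, and the "twice the average" cutoff are hypothetical choices for illustration, not an Azure Monitor API.

```python
from statistics import mean


def find_hot_instances(cpu_by_instance, ratio=2.0):
    """Return the names of instances whose CPU is at least `ratio` times the fleet average."""
    if not cpu_by_instance:
        return []
    average = mean(cpu_by_instance.values())
    return sorted(
        name for name, cpu in cpu_by_instance.items() if cpu >= ratio * average
    )


# Hypothetical per-instance readings resembling the question (max 100, min 2.6, avg 40).
readings = {"vm_0": 100.0, "vm_1": 2.6, "vm_2": 35.0, "vm_3": 22.4}

print(find_hot_instances(readings))  # ['vm_0']
```

If the same instances keep showing up as hot, compare what they receive (clients, connections, partitions) with what the idle instances receive.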

    Please refer to the documentation below:
    High Availability Ports
    Overview of autoscale with Azure Virtual Machine Scale Sets
    What is Azure Front Door

    If an answer has been helpful, please consider clicking "Accept Answer" and "Upvote" to help increase the visibility of this question for other members of the Microsoft Q&A community.


