Tutorial: Automatically scale a Virtual Machine Scale Set with the Azure CLI
When you create a scale set, you define the number of VM instances that you wish to run. As your application demand changes, you can automatically increase or decrease the number of VM instances. The ability to autoscale lets you keep up with customer demand or respond to application performance changes throughout the lifecycle of your app. In this tutorial you learn how to:
- Use autoscale with a scale set
- Create and use autoscale rules
- Simulate CPU load to trigger autoscale rules
- Monitor autoscale actions as demand changes
If you don't have an Azure subscription, create an Azure free account before you begin.
Prerequisites
Use the Bash environment in Azure Cloud Shell. For more information, see Quickstart for Bash in Azure Cloud Shell.
If you prefer to run CLI reference commands locally, install the Azure CLI. If you're running on Windows or macOS, consider running Azure CLI in a Docker container. For more information, see How to run the Azure CLI in a Docker container.
If you're using a local installation, sign in to the Azure CLI by using the az login command. To finish the authentication process, follow the steps displayed in your terminal. For other sign-in options, see Sign in with the Azure CLI.
When you're prompted, install the Azure CLI extension on first use. For more information about extensions, see Use extensions with the Azure CLI.
Run az version to find the version and dependent libraries that are installed. To upgrade to the latest version, run az upgrade.
- This tutorial requires version 2.0.32 or later of the Azure CLI. If using Azure Cloud Shell, the latest version is already installed.
Create a scale set
Create a resource group with az group create.
export RANDOM_SUFFIX=$(openssl rand -hex 3)
export REGION="WestUS2"
export MY_RESOURCE_GROUP_NAME="myResourceGroup$RANDOM_SUFFIX"
az group create --name $MY_RESOURCE_GROUP_NAME --location $REGION
Now create a Virtual Machine Scale Set with az vmss create. The following example creates a scale set with an instance count of 2, generates SSH keys if they don't exist, and uses a valid image Ubuntu2204.
export MY_SCALE_SET_NAME="myScaleSet$RANDOM_SUFFIX"
az vmss create \
--resource-group $MY_RESOURCE_GROUP_NAME \
--name $MY_SCALE_SET_NAME \
--image Ubuntu2204 \
--orchestration-mode Flexible \
--instance-count 2 \
--admin-username azureuser \
--generate-ssh-keys
Define an autoscale profile
To enable autoscale on a scale set, you first define an autoscale profile. This profile defines the default, minimum, and maximum scale set capacity. These limits let you control cost by not continually creating VM instances, and balance acceptable performance with a minimum number of instances that remain in a scale-in event. Create an autoscale profile with az monitor autoscale create. The following example sets the default and minimum capacity of 2 VM instances, and a maximum of 10:
az monitor autoscale create \
--resource-group $MY_RESOURCE_GROUP_NAME \
--resource $MY_SCALE_SET_NAME \
--resource-type Microsoft.Compute/virtualMachineScaleSets \
--name autoscale \
--min-count 2 \
--max-count 10 \
--count 2
Create a rule to autoscale out
If your application demand increases, the load on the VM instances in your scale set increases. If this increased load is consistent, rather than just a brief demand, you can configure autoscale rules to increase the number of VM instances. When these instances are created and your application is deployed, the scale set starts to distribute traffic to them through the load balancer. You control which metrics to monitor, how long the load must meet a given threshold, and how many VM instances to add.
Create a rule with az monitor autoscale rule create that increases the number of VM instances when the average CPU load is greater than 70% over a 5-minute period. When the rule triggers, the number of VM instances is increased by three.
az monitor autoscale rule create \
--resource-group $MY_RESOURCE_GROUP_NAME \
--autoscale-name autoscale \
--condition "Percentage CPU > 70 avg 5m" \
--scale out 3
Create a rule to autoscale in
When application demand decreases, the load on the VM instances drops. If this decreased load persists over a period of time, you can configure autoscale rules to decrease the number of VM instances in the scale set. This scale-in action helps reduce costs by running only the necessary number of instances required to meet current demand.
Create another rule with az monitor autoscale rule create that decreases the number of VM instances when the average CPU load drops below 30% over a 5-minute period. The following example scales in the number of VM instances by one.
az monitor autoscale rule create \
--resource-group $MY_RESOURCE_GROUP_NAME \
--autoscale-name autoscale \
--condition "Percentage CPU < 30 avg 5m" \
--scale in 1
Simulate CPU load on scale set
To test the autoscale rules, you need to simulate sustained CPU load on the VM instances in the scale set. In this minimalist approach, we avoid installing additional packages by using the built-in yes
command to generate CPU load. The following command starts 3 background processes that continuously output data to /dev/null
for 60 seconds and then terminates them.
for i in {1..3}; do
yes > /dev/null &
done
sleep 60
pkill yes
This command simulates CPU load without introducing package installation errors.
Monitor the active autoscale rules
To monitor the number of VM instances in your scale set, use the watch
command. It may take up to 5 minutes for the autoscale rules to begin the scale-out process in response to the CPU load. However, once it happens, you can exit watch with CTRL + C keys.
By then, the scale set will automatically increase the number of VM instances to meet the demand. The following command shows the list of VM instances in the scale set:
az vmss list-instances \
--resource-group $MY_RESOURCE_GROUP_NAME \
--name $MY_SCALE_SET_NAME \
--output table
Once the CPU threshold has been met, the autoscale rules increase the number of VM instances in the scale set. The output will show the list of VM instances as new ones are created.
InstanceId LatestModelApplied Location Name ProvisioningState ResourceGroup VmId
------------ -------------------- ---------- --------------- ------------------- -------------------- ------------------------------------
1 True WestUS2 myScaleSet_1 Succeeded myResourceGroupxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2 True WestUS2 myScaleSet_2 Succeeded myResourceGroupxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
4 True WestUS2 myScaleSet_4 Creating myResourceGroupxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
5 True WestUS2 myScaleSet_5 Creating myResourceGroupxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
6 True WestUS2 myScaleSet_6 Creating myResourceGroupxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Once the CPU load subsides, the average CPU load returns to normal. After another 5 minutes, the autoscale rules then scale in the number of VM instances. Scale-in actions remove VM instances with the highest IDs first. When a scale set uses Availability Sets or Availability Zones, scale-in actions are evenly distributed across the VM instances. The following sample output shows one VM instance being deleted as the scale set autoscales in:
6 True WestUS2 myScaleSet_6 Deleting myResourceGroupxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Clean up resources
To remove your scale set and associated resources, please manually delete the resource group using your preferred method.
Next steps
In this tutorial, you learned how to automatically scale in or out a scale set with the Azure CLI:
- Use autoscale with a scale set
- Create and use autoscale rules
- Simulate CPU load to trigger autoscale rules
- Monitor autoscale actions as demand changes