Configure custom metrics for rolling upgrades on Virtual Machine Scale Sets (Preview)

Note

Custom metrics for rolling upgrades on Virtual Machine Scale Sets is currently in preview. Previews are made available to you on the condition that you agree to the supplemental terms of use. Some aspects of these features may change prior to general availability (GA).

Custom metrics for rolling upgrades enables you to utilize the application health extension to emit custom metrics to your Virtual Machine Scale Set. These custom metrics can be used to tell the scale set the order in which virtual machines should be updated when a rolling upgrade is triggered. The custom metrics can also inform your scale set when an upgrade should be skipped on a specific instance. This allows you to have more control over the ordering and the update process itself.

Custom metrics can be used in combination with other rolling upgrade functionality such as automatic OS upgrades, automatic extension upgrades and MaxSurge rolling upgrades.

Requirements

  • When using custom metrics for rolling upgrades on Virtual Machine Scale Sets, the scale set must also use the application health extension with rich health states to report phase ordering or skip upgrade information. Custom metrics upgrades aren't supported when using the application health extension with binary states.
  • The application health extension must be set up to use HTTP or HTTPS in order to receive the custom metrics information. TCP isn't supported for integration with custom metrics for rolling upgrades.

Concepts

Phase ordering

A phase is a grouping construct for virtual machines. Each virtual machine's phase is set through metadata that the application health extension emits in the customMetrics property. The Virtual Machine Scale Set uses the information retrieved from the custom metrics to place virtual machines into their assigned phases. Within each phase, the scale set also assigns upgrade batches. Each batch is configured using the rolling upgrade policy, which takes into consideration the update domains (UD), fault domains (FD), and zone information of each virtual machine.

When a rolling upgrade is initiated, the virtual machines are placed into their designated phases. The phases are upgraded in numerical order. All batches within a phase are completed before the upgrade moves on to the next phase. If no phase ordering is received for a virtual machine, the scale set places it into the last phase.

Regional scale set: [Diagram of what happens when using n-phase upgrades on a regional scale set.]

Zonal scale set: [Diagram of what happens when using n-phase upgrades on a zonal scale set.]
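
The ordering rules described above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the scale set's implementation. The function name plan_phases and the instance names are hypothetical, batching within a phase (UD, FD, zone) is not modeled, and "last phase" is assumed to mean the highest phase number that any instance reported.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def plan_phases(phase_by_vm: Dict[str, Optional[int]]) -> List[Tuple[int, List[str]]]:
    """Group instances by reported phase number, lowest phase first.

    Instances that reported no phase (None) join the last phase.
    """
    reported = [p for p in phase_by_vm.values() if p is not None]
    # Assumption: if no instance reported a phase, everything lands in phase 0.
    last_phase = max(reported) if reported else 0

    phases: Dict[int, List[str]] = defaultdict(list)
    for vm, phase in phase_by_vm.items():
        phases[last_phase if phase is None else phase].append(vm)

    # Phases are processed in numerical order.
    return [(phase, sorted(phases[phase])) for phase in sorted(phases)]


if __name__ == "__main__":
    reported = {"vm0": 1, "vm1": 0, "vm2": None, "vm3": 0, "vm4": 2}
    for phase, vms in plan_phases(reported):
        print(f"phase {phase}: {vms}")
```

In this sketch, vm2 reported no phase and therefore ends up alongside vm4 in phase 2.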

To specify the phase number a virtual machine should be associated with, use the PhaseOrderingNumber parameter.

{
  "applicationHealthState": "Healthy",
  "customMetrics": "{ \"rollingUpgrade\": { \"PhaseOrderingNumber\": 0 } }"
}

Skip upgrade

Skip upgrade functionality enables an individual instance to be omitted from a rolling upgrade. This is similar to using instance protection, but it integrates more seamlessly into the rolling upgrade workflow and into instance-level applications. As with phase ordering, the skip upgrade information is passed to the Virtual Machine Scale Set through the application health extension and its custom metrics settings. When the rolling upgrade is triggered, the Virtual Machine Scale Set checks the application health extension's custom metrics response. If skip upgrade is set to true, the instance isn't included in the rolling upgrade.

[Diagram of what happens when using skip upgrade on a zonal scale set.]

To skip an upgrade on a virtual machine, use the SkipUpgrade parameter. This tells the rolling upgrade to skip over this virtual machine when performing the upgrades.

{
  "applicationHealthState": "Healthy",
  "customMetrics": "{ \"rollingUpgrade\": { \"SkipUpgrade\": true } }"
}
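
To make the skip check concrete, the following Python sketch decodes a probe response and filters out instances that asked to be skipped. It is a reading aid only, not how the scale set is implemented. The function names and instance names are hypothetical, and the key casing is assumed to match the JSON example above. Note that customMetrics is a JSON-encoded string inside the JSON body, so it has to be decoded twice.

```python
import json
from typing import Dict, List


def wants_skip(response_body: str) -> bool:
    """Return True when the probe response asks to skip the upgrade."""
    body = json.loads(response_body)
    # customMetrics is itself a JSON string, so decode it a second time.
    metrics = json.loads(body.get("customMetrics", "{}"))
    return metrics.get("rollingUpgrade", {}).get("SkipUpgrade", False) is True


def upgrade_candidates(responses: Dict[str, str]) -> List[str]:
    """Instances that stay in the rolling upgrade after skip filtering."""
    return sorted(vm for vm, body in responses.items() if not wants_skip(body))


if __name__ == "__main__":
    skip_me = json.dumps({
        "applicationHealthState": "Healthy",
        "customMetrics": json.dumps({"rollingUpgrade": {"SkipUpgrade": True}}),
    })
    upgrade_me = json.dumps({"applicationHealthState": "Healthy"})
    print(upgrade_candidates({"vm0": skip_me, "vm1": upgrade_me}))
```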

Skip upgrade and phase order can also be used together:

{
  "applicationHealthState": "Healthy",
  "customMetrics": "{ \"rollingUpgrade\": { \"SkipUpgrade\": false, \"PhaseOrderingNumber\": 0 } }"
}
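
Because the customMetrics value is a string that contains escaped JSON, writing it by hand is error-prone. The following Python sketch shows one way an application could build the response body shown above. The helper name build_health_response is hypothetical, and the key casing simply follows the examples in this section.

```python
import json
from typing import Optional


def build_health_response(
    phase: Optional[int] = None,
    skip: Optional[bool] = None,
    state: str = "Healthy",
) -> str:
    """Build a probe response body with nested rolling upgrade metrics."""
    rolling_upgrade = {}
    if skip is not None:
        rolling_upgrade["SkipUpgrade"] = bool(skip)  # JSON boolean, not a string
    if phase is not None:
        rolling_upgrade["PhaseOrderingNumber"] = int(phase)

    return json.dumps({
        "applicationHealthState": state,
        # The inner document is serialized first so it is embedded as a string.
        "customMetrics": json.dumps({"rollingUpgrade": rolling_upgrade}),
    })


if __name__ == "__main__":
    print(build_health_response(phase=0, skip=False))
```

Serializing the inner object with json.dumps before embedding it lets the library handle the quote escaping.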

Configure the application health extension

The application health extension requires an HTTP or HTTPS request with an associated port or request path. TCP probes are supported when using the application health extension, but can't set the ApplicationHealthState through the probe response body and can't be used with rolling upgrades with custom metrics.

{
  "extensionProfile" : {
     "extensions" : [
      {
        "name": "HealthExtension",
        "properties": {
          "publisher": "Microsoft.ManagedServices",
          "type": "<ApplicationHealthLinux or ApplicationHealthWindows>",
          "autoUpgradeMinorVersion": true,
          "typeHandlerVersion": "2.0",
          "settings": {
            "protocol": "<protocol>",
            "port": <port>,
            "requestPath": "</requestPath>",
            "intervalInSeconds": 5,
            "numberOfProbes": 1
          }
        }
      }
    ]
  }
}

Name | Value / Example | Data Type
protocol | http or https | string
port | Optional when protocol is http or https | int
requestPath | Mandatory when using http or https | string
intervalInSeconds | Optional, default is 5 seconds. This setting is the interval between each health probe. For example, if intervalInSeconds == 5, a probe is sent to the local application endpoint once every 5 seconds. | int
numberOfProbes | Optional, default is 1. This setting is the number of consecutive probes required for the health status to change. For example, if numberOfProbes == 3, you need 3 consecutive "Healthy" signals to change the health status from "Unhealthy"/"Unknown" to "Healthy". The same requirement applies to changing the health status to "Unhealthy" or "Unknown". | int
gracePeriod | Optional, default = intervalInSeconds * numberOfProbes; maximum grace period is 7200 seconds | int
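
The numberOfProbes rule can be read as a small state machine: the reported status changes only after the same new signal has been seen that many times in a row. The Python sketch below models that reading for illustration. It is an assumption about the behavior described in the table, not the extension's actual code, and the HealthTracker class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Track health status, changing it only after N consecutive signals."""

    number_of_probes: int = 1
    status: str = "Unknown"
    _candidate: str = "Unknown"
    _streak: int = 0

    def observe(self, signal: str) -> str:
        """Record one probe result and return the resulting status."""
        if signal == self.status:
            # The signal agrees with the current status: drop any pending change.
            self._candidate, self._streak = signal, 0
            return self.status

        if signal == self._candidate:
            self._streak += 1
        else:
            # A different signal starts a new streak.
            self._candidate, self._streak = signal, 1

        if self._streak >= self.number_of_probes:
            self.status = signal
            self._streak = 0
        return self.status


if __name__ == "__main__":
    tracker = HealthTracker(number_of_probes=3)
    for signal in ["Healthy", "Healthy", "Healthy", "Unhealthy", "Healthy"]:
        print(signal, "->", tracker.observe(signal))
```

With numberOfProbes set to 3, a single "Unhealthy" probe between "Healthy" probes does not change the status in this model.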

Install the application health extension

Use az vmss extension set to add the application health extension to the scale set model definition.

Create a JSON file called extensions.json with the desired settings.

{
  "protocol": "<protocol>",
  "port": <port>,
  "requestPath": "</requestPath>",
  "gracePeriod": <healthExtensionGracePeriod>
}
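
If you prefer to generate the settings file rather than write it by hand, the following Python sketch builds the settings and applies the constraints from the settings table (HTTP or HTTPS only, mandatory request path, default and maximum grace period). The function name build_settings, port 8000, and the /health path are illustrative assumptions.

```python
import json
from typing import Any, Dict, Optional

MAX_GRACE_PERIOD_SECONDS = 7200


def build_settings(
    protocol: str,
    request_path: str,
    port: Optional[int] = None,
    interval_in_seconds: int = 5,
    number_of_probes: int = 1,
    grace_period: Optional[int] = None,
) -> Dict[str, Any]:
    """Return extension settings that satisfy the documented constraints."""
    if protocol not in ("http", "https"):
        raise ValueError("custom metrics require protocol 'http' or 'https'")
    if not request_path:
        raise ValueError("requestPath is mandatory for http and https")
    if grace_period is None:
        # Documented default: intervalInSeconds * numberOfProbes.
        grace_period = interval_in_seconds * number_of_probes
    if grace_period > MAX_GRACE_PERIOD_SECONDS:
        raise ValueError("gracePeriod cannot exceed 7200 seconds")

    settings: Dict[str, Any] = {
        "protocol": protocol,
        "requestPath": request_path,
        "intervalInSeconds": interval_in_seconds,
        "numberOfProbes": number_of_probes,
        "gracePeriod": grace_period,
    }
    if port is not None:
        settings["port"] = port
    return settings


if __name__ == "__main__":
    # Redirect this output into the settings file, for example extensions.json.
    print(json.dumps(build_settings("http", "/health", port=8000), indent=2))
```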

Apply the application health extension.

az vmss extension set \
  --name ApplicationHealthLinux \
  --publisher Microsoft.ManagedServices \
  --version 2.0 \
  --resource-group myResourceGroup \
  --vmss-name myScaleSet \
  --settings ./extensions.json

Upgrade the virtual machines in the scale set. This step is only required if your scale set uses a manual upgrade policy. For more information on upgrade policies, see upgrade policies for Virtual Machine Scale Sets.

az vmss update-instances \
  --resource-group myResourceGroup \
  --name myScaleSet \
  --instance-ids "*"

Configure the application health extension response

The custom metrics response can be configured in many different ways. It can be integrated into existing applications, updated dynamically, and used alongside other functions to produce output based on a specific situation.

This sample application includes the phase order and skip upgrade parameters in the custom metrics response.

#!/bin/bash

# Open firewall port (replace with your firewall rules as needed)
sudo iptables -A INPUT -p tcp --dport 8000 -j ACCEPT

# Create Python HTTP server for responding with JSON
# The quoted 'EOF' delimiter stops the shell from expanding anything inside the Python source
cat <<'EOF' > server.py
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Function to generate the JSON response
def generate_response_json():
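    return json.dumps({
        "ApplicationHealthState": "Healthy",
        # CustomMetrics is sent as a JSON-encoded string nested in the response
        "CustomMetrics": json.dumps({
            "RollingUpgrade": {
                "PhaseOrderingNumber": 1,
                # Use a Python boolean so it serializes as JSON false, not the string "false"
                "SkipUpgrade": False
            }
        })
    })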

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Respond with HTTP 200 and JSON content
        self.send_response(200)
        self.send_header('Content-type', 'application/json')
        self.end_headers()
        response = generate_response_json()
        self.wfile.write(response.encode('utf-8'))

# Set up the HTTP server
def run(server_class=HTTPServer, handler_class=RequestHandler):
    server_address = ('localhost', 8000)
    httpd = server_class(server_address, handler_class)
    print('Starting server on port 8000...')
    httpd.serve_forever()

if __name__ == "__main__":
    run()
EOF

# Run the server in the background
python3 server.py &

# Store the process ID of the server
SERVER_PID=$!

# Wait a few seconds to ensure the server starts
sleep 2

# Confirm execution
echo "Server has been started on port 8000 with PID $SERVER_PID"
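
The sample above always returns the same values. As one possible way to update the response dynamically, the following Python sketch reads the phase and skip values from a small JSON file each time a response is generated, so an operator or another process can change them without restarting the server. The file name upgrade_state.json, the UPGRADE_STATE_FILE environment variable, and the function load_rolling_upgrade_metrics are hypothetical. The key casing follows the sample server above.

```python
import json
import os
from typing import Any, Dict

# Hypothetical location of a file that holds the current upgrade preferences.
STATE_FILE = os.environ.get("UPGRADE_STATE_FILE", "upgrade_state.json")

DEFAULTS: Dict[str, Any] = {"PhaseOrderingNumber": 1, "SkipUpgrade": False}


def load_rolling_upgrade_metrics(path: str = STATE_FILE) -> Dict[str, Any]:
    """Read phase and skip values from disk, falling back to defaults."""
    try:
        with open(path, encoding="utf-8") as handle:
            overrides = json.load(handle)
    except (OSError, ValueError):
        # Missing or malformed file: report the defaults.
        return dict(DEFAULTS)
    if not isinstance(overrides, dict):
        return dict(DEFAULTS)
    return {
        "PhaseOrderingNumber": int(
            overrides.get("PhaseOrderingNumber", DEFAULTS["PhaseOrderingNumber"])
        ),
        "SkipUpgrade": bool(overrides.get("SkipUpgrade", DEFAULTS["SkipUpgrade"])),
    }


def generate_response_json(path: str = STATE_FILE) -> str:
    """Build the probe response using the values currently on disk."""
    return json.dumps({
        "ApplicationHealthState": "Healthy",
        "CustomMetrics": json.dumps(
            {"RollingUpgrade": load_rolling_upgrade_metrics(path)}
        ),
    })


if __name__ == "__main__":
    print(generate_response_json())
```

A request handler like the one in the sample server could call generate_response_json() on every GET request to pick up the latest values.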

For more response configuration examples, see application health samples.

Next steps

Learn how to perform manual upgrades on Virtual Machine Scale Sets.