Update Kubernetes and node images across multiple clusters using Azure Kubernetes Fleet Manager

Platform admins managing a large number of clusters often struggle to stage updates across them (for example, upgrading the node OS image or Kubernetes version) in a safe and predictable way. To address this challenge, Azure Kubernetes Fleet Manager (Fleet) lets you orchestrate updates across multiple clusters using update runs.

Update runs consist of stages, groups, and strategies and can be applied manually for one-time updates, or automatically, for ongoing regular updates using auto-upgrade profiles. All update runs (manual or automated) honor member cluster maintenance windows.

This guide covers how to configure and manually execute update runs.

Prerequisites

  • Read the conceptual overview of this feature, which provides an explanation of update strategies, runs, stages, and groups referenced in this guide.

  • You must have a Fleet resource with one or more member clusters. If you don't have one, follow the quickstart to create a Fleet resource and join Azure Kubernetes Service (AKS) clusters as members.

  • Set the following environment variables:

    export GROUP=<resource-group>
    export FLEET=<fleet-name>
    export AKS_CLUSTER_ID=<aks-cluster-resource-id>
    
  • If you're following the Azure CLI instructions in this article, you need Azure CLI version 2.58.0 or later installed. To install or upgrade, see Install the Azure CLI.

  • You also need the fleet Azure CLI extension, which you can install by running the following command:

    az extension add --name fleet
    

    Run the az extension update command to update to the latest released version of the extension:

    az extension update --name fleet
    

Creating update runs

Update runs support two options for the cluster upgrade sequence:

  • One by one: If you don't need to control the cluster upgrade sequence, one by one provides a simple way to upgrade all member clusters of the fleet in sequence, one at a time.
  • Control sequence of clusters using update groups and stages: If you want to control the cluster upgrade sequence, you can structure member clusters in update groups and update stages. You can store this sequence as a template in the form of an update strategy. You can create update runs later using the update strategies instead of defining the sequence every time you need to create an update run.

Note

Update runs honor the planned maintenance windows that you set at the AKS cluster level. For more information, see planned maintenance across multiple member clusters, which explains how update runs handle member clusters configured with planned maintenance windows.

Update all clusters one by one

  1. In the Azure portal, navigate to your Azure Kubernetes Fleet Manager resource.

  2. From the service menu, under Settings, select Multi-cluster update > Create a run.

  3. Enter a name for the update run, and then select One by one for the upgrade type.

    Screenshot of the Azure portal pane for creating update runs that update clusters one by one in Azure Kubernetes Fleet Manager.

  4. Select one of the following options for the Upgrade scope:

    • Kubernetes version for both control plane and node pools
    • Kubernetes version for only control plane of the cluster
    • Node image version only

  5. Select one of the following options for the Node image:

    • Latest image: Updates every AKS cluster in the update run to the latest image available for that cluster in its region.
    • Consistent image: An update run can include AKS clusters across multiple regions, where the latest available node images can differ (check the release tracker for more information). With this option, the update run picks the latest common image across all these regions to achieve consistency.

    Screenshot of the Azure portal pane for creating update runs. The upgrade scope section is shown.

  6. Select Create to create the update run.
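
The portal steps above have a rough Azure CLI counterpart. The following is a minimal sketch, assuming the fleet extension from the prerequisites is installed and that the `--upgrade-type` values `Full`, `ControlPlaneOnly`, and `NodeImageOnly` correspond to the three upgrade scope options. The run name `run-1` and the fallback values for `GROUP` and `FLEET` are placeholders, and `run` is a hypothetical dry-run helper that only prints the command unless `DRY_RUN=0` is set.

```shell
# Placeholders: fall back to sample names if the prerequisite variables are unset.
GROUP="${GROUP:-my-resource-group}"
FLEET="${FLEET:-my-fleet}"

# Hypothetical dry-run helper: prints the command by default.
# Set DRY_RUN=0 to actually invoke the Azure CLI.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# No --stages argument is passed, which is assumed to yield a one-by-one run.
# NodeImageOnly updates only node images; Latest uses the newest image per region.
run az fleet updaterun create \
  --resource-group "$GROUP" \
  --fleet-name "$FLEET" \
  --name run-1 \
  --upgrade-type NodeImageOnly \
  --node-image-selection Latest
```

To upgrade Kubernetes itself, you would likely swap in `--upgrade-type Full` (or `ControlPlaneOnly`) together with `--kubernetes-version <version>`. Passing `--node-image-selection Consistent` should select the consistent image option instead of the latest image.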

Update clusters using groups and stages

You can define an update run using update stages to sequentially order the application of updates to different update groups. For example, a first update stage might update test environment member clusters, and a second update stage would then update production environment member clusters. You can also specify a wait time between the update stages. You can store this sequence as a template in the form of an update strategy.

  1. In the Azure portal, navigate to your Azure Kubernetes Fleet Manager resource.

  2. From the service menu, under Settings, select Multi-cluster update > Create a run.

  3. Enter a name for the update run, and then select Stages for the update sequence type.

    Screenshot of the Azure portal page for choosing stages mode within update run.

  4. Select Create stage, and then enter a name for the stage and the wait time between stages.

    Screenshot of the Azure portal page for creating a stage and defining wait time.

  5. Select the update groups that you want to include in this stage. You can also specify the order of the update groups if you want to update them in a specific sequence. When you're done, select Create.

    Screenshot of the Azure portal page for stage creation that shows the selection of upgrade groups.

  6. Select one of the following options for the Upgrade scope:

    • Kubernetes version for both control plane and node pools
    • Kubernetes version for only control plane of the cluster
    • Node image version only

  7. Select one of the following options for the Node image:

    • Latest image: Updates every AKS cluster in the update run to the latest image available for that cluster in its region.
    • Consistent image: An update run can include AKS clusters across multiple regions, where the latest available node images can differ (check the release tracker for more information). With this option, the update run picks the latest common image across all these regions to achieve consistency.

    Screenshot of the Azure portal pane for creating update runs. The upgrade scope section is shown.

  8. Select Create to create the update run.

    Specifying stages and their order every time you create an update run can get repetitive and cumbersome. Update strategies simplify this process by allowing you to store templates for update runs. For more information, see update strategy creation and usage.

  9. In the Multi-cluster update menu, select the update run, and then select Start.
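
With the Azure CLI, the same sequence can be described in a JSON file and passed to the update run. The following is a minimal sketch, assuming the field names `stages`, `groups`, `name`, and `afterStageWaitInSeconds` follow the Fleet update run schema. The file name `example-stages.json`, the stage and group names, the one-hour wait, the run name `run-2`, and the Kubernetes version `1.30.0` are all illustrative placeholders, and `run` is a hypothetical dry-run helper that only prints the command unless `DRY_RUN=0` is set.

```shell
# Placeholders: fall back to sample names if the prerequisite variables are unset.
GROUP="${GROUP:-my-resource-group}"
FLEET="${FLEET:-my-fleet}"

# Hypothetical dry-run helper: prints the command by default.
# Set DRY_RUN=0 to actually invoke the Azure CLI.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Two stages: test clusters first, then production after a one-hour wait.
# Group names are hypothetical and must match the update groups of your members.
cat > example-stages.json <<'EOF'
{
  "stages": [
    {
      "name": "test",
      "groups": [
        { "name": "test-group" }
      ],
      "afterStageWaitInSeconds": 3600
    },
    {
      "name": "production",
      "groups": [
        { "name": "prod-group-1" },
        { "name": "prod-group-2" }
      ]
    }
  ]
}
EOF

# Create the staged update run from the file.
run az fleet updaterun create \
  --resource-group "$GROUP" \
  --fleet-name "$FLEET" \
  --name run-2 \
  --upgrade-type Full \
  --kubernetes-version 1.30.0 \
  --stages example-stages.json

# Start the run, mirroring the final portal step.
run az fleet updaterun start \
  --resource-group "$GROUP" \
  --fleet-name "$FLEET" \
  --name run-2
```

Keeping the sequence in a file makes it easy to review and reuse, which is the same motivation behind update strategies.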

Create an update run using update strategies

Creating an update run requires you to specify the stages, groups, and order each time. Update strategies simplify this process by allowing you to store templates for update runs.

Note

It's possible to create multiple update runs with unique names from the same update strategy.

You can create an update strategy using one of the following methods:

  • Save an update strategy while creating an update run in the Azure portal:

    A screenshot of the Azure portal showing update run stages being saved as an update strategy.
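
A strategy can also be created and referenced from the Azure CLI. The following is a minimal sketch, assuming a stages file named `example-stages.json` already exists in the JSON format that `--stages` accepts. The names `strategy-1` and `run-3` and the Kubernetes version `1.30.0` are placeholders, and `run` is a hypothetical dry-run helper that only prints the command unless `DRY_RUN=0` is set.

```shell
# Placeholders: fall back to sample names if the prerequisite variables are unset.
GROUP="${GROUP:-my-resource-group}"
FLEET="${FLEET:-my-fleet}"

# Hypothetical dry-run helper: prints the command by default.
# Set DRY_RUN=0 to actually invoke the Azure CLI.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Store the stage sequence once as a reusable update strategy.
run az fleet updatestrategy create \
  --resource-group "$GROUP" \
  --fleet-name "$FLEET" \
  --name strategy-1 \
  --stages example-stages.json

# Create an update run that references the stored strategy by name
# instead of repeating the stage definitions.
run az fleet updaterun create \
  --resource-group "$GROUP" \
  --fleet-name "$FLEET" \
  --name run-3 \
  --upgrade-type Full \
  --kubernetes-version 1.30.0 \
  --update-strategy-name strategy-1
```

Because each run only references the strategy, several runs with different names can be created from the same `strategy-1`.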

Manage an update run

The following sections explain how to manage an update run using the Azure portal and Azure CLI.

  • On the Multi-cluster update page of the fleet resource, you can Start an update run that's in either the Not started or Failed state:

    A screenshot of the Azure portal showing how to start an update run in the 'Not started' state.

  • On the Multi-cluster update page of the fleet resource, you can Stop a currently Running update run:

    A screenshot of the Azure portal showing how to stop an update run in the 'Running' state.

  • Within any update run in the Not started, Failed, or Running state, you can select any Stage and Skip the upgrade:

    A screenshot of the Azure portal showing how to skip upgrade for a specific stage in an update run.

    You can similarly skip the upgrade at the update group or member cluster level.
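
The same management actions are available from the Azure CLI. The following is a minimal sketch, assuming the `az fleet updaterun` subcommands `start`, `stop`, and `skip` and the `Stage:<name>` and `Group:<name>` target format for `--targets`. The run name `run-1`, the stage name `production`, and the group name `prod-group-2` are hypothetical, and `run` is a dry-run helper that only prints the command unless `DRY_RUN=0` is set.

```shell
# Placeholders: fall back to sample names if the variables are unset.
GROUP="${GROUP:-my-resource-group}"
FLEET="${FLEET:-my-fleet}"
RUN_NAME="${RUN_NAME:-run-1}"

# Hypothetical dry-run helper: prints the command by default.
# Set DRY_RUN=0 to actually invoke the Azure CLI.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start a run that is in the Not started or Failed state.
run az fleet updaterun start \
  --resource-group "$GROUP" --fleet-name "$FLEET" --name "$RUN_NAME"

# Stop a run that is currently Running.
run az fleet updaterun stop \
  --resource-group "$GROUP" --fleet-name "$FLEET" --name "$RUN_NAME"

# Skip the upgrade for one stage and one group (names are hypothetical).
run az fleet updaterun skip \
  --resource-group "$GROUP" --fleet-name "$FLEET" --name "$RUN_NAME" \
  --targets Stage:production Group:prod-group-2
```

Run each command on its own as needed; they are grouped here only to show the three actions side by side.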

For more information, see the conceptual overview of update run states and skip behavior for runs, stages, and groups.

Next steps