Securing AKS workloads: Validating container image signatures with Ratify and Azure Policy

Introduction

Container security is crucial in the cloud-native landscape to protect workloads. To address this, Microsoft introduced the Containers Secure Supply Chain (CSSC) framework, enhancing security throughout the lifecycle of container images. One of the stages defined in the CSSC framework is the Deploy stage, where container images are deployed to production environments, such as Azure Kubernetes Service (AKS) clusters. Ensuring a secure production environment involves maintaining the integrity and authenticity of container images. This is achieved by signing container images at the Build stage and then verifying them at the Deploy stage, ensuring that only trusted and unaltered images are deployed.

Ratify, a CNCF sandbox project supported by Microsoft, is a verification engine that checks container image security metadata, such as signatures, and allows deployment only of images that meet the policies you specify.

Scenario

An image producer builds container images and pushes them to Azure Container Registry (ACR) within CI/CD pipelines. Image consumers use these images to deploy and run cloud-native workloads on AKS clusters. Within the CI/CD pipelines, the image producer signs the container images in ACR using Notary Project tooling, specifically Notation. The keys and certificates for signing are securely stored in Azure Key Vault (AKV). Signing creates Notary Project signatures, which are stored in ACR and reference the corresponding images. An image consumer sets up Ratify and policies on the AKS cluster to validate the Notary Project signatures of images during deployment. Images that fail signature validation are denied deployment if the policy effect is set to Deny. This ensures that only trusted and unaltered images are deployed to the AKS cluster.

As the image producer, follow these documents to sign container images in ACR:

This document will guide you, as the image consumer, through the process of verifying container image signatures with Ratify and Azure policy on AKS clusters.

Important

If you prefer using a managed experience over using open-source Ratify directly, you can opt for the AKS image integrity policy (public preview) to ensure image integrity on your AKS clusters instead.

Signature validation overview

Here are the high-level steps for signature verification:

  1. Set up identity and access controls: Configure the identity used by Ratify to access ACR and AKV with the necessary roles.

  2. Set up Ratify on your AKS cluster: Set up Ratify using Helm chart installation as a standard Kubernetes service.

  3. Set up a custom Azure policy: Create and assign a custom Azure policy with the desired policy effect: Deny or Audit.

After following these steps, you can start deploying your workloads to observe the results. With the Deny effect policy, only images that have passed signature verification are allowed for deployment, while images that are unsigned or signed by untrusted identities are denied. With the Audit effect policy, images can be deployed, but your component will be marked as non-compliant for auditing purposes.

Prerequisites

  • Install and configure the latest Azure CLI, or run commands in the Azure Cloud Shell.
  • Install helm for Ratify installation and kubectl for troubleshooting and status checking.
  • Create or use an AKS cluster enabled with an OIDC Issuer by following the steps in Configure an AKS cluster with an OpenID Connect (OIDC) issuer. This AKS cluster is where your container images will be deployed, Ratify will be installed, and custom Azure policies will be applied.
  • Connect the ACR to the AKS cluster if not already connected by following the steps in Authenticate with ACR from AKS. The ACR is where your container images are stored for deployment to your AKS cluster.
  • Enable the Azure Policy add-on. To verify that the add-on is installed, or to install it if it is not already, follow the steps in Azure Policy add-on for AKS.
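
Before starting, it can help to confirm that the command-line tools are on your PATH. The following is a minimal sketch, not part of the official guidance: the helper name missing_tools is hypothetical, and jq and curl are included because the custom policy commands later in this document use them.

```shell
# Print the name of each listed command that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# az, helm, and kubectl come from the prerequisites above; jq and curl are
# used later when creating the custom policy definition.
missing=$(missing_tools az helm kubectl jq curl)
if [ -n "$missing" ]; then
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```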

Set up identity and access controls

Create or use a user-assigned managed identity

If you don't already have a user-assigned managed identity, follow this document to create one. This identity will be used by Ratify to access Azure resources, such as ACR and AKV.

Create a federated identity credential for your identity

Set up environment variables:

export AKS_RG=<aks-resource-group-name>
export AKS_NAME=<aks-name>
export AKS_OIDC_ISSUER=$(az aks show -n "$AKS_NAME" -g "$AKS_RG" --query "oidcIssuerProfile.issuerUrl" -o tsv)

export IDENTITY_RG=<identity-resource-group-name>
export IDENTITY_NAME=<identity-name>
export IDENTITY_CLIENT_ID=$(az identity show --name "$IDENTITY_NAME" --resource-group "$IDENTITY_RG" --query 'clientId' -o tsv)
export IDENTITY_OBJECT_ID=$(az identity show --name "$IDENTITY_NAME" --resource-group "$IDENTITY_RG" --query 'principalId' -o tsv)

export RATIFY_NAMESPACE="gatekeeper-system"
export RATIFY_SA_NAME="ratify-admin"

Note

Update the values of the variables RATIFY_NAMESPACE and RATIFY_SA_NAME if you are not using the default values. Make sure you use the same values during Ratify helm chart installation.

The following command creates a federated credential for your managed identity, allowing it to authenticate using tokens issued by an OIDC issuer, specifically for a Kubernetes service account RATIFY_SA_NAME in the namespace RATIFY_NAMESPACE.

az identity federated-credential create \
--name ratify-federated-credential \
--identity-name "$IDENTITY_NAME" \
--resource-group "$IDENTITY_RG" \
--issuer "$AKS_OIDC_ISSUER" \
--subject system:serviceaccount:"$RATIFY_NAMESPACE":"$RATIFY_SA_NAME"
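
The subject of the federated credential must match the service account that Ratify runs as. The sketch below shows how that subject string is composed and, if the Azure CLI and the identity variables are available, lists the credentials on the identity so you can compare them. The helper name federated_subject is hypothetical; the fallback values are the defaults stated above.

```shell
# Build the subject string that the federated credential must match.
federated_subject() {
  # $1 = namespace, $2 = service account name
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

expected=$(federated_subject "${RATIFY_NAMESPACE:-gatekeeper-system}" "${RATIFY_SA_NAME:-ratify-admin}")
echo "Expected subject: $expected"

# Only query Azure when the CLI and the identity variables are available.
if command -v az >/dev/null 2>&1 && [ -n "${IDENTITY_NAME:-}" ] && [ -n "${IDENTITY_RG:-}" ]; then
  az identity federated-credential list \
    --identity-name "$IDENTITY_NAME" \
    --resource-group "$IDENTITY_RG" \
    -o json
fi
```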

Configure access for your identity

Configure access to ACR

The AcrPull role is required for your identity to pull signatures and other container image metadata. Use the following instructions to assign the role:

export ACR_SUB=<acr-subscription-id>
export ACR_RG=<acr-resource-group>
export ACR_NAME=<acr-name>

az role assignment create \
--role acrpull \
--assignee-object-id "${IDENTITY_OBJECT_ID}" \
--scope "/subscriptions/${ACR_SUB}/resourceGroups/${ACR_RG}/providers/Microsoft.ContainerRegistry/registries/${ACR_NAME}"

Configure access to AKV

The Key Vault Secrets User role is required for your identity to fetch the entire certificate chain from your AKV. Use the following instructions to assign the role:

Set up additional environment variables for the AKV resource:

export AKV_SUB=<akv-subscription-id>
export AKV_RG=<akv-resource-group>
export AKV_NAME=<akv-name>

az role assignment create \
--role "Key Vault Secrets User" \
--assignee ${IDENTITY_OBJECT_ID} \
--scope "/subscriptions/${AKV_SUB}/resourceGroups/${AKV_RG}/providers/Microsoft.KeyVault/vaults/${AKV_NAME}"
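
Role assignment scopes are full Azure resource IDs that begin with /subscriptions/. As a rough sketch, the helper below (resource_scope, a hypothetical name) composes such an ID, and the guarded command lists the roles currently assigned to the identity so you can check that both AcrPull and Key Vault Secrets User appear.

```shell
# Compose the full resource ID used as a role assignment scope.
resource_scope() {
  # $1 = subscription ID, $2 = resource group,
  # $3 = provider and type (for example Microsoft.KeyVault/vaults), $4 = resource name
  printf '/subscriptions/%s/resourceGroups/%s/providers/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Only query Azure when the CLI and the identity object ID are available.
if command -v az >/dev/null 2>&1 && [ -n "${IDENTITY_OBJECT_ID:-}" ]; then
  az role assignment list \
    --assignee "$IDENTITY_OBJECT_ID" \
    --all \
    --query "[].{role:roleDefinitionName, scope:scope}" \
    -o table
fi
```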

Set up Ratify on your AKS cluster with Azure Policy enabled

Know your helm chart parameters

When installing the Helm chart for Ratify, you need to pass values to parameters using the --set flag or by providing a custom values file. Those values will be used to configure Ratify for signature verification. For a comprehensive list of parameters, refer to the Ratify Helm chart documentation.

For this scenario, you will need to configure:

  • The identity that we set up previously for accessing ACR and AKV
  • The certificate stored in AKV for signature verification
  • One Notary Project trust policy for signature verification including registryScopes, trustStores and trustedIdentities

See the parameter table below for details:

Parameter | Description | Value
azureWorkloadIdentity.clientId | Specifies the client ID of the Azure Workload Identity | "$IDENTITY_CLIENT_ID"
oras.authProviders.azureWorkloadIdentityEnabled | Enable/disable Azure Workload Identity for ACR authentication | true
azurekeyvault.enabled | Enable/disable fetching certificates from AKV | true
azurekeyvault.vaultURI | The URI of the AKV resource | "https://$AKV_NAME.vault.azure.net"
azurekeyvault.tenantId | The tenant ID of the AKV resource | "$AKV_TENANT_ID"
azurekeyvault.certificates[0].name | Name of the certificate | "$CERT_NAME"
notation.trustPolicies[0].registryScopes[0] | A repository URI that the policy applies to | "$REPO_URI"
notation.trustPolicies[0].trustStores[0] | Trust stores where certificates of certain type ca or tsa are stored | ca:azurekeyvault
notation.trustPolicies[0].trustedIdentities[0] | The subject field of the signing certificate with prefix x509.subject: indicating who you trust | "x509.subject: $SUBJECT"

By using timestamping for your images, you can ensure that images signed before the certificate expires can still be verified successfully, eliminating the need to re-sign existing images. To enable timestamping, specify the following additional parameters:

Parameter | Description | Value
notationCerts[0] | The filepath to the PEM formatted TSA root certificate file | "$TSA_ROOT_CERT_FILEPATH"
notation.trustPolicies[0].trustStores[1] | Another trust store where the TSA root certificate is stored | tsa:notationCerts[0]

If you have multiple certificates for signature verification, you can specify additional parameters and values, for example,

Parameter | Description | Value
azurekeyvault.certificates[1].name | Name of the certificate | "$CERT_NAME_2"
notation.trustPolicies[0].trustedIdentities[1] | Another subject field of the signing certificate indicating who you trust | "x509.subject: $SUBJECT_2"

Install Ratify helm chart with desired parameters and values

Ensure that the Ratify Helm chart version is at least 1.15.0, which will install Ratify version 1.4.0 or higher. In this example, helm chart version 1.15.0 is used.

Set up additional environment variables for installation:

export CHART_VER="1.15.0"
export REPO_URI="$ACR_NAME.azurecr.io/<namespace>/<repo>"
export SUBJECT="<Subject-of-signing-certificate>"
export CERT_NAME="<certificate-name>"
export AKV_TENANT_ID="$(az account show --query tenantId --output tsv)"
helm repo add ratify https://ratify-project.github.io/ratify
helm repo update

helm install ratify ratify/ratify --atomic \
--namespace "$RATIFY_NAMESPACE" --create-namespace \
--version "$CHART_VER" \
--set provider.enableMutation=false \
--set featureFlags.RATIFY_CERT_ROTATION=true \
--set azureWorkloadIdentity.clientId="$IDENTITY_CLIENT_ID" \
--set oras.authProviders.azureWorkloadIdentityEnabled=true \
--set azurekeyvault.enabled=true \
--set azurekeyvault.vaultURI="https://$AKV_NAME.vault.azure.net" \
--set azurekeyvault.certificates[0].name="$CERT_NAME" \
--set azurekeyvault.tenantId="$AKV_TENANT_ID" \
--set notation.trustPolicies[0].registryScopes[0]="$REPO_URI" \
--set notation.trustPolicies[0].trustStores[0]="ca:azurekeyvault" \
--set notation.trustPolicies[0].trustedIdentities[0]="x509.subject: $SUBJECT"
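
As noted earlier, the same parameters can be supplied through a custom values file instead of individual --set flags. The following is a sketch only: the function name write_ratify_values and the output file name are hypothetical, and the YAML keys simply mirror the --set paths used in the command above. It assumes the environment variables from the previous steps are set and that none of the values contain double quotes.

```shell
# Write a Helm values file equivalent to the --set flags used above.
write_ratify_values() {
  # $1 = output path
  cat > "$1" <<EOF
provider:
  enableMutation: false
featureFlags:
  RATIFY_CERT_ROTATION: true
azureWorkloadIdentity:
  clientId: "${IDENTITY_CLIENT_ID}"
oras:
  authProviders:
    azureWorkloadIdentityEnabled: true
azurekeyvault:
  enabled: true
  vaultURI: "https://${AKV_NAME}.vault.azure.net"
  tenantId: "${AKV_TENANT_ID}"
  certificates:
    - name: "${CERT_NAME}"
notation:
  trustPolicies:
    - registryScopes:
        - "${REPO_URI}"
      trustStores:
        - "ca:azurekeyvault"
      trustedIdentities:
        - "x509.subject: ${SUBJECT}"
EOF
}

# Example usage (not run automatically):
#   write_ratify_values ./ratify-values.yaml
#   helm install ratify ratify/ratify --atomic \
#     --namespace "$RATIFY_NAMESPACE" --create-namespace \
#     --version "$CHART_VER" -f ./ratify-values.yaml
```

A values file is easier to review and keep under version control than a long list of flags, at the cost of one more file to manage.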

Important

For images that are not linked to a trust policy, signature validation will fail. For instance, if the images are not within the repository $REPO_URI, the signature validation for those images will fail. You can add multiple repositories by specifying additional parameters. For example, to add another repository for the trust policy notation.trustPolicies[0], include the parameter --set notation.trustPolicies[0].registryScopes[1]="$REPO_URI_1".

Note

For timestamping support, you need to specify additional parameters: --set-file notationCerts[0]="$TSA_ROOT_CERT_FILEPATH" and --set notation.trustPolicies[0].trustStores[1]="tsa:notationCerts[0]".

Set up a custom Azure policy

Assign a new policy to your AKS cluster

Create a custom Azure policy for signature verification. By default, the policy effect is set to Deny, meaning images that fail signature validation will be denied deployment. Alternatively, you can configure the policy effect to Audit, allowing images that fail signature verification to be deployed while marking the AKS cluster and related workloads as non-compliant. The Audit effect is useful for verifying your signature verification configuration without risking outages due to incorrect settings for your production environment.

export CUSTOM_POLICY=$(curl -L https://raw.githubusercontent.com/ratify-project/ratify/refs/tags/v1.4.0/library/default/customazurepolicy.json)
export DEFINITION_NAME="ratify-default-custom-policy"
export DEFINITION_ID=$(az policy definition create --name "$DEFINITION_NAME" --rules "$(echo "$CUSTOM_POLICY" | jq .policyRule)" --params "$(echo "$CUSTOM_POLICY" | jq .parameters)" --mode "Microsoft.Kubernetes.Data" --query id -o tsv)

Assign the policy to your AKS cluster with the default effect Deny.

export POLICY_SCOPE=$(az aks show -g "$AKS_RG" -n "$AKS_NAME" --query id -o tsv)
az policy assignment create --policy "$DEFINITION_ID" --name "$DEFINITION_NAME" --scope "$POLICY_SCOPE"

To change the policy effect to Audit, pass an additional parameter to the az policy assignment create command. For example:

az policy assignment create --policy "$DEFINITION_ID" --name "$DEFINITION_NAME" --scope "$POLICY_SCOPE" -p "{\"effect\": {\"value\":\"Audit\"}}"
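
Escaping the JSON parameter by hand is easy to get wrong. As an illustrative sketch, a small helper can produce the parameter document and reject unexpected effect names. The function name effect_params is hypothetical, and it accepts only the two effects this document describes, Deny and Audit.

```shell
# Emit the JSON parameter document for the policy assignment.
effect_params() {
  # $1 = policy effect (Deny or Audit)
  case "$1" in
    Deny|Audit)
      printf '{"effect":{"value":"%s"}}\n' "$1"
      ;;
    *)
      echo "unsupported effect: $1 (expected Deny or Audit)" >&2
      return 1
      ;;
  esac
}

# Example usage (not run automatically):
#   az policy assignment create --policy "$DEFINITION_ID" \
#     --name "$DEFINITION_NAME" --scope "$POLICY_SCOPE" \
#     -p "$(effect_params Audit)"
```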

Note

It will take around 15 minutes to complete the assignment.

Use the following command to check the custom policy status.

kubectl get constraintTemplate ratifyverification

Below is an example of the output for a successful policy assignment:

NAME                 AGE
ratifyverification   11m
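
Because the assignment can take around 15 minutes, you may prefer to poll for the constraint template rather than rerun the command by hand. The following is a minimal sketch of a generic retry helper; the name wait_until and the attempt and delay values are assumptions chosen for illustration.

```shell
# Retry a command until it succeeds or the attempts run out.
wait_until() {
  # $1 = maximum attempts, $2 = seconds between attempts, rest = command to run
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only if another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example usage (not run automatically): up to 40 attempts, 30 seconds apart,
# which is roughly 20 minutes in total.
#   wait_until 40 30 kubectl get constraintTemplate ratifyverification \
#     && echo "Policy is in place" \
#     || echo "Timed out waiting for the constraint template"
```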

To make a change on an existing assignment, you need to delete the existing assignment first, make changes, and finally create a new assignment.

Deploy your images and check the policy effects

Use Deny policy effect

With the Deny policy effect, only images signed with trusted identities are allowed for deployment. You can begin deploying your workloads to observe the effects. In this document, we will use the kubectl command to deploy a simple pod. Similarly, you can deploy your workloads using a Helm chart or any templates that trigger Helm installation.

Set up environment variables:

export IMAGE_SIGNED=<signed-image-reference>
export IMAGE_UNSIGNED=<unsigned-image-reference>
export IMAGE_SIGNED_UNTRUSTED=<signed-untrusted-image-reference>

Run the following command. Since $IMAGE_SIGNED references an image that is signed by a trusted identity and configured in Ratify, it is allowed for deployment.

kubectl run demo-signed --image=$IMAGE_SIGNED

Below is an example of the output for a successful deployment:

pod/demo-signed created

$IMAGE_UNSIGNED references an image that is not signed. $IMAGE_SIGNED_UNTRUSTED references an image that is signed with a different certificate that you do not trust. Both images will therefore be denied deployment. For example, run the following command:

kubectl run demo-unsigned --image=$IMAGE_UNSIGNED

Below is an example of the output for a deployment that is denied:

Error from server (Forbidden): admission webhook "validation.gatekeeper.sh" denied the request: [azurepolicy-ratifyverification-077bac5b63d37da0bc4a] Subject failed verification: $IMAGE_UNSIGNED
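
To try all three images in one pass, a small wrapper around kubectl run can report the outcome for each. This is a sketch under assumptions: the helper name try_deploy and the pod names are hypothetical, and a failure is reported as "denied or failed" because kubectl run can also fail for reasons unrelated to signature verification.

```shell
# Attempt to create a pod and report whether the request was accepted.
try_deploy() {
  # $1 = pod name, $2 = image reference
  if kubectl run "$1" --image="$2" >/dev/null 2>&1; then
    echo "$1: allowed"
  else
    echo "$1: denied or failed"
  fi
}

# Only run against the cluster when kubectl and the image variables are available.
if command -v kubectl >/dev/null 2>&1 && [ -n "${IMAGE_SIGNED:-}" ]; then
  try_deploy check-signed "$IMAGE_SIGNED"
  try_deploy check-unsigned "$IMAGE_UNSIGNED"
  try_deploy check-signed-untrusted "$IMAGE_SIGNED_UNTRUSTED"
fi
```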

To understand why a deployment was denied, output the Ratify logs with the following command, search them for the text verification response for subject $IMAGE_UNSIGNED, and check the errorReason field.

kubectl logs <ratify-pod> -n $RATIFY_NAMESPACE
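
The log search described above can be narrowed with a small filter. The sketch below keeps only lines that contain the text verification response and the image reference you pass in. The function name filter_verification is hypothetical, and the deployment name ratify in the usage comment is an assumption based on the Helm release name used in this document.

```shell
# Keep only verification-response log lines that mention the given subject.
filter_verification() {
  # $1 = image reference; log lines are read from standard input
  grep -F "verification response" | grep -F -- "$1"
}

# Example usage (not run automatically):
#   kubectl logs deployment/ratify -n "$RATIFY_NAMESPACE" \
#     | filter_verification "$IMAGE_UNSIGNED"
```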

Use Audit policy effect

With the Audit policy effect, unsigned images or images signed with untrusted identities are allowed for deployment. However, the AKS cluster and related components will be marked as non-compliant. For more details on how to view non-compliant resources and understand the reasons, see Get the Azure policy compliance-data.

Cleaning Up

Use the following commands to uninstall Ratify and clean up Ratify CRDs:

helm delete ratify --namespace $RATIFY_NAMESPACE
kubectl delete crd stores.config.ratify.deislabs.io verifiers.config.ratify.deislabs.io certificatestores.config.ratify.deislabs.io policies.config.ratify.deislabs.io keymanagementproviders.config.ratify.deislabs.io namespacedkeymanagementproviders.config.ratify.deislabs.io namespacedpolicies.config.ratify.deislabs.io namespacedstores.config.ratify.deislabs.io namespacedverifiers.config.ratify.deislabs.io

Delete the policy assignment and definition using the following commands:

az policy assignment delete --name "$DEFINITION_NAME" --scope "$POLICY_SCOPE"
az policy definition delete --name "$DEFINITION_NAME"

FAQ

How can I set up certificates for signature verification if I don't have access to AKV?

In some cases, image consumers may not have access to the certificates used for signature verification. To verify signatures, you will need to download the root CA certificate file in PEM format and specify the related parameters for the Ratify Helm chart installation. Below is an example command similar to the previous installation command, but without any parameters related to AKV certificates. The Notary Project trust store refers to the certificate file that is passed in the parameter notationCerts[0]:

helm install ratify ratify/ratify --atomic --namespace $RATIFY_NAMESPACE --create-namespace --version $CHART_VER --set provider.enableMutation=false --set featureFlags.RATIFY_CERT_ROTATION=true \
--set azureWorkloadIdentity.clientId=$IDENTITY_CLIENT_ID \
--set oras.authProviders.azureWorkloadIdentityEnabled=true \
--set-file notationCerts[0]="<root-ca-certificate-filepath>" \
--set notation.trustPolicies[0].registryScopes[0]="$REPO_URI" \
--set notation.trustPolicies[0].trustStores[0]="ca:notationCerts[0]" \
--set notation.trustPolicies[0].trustedIdentities[0]="x509.subject: $SUBJECT"

Note

Since notationCerts[0] is used for the root CA certificate, if you have an additional certificate file for timestamping, make sure you use the correct index. For example, if notationCerts[1] is used for the TSA root certificate file, add another trust store, notation.trustPolicies[0].trustStores[1], with the value "tsa:notationCerts[1]".
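
Because --set-file reads the certificate from disk, a quick check that each file exists and looks like a PEM certificate can catch path mistakes before installation. This is a rough sketch: the helper name is_pem_cert is hypothetical, and it only looks for the PEM header line rather than validating the certificate itself.

```shell
# Succeed only if the file exists and contains a PEM certificate header.
is_pem_cert() {
  # $1 = path to the certificate file
  [ -f "$1" ] && grep -q -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Example usage (not run automatically):
#   is_pem_cert "<root-ca-certificate-filepath>" \
#     || echo "Not a PEM certificate file" >&2
```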

What steps should I take if Azure Policy is disabled in my AKS cluster?

If Azure Policy is disabled on your AKS cluster, you must install OPA Gatekeeper as the policy controller before installing Ratify.

Note

Azure Policy should remain disabled, as Gatekeeper conflicts with the Azure Policy add-on on AKS clusters. If you want to enable Azure Policy later on, you need to uninstall Gatekeeper and Ratify, and then follow this document to set up Ratify with Azure Policy enabled.

helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts

helm install gatekeeper/gatekeeper  \
--name-template=gatekeeper \
--namespace gatekeeper-system --create-namespace \
--set enableExternalData=true \
--set validatingWebhookTimeoutSeconds=5 \
--set mutatingWebhookTimeoutSeconds=2 \
--set externaldataProviderResponseCacheTTL=10s

Then, install Ratify as described in the previous steps. After installation, enforce policies using the following commands. By default, the policy effect is set to Deny. You can refer to the Gatekeeper violations document to update the constraint.yaml for different policy effects.

kubectl apply -f https://ratify-project.github.io/ratify/library/default/template.yaml
kubectl apply -f https://ratify-project.github.io/ratify/library/default/samples/constraint.yaml

How can I update Ratify configurations after it has been installed?

Ratify configurations are Kubernetes custom resources, allowing you to update these resources without reinstalling Ratify.

  • To update AKV-related configurations, use the Ratify KeyManagementProvider custom resource. Follow the documentation.
  • To update Notary Project trust policies and stores, use the Ratify Verifier custom resource. Follow the documentation.
  • To authenticate and interact with ACR (or other OCI-compliant registries), use the Ratify Store custom resource. Follow the documentation.

What should I do if my container images are not signed using the Notation tool?

This document applies to verifying Notary Project signatures regardless of which tool produced them, as long as the signatures are Notary Project-compliant. Ratify also supports verifying other types of signatures. For more information, see the Ratify user guide.