Troubleshoot Azure Key Vault Secrets Provider add-on in AKS
This article discusses how to troubleshoot issues that you might experience when using the Azure Key Vault Secrets Provider add-on in Azure Kubernetes Service (AKS).
Note
This article applies to the AKS managed add-on version of the Azure Key Vault Secrets Provider. If you use the Helm-installed (self-managed) version, see the Azure Key Vault Provider for Secrets Store CSI Driver GitHub documentation.
Prerequisites
The Kubernetes kubectl tool (To install kubectl by using Azure CLI, run the az aks install-cli command.)
The Kubernetes Secrets Store CSI Driver add-on, enabled on the AKS cluster
The client URL (curl) tool
The Netcat (nc) command-line tool for TCP connections
Troubleshooting checklist
Step 1: Confirm that the Azure Key Vault Secrets Provider add-on is enabled on your cluster
Run the az aks show command to confirm that the add-on is enabled on your cluster:
az aks show -g <aks-resource-group-name> -n <aks-name> --query 'addonProfiles.azureKeyvaultSecretsProvider'
The command output should be similar to the following text:
{
  "config": null,
  "enabled": true,
  "identity": {
    "clientId": "<client-id>",
    "objectId": "<object-id>",
    "resourceId": "/subscriptions/<subscription-id>/resourcegroups/<resource-group-name>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<azure-key-vault-secrets-provider-identity-name>"
  }
}
If the enabled flag is shown as false in the preceding output, the Azure Key Vault Secrets Provider add-on isn't enabled on your cluster. In this case, refer to the Azure Key Vault Provider for Secrets Store CSI Driver GitHub documentation for further troubleshooting.
If the enabled flag is shown as true in the preceding output, the Azure Key Vault Secrets Provider add-on is enabled on your cluster. In this case, continue with the next steps in this article.
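The check in this step can also be scripted. The following is a minimal sketch, not an official tool: the function names addon_enabled_flag and describe_addon_state are hypothetical, and the sketch assumes that appending .enabled to the query shown earlier and using --output tsv returns only the flag value. The comparison ignores letter case because the exact casing of the value in tsv output is an assumption.

```shell
# Sketch only: function names are hypothetical helpers, not part of Azure CLI.

# Print the add-on's "enabled" flag for a cluster.
# $1 = resource group name, $2 = AKS cluster name
addon_enabled_flag() {
  az aks show --resource-group "$1" --name "$2" \
    --query 'addonProfiles.azureKeyvaultSecretsProvider.enabled' \
    --output tsv
}

# Turn the raw flag value into one of: enabled, disabled, unknown.
# An empty value (for example, no add-on profile at all) maps to "unknown".
describe_addon_state() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    true)  echo "enabled" ;;
    false) echo "disabled" ;;
    *)     echo "unknown" ;;
  esac
}

# Example:
#   describe_addon_state "$(addon_enabled_flag <aks-resource-group-name> <aks-name>)"
```

A result of "unknown" usually means that the query returned nothing, in which case the full az aks show output is worth inspecting by hand.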
Step 2: Check the Secrets Store Provider and CSI Driver pod logs
Azure Key Vault Secrets Provider add-on logs are generated by both provider and driver pods. To troubleshoot issues that affect the provider or driver, examine the logs from the pod that's running on the same node as your application pod.
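On clusters that have many nodes, it can help to look up the application pod's node first and then list only the add-on pods on that node. The following sketch assumes that your kubectl context already points at the cluster. The function names app_pod_node and addon_pods_on_node are hypothetical. The sketch relies on the spec.nodeName field selector that kubectl supports for pods.

```shell
# Sketch only: function names are hypothetical helpers.

# Print the name of the node that an application pod is scheduled on.
# $1 = namespace, $2 = application pod name
app_pod_node() {
  kubectl get pod "$2" -n "$1" -o jsonpath='{.spec.nodeName}'
}

# List the Secrets Store Provider and CSI Driver pods on one node.
# $1 = node name
addon_pods_on_node() {
  kubectl get pod -n kube-system \
    -l 'app in (secrets-store-provider-azure, secrets-store-csi-driver)' \
    --field-selector "spec.nodeName=$1" -o wide
}

# Example:
#   addon_pods_on_node "$(app_pod_node <namespace> <application-pod-name>)"
```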
Run the kubectl get command to find the Secrets Store Provider and CSI Driver pods that run on the same node that your application pod runs on:
kubectl get pod -l 'app in (secrets-store-provider-azure, secrets-store-csi-driver)' -n kube-system -o wide
Run the kubectl logs command to view logs from the Secrets Store Provider pod:
kubectl logs -n kube-system <provider-pod-name> --since=1h | grep ^E
Run the kubectl logs command to view logs from the Secrets Store CSI Driver pod:
kubectl logs -n kube-system <csi-driver-pod-name> -c secrets-store --since=1h | grep ^E
Once you collect the Secrets Store Provider and CSI Driver pod logs, analyze these logs against the causes mentioned in the following sections to identify the issue and corresponding solution.
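Matching log lines against the causes can be partly automated. The following sketch maps a log or event line to a cause number by looking for the distinctive fragments quoted in this article. The function name classify_secrets_store_error is hypothetical. The patterns for causes 7 and 8 are assumptions modeled on the StatusCode= format that appears in the other errors, because this article doesn't quote a sample line for them.

```shell
# Sketch only: a hypothetical helper that maps one log or event line
# to the matching cause section of this article.
# $1 = a single log or event line
classify_secrets_store_error() {
  case "$1" in
    *"nmi response failed"*)
      echo "Cause 1" ;;
    *"context deadline exceeded"*)
      echo "Cause 2" ;;
    *"Identity not found"*)
      echo "Cause 3" ;;
    *"ForbiddenByConnection"*|*"Public network access is disabled"*)
      echo "Cause 4" ;;
    *"not found in the list of registered CSI drivers"*)
      echo "Cause 5" ;;
    *"failed to get secretproviderclass"*)
      echo "Cause 6" ;;
    *"StatusCode=401"*)   # assumed pattern, not quoted in this article
      echo "Cause 7" ;;
    *"StatusCode=429"*)   # assumed pattern, not quoted in this article
      echo "Cause 8" ;;
    *)
      echo "unknown" ;;
  esac
}

# Example:
#   kubectl logs -n kube-system <provider-pod-name> --since=1h | grep ^E |
#     while IFS= read -r line; do classify_secrets_store_error "$line"; done
```

A result of "unknown" only means that none of the fragments matched. The line still needs to be read manually.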
Note
If you open a support request, it's a good idea to include the relevant logs from the Azure Key Vault Provider and the Secrets Store CSI Driver.
Cause 1: Couldn't retrieve the key vault token
You might see the following error entry in the logs or event messages:
Warning FailedMount 74s kubelet MountVolume.SetUp failed for volume "secrets-store-inline" : kubernetes.io/csi: mounter.SetupAt failed: rpc error: code = Unknown desc = failed to mount secrets store objects for pod default/test, err: rpc error: code = Unknown desc = failed to mount objects, error: failed to get keyvault client: failed to get key vault token: nmi response failed with status code: 404, err: <nil>
This error occurs because a Node Managed Identity (NMI) component in aad-pod-identity returned an error message about a token request.
Solution 1: Check the NMI pod logs
For more information about this error and how to resolve it, check the NMI pod logs, and refer to the Microsoft Entra pod identity troubleshooting guide.
Cause 2: The provider pod can't access the key vault instance
You might see the following error entry in the logs or event messages:
E1029 17:37:42.461313 1 server.go:54] failed to process mount request, error: keyvault.BaseClient#GetSecret: Failure sending request: StatusCode=0 -- Original Error: context deadline exceeded
This error occurs because the provider pod can't access the key vault instance. Access might be prevented for any of the following reasons:
A firewall rule is blocking egress traffic from the provider.
Network policies that are configured in the AKS cluster are blocking egress traffic.
The provider pods run on the host network. A failure might occur if a policy is blocking this traffic or if network jitter occurs on the node.
Solution 2: Check network policies, allowlist, and node connection
To fix the issue, take the following actions:
Put the provider pods on the allowlist.
Check for policies that are configured to block traffic.
Make sure that the node has connectivity to Microsoft Entra ID and your key vault.
To test the connectivity to your Azure key vault from the pod that's running on the host network, follow these steps:
Create the pod:
cat <<EOF | kubectl apply --filename -
apiVersion: v1
kind: Pod
metadata:
  name: curl
spec:
  hostNetwork: true
  containers:
  - args:
    - tail
    - -f
    - /dev/null
    image: curlimages/curl:7.75.0
    name: curl
  dnsPolicy: ClusterFirst
  restartPolicy: Always
EOF
Run kubectl exec to run a command in the pod that you created:
kubectl exec --stdin --tty curl -- sh
Authenticate with Microsoft Entra ID to get an access token for your Azure key vault:
curl -X POST 'https://login.microsoftonline.com/<aad-tenant-id>/oauth2/v2.0/token' \
  -d 'grant_type=client_credentials&client_id=<azure-client-id>&client_secret=<azure-client-secret>&scope=https://vault.azure.net/.default'
Try to get a secret that's already created in your Azure key vault:
curl -X GET 'https://<key-vault-name>.vault.azure.net/secrets/<secret-name>?api-version=7.2' \
  -H "Authorization: Bearer <access-token-acquired-above>"
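Copying the access token by hand between the two curl calls is error-prone. The following sketch extracts it from the token response by using sed, on the assumption that no JSON parser such as jq is available inside the curl container. The function name extract_access_token is hypothetical, and the sketch assumes that the response is a JSON object that has a string field named access_token.

```shell
# Sketch only: a hypothetical helper for use inside the test pod.
# Reads a JSON token response on stdin and prints the access_token value.
# Prints nothing if the field is missing (for example, on an error response).
extract_access_token() {
  sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example, using the same placeholders as the commands in this section:
#   TOKEN=$(curl -s -X POST 'https://login.microsoftonline.com/<aad-tenant-id>/oauth2/v2.0/token' \
#     -d 'grant_type=client_credentials&client_id=<azure-client-id>&client_secret=<azure-client-secret>&scope=https://vault.azure.net/.default' \
#     | extract_access_token)
#   curl -X GET 'https://<key-vault-name>.vault.azure.net/secrets/<secret-name>?api-version=7.2' \
#     -H "Authorization: Bearer $TOKEN"
```

If the token variable ends up empty, inspect the raw response from the first request. It typically contains an error description instead of a token.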
Cause 3: The user-assigned managed identity is incorrect in the SecretProviderClass custom resource
If you encounter an HTTP "400" error code that's accompanied by an "Identity not found" error description, the user-assigned managed identity is incorrect in your SecretProviderClass custom resource. The full response resembles the following text:
MountVolume.SetUp failed for volume "<volume-name>" :
rpc error:
code = Unknown desc = failed to mount secrets store objects for pod <namespace>/<pod>,
err: rpc error: code = Unknown desc = failed to mount objects,
error: failed to get objectType:secret, objectName:<key-vault-secret-name>, objectVersion:: azure.BearerAuthorizer#WithAuthorization:
Failed to refresh the Token for request to https://<key-vault-name>.vault.azure.net/secrets/<key-vault-secret-name>/?api-version=2016-10-01:
StatusCode=400 -- Original Error: adal: Refresh request failed.
Status Code = '400'.
Response body: {"error":"invalid_request","error_description":"Identity not found"}
Endpoint http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&client_id=<userAssignedIdentityID>&resource=https%!!(MISSING)A(MISSING)%!!(MISSING)F(MISSING)%!!(MISSING)F(MISSING)vault.azure.net
Solution 3: Update SecretProviderClass by using the correct userAssignedIdentityID value
Find the correct user-assigned managed identity, and then update the SecretProviderClass custom resource to specify the correct value in the userAssignedIdentityID parameter. To find the correct user-assigned managed identity, run the following az aks show command in Azure CLI:
az aks show --resource-group <resource-group-name> \
--name <cluster-name> \
--query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId \
--output tsv
For information about how to set up a SecretProviderClass custom resource in YAML format, see the Use a user-assigned managed identity section of the Provide an identity to access the Azure Key Vault Provider for Secrets Store CSI Driver article.
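To see whether the value in the cluster matches, you can compare the client ID that az aks show returns with the userAssignedIdentityID that's stored in the SecretProviderClass. The following is a sketch under two assumptions: the parameter lives at spec.parameters.userAssignedIdentityID in the custom resource, and you intend to use the add-on's own identity rather than a different user-assigned identity. All function names are hypothetical.

```shell
# Sketch only: function names are hypothetical helpers.

# Print the client ID of the add-on's user-assigned managed identity.
# $1 = resource group name, $2 = AKS cluster name
addon_client_id() {
  az aks show --resource-group "$1" --name "$2" \
    --query addonProfiles.azureKeyvaultSecretsProvider.identity.clientId \
    --output tsv
}

# Print the userAssignedIdentityID configured in a SecretProviderClass.
# $1 = namespace, $2 = SecretProviderClass name
spc_identity_id() {
  kubectl get secretproviderclass "$2" -n "$1" \
    -o jsonpath='{.spec.parameters.userAssignedIdentityID}'
}

# Compare two IDs without regard to letter case.
# Prints one of: match, mismatch, unset.
# $1 = expected client ID, $2 = ID found in the SecretProviderClass
compare_identity_ids() {
  expected=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  actual=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  if [ -z "$actual" ]; then
    echo "unset"
  elif [ "$expected" = "$actual" ]; then
    echo "match"
  else
    echo "mismatch"
  fi
}

# Example:
#   compare_identity_ids \
#     "$(addon_client_id <resource-group-name> <cluster-name>)" \
#     "$(spc_identity_id <namespace> <secret-provider-class-name>)"
```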
Cause 4: The Key Vault private endpoint is on a different virtual network than the AKS nodes
Public network access isn't allowed at the Azure Key Vault level, and the connectivity between AKS and Key Vault is made through a private link. However, the AKS nodes and the private endpoint of the Key Vault are on different virtual networks. This scenario generates a message that resembles the following text:
MountVolume.SetUp failed for volume "<volume>" :
rpc error:
code = Unknown desc = failed to mount secrets store objects for pod <namespace>/<pod>,
err: rpc error: code = Unknown desc = failed to mount objects,
error: failed to get objectType:secret, objectName: :<key-vault-secret-name>, objectVersion:: keyvault.BaseClient#GetSecret:
Failure responding to request:
StatusCode=403 -- Original Error: autorest/azure: Service returned an error.
Status=403 Code="Forbidden"
Message="Public network access is disabled and request is not from a trusted service nor via an approved private link.\r\n
Caller: appid=<application-id>;oid=<object-id>;iss=https://sts.windows.net/<id>/;xms_mirid=/subscriptions/<subscription-id>/resourcegroups/<aks-infrastructure-resource-group>/providers/Microsoft.Compute/virtualMachineScaleSets/aks-<nodepool-name>-<nodepool-id>-vmss;xms_az_rid=/subscriptions/<subscription-id>/resourcegroups/<aks-infrastructure-resource-group>/providers/Microsoft.Compute/virtualMachineScaleSets/aks-<nodepool-name>-<nodepool-id>-vmss \r\n
Vault: <keyvaultname>;location=<location>" InnerError={"code":"ForbiddenByConnection"}
Solution 4a: Set up a virtual network link and virtual network peering to connect the virtual networks
Fixing the connectivity issue is generally a two-step process:
Create a virtual network link for the virtual network of the AKS cluster at the private Azure DNS zone level.
Add virtual network peering between the virtual network of the AKS cluster and the virtual network of the Key Vault private endpoint.
These steps are described in more detail in the following sections.
Step 1: Create the virtual network link
Connect to the AKS cluster nodes to determine whether the fully qualified domain name (FQDN) of the Key Vault is resolved through a public IP address or a private IP address. If you receive the "Public network access is disabled and request is not from a trusted service nor via an approved private link" error message, the Key Vault endpoint is probably resolved through a public IP address. To check for this scenario, run the nslookup command:
nslookup <key-vault-name>.vault.azure.net
If the FQDN is resolved through a public IP address, the command output resembles the following text:
root@aks-<nodepool-name>-<nodepool-id>-vmss<scale-set-instance>:/# nslookup <key-vault-name>.vault.azure.net
Server: 168.63.129.16
Address: 168.63.129.16#53
Non-authoritative answer:
<key-vault-name>.vault.azure.net canonical name = <key-vault-name>.privatelink.vaultcore.azure.net.
<key-vault-name>.privatelink.vaultcore.azure.net canonical name = data-prod-weu.vaultcore.azure.net.
data-prod-weu.vaultcore.azure.net canonical name = data-prod-weu-region.vaultcore.azure.net.
data-prod-weu-region.vaultcore.azure.net canonical name = azkms-prod-weu-b.westeurope.cloudapp.azure.com.
Name: azkms-prod-weu-b.westeurope.cloudapp.azure.com
Address: 20.1.2.3
In this case, create a virtual network link for the virtual network of the AKS cluster at the private DNS zone level. (A virtual network link is already created automatically for the virtual network of the Key Vault private endpoint.)
To create the virtual network link, follow these steps:
In the Azure portal, search for and select Private DNS zones.
In the list of private DNS zones, select the name of your private DNS zone. In this example, the private DNS zone is privatelink.vaultcore.azure.net.
In the navigation pane of the private DNS zone, locate the Settings heading, and then select Virtual network links.
In the list of virtual network links, select Add.
In the Add virtual network link page, complete the following fields.
Field name | Action
Link name | Enter a name to use for the virtual network link.
Subscription | Select the name of the subscription that you want to contain the virtual network link.
Virtual network | Select the name of the virtual network of the AKS cluster.
Select the OK button.
After you finish the link creation procedure, run the nslookup command. The output should now resemble the following text that shows a more direct DNS resolution:
root@aks-<nodepool-name>-<nodepool-id>-vmss<scale-set-instance>:/# nslookup <key-vault-name>.vault.azure.net
Server: 168.63.129.16
Address: 168.63.129.16#53
Non-authoritative answer:
<key-vault-name>.vault.azure.net canonical name = <key-vault-name>.privatelink.vaultcore.azure.net.
Name: <key-vault-name>.privatelink.vaultcore.azure.net
Address: 172.20.0.4
After the virtual network link is added, the FQDN should be resolvable through a private IP address.
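The before-and-after comparison comes down to whether the last address in the nslookup output is private. The following sketch automates that reading. It assumes the nslookup output layout shown in this article (other nslookup implementations format their output differently) and treats only the RFC 1918 IPv4 ranges as private. The function names are hypothetical.

```shell
# Sketch only: function names are hypothetical helpers.

# Read nslookup output on stdin and print the last "Address" value,
# which is the resolved address in the layout shown in this article.
last_resolved_address() {
  awk '/^Address/ { addr = $2 } END { print addr }'
}

# Succeed (exit status 0) if the IPv4 address is in an RFC 1918 range:
# 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
# $1 = IPv4 address in dotted-decimal form
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   ip=$(nslookup <key-vault-name>.vault.azure.net | last_resolved_address)
#   if is_private_ipv4 "$ip"; then echo "private: $ip"; else echo "public: $ip"; fi
```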
Step 2: Add virtual network peering between virtual networks
If you're using a private endpoint, you've probably disabled public access at the Key Vault level. Therefore, AKS can reach the Key Vault only through the private endpoint, and no connectivity exists until the two virtual networks are peered. You can test the connectivity by using the following Netcat (nc) command:
nc -v -w 2 <key-vault-name>.vault.azure.net 443
If connectivity isn't available between AKS and the Key Vault, you see output that resembles the following text:
nc: connect to <key-vault-name>.vault.azure.net port 443 (tcp) timed out: Operation now in progress
To establish connectivity between AKS and the Key Vault, add virtual network peering between the virtual networks by following these steps:
Go to the Azure portal.
Follow the instructions in the Create virtual network peer section of the Tutorial: Connect virtual networks with virtual network peering using the Azure portal article to peer the virtual networks and verify that they're connected. You can start from either end by using one of the following options:
Go to your AKS virtual network, and peer it to the virtual network of the Key Vault private endpoint.
Go to the virtual network of the Key Vault private endpoint, and peer it to the AKS virtual network.
In the Azure portal, search for and select the name of the other virtual network (the virtual network that you peered to in the previous step).
In the virtual network navigation pane, locate the Settings heading, and then select Peerings.
In the virtual network peering page, verify that the Name column contains the Peering link name of the Remote virtual network that you specified in step 2. Also, make sure that the Peering status column for that peering link has a value of Connected.
After you complete this procedure, run the Netcat command again. DNS resolution and connectivity between AKS and the Key Vault should now succeed. Then make sure that the Key Vault secrets are successfully mounted and work as expected. A successful Netcat connection produces the following output:
Connection to <key-vault-name>.vault.azure.net 443 port [tcp/https] succeeded!
Solution 4b: Troubleshoot error code 403
Troubleshoot error code "403" by reviewing the HTTP 403: Insufficient Permissions section of the Azure Key Vault REST API Error Codes reference article.
Cause 5: The secrets-store.csi.k8s.io driver is missing from the list of registered CSI drivers
If you receive the following error message about a missing secrets-store.csi.k8s.io driver in the pod events, the Secrets Store CSI Driver pods aren't running on the node on which the application is running:
Warning FailedMount 42s (x12 over 8m56s) kubelet, akswin000000 MountVolume.SetUp failed for volume "secrets-store01-inline" : kubernetes.io/csi: mounter.SetUpAt failed to get CSI client: driver name secrets-store.csi.k8s.io not found in the list of registered CSI drivers
Solution 5: Troubleshoot the Secrets Store CSI Driver pod running on the same node
Retrieve the status of the Secrets Store CSI Driver pod running on the same node by running the following command:
kubectl get pod -l app=secrets-store-csi-driver -n kube-system -o wide
If the pod status isn't Running or any of the containers in this pod isn't in the Ready state, check the logs for this pod by following the steps in Check the Secrets Store Provider and CSI Driver pod logs.
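The status check can be reduced to a filter over the command output. The following sketch prints the names of pods whose STATUS column isn't Running or whose READY column shows fewer ready containers than the total. It assumes the default kubectl column order (NAME, READY, STATUS). The function name not_ready_pods is hypothetical.

```shell
# Sketch only: a hypothetical filter over "kubectl get pod" table output.
# Reads the table on stdin, skips the header row, and prints the name of
# every pod that isn't Running or that has containers that aren't ready.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")              # READY looks like "2/3"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Example:
#   kubectl get pod -l app=secrets-store-csi-driver -n kube-system -o wide | not_ready_pods
```

Empty output means that every listed driver pod is running with all containers ready. If the application's node has no driver pod in the list at all, that gap is itself the finding.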
Cause 6: SecretProviderClass not found
You might see the following event when describing your application pod:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedMount 2s (x5 over 10s) kubelet MountVolume.SetUp failed for volume "xxxxxxx" : rpc error: code = Unknown desc = failed to get secretproviderclass xxxxxxx/xxxxxxx, error: SecretProviderClass.secrets-store.csi.x-k8s.io "xxxxxxxxxxxxx" not found
This event indicates that the SecretProviderClass referenced in your pod's volume specification doesn't exist in the same namespace as your application pod.
Solution 6a: Create the missing SecretProviderClass resource
Make sure that the SecretProviderClass resource referenced in your pod's volume specification exists in the same namespace where your application pod is running.
Solution 6b: Modify your application pod's volume specification to reference the correct SecretProviderClass resource name
Edit your application pod's volume specification to reference the correct SecretProviderClass resource name:
...
spec:
  containers:
  ...
  volumes:
    - name: my-volume
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: "xxxxxxxxx"
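To find out which of the two solutions applies, you can list the SecretProviderClass names that the pod references and check whether each one exists in the pod's namespace. The following is a sketch with hypothetical function names. It assumes that the references are at the volumeAttributes.secretProviderClass path shown in the preceding specification.

```shell
# Sketch only: function names are hypothetical helpers.

# Print the SecretProviderClass names referenced by a pod's CSI volumes,
# separated by spaces.
# $1 = namespace, $2 = pod name
pod_spc_names() {
  kubectl get pod "$2" -n "$1" \
    -o jsonpath='{.spec.volumes[*].csi.volumeAttributes.secretProviderClass}'
}

# Succeed if a SecretProviderClass with this name exists in the namespace.
# $1 = namespace, $2 = SecretProviderClass name
spc_exists() {
  kubectl get secretproviderclass "$2" -n "$1" >/dev/null 2>&1
}

# Print one "missing: <namespace>/<name>" line per dangling reference.
# $1 = namespace, $2 = pod name
report_missing_spc() {
  for name in $(pod_spc_names "$1" "$2"); do
    spc_exists "$1" "$name" || echo "missing: $1/$name"
  done
}

# Example:
#   report_missing_spc <namespace> <application-pod-name>
#   kubectl get secretproviderclass -n <namespace>   # names that do exist
```

If the reported name is a typo of an existing resource, edit the pod specification. If the name is the one you intended, create the resource in that namespace.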
Cause 7: The request is unauthenticated
The request is unauthenticated for Key Vault, as indicated by a "401" error code.
Solution 7: Troubleshoot error code 401
Troubleshoot error code "401" by reviewing the "HTTP 401: Unauthenticated Request" section of the Azure Key Vault REST API Error Codes reference article.
Cause 8: The number of requests exceeds the stated maximum
The number of requests exceeds the stated maximum for the timeframe, as indicated by a "429" error code.
Solution 8: Troubleshoot error code 429
Troubleshoot error code "429" by reviewing the "HTTP 429: Too Many Requests" section of the Azure Key Vault REST API Error Codes reference article.
Third-party information disclaimer
The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.