AKS Cluster deployment failed with Custom DNS server configuration

Nadeem Hussain Joo 1 Reputation point
2025-02-07T13:51:24.83+00:00

Hi Everyone

I was deploying an AKS cluster into my own VNet, which is configured with custom DNS servers. At first I got an error that a subnet with a route table can't be attached to AKS. I removed the route table, but then got a DNS-related error. That was expected, since we use a custom DNS server and there was no route available to reach it. I edited the configuration, selected the default DNS server provided by Microsoft, and redeployed; this time the deployment succeeded.

To ensure that AKS would work with our route table and custom DNS, I added those configurations back and restarted the AKS cluster. It looped for around 15 minutes and then returned this error: "Failed to start the Kubernetes service 'aks-001'. Error: Agents are unable to resolve Kubernetes API server name. It's likely custom DNS server is not correctly configured."

It seems that I need to link the private DNS zone that the AKS cluster created in its managed resource group to the subscription where my DNS server lives. However, I am concerned about what happens if I create one more AKS cluster and need to link its new DNS zone to the same DNS server. How will DNS resolution work in that case?

Azure Kubernetes Service (AKS)

1 answer

  1. Sina Salam 17,336 Reputation points
    2025-02-07T14:54:08.5866667+00:00

    Hello Nadeem Hussain Joo,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    Thank you also for contacting me on LinkedIn.

    Based on our previous discussion and the additional detail you provided here, I understand that there are four major issues to troubleshoot:

    1. By default, the system-assigned managed identity is created during AKS deployment, so permissions cannot be assigned to it beforehand. This causes failures when a route table is attached to the subnet.
    2. The custom DNS server cannot resolve the AKS API server’s private FQDN, for example mycluster.privatelink.<region>.azmk8s.io.
    3. AKS nodes cannot reach the Kubernetes API server or Azure services due to missing routes.
    4. Each new AKS cluster requires manual linking of its private DNS zone.

    Therefore, to resolve these issues:

    Number 1:

    Step 1: Create a user-assigned managed identity before deploying the AKS cluster.

    Step 2: Assign the Network Contributor role to this identity on:

    • The subnet where AKS will be deployed.
    • The route table (if using a custom one).
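
    Steps 1 and 2 could be scripted roughly as follows. This is a minimal sketch, not a definitive procedure: the function name prepare_aks_identity and all four arguments are placeholders of my own, and it assumes you are already logged in with rights to create role assignments on both scopes.

    ```shell
    # Sketch: pre-create a user-assigned identity and grant it Network Contributor
    # on the AKS subnet and the route table. All argument values are placeholders.
    prepare_aks_identity() {
      local rg="$1" identity_name="$2" subnet_id="$3" route_table_id="$4"
      local principal_id scope

      # Create the identity and capture its principal (object) ID.
      principal_id="$(az identity create \
        --resource-group "$rg" \
        --name "$identity_name" \
        --query principalId --output tsv)" || return 1

      # Assign Network Contributor on each scope that AKS has to manage.
      for scope in "$subnet_id" "$route_table_id"; do
        az role assignment create \
          --assignee-object-id "$principal_id" \
          --assignee-principal-type ServicePrincipal \
          --role "Network Contributor" \
          --scope "$scope" || return 1
      done
    }
    ```

    Call it with your own values, for example prepare_aks_identity <RG> <IDENTITY_NAME> <SUBNET_ID> <ROUTE_TABLE_ID>, before running az aks create.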

    Step 3: Deploy AKS with the user-assigned identity specified in the identity block (as in the CLI example below). Because the identity exists before deployment, its permissions on the subnet and route table are already in place.

       # Example: Deploy AKS with a user-assigned identity
       az aks create \
         --resource-group <RG> \
         --name <CLUSTER> \
         --vnet-subnet-id <SUBNET_ID> \
         --assign-identity <USER_ASSIGNED_IDENTITY_RESOURCE_ID>
    

    Number 2:

    Each AKS private cluster creates a private DNS zone (e.g., privatelink.<region>.azmk8s.io). Link these zones to the DNS server’s VNet:

    Step 1: After cluster deployment, link this zone to the VNet hosting your custom DNS server.

    Step 2: For multiple clusters, repeat the linking process for each new cluster’s private DNS zone.
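
    The linking in Steps 1 and 2 might look like the sketch below, run once per cluster. The function name link_aks_zone_to_dns_vnet and the link name suffix are my own placeholders. It assumes the zone was created by AKS in the node (managed) resource group and that it is the only private DNS zone there; if you brought your own zone, pass its resource group and name directly instead.

    ```shell
    # Sketch: link a cluster's private DNS zone to the VNet hosting the custom DNS server.
    link_aks_zone_to_dns_vnet() {
      local cluster_rg="$1" cluster="$2" dns_vnet_id="$3"
      local node_rg zone

      # Look up the node (managed) resource group of the cluster.
      node_rg="$(az aks show --resource-group "$cluster_rg" --name "$cluster" \
        --query nodeResourceGroup --output tsv)" || return 1

      # Assumption: exactly one private DNS zone lives in that resource group.
      zone="$(az network private-dns zone list --resource-group "$node_rg" \
        --query "[0].name" --output tsv)" || return 1

      # Create the link; pass the VNet resource ID so a VNet in another
      # resource group or subscription can be targeted.
      az network private-dns link vnet create \
        --resource-group "$node_rg" \
        --zone-name "$zone" \
        --name "${cluster}-dns-link" \
        --virtual-network "$dns_vnet_id" \
        --registration-enabled false
    }
    ```

    Each cluster gets its own zone and its own link, so several clusters can share the same DNS server VNet; the DNS server keeps forwarding to Azure DNS, which answers from whichever linked zone matches the queried name.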

    Conditional Forwarding on Custom DNS Server:

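    • Configure your DNS server to forward queries for privatelink.<region>.azmk8s.io to Azure DNS (168.63.129.16). This address is reachable only from inside an Azure VNet, so the forwarding server must run in one.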
    • For Windows DNS Server: Add a conditional forwarder for "privatelink.<region>.azmk8s.io" > 168.63.129.16
    • For Bind (Linux): zone "privatelink.<region>.azmk8s.io" { type forward; forwarders { 168.63.129.16; }; };
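
    Once the forwarder is in place, a quick check like the one below can confirm that the custom DNS server resolves the API server name. It is only a sketch: check_api_server_dns is a name I made up, the DNS server IP is a placeholder, and it assumes nslookup is installed on a machine that can reach the DNS server.

    ```shell
    # Sketch: ask the custom DNS server directly for the cluster's private API server FQDN.
    check_api_server_dns() {
      local cluster_rg="$1" cluster="$2" dns_server_ip="$3"
      local fqdn

      # Read the private FQDN that the agents need to resolve.
      fqdn="$(az aks show --resource-group "$cluster_rg" --name "$cluster" \
        --query privateFqdn --output tsv)" || return 1

      # Query the custom DNS server explicitly rather than the local resolver.
      nslookup "$fqdn" "$dns_server_ip"
    }
    ```

    If the lookup returns a private IP, the forwarder and zone link are working; if it fails, check the conditional forwarder and the VNet link on the zone first.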

    Number 3:

    Attach the route table to the subnet after AKS deployment if the cluster fails to provision with it pre-attached.
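
    Attaching the route table after deployment could be done along these lines. Treat it as a sketch: attach_route_table is a placeholder name, and the resource group, VNet, subnet and route table ID are values you supply.

    ```shell
    # Sketch: associate an existing route table with the AKS subnet after the cluster is up.
    attach_route_table() {
      local vnet_rg="$1" vnet="$2" subnet="$3" route_table_id="$4"

      az network vnet subnet update \
        --resource-group "$vnet_rg" \
        --vnet-name "$vnet" \
        --name "$subnet" \
        --route-table "$route_table_id"
    }
    ```

    Passing the route table by resource ID lets it live in a different resource group from the VNet.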

    Destination          Next Hop        Purpose
    AzureFirewall-IP     Firewall        Internet-bound traffic
    168.63.129.16/32     Internet        Azure DNS resolution
    AKS-API-Server-IP    Firewall/VNet   Kubernetes API server (if using firewall)

    Number 4:

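    To manage multiple AKS clusters more easily, use Azure Policy to auto-link private DNS zones to your DNS server’s VNet when new AKS clusters are created.

    Below is an example policy definition skeleton for this. The deployment template is left as a placeholder with an empty resources array; add the virtualNetworkLinks resource that links the private DNS zone to the target VNet, and replace <ROLE_ID> with a role that is allowed to create that link:

           {
             "if": {
               "allOf": [
                 { "field": "type", "equals": "Microsoft.ContainerService/managedClusters" },
                 { "field": "Microsoft.ContainerService/managedClusters/apiServerAccessProfile.enablePrivateCluster", "equals": "true" }
               ]
             },
             "then": {
               "effect": "deployIfNotExists",
               "details": {
                 "type": "Microsoft.Network/privateDnsZones/virtualNetworkLinks",
                 "roleDefinitionIds": [ "<ROLE_ID>" ],
                 "deployment": {
                   "properties": {
                     "mode": "incremental",
                     "template": {
                       "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
                       "contentVersion": "1.0.0.0",
                       "resources": []
                     }
                   }
                 }
               }
             }
           }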
    

    I hope this is helpful and works for you! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close the thread by upvoting this and accepting it as an answer if it is helpful.

