I have an AKS cluster with a managed load balancer in front of it, automatically provisioned by my LoadBalancer service, and associated with a public IP address. The node pools belong to a subnet I created with an associated NSG. Everything works great.
I then created a global load balancer with a separate public IP address that has this AKS load balancer as a backend. But traffic sent to this IP address never reaches my cluster: TCP connections time out, even though the health checks for both load balancers are succeeding.
I can route traffic through the same global load balancer and the same AKS load balancer to a one-off VM I created in the same virtual network and subnet. So clearly there's nothing inherently wrong with either load balancer, the vnet/subnet/NSG, or my connection.
Also, I had previously tried routing to the same node pool through a separate (regional) load balancer that I created manually and pointed at a NodePort service in the cluster. That did not work either, even though its health checks, which probe the same port as the target backend port, showed the backends as healthy.
So my question is, is there something special about the managed load balancers that allows them to route traffic to node pools in AKS clusters? Why would routing through a global load balancer upstream interfere with that? Or is there some setting I'm potentially missing somewhere, perhaps in the Kubernetes cluster?
I found this page: https://learn.microsoft.com/en-us/troubleshoot/azure/azure-kubernetes/connectivity/connection-issues-application-hosted-aks-cluster. It says:
"If you receive a Connection Timed Out error message, check the network security group that's associated with the AKS nodes. Also, check the AKS subnet. It could be blocking the traffic from the load balancer or application gateway to the AKS nodes."
But it doesn't say exactly which settings to check, or why they would block traffic to the AKS nodes but not to the VM.
Failing any of that, is there a way to inspect traffic logs for the load balancer that would tell me where exactly the connection is stalling out?
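In the meantime, the only client-side signal I can think of is whether each public endpoint times out, refuses, or accepts a TCP connection. Here is a minimal sketch of such a probe in Python. The function names `probe` and `probe_all` are my own, and any addresses or ports passed to them are placeholders, not values from my setup. It only shows how a connection attempt ends as seen from the client. It cannot show which hop drops the packets.

```python
import socket
from typing import Dict, Iterable, Tuple


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify how it ends."""
    try:
        # create_connection performs the full TCP handshake or raises.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # No SYN-ACK within the timeout: packets are likely dropped silently.
        return "timeout"
    except ConnectionRefusedError:
        # An RST came back: something is reachable but nothing listens there.
        return "refused"
    except OSError as exc:
        # Anything else (unreachable network, DNS failure, and so on).
        return f"error: {exc}"


def probe_all(targets: Iterable[Tuple[str, str, int]]) -> Dict[str, str]:
    """Probe each (label, host, port) target and map label to outcome."""
    return {label: probe(host, port) for label, host, port in targets}
```

For example, calling `probe_all([("global-lb", "203.0.113.10", 80), ("aks-lb", "198.51.100.20", 80)])` with the real frontend IPs substituted for those documentation-range placeholders would show whether the timeout only appears on the global frontend or on both.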