Scalability considerations for Azure Kubernetes Service environments
Azure Kubernetes Service (AKS) can be scaled in and out depending on infrastructure needs (requiring more or less capacity, or adding node pools with special capabilities such as GPUs) or on application needs. On the application side, several factors drive scaling: the number and rate of concurrent connections, the number of requests, and back-end latencies of AKS applications.
The most common scalability options for AKS are the cluster autoscaler and the horizontal pod autoscaler. The cluster autoscaler adjusts the number of nodes based on the requested compute resources in the node pool. The horizontal pod autoscaler (HPA) adjusts the number of pods in a deployment depending on CPU utilization or other configured metrics.
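As a sketch of how the cluster autoscaler is turned on, the following Azure CLI command enables it on an existing cluster with a node range of 1 to 5 (the resource group and cluster names are placeholders):

```azurecli
# Enable the cluster autoscaler on an existing AKS cluster.
# Replace the resource group and cluster names with your own.
az aks update \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 5
```

The autoscaler then adds nodes when pods can't be scheduled because of resource constraints, and removes nodes that are underutilized.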
Design considerations
Here are some crucial factors to consider:
Does your application require rapid scaling, with no time to wait for new capacity?
- For fast pod provisioning, use virtual nodes. Virtual nodes are supported only with Linux nodes and pods.
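Virtual nodes are enabled as an AKS add-on. A minimal sketch, assuming the cluster uses the Azure CNI network plug-in (a prerequisite for virtual nodes) and that the subnet name below is a placeholder:

```azurecli
# Enable the virtual nodes add-on on an AKS cluster that uses Azure CNI.
# The subnet must be dedicated to virtual nodes; names are placeholders.
az aks enable-addons \
  --resource-group myResourceGroup \
  --name myAKSCluster \
  --addons virtual-node \
  --subnet-name myVirtualNodeSubnet
```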
Is the workload non-time-sensitive, and can it handle interruptions? If so, consider using Spot VMs.
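A Spot-based node pool can be added with the Azure CLI. In this sketch (pool and cluster names are placeholders), `--spot-max-price -1` means the price is capped only at the on-demand rate, and evicted nodes are deleted:

```azurecli
# Add a Spot VM node pool for interruptible workloads.
# --spot-max-price -1 caps the price at the current on-demand rate.
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name spotpool \
  --priority Spot \
  --eviction-policy Delete \
  --spot-max-price -1 \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 3
```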
Is the underlying infrastructure (network plug-in, IP ranges, subscription limits, quotas, and so on) able to scale out?
Consider automating scalability:
- Enable the cluster autoscaler to scale the number of nodes, and consider scale-to-zero.
- The horizontal pod autoscaler automatically scales the number of pods.
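A minimal HPA manifest using the `autoscaling/v2` API illustrates pod-level autoscaling; the deployment name and thresholds here are placeholder assumptions:

```yaml
# Scale a deployment between 2 and 10 replicas, targeting 70% average CPU
# utilization. Deployment name and thresholds are examples.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

For the CPU target to work, the containers in the target deployment must declare CPU resource requests.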
Consider scalability with multiple zones and node pools:
- When creating node pools, consider setting Availability Zones with AKS.
- Consider using multiple node pools to support applications with different requirements.
- Scale node pools with the cluster autoscaler.
- You can scale user node pools to zero. See the limitations.
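Scaling a user node pool to zero can be done manually with the Azure CLI (this applies to user node pools only, not system node pools; names below are placeholders):

```azurecli
# Manually scale a user node pool down to zero nodes.
# System node pools can't be scaled to zero.
az aks nodepool scale \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name userpool \
  --node-count 0
```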
Design recommendations
Follow these best practices for your design:
- Use virtual machine scale sets (VMSS), which are required for scenarios including autoscaling, multiple node pools, and Windows node pool support.
- Don't manually enable or edit VMSS scalability settings in the Azure portal or with the Azure CLI. Instead, let the cluster autoscaler manage them.
- If you need fast burst autoscaling, burst from the AKS cluster by using Azure Container Instances and virtual nodes, which provide rapid, virtually unlimited scalability and per-second billing.
- Use the cluster autoscaler and scale-to-zero for predictable scalability with VM-based worker nodes.
- Enable the cluster autoscaler to meet application demands.
- You can enable the autoscaler on multiple node pools.
- Enable the horizontal pod autoscaler (HPA) to handle peak-hour demand for your application.
- All your containers and pods must have resource requests and limits defined.
- HPA automatically scales the number of pods based on observed CPU or memory utilization (measured against the declared resource requests) or on custom metrics.
- Enable Azure Monitor for containers, including its live monitoring feature, to monitor cluster and workload utilization.
- Use multiple node pools when your applications have different resource requirements.
- Remember you can specify a VM size for a node pool.
- Consider Spot VM-based node pools for non-time-sensitive workloads that can handle interruptions and evictions.
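The recommendations on node pools above can be combined in a single Azure CLI command. This sketch adds a GPU node pool spread across Availability Zones; the VM size, zone numbers, and names are placeholder assumptions:

```azurecli
# Add a node pool with a specific (GPU) VM size, spread across
# Availability Zones 1-3. Names and VM size are examples.
az aks nodepool add \
  --resource-group myResourceGroup \
  --cluster-name myAKSCluster \
  --name gpupool \
  --node-vm-size Standard_NC6s_v3 \
  --zones 1 2 3 \
  --node-count 3
```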