Exercise - Create your HorizontalPodAutoscaler

15 minutes

Exercise - Scaling an application

Create an AKS cluster

Before you can start scaling your application, you need to create an AKS cluster with the required resources.

Sign in to the Azure Cloud Shell with the account you want to deploy resources into and select Bash as the running shell.
Create a resource group using the az group create command. The following example creates a resource group named myResourceGroup in the eastus location:
```
az group create --name myResourceGroup --location eastus
```
Create an AKS cluster using the az aks create command. The following example creates a cluster named myAKSCluster in the myResourceGroup resource group. The cluster has one node and uses the Standard_DS2_v2 VM size.
```
az aks create --resource-group myResourceGroup --name myAKSCluster --node-count 1 --node-vm-size Standard_DS2_v2 --enable-app-routing --generate-ssh-keys
```
The command takes a few minutes to complete.

Get the credentials for the cluster using the az aks get-credentials command.

```
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
```

Verify that the cluster is running and that you can connect to it using the kubectl get nodes command.
```
kubectl get nodes
```
The command should return one node with a status of Ready.
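If the node isn't Ready yet, you can let `kubectl wait` poll for you instead of rerunning the command. The following is a minimal sketch: the `wait_for_nodes` function name and the 300-second default timeout are assumptions for illustration and aren't part of the exercise.

```shell
# Hypothetical helper: wait until every node reports the Ready condition.
# The 300s default timeout is an assumed value; adjust it for your cluster.
wait_for_nodes() {
  timeout="${1:-300s}"
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}

# Usage (requires a reachable cluster):
#   wait_for_nodes 300s
```

The helper returns a nonzero exit code if the timeout expires before all nodes are Ready.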

Deploy the application resources

Now that you have a cluster, you can deploy the application to it.

Deploy the application

Create the application namespace using the kubectl create namespace command.
```
kubectl create namespace hpa-contoso
```

Create a new file named deployment.yml in the Cloud Shell editor and paste the following YAML code into it:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: contoso-website
  namespace: hpa-contoso
spec:
  replicas: 1
  selector:
    matchLabels:
      app: contoso-website
  template:
    metadata:
      labels:
        app: contoso-website
    spec:
      containers:
        - name: contoso-website
          image: mcr.microsoft.com/mslearn/samples/contoso-website
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 250m
              memory: 256Mi
          ports:
            - containerPort: 80
```

Save the file.
Deploy the application to the cluster using the kubectl apply command.
```
kubectl apply -f deployment.yml
```
Your output should look similar to the following example output:
```
deployment.apps/contoso-website created
```
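The `created` message only confirms that the API server accepted the resource, not that the pod is running. As a hedged sketch, `kubectl rollout status` can confirm that the rollout finished. The `wait_for_rollout` function name and the 120-second timeout are assumptions made for this example.

```shell
# Hypothetical helper: block until the Deployment's pods are rolled out.
# The 120s timeout is an assumed value, not something the exercise specifies.
wait_for_rollout() {
  deployment="$1"
  namespace="$2"
  kubectl rollout status "deployment/$deployment" --namespace "$namespace" --timeout=120s
}

# Usage (requires a reachable cluster):
#   wait_for_rollout contoso-website hpa-contoso
```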

Create a DNS zone and deploy the ingress resource

Create an Azure DNS zone using the az network dns zone create command. The following example creates a DNS zone named contoso-website.com:
```
az network dns zone create --resource-group myResourceGroup --name contoso-website.com
```
Get the resource ID for your DNS zone using the az network dns zone show command and save the output to a variable named DNS_ZONE_ID.
```
DNS_ZONE_ID=$(az network dns zone show --resource-group myResourceGroup --name contoso-website.com --query id --output tsv)
```
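If the lookup fails, `DNS_ZONE_ID` ends up empty and the next command receives an empty `--ids` value. A small guard like the following can catch that early. This is a sketch only: `require_value` is a hypothetical helper name, not an Azure CLI feature.

```shell
# Hypothetical guard: report an error if a required value is empty,
# otherwise print the name and value so you can inspect it.
require_value() {
  name="$1"
  value="$2"
  if [ -z "$value" ]; then
    echo "error: $name is empty" >&2
    return 1
  fi
  echo "$name=$value"
}

# Usage:
#   require_value DNS_ZONE_ID "$DNS_ZONE_ID"
```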

Update the application routing cluster add-on to enable Azure DNS integration using the az aks approuting zone command.

```
az aks approuting zone add --resource-group myResourceGroup --name myAKSCluster --ids=${DNS_ZONE_ID} --attach-zones
```

Create a file named ingress.yml in the Cloud Shell editor and paste the following YAML code into it. Make sure you replace the <dns-zone-name> placeholder with the name of your DNS zone.

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: contoso-website
  namespace: hpa-contoso
  annotations:
spec:
  ingressClassName: webapprouting.kubernetes.azure.com
  rules:
  - host: <dns-zone-name>
    http:
      paths:
      - backend:
          service:
            name: contoso-website
            port:
              number: 80
        path: /
        pathType: Prefix
```

Save the file.
Deploy the ingress resource to the cluster using the kubectl apply command.
```
kubectl apply -f ingress.yml
```
Your output should look similar to the following example output:
```
ingress.networking.k8s.io/contoso-website created
```

Create the service resource

Create a file named service.yml in the Cloud Shell editor and paste the following YAML code into it:

```
apiVersion: v1
kind: Service
metadata:
  name: contoso-website
  namespace: hpa-contoso
spec:
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 80
  selector:
    app: contoso-website
```

Save the file.
Deploy the service resource to the cluster using the kubectl apply command.
```
kubectl apply -f service.yml
```
Your output should look similar to the following example output:
```
service/contoso-website created
```

Create a HorizontalPodAutoscaler

Create a file named hpa.yml in the Cloud Shell editor and paste the following YAML code into it:
```
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: contoso-website
  namespace: hpa-contoso
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: contoso-website
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 20
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 50
```
The scaleTargetRef values must match the deployment you created. In this case, the deployment uses the apps/v1 apiVersion and is named contoso-website. The first metric tells the HPA to query the native CPU metric. Utilization targets are measured against each container's resource requests, so an average of 20% corresponds to 20m of the 100m CPU request in deployment.yml. When average utilization across the pods stays above that target, the HPA scales the deployment out. The HPA calculates the number of replicas with the following equation:
```
desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
```
The minReplicas and maxReplicas keys define the minimum and maximum number of replicas the deployment can have. The metrics key lists the metrics the HPA queries to scale the deployment, in this case CPU and memory. The HPA evaluates each metric separately and uses the largest resulting replica count, so the deployment scales out if average CPU utilization goes above 20% or average memory utilization goes above 50%.
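To make the equation concrete, the following shell sketch works through one scenario. It isn't part of hpa.yml, and the utilization figures (2 replicas, 50% CPU, 40% memory) are invented for illustration. It also simplifies the real controller, which applies a tolerance and stabilization behavior that this example ignores.

```shell
# Ceiling of currentReplicas * currentMetricValue / desiredMetricValue,
# computed with integer arithmetic.
desired_replicas() {
  current_replicas="$1"
  current_metric="$2"
  desired_metric="$3"
  echo $(( (current_replicas * current_metric + desired_metric - 1) / desired_metric ))
}

# Clamp a replica count to the minReplicas/maxReplicas range.
clamp_replicas() {
  replicas="$1"
  min="$2"
  max="$3"
  if [ "$replicas" -lt "$min" ]; then replicas="$min"; fi
  if [ "$replicas" -gt "$max" ]; then replicas="$max"; fi
  echo "$replicas"
}

# Assumed scenario: 2 replicas, CPU at 50% (target 20%), memory at 40% (target 50%).
cpu=$(desired_replicas 2 50 20)      # ceil(2 * 50 / 20) = 5
memory=$(desired_replicas 2 40 50)   # ceil(2 * 40 / 50) = 2

# Use the larger per-metric result, then apply the replica bounds from hpa.yml.
if [ "$cpu" -gt "$memory" ]; then wanted="$cpu"; else wanted="$memory"; fi
clamp_replicas "$wanted" 1 10        # prints 5
```

In this scenario the CPU metric drives the decision: memory alone would keep the deployment at 2 replicas, but CPU calls for 5.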
Save the file.
Create the HPA using the kubectl apply command.
```
kubectl apply -f hpa.yml
```
Your output should look similar to the following example output:
```
horizontalpodautoscaler.autoscaling/contoso-website created
```

Check the results

Query the metrics and usage of the HPA using the kubectl get hpa command.
```
kubectl get hpa --namespace hpa-contoso
```
Your output should look similar to the following example output:
```
NAME              REFERENCE                    TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
contoso-website   Deployment/contoso-website   0%/20%, 0%/50%   1         10        1          83s
```
Notice the TARGETS column. Each entry shows the current usage of a metric defined in the HPA, followed by its target. In this case, CPU and memory usage are both 0% because the application isn't receiving any traffic.

Note

The HPA might show unknown metrics for the first few seconds while it waits for the metrics API to return data.
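If you want to script that check, one option is to look for the `<unknown>` marker in the HPA output. The sketch below assumes that kubectl prints `<unknown>` in the TARGETS column while metrics are missing. `hpa_metrics_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: read `kubectl get hpa` output on stdin and succeed
# only if no target is still reported as <unknown>.
hpa_metrics_ready() {
  if grep -q '<unknown>'; then
    return 1
  fi
  return 0
}

# Usage (requires a reachable cluster):
#   kubectl get hpa contoso-website --namespace hpa-contoso --no-headers | hpa_metrics_ready \
#     && echo "metrics available" || echo "metrics not reported yet"
```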