AKS Virtual Node quota "leak" with KEDA ScaledJob
We recently enabled the virtual node on our AKS cluster and use it to run various background jobs. These jobs are triggered by KEDA, specifically via a ScaledJob. When KEDA sees a message in a Service Bus queue, it correctly creates a Job (from the ScaledJob) to pull the message and process it in AKS. While the job is running, our ACI core quota is decreased by 1, which is expected.
KEDA's ScaledJob has a successfulJobsHistoryLimit setting (how many completed Job objects to retain), which defaults to 100.
It appears that even though a job is complete and no pod is actually running (i.e., no compute is happening), the Azure ACI quota is still decreased by one for every completed job that is kept in the cluster due to the successfulJobsHistoryLimit and failedJobsHistoryLimit settings.
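For reference, this is roughly the shape of the ScaledJob we're running (a minimal sketch; the name, image, queue name, secret, and connection env var are placeholders, and the nodeSelector/tolerations are the usual virtual-node ones from the AKS docs). The two history-limit fields near the bottom are the ones in question:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: background-worker                  # placeholder name
spec:
  jobTargetRef:
    template:
      spec:
        containers:
        - name: worker
          image: myregistry.azurecr.io/worker:latest   # placeholder image
          resources:
            requests:
              cpu: "1"                     # this is what counts against the ACI vCPU quota
          env:
          - name: SERVICEBUS_CONNECTION_STRING
            valueFrom:
              secretKeyRef:
                name: servicebus-secret    # placeholder secret
                key: connection-string
        restartPolicy: Never
        nodeSelector:                      # schedule onto the ACI virtual node
          kubernetes.io/role: agent
          type: virtual-kubelet
        tolerations:
        - key: virtual-kubelet.io/provider
          operator: Exists
        - key: azure.com/aci
          effect: NoSchedule
  pollingInterval: 30
  successfulJobsHistoryLimit: 100          # KEDA default
  failedJobsHistoryLimit: 100              # KEDA default
  triggers:
  - type: azure-servicebus
    metadata:
      queueName: my-queue                                 # placeholder queue
      messageCount: "1"
      connectionFromEnv: SERVICEBUS_CONNECTION_STRING     # placeholder env var
```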
For example, if my ACI vCPU quota is 10 and I set successfulJobsHistoryLimit = 2:
The first job runs and consumes one vCPU of my quota (I have 9 left).
The second job runs and consumes one vCPU of my quota (I have 8 left).
Some minutes later, both jobs finish... but my quota keeps showing only 8 vCPU available for as long as those completed jobs are retained. I have "leaked" two vCPUs of my quota because the Job objects are kept alive in the cluster, even though nothing is actively running.
This seems related to: https://github.com/kedacore/keda/issues/4536
Basically, if you don't set your job history limits to zero, you'll "leak" that much quota.
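In other words, the only workaround I can see (assuming the linked issue is the same problem) is to not retain finished jobs at all, i.e. something like this excerpt of the ScaledJob spec:

```yaml
spec:
  successfulJobsHistoryLimit: 0   # don't keep completed Job objects around
  failedJobsHistoryLimit: 0       # ...at the cost of losing the retained job history/logs
```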
Is this expected / known / standard behavior?