AKS Virtual Node quota "leak" with KEDA ScaledJob

Jason Berk 20 Reputation points
2025-03-11T13:20:57.1733333+00:00

We recently enabled the virtual node on our AKS cluster and use it to run various background jobs. These jobs are triggered by KEDA, specifically via a ScaledJob. When KEDA sees a message in a Service Bus queue, it correctly starts a Job (from the ScaledJob) to pull the message and process it in AKS. While the job is running, our ACI core quota is decreased by one (which is expected).

KEDA's ScaledJob has a "successfulJobsHistoryLimit" setting that controls how many completed Jobs are retained in the cluster; it defaults to 100.
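For reference, here's a rough sketch of what our ScaledJob looks like (the name, image, and queue are placeholders, and the node selector/toleration are just the usual AKS virtual node settings); the history limits in question sit at the top level of the ScaledJob spec:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: background-worker            # placeholder name
spec:
  jobTargetRef:
    template:
      spec:
        nodeSelector:
          type: virtual-kubelet       # schedule onto the AKS virtual node (ACI)
        tolerations:
          - key: virtual-kubelet.io/provider
            operator: Exists
        containers:
          - name: worker
            image: myregistry.azurecr.io/worker:latest   # placeholder image
        restartPolicy: Never
  successfulJobsHistoryLimit: 100     # default: keep 100 completed Jobs
  failedJobsHistoryLimit: 100
  triggers:
    - type: azure-servicebus
      metadata:
        queueName: background-jobs    # placeholder queue name
        connectionFromEnv: SERVICEBUS_CONNECTION
```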

It appears that even though the job is complete and no pod is actually running (i.e., no compute is happening), the Azure ACI quota is decreased by one for every completed job that's kept in the cluster due to the successfulJobsHistoryLimit and failedJobsHistoryLimit settings.

For example, if my ACI vCPU quota is 10 and I set successfulJobsHistoryLimit = 2:

The first job runs and consumes one vCPU of my quota (9 left).

The second job runs and consumes one vCPU of my quota (8 left).

Some minutes later, both jobs finish... but my quota will forever say I only have 8 vCPUs available. I have "leaked" two vCPUs of my quota because the completed Job objects are kept in the cluster, even though nothing is actively running.

This seems related to https://github.com/kedacore/keda/issues/4536

Basically, unless you set your job history limits to zero, you'll "leak" that much quota.
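The only workaround I've found so far is to keep no history at all, i.e. something like this in the ScaledJob spec (plus manually deleting any already-retained Jobs with `kubectl delete job <name>`):

```yaml
spec:
  successfulJobsHistoryLimit: 0   # don't keep completed Job objects around
  failedJobsHistoryLimit: 0       # (at the cost of losing job history in the cluster)
```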

Is this expected / known / standard behavior?

