Hi Dean,
The points below explain scale-in behavior with target-based scaling:
When the queue is empty, target-based scaling determines that fewer instances are needed and scales in accordingly, while drain mode ensures that in-progress work is allowed to complete.
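To make the scale-in decision concrete, here is a minimal sketch of the calculation that target-based scaling is documented to use: desired instances = event source length / target executions per instance. The function name, the rounding up, the clamping to minimum and maximum limits, and the sample numbers are my assumptions for illustration only. This is not the platform's actual implementation.

```python
import math


def desired_instances(queue_length: int, target_per_instance: int,
                      min_instances: int = 0, max_instances: int = 100) -> int:
    """Estimate the instance count for a given queue length.

    Hypothetical helper: rounding up and clamping are assumptions,
    not documented platform behavior.
    """
    if target_per_instance <= 0:
        raise ValueError("target_per_instance must be positive")
    # Core idea: event source length divided by target executions per instance.
    raw = math.ceil(queue_length / target_per_instance)
    # Keep the result within the configured (assumed) instance limits.
    return max(min_instances, min(raw, max_instances))


# 160 queued messages with an assumed target of 16 executions per instance.
print(desired_instances(160, 16))                   # 10
# Empty queue: nothing is needed beyond the assumed minimum of one instance.
print(desired_instances(0, 16, min_instances=1))    # 1
```

With an empty queue the calculation yields zero, so the platform would scale in to whatever minimum instance count your plan keeps running.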
Azure Functions on the Premium plan has drain mode enabled by default, which means that when a scale-in occurs, the instances being shut down are given time to complete their active executions.
If a function is in progress when the scale-in happens, Azure Functions will not terminate the instance immediately. Instead, the function will be given a grace period to complete the in-progress requests.
Graceful shutdown: When the platform scales in, it waits, within the grace period, for the function to complete its current execution before terminating the instance.
Drain mode: When an instance is being scaled down, it will still handle any in-progress requests but will not take on new tasks.
When the host is in drain mode:
- It stops listening for new incoming requests.
- A cancellation token is passed as a parameter to the function invocation.
- The scale-in operation is performed.
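To illustrate how a function can cooperate with drain mode, here is a small simulation of the pattern: the work item already in progress is finished, and nothing new is started once cancellation is signaled. This is a sketch only. `threading.Event` stands in for the cancellation token, and `process_batch` and `simulated_handler` are hypothetical names, not part of the Azure Functions API.

```python
import threading


def process_batch(items, cancel: threading.Event, handle):
    """Process items one by one, checking for cancellation between items.

    Returns a tuple (done, remaining). An item that has already started
    is always finished; no new item is started after cancellation.
    """
    done = []
    for index, item in enumerate(items):
        if cancel.is_set():
            # Drain requested: leave the rest for another instance.
            return done, list(items[index:])
        handle(item)
        done.append(item)
    return done, []


cancel = threading.Event()  # stand-in for the cancellation token


def simulated_handler(item):
    # Simulate the host entering drain mode while "b" is being processed.
    if item == "b":
        cancel.set()


done, remaining = process_batch(["a", "b", "c"], cancel, simulated_handler)
print(done)       # ['a', 'b']
print(remaining)  # ['c']
```

In a real function you would check the token at safe points in your own code (for example, between messages in a batch) and make sure unfinished work is not marked as completed, so that another instance can pick it up.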
Durable Functions can indeed help with long-running tasks, but they use the same Azure Functions scaling model. If you have multiple messages in a queue, Azure Functions will still try to scale out to process the messages concurrently.
Hope this helps.
If the answer is helpful, please click Accept Answer and kindly upvote it. If you have any further questions about this answer, please click Comment.