Random Instance Errors on Azure Function App (Python) - Elastic Plan

Giel Oomen 36 Reputation points
2025-01-10T10:58:28.1833333+00:00

Hi all,

After switching the App Service plan (ASP) for our function app to Elastic Premium, we are having uptime issues with the app. Below are details about the setup and what we have checked and tried so far.

Info about the app:

  • It hosts our public facing API
  • Python 3.11
  • Elastic Premium (EP1)
  • Always On is turned OFF
  • 1 always ready instance
  • Requests come in through JWT verified calls in API Management
  • Health check is set up and uses a custom endpoint that checks all client connections

What happens:

The instances sometimes fail, but so far we have been unable to find out why. The Microsoft docs suggest always running multiple instances to avoid a single point of failure, which we will also do, but we first want to find the root cause.

Logs:

In Application Insights we see RpcExceptions, TokenExpired and ClientConnectionFailures. These are all errors that can only be thrown while the instance is running, so they are not the cause.

Then we dove deeper and looked into Kudu logs which show the following on a failing instance:
ERROR - Container func-###-###-api_0_a244931d for site func-###-###-api has exited, failing site start

So it appears sometimes containers fail to start, but... it doesn't show why.

All the Kudu logs show this line:
2025-01-08T14:15:29.492Z INFO - Logging is not enabled for this container.

Is it possible to enable container logging? And will it make a difference if we have it?

In total, the Kudu logs for a container are ~30 lines of normal boot-up info (image up to date, pulling image, etc.). Failing containers only seem to occur while the API is being used, but not every time. We track what users do through LogRocket, and when we replay their exact steps we have never been able to reproduce a container failure.
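To line container failures up against API traffic, the Kudu log files can be scanned for the "has exited, failing site start" line. The following is only a rough sketch: the regular expressions assume the two line shapes quoted above, and the site name `func-example-api`, the container suffixes, and the timestamp on the ERROR line in the sample are made up for illustration.

```python
import re
from collections import Counter

# Assumption: log lines look like the two samples quoted in the post.
EXIT_RE = re.compile(
    r"ERROR - Container (?P<container>\S+) for site (?P<site>\S+) "
    r"has exited, failing site start"
)
TIMESTAMP_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z)\s"
)


def find_failed_starts(lines):
    """Return one dict per 'has exited, failing site start' line."""
    failures = []
    for line in lines:
        match = EXIT_RE.search(line)
        if match is None:
            continue
        ts = TIMESTAMP_RE.match(line)
        failures.append({
            # None when the line carries no leading timestamp.
            "timestamp": ts.group("ts") if ts else None,
            "container": match.group("container"),
            "site": match.group("site"),
        })
    return failures


def failures_per_hour(failures):
    """Count failures per UTC hour (e.g. '2025-01-08T14') for correlation."""
    return Counter(f["timestamp"][:13] for f in failures if f["timestamp"])


# Hypothetical sample input; names and the second timestamp are invented.
sample = [
    "2025-01-08T14:15:29.492Z INFO - Logging is not enabled for this container.",
    "2025-01-08T14:16:02.101Z ERROR - Container func-example-api_0_a244931d "
    "for site func-example-api has exited, failing site start",
    "ERROR - Container func-example-api_1_deadbeef "
    "for site func-example-api has exited, failing site start",
]
failed = find_failed_starts(sample)
hourly = failures_per_hour(failed)
```

The per-hour counts can then be compared with request volume in Application Insights to see whether failed starts cluster around scale-out events or specific traffic patterns.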

If anyone could point us in the right direction to find the root cause, we would greatly appreciate it. If specific logs would help find the origin, please let me know so I can provide them.

Azure Functions

1 answer

  1. Pinaki Ghatak 5,575 Reputation points Microsoft Employee
    2025-01-13T10:09:58.4833333+00:00

    Hello @Giel Oomen

    To troubleshoot the issue you described, you can try the following steps:

    1. Enable container logging: You can enable container logging by setting the WEBSITE_CONTAINER_LOGGING_ENABLED app setting to true. This will enable logging for your containers and may provide more information about why they are failing to start.
    2. Check the health of your function app: You mentioned that you have set up a health check endpoint. Make sure that this endpoint is returning a healthy status and that it is being called regularly. You can also try increasing the frequency of the health checks to see if this helps.
    3. Check the resource utilization of your function app: Make sure that your function app is not running out of resources, such as CPU or memory. You can use the Azure portal or Azure Monitor to monitor the resource utilization of your function app.
    4. Check for any code issues: Make sure that your code is not causing the failures. Check for any errors or exceptions in your code that may be causing the containers to fail.
    5. Increase the number of instances: As you mentioned, having multiple instances running can help mitigate the risk of a single point of failure. You can try increasing the number of instances to see if this helps.
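For step 2, it can help if the health endpoint reports which dependency failed instead of returning a bare error, since a single slow or failing client connection would otherwise look the same as a broken instance. The following is a minimal sketch of that idea, assuming the health probe treats any non-2xx response as unhealthy. The function `run_health_checks` and the check names `database` and `storage` are hypothetical and not part of the Azure Functions SDK; the returned status and body would be wrapped in whatever HTTP response type the function uses.

```python
import json
import time


def run_health_checks(checks):
    """Run each named check and report per-dependency results.

    `checks` maps a dependency name to a zero-argument callable that
    raises on failure. Returns a tuple (http_status, json_body).
    """
    results = {}
    healthy = True
    for name, check in checks.items():
        started = time.perf_counter()
        try:
            check()
            results[name] = {"ok": True}
        except Exception as exc:
            # Record the failure instead of letting the probe itself crash.
            healthy = False
            results[name] = {
                "ok": False,
                "error": f"{type(exc).__name__}: {exc}",
            }
        # Duration per check makes slow dependencies visible in the logs.
        results[name]["ms"] = round((time.perf_counter() - started) * 1000, 1)
    status = 200 if healthy else 503
    return status, json.dumps({"healthy": healthy, "checks": results})


# Hypothetical checks standing in for real client connections.
def check_database():
    pass  # e.g. run a trivial query


def check_storage():
    raise ConnectionError("timeout")


status, body = run_health_checks(
    {"database": check_database, "storage": check_storage}
)
```

Logging the returned body on every non-200 result gives a record of which dependency was failing at the time an instance was marked unhealthy.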

    This should get you started. I hope this helps.

