Azure 502 Bad Gateway Issue
I use an Application Gateway with WAF in front of our web application, which is deployed on a single Azure VM.
When I access the application through App GW from the browser, I sometimes get a 502 Bad Gateway error. The App GW health probe responds with, "Cannot connect to backend server. Check whether any NSG/UDR/Firewall is blocking access to the server. Check if application is running on correct port."
The issue is intermittent. It occurs when I hit the server multiple times, and it resolves by itself after some time or if I clear the browser cache.
In the App GW log, I sometimes get "error_info_s: ERRORINFO_UPSTREAM_NO_LIVE" or "ERRORINFO_UPSTREAM_CLOSED_CONNECTION".
Is this expected App GW behaviour, or is there a solution to fix the issue? I would appreciate your suggestions or experiences.
Azure Application Gateway
-
Sai Prasanna Sinde • 3,410 Reputation points • Microsoft Vendor
2025-01-15T22:32:06.86+00:00 Welcome to the Microsoft Q&A Platform! Thank you for asking your question here.
Please go through the below points:
- Your VM might be intermittently overloaded (like high CPU, memory, or network usage), making it unable to respond to requests from the application gateway health probes or user traffic in a timely manner.
- Please use Azure Monitor to track CPU, memory, network, and disk I/O on your VM. Look for spikes or sustained high usage around the times when the 502 errors occur. For your reference: https://learn.microsoft.com/en-us/azure/virtual-machines/monitor-vm#:~:text=Simplified%20onboarding%20of%20the%20Azure%20Monitor%20agent%20and%20the%20Dependency%20agent%2C%20so%20that%20you%20can%20monitor%20a%20virtual%20machine%20(VM)%20guest%20operating%20system%20and%20workloads.
- The web application running on your VM might be restarting intermittently, causing temporary unavailability. Please try to review your web application's logs for errors or exceptions that might indicate crashes.
- If the backend server closes idle connections while the application gateway is still sending traffic on them, requests can fail. Make sure that the backend server's keep-alive timeout is greater than the application gateway idle timeout.
- Try to review the NSG rules associated with both your application gateway subnet and your VM's subnet. Ensure there are rules allowing inbound traffic on the ports used by your application (like 80 or 443) from the application gateway IP range.
- Verify that your UDRs are not directing traffic away from your VM. Ensure that traffic destined for your VM's IP address is routed correctly.
- If you're using the VM's firewall, ensure that there's a rule allowing inbound traffic on the required ports from the application gateway IP address range.
- Ensure the health probe interval is not too short (30 sec is common). A very short interval can overwhelm a busy server.
- Make sure the timeout is appropriate (like 30 sec). It should be long enough for the server to respond even under load.
- An appropriate threshold prevents flapping between healthy and unhealthy states. Verify the port used in your backend settings matches the port your application is listening on within the VM.
- If you have enabled connection draining, be sure to configure an appropriate timeout for it.
- Analyze your application gateway WAF logs to see if any requests are being blocked around the time of the 502 errors.
- Make sure that your DNS records are correctly configured and that the DNS servers you're using are reliable.
- As you mentioned, the issue resolves by itself or after clearing the browser cache, so temporary network glitches or routing problems within Azure's infrastructure could be causing the connection failures.
- Enable diagnostic logging for your application gateway and send the logs to a Log Analytics workspace. This will give you detailed insights into health probe failures, request routing, and WAF activity.
- Use Network Watcher's connection troubleshoot feature to diagnose network connectivity issues between your application gateway and VM. For your reference: https://learn.microsoft.com/en-us/azure/network-watcher/connection-troubleshoot-portal#:~:text=In%20this%20article%2C%20you%20learn%20how%20to%20use%20the%20connection%20troubleshoot%20feature%20of%20Azure%20Network%20Watcher%20to%20diagnose%20and%20troubleshoot%20connectivity%20issues.%20For%20more%20information%20about%20connection%20troubleshoot%2C%20see%20Connection%20troubleshoot%20overview.
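To illustrate the point about probe thresholds, here is a minimal Python sketch of how consecutive probe failures can flip a backend between healthy and unhealthy. It is not the gateway's actual implementation. It assumes the backend is marked unhealthy after `unhealthy_threshold` consecutive failed probes and healthy again on the next successful probe; the names `ProbeState`, `apply_probe`, and `simulate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ProbeState:
    healthy: bool = True
    consecutive_failures: int = 0


def apply_probe(state: ProbeState, probe_ok: bool, unhealthy_threshold: int = 3) -> ProbeState:
    """Return the next backend state after one probe result.

    Assumption (not taken from the thread): the backend turns unhealthy after
    `unhealthy_threshold` consecutive failures and recovers on the next success.
    """
    if probe_ok:
        return ProbeState(healthy=True, consecutive_failures=0)
    failures = state.consecutive_failures + 1
    still_healthy = state.healthy and failures < unhealthy_threshold
    return ProbeState(healthy=still_healthy, consecutive_failures=failures)


def simulate(probe_results: Iterable[bool], unhealthy_threshold: int = 3) -> List[bool]:
    """Return the healthy/unhealthy flag observed after each probe, in order."""
    state = ProbeState()
    timeline = []
    for ok in probe_results:
        state = apply_probe(state, ok, unhealthy_threshold)
        timeline.append(state.healthy)
    return timeline


if __name__ == "__main__":
    # One good probe, four failed probes, then a recovery.
    results = [True, False, False, False, False, True]
    for i, healthy in enumerate(simulate(results)):
        print(f"probe {i}: {'healthy' if healthy else 'unhealthy'}")
```

Under these assumptions, a single-VM backend pool has no live server while the flag is false, which is consistent with a 502 that clears on its own once a probe succeeds again.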
Kindly let us know if the above helps or you need further assistance on this issue.
Thanks,
Sai.
-
Sai Prasanna Sinde • 3,410 Reputation points • Microsoft Vendor
2025-01-16T18:54:39.1933333+00:00 Hi @Mohammed Shafi,
Greetings of the day!
Just checking in to see if you had a chance to see my response to your question. Please tell us if it was helpful and feel free to reach out to us if you have any queries.
Thanks,
Sai.
-
Sai Prasanna Sinde • 3,410 Reputation points • Microsoft Vendor
2025-01-17T17:55:41.6566667+00:00 Hi @Mohammed Shafi,
Hope you are having a great day.
Just checking in to see if you have got a chance to see my response to your question in resolving the issue.
If you are still facing any further issues, please don't hesitate to reach out to us. We are happy to assist you.
Looking forward to your response and appreciate your time on this.
Cheers,
Sai.
-
Sai Prasanna Sinde • 3,410 Reputation points • Microsoft Vendor
2025-01-27T01:47:11.85+00:00 Hi @Mohammed Shafi,
Hope you are having a great day.
I wanted to check if you have had the chance to review the answer posted above and if it is helpful in resolving your issue.
If it was helpful, please click "Upvote and Accept Answer" on this post to let us know.
We're here to help, so if you have any further questions, don't hesitate to ask.
Thanks,
Sai.
-
Mohammed Shafi • 20 Reputation points
2025-01-27T15:52:49.7066667+00:00 Hi Sai.
Sorry for my late response. I still could not find a solution.
Please see my answers in Bold and Italic text, and please let me know your thoughts and suggestions to fix the issues.
Thanks in Advance.
- Your VM might be intermittently overloaded (like high CPU, memory, or network usage), making it unable to respond to requests from the application gateway health probes or user traffic in a timely manner. ***I checked VM Insights: no high CPU, memory, or network usage. But when client requests come frequently through the gateway, the VM becomes unreachable.***
- Please use Azure Monitor to track CPU, memory, network, and disk I/O on your VM. Look for spikes or sustained high usage around the times when the 502 errors occur. For your reference: https://learn.microsoft.com/en-us/azure/virtual-machines/monitor-vm#:~:text=Simplified%20onboarding%20of%20the%20Azure%20Monitor%20agent%20and%20the%20Dependency%20agent%2C%20so%20that%20you%20can%20monitor%20a%20virtual%20machine%20(VM)%20guest%20operating%20system%20and%20workloads. ***Already enabled, but it has not led to a solution for our issue.***
- The web application running on your VM might be restarting intermittently, causing temporary unavailability. Please try to review your web application's logs for errors or exceptions that might indicate crashes. ***That is unlikely. We run the same application on another Azure VM without the gateway setup and have no issues when accessing the app directly. The issue only occurs when the app is accessed through the gateway.***
- If the backend server closes idle connections while the application gateway is still sending traffic on them, requests can fail. Make sure that the backend server's keep-alive timeout is greater than the application gateway idle timeout. ***In the application gateway backend settings I extended the request timeout to 220 seconds. We are using HTTP/2 connections to the frontend IP address on the Application Gateway v2 SKU, where the idle timeout is set to 180 seconds and is nonconfigurable.***
- Try to review the NSG rules associated with both your application gateway subnet and your VM's subnet. Ensure there are rules allowing inbound traffic on the ports used by your application (like 80 or 443) from the application gateway IP range. ***No NSG is configured for the application gateway. The VM uses an NSG, and ports 80 and 443 are allowed in its inbound rules.***
- Verify that your UDRs are not directing traffic away from your VM. Ensure that traffic destined for your VM's IP address is routed correctly. ***We are not using any UDRs; we use the default system routing.***
- If you're using the VM's firewall, ensure that there's a rule allowing inbound traffic on the required ports from the application gateway IP address range. ***The issue occurs even with the VM OS firewall disabled.***
- Ensure the health probe interval is not too short (30 sec is common). A very short interval can overwhelm a busy server. ***I tested the default 30-second health probe and a custom probe (60 seconds), but no luck.***
- Make sure the timeout is appropriate (like 30 sec). It should be long enough for the server to respond even under load. ***I also tested with 60 seconds. What is the maximum we can set here?***
- An appropriate threshold prevents flapping between healthy and unhealthy states. Verify the port used in your backend settings matches the port your application is listening on within the VM. ***In the app gateway backend settings we enabled port 443, and the listener is created for 443 only.***
- If you have enabled connection draining, be sure to configure an appropriate timeout for it. ***Connection draining is not enabled.***
- Analyze your application gateway WAF logs to see if any requests are being blocked around the time of the 502 errors. ***Already enabled. In the log we get errors like "ERRORINFO_UPSTREAM_NO_LIVE" or "ERRORINFO_UPSTREAM_CLOSED_CONNECTION". But the 502 error occurs even when we disable the WAF.***
- Make sure that your DNS records are correctly configured and that the DNS servers you're using are reliable. ***DNS is configured correctly, and we use the same DNS server for other application servers as well.***
- As you mentioned, the issue resolves by itself or after clearing the browser cache, so temporary network glitches or routing problems within Azure's infrastructure could be causing the connection failures. ***Sorry, it is not like that. Even if I close and reopen the browser, it is not fixed. It resolves automatically after a while.***
- Enable diagnostic logging for your application gateway and send the logs to a Log Analytics workspace. This will give you detailed insights into health probe failures, request routing, and WAF activity. ***Already enabled, but no solution for our issue.***
- Use Network Watcher's connection troubleshoot feature to diagnose network connectivity issues between your application gateway and VM. For your reference: https://learn.microsoft.com/en-us/azure/network-watcher/connection-troubleshoot-portal#:~:text=In%20this%20article%2C%20you%20learn%20how%20to%20use%20the%20connection%20troubleshoot%20feature%20of%20Azure%20Network%20Watcher%20to%20diagnose%20and%20troubleshoot%20connectivity%20issues.%20For%20more%20information%20about%20connection%20troubleshoot%2C%20see%20Connection%20troubleshoot%20overview. ***We are using the application gateway connection troubleshoot to get the connection details. During normal testing I get the troubleshoot details shown in image 1, and during the 502 error I get the details shown in image 2.***
Attachments: Image 1 - No Error.png, Image 2 - 502 Error Troubleshoot 1.png, Image 2 - 502 Error Troubleshoot 2.png
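Since the 502s appear under repeated requests and clear after a while, it may help to record exactly when each failure window starts and ends, so the timestamps can be compared with the gateway logs and probe results. The following is a rough Python sketch using only the standard library. The helper names `sample_once` and `failure_windows` are hypothetical, and the URL to sample is whatever endpoint you choose to test.

```python
import time
import urllib.error
import urllib.request
from typing import List, Tuple

Sample = Tuple[float, int]  # (unix timestamp in seconds, HTTP status code)


def sample_once(url: str, timeout_s: float = 10.0) -> Sample:
    """Issue one GET and return (timestamp, status); status 0 means no HTTP response."""
    now = time.time()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return (now, resp.status)
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (including 502) arrive here with the status in err.code.
        return (now, err.code)
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar transport errors.
        return (now, 0)


def failure_windows(samples: List[Sample], failing_status: int = 502) -> List[Tuple[float, float, int]]:
    """Group consecutive failing samples into (start, end, count) windows."""
    windows = []
    start = end = None
    count = 0
    for ts, status in sorted(samples):
        if status == failing_status:
            if start is None:
                start = ts
            end = ts
            count += 1
        elif start is not None:
            # A non-failing sample closes the current window.
            windows.append((start, end, count))
            start, end, count = None, None, 0
    if start is not None:
        windows.append((start, end, count))
    return windows
```

Calling `sample_once` in a loop against the gateway URL and passing the collected samples to `failure_windows` gives a list of outage windows. If their length roughly tracks the probe interval, that would point toward the backend being marked unhealthy rather than toward individual dropped connections.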