Azure Application Gateway not taking the timeout setting

I'm having an issue with the timeout of my Application Gateway WAF v2.
I set the timeout to 220 seconds, as shown in the following picture,
but I'm getting a 504 Gateway Timeout on a particular request at 100 seconds.
Is there anything else I need to take into account to make this timeout work?
[UPDATE]
The error is a 504.0 Gateway timeout.
If I force this error by putting a wait statement in my SP, the error is just a 504 Azure Gateway Timeout.
Thanks in advance

Most probably this is coming from your app service and not from the gateway.
Since you are getting a timeout at 100 seconds, this may be the default HTTP client timeout. You can check that the Application Gateway request timeout is set correctly with the command below.
az network application-gateway show --resource-group <replace with your resource group> --name <replace with your application gateway name> --query 'backendHttpSettingsCollection[].{name: name, requestTimeout: requestTimeout}'
If this comes out as expected (220 seconds, as you configured), then you need to look at your app service.
For example, if your backend is hosted in Azure App Service, the typical deployment is based on IIS, where the default connection timeout is 2 minutes.
You can override this behavior. In Azure App Service, you can use an XDT transform to change the connectionTimeout attribute of webLimits.
If a .NET application serves the request you mentioned and it uses HttpClient, then the default timeout is 100 seconds. You should set a timeout value that is greater than your Application Gateway request timeout:
HttpClient httpClient = new HttpClient();
httpClient.Timeout = TimeSpan.FromMinutes(10);
https://learn.microsoft.com/en-us/dotnet/api/system.net.http.httpclient.timeout?view=net-6.0
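One way to reason about why the request fails at 100 seconds even though the gateway is set to 220 is to treat the request path as a chain of timeouts, where the smallest one fires first. The following is a minimal Python sketch of that idea, not a diagnostic tool: the helper `first_timeout` and the hop names are hypothetical, and the values are simply the ones mentioned in this thread.

```python
def first_timeout(hops):
    """Return the (name, seconds) pair of the hop whose timeout fires first."""
    if not hops:
        raise ValueError("at least one hop is required")
    return min(hops.items(), key=lambda item: item[1])


# Values taken from this thread; the hop names are illustrative only.
hops = {
    "application_gateway_request_timeout": 220,
    "iis_connection_timeout": 120,
    "dotnet_httpclient_default_timeout": 100,
}

name, seconds = first_timeout(hops)
print(f"{name} fires first after {seconds} s")
```

Raising only the gateway value does not help while a smaller timeout remains elsewhere in the chain, which would match the 100-second failure described in the question.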

A 504 Gateway Timeout usually occurs when one or more backend servers cannot complete the request within the allotted time, so the gateway does not receive a timely response.
To verify whether the backend is taking too long to respond, you can enable diagnostic logs on the Application Gateway.
The access logs then show the time the backend took to respond.
Using these logs, you can view Application Gateway access patterns and analyze important information.
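As a rough illustration of that analysis, the sketch below filters exported access log records for 504 responses and reports how long each request took. It assumes one JSON record per line with a `properties` object containing `requestUri`, `httpStatus`, and `timeTaken`; those field names and the sample records are assumptions for illustration, so check them against your own exported logs.

```python
import json


def gateway_timeouts(log_lines, status=504):
    """Yield (request_uri, time_taken_seconds) for records with the given status."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        # Assumed layout: {"properties": {"requestUri": ..., "httpStatus": ..., "timeTaken": ...}}
        props = json.loads(line).get("properties", {})
        if int(props.get("httpStatus", 0)) == status:
            yield props.get("requestUri"), float(props.get("timeTaken", 0))


# Hypothetical sample records, not real log output.
sample = [
    '{"properties": {"requestUri": "/report", "httpStatus": 504, "timeTaken": 100.02}}',
    '{"properties": {"requestUri": "/health", "httpStatus": 200, "timeTaken": 0.01}}',
]

for uri, taken in gateway_timeouts(sample):
    print(uri, taken)
```

If the 504 records cluster around one duration (for example about 100 seconds), that points to a fixed timeout somewhere in the chain rather than to random backend slowness.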
Please check whether your domain is proxied via Cloudflare, as @Ked Mardemootoo commented.
Please note that:
As your connections are being dropped before the configured request timeout elapses, you need to find out which component is closing the connection.
If you cannot identify it, please contact Azure Support.
For more information, please refer to the links below:
azure public ip - Causes for Application Gateway Connection Timeout - Stack Overflow
Random 504 Gateway timeout while doing load test with application gateway - Microsoft Q&A

Related

Unhealthy backend after scaling up App Service plan

I have an application gateway running with a web application in an App Service plan. The application gateway listens and passes requests to the backend, which is the web app. There is a health probe implemented that works fine.
The web app was reachable fine until I scaled up the Service plan. Suddenly the health probe timed out reaching the backend and I got a 502 bad gateway error in the browser trying to reach the web application. After hours the website suddenly was back and the backend was healthy again. I was under the impression that you could scale up and down the App plan without any noticeable effect on the website, but it seems the gateway was not playing along.
Did I configure something wrong or should this work like I assumed?
I tried to reproduce the same in my environment: I created an app service running behind an application gateway and got a 502 error.
The number of TCP connections allowed depends on the plan. When scaling up or down in App Service, try to remain in the same tier so that the inbound IP address is not affected; otherwise, wait for some time and then scale back.
Try setting ARR Affinity to Off under Configuration -> General settings. This is appropriate when either your application isn't stateful or the session state is kept on a remote service such as a cache or database. Also try to run your application with a minimum of 2-3 instances to guard against failures.
You can also make use of App Service diagnostics, which gives you the right information to troubleshoot more easily.
For Reference:
Get started with autoscale in Azure - Azure Monitor| Microsoft
Guide to Running Healthy Apps - Azure App Service
I got the same error in the application gateway as well. To avoid the issue:
In your virtual network -> Service endpoints -> add the Microsoft.Web endpoint on the default subnet.

Azure Application Gateway backendpool to Event Grid Topic

I would like to ask whether it is possible to use Azure Application Gateway to route traffic to an Azure Event Grid Topic as its backend pool. I have tested it, but no matter what I do I always get a 502 error when I try to perform a POST request to my App Gateway.
Here is the sample flow:
[user] ===> [Application Gateway]====>[EventGrid Topic]====> Azure Function
Is this possible?
I got it working. Make sure you have the Application Gateway set up as follows:
There is no 'healthcheck' endpoint on Event Grid, so I just added the FQDN + /ping to the health probe and configured the probe to accept a 404.
In the HTTP settings, make sure that the incoming hostname gets overridden with the Event Grid FQDN by using the "Override with new hostname" option.
After that it should work just fine. Take a look at the rest of my HTTP settings below.

Azure Application Gateway won't forward Error Code 500

I'm facing the issue that Azure Application Gateway shows error 502 instead of forwarding the correct error message with HTTP code 500 from the underlying service.
Is that something one can configure? I don't want the application gateway to filter my error messages.
@thomasuebi, by default Azure Application Gateway sends periodic probes to the backend servers to check their health status. If a backend server does not respond successfully, Azure Application Gateway marks it as unhealthy. When a client request arrives for such an "unhealthy" backend server, the application gateway does not forward the request and instead returns a "502 Bad Gateway" error to the requesting client. You can go through this documentation for additional details.
You can also go through this document to set custom error pages instead of 502 bad gateway page.

Azure API Management (consumption tier): First request gives timeout and is not sent to backend service

I have a service running behind an Azure API Management instance running in the consumption tier. When no traffic has been sent to the API Management instance in a while (15 minutes isn't enough to trigger it, but an hour is), the first request sent takes about 3 minutes 50 seconds and returns a HTTP 500 with this body content:
<html><head><title>500 - The request timed out.</title></head><body> <font color ="#aa0000"> <h2>500 - The request timed out.</h2></font> The web server failed to respond within the specified time.</body></html>
Following requests work fine. Based on application logs and testing with an API Management instance pointing to my local machine via ngrok, it doesn't look like API management is even trying to connect to the backend for these requests. For the local test, I ran my app under the debugger, put a breakpoint in my service method (there's no auth that could get in the way) and watched the "output" window in Visual Studio. It never hit my breakpoint, and never showed anything in the output window for that "500 request timed out" request. When I made another request to API Management, it forwarded along to my service as expected, giving me output and hitting my breakpoint.
Is this some known issue with API Management consumption tier that I need to find some way to work around (ie. a service regularly pinging it)? Or a possible configuration issue with the way I've set up my API Management instance?
My API management instance is deployed via an ARM template using the consumption tier in North Central US and has some REST and some SOAP endpoints (this request I've been using for testing is one of the SOAP ones and uses the envelope header to specify the SOAP action).
Additional information:
The request in question is about 2KB, and a response from the server (which doesn't play into this scenario as the call never makes it to my server) is about 1KB, so it's not an issue with request/response sizes.
When I turn on request tracing (by sending the Ocp-Apim-Subscription-Key + Ocp-Apim-Trace headers), this 500 response I'm getting doesn't have the Ocp-Apim-Trace-Location header with the trace info that other requests do.
I get this behavior when I send 2 requests (to get the 4-minute 500 response and then a normal 5s 200 response), wait an hour, and make another request (which gets the 4-minute delay and 500 response), so I don't believe this could be related to the instance serving too much traffic (at least too much of my traffic).
Further testing shows that this happens about once every 60 to 90 minutes, even if I send one request every minute trying to keep the APIM instance "alive".
The HTTP 500 (Internal Server Error) status code indicates that the server encountered an unexpected condition that prevented it from fulfilling the request (possibly due to a large payload). There is no issue at the APIM level. Analyze the APIM inspector trace and you should see the HTTP 500 status code under the 'forward-request' response attribute.
You need to understand who is throwing these HTTP 404 and 500 responses: APIM or the backend SOAP API. The best way to get that answer is to collect an APIM inspector trace to inspect the request and response. Debug your APIs using request tracing
The Consumption tier exposes serverless properties. It runs on a shared infrastructure, can scale down to zero in times of no traffic and is billed per execution. Connections are pooled and reused unless explicitly closed by the back end. Api management service limits
1. This pattern of symptoms is also often known to occur due to source network address translation (SNAT) port limits with your APIM service.
Whenever a client calls one of your APIM APIs, Azure API Management service opens a SNAT port to access your backend API. Azure uses SNAT and a Load Balancer (not exposed to customers) to communicate with end points outside Azure in the public IP address space, as well as end points internal to Azure that aren't using Virtual Network service endpoints. (This situation is only applicable to backend APIs exposed on public IPs.)
Each instance of API Management service is initially given a pre-allocated number of SNAT ports. That limit affects opening connections to the same host and port combination. SNAT ports are used up when you have repeated calls to the same address and port combination. Once a SNAT port has been released, the port is available for reuse as needed. The Azure Network load balancer reclaims SNAT ports from closed connections only after waiting four minutes.
A rapid succession of client requests to your APIs may exhaust the pre-allocated quota of SNAT ports if these ports are not closed and recycled fast enough, preventing your APIM service from processing client requests in a timely manner.
The following strategies can be considered:
Use multiple IPs for your backend URLs
Place your APIM and backend service in the same VNet
Place your APIM in a virtual network and route outbound calls to Azure Firewall
Consider response caching and other backend performance tuning (configuring certain APIs with response caching to reduce latency between client applications calling your API and your APIM backend load.)
Consider implementing access restriction policies (a policy can be used to prevent API usage spikes on a per key basis by limiting the call rate per a specified time period.)
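To get a feel for the SNAT limit described above, here is a rough back-of-the-envelope sketch in Python. It assumes the worst case where every request opens a new connection to the same host and port and closes it immediately (no pooling), and it uses a hypothetical quota of 1024 ports; the real quota for your service may differ. The only number taken from the text is the four-minute reclaim delay.

```python
RECLAIM_DELAY_SECONDS = 4 * 60  # from the text: ports are reclaimed after four minutes


def max_new_connections_per_second(port_quota, reclaim_delay=RECLAIM_DELAY_SECONDS):
    """Highest sustained rate of new connections that never exceeds the quota."""
    return port_quota / reclaim_delay


def ports_held(new_connections_per_second, reclaim_delay=RECLAIM_DELAY_SECONDS):
    """Ports still unavailable in steady state at a given connection rate."""
    return new_connections_per_second * reclaim_delay


HYPOTHETICAL_PORT_QUOTA = 1024  # illustrative value, not an Azure-documented limit

print(max_new_connections_per_second(HYPOTHETICAL_PORT_QUOTA))
print(ports_held(5) > HYPOTHETICAL_PORT_QUOTA)
```

Under these assumptions, even a few new connections per second to one backend address can hold more ports than the quota, which is why connection reuse and the strategies listed above matter.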
2. The forward-request policy forwards the incoming request to the backend service specified in the request context. The backend service URL is specified in the API settings and can be changed using the set backend service policy.
Policy statement:
<forward-request timeout="time in seconds" follow-redirects="false | true" buffer-request-body="false | true" buffer-response="true | false" fail-on-error-status-code="false | true"/>
Example:
The following API level policy forwards all API requests to the backend service with a timeout interval of 60 seconds.
<!-- api level -->
<policies>
    <inbound>
        <base/>
    </inbound>
    <backend>
        <forward-request timeout="60"/>
    </backend>
    <outbound>
        <base/>
    </outbound>
</policies>
Attribute: timeout="integer"
Description: The amount of time in seconds to wait for the HTTP response headers to be returned by the backend service before a timeout error is raised. Minimum value is 0 seconds. Values greater than 240 seconds may not be honored as the underlying network infrastructure can drop idle connections after this time.
Required: No
Default: None
This policy can be used in the following policy sections and scopes.
Policy sections: backend
Policy scopes: all scopes
Check out similar feedback for your reference. Also, refer to the detailed troubleshooting guidance for 500 errors in APIM.

Azure application gateway throws 502 when application sends 401

Azure Application Gateway displays a 502 Bad Gateway error when the application returns 401 or 500 errors. It should pass on whatever the application sends, but by default it sends 502. Any idea what is happening, and are there any configuration or code changes you would suggest?
EDIT:
We are using Node.js for our API service. When a client tries to hit the endpoint without any auth header, the service returns a 401 error. This error is transformed into a 502 when it passes through the App Gateway.
General Workflow
When the application gateway receives a status code greater than 399, it considers that there is a problem with the server and removes it from the pool. After some time it checks the application status again; if the server returns a status code lower than 400, it is added back to the pool.
By default, the application gateway is configured to check the app health by making an HTTP/HTTPS request.
Cause
The application may have encountered errors, or authentication failures may have produced different error codes. This can cause the application gateway to return a 502 error.
Probe
We can configure a special file/endpoint to check the application/database health. This should be set in the probe configuration.
Useful resource
https://azure.microsoft.com/en-us/documentation/articles/application-gateway-create-probe-classic-ps/
https://azure.microsoft.com/en-us/documentation/articles/application-gateway-probe-overview/
Hope it helps!
Error status codes (401, 404) returned from the pod are considered unhealthy by Azure Application Gateway, which then produces a 502 Bad Gateway error as the response. So you need to modify the health check mechanism of Azure Application Gateway.
By default, status codes 200-399 are considered healthy.
Modify this in the "Health probes" section of your Application Gateway resource: https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-create-probe-portal
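The effect of widening the accepted status codes can be sketched as follows. This is only an illustration of the matching logic described above, written in Python; the helper names and the comma-separated spec format are assumptions for the example, not the gateway's actual configuration syntax.

```python
def parse_ranges(spec):
    """Turn a spec such as "200-399,401" into a list of (low, high) pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        low, _, high = part.partition("-")
        # A single code like "401" becomes the range (401, 401).
        ranges.append((int(low), int(high or low)))
    return ranges


def is_healthy(status_code, spec="200-399"):
    """Return True if the probe response code falls inside any accepted range."""
    return any(low <= status_code <= high for low, high in parse_ranges(spec))


print(is_healthy(401))                 # default range only
print(is_healthy(401, "200-399,401"))  # 401 explicitly accepted
```

With the default range, an endpoint that answers the probe with 401 is treated as unhealthy, which leads to the 502 seen by clients. Adding 401 to the accepted codes, or probing an endpoint that does not require authentication, keeps the backend in the pool.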

Resources