I made a server with NestJS and ran it on an EC2 instance, and it randomly responds with 502.
I googled, and a lot of articles mention a mismatch between the ALB's idle timeout and the server's timeout.
When I made a simple web server with Express, the same issue happened, but I fixed it by adjusting keepAliveTimeout and headersTimeout
(the values were 65000 ms and 67000 ms respectively).
However, this NestJS server still responds with 502 and I don't know what else to do.
I tried changing the ALB timeout to 30 s and to 120 s, and that didn't work either.
What else could be causing this issue? I am lost.
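For reference, a minimal sketch of how the same two timeouts can be set in NestJS, assuming the default Express adapter (app.getHttpServer() returns the underlying Node http.Server):

// main.ts
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  await app.listen(3000);

  // The underlying Node http.Server; the same tuning as in plain Express applies.
  const server = app.getHttpServer();
  server.keepAliveTimeout = 65000; // longer than the ALB idle timeout (default 60 s)
  server.headersTimeout = 67000;   // must be longer than keepAliveTimeout
}
bootstrap();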
I have microservices written in Node/Express hosted on EC2 behind an Application Load Balancer.
Some users are getting a 502 even before the request reaches the server.
I log every request inside each instance, and the logs for those failing requests are missing: I have the request immediately before the 502 and the requests right after it, which is why I assume the request never reaches the servers. Most users work around it by refreshing the page or using an incognito tab, which sends the connection to a different machine (we have 6).
I can tell from the load balancer logs that the load balancer responds to the request with a 502 almost immediately. I guess this could be a TCP RST.
I had a similar problem a long time ago, and I had to add keepAliveTimeout and headersTimeout to the Node configuration. Here are my settings (still using the LB default idle timeout of 60 s):
server.keepAliveTimeout = 65000;
server.headersTimeout = 80000;
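For context, a brief sketch of where those two properties live in a plain Node/Express service; this assumes the server object is the http.Server returned by app.listen(), which may differ from the poster's exact setup:

const express = require('express');
const app = express();

// app.listen() returns the underlying Node http.Server
const server = app.listen(3000);

server.keepAliveTimeout = 65000; // keep above the ALB idle timeout of 60 s
server.headersTimeout = 80000;   // keep above keepAliveTimeout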
The metrics, especially memory and CPU usage, are fine on all instances.
These 502 errors started after an update in which we introduced several packages, for instance axios. At first I thought it could be axios, because keep-alive is not enabled by default, but enabling it didn't help. Other than axios, we just use request.
Any tips on how I should debug/fix this issue?
HTTP 502 errors are usually caused by a problem at the load balancer, which would explain why the requests never reach your server: presumably the load balancer can't reach the server for some reason.
This link has some hints on how to get logs from a Classic Load Balancer. However, since you didn't specify, you might be using an Application Load Balancer, in which case this link might be more useful.
From the ALB access logs I knew that either the ALB couldn't connect to the target, or the connection was being terminated immediately by the target.
The most difficult part was figuring out how to replicate the 502 error.
It turned out that the Node version I was using has a request header size limit of 8 KB. If any request exceeded that limit, the target would reject the connection and the ALB would return a 502 error.
Solution:
I solved the issue by adding --max-http-header-size=size to the node start command line, where size is a value (in bytes) greater than 8 KB.
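For example, assuming a hypothetical entry point server.js and a new limit of 16 KB (the value is given in bytes):

node --max-http-header-size=16384 server.js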
A few common reasons for an AWS Load Balancer 502 Bad Gateway:
Make sure the public subnets your ALB targets are set to auto-assign a public IP (so that instances deployed into them are automatically assigned one).
Make sure the security group for your ALB allows HTTP and/or HTTPS traffic from the IPs you are connecting from.
I also had the same problem for a month or two and couldn't find the solution. I even had AWS Premium Support, but they were not able to find it either. I was getting 502 errors randomly, maybe 10 times per day. Finally, after reading the docs from AWS:
The target receives the request and starts to process it, but closes the connection to the load balancer too early. This usually occurs when the duration of the keep-alive timeout for the target is shorter than the idle timeout value of the load balancer.
https://aws.amazon.com/premiumsupport/knowledge-center/elb-alb-troubleshoot-502-errors/
SOLUTION:
I was running "Apache" webserver in EC2 so Increased "KEEPALIVETIMEOUT=65". This did the trick. For me.
My application times out when I try to fetch data from sources on my Tomcat server. I can see the DB query is the culprit, because it takes about 100 seconds to return data due to the huge amount it is processing. My request times out after 60 seconds, which leads to the error below:
Proxy Error
The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request
Reason: Error reading from remote server
I am using mod_proxy to connect to the Tomcat servers from the Apache web server. I tried increasing connectionTimeout to 90000 milliseconds on the SSL connector, but the request still times out after 60 seconds. Is there anything I am missing that I should change to increase my connection timeout?
I am using Tomcat 9. Any leads will be very helpful, as I have been stuck on this for quite a while.
Thanks
I found a way to define the proxy timeout value in the httpd.conf file. I added the two parameters below:
Timeout 300
ProxyTimeout 300
For more details, you can refer here.
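As an alternative scoped to a single backend, mod_proxy also accepts a timeout parameter directly on the ProxyPass line; the path and backend URL below are placeholders:

ProxyPass "/app" "http://localhost:8080/app" timeout=300
ProxyPassReverse "/app" "http://localhost:8080/app"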
I have CouchDB running on a Linux Ubuntu 14.04 VM and a .NET web application running under Azure Web Apps. In our ELMAH logging for the web application I keep getting intermittent errors:
System.Net.Sockets.SocketException
A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [ipaddress]:5984
I've checked the CouchDB logs and there isn't a record of those requests, so I don't believe they are hitting the CouchDB server; I can confirm this by looking at the web server logs on Azure and seeing the error 500 response. I've also tried a tcpdump, though with little success (as a separate issue, logging tcpdump output to another disk keeps failing due to access denied).
We previously ran CouchDB on a Windows VM with no issues, so I wonder if the issue relates to the OS connection settings for TCP and timeouts.
Does anyone have any suggestions as to where to look, or anything that immediately jumps to mind?
When I run my Node.js app in development, I intermittently see "connection refused" on about every 2nd or 3rd request. I am not even sending the requests very quickly (about 1 per second), and they should complete very quickly, as this is an Express app with an endpoint that just checks whether the Content-Type is set correctly. Is it likely that I am seeing the issue because I am not proxying the requests through nginx? Nginx would queue the requests, whereas without it I am hitting my Node.js app directly. I don't see anything in my Node.js app's logs that would indicate an error.