I have a Python application deployed with gunicorn. The machine has an 8-core CPU and 64 GB of RAM. I configured a worker:thread combination of 2:3 and sent 500 requests at once. I need to understand how gunicorn manages these requests, because after the run finished only 371 of the 500 requests had completed successfully, while the rest were lost as if I had never sent them. I could not find those requests in the logs either.
Related
Problem - Increase in latencies ( p90 > 30s ) for a simple WebSocket server hosted on VM.
Repro
Run a simple websocket server on a single VM. The server simply receives a request and then upgrades it to websocket without any logic. The client will continuously send 50 parallel requests for a period of 5 mins ( so approximately 3000 requests ).
Issue
Most requests have a latency of the range 100ms-2s. However, for 300-500 requests, we observe that latencies are high ( 10-40s with p90 greater than 30s ) while some have TCP timeouts (Linux default timeout of 127s).
When analyzing the VM processes, we observe that while requests are taking a long time, the node process loses its CPU share in favor of some processes started by the VM.
Further Debugging
Increasing process priority (renice) and I/O priority (ionice) did not solve the problem.
Increasing the VM to 8 cores and 32 GiB of memory did not solve the problem.
Edit - 1
Repro Code ( clustering enabled ) - https://gist.github.com/Sid200026/3b506a9f77cfce3fa4efdd1ec9dd29bc
When monitoring active processes via htop, we find that the processes started by the following commands are causing the issue:
python3 -u bin/WALinuxAgent-2.9.0.4-py2.7.egg -run-exthandlers
/usr/lib/linux-tools/5.15.0-1031-azure/hv_kvp_daemon -n
I have an ExpressJS app running on EC2 in a docker container which suddenly stopped responding to any requests after 6 days of normal operation with similar requests. The CPU and network traffic looked normal, but I don't have memory metrics because AWS doesn't automatically collect those.
Once I restarted the container, it resumed responding to requests as normal.
Under what circumstances would an Express app stop responding to requests?
Possible causes I can think of:
Code running stuck in an infinite loop (but this would max out the CPU)
Memory full
What else could cause this?
I'm relatively new to running production node.js apps and I've recently been having problems with my server timing out.
Basically, after a certain amount of usage and time, my node.js app stops responding to requests. I don't even see routes being fired on my console anymore. It's like the whole thing just comes to a halt, and the HTTP calls from my client (iPhone running AFNetworking) don't reach the server anymore. If I restart my node.js app server, everything starts working again, until things inevitably stop again. The app never crashes, it just stops responding to requests.
I'm not getting any errors, and I've made sure to handle and log all DB connection errors so I'm not sure where to start. I thought it might have something to do with memory leaks so I installed node-memwatch and set up a listener for memory leaks but that doesn't get called before my server stops responding to requests.
Any clue as to what might be happening and how I can solve this problem?
Here's my stack:
Node.js on AWS EC2 Micro Instance (using Express 4.0 + PM2)
Database on AWS RDS volume running MySQL (using node-mysql)
Sessions stored w/ Redis on same EC2 instance as the node.js app
Clients are iPhones accessing the server via AFNetworking
Once again no errors are firing with any of the modules mentioned above.
First of all you need to be a bit more specific about timeouts.
TCP timeouts: TCP divides a message into packets which are sent one by one. The receiver needs to acknowledge having received each packet. If the receiver does not acknowledge a packet within a certain period of time, a TCP retransmission occurs, which means sending the same packet again. If this happens a few more times, the sender gives up and kills the connection.
HTTP timeout: An HTTP client like a browser, or your server while acting as a client (e.g: sending requests to other HTTP servers), can set an arbitrary timeout. If a response is not received within that period of time, it will disconnect and call it a timeout.
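To make the HTTP timeout case concrete, here is a minimal sketch using Node's built-in http module. The helper name requestWithTimeout, the 200 ms value, and the deliberately silent demo server are all assumptions for illustration, not something from the question.

```javascript
const http = require('http');

// Hypothetical helper: GET a URL and reject if the socket stays idle for `ms`.
function requestWithTimeout(url, ms) {
  return new Promise((resolve, reject) => {
    const req = http.get(url, (res) => {
      res.resume(); // drain the body so the socket can be released
      resolve(res.statusCode);
    });
    // 'timeout' is only a notification; the request has to be destroyed explicitly.
    req.setTimeout(ms, () => req.destroy(new Error('HTTP timeout')));
    req.on('error', reject);
  });
}

// Demo: a local server that accepts the request but never sends a response.
function demo() {
  return new Promise((resolve) => {
    const server = http.createServer(() => { /* never calls res.end() */ });
    server.listen(0, '127.0.0.1', async () => {
      const { port } = server.address();
      let outcome;
      try {
        outcome = await requestWithTimeout(`http://127.0.0.1:${port}/`, 200);
      } catch (err) {
        outcome = err.message;
      }
      server.close(() => resolve(outcome));
    });
  });
}
```

Calling demo() resolves with the error message from the client side, which is what "it will disconnect and call it a timeout" looks like in practice. Nothing is wrong at the TCP level here; the client simply gave up waiting.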
Now, there are many, many possible causes for this... from more trivial to less trivial:
Wrong Content-Length calculation: If you send a request with a Content-Length: 20 header, that means "I am going to send you 20 bytes". If you send 19, the other end will wait for the remaining 1. If that takes too long... timeout.
Not enough infrastructure: Maybe you should assign more machines to your application. If the load average divided by the number of CPU cores is over 1, or your memory usage is high, your system may be over capacity. However, keep reading...
Silent exception: An error was thrown but not logged anywhere. The request never finished processing, leading to the next item.
Resource leaks: Every request needs to be handled to completion. If you don't do this, the connection will remain open. In addition, the IncomingMessage object (usually called req in Express code) will remain referenced by other objects (e.g. Express itself). Each one of those objects can use a lot of memory.
Node event loop starvation: I will get to that at the end.
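For the Content-Length item above, a common way to get the number wrong in Node is to use the string length instead of the byte length. The sketch below shows the difference; jsonHeaders is a hypothetical helper, and the payload is made up. Here the naive value would be too small rather than too large, but it is the same class of mistake.

```javascript
// Content-Length counts bytes on the wire, not JavaScript string characters.
const body = JSON.stringify({ name: 'Jos\u00e9' }); // '{"name":"José"}'

const charCount = body.length;             // UTF-16 code units in the string
const byteCount = Buffer.byteLength(body); // UTF-8 bytes that would be sent

// Hypothetical helper that builds headers with the byte length.
function jsonHeaders(payload) {
  return {
    'Content-Type': 'application/json; charset=utf-8',
    'Content-Length': Buffer.byteLength(payload, 'utf8'),
  };
}

console.log({ charCount, byteCount, headers: jsonHeaders(body) });
```

Any non-ASCII character makes the two counts diverge, so a header computed from body.length no longer matches what is actually transmitted.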
For memory leaks, the symptoms would be:
the node process would be using an increasing amount of memory.
To make things worse, if available memory is low and your server is misconfigured to use swapping, Linux will start moving memory to disk (swapping), which is very I/O and CPU intensive. Servers should not have swapping enabled.
cat /proc/sys/vm/swappiness
prints the swappiness level configured on your system (from 0 to 100). You can change it persistently in /etc/sysctl.conf (applied at boot, or immediately with sysctl -p) or temporarily with: sysctl vm.swappiness=10
Once you've established you have a memory leak, you need to get a core dump and download it for analysis. A way to do that can be found in this other Stack Overflow answer: Tools to analyze core dump from Node.js
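Before going as far as a core dump, a cheap way to establish the "increasing amount of memory" symptom is to sample process.memoryUsage() from inside the app. This is only a sketch: the function names, the one-minute interval, the 30-sample window and the 50 MiB threshold are arbitrary assumptions to tune for your own service.

```javascript
// True when the newest sample exceeds the oldest by at least `minIncreaseBytes`.
function grewBy(samples, minIncreaseBytes) {
  if (samples.length < 2) return false;
  return samples[samples.length - 1] - samples[0] >= minIncreaseBytes;
}

// Hypothetical sampler: keeps a sliding window of RSS readings and warns on growth.
function startMemorySampler({
  intervalMs = 60000,
  windowSize = 30,
  minIncreaseBytes = 50 * 1024 * 1024,
} = {}) {
  const samples = [];
  const timer = setInterval(() => {
    samples.push(process.memoryUsage().rss); // resident set size in bytes
    if (samples.length > windowSize) samples.shift();
    if (samples.length === windowSize && grewBy(samples, minIncreaseBytes)) {
      console.warn('RSS grew from', samples[0], 'to', samples[samples.length - 1]);
    }
  }, intervalMs);
  timer.unref(); // sampling alone should not keep the process alive
  return () => clearInterval(timer); // call the returned function to stop sampling
}
```

A warning that keeps firing window after window, regardless of traffic, is a reasonable signal that a leak is worth investigating with a heap or core dump.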
For connection leaks (you leaked a connection by not handling a request to completion), you would see an increasing number of established connections to your server. You can count established connections with netstat -a -p tcp | grep ESTABLISHED | wc -l. That is the BSD/macOS form; on Linux, netstat -ant | grep ESTABLISHED | wc -l does the same.
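The same count can also be read from inside the process with server.getConnections(), which http.Server inherits from net.Server. The wrapper and the 30-second logging interval below are assumptions for illustration.

```javascript
const http = require('http');

// Promise wrapper around net.Server#getConnections.
function countConnections(server) {
  return new Promise((resolve, reject) => {
    server.getConnections((err, count) => (err ? reject(err) : resolve(count)));
  });
}

// Hypothetical usage: log the number of open connections at a fixed interval.
function logConnectionsPeriodically(server, intervalMs = 30000) {
  const timer = setInterval(async () => {
    try {
      console.log('open connections:', await countConnections(server));
    } catch (err) {
      console.error('getConnections failed:', err.message);
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for logging
  return timer;
}
```

With Express, the server object is the value returned by app.listen(). A count that keeps climbing while traffic stays flat points at requests that are never completed.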
Now, event loop starvation is the worst problem. Node works very well when each piece of code is short-lived. But if you do CPU-intensive work and have a function that keeps the CPU busy for too long, say 50 ms (50 ms of solid, blocking, synchronous CPU time, not asynchronous code taking 50 ms), the operations handled by the event loop, such as processing HTTP requests, start falling behind and eventually time out.
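A small sketch of what that starvation looks like: a timer is scheduled for 10 ms, then synchronous code holds the CPU for 50 ms, and the timer fires late. blockFor and measureLag are hypothetical helpers, and the busy loop merely stands in for real CPU-heavy work.

```javascript
// Busy-wait for roughly `ms` milliseconds; a stand-in for blocking synchronous work.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Resolve with how many milliseconds late a timer fired.
function measureLag(delayMs = 10) {
  const scheduled = process.hrtime.bigint();
  return new Promise((resolve) => {
    setTimeout(() => {
      const elapsedMs = Number(process.hrtime.bigint() - scheduled) / 1e6;
      resolve(Math.max(0, elapsedMs - delayMs));
    }, delayMs);
  });
}

async function demo() {
  const idle = await measureLag(); // nothing else is running
  const pending = measureLag();    // schedule the timer...
  blockFor(50);                    // ...then hog the event loop
  const blocked = await pending;
  return { idle, blocked };
}
```

In the blocked case the timer cannot fire until the busy loop returns, so its lag is roughly the blocking time minus the timer delay. Incoming HTTP requests are delayed in the same way. For continuous monitoring, Node also ships perf_hooks.monitorEventLoopDelay(), which records the same kind of delay as a histogram.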
The way to find a CPU bottleneck is to use a performance profiler. nodegrind/qcachegrind are my preferred profiling tools, but others prefer flamegraphs and such. Running a profiler in production can be hard, though, so take a development server and slam it with requests, i.e. run a load test. There are many tools for this.
Finally, another way to debug the problem is:
env NODE_DEBUG=tls,net node <...arguments for your app>
node has optional debug statements that are enabled through the NODE_DEBUG environment variable. Setting NODE_DEBUG to tls,net will make node emit debugging information for the tls and net modules... so basically everything being sent or received. If there's a timeout you will see where it's coming from.
Source: Experience of maintaining large deployments of node services for years.
Based on the IIS architecture, a request from a client hitting IIS passes through the HTTP pipeline, specifically through each HttpModule, then reaches the respective HttpHandler, and then the worker process. Does this happen serially, one request after the other?
Say 10,000 requests hit the web server concurrently within a second: is each request processed one by one? If the web server has a multi-core CPU and plenty of memory, does that help IIS handle the requests simultaneously?
Is there any web server capable of handling requests in parallel?
I just replied to this guy's question, which is very similar, and the answer is the same:
IIS and HTTP pipelining, processing requests in parallel
We have a .NET 2.0 Remoting server running in Single-Call mode under IIS7. It has two APIs, say:
DoLongRunningCalculation() - has a lot of database requests and can take a long time to execute.
HelloWorld() - just returns "Hello World".
We tried to stress test the remoting server (on a Windows 7 machine) in a worst-case scenario by bombarding it randomly with the two API calls. We found that once we go beyond 10 client requests, the HelloWorld response (which generally takes less than 0.1 sec) starts taking longer and longer, up to many seconds. Our objective is that we don't want the long-running remoting calls to block the short-running calls. Here are the performance counters for ASP.NET v2.0.50727 with 20 client threads running:
Requests Queued: 0
Requests Executing: (Max:10)
Worker Processes Running: 0
Pipeline Instance Mode: (Max:10)
Requests in Application Queue: 0
We've tried setting maxConcurrentRequestsPerCPU to "5000" in registry as per Thomas's blog: ASP.NET Thread Usage on IIS 7.0 and 6.0 but it hasn't helped. Based on the above data, it appears that the number of concurrent requests is stuck at 10.
So, the question is:
How do we go about increasing the concurrent requests? The main objective is that we don't want to have the long-running remoting calls to block the short-running calls.
Why is the maximum of Requests Executing always stuck at 10?
Thanks in advance.
Windows 7 has a limit of 20 inbound connections. XP and earlier were limited to 10 (not sure about Vista). This is likely the cause of your drop in performance. Try testing on an actual server OS that doesn't have an arbitrary connection limit.