I'm running a Postgres database alongside a Node.js web application on an Ubuntu droplet with 16 vCPUs, and I've noticed strange behavior from the Postgres processes during periods of high load. After some time at near 100% CPU, the Postgres processes seem to stop completely, which causes my API to freeze. Why is this?
Attached below are screenshots of top output, each taken roughly one minute apart.
Starting web app through PM2 — https://i.stack.imgur.com/Vs6WW.png
After a while — https://i.stack.imgur.com/RTc9G.png
Finally — https://i.stack.imgur.com/DVY10.png
I only see this behavior when my server is handling 10,000+ requests per 10 minutes. Is this expected? What's going on here, and is there a way to keep these processes from "stopping" like this and never respawning until I restart my Node.js app?
UPDATE: The Postgres log file shows a lot of "unexpected EOF on client connection with an open transaction" entries. Is this caused by CPU overload or by errors within the transaction?
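For context, that log entry literally means the client side of a connection went away while a transaction was still open. Below is a minimal, hypothetical sketch of how the Node.js side can avoid leaving transactions open and cap how many connections a request spike can grab, assuming node-postgres (pg); the pool size, query, and table name are made up and not taken from the question:

const { Pool } = require('pg');

// Cap concurrent connections so a burst of API requests can't overwhelm Postgres.
const pool = new Pool({ connectionString: process.env.DATABASE_URL, max: 20 });

async function incrementCounter(id) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    await client.query('UPDATE counters SET value = value + 1 WHERE id = $1', [id]);
    await client.query('COMMIT');
  } catch (err) {
    // Roll back on any failure so a broken request never leaves a transaction open.
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release(); // always return the client to the pool
  }
}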
Related
So I have a node (v8.9.4) server running on an AWS EC2 instance, started with the forever package. The server has worked without any issues for years, but now that it's grown and more people are using it, it suddenly starts timing out all requests at seemingly random times after working fine for a few hours.
I've found that running forever restart on the server gets all requests working again, so I've set up a temporary cronjob to restart it every hour, but this is not good design and I would much rather have the server running without any issues.
I've gone through my server logs and found this which may be significant:
error: Forever detected script was killed by signal: SIGKILL
error: Script restart attempt #131
Warning: connect.session() MemoryStore is not designed for a production environment, as it will leak memory, and will not scale past a single process.
Another thing that may be important: the server process stays up while this is happening, so any status checks through UptimeRobot (or any other server status checker) return success.
Considering the server runs fine for a few hours and also starts up again with no issues after a restart, I don't think this is an issue with the code but rather something else I'm not aware of. My current hypothesis is that requests start timing out when the server runs out of CPU, but I'd like to explore more options before making the final call. If anyone has any insight into this issue, I would be super grateful! :)
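One clue worth chasing is the MemoryStore warning in the log above: the default express-session store keeps every session in process memory, which is one plausible way for a long-running instance to grow until the OS kills it with SIGKILL. A hedged sketch of moving sessions to an external store is below; the choice of Redis, the connect-redis v6-style wiring, and the option values are assumptions, not something taken from the question:

const express = require('express');
const session = require('express-session');
const RedisStore = require('connect-redis')(session); // connect-redis v6-style API
const Redis = require('ioredis');

const app = express();
const redisClient = new Redis(); // assumes Redis running on localhost:6379

app.use(session({
  store: new RedisStore({ client: redisClient }), // sessions no longer live in process memory
  secret: process.env.SESSION_SECRET,             // placeholder; set a real secret
  resave: false,
  saveUninitialized: false,
}));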
I have a Parse Server, which is a Node.js + Express wrapper, backing a mobile app (about 100 simultaneous users every day), hosted on DigitalOcean. The app server communicates with MongoDB, which is hosted on another DigitalOcean droplet. I'm using pm2 as a process manager along with its web-based monitoring tool. On the same process we also run LiveQuery, a WebSocket server built by the Parse community.
The thing is, I've been having some performance issues with the server. Everything works smoothly until the number of active handles rises uncontrollably! (see the image below) It's like at some point the server says "I'm done! Now I rest!"
Usually the active handles stay between 30 and 70. The moment I restart the process with pm2 restart, everything goes back to normal!
I've been having this issue for quite some time now and I haven’t been able to figure out what’s causing it! Any help will be greatly appreciated!
EDIT: I did a stress test where I created 200 LiveQuery sockets for one user, instead of the 2 a user normally has, and there was a spike of 300 active handles for about 5 seconds! The moment all the sockets were created, everything went back to normal!
I usually restart based on memory usage:
pm2 start filename.js --max-memory-restart 160M --exp-backoff-restart-delay=100
pm2 also has a built-in cron restart and a startup script setup in case the server itself reboots; see https://pm2.keymetrics.io/docs/usage/restart-strategies/. The same options can also go in an ecosystem file, sketched below.
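A minimal ecosystem file sketch with those restart options; the app name, script path, and the hourly cron schedule are just example values:

// ecosystem.config.js
module.exports = {
  apps: [{
    name: 'api',                     // example name
    script: './filename.js',
    max_memory_restart: '160M',      // restart when memory use passes ~160 MB
    exp_backoff_restart_delay: 100,  // back off between crash restarts
    cron_restart: '0 * * * *',       // optional: forced restart every hour
  }],
};

Start it with pm2 start ecosystem.config.js so the settings survive redeploys.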
It would be good if pm2 also provided restart options based on active connections or heap memory.
I'm trying to figure out why my Node.js app becomes unresponsive after 11h 20min. It happens every time, no matter whether I run it on Amazon Linux or Red Hat.
My Stack:
nodejs (v. 6.9.4)
mongodb (3.2)
pm2 process manager
AWS EC2 instance T2 medium
Every time I run the app, it eventually becomes unresponsive and the browser gets this error:
net::ERR_CONNECTION_RESET
pm2 doesn't restart the app, so I suspect it has nothing to do with Node.js itself. I've also analysed the app and it doesn't have memory leaks, and the DB logs look alright too.
The only constant factor is that the app crashes after it has been running for 11h 20min.
I'm handling all possible errors in the Node.js app, but no errors appear in the log files, so I suspect it has to be something else.
I also checked /var/log/messages and /home/centos/messages, but there's nothing related to the crash there either.
/var/log/mongodb/mongo.log doesn't show anything specific either.
What would be the best way to approach the problem?
Any clues on how I can debug it, or what the reason could be?
Thanks
Copied from the comment since it apparently led to the solution:
You're leaking something other than memory is my guess, maybe file descriptors. Try using netstat or lsof to see if there are a lot more open connections or files than you expect.
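A quick, hypothetical way to confirm a descriptor leak from inside the app itself, in addition to netstat/lsof; this reads /proc/self/fd so it's Linux only, and the one-minute interval is arbitrary:

const fs = require('fs');

// Log this process's open file-descriptor count once a minute (Linux only).
// A steadily climbing number points at leaked sockets or files.
setInterval(() => {
  fs.readdir('/proc/self/fd', (err, fds) => {
    if (err) return console.error('could not read /proc/self/fd:', err.message);
    console.log(new Date().toISOString(), 'open file descriptors:', fds.length);
  });
}, 60 * 1000);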
error: Forever detected script was killed by signal: SIGKILL
I'm running a node app on production with "forever".
Somewhat randomly, it shows these events in the logs, and this causes requests that do a lot of backend processing against the database to just stop; you then have to re-request and hope the work finishes before the next SIGKILL.
My question is this: under any circumstances could an application exception cause a SIGKILL like this, in the context of forever?
I can't reproduce this locally in my development environment.
ENVIRONMENT:
ubuntu 14.04
memcached
forever
node by itself (no nginx reverse proxy or anything)
connecting to a postgres database to query data
It's really hard to say for sure whether the SIGKILLs happen on a set interval or at a certain point in program execution. The logs don't have timestamps by default. From looking at the output, I'd say it happens somewhat randomly during execution, since the kills appear at different points in the log file.
Check your system logs to see if the Linux kernel's out-of-memory (OOM) killer is sending the signal, as per this answer.
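If the OOM killer does turn out to be the sender, it can also help to log the process's own memory usage over time and see whether it climbs steadily before each kill; a minimal sketch (the interval is arbitrary):

// Log heap and RSS once a minute so a slow climb toward the memory limit
// shows up in the forever logs before the next SIGKILL.
setInterval(() => {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const mb = (n) => (n / 1048576).toFixed(1);
  console.log(
    new Date().toISOString(),
    `rss=${mb(rss)}MB heap=${mb(heapUsed)}/${mb(heapTotal)}MB`
  );
}, 60 * 1000);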
We have a test server with 3 different Node.js apps running on it. Each application uses the same MongoDB database, a test instance of which also runs on the same server. So at any given moment we have at most 3 different open connections to the MongoDB server.
The issue is that after each code deployment (which is basically: kill the currently running process, update the code, and start a new process) I see a new process (actually a thread of a single process) on the server, shown in htop as /usr/bin/mongod --config /etc/mongodb.conf. So once in a while we have to restart the test server, because too many of these unused threads pile up and the mongod process ends up taking all the RAM.
I'm not sure why this is happening and I'm looking for a solution to fix it.
My assumption is that if we simply kill the Node.js process, the connection (and therefore the thread related to that connection) somehow stays alive, and therefore instead of killing the Node.js process we should shut it down gracefully, closing the DB connection first.
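For completeness, a minimal sketch of the graceful shutdown described above, assuming a recent version of the official mongodb Node.js driver; the URI and variable names are placeholders, and note this only helps if the deploy script stops the app with SIGTERM/SIGINT rather than kill -9:

const { MongoClient } = require('mongodb');

const client = new MongoClient('mongodb://localhost:27017'); // placeholder URI

async function main() {
  await client.connect();
  // ... start the HTTP server, register routes, etc.
}

async function shutdown(signal) {
  console.log(`${signal} received, closing MongoDB connection before exit`);
  try {
    await client.close(); // releases the driver's connection pool
  } finally {
    process.exit(0);
  }
}

process.on('SIGINT', () => shutdown('SIGINT'));
process.on('SIGTERM', () => shutdown('SIGTERM'));

main().catch((err) => { console.error(err); process.exit(1); });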
htop is also showing different threads; your mongod isn't started multiple times, which wouldn't be possible with the same config because the port would already be in use.
Use top or ps aux | grep mongod and you should see just one process.
You can also configure htop not to show these threads: press F2 > Display options > Hide userland threads.