Clean shutdown of express.js + mongodb server (using node.js)

I have node.js webserver using express.js and mongodb as a datastore. This server is being controlled by runit, and I am trying to implement a way to gracefully shut down the server.
I am implementing signal handlers for SIGINT and SIGTERM, and I am aware that you can stop listening for new connections by calling .close() on the object returned by createServer(). So far so good.
However, even when no more requests are forthcoming, there may be a number of requests already in the system that need to finish before I can close the database.
I am using a mongodb ReplicaSet, and I figure that if I just call db.close() right away, some of these requests may fail in some manner. Is there some way to close the database that allows pending database queries to finish, or do I have to manually keep a +/- counter of how many "active" queries are pending, and then wait to shut down until it reaches 0?

You would have to manually ensure that you are in a clean state to shut down, as the driver does not wait to flush out the remaining operations before shutting down.
Feel free to log a ticket for the feature on
https://github.com/mongodb/node-mongodb-native/issues/
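A minimal sketch of the counter approach mentioned in the question (the variable names and the MongoClient API shown here are my own assumptions, not from the original setup):

// Hypothetical sketch: count in-flight requests, stop accepting new ones on
// SIGINT/SIGTERM, and close the MongoDB connection only once the counter drains.
const express = require('express');
const { MongoClient } = require('mongodb');

const app = express();
const client = new MongoClient('mongodb://localhost:27017');

let activeRequests = 0;      // +/- counter of requests currently in the system
let shuttingDown = false;

app.use((req, res, next) => {
  if (shuttingDown) return res.status(503).send('shutting down');
  activeRequests++;
  res.on('finish', () => { activeRequests--; });
  next();
});

const server = app.listen(3000);

function shutdown() {
  shuttingDown = true;
  server.close();                              // stop listening for new connections
  const timer = setInterval(async () => {
    if (activeRequests === 0) {                // pending requests have finished
      clearInterval(timer);
      await client.close();                    // now it is safe to close the DB
      process.exit(0);
    }
  }, 100);
}

process.on('SIGINT', shutdown);
process.on('SIGTERM', shutdown);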

Related

Nodejs stuck on processing whenever the app is restarted

I have a Node.js application running on Linux. As we all know, whenever I restart the app it gets a new PID. Suppose that while the app is running, a client connects to it and kicks off some process, and the process status is "processing". If the app restarts (on the server side) at that point, how can we make sure the client connects back to the previous processing state?
What is happening now is that whenever the server restarts, the process is stuck in "processing" forever.
Just direct me to a sample of how this scenario is handled in real life.
Thank You.
If I'm understanding you correctly, then the answer is you can't...
The reason for this is that, when you restart the process the event loop is restarted, meaning any processes that were running or were waiting in the event loop are gone. You are essentially clearing out the event loop when you restart.
I would say though, if you know the process is 'crashing' Node then you probably want to look into that process and see why it is crashing, and place it in a try/catch so it won't kill the server.
Now, with that said (and without knowing what 'processing state' really means), you could set a flag in your DB for, say, 'job1' and have a status column of, say, 'running' when it is kicked off. When the Node server restarts it can read the job status for 'running' jobs; if a job is in the 'running' state you can fire it off again and, once complete, update the table to 'completed'.
This is probably not the most efficient way, as it's much better to figure out why the process is crashing, but as a fallback this could work. In a clustered environment, though, this could cause issues, because server 1 may fail while server 2 is processing, and server 1 does not know what server 2 is doing. More details as to the use case, environment, etc. would probably allow for a better answer.
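A rough sketch of that flag-based recovery, assuming MongoDB as the store (the collection name, field names, and runJob are all hypothetical):

// Hypothetical sketch: persist job status so a restarted process can resume work.
const { MongoClient } = require('mongodb');

async function main() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const jobs = client.db('app').collection('jobs');

  // When a job is kicked off, mark it as 'running':
  // await jobs.updateOne({ name: 'job1' }, { $set: { status: 'running' } }, { upsert: true });

  // On startup, find jobs the previous process left in a 'running' state and fire them again.
  const orphaned = await jobs.find({ status: 'running' }).toArray();
  for (const job of orphaned) {
    await runJob(job);                                             // re-run the work
    await jobs.updateOne({ _id: job._id }, { $set: { status: 'completed' } });
  }
}

async function runJob(job) {
  // ... the actual processing goes here ...
}

main().catch(console.error);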

Do I really need to call client.shutdown() when finished with Cassandra in Node.js script?

I've been trying to find information about Cassandra sessions relating to the Node.js cassandra-driver by Datastax. I read something which said that cassandra-driver automatically manages a session and that I don't need to call client.shutdown().
I'm looking for general information about how cassandra-driver manages sessions, how I can see all active Cassandra sessions, and whether I need to call shutdown() or whether that is counterproductive, since it would mean reopening a session every time the script is run.
Based on "pm2 info" I don't see a ton of active handles, so I don't think anything wrong is going on, but I may be mistaken. RAM usage does seem a bit high for a small script (85 MB).
In the DataStax drivers, a Session is a stateful object that manages a pool of connections and is aware of the status of the nodes in the cluster at any time (avoiding sending requests to unavailable nodes). TCP sockets are opened, and it is a best practice to close them when you don't need them anymore. See here for more info: https://docs.datastax.com/en/developer/nodejs-driver-dse/2.1/features/connection-pooling/
Now, session.connect() may take a bit of time: the more nodes you have in your cluster, the longer it takes to open connections to every single one. This is why it is better to initialize connections during a "cold start" when you work with FaaS (avoiding an open/close for each request).
So:
Always close your connections (shutdown()) when you don't need them anymore (a shutdown hook in your application).
Keep your connections "alive" as long as you need them; do not shut down after each request, this is NOT stateless.
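A minimal sketch of that advice for a long-running Node.js process (the contact points, data center, and keyspace below are placeholders):

// Hypothetical sketch: one shared client for the lifetime of the process,
// closed cleanly in a shutdown hook.
const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['127.0.0.1'],
  localDataCenter: 'datacenter1',
  keyspace: 'my_keyspace'
});

client.connect()                       // open the connection pool once, at startup
  .then(() => console.log('connected'));

process.on('SIGTERM', async () => {    // shutdown hook: close the TCP sockets
  await client.shutdown();
  process.exit(0);
});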
Yes, it is "better" to connect the client outside of the handler function, to keep it stateful.
However, with AWS Lambda and Node.js, by default function execution continues until the event loop is empty or the function times out.
Create the client outside of the handler, set context.callbackWaitsForEmptyEventLoop = false, and don't call client.shutdown().
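A sketch of that Lambda pattern (the contact points and query are placeholders):

// Hypothetical sketch: client created outside the handler so it survives warm invocations.
const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['10.0.0.1'],
  localDataCenter: 'datacenter1',
  keyspace: 'my_keyspace'
});

exports.handler = async (event, context) => {
  // Don't wait for the driver's open sockets before freezing the container.
  context.callbackWaitsForEmptyEventLoop = false;

  const result = await client.execute('SELECT now() FROM system.local');
  return result.rows[0];
  // Note: client.shutdown() is intentionally not called here.
};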

Server constantly running a function to update a cache, will it block all other server functions?

About once a minute, I need to cache all orderbooks from various cryptocurrency exchanges. There are hundreds of orderbooks, so this update function will likely never stop running.
My question is: If my server is constantly running this orderbook update function, will it block all other server functionality? Will users ever be able to interact with my server?
Do I need to create a separate service to perform the updating, or can Node somehow prioritize API requests and pause the caching function?
My question is: If my server is constantly running this orderbook update function, will it block all other server functionality? Will users ever be able to interact with my server?
If you are writing asynchronously, these actions will go into your event loop, and your Node server will pick the next event from the event loop while these actions are being performed. If you have too many events like this, your event queue will grow long, and users will face really slow responses or may even get a timeout.
Do I need to create a separate service to perform the updating, or can Node somehow prioritize API requests and pause the caching function?
Node only consumes events from the event queue. There are no priorities.
From a design perspective, you should look for options that can reduce this write load, like bulkCreate/edit, or, if you are using Redis for your cache, consider a Redis pipeline.
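As an illustration of the pipeline idea, a sketch assuming the ioredis package (the key naming scheme is made up):

// Hypothetical sketch: batch all orderbook writes into one Redis round trip.
const Redis = require('ioredis');
const redis = new Redis();

async function cacheOrderbooks(orderbooks) {
  const pipeline = redis.pipeline();
  for (const [symbol, book] of Object.entries(orderbooks)) {
    pipeline.set('orderbook:' + symbol, JSON.stringify(book), 'EX', 120);
  }
  await pipeline.exec();     // one round trip instead of hundreds of individual SETs
}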
This is a very open-ended question, much of which depends on your system. In general your server should be able to handle concurrent requests, but there are some things to watch out for.
Performance costs. If the operation to retrieve and store data requires too much computational power, then it will cause strain on all requests processed by the server.
Database connections. The server spends a lot of time waiting for database queries to complete. If you have one database connection for the entire application, and that connection is busy, other requests will have to wait until it is free. You may want to look into database connection 'pooling'.
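If the refresh itself turns out to be CPU-heavy, one further option (not from the answers above, just a sketch using Node's built-in worker_threads module; the file name is made up) is to run it off the main event loop so requests stay responsive:

// main.js - hypothetical sketch: keep request handling on the main thread,
// run the cache updater in a worker thread.
const { Worker } = require('worker_threads');
const worker = new Worker('./cache-updater.js');
worker.on('message', (msg) => console.log('cache refreshed at', msg.updatedAt));

// cache-updater.js
const { parentPort } = require('worker_threads');
setInterval(() => {
  // fetch and store the orderbooks here, off the main event loop
  parentPort.postMessage({ updatedAt: Date.now() });
}, 60000);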

Clustered socket.io server hangs

I'm writing a socket.io based server in Node.js (6.9.0). I am using the built-in cluster module to enable multiple processes. For now, there are only two processes: a master and a worker. The master receives the connections and maintains an in-memory global data structure (which the worker can query via IPC). The worker process does the majority of the work by handling each incoming connection.
I am finding a hanging condition that I cannot attribute to any internal failure when the server is stressed at 300 concurrent users. Under lower concurrency, I don't see the hanging condition.
I'm enabling all forms of debugging (using the debug module: socket.io:socket, socket.io:client as well as my own custom calls to debug).
The last activity I can see is in socket.io; however, the messages indicate that sockets are closing ("reason client namespace disconnect") due to their own "end of test" cycle. It just seems like incoming connections are not being serviced.
I'm using Artillery.io as the test client.
In the server application, I have handlers for uncaught exceptions and try-catch blocks around everything.
In a prior iteration, I also used cluster, but reversed the responsibilities so that the master process handled the connections (with the worker handling global data). That didn't exhibit the same failure. Not sure if something is wrong with the connection distribution. For that, I have also dumped internalMessage events to monitor the internal workings of cluster.
I am not using any other module for connection distribution or sticky sessions. As there is only a single process handling connections (at this time), it doesn't seem relevant.
I was able to remove the hanging condition by changing the cluster scheduling policy from round-robin (SCHED_RR) to none (SCHED_NONE), which leaves connection distribution to the operating system. I can't tell whether this is due to a bug in connection distribution (or something else inherent in the scheduling policy), but this one change seems to prevent the hanging condition.
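For reference, a sketch of that change (the policy has to be set before forking; the worker body is elided):

// Hypothetical sketch: switch the cluster scheduling policy away from round-robin.
const cluster = require('cluster');

cluster.schedulingPolicy = cluster.SCHED_NONE;   // let the OS distribute connections
// (equivalently, set the NODE_CLUSTER_SCHED_POLICY=none environment variable)

if (cluster.isMaster) {
  cluster.fork();                                // single worker, as in the question
} else {
  // worker: set up the socket.io server here
}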

Check queued reads/writes for MongoDB

I feel like this question would have been asked before, but I can't find one. Pardon me if this is a repeat.
I'm building a service on Node.js hosted in Heroku and using MongoDB hosted by Compose. Under heavy load, the latency is most likely to come from the database, as there is nothing very CPU-heavy in the service layer. Thus, when MongoDB is overloaded, I want to return an HTTP 503 promptly instead of waiting for a timeout.
I'm also using Redis, and Redis has a feature where you can check the number of queued commands (redisClient.command_queue.length). With this feature, I can know right away if Redis is backed up. Is there something similar for MongoDB?
The best option I have found so far is polling the server for status via this command, but (1) I'm hoping for something client side, as there could be spikes within the polling interval that cause problems, and (2) I'm not actually sure what part of the status response I want to act on. That second part brings me to a follow up question...
I don't fully understand how the MongoDB client works with the server. Is one connection shared per client instance (and in my case, per process)? Are queries and writes queued locally or on the server? Or, is one connection opened for each query/write, until the database's connection pool is exhausted? If the latter is the case, it seems like I might want to keep an eye on the open connections. Does the MongoDB server return such information at other times, besides when polled for status?
Thanks!
MongoDB connection pool workflow:
Every MongoClient instance has a built-in connection pool. The client opens sockets on demand to support the number of concurrent MongoDB operations your application requires. There is no thread-affinity for sockets.
The client instance opens one additional socket per server in your MongoDB topology for monitoring the server's state.
The size of each connection pool is capped at maxPoolSize, which defaults to 100.
When a thread in your application begins an operation on MongoDB, if all other sockets are in use and the pool has reached its maximum, the thread pauses, waiting for a socket to be returned to the pool by another thread.
You can increase maxPoolSize:
client = MongoClient(host, port, maxPoolSize=200)
By default, any number of threads are allowed to wait for sockets to become available, and they can wait any length of time. Override waitQueueMultiple to cap the number of waiting threads. E.g., to keep the number of waiters less than or equal to 500:
client = MongoClient(host, port, maxPoolSize=50, waitQueueMultiple=10)
Once the pool reaches its max size, additional threads are allowed to wait indefinitely for sockets to become available, unless you set waitQueueTimeoutMS:
client = MongoClient(host, port, waitQueueTimeoutMS=100)
Reference for connection pooling:
http://blog.mongolab.com/2013/11/deep-dive-into-connection-pooling/
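The snippets above use Python-style MongoClient syntax; in the Node.js driver the same knobs look roughly like this (a sketch assuming a recent driver version; the database and route names are made up), which also gives you a fail-fast 503 when the pool's wait queue times out:

// Hypothetical sketch: cap the pool, fail fast instead of queueing, return 503.
const express = require('express');
const { MongoClient } = require('mongodb');

const app = express();
const client = new MongoClient('mongodb://host:27017', {
  maxPoolSize: 50,            // cap on concurrent sockets per server
  waitQueueTimeoutMS: 100     // give up quickly instead of waiting indefinitely
});

app.get('/orders', async (req, res) => {
  try {
    const docs = await client.db('shop').collection('orders').find().toArray();
    res.json(docs);
  } catch (err) {
    // A pool checkout timeout here means the database is overloaded.
    res.status(503).send('database busy');
  }
});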
