I have created a Node server using Express that runs a Python program in the background; the program runs on a GPU to process the inputs/requests. I am able to display the output in the browser. How can I make my Node server scalable enough to handle many concurrent requests? Should I use a queue for handling requests? If yes, how do I start? I need a definite approach.
Node.js is set up to be highly performant by default thanks to its asynchronous event loop.
However, on multi-core systems, you can look at the cluster module to scale out your API: https://nodejs.org/api/cluster.html
Note that your Node.js backend should ideally be stateless for scalability.
Related
I am running my Node.js app on Google App Engine. Since Node.js handles requests on a single thread, is there any way to check whether my Node.js main thread is busy with heavy tasks?
The tool pm2 (http://pm2.keymetrics.io/) has many monitoring features that can detect the health of a Node.js process and take specified actions.
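If you want a quick check from inside the process itself, a common trick is to measure event-loop lag: schedule a timer and see how late it fires. A minimal sketch (the interval and the 100 ms threshold are arbitrary assumptions):

```js
// Measure event-loop lag: if the main thread is busy, the timer fires late.
const INTERVAL_MS = 500;
let last = Date.now();

setInterval(() => {
  const now = Date.now();
  const lag = now - last - INTERVAL_MS; // how late this tick fired
  last = now;
  if (lag > 100) {
    console.warn(`event loop lag: ${lag}ms; main thread looks busy`);
  }
}, INTERVAL_MS);
```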
My node application currently has two main modules:
a scraper module
an express server
The former is a very resource-intensive task that runs indefinitely in a loop. It scrapes information from more than 100 URLs, crunches the data, and puts it into a MongoDB database (using Mongoose). This process runs over and over and over. :P
The latter part, my Express server, responds to HTTP/socket GET requests and returns to the requesting client the crunched data that the scraper wrote to the DB.
I'd like to optimize the performance of my server so that Express requests and responses are prioritized over the resource-intensive task(s). A client should be able to get the requested data ASAP, without the scraper eating up all of my server resources.
I thought about putting the resource-intensive task or the Express server into its own thread, but then I stumbled upon cluster and child processes, and now I'm totally confused about which approach would be right for my situation.
One of the benefits I have is that there is a clear separation between the writing part of my application and the reading part. The scraper writes stuff to the DB, and Express only reads from the DB (no POST/PUT/DELETE/... calls are exposed). So I -guess- I won't run into threading problems with different threads trying to write to the same DB.
Any good suggestions? Thanks in advance!
Resources like CPU and memory required by processes are managed by the operating system. You should not waste your time writing that logic within your source code.
I think you should look at the problem from outside your source code files. Once they are running, they are processes, and processes are managed, as I said, by the OS.
Firstly, I would split that into two separate commands.
One being the scraper module (e.g. npm run scraper, which runs something like node scraper.js).
The other being your Express server (e.g. npm start, which runs something like node server.js); a package.json sketch follows.
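In package.json, that split could look like this (the file names are assumptions based on the commands above):

```json
{
  "scripts": {
    "start": "node server.js",
    "scraper": "node scraper.js"
  }
}
```

npm start runs the start script, and npm run scraper runs the scraper script, so each module gets its own OS process.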
This approach will let you configure resource usage for each process within your OS or your cluster.
A quick way to do that is to use Docker.
With two Docker containers running your projects under CPU usage limitations. This is fairly easy to do and does not require you to spin up a new server... and at the same time it provides the isolation level you need to scale to many servers in the future.
Steps to do this:
Learn a little about Docker and Docker Compose and install them on your server
Build a Docker image for your application (you can upload it to the free private repository that Docker Hub gives you)
Build a Docker Compose file for your two services using that image, with the CPU configuration you need (you can set both CPU and memory limits easily); see the sketch after this list
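As a rough sketch, such a docker-compose.yml could look like this (the image name, port, and limits are placeholders, not values from the question):

```yaml
services:
  scraper:
    image: myapp            # hypothetical image built from your project
    command: npm run scraper
    cpus: 0.5               # cap the scraper at half a core
    mem_limit: 512m
  server:
    image: myapp
    command: npm start
    ports:
      - "3000:3000"
    cpus: 1.0
    mem_limit: 512m
```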
An alternative would be to run the two commands (npm run scraper and npm start) with a tool like cpulimit, nice/renice, or ionice, or to use namespaces and cgroups manually (but Docker does that for you).
PS: I would also recommend rethinking your backend process. Maybe it's better to run it every 12 hours or so instead of all the time, and to run it from cron instead of in a loop.
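For example, a crontab entry that runs the scraper every 12 hours might look like this (the paths are placeholders):

```
# run the scraper at 00:00 and 12:00, appending output to a log file
0 */12 * * * /usr/bin/node /path/to/scraper.js >> /var/log/scraper.log 2>&1
```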
"Node.js is limited to a single thread". how the nodeJS will react when we are deploying in Multi-Core systems? will it boost the performance?
The JavaScript running in Node.js's V8 engine is single-threaded, but the underlying libuv multi-platform support library is multi-threaded, and those threads are distributed across CPU cores by the operating system according to its scheduling algorithm. So even though your JavaScript application runs asynchronously (and single-threaded) at the top level, you still benefit from multiple cores under the covers.
As others have mentioned, the Node.js cluster module is an excellent way to exploit multiple cores for concurrency at the application (JavaScript/V8) level, and since Express is cluster-aware, you can have multiple worker processes executing server logic concurrently without needing a unique listening port for each process. Impressive.
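A minimal sketch of that pattern (it assumes Express is installed; port 3000 is arbitrary):

```js
const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  // Fork one worker per core; the master distributes incoming connections.
  os.cpus().forEach(() => cluster.fork());
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died; forking a replacement`);
    cluster.fork();
  });
} else {
  const express = require('express');
  const app = express();
  // Every worker listens on the same port; the cluster module shares the socket.
  app.get('/', (req, res) => res.send(`served by pid ${process.pid}`));
  app.listen(3000);
}
```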
As others have mentioned, you will need Redis or an equivalent to share data among the cluster worker processes. You will also want a logging facility that is cluster-aware, so the cluster master and all worker processes can log to a single shared log file. The Node log4node module is a good choice here, and it works with logrotate.
Typical web examples fork as many cluster worker processes as the number of cores detected at runtime, but I prefer to make that a configuration option in a config.yaml file so I can tune the number of worker processes running the main JavaScript application as needed.
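A sketch of that config-driven worker count (it assumes the js-yaml package and a config.yaml with a workers key):

```js
const fs = require('fs');
const os = require('os');
const yaml = require('js-yaml'); // assumption: js-yaml is installed

const config = yaml.load(fs.readFileSync('config.yaml', 'utf8'));
// Fall back to the detected core count when no value is configured.
const workerCount = config.workers || os.cpus().length;
```

You would then fork workerCount workers instead of os.cpus().length, as in the cluster sketch above.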
Node.js runs in one thread, but you can start multiple Node.js processes.
If you are, for example, building a web server, you can route every request to one of the Node.js processes.
Edit: As hereandnow78 and vkurchatkin suggested, maybe the best way to use the power of a multi-core system is the Node.js cluster module.
The cluster module is the solution.
But you need to know that the Node.js cluster works by forking child processes, which means the processes cannot share data directly.
To share data, you need to use Redis or another IMDG (in-memory data grid) to share the data across the cluster nodes.
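A minimal sketch of sharing state through Redis (it assumes the redis npm package with its older callback-style API; newer versions are promise-based and need an explicit connect() call):

```js
const redis = require('redis');
const client = redis.createClient(); // assumes Redis running on localhost:6379

// Any worker can write a value...
client.set('visits', '42');

// ...and any other worker can read it back.
client.get('visits', (err, value) => {
  if (err) throw err;
  console.log(`visits = ${value}`);
});
```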
I have a simple Node.js server app built that I'm hoping to test out soon. It's single-threaded and works fine without any child processing whatsoever. My problem is that the server box has multiple cores, and the simplest way I can think of to utilize them is by running multiple instances of the server app. However, this would require them all to be on the same domain name, so some sort of request routing is required. I personally don't have much experience with servers in general and don't know if this is a task for Node.js to perform or for some other, less complicated program (or a more complicated one). If there is a Node.js mechanism to solve this, for example if one running instance can send incoming requests to the next instance, then how would I detect when this needs to happen? Conversely, if I use some other program, how will it manage to detect when it needs to start talking to a new instance?
Node.js includes built-in support for managing a cluster of instances of your application to take advantage of multiple cores via the cluster module.
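A bare-bones sketch with the built-in http module (port 8000 is arbitrary). Note that there is nothing for your code to detect: every worker listens on the same port, and the cluster master distributes incoming connections among them:

```js
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // One worker per core.
  os.cpus().forEach(() => cluster.fork());
} else {
  // Each worker calls listen() on the same port; the master accepts
  // connections and hands them off, so no manual routing is needed.
  http.createServer((req, res) => {
    res.end(`handled by worker ${process.pid}\n`);
  }).listen(8000);
}
```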
I am trying to throttle hosted Node.js applications. Those applications are user-created in a web IDE, and it seems one of them can knock out the entire server.
Do we need to implement this in C++ and rebuild Node.js ourselves?
If you are using Linux, you can try something like renice to set the priority of each of these processes. Node.js is no different from hosting Python, Perl, or PHP applications; any of them can consume a lot of CPU if the program is written poorly or the application is processing many requests.
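For example, to lower the scheduling priority of an already running process (the PID is a placeholder):

```
# give process 12345 a higher niceness, i.e. a lower CPU priority
renice -n 10 -p 12345
```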
If by "knock out the entire server" you mean can cause a kernel panic, make sure you have the latest version of node.js and your server is up to date. This should never happen.