Hi, I am implementing an email client application. My application is going to deal with on the order of 10000 * 10000 records, so for scalability I chose the cluster concept from Node.js. My requirement is that every 1000 records should be handled by one master node with the help of its workers. So my question is: on a single server, how many master nodes are allowed? If anyone knows, please let me know.
Master 1 -> should handle 1000 records
4 workers (4 records can be processed at a time if the CPU has 4 cores)
Master 2 -> should handle 1000 records
4 workers (4 records can be processed at a time if the CPU has 4 cores)
I need to handle the records like the above.
The answer is 1. Node.js scales nicely on one CPU core, and to use more cores you have the workers. Separate masters also can't share the same port, so how would the email clients know how to connect to the additional masters?
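For what it's worth, a minimal sketch of that single-master pattern with Node's cluster module (the record handling itself is only hinted at; batching and handlers are up to you):

    // master.js - one master forks one worker per CPU core
    const cluster = require('cluster');
    const os = require('os');

    if (cluster.isMaster) {
      // The single master only coordinates; workers do the record processing.
      for (let i = 0; i < os.cpus().length; i++) {
        cluster.fork();
      }
      cluster.on('exit', (worker) => {
        console.log('worker ' + worker.process.pid + ' died, restarting');
        cluster.fork();
      });
    } else {
      // Each worker would pick up its share of the records here.
      console.log('worker ' + process.pid + ' started');
    }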
I have a Node.js app with one getUsers endpoint; it fetches the first 10 users from a MongoDB database and returns them in a JSON response.
Let's say this is a synchronous process (async won't help here since I need to fetch and return the result in the same request) and the entire process takes 1 second. So, if I have 4 CPUs in a virtual machine and I am running Node.js in cluster mode (meaning 4 Node.js processes running together):
Q1) Does this mean my app can handle 4 simultaneous requests per second? If this is true, how can my app handle 1000 requests per second?
Q2) I know Node.js is single threaded, but is there any way a Node.js instance can use more than 1 thread per CPU so that it can handle more requests per CPU per second?
Q1) Does this mean my app can handle 4 simultaneous requests per second?
If we neglect the load balancer's processing time, your cluster could handle around 4 RPS (requests per second).
If this is true, how can my app handle 1000 requests per second?
In short: by scaling via threads or server instances.
More details are described below.
Q2) I know Node.js is single threaded
That statement is falsy.
Node.js has at least 4 threads for handling I/O operations and a few more threads for other under-the-hood work.
This makes even a single Node.js instance pretty powerful at handling many async operations.
but is there any way the Node.js instance can use more than 1 thread per CPU
Yes, you can use worker_threads to create as many threads as you wish.
so that it can handle more requests per CPU per second?
First, make your request handler work asynchronously so it can process more than one request at a time within a single thread (using a single CPU core).
Then scale it up via worker_threads to use more threads, i.e. more cores, to process a higher volume of requests.
And that's all inside a single process.
P.S. Since each new thread will occupy a core, a common recommendation is to spawn around N - 1 threads, where N is the number of logical cores on your CPU.
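A minimal worker_threads sketch of that idea; the fib function here is just a stand-in for your real CPU-heavy handler:

    // fib-worker.js - offload a CPU-bound task to a separate thread
    const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

    if (isMainThread) {
      // Main thread: spawn a worker and keep the event loop free for requests.
      const worker = new Worker(__filename, { workerData: { n: 40 } });
      worker.on('message', (result) => console.log('result:', result));
      worker.on('error', (err) => console.error(err));
    } else {
      // Worker thread: the blocking computation runs here.
      const fib = (n) => (n < 2 ? n : fib(n - 1) + fib(n - 2));
      parentPort.postMessage(fib(workerData.n));
    }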
Q1) Does this mean my app can handle 4 simultaneous requests per second? If this is true, how can my app handle 1000 requests per second?
Yes, though not because of the multiple instances: Node.js accepts each incoming request asynchronously, even though the processing of a request is itself synchronous, so you could handle the 4 of them with a single instance.
The number of requests that can be handled depends on your server.
I recommend running a performance test; for example, I have recently used Grafana k6.
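For example, a basic k6 script for an endpoint like getUsers might look like this (the URL and the load profile are placeholders, not from the question):

    // load-test.js - run with: k6 run load-test.js
    import http from 'k6/http';
    import { check } from 'k6';

    export const options = {
      vus: 100,        // 100 concurrent virtual users
      duration: '30s', // sustained for 30 seconds
    };

    export default function () {
      const res = http.get('http://localhost:3000/getUsers'); // placeholder URL
      check(res, { 'status is 200': (r) => r.status === 200 });
    }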
Q2) I know Node.js is single threaded, but is there any way the Node.js instance can use more than 1 thread per CPU so that it can handle more requests per CPU per second?
No. If you want to use more CPUs, you have to create multiple instances in cluster mode.
Check this out
I have a situation where I create a Node.js cluster using PM2. A single request fired at a worker takes considerable time (2+ minutes) because it performs intensive computations (in a pipeline of steps) with a couple of I/O operations at different stages (step 1 is 'download over HTTP'; an intermediate step and the last step are 'write to disk'). The client that sends requests to the cluster throttles them by two factors:
Frequency (how many requests per second); we use a slow pace (1 per second)
How many open requests it can have; we keep this less than or equal to the number of nodes in the cluster
For example, if the cluster has 10 nodes, the client will only send 10 requests, at a speed of 1 per second, and won't send any more until one or more requests return with either success or failure, which means one or more workers should now be free; then the client sends more work to the cluster.
While watching the load on the server, it seems that the load balancer does not distribute work evenly, as one would expect from a classic round-robin distribution scheme. What happens is that a single worker (usually the first one) receives a lot of requests while there are free workers in the cluster. This eventually causes that worker to malfunction.
We implemented a mechanism to prevent a worker from accepting new requests while it is still working on a previous one. This prevented the malfunctioning, but a lot of requests are still denied service even though the cluster has vacant workers!
Can you think of a reason why this behavior is happening, or how we can improve the way PM2 distributes work?
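For reference, the guard is roughly this (a simplified sketch; the Express setup and runPipeline are illustrative placeholders rather than our exact code):

    // Each PM2 worker refuses new jobs while one is already in flight.
    const express = require('express');
    const app = express();
    app.use(express.json());

    let busy = false; // per-process flag; every cluster worker has its own

    app.post('/job', async (req, res) => {
      if (busy) {
        // Reject instead of queueing behind a 2+ minute job.
        return res.status(503).send('worker busy');
      }
      busy = true;
      try {
        await runPipeline(req.body); // placeholder for the long pipeline
        res.send('done');
      } finally {
        busy = false;
      }
    });

    // Stand-in for the real multi-step pipeline.
    function runPipeline(job) {
      return new Promise((resolve) => setTimeout(resolve, 1000));
    }

    app.listen(3000);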
I am working on a Node.js application, and the requirement is to send around 10k requests per second per connection. The client application has to open one websocket connection to send these requests, and on the server side it just has to receive the data and send it to a queue. The number of socket connections on the server side isn't that high, maybe around 1k. I have a few questions regarding this, and any help is greatly appreciated.
First, is it possible to achieve this setup with a single master process? Since I cannot share the websocket connections with the child processes, I need to get the bandwidth from the master process.
When I tried benchmarking the Node.js ws library, I was only able to send approximately 1k requests per second of 9 KB each. How can I increase the throughput?
Are there any examples of how to achieve max throughput? I can only find posts on how to achieve max connections.
Thanks.
You will eventually need to scale
First, is it possible to achieve this setup with a single master process?
I don't really think it's possible to achieve this with a single thread.
(You should consider scaling, and never design in a way that restricts you from that option.)
Since I cannot share the websocket connections with the child processes, I need to get the bandwidth from the master process.
I'm sure you will be happy to learn of the existence of socket.io-redis.
With it you will be able to send/receive events (and share clients) between multiple instances of your code (processes or servers). Read more: socket.io-redis (GitHub)
I know you are speaking about ws, but it may be worth the change to socket.io.
Especially knowing you can scale both vertically (increase the number of threads per machine) and horizontally (deploy more "masters" across more machines) with relative ease (and, I repeat, sharing your socket clients & communications across all instances).
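A minimal sketch of wiring the adapter in, using the classic socket.io-redis API (the port and Redis address are placeholders; newer socket.io versions moved to @socket.io/redis-adapter):

    // server.js - every instance relays events through the same Redis
    const io = require('socket.io')(3000);
    const redisAdapter = require('socket.io-redis');

    // All instances pointed at this Redis can reach each other's clients.
    io.adapter(redisAdapter({ host: 'localhost', port: 6379 }));

    io.on('connection', (socket) => {
      socket.on('message', (data) => {
        // This broadcast reaches clients connected to *any* instance.
        io.emit('message', data);
      });
    });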
When I tried benchmarking the Node.js ws library, I was only able to send approximately 1k requests per second of 9 KB each. How can I increase the throughput?
I would suggest trying socket.io + socket.io-redis:
spawn a master with a number of workers equal to the number of CPU cores (vertical scaling)
deploy your master across 2 or more machines (horizontal scaling)
learn about load balancing and perform benchmarks
Are there any examples of how to achieve max throughput? I can only find posts on how to achieve max connections.
You will increase the total throughput if you increase the number of instances communicating with clients. (Horizontal + Vertical Scaling)
socket.io-redis Application Example (github)
Using Redis with Node.js and Socket.IO (tutorial)
This might also be an interesting read:
SocketCluster 100k messages / sec (ycombinator)
SocketCluster (github)
Hope it helped.
So I'm starting to use node.js for a project I'm doing.
When a client makes a request, my Node.js server fetches a JSON payload from another server and then reformats it into a new JSON payload that gets served to this client. However, the JSON that the node server gets from the other server can potentially be pretty big, so that "massaging" of the data is pretty CPU intensive.
I've been reading for the past few hours about how Node.js isn't great for CPU-bound tasks, and the main response I've seen is to spawn a child process (basically a .js file running through a different instance of node) that deals with any CPU-intensive tasks that might block the main event loop.
So let's say I have 20,000 concurrent users; that would mean spawning 20,000 OS-level jobs to run these child processes.
Does this sound like a good idea? (A different web server would just create 20,000 threads in the same process.)
I'm not sure if I should be running a child process, but I do need to make the CPU-intensive task non-blocking. Any ideas on what I should do?
The people who say that don't know how to architect solutions.
Node.js is exactly what it says: it is a node, and should be treated as such.
In your example, your node instance connects to an external API and grabs JSON to process and send back, i.e.:
1. GET // server.com/getJSON
2. Process the JSON
3. POST // server.com/postJSON
So what do you do?
Ask yourself: is time an issue? If so, then node isn't the solution.
However, if you are more interested in raw processing power, then instead of 1 request done in 4 seconds, you want 200 requests finishing in 10 seconds, even if each individual one takes close to the full 10 seconds.
Work out how long your JSON massaging takes. If it is less than 1 second, just run 4 node instances instead of 1.
However, if it's more complex than that, break the JSON into segments and use asynchronous callbacks to process each segment:

    process.nextTick(function () {
      doProcess(segment1);
      process.nextTick(function () {
        doProcess(segment2);
      });
    });

Each doProcess schedules the next doProcess, and Node.js will trade time between requests.
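A fuller sketch of this pattern, with doProcess as a stand-in for the heavy work on one segment. Note that it uses setImmediate rather than a recursive process.nextTick chain, because nextTick callbacks run before pending I/O and a long chain of them would still starve other requests:

    // Process one segment per event-loop turn so other requests can interleave.
    function doProcess(segment) {
      // Stand-in for the real CPU-heavy massaging of one JSON segment.
      JSON.stringify(segment);
    }

    function processSegments(segments, done) {
      let i = 0;
      (function next() {
        if (i >= segments.length) return done();
        doProcess(segments[i++]);
        setImmediate(next); // yield to the event loop between segments
      })();
    }

    // Usage: processSegments(chunks, () => sendResponse());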
Now take that solution, scale it to 4 node instances per server and 2-5 servers, and suddenly you have an extremely scalable and cost-effective solution.
I have two Tomcat servers running at the same time. Reports requested from server 1 are sent to server 2 for processing. So how would I go about managing the threads on server 2? For example, if I wanted to queue up the work, how would I go about doing that?
Use a message queue (like RabbitMQ) in the middle to queue up the tasks that need to be done.
Then your report-generating server can pull jobs from the queue and work on them. If you need to slow down or speed up, you can adjust the number of "workers" running.
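To make the pattern concrete, here is a minimal consumer sketch. It is written in Node.js with amqplib for consistency with the rest of this page (the queue name, URL, and generateReport are placeholders), but the same pull-and-ack pattern applies to a Java client on Tomcat:

    // worker.js - pulls report jobs from RabbitMQ one at a time
    const amqp = require('amqplib');

    async function main() {
      const conn = await amqp.connect('amqp://localhost'); // placeholder URL
      const ch = await conn.createChannel();
      await ch.assertQueue('reports', { durable: true }); // placeholder queue name
      ch.prefetch(1); // hold at most one unacknowledged job per worker

      ch.consume('reports', async (msg) => {
        await generateReport(msg.content.toString()); // placeholder for the real work
        ch.ack(msg); // ack so the broker can dispatch the next job
      });
    }

    // Stand-in for the actual report generation.
    function generateReport(payload) {
      console.log('processing', payload);
      return Promise.resolve();
    }

    main().catch(console.error);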