I have a central API server that starts cluster worker instances. Each instance has its own larger job, and there may be operations I want to perform on that specific instance only. This was the rough idea I had in mind:
API Server with express, master process
instance 1: GET /instances/1/*
instance 2: GET /instances/2/*
Each instance is a separate worker process, and I was hoping I could delegate all API requests for a specific worker directly to that worker (to execute functions in that worker).
The :id in /instances/:id represents the worker ID.
The client might request the logs where workerID = x, so GET /instances/x/logs.
The goal here is that the master routes all requests for instance x to the subprocess identified as x.
This is not for load distribution across workers that are essentially clones/mirrors.
Each of my workers may be performing a variation of a long-running task (days, weeks, or months long). Methods are shared across all workers, but if I call /instances/x/logs I only want to query that specific worker process. That's what I'm trying to figure out.
// route these to subprocess x
GET /instances/x/logs
POST /instances/x/settings
// route these to subprocess y
GET /instances/y/logs
POST /instances/y/settings
// route these to subprocess z
GET /instances/z/logs
POST /instances/z/settings
// make a new worker process, returns worker ID as reference
POST /instances/
I saw that I can have multiple Express listeners on the same port across different processes, but if I understood correctly, those requests are load-balanced automatically (by the cluster module rather than by Express itself). I can't route specific requests to specific workers based on the path, can I?
Each instance is a separate worker process, and I was hoping I could delegate all API requests for a specific worker directly to that worker (to execute functions in that worker).
Indeed you can do that, but unless your /instances/:id represents the worker ID, you've hit a dead end.
Let's assume the following example where :id is not a worker id:
W - Worker
W1: /instances/1/:method - has the following methods: names, cities, cars
W2: /instances/2/:method - has the following methods: names, fruits, stats
The HTTP client will want to access:
GET /instances/1/names - that's fine, names exists on both workers. TRUE
GET /instances/2/fruits - fruits exists on W2, but if the balancer hands that request to W1 you'll get an error, because fruits doesn't exist on W1. FALSE
Final answer:
You cannot make a particular worker pop up and serve a given request. The best you can do is set up communication between the master and the workers, or, for methods that require heavy processing, trigger them on whichever worker currently has low CPU usage. On the plus side, if workers die you can fork new ones without crashing your whole app.
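That said, since in your setup /instances/:id is the worker ID, one workable pattern is to keep the only HTTP listener in the master and forward each request to the matching worker over the IPC channel, replying once the worker answers. This is a rough sketch rather than production code: handleRequest is a hypothetical function inside the worker, and error handling and timeouts are left out.

const cluster = require("cluster");
const express = require("express");

if (cluster.isMaster) {
  const app = express();
  app.use(express.json());

  const pending = new Map(); // requestId -> the Express res still waiting
  let nextRequestId = 0;

  // POST /instances - fork a new long-running worker and return its id
  app.post("/instances", (req, res) => {
    const worker = cluster.fork();
    res.json({ workerId: worker.id });
  });

  // anything under /instances/:id/... is forwarded to that worker over IPC
  app.all("/instances/:id/*", (req, res) => {
    const worker = cluster.workers[req.params.id];
    if (!worker) return res.status(404).send("no such worker");

    const requestId = ++nextRequestId;
    pending.set(requestId, res);
    worker.send({ requestId, method: req.method, path: req.path, body: req.body });
  });

  // workers answer on the same IPC channel; match the reply to the waiting res
  cluster.on("message", (worker, msg) => {
    const res = pending.get(msg.requestId);
    if (res) {
      pending.delete(msg.requestId);
      res.status(msg.status).json(msg.payload);
    }
  });

  app.listen(3000);
} else {
  // worker: no HTTP server at all, it only reacts to messages from the master
  process.on("message", (msg) => {
    const payload = handleRequest(msg.method, msg.path, msg.body); // hypothetical
    process.send({ requestId: msg.requestId, status: 200, payload });
  });
}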
Related
I have a SERVICE that receives requests from a webhook, and it is currently deployed across separate Cloud Run containers. These separate containers use the exact same image; however, each instance processes data separately for a particular account.
This is because processing a request takes roughly 3-5 minutes, and if the user sends more requests, each must wait for the existing one for that particular user to complete before the next is processed, to avoid race conditions. The container can still receive webhooks in the meantime; however, the actual processing of the data needs to be done one at a time for each account.
Is there a way to reduce the container count, for example by using one container to process all the requests, while still ensuring it processes one task per user at a time and waits for it to complete before processing the next request from the same user?
To explain it better:
Multiple tasks can be run across all the users
However, per user, only one task is processed at a time; once that is completed, the next task for that user can be processed
I was thinking of tracking the tasks in a Redis cache; however, with Cloud Run being stateless, I am not sure that is the right way to go.
Or separating the requests from the actual work - master/worker across two images - and having the worker report back to the master once a task is completed for the user (using concurrency to process multiple tasks across users); however, that might mean I would have to increase the Cloud Run timeout.
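To illustrate the per-user ordering I mean, here is a minimal sketch of the idea in plain Node.js, assuming an in-memory map of promise chains (it only holds as long as all requests for a given user reach the same instance; the work function is just a stand-in):

// one pending promise chain per user; tasks for the same user run strictly in order
const chains = new Map(); // userId -> promise of the last queued task

function enqueueForUser(userId, task) {
  const previous = chains.get(userId) || Promise.resolve();
  // run the new task only after everything already queued for this user
  const next = previous.catch(() => {}).then(task);
  chains.set(userId, next);
  // forget the chain once it drains so the map does not grow forever
  next.finally(() => {
    if (chains.get(userId) === next) chains.delete(userId);
  });
  return next;
}

// demo: the same user is serialized, different users overlap
const work = (label, ms) => () =>
  new Promise((resolve) => setTimeout(() => { console.log(label); resolve(); }, ms));

enqueueForUser("user-a", work("a:1", 300));
enqueueForUser("user-a", work("a:2", 100)); // logs after a:1 even though it is faster
enqueueForUser("user-b", work("b:1", 50));  // runs in parallel with user-a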
Good to hear any other suggestions.
Apologies if this doesn't seem clear, feel free to ask for more information.
I am trying to design a service following the Command Query Responsibility Segregation Pattern (CQRS) in NodeJs. To handle segregation, I am taking the following approach:
Create separate workers for querying and executing commands
Expose them using a REST API
The REST API has been designed with ExpressJs. All endpoints starting with 'update', 'create' and 'delete' keywords are treated as commands; all endpoints with 'get' or 'find' are treated as queries.
When a request reaches its designated handler, one of the following occurs:
If it's a command, a response is sent immediately after delegating the task to a worker process; other services are notified by generating appropriate events when the master process receives a completion message from the worker.
If it's a query, the response is handled by a designated worker that uses a database connection reference, passed as an argument, to fetch and send the query result.
For (2) above, I am trying to create a mechanism that somehow "passes" the response object to the worker, which can then complete the request. Can this be done by "cloning" the response object and passing it as plain arguments? If not, what is the preferred way of achieving this?
I think for (2) you are better off passing the query to a worker process, which returns the result to the master process, which then sends back the response.
First of all, you don't really want to give the worker processes "access" to the outside. They should be all internal workers, managed by the master process.
Second, the Express server's job is to receive requests, do something with them, then return a result. Trying to hand the communication itself off to a worker seems like over-complicating things.
If you are really worried about your Express server getting overwhelmed with requests, you should consider something like Docker to create a "swarm" of express instances.
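Coming back to (2): a rough sketch of that "worker computes, master replies" flow, where the master keeps the res object and matches the worker's reply back to it with a request id (runQuery is a hypothetical function standing in for the worker's own database access):

const cluster = require("cluster");
const express = require("express");

if (cluster.isMaster) {
  const queryWorker = cluster.fork();
  const pending = new Map(); // id -> the res object, which never leaves the master
  let nextId = 0;

  queryWorker.on("message", ({ id, rows }) => {
    const res = pending.get(id);
    if (res) {
      pending.delete(id);
      res.json(rows); // the master, not the worker, completes the HTTP response
    }
  });

  const app = express();
  app.get("/find/:entity", (req, res) => {
    const id = ++nextId;
    pending.set(id, res);
    queryWorker.send({ id, entity: req.params.entity });
  });
  app.listen(3000);
} else {
  // query worker: owns its own database connection, never sees the HTTP layer
  process.on("message", async ({ id, entity }) => {
    const rows = await runQuery(entity); // hypothetical query function
    process.send({ id, rows });
  });
}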
Is there a service, framework, or any other way to run Node.js for heavy computations that lets me choose the number of cores?
I'll be more specific: let's say I want to run some expensive computation for each of my users and I have 20000 users.
So I want to run the expensive computation for each user on a separate thread/core/computer, so I can finish the computation for all users faster.
But I don't want to deal with low level server configuration, all I'm looking for is something similar to AWS Lambda but for high performance computing, i.e., letting me scale as I please (maybe I want 1000 cores).
I did simulate this with AWS Lambda by having a "master" lambda that receives the data for all 20000 users and then calls a "computation" lambda for each user. Problem is, with AWS Lambda I can't make 20000 requests and wait for their callbacks at the same time (I get a request limit exceeded error).
With some setup I could use Amazon HPC, Google Compute Engine, or Azure, but they only go up to 64 cores, so if I need more than that I'd still have to set up all the machines I need separately and orchestrate the communication between them with something like Open MPI, handling the different low-level setups for master and compute instances (accessing via SSH, etc.).
So is there any service where I can just paste my Node.js code, maybe choose the number of cores, and run it (without having to care about the OS or how many computers are in my cluster)?
I'm looking for something that can take that code:
var users = [...];

function expensiveCalculation(user) {
  // ...
  return ...;
}

users.forEach(function(user) {
  Thread.create(function() {
    save(user.id, expensiveCalculation(user));
  });
});
And run each thread on a separate core so they can run simultaneously (therefore finishing faster).
I think that your problem is that you feel the need to process 20000 inputs at once on the same machine. Have you looked into SQS from Amazon? Maybe you push those 20000 inputs into SQS and then have a cluster of servers pull from that queue and process each one individually.
With this approach you could add as many servers, processes or add as many AWS Lambda invokes as you want. You could even use a combination of the 3 to see what's cheaper or faster. Adding resources will only reduce the amount of time it would take to complete the computations. Then you wouldn't have to wait for 20000 requests or anything to complete. The process could tell you when it completes the computation by sending some notification after it completes.
So basically, you could have a simple application that just grabbed 10 of these inputs at a time and ran your computation on them. After it finishes you could then have this process delete them from SQS and send a notification somewhere (Maybe SNS?) to notify the user or some other system that they are done. Then it would repeat the process.
After that you could scale the process horizontally and you wouldn't need a super computer in order to process this. So you could either get a cluster of EC2 instances that ran several of these applications a piece or have a Lambda function invoked periodically in order to pull items out of SQS and process them.
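For illustration, a rough sketch of such a worker process using the aws-sdk v2 client (QUEUE_URL and TOPIC_ARN are placeholders, credentials and region are assumed to come from the environment, and expensiveCalculation is the function from the question):

const AWS = require("aws-sdk"); // v2; region/credentials come from the environment
const sqs = new AWS.SQS();
const sns = new AWS.SNS();

const QUEUE_URL = process.env.QUEUE_URL;  // placeholder
const TOPIC_ARN = process.env.TOPIC_ARN;  // placeholder

async function poll() {
  const { Messages = [] } = await sqs.receiveMessage({
    QueueUrl: QUEUE_URL,
    MaxNumberOfMessages: 10, // grab up to 10 inputs at a time
    WaitTimeSeconds: 20,     // long polling
  }).promise();

  for (const msg of Messages) {
    const user = JSON.parse(msg.Body);
    const result = expensiveCalculation(user); // the function from the question

    // delete only after the work succeeded, then announce completion
    await sqs.deleteMessage({ QueueUrl: QUEUE_URL, ReceiptHandle: msg.ReceiptHandle }).promise();
    await sns.publish({ TopicArn: TOPIC_ARN, Message: JSON.stringify({ userId: user.id, result }) }).promise();
  }

  setImmediate(poll); // keep pulling; run as many copies of this process as you like
}

poll();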
EDIT:
To get started using an EC2 instance I would look at the docs here. To start with I would pick the smallest, cheapest instance (t2.micro, I think) and leave everything at its default. There's no need to open any port other than the one for SSH.
Once it's set up and you log in, the first thing you need to do is run aws configure to set up your profile so that you can access AWS resources from the instance. After that, install Node and get your application on there using git or something. Once that's done, go to the EC2 console, and in your Actions menu there will be an option to create an image from the instance.
Once you create an image, then you can go to Auto Scaling groups and create a launch configuration using that AMI. Then it'll let you specify how many instances you want to run.
I feel like this could also be done more easily using their container service, but honestly I don't know how to use it yet.
I have a site that makes the standard data-bound calls, but it also has a few CPU-intensive tasks which are run a few times per day, mainly by the admin.
These tasks include grabbing data from the DB, running a few different time-consuming algorithms, then re-uploading the data. What would be the best method for making these calls and having them run without blocking the event loop?
I definitely want to keep the calculations on the server so web workers wouldn't work here. Would a child process be enough here? Or should I have a separate thread running in the background handling all /api/admin calls?
The basic answer to this scenario in Node.js land is to use the core cluster module - https://nodejs.org/docs/latest/api/cluster.html
It provides an accessible API to:
easily launch worker node.js instances on the same machine (each instance will have its own event loop)
keep a live communication channel for short messages between instances
This way, any work done in a child instance will not block your master's event loop.
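A minimal sketch of that shape - the master answers the /api/admin call immediately and forks a worker whose own event loop does the heavy lifting; runAdminJob is a hypothetical stand-in for your algorithms:

const cluster = require("cluster");
const express = require("express");

if (cluster.isMaster) {
  const app = express();

  app.post("/api/admin/recalculate", (req, res) => {
    const worker = cluster.fork(); // separate process with its own event loop
    worker.on("message", (msg) => console.log("admin job finished:", msg));
    worker.on("exit", () => console.log("admin worker exited"));
    res.status(202).send("job started"); // answer right away, nothing is blocked
  });

  app.listen(3000);
} else {
  // worker: the CPU-heavy part runs here and never touches the master's loop
  const result = runAdminJob(); // hypothetical stand-in for the real algorithms
  process.send(result);
  process.exit(0);
}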
Can someone explain in detail how the core cluster module works in Node.js?
How the workers are able to listen to a single port?
As far as I know, the master process does the listening, but how can it know which ports to listen on, since the workers are started after the master process? Do they somehow communicate that back to the master using the child_process.fork communication channel? And if so, how is an incoming connection on the port passed from the master to the worker?
Also, I'm wondering what logic is used to determine which worker an incoming connection is passed to.
I know this is an old question, but this is now explained at nodejs.org here:
The worker processes are spawned using the child_process.fork method, so that they can communicate with the parent via IPC and pass server handles back and forth.

When you call server.listen(...) in a worker, it serializes the arguments and passes the request to the master process. If the master process already has a listening server matching the worker's requirements, then it passes the handle to the worker. If it does not already have a listening server matching that requirement, then it will create one, and pass the handle to the worker.

This causes potentially surprising behavior in three edge cases:

server.listen({fd: 7}) - Because the message is passed to the master, file descriptor 7 in the parent will be listened on, and the handle passed to the worker, rather than listening to the worker's idea of what the number 7 file descriptor references.

server.listen(handle) - Listening on handles explicitly will cause the worker to use the supplied handle, rather than talk to the master process. If the worker already has the handle, then it's presumed that you know what you are doing.

server.listen(0) - Normally, this will cause servers to listen on a random port. However, in a cluster, each worker will receive the same "random" port each time they do listen(0). In essence, the port is random the first time, but predictable thereafter. If you want to listen on a unique port, generate a port number based on the cluster worker ID.

When multiple processes are all accept()ing on the same underlying resource, the operating system load-balances across them very efficiently. There is no routing logic in Node.js, or in your program, and no shared state between the workers. Therefore, it is important to design your program such that it does not rely too heavily on in-memory data objects for things like sessions and login.

Because workers are all separate processes, they can be killed or re-spawned depending on your program's needs, without affecting other workers. As long as there are some workers still alive, the server will continue to accept connections. Node does not automatically manage the number of workers for you, however. It is your responsibility to manage the worker pool for your application's needs.
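The behaviour described above can be seen in a few lines: every worker appears to call listen(8000), but the socket is bound once in the master, which then shares the handle (and the incoming connections) with the workers.

const cluster = require("cluster");
const http = require("http");
const os = require("os");

if (cluster.isMaster) {
  // fork one worker per CPU; the master itself never handles requests
  os.cpus().forEach(() => cluster.fork());
} else {
  // each worker "listens" on 8000, but the bind happens once in the master,
  // which passes the shared handle (or the connections) to the workers
  http.createServer((req, res) => {
    res.end("handled by worker " + cluster.worker.id + "\n");
  }).listen(8000);
}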
Node.js uses a round-robin decision to load-balance between the child processes: it hands each incoming connection to the next worker in turn, following the RR algorithm.
The children and the parent do not actually share anything; the whole script is executed from beginning to end in every worker, which is the main difference from a normal C fork. A traditionally forked C child continues executing from the instruction where the fork happened, not from the beginning as in Node.js. So if you want to share anything, you need to connect to a cache like Memcached or Redis.
So the code below prints 6 6 6 (nothing evil meant) on the console: once in the master and once in each of the two forked workers.
var cluster = require("cluster");
var a = 5;
a++;
console.log(a);
if ( cluster.isMaster){
worker = cluster.fork();
worker = cluster.fork();
}
Here is a blog post that explains this
As an update to OpenUserX03's answer: Node.js no longer uses the operating system's load balancing but a built-in one instead. From this post:
To fix that, Node v0.12 got a new implementation using a round-robin algorithm to distribute the load between workers in a better way. This has been the default approach Node uses ever since, including in Node v6.0.0.
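For completeness, the distribution method can be inspected or overridden through cluster.schedulingPolicy, which must be set before the first fork (SCHED_RR is the round-robin approach described above; SCHED_NONE leaves the choice to the operating system):

const cluster = require("cluster");

// must be set before any worker is forked; can also be chosen with the
// NODE_CLUSTER_SCHED_POLICY environment variable ("rr" or "none")
cluster.schedulingPolicy = cluster.SCHED_RR;

if (cluster.isMaster) {
  cluster.fork();
  cluster.fork();
}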