Imagine you are going to have a lot of long processor intensive tasks of translating some strings into something else. You are going to want to have a pool of actual threads to keep the main node thread going and to make use of your cores.
The main way to do this is to either implement Threads-a-gogo or Webworker-Threads, and start a pool of 16 threads (e.g. on a Intel with 8 cores you usually have 16 threads concurrently).
Doing a request to a thread is called an event or a message. Getting a response is also catching an event or getting a message. But how does this work with a threadPool?
If you skip the Webworker API, TAGG and Webworkers for node have the same underlying API. You can load your translation function in all workers using threadPool.load and que a task to one of them using threadPool.any.
But imagine I now have 50 tasks (strings to translate) to be queued. The threadPool will eventually emit 50 events (responses with a translated string) without telling me what task the response belongs to?
I think I am fundamentally misunderstanding one thing about the threadPool.
Is there a way I can just add a task to the threadPool queue and receive a callback when that particular task is done?
Why emit events from the thread pool when you can just return the translated string? The value returned by the code is received by the callback you passed to threadpool.any.eval(). Example:
threadPool.any.eval('return "hello world"', function(err, data) {
// data === 'hello world'
});
Related
I've been testing some code to see how does NodeJS event loop actually works. So I get in touch with this piece of code:
console.time('Time spending');
let list = [];
for (let index = 0; index < 1000000; index++) {
const data = JSON.stringify({
id: Date.now(),
index,
});
list.push(data);
}
console.log(list);
console.timeEnd('Time spending');
When this code is executed, NodeJS spawns eleven threads on SO (Ubuntu running on WSL 2). But why it does that?
This code is not being declared as an async code.
That's the worker pool. As mentioned in the great guide Don't block the event loop, Node.js has an "event loop" and a worker pool. Those threads you see are the worker pool, and the size is defined with the environment variable UV_THREADPOOL_SIZE, from libuv, which Node.js uses internally. The reason node.js spawns those threads has nothing to do with your expensive loop, it's just the default behavior at startup.
There's extensive documentation on how the event loop works on the official Node.js site, but essentially some operations, like filesystem I/O, are synchronous because the underlying operating system does not offer an asynchronous interface (or it's too new/experimental). Node.js works around that by using a thread pool where the event loop submits a task, like reading a file, which is usually a synchronous job, and goes to the next event while a thread does the dirty work of actually reading the file, it can block the thread, but it does not matter, because the event loop is not blocked. When it's done, it reaches back to the event loop with the data. So, for the event loop (and the programmer), the synchronous read was done asynchronously.
There are no parallel threads being used to run your code. Nodejs runs all your code you show in just one thread. You could just do this:
setTimeout(() => {
console.log("done with timeout");
}, 10 * 60 * 1000);
And, you would see the same number of threads. What you are seeing has nothing to do with your specific code.
The other threads you see are just other threads that nodejs uses for it's own internal purposes such as the worker pool for disk I/O, asynchronous crypto, some other built-in operations and other internal housekeeping operations.
Also, Javascript code marked as async still runs in the one main Javascript thread so your reference that nothing is async wouldn't change things either. It doesn't matter (from a thread point of view) whether code is async or not.
Your big for loop blocks the entire event loop so no other Javascript code or events can run until your for loop finishes. There's no really much to learn about the event loop from this code except that your loop blocks the event loop until the loop completes.
Is there ever any reason to add blocks to a serial dispatch queue asynchronously as opposed to synchronously?
As I understand it a serial dispatch queue only starts executing the next task in the queue once the preceding task has completed executing. If this is the case, I can't see what you would you gain by submitting some blocks asynchronously - the act of submission may not block the thread (since it returns straight-away), but the task won't be executed until the last task finishes, so it seems to me that you don't really gain anything.
This question has been prompted by the following code - taken from a book chapter on design patterns. To prevent the underlying data array from being modified simultaneously by two separate threads, all modification tasks are added to a serial dispatch queue. But note that returnToPool adds tasks to this queue asynchronously, whereas getFromPool adds its tasks synchronously.
class Pool<T> {
private var data = [T]();
// Create a serial dispath queue
private let arrayQ = dispatch_queue_create("arrayQ", DISPATCH_QUEUE_SERIAL);
private let semaphore:dispatch_semaphore_t;
init(items:[T]) {
data.reserveCapacity(data.count);
for item in items {
data.append(item);
}
semaphore = dispatch_semaphore_create(items.count);
}
func getFromPool() -> T? {
var result:T?;
if (dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER) == 0) {
dispatch_sync(arrayQ, {() in
result = self.data.removeAtIndex(0);
})
}
return result;
}
func returnToPool(item:T) {
dispatch_async(arrayQ, {() in
self.data.append(item);
dispatch_semaphore_signal(self.semaphore);
});
}
}
Because there's no need to make the caller of returnToPool() block. It could perhaps continue on doing other useful work.
The thread which called returnToPool() is presumably not just working with this pool. It presumably has other stuff it could be doing. That stuff could be done simultaneously with the work in the asynchronously-submitted task.
Typical modern computers have multiple CPU cores, so a design like this improves the chances that CPU cores are utilized efficiently and useful work is completed sooner. The question isn't whether tasks submitted to the serial queue operate simultaneously — they can't because of the nature of serial queues — it's whether other work can be done simultaneously.
Yes, there are reasons why you'd add tasks to serial queue asynchronously. It's actually extremely common.
The most common example would be when you're doing something in the background and want to update the UI. You'll often dispatch that UI update asynchronously back to the main queue (which is a serial queue). That way the background thread doesn't have to wait for the main thread to perform its UI update, but rather it can carry on processing in the background.
Another common example is as you've demonstrated, when using a GCD queue to synchronize interaction with some object. If you're dealing with immutable objects, you can dispatch these updates asynchronously to this synchronization queue (i.e. why have the current thread wait, but rather instead let it carry on). You'll do reads synchronously (because you're obviously going to wait until you get the synchronized value back), but writes can be done asynchronously.
(You actually see this latter example frequently implemented with the "reader-writer" pattern and a custom concurrent queue, where reads are performed synchronously on concurrent queue with dispatch_sync, but writes are performed asynchronously with barrier with dispatch_barrier_async. But the idea is equally applicable to serial queues, too.)
The choice of synchronous v asynchronous dispatch has nothing to do with whether the destination queue is serial or concurrent. It's simply a question of whether you have to block the current queue until that other one finishes its task or not.
Regarding your code sample code, that is correct. The getFromPool should dispatch synchronously (because you have to wait for the synchronization queue to actually return the value), but returnToPool can safely dispatch asynchronously. Obviously, I'm wary of seeing code waiting for semaphores if that might be called from the main thread (so make sure you don't call getFromPool from the main thread!), but with that one caveat, this code should achieve the desired purpose, offering reasonably efficient synchronization of this pool object, but with a getFromPool that will block if the pool is empty until something is added to the pool.
I am implementing a REST service for financial calculation. So each request is supposed to be a CPU intensive task, and I think that the best place to create threads it's in the following function:
exports.execute = function(data, params, f, callback) {
var queriesList = [];
var resultList = [];
for (var i = 0; i < data.lista.length; i++)
{
var query = (function(cod) {
return function(callbackFlow) {
params.paramcodneg = cod;
doCdaQuery(params, function(err, result)
{
if (err)
{
return callback({ERROR: err}, null);
}
f(data, result, function(ret)
{
resultList.push(ret);
callbackFlow();
});
});
}
})(data.lista[i]);
queriesList.push(query);
}
flow.parallel(queriesList, function() {
callback(null, resultList);
});
};
I don't know what is best, run flow.parallel in a separeted thread or run each function of the queriesList in its own thread. What is best ? And how to use threads-a-gogo module for that ?
I've tried but couldn't write the right code for that.
Thanks in advance.
Kleyson Rios.
I'll admit that I'm relatively new to node.js and I haven't yet used threads a gogo, but I have had some experience with multi-threaded programming, so I'll take a crack at answering this question.
Creating a thread for every single query (I'm assuming these queries are CPU-bound calculations rather than IO-bound calls to a database) is not a good idea. Creating and destroying threads in an expensive operation, so creating an destroying a group of threads for every request that requires calculation is going to be a huge drag on performance. Too many threads will cause more overhead as the processor switches between them. There isn't any advantage to having more worker threads than processor cores.
Also, if each query doesn't take that much processing time, there will be more time spent creating and destroying the thread than running the query. Most of the time would be spent on threading overhead. In this case, you would be much better off using a single-threaded solution using flow or async, which distributes the processing over multiple ticks to allow the node.js event loop to run.
Single-threaded solutions are the easiest to understand and debug, but if the queries are preventing the main thread from getting other stuff done, then a multi-threaded solution is necessary.
The multi-threaded solution you propose is pretty good. Running all the queries in a separate thread prevents the main thread from bogging down. However, there isn't any point in using flow or async in this case. These modules simulate multi-threading by distributing the processing over multiple node.js ticks and tasks run in parallel don't execute in any particular order. However, these tasks still are running in a single thread. Since you're processing the queries in their own thread, and they're no longer interfering with the node.js event loop, then just run them one after another in a loop. Since all the action is happening in a thread without a node.js event loop, using flow or async in just introduces more overhead for no additional benefit.
A more efficient solution is to have a thread pool hanging out in the background and throw tasks at it. The thread pool would ideally have the same number of threads as processor cores, and would be created when the application starts up and destroyed when the application shuts down, so the expensive creating and destroying of threads only happens once. I see that Threads a Gogo has a thread pool that you can use, although I'm afraid I'm not yet familiar enough with it to give you all the details of using it.
I'm drifting into territory I'm not familiar with here, but I believe you could do it by pushing each query individually onto the global thread pool and when all the callbacks have completed, you'll be done.
The Node.flow module would be handy here, not because it would make processing any faster, but because it would help you manage all the query tasks and their callbacks. You would use a loop to push a bunch of parallel tasks on the flow stack using flow.parallel(...), where each task would send a query to the global threadpool using threadpool.any.eval(), and then call ready() in the threadpool callback to tell flow that the task is complete. After the parallel tasks have been queued up, use flow.join() to run all the tasks. That should run the queries on the thread pool, with the thread pool running as many tasks as it can at once, using all the cores and avoiding creating or destroying threads, and all the queries will have been processed.
Other requests would also be tossing their tasks onto the thread pool as well, but you wouldn't notice that because the request being processed would only get callbacks for the tasks that the request gave to the thread pool. Note that this would all be done on the main thread. The thread pool would do all the non-main-thread processing.
You'll need to do some threads a gogo and node.flow documentation reading and figure out some of the details, but that should give you a head start. Using a separate thread is more complex than using the main thread, and making use of a thread pool is even more complex, so you'll have to choose which one is best for you. The extra complexity might or might not be worth it.
I am building a CPU intensive web app, where ill write the CPU intensive stuff in C++ while ill write the webserver in node.js. The node.js would be connected to c++ via addons. I am confused about one thing -
Say the time of the CPU intensive operation per request is 5 seconds(maybe this involved inverting a huge matrix). When this request comes through, the node.js binding to c++ would send this request over to the c++ code.
Now does this mean that node.js would not be caught up for the next 5 seconds and can continue serving other requests?
I am confused as i have heard that even though node offers asynchronous features, it is still single threaded.
Obviously I would not want node.js to be stuck up for 5s as it is a huge price to pay. Imagine 100s of requests simultaneously for this intensive operation..
Trying to understand JS callbacks and asynchronicity logic, i came across with many different versions of the following description;
a callback function which is passed to another function as a parameter, runs
following to the time taking process of the function it's passed to.
The dilemma gets originated with the "time taking" adjective. Such as is it
Time taking because of CPU being idle and waiting for a response?
Time taking because of CPU being busy with number crunching like hell?
This is not clear in the description and confused me. So i tried the following two codes.
getData('http://fakedomain1234.com/userlist', writeData);
document.getElementById('output').innerHTML += "show this before data ...";
function getData(dataURI, callback) {
// Normally you would actually connect to a server here.
// We're just going to simulate a 3-second delay.
var timer = setTimeout(function () {
var dataArray = [123, 456, 789, 012, 345, 678];
callback(dataArray);
}, 3000);
}
function writeData(myData) {
document.getElementById('output').innerHTML += myData;
}
<body>
<p id="output"></p>
</body>
and
getData('http://fakedomain1234.com/userlist', writeData);
document.getElementById('output').innerHTML += "show this before data ...";
function getData(dataURI, callback) {
var dataArray = [123, 456, 789, 012, 345, 678];
for (i=0; i<1000000000; i++);
callback(dataArray);
}
function writeData(myData) {
document.getElementById('output').innerHTML += myData;
}
<body>
<p id="output"></p>
</body>
so in both codes there is a time taking activity in the getData function. In the first one the CPU is idle and in the second the CPU is busy. Clearly when CPU is busy the JS runtime is not asynchronous.
The main thread of Node is the JS event loop, so all logic interacting with JS is single threaded. This also includes any C++ logic triggered directly via JS.
Generally any long-running tasks should be split off into worker processes. For instance, in your case, you could have a worker process that would queue up calculations, emitting events back to the JS thread when they have completed.
So really, it's a question of how you go about your connected to c++ via addons code.
I'm not going to refer to the specifics of Node.js as I'm not that familiar with the internal architecture and the possibilities it allows (but I understand it supports multiple worker threads, each representing a different event loop)
In general, if you need to process 100 request/s that take 5 seconds solid CPU time, then there's nothing you can do, except ensuring that you have 500 processors available.
If 100 request/s is peak, while on average it will be much lower, then the solution is queueing, and you use the queue to absorb the blow.
Now things start to get interesting when it is not 5 seconds solid CPU time, but 0.1 CPU time and 4.9 waiting or anything in between. This is the case where asynchronous processing should be used to put all that waiting time to work.
Asynchronous in this case means that:
All your execution happens in an event loop.
You don't wait, no sleep, no blocking I/O, just execute or return to the event loop.
You split your task into non-blocking subtasks, interspeded with (async) events (e.g. with a response) that continue the execution.
You split your system into a number of event processing services, exchanging requests and responses through asynchronous events and collaborating to provide the overall functionality.
What to do if you have a subsystem you cannot turn into an asynchronous service under the principles above?
The answer is to wrap it with queues (to absorb the requests) + multiple threads (allowing execution of some threads hile other threads are waiting), providing the async events request/response interface expected by rest of the subsystems.
In all cases it is best to keep a bounded number of threads (instead of a per-request thread model) and always keep the total number of active/hot threads in the system below the number of processing resources.
Node.js is nice in that its input/output is inherently asynchronously and all the infrastructure is geared towards implementing the kind of things I described above.
With Node.js, or eventlet or any other non-blocking server, what happens when a given request takes long, does it then block all other requests?
Example, a request comes in, and takes 200ms to compute, this will block other requests since e.g. nodejs uses a single thread.
Meaning your 15K per second will go down substantially because of the actual time it takes to compute the response for a given request.
But this just seems wrong to me, so I'm asking what really happens as I can't imagine that is how things work.
Whether or not it "blocks" is dependent on your definition of "block". Typically block means that your CPU is essentially idle, but the current thread isn't able to do anything with it because it is waiting for I/O or the like. That sort of thing doesn't tend to happen in node.js unless you use the non-recommended synchronous I/O functions. Instead, functions return quickly, and when the I/O task they started complete, your callback gets called and you take it from there. In the interim, other requests can be processed.
If you are doing something computation-heavy in node, nothing else is going to be able to use the CPU until it is done, but for a very different reason: the CPU is actually busy. Typically this is not what people mean when they say "blocking", instead, it's just a long computation.
200ms is a long time for something to take if it doesn't involve I/O and is purely doing computation. That's probably not the sort of thing you should be doing in node, to be honest. A solution more in the spirit of node would be to have that sort of number crunching happen in another (non-javascript) program that is called by node, and that calls your callback when complete. Assuming you have a multi-core machine (or the other program is running on a different machine), node can continue to respond to requests while the other program crunches away.
There are cases where a cluster (as others have mentioned) might help, but I doubt yours is really one of those. Clusters really are made for when you have lots and lots of little requests that together are more than a single core of the CPU can handle, not for the case where you have single requests that take hundreds of milliseconds each.
Everything in node.js runs in parallel internally. However, your own code runs strictly serially. If you sleep for a second in node.js, the server sleeps for a second. It's not suitable for requests that require a lot of computation. I/O is parallel, and your code does I/O through callbacks (so your code is not running while waiting for the I/O).
On most modern platforms, node.js does us threads for I/O. It uses libev, which uses threads where that works best on the platform.
You are exactly correct. Nodejs developers must be aware of that or their applications will be completely non-performant, if long running code is not asynchronous.
Everything that is going to take a 'long time' needs to be done asynchronously.
This is basically true, at least if you don't use the new cluster feature that balances incoming connections between multiple, automatically spawned workers. However, if you do use it, most other requests will still complete quickly.
Edit: Workers are processes.
You can think of the event loop as 10 people waiting in line to pay their bills. If somebody is taking too much time to pay his bill (thus blocking the event loop), the other people will just have to hang around waiting for their turn to come.. and waiting...
In other words:
Since the event loop is running on a single thread, it is very
important that we do not block it’s execution by doing heavy
computations in callback functions or synchronous I/O. Going over a
large collection of values/objects or performing time-consuming
computations in a callback function prevents the event loop from
further processing other events in the queue.
Here is some code to actually see the blocking / non-blocking in action:
With this example (long CPU-computing task, non I/O):
var net = require('net');
handler = function(req, res) {
console.log('hello');
for (i = 0; i < 10000000000; i++) { a = i + 5; }
}
net.createServer(handler).listen(80);
if you do 2 requests in the browser, only a single hello will be displayed in the server console, meaning that the second request cannot be processed because the first one blocks the Node.js thread.
If we do an I/O task instead (write 2 GB of data on disk, it took a few seconds during my test, even on a SSD):
http = require('http');
fs = require('fs');
buffer = Buffer.alloc(2*1000*1000*1000);
first = true;
done = false;
write = function() {
fs.writeFile('big.bin', buffer, function() { done = true; });
}
handler = function(req, res) {
if (first) {
first = false;
res.end('Starting write..')
write();
return;
}
if (done) {
res.end("write done.");
} else {
res.end('writing ongoing.');
}
}
http.createServer(handler).listen(80);
here we can see that the a-few-second-long-IO-writing-task write is non-blocking: if you do other requests in the meantime, you will see writing ongoing.! This confirms the well-known non-blocking-for-IO features of Node.js.