Run a method asynchronous on the server - node.js

I have a Meteor server method that executes a function that I want to run in the background. I want the server method to continue while the function runs in a different process.
the method:
myMethod: function(arg) {
_event.emit("myEvent",args);
...do some other stuff
}
I'd like the server to do the other stuff and return to the client, and to do the stuff in _event.emit in the background no results have to be sent back to the client.
What it currently does is run the stuff in _event.emit and than the other stuff and then return to the client.
I tried to solve it by doing a call on server side with an empty callback function, but that didn't do the trick
myMethod: function (arg) {
return Meteor.call("secondMethod", _.toArray(arg), function() {})
}
secondMethod: function (arg) {
_event.emit("myEvent",args);
}
hope someone knows a solution for this.
thanks

The javascript in nodejs is single threaded. It works off an event queue where an event is popped off the queue, some callback is called to serve that event and the code behind that callback runs to completion, then the next event is pulled off the event queue and the process is repeated.
As such, there is no real "background processing". If your JS thread is executing, then nothing else can run at that time until that JS thread finishes and allows the next event in the event queue to be processed.
To truly run something else in parallel where both your task and other nodejs code can literally run at the same time, you would have to create a new process and let the new process carry out your task. This new process could be any type of process (some pre-built program running a task or another custom nodejs process).
"Running in the background" can sometimes be simulated by time slicing your "background" process work such that it does small amounts of work on timer ticks and the pauses between timer ticks allow other nodejs JS events to be processed and their code to run. But, to do this type of simulated "background", you have to write your task to execute in small chunks. It's a different (and often more cumbersome) way of writing code.

On the server you can just use Meteor.setTimeout() to run a function asynchronously. Just use an interval of zero if you want to return control immediately. Your main function will continue to run and as soon as there's nothing left in the event queue the scheduled task will run (or it will wait the amount of time you specify if non-zero).

Related

node.js - process.on('message'....) does not work outside of infinite loop

process.on('message', (data) => {
console('on receive message')
})
generate()
function generate () {
func() //method which have while-loop which will takes about 10 mins.
generate()
}
I cannot receive message while func() is working.
only receive stacked message after func() finsiehd.
How can I receive message even func() is working...
NodeJS is a single threaded application. Therefor running a loop for ~10 min will block all processing of any other code. You need to release the execution stack to allow other pending code to run.
A very simple way to do that is with setTimeout which will stop run the specified function in the future, or setInterval which will run it periodically.
setInterval(generate, 600000); // Run generate every 10 minutes
Other options include async/await + Bluebird Promise.delay or a wide variety of queuing / async libraries on npm.
If you absolutely need to func to be executing then you need to examine other options. As I said before, nodejs is a single threaded application. You could do the processing in another nodeJS process or you need to re-design func so that it executes in a queue such that the execution stack will be periodically cleared so that pending tasks can run.

Single thread synchronous and asynchronous confusion

Assume makeBurger() will take 10 seconds
In synchronous program,
function serveBurger() {
makeBurger();
makeBurger();
console.log("READY") // Assume takes 5 seconds to log.
}
This will take a total of 25 seconds to execute.
So for NodeJs lets say we make an async version of makeBurgerAsync() which also takes 10 seconds.
function serveBurger() {
makeBurgerAsync(function(count) {
});
makeBurgerAsync(function(count) {
});
console.log("READY") // Assume takes 5 seconds to log.
}
Since it is a single thread. I have troubling imagine what is really going on behind the scene.
So for sure when the function run, both async functions will enter event loops and console.log("READY") will get executed straight away.
But while console.log("READY") is executing, no work is really done for both async function right? Since single thread is hogging console.log for 5 seconds.
After console.log is done. CPU will have time to switch between both async so that it can run a bit of each function each time.
So according to this, the function doesn't necessarily result in faster execution, async is probably slower due to switching between event loop? I imagine that, at the end of the day, everything will be spread on a single thread which will be the same thing as synchronous version?
I am probably missing some very big concept so please let me know. Thanks.
EDIT
It makes sense if the asynchronous operations are like query DB etc. Basically nodejs will just say "Hey DB handle this for me while I'll do something else". However, the case I am not understanding is the self-defined callback function within nodejs itself.
EDIT2
function makeBurger() {
var count = 0;
count++; // 1 time
...
count++; // 999999 times
return count;
}
function makeBurgerAsync(callback) {
var count = 0;
count++; // 1 time
...
count++; // 999999 times
callback(count);
}
In node.js, all asynchronous operations accomplish their tasks outside of the node.js Javascript single thread. They either use a native code thread (such as disk I/O in node.js) or they don't use a thread at all (such as event driven networking or timers).
You can't take a synchronous operation written entirely in node.js Javascript and magically make it asynchronous. An asynchronous operation is asynchronous because it calls some function that is implemented in native code and written in a way to actually be asynchronous. So, to make something asynchronous, it has to be specifically written to use lower level operations that are themselves asynchronous with an asynchronous native code implementation.
These out-of-band operations, then communicate with the main node.js Javascript thread via the event queue. When one of these asynchronous operations completes, it adds an event to the Javascript event queue and then when the single node.js thread finishes what it is currently doing, it grabs the next event from the event queue and calls the callback associated with that event.
Thus, you can have multiple asynchronous operations running in parallel. And running 3 operations in parallel will usually have a shorter end-to-end running time than running those same 3 operations in sequence.
Let's examine a real-world async situation rather than your pseudo-code:
function doSomething() {
fs.readFile(fname, function(err, data) {
console.log("file read");
});
setTimeout(function() {
console.log("timer fired");
}, 100);
http.get(someUrl, function(err, response, body) {
console.log("http get finished");
});
console.log("READY");
}
doSomething();
console.log("AFTER");
Here's what happens step-by-step:
fs.readFile() is initiated. Since node.js implements file I/O using a thread pool, this operation is passed off to a thread in node.js and it will run there in a separate thread.
Without waiting for fs.readFile() to finish, setTimeout() is called. This uses a timer sub-system in libuv (the cross platform library that node.js is built on). This is also non-blocking so the timer is registered and then execution continues.
http.get() is called. This will send the desired http request and then immediately return to further execution.
console.log("READY") will run.
The three asynchronous operations will complete in an indeterminate order (whichever one completes it's operation first will be done first). For purposes of this discussion, let's say the setTimeout() finishes first. When it finishes, some internals in node.js will insert an event in the event queue with the timer event and the registered callback. When the node.js main JS thread is done executing any other JS, it will grab the next event from the event queue and call the callback associated with it.
For purposes of this description, let's say that while that timer callback is executing, the fs.readFile() operation finishes. Using it's own thread, it will insert an event in the node.js event queue.
Now the setTimeout() callback finishes. At that point, the JS interpreter checks to see if there are any other events in the event queue. The fs.readfile() event is in the queue so it grabs that and calls the callback associated with that. That callback executes and finishes.
Some time later, the http.get() operation finishes. Internal to node.js, an event is added to the event queue. Since there is nothing else in the event queue and the JS interpreter is not currently executing, that event can immediately be serviced and the callback for the http.get() can get called.
Per the above sequence of events, you would see this in the console:
READY
AFTER
timer fired
file read
http get finished
Keep in mind that the order of the last three lines here is indeterminate (it's just based on unpredictable execution speed) so that precise order here is just an example. If you needed those to be executed in a specific order or needed to know when all three were done, then you would have to add additional code in order to track that.
Since it appears you are trying to make code run faster by making something asynchronous that isn't currently asynchronous, let me repeat. You can't take a synchronous operation written entirely in Javascript and "make it asynchronous". You'd have to rewrite it from scratch to use fundamentally different asynchronous lower level operations or you'd have to pass it off to some other process to execute and then get notified when it was done (using worker processes or external processes or native code plugins or something like that).

How do I make function to running on background?

I have this code periodically calls the load function which does very load work taking 10sec. Problem is when load function is being executed, it's blocking the main flow. If I send a simple GET request (like a health check) while load is being executed, the GET call is blocked until the load call is finished.
function setLoadInterval() {
var self = this;
this.interval = setInterval(function doHeavyWork() {
// this takes 10 sec
self.load();
self.emit('reloaded');
}, 20000);
I tried async.parallel but still the GET call was blocked. I tried setTimeout but got the same result. How do I make load to running on background so that it doesn't block the main flow?
this.interval = setInterval(function doHeavyWork() {
async.parallel([function(cb) {
self.load();
cb(null);
}], function(err) {
if (err) {
// log error
}
self.emit('reloaded');
})
}, 20000);
Node.js is an event driven non-blocking IO model
Anything that is IO is offloaded as a separate thread in the underlying engine and hence parallelism is achieved.
If the task is CPU intensive there is no way you can achieve parallelism as by default Javascript is a blocking sync language
However there are some ways to achieve this by offloading the CPU intensive task to a different process.
Option1:
exec or spawn a child process and execute the load() function in that spawned node app. This is okay if the interval fired is for every 20000 ms as by the time another one fired, the 10sec process will be completed.
Otherwise it is dangerous as it can spawn too many node applications eating up your Systems resources
Option2:
I dont know how much data self.load() accepts and returns. If it is trivial and network overhead is acceptable, make that task a load-balanced web service (may be 4 webservers running in parallel) which accepts (rather point to) 1M records and returns back filtered records.
NOTE
It looks like you are using node async parallel function. But keep a note of this description from the documentation.
Note: parallel is about kicking-off I/O tasks in parallel, not about parallel execution of code. If your tasks do not use any timers or perform any I/O, they will actually be executed in series. Any synchronous setup sections for each task will happen one after the other. JavaScript remains single-threaded.

Node.js Event loop

Is the Node.js I/O event loop single- or multithreaded?
If I have several I/O processes, node puts them in an external event loop. Are they processed in a sequence (fastest first) or handles the event loop to process them concurrently (...and in which limitations)?
Event Loop
The Node.js event loop runs under a single thread, this means the application code you write is evaluated on a single thread. Nodejs itself uses many threads underneath through libuv, but you never have to deal with with those when writing nodejs code.
Every call that involves I/O call requires you to register a callback. This call also returns immediately, this allows you to do multiple IO operations in parallel without using threads in your application code. As soon as an I/O operation is completed it's callback will be pushed on the event loop. It will be executed as soon as all the other callbacks that where pushed on the event loop before it are executed.
There are a few methods to do basic manipulation of how callbacks are added to the event loop.
Usually you shouldn't need these, but every now and then they can be useful.
setImmediate
process.nextTick
At no point will there ever be two true parallel paths of execution, so all operations are inherently thread safe. There usually will be several asynchronous concurrent paths of execution that are being managed by the event loop.
Read More about the event loop
Limitations
Because of the event loop, node doesn't have to start a new thread for every incoming tcp connection. This allows node to service hundreds of thousands of requests concurrently , as long as you aren't calculating the first 1000 prime numbers for each request.
This also means it's important to not do CPU intensive operations, as these will keep a lock on the event loop and prevent other asynchronous paths of execution from continuing.
It's also important to not use the sync variant of all the I/O methods, as these will keep a lock on the event loop as well.
If you want to do CPU heavy things you should ether delegate it to a different process that can execute the CPU bound operation more efficiently or you could write it as a node native add on.
Read more about use cases
Control Flow
In order to manage writing many callbacks you will probably want to use a control flow library.
I believe this is currently the most popular callback based library:
https://github.com/caolan/async
I've used callbacks and they pretty much drove me crazy, I've had much better experience using Promises, bluebird is a very popular and fast promise library:
https://github.com/petkaantonov/bluebird
I've found this to be a pretty sensitive topic in the node community (callbacks vs promises), so by all means, use what you feel will work best for you personally. A good control flow library should also give you async stack traces, this is really important for debugging.
The Node.js process will finish when the last callback in the event loop finishes it's path of execution and doesn't register any other callbacks.
This is not a complete explanation, I advice you to check out the following thread, it's pretty up to date:
How do I get started with Node.js
From Willem's answer:
The Node.js event loop runs under a single thread. Every I/O call requires you to register a callback. Every I/O call also returns immediately, this allows you to do multiple IO operations in parallel without using threads.
I would like to start explaining with this above quote, which is one of the common misunderstandings of node js framework that I am seeing everywhere.
Node.js does not magically handle all those asynchronous calls with just one thread and still keep that thread unblocked. It internally uses google's V8 engine and a library called libuv(written in c++) that enables it to delegate some potential asynchronous work to other worker threads (kind of like a pool of threads waiting there for any work to be delegated from the master node thread). Then later when those threads finish their execution they call their callbacks and that is how the event loop is aware of the fact that the execution of a worker thread is completed.
The main point and advantage of nodejs is that you will never need to care about those internal threads and they will stay away from your code!. All the nasty sync stuff that should normally happen in multi threaded environments will be abstracted out by nodejs framework and you can happily work on your single thread (main node thread) in a more programmer friendly environment (while benefiting from all the performance enhancements of multiple threads).
Below is a good post if anyone is interested:
When is the thread pool used?
you have to know first about nodeJs implementaion in order to know event loop.
actually node js core implementation using two components :
v8 javascript runtime engine
libuv for handlign non i/o blocking operation and handling threads and concurrent operations for you;
with the javascript you can actually write code with one thread but this means not that your code execute on the one thread although you can execute on multiple thread s using clusters in node js
now when you want to execute some code like :
let fs = require('fs');
fs.stat('path',(err,stat)=>{
//do something with the stat;
console.log('second');
});
console.log('first');
the execution of this code at high level is like this:
first the v8 engine run this code and then if there is no error
everything is good then it looks for the
it try to run it run line by line when it gets to the fs .stats this is a node js api very similar to the web apis like setTimeout that the browser handle it for us when it encounter to the fs.stats it is pass the code to the libuv components with a flag and pass your callback to the event queue then the libuv you execute your code during the operation and when its done just send some signal and then d the v8 execute your code az a callback you set on the queue but it always check for the stack is empty then go for the your code on the queue # always remember that !
Well, to understand nodejs I/O events in the event, you must understand nodejs event loop properly.
from the name event loop, we understand it's a loop that runs cycle after cycle round-robin basis until there are no events remains in the loop or the app closed.
The event loop is one of the topmost features in nodejs, it is what makes async programming in nodejs.
When the program starts we are in a node process in the single thread where the event loop runs. Now the most importing things we need to know that the event loop is where all the application code that is inside callback functions is executed.
So, basically all code that is not top-level code will run in the event loop. Some part (mostly heavy duties) might get offloaded to the thread pool
(When is the thread pool used?), the event loop will take care of those heavy duties and return the result to the event of the event loop.
It is the heart of the node architecture, and nodejs built around callback functions. so callbacks will triggered as soon as some work is finished sometime in the future because node uses an event-triggered architecture.
When an application receives an HTTP request on a node server or a timer expiring or a file finishing to read all these will emit events as soon as they are done with their work, and our event loop will then pick up these events and call the callback functions that are associated with each event, it's usually said that the event loop does the orchestration, which simply means that it receives events, calls their callback functions, and offloads the more expensive tasks to the thread pool.
Now, how does all this actually work behind the scenes? In what order are these callbacks executed?
Well, when we start our node application, the event loop starts running right away. An event loop has multiple phases, and each phase has a callback queue, where the four most important phases are 1. Expired timer callbacks, 2.I/O polling and callbacks 3. setImmediate callbacks, and 4. Close callbacks. There are other phases that is used internally by Node.
So, the first phase takes care of callbacks of expired timers, for example, from the setTimeout() function. So, if there are callback functions from timers that just expired, these are the first ones to be processed by the event loop.
** The most important thing is, If a timer expires later during the time when one of the other phases is being processed, well then the callback of that timer will only be called as soon as the event loop comes back to this first phase. And it works like this in all four phases.**
So callbacks in each queue are processed one by one until there are no ones left in the queue and only then, the event loop will enter the next phase. for example, suppose there is 1000 setTimeOut callbacks timer expired and the event loop is in the first phase then all these 1000 setTimeOuts callbacks will execute one by one then it will go to the next phase(I/O pooling and callbacks).
Next up, we have I/O pooling and execution of I/O callbacks. Here I/O stands for input/output and polling basically means looking for new I/O events that are ready to be processed and putting theme into the callback queue.
In the context of a Node application, I/O means mainly stuff like networking and file access, so in this phase where probably 99% of general application code gets executed.
The next phase is for setImmediate callbacks, and SetImmediate is a special kind of timer that we can use if we want to process callbacks immediately after the I/O polling and execution phase.
And finally, the fourth phase is the close callbacks, in this phase, all close events are processed, for example when a server or a WebSocket shut down.
These are the four phases in the event loop, but besides these four callbacks queues there are actually also two other queues,
1. nextTick() other
2. microtasks queue(which is mainly for resolved promises)
If there are any callbacks in one of these two queues to be processed, they will be executed right after the current phase of the event loop finishes instead of waiting for the entire loop/cycle to finish.
In other words, after each of these four phases, if there are any callbacks in these two special queues, they will be executed right away. Now imagine that a promise resolves and returns some data from an API call while the callback of an expired timer is running, In this case, the promise callback will be executed right after the one from the timer finish.
The same logic also applies to the nextTick() queue. The nextTick() is a function that we can use when we really, really need to execute a certain callback right after the current event loop phase. It's a bit similar to setImmediate, with the difference that setImmediate only runs after the I/O callback phase.
Will all the above things can happen in one tick/cycle of the event loop, In the meantime their new events could have arisen in a particular phase or old event could be expired, the event loop will handle those events with another new cycle.
So now it's time to decide whether the loop should continue to the next tick or if the program should exit. Node simply checks whether there are any timers or I/O tasks that are still running in the background if there aren't any then it will exit the application. But if there are any pending timers or I/O tasks, then the node will continue running the event loop and go starting to the next cycle.
For example, in node application when we are listening for incoming HTTP requests, we basically running an infinite I/O task, and that is run in the event loop, for that Node.js keep running and keep listening for new HTTP request coming in instead of just exiting the application.
Also when we are writing or reading a file in the background that's also an I/O task and it makes sense that the app doesn't exist while it's working with that file, right?
Now The event loop in practices:
const fs = require('fs');
setTimeout(()=>console.log('Timer 1 finished'), 0);
fs.readFile('test-file.txt', ()=>{
console.log('I/O finished');
});
setImmediate(()=>console.log('Immediate 1 finished'))
console.log('Hello from the top level code');
Output:
Well the first lin is Hello from the top level code, yes it is expected because this is a code that gets executed immediately. Then after we have three output, Timer 1 finished this line is expected because of phase one as we discuess before, but after that I/O finished should be printed, because we discuess that setImmediate runs after the I/O callback phase, but this code is actually not in an I/O cycle, so it is not running inside of the event loop, because it's not runnin inside of any callback function.
Now lets do another test:
const fs = require('fs');
setTimeout(()=>console.log('Timer 1 finished'), 0);
setImmediate(()=>console.log('Immediate 1 finished'));
fs.readFile('test-file.txt', ()=>{
console.log('I/O finished');
setTimeout(()=>console.log('Timer 2 finished'), 0);
setImmediate(()=>console.log('Immediate 2 finished'));
setTimeout(()=>console.log('Timer 3 finished'), 0);
setImmediate(()=>console.log('Immediate 3 finished'));
});
console.log('Hello from the top level code')
Output:
The output is as expected right? Now let's add some delay:
setTimeout(()=>console.log('Timer 1 finished'), 0);
setImmediate(()=>console.log('Immediate 1 finished'));
fs.readFile('test-file.txt', ()=>{
console.log('I/O finished');
setTimeout(()=>console.log('Timer 2 finished'), 3000);
setImmediate(()=>console.log('Immediate 2 finished'));
setTimeout(()=>console.log('Timer 3 finished'), 0);
setImmediate(()=>console.log('Immediate 3 finished'));
});
console.log('Hello from the top level code')
output:
In the first cycle inside I/O everything executed, but because of the dealy Timer-2 executed inside its code in the second cycle.
Now Lets add nextTick(), and see how nodejs behaves:
setTimeout(()=>console.log('Timer 1 finished'), 0);
setImmediate(()=>console.log('Immediate 1 finished'));
fs.readFile('test-file.txt', ()=>{
console.log('I/O finished');
setTimeout(()=>console.log('Timer 2 finished'), 3000);
setImmediate(()=>console.log('Immediate 2 finished'));
setTimeout(()=>console.log('Timer 3 finished'), 0);
setImmediate(()=>console.log('Immediate 3 finished'));
process.nextTick(()=>console.log('Process Next Tick'));
});
console.log('Hello from the top level code')
Output:
Well, the first callback is executed is inside the process.NextTick(), as it is expected right? Because nextTicks callbacks stays in the microtask queue an they executed after each phase.
If you run this simple node code
console.log('starting')
setTimeout(()=>{
console.log('0sec')
}, 0)
setTimeout(()=>{
console.log('2sec')
}, 2000)
console.log('end')
What do you expect output to be?
If its,
starting
0sec
end
2sec
it's is wrong guess, we will get
starting
end
0sec
2sec
because node will never print code in event loop before exiting main()
So basically, First main() will go in stack, then console.log('starting ') so you will see it printed first, after that come setTimeout(()=>{console.log('0sec')}, 0) will go in a stack and then in nodeAPI (node uses multi-threads (lib written in c++) to execute setTimeout to finish, even tho above code is single thread code) after time is up it moves to the event loop, now node can't print it unless stack is not empty. So, next line i.e setTimeout of 2sec will be first pushed to stack,then nodeAPI which will wait for 2 sec to finish, and then to even loop, in mean while next code line will be executed that is console.log('end') and so we see end msg before 0sec, because if nodes non blocking nature. After end code is over and hence main is poped out and its turn of event loop code to be executed that is first 0sec and after that 2sec msg will be printed.

How does the stack work on NodeJs?

I've recently started reading up on NodeJS (and JS) and am a little confused about how call backs are working in NodeJs. Assume we do something like this:
setTimeout(function(){
console.log("World");
},1000);
console.log("Hello");
output: "Hello World"
So far on what I've read JS is single threaded so the event loop is going though one big stack, and I've also been told not to put big calls in the callback functions.
1)
Ok, so my first question is assuming its one stack does the callback function get run by the main event loop thread? if so then if we have a site with that serves up content via callbacks (fetches from db and pushes request), and we have 1000 concurrent, are those 1000 users basically being served synchronously, where the main thread goes in each callback function does the computation and then continues the main event loop? If this is the case, how exactly is this concurrent?
2) How are the callback functions added to the stack, so lets say my code was like the following:
setTimeout(function(){
console.log("Hello");
},1000);
setTimeout(function(){
console.log("World");
},2000);
then does the callback function get added to the stack before the timeout even has occured? if so is there a flag that's set that notifies the main thread the callback function is ready (or is there another mechanism). If this is infact what is happening doesn't it just bloat the stack, especially for large web applications with callback functions, and the larger the stack the longer everything will take to run since the thread has to step through the entire thing.
The event loop is not a stack. It could be better thought of as a queue (first in/first out). The most common usage is for performing I/O operations.
Imagine you want to read a file from disk. You start by saying hey, I want to read this file, and when you're done reading, call this callback. While the actual I/O operation is being performed in a separate process, the Node application is free to keep doing something else. When the file has finished being read, it adds an item to the event loop's queue indicating the completion. The node app might still be busy doing something else, or there may be other items waiting to be dequeued first, but eventually our completion notification will be dequeued by node's event loop. It will match this up with our callback, and then the callback will be called as expected. When the callback has returned, the event loop dequeues the next item and continues. When there is nothing left in node's event loop, the application exits.
That's a rough approximation of how the event loop works, not an exact technical description.
For the specific setTimeout case, think of it like a priority queue. Node won't consider dequeuing the item/job/callback until at least that amount of time has passed.
There are lots of great write-ups on the Node/JavaScript event loop which you probably want to read up on if you're confused.
callback functions are not added to caller stack. There is no recursion here. They are called from event loop. Try to replace console.log in your example and watch result - stack is not growing.

Resources