Nodejs event handling - node.js

Following is my Node.js code:
var emitter = require('events'),
    eventEmitter = new emitter.EventEmitter();

eventEmitter.on('data', function (result) { console.log('Im From Data'); });
eventEmitter.on('error', function (result) { console.log('Im Error'); });

require('http').createServer(function (req, res) {
    res.end('Response');
    var start = new Date().getTime();
    eventEmitter.emit('data', true);
    eventEmitter.emit('error', false);
    while (new Date().getTime() - start < 5000) {
        // Let me sleep
    }
    process.nextTick(function () {
        console.log('This is event loop');
    });
}).listen(8090);
Node.js is single threaded: it runs an event loop, and that same thread serves the events.
So, in the code above, on a request to localhost:8090 the Node thread should be kept busy serving the request [there is a busy-wait for 5 s].
At the same time, two events are emitted by eventEmitter, so I expected both of them to be queued in the event loop and processed only once the request has been served.
But that is not what happens; I can see the events being handled synchronously, as they are emitted.
Is that expected? I understand that if it worked the way I expect there would be no point in extending the events module, but how are the events emitted by an EventEmitter actually handled?

Only things that require asynchronous processing are pushed onto the event loop. The standard EventEmitter in Node dispatches events immediately and synchronously. Only code that uses things like process.nextTick, setTimeout, setInterval, or that explicitly schedules work from C++ (as Node's own libraries do) goes through the event loop.
For example, when you use node's fs library for something like createReadStream, it returns a stream, but opens the file in the background. When it is open, node adds to the event loop and when the function in the loop gets called, it will trigger the 'open' event on the stream object. Then, node will load blocks from the file in the background, and add to the event loop to trigger data events on the stream.
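A rough sketch of that pattern (the file name here is just a placeholder):
var fs = require('fs');

// The call returns immediately; the file is opened in the background.
var stream = fs.createReadStream('./some-file.txt');

stream.on('open', function (fd) {
    // Fired on a later tick of the event loop, once the open has completed.
    console.log('file opened, fd:', fd);
});

stream.on('data', function (chunk) {
    // Each chunk is delivered via the event loop as it is read.
    console.log('got', chunk.length, 'bytes');
});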
If you wanted those events to be emitted after 5 seconds, you'd want to use setTimeout or put the emit calls after your busy loop.
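For instance, a minimal variation of the handler from the question that defers the emits with setTimeout instead of busy-waiting, assuming the same eventEmitter as above:
require('http').createServer(function (req, res) {
    res.end('Response');
    setTimeout(function () {
        // These run on a later tick of the event loop, roughly 5 seconds after the response.
        eventEmitter.emit('data', true);
        eventEmitter.emit('error', false);
    }, 5000);
}).listen(8090);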
I'd also like to be clear: you should never have a busy loop like that in Node code. I can't tell whether you were just doing it to test the event loop or whether it is part of some real code. If you need more info, please expand on the functionality you are looking to achieve.

Related

Javascript sleep code running, but seems to not cause any delays

This is an Angular app, and this specific code runs inside a web worker, in TypeScript. I'm still new to web workers, but both the sleep and the loop execute inside the same thread.
The intent is to poll a service and exit the loop when the process is completed. My problem is that the sleep call below is not sleeping. I need it to delay for at least 9 seconds, and ideally I'd like that to be configurable, but it runs as though the sleep didn't happen.
I have two questions:
Why is the sleep not working?
This is an Angular app served by a Node.js container on cirrus. When the Angular app is requested and served by this Node.js server, I'd like to pass along a secret defined on the Node.js side together with the Angular app. This secret would be the polling delay. Could a cookie value be returned in the Node.js response with the Angular app? I'm not sure what the best approach would be.
Code below:
function sleep(ms: number) {
    return new Promise((resolve) => {
        log('DEBUG', 'Sleeping for ' + ms + ' ms');
        setTimeout(resolve, ms);
    });
}

while (jobIsStillRunning(jobExecutionResult)) {
    postMessageWithLog(jobExecutionResult);
    log('DEBUG', 'sendFilePolling() jobExecutionResult=' + JSON.stringify(jobExecutionResult));
    sleep(30000); // Sleep function runs but doesn't do anything.
    jobExecutionResult = await getAsyncFilesResult(jobExecutionResult);
}
You need to await the sleep function.
I think you need to understand macrotasks and microtasks.
Microtasks come solely from our own code. They are usually created by promises: the execution of a .then/.catch/.finally handler becomes a microtask.
If a microtask recursively queues other microtasks, it might take a long time until the next macrotask is processed. This means you could end up with a blocked UI, or with finished I/O sitting idle in your application.
For example:
macrotasks: setTimeout, setInterval, setImmediate, I/O, UI rendering, etc.
microtasks: process.nextTick, Promises, etc.
code:
console.log('1')
setTimeout(() => {
    console.log('2')
}, 0)
Promise.resolve().then(() => {
    console.log('3')
})
console.log('4')
output:
1
4
3
2
wait what?!
Explanation:
console.log('1') is ordinary synchronous code and runs immediately.
setTimeout is a macrotask, so its callback is added to the macrotask queue.
Promise.resolve().then(...) is a microtask, so its callback is added to the microtask queue.
console.log('4') is ordinary synchronous code and runs immediately.
Once the synchronous code has finished, the microtask queue and the macrotask queue are drained.
The microtask queue holds one task, the resolved promise, so its .then callback runs first.
Now that the microtask queue is empty, the macrotask queue holds one task, the setTimeout, so its callback runs next.
If you want to use a sleep function without await, you need to handle this ordering yourself.
But you can simply add await before your sleep call and wait for it.
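A minimal sketch of the corrected loop, assuming it already runs inside an async function (pollUntilDone is a hypothetical wrapper name; the other helpers are the ones from the question):
async function pollUntilDone(jobExecutionResult) {
    while (jobIsStillRunning(jobExecutionResult)) {
        postMessageWithLog(jobExecutionResult);
        await sleep(30000); // now the loop actually waits 30 s before polling again
        jobExecutionResult = await getAsyncFilesResult(jobExecutionResult);
    }
    return jobExecutionResult;
}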

When does a Node spawned child process actually start?

In the documentation for Node's Child Process spawn() function, and in examples I've seen elsewhere, the pattern is to call the spawn() function, and then to set up a bunch of handlers on the returned ChildProcess object. For instance, here is the first example of spawn() given on that documentation page:
const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);

ls.stdout.on('data', (data) => {
    console.log(`stdout: ${data}`);
});

ls.stderr.on('data', (data) => {
    console.error(`stderr: ${data}`);
});

ls.on('close', (code) => {
    console.log(`child process exited with code ${code}`);
});
The spawn() function itself is called on the second line. My understanding is that spawn() starts a child process asynchronously. From the documentation:
The child_process.spawn() method spawns a new process using the given command, with command line arguments in args.
However, the following lines of the script above go on to set up various handlers for the process, so it's assuming that the process hasn't actually started (and potentially finished) between the time spawn() is called on line 2 and the other stuff happens on the subsequent lines. I know JavaScript/Node is single threaded. However, the operating system is not single threaded, and naively one would read that spawn() call to be telling the operating system to spawn the process right now (at which point, with unfortunate timing, the OS could suspend the parent Node process and run/complete the child process before the next line of the Node code is executed).
But it must be that the process doesn't actually get spawned until the current JavaScript function completes (or more generally the current JavaScript event handler that called the current function completes), right?
That seems like a pretty important thing to say. Why doesn't it say that in the Child Process documentation page? Is there some overriding Node principle that makes it unnecessary to say that explicitly?
The spawning of the new process starts immediately (it's handed over to the OS to actually fire up the process and get it going). Starting the new process with .spawn() is asynchronous and non-blocking. So, it will initiate the operation with the OS and immediately return. You might think that that's why it's OK to set up event handlers after it returns (because the process hasn't yet finished starting). Well, yes and no. It likely hasn't yet finished starting the new process, but that isn't the main reason why it's OK.
It's OK, because node.js runs all its events through a single threaded event queue. Thus no events from the newly spawned process can be processed until after your code finishes executing and returns control back to the system. Only then can it process the next event in the event queue and trigger one of the events you are registering handlers for.
Or, said another way, none of the events from the other process are pre-emptive. They won't/can't interrupt your existing Javascript code. So, since you're still running your Javascript code, those events can't get run yet. Instead, they sit in the event queue until your Javascript code finishes and then the interpreter can go get the next event from the event queue and run the callback associated with it. Likewise, that callback runs until it returns back to the interpreter and then the interpreter can get the next event and run its callback and so on...
That's why node.js is called an event-driven system.
As such, it's perfectly fine to do this type of structure:
const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);

ls.stdout.on('data', (data) => {
    console.log(`stdout: ${data}`);
});

ls.stderr.on('data', (data) => {
    console.error(`stderr: ${data}`);
});

ls.on('close', (code) => {
    console.log(`child process exited with code ${code}`);
});
None of those data or close events can execute their callbacks until after your code is done and returns control back to the system. So, it's perfectly safe to set up those event handlers like you are. Even if the newly spawned process was running and generating events right away, those events will just sit in the event queue until your Javascript finishes what it is doing (which includes setting up your event handlers).
Now, if you delayed setting up the event handlers until some future tick of the event loop (as shown below) with something like a setTimeout(), then you could miss some events:
const { spawn } = require('child_process');
const ls = spawn('ls', ['-lh', '/usr']);

setTimeout(() => {
    ls.stdout.on('data', (data) => {
        console.log(`stdout: ${data}`);
    });
    ls.stderr.on('data', (data) => {
        console.error(`stderr: ${data}`);
    });
    ls.on('close', (code) => {
        console.log(`child process exited with code ${code}`);
    });
}, 10);
Here you are not setting up the event handlers immediately as part of the same tick of the event loop, but after a short delay. Therefore some events could get processed from the event loop before you install your event handlers and you could miss some of these events. Obviously, you would never do it this way (on purpose), but I just wanted to show that code running on the same tick of the event loop does not have a problem, but code running on some future tick of the event loop could have a problem missing events.
This is to follow up on jfriend00's answer, to explain what it helped me understand, in case it helps someone else. I knew about the event-driven nature of JavaScript/Node. What jfriend00's explanation made clear to me is the idea that an event can happen and Node can be aware that it happened, but it doesn't actually decide which handlers to tell about that event until the next tick. For instance, if the spawn() call fails outright (e.g., command does not exist), Node obviously knows that immediately. My thought was that it would then immediately queue the appropriate handlers to run on the next tick. But what I now understand is that it puts the "raw event" (i.e., the fact that the spawn failed, with whatever details about that) in its queue, and then on the next tick it determines and calls the appropriate handlers. And the same is true for other events like receiving output from the process, etc. The event is saved but the appropriate handlers for the event are only determined when the next tick runs, so handlers assigned on the previous tick, after spawn(), will get called.
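For instance, a small sketch of that last point, assuming a command name that does not exist on the system:
const { spawn } = require('child_process');

// This spawn fails immediately (ENOENT), but the failure is reported
// as an 'error' event on a later tick of the event loop...
const child = spawn('definitely-not-a-real-command');

// ...so a handler attached right after the call still catches it.
child.on('error', (err) => {
    console.error(`spawn failed: ${err.message}`);
});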

Unexpected Node.js program flow

I am new to node.js and working through the API. In the stream module docs I came across this example of the "unpipe event" (actually a fusion of two examples in the docs).
const fs = require("fs");
const writable = fs.createWriteStream("write.txt");
const readable = fs.createReadStream("read.txt");

readable.pipe(writable);

setTimeout(function () {
    console.log("Stop writing to file.txt");
    readable.unpipe(writable);
    console.log("Manually close the file stream");
    writable.end();
}, 0);

writable.on("unpipe", function (src) {
    console.log("Something has stopped piping into the writer");
});
I can't understand the following console.log order:
"Stop writing to file.txt"
"Something has stopped piping into the writer"
"Manually close the file stream"
Given that the setTimeout callback is running (timers are the first phase of the event loop, as I understand it), how on earth does the callback for the "unpipe" event start to run before the setTimeout callback has finished?
Originally I had the setTimeout firing after a time above zero seconds; however, I was finding that the unpipe callback was always called first. I reasoned that my computer was always finishing the read of the file before the setTimeout was ready. (Although I can't see any mention in the docs of the completion of the write to the file eliciting the "unpipe" event, this makes sense I suppose.) However, I can't for the life of me work out how the above program flow is occurring. Thanks in advance for any help.
As specified by the node.js documentation:
The EventEmitter calls all listeners synchronously in the order in which they were registered.
That is, when .emit is called, it synchronously runs through all listeners for the emitted event and calls them.
Note that if necessary you can wrap your callback code in process.nextTick to ensure that it will always run asynchronously, but in your case it's likely that's unnecessary.
Also the source of the call to .emit (the emission of the event) will often be asynchronous.
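A minimal sketch of both behaviors, using a toy emitter rather than the stream from the question:
const EventEmitter = require('events');
const emitter = new EventEmitter();

// Listener 1 runs synchronously, inside the emit() call itself.
emitter.on('ping', () => console.log('sync listener'));

// Listener 2 defers its real work to the next tick.
emitter.on('ping', () => {
    process.nextTick(() => console.log('deferred listener'));
});

console.log('before emit');
emitter.emit('ping');
console.log('after emit');

// Output:
//   before emit
//   sync listener
//   after emit
//   deferred listener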

How to iterate on each record of a Model.stream waterline query?

I need to do something like:
Lineup.stream({foo: "bar"}).exec(function (err, lineup) {
    // Do something with each record
});
Lineup is a collection with over 18,000 records, so I think using find is not a good option. What's the correct way to do this? I can't figure it out from the docs.
The .stream() method returns a node stream interface (a read stream) that emits events as data is read. Your options here are either to .pipe() to something else that can accept "stream" input, such as the response object of the server, or to attach event listeners for the events emitted from the stream, i.e.:
Piped to response
Lineup.stream({foo:"bar"}).pipe(res);
Setup event listeners
var stream = Lineup.stream({foo: "bar"});

stream.on("data", function (data) {
    stream.pause(); // stop emitting events for a moment
    /*
     * Do things
     */
    stream.resume(); // resume events
});

stream.on("error", function (err) {
    // handle any errors thrown while reading here
});
The .pause() and .resume() calls are quite important, as otherwise the handler keeps responding to emitted events before the processing of the current one is complete. While that may be fine for small cases, it is not desirable for the larger "streams" the interface is meant to be used for.
Additionally, if you are calling any "asynchronous" actions inside the event handler like this, then you need to take care to call .resume() within the callback or promise resolution, thus waiting for that "async" action to complete itself.
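For instance, a sketch of that pattern, where saveSomewhere is a hypothetical placeholder for whatever asynchronous work you do per record:
stream.on("data", function (data) {
    stream.pause(); // hold further 'data' events while we work

    // saveSomewhere is a placeholder for your per-record async action
    saveSomewhere(data, function (err) {
        if (err) {
            // handle the error as appropriate
        }
        stream.resume(); // only resume once the async work has finished
    });
});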
But look at the node documentation for more in-depth information on streams.
P.S. I believe the following syntax should also be supported, if it suits your sensibilities better:
var stream = Lineup.find({foo:"bar"}).stream();

node.js does not flush buffers on a crash

I am running node as a windows service. The service was crashing on startup so I implemented a logging system, to discover that messages do not get written to a file when the application is forced to exit. I have been able to duplicate the problem with the code below:
var fs = require('fs');
var logStream = fs.createWriteStream('./nx3.log');
logStream.end('Goodbye world');
process.exit(0);
Nothing is written into nx3.log because the buffers don't flush. I have been able to work around the problem by using fs.appendFileSync but I would prefer to be using a mature logging module rather than rolling my own.
Is it possible to open a write stream that is unbuffered? Or is there some other way around this?
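For reference, the synchronous workaround mentioned above looks roughly like this (using the file name from the question):
var fs = require('fs');

// Synchronous append: the data is handed to the OS before the call returns,
// so it survives an immediate process.exit().
fs.appendFileSync('./nx3.log', 'Goodbye world\n');
process.exit(0);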
The issue here is not related to the buffer. The fs writable stream performs its writes asynchronously, so process.exit is not waiting for logStream.end to perform the write; rather, it is exiting immediately.
What you can do is listen for the uncaughtException event and perform your logging there. If a listener is added for this exception, the default action (which is to print a stack trace and exit) will not occur.
process.on('uncaughtException', function (err) {
    logStream.end(err.stack, function () {
        // write has completed, now we can exit
        process.exit(0);
    });
});
Watch out: the callback of end() listens for the finish event, which DOES NOT imply the buffer has been flushed to the output (e.g. disk or network).
So, end() will:
emit finish
flush the data
emit end
So, to be sure, listen for the end event from the stream, as in this example:
process.on('uncaughtException', function (err) {
    // Use logStream to log what you want
    logStream.write(err.stack);
    // Exit the process only after the data has been flushed
    logStream.once('end', () => process.exit(1));
    logStream.end();
});
Source: https://nodejs.org/api/stream.html#stream_events_finish_and_end
