What event order guarantees does node make?

I see in the documentation that listeners will be executed "in order" for a given event, but what other guarantees are there? For instance, is the following code guaranteed to print 0 through 9 sequentially, or is that just a side effect of the current implementation?
var EventEmitter = require('events').EventEmitter;
var ev = new EventEmitter();
ev.on("foo", console.log);
for (var i = 0; i < 10; i++) {
    ev.emit("foo", i);
}

For instance, is the following code guaranteed to print 0 through 9 sequentially
Hmm. I don't think it's actually guaranteed in the documentation anywhere, but that's the only reasonable way an event queue can work. If events aren't delivered in the order that they were sent, it can lead to very tangled logic on the receiving end.
As pointed out in one of the comments on your question, in the all-JavaScript case, it can't work any other way, because the event is dispatched synchronously during the emit() call. For native objects, something similar applies - they need to call emit() via the V8 bindings, so ultimately those events get delivered in the order the native code sends them, as well.
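A quick way to convince yourself that dispatch is synchronous (a minimal sketch, not from the docs):
var EventEmitter = require('events').EventEmitter;
var ev = new EventEmitter();
ev.on('foo', function () { console.log('listener runs'); });

console.log('before emit');
ev.emit('foo'); // the listener executes during this call
console.log('after emit');
// Prints: before emit, listener runs, after emit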

Listeners will be executed in the order that they are attached.
var EventEmitter = require('events').EventEmitter;
var ev = new EventEmitter();
ev.on("foo", console.log);
ev.on("foo", function(i) { console.log('...'); });
for (var i = 0; i < 10; i++) {
    ev.emit('foo', i);
}
Will output:
0
...
1
...
2
...
// and so on
But change the order of registration to:
ev.on('foo', function(i) { console.log('...'); });
ev.on('foo', console.log);
And the output will be:
...
0
...
1
...
2
// and so on
As I'm sure you can tell, that has nothing to do with the fact that the original code prints the values sequentially. Listeners called via emit() are not run on a separate thread: emit() invokes them synchronously, one after another, on the single JavaScript thread, which is why you see the sequential output.

Related

socket.io how to send multiple messages sequentially?

I'm using socket.io like this:
Client:
socket.on('response', function(i) {
    console.log(i);
});
socket.emit('request', whateverdata);
Server:
socket.on('request', function(whateverdata) {
    for (var i = 0; i < 10000; i++) {
        console.log(i);
        socket.emit('response', i);
    }
    console.log("done!");
});
I need output like this when putting the two terminals side by side:
Server    Client
0         0
1         1
.  (etc)  .
.         .
9998      9998
9999      9999
done!
But instead I am getting this:
Server    Client
0
1
.  (etc)
.
9998
9999
done!
          0
          1
          .
          .  (etc)
          9998
          9999
Why?
Shouldn't Socket.IO / Node emit the message immediately, not wait for the loop to complete before emitting any of them?
Notes:
The for loop is very long and computationally slow.
This question is referring to the socket.io library, not websockets in general.
Due to latency, waiting for confirmation from the client before sending each response is not possible.
The order in which the messages are received is not important, only that they are received as quickly as possible.
The server emits them all in a loop and it takes a small bit of time for them to get to the client and get processed by the client in another process. This should not be surprising.
It is also possible that the single-threaded nature of JavaScript in node.js prevents the emits from actually getting sent until your JavaScript loop finishes. That would take detailed examination of the socket.io code to know for sure whether it is an issue. As I said before, if you want 1,1 then 2,2 then 3,3 instead of 1,2,3 sent, then 1,2,3 received, you have to write code to force that.
If you want the client to receive the first before the server sends the 2nd, then you have to make the client send a response to the first and have the server not send the 2nd until it receives the response from the first. This is all async networking. You don't control the order of events in different processes unless you write specific code to force a particular sequence.
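For completeness (the question rules this out because of latency): socket.io supports acknowledgement callbacks, which is one way to force that lock-step ordering. A rough sketch, assuming the client calls the ack function it receives:
// Server: send the next value only after the client acknowledges the previous one.
// The client side would be: socket.on('response', function(i, ack) { console.log(i); ack(); });
function sendSequentially(socket, i, max) {
    if (i >= max) return console.log('done!');
    socket.emit('response', i, function ack() {
        sendSequentially(socket, i + 1, max);
    });
}
sendSequentially(socket, 0, 10000);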
Also, how do you have client and server in the same console anyway? Unless you are writing out precise timestamps, you wouldn't be able to tell exactly what event came before the other in two separate processes.
One thing you could try is to send 10, then do a setTimeout(fn, 1) to send the next 10 and so on. That gives JS a chance to breathe and to process other events that are waiting for your loop to finish, allowing the packets to actually get sent.
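A rough sketch of that batching idea, reusing the server handler from the question (batch size and delay are guesses to tune):
socket.on('request', function(whateverdata) {
    var i = 0, max = 10000, batchSize = 10;
    function sendBatch() {
        var limit = Math.min(i + batchSize, max);
        for (; i < limit; i++) {
            socket.emit('response', i);
        }
        if (i < max) {
            setTimeout(sendBatch, 1); // yield so buffered packets can actually be sent
        } else {
            console.log("done!");
        }
    }
    sendBatch();
});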
There's another networking issue too. By default, TCP tries to batch up your sends (at the lowest TCP level). Each time you send, it sets a short timer and doesn't actually send until that timer fires. If more data arrives before the timer fires, it just adds that data to the "pending" packet and sets the timer again. This is referred to as Nagle's algorithm. You can disable this "feature" on a per-socket basis with socket.setNoDelay(). You have to call that on the actual TCP socket.
I am seeing some discussion that Nagle's algorithm may already be turned off for socket.io (by default). Not sure yet.
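For reference, setNoDelay() is a method on node's plain net.Socket. With a raw TCP server it looks like the sketch below; how to reach the socket underneath socket.io is version-dependent, so treat this as illustrative only:
var net = require('net');

var server = net.createServer(function(sock) {
    sock.setNoDelay(true); // disable Nagle's algorithm on this connection
    sock.write('hello\n');
});
server.listen(3000);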
In stepping through the process of socket.io's .emit(), there are some cases where the socket is marked as not yet writable. In those cases, the packets are added to a buffer and will be processed "later" on some future tick of the event loop. I cannot see exactly what puts the socket temporarily in this state, but I've definitely seen it happen in the debugger. When it's that way, a tight loop of .emit() will just buffer and won't send until you let other events in the event loop process. This is why doing setTimeout(fn, 0) every so often to keep sending will then let the prior packets process. There's some other event that needs to get processed before socket.io makes the socket writable again.
The issue occurs in the flush() method in engine.io (the transport layer for socket.io). Here's the code for .flush():
Socket.prototype.flush = function () {
    if ('closed' !== this.readyState &&
        this.transport.writable &&
        this.writeBuffer.length) {
        debug('flushing buffer to transport');
        this.emit('flush', this.writeBuffer);
        this.server.emit('flush', this, this.writeBuffer);
        var wbuf = this.writeBuffer;
        this.writeBuffer = [];
        if (!this.transport.supportsFraming) {
            this.sentCallbackFn.push(this.packetsFn);
        } else {
            this.sentCallbackFn.push.apply(this.sentCallbackFn, this.packetsFn);
        }
        this.packetsFn = [];
        this.transport.send(wbuf);
        this.emit('drain');
        this.server.emit('drain', this);
    }
};
What happens sometimes is that this.transport.writable is false. And, when that happens, it does not send the data yet. It will be sent on some future tick of the event loop.
From what I can tell, it looks like the issue may be here in the WebSocket code:
WebSocket.prototype.send = function (packets) {
    var self = this;

    for (var i = 0; i < packets.length; i++) {
        var packet = packets[i];
        parser.encodePacket(packet, self.supportsBinary, send);
    }

    function send (data) {
        debug('writing "%s"', data);

        // always creates a new object since ws modifies it
        var opts = {};
        if (packet.options) {
            opts.compress = packet.options.compress;
        }

        if (self.perMessageDeflate) {
            var len = 'string' === typeof data ? Buffer.byteLength(data) : data.length;
            if (len < self.perMessageDeflate.threshold) {
                opts.compress = false;
            }
        }

        self.writable = false;
        self.socket.send(data, opts, onEnd);
    }

    function onEnd (err) {
        if (err) return self.onError('write error', err.stack);
        self.writable = true;
        self.emit('drain');
    }
};
Where you can see that the .writable property is set to false when some data is sent, until it gets confirmation that the data has been written. So, when rapidly sending data in a loop, it may not be letting through the event that signals the data has been successfully sent. When you do a setTimeout() to let some things in the event loop get processed, that confirmation event comes through, the .writable property gets set to true again, and data can again be sent immediately.
To be honest, socket.io is built out of so many abstraction layers across dozens of modules that it's very difficult code to debug or analyze on GitHub, so it's hard to be sure of the exact explanation. I did definitely see the .writable flag as false in the debugger, which did cause a delay, so this seems like a plausible explanation to me. I hope this helps.

Node.js Spawning multiple threads within a class method

How can I run a single method multiple times multi-threaded when called as a method of a class?
At first I tried to use the cluster module, but I realize it just re-runs the whole process from the start, rightfully so.
How can I achieve something like what's outlined below?
I want a class's method to spawn n processes, and when the parallel tasks are completed, I can resolve a promise which the method returns.
The problem with the code below is that calling cluster.fork() will fork the index.js process.
index.js
const Person = require('./Person.js');
var Mary = new Person('Mary');
Mary.run(5).then(() => {...});
console.log('I should only run once, but I am called 5 times too many');
Person.js
const cluster = require('cluster');

class Person {
    constructor(name) {
        this.name = name;
    }

    run(distance) {
        var completed = 0;
        return new Promise((resolve, reject) => {
            for (var i = 0; i < distance; i++) {
                // run a separate process for each i
                const worker = cluster.fork();
                worker.send(i);
                worker.on('message', message => {
                    if (message === 'completed') { ++completed; }
                    if (completed === distance) { resolve(); }
                });
            }
        });
    }
}

module.exports = Person;
I think the short answer is that it's impossible as written. It's even worse: this has nothing to do with JS. To use multiple processes (or threads) for your particular problem, you will essentially need a copy of the object in every worker, since the method (maybe) needs access to its fields. In that case you would need to either initialize the object in every worker or share memory. The latter isn't provided by cluster, and it's not trivial in other languages for every use case either.
If the calculation is independent of the Person, I suggest you extract it and use the usual pattern (in index.js):
const cluster = require('cluster');

if (cluster.isWorker) {
    // worker: use the i sent from the master for the calculation
} else {
    // master: create the Person, then fork children in a for loop
}
You then collect the results and change the Person as needed. You will be copying index.js, but this is standard and you only run what you need.
The problem arises if the results depend on the Person. If the fields they depend on are constant across all i, you can still send them to your forks. Otherwise, what you have is the only way to fork. In general, forking in cluster is not meant for methods but for the app itself, which is the standard forking behavior.
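A fuller sketch of that structure (hypothetical; the doubling stands in for the real calculation):
const cluster = require('cluster');

if (cluster.isWorker) {
    // worker: receive i from the master, compute, report back, exit
    process.on('message', (i) => {
        const result = i * 2; // stand-in for the real calculation
        process.send(result);
        process.exit(0);
    });
} else {
    // master: fork one worker per i and wait for all of them to report
    const distance = 5;
    let completed = 0;
    for (let i = 0; i < distance; i++) {
        const worker = cluster.fork();
        worker.send(i);
        worker.on('message', (result) => {
            if (++completed === distance) {
                console.log('all workers done');
            }
        });
    }
}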
Another solution
Following your comment, I suggest you check out child_process.execFile or child_process.exec on the same file.
This way you can spawn a totally independent process on the fly. Now instead of calling cluster.fork you call execFile. You can use either the exit code or stdout as return values (stderr etc.). The Promise is now replaced with:
var child_process = require('child_process');

var results = [];
for (var i = 0; i < distance; i++) {
    // run a separate process for each i
    results.push(child_process.execFile('node', ['mymethod.js', i]));
}
// ... catch the exit event on each child in results, or collect their output via a callback
Inside mymethod.js, have your code take i and return what you want, either through the exit code or through stdout (both are available on the returned child process). This is a bit un-node.js-y since you're waiting on asynchronous calls, but your requirements are non-standard. Since I'm not sure how you use this, perhaps returning the array through a callback is a better idea.
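Concretely, a sketch of the two pieces (mymethod.js is the hypothetical file name from above, and the doubling stands in for the real work):
// mymethod.js: read i from argv, print the result on stdout
var i = parseInt(process.argv[2], 10);
console.log(i * 2);

// caller: run one child per i and collect stdout
var child_process = require('child_process');

var distance = 5;
var results = [];
var pending = distance;
for (var i = 0; i < distance; i++) {
    child_process.execFile('node', ['mymethod.js', String(i)], function(err, stdout) {
        if (err) throw err;
        results.push(parseInt(stdout, 10));
        if (--pending === 0) {
            console.log('all done:', results);
        }
    });
}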

Removing event listeners on a currently emitting event

I have the following sample app, written in Node.js:
'use strict';

var events = require('events'),
    util = require('util');

var EventEmitter = events.EventEmitter;

var Foo = function () {};
util.inherits(Foo, EventEmitter);

var foo = new Foo();

foo.once('x', function () {
    foo.removeAllListeners();
    console.log('Google!');
});

foo.once('x', function () {
    foo.removeAllListeners();
    console.log('Yahoo!');
});

foo.emit('x');
It prints:
Google!
Yahoo!
Now my question is: apparently removeAllListeners does not affect the listeners that are currently being dispatched for the event. Is this random, or is it by intent? (I checked this with 0.10.32 as well as 0.11.13.)
The background of my question is: If I bind two event handlers to a stream's end event, and one of them calls removeAllListeners, does Node.js guarantee that both will always be run, or is this just by good luck?
In looking at the implementation of the .emit() method, it looks like once it starts processing an event and calling listeners, that event will not be affected by any code that calls removeAllListeners(), so in your example both listeners will be called.
The code for .emit() makes a copy of the array of listeners before executing any of them so that once it starts executing one, it will execute all of them, even if they are removed during execution. Here's the relevant piece of code:
} else if (util.isObject(handler)) {
    len = arguments.length;
    args = new Array(len - 1);
    for (i = 1; i < len; i++)
        args[i - 1] = arguments[i];

    listeners = handler.slice();
    len = listeners.length;
    for (i = 0; i < len; i++)
        listeners[i].apply(this, args);
}
From the EventEmitter implementation here: https://github.com/joyent/node/blob/857975d5e7e0d7bf38577db0478d9e5ede79922e/lib/events.js#L120
In this piece of code, handler will be an array of listener functions. The line
listeners = handler.slice()
makes a copy of the listeners array before any listeners are executed. This is to be expected, because iterating an array can go wrong (duplicates or skips) if code is freely allowed to modify the array while it is being iterated. So, emit() freezes the set of listeners to be called before calling any of them.
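To see why the copy matters, here is a contrived sketch (not Node's code) of a listener removing itself from a live array mid-iteration:
function makeListener(name, removeSelf) {
    return function listener() {
        console.log(name);
        if (removeSelf) {
            listeners.splice(listeners.indexOf(listener), 1);
        }
    };
}

var listeners = [
    makeListener('a', true), // 'a' removes itself when called
    makeListener('b', false),
    makeListener('c', false)
];

for (var i = 0; i < listeners.length; i++) {
    listeners[i](); // prints 'a' then 'c' -- 'b' is skipped!
}
Iterating over listeners.slice() instead would call all three, which is exactly what .emit() does.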

NODE.JS accessing shared variable in socket.IO event handler

I am building an experimental online poker game using Node.js and socket.IO. The game requires 3 players to join before it starts. I use socket.IO to listen for connections from joining players; whenever 3 players have arrived, they form one group. Currently I use some shared variables to do this, but I fear that if lots of players come in at the same time, it will cause synchronization problems. As you can see from the code snippet, players, groups, groupNumber, comingPlayers, and clients are all shared between multiple 'connection' event handlers. So if one event handler were interrupted and another scheduled by the V8 engine partway through, it could corrupt the state of these shared variables.
I have done some research using Google but didn't find satisfactory answers. So I posted here to see if any expert can help me. Thanks in advance!
var sockets = {}; // client.id -> socket
var clients = {}; // {"player1": 1} {"player2": 1} {"player3": 1}
var groups = {}; // {"1": ["player1", "player2", "player3"]}
var groupNumber = 1; // the current group number
var comingPlayers = 0; // a temporary counter for the coming players
var players = []; // a temporary array of players who have not yet formed a group

socket.on('connection', function(client) {
    sockets[client.id] = client;
    players.push(client.id);
    clients[client.id] = groupNumber;
    comingPlayers++;
    if (comingPlayers === 3) { // now there are 3 players, which can compose 1 group
        groups[groupNumber] = arrayClone(players);
        gamePlay(groupNumber);
        players = [];
        groupNumber++;
        comingPlayers = 0;
    }
});
The code you've shown is atomic (it will run to completion before another event handler can start), so you shouldn't have any synchronization issues. Remember that in node, all user-written JavaScript runs on a single thread. If somebody else is trying to connect, it won't interrupt what you're doing. Note that this only holds while the handler is fully synchronous: if you add an asynchronous step (a database call, a timer) in the middle, another 'connection' handler can run during that gap.
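A small sketch of where that guarantee ends (hypothetical counter; the setTimeout stands in for any async work such as a database call):
var EventEmitter = require('events').EventEmitter;
var emitter = new EventEmitter();
var counter = 0;

emitter.on('connection', async function(id) {
    var seen = counter;
    await new Promise(function(resolve) { setTimeout(resolve, 10); }); // async gap
    counter = seen + 1; // another handler may have run during the gap
    console.log(id, counter);
});

emitter.emit('connection', 'a');
emitter.emit('connection', 'b');
// Both handlers read counter as 0, so the final value is 1, not 2 -- a lost update.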

Can I allow for "breaks" in a for loop with node.js?

I have a massive for loop and I want to allow I/O to continue while I'm processing. Maybe every 10,000 or so iterations. Any way for me to allow for additional I/O this way?
A massive for loop is just you blocking the entire server.
You have two options: either move the for loop into a separate process, or make it asynchronous.
var data = [];

var next = function(i) {
    if (i >= data.length) return; // stop once every item has been processed
    // do something with data[i]
    process.nextTick(next.bind(this, i + 1));
};

process.nextTick(next.bind(this, 0));
I don't recommend the latter. You're just implementing naive time slicing, which the OS-level process scheduler can do better than you. Note also that in current versions of node, recursive process.nextTick() calls run before pending I/O and will starve it; setImmediate() is the call that actually yields to I/O.
var exec = require("child_process").exec;

var s = exec("node " + filename);
s.stdout.on("data", function(data) {
    // handle data as it streams in
});
Alternatively use something like hook.io to manage processes for you.
Actually you probably want to aggressively redesign your codebase if you have a blocking for loop.
Maybe something like this to break your loop into chunks...
Instead of:
for (var i = 0; i < len; i++) {
    doSomething(i);
}
Something like:
var i = 0;

function doChunk() {
    var limit = Math.min(i + 10000, len);
    for (; i < limit; i++) {
        doSomething(i);
    }
    if (i < len) {
        process.nextTick(doChunk); // schedule the next chunk after pending events
    }
}

doChunk();
The nextTick() call gives other queued callbacks a chance to run between chunks, while still doing most of the looping synchronously, which (I'm guessing) will be a lot faster than creating a new event for every iteration. And obviously, you can experiment with the chunk size (10,000) until you get the results you want.
You could also use setTimeout() or setImmediate() rather than nextTick(). In modern node, setImmediate() is actually the better choice here, since process.nextTick() callbacks run ahead of pending I/O events and can starve them; setTimeout() additionally gives you one more variable (the timeout in milliseconds) to use for tuning.
