NodeJS sockets initialized as unpaused?

A net.Socket object in NodeJS is a Readable Stream, however one note in the docs got me concerned:
For the net.Socket 'data' event, the docs say
Note that the data will be lost if there is no listener when a Socket emits a 'data' event.
That seems to imply a Socket is returned to the calling script in "flowing-mode" and already un-paused? However, for a generic Readable Stream, the documentation for the 'data' event says
If you attach a data event listener, then it will switch the stream into flowing mode, and data will be passed to your handler as soon as it is available.
That "If" seems to imply if you wait a bit to bind to the 'data' event, the stream will wait for you, and if you intentionally want to miss the 'data' events, the example in the resume() method seems to indicate you must call the resume() method to start the flow of data.
My concern is that when working with a net.Server, when you receive a net.Socket as part of a 'connection' event, is it imperative that you start handling the 'data' events right away since it's already opened? Meaning if I do:
var s = new net.Server();
s.on('connection', function(socket) {
  // Do some lengthy setup process here, blocking execution for a few seconds...
  socket.on('data', function(d) { console.log(d); });
});
s.listen(8080);
Meaning, if I don't bind to the 'data' event right away, could I lose data? So is this a more robust way to handle incoming connections if you have a lengthy setup required for each one?
var s = new net.Server();
s.on('connection', function(socket) {
  socket.pause(); // Not ready for you yet!
  // Do some lengthy setup process here, blocking execution for a few seconds...
  socket.on('data', function(d) { console.log(d); });
  socket.resume(); // Okay, go!
});
s.listen(8080);
Anyone have experience working with listening on raw socket streams to know if this data loss is an issue?
I'm hoping this is an instance where the net.Socket documentation wasn't updated since v0.10, since the stream documentation has a section mentioning that, in versions prior to 0.10, 'data' events started being emitted right away. Were TCP sockets properly updated to not start emitting 'data' events right away, with the documentation not updated accordingly?

Yes, this is a flaw in the docs. Here is an example:
var net = require('net')

var server = net.createServer(onConnection)

function onConnection (socket) {
  console.log('onConnection')
  setTimeout(startReading, 1000)

  function startReading () {
    socket.on('data', read)
    socket.on('end', stopReading)
  }

  function stopReading () {
    socket.removeListener('data', read)
    socket.removeListener('end', stopReading)
  }
}

function read (data) {
  console.log('Received: ' + data.toString('utf8'))
}

server.listen(1234, onListening)

function onListening () {
  console.log('onListening')
  net.connect(1234, onConnect)
}

function onConnect () {
  console.log('onConnect')
  this.write('1')
  this.write('2')
  this.write('3')
  this.write('4')
  this.write('5')
  this.write('6')
}
All the data is received. If you explicitly resume() the socket before attaching a 'data' listener, however, you will lose it.
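A minimal sketch of that losing case, reusing the shape of the question's server:
var net = require('net');
var s = new net.Server();
s.on('connection', function (socket) {
  socket.resume(); // flowing mode with no 'data' listener: incoming chunks are dropped
  setTimeout(function () {
    // Anything the client sent during the last second is gone
    socket.on('data', function (d) { console.log(d); });
  }, 1000);
});
s.listen(8080);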
Also, if you do your "lengthy" setup in a blocking manner (which you shouldn't), you can't lose any I/O, as it has no chance to be processed, so no events will be emitted until you return control to the event loop.

Related

How to share a TCP socket object between parent and (forked) child?

I have an application in which my TCP server module (parent) listens for 'connection' events and receives some data on the created socket to perform a handshake with the remote client. Once the handshake is performed, the server needs to send the socket object to a forked child, which will also send and receive data to the socket, do some stuff and finally send result to parent and be killed. For some reasons, I need to keep the socket object in the parent for further data processing not performed in the child, after the child has finished.
I've managed to send the socket to the child using the subprocess.send() method but, this way, the socket handle becomes null in the parent. I tried setting the keepOpen option to true and it almost worked, since I can send the socket and still work with it in the parent, but it seems not to work properly, because incoming data is not always received by the child's 'data' event listener.
I also tried to removeListener for the 'data' event from the parent, prior to sending the socket to the child, but this made no difference, data is still being lost at some point on some occasions (on some others it is correctly received after an unexpected delay...). This code extract illustrates what I'm trying to do:
const net = require('net');
const server = net.createServer();
const cp = require('child_process');

server.on('connection', (socket) => {
  socket.on('data', (data) => {
    // Perform handshake
    const child = cp.fork('child.js');
    child.on('message', (result) => {
      console.log('CHILD finished processing: ', result);
      child.kill('SIGHUP');
      // Do more stuff with socket
    });
    child.send('socket', socket);
    // (At this point, socket handle is null)
  });
});

server.listen(PORT);
I'm new to nodejs, I assume there might be errors in the code. Thanks.
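For reference, the keepOpen attempt described in the question corresponds to the options argument of subprocess.send(); a minimal sketch of that call (this just illustrates the option, it is not a fix for the data loss):
// With { keepOpen: true }, the parent's copy of the socket stays usable
// after the handle is sent to the child.
child.send('socket', socket, { keepOpen: true }, (err) => {
  if (err) console.error('send failed: ', err);
  // The parent can still read from and write to socket here
});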

NodeJS streams and premature end

Assuming a Readable Stream in NodeJS and a Data (on('data', ...)) event handler tied to it that is relatively slow, is it possible for the End event to fire before the last Data handler(s) has finished, and if so, will it prematurely terminate that handler? Or, will all Data events get dispatched and run?
In my case, I am working with large files and want to commit to a DB every data chunk. I am worried that I may lose the last record or two (or more) if End is fired before the last DB calls in the handler actually complete.
The 'end' event fires after the last 'data' event, but it may happen before the last 'data' handler has finished. It is also possible that before one 'data' handler has finished, the next one is started. Depending on what your handlers do asynchronously, a later 'data' callback may finish before an earlier one, which can cause errors and problems in your code.
An example of how to cause these problems (for your own testing):
var fs = require('fs');
var rr = fs.createReadStream('somebigfile.jpg');
var i = 0;

rr.on('data', function(chunk) {
  i++;
  var s = i;
  console.log('readable:' + s);
  setTimeout(function(){
    console.log('timeout:' + s);
  }, 50 - i * 10);
});

rr.on('end', function() {
  console.log('end');
});
It prints to your console when each 'data' handler starts and, some milliseconds later, when it finishes. The finishing order can differ from the starting order.
Solution:
Readable streams have two modes: 'flowing mode' and 'paused mode'. When you add a 'data' event handler, you automatically switch the stream into flowing mode.
From the documentation:
When in flowing mode, data is read from the underlying system and
provided to your program as fast as possible
In this mode, events will not wait for your slow actions to finish. What you need is 'paused mode'.
From the documentation:
In paused mode, you must explicitly call stream.read() to get chunks
of data out. Streams start out in paused mode.
In other words: you ask for a chunk of data, you get it, you work with it, and when you are ready you ask for a new chunk. In this mode you control when you receive your data.
How to change to 'paused mode':
It is the default mode for a stream, but registering a 'data' event handler switches it to flowing mode. Therefore, don't use readstream.on('data', ...).
Instead, use readstream.on('readable', function(){...}); when it fires, the stream is ready to give you a chunk of data, which you get with var chunk = readstream.read();
Example from docs:
var fs = require('fs');
var rr = fs.createReadStream('foo.txt');

rr.on('readable', function() {
  console.log('readable:', rr.read());
});

rr.on('end', function() {
  console.log('end');
});
Please read the documentation for more details, because there are more situations in which a stream is automatically switched to flowing mode.
Working with slow handlers in flowing mode:
If you want or need to work in flowing mode, there is also a solution: you can pause and resume the stream. When you get a chunk from the 'data' event, pause the stream, and when you have finished your work, resume it.
Example from documentation:
var readable = getReadableStreamSomehow();

readable.on('data', function(chunk) {
  console.log('got %d bytes of data', chunk.length);
  readable.pause();
  console.log('there will be no more data for 1 second');
  setTimeout(function() {
    console.log('now data will start flowing again');
    readable.resume();
  }, 1000);
});
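Applied to the database case from the question, a minimal sketch of the pause/resume approach (insertChunk is a hypothetical asynchronous DB helper, not a real API; the pending counter lets the 'end' handler wait for the last write):
var fs = require('fs');
var rr = fs.createReadStream('somebigfile.jpg');
var pending = 0;
var ended = false;

rr.on('data', function (chunk) {
  rr.pause();                         // stop the flow while the DB write is in flight
  pending++;
  insertChunk(chunk, function (err) { // hypothetical async DB call
    if (err) throw err;
    pending--;
    rr.resume();                      // ask for the next chunk
    if (ended && pending === 0) console.log('all chunks committed');
  });
});

rr.on('end', function () {
  ended = true;
  if (pending === 0) console.log('all chunks committed');
});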

Node.js Socket pipe method DOES NOT pipe last packet to the http response

I have a Node server which uses Express as the web app.
This server creates a TCP socket connection to another TCP server.
I'm trying to pipe the TCP data to the user's HTTP response.
It works fine for a while, but the LAST TCP packet is NOT piped to the HTTP response.
So the web browser's download status stops at 99.9% downloaded.
My source code is below.
Anyone can help me to solve this problem?
Thanks in advance.
app.get('/download/*', function(req, res){
  var tcpClient = new net.Socket();

  tcpClient.connect(port, ip, function() {
    // some logic
  });

  tcpClient.on('data', function(data) {
    /* skip ... */
    tcpClient.pipe(res); // This method is called once in the 'data' event loop
    /* skip ... */
  });

  tcpClient.on('close', function() {
    clog.debug('Connection closed.');
  });

  tcpClient.on('end', function() {
    clog.debug('Connection Ended.');
  });

  tcpClient.on('error', function(err){
    clog.err(err.stack);
  });
});
That's not how you are supposed to use .pipe().
When you pipe a stream into another, you don't have to handle the data events yourself: everything is taken care of by the pipe. Moreover, the 'data' event is emitted on every chunk of data, which means that you are possibly calling pipe() multiple times.
You only need to create and initialize the Socket, and then pipe it to your response stream:
tcpClient.connect(port, ip, function () {
  // some logic
  this.pipe(res);
});
Edit: As you clarified in the comments, the first chunk contains metadata, and you only want to pipe from the second chunk onward. Here's a possible solution:
tcpClient.connect(port, ip, function () {
  // some logic

  // Only call the handler once, i.e. on the first chunk
  this.once('data', function (data) {
    // Some logic to process the first chunk
    // ...

    // Now that the custom logic is done, we can pipe the tcp stream to the response
    this.pipe(res);
  });
});
As a side note, if you want to add custom logic to the data that comes from the tcpClient before writing it to the response object, check out the Transform stream. You will then have to:
create a transform stream with your custom transforming logic
pipe all streams together: tcpClient.pipe(transformStream).pipe(res), as sketched below.
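A minimal sketch of such a transform, using the classic util.inherits style seen elsewhere on this page (Upcase and its uppercasing logic are just placeholders for your custom logic):
var stream = require('stream');
var util = require('util');

function Upcase(options) {
  stream.Transform.call(this, options);
}
util.inherits(Upcase, stream.Transform);

Upcase.prototype._transform = function (chunk, encoding, done) {
  // Placeholder logic: replace with whatever should happen to each chunk
  this.push(chunk.toString().toUpperCase());
  done();
};

// Then: tcpClient.pipe(new Upcase()).pipe(res);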

nodejs stdin readable event not triggered

The readable event is not triggered in the process.stdin
test.js
var self = process.stdin, data;

self.on('readable', function() {
  var chunk = this.read();
  if (chunk === null) {
    handleArguments();
  } else {
    data += chunk;
  }
});

self.on('end', function() {
  console.log("end event", data);
});
Then when I do node test.js and start typing in the console, the readable event is not triggered at all.
Please tell me how to attach readable listener to process.stdin stream.
If you are trying to capture what you type into the console, try these steps:
process.stdin.resume();
process.stdin starts in a paused state. You need to bring it to the ready state.
Then listen on the 'data' event to capture the data typed in the console:
process.stdin.on('data', function(data) {
  console.log(data.toString());
});
I am not sure if this solves your actual problem; hopefully it at least gives you some insight.
Additional Info:
The readable event was introduced in Node.js v0.9.4, so check that the Node version you are using is at least 0.9.4.
Note from node api docs:
The 'data' event emits either a Buffer (by default) or a string if setEncoding() was used.
Note that adding a 'data' event listener will switch the Readable stream into "old mode", where data is emitted as soon as it is available, rather than waiting for you to call read() to consume it.
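For reference, a complete version of the 'data'-based approach suggested above (a sketch; run it, type some input, then press Ctrl-D to end the stream):
process.stdin.resume();            // leave the initial paused state
process.stdin.setEncoding('utf8'); // emit strings instead of Buffers

var data = '';
process.stdin.on('data', function (chunk) {
  data += chunk;
});
process.stdin.on('end', function () { // fires on Ctrl-D (EOF)
  console.log('end event', data);
});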

Nodeunit Execution Order?

I am trying to test my web server using nodeunit:
test.js
exports.basic = testCase({
  setUp: function (callback) {
    this.ws = new WrappedServer();
    this.ws.run(PORT);
    callback();
  },
  tearDown: function (callback) {
    delete this.ws;
    callback();
  },
  testFoo: function(test) {
    var socket = ioClient.connect(URL);
    console.log('before client emit');
    socket.emit('INIT', 1, 1);
    console.log('after client emit');
  }
});
and this is my very simple nodejs server:
WrappedServer.prototype.run = function(port) {
  this.server = io.listen(port, {'log level': 2});
  this.attachCallbacks();
};

WrappedServer.prototype.attachCallbacks = function() {
  var ws = this;
  ws.server.sockets.on('connection', function(socket) {
    ws.attachDebugToSocket(socket);
    console.log('socket attaching INIT');
    socket.on('INIT', function(userId, roomId) {
      // do something here
    });
    console.log('socket finished attaching INIT');
  });
};
Basically I am getting this error:
[...cts/lolol/nodejs/testing](testingServer)$ nodeunit ws.js
info - socket.io started
before client emit
after client emit
info - handshake authorized 1013616781193777373
The "sys" module is now called "util". It should have a similar interface.
socket before attaching INIT
socket finished attaching INIT
info - transport end
Somehow, the socket emits INIT BEFORE the server attaches callbacks for sockets.
Why is this happening? In addition, what's the right way to do this?
I'm assuming you were expecting the order to be this?
socket before attaching INIT
socket finished attaching INIT
before client emit
after client emit
From the small amount of code given, the issue is probably two things.
First, and probably the main issue, is that your ioClient.connect will not connect immediately. You need to pass some kind of callback to it, emit INIT from there, and then execute the test's callback function once it has actually connected.
Second, you should probably do the same thing with your run command. listen will not start listening immediately, so you're going to get inconsistent results occasionally if it hasn't started listening by the time it executes your test. You should also pass the setUp's callback to io.listen.
Update
To be clear for listen, just like most things in node, the socketio server's listen method is asynchronous. Calling the method tells it to start listening, but there is some time in the background where the server sets up the networking stuff to start listening. Just like node's core listen, http://nodejs.org/docs/latest/api/net.html#server.listen, socket.io's version takes a callback argument that is called once the server is up and listening.
io.listen(port, {'log level': 2}, callback);
Unless socket.io starts giving you errors about failing to connect, this probably is not an issue, but it is something to keep in mind. Treating asynchronous actions as if they were instantaneous is an easy way to make bugs that only come up occasionally. Since your run wraps listen, I think in general, not just for testing, passing a callback to run would be a very good idea.
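Putting both points together, a sketch of what the test could look like (this assumes run is changed to forward its callback to io.listen, and relies on the socket.io client's 'connect' event):
exports.basic = testCase({
  setUp: function (callback) {
    this.ws = new WrappedServer();
    this.ws.run(PORT, callback);       // run passes callback through to io.listen
  },
  tearDown: function (callback) {
    delete this.ws;
    callback();
  },
  testFoo: function (test) {
    var socket = ioClient.connect(URL);
    socket.on('connect', function () { // wait until actually connected
      socket.emit('INIT', 1, 1);
      test.done();
    });
  }
});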
