Check if NodeJS ClearTextStream stream is "ended"?

I have a ClearTextStream for a TLS connection and I want to check if "end" was already called. The actual problem is that I'm trying to write something to the stream and I get a "write after end" error.
Now, to avoid that, I just want to check if "end" was already called. I do have a "close" event, but it isn't fired in all cases.
I can't find it in the documentation and I couldn't find anything like that by googling.
I could check the error event (which is throwing "write after end" for me) and handle the situation there - but is there really no way to check this up front?
Thanks!

If you get a write after end error, that means that you are trying to write data to a Writable stream that has already been ended (i.e. that can't accept any more input data). When a writable stream finishes ending (end() has been called and all buffered data has been flushed), the finish event is emitted (see the documentation). On the other hand, the close event is emitted by a Readable stream when the underlying resource is closed (for instance when the file descriptor you are reading from is closed).
As a ClearTextStream is a Duplex stream, it can emit both close and finish events, but they don't mean the same thing. In your particular case, you should listen to the finish event and react appropriately.
Another solution would be to check the this.ended and this.finished booleans (see the source code), but I wouldn't recommend that as they are private variables and only reflect the implementation details, not the public API.
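A minimal sketch of the finish-based approach (the connection details, stream variable and guard flag are illustrative, not part of any API):

const tls = require("tls");

// Hypothetical connection; substitute your own host, port and options.
const cleartextStream = tls.connect(443, "example.com");

let finished = false;
cleartextStream.on("finish", () => {
  finished = true; // the writable side has ended; writes are no longer allowed
});

function safeWrite(data) {
  if (finished) {
    // Handle the situation up front instead of hitting "write after end".
    return false;
  }
  return cleartextStream.write(data);
}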

Related

Defining event handlers after readline initiated

The textbook way to read a file line-by-line in NodeJS seems to be to call readline.createInterface, and then afterward attach event handlers for line and close.
There doesn't seem to be anything to "start" the reader. It just goes, and seems to work perfectly. How does it know when to start reading? How does it guarantee that those events, which don't exist yet, will always pick up every line in the file?
I always assumed that it just all happened so fast that the events get attached faster than it takes to open the file from disk and start reading it - but that doesn't really hold up.
For example, suppose I put some heavy CPU-consuming code after the lineReader has been created, but before the events are attached. It still seems to work, and the event still fires for each line. How did it "wait" until the heavy stuff was done before it started reading? If I don't attach the line event, then it runs anyway and the close event still fires, so it's not like it's waiting for the line event to be created.
var readline = require("readline");
var fs = require("fs");

var lineReader = readline.createInterface({
  input: fs.createReadStream("input.txt")
});

// EVENTS HAVE NOT BEEN CREATED YET
lineReader.on("line", line => { console.log(line); });
lineReader.on("close", () => { console.log("DONE"); });
This isn't specific to lineReader - it seems to be a common Node pattern - this is just the easiest example to define and run.
Internally, readline.createInterface() is creating a stream. Streams, by default, are paused. They unpause themselves in a number of ways and what's relevant here is when a data event listener is added.
And, inside of readline.createInterface(), a data event handler is added. That starts the stream flowing and it will start emitting data events which the readline code will parse into line events.
Also, because node.js and streams are event driven and node.js runs your Javascript single-threaded, no events will be processed until your setup code finishes executing. Internally, node.js may have already started reading the file (using asynchronous I/O and threads internally), but even if it finishes the first read from the file before your setup code finishes executing, all it will do is insert a data event in the event queue. node.js won't process that data event until your setup code is done executing and has returned control back to the node.js event loop.
Then, the data event callback will be called, the readline code will parse the data from that first event and if there is a full line in that first data event, it will then trigger a line event.
There doesn't seem to be anything to "start" the reader.
Attaching a data event handler on the readStream (internal to the readline code) is what tells the stream to start flowing.
It just goes, and seems to work perfectly. How does it know when to start reading?
Same as above.
How does it guarantee that those events, which don't exist yet, will always pick up every line in the file?
The readline code receives raw data from the file in its data event handler. It then parses that data into lines and emits a line event for each line it finds. When a file read crosses a line boundary, it must buffer the partial line and wait for the rest of the line to arrive in the next data event from the stream.
When the readline code sees that the stream is done reading and there are no more bytes, it emits the last line (if there is one left in the buffer) and then issues the close event to tell the listener that it's all done.
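Conceptually, that buffering looks something like this simplified sketch (not the actual readline source, just the same idea, with an arbitrary input file):

const fs = require("fs");

const stream = fs.createReadStream("input.txt", { encoding: "utf8" });
let leftover = "";

stream.on("data", chunk => {
  const pieces = (leftover + chunk).split("\n");
  leftover = pieces.pop(); // possibly a partial line; keep it for the next chunk
  for (const line of pieces) {
    console.log("line:", line); // readline would emit a 'line' event here
  }
});

stream.on("end", () => {
  if (leftover) console.log("line:", leftover); // flush the final partial line
  console.log("done"); // readline would emit 'close' here
});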
For example, suppose I put some heavy CPU-consuming code after the lineReader has been created, but before the events attached. It still seems to work, and the event still fires for each line. How did it "wait" until the heavy stuff was done before it started reading?
This is because node.js is event-driven. The first data event from the stream (internal to the readline code) is the result of an asynchronous read of the file, and its completion is delivered through the event queue. An event in the event queue will not be processed until the current piece of Javascript finishes and returns control back to the event loop (at which point the next waiting event is serviced). So, no matter how much CPU-consuming code you have before you attach the event handlers, the internals of readline won't be told about the first data read from the file until all of that is done.
It is this single-threaded, event-driven nature that ensures that you get to install your event listeners before those events can be triggered so there's no way you can miss them.
If I don't attach the line event, then it runs anyway and the close event still fires, so it's not like it's waiting for the line event to be created.
Correct. The readline code attaches the data event handler inside the createInterface() call, whether you have a line event listener or not. So, the stream will start flowing and the file will get read whether you have a line event handler or not.
FYI, one way you can help answer these questions yourself is to just go look at the node.js code and see how it works. That's what I did here. Here's a link to the createInterface() function where you can see what I've described here.
And, you can see here in the stream doc where it describes the three ways that a stream starts flowing, one of which is attaching a data event listener.

why python selectors module has no event for socket error

With select, there is a list for sockets in an error state, and epoll has an ERROR event.
But the selectors module only has events for EVENT_READ and EVENT_WRITE.
So how can I detect an errored socket without such an event?
An error on the socket will always result in the underlying socket being signaled as readable (at least). For example, if you are waiting for data from a remote peer, and that peer closes its end of the connection (or abends, which does the same thing), the local socket will get the EVENT_READ marking. When you go to read it, you would then get zero bytes (end of file), telling you that the peer is gone (or at least finished sending).
Similarly, if you were waiting to send data and the peer resets the connection, you will get an EVENT_WRITE notification. When you then go to attempt a send, you will get an error from the send (which, in python, means an exception).
The only thing you lose here from select is the ability to detect exceptional conditions: the xlist from select.select or POLLPRI from select.poll. If you needed those, you would need to use the lower-level select module directly. (Priority/out of band data is not commonly used so this is not an unreasonable choice.)
So the simplified interface provided by selectors really loses no "error" information. If there is an error on the socket that would have caused a POLLERR return from select.poll (a RST from the remote, say), you will get an EVENT_READ or EVENT_WRITE notification, and whatever error occurred will be re-triggered as soon as you attempt a send or recv.
A good rule of thumb to keep in mind with select, poll and friends is that a result indicating "readable" really means "will not block if you attempt to read". It doesn't mean you will actually get data back from the read; you may get an error instead.
Likewise for "writable": you may not be able to send data, but attempting the write won't block.

How to get a readable stream to 'close'

I'm getting a readable stream (require('stream').Readable) from a library I'm using*.
In a general sense, how can I close this (any) readable stream once all data is consumed? I'm seeing the end event, but the close event is never received.
Tried: .close() and destroy() don't seem to be available anymore on require('stream').Readable, while they were available on require('fs') streams.
I believe the above is causing some erratic behavior under load (i.e. running out of file descriptors, memory leaks, etc.), so any help is much appreciated.
Thanks.
*) x-ray. Under the covers it uses enstore, which uses an adapted require('stream').Readable
Readable streams typically don't emit close (they emit end). The close event is more for Writable streams to indicate that an underlying file descriptor has been closed for example.
There is no need to manually close a Readable stream once all of the data has been consumed, it ends automatically (this is done when the stream implementation calls push(null)).
Of course if the stream implementation isn't cleaning up any resources it uses behind the scenes, then that is a bug and should be filed on the appropriate project's issue tracker.
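For illustration, here is a minimal Readable (unrelated to x-ray, with made-up data) that ends itself by pushing null; end fires on its own, and that is the event to watch for:

const { Readable } = require("stream");

const items = ["one", "two", "three"];
const readable = new Readable({
  read() {
    // push(null) signals there is no more data; the stream ends on its own.
    this.push(items.length ? items.shift() : null);
  }
});

readable.on("data", chunk => console.log("data:", chunk.toString()));
readable.on("end", () => console.log("end fired - all data consumed"));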

Node.js flush socket after write

I'm implementing a tcp protocol in Node.
Full source:
https://github.com/roelandmoors/ads.js/blob/master/ads.js
specs:
http://infosys.beckhoff.com/content/1033/tcadsamsspec/html/tcadsamsspec_amstcppackage.htm?id=17754
The problem is that I use this to send a package:
this.tcpClient.write(buf);
If I send multiple commands, then multiple commands are combined into a single tcp packet.
This doesn't work.
There are more questions about this on SO, but they recommend using a delimiter.
But since I can't change the protocol this isn't an option.
Isn't there a simple solution to flush the socket?
socket.setNoDelay() doesn't help.
Edit: I also tried to use the drain event to send the next write, but the event is never called?
Update:
This seems to solve the problem, but it is very ugly and I don't know if it always works.
Instead of writing it directly I write to a buffer:
this.writeFILO.push(buf);
Every cycle(?) I'm writing a package to the socket stream:
var sendCycle = function(ads) {
  if (ads.writeFILO.length > 0) {
    ads.tcpClient.write(ads.writeFILO.shift());
  }
  setTimeout(function() {
    sendCycle(ads);
  }, 0);
}
I refer to the socket.write(data, [encoding], [callback]) API:
The optional callback parameter will be executed when the data is finally written out - this may not be immediately.
So, set up a queue (array is fine) which holds messages to be sent.
When the above callback is called, check the queue and send the next message if needed.
This however does not guarantee what you're looking for, you'll have to test. Unfortunately the docs don't explicitly state when there's an acknowledgement from the remote end point that it actually received that message...
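A rough sketch of that queue idea (the endpoint is made up; in the question the socket would be this.tcpClient), keeping in mind that the write callback only means the data was handed off to the OS, not that the peer received it:

const net = require("net");

// Hypothetical endpoint; substitute your own host and port.
const socket = net.connect(48898, "plc-host");

const sendQueue = [];
let sending = false;

function enqueue(buf) {
  sendQueue.push(buf);
  sendNext();
}

function sendNext() {
  if (sending || sendQueue.length === 0) return;
  sending = true;
  socket.write(sendQueue.shift(), () => {
    // Previous write flushed to the OS; now send the next one.
    sending = false;
    sendNext();
  });
}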
In the end, as you concluded, TCP is a stream.
An interesting idea that just came to me, however: if you're FORCED TO use an existing protocol, then open two TCP connections.
When one connection acknowledges (whatever the higher-level protocol is) receiving that message, send the next through the other one... and so forth..
Anyway, nice challenge :)
I was wrong. TCP is a stream and the protocol works like a stream, but I didn't handle it like a stream.
PS: sending separate messages seemed to work with setImmediate()
I know that this is an old question, and I'm not 100% sure I understand what you are looking for, but there is a way to flush a socket in node. First you need to implement a Transform class.
See here for example: https://nodejs.org/api/stream.html#stream_implementing_a_transform_stream.
Then you can take your stream and pipe it through your transform before piping it into your socket.
I do not own this node module but I have seen an example of this here: https://github.com/yongtang/clamav.js/blob/master/index.js#L8
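A minimal sketch of that shape (a pass-through Transform piped into the socket; the host and port are made up, and by itself this still does not guarantee one TCP packet per write):

const net = require("net");
const { Transform } = require("stream");

// Pass-through Transform; a real one could frame or reshape the data
// before it reaches the socket.
const framer = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk);
  }
});

const socket = net.connect(48898, "plc-host"); // hypothetical endpoint
framer.pipe(socket);

framer.write(Buffer.from("first command"));
framer.write(Buffer.from("second command"));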

What is Streams3 in Node.js and how does it differ from Streams2?

I've often heard of Streams2 and old-streams, but what is Streams3? It gets mentioned in this talk by Thorsten Lorenz.
Where can I read about it, and what is the difference between Streams2 and Streams3?
Doing a search on Google, I also see it mentioned in the Changelog of Node 0.11.5,
stream: Simplify flowing, passive data listening (streams3) (isaacs)
I'm going to give this a shot, but I've probably got it wrong. Having never written Streams1 (old-streams) or Streams2, I'm probably not the right guy to self-answer this one, but here goes. It seems as if the Streams1 API still persists to some degree. In Streams2, there are two modes of streams: flowing (legacy) and non-flowing. In short, the shim that supported flowing mode is going away. This was the message that led to the patch now called Streams3,
Same API as streams2, but remove the confusing modality of flowing/old
mode switch.
Every time read() is called, and returns some data, a data event fires.
resume() will make it call read() repeatedly. Otherwise, no change.
pause() will make it stop calling read() repeatedly.
pipe(dest) and on('data', fn) will automatically call resume().
No switches into old-mode. There's only flowing, and paused. Streams start out paused.
Unfortunately, to understand that description, which defines Streams3 pretty well, you first need to understand Streams1 and the legacy streams.
Backstory
First, let's take a look at what the Node v0.10.25 docs say about the two modes,
Readable streams have two "modes": a flowing mode and a non-flowing mode. When in flowing mode, data is read from the underlying system and provided to your program as fast as possible. In non-flowing mode, you must explicitly call stream.read() to get chunks of data out. — Node v0.10.25 Docs
Isaac Z. Schlueter said in slides from November that I dug up:
streams2
"suck streams"
Instead of 'data' events spewing, call read() to pull data from source
Solves all problems (that we know of)
So it seems as if in Streams1, you'd create an object and call .on('data', cb) on that object. This would set the event to fire, and then you were at the mercy of the stream. In Streams2, streams internally have buffers and you request data from those streams explicitly (using .read()). Isaac goes on to specify how backwards compatibility works in Streams2 to keep Streams1 (old-stream) modules functioning:
old-mode streams1 shim
New streams can switch into old-mode, where they spew 'data'
If you add a 'data' event handler, or call pause() or resume(), then switch
Making minimal changes to existing tests to keep us honest
So in Streams2, a call to .pause() or .resume() (or attaching a 'data' handler) triggers the shim. And it should, right? In Streams2 you have control over when to .read(), and you're not catching stuff being thrown at you. Triggering the shim switched the stream into a legacy mode that acted independently of Streams2.
Let's take an example from Isaac's slide,
createServer(function(q, s) {
  // ADVISORY only!
  q.pause()
  session(q, function(ses) {
    q.on('data', handler)
    q.resume()
  })
})
In Streams1, q starts reading and emitting right away (likely losing data), until the call to q.pause() advises q to stop pulling in new data, though not from emitting events for what it has already read.
In Streams2, q starts off paused until the call to .pause() which signifies to emulate the old mode.
In Streams3, q starts off paused, having never read from the file handle, which makes q.pause() a no-op; the call to q.on('data', cb) then resumes the stream, which calls read() repeatedly and emits data until there is no more data in the buffer. The explicit q.resume() afterwards just does the same thing.
Seems like Streams3 was introduced in io.js, then in Node 0.11+
Streams 1 supported data being pushed to a stream. There was no consumer control; data was thrown at the consumer whether it was ready or not.
Streams 2 allows data to be pushed to a stream as per Streams 1, or for a consumer to pull data from a stream as needed. The consumer can control the flow of data in pull mode (using stream.read() when notified of available data). The stream cannot support both push and pull at the same time.
Streams 3 allows pull and push data on the same stream.
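As a rough illustration of the two consumption styles that coexist under Streams3 (using the modern Readable.from helper just for brevity, which postdates the streams3 discussion):

const { Readable } = require("stream");

const source = Readable.from(["a", "b", "c"]);

// Pull style (paused mode): ask for data explicitly when it becomes available.
source.on("readable", () => {
  let chunk;
  while ((chunk = source.read()) !== null) {
    console.log("pulled:", chunk);
  }
});

// Push style (flowing mode) would instead be:
//   source.on("data", chunk => console.log("pushed:", chunk));
// Attaching a 'data' listener (or calling resume()) switches the stream to flowing mode.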
Great overview here:
https://strongloop.com/strongblog/whats-new-io-js-beta-streams3/
A cached version (accessed 8/2020) is here: https://hackerfall.com/story/whats-new-in-iojs-10-beta-streams-3
I suggest you read the documentation, more specifically the section "API for Stream Consumers"; it's actually very understandable. Besides, I think the other answer is wrong: http://nodejs.org/api/stream.html#stream_readable_read_size

Resources