Close readable stream to FIFO in NodeJS

I am creating a readable stream to a Linux FIFO in Node.js like this:
var stream = FS.createReadStream('fifo');
This all works well and I can receive the data from the fifo just fine.
My problem is that I want to have a method to shut my software down gently and therefore I need to close this stream somehow.
Calling
process.exit();
has no effect because the stream is blocking.
I also tried to destroy the stream manually by calling the undocumented methods stream.close() and stream.destroy(), as described in the answers to this question.
I know that I could kill my own process using process.kill(process.pid, 'SIGKILL') but this feels like a really bad hack and could have bad impacts on the filesystem or database.
Isn't there a better way to achieve this?
You can try this minimal example to reproduce my problem:
var FS = require('fs');
console.log("Creating readable stream on fifo ...");
var stream = FS.createReadStream('fifo');
stream.once('close', function() {
    console.log("The close event was emitted.");
});
stream.close();
stream.destroy();
process.exit();
Before running it, create a fifo called 'fifo' using mkfifo fifo.
How could I modify the above code to shutdown the software correctly?

Explicitly writing to the named pipe will unblock the read operation, for example:
require('child_process').execSync('echo "" > fifo');
process.exit();
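For instance, a graceful shutdown could be wired to a signal handler. This is only a sketch building on the trick above; the shutdown() helper and the SIGINT handler are illustrative additions, not part of the original answer:

    var FS = require('fs');
    var childProcess = require('child_process');

    var stream = FS.createReadStream('fifo');

    function shutdown() {
        // an empty write wakes up the blocked reader on the fifo ...
        childProcess.execSync('echo "" > fifo');
        // ... so the stream can now be torn down and the process can exit
        stream.destroy();
        process.exit();
    }

    // e.g. let Ctrl+C trigger the graceful exit
    process.on('SIGINT', shutdown);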

Related

What is the advantage of using pipe function over res.write

The framework is Express.
When I'm sending a request from within an endpoint and start receiving data, I can either read the data in chunks and write them immediately:
responseHandler.on('data', (chunk) => {
    res.write(chunk);
});
Or I can create a writable stream and pipe the response to that.
responseHandler.pipe(res)
Obviously the pipe function takes care of the former process, and adds more dimensions to it. What are they?
The most important difference between managing event handlers yourself and using readable.pipe(writable) is that with pipe:
"The flow of data will be automatically managed so that the destination Writable stream is not overwhelmed by a faster Readable stream." (pipe documentation)
It means the readable stream may be faster than the writable one, and pipe handles that backpressure logic for you. If you are writing code like:
responseHandler.on('data', (chunk) => {
    res.write(chunk);
});
then you should check what the res.write() function returns:
"Returns: (boolean) false if the stream wishes for the calling code to wait for the 'drain' event to be emitted before continuing to write additional data; otherwise true." (writable.write documentation)
It means the writable stream may not be ready to handle more data, so you have to manage this manually, as mentioned in the writable.write() example.
In some cases you do not have a readable stream, and you can write to the writable stream directly using writable.write().
Example
const data = []; // array of some data.
data.forEach((d) => writable.write(d));
But again, you must check what writable.write() returns. If it is false, you have to throttle the flow manually.
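For example, a manual backpressure loop might look like this (writeAll is a made-up helper name; data is assumed to be an array of chunks):

    function writeAll(writable, data, done) {
        var i = 0;
        function writeNext() {
            while (i < data.length) {
                var ok = writable.write(data[i++]);
                if (!ok) {
                    // internal buffer is full: wait for 'drain' before continuing
                    writable.once('drain', writeNext);
                    return;
                }
            }
            done();
        }
        writeNext();
    }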
Another way is to wrap your data in a readable stream and just pipe it.
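On newer Node versions (12+), stream.Readable.from() can do that wrapping for you; a small sketch, assuming data is an array of strings or buffers:

    const { Readable } = require('stream');

    // wrap the array in a readable stream and let pipe() handle backpressure
    Readable.from(data).pipe(writable);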
By the way, there is one more great advantage of using pipes: you can chain them as needed, for instance:
readableStream
    .pipe(modify) // transform stream
    .pipe(zip)    // transform stream
    .pipe(writableStream);
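To make that concrete, here is a sketch with real transform streams (the file names and the upperCase transform are illustrative, not from the original answer):

    const fs = require('fs');
    const zlib = require('zlib');
    const { Transform } = require('stream');

    // a tiny transform stream standing in for "modify"
    const upperCase = new Transform({
        transform(chunk, encoding, callback) {
            callback(null, chunk.toString().toUpperCase());
        }
    });

    fs.createReadStream('input.txt')
        .pipe(upperCase)            // transform stream
        .pipe(zlib.createGzip())    // transform stream ("zip")
        .pipe(fs.createWriteStream('input.txt.gz'));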
To sum up: piggyback on Node.js's built-in functionality to manage streams whenever possible. In most cases it will help you avoid extra complexity, and it will not be slower than managing everything manually.

Node pipe to stdout -- how do I tell if drained?

The standard advice on determining whether you need to wait for the 'drain' event on process.stdout is to check whether it returns false when you write to it.
How should I check if I've piped another stream to it? It would seem that that stream can emit 'finish' before all the output is actually written. Can I do something like this?
upstreamOfStdout.on('finish', function(){
    if(!process.stdout.write('')) {
        process.stdout.on('drain', function() { done("I'm done"); });
    }
    else {
        done("I'm done");
    }
});
upstreamOfStdout.pipe(process.stdout);
I prefer an answer that doesn't depend on the internals of any streams. Just given that the streams conform to the node stream interface, what is the canonical way to do this?
EDIT:
The larger context is a wrapper:
new Promise(function(resolve, reject){
    stream.on(<some-event>, resolve);
    ... (perhaps something else here?)
});
where stream can be process.stdout or something else, which has another through stream piped into it.
My program exits whenever resolve is called -- I presume the Promise code keeps the program alive until all promises have been resolved.
I have encountered this situation several times, and have always used hacks to solve the problem (e.g. there are several private members in process.stdout that are useful). But I really would like to solve this once and for all (or learn that it is a bug, so I can track the issue and fix my hacks when it's resolved, at least): how do I tell when a stream downstream of another is finished processing its input?
Instead of writing directly to process.stdout, create a custom writable (shown below) which writes to stdout as a side effect.
const { Writable } = require('stream');

function writeStdoutAndFinish(){
    return new Writable({
        write(chunk, encoding, callback) {
            // forward the chunk to stdout and complete once stdout has accepted it
            process.stdout.write(chunk, callback);
        },
    });
};
The result of writeStdoutAndFinish() will emit a 'finish' event.
async function main(){
    ...
    await new Promise((resolve)=>{
        someReadableStream.pipe(writeStdoutAndFinish()).on('finish',()=>{
            console.log('finish received');
            resolve();
        })
    });
    ...
}
In practice, I don't think that the above approach differs in behavior from
async function main(){
    ...
    await new Promise((resolve)=>{
        (someReadableStream.on('end',()=>{
            console.log('end received');
            resolve();
        })).pipe(process.stdout)
    });
    ...
}
First of all, as far as I can see from the documentation, that stream never emits the 'finish' event, so it is unlikely you can rely on that.
Moreover, according to the documentation mentioned above, the 'drain' event is used to notify the user when the stream is ready to accept more data after the .write method has returned false. In any case you can deduce that this means all the other data has been written. Indeed, from the documentation for the write method we can deduce that the false return value (meaning "please stop pushing data") is not mandatory and you can freely ignore it, but subsequent data will probably be buffered in memory, letting its usage grow.
Because of that, basing my assumption on the documentation alone, I guess you can rely on the 'drain' event to know when all the data has been handled or is about to be flushed out.
That said, it also looks to me that there is no clear way to know definitively when all the data has actually been sent to the console.
Finally, you can listen for the 'end' event of the piped stream to know when it has been fully consumed, regardless of whether it has been written to the console or the data is still buffered within the console stream.
Of course, you can also simply ignore the problem, since a fully consumed stream will be handled and discarded by node.js, and you no longer have to deal with it once you have piped it into the second stream.
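A minimal sketch of that last suggestion, assuming source is the stream piped into process.stdout (the name is a placeholder, not from the original answer):

    async function main() {
        await new Promise(function(resolve) {
            // resolves once the source stream is fully consumed, even if
            // stdout still has data buffered internally
            source.on('end', resolve);
            source.pipe(process.stdout);
        });
    }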

Does the new way to read streams in Node cause blocking?

The documentation for node suggests that the new best way to read streams is as follows:
var readable = getReadableStreamSomehow();
readable.on('readable', function() {
    var chunk;
    while (null !== (chunk = readable.read())) {
        console.log('got %d bytes of data', chunk.length);
    }
});
To me this seems to cause a blocking while loop. This would mean that if node is responding to an http request by reading and sending a file, the process would have to block while the chunk is read before it could be sent.
Isn't this blocking IO which node.js tries to avoid?
The important thing to note here is that it's not blocking in the sense that it's waiting for more input to arrive on the stream. It's simply retrieving the current contents of the stream's internal buffer. This kind of loop will finish pretty quickly since there is no waiting on I/O at all.
A stream can be either synchronous or asynchronous. If a readable stream synchronously pushes data into the internal buffer, you'll get a synchronous stream. And yes, in that case, if it pushes lots of data synchronously, node's event loop won't be able to run until all the data has been pushed.
Interestingly, even if you remove the while loop in the readable callback, the stream module internally runs a while loop once and keeps going until all the pushed data has been read.
But asynchronous I/O operations (e.g. the http or fs modules) push data into the buffer asynchronously. So the while loop only runs when data has been pushed into the buffer and stops as soon as you've read the entire buffer.
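As a rough illustration of the difference (not from the original answer; the chunk counts are arbitrary):

    const { Readable } = require('stream');

    // synchronous push: _read() refills the buffer without yielding,
    // so the event loop cannot run until null is pushed
    let i = 0;
    const syncStream = new Readable({
        read() {
            this.push(i < 3 ? `sync chunk ${i++}\n` : null);
        }
    });

    // asynchronous push: each chunk arrives on a later event-loop turn,
    // so other work can run in between
    let j = 0;
    const asyncStream = new Readable({
        read() {
            setImmediate(() => this.push(j < 3 ? `async chunk ${j++}\n` : null));
        }
    });

    syncStream.pipe(process.stdout);
    asyncStream.pipe(process.stdout);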

Using callbacks with Socket IO

I'm using node and socket.io to stream a twitter feed to the browser, but the stream is too fast. In order to slow it down, I'm attempting to use setInterval, but it either only delays the start of the stream (without setting evenly spaced intervals between the tweets) or says that I can't use callbacks when broadcasting. Server-side code below:
function start(){
    stream.on('tweet', function(tweet){
        if(tweet.coordinates && tweet.coordinates != null){
            io.sockets.emit('stream', tweet);
        }
    });
}

io.sockets.on("connection", function(socket){
    console.log('connected');
    setInterval(start, 4000);
});
I think you're misunderstanding how .on() works for streams. It's an event handler. Once it is installed, it's there and the stream can call you at any time. Your interval is actually just making things worse because it's installing multiple .on() handlers.
It's unclear what you mean by "data coming too fast". Too fast for what? If it's just faster than you want to display it, then you can just store the tweets in an array and then use timers to decide when to display things from the array.
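A rough sketch of that idea (the queue and the 4-second interval are assumptions layered on the question's own stream and io objects):

    var queue = [];

    stream.on('tweet', function(tweet) {
        if (tweet.coordinates) {
            queue.push(tweet);          // just buffer it, don't emit yet
        }
    });

    setInterval(function() {
        var tweet = queue.shift();      // oldest buffered tweet, if any
        if (tweet) {
            io.sockets.emit('stream', tweet);
        }
    }, 4000);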
If data from a stream is coming too quickly to even store and this is a flowing nodejs stream, then you can pause the stream with the .pause() method and then, when you're able to go again, you can call .resume(). See http://nodejs.org/api/stream.html#stream_readable_pause for more info.

How to transfer/stream big data from/to child processes in node.js without using the blocking stdio?

I have a bunch of (child)processes in node.js that need to transfer large amounts of data.
When I read the manual it says that the stdio and IPC interfaces between them are blocking, so that won't do.
I'm looking into using file descriptors, but I cannot find a way to stream from them (see my other, more specific question: How to stream to/from a file descriptor in node?).
I think I might use a net socket, but I fear that has unwanted overhead.
I also saw this question, but it is not the same (and has no answers): How to send huge amounts of data from child process to parent process in a non-blocking way in Node.js?
I found a solution that seems to work: when spawning the child process you can pass options for stdio and set up a pipe to stream data.
The trick is to add an additional element to the stdio array and set it to 'pipe'.
In the parent process, stream to and from child.stdio[3]:
var opts = {
    stdio: [process.stdin, process.stdout, process.stderr, 'pipe']
};
var child = child_process.spawn('node', ['./child.js'], opts);

// send data
mySource.pipe(child.stdio[3]);

// read data
child.stdio[3].pipe(myHandler);
In the child, open streams on file descriptor 3:
// read from it
var readable = fs.createReadStream(null, {fd: 3});
// write to it
var writable = fs.createWriteStream(null, {fd: 3});
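Putting the child side together might look roughly like this (myTransform is a placeholder for whatever processing the child does; it is not from the original answer):

    var fs = require('fs');

    // requests come in on fd 3 ...
    var input = fs.createReadStream(null, {fd: 3});
    // ... and results go back out on the same descriptor
    var output = fs.createWriteStream(null, {fd: 3});

    input.pipe(myTransform).pipe(output);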
Note that not every stream you get from npm works correctly here. I tried JSONStream.stringify(), which created errors, but it worked after I piped it via through2 (no idea why that is).
Edit: some observations: it seems the pipe is not always a Duplex stream, so you might need two pipes. And there is something weird going on where in one case it only works if I also have an IPC channel, so six in total: [stdin, stdout, stderr, pipe, pipe, ipc].
