Trying to understand readable streams in node.js - node.js

In my module.js I have
var Stream = require('stream');
module.exports = function () {
var stream = new Stream();
stream.readable = true;
stream.emit('data', 'some stuff')
stream.emit('end')
return stream;
}
and in my index.js
var module = require('./module')
module().pipe(process.stdout)
substack's example from the stream handbook is working just fine. Why doesn't my code show anything in the command line?

Because you are emitting data before calling pipe, so the 'data' listener is attached only after the first 'data' event has already fired.
EventEmitter calls are synchronous (like almost everything else in node.js that is not I/O).
A slightly simplified version of
stream.emit('data', 'some stuff')
stream.pipe(process.stdout)
without EventEmitter could be rewritten as
stream.listeners = {};
// 'emit' call: no listener is registered yet, so nothing happens
var ondata = stream.listeners.data;
if (ondata) {
// only one listener case in the example
ondata('some stuff');
}
// 'pipe' call: the listener is registered too late
stream.listeners.data = function(buff) {
process.stdout.write(buff);
}
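One way to avoid the ordering problem is to let the stream buffer the data until a consumer attaches. The following is a minimal sketch, assuming a reasonably recent Node.js where Readable accepts a read option; makeStream is a hypothetical name standing in for the exported function in module.js.

```javascript
const { Readable } = require('stream');

// Hypothetical replacement for the function exported from module.js.
function makeStream() {
  // No-op read(): the data is pushed manually below.
  const stream = new Readable({ read() {} });
  stream.push('some stuff\n'); // buffered internally until a consumer attaches
  stream.push(null);           // signals the end of the stream
  return stream;
}

// The buffered chunk is delivered even though pipe() is called afterwards.
makeStream().pipe(process.stdout);
```

Unlike a bare emit('data'), push() stores the chunk in the stream's internal buffer, so it does not matter that the consumer shows up later.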

Related

How to properly close a writable stream in Node js?

I'm quite new to JavaScript. I'm using a Node.js writable stream to write a .txt file. It works well, but I cannot understand how to properly close the file, as its content is blank as long as the program is running. In more detail, I need to read from that .txt file after it has been written, but doing it this way returns an empty buffer.
let myWriteStream = fs.createWriteStream("./filepath.txt");
myWriteStream.write(stringBuffer + "\n");
myWriteStream.on('close', () => {
console.log('close event emitted');
});
myWriteStream.end();
// do things..
let data = fs.readFileSync("./filepath.txt").toString().split("\n");
Seems like the event emitted by the .end() method is triggered after the file reading, causing it to be read as empty. If I put a while() to wait for the event to be triggered, so that I know for sure the stream is closed before the reading, the program waits forever.
Do you have any clue of what I'm doing wrong?
You're missing two things: first, check the return value of write(), which signals backpressure;
then wait for the stream's 'finish' event before reading the file.
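const { readFileSync, createWriteStream } = require('fs')
const stringBuffer = readFileSync('index.js')
const filePath = "./filepath.txt"
const myWriteStream = createWriteStream(filePath)
// write() returns false when the internal buffer is full. The chunk is still
// queued, so it must not be retried; wait for 'drain' before writing more.
const canWriteMore = myWriteStream.write(stringBuffer + "\n");
if (!canWriteMore) {
myWriteStream.once('drain', () => console.log('drain event emitted'));
}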
myWriteStream.on('close', () => {
console.log('close event emitted');
});
myWriteStream.on('finish', () => {
console.log('finish event emitted');
let data = readFileSync(filePath).toString().split("\n");
console.log(data);
});
myWriteStream.end();
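The same idea can be written with async/await. This is a sketch rather than the answerer's code: it assumes events.once is available (Node.js 11.13+), and writeThenRead and the temporary file name are hypothetical names chosen for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { once } = require('events');

// Hypothetical helper: writes the lines to filePath, waits until the stream
// has finished, then reads the file back and returns its lines.
async function writeThenRead(filePath, lines) {
  const out = fs.createWriteStream(filePath);
  for (const line of lines) {
    // write() returns false when the internal buffer is full; wait for 'drain'.
    if (!out.write(line + '\n')) {
      await once(out, 'drain');
    }
  }
  out.end();
  // 'finish' fires once all data has been flushed to the underlying file.
  await once(out, 'finish');
  return fs.readFileSync(filePath, 'utf8').split('\n');
}

const demoPath = path.join(os.tmpdir(), 'filepath-demo.txt');
const result = writeThenRead(demoPath, ['first', 'second']);
result.then((data) => console.log(data));
```

Because the read happens only after the awaited 'finish' event, there is no need for a busy-wait loop, which would block the event loop and prevent the event from ever firing.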

NodeJS Stream flushed during the Event Loop iteration

I'm trying to pipe one Stream Axios Response into multiple files. It's not working, and I can reproduce it with the simple code below:
Will work:
const { PassThrough } = require('stream')
const inputStream = new PassThrough()
inputStream.write('foo')
// Now I have a stream with content
inputStream.pipe(process.stdout)
inputStream.pipe(process.stderr)
// will print 'foofoo', for both stdout and stderr
Will not work:
const { PassThrough } = require('stream')
const inputStream = new PassThrough()
inputStream.write('foo')
inputStream.pipe(process.stdout)
setImmediate(() => {
inputStream.pipe(process.stderr)
})
// Will print only 'foo'
The question is: can I say that the content already in the stream will be piped only if the two pipe calls execute in the same event-loop iteration?
Doesn't that make the situation non-deterministic?
By the time the callback scheduled with setImmediate is executed, the stream data has already been flushed. This can be checked with the stream's .readableLength property.
You can use cork and uncork in order to control when the buffered stream data is flushed.
const { PassThrough } = require('stream')
const inputStream = new PassThrough()
inputStream.cork()
inputStream.write('foo')
inputStream.pipe(process.stdout)
setImmediate(() => {
inputStream.pipe(process.stderr)
inputStream.uncork()
})
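The readableLength check mentioned above can be sketched as follows. The sink is a stand-in destination introduced only for this illustration (it is not part of the original answer), so that the flushed data can be inspected.

```javascript
const { PassThrough, Writable } = require('stream');

// Stand-in destination that records what it receives (an assumption for this sketch).
const received = [];
const sink = new Writable({
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    callback();
  },
});

const inputStream = new PassThrough();
inputStream.write('foo');

// The chunk sits in the readable-side buffer until something consumes it.
const lengthBeforePipe = inputStream.readableLength;
console.log('before pipe:', lengthBeforePipe);

inputStream.pipe(sink);

setImmediate(() => {
  // By now the buffered chunk has been flushed to the first destination,
  // so a destination piped here would not see it.
  console.log('in setImmediate:', inputStream.readableLength, received);
});
```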

How to mock streams in NodeJS

I'm attempting to unit test one of my Node.js modules, which deals heavily in streams. I'm trying to mock a stream (that I will write to), as within my module I have .on('data') and .on('end') listeners that I would like to trigger. Essentially I want to be able to do something like this:
var mockedStream = new require('stream').Readable();
mockedStream.on('data', function withData(data) {
console.dir(data);
});
mockedStream.on('end', function() {
console.dir('goodbye');
});
mockedStream.push('hello world');
mockedStream.close();
This executes, but the 'on' event never gets fired after I do the push (and .close() is invalid).
All the guidance I can find on streams uses the 'fs' or 'net' library as a basis for creating a new stream (https://github.com/substack/stream-handbook), or they mock it out with sinon, but the mocking gets very lengthy very quickly.
Is there a nice way to provide a dummy stream like this?
There's a simpler way: stream.PassThrough
I've just found Node's very easy to miss stream.PassThrough class, which I believe is what you're looking for.
From Node docs:
The stream.PassThrough class is a trivial implementation of a Transform stream that simply passes the input bytes across to the output. Its purpose is primarily for examples and testing...
The code from the question, modified:
const { PassThrough } = require('stream');
const mockedStream = new PassThrough(); // <----
mockedStream.on('data', (d) => {
console.dir(d);
});
mockedStream.on('end', function() {
console.dir('goodbye');
});
mockedStream.emit('data', 'hello world');
mockedStream.end(); // <-- end. not close.
mockedStream.destroy();
mockedStream.push() works too, but the data arrives as a Buffer, so you might want to do: console.dir(d.toString());
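As a variation on the snippet above, a PassThrough mock can also be fed through its writable side with write(), which sends the data through the stream itself rather than only notifying listeners. This is a minimal sketch, not the answerer's code; the seen array is a hypothetical name used to record what the listeners observe.

```javascript
const { PassThrough } = require('stream');

const mockedStream = new PassThrough();
const seen = []; // hypothetical record of what the listeners observed

mockedStream.on('data', (d) => {
  // Chunks written as strings arrive as Buffers by default.
  seen.push(d.toString());
});
mockedStream.on('end', () => {
  seen.push('goodbye');
  console.log(seen);
});

mockedStream.write('hello world'); // goes through the stream, so pipes see it too
mockedStream.end();                // triggers 'end' once the data is consumed
```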
Instead of using Push, I should have been using ".emit(<event>, <data>);"
My mock code now works and looks like:
var mockedStream = new require('stream').Readable();
mockedStream._read = function(size) { /* do nothing */ };
myModule.functionIWantToTest(mockedStream); // has .on() listeners in it
mockedStream.emit('data', 'Hello data!');
mockedStream.emit('end');
The accepted answer is only partially correct. If all you need is for events to fire, using .emit('data', datum) is okay, but if you need to pipe this mock stream anywhere else, it won't work.
Mocking a Readable stream is surprisingly easy, requiring only the Readable lib.
const { Readable } = require('stream');
let eventCount = 0;
const mockEventStream = new Readable({
objectMode: true,
read: function (size) {
if (eventCount < 10) {
eventCount = eventCount + 1;
return this.push({message: `event${eventCount}`})
} else {
return this.push(null);
}
}
});
Now you can pipe this stream wherever and 'data' and 'end' will fire.
Another example from the node docs:
https://nodejs.org/api/stream.html#stream_an_example_counting_stream
Building on @flacnut's answer, I did this (in NodeJS 12+) using Readable.from() to construct a stream preloaded with data (a list of filenames):
const mockStream = require('stream').Readable.from([
'file1.txt',
'file2.txt',
'file3.txt',
])
In my case, I wanted to mock the stream of filenames returned by fast-glob.stream:
const glob = require('fast-glob')
// inject the mock stream into glob module
glob.stream = jest.fn().mockReturnValue(mockStream)
In the function being tested:
const stream = glob.stream(globFilespec)
for await (const filename of stream) {
// filename = file1.txt, then file2.txt, then file3.txt
}
Works like a charm!
Here's a simple implementation which uses jest.fn() where the goal is to validate what has been written to the stream created by fs.createWriteStream(). The nice thing about jest.fn() is that although the calls to fs.createWriteStream() and stream.write() are inline in this test function, these functions don't need to be called directly by the test.
const fs = require('fs');
const mockStream = {}
test('mock fs.createWriteStream with mock implementation', async () => {
const createMockWriteStream = (filename, args) => {
return mockStream;
}
mockStream.write = jest.fn();
fs.createWriteStream = jest.fn(createMockWriteStream);
const stream = fs.createWriteStream('foo.csv', {'flags': 'a'});
await stream.write('foobar');
expect(fs.createWriteStream).toHaveBeenCalledWith('foo.csv', {'flags': 'a'});
expect(mockStream.write).toHaveBeenCalledWith('foobar');
})

What does events/EventEmitter do in a nodejs constructor function

I am learning node.js. On the nodejs api website there is a piece of code that I don't really get.
The link is here
var util = require("util");
var events = require("events");
function MyStream() {
events.EventEmitter.call(this);
}
util.inherits(MyStream, events.EventEmitter);
MyStream.prototype.write = function(data) {
this.emit("data", data);
}
var stream = new MyStream();
console.log(stream instanceof events.EventEmitter); // true
console.log(MyStream.super_ === events.EventEmitter); // true
stream.on("data", function(data) {
console.log('Received data: "' + data + '"');
})
stream.write("It works!"); // Received data: "It works!"
so the confusing part is
events.EventEmitter.call(this);
What does it do here?
MyStream is a new object declaration that inherits behaviors from events.EventEmitter as can be seen from this line where the inheritance is configured:
util.inherits(MyStream, events.EventEmitter);
So, when the MyStream constructor is invoked usually via something like var stream = new MyStream();, it needs to also invoke the constructor of the object that it inherits from so the parent object can initialize itself properly. That's what this line is:
events.EventEmitter.call(this);
events.EventEmitter is the constructor of the object that MyStream inherits from. events.EventEmitter.call(this) instructs JavaScript to call that constructor with the this pointer set to this object.
If you need more help with understanding .call(), you can read this MDN reference.
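For comparison, here is a sketch of the same MyStream written with ES2015 class syntax, where super() plays the role of events.EventEmitter.call(this). The received array is a hypothetical name added only to make the result easy to inspect.

```javascript
const { EventEmitter } = require('events');

class MyStream extends EventEmitter {
  constructor() {
    // Runs the parent constructor on this instance, which is what
    // events.EventEmitter.call(this) does in the util.inherits version.
    super();
  }

  write(data) {
    this.emit('data', data);
  }
}

const received = []; // hypothetical record of delivered chunks
const stream = new MyStream();
stream.on('data', (data) => received.push(data));
stream.write('It works!');

console.log(received);                       // [ 'It works!' ]
console.log(stream instanceof EventEmitter); // true
```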

Is it possible to register multiple listeners to a child process's stdout data event? [duplicate]

I need to run two commands in series that need to read data from the same stream.
After piping a stream into another, the buffer is emptied, so I can't read data from that stream again, so this doesn't work:
var spawn = require('child_process').spawn;
var fs = require('fs');
var request = require('request');
var inputStream = request('http://placehold.it/640x360');
var identify = spawn('identify',['-']);
inputStream.pipe(identify.stdin);
var chunks = [];
identify.stdout.on('data',function(chunk) {
chunks.push(chunk);
});
identify.stdout.on('end',function() {
var size = getSize(Buffer.concat(chunks)); //width
var convert = spawn('convert',['-','-scale',size * 0.5,'png:-']);
inputStream.pipe(convert.stdin);
convert.stdout.pipe(fs.createWriteStream('half.png'));
});
function getSize(buffer){
return parseInt(buffer.toString().split(' ')[2].split('x')[0]);
}
Request complains about this
Error: You cannot pipe after data has been emitted from the response.
and changing the inputStream to fs.createReadStream yields the same issue of course.
I don't want to write into a file but reuse in some way the stream that request produces (or any other for that matter).
Is there a way to reuse a readable stream once it finishes piping?
What would be the best way to accomplish something like the above example?
You have to duplicate the stream by piping it to two streams. You can create a simple stream with PassThrough, which simply passes the input to the output.
const spawn = require('child_process').spawn;
const PassThrough = require('stream').PassThrough;
const a = spawn('echo', ['hi user']);
const b = new PassThrough();
const c = new PassThrough();
a.stdout.pipe(b);
a.stdout.pipe(c);
let count = 0;
b.on('data', function (chunk) {
count += chunk.length;
});
b.on('end', function () {
console.log(count);
c.pipe(process.stdout);
});
Output:
8
hi user
The first answer only works if streams take roughly the same amount of time to process data. If one takes significantly longer, the faster one will request new data, consequently overwriting the data still being used by the slower one (I had this problem after trying to solve it using a duplicate stream).
The following pattern worked very well for me. It uses a library based on Stream2 streams, Streamz, and Promises to synchronize async streams via a callback. Using the familiar example from the first answer:
spawn = require('child_process').spawn;
pass = require('stream').PassThrough;
streamz = require('streamz').PassThrough;
var Promise = require('bluebird');
a = spawn('echo', ['hi user']);
b = new pass;
c = new pass;
a.stdout.pipe(streamz(combineStreamOperations));
function combineStreamOperations(data, next){
Promise.join(b, c, function(b, c){ //perform n operations on the same data
next(); //request more
});
}
count = 0;
b.on('data', function(chunk) { count += chunk.length; });
b.on('end', function() { console.log(count); c.pipe(process.stdout); });
You can use this small npm package I created:
readable-stream-clone
With this you can reuse readable streams as many times as you need
For general problem, the following code works fine
var PassThrough = require('stream').PassThrough
a=PassThrough()
b1=PassThrough()
b2=PassThrough()
a.pipe(b1)
a.pipe(b2)
b1.on('data', function(data) {
console.log('b1:', data.toString())
})
b2.on('data', function(data) {
console.log('b2:', data.toString())
})
a.write('text')
I have a different solution that writes to two streams simultaneously. Naturally, the time to write will be the sum of the two times, but I use it to respond to a download request where I want to keep a copy of the downloaded file on my server (I actually use an S3 backup, so I cache the most-used files locally to avoid multiple file transfers).
/**
* A utility class made to write to a file while answering a file download request
*/
class TwoOutputStreams {
constructor(streamOne, streamTwo) {
this.streamOne = streamOne
this.streamTwo = streamTwo
}
setHeader(header, value) {
if (this.streamOne.setHeader)
this.streamOne.setHeader(header, value)
if (this.streamTwo.setHeader)
this.streamTwo.setHeader(header, value)
}
write(chunk) {
this.streamOne.write(chunk)
this.streamTwo.write(chunk)
}
end() {
this.streamOne.end()
this.streamTwo.end()
}
}
You can then use this as a regular OutputStream
const twoStreamsOut = new TwoOutputStreams(fileOut, responseStream)
and pass it to your method as if it were a response or a fileOutputStream
If you have async operations on the PassThrough streams, the answers posted here won't work.
A solution that works for async operations includes buffering the stream content and then creating streams from the buffered result.
To buffer the result you can use concat-stream
const Promise = require('bluebird');
const concat = require('concat-stream');
const getBuffer = function(stream){
return new Promise(function(resolve, reject){
var gotBuffer = function(buffer){
resolve(buffer);
}
var concatStream = concat(gotBuffer);
stream.on('error', reject);
stream.pipe(concatStream);
});
}
To create streams from the buffer you can use:
const { Readable } = require('stream');
const getBufferStream = function(buffer){
const stream = new Readable();
stream.push(buffer);
stream.push(null);
return Promise.resolve(stream);
}
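The buffer-then-replay approach can also be sketched with built-in modules only, assuming a Node.js version that supports async iteration of streams and Readable.from() (12+). bufferStream and streamFromBuffer are hypothetical helper names, not part of concat-stream or bluebird.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: drains a readable stream into a single Buffer.
async function bufferStream(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Hypothetical helper: returns a fresh readable stream that replays the buffer.
function streamFromBuffer(buffer) {
  // Wrapping the buffer in an array makes the stream emit it as one chunk.
  return Readable.from([buffer]);
}

const source = Readable.from(['hello ', 'world']);
const replayed = bufferStream(source).then(async (buffer) => {
  // The buffered content can back as many independent streams as needed.
  const first = await bufferStream(streamFromBuffer(buffer));
  const second = await bufferStream(streamFromBuffer(buffer));
  return [first.toString(), second.toString()];
});
replayed.then((copies) => console.log(copies));
```

The trade-off is memory: the whole content is held at once, so this suits bounded inputs rather than never-ending streams.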
What about piping into two or more streams, but not at the same time?
For example:
var fs = require('fs');
var PassThrough = require('stream').PassThrough;
var mybinaryStream = stream.start(); //never ending audio stream
var file1 = fs.createWriteStream('file1.wav',{encoding:'binary'})
var file2 = fs.createWriteStream('file2.wav',{encoding:'binary'})
var mypass = new PassThrough()
mybinaryStream.pipe(mypass)
mypass.pipe(file1)
setTimeout(function(){
mypass.pipe(file2);
},2000)
The above code does not produce any errors, but file2 is empty.
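As general background for this kind of setup, a destination attached later receives only the chunks that flow after it is attached; earlier chunks are not replayed. The sketch below illustrates that behaviour with in-memory collectors instead of files. The collector helper and the variable names are assumptions made for this example, and it does not reproduce the audio source from the question.

```javascript
const { PassThrough, Writable } = require('stream');

// Hypothetical helper: a writable that records each chunk in the given array.
function collector(store) {
  return new Writable({
    write(chunk, encoding, callback) {
      store.push(chunk.toString());
      callback();
    },
  });
}

const early = [];
const late = [];
const earlySink = collector(early);
const lateSink = collector(late);

const source = new PassThrough();
source.pipe(earlySink);
source.write('one');

setImmediate(() => {
  // Attached after 'one' has already flowed to earlySink.
  source.pipe(lateSink);
  source.write('two');
  source.end();
});

lateSink.on('finish', () => {
  console.log('early:', early);
  console.log('late:', late);
});
```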
