This question is about how to actually implement the read method of a Readable stream.
I have this implementation of a Readable stream:
import {Readable} from "stream";
this.readableStream = new Readable();
I am getting this error
events.js:136
throw er; // Unhandled 'error' event
^
Error [ERR_STREAM_READ_NOT_IMPLEMENTED]: _read() is not implemented
at Readable._read (_stream_readable.js:554:22)
at Readable.read (_stream_readable.js:445:10)
at resume_ (_stream_readable.js:825:12)
at _combinedTickCallback (internal/process/next_tick.js:138:11)
at process._tickCallback (internal/process/next_tick.js:180:9)
at Function.Module.runMain (module.js:684:11)
at startup (bootstrap_node.js:191:16)
at bootstrap_node.js:613:3
The reason the error occurs is obvious: we need to do this:
this.readableStream = new Readable({
read(size) {
return true;
}
});
I don't really understand how to implement the read method though.
The only thing that works is just calling
this.readableStream.push('some string or buffer');
But if I try to do something like this:
this.readableStream = new Readable({
read(size) {
this.push('foo'); // call push here!
return true;
}
});
then nothing happens - nothing comes out of the readable!
Furthermore, these articles say you don't need to implement the read method:
https://github.com/substack/stream-handbook#creating-a-readable-stream
https://medium.freecodecamp.org/node-js-streams-everything-you-need-to-know-c9141306be93
My question is: why does calling push inside the read method do nothing? The only thing that works for me is calling readable.push() elsewhere.
"why does calling push inside the read method do nothing? The only thing that works for me is just calling readable.push() elsewhere."
I think it's because you are not consuming it; you need to pipe it to a writable stream (e.g. stdout) or just consume it through a 'data' event:
const { Readable } = require("stream");
let count = 0;
const readableStream = new Readable({
  read(size) {
    if (count === 5) {
      this.push(null); // after 5 chunks, signal end-of-stream
    } else {
      this.push('foo');
    }
    count++;
  }
});
// piping
readableStream.pipe(process.stdout)
// through the data event
readableStream.on('data', (chunk) => {
console.log(chunk.toString());
});
Both of them should print foo 5 times (they are slightly different, though). Which one you should use depends on what you are trying to accomplish.
"Furthermore, these articles say you don't need to implement the read method."
You might not need it; this should work:
const { Readable } = require("stream");
const readableStream = new Readable();
for (let i = 0; i <= 5; i++) {
readableStream.push('foo');
}
readableStream.push(null);
readableStream.pipe(process.stdout)
In this case you can't capture it through the data event. Also, this way is not very useful or efficient, I'd say: we are just pushing all the data into the stream at once (if it's large, everything is going to be in memory) and then consuming it.
From documentation:
readable._read:
"When readable._read() is called, if data is available from the resource, the implementation should begin pushing that data into the read queue using the this.push(dataChunk) method. link"
readable.push:
"The readable.push() method is intended be called only by Readable implementers, and only from within the readable._read() method. link"
Implement the _read method after your ReadableStream's initialization:
import {Readable} from "stream";
this.readableStream = new Readable();
this.readableStream._read = function () {};
A readableStream is like a pool:
.push(data) is like pumping water into the pool.
.pipe(destination) is like connecting the pool to a pipe and pumping the water somewhere else.
_read(size) acts as the pump: it controls how much water flows and when the data ends.
fs.createReadStream() creates a read stream whose _read() is already implemented to push file data and to end at the end of the file.
_read(size) fires automatically once the pool is attached to a pipe. So if you force-call this function without connecting a destination, it pumps to nowhere, and it can disturb the internal state used by _read() (the cursor may move to the wrong place, ...).
The read() function must be created inside new Stream.Readable(). It is actually a function on an options object; it is not readableStream.read(), and implementing readableStream.read = function (size) {...} will not work.
An easy way to understand the implementation:
// Options object whose read property becomes the stream's _read()
var Reader = new Object();
Reader.read = function (size) {
  // count how many chunks have been pushed so far
  if (this.i == null) { this.i = 1; } else { this.i++; }
  this.push("abc");
  if (this.i > 7) { this.push(null); } // stop after the 8th chunk
};
const Stream = require('stream');
const renderStream = new Stream.Readable(Reader);
renderStream.pipe(process.stdout)
You can use it to render whatever stream data you want and POST it to another server.
POSTing stream data with Axios:
require('axios')({
method: 'POST',
url: 'http://127.0.0.1:3000',
headers: {'Content-Length': 1000000000000},
data: renderStream
});
Related
I'm familiar with Node streams, but I'm struggling on best practices for abstracting code that I reuse a lot into a single pipe step.
Here's a stripped down version of what I'm writing today:
inputStream
  .pipe(csv.parse({columns: true}))
  .pipe(csv.transform(function(row) { return transform(row); }))
  .pipe(csv.stringify({header: true}))
  .pipe(outputStream);
The actual work happens in transform(). The only things that really change are inputStream, transform(), and outputStream. Like I said, this is a stripped down version of what I actually use. I have a lot of error handling and logging on each pipe step, which is ultimately why I'm trying to abstract the code.
What I'm looking to write is a single pipe step, like so:
inputStream
.pipe(csvFunction(transform(row)))
.pipe(outputStream);
What I'm struggling to understand is how to turn those pipe steps into a single function that accepts a stream and returns a stream. I've looked at libraries like through2, but I'm not sure how that gets me where I'm trying to go.
You can use the PassThrough class like this:
var PassThrough = require('stream').PassThrough;
var csvStream = new PassThrough();
csvStream.on('pipe', function (source) {
// undo piping of source
source.unpipe(this);
// build own pipe-line and store internally
this.combinedStream =
source.pipe(csv.parse({columns: true}))
.pipe(csv.transform(function (row) {
return transform(row);
}))
.pipe(csv.stringify({header: true}));
});
csvStream.pipe = function (dest, options) {
// pipe internal combined stream to dest
return this.combinedStream.pipe(dest, options);
};
inputStream
.pipe(csvStream)
.pipe(outputStream);
Here's what I ended up going with. I used the through2 library and the streaming API of the csv library to create the pipe function I was looking for.
var csv = require('csv');
var through = require('through2');

module.exports = function(transformFunc) {
  var parser = csv.parse({columns: true, relax_column_count: true}),
      transformer = csv.transform(function(row) {
        return transformFunc(row);
      }),
      stringifier = csv.stringify({header: true});

  return through(function(chunk, enc, cb) {
    var stream = this;
    parser.on('data', function(data) {
      transformer.write(data);
    });
    transformer.on('data', function(data) {
      stringifier.write(data);
    });
    stringifier.on('data', function(data) {
      stream.push(data);
    });
    parser.write(chunk);
    parser.removeAllListeners('data');
    transformer.removeAllListeners('data');
    stringifier.removeAllListeners('data');
    cb();
  });
};
It's worth noting the part where I remove the event listeners towards the end; this was due to running into memory errors from creating too many event listeners. I initially tried solving this problem by listening to events with once, but that prevented subsequent chunks from being read and passed on to the next pipe step.
Let me know if anyone has feedback or additional ideas.
I'm attempting to unit test one of my Node.js modules, which deals heavily in streams. I'm trying to mock a stream (that I will write to), as within my module I have .on('data') / .on('end') listeners that I would like to trigger. Essentially I want to be able to do something like this:
var mockedStream = new require('stream').Readable();
mockedStream.on('data', function withData(data) {
console.dir(data);
});
mockedStream.on('end', function() {
console.dir('goodbye');
});
mockedStream.push('hello world');
mockedStream.close();
This executes, but the 'on' event never gets fired after I do the push (and .close() is invalid).
All the guidance I can find on streams uses the 'fs' or 'net' library as a basis for creating a new stream (https://github.com/substack/stream-handbook), or they mock it out with sinon, but the mocking gets very lengthy very quickly.
Is there a nice way to provide a dummy stream like this?
There's a simpler way: stream.PassThrough
I've just found Node's very easy to miss stream.PassThrough class, which I believe is what you're looking for.
From Node docs:
The stream.PassThrough class is a trivial implementation of a Transform stream that simply passes the input bytes across to the output. Its purpose is primarily for examples and testing...
The code from the question, modified:
const { PassThrough } = require('stream');
const mockedStream = new PassThrough(); // <----
mockedStream.on('data', (d) => {
console.dir(d);
});
mockedStream.on('end', function() {
console.dir('goodbye');
});
mockedStream.emit('data', 'hello world');
mockedStream.end(); // <-- end. not close.
mockedStream.destroy();
mockedStream.push() works too, but the chunk arrives as a Buffer, so you might want to do: console.dir(d.toString());
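For example, a standalone sketch of that push() variant (a fresh PassThrough, since the one above has already been ended and destroyed):

const { PassThrough } = require('stream');

const mocked = new PassThrough();
mocked.on('data', (d) => console.dir(d.toString())); // chunk arrives as a Buffer, hence toString()
mocked.push('hello world'); // push() feeds the readable side directly
mocked.end();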
Instead of using push, I should have been using .emit(<event>, <data>).
My mock code now works and looks like:
var mockedStream = new require('stream').Readable();
mockedStream._read = function(size) { /* do nothing */ };
myModule.functionIWantToTest(mockedStream); // has .on() listeners in it
mockedStream.emit('data', 'Hello data!');
mockedStream.emit('end');
The accepted answer is only partially correct. If all you need is for events to fire, using .emit('data', datum) is okay, but if you need to pipe this mock stream anywhere else, it won't work.
Mocking a Readable stream is surprisingly easy, requiring only Readable from Node's built-in stream module:
const { Readable } = require('stream');

let eventCount = 0;
const mockEventStream = new Readable({
objectMode: true,
read: function (size) {
if (eventCount < 10) {
eventCount = eventCount + 1;
return this.push({message: `event${eventCount}`})
} else {
return this.push(null);
}
}
});
Now you can pipe this stream wherever and 'data' and 'end' will fire.
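For example, a minimal sketch of consuming the mockEventStream defined above through events (rather than piping it anywhere in particular):

mockEventStream.on('data', (evt) => {
  console.log(evt.message); // event1 ... event10
});
mockEventStream.on('end', () => {
  console.log('end fired'); // fires after push(null)
});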
Another example from the node docs:
https://nodejs.org/api/stream.html#stream_an_example_counting_stream
Building on @flacnut's answer, I did this (in Node.js 12+) using Readable.from() to construct a stream preloaded with data (a list of filenames):
const mockStream = require('stream').Readable.from([
'file1.txt',
'file2.txt',
'file3.txt',
])
In my case, I wanted to mock the stream of filenames returned by fast-glob.stream:
const glob = require('fast-glob')
// inject the mock stream into glob module
glob.stream = jest.fn().mockReturnValue(mockStream)
In the function being tested:
const stream = glob.stream(globFilespec)
for await (const filename of stream) {
// filename = file1.txt, then file2.txt, then file3.txt
}
Works like a charm!
Here's a simple implementation using jest.fn(), where the goal is to validate what has been written to the stream created by fs.createWriteStream(). The nice thing about jest.fn() is that although the calls to fs.createWriteStream() and stream.write() are inline in this test function, these functions don't need to be called directly by the test.
const fs = require('fs');
const mockStream = {}
test('mock fs.createWriteStream with mock implementation', async () => {
const createMockWriteStream = (filename, args) => {
return mockStream;
}
mockStream.write = jest.fn();
fs.createWriteStream = jest.fn(createMockWriteStream);
const stream = fs.createWriteStream('foo.csv', {'flags': 'a'});
await stream.write('foobar');
expect(fs.createWriteStream).toHaveBeenCalledWith('foo.csv', {'flags': 'a'});
expect(mockStream.write).toHaveBeenCalledWith('foobar');
})
I realize that Node is non-blocking; however, I also realize that because Node has only one thread, putting a three-second while loop in the middle of your event loop will cause blocking. I.e.:
var start = new Date();
console.log('Test 1');
function sleep(time, words) {
while(new Date().getTime() < start.getTime() + time);
console.log(words);
}
sleep(3000, 'Test 2'); //This will block
console.log('Test 3') //Logs Test 1, Test 2, Test 3
Many of the examples I have seen dealing with the new "Streams2" interface look like they would cause this same blocking. For instance this one, borrowed from here:
var crypto = require('crypto');
var fs = require('fs');
var readStream = fs.createReadStream('myfile.txt');
var hash = crypto.createHash('sha1');
readStream
.on('readable', function () {
var chunk;
while (null !== (chunk = readStream.read())) {
hash.update(chunk); //DOESN'T This Cause Blocking?
}
})
.on('end', function () {
console.log(hash.digest('hex'));
});
If I am following right, the readStream will emit the readable event when there is data in the buffer. So it seems that once the readable event is emitted, the entire event loop would be stopped until readStream.read() returns null. This seems less desirable than the old way (because it would not block). Can somebody please tell me why I am wrong? Thanks.
You don't have to read until the internal stream buffer is empty. You could just read once if you wanted and then read another chunk some time later.
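For instance, a sketch of that "read one chunk per event" idea, applied to the question's readStream and hash (just illustrating the point, not a full replacement for the loop):

readStream.on('readable', function () {
  var chunk = readStream.read(); // take a single chunk; returns null if the buffer is empty
  if (chunk !== null) hash.update(chunk);
});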
readStream.read() itself is not blocking, but hash.update(chunk) is (for a brief amount of time) because the hashing is done on the main thread (there is a github issue about adding an async interface that would execute crypto functions in the thread pool though).
Also, you can simplify the code you have to use the crypto stream interface:
var crypto = require('crypto'),
fs = require('fs');
var readStream = fs.createReadStream('myfile.txt'),
hasher = crypto.createHash('sha1');
readStream.pipe(hasher).on('readable', function() {
// the hash stream automatically pushes the digest
// to the readable side once the writable side is ended
console.log(this.read());
}).setEncoding('hex');
All JS code is single-threaded, so a loop will block, but you are misunderstanding how long that loop will run for. Calling .read() takes a readable item from the stream, just like a 'data' handler would be called with the item. It stops executing and unblocks as soon as there are no more items. 'readable' is triggered whenever there is data, and then the handler empties the buffer and waits for another 'readable'. So where your first while loop relies on the time being updated, which could take an unbounded amount of time, the other loop is basically doing:
while (items.length > 0) items.pop()
which is pretty much the minimum amount of work you need to do to process items from a stream.
I've seen and read a few tutorials that state you can pipe one stream to another almost like lego blocks, but I can't find anything on how to catch a pipe command when a stream is piped to your object.
What I mean is how do I create an object with functions so I can do:
var uploadWrapper = function (client, file, callback) {
  var upload = function (client, file, callback) {
    var file = file;
    // this.data = 'undefined'
    stream.Writable.call(this);
    this.end = function () {
      if (typeof this.data !== 'undefined') file.data = this.data;
      callback(file.data, 200);
    };
    // var path = urlB.host('upload').object('files', file.id).action('content').url
    // // client.upload(path, file, callback)
  };
  util.inherits(upload, stream.Writable);
  upload.prototype._write = function (chunk, encoding, callback) {
    this.data = this.data + chunk.toString('utf8');
    callback();
  };
  return new upload(client, file, callback);
};
exports.upload = uploadWrapper
How do I handle when data is piped to my object?
I've looked but I can't really find anything about this (maybe I haven't looked in the right places?).
Can any one point me in the right direction?
If it helps to know, all I want to be able to do is catch a data stream and build a string containing the data with binary encoding, whether it's from a file stream or a request stream from a server (i.e. the data from a file in a multipart request).
EDIT: I've updated the code to log the data
EDIT: I've fixed it; I can now receive piped data. I had to put the code in a wrapper that returned the function that implemented the stream.
EDIT: Different problem now: this.data in _read isn't stored in a way that this.data in the upload function can read.
EDIT: OK, now I can deal with the callback and catch the data. I need to work out how to tell if data is being piped to it or if it's being used as a normal function.
If you want to create your own stream that can be piped to and/or from, look at the node docs for implementing streams.
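As a hedged sketch of what those docs describe (makeCollector and the 'binary' encoding are just illustrative choices, not the poster's actual upload class), a custom Writable that collects piped data into a string could look roughly like this:

const { Writable } = require('stream');

// Collects everything piped into it and hands the result to a callback
// once the source has ended.
function makeCollector(callback) {
  let data = '';
  return new Writable({
    write(chunk, encoding, cb) {
      data += chunk.toString('binary'); // binary encoding, as the question asks for
      cb();
    },
    final(cb) {
      callback(data); // source finished; hand over the accumulated string
      cb();
    }
  });
}

// Usage: any readable source can be piped in, e.g. a file stream or a request.
require('fs').createReadStream('file.bin').pipe(makeCollector(function (data) {
  console.log('received', data.length, 'bytes');
}));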
How to close a readable stream in Node.js?
var input = fs.createReadStream('lines.txt');
input.on('data', function(data) {
// after closing the stream, this will not
// be called again
if (gotFirstLine) {
// close this stream and continue the
// instructions from this if
console.log("Closed.");
}
});
This would be better than:
input.on('data', function(data) {
if (isEnded) { return; }
if (gotFirstLine) {
isEnded = true;
console.log("Closed.");
}
});
But this would not stop the reading process...
Edit: Good news! Starting with Node.js 8.0.0 readable.destroy is officially available: https://nodejs.org/api/stream.html#stream_readable_destroy_error
ReadStream.destroy
You can call the ReadStream.destroy function at any time.
var fs = require("fs");
var readStream = fs.createReadStream("lines.txt");
readStream
.on("data", function (chunk) {
console.log(chunk);
readStream.destroy();
})
.on("end", function () {
// This may not be called, since we are destroying the stream
// the first time the "data" event is received
console.log("All the data in the file has been read");
})
.on("close", function (err) {
console.log("Stream has been destroyed and file has been closed");
});
The public function ReadStream.destroy is not documented (Node.js v0.12.2) but you can have a look at the source code on GitHub (Oct 5, 2012 commit).
The destroy function internally marks the ReadStream instance as destroyed and calls the close function to release the file.
You can listen to the close event to know exactly when the file is closed. The end event will not fire unless the data is completely consumed.
Note that the destroy (and close) functions are specific to fs.ReadStream. They are not part of the generic stream.Readable "interface".
Invoke input.close(). It's not in the docs, but
https://github.com/joyent/node/blob/cfcb1de130867197cbc9c6012b7e84e08e53d032/lib/fs.js#L1597-L1620
clearly does the job :) It actually does something similar to your isEnded.
EDIT 2015-Apr-19 Based on comments below, and to clarify and update:
This suggestion is a hack, and is not documented.
Though, looking at the current lib/fs.js, it still works >1.5yrs later.
I agree with the comment below about calling destroy() being preferable.
As correctly stated below, this works for fs ReadStreams, not for a generic Readable
As for a generic solution: it doesn't appear as if there is one, at least from my understanding of the documentation and from a quick look at _stream_readable.js.
My proposal would be to put your readable stream in paused mode, at least preventing further processing of your upstream data source. Don't forget to unpipe() and remove all data event listeners so that pause() actually pauses, as mentioned in the docs.
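A minimal sketch of that workaround (haltStream is just an illustrative name; readable and dest stand for your stream and whatever it is currently piped into):

function haltStream(readable, dest) {
  readable.unpipe(dest);               // stop feeding the downstream writable
  readable.removeAllListeners('data'); // any remaining 'data' listener would keep it flowing
  readable.pause();                    // switch back to paused mode so the source stops draining
}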
Today, in Node 10
readableStream.destroy()
is the official way to close a readable stream
see https://nodejs.org/api/stream.html#stream_readable_destroy_error
You can't. There is no documented way to close/shutdown/abort/destroy a generic Readable stream as of Node 5.3.0. This is a limitation of the Node stream architecture.
As other answers here have explained, there are undocumented hacks for specific implementations of Readable provided by Node, such as fs.ReadStream. These are not generic solutions for any Readable though.
If someone can prove me wrong here, please do. I would like to be able to do what I'm saying is impossible, and would be delighted to be corrected.
EDIT: Here was my workaround: implement .destroy() for my pipeline though a complex series of unpipe() calls. And after all that complexity, it doesn't work properly in all cases.
EDIT: Node v8.0.0 added a destroy() api for Readable streams.
As of version 4.x, pushing a null value into the stream will trigger an EOF signal.
From the nodejs docs
If a value other than null is passed, The push() method adds a chunk of data into the queue for subsequent stream processors to consume. If null is passed, it signals the end of the stream (EOF), after which no more data can be written.
This worked for me after trying numerous other options on this page.
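As a short sketch of that (the stream here is just a stand-in created with an empty read(), since pushing only works on streams you implement or control):

const { Readable } = require('stream');

const stream = new Readable({ read() {} }); // stand-in stream we control
stream.on('end', () => console.log('stream ended'));
stream.resume();           // flowing mode, so 'end' can fire even without a consumer
stream.push('last chunk'); // normal data
stream.push(null);         // EOF: nothing more can be pushed after this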
This destroy module is meant to ensure a stream gets destroyed, handling different APIs and Node.js bugs. Right now it is one of the best choices.
NB. From Node 10 you can use the .destroy method without further dependencies.
You can clear and close the stream with yourstream.resume(), which will dump everything on the stream and eventually close it.
From the official docs:
readable.resume():
Return: this
This method will cause the readable stream to resume emitting 'data' events.
This method will switch the stream into flowing mode. If you do not want to consume the data from a stream, but you do want to get to its 'end' event, you can call stream.resume() to open the flow of data.
var readable = getReadableStreamSomehow();
readable.resume();
readable.on('end', () => {
console.log('got to the end, but did not read anything');
});
It's an old question, but I too was looking for the answer and found the one that best fit my implementation. Both end and close events get emitted, so I think this is the cleanest solution.
This will do the trick in node 4.4.* (stable version at the time of writing):
var input = fs.createReadStream('lines.txt');
input.on('data', function(data) {
if (gotFirstLine) {
this.end(); // Simple isn't it?
console.log("Closed.");
}
});
For a very detailed explanation see:
http://www.bennadel.com/blog/2692-you-have-to-explicitly-end-streams-after-pipes-break-in-node-js.htm
This code here will do the trick nicely:
function closeReadStream(stream) {
if (!stream) return;
if (stream.close) stream.close();
else if (stream.destroy) stream.destroy();
}
writeStream.end() is the go-to way to close a writeStream...
To stop callback execution after a certain point, you have to use process.kill with the particular process ID:
const csv = require('csv-parser');
const fs = require('fs');
const filepath = "./demo.csv"
let readStream = fs.createReadStream(filepath, {
autoClose: true,
});
let MAX_LINE = 0;
readStream.on('error', (e) => {
console.log(e);
console.log("error");
})
.pipe(csv())
.on('data', (row) => {
if (MAX_LINE == 2) {
process.kill(process.pid, 'SIGTERM')
}
// console.log("not 2");
MAX_LINE++
console.log(row);
})
.on('end', () => {
// handle end of CSV
console.log("read done");
}).on("close", function () {
console.log("closed");
})