NodeJS Stream splitting - node.js

I have an infinite data stream from a forked process. I want this stream to be processed by a module, and sometimes I want to duplicate the data from this stream so it can be processed by a different module (e.g. monitoring a data stream, but if anything interesting happens I want to log the next n bytes to a file for further investigation).
So let's suppose the following scenario:
I start the program and begin consuming the readable stream.
Two seconds later, I want to process the same data for one second with a different stream reader.
Once the time is up, I want to close the second consumer, but the original consumer must stay untouched.
Here is a code snippet for this:
const { PassThrough } = require('stream');

var stream = process.stdout; // in my case, the stdout stream of the forked process
var stream2;

stream.pipe(detector); // Using the first consumer

function startAnotherConsumer() {
    stream2 = new PassThrough();
    stream.pipe(stream2);
    // use stream2 somewhere else
}

function stopAnotherConsumer() {
    stream.unpipe(stream2);
}
My problem here is that unpiping stream2 doesn't close it. If I call stream2.end() after the unpipe call, it crashes with this error:
events.js:160
throw er; // Unhandled 'error' event
^
Error: write after end
at writeAfterEnd (_stream_writable.js:192:12)
at PassThrough.Writable.write (_stream_writable.js:243:5)
at Socket.ondata (_stream_readable.js:555:20)
at emitOne (events.js:101:20)
at Socket.emit (events.js:188:7)
at readableAddChunk (_stream_readable.js:176:18)
at Socket.Readable.push (_stream_readable.js:134:10)
at Pipe.onread (net.js:548:20)
I even tried pausing the source stream to give the second stream a chance to flush its buffer, but that didn't work either:
function stopAnotherConsumer() {
    stream.pause();
    stream2.once('unpipe', function () {
        stream.resume();
        stream2.end();
    });
    stream.unpipe(stream2);
}
Same error as before here (write after end).
How can I solve this problem? My original intent is to duplicate the streamed data from a certain point onward, then close the second stream after a while.
Note: I tried to use this answer to make it work.

As there were no answers, I'm posting my (patchwork) solution. In case anyone has a better one, don't hold it back.
A new Stream:
const Writable = require('stream').Writable;
const Transform = require('stream').Transform;

class DuplicatorStream extends Transform {
    constructor(options) {
        super(options);
        this.otherStream = null;
    }

    attachStream(stream) {
        if (!(stream instanceof Writable)) {
            throw new Error('DuplicatorStream argument is not a writable stream!');
        }
        if (this.otherStream) {
            throw new Error('A stream is already attached!');
        }
        this.otherStream = stream;
        this.emit('attach', stream);
    }

    detachStream() {
        if (!this.otherStream) {
            throw new Error('No stream to detach!');
        }
        let stream = this.otherStream;
        this.otherStream = null;
        this.emit('detach', stream);
    }

    _transform(chunk, encoding, callback) {
        // Forward a copy of every chunk to the attached stream, if any.
        if (this.otherStream) {
            this.otherStream.write(chunk);
        }
        callback(null, chunk);
    }
}

module.exports = DuplicatorStream;
And the usage:
const { PassThrough } = require('stream');
const DuplicatorStream = require('./DuplicatorStream'); // wherever the class above lives

var stream = process.stdout; // in my case, the stdout stream of the forked process
var stream2;
var duplicatorStream = new DuplicatorStream();

stream.pipe(duplicatorStream);   // Inserting my duplicator stream in the chain
duplicatorStream.pipe(detector); // Using the first consumer

function startAnotherConsumer() {
    stream2 = new PassThrough();
    duplicatorStream.attachStream(stream2);
    // use stream2 somewhere else
}

function stopAnotherConsumer() {
    duplicatorStream.once('detach', function () {
        stream2.end();
    });
    duplicatorStream.detachStream();
}
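For completeness, here is a minimal sketch of how the timed scenario from the question could be wired up with this class. The forked child, the ./producer.js script and the log file name are illustrative assumptions, and the no-op detector merely stands in for the original consumer.

const { fork } = require('child_process');
const { PassThrough, Writable } = require('stream');
const fs = require('fs');
const DuplicatorStream = require('./DuplicatorStream');

// Hypothetical infinite source: stdout of a forked producer script.
const child = fork('./producer.js', [], { silent: true });

// Stand-in for the original consumer from the question.
const detector = new Writable({
    write(chunk, encoding, callback) { callback(); }
});

const duplicator = new DuplicatorStream();
child.stdout.pipe(duplicator);
duplicator.pipe(detector);

setTimeout(function () {
    // 2 seconds in: attach a second consumer that logs the duplicated bytes.
    const stream2 = new PassThrough();
    duplicator.attachStream(stream2);
    stream2.pipe(fs.createWriteStream('interesting.log'));

    setTimeout(function () {
        // 1 second later: detach, then end the copy; the original pipe is untouched.
        duplicator.once('detach', function () { stream2.end(); });
        duplicator.detachStream();
    }, 1000);
}, 2000);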

Related

Create an ongoing stream from buffer and append to the stream

I am receiving a base64-encoded string on my Node.js server in chunks, and I want to convert it to a stream that can be read by another process, but I can't find out how to do it. Currently, I have this code:
const stream = Readable.from(Buffer.from(data, 'base64'));
But this creates a new stream instance each time; what I would like to do is keep appending to an open stream until no more data is received from my front end. How do I create an appendable stream that I can add to and that can be read by another process?
--- Additional information ---
The client connects to the Node.js server via websocket. I read the "data" from the payload of the received websocket message.
socket.on('message', async function(res) {
    try {
        let payload = JSON.parse(res);
        let payloadType = payload['type'];
        let data = payload['data'];
--- Edit ---
I am getting this error message after pushing to the stream.
Error [ERR_METHOD_NOT_IMPLEMENTED]: The _read() method is not implemented
at Readable._read (internal/streams/readable.js:642:9)
at Readable.read (internal/streams/readable.js:481:10)
at maybeReadMore_ (internal/streams/readable.js:629:12)
at processTicksAndRejections (internal/process/task_queues.js:82:21) {
code: 'ERR_METHOD_NOT_IMPLEMENTED'
}
This is the code that reads from the connected stream:
const getAudioStream = async function* () {
    for await (const chunk of micStream) {
        if (chunk.length <= SAMPLE_RATE) {
            yield {
                AudioEvent: {
                    AudioChunk: encodePCMChunk(chunk),
                },
            };
        }
    }
};
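A minimal sketch of one possible approach (an assumption, not an answer from the original thread): keep a single long-lived PassThrough, here reusing the micStream name from the consumer code above, and write each decoded chunk into it instead of building a new Readable per chunk. PassThrough implements _read(), so feeding it this way also avoids the ERR_METHOD_NOT_IMPLEMENTED error. The 'audio' and 'end' message types are hypothetical.

const { PassThrough } = require('stream');

// Returns a long-lived, appendable stream fed by the websocket messages
// described in the question; consumers such as getAudioStream() can iterate
// over it while data keeps arriving.
function createMicStream(socket) {
    const micStream = new PassThrough();

    socket.on('message', function (res) {
        const payload = JSON.parse(res);
        if (payload['type'] === 'audio') {      // hypothetical message type
            micStream.write(Buffer.from(payload['data'], 'base64')); // append a chunk
        } else if (payload['type'] === 'end') { // hypothetical end-of-stream signal
            micStream.end();                    // no more data will arrive
        }
    });

    return micStream;
}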

Node - piping process.stdout doesn't drain automatically

Consider this Readable stream:
class ReadableStream extends stream.Readable {
    constructor() {
        super({objectMode: true, highWaterMark: 128});
    }

    i = 0;

    _read(size: number) {
        while (this.push({key: this.i++})) {}
    }
}
Piping to process.stdout doesn't drain it automatically. Nothing happens, and the program exits.
new ReadableStream().pipe(process.stdout);
Now, let's pipe it to this Writable stream instead:
class WritableStream extends stream.Writable {
    constructor() {
        super({objectMode: true, highWaterMark: 128});
    }

    _write(chunk: any, encoding: string, callback: (error?: (Error | null)) => void) {
        console.log(chunk);
        callback();
    }
}
new ReadableStream().pipe(new WritableStream());
The console is instantly filled with numbers, and it keeps going indefinitely.
Why don't process.stdout or fs.createWriteStream automatically request data?
process.stdout is not an object mode stream and does not work properly when you pipe an object mode stream to it. If you change your readableStream to not be an object mode stream, then the .pipe() will work properly.
In fact, if you attach an event handler for the error event such as:
new ReadableStream().pipe(process.stdout).on('error', err => {
    console.log(err);
});
Then, you will get this:
TypeError [ERR_INVALID_ARG_TYPE]: The "chunk" argument must be one of type string or Buffer. Received type object
at validChunk (_stream_writable.js:268:10)
at WriteStream.Writable.write (_stream_writable.js:303:21)
at ReadableStream.ondata (_stream_readable.js:727:22)
at ReadableStream.emit (events.js:210:5)
at ReadableStream.Readable.read (_stream_readable.js:525:10)
at flow (_stream_readable.js:1000:34)
at resume_ (_stream_readable.js:981:3)
at processTicksAndRejections (internal/process/task_queues.js:80:21) {
code: 'ERR_INVALID_ARG_TYPE'
}
Which shows that stdout is not expecting to get an object.
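A minimal sketch of the change the answer suggests (plain JavaScript here, with assumed names): push strings or Buffers instead of objects so that process.stdout can consume the chunks.

const { Readable } = require('stream');

class LineStream extends Readable {
    constructor() {
        super({ highWaterMark: 128 }); // no objectMode: chunks must be strings or Buffers
        this.i = 0;
    }
    _read(size) {
        // Keep pushing stringified values until the internal buffer is full.
        while (this.push(`${this.i++}\n`)) {}
    }
}

// Unlike the object-mode version, this fills the console with numbers.
new LineStream().pipe(process.stdout);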

Error [ERR_STREAM_PREMATURE_CLOSE]: Premature close in Node Pipeline stream

I am using the stream.pipeline functionality from Node to upload some data to S3. The basic idea I'm implementing is pulling files from a request and writing them to S3. I have one pipeline that pulls zip files and writes them to S3 successfully. However, I want my second pipeline to make the same request, but unzip and write the unzipped files to S3. The pipeline code looks like the following:
pipeline(request.get(...), s3Stream(zipFileWritePath)),
pipeline(request.get(...), new unzipper.Parse(), etl.map(entry => entry.pipe(s3Stream(createWritePath(writePath, entry)))))
The s3Stream function looks like so:
function s3Stream(file) {
    const pass = new stream.PassThrough()
    s3Store.upload(file, pass)
    return pass
}
The first pipeline works well and is currently running fine in production. However, when I add the second pipeline, I get the following error:
Error [ERR_STREAM_PREMATURE_CLOSE]: Premature close
at Parse.onclose (internal/streams/end-of-stream.js:56:36)
at Parse.emit (events.js:187:15)
at Parse.EventEmitter.emit (domain.js:442:20)
at Parse.<anonymous> (/node_modules/unzipper/lib/parse.js:28:10)
at Parse.emit (events.js:187:15)
at Parse.EventEmitter.emit (domain.js:442:20)
at finishMaybe (_stream_writable.js:641:14)
at afterWrite (_stream_writable.js:481:3)
at onwrite (_stream_writable.js:471:7)
at /node_modules/unzipper/lib/PullStream.js:70:11
at afterWrite (_stream_writable.js:480:3)
at process._tickCallback (internal/process/next_tick.js:63:19)
Any idea what could be causing this or solutions to resolve this would be greatly appreciated!
TL;DR
When using a pipeline, you commit to consuming the readable stream fully; you don't want anything to stop before the readable ends.
Deep dive
After some time working with these shenanigans, here is some more useful information.
import stream from 'stream'

const s1 = new stream.PassThrough()
const s2 = new stream.PassThrough()
const s3 = new stream.PassThrough()

s1.on('end', () => console.log('end 1'))
s2.on('end', () => console.log('end 2'))
s3.on('end', () => console.log('end 3'))
s1.on('close', () => console.log('close 1'))
s2.on('close', () => console.log('close 2'))
s3.on('close', () => console.log('close 3'))

stream.pipeline(
    s1,
    s2,
    s3,
    async s => { for await (const _ of s) { } },
    err => console.log('end', err)
)
Now if I call s2.end(), it ends s2 and everything downstream of it:
end 2
close 2
end 3
close 3
pipeline is the equivalent of s3(s2(s1))
But if I call s2.destroy(), it prints and destroys everything. This is your problem: a stream got destroyed before it ended normally, caused either by an error or by a return/break/throw in an async generator/async function:
close 2
end Error [ERR_STREAM_PREMATURE_CLOSE]: Premature close
at PassThrough.onclose (internal/streams/end-of-stream.js:117:38)
at PassThrough.emit (events.js:327:22)
at emitCloseNT (internal/streams/destroy.js:81:10)
at processTicksAndRejections (internal/process/task_queues.js:83:21) {
code: 'ERR_STREAM_PREMATURE_CLOSE'
}
close 1
close 3
You must not leave any of the streams without a way to catch its errors.
stream.pipeline() leaves dangling event listeners on the streams after the callback has been invoked. In the case of reuse of streams after failure, this can cause event listener leaks and swallowed errors.
Node source (14.4):
const onclose = () => {
    if (readable && !readableEnded) {
        if (!isReadableEnded(stream))
            return callback.call(stream, new ERR_STREAM_PREMATURE_CLOSE());
    }
    if (writable && !writableFinished) {
        if (!isWritableFinished(stream))
            return callback.call(stream, new ERR_STREAM_PREMATURE_CLOSE());
    }
    callback.call(stream);
};
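A minimal sketch of the return/break case mentioned above (illustrative; the behaviour is as this answer describes it for the Node versions discussed here): exiting the for await loop early destroys the source before its natural end, so the pipeline callback receives ERR_STREAM_PREMATURE_CLOSE.

const stream = require('stream')

const s1 = new stream.PassThrough()

stream.pipeline(
    s1,
    async source => {
        for await (const chunk of source) {
            break // early exit: the iterator destroys s1 before it has ended
        }
    },
    err => console.log('pipeline callback:', err && err.code)
)

s1.write('some data')
// Expected, per the explanation above: pipeline callback: ERR_STREAM_PREMATURE_CLOSE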

How do I write into a writable stream conditionally only if it is open?

I have this function in my module which writes to a child process's stdin stream, but sometimes I get:
events.js:160
throw er; // Unhandled 'error' event
^
Error: write EPIPE
at exports._errnoException (util.js:1020:11)
at WriteWrap.afterWrite (net.js:800:14)
I think it's happening because sometimes the writable stdin stream is closed before I write into it. Basically, I want to check whether it's closed or not: if it's open I'll write into it, otherwise I won't.
Relevant Code
/**
 * Write the stdin into the child process
 * @param proc Child process reference
 * @param stdin stdin string
 */
export function writeToStdin(proc: ChildProcess, stdin: string) {
    if (stdin) {
        proc.stdin.write(stdin + '\r\n');
        proc.stdin.end();
    }
}
Is there an API to check it as I couldn't find any?
You can try the finished API from streams:
const fs = require('fs');
const { finished } = require('stream');

const rs = fs.createReadStream('archive.tar');

finished(rs, (err) => {
    if (err) {
        console.error('Stream failed', err);
    } else {
        console.log('Stream is done reading');
    }
});

rs.resume(); // drain the stream
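Applied to the child process from the question, a minimal sketch (the wrapper below is hypothetical, not part of any API) could use finished() to remember whether stdin has already ended or errored before writing:

const { finished } = require('stream');

// Hypothetical wrapper: tracks the state of the child's stdin so callers
// can skip writes once it has ended, errored or been destroyed.
function makeStdinWriter(proc) {
    let closed = false;

    finished(proc.stdin, (err) => {
        closed = true;
        if (err) console.error('child stdin failed', err);
    });

    return function writeToStdin(stdin) {
        if (stdin && !closed) {
            proc.stdin.write(stdin + '\r\n');
            proc.stdin.end();
        }
    };
}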
The Node.js documentation describes the writable field on Writable stream objects, which seems to be what you're looking for: https://nodejs.org/api/stream.html#writablewritable
Is true if it is safe to call writable.write(), which means the stream has not been destroyed, errored or ended.
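A minimal sketch of that check applied to the poster's helper (types dropped; illustrative rather than a definitive fix):

function writeToStdin(proc, stdin) {
    // Only write while the stream has not been ended, destroyed or errored.
    if (stdin && proc.stdin.writable) {
        proc.stdin.write(stdin + '\r\n');
        proc.stdin.end();
    }
}

Note that EPIPE can still be raised asynchronously if the child exits between the check and the write, so attaching an 'error' handler to proc.stdin is still a good idea.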

_read() is not implemented on Readable stream

This question is about how to actually implement the read method of a Readable stream.
I have this implementation of a Readable stream:
import {Readable} from "stream";
this.readableStream = new Readable();
I am getting this error
events.js:136
throw er; // Unhandled 'error' event
^
Error [ERR_STREAM_READ_NOT_IMPLEMENTED]: _read() is not implemented
at Readable._read (_stream_readable.js:554:22)
at Readable.read (_stream_readable.js:445:10)
at resume_ (_stream_readable.js:825:12)
at _combinedTickCallback (internal/process/next_tick.js:138:11)
at process._tickCallback (internal/process/next_tick.js:180:9)
at Function.Module.runMain (module.js:684:11)
at startup (bootstrap_node.js:191:16)
at bootstrap_node.js:613:3
The reason the error occurs is obvious; we need to do this:
this.readableStream = new Readable({
    read(size) {
        return true;
    }
});
I don't really understand how to implement the read method though.
The only thing that works is just calling
this.readableStream.push('some string or buffer');
If I try to do something like this:
this.readableStream = new Readable({
    read(size) {
        this.push('foo'); // call push here!
        return true;
    }
});
then nothing happens - nothing comes out of the readable!
Furthermore, these articles say you don't need to implement the read method:
https://github.com/substack/stream-handbook#creating-a-readable-stream
https://medium.freecodecamp.org/node-js-streams-everything-you-need-to-know-c9141306be93
My question is - why does calling push inside the read method do nothing? The only thing that works for me is just calling readable.push() elsewhere.
why does calling push inside the read method do nothing? The only thing that works for me is just calling readable.push() elsewhere.
I think it's because you are not consuming it; you need to pipe it to a writable stream (e.g. stdout) or just consume it through a data event:
const { Readable } = require("stream");

let count = 0;
const readableStream = new Readable({
    read(size) {
        this.push('foo');
        count++;
        if (count === 5) this.push(null); // end the stream after five chunks
    }
});
// piping
readableStream.pipe(process.stdout)

// through the data event
readableStream.on('data', (chunk) => {
    console.log(chunk.toString());
});
Both of them should print foo five times (they are slightly different, though). Which one you should use depends on what you are trying to accomplish.
Furthermore, these articles says you don't need to implement the read method:
You might not need it, this should work:
const { Readable } = require("stream");
const readableStream = new Readable();

for (let i = 0; i <= 5; i++) {
    readableStream.push('foo');
}
readableStream.push(null);

readableStream.pipe(process.stdout)
In this case you can't capture it through the data event. Also, this way is not very useful or efficient, I'd say; we are just pushing all the data into the stream at once (if it's large, everything is going to be in memory), and then consuming it.
From the documentation:
readable._read:
"When readable._read() is called, if data is available from the resource, the implementation should begin pushing that data into the read queue using the this.push(dataChunk) method. link"
readable.push:
"The readable.push() method is intended be called only by Readable implementers, and only from within the readable._read() method. link"
Implement the _read method after your ReadableStream's initialization:
import {Readable} from "stream";

this.readableStream = new Readable();
this.readableStream._read = function () {};
readableStream is like a pool:
.push(data) is like pumping water into the pool.
.pipe(destination) is like connecting the pool to a pipe and pumping the water somewhere else.
_read(size) runs as the pump: it controls how much water flows and when the data ends.
fs.createReadStream() creates a read stream whose _read() function is already implemented to push file data and to end at end of file.
_read(size) fires automatically when the pool is attached to a pipe. Thus, if you force-call this function without connecting a way to a destination, it pumps to nowhere, and that affects the internal state used by _read() (the cursor may move to the wrong place, ...).
The read() function must be created inside new Stream.Readable(). It is actually a function inside an options object. It is not readableStream.read(), and implementing readableStream.read = function(size) {...} will not work.
An easy way to understand the implementation:

var Reader = new Object();
Reader.read = function (size) {
    if (this.i == null) { this.i = 1; } else { this.i++; }
    this.push("abc");
    if (this.i > 7) { this.push(null); } // end the stream after a few chunks
};

const Stream = require('stream');
const renderStream = new Stream.Readable(Reader);
renderStream.pipe(process.stdout)
You can use this to render whatever stream data you want and POST it to another server.
POST stream data with Axios:
require('axios')({
    method: 'POST',
    url: 'http://127.0.0.1:3000',
    headers: {'Content-Length': 1000000000000},
    data: renderStream
});
