Node.js, stream pipe output data to client with socket io-stream - node.js

Sorry for the repeated topic, but I've searched and experimented for two days now and haven't been able to solve the problem.
I am trying to live-stream one picture per second to a client via socket.io-stream using the following code:
var args = [
"-i",
"/dev/video0",
"-s",
"1280x720",
"-qscale",
1,
"-vf",
"fps=1",
config.imagePath,
"-s",
config.imageStream.resolution[0],
"-f",
"image2pipe",
"-qscale",
1,
"-vf",
"fps=1",
"pipe:1"
];
camera = spawn("avconv", args); // avconv = ffmpeg
The settings are good, and the process writes to stdout successfully. I capture all outgoing image data using this simplified code:
var ss = require("socket.io-stream");
camera.stdout.on("data", function(data) {
var stream = ss.createStream();
ss(socket).emit("img", stream, "newImg");
// how do i write the data-object to the stream?
// fs.createReadStream(imagePath).pipe(stream);
});
"socket" comes from the client via the socket.io package; no problem there. I listen for the "data" event on the stdout pipe, and each chunk gets passed to the function above. At this stage "data" is not a stream but a "<Buffer .. .. ..>" object, so I cannot pipe it the way I could with the commented-out createReadStream statement, where I read the image from disk. How do I stream the data (a Buffer at this stage) to the client? Can I do this differently, perhaps without socket.io-stream? Each "data" chunk is just one part of the whole image, so perhaps two or three chunks need to be put together to form the complete image.
I tried using "stream.write(data, "binary");", which did transfer the Buffer objects. The problem is that there is no end-of-stream event, so I do not know when an image is complete. I tried listening for "close", "end" and "finish" on stdout; nothing triggers. Am I missing something? Am I making it overly complex? The reasoning behind my implementation is that I need a new stream for each complete image. Is that right?
Thanks a lot!
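One possible way to tell when an image is complete, sketched under the assumption that the process writes JPEG frames back to back on stdout (the question does not state the image format): accumulate the chunks and cut at the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers. The helper name createJpegSplitter is hypothetical, and the marker scan is a simplification that could misfire on files carrying embedded thumbnails.

```javascript
// Hypothetical helper: collects stdout chunks and calls onFrame once per complete JPEG.
// Assumption: the stream is a plain concatenation of JPEG images.
function createJpegSplitter(onFrame) {
  var SOI = Buffer.from([0xff, 0xd8]); // JPEG start-of-image marker
  var EOI = Buffer.from([0xff, 0xd9]); // JPEG end-of-image marker
  var pending = Buffer.alloc(0);       // bytes received but not yet emitted

  return function feed(chunk) {
    pending = Buffer.concat([pending, chunk]);
    for (;;) {
      var start = pending.indexOf(SOI);
      if (start === -1) return;               // no frame has started yet
      var end = pending.indexOf(EOI, start + 2);
      if (end === -1) return;                 // frame started but is not complete
      onFrame(pending.slice(start, end + 2)); // emit one whole image
      pending = pending.slice(end + 2);       // keep the remainder for the next frame
    }
  };
}
```

With something like this, camera.stdout.on("data", createJpegSplitter(handler)) would call handler once per image, and the handler could either open a fresh socket.io-stream per image or emit the Buffer directly, since socket.io itself can carry binary payloads.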

Related

What are the roles of _read and read in Node JS streams?

I'm really just looking for clarification on how these work. IMO the documentation on streams is somewhat lacking, and there aren't a lot of resources out there that comprehensively explain how they are meant to work and be extended.
My question can be broken down into two parts
One, What is the role of the _read function within the stream module? When I run this code it endlessly prints out "hello world" until null is pushed onto the stream buffer. This seems to indicate that _read is called in some kind of loop that waits for a null in the buffer, but I can't find documentation anywhere that states this in explicit terms.
var Readable = require('stream').Readable
var rs = Readable()
rs._read = function () {
rs.push("hello world")
rs.push(null)
};
rs.on("data", function(data){
console.log("some data", data)
})
Two, what does read actually do? My understanding is that read consumes data from the read stream buffer, and fires the data event. Is that all that's going on here?
read() is something that a consumer of the readStream calls if they want to specifically read some bytes from the stream (when the stream is not flowing).
_read() is an internal method that is part of the internal implementation of the read stream. The internals of the stream call this method (it is NOT to be called from the outside) when the stream is flowing and the stream wants to get more data from the source. When called the _read() method pushes data with .push(data) or if it has no more data, then it does a .push(null).
You can see an explanation and example here in this article.
_read(size) {
if (this.data.length) {
const chunk = this.data.slice(0, size);
this.data = this.data.slice(size, this.data.length);
this.push(chunk);
} else {
this.push(null); // 'end', no more data
}
}
If you were implementing a read stream to some custom source of data, then you would implement the _read() method to fetch up to the size amount of data from your source and .push() that data into the stream.
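To make the split between the two methods concrete, here is a minimal sketch that wraps the _read() fragment above in a complete class and consumes it with read() in paused mode. The class name StringSource and the sample text are made up for illustration.

```javascript
const { Readable } = require('stream');

// Hypothetical custom source: serves the bytes of a string in pieces.
class StringSource extends Readable {
  constructor(text, options) {
    super(options);
    this.data = Buffer.from(text);
  }

  // Called by the stream internals whenever they want more data.
  _read(size) {
    if (this.data.length) {
      const chunk = this.data.slice(0, size);
      this.data = this.data.slice(size);
      this.push(chunk);
    } else {
      this.push(null); // signals 'end', no more data
    }
  }
}

// Consumer side: read() pulls whatever is buffered while the stream is paused.
const source = new StringSource('hello world', { highWaterMark: 4 });
source.on('readable', function () {
  let chunk;
  while ((chunk = source.read()) !== null) {
    console.log('read() returned', chunk.toString());
  }
});
source.on('end', function () {
  console.log('no more data');
});
```

The consumer never calls _read() itself; calling read() (or attaching a "data" listener) is what prompts the internals to invoke it.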

Buffering a Float32Array to a client

This should be obvious, but for some reason I am not getting any result. I have already spent way too much time just trying different ways to get this working without results.
TLDR: A shorter way to explain this question could be: I know how to stream a sound from a file. How to stream a buffer containing sound that was synthesized on the server instead?
This works:
client:
var stream = ss.createStream();
ss(socket).emit('get-file', stream, data.bufferSource);
var parts = [];
stream.on('data', function(chunk){
parts.push(chunk);
});
stream.on('end', function () {
var blob=new Blob(parts,{type:"audio"});
if(cb){
cb(blob);
}
});
server (in the 'socket-connected' callback of socket.io)
var ss = require('socket.io-stream');
// ....
ss(socket).on('get-file', (stream:any, filename:any)=>{
console.log("get-file",filename);
fs.createReadStream(filename).pipe(stream);
});
Now, the problem:
I want to alter this audio buffer and send the modified audio instead of just the file. I converted the ReadStream into an Float32Array, and did some processes sample by sample. Now I want to send that modified Float32Array to the client.
In my view, I just need to replace fs.createReadStream(filename) with (new Readable()).push(modifiedSoundBuffer). However, I get a TypeError: Invalid non-string/buffer chunk. Interestingly, if I convert this modifiedSoundBuffer into a Uint8Array, it doesn't yell at me, and the client gets a large array, which looks good, except that all the array values are 0. I guess that it's flooring all the values?
ss(socket).on('get-buffer', (stream:any, filename:any)=>{
let readable=(new Readable()).push(modifiedFloat32Array);
readable.pipe(stream);
});
I am trying to use streams for two reasons: sound buffers are large, and to allow concurrent processing in the future
What if you convert the Float32Array to a Buffer before sending, like this: (new Readable()).push(Buffer.from(modifiedSoundBuffer))?
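A caveat worth checking: Buffer.from() given a typed array copies each element as a single byte, so float samples between -1 and 1 would come out as 0, which matches the all-zero array seen with Uint8Array. A sketch that keeps the raw 4-byte samples by wrapping the underlying ArrayBuffer instead is shown here; the helper names float32ToBuffer and readableFromFloat32 are hypothetical.

```javascript
const { Readable } = require('stream');

// Wrap the bytes behind the Float32Array without converting the values.
function float32ToBuffer(samples) {
  return Buffer.from(samples.buffer, samples.byteOffset, samples.byteLength);
}

// Build a readable stream that yields those bytes and then ends.
function readableFromFloat32(samples) {
  const readable = new Readable({ read() {} }); // no-op read: data is pushed up front
  readable.push(float32ToBuffer(samples));
  readable.push(null); // end of stream
  return readable;
}
```

On the server this would be used as readableFromFloat32(modifiedFloat32Array).pipe(stream). Note that push() returns a boolean rather than the stream, so the stream has to be kept in its own variable before piping. The client would then need to reinterpret the received bytes as 32-bit floats.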

Proper way to unpipe a streams2 pipeline and empty it (not just flush)

Premise
I'm trying to find the correct way to prematurely terminate a series of piped streams (pipeline) in Node.js: sometimes I want to gracefully abort the stream before it has finished. Specifically I'm dealing with mostly objectMode: true and non-native parallel streams, but this shouldn't really matter.
Problem
The problem is when I unpipe the pipeline, data remains in each stream's buffer and is drained. This might be okay for most of the intermediate streams (e.g. Readable/Transform), but the last Writable still drains to its write target (e.g. a file or a database or socket or w/e). This could be problematic if the buffer contains hundreds or thousands of chunks which takes a significant amount of time to drain. I want it to stop immediately, i.e. not drain; why waste cycles and memory on data that doesn't matter?
Depending on the route I go, I receive either a "write after end" error, or an exception when the stream cannot find existing pipes.
Question
What is the proper way to gracefully kill off a pipeline of streams in the form a.pipe(b).pipe(c).pipe(z)?
Solution?
The solution I have come up with is 3-step:
unpipe each stream in the pipeline in reverse order
Empty each stream's buffer that implements Writable
end each stream that implements Writable
Some pseudo code illustrating the entire process:
var pipeline = [ // define the pipeline
readStream,
transformStream0,
transformStream1,
writeStream
];
// build and start the pipeline
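var tmpBuildStream;
pipeline.forEach(function(stream) {
  if ( !tmpBuildStream ) {
    tmpBuildStream = stream; // the first stream is the head of the chain
    return; // a forEach callback cannot use continue
  }
  tmpBuildStream = tmpBuildStream.pipe(stream); // pipe() returns its destination
});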
// sleep, timeout, event, etc...
// tear down the pipeline
var tmpTearStream;
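pipeline.slice(0).reverse().forEach(function(stream) {
  if ( !tmpTearStream ) {
    tmpTearStream = stream; // the last stream has nothing downstream to unpipe
    return; // a forEach callback cannot use continue
  }
  tmpTearStream = stream.unpipe(tmpTearStream); // unpipe() returns the stream it was called on
});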
// empty and end the pipeline
pipeline.forEach(function(stream) {
if ( typeof stream._writableState === 'object' ) { // empty
stream._writableState.length -= stream._writableState.buffer.length;
stream._writableState.buffer = [];
}
if ( typeof stream.end === 'function' ) { // kill
stream.end();
}
});
I'm really worried about the usage of stream._writableState and modifying the internal buffer and length properties (the _ signifies a private property). This seems like a hack. Also note that since I'm piping, things like pause and resume are out of the question (based on a suggestion I received from IRC).
I also put together a runnable version (pretty sloppy) you can grab from github: https://github.com/zamnuts/multipipe-proto (git clone, npm install, view readme, npm start)
In this particular case I think we should get rid of the structure where you have four separate streams that are not fully customised. Piping them together creates a chain of dependencies that will be hard to control unless we implement our own mechanism.
I would like to focus on your actual goal here:
INPUT >----[read] → [transform0] → [transform1] → [write]-----> OUTPUT
| | | |
KILL_ALL------o----------o--------------o------------o--------[nothing to drain]
I believe that the above structure can be achieved by combining a custom:
duplex stream - for your own _write(chunk, encoding, cb) and _read(bytes) implementation, with a
transform stream - for your own _transform(chunk, encoding, cb) implementation.
Since you are using the writable-stream-parallel package you may also want to go over their libs, as their duplex implementation can be found here: https://github.com/Clever/writable-stream-parallel/blob/master/lib/duplex.js .
And their transform stream implementation is here: https://github.com/Clever/writable-stream-parallel/blob/master/lib/transform.js. Here they handle the highWaterMark.
Possible solution
Their write stream : https://github.com/Clever/writable-stream-parallel/blob/master/lib/writable.js#L189 has an interesting function writeOrBuffer, I think you might be able to tweak it a bit to interrupt writing the data from buffer.
Note: These 3 flags are controlling the buffer clearing:
( !finished && !state.bufferProcessing && state.buffer.length )
References:
Node.js Transform Stream Doc
Node.js Duplex Stream Doc
Writing Transform Stream in Node.js
Writing Duplex Stream in Node.js
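A different technique from the one discussed above, offered only as a sketch: on Node.js versions where streams expose destroy() (an assumption, since the question targets the older streams2 API), each stream can be destroyed after unpiping, which abandons buffered chunks instead of draining them and avoids touching _writableState. The function name tearDown is hypothetical.

```javascript
// Hypothetical helper: stop a pipeline built as a.pipe(b).pipe(c)... without draining it.
function tearDown(pipeline) {
  // Detach each stream from the next one so nothing new flows downstream.
  for (var i = 0; i < pipeline.length - 1; i++) {
    pipeline[i].unpipe(pipeline[i + 1]);
  }
  // Destroy every stream; writes that were still buffered are not flushed.
  pipeline.forEach(function (stream) {
    if (typeof stream.destroy === 'function') {
      stream.destroy();
    }
  });
}
```

It would be called with the same array used to build the pipeline, for example tearDown(pipeline). Writing to a destroyed stream afterwards raises an error, so producers need to stop first.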

Playing PCM stream from Web Audio API on Node.js

I'm streaming recorded PCM audio from a browser with web audio api.
I'm streaming it with binaryJS (websocket connection) to a nodejs server and I'm trying to play that stream on the server using the speaker npm module.
This is my client. The audio buffers are at first non-interleaved IEEE 32-bit linear PCM with a nominal range between -1 and +1. I take one of the two PCM channels to start off and stream it below.
var client = new BinaryClient('ws://localhost:9000');
var Stream = client.send();
recorder.onaudioprocess = function(AudioBuffer){
var leftChannel = AudioBuffer.inputBuffer.getChannelData (0);
Stream.write(leftChannel);
}
Now I receive the data as a buffer and try writing it to a speaker object from the npm package.
var Speaker = require('speaker');
var speaker = new Speaker({
channels: 1, // 1 channel
bitDepth: 32, // 32-bit samples
sampleRate: 48000, // 48,000 Hz sample rate
signed:true
});
server.on('connection', function(client){
client.on('stream', function(stream, meta){
stream.on('data', function(data){
speaker.write(leftchannel);
});
});
});
The result is a high pitch screech on my laptop's speakers, which is clearly not what's being recorded. It's not feedback either. I can confirm that the recording buffers on the client are valid since I tried writing them to a WAV file and it played back fine.
The docs for speaker and the docs for the AudioBuffer in question
I've been stumped on this for days. Can someone figure out what is wrong or perhaps offer a different approach?
Update with solution
First off, I was using the websocket API incorrectly. I updated above to use it correctly.
I needed to convert the audio buffers to an array buffer of integers. I chose to use Int16Array. Since the given audio buffer has a range between -1 and 1, it was as simple as multiplying by the maximum value of the new type (an Int16 ranges from -32768 to 32767).
recorder.onaudioprocess = function(AudioBuffer){
  var left = AudioBuffer.inputBuffer.getChannelData (0);
  var l = left.length;
  var buf = new Int16Array(l);
  while (l--) {
    buf[l] = left[l] * 0x7FFF; // scale [-1, 1] to 16 bit; 0xFFFF would wrap around in an Int16Array
  }
  Stream.write(buf.buffer);
}
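The conversion step can also be written as a standalone function, sketched here with clamping added so that samples slightly outside the nominal range do not wrap around. The name floatTo16BitPCM is hypothetical; the clamping and the separate negative scale factor are additions that the original solution does not include.

```javascript
// Convert float samples in [-1, 1] to signed 16-bit PCM.
function floatTo16BitPCM(samples) {
  var out = new Int16Array(samples.length);
  for (var i = 0; i < samples.length; i++) {
    var s = Math.max(-1, Math.min(1, samples[i])); // clamp out-of-range samples
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;      // map to -32768..32767
  }
  return out;
}
```

The client would then call Stream.write(floatTo16BitPCM(left).buffer). Presumably the Speaker on the server would also need bitDepth: 16 rather than 32 to match the new sample size.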
It looks like you're sending your stream through as the meta object.
According to the docs, BinaryClient.send takes a data object (the stream) and a meta object, in that order. The callback for the stream event receives the stream (as a BinaryStream object, not a Buffer) in the first parameter and the meta object in the second.
You're passing send() the string 'channel' as the stream and the Float32Array from getChannelData() as the meta object. Perhaps if you were to swap those two parameters (or just use client.send(leftChannel)) and then change the server code to pass stream to speaker.write instead of leftchannel (which should probably be renamed to meta, or dropped if you don't need it), it might work.
Note that since Float32Array isn't a stream or buffer object, BinaryJS might try to send it in one chunk. You may want to send leftChannel.buffer (the ArrayBuffer behind that object) instead.
Let me know if this works for you; I'm not able to test your exact setup right now.

gzipping a file with nodejs streams causes memory leaks

I'm trying to do what should be seemingly quite simple: take a file with filename X, and create a gzipped version as "X.gz". Nodejs's zlib module does not come with a convenient zlib.gzip(infile, outfile), so I figured I'd use an input stream, an output stream, and a zlib gzipper, then pipe them:
var zlib = require("zlib"),
zipper = zlib.createGzip(),
fs = require("fs");
var tryThing = function(logfile) {
var input = fs.createReadStream(logfile, {autoClose: true}),
output = fs.createWriteStream(logfile + ".gz");
input.pipe(zipper).pipe(output);
output.on("end", function() {
// delete original file, it is no longer needed
fs.unlink(logfile);
// clear listeners
zipper.removeAllListeners();
input.removeAllListeners();
});
}
However, every time I run this function, the memory footprint of Node.js grows by about 100 KB. Am I forgetting to tell the streams they should just kill themselves off again because they won't be needed any longer?
Or, alternatively, is there a way to just gzip a file without bothering with streams and pipes? I tried googling for "node.js gzip a file" but it's just links to the API docs, and stack overflow questions on gzipping streams and buffers, not how to just gzip a file.
I think you need to properly unpipe and close the streams. Simply calling removeAllListeners() may not be enough to clean things up, as streams may be waiting for more data (and thus staying alive in memory unnecessarily).
Also, you're not closing the output stream, and IMO I'd listen for the input stream's end event instead of the output's.
// cleanup
input.once('end', function() {
zipper.removeAllListeners();
zipper.close();
zipper = null;
input.removeAllListeners();
input.close();
input = null;
output.removeAllListeners();
output.close();
output = null;
});
Also I don't think the stream returned from zlib.createGzip() can be shared once ended. You should create a new one at every iteration of tryThing:
var input = fs.createReadStream(logfile, {autoClose: true}),
output = fs.createWriteStream(logfile + ".gz"),
zipper = zlib.createGzip(); // a fresh gzip stream for every call
input.pipe(zipper).pipe(output);
I haven't tested this, though, as I don't have a memory profiling tool nearby right now.
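As an alternative to manual cleanup, here is a sketch that assumes a Node.js version providing stream.pipeline (10 or later). pipeline wires the three streams together, creates a new gzip stream per call, destroys all of them on completion or error, and reports through a single callback. The function name gzipFile and its callback signature are hypothetical.

```javascript
const fs = require('fs');
const zlib = require('zlib');
const { pipeline } = require('stream');

// Hypothetical helper: gzip logfile to logfile + ".gz", then delete the original.
function gzipFile(logfile, callback) {
  pipeline(
    fs.createReadStream(logfile),
    zlib.createGzip(),                        // fresh gzip stream for every file
    fs.createWriteStream(logfile + '.gz'),
    function (err) {
      if (err) return callback(err);          // streams are already destroyed here
      fs.unlink(logfile, callback);           // output is fully written, safe to delete
    }
  );
}
```

The callback only runs after the output file has finished writing, which sidesteps the question of which stream's event to listen to.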
