How do buffers work in Node.js? - node.js

I'm new to Node.js and trying to broadcast streaming video, but I have no idea how to do this. I want to know how buffering works in a Node.js application.

Buffers are instances of the Buffer class in node, which is designed to handle raw binary data. Each buffer corresponds to some raw memory allocated outside V8. Buffers act somewhat like arrays of integers, but aren't resizable and have a whole bunch of methods specifically for binary data. In addition, the "integers" in a buffer each represent a byte and so are limited to values from 0 to 255 (2^8 - 1), inclusive.
More about buffers in the official documentation: https://nodejs.org/api/buffer.html
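For instance, a minimal sketch of what working with a Buffer looks like (the values are arbitrary):

const buf = Buffer.alloc(4);               // 4 bytes of raw memory, zero-filled
buf[0] = 255;                              // each "integer" is one byte, 0-255
buf[1] = 256;                              // out-of-range values wrap (256 becomes 0)
console.log(buf);                          // <Buffer ff 00 00 00>
console.log(Buffer.from('abc'));           // <Buffer 61 62 63> (UTF-8 bytes)
console.log(Buffer.from('abc').length);    // 3 - length is in bytes, not characters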
Data is processed in terms of streams, instead of the whole of the data at a time. The chunks are collected in a buffer, and once the buffer fills, the data is passed on from one point to another (to the client requesting it).
It's something like streaming movies online: we don't have to wait for all of the data to arrive, but instead receive it in chunks and start using it before the whole thing has arrived.
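As a hedged sketch of that idea, here is roughly how a video file could be streamed to clients chunk by chunk instead of being read into memory whole (the file name and port are made up for illustration):

const http = require('http');
const fs = require('fs');

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'video/mp4' });
  // createReadStream reads the file into Buffers one chunk at a time;
  // pipe() forwards each chunk to the client as soon as it is read.
  fs.createReadStream('movie.mp4').pipe(res);
}).listen(8000);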

Related

Socket.io reading and writing binary data with zero copy buffers

I see that socket.io supports binary data. To send, it looks like I can just pass a Buffer object.
I want to send and receive a large number of medium-size files, and I want to see if this can be optimized. When creating a Buffer from a file and sending it via socket.io, does it internally create any copy of the data, or is it handled zero-copy?
Similarly, when receiving, is it possible to receive the data as a Buffer that can be written to a file without creating a copy? I couldn't find an example of receiving data as a Buffer. Can someone point out examples of receiving binary data as a Buffer?
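I can't speak to the zero-copy internals, but as a hedged sketch of the send/receive pattern being asked about (the event name and file paths are made up): in Node.js, binary socket.io payloads are delivered to the handler as Buffers, which can be written straight to a file.

const fs = require('fs');
const io = require('socket.io')(3000);

io.on('connection', (socket) => {
  // receiving: the binary payload arrives as a Buffer in Node.js
  socket.on('file', (data) => {
    fs.writeFile('received.bin', data, (err) => {
      if (err) console.error(err);
    });
  });

  // sending: emitting a Buffer transmits it as binary
  fs.readFile('outgoing.bin', (err, buf) => {
    if (err) return console.error(err);
    socket.emit('file', buf);
  });
});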

piping node.js object streams to multiple destinations is producing bizarre results -- why?

When piping one transform stream to two other transform streams, occasionally I'm getting a few of the objects from one destination stream appearing in place of the proper objects in the other destination stream. In a stream of 90,000 objects, in about 1 out of 3 runs, about 10 objects starting at around sequence number 10,000 come from the wrong stream (the start position and number of anomalous objects vary). What in the world could account for such bizarre results?
The setup:
sourceStream.pipe(processingStream1).pipe(check1);
processingStream1.pipe(check2).pipe(destinationStream1);
processingStream1.pipe(processingStream2).pipe(destinationStream2);
The sourceStream is a transform stream fed by a file read. The two destination streams are transform streams leading to file writes. Both the file read and file write are through the fs streaming API. All the streams rely on node.js automatic backpressure in piping.
Occasionally objects from processingStream2 are leaking into destinationStream1, as described above.
The checking streams (check1 a sink, check2 a passthrough) show the anomalous objects exist in the stream through check2 but not in the stream into check1.
The file reads and writes are of text (CSV) files. I'm using Node.js version 8.6 on Windows 7 (though perhaps deserved, please don't throw rocks at me for the latter).
Suggestions on how to better isolate the problem are also welcome. The anomaly is structured enough that it doesn't seem like a generic memory problem, but it isn't consistent enough to be an obvious code error. I'm mystified.
Ugh! processingStream2 modifies the object in the stream coming through it (actually, it modifies a property of a sub-object). Apparently you can't count on the order of the pipes to control the order of changes to the streamed objects: both destinations receive references to the same object, so very occasionally the object processingStream2 has already modified is the very same object still queued for destinationStream1. Probably as part of some optimization under the hood.
Lesson learned: don't change the input streamed object when piping to multiple destinations, even if you think you're making the change downstream. May you never have to learn this lesson the hard way!
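As a hedged sketch of that lesson (not the original code): one way out is to have the second processing stream copy each object before touching it, so the instance shared with the other destination is never mutated. The property names here are hypothetical.

const { Transform } = require('stream');

const processingStream2 = new Transform({
  objectMode: true,
  transform(obj, _enc, callback) {
    // Deep-copy first: both pipes receive references to the same object,
    // so mutating it in place is visible to the other destination too.
    const copy = JSON.parse(JSON.stringify(obj));
    copy.subObject.property = 'modified'; // hypothetical mutation
    callback(null, copy);
  }
});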

NodeJS Request Pipe buffer size

How can I set up the maximum buffer size on a Node.js Request pipe? I'm trying to use AWS Lambda to download from a source and pipe the upload to a destination, like in the code below:
request(source).pipe(request(destination))
This code works fine, but if the file is bigger than the AWS Lambda memory size, it crashes. If I increase the memory it works, so I know it's not the timeout or the link, only the memory allocation. I'd rather not increase the number, and even at the maximum it's still 1.5GB, while I'm expecting to transfer files bigger than that.
Is there a global variable for NodeJS on AWS Lambda for this? Or any other suggestion?
Two things to consider:
Do not use request(source).pipe(request(destination)) with or within a promise (async/await). For some reason it leaks memory when done with promises.
"However, STREAMING THE RESPONSE (e.g. .pipe(...)) is DISCOURAGED because Request-Promise would grow the memory footprint for large requests unnecessarily high. Use the original Request library for that. You can use both libraries in the same project." Source: https://www.npmjs.com/package/request-promise
To control how much memory the pipe uses: Set the highWaterMark for BOTH ends of the pipe. I REPEAT: BOTH ENDS OF THE PIPE. This will force the pipe to let only so much data into the pipe and out of the pipe, and thus limits its occupation in memory. (But does not limit how fast data moves through the pipe...see Bonus)
request.get(sourceUrl, {highWaterMark: 1024000, encoding: null}).pipe(request(destinationUrl, {highWaterMark: 1024000}));
1024000 is in bytes and is approximately 1MB.
Source for highWaterMark background:
"Because Duplex and Transform streams are both Readable and Writable, each maintains two separate internal buffers used for reading and writing, allowing each side to operate independently of the other while maintaining an appropriate and efficient flow of data. For example, net.Socket instances are Duplex streams whose Readable side allows consumption of data received from the socket and whose Writable side allows writing data to the socket. Because data may be written to the socket at a faster or slower rate than data is received, it is important for each side to operate (and buffer) independently of the other." <- last sentence here is the important part.
https://nodejs.org/api/stream.html#stream_readable_pipe_destination_options
Bonus: If you want to throttle how fast data passes through the pipe, check something like this out: https://www.npmjs.com/package/stream-throttle
const throttle = require('stream-throttle');
let th = new throttle.Throttle({rate: 10240000}); // if you don't want to transfer data faster than ~10MB/sec
request.get(sourceUrl, {highWaterMark: 1024000, encoding: null}).pipe(th).pipe(request(destinationUrl, {highWaterMark: 1024000}));

Is the Node.js stream pipe symmetric?

I'm building a server which transfers files from endpoint A to endpoint B.
I'm wondering whether the Node.js stream pipe is symmetric.
If I do the following: request.get(A).pipe(request.put(B));, does it upload as fast as it downloads?
I'm asking because my server has an asymmetric connection (it downloads faster than it uploads), and I'm trying to avoid excessive memory consumption.
According to Node's documentation on stream#pipe:
pipe switches the readable stream into flowing mode - it reads more only as the writable stream finishes consuming the previous chunks.
readable.pipe() method attaches a Writable stream to the readable, causing it to switch automatically into flowing mode and push all of its data to the attached Writable. The flow of data will be automatically managed so that the destination Writable stream is not overwhelmed by a faster Readable stream.
So your transfer may be asymmetric due to the different download/upload speeds - the difference may be buffered in Node's memory. See Buffering of streams:
Buffering

Both Writable and Readable streams will store data in an internal buffer that can be retrieved using writable._writableState.getBuffer() or readable._readableState.buffer, respectively.

The amount of data potentially buffered depends on the highWaterMark option passed into the stream's constructor. For normal streams, the highWaterMark option specifies a total number of bytes. For streams operating in object mode, the highWaterMark specifies a total number of objects.

Data is buffered in Readable streams when the implementation calls stream.push(chunk). If the consumer of the Stream does not call stream.read(), the data will sit in the internal queue until it is consumed.

Once the total size of the internal read buffer reaches the threshold specified by highWaterMark, the stream will temporarily stop reading data from the underlying resource until the data currently buffered can be consumed (that is, the stream will stop calling the internal readable._read() method that is used to fill the read buffer).

Data is buffered in Writable streams when the writable.write(chunk) method is called repeatedly. While the total size of the internal write buffer is below the threshold set by highWaterMark, calls to writable.write() will return true. Once the size of the internal buffer reaches or exceeds the highWaterMark, false will be returned.

A key goal of the stream API, and in particular the stream.pipe() method, is to limit the buffering of data to acceptable levels such that sources and destinations of differing speeds will not overwhelm the available memory.

Because Duplex and Transform streams are both Readable and Writable, each maintains two separate internal buffers used for reading and writing, allowing each side to operate independently of the other while maintaining an appropriate and efficient flow of data. For example, net.Socket instances are Duplex streams whose Readable side allows consumption of data received from the socket and whose Writable side allows writing data to the socket. Because data may be written to the socket at a faster or slower rate than data is received, it is important each side operate (and buffer) independently of the other.
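The writable.write()/highWaterMark behaviour quoted above is easy to observe directly; a minimal sketch, where the setTimeout just simulates a slow destination:

const { Writable } = require('stream');

const slow = new Writable({
  highWaterMark: 4, // bytes
  write(chunk, _enc, callback) {
    setTimeout(callback, 100); // pretend each chunk takes 100 ms to flush
  }
});

console.log(slow.write(Buffer.from('abc')));    // true  - buffer still below 4 bytes
console.log(slow.write(Buffer.from('abcdef'))); // false - threshold reached, back off
slow.on('drain', () => console.log('drained, safe to write again'));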
I recommend that you look at this question, where the topic is elaborated on further.
If you run the following sample
const http = require('http');
http.request({method:'GET', host:'somehost.com', path: '/cat-picture.jpg'}, (response)=>{
console.log(response);
}).end()
you can explore the underlying sockets - on my system they all have the property highWaterMark: 16384. So if I understand the documentation and the above-mentioned question correctly, in your case about 16KB may be buffered in the faster GET socket at the Node.js level - what happens below that is probably highly dependent on your system/network configuration.

How do I change the volume of a PCM audio stream in node?

I am trying to figure out how to adjust the volume level of a PCM audio stream in node.
I have looked all over npmjs.org at all of the modules that I could find for working with audio, but haven't found anything that will take a stream in, change the volume, and give me a stream out.
Are there any modules that exist that can do this, perhaps even something that wasn't made specifically for it?
If not, then I could create a module, if someone can point me in the right direction for modifying a stream byte by byte.
Here is what I am trying to accomplish:
I am writing a program to receive several PCM audio streams, and mix them for several outputs with varying volume levels. Example:
inputs        vol    output
music         25%    output 1
live audio    80%    output 1
microphone     0%    output 1
music        100%    output 2
live audio     0%    output 2
microphone     0%    output 2
What type of connection are you using? (Would make it easier to give example code)
What you basically want to do is create a connection, then add a listener for the 'data' event on the connection or request object. If you don't set an encoding, the data parameter of the callback will be a Buffer. The 'data' event is triggered after each chunk is delivered through the network.
The Buffer gives you byte-level access to the data stream using regular JavaScript number values. You can then parse each chunk, keeping chunks in memory across multiple 'data' events using a closure (in order to buffer multiple chunks), and when appropriate write the parsed and processed data to a socket (another socket, or the same one in the case of a bi-directional socket). Don't forget to manage your closure in order to avoid memory leaks!
This is just an abstract description. Let me know if anything needs clarification.
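To make it concrete, here is a hedged sketch of a Transform stream that scales 16-bit little-endian signed PCM (the common format; adjust the read/write calls if your streams differ):

const { Transform } = require('stream');

class Volume extends Transform {
  constructor(volume) {
    super();
    this.volume = volume; // e.g. 0.25 for 25%
  }
  _transform(chunk, _enc, callback) {
    // NOTE: assumes chunks break on 2-byte sample boundaries; production
    // code should carry any leftover byte over to the next chunk.
    const out = Buffer.alloc(chunk.length);
    for (let i = 0; i + 1 < chunk.length; i += 2) {
      let sample = Math.round(chunk.readInt16LE(i) * this.volume);
      sample = Math.max(-32768, Math.min(32767, sample)); // clamp to Int16
      out.writeInt16LE(sample, i);
    }
    callback(null, out);
  }
}

// usage, e.g. for the "music 25% -> output 1" row above:
// musicIn.pipe(new Volume(0.25)).pipe(output1);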
