Create read stream from Buffer for uploading to s3 [duplicate] - node.js

I have a library that takes a ReadableStream as input, but my input is just a base64-encoded image. I can convert the data I have into a Buffer like so:
var img = new Buffer(img_string, 'base64');
But I have no idea how to convert either the base64 string or the Buffer I obtained into a ReadableStream.
Is there a way to do this?

For nodejs 10.17.0 and up:
const { Readable } = require('stream');
const stream = Readable.from(myBuffer);
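Tying this back to the question's title: the resulting stream can be handed to anything that accepts a readable stream, for example an S3 upload. A minimal sketch, assuming the aws-sdk v2 client and placeholder bucket/key names (not part of the original answer):
const { Readable } = require('stream');
const AWS = require('aws-sdk'); // assumes aws-sdk v2 is installed and credentials are configured

const s3 = new AWS.S3();
const body = Readable.from(Buffer.from(img_string, 'base64'));

// s3.upload() accepts a stream as Body and handles the chunked upload itself
s3.upload({ Bucket: 'my-bucket', Key: 'image.png', Body: body })
  .promise()
  .then(result => console.log('uploaded to', result.Location))
  .catch(console.error);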

something like this...
import { Readable } from 'stream'
const buffer = Buffer.from(img_string, 'base64')
const readable = new Readable()
readable._read = () => {} // _read is required but you can noop it
readable.push(buffer)
readable.push(null)
readable.pipe(consumer) // consume the stream
In the general case, a readable stream's _read function should collect data from the underlying source and push it incrementally, ensuring you don't pull a huge source into memory before it's needed.
In this case, though, you already have the source in memory, so _read is not required.
Pushing the whole buffer just wraps it in the readable stream API.
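For contrast, here is a rough sketch (not from the original answer) of what an incremental _read could look like if you did want to feed the buffer out in slices rather than pushing it all at once; the chunk size is arbitrary:
const { Readable } = require('stream');

function bufferToChunkedStream(buffer, chunkSize = 16 * 1024) {
  let offset = 0;
  return new Readable({
    read() {
      if (offset >= buffer.length) {
        this.push(null); // no more data to emit
        return;
      }
      // push one slice per _read call instead of the whole buffer at once
      this.push(buffer.slice(offset, offset + chunkSize));
      offset += chunkSize;
    },
  });
}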

Node Stream Buffer is obviously designed for use in testing; the inability to avoid a delay makes it a poor choice for production use.
Gabriel Llamas suggests streamifier in this answer: How to wrap a buffer as a stream2 Readable stream?

You can create a ReadableStream using Node Stream Buffers like so:
var streamBuffers = require('stream-buffers');

// Initialize stream
var myReadableStreamBuffer = new streamBuffers.ReadableStreamBuffer({
  frequency: 10, // in milliseconds.
  chunkSize: 2048 // in bytes.
});
// With a buffer
myReadableStreamBuffer.put(aBuffer);
// Or with a string
myReadableStreamBuffer.put("A String", "utf8");
The frequency cannot be 0 so this will introduce a certain delay.

You can use the standard NodeJS stream API for this - stream.Readable.from
const { Readable } = require('stream');
const stream = Readable.from(buffer);
Note: Don't convert a buffer to string (buffer.toString()) if the buffer contains binary data. It will lead to corrupted binary files.

You don't need to add a whole npm lib for a single file. I refactored it to TypeScript:
import { Readable, ReadableOptions } from "stream";

export class MultiStream extends Readable {
  _object: any;

  constructor(object: any, options: ReadableOptions) {
    super(object instanceof Buffer || typeof object === "string" ? options : { objectMode: true });
    this._object = object;
  }

  _read = () => {
    this.push(this._object);
    this._object = null;
  };
}
Based on node-streamifier (the best option, as mentioned above).
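Hypothetical usage of the class above (the import path is just an example; the options argument is passed straight through to Readable):
import { MultiStream } from "./multi-stream";

const readable = new MultiStream(Buffer.from("aGVsbG8gd29ybGQ=", "base64"), {});
readable.pipe(process.stdout); // prints "hello world"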

Here is a simple solution using streamifier module.
const streamifier = require('streamifier');
streamifier.createReadStream(Buffer.from([97, 98, 99])).pipe(process.stdout);
You can pass a String, Buffer or Object as its argument.

This is my simple code for this.
import { Readable } from 'stream';

const newStream = new Readable({
  read() {
    // push the data once, then signal end-of-stream so read() isn't called forever
    this.push(someBuffer);
    this.push(null);
  },
})

Try this:
const Duplex = require('stream').Duplex; // core NodeJS API
function bufferToStream(buffer) {
  let stream = new Duplex();
  stream.push(buffer);
  stream.push(null);
  return stream;
}
Source:
Brian Mancini -> http://derpturkey.com/buffer-to-stream-in-node/

Related

NodeJS Stream flushed during the Event Loop iteration

I'm trying to pipe one Axios response stream into multiple files. It's not working, and I can reproduce it with the simple code below:
Will work:
const { PassThrough } = require('stream')
const inputStream = new PassThrough()
inputStream.write('foo')
// Now I have a stream with content
inputStream.pipe(process.stdout)
inputStream.pipe(process.stderr)
// will print 'foofoo', for both stdout and stderr
Will not work:
const { PassThrough } = require('stream')
const inputStream = new PassThrough()
inputStream.write('foo')
inputStream.pipe(process.stdout)
setImmediate(() => {
inputStream.pipe(process.stderr)
})
// Will print only 'foo'
The question is: can I say that the existing content of the stream will be piped only if the two pipe calls execute in the same event-loop iteration?
Doesn't that make the situation non-deterministic?
By the time the callback scheduled with setImmediate is executed, the stream data has already been flushed. This can be checked via the stream's .readableLength property.
You can use cork and uncork in order to control when the buffered stream data is flushed.
const { PassThrough } = require('stream')
const inputStream = new PassThrough()
inputStream.cork()
inputStream.write('foo')
inputStream.pipe(process.stdout)
setImmediate(() => {
inputStream.pipe(process.stderr)
inputStream.uncork()
})

How to mock streams in NodeJS

I'm attempting to unit test one of my Node.js modules, which deals heavily in streams. I'm trying to mock a stream (that I will write to), since within my module I have .on('data') / .on('end') listeners that I would like to trigger. Essentially I want to be able to do something like this:
var mockedStream = new require('stream').Readable();
mockedStream.on('data', function withData(data) {
  console.dir(data);
});
mockedStream.on('end', function() {
  console.dir('goodbye');
});
mockedStream.push('hello world');
mockedStream.close();
This executes, but the 'data' event never fires after I push (and .close() is invalid).
All the guidance I can find on streams uses the 'fs' or 'net' library as a basis for creating a new stream (https://github.com/substack/stream-handbook), or mocks it out with sinon, but the mocking gets very lengthy very quickly.
Is there a nice way to provide a dummy stream like this?
There's a simpler way: stream.PassThrough
I've just found Node's very easy to miss stream.PassThrough class, which I believe is what you're looking for.
From Node docs:
The stream.PassThrough class is a trivial implementation of a Transform stream that simply passes the input bytes across to the output. Its purpose is primarily for examples and testing...
The code from the question, modified:
const { PassThrough } = require('stream');
const mockedStream = new PassThrough(); // <----
mockedStream.on('data', (d) => {
console.dir(d);
});
mockedStream.on('end', function() {
console.dir('goodbye');
});
mockedStream.emit('data', 'hello world');
mockedStream.end(); // <-- end. not close.
mockedStream.destroy();
mockedStream.push() works too, but the data arrives as a Buffer, so you might want to do: console.dir(d.toString());
Instead of using push, I should have been using .emit(<event>, <data>).
My mock code now works and looks like:
var mockedStream = new require('stream').Readable();
mockedStream._read = function(size) { /* do nothing */ };
myModule.functionIWantToTest(mockedStream); // has .on() listeners in it
mockedStream.emit('data', 'Hello data!');
mockedStream.emit('end');
The accepted answer is only partially correct. If all you need is for events to fire, using .emit('data', datum) is okay, but if you need to pipe this mock stream anywhere else it won't work.
Mocking a Readable stream is surprisingly easy, requiring only the Readable lib.
const { Readable } = require('stream');

let eventCount = 0;
const mockEventStream = new Readable({
  objectMode: true,
  read: function (size) {
    if (eventCount < 10) {
      eventCount = eventCount + 1;
      return this.push({ message: `event${eventCount}` });
    } else {
      return this.push(null);
    }
  }
});
Now you can pipe this stream wherever and 'data' and 'end' will fire.
Another example from the node docs:
https://nodejs.org/api/stream.html#stream_an_example_counting_stream
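As a quick illustration (a sketch, reusing the mockEventStream defined above), you can pipe it into an object-mode Writable and observe both the 'data' flow and the end of the stream:
const { Writable } = require('stream');

const sink = new Writable({
  objectMode: true,
  write(event, _enc, callback) {
    console.log('got', event.message); // event1 .. event10
    callback();
  },
});

mockEventStream.pipe(sink).on('finish', () => console.log('done'));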
Building on #flacnut's answer, I did this (in Node.js 12+) using Readable.from() to construct a stream preloaded with data (a list of filenames):
const mockStream = require('stream').Readable.from([
'file1.txt',
'file2.txt',
'file3.txt',
])
In my case, I wanted to mock the stream of filenames returned by fast-glob.stream:
const glob = require('fast-glob')
// inject the mock stream into glob module
glob.stream = jest.fn().mockReturnValue(mockStream)
In the function being tested:
const stream = glob.stream(globFilespec)
for await (const filename of stream) {
// filename = file1.txt, then file2.txt, then file3.txt
}
Works like a charm!
Here's a simple implementation which uses jest.fn() where the goal is to validate what has been written to the stream created by fs.createWriteStream(). The nice thing about jest.fn() is that although the calls to fs.createWriteStream() and stream.write() are inline in this test function, these functions don't need to be called directly by the test.
const fs = require('fs');
const mockStream = {}
test('mock fs.createWriteStream with mock implementation', async () => {
  const createMockWriteStream = (filename, args) => {
    return mockStream;
  }
  mockStream.write = jest.fn();
  fs.createWriteStream = jest.fn(createMockWriteStream);
  const stream = fs.createWriteStream('foo.csv', {'flags': 'a'});
  await stream.write('foobar');
  expect(fs.createWriteStream).toHaveBeenCalledWith('foo.csv', {'flags': 'a'});
  expect(mockStream.write).toHaveBeenCalledWith('foobar');
})

Meteor: ArrayBuffer (FileReader result) is not passed to Meteor.method()

I have this event (upload of an image file using <input type="file">):
"change .logoBusinessBig-upload":function(event, template){
var reader = new FileReader()
reader.addEventListener("load", function(evt){
var x = reader.result
console.log(x)
Meteor.call("saveFile", x)
})
reader.readAsArrayBuffer(event.currentTarget.files[0])
}
and this Meteor.method()
saveFile: function(file) {
  console.log(file)
  var fs = Npm.require("fs")
  fs.writeFile('../../../../../public/jow.txt', file, function (err) {
    console.log("file saved")
  });
}
The console.log(x) in the event outputs an ArrayBuffer object, while the console.log(file) in the Meteor.method() shows an empty {} object.
Why is that? The ArrayBuffer should have been passed to the Meteor.method()
//client.js
'change': function(event, template) {
  event.preventDefault();
  var file = event.target.files[0]; //assuming you have only 1 file
  var reader = new FileReader(); //create a reader according to HTML5 File API
  reader.onload = function(event) {
    var buffer = new Uint8Array(reader.result) // convert to binary
    Meteor.call('saveFile', buffer);
  }
  reader.readAsArrayBuffer(file); //read the file as arraybuffer
}

//server.js
'saveFile': function(buffer) {
  fs.writeFile('/location', new Buffer(buffer), function(error) {...});
}
You can't save to the /public folder; this triggers a reload.
Client-server communication via methods in Meteor uses the DDP protocol, which only supports EJSON-able data types and does not allow the transmission of more complex objects like your ArrayBuffer, which is why you don't see it on the server.
I suggest you read the file as a binary string, send it to your method like that and then manipulate it (either via an ArrayBuffer or by some other means) once it's on the server.
Seeing that EJSON encodes typed arrays as base64 strings, it doesn't matter whether you use EJSON or a DataURL - they are equally inefficient (increasing bandwidth use by about 30%).
So this:
reader.onload = function(event) {
  var buffer = new Uint8Array(reader.result) // convert to binary
  Meteor.call('saveFile', buffer); // will convert to EJSON/base64
}
reader.readAsArrayBuffer(file); //read the file as arraybuffer
is equivalent to
reader.onload = function(event) {
  Meteor.call('saveFile', reader.result);
}
reader.readAsDataURL(file); //read the file as a DataURL (base64)
The last version is a line shorter on the client side, but adds a line on the server side when you unpack the file to trim off the MIME-type prefix, typically something like
new Buffer(dataURI.replace(/^data:.{1,20}\/.{1,30};base64,/, ''), 'base64');
The alternative: XHR
So neither is really more efficient. If you want to save on bandwidth, try doing this bit with XHR, which natively supports all the binary types (File, ArrayBuffer, Blob). You might need to handle it outside of Meteor, perhaps as a small Express app with a route handled by a front-end proxy like NginX.
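For illustration only, a minimal client-side sketch of that XHR approach; the /upload route and whatever serves it are hypothetical and not part of Meteor itself:
reader.onload = function () {
  var xhr = new XMLHttpRequest();
  xhr.open('POST', '/upload'); // hypothetical route handled outside Meteor's method layer
  xhr.setRequestHeader('Content-Type', 'application/octet-stream');
  xhr.send(reader.result); // the ArrayBuffer is sent as raw binary, no base64 overhead
};
reader.readAsArrayBuffer(file);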

Is it possible to register multiple listeners to a child process's stdout data event? [duplicate]

I need to run two commands in series that need to read data from the same stream.
After piping a stream into another, the buffer is emptied, so I can't read data from that stream again; this doesn't work:
var spawn = require('child_process').spawn;
var fs = require('fs');
var request = require('request');
var inputStream = request('http://placehold.it/640x360');
var identify = spawn('identify',['-']);
inputStream.pipe(identify.stdin);
var chunks = [];
identify.stdout.on('data', function(chunk) {
  chunks.push(chunk);
});
identify.stdout.on('end', function() {
  var size = getSize(Buffer.concat(chunks)); //width
  var convert = spawn('convert', ['-', '-scale', size * 0.5, 'png:-']);
  inputStream.pipe(convert.stdin);
  convert.stdout.pipe(fs.createWriteStream('half.png'));
});
function getSize(buffer) {
  return parseInt(buffer.toString().split(' ')[2].split('x')[0]);
}
Request complains about this
Error: You cannot pipe after data has been emitted from the response.
and changing the inputStream to fs.createReadStream yields the same issue, of course.
I don't want to write to a file; I want to reuse, in some way, the stream that request produces (or any other stream, for that matter).
Is there a way to reuse a readable stream once it finishes piping?
What would be the best way to accomplish something like the above example?
You have to create a duplicate of the stream by piping it to two streams. You can create a simple stream with a PassThrough stream; it simply passes the input to the output.
const spawn = require('child_process').spawn;
const PassThrough = require('stream').PassThrough;
const a = spawn('echo', ['hi user']);
const b = new PassThrough();
const c = new PassThrough();
a.stdout.pipe(b);
a.stdout.pipe(c);
let count = 0;
b.on('data', function (chunk) {
count += chunk.length;
});
b.on('end', function () {
console.log(count);
c.pipe(process.stdout);
});
Output:
8
hi user
The first answer only works if streams take roughly the same amount of time to process data. If one takes significantly longer, the faster one will request new data, consequently overwriting the data still being used by the slower one (I had this problem after trying to solve it using a duplicate stream).
The following pattern worked very well for me. It uses a library based on Stream2 streams, Streamz, and Promises to synchronize async streams via a callback. Using the familiar example from the first answer:
var spawn = require('child_process').spawn;
var pass = require('stream').PassThrough;
var streamz = require('streamz').PassThrough;
var Promise = require('bluebird');

var a = spawn('echo', ['hi user']);
var b = new pass;
var c = new pass;
a.stdout.pipe(streamz(combineStreamOperations));

function combineStreamOperations(data, next) {
  Promise.join(b, c, function(b, c) { //perform n operations on the same data
    next(); //request more
  });
}

var count = 0;
b.on('data', function(chunk) { count += chunk.length; });
b.on('end', function() { console.log(count); c.pipe(process.stdout); });
You can use this small npm package I created:
readable-stream-clone
With this you can reuse readable streams as many times as you need
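Usage sketch, based on the package's README (check the current docs in case the API has changed):
const fs = require('fs');
const ReadableStreamClone = require('readable-stream-clone');

const source = fs.createReadStream('image.png');
const copy1 = new ReadableStreamClone(source);
const copy2 = new ReadableStreamClone(source);

copy1.pipe(fs.createWriteStream('image-copy-1.png'));
copy2.pipe(fs.createWriteStream('image-copy-2.png'));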
For the general problem, the following code works fine:
var PassThrough = require('stream').PassThrough
var a = new PassThrough()
var b1 = new PassThrough()
var b2 = new PassThrough()

a.pipe(b1)
a.pipe(b2)

b1.on('data', function(data) {
  console.log('b1:', data.toString())
})
b2.on('data', function(data) {
  console.log('b2:', data.toString())
})

a.write('text')
I have a different solution for writing to two streams simultaneously. Naturally, the time to write will be the sum of the two times, but I use it to answer a download request where I want to keep a copy of the downloaded file on my server (actually I use an S3 backup, so I cache the most-used files locally to avoid multiple transfers).
/**
 * A utility class made to write to a file while answering a file download request
 */
class TwoOutputStreams {
  constructor(streamOne, streamTwo) {
    this.streamOne = streamOne
    this.streamTwo = streamTwo
  }

  setHeader(header, value) {
    if (this.streamOne.setHeader)
      this.streamOne.setHeader(header, value)
    if (this.streamTwo.setHeader)
      this.streamTwo.setHeader(header, value)
  }

  write(chunk) {
    this.streamOne.write(chunk)
    this.streamTwo.write(chunk)
  }

  end() {
    this.streamOne.end()
    this.streamTwo.end()
  }
}
You can then use this as a regular OutputStream
const twoStreamsOut = new TwoOutputStreams(fileOut, responseStream)
and pass it to your method as if it were a response or a fileOutputStream.
If you have async operations on the PassThrough streams, the answers posted here won't work.
A solution that works for async operations includes buffering the stream content and then creating streams from the buffered result.
To buffer the result you can use concat-stream
const Promise = require('bluebird');
const concat = require('concat-stream');

const getBuffer = function(stream) {
  return new Promise(function(resolve, reject) {
    var gotBuffer = function(buffer) {
      resolve(buffer);
    }
    var concatStream = concat(gotBuffer);
    stream.on('error', reject);
    stream.pipe(concatStream);
  });
}
To create streams from the buffer you can use:
const { Readable } = require('stream');

const getBufferStream = function(buffer) {
  const stream = new Readable();
  stream.push(buffer);
  stream.push(null);
  return Promise.resolve(stream);
}
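Putting the two helpers together, a sketch of the overall pattern: drain the source once into a buffer, then mint an independent stream for each async consumer:
async function consumeTwice(sourceStream) {
  const buffer = await getBuffer(sourceStream);  // buffer the source once
  const streamA = await getBufferStream(buffer); // independent copies of the data
  const streamB = await getBufferStream(buffer);
  streamA.pipe(process.stdout);
  streamB.pipe(process.stderr);
}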
What about piping into two or more streams not at the same time?
For example:
var PassThrough = require('stream').PassThrough;
var mybinaryStream = stream.start(); //never ending audio stream
var file1 = fs.createWriteStream('file1.wav', {encoding: 'binary'})
var file2 = fs.createWriteStream('file2.wav', {encoding: 'binary'})
var mypass = new PassThrough()
mybinaryStream.pipe(mypass)
mypass.pipe(file1)
setTimeout(function() {
  mypass.pipe(file2);
}, 2000)
The above code does not produce any errors, but file2 is empty.

write base64 to file using stream

I am sending a base64 string to my server. On the server I want to create a readable stream that I push the base64 chunks onto, which then goes to a writable stream and is written to a file. My problem is that only the first chunk is written to the file. My guess is that creating a new Buffer with each chunk is causing the problem, but if I push the string chunks in without creating the Buffer, the image file is corrupt.
var readable = new stream.Readable();
readable._read = function() {}

req.on('data', function(data) {
  var dataText = data.toString();
  var dataMatch = dataText.match(/^data:([A-Za-z-+\/]+);base64,(.+)$/);
  var bufferData = null;
  if (dataMatch) {
    bufferData = new Buffer(dataMatch[2], 'base64')
  } else {
    bufferData = new Buffer(dataText, 'base64')
  }
  readable.push(bufferData)
})

req.on('end', function() {
  readable.push(null);
})
This is not as trivial as you might think:
1. Use a Transform, not a Readable. You can pipe the request stream into the transform, which handles backpressure.
2. You can't use regular expressions, because the text you are expecting can be broken across two or more chunks. You could accumulate chunks and run the regular expression each time, but if the format of the stream is incorrect (that is, not a data URI) you will end up buffering the whole request and running the regular expression many times over a megabytes-long string.
3. You can't take an arbitrary chunk and do new Buffer(chunk, 'base64'), because it may not be valid base64 on its own. Example: new Buffer('AQID', 'base64') yields new Buffer([1, 2, 3]), but Buffer.concat([new Buffer('AQ', 'base64'), new Buffer('ID', 'base64')]) yields new Buffer([1, 32]).
For the third problem you can use one of the available modules (like base64-stream). Here is an example:
var base64 = require('base64-stream');
var stream = require('stream');
var decoder = base64.decode();
var input = new stream.PassThrough();
var output = new stream.PassThrough();
input.pipe(decoder).pipe(output);
output.on('data', function (data) {
console.log(data);
});
input.write('AQ');
input.write('ID');
You can see that it buffers the input and emits data as soon as enough has arrived.
As for the second problem, you need to implement a simple stream parser. As an idea: wait for the data: prefix, then buffer chunks (if you need them) until ;base64, is found, then pipe the rest to base64-stream, as in the sketch below.
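A rough sketch of that parser idea (not from the original answer): a small Transform strips everything up to and including ;base64,, and the rest is piped into the base64-stream decoder used above. It assumes the prefix fits in the buffered chunks and that the module's decode() export is available:
var base64 = require('base64-stream');
var { Transform } = require('stream');

function stripDataUriPrefix() {
  var buffered = '';
  var headerDone = false;
  return new Transform({
    transform(chunk, _enc, callback) {
      if (headerDone) return callback(null, chunk);
      buffered += chunk.toString();
      var idx = buffered.indexOf(';base64,');
      if (idx === -1) return callback(); // keep buffering until the prefix is complete
      headerDone = true;
      callback(null, buffered.slice(idx + ';base64,'.length));
    },
  });
}

// usage: req.pipe(stripDataUriPrefix()).pipe(base64.decode()).pipe(fs.createWriteStream('out.png'));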
