I am sending a base64 string to my server. On the server I want to create a readable stream that I push the base64 chunks onto, pipe it to a writable stream, and write the result to a file. My problem is that only the first chunk is written to the file. My guess is that creating a new buffer for each chunk is what causes the problem, but if I push the string chunks without creating the buffer, the image file is corrupt.
var readable = new stream.Readable();
readable._read = function() {};

req.on('data', function(data) {
  var dataText = data.toString();
  var dataMatch = dataText.match(/^data:([A-Za-z-+\/]+);base64,(.+)$/);
  var bufferData = null;
  if (dataMatch) {
    bufferData = new Buffer(dataMatch[2], 'base64');
  } else {
    bufferData = new Buffer(dataText, 'base64');
  }
  readable.push(bufferData);
});

req.on('end', function() {
  readable.push(null);
});
This is not as trivial as you might think:
1. Use a Transform, not a Readable. You can pipe the request stream to the transform, thus handling back pressure.
2. You can't use regular expressions, because the text you are expecting can be broken into two or more chunks. You could try to accumulate chunks and run the regular expression each time, but if the format of the stream is incorrect (that is, not a data URI) you will end up buffering the whole request and running the regular expression many times on a megabytes-long string.
3. You can't take an arbitrary chunk and do new Buffer(chunk, 'base64'), because it may not be valid on its own. Example: new Buffer('AQID', 'base64') yields new Buffer([1, 2, 3]), but Buffer.concat([new Buffer('AQ', 'base64'), new Buffer('ID', 'base64')]) yields new Buffer([1, 32]).
For the third problem you can use one of the available modules (like base64-stream). Here is an example:
var base64 = require('base64-stream');
var stream = require('stream');
var decoder = base64.decode();
var input = new stream.PassThrough();
var output = new stream.PassThrough();
input.pipe(decoder).pipe(output);
output.on('data', function (data) {
  console.log(data);
});
input.write('AQ');
input.write('ID');
You can see that it buffers the input and emits data as soon as enough has arrived.
As for the second problem, you need to implement a simple stream parser. As an idea: wait for the "data:" string, then buffer chunks (if you need them) until ";base64," is found, then pipe the rest to base64-stream.
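For the second problem, here is a minimal sketch of such a parser, written as a Transform. Assumptions: it only looks for the ";base64," marker and skips validating the MIME type, and the 1024-byte cap is an arbitrary guard against buffering a malformed request forever.
var stream = require('stream');
var util = require('util');
var base64 = require('base64-stream');

// Transform that strips a "data:<mime>;base64," prefix and passes the rest through.
function DataUriParser() {
  stream.Transform.call(this);
  this._header = '';    // accumulated header text, searched for the ";base64," marker
  this._inBody = false; // true once the prefix has been consumed
}
util.inherits(DataUriParser, stream.Transform);

DataUriParser.prototype._transform = function (chunk, encoding, done) {
  if (this._inBody) {
    this.push(chunk);
    return done();
  }
  this._header += chunk.toString('ascii');
  var idx = this._header.indexOf(';base64,');
  if (idx !== -1) {
    this._inBody = true;
    this.push(this._header.slice(idx + ';base64,'.length)); // remainder is base64 payload
  } else if (this._header.length > 1024) {
    return done(new Error('not a base64 data URI'));
  }
  done();
};

// Usage: req.pipe(new DataUriParser()).pipe(base64.decode()).pipe(fs.createWriteStream('out.png'));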
Related
I have a library that takes a ReadableStream as input, but my input is just a base64-encoded image. I could convert the data I have into a Buffer like so:
var img = new Buffer(img_string, 'base64');
But I have no idea how to convert it, or the Buffer I obtain, to a ReadableStream.
Is there a way to do this?
For Node.js 10.17.0 and up:
const { Readable } = require('stream');
const stream = Readable.from(myBuffer);
something like this...
import { Readable } from 'stream'
const buffer = Buffer.from(img_string, 'base64')
const readable = new Readable()
readable._read = () => {} // _read is required but you can noop it
readable.push(buffer)
readable.push(null)
readable.pipe(consumer) // consume the stream
In the general case, a readable stream's _read function should collect data from the underlying source and push it incrementally, ensuring you don't pull a huge source into memory before it's needed.
In this case, though, you already have the source in memory, so _read is not required.
Pushing the whole buffer just wraps it in the readable stream API.
Node Stream Buffer is obviously designed for use in testing; the inability to avoid a delay makes it a poor choice for production use.
Gabriel Llamas suggests streamifier in this answer: How to wrap a buffer as a stream2 Readable stream?
You can create a ReadableStream using Node Stream Buffers like so:
// Initialize stream (requires the stream-buffers package)
var streamBuffers = require('stream-buffers');

var myReadableStreamBuffer = new streamBuffers.ReadableStreamBuffer({
  frequency: 10,  // in milliseconds.
  chunkSize: 2048 // in bytes.
});
// With a buffer
myReadableStreamBuffer.put(aBuffer);
// Or with a string
myReadableStreamBuffer.put("A String", "utf8");
The frequency cannot be 0, so this will introduce a certain delay.
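To consume it, pipe it somewhere and call stop() once you have put everything; stop() is node-stream-buffers' way of signalling that no more data is coming, so the stream ends after draining. Treat this as a sketch against that API:
var fs = require('fs');

myReadableStreamBuffer.pipe(fs.createWriteStream('out.bin'));
myReadableStreamBuffer.stop(); // no more put() calls; ends once the internal buffer drains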
You can use the standard Node.js stream API for this: stream.Readable.from.
const { Readable } = require('stream');
const stream = Readable.from(buffer);
Note: Don't convert a buffer to a string (buffer.toString()) if the buffer contains binary data. It will lead to corrupted binary files.
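For example, to write the question's base64 image (img_string, as above) out to a file; the filename is just an example:
const { Readable } = require('stream');
const fs = require('fs');

const buffer = Buffer.from(img_string, 'base64');
Readable.from(buffer).pipe(fs.createWriteStream('image.png'));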
You don't need to add a whole npm lib for a single file. I refactored it to TypeScript:
import { Readable, ReadableOptions } from "stream";

export class MultiStream extends Readable {
  _object: any;

  constructor(object: any, options: ReadableOptions) {
    super(object instanceof Buffer || typeof object === "string" ? options : { objectMode: true });
    this._object = object;
  }

  _read = () => {
    this.push(this._object);
    this._object = null;
  };
}
Based on node-streamifier (the best option, as mentioned above).
Here is a simple solution using the streamifier module.
const streamifier = require('streamifier');
streamifier.createReadStream(Buffer.from([97, 98, 99])).pipe(process.stdout);
You can pass a String, Buffer, or Object as its argument.
This is my simple code for this.
import { Readable } from 'stream';
const newStream = new Readable({
  read() {
    this.push(someBuffer);
    this.push(null); // end the stream; otherwise read() keeps being called and the buffer repeats
  },
});
Try this:
const Duplex = require('stream').Duplex; // core NodeJS API

function bufferToStream(buffer) {
  let stream = new Duplex();
  stream.push(buffer);
  stream.push(null);
  return stream;
}
Source:
Brian Mancini -> http://derpturkey.com/buffer-to-stream-in-node/
I'm processing files in a multipart form with Busboy. The process in simplified version looks like this:
file.pipe(filePeeker).pipe(gzip).pipe(encrypt).pipe(uploadToS3)
filePeeker is a through-stream (built with through2). This duplex stream does the following:
- Identifies the file type by looking at the first bytes of the first chunk
- Accumulates the file size
- Calculates an MD5 hash
After the first four bytes of the first chunk I know whether the file is a zip file. If it is, I want to redirect the file to a completely different stream. In the new stream the compressed files will be unzipped and then handled separately, with the same concept as the original file.
How can I accomplish this?
OriginalProcess:
file.pipe(filePeeker).if(!zipFile).pipe(gZip).pipe(encrypt).pipe(uploadToS3)
UnZip-process
file.pipe(filePeeker).if(zipFile).pipe(streamUnzip).pipeEachNewFile(originalProcess).
There are modules for that, but the basic idea would be to push to another readable stream and return early in your conditional. Write a Transform stream for it.
var Transform = require("stream").Transform;
var util = require("util");
var Readable = require('stream').Readable;

var rs = new Readable({ read: function () {} }); // side channel for the zip branch
rs.pipe(unzip());

function BranchStream () {
  Transform.call(this);
}
util.inherits(BranchStream, Transform);

BranchStream.prototype._transform = function (chunk, encoding, done) {
  if (isZip(chunk)) {
    rs.push(chunk);
    return done();
  }
  this.push(doSomethingElseTo(chunk));
  return done();
};
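Hypothetically, reusing the question's stream names (filePeeker, gZip, and so on), the wiring could look like this; the zip chunks end up in rs, which is already piped to the unzip stream above:
file.pipe(filePeeker)
    .pipe(new BranchStream())                   // diverts zip chunks into rs
    .pipe(gZip).pipe(encrypt).pipe(uploadToS3); // non-zip data continues as before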
I have this event (upload of an image file using <input type="file">):
"change .logoBusinessBig-upload":function(event, template){
var reader = new FileReader()
reader.addEventListener("load", function(evt){
var x = reader.result
console.log(x)
Meteor.call("saveFile", x)
})
reader.readAsArrayBuffer(event.currentTarget.files[0])
}
and this Meteor.method()
saveFile: function(file) {
  console.log(file);
  var fs = Npm.require("fs");
  fs.writeFile('../../../../../public/jow.txt', file, function (err) {
    console.log("file saved");
  });
}
The console.log(x) in the event outputs an ArrayBuffer object, while the console.log(file) in the Meteor.method() shows an empty {} object.
Why is that? The ArrayBuffer should have been passed to the Meteor.method().
//client.js
'change': function(event, template) {
  event.preventDefault();
  var file = event.target.files[0]; // assuming you have only 1 file
  var reader = new FileReader();    // create a reader according to the HTML5 File API

  reader.onload = function(event) {
    var buffer = new Uint8Array(reader.result); // convert to binary
    Meteor.call('saveFile', buffer);
  };

  reader.readAsArrayBuffer(file); // read the file as an arraybuffer
}

//server.js
'saveFile': function(buffer) {
  fs.writeFile('/location', Buffer.from(buffer), function(error) {...});
}
You can't save to the /public folder; this triggers a reload.
Client-server communication via methods in Meteor uses the DDP protocol, which only supports EJSON-able data types and does not allow the transmission of more complex objects like your ArrayBuffer, which is why you don't see it on the server.
I suggest you read the file as a binary string, send it to your method like that, and then manipulate it (either via an ArrayBuffer or by some other means) once it's on the server.
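A rough sketch of that approach, mirroring the question's handlers (the target path is only a placeholder; as noted below, don't write into /public):
// client: read as a binary string instead of an ArrayBuffer
reader.onload = function () {
  Meteor.call('saveFile', reader.result); // a plain string passes through EJSON/DDP
};
reader.readAsBinaryString(event.currentTarget.files[0]);

// server: rebuild a Buffer from the binary string before writing
saveFile: function (binary) {
  var fs = Npm.require('fs');
  fs.writeFile('/tmp/jow.png', Buffer.from(binary, 'binary'), function (err) {
    if (err) console.log(err);
  });
}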
Seeing that EJSON encodes typed arrays as base64 strings, it doesn't matter whether you use EJSON or a DataURL: they are equally inefficient (increasing bandwidth use by about 30%).
So this:
reader.onload = function(event) {
  var buffer = new Uint8Array(reader.result); // convert to binary
  Meteor.call('saveFile', buffer);            // will be converted to EJSON/base64
};
reader.readAsArrayBuffer(file); // read the file as an arraybuffer
is equivalent to
reader.onload = function(event) {
  Meteor.call('saveFile', reader.result);
};
reader.readAsDataURL(file); // read the file as a DataURL (base64)
The last version is a line shorter on the client side, but adds a line on the server side when you unpack the file and trim off the MIME-type prefix, typically something like:
Buffer.from(dataURI.replace(/^data:.{1,20}\/.{1,30};base64,/, ''), 'base64');
The alternative: XHR
So neither is really more efficient. If you want to save on bandwidth, try doing this bit with XHR, which natively supports all the binary types (File, ArrayBuffer, Blob). You might need to handle it outside of Meteor, perhaps as a small Express app with a route handled by a front-end proxy like NginX.
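A minimal sketch of that XHR route (the /upload endpoint is hypothetical and would live outside Meteor, for example in a small Express app behind the proxy):
// client: send the File directly as binary, with no base64 inflation
var xhr = new XMLHttpRequest();
xhr.open('POST', '/upload');
xhr.setRequestHeader('Content-Type', 'application/octet-stream');
xhr.onload = function () { console.log('upload finished with status ' + xhr.status); };
xhr.send(file); // send() accepts File, Blob and ArrayBuffer directly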
I would like to read a .txt file, append data to the end, and finally send it to a zipstream. Right now what I'm doing is writing a new file and then using the new file for zipstream, but I would like to do it on the fly, without creating an unnecessary new file.
My question is how to create a read stream, modify it, and send it to another read stream (maybe with a write stream in the middle).
Is this possible?
The original idea was this one, but I'm lost somewhere in the middle:
var zipstream = require('zipstream');
var Stream = require('stream');

var zipOut = fs.createWriteStream('file.zip');
var zip = zipstream.createZip({ level: 1 });
zip.pipe(zipOut);

var rs = fs.createReadStream('file.txt');
var newRs = new Stream(); // << Here should be an in/out stream??
newRs.pipe = function(dest) {
  dest.write(rs.read());
  dest.write("New text at the end");
};

zip.addEntry(newRs, { name: 'file.txt' }, function() {
  zip.finalize();
});
You can implement a transform (a subclass of stream.Transform). Then in its _flush method you have the ability to output any content you want when the input stream has reached the end. This can be piped between a readable stream and a writable one. Refer to Node's stream module documentation for implementation details.
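A minimal sketch of that idea, reusing the question's zipstream setup (the appended text and file names come straight from the question):
var fs = require('fs');
var zipstream = require('zipstream');
var Transform = require('stream').Transform;

// Passes the file content through untouched and appends extra text once the input ends.
var appender = new Transform({
  transform: function (chunk, encoding, callback) {
    callback(null, chunk);            // forward the original content as-is
  },
  flush: function (callback) {
    this.push("New text at the end"); // emitted after the source stream has ended
    callback();
  }
});

var zipOut = fs.createWriteStream('file.zip');
var zip = zipstream.createZip({ level: 1 });
zip.pipe(zipOut);

zip.addEntry(fs.createReadStream('file.txt').pipe(appender), { name: 'file.txt' }, function () {
  zip.finalize();
});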
I need to run two commands in series that need to read data from the same stream.
After piping a stream into another, the buffer is emptied, so I can't read data from that stream again. This doesn't work:
var spawn = require('child_process').spawn;
var fs = require('fs');
var request = require('request');

var inputStream = request('http://placehold.it/640x360');
var identify = spawn('identify', ['-']);

inputStream.pipe(identify.stdin);

var chunks = [];
identify.stdout.on('data', function(chunk) {
  chunks.push(chunk);
});
identify.stdout.on('end', function() {
  var size = getSize(Buffer.concat(chunks)); // width
  var convert = spawn('convert', ['-', '-scale', size * 0.5, 'png:-']);
  inputStream.pipe(convert.stdin);
  convert.stdout.pipe(fs.createWriteStream('half.png'));
});

function getSize(buffer) {
  return parseInt(buffer.toString().split(' ')[2].split('x')[0]);
}
Request complains about this
Error: You cannot pipe after data has been emitted from the response.
and changing the inputStream to fs.createReadStream yields the same issue, of course.
I don't want to write to a file; I want to reuse, in some way, the stream that request produces (or any other, for that matter).
Is there a way to reuse a readable stream once it finishes piping?
What would be the best way to accomplish something like the above example?
You have to create a duplicate of the stream by piping it to two streams. You can create a simple duplicate with a PassThrough stream; it simply passes the input through to the output.
const spawn = require('child_process').spawn;
const PassThrough = require('stream').PassThrough;

const a = spawn('echo', ['hi user']);
const b = new PassThrough();
const c = new PassThrough();

a.stdout.pipe(b);
a.stdout.pipe(c);

let count = 0;
b.on('data', function (chunk) {
  count += chunk.length;
});
b.on('end', function () {
  console.log(count);
  c.pipe(process.stdout);
});
Output:
8
hi user
The first answer only works if the streams take roughly the same amount of time to process data. If one takes significantly longer, the faster one will request new data, consequently overwriting the data still being used by the slower one (I had this problem after trying to solve it using a duplicate stream).
The following pattern worked very well for me. It uses Streamz, a library based on Streams2, together with Promises to synchronize async streams via a callback. Using the familiar example from the first answer:
var spawn = require('child_process').spawn;
var pass = require('stream').PassThrough;
var streamz = require('streamz').PassThrough;
var Promise = require('bluebird');

var a = spawn('echo', ['hi user']);
var b = new pass();
var c = new pass();

a.stdout.pipe(streamz(combineStreamOperations));

function combineStreamOperations(data, next) {
  Promise.join(b, c, function(b, c) { // perform n operations on the same data
    next(); // request more
  });
}

var count = 0;
b.on('data', function(chunk) { count += chunk.length; });
b.on('end', function() { console.log(count); c.pipe(process.stdout); });
You can use this small npm package I created:
readable-stream-clone
With this you can reuse readable streams as many times as you need.
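Usage looks roughly like this (going from the package's README, so double-check the exact API there):
const fs = require('fs');
const ReadableStreamClone = require('readable-stream-clone');

const readStream = fs.createReadStream('text.txt');

const clone1 = new ReadableStreamClone(readStream);
const clone2 = new ReadableStreamClone(readStream);

clone1.pipe(fs.createWriteStream('copy1.txt'));
clone2.pipe(fs.createWriteStream('copy2.txt'));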
For the general problem, the following code works fine:
var PassThrough = require('stream').PassThrough;

var a = new PassThrough();
var b1 = new PassThrough();
var b2 = new PassThrough();

a.pipe(b1);
a.pipe(b2);

b1.on('data', function(data) {
  console.log('b1:', data.toString());
});
b2.on('data', function(data) {
  console.log('b2:', data.toString());
});

a.write('text');
I have a different solution to write to two streams simultaneously. Naturally, the time to write will be the sum of the two times, but I use it to respond to a download request where I want to keep a copy of the downloaded file on my server (actually I use an S3 backup, so I cache the most-used files locally to avoid multiple file transfers).
/**
 * A utility class made to write to a file while answering a file download request
 */
class TwoOutputStreams {
  constructor(streamOne, streamTwo) {
    this.streamOne = streamOne;
    this.streamTwo = streamTwo;
  }

  setHeader(header, value) {
    if (this.streamOne.setHeader)
      this.streamOne.setHeader(header, value);
    if (this.streamTwo.setHeader)
      this.streamTwo.setHeader(header, value);
  }

  write(chunk) {
    this.streamOne.write(chunk);
    this.streamTwo.write(chunk);
  }

  end() {
    this.streamOne.end();
    this.streamTwo.end();
  }
}
You can then use this as a regular output stream:
const twoStreamsOut = new TwoOutputStreams(fileOut, responseStream)
and pass it to your method as if it were a response or a fileOutputStream.
If you have async operations on the PassThrough streams, the answers posted here won't work.
A solution that works for async operations includes buffering the stream content and then creating streams from the buffered result.
To buffer the result you can use concat-stream
const Promise = require('bluebird');
const concat = require('concat-stream');

const getBuffer = function(stream) {
  return new Promise(function(resolve, reject) {
    var gotBuffer = function(buffer) {
      resolve(buffer);
    };
    var concatStream = concat(gotBuffer);
    stream.on('error', reject);
    stream.pipe(concatStream);
  });
};
To create streams from the buffer you can use:
const { Readable } = require('stream');

const getBufferStream = function(buffer) {
  const stream = new Readable();
  stream.push(buffer);
  stream.push(null);
  return Promise.resolve(stream);
};
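Putting the two together, each consumer gets its own stream built from the buffered content. Here firstConsumer and secondConsumer are placeholders for whatever writables you pipe into, for example the identify/convert stdin from the question:
getBuffer(inputStream)
  .then(function (buffer) {
    return Promise.all([getBufferStream(buffer), getBufferStream(buffer)]);
  })
  .then(function (streams) {
    streams[0].pipe(firstConsumer);  // placeholder writable
    streams[1].pipe(secondConsumer); // placeholder writable, can be attached much later
  });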
What about piping into two or more streams, not at the same time?
For example:
var PassThrough = require('stream').PassThrough;

var mybinaryStream = stream.start(); // never-ending audio stream
var file1 = fs.createWriteStream('file1.wav', { encoding: 'binary' });
var file2 = fs.createWriteStream('file2.wav', { encoding: 'binary' });
var mypass = new PassThrough();

mybinaryStream.pipe(mypass);
mypass.pipe(file1);

setTimeout(function() {
  mypass.pipe(file2);
}, 2000);
The above code does not produce any errors, but file2 is empty.