How to create a readStream from bytes in memory? - node.js

All of the examples of stream creation I have encountered are centered around files. I am working with an interface that requires me to pipe a read stream to a write stream. My input is raw bytes I have in memory, not a file.
https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options
How to accomplish ^^^ by passing in 'raw bytes' instead of a file descriptor?

This is what I got working (from How to create streams from string in Node.Js?):
function streamFromString(raw) {
  const Readable = require('stream').Readable;
  const s = new Readable();
  s._read = function noop() {}; // no-op _read; data is pushed manually
  s.push(raw);  // push the in-memory bytes
  s.push(null); // signal end of stream
  return s;
}
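The returned stream can then be piped to any writable destination. A minimal usage sketch (the output path and byte values are just placeholders):
const fs = require('fs');

// raw bytes already in memory
const raw = Buffer.from([0x01, 0x02, 0x03]);

// pipe the in-memory read stream into a write stream
streamFromString(raw).pipe(fs.createWriteStream('out.bin'));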

Related

How to write a single file while reading from multiple input streams in NodeJS

How to write a single file while reading from multiple input streams of the exact same file from different locations with NodeJS.
In case it's still not clear:
I want to increase download performance. Say we have two locations for the same file, and each can only serve 10 MB/s downstream; I want to download one part from the first location and another part from the second in parallel, to get the file at 20 MB/s.
So both streams need to be joined somehow, and each stream needs to know the byte range it is downloading.
I have two examples:
var http = require('http');
var fs = require('fs');

// will write to disk __dirname/file1.zip
function writeFile(fileStream) {
  //...
}

// This example assumes downloading from 2 HTTP locations
http.request('http://location1/file1.zip').pipe(writeFile);
http.request('http://location2/file1.zip').pipe(writeFile);
var fs = require('fs');

// will write to disk __dirname/file1.zip
function writeFile(fileStream) {
  //...
}

// this example is reading the same file from 2 different disks
fs.createReadStream('/mount/volume1/file1.zip').pipe(writeFile);
fs.createReadStream('/mount/volume2/file1.zip').pipe(writeFile);
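For reference, here is one hedged sketch of how two ranged downloads could be joined into a single file, assuming both servers honour HTTP Range requests; the URLs, total size, and file name are placeholders based on the examples above:
const http = require('http');
const fs = require('fs');

// download one byte range of the file and write it at the matching
// offset of the target file (flags 'r+' so the file is not truncated)
function downloadRange(url, dest, start, end, done) {
  const out = fs.createWriteStream(dest, { flags: 'r+', start });
  http.get(url, { headers: { Range: 'bytes=' + start + '-' + end } }, (res) => {
    res.pipe(out).on('finish', done);
  });
}

// assume the total size is already known (e.g. from a HEAD request)
const total = 20 * 1024 * 1024;
const half = Math.floor(total / 2);

// pre-allocate the target file, then fetch both halves in parallel
fs.writeFile('file1.zip', Buffer.alloc(total), () => {
  let finished = 0;
  const done = () => { if (++finished === 2) console.log('both ranges joined'); };
  downloadRange('http://location1/file1.zip', 'file1.zip', 0, half - 1, done);
  downloadRange('http://location2/file1.zip', 'file1.zip', half, total - 1, done);
});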
How I think it would work:
The read stream needs to check whether a given content range has already been written before reading the next chunk from each file, and maybe each stream should start reading at a different location in the file.
If the total content length is X, we divide it into smaller chunks and create a map where each entry has a fixed content length, so we know which parts we already have and which parts are still being downloaded.
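A minimal sketch of such a chunk map (the helper name and part size are made up):
// divide a file of totalLength bytes into fixed-size parts and
// remember the state of each part
const CHUNK_SIZE = 1024 * 1024; // 1 MiB per part

function buildChunkMap(totalLength) {
  const chunks = [];
  for (let start = 0; start < totalLength; start += CHUNK_SIZE) {
    chunks.push({
      start,
      end: Math.min(start + CHUNK_SIZE, totalLength) - 1, // inclusive byte range
      state: 'pending' // 'pending' | 'downloading' | 'written'
    });
  }
  return chunks;
}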
Trying to answer this question myself:
We can try a simple, optimistic chunked read:
const fs = require('fs');

let SIZE = 64; // read in 64-byte intervals
let buffers = [];
let bytesRead = 0;

function readParallel(filepath, callback) {
  fs.open(filepath, 'r', function (err, fd) {
    fs.fstat(fd, function (err, stats) {
      let fileSize = stats.size;
      while (bytesRead < fileSize) {
        let size = Math.min(SIZE, fileSize - bytesRead);
        let buffer = Buffer.alloc(size);
        let position = bytesRead; // where to read from in the file
        let length = size;
        let offset = 0;           // where to write into the buffer
        let read = fs.readSync(fd, buffer, offset, length, position);
        buffers.push(buffer);
        bytesRead += read;
      }
      // at the end: Buffer.concat(buffers) === "File Content"
      callback(Buffer.concat(buffers));
    });
  });
}
fs.createReadStream() has a start option you can pass to specify the byte offset at which to start reading:
let f = fs.createReadStream("myfile.txt", {start: 1000});
You could also open a normal file descriptor with fs.open(), then fs.read() one byte from a position right before where you want the stream to be positioned (using the position argument to fs.read()), and then pass that file descriptor into fs.createReadStream() as an option; the stream will then start with that file descriptor and position (though obviously the start option to fs.createReadStream() is a bit simpler).
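A hedged illustration of combining the two options (it assumes, as the fs docs state, that the path argument is ignored when an fd is supplied):
const fs = require('fs');

// open the file yourself, hand the descriptor to the stream,
// and still jump to a byte offset with the start option
fs.open('myfile.txt', 'r', (err, fd) => {
  if (err) throw err;
  const f = fs.createReadStream('myfile.txt', { fd, start: 1000 });
  f.pipe(process.stdout); // autoClose closes the fd when the stream ends
});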
Using csv-parse with csv-stringify from the CSV Project.
const fs = require('fs');
const parse = require('csv-parse');
const stringify = require('csv-stringify')
const stringifier = stringify();
const writeFile = fs.createWriteStream('out.csv');
fs.createReadStream('file1.csv').pipe(parse()).pipe(stringifier).pipe(writeFile);
fs.createReadStream('file2.csv').pipe(parse()).pipe(stringifier).pipe(writeFile);
Here I parse each file separately (using a different parse stream for each source), then pipe both to the same stringify stream which concatenates them, then write to destination.
Range Locking
The answer is advisory locking; it is as simple as how BitTorrent does it:
1. split the whole file (or a part of it) into multiple smaller parts
2. lock a file range and fetch that range from a list of sources
3. use the file created in step 1 as the driver for a FIFO queue; it contains all the metadata
To get a file from multiple sources, a JS implementation would look like this (assuming all sources serve the same file; I put no error handling in here):
const queue = [];
const sources = ['https://example.com/file', 'https://example1.com/file'];

// total size comes from a HEAD request (Content-Length header);
// assumes an async context (e.g. ESM top-level await)
const fileSize = await fetch(sources[0], { method: 'HEAD' })
  .then(({ headers }) => Number(headers.get('content-length')));
const targetBuffer = new Uint8Array(fileSize);

const charset = 'x-user-defined';
// maps bytes into the Unicode Private Use Area so you can get the bits as chars
const binaryRawEnablingHeader = `text/plain; charset=${charset}`;
const requestDefaults = {
  headers: {
    'Content-Type': binaryRawEnablingHeader,
    'range': 'bytes=2-5,10-13'
  }
};

const downloadPlan = /* some logic that puts those bytes into the target WiP */
// use response.text() and then convert that to bytes via the
// Unicode Private Use Area 0xF700-0xF7FF
const convertToAbyte = (chars) =>
  Array.from({ length: chars.length },
    (_abyte, offset) => chars.charCodeAt(offset) & 0xff);

Node same buffer on write stream

I have the following code
const buffer = new Buffer(buffer_size);
const wstream = fs.createWriteStream('testStream.ogg');
do {
  read = obj1.partialDecrypt(buffer);
  if (read >= 0) {
    if (read < buffer_size) {
      wstream.write(buffer.slice(0, buffer_size));
    } else {
      wstream.write(buffer);
    }
  }
  total += read;
} while (read > 0);
wstream.end();
Here partialDecrypt fills the buffer with binary data and returns the number of bytes filled.
If I fill the buffer more than once, the data written to the stream does not match what I expect. Should I do something special to reuse the same buffer on the stream?
Turns out reusing the buffer is not a good idea. As in this thread, creating a new buffer on each pass was the way to go.
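A minimal sketch of that fix, reusing the hypothetical partialDecrypt interface and variables from the question: allocate a fresh buffer on every pass so chunks that the write stream has queued internally are never overwritten.
const fs = require('fs');

const wstream = fs.createWriteStream('testStream.ogg');
let read;
do {
  // fresh buffer each pass: write() may only queue the chunk, so mutating
  // a shared buffer afterwards would corrupt the queued data
  const chunk = Buffer.alloc(buffer_size);
  read = obj1.partialDecrypt(chunk);
  if (read > 0) {
    wstream.write(chunk.slice(0, read)); // only the bytes actually filled
  }
} while (read > 0);
wstream.end();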

Node.js: splitting a readable stream pipe to multiple sequential writable streams

Given a Readable stream (which may be process.stdin or a file stream), is it possible/practical to pipe() to a custom Writable stream that will fill a child Writable until a certain size; then close that child stream; open a new Writable stream and continue?
(The context is to upload a large piece of data from a pipeline to a CDN, dividing it up into blocks of a reasonable size as it goes, without having to write the data to disk first.)
I've tried creating a Writable that handles the opening and closing of the child stream in the _write function, but the problem comes when the incoming chunk is too big to fit in the existing child stream: it has to write some of the chunk to the old stream; create the new stream; and then wait for the open event on the new stream before completing the _write call.
The other thought I had was to create an extra Duplex or Transform stream to buffer the pipe and ensure that the chunk coming into the Writable is definitely equal to or less than the amount the existing child stream can accept, to give the Writable time to change the child stream over.
Alternatively, is this overcomplicating everything and there's a much easier way to do the original task?
I came across this question while looking for an answer to a related problem: how to parse a file and split its lines into separate files depending on some category value in the line.
I did my best to adapt my code to make it more relevant to your problem. It was adapted quickly and is not tested, so treat it as pseudo-code.
var fs = require('fs'),
    through = require('through');

var destCount = 0, dest, size = 0, MAX_SIZE = 1000;

readableStream
  .on('data', function(data) {
    var out = data.toString() + "\n";
    size += out.length;
    if (dest && size > MAX_SIZE) {
      dest.emit("end"); // close the current destination
      dest = null;
      size = out.length; // the current line counts towards the next file
      destCount++;
    }
    if (!dest) {
      // option 1. manipulate data before saving them.
      dest = through();
      dest.pipe(fs.createWriteStream("log" + destCount));
      // option 2. write directly to file
      // dest = fs.createWriteStream("log" + destCount);
    }
    dest.emit("data", out);
  })
  .on('end', function() {
    dest.emit('end');
  });
I would introduce a Transform between the Readable and Writable streams, and do all the logic I need in its _transform.
Maybe I would have only a Readable and a Transform. The _transform method would create all the Writable streams I need.
Personally, I only use a Writable stream when I'm dumping data somewhere and am done processing that chunk.
I avoid implementing _read and _write as much as I can and abuse Transform streams instead.
But the point I don't understand in your question is what you write about size. What do you mean by it?
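A rough sketch of that Transform-based idea, with hypothetical names and a fixed size threshold (chunks larger than the threshold are not split here):
const { Transform } = require('stream');
const fs = require('fs');

const MAX_SIZE = 1000;
let written = 0;
let fileIndex = 0;
let current = fs.createWriteStream('block' + fileIndex);

const splitter = new Transform({
  transform(chunk, encoding, callback) {
    // roll over to a new child writable once the current one is "full"
    if (written + chunk.length > MAX_SIZE) {
      current.end();
      fileIndex += 1;
      current = fs.createWriteStream('block' + fileIndex);
      written = 0;
    }
    written += chunk.length;
    // respect back-pressure: continue only once the child accepted the chunk
    if (current.write(chunk)) {
      callback();
    } else {
      current.once('drain', callback);
    }
  },
  flush(callback) {
    current.end(callback);
  }
});

// e.g. process.stdin.pipe(splitter);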

Node.js Readable file stream not getting data

I'm attempting to create a Readable file stream that I can read individual bytes from. I'm using the code below.
var fs = require('fs');

var rs = fs.createReadStream(file).on('open', function() {
  var buff = rs.read(8); // read first 8 bytes
  console.log(buff);
});
Given that file is an existing file of at least 8 bytes, why am I getting 'null' as the output for this?
The open event means that the stream has been initialized; it does not mean you can read from the stream. You would have to listen for either the readable or the data event.
var rs = fs.createReadStream(file);
rs.once('readable', function() {
  var buff = rs.read(8); // read the first 8 bytes only once
  console.log(buff.toString());
});
It looks like you're calling the rs.read() method. However, that method is only available in the streams interface, and with that interface you want to listen for the 'data' event, not the 'open' event.
That said, the docs actually recommend against doing this. Instead, you should probably handle one chunk at a time if you want to stream the file:
var rs = fs.createReadStream('test.txt');
rs.on('data', function(chunk) {
  console.log(chunk);
});
If you want to read just a specific portion of a file, you may want to look at fs.open() and fs.read() which are lower level.
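For completeness, a small sketch of that lower-level route, reading only the first 8 bytes (the file name is a placeholder):
const fs = require('fs');

fs.open('test.txt', 'r', (err, fd) => {
  if (err) throw err;
  const buff = Buffer.alloc(8);
  // read 8 bytes from file position 0 into buff, starting at buffer offset 0
  fs.read(fd, buff, 0, 8, 0, (err, bytesRead) => {
    if (err) throw err;
    console.log(buff.slice(0, bytesRead));
    fs.close(fd, () => {});
  });
});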

How to wrap a buffer as a stream2 Readable stream?

How can I transform a Node.js buffer into a Readable stream following the stream2 interface?
I already found this answer and the stream-buffers module but this module is based on the stream1 interface.
The easiest way is probably to create a new PassThrough stream instance, and simply push your data into it. When you pipe it to other streams, the data will be pulled out of the first stream.
var stream = require('stream');
// Initiate the source
var bufferStream = new stream.PassThrough();
// Write your buffer
bufferStream.end(Buffer.from('Test data.'));
// Pipe it to something else (i.e. stdout)
bufferStream.pipe(process.stdout)
As natevw suggested, it's even more idiomatic to use a stream.PassThrough, and end it with the buffer:
var stream = require('stream');

var buffer = Buffer.from('foo');
var bufferStream = new stream.PassThrough();
bufferStream.end(buffer);
bufferStream.pipe(process.stdout);
This is also how buffers are converted/piped in vinyl-fs.
Here is a modern, simple approach that is usable everywhere you would use fs.createReadStream(), but without having to write the file to a path first.
const { Duplex } = require('stream'); // native Node module

function bufferToStream(myBuffer) {
  let tmp = new Duplex();
  tmp.push(myBuffer);
  tmp.push(null);
  return tmp;
}
const myReadableStream = bufferToStream(your_buffer);
myReadableStream is re-usable.
The buffer and the stream exist only in memory without writing to local storage.
I use this approach often when the actual file is stored at some cloud service and our API acts as a go-between; files never get written to local disk.
I have found this to be very reliable no matter the buffer (up to 10 MB) or the destination that accepts a Readable stream. Larger files should implement
