How to read a specific chunk from a file in Node.js?

How can I read a buffer by selecting a start position and an end position in a file while streaming?

The read-chunk module solves my problem.

You could create a custom writable stream with the Writable interface, but maybe that defeats the whole purpose of a stream... Do you know upfront which positions you need to read, or is it random access? Or do you need scan patterns?
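Side note: the built-in fs.createReadStream already accepts start and end options (both inclusive byte offsets), so a specific range can be streamed without anything custom. A minimal sketch, with a made-up file name and offsets:

const fs = require("fs");

// Stream only bytes 100 through 199 of the file; start and end are
// inclusive byte offsets, and "data.bin" is just an example name.
const stream = fs.createReadStream("data.bin", { start: 100, end: 199 });

stream.on("data", (chunk) => {
  console.log("got %d bytes", chunk.length);
});
stream.on("end", () => console.log("done"));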

Related

Node.js read and write stream to the same file at the same time

TL;DR
I'm browsing through a number of solutions on npm and github looking for something that would allow me to read and write to the same file in two different places at the same time. So far I'm having trouble actually finding anything like this. Is there a module of some sort that will allow that?
Background
In essence my requirement is that in a large file I need to, in the following order:
read
transform
write
Ideally the usage would be something like:
const fs = require("fs");
const { Transform } = require("stream");

const fd = fs.openSync(file, "r+");
const read = createReadStreamSomehowFrom(fd);   // hypothetical
const write = createWriteStreamSomehowFrom(fd); // hypothetical
read
  .pipe(new Transform({ transform(chunk, encoding, callback) { /* ... */ } }))
  .pipe(write);
I could do that with the standard fs.create[Read/Write]Stream, but there's no way to control the flow of both streams, and if my write position goes beyond the read position I end up reading something I just wrote...
The use case is the same as perl -p -i -e: read and write to the same file (meaning the same inode) asynchronously and replace the contents without loading everything into memory.
I would expect this to be a real-world use case, yet all the implementations I found load the whole file into memory and then save it. Am I missing a known module here, or is there a need to actually write something like this?
Hmm... a tough one it seems. :)
So, for the record: I found no such module, and I actually discussed this with some people responsible for a nice in-file replacing module. Seeing no other way to solve this, I decided to write it from scratch, and here it is:
signicode/rw-stream repo on github
rw-stream at npm
The module works on a simple principle: no byte can be written until it has been consumed from the readable stream. It's fairly simple underneath (a couple of fs.read/write operations while keeping an eye on the read and write positions).
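To illustrate the principle only (this is a rough sketch, not rw-stream's actual code; the transformChunk callback and the synchronous loop are simplifications):

const fs = require("fs");

// Sketch of the idea: never write a byte that hasn't been read yet.
// transformChunk is hypothetical and must not return more bytes than it
// was given, or the write position could overtake the read position.
function rewriteInPlace(path, transformChunk, chunkSize = 64 * 1024) {
  const fd = fs.openSync(path, "r+");
  const buf = Buffer.alloc(chunkSize);
  let readPos = 0;   // next byte to read
  let writePos = 0;  // next byte to write; never allowed past readPos

  let bytesRead;
  while ((bytesRead = fs.readSync(fd, buf, 0, chunkSize, readPos)) > 0) {
    readPos += bytesRead;
    // These bytes have been consumed, so the region before readPos
    // is now safe to overwrite.
    const out = transformChunk(buf.subarray(0, bytesRead));
    fs.writeSync(fd, out, 0, out.length, writePos);
    writePos += out.length;
  }

  fs.ftruncateSync(fd, writePos); // drop the tail if the output shrank
  fs.closeSync(fd);
}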
If you find this useful then I'm happy. :)

How to synchronously read from a ReadStream in node

I am trying to read UTF-8 text from a file in a memory and time efficient way. There are two ways to read directly from a file synchronously:
fs.readFileSync will read the entire file and return a buffer containing the file's entire contents
fs.readSync will read a set amount of bytes from a file and return a buffer containing just those contents
I initially just used fs.readFileSync because it's easiest, but I'd like to be able to efficiently handle potentially large files by only reading in chunks of text at a time. So I started using fs.readSync instead. But then I realized that fs.readSync doesn't handle UTF-8 decoding. UTF-8 is simple, so I could whip up some logic to manually decode it, but Node already has services for that, so I'd like to avoid that if possible.
I noticed fs.createReadStream, which returns a ReadStream that can be used for exactly this purpose, but unfortunately it seems to only be available in an asynchronous mode of operation.
Is there a way to read from a ReadStream in a synchronous way? I have a massive stack built on top of this already, and I'd rather not have to refactor it to be asynchronous.
I discovered the string_decoder module, which handles all that UTF-8 decoding logic I was worried I'd have to write. At this point, it seems like a no-brainer to use this on top of fs.readSync to get the synchronous behavior I was looking for.
You basically just keep feeding it bytes, and it hands back whatever complete characters it can decode, holding on to partial multi-byte sequences until the rest arrives. The Node documentation describes how it works well enough.
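For the record, a minimal sketch of that combination (the file name and chunk size are arbitrary):

const fs = require("fs");
const { StringDecoder } = require("string_decoder");

// Read a UTF-8 file synchronously in fixed-size chunks; StringDecoder
// holds back any multi-byte character that gets split across chunks.
const fd = fs.openSync("input.txt", "r");
const decoder = new StringDecoder("utf8");
const buf = Buffer.alloc(16 * 1024);

let bytesRead;
while ((bytesRead = fs.readSync(fd, buf, 0, buf.length, null)) > 0) {
  const text = decoder.write(buf.subarray(0, bytesRead));
  process.stdout.write(text);
}
process.stdout.write(decoder.end()); // flush any trailing partial character
fs.closeSync(fd);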

piping node.js object streams to multiple destinations is producing bizarre results -- why?

When piping one transform stream to two other transform streams, I'm occasionally getting a few objects from one destination stream appearing in place of the proper objects in the other destination stream. In a stream of 90,000 objects, in roughly 1 out of 3 runs, about 10 objects starting at around sequence number 10,000 come from the wrong stream (the start position and the number of anomalous objects vary). What in the world could account for such bizarre results?
The setup:
sourceStream.pipe(processingStream1).pipe(check1);
processingStream1.pipe(check2).pipe(destinationStream1);
processingStream1.pipe(processingStream2).pipe(destinationStream2);
The sourceStream is a transform stream fed by a file read. The two destination streams are transform streams leading to file writes. Both the file read and file write are through the fs streaming API. All the streams rely on node.js automatic backpressure in piping.
Occasionally objects from processingStream2 are leaking into destinationStream1, as described above.
The checking streams (check1 a sink, check2 a passthrough) show the anomalous objects exist in the stream through check2 but not in the stream into check1.
The file reads and writes are of text (csv) files. I'm using Node.js version 8.6 on Windows 7 (though deserved, please don't throw rocks at me for the latter).
Suggestions on how to better isolate the problem are also welcome. The anomaly is structured enough that it doesn't seem like a generic memory leak, but not consistent enough to be a code error. I'm mystified.
Ugh! processingStream2 modifies the object in the stream coming through it (it actually modifies a property of a sub-object). Apparently you can't count on the order of the pipes to control the order in which changes to the streamed objects appear. Very occasionally, after the source objects have been sent through processingStream2, the input object to processingStream2 goes into processingStream1 via Node internals, probably as part of some optimization under the hood.
Lesson learned: don't change the input streamed object when piping to multiple destinations, even if you think you're making the change downstream. May you never have to learn this lesson the hard way!
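As a sketch of that rule of thumb (not the asker's actual code; the property names are made up, and structuredClone is a Node 17+ global, so older versions would need a deep-copy helper), any transform feeding one of several branches should modify a copy rather than the shared instance:

const { Transform } = require("stream");

// Clone the incoming object and mutate the clone, so the other pipes fed
// from the same upstream never see this branch's changes.
const processingStream2 = new Transform({
  objectMode: true,
  transform(obj, _encoding, callback) {
    const copy = structuredClone(obj);       // deep copy of the shared object
    copy.subObject.someProperty = "changed"; // hypothetical property names
    callback(null, copy);
  }
});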

Is it possible to use a different buffer with Node Stream?

I have a class that extends Writable from Stream.
From the following link: https://nodejs.org/api/stream.html#stream_streams_under_the_hood
I understand that this inherently uses a buffer, but I want it to use a ring buffer instead. Is it easy to implement? Or am I on the wrong path here? I'm a Node noob.
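For what it's worth, the buffering described in that doc belongs to the stream machinery itself and isn't swappable, but whatever storage you want, a ring buffer included, can sit behind _write(). A minimal sketch, with capacity and overflow handling deliberately left out:

const { Writable } = require("stream");

// The Writable base class still keeps its own internal queue for
// backpressure; the ring buffer below is the application's own storage,
// filled from _write(). Old data is simply overwritten when it wraps.
class RingBufferWritable extends Writable {
  constructor(capacity) {
    super();
    this.ring = Buffer.alloc(capacity);
    this.head = 0; // next write offset into the ring
  }

  _write(chunk, _encoding, callback) {
    for (const byte of chunk) {
      this.ring[this.head] = byte;
      this.head = (this.head + 1) % this.ring.length; // wrap around
    }
    callback();
  }
}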

Node.js - How does a readable stream react to a file that is still being written?

I have found a lot of information on how to pump, or pipe, data from a read stream to a write stream in Node. The newest version even auto-pauses and resumes for you. However, I have a different need and would like some help.
I am writing a video file using ffmpeg (to a local file, not a writable stream), and I would like to create a read stream that reads the data as it gets written. Obviously, the read stream's speed will surpass how quickly ffmpeg encodes the file. What will happen when the read stream reaches the end of the data before ffmpeg finishes writing the file? I assume it will stop the read stream before the file is fully encoded.
Anyone have any suggestions for the best way to pause/resume the read stream so that it doesn't reach the end of the locally encoding file until the encoding is 100% complete?
In summary:
This is what people normally do: readStream --> writeStream (using .pipe)
This is what I want to do: local file (in slow creation process) --> readStream
As always, thanks to the stackOverflow community.
The growing-file module is what you want.
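For anyone who wants the general shape without a dependency, here is a rough sketch of the same idea (the file name, handleChunk, and isEncodingDone are all hypothetical; real code would handle errors and might use fs.watch instead of a fixed interval):

const fs = require("fs");

// Tail a file that is still being written: on each tick, read whatever
// bytes have appeared since the last read, then stop once the writer
// signals that it is done.
let readPos = 0;
const fd = fs.openSync("output.mp4", "r");
const buf = Buffer.alloc(64 * 1024);

const timer = setInterval(() => {
  let bytesRead;
  while ((bytesRead = fs.readSync(fd, buf, 0, buf.length, readPos)) > 0) {
    readPos += bytesRead;
    handleChunk(Buffer.from(buf.subarray(0, bytesRead))); // copy, since buf is reused
  }
  if (isEncodingDone()) { // e.g. ffmpeg's exit event, tracked elsewhere
    clearInterval(timer);
    fs.closeSync(fd);
  }
}, 500);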
