Can anyone explain the arguments of these functions? - node.js

I am working with Node.js and trying to write and read binary files.
I am having a headache with the Node.js documentation, which does not provide much explanation.
In particular I want to understand
fs.writeSync(fd, buffer, offset, length, position)
I know 'fd' and 'buffer', but I am confused by 'offset' and 'position'.
fs.readSync(fd, buffer, offset, length, position)
I guess this one is the same.
Can anyone explain them to me?
Thanks

offset is the position in the buffer at which reading starts (therefore, offset + length must be less than or equal to the buffer's size)
position is the position in the file at which writing starts
The following simple example shows how it works:
const fs = require('fs')
var fd = fs.openSync("test.txt", "w")
var buf = Buffer.alloc(5, 'abcde')
fs.writeSync(fd, buf, 0, buf.length, 0)
// buffer's elements [0-4] are written to file's position 0
// test.txt holds 'abcde'
buf = Buffer.alloc(5, 'fghij')
fs.writeSync(fd, buf, 2, buf.length - 2, 2)
// buffer's elements [2-4] are written to file's position 2
// test.txt holds 'abhij'
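fs.readSync takes the same parameters, just in the other direction: offset is where in the destination buffer the bytes land, length is how many bytes to read, and position is where in the file reading starts. A small sketch continuing the example above (the names fd2 and dest are just for illustration):
var fd2 = fs.openSync("test.txt", "r")
var dest = Buffer.alloc(5, ' ')
// read 3 bytes starting at file position 2 into the buffer starting at index 1
fs.readSync(fd2, dest, 1, 3, 2)
// dest now holds ' hij ' because test.txt holds 'abhij'
fs.closeSync(fd2)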

Related

Remove NodeJs Stream padding

I'm writing an application where I need to strip the first X and last Y bytes from a stream. So what I need is basically a function I can pass to pipe that takes X and Y as parameters and removes the desired number of bytes from the stream as it comes through. My simplified setup is like this:
const rs = fs.createReadStream('some_file')
const ws = fs.createWriteStream('some_other_file')
rs.pipe(streamPadding(128, 512)).pipe(ws)
After that, some_other_file should contain all the contents of some_file minus the first 128 bytes and the last 512 bytes. I've read up on streams, but couldn't figure out how to do this properly, so that it also handles errors during the transfer and applies backpressure correctly.
As far as I know, I'd need a duplex stream that, whenever I read from it, reads from its input stream, keeps track of where in the stream we are, and skips the first 128 bytes before emitting data. Some tips on how to implement that would be very helpful.
The second part seems more difficult, if not impossible, because how would I know whether I have already reached the last 512 bytes before the input stream actually closes? I suspect that might not be possible, but I'm sure there must be a way to solve this problem, so if you have any advice, I'd be very thankful!
You can create a new Transform stream which does what you wish. As for dropping the last x bytes, you can keep the last x bytes buffered and simply discard them when the stream ends.
Something like this (assuming you're working with buffers):
const {Transform} = require('stream');

const ignoreFirst = 128,
      ignoreLast = 512;
let lastBuff,
    cnt = 0;

const myTrimmer = new Transform({
  transform(chunk, encoding, callback) {
    let len = Buffer.byteLength(chunk);
    // If we haven't skipped the first bit yet, make sure we do
    if (cnt < ignoreFirst) {
      let diff = ignoreFirst - cnt;
      // If we have more than we want to ignore, adjust the pointer
      if (len > diff)
        chunk = chunk.slice(diff, len);
      // Otherwise unset chunk for later
      else
        chunk = undefined;
    }
    // Keep track of how many bytes we've seen
    cnt += len;
    // If we have nothing to push after trimming, just get out
    if (!chunk)
      return callback();
    // If we already have a saved buff, concat it with the chunk
    if (lastBuff)
      chunk = Buffer.concat([lastBuff, chunk]);
    // Get the new chunk length
    len = Buffer.byteLength(chunk);
    // If the length is no more than what we ignore at the end, save it all and get out
    if (len <= ignoreLast) {
      lastBuff = chunk;
      return callback();
    }
    // Otherwise save the piece we might want to ignore and push the rest through
    lastBuff = chunk.slice(len - ignoreLast, len);
    this.push(chunk.slice(0, len - ignoreLast));
    callback();
  }
});
Then you add that to your pipeline, assuming you're reading from a file and writing to a file:
const rs = fs.createReadStream('some_file')
const ws = fs.createWriteStream('some_other_file')
rs.pipe(myTrimmer).pipe(ws);
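If you also want errors from any of the three streams to surface in one place, stream.pipeline (available since Node 10) can replace the manual pipe calls; backpressure is handled the same way. A sketch of that variant, reusing the myTrimmer transform from above:
const fs = require('fs');
const {pipeline} = require('stream');

pipeline(
  fs.createReadStream('some_file'),
  myTrimmer,
  fs.createWriteStream('some_other_file'),
  (err) => {
    // Called once, with the first error from any stream, or null on success
    if (err)
      console.error('Trimming failed:', err);
  }
);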

Buffers filled with unicode zeroes

I'm trying to synchronously read parameters from the console in Node, and I managed to do the following:
var load = function () {
  const BUFFER_LENGTH = 1024;
  const stdin = fs.openSync('/dev/stdin', 'rs');
  const buffer = Buffer.alloc(BUFFER_LENGTH);
  console.log('Provide parameter: ');
  fs.readSync(stdin, buffer, 0, BUFFER_LENGTH);
  fs.closeSync(stdin);
  return buffer.toString().replace(/\n*/, '');
}
It works, but here's a strange thing:
var loadedValue = load();
console.log(loadedValue); // displays "a", if I typed "a", so the result is correct
console.log({loadedValue}); // displays { loadedValue: 'a\n\u0000\u0000...' }
When I wrap the value in an object, the remaining buffer bytes show up in the string. Why is that? How can I get rid of them? Running a regexp on the string before building the object doesn't work.
Buffer.alloc(BUFFER_LENGTH) creates a buffer of a particular length (1024 in your case) and fills that buffer with NULL characters (as documented here).
Next, you read some (say 2) bytes from stdin into that buffer, which replaces the first two of those NULL characters with the characters read from stdin. The rest of the buffer still consists of NULLs.
If you don't truncate the buffer to the number of bytes read, your function returns a buffer of length 1024, mostly filled with NULLs. Since those aren't printable, they don't show up in the first console.log(), but they're still there.
So after reading from stdin, you should truncate the buffer to the right size:
let bytesRead = fs.readSync(stdin, buffer, 0, BUFFER_LENGTH);
buffer = buffer.slice(0, bytesRead);
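Put together, a corrected load() might look like the sketch below; note buffer is declared with let so it can be reassigned after truncation, and the regexp only strips the trailing newline:
const fs = require('fs');

var load = function () {
  const BUFFER_LENGTH = 1024;
  const stdin = fs.openSync('/dev/stdin', 'rs');
  let buffer = Buffer.alloc(BUFFER_LENGTH);
  console.log('Provide parameter: ');
  // readSync returns the number of bytes actually read
  const bytesRead = fs.readSync(stdin, buffer, 0, BUFFER_LENGTH);
  fs.closeSync(stdin);
  // Keep only the bytes that were read, then drop the trailing newline
  buffer = buffer.slice(0, bytesRead);
  return buffer.toString().replace(/\n$/, '');
}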

fuse: Setting offsets for the filler function in readdir

I am implementing a virtual filesystem using FUSE, and need some help understanding the offset parameter in readdir.
Earlier we were ignoring the offset and passing 0 to the filler function, in which case the kernel takes care of it.
Our filesystem database stores: directory name, file length, inode number and parent inode number.
How do I calculate the offset?
Is the offset of each entry equal to its size, with entries sorted in increasing order of inode number? What happens if there is a directory inside a directory; is the offset in that case equal to the sum of the files inside?
Example: in case the dir listing is - a.txt b.txt c.txt
And the inode numbers are a.txt=3, b.txt=5, c.txt=7
Offset of a.txt = directory offset
Offset of b.txt = dir offset + size of a.txt
Offset of c.txt = dir offset + size of b.txt
Is the above assumption correct?
P.S: Here are the callbacks of fuse
The selected answer is not correct
Despite the lack of upvotes on this answer, this is the correct answer. Cracking into the format of the void buffer should be discouraged; that's the intent behind declaring such things void in C code - you shouldn't write code that assumes knowledge of the format of the data behind void pointers. Use whatever API is provided instead.
The code below is very simple and straightforward, as it should be. No knowledge of the format of the Fuse buffer is required.
Fictitious API
This is a contrived example of what some device's API could look like. This is not part of Fuse.
// get_some_file_names() -
// returns a struct with buffers holding the names of files.
// PARAMETERS
// * path - A path of some sort that the fictitious device groks.
// * offset - Where in the list of file names to start.
// RETURNS
// * A name_list, it has some char buffers holding the file names
// and a couple other auxiliary vars.
//
name_list *get_some_file_names(const char *path, size_t offset);
Listing the files in parts
Here's a Fuse callback that can be registered with the Fuse system to
list the filenames provided by get_some_file_names(). It's arbitrarily named readdir_callback() so its purpose is obvious.
int readdir_callback( const char *path,
                      void *buf,              // This is meant to be "opaque".
                      fuse_fill_dir_t filler, // filler takes care of buf.
                      off_t off,              // Last value given to filler.
                      struct fuse_file_info *fi )
{
  // Call the fictitious API to get a list of file names.
  name_list *list = get_some_file_names(path, off);
  for (int i = 0; i < list->length; i++)
  {
    // Feed the file names to filler() one at a time.
    if (filler(buf, list->names[i], NULL, off + i + 1))
    {
      break; // filler() returned 1, requesting a break.
    }
    incr_num_files_listed(list);
  }
  if (all_files_listed(list))
  {
    return 1; // Tell Fuse we're done.
  }
  return 0;
}
The off (offset) value is not used by the filler function to fill its opaque buffer, buf. The off value is, however, meaningful to the callback as an offset base as it provides file names to filler(). Whatever value was last passed to filler() is what gets passed back to readdir_callback() on its next invocation. filler() itself only cares whether the off value is 0 or non-0.
Indicating "I'm done listing!" to Fuse
To signal to the Fuse system that your readdir_callback() is done listing file names in parts (when the last of the list of names has been given to filler()), simply return 1 from it.
How off Is Used
The off (offset) parameter should be non-0 to perform partial listings. That's its only requirement as far as filler() is concerned. If off is 0, that indicates to Fuse that you're going to do a full listing in one shot (see below).
Although filler() doesn't care what the off value is beyond it being non-0, the value can still be meaningfully used. The code above uses the index of the next item in its own file list as the value. Fuse will keep passing the last off value it received back to the readdir callback on each invocation until the listing is complete (when readdir_callback() returns 1).
Listing the files all at once
int readdir_callback( const char *path,
                      void *buf,
                      fuse_fill_dir_t filler,
                      off_t off,
                      struct fuse_file_info *fi )
{
  name_list *list = get_all_file_names(path);
  for (int i = 0; i < list->length; i++)
  {
    filler(buf, list->names[i], NULL, 0);
  }
  return 0;
}
Listing all the files in one shot, as above, is simpler - but not by much. Note that off is 0 for the full listing. One may wonder, 'Why even bother with the first approach of reading the folder contents in parts?'
The in-parts strategy is useful where a set number of buffers for file names is allocated and the number of files within a folder may exceed that number. For instance, the implementation of name_list above may only have 8 allocated buffers (char names[8][256]). Also, buf may fill up and filler() may start returning 1 if too many names are given at once. The first approach avoids this.
The offset passed to the filler function is the offset of the next item in the directory. You can have the entries in the directory in any order you want. If you don't want to return an entire directory at once, you need to use the offset to determine what gets asked for and stored. The order of items in the directory is up to you; it doesn't depend on the order of the names, the inodes, or anything else.
Specifically, in the readdir call you are passed an offset. You want to start calling the filler function with the entry at this offset or later. In the simplest case, the length of each entry is 24 bytes + strlen(name of entry), rounded up to the nearest multiple of 8 bytes. However, see the fuse source code at http://sourceforge.net/projects/fuse/ for when this might not be the case.
I have a simple example, where I have a loop (pseudo c-code) in my readdir function:
int my_readdir(const char *path, void *buf, fuse_fill_dir_t filler, off_t offset, struct fuse_file_info *fi)
{
  (a bunch of prep work has been omitted)
  struct stat st;
  int off, nextoff = 0, lenentry, i;
  char namebuf[(long enough for any one name)];
  for (i = 0; i < NumDirectoryEntries; i++)
  {
    (fill st with the stat information, including inode, etc.)
    (fill namebuf with the name of the directory entry)
    lenentry = ((24 + strlen(namebuf) + 7) & ~7);
    off = nextoff; /* offset of this entry */
    nextoff += lenentry;
    /* Skip this entry if we weren't asked for it */
    if (off < offset)
      continue;
    /* Add this to our response until we are asked to stop */
    if (filler(buf, namebuf, &st, nextoff))
      break;
  }
  /* All done because we were asked to stop or because we finished */
  return 0;
}
I tested this within my own code (I had never used the offset before), and it works fine.

How to append binary data to a buffer in node.js

I have a buffer with some binary data:
var b = new Buffer ([0x00, 0x01, 0x02]);
and I want to append 0x03.
How can I append more binary data? I'm searching in the documentation, but for appending, the data must be a string; if not, an error occurs (TypeError: Argument must be a string):
var b = new Buffer (256);
b.write ("hola");
console.log (b.toString ("utf8", 0, 4)); //hola
b.write (", adios", 4);
console.log (b.toString ("utf8", 0, 11)); //hola, adios
Then the only solution I can see is to create a new buffer for every piece of appended binary data and copy it into the main buffer at the correct offset:
var b = new Buffer (4); //4 for having a nice printed buffer, but the size will be 16KB
new Buffer ([0x00, 0x01, 0x02]).copy (b);
console.log (b); //<Buffer 00 01 02 00>
new Buffer ([0x03]).copy (b, 3);
console.log (b); //<Buffer 00 01 02 03>
But this seems a bit inefficient because I have to instantiate a new buffer for every append.
Do you know a better way for appending binary data?
EDIT
I've written a BufferedWriter that writes bytes to a file using internal buffers. Same as BufferedReader but for writing.
A quick example:
//The BufferedWriter truncates the file because append == false
new BufferedWriter ("file")
  .on ("error", function (error){
    console.log (error);
  })
  //From the beginning of the file:
  .write ([0x00, 0x01, 0x02], 0, 3) //Writes 0x00, 0x01, 0x02
  .write (new Buffer ([0x03, 0x04]), 1, 1) //Writes 0x04
  .write (0x05) //Writes 0x05
  .close (); //Closes the writer. A flush is implicitly done.

//The BufferedWriter appends content to the end of the file because append == true
new BufferedWriter ("file", true)
  .on ("error", function (error){
    console.log (error);
  })
  //From the end of the file:
  .write (0xFF) //Writes 0xFF
  .close (); //Closes the writer. A flush is implicitly done.

//The file contains: 0x00, 0x01, 0x02, 0x04, 0x05, 0xFF
LAST UPDATE
Use concat.
Updated Answer for Node.js ~>0.8
Node is able to concatenate buffers on its own now.
var newBuffer = Buffer.concat([buffer1, buffer2]);
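Buffer.concat also accepts an optional second argument, the total length of the result, which saves Node a pass over the list to compute it. For example:
var b1 = Buffer.from([0x00, 0x01, 0x02]);
var b2 = Buffer.from([0x03]);
// Passing the combined length avoids an extra loop over the list
var joined = Buffer.concat([b1, b2], b1.length + b2.length);
console.log(joined); // <Buffer 00 01 02 03>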
Old Answer for Node.js ~0.6
I use a module to add a .concat function, among others:
https://github.com/coolaj86/node-bufferjs
I know it isn't a "pure" solution, but it works very well for my purposes.
Buffers are always of fixed size; there is no built-in way to resize them dynamically, so your approach of copying into a larger Buffer is the only way.
However, to be more efficient, you could make the Buffer larger than the original contents, so it contains some "free" space where you can add data without reallocating the Buffer. That way you don't need to create a new Buffer and copy the contents on each append operation.
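A minimal sketch of that idea (the names growable, used, and append are made up for illustration): keep a larger backing buffer and a count of bytes used, and only reallocate when the free space runs out:
var growable = Buffer.alloc(16); // backing store, larger than needed
var used = 0;                    // bytes actually written so far

function append(bytes) {
  // Grow (double) the backing buffer only when it is full
  while (used + bytes.length > growable.length) {
    var bigger = Buffer.alloc(growable.length * 2);
    growable.copy(bigger, 0, 0, used);
    growable = bigger;
  }
  Buffer.from(bytes).copy(growable, used);
  used += bytes.length;
}

append([0x00, 0x01, 0x02]);
append([0x03]);
console.log(growable.slice(0, used)); // <Buffer 00 01 02 03>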
This is to help anyone who comes here looking for a pure approach. I would recommend understanding this problem, because it can come up in lots of different places, not just with a JS Buffer object. By understanding why the problem exists and how to solve it, you will improve your ability to solve other problems in the future, since this one is so fundamental.
For those of us who have to deal with these problems in other languages it is quite natural to devise a solution, but there are people who may not realize how to abstract away the complexities and implement a generally efficient dynamic buffer. The code below may have potential to be optimized further.
I have left the read method unimplemented to keep the example small.
The realloc function in C (or any language dealing with intrinsic allocations) does not guarantee that the allocation will be expanded in size without moving the existing data, although sometimes it is possible. Therefore most applications that need to store an unknown amount of data will use a method like the one below rather than constantly reallocating, unless reallocation is very infrequent. This is essentially how most file systems handle writing data to a file: the file system simply allocates another node and keeps all the nodes linked together, and when you read from it the complexity is abstracted away so that the file/buffer appears to be a single contiguous buffer.
If you wish to understand the difficulty in simply providing a high-performance dynamic buffer, you only need to view the code below, and also do some research on memory heap algorithms and how the memory heap works for programs.
Most languages provide a fixed-size buffer for performance reasons, and then another version that is dynamic in size. Some language systems opt for a third-party approach, where they keep the core functionality minimal (core distribution) and encourage developers to create libraries to solve additional or higher-level problems. This is why you may question why a language does not provide some functionality: a small core reduces the cost of maintaining and enhancing the language, but you end up having to write your own implementations or depend on a third party.
var Buffer_A1 = function (chunk_size) {
  this.buffer_list = [];
  this.total_size = 0;
  this.cur_size = 0;
  this.cur_buffer = [];
  this.chunk_size = chunk_size || 4096;
  this.buffer_list.push(new Buffer(this.chunk_size));
};

Buffer_A1.prototype.writeByteArrayLimited = function (data, offset, length) {
  var can_write = length > (this.chunk_size - this.cur_size) ? (this.chunk_size - this.cur_size) : length;
  var lastbuf = this.buffer_list.length - 1;
  for (var x = 0; x < can_write; ++x) {
    this.buffer_list[lastbuf][this.cur_size + x] = data[x + offset];
  }
  this.cur_size += can_write;
  this.total_size += can_write;
  if (this.cur_size == this.chunk_size) {
    this.buffer_list.push(new Buffer(this.chunk_size));
    this.cur_size = 0;
  }
  return can_write;
};

/*
  The `data` parameter can be anything that is array-like. It just must
  support indexing and a length, and produce an acceptable value to be
  used with Buffer.
*/
Buffer_A1.prototype.writeByteArray = function (data, offset, length) {
  offset = offset == undefined ? 0 : offset;
  length = length == undefined ? data.length : length;
  var rem = length;
  while (rem > 0) {
    // Advance the source offset by the number of bytes already written
    rem -= this.writeByteArrayLimited(data, offset + (length - rem), rem);
  }
};

Buffer_A1.prototype.readByteArray = function (data, offset, length) {
  /*
    If you really wanted to implement some read functionality
    then you would have to deal with unaligned reads which could
    span two buffers.
  */
};

Buffer_A1.prototype.getSingleBuffer = function () {
  var obuf = new Buffer(this.total_size);
  var cur_off = 0;
  var x;
  for (x = 0; x < this.buffer_list.length - 1; ++x) {
    this.buffer_list[x].copy(obuf, cur_off);
    cur_off += this.buffer_list[x].length;
  }
  this.buffer_list[x].copy(obuf, cur_off, 0, this.cur_size);
  return obuf;
};
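A quick usage sketch of the class above (the chunk size and values are just illustrative; the tiny chunk size forces writes to span chunk boundaries):
var ba = new Buffer_A1(4);
ba.writeByteArray([0x00, 0x01, 0x02]);
ba.writeByteArray([0x03, 0x04, 0x05]);
console.log(ba.getSingleBuffer()); // <Buffer 00 01 02 03 04 05>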
Insert bytes at a specific place:
function insertToArray(arr, index, item) {
  return Buffer.concat([arr.slice(0, index), Buffer.from(item, "utf-8"), arr.slice(index)]);
}
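For example (hypothetical values):
var buf = Buffer.from([0x61, 0x63]); // 'ac'
console.log(insertToArray(buf, 1, "b")); // <Buffer 61 62 63> ('abc')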

Binary file I/O

How do I read and write binary files in the D language? In C it would be:
FILE *fp = fopen("/home/peu/Desktop/bla.bin", "wb");
char x[4] = "RIFF";
fwrite(x, sizeof(char), 4, fp);
I found rawWrite in the D docs, but I don't know the usage, nor whether it does what I think. The docs describe rawRead in terms of C's fread:
T[] rawRead(T)(T[] buffer);
If the file is not opened, throws an exception. Otherwise, calls fread for the file handle and throws on error.
rawRead always reads in binary mode on Windows.
rawRead and rawWrite should behave exactly like fread and fwrite; they are just templates that take care of argument sizes and lengths.
e.g.
auto stream = File("filename","r+");
auto outstring = "abcd";
stream.rawWrite(outstring);
stream.rewind();
auto inbytes = new char[4];
stream.rawRead(inbytes);
assert(inbytes[3] == outstring[3]);
rawRead is implemented in terms of fread as
T[] rawRead(T)(T[] buffer)
{
    enforce(buffer.length, "rawRead must take a non-empty buffer");
    immutable result =
        .fread(buffer.ptr, T.sizeof, buffer.length, p.handle);
    errnoEnforce(!error);
    return result ? buffer[0 .. result] : null;
}
If you just want to read in a big buffer of values (say, ints), you can simply do:
int[] ints = cast(int[]) std.file.read("ints.bin", numInts * int.sizeof);
and
std.file.write("ints.bin", ints);
Of course, if you have more structured data then Scott Wales' answer is more appropriate.
