Node.js: Efficiently reading a range of lines

I'm currently using Node.js and am wondering how one would read a range of lines from a large text file. An obvious solution would be like so:
var fs = require('fs');
fs.readFile(file, 'utf8', function (err, data) {
  var lines = data.split('\n');
});
However, that would load the entire file into memory, which is impractical for large text files of 100 MB or more.
In Bash I would normally use sed for this (for example, sed -n '23,42p' file prints lines 23 through 42).

With lazy:
var fs = require('fs'),
    lazy = require('lazy');

var x = 23; // first line of the range (1-indexed)
var y = 42; // last line of the range

var lines = lazy(fs.createReadStream('./large.txt'))
  .lines
  .skip(x - 1)       // drop everything before line x
  .take(y - x + 1);  // keep exactly y - x + 1 lines

lines.forEach(function (line) {
  console.log(line.toString('utf-8'));
});
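For comparison, a dependency-free sketch of the same range read using Node's built-in readline module (x and y as above; an illustration, not the accepted approach):
var fs = require('fs');
var readline = require('readline');

var x = 23; // first line to keep (1-indexed)
var y = 42; // last line to keep

var lineNo = 0;
var rl = readline.createInterface({
  input: fs.createReadStream('./large.txt'),
  terminal: false
});

rl.on('line', function (line) {
  lineNo++;
  if (lineNo >= x && lineNo <= y) {
    console.log(line);
  }
  if (lineNo === y) {
    rl.close(); // stop reading once the range has been printed
  }
});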

Related

node Stream from CSV, Transform, and Stream to TSV

I have a 1.4 GB CSV file that I want to go through row by row, parsing each row. Once a row has been parsed, I add it to the stream and write the output as a TSV file. I thought the code below worked, but it simply appends each row to the end of the previous one without adding the line breaks I expected. I also tried adding .pipe(split2()) on the line before the .pipe(writeStream) to split the data before writing, but that simply froze the application.
Has anybody successfully read and written with this process in Node?
var fs = require('fs'),
    _ = require('lodash'),
    split2 = require('split2'),
    through2 = require('through2');

fs.createReadStream('input_file_name.csv')
  .pipe(split2())
  .pipe(through2.obj(function (chunk, enc, callback) {
    // Process the CSV row
    var row = _.zipObject(['header1', 'header2', 'header3'], chunk.toString().split(','));
    this.push(processRow(row).join('\t')); // does an action to each row (no '\n' is appended here, hence the missing line breaks)
    callback();
  }))
  .pipe(fs.createWriteStream('output_file_name.tsv'));
I realized I was missing a proper CSV parser instead of simply splitting on commas, and that I needed to add a \n to the end of each data string.
var fs = require('fs'),
    _ = require('lodash'),
    parse = require('csv-parse'),
    transform = require('stream-transform');

var parser = parse();
var transformer = transform(function (record, callback) {
  var row = _.zipObject(['header1', 'header2', 'header3'], record);
  callback(null, processRow(row).join('\t') + '\n');
}, {parallel: 10});

fs.createReadStream('input_file_name.csv')
  .pipe(parser)
  .pipe(transformer)
  .pipe(fs.createWriteStream('output_file_name.tsv'));
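Both snippets call a processRow helper that the question never shows. A hypothetical stand-in, just to make the examples self-contained, could be:
// hypothetical processRow: return the row's values in column order
function processRow(row) {
  return [row.header1, row.header2, row.header3];
}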

How to make an array in new file from a list of items

I have a file with a list of words like this
add
  blah
 blahblah
undo
In other words, there's whitespace at the beginning of some of the lines.
Using node.js, I'm doing this to remove the whitespace (which works fine)
var fs = require('fs');
var array = fs.readFileSync('myfile.txt').toString().split("\n");
for (i in array) {
  var k = array[i].trim();
  console.log(k);
}
but I would like to put the result in a new file like this
newfile.txt
var arr = ["add", "blah", "blahblah"];
var fs = require('fs');
var inLines = fs.readFileSync('in.txt').toString().split("\n");

var trimmed = inLines.map(function (line) {
  return line.trim();
});

// remove any blank lines
var noEmpties = trimmed.filter(function (line) {
  return line;
});

var newData = 'var arr = ' + JSON.stringify(noEmpties) + ';\n';
fs.writeFileSync('newfile.txt', newData, 'utf8');
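With the four sample words above, this writes var arr = ["add","blah","blahblah","undo"]; to newfile.txt (only genuinely empty lines are dropped by the filter).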

Write a line into a .txt file with Node.js

I want to use Node.js to create a simple logging system which writes a line before the last line of a .txt file. However, I don't know how Node.js's file system functionality works.
Can someone explain it?
Inserting data into the middle of a text file is not a simple task. If possible, you should append it to the end of the file.
The easiest way to append data to a text file is to use the built-in fs.appendFile(filename, data[, options], callback) function from the fs module:
var fs = require('fs');

fs.appendFile('log.txt', 'new data', function (err) {
  if (err) {
    // append failed
  } else {
    // done
  }
});
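Note that fs.appendFile creates log.txt for you if it does not exist yet.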
But if you want to write data to the log file several times, it's best to use the fs.createWriteStream(path[, options]) function instead:
var fs = require('fs');

var logger = fs.createWriteStream('log.txt', {
  flags: 'a' // 'a' means appending (old data will be preserved)
});

logger.write('some data'); // append string to your file
logger.write('more data'); // again
logger.write('and more');  // again
Node will keep appending new data to your file every time you call .write, until your application is closed or until you manually close the stream by calling .end:
logger.end(); // close the stream
Note that logger.write in the above example does not write to a new line. To write data to a new line:
var writeLine = (line) => logger.write(`\n${line}`);
writeLine('Data written to a new line');
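If you need platform-specific line endings instead of a hard-coded \n, the os module's os.EOL constant provides them.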
Simply use the fs module and something like this:
var fs = require('fs');

fs.appendFile('server.log', 'string to append', function (err) {
  if (err) return console.log(err);
  console.log('Appended!');
});
Step 1: if you have a small file, read all the file data into memory.
Step 2: split the file data string into an array of lines.
Step 3: search the array to find the location where you want to insert the text.
Step 4: once you have the location, insert your text:
yourArray.splice(index, 0, "new added text");
Step 5: join your array back into a string:
yourArray.join("\n");
Step 6: write your file:
fs.writeFileSync(yourFileName, yourArray.join("\n"));
This is not advised if your file is too big.
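Putting those six steps together, a minimal sketch (the file name and inserted text are placeholders; this is the small-file approach only):
var fs = require('fs');

// Steps 1-2: read the whole file and split it into lines
var lines = fs.readFileSync('log.txt', 'utf8').split('\n');

// Step 3: pick an insertion point (here: just before the last line)
var index = lines.length - 1;

// Step 4: insert the new text
lines.splice(index, 0, 'new added text');

// Steps 5-6: join the lines back together and rewrite the file
fs.writeFileSync('log.txt', lines.join('\n'));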
I created a log that writes data into a text file. The source code is below; note that while it pulls in the "Winston" logger, the actual writing is done with a plain fs write stream:
const { createLogger, format, transports } = require('winston'); // note: required but never used below
var fs = require('fs');
var os = require('os');
var sleep = require('system-sleep');
var dateFormat = require('dateformat');

var logger = fs.createWriteStream('Data Log.txt', {
  flags: 'a'
});

var endOfLine = os.EOL;

// padding strings used to separate the columns
var t = ' ';
var s = ' ';
var q = ' ';

var array1 = [];
var array2 = [];
var array3 = [];
var array4 = [];

array1[0] = 78;
array1[1] = 56;
array1[2] = 24;
array1[3] = 34;

// convert each number to a string
for (var n = 0; n < 4; n++) {
  array2[n] = array1[n].toString();
}

// one buffer of spaces per column, wide enough to hold a value
for (var k = 0; k < 4; k++) {
  array3[k] = Buffer.from('    ');
}

// buffers holding the string values
for (var a = 0; a < 4; a++) {
  array4[a] = Buffer.from(array2[a]);
}

// copy each value into its fixed-width column buffer
for (var m = 0; m < 4; m++) {
  array4[m].copy(array3[m], 0);
}

// write the header row
logger.write('Date' + q);
logger.write('Time' + (q + ' '));
logger.write('Data 01' + t);
logger.write('Data 02' + t);
logger.write('Data 03' + t);
logger.write('Data 04' + t);
logger.write(endOfLine);
logger.write(endOfLine);

// user-defined function: writes one data row (date, time, four values)
function mydata() {
  logger.write(datechar + s);
  logger.write(timechar + s);
  for (var n = 0; n < 4; n++) {
    logger.write(array3[n]);
  }
  logger.write(endOfLine);
}

var now = new Date();
var date = dateFormat(now, "isoDate");
var time = dateFormat(now, "h:MM:ss TT ");
var datechar = date.toString();
var timechar = time.toString();

mydata();
sleep(5 * 1000); // block for five seconds
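With the values above, each run appends the header and one data row (the current date, the current time, and the four values 78 56 24 34) to Data Log.txt, then blocks for five seconds before exiting.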

How can I create a txt file that holds the contents of an array in JavaScript?

I have several arrays that contain data that I would like to export, each array to a txt file, in order to be analyzed using MATLAB.
Let's say my array is:
var xPosition = [];
// some algorithm that adds content to xPosition
// TODO: export the array into a txt file, call it x_n.txt
It would be great to store one element of the array per line.
I have found a guide for the solution to my question in this post. The following code is what I ended up using:
var fs = require('fs');

var xPosition = [];
// some algorithm that adds content to xPosition

var file = fs.createWriteStream('./positions/x_n.txt');
file.on('error', function (err) { /* error handling */ });
xPosition.forEach(function (v) { file.write(v + '\n'); });
file.end();
The solution you found works, but here's how I'd have done it:
var fs = require('fs');
var xPosition = [1,2,3]; // Generate this
var fileName = './positions/x_n.txt';
fs.writeFileSync(fileName, xPosition.join('\n'));
This uses Node's synchronous file-writing capability, which is ideal for your purposes. You don't have to open or close file handles, etc. I'd use streams only if I had gigabytes of data to write out.
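If blocking the event loop ever becomes a concern, the promise-based fs API (available since Node 10) can do the same write asynchronously; a sketch:
var fsp = require('fs').promises;
var xPosition = [1, 2, 3]; // Generate this

fsp.writeFile('./positions/x_n.txt', xPosition.join('\n'))
  .then(function () { /* done */ })
  .catch(function (err) { /* error handling */ });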

nodejs: each line in separate file

I want to split a file: each line into a separate file. The initial file is really big. I ended up with the code below:
var fs = require('fs');
var split = require('split');

var fileCounter = -1;

function getWritable() {
  fileCounter++;
  var writable = fs.createWriteStream('data/part' + fileCounter + '.txt', {flags: 'w'});
  return writable;
}

var readable = fs.createReadStream(file).pipe(split());
readable.on('data', function (line) {
  var flag = getWritable().write(line, function () {
    readable.resume();
  });
  if (!flag) {
    readable.pause();
  }
});
It works, but it is ugly. Is there a more nodish way to do that, maybe with piping and without pause/resume?
NB: this is not a question about lines/files/etc. The question is about streams; I'm just using this problem to illustrate it.
You can use Node's built-in readline module.
var fs = require('fs');
var readline = require('readline');

var fileCounter = -1;
var file = "foo.txt";

readline.createInterface({
  input: fs.createReadStream(file),
  terminal: false
}).on('line', function (line) {
  fileCounter++; // increment first so the first file is part0.txt
  var writable = fs.createWriteStream('data/part' + fileCounter + '.txt', {flags: 'w'});
  writable.end(line); // write the line and close the file
});
Note that this will lose the last line of the file if there is no newline at the end, so make sure your last line of data is followed by a newline.
Also note that the docs indicate that it is Stability index 2, meaning:
Stability: 2 - Unstable The API is in the process of settling, but has
not yet had sufficient real-world testing to be considered stable.
Backwards-compatibility will be maintained if reasonable.
How about the following? Did you try it? Pause and resume logic isn't really needed here.
var split = require('split');
var fs = require('fs');

var fileCounter = -1;
var readable = fs.createReadStream(file).pipe(split());
readable.on('data', function (line) {
  fileCounter++;
  var writable = fs.createWriteStream('data/part' + fileCounter + '.txt', {flags: 'w'});
  writable.write(line);
  writable.close();
});
Piping dynamically would be hard...
EDIT: You could create a writable (and therefore pipe()able) object that would, on each 'data' event, create the file, open it, write the data, and close it, but it:
- wouldn't be reusable
- wouldn't follow the KISS principle
- would require special and specific logic for file naming (it would accept a string pattern as a constructor argument, with a placeholder for the number, etc.)
I really don't recommend that path, or you're going to spend ages implementing a not-really-reusable module. That said, it would make a good writable implementation exercise; a minimal sketch follows.
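For the curious, here is such a sketch (LineFileWriter and its '%d' naming pattern are invented for illustration; old-style Node to match the answers above):
var fs = require('fs');
var stream = require('stream');
var util = require('util');

function LineFileWriter(pattern) {
  stream.Writable.call(this, { objectMode: true });
  this.pattern = pattern; // e.g. 'data/part%d.txt'
  this.counter = 0;
}
util.inherits(LineFileWriter, stream.Writable);

LineFileWriter.prototype._write = function (line, enc, callback) {
  var path = this.pattern.replace('%d', this.counter++);
  fs.writeFile(path, line, callback); // create, write, and close in one call
};

// usage:
// fs.createReadStream(file).pipe(split()).pipe(new LineFileWriter('data/part%d.txt'));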
