Get Data from CSV File in nodejs - node.js

I have a CSV file with about 10k records. I need to retrieve the records one by one in my Node.js app.
The scenario is this: when the user clicks button "X" for the first time, an async request is sent to the Node.js app, which returns the data from the first row of the CSV file. When he clicks again, it shows the data from the second row, and so on.
I tried using fast-csv and lazy, but all of them read the complete file. Is there a way I can achieve this?

Node comes with a readline module in its core, allowing you to process a readable stream line by line.
var fs = require("fs"),
    readline = require("readline");

var file = "something.csv";

var rl = readline.createInterface({
    input: fs.createReadStream(file),
    output: null,
    terminal: false
});

rl.on("line", function (line) {
    console.log("Got line: " + line);
});

rl.on("close", function () {
    console.log("All data processed.");
});

I think the module split by Dominic Tarr will suffice.
It breaks the stream up line by line.
https://npmjs.org/package/split
var fs = require('fs'),
    split = require('split');

fs.createReadStream(file)
    .pipe(split())
    .on('data', function (line) {
        // each chunk now is a separate line!
    });

Related

Array is empty after pushing items (synchronous)

I am trying to read lines from a .txt file in Node.js. When I get access to each line, I push it to an array, but in the end the array is empty.
var array = [];

let lineReader = require('readline').createInterface({
    input: require('fs').createReadStream('file.txt')
});

lineReader.on('line', function (line) {
    console.log(line);
    array.push(line);
});

console.log(array);
file.txt contents
THIS IS LINE#1
THIS IS LINE#2
THIS IS LINE#3
Output
[]
THIS IS LINE#1
THIS IS LINE#2
THIS IS LINE#3
The problem is that the lines are read asynchronously, so the final console.log(array) runs before any line event has fired. You need to add a listener for the close event and log the array there.
lineReader.on("close", () => {
console.log(array);
})

How to write a file with fs.writeFileSync in a list

I'm trying to make my first "AI" that stores the words I use with a "chat bot" every time I open it.
For example, if I use "hello" and later I use "Heya", the chatbot will write these words to a JSON file and use them randomly to greet me when I open the program.
(At this moment I'm just trying to write the words into a list; the part where the bot greets me with these words isn't created yet.)
Here's the code that I tried:
const fs = require('fs');
const readline = require('readline');

const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
});

rl.question('Hello! ', (answer) => {
    // TODO: Log the answer in a database
    console.log(`The ${answer} word has stored :)`);
    fs.writeFileSync(__dirname + '/startkeywords.json', JSON.stringify(answer, null, 4));
    console.log('created at' + __dirname);
    rl.close();
});
But when I execute the program and say something, it writes, just not as a list; it only writes the raw value. So I'm trying to write the file as a list, like this:
[list example]
Help, please.
If you want to store your "keywords" as a valid JSON file, you will need to read the file, add the new data, and write it out again:
const fs = require('fs')

const startKeywordsFile = 'startkeywords.json'

// create an empty json if not exists
if (!fs.existsSync(startKeywordsFile)) {
    fs.writeFileSync(startKeywordsFile, '[]')
}

let words = JSON.parse(fs.readFileSync(startKeywordsFile, 'utf8'))

function addWord(newWord) {
    if (words.includes(newWord)) {
        // already exists
        return
    }
    words.push(newWord)
    fs.writeFileSync(startKeywordsFile, JSON.stringify(words, null, 4))
}

addWord('hi')
addWord('hello')

startkeywords.json:
[
    "hi",
    "hello"
]
Keep in mind this might have performance issues with a large list. If you want to save your keywords as plain text (one word per line, not valid JSON), you can use fs.appendFileSync to append to the file without having to rewrite the entire file every time.
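For example, a minimal sketch of the plain-text variant (the startkeywords.txt filename is just an illustration, and unlike the JSON version above it skips the duplicate check):

const fs = require('fs')

function addWord(newWord) {
    // appends one word per line; the file is created if it does not exist,
    // and the existing contents are never rewritten
    fs.appendFileSync('startkeywords.txt', newWord + '\n')
}

addWord('hi')
addWord('hello')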

NodeJs set readline module speed

I'm reading a text file in Node.js using the readline module.
var lineReader = require('readline').createInterface({
    input: require('fs').createReadStream('log.txt')
});

lineReader.on('line', function (line) {
    console.log(line);
});

lineReader.on('close', function () {
    console.log('Finished!');
});
Is there any way to set the speed of the reading?
For example, I want to read one line every 5 ms.
You can pause the reader stream as soon as you read a line, then resume it 5 ms later, and repeat this until the end of the file. Make sure to adjust the highWaterMark option to a lower value so that the file read stream doesn't read multiple lines at once.
var lineReader = require('readline').createInterface({
    input: require('fs').createReadStream('./log.txt', {
        highWaterMark: 10
    })
});

lineReader.on('line', line => {
    lineReader.pause(); // pause reader

    // Resume 5ms later
    setTimeout(() => {
        lineReader.resume();
    }, 5);

    console.log(line);
});
You can use observables to do this. Here's an example of the kind of buffering I think you want with click events instead of file line events. Not sure if there's a cleaner way to do it that avoids the setInterval though....
let i = 0;

const source = Rx.Observable
    .fromEvent(document.querySelector('#container'), 'click')
    .controlled();

var subscription =
    source.subscribe(() => console.log('was clicked ' + i++));

setInterval(() => source.request(1), 500);
Here's a fiddle and also a link to the docs for RxJS:
https://jsfiddle.net/w6ewg175/
https://github.com/Reactive-Extensions/RxJS/blob/master/doc/api/core/operators/controlled.md

How Nodejs insert a string to the n-th line of file easily?

I googled around and checked a few npm packages (e.g. Lazy), but still couldn't find a good pattern for inserting a string at the n-th line of a file.
Being a newbie to Node.js, I suppose this could be done as easily as in other languages, e.g. PHP or Ruby.
Thanks for your solutions in advance.
What you can do is:
Open the file in read mode:
var fileData = fs.createReadStream('filename.extension');
Read it line by line and keep a counter.
Check this counter against your desired n-th line number.
If it matches: write your string at that position, e.g. output.write("this is a message");, by streaming the lines out to a second file and inserting your string when the counter reaches n (a rough sketch follows below).
If it never matches: print "No such position found. Error!"
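A minimal sketch of that counting approach using Node's readline (the filenames, the n value, and the inserted text are made up for illustration; it writes the result to a second file rather than editing in place):

var fs = require('fs'),
    readline = require('readline');

var n = 5;                              // hypothetical: insert at the 5th line
var lineToInsert = 'this is a message'; // hypothetical string to insert

var rl = readline.createInterface({ input: fs.createReadStream('filename.extension') });
var output = fs.createWriteStream('filename.new');
var counter = 0;

rl.on('line', function (line) {
    counter++;
    if (counter === n) {
        output.write(lineToInsert + '\n'); // write the new string before the original n-th line
    }
    output.write(line + '\n');
});

rl.on('close', function () {
    if (counter < n) {
        console.log('No such position found. Error!');
    }
    output.end();
});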
I'd probably use one of the "given an input stream, notify me on each line" modules, for example node-lazy or byline:
var fs = require('fs'),
    byline = require('byline');

var stream = byline(fs.createReadStream('sample.txt'));

stream.on('line', function (line) {
    // do stuff with line
});

stream.pipe(fs.createWriteStream('./output'));
If your file is small, you can simply read the whole file synchronously and split the resulting string; for example, this gets the second line (index 1):
require('fs').readFileSync('abc.txt', 'utf8').split('\n')[1]
Another way: the line-by-line npm package.

var LineByLineReader = require('line-by-line'),
    lr = new LineByLineReader('big_file.txt');

lr.on('error', function (err) {
    // 'err' contains error object
});

lr.on('line', function (line) {
    // pause emitting of lines...
    lr.pause();

    // ...do your asynchronous line processing...
    setTimeout(function () {
        // ...and continue emitting lines.
        lr.resume();
    }, 100);
});

lr.on('end', function () {
    // All lines are read, file is closed now.
});
Your node-lazy way:
var lazy = require("lazy"),
    fs = require("fs");

var matched_line_number = 10; // let's say 10, can be any
var ctr = 0;

new lazy(fs.createReadStream('./MyVeryBigFile.extension'))
    .lines
    .forEach(function (line) {
        ctr++;
        if (ctr === matched_line_number) {
            console.log(line.toString());
        }
    });
Another way could be:
var fs = require('fs'),
    carrier = require('carrier');

var input = fs.createReadStream('./input.txt'),
    output = fs.createWriteStream('./output.txt', { flags: 'a' });

input.on('error', function (err) {
    console.log("An error occurred: " + err);
});

carrier.carry(input)
    .on('line', function (line) {
        // carrier strips the newline, so add it back when copying
        output.write(line + '\n');
    })
    .on('end', function () {
        output.end();
        console.log("Done");
    });
Open your file in read mode, check line by line for the desired line number, and simultaneously write the lines (with your modifications) out to another file.

Parse output of spawned node.js child process line by line

I have a PhantomJS/CasperJS script which I'm running from within a Node.js script using child_process.spawn(). Since CasperJS doesn't support require()ing modules, I'm trying to print commands from CasperJS to stdout and then read them in from my Node.js script using spawn.stdout.on('data', function(data) {}); in order to do things like add objects to Redis/Mongoose (convoluted, yes, but it seems more straightforward than setting up a web service for this...). The CasperJS script executes a series of commands and creates, say, 20 screenshots which need to be added to my database.
However, I can't figure out how to break the data variable (a Buffer?) into lines... I've tried converting it to a string and then doing a replace, and I've tried doing spawn.stdout.setEncoding('utf8');, but nothing seems to work...
Here is what I have right now
var spawn = require('child_process').spawn;
var bin = "casperjs";

// googlelinks.js is the example given at http://casperjs.org/#quickstart
var args = ['scripts/googlelinks.js'];

var cspr = spawn(bin, args);

//cspr.stdout.setEncoding('utf8');
cspr.stdout.on('data', function (data) {
    var buff = new Buffer(data);
    console.log("foo: " + buff.toString('utf8'));
});

cspr.stderr.on('data', function (data) {
    data += '';
    console.log(data.replace("\n", "\nstderr: "));
});

cspr.on('exit', function (code) {
    console.log('child process exited with code ' + code);
    process.exit(code);
});
https://gist.github.com/2131204
Try this:
cspr.stdout.setEncoding('utf8');
cspr.stdout.on('data', function (data) {
    var str = data.toString(), lines = str.split(/(\r?\n)/g);
    for (var i = 0; i < lines.length; i++) {
        // Process the line, noting it might be incomplete.
    }
});
Note that the "data" event might not necessarily break evenly between lines of output, so a single line might span multiple data events.
I've actually written a Node library for exactly this purpose. It's called stream-splitter and you can find it on GitHub: samcday/stream-splitter.
The library provides a special Stream you can pipe your casper stdout into, along with a delimiter (in your case, \n), and it will emit neat token events, one for each line it has split out from the input Stream. The internal implementation for this is very simple, and delegates most of the magic to substack/node-buffers, which means there are no unnecessary Buffer allocations/copies.
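Usage looks roughly like this, based on the project's README (treat the StreamSplitter("\n") call and the "token"/"done" event names as assumptions to verify against the docs):

var StreamSplitter = require("stream-splitter");

// Pipe the child process stdout through the splitter; it emits one "token" per line.
var splitter = cspr.stdout.pipe(StreamSplitter("\n"));
splitter.encoding = "utf8"; // emit tokens as strings instead of Buffers

splitter.on("token", function (line) {
    // one complete line from the casper script
    console.log("line: " + line);
});

splitter.on("done", function () {
    console.log("casper stdout ended");
});

splitter.on("error", function (err) {
    console.log("splitter error: " + err);
});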
I found a nicer way to do this with just pure Node, which seems to work well:

const childProcess = require('child_process');
const readline = require('readline');

const cspr = childProcess.spawn(bin, args);

const rl = readline.createInterface({ input: cspr.stdout });
rl.on('line', (line) => {
    // handle line here
});
Adding to maerics' answer, which does not deal properly with cases where only part of a line is fed in a data dump (theirs will give you the first part and the second part of the line individually, as two separate lines).
var _breakOffFirstLine = /\r?\n/;

// Returns a function that takes chunks of stdout data, aggregates them, and
// passes complete lines one by one through to callback, as soon as it gets them.
function filterStdoutDataDumpsToTextLines(callback) {
    var acc = '';
    return function (data) {
        var splitted = data.toString().split(_breakOffFirstLine);
        var inTactLines = splitted.slice(0, splitted.length - 1);
        if (inTactLines.length > 0) {
            // if there was a partial, unended line in the previous dump, it is completed by the first section
            inTactLines[0] = acc + inTactLines[0];
            // if there is a partial, unended line in this dump, store it to be completed by the next
            // (we assume there will be a terminating newline at some point; this is, generally, a safe assumption)
            acc = splitted[splitted.length - 1];
        } else {
            // no newline in this chunk at all, so keep accumulating
            acc += splitted[0];
        }
        for (var i = 0; i < inTactLines.length; ++i) {
            callback(inTactLines[i]);
        }
    };
}
usage:

cspr.stdout.on('data', filterStdoutDataDumpsToTextLines(function (line) {
    // each time this inner function is called, you will be getting a single, complete line of the stdout
}));
You can give this a try. It will ignore any empty lines or empty new line breaks.
cspr.stdout.on('data', (data) => {
    data = data.toString().split(/(\r?\n)/g);
    data.forEach((item, index) => {
        if (data[index] !== '\n' && data[index] !== '') {
            console.log(data[index]);
        }
    });
});
Old stuff but still useful...
I have made a custom stream Transform subclass for this purpose.
See https://stackoverflow.com/a/59400367/4861714
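That answer's code isn't reproduced here, but a minimal sketch of the same idea (a hypothetical newline-splitting Transform in object mode; empty lines are skipped) could look like this:

const { Transform } = require('stream');

// Buffers incoming chunks and pushes one complete line per 'data' event.
class LineSplitter extends Transform {
    constructor() {
        super({ readableObjectMode: true }); // each pushed line comes out as its own chunk
        this.remainder = '';
    }

    _transform(chunk, encoding, callback) {
        const lines = (this.remainder + chunk.toString()).split(/\r?\n/);
        this.remainder = lines.pop(); // keep the trailing partial line for the next chunk
        for (const line of lines) {
            if (line !== '') {
                this.push(line);
            }
        }
        callback();
    }

    _flush(callback) {
        if (this.remainder !== '') {
            this.push(this.remainder);
        }
        callback();
    }
}

// usage:
// cspr.stdout.pipe(new LineSplitter()).on('data', (line) => { /* handle line */ });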
#nyctef's answer uses an official nodejs package.
Here is a link to the documentation: https://nodejs.org/api/readline.html
The node:readline module provides an interface for reading data from a Readable stream (such as process.stdin) one line at a time.
My personal use-case is parsing json output from the "docker watch" command created in a spawned child_process.
const dockerWatchProcess = spawn(...)
...
const rl = readline.createInterface({
    input: dockerWatchProcess.stdout,
    output: null,
});

rl.on('line', (log: string) => {
    console.log('dockerWatchProcess event::', log);
    // code to process a change to a docker event
    ...
});
