How to append some data directly to a file with node.js?

Let's say we have:
a file data.json or data.txt with this content: {"data":[]}
and an array of paths: ["C:\\path1", "C:\\path2", "C:\\path3"]
Question
How would we append the array of paths into this file with node.js data stream (or whatnot) so that we get this in the end:
{"data":["C:\\path1", "C:\\path2", "C:\\path3"]}
Code
let filePath = 'C:\\test\\data.json'
let paths = ["C:\\path1", "C:\\path2", "C:\\path3"]

for (let index = 0; index < paths.length; index++) {
  // ... streaming paths to the file one by one
}
I cannot put the paths into the file without a loop: in my project I have walkdir(drive, options, (path) => {}) instead of the for loop. It also returns paths one by one; the for loop above is just for demonstration.

Because it is JSON, you can't actually append to the file. You have to read out the entire document, parse the JSON to a POJO, make your changes, stringify the JSON, and write it back.
import { readFile, writeFile } from 'fs';

readFile(filePath, (err, data) => {
  if (err) throw err;
  const json = JSON.parse(data);
  paths.forEach(path => json.data.push(path));
  writeFile(filePath, JSON.stringify(json), err => { /* handle err */ });
});
If it were a plain-text file, you could append by writing to it with the flag option set to 'a' (for append).
writeFile(filePath, 'text to append', { flag: 'a' }, err => { /* handle err */ });
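If a newline-delimited format (one JSON-encoded path per line) is acceptable instead of a single JSON array, the paths can be appended as they arrive from the walk without rewriting the file each time. A minimal sketch, reusing the walkdir callback from the question (walkdir, drive and options are the question's own):
const fs = require('fs');

// Open the output once with the append flag; each path becomes one line (NDJSON).
const out = fs.createWriteStream('C:\\test\\data.ndjson', { flags: 'a' });

walkdir(drive, options, (path) => { // walkdir/drive/options come from the question
  out.write(JSON.stringify(path) + '\n');
});

// When the walk has finished, close the stream:
// out.end();
// To rebuild the array later, read the file, split on newlines and JSON.parse each line.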

Related

Read Lines of figures in Text file and add String to each line

I'm trying to read lines of figures in a text file, append the string 'paid' to each line, and save them all to another text file. But I'm confused by the output I'm getting. The commented-out code is for saving the result to a text file.
fs.readFile('./figures.txt', 'utf8', (err, data) => {
  const figures = data.split('\n');
  for (let i = 0; i < figures.length; i++) {
    console.log(figures[i] + 'paid');
  }
  // fs.writeFile('./paidApproved.txt', figures.join('\n'), (err) => {
  //   if (err) {
  //     reject(err);
  //   }
  // });
});
The output I got:
paid50
paid44
179987paid
It looks like the line ending in the file is different from the one your code assumes.
There are two common line endings:
\n on POSIX
\r\n on Windows
So if you're sure about the structure of the file and you think the file itself is fine, you only need a tiny change in your code. Instead of hard-coding \n as the line breaker, ask the os module for the line breaker, and your code will be more flexible.
const os = require('os');

fs.readFile('./figures.txt', 'utf8', (err, data) => {
  const figures = data.split(os.EOL);
  // rest of code
});
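One caveat: os.EOL reflects the OS the script runs on, not necessarily the OS that produced the file (your output suggests the file has \r\n line endings). Splitting on a regex that accepts both endings is a safer bet; a minimal sketch based on the question's code:
fs.readFile('./figures.txt', 'utf8', (err, data) => {
  if (err) throw err;
  // \r?\n matches both Windows (\r\n) and POSIX (\n) line endings.
  const figures = data.split(/\r?\n/);
  const paid = figures.map(line => line + 'paid');
  fs.writeFile('./paidApproved.txt', paid.join('\n'), (err) => {
    if (err) throw err;
  });
});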

Node Js Accessing a JSON File

I'm still new to JSON. I've searched the net and written a function that creates an init file if none exists, but I'm coming up blank on how to search and retrieve the data of the now-existing file, or how to add new entries or update existing ones.
So far I can read the file and dump the result in a console.log, so I know the assignment works. It's a global variable, so the data should persist outside the readFile callback, but when I try to access it later to build the local array I'll pull data from and use for updating, it reads as undefined.
fs.readFile(path, 'utf8', (error, data) => {
  if (error) {
    console.log(error);
    return;
  }
  //console.log(JSON.parse(data));
  JSONData = JSON.parse(data);
  for (let i = 0; i < JSONData.length; i++) {
    console.log(i + ": [" + JSONData[i].unique + "] " + JSONData[i].name);
  }
}); //fs.readFile
var playerKey = "KuroTO";
playerKey = playerKey.toLowerCase();
for (let i = 0; i < JSONData.length; i++) {
if (JSONData[i].unique.toLowerCase() == playerKey){
console.log("["+i+"] "+JSONData[i].unique.toLowerCase()+": "+playerKey);
PlayerCard1.push(JSONData[i].userid);//0
PlayerCard1.push(JSONData[i].username);//1
PlayerCard1.push(JSONData[i].unique);//2
PlayerCard1.push(JSONData[i].name);//3
PlayerCard1.push(JSONData[i].avatarurl);//4
PlayerCard1.push(JSONData[i].level);//5
PlayerCard1.push(JSONData[i].Rank);//6
PlayerCard1.push(JSONData[i].henshined);//7
PlayerCard1.push(JSONData[i].Strength);//8
PlayerCard1.push(JSONData[i].Perception);//9
PlayerCard1.push(JSONData[i].Endurance);//10
PlayerCard1.push(JSONData[i].Wisdom);//11
PlayerCard1.push(JSONData[i].Intelligence)//12;
PlayerCard1.push(JSONData[i].Luck)//13;
PlayerCard1.push(JSONData[i].Agility)//14;
PlayerCard1.push(JSONData[i].Flexability)//15;
PlayerCard1.push(JSONData[i].RatedSpeed)//16;
};//if unique matches
};//for
This is the pseudo-code concept I'm trying to implement:
if (JSONData.stringify.unique == {SearchUID}) { toonname = JSONData.stringify.name; }
As I understand it, you can't really append; you just rewrite the whole file again with the new data, and I think I can figure that out on my own once I can figure out how to read the file into an array I can search like above.
To read JSON, simply require the file.
JSON:
{
  "key": "H"
}
JS:
let jsonFile = require("./path/to/json");
console.log(jsonFile.key); // H
Editing is just as simple.
let jsonFile = require("./path/to/json");
jsonFile.key = "A"
console.log(jsonFile.key) // A
Saving edits requires use of FileSystem:
const fs = require("fs")
let jsonFile = require("./path/to/json");
jsonFile.key = "A"
// first argument is the file path
// second argument is the JSON to write - the file is overwritten already
// due to above, so just JSON.stringify() the required file.
// third argument is an error callback
fs.writeFile("./path/to/jsonFile", JSON.stringify(jsonFile), (err) => {
if (err) throw new Error(err);
});
This can also be used to slightly clean up your current init function if you wanted, but that's up to you of course.
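For the "search and retrieve" part, once the file is loaded you have a plain array, so Array.prototype.find will do. A minimal sketch, assuming the file holds an array of player objects with unique and name fields as in the question's code:
const players = require("./path/to/json"); // assumed: an array of player objects

const playerKey = "KuroTO".toLowerCase();
const player = players.find(p => p.unique.toLowerCase() === playerKey);

if (player) {
  console.log(player.name); // read whatever fields you need
} else {
  console.log("no player with that unique id");
}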

How to read big csv file batch by batch in Nodejs?

I have a csv file which contains more than 500k records. Fields of the csv are
name
age
branch
Without loading the huge data into memory I need to process all the records in the file. I need to read a few records, insert them into a collection, manipulate them, and then continue reading the remaining records. As I'm new to this, I'm unable to understand how it would work. If I try to print the batch, it prints buffered data. Will the code below work for my requirement? With that buffered value, how can I get the csv records to insert and manipulate?
var stream = fs.createReadStream(csvFilePath)
  .pipe(csv())
  .on('data', (data) => {
    batch.push(data)
    counter++;
    if (counter == 100) {
      stream.pause()
      setTimeout(() => {
        console.log("batch in ", data)
        counter = 0;
        batch = []
        stream.resume()
      }, 5000)
    }
  })
  .on('error', (e) => {
    console.log("er ", e);
  })
  .on('end', () => {
    console.log("end");
  })
I have written some sample code showing how to work with streams.
You basically create a stream and process its chunks. A chunk is an object of type Buffer. To work on it as text, call toString().
I don't have a lot of time to explain more, but the comments should help.
Also consider using a module, since CSV parsing has already been done many times.
Hope this helps.
import * as fs from 'fs'
// end of line delimiter, system specific.
import { EOL } from 'os'

// the delimiter used in the csv
var delimiter = ','

// add your own implementation of parsing a portion of the text here.
const parseChunk = (text, index) => {
  // first chunk, the header is included here.
  if (index === 0) {
    // The first row will be the header. So take it
    var headerLine = text.substring(0, text.indexOf(EOL))
    // remove the header from the text for further processing.
    // also replace the new line character..
    text = text.replace(headerLine + EOL, '')
    // Do something with the header here..
  }
  // Now you have a part of the file to process without headers.
  // The csv parse function you need to figure out yourself. Best
  // is to use some module for that. There are plenty of edge cases
  // when parsing csv.
  // custom csv parser here => https://stackoverflow.com/questions/1293147/example-javascript-code-to-parse-csv-data
  // if the csv is well formatted it could be enough to use this
  var lines = text.split(EOL)
  for (var line of lines) {
    var values = line.split(delimiter)
    console.log('line received', values)
    // StoreToDb(values)
  }
}

// create the stream
const stream = fs.createReadStream('file.csv')
// variable to count the chunks, for knowing whether the header is included..
var chunkCount = 0

// handle data event of stream
stream.on('data', chunk => {
  // the stream sends you a Buffer
  // to have it as text, convert it to string
  const text = chunk.toString()
  // Note that chunks will be a fixed size
  // but mostly consist of multiple lines,
  // and a line can be split across two chunks.
  parseChunk(text, chunkCount)
  // increment the count.
  chunkCount++
})

stream.on('end', () => {
  console.log('parsing finished')
})

stream.on('error', (err) => {
  // error, handle properly here, maybe roll back changes already made to the db
  // and parse again. You may also use the chunkCount to start the parsing
  // again and omit the first x chunks, so you can restart at a given point
  console.log('parsing error ', err)
})
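Regarding the original batching idea (pause, insert 100 records, resume), here is a minimal sketch using Node's built-in readline module instead of a csv module. It assumes a simple comma-separated file with no quoted fields, and processBatch is a hypothetical function standing in for your collection insert:
const fs = require('fs')
const readline = require('readline')

const rl = readline.createInterface({
  input: fs.createReadStream(csvFilePath), // csvFilePath as in the question
  crlfDelay: Infinity // treat \r\n as a single line break
})

let header = null
let batch = []

rl.on('line', (line) => {
  if (header === null) {
    header = line.split(',') // name, age, branch
    return
  }
  batch.push(line.split(','))
  if (batch.length === 100) {
    // note: a few already-buffered lines may still be emitted after pause()
    rl.pause()
    processBatch(batch, () => { // hypothetical: insert into your collection, then continue
      batch = []
      rl.resume()
    })
  }
})

rl.on('close', () => {
  if (batch.length > 0) processBatch(batch, () => {})
  console.log('done')
})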

NodeJS - If moving multiple files fails, deleting ones that failed

I'm trying to upload multiple files with an HTTP post, and then NodeJS handles:
save files' info to database
move files from tmp folder to permanent folder
if any file move fails, delete the file from tmp folder
My two issues are described in the comments within code snippet below:
path.resolve isn't working
iterator isn't working within fs.rename
for (i = 0; i < req.files.length; i++) {
const file = new fileSchema({
_userId: req.body._userId,
_companyId: req.body._companyId,
_meetingId: response._id,
originalFilename: req.files[i].originalname,
savedFilename: req.files[i].filename,
});
file.save().then((response) => { }).catch(error => { console.log(error) });
const currentPath = path.resolve(temp_folder, req.files[i].filename);
const newPath = upload_folder +"/"+ req.body._userId +"/"+ req.body._companyId +"/"+ response._id +"/"+ req.files[i].filename;
// 1. why doesn't path.resolve work with the inputs on the line above? I have to concat a string as in line above?
fs.rename(currentPath, newPath, function(err) {
if (err) {
console.log("Error moving files");
try { removeTempFiles(temp_folder, req.files[i]); } // helper function which works written elsewhere
// 2. req.files[i] is undefined (even though req.files works) so the line above fails - i.e. the iterator isn't captured within rename?
catch(err) { console.log(err); }
} else {
console.log("Successfully moved the file!");
}
});
}
Any help appreciated, thanks.
Change this
for (i = 0; i < req.files.length; i++) {
to this:
for (let i = 0; i < req.files.length; i++) {
The addition of let will create a separate i for each iteration of the for loop so it will stay valid inside your fs.rename() callback.
And, path.join(), is probably a better choice than path.resolve() for combining path segments.
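Put together, a minimal sketch of those two changes applied to the loop from the question (the surrounding variables are the question's own):
const fs = require('fs');
const path = require('path');

for (let i = 0; i < req.files.length; i++) { // let: each iteration gets its own i
  const currentPath = path.join(temp_folder, req.files[i].filename);
  // path.join wants strings, so convert _id in case it is an ObjectId
  const newPath = path.join(upload_folder, req.body._userId, req.body._companyId,
                            String(response._id), req.files[i].filename);

  fs.rename(currentPath, newPath, function (err) {
    if (err) {
      console.log("Error moving files");
      // i is still valid here because it was declared with let
      try { removeTempFiles(temp_folder, req.files[i]); }
      catch (e) { console.log(e); }
    } else {
      console.log("Successfully moved the file!");
    }
  });
}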

Nodejs Read very large file(~10GB), Process line by line then write to other file

I have a 10 GB log file in a particular format. I want to process this file line by line and then write the output to another file after applying some transformations. I am using Node for this operation.
This method works, but it takes a very long time. I was able to do this within 30-45 minutes in Java, but in Node it takes more than 160 minutes to do the same job. Following is the code:
Following is the initiation code which reads each line from the input.
var path = '../10GB_input_file.txt';
var output_file = '../output.txt';

function fileopsmain() {
  fs.exists(output_file, function(exists) {
    if (exists) {
      fs.unlink(output_file, function(err) {
        if (err) throw err;
        console.log('successfully deleted ' + output_file);
      });
    }
  });
  new lazy(fs.createReadStream(path, {bufferSize: 128 * 4096}))
    .lines
    .forEach(function(line) {
      var line_arr = line.toString().split(';');
      perform_line_ops(line_arr, line_arr[6], line_arr[7], line_arr[10]);
    });
}
This is the method that performs some operations on each line and passes the result to the write method, which writes it into the output file.
function perform_line_ops(line_arr, range_start, range_end, daynums) {
  var _new_lines = '';
  for (var i = 0; i < days; i++) {
    // perform some operation to modify line, pass it to print
  }
  write_line_ops(_new_lines);
}
The following method is used to write data into the new file.
function write_line_ops(line) {
  if (line != null && line != '') {
    fs.appendFileSync(output_file, line);
  }
}
I want to bring this time down to 15-20 minutes. Is it possible to do so?
Also, for the record, I'm trying this on an Intel i7 processor with 8 GB of RAM.
You can do this easily without a module. For example:
var fs = require('fs');
var inspect = require('util').inspect;

var buffer = '';
var rs = fs.createReadStream('foo.log');

rs.on('data', function(chunk) {
  var lines = (buffer + chunk).split(/\r?\n/g);
  buffer = lines.pop();
  for (var i = 0; i < lines.length; ++i) {
    // do something with `lines[i]`
    console.log('found line: ' + inspect(lines[i]));
  }
});

rs.on('end', function() {
  // optionally process `buffer` here if you want to treat leftover data without
  // a newline as a "line"
  console.log('ended on non-empty buffer: ' + inspect(buffer));
});
I can't guess where the bottleneck in your code is.
Can you add the library or the source code of the lazy function?
How many operations does your perform_line_ops do? (if/else, switch/case, function calls)
I've created an example based on your code. I know this does not answer your question, but maybe it helps you understand how Node handles such a case.
const fs = require('fs')
const path = require('path')

const inputFile = path.resolve(__dirname, '../input_file.txt')
const outputFile = path.resolve(__dirname, '../output_file.txt')

function bootstrap() {
  // fs.exists is deprecated
  // check if output file exists
  // https://nodejs.org/api/fs.html#fs_fs_exists_path_callback
  fs.exists(outputFile, (exists) => {
    if (exists) {
      // output file exists, delete it
      // https://nodejs.org/api/fs.html#fs_fs_unlink_path_callback
      fs.unlink(outputFile, (err) => {
        if (err) {
          throw err
        }
        console.info(`successfully deleted: ${outputFile}`)
        checkInputFile()
      })
    } else {
      // output file doesn't exist, move on
      checkInputFile()
    }
  })
}

function checkInputFile() {
  // check if input file can be read
  // https://nodejs.org/api/fs.html#fs_fs_access_path_mode_callback
  fs.access(inputFile, fs.constants.R_OK, (err) => {
    if (err) {
      // file can't be read, throw error
      throw err
    }
    // file can be read, move on
    loadInputFile()
  })
}

function saveToOutput() {
  // create write stream
  // https://nodejs.org/api/fs.html#fs_fs_createwritestream_path_options
  const stream = fs.createWriteStream(outputFile, {
    flags: 'w'
  })
  // return wrapper function which simply writes data into the stream
  return (data) => {
    // check if the stream is writable
    if (stream.writable) {
      if (data === null) {
        stream.end()
      } else if (data instanceof Array) {
        stream.write(data.join('\n'))
      } else {
        stream.write(data)
      }
    }
  }
}

function parseLine(line, respond) {
  respond([line])
}

function loadInputFile() {
  // create write stream
  const saveOutput = saveToOutput()
  // create read stream
  // https://nodejs.org/api/fs.html#fs_fs_createreadstream_path_options
  const stream = fs.createReadStream(inputFile, {
    autoClose: true,
    encoding: 'utf8',
    flags: 'r'
  })
  let buffer = null
  stream.on('data', (chunk) => {
    // append the buffer to the current chunk
    const lines = (buffer !== null)
      ? (buffer + chunk).split('\n')
      : chunk.split('\n')
    const lineLength = lines.length
    let lineIndex = -1
    // save last line for later (last line can be incomplete)
    buffer = lines[lineLength - 1]
    // loop through all lines
    // but don't include the last line
    while (++lineIndex < lineLength - 1) {
      parseLine(lines[lineIndex], saveOutput)
    }
  })
  stream.on('end', () => {
    if (buffer !== null && buffer.length > 0) {
      // parse the last line
      parseLine(buffer, saveOutput)
    }
    // Passing null signals the end of the stream (EOF)
    saveOutput(null)
  })
}

// kick off the parsing process
bootstrap()
I know this is old, but...
At a guess, appendFileSync() writes to the file system and waits for the response. Lots of small writes are generally expensive; presuming you use a BufferedWriter in Java, you might get faster results by skipping some of those writes.
Use one of the async writes and see if Node buffers sensibly, or write the lines to a large Node Buffer until it is full and always write a full (or nearly full) Buffer. By tuning the buffer size you could verify whether the number of writes affects performance. I suspect it does.
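A minimal sketch of that idea, keeping the questioner's write_line_ops() signature and output_file variable: open one write stream up front and let its internal buffer coalesce the small writes, instead of calling fs.appendFileSync() per line.
const fs = require('fs');

// One stream for the whole run; its internal buffer batches many small writes.
const out = fs.createWriteStream(output_file, { flags: 'w' });

function write_line_ops(line) {
  if (line != null && line !== '') {
    out.write(line); // returns false when the internal buffer is full (backpressure)
  }
}

// When all lines have been processed:
// out.end();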
The execution is slow because you're not using node's asynchronous operations. In essence, you're executing the code like this:
> read some lines
> transform
> write some lines
> repeat
while you could be doing everything at once, or at least reading and writing concurrently. Some examples in the answers here do that, but the syntax is fairly involved. Using scramjet you can do it in a couple of simple lines:
const {StringStream} = require('scramjet');

fs.createReadStream(path, {bufferSize: 128 * 4096})
  .pipe(new StringStream({maxParallel: 128})) // I assume this is a utf-8 file
  .split("\n")                                // split per line
  .parse((line) => line.split(';'))           // parse line
  .map(([line_arr, range_start, range_end, daynums]) => {
    return simplyReturnYourResultForTheOtherFileHere(
      line_arr, range_start, range_end, daynums
    ); // run your code, return a promise if you're doing some async work
  })
  .stringify((result) => result.toString())
  .pipe(fs.createWriteStream(output_file))
  .on("finish", () => console.log("done"))
  .on("error", (e) => console.log("error", e))
This will probably run much faster.
