node.js readfile woes - node.js

The following piece of code creates a text file and then reads it, overwrites it, and reads it again. Except for the creation of the file, the three I/O operations are performed using Node.js's asynchronous readFile and writeFile.
I don't understand why the first read is returning no error but no data either. The output of this code is:
Starting...
Done.
first read returned EMPTY data!
write finished OK
second read returned data: updated text
Even if the operations were to happen in an arbitrary order (due to their async nature), I would NOT have expected to get an "empty data" result.
Any ideas why I am getting empty data (and no error) when reading the file?
Is there anything that I can do to make sure the file content is read?
var fs = require('fs');
var fileName = __dirname + '/test.txt';
// Create the test file (this is sync on purpose)
fs.writeFileSync(fileName, 'initial test text', 'utf8');
console.log("Starting...");
// Read async
fs.readFile(fileName, 'utf8', function(err, data) {
    var msg = "";
    if(err)
        console.log("first read returned error: ", err);
    else {
        if (data === null)
            console.log("first read returned NULL data!");
        else if (data === "")
            console.log("first read returned EMPTY data!");
        else
            console.log("first read returned data: ", data);
    }
});
// Write async
fs.writeFile(fileName, 'updated text', 'utf8', function(err) {
    var msg = "";
    if(err)
        console.log("write finished with error: ", err);
    else
        console.log("write finished OK");
});
// Read async
fs.readFile(fileName, 'utf8', function(err, data) {
    var msg = "";
    if(err)
        console.log("second read returned error: ", err);
    else
        if (data === null)
            console.log("second read returned NULL data!");
        else if (data === "")
            console.log("second read returned EMPTY data!");
        else
            console.log("second read returned data: ", data);
});
console.log("Done.");

Your code is asking for race conditions. The initial sync write finishes before anything else runs, but your first read, second write, and second read are then all started at once, with no guaranteed ordering among them.
What could have happened here? The first read opens the file, the second write opens it for writing and immediately truncates it to zero length, and then the first read reads the now-empty file. Then the second write starts writing data, and the second read doesn't get to read until it's done.
If you want to avoid this, you need to chain the operations so that each one starts only after the previous one has finished:
fs.writeFileSync(filename, 'initial', 'utf8');
fs.readFile(filename, 'utf8', function(err, data) {
    console.log(data);
    fs.writeFile(filename, 'text', 'utf8', function(err) {
        fs.readFile(filename, 'utf8', function(err, data) {
            console.log(data);
        });
    });
});
If that "pyramid" insults your programming sensibilities (why wouldn't it?) use the async library's series function:
var async = require('async'); // third-party library: npm install async
fs.writeFileSync(filename, 'initial', 'utf8');
async.series([
    function(callback) {
        fs.readFile(filename, 'utf8', callback);
    },
    function(callback) {
        fs.writeFile(filename, 'text', 'utf8', callback);
    },
    function(callback) {
        fs.readFile(filename, 'utf8', callback);
    }
], function(err, results) {
    if(err) console.log(err);
    // writeFile passes no result to its callback, so the middle entry is undefined
    console.log(results); // Should be: ['initial', undefined, 'text']
});
EDIT: More compact, but also more "magical" to people not familiar with the async library and modern JavaScript features:
fs.writeFileSync(filename, 'initial', 'utf8');
async.series([
    fs.readFile.bind(this, filename, 'utf8'),
    fs.writeFile.bind(this, filename, 'text', 'utf8'),
    fs.readFile.bind(this, filename, 'utf8'),
], function(err, results) {
    if(err) console.log(err);
    // writeFile passes no result to its callback, so the middle entry is undefined
    console.log(results); // Should be: ['initial', undefined, 'text']
});
EDIT2: Serves me right for making that edit without looking up the definition of bind. The first parameter needs to be the this object (or whatever you want to use as this).
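On newer Node.js versions the same ordering can be had without the async library. This is a minimal sketch, assuming a Node.js version that ships fs.promises; the helper name readWriteRead and the temp-file location are made up for the demo and are not from the question.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read, overwrite, then read again, strictly in order.
// Each await finishes before the next operation is started.
async function readWriteRead(fileName) {
  const first = await fs.promises.readFile(fileName, 'utf8');
  await fs.promises.writeFile(fileName, 'updated text', 'utf8');
  const second = await fs.promises.readFile(fileName, 'utf8');
  return [first, second];
}

// Demo: the temp file path is an assumption, any writable path would do.
const demoFile = path.join(os.tmpdir(), 'readfile-woes-demo.txt');
fs.writeFileSync(demoFile, 'initial test text', 'utf8');
readWriteRead(demoFile)
  .then(([first, second]) => console.log(first, '|', second))
  .catch(console.error);
```

Because the write is only started after the first read has resolved, the first read can no longer observe the truncated file.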

I had a similar problem. I was writing text to a file and had a change handler telling me when the file had changed, at which point I tried to read it asynchronously to process the new content further.
Most of the time that worked, but in some cases the callback for the async read returned an empty string. Perhaps the change event fired before the file was fully written, so when I tried to read it I got an empty string. One could have hoped that the async read would recognize that the file was being written and wait until the write completed. It seems that in Node.js writing does not lock the file against reads, so you get unexpected results if you try to read while a write is in progress.
I was able to work around this problem by checking whether the result of the async read was an empty string and, if so, doing an additional sync read on the same file. That seems to produce the correct content. Yes, the sync read is slower, but I do it only when the async read appears to have failed to produce the expected content.
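The workaround described above could look roughly like this. It is only a sketch: readWithSyncFallback is a hypothetical name, and the fallback is still a race, since the sync read can also see an empty file if the writer has not finished yet.

```javascript
const fs = require('fs');

// Hypothetical helper: async read, with one synchronous retry when the
// async read comes back as an empty string.
function readWithSyncFallback(file, callback) {
  fs.readFile(file, 'utf8', function (err, data) {
    if (err) return callback(err);
    if (data !== '') return callback(null, data);

    // The async read saw an empty file; try once more synchronously.
    let retried;
    try {
      retried = fs.readFileSync(file, 'utf8');
    } catch (syncErr) {
      return callback(syncErr);
    }
    callback(null, retried);
  });
}
```

A file that is legitimately empty simply gets read twice and still yields an empty string.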

Related

How do I modify JSON files while keeping the state updated?

If I have a program as follows to modify a JSON file:
var fs = require('fs');
var dt = require('./dataWrite.json');
console.log("before",dt);
fs.readFile('./data.json', 'utf8', (err, data) => {
if (err) {
throw err;
}
else {
fs.writeFileSync('./dataWrite.json', data);
}
});
console.log("after",dt);
The console for the before and after gives me the same results. The data in the file is modified as expected though. Is there a way to always have the latest state of the file in your program?
Side question: the following code doesn't modify the files at all, and I wasn't able to figure out why.
var fs = require('fs');
var dt = fs.readFileSync('./dataTest.json', 'utf8', function (err, data) {
if (err) {
throw err;
}
});
console.log('before', dt);
fs.readFileSync('./data.json', 'utf8', (err, data) => {
if (err) {
throw err;
}
fs.writeFileSync('./dataTest.json', data);
console.log('data', data);
});
console.log("after", dt);
It's important here to distinguish between synchronous and asynchronous logic.
Since you are using require to read in the JSON file, its contents are read synchronously into dt, once only, when the program starts.
When you use the fs.readFile API, you'll notice that it is an asynchronous API and that it requires you to provide a callback to handle the file's data. This means that anything inside the callback runs later, after the surrounding synchronous code has finished.
As such, your before and after code will just print the same contents.
If you console.log(dt) right after executing fs.writeFileSync, you will still see the old value, since dt holds the value loaded at the beginning of the program and not the latest one. If you instead reassign dt to the contents of the file after rereading it, you will see the latest contents.
e.g.
...
fs.writeFileSync('./dataWrite.json', data);
dt = fs.readFileSync('./dataWrite.json', 'utf8');
console.log(dt);
...
See fs.readFileSync.
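On the side question: fs.readFileSync returns the file contents directly and takes no callback, so a function passed as an extra argument is never invoked and the fs.writeFileSync inside it never runs. A minimal sketch of a fully synchronous version follows; copyAndReload is a hypothetical helper name, and the paths are whatever the caller passes in.

```javascript
const fs = require('fs');

// Hypothetical helper: copy the text of one file into another, then
// re-read the target so the caller holds its latest contents.
function copyAndReload(sourcePath, targetPath) {
  // readFileSync returns the data; there is no callback to wait for.
  const data = fs.readFileSync(sourcePath, 'utf8');
  fs.writeFileSync(targetPath, data);
  // Fresh read after the write, so this reflects the current file state.
  return fs.readFileSync(targetPath, 'utf8');
}
```

With the file names from the question, the call would be dt = copyAndReload('./data.json', './dataTest.json').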

How to set variable value inside read file function and use it outside in node js?

I'm trying to read the file content as a string and set a variable equal to it. But when I access the file_data variable outside the function, it is empty. Inside the fs.readFile callback, however, it works fine.
var fs = require('fs');
let file_data = '';
fs.readFile('text_file.txt', 'utf8', function(err, data) {
if (err) throw err;
console.log(data); //this works fine here
file_data = data; //setting it here so that I can use it afterwards
});
//the following line gives blank output
console.log(file_data );
I'm a bit new at this, so please point out if there is something I need to read first before using functions like this.
fs.readFile() is an async function, so
//the following line gives blank output
console.log(file_data );
is processed before fs.readFile() has called back. Instead, you can use the data inside the callback:
fs.readFile('text_file.txt', 'utf8', function(err, data) {
if (err) throw err;
console.log(data); //this works fine here
file_data = data; //setting it here so that I can use it afterwards
// CONSOLE LOG HERE!! AND IT WORKS
console.log(file_data );
});
Alternatively, you can use the synchronous version fs.readFileSync()
try {
const data = fs.readFileSync('text_file.txt', 'utf8')
console.log(data)
} catch (err) {
console.error(err)
}
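A third option, on Node.js versions that ship fs.promises, is async/await, which lets the code after the read look sequential without blocking. This is a sketch only: loadText and main are hypothetical names, and text_file.txt is the file name from the question.

```javascript
const fs = require('fs');

// Hypothetical helper: resolves with the file's text.
function loadText(file) {
  return fs.promises.readFile(file, 'utf8');
}

async function main() {
  // Execution of main pauses here until the read has completed.
  const file_data = await loadText('text_file.txt');
  console.log(file_data); // the value is available from this point on
}

// Any read error (for example a missing file) ends up here.
main().catch(console.error);
```

Anything that needs file_data still has to live inside main, or inside a function that main calls after the await.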

fs.appendFile but fail if path does *not* exist

Looking through the fs docs, I am looking for a flag that I can use with fs.appendFile, where an error will be raised if the path does not exist.
I am seeing flags that pertain to raising errors if the path does already exist, but I am not seeing flags that will raise errors if the path does not exist -
https://nodejs.org/api/fs.html
First off, I assume you mean fs.appendFile(), since the fs.append() you refer to is not in the fs module.
There does not appear to be a flag that opens the file for appending that returns an error if the file does not exist. You could write one yourself. Here's a general idea for how to do so:
fs.appendToFileIfExist = function(file, data, encoding, callback) {
    // check for optional encoding argument
    if (typeof encoding === "function") {
        callback = encoding;
        encoding = 'utf8';
    }
    // r+ opens file for reading and writing. Error occurs if the file does
    // not exist.
    fs.open(file, 'r+', function(err, fd) {
        if (err) return callback(err);
        function done(err) {
            fs.close(fd, function(close_err) {
                fd = null;
                if (!err && close_err) {
                    // if no error passed in and there was a close error, return that
                    return callback(close_err);
                } else {
                    // otherwise return error passed in
                    callback(err);
                }
            });
        }
        // file is open here, call done(err) when we're done to clean up open file
        // get length of file so we know how to append
        fs.fstat(fd, function(err, stats) {
            if (err) return done(err);
            // write data to the end of the file
            fs.write(fd, data, stats.size, encoding, function(err) {
                done(err);
            });
        });
    });
}
You could, of course, just test to see if the file exists before calling fs.appendFile(), but that is not recommended because of race conditions. Instead, it is recommended that you set the right flags on fs.open() and let that trigger an error if the file does not exist.
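The string flags don't seem to cover this case, but fs.open also accepts numeric flags, which may be a shorter route. A minimal sketch, assuming a POSIX-like platform where opening with O_APPEND but without O_CREAT fails with ENOENT for a missing file; appendIfExists is a hypothetical name, not part of the fs module.

```javascript
const fs = require('fs');

// Hypothetical helper: append only when the file already exists.
function appendIfExists(file, data, callback) {
  // O_APPEND without O_CREAT: the open fails if the file is missing.
  const flags = fs.constants.O_WRONLY | fs.constants.O_APPEND;
  fs.open(file, flags, function (err, fd) {
    if (err) return callback(err);
    // appendFile accepts an open file descriptor and does not close it.
    fs.appendFile(fd, data, 'utf8', function (writeErr) {
      fs.close(fd, function (closeErr) {
        callback(writeErr || closeErr || null);
      });
    });
  });
}
```

Since the descriptor is opened in append mode, there is no need to fstat the file for its size first.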

Optimal design pattern - functions which process multiple files

Goal is to create distinct functions which separate out the work of loading multiple (xml) files and parsing them. I could do this all in one function, but the nested callbacks begin to get ugly. In other words, I don't want to do this:
// Explore directory
fs.readdir(path, function (err, files) {
    if(err) throw err;
    // touch each file
    files.forEach(function(file) {
        fs.readFile(path+file, function(err, data) {
            if (err) throw err;
            someAsyncFunction ( function (someAsyncFunctionResult) {
                // Do some work, then call another async function...
                nestedAsynchFunction ( function (nestedAsyncFunctionResult) {
                    // Do Final Work here, X levels deep. Ouch!
                });
            });
        });
    });
});
Instead, I want one function which reads my files and puts each file's XML payload into an array of objects which is returned to the caller (each object represents the name of the file and the XML in the file). Here's the function that might load up reports into an array:
function loadReports (callback) {
    var path = "./downloaded-reports/";
    var reports = [];
    // There are TWO files in this path....
    fs.readdir(path, function (err, files) {
        if(err) throw err;
        files.forEach(function(file) {
            fs.readFile(path+file, function(err, data) {
                if (err) throw err;
                reports.push({ report: file, XML: data.toString()});
                //gets called twice, which makes for strangeness in the calling function
                callback(null, reports);
            });
        });
        // callback won't work here, returns NULL reports b/c they haven't been processed yet
        //callback(null, reports);
    });
}
...and here's the function which will call the one above:
function parseReports() {
loadReports( function(err, data) {
console.log ("loadReports callback");
var reportXML = new jsxml.XML(data[0].XML);
var datasources = reportXML.child('datasources').child('datasource').child('connection').attribute("dbname").toString();
console.log(JSON.stringify(datasources,null, 2));
// More async about to be done below
} );
}
As you can see in the loadReports() comments, I can't get the callback to work right. It either calls back BEFORE the array has been populated at all, or it calls back twice - once for each fs.readFile operation.
SO...what is the best way to deal with this sort of situation? In brief - What's the best design pattern for a function which processes multiple things asynchronously, so that it ONLY calls back when all "things" have been completely processed? The simpler the better. Do I need to use some sort of queuing module like Q or node-queue?
Thanks much!
Edit: Something like this works inside the deepest loop in terms of not hitting the callback twice, but it seems like a kludge:
fs.readdir(path, function (err, files) {
if(err) throw err;
files.forEach(function(file) {
fs.readFile(path+file, function(err, data) {
if (err) throw err;
reports.push({ report: file, XML: data.toString()});
// WORKS, but seems hacky.
if (reports.length === files.length) callback(null, reports);
});
});
});
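One pattern that calls back exactly once, after every file has been read, is to turn each read into a promise and wait on all of them with Promise.all. This is a sketch, assuming a Node.js version with fs.promises and a directory that contains only regular files; this loadReports takes the directory as a parameter, which differs from the function above.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical promise-based variant of loadReports.
async function loadReports(dir) {
  const files = await fs.promises.readdir(dir);
  // Start every read at once, then wait until all of them have finished.
  return Promise.all(files.map(async function (file) {
    const data = await fs.promises.readFile(path.join(dir, file), 'utf8');
    return { report: file, XML: data };
  }));
}

// Usage: the directory name is the one from the question.
loadReports('./downloaded-reports/')
  .then(function (reports) { console.log(reports.length + ' reports loaded'); })
  .catch(console.error);
```

The resolved array is in the same order as the list returned by readdir, and a failure in any single read rejects the whole promise instead of throwing inside a callback.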

Node.js fs.readFile()

Can anyone think of a reason why the readFile function's callback doesn't get executed?
fs.exists(filePath, function(exists){
if(exists){ // results true
fs.readFile(filePath, "utf8", function(err, data){
if(err){
console.log(err)
}
console.log(data);
})
}
});
filePath is ./etc/coords.txt and it contains a JSON-formatted string.
Using the sync version, readFileSync, doesn't work either.
Because options is an object, not a string:
fs.readFile(filename, [options], callback)
    filename String
    options Object
        encoding String | Null default = null
        flag String default = 'r'
    callback Function
Asynchronously reads the entire contents of a file. Example:
fs.readFile('/etc/passwd', function (err, data) {
    if (err) throw err;
    console.log(data);
});
The callback is passed two arguments (err, data), where data is the contents of the file.
So:
fs.exists(filePath, function(exists){
if(exists){ // results true
fs.readFile(filePath, {encoding: "utf8"}, function(err, data){
if(err){
console.log(err)
}
console.log(data);
})
}
});
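Separately from the options argument, fs.exists is marked deprecated in the current Node.js docs, and the existence check can be dropped entirely by letting readFile report a missing file through err. A minimal sketch along those lines, with JSON parsing added since the file is said to hold JSON; readJson is a hypothetical helper name.

```javascript
const fs = require('fs');

// Hypothetical helper: read and parse a JSON file, passing every failure
// (missing file, unreadable file, invalid JSON) to the callback.
function readJson(filePath, callback) {
  fs.readFile(filePath, { encoding: 'utf8' }, function (err, data) {
    if (err) return callback(err); // e.g. err.code === 'ENOENT' if missing
    let parsed;
    try {
      parsed = JSON.parse(data);
    } catch (parseErr) {
      return callback(parseErr);
    }
    callback(null, parsed);
  });
}
```

Logging err in the callback, rather than only checking that the file exists, should at least show why a read produced nothing.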
I think it might be related to a problem with the text file.
This file was generated by a C# app that wrote a stream containing Environment.NewLine to the file.
I'm not sure that's it, but once I removed the Environment.NewLine it worked.
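If the file is found locally and holds JSON, you can load it with require, provided it has a .json extension (require would try to parse a .txt file as JavaScript):
var contents = require("./file.json");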

Resources