Using async Node.js to serve HTTP requests - node.js

I've successfully written a few nodejs HTTP handlers to serve data in response to an HTTP request. However, everything I've written so far has used the *Sync versions of functions. I'm now quickly running into the limitations of this approach.
I cannot figure out, however, how to properly use asynchronous functions in the HTTP request context. If I try an async call, processing quickly falls through and returns without giving the code a chance to process the data.
What's the correct approach? I haven't been able to find any good examples, so any pointers to literature are appreciated. Short of that, what's an example of a handler for a GET request that scans a local directory and, say, returns a JSON list of file names and their corresponding line counts (or really any stub code along those lines that shows the proper technique)?

Here's a simple sample:
var http = require('http')
var fs = require('fs')

function dir (req, res) {
  fs.readdir('.', function (error, files) {
    if (error) {
      res.writeHead(500)
      res.end(error.message)
      return
    }
    files.forEach(function (file) {
      res.write(file + '\n')
    })
    res.end()
  })
}

var server = http.createServer(dir)
server.listen(7000)
Run with node server.js and test it with curl :7000.
Yes, the request handler returns before the readdir callback is executed. That is by design; that's how async programming works, and it's OK. When the filesystem I/O is done, the callback will execute and the response will be sent.
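As an aside, the question also asked for a JSON list of file names with line counts. Here's a rough sketch of how the callback-style handler above could be extended to do that; it assumes everything in the directory is a text file small enough to read into memory (directories and unreadable entries are reported as null), and the dirWithLineCounts name and counting approach are my own illustration rather than part of the original answer.
var http = require('http')
var fs = require('fs')

function dirWithLineCounts (req, res) {
  fs.readdir('.', function (error, files) {
    if (error) {
      res.writeHead(500)
      res.end(error.message)
      return
    }
    var results = {}
    var pending = files.length
    if (pending === 0) {
      res.writeHead(200, {'Content-Type': 'application/json'})
      res.end(JSON.stringify(results))
      return
    }
    files.forEach(function (file) {
      fs.readFile(file, 'utf8', function (error, contents) {
        // directories and unreadable entries come back as errors; record them as null
        results[file] = error ? null : contents.split('\n').length - 1
        pending--
        // only respond once every readFile callback has fired
        if (pending === 0) {
          res.writeHead(200, {'Content-Type': 'application/json'})
          res.end(JSON.stringify(results))
        }
      })
    })
  })
}

http.createServer(dirWithLineCounts).listen(7000)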

Peter Lyons' answer is great/correct. I'm going to expand on it a bit and suggest a different method of synchronization using promises and co as well as nested/looping asynchronicity.
/* Script to count the lines of all files in a directory */
const co = require("co");
// Promisified fs -- eventually node will support this on its own
const fs = require("mz/fs");

const rootDir = 'files/';

// Recursively count the lines of all files in the given directory and sum them
function countLines(directory) {
  // We can only use `yield` inside a generator block
  // `co` allows us to do this and walks through the generator on its own
  // `yield` will not move to the next line until the promise resolves
  //
  // This is still asynchronous code but it is written in a way
  // that makes it look synchronized. This entire block is asynchronous, so we
  // can `countLines` of multiple directories simultaneously
  return co(function* () {
    // `files` will be an array of files in the given directory
    const files = yield fs.readdir(directory);
    // `.map` will create an array of promises. `yield` only completes when
    // *all* promises in the array have resolved
    const lines = yield files.map(file => countFileLines(file, directory));
    // Sum the lines of all files in this directory
    return lines.reduce((a, b) => a + b, 0);
  });
}

function countFileLines(file, directory) {
  // We need the full path to read the file
  const fullPath = `${directory}/${file}`;
  // `co` returns a promise, so `co` itself can be yielded
  // This entire block is asynchronous so we should be able to count lines
  // of files without waiting for each file to be read
  return co(function* () {
    // Used to check whether this file is a directory
    const stats = yield fs.stat(fullPath);
    if (stats.isDirectory()) {
      // If it is, recursively count lines of this directory
      return countLines(fullPath);
    }
    // Otherwise just get the line count of the file
    const contents = yield fs.readFile(fullPath, "utf8");
    return contents.split("\n").length - 1;
  });
}

co(function* () {
  console.log(yield countLines(rootDir));
})
  // All errors propagate here
  .catch(err => console.error(err.stack));
Note that this is just an example. There are probably already libraries to count lines of files in a directory and there are definitely libraries that simplify recursive reading/globbing of files.
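For reference, current Node versions ship a built-in promise-based fs API (fs.promises), so the same idea can be written with plain async/await and no co or mz dependency. A rough equivalent of the sketch above, under that assumption:
/* Script to count the lines of all files in a directory, using built-in fs.promises */
const fs = require("fs").promises;
const path = require("path");

const rootDir = "files/";

// Recursively count the lines of all files in the given directory and sum them
async function countLines(directory) {
  const files = await fs.readdir(directory);
  // Start all counts at once; Promise.all waits for every one of them
  const lines = await Promise.all(
    files.map(file => countFileLines(file, directory))
  );
  return lines.reduce((a, b) => a + b, 0);
}

async function countFileLines(file, directory) {
  const fullPath = path.join(directory, file);
  const stats = await fs.stat(fullPath);
  if (stats.isDirectory()) {
    // Recurse into subdirectories
    return countLines(fullPath);
  }
  const contents = await fs.readFile(fullPath, "utf8");
  return contents.split("\n").length - 1;
}

countLines(rootDir)
  .then(total => console.log(total))
  // All errors propagate here
  .catch(err => console.error(err.stack));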

Related

Best Way to Build a Collection in Node Mongo Driver from a Directory

I asked a similar question yesterday to this, but the solution was very easy and did not really address my fundamental problem of not understanding flow control in asynchronous JavaScript. The short version of what I am trying to do is build a MongoDB collection from a directory of JSON files. I had it working, but I modified something and now the flow is such that the program runs to completion, and therefore closes the connection before the asynchronous insertOne() calls are executed. When the insertOne() calls are finally executed, the data is not input and I get warnings about an unhandled exception from using a closed connection.
I am new to this, so if what I am doing is not best practice (it isn't), please let me know and I am happy to change things to get it to be reliable. The relevant code basically looks like this:
fs.readdirSync(dataDir).forEach(async function(file){
  //logic to build data object from JSON file
  console.log('Inserting object ' + obj['ID']);
  let result = await connection.insertOne(obj);
  console.log('Object ' + result.insertedId + ' inserted.');
})
The above is wrapped in an async function that I await for. By placing a console.log() message at the end of program flow, followed by a while(true);, I have verified that all the "'Inserting object ' + obj[ID]" messages are printed, but not the following "'Object ' + result.insertedId + ' inserted'" messages when flow reaches the end of the program. If I remove the while(true); I get all the error messages, because I am no longer blocking and obviously by that point the client is closed. In no case is the database actually built.
I understand that there are always learning curves, but it is really frustrating to not be able to do something as simple as flow control. I am just trying to do something as simple as "loop through each file, perform function on each file, close, and exit", which is remedial programming. So, what is the best way to mark a point that flow control will not pass until all attempts to insert data into the Collection are complete (either successfully or unsuccessfully, because ideally I can use a flag to mark if there were any errors)?
I have found a better answer than my original, so I am going to post it for anyone else who needs this in the future, as there does not seem to be too much out there. I will leave my original hack up too, as it is an interesting experiment to run for anyone curious about the asynchronous queue. I will also note that there is a pretty obvious way to do this with Promise.allSettled(), but it seems that this would put all of the files into memory at once, which is what I am trying to avoid, so I am not going to write that solution up too.
This method uses the Node fs Promises API, specifically the fsPromises readdir method. I'll show the results of running three test files I made in the same directory, each with console.log() messages peppered throughout to help understand program flow.
This first file (without-fs-prom.js) uses the ordinary fs API and demonstrates the problem. As you can see, the asynchronous functions (the doFile() calls) do not complete until after the rest of the program has run. This means anything you wanted to run only after all the files are processed would actually run before processing finished.
/*
** This version loops through the files and calls an asynchronous
** function with the traditional fs API (not the Promises API).
*/
const fs = require('fs');

async function doFile(file){
  console.log(`Doing ${file}`);
  return true;
}

async function loopFiles(){
  console.log('Enter loopFiles(), about to loop through the files.');
  fs.readdirSync(__dirname).forEach(async function(file){
    console.log(`About to do file ${file}`);
    const ret = await doFile(file);
    console.log(`Did file ${file}, returned ${ret}`);
    return ret;
  });
  console.log('Done looping through the files, returning from loopFiles()');
}

console.log('Calling loopFiles()');
loopFiles();
console.log('Returned from loopFiles()');

/* Result of run:
> require('./without-fs-prom')
Calling loopFiles()
Enter loopFiles(), about to loop through the files.
About to do file with-fs-prom1.js
Doing with-fs-prom1.js
About to do file with-fs-prom2.js
Doing with-fs-prom2.js
About to do file without-fs-prom.js
Doing without-fs-prom.js
Done looping through the files, returning from loopFiles()
Returned from loopFiles()
{}
> Did file with-fs-prom1.js, returned true
Did file with-fs-prom2.js, returned true
Did file without-fs-prom.js, returned true
*/
The problem can be partially fixed using the fsPromises API as in with-fs-prom1.js follows:
/*
** This version loops through the files and calls an asynchronous
** function with the fs/promises API and assures all files are processed
** before termination of the loop.
*/
const fs = require('fs');

async function doFile(file){
  console.log(`Doing ${file}`);
  return true;
}

async function loopFiles(){
  console.log('Enter loopFiles(), read the dir');
  const files = await fs.promises.readdir(__dirname);
  console.log('About to loop through the files.');
  for(const file of files){
    console.log(`About to do file ${file}`);
    const ret = await doFile(file);
    console.log(`Did file ${file}, returned ${ret}`);
  }
  console.log('Done looping through the files, returning from loopFiles()');
}

console.log('Calling loopFiles()');
loopFiles();
console.log('Returned from loopFiles()');

/* Result of run:
> require('./with-fs-prom1')
Calling loopFiles()
Enter loopFiles(), read the dir
Returned from loopFiles()
{}
> About to loop through the files.
About to do file with-fs-prom1.js
Doing with-fs-prom1.js
Did file with-fs-prom1.js, returned true
About to do file with-fs-prom2.js
Doing with-fs-prom2.js
Did file with-fs-prom2.js, returned true
About to do file without-fs-prom.js
Doing without-fs-prom.js
Did file without-fs-prom.js, returned true
Done looping through the files, returning from loopFiles()
*/
In this case, code after the file iteration loop within the asynchronous function itself runs after all files have been processed. You can have code in any function context with the following construction (file with-fs-prom2.js):
/*
** This version loops through the files and calls an asynchronous
** function with the fs/promises API and assures all files are processed
** before termination of the loop. It also demonstrates how that can be
** done from another asynchronous call.
*/
const fs = require('fs');

async function doFile(file){
  console.log(`Doing ${file}`);
  return true;
}

async function loopFiles(){
  console.log('Enter loopFiles(), read the dir');
  const files = await fs.promises.readdir(__dirname);
  console.log('About to loop through the files.');
  for(const file of files){
    console.log(`About to do file ${file}`);
    const ret = await doFile(file);
    console.log(`Did file ${file}, returned ${ret}`);
  }
  console.log('Done looping through the files, return from LoopFiles()');
  return;
}

async function run(){
  console.log('Enter run(), calling loopFiles()');
  await loopFiles();
  console.log('Returned from loopFiles(), return from run()');
  return;
}

console.log('Calling run()');
run();
console.log('Returned from run()');

/* Result of run:
> require('./with-fs-prom2')
Calling run()
Enter run(), calling loopFiles()
Enter loopFiles(), read the dir
Returned from run()
{}
> About to loop through the files.
About to do file with-fs-prom1.js
Doing with-fs-prom1.js
Did file with-fs-prom1.js, returned true
About to do file with-fs-prom2.js
Doing with-fs-prom2.js
Did file with-fs-prom2.js, returned true
About to do file without-fs-prom.js
Doing without-fs-prom.js
Did file without-fs-prom.js, returned true
Done looping through the files, return from LoopFiles()
Returned from loopFiles(), return from run()
*/
EDIT
This was my first tentative answer. It is a hack of a solution at best. I am leaving it up because it is an interesting experiment for people who want to peer into the asynchronous queue, and there may be some real use case for this somewhere too. I think my newly posted answer is superior in all reasonable cases, though.
Original Answer
I found a bit of an answer. It is a hack, but further searching on the net and the lack of responses indicate that there may be no real good way to reliably control flow with asynchronous code and callbacks. Basically, the modification is along the lines of:
fs.readdirSync(dataDir).forEach(async function(file){
  jobsOutstanding++;
  //logic to build data object from JSON file
  console.log('Inserting object ' + obj['ID']);
  let result = await connection.insertOne(obj);
  console.log('Object ' + result.insertedId + ' inserted.');
  jobsOutstanding--;
})
Where jobsOutstanding is a top-level variable in the module, with an accessor numJobsOutstanding().
I now wrap the close like this (with some logging to watch how the flow works):
async function closeClient(client){
  console.log("Enter closeClient()");
  if(!client || !client.topology || !client.topology.isConnected()){
    console.log("Already closed.");
  }
  else if(dataObject.numJobsOutstanding() == 0){
    await client.close();
    console.log("Closed.");
  }
  else{
    setTimeout(function(){ closeClient(client); }, 100);
  }
}
I got this one to run correctly, and the logging is interesting to visualize the asynchronous queue. I am not going to accept this answer yet to see if anyone out there knows something better.
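For completeness, here is roughly what the newer answer's awaited for...of approach looks like applied back to the original insert loop. This is only a sketch: connection is assumed to expose insertOne() as in the question, and the JSON.parse step stands in for the elided "logic to build data object from JSON file".
const fs = require('fs');
const path = require('path');

async function insertAll(dataDir, connection){
  const files = await fs.promises.readdir(dataDir);
  let hadErrors = false;
  for(const file of files){
    try{
      // stand-in for the question's "logic to build data object from JSON file"
      const obj = JSON.parse(await fs.promises.readFile(path.join(dataDir, file), 'utf8'));
      console.log('Inserting object ' + obj['ID']);
      const result = await connection.insertOne(obj);
      console.log('Object ' + result.insertedId + ' inserted.');
    } catch(err){
      hadErrors = true;
      console.log('Failed to insert ' + file + ': ' + err.message);
    }
  }
  // nothing after `await insertAll(...)` runs until every insert has settled,
  // so the client can be closed safely right after this returns
  return hadErrors;
}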

How to download multiple links from a .csv file using multithreading in node.js?

I am trying to download links from a .csv file and store the downloaded files in a folder. I have used a multithreading library for this, i.e. mt-files-downloader. The files download fine, but it takes too much time to download about 313 files, each about 400 KB in size at most. When I tried a normal download using Node I could download them in a minute or two, but with this library the download should be faster since it is multithreaded, yet it takes a lot of time. Below is my code; any help would be useful. Thanks!
var rec;
csv
  .fromStream(stream, { headers: ["Recording", , , , , , , ,] })
  .on("data", function (records) {
    rec = records.Recording;
    //console.log(rec);
    download(rec);
  })
  .on("end", function () {
    console.log('Reading complete')
  });

function download(rec) {
  var filename = rec.replace(/\//g, '');
  var filePath = './recordings/' + filename;
  var downloadPath = path.resolve(filePath)
  var fileUrl = 'http:' + rec;
  var downloader = new Downloader();
  var dl = downloader.download(fileUrl, downloadPath);
  dl.start();
  dl.on('error', function(dl) {
    var dlUrl = dl.url;
    console.log('error downloading = > ' + dl.url + ' restarting download....');
    if(!dlUrl.endsWith('.wav') && !dlUrl.endsWith('Recording')){
      console.log('resuming file download => ' + dlUrl);
      dl.resume();
    }
  });
}
You're right, downloading 313 files of 400kB should not take long - and I don't think this has to do with your code - maybe the connection is bad? Have you tried downloading a single file via curl?
Anyway I see two problems in your approach with which I can help:
first - you download all the files at the same time (which may introduce some overhead on the server)
second - your error handling will run in a loop without waiting or checking the actual file, so if there's a 404 you'll flood the server with requests.
Using streams with on('data') events has a major drawback of executing all the chunks more or less synchronously as they are read. This means that your code will execute whatever is in the on('data') handler, never waiting for your downloads to complete. The only limiting factor is now how fast the process can read the csv - and I'd expect millions of lines per second to be normal.
From the server perspective, you're simply requesting 313 files at once, which will result, not wanting to speculate on the actual technical mechanisms of the server, in some of those requests waiting and interfering with each other.
This can be solved by using a streaming framework, like scramjet, event-stream or highland for instance. I'm the author of the first and it's IMHO the easiest in this case, but you can use any of those by changing the code a little to match their API - it's pretty similar in all cases anyway.
Here's some heavily commented code that will run a couple of downloads in parallel:
const {StringStream} = require("scramjet");
const sleep = require("sleep-promise");
const Downloader = require('mt-files-downloader');
const downloader = new Downloader();
// First we create a StringStream class from your csv stream
StringStream.from(csvStream)
  // we parse it as CSV without columns
  .CSVParse({header: false})
  // we set the limit of parallel operations, it will get propagated.
  .setOptions({maxParallel: 16})
  // now we extract the first column as `recording` and create a
  // download request.
  .map(([recording]) => {
    // here's the first part of your code
    const filename = recording.replace(/\//g, '');
    const filePath = './recordings/' + filename;
    const downloadPath = path.resolve(filePath)
    const fileUrl = 'http:' + recording;
    // at this point we return the dl object so we can keep these
    // parts separate.
    // see that the download hasn't been started yet
    return downloader.download(fileUrl, downloadPath);
  })
  // what we get is a stream of not-yet-started download objects
  // so we run this asynchronous function. If this returns a Promise
  // it will wait
  .map(
    async (dl) => new Promise((res, rej) => {
      // let's assume a couple retries we allow
      let retries = 10;
      dl.on('error', async (dl) => {
        try {
          // here we reject if the download fails too many times.
          if (retries-- === 0) throw new Error(`Download of ${dl.url} failed too many times`);
          var dlUrl = dl.url;
          console.log('error downloading = > ' + dl.url + ' restarting download....');
          if(!dlUrl.endsWith('.wav') && !dlUrl.endsWith('Recording')){
            console.log('resuming file download => ' + dlUrl);
            // let's wait half a second before retrying
            await sleep(500);
            dl.resume();
          }
        } catch(e) {
          // here we call the `reject` function - meaning that
          // this file wasn't downloaded despite retries.
          rej(e);
        }
      });
      // here we call the `resolve` function to confirm that the file was
      // downloaded.
      dl.on('end', () => res());
    })
  )
  // we log some message and ignore the result in case of an error
  .catch(e => {
    console.error('An error occurred:', e.message);
    return;
  })
  // Every stream must have some sink to flow to, the `run` method runs
  // every operation above.
  .run();
You can also use the stream to push out some kind of log messages and use pipe(process.stderr) at the end, instead of those console.logs. Please check the scramjet documentation for additional info and the Mozilla docs on async functions.

Read file with NodeJS returns `ENOENT no such file or directory`

I'm trying to read a file, but I always get the error Error: ENOENT: no such file or directory, open 'SB01028A.RET'. The file name is correct, and the file exists because I put it in my Home/sentbox directory.
What did I do wrong here?
Code:
function downloadFile () {
  return new Promise((resolve, reject) => {
    try {
      const testFolder = `${require('os').homedir()}/sentbox`
      fs.readdir(testFolder, (err, files) => {
        if (err) {
          return reject(err)
        }
        files.forEach(fileRetorno => {
          const retorno = fs.readFileSync(fileRetorno, 'UTF8')
          return resolve(retorno)
        })
      })
    } catch (err) {
      return reject(err)
    }
  })
}
You have a number of things wrong in your code, such as:
Using synchronous file reading within a promise when you could be making your file reading asynchronous
Using try/catch in an asynchronous context without wrapping it in an async/await function
Not using the results of fs.readdir() correctly
Attempting to resolve/reject a promise that could have already been resolved or rejected
Using require() in a loop, and in an asynchronous context
fs.readdir() is going to return all the entry names within that directory, both files and directories, as an array. Before you can call fs.readFile() you'll need to check whether the entry is a file or a directory, and you'll need to join() the directory passed to readdir() (in this case, testFolder) with the entry name.
When returning using Promises (async/await wraps promises but still uses them), you can only resolve each promise once. So resolving the same promise multiple times to return different values doesn't work, and the same is true for rejecting multiple times. Instead, you'll need to return your values in an Array or an Object. For the scenario in the above code, an Object is more suitable since you can associate each file's contents with a key for reference and better access later on.
I've used async/await to clean up this code; it gives you a synchronous-looking development style while keeping the functionality of promises. You can read more about async/await on MDN. I've also promisified the needed fs functions using util.promisify().
The code below will
Read all of the entries in testFolder
Filter the entries array to only include files by calling stat() for each entry and checking if it is a file.
stat() will return an fs.Stats object that can tell us if the entry is a file via stat.isFile(). Since Array#filter() only expects a boolean result for each entry iterated over, the result of stat.isFile() can be returned directly
Iterate over the files array with Array#reduce() and call readFile() for each file, returning the contents in an object with each file name as a key
const fs = require('fs')
const {promisify} = require('util')
const os = require('os')
const path = require('path')

const readdir = promisify(fs.readdir)
const readFile = promisify(fs.readFile)
const stat = promisify(fs.stat)

const downloadFile = async () => {
  const testFolder = `${os.homedir()}/sentbox`
  // Get all the entries in the directory async
  const entries = await readdir(testFolder)
  // Stat every entry in parallel so we can tell files from directories
  const stats = await Promise.all(
    entries.map(entry => stat(path.join(testFolder, entry)))
  )
  // We only want the file entries returned
  const files = entries.filter((entry, index) => stats[index].isFile())
  // Read each file and collect the contents in an object keyed by file name
  return files.reduce(async (fileContentsPromise, file) => {
    const fileContents = await fileContentsPromise
    const filepath = path.join(testFolder, file)
    fileContents[file] = await readFile(filepath, 'utf8')
    return fileContents
  }, Promise.resolve({}))
}
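A quick usage sketch (assuming the sentbox files are UTF-8 text); the resolved value is an object keyed by file name:
downloadFile()
  .then(contents => {
    // e.g. { 'SB01028A.RET': '...file text...' }
    console.log(Object.keys(contents));
  })
  .catch(err => console.error(err));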

Stop function from being invoked multiple times

I'm in the process of building a file upload component that allows you to pause/resume file uploads.
The standard way to achieve this seems to be to break the file into chunks on the client machine, then send the chunks along with book-keeping information up to the server which can store the chunks into a staging directory, then merge them together when it has received all of the chunks. So, this is what I am doing.
I am using node/express and I'm able to get the files fine, but I'm running into an issue because my merge_chunks function is being invoked multiple times.
Here's my call stack:
router.post('/api/videos',
  upload.single('file'),
  validate_params,
  rename_uploaded_chunk,
  check_completion_status,
  merge_chunks,
  record_upload_date,
  videos.update,
  send_completion_notice
);
The check_completion_status function is implemented as follows:
/* Recursively check to see if we have every chunk of a file */
var check_completion_status = function (req, res, next) {
  var current_chunk = 1;
  var see_if_chunks_exist = function () {
    fs.exists(get_chunk_file_name(current_chunk, req.file_id), function (exists) {
      if (current_chunk > req.total_chunks) {
        next();
      } else if (exists) {
        current_chunk++;
        see_if_chunks_exist();
      } else {
        res.sendStatus(202);
      }
    });
  };
  see_if_chunks_exist();
};
The file names in the staging directory have the chunk numbers embedded in them, so the idea is to see if we have a file for every chunk number. The function should only next() one time for a given (complete) file.
However, my merge_chunks function is being invoked multiple times (usually between 1 and 4). Logging does reveal that it's only invoked after I've received all of the chunks.
With this in mind, my assumption here is that it's the async nature of the fs.exists function that's causing the issue.
Even though the nth invocation of check_completion_status may occur before I have all of the chunks, by the time we get to the nth call to fs.exists(), x more chunks may have arrived and been processed concurrently, so the function can keep going and in some cases reach the end and call next(). However, those chunks that arrived concurrently also correspond to invocations of check_completion_status, which are also going to call next(), because we obviously have all of the files at that point.
This is causing issues because I didn't account for this when I wrote merge_chunks.
For completeness, here's the merge_chunks function:
var merge_chunks = (function () {
  var pipe_chunks = function (args) {
    args.chunk_number = args.chunk_number || 1;
    if (args.chunk_number > args.total_chunks) {
      args.write_stream.end();
      args.next();
    } else {
      var file_name = get_chunk_file_name(args.chunk_number, args.file_id)
      var read_stream = fs.createReadStream(file_name);
      read_stream.pipe(args.write_stream, {end: false});
      read_stream.on('end', function () {
        // once we're done with the chunk we can delete it and move on to the next one.
        fs.unlink(file_name, function () {});
        args.chunk_number += 1;
        pipe_chunks(args);
      });
    }
  };

  return function (req, res, next) {
    var out = path.resolve('videos', req.video_id);
    var write_stream = fs.createWriteStream(out);
    pipe_chunks({
      write_stream: write_stream,
      file_id: req.file_id,
      total_chunks: req.total_chunks,
      next: next
    });
  };
}());
Currently, I'm receiving an error because the second invocation of the function is trying to read the chunks that have already been deleted by the first invocation.
What is the typical pattern for handling this type of situation? I'd like to avoid a stateful architecture if possible. Is it possible to cancel pending handlers right before calling next() in check_completion_status?
If you just want to make it work ASAP, I would use a lock (much like a db lock) to lock the resource so that only one of the requests processes the chunks. Simply create a unique id on the client and send it along with the chunks. Then just store that unique id in some sort of data structure, and look that id up prior to processing. The example below is far from optimal (in fact this map will keep growing, which is bad), but it should demonstrate the concept:
// Create a map (an array would work too) and keep track of the video ids that were processed.
// This map will persist through each request.
var processedVideos = {};

var check_completion_status = function (req, res, next) {
  var current_chunk = 1;
  var see_if_chunks_exist = function () {
    fs.exists(get_chunk_file_name(current_chunk, req.file_id), function (exists) {
      if (processedVideos[req.query.uniqueVideoId]){
        res.sendStatus(202);
      } else if (current_chunk > req.total_chunks) {
        processedVideos[req.query.uniqueVideoId] = true;
        next();
      } else if (exists) {
        current_chunk++;
        see_if_chunks_exist();
      } else {
        res.sendStatus(202);
      }
    });
  };
  see_if_chunks_exist();
};
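As noted in the comment above, this lookup map keeps growing. A minimal way to bound it (just a sketch; the one-hour TTL and ten-minute sweep interval are arbitrary values of mine) is to store a timestamp instead of true and prune stale entries periodically:
// in check_completion_status, record when the video finished instead of `true`:
//   processedVideos[req.query.uniqueVideoId] = Date.now();

// then sweep old entries so the map stays small
var ONE_HOUR = 60 * 60 * 1000;
setInterval(function () {
  var cutoff = Date.now() - ONE_HOUR;
  Object.keys(processedVideos).forEach(function (id) {
    if (processedVideos[id] < cutoff) {
      delete processedVideos[id];
    }
  });
}, 10 * 60 * 1000);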

nodejs express fs iterating files into array or object failing

So I'm trying to use the Node fs module in my express app to iterate over a directory, store each filename in an array which I can pass to my express view and iterate through, but I'm struggling to do so. When I do a console.log within the files.forEach function loop, it prints the filename just fine, but as soon as I try to do anything else, such as:
var myfiles = [];
var fs = require('fs');

fs.readdir('./myfiles/', function (err, files) {
  if (err) throw err;
  files.forEach(function (file) {
    myfiles.push(file);
  });
});

console.log(myfiles);
it fails and just logs an empty object. So I'm not sure exactly what is going on; I think it has to do with callback functions, but if someone could walk me through what I'm doing wrong, why it's not working (and how to make it work), it would be much appreciated.
The myfiles array is empty because the callback hasn't been called yet by the time you call console.log().
You'll need to do something like:
var fs = require('fs');

fs.readdir('./myfiles/', function (err, files) {
  if (err) throw err;
  files.forEach(function (file) {
    // do something with each file HERE!
  });
});
// trying to do something with files here won't work because
// the callback hasn't fired yet.
Remember, everything in node happens concurrently, in the sense that, unless you're doing your processing inside your callbacks, you cannot guarantee that asynchronous functions have completed yet.
One way around this problem for you would be to use an EventEmitter:
var fs = require('fs'),
    EventEmitter = require('events').EventEmitter,
    filesEE = new EventEmitter(),
    myfiles = [];

// this event will be called when all files have been added to myfiles
filesEE.on('files_ready', function () {
  console.dir(myfiles);
});

// read all files from current directory
fs.readdir('.', function (err, files) {
  if (err) throw err;
  files.forEach(function (file) {
    myfiles.push(file);
  });
  filesEE.emit('files_ready'); // trigger files_ready event
});
As several have mentioned, you are using an async method, so you have a nondeterministic execution path.
However, there is an easy way around this. Simply use the Sync version of the method:
var myfiles = [];
var fs = require('fs');

var arrayOfFiles = fs.readdirSync('./myfiles/');
// Yes, the following is not super-smart, but you might want to process the files. This is how:
arrayOfFiles.forEach(function (file) {
  myfiles.push(file);
});

console.log(myfiles);
That should work as you want. However, using sync statements is not good, so you should not do it unless it is vitally important for it to be sync.
Read more here: fs.readdirSync
fs.readdir is asynchronous (as with many operations in node.js). This means that the console.log line is going to run before readdir has a chance to call the function passed to it.
You need to either:
Put the console.log line within the callback function given to readdir, i.e:
fs.readdir('./myfiles/', function (err, files) {
  if (err) throw err;
  files.forEach(function (file) {
    myfiles.push(file);
  });
  console.log(myfiles);
});
Or simply perform some action with each file inside the forEach.
I think it has to do with callback functions,
Exactly.
fs.readdir makes an asynchronous request to the file system for that information, and calls the callback at some later time with the results.
So function (err, files) { ... } doesn't run immediately, but console.log(myfiles) does.
At some later point in time, myfiles will contain the desired information.
You should note BTW that files is already an Array, so there is really no point in manually appending each element to some other blank array. If the idea is to put together the results from several calls, then use .concat; if you just want to get the data once, then you can just assign myfiles = files directly.
Overall, you really ought to read up on "Continuation-passing style".
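To illustrate the point about assigning directly, a minimal sketch (assuming the listing is only needed once):
var fs = require('fs');
var myfiles = [];

fs.readdir('./myfiles/', function (err, files) {
  if (err) throw err;
  // `files` is already an array of names, so just take it as-is
  myfiles = files;
  console.log(myfiles); // safe to use the list here, inside the callback
});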
I faced the same problem, and based on the answers given in this post I've solved it with Promises, which seem to be a perfect fit for this situation:
router.get('/', (req, res) => {
  var viewBag = {}; // It's just my little habit from .NET MVC ;)

  var readFiles = new Promise((resolve, reject) => {
    fs.readdir('./myfiles/', (err, files) => {
      if (err) {
        reject(err);
      } else {
        resolve(files);
      }
    });
  });

  // showcase just in case you need to implement more async operations before the route responds
  var anotherPromise = new Promise((resolve, reject) => {
    doAsyncStuff((err, anotherResult) => {
      if (err) {
        reject(err);
      } else {
        resolve(anotherResult);
      }
    });
  });

  Promise.all([readFiles, anotherPromise]).then((values) => {
    viewBag.files = values[0];
    viewBag.otherStuff = values[1];
    console.log(viewBag.files); // logs e.g. [ 'file.txt' ]
    res.render('your_view', viewBag);
  }).catch((errors) => {
    res.render('your_view', {errors: errors}); // you can use the 'errors' property to render errors in the view or implement a different error handling schema
  });
});
Note: you don't have to push the found files into a new array because you already get an array from fs.readdir()'s callback. According to the node docs:
The callback gets two arguments (err, files) where files is an array
of the names of the files in the directory excluding '.' and '..'.
I believe this is a very elegant and handy solution, and most of all it doesn't require you to bring in and manage new modules in your script.
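A small aside, not part of the original answer: on current Node versions fs.promises.readdir already returns a promise, so the manual new Promise wrapper can be dropped. A rough sketch of the same route under that assumption, reusing the router and view names from above:
const fs = require('fs');

router.get('/', async (req, res) => {
  const viewBag = {};
  try {
    viewBag.files = await fs.promises.readdir('./myfiles/');
    console.log(viewBag.files); // logs e.g. [ 'file.txt' ]
    res.render('your_view', viewBag);
  } catch (errors) {
    res.render('your_view', { errors: errors });
  }
});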
