In my gulp task, I just need to copy a file obtained from a previous task to a dest path that I have to determine at runtime using an asynchronous function (because I use fs functions).
How do I do this? How do I feed the result of an asynchronous function to a gulp.dest directive?
Sample code:
function getDest() {
fs.readdir('my/path', function(error, contents) {
// pick the right one
return the_right_one;
})
}
gulp.task('copy-stuff', ['previous-task'], function() {
return gulp.src('some/file').pipe(gulp.dest(getDest()));
})
You can do this as with any other async node/javascript code, by using callbacks.
For instance:
function getDest(cb) {
fs.readdir('my/path', function(error, contents) {
// pick the right one
cb(the_right_one);
})
}
gulp.task('copy-stuff', ['previous-task'], function() {
getDest(function(properDest) {
gulp.src('some/file')
.pipe(gulp.dest(properDest));
})
})
Note, in this example we don't return the stream or a promise, so you'll have some troubles using dependencies (this would finish before the a dependant would start). To solve this we can either return a promise that is resolved on end or simply use the callback given to the task.
Something like this should work (not tested):
gulp.task('copy-stuff', ['previous-task'], function(callback) {
getDest(function(properDest) {
gulp.src('some/file')
.pipe(gulp.dest(properDest))
.on('end', callback)
})
})
You can see more of the Orchestrator (the task runner gulp is using) API in the documentation or see the gulp documentation with async examples.
In 3.8 you can pass a function to gulp.dest that returns the destination for the file. This function has to be synchronous, so you could do
gulp.src('some/file')
.pipe(gulp.dest(function(file){
return 'build/' + fs.readFileSync('file_containing_target_dir');
}));
Related
fs.rename("${nombreHtml}.html",(err)=>{
console.log(err)
})
fs.appendFileSync("${nombreHtml}.html", htmlSeparado, () => { })
I try to run these two operations but it doesn't want to work
fs.rename is an asyncronous task.
By the time fs.rename finished its execution, fs.appendFileSync has already tried appending data to an html file which has not existed by the time.
fs.rename ... awaiting callback
fs.append ... failing
fs.rename finished, file now has a new name.
You probably want to either place fs.appendFileSync inside the fs.rename callback, or switch to promises. (example at the bottom)
example that should work:
fs.rename("${nombreHtml}.html",(err)=>{
if (err) console.log(err)
else {
fs.appendFileSync("${nombreHtml}.html", htmlSeparado, () => { })
}
})
By the way, because syncronous functions block the event loop and hence freeze your server for the time handling that function, making it unavailable for any other request - using filesystem's syncronous functions is rather less recommended for the general usecase, as the read/write/append operations are rather long. it is recommended to use the async versions of them, which return a callback or a promise, as you have done using fs.rename.
fs has a built-in sub-module with the same functions as promises which can be accessed by require('fs').promises.
this way you could just
const { rename, appendFile } = require('fs').promises;
try {
await rename("${nombreHtml}.html");
await appendFile("${nombreHtml}.html", htmlSeparado);
} catch (error) {
console.log(error);
}
I assume you want a template string so that the variables insert themselves into the string:
fs.rename(`${nombreHtml}.html`,(err)=>{
console.log(err)
})
fs.appendFileSync(`${nombreHtml}.html`, htmlSeparado, () => { })
I am working with zombie.js to scrape one site, I must to use the callback style to connect to each url. The point is that I have got an urls array and I need to process each urls using an async function. This is my first approach:
Array urls = {http..., http...};
function process_url(index)
{
if(index == urls.length)
return;
async_function(url,
function() {
...
//parse the url
...
// Process the next url
process_url(index++);
}
);
}
process_url(0)
Without use someone third party nodejs library to use the asyn funtion as sync function or to wait for the function (wait.for, synchornized, mocha), this is the way that I though to resolve this problem, I don't know what would happen if the array is too big. Is the function released from the memory when the next function is called? or all the functions are in memory until the end?
Any ideas?
Your scheme will work. I call it "manually sequencing async operations".
A general purpose version of what you're doing would look like this:
function processItem(data, callback) {
// do your async function here
// for example, let's suppose it was an http request using the request module
request(data, callback);
}
function processArray(array, fn) {
var index = 0;
function next() {
if (index < array.length) {
fn(array[index++], function(err, result) {
// process error here
if (err) return;
// process result here
next();
});
}
}
next();
}
processArray(arr, processItem);
As to your specific questions:
I don't know what would happen if the array is too big. Is the
function released from the memory when the next function is called? or
all the functions are in memory until the end?
Memory in Javascript is released when it is not longer referenced by any running code and when the garbage collector gets time to run. Since you are running a series of asynchronous operations here, it is likely that the garbage collector gets a chance to run regularly while waiting for the http response from the async operation so memory could get cleaned up then. Functions are just another type of object in Javascript and they get garbage collected just like anything else. When they are no longer reference by running code, they are eligible for garbage collection.
In your specific code, because you are re-calling process_url() only in an async callback, there is no stack build-up (as in normal recursion). The prior instance of process_url() has already completed BEFORE the async callback is called and BEFORE you call the next iteration of process_url().
In general, management and coordination of multiple async operations is much, much easier using promises which are built into the current versions of node.js and are part of the ES6 ECMAScript standard. No external libraries are required to use promises in current versions of node.js.
For a list of a number of different techniques for sequencing your asynchronous operations on your array, both using promises and not using promises, see:
How to synchronize a sequence of promises?.
The first step in using promises is the "promisify" your async function so that it returns a promise instead of takes a callback.
function async_function_promise(url) {
return new Promise(function(resolve, reject) {
async_function(url, function(err, result) {
if (err) {
reject(err);
} else {
resolve(result);
}
});
});
}
Now, you have a version of your function that returns promises.
If you want your async operations to proceed one at a time so the next one doesn't start until the previous one has completed, then a usual design pattern for that is to use .reduce() like this:
function process_urls(array) {
return array.reduce(function(p, url) {
return p.then(function(priorResult) {
return async_function_promise(url);
});
}, Promise.resolve());
}
Then, you can call it like this:
var myArray = ["url1", "url2", ...];
process_urls(myArray).then(function(finalResult) {
// all of them are done here
}, function(err) {
// error here
});
There are also Promise libraries that have some helpful features that make this type of coding simpler. I, myself, use the Bluebird promise library. Here's how your code would look using Bluebird:
var Promise = require('bluebird');
var async_function_promise = Promise.promisify(async_function);
function process_urls(array) {
return Promise.map(array, async_function_promise, {concurrency: 1});
}
process_urls(myArray).then(function(allResults) {
// all of them are done here and allResults is an array of the results
}, function(err) {
// error here
});
Note, you can change the concurrency value to whatever you want here. For example, you would probably get faster end-to-end performance if you increased it to something between 2 and 5 (depends upon the server implementation on how this is best optimized).
Folks,
I have the following function, and am wondering whats the correct way to call the callback() only when the database operation completes on all items:
function mapSomething (callback) {
_.each(someArray, function (item) {
dao.dosomething(item.foo, function (err, account) {
item.email = account.email;
});
});
callback();
},
What I need is to iterate over someArray and do a database call for each element. After replacing all items in the array, I need to only then call the callback. Ofcourse the callback is in the incorrect place right now
Thanks!
The way you currently have it, callback is executed before any of the (async) tasks finish.
The async module has an each() that allows for a final callback:
var async = require('async');
// ...
function mapSomething (callback) {
async.each(someArray, function(item, cb) {
dao.dosomething(item.foo, function(err, account) {
if (err)
return cb(err);
item.email = account.email;
cb();
});
}, callback);
}
This will not wait for all your database calls to be done before calling callback(). It will launch all the database calls at once in parallel (I'm assuming that's what dao.dosomething() is). And, then immediately call callback() before any of the database calls have finished.
You have several choices to solve the problem.
You can use promises (by promisifying the database call) and then use Promise.all() to wait for all the database calls to be done.
You can use the async library to manage the coordination for you.
You can keep track of when each one is done yourself and when the last one is done, call your callback.
I would recommend options 1. or 2. Personally, I prefer to use promises and since you're interfacing with a database, this is probably not the only place you're making database calls, so I'd promisify the interface (bluebird will do that for you in one function call) and then use promises.
Here's what a promise solution could look like:
var Promise = require('bluebird');
// make promise version of your DB function
// ideally, you'd promisify the whole DB API with .promisifyAll()
var dosomething = Promise.promisify(dao.dosomething, dao);
function mapSomething (callback, errCallback) {
Promise.all(_.map(someArray, function(item) {
return dosomething(item.foo).then(function (account) {
item.email = account.email;
});
}).then(callback, errCallback);
}
This assumes you want to run all the DB calls in parallel and then call the callback when they are all done.
FYI, here's a link to how Bluebird promisify's existing APIs. I use this mechanism for all async file I/O in node and it saves a ton of coding time and makes error handling much more sane. Async callbacks are a nightmare for proper error handling, especially if exceptions can be thrown from async callbacks.
P.S. You may actually want your mapSomething() function to just return a promise itself so the caller is then responsible for specifying their own .then() handler and it allows the caller to use the returned promise for their own synchronization with other things (e.g. it's just more flexible that way).
function mapSomething() {
return Promise.all(_.map(someArray, function(item) {
return dosomething(item.foo).then(function (account) {
item.email = account.email;
});
})
}
mapSomething.then(mapSucessHandler, mapErrorHandler);
I haven't tried Bluebird's .map() myself, but once you've promisified the database call, I think it would simplify it a bit more like this:
function mapSomething() {
return Promise.map(someArray, function(item) {
return dosomething(item.foo).then(function (account) {
item.email = account.email;
});
})
}
mapSomething.then(mapSucessHandler, mapErrorHandler);
I have the following example code - the first part can result in an async call or not - either way it should continue. I cannot put the rest of the code within the async callback since it needs to run when condition is false. So how to do this?
if(condition) {
someAsyncRequest(function(error, result)) {
//do something then continue
}
}
//do this next whether condition is true or not
I assume that putting the code to come after in a function might be the way to go and call that function within the async call above or on an else call if condition is false - but is there an alternative way that doesn't require me breaking it up in functions?
A library I've used in Node quite often is Async (https://github.com/caolan/async). Last I checked this also has support for the browser so you should be able to npm / concat / minify this in your distribution. If you're using this on server-side only you should consider https://github.com/continuationlabs/insync, which is a slightly improved version of Async, with some of the browser support removed.
One of the common patterns I use when using conditional async calls is populate an array with the functions I want to use in order and pass that to async.waterfall.
I've included an example below.
var tasks = [];
if (conditionOne) {
tasks.push(functionOne);
}
if (conditionTwo) {
tasks.push(functionTwo);
}
if (conditionThree) {
tasks.push(functionThree);
}
async.waterfall(tasks, function (err, result) {
// do something with the result.
// if any functions in the task throws an error, this function is
// immediately called with err == <that error>
});
var functionOne = function(callback) {
// do something
// callback(null, some_result);
};
var functionTwo = function(previousResult, callback) {
// do something with previous result if needed
// callback(null, previousResult, some_result);
};
var functionThree = function(previousResult, callback) {
// do something with previous result if needed
// callback(null, some_result);
};
Of course you could use promises instead. In either case I like to avoid nesting callbacks by using async or promises.
Some of the things you can avoid by NOT using nested callbacks are variable collision, hoisting bugs, "marching" to the right > > > >, hard to read code, etc.
Just declare some other function to be run whenever you need it :
var otherFunc = function() {
//do this next whether condition is true or not
}
if(condition) {
someAsyncRequest(function(error, result)) {
//do something then continue
otherFunc();
}
} else {
otherFunc();
}
Just an another way to do it, this is how I abstracted the pattern. There might be some libraries (promises?) that handle the same thing.
function conditional(condition, conditional_fun, callback) {
if(condition)
return conditional_fun(callback);
return callback();
}
And then at the code you can write
conditional(something === undefined,
function(callback) {
fetch_that_something_async(function() {
callback();
});
},
function() {
/// ... This is where your code would continue
});
I would recommend using clojurescript which has an awesome core-async library which makes life super easy when dealing with async calls.
In your case, you would write something like this:
(go
(when condition
(<! (someAsyncRequest)))
(otherCodeToHappenWhetherConditionIsTrueOrNot))
Note the go macro which will cause the body to run asynchronously, and the <! function which will block until the async function will return. Due to the <! function being inside the when condition, it will only block if the condition is true.
So Im trying to use the nodejs express FS module to iterate a directory in my app, store each filename in an array, which I can pass to my express view and iterate through the list, but Im struggling to do so. When I do a console.log within the files.forEach function loop, its printing the filename just fine, but as soon as I try to do anything such as:
var myfiles = [];
var fs = require('fs');
fs.readdir('./myfiles/', function (err, files) { if (err) throw err;
files.forEach( function (file) {
myfiles.push(file);
});
});
console.log(myfiles);
it fails, just logs an empty object. So Im not sure exactly what is going on, I think it has to do with callback functions, but if someone could walk me through what Im doing wrong, and why its not working, (and how to make it work), it would be much appreciated.
The myfiles array is empty because the callback hasn't been called before you call console.log().
You'll need to do something like:
var fs = require('fs');
fs.readdir('./myfiles/',function(err,files){
if(err) throw err;
files.forEach(function(file){
// do something with each file HERE!
});
});
// because trying to do something with files here won't work because
// the callback hasn't fired yet.
Remember, everything in node happens at the same time, in the sense that, unless you're doing your processing inside your callbacks, you cannot guarantee asynchronous functions have completed yet.
One way around this problem for you would be to use an EventEmitter:
var fs=require('fs'),
EventEmitter=require('events').EventEmitter,
filesEE=new EventEmitter(),
myfiles=[];
// this event will be called when all files have been added to myfiles
filesEE.on('files_ready',function(){
console.dir(myfiles);
});
// read all files from current directory
fs.readdir('.',function(err,files){
if(err) throw err;
files.forEach(function(file){
myfiles.push(file);
});
filesEE.emit('files_ready'); // trigger files_ready event
});
As several have mentioned, you are using an async method, so you have a nondeterministic execution path.
However, there is an easy way around this. Simply use the Sync version of the method:
var myfiles = [];
var fs = require('fs');
var arrayOfFiles = fs.readdirSync('./myfiles/');
//Yes, the following is not super-smart, but you might want to process the files. This is how:
arrayOfFiles.forEach( function (file) {
myfiles.push(file);
});
console.log(myfiles);
That should work as you want. However, using sync statements is not good, so you should not do it unless it is vitally important for it to be sync.
Read more here: fs.readdirSync
fs.readdir is asynchronous (as with many operations in node.js). This means that the console.log line is going to run before readdir has a chance to call the function passed to it.
You need to either:
Put the console.log line within the callback function given to readdir, i.e:
fs.readdir('./myfiles/', function (err, files) { if (err) throw err;
files.forEach( function (file) {
myfiles.push(file);
});
console.log(myfiles);
});
Or simply perform some action with each file inside the forEach.
I think it has to do with callback functions,
Exactly.
fs.readdir makes an asynchronous request to the file system for that information, and calls the callback at some later time with the results.
So function (err, files) { ... } doesn't run immediately, but console.log(myfiles) does.
At some later point in time, myfiles will contain the desired information.
You should note BTW that files is already an Array, so there is really no point in manually appending each element to some other blank array. If the idea is to put together the results from several calls, then use .concat; if you just want to get the data once, then you can just assign myfiles = files directly.
Overall, you really ought to read up on "Continuation-passing style".
I faced the same problem, and basing on answers given in this post I've solved it with Promises, that seem to be of perfect use in this situation:
router.get('/', (req, res) => {
var viewBag = {}; // It's just my little habit from .NET MVC ;)
var readFiles = new Promise((resolve, reject) => {
fs.readdir('./myfiles/',(err,files) => {
if(err) {
reject(err);
} else {
resolve(files);
}
});
});
// showcase just in case you will need to implement more async operations before route will response
var anotherPromise = new Promise((resolve, reject) => {
doAsyncStuff((err, anotherResult) => {
if(err) {
reject(err);
} else {
resolve(anotherResult);
}
});
});
Promise.all([readFiles, anotherPromise]).then((values) => {
viewBag.files = values[0];
viewBag.otherStuff = values[1];
console.log(viewBag.files); // logs e.g. [ 'file.txt' ]
res.render('your_view', viewBag);
}).catch((errors) => {
res.render('your_view',{errors:errors}); // you can use 'errors' property to render errors in view or implement different error handling schema
});
});
Note: you don't have to push found files into new array because you already get an array from fs.readdir()'c callback. According to node docs:
The callback gets two arguments (err, files) where files is an array
of the names of the files in the directory excluding '.' and '..'.
I belive this is very elegant and handy solution, and most of all - it doesn't require you to bring in and handle new modules to your script.