How can I stop async.queue after the first fail? - node.js

I want to stop my async.queue from executing after the first task error occurs. I need to perform several similar actions in parallel with a concurrency restriction, but stop all of them after the first error. How can I do that, or what should I use instead?

Suppose you fire 5 parallel functions, each taking 5 seconds, and in the 3rd second function 1 fails. How can you stop the execution of the rest?
That depends on what those functions do; you may have to poll using setInterval. However, if your question is how to stop further tasks from being processed by the queue, you can do this:
q.push(tasks, function (err) {
    if (err && !called) {
        // Prevents async from processing more tasks from the queue; note that
        // whatever was already pushed to the queue will be processed anyway.
        q.kill();
        // Prevents the final callback from being called twice
        called = true;
        // This is the main process callback, the final callback
        main(err, results);
    }
});
Here is a full working example:
var async = require('async');

/*
This function is the actual work you are trying to do.
Note that if, for example, you are running child processes
here, calling q.kill will not stop the execution of those
processes; you actually need to keep track of the spawned
processes and kill them yourself when you call q.kill in
the push callback. In case of just a long-running function,
you may poll using setInterval.
*/
function worker(task, wcb) {
    setTimeout(function workerTimeout() {
        if (task === 11 || task === 12 || task === 3) {
            return wcb('error in processing ' + task);
        }
        wcb(null, task + ' got processed');
    }, Math.floor(Math.random() * 100));
}

/*
This function pushes the tasks to async.queue,
which then hands them to your worker function.
*/
function process(tasks, concurrency, pcb) {
    var results = [], called = false;
    var q = async.queue(function qWorker(task, qcb) {
        worker(task, function wcb(err, data) {
            if (err) {
                return qcb(err); // This is how we propagate the error to qcb
            }
            results.push(data);
            qcb();
        });
    }, concurrency);
    /*
    The trick is in this callback. Note that checking q.tasks.length
    does not work. q.kill, introduced in async 0.7.0, just sets
    the drain function to null and the tasks length to zero.
    */
    q.push(tasks, function qcb(err) {
        if (err && !called) {
            q.kill();
            called = true;
            pcb(err, results);
        }
    });
    q.drain = function drainCb() {
        pcb(null, results);
    };
}

var tasks = [];
var concurrency = 10;
for (var i = 1; i <= 20; i += 1) {
    tasks.push(i);
}
process(tasks, concurrency, function pcb(err, results) {
    console.log(results);
    if (err) {
        return console.log(err);
    }
    console.log('done');
});

The async documentation on the GitHub page is either outdated or incorrect: inspecting the queue object returned by the async.queue() method, I do not see a kill() method.
Nevertheless, there is a way around it. The queue object has a tasks property, which is an array; simply assigning a reference to an empty array did the trick for me.
queue.push(someTasks, function (err) {
    if (err) queue.tasks = [];
});
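For comparison, here is a dependency-free sketch of the same stop-on-first-error pattern using plain Promises with a concurrency limit. It is not taken from either answer above; the runLimited helper and its signature are hypothetical. (In current versions of async, q.kill() is part of the documented queue API, so the workaround above is mainly of historical interest.)
// Minimal sketch, assuming only built-in Promises. runLimited is a
// hypothetical helper: it runs `tasks` (functions returning promises) with
// at most `limit` in flight, and stops launching new tasks after the first
// rejection. Tasks already in flight still run to completion, just like
// with q.kill().
function runLimited(tasks, limit) {
    return new Promise(function (resolve, reject) {
        var next = 0, active = 0, failed = false, results = [];
        if (tasks.length === 0) return resolve(results);
        function launch() {
            while (!failed && active < limit && next < tasks.length) {
                (function (i) {
                    active += 1;
                    tasks[i]().then(function (res) {
                        results[i] = res;
                        active -= 1;
                        if (next >= tasks.length && active === 0) resolve(results);
                        else launch();
                    }, function (err) {
                        failed = true;
                        reject(err); // first error wins; later calls are ignored
                    });
                })(next);
                next += 1;
            }
        }
        launch();
    });
}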

Related

Avoid callback multi-invocation when forEach is used

I have a function that processes an array of data (first parameter) and, once the processing is finished, invokes a callback function (second parameter) exactly once. I'm using forEach to process the data item by item; the processing of each item consists of some checks and storing the item in a database. The function storeInDB() does the storing work and invokes a callback (its second parameter) when the item has been stored.
A first approach to the code is the following:
function doWork(data, callback) {
    data.forEach(function (item) {
        // Do some check on item
        ...
        storeInDB(item, function(err) {
            // check error etc.
            ...
            callback();
        });
    });
}
However, this is wrong, as the callback function will be invoked several times (as many times as there are elements in the data array).
I'd like to know how to refactor my code in order to achieve the desired behaviour, i.e. only one invocation of callback once the storing work is finished. I guess that async could help with this task, but I haven't found the right pattern yet to combine async + forEach.
Any help is appreciated!
You can use a library such as async to do this, although I would recommend using promises if possible. For your immediate problem you can use a counter to determine how many storage calls have completed, and call the callback when all of them have.
let counter = 0;
data.forEach(function (item) {
    // Do some check on item
    ...
    storeInDB(item, function(err) {
        // check error etc.
        counter++;
        if (counter === data.length) {
            callback();
        }
    });
});
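One caveat with the bare counter: if storeInDB reports an error, you probably want to surface it exactly once and stop counting. A minimal sketch of that guard; the done flag is an addition, not part of the answer above:
let counter = 0;
let done = false; // guards against invoking the callback more than once
data.forEach(function (item) {
    storeInDB(item, function (err) {
        if (done) return;         // an earlier item already finished the work
        if (err) {
            done = true;
            return callback(err); // report the first error exactly once
        }
        counter++;
        if (counter === data.length) {
            done = true;
            callback(null);       // every item was stored successfully
        }
    });
});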
You can also use the three parameters passed to the forEach callback (the value, its index, and the array itself):
function doWork(data, callback) {
    data.forEach(function (value, idx, arr) {
        // Do some check on item
        ...
        storeInDB(arr[idx], function(err) {
            // check error etc.
            ...
            // fires when the store for the last index completes
            if ((idx + 1) === arr.length) {
                callback();
            }
        });
    });
}
If the storeInDB function returns a promise, you can push all the resulting promises to an array and use Promise.all. After all tasks run successfully, it will invoke the callback function.
Hope this helps.
function doWork(data, callback) {
    let arr = [];
    data.map(function(item) {
        // Do some check on item
        ...
        arr.push(storeInDB(item));
    });
    Promise.all(arr)
        .then(function(res) {
            callback();
        });
}
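If you are on a Node version with async/await support, the same idea reads more directly. A sketch, assuming storeInDB returns a promise as above:
async function doWork(data) {
    // start all stores in parallel and wait for every one of them
    await Promise.all(data.map((item) => storeInDB(item)));
    // reaching this line means every item was stored; any rejection
    // propagates to the caller as a thrown error
}
Callers then write doWork(data).then(...).catch(...) instead of passing a callback.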

How to run asynchronous tasks synchronously?

I'm developing an app with the following node.js stack: Express/Socket.IO + React. In React I have DataTables, where you can search, and with every keystroke the data gets dynamically updated! :)
I use Socket.IO for data-fetching, so on every keystroke the client socket emits some parameters and the server then calls the callback to return data. This works like a charm, but it is not guaranteed that the returned data comes back in the same order the client sent it.
To simulate this: when I type 'a', the server responds with that same 'a', and so on for every character.
I found the async module for node.js and tried to use the queue to return tasks in the same order it received them. For simplicity I delayed the second incoming task with setTimeout to simulate a slow database query:
Declaration:
const async = require('async');

var queue = async.queue(function(task, callback) {
    if (task.count == 1) {
        setTimeout(function() {
            callback();
        }, 3000);
    } else {
        callback();
    }
}, 10);
Usage:
socket.on('result', function(data, fn) {
    var filter = data.filter;
    if (filter.length === 1) { // TEST SYNCHRONOUSLY
        queue.push({name: filter, count: 1}, function(err) {
            fn(filter);
            // console.log('finished processing slow');
        });
    } else {
        // add some items to the queue
        queue.push({name: filter, count: filter.length}, function(err) {
            fn(data.filter);
            // console.log('finished processing fast');
        });
    }
});
But the order in which I receive it in the client console when I search for abc is as follows:
ab -> abc -> a(after 3 sec)
I want it to return it like this: a(after 3sec) -> ab -> abc
My theory is that the queue starts the setTimeout and then moves on, and the setTimeout eventually fires later on the event loop. This results in later search filters being returned earlier than the slow-performing one.
How can I solve this problem?
First a few comments, which might help clear up your understanding of async calls:
Using "timeout" to try to align async calls is a bad idea; that is not what async calls are about. You never know how long an async call will take, so you can never set an appropriate timeout.
I believe you are misunderstanding the usage of queue from the async library. The documentation for the queue can be found here.
Copy-pasting the documentation here, in case things change or the link goes down:
Creates a queue object with the specified concurrency. Tasks added to the queue are processed in parallel (up to the concurrency limit). If all workers are in progress, the task is queued until one becomes available. Once a worker completes a task, that task's callback is called.
The above means that the queue can simply be used to prioritize the async tasks a given worker can perform. The different async tasks can still finish at different times.
Potential solutions
There are a few solutions to your problem, depending on your requirements:
1. Only send one async call at a time and wait for it to finish before sending the next one.
2. Store the results and only display them to the user when all calls have finished.
3. Disregard all calls except for the latest async call.
In your case I would pick solution 3, as you are searching for something. Why would you care about the results for "a" if the user is already searching for "abc" before the response for "a" arrives?
This can be done by giving each request a counter (or timestamp) and only using the response whose counter matches the latest request.
SOLUTION:
Server:
exports = module.exports = function(io) {
    io.sockets.on('connection', function (socket) {
        socket.on('result', function(data, fn) {
            var filter = data.filter;
            var counter = data.counter;
            if (filter.length === 1 || filter.length === 5) { // TEST SYNCHRONOUSLY
                setTimeout(function() {
                    fn({ filter: filter, counter: counter }); // return to client
                }, 3000);
            } else {
                fn({ filter: filter, counter: counter }); // return to client
            }
        });
    });
};
Client:
export class FilterableDataTable extends Component {
    constructor(props) {
        super(props);
        this.state = {
            endpoint: "http://localhost:3001",
            filters: {},
            counter: 0
        };
        this.onLazyLoad = this.onLazyLoad.bind(this);
    }

    onLazyLoad(event) {
        var offset = event.first;
        if (offset === null) {
            offset = 0;
        }
        var filter = ''; // filter is the search character
        if (event.filters.result2 != undefined) {
            filter = event.filters.result2.value;
        }
        var returnedData = null;
        this.state.counter++; // deliberate direct mutation: the counter does not drive rendering
        this.socket.emit('result', {
            offset: offset,
            limit: 20,
            filter: filter,
            counter: this.state.counter
        }, (data) => { // arrow function keeps `this` bound to the component
            returnedData = data;
            console.log(returnedData);
            if (returnedData.counter === this.state.counter) { // only the latest request wins
                console.log('DATA: ' + JSON.stringify(returnedData));
            }
        });
    }
}
This does, however, send unneeded data to the client, which in turn ignores it. Does anybody have ideas for further optimizing this kind of communication? For example, a method to keep old data on the server and only send the latest?
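One option is to do the counter check on the server, so stale responses are never sent at all. A sketch under the same assumptions as the solution above; the per-socket latestCounter variable and the runQuery function are additions, with runQuery standing in for whatever database lookup you perform:
io.sockets.on('connection', function (socket) {
    var latestCounter = 0; // highest request counter seen from this client
    socket.on('result', function (data, fn) {
        if (data.counter > latestCounter) {
            latestCounter = data.counter;
        }
        runQuery(data.filter, function (results) {
            // only answer if no newer request arrived while we were querying
            if (data.counter === latestCounter) {
                fn({ filter: data.filter, counter: data.counter, results: results });
            }
        });
    });
});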

async.whilst - pausing between calls

I have a function which I need to call a number of times, and instead of using a for loop I'm using async.whilst. But what I need is that the next call to the function is not made before the previous call completes, which is not what's happening with async.whilst. Is there a way to implement this? (I'm using setTimeout to pause between each call, but it is not very clean.)
Many thanks, C
I'd use the async.forever construct. Assuming your function is named myFunction and accepts a callback as its parameter:
var count = 0;
var limit = 10; // set to the number of executions of the function

async.forever(
    function(next) {
        myFunction(function () {
            count++;
            if (count < limit) {
                next();
            } else {
                next(true);
            }
        });
    },
    function(ended) {
        // function-calling iteration ended
    }
);
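If the number of calls is known up front, async.timesSeries expresses the same thing more directly: it runs its iteratee strictly one at a time. A sketch, assuming myFunction takes only a completion callback:
var async = require('async');

async.timesSeries(10, function (n, next) {
    // n is the iteration index; each call starts only after the
    // previous one has invoked its callback
    myFunction(function () {
        next(null);
    });
}, function (err) {
    // all 10 sequential calls have completed
});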

Error Handling in a Recursive setTimeout Function in Node.js

I'm building my first node.js application on my Raspberry Pi, which I am using to control an air conditioner via LIRC. The following code is called when you want to increase the temperature of the AC unit. It sends a LIRC command every 250 milliseconds, depending on how many degrees you want to increase it by. This code works as expected.
var iDegrees = 5;
var i = 0;
var delay = 250; // The delay in milliseconds

function increaseTemperatureLoop() {
    i++;
    //lirc_node.irsend.send_once("ac", "INCREASE", function() {});
    console.log(i);
    // Call the function/loop again after the delay if we still need to increase the temperature
    if (i <= iDegrees) {
        timer = setTimeout(increaseTemperatureLoop, delay);
    }
    else {
        res.json({"message": "Success"});
    }
}

// Start the timer to call the recursive function for the first time
var timer = setTimeout(increaseTemperatureLoop, delay);
I'm having a hard time working with the asynchronous nature of node.js. Once my recursive function is done, I return the JSON to the browser as shown in the code above. Out of habit, I feel like I should return the JSON in a line of code after my initial function call, like below, but obviously that wouldn't wait for all of the LIRC calls to succeed - and it seems silly to have it inside the function:
var timer = setTimeout(increaseTemperatureLoop, delay);
res.json({"message": "Success"});
What if I have a bunch of other stuff to do after my LIRC sends are done but before I want to send my JSON back to the browser? Or what if that block of code throws an error...
My second question is: how do I properly wrap the LIRC call in a try/catch, and then, if there is an error, stop the recursive calls, pass the error back up, and return it to the browser along with the actual error message:
res.json({"message": "Failed"});
To track the end of the execution cycle, you can use a callback.
To know whether all routine tasks have completed, you can keep a task counter.
Monitoring and reporting errors back up the chain is possible with the same callback.
In general, it is desirable to wrap everything in a single object.
Some example for reflection:
var lircTasks = function __self() {
    if (typeof __self.tasks === "undefined") __self.tasks = 0;
    __self.func = {
        increaseTemperature: function() {
            // lirc_node.irsend.send_once("ac", "INCREASE_TEMPERATURE", function() {});
        },
        increaseFanPower: function() {
            // lirc_node.irsend.send_once("ac", "INCREASE_FANPOWER", function() {});
        }
    };

    var fab = function () {
        __self.tasks++;
        this.i = 0;
        this.args = arguments[0];
        this.callback = arguments[1];
        this.run = function __ref(taskName) {
            if (taskName) this.taskName = taskName;
            if (this.i < this.args.deg) {
                try {
                    __self.func[this.taskName]();
                } catch (e) {
                    __self.tasks--;
                    this.callback({message: "error", error: e, taskName: this.taskName, task: this.args, tasks: __self.tasks});
                    return; // stop the recursive calls after an error
                }
                this.i++;
                setTimeout(__ref.bind(this), this.args.delay);
            } else {
                __self.tasks--;
                this.callback({message: "complete", taskName: this.taskName, task: this.args, tasks: __self.tasks});
            }
        };
    };

    if ((arguments.length === 2) && (typeof arguments[1] === "function") && arguments[0].deg > 0 && arguments[0].delay >= 0) {
        return new fab(arguments[0], arguments[1]);
    }
};
function complete(e) {
    console.log(e);
    if (e.tasks === 0) console.log({message: "Success"});
}

lircTasks({deg: 10, delay: 100, device: "d1"}, complete).run("increaseTemperature");
lircTasks({deg: 20, delay: 150, device: "d2"}, complete).run("increaseTemperature");
lircTasks({deg: 5, delay: 100, device: "d3"}, complete).run("increaseFanPower");
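For a lighter-weight take on the same two questions, the recursive loop can also be wrapped in a Promise: resolve when the loop finishes, reject (and stop recursing) on the first error. A sketch only; the lirc_node call is commented out just as in the question, and the Express res is assumed to be in scope:
function increaseTemperature(iDegrees, delay) {
    return new Promise(function (resolve, reject) {
        var i = 0;
        function loop() {
            i++;
            try {
                // lirc_node.irsend.send_once("ac", "INCREASE", function() {});
                console.log(i);
            } catch (e) {
                return reject(e); // stop recursing and propagate the error
            }
            if (i < iDegrees) {
                setTimeout(loop, delay); // schedule the next send
            } else {
                resolve(); // all sends finished
            }
        }
        setTimeout(loop, delay);
    });
}

// usage inside a route handler; any other post-processing can go in the same .then
increaseTemperature(5, 250)
    .then(function () { res.json({"message": "Success"}); })
    .catch(function (err) { res.json({"message": "Failed", "error": String(err)}); });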

Idiomatic way to wait for multiple callbacks in Node.js

Suppose you need to do some operations that depend on some temp file. Since
we're talking about Node here, those operations are obviously asynchronous.
What is the idiomatic way to wait for all operations to finish in order to
know when the temp file can be deleted?
Here is some code showing what I want to do:
do_something(tmp_file_name, function(err) {});
do_something_other(tmp_file_name, function(err) {});
fs.unlink(tmp_file_name);
But if I write it this way, the third call can be executed before the first two
get a chance to use the file. I need some way to guarantee that the first two
calls already finished (invoked their callbacks) before moving on without nesting
the calls (and making them synchronous in practice).
I thought about using event emitters on the callbacks and registering a counter
as receiver. The counter would receive the finished events and count how many
operations were still pending. When the last one finished, it would delete the
file. But there is the risk of a race condition and I'm not sure this is
usually how this stuff is done.
How do Node people solve this kind of problem?
Update:
Now I would advise having a look at:
Promises
The Promise object is used for deferred and asynchronous computations.
A Promise represents an operation that hasn't completed yet, but is
expected in the future.
A popular promises library is bluebird. I would advise having a look at why promises.
You should use promises to turn this:
fs.readFile("file.json", function (err, val) {
    if (err) {
        console.error("unable to read file");
    }
    else {
        try {
            val = JSON.parse(val);
            console.log(val.success);
        }
        catch (e) {
            console.error("invalid json in file");
        }
    }
});
Into this:
fs.readFileAsync("file.json").then(JSON.parse).then(function (val) {
    console.log(val.success);
})
.catch(SyntaxError, function (e) {
    console.error("invalid json in file");
})
.catch(function (e) {
    console.error("unable to read file");
});
generators: For example via co.
Generator based control flow goodness for nodejs and the browser,
using promises, letting you write non-blocking code in a nice-ish way.
var co = require('co');

co(function *() {
    // yield any promise
    var result = yield Promise.resolve(true);
}).catch(onerror);

co(function *() {
    // resolve multiple promises in parallel
    var a = Promise.resolve(1);
    var b = Promise.resolve(2);
    var c = Promise.resolve(3);
    var res = yield [a, b, c];
    console.log(res);
    // => [1, 2, 3]
}).catch(onerror);

// errors can be try/catched
co(function *() {
    try {
        yield Promise.reject(new Error('boom'));
    } catch (err) {
        console.error(err.message); // "boom"
    }
}).catch(onerror);

function onerror(err) {
    // log any uncaught errors
    // co will not throw any errors you do not handle!!!
    // HANDLE ALL YOUR ERRORS!!!
    console.error(err.stack);
}
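On current Node versions, native async/await gives you the same style without co or generators. A sketch of the equivalents:
async function main() {
    // await any promise
    const result = await Promise.resolve(true);

    // resolve multiple promises in parallel
    const res = await Promise.all([Promise.resolve(1), Promise.resolve(2), Promise.resolve(3)]);
    console.log(res); // => [1, 2, 3]

    // errors can be try/caught
    try {
        await Promise.reject(new Error('boom'));
    } catch (err) {
        console.error(err.message); // "boom"
    }
}

main().catch((err) => console.error(err.stack)); // handle any uncaught errors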
If I understand correctly, I think you should have a look at the very good async library. You should especially have a look at async.series. Here is a copy of the snippets from the GitHub page:
async.series([
    function(callback) {
        // do some stuff ...
        callback(null, 'one');
    },
    function(callback) {
        // do some more stuff ...
        callback(null, 'two');
    }
],
// optional callback
function(err, results) {
    // results is now equal to ['one', 'two']
});

// an example using an object instead of an array
async.series({
    one: function(callback) {
        setTimeout(function() {
            callback(null, 1);
        }, 200);
    },
    two: function(callback) {
        setTimeout(function() {
            callback(null, 2);
        }, 100);
    }
},
function(err, results) {
    // results is now equal to: {one: 1, two: 2}
});
As a plus this library can also run in the browser.
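Applied to the question, async.parallel from the same library is an even closer fit than series: the two operations run concurrently and the final callback fires only after both have finished. A sketch using the question's function names:
var async = require('async');
var fs = require('fs');

async.parallel([
    function (cb) { do_something(tmp_file_name, cb); },
    function (cb) { do_something_other(tmp_file_name, cb); }
], function (err) {
    // runs once both callbacks above have fired
    if (err) return console.error(err);
    fs.unlink(tmp_file_name, function () {});
});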
The simplest way is to increment an integer counter when you start an async operation and then, in the callback, decrement the counter. Depending on the complexity, the callback could check the counter for zero and then delete the file.
A little more complex would be to maintain a list of objects, where each object has whatever attributes you need to identify the operation (it could even be the function call) as well as a status code. The callbacks would set the status code to completed.
Then you would have a loop that waits (using process.nextTick) and checks whether all tasks are completed. The advantage of this method over the counter is that if it is possible for all outstanding tasks to complete before all tasks have been issued, the counter technique would cause you to delete the file prematurely.
// simple countdown latch
function CDL(countdown, completion) {
    this.signal = function() {
        if (--countdown < 1) completion();
    };
}

// usage
var latch = new CDL(10, function() {
    console.log("latch.signal() was called 10 times.");
});
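Applied to the temp-file question, a sketch using the question's function names:
var fs = require('fs');

var latch = new CDL(2, function () {
    // both operations have signalled, so the temp file can go
    fs.unlink(tmp_file_name, function () {});
});

do_something(tmp_file_name, function (err) { latch.signal(); });
do_something_other(tmp_file_name, function (err) { latch.signal(); });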
There is no "native" solution, but there are a million flow control libraries for node. You might like Step:
Step(
    function() {
        do_something(tmp_file_name, this.parallel());
        do_something_other(tmp_file_name, this.parallel());
    },
    function(err) {
        if (err) throw err;
        fs.unlink(tmp_file_name);
    }
);
Or, as Michael suggested, counters could be a simpler solution. Take a look at this semaphore mock-up. You'd use it like this:
do_something1(file, queue('myqueue'));
do_something2(file, queue('myqueue'));

queue.done('myqueue', function() {
    fs.unlink(file);
});
I'd like to offer another solution that utilizes the speed and efficiency of the programming paradigm at the very core of Node: events.
Everything you can do with Promises or with modules designed to manage flow control, like async, can be accomplished using events and a simple state machine, which I believe offers a methodology that is perhaps easier to understand than the other options.
For example, assume you wish to sum the lengths of multiple files in parallel:
const EventEmitter = require('events').EventEmitter;
const fs = require('fs');

// simple event-driven state machine
const sm = new EventEmitter();

// running state
let context = {
    tasks: 0,    // number of total tasks
    active: 0,   // number of active tasks
    results: []  // task results
};

const next = (result) => { // must be called when each task chain completes
    if (result) { // preserve result of task chain
        context.results.push(result);
    }
    // decrement the number of running tasks
    context.active -= 1;
    // when all tasks complete, trigger done state
    if (!context.active) {
        sm.emit('done');
    }
};

// operational states
// start state - initializes context
sm.on('start', (paths) => {
    const len = paths.length;
    console.log(`start: beginning processing of ${len} paths`);
    context.tasks = len;  // total number of tasks
    context.active = len; // number of active tasks
    sm.emit('forEachPath', paths); // go to next state
});

// start processing of each path
sm.on('forEachPath', (paths) => {
    console.log(`forEachPath: starting ${paths.length} process chains`);
    paths.forEach((path) => sm.emit('readPath', path));
});

// read contents from path
sm.on('readPath', (path) => {
    console.log(`  readPath: ${path}`);
    fs.readFile(path, (err, buf) => {
        if (err) {
            sm.emit('error', err);
            return;
        }
        sm.emit('processContent', buf.toString(), path);
    });
});

// compute length of path contents
sm.on('processContent', (str, path) => {
    console.log(`  processContent: ${path}`);
    next(str.length);
});

// when processing is complete
sm.on('done', () => {
    const total = context.results.reduce((sum, n) => sum + n);
    console.log(`The total of ${context.tasks} files is ${total}`);
});

// error state
sm.on('error', (err) => { throw err; });

// ======================================================
// start processing - ok, let's go
// ======================================================
sm.emit('start', ['file1', 'file2', 'file3', 'file4']);
Which will output:
start: beginning processing of 4 paths
forEachPath: starting 4 process chains
  readPath: file1
  readPath: file2
  processContent: file1
  readPath: file3
  processContent: file2
  processContent: file3
  readPath: file4
  processContent: file4
The total of 4 files is 4021
Note that the ordering of the process chain tasks is dependent upon system load.
You can envision the program flow as:
start -> forEachPath -+-> readPath1 -> processContent1 -+-> done
                      +-> readPath2 -> processContent2 -+
                      +-> readPath3 -> processContent3 -+
                      +-> readPath4 -> processContent4 -+
For reuse, it would be trivial to create a module to support the various flow-control patterns, i.e. series, parallel, batch, while, until, etc.
The simplest solution is to run the do_something* and unlink in sequence as follows:
do_something(tmp_file_name, function(err) {
    do_something_other(tmp_file_name, function(err) {
        fs.unlink(tmp_file_name);
    });
});
Unless, for performance reasons, you want to execute do_something() and do_something_other() in parallel, I suggest keeping it simple and going this way.
Wait.for: https://github.com/luciotato/waitfor
Using Wait.for:
var wait = require('wait.for');

...in a fiber...

wait.for(do_something, tmp_file_name);
wait.for(do_something_other, tmp_file_name);
fs.unlink(tmp_file_name);
With pure Promises it could be a bit more messy, but if you use Deferred Promises then it's not so bad:
Install:
npm install --save @bitbar/deferred-promise
Modify your code:
const DeferredPromise = require('@bitbar/deferred-promise');

const promises = [
    new DeferredPromise(),
    new DeferredPromise()
];

do_something(tmp_file_name, (err) => {
    if (err) {
        promises[0].reject(err);
    } else {
        promises[0].resolve();
    }
});

do_something_other(tmp_file_name, (err) => {
    if (err) {
        promises[1].reject(err);
    } else {
        promises[1].resolve();
    }
});

Promise.all(promises).then(() => {
    fs.unlink(tmp_file_name);
});
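With plain Promises and no extra package, the same thing is only slightly more verbose: wrap each callback-style call in a Promise constructor. A sketch reusing the question's function names; the asPromise helper is an addition:
const fs = require('fs');

// wraps a (file, callback)-style function into a promise
const asPromise = (fn) => new Promise((resolve, reject) => {
    fn(tmp_file_name, (err) => err ? reject(err) : resolve());
});

Promise.all([asPromise(do_something), asPromise(do_something_other)])
    .then(() => fs.unlink(tmp_file_name, () => {}));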
