node.js setTimeout Loop Issue

I'm new to node.js.
I have tried to create a setTimeout that executes a database SELECT query and repeats 3 seconds after processing of the SELECT results has completed.
var newDBMessagesInterval = 3000; // 3 Seconds

(function newDBMessagesSchedule() {
    setTimeout(function() {
        dbNewMessagesQuery(function(dbResults, dbResultsLength) {
            console.log(dbResults);
            newDBMessagesSchedule();
        });
    }, newDBMessagesInterval);
})();
function dbNewMessagesQuery(callback) {
    dbConnection.query("SELECT data1,data2,data3 FROM table WHERE condition=1;", function (dbError, dbResults, dbFields) {
        if (dbResults.length > 0) {
            callback(dbResults, dbResults.length);
        }
    });
    callback();
}
It appears the number of setTimeout loops increases each time it runs (e.g. the first run calls console.log(dbResults) once, but then 2 times, then 4, etc.). Also, I'm not sure if it's waiting for the database SELECT to complete before trying to process the next iteration.
Looking for some advice on how to create this loop correctly with node.js.
Thanks.

Your dbNewMessagesQuery calls callback twice. Once synchronously, and once after the db query succeeds. You should just be calling it once after the query is done. With your current code, for every call to newDBMessagesSchedule, you queue up two more calls to run later.
function dbNewMessagesQuery(callback) {
    dbConnection.query("SELECT data1,data2,data3 FROM table WHERE condition=1;", function (dbError, dbResults, dbFields) {
        callback(dbResults, dbResults.length);
    });
}
I'd also recommend not bothering to pass the length separately, and instead pass along the error if there is one. Currently you just assume there will never be an error.
function dbNewMessagesQuery(callback) {
    dbConnection.query("SELECT data1,data2,data3 FROM table WHERE condition=1;", function (dbError, dbResults, dbFields) {
        callback(dbError, dbResults);
    });
}
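With that change, the scheduling loop can check the error before logging and only re-arm itself once the query has finished. A minimal sketch of the consuming side (variable names follow the question; the error handling shown is just one way to do it):

var newDBMessagesInterval = 3000; // 3 seconds

(function newDBMessagesSchedule() {
    setTimeout(function() {
        dbNewMessagesQuery(function(dbError, dbResults) {
            if (dbError) {
                console.error(dbError);      // decide how you want to handle failures
            } else if (dbResults.length > 0) {
                console.log(dbResults);
            }
            newDBMessagesSchedule();         // only re-arm after the query has completed
        });
    }, newDBMessagesInterval);
})();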

Related

how to use async.parallelLimit to maximize the amount of (parallel) running processes?

Is it possible to set a limit on the number of parallel running processes with async.parallelLimit? I used the following very simple code to test how it works.
var async = require("async");
var i = 0;
function write() {
i++;
console.log("Done", i);
}
async.parallelLimit([
function(callback) {
write();
callback();
}
], 10, function() {
console.log("finish");
});
Of course, all I got back was this:
Done 1
finish
In my case I have a function which is called very often, but I only want it to run 5 times simultaneously. The other calls should be queued. (Yes, I know about async.queue, but this is not what I want in this question.)
So now the question is: is this possible with async.parallelLimit?
//EDIT:
I actually have something similar to this:
for(var i = 0; i < somthing.length; i++) { //i > 100 for example
write();
}
And 'cause of the non synchronicity of node.js this could run 100 times at the same time. But how shell I limit the parallel running processes in this case?
Very short answer: yes, that's exactly what async.parallelLimit does.
In your case, you are passing only one function to parallelLimit. That's why it only gets called once. If you pass an array containing this same function many times, it will get executed as many times as you put it in the array.
Please note that your example function doesn't actually do any work asynchronously. As such, this example function will always get executed in series. If you have a function that does async work, for example a network request or file I/O, it will get executed in parallel.
A better example function for an async workload would be:
function(callback) {
    setTimeout(function() {
        callback();
    }, 200);
}
For completeness, to add to the existing answer: if you want to run the same function multiple times in parallel with a limit, here's how you do it:
// run 'my_task' 100 times, with a parallel limit of 10
var my_task = function(callback) { ... };
var when_done = function(err, results) { ... };

// create an array of tasks
var async_queue = Array(100).fill(my_task);
async.parallelLimit(async_queue, 10, when_done);
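Putting the two pieces together, a minimal runnable sketch (assuming the async module is installed, and using setTimeout to stand in for real async work):

var async = require("async");

// a fake async task: finishes ~200ms after it starts
var my_task = function(callback) {
    setTimeout(function() {
        callback(null, Date.now());
    }, 200);
};

// 100 copies of the task, with at most 10 in flight at once
async.parallelLimit(Array(100).fill(my_task), 10, function(err, results) {
    console.log("finish", results.length); // -> finish 100
});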

Allow multiple async calls to finish before calling termination function in Node.js

I'm using node.js to search through a sqlite table of books and return only the books that are not checked out. My question is whether the second function that I am passing into stmt.each() will wait for isCheckedOut() to finish before executing.
At the moment it seems to be waiting, but I'm worried that if isCheckedOut() takes longer to execute (i.e. with a larger database) then the second function will run too early and won't return all the results.
If the second function is running prematurely, how can I make it wait?
var sqlite3 = require('sqlite3').verbose();
var db = new sqlite3.Database('database');

function showAvailableBooks(callback) {
    var stmt = db.prepare("SELECT * FROM books");
    var results = [];
    stmt.each(function (err, row) {
        isCheckedOut(row.Barcode, function (err, checkedOut) {
            if (!checkedOut) {
                results.push({Title: row.Title, Author: row.Author, CallNumber: row.CallNumber});
            }
        });
    }, function () {
        callback(results);
    });
}
showAvailableBooks(function (results) {
    console.log("# of results: " + results.length);
});
No, stmt.each won't wait for all the isCheckedOut calls to run their callbacks if they get deferred. Right now, with sqlite (no network latency) and a small database, it might work, but if you try a different, remote, or bigger database it may not.
Rob W answered a similar question using a counter here: https://stackoverflow.com/a/21185103/4925989, but there are also other ways to make your finish callback wait.
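For reference, a minimal sketch of the counter idea applied to the question's code (the helper names are hypothetical; the point is to call the final callback only once every pending isCheckedOut callback has returned and the statement has finished stepping):

function showAvailableBooks(callback) {
    var stmt = db.prepare("SELECT * FROM books");
    var results = [];
    var pending = 0;          // isCheckedOut calls still in flight
    var doneStepping = false; // stmt.each has delivered all rows

    function maybeFinish() {
        if (doneStepping && pending === 0) callback(results);
    }

    stmt.each(function (err, row) {
        pending++;
        isCheckedOut(row.Barcode, function (err, checkedOut) {
            if (!checkedOut) {
                results.push({Title: row.Title, Author: row.Author, CallNumber: row.CallNumber});
            }
            pending--;
            maybeFinish();
        });
    }, function () {
        doneStepping = true;
        maybeFinish();
    });
}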

how to make this function async in node.js

Here is the situation:
I am new to node.js. I have a 40MB file containing a multilevel JSON structure like:
[{},{},{}] This is an array of objects (~7000 objects). Each object has properties, and one of those properties is also an array of objects.
I wrote a function to read the content of the file and iterate over it. I succeeded in getting what I wanted in terms of content, but not usability. I thought that I had written an async function that would allow node to serve other web requests while iterating over the array, but that is not the case. I would be very thankful if anyone can point me to what I've done wrong and how to rewrite it so I can have a non-blocking iteration. Here's the function that handles the situation:
function getContents(callback) {
    fs.readFile(file, 'utf8', function (err, data) {
        if (err) {
            console.log('Error: ' + err);
            return;
        }
        js = JSON.parse(data);
        callback();
        return;
    });
}

getContents(iterateGlobalArr);
var count = 0;
function iterateGlobalArr() {
    if (count < js.length) {
        innerArr = js.nestedProp;
        //iterate nutrients
        innerArr.forEach(function(e, index) {
            //some simple if condition here
        });
        var schema = {
            //.....get props from forEach iteration
        };
        Model.create(schema, function(err, post) {
            if (err) {
                console.log('\ncreation error\n', err);
                return;
            }
            if (!post) {
                console.log('\nfailed to create post for schema:\n' + schema);
                return;
            }
        });
        count++;
        process.nextTick(iterateGlobalArr);
    }
    else {
        console.log("\nIteration finished");
        next();
    }
}
Just so it is clear how I've tested the above situation: I open two tabs, one loading this iteration, which takes some time, and a second with another node route, which does not load until the iteration is over. So essentially I've written blocking code, but I'm not sure how to refactor it! I suspect that just because everything is happening in the callback I am unable to release the event loop to handle another request...
Your code is almost correct. What you are doing is inadvertently adding ALL the items to the very next tick... which still blocks.
The important piece of code is here:
Model.create(schema, function(err, post) {
    if (err) {
        console.log('\ncreation error\n', err);
        return;
    }
    if (!post) {
        console.log('\nfailed to create post for schema:\n' + schema);
        return;
    }
});
// add EVERYTHING to the very same next tick!
count++;
process.nextTick(iterateGlobalArr);
Let's say you are in tick A of the event loop when getContents() runs and count is 0. You enter iterateGlobalArr and you call Model.create. Because Model.create is async, it is returning immediately, causing process.nextTick() to add processing of item 1 to the next tick, let's say B. Then it calls iterateGlobalArr, which does the same thing, adding item 2 to the next tick, which is still B. Then item 3, and so on.
What you need to do is move the count increment and process.nextTick() into the callback of Model.create(). This will make sure the current item is processed before nextTick is invoked... which means next item is actually added to the next tick AFTER the model item has been created... which will give your app time to handle other things in between. The fixed version of iterateGlobalArr is here:
function iterateGlobalArr() {
    if (count < js.length) {
        innerArr = js.nestedProp;
        //iterate nutrients
        innerArr.forEach(function(e, index) {
            //some simple if condition here
        });
        var schema = {
            //.....get props from forEach iteration
        };
        Model.create(schema, function(err, post) {
            // schedule our next item to be processed immediately.
            count++;
            process.nextTick(iterateGlobalArr);

            // then move on to handling this result.
            if (err) {
                console.log('\ncreation error\n', err);
                return;
            }
            if (!post) {
                console.log('\nfailed to create post for schema:\n' + schema);
                return;
            }
        });
    }
    else {
        console.log("\nIteration finished");
        next();
    }
}
Note also that I would strongly suggest passing your js and counter in with each call to iterateGlobalArr, as it will make iterateGlobalArr a lot easier to debug, among other things, but that's another story.
Cheers!
Node is single-threaded, so async will only help you if you are relying on another system/subsystem to do the work (a shell script, external database, web service, etc.). If you have to do the work in Node, you are going to block while you do it.
It is possible to create one node process per core. This solution would result in blocking only one of the node processes, leaving the rest to service your requests, but this feature is still listed as experimental: http://nodejs.org/api/cluster.html.
A single instance of Node runs in a single thread. To take advantage of multi-core systems the user will sometimes want to launch a cluster of Node processes to handle the load.
The cluster module allows you to easily create child processes that all share server ports.
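A minimal sketch of the cluster module in use (standard Node API; the worker code and port are illustrative only):

var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
    // fork one worker per core; each worker re-runs this same file
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }
} else {
    // workers share the same port; a long-running request in one worker
    // does not block the others
    http.createServer(function (req, res) {
        res.end('handled by worker ' + process.pid + '\n');
    }).listen(8000);
}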

Asynchronous Database Queries with PostgreSQL in Node not working

Using Node.js and the node-postgres module to communicate with a database, I'm attempting to write a function that accepts an array of queries and callbacks and executes them all asynchronously using the same database connection. The function accepts a two-dimensional array and calling it looks like this:
perform_queries_async([
    ['SELECT COUNT(id) as count FROM ideas', function(result) {
        console.log("FUNCTION 1");
    }],
    ["INSERT INTO ideas (name) VALUES ('test')", function(result) {
        console.log("FUNCTION 2");
    }]
]);
And the function iterates over the array, creating a query for each sub-array, like so:
function perform_queries_async(queries) {
    var client = new pg.Client(process.env.DATABASE_URL);

    for (var i = 0; i < queries.length; i++) {
        var q = queries[i];
        client.query(q[0], function(err, result) {
            if (err) {
                console.log(err);
            } else {
                q[1](result);
            }
        });
    }

    client.on('drain', function() {
        console.log("drained");
        client.end();
    });

    client.connect();
}
When I ran the above code, I expected to see output like this:
FUNCTION 1
FUNCTION 2
drained
However, the output bizarrely appears like so:
FUNCTION 2
drained
FUNCTION 2
Not only is the second function getting called for both requests, it also seems as though the drain code is getting called before the client's queue of queries has finished running... yet the second query still runs perfectly fine, even though the client.end() call ostensibly killed the client once the event fired.
I've been tearing my hair out about this for hours. I tried hardcoding in my sample array (thus removing the for loop), and my code worked as expected, which leads me to believe that there is some problem with my loop that I'm not seeing.
Any ideas on why this might be happening would be greatly appreciated.
The simplest way to properly capture the value of the q variable in a closure in modern JavaScript is to use forEach:
queries.forEach(function(q) {
    client.query(q[0], function(err, result) {
        if (err) {
            console.log(err);
        } else {
            q[1](result);
        }
    });
});
If you don't capture the value, your code sees the last value that q had, because the callback function executes later, in the context of the containing function.
forEach, by using a callback function, isolates and captures the value of q so it can be properly evaluated by the inner callback.
A victim of the famous Javascript closure/loop gotcha. See my (and other) answers here:
I am trying to open 10 websocket connections with nodejs, but somehow my loop doesnt work
Basically, at the time your callback is executed, q is set to the last element of the input array. The way around it is to dynamically generate the closure.
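For reference, a minimal sketch of dynamically generating the closure inside the original for loop (an immediately-invoked function, as an alternative to forEach, leaving the rest of the question's code unchanged):

for (var i = 0; i < queries.length; i++) {
    (function(q) {                      // q is now private to this iteration
        client.query(q[0], function(err, result) {
            if (err) {
                console.log(err);
            } else {
                q[1](result);
            }
        });
    })(queries[i]);
}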
It would be good to execute this using the async module. It will also help you reuse the code and make it more readable. I just love the auto function provided by the async module.
Ref: https://github.com/caolan/async

Node.js + SQLite async transactions

I am using node-sqlite3, but I am sure this problem appears in other database libraries too. I have discovered a bug in my code that mixes transactions and async code.
function insertData(arrayWithData, callback) {
    // start a transaction
    db.run("BEGIN", function() {
        // do multiple inserts
        slide.asyncMap(
            arrayWithData,
            function(cb) {
                db.run("INSERT ...", cb);
            },
            function() {
                // all done
                db.run("COMMIT");
            }
        );
    });
}

// some other insert
setInterval(
    function() { db.run("INSERT ...", cb); },
    100
);
You can also run the full example.
The problem is that some other code with an insert or update query can be launched during the async pause after BEGIN or an INSERT. That extra query then runs inside the transaction. This is not a problem when the transaction is committed, but if the transaction is rolled back, the change made by this extra query is also rolled back. Oops, we've just unpredictably lost data without any error message.
I thought about this issue and I think that one solution is to create a wrapper class that will make sure that:
Only one transaction is running at the same time.
When transaction is running only queries which belong to the transaction are executed.
All the extra queries are queued and executed after the current transaction is finished.
All attempts to start a transaction when one is already running will also get queued.
But that sounds like an overly complicated solution. Is there a better approach? How do you deal with this problem?
First, I would like to state that I have no experience with SQLite. My answer is based on a quick study of node-sqlite3.
The biggest problem with your code IMHO is that you try to write to the DB from different locations. As I understand SQLite, you have no control over separate parallel "connections" as you have in PostgreSQL, so you probably need to wrap all your communication with the DB. I modified your example to always use the insertData wrapper. Here is the modified function:
function insertData(callback, cmds) {
    // start a transaction
    db.serialize(function() {
        db.run("BEGIN;");
        //console.log('insertData -> begin');

        // do multiple inserts
        cmds.forEach(function(item) {
            db.run("INSERT INTO data (t) VALUES (?)", item, function(e) {
                if (e) {
                    console.log('error');
                    // rollback here
                } else {
                    //console.log(item);
                }
            });
        });

        // all done
        //here should be commit
        //console.log('insertData -> commit');
        db.run("ROLLBACK;", function(e) {
            return callback();
        });
    });
}
The function is called with this code:
init(function() {
    // insert with transaction
    function doTransactionInsert(e) {
        if (e) return console.log(e);
        setTimeout(insertData, 10, doTransactionInsert, ['all', 'your', 'base', 'are', 'belong', 'to', 'us']);
    }
    doTransactionInsert();

    // insert increasing integers 0, 1, 2, ...
    var i = 0;
    function doIntegerInsert() {
        //console.log('integer insert');
        insertData(function(e) {
            if (e) return console.log(e);
            setTimeout(doIntegerInsert, 9);
        }, [i++]);
    }
    ...
I made the following changes:
added a cmds parameter (for simplicity I added it as the last parameter, but the callback should be last; cmds is an array of inserted values, and in a final implementation it should be an array of SQL commands)
changed db.exec to db.run (should be quicker)
added db.serialize to serialize requests inside the transaction
omitted the callback for the BEGIN command
left out slide and some underscore usage
Your test implementation now works fine for me.
I ended up writing a full wrapper around sqlite3 to implement locking the database in a transaction. When the DB is locked, all queries are queued and executed after the current transaction is over.
https://github.com/Strix-CZ/sqlite3-transactions
IMHO there are some problems with ivoszz's answer:
Since all db.run calls are async, you cannot check the result of the whole transaction, and if one run returns an error you should roll back all commands. To do this you would have to call db.run("ROLLBACK") in the callback inside the forEach loop. The db.serialize function will not serialize the async runs, so a "cannot start a transaction within a transaction" error occurs.
The "COMMIT/ROLLBACK" after the forEach loop has to check the result of all statements, and you cannot run it before all the previous runs have finished.
IMHO there is only one way to make thread-safe (obviously referring to the background thread pool) transaction management: create a wrapper function and use the async library to serialize all statements manually. In this way you can avoid the db.serialize function and (more importantly) check every single db.run result in order to roll back the whole transaction (and return a promise if needed).
The main problem of the node-sqlite3 library related to transactions is that there is no callback in the serialize function to check whether an error occurred.
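A minimal sketch of that approach (the runTransaction helper is hypothetical, assuming the async module and a node-sqlite3 db handle): each statement becomes a task, the tasks run through async.series so every db.run result is checked, and any error triggers a rollback:

var async = require("async");

function runTransaction(statements, callback) {
    var tasks = statements.map(function(sql) {
        return function(cb) { db.run(sql, cb); };   // each db.run result is checked by async.series
    });

    db.run("BEGIN", function(beginErr) {
        if (beginErr) return callback(beginErr);
        async.series(tasks, function(err) {
            if (err) {
                // one statement failed: undo the whole transaction
                return db.run("ROLLBACK", function() { callback(err); });
            }
            db.run("COMMIT", callback);
        });
    });
}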
