Firebase transactions in NodeJS always running 3 times?

Whenever I define a Firebase transaction in NodeJS I notice it always runs three times - the first two times with null data, then finally a third time with actual data. Is this normal/intended?
For example this code:
firebaseOOO.child('ref').transaction(function(data) {
    console.log(data);
    return data;
});
outputs the following:
null
null
i1: { a1: true }
I would have expected it to only print the last item.
To answer a question in the comments, here is the same with a callback:
firebaseOOO.child('ref').transaction(function(data) {
    console.log(data);
    return data;
}, function(error, committed, snapshot) {
    if (error)
        console.log('failed');
    else if (!committed)
        console.log('aborted');
    else
        console.log('committed');
    console.log('fin');
});
Which yields the following output:
null
null
i1: { a1: true }
committed
fin
I had read the details of how transactions work before posting the question, so I had tried setting applyLocally to false like this:
firebaseOOO.child('ref').transaction(function(data) {
    console.log('hit');
    return data;
}, function() {}, false);
But it still hits 3 times (just double-checked), so I thought it was something different. Getting the 'value' before transacting does "work" as expected, in that it only hits once, regardless of what applyLocally is set to, so I'm not sure what applyLocally actually does. This is what I mean by getting the value before transacting:
firebaseOOO.child('ref').once('value', function(data) {
    console.log('1');
    firebaseOOO.child('ref').transaction(function(data) {
        console.log('2');
        return data;
    });
});
Outputs:
1
2
@Michael: How can one make use of this behavior? Transactions are primarily for having data use itself to modify itself - the prototypical increment++ scenario. So if I need to add 1 to the existing value of 10, and continue working with the result of 11, the first two times the function hits I will have an erroneous result of 1 that I need to handle, and finally the correct result of 11 on the third hit. How can I make use of those two initial 1's?

Another scenario (and maybe I shouldn't be using transactions for this, but if it worked like I expected it would make for cleaner code) is to insert a value if it does not yet exist. If transactions only hit once, a null value would mean the value does not exist, and so you could, for example, init the counter to 1 in that case, otherwise add 1 to whatever the value is. With the noisy nulls, this is not possible.
It seems the takeaway from all this is to simply use the 'once' pattern more often than not?
ONCE TRANSACTION PATTERN:
firebaseOOO.child('ref').once('value', function(data) {
    console.log('1');
    firebaseOOO.child('ref').transaction(function(data) {
        console.log('2');
        return data;
    });
});

The behavior you're seeing here is related to how Firebase fires local events and then eventually synchronizes with the Firebase servers. In this specific example, the "running three times" will only happen the very first time you run the code—after that, the state has been completely synchronized and it'll just trigger once from then on out. This behavior is detailed here: https://www.firebase.com/docs/transactions.html (See the "When a Transaction is run, the following occurs" section.)
If, for example, you have an outstanding on() at the same location and then, at some later time, run this same transaction code, you'll see that it'll just run once. This is because everything is in sync prior to the transaction running (in the ideal case; barring any normal conflicts, etc).
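For instance, something along these lines (a sketch of the same idea; the listener body is just a placeholder):

var ref = firebaseOOO.child('ref');

// An attached listener keeps the local cache synchronized with the
// server, so the transaction can start from real data.
ref.on('value', function(snapshot) { /* keeps the cache in sync */ });

ref.transaction(function(data) {
    console.log(data); // should now fire once, with the cached value
    return data;
});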

transaction() will be called multiple times and must be able to handle null data. Even if there is existing data in your database it may not be locally cached when the transaction function is run.
firebaseOOO.child('ref').transaction(function(data) {
    if (data !== null) {
        console.log(data);
    }
    // Return the data unchanged in both cases; null simply means the
    // value was not in the local cache yet.
    return data;
}, function(error, committed, snapshot) {
    if (error)
        console.log('failed');
    else if (!committed)
        console.log('aborted');
    else
        console.log('committed');
    console.log('fin');
});
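The practical takeaway is to write the update function so that null is a valid input. For the increment scenario from the question, a sketch ('counter' is a hypothetical child path):

firebaseOOO.child('counter').transaction(function(current) {
    // current may be null on the early, locally-guessed attempts;
    // treating null as 0 makes the function correct no matter how
    // many times it re-runs.
    return (current || 0) + 1;
}, function(error, committed, snapshot) {
    if (!error && committed)
        console.log('counter is now ' + snapshot.val());
});

This also covers the insert-if-missing case from the question: when the value does not exist yet, the transaction initializes it to 1; otherwise it increments.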

Related

Using callbacks with sqlite3

Okay, so below is a snippet of my code where I have cut out many unnecessary and unrelated things, but I have left the part dealing with the question.
I am using callbacks while calling the functions needed to run the necessary queries. Since I have many queries like these below, I was wondering if that's the right way to ensure the queries are executed in the wanted order. I know I could remove the functions and simply put everything inside a serialize, but it's really ugly to repeat the same code, so I put them in functions. To put it more clearly, here is my question.
Question: If I have many queries inside functions, is the correct way to ensure they get executed in the wanted order with callbacks, as I have done? Even in cases where you don't want to return anything, e.g. when updating a row/table in the DB?
get_data(pel, function(results) {
    var cntl = results;
    get_user(pel, function(results_from_user) {
        update_data(0, 0, function(cb_result) {
            //do some stuff
        });
    });
});

function get_data(dt, callback)
{
    db.get(`SELECT * FROM my_table`, function(error, row) {
        var data_to_return = [..];
        return callback(data_to_return);
    });
}

function update_data(vdr, dwe, callback)
{
    db.run(`UPDATE my_table SET val1='${..}', val2 = '${..}'`);
    //..
    return callback("updated");
}

function get_user(ms, callback)
{
    db.get(`SELECT id FROM my_table_2 WHERE id=${..};`, function(error, row) {
        if (row == undefined) db.run(`INSERT INTO my_table_2 (id) VALUES (?)`, [0]);
        //..
        var id_to_return = [..];
        return callback(id_to_return);
    });
}
Perhaps I should add that my code is working as expected; I am just making sure I am not using a weird approach.
I can assure you that you have made a typical solution. In fact, callbacks are used to wait for the response before moving on to the next statement. Good job.
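That said, if the nesting grows deeper, one alternative worth knowing (a sketch, not part of the original code; get_user_p and update_data_p would be promise-wrapped versions of the functions above, built the same way as get_data_p) is to wrap each query in a Promise and chain them flat:

function get_data_p(dt) {
    return new Promise(function(resolve, reject) {
        db.get(`SELECT * FROM my_table`, function(error, row) {
            if (error) return reject(error);
            resolve(row);
        });
    });
}

// Same ordering guarantee as the nested callbacks, but flat:
get_data_p(pel)
    .then(function(results) { return get_user_p(pel); })
    .then(function(results_from_user) { return update_data_p(0, 0); })
    .then(function(cb_result) { /* do some stuff */ })
    .catch(function(e) { console.log(e); });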

Asserting streams

How would you go about stream assertions? That is, you care about the sequence of inputs but not that they are exclusive (something may come in between).
Listening to events in each stage is of course doable, but it gets incredibly verbose; I'm looking for a more pragmatic solution.
A brute-force approach would be something like the following tape test:
t.plan(3);

var exe = child.spawn(...);

exe.stdout.once('data', function first(data) {
    // first expected output is always 1
    t.equal(data.toString(), '1\n');
    // next, 2, 3, 4 is coming but in unknown order.
    // the test only tests for 2
    exe.stdout.on('data', function second(data) {
        if (data.toString() !== '2\n') {
            // skip, don't care about this entry
            return;
        }
        exe.stdout.removeListener('data', second);
        t.equal(data.toString(), '2\n');
        // next is 5, 6, 7, again in unknown order but they are
        // AFTER the previous sequence
        exe.stdout.on('data', function third(data) {
            if (data.toString() !== '7\n') {
                // skip, don't care about this entry
                return;
            }
            exe.stdout.removeListener('data', third);
            t.equal(data.toString(), '7\n');
        });
    });
});
Here's a potential solution. We add a single data listener, and each time we get an event we check if it matches the next thing we expect, and if so drop that from the array of expected values. Finally, when we've emptied the array we call the callback. I assume you have a callback to call when you've gotten the expected responses. You will probably want to make sure there's a timeout on this test; otherwise it will wait forever and never fail.
var assertStreamWithGaps = function(stream, strs, next) {
    stream.on('data', function(data) {
        if (data.toString() === strs[0]) {
            strs.shift();
            if (strs.length === 0) {
                next();
            }
        }
    });
};
You would call this in your example like:
assertStreamWithGaps(exe.stdout, ['1\n', '2\n', '7\n'], function() {
    // all expected chunks have been seen
});
This doesn't quite match your example, because you expect there to be no leading gaps. It's not clear to me whether that was intentional, so I skipped it. It would probably be easy to add that functionality.
Also, I'm not sure what t is or how to use it, so I didn't.
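Regarding the timeout mentioned above: assuming t is tape's test object (the question presents the snippet as a tape test), a simple guard could look like this:

// Fail the test instead of hanging forever if the output never arrives.
var timer = setTimeout(function() {
    t.fail('timed out waiting for expected stream output');
}, 5000);

assertStreamWithGaps(exe.stdout, ['1\n', '2\n', '7\n'], function() {
    clearTimeout(timer);
    t.pass('saw all expected chunks in order');
});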

Using Q/promises vs callbacks

I'm using the Q library in nodejs and haven't worked too much with promises in the past, but I have semi-complex logic that requires lots of nesting, and I thought Q would be a good solution; however, I'm finding that it seems to be almost the same as just "callback hell".
Basically I have, say, 5 methods, all of which require data from the previous one or from one of the earlier ones. Here's an example:
We start with some binary data that has a SHA-1 hash generated from the binary.
var data = {
    hash: "XXX",
    binary: ''
};
First we want to see if we already have this, using this method:
findItemByHash(hash)
If we don't have it, we need to save it, using:
saveItem(hash)
Now we need to associate this with a user, but not just the result of the save. There's now a much larger hierarchy to associate, so we need to get that first, doing:
getItemHierarchy(item_id), using the item_id returned from our previous saveItem
Now, we can "copy" these results to a user:
saveUserHierarchy(hierarchy)
Now we're done, however, this assumes the item didn't exist yet. So we need to handle a case where the item did exist. This would be:
We need to check if the user may already have this:
getUserItemByItemId(item_id) - item_id was returned from findItemByHash
If it exists, we're done.
If it doesn't:
getItemHierarchy(item_id)
Then
saveUserHierarchy(hierarchy)
Ok, so right now we have callbacks that do these checks, which is fine. But we need to handle errors in each case along the way. That's fine too; it just adds to the mess. Really, if any part of the flow throws an error or rejects, it should stop and be handled in a single place.
Now with Q, we could do something like this:
findItemByHash(hash).then(function(res) {
    if (!res) {
        return saveItem(hash).then(function(item) {
            return getItemHierarchy(item.id).then(function(hierarchy) {
                return saveUserHierarchy(hierarchy);
            });
        });
    } else {
        return getUserItemByItemId(res.id).then(function(user_item) {
            if (user_item) {
                return user_item;
            }
            return getItemHierarchy(res.id).then(function(hierarchy) {
                return saveUserHierarchy(hierarchy);
            });
        });
    }
})
// I think this will only handle the reject for findItemByHash?
.fail(function(err) {
    console.log(err);
})
.done();
So, I guess my question is this. Are there better ways to handle this in Q?
Thanks!
One of the reasons why I love promises is how easy it is to handle errors. In your case, if any one of those promises fails, it will be caught at the fail clause you have defined. You can specify more fail clauses if you want to handle them on the spot, but it isn't required.
As a quick example, sometimes I want to handle errors and return something else instead of passing along the error. I'll do something like this:
function awesomeFunction() {
    var fooPromise = getFoo().then(function() {
        return 'foo';
    }).fail(function(reason) {
        // handle the error HERE, return the string 'bar'
        return 'bar';
    });
    return fooPromise;
}

awesomeFunction().then(function(result) {
    // `result` will either be "foo" or "bar" depending on if the `getFoo()`
    // call was successful or not inside of `awesomeFunction()`
}).fail(function(reason) {
    // This will never be called even if the `getFoo()` function fails
    // because we've handled it above.
});
Now as for your question on getting out of "return hell" - as long as the next function doesn't require information about the previous one, you can chain .then clauses instead of nesting them:
doThis().then(function(foo) {
    return thenThis(foo.id).then(function(bar) {
        // `thenThat()` doesn't need to know anything about the variable
        // `foo` - it only cares about `bar` meaning we can unnest it.
        return thenThat(bar.id);
    });
});

// same as the above
doThis().then(function(foo) {
    return thenThis(foo.id);
}).then(function(bar) {
    return thenThat(bar.id);
});
To reduce it further, make functions that combine duplicate promise combinations and we're left with:
function getItemHierarchyAndSave(item) {
    return getItemHierarchy(item.id).then(function(hierarchy) {
        return saveUserHierarchy(hierarchy);
    });
}

findItemByHash(hash).then(function(resItem) {
    if (!resItem) {
        return saveItem(hash).then(function(savedItem) {
            return getItemHierarchyAndSave(savedItem);
        });
    }
    return getUserItemByItemId(resItem.id).then(function(userItem) {
        return userItem || getItemHierarchyAndSave(resItem);
    });
})
.fail(function(err) { console.log(err); })
.done();
Disclaimer: I don't use Q promises; I prefer when promises, primarily for the extra goodies they come with, but the principles are the same.

Asynchronous Database Queries with PostgreSQL in Node not working

Using Node.js and the node-postgres module to communicate with a database, I'm attempting to write a function that accepts an array of queries and callbacks and executes them all asynchronously using the same database connection. The function accepts a two-dimensional array and calling it looks like this:
perform_queries_async([
    ['SELECT COUNT(id) as count FROM ideas', function(result) {
        console.log("FUNCTION 1");
    }],
    ["INSERT INTO ideas (name) VALUES ('test')", function(result) {
        console.log("FUNCTION 2");
    }]
]);
And the function iterates over the array, creating a query for each sub-array, like so:
function perform_queries_async(queries) {
    var client = new pg.Client(process.env.DATABASE_URL);

    for (var i = 0; i < queries.length; i++) {
        var q = queries[i];
        client.query(q[0], function(err, result) {
            if (err) {
                console.log(err);
            } else {
                q[1](result);
            }
        });
    }

    client.on('drain', function() {
        console.log("drained");
        client.end();
    });

    client.connect();
}
When I ran the above code, I expected to see output like this:
FUNCTION 1
FUNCTION 2
drained
However, the output bizarrely appears like so:
FUNCTION 2
drained
FUNCTION 2
Not only is the second function getting called for both requests, it also seems as though the drain code is getting called before the client's queue of queries is finished running...yet the second query still runs perfectly fine even though the client.end() code ostensibly killed the client once the event is called.
I've been tearing my hair out about this for hours. I tried hardcoding in my sample array (thus removing the for loop), and my code worked as expected, which leads me to believe that there is some problem with my loop that I'm not seeing.
Any ideas on why this might be happening would be greatly appreciated.
The simplest way to properly capture the value of the q variable in a closure in modern JavaScript is to use forEach:
queries.forEach(function(q) {
    client.query(q[0], function(err, result) {
        if (err) {
            console.log(err);
        } else {
            q[1](result);
        }
    });
});
If you don't capture the value, your code sees the last value that q had, because the callback function executes later, in the context of the containing function.
forEach, by using a callback function, isolates and captures the value of q so it can be properly evaluated by the inner callback.
A victim of the famous JavaScript closure/loop gotcha. See my (and other) answers here:
I am trying to open 10 websocket connections with nodejs, but somehow my loop doesnt work
Basically, at the time your callback is executed, q is set to the last element of the input array. The way around it is to dynamically generate the closure.
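For reference, "dynamically generating the closure" is the classic IIFE pattern; a sketch applied to the loop above:

for (var i = 0; i < queries.length; i++) {
    (function(q) {
        // q is now a parameter of this inner function, so each query's
        // callback closes over its own copy instead of the shared loop
        // variable.
        client.query(q[0], function(err, result) {
            if (err) {
                console.log(err);
            } else {
                q[1](result);
            }
        });
    })(queries[i]);
}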
It would be good to execute this using the async module. It will also help you reuse the code and make it more readable. I just love the auto function provided by the async module.
Ref: https://github.com/caolan/async
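For illustration, a rough sketch of the same function using async (eachSeries runs the queries strictly one after another, which also sidesteps the closure problem):

var async = require('async');

function perform_queries_async(queries) {
    var client = new pg.Client(process.env.DATABASE_URL);
    client.connect();

    // Each [sql, callback] pair is processed one at a time; q is the
    // element currently in flight, so no loop variable is shared.
    async.eachSeries(queries, function(q, done) {
        client.query(q[0], function(err, result) {
            if (err) return done(err);
            q[1](result);
            done();
        });
    }, function(err) {
        if (err) console.log(err);
        client.end();
    });
}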

Node.js + SQLite async transactions

I am using node-sqlite3, but I am sure this problem appears in other database libraries too. I have discovered a bug in my code mixing transactions and async code.
function insertData(arrayWithData, callback) {
    // start a transaction
    db.run("BEGIN", function() {
        // do multiple inserts
        slide.asyncMap(
            arrayWithData,
            function(cb) {
                db.run("INSERT ...", cb);
            },
            function() {
                // all done
                db.run("COMMIT");
            }
        );
    });
}

// some other insert
setInterval(
    function() { db.run("INSERT ...", cb); },
    100
);
You can also run the full example.
The problem is that some other code with an insert or update query can be launched during the async pause after BEGIN or an insert. Then this extra query runs inside the transaction. This is not a problem when the transaction is committed. But if the transaction is rolled back, the change made by this extra query is also rolled back. Whoops, we've just unpredictably lost data without any error message.
I thought about this issue and I think that one solution is to create a wrapper class that will make sure that:
Only one transaction is running at the same time.
When a transaction is running, only queries which belong to the transaction are executed.
All the extra queries are queued and executed after the current transaction is finished.
All attempts to start a transaction when one is already running will also get queued.
But it sounds like too complicated a solution. Is there a better approach? How do you deal with this problem?
First, I would like to state that I have no experience with SQLite. My answer is based on a quick study of node-sqlite3.
The biggest problem with your code IMHO is that you try to write to the DB from different locations. As I understand SQLite, you have no control over different parallel "connections" as you do in PostgreSQL, so you probably need to wrap all your communication with the DB. I modified your example to always use the insertData wrapper. Here is the modified function:
function insertData(callback, cmds) {
    // start a transaction
    db.serialize(function() {
        db.run("BEGIN;");
        //console.log('insertData -> begin');
        // do multiple inserts
        cmds.forEach(function(item) {
            db.run("INSERT INTO data (t) VALUES (?)", item, function(e) {
                if (e) {
                    console.log('error');
                    // rollback here
                } else {
                    //console.log(item);
                }
            });
        });
        // all done
        // here should be commit
        //console.log('insertData -> commit');
        db.run("ROLLBACK;", function(e) {
            return callback();
        });
    });
}
The function is called with this code:
init(function() {
    // insert with transaction
    function doTransactionInsert(e) {
        if (e) return console.log(e);
        setTimeout(insertData, 10, doTransactionInsert, ['all', 'your', 'base', 'are', 'belong', 'to', 'us']);
    }
    doTransactionInsert();

    // Insert increasing integers 0, 1, 2, ...
    var i = 0;
    function doIntegerInsert() {
        //console.log('integer insert');
        insertData(function(e) {
            if (e) return console.log(e);
            setTimeout(doIntegerInsert, 9);
        }, [i++]);
    }
    ...
I made the following changes:
added a cmds parameter; for simplicity I added it as the last parameter, but the callback should be last (cmds is an array of inserted values; in the final implementation it should be an array of SQL commands)
changed db.exec to db.run (should be quicker)
added db.serialize to serialize requests inside the transaction
omitted the callback for the BEGIN command
left out slide and some of the underscore usage
Your test implementation now works fine for me.
I ended up writing a full wrapper around sqlite3 that implements locking the database in a transaction. When the DB is locked, all queries are queued and executed after the current transaction is over.
https://github.com/Strix-CZ/sqlite3-transactions
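The core of such a wrapper can be quite small. A minimal sketch (the names are mine, not the library's): a queue that gives one job at a time exclusive use of the connection:

function TransactionQueue(db) {
    this.db = db;
    this.jobs = [];
    this.busy = false;
}

// Each job receives the db plus a done callback; nothing else touches
// the connection until done is called.
TransactionQueue.prototype.run = function(job) {
    this.jobs.push(job);
    this._next();
};

TransactionQueue.prototype._next = function() {
    if (this.busy || this.jobs.length === 0) return;
    this.busy = true;
    var self = this;
    this.jobs.shift()(this.db, function done() {
        self.busy = false;
        self._next(); // run whatever queued up in the meantime
    });
};

A transaction then becomes a single job: BEGIN, the inserts, COMMIT or ROLLBACK, and finally done() to release the queue.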
IMHO there are some problems with ivoszz's answer:
Since all db.run calls are async, you cannot check the result of the whole transaction, and if one run results in an error you should roll back all commands. To do this you have to call db.run("ROLLBACK") in the callback of the forEach loop. The db.serialize function will not serialize those async runs, and so a "cannot start a transaction within a transaction" error occurs.
The "COMMIT/ROLLBACK" after the forEach loop has to check the result of all statements, and you cannot run it before all the previous runs have finished.
IMHO there is only one way to make thread-safe (referring to the background thread pool) transaction management: create a wrapper function and use the async library to manually serialize all statements. This way you can avoid the db.serialize function and (more importantly) you can check each individual db.run result in order to roll back the whole transaction (and return a promise if needed).
The main problem of the node-sqlite3 library related to transactions is that there is no callback in the serialize function to check whether an error occurred.
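A sketch of what that manually serialized wrapper could look like (assuming the async library, and checking every db.run result so the first failure rolls the transaction back):

var async = require('async');

function insertDataSafe(values, callback) {
    db.run("BEGIN", function(e) {
        if (e) return callback(e);
        // Run the inserts strictly one after another so each result
        // can be checked before the next statement starts.
        async.eachSeries(values, function(value, next) {
            db.run("INSERT INTO data (t) VALUES (?)", value, next);
        }, function(err) {
            if (err) {
                // One insert failed: abort the whole transaction.
                return db.run("ROLLBACK", function() { callback(err); });
            }
            db.run("COMMIT", callback);
        });
    });
}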
