Is this a "proper" way to run Firebase transactions that depend on each other sequentially using the NodeJS client:
ref.child('relationships/main').child(accountID).transaction(function(data) {
    // modify data and return the new value for this location
    return data;
}, function(error, committed, snapshot) {
    if (error) {
        // the transaction failed outright
    } else if (!committed) {
        // the update function aborted by returning undefined
    } else {
        runNextTransaction();
    }
});
Originally I was going to put runNextTransaction() in the core function because transactions first run locally, but wouldn't that then hold open the original transaction until the last transaction in the chain is complete, possibly causing issues? (Also I need good data for the next step so I would have to handle collisions before moving on.)
Transactions run asynchronously, so kicking off the next transaction from within the first one would work, but it may not do what you want. Transaction functions can be run more than once, and you likely don't want to initiate multiple secondary transactions in that case. What you have looks like the right way to do serial transactions. If you're interested in making things a little cleaner, especially if you're going to chain multiple transactions, consider looking into Promises.
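For example, here is a minimal sketch of that chaining with promises, assuming a hypothetical runTransaction() helper that wraps the callback-style transaction() API (nextRef and nextUpdate stand in for the second step):

// hypothetical helper: wraps the callback-style transaction() in a promise
function runTransaction(ref, update) {
    return new Promise(function (resolve, reject) {
        ref.transaction(update, function (error, committed, snapshot) {
            if (error) return reject(error);
            resolve({ committed: committed, snapshot: snapshot });
        });
    });
}

// serial transactions become a flat chain instead of nested callbacks
runTransaction(ref.child('relationships/main').child(accountID), function (data) {
    // modify data and return the new value
    return data;
})
.then(function (result) {
    if (!result.committed) throw new Error('first transaction aborted');
    return runTransaction(nextRef, nextUpdate);
})
.catch(function (error) {
    // errors from any step in the chain land here
});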
Scenario:
Some code is listening on collection A for changes. When one occurs, it does some calculation and updates collection B.
Time between changes in A: 20-50ms.
Time for actual calculation: 20-30ms.
Time for roundtrip sending updates to firebase: 250-300ms.
So the code is something like this:
const runUpdates = async (snapshot) => {
    const inputData = (snapshot && snapshot.exists() && snapshot.val()) || undefined
    if (inputData) {
        const calculatedData = calculateStuff(inputData)
        await firebase.database().ref().update({ 'collectionB': calculatedData })
    }
}

firebase.database().ref('collectionA').on('value', runUpdates)
I'm using Firebase Realtime Database.
Actual question:
Does the firebase package (using the local cache or any other means necessary) ensure that the updates will be applied in firebase in the same order that I issued them in my code, or do I need to await every update before I can move on to my next computation & update?
More details:
There is a mechanism in place for cases where there is a trigger event but the calculation/update is not yet finished. I'm purposefully ignoring that here for clarity.
I'm trying to improve this code and it seems that in many cases the calculation is relatively short, but then I need to wait for the response from firebase to start the next calculation.
I've been told that firebase has a local cache (server-side) and that my firebase update command actually updates that cache locally (and is therefore "immediate"), then works to propagate the change to firebase itself. Any subsequent updates would also be propagated in order, so the sequence is preserved.
(Needless to say, I tried looking around for this info in the docs etc)
Queries to Realtime Database are pipelined over a single socket connection. The results will be delivered in the order that the queries were issued.
If you need to know when the results of a write have been fully committed to the server, you will need to pay attention to the promise returned by update(). That promise will become fulfilled only after the write completes on the server, not merely when the changes are available locally.
Whether you use await or then on that promise doesn't really matter. Either way, you will know the result of the update.
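To make the distinction concrete, here is a small sketch in the style of the question's code (firstResult, secondResult, and thirdResult are placeholder values):

const writeAll = async () => {
    // Fire-and-forget: both writes are pipelined over the single socket
    // and are applied on the server in the order they were issued
    firebase.database().ref().update({ 'collectionB': firstResult })
    firebase.database().ref().update({ 'collectionB': secondResult })

    // Awaiting matters only when you need to know the server has committed
    await firebase.database().ref().update({ 'collectionB': thirdResult })
    // here the write has completed on the server, not just locally
}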
My synchronous code is nearly a thousand lines long. I want to divide it into groups and put them in async.auto (one group is one function in async.auto). Each function has a name. I'm doing this to make the code easier for other people to maintain in the future: divided into groups, it will be easier to understand. I want to know whether async.auto causes a performance loss compared with not using it.
do some stuff;
do some stuff;
do some stuff;
...
do some stuff;
I want to change it to the following:
async.auto({
    do_A: function(cb){
        // do some stuff;
        // do some stuff;
        cb(null, resultOfA); // pass along whatever do_B needs
    },
    // note: in async 1.x dependent tasks receive (cb, results);
    // async 2.x reversed the order to (results, cb)
    do_B: ['do_A', function(cb, results){
        // do some stuff with results.do_A;
        cb(null);
    }]
}, function(err, results){
    // runs once do_A and do_B have both completed
})
You should definitely be able to divide it up and put it into async.auto as separate functions, but often with large blocks of synchronous code there is a lot of coupling between different sections without you realising it. My advice is to split it up very carefully, testing each time you create a new group and committing the change to SCM (e.g. git) before beginning the next change. This way, when you discover problems, you can go back and find out where you introduced them.
I don't know enough to say what the performance impact would be but I would think it would be minimal. Your best bet (as with any performance question) is to test it in a profiler. You won't get any performance benefits either unless you let it run some of the functions out of order. If every block depends on the previous one then it will all just be run in sequence.
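If you want a rough number before reaching for a full profiler, a quick-and-dirty sketch with console.time can give a first impression (doStuff stands in for one of your groups; the argument order shown is for async 1.x, matching the question):

var async = require('async');

function doStuff(cb) {
    // do some stuff;
    cb(null);
}

console.time('plain');
doStuff(function () {
    console.timeEnd('plain');

    console.time('async.auto');
    async.auto({
        do_A: doStuff,
        do_B: ['do_A', function (cb, results) { doStuff(cb); }]
    }, function (err, results) {
        console.timeEnd('async.auto');
    });
});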
In Java, I am used to try..catch, with finally to cleanup unused resources.
In Node.JS, I don't have that ability.
Odd errors can occur; for example, the database could shut down at any moment, any single table or file could be missing, etc.
With nested calls to db.query(..., function(err, results){ ... }), it becomes tedious to write if (err) { send500(res); return; } every time, especially if I also have to clean up resources; for example, db.end() would definitely be appropriate.
How can one write async code that gets the effect of both catch and finally blocks?
I am already aware of the ability to restart the process, but I would like to use that as a last-resort only.
A full answer to this is pretty in depth, but it's a combination of:
consistently handling the error positional argument in callback functions. Doubling down here should be your first course of action.
You will see #izs refer to this as "boilerplate" because you need a lot of this whether you are doing callbacks or promises or flow control libraries. There is no great way to totally avoid this in node due to the async nature. However, you can minimize it by using things like helper functions, connect middleware, etc. For example, I have a helper callback function I use whenever I make a DB query and intend to send the results back as JSON for an API response. That function knows how to handle errors, not found, and how to send the response, so that reduces my boilerplate substantially.
use process.on('uncaughtException') as per #izs's blog post
use try/catch for the occasional synchronous API that throws exceptions. Rare but some libraries do this.
consider using domains. Domains will get you closer to the java paradigm but so far I don't see that much talk about them which leads me to expect they are not widely adopted yet in the node community.
consider using cluster. While not directly related, it generally goes hand in hand with this type of production robustness.
some libraries have top-level error events. For example, if you are using mongoose to talk to mongodb and the connection suddenly dies, the connection object will emit an error event
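For instance, a minimal sketch of that last case with mongoose (the connection string is a placeholder):

var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost/mydb');

// if the connection dies after startup, the connection object emits
// 'error' instead of throwing or invoking one of your callbacks
mongoose.connection.on('error', function (err) {
    console.error('mongodb connection error', err);
});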
Here's an example. The use case is a REST/JSON API backed by a database.
//shared error handling for all your REST GET requests
function normalREST(res, error, result) {
    if (error) {
        log.error("DB query failed", error);
        res.status(500).send(error);
        return;
    }
    if (!result) {
        res.status(404).send();
        return;
    }
    res.send(result); //handles arrays or objects OK
}

//Here's a route handler for /users/:id
function getUser(req, res) {
    db.User.findById(req.params.id, normalREST.bind(null, res));
}
And I think my takeaway is that overall in JavaScript itself, error handling is basically woefully inadequate. In the browser, you refresh the page and get on with your life. In node, it's worse because you're trying to write a robust and long-lived server process. There is a completely epic issue comment on github that goes into great detail about how things are just fundamentally broken. I wouldn't get your hopes up of ever having JavaScript code you can point at and say "Look, Ma, state-of-the-art error handling". That said, in practice if you follow the points I listed above, empirically you can write programs that are robust enough for production.
See also The 4 Keys to 100% Uptime with node.js.
I am running a transaction to update an item that needs to be stored in two keys. To accomplish this, I have setup a nested transaction as follows, and it seems to run as expected:
firebaseOOO.child('relationships/main').child(accountID).child(friendAccountID).transaction(function(data) {
    data.prop = 'newval';
    firebaseOOO.child('relationships/main').child(friendAccountID).child(accountID).transaction(function(data) {
        data.prop = 'newval';
        return data;
    });
    return data;
});
Are there any gotchas or possible unexpected implications to this? I am most worried about getting stuck in some sort of transaction loop under load, where each transaction cancels the other out forcing them both to restart, or similar.
Is there a better way of doing this?
I am using the NodeJS client.
You probably don't want to start another transaction from within the callback to the first one. There is no guarantee as to how many times the function for your first transaction will run, particularly if there is a lot of contention at the location you are trying to update.
A better solution, which I believe you hit on in your other question, is to start the second transaction from the completion callback, after checking that the first one committed.
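Applied to the two-key update from the question, that looks roughly like this (same refs as above; a sketch, not a drop-in implementation):

firebaseOOO.child('relationships/main').child(accountID).child(friendAccountID)
    .transaction(function (data) {
        data.prop = 'newval';
        return data;
    }, function (error, committed, snapshot) {
        if (error || !committed) return; // handle or retry as appropriate
        // only start the mirrored write once the first has committed
        firebaseOOO.child('relationships/main').child(friendAccountID).child(accountID)
            .transaction(function (data) {
                data.prop = 'newval';
                return data;
            });
    });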
I'm not talking about real money transactions
The project I'm working on is a game where players trade stuff between each other. It's basically a transaction process: player A gives player B 10 groats in exchange for thirty cows; you get the idea.
But as it's interactive, with many players at once all trading randomly in a chatroom-like environment, I wondered if it was possible to do such a thing with node.js, but I see problems.
I come from a DB background, where transaction processing and the nature of rollback and commit are necessary to keep the DB in a healthy state. But if we're talking about node.js plus mongoDB (or any other noSQL DB for that matter), that is surely a whole different mentality. I just don't see how it could handle a trade where only two parties should be involved without resorting to some form of locking, and surely that's not what node is about.
I haven't found anything yet, but that does not surprise me because node.js is so new.
UPDATE: I am aware of the mechanics of a transaction, and in particular banking-style transactions, but this is not the same thing. I may not have made it clear, but the issue is that player B is selling something to a community of buyers.
That means that although player A initiates a buy instruction on the client side, it is also possible that around the same time player C, D, or E also clicks to buy the same cow.
Now in a normal transaction, it is expected that the first person who obtains a record-level lock blocks the other parties from proceeding at that point in time.
However, the nature of node, in particular its speed, concurrent processing, and use for displaying real-time database updates, means that I could easily imagine the slowest person (we're talking milliseconds) winning.
For example Player A initiates the purchase at the same time as player C. Player A transaction completes and the Groats are paid to Player B, and the Cow is assigned to Player A on the database. A millisecond later the Cow is assigned to player C.
I hope that explains the issue better.
This has nothing to do with Node.JS. Node.JS only connects to the database; transactions are done by the database itself (unless you want to implement transactions manually in Node.JS, which might be a difficult task, but that's the same for any web server written in any language).
You can easily use (for example) MySQL with Node.JS, which supports transactions. So the question you are asking is: can I do transactions with MongoDB? The answer is: no and yes.
No, because MongoDB does not support transactions out of the box.
Yes, because you can use some tricks to emulate transactions. See for example this article.
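The usual trick is a two-phase-commit-style pattern: record the transfer in its own document and move it through states, so an interrupted transfer can be detected and repaired. A heavily simplified sketch in shell syntax (collection and field names are illustrative, and real code needs state checks and recovery logic):

// 1. record the intent in its own document
db.transactions.insert({ _id: txnId, state: 'initial', from: 'A', to: 'B', amount: 10 })

// 2. apply both sides, tagging each account with the transaction id
//    (the $ne guard makes each update safe to retry)
db.accounts.update({ _id: 'A', pendingTxns: { $ne: txnId } },
                   { $inc: { balance: -10 }, $push: { pendingTxns: txnId } })
db.accounts.update({ _id: 'B', pendingTxns: { $ne: txnId } },
                   { $inc: { balance: 10 }, $push: { pendingTxns: txnId } })

// 3. mark the transfer done and clear the pending markers
db.transactions.update({ _id: txnId }, { $set: { state: 'done' } })
db.accounts.update({ _id: 'A' }, { $pull: { pendingTxns: txnId } })
db.accounts.update({ _id: 'B' }, { $pull: { pendingTxns: txnId } })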
To do banking-style transactions with a document database, it is typical to use a transaction log pattern. In this pattern you write each transaction as its own document. You do not maintain documents corresponding to each account balance. Instead, you roll the transaction documents up at query time to give the current balance(s).
Here is an example that is applicable to CouchDB map reduce: http://guide.couchdb.org/draft/recipes.html
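In CouchDB terms, that rollup is a map/reduce view: the map emits one row per transaction document and the reduce sums them, so querying the view by account key yields the current balance (the document shape here is illustrative):

// map: one row per transaction document, keyed by account
function (doc) {
    if (doc.type === 'txn') {
        emit(doc.account, doc.amount);
    }
}

// reduce: sum the amounts (sum() is built into CouchDB reduce functions)
function (keys, values, rereduce) {
    return sum(values);
}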
I'm working on an oss application-level transaction database for node.js called Waterline.
We started with full-on CRUD mutexes, but soon realized that was pretty hard. Better to leave that to the database. But sometimes you don't want to, because you want to be able to switch databases and keep your code agnostic. So then we simplified down to the next easiest piece: named transactions.
So, no rollback support built in (you'll still have to do that yourself for now), but at least Waterline prevents concurrent access for you.
In your example (assuming you're in express/connect/Sails) it might look something like:
function buyCow (req, res) {
    Cow.find(req.param('cowId'), function (err, cow) {
        if (err) return res.send(500, err);

        Cow.transaction('buy_cow', function (err, unlock) {
            if (err) {
                // Notice how I unlock before each exit point? REALLY important.
                // (would love to hear thoughts on an easier way to do this)
                unlock();
                return res.send(500, err);
            }

            User.find(req.session.userId, function (err, user) {
                // If there's not enough cash, send an error
                if (user.money - cow.price < 0) {
                    unlock();
                    return res.send(500, 'Not enough cash!');
                }

                // Update the user's bank account
                User.update(user.id, {
                    money: user.money - cow.price
                }, function (err, user) {
                    if (err) { unlock(); return res.send(500, err); }

                    Cow.update(cow.id, { owner: user.id }, function (err, cow) {
                        if (err) { unlock(); return res.send(500, err); }

                        // Success!
                        unlock();
                        res.json({ success: true });
                    });
                });
            });
        });
    });
}
Hope that helps. I welcome your feedback (and maybe commits?)