My synchronous code is nearly a thousand lines long. I want to divide it into groups and put them in async.auto (one group is one function in async.auto), with each function given a name. I am doing this to make the code easier for other people to maintain in the future; dividing it into groups makes it easier to understand. What I want to know is whether async.auto causes any performance loss compared with not using it. Currently the code looks like this:
do some stuff;
do some stuff;
do some stuff;
...
do some stuff;
I want to change it to the following:
async.auto({
    do_A: function(cb) {
        // do some stuff;
        // do some stuff;
        cb(null, resultOfA); // each task must call its callback when it is done
    },
    do_B: ['do_A', function(cb, results) {
        // do some stuff with results.do_A;
        // do some stuff with results.do_A;
        cb(null);
    }]
}, function(err, results) {
    // final callback
});
You should definitely be able to divide it up and put the pieces into async.auto as separate functions, but with large blocks of synchronous code there is often a lot of coupling between different sections without you realising it. My advice is to split it up very carefully, testing each time you create a new group and committing the change to SCM (e.g. git) before beginning the next change. That way, when you discover problems you can go back and find out where you introduced them.
I don't know enough to say exactly what the performance impact would be, but I would expect it to be minimal. Your best bet (as with any performance question) is to test it in a profiler. Note that you won't get any performance benefits either unless you let it run some of the functions out of order: if every block depends on the previous one, it will all just run in sequence.
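To make the ordering point concrete, here is a minimal sketch (the task names and timings are made up, and it uses the older async 1.x callback signature that matches the question; async 2.x passes (results, callback) instead). Tasks with no dependencies start immediately and run concurrently; dependent tasks wait for their results:

var async = require('async');

async.auto({
    // no dependencies: both start immediately and overlap
    load_users: function (cb) {
        setTimeout(function () { cb(null, ['alice', 'bob']); }, 100);
    },
    load_config: function (cb) {
        setTimeout(function () { cb(null, { limit: 1 }); }, 100);
    },
    // waits for both of the above and receives their results
    build_report: ['load_users', 'load_config', function (cb, results) {
        cb(null, results.load_users.slice(0, results.load_config.limit));
    }]
}, function (err, results) {
    if (err) { return console.error(err); }
    console.log(results.build_report); // [ 'alice' ]
});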
Is there any limit on how many functions I can declare in the Phaser game update loop & does the performance decrease if there are a lot of functions in the update loop?
Declaring and Calling Functions
There's a difference between declaring a function
function foo(n) {
return n + 1;
}
and calling a function:
var bar = foo(3);
If you really mean declare, you can indeed declare functions within update, since JavaScript supports nesting and closures:
function update() {
    function updateSomeThings() {
        // ...
    }
    function updateSomeOtherThings() {
        // ...
    }
}
This has negligible performance impact, since this snippet doesn't actually call any of these functions. If however later in update you called them:
updateSomeThings();
updateSomeOtherThings();
then yes there is a cost.
Note: You don't have to declare functions within update itself to call them! You can call functions declared elsewhere, as long as they're in scope. It's worth looking at a JavaScript guide if this is too confusing.
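For example, a minimal sketch (the helper name is made up, not part of Phaser) of a helper declared at module scope and called from update:

// declared once, outside update
function updateSomeThings() {
    // move sprites, advance timers, etc.
}

function update() {
    updateSomeThings(); // calling it here is what actually costs time each frame
}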
The Cost of Function Calls
Every function you call takes time to execute. How long it takes depends on how complex the function is (how much work it does), and it may call other functions which also take time to execute. This may be obvious, but a function's total execution time is the sum of the execution time of all the code within that function, including the time taken by any functions it calls (and any functions they call, and so on).
Frame Rate
Phaser by default will aim to run at 60 frames per second, which is pretty standard for games. This means it will try to update and draw your game 60 times every second. Phaser does other things apart from calling your update function each time, not least of which is drawing your game, and it has other housekeeping to do as well. Depending on the game, the bulk of your frame time may end up being taken by either updates or drawing.
You certainly want to take less than 1/60th of a second (approx. 16 milliseconds) to complete your update, and that's assuming the game is incredibly quick for Phaser to draw.
Some things you do in Phaser are slower than others. Some developers have been doing this long enough to estimate what will be too slow to work, but many 2D games will be just fine without taking too much care over optimization (making things run more efficiently in terms of memory used or time taken).
Good and Bad Ideas
Some bad ideas: 50,000 sprites onscreen will often take far too long to draw even if you never update them (though some machines are very powerful, especially when Phaser is set to use WebGL). 10,000 sprites bouncing and colliding with each other will often take far too long to update for collision detection, even though some machines may be able to draw them just fine.
The best advice is to do everything you have to, but nothing you don't. Try to keep your design as simple as possible when getting started. Add complexity via interesting game mechanics, rather than by computationally expensive logic.
If all else fails, sometimes you can split work across multiple updates, or there may be some things you can do every other update or every n updates (which works best if there's different work you can do on the other updates, so you don't just have some updates slower than others).
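As a minimal sketch of the every-n-updates idea (the counter and the work functions are made up, not part of Phaser):

var frameCount = 0;

function update() {
    frameCount++;

    updateCriticalThings(); // cheap, must-run-every-frame work

    // spread the expensive work so each update only pays for half of it
    if (frameCount % 2 === 0) {
        updateExpensiveThingA(); // even frames
    } else {
        updateExpensiveThingB(); // odd frames
    }
}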
In Java, I am used to try..catch, with finally to clean up unused resources.
In Node.JS, I don't have that ability.
Odd errors can occur for example the database could shut down at any moment, any single table or file could be missing, etc.
With nested calls to db.query(..., function(err, results){..., it becomes tedious to write if(err) {send500(res); return} every time, especially if I have to clean up resources; for example, calling db.end() would definitely be appropriate.
How can one write code that makes async catch and finally blocks both be included?
I am already aware of the ability to restart the process, but I would like to use that as a last-resort only.
A full answer to this is pretty in depth, but it's a combination of:
consistently handling the error positional argument in callback functions. Doubling down here should be your first course of action.
You will see @izs refer to this as "boilerplate" because you need a lot of it whether you are doing callbacks or promises or flow control libraries. There is no great way to totally avoid this in node due to its async nature. However, you can minimize it by using things like helper functions, connect middleware, etc. For example, I have a helper callback function I use whenever I make a DB query and intend to send the results back as JSON for an API response. That function knows how to handle errors, not found, and how to send the response, so it reduces my boilerplate substantially.
use process.on('uncaughtException') as per @izs's blog post
use try/catch for the occasional synchronous API that throws exceptions. Rare, but some libraries do this (see the sketch after this list).
consider using domains. Domains will get you closer to the Java paradigm, but so far I don't see much talk about them, which leads me to think they are not yet widely adopted in the node community.
consider using cluster. While not directly related, it generally goes hand in hand with this type of production robustness.
some libraries have top-level error events. For example, if you are using mongoose to talk to MongoDB and the connection suddenly dies, the connection object will emit an error event.
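For the try/catch point above, a minimal sketch (rawBody is just a stand-in for whatever string you are parsing, and it reuses the log/res style of the example below; JSON.parse is a common synchronous API that throws):

var payload;
try {
    payload = JSON.parse(rawBody); // throws a SyntaxError on malformed JSON
} catch (err) {
    log.error("Bad JSON in request body", err);
    res.status(400).send({error: "invalid JSON"});
    return;
}
// continue working with the parsed payload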
Here's an example. The use case is a REST/JSON API backed by a database.
//shared error handling for all your REST GET requests
function normalREST(res, error, result) {
    if (error) {
        log.error("DB query failed", error);
        res.status(500).send(error);
        return;
    }
    if (!result) {
        res.status(404).send();
        return;
    }
    res.send(result); //handles arrays or objects OK
}

//Here's a route handler for /users/:id
function getUser(req, res) {
    db.User.findById(req.params.id, normalREST.bind(null, res));
}
And I think my takeaway is that, overall, error handling in JavaScript itself is basically woefully inadequate. In the browser, you refresh the page and get on with your life. In node, it's worse, because you're trying to write a robust, long-lived server process. There is a completely epic issue comment on GitHub that goes into great detail about how things are just fundamentally broken. I wouldn't get your hopes up of ever having JavaScript code you can point at and say "Look, Ma, state-of-the-art error handling". That said, in practice, if you follow the points I listed above, you can empirically write programs that are robust enough for production.
See also The 4 Keys to 100% Uptime with node.js.
FYI: I am doing this already with Web Workers, and it works fine, but I was just exploring what can and can't be done with process.nextTick.
So I have an array of a million elements that I'm sorting in Node.JS. I want Node to be responsive to other requests while it's doing this.
Is there any way to make Array.prototype.sort() not block other processes? Since this is a core function, I can't insert any process.nextTick().
I could implement quicksort manually, but I can't see how to do that efficiently in continuation-passing style, which seems to be required for process.nextTick(). I can modify a for loop to do this, but sort() seems impossible.
While it's not possible to make Array.prototype.sort itself behave asynchronously, asynchronous sorting is definitely possible, as shown by this sorting demo, which demonstrates the speed advantage of setImmediate (shim) over setTimeout.
The source code does not seem to come with any license, unfortunately. The GitHub repo for the demo at https://github.com/jphpsf/setImmediate-shim-demo names Jason Weber as the author. You may want to ask him if you want to use (parts of) the code.
I think that if you use setImmediate (available since Node 0.10) the individual sort operations will be effectively interleaved with I/O callbacks. For such a large amount of work, I would not recommend process.nextTick (if it works at all, given the maxTickDepth limit of 1000). See setImmediate vs. nextTick for some background.
Using setImmediate instead of plain "synchronous" processing will certainly be slower overall, so you could choose to handle a batch of individual sort operations per "tick" to speed things up, at the expense of Node not being responsive during that time. I think the right balance between speed and responsiveness with respect to I/O can only be found through experimentation.
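As a minimal sketch of that batching idea (this is not the demo's code; the chunk size and helper names are made up), here is the general shape of interleaving chunks of work with the event loop via setImmediate:

// Process `items` in chunks, yielding to the event loop between chunks
// so pending I/O callbacks get a chance to run.
function processInChunks(items, chunkSize, workFn, done) {
    var i = 0;
    (function nextChunk() {
        var end = Math.min(i + chunkSize, items.length);
        for (; i < end; i++) {
            workFn(items[i]); // one unit of work, e.g. one step of the sort
        }
        if (i < items.length) {
            setImmediate(nextChunk); // larger chunkSize = faster overall, less responsive
        } else {
            done();
        }
    })();
}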
A much simpler alternative would be to do it more like web workers: spawn a child process and do the sorting there. The biggest problem you face then is transferring the sorted data back to your main process (to generate some kind of output, presumably). AFAIK there's nothing like transferable objects for Node.js. After having buffered the sorted array, you could stream the results to the child process stdout and parse the data in the main process, or, perhaps simpler, use child process messaging.
You may not have a spare CPU core lying around, so the child process would eat into some other process's CPU time. To keep the sort process from hurting your other processes, you may need to assign it a low priority. It's seemingly not possible to do this with Node itself, but you could try using nice, as discussed here: https://groups.google.com/forum/#!topic/nodejs/9O-2gLJzmcQ . I have no experience in this matter.
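A minimal sketch of the child-process approach (the file names are made up; it uses child process messaging, which serializes the array, so it is not free for a million elements):

// main.js
var fork = require('child_process').fork;

function sortInChild(array, callback) {
    var child = fork(__dirname + '/sort-worker.js');
    child.once('message', function (sorted) {
        callback(null, sorted);
        child.kill();
    });
    child.send(array);
}

// sort-worker.js
process.on('message', function (array) {
    array.sort(function (a, b) { return a - b; });
    process.send(array);
});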
Well, I initially thought you could use async.sortBy, but upon closer examination it seems that won't behave as you need. See Array.sort and Node.js for a similar question, although at the moment there's no accepted answer.
I know this is a rather old question, but I came across a similar situation, with still no simple solution that I could find.
I modified an existing quicksort and published a package that periodically yields execution to the event loop:
https://www.npmjs.com/package/qsort-async
If you are familiar with a traditional quicksort, my only modification was to the initial function which does the partitioning. Basically the function still modifies the array in place, but it now returns a promise. It yields execution to other things in the event loop if it tries to process too many elements in a single iteration. (I believe the default size I specified was 10000.)
Note: It's important to use setImmediate here and not process.nextTick or setTimeout. nextTick will actually place your execution before process I/O and you will still have issues responding to other requests tied to I/O. setTimeout is just too slow (which I believe one of the other answers linked a demo for).
Note 2: If something like a mergesort is more your style, you could apply the same kind of logic in the 'merge' function of the sort.
const immediate = require('util').promisify(setImmediate);

async function quickSort(items, compare, left, right, size) {
    let index;
    if (items.length > 1) {
        // partition in place, exactly as a normal quicksort would
        index = partition(items, compare, left, right);
        // if this slice was large, yield to the event loop before recursing
        if (Math.abs(left - right) > size) {
            await immediate();
        }
        if (left < index - 1) {
            await quickSort(items, compare, left, index - 1, size);
        }
        if (index < right) {
            await quickSort(items, compare, index, right, size);
        }
    }
    return items;
}
The full code is here: https://github.com/Mudrekh/qsort-async/blob/master/lib/quicksort.js
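For readers who do not want to follow the link, here is a generic in-place partition of the kind that pairs with the recursion above (illustrative only; the package's actual partition function may differ):

function partition(items, compare, left, right) {
    var pivot = items[Math.floor((left + right) / 2)];
    var i = left;
    var j = right;
    while (i <= j) {
        while (compare(items[i], pivot) < 0) { i++; }
        while (compare(items[j], pivot) > 0) { j--; }
        if (i <= j) {
            var tmp = items[i];   // swap the out-of-place pair
            items[i] = items[j];
            items[j] = tmp;
            i++;
            j--;
        }
    }
    return i; // the boundary used by the recursive calls
}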
Is this a "proper" way to run Firebase transactions that depend on each other sequentially using the NodeJS client:
ref.child('relationships/main').child(accountID).transaction(function(data) {
    // compute the new value (r) from the current data
    return r;
}, function(error, committed, snapshot) {
    if (error) { /* handle the error */ }
    else if (!committed) { /* handle the collision / abort */ }
    else {
        runNextTransaction();
    }
});
Originally I was going to put runNextTransaction() in the core function because transactions first run locally, but wouldn't that hold the original transaction open until the last transaction in the chain is complete, possibly causing issues? (Also, I need good data for the next step, so I would have to handle collisions before moving on.)
Transactions run asynchronously, so kicking off the next transaction from within the first one would work, but it may not do what you want. Transaction functions can run more than once, and you likely don't want to initiate multiple secondary transactions in that case. What you have looks like the right way to do serial transactions. If you're interested in making things a little cleaner, especially if you're going to chain multiple transactions, consider looking into Promises.
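A minimal sketch of the Promise idea (the helper and the update functions are made up; it just wraps the callback-based transaction() shown in the question, so no extra Firebase API is assumed):

// Wrap one transaction in a Promise so several can be chained with .then()
function runTransaction(ref, updateFn) {
    return new Promise(function (resolve, reject) {
        ref.transaction(updateFn, function (error, committed, snapshot) {
            if (error) { return reject(error); }
            resolve({ committed: committed, snapshot: snapshot });
        });
    });
}

runTransaction(ref.child('relationships/main').child(accountID), updateMain)
    .then(function (result) {
        if (!result.committed) { throw new Error('not committed'); }
        return runTransaction(nextRef, updateNext); // the next transaction in the chain
    })
    .catch(function (err) {
        // one place to handle errors from any step
    });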
What is currently the best practice for loading models (and this goes for all required files, I guess)?
I'm thinking of these two ways of achieving it (nonsense code to illustrate follows):
var Post = require('../models/post');

function findById(id) {
    return new Post(id);
}

function party() {
    return Post.getParty();
}

vs

function findById(id) {
    return new require('../models/post')(id);
}

function party() {
    return require('../models/post').getParty();
}
Is one of these snippets preferred? Are there considerable memory and time tradeoffs? Or is it just a premature optimization?
It's a premature optimization (calls to require() are cached and idempotent), but I'd personally call your first style better (loading dependencies during initialization rather than subsequent processing) since it's easier to get your head around what you're doing. Loading everything at the start will slightly slow down your startup (which is hardly ever an issue) in return for making most requests run slightly faster (which you shouldn't worry about unless you've identified a bottleneck and done some hardcore profiling).
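A quick way to convince yourself of the caching (the module path is taken from the question):

var a = require('../models/post');
var b = require('../models/post');

console.log(a === b); // true: the second call returns the cached module object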
You should definitely use the version with the single call to require at the beginning. Although it makes no difference with regard to how often the modules are loaded (they are only loaded once either way), there are performance issues with the second approach.
The problem is that require is one of only a few functions in Node.js that are blocking. That means that, as long as it runs, Node.js is not able to fulfill any incoming requests. On startup this is no problem: it just takes a little while until your application is up and running.
But you certainly don't want blocking moments once your application has already been running for a while.
So, if you do not have VERY special reasons for the second option, go with the first one.
I believe the second case is useful to avoid circular module dependencies, since the require() happens at run-time rather than load time.
Otherwise, I believe the first is (slightly?) faster and to me, quite a bit more readable.
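A minimal sketch of the circular-dependency point (file names are made up): if a.js and b.js require each other at load time, one of them sees a partially built exports object, whereas deferring the require into the function avoids that:

// a.js
var b = require('./b'); // loads b.js before helper is exported below
module.exports.helper = function () { return 42; };

// b.js -- a top-level require('./a') here would see an incomplete module.exports,
// so the require is deferred until the function is actually called
module.exports.doSomething = function () {
    var a = require('./a'); // by now a.js has finished loading
    return a.helper();
};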