I have a Node.js application which I'm testing with Mocha on a fast(ish) dev machine. I've noticed that sometimes the fast CPU will mask some errors. If the tests are run on a machine with a slower CPU, these errors start showing up.
Question: Is there an easy way to temporarily slow down, or simulate a slowdown in, CPU processing to surface these errors? Or is there a way to run these tests at full speed and still discover this type of error?
One possible reason for these discrepancies is that certain functions take longer to run depending on the machine, for instance if they involve heavy computation or reading from a DB. This can change the order in which callbacks are called.
To work around this problem, you can get more control over the order in which parallel sequences of operations run by using (for instance) Sinon.js in your tests: it has great spy/stub features and also ships with fake timers.
By mocking (stubbing) the async functions that take time to run, you remove the machine-dependent speed factor. Fake timers also give you control over functions wrapped in setTimeout or setInterval.
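For instance, here is a minimal sketch of how fake timers make timing-dependent callback order deterministic (it assumes Mocha and Sinon are installed; the test itself is illustrative):

```js
const sinon = require('sinon');
const assert = require('assert');

describe('delayed callback ordering', function () {
  let clock;
  beforeEach(function () { clock = sinon.useFakeTimers(); });
  afterEach(function () { clock.restore(); });

  it('fires callbacks in a deterministic order', function () {
    const calls = [];
    setTimeout(() => calls.push('slow'), 100);
    setTimeout(() => calls.push('fast'), 10);

    // Advance virtual time: no real waiting, no dependence on CPU speed.
    clock.tick(100);
    assert.deepStrictEqual(calls, ['fast', 'slow']);
  });
});
```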
Related
I have a simple node app with 1 function that defines 1000+ functions inside it (without running them).
When I call this function (the wrapper) around 200 times, the RSS memory of the process spikes from 100 MB to 1000 MB and immediately goes down. (The memory spike only happens after around 200 calls; the calls before that do not cause a memory spike, and neither do the calls after.)
This issue is happening to us in our node server in production, and I was able to reproduce it in a simple node app here:
https://github.com/gileck/node-v8-memory-issue
When I use --jitless or --no-opt the issue does not happen (no spikes), but obviously we do not want to remove all the V8 optimizations in production.
This issue must be caused by some specific V8 optimization. I tried a few other V8 flags, but none of them fixes the issue (only --jitless and --no-opt fix it).
Does anyone know which V8 optimization could cause this?
Update:
We found that --no-concurrent-recompilation fixes this issue (no memory spikes at all), but we still can't explain it. We are not sure why it happens or which code changes might fix it (without the flag).
As one of the answers suggests, moving all the 1000+ function definitions out of the main function would solve it, but then those functions would not be able to access the context of the main function, which is why they are defined inside it.
Imagine that you have a server and you want to handle a request.
Obviously, the request handler is going to run many times as the server gets a lot of requests from clients.
Would you define functions inside the request handler (so you can access the request context in those functions) or define them outside of the request handler and pass the request context as a parameter to all of them? We chose the first option... what do you think?
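To make the trade-off concrete, here is a minimal sketch of the two patterns (all names are hypothetical):

```js
// Option 1: helpers defined inside the handler, capturing the request
// via closure. Every call re-creates all the helper function objects;
// this is the pattern described in the question.
function handleRequestWithClosures(req) {
  function describeUser() { return `user ${req.userId}`; }
  function describeUrl() { return `url ${req.url}`; }
  return [describeUser(), describeUrl()];
}

// Option 2: helpers defined once at module scope; the context is passed
// explicitly, so the function objects are created only once.
function describeUser(ctx) { return `user ${ctx.userId}`; }
function describeUrl(ctx) { return `url ${ctx.url}`; }
function handleRequestWithContext(req) {
  return [describeUser(req), describeUrl(req)];
}

console.log(handleRequestWithClosures({ userId: 1, url: '/a' }));
console.log(handleRequestWithContext({ userId: 1, url: '/a' }));
```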
Does anyone know which V8 optimization could cause this?
Load Elimination.
I guess it's fair to say that any optimization could cause lots of memory consumption in pathological cases (such as: a nearly 14 MB monster of a function as input, wow!), but Load Elimination is what causes it in this particular case.
You can see for yourself when you run with --turbo-stats (and optionally --turbo-filter=foo to zoom in on just that function).
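For reference, such an invocation might look like this (foo stands for the name of the function under investigation, as above; the script name is a placeholder):

```
node --turbo-stats --turbo-filter=foo app.js
```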
You can disable Load Elimination if you feel that you must. A preferable approach would probably be to reorganize your code somewhat: defining 2,000 functions is totally fine, but the function that defines all these other functions probably shouldn't run so often that it gets optimized itself. You'll avoid not only this particular issue, but also get better efficiency in general, if you define each function only once.
There may or may not be room for improving Load Elimination in Turbofan to be more efficient for huge inputs; that's a longer investigation and I'm not sure it's worth it (compared to working on other things that likely show up more frequently in practice).
I do want to emphasize for any future readers of this that disabling optimization(s) is not generally a good rule of thumb for improving performance (or anything else), on the contrary; nor are any other "secret" flags needed to unlock "secret" performance: the default configuration is very carefully optimized to give you what's (usually) best. It's a very rare special case that a particular optimization pass interacts badly with a particular code pattern in an input function.
I need to test some Node frameworks, or at least their routing part: from the moment a request arrives at the Node process until a route has been decided and a function/class with the business logic is about to be called. I have looked long and hard for a suitable approach, but concluded that it must be done directly in the code rather than with an external benchmark tool, because I fear measuring the wrong attributes. I tried artillery and ab, but they measure a lot more attributes than I want to measure, like RTT, bad OS scheduling, random tasks executing in the OS, and so on.

My initial benchmarks for my custom routing code using process.hrtime() show approx. 0.220 ms (220 microseconds) execution time, but the external measure shows 0.700 ms (700 microseconds), which is not an acceptable difference, since it's 3.18x additional time. Sometimes execution time jumps to 1.x seconds due to GC or system tasks.

Now I wonder what a reproducible approach would look like. Maybe like this:
Use Docker with Scientific Linux to get a somewhat controlled environment.
A minimal Docker container install: a Node-enabled container only, no extras.
Store time results in global scope until test is done and then save to disk.
Terminate all applications with high/moderate disk I/O and/or CPU usage on the host OS.
Measure time as explained before and cross my fingers.
Any other recommendations to take into consideration?
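For the in-process measurement itself, a minimal sketch with process.hrtime.bigint() might look like this (the router object and its route method are hypothetical placeholders for the code under test):

```js
function timeRouting(router, req, iterations = 10000) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = process.hrtime.bigint();
    router.route(req);                        // hypothetical routing call
    const end = process.hrtime.bigint();
    samples.push(Number(end - start) / 1e3);  // nanoseconds -> microseconds
  }
  // Report percentiles rather than a mean, so GC pauses and system
  // tasks show up as outliers instead of skewing the result.
  samples.sort((a, b) => a - b);
  return {
    p50: samples[Math.floor(samples.length * 0.50)],
    p99: samples[Math.floor(samples.length * 0.99)],
  };
}
```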
I am currently working with three MATLAB functions that I want to run near-simultaneously in a single MATLAB session (as far as I know, MATLAB is single-threaded). These three functions are allocated individual tasks. It might be difficult for me to explain all the details of each function here, but I'll try to include as much information as possible.
They are the CONTROL/CAMERA/DATA_DISPLAY tasks. The approach I am using is to create Timer objects so that each function's callback fires continuously, with a different callback period.
CONTROL sends and receives data over Wi-Fi via a UDP port; it checks for available packets and executes its callback constantly.
CAMERA receives camera frames continuously over TCP and displays them; one timer object, T1, refreshes the captured frame for this function.
DATA_DISPLAY displays all the received data and refreshes continuously, so another timer, T2, refreshes the display for this function.
However, I noticed that timer T2 blocks timer T1 when it executes, slowing down the whole process. I am working on a system with a multi-core CPU, and I would expect MATLAB to be able to execute both timer objects in parallel, taking advantage of the computational cores.
From what I've seen of the Parallel Computing Toolbox in MATLAB, it doesn't seem able to deal with infinite loops or continuous callbacks: the code never finishes and displays nothing when executed. Then again, I am not sure how to utilize this toolbox properly.
Alternatively, can anyone suggest a way to restructure the code into a more efficient form?
Many thanks
I see a problem with using the Parallel Computing Toolbox here. The design implies that the jobs are controlled via your primary MATLAB instance. Besides this, the primary instance is the only one with a GUI, which would require letting your DATA_DISPLAY task control everything. I don't know if this is possible, but it would result in a very strange architecture. On top of that, inter-process communication is not the best idea when processing large amounts of data.
To solve the issue, I would use Java to display your data and realise the DATA_DISPLAY part. The connection to Java is very fast and simple to use. You will have to write a small Java GUI with an appendFrame function that allows your CAMERA job to push new data. Obviously, updating the GUI should be done in parallel, without blocking.
This is a general question about testing, but I will frame it in the context of Node.js. I'm not as concerned with a particular technology, but it may matter.
In my application, I have several modules that are called upon to do work when my web server receives a request. In the case of some of these modules, I close the request before I call upon them.
What is a good way to test that these modules are doing what they are supposed to do?
The advice here for RSpec is to mock out the work these modules are doing and just ensure that the appropriate methods are being called. This makes sense to me, but in Node.js, since my modules are not global, I don't think I can mock out functions without changing my program architecture so that every instance receives the objects it needs.[1]
[1] This is a well known programming paradigm, but I cannot remember its name right now.
The other option I see is to use setTimeout and take my best guess at when these modules are done with their work.
Neither of these seems ideal.
Am I missing something? Are background processes not tested?
Since you are speaking of integration tests of these background components, a few strategies come to mind.
Take all the asynchronicity out of their operation for test mode. I'm imagining you have some sort of queueing process (that could be a faulty assumption): you toss work into the queue, and your modules pick that work up and do their task. You could rework your test harness so that it stands in for the queueing mechanism, giving you direct control over when the modules execute.
Refactor your modules to take some sort of next callback. They would end up functioning a bit like Express's middleware layer, or like how async's each function works: into each module you'd pass a callback that it calls when its task is complete. Once all of the modules have reported in, you can check the state of the program (see the sketch below).
Exactly what you already suggested: wait some amount of time, and if the work still isn't done, consider that a failure. Mocha sort of does this, in that a test that runs over a definable threshold is a failure. I don't like this approach, though, because as you add more tests, they all have to wait the same amount of time.
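Here is a minimal sketch of the second strategy (the module names are made up, and the test assumes Mocha):

```js
const assert = require('assert');

// Each background module accepts a `done` callback it must call when finished.
function emailModule(ctx, done) {
  setImmediate(() => {        // stands in for the real async work
    ctx.emailSent = true;
    done();
  });
}

function auditModule(ctx, done) {
  setImmediate(() => {
    ctx.audited = true;
    done();
  });
}

it('runs all background modules', function (testDone) {
  const ctx = {};
  let pending = 2;
  const done = () => {
    if (--pending === 0) {    // every module has reported in
      assert.ok(ctx.emailSent && ctx.audited);
      testDone();
    }
  };
  emailModule(ctx, done);
  auditModule(ctx, done);
});
```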
There is this code:

async.series(tasks, function (err) {
  if (err) {
    return callback({ message: 'tasks execution error', error: err });
  }
  callback(null);
});
where tasks is an array of functions, each of which performs an HTTP request (using the request module) and calls the MongoDB API to store the data (to a MongoHQ instance).
With my current input (~200 tasks to execute), it takes
[normal mode] collection cycle: 1356.843 sec. (22.61405 mins.)
But simply changing from series to parallel gives a magnificent benefit: almost the same set of tasks runs in ~30 secs instead of ~23 mins.
But, knowing that nothing is free, I'm trying to understand the consequences of that change. Should I expect the number of open sockets to be much higher, more memory consumption, and a bigger hit on the DB servers?
The machine I run the code on is an Ubuntu box with only 1 GB of RAM. I saw the app hang there once; could that be caused by a lack of resources?
Your intuition is correct that the parallelism doesn't come for free, but you certainly may be able to pay for it.
Using a load testing module (or collection of modules) like nodeload, you can quantify how this parallel operation is affecting your server to determine if it is acceptable.
async.parallelLimit can be a good way of limiting server load if you need to, but first it is important to discover whether limiting is necessary at all. Testing explicitly is the best way to discover the limits of your system (eachLimit has a different signature, but could be used as well).
Beyond this, common pitfalls with async.parallel include wanting more complicated control flow than that function offers (which, from your description, doesn't seem to apply) and naively using parallel on too large a collection (which may, say, cause you to bump into your system's file descriptor limit if you are writing many files). With your ~200 request-and-save operations on 1 GB of RAM, I would imagine you'll be fine as long as you aren't doing much massaging in the event handlers, but if you are experiencing server hangs, parallelLimit could be a good way out.
Again, testing is the best way to figure these things out.
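For illustration, here is a minimal sketch of capping concurrency with async.parallelLimit (the task bodies stand in for the real HTTP request plus MongoDB save, and the limit of 10 is just a starting point to tune through testing):

```js
const async = require('async');

const tasks = [];
for (let i = 0; i < 200; i++) {
  tasks.push((done) => {
    // Stands in for the real request + DB write of one task.
    setTimeout(() => done(null, i), 50);
  });
}

// At most 10 tasks are in flight at once, bounding open sockets and memory.
async.parallelLimit(tasks, 10, (err, results) => {
  if (err) return console.error('tasks execution error', err);
  console.log('done:', results.length, 'tasks');
});
```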
I would point out that async.parallel executes multiple functions concurrently, not (completely) in parallel. It is more like virtual parallelism.
Executing concurrently is like running different programs on a single CPU core via multitasking/scheduling. True parallel execution would be running a different program on each core of a multi-core CPU. This is important, as Node.js has a single-threaded architecture.
The best thing about node is that you don't have to worry about I/O. It handles I/O very efficiently.
In your case you are storing data to MongoDB, which is mostly I/O. So running the tasks in parallel will use up your network bandwidth (and disk bandwidth too, if you are reading/writing from disk). Your server will not hang because of CPU overload.
The consequence is that if you overburden your server, your requests may fail. You may get an EMFILE error (too many open files); each socket counts as a file. Usually connections are pooled: to establish a connection, a socket is picked from the pool, and when finished it is returned to the pool. You can increase the file descriptor limit with ulimit -n xxxx.
You may also get socket errors when overburdened, like ECONNRESET (Error: socket hang up), ECONNREFUSED, or ETIMEDOUT, so handle them properly. Also check the maximum number of simultaneous connections for the MongoDB server.
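A minimal, hypothetical sketch of handling such transient socket errors with a retry wrapper (the error-code list and the back-off delay are illustrative, not prescriptive):

```js
const TRANSIENT = new Set(['ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT', 'EMFILE']);

// Wraps a task (a function taking a Node-style callback) so that
// transient socket errors are retried instead of failing the whole run.
function withRetry(task, retries, done) {
  task((err, result) => {
    if (err && TRANSIENT.has(err.code) && retries > 0) {
      // Back off briefly before retrying.
      return setTimeout(() => withRetry(task, retries - 1, done), 500);
    }
    done(err, result);
  });
}
```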
Finally, the server can hang because of garbage collection. GC kicks in once your memory grows to a certain point, then runs periodically after that. The maximum heap V8 can have is around 1.5 GB, so expect GC to run frequently if memory usage is high. Node will crash with a "process out of memory" error if it asks for more than that limit. So fix the memory leaks in your program. You can look at these tools.
The main downside you'll see here is a spike in database server load. That may or may not be okay depending on your setup.
If your database server is a shared resource then you will probably want to limit the parallel requests by using async.eachLimit instead.
You'll notice the difference when multiple users connect: in this case the processor has to handle multiple operations, and async tries to run the operations of multiple users in a roughly fair interleaving:
T = task, U = user (T1.U1 = task 1 of user 1)

T1.U1 => T1.U2 => T2.U1 => T8.U3 => T2.U2 => etc.
This is the opposite of atomicity (so maybe watch out for atomicity on sensitive DB operations, but that's another topic).
So it may be faster to run T2.U1 before T1.U1. That is no problem unless T2.U1 depends on T1.U1, and that is exactly what callbacks are for: they let you enforce the ordering where it matters.
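A tiny sketch of that idea (the task names are illustrative):

```js
// T1 finishes at an unpredictable time; T2 must not start before it.
function t1(user, done) {
  setTimeout(() => { console.log(`T1.${user}`); done(); }, Math.random() * 100);
}
function t2(user) {
  console.log(`T2.${user}`);
}

// Without the callback, T2.U1 could run before T1.U1;
// chaining through the callback guarantees the order.
t1('U1', () => t2('U1'));
```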
...hope this is what you wanted to know... it's a bit late here.