Running Node.js and Express, my server time seems to be 5 hours ahead. Anyone have any suggestions?
Most of us face this problem when dealing with time issues, such as producing charts based on times and dates, so you are not alone. No matter what time your server is on, it will not match the timezone of everyone who uses the Node.js program, unless this is a private creation. Some suggestions:
You did not mention what type of server. First, you need to decide what timezone (TZ) your Node.js program will use (for sanity purposes if nothing else). If you have a service that will be accessed from anywhere in the world, people expect certain things to be in their local time, while others can stay in UTC.
Your programs (modules) need to be aware of when they are using UTC and when they are not. For instance, in a DB where you have a timestamp, you need to know which timezone that timestamp was recorded in, so calculations based on TZ can be correct afterwards.
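A minimal sketch of that discipline - store UTC, convert only at display time - using Node's built-in Intl API (the timezone name here is just an example, not from the question):

```javascript
// Sketch: store timestamps as UTC (epoch ms) and convert only for display.
const storedUtcMs = Date.UTC(2014, 0, 1, 12, 0, 0); // noon UTC on 2014-01-01

function formatForUser(utcMs, timeZone) {
  // Intl.DateTimeFormat ships with Node (full ICU by default since Node 13).
  return new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric', month: '2-digit', day: '2-digit',
    hour: '2-digit', minute: '2-digit', hour12: false,
  }).format(new Date(utcMs));
}

console.log(formatForUser(storedUtcMs, 'UTC'));              // e.g. 01/01/2014, 12:00
console.log(formatForUser(storedUtcMs, 'America/New_York')); // e.g. 01/01/2014, 07:00
```

The stored value never changes; only the presentation does, which is what keeps TZ calculations correct later.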
Now the headache. For serious management of timezones and formats, you should use moment. However, and I say this after painful experience: though moment is great, take time to understand it. I would not suggest just diving in; take the time to understand how to manipulate timezones correctly, especially around UTC, for your own needs. TZ handling in moment ranges from easy to fairly elaborate. Read up on it here.
There are some other datetime modules which are excellent, (just go to NPM and search for date time) and moment has some other plugins (like twix), so the field is covered.
Finally, be careful with date variable declaration, and avoid scoping and hoisting issues with date and time variables; with datetimes especially, these have unseen consequences.
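A tiny illustrative sketch of the hoisting pitfall mentioned above:

```javascript
// `var` declarations are hoisted to the top of the function, so `start`
// exists (with value undefined) before the line that assigns it.
function hoisted() {
  const before = typeof start; // 'undefined' here, not a ReferenceError
  var start = new Date();
  return { before, isDate: start instanceof Date };
}

console.log(hoisted()); // { before: 'undefined', isDate: true }
```

Using `let`/`const` instead of `var` would make the early access a ReferenceError, which surfaces this class of bug immediately.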
Hope some of the above helps.
Related
I would like to know how significant the impact could be between using a 15.5k library just for very simple validations versus using my own 1k super-simple validation class, by the time I have more than 10k users on my system (Node + Mongo running on a super Pentium 8-core, 32 GB RAM machine).
Is it worth worrying about this 14.5k of code?
I can't find any clue in my bleak but always-wondering mind.
I'd appreciate your opinion very much.
A nice thing about server development is that you usually have significant RAM available and the code is generally loaded just once at the startup of the server so load time is not part of the user experience.
Because of these, you'd be hard pressed to even measure a meaningful impact between a 1k library and a 15k library. You might care about 15k of memory usage per active user, but you will not care about an extra 15k of code loaded into memory one time and it will not affect your server performance in any way.
So, I'd say you should not worry about scrimping on code size (within reason). Instead, pick the tool that best solves your problem and makes your development quickest and most reliable. And, where possible, use what has been built and tested before rather than building your own from scratch. That will leave you more development time to spend on the things that really matter to making your website better, different, or great. Or it will get you to market faster.
For reference, 15k is 0.000045% of your total computer's RAM.
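That figure checks out; as a quick sanity calculation, assuming the 32 GB machine from the question:

```javascript
// Quick check of the percentage above.
const libBytes = 15 * 1024;       // a 15k library
const ramBytes = 32 * 1024 ** 3;  // 32 GB of RAM
const pct = (libBytes / ramBytes) * 100;
console.log(pct.toFixed(6) + '%'); // 0.000045%
```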
I agree with @jfriend00. There's almost no impact on memory/performance for the code sizes you describe. You can always benchmark different modules according to your usage profile and choose for yourself. However, I think you should ask yourself some other (similar) questions -
Why is the package I use so 'big'? Maybe there's a much 'smaller' one that does the same job with the same performance. When I say big or small here I mean in terms of functionality. Most of the time you'd want to go with minimum functionality, even if its size might seem big. If you use a validation module that also validates emails, but you don't need that, it doesn't mean you shouldn't use it - just know the tradeoffs: it might get updated more frequently because of bugs in the email validation, those bugs might affect the integer validations you do use, and you have more code to read if you want your production code to feel safe (explained below).
Does the package function as I expect? (read the tests)
Is the package I use "secured"/"OK for production"? Read the code of the packages you use, make sure there isn't something fishy going on - usually node packages are not that big because most are minimal (I never used it, but I know https://requiresafe.com/ exists for these types of questions - you might want to check it out). Note that if they are larger in size that might mean you would have to read more code.
Ask these questions (and others, if you feel you should) recursively on the package's dependencies.
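The benchmarking suggestion above can be sketched like this; both validators here are hypothetical stand-ins, not real npm packages:

```javascript
// Micro-benchmark sketch: compare two integer validators under your own
// usage profile. Swap in the real modules you are evaluating.
const simpleValidate = v => Number.isInteger(v);
const fancyValidate = v =>
  typeof v === 'number' && Number.isFinite(v) && Math.floor(v) === v;

function bench(fn, iterations = 1e6) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn(i);
  return Number(process.hrtime.bigint() - start) / 1e6; // elapsed ms
}

console.log('simple:', bench(simpleValidate).toFixed(1), 'ms');
console.log('fancy :', bench(fancyValidate).toFixed(1), 'ms');
```

Run it with inputs shaped like your real traffic; a benchmark over unrepresentative data will mislead you.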
A common workflow in scientific computing is to first write code (perhaps a simulation), run it and analyse its results, then make modifications based on what the previous round of results show. This cycle may go round tens, possibly even hundreds of times before the project is finished.
A key problem of this development cycle is one of reproducibility. As I have gone through this cycle, I will have produced results, graphs, and various other output. I want to be able to take any graph (from yesterday, last week, month, or longer) and reliably reconstruct the code and environment which were used to produce this. How can I solve this problem? The "obvious" solution appears to be one of organisation and recording everything, however this has the potential to create much additional work. I'm interested in the balance of achieving this without handicapping productivity.
http://ipython.org/notebook.html is for people who want to share reproducible research.
http://jupyter.org/ supports many languages, not just Python.
Recently I was experimenting with the Julia language, and this is one of the recommended tutorials. It uses IJulia, which is based on IPython - a very nice intro.
I'm attempting to test an application which has a heavy dependency on the time of day. I would like to have the ability to execute the program as if it was running in normal time (not accelerated) but on arbitrary date/time periods.
My first thought was to abstract the time retrieval function calls with my own library calls which would allow me to alter the behaviour for testing but I wondered whether it would be possible without adding conditional logic to my code base or building a test variant of the binary.
What I'm really looking for is some kind of localised time domain; is this possible with a container (like Docker) or by using LD_PRELOAD to intercept the calls?
I also saw a patch that enabled time to be disconnected from the system time using unshare(CLONE_NEWTIME), but it doesn't look like this got in.
It seems like a problem that must have been solved numerous times before; anyone willing to share their solution(s)?
Thanks
AJ
Whilst alternative solutions and tricks are great, I think you're severely overcomplicating a simple problem. It's completely common and acceptable to include certain command-line switches in a program for testing/evaluation purposes. I would simply include a command line switch like this that accepts an ISO timestamp:
./myprogram --debug-override-time=2014-01-01T12:34:56Z
Then at startup, if the switch is set, compute the offset from the current system time, and make a local apptime() function which applies this correction to the regular system time; call that everywhere in your code instead of the normal time calls.
The big advantage of this is that anyone can reproduce your testing results without a big read-up on custom Linux tricks - including an external testing team or a future co-developer who's good at coding but not at runtime tricks. When (unit) testing, it's a major advantage to be able to just call your code with a simple switch and test the results for equality against a sample set.
You don't even have to document it, lots of production tools in enterprise-grade products have hidden command line switches for this kind of behaviour that the 'general public' need not know about.
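In Node, the startup logic might look like this minimal sketch; the flag name and apptime() are illustrative, not from any particular library:

```javascript
// Offset between "application time" and real time, fixed at startup.
let offsetMs = 0;

const arg = process.argv.find(a => a.startsWith('--debug-override-time='));
if (arg) {
  const fake = Date.parse(arg.split('=')[1]); // e.g. 2014-01-01T12:34:56Z
  offsetMs = fake - Date.now();               // delta computed once at startup
}

// Call apptime() everywhere instead of new Date() / Date.now().
function apptime() {
  return new Date(Date.now() + offsetMs);
}

console.log(apptime().toISOString());
```

Because only the offset is stored, time still advances at the normal rate - the program runs in real time, just shifted to the requested date.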
There are several ways to query the time on Linux. Read time(7); I know at least time(2), gettimeofday(2), clock_gettime(2).
So you could use LD_PRELOAD tricks to redefine each of these to e.g. subtract from the seconds part (not the micro-second or nano-second part) a fixed amount of seconds, given e.g. by some environment variable. See this example as a starting point.
I'm currently doing performance and load testing of a complex many-tier system investigating the effect of different changes, but I'm having problems keeping track of everything:
There are many copies of different assemblies
Originally released assemblies
Officially released hotfixes
Assemblies that I've built containing further additional fixes
Assemblies that I've build containing additional diagnostic logging or tracing
There are many database patches, some of the above assemblies depend on certain database patches being applied
Many different logging levels exist, in different tiers (Application logging, Application performance statistics, SQL server profiling)
There are many different scenarios, sometimes it is useful to test only 1 scenario, other times I need to test combinations of different scenarios.
Load may be split across multiple machines or only a single machine
The data present in the database can change, for example some tests might be done with generated data, and then later with data taken from a live system.
There is a massive amount of potential performance data to be collected after each test, for example:
Many different types of application specific logging
SQL Profiler traces
Event logs
DMVs
Perfmon counters
The database(s) are several GB in size, so where I would have used backups to revert to a previous state, I tend to apply changes to whatever database is present after the last test, causing me to quickly lose track of things.
I collect as much information as I can about each test I do (the scenario tested, which patches are applied, what data is in the database), but I still find myself having to repeat tests because of inconsistent results. For example, I just did a test which I believed to be an exact duplicate of a test I ran a few months ago, only with updated data in the database. I know for a fact that the new data should cause a performance degradation, yet the results show the opposite!
At the same time, I find myself spending disproportionate amounts of time recording all these details.
One thing I considered was using scripting to automate the collection of performance data etc., but I wasn't sure this was such a good idea - not only is it time spent developing scripts instead of testing, but bugs in my scripts could cause me to lose track of things even quicker.
I'm after some advice / hints on how to better manage the test environment - in particular, how to strike a balance between collecting everything and actually getting some testing done, without the risk of missing something important.
Scripting the collection of the test parameters + environment is a very good idea to check out. If you're testing across several days, and the scripting takes a day, it's time well spent. If after a day you see it won't finish soon, reevaluate and possibly stop pursuing this direction.
But you owe it to yourself to try it.
I would tend to agree with @orip - scripting at least part of your workload is likely to save you time. You might consider taking a moment to ask which tasks are the most time-consuming in terms of your labor and how amenable they are to automation. Scripts are especially good at collecting and summarizing data - much better than people, typically. If the performance data requires a lot of interpretation on your part, you may have problems.
An advantage to scripting some of these tasks is that you can then check them in along side the source / patches / branches and you may find you benefit from organizational structure of your systems complexity rather than struggling to chase it as you do now.
If you can get away with testing only against a few set configurations that will keep the admin simple. It may also make it easier to put one on each of several virtual machines which can be quickly redeployed to give clean baselines.
If you genuinely need the complexity you describe, I'd recommend building a simple database to allow you to query the multivariate results you have. Having a column for each of the important factors will allow you to query for answers to questions like "which testing config had the lowest variance in latency?" and "which test database raised the most bugs?". I use sqlite3 (through the Python wrapper or the Firefox plug-in) for this kind of lightweight collection, because it keeps maintenance overhead relatively low and avoids perturbing the system under test too far, even if you need to run on the same box.
Scripting the tests will make them quicker to execute and permit results to be gathered in an already-ordered way, but it sounds like your system may be too complex to make this easy to do.
What's a good server-side language for doing some pretty CPU- and memory-intensive things that plays well with PHP and MySQL? Currently, I have a PHP script which runs some calculations based on a large subset of a fairly large database and then updates that database based on those calculations (1.5 million rows). The current implementation is very slow, taking 1-2 hours depending on other activities on the server. I was hoping to improve this and was wondering what people's opinions are on a good language for this type of task.
Where is the bottleneck? Run some real profiling and see what exactly is causing the problem. Is it the DB I/O? Is it the CPU? Is the algorithm inefficient? Are you calling slow library methods in a tight inner loop? Could precalculation be used?
You're pretty much asking what vehicle you need to get from point A to point B, and you've offered a truck, car, bicycle, airplane, jet, and helicopter. The answer won't make sense without more context.
The language isn't the issue, your issue is probably where you are doing these calculations. Sounds like you may be better off writing this in SQL, if possible. Is it? What are you doing?
I suspect your bottleneck is not the computation - it can easily take several hours just to update a few million records.
If that's the case, you can write a custom C/C++ function for MySQL and execute it from a stored procedure.
We do this in our database to re-encrypt some sensitive fields during key rotation. It shrank key-rotation time from days to hours. However, it's a pain to maintain your own copy of MySQL. We have been looking for alternatives, but nothing comes close to the performance of this approach.