Express App, diagnosing latency spikes in EC2 instances in Beanstalk - node.js

We have a medium-sized Express API that performs well as long as incoming traffic is relatively stable, but when there is a quick uptick in requests, the latency has a tendency to briefly shoot through the roof.
You can see in this image 3 latency spikes that clearly correlate to incoming traffic spikes.
enter image description here
The app is perfectly capable of handling a larger sum of requests with low latency, it is just when there is a spike that this happens.
Also of note, the DB does not struggle or show any correlating spikes at all, and the CPU usage is only moderately affected, rarely going much above 50%-60% even at peak times.
Load-testing shows same issues, whether on local machine or against another AWS QA environment, so probably not related to network.
Given that the DB and CPU are rarely chugging to process requests, and given that the app performs well under higher traffic amounts (reqs fulfilled in <100ms), how would you go about diagnosing the root cause specific issue we have here with traffic spikes?
We have tried pinpointing poorly performing code bottlenecks, but the savings have been negligible. We have engaged with our devops folks to see if there were some options to smooth this out with config changes, but efforts to give us larger instances and adjust auto-scaling haven't really helped this particular problem.
Here's hoping seasoned eyes can give us some clues, assumption is still that there is some code improvements that will help, maybe this is a typical issue in node/express apps. Happy to give additional details, thx in advance.

Related

Prevent bottleneck on bandwidth for mobile internet

I am sure that this question has already been answered, but unfortunately I do not know the keywords. Therefore my search remained unsuccessful until now.
Scenario: I want to transmit a lifestream via Mobile Internet using RaspberryPi, and depending on the bandwidth, downscale the streams and upscale them again when available.
My two questions for the network specialists among you:
i know i can actively check the bandwidth, but how would you do this without interfering with the existing processes transmitting? Should I commit a bandwidth to the processes and then slowly determine the remaining bandwidth using a test tool? Or are there already practical solutions?
Can I determine in the mobile Internet, or in the network interface, when a bottelneck is reached?
Passive methods would be my preference. where I wouldn't have to load the bandwidth. e.g. I could know how much bandwidth the stream uses, and how much arrives. But how do I make sure there is enough capacity before I go up with the bitrate?
Thanks for your wisdom ;)

Node.js Hosting: low bandwidth, high cpu

I've been looking around for a Node.js hosting service that suits my (probably rather exceptional) needs. It's basically a web app for CMS editors to preview CMS pages. (The actual website is statically hosted.)
So it handles only few requests, but all page requests (*.html) trigger quite a series of actions, to simplify let's say it rebuilds a good part of the website.
What I need is a service that delivers high performance on few occasions. Also it should support continuous deployment, so that when we update something, the app stays always on. (also the simpler the better: I'm a frontend developer, not a dev ops).
I've tried Google Cloud: painfully slow update mechanism, rather complex, but stable and fast, but only if you pay a lot.
Heroku is very simple, but their plans are for standard web apps, focusing on many requests, high bandwidth etc. Still, the $250 plan is rather ok in terms of performance. But again: pricey.
Jelastic would offer this flexible vertical scaling. But it's hard to do Continuous deployment and I have not yet figured out how to update an app without interruption of service.
I also though about renting a virtual private server, but again I would not know how to provide continuous delivery. Also, I'd rather have a dedicated service.
It feels like there must be a simple service that I've just missed. I'm grateful for any help or hints!
I have finally found an answer to my search: Google Cloud Run
You'll only pay by usage, and it still offers very good performance. For my scenario it decreases cost by a crazy factor: we'll pay less than 1% of the previous solution (driven by Google App Engine).
I've also tried AWS Lambda, but I had many issues with the rather extensive node app (like dynamic requires).

How do I determine virtual user amount and pacing if the client cannot give me any real data about the website?

I've come across many clients who aren't really able to provide real production data about a website's peak usage. I often do not get peak pageviews per hour, etc.
In these circumstances, besides just guessing or going with what "feels right" (i.e. making it all up), how exactly does one come up with a realistic workload model with an appropriate # of virtual users and a good pacing value?
I use Loadrunner for my performance/load testing.
Ask for the logs for a month.
Find the stats for session duration, then count the number of distinct IP's blocked by session duration.
Once you have the high volume hour, count the number of page instances. Business processes will typically have a termination page which is distinct and allows you to understand how many times a particular action takes place, such as request new password, update profile, business process 1, etc...
With this you will have a measurement of users and actions. You will want your stakeholder to take ownership of this data. As Quality assurance, we should not own both the requirement and the test against it. We should own one, but not both. If your client will not own the requirement, cascading it down to rest of the organization, assume you will be left out in the cold with a result they do not like....i.e., defects that need to be addressed before deployment to production.
Now comes your largest challenge, which is something that needs to be fixed with a process issue by your client.....You are about to test using requirements that no other part of the organization, architecture, development, platform engineering, had when they built the solution. Even if your requirements are a perfect recovery, plus some amount for growth, any defects you find will be challenged aggressively.
Your test will not match any assumptions or requirements used by any other portion of the organization.
And, in a sense, these other orgs will be correct in aggressively challenging your results. It really isn't fair to hold their designed solution to a set of requirements which were not in place when they made decisions which impacted scalability and response times for the system. You would be wise to call this out with your clients before the first execution of any performance test.
You can buy yourself some time. If the client does have a demand for a particular response time, such as an adoption of the Google RAIL model, then you can implement a gate before accepting any code for multi-user performance testing that the code SHALL BE compliant for a single user. It is not going to get any faster for two or more users. Implementing this hard gate will solve about 80% of your performance issues, for the changes required to bring code into compliance for a single user most often will have benefits on the multi-user front.
You can buy yourself some time in a second way as well. Take a look at their current site using tools such as Google Lighthouse and GTMetrix. Most of us are creatures of habit, that includes architect, developers, and ops personnel. We design, build, deploy to patterns we know and are comfortable with....usually the same ones over and over again until we are forced to make a change. It is highly likely that the performance antipatterns pulled from Lighthouse and GTMetrix will be carried forward into a future release unless they are called out for mitigation. Begin citing defects directly off of these tools before you even run a performance test. You will need management support, but you might consider not even accepting a build for multi-user performance testing until GTMetrix scores at least a B across the board and Lighthouse a score of 90 or better.
This should leave edge cases when you do get to multi-user performance testing, such as too early allocation of a resource, holding onto resources too long, too large of a resource allocation, hitting something too often, lock contention on a shared resource. An architectural review might pick up on these, where someone might say, "we are pre-allocating this because.....," or "Marketing says we need to hold the cart for 30 minutes before de-allocation," or "...." Well, you get the idea.
Don't forget to have the database profiler running while functional testing is going on. You are likely to pick up a few missing indexes or high cost queries here which should be addressed before multi-user performance testing as well.
You are probably wondering why am I pointing out all of these things before your performance test takes place. Darnit, you are hired to engage in a performance test! The test you are about to conduct is very high risk politically. Even if it finds something ugly, because the other parts of the organization did not benefit from the requirements, the result is likely to be rejected until the issue shows up in production. By shifting the focus to objective measures even before you need to run two users in anger together there are many avenues to finding and fixing performance issues which are far less politically volatile. Food for thought.

Where to host my NodeJS powered socket.io API?

I'm currently working on an api for an app that I'm developing, which I don't want to tell too much about. I'm a solo dev with no patents so it's probably a good idea to keep it anonymous. It uses socket.io together with node.js to interact with clients and the other way around, which I might be swapping out sometime later for elixir and it's sockets, but that isn't relevant for now. Now I'm trying to look into cloud hosting, but I'm having a rough time finding a good service to use.
These are my requirements:
24/7 uptime
Low memory and performance necessary (at least to start with). About 1+ gig with 2+ cores will most likely suffice (need 2 threads or more for node to handle async programming well)
Preferably free for like maybe even a year, or just really cheap, but that might be munch to ask
Must somehow be able to run some sort of database. Haven't really settled on this yet, but I want to implement a custom currency at some point, and probably have the ability to add some cooldowns. So it can be fairly simple and small. If anybody has any tips on what database I should use, that would also be very welcome. I was thinking of Cassandra because of the blazing fast performance and expandability. But I also wanna look into remote databases, especially if I'm gonna go international with the product
Ability to keep open socket.io connections, as you've probably guessed :P
Low ping decently high bandwith internet. The socket.io connections are lightweight and not a lot of data has to be sent. Mostly packets of a few kilobytes every now and then for all of the clients.
If this information is too vague or you want to know some other requirements I haven't thought of, let me know.
Check out Heroku (PaaS), they have a free version to start with

Node.js - I need to know my resource usage before deploying to google cloud

EDIT: STILL NOT ANSWERED. I appreciate the advice I have received so far, but I still have not found a proper way to test the amount of resources my server is using. I decided to use GCE instead of GAE but I still want to measure the resource usage.
I have searched all over google as well as SA and can't seem to figure this one out.
I would like to deploy my (very small) node.js server to either Google App Engine or Google Compute Engine (not sure which to use yet).
I see that they charge based on how many resources you use, but how can I check this before I make my decision? Basically what I would like to do is find a way to analyse my server and see what CPU/DISK/NETWORK/RAM/Etc it uses, and then possibly make some refinements to my code to get the usage down as low as possible.
I am a hobbyist programmer and this server is just for personal stuff so I don't need anything fancy. I just want to get it hosted on google and not my home server. My real fear is that, since I am not a professional, my code might be doing some crazy background stuff repeatedly that would rack my usage up for nothing.
Quick rundown on what my server does:
Basic node.js express template that IntelliJ made me, then I added my code to sit and listen to a Firebase. When the firebase gets a message (once or twice a day maybe, text message equivalent size) the server sends a quick GCM/FCM message to a few devices. Extremely simple server, very little code. Nothing crazy.
As a little bonus for me, if you have a suggestion as to which platform I should use, I am all-ears.
If you do not need this server to run 24x7, use App Engine. It stops an instance if it is not being used for 15 minutes. The startup time for new instances depends on your code, but for Node.js instances it should not be long.
Generally speaking it is easier to run an app on App Engine than Compute Engine, but if you use a single instance and don't change code often the difference is negligible.
App Engine has a generous free quota. You may end up paying nothing until the usage gets over a certain threshold.
You can run some diagnostic tools on your existing server, but even then you will get an approximation - a server with a different combination of resources sitting on a different network may use resources differently. You may be able to get a rather accurate estimate of memory usage, though.
If this is a small app with not too many users, even a small instance should be able to handle it. There is no harm in trying - start with the smallest instance, test, go to the next instance up if tests fail. Your key concern should be to have enough memory to handle a small number of requests.
As for the number of requests your server can handle, you can configure automatic scaling. It is a default option in App Engine and can be enabled for flexible runtime. Then you can have the smallest instance (i.e. your server does not crash due to the lack of memory) running, and another instance will be added if and when that small instance is not enough.
Well, after over a month I figure I might as well answer this myself.
What I ended up doing was creating a basic instance on Computer Engine (the micro. Smallest one available) and letting it just sit there for a few weeks. I looked back at the data to see what some good baselines were and took note.
Then I took my server code and ran it on the server. I left if there for a few days, changed it, updated it, etc. Just tried to simulate the things I would be doing. Sent messages on my client app (that's what this server is doing after all is said and done) and I let this go on for a few more weeks.
The rest is history. I looked at the baseline then looked at my new memory, CPU, network and disk usage and there we go. Good to go. My free trial still isn't even over so it was a free experiment.
The good news is that my server is more 'lightweight' than I thought.

Resources