Api with high compute requirements written in Node.js - node.js

we are running an api on a 12 cores server with a high performance ssd. The app is run using pm2 cluster mode (--i 0). The problem i am facing is that we are using the api to do calculations, which for instance spans over 40 years along with 12 months of each year.
In some situations i have to run them in parallel. when i run 5 tasks in parallel using async.parallel, one core is captured for 10 seconds. According to those stats we can assume that if one core is busy at 100% usage for 10 seconds, a server with 12 cores is going to server 12 concurrent requests from users.
My question is should we be using node.js for this purpose? or is there a better way or a recommended way to handle this kind of a situation. thanks

This seems unrelated to node.js. You would have this problem regardless of programming language. Your problems are more related to your design. It seems to me you need some form of caching mechanism, so future requests don't need to take 10 seconds each. You could also do the computing ahead of time and store the results in a database.

Related

Does startup time for Cloudrun docker image delays autoscaling?

I am currently using express.js as the main backend. However, I have found that fastify is generally faster in performance.
But the downside of fastify is that it has a relatively slow startup time.
I am curious that would it slow down the autoscaling of Cloudrun?
I've seen that Cloudrun autoscales when the usage is over 60%.
In this case, I am thinking slow startup time can delay the response while autoscaling which can be a reason not to use fastify. How does this exactly work?
A slow cold start doesn't influence the autoscaller. The service scale according to the CPU usage and the number of queries.
If you track the number of created instance with benchmarks, you can see that you can have suddenly 50 concurrent requests, 5 to 10 instances are created by the Cloud Run autoscaller. Why? Because if the traffic suddenly increase with a lot of concurrent request, it can means that slope can continue and you can have soon 100 or 200 concurrent request and Cloud Run service is prepared to absorb that traffic.
If you scale slowly to 50 concurrent requests, you can have only 1 or 2 instances set up.
Anyway, it's just a thing that I noted in my previous tests. In addition, keep in mind that, if you have a sustained traffic, the cold start is a marginal case. You will lost a few milliseconds "rarely", and it doesn't imply a framework change.
My recommendation is to keep what is the best and the most efficient for you (you cost at least 100 times more than the Cloud Run cost!!)

Is Azure Functions running in Consumption mode appropriate for massively varying, yet time critical Load?

I’m about to start work on an API that will literally go from 0 RPS to a couple hundred thousand HTTP RPS at the same time and run at that rate for ~2 mins. All processing of those 30 million requests must finish by the end of that 2 min period. This would happen 7 times a WEEK.
Going serverless with Azure Functions in Consumption Plan Hosting Mode sounds appealing. This document describes that a scale controller exists to coordinate app instances, but doesn't really discuss what I can expect from it for HTTP triggers. I can’t find any info that says the scale controller will be able to respond in the time frame I'd need.
The best info I could find was this info saying it took nearly 8 mins to scale up for his tests.
Is this a bad use case for Azure Functions in consumption mode?
Obviously, spinning up a testing harness that is capable of issuing 30 million requests within 2 minutes is an undertaking of its own, and an expensive one. I'd like to learn from others who have already done so.
Based on my experience, this scenario is not properly covered by Consumption Plan. They can scale up to many instances, but not very rapidly. 2 minutes is way too fast to rely on.
I was mostly working with queues, not HTTP, but I got delays up to 40 minutes caused by low pace of scaling up.
If you can predict which 2 minutes are going to be heavy-loaded, your best bet could be to provision the capacity with a script (or another Function).

Steps to improve throughput of Node JS server application

I have a very simple nodejs application that accepts json data (1KB approx.) via POST request body. The response is sent back immediately to the client and the json is posted asynchronously to an Apache Kafka queue. The number of simultaneous requests can go as high as 10000 per second which we are simulating using Apache Jmeter running on three different machines. The target is to achieve an average throughput of less than one second with no failed requests.
On a 4 core machine, the app handles upto 4015 requests per second without any failures. However since the target is 10000 requests per second, we deployed the node app in a clustered environment.
Both clustering in the same machine and clustering between two different machines (as described here) were implemented. Nginx was used as a load balancer to round robin the incoming requests between the two node instances. We expected a significant improvement in the throughput (like documented here) but the results were on the contrary.
The number of successful requests dropped to around 3100 requests per second.
My questions are:
What could have gone wrong in the clustered approach?
Is this even the right way to increase the throughput of Node application?
We also did a similar exercise with a java web application in Tomcat container and it performed as expected 4000 requests with a
single instance and around 5000 successful requests in a cluster
with two instances. This is in contradiction to our belief that
nodejs performs better than a Tomcat. Is tomcat generally better
because of its thread per request model?
Thanks a lot in advance.
Per your request, I'll put my comments into an answer:
Clustering is generally the right approach, but whether or not it helps depends upon where your bottleneck is. You will need to do some measuring and some experiments to determine that. If you are CPU-bound and running on a multi-core computer, then clustering should help significantly. I wonder if your bottleneck is something besides CPU such as networking or other shared I/O or even Nginx? If that's the case, then you need to fix that before you would see the benefits of clustering.
Is tomcat generally better because of its thread per request model?
No. That's not a good generalization. If you are CPU-bound, then threading can help (and so can clustering with nodejs). But, if you are I/O bound, then threads are often more expensive than async I/O like nodejs because of the resource overhead of the threads themselves and the overhead of context switching between threads. Many apps are I/O bound which is one of the reasons node.js can be a very good choice for server design.
I forgot to mention that for http, we are using express instead of the native http provided by node. Hope it does not introduce an overhead to the request handling?
Express is very efficient and should not be the source of any of your issues.
As jfriend said , you need to find the bottlenecks ,
one thing you can try is to reduce the bandwith/throughput by using sockets to pass the json and especially this library https://github.com/uNetworking/uWebSockets.
The main reason for that is that an http request is significantly heavier than a socket connection.
Good Example : https://webcheerz.com/one-million-requests-per-second-node-js/
lastly you can also compress the json via (http gzip) or a third party module.
work on the weight ^^
Hope it helps!

Controlling the flow of requests without dropping them - NodeJS

I have a simple nodejs webserver running, it:
Accepts requests
Spawns separate thread to perform background processing
Background thread returns results
App responds to client
Using Apache benchmark "ab -r -n 100 -c 10", performing 100 requests with 10 at a time.
Average response time of 5.6 seconds.
My logic for using nodejs is that is typically quite resource efficient, especially when the bulk of the work is being done by another process. Seems like the most lightweight webserver option for this scenario.
The Problem
With 10 concurrent requests my CPU was maxed out, which is no surprise since there is CPU intensive work going on the background.
Scaling horizontally is an easy thing to, although I want to make the most out of each server for obvious reasons.
So how with nodejs, either raw or some framework, how can one keep that under control as to not go overkill on the CPU.
Potential Approach?
Could accepting the request storing it in a db or some persistent storage and having a separate process that uses an async library to process x at a time?
In your potential approach, you're basically describing a queue. You can store incoming messages (jobs) there and have each process get one job at the time, only getting the next one when processing the previous job has finished. You could spawn a number of processes working in parallel, like an amount equal to the number of cores in your system. Spawning more won't help performance, because multiple processes sharing a core will just run slower. Keeping one core free might be preferred to keep the system responsive for administrative tasks.
Many different queues exist. A node-based one using redis for persistence that seems to be well supported is Kue (I have no personal experience using it). I found a tutorial for building an implementation with Kue here. Depending on the software your environment is running in though, another choice might make more sense.
Good luck and have fun!

What are the most important statistics to look at when deploying a Node.js web-application?

First - a little bit about my background: I have been programming for some time (10 years at this point) and am fairly competent when it comes to coding ideas up. I started working on web-application programming just over a year ago, and thankfully discovered nodeJS, which made web-app creation feel a lot more like traditional programming. Now, I have a node.js app that I've been developing for some time that is now running in production on the web. My main confusion stems from the fact that I am very new to the world of the web development, and don't really know what's important and what isn't when it comes to monitoring my application.
I am using a Joyent SmartMachine, and looking at the analytics options that they provide is a little overwhelming. There are so many different options and configurations, and I have no clue what purpose each analytic really serves. For the questions below, I'd appreciate any answer, whether it's specific to Joyent's Cloud Analytics or completely general.
QUESTION ONE
Right now, my main concern is to figure out how my application is utilizing the server that I have it running on. I want to know if my application has the right amount of resources allocated to it. Does the number of requests that it receives make the server it's on overkill, or does it warrant extra resources? What analytics are important to look at for a NodeJS app for that purpose? (using both MongoDB and Redis on separate servers if that makes a difference)
QUESTION TWO
What other statistics are generally really important to look at when managing a server that's in production? I'm used to programs that run once to do something specific (e.g. a raytracer that finishes running once it has computed an image), as opposed to web-apps which are continuously running and interacting with many clients. I'm sure there are many things that are obvious to long-time server administrators that aren't to newbies like me.
QUESTION THREE
What's important to look at when dealing with NodeJS specifically? What are statistics/analytics that become particularly critical when dealing with the single-threaded event loop of NodeJS versus more standard server systems?
I have other questions about how databases play into the equation, but I think this is enough for now...
We have been running node.js in production nearly an year starting from 0.4 and currenty 0.8 series. Web app is express 2 and 3 based with mongo, redis and memcached.
Few facts.
node can not handle large v8 heap, when it grows over 200mb you will start seeing increased cpu usage
node always seem to leak memory, or at least grow large heap size without actually using it. I suspect memory fragmentation, as v8 profiling or valgrind shows no leaks in js space nor resident heap. Early 0.8 was awful in this respect, rss could be 1GB with 50MB heap.
hanging requests are hard to track. We wrote our middleware to monitor these especially as our app is long poll based
My suggestions.
use multiple instances per machine, at least 1 per cpu. Balance with haproxy, nginx or such with session affinity
write midleware to report hanged connections, ie ones that code never responded or latency was over threshold
restart instances often, at least weekly
write poller that prints out memory stats with process module one per minute
Use supervisord and fabric for easy process management
Monitor cpu, reported memory stats and restart on threshold
Whichever the type of web app, NodeJS or otherwise, load testing will answer whether your application has the right amount of server resources. A good website I recently found for this is Load Impact.
The real question to answer is WHEN does the load time begin to increase as the number of concurrent users increase? A tipping point is reached when you get to a certain number of concurrent users, after which the server performance will start to degrade. So load test according to how many users you expect to reach your website in the near future.
How can you estimate the amount of users you expect?
Installing Google Analytics or another analytics package on your pages is a must! This way you will be able to see how many daily users are visiting your website, and what is the growth of your visits from month-to-month which can help in predicting future expected visits and therefore expected load on your server.
Even if I know the number of users, how can I estimate actual load?
The answer is in the F12 Development Tools available in all browsers. Open up your website in any browser and push F12 (or for Opera Ctrl+Shift+I), which should open up the browser's development tools. On Firefox make sure you have Firebug installed, on Chrome and Internet Explorer it should work out of the box. Go to the Net or Network tab and then refresh your page. This will show you the number of HTTP requests, bandwidth usage per page load!
So the formula to work out daily server load is simple:
Number of HTTP requests per page load X the average number of pages load per user per day X Expected number of concurrent users = Total HTTP Requests to Server per Day
And...
Number of MBs transferred per page load X the average number of pages load per user per day X Expected number of concurrent users = Total Bandwidth Required per Day
I've always found it easier to calculate these figures on a daily basis and then extrapolate it to weeks and months.
Node.js is single threaded so you should definitely start a process for every cpu your machine has. Cluster is by far the best way to do this and has the added benefit of being able to restart died workers and to detect unresponsive workers.
You also want to do load testing until your requests start timing out or exceed what you consider a reasonable response time. This will give you a good idea of the upper limit your server can handle. Blitz is one of the many options to have a look at.
I have never used Joyent's statistics, but NodeFly and their node-nodefly-gcinfo is a great tools to monitor node processes.

Resources