aws kernel is killing my node app - node.js

Problem:
I am running a test of my Mongoose query, but the kernel kills my Node app for out-of-memory reasons.
Flow scenario, for a single request:
GET request -> read the user's document (this schema has a ref to the User schema in one of its fields)
-> compile/rearrange the output of the query read from MongoDB into the response format required by the client (this involves filtering and looping over the data)
-> update a field of this document and save it back to MongoDB
-> update Redis
-> send the compiled response back to the requesting client
The above fails when 100 concurrent customers do the same:
MEM - goes very low (<10MB)
CPU - maxes out (>98%)
What I could figure out is that the rate at which the reads and writes occur is choking MongoDB, which queues all the requests and thereby delays Node.js; this causes such drastic CPU and memory values, and finally the app gets killed by the kernel.
Please suggest how I should proceed to achieve concurrency in such flows.
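For reference, the flow above roughly corresponds to a handler like this (a minimal sketch assuming Express, Mongoose, and node-redis; the model, field, and helper names are illustrative, not the actual application code):

    // Minimal sketch of the flow described above; all names are illustrative.
    const express = require('express');
    const mongoose = require('mongoose');
    const { createClient } = require('redis');

    const User = mongoose.model('User', new mongoose.Schema({ name: String }));
    const Doc = mongoose.model('Doc', new mongoose.Schema({
      user: { type: mongoose.Schema.Types.ObjectId, ref: 'User' }, // ref to the User schema
      items: [String],
      lastAccessedAt: Date,
    }));

    // Hypothetical compile step: filter/loop over the query result into the client's format.
    function buildResponse(doc) {
      return {
        id: doc.id,
        user: doc.user ? doc.user.name : null,
        items: (doc.items || []).filter(Boolean),
      };
    }

    const app = express();
    const redisClient = createClient();

    app.get('/docs/:id', async (req, res, next) => {
      try {
        // READ the document and populate the referenced user
        const doc = await Doc.findById(req.params.id).populate('user');
        if (!doc) return res.sendStatus(404);

        // COMPILE/REARRANGE into the response format required by the client
        const payload = buildResponse(doc);

        // UPDATE a field and SAVE the document back to MongoDB
        doc.lastAccessedAt = new Date();
        await doc.save();

        // UPDATE REDIS with the compiled response
        await redisClient.set('doc:' + doc.id, JSON.stringify(payload));

        // SEND the compiled response back to the client
        res.json(payload);
      } catch (err) {
        next(err);
      }
    });

    async function main() {
      await mongoose.connect('mongodb://localhost/test'); // assumed connection string
      await redisClient.connect();
      app.listen(3000);
    }
    main().catch(console.error);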

You've now met the Linux OOM Killer. Basically, all Linux kernels (not just Amazon's) need to take action when they've run out of RAM, so they need to find a process to kill. Generally, this is the process that has been asking for the most memory.
Your 3 main options are:
Add swap space. You can create a swapfile on the root disk if it has enough space, or create a small EBS volume, attach it to the instance, and configure it as swap.
Move to an instance type with more RAM.
Decrease your memory usage on the instance, either by stopping/killing unused processes or reconfiguring your app.
Option 1 is probably the easiest for short-term debugging. For production performance, you'd want to look at optimizing your app's memory usage or getting an instance with more RAM.

Related

How to fix a memory leak when switching between databases with Mongoose & MongoDB?

I've identified a memory leak in an application I'm working on, which causes it to crash after a while due to being out of memory. Fortunately we're running it on Kubernetes, so the other replicas and an automatic reboot of the crashed pod keep the software running without downtime. I'm worried about potential data loss or data corruption though.
The memory leak is seemingly tied to HTTP requests. According to the memory usage graph, memory usage increases more rapidly during the day when most of our users are active.
In order to find the memory leak, I've attached the Chrome debugger to an instance of the application running on localhost. I made a heap snapshot and then I ran a script to trigger 1000 HTTP requests. Afterwards I triggered a manual garbage collection and made another heap snapshot. Then I opened a comparison view between the two snapshots.
According to the debugger, the increase of memory usage has been mainly caused by 1000 new NativeConnection objects. They remain in memory and thus accumulate over time.
I think this is caused by our architecture. We're using the following stack:
Node 10.22.0
Express 4.17.1
MongoDB 4.0.20 (hosted by MongoDB Atlas)
Mongoose 5.10.3
Depending on the request origin, we need to connect to a different database name. To achieve that we added some Express middleware in between that switches between databases, like so:
On boot we connect to the database cluster with mongoose.createConnection(uri, options). This sets up a connection pool.
On every HTTP request we obtain a connection to the right database with connection.useDb(dbName).
After obtaining the connection we register the Mongoose models with connection.model(modelName, modelSchema).
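Roughly, that boot and middleware code looks like this (a minimal sketch of the setup described above, assuming Express and Mongoose 5; the tenant header and model names are illustrative):

    // Sketch of the connection-switching setup described above (illustrative names).
    const mongoose = require('mongoose');

    const userSchema = new mongoose.Schema({ email: String });

    // On boot: connect to the cluster once; this sets up the connection pool.
    const baseConnection = mongoose.createConnection(process.env.MONGO_URI, { // assumed env var
      useNewUrlParser: true,
      useUnifiedTopology: true,
    });

    // On every HTTP request: switch to the right database and register the models on it.
    function dbSwitcher(req, res, next) {
      const dbName = req.headers['x-tenant']; // assumed way of deriving the database name
      const connection = baseConnection.useDb(dbName);
      req.models = {
        User: connection.model('User', userSchema),
      };
      next();
    }

    module.exports = dbSwitcher;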
Do you have any ideas on how we can fix the memory leak, while still being able to switch between databases? Thanks in advance!

Is there a way to read a database link in Cosmos DB Java V4 API?

For example, reading "dbs/colls/document" instead of getting a container, then calling read on the container.
I've been having an issue where the first readItem on a container (after calling database.getContainer(x)) is extremely slow (a second or longer), and I was thinking that using a database link could be faster.
I'm guessing a read after getting the container is slow because it doesn't make a service call until I call read.
Is there a way I can have this preloaded when reading in a database?
I have an application with a read(collectionName, key) method, and my approach was to use getContainer(collectionName) and then call read on that, but this method needs to be fast.
As discussed, the best practice is to keep an instance of your container alive between requests and call readItem on each request. This should resolve the primary issue.
As for the secondary concern, the "high latency every 50 requests or so": this is a known issue; however, it should only occur in the first minute or so of operation. If you can tolerate the initial slow requests, the solution is to wait for performance to stabilize. How long do you have to run your app before you no longer see these high-latency requests?
FYI, if latency is a concern, run your client application in a geographically colocated Azure VM. Also a good rule of thumb is to allocate client CPU cores such that CPU utilization is not more than 40% or 50%.

Share memory between multiple processes in a Node.js environment

So here is the problem I'm thinking about:
One physical server runs a Node.js HTTP server through the cluster module, which means there are multiple separate processes. Each second I receive a large number of requests (5000-10000k); each process counts its incoming requests separately, and then they aggregate these statistics in memcache.
Such an architecture creates additional processor-time consumption for I/O operations, plus an additional large service running on the same server.
What I'm thinking about is creating a small service that allocates some memory for the request counters. When the HTTP server processes start, they connect to this service and receive from it a pointer to the memory where the counter is located, so they can increment and read the number directly, without intermediate service commands.
Question: Is there any way to allocate memory in one process and then give the address of this memory to a set of other processes, so that those processes can read and write that memory directly? And this should be possible in Node.js.
Answer: After some research I came across the shared memory system calls and used them in a self-written Node.js addon, which allowed me to use a single memory block among multiple processes. The disadvantage of this method is that only primitive types are allowed (char, int).
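Usage of such an addon might look roughly like this (a sketch only; 'shm-counter' is a hypothetical name for the self-written binding around the shared memory system calls, and its API is illustrative):

    // Sketch only: 'shm-counter' is a hypothetical native addon wrapping the
    // shmget/shmat system calls; all of its calls are illustrative.
    const cluster = require('cluster');
    const os = require('os');
    const http = require('http');
    const shm = require('shm-counter'); // hypothetical binding

    const SHM_KEY = 0x1234; // well-known key shared by all processes

    if (cluster.isMaster) {
      // Master allocates the shared block holding one int counter per worker.
      const numWorkers = os.cpus().length;
      shm.create(SHM_KEY, numWorkers * 4); // hypothetical: size in bytes
      for (let i = 0; i < numWorkers; i++) {
        cluster.fork({ WORKER_SLOT: i });
      }
      // Periodically read all counters and aggregate them (e.g. push to memcache).
      setInterval(() => {
        const counters = shm.readInts(SHM_KEY, numWorkers); // hypothetical
        console.log('requests per worker:', counters);
      }, 1000);
    } else {
      // Workers attach to the same block and bump their own slot on every request.
      const slot = Number(process.env.WORKER_SLOT);
      shm.attach(SHM_KEY); // hypothetical
      http.createServer((req, res) => {
        shm.incrementInt(SHM_KEY, slot); // hypothetical: atomic increment
        res.end('ok');
      }).listen(8080);
    }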

Nodejs application memory usage tracking and clean up on exit

"A Node application is an instance of a Node Process Object".link
Is there a way in which local memory on the server can be cleared every time the node application exits.
[By application exit i mean that when each individual user of the website shuts down the tab on the browser]
node.js is a single process that serves all your users. There is no specific memory associated with a given user other than any state that you yourself in your own node.js code might be storing locally in your node.js server on behalf of a given user. If you have some memory like that, then the typical ways to know when to clear out that state are as follows:
Offer a specific logout option in the web page and clear the user's state from memory when they log out. This doesn't catch all the ways a user might disappear, so it would typically be done in conjunction with the other options.
Have a recurring timer (say every 10 minutes) that automatically clears any state from a user who has not made a web request within the last hour (or however long you want the timeout to be). This also requires keeping a timestamp for each user, updated each time they access something on the site, which is easy to do in a middleware function (see the sketch after this list).
Have all your client pages keep a webSocket connection to the server; when that webSocket connection has been closed and not re-established for a few minutes, you can assume the user no longer has any page open to your site and you can clear their state from memory.
Don't store user state in memory. Instead, use a persistent database with good caching. Then, when the user is no longer using your site, their state info will just age out of the database cache gracefully.
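For example, the timer-based approach from option 2 could look like this (a minimal sketch assuming Express and an in-memory map of per-user state; the header used to identify the user is an assumption):

    // Sketch of option 2: middleware stamps each user's last activity,
    // and a recurring timer clears state for users idle longer than an hour.
    const express = require('express');
    const app = express();

    const userState = new Map(); // userId -> { lastSeen, data }
    const IDLE_LIMIT_MS = 60 * 60 * 1000;     // clear state after one hour of inactivity
    const SWEEP_INTERVAL_MS = 10 * 60 * 1000; // check every 10 minutes

    // Middleware: record a timestamp for the user on every request.
    app.use((req, res, next) => {
      const userId = req.get('x-user-id'); // assumed way of identifying the user
      if (userId) {
        const state = userState.get(userId) || { data: {} };
        state.lastSeen = Date.now();
        userState.set(userId, state);
      }
      next();
    });

    // Recurring timer: drop state for users who have been idle too long.
    setInterval(() => {
      const cutoff = Date.now() - IDLE_LIMIT_MS;
      for (const [userId, state] of userState) {
        if (state.lastSeen < cutoff) userState.delete(userId);
      }
    }, SWEEP_INTERVAL_MS);

    app.listen(3000);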
Note: Tracking overall memory usage in node.js is not a trivial task, so it's important to know exactly what you are measuring. Overall process memory usage is a combination of memory that is actually being used and memory that was previously used, is currently available for reuse, but has not been given back to the OS. You need to track the memory that is actually in use by node.js, not just the memory the process has been allocated. A heap snapshot is one of the typical ways to see what is actually being used rather than what has been allocated from the OS.
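To illustrate the distinction, here is a quick way to look at both numbers from inside the process (a sketch; v8.writeHeapSnapshot requires a reasonably recent Node version):

    // rss is what the OS has given the process; heapUsed is what V8 is actually using.
    const v8 = require('v8');

    const { rss, heapTotal, heapUsed } = process.memoryUsage();
    console.log('rss (allocated from the OS): %d MB', Math.round(rss / 1048576));
    console.log('heapTotal (reserved by V8):  %d MB', Math.round(heapTotal / 1048576));
    console.log('heapUsed (actually in use):  %d MB', Math.round(heapUsed / 1048576));

    // Write a heap snapshot that can be opened in Chrome DevTools to see what is live.
    const file = v8.writeHeapSnapshot(); // returns the generated filename
    console.log('heap snapshot written to', file);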

Node.js: avoiding the pyramid of doom and rising memory at the same time

I am writing a socket.io based server and I'm trying to avoid the pyramid of doom and to keep the memory low.
I wrote this client - http://jsfiddle.net/QUDXU/1/ - which I run with node client-cluster 1000, i.e. 1000 connections making continuous requests.
For the server side I tried 3 different solutions, which I tested. The results, in terms of RAM used by the server after letting everything run for an hour, are:
Simple callbacks - http://jsfiddle.net/DcWmJ/ - 112MB
Q module - http://jsfiddle.net/hhsja/1/ - 850MB and increasing
Async module - http://jsfiddle.net/SgemT/ - 1.2GB and increasing
The server and clients are on different machines. (Softlayer cloud instances). Node 0.10.12 and Socket.io 0.9.16
Why is this happening? How can I keep the memory low while using some kind of library that keeps the code readable?
Option 1. You can use the cluster module and gracefully kill your workers from time to time (make sure you call disconnect() first). You can check process.memoryUsage().rss > 130000000 and recycle a worker when it exceeds 130MB, for example :) (see the sketch after these options).
Option 2. NodeJS has the habit of using memory and rarely doing rigorous cleanups; as V8 approaches its maximum memory limit, GC runs become more aggressive. So you could lower the maximum heap size a Node process can take up by running node --max-old-space-size=<MB>. I do this when running node on embedded devices (often with less than 64 MB of RAM available).
Option 3. If you really want to keep the memory low, use weak references where possible (anywhere except in long-running calls): https://github.com/TooTallNate/node-weak . This way, objects will get garbage collected sooner. Extensive tests to make sure everything still works are needed, though. Good luck if you use this one :)
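A sketch of the worker-recycling idea from Option 1 (one common variant is to have each worker watch its own RSS and retire itself, with the master forking a replacement; the 130 MB threshold and 10-second check interval are arbitrary):

    // Sketch of Option 1: each worker watches its own RSS and retires gracefully;
    // the master forks a replacement whenever a worker exits.
    const cluster = require('cluster');
    const http = require('http');
    const os = require('os');

    const MEMORY_LIMIT = 130 * 1024 * 1024; // ~130 MB, arbitrary threshold

    if (cluster.isMaster) {
      for (let i = 0; i < os.cpus().length; i++) cluster.fork();
      cluster.on('exit', () => cluster.fork()); // replace retired workers
    } else {
      http.createServer((req, res) => res.end('ok')).listen(8080);

      const timer = setInterval(() => {
        if (process.memoryUsage().rss > MEMORY_LIMIT) {
          clearInterval(timer);
          // disconnect() stops accepting new connections and lets in-flight
          // requests finish; the worker then exits once its event loop drains.
          cluster.worker.disconnect();
        }
      }, 10000); // check every 10 seconds (arbitrary)
    }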
It seems like the problem was in the client script, not the server one. I ran 1000 processes, each of them emitting messages to the server every second. I think the server was getting very busy resolving all of those requests and thus using all of that memory. I rewrote the client side, spawning a number of processes proportional to the number of processors, each of them opening multiple connections like this:
client = io.connect(selectedEnvironment, { 'force new connection': true, 'reconnect': false });
Notice the 'force new connection' flag, which allows connecting multiple clients using the same instance of socket.io-client.
The part that actually solved my problem was how the requests were made: each client makes its next request one second after receiving the acknowledgement of the previous request, rather than simply emitting every second.
With 1000 clients connected, my server uses ~100MB RSS. I also used async on the server script, which seems very elegant and easier to understand than Q.
The bad part is that after running the server for about 2-3 days, the memory has risen to 250MB RSS, and I don't know why.
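The ack-paced request loop described above might look roughly like this on the client (a sketch assuming the socket.io-client 0.9.x API; the event name and server URL are illustrative):

    // Sketch: each client sends its next request one second *after* the server
    // acknowledges the previous one, instead of blindly emitting every second.
    var io = require('socket.io-client');

    var selectedEnvironment = 'http://localhost:8080'; // assumed server URL
    var client = io.connect(selectedEnvironment, { 'force new connection': true, 'reconnect': false });

    function sendRequest() {
      client.emit('request', { ts: Date.now() }, function ack(response) {
        // Only schedule the next request once the previous one has been acknowledged.
        setTimeout(sendRequest, 1000);
      });
    }

    client.on('connect', sendRequest);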

Resources