MariaDB keeps exceeding innodb_buffer_pool_size - linux

I have a backend server with 1 GB of RAM running both my HTTP server and MariaDB.
I noticed the database keeps getting killed by the OOM killer once or twice a day. Most of the time the OOM kill is triggered by the HTTP server, but not always.
I have tried lowering innodb_buffer_pool_size several times; it is at 64M at the moment, but the process is still taking 40% to 60% of the server's memory.
How do I find the cause of this memory usage? It looks like some kind of memory leak, because it keeps increasing throughout the day.
The database usually starts out at about 7% to 9% of memory.
MariaDB version 10.5
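
Beyond the buffer pool, MariaDB also allocates memory for per-connection buffers (sort, join, read), in-memory temporary tables, the MyISAM key cache, and the performance schema, so capping those is usually the next step on a 1 GB host. A minimal low-memory sketch of the relevant settings (the file path and values below are illustrative, not recommendations):

```ini
# e.g. /etc/mysql/mariadb.conf.d/99-lowmem.cnf  (path is illustrative)
[mysqld]
innodb_buffer_pool_size = 64M    # global InnoDB cache (already lowered in the question)
key_buffer_size         = 8M     # MyISAM key cache
tmp_table_size          = 16M    # in-memory temp tables; anything larger spills to disk
max_heap_table_size     = 16M
max_connections         = 30     # each connection can allocate its own per-session buffers
sort_buffer_size        = 256K
join_buffer_size        = 256K
read_buffer_size        = 128K
read_rnd_buffer_size    = 256K
performance_schema      = OFF    # the instrumentation itself costs tens of MB
```

A rough upper bound is buffer pool + key cache + max_connections × (per-session buffers + tmp_table_size), which is why steady growth over the day often points at connection-level allocations rather than the buffer pool itself.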

Related

Node.js process not responding after some time

I am testing a node-webrtc project on a 16-core CPU with 32 GB of RAM.
I started the process with pm2, and after some time the Node process stops responding.
The URL becomes unreachable and video streaming stops.
What I noticed:
1) Every time, it stopped at around 3.5 GB memory consumption and 900% CPU. I tried increasing the old-space size to 24 GB, but then it failed randomly after reaching 9 GB of memory and 1100% CPU.
2) In the pm2 logs I found
"(node:3397) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 newBroadcast listeners added. Use emitter.setMaxListeners() to increase limit", but the process keeps running after this warning.
A) I am not sure whether this is a memory leak issue.
B) Regarding CPU consumption (900% out of 1600%): as far as I know, Node is a single-threaded process, so is it possible that the threads assigned to the main Node process have hit their peak?
Any suggestions on how I can debug this?
Concurrent users at that time were around 110-120.
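
Setting aside the root cause that was eventually found (bandwidth, see the answer below), a generic first step for questions A) and B) is to log the process's own memory numbers over time and take heap snapshots for comparison in Chrome DevTools. A minimal sketch (the interval, signal, and file handling are arbitrary choices, not part of the project):

```js
// mem-debug.js -- hedged sketch: periodic memory logging plus on-demand heap snapshots
const v8 = require('v8');

// Log RSS and heap usage every 30 s so growth over hours shows up in the pm2 logs.
setInterval(() => {
  const m = process.memoryUsage();
  console.log(
    `rss=${(m.rss / 1e6).toFixed(0)}MB ` +
    `heapUsed=${(m.heapUsed / 1e6).toFixed(0)}MB ` +
    `external=${(m.external / 1e6).toFixed(0)}MB`
  );
}, 30000);

// Write a heap snapshot when the process receives SIGUSR2 (`kill -USR2 <pid>`);
// diffing two snapshots taken an hour apart shows which objects keep accumulating.
process.on('SIGUSR2', () => {
  const file = v8.writeHeapSnapshot(); // built in since Node 11.13
  console.log(`heap snapshot written to ${file}`);
});
```

If heapUsed stays flat while external or RSS climbs, the growth is usually in native buffers (e.g. media frames) rather than JavaScript objects, which matters for a WebRTC workload.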
The issue was the server's outbound bandwidth.
The server has a maximum uplink speed of 128 MB/s (~1 Gbps); the streams consume the maximum allowed bandwidth, and beyond that point connections to the server become unreachable.
It was fixed by switching our server to 500 MB/s of bandwidth.

Node web app running in Fargate crashes under load with memory and CPU relatively untaxed

We are running a Koa web app in 5 Fargate containers. They are pretty straightforward CRUD/REST APIs built with Koa over MongoDB Atlas. We started doing capacity testing and noticed that the Node servers started to slow down significantly with plenty of headroom left on CPU (sitting at 30%), memory (at or below 20%), and Mongo (still returning in < 10 ms).
To further test this, we removed the Mongo operations and just hammered our health-check endpoints. We did see a lot of throughput, but significant degradation occurred at 25% CPU and Node actually crashed at 40% CPU.
Our Fargate tasks (containers) are configured with CPU: 2048 (2 vCPUs) and memory: 4096 (4 GB).
We raised our ulimit nofile to 64000 and also set --max-old-space-size to 3.5 GB. Neither made a significant difference.
We also don't see significant latency in our load balancer.
My expectation is that CPU or memory would climb much higher before the system began experiencing issues.
Any ideas where a bottleneck might exist?
The main issue here was that we were running containers with 2 vCPUs. Since Node effectively uses only one CPU, a portion of the CPU allocation was never used, and the ancillary overhead never pushed the container to 100%. So Node would be overwhelmed on its one CPU while the other sat essentially idle, which meant our autoscaling alarms never got triggered.
So we adjusted to 1-CPU containers with more horizontal scale-out (i.e., more instances).
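
The other way to attack this, instead of downsizing the task to 1 vCPU, would be to run one Node worker per vCPU inside the container with the built-in cluster module so the second vCPU isn't wasted. A minimal sketch (the ./app module exporting the configured Koa instance is an assumption, not part of the original setup):

```js
// cluster.js -- hedged sketch: fork one worker per available CPU
const cluster = require('cluster');
const os = require('os');

if (cluster.isPrimary) {            // cluster.isMaster on Node < 16
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died, restarting`);
    cluster.fork();                 // keep the worker count stable
  });
} else {
  // Workers share the listening port; the primary distributes incoming connections.
  const app = require('./app');     // hypothetical module exporting a Koa app
  app.listen(3000, () => console.log(`worker ${process.pid} listening`));
}
```

Whether 1-vCPU tasks or clustering fits better depends mostly on how the autoscaling metric is defined; the fix above keeps the per-task model simpler.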

Identify Memory Leakage Issue in NodeJS Code

I am using swagger-express-mw for my REST API application with Express. However, I am observing a continuous memory increase in my production environment.
My servers are Linux, 4 cores and 16 GB, behind an Application Load Balancer (ALB). Currently there are two servers behind the ALB, and memory usage has increased from 2% to 6%. I am not sure whether GC has run on them yet.
Below is a sample snapshot of memory. The app is using approximately 100 MB for the process, but buffers are increasing. Is this a memory leak?
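
One way to tell whether the growth is in the Node process itself or just in the kernel's buffer/page cache (which the "buffers" column in tools like free reports, and which is reclaimable) is to expose the process's own memory figures and watch them over a day. A sketch under that assumption (the route path and wiring are arbitrary):

```js
// memory-stats.js -- hedged sketch: expose process memory metrics on an Express route
const express = require('express');
const v8 = require('v8');

const router = express.Router();

router.get('/internal/memory', (req, res) => {
  res.json({
    process: process.memoryUsage(),   // rss, heapTotal, heapUsed, external (bytes)
    heap: v8.getHeapStatistics(),     // heap size, used size, and the heap size limit
    uptimeSeconds: process.uptime(),
  });
});

module.exports = router;              // app.use(router) in the main Express app
```

If rss and heapUsed stay near the ~100 MB already observed while only the OS-level buffers number grows, it is typically page cache rather than an application leak.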

Gremlin-Server takes too much memory and hangs

I'm using gremlin-server (v3.02) with Titan over HBase, using the default configuration settings. The server has 8 GB of memory and 4 cores.
After a few hours of operation, the server stops responding to query requests.
It must be said that the request intensity on the server is NOT high, pretty much low to medium (a few requests per hour, maybe less than that).
When checking the last Gremlin Server log messages, I see they are about an HBase session timeout and retries to reconnect to HBase.
The server's CPU and memory are at 90-100% at this point.
JDK 1.8.0_45-b14, 64-bit, on Red Hat.
Using jstat -gc I can see all its time is spent in GC, and old gen is at 100%.
I have set "-Xmx8g", but virtual memory in htop goes up to 12 GB; from a few tests with different -Xmx values I see that virtual memory always ends up at about "-Xmx + 4 GB".
jmap -histo gives me about 2 GB of [B (byte[]), with a gig for CacheRelation and a gig for CacheVertex.
After restarting the gremlin-server, everything is back to normal and works again.
Any ideas?
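
The extra ~4 GB above -Xmx is typically metaspace, thread stacks, and direct/NIO buffers, so the htop number by itself is not alarming; the old gen sitting at 100% is. To confirm what is filling it, GC logging plus a heap dump on OOM is usually enough. A sketch of JDK 8 options passed via the JAVA_OPTIONS environment variable that gremlin-server.sh reads (the paths and exact flag set are assumptions, not the project's documented configuration):

```sh
# hedged sketch: GC logging + heap dump on OOM for Gremlin Server (JDK 8 flag syntax)
export JAVA_OPTIONS="-Xms8g -Xmx8g \
  -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/gremlin-gc.log \
  -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/gremlin-heap.hprof"
bin/gremlin-server.sh conf/gremlin-server.yaml
```

With byte[], CacheRelation, and CacheVertex dominating the histogram, the Titan database-level cache settings (e.g. cache.db-cache-size) are the obvious knob to check against the 8 GB heap.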

GC in Server Mode Not Collecting the Memory

An IIS-hosted WCF service is consuming a large amount of memory, around 18 GB, and the server has slowed down.
I analyzed a minidump file and it shows only 1 GB of active objects. I understand the GC is not clearing the memory, and the GC must be running in server mode on a 64-bit system. Any idea why the whole machine is stalling and the app is taking so much memory?
The GC was running in server mode; it was configured that way for better performance. I understand that a GC running in server mode gives a performance improvement because collections are not triggered as frequently while plenty of memory is available, and server mode has a higher limit on memory usage. The problem here was that when the process hit that high limit, the CLR triggered a collection that tried to clear the huge 18 GB of memory in one shot, so it used about 90% of system resources and the rest of the applications lagged.
We tried restarting, but it was taking forever, so we had to kill the process. Now, with workstation-mode GC, things run smooth and clean. The only difference is that response time has some delay due to GC after about 1.5 GB of allocation.
One more note: .NET 4.5 includes a GC revision regarding this which resolves the issue.
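
For reference, the GC flavor is chosen in the runtime section of the application's .NET configuration; a sketch of the switch described above (for an IIS-hosted service, whether this is honored from web.config or has to go into the framework's Aspnet.config is an assumption worth verifying):

```xml
<!-- hedged sketch: switch from server GC to concurrent (workstation) GC -->
<configuration>
  <runtime>
    <gcServer enabled="false" />     <!-- workstation GC instead of server GC -->
    <gcConcurrent enabled="true" />  <!-- background collections to keep pauses short -->
  </runtime>
</configuration>
```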

Resources