Tried posting on GitLab's forum and had no luck, so I thought I'd try here.
We've been using GitLab 10 CE for a few months now. We are a pretty small shop with only 5 developers, so our instance of GitLab is busy but not crazy by any stretch of the imagination, yet we are constantly running into memory problems. It is a virtual machine running on Ubuntu 16.04. I initially began with the recommended 1 core and 4GB of memory, and we were constantly being alerted about memory and CPU issues. I upped the specs to 2 cores and 8GB of memory. Same issue. I've now pushed the box to 8 cores and 32GB of memory and I am still constantly being alerted about memory issues (although the CPU has died down quite a bit). As of the time of this message, we've received 20 memory alerts in the last 5 hours. The alerts even come in through the night, when no one is touching the system.
When I run htop, there are 28 processes called "sidekiq 5.0.4 gitlab-rails [0 of 25 busy]", each of which claims to be using 2% of our overall memory. That is over 16GB worth! Under that there's a whole host of unicorn workers using 1.8% of our overall memory each.
We're pretty new to using GitLab so there could easily be something I'm just missing. Any advice on how to throttle the number of processes for each of these, or to throttle GitLab's overall memory consumption, would be awesome. Thanks!
I'd bet you are seeing threads, not processes, in htop. Press Shift-H to hide userland threads so that only actual processes are listed. Those 28 lines are all threads of the same Sidekiq process, and they all share the same 2% of memory rather than each using 2% on their own.
Make sure you are keeping up to date with GitLab versions; they fix bugs and optimize their code all the time.
Related
I have a problem with increasing kernel CPU usage on a web server I am running. On a 6-core CPU, the kernel usage climbs from 5% to 50% over roughly 8 hours.
I have noticed the climb is faster when there are more active users on the site, and I don't see the problem in development, so I don't have any code that reproduces it. I am hoping for some advice on how to troubleshoot this, though: what should I investigate to figure out what the problem is?
"pm2 restart" will take the cpu usage down so this is what I need to do every 8 hours or so. I have also noticed increasing cpu usage of systemd-resolved up to some 50% in 8 hours but restarting it with "systemctl restart systemd-resolved" will not help.
I am running Ubuntu 20.04, Node v12.19.0, Next 9.5.3, Express, express-session, express-socket.io.session, MongoDB, etc. I have had this problem on older versions of all of these as well.
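Not an answer to the root cause, but until it is found, the manual "pm2 restart" described above can at least be automated. A minimal ecosystem.config.js sketch (the app name and script path are placeholders; cron_restart and max_memory_restart are standard pm2 options):

```js
// ecosystem.config.js -- a minimal sketch; "web" and server.js are placeholders
module.exports = {
  apps: [
    {
      name: 'web',
      script: './server.js',
      // restart every 8 hours, roughly matching the manual cadence above
      cron_restart: '0 */8 * * *',
      // also restart if the process itself grows past this limit
      max_memory_restart: '500M',
    },
  ],
};
```

Start it with "pm2 start ecosystem.config.js" and pm2 will handle the periodic restarts itself while you keep investigating.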
I'm currently prototyping a very lightweight TCP server based on a custom protocol. It's written in C++ and uses Boost Asio for cross-platform sockets. When I monitor the process on Windows it uses less than 3MB of memory and barely grows with many concurrent connections (I tested up to 8).
I built the same server for Linux and put it on a VPS with 128MB of RAM plus 64MB of swap for testing. It runs fine and my tests pass, but the process gets killed in the middle of the night by the kernel. I checked the logs and it was out of memory (the OOM score was 0).
I highly doubt my process has memory leaks. I checked my server logs and only one person connected to it the previous night, which should not result in an OOM. The process sleeps for the majority of the time, since it only does work when Boost's async handler wakes up the main thread to process a packet.
What I did notice is that the default virtual memory allocation for the process is a whopping 89MB (according to top), and as soon as I make a connection it jumps to about 151MB. My VPS has about 100MB of free RAM and all 64MB of swap while the server is running, so the only thing I could think of is that the process tried to allocate more virtual memory, went over the ~164MB remaining, exceeded the physical limit, and triggered the OOM killer.
I've since used the ulimit command to limit the VM allocation to 30MB and it seems to be working fine, but I'll have to wait a while to see if it actually helps the issue.
My question is: how does Linux determine how much virtual memory to allocate for a process? Is there a compiler/linker setting I can use to reduce the default VM reservation? Is my reasoning correct, or are there other reasons for the OOM?
I'm running Node.js on a server with only 512MB of RAM. The problem is that when I run a script, it gets killed because the machine runs out of memory.
By default the Node.js memory limit is 512MB. So I think using --max-old-space-size is useless.
Here is the relevant content of /var/log/syslog:
Oct 7 09:24:42 ubuntu-user kernel: [72604.230204] Out of memory: Kill process 6422 (node) score 774 or sacrifice child
Oct 7 09:24:42 ubuntu-user kernel: [72604.230351] Killed process 6422 (node) total-vm:1575132kB, anon-rss:396268kB, file-rss:0kB
Is there a way to avoid the out-of-memory kills without upgrading the memory (for example, by using persistent storage as additional RAM)?
Update:
It's a scraper which uses the node modules request and cheerio. When it runs, it opens hundreds or thousands of webpages (but not in parallel).
If you're giving Node access to every last megabyte of the available 512 and it's still not enough, then there are two ways forward:
Reduce the memory requirements of your program. This may or may not be possible. If you want help with this, you should post another question detailing your functionality and memory usage. (One rough approach for a scraper like yours is sketched after this list.)
Get more memory for your server. 512MB is not much, especially if you're running other services (such as databases or message queues) which require in-memory storage.
There is a third possibility of using swap space (disk storage that acts as an overflow area for RAM), but this will have a strong impact on performance. If you still want it, Google how to set this up for your operating system; there are plenty of articles on the topic. This is OS configuration, not something you set in Node.
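On the first point, for the scraper described in the update the key is to hold only one page's HTML in memory at a time and let it go out of scope before fetching the next. A rough sketch using the same request and cheerio modules (the URL list and the selector are placeholders, not the asker's actual code):

```js
// scrape-sequential.js -- a rough sketch, not the asker's actual scraper
const request = require('request');
const cheerio = require('cheerio');

// Placeholder URLs; in the real scraper this would be the full list.
const urls = ['http://example.com/page1', 'http://example.com/page2'];

function fetch(url) {
  return new Promise((resolve, reject) => {
    request(url, (err, res, body) => (err ? reject(err) : resolve(body)));
  });
}

async function run() {
  for (const url of urls) {
    const body = await fetch(url);        // only one page's HTML in memory
    const $ = cheerio.load(body);
    console.log(url, $('title').text());  // extract what you need, then drop it
    // body and $ go out of scope here, so the GC can reclaim them
  }
}

run().catch(console.error);
```

Processing pages strictly one at a time keeps peak memory close to the size of the largest single page rather than the whole crawl.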
Old question, but maybe this answer will help people. Using --max-old-space-size is not useless.
Before Node.js 12, the default heap size depended on the architecture (32-bit or 64-bit). According to the documentation, on 64-bit machines the default (for the old generation alone) was 1400 MB, far beyond your 512MB.
Since Node.js 12, the default heap size is based on the available system RAM; however, Node's heap isn't the only thing in memory, especially if your server isn't dedicated to it. Setting --max-old-space-size puts an explicit cap on the old-generation heap, and as your application approaches that limit, the garbage collector is triggered and tries to free memory.
I've written a post about how I observed this: https://loadteststories.com/nodejs-kubernetes-an-oom-serial-killer-story/
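To see what cap is actually in effect on a given Node version, a quick check (a minimal sketch; the file name is just an example) is to read V8's heap statistics at startup:

```js
// heap-limit.js -- a minimal sketch; the file name is just an example
const v8 = require('v8');

// heap_size_limit is the ceiling V8 will grow the heap to; it reflects
// --max-old-space-size plus room for the young generation and friends.
const limitMb = v8.getHeapStatistics().heap_size_limit / (1024 * 1024);
console.log(`V8 heap size limit: ${limitMb.toFixed(0)} MB`);
```

Running node --max-old-space-size=400 heap-limit.js should print a limit in the neighbourhood of 400 MB, comfortably under the 512 MB of RAM on the server in the question.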
We are running Cassandra version 2.0.9 in production on a 4-node cluster. For the past few days we have been experiencing a high spike in CPU utilisation.
(Screenshot: jconsole output showing the CPU spike.)
When we looked into the threads that are eating a lot of CPU, we found that the Native Transport request threads are consuming a lot of it (around 12% each), which is huge.
(Screenshots: thread stack trace, threads info, and per-thread CPU%.)
What could the problem be, and how should we go about debugging it?
Why are the majority of the Native Transport requests stuck in BCrypt.java? Is this the problem?
The cluster was behaving normally a few days ago, but now 3 of the 4 nodes are constantly at high CPU utilisation.
You have authentication enabled, which stores a bcrypt hash rather than the password itself, so each authentication attempt has to be checked against that hash. This ends up being a CPU issue if you are continually creating new connections instead of reusing an authenticated session. Sessions are long-lived objects and should be persistent by default (https://github.com/datastax/php-driver/tree/master/features#persistent-sessions), but if you are using CGI or something else that constantly creates new processes, you will still have issues. Maybe try php-fpm?
I've been working on a series of automatic load-testing scripts, and I've noticed that when averaged out, there's no difference between running a cluster of 2 processes and 4 processes on a Heroku dyno (in this case, a Hapi.js server which just immediately returns a reply), despite the dyno reporting itself as having four available CPUs. The difference between 1 and 2 processes is huge, nearly a 100% increase in throughput.
My guess is that Intel hyperthreading is reporting twice as many cores as are physically available, and Node doesn't really gain anything from the extra scheduling, but there seems to be very little information available about the specs of Heroku dynos. Is this accurate, or is there another reason performance caps out at 2 processes on a server doing no I/O?
This happens for several reasons:
Heroku dynos run on shared EC2 servers -- this means the CPU is being split between you and some number of other users.
Depending on how much CPU is utilized by your neighbors, you might have better / worse performance.
Your CPU is going to be your biggest bottleneck on Heroku (and with Node in general).
If you're doing CPU-intensive stuff, you'll need to scale horizontally across dynos. If you're doing IO-intensive stuff, you should be fine vertically scaling to larger dynos over time =)
UPDATE: To add more info here, this is just the way virtualization works. EC2 boxes (and Linux VMs in general) will report the total number of CPUs of the underlying host machine, not the VM. Hope that helps!
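In other words, os.cpus().length on a dyno reflects the underlying host, not what your app can actually use. A minimal cluster sketch that sizes the worker pool from an environment variable instead (plain http stands in for the Hapi server in the question, and WEB_CONCURRENCY is an assumed/conventional variable, not something from the original post):

```js
// cluster-server.js -- a minimal sketch, not the Hapi server from the question
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// os.cpus().length reports the host's logical CPUs, which on a shared
// dyno/VM can be far more than you can actually use. Prefer an explicit
// setting; WEB_CONCURRENCY is an assumed environment variable here.
const workers = parseInt(process.env.WEB_CONCURRENCY, 10) || 2;

if (cluster.isMaster) {
  console.log(`reported CPUs: ${os.cpus().length}, forking ${workers} workers`);
  for (let i = 0; i < workers; i++) {
    cluster.fork();
  }
} else {
  // Each worker runs its own server; the workers share the listening port.
  http.createServer((req, res) => {
    res.end('ok');
  }).listen(process.env.PORT || 3000);
}
```

Picking the worker count by load-testing, as the asker did, is more reliable than trusting the reported core count.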