How to disable the OOM killer in Linux? [closed]

My current configs are:
> cat /proc/sys/vm/panic_on_oom
0
> cat /proc/sys/vm/oom_kill_allocating_task
0
> cat /proc/sys/vm/overcommit_memory
1
but when I run a task, it's killed anyway.
> ./test/mem.sh
Killed
> dmesg | tail -2
[24281.788131] Memory cgroup out of memory: Kill process 10565 (bash) score 1001 or sacrifice child
[24281.788133] Killed process 10565 (bash) total-vm:12601088kB, anon-rss:5242544kB, file-rss:64kB
Update
My tasks do scientific computing, which consumes a lot of memory; it seems that overcommit_memory=1 may be the best choice.
Update 2
Actually, I'm working on a data analysis project that needs more than 16G of memory, but I was asked to limit it to about 5G. It might be impossible to implement this requirement by optimizing the program itself, because the project uses many sub-commands, and most of them don't have options like Java's Xms or Xmx.
Update 3
My project should be an overcommitted system. Exactly as a3f says, it seems that my apps prefer to crash via xmalloc when a memory allocation fails.
> cat /proc/sys/vm/overcommit_memory
2
> ./test/mem.sh
./test/mem.sh: xmalloc: .././subst.c:3542: cannot allocate 1073741825 bytes (4295237632 bytes allocated)
I don't want to surrender, although so many awful tests have left me exhausted.
So please show me a way to the light ; )

The OOM killer won't go away. If there is no memory, someone's got to pay. What you can do is set a limit after which memory allocations fail.
That's exactly what setting vm.overcommit_memory to 2 achieves.
From the docs:
The Linux kernel supports the following overcommit handling modes
2 - Don't overcommit. The total address space commit for the system
is not permitted to exceed swap + a configurable amount (default is
50%) of physical RAM. Depending on the amount you use, in most
situations this means a process will not be killed while accessing
pages but will receive errors on memory allocation as appropriate.
Normally, the kernel will happily hand out virtual memory (overcommit). Only when you actually reference a page does the kernel have to map it to a real physical frame. If it can't service that request, a process needs to be killed by the OOM killer to make space.
Disabling overcommit means that e.g. malloc(3) will return NULL if the kernel couldn't commit the amount of memory requested. This makes things a bit more predictable, albeit limited (many applications allocate more than they would ever need).
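A minimal sketch of switching to that mode (the ratio value of 80 is illustrative; the kernel default is 50):
> sysctl -w vm.overcommit_memory=2   # strict accounting: allocations beyond the commit limit fail
> sysctl -w vm.overcommit_ratio=80   # commit limit = swap + 80% of physical RAM
> grep -i commit /proc/meminfo       # CommitLimit is the cap, Committed_AS the current commit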

The possible values of oom_adj range from -17 to +15. The higher the score, the more likely the associated process is to be killed by the OOM killer. If oom_adj is set to -17, the process is not considered for OOM-killing.
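For example, to exempt one process (a sketch; the PID is illustrative, and note that modern kernels expose oom_score_adj instead, with a range of -1000 to +1000 where -1000 disables OOM-killing for that process):
> echo -17 > /proc/10565/oom_adj           # legacy interface, range -17..+15
> echo -1000 > /proc/10565/oom_score_adj   # current interface, range -1000..+1000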
But increasing RAM is the better choice; if increasing RAM is not possible, then add swap memory.
To increase swap memory, try this link:
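In short, a minimal sketch of adding a swap file (the path and the 4G size are illustrative; run as root):
> fallocate -l 4G /swapfile          # reserve space (or: dd if=/dev/zero of=/swapfile bs=1M count=4096)
> chmod 600 /swapfile                # swap files must not be world-readable
> mkswap /swapfile                   # write swap metadata
> swapon /swapfile                   # enable it now
> echo '/swapfile none swap sw 0 0' >> /etc/fstab   # persist across reboots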

Related

Swap memory usage [closed]

Our system is consuming swap memory even though the system has available memory. Is that behaviour normal?
Our system is Red Hat 8.6.
[figure: memory usage]
A solution for the memory usage problem:
The Linux 2.6 kernel introduced a new kernel parameter called swappiness, which allows administrators to customize how Linux swaps.
Swappiness is a property of the Linux kernel that changes the balance between swapping out runtime memory and dropping pages from the system page cache. A low value means the kernel will try to avoid swapping as much as possible, while a higher value will make the kernel aggressively try to use swap space.
Since Linux kernel 5.8, swappiness has a value range of 0 to 200; the default value is 60. You can change it temporarily (until your next reboot) with the following command:
echo 42 > /proc/sys/vm/swappiness
If you want to change it permanently, edit the vm.swappiness parameter in the /etc/sysctl.conf file.
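A minimal sketch of both the temporary and the persistent change (the value 42 is illustrative; run as root):
> sysctl -w vm.swappiness=42                      # apply immediately
> echo 'vm.swappiness = 42' >> /etc/sysctl.conf   # persist across reboots
> sysctl -p                                       # reload /etc/sysctl.conf and verify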
It should be noted that the swappiness number does not imply that 60% of memory will be moved into swap. There is a swap algorithm that determines when and how much data is put into swap.
The following formula is provided by Red Hat to determine swap tendency:
swap_tendency = mapped_ratio/2 + distress + vm_swappiness;
You can read more here

Huge difference between htop and ps aux output [closed]

I am running a test on Ubuntu 14.04. When I check CPU usage using
'ps aux | grep service', the CPU usage of the process is 0.1, while in htop the CPU% for the same process is 12.3.
Can anyone tell me the reason? Or which value should I consider the right one?
Thanks
They are measuring different things.
From the ps man-page:
CPU usage is currently expressed as the percentage of time spent
running during the entire lifetime of a process. This is not ideal,
and it does not conform to the standards that ps otherwise conforms to.
CPU usage is unlikely to add up to exactly 100%.
From the htop man-page (I am the author of htop):
PERCENT_CPU (CPU%)
The percentage of the CPU time that the process is currently
using.
So, in htop this is the percentage of total CPU time used by the program between the last refresh of the screen and now.
PercentageInHtop = (non-idle CPU time used by process during the last 1.5s) / 1.5s
In ps this is the percentage of CPU time used by the program relative to the total time it exists (ie, since it was launched).
PercentageInPs = (non-idle CPU time used by process since process startup) / (time elapsed since process startup)
That is, in your reading it means that htop is saying that the service is taking 12.3% of your CPU now, while ps is saying that your service has spent 99.9% of its total life idle.
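You can reproduce the ps-style number by hand from /proc (a rough sketch; the PID is illustrative, field numbers follow proc(5), and it assumes the process name contains no spaces):
pid=10565
hz=$(getconf CLK_TCK)                  # clock ticks per second
read -r uptime _ < /proc/uptime        # seconds since boot
stat=($(cat /proc/$pid/stat))          # naive split; breaks if the comm field has spaces
utime=${stat[13]}; stime=${stat[14]}   # fields 14 and 15: user/system CPU ticks
starttime=${stat[21]}                  # field 22: process start time, in ticks after boot
echo "ps-style %CPU:" \
  $(echo "100 * (($utime + $stime) / $hz) / ($uptime - $starttime / $hz)" | bc -l)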

Does Linux have a page file? [closed]

I found in several places that Linux uses pages and a paging mechanism, but I didn't find where this file is or how to configure it.
All the information I found is about the Linux swap file / partition. There is a difference between paging and swapping:
Paging moves individual pages (small fixed-size blocks of data, usually 4 KB, though the size can vary between OSes) from main memory to backing storage, and happens continually as a normal function of the operating system.
Swapping moves an entire process to storage, and happens when the system is memory-stressed or, on Windows 8, when a new application is hibernating.
Does Linux use its swap file / partition for both cases?
If so, how can I see how many pages are currently paged out? This information is not in the vmstat, free, or swapon output (or I fail to see it).
Or is there another file used for paging?
If so, how can I configure it (and watch its usage)?
Or perhaps Linux does not use paging at all and I was misled?
I would appreciate answers specific to Red Hat Enterprise Linux versions 6 and 7, but a general answer covering all Linux systems is also welcome.
Thanks in advance.
On Linux, the swap partition(s) are used for paging.
Linux does not respond to memory pressure by swapping out whole processes. The virtual memory system does demand paging, page by page. Under extreme memory pressure, one or more processes will be killed by the OOM killer. (There are some useful links to documentation in the first NOTE in man malloc)
There is a line in the top header which shows swap partition usage, but if that is all the information you want, use
swapon -s
See man swapon for more information.
The swap partition usage is not the same as the number of unmapped pages. A page might be memory-mapped to a file using the mmap call; since that page has backing store in the file, there is no need to also write it to a swap partition, and the system won't use swap space for that. But swap partition usage is a pretty good indicator.
Also note that Linux (unlike Windows) does not allocate swap space for pages when they are allocated. Instead, it adds the new page to the virtual memory map without any backing store, and allocates the swap space only when the page needs to be swapped out. The consequence (as described in the malloc manpage referenced earlier) is that a malloc call may succeed in allocating virtual memory, but a subsequent attempt to use that virtual memory may fail.
Although Linux retains the term 'swap partition' as a historical relic, it actually performs paging. So your expectation is borne out; you were just thrown by the archaic terminology.
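To answer the "how many pages are currently paged out" part concretely, a few standard commands (nothing here is configuration-specific):
> swapon -s                              # per-device swap usage
> grep -E '^pswp(in|out) ' /proc/vmstat  # cumulative pages swapped in/out since boot
> vmstat 1 5                             # si/so columns show ongoing swap-in/swap-out activity
> grep -i swap /proc/meminfo             # SwapTotal / SwapFree / SwapCached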

tcl script aborts : unable to realloc xxx bytes [closed]

My Tcl script aborts, saying that it is unable to realloc 2191392 bytes. This happens when the script runs for a longer duration, say, more than 10 hours. My Tcl script connects to devices using telnet and ssh connections and executes/verifies some command outputs on the devices. The Linux machine has enough RAM (32 GB), and ulimit is unlimited for process, data, and file size. My script's process doesn't eat much memory; the worst case is under 1 GB. I just wonder why memory allocation fails with plenty of RAM available.
That message is an indication that an underlying malloc() call returned NULL (in a non-recoverable location), and given that it's not for very much, it's an indication that the system is thoroughly unable to allocate much memory. Depending on how your system is configured (32-bit vs. 64-bit; check what parray tcl_platform prints to find out) that could be an artefact of a few things, but if you think that it shouldn't be using more than a gigabyte of memory, it's an indication of a memory leak.
Unfortunately, it's hard to chase down memory leaks in general. Tcl's built-in memory command (enabled via configure --enable-symbols=mem at build time) can help, as can a tool like Electric Fence, but they are imperfect and can't generally tell you where you're getting things wrong (as you'll be looking for the absence of something to release memory). At the Tcl level, see whether each of the variables listed by info globals is of sensible size or whether there's a growing number of globals. You'll want to use tools like string length, array exists, array size and array names for this.
It's also possible for memory allocation to fail due to another process consuming so much memory that the OS starts to feel highly constrained. I hope this isn't happening in your case, since it's much harder to prevent.
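Before reaching for those tools, a cheap first check is to sample the process's footprint over time and confirm the growth is real (a sketch; the PID and interval are illustrative):
> while true; do date; ps -o rss=,vsz= -p 10565; sleep 300; done >> /tmp/tclsh_mem.log
If RSS climbs steadily across samples, the leak is in your process; if it stays flat, suspect the rest of the system as described above.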

How do I find which process is leaking memory? [closed]

I have a system (Ubuntu) with many processes, and one (or more) has a memory leak. Is there a good way to find the process that has the leak? Some of the processes are JVMs, some are not. Some are home-grown, some are open source.
You can run the top command (to run non-interactively, type top -b -n 1). To see applications which are leaking memory, look at the following columns:
RPRVT - resident private address space size
RSHRD - resident shared address space size
RSIZE - resident memory size
VPRVT - private address space size
VSIZE - total memory size
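Note that those column names come from the BSD/macOS variant of top; Linux's procps top labels them VIRT, RES, SHR, and %MEM instead. A quick non-interactive equivalent (a sketch):
> ps aux --sort=-rss | head -n 11    # the ten largest processes by resident memory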
If the program leaks over a long time, top might not be practical. I would write a simple shell script that appends the result of "ps aux" to a file every X seconds, depending on how long it takes to leak a significant amount of memory. Something like:
while true
do
    echo "---------------------------------" >> /tmp/mem_usage
    date >> /tmp/mem_usage
    ps aux >> /tmp/mem_usage
    sleep 60    # sample interval; adjust to how slowly the leak grows
done
I suggest using htop as a better alternative to top.
In addition to top, you can use System Monitor (System - Administration - System Monitor, then select the Processes tab). Select View - All Processes, go to Edit - Preferences and enable the Virtual Memory column. Sort either by this column or by the Memory column.
If you can't do it deductively, consider the Signal Flare debugging pattern: Increase the amount of memory allocated by one process by a factor of ten. Then run your program.
If the amount of the memory leaked is the same, that process was not the source of the leak; restore the process and make the same modification to the next process.
When you hit the process that is responsible, you'll see the size of your memory leak jump (the "signal flare"). You can narrow it down still further by selectively increasing the allocation size of separate statements within this process.
Difficult task. I would normally suggest grabbing a debugger/memory profiler like Valgrind and running the programs one after another in it. Sooner or later you will find the program that leaks, and you can tell the developer or fix it yourself.
As suggested, the way to go is Valgrind. It's a profiler that checks many aspects of the runtime performance of your application, including memory usage.
Running your application through Valgrind will allow you to verify whether you forgot to release memory allocated with malloc, whether you freed the same memory twice, etc.
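A typical invocation looks like this (the program name is illustrative; --show-leak-kinds requires a reasonably recent Valgrind):
> valgrind --leak-check=full --show-leak-kinds=all ./myprog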
