What is the largest buffer size for ioctl? - rhel

I am using ioctl() to read data from a block device (SCSI).
I have noticed that when I read 1024 sectors, ioctl() finishes without a problem. When I read 2048, after a few long moments it returns ENOMEM (errno = 12), which is not even listed among the possible errors (see http://man7.org/linux/man-pages/man2/ioctl.2.html).
I have triple-checked that I am passing the proper buffer size, so that cannot be the cause - there is no buffer overrun.
How can I find out the largest buffer size that can be read using ioctl, then?
Edit 1
Some additional information that may be helpful:
Enterprise Linux Enterprise Linux Server release 5.3 (Carthage)
Red Hat Enterprise Linux Server release 5.3 (Tikanga)
2.6.18-128.el5

The error is caused by the default behaviour of Linux memory management, which is "overcommitting". This means that the kernel claims to allocate memory successfully, but doesn't actually allocate it until later, when you try to access it. If the kernel finds out that it has allocated too much memory, it kills a process with the OOM (Out Of Memory) killer to free some up. The way it picks the process to kill is complicated, but if you have just allocated most of the memory in the system, it's probably going to be your process that gets the bullet.
To change this behaviour, try the following:
echo "2" > /proc/sys/vm/overcommit_memory
On Linux 2.6 and later, it is possible to modify the kernel's behavior so that it will not "overcommit" memory. Although this setting will not prevent the OOM killer from being invoked altogether, it will lower the chances significantly and will therefore lead to more robust system behavior. This is done by selecting strict overcommit mode via sysctl:
sysctl -w vm.overcommit_memory=2
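Independent of the overcommit setting, a practical workaround is to avoid asking the kernel for one huge transfer in the first place. Below is a minimal sketch under stated assumptions: the asker's exact ioctl is not shown, so it is swapped for plain pread(), and the device path and 1024-sector chunk size are placeholders. It queries the device size with the standard BLKGETSIZE64 ioctl and then reads in bounded chunks (compile with -D_FILE_OFFSET_BITS=64 on 32-bit systems).

/* Minimal sketch: query device size with BLKGETSIZE64, then read in
 * bounded 1024-sector chunks with pread() instead of one huge request.
 * Device path and chunk size are placeholders. */
#include <fcntl.h>
#include <linux/fs.h>          /* BLKGETSIZE64 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

#define SECTOR     512
#define CHUNK_SECT 1024        /* 512 KiB per request - adjust as needed */

int main(int argc, char **argv)
{
    const char *dev = (argc > 1) ? argv[1] : "/dev/sda";    /* placeholder device */
    int fd = open(dev, O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    uint64_t bytes = 0;
    if (ioctl(fd, BLKGETSIZE64, &bytes) < 0) { perror("BLKGETSIZE64"); return 1; }
    printf("%s: %llu sectors\n", dev, (unsigned long long)(bytes / SECTOR));

    char *buf = malloc((size_t)CHUNK_SECT * SECTOR);
    if (!buf) { perror("malloc"); return 1; }

    for (uint64_t off = 0; off < bytes; off += (uint64_t)CHUNK_SECT * SECTOR) {
        ssize_t n = pread(fd, buf, (size_t)CHUNK_SECT * SECTOR, (off_t)off);
        if (n <= 0) { perror("pread"); break; }
        /* ... process n bytes here ... */
    }

    free(buf);
    close(fd);
    return 0;
}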

Related

Why does Linux disable the disk write buffer when system RAM is greater than 8GB?

Background:
I was trying to set up an Ubuntu machine on my desktop computer. The whole process took a whole day, including installing the OS and software. I didn't think much of it, though.
Then I tried doing my work on the new machine, and it was significantly slower than my laptop, which is very strange.
I ran iotop and found that disk traffic when decompressing a package was around 1-2 MB/s, which is definitely abnormal.
Then, after hours of research, I found this article that describes exactly the same problem and provides an ugly solution:
We recently had a major performance issue on some systems, where disk write speed is extremely slow (~1 MB/s - where normal performance is 150+ MB/s).
...
EDIT: to solve this, either remove enough RAM, or add "mem=8G" as a kernel boot parameter (e.g. in /etc/default/grub on Ubuntu - don't forget to run update-grub!)
I also looked at this post
https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/
and did
cat /proc/vmstat | egrep "dirty|writeback"
output is:
nr_dirty 10
nr_writeback 0
nr_writeback_temp 0
nr_dirty_threshold 0 // and here
nr_dirty_background_threshold 0 // here
Those two threshold values were 8223 and 4111 when mem=8g was set.
So it's basically showing that when system memory is greater than 8GB (32GB in my case), regardless of the vm.dirty_background_ratio and vm.dirty_ratio settings (5% and 10% in my case), the actual dirty thresholds go to 0 and the write buffer is disabled?
Why is this happening?
Is this a bug in the kernel or somewhere else?
Is there a solution other than unplugging RAM or using "mem=8g"?
UPDATE: I'm running the 3.13.0-53-generic kernel with Ubuntu 12.04 32-bit, so it's possible that this only happens on 32-bit systems.
If you use a 32 bit kernel with more than 2G of RAM, you are running in a sub-optimal configuration where significant tradeoffs must be made. This is because in these configurations, the kernel can no longer map all of physical memory at once.
As the amount of physical memory increases beyond this point, the tradeoffs become worse and worse, because the struct page array that is used to manage all physical memory must be kept mapped at all times, and that array grows with physical memory.
The physical memory that isn't directly mapped by the kernel is called "highmem", and by default the writeback code treats highmem as undirtyable. This is what results in your zero values for the dirty thresholds.
You can change this by setting /proc/sys/vm/highmem_is_dirtyable to 1, but with that much memory you will be far better off if you install a 64-bit kernel instead.
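To verify the effect without rebooting, a small sketch like the following (assuming root privileges; it only touches the /proc files already mentioned) enables highmem_is_dirtyable and re-reads the thresholds from /proc/vmstat:

/* Sketch: enable vm.highmem_is_dirtyable, then re-read the dirty/writeback
 * counters from /proc/vmstat to check that the thresholds are non-zero.
 * Assumes root privileges. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/sys/vm/highmem_is_dirtyable", "w");
    if (!f) { perror("highmem_is_dirtyable"); return 1; }
    fputs("1\n", f);
    fclose(f);

    f = fopen("/proc/vmstat", "r");
    if (!f) { perror("/proc/vmstat"); return 1; }
    char line[256];
    while (fgets(line, sizeof line, f))
        if (strstr(line, "dirty") || strstr(line, "writeback"))
            fputs(line, stdout);        /* thresholds should now be non-zero */
    fclose(f);
    return 0;
}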
Is this a bug in the kernel
According to the article you quoted, this is a bug, which did not exist in earlier kernels, and is fixed in more recent kernels.
Note that this issue seems to be fixed in later releases (3.5.0+) and is a regression (doesn’t happen on e.g. 2.6.32)

fork() failing with Out of memory error

The parent process fails with errno = 12 (Out of memory) when it tries to fork a child. The parent process runs on a Linux 3.0 kernel - SLES 11. At the point of forking the child, the parent process has already used up around 70% of the RAM (180GB of 256GB). Is there any workaround for this problem?
The application is written in C++, compiled with g++ 4.6.3.
Maybe virtual memory overcommit is disabled on your system.
If it is disabled, then committed virtual memory cannot be bigger than the size of physical RAM plus swap. If it is allowed, then virtual memory can be bigger than RAM + swap.
When your process forks, your processes (parent and child) would have 2 * 180GB of virtual memory (which is too much if you don't have swap).
So, allow overcommit this way:
echo 1 > /proc/sys/vm/overcommit_memory
This should help if the child process execves immediately, or frees the allocated memory before the parent writes too much to its own memory. But be careful: the out-of-memory killer may act if both processes keep using all the memory.
The proc(5) man page says:
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
In mode 0, calls of mmap(2) with MAP_NORESERVE are not checked, and the default check is very weak, leading to the risk of getting a process "OOM-killed". Under Linux 2.4 any nonzero value implies mode 1. In mode 2 (available since Linux 2.6), the total virtual address space on the system is limited to (SS + RAM*(r/100)), where SS is the size of the swap space, RAM is the size of the physical memory, and r is the contents of the file /proc/sys/vm/overcommit_ratio.
More information here: Overcommit Memory in SLES
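To make that concrete with the numbers from the question (a rough worked example, assuming little or no swap): the parent already has about 180 GB committed, and fork() must account for the same writable pages again for the child, so strict accounting would need to cover roughly

180 GB (parent) + 180 GB (child) = 360 GB > 256 GB RAM + swap

which is why the fork is refused with ENOMEM unless overcommit is allowed (mode 1) or enough swap is added.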
Forking requires resources, since the writable pages of the process are duplicated copy-on-write. Read the fork(2) man page again.
You could at least provide a huge temporary swap file. You could create (on some file system with enough space) a huge file $SWAPFILE with
dd if=/dev/zero of=$SWAPFILE bs=1M count=256000
mkswap $SWAPFILE
swapon $SWAPFILE
Otherwise, you could for instance design your program differently, e.g. mmap-ing some big file (and munmap-ing it just before the fork, then mmap-ing it again afterwards), or, more simply, starting a popen-ed shell (or a p2open-ed one, or explicitly making the pipes to and from it; a multiplexing call à la poll would probably also be useful) at the beginning of your program and later issuing commands to it, as in the sketch below.
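A minimal sketch of that last idea, assuming the external work can be expressed as shell commands (the command strings are placeholders): the shell is popen-ed while the process is still small, so no huge address space has to be duplicated later.

/* Sketch: popen a shell early (while the process is still small) and feed it
 * commands later, instead of fork()-ing from a 180 GB process.
 * Command strings are placeholders. */
#include <stdio.h>

static FILE *cmd_shell;                     /* opened once, early in main() */

static int start_shell(void)
{
    cmd_shell = popen("/bin/sh", "w");      /* the fork happens now, cheaply */
    return cmd_shell ? 0 : -1;
}

static int run_command(const char *cmd)     /* e.g. run_command("gzip big.log") */
{
    if (!cmd_shell) return -1;
    fprintf(cmd_shell, "%s\n", cmd);
    return fflush(cmd_shell);               /* the shell executes it for us */
}

int main(void)
{
    if (start_shell() != 0) { perror("popen"); return 1; }
    /* ... allocate and fill the huge data structures here ... */
    run_command("echo hello from the helper shell");
    if (cmd_shell) pclose(cmd_shell);       /* waits for the shell to exit */
    return 0;
}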
Maybe we could help more if we had an idea of what your program is doing, why it consumes so much memory, and why and what it is forking...
Read Advanced Linux Programming for more.
PS.
If you fork just to run gdb to show the backtrace, consider simpler alternatives like recent GCC's libbacktrace or Wolf's libbacktrace...
A nicer solution on Linux would be to use vfork or posix_spawn (as it will try to use vfork if possible): vfork "creates new processes without copying the page tables of the parent process", so it will work even if your application uses more than 50% of RAM available.
Note that std::system and QProcess::execute also use fork under the hood; there is even a ticket about this problem in the Qt framework: https://bugreports.qt.io/browse/QTBUG-17331
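A minimal sketch of the posix_spawn alternative (the command being run is just a placeholder):

/* Sketch: start an external command with posix_spawn() instead of fork()+exec().
 * As noted above, it tries to use vfork where possible, so the parent's huge
 * address space is not duplicated. The command is a placeholder. */
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

extern char **environ;

int main(void)
{
    pid_t pid;
    char *child_argv[] = { "sh", "-c", "echo spawned without copying the parent", NULL };

    int err = posix_spawn(&pid, "/bin/sh", NULL, NULL, child_argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawn: %s\n", strerror(err));
        return 1;
    }

    int status;
    waitpid(pid, &status, 0);               /* reap the child as usual */
    return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}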

Why is the process getting killed at 4GB?

I have written a program which works on a huge set of data. My CPU and OS (Ubuntu) are both 64-bit and I have 4GB of RAM. Using "top" (the %MEM field), I saw that the process's memory consumption went up to around 87%, i.e. 3.4+ GB, and then it got killed.
I then checked how much memory a process is allowed to use with "ulimit -m", which comes out as "unlimited".
Now, since both the OS and the CPU are 64-bit and a swap partition also exists, the OS should have used virtual memory, i.e. [ >3.4GB + yGB from swap space ] in total, and the process should only have been killed if it required even more memory.
So I have the following questions:
How much physical memory can a process theoretically access on a 64-bit machine? My answer is 2^48 bytes.
If less than 2^48 bytes of physical memory exist, then the OS should use virtual memory, correct?
If the answer to the above question is YES, then the OS should have used swap space as well; why did it kill the process without even using it? I don't think we have to use any specific system calls while coding our program to make this happen.
Please suggest.
It's not only the data size that could be the reason. For example, run ulimit -a and check the max stack size. Have you got a kill reason? Set 'ulimit -c 20000' to get a core file; it will show you the reason when you examine it with gdb.
Check with file and ldd that your executable is indeed 64 bits.
Check also the resource limits. From inside the process, you could use getrlimit system call (and setrlimit to change them, when possible). From a bash shell, try ulimit -a. From a zsh shell try limit.
Check also that your process indeed eats the memory you believe it does consume. If its pid is 1234 you could try pmap 1234. From inside the process you could read the /proc/self/maps or /proc/1234/maps (which you can read from a terminal). There is also the /proc/self/smaps or /proc/1234/smaps and /proc/self/status or /proc/1234/status and other files inside your /proc/self/ ...
Check with free that you got the memory (and the swap space) you believe you have. You can add some temporary swap space with swapon /tmp/someswapfile (and use mkswap to initialize it first).
I was routinely able, a few months (and a couple of years) ago, to run a 7Gb process (a huge cc1 compilation), under Gnu/Linux/Debian/Sid/AMD64, on a machine with 8Gb RAM.
And you could try a tiny test program which e.g. allocates several memory chunks of e.g. 32MB each with malloc. Don't forget to write some bytes inside them (at least one per megabyte).
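A tiny test program in that spirit might look like this (the chunk size and count are arbitrary):

/* Tiny test: allocate chunks of 32 MB and touch one byte per megabyte,
 * printing progress, to see how far we get before malloc fails or the
 * OOM killer steps in. Chunk count and size are arbitrary. */
#include <stdio.h>
#include <stdlib.h>

#define CHUNK  (32u * 1024 * 1024)      /* 32 MB */
#define NCHUNK 256                      /* 256 * 32 MB = 8 GB attempted */

int main(void)
{
    for (int i = 0; i < NCHUNK; i++) {
        char *p = malloc(CHUNK);
        if (!p) { printf("malloc failed after %d chunks\n", i); return 1; }
        for (size_t off = 0; off < CHUNK; off += 1024 * 1024)
            p[off] = 1;                 /* actually touch the memory */
        printf("allocated and touched %d MB\n", (i + 1) * 32);
    }
    return 0;
}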
Standard C++ containers like std::map or std::vector are rumored to consume more memory than we usually think.
Buy more RAM if needed. It is quite cheap these days.
Literally EVERYTHING that can be addressed has to fit into the address space, including your graphics adapters, the OS kernel, the BIOS, etc., and the amount that can be addressed can't be extended by swap either.
Also worth noting that the process itself needs to be 64-bit too. And some operating systems may become unstable, and therefore kill the process, if you use excessive RAM with it.

What happens if memory leaks on rootfs?

I have a Linux system running entirely on rootfs (which, as I understand it, is an instance of ramfs). There is no hard disk and no swap. And I have a process that leaks memory continuously. The virtual memory eventually grows to 4 times the size of physical memory, as shown with top. I can't understand what's happening. rootfs is supposed to take RAM only, right? If I have no disk to swap to, how does the virtual memory grow to 4 times the physical memory?
Not all allocated memory has to be backed by a block device; the glibc people consider this behavior a bug:
BUGS
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files vm/overcommit-accounting and sysctl/vm.txt.
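A small sketch of the behaviour described above (the 8 GB figure is arbitrary - pick something well beyond your RAM; assumes a 64-bit build): malloc hands out far more address space than the machine has, VIRT grows in top, and trouble only starts once the pages are actually written.

/* Sketch: with overcommit (the default), malloc of far more than physical RAM
 * succeeds; only writing the pages makes them real, and that is when the OOM
 * killer may strike. 8 GB is arbitrary; assumes a 64-bit build. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t total = 8ULL * 1024 * 1024 * 1024;   /* far beyond a small rootfs box */
    char *p = malloc(total);
    if (p == NULL) { puts("malloc failed up front (overcommit disabled?)"); return 1; }
    puts("malloc succeeded - nothing is resident yet, but VIRT has grown in top");

    /* Uncomment to actually back the pages and, on a box with no swap,
     * eventually wake the OOM killer:
     * memset(p, 1, total);
     */
    free(p);
    return 0;
}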

Allocating memory for a process in Linux

Dear all, I am using Red Hat Linux. How do I set a maximum amount of memory for a particular process? For example, I need to limit the maximum memory usage of Eclipse alone. Is it possible to do this? Please give me some solutions.
ulimit -v 102400
eclipse
...gives eclipse 100MiB of virtual address space.
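Under the hood, ulimit -v sets the RLIMIT_AS resource limit before the program starts; a hand-rolled launcher doing the same thing might look like this sketch (the 100 MiB cap and the launched command are placeholders):

/* Sketch of a tiny launcher: cap the virtual address space with
 * setrlimit(RLIMIT_AS), then exec the target program - roughly what
 * "ulimit -v 102400; eclipse" does from the shell.
 * The 100 MiB cap and the launched command are placeholders. */
#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct rlimit rl = { 100UL * 1024 * 1024, 100UL * 1024 * 1024 };  /* 100 MiB */
    if (setrlimit(RLIMIT_AS, &rl) != 0) { perror("setrlimit"); return 1; }

    if (argc < 2) { fprintf(stderr, "usage: %s program [args...]\n", argv[0]); return 1; }
    execvp(argv[1], &argv[1]);              /* e.g. ./launcher eclipse */
    perror("execvp");                       /* only reached if exec fails */
    return 1;
}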
You can't control memory usage; you can only control virtual memory size, not the amount of actual memory used, as that is extremely complicated (perhaps impossible) to know for a single process on an operating system which supports virtual memory.
Not all memory used appears in the process's virtual address space at a given instant, for example kernel usage, and disc caching. A process can change which pages it has mapped in as often as it likes (e.g. via mmap() ). Some of a process's address space is also mapped in, but not actually used, or is shared with one or more other processes. This makes measuring per-process memory usage a fairly unachievable goal in practice.
And putting a cap on the VM size is not a good idea either, as that will result in the process being killed if it attempts to use more.
The right way of doing this in this case (for a Java process) is to set the maximum heap size via the well-documented JVM startup options (e.g. -Xmx). However, experience suggests that you should not set it to less than 1GB.
