I have a 64-bit Linux (SUSE 10) dual-processor machine. When I run my process it uses around 4 GB of virtual memory, but only 3 GB of that is resident; around 9 GB of RAM is free. How can I get the remaining 1 GB into RAM as well? Why does it stay in swap space when the kernel has plenty of free RAM to load it into?
Rahul
The kernel could load those pages into memory. However, when they are not being used, it chooses to write them out to swap.
If you absolutely want the data in memory, you should either turn off all swap space (using swapoff(8)) or lock the specific pages into memory, using mlock or mlockall.
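For example, a minimal sketch of the mlockall approach (Linux/glibc; note it needs CAP_IPC_LOCK or a large enough RLIMIT_MEMLOCK, so treat this as an illustration rather than a drop-in fix):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void) {
    /* Lock all current and future pages of this process into RAM so the
     * kernel cannot swap them out. Requires CAP_IPC_LOCK or a sufficient
     * RLIMIT_MEMLOCK. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return EXIT_FAILURE;
    }

    /* ... do the memory-sensitive work here ... */

    munlockall();   /* release the locks when they are no longer needed */
    return EXIT_SUCCESS;
}
```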
The address space on x86-64 is huge, even though only 48-bit addresses are generally used.
On 32-bit x86 machines it was pretty clear how much RAM the kernel took up: generally around 1 GB of ZONE_NORMAL sat at the bottom of memory, while everything above 1 GB in PHYSICAL (not virtual) addresses was ZONE_HIGHMEM (for user space). That corresponds to a 3:1 split, and of course there are configurations where it can be 1:3, 2:2, etc. (by changing VM_SPLIT).
How much RAM is used for kernel space on 64-bit kernels?
I know that PAGE_OFFSET is set to a value far above physically addressable memory on x86-64 (for both the 48-bit and 57-bit configurations). PAGE_OFFSET just describes the split in virtual address space, not physical (with 4-level paging, PAGE_OFFSET is ffff888000000000).
So does 1 GB of memory house kernel space? 2 GB? 3? Are there variables or macros that describe the size? Is it calculated?
Each user-space process can use its own 2^47 bytes (128 TiB) of virtual address space. Or more on a system with PML5 support.
The available physical RAM to back those pages is the total size of physical RAM, minus maybe 30 MiB or so that the kernel needs for its own code/data. (Not including the pagecache: Linux will use any spare pages as buffers and disk cache). This is mostly unrelated to virtual address-space limits.
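To see how cheap virtual address space is compared to physical RAM, here is a minimal sketch (Linux-specific mmap flags assumed; PROT_NONE plus MAP_NORESERVE reserves address space without committing RAM or swap):

```c
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    /* Reserve 1 TiB of virtual address space without touching any RAM.
     * PROT_NONE pages are inaccessible and MAP_NORESERVE avoids committing
     * swap, so this only consumes address space, not memory. */
    size_t len = 1ULL << 40;   /* 1 TiB */
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("reserved 1 TiB of virtual address space at %p\n", p);
    munmap(p, len);
    return 0;
}
```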
1 GiB is how much virtual address space a 32-bit kernel used up, not how much physical RAM.
The address-space question mattered for how much memory a single process could use at the same time, but the kernel can still use all your RAM for caching file data, etc. Unless you're finding the 2^(48-1) or 2^(57-1) bytes of the low half virtual address-space range cramped, there's no equivalent problem.
See the kernel's Documentation/x86/x86-64/mm.txt for the x86-64 virtual memory map. Also see Why 4-level paging can only cover 64 TiB of physical address regarding why x86-64 Linux doesn't need the inconvenient HIGHMEM tricks: the entire high half of virtual address space is reserved for the kernel, which maps all of RAM into it.
Virtual address space usage does indirectly set a 64 TiB limit on how much physical RAM the kernel can use, but if you have less than that there's no effect. Just like how a 32-bit kernel wasn't a problem if your machine had less than 1 or 2 GiB of RAM.
The amount of physical RAM actually reserved by the kernel depends on build options and modules, but might be something like 16 to 32 MiB.
Check dmesg output and look for a kernel log message like this one, taken from the boot log of an x86-64 5.16.3-arch1 kernel:
Memory: 32538176K/33352340K available (14344K kernel code, 2040K rwdata, 8996K rodata, 1652K init, 4336K bss, 813904K reserved, 0K cma-reserved)
Don't count the init part (freed after boot) or the reserved part; I'm pretty sure Linux doesn't actually reserve ~800 MiB in a way that makes it unusable for anything else.
Also look for the later Freeing unused decrypted memory: 2036K / Freeing unused kernel image (initmem) memory: 1652K etc. (That's the same size as the init part listed earlier, which is why you don't have to count it.)
It might also dynamically allocate some memory during startup; that initial "memory" line is just the sum of its .text, .data, and .bss sections, static code+data sizes.
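If you want a rough runtime view of what the kernel itself is consuming, a small sketch along these lines reads the Slab, KernelStack, and PageTables lines from /proc/meminfo (field names as found on typical Linux kernels; the exact set of fields varies by version):

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    /* Print the /proc/meminfo lines that roughly account for kernel
     * memory use: slab caches, kernel stacks, and page tables. */
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) {
        perror("/proc/meminfo");
        return 1;
    }
    char line[256];
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "Slab:", 5) == 0 ||
            strncmp(line, "KernelStack:", 12) == 0 ||
            strncmp(line, "PageTables:", 11) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}
```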
On 64-bit systems, the only limitation is how much physical memory the kernel can use. The kernel will map all the available RAM, and user-space applications should be able to get access to as much of it as the kernel can provide while keeping enough for the kernel itself to operate.
Until now I thought that a 32-bit processor can use 4 GiB of memory because 2^32 is 4 GiB, but this reasoning assumes the processor has a word size of 1 byte, so that a 32-bit program counter can address 2^32 different memory words, and hence we get 4 GiB.
But most processors nowadays have a word size larger than 1 byte (my understanding is that the word size equals the width of the data bus, so a processor with a 64-bit data bus should have a word size of 8 bytes).
Now the same processor with a 32-bit program counter can still address 2^32 different memory words, but with 8-byte words it can reach more memory, which contradicts the 4 GiB figure. So what is wrong with my argument?
Your premise is incorrect. 32-bit architectures can address more than 4 GB of memory, just like most (if not all) 8-bit microcontrollers can use more than 256 bytes of memory. A 32-bit program counter can indeed address 2^32 different memory locations, but word-addressable memory is only used in architectures built for very special purposes, like DSPs, or in antique architectures of the past. Modern architectures for general computing all use byte-addressable memory.
See Why byte-addressable memory and not 4-byte-addressable memory?
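As a tiny illustration of byte addressing: the addresses of adjacent int array elements differ by sizeof(int) bytes, because addresses name bytes, not words.

```c
#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* Addresses are byte-granular: incrementing an int pointer moves
     * sizeof(int) bytes, but each individual byte has its own address. */
    int arr[2] = {1, 2};
    uintptr_t a0 = (uintptr_t)&arr[0];
    uintptr_t a1 = (uintptr_t)&arr[1];
    printf("&arr[0]=%p &arr[1]=%p difference=%zu bytes (sizeof(int)=%zu)\n",
           (void *)&arr[0], (void *)&arr[1],
           (size_t)(a1 - a0), sizeof(int));
    return 0;
}
```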
Even on 32-bit byte-addressable architectures there are many ways to access more than 4 GB of memory. For example, the 64-bit JVM can address 32 GB of memory with 32-bit pointers using compressed oops. See the Trick behind JVM's compressed Oops.
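This is not the JVM's actual implementation, just a minimal sketch of the arithmetic behind the trick: a 32-bit value indexes 8-byte-aligned objects relative to a heap base, so 2^32 * 8 bytes = 32 GB become reachable (the fake_heap and helper names are hypothetical, and the alignment attribute assumes GCC/Clang).

```c
#include <stdint.h>
#include <stdio.h>

static uint8_t *heap_base;   /* assumed base address of a large heap */

/* Encode a pointer to an 8-byte-aligned object as a 32-bit offset/8. */
static uint32_t compress(void *p) {
    return (uint32_t)(((uint8_t *)p - heap_base) >> 3);
}

/* Decode: base + (compressed << 3) reaches up to 32 GB from the base. */
static void *decompress(uint32_t c) {
    return heap_base + ((uint64_t)c << 3);
}

int main(void) {
    static uint8_t fake_heap[1024] __attribute__((aligned(8)));
    heap_base = fake_heap;

    void *obj = fake_heap + 64;          /* some 8-byte-aligned "object" */
    uint32_t c = compress(obj);
    printf("compressed=%u decoded=%p original=%p\n", c, decompress(c), obj);
    return 0;
}
```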
32-bit x86 CPUs can also address 64 GB (or more in later versions) of memory with PAE. It basically adds another level to the page tables and widens the entries so that physical addresses can have more bits. That allows the system as a whole to use more than 4 GB of memory; however, pointers in applications are still 32 bits wide, so each process is still limited to 4 GB at most. The analog on ARM is LPAE.
The 4 GB address space of each process is often split into user and kernel space (before Meltdown mitigations), limiting user memory even more. There are several ways to work around this:
Spawning multiple processes, which is used in Adobe Premiere CS4
Mapping the needed part of memory into the current address space, like Address Windowing Extensions on Windows (a Linux-flavored sketch of this windowing idea follows the list)
...
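To make the windowing idea above concrete on Linux (AWE itself is Windows-only), here is a minimal sketch that maps different offsets of a large backing file into one fixed-size view. The file name is hypothetical, and on a 32-bit build you would compile with -D_FILE_OFFSET_BITS=64 to reach offsets beyond 2 GB.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define WINDOW_SIZE (256UL * 1024 * 1024)   /* view 256 MiB at a time */

/* Map a WINDOW_SIZE view of the backing file at the given offset.
 * Each view replaces the previous one, so only one window's worth of
 * address space is consumed no matter how large the file is. */
static void *map_window(int fd, off_t offset) {
    return mmap(NULL, WINDOW_SIZE, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, offset);
}

int main(void) {
    int fd = open("/tmp/bigdata.bin", O_RDWR);    /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    void *view = map_window(fd, 0);               /* first 256 MiB */
    if (view == MAP_FAILED) { perror("mmap"); return 1; }
    /* ... work on this window ... */
    munmap(view, WINDOW_SIZE);

    view = map_window(fd, (off_t)5 * WINDOW_SIZE); /* a later window */
    if (view == MAP_FAILED) { perror("mmap"); return 1; }
    /* ... work on the new window ... */
    munmap(view, WINDOW_SIZE);

    close(fd);
    return 0;
}
```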
The CPU (at least the 32-bit x86 family) must be able to access any byte/word/dword in the 4 GB space, so instructions are encoded in such a way that the target operand size and the memory address (usually) live in different bit-fields. It therefore doesn't matter whether the CPU accesses a byte or a dword; the encoded memory address is the same.
Note that a 32-bit OS on an x86 CPU is technically able to access more than a 4 GB physical address space using PAE mode, but that is not supported by, say, the current desktop Windows family (only Server editions). Some versions of Windows XP, as well as Linux and other 32-bit OSes, can address 64 GB of memory on x86 CPUs.
Also, the OS usually reserves part of each process's virtual address space (for the OS kernel, video memory, etc.), so user programs may use, say, no more than 3 GB of the 4 GB the OS can address within each process.
From this post, I know that swap space is correlated with physical memory. So assume that physical memory and swap space are both 4 GB. Although in theory the memory space of a 64-bit application is close to 2^64 (the kernel will certainly occupy some of it), my understanding is that the actual memory the application can use is only 8 GB.
So my question is: for an application running on Unix/Linux, is the maximum memory space it can use equal to (physical memory + swap space)?
This is a complicated question.
First of all, the theoretical virtual memory space of a 64-bit system is 2^64 bytes, but in fact neither the OS nor the CPU supports such a large virtual address space or that much physical RAM.
Current x86-64 CPUs (AMD64 and Intel 64) actually implement 48-bit virtual addresses, theoretically allowing 256 TB of address space; the number of usable physical address bits is smaller and varies by CPU model.
And Linux allows 128TB of virtual memory space per process on x86-64, and can theoretically support 64TB of physical RAM.
To your question: in the ideal case, the maximum virtual memory space a Linux process can use is just the per-process Linux limit above. Even if your system has run out of swap space and only 100 MB of RAM is free, your process can still reserve that entire virtual address space.
But your system may place limits on virtual memory requests (malloc, which uses the brk/sbrk or mmap syscalls). For example, Linux has the vm.overcommit_memory and vm.overcommit_ratio options, which determine whether an allocation request will be refused. See http://www.win.tue.nl/~aeb/linux/lk/lk-9.html.
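As a small illustration of how overcommit affects large requests (the 64 GiB figure is arbitrary; whether the call succeeds depends on your vm.overcommit_memory setting and your RAM + swap):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* Ask for far more memory than most machines have in RAM + swap.
     * With heuristic overcommit (vm.overcommit_memory=0) or "always"
     * (=1) this typically succeeds; with strict accounting (=2) it is
     * refused up front. */
    size_t request = 64ULL * 1024 * 1024 * 1024;   /* 64 GiB */
    void *p = malloc(request);
    if (p == NULL)
        printf("malloc(64 GiB) was refused\n");
    else
        printf("malloc(64 GiB) succeeded (pages not yet backed by RAM)\n");
    free(p);
    return 0;
}
```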
However, virtual memory space is not the same as real RAM + swap. Considering real RAM + swap, your view is right: a process can never consume more real RAM + swap than the system has. And in most cases there are many other processes running on your system, so the RAM + swap your process can use shrinks further. If physical RAM + swap are about to be exhausted, the OOM killer will choose some process to kill.
In a 4 GB RAM system running Linux, 3 GB is given to user space and 1 GB to the kernel. Does that mean that even if the kernel is using only 50 MB and user space is running low, user space cannot use the kernel's portion? If not, why? Why can't Linux map those pages into user space?
The 3/1 separation refers to VIRTUAL memory, and virtual memory is sparse: even though there is 1 GB "on paper", in practice a lot less than that is used. Whenever possible, virtual memory is backed by physical pages (meaning that if your virtual memory footprint is 50 MB, you're using 50 MB of physical memory), up until the point where there is no more physical memory, in which case you either A) spill over to swap or B) the system hits a low-memory condition and frees memory the hard way, by killing processes.
It gets more complicated. Virtual memory is not really used (committed) until actually used. This means that when you allocate memory, you get an "IOU" or "promise" for memory, but the memory only gets consumed when you actually touch it, as in write some value to it. Overall, however, you are correct in that there is segregation, at the hardware level, between kernel and user mode. In other words, of the 4 GB addressable (assuming 32-bit), the top 1 GB, even though it is in your address space, is not accessible to you and in practice belongs to the kernel. (The 4 GB limit stems from 32-bit pointers; for 64 bits it's effectively 48 bits, which means 256 TB, split 128 TB user / 128 TB kernel.) Further, this 1 GB of your space that belongs to the kernel is identical in every other process, too. So it doesn't matter which process you are in: when you "call the kernel" (i.e. make a system call), you end up in that top 1 GB, which is shared between all processes.
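To watch the "IOU" behaviour directly, a minimal sketch can compare the process's resident set size (the VmRSS line in /proc/self/status on Linux) before and after touching an allocation:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print the VmRSS line from /proc/self/status (resident set size). */
static void print_rss(const char *label) {
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    while (f && fgets(line, sizeof line, f))
        if (strncmp(line, "VmRSS:", 6) == 0)
            printf("%s %s", label, line);
    if (f) fclose(f);
}

int main(void) {
    size_t len = 512UL * 1024 * 1024;     /* 512 MiB */
    char *p = malloc(len);                /* just an "IOU" so far */
    if (!p) { perror("malloc"); return 1; }

    print_rss("after malloc:  ");
    memset(p, 1, len);                    /* now the pages are committed */
    print_rss("after touching:");

    free(p);
    return 0;
}
```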
Again, the key point is that the 1 GB isn't REALLY used in full. The actual memory footprint of the kernel is a lot smaller, in the tens of MB. It's just that, theoretically, the kernel can use UP TO 1 GB, assuming that can be backed either by RAM or (rarely) swap. You can look at /proc/meminfo. As for the other answer about changing the 3/1 split: it actually CAN be changed (in Windows it's as easy as a kernel command-line option in boot.ini; in Linux it requires recompiling the kernel).
The 3GB/1GB split in process address space is fixed at kernel build time. There is no way to change it at runtime, regardless of how much RAM is actually in use.
If a process uses 6 GB of memory and pointers are 32 bits wide, how can the 2 GB above 4 GB be addressed, given that pointers hold virtual addresses on Linux?
Is running on 64 bits the only solution? Sorry for the naive question.
Completing Basile's answer: most architectures have extended the physical address space to 36 bits (see Intel's PAE and PSE-36, PowerPC's Extended Real Page Number, ...). Therefore, although any single process can only address 4 GB of memory through 32-bit pointers, two different processes can virtually address two different 4 GB regions of a 64 GB physical memory address space. This is a way for a 32-bit OS to address up to 64 GB of memory (for instance, 32 GB for Windows 2003 Server).
As I said in a comment, running on 64 bits is the practical solution. You really don't want to munmap then mmap again large segments on temporary files.
You could change your address space during runtime, but you don't want to do that (except when allocating memory, e.g. through malloc, which may increase the available space through mmap).
Changing the address space at runtime to get the illusion of a huge memory is a nightmare. Avoid it (you'll spend months debugging hard-to-reproduce bugs). In the 1960s, the IBM 1130 did such insane tricks.
Today, computers are cheaper than developer time. So just buy a 64-bit processor with 8 GB (gigabytes) of RAM.
Several 32-bit processors with the PAE feature are able to use more than 4 GB of RAM, but each process still sees at most 4 GB (in reality 3 GB) of virtual memory.
This is related to virtual memory, not to Intel-specific segmentation. Current Linux (and other) operating systems use a flat memory model, even on Intel processors.