Running two processes in Unix/Linux - linux

When the kernel creates two processes whose code section is the same, does it actually copy the code into the virtual address space of both processes? In other words, if I start two processes from the same program, are there two copies of the program in memory or just one?
Obviously this may depend on the implementation, but I'm asking about a traditional Unix OS.

Does the kernel actually copy the code to the virtual address space of both processes?
The text segment will be mapped (rather than copied) into the virtual address space of each process, but will be referring to the same physical space (so the kernel will only have one copy of the text in memory).
The data and bss segments will also be mapped into the virtual address space of each process, but these will be created per process. At process initiation, the data from the data and bss segments from the executable will be mapped/copied into the process's virtual memory; if it was not copied ab initio then as soon as the processes start writing to the data the process will be given its own private copy.
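The per-process private copy of writable data can be observed from user space. Below is a minimal sketch, assuming a Unix-like system where os.fork is available; the names counter and child_write_is_private are made up for this illustration. The child changes a value after the fork and reports what it sees through a pipe, while the parent's value stays untouched.

```python
import os

counter = [0]  # ordinary writable data, private to each process after fork


def child_write_is_private():
    """Fork, let the child modify `counter`, and return (child view, parent view)."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this write lands in the child's own copy of the page.
        try:
            os.close(read_end)
            counter[0] = 42
            os.write(write_end, str(counter[0]).encode())
        finally:
            os._exit(0)
    # Parent: read what the child saw, then look at our own copy.
    os.close(write_end)
    child_value = int(os.read(read_end, 16).decode())
    os.close(read_end)
    os.waitpid(pid, 0)
    return child_value, counter[0]


if __name__ == "__main__":
    print(child_write_is_private())
```

Note that this shows the behaviour for data inherited across fork; the same privacy applies to the data and bss of two processes started separately from one executable.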
Clearly, shared memory and mmap'd memory are handled after the process starts. Shared memory is always shared between processes; that's its raison d'être. What happens with mmap depends on the flags used, but it is often shared too.
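To illustrate how the mmap flags decide whether memory is shared, here is a small sketch, assuming a Unix-like system and Python's standard mmap module (write_in_child is a hypothetical helper name). It maps one anonymous page, lets a forked child write into it, and returns what the parent sees afterwards.

```python
import mmap
import os


def write_in_child(flags):
    """Map one anonymous page with `flags`, write in a child, read in the parent."""
    # fileno -1 requests an anonymous mapping (not backed by a file).
    region = mmap.mmap(-1, mmap.PAGESIZE, flags=flags)
    pid = os.fork()
    if pid == 0:
        try:
            region[0:5] = b"hello"
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    seen_by_parent = bytes(region[0:5])
    region.close()
    return seen_by_parent


if __name__ == "__main__":
    print(write_in_child(mmap.MAP_SHARED))   # the child's write is visible
    print(write_in_child(mmap.MAP_PRIVATE))  # the child wrote to its own copy
```

With MAP_SHARED both processes keep using the same physical page; with MAP_PRIVATE the child's write goes to a private copy, so the parent still sees the zero-filled page.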

Modern operating systems will use Copy-on-Write to avoid duplicating pages until they are actually updated. Note that on many systems (including Linux) this can lead to overcommit, where the OS doesn't actually have enough RAM to cope with all the copying required should every process decide to modify un-duplicated pages.

Related

What structure is traversed to deallocate pages, when a process terminates? (Page Table or something else?)

I am trying to understand the nature of the operations carried out regarding the deallocation of physical memory when a process terminates.
Assume that the page table for the process is a multi-level tree structure, as implemented on Linux.
My current understanding is that the OS would need to deallocate each physical page frame that is mapped to whatever subset of the virtual addresses for which the Page Table entry (PTE) exists. This could happen by a traversal of the multi-level tree PT structure & for the PTEs that have their valid bit set, the physical frame descriptor corresponding to the PTE is added to the free list (which is used in the Buddy allocation process).
My question is: Is the traversal of the Page Table actually done for this? An alternative, faster way would be to maintain a linked list of the page frame descriptors allotted to a process, for each process & then traverse that linearly during process termination. Is this more generic & faster method instead followed?
I'm not sure that pages get physically deallocated when a process ends.
My understanding is that the MMU is managed by the kernel.
But each process has its own virtual address space, which the kernel changes:
for explicit syscalls changing it, ie. mmap(2)
at program start thru execve(2) (which can be thought of several virtual mmap-s as described by the segments of the ELF program executable file)
at process termination, as if each segment of the address space was virtually munmap-ed
And when a process terminates, it is its virtual address space (but not any physical RAM pages) which gets destroyed or deallocated!
So the page table (whatever definition you give to it) is probably managed inside the kernel by a few primitives like adding a segment to virtual address space and removing a segment from it. The virtual space is lazily managed, since the kernel uses copy on write techniques to make fork fast.
Don't forget that some pages (e.g. the code segment of shared libraries) are shared between processes, and that all tasks of a multi-threaded process share the same virtual address space.
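Because pages can be shared, tearing down one address space cannot simply free every frame it maps. Here is a toy model of that idea; it is not kernel code, and the two-level dict layout, the reference counter and the name tear_down are all assumptions made for the sketch. A frame goes to the free list only when the last mapping that refers to it disappears.

```python
from collections import Counter


def tear_down(page_table, frame_refs, free_list):
    """Walk a two-level toy page table and release the frames it maps.

    page_table: dict of top-level index -> dict of second-level index -> frame number
    frame_refs: Counter of how many mappings point at each frame
    free_list:  list that receives frames no mapping references any more
    """
    for second_level in page_table.values():
        for frame in second_level.values():
            frame_refs[frame] -= 1
            if frame_refs[frame] == 0:
                free_list.append(frame)
    page_table.clear()


if __name__ == "__main__":
    # Frame 7 is shared (say, library code); frames 3 and 9 are private.
    refs = Counter({7: 2, 3: 1, 9: 1})
    p1 = {0: {0: 7, 1: 3}}
    p2 = {0: {0: 7}, 1: {5: 9}}
    free = []
    tear_down(p1, refs, free)
    print(free)  # frame 7 is still in use by p2
    tear_down(p2, refs, free)
    print(sorted(free))
```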
BTW, the Linux kernel is free software, so you should study its source code (from http://kernel.org/). Look also on http://kernelnewbies.org ; memory management happens inside the mm/ subtree of the kernel source.
There are lots of resources. Look into linux-kernel-slides, slides#245 for a start, and there are many books and resources about the Linux kernel... Look for vm_area_struct, pgetable, etc...

Understanding Memory Mapped Files

I have started reading about memory mapped IO and I'm having some difficulties grasping the concepts
This is what I have understood so far:
Each process has a virtual address space. Memory mapped files are allocated a specific address range in the virtual address space, which maps to the same address in physical memory. This way, all the writes that the disk controller makes to memory (through DMA) are reflected to the process without any additional copying. (In the non memory mapped case, the CPU has to copy the contents over to the buffer of the process.)
My Doubts:
Is my understanding correct?
What will happen if multiple processes try to mmap a file and there is no contiguous block of memory available for direct mapping?
The memory subsystem itself doesn't have any understanding of "files", which are an OS concept, and there have been some operating systems that didn't use files at all. You're close but a little off in your understanding of how mmap works.
Each process does have its own virtual address space, which may have very little to do with the physical memory (lots of virtual address space doesn't have any memory associated at all, ever, and virtual memory that's swapped out doesn't have any physical memory). The system uses lookup tables (called page tables on x86) that specify which virtual address ranges map to which physical address ranges. Virtual memory that isn't "resident" (swapped out, mmapped but not loaded) has a "not present" entry.
Whenever a program tries to access this memory, the CPU causes a page fault, which tells the OS to go find the appropriate contents somewhere and load them into physical memory. In the case of swap, the contents are loaded out of a swap file or partition; in the case of mmap, they're loaded out of somewhere in the filesystem.
The mechanism for getting them into physical memory and updating the page table can vary. What you're describing is DMA, which lets the drive controller copy contents directly into a block of physical memory, and zero-copy I/O, which is a technique where the OS just creates a new page-table mapping telling the processor to "teleport" the region of physical memory into the program's address space. Neither is technically required for mmap (the OS could load the file "by hand" and copy it into a new buffer for the program, and this may happen in a read-copy-update situation), but modern systems do it like you described.
The physical memory doesn't necessarily have to be contiguous. When the POSIX version of mmap is called, the OS allocates length bytes for the mapping, but thanks to virtual memory, those bytes could be split up among multiple blocks and mapped together by the processor.
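From the program's point of view, a mapped file is just a range of bytes; the page faults and loading described above happen behind the scenes. Here is a minimal sketch using Python's standard mmap module (read_via_mmap and demo are hypothetical names for this illustration). It writes a temporary file slightly larger than one page and reads from the second page through a mapping, without an explicit read() of that data.

```python
import mmap
import os
import tempfile


def read_via_mmap(path, offset, length):
    """Return `length` bytes at `offset` by mapping the file instead of calling read()."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded when first touched.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return bytes(view[offset:offset + length])


def demo():
    """Create a two-page file and fetch bytes that live in its second page."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"A" * mmap.PAGESIZE + b"second page")
        path = tmp.name
    try:
        return read_via_mmap(path, mmap.PAGESIZE, 11)
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Only the slice that is actually touched needs to be resident; whether the first page is ever loaded is up to the OS and its readahead policy.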
If multiple processes are trying to mmap the same file, the OS behavior depends on whether the access is read-only or read/write; read-only copies can be shared among many processes (such as the actual executable code; this is why even though Chrome may have dozens of processes running, the Chrome binary is only in memory once).

Virtual memory sections and memory mapping area

A process has virtual memory, parts of which are loaded into RAM at run time, as described in the previous post.
Which part of the process's virtual memory layout does mmap() use?
I have the following doubts:
If a memory mapping lives in unallocated memory inside the process's own virtual memory, and virtual memory keeps one process from touching another process's memory, then how can memory mapping be used for interprocess communication (IPC)?
In an OS like Linux, does each individual process have its own heap, stack and memory-mapping sections, or do all processes share one common section for heap, stack and mmap?
Example:
If processes P1, P2 and P3 are running on Linux, do they all share a common table as given in the picture, or does each task have a separate table for each section?
On a 32-bit system, 2^32 bytes = 4 gigabytes of virtual memory are possible, with 1 gigabyte reserved for the kernel and 3 gigabytes for userspace applications. Can each individual process have up to 3 gigabytes of virtual memory, or must the sum over all userspace applications stay within 3 gigabytes (i.e. virtual memory size of (P1+P2+P3) <= 3 gigabytes)?
--
Learner
Using memory mapping for IPC works by mapping the same range of physical memory into two or more virtual address ranges in different processes. This works for communication because both processes are using the exact same memory cells (although they might "see" them differently, at different addresses). You change a value in one mapping, and it is instantly visible in the other mapping in a different process because it is the very same memory.
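A small sketch of this kind of IPC, assuming a Unix-like system; exchange_through_mapping is a made-up helper and the temporary file merely stands in for whatever object both processes agree to map. Each process creates its own mapping after the fork, so the two views may sit at different virtual addresses while referring to the same underlying pages.

```python
import mmap
import os
import tempfile


def exchange_through_mapping(message):
    """Child maps a file and writes `message`; parent maps the same file and reads it."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, mmap.PAGESIZE)
        pid = os.fork()
        if pid == 0:
            try:
                # The child creates its own shared mapping and writes into it.
                with mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED) as child_view:
                    child_view[:len(message)] = message
            finally:
                os._exit(0)
        # The parent creates a separate shared mapping of the same file.
        parent_view = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
        os.waitpid(pid, 0)  # crude synchronization: wait until the child is done
        data = bytes(parent_view[:len(message)])
        parent_view.close()
        return data
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(exchange_through_mapping(b"ping"))
```

Waiting for the child to exit is only a stand-in for real synchronization; processes that communicate continuously would normally add a semaphore, futex or similar.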
Every process has its own independent stack and heap. The OS does not care about that at all; it only cares about pages. The heap and the stack are things that are implemented by the application (via the runtime). When you call a function like malloc, the allocator in the runtime either returns a block that it had already reserved earlier, or one that it has recycled (you called free earlier), or it asks the OS to reserve some more memory (sbrk or mmap). When you first access this memory, the OS sees a page fault, verifies that you are allowed to access this location (because you've reserved it) and then provides a valid page.
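The allocator behaviour described here can be sketched as a toy model. This is not how any real malloc is written; the class ToyAllocator, its fixed block size and its bookkeeping are assumptions chosen to keep the example short. It hands out recycled blocks when it has them and only goes back to the OS (here via an anonymous mmap) when its free list is empty.

```python
import mmap


class ToyAllocator:
    """Fixed-size block allocator over memory obtained from the OS via mmap."""

    def __init__(self, block_size=64, blocks_per_chunk=4):
        self.block_size = block_size
        self.chunk_size = block_size * blocks_per_chunk
        self.chunks = []       # regions reserved from the OS
        self.free_blocks = []  # (chunk index, offset) pairs ready for reuse
        self.os_requests = 0   # how often we had to ask the OS for memory

    def _grow(self):
        # Ask the OS for more address space, as a real malloc does with sbrk/mmap.
        self.chunks.append(mmap.mmap(-1, self.chunk_size))
        self.os_requests += 1
        index = len(self.chunks) - 1
        for offset in range(0, self.chunk_size, self.block_size):
            self.free_blocks.append((index, offset))

    def alloc(self):
        """Return a free block, growing the pool only when none is left."""
        if not self.free_blocks:
            self._grow()
        return self.free_blocks.pop()

    def free(self, block):
        """Recycle a block; nothing is returned to the OS."""
        self.free_blocks.append(block)


if __name__ == "__main__":
    heap = ToyAllocator(block_size=64, blocks_per_chunk=2)
    first = heap.alloc()
    heap.free(first)
    print(heap.alloc() == first, heap.os_requests)
```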
Every process can use (as in "reserve") the whole available address space (3GiB in your example). This does not interfere with any other process. Note that due to fragmentation and alignment, and because your executable and the stack take away a little bit, you will in practice not be able to allocate the full 3 GiB, but you can get close to it.
All processes together can use as much virtual memory as is available on the system (physical RAM plus swap space), but they can only use as much as there is physical memory available at the same time (minus a little bit for this and that, like unpageable kernel memory and such).

Linux Shared Library & Memory space [duplicate]

While studying shared libraries I read this statement:
"Although the code of a shared library is shared among multiple processes, its variables are not. Each process that uses the library has its own copies of the global and static variables that are defined within the library."
I just have a few doubts.
Is the code part of each process in a separate address space?
Is the shared-library code in some global (unique) address space?
I am just a beginner, so please help me understand.
Thanks!
Shared libraries are loaded into a process by memory-mapping the file into some portion of the process's address-space. When multiple processes load the same library, the OS simply lets them share the same physical RAM.
Portions of the library that can be modified, such as static globals, are generally loaded in copy-on-write mode: when a write is attempted, a page fault occurs, the kernel copies the affected page to another physical page of RAM (for that process only), redirects the mapping to the new page, and then the write operation completes.
To answer your specific points:
All processes have their own address space. The sharing of physical memory between processes is invisible to each process (unless they do so deliberately via a shared memory API).
All data and code live in physical RAM, which is a kind of address-space. Most of the addresses you are likely to see, however, are virtual memory addresses belonging to the address-space of one process or another, even if that "process" is the kernel.

Is heap allocated on memory pages?

In a Linux x86-64 environment, is the entire process allocated on virtual memory pages? By entire process I mean the text, data, bss, heap and stack.
Also, when libc calls brk, does the kernel return memory that is managed in pages by the virtual memory manager?
Lastly, can a process get memory on the heap that is not managed by the virtual memory manager? In other words, can a process get access to physical memory?
In a Linux x86-64 environment, is the entire process allocated on virtual memory pages?
Yes. Every process has its own virtual address space, i.e. its own page table mapping virtual memory to physical memory.
Also, when libc calls brk, does the kernel return memory that is managed in pages by the virtual memory manager?
Yes. In fact, unless you are hacking the OS kernel, virtual memory is transparent to you.
Can a process get memory on the heap that is not managed by the virtual memory manager? In other words, can a process get access to physical memory?
No. As far as I know, you can't manage physical memory unless you run your program without OS support. Because a process has its own virtual address space, everything you do in memory management acts on virtual memory.
A process has one or more tasks (scheduled by the kernel), which for a multi-threaded process are the process's threads (and for a non-threaded process the single task running the process), and it has an address space (and some other resources, e.g. open file descriptors).
Of course, the address space is in virtual memory. The kernel is allowed to swap pages out (e.g. to the swap area of your disk). It tries hard to avoid doing that, because swapping pages to disk is very slow: disk access time is on the order of tens of milliseconds, while RAM access time is on the order of a tenth of a microsecond.
Text, bss, etc. are virtual memory segments, which are memory mappings. You can think of a process's address space as a memory map. The mmap(2) system call is the way to modify it. When an executable is started with the execve system call, the kernel establishes a few mappings (e.g. for text, data, bss, stack, ...). The sbrk(2) call also changes it. Most malloc implementations use mmap (at least for big enough zones) and sometimes sbrk.
You can keep a memory range from being swapped out by locking it into RAM with the mlock(2) syscall, which needs privilege (CAP_IPC_LOCK) once you go beyond the RLIMIT_MEMLOCK limit. It is rarely useful in practice (unless you code real-time applications). There is also the msync syscall (to flush memory to disk). You can of course map a portion of a file into virtual memory (using mmap), change the protection with mprotect(2), remove a mapping with munmap(2), extend a mapping with mremap (a Linux-specific syscall), and you could even catch the SIGSEGV signal and handle it (often in a machine-specific way). The madvise(2) syscall enables you to tune paging with hints.
You can understand the memory map of a process of pid 1234 by reading the /proc/1234/maps file (or also /proc/1234/smaps). (From inside an application, you can use /proc/self/ instead of /proc/1234/ ...) I suggest you run in a terminal:
cat /proc/self/maps
which will show you the memory map of the process running that cat command. You can also use the pmap utility.
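The same information can be read from inside a program. Below is a small sketch that parses the maps format into records; Region, parse_maps_line and read_maps are hypothetical names, and the parser only assumes the usual layout of address range, permissions, offset, device, inode and an optional path. It is Linux-specific because it relies on /proc.

```python
from typing import NamedTuple


class Region(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps into a Region."""
    # Fields: address range, perms, offset, device, inode, optional path.
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return Region(start, end, fields[1], path)


def read_maps(pid="self"):
    """Return the mappings of a process as a list of Region records."""
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f if line.strip()]


if __name__ == "__main__":
    for region in read_maps():
        size_kib = (region.end - region.start) // 1024
        print(f"{region.perms} {size_kib:>8} KiB  {region.path or '[anonymous]'}")
```

Mappings with an empty path are anonymous memory; names such as [heap] and [stack] are labels the kernel adds for the brk area and the main thread's stack.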
Most recent Linux kernels provide Address Space Layout Randomization (so two similar processes running the same program on the same input have different mmap-ed and malloc-ed addresses). You can disable it through /proc/sys/kernel/randomize_va_space
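To check the current setting from a program, a minimal sketch (aslr_mode is a made-up helper; it returns None when the file is missing, e.g. on non-Linux systems):

```python
def aslr_mode(path="/proc/sys/kernel/randomize_va_space"):
    """Return the ASLR setting as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except OSError:
        return None


if __name__ == "__main__":
    # 0 = disabled, 1 = randomize stack, mmap base and vDSO, 2 = also randomize the brk heap
    print(aslr_mode())
```

Changing the value requires root; reading it normally does not.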
Except in very rare circumstances (uClinux), processes only see virtual memory, which is mapped to physical memory by the kernel.
The kernel can be asked to make specific mappings that give a predictable physical address for a given virtual address; you need the appropriate capability to do that however, as this breaks down the process separation.
On execve, the current mappings are replaced by the loadable segments from the ELF file specified; these are mapped so that referenced pages are loaded from the ELF file (some initial readahead is also performed). The brk system call mainly extends the non-executable mapping with the highest addresses (excluding the stack mapping) by a few pages, allowing the process to access more virtual addresses without being sent a SIGSEGV.
The heap is generally managed by the process internally, but the virtual address space assigned to heap objects must be known to the virtual memory manager beforehand in order to create a mapping. malloc will generally look into its internal tables for a region that is already mapped and usable, and if none can be found, use either brk() or mmap() to create more mappings.

Resources