Linux Shared Library & Memory space

While studying shared libraries, I read the following statement:
Although the code of a shared library is shared among multiple
processes, its variables are not. Each process that uses the library
has its own copies of the global and static variables that are defined
within the library.
I have a few questions.
Is the code of each process in a separate address space?
Is the code of a shared library in some global (unique) address space?
I am just a beginner, so please help me understand.
Thanks!

Shared libraries are loaded into a process by memory-mapping the file into some portion of the process's address-space. When multiple processes load the same library, the OS simply lets them share the same physical RAM.
Portions of the library that can be modified, such as global and static variables, are generally mapped in copy-on-write mode. When a process attempts a write, a page fault occurs; the kernel responds by copying the affected page to another physical page of RAM (for that process only), redirects the mapping to the new page, and then lets the write operation complete.
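The copy-on-write behaviour described above can be observed from user space. The following is only a minimal sketch, assuming Python's mmap module, whose ACCESS_COPY mode requests a private copy-on-write mapping; the function name and the scratch file are made up for illustration.

```python
import mmap
import os
import tempfile


def private_mapping_leaves_file_untouched():
    """Write through a copy-on-write mapping and report what each side sees."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"original")
        # ACCESS_COPY asks for a private, copy-on-write mapping of the file.
        with mmap.mmap(fd, 0, access=mmap.ACCESS_COPY) as view:
            view[:8] = b"modified"      # the write goes to a private copy of the page
            seen_in_mapping = bytes(view[:8])
        os.lseek(fd, 0, os.SEEK_SET)
        seen_in_file = os.read(fd, 8)   # the file itself was never changed
    finally:
        os.close(fd)
        os.unlink(path)
    return seen_in_mapping, seen_in_file


if __name__ == "__main__":
    print(private_mapping_leaves_file_untouched())
```

The mapping shows the modified bytes while the file keeps the original ones, which is the same thing that happens to a library's writable data in each process.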
To answer your specific points:
All processes have their own address space. The sharing of physical memory between processes is invisible to each process (unless they do so deliberately via a shared memory API).
All data and code live in physical RAM, which is a kind of address-space. Most of the addresses you are likely to see, however, are virtual memory addresses belonging to the address-space of one process or another, even if that "process" is the kernel.

Related

What structure is traversed to deallocate pages, when a process terminates? (Page Table or something else?)

I am trying to understand the nature of the operations carried out regarding the deallocation of physical memory when a process terminates.
Assume that the page table for the process is the multi-level tree structure implemented on Linux.
My current understanding is that the OS would need to deallocate each physical page frame that is mapped to whatever subset of the virtual addresses for which the Page Table entry (PTE) exists. This could happen by a traversal of the multi-level tree PT structure & for the PTEs that have their valid bit set, the physical frame descriptor corresponding to the PTE is added to the free list (which is used in the Buddy allocation process).
My question is: is the page table actually traversed for this? An alternative, faster way would be to maintain, for each process, a linked list of the page frame descriptors allotted to it, and then traverse that list linearly during process termination. Is this more generic and faster method followed instead?
I'm not sure that pages get physically deallocated when a process ends.
My understanding is that the MMU is managed by the kernel.
But each process has its own virtual address space, which the kernel changes:
on explicit syscalls that change it, e.g. mmap(2)
at program start through execve(2) (which can be thought of as several virtual mmap-s, as described by the segments of the ELF executable file)
at process termination, as if each segment of the address space was virtually munmap-ed
And when a process terminates, it is its virtual address space (but not any physical RAM pages) which gets destroyed or deallocated!
So the page table (whatever definition you give to it) is probably managed inside the kernel by a few primitives like adding a segment to virtual address space and removing a segment from it. The virtual space is lazily managed, since the kernel uses copy on write techniques to make fork fast.
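The idea of adding and removing a segment can be watched from inside a process. This is a rough sketch, assuming Linux (it reads /proc/self/maps) and Python's mmap module; the helper names and the scratch file are hypothetical and only serve the illustration.

```python
import mmap
import os
import tempfile


def path_is_mapped(path):
    """Return True if any line of /proc/self/maps names the given file."""
    with open("/proc/self/maps") as maps:
        return any(line.rstrip("\n").endswith(path) for line in maps)


def map_and_unmap():
    """Map a scratch file, then unmap it, checking the address space each time."""
    with tempfile.NamedTemporaryFile() as scratch:
        scratch.write(b"x" * mmap.PAGESIZE)
        scratch.flush()
        path = os.path.realpath(scratch.name)

        before = path_is_mapped(path)
        region = mmap.mmap(scratch.fileno(), 0, access=mmap.ACCESS_READ)  # mmap(2)
        during = path_is_mapped(path)
        region.close()                                                    # munmap(2)
        after = path_is_mapped(path)
    return before, during, after


if __name__ == "__main__":
    print(map_and_unmap())
```

The file shows up in the process's address space only between the mmap and the munmap; nothing here says what the kernel does with the underlying physical pages.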
Don't forget that some pages (e.g. the code segment of shared libraries) are shared between processes, and that all tasks of a multi-threaded process share the same virtual address space.
BTW, the Linux kernel is free software, so you should study its source code (from http://kernel.org/). Look also on http://kernelnewbies.org ; memory management happens inside the mm/ subtree of the kernel source.
There are lots of resources. Look into linux-kernel-slides, slides#245 for a start, and there are many books and resources about the Linux kernel. Look for vm_area_struct, pagetable, etc.

How is the code segment shared between processes in Linux?

I have read about the copy-on-write principle which occurs when a new process is being forked in Linux.
I have also read about the fact that if multiple instances of one program are running at the same time, only one instance of the program code can be found in the memory.
I was wondering whether this is a direct consequence of the copy-on-write principle or not, and if it is not, what is the process which ensures that no unnecessary copies of the program's code reside in the memory?
"I was wondering whether this is a direct consequence of the copy-on-write principle or not"
No, it's not. FWIW, you could have shared code segments without COW, and you could have COW without shared code segments. It's independent.
If shared program code were to be achieved as a consequence of COW, then only related processes could benefit from that.
For example, if process A forks twice and creates processes B and C, and then B and C call one of the seven exec functions on the same binary, then you could say that the code segment is shared because of COW - since the code segment is never written during execution, and is mapped read-only, then it must be automatically shared, right?
What if you start the same executable from another shell? (Or some other unrelated process forks and executes the same program? It doesn't have to be a shell...)
If code segment sharing were a consequence of COW, in this scenario we wouldn't benefit from sharing the code segment, because the processes are unrelated (so there are no COW-shared pages with the other instances to begin with).
Instead, the code segment is shared through memory-mapped files. When a new executable is loaded into memory, mmap(2) is called to map the binary file's contents into memory.
"and if it is not, what is the process which ensures that no unnecessary copies of the program's code reside in the memory?"
The exact implementation details depend on the operating system, but it's not that complicated. Conceptually, mmap(2) maps files into memory, so you just need to keep some state on the underlying file representation to keep track of which (if any) memory mappings are active for that file. Such information is usually kept in the file's inode.
Linux, for example, associates a file with its memory address space through the i_mapping field of struct inode. So, when mmap(2) is called on a binary for the first time, physical memory pages are allocated to hold the file's contents and the i_mapping field of that file's inode is set; later invocations will use the i_mapping field and find that there is already an address space associated with this inode, and because the mapping is read-only, no new physical pages are allocated, so everything ends up being shared. Note that the virtual address might be different in each process, although it refers to the same physical page (which means that the kernel will at least allocate and update each process's page tables, but that's about it).
The inode structure is defined in fs.h (include/linux/fs.h in the kernel source). I can only guess that other UNIX variants do this in a similar way.
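A user-space consequence of pages being attached to the file rather than to a process can be sketched as follows. This is not the kernel's code path for program text (which is mapped read-only); it is a small illustration, assuming Linux and Python's mmap module, that two independent mappings of one file end up on the same pages. The function name is hypothetical.

```python
import mmap
import os
import tempfile


def two_mappings_share_pages():
    """Map one file twice via separate descriptors; a write through one shows in the other."""
    with tempfile.NamedTemporaryFile() as scratch:
        scratch.write(b"\0" * mmap.PAGESIZE)
        scratch.flush()

        # Two unrelated open() calls, standing in for two unrelated processes.
        fd_a = os.open(scratch.name, os.O_RDWR)
        fd_b = os.open(scratch.name, os.O_RDWR)
        try:
            with mmap.mmap(fd_a, 0, access=mmap.ACCESS_WRITE) as map_a, \
                 mmap.mmap(fd_b, 0, access=mmap.ACCESS_WRITE) as map_b:
                map_a[:5] = b"hello"        # write through the first shared mapping
                return bytes(map_b[:5])     # read through the second mapping
        finally:
            os.close(fd_a)
            os.close(fd_b)


if __name__ == "__main__":
    print(two_mappings_share_pages())
```

The second mapping sees the bytes written through the first without any read or write system call on the file, because both mappings resolve to the same cached pages of that file.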
Of course, this all works as long as the same binary file is used. If you copy the binary file and execute both copies separately, for obvious reasons, the code segment will not be shared.
The sharing of program code (sometimes called program text) relies on another mechanism: memory mapped files.
The key to understanding this is that the code of the program does not need to be modified by the linker in order to resolve links to external symbols. Therefore, the operating system only ever deals with read-only copies of the program text, which is inherently sharable amongst processes.
At run-time linking, the dynamic linker calls mmap() to create virtual address space for your program's .so (and for any shared libraries it uses). At this stage, the file isn't backed by real pages of memory. Instead, as the program starts to execute, reads in the virtual address space of the file cause page faults, and the operating system either allocates a page and fills it from disk or, if the page is already in memory, maps to that.
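The "no real pages until first access" behaviour can be probed with a fault counter. This is a hedged sketch, assuming a Unix system where getrusage reports minor faults; it uses an anonymous mapping instead of a .so to keep it short, and the function name is made up. The exact count depends on the kernel (huge pages or fault-around can lower it), so no particular number is promised.

```python
import mmap
import resource


def faults_while_touching(pages=256):
    """Count minor page faults taken while first touching each page of a new mapping."""
    size = pages * mmap.PAGESIZE
    region = mmap.mmap(-1, size)            # anonymous mapping; no physical pages assigned yet
    try:
        before = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
        for offset in range(0, size, mmap.PAGESIZE):
            region[offset] = 1              # first touch of each page
        after = resource.getrusage(resource.RUSAGE_SELF).ru_minflt
    finally:
        region.close()
    return after - before


if __name__ == "__main__":
    print("minor faults while touching 256 pages:", faults_while_touching())
```

Creating the mapping is cheap; the faults are only taken as the loop reaches each page.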
A good place to learn more is Modern Operating Systems by Andrew Tanenbaum.

Does binary stay in memory after program exits?

I know that when a program first starts, it incurs many page faults because the code is not in memory and must be loaded from disk.
What happens when a program exits? Does the binary stay in memory? Would subsequent invocations of the program find that the code is already in memory and thus not have page faults (assuming nothing runs in between and pages stuff out to disk)?
It seems like the answer is no, judging from some experiments on my Linux machine. I ran a program over and over again and observed the same number of page faults every time. It's a relatively quiet machine, so I doubt stuff is getting paged out in between invocations. So, why is that? Why doesn't the executable get to stay in memory?
There are two things to consider here:
1) The content of the executable file is likely kept in the OS cache (disk cache). While that data is still in the cache, every read of it hits the cache and the OS honors the request without needing to re-read the file from disk.
2) When a process exits, the OS unmaps every memory page mapped to a file and frees its memory (in general, it releases every resource allocated by the process, including sockets and so on). The physical memory may be zeroed at that point, but this is not strictly required (still, the security level of the OS may require zeroing a page that is no longer used; Windows NT, 2K, XP, etc. probably do that, see "Does Windows clear memory pages?"). Another invocation of the same executable creates a brand new process that maps the same file into memory, but the first access to those pages still triggers page faults because, in the end, it is a new process with a different memory mapping. So yes, the page faults occur, but they are a lot cheaper for the second instance of the same executable than for the first.
Of course, this is only about the read-only parts of the executable (the segments/modules containing the code and read-only data).
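The experiment from the question can be repeated while separating the two kinds of fault. Unix accounting distinguishes major faults (the page had to be read from disk) from minor faults (the page was already in memory and only needed mapping), which matches the "cheaper the second time" point above. The sketch below assumes Python 3.8+ on a Unix system; the function names are hypothetical, and the numbers will vary between machines and runs.

```python
import os
import sys


def faults_for_one_run(argv):
    """Run argv to completion and return its (minor, major) page-fault counts."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, _, usage = os.wait4(pid, 0)          # wait4 also returns the child's rusage
    return usage.ru_minflt, usage.ru_majflt


def compare_two_runs(argv):
    """Run the same program twice and return the fault counts of both runs."""
    return faults_for_one_run(argv), faults_for_one_run(argv)


if __name__ == "__main__":
    first, second = compare_two_runs([sys.executable, "-c", "pass"])
    print("first run  (minor, major):", first)
    print("second run (minor, major):", second)
```

If the total stays roughly constant between runs while the major count is low or zero, the pages were served from the cache rather than from disk.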
One may consider another scenario: forking. In this case, every page is marked as copy-on-write. When the first write occurs on a page, a hardware exception is triggered and intercepted by the OS memory manager. The OS determines whether the page in question may be written (e.g. if it belongs to the stack, the heap or any writable mapping in general) and, if so, allocates memory and copies the original content before allowing the process to modify the page, in order to preserve the original data for the other process. There is still another case, shared memory, where the very same physical memory is mapped into two or more processes. In this case the copy-on-write flag is, of course, not set on the memory pages.
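Both cases, copy-on-write after fork and explicitly shared memory, fit in one short sketch. It assumes a Unix system and Python's os.fork and mmap modules (an anonymous mmap is a shared mapping by default on Unix); the function and variable names are made up for the example.

```python
import mmap
import os


def fork_private_vs_shared():
    """Let a child write to private and shared memory; return what the parent sees."""
    private = bytearray(b"parent")          # ordinary memory: copy-on-write after fork
    shared = mmap.mmap(-1, mmap.PAGESIZE)   # anonymous shared mapping: same physical page
    shared[:6] = b"parent"

    pid = os.fork()
    if pid == 0:                            # child
        private[:] = b"child!"              # lands in the child's own copy of the page
        shared[:6] = b"child!"              # lands in the page both processes map
        os._exit(0)

    os.waitpid(pid, 0)                      # parent waits until the child has written
    result = bytes(private), bytes(shared[:6])
    shared.close()
    return result


if __name__ == "__main__":
    print(fork_private_vs_shared())
```

The parent still sees its own bytes in the private buffer but sees the child's bytes in the shared mapping.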
Hope this clarifies what is going on with the memory pages.
What I strongly suspect is that parts of the program (blobs of information) are not promptly erased from RAM unless running code requests more RAM. What probably happens is that the OS reuses some OS-dependent bits still in RAM on the next execution. I think this is true for OS-initiated resources (and probably only for some of them, not all).
Actually, most of your questions are highly implementation-dependent. But for the most widely used operating systems:
"What happens when a program exits? Does the binary stay in memory?"
Yes, but the memory blocks are marked as unused (and thus could be allocated to other processes).
"Would subsequent invocations of the program find that the code is already in memory and thus not have page faults (assuming nothing runs in between and pages stuff out to disk)?"
No, those blocks are considered empty. Some/all blocks might have been overwritten already.
"Why doesn't the executable get to stay in memory?"
Why would it stay? When a process is finished, all of its allocated resources are freed.
One of the reasons is that one generally wants to clear everything out on a subsequent invocation in case there was a problem in the previous one.
Plus, the writeable data must be moved out.
That said, some systems do have mechanisms for keeping executables and static data in memory (possibly not Linux). For example, the VMS operating system allows the system manager to install executables and shared libraries so that they remain in memory (paging allowed). The same system can be used to create writeable shared memory, allowing interprocess communication and letting modifications to that memory remain in memory (possibly paged out).

Kernel code responsible for virtual memory

I know that "even a single process can have a virtual address space larger than the system's physical memory", so I want to know which kernel code is responsible for creating virtual memory larger than physical memory.
Second, can I change the code to make it a little larger? Is there any performance benefit if I change the code to expand virtual memory?
All of the memory management (and address space management) code is involved.
From the application point of view, you should learn more about virtual memory (the kernel controls the MMU and handles page faults), notably the mmap(2), mprotect(2), madvise(2) and execve(2) syscalls. Applications change their address space using these syscalls. You can use the proc(5) filesystem to query it. For instance, cat /proc/self/maps shows the address space of the process executing that cat.
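The same information that cat /proc/self/maps prints can be read programmatically. The following is a small sketch, assuming Linux and the line layout documented in proc(5) (address range, permissions, offset, device, inode, optional path); the function name is hypothetical.

```python
def read_own_mappings():
    """Parse /proc/self/maps into (start, end, perms, path) tuples."""
    mappings = []
    with open("/proc/self/maps") as maps:
        for line in maps:
            fields = line.split(maxsplit=5)
            start, end = (int(part, 16) for part in fields[0].split("-"))
            path = fields[5].strip() if len(fields) > 5 else ""
            mappings.append((start, end, fields[1], path))
    return mappings


if __name__ == "__main__":
    for start, end, perms, path in read_own_mappings():
        print(f"{start:#x}-{end:#x} {perms} {path or '[anonymous]'}")
```

Each tuple is one segment of the calling process's virtual address space; executable code shows up with an x in the permissions, and shared libraries appear under their file paths.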
Read also Advanced Linux Programming. Learn more about VDSO & ASLR.
Within the kernel, the relevant source code is mostly in its mm/ subdirectory (but nearly every filesystem has mmap-specific code, and page faults are also related to scheduling, etc.).
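The statement quoted in the question, that one process can have more virtual address space than there is physical memory, can be tried without touching kernel code. This is a sketch under assumptions: a 64-bit Linux system, Python's mmap module, and no address-space limit (such as RLIMIT_AS) that would refuse the request, in which case the hypothetical function below simply returns None.

```python
import mmap
import os


def reserve_more_than_ram():
    """Reserve (without touching) twice the machine's RAM; return (ram, reserved) in bytes."""
    ram = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    size = 2 * ram
    try:
        # Private, read-only, anonymous: address space is reserved, but no page
        # is allocated until something actually accesses it.
        region = mmap.mmap(-1, size,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                           prot=mmap.PROT_READ)
    except (OSError, ValueError, OverflowError):
        return None                         # the system refused the reservation
    try:
        return ram, len(region)
    finally:
        region.close()


if __name__ == "__main__":
    print(reserve_more_than_ram())
```

The mapping only consumes address space; making it larger brings no speed-up by itself, since physical pages are still assigned one fault at a time.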

Running two processes in Unix/Linux

When the kernel creates two processes whose code section is the same, does the kernel actually copy the code to the virtual address space of both processes? In other words, if I create two processes of the same program, do we have two copies of the program in memory or just one copy?
Obviously, it may depend on implementation but I'm asking in traditional Unix OS.
"Does the kernel actually copy the code to the virtual address space of both processes?"
The text segment will be mapped (rather than copied) into the virtual address space of each process, but will be referring to the same physical space (so the kernel will only have one copy of the text in memory).
The data and bss segments will also be mapped into the virtual address space of each process, but these will be created per process. At process initiation, the data from the data and bss segments from the executable will be mapped/copied into the process's virtual memory; if it was not copied ab initio then as soon as the processes start writing to the data the process will be given its own private copy.
Clearly, shared memory and mmap'd memory are handled after the process starts. Shared memory is always shared between processes; that's its raison d'être. What happens with mmap depends on the flags used, but it is often shared too.
Modern operating systems will use Copy-on-Write to avoid duplicating pages until they are actually updated. Note that on many systems (including Linux) this can lead to overcommit, where the OS doesn't actually have enough RAM to cope with all the copying required should every process decide to modify un-duplicated pages.
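On Linux the overcommit policy mentioned above is exposed as a sysctl. The sketch below assumes the documented /proc/sys/vm/overcommit_memory file, whose value is 0 (heuristic overcommit), 1 (always overcommit) or 2 (strict accounting); the function name is made up, and it returns None where the file is not readable.

```python
def overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the kernel's overcommit policy (0, 1 or 2), or None if it cannot be read."""
    try:
        with open(path) as setting:
            return int(setting.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("vm.overcommit_memory =", overcommit_mode())
```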
