I'm new to Linux and computer architecture, and I have some questions on how processes and threads relate to virtual memory and physical memory (RAM). My questions are below.
Q1 - When two processes (process A and process B) are running concurrently and process A is running now, process B's state (register values, heap objects, etc.) has to be pushed out and stored on disk (virtual memory). When the next context switch happens, process B is "recovered" from disk into RAM and process A's state is pushed to disk. Is my understanding correct?
Q2 - If my understanding in Q1 is correct, why not just keep all processes in RAM too? We normally have plenty of RAM (16 GB, 32 GB, etc.), so why not store every process's state in RAM, and only when there are too many processes and RAM is about to run out store further processes' states on disk?
Q3 - How about threads? If there are multiple threads (e.g. thread A and thread B) and thread A is running, will thread B's state be pushed to disk too?
is my understanding correct?
No, it's wrong. Waiting or blocked processes don't get swapped to disc. They wait in memory. Virtual memory is not on disc.
Also on a system with two processors, two processes are running concurrently, so both processes A and B can be running at the same time.
why not just save all processes on RAM too?
This is exactly what happens. Every process's memory simply stays in RAM until the scheduler switches to that process.
Side note: if there is no RAM available, the system has swap configured, and the process has been idle for some defined time, then it may get swapped to disc, i.e. the process's memory may get moved to disc. But this doesn't happen immediately; it happens after a long time and only in certain situations.
will be pushed to stored on disk too?
No.
Virtual memory is not about the physical location of the memory. It's the other way round: virtual memory is an abstraction that allows the system to change the physical location (if there is one at all) of the memory. The simplest explanation I give: there is a special CPU register whose value is added to each address upon dereferencing. A user space program does *(int*)4, but it doesn't get the value at byte 4 of RAM, because the special register's value is added to the pointer value first. The register value is configured by the system and can be different for different programs. So you can have the exact same pointer values in two programs, yet they point to different locations. Of course, this is an over-over-simplification.
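To make the "special register" picture concrete, here is a minimal toy sketch in Python. It is not how a real MMU works; the `Process` class, the 64-byte `RAM` and the base values are all made up for illustration.

```python
# Toy model of the "base register" simplification: every address a process
# uses is offset by a per-process value that only the OS can set.
# RAM size and base values are arbitrary numbers chosen for the example.

RAM = bytearray(64)  # pretend physical memory


class Process:
    def __init__(self, name, base):
        self.name = name
        self.base = base  # hypothetical per-process base register value

    def store(self, pointer, value):
        RAM[self.base + pointer] = value  # the "hardware" adds the base

    def load(self, pointer):
        return RAM[self.base + pointer]


a = Process("A", base=0)
b = Process("B", base=32)

a.store(4, 11)  # both processes use the same pointer value 4 ...
b.store(4, 22)  # ... but end up in different physical bytes

print(a.load(4), b.load(4))  # 11 22
print(RAM[4], RAM[36])       # 11 22
```

Both programs dereference "address 4", yet neither can see the other's value.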
I always read that, at any given time, the processor can only run one process at a time. So one and only one process is in state running.
However, we can have a number of runnable processes. These are all the processes that are waiting for the scheduler to schedule their execution.
At any given time, do all these runnable processes exist in the user address space? Or is only the currently running process in the user address space, with the others brought back into RAM from disk only once they are scheduled? In that case, does the kernel keep the process task descriptor in its list of runnable processes even while the process is on disk? I guess you can tell I am confused.
If the CPU supports virtual memory addressing, each process has its own view of memory. Two different processes that read from the same memory address will be mapped to different locations in physical memory, unless the memory maps say otherwise (shared memory; DLL files, for instance, are mapped read-only like this).
If the CPU does not support virtual memory, but only memory protection, the memory of the other processes will be protected, so that the running process can only access its own memory.
When a process P1 is in a blocked or suspended state, will the memory management system swap it out of main memory to make room for an active process?
And if the process is going to come back, where are the program's procedure call stack, the contents of the program counter (PC) and the contents of the program status word (PSW) stored? Does the OS keep it all in secondary memory, or is part of the suspended/blocked process P1 kept in main memory?
So I'm guessing when a process is swapped out of memory and put in a
suspended state, all of its resident pages are moved out. When the
process is resumed, all of the pages that were previously in main
memory are returned to main memory
Think in terms of pages, not processes.
Even an active process may have many pages evicted out of physical memory and into swap if the system is under memory pressure.
So, sure, a suspended process may have effectively all of its pages swapped out entirely.
But it is unlikely to have all pages swapped in simply because the process woke up. Doing so would be a waste of CPU, I/O and memory. Instead, pages will be brought back as needed (general case -- some pagers may bring back sets of pages heuristically).
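As a rough illustration of "pages will be brought back as needed", here is a toy pager in Python. `ToyPager` and the page numbers are hypothetical; a real kernel does this in its page-fault handler and may also prefetch neighbouring pages.

```python
# Toy demand pager: the pages of a woken process come back only when touched.
class ToyPager:
    def __init__(self, pages):
        self.swapped = set(pages)  # every page starts out in swap
        self.resident = set()      # pages currently in physical memory
        self.faults = 0

    def touch(self, page):
        if page not in self.resident:
            self.faults += 1            # page fault: bring this one page in
            self.swapped.discard(page)
            self.resident.add(page)


pager = ToyPager(range(100))   # a process with 100 pages, all swapped out
for page in [3, 3, 7, 3, 42]:  # after waking it touches only a few pages
    pager.touch(page)

print(pager.faults)        # 3
print(len(pager.swapped))  # 97
```

Only the three pages that were actually touched cost any I/O; the other 97 stay in swap.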
If a process is active, then it won't be swapped out, so the dynamic state of the lowest call stack (all the register noise, red zone on stack, etc... ) isn't in play when the swap happens.
I.e. for a process to be swapped out the threads need to be blocked on something, typically a call into the kernel or into a system library that is blocking. Registers will be out of play, etc... Thus, the execution state that needs to be swapped out is pretty straightforward as the call return state will be preserved in the thread state itself (as the thread is blocked).
In fact, things like the PC and the PSW are preserved more as a part of the context switching subsystem than paging. I.e. on a typical system, you'll likely have several hundred, maybe thousands, of threads running at once across the N physical cores of the CPU. The concurrency support of the architecture is where you'll find how that state is maintained.
By swapped and terminated, I mean, if the process is about to be swapped to a swap space or terminated(by OOM killer) to free up memory.
What algorithm does the linux kernel follow?
For instance, Process A needs extra memory and Process B has been chosen to be swapped or killed(if swap space is already occupied), but process B still has a blocking thread.
a.) Does process B get swapped or killed regardless of the blocking thread?
b.) If not, how is this kind of case handled?
If my example is an unlikely case, any insights would be appreciated.
Yeah - you need to read up on paged virtual memory, as suggested by @CL. Processes are not swapped out in their entirety, and swapping != termination.
If the OS needs to terminate a process, either because of a specific API request or because of its OOM algorithm, the OS stops all its threads first. Blocked threads are easy to 'stop' because they are not running anyway - it's only necessary to change their state to ensure that they are never run again. Thread/s that are actually running on cores have to be stopped by means of an inter-core comms driver that can hardware-interrupt the cores running the threads. Once all threads are not running, the resources, including all user-space memory, allocated to the process can be freed and OS thread/process management structs released. The process then no longer exists.
I understand how programs in machine code can load values from memory in to registers, perform jumps, or store values in registers to memory, but I don't understand how this works for multiple processes. A process is allocated memory on the fly, so must it use relative addressing? Is this done automatically (meaning there are assembly instructions that perform relative jumps, etc.), or does the program have to "manually" add the correct offset to every memory position it addresses.
I have another question regarding multitasking that is somewhat related. How does the OS, which isn't running, stop a thread and move on to the next. Is this done with timed interrupts? If so, then how can the values in registers be preserved for a thread. Are they saved to memory before control is given to a different thread? Or, rather than timed interrupts, does the thread simply choose a good time to give up control. In the case of timed interrupts, what happens if a thread is given processor time and it doesn't need it. Does it have to waste it, can it call the interrupt manually, or does it alert the OS that it doesn't need much time?
Edit: Or are executables edited before being run to compensate for the correct offsets?
That's not how it works. All modern operating systems virtualize the available memory, giving every process the illusion that it has 2 gigabytes of memory (or more) and doesn't have to share it with anybody. The key component in a machine that does this is the MMU, nowadays built into the processor itself. Another core feature of this virtualization is that it isolates processes: one misbehaving process cannot bring another one down with it.
Yes, a clock tick interrupt is used to interrupt the currently running code. Processor state is simply saved on the stack. The operating system scheduler then checks whether any other thread is ready to run and has a high enough priority to get first in line. Some extra code ensures that everybody gets a fair share. Then it is just a matter of restoring the other thread's saved state, and reprogramming the MMU if that thread belongs to a different process, to resume execution there. If no thread is ready to run, the CPU is halted with the HLT instruction until the next interrupt wakes it.
This is the ten-thousand-foot view; it is well covered in any book about operating system design.
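A small sketch of the clock-tick idea, assuming a plain round-robin policy with no priorities (real schedulers are far more elaborate). The function name `round_robin` and the tick counts are made up for the example.

```python
from collections import deque


def round_robin(work, slice_ticks):
    """work maps a thread name to the CPU ticks it still needs.

    Returns the sequence of (thread, ticks_run) slices that were handed out.
    """
    ready = deque(work.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(slice_ticks, remaining)
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # The clock tick fired before the thread finished:
            # it is preempted and goes to the back of the queue.
            ready.append((name, remaining))
    return timeline


timeline = round_robin({"A": 5, "B": 2, "C": 3}, slice_ticks=2)
print(timeline)
# [('A', 2), ('B', 2), ('C', 2), ('A', 2), ('C', 1), ('A', 1)]
```

No thread has to cooperate here: the tick takes the CPU away after at most two ticks of work.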
A process is allocated memory on the fly, so must it use relative addressing?
No, it can use relative or absolute addressing depending on what it is trying to address.
At least historically, the various different addressing modes were more about local versus remote memory. Relative addressing was for memory addresses close to the current address while absolute was more expensive but could address anything. With modern virtual memory systems, these distinctions may be no longer necessary.
A process is allocated memory on the fly, so must it use relative addressing? Is this done automatically (meaning there are assembly instructions that perform relative jumps, etc.), or does the program have to "manually" add the correct offset to every memory position it addresses.
I'm not sure about this one. This is normally taken care of by the compiler. Again, modern virtual memory systems make this complexity unnecessary.
Are they saved to memory before control is given to a different thread?
Yes. Typically all of the state (registers, etc.) is stored in a process control block (PCB), then the registers and other context are loaded from the new PCB, and execution begins in the new context. The PCB can be stored on the stack or in kernel memory, or the kernel can use processor-specific operations to optimize this process.
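Here is a toy version of that save/restore cycle, just to show the bookkeeping. The `PCB` and `ToyCPU` classes and the three-register set are invented for the sketch; a real PCB holds much more (and the saving is done in assembly).

```python
from dataclasses import dataclass, field


@dataclass
class PCB:
    name: str
    # Saved register snapshot; a real PCB holds far more state than this.
    registers: dict = field(default_factory=lambda: {"pc": 0, "sp": 0, "r0": 0})


class ToyCPU:
    def __init__(self):
        self.registers = {"pc": 0, "sp": 0, "r0": 0}
        self.current = None

    def switch_to(self, incoming):
        if self.current is not None:
            # Save the outgoing context into its PCB.
            self.current.registers = dict(self.registers)
        # Load the incoming context from its PCB.
        self.registers = dict(incoming.registers)
        self.current = incoming


cpu = ToyCPU()
t1, t2 = PCB("T1"), PCB("T2")

cpu.switch_to(t1)
cpu.registers["pc"] = 100  # T1 runs for a while
cpu.registers["r0"] = 7

cpu.switch_to(t2)
cpu.registers["pc"] = 500  # T2 runs for a while

cpu.switch_to(t1)          # T1 resumes exactly where it left off
print(cpu.registers)       # {'pc': 100, 'sp': 0, 'r0': 7}
```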
Or, rather than timed interrupts, does the thread simply choose a good time to give up control.
The thread can yield control -- put itself back at the end of the run queue. It can also wait for some IO or sleep. Thread libraries then put the thread in wait queues and switch to another context. When the IO is ready or the sleep expires, the thread is put back into the run queue. The same happens with mutex locks. It waits for the lock in a wait queue. Once the lock is available, the thread is put back into the run queue.
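The run queue / wait queue shuffle can be sketched with Python generators standing in for threads. Everything here is an assumption made for the demo: the `worker` and `schedule` names, the "yield"/"io" request strings, and especially the rule that pending I/O "completes" as soon as nothing else is runnable.

```python
from collections import deque


def worker(name, log):
    log.append(f"{name}1")
    yield "yield"  # voluntarily go to the back of the run queue
    log.append(f"{name}2")
    yield "io"     # block until the (pretend) I/O completes
    log.append(f"{name}3")


def schedule(threads):
    run_queue = deque(threads)
    wait_queue = deque()
    while run_queue or wait_queue:
        if not run_queue:
            # Simplification: the oldest pending I/O completes whenever
            # there is nothing left to run.
            run_queue.append(wait_queue.popleft())
            continue
        thread = run_queue.popleft()
        try:
            request = next(thread)  # run until the thread gives up the CPU
        except StopIteration:
            continue                # thread finished
        if request == "io":
            wait_queue.append(thread)
        else:
            run_queue.append(thread)


log = []
schedule([worker("A", log), worker("B", log)])
print(log)  # ['A1', 'B1', 'A2', 'B2', 'A3', 'B3']
```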
In the case of timed interrupts, what happens if a thread is given processor time and it doesn't need it. Does it have to waste it, can it call the interrupt manually, or does it alert the OS that it doesn't need much time?
Either the thread can run (perform CPU instructions) or it is waiting -- either on IO or a sleep. It can ask to yield but typically it is doing so by [again] sleeping or waiting on IO.
I probably walked into this question quite late, but then, it may be of use to some other programmers. First - the theory.
A modern operating system virtualizes memory, and to do so it maintains, within its system memory area, a series of page pointers. Each page has a fixed size (usually 4K), and when a program requests memory, the addresses it is given are virtualized through the memory page pointers. This approximates the behaviour of the "segment" registers in the prior generation of processors.
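A minimal sketch of what translating through such page pointers looks like, assuming 4 KiB pages and a flat, single-level table (real page tables are multi-level and live in hardware-defined formats). The `translate` function and the frame numbers are hypothetical.

```python
PAGE_SIZE = 4096  # 4 KiB pages


def translate(page_table, virtual_address):
    """Map a virtual address to a physical one through a flat page table."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[page_number]  # a KeyError stands in for a page fault
    return frame * PAGE_SIZE + offset


# Two processes, each with its own table (frame numbers are arbitrary).
table_a = {0: 5, 1: 9}
table_b = {0: 2, 1: 7}

va = 0x1234  # page 1, offset 0x234
print(hex(translate(table_a, va)))  # 0x9234
print(hex(translate(table_b, va)))  # 0x7234
```

The offset within the page is untouched; only the page number is swapped for a physical frame number, and each process gets a different answer for the same virtual address.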
Now when the scheduler decides to get another process running, it may or may not keep the previous process in memory. If it keeps it in memory, then all that the scheduler does is to save the entire register snapshot (now, including YMM registers - this bit was a complex issue earlier as there are no single instructions that saved the entire context : read up on XSAVE), and this has a fixed format (available in Intel SW manual). This is stored in the memory space of the scheduler itself, along with the information on the memory pages that were being used.
If, however, the scheduler needs to "dump" the context of the process that is about to go to sleep to the hard disk (this usually arises when the process that is waking up needs an extraordinary amount of memory), it writes the memory pages to disk blocks in the pagefile (a reserved area of disk, and the source of the "old grandmother wisdom" that the pagefile must equal the size of real memory) and keeps the memory page pointers as offsets into the pagefile. When the process wakes up, the scheduler looks up the pagefile offsets, allocates real memory, populates the memory page pointers, and then loads the contents from the disk blocks.
Now, to answer your specific questions :
1. Do you need to use only relative addressing, or can you use absolute?
Ans. You may use either. Whatever you perceive as absolute is also relative, because the memory page pointer relativizes that address invisibly. There is no truly absolute memory address anywhere (including the I/O device memories) except in the kernel of the operating system itself. To test this, you may disassemble any .EXE program and see that the entry point is always CALL 0010, which clearly implies that each thread gets a different "0010" at which to start execution.
2. How do threads get scheduled, and what if a thread surrenders its unused slice?
Ans. Threads usually get a time slice in order of their position on the process queue. Modern systems use about 20ms as the usual standard, but this is sometimes changed in special-purpose builds for servers that do not have many hardware interrupts to deal with. A thread usually surrenders its slice by calling sleep(), which is the formal (and very nice) way to give up the rest of the time slice. Most libraries implementing asynchronous reads or interrupt actions call sleep() internally, but in many instances top-level programs also call sleep(), e.g. to create a time gap. A call to sleep() always triggers a context switch; the CPU is not left to idle by executing NOPs.
The other method is to wait for an IO to complete, and this is handled differently. The program on asking for an IO process, will cede its time slice, and the process scheduler flags this thread to be in "WAITING FOR AN IO" state - and this thread will not be given a time slice by the processor till its intended IO is completed, or timed out. This feature helps programmers as they do not have to explicitly write a sleep_until_IO() kind of interface.
Trust this sets you going further in your explorations.
Could any one tell me what is exactly done in both situations? What is the main cost each of them?
The main distinction between a thread switch and a process switch is that during a thread switch, the virtual memory space remains the same, while it does not during a process switch.
Both types involve handing control over to the operating system kernel to perform the context switch. The process of switching in and out of the OS kernel along with the cost of switching out the registers is the largest fixed cost of performing a context switch.
A fuzzier cost is that a context switch messes with the processor's caching mechanisms. Basically, when you context switch, all of the memory addresses that the processor "remembers" in its cache effectively become useless. The one big distinction here is that when you change virtual memory spaces, the processor's Translation Lookaside Buffer (TLB) or equivalent gets flushed, making memory accesses much more expensive for a while. This does not happen during a thread switch.
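A toy model of that TLB difference, under strong assumptions: the `ToyTLB` class is invented, the TLB is unbounded, it is flushed completely on every address-space change, and both threads touch the same four pages (which is the friendliest possible case for a thread switch).

```python
class ToyTLB:
    """Caches page -> frame translations and counts the misses."""

    def __init__(self):
        self.entries = {}
        self.misses = 0

    def lookup(self, page, page_table):
        if page not in self.entries:
            self.misses += 1                       # miss: walk the page table
            self.entries[page] = page_table[page]
        return self.entries[page]

    def flush(self):
        self.entries.clear()


def touch(tlb, page_table, pages):
    for page in pages:
        tlb.lookup(page, page_table)


table_p1 = {p: p + 100 for p in range(8)}  # address space of process 1
table_p2 = {p: p + 200 for p in range(8)}  # address space of process 2
working_set = [0, 1, 2, 3]

# Thread switch: same address space, cached translations stay valid.
tlb = ToyTLB()
touch(tlb, table_p1, working_set)  # thread 1 warms the TLB
touch(tlb, table_p1, working_set)  # thread 2 of the same process
thread_switch_misses = tlb.misses

# Process switch: new address space, old translations are useless.
tlb = ToyTLB()
touch(tlb, table_p1, working_set)
tlb.flush()                        # address space changed
touch(tlb, table_p2, working_set)
process_switch_misses = tlb.misses

print(thread_switch_misses, process_switch_misses)  # 4 8
```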
Process context switching involves switching the memory address space. This includes memory addresses, mappings, page tables, and kernel resources—a relatively expensive operation. On some architectures, it even means flushing various processor caches that aren't sharable across address spaces. For example, x86 has to flush the TLB and some ARM processors have to flush the entirety of the L1 cache!
Thread switching is context switching from one thread to another in the same process (switching from thread to thread across processes is just process switching). Switching processor state (such as the program counter and register contents) is generally very efficient.
First of all, the operating system brings the outgoing thread into kernel mode if it is not already there, because a thread switch can only be performed between threads that are running in kernel mode. Then the scheduler is invoked to decide which thread to switch to. Once the decision is made, the kernel saves the part of the thread context that lives in the CPU (the CPU registers) into a dedicated place in memory (frequently on top of the kernel stack of the outgoing thread). Then the kernel switches from the kernel stack of the outgoing thread to the kernel stack of the incoming thread. After that, the kernel loads the previously stored context of the incoming thread from memory into the CPU registers. Finally, it returns control to user mode, but in the user mode of the new thread.
If the OS has determined that the incoming thread runs in another process, the kernel performs one additional step: it sets the new active virtual address space.
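The steps above can be sketched as follows. This only records the order of operations in a log; the `Thread` and `AddressSpace` classes are hypothetical, and where exactly the address-space step sits relative to the others is my assumption, since the description does not pin it down.

```python
from dataclasses import dataclass


@dataclass
class AddressSpace:
    name: str


@dataclass
class Thread:
    name: str
    mm: AddressSpace  # the address space this thread runs in


def context_switch(outgoing, incoming):
    """Return the list of steps a toy kernel would perform."""
    steps = [
        f"save registers of {outgoing.name}",
        f"switch to kernel stack of {incoming.name}",
    ]
    if incoming.mm is not outgoing.mm:
        # Extra step only when the threads belong to different processes.
        steps.append(f"activate address space {incoming.mm.name}")
    steps.append(f"restore registers of {incoming.name}")
    return steps


proc1, proc2 = AddressSpace("P1"), AddressSpace("P2")
t1, t2, t3 = Thread("T1", proc1), Thread("T2", proc1), Thread("T3", proc2)

thread_switch = context_switch(t1, t2)   # same process: 3 steps
process_switch = context_switch(t1, t3)  # different process: 4 steps
print(len(thread_switch), len(process_switch))  # 3 4
```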
The main cost in both scenarios is related to cache pollution. In most cases, the working set used by the outgoing thread will differ significantly from the working set used by the incoming thread. As a result, the incoming thread will start its life with an avalanche of cache misses, flushing old and useless data from the caches and loading the new data from memory. The same is true for the TLB (Translation Lookaside Buffer, which is on the CPU). When the virtual address space is reset (the threads run in different processes) the penalty is even worse, because the reset flushes the entire TLB, even if the new thread actually needs to load only a few new entries. As a result, the new thread will start its time quantum with lots of TLB misses and frequent page walks. The direct cost of a thread switch is also not negligible (from ~250 up to ~1500-2000 cycles) and depends on the CPU complexity, the states of both threads and the sets of registers they actually use.
P.S.: Good post about context switch overhead: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html
process switching: a transition between two memory-resident processes in a multiprogramming environment;
context switching: a change of context from an executing program to an interrupt service routine (ISR).
In a thread context switch the virtual memory space remains the same, while in a process context switch it changes. Also, a process context switch is costlier than a thread context switch.
I think the main difference is the call to switch_mm(), which handles the memory descriptors of the old and new task. In the case of threads, the virtual memory address space is unchanged (threads share virtual memory), so very little has to be done, and the switch is therefore less costly.
Although thread context switches need to change the execution context (registers, stack pointers, program counters), they don't need to change the address space as process context switches do. There's an additional cost when you switch address space: more memory accesses (paging, segmentation, etc.), and you have to flush the TLB when entering or exiting a process...
In short, a thread context switch does not move to a brand new set of memory and pid; it uses the same ones as the parent, since the thread is running within the same process. A process switch moves to a different process, which has its own memory and pid.
There is a loooooot more to it. They have written books on it.
As for cost, a process context switch >>>> thread as you have to reset all of the stack counters etc.
Assuming that the CPU the OS runs on has some high-latency devices attached,
it makes sense to run another thread in the same process's address space while waiting for the high-latency device to respond.
But if the high-latency device responds faster than the time needed to set up the tables and virtual-to-physical translations for a NEW process, then it is questionable whether a switch is worthwhile at all.
Also, a HOT cache (the data needed to run the process/thread is reachable in less time) is the better choice.