Kernel mode in multithreaded program - linux

If the only thread of a single-threaded process makes a system call, the process switches to kernel mode. But what happens in the case of a multi-threaded process?
In other words, if a thread in a process makes a system call, what is the mode of the process that contains that thread: kernel mode or user mode?

In Linux a thread is simply a process that happens to share memory with several other processes (other threads within the same process).
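So the CPU will be in kernel mode during the syscall, but only while it is executing that thread: the mode belongs to whatever a CPU is currently executing, not to the process as a whole, so other threads of the same process can keep running in user mode. Execution can still switch to some other thread or process when the time slice expires, just as the kernel normally switches from process to process even if the currently running process is executing a syscall.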
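The "a thread is a task that shares memory" view can be observed from user space. The following is a minimal sketch, assuming Python 3.8+ (for threading.get_native_id()); the helper name collect_ids is made up for illustration. Every thread reports the same process ID but its own kernel thread ID.

```python
import os
import threading

def collect_ids(n_threads=3):
    """Start n_threads workers and record (pid, kernel thread id) for each."""
    ids = []
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        barrier.wait()  # all workers are alive at once, so no id can be reused
        with lock:
            ids.append((os.getpid(), threading.get_native_id()))

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids

ids = collect_ids()
for pid, tid in ids:
    print(f"pid={pid} tid={tid}")
```

The thread IDs are what the kernel schedules; the shared PID is what user space calls "the process".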


How does operating system preempt a process and regain control?

When a process is running on a CPU, the operating system is not running in the background, since a single-core CPU can execute only one instruction at a time. How, then, does the operating system preempt a process? Is it done by the hardware?
I couldn't find an answer anywhere.
To understand how the OS regains control from a running process, the concept of interrupts must be understood. An interrupt is a signal sent to the CPU telling it to suspend (i.e. interrupt) whatever it is currently executing and run a handler that the operating system registered in advance. This is accomplished at the hardware level: a device such as a timer raises the interrupt, and the CPU itself saves the minimum state and jumps to the handler in kernel mode. For preemption, the relevant interrupt is the timer interrupt that the OS programs before it lets a process run.
When an interrupt occurs, the interrupted process's program counter, stack pointer and registers are saved, and the program counter is pointed at the interrupt handler. The handler can then call the scheduler, which decides which process to run next; this may be the interrupted process or another one, for example a process that was waiting for the event the interrupt signalled. When the scheduler eventually picks the original process again, that process's execution context is reloaded into the machine (since it was saved earlier) and it continues as if nothing had happened. Saving the state of one process, running another, and later restoring the first is known as a context switch.
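A user-space analogy may help here. This is only a sketch of the idea, not of what the kernel does internally: a POSIX interval timer delivers SIGALRM to a loop that never sleeps or yields, much as a hardware timer interrupt takes the CPU away from a running process. The function name run_until_interrupted is made up, and the code assumes it runs in the main thread on a Unix system.

```python
import signal

def run_until_interrupted(seconds=0.05):
    """Spin in a loop that never yields; return the count when the timer fires."""
    fired = []

    def on_timer(signum, frame):          # plays the role of the interrupt handler
        fired.append(signum)

    old = signal.signal(signal.SIGALRM, on_timer)
    signal.setitimer(signal.ITIMER_REAL, seconds)    # one-shot timer
    count = 0
    try:
        while not fired:                  # no sleep, no yield: pure computation
            count += 1
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)      # cancel if still pending
        signal.signal(signal.SIGALRM, old)           # restore the old handler
    return count, fired[0]

count, signum = run_until_interrupted()
print(f"loop ran {count} iterations before signal {signum} interrupted it")
```

The loop contains nothing that hands control back, yet the handler still runs. This is the property the timer interrupt gives the operating system.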

How does the scheduler know that a thread is blocked waiting for input?

When a thread executing user code is waiting for input, how does the scheduler know to interrupt it or how does the thread know to call the scheduler, seeing as the average programmer of a simple single threaded application is unlikely to insert sched_yield() everywhere. Does the compiler insert sched_yield() on optimisation or does the thread just spin lock until the general timer interrupt set by the scheduler fires, or does the user have to explicitly state wait(), sleep() functions in order for the context to switch?
This question is especially relevant if the scheduler is not preemptive because then it has to call the scheduler when it is waiting for input for throughput to be effective, but I'm not sure how it does this.
Be careful not to confuse preemption with the ability of a process to sleep. Processes can sleep even with a non-preempting scheduler. This is what happens when a process is waiting for I/O. The process makes a system call such as read() and the kernel (typically the device driver) determines that no data is available. The kernel then puts the process to sleep by updating a data structure used by the scheduler. The scheduler then executes other processes until an interrupt or some other event occurs that wakes the original process. The awoken process then becomes eligible again for scheduling.
On the other hand, preemption is the ability of an operating system's scheduler to stop execution of a process without its cooperation. The interruption can occur anywhere in the program's instruction stream. Control returns to the scheduler, which can then execute other processes and return to the interrupted (preempted) process later. Most schedulers allocate time slices: a process is allowed to run for up to a predetermined amount of time, after which it is preempted if other processes are waiting to run.
Unless you're writing drivers or kernel code, you don't need to worry about the underlying mechanisms too much. When writing user-space applications the key concepts are (1) that some system calls may block which means your process is put to sleep until an event occurs, and (2) on preemptible systems (all mainstream modern operating systems) your program may be preempted at any time so that other processes can run.
* Note that on some platforms, such as Linux, a thread is really just another process that shares its virtual address space with other processes. Processes and threads are therefore treated exactly the same by the scheduler.
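The first point, that a blocking system call puts the caller to sleep rather than spinning, can be checked from user space. The following is a minimal sketch (the name blocked_read_demo is made up): one thread blocks in read() on an empty pipe while a helper thread writes to it after a delay. Wall-clock time passes, but almost no CPU time is consumed while waiting.

```python
import os
import threading
import time

def blocked_read_demo(delay=0.3):
    """Block in read() on an empty pipe; return (wall_seconds, cpu_seconds, data)."""
    r, w = os.pipe()

    def writer():
        time.sleep(delay)            # nothing to read until this fires
        os.write(w, b"ready")

    t = threading.Thread(target=writer)
    t.start()
    wall0, cpu0 = time.monotonic(), time.process_time()
    data = os.read(r, 16)            # the kernel puts this thread to sleep here
    wall = time.monotonic() - wall0
    cpu = time.process_time() - cpu0 # CPU time used by the whole process
    t.join()
    os.close(r)
    os.close(w)
    return wall, cpu, data

wall, cpu, data = blocked_read_demo()
print(f"waited {wall:.3f}s of wall time using {cpu:.3f}s of CPU time, got {data!r}")
```

No sched_yield() or explicit sleep appears in the reading thread; the blocking read() is enough for the kernel to run something else in the meantime.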
It is not clear to me whether your question is about theory or practice. In practice, in every modern operating system, I/O operations are privileged, meaning that in order for a user process or thread to access files, devices and so on, it must issue a system call.
The kernel then has the opportunity to do whatever it considers appropriate. For example, it can check whether the I/O operation will block and, if so, switch away from the running process (i.e. "call" the scheduler) after issuing the operation.
Note that this mechanism can work even when there is no timer interrupt handled by the kernel. In general, though, it will depend upon your system. For example, in an embedded system where no OS exists (or only a minimal one), it could be entirely the responsibility of the user's code to invoke the scheduler before issuing a blocking operation.
Strictly speaking, it is the kernel that is or is not preemptive, not the scheduler.
First, sched_yield() and wait() are forms of voluntary preemption, in which the process itself gives up the CPU; this works even if the kernel is non-preemptive.
If the kernel can switch to another process when the time quantum has expired or when a higher-priority process becomes runnable, then we are talking about involuntary preemption, i.e. a preemptive kernel, and it can happen at the different points explained below.
The difference is that with sched_yield() the process stays in the runnable TASK_RUNNING state and just goes to the end of the run queue for its static priority. The process must wait to get the CPU again.
On the other hand, wait() puts the process to sleep in the TASK_(UN)INTERRUPTIBLE state on a wait queue, calls schedule() and waits for an event to occur. When the event occurs, the process is moved to the run queue again. But that doesn't mean that it will get the CPU immediately.
The following explains when schedule() can be called after a process is woken up:
Wakeups don't really cause entry into schedule(). They add a task to the run-queue and that's it.
If the new task added to the run-queue preempts the current task, then the wakeup sets TIF_NEED_RESCHED and schedule() gets called on the nearest possible occasion:
- If the kernel is preemptible (CONFIG_PREEMPT=y):
  - in syscall or exception context, at the next outmost preempt_enable(). (this might be as soon as the wake_up()'s spin_unlock()!)
  - in IRQ context, return from interrupt-handler to preemptible context
- If the kernel is not preemptible (CONFIG_PREEMPT is not set) then at the next:
  - cond_resched() call
  - explicit schedule() call
  - return from syscall or exception to user-space
  - return from interrupt-handler to user-space
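The voluntary/involuntary distinction above is visible in the per-process counters that getrusage() reports. The sketch below (the helper name context_switches is made up) reads the voluntary and involuntary context-switch counts before and after some blocking sleeps and one sched_yield(). The exact numbers depend on the kernel and on system load, so none are shown here.

```python
import os
import resource
import time

def context_switches():
    """Return (voluntary, involuntary) context-switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw

before = context_switches()
for _ in range(20):
    time.sleep(0.001)   # blocks: the task sleeps on a timer and gives up the CPU
os.sched_yield()        # stays runnable, merely offers the CPU to other tasks
after = context_switches()

print("voluntary switches:  ", after[0] - before[0])
print("involuntary switches:", after[1] - before[1])
```

On a typical Linux system the voluntary count grows with each blocking call, while the involuntary count only grows when the scheduler takes the CPU away.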

How does gdb multi-thread debugging coordinate with Linux thread scheduling?

When debugging a multi-threaded program using gdb, one can:
1. switch between existing threads
2. step through the code
3. etc.
Meanwhile, the process and its threads are OS resources, managed by and under the control of the Linux kernel. When gdb switches to a thread (say t1) from another (t2), how does it coordinate with the kernel, since the kernel might still want to run t2 for some period of time? Also, when gdb is stepping in one specific thread (by issuing the "si" command), how do the other threads get run (or are they totally paused) during this period?
When gdb switches to a thread (say t1) from another (t2), how does it coordinate with the kernel, since the kernel might still want to run t2 for some period of time?
By default, GDB operates in all-stop mode. That means that all threads are stopped whenever you see the (gdb) prompt. Switching between two stopped threads doesn't need any coordination with the kernel, because the kernel will not run non-runnable (stopped) threads.
In non-stop mode, threads other than the current one run freely, and the kernel can and will schedule them to run as it sees fit.
When gdb is stepping in one specific thread (by issuing the "si" command), how do the other threads get run (or are they totally paused) during this period?
When you step or stepi, by default all threads are resumed. You can control this with set scheduler-locking on, in which case only the current thread will be resumed. If you forget to turn scheduler-locking off and then continue, only the current thread will be resumed, which is likely to confuse you.

Threads: some questions

I have a couple of questions on threads. Could you please clarify?
1. Suppose a process has one or multiple threads. If the process is preempted/suspended, do the threads also get preempted, or do they continue to run?
2. When the suspended process is rescheduled, do the process's threads also get scheduled? If the process has multiple threads, which threads will be rescheduled, and on what basis?
3. If a thread in the process is running and receives a signal (say Ctrl-C) and the default action of the signal is to terminate the process, does only the running thread terminate, or does the parent process terminate as well? What happens to the threads if the running process terminates because of some signal?
4. If a thread does fork followed by exec, does the exec'd program overlay the address space of the parent process or of the running thread? If it overlays the parent process, what happens to the threads, their data and the locks they are holding, and how do they get scheduled once the exec'd process terminates?
5. Suppose a process has multiple threads; how do the threads get scheduled? If one of the threads blocks on some I/O, how do the other threads get scheduled? Are the threads scheduled while the parent process is running?
6. While a thread is running, what does the kernel variable current point to (the parent process's task_struct or the thread's task_struct)?
7. If the process with the thread is running, does the parent process get preempted when the thread starts, and how does each thread get scheduled?
8. If the process running on a CPU creates multiple threads, are the threads created by the parent process scheduled on another CPU on a multiprocessor system?
First, I should clear up some terminology that you appear to be confused about. In POSIX, a "process" is a single address space plus at least one thread of control, identified by a process ID (PID). A thread is an individually-scheduled execution context within a process.
All processes start life with just one thread, and all processes have at least one thread. Now, onto the questions:
Suppose a process has one or multiple threads. If the process is preempted/suspended, do the threads also get preempted, or do they continue to run?
Threads are scheduled independently. If a thread blocks on a function like connect(), then other threads within the process can still be scheduled.
It is also possible to request that every thread in a process be suspended, for example by sending SIGSTOP to the process.
When the suspended process is rescheduled, do the process's threads also get scheduled? If the process has multiple threads, which threads will be rescheduled, and on what basis?
This only makes sense in the context that an explicit request was made to stop the entire process. If you send the process SIGCONT to restart the process, then any of the threads which are not blocked can run. If more threads are runnable than there are processors available to run them, then it is unspecified which one(s) run first.
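The effect of SIGSTOP and SIGCONT on every thread of a process can be watched through /proc on Linux. The sketch below rests on several assumptions: a Linux /proc layout, a throwaway child interpreter with two threads, and the made-up helper names task_states and wait_for. It reads the state letter of each task: "T" means stopped, "S" means sleeping.

```python
import os
import signal
import subprocess
import sys
import time

# Child program: two threads, both asleep for 30 seconds.
CHILD = (
    "import threading, time\n"
    "threading.Thread(target=time.sleep, args=(30,), daemon=True).start()\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)

def task_states(pid):
    """Return the state letter of every thread (task) of process pid."""
    states = []
    for tid in os.listdir(f"/proc/{pid}/task"):
        with open(f"/proc/{pid}/task/{tid}/stat") as f:
            # The state letter is the first field after the ")" closing the name.
            states.append(f.read().rsplit(")", 1)[1].split()[0])
    return states

def wait_for(pid, predicate, timeout=5.0):
    """Poll task_states(pid) until predicate holds or the timeout expires."""
    deadline = time.monotonic() + timeout
    states = task_states(pid)
    while not predicate(states) and time.monotonic() < deadline:
        time.sleep(0.01)
        states = task_states(pid)
    return states

child = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE)
try:
    child.stdout.readline()                       # wait until both threads exist
    os.kill(child.pid, signal.SIGSTOP)            # sent to the process as a whole
    stopped = wait_for(child.pid, lambda s: all(x == "T" for x in s))
    os.kill(child.pid, signal.SIGCONT)
    resumed = wait_for(child.pid, lambda s: all(x != "T" for x in s))
finally:
    child.kill()
    child.wait()
    child.stdout.close()

print("after SIGSTOP:", stopped)
print("after SIGCONT:", resumed)
```

The signal is addressed to the process, yet every task under /proc/<pid>/task changes state, which matches the description above.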
If a thread in the process is running and receives a signal (say Ctrl-C) and the default action of the signal is to terminate the process, does only the running thread terminate, or does the parent process terminate as well? What happens to the threads if the running process terminates because of some signal?
If a thread receives a signal like SIGINT or SIGSEGV whose action is to terminate the process, then the entire process is terminated. This means that every thread in the process is unceremoniously killed.
If a thread does fork followed by exec, does the exec'd program overlay the address space of the parent process or of the running thread? If it overlays the parent process, what happens to the threads, their data and the locks they are holding, and how do they get scheduled once the exec'd process terminates?
The fork() call creates a new process by duplicating the address space of the original process, and duplicating just the single thread that called fork() within that new address space.
If that thread in the new process calls execve(), it will replace the new, duplicated address space with the exec'd program. The original process, and all its threads, continue running normally.
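That fork() copies only the calling thread can be checked on Linux by counting the entries under /proc/self/task on both sides of the fork. This is a minimal sketch with made-up helper names (count_tasks, fork_thread_counts), and it assumes a Linux /proc. Python 3.12+ warns about fork() in a multi-threaded process, so the warning is silenced for the demonstration.

```python
import os
import threading
import warnings

def count_tasks():
    """Number of kernel tasks (threads) in the calling process (Linux /proc)."""
    return len(os.listdir("/proc/self/task"))

def fork_thread_counts():
    """Return (threads in the parent, threads in the freshly forked child)."""
    stop = threading.Event()
    extra = threading.Thread(target=stop.wait)    # a second thread in the parent
    extra.start()
    r, w = os.pipe()
    parent_count = count_tasks()
    with warnings.catch_warnings():
        # Python 3.12+ emits a DeprecationWarning for fork() with several threads.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:                                   # child: only this thread was copied
        os.write(w, str(count_tasks()).encode())
        os._exit(0)
    os.waitpid(pid, 0)
    child_count = int(os.read(r, 16))
    os.close(r)
    os.close(w)
    stop.set()
    extra.join()
    return parent_count, child_count

parent_count, child_count = fork_thread_counts()
print(f"parent has {parent_count} threads, child started with {child_count}")
```

The extra thread keeps running in the parent; it simply does not exist in the child, which is why locks held by other threads at the time of fork() can never be released there.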
Suppose a process has multiple threads; how do the threads get scheduled? If one of the threads blocks on some I/O, how do the other threads get scheduled? Are the threads scheduled while the parent process is running?
The threads are scheduled independently. Any of the threads that are not blocked can run.
While a thread is running, what does the kernel variable current point to (the parent process's task_struct or the thread's task_struct)?
Each thread has its own task_struct within the kernel. What userspace calls a "thread" is called a "process" in kernel space. Thus current always points at the task_struct corresponding to the currently executing thread (in the userspace sense of the word).
If the process with [a second] thread is running, does the parent process get preempted when the thread starts, and how does each thread get scheduled?
Presumably you mean "the process's main thread" rather than "parent process" here. As before, the threads are scheduled independently. It's unspecified whether one runs before the other - and if you have multiple CPUs, both might run simultaneously.
If the process running on a CPU creates multiple threads, are the threads created by the parent process scheduled on another CPU on a multiprocessor system?
That's really up to the kernel, but the threads are certainly allowed to execute on other CPUs.
1. Depends. If a thread is preempted because the OS scheduler decides to give CPU time to some other thread, then other threads in the process will continue running. If the process is suspended (i.e. it gets the SIGSTOP signal) then AFAIK all the threads will be suspended.
2. When a suspended process is woken up, all the threads are marked as runnable (waiting for a CPU) or blocked (if they are waiting e.g. on a mutex). Then the scheduler at some point runs them. There is no guarantee about any specific order in which the threads are run after waking up the process.
3. The process will terminate, and with it the threads as well.
4. When you fork you get a new address space, so there is no "overlay". Note that fork() and the exec() family affect the entire process, not only the thread from which they were called. When you call fork() in a multi-threaded process, the child gets a copy of that process, but with only the calling thread. Then if you call exec() in one or both of the processes (presumably only in the child process, but that's up to you), then the process which calls exec() (and with it, all its threads) is replaced by the exec()'ed program.
5. The thread scheduling order is decided by the OS scheduler; there is no guarantee given about any particular order.
6. From the kernel perspective a process is an address space with one or more threads (and some other gunk). There is no concept of threads that somehow exist without a process.
7. There is no such thing as a process without a single thread. A "plain process" is just a process with a single thread.
8. Probably yes. This is determined by the OS scheduler. Note that there are APIs and tools (numactl) that one can use to force some thread(s) to run on a specific CPU core.
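As an illustration of the affinity APIs mentioned in the last point, Python exposes the Linux sched_getaffinity()/sched_setaffinity() calls in its os module. This is a sketch only: the function name pin_to_one_cpu is made up, and it assumes a Linux system where the caller is allowed to change its own affinity mask.

```python
import os

def pin_to_one_cpu():
    """Pin the caller to one allowed CPU, then restore the original mask."""
    allowed = os.sched_getaffinity(0)    # pid 0 means "the caller"
    target = min(allowed)
    os.sched_setaffinity(0, {target})    # on Linux this applies to the calling thread
    pinned = os.sched_getaffinity(0)
    os.sched_setaffinity(0, allowed)     # put the original mask back
    restored = os.sched_getaffinity(0)
    return allowed, pinned, restored

allowed, pinned, restored = pin_to_one_cpu()
print("allowed CPUs:", sorted(allowed))
print("while pinned:", sorted(pinned))
```

Without such a call, the scheduler is free to run each thread on any CPU in the allowed set and to migrate it between them.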
Assuming your questions are about POSIX threads, then
1a. On Linux the scheduler preempts individual threads rather than whole processes, so preempting one thread does not by itself stop the others.
1b. The O/S will suspend all the threads of a process that is sent a SIGSTOP.
2. The O/S will resume all threads of a suspended process that is sent a SIGCONT.
3. By default, a SIGINT will terminate all the threads in a process.
4. If a thread calls fork(), only the calling thread is duplicated in the child process. If the child then calls one of the exec() functions, its whole address space is replaced by the new program.
5. POSIX allows for user selection of the thread scheduling policy.
6. I don't understand the question.
7. I don't understand the question.
8. How threads are mapped to CPUs is implementation-dependent. Many implementations will try to distribute threads amongst the available CPUs to improve performance.
The Linux kernel doesn't distinguish between threads and processes. As far as the kernel is concerned, a thread is simply another process which happens to share its address space with other processes. (You would call the set of "processes" (i.e. threads) which share a single address space a "process".)
So POSIX threads are scheduled exactly as full-blown processes would be. There is no difference in scheduling whether you have one process with five threads, or five separate processes.
There are kernel calls (such as clone()) that provide fine-grained control over what is shared between processes. The POSIX threads API is a wrapper over them.

Process state when calling syscall?

What state is a process in when it calls a syscall?
I mean, don't assume it's an I/O syscall like read or write...
Is it the process itself that executes kernel code, or is the process suspended while something like a "kernel thread" executes the syscall handler (and knows which process called it (current))?
I'm not sure whether it changes from executing to ready, or from executing to blocked.
It's the process itself that switches to kernel mode and executes the system call - although it switches to a kernel stack to do so. A process executing inside the kernel has state Running, and can be pre-empted and end up in state Runnable.
It depends what the syscall does.
Suppose there is a hypothetical syscall which calculates pi to a lot of digits and places the result in a buffer the application specifies. Then the process will probably just be in the "R" running state. Switching to kernel mode doesn't stop it running in the context of the task that made the call.
Of course many system calls wait for things. Consider sleep(), for example, which releases the CPU rather than spinning. This puts the process to sleep, having registered a kernel timer to wake it up.
Quite a lot of syscalls never sleep, such as getpid(), which just retrieves information that is always in RAM. And many that can sleep don't necessarily do so, for example if you call read() on data already in a kernel buffer.
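Both cases can be observed on Linux through /proc. In this minimal sketch (made-up helper names task_state and observe_states, Linux /proc and Python 3.8+ assumed), a thread that reads its own state is by definition executing a read() syscall at that moment and reports "R", while a second thread blocked in a waiting call reports "S".

```python
import threading
import time

def task_state(tid):
    """Return the state letter of thread tid of this process (Linux /proc)."""
    with open(f"/proc/self/task/{tid}/stat") as f:
        # The state letter is the first field after the ")" closing the name.
        return f.read().rsplit(")", 1)[1].split()[0]

def observe_states():
    """Return (state of the calling thread, state of a thread blocked in a wait)."""
    tids = []
    done = threading.Event()

    def sleeper():
        tids.append(threading.get_native_id())
        done.wait()                  # blocks inside the kernel until done.set()

    t = threading.Thread(target=sleeper)
    t.start()
    time.sleep(0.2)                  # give the sleeper time to block
    mine = task_state(threading.get_native_id())   # we are inside read() right now
    other = task_state(tids[0])
    done.set()
    t.join()
    return mine, other

mine, other = observe_states()
print("calling thread:", mine, "| blocked thread:", other)
```

The calling thread is in kernel mode while the kernel fills in its stat file, yet its state is still running. Only the thread that is actually waiting for an event is marked as sleeping.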
