Do kernel threads get scheduled by the scheduler? - linux

How do kernel threads get executed on the CPU?
Do these kernel threads get scheduled by the scheduler, like normal user space processes?
Or do they get woken up when some events happen?
root 2 0 0 Nov30 ? 00:00:00 [kthreadd]
root 3 2 0 Nov30 ? 00:00:03 [ksoftirqd/0]

The answer to both questions is yes: kernel threads get scheduled just like user threads, and they normally block pending certain events (different events per kernel thread).

The answer is yes.
The only major difference between kernel threads and user space processes is that task->mm == NULL for kernel threads.
Hence they don't have a distinct address space. The rest is pretty much the same for kernel threads and user space processes.
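To illustrate the task->mm point, here is a minimal kernel-style sketch (the helper name task_is_kernel_thread is made up for this example; current kernels also mark such tasks with the PF_KTHREAD flag):

#include <linux/sched.h>

/*
 * A kernel thread never has a user address space, so its mm pointer
 * stays NULL; a live user space task always has one.
 */
static inline bool task_is_kernel_thread(const struct task_struct *task)
{
        return task->mm == NULL;
}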

Related

SCHED_FIFO higher-priority thread is getting preempted by a SCHED_FIFO lower-priority thread?

I am testing my multithreaded application on a Linux RT multicore machine.
However, during testing we are observing that threads created with the SCHED_FIFO scheduling policy are not being scheduled according to the SCHED_FIFO policy.
We can see in multiple places that a higher-priority thread's execution is getting preempted by a lower-priority thread.
Based on some research we did on the internet, we found that the following kernel parameters need to be changed from
/proc/sys/kernel/sched_rt_period_us containing 1000000
/proc/sys/kernel/sched_rt_runtime_us containing 950000
to
/proc/sys/kernel/sched_rt_period_us containing 1000000
/proc/sys/kernel/sched_rt_runtime_us containing 1000000
or
/proc/sys/kernel/sched_rt_period_us containing -1
/proc/sys/kernel/sched_rt_runtime_us containing -1
We tried both, but we are still facing the problem sometimes. We are facing the issue even when the higher-priority thread is not suspended by any system call.
It would be great if you could let us know whether you are aware of such problems in Linux RT scheduling and/or have any solutions to make Linux RT scheduling deterministic based on priority.
There are no printfs or any system calls in the higher-priority thread, but it is still getting preempted by the lower-priority thread.
Also, I have made sure all the threads in the process are running on a single core using the taskset command.
There could be two reasons:
CPU throttling: the scheduler is designed to reserve some CPU time for non-RT tasks; you have already disabled it by acting on the /proc/sys/kernel/ entries
blocking: your high-priority task is blocking either
on some synchronization mechanism (e.g., a mutex or semaphore), or
on some blocking call (e.g., malloc, printf, read, write, etc.)
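A third thing worth ruling out is how the threads are created. Below is a minimal sketch (the priority value 80 is just an example): a pthread attribute's scheduling fields are silently ignored unless PTHREAD_EXPLICIT_SCHED is set, which is a classic cause of exactly this symptom.

#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static void *worker(void *arg)
{
    /* real-time work here */
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    struct sched_param sp = { .sched_priority = 80 }; /* example value */
    pthread_t tid;

    pthread_attr_init(&attr);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &sp);
    /* Without this, the attr's policy/priority are ignored and the
     * thread inherits the creator's scheduling - a common pitfall. */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);

    if (pthread_create(&tid, &attr, worker, NULL) != 0)
        perror("pthread_create");   /* EPERM without the right privileges */
    else
        pthread_join(tid, NULL);
    return 0;
}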

How to bind certain kernel threads to a given core?

I have a number of kernel threads that I want to get off of a given core for performance reasons. Some of these I am able to move using taskset; however, there are others I cannot.
In particular, I see processes like migration, watchdog, rcuc, etc. that do not respond to my attempts to rebind them.
For example, if I try to rebind the watchdog process, I get the following:
# taskset -pc 0 207
pid 207's current affinity list: 0
sched_setaffinity: Invalid argument
failed to set pid 207's affinity.
How can I get these off of the cores so I can properly isolate them for performance reasons?
I suspect these processes are interfering with my full dynticks mode.
Several kernel threads are tied to a specific core in order to provide capabilities needed by the SMP infrastructure, such as synchronization, interrupt handling and so on. The kworker, migration and ksoftirqd threads, for example, usually have one instance per virtual processor (e.g. 8 threads on a 4-core, 8-thread CPU).
You cannot (and should not be able to) move those threads - without them that processor would not be fully usable by the system any more.
Why exactly do you want to move those threads anyway?
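For reference, the "Invalid argument" above comes from the affinity call itself. Here is a minimal sketch of what taskset does under the hood; per-CPU kernel threads are flagged PF_NO_SETAFFINITY in current kernels, so the kernel rejects the call with EINVAL:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    cpu_set_t set;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }

    CPU_ZERO(&set);
    CPU_SET(0, &set);                 /* like: taskset -pc 0 <pid> */

    if (sched_setaffinity(atoi(argv[1]), sizeof(set), &set) == -1) {
        perror("sched_setaffinity");  /* EINVAL for pinned kthreads */
        return 1;
    }
    return 0;
}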

User threads vs. kernel threads

Could someone help clarify my understanding of kernel threads? I heard that, on Linux/Unix, kernel threads (such as those of system calls) get executed faster than user threads. But aren't those user threads scheduled by the kernel and executed using kernel threads? Could someone please tell me what the difference is between a kernel thread and a user thread, other than the fact that they have access to different address spaces? What are the other differences between them? Is it true that on a single-processor box, when a user thread is running, the kernel will be suspended?
Thanks in advance,
Alex
I heard that, on Linux/Unix, kernel threads(such as those of system calls) get executed faster than user threads.
This is a largely inaccurate statement.
Kernel threads are used for "background" tasks internal to the kernel, such as handling interrupts and flushing data to disk. The bulk of system calls are processed by the kernel within the context of the process that called them.
Kernel threads are scheduled more or less the same way as user processes. Some kernel threads have higher than default priority (up to realtime priority in some cases), but saying that they are "executed faster" is misleading.
Is it true that on a single processor box, when user thread is running, kernel will be suspended?
Of course. Only one process can be running at a time on a single CPU core.
That being said, there are a number of situations where the kernel can interrupt a running task and switch to another one (which may be a kernel thread):
When the timer interrupt fires. With the traditional HZ=100 setting, this occurs 100 times every second (modern kernels often use 250 or 1000).
When the task makes a blocking system call (such as select() or read()).
When a CPU exception occurs in the task (e.g., a memory access fault).
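A small way to observe the first two cases (a sketch using Linux-specific rusage fields): blocking calls show up as voluntary context switches, timer-driven preemption as involuntary ones.

#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

int main(void)
{
    struct rusage ru;

    sleep(1);   /* a blocking call: the kernel switches away voluntarily */

    getrusage(RUSAGE_SELF, &ru);
    printf("voluntary: %ld involuntary: %ld\n",
           ru.ru_nvcsw, ru.ru_nivcsw);
    return 0;
}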

Concurrency of POSIX threads on a multiprocessor machine

I have some doubts regarding concurrency of POSIX threads on a multiprocessor machine. I have found similar questions on SO regarding it but didn't find a conclusive answer.
Below is my understanding. I want to know if I am correct.
1) POSIX threads are user-level threads and the kernel is not aware of them.
2) The kernel scheduler will treat the process (with all its threads) as one entity for scheduling. It is the thread library that in turn chooses which thread to run. It can slice the CPU time given by the kernel among the runnable threads.
3) User threads can run on different CPU cores, i.e., let threads T1 & T2 be created by a process T; then T1 can run on CPU1 and T2 on CPU2, BUT they can't run concurrently.
Please let me know if my understanding is correct.
Thanks...
Since you marked your question with the "Linux" tag, I'm going to answer it according to the standard pthreads implementation under Linux. If you are talking about "green" threads, which are scheduled at the VM/language level instead of by the OS, then your answers are mostly correct. But my comments below are about Linux pthreads.
1) POSIX threads are user-level threads and the kernel is not aware of them.
No, this is certainly not correct. The Linux kernel and the pthreads libraries work together to administer the threads. The kernel does the context switching, scheduling, memory management, cache management, etc. There is other administration done at the user level, of course, but without the kernel much of the power of pthreads would be lost.
2) The kernel scheduler will treat the process (with all its threads) as one entity for scheduling. It is the thread library that in turn chooses which thread to run. It can slice the CPU time given by the kernel among the runnable threads.
No, the kernel treats each process thread as one entity. It has its own rules about time slicing that take processes (and process priorities) into consideration, but each sub-process thread is a schedulable entity.
3) User threads can run on different CPU cores, i.e., let threads T1 & T2 be created by a process T; then T1 can run on CPU1 and T2 on CPU2, BUT they can't run concurrently.
No. Concurrent execution is expected for multi-threaded programs. That's why synchronization and mutexes are so important and why programmers put up with the complexity of multithreaded programming.
One way to prove this to yourself is to look at the output of ps with the -L option to show the associated threads. ps usually wraps a multi-threaded process into one line, but with -L you can see that the kernel has a separate virtual process-id for each thread:
ps -ef | grep 20587
foo 20587 1 1 Apr09 ? 00:16:39 java -server -Xmx1536m ...
versus
ps -eLf | grep 20587
foo 20587 1 20587 0 641 Apr09 ? 00:00:00 java -server -Xmx1536m ...
foo 20587 1 20588 0 641 Apr09 ? 00:00:30 java -server -Xmx1536m ...
foo 20587 1 20589 0 641 Apr09 ? 00:00:03 java -server -Xmx1536m ...
...
I'm not sure if Linux threads still do this, but historically pthreads used the clone(2) system call to create another thread copy of itself:
Unlike fork(2), these calls allow the child process to share parts of its execution context with the calling process, such as the memory space, the table of file descriptors, and the table of signal handlers.
This is different from fork(2) which is used when another full process is created.
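As a rough sketch of that clone(2) usage (simplified: real NPTL passes more flags, including CLONE_THREAD, which is omitted here so that a plain waitpid() still works):

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

static int thread_fn(void *arg)
{
    printf("child running, getpid() = %d\n", (int)getpid());
    return 0;
}

int main(void)
{
    char *stack = malloc(STACK_SIZE);
    if (!stack)
        return 1;

    /* Sharing the address space, fs info, fd table and signal handlers
     * is what makes the child a "thread" rather than a full process. */
    int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | SIGCHLD;
    pid_t pid = clone(thread_fn, stack + STACK_SIZE, flags, NULL);
    if (pid == -1) {
        perror("clone");
        return 1;
    }

    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}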
POSIX does not specify how the threads created with pthread_create are scheduled onto processor cores. This is up to the implementation.
However, I would expect the following in a quality implementation, and this is the case for current versions of Linux:
Threads are full kernel threads, and scheduled by the kernel
Threads from the same process can run concurrently on separate processors
i.e. all 3 of your numbered statements are wrong with current implementations of Linux, but could in theory be true for another implementation that also conformed to POSIX.
Most POSIX implementations use OS support to provide thread functionality - they wrap the system calls needed to manage threads. As such, the behaviour of the threads regarding scheduling, etc., depends upon the underlying OS.
So, on most modern OS:
A process with POSIX threads is no different to the kernel than any other process with multiple threads.
The kernel scheduler dispatches threads, not processes. A 'process' is often regarded as a higher-level construct that has code, memory-management, quotas, auditing, and security permissions, but not execution. A process can do nothing unless a thread runs its code, which is why, when a process is created, a 'main thread' is created at the same time, else nothing would run. The OS scheduling algorithm may use the process that the thread runs as one of the parameters to decide on which set of ready threads to run next - it's cheaper to swap one thread out with one from the same process - but it does not have to.
Slicing the CPU time given by the kernel among the runnable threads is a side-effect of the OS timer interrupt when there are more ready threads than there are cores to run them. Any machine that regularly has to resort to (ugh! I hate the term) 'time slicing' should be regarded as overloaded and should have more CPU or less work. Threads should ideally only become ready when signaled by another thread or an IO driver, not because the OS timer interrupt has decided to run one in place of another thread that could still be doing useful work.
They can indeed run concurrently. If two threads are ready and there are two cores, the threads are dispatched onto the two cores. It does not matter whether they are from the same process or not.
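A quick sketch to see this for yourself (sched_getcpu() is Linux-specific); on a multicore box the two threads will often report different CPUs:

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static void *report(void *arg)
{
    /* Each thread prints the CPU it is currently running on. */
    printf("thread %ld on cpu %d\n", (long)arg, sched_getcpu());
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    pthread_create(&t1, NULL, report, (void *)1L);
    pthread_create(&t2, NULL, report, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}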

Linux kernel: What process does schedule() run in?

When you call a system call such as fork in process X, the kernel is said to be executing in process context. So fork can be said to be running in process X, right?
But if schedule() is called (and it isn't a syscall) in the same process, would you say that it is running as part of X? Or does it run in the swapper process? Or does that sound absurd, taking into account the monolithic nature of the kernel?
schedule() always runs in process context. The special thing about it is that it can change which process context is current - but it does always have a process context. Prior to the call to context_switch() it runs in the context of the process to be swapped out, and afterwards it runs in the context of the process swapped in.
The Linux kernel does not have a dedicated "swapper" task (there is an idle task, which is always runnable in case nothing else is eligible to run).
It really depends upon where the schedule() call is made from; schedule() can be called both from process context and from a work queue. The work queues are serviced by kernel-scheduled threads:
# ps auxw | grep worker
root 1378 0.0 0.0 0 0 ? S 20:45 0:00 [kworker/1:0]
root 1382 0.0 0.0 0 0 ? S 20:45 0:00 [kworker/2:0]
root 1384 0.0 0.0 0 0 ? S 20:45 0:00 [kworker/3:1]
...
The bracketed names signify that these processes do not execute in userspace.
The worker_thread() function calls schedule() after handling a work item but before starting all over again.
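A simplified sketch of that pattern (not the actual kernel source; handle_one_work_item() and more_work_pending() are hypothetical placeholders):

#include <linux/kthread.h>
#include <linux/sched.h>

static int worker_loop(void *unused)
{
        while (!kthread_should_stop()) {
                handle_one_work_item();          /* hypothetical helper */

                /* Race-free sleep: mark ourselves sleeping first, then
                 * re-check for work before actually calling schedule(). */
                set_current_state(TASK_INTERRUPTIBLE);
                if (!more_work_pending())        /* hypothetical check */
                        schedule();              /* give up the CPU */
                __set_current_state(TASK_RUNNING);
        }
        return 0;
}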
schedule() can also be called on behalf of a process, either by a driver or by signal handling code, or filesystem internals, or myriad other options.
The scheduler takes care of all processes, so it does not run inside one process.
Of course, when e.g. a process is scheduled out because of a clock interrupt, some process was running (and later, another one is scheduled).
You cannot view all of the kernel as running on behalf of processes (only system calls are).
Q: So, fork can be said to be running in process X, right?
A: Yes, absolutely. The system call by which a process REQUESTS to "fork" occurs in user space. The act of making the system call TRANSITIONS from user space to kernel space. The IMPLEMENTATION of the system call may involve many separate steps. Some may occur in user space; other steps occur in kernel space.
Q: ...taking into account the monolithic nature of the kernel?
A: The issue of "user space" vs "kernel space" has absolutely NOTHING to do with whether the kernel happens to be "monolithic", a "microkernel" or something else entirely.
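To make the fork example concrete, a tiny sketch: the fork() wrapper runs in user space, the call traps into kernel space, and the kernel's implementation returns twice, once in each process.

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();   /* user space -> kernel space -> back, twice */

    if (pid == 0)
        printf("child: pid %d\n", (int)getpid());
    else if (pid > 0) {
        printf("parent: pid %d forked child %d\n", (int)getpid(), (int)pid);
        wait(NULL);
    } else
        perror("fork");
    return 0;
}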
