Does Linux schedule a process or a thread?

After reading this SO question I have a few doubts. Please help me understand.
Scheduling involves deciding when to run a process and for what quantum of time.
1. Does the Linux kernel schedule a thread or a process? Since processes and threads are not differentiated inside the kernel, how does the scheduler treat them?
2. How is the quantum for each thread decided?
a. If a quantum of time (say 100 us) is decided for a process, is it shared between all the threads of that process? or
b. Is a quantum decided for each thread individually by the scheduler?
Note: Questions 1 and 2 are related and may look the same, but I just wanted to be clear on how things work, so I posted them both here.

The Linux scheduler (on recent Linux kernels, e.g. 3.0 at least) schedules schedulable tasks, or simply tasks.
A task may be:
a single-threaded process (e.g. created by fork without any thread library)
any thread inside a multi-threaded process (including its main thread), in particular POSIX threads (pthreads)
a kernel task, started internally in the kernel and staying in kernel land (e.g. kworker, nfsiod, kjournald, kauditd, kswapd, etc.)
In other words, threads inside multi-threaded processes are scheduled like non-threaded (i.e. single-threaded) processes.
The low-level clone(2) syscall creates user-land schedulable tasks (and can be used both to create a fork-ed process and to implement thread libraries such as pthreads). Unless you are implementing a low-level thread library, you don't want to use clone directly.
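To make that concrete, here is a minimal, hedged sketch (assuming glibc's clone() wrapper) of creating a task directly. Passing only SIGCHLD gives fork-like semantics; a thread library would instead pass flags such as CLONE_VM, CLONE_SIGHAND and CLONE_THREAD so the new task shares the caller's address space and thread group.
```c
/* Sketch only: create one extra schedulable task with clone(2). */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static int child_fn(void *arg) {
    printf("child task: pid=%d\n", getpid());
    return 0;
}

int main(void) {
    const size_t stack_size = 1024 * 1024;
    char *stack = malloc(stack_size);           /* stack for the new task */
    if (!stack) { perror("malloc"); return 1; }

    /* SIGCHLD alone: fork-like child with its own address space.
       Thread libraries would add CLONE_VM | CLONE_FS | CLONE_FILES |
       CLONE_SIGHAND | CLONE_THREAD to share the parent's resources. */
    pid_t pid = clone(child_fn, stack + stack_size, SIGCHLD, NULL);
    if (pid == -1) { perror("clone"); return 1; }

    waitpid(pid, NULL, 0);                      /* reap the fork-like child */
    printf("parent task: pid=%d\n", getpid());
    free(stack);
    return 0;
}
```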
AFAIK, for multi-threaded processes, the kernel is (almost) not scheduling the process, but each individual thread inside it (including the main thread).
Actually, there is some notion of thread groups and affinity in the scheduling, but I don't know them well.
These days, processors generally have more than one core, and each core is running a task (at some given instant), so you do have several tasks running in parallel.
CPU quantum times are given to tasks, not to processes.

The NPTL implementation of the POSIX threads specification treats each thread as a separate process inside the kernel, with its own task_struct (and therefore its own kernel task id), so each thread is schedulable in itself, as mentioned. Each thread therefore gets its own timeslice and is scheduled just like the processes described above.
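As a quick, hedged illustration (a sketch assuming Linux with NPTL; gettid is invoked via syscall() so it also works with older glibc), every pthread below reports the same PID but its own kernel task id:
```c
/* Each pthread is a separate kernel task: same PID, distinct TID. */
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

static void *worker(void *arg) {
    printf("worker: pid=%d tid=%ld\n", getpid(), (long)syscall(SYS_gettid));
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("main:   pid=%d tid=%ld\n", getpid(), (long)syscall(SYS_gettid));
    return 0;
}
```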
Just to add: the current Linux scheduler is capable of scheduling not only single tasks (simple processes), but also groups of processes, or even all the processes belonging to a given user, as a whole. This allows implementing group scheduling, where CPU time is first divided between process groups and then distributed within those groups to single threads.
The Linux scheduler does not directly operate on processes or threads; it works with schedulable entities, represented by struct sched_entity.
It's fair to say that every process/thread is a sched_entity but the converse might not be true.
For the details of process scheduling, refer here.

Related

How does a process schedule its own threads

After the kernel schedules a process that has threads, how does that process schedule its own threads during its time slice?
For most modern kernels, the kernel only schedules threads, and processes are mostly just a container for the threads to execute inside (e.g. a container that contains a virtual address space, however many threads, and a few other scraps like file handles).
For some kernels (mostly very old Unix kernels that existed before threads were invented) the kernel schedules processes and then a user-space library emulates threads. For this to work properly, all of the "blocking" system calls (e.g. write()) have to be replaced by asynchronous system calls (e.g. aio_write()) so that other threads in the process can be given CPU time; however, I wouldn't want to assume it works properly (e.g. if any thread blocks, then maybe all threads in the process block).
Also, it may not work when there are multiple CPUs (the kernel gives a process one CPU, but from the kernel's perspective that process is then running and can't use a second CPU). There are sophisticated work-arounds for this (to support "M:N threading"), but it's just easier and better to fix the scheduler so it works with threads. Fortunately/unfortunately this didn't matter much in the early days because very few computers had more than one CPU anyway.
Lastly, it doesn't work for thread priorities: one process might keep a CPU busy executing an unimportant/low-priority thread while another process doesn't get that CPU time when it desperately needs it for an important/high-priority thread. This occurs because no process knows about threads belonging to other processes, and the kernel only knows about processes and not threads.
Of course these are also the reasons why every kernel adopted "kernel schedules threads and not processes" (and those that didn't died).
It's down to jargon definitions, but threads are simply a bunch of processes sharing an address space. Older Unixes even called them Light Weight Processes.
With that classical understanding of threads, the answer is that, these days, it's the OS that does the scheduling and each thread gets its own timeslices.
Extras
Some OSes do things to "the whole process": e.g. Windows will give the process that has mouse focus a priority boost (all its threads get dynamically notched up a few priority places) to make that application appear more sprightly (this goes back to Windows 3).
Other operating systems will increase the priority of a thread dynamically to solve priority-inversion situations. Priority inversion is where a low-priority thread that has control of a resource (I/O, or perhaps a semaphore) is blocking a higher-priority thread from running because the resource is not available. It is solved by the OS boosting the priority of the blocking thread until it gives up the required resource.
Either the kernel schedules the threads, or the kernel schedules processes and each process simulates threads by scheduling its own threads.
Usually, the process schedules its own threads using a library that sets timers. The timer handler saves the current "thread's" registers and then loads a new set of registers from another "thread."
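The register-swapping idea can be sketched with the (now obsolescent, but still widely available) ucontext API. This is a hedged, cooperative toy, not a real green-thread library, which would drive the switch from a timer signal instead of an explicit call:
```c
/* Cooperative user-space context switching with ucontext (toy sketch). */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];

static void worker(void) {
    printf("worker: running on its own stack\n");
    swapcontext(&worker_ctx, &main_ctx);    /* save worker's registers, resume main */
    printf("worker: resumed, finishing\n");
}

int main(void) {
    getcontext(&worker_ctx);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof worker_stack;
    worker_ctx.uc_link = &main_ctx;         /* where to go when worker() returns */
    makecontext(&worker_ctx, worker, 0);

    printf("main: switching to worker\n");
    swapcontext(&main_ctx, &worker_ctx);    /* save main's registers, load worker's */
    printf("main: back, resuming worker\n");
    swapcontext(&main_ctx, &worker_ctx);
    printf("main: done\n");
    return 0;
}
```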

Can kernel schedule user level threads of same process on different cores?

As far as I know, the kernel doesn't know whether it is executing a user thread or a user process, because to the kernel user threads are user processes; it only schedules user processes and doesn't care which thread was running in that process.
I have one more question: is there a per-core ready queue, or a single ready queue shared by all the cores?
I was reading this paper and it is written that
In the stock Linux kernel the set of runnable threads is partitioned
into mostly-private per core scheduling queues; in the common case,
each core only reads, writes, and locks its own queue.
The Linux kernel scheduler uses the "task" as its primary schedulable entity. This corresponds to a user-space thread. For a traditional simple Unix-style program there is only a single thread in the process, so the distinction can be ignored. Other programs, of course, may have multiple threads. But in all cases, the kernel only schedules tasks (i.e. threads).
Your terminology above therefore doesn't really match the situation. The kernel doesn't really care whether the different threads it schedules are part of the same process or different processes: each thread can be scheduled independently. You can have multiple threads from the same process running on different processors/cores at the same time.
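A hedged sketch of that point (assuming a Linux machine with at least two online CPUs): two threads of one process pin themselves to different cores and each reports where it is running, showing the kernel scheduling them independently.
```c
/* Two threads of the same process running on different cores. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

static void *report(void *arg) {
    long cpu = (long)arg;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);                          /* pin this thread to one core */
    pthread_setaffinity_np(pthread_self(), sizeof set, &set);
    printf("thread %ld now runs on CPU %d\n", cpu, sched_getcpu());
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, report, (void *)0L);
    pthread_create(&b, NULL, report, (void *)1L);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
```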
Yes, there are separate run queues for each core.
The paper you reference is, I think, slightly misleading in its phrasing. In particular, saying that the "set of runnable threads is partitioned into..." doesn't give quite the right meaning; that makes it sound like the threads are divided into multiple groups that are then assigned to different cores and can only be executed there. It would be more accurate to say that there is a separate run queue for each core containing a set of threads waiting to execute, and in common use, the scheduler doesn't need to reference the queues for other cores.
But in fact, threads can migrate from one core to another. For example, if there is a thread waiting to run on core A (hence in core A's run queue), but core A is already busy running some other thread, and there is another core that is not busy, the waiting thread may be migrated to that other core and executed there. (This is an oversimplification of course as there are other factors that go into deciding whether/when to migrate a thread.)

Scheduler for Linux kernel threads

Linux includes a few privileged processes called kernel threads. Is there any scheduler that runs/suspends them? If yes, is this scheduler the same as the system scheduler (I mean the one that schedules all the processes of the whole system)?
The Linux scheduler is scheduling tasks. These can be
kernel threads (e.g. kswapd), or
single-threaded processes (e.g. bash), or
individual threads of a multi-threaded process (e.g. some browsers or servers)
The many threads of a multi-threaded process are tasks sharing a common address space (and other things, e.g. file descriptors).
AFAIK, the scheduler does not treat kernel threads separately from other tasks. But the scheduler does take into account scheduling policies (sched_setscheduler(2)) and priorities (setpriority(2)); for most kernel threads, the priority is often very high. See sched(7).
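For illustration, a hedged sketch of querying and changing a task's policy with sched_setscheduler(2) (switching to a real-time policy like SCHED_FIFO normally requires root or CAP_SYS_NICE, so the call may simply be refused):
```c
/* Query the caller's policy, then try to switch it to SCHED_FIFO. */
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    /* pid 0 means "the calling thread". SCHED_OTHER is policy 0. */
    printf("current policy: %d (0 == SCHED_OTHER)\n", sched_getscheduler(0));

    struct sched_param sp = { .sched_priority = 10 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp) == -1)
        printf("SCHED_FIFO refused: %s\n", strerror(errno));
    else
        printf("now running under SCHED_FIFO, priority %d\n", sp.sched_priority);
    return 0;
}
```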
Yes! Let me clarify the system scheduler part here.
Every task is associated with a task_struct, which contains the details of the task, e.g. its pid, its name, when it recently started, its priority, etc.: http://lxr.free-electrons.com/source/include/linux/sched.h#L1224
Typically, depending on the priority of the task, either the fair scheduler or the real-time scheduler kicks in, and these coexist. To keep it simple and not go into details, these are different scheduling algorithms that cater to different types of tasks.
Now, kernel threads also have an associated task_struct, and as @Basile Starynkevitch pointed out, there are a couple of APIs: we can use the sched_setparam family of APIs to modify the scheduling parameters and change the scheduler a task belongs to, depending on what it is about to do.

Why is process scheduling not called thread scheduling?

I found out that Linux and Windows both schedule threads and not processes.
Source
So I don't understand why we still call it "process scheduling". Shouldn't we be calling it thread scheduling? The idea of shared memory for threads of the same process just seems to be a technicality that has to be taken care of while actually running the threads (we could treat 2 threads of the same process as 2 single-threaded processes sharing memory).
Are there any operating systems that schedule processes and when it is time for a process to run, specially decide how to run its threads?
OS-scheduled threads are a relatively new feature. It was not that long ago that a separate path of execution on Unix meant creating an entirely new process. So there is historical resistance.
Some systems (Unix variants, VMS) schedule processes, not threads. Process scheduling is likely to remain the way to go in real time operating systems.
In process scheduling, resources are allocated to each process separately, i.e. if you create 2 processes then each process gets its own resources (file buffers, I/O files, CPU control, etc.). Time is lost during scheduling: when the first process runs, resources are allocated to it; when the second process runs, resources are allocated to that one. So resources are allocated separately to each process, and context-switching time increases.
A thread is basically a small unit of a process, so one process can have many threads. Resources are shared between the different threads of a process, since they are all part of the same process, so multitasking is available and context-switching time is smaller.
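A small, hedged illustration of that sharing difference (a toy C program, not tied to any particular scheduler): a pthread's write to a global is visible to the main thread, while a fork()ed child only changes its own copy of the address space.
```c
/* Threads share memory; fork()ed processes get private copies. */
#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;

static void *bump(void *arg) { counter = 1; return NULL; }

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, bump, NULL);
    pthread_join(t, NULL);
    printf("after thread: counter = %d\n", counter);   /* 1: address space is shared */

    pid_t pid = fork();
    if (pid == 0) { counter = 42; _exit(0); }           /* child's private copy only */
    waitpid(pid, NULL, 0);
    printf("after fork:   counter = %d\n", counter);    /* still 1 in the parent */
    return 0;
}
```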

What is the difference between lightweight process and thread?

I found an answer to the question here. But I don't understand some of the ideas in the answer. For instance, a lightweight process is said to share its logical address space with other processes. What does that mean? I can understand the same situation with 2 threads: both of them share one address space, so both of them can read any variables from the bss segment (for example). But we've got a lot of different processes with different bss sections, and I don't know how to share all of them.
I am not sure that answers are correct here, so let me post my version.
There is a difference between process - LWP (lightweight process) and user thread. I will leave process definition aside since that's more or less known and focus on LWP vs user threads.
LWP is essentially what is called a thread today. Originally, a user thread meant a thread that is managed by the application itself, which the kernel does not know anything about.
LWP, on the other hand, is a unit of scheduling and execution by the kernel.
Example:
Let's assume that the system has 3 other processes running and scheduling is round-robin without priorities. And you have 1 processor/core.
Option 1. You have 2 user threads using one LWP. That means that from the OS perspective you have ONE scheduling unit. In total there are 4 LWPs running (3 others + 1 yours). Your LWP gets 1/4 of the total CPU time, and since you have 2 user threads, each of them gets 1/8 of the total CPU time (depending on your implementation).
Option 2. You have 2 LWPs. From the OS perspective, you have TWO scheduling units. In total there are 5 LWPs running. Your LWPs get 1/5 of the total CPU time EACH, so your application gets 2/5 of the CPU.
Another rough difference: an LWP has a pid (process id), user threads do not.
For some reason, the naming got a little messed up and we refer to LWPs as threads.
There are definitely more differences, but please, refer to slides.
http://www.cosc.brocku.ca/Offerings/4P13/slides/threads.ppt
EDIT:
After posting, I found a good article that explains everything in more details and is in better English than I write.
http://www.thegeekstuff.com/2013/11/linux-process-and-threads/
From MSDN, Threads and Processes:
Processes exist in the operating system and correspond to what users
see as programs or applications. A thread, on the other hand, exists
within a process. For this reason, threads are sometimes referred to
as light-weight processes. Each process consists of one or more
threads.
Based on Tanenbaum's book "Distributed Systems", a lightweight process generally refers to a hybrid form of user-level thread and kernel-level thread. An LWP runs in the context of a single process, and there can be several LWPs per process. In addition, each LWP can be running its own (user-level) thread. Multi-threaded applications are constructed by creating threads (with a thread library package) and subsequently assigning each thread to an LWP.
The biggest advantage of this hybrid approach is that creating, destroying, and synchronizing threads is relatively cheap and does not need any kernel intervention. Besides that, provided that a process has enough LWPs, a blocking system call will not suspend the entire process.
Threads run within processes.
Each process may contain one or more threads.
If the kernel doesn't know anything about the threads running in a process, we have threads running in user space and thus no multiprocessing capabilities are available.
On the other hand, we can have threads running in kernel space; this means that each thread can run on a different CPU. This gives us multiprocessing, but as you may assume it is more expensive in terms of operating system resources.
Finally, there is a solution that lies somewhere in the middle: we group threads together into LWPs. Each group runs on a different CPU, but the threads within a group cannot be multiprocessed. That's because the kernel in this model knows only about the groups (which are multiprocessed) but nothing about the threads they contain.
Hope it is clear enough.
IMO, an LWP is a kernel thread binding which can be created and executed in the user context.
If I'm not mistaken, you can attach user threads to a single LWP to potentially increase the level of concurrency without involving a system call.
A thread is basically a task assigned one goal, along with enough information to perform that specific task.
A process can create multiple threads to do its work as fast as possible.
E.g. one portion of the program may need input/output, another portion may need permissions.
User-level threads are those that can be handled by a thread library.
On the other hand, kernel-level threads (which need to deal with hardware) are also called LWPs (lightweight processes); they maximize the use of the system, so that the system does not halt on just one system call.
From here.
Each LWP is a kernel resource in a kernel pool, and is attached to and detached from a thread on a per-thread basis. This happens as threads are scheduled or created and destroyed.
A process contains one or more threads, and a thread can do anything a process can do. Threads within a process share the same address space, so the cost of communication between threads is low: they use the same code section, data section, and OS resources. All of these features make a thread a "lightweight process".
