Process Priority vs Thread Priority - linux

In Linux, a process is a set of threads. Each thread has its own priority! But does a process have a priority too? If so, how is it different from the thread priority? And when a new process is created, how are these values propagated?

Linux implements (kernel level) Threads essentially as Processes. So you fall back to the good old process-priorities there.
See NPTL and nice (for understanding that processes are the first ones to have priorities). Mostly defaults are applied - in case of threads, the thread is a copy, so its priorities should be copied too. Will certainly vary with varying schedulers.

Related

How is a process and a thread the same thing in Linux?

I have read that a process and a thread are the same thing in Linux, for example in this question it says:
There is absolutely no difference between a thread and a process on
Linux.
But I don't understand how can a process and a thread means the same thing. I mean a thread is what gets executed by the CPU, and a process is simply an "enclosure" for the threads which allows the threads to have shared memory. This image shows the relationship between a process and its threads:
So clearly a process and a thread does not mean the same thing!
Linux didn't use to have special support for (POSIX) threads, and it simply treated them as processes that shared their address space as well as a few other resources (filedescriptors, signal actions, ...) with other "processes".
That implementation, while elegant, made certain things required for threads by POSIX difficult, so Linux did end up gaining that special support for threads and your premise is now no longer true.
Nevertheless, processes and threads still both remain represented as tasks within the kernel (but now the kernel has support for grouping those tasks into thread groups as well and APIs for working with those ((tgkill, tkill, exit_group, ...)).
You can google LinuxThreads and NPTL threads to learn more about the topic.

How does a process schedule its own threads

After the Kernel schedules a process that has threads, How does said process schedule its own threads during its time splice?
For most modern kernels, the kernel only schedules threads, and processes are mostly just a container for the threads to execute inside (e.g. a container that contains a virtual address space, however many threads, and a few other scraps like file handles).
For some kernels (mostly very old unix kernels that existed before threads were invented) the kernel schedules processes and then a user-space library emulates threads. For this to work properly all of the "blocking" system calls (e.g. write()) have to be replaced by asynchronous system calls (e.g. aio_write()) so that other threads in the process can be given CPU time; however I wouldn't want to assume it works properly (e.g. if any thread blocks, then maybe all threads in the process block).
Also it may not work when there's multiple CPUs (kernel gives a process one CPU, but then from the kernel's perspective that process is running and can't use a second CPU). There are sophisticated work-arounds for this (to support "M:N threading") but it's just easier and better to fix the scheduler so it works with threads. Fortunately/unfortunately this didn't matter much in the early days because very few computers had more than one CPU anyway.
Lastly; it doesn't work for thread priorities - e.g. one process might keep CPU busy executing an unimportant/low priority thread while another process doesn't get that CPU time when it desperately needs it for an important/high priority thread. This occurs because no process knows about threads belonging to other processes and the kernel only knows about processes and not threads.
Of course these are also the reasons why every kernel adopted "kernel schedules threads and not processes" (and those that didn't died).
It's down to jargon definitions, but threads are simply a bunch of processes sharing an address space. Older Unixes even called them Light Weight Processes.
With that classical understanding of threads, the answer is that, these days, it's the OS that does the scheduling and each thread gets its own timeslices.
Extras
Some OSes do things to "the whole process" - e.g. Windows will give the process that has mouse focus a priority boost (all it's threads get dynamically notched up a few priority places), to make that application appear to be more sprightly (this goes back to Windows 3).
Other operating systems will increase the priority of a thread dynamically, to solve priority inversion situations. This is where a low priority thread that has control of a resource (I/O, or perhaps a semaphore) is blocking a higher priority thread from running (because the resource is not available. This is the priority inversion, and it's solved by the OS boosting the priority of the blocking thread until it gives up the required resource.
Either the kernel schedules the threads or the kernel schedules processes simulates thread by scheduling it own threads.
Usually, the process schedules its own threads using a library that sets timers. When the timer handler saves the current "thread's" registers then loads a new set of registers from another "thread."

Mapping of user level and kernel level thread

While going through OPERATING SYSTEM PRINCIPLES, 7TH ED
(By Abraham Silberschatz, Peter Baer Galvin, Greg Gagne), i encountered a
statement in Thread Scheduling Section.It is given as -:
To run on a CPU, user-level threads must ultimately be mapped
to an associated kernel-level thread, although this mapping may
be indirect and may use a lightweight process (LWP).
The first half of the statement i.e
To run on a CPU, user-level threads must ultimately be mapped to an associated kernel-level
is trying to say that When a user level thread is executed ,it will need support from kernel thread like system calls.
But i am completely stuck in other half i.e
although this mapping may
be indirect and may use a lightweight process (LWP)
What does it really mean ???
Please help me out !
You're reading a book that is notoriously crapola. Threads are implemented in two ways.
In the olde days (and still persists on some operating systems) there were just processes. A process consisted of an execution stream and an address space.
When languages that needed thread support (e.g., Ada—"tasks") there was a need to create libraries to implement threads. The libraries used timers to switch among the various threads within the process. This is poor man's threading. The major drawback here is that, even when you have multiple processors, all the threads of a process run on the same processor. The threads are just interleaved execution within a single process that executes on one processors.
These are sometimes called "user level threads." Some books call this the "many-to-one model."
To say
To run on a CPU, user-level threads must ultimately be mapped to an associated kernel-level thread
is highly misleading. There [usually] ARE no kernel threads in this model; just processes. Multiple threads run interleaved in a process. To call this a mapping "to an associated kernel-level thread" is misleading and overly theoretical.
This is mumbo jumbo.
although this mapping may be indirect and may use a lightweight process (LWP)
The next stage in operating system evolution here was for the operating system to support threads directly. Instead of a process being an execution stream + address-space, a process became one-or-more-threads + address-space. Instead of scheduling processes for execution, the OS schedules threads for execution.
Those are kernel threads.
Your book is making the simple complex.
These days the term Light Weight Processes and threads are used interchangeably.
although this mapping may be indirect and may use a lightweight
process (LWP)
I know the above statement is confusing(Notice the 2 mays). I can think only 1 thing which the above statement signifies is that:
Earlier when linux supported only user-level threads, the kernel was unaware of the fact that there are multiple user-level threads, and the way it handled these multiple threads was by associating all of them to a light weight process(which kernel sees as a single scheduling and execution unit) at kernel level.
So associating a kernel-level thread with each user-level thread is kind of direct mapping and associating a single light weight process with each user-level thread is indirect mapping.

Does linux schedule a process or a thread?

After reading this SO question I got a few doubts. Please help in understanding.
Scheduling involves deciding when to run a process and for what quantum of time.
Does linux kernel schedule a thread or a process? As process and thread are not differentiated inside kernel how a scheduler treats them?
How quantum for each thread is decided?
a. If a quantum of time (say 100us) is decided for a process is that getting shared between all the threads of the process? or
b. A quantum for each thread is decided by the scheduler?
Note: Questions 1 and 2 are related and may look the same but just wanted to be clear on how things are working posted them both here.
The Linux scheduler (on recent Linux kernels, e.g. 3.0 at least) is scheduling schedulable tasks or simply tasks.
A task may be :
a single-threaded process (e.g. created by fork without any thread library)
any thread inside a multi-threaded process (including its main thread), in particular Posix threads (pthreads)
kernel tasks, which are started internally in the kernel and stay in kernel land (e.g. kworker, nfsiod, kjournald , kauditd, kswapd etc etc...)
In other words, threads inside multi-threaded processes are scheduled like non-threaded -i.e. single threaded- processes.
The low-level clone(2) syscall creates user-land schedulable tasks (and can be used both for creating fork-ed process or for implementation of thread libraries, like pthread). Unless you are a low-level thread library implementor, you don't want to use clone directly.
AFAIK, for multi-threaded processes, the kernel is (almost) not scheduling the process, but each individual thread inside (including the main thread).
Actually, there is some notion of thread groups and affinity in the scheduling, but I don't know them well
These days, processors have generally more than one core, and each core is running a task (at some given instant) so you do have several tasks running in parallel.
CPU quantum times are given to tasks, not to processes
The NPTL implementation of POSIX thread specifications sees thread as a different process inside kernel, having unique task_struct (and therefore pid too) so each thread is schedulable in itself as mentioned. Therefore each thread gets its own timeslice and is scheduled just like processes as mentioned above.
Just to add, Currently Linux scheduler is also capable of scheduling not only single tasks ( a simple process), but groups of processes or even users ( all processes, belonging to a user) as a whole. This allows implementing of group scheduling, where CPU time is first divided between process groups and then distributed within those groups to single threads.
Linux threads does not directly operate on processes or threads, but works with schedulable entities. Represented by struct sched_entity.
It's fair to say that every process/thread is a sched_entity but the converse might not be true.
To know detailed process scheduling, refer here

What is the difference between lightweight process and thread?

I found an answer to the question here. But I don't understand some ideas in the answer. For instance, lightweight process is said to share its logical address space with other processes. What does it mean? I can understand the same situation with 2 threads: both of them share one address space, so both of them can read any variables from bss segment (for example). But we've got a lot of different processes with different bss sections, and I don't know, how to share all of them.
I am not sure that answers are correct here, so let me post my version.
There is a difference between process - LWP (lightweight process) and user thread. I will leave process definition aside since that's more or less known and focus on LWP vs user threads.
LWP is what essentially are called today threads. Originally, user thread meant a thread that is managed by the application itself and the kernel does not know anything about it.
LWP, on the other hand, is a unit of scheduling and execution by the kernel.
Example:
Let's assume that system has 3 other processes running and scheduling is round-robin without priorities. And you have 1 processor/core.
Option 1. You have 2 user threads using one LWP. That means that from OS perspective you have ONE scheduling unit. Totally there are 4 LWP running (3 others + 1 yours). Your LWP gets 1/4 of total CPU time and since you have 2 user threads, each of them gets 1/8 of total CPU time (depends on your implementation)
Option2. You have 2 LWP. From OS perspective, you have TWO scheduling units. Totally there are 5 LWP running. Your LWP gets 1/5 of total CPU time EACH and your application get's 2/5 of CPU.
Another rough difference - LWP has pid (process id), user threads do not.
For some reason, naming got little messed and we refer to LWP as threads.
There are definitely more differences, but please, refer to slides.
http://www.cosc.brocku.ca/Offerings/4P13/slides/threads.ppt
EDIT:
After posting, I found a good article that explains everything in more details and is in better English than I write.
http://www.thegeekstuff.com/2013/11/linux-process-and-threads/
From MSDN, Threads and Processes:
Processes exist in the operating system and correspond to what users
see as programs or applications. A thread, on the other hand, exists
within a process. For this reason, threads are sometimes referred to
as light-weight processes. Each process consists of one or more
threads.
Based on Tanenbaum's book "Distributes Systems", light weight processes is generally referred to a hybrid form of user-level thread and kernel-level thread. An LWP runs in the context of a single process, and there can be several LWPs per process. In addition each LWP can be running its own (user-level) thread. Multi-threaded applications are constructed by creating threads (with thread library package), and subsequently assigning each thread to an LWP.
The biggest advantage of using this hybrid approach is that creating, destroying, and synchronizing threads is relatively cheap and do not need any kernel intervention. Beside that, provided that a process has enough LWPs, a blocking system call will not suspend the entire process.
Threads run within processes.
Each process may contain one or more threads.
If kernel doesn't know anything about the threads running in the process, we have threads running on user space and thus no multiprocessing capabilities are available.
On the other hand, we can have threads running on the kernel space; this means that each process runs on a different CPU. This enables us multiprocessing, but as you may assume it is more expensive in terms of operating system resources.
Finally, there is a solution that lies somewhere in the middle; we group threads together into LWP. Each group runs on different CPU, but the threads in the group cannot be multi processed. That's because kernel in this version knows only about the groups (which are multiprocessed) but nothing about the threads that they contain.
Hope it is clear enough.
IMO, LWP is a kernel thread binding which can be created and executed in the user context.
If I'm not mistaken, you can attach user threads to a single LWP to potentially increase the level of concurrency without involving a system call.
Thread is basically task assigned with one goal and enough information to perform a specific task .
A process can create multiple threads for doing its work as fast as possible.
e.g A portion of program may need input output, a portion may need permissions.
User level thread are those that can be handled by thread library.
On the other hand kernel level thread (which needs to deal with hadrware)are also called LWP(light weight process) to maximize the use of system and so the system does not halt upon just one system call.
From here.
Each LWP is a kernel resource in a kernel pool, and is attached and detached to a thread on a per thread basis. This happens as threads are scheduled or created and destroyed.
A process contains one or more threads in it and a thread can do anything a process can do. Also threads within a process share the same address space because of which cost of communication between threads is low as it is using the same code section, data section and OS resources, so these all features of thread makes it a "lightweight process".

Resources