Suppose four user processes are running in my system, say P1, P2, P3, and P4. Can the user tell which process has the lowest priority among them? How does the kernel prioritize processes? What parameters does it take into consideration while determining process priority?
I need this information since I'm trying to suspend whichever process has the lowest priority compared to the others.
Process priority is not as simple as that; typically, unless you do something, all user-level processes have the same priority to begin with (as they are time-shared by the scheduler). However, you can instruct the kernel to prioritize or de-prioritize a process by setting a per-process nice value.
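For example (a minimal sketch using the standard setpriority(2)/getpriority(2) calls; the value 10 is an arbitrary choice), a process can voluntarily lower its own priority like this:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    /* nice values run from -20 (most favored) to 19 (least favored);
       an unprivileged process may only raise its own nice value */
    if (setpriority(PRIO_PROCESS, 0 /* 0 = the calling process */, 10) != 0) {
        perror("setpriority");
        return 1;
    }
    printf("new nice value: %d\n", getpriority(PRIO_PROCESS, 0));
    return 0;
}

For the "which process has the lowest priority" part of your question: you can read any process's nice value back with getpriority, or simply look at the NI column in top or ps -l.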
For more details, take a look at http://man7.org/linux/man-pages/man7/sched.7.html
So I was reading about Processes and Threads and I had a question. Following is the scenario.
Uniprocessor Environment
I understand that the OS rotates processes over the processor for a particular time period (a quantum). I get it when the process is single-threaded, i.e., there is just one path of execution: whenever it is assigned the processor, it continues with its execution. Now let's say the process forks, or just creates a new thread. How does the whole thing work then? Does the OS say to process P, "Go on, continue with execution," and the process within itself picks the new thread or the parent thread in rotation, so that if there are more than two threads the rotation stays fair to each thread? Or does the OS actually interact with the threads directly? (In that case I am not sure what happens.)
Multiprocessor Environment
Now say I have a multiprocessor environment. If there were just a single-threaded process, the OS would assign one of the processors to it, and on it would go with its execution. Now say there are multiple threads in the process. If I assign one of the processors to the process and ask it to continue its execution, and the process has to pick one of its threads to execute, then there will never be parallel processing going on within that specific process, since the process can only put one of its threads on the processor at a time.
So how does it happen in both the cases?
Cheers.
Process Scheduling
Operating Systems ultimately control these types of thread scheduling.
Windows systems are priority-based and so will allow a process to consume more resources than others. This is why your machine can 'hang' if a process has been escalated to a high priority. Priorities range from 1 to 31 as far as I know.
Mac OS / Linux / Unix are time-based, allowing all processes to have equal amounts of CPU time. Therefore loading more processes will slow your system down as they all share a smaller slice of execution time.
Uniprocessor Environment
The OS is ultimately responsible for this, but switching processes involves roughly the following (I cannot guarantee accuracy here, it's just an indication; a user-space sketch of the save/restore idea follows the list):
Halting a process / thread
Storing the current stack (code location)
Storing the current registers of the CPU
Asking the kernel for the next process/thread to run
Kernel indicates which one has to be run
OS reloads the registers from the cache
OS reloads the current stack for the next application.
Resumes the process
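To make the save/restore steps above concrete, here is a user-space analogue using the POSIX ucontext API (a sketch only: the stack size is arbitrary, and the real kernel switch is done in architecture-specific assembly; the ucontext functions are obsolete in POSIX but glibc still ships them):

#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];   /* a private stack for the "task" */

static void task(void)
{
    printf("task: running on my own stack\n");
    swapcontext(&task_ctx, &main_ctx);   /* save my registers/stack, resume main */
    printf("task: resumed exactly where I left off\n");
}

int main(void)
{
    getcontext(&task_ctx);                  /* capture the current CPU state */
    task_ctx.uc_stack.ss_sp = task_stack;   /* give the task its own stack */
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;           /* where to go when task() returns */
    makecontext(&task_ctx, task, 0);

    swapcontext(&main_ctx, &task_ctx);      /* "halt" main, run the task */
    printf("main: task yielded, switching back\n");
    swapcontext(&main_ctx, &task_ctx);      /* resume the task mid-function */
    printf("main: task finished\n");
    return 0;
}

Each swapcontext call is steps 1-3 and 6-8 of the list in miniature: store the current registers and stack pointer, then reload the saved ones of the other side.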
Obviously the more threads and processes you have running, the slower it will become. The problem is that the time taken to switch processes can actually exceed the time the process is allowed to execute.
Threads are just child processes of a single process. For a single processor, it just looks like additional work.
Multi-processor Environment
Multi-processor environments work differently, as the cache is shared amongst processors. I believe these are called L1 (Level 1) and L2 caches. So the difference is that processor A can reload the state stored by processor B without conflicts. 'Hyper-threading' takes the same approach, although this is processor-specific. Another difference is that a processor can solely control a specific process; this is called 'CPU affinity'. It's not encouraged for every process, but it does allow an application to have a dedicated processor to work off.
This is OS-specific, of course, but most operating systems schedule at the thread level. A process is just a grouping of threads. For example, on Linux, threads are called "tasks" and each is scheduled independently. They are created with the clone call. What is typically called a thread is a task which shares its address space (and other resources such as file descriptors, mount points, etc.) with the creating task. Note that the clone call can also create what is typically called a process if the flags to enable sharing are not passed.
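To illustrate (a minimal sketch assuming Linux and glibc's clone wrapper; the stack size and the shared variable are arbitrary), the flags decide whether you get a "thread" or a "process":

#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared_value = 0;

static int child(void *arg)
{
    shared_value = 42;   /* with CLONE_VM this writes the parent's memory */
    return 0;
}

int main(void)
{
    const size_t stack_size = 64 * 1024;
    char *stack = malloc(stack_size);
    if (!stack)
        return 1;

    /* CLONE_VM shares the address space, like a thread; drop it and the
       child gets a copy-on-write copy, like fork() */
    pid_t tid = clone(child, stack + stack_size, CLONE_VM | SIGCHLD, NULL);
    if (tid == -1) {
        perror("clone");
        return 1;
    }
    waitpid(tid, NULL, 0);
    printf("parent sees shared_value = %d\n", shared_value);   /* prints 42 */
    free(stack);
    return 0;
}

Passing stack + stack_size works on architectures where the stack grows downwards (x86, ARM, and most others).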
Considering the above, any thread may be scheduled at any time on any processor, no matter how many processors there are available. That said, most OSs also attempt to maintain some measure of processor affinity to avoid excessive cache misses, but usually if a thread is runnable and a different CPU is available, it will change CPUs. Often there is also a way to specify which CPUs a particular thread may execute upon.
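On Linux, that last point is sched_setaffinity (a minimal sketch; pinning the calling thread to CPU 0 is an arbitrary choice):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

int main(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);   /* allow execution on CPU 0 only */

    /* a pid of 0 means "the calling thread" */
    if (sched_setaffinity(0, sizeof mask, &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("now pinned to CPU 0\n");
    return 0;
}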
It doesn't matter whether there is 1 processor or 128. The OS manages access to resources, trying to efficiently match up requests with availability, and that includes CPU execution. If a thread is running, it has already managed to get some CPU, but if it requests a resource that is not immediately available, it no longer needs any CPU until that resource becomes free, and so the OS will take CPU execution away from it and, if there is another thread waiting for CPU, hand it over. When the requested resource does become available, the thread is made ready again. If there is a core free, it will be made running 'immediately'; if not, the CPU scheduling algorithm decides whether to stop a currently-running thread to free up a core or to leave the newly-ready thread waiting.
It's better to try to ignore things like 'time-slice', 'quantum', and 'priority'; they cause much confusion and FUD. If a running thread wants something it cannot have yet, it doesn't need any more CPU cycles, so the OS will take them away and, if another thread needs them, apply them there. That is why preemptive multitaskers exist: to match up threads with resources in an attempt to maximize forward progress.
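As a toy illustration of "a blocked thread needs no more CPU cycles" (a sketch assuming a POSIX environment; the pipe and the 2-second delay are arbitrary):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    char buf[6];
    pipe(fds);

    if (fork() == 0) {
        sleep(2);                    /* the child dawdles... */
        write(fds[1], "hello", 6);   /* ...then makes the resource available */
        _exit(0);
    }

    /* the parent blocks here for ~2 seconds using ~0% CPU;
       the scheduler runs other threads in the meantime */
    read(fds[0], buf, sizeof buf);
    printf("got: %s\n", buf);
    return 0;
}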
I am new to the Linux kernel and right now I am studying process scheduling in the Linux kernel. There are three types of priorities in Linux:
Static priority
Dynamic priority
Real time priority
Now what I have understood is that:
Static priority and dynamic priority are defined only for conventional processes, and they can take values from 100 to 139 only.
Static priority is used to determine the base time slice of a process.
Dynamic priority is used to select the process to be executed next.
Real-time priorities are defined only for real-time processes, and their values can range from 0 to 99.
Now my questions are :
Correct me if I am wrong, and please also tell me why we are using three types of priorities in Linux and what the differences among these priorities are.
Are processes differentiated as real-time or conventional on the basis of priority, i.e., if the priority is between 100 and 139 the process is a conventional process, and otherwise a real-time process?
How are priorities changed in Linux? I mean, we know that the priority of a process does not remain constant throughout its execution.
Disclaimer: The following is true for scheduling in Linux (I am not sure about Windows or other OSs). Thread and process have been used interchangeably here, although there is a difference between them.
Priorities & differences
1. Static priority: These are the default priorities (value 0 for conventional processes, aka non-real-time processes, i.e., when real-time scheduling is not used) set while creating a new thread. You can change them using:
`pthread_setschedparam(pthread_t thread, int policy, const struct sched_param *param);`
where, sched_param contains the priority:
struct sched_param
{
int sched_priority; /* Scheduling priority */
};
2. Dynamic priority: When threads start to starve because higher-priority threads are being scheduled all the time, there arises a need to raise the priority of such a thread through various mechanisms. This raised/lowered (yes, lowering happens too) priority is known as the dynamic priority, because it keeps changing. In Linux, even the fat kid gets to play.
3. Real-time priority: This comes into the picture only when threads (processes) are scheduled under one of the real-time policies (SCHED_FIFO, SCHED_RR) and have a sched_priority value in the range 1 (low) to 99 (high). This is higher than the static/dynamic priorities of any non-real-time process.
More information: http://man7.org/linux/man-pages/man3/pthread_getschedparam.3.html
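A minimal usage sketch (the SCHED_FIFO policy and priority 10 are arbitrary choices; switching to a real-time policy typically requires root or CAP_SYS_NICE, so expect EPERM otherwise):

#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct sched_param param;
    param.sched_priority = 10;   /* 1 (low) to 99 (high) for SCHED_FIFO */

    int err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if (err != 0)
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
    else
        printf("now running under SCHED_FIFO, priority 10\n");
    return 0;
}

(Compile with -pthread.)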
Now, to your questions:
Correct me if I am wrong, and please also tell me why we are using three types of priorities in Linux and what the differences among these priorities are.
So, for non-real-time scheduling policies, every process has a static priority; a higher priority gives the thread a kick-start, and later, to avoid any injustice, the priority is boosted/lowered, which becomes the dynamic priority.
Are processes differentiated as real-time or conventional on the basis of priority, i.e., if the priority is between 100 and 139 the process is a conventional process, and otherwise a real-time process?
Not really, it depends upon the scheduling mechanism in place.
How are priorities changed in Linux? I mean, we know that the priority of a process does not remain constant throughout its execution.
That's where the dynamic part comes into the picture. Read about the "nice value" in the links given above.
I found an answer to the question here, but I don't understand some of the ideas in the answer. For instance, a lightweight process is said to share its logical address space with other processes. What does that mean? I can understand the same situation with two threads: both of them share one address space, so both of them can read any variables from the bss segment (for example). But we've got a lot of different processes with different bss sections, and I don't know how to share all of them.
I am not sure the answers here are correct, so let me post my version.
There is a difference between a process, an LWP (lightweight process), and a user thread. I will leave the process definition aside since that's more or less known, and focus on LWPs vs. user threads.
An LWP is what is essentially called a thread today. Originally, a user thread meant a thread that is managed by the application itself and that the kernel does not know anything about.
An LWP, on the other hand, is a unit of scheduling and execution for the kernel.
Example:
Let's assume that the system has 3 other processes running and scheduling is round-robin without priorities, and that you have 1 processor/core.
Option 1: You have 2 user threads using one LWP. From the OS's perspective, you have ONE scheduling unit. In total there are 4 LWPs running (3 others + 1 yours). Your LWP gets 1/4 of the total CPU time, and since you have 2 user threads, each of them gets 1/8 of the total CPU time (depending on your implementation).
Option 2: You have 2 LWPs. From the OS's perspective, you have TWO scheduling units. In total there are 5 LWPs running. Each of your LWPs gets 1/5 of the total CPU time, so your application gets 2/5 of the CPU.
Another rough difference: an LWP has a pid (process id); user threads do not.
For some reason the naming got a little messed up, and we refer to LWPs as threads.
There are definitely more differences, but please refer to the slides.
http://www.cosc.brocku.ca/Offerings/4P13/slides/threads.ppt
EDIT:
After posting, I found a good article that explains everything in more detail and in better English than mine.
http://www.thegeekstuff.com/2013/11/linux-process-and-threads/
From MSDN, Threads and Processes:
Processes exist in the operating system and correspond to what users see as programs or applications. A thread, on the other hand, exists within a process. For this reason, threads are sometimes referred to as light-weight processes. Each process consists of one or more threads.
Based on Tanenbaum's book "Distributed Systems", a lightweight process generally refers to a hybrid form of user-level thread and kernel-level thread. An LWP runs in the context of a single process, and there can be several LWPs per process. In addition, each LWP can be running its own (user-level) thread. Multi-threaded applications are constructed by creating threads (with a thread library package) and subsequently assigning each thread to an LWP.
The biggest advantage of this hybrid approach is that creating, destroying, and synchronizing threads is relatively cheap and does not need any kernel intervention. Besides that, provided a process has enough LWPs, a blocking system call will not suspend the entire process.
Threads run within processes.
Each process may contain one or more threads.
If the kernel doesn't know anything about the threads running in a process, we have threads running in user space, and thus no multiprocessing capabilities are available.
On the other hand, we can have threads running in kernel space; this means that each thread can be scheduled on a different CPU. This gives us multiprocessing, but, as you may assume, it is more expensive in terms of operating system resources.
Finally, there is a solution that lies somewhere in the middle: we group threads together into LWPs. Each group can run on a different CPU, but the threads within a group cannot run in parallel, because in this version the kernel knows only about the groups (which are multiprocessed) and nothing about the threads they contain.
Hope it is clear enough.
IMO, an LWP is a kernel thread binding that can be created and executed in user context.
If I'm not mistaken, you can attach user threads to a single LWP to potentially increase the level of concurrency without involving a system call.
A thread is basically a task assigned one goal, with enough information to perform that specific task.
A process can create multiple threads to do its work as fast as possible.
E.g., one portion of a program may need to do input/output while another portion needs to check permissions.
User-level threads are those that can be handled by a thread library.
Kernel-level threads, on the other hand (which need to deal with hardware), are also called LWPs (lightweight processes); they maximize use of the system so that it does not halt upon just one blocking system call.
From here.
Each LWP is a kernel resource in a kernel pool, and it is attached to and detached from a thread on a per-thread basis. This happens as threads are scheduled, created, or destroyed.
A process contains one or more threads, and a thread can do anything a process can do. Threads within a process share the same address space, so the cost of communication between threads is low: they use the same code section, data section, and OS resources. These features make a thread a "lightweight process".
Here's the scenario:
You have two threads (which represent different machines) that take the same input from a single data source, run the same processing steps (which do not depend on any shared resources), and return the same value.
If one thread (read: machine) is faster than the other and finishes first, will the program accept that value and end, or will it wait for the other thread to finish? If the answer is the latter, is there any way to force the program to take the first answer?
The practical reason for this would be to handle unbearably slow machines.
This is entirely up to you to decide. If you spawn two threads, you can control them from the parent process and choose the behavior you want. You may wait for both threads, or wait until one of them is done (e.g., using select from the parent thread or a signal from the child thread), and possibly kill the other one (again using a signal, or kill).
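As a rough sketch of the "take the first answer" pattern with POSIX threads (the worker function, its sleep-based stand-in for real computation, and the result values are all hypothetical):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done = PTHREAD_COND_INITIALIZER;
static int have_result = 0;
static int first_result;

static void *worker(void *arg)
{
    int delay = *(int *)arg;
    sleep(delay);                 /* stand-in for the real computation */

    pthread_mutex_lock(&lock);
    if (!have_result) {           /* only the first finisher publishes */
        have_result = 1;
        first_result = delay;
        pthread_cond_signal(&done);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    int fast = 1, slow = 5;
    pthread_t t_fast, t_slow;
    pthread_create(&t_fast, NULL, worker, &fast);
    pthread_create(&t_slow, NULL, worker, &slow);

    pthread_mutex_lock(&lock);
    while (!have_result)
        pthread_cond_wait(&done, &lock);
    pthread_mutex_unlock(&lock);
    printf("first answer: %d\n", first_result);

    pthread_cancel(t_slow);       /* stop the unbearably slow "machine" */
    pthread_join(t_fast, NULL);
    pthread_join(t_slow, NULL);
    return 0;
}

Note that sleep is a cancellation point, so the slow worker dies promptly here; a real worker would need to hit cancellation points of its own, or periodically check a shared "stop" flag.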
For a very good reference on systems programming (multi-processing, threads, communication, concurrency...), see Unix system programming in Objective Caml. It has an example (psearch here) where threads collaborate to find a result and stop as soon as one of them succeeds.
I know the theoretical difference between a thread and a process. But in practice, when should I use a thread and when a process, given that both can do the same work?
In general (and it varies by operating system):
Threads are usually lighter-weight than processes
Processes provide better isolation between actions
Threads provide simpler data sharing and coordination within the process
Typically the middle point is the kicker for me - if you really, really don't want two actions to interfere with each other, to the extent that one process going belly-up doesn't affect the other action, use separate processes. Otherwise I personally go for threads.
(I'm assuming that both models are available - if you want to run a separate executable, that's going to be pretty hard to do within an existing thread, at least in most environments I'm aware of.)
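To make the sharing difference concrete (a small sketch; the counter and single increment are arbitrary):

#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;

static void *thread_fn(void *arg)
{
    counter++;   /* same address space: main() sees this */
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, thread_fn, NULL);
    pthread_join(t, NULL);
    printf("after thread: counter = %d\n", counter);   /* prints 1 */

    pid_t pid = fork();
    if (pid == 0) {
        counter++;   /* the child's private copy-on-write copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    printf("after fork:   counter = %d\n", counter);   /* still 1 */
    return 0;
}

The thread increments the very counter main() reads; the forked child increments its own copy, so the parent never sees the change.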
A thread is a subunit of a process. The main differences here are memory allocation and CPU time scheduling:
the operating system handles memory per process and schedules execution time for processes
you allocate memory (within the bounds allowed per process) and you schedule execution time (within the execution timeframe given per process) for threads
Other than that, there are a lot of minor defining differences, like hardware allocation (threads can share hardware locked by their process) and communication (depending on the platform/language/runtime, threads can share variables, while processes need a pipe to share information), etc. There is much more to this distinction if you think of a thread as an atomic entity; a process, in that case, is the way to group these entities.