Why do we need a swapper task in Linux?

The idle task (a.k.a. the swapper task) is chosen to run when there are no more runnable tasks in the run queue at the point of task scheduling. But what is the purpose of this special task? Another question: why can't I find this thread/process (PID 0) in the "ps aux" output from userland?

The reason is both historical and programmatic. The idle task is the task that runs when no other task is runnable, just as you said. It has the lowest possible priority, which is why it only runs when nothing else is runnable.
Programmatic reason: this simplifies process scheduling a lot, because you never have to handle the special case "what happens if no task is runnable?"; there is always at least one runnable task, the idle task. It also keeps per-task CPU-time accounting consistent: without the idle task, which task would be charged for the CPU time that nobody needs?
Historical reason: before CPUs could step down their clock or enter power-saving modes, the CPU had to run at full speed at all times, so the idle task executed a series of NOP instructions when nothing else was runnable. Today the idle task usually steps the CPU down by using HLT (halt) instructions, so power is saved. So the idle task still provides real functionality these days.
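To illustrate the NOP/HLT point, here is a rough conceptual sketch of the two idle strategies. This is illustrative, kernel-mode-only C with x86 inline assembly, not the actual Linux implementation (the real code lives in architecture-specific idle routines):

    /* Illustrative sketch only -- not the real kernel code. */
    void idle_loop_historical(void)
    {
        for (;;)
            ;                              /* busy-wait: CPU runs at full power */
    }

    void idle_loop_modern(void)
    {
        for (;;) {
            /* "sti; hlt" enables interrupts and halts the CPU until the
             * next interrupt (timer, I/O, IPI) wakes it up again. */
            __asm__ volatile ("sti; hlt");
        }
    }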
In Windows you can see the idle task in the process list: it is shown as the System Idle Process.

The Linux kernel maintains wait queues of processes that are blocked on I/O, mutexes and so on. If there is no runnable process, the idle process is placed on the run queue until it is preempted by a task coming out of a wait queue.
The reason it is a task is so that you can measure (approximately) how much time the kernel is sitting idle because of blocking on I/O, locks, etc. It also makes the kernel code that much simpler, because the idle task is context-switched exactly like every other task, instead of being a special case that would make changing kernel behaviour more difficult.
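Because idle time is accounted per CPU just like any other task's time, you can read it back from /proc/stat, where the fourth numeric field of each cpu line is the accumulated idle time in clock ticks. A small sketch (standard /proc/stat layout, minimal error handling):

    #include <stdio.h>

    /* Print the per-CPU idle time: the 4th field of each "cpuN" line in
     * /proc/stat, measured in clock ticks (USER_HZ). */
    int main(void)
    {
        FILE *f = fopen("/proc/stat", "r");
        char line[512], cpu[16];
        unsigned long long user, nice, sys, idle;

        if (!f)
            return 1;
        while (fgets(line, sizeof(line), f)) {
            if (sscanf(line, "%15s %llu %llu %llu %llu",
                       cpu, &user, &nice, &sys, &idle) == 5 &&
                cpu[0] == 'c' && cpu[1] == 'p' && cpu[2] == 'u')
                printf("%s idle ticks: %llu\n", cpu, idle);
        }
        fclose(f);
        return 0;
    }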

There is actually one idle task per CPU, but it is not held in the main task list; instead it lives in the CPU's runqueue structure, struct rq, as a struct task_struct *.
The scheduler activates it whenever there is nothing better to do on that CPU, and it executes some architecture-specific code to put the CPU into a low-power idle state.
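A heavily simplified view of that per-CPU arrangement (the field names below are abbreviated; the real struct rq in kernel/sched/sched.h has many more members):

    /* Heavily simplified sketch of a per-CPU runqueue. */
    struct task_struct;                    /* one per thread of execution */

    struct rq {
        unsigned int        nr_running;    /* runnable tasks on this CPU */
        struct task_struct *curr;          /* task currently executing */
        struct task_struct *idle;          /* this CPU's idle ("swapper") task,
                                            * picked when nothing is runnable */
    };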

You can use ps -ef to list the running processes. PID 0 itself is not listed, but in the first lines of the output you will see PPID 0 for init (PID 1) and kthreadd (PID 2); that parent is the swapper task.

Related

Why “ps aux” in Linux does not show the process whose pid=0? [duplicate]


In Linux scheduler, how do different processes containing multiple threads get fair time quota?

I know the Linux scheduler schedules task_structs, which represent threads. So if we have two processes, e.g. A containing 100 threads while B is single-threaded, how can the two processes be scheduled fairly, given that each thread is scheduled fairly?
In addition, is it right that in Linux a context switch between threads of the same process is faster than one between threads of different processes, since the latter involves the process control block while the former does not?
The point you are missing here is how the scheduler looks at threads and tasks. The Linux kernel scheduler treats each of them as an individual scheduling entity, so they are counted and scheduled separately.
Now let's see what the CFS documentation says: it takes the simple approach of giving an even slice of CPU time to each runnable task, so if there are 4 runnable processes/threads, each gets 25% of the CPU. Because that is not exactly achievable on real hardware, vruntime was introduced to fix the issue (read more on this here).
Now come back to your example: if process A creates 100 threads and B creates 1 thread, then, counting each process's original main thread as well, the number of runnable tasks becomes 103 (assuming all are in the runnable state), and CFS shares the CPU evenly using the formula 1/103 (CPU / number of running tasks). Context switching is the same for all scheduling entities; threads merely share their task's mm_struct, and when they run they load their own register set and task state. Hope this helps you understand it better.
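To see that each thread really is its own scheduling entity, you can create a few threads and print their kernel thread IDs; every one of them appears under /proc/<pid>/task/ and is scheduled individually by CFS. A minimal sketch (uses the raw gettid syscall because the glibc gettid() wrapper only exists on newer systems); build with -pthread:

    #define _GNU_SOURCE
    #include <pthread.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Each pthread is backed by its own task_struct, so it gets its own
     * kernel TID and is scheduled as an independent entity. */
    static void *worker(void *arg)
    {
        (void)arg;
        printf("pid=%d tid=%ld\n", getpid(), (long)syscall(SYS_gettid));
        return NULL;
    }

    int main(void)
    {
        pthread_t t[4];
        for (int i = 0; i < 4; i++)
            pthread_create(&t[i], NULL, worker, NULL);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        return 0;
    }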

Working of scheduler

A scheduler is a program that schedules the different processes in the OS.
The question that came to my mind is this:
the scheduler is itself a program, and it schedules other processes by context switching.
So there will be a time when the scheduler itself gets switched out for some other process.
If this happens, how does scheduling continue after that?
Or, if it does not work like that, then how does it work? In a multitasking system the processes have to be context switched in order to run together, and if the scheduler is running all the time, how does it leave room for the other processes to run?
The scheduler is a program, yes, but it is very rarely a process. Rather, the scheduler is part of the kernel, the program that abstracts processes away from the hardware (including processor usage).
In a preemptive system, since the scheduler is part of the kernel, it exists in the address space of every single process. When a process's allotted time is up, the scheduler takes control of execution and does the work needed to move to the next process. When the scheduler does this, however, it does not remove itself from the new process's address space, so when that process's time is up it can again safely perform the work needed to move on.
While there have been kernels whose functions were offloaded into other processes (CMU Mach), there will always be a part of the kernel that retains the ability to switch processes, and that part will never live exclusively in its own process.
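As a rough illustration that the scheduler is kernel code invoked on events such as the timer interrupt, rather than a process of its own, here is a pseudocode-level C sketch of a preemptive switch on a timer tick. The names (timer_tick, pick_next_task, switch_to, time_slice) are made up for illustration and do not match the real kernel symbols:

    /* Illustrative sketch only: the scheduler runs as kernel code in
     * interrupt context, not as a separate process. */
    struct task {
        int time_slice;                   /* remaining ticks in this quantum */
        /* ... saved registers, kernel stack pointer, state, ... */
    };

    extern struct task *current;          /* the task that was interrupted */
    struct task *pick_next_task(void);    /* policy decision; may return the idle task */
    void switch_to(struct task *prev, struct task *next);  /* save/restore CPU state */

    void timer_tick(void)                 /* called from the timer interrupt */
    {
        struct task *prev = current;

        if (--prev->time_slice > 0)
            return;                       /* quantum not used up yet; keep running prev */

        struct task *next = pick_next_task();
        if (next != prev)
            switch_to(prev, next);        /* execution resumes in next's context */
    }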
For more information on how scheduling works, I find the following articles helpful:
http://wiki.osdev.org/Context_Switching
http://wiki.osdev.org/Scheduling_Algorithms
http://wiki.osdev.org/Processes_and_Threads

Linux Threads and process - CPU affinity

I have a few queries related to thread and process scheduling.
When my process goes to sleep and later wakes up, will it always be scheduled on the same CPU it ran on before?
When I create a thread from the process, will that thread also always execute on the same CPU, even if other CPUs are free and idle?
I would like to know the mechanism in Linux specifically. I am creating the threads with the pthread library, and I am chasing a random hang that is not always reproducible, so I need this information to proceed in the right direction.
On single-processor/single-core systems:
Yes.
Yes.
On multi-processor/multi-core systems:
No.
No.
You can use taskset to retrieve or set a process's CPU affinity on multicore systems. Pinning the CPU affinity to a specific processor/core changes the answers to
Yes.
Yes.
for multicore systems as well.
From within an application you may use sched_setaffinity and/or sched_getaffinity to adjust the CPU affinity.
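For example, pinning the calling process (or thread) to CPU 0 looks roughly like this; sched_setaffinity and sched_getcpu are Linux-specific and need _GNU_SOURCE:

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(0, &set);                 /* allow CPU 0 only */

        /* pid 0 means "the calling thread" */
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        printf("now pinned to CPU %d\n", sched_getcpu());
        return 0;
    }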
Edit: additional details about how/when CPU migrations are handled with respect to the cache penalty:
The Linux/SMP Scheduler: "... In order to achieve good system performance, Linux/SMP (2.4 kernel) adopts an empirical rule to solve the dilemma ..." Read the details in the linked reference, section The Linux/SMP Scheduler.
For the newer CFS (Completely Fair Scheduler) you'd look at sched_migration_cost. "...if the real runtime of the task is smaller than the values of this parameter then the scheduler assumes that it is still in the cache and tries to avoid moving the task to another CPU during the load balancing procedure ..." (e.g.: Completely Fair Scheduler and its tuning).
When a process goes to sleep and later wakes up, it is not necessarily scheduled on the same CPU. In a multiprocessor environment, the scheduler policy decides which CPU it runs on. A process goes to sleep for different reasons: it is waiting for I/O, or for some other resource. When the event occurs, the process moves from the waiting state to the ready state, and the scheduler then places it on whichever CPU is free, which is not necessarily the same CPU as before.
For extra information, look at the scheduler source code in the Linux kernel source tree.

Task in VxWorks

When we call taskSpawn, a task is created in VxWorks. What actually is a task? Is there any relation to a thread?
My understanding is that VxWorks is a thread-based operating system.
Can someone please explain the real difference between a task, a thread and a process in a real scenario?
Somewhere I saw that a task is the execution of a set of instructions. If that is the case, a thread also executes a set of instructions, so can we call a thread a task?
Please help.
A thread is a concept typically used with an OS that supports a process model (Unix/Linux/Windows), where you run processes.
Such a process could have a single thread of execution (like a simple C program), or you could create multiple threads to perform certain operations in parallel within the current process's memory space.
With older VxWorks there was no process model; everything ran in the same memory space. A VxWorks task provides the context in which code executes: all code (with the exception of interrupt handlers) executes in the context of a task.
Tasks are independent execution units. They can share resources, have common memory, etc... but the scheduler executes the tasks based on very specific criteria. Typically, the highest priority task in the system is the task that will be executing at any given time.
Once a task is done/sleeps/blocked waiting for resources, then the next highest priority task in the system will run.
For your purpose, you can probably think of the task as a thread.
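For illustration, spawning a task looks roughly like this; the entry function, its arguments, and the priority/stack-size values below are made-up placeholders, not anything prescribed by VxWorks:

    #include <vxWorks.h>
    #include <taskLib.h>
    #include <stdio.h>

    /* Hypothetical task entry point -- a task body is just a C function. */
    void myTaskEntry(int arg1, int arg2)
    {
        printf("task running with args %d, %d\n", arg1, arg2);
    }

    void spawnExample(void)
    {
        /* name, priority (0 = highest), options, stack size,
         * entry point, then up to 10 integer arguments */
        int tid = taskSpawn("tMyTask", 100, 0, 0x4000,
                            (FUNCPTR)myTaskEntry,
                            1, 2, 0, 0, 0, 0, 0, 0, 0, 0);
        if (tid == ERROR)
            printf("taskSpawn failed\n");
    }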
A task is an abstract concept in OS design. A task is a single context of execution: it has a memory space in which its data and code live (which may or may not be shared with other tasks), it has a state (e.g. running, stopped, killed...), it (usually) has a stack, and it has a priority relative to other tasks.
One example of such a task is a VxWorks task; another is a Linux thread.
In Linux (and, I believe, also in recent versions of VxWorks) there exists the concept of a group of related tasks. Tasks belonging to the same group share their memory space and several other resources (e.g. file handles). A Linux process is such a group of tasks.
By and large, the OS scheduler schedules tasks, not processes. The process is a convenient abstraction that lets the programmer think about a group of related threads together.
I hope that helped.
In VxWorks, a task is the runnable unit.
Each task has a TCB (Task Control Block) with its own context and a specific priority (as you define in the taskSpawn call).
The VxWorks scheduler runs only tasks; the task is the minimum runnable unit (apart from the kernel itself and interrupt handlers, which also run in the system).
The decision of which task to run is based on the task state (it must be READY) and the task priority (in VxWorks, the lower the number, the higher the priority).
Note that several tasks might share the same priority, in which case the kernel runs them according to the scheme you configured (FIFO or round-robin).
In VxWorks, all tasks share the same memory space (including kernel memory space). This is why Wind River added the process-like mechanism (Real-Time Processes, RTPs) from VxWorks 6.x onwards: a process has its own virtual memory space, protected by the MMU.
To summarize:
Tasks share the same memory space across the system.
Threads share the memory space of their process.
A process's memory space is protected by the MMU.
Tasks and threads are both similar to processes, but with a difference: threads do not have a separate memory space of their own; they run inside the address space (and PCB) of their process. A task, on the other hand, has its own stack area and is a lightweight process: its TCB is much smaller than a PCB, so context switching (task switching) can happen faster.
Since VxWorks is an RTOS and switching latency must be very low, it deals with tasks.
In addition to the existing answers:
If you ever need to create POSIX threads on your VxWorks system (which is possible by including POSIX in the kernel configuration and calling pthread_create()), you will notice that those threads appear as tasks in your task list (type 'i' in the C shell).
Hence, tasks and threads are very much alike. VxWorks even wraps POSIX threads as tasks so they can be handled alongside the existing native tasks.
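A minimal sketch of that (the thread body below is hypothetical): a POSIX thread created this way shows up in the task list right next to tasks created with taskSpawn:

    #include <pthread.h>
    #include <stdio.h>

    /* Hypothetical thread body -- under VxWorks this thread is wrapped in a
     * native task and appears in the 'i' task list like any other task. */
    static void *posixWorker(void *arg)
    {
        (void)arg;
        printf("POSIX thread running\n");
        return NULL;
    }

    void posixExample(void)
    {
        pthread_t tid;

        if (pthread_create(&tid, NULL, posixWorker, NULL) == 0)
            pthread_join(tid, NULL);
    }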
