Threads inside a Process - multithreading

Processes get CPU time as managed by the OS process scheduler.
Since threads run in parallel within a single process, does this mean that a process's CPU time is further distributed (sliced) among its threads?
Or can the scheduler distribute CPU time directly among threads, bypassing the parent process?

I suspect the answer varies with the OS. On Windows, the process is not merely bypassed, but completely ignored -- all the scheduler deals with is threads. Processes are relevant only to the degree that all non-kernel threads do have to belong to some process, and every process has to contain at least one thread.

Threads are run/scheduled by the operating system, so they get their own CPU time. The process's CPU time is just the sum of the CPU times of all the threads in the process.
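One way to see this on a POSIX system is to compare the per-thread and per-process CPU clocks. A minimal sketch (the busy loop is arbitrary demonstration work; older glibc needs -lrt for clock_gettime):

```c
/* Per-thread vs per-process CPU time: the process clock accumulates
 * the CPU time of every thread in the process. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

static double secs(clockid_t id) {
    struct timespec ts;
    clock_gettime(id, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void) {
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 50000000UL; i++)
        x += i;                                  /* consume some CPU */
    printf("this thread:   %f s\n", secs(CLOCK_THREAD_CPUTIME_ID));
    printf("whole process: %f s\n", secs(CLOCK_PROCESS_CPUTIME_ID));
    return 0;
}
```

In a single-threaded program like this one the two numbers match; with more threads doing work, the process clock grows as their sum.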
If you want your process to schedule tasks itself, you can use fibers (on Windows). These are a kind of thread, but they are not scheduled by the OS; the process must handle the scheduling of fibers itself.

For Windows see http://msdn.microsoft.com/en-us/library/ms681917%28VS.85%29.aspx
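For illustration, here is a minimal sketch of that API. ConvertThreadToFiber, CreateFiber, SwitchToFiber and DeleteFiber are the real Win32 entry points; the fiber body and its parameter are just demo scaffolding:

```c
/* Windows fibers: the OS never preempts a fiber -- the process
 * switches between them explicitly with SwitchToFiber(). */
#include <windows.h>
#include <stdio.h>

static LPVOID main_fiber;

static VOID CALLBACK fiber_proc(LPVOID param) {
    printf("running in fiber: %s\n", (const char *)param);
    SwitchToFiber(main_fiber);          /* hand control back explicitly */
}

int main(void) {
    /* The calling thread must itself become a fiber first. */
    main_fiber = ConvertThreadToFiber(NULL);

    LPVOID worker = CreateFiber(0, fiber_proc, "worker");
    SwitchToFiber(worker);              /* run the fiber until it yields */

    DeleteFiber(worker);
    printf("back in the main fiber\n");
    return 0;
}
```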

Related

Context Switch: Thread vs Process

From what I understand, scheduling is based on processes, not threads. Say I'm running two programs with the same logic, one with multi-processing (10 processes) and the other with multi-threading (10 threads). Since scheduling is based on processes, wouldn't the multi-process program dominate 10/11 of the CPU time, while the multi-threaded program gets only 1/11, with 10 threads sharing that tiny time slice?
What am I missing?

Threads of one process in parallel execution

I know that threads exist within the borders of a process: each process has at least one thread, a thread can't exist without a process, threads share memory while processes do not (without special manipulation), and so on. We can also load CPU cores by giving them multiple processes to execute at the same time.
But can we execute multiple threads of the SAME process at one time (I mean real parallel execution, not pseudo-parallel), and if we can, is it better than using multiple processes, and why?
Thank you for the answer!
Threads are basically lightweight processes. OS threads can be executed in parallel; real parallel execution just requires multiple CPU cores.
Threads have lower isolation, since they share memory and can clobber each other's memory, unlike processes. The upside is that they generally have less metadata associated with them and are easier/faster to create, so you can have more of them running at the same time than processes.
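A minimal sketch of the parallel case (POSIX threads, compile with -pthread; the thread count and busy loop are arbitrary). Several threads of one process doing pure CPU work will genuinely run at the same instant on a multi-core machine:

```c
/* Several threads of ONE process running truly in parallel:
 * on an N-core machine the kernel can run up to N of them at once. */
#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

static void *worker(void *arg) {
    long id = (long)arg;
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 100000000UL; i++)
        x += i;                          /* pure CPU work, no blocking */
    printf("thread %ld done\n", id);
    return NULL;
}

int main(void) {
    pthread_t tids[NTHREADS];
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&tids[i], NULL, worker, (void *)i);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tids[i], NULL);
    return 0;
}
```

Watching this in a process monitor shows CPU usage well above 100% of one core, which is exactly the "real parallel execution" the question asks about.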

How does a process schedule its own threads

After the kernel schedules a process that has threads, how does said process schedule its own threads during its time slice?
For most modern kernels, the kernel only schedules threads, and processes are mostly just a container for the threads to execute inside (e.g. a container that contains a virtual address space, however many threads, and a few other scraps like file handles).
For some kernels (mostly very old unix kernels that existed before threads were invented) the kernel schedules processes and then a user-space library emulates threads. For this to work properly all of the "blocking" system calls (e.g. write()) have to be replaced by asynchronous system calls (e.g. aio_write()) so that other threads in the process can be given CPU time; however I wouldn't want to assume it works properly (e.g. if any thread blocks, then maybe all threads in the process block).
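As a rough illustration of that write()/aio_write() distinction, here is a minimal POSIX AIO sketch. The file name out.txt is made up for the demo, and the busy-wait loop stands in for the spot where a user-space threading library would switch to another "thread" instead of spinning (link with -lrt on older glibc):

```c
/* Replacing a blocking write() with POSIX AIO so that one user-space
 * "thread" waiting on I/O doesn't stall the whole process. */
#define _POSIX_C_SOURCE 200809L
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    int fd = open("out.txt", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    static const char msg[] = "hello from aio\n";

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = (void *)msg;
    cb.aio_nbytes = sizeof msg - 1;

    aio_write(&cb);                  /* returns immediately */
    /* ...a threading library would run another user-space thread here... */
    while (aio_error(&cb) == EINPROGRESS)
        ;                            /* poll; a real library would not spin */
    printf("wrote %zd bytes\n", aio_return(&cb));
    close(fd);
    return 0;
}
```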
Also, it may not work when there are multiple CPUs (the kernel gives a process one CPU, but then from the kernel's perspective that process is running and can't use a second CPU). There are sophisticated work-arounds for this (to support "M:N threading"), but it's just easier and better to fix the scheduler so it works with threads. Fortunately/unfortunately this didn't matter much in the early days, because very few computers had more than one CPU anyway.
Lastly, it doesn't work for thread priorities - e.g. one process might keep a CPU busy executing an unimportant/low-priority thread while another process desperately needs that CPU time for an important/high-priority thread. This occurs because no process knows about threads belonging to other processes, and the kernel only knows about processes and not threads.
Of course these are also the reasons why every kernel adopted "kernel schedules threads and not processes" (and those that didn't died).
It's down to jargon definitions, but threads are simply a bunch of processes sharing an address space. Older Unixes even called them Light Weight Processes.
With that classical understanding of threads, the answer is that, these days, it's the OS that does the scheduling and each thread gets its own timeslices.
Extras
Some OSes do things to "the whole process" - e.g. Windows will give the process that has mouse focus a priority boost (all its threads get dynamically notched up a few priority places) to make that application appear more sprightly (this goes back to Windows 3).
Other operating systems will increase the priority of a thread dynamically to solve priority inversion situations. This is where a low-priority thread that has control of a resource (I/O, or perhaps a semaphore) is blocking a higher-priority thread from running (because the resource is not available). This is priority inversion, and it's solved by the OS boosting the priority of the blocking thread until it gives up the required resource.
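POSIX exposes this mechanism directly: a mutex can be created with the priority-inheritance protocol, assuming the platform implements the Thread Priority Inheritance option (Linux does). A minimal sketch:

```c
/* A priority-inheritance mutex: a low-priority thread holding it is
 * temporarily boosted to the priority of the highest-priority thread
 * blocked on it, which resolves the inversion described above. */
#include <pthread.h>

int make_pi_mutex(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    /* PTHREAD_PRIO_INHERIT enables priority inheritance. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```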
Either the kernel schedules the threads, or the kernel schedules processes and each process simulates threads by scheduling them itself.
In the latter case, the process usually schedules its own threads using a library that sets timers. When the timer fires, the handler saves the current "thread's" registers and then loads a previously saved set of registers from another "thread."
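A minimal sketch of that idea, assuming a system that still ships <ucontext.h> (glibc does; POSIX.1-2008 marks it obsolescent). For simplicity the two "threads" here yield to each other cooperatively; a preemptive version would perform the swapcontext() call from a SIGALRM timer handler instead:

```c
/* User-space ("green") threads scheduled by the process itself. */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, t1_ctx, t2_ctx;

static void task1(void) {
    for (int i = 0; i < 3; i++) {
        printf("task1: step %d\n", i);
        swapcontext(&t1_ctx, &t2_ctx);   /* save our registers, load task2's */
    }
}

static void task2(void) {
    for (int i = 0; i < 3; i++) {
        printf("task2: step %d\n", i);
        swapcontext(&t2_ctx, &t1_ctx);   /* yield back to task1 */
    }
}

int main(void) {
    static char stack1[64 * 1024], stack2[64 * 1024];

    getcontext(&t1_ctx);                 /* initialise, then point at task1 */
    t1_ctx.uc_stack.ss_sp   = stack1;
    t1_ctx.uc_stack.ss_size = sizeof stack1;
    t1_ctx.uc_link = &main_ctx;          /* where to go when task1 returns */
    makecontext(&t1_ctx, task1, 0);

    getcontext(&t2_ctx);
    t2_ctx.uc_stack.ss_sp   = stack2;
    t2_ctx.uc_stack.ss_size = sizeof stack2;
    t2_ctx.uc_link = &main_ctx;
    makecontext(&t2_ctx, task2, 0);

    swapcontext(&main_ctx, &t1_ctx);     /* start the first "thread" */
    return 0;
}
```

From the kernel's point of view this whole program is one thread; the interleaved task1/task2 output is produced entirely by the process's own register save/restore.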

Why is process scheduling not called thread scheduling?

I found out that Linux and Windows both schedule threads and not processes.
Source
So I don't understand why we still call it "process scheduling". Shouldn't we be calling it thread scheduling? The shared memory of threads in the same process just seems to be a technicality that has to be taken care of while actually running the threads (we could treat two threads of the same process as two single-threaded processes sharing memory).
Are there any operating systems that schedule processes and, when it is time for a process to run, decide separately how to run its threads?
OS-scheduled threads are a relatively new feature. It was not that long ago when a separate path of execution on Unix meant creating an entirely new process. So there is historical resistance.
Some systems (Unix variants, VMS) schedule processes, not threads. Process scheduling is likely to remain the way to go in real time operating systems.
In process scheduling, resources are allocated to each process separately: if you create two processes, each process gets its own resources (file buffers, I/O handles, CPU state, etc.). Time is wasted during scheduling, because switching from one process to another means switching all of those separately allocated resources as well, so context-switching time increases.
A thread is basically a small unit of a process, so one process can have many threads. Threads share the resources of their process, so multitasking among them is available and context-switching time is lower.
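A small sketch makes the sharing difference concrete (POSIX, compile with -pthread; the counter variable is just demonstration state). A thread's write to a global is visible to the rest of the process, while a forked child writes only to its own copy of the address space:

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;

static void *bump(void *arg) {
    (void)arg;
    counter++;                 /* same address space: main sees this */
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, bump, NULL);
    pthread_join(t, NULL);
    printf("after thread: counter = %d\n", counter);   /* prints 1 */

    pid_t pid = fork();
    if (pid == 0) {            /* child: separate (copy-on-write) memory */
        counter++;             /* modifies the CHILD's copy only */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    printf("after fork:   counter = %d\n", counter);   /* still 1 */
    return 0;
}
```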

How Linux scheduler schedules processes on multi-core processors?

Multi-core processors exploit thread-level parallelism, meaning that multiple threads run in parallel. Suppose a process has only one thread; do the other cores remain idle during the execution of this process? In Linux, the scheduler treats both processes and threads as tasks and doesn't differentiate between them while scheduling. So does this mean that different cores execute different threads of different processes in parallel?
When a context switch happens, does it happen on only one core or on all the cores of the CPU?
You are right: processes and threads are the same from the Linux scheduler's point of view. These tasks are queued according to the scheduler's rules and wait for their turn.
There are scheduling rules such as priority or CPU affinity (to prevent a thread from migrating to another core, preserving its cache data).
A context switch may happen on a core every fixed amount of time (a time slice), because the CPU periodically runs some kernel code to permit preemption. Depending on the scheduler's rules, a task may run for many time slices. A context switch can also occur when a thread calls a function that makes it unrunnable (e.g. waiting for I/O).
In most if not all cases, scheduling happens per core: each core runs the scheduler code for itself, so a context switch on one core does not affect the other cores.
There is also a similar question on superuser
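As an aside on the affinity rule mentioned above, Linux lets a program pin a thread to a core explicitly. A minimal sketch using the glibc extension pthread_setaffinity_np (Linux-specific; needs _GNU_SOURCE and -pthread):

```c
/* Pin the calling thread to core 0: the scheduler will then only
 * run this thread there, which preserves its cache contents. */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

int main(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);                        /* allow core 0 only */

    int rc = pthread_setaffinity_np(pthread_self(), sizeof set, &set);
    if (rc != 0) {
        fprintf(stderr, "setaffinity failed: %d\n", rc);
        return 1;
    }
    printf("now pinned to core 0\n");
    return 0;
}
```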

Resources