Why is process scheduling not called thread scheduling?

I found out that Linux and Windows both schedule threads and not processes.
So I don't understand why we call it "process scheduling" any more. Shouldn't we be calling it thread scheduling? The idea of shared memory for threads of the same process just seems to be a technicality that has to be taken care of while actually running the threads (we could treat two threads of the same process as two single-threaded processes sharing memory).
Are there any operating systems that schedule processes and when it is time for a process to run, specially decide how to run its threads?

OS-scheduled threads are a relatively new feature. It was not that long ago when a separate path of execution on Unix meant creating an entirely new process. So there is historical resistance.
Some systems (some Unix variants, older versions of VMS) schedule processes, not threads. Process scheduling is likely to remain the way to go in real time operating systems.

In process scheduling, resources are allocated to each process separately: if you create two processes, each one gets its own resources (file buffers, open I/O files, address space, etc.). This costs time when scheduling is done, because switching from one process to another also means switching to that process's own set of resources, so context-switch time increases.
A thread is basically a smaller unit of execution within a process, so one process can have many threads. Here the resources are shared between the threads, since they are all part of the same process, so multitasking is still available and context-switch time is lower.


Threads of one process in parallel execution

I know that threads exist within the boundaries of a process: each process has at least one thread, and a thread can't exist without a process; threads share memory and processes do not (without special manipulations), and so on. We can also load the CPU cores by giving them multiple processes to execute at the same time.
But can we execute multiple threads of the SAME process at one time (I mean real parallel execution, not pseudo-parallel)? And if we can, is it better than using multiple processes, and why?
Thank you for your answer!
Threads are basically lightweight processes. OS threads can be executed in parallel; real parallel execution just requires having multiple CPU cores.
Threads have lower isolation than processes, since they share memory and can clobber each other's data. The upside is that they generally have less metadata associated with them and are easier/faster to create, so you can have more threads than processes running at the same time.
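The isolation difference described above can be seen in a small experiment. This is only a sketch: the helper names (`bump`, `run_in_thread`, `run_in_child_process`) are made up for illustration, and `os.fork` is assumed to be available (POSIX systems). It shows memory sharing, not parallelism; in standard CPython the global interpreter lock keeps threads from running Python bytecode in parallel anyway.

```python
import os
import threading


def bump(state):
    """Increment a counter stored in an ordinary Python dict."""
    state["value"] += 1


def run_in_thread(state):
    # A thread lives in the same address space, so its write is visible to us.
    t = threading.Thread(target=bump, args=(state,))
    t.start()
    t.join()


def run_in_child_process(state):
    # fork() gives the child its own copy of the memory (POSIX only).
    pid = os.fork()
    if pid == 0:
        bump(state)      # changes only the child's copy
        os._exit(0)      # leave the child immediately
    os.waitpid(pid, 0)   # parent waits for the child to finish


if __name__ == "__main__":
    state = {"value": 0}
    run_in_thread(state)
    print(state["value"])   # 1: the thread's write is visible
    run_in_child_process(state)
    print(state["value"])   # still 1: the child's write is not
```

The second print is the "isolation" the answer refers to: a buggy child process cannot clobber the parent's data, while a buggy thread can.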

Can kernel schedule user level threads of same process on different cores?

As far as I know, the kernel doesn't know whether it is executing a user thread or a user process, because to the kernel user threads are just user processes; it only schedules user processes and doesn't care which thread is running in that process.
I have one more question: is there a per-core ready queue, or a single ready queue for all the cores?
I was reading this paper and it is written that:
"In the stock Linux kernel the set of runnable threads is partitioned into mostly-private per core scheduling queues; in the common case, each core only reads, writes, and locks its own queue."
The Linux kernel scheduler uses the "task" as its primary schedulable entity. This corresponds to a user-space thread. For a traditional simple Unix-style program, there is only a single thread in the process and so the distinction can be ignored. Other programs of course may have multiple threads. But in all cases, the kernel only schedules tasks (i.e. threads).
Your terminology above therefore doesn't really match the situation. The kernel doesn't really care whether the different threads it schedules are part of the same process or different processes: each thread can be scheduled independently. You can have multiple threads from the same process running on different processors/cores at the same time.
Yes, there are separate run queues for each core.
The paper you reference is, I think, slightly misleading in its phrasing. In particular, saying that the "set of runnable threads is partitioned into..." doesn't give quite the right meaning; that makes it sound like the threads are divided into multiple groups that are then assigned to different cores and can only be executed there. It would be more accurate to say that there is a separate run queue for each core containing a set of threads waiting to execute, and in common use, the scheduler doesn't need to reference the queues for other cores.
But in fact, threads can migrate from one core to another. For example, if there is a thread waiting to run on core A (hence in core A's run queue), but core A is already busy running some other thread, and there is another core that is not busy, the waiting thread may be migrated to that other core and executed there. (This is an oversimplification of course as there are other factors that go into deciding whether/when to migrate a thread.)
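To make the "one run queue per core, with occasional migration" picture concrete, here is a toy model. It is not how the kernel implements it: `ToyScheduler` and its methods are invented for this illustration, and the only migration rule modelled is "an idle core takes a waiting thread from the longest queue".

```python
from collections import deque


class ToyScheduler:
    """Hypothetical model: one FIFO run queue per core."""

    def __init__(self, n_cores):
        self.queues = [deque() for _ in range(n_cores)]

    def enqueue(self, core, thread):
        # A runnable thread is placed on one core's queue.
        self.queues[core].append(thread)

    def pick_next(self, core):
        """Return the next thread to run on `core`, or None if nothing is runnable."""
        own = self.queues[core]
        if own:
            # Common case: the core only touches its own queue.
            return own.popleft()
        # Idle core: look at the other queues and migrate a waiting thread.
        busiest = max(self.queues, key=len)
        if busiest:
            return busiest.pop()
        return None


if __name__ == "__main__":
    sched = ToyScheduler(n_cores=2)
    sched.enqueue(0, "A")
    sched.enqueue(0, "B")
    print(sched.pick_next(0))  # A, from core 0's own queue
    print(sched.pick_next(1))  # B, migrated from core 0 to idle core 1
```

A real scheduler weighs more factors before migrating (cache warmth, load, affinity masks), which is the oversimplification the answer mentions.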

Process with multiple threads on multiprocessor system. How do they work?

So I was reading about Processes and Threads and I had a question. Following is the scenario.
Uniprocessor Environment
I understand that the OS rotates the processes over the processor, giving each one a particular time period (quantum). I get how this works when the process is single-threaded, i.e. has just one path of execution: whenever it is assigned the processor, it continues with its execution. Now let's say the process forks, or just creates a new thread. How does the whole thing work then? Does the OS say to process P "Go on, continue with execution", and the process itself picks the new thread or the parent thread in rotation, so that with more than two threads the rotation is fair to each thread? Or does the OS actually interact with the threads directly? (In that case I am not sure what happens.)
Multiprocessor Environment
Now say I have a multiprocessor environment. If there is just a single-threaded process, the OS will assign one of the processors to it and it will go on with its execution. Now say there are multiple threads in the process. If I assign one of the processors to the process and ask it to continue its execution, and the process has to pick one of its threads to run, then there will never be any parallel processing going on within that process, since the process can only put one of its threads on the processor.
So how does it happen in both the cases?
Process Scheduling
Operating Systems ultimately control these types of thread scheduling.
Windows systems are priority-based and so will allow a process to consume more resources than others. This is why your machine can 'hang' if a process has been escalated to a high priority. Priority levels range from 0 to 31, with 0 reserved for the system, as far as I know.
Mac OS / Linux / Unix default schedulers are time-sharing: they aim to give runnable processes of equal priority roughly equal amounts of CPU time, although priorities (nice values) can skew this. Therefore loading more processes will slow your system down as they all share a smaller slice of execution time.
Uniprocessor Environment
The OS is ultimately responsible for this, but switching processes involves roughly the following (I cannot guarantee accuracy here, it's just an indication):
Halting the running process / thread
Saving its program counter and stack pointer (where it was in the code)
Saving the remaining CPU registers
Asking the kernel's scheduler for the next process/thread to run
The scheduler indicates which one is to be run
The OS reloads that thread's saved registers from memory
The OS restores its stack pointer and program counter (and, when switching to a different process, its address space)
Resuming the process / thread
Obviously the more threads and processes you have running, the slower it will become. The problem is that with many runnable threads and short time slices, the time spent switching can become significant compared to the time each one actually gets to execute.
Threads are separate paths of execution within a single process rather than child processes. For a single processor, they just look like additional work to schedule.
Multi-processor Environment
Multi-processor environments work differently in that main memory is shared amongst processors, and often some of the cache as well (typically each core has its own L1 cache, while a higher-level cache such as L3 is shared, though this varies by processor). So the difference is that processor A can reload the state stored by processor B without conflicts. 'Hyper-threading' looks the same to the OS: each hardware thread shows up as a separate logical processor, although this is processor-specific. Another difference here is that a specific process can be tied to a specific processor - this is called 'CPU affinity'. It's not encouraged for every process, but it does allow an application to keep running on the same processor.
This is OS-specific, of course, but most operating systems schedule at the thread level. A process is just a grouping of threads. For example, on Linux, threads are called "tasks" and each is scheduled independently. They are created with the clone call. What is typically called a thread is a task which shares its address space (and other resources such as file descriptors, mount points, etc.) with the creating task. Note that the clone call can also create what is typically called a process if the flags to enable sharing are not passed.
Considering the above, any thread may be scheduled at any time on any processor, no matter how many processors there are available. That said, most OSs also attempt to maintain some measure of processor affinity to avoid excessive cache misses, but usually if a thread is runnable and a different CPU is available, it will change CPUs. Often there is also a way to specify which CPUs a particular thread may execute upon.
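As a rough illustration of the last point, Python exposes the Linux affinity calls as `os.sched_getaffinity` and `os.sched_setaffinity`. The sketch below assumes a Linux system (these functions are not available everywhere), and `demo_affinity` is just a name chosen for this example. It pins the caller to one CPU, reads the mask back, and then restores the original mask.

```python
import os


def demo_affinity():
    """Pin the caller to a single CPU, then restore the original mask.

    Returns (original_mask, pinned_mask) as sets of CPU numbers.
    """
    original = os.sched_getaffinity(0)   # 0 means "the caller"
    target = min(original)               # any CPU we are already allowed on
    try:
        os.sched_setaffinity(0, {target})
        pinned = os.sched_getaffinity(0)
    finally:
        os.sched_setaffinity(0, original)  # undo the restriction
    return original, pinned


if __name__ == "__main__":
    original, pinned = demo_affinity()
    print("allowed CPUs before:", sorted(original))
    print("allowed CPUs while pinned:", sorted(pinned))
```

While pinned, the scheduler will only run the caller on that one CPU; without such a mask it is free to move the thread whenever another CPU is available.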
It doesn't matter whether there is 1 processor or 128. The OS manages access to resources to try to efficiently match up requests with availability, and that includes CPU execution. If a thread is running, it has already managed to get some CPU but, if it requests a resource that is not immediately available, it no longer needs any CPU until that other resource does become free, and so the OS will remove CPU execution from it and, if there is another thread that is waiting for CPU, it will hand it over. When the requested resource does become available, the thread will be made ready again. If there is a core free, it will be made running 'immediately'; if not, the CPU scheduling algorithm makes a decision on whether to stop a currently-running thread to free up a core or to leave the newly-ready thread waiting.
It's better to try and ignore things like 'time-slice, quantum, priority' - it causes much confusion and FUD. If a running thread wants something it cannot have yet, it doesn't need any more CPU cycles, and the OS will take them away and, if another thread needs it, apply them there. That is why preemptive multitaskers exist - to match up threads with resources in an attempt to maximize forward progress.

Does linux schedule a process or a thread?

After reading this SO question I got a few doubts. Please help in understanding.
Scheduling involves deciding when to run a process and for what quantum of time.
1. Does the Linux kernel schedule a thread or a process? As processes and threads are not differentiated inside the kernel, how does the scheduler treat them?
2. How is the quantum for each thread decided?
a. If a quantum of time (say 100us) is decided for a process, is it shared between all the threads of the process? or
b. Is a quantum decided by the scheduler for each thread?
Note: Questions 1 and 2 are related and may look the same, but I wanted to be clear on how things work, so I posted them both here.
The Linux scheduler (on recent Linux kernels, e.g. 3.0 at least) is scheduling schedulable tasks or simply tasks.
A task may be :
a single-threaded process (e.g. created by fork without any thread library)
any thread inside a multi-threaded process (including its main thread), in particular Posix threads (pthreads)
kernel tasks, which are started internally in the kernel and stay in kernel land (e.g. kworker, nfsiod, kjournald , kauditd, kswapd etc etc...)
In other words, threads inside multi-threaded processes are scheduled like non-threaded -i.e. single threaded- processes.
The low-level clone(2) syscall creates user-land schedulable tasks (and can be used both for creating fork-ed process or for implementation of thread libraries, like pthread). Unless you are a low-level thread library implementor, you don't want to use clone directly.
AFAIK, for multi-threaded processes, the kernel is (almost) not scheduling the process, but each individual thread inside (including the main thread).
Actually, there is some notion of thread groups and affinity in the scheduling, but I don't know them well
These days, processors have generally more than one core, and each core is running a task (at some given instant) so you do have several tasks running in parallel.
CPU quantum times are given to tasks, not to processes
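The NPTL implementation of the POSIX threads specification makes each thread look like a separate process inside the kernel, with its own task_struct and its own kernel-level pid (what user space calls the thread ID, or TID; all threads of a process share one thread group ID, which is what getpid() returns). So each thread is schedulable in itself, as mentioned. Therefore each thread gets its own timeslice and is scheduled just like a process, as mentioned above.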
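A quick way to see "one process, several kernel-level task IDs" from user space is to compare `os.getpid()` with `threading.get_native_id()` (Python 3.8+) in each thread. This is a small sketch; `collect_ids` is a hypothetical helper, and the barrier is there only to keep all threads alive at once so the kernel cannot reuse an ID.

```python
import os
import threading


def collect_ids(n_threads=3):
    """Return a list of (pid, native_thread_id) pairs, one per worker thread."""
    results = []
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def worker():
        entry = (os.getpid(), threading.get_native_id())
        with lock:
            results.append(entry)
        # Wait until every worker has recorded its IDs, so all threads
        # exist at the same time and their IDs must differ.
        barrier.wait()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for pid, tid in collect_ids():
        print(f"pid={pid} native thread id={tid}")
```

Every line shows the same pid but a different native thread ID, and it is those per-thread IDs that the scheduler works with.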
Just to add: the current Linux scheduler is capable of scheduling not only single tasks (a simple process), but also groups of processes or even users (all processes belonging to a user) as a whole. This allows group scheduling to be implemented, where CPU time is first divided between process groups and then distributed within those groups to single threads.
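The two-level split can be worked through with simple arithmetic. The sketch below assumes equal weights everywhere and that every thread is runnable all the time, which real group scheduling does not require; `group_shares` and the group names are invented for the example.

```python
def group_shares(groups):
    """Map each thread to its fraction of total CPU time.

    groups: dict of group name -> list of thread names.
    Assumption: CPU time is split equally between non-empty groups,
    then equally between the threads inside each group.
    """
    active = {name: threads for name, threads in groups.items() if threads}
    shares = {}
    for threads in active.values():
        group_share = 1.0 / len(active)
        for thread in threads:
            shares[thread] = group_share / len(threads)
    return shares


if __name__ == "__main__":
    shares = group_shares({"alice": ["a1"], "bob": ["b1", "b2", "b3"]})
    for thread, share in shares.items():
        print(f"{thread}: {share:.3f}")
```

With one thread for alice and three for bob, a1 gets half of the CPU and each of bob's threads gets a sixth, whereas plain per-thread scheduling would give each of the four threads a quarter.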
The Linux scheduler does not directly operate on processes or threads, but works with schedulable entities, represented by struct sched_entity.
It's fair to say that every process/thread is a sched_entity, but the converse might not be true (with group scheduling, a sched_entity can also stand for a whole group of tasks).

Threads inside a Process

Processes get CPU time as managed by the OS process scheduler.
Since threads run in parallel within a single process, does this mean that a process's CPU time is further distributed (sliced) among threads?
Or can the scheduler directly distribute CPU time among threads bypassing the parent process?
I suspect the answer varies with the OS. On Windows, the process is not merely bypassed, but completely ignored -- all the scheduler deals with is threads. Processes are relevant only to the degree that all non-kernel threads do have to belong to some process, and every process has to contain at least one thread.
The threads are run/scheduled by the operating system and therefore they get their own CPU time. The process CPU time is just the sum of the CPU times of all the threads in the process.
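The "process CPU time is the sum of its threads' CPU time" statement can be checked approximately with Python's `time.thread_time()` and `time.process_time()`. This is a sketch with hypothetical helper names (`burn`, `measure`); the numbers will vary from run to run, and the process total also includes whatever the main thread itself consumed in the meantime.

```python
import threading
import time


def burn(cpu_seconds, out, index):
    """Spin until this thread has consumed `cpu_seconds` of CPU time."""
    start = time.thread_time()
    while time.thread_time() - start < cpu_seconds:
        pass
    out[index] = time.thread_time() - start


def measure(n_threads=2, cpu_seconds=0.05):
    """Return (per-thread CPU times, CPU time of the whole process)."""
    per_thread = [0.0] * n_threads
    before = time.process_time()
    threads = [
        threading.Thread(target=burn, args=(cpu_seconds, per_thread, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    process_total = time.process_time() - before
    return per_thread, process_total


if __name__ == "__main__":
    per_thread, process_total = measure()
    print("per thread:", [round(t, 3) for t in per_thread])
    print("sum of threads:", round(sum(per_thread), 3))
    print("whole process:", round(process_total, 3))
```

The process figure should come out at roughly the sum of the per-thread figures plus a little overhead from the main thread.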
If you want your process to schedule the tasks itself, you should use fibers (Windows). These are a kind of thread, but they are not scheduled by the OS; the process has to handle the scheduling of fibers itself.
For Windows see http://msdn.microsoft.com/en-us/library/ms681917%28VS.85%29.aspx
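To give a feel for what "the process schedules them itself" means, here is an analogy using Python generators. It is not the Windows fiber API: `fiber` and `run_round_robin` are names made up for this sketch. Each generator runs until it voluntarily yields, and a loop inside the process decides which one runs next; the OS only ever sees the single thread running that loop.

```python
from collections import deque


def fiber(name, steps, log):
    """A cooperative task: do one step of work, then hand control back."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # voluntary switch point; nothing preempts us


def run_round_robin(fibers):
    """User-level scheduler: resume each fiber in turn until all finish."""
    ready = deque(fibers)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the next yield
        except StopIteration:
            continue               # finished, do not requeue
        ready.append(current)      # still has work, back of the queue


if __name__ == "__main__":
    log = []
    run_round_robin([fiber("a", 2, log), fiber("b", 1, log)])
    print(log)  # ['a:0', 'b:0', 'a:1']
```

Because switching only happens at the yield points, a fiber that never yields would starve the others, which is the trade-off compared to OS-scheduled threads.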
