Operating Systems - Organization Questions - multithreading

I am studying for a 3 topic comprehensive exam that decides If I graduate or not, and have some questions on Operating System Organization
A) How does a multicore computer with shared memory differ from a distributed or a clustered system with respect to OS? Make specific reference to the OS Kernel.
B) Briefly explain the difference between processes and threads
C) Threads on a single core system are often handled in User mode. Explain why this is not acceptable on a multicore computer
D) Explain at least 2 ways that the OS can handle threads on am ulticore computer
Here are my attempted answers.
A) Multicore is a single processor, which has multiple processors that work together to speed up the processing power, however since they share memory, the kernel already know the state of each other. Distributed and clustered systems use message passing, and must always alert the other kernel what the other is doing.
B) processes refer to the high level heavyweight task, which can usually be broken down into smaller individual tasks (threads). Threading a single process allows for the abstraction of multiprocessing, allowing concurrent actions to take place.
C) DO not know, but my guess is the OS must properly distribute tasks in Kernel mode
D) Assign processes per core, or assign threads per core. If you assign proccess per core, the core will iterate through all the threads of the process, while the other core works on another process. If you assign threads per core, each core will work on a group of threads that relate to the same process.
Please let me know if anyone has any thing that can help my understanding, especially on OS Organization Topics.
Thanks in advance

A. How does a multi-core computer differ from a distributed or a clustered system with respect to the OS?
a. Clustered systems are typically constructed by combining multiple computers into a single system to perform a computational task distributed across the cluster. Multiprocessor systems on the other hand could be a single physical entity comprising of multiple CPUs. Clustered systems communicate via messages, while multiprocessors communicate via shared memory.
B. Briefly explain the difference between process and thread?
a. Both process and threads are independent sequences of execution. The typical difference is that threads (of the same process) run in shared memory space, while processes run in separate memory spaces.
C. Threads on a single core system are often handled in user mode. Explain why this is not acceptable on a multicore computer.
a. A multithreaded application running on a traditional single computer chip would have to interleave threads. On a multicore chip however the threads could be spread across the available cores.
D. Explain atleast 2 ways the OS can handle threads on a multicore computer
a. Data Parallelism - divides the data up against multiple cores and perform the same task on each subset of the data.
b. Task Parallelism - divides the different tasks to be performed among different cores and perform them simultaneously.

Related

Does multicore processors really perform work in parallel?

I am deal with threading and related topics like: processes, context switching...
I understand that on a system with one multicore processer real work of more than one processes isn't real. We have just an illusion of such work, because of process context switching.
But, what about threads within one process, that runs on a multicore processor. Does they really work simultaneously or it's also just an illusion of such work? Does processor with 2 hardware cores can work over two threads at a time? If not, what is the point in multicore processors?
Does processor with 2 hardware cores can work over two threads at a time?
Yes,...
...But, Imagine yourself back in Victorian times, hiring a bunch of clerks to perform a complex computation. All of the data that they need are in one book, and they're supposed to write all of their results back into the same book.
The book is like a computer's memory, and the clerks are like individual CPUs. Since only one clerk can use the book at any given time, then it might seem as if there's no point in having more than one of them,...
... Unless, you give each clerk a notepad. They can go to the book, copy some numbers, and then work for a while just from their own notepad, before they return to copy a partial result from their notepad into the book. That allows other clerks to do some useful work when any one clerk is at the book.
The notepads are like a computer's Level 1 caches—relatively small areas of high-speed memory that are associated with a single CPU, and which hold copies of data that have been read from, or need to be written back to the main memory. The computer hardware automatically copies data between main memory and the cache as needed, so the program does not necessarily need to be aware that the cache even exists. (see https://en.wikipedia.org/wiki/Cache_coherence)
But, the programmer should to be aware: If you can structure your program so that different threads spend most of their time reading and writing private variables, and relatively little time accessing variables that are shared with other threads, then most of the private variable accesses will go no further than the L1 cache, and the threads will be able to truly run in parallel. If, on the other hand, threads all try to use the same variables at the same time, or if threads all try to iterate over large amounts of data (too large to all fit in the cache,) then they will have much less ability to work in parallel.
See also:
https://en.wikipedia.org/wiki/Cache_hierarchy
Multiple cores do actually perform work in parallel (at least on all mainstream modern CPU architecture). Processes have one or multiple threads. The OS scheduler schedule active tasks, which are generally threads, to available core. When there are more active tasks than available cores, the OS use preemption so execute tasks concurrently on each core.
In practice, software applications can perform synchronization that may cause some cores to be inactive for a given period of time. Hardware operation can also cause this (eg. waiting for memory data to be retrieved, doing an atomic operation).
Moreover, on modern processors, physical cores are often split in multiple hardware threads that can each execute different tasks. This is called SMT (aka Hyper-threading). On quite recent x86 processors, 2 hardware threads of a same core can simultaneously execute 2 tasks in parallel. The tasks can share parts of the physical core like execution units so using 2 hardware thread can be faster than 1 for some tasks (typically the ones not using fully the processor cores).
Having 2 hardware threads that cannot truly run in parallel but run concurrently at a low granularity can still beneficial for performance. In fact, it was the case for a long time (during the last decade). For example, when a task is latency bound (eg. waiting for data to be retrieved from the RAM), another task can be scheduled so to do some work, improving the overall efficiency. This was the initial goal of SMT. The same is true for pre-empted tasks on a same core (though the granularity need to be much bigger): one process can perform a networking operation and be pre-empted so another process can do some work before being pre-empted again because of data being received from the network.

Threads vs processess: are the visualizations correct?

I have no background in Computer Science, but I have read some articles about multiprocessing and multi-threading, and would like to know if this is correct.
SCENARIO 1:HYPERTHREADING DISABLED
Lets say I have 2 cores, 3 threads 'running' (competing?) per core, as shown in the picture (HYPER-THREADING DISABLED). Then I take a snapshot at some moment, and I observe, for example, that:
Core 1 is running Thread 3.
Core 2 is running Thread 5.
Are these declarations (and the picture) correct?
A) There are 6 threads running in concurrency.
B) There are 2 threads (3 and 5) (and processes) running in parallel.
SCENARIO 2:HYPERTHREADING ENABLED
Lets say I have MULTI-THREADING ENABLED this time.
Are these declarations (and the picture) correct?
C) There are 12 threads running in concurrency.
D) There are 4 threads (3,5,7,12) (and processes) running in 'almost' parallel, in the vcpu?.
E) There are 2 threads (5,7) running 'strictlÿ́' in parallel?
A process is an instance of a program running on a computer. The OS uses processes to maximize utilization, support multi-tasking, protection, etc.
Processes are scheduled by the OS - time sharing the CPU. All processes have resources like memory pages, open files, and information that defines the state of a process - program counter, registers, stacks.
In CS, concurrency is the ability of different parts or units of a program, algorithm or problem to be executed out-of-order or in a partial order, without affecting the final outcome.
A "traditional process" is when a process is an OS abstraction to present what is needed to run a single program. There is NO concurrency within a "traditional process" with a single thread of execution.
However, a "modern process" is one with multiple threads of execution. A thread is simply a sequential execution stream within a process. There is no protection between threads since they share the process resources.
Multithreading is when a single program is made up of a number of different concurrent activities (threads of execution).
There are a few concepts that need to be distinguished:
Multiprocessing is whenwe have Multiple CPUs.
Multiprogramming when the CPU executes multiple jobs or processes
Multithreading is when the CPU executes multiple mhreads per Process
So what does it mean to run two threads concurrently?
The scheduler is free to run threads in any order and interleaving a FIFO or Random. It can choose to run each thread to completion or time-slice in big chunks or small chunks.
A concurrent system supports more than one task by allowing all tasks to make progress. A parallel system can perform more than one task simultaneously. It is possible though, to have concurrency without parallelism.
Uniprocessor systems provide the illusion of parallelism by rapidly switching between processes (well, actually, the CPU schedulers provide the illusion). Such processes were running concurrently, but not in parallel.
Hyperthreading is Intel’s name for simultaneous multithreading. It basically means that one CPU core can work on two problems at the same time. It doesn’t mean that the CPU can do twice as much work. Just that it can ensure all its capacity is used by dealing with multiple simpler problems at once.
To your OS, each real silicon CPU core looks like two, so it feeds each one work as if they were separate. Because so much of what a CPU does is not enough to work it to the maximum, hyperthreading makes sure you’re getting your money’s worth from that chip.
There are a couple of things that are wrong (or unrealistic) about your diagrams:
A typical desktop or laptop has one processor chipset on its motherboard. With Intel and similar, the chipset consists of a CPU chip together with a "northbridge" chip and a "southbridge" chip.
On a server class machine, the motherboard may actually have multiple CPU chips.
A typical modern CPU chip will have more than one core; e.g. 2 or 4 on low-end chips, and up to 28 (for Intel) or 64 (for AMD) on high-end chips.
Hyperthreading and VCPUs are different things.
Hyperthreading is Intel proprietary technology1 which allows one physical to at as two logical cores running two independent instructions streams in parallel. Essentially, the physical core has two sets of registers; i.e. 2 program counters, 2 stack pointers and so on. The instructions for both instruction streams share instruction execution pipelines, on-chip memory caches and so on. The net result is that for some instruction mixes (non-memory intensive) you get significantly better performance than if the instruction pipelines are dedicated to a single instruction stream. The operating system sees each hyperthread as if it was a dedicated core, albeit a bit slower.
VCPU or virtual CPU terminology used in cloud computing context. On a typical cloud computing server, the customer gets a virtual server that behaves like a regular single or multi-core computer. In reality, there will typically be many of these virtual servers on a compute node. Some special software called a hypervisor mediates access to the hardware devices (network interfaces, disks, etc) and allocates CPU resources according to demand. A VCPU is a virtual server's view of a core, and is mapped to a physical core by the hypervisor. (The accounting trick is that VCPUs are typically over committed; i.e. the sum of VCPUs is greater than the number of physical cores. This is fine ... unless the virtual servers all get busy at the same time.)
In your diagram, you are using the term VCPU where the correct term would be hyperthread.
Your diagram shows each core (or hyperthread) associated with a distinct group of threads. In reality, the mapping from cores to threads is more fluid. If a core is idle, the operating system is free to schedule any (runnable) thread to run on it. (Some operating systems allow you to tie a given thread to a specific core for performance reasons. It is rarely necessary to do this.)
Your observations about the first diagram are correct.
Your observations about the second diagram are slightly incorrect. As stated above the hyperthreads on a core share the execution pipelines. This means that they are effectively executing at the same time. There is no "almost parallel". As I said, above, it is simplest to think of a hyperthread as a core "that runs a bit slower".
1 - Intel was not the first computer to com up with this idea. For example, CDC mainframes used this idea in the 1960's to get 10 PPUs from a single core and 10 sets of registers. This was before the days of pipelined architectures.

is multi-threading dependent on the architecture of the machine?

I have been reading lately about system architecture and the topic of multi-threading has not been covered in detail with latest improvements in technology. I did my part of search, but could not find answers for the following:
The questions have are
1) Is multi-threading dependent on the system architecuture (CPU). do all CPU (single core) support multi-threading? If it does not, what happens to multi-threaded applications when run on those machines
It is cited here that
Intel CPUs support multithreading, but only two threads per CPU.
AMD CPUs do not support multithreading and AMD often sites Microsoft's
recommendations to turn off Hyperthreading on Intel CPUs when running applications
like peoplesoft and Exchange.
2) so what does it mean it say only two threads per CPU here. At any given time, CPU (single core) can process only thread. and the other thread is waiting to be processed correct?
3) how is it different from an application that spawns, say, 10 threads and waiting for them to be executed. If the CPU at the most can tackle only two threads, shouldn't programmer keep that fact in consideration when writing multi-threaded applications.
Even with multi-core processors (say quad-core) at the most 8 threads can be queued, but only 4 threads can be processed at the same time.
P.S: I have a read a little about hyper-threading but I am not sure if that is relevant here and if
all processors support hyper-threading
1) It depends on the operating system more than anything. Even for single core architectures, multi-threading can be supported, but the threads are not executing in parallel - The OS will context-switch between them.
2) Intel usually supports two-way hardware threading ( also called simultaneous multi-threading), where each thread is allocated a pipeline. So if you have a process with two threads they can both execute on the same core simultaneously.
3) See 1. Basically the operating system is going to allocate as many threads as it can to hardware before it plans to context-switch between the threads it couldn't allocate. This process is dependent on the OS's scheduler, and you can read about the Linux one to get a good idea of what's going on.
Edit: Hypethreading is basically the hardware threading feature I mentioned.
In your question CPU means core.
1) It does. I believe memory access on ARMs is in words, so write to char is not atomic
Also memory ordering differs Modern OSes (anything but DOS) support context switching: while one thread executes, others wait. Total number of threads in all Windows processes is about 1000. Common time quant (time to load CPU) is 1-10 ms. One core multithreading don't improve computational power but allows asynchronous tasks. For example GUI doesn't freeze during network activity. One threads waits net, another one responds to user activity.
2) Yes
3) It is common practice to spawn number of threads equal to number of (virtual) cores, ie number of cores in system for AMD and twice for Intel. It is true only for computational threads. Web server threads usually wait net and don't load CPU a lot, so it is better to spawn thousands of threads.
Hyperthreading is cool for tasks that wait RAM. While one thread waits data another one executes. For math it usually not increase performance. It is good for work with data that is not cache-friendly: lists, trees, hash tables that don't fit into cache.

Multithreads on kernel

In Galvin, I came across
Finally, many operating system kernels are now multithreaded; several threads operate in the kernel, and each thread performs a specific task.
Question 1
It does not imply that all of them will run at the same time, since at a given time only 1 process/thread can acquire control over the processor right? Though they could be doing various work, like one on CPU, other working on I/O like getting key strokes in the buffer etc., right?
Question 2
Multithreading will show better performance on multiprocessor systems only right?
Answer 1: Every core of your CPU can execute one command at any given time. Since nearly all of modern CPUs are multi core you'll get better performance if your app is multithreaded.
Answer 2:Multithreading will show better performance in most of the cases even on systems with single core CPUs. Your app will become more responsive to user input if you dispatch your time intensive jobs to multiple threads
The parallelization levels are as below:
Mutli Computers
Multi Processors
Multi Cores
Multi Threads
At higher levels you see more benefit from threading. E.g your multithreaded app will run better in multi cores CPUs in compare with single core(multi threaded) CPUs

What is the difference between multicore and concurrent programming

Can anyone help me out I am working on a presentation and would like to include a bit about - 'The difference between multicore and concurrent programming', I have googled a bit but not turning up many good descriptions, any help appreciated! :)
Thanks,
Eamonn
Concurrent (occurring or existing simultaneously) implies that different code MAY execute at the exact same cycle. It means that things can possibly happen in parallel if multiple processors or a processor with multiple cores is available and the program is crafted correctly. Just adding threads does not imply concurrent execution.
The reason I say MAY and possibly is that anytime the programs separate threads need to share volatile/mutable state, other threads that need access to that state can not continue executing and will have to wait their turn to access that state, and things start happening serially again.
Typically this is implemented in a single program as more than one thread executing code concurrently at the same exact cycle as another thread, given that there is no resource contentions as listed above. This requires multiple physical processors or cores. Other models run multiple heavyweight OS processes that can execute concurrently.
Concurrent programming is very hard to do correctly with mutable shared state.
You can write a concurrent program
that runs serially on a single single
core processor, but scales up to
execute more things at the same time
when more processors or cores, or even
multiple processors with multiple
cores is present.
You can also cause single threaded programs to appear concurrent on a multi-core / multi-processor system if they can operate on independent ranges of input data at the same time. Example: a single threaded 3D rendering program can on a dual core machine can run 2 separate instances the first rendering all the odd frames and the second rendering all the even frames. As long as they don't try to share any mutable resources.
Multi-core means that a single CPU has multiple Processor cores that can execute threads or processes concurrently and typically appears as multiple processors to mainstream operating systems.
It does NOT imply that programs that are single threaded gain any concurrency behaviors or benefits from the additional processor cores available.
Concurrent Programming is more broad - it just refers to writing software that will run "concurrently" - ie: more than one thing will happen at a time.
"Multi-core" programming is really referring to a specific subset of concurrent programming, in which you are targetting multiple available CPU cores on a specific machine. This is the most common form of concurrent programming (typically single process running on a single computer), but still only one form of concurrent programming.
You can do concurrent programming on a machine that has only a single CPU core. The operating system provides the illusion that more than one thread is running at the same time, it rapidly switches back-and-forth between them.
A machine with multiple cores simply needs to this context switching less often since two threads can run at the same time on two cores. It is only a bit special because threading bugs can make your life difficult much quicker. The odds that two threads try to access a shared memory location at the same time is much higher.
At a high level, multi-core is an attribute of the processor chip in your computer. Multi core means it has got multiple processing cores. There are several types of multi-processor computers: the old style super computers with thousands of computers connected via ethernet, systems with more than processors (like 2 Pentium 4s), and contemporary multi-core systems where every processor package has multiple processing cores 9like Intel i7). The third type is often called multi-core of Chip Multiprocessor (CMP).
Concurrent programming is an attribute of software. Concurrent programming is about writing code which has is split into multiple tasks that can execute concurrently if processors are available. While concurrent programs do leverage multi-core, concurrent programming is broader in two dimensions:
Concurrent programs can run on a single core or multiple cores.
Concurrent programs can be used on any type of multi-processors I mentioned above.
Thus, to summarize:
Concurrent programming is about software that can use multiple processors if available. those processors can be on the same chip (multi-core or Chip Multiprocessor) or on different chips (often known as SMP). You can have systems where you can put two multi-core chips in the same system making it a CMP and an SMP at the same time. Concurrent programming will work for that as well.
Concurrent programming regards operations that appear to overlap and is primarily concerned with the complexity that arises due to non-deterministic control flow. The quantitative costs associated with concurrent programs are typically both throughput and latency. Concurrent programs are often IO bound but not always, e.g. concurrent garbage collectors are entirely on-CPU. The pedagogical example of a concurrent program is a web crawler. This program initiates requests for web pages and accepts the responses concurrently as the results of the downloads become available, accumulating a set of pages that have already been visited. Control flow is non-deterministic because the responses are not necessarily received in the same order each time the program is run. This characteristic can make it very hard to debug concurrent programs. Some applications are fundamentally concurrent, e.g. web servers must handle client connections concurrently. Erlang, F# asynchronous workflows and Scala's Akka library are perhaps the most promising approaches to highly concurrent programming.
Multicore programming is a special case of parallel programming. Parallel programming concerns operations that are overlapped for the specific goal of improving throughput. The difficulties of concurrent programming are evaded by making control flow deterministic. Typically, programs spawn sets of child tasks that run in parallel and the parent task only continues once every subtask has finished. This makes parallel programs much easier to debug than concurrent programs. The hard part of parallel programming is performance optimization with respect to issues such as granularity and communication. The latter is still an issue in the context of multicores because there is a considerable cost associated with transferring data from one cache to another. Dense matrix-matrix multiply is a pedagogical example of parallel programming and it can be solved efficiently by using Straasen's divide-and-conquer algorithm and attacking the sub-problems in parallel. Cilk is perhaps the most promising approach for high-performance parallel programming on multicores and it has been adopted in both Intel's Threaded Building Blocks and Microsoft's Task Parallel Library (in .NET 4).

Resources