Multi-Core Architecture vs. Multithreading

Can we take full benefit of a multi-core architecture without multithreading?

In conventional environments, you can take some of the benefits of multiple CPUs without multithreading (e.g. if you've got 8 CPUs and you're running 8 separate single-threaded processes, then...).
For non-conventional environments, who knows? For example, maybe the entire system uses the actor model (software divided into separate/independent objects where each object is an event handler), where the OS has a queue of pending events, and each CPU does "get event from queue, execute the corresponding object's event handler for that event" in a loop. In this case you can say that there are no threads at all (just CPUs and events) and therefore there is no multithreading.
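The "get event from queue, execute the handler" loop described above can be sketched roughly as follows. This is a sequential simulation only, not a real OS: the Counter actor, the run function and the round-robin way events are handed to CPUs are all assumptions made for illustration.

```python
from collections import deque


class Counter:
    """A hypothetical actor: its only behaviour is reacting to events."""

    def __init__(self):
        self.total = 0

    def handle(self, event):
        self.total += event


def run(events, actors, num_cpus=2):
    """Simulate num_cpus CPUs draining one shared queue of pending events."""
    queue = deque(events)          # pending (actor_name, payload) pairs
    handled_by = [0] * num_cpus    # how many events each simulated CPU ran
    cpu = 0
    while queue:
        name, payload = queue.popleft()   # "get event from queue"
        actors[name].handle(payload)      # "execute the object's event handler"
        handled_by[cpu] += 1
        cpu = (cpu + 1) % num_cpus        # the next CPU takes the next event
    return handled_by
```

Note that nothing in the sketch is a thread: there are only events, handlers, and CPUs that pick up whatever is pending.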

Can we take full benefit of multiple cores without multithreading? Definitely not. But we can still have some parallelism.
As already answered, we can have several independent processes running on different processors to improve overall system performance.
It is also possible to do parallel processing by means of interprocess communication (IPC) such as pipes or shared memory. For instance, if you run
taskset 0x01 sort | taskset 0x02 uniq
you start two processes, sort on core 0 and uniq on core 1, and these processes communicate through a pipe (implemented in memory by the kernel). Note that this is just an example; OSes already run new processes on different cores without the taskset directive.
With POSIX shared memory IPC, you can have parallel processes running on different cores and exchanging data in a dedicated memory region.
And you can use Open MPI to run multiprocess parallel programs on a multicore machine. Shared memory will be used to implement MPI message passing.
But in either case, compared to multithreading, the programming burden will be higher and performance will generally be lower.
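As a rough illustration of the pipe-based approach, here is a sketch that starts two single-threaded child processes and connects them with an OS pipe, much like the sort | uniq pipeline above. The function name sort_then_uniq and the two inline child programs are made up for this example, and no core pinning is done: placement on cores is left to the OS scheduler.

```python
import subprocess
import sys

# Two tiny single-threaded programs, each run in its own process.
SORT_SRC = "import sys; sys.stdout.writelines(sorted(sys.stdin))"
UNIQ_SRC = (
    "import sys\n"
    "prev = None\n"
    "for line in sys.stdin:\n"
    "    if line != prev:\n"
    "        sys.stdout.write(line)\n"
    "    prev = line\n"
)


def sort_then_uniq(lines):
    """Pipe lines through a sorting process into a de-duplicating process."""
    sorter = subprocess.Popen(
        [sys.executable, "-c", SORT_SRC],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
    )
    # The sorter's stdout becomes the second process's stdin,
    # which is what the shell's | operator does.
    uniq = subprocess.Popen(
        [sys.executable, "-c", UNIQ_SRC],
        stdin=sorter.stdout, stdout=subprocess.PIPE, text=True,
    )
    sorter.stdout.close()  # the parent no longer needs its copy of the pipe
    sorter.stdin.write("".join(line + "\n" for line in lines))
    sorter.stdin.close()   # end of input: the sorter can now emit its output
    out, _ = uniq.communicate()
    sorter.wait()
    return out.splitlines()
```

Neither child uses threads, yet on a multi-core machine the two can run on different cores at the same time. The input here is written in one go, which is only suitable for small inputs.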

Related

CPU core threads vs process threads

What is the difference between software threads, hardware threads and Java threads?
Are software threads, Java threads and hardware threads independent or interdependent?
I am asking this because I know Java threads are created inside a process within the JVM (java.exe).
Also, is it true that these different processes are executed on different hardware threads?
A "hardware thread" is a physical CPU or core. So, a 4 core CPU can genuinely support 4 hardware threads at once - the CPU really is doing 4 things at the same time.
One hardware thread can run many software threads. In modern operating systems, this is often done by time-slicing - each thread gets a few milliseconds to execute before the OS schedules another thread to run on that CPU. Since the OS switches back and forth between the threads quickly, it appears as if one CPU is doing more than one thing at once, but in reality, a core is still running only one hardware thread, which switches between many software threads.
Modern JVMs map Java threads directly to the native threads provided by the OS, so there is no inherent overhead introduced by Java threads vs native threads. As to hardware threads, the OS tries to map threads to cores, if there are sufficient cores. So, if you have a Java program that starts 4 threads, and have 4 or more cores, there's a good chance your 4 threads will run truly in parallel on 4 separate cores, if the cores are idle.
Software threads are threads of execution managed by the operating system.
Hardware threads are a feature of some processors that allow better utilisation of the processor under some circumstances. They may be exposed to/by the operating system as appearing to be additional cores ("hyperthreading").
In Java, the threads you create maintain the software thread abstraction, where the JVM is the "operating system". Whether the JVM then maps Java threads to OS threads is the JVM's business (but it almost certainly does). And then the OS will be using hardware threads if they are available.
Hardware threads (e.g. Intel Hyperthreading) are a cheaper and slower alternative to having multiple cores.
Software threads are a software abstraction implemented by the (Linux) kernel:
either the kernel runs one software thread per CPU (or hyperthread),
or it fakes it with the scheduler by running a process for a bit until a timer interrupt comes, then switching to another process, and so on.
Key to their implementation is the hardware-provided and kernel-configured separation between userland and kernel land; see: What are Ring 0 and Ring 3 in the context of operating systems?
I will now focus on hardware threads, which is the more obscure hardware question, with a focus on Intel's implementation which it calls Hyperthreading.
The Intel Manual Volume 3 System Programming Guide - 325384-056US September 2015, section 8.7 "INTEL HYPER-THREADING TECHNOLOGY ARCHITECTURE", describes HT briefly and includes a diagram of the architecture.
TODO: by what percentage is it slower on average in real applications?
Hyperthreading is possible because a modern single CPU core already executes multiple instructions at once with the instruction pipeline: https://en.wikipedia.org/wiki/Instruction_pipelining
The instruction pipeline is a separation of functions inside a single core to ensure that each part of the circuit is used at any given time: reading memory, decoding instructions, executing instructions, etc.
Hyperthreading separates functions further by using:
a single backend, which actually runs the instructions with its pipeline. (A dual core has two backends, which explains its greater cost and performance.)
two front-ends, which take two streams of instructions and order them so as to maximize pipeline usage of the single backend by avoiding hazards. (A dual core would also have two front-ends, one for each backend.)
There are edge cases where instruction reordering produces no benefit, making hyperthreading useless. But it produces a significant improvement on average.
Two hyperthreads in a single core share more cache levels (TODO how many? L1?) than two different cores, which share only L3; see:
Multiple threads and CPU cache
How are cache memories shared in multicore Intel CPUs?
The interface that each hyperthread exposes to the operating system is similar to that of an actual core, and both can be controlled separately. Thus cat /proc/cpuinfo shows me 4 processors, even though I only have 2 cores with 2 hyperthreads each.
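To make the "4 processors, 2 cores" observation concrete, here is a small sketch that counts logical processors and physical cores from /proc/cpuinfo-style text. It assumes the x86 Linux layout, where each logical processor is a blank-line-separated block with "processor", "physical id" and "core id" fields; the SAMPLE text is hand-written for illustration, not real output.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_processors, physical_cores) from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    # Each logical processor is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            logical += 1
            # Hyperthreads on the same core share (physical id, core id).
            # If those fields are absent, every block collapses to one "core".
            cores.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(cores)


# Hand-written sample: 2 cores with 2 hyperthreads each.
SAMPLE = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 1

processor   : 2
physical id : 0
core id     : 0

processor   : 3
physical id : 0
core id     : 1
"""
```

On a Linux machine, the function could be fed the real file with count_cpus(open("/proc/cpuinfo").read()).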
Operating systems can however take advantage of knowing which hyperthreads are on the same core to run multiple threads of a given program on a single core, which might improve cache usage.
This LinusTechTips video contains a light-hearted non-technical explanation: https://www.youtube.com/watch?v=wnS50lJicXc
Hardware threads can be thought of as the CPU cores, although each core can run multiple threads. Most CPUs state how many threads can be run on each core (on Linux, the lscpu command gives this detail). The total number of hardware threads is the number of threads that can run in parallel.
Software threads are an abstraction over the hardware that makes multiprocessing possible. If you have multiple software threads but not multiple hardware resources, the threads are given the resources for a limited time each (or scheduled using some other strategy) so that it appears that all of them are running in parallel. These are managed by the operating system. A Java thread is an abstraction at the JVM level.
I think you are mistaken. I have never heard of hardware threads (unless you mean hyperthreading on certain Intel machines). Every process is a running instance of a program. Threads are simultaneous execution flows within a process. Java thread definitions are mapped to system threads by the JVM. Java used to have a concept of green threads, which is no longer the case.

Threads vs processes: are the visualizations correct?

I have no background in Computer Science, but I have read some articles about multiprocessing and multi-threading, and would like to know if this is correct.
SCENARIO 1: HYPERTHREADING DISABLED
Let's say I have 2 cores, with 3 threads 'running' (competing?) per core, as shown in the picture (HYPER-THREADING DISABLED). Then I take a snapshot at some moment, and I observe, for example, that:
Core 1 is running Thread 3.
Core 2 is running Thread 5.
Are these statements (and the picture) correct?
A) There are 6 threads running concurrently.
B) There are 2 threads (3 and 5) (and processes) running in parallel.
SCENARIO 2: HYPERTHREADING ENABLED
Let's say I have HYPERTHREADING ENABLED this time.
Are these statements (and the picture) correct?
C) There are 12 threads running concurrently.
D) There are 4 threads (3, 5, 7, 12) (and processes) running in 'almost' parallel, in the VCPUs?
E) There are 2 threads (5, 7) running 'strictly' in parallel?
A process is an instance of a program running on a computer. The OS uses processes to maximize utilization, support multi-tasking, protection, etc.
Processes are scheduled by the OS - time sharing the CPU. All processes have resources like memory pages, open files, and information that defines the state of a process - program counter, registers, stacks.
In CS, concurrency is the ability of different parts or units of a program, algorithm or problem to be executed out-of-order or in a partial order, without affecting the final outcome.
A "traditional process" is an OS abstraction that presents what is needed to run a single program. There is NO concurrency within a "traditional process", which has a single thread of execution.
However, a "modern process" is one with multiple threads of execution. A thread is simply a sequential execution stream within a process. There is no protection between threads since they share the process resources.
Multithreading is when a single program is made up of a number of different concurrent activities (threads of execution).
There are a few concepts that need to be distinguished:
Multiprocessing is when we have multiple CPUs.
Multiprogramming is when the CPU executes multiple jobs or processes.
Multithreading is when the CPU executes multiple threads per process.
So what does it mean to run two threads concurrently?
The scheduler is free to run threads in any order and with any interleaving (e.g. FIFO or random). It can choose to run each thread to completion, or to time-slice in big chunks or small chunks.
A concurrent system supports more than one task by allowing all tasks to make progress. A parallel system can perform more than one task simultaneously. It is possible though, to have concurrency without parallelism.
Uniprocessor systems provide the illusion of parallelism by rapidly switching between processes (well, actually, the CPU scheduler provides the illusion). Such processes run concurrently, but not in parallel.
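Concurrency without parallelism can be shown with a toy round-robin scheduler. This is only a sketch: the tasks are Python generators that give up the "CPU" after each step, and the names task and run_on_one_cpu are invented for the example. A real scheduler preempts threads with timer interrupts instead of relying on them to yield.

```python
from collections import deque


def task(name, steps):
    """A task that does `steps` units of work, yielding after each one."""
    for i in range(steps):
        yield f"{name}{i}"   # one time slice of work, then give up the CPU


def run_on_one_cpu(tasks):
    """Round-robin the tasks on a single simulated CPU; return the trace."""
    ready = deque(tasks)
    trace = []
    while ready:
        current = ready.popleft()
        try:
            trace.append(next(current))   # run exactly one time slice
            ready.append(current)         # not finished: back of the queue
        except StopIteration:
            pass                          # finished: drop the task
    return trace
```

Running run_on_one_cpu([task("A", 2), task("B", 3)]) interleaves the two tasks: both make progress, but only one slice ever executes at a time.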
Hyperthreading is Intel’s name for simultaneous multithreading. It basically means that one CPU core can work on two problems at the same time. It doesn’t mean that the CPU can do twice as much work. Just that it can ensure all its capacity is used by dealing with multiple simpler problems at once.
To your OS, each real silicon CPU core looks like two, so it feeds each one work as if they were separate. Because so much of what a CPU does is not enough to work it to the maximum, hyperthreading makes sure you’re getting your money’s worth from that chip.
There are a couple of things that are wrong (or unrealistic) about your diagrams:
A typical desktop or laptop has one processor chipset on its motherboard. With Intel and similar, the chipset consists of a CPU chip together with a "northbridge" chip and a "southbridge" chip.
On a server class machine, the motherboard may actually have multiple CPU chips.
A typical modern CPU chip will have more than one core; e.g. 2 or 4 on low-end chips, and up to 28 (for Intel) or 64 (for AMD) on high-end chips.
Hyperthreading and VCPUs are different things.
Hyperthreading is Intel's proprietary technology[1] which allows one physical core to act as two logical cores running two independent instruction streams in parallel. Essentially, the physical core has two sets of registers; i.e. 2 program counters, 2 stack pointers and so on. The instructions for both instruction streams share instruction execution pipelines, on-chip memory caches and so on. The net result is that for some instruction mixes (non-memory intensive) you get significantly better performance than if the instruction pipelines are dedicated to a single instruction stream. The operating system sees each hyperthread as if it were a dedicated core, albeit a bit slower.
VCPU, or virtual CPU, is terminology used in the cloud computing context. On a typical cloud computing server, the customer gets a virtual server that behaves like a regular single or multi-core computer. In reality, there will typically be many of these virtual servers on a compute node. Some special software called a hypervisor mediates access to the hardware devices (network interfaces, disks, etc) and allocates CPU resources according to demand. A VCPU is a virtual server's view of a core, and is mapped to a physical core by the hypervisor. (The accounting trick is that VCPUs are typically overcommitted; i.e. the sum of VCPUs is greater than the number of physical cores. This is fine ... unless the virtual servers all get busy at the same time.)
In your diagram, you are using the term VCPU where the correct term would be hyperthread.
Your diagram shows each core (or hyperthread) associated with a distinct group of threads. In reality, the mapping from cores to threads is more fluid. If a core is idle, the operating system is free to schedule any (runnable) thread to run on it. (Some operating systems allow you to tie a given thread to a specific core for performance reasons. It is rarely necessary to do this.)
Your observations about the first diagram are correct.
Your observations about the second diagram are slightly incorrect. As stated above, the hyperthreads on a core share the execution pipelines. This means that they are effectively executing at the same time. There is no "almost parallel". As I said above, it is simplest to think of a hyperthread as a core "that runs a bit slower".
1 - Intel was not the first computer company to come up with this idea. For example, CDC mainframes used this idea in the 1960s to get 10 PPUs from a single core and 10 sets of registers. This was before the days of pipelined architectures.

Is multi-threading dependent on the architecture of the machine?

I have been reading lately about system architecture, and the topic of multi-threading has not been covered in detail with the latest improvements in technology. I did my share of searching, but could not find answers to the following.
The questions I have are:
1) Is multi-threading dependent on the system architecture (CPU)? Do all CPUs (single core) support multi-threading? If not, what happens to multi-threaded applications when they are run on those machines?
It is cited here that
Intel CPUs support multithreading, but only two threads per CPU.
AMD CPUs do not support multithreading and AMD often cites Microsoft's
recommendations to turn off Hyperthreading on Intel CPUs when running applications
like PeopleSoft and Exchange.
2) So what does it mean to say "only two threads per CPU" here? At any given time, a CPU (single core) can process only one thread, and the other thread is waiting to be processed, correct?
3) How is that different from an application that spawns, say, 10 threads and waits for them to be executed? If the CPU can tackle at most two threads, shouldn't the programmer take that fact into consideration when writing multi-threaded applications?
Even with multi-core processors (say quad-core), at most 8 threads can be queued, but only 4 threads can be processed at the same time.
P.S.: I have read a little about hyper-threading, but I am not sure if it is relevant here and if all processors support hyper-threading.
1) It depends on the operating system more than anything. Even for single core architectures, multi-threading can be supported, but the threads are not executing in parallel - The OS will context-switch between them.
2) Intel usually supports two-way hardware threading (also called simultaneous multi-threading), where each hardware thread has its own register state while sharing the core's execution resources. So if you have a process with two threads, they can both execute on the same core simultaneously.
3) See 1. Basically the operating system is going to allocate as many threads as it can to hardware before it plans to context-switch between the threads it couldn't allocate. This process is dependent on the OS's scheduler, and you can read about the Linux one to get a good idea of what's going on.
Edit: Hyperthreading is basically the hardware threading feature I mentioned.
In your question, CPU means core.
1) It does. I believe memory access on ARM is in words, so a write to a char is not atomic.
Memory ordering also differs. Modern OSes (anything but DOS) support context switching: while one thread executes, others wait. The total number of threads across all Windows processes is about 1000. A common time quantum (the time a thread holds the CPU) is 1-10 ms. Multithreading on one core doesn't improve computational power but allows asynchronous tasks. For example, the GUI doesn't freeze during network activity: one thread waits on the network while another responds to user activity.
2) Yes
3) It is common practice to spawn a number of threads equal to the number of (virtual) cores, i.e. the number of cores in the system for AMD and twice that for Intel. This is true only for computational threads. Web server threads usually wait on the network and don't load the CPU much, so it is better to spawn thousands of threads.
Hyperthreading is good for tasks that wait on RAM: while one thread waits for data, another one executes. For math it usually does not increase performance. It works well with data that is not cache-friendly: lists, trees, hash tables that don't fit into cache.
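The sizing rule from point 3 could be written down roughly like this. It is a sketch only: pool_size is a hypothetical helper, and IO_THREADS_PER_CPU is an arbitrary assumed multiplier, not a recommended value. The right number for threads that mostly wait depends on the workload and has to be measured.

```python
import os

# Assumed multiplier for threads that mostly wait on the network.
# Purely illustrative; tune it by measuring the real workload.
IO_THREADS_PER_CPU = 50


def pool_size(kind, logical_cpus=None):
    """Suggest a thread count for 'cpu'-bound or 'io'-bound work."""
    if logical_cpus is None:
        # os.cpu_count() reports logical CPUs (hyperthreads included)
        # and may return None if the count cannot be determined.
        logical_cpus = os.cpu_count() or 1
    if kind == "cpu":
        return logical_cpus                      # one thread per logical CPU
    if kind == "io":
        return logical_cpus * IO_THREADS_PER_CPU  # waiting threads are cheap
    raise ValueError(f"unknown kind: {kind!r}")
```

For example, pool_size("cpu", 4) gives 4 and pool_size("io", 4) gives 200 with the assumed multiplier.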

Multithreading in the kernel

In Galvin, I came across
Finally, many operating system kernels are now multithreaded; several threads operate in the kernel, and each thread performs a specific task.
Question 1
It does not imply that all of them will run at the same time, since at a given time only 1 process/thread can acquire control over the processor, right? Though they could be doing various work, like one on the CPU and another working on I/O, like getting keystrokes into the buffer etc., right?
Question 2
Multithreading will show better performance on multiprocessor systems only, right?
Answer 1: Every core of your CPU can execute one thread at any given time. Since nearly all modern CPUs are multi-core, you'll get better performance if your app is multithreaded.
Answer 2: Multithreading will show better performance in most cases, even on systems with single-core CPUs. Your app will become more responsive to user input if you dispatch your time-intensive jobs to separate threads.
The parallelization levels are as follows:
Multi Computers
Multi Processors
Multi Cores
Multi Threads
At higher levels you see more benefit from threading. E.g. your multithreaded app will run better on multi-core CPUs compared with single-core (multithreaded) CPUs.
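The responsiveness point in Answer 2 can be sketched as follows. The slow job here just waits on an Event, which stands in for a long network or disk wait; the names demo and slow_job are invented for the example.

```python
import threading


def demo():
    """Show the main thread staying free while a slow job is pending."""
    log = []
    release = threading.Event()

    def slow_job():
        release.wait()           # stands in for a long network or disk wait
        log.append("job done")

    worker = threading.Thread(target=slow_job)
    worker.start()

    # The main thread is not blocked by the job and can keep serving the user.
    log.append("handled user input")

    release.set()                # the awaited data "arrives"
    worker.join()
    return log
```

This works even on a single core: while the worker thread is blocked, the scheduler simply runs the main thread.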

Resources