Dynamic scheduling across OpenMP thread teams - multithreading

I am curious why the OpenMP (v4.5) specification does not provide dynamic scheduling across thread teams.
Is it possible to schedule loop iterations across OpenMP thread teams at runtime, using NVIDIA GPUs or Intel Xeon Phis?

Related

What Operating System services are necessary to support kernel-level threads?

I am studying Solaris and Linux and am viewing Kernel Level Threads (KLTs) as the fundamental entity that can be scheduled and dispatched by the OS. I know that a multi-threaded OS must store thread execution context and provide mechanisms to schedule and dispatch KLTs, and that kernel level threads handle interrupts, system calls, and provide an interface to the CPU as a resource at the user-kernel interface. I am not clear on what services are necessary to support KLTs in a multi-threaded OS.
I cannot determine if there is a core kernel process that is necessary to support all KLTs, or if KLTs run independently as the base level of computing. I would like to understand what minimum set of operations (resource allocation, scheduling) is necessary to support an OS with KLTs.
I have looked at Tanenbaum's discussion of threads in his distributed systems book, Understanding the Linux Kernel, and MultiThreading the SunOS kernel, but I cannot find an answer to my question.
I believe that answering the question -- What Operating System services are necessary to support kernel-level threads? -- will help me understand how KLTs are implemented.

What's the difference between a physical CPU and a hyper-thread on Azure?

I'm reading the documentation of the SQL Databases on Microsoft Azure about the performance between two kinds of database service, GEN4 and GEN5. Currently the documentation shows that GEN4 CPUs are based on Intel E5-2673 v3 (Haswell) 2.4 GHz processors and 1 vCore = 1 physical CPU, and GEN5 logical CPUs are based on Intel E5-2673 v4 (Broadwell) 2.3 GHz processors where 1 vCore = 1 Hyper thread.
My question is: is one GEN4 physical CPU equivalent to a whole Intel E5-2673 v3 with 12 cores and 24 logical processors, or to an individual core? And is one GEN5 hyper-thread equivalent to one logical core of a physical core on an Intel E5-2673 v4?
Documentation link: Azure SQL Database pricing
Is one GEN4 physical CPU equivalent to an Intel E5-2673 v3 with 12 cores and 24 logical processors, or to an individual core?
One physical CPU in GEN4 represents a single core of an Intel E5-2673 v3 (Haswell) 2.4 GHz processor.
Is one GEN5 hyper-thread equivalent to a logical core of a physical core on an Intel E5-2673 v4?
Introduction to Hyper-Threading:
Hyper-threading (officially called Hyper-Threading Technology or HT Technology, and abbreviated as HTT or HT) is Intel's proprietary simultaneous multithreading (SMT) implementation used to improve parallelization of computations (doing multiple tasks at once) performed on x86 microprocessors. It first appeared in February 2002 on Xeon server processors and in November 2002 on Pentium 4 desktop CPUs.[4] Later, Intel included this technology in Itanium, Atom, and Core 'i' Series CPUs, among others.
For each processor core that is physically present, the operating system addresses two virtual (logical) cores and shares the workload between them when possible. The main function of hyper-threading is to increase the number of independent instructions in the pipeline; it takes advantage of superscalar architecture, in which multiple instructions operate on separate data in parallel. With HTT, one physical core appears as two processors to the operating system, allowing concurrent scheduling of two processes per core. In addition, two or more processes can use the same resources: if resources for one process are not available, then another process can continue if its resources are available.
In addition to requiring simultaneous multithreading (SMT) support in the operating system, hyper-threading can be properly utilized only with an operating system specifically optimized for it.[5] Furthermore, Intel recommends HTT to be disabled when using operating systems unaware of this hardware feature.
For more information about Hyper-Threading, see: Hyper Thread
It seems that Microsoft is intentionally being deceptive with how they've labeled/described the CPU count between the two models. It seems pretty clear based on the wording as you described and the performance we are seeing that the same level in GEN5 has half as many logical processors. This makes sense when you figure that you get improved hardware in GEN5, but the prices are the same for the same levels.
We run many processor-intensive analytical queries. In testing over the last week, we had to go to GEN5_16 to get the same performance as GEN4_8. Unfortunately, the price skyrockets from $42k a year to $84k to do this. We moved to GEN5_8 over the holidays and are currently suffering from incredible contention and log rate throttling on simple SELECT INTO queries, and we are in a pickle. We constantly bump up against the 1 TB limit of GEN4 (MSSQL's log growth kills us; we don't need full recovery but have no choice), but we never had performance or throttling issues on GEN4_8.
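The vCore definitions quoted in the question (GEN4: 1 vCore = 1 physical core; GEN5: 1 vCore = 1 hyper-thread) can be turned into a back-of-the-envelope calculation. The sketch below is only an illustration: the function name `physical_cores` and the constant `THREADS_PER_CORE` are made up for this example, and it assumes two hyper-threads per physical core. It counts cores only and says nothing about actual query performance.

```python
# Assumption: Hyper-Threading exposes two logical CPUs per physical core.
THREADS_PER_CORE = 2


def physical_cores(vcores, generation):
    """Rough number of physical cores behind a given vCore count.

    GEN4: 1 vCore = 1 physical core (per the quoted documentation).
    GEN5: 1 vCore = 1 hyper-thread, i.e. half a physical core here.
    """
    if generation == "GEN4":
        return float(vcores)
    if generation == "GEN5":
        return vcores / THREADS_PER_CORE
    raise ValueError(f"unknown generation: {generation!r}")


if __name__ == "__main__":
    for vcores, generation in [(8, "GEN4"), (8, "GEN5"), (16, "GEN5")]:
        cores = physical_cores(vcores, generation)
        print(f"{generation}_{vcores}: about {cores:g} physical cores")
```

Under these assumptions GEN5_16 and GEN4_8 sit on the same number of physical cores, which is in line with the observation that GEN5_16 was needed to match GEN4_8.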

The distribution of processing threads on the processor cores

The operating system distributes the processing of threads over the processor cores on its own. The program has two threads. Initially, neither thread is loaded with work, and both are processed by one core. Later they are loaded with work. Will the operating system transfer the processing of one thread to another processor core?

How does Erlang implement concurrency without the use of OS threads?

If Erlang does its own process creation and scheduling, without utilizing OS threads, how does it make use of multiple CPU cores? My limited understanding is that the OS assigns the CPU cores to OS threads.
Erlang runs on a virtual machine called BEAM.
The BEAM VM runs one scheduler per core (each scheduler is an OS thread), and each scheduler runs many lightweight Erlang processes.
See this related SO question.
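As a rough illustration of how a runtime can schedule its own lightweight processes on top of a small number of OS threads, here is a toy sketch in Python. It is not how BEAM is implemented: the names `lightweight_process`, `scheduler_loop` and `run_processes` are invented for this example, generators stand in for Erlang processes, and a single shared run queue is a simplification. Also note that CPython's global interpreter lock prevents these threads from executing Python code in parallel, so this shows the structure only, not a speedup.

```python
import os
import queue
import threading


def lightweight_process(steps):
    """Toy 'process': a generator that yields control after each unit of work."""
    for _ in range(steps):
        yield


def scheduler_loop(run_queue, finished, lock):
    """One scheduler per OS thread: run a process for one step, then requeue it."""
    while True:
        item = run_queue.get()
        if item is None:  # shutdown signal
            run_queue.task_done()
            return
        pid, proc = item
        try:
            next(proc)                  # run until the next yield
            run_queue.put((pid, proc))  # not done yet: back on the run queue
        except StopIteration:
            with lock:
                finished.append(pid)    # process ran to completion
        run_queue.task_done()


def run_processes(num_processes, steps, num_schedulers=None):
    """Run many toy processes on a few OS threads; return the finished pids."""
    num_schedulers = num_schedulers or os.cpu_count() or 1
    run_queue = queue.Queue()
    finished, lock = [], threading.Lock()
    for pid in range(num_processes):
        run_queue.put((pid, lightweight_process(steps)))
    threads = [
        threading.Thread(target=scheduler_loop, args=(run_queue, finished, lock))
        for _ in range(num_schedulers)
    ]
    for thread in threads:
        thread.start()
    run_queue.join()          # wait until every process has finished
    for _ in threads:
        run_queue.put(None)   # ask each scheduler thread to exit
    for thread in threads:
        thread.join()
    return sorted(finished)


if __name__ == "__main__":
    done = run_processes(num_processes=1000, steps=5)
    print(f"{len(done)} lightweight processes ran on {os.cpu_count()} scheduler threads")
```

The point of the sketch is that the OS only ever sees the scheduler threads and assigns those to cores; which lightweight process runs next is decided inside the runtime.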

Hardware thread vs soft threads?

I have read that in a multi-core processor each core contains 2 hardware threads, so that, for example, a dual-core processor runs 4 hardware threads. Now, if I create 2 threads in Java, will those threads be mapped to 2 hardware threads, or will both Java threads be executed by a single hardware thread of a particular core?
That depends on a lot of things. However, the 2 hardware threads per core you are referring to are Intel's Hyper-Threading technology. This technology enables the CPU to hold two thread contexts and execute them simultaneously, sharing execution resources.
Which threads run where is OS-implementation dependent and is mostly decided by the thread scheduler algorithm of your OS.
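A program can at least observe how many logical CPUs (hardware threads) the OS exposes and that each thread it creates is a separate OS thread. The following is a minimal sketch in Python rather than Java, with the helper names `logical_cpus` and `run_two_threads` invented for this example. It does not show which hardware thread each software thread lands on, since that stays up to the OS scheduler.

```python
import os
import threading


def logical_cpus():
    """Number of logical CPUs (hardware threads) this process may use."""
    if hasattr(os, "sched_getaffinity"):   # available on Linux
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1


def run_two_threads():
    """Start two threads and return the OS-level thread id of each."""
    barrier = threading.Barrier(2)
    lock = threading.Lock()
    native_ids = []

    def worker():
        barrier.wait()  # make sure both threads are alive at the same time
        with lock:
            native_ids.append(threading.get_native_id())

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return native_ids


if __name__ == "__main__":
    print("logical CPUs visible to this process:", logical_cpus())
    print("OS thread ids of the two threads:", run_two_threads())
```

`threading.get_native_id()` requires Python 3.8 or later. Two distinct ids mean the OS is free to place the two threads on different hardware threads, but it is not obliged to.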

Resources