Is init_task thread is per cpu? or just one ?
If Im iterating from Init_task ill get to all of the threads in all of cpus ?
for example , using the following Macro which define at sched.h :
#define do_each_thread(g, t) \
for (g = t = &init_task ; (g = t = next_task(g)) != &init_task ; ) do
will iterate over all threads ?
thanks !
Related
What's the significance of (void) (&_max1 == &_max2); in the following definition of max found in Linux/tools/lib/lockdep/uinclude/linux/kernel.h?
#define max(x, y) ({ \
typeof(x) _max1 = (x); \
typeof(y) _max2 = (y); \
(void) (&_max1 == &_max2); \
_max1 > _max2 ? _max1 : _max2; })
It helps the compiler detect invalid uses of max(), i.e. with non-comparable x and y. As Sukminder points out, the == check is only used at compile time, it doesn't end up in the resulting binary.
I'm playing with the Linux kernel, and one thing that I don't understand is the pid of the init_task task.
As far as I know, there are two special pids: pid 0 for the idle/swapper task, and pid 1 for the init task.
Every online resource (e.g. one, two) I could find say that the init_task task represents the swapper task, i.e. it should have pid 0.
But when I print all the pids using the for_each_process macro, which starts from init_task, I get pid 1 as the first process. I don't get pid 0 at all. Which means that init_task has pid 1, and that it's the init task (?!).
Please help me resolve this confusion.
P.S. the kernel version is 2.4.
The reason for my confusion was the tricky definition of the for_each_task macro:
#define for_each_task(p) \
for (p = &init_task ; (p = p->next_task) != &init_task ; )
Even though it seems that p starts from init_task, it actually starts from init_task.next_task because of the assignment in the condition.
So for_each_task(p) { /* ... */ } could be rewritten as:
p = init_task.next_task;
while(p != &init_task)
{
/* ... */
p = p->next_task;
}
As it can be seen, the swapper process is not part of the iteration.
I am trying to get the intel MKL version of pardiso to work with multiple cores. Im using it to solve a structurally symmetric system (mtype=1) with around 60K equations.
iparm= 0
iparm(1) = 1 !
iparm(2) = 3 !
iparm(3) = omp_get_max_threads() !
iparm(4) = 0 !
iparm(5) = 0 !
iparm(6) = 0 !
iparm(7) = 0 !
iparm(8) = 9 !
iparm(9) = 0 !
iparm(10) = 13
iparm(11) = 1
iparm(12) = 0
iparm(13) = 0
iparm(14) = 0
iparm(15) = 0
iparm(16) = 0
iparm(17) = 0
iparm(18) = -1
iparm(19) = -1
iparm(20) = 0
These are the my ipram parameters. When compiling I have
F90FLAGS = ${F77FLAGS} -I${SOLIDroot} -openmp -mkl=parallel -d-lines -debug
Before calling pardiso I also set the number of threads available to MKL and openmp
call mkl_set_num_threads(3)
call omp_set_num_threads(3)
call mkl_set_dynamic(0) ! disabling dynamic adjustment of the number of threads
As far as I have understood, all MKL functions will try to use multiple threads if allowed or enabled for "sufficiently" large problems. I already have some parallelism using OMP and the code runs on several cores. The region from which I call pardiso is serial. My question is, what else is needed to make pardiso work with multiple cores?
Tried with the default values for iparm, ie iparm(1)=0 and there was no change
I cannot add a comment (not enough reputations).
You can try setting the environment variable OMP_NUM_THREADS before you run the code and see if that works.
In my OPENMP code, I want all threads do the same job and at the end take the average ( basically calculate error). ( How I calculate error? Each thread generates different random numbers, so the result from each threads is different.)
Here is simple code
program ...
..
!$OMP PARALLEL
do i=1,Nstep
!.... some code goes here
result=...
end do
!$END PARALLEL
sum = result(from thread 0)+result(from thread 1)+...
sum = sum/(number of threads)
Simply I have to send do loop inside OPENMP to all threads, not blocking this loop.
I can do what I want using MPI and MPI_reduce, but I want to write a hybrid code OPENMP + MPI. I haven't figured out the OPENMP part, so suggestions please?
It is as simple as applying sum reduction over result:
USE omp_lib ! for omp_get_num_threads()
INTEGER :: num_threads
result = 0.0
num_threads = 1
!$OMP PARALLEL REDUCTION(+:result)
!$OMP SINGLE
num_threads = omp_get_num_threads()
!$OMP END SINGLE
do i = 1, Nstep
...
result = ...
...
end do
!$END PARALLEL
result = result / num_threads
Here num_threads is a shared INTEGER variable that is assigned the actual number of threads used to execute the parallel region. The assignment is put in a SINGLE construct since it suffices one thread - and no matter which one - to execute the assignment.
I have a Dell Inspiron 620. The System control panel says 1 Intel i3-2100. Intel says (http://ark.intel.com/products/53422/) it has 2 cores and four threads. Three questions: In system environment variables the no_of_processors=4; so is that 4 threads?
Regarding the cores, when I run this code:
// get a list of all processor devices
deviceList = SetupDiGetClassDevs(ref processorGuid, "ACPI", IntPtr.Zero, (int)DIGCF.PRESENT);
// attempt to process each item in the list
for (int deviceNumber = 0; ; deviceNumber++)
{
SP_DEVINFO_DATA deviceInfo = new SP_DEVINFO_DATA();
deviceInfo.cbSize = Marshal.SizeOf(deviceInfo);
// attempt to read the device info from the list, if this fails, we're at the end of the list
if (!SetupDiEnumDeviceInfo(deviceList, deviceNumber, ref deviceInfo))
{
deviceCount = deviceNumber - 1;
break;
}
}
I get 3 as the number of cores, not 2.
Also, in terms of the number of threads this system will support adequately, is that cores x processors?
Thanks.
This is pure supposition but could it be:
2 Core with 4 threads implies hyper threading
System properties are referring 4 processors => its reading the number of hyper threads available as 'logical single thread processors'
Your loop starts at 0 so you're still seeing 4.