How to set the CPU affinity of a particular pthread? - multithreading

I'd like to specify the cpu-affinity of a particular pthread. All the references I've found so far deal with setting the cpu-affinity of a process (pid_t) not a thread (pthread_t). I tried some experiments passing pthread_t's around and as expected they fail. Am I trying to do something impossible? If not, can you send a pointer please? Thanks a million.

This is a wrapper I've made to make my life easier. Its effect is that the calling thread gets "stuck" to the core with id core_id:
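#define _GNU_SOURCE   // must precede every #include to expose pthread_setaffinity_np
#include <errno.h>    // EINVAL
#include <pthread.h>  // pthread_self, pthread_setaffinity_np
#include <sched.h>    // cpu_set_t, CPU_ZERO, CPU_SET
#include <unistd.h>   // sysconf
// core_id = 0, 1, ... n-1, where n is the system's number of cores
// returns 0 on success or an error number on failure
int stick_this_thread_to_core(int core_id) {
    int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
    if (core_id < 0 || core_id >= num_cores)
        return EINVAL;
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(core_id, &cpuset);
    pthread_t current_thread = pthread_self();
    return pthread_setaffinity_np(current_thread, sizeof(cpu_set_t), &cpuset);
}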

Assuming Linux:
The interface for setting the affinity is, as you've probably already discovered:
int sched_setaffinity(pid_t pid, size_t cpusetsize, const cpu_set_t *mask);
Pass 0 as the pid and it applies to the calling thread only. Alternatively, have other threads report their kernel thread ID with the Linux-specific call pid_t gettid(void); and pass that in as the pid.
Quoting the man page:
The affinity mask is actually a per-thread attribute that can be adjusted independently for each of the threads in a thread group. The value returned from a call to gettid(2) can be passed in the argument pid. Specifying pid as 0 will set the attribute for the calling thread, and passing the value returned from a call to getpid(2) will set the attribute for the main thread of the thread group. (If you are using the POSIX threads API, then use pthread_setaffinity_np(3) instead of sched_setaffinity().)
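The man page's closing remark points at the pthread-level calls. As a minimal sketch of that route (assuming Linux with glibc), a thread can also be created already pinned by placing the CPU set in its attributes with pthread_attr_setaffinity_np. The helper names first_allowed_cpu, create_pinned_thread and report_affinity are made up for this illustration.

```c
#define _GNU_SOURCE          /* exposes the *_np calls and the CPU_* macros */
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

/* Hypothetical helper: lowest-numbered CPU the caller may run on, or -1. */
int first_allowed_cpu(void)
{
    cpu_set_t allowed;
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}

/* Hypothetical helper: start fn(arg) in a new thread whose affinity mask
   contains only `cpu`. Returns 0 on success or an error number. */
int create_pinned_thread(pthread_t *thread, int cpu,
                         void *(*fn)(void *), void *arg)
{
    cpu_set_t cpuset;
    pthread_attr_t attr;
    int err;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return EINVAL;
    CPU_ZERO(&cpuset);
    CPU_SET(cpu, &cpuset);

    err = pthread_attr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_attr_setaffinity_np(&attr, sizeof(cpuset), &cpuset);
    if (err == 0)
        err = pthread_create(thread, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return err;
}

/* Demonstration thread body: returns the single CPU in its own affinity
   mask, or -1 if the mask does not contain exactly one CPU. */
void *report_affinity(void *arg)
{
    cpu_set_t mask;
    (void)arg;
    if (pthread_getaffinity_np(pthread_self(), sizeof(mask), &mask) != 0 ||
        CPU_COUNT(&mask) != 1)
        return (void *)(intptr_t)-1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            return (void *)(intptr_t)cpu;
    return (void *)(intptr_t)-1;
}
```

A caller would pick a CPU with first_allowed_cpu(), pass report_affinity (or its own function) to create_pinned_thread, and join the thread as usual. Picking from the current mask avoids EINVAL on machines where a cpuset or container hides some CPUs.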

// compilation: gcc -o affinity affinity.c -lpthread
#define _GNU_SOURCE
#include <sched.h>   // cpu_set_t, CPU_ZERO, CPU_SET, sched_setaffinity
#include <pthread.h> // pthread_t
#include <stdio.h>
void *th_func(void *arg);
int main(void) {
    pthread_t thread; // the thread
    pthread_create(&thread, NULL, th_func, NULL);
    pthread_join(thread, NULL);
    return 0;
}
void *th_func(void *arg)
{
    // we can set one or more bits here, each one representing a single CPU
    cpu_set_t cpuset;
    // the CPU we want to use (it must exist and be available to this process)
    int cpu = 2;
    CPU_ZERO(&cpuset);     // clears the cpuset
    CPU_SET(cpu, &cpuset); // sets CPU 2 in the cpuset
    /*
     * cpu affinity for the calling thread
     * first parameter is the pid, 0 = calling thread
     * second parameter is the size of your cpuset
     * third param is the cpuset in which your thread will be
     * placed. Each bit represents a CPU
     */
    if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0)
        perror("sched_setaffinity");
    while (1)
        ; // burns CPU 2
    return NULL; // never reached
}
On Linux you can use CPU sets (cpu_set_t) to control which CPUs a process or thread may run on.
This type of control is called CPU affinity.
The function sched_setaffinity takes a kernel thread ID (a pid_t such as the one
returned by gettid(), not a pthread_t) and a CPU set as parameters.
When you pass 0 as the first parameter, the calling thread
is affected.
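To make the thread-ID variant concrete, here is a hedged sketch (Linux only) in which a worker thread publishes its kernel thread ID and a different thread then pins it with sched_setaffinity. The struct worker handshake and the function names are assumptions introduced for this illustration, and the gettid value is obtained through syscall(SYS_gettid) so that older glibc versions without a gettid() wrapper also work.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <semaphore.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handshake shared between the controlling thread and the worker. */
struct worker {
    pid_t tid;        /* kernel thread ID, filled in by the worker */
    sem_t tid_ready;  /* posted once tid is valid */
    sem_t may_exit;   /* posted when the worker may finish */
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    w->tid = (pid_t)syscall(SYS_gettid);  /* report the kernel thread ID */
    sem_post(&w->tid_ready);
    sem_wait(&w->may_exit);               /* stay alive while being pinned */
    return NULL;
}

/* Starts a worker, pins it to `cpu` from the calling thread by TID, and
   reads the mask back. Returns 0 if the worker ended up pinned, else -1. */
int pin_worker_by_tid(int cpu)
{
    struct worker w;
    pthread_t thread;
    cpu_set_t want, got;
    int rc = -1;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    sem_init(&w.tid_ready, 0, 0);
    sem_init(&w.may_exit, 0, 0);

    if (pthread_create(&thread, NULL, worker_main, &w) == 0) {
        sem_wait(&w.tid_ready);
        CPU_ZERO(&want);
        CPU_SET(cpu, &want);
        /* The first argument is the worker's TID, not 0 and not a pthread_t. */
        if (sched_setaffinity(w.tid, sizeof(want), &want) == 0 &&
            sched_getaffinity(w.tid, sizeof(got), &got) == 0 &&
            CPU_EQUAL(&want, &got))
            rc = 0;
        sem_post(&w.may_exit);
        pthread_join(thread, NULL);
    }
    sem_destroy(&w.tid_ready);
    sem_destroy(&w.may_exit);
    return rc;
}
```

When the pthread_t of the target is at hand, pthread_setaffinity_np(thread, ...) achieves the same without the handshake; the TID route is mainly useful when only the kernel ID is known, for example from /proc.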

Below is an example program that sets the CPU affinity of a particular pthread.
Link it with the math and pthread libraries (-lm -lpthread).
#include <math.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
double waste_time(long n)
{
    double res = 0;
    long i = 0;
    while (i < n * 200000) {
        i++;
        res += sqrt(i);
    }
    return res;
}
/* Pin the calling thread to one CPU. pthread_setaffinity_np() takes a
   cpu_set_t and returns an error number (it does not set errno), so the
   result is reported with strerror() rather than perror(). */
static void bind_to_cpu(int cpu)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    int err = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
    if (err != 0)
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
}
void *thread_func(void *param)
{
    bind_to_cpu(0); /* bind thread to processor 0 */
    /* waste some time so the work is visible with "top" */
    printf("result: %f\n", waste_time(2000));
    bind_to_cpu(1); /* thread switches to processor 1 now */
    /* waste some more time to see the processor switch */
    printf("result: %f\n", waste_time(2000));
    return NULL;
}
int main(int argc, char *argv[])
{
    pthread_t my_thread;
    int err = pthread_create(&my_thread, NULL, thread_func, NULL);
    if (err != 0)
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
    pthread_exit(NULL); /* lets the new thread keep running after main ends */
}
Compile the above program with the -D_GNU_SOURCE flag.

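The scheduler is free to move a thread among the CPUs its affinity mask allows, and the mask itself can later be changed by anything with permission to do so (another sched_setaffinity call, taskset, or a cpuset change). To constrain placement persistently from outside the program, see cpusets:
http://man7.org/linux/man-pages/man7/cpuset.7.html
Alternatively, you can write a short program that reapplies the CPU affinity periodically (every few seconds) with sched_setaffinity.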

Related

Why EAGAIN in pthread_key_create happens?

Sometimes when I try to create a key with pthread_key_create I get the EAGAIN error code. Is it possible to know exactly why?
The documentation says:
The system lacked the necessary resources to create another thread-specific data key, or the system-imposed limit on the total number of keys per process [PTHREAD_KEYS_MAX] would be exceeded.
How can I check whether the key limit was the cause? Is there some kind of monitoring tool that shows how many keys are already in use and how many are still available?
One important thing about our code: we use fork() and have multiple processes running, and each process can have multiple threads.
I found that the processes do not get independent limits for thread keys when we use fork(). Here is a little example.
#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
size_t create_keys(pthread_key_t *keys, size_t number_of_keys)
{
size_t counter = 0;
for (size_t i = 0; i < number_of_keys; i++)
{
int e = pthread_key_create(keys + i, NULL);
if (e)
{
printf("ERROR (%d): index: %zu, pthread_key_create (%d)\n", getpid(), i, e);
break;
}
counter++;
}
return counter;
}
int main(int argc, char const *argv[])
{
printf("maximum number of thread keys: %ld\n", sysconf(_SC_THREAD_KEYS_MAX));
printf("process id: %d\n", getpid());
const size_t number_of_keys = 1024;
pthread_key_t keys_1[number_of_keys];
memset(keys_1, 0, number_of_keys * sizeof(pthread_key_t));
printf("INFO (%d): number of active keys: %zu\n", getpid(), create_keys(keys_1, number_of_keys));
pid_t p = fork();
if (p == 0)
{
printf("process id: %d\n", getpid());
pthread_key_t keys_2[number_of_keys];
memset(keys_2, 0, number_of_keys * sizeof(pthread_key_t));
printf("INFO (%d): number of active keys: %zu\n", getpid(), create_keys(keys_2, number_of_keys));
}
return 0;
}
When I run this example on Ubuntu 16.04, I see that the child process cannot create any new thread keys if the parent has already used as many keys as the limit allows (1024). But if I use 512 keys each in the parent and child processes, it runs without error.
As you know, fork() traditionally works by copying the process in memory and then continuing execution from the same point within each copy as parent and child. This is what the return code of fork() indicates.
In order to perform fork(), the internals of the process must be duplicated. Memory, stack, open files, and probably thread local storage keys. Each system is different in its implementation of fork(). Some systems allow you to customise the areas of the process that get copied (see Linux clone(2) interface). However, the concept remains the same.
So, on to your example code: if you allocate 1024 keys in the parent, every child process inherits a full key table and has no spare keys to work with, resulting in the errors. If you allocate only 512 keys in the parent, then every child inherits a half-empty keys table and has 512 spare keys to play with, hence no errors arise.
Maximum value:
#include <unistd.h>
#include <stdio.h>
int main ()
{
printf ("%ld\n", sysconf(_SC_THREAD_KEYS_MAX));
return 0;
}
Consider using pthread_key_delete.
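As a rough sketch of that suggestion: a process whose key table is full (for instance a child that inherited a full table across fork()) can release slots it no longer needs with pthread_key_delete and then create new keys. The function name key_slot_is_reusable is hypothetical. Note that pthread_key_delete does not run destructors, so any values still stored under the key have to be cleaned up by the application.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Fills the key table, shows that further creation fails with EAGAIN,
   deletes one key and creates a new one in the freed slot.
   Returns 0 if the freed slot could be reused, -1 otherwise. */
int key_slot_is_reusable(void)
{
    long limit = sysconf(_SC_THREAD_KEYS_MAX);
    if (limit <= 0)
        return -1;

    pthread_key_t *keys = malloc((size_t)limit * sizeof(*keys));
    if (keys == NULL)
        return -1;

    /* Create keys until the table is full (or we hit the reported limit). */
    size_t n = 0;
    while (n < (size_t)limit && pthread_key_create(&keys[n], NULL) == 0)
        n++;

    int result = -1;
    pthread_key_t extra;
    int err = pthread_key_create(&extra, NULL);
    if (err == 0) {
        /* Unexpected: the table was not full after all. */
        pthread_key_delete(extra);
    } else if (err == EAGAIN && n > 0) {
        pthread_key_delete(keys[--n]);              /* free one slot */
        if (pthread_key_create(&extra, NULL) == 0) { /* slot is reusable */
            result = 0;
            pthread_key_delete(extra);
        }
    }

    /* Release everything this function created. */
    while (n > 0)
        pthread_key_delete(keys[--n]);
    free(keys);
    return result;
}
```

In the fork() example above, the equivalent step would be for the child to call pthread_key_delete on the entries of keys_1 it does not need before creating keys_2.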

Linux Thread priority , behaviour is abnormal

In the code snippet below, I create 6 threads, each with a different priority. The priorities are listed in the global priority array. Each thread continuously increments a global counter selected by its thread index. I expected the count to be higher for threads with higher priority, but my output does not follow the priorities; please refer to the output shown below. I am trying this on Ubuntu 16.04 with Linux kernel 4.10.
Output:
Thread=0
Thread=3
Thread=2
Thread=5
Thread=1
Thread=4
pid=32155 count=4522138740
pid=32155 count=4509082289
pid=32155 count=4535088439
pid=32155 count=4517943246
pid=32155 count=4522643905
pid=32155 count=4519640181
Code:
#include <stdio.h>
#include <pthread.h>
#define FAILURE -1
#define MAX_THREADS 15
long int global_count[MAX_THREADS];
/* priority of each thread */
long int priority[]={1,20,40,60,80,99};
void clearGlobalCounts()
{
int i=0;
for(i=0;i<MAX_THREADS;i++)
global_count[i]=0;
}
/**
thread parameter is thread index
**/
void funcDoNothing(void *threadArgument)
{
int count=0;
int index = *((int *)threadArgument);
printf("Thread=%d\n",index);
clearGlobalCounts();
while(1)
{
count++;
if(count==100)
{
global_count[index]++;
count=0;
}
}
}
int main()
{
int i=0;
for(int i=0;i<sizeof(priority)/sizeof(long int);i++)
create_thread(funcDoNothing, i,priority[i]);
sleep(3600);
for(i=0;i<sizeof(priority)/sizeof(long int);i++)
{
printf("pid=%d count=%ld\n",getpid(),
global_count[i]);
}
}
create_thread(void *func,int thread_index,int priority)
{
pthread_attr_t attr;
struct sched_param schedParam;
void *pParm=NULL;
int id;
int * index = malloc(sizeof(int));
*index = thread_index;
void *res;
/* Initialize the thread attributes */
if (pthread_attr_init(&attr))
{
printf("Failed to initialize thread attrs\n");
return FAILURE;
}
if(pthread_attr_setschedpolicy(&attr, SCHED_FIFO))
{
printf("Failed to pthread_attr_setschedpolicy\n");
return FAILURE;
}
if (pthread_attr_setschedpolicy(&attr, SCHED_FIFO))
{
printf("Failed to setschedpolicy\n");
return FAILURE;
}
/* Set the capture thread priority */
pthread_attr_getschedparam(&attr, &schedParam);;
schedParam.sched_priority = sched_get_priority_max(SCHED_FIFO) - 1;
schedParam.sched_priority = priority;
if (pthread_attr_setschedparam(&attr, &schedParam))
{
printf("Failed to setschedparam\n");
return FAILURE;
}
pthread_create(&id, &attr, (void *)func, index);
}
The documentation for pthread_attr_setschedparam says:
In order for the parameter setting made by pthread_attr_setschedparam() to have effect when calling pthread_create(3), the caller must use pthread_attr_setinheritsched(3) to set the inherit-scheduler attribute of the attributes object attr to PTHREAD_EXPLICIT_SCHED.
So you have to call pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) , for example:
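/* returns an error number and does not set errno, so use strerror() from <string.h> */
int err = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
if (err != 0) {
    fprintf(stderr, "pthread_attr_setinheritsched: %s\n", strerror(err));
}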
pthread_create(&id, &attr, (void *)func, index);
Note: Your code produces a lot of compiler warnings; you need to fix those. You do not want to test code that has a lot of undefined behavior, as indicated by some of the warnings. You should probably lower the sleep(3600) to just a few seconds, since once your threads actually run under SCHED_FIFO they will hog your CPUs and the machine will appear frozen while they are running.
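Putting the pieces together, a cleaned-up thread-creation helper might look like the sketch below. It is an illustration, not the asker's code: the names create_fifo_thread and report_policy are made up, and it assumes Linux, where creating a SCHED_FIFO thread normally needs root or CAP_SYS_NICE (or a suitable RLIMIT_RTPRIO), so pthread_create may legitimately fail with EPERM.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdint.h>

/* Hypothetical helper: create a SCHED_FIFO thread with the given priority.
   Returns 0 on success or an error number (EPERM without privileges). */
int create_fifo_thread(pthread_t *thread, int priority,
                       void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    struct sched_param param = { .sched_priority = priority };

    int err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    /* Without this, the new thread inherits the creator's policy and
       priority, and the two settings below are silently ignored. */
    err = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    if (err == 0)
        err = pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    if (err == 0)
        err = pthread_attr_setschedparam(&attr, &param);
    if (err == 0)
        err = pthread_create(thread, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return err;
}

/* Demonstration thread body: returns the scheduling policy it runs under,
   or -1 if the policy could not be queried. */
void *report_policy(void *arg)
{
    int policy;
    struct sched_param param;
    (void)arg;
    if (pthread_getschedparam(pthread_self(), &policy, &param) != 0)
        return (void *)(intptr_t)-1;
    return (void *)(intptr_t)policy;
}
```

Checking the return value of pthread_create matters here: in the original snippet it is ignored, so a permission failure would go unnoticed.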

error: undefined reference to `sched_setaffinity' on windows xp

Basically, the code below was intended for use on Linux, and maybe that's the reason I get the error, because I'm using Windows XP; but I figured that pthreads should work just as well on both machines. I'm using gcc as my compiler and I did link with -lpthread, but I got the following errors anyway.
|21|undefined reference to `sched_setaffinity'|
|30|undefined reference to `sched_setaffinity'|
If there is another method of setting the thread affinity using pthreads (on Windows), let me know. I already know all about the windows.h thread affinity functions available, but I want to keep things multiplatform. Thanks.
#include <stdio.h>
#include <math.h>
#include <sched.h>
double waste_time(long n)
{
double res = 0;
long i = 0;
while(i <n * 200000)
{
i++;
res += sqrt (i);
}
return res;
}
int main(int argc, char **argv)
{
unsigned long mask = 1; /* processor 0 */
/* bind process to processor 0 */
if (sched_setaffinity(0, sizeof(mask), &mask) <0)//line 21
{
perror("sched_setaffinity");
}
/* waste some time so the work is visible with "top" */
printf ("result: %f\n", waste_time (2000));
mask = 2; /* process switches to processor 1 now */
if (sched_setaffinity(0, sizeof(mask), &mask) <0)//line 30
{
perror("sched_setaffinity");
}
/* waste some more time to see the processor switch */
printf ("result: %f\n", waste_time (2000));
}
sched_getaffinity() and sched_setaffinity() are strictly Linux-specific calls, so a Windows toolchain has nothing to link them against. Windows provides its own set of Win32 API calls that affect scheduling, such as SetThreadAffinityMask().
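One way to keep the calling code multiplatform is to hide the platform call behind a small wrapper. The following is only a sketch under that assumption: pin_current_thread is a hypothetical name, the Windows branch uses the Win32 calls SetThreadAffinityMask and GetCurrentThread, and the other branch assumes Linux with glibc.

```c
#ifdef _WIN32
#include <windows.h>
#else
#define _GNU_SOURCE          /* must come before the first #include */
#include <pthread.h>
#include <sched.h>
#endif

/* Hypothetical portable wrapper: pin the calling thread to one CPU.
   Returns 0 on success and a nonzero value on failure. */
int pin_current_thread(int cpu)
{
#ifdef _WIN32
    /* The mask is a DWORD_PTR, so only as many CPUs as it has bits
       can be addressed this way. */
    if (cpu < 0 || cpu >= (int)(8 * sizeof(DWORD_PTR)))
        return -1;
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure. */
    return SetThreadAffinityMask(GetCurrentThread(),
                                 (DWORD_PTR)1 << cpu) != 0 ? 0 : -1;
#else
    cpu_set_t cpuset;
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&cpuset);
    CPU_SET(cpu, &cpuset);
    /* Returns 0 or an error number such as EINVAL. */
    return pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
#endif
}
```

The program in the question would then call pin_current_thread(0) and pin_current_thread(1) instead of sched_setaffinity, and the same source builds on both systems.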

sem_init() causing SEGV

I have the following code and it is being killed by a SEGV signal. Using the debugger shows that it is being killed by the first sem_init() in main(). If I comment out the first sem_init() the second causes the same problem. I have tried figuring out what would cause this sys call to cause a SEGV. The else is not being run, so the error is happening before it can return a value.
Any help would be greatly appreciated,
Thank you.
I removed the rest of the code that isn't run before this problem occurs.
#define PORTNUM 7000
#define NUM_OF_THREADS 5
#define oops(msg) { perror(msg); exit(1);}
#define FCFS 0
#define SJF 1;
void bindAndListen();
void acceptConnection(int socket_file_descriptor);
void* dispatchJobs(void*);
void* replyToClient(void* pos);
//holds ids of worker threads
pthread_t threads[NUM_OF_THREADS];
//mutex variable for sleep_signal_cond
pthread_mutex_t sleep_signal_mutex[NUM_OF_THREADS];
//holds the condition variables to signal when the thread should be unblocked
pthread_cond_t sleep_signal_cond[NUM_OF_THREADS];
//mutex for accessing sleeping_thread_list
pthread_mutex_t sleeping_threads_mutex = PTHREAD_MUTEX_INITIALIZER;
//list of which threads are sleeping so they can be signaled and given a job
std::vector<bool> *sleeping_threads_list = new std::vector<bool>();
//number of threads ready for jobs
sem_t* available_threads;
sem_t* waiting_jobs;
//holds requests waiting to be given to one of the threads for execution
std::vector<std::vector<int> >* jobs = new std::vector<std::vector<int> >();
pthread_mutex_t jobs_mutex = PTHREAD_MUTEX_INITIALIZER;
int main (int argc, char * const argv[]) {
//holds id for thread responsible for removing jobs from ready queue and assigning them to worker thread
pthread_t dispatcher_thread;
//initializes semaphores
if(sem_init(available_threads, 0, NUM_OF_THREADS) != 0){ //this is the line causing the SEGV
oops("Error Initializing Semaphore");
}
if(sem_init(waiting_jobs, 0, 0) !=0){
oops("Error Initializing Semaphore");
}
//initializes condition variables and guarding mutexes
for(int i=0; i<NUM_OF_THREADS; i++){
pthread_cond_init(&sleep_signal_cond[i], NULL);
pthread_mutex_init(&sleep_signal_mutex[i], NULL);
}
if(pthread_create(&dispatcher_thread, NULL, dispatchJobs, (void*)NULL) !=0){
oops("Error Creating Distributer Thread");
You declare pointers to your semaphores:
sem_t* available_threads;
sem_t* waiting_jobs;
but never initialize the memory. The sem_init function is not expecting to allocate memory, just to initialize an existing blob of memory. Either allocate some memory and assign these pointers to it, or declare the semaphores as sem_t and pass the address to sem_init.
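The second option could look like the sketch below, which keeps the variable names and NUM_OF_THREADS from the question but declares the semaphores as objects and passes their addresses. The wrapper function init_semaphores is an assumption added for illustration.

```c
#include <semaphore.h>

#define NUM_OF_THREADS 5

/* Real sem_t objects with static storage instead of uninitialized pointers. */
sem_t available_threads;  /* number of threads ready for jobs */
sem_t waiting_jobs;       /* number of queued jobs */

/* Returns 0 on success, or -1 with errno set by sem_init. */
int init_semaphores(void)
{
    if (sem_init(&available_threads, 0, NUM_OF_THREADS) != 0)
        return -1;
    if (sem_init(&waiting_jobs, 0, 0) != 0) {
        sem_destroy(&available_threads);
        return -1;
    }
    return 0;
}
```

Every later sem_wait and sem_post call then takes &available_threads or &waiting_jobs rather than the bare pointer variable.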

is nice() used to change the thread priority or the process priority?

The man page for nice says "nice() adds inc to the nice value for the calling process." So, can we use it to change the nice value of a thread created by pthread_create?
EDIT:
It seems that we can set the nice value per thread.
I wrote an application, setting different nice values for different threads, and observed that the "nicer" thread has been scheduled with lower priority. Checking the output, I found that the string "high priority ................" gets outputted more frequently.
void * thread_function1(void *arg)
{
const pid_t tid = syscall(SYS_gettid);
int ret = setpriority(PRIO_PROCESS, tid, -10);
printf("tid of high priority thread %d , %d\n", tid ,getpriority(PRIO_PROCESS, tid));
while(1)
{
printf("high priority ................\n");
}
}
void * thread_function(void *arg)
{
const pid_t tid = syscall(SYS_gettid);
int ret = setpriority(PRIO_PROCESS, tid, 10);
printf("tid of low priority thread %d , %d \n", tid ,getpriority(PRIO_PROCESS, tid));
while(1)
{
printf("lower priority\n");
}
}
int main()
{
pthread_t id1;
pthread_t id2;
pid_t pid = getpid();
pid_t tid = syscall(SYS_gettid);
printf("main thread : pid = %d , tid = %d \n" , pid, tid);
pthread_create(&id1, NULL, thread_function1, NULL);
pthread_create(&id2, NULL,thread_function, NULL);
pthread_join(id1, NULL);
pthread_join(id2, NULL);
}
The pthreads man page says:
POSIX.1 also requires that threads share a range of other attributes
(i.e., these attributes are process-wide rather than per-thread):
[...]
nice value (setpriority(2))
So, theoretically, the "niceness" value is global to the process and shared by all threads, and you should not be able to set a specific niceness for one or more individual threads.
However, the very same man page also says:
LinuxThreads
The notable features of this implementation are the following:
[...]
Threads do not share a common nice value.
NPTL
[...]
NPTL still has a few non-conformances with POSIX.1:
Threads do not share a common nice value.
So it turns out that both threading implementations on Linux (LinuxThreads and NPTL) actually violate POSIX.1, and you can set a specific niceness for one or more individual threads by passing a tid to setpriority() on these systems.
According to the man page for setpriority, a lower nice value means higher priority in scheduling (on Linux, nice values range from -20 to 19). It looks like your program works as expected (nice = -10 gives this thread higher priority).
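A small sketch of the per-thread technique, relying on the Linux-specific behavior described above: the calling thread looks up its own kernel thread ID and passes it to setpriority and getpriority. The helper names are hypothetical. Raising the nice value works for any user, while lowering it usually requires privileges (CAP_SYS_NICE or a suitable RLIMIT_NICE).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: set the nice value of the calling thread only.
   Returns 0 on success, -1 with errno set on failure. */
int set_thread_nice(int nice_value)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);
    return setpriority(PRIO_PROCESS, (id_t)tid, nice_value);
}

/* Hypothetical helper: read the nice value of the calling thread.
   Returns 0 and stores the value in *nice_out, or -1 on failure. */
int get_thread_nice(int *nice_out)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);
    errno = 0;  /* -1 is a legitimate nice value, so errno decides */
    int value = getpriority(PRIO_PROCESS, (id_t)tid);
    if (value == -1 && errno != 0)
        return -1;
    *nice_out = value;
    return 0;
}
```

Clearing errno before getpriority is needed because a return value of -1 can be either an error or a valid nice value.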
I wanted to test how changing these values really affects the thread's priority, so I modified your snippet into this benchmark:
Running under the default SCHED_OTHER scheduling policy.
Created 12 low-priority threads to make sure they compete for resources, on Red Hat 7 with 8 cores (cat /proc/cpuinfo).
Modified thread_function() to do some "number crunching" work.
When setting the extreme priorities, you can definitely see with top -H that the high-priority thread runs more often, but the other threads are not starved. The relevant fields are NI and TIME+.
From top man page:
NI -- Nice Value
The nice value of the task. A negative nice value means
higher priority, whereas a positive nice value means lower
priority. Zero in this field simply means priority will not
be adjusted in determining a task's dispatch-ability.
TIME -- CPU Time
Total CPU time the task has used since it started.
#include<cstdio>
#include<pthread.h>
#include<unistd.h>
#include<sys/syscall.h>
#include<sys/resource.h>
#include <stdio.h>
#include <stdlib.h>
#define NUM_THREADS 12
struct ThreadParams{
const char* priority;
const int niceLevel;
};
void * thread_function(void *arg)
{
const pid_t tid = syscall(SYS_gettid);
struct ThreadParams* params = (ThreadParams*)arg;
int ret = setpriority(PRIO_PROCESS, tid, params->niceLevel);
printf("tid of %s priority thread %d , %d\n", params->priority, tid ,getpriority(PRIO_PROCESS, tid));
long long int count = 0;
while(1)
{
count++;
if(count == 10000000000) //10^10 iterations
{
printf("%s priority ................\n", params->priority);
count = 0;
}
}
}
int main()
{
pthread_t tIdHigh;
pthread_t tIdsLow[NUM_THREADS];
pid_t pid = getpid();
pid_t tid = syscall(SYS_gettid);
printf("main thread : pid = %d , tid = %d \n" , pid, tid);
struct ThreadParams highParams = {"High", -20};
struct ThreadParams lowParams = {"Low", 19};
for(int i=0; i < NUM_THREADS ; i++)
{
pthread_create(&(tIdsLow[i]), NULL,thread_function, &lowParams);
}
pthread_create(&tIdHigh, NULL, thread_function, &highParams);
for(int i=0; i < NUM_THREADS ; i++)
{
pthread_join(tIdsLow[i], NULL);
}
pthread_join(tIdHigh, NULL);
return 0;
}
Compiled with g++ <FILE_NAME>.cpp -lpthread.
Run top -H -p $(pidof <PROCESS_NAME>) to enable Threads-mode and get information for specific process

Resources