is nice() used to change the thread priority or the process priority? - linux

The man page for nice says "nice() adds inc to the nice value for the calling process." So, can we use it to change the nice value of a thread created by pthread_create?
EDIT:
It seems that we can set the nice value per thread.
I wrote an application, setting different nice values for different threads, and observed that the "nicer" thread has been scheduled with lower priority. Checking the output, I found that the string "high priority ................" gets outputted more frequently.
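#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/resource.h>
void * thread_function1(void *arg)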
{
const pid_t tid = syscall(SYS_gettid);
int ret = setpriority(PRIO_PROCESS, tid, -10);
printf("tid of high priority thread %d , %d\n", tid ,getpriority(PRIO_PROCESS, tid));
while(1)
{
printf("high priority ................\n");
}
}
void * thread_function(void *arg)
{
const pid_t tid = syscall(SYS_gettid);
int ret = setpriority(PRIO_PROCESS, tid, 10);
printf("tid of low priority thread %d , %d \n", tid ,getpriority(PRIO_PROCESS, tid));
while(1)
{
printf("lower priority\n");
}
}
int main()
{
pthread_t id1;
pthread_t id2;
pid_t pid = getpid();
pid_t tid = syscall(SYS_gettid);
printf("main thread : pid = %d , tid = %d \n" , pid, tid);
pthread_create(&id1, NULL, thread_function1, NULL);
pthread_create(&id2, NULL,thread_function, NULL);
pthread_join(id1, NULL);
pthread_join(id2, NULL);
}

The pthreads man page says:
POSIX.1 also requires that threads share a range of other attributes
(i.e., these attributes are process-wide rather than per-thread):
[...]
nice value (setpriority(2))
So, theoretically, the "niceness" value is global to the process and shared by all threads, and you should not be able to set a specific niceness for one or more individual threads.
However, the very same man page also says:
LinuxThreads
The notable features of this implementation are the following:
[...]
Threads do not share a common nice value.
NPTL
[...]
NPTL still has a few non-conformances with POSIX.1:
Threads do not share a common nice value.
So it turns out that both threading implementations on Linux (LinuxThreads and NPTL) actually violate POSIX.1, and you can set a specific niceness for one or more individual threads by passing a tid to setpriority() on these systems.
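As a minimal sketch of the per-thread approach described above, the helpers below wrap setpriority() and getpriority() with the kernel thread ID of the caller. The helper names are hypothetical and chosen only for illustration. The sketch assumes Linux with NPTL, where the ID returned by syscall(SYS_gettid) is accepted with PRIO_PROCESS; on a strictly POSIX-conforming system the nice value would be process-wide.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/resource.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread (hypothetical helper name). */
static pid_t current_tid(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Set the nice value of the calling thread only.
 * Returns 0 on success, -1 with errno set on failure.
 * Lowering the value usually needs CAP_SYS_NICE or a suitable RLIMIT_NICE. */
static int set_current_thread_nice(int nice_value)
{
    return setpriority(PRIO_PROCESS, (id_t)current_tid(), nice_value);
}

/* Read the nice value of the calling thread into *nice_value.
 * getpriority() may legitimately return -1, so errno is cleared first
 * and checked afterwards to tell a real error from a nice value of -1. */
static int get_current_thread_nice(int *nice_value)
{
    errno = 0;
    int value = getpriority(PRIO_PROCESS, (id_t)current_tid());
    if (value == -1 && errno != 0)
        return -1;
    *nice_value = value;
    return 0;
}
```

Each thread would call set_current_thread_nice() at the start of its thread function, much like the setpriority() calls in the question's code.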

According to the man page for setpriority, a lower nice value means higher scheduling priority (on Linux, nice values range from -20 to 19). It looks like your program works as expected (nice = -10 gives this thread higher priority).

I wanted to test how changing these values really affects the thread's priority, so I modified your snippet to this benchmark:
Running under the default SCHED_OTHER scheduling policy.
Created 12 low-priority threads to make sure they compete for CPU time, on Red Hat 7 with 8 cores (per cat /proc/cpuinfo).
Modified thread_function() to do some "number crunching" work.
With the nice values set to the extremes, top -H clearly shows that the high-priority thread runs more often, but the other threads are not starved. The relevant fields are NI and TIME+.
From top man page:
NI -- Nice Value
The nice value of the task. A negative nice value means
higher priority, whereas a positive nice value means lower
priority. Zero in this field simply means priority will not
be adjusted in determining a task's dispatch-ability.
TIME -- CPU Time
Total CPU time the task has used since it started.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/resource.h>
#define NUM_THREADS 12
struct ThreadParams{
const char* priority;
const int niceLevel;
};
void * thread_function(void *arg)
{
const pid_t tid = syscall(SYS_gettid);
struct ThreadParams* params = (ThreadParams*)arg;
int ret = setpriority(PRIO_PROCESS, tid, params->niceLevel);
printf("tid of %s priority thread %d , %d\n", params->priority, tid ,getpriority(PRIO_PROCESS, tid));
long long int count = 0;
while(1)
{
count++;
if(count == 10000000000) //10^10 iterations
{
printf("%s priority ................\n", params->priority);
count = 0;
}
}
}
int main()
{
pthread_t tIdHigh;
pthread_t tIdsLow[NUM_THREADS];
pid_t pid = getpid();
pid_t tid = syscall(SYS_gettid);
printf("main thread : pid = %d , tid = %d \n" , pid, tid);
struct ThreadParams highParams = {"High", -20};
struct ThreadParams lowParams = {"Low", 19};
for(int i=0; i < NUM_THREADS ; i++)
{
pthread_create(&(tIdsLow[i]), NULL,thread_function, &lowParams);
}
pthread_create(&tIdHigh, NULL, thread_function, &highParams);
for(int i=0; i < NUM_THREADS ; i++)
{
pthread_join(tIdsLow[i], NULL);
}
pthread_join(tIdHigh, NULL);
return 0;
}
Compiled with g++ <FILE_NAME>.cpp -lpthread.
Run top -H -p $(pidof <PROCESS_NAME>) to enable Threads-mode and get information for specific process

Related

Why EAGAIN in pthread_key_create happens?

Sometimes when I try to create key with pthread_key_create I'm getting EAGAIN error code. Is it possible to know exactly why?
Documentation says:
The system lacked the necessary resources to create another thread-specific data key, or the system-imposed limit on the total number of keys per process [PTHREAD_KEYS_MAX] would be exceeded.
How can I check whether the key limit was the cause? Is there some kind of monitoring tool that shows how many keys are already in use and how many can still be created?
One important thing about our code: we use fork() and have multiple processes running. And each process could have multiple threads.
I found that the key limit is not independent for each process when we use fork(). Here is a small example.
#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
size_t create_keys(pthread_key_t *keys, size_t number_of_keys)
{
size_t counter = 0;
for (size_t i = 0; i < number_of_keys; i++)
{
int e = pthread_key_create(keys + i, NULL);
if (e)
{
printf("ERROR (%d): index: %zu, pthread_key_create (%d)\n", getpid(), i, e);
break;
}
counter++;
}
return counter;
}
int main(int argc, char const *argv[])
{
printf("maximum number of thread keys: %ld\n", sysconf(_SC_THREAD_KEYS_MAX));
printf("process id: %d\n", getpid());
const size_t number_of_keys = 1024;
pthread_key_t keys_1[number_of_keys];
memset(keys_1, 0, number_of_keys * sizeof(pthread_key_t));
printf("INFO (%d): number of active keys: %zu\n", getpid(), create_keys(keys_1, number_of_keys));
pid_t p = fork();
if (p == 0)
{
printf("process id: %d\n", getpid());
pthread_key_t keys_2[number_of_keys];
memset(keys_2, 0, number_of_keys * sizeof(pthread_key_t));
printf("INFO (%d): number of active keys: %zu\n", getpid(), create_keys(keys_2, number_of_keys));
}
return 0;
}
When I run this example on Ubuntu 16.04, the child process cannot create any new thread keys if the parent has already created as many keys as the limit allows (1024). But if I use 512 keys each for the parent and child processes, it runs without error.
As you know, fork() traditionally works by copying the process in memory and then continuing execution from the same point within each copy as parent and child. This is what the return code of fork() indicates.
In order to perform fork(), the internals of the process must be duplicated. Memory, stack, open files, and probably thread local storage keys. Each system is different in its implementation of fork(). Some systems allow you to customise the areas of the process that get copied (see Linux clone(2) interface). However, the concept remains the same.
So, on to your example code: if you allocate 1024 keys in the parent, every child process inherits a full key table and has no spare keys to work with, resulting in the errors. If you allocate only 512 keys in the parent, then every child inherits a half-empty keys table and has 512 spare keys to play with, hence no errors arise.
Maximum value:
#include <unistd.h>
#include <stdio.h>
int main ()
{
printf ("%ld\n", sysconf(_SC_THREAD_KEYS_MAX));
return 0;
}
Consider using pthread_key_delete.
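There is no standard call that reports how many keys are in use, but the remaining capacity can be probed. The following is a rough sketch with a hypothetical function name: it creates keys until pthread_key_create() fails, then deletes them all again. It assumes no other thread creates or deletes keys while the probe runs.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Count how many thread-specific data keys this process can still create.
 * Keys are created until pthread_key_create() fails, then all of them are
 * deleted again so the probe leaves the key table as it found it.
 * Returns the count, or -1 if the bookkeeping array cannot be allocated. */
static long count_free_thread_keys(void)
{
    long limit = sysconf(_SC_THREAD_KEYS_MAX);
    if (limit <= 0)
        limit = 1024; /* assumption: fallback when the limit is not reported */

    pthread_key_t *keys = malloc((size_t)limit * sizeof(*keys));
    if (keys == NULL)
        return -1;

    long created = 0;
    while (created < limit && pthread_key_create(&keys[created], NULL) == 0)
        created++;

    for (long i = 0; i < created; i++)
        pthread_key_delete(keys[i]);

    free(keys);
    return created;
}
```

Calling this right after fork() in the child of the example above would be expected to report 0 when the parent holds 1024 keys, and a larger number once the child has released inherited keys with pthread_key_delete().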

Linux Thread priority , behaviour is abnormal

In the code snippet below, I create 6 threads, each with a different priority taken from the global priority array. Each thread continuously increments a global counter selected by its thread index. I expected the count to be higher for threads with higher priority, but the output does not follow the priorities; see the output below. I am trying this on Ubuntu 16.04 with Linux kernel 4.10.
Output:
Thread=0
Thread=3
Thread=2
Thread=5
Thread=1
Thread=4
pid=32155 count=4522138740
pid=32155 count=4509082289
pid=32155 count=4535088439
pid=32155 count=4517943246
pid=32155 count=4522643905
pid=32155 count=4519640181
Code:
#include <stdio.h>
#include <pthread.h>
#define FAILURE -1
#define MAX_THREADS 15
long int global_count[MAX_THREADS];
/* priority of each thread */
long int priority[]={1,20,40,60,80,99};
void clearGlobalCounts()
{
int i=0;
for(i=0;i<MAX_THREADS;i++)
global_count[i]=0;
}
/**
thread parameter is thread index
**/
void funcDoNothing(void *threadArgument)
{
int count=0;
int index = *((int *)threadArgument);
printf("Thread=%d\n",index);
clearGlobalCounts();
while(1)
{
count++;
if(count==100)
{
global_count[index]++;
count=0;
}
}
}
int main()
{
int i=0;
for(int i=0;i<sizeof(priority)/sizeof(long int);i++)
create_thread(funcDoNothing, i,priority[i]);
sleep(3600);
for(i=0;i<sizeof(priority)/sizeof(long int);i++)
{
printf("pid=%d count=%ld\n",getpid(),
global_count[i]);
}
}
create_thread(void *func,int thread_index,int priority)
{
pthread_attr_t attr;
struct sched_param schedParam;
void *pParm=NULL;
int id;
int * index = malloc(sizeof(int));
*index = thread_index;
void *res;
/* Initialize the thread attributes */
if (pthread_attr_init(&attr))
{
printf("Failed to initialize thread attrs\n");
return FAILURE;
}
if(pthread_attr_setschedpolicy(&attr, SCHED_FIFO))
{
printf("Failed to pthread_attr_setschedpolicy\n");
return FAILURE;
}
if (pthread_attr_setschedpolicy(&attr, SCHED_FIFO))
{
printf("Failed to setschedpolicy\n");
return FAILURE;
}
/* Set the capture thread priority */
pthread_attr_getschedparam(&attr, &schedParam);;
schedParam.sched_priority = sched_get_priority_max(SCHED_FIFO) - 1;
schedParam.sched_priority = priority;
if (pthread_attr_setschedparam(&attr, &schedParam))
{
printf("Failed to setschedparam\n");
return FAILURE;
}
pthread_create(&id, &attr, (void *)func, index);
}
The documentation for pthread_attr_setschedparam says:
In order for the parameter setting made by
pthread_attr_setschedparam() to have effect when calling
pthread_create(3), the caller must use pthread_attr_setinheritsched(3)
to set
the inherit-scheduler attribute of the attributes object attr to PTHREAD_EXPLICIT_SCHED.
So you have to call pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) , for example:
if (pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED) != 0) {
perror("pthread_attr_setinheritsched");
}
pthread_create(&id, &attr, (void *)func, index);
Note: Your code produces a lot of compiler warnings; you need to fix those. You do not want to test code that has a lot of undefined behavior, as indicated by some of the warnings. You should probably lower the sleep(3600) to just a few seconds, because once your threads are running under SCHED_FIFO they will hog the CPUs and the machine will appear frozen while they run.
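Putting the attribute calls together, a thread-creation helper might look like the sketch below. The function names are hypothetical. Creating a SCHED_FIFO thread normally needs root, CAP_SYS_NICE, or a suitable RLIMIT_RTPRIO, so pthread_create() is expected to return EPERM for an unprivileged process.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Create a thread that runs under SCHED_FIFO at the given priority.
 * Returns 0 on success or an error number (for example EPERM when the
 * process is not allowed to use real-time policies). */
static int create_fifo_thread(pthread_t *thread, int priority,
                              void *(*start)(void *), void *arg)
{
    pthread_attr_t attr;
    struct sched_param param = { .sched_priority = priority };

    int err = pthread_attr_init(&attr);
    if (err != 0)
        return err;

    /* Without PTHREAD_EXPLICIT_SCHED the policy and priority set below
     * are ignored and the new thread inherits the creator's scheduling. */
    err = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    if (err == 0)
        err = pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    if (err == 0)
        err = pthread_attr_setschedparam(&attr, &param);
    if (err == 0)
        err = pthread_create(thread, &attr, start, arg);

    pthread_attr_destroy(&attr);
    return err;
}

/* Thread body for checking the result: stores the policy the thread
 * actually runs under into the int that arg points to. */
static void *store_own_policy(void *arg)
{
    struct sched_param param;
    int policy = -1;
    if (pthread_getschedparam(pthread_self(), &policy, &param) == 0)
        *(int *)arg = policy;
    return NULL;
}
```

Checking the policy from inside the thread, as store_own_policy() does, is a quick way to confirm that the attributes really took effect.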

pthread_cond_broadcast problem

Using pthreads on Linux 2.6.30, I am trying to send a single signal that will cause multiple threads to begin execution. The broadcast seems to be received by only one thread. I have tried both pthread_cond_signal and pthread_cond_broadcast, and both seem to have the same behavior. For the mutex in pthread_cond_wait, I have tried both common mutexes and separate (local) mutexes with no apparent difference.
worker_thread(void *p)
{
// setup stuff here
printf("Thread %d ready for action \n", p->thread_no);
pthread_cond_wait(p->cond_var, p->mutex);
printf("Thread %d off to work \n", p->thread_no);
// work stuff
}
dispatch_thread(void *p)
{
// setup stuff
printf("Wakeup, everyone ");
pthread_cond_broadcast(p->cond_var);
printf("everyone should be working \n");
// more stuff
}
main()
{
pthread_cond_init(cond_var);
for (i=0; i!=num_cores; i++) {
pthread_create(worker_thread...);
}
pthread_create(dispatch_thread...);
}
Output:
Thread 0 ready for action
Thread 1 ready for action
Thread 2 ready for action
Thread 3 ready for action
Wakeup, everyone
everyone should be working
Thread 0 off to work
What's a good way to send signals to all the threads?
First off, you should have the mutex locked at the point where you call pthread_cond_wait(). It's generally a good idea to hold the mutex when you call pthread_cond_broadcast(), as well.
Second off, you should loop calling pthread_cond_wait() while the wait condition is true. Spurious wakeups can happen, and you must be able to handle them.
Finally, your actual problem: you are signaling all threads, but some of them aren't waiting yet when the signal is sent. Your main thread and dispatch thread are racing your worker threads: if the main thread can launch the dispatch thread, and the dispatch thread can grab the mutex and broadcast on it before the worker threads can, then those worker threads will never wake up.
You need a synchronization point prior to signaling where you wait to signal till all threads are known to be waiting for the signal. That, or you can keep signaling till you know all threads have been woken up.
In this case, you could use the mutex to protect a count of sleeping threads. Each thread grabs the mutex and increments the count. If the count matches the number of worker threads, then it is the last thread to increment the count, so it signals on another condition variable (sharing the same mutex) to tell the sleeping dispatch thread that all threads are ready. The thread then waits on the original condition, which causes it to release the mutex.
If the dispatch thread wasn't sleeping yet when the last worker thread signals on that condition, it will find that the count already matches the desired count and not bother waiting, but immediately broadcast on the shared condition to wake workers, who are now guaranteed to all be sleeping.
Anyway, here's some working source code that fleshes out your sample code and includes my solution:
#include <stdio.h>
#include <pthread.h>
#include <err.h> /* warnc() below is a BSD extension; glibc's <err.h> does not provide it */
static const int num_cores = 8;
struct sync {
pthread_mutex_t *mutex;
pthread_cond_t *cond_var;
int thread_no;
};
static int sleeping_count = 0;
static pthread_cond_t all_sleeping_cond = PTHREAD_COND_INITIALIZER;
void *
worker_thread(void *p_)
{
struct sync *p = p_;
// setup stuff here
pthread_mutex_lock(p->mutex);
printf("Thread %d ready for action \n", p->thread_no);
sleeping_count += 1;
if (sleeping_count >= num_cores) {
/* Last worker to go to sleep. */
pthread_cond_signal(&all_sleeping_cond);
}
int err = pthread_cond_wait(p->cond_var, p->mutex);
if (err) warnc(err, "pthread_cond_wait");
printf("Thread %d off to work \n", p->thread_no);
pthread_mutex_unlock(p->mutex);
// work stuff
return NULL;
}
void *
dispatch_thread(void *p_)
{
struct sync *p = p_;
// setup stuff
pthread_mutex_lock(p->mutex);
while (sleeping_count < num_cores) {
pthread_cond_wait(&all_sleeping_cond, p->mutex);
}
printf("Wakeup, everyone ");
int err = pthread_cond_broadcast(p->cond_var);
if (err) warnc(err, "pthread_cond_broadcast");
printf("everyone should be working \n");
pthread_mutex_unlock(p->mutex);
// more stuff
return NULL;
}
int
main(void)
{
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond_var = PTHREAD_COND_INITIALIZER;
pthread_t worker[num_cores];
struct sync info[num_cores];
for (int i = 0; i < num_cores; i++) {
struct sync *p = &info[i];
p->mutex = &mutex;
p->cond_var = &cond_var;
p->thread_no = i;
pthread_create(&worker[i], NULL, worker_thread, p);
}
pthread_t dispatcher;
struct sync p = {&mutex, &cond_var, num_cores};
pthread_create(&dispatcher, NULL, dispatch_thread, &p);
pthread_exit(NULL);
/* not reached */
return 0;
}
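The advice about looping on the wait condition can also be sketched with an explicit flag, which is a different technique from the sleeping-count handshake used in the code above. With a flag, a worker that arrives after the broadcast sees the flag already set and never blocks, and spurious wakeups are harmless. All names below are hypothetical.

```c
#include <pthread.h>

/* A one-shot start gate: workers block until the gate is opened. */
struct start_gate {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    int open;     /* predicate protected by mutex */
    int released; /* number of workers that have passed the gate */
};

#define START_GATE_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

/* Block until the gate is open. The loop re-checks the predicate, so a
 * spurious wakeup or a broadcast sent before the wait cannot be missed. */
static void gate_wait(struct start_gate *gate)
{
    pthread_mutex_lock(&gate->mutex);
    while (!gate->open)
        pthread_cond_wait(&gate->cond, &gate->mutex);
    gate->released++;
    pthread_mutex_unlock(&gate->mutex);
}

/* Open the gate and wake every thread currently waiting on it. */
static void gate_open(struct start_gate *gate)
{
    pthread_mutex_lock(&gate->mutex);
    gate->open = 1;
    pthread_cond_broadcast(&gate->cond);
    pthread_mutex_unlock(&gate->mutex);
}

/* Example worker: wait at the gate, then the real work would start. */
static void *gate_worker(void *arg)
{
    gate_wait(arg);
    return NULL;
}
```

The dispatcher only needs to call gate_open() once; it does not have to know whether the workers have reached pthread_cond_wait() yet.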

how to set CPU affinity of a particular pthread?

I'd like to specify the cpu-affinity of a particular pthread. All the references I've found so far deal with setting the cpu-affinity of a process (pid_t) not a thread (pthread_t). I tried some experiments passing pthread_t's around and as expected they fail. Am I trying to do something impossible? If not, can you send a pointer please? Thanks a million.
This is a wrapper I've made to make my life easier. Its effect is that the calling thread gets "stuck" to the core with id core_id:
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <unistd.h>
// core_id = 0, 1, ... n-1, where n is the system's number of cores
int stick_this_thread_to_core(int core_id) {
int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
if (core_id < 0 || core_id >= num_cores)
return EINVAL;
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(core_id, &cpuset);
pthread_t current_thread = pthread_self();
return pthread_setaffinity_np(current_thread, sizeof(cpu_set_t), &cpuset);
}
Assuming linux:
The interface to setting the affinity is - as you've probably already discovered:
int sched_setaffinity(pid_t pid,size_t cpusetsize,cpu_set_t *mask);
Pass 0 as the pid and it applies to the calling thread only. Alternatively, have other threads report their kernel thread ID with the Linux-specific call pid_t gettid(void); and pass that in as the pid.
Quoting the man page
The affinity mask is actually a per-thread attribute that can be
adjusted independently for each of the threads in a thread group.
The value returned from a call to gettid(2) can be passed in the
argument pid. Specifying pid as 0 will set the attribute for the
calling thread, and passing the value returned from a call to
getpid(2) will set the attribute for the main thread of the thread
group. (If you are using the POSIX threads API, then use
pthread_setaffinity_np(3) instead of sched_setaffinity().)
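Since the question asks about a pthread_t rather than a pid_t, here is a small sketch that follows the man page's suggestion and uses the pthread_*affinity_np() calls, which take a pthread_t directly. The helper names are hypothetical, and the calls are GNU extensions, so _GNU_SOURCE must be defined before any header is included.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Lowest-numbered CPU the given thread may currently run on, or -1. */
static int first_allowed_cpu(pthread_t thread)
{
    cpu_set_t set;
    if (pthread_getaffinity_np(thread, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the given thread to a single CPU.
 * Returns 0 on success or an error number. */
static int pin_thread(pthread_t thread, int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return EINVAL;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}

/* Nonzero if the thread's affinity mask contains exactly the given CPU. */
static int thread_is_pinned_to(pthread_t thread, int cpu)
{
    cpu_set_t set;
    if (pthread_getaffinity_np(thread, sizeof(set), &set) != 0)
        return 0;
    return CPU_COUNT(&set) == 1 && CPU_ISSET(cpu, &set);
}
```

A creating thread can pass the pthread_t it got from pthread_create() to pin_thread(), or a thread can pin itself with pthread_self().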
//compilation: gcc -o affinity affinity.c -lpthread
#define _GNU_SOURCE
#include <sched.h> //cpu_set_t , CPU_SET
#include <pthread.h> //pthread_t
#include <stdio.h>
void *th_func(void * arg);
int main(void) {
pthread_t thread; //the thread
pthread_create(&thread,NULL,th_func,NULL);
pthread_join(thread,NULL);
return 0;
}
void *th_func(void * arg)
{
//we can set one or more bits here, each one representing a single CPU
cpu_set_t cpuset;
//the CPU we want to use
int cpu = 2;
CPU_ZERO(&cpuset); //clears the cpuset
CPU_SET( cpu , &cpuset); //set CPU 2 on cpuset
/*
* cpu affinity for the calling thread
* first parameter is the pid, 0 = calling thread
* second parameter is the size of your cpuset
* third param is the cpuset in which your thread will be
* placed. Each bit represents a CPU
*/
sched_setaffinity(0, sizeof(cpuset), &cpuset);
while (1)
; //burns CPU 2
return 0;
}
On Linux you can use CPU sets to control
which CPUs can be used by processes or pthreads.
This type of control is called CPU affinity.
The function sched_setaffinity() takes a kernel thread ID
(as returned by gettid(2), not a pthread_t) and
a cpuset as parameters.
When you pass 0 as the first parameter, the calling thread
is affected.
Below is an example program that sets the CPU affinity of a particular pthread.
Add the appropriate headers (<math.h>, <pthread.h>, <sched.h>, <stdio.h>, <string.h>).
double waste_time(long n)
{
double res = 0;
long i = 0;
while (i <n * 200000) {
i++;
res += sqrt(i);
}
return res;
}
void *thread_func(void *param)
{
cpu_set_t mask;
int err;
/* bind this thread to processor 0 */
CPU_ZERO(&mask);
CPU_SET(0, &mask);
err = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
if (err != 0) {
/* pthread functions return the error number; they do not set errno */
fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
}
/* waste some time so the work is visible with "top" */
printf("result: %f\n", waste_time(2000));
/* the thread switches to processor 1 now */
CPU_ZERO(&mask);
CPU_SET(1, &mask);
err = pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
if (err != 0) {
fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
}
/* waste some more time to see the processor switch */
printf("result: %f\n", waste_time(2000));
return NULL;
}
int main(int argc, char *argv[])
{
pthread_t my_thread;
if (pthread_create(&my_thread, NULL, thread_func, NULL) != 0) {
perror("pthread_create");
}
pthread_exit(NULL);
}
Compile the above program with the -D_GNU_SOURCE flag, and link with -lpthread and -lm (for sqrt).
The scheduler will change the cpu affinity as it sees fit; to set it persistently please see cpuset in /proc file system.
http://man7.org/linux/man-pages/man7/cpuset.7.html
Or you can write a short program that sets the cpu affinity periodically (every few seconds) with sched_setaffinity

msemaphore on linux?

AIX (and HPUX if anyone cares) have a nice little feature called msemaphores that make it easy to synchronize granular pieces (e.g. records) of memory-mapped files shared by multiple processes. Is anyone aware of something comparable in linux?
To be clear, the msemaphore functions are described by following the related links here.
POSIX semaphores can be placed in memory shared between processes, if the second argument to sem_init(3), "pshared", is true. This seems to be the same as what msem does.
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>
int main() {
void *shared;
sem_t *sem;
int counter, *data;
pid_t pid;
srand(time(NULL));
shared = mmap(NULL, sysconf(_SC_PAGE_SIZE), PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_SHARED, -1, 0);
sem_init(sem = shared, 1, 1);
data = (int *)((char *)shared + sizeof(sem_t)); /* arithmetic on void * is a GNU extension */
counter = *data = 0;
pid = fork();
while (1) {
sem_wait(sem);
if (pid)
printf("ping>%d %d\n", data[0] = rand(), data[1] = rand());
else if (counter != data[0]) {
printf("pong<%d", counter = data[0]);
sleep(2);
printf(" %d\n", data[1]);
}
sem_post(sem);
if (pid) sleep(1);
}
}
This is a pretty dumb test, but it works:
$ cc -o test -lrt test.c
$ ./test
ping>2098529942 315244699
pong<2098529942 315244699
pong<1195826161 424832009
ping>1195826161 424832009
pong<1858302907 1740879454
ping>1858302907 1740879454
ping>568318608 566229809
pong<568318608 566229809
ping>1469118213 999421338
pong<1469118213 999421338
ping>1247594672 1837310825
pong<1247594672 1837310825
ping>478016018 1861977274
pong<478016018 1861977274
ping>1022490459 935101133
pong<1022490459 935101133
...
Because the semaphore is shared between the two processes, the pongs don't get interleaved data from the pings despite the sleeps.
This can be done using POSIX shared-memory mutexes:
pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED); /* takes the value, not a pointer */
pthread_mutex_init(&some_shared_mmap_structure.mutex, &attr);
pthread_mutexattr_destroy(&attr);
Now you can unlock and lock &some_shared_mmap_structure.mutex using ordinary pthread_mutex_lock() etc calls, from multiple processes that have it mapped.
Indeed, you can even implement the msem API in terms of this: (untested)
#include <assert.h>
#include <pthread.h>
typedef struct msemaphore {
pthread_mutex_t mut;
} msemaphore;
#define MSEM_LOCKED 1
#define MSEM_UNLOCKED 0
#define MSEM_IF_NOWAIT 1
msemaphore *msem_init(msemaphore *msem_p, int initialvalue) {
pthread_mutexattr_t attr;
assert(((unsigned long)msem_p & 7) == 0); // check alignment
pthread_mutexattr_init(&attr);
pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED); // might fail, you should probably check
pthread_mutex_init(&msem_p->mut, &attr); // never fails
pthread_mutexattr_destroy(&attr);
if (initialvalue)
pthread_mutex_lock(&msem_p->mut);
return msem_p;
}
int msem_remove(msemaphore *msem) {
return pthread_mutex_destroy(&msem->mut) ? -1 : 0;
}
int msem_lock(msemaphore *msem, int cond) {
int ret;
if (cond == MSEM_IF_NOWAIT)
ret = pthread_mutex_trylock(&msem->mut);
else
ret = pthread_mutex_lock(&msem->mut);
return ret ? -1 : 0;
}
int msem_unlock(msemaphore *msem, int cond) {
// pthreads does not allow us to directly ascertain whether there are
// waiters. However, a unlock/trylock with no contention is -very- fast
// using linux's pthreads implementation, so just do that instead if
// you care.
//
// nb, only fails if the mutex is not initialized
return pthread_mutex_unlock(&msem->mut) ? -1 : 0;
}
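To show the process-shared mutex approach end to end, the sketch below places a mutex and a counter in an anonymous shared mapping, forks, and has both processes increment the counter under the lock. The structure and function names are hypothetical; the sketch assumes Linux, where MAP_ANONYMOUS | MAP_SHARED memory is inherited across fork().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct shared_counter {
    pthread_mutex_t mutex; /* initialized as PTHREAD_PROCESS_SHARED */
    long value;
};

/* Map a counter into memory shared with future children.
 * Returns NULL if the mapping or the mutex setup fails. */
static struct shared_counter *shared_counter_create(void)
{
    struct shared_counter *c = mmap(NULL, sizeof(*c), PROT_READ | PROT_WRITE,
                                    MAP_ANONYMOUS | MAP_SHARED, -1, 0);
    if (c == MAP_FAILED)
        return NULL;

    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err == 0) {
        err = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        if (err == 0)
            err = pthread_mutex_init(&c->mutex, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    if (err != 0) {
        munmap(c, sizeof(*c));
        return NULL;
    }
    c->value = 0;
    return c;
}

/* Increment the counter `times` times, taking the lock for each step. */
static void shared_counter_add(struct shared_counter *c, long times)
{
    for (long i = 0; i < times; i++) {
        pthread_mutex_lock(&c->mutex);
        c->value++;
        pthread_mutex_unlock(&c->mutex);
    }
}

/* Fork once; parent and child each add `times` to the counter.
 * Returns 0 if the child ran and exited cleanly, -1 otherwise. */
static int shared_counter_race(struct shared_counter *c, long times)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        shared_counter_add(c, times);
        _exit(0);
    }
    shared_counter_add(c, times);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

For per-record locking in a memory-mapped file, the same initialization would be applied to a mutex embedded in each record, with the file mapped MAP_SHARED by every process.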
Under Linux, you may be able to achieve what you want with SysV shared memory; quick googling turned up this (rather old) guide that may be of help.

Resources