operating issues qustions - threads, processes etc. for the above code: - multithreading

int S1 = 0;
int S2 = 0;
int x = 0;
int run = 1;
void Producer(void) {
while(run) {
while (S1 == S2);
x++;
__sync_synchronize();
S1 = S2;
__sync_synchronize();
}
}
void Consumer(void) {
while(run) {
while (S1 != S2);
x--;
__sync_synchronize();
S1 = !S2;
__sync_synchronize();
}
}
void* Worker(void *func) {
long func_id = (long)func & 0x1;
printf("%s %d\n",__func__, (int)func_id);
switch (func_id) {
case 0:
Producer();
break;
case 1:
Consumer();
break;
}
return NULL;
}
int main(int argc, char *argv[]) {
pthread_t t[argc];
pthread_attr_t at;
cpu_set_t cpuset;
int threads;
int i;
#define MAX_PROCESSORS 4 // Minimal processors is 2.
threads = argc > 1 ? (( atoi(argv[1]) < 4) ? atoi(argv[1]): MAX_PROCESSORS ) : 1;
for (i = 0;i < threads; i++){
CPU_ZERO(&cpuset);
CPU_SET(i, &cpuset);
pthread_attr_init(&at);
(&at, sizeof(cpuset), &cpuset);
if (pthread_create(&t[i], &at , Worker, (void *) (long)i) ) {
perror("pthread create 1 error\n"); }
}
do {
sleep(1);
} while(x < 0);
run = 0;
void *val;
for(i = 0; i < threads; i++)
pthread_join(t[i], &val);
printf("x=%d\n", x);
}
The questions:
In ex1.c (6.1), which of the following properties achieved:
(1) Mutual exclusion but not progress
(2) Progress but not mutual exclusion
(3) Neither mutual exclusion nor progress
(4) Both mutual exclusion and progress
Please explain?
1.2
To which arguments (in 6.1) is correct and which does not:
(1) always exits. when threads = 2 or threads <= 0
(2) always hangs. threads = 1 or thread > 2
Any help would be much appreeciated

Related

Why is my multithreaded C program not working on macOS, but completely fine on Linux?

I have written a multithreaded program in C using pthreads to solve the N-queens problem. It uses the producer consumer programming model. One producer who creates all possible combinations and consumers who evaluate if the combination is valid. I use a shared buffer that can hold one combination at a time.
Once I have 2+ consumers the program starts to behave strange. I get more consumptions than productions. 1.5:1 ratio approx (should be 1:1). The interesting part is that this only happens on my MacBook and is nowhere to be seen when I run it on the Linux machine (Red Hat Enterprise Linux Workstation release 6.10 (Santiago)) I have access to over SSH.
I'm quite sure that my implementation is correct with locks and conditional variables too, the program runs for 10+ seconds which should reveal if there are any mistakes with the synchronization.
I compile with GCC (Apple clang version 12.0.5) via xcode developer tools on my MacBook Pro (2020, x86_64) and GCC on Linux too, but version 4.4.7 20120313 (Red Hat 4.4.7-23).
compile: gcc -o 8q 8q.c
run: ./8q <producers> <N>, NxN chess board, N queens to place
parameters: ./8q 2 4 Enough to highlight the problem (should yield 2 solutions, but every other run yields 3+ solutions, i.e duplicate solutions exist
note: print(printouts) Visualizes the valid solutions (duplicates shown)
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <assert.h>
typedef struct stack_buf {
int positions[8];
int top;
} stack_buf;
typedef struct global_buf {
int positions[8];
volatile int buf_empty;
volatile long done;
} global_buf;
typedef struct print_buf {
int qpositions[100][8];
int top;
} print_buf;
stack_buf queen_comb = { {0}, 0 };
global_buf global = { {0}, 1, 0 };
print_buf printouts = { {{0}}, -1 };
int N; //NxN board and N queens to place
clock_t start, stop, diff;
pthread_mutex_t buffer_mutex, print_mutex;
pthread_cond_t empty, filled;
/* ##########################################################################################
################################## VALIDATION FUNCTIONS ##################################
########################################################################################## */
/* Validate that no queens are placed on the same row */
int valid_rows(int qpositions[]) {
int rows[N];
memset(rows, 0, N * sizeof(int));
int row;
for (int i = 0; i < N; i++) {
row = qpositions[i] / N;
if (rows[row] == 0) rows[row] = 1;
else return 0;
}
return 1;
}
/* Validate that no queens are placed in the same column */
int valid_columns(int qpositions[]) {
int columns[N];
memset(columns, 0, N*sizeof(int));
int column;
for (int i = 0; i < N; i++) {
column = qpositions[i] % N;
if (columns[column] == 0) columns[column] = 1;
else return 0;
}
return 1;
}
/* Validate that left and right diagonals aren't used by another queen */
int valid_diagonals(int qpositions[]) {
int left_bottom_diagonals[N];
int right_bottom_diagonals[N];
int row, col, temp_col, temp_row, fill_value, index;
for (int queen = 0; queen < N; queen++) {
row = qpositions[queen] / N;
col = qpositions[queen] % N;
/* position --> left down diagonal endpoint (index) */
fill_value = col < row ? col : row; //min of col and row
temp_row = row - fill_value;
temp_col = col - fill_value;
index = temp_row * N + temp_col; // position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (left_bottom_diagonals[i] == index) return 0;
}
left_bottom_diagonals[queen] = index; // no interference
/* position --> right down diagonal endpoint (index) */
fill_value = (N-1) - col < row ? N - col - 1 : row; // closest to bottom or right wall
temp_row = row - fill_value;
temp_col = col + fill_value;
index = temp_row * N + temp_col; // position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (right_bottom_diagonals[i] == index) return 0;
}
right_bottom_diagonals[queen] = index; // no interference
};
return 1;
}
/* ##########################################################################################
#################################### HELPER FUNCTIONS ####################################
########################################################################################## */
/* print the collected solutions */
void print(print_buf printouts) {
static int solution_number = 1;
int placement;
for (int sol = 0; sol <= printouts.top; sol++) { // number of solutions
printf("Solution %d: [ ", solution_number++);
for (int pos = 0; pos < N; pos++) {
printf("%d ", printouts.qpositions[sol][pos]+1);
}
printf("]\n");
printf("Placement:\n");
for (int i = 1; i <= N; i++) { // rows
printf("[ ");
placement = printouts.qpositions[sol][N-i];
for (int j = (N-i)*N; j < (N-i)*N+N; j++) { // physical position
if (j == placement) {
printf(" Q ");
} else printf("%2d ", j+1);
}
printf("]\n");
}
printf("\n");
}
}
/* push value to top of list instance */
void push(stack_buf *instance, int value) {
assert(instance->top <= 8 || instance->top >= 0);
instance->positions[instance->top++] = value;
}
/* pop top element of list instance */
void pop(stack_buf *instance) {
assert(instance->top > 0);
instance->positions[--instance->top] = -1;
}
/* ##########################################################################################
#################################### THREAD FUNCTIONS ####################################
########################################################################################## */
static int consumptions = 0;
/* entry point for each worker (consumer)
workers will check each queen's row, column and
diagonal to evaluate satisfactory placements */
void *eval_positioning(void *id) {
long thr_id = (long)id;
int qpositions[N];
while (!global.done) {
pthread_mutex_lock(&buffer_mutex);
while (global.buf_empty == 1) {
if (global.done) break; // consumers who didn't get last production
pthread_cond_wait(&filled, &buffer_mutex);
}
if (global.done) break;
consumptions++;
memcpy(qpositions, global.positions, N * sizeof(int)); // retrieve queen combination
global.buf_empty = 1;
pthread_cond_signal(&empty);
pthread_mutex_unlock(&buffer_mutex);
if (valid_rows(qpositions) && valid_columns(qpositions) && valid_diagonals(qpositions)) {
/* save for printing later */
pthread_mutex_lock(&print_mutex);
memcpy(printouts.qpositions[++printouts.top], qpositions, N * sizeof(int));
pthread_mutex_unlock(&print_mutex);
}
}
return NULL;
}
static int productions = 0;
/* recursively generate all possible queen_combs */
void rec_positions(int pos, int queens) {
if (queens == 0) { // base case
pthread_mutex_lock(&buffer_mutex);
while (global.buf_empty == 0) {
pthread_cond_wait(&empty, &buffer_mutex);
}
productions++;
memcpy(global.positions, queen_comb.positions, N * sizeof(int));
global.buf_empty = 0;
pthread_mutex_unlock(&buffer_mutex);
pthread_cond_broadcast(&filled); // wake one worker
return;
}
for (int i = pos; i <= N*N - queens; i++) {
push(&queen_comb, i); // physical chess box
rec_positions(i+1, queens-1);
pop(&queen_comb);
}
}
/* binomial coefficient | without order, without replacement
8 queens on 8x8 board: 4'426'165'368 queen combinations */
void *generate_positions(void *arg) {
rec_positions(0, N);
return (void*)1;
}
/* ##########################################################################################
########################################## MAIN ##########################################
########################################################################################## */
/* main procedure of the program */
int main(int argc, char *argv[]) {
if (argc < 3) {
printf("usage: ./8q <workers> <board width/height>\n");
exit(1);
}
int workers = atoi(argv[1]);
N = atoi(argv[2]);
pthread_t thr[workers];
pthread_t producer;
// int sol1[] = {5,8,20,25,39,42,54,59};
// int sol2[] = {2,12,17,31,32,46,51,61};
printf("\n");
start = (float)clock()/CLOCKS_PER_SEC;
pthread_create(&producer, NULL, generate_positions, NULL);
for (long i = 0; i < workers; i++) {
pthread_create(&thr[i], NULL, eval_positioning, (void*)i+1);
}
pthread_join(producer, (void*)&global.done);
pthread_cond_broadcast(&filled);
for (int i = 0; i < workers; i++) {
pthread_join(thr[i], NULL);
}
stop = clock();
diff = (double)(stop - start) / CLOCKS_PER_SEC;
/* go through all valid solutions and print */
print(printouts);
printf("board: %dx%d, workers: %d (+1), exec time: %ld, solutions: %d\n", N, N, workers, diff, printouts.top+1);
printf("productions: %d\nconsumptions: %d\n", productions, consumptions);
return 0;
}
EDIT: I have reworked sync around prod_done and made a new shared variable last_done. When producer is done, it will set prod_done and the thread currently active will either return (last element already validated) or capture the last element at set last_done to inform the other consumers.
Despite the fact that I solved the data race in my book, I still have problems with the shared combination. I have really put time looking into the synchronization but I always get back to the feeling that it should work, but it clearly doesn't when I run it.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <assert.h>
typedef struct stack_buf {
int positions[8];
int top;
} stack_buf;
typedef struct global_buf {
int positions[8];
volatile int buf_empty;
volatile long prod_done;
volatile int last_done;
} global_buf;
typedef struct print_buf {
int qpositions[100][8];
int top;
} print_buf;
stack_buf queen_comb = { {0}, 0 };
global_buf global = { {0}, 1, 0, 0 };
print_buf printouts = { {{0}}, -1 };
int N; //NxN board and N queens to place
long productions, consumptions = 0;
clock_t start, stop, diff;
pthread_mutex_t buffer_mutex, print_mutex;
pthread_cond_t empty, filled;
/* ##########################################################################################
################################## VALIDATION FUNCTIONS ##################################
########################################################################################## */
/* Validate that no queens are placed on the same row */
int valid_rows(int qpositions[]) {
int rows[N];
memset(rows, 0, N*sizeof(int));
int row;
for (int i = 0; i < N; i++) {
row = qpositions[i] / N;
if (rows[row] == 0) rows[row] = 1;
else return 0;
}
return 1;
}
/* Validate that no queens are placed in the same column */
int valid_columns(int qpositions[]) {
int columns[N];
memset(columns, 0, N*sizeof(int));
int column;
for (int i = 0; i < N; i++) {
column = qpositions[i] % N;
if (columns[column] == 0) columns[column] = 1;
else return 0;
}
return 1;
}
/* Validate that left and right diagonals aren't used by another queen */
int valid_diagonals(int qpositions[]) {
int left_bottom_diagonals[N];
int right_bottom_diagonals[N];
int row, col, temp_col, temp_row, fill_value, index;
for (int queen = 0; queen < N; queen++) {
row = qpositions[queen] / N;
col = qpositions[queen] % N;
/* position --> left down diagonal endpoint (index) */
fill_value = col < row ? col : row; // closest to bottom or left wall
temp_row = row - fill_value;
temp_col = col - fill_value;
index = temp_row * N + temp_col; // board position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (left_bottom_diagonals[i] == index) return 0;
}
left_bottom_diagonals[queen] = index; // no interference
/* position --> right down diagonal endpoint (index) */
fill_value = (N-1) - col < row ? N - col - 1 : row; // closest to bottom or right wall
temp_row = row - fill_value;
temp_col = col + fill_value;
index = temp_row * N + temp_col; // board position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (right_bottom_diagonals[i] == index) return 0;
}
right_bottom_diagonals[queen] = index; // no interference
}
return 1;
}
/* ##########################################################################################
#################################### HELPER FUNCTIONS ####################################
########################################################################################## */
/* print the collected solutions */
void print(print_buf printouts) {
static int solution_number = 1;
int placement;
for (int sol = 0; sol <= printouts.top; sol++) { // number of solutions
printf("Solution %d: [ ", solution_number++);
for (int pos = 0; pos < N; pos++) {
printf("%d ", printouts.qpositions[sol][pos]+1);
}
printf("]\n");
printf("Placement:\n");
for (int i = 1; i <= N; i++) { // rows
printf("[ ");
placement = printouts.qpositions[sol][N-i];
for (int j = (N-i)*N; j < (N-i)*N+N; j++) { // physical position
if (j == placement) {
printf(" Q ");
} else printf("%2d ", j+1);
}
printf("]\n");
}
printf("\n");
}
}
/* ##########################################################################################
#################################### THREAD FUNCTIONS ####################################
########################################################################################## */
/* entry point for each worker (consumer)
workers will check each queen's row, column and
diagonal to evaluate satisfactory placements */
void *eval_positioning(void *id) {
long thr_id = (long)id;
int qpositions[N];
pthread_mutex_lock(&buffer_mutex);
while (!global.last_done) {
while (global.buf_empty == 1) {
pthread_cond_wait(&filled, &buffer_mutex);
if (global.last_done) { // last_done ==> prod_done, so thread returns
pthread_mutex_unlock(&buffer_mutex);
return NULL;
}
if (global.prod_done) { // prod done, current thread takes last elem produced
global.last_done = 1;
break;
}
}
if (!global.last_done) consumptions++;
memcpy(qpositions, global.positions, N*sizeof(int)); // retrieve queen combination
global.buf_empty = 1;
pthread_mutex_unlock(&buffer_mutex);
pthread_cond_signal(&empty);
if (valid_rows(qpositions) && valid_columns(qpositions) && valid_diagonals(qpositions)) {
/* save for printing later */
pthread_mutex_lock(&print_mutex);
memcpy(printouts.qpositions[++printouts.top], qpositions, N*sizeof(int));
pthread_mutex_unlock(&print_mutex);
}
pthread_mutex_lock(&buffer_mutex);
}
pthread_mutex_unlock(&buffer_mutex);
return NULL;
}
/* recursively generate all possible queen_combs */
void rec_positions(int pos, int queens) {
if (queens == 0) { // base case
pthread_mutex_lock(&buffer_mutex);
while (global.buf_empty == 0) {
pthread_cond_wait(&empty, &buffer_mutex);
}
productions++;
memcpy(global.positions, queen_comb.positions, N*sizeof(int));
global.buf_empty = 0;
pthread_mutex_unlock(&buffer_mutex);
pthread_cond_signal(&filled);
return;
}
for (int i = pos; i <= N*N - queens; i++) {
queen_comb.positions[queen_comb.top++] = i;
rec_positions(i+1, queens-1);
queen_comb.top--;
}
}
/* binomial coefficient | without order, without replacement
8 queens on 8x8 board: 4'426'165'368 queen combinations */
void *generate_positions(void *arg) {
rec_positions(0, N);
return (void*)1;
}
/* ##########################################################################################
########################################## MAIN ##########################################
########################################################################################## */
/* main procedure of the program */
int main(int argc, char *argv[]) {
if (argc < 3) {
printf("usage: ./8q <workers> <board width/height>\n");
exit(1);
}
int workers = atoi(argv[1]);
N = atoi(argv[2]);
pthread_t thr[workers];
pthread_t producer;
printf("\n");
start = (float)clock()/CLOCKS_PER_SEC;
pthread_create(&producer, NULL, generate_positions, NULL);
for (long i = 0; i < workers; i++) {
pthread_create(&thr[i], NULL, eval_positioning, (void*)i+1);
}
pthread_join(producer, (void*)&global.prod_done);
pthread_cond_broadcast(&filled);
for (int i = 0; i < workers; i++) {
printf("thread #%d done\n", i+1);
pthread_join(thr[i], NULL);
pthread_cond_broadcast(&filled);
}
stop = clock();
diff = (double)(stop - start) / CLOCKS_PER_SEC;
/* go through all valid solutions and print */
print(printouts);
printf("board: %dx%d, workers: %d (+1), exec time: %ld, solutions: %d\n", N, N, workers, diff, printouts.top+1);
printf("productions: %ld\nconsumptions: %ld\n", productions, consumptions);
return 0;
}
I'm quite sure that my implementation is correct with locks and conditional variables
That is a bold statement, and it's provably false. Your program hangs on Linux when run with clang -g q.c -o 8q && ./8q 2 4.
When I look at the state of the program, I see one thread here:
#4 __pthread_cond_wait (cond=0x404da8 <filled>, mutex=0x404d80 <buffer_mutex>) at pthread_cond_wait.c:619
#5 0x000000000040196b in eval_positioning (id=0x1) at q.c:163
#6 0x00007ffff7f8cd80 in start_thread (arg=0x7ffff75b6640) at pthread_create.c:481
#7 0x00007ffff7eb7b6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
and the main thread trying to join the above thread. All other threads have exited, so there is nothing to signal the condition.
One immediate problem I see is this:
void *eval_positioning(void *id) {
long thr_id = (long)id;
int qpositions[N];
while (!global.done) {
...
int main(int argc, char *argv[]) {
...
pthread_join(producer, (void*)&global.done);
If the producer thread finishes before the eval_positioning starts, then eval_positioning will do nothing at all.
You should set global.done when all positions have been evaluated, not when the producer thread is done.
Another obvious problem is that global.done is accessed without any mutexes held, yielding a data race (undefined behavior -- anything can happen).

Pthread Scheduling policy and priority

I have four threads which are waiting on a condition variable and fifth thread posts condition variable when all four threads are waiting. When I set thread priority to maximum that is 99, threads switch takes a lot of time which is far from acceptable. Can anybody please take a look and tell what's happening ?
#define N_WORK_THREADS 4
pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t condition_var = PTHREAD_COND_INITIALIZER;
void *functionCount1(void * arg);
void *functionCount2(void * arg);
int count = 0;
int valid = 0;
int thread_personal[N_WORK_THREADS];
static int display_thread_sched_attr(int id)
{
int policy, s;
struct sched_param param;
s = pthread_getschedparam(pthread_self(), &policy, &param);
if (s != 0) { printf("pthread_getschedparam"); return 1; }
printf("Thread Id=%d policy=%s, priority=%d\n",id,
(policy == SCHED_FIFO) ? "SCHED_FIFO" : (policy == SCHED_RR) ? "SCHED_RR" : (policy == SCHED_OTHER) ? "SCHED_OTHER" : "???",
param.sched_priority);
return 0;
}
int main(void)
{
pthread_t thread_work[N_WORK_THREADS];
pthread_t thread;
int i,s;
pthread_attr_t attr;
struct sched_param param;
s = pthread_attr_init(&attr);
if (s != 0) { printf("pthread_attr_init"); return 1; }
s = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
if (s != 0) { printf("pthread_attr_setinheritsched"); return 1; }
s = pthread_attr_setschedpolicy(&attr, SCHED_RR);
if (s != 0) { printf("pthread_attr_setschedpolicy"); return 1; }
param.sched_priority = 99;
s = pthread_attr_setschedparam(&attr, &param);
if (s != 0) { printf("pthread_attr_setschedparam"); return 1; }
for (i=0; i<N_WORK_THREADS; i++) { thread_personal[i] = 0; }
for (i=0; i<N_WORK_THREADS; i++) { pthread_create( &thread_work[i], &attr, &functionCount1, (void *)i); }
param.sched_priority = 99;
s = pthread_attr_setschedparam(&attr, &param);
if (s != 0) { printf("pthread_attr_setschedparam"); return 1; }
pthread_create( &thread, &attr, &functionCount2, (void *)N_WORK_THREADS);
for (i=0; i<N_WORK_THREADS; i++) { pthread_join( thread_work[i], NULL); }
pthread_join( thread, NULL);
for (i=0; i<N_WORK_THREADS; i++) { printf("Thread Id=%d Mutex USed=%d\n",i,thread_personal[i]); }
exit(EXIT_SUCCESS);
}
void *functionCount1(void * arg)
{
int i;
int id = (int) arg;
display_thread_sched_attr(id);
for(i=0; i<10; i++)
{
pthread_mutex_lock( &count_mutex );
thread_personal[id] += 1;
while (((count>>id) & 0x1) == 0)
{
pthread_cond_wait( &condition_var, &count_mutex );
}
count = count^ (1<<id);
printf("Thread Id %d: Valid = %d\n",id,valid);
pthread_mutex_unlock( &count_mutex );
}
return NULL;
}
void *functionCount2(void * arg)
{
int check;
int id = (int) arg;
display_thread_sched_attr(id);
check =0;
while (check < 10)
{
pthread_mutex_lock( &count_mutex );
if (count == 0)
{
pthread_cond_broadcast ( &condition_var );
count =0xF;
printf("Thread Id %d: Counter = %d\n",id,check);
valid = check++;
}
pthread_mutex_unlock( &count_mutex );
}
return NULL;
}
I'm unable to test your program with the scheduling policy code enabled because the program simply doesn't work when that's in there (as I mention in a comment: Linux 3.16.0 x86_64 with gcc 4.8.4).
But I'm guessing that your problem might be due to the loop in functionCount2():
while (check < 10)
{
pthread_mutex_lock( &count_mutex );
if (count == 0)
{
pthread_cond_broadcast ( &condition_var );
count =0xF;
printf("Thread Id %d: Counter = %d\n",id,check);
valid = check++;
}
pthread_mutex_unlock( &count_mutex );
}
In general, acquisition of mutex objects in pthreads is not guaranteed to be fair or FIFO (though to be honest, I'm not sure how thread scheduling policies might affect it). What I believe is happening is that this loop releases count_mutex then immediately re-acquires it even though other threads are blocked waiting to claim the mutex. And with the scheduling policy in place, this may occur until the thread uses its quantum.

I have some trouble with multi threading

I am struggling with this code. It's about counting some patterns from a text file. I tried to use thread(divide and conquer) processing, but it return a wrong value. I used mutex value to synchronize critical section.. The code is below. First main argument is number of threads, second is the name of a text file I want counting patterns from, and following patterns I want to look up on the text.
please save me..
Code is below
char *buffer;
int fsize, count;
char **searchword;
int *wordcount;
int *strlength;
typedef struct _params{
int num1;
int num2;
}params;
pthread_mutex_t mutex;
void *childfunc(void *arg)
{
int size, i, j, k, t, start, end, len, flag = -1;
int result;
params *a = (params *)arg;
start = a->num1;
end = a->num2;
while(1){
if(start == 0 || start == fsize)
break;
if(buffer[start]!= ' ' && buffer[start] != '\n' && buffer[start] != '\t')
start++;
else
break;
}
while(1){
if(end == fsize)
break;
if(buffer[end] != ' ' && buffer[end] != '\n' && buffer[end] != '\t')
end++;
else
break;
}
for(i = 0; i < count; i++){
len = strlength[i];
for(j = start; j<(end - len + 1); j++){
if(buffer[j] == searchword[i][0]){
flag = 0;
for(k = j +1; k<j + len; k++){
if(buffer[k] != searchword[i][k-j])
{
flag = 1;
break;
}
}
if(flag == 0){
pthread_mutex_lock(&mutex);
wordcount[i]++;
pthread_mutex_unlock(&mutex);// mutex unlocking
sleep(1);
flag = -1;
}
}
}
}
}
int main(int argc, char **argv){
FILE *fp;
char *inputFile;
pthread_t *tid;
int *status;
int inputNumber, i, j, diff, searchstart, searchend;
int result = 0;
count = argc -3;
inputNumber = atoi(argv[1]);
inputFile = argv[2];
searchword = (char **)malloc(sizeof(char *)*count);
tid = malloc(sizeof(pthread_t)*inputNumber);
strlength = (int *)malloc(4*count);
status = (int *)malloc(4*inputNumber);
wordcount = (int *)malloc(4*count);
for(i = 0; i < count; i++)
searchword[i] = (char*)malloc(sizeof(char)*(strlen(argv[i+3]) + 1));
for(i = 3; i < argc; i++)
strcpy(searchword[i-3], argv[i]);
fp = fopen(inputFile, "r");
fseek(fp, 0, SEEK_END);
fsize = ftell(fp);
rewind(fp);
buffer = (char *)malloc(1*fsize);
fread(buffer, fsize, 1, fp);
diff = fsize / inputNumber;
if(diff == 0)
diff = 1;
for(i = 0; i < count ; i++){
strlength[i] = strlen(searchword[i]);
wordcount[i] = 0;
}
for(i = 0; i < inputNumber; i++){
searchstart = 0 + i*diff;
searchend = searchstart + diff;
if(searchstart > fsize)
searchstart = fsize;
if(searchend > fsize)
searchend = fsize;
if( i == inputNumber -1)
searchend = fsize;
params a;
a.num1 = searchstart;
a.num2 = searchend;
pthread_mutex_init(&mutex, NULL);
result = pthread_create(&tid[i], NULL, childfunc, (void *)&a);
if(result < 0){
perror("pthread_create()");
}
}
//스레드 받는 부분
for(i = 0; i < inputNumber; i++){
result = pthread_join(tid[i], (void **)status);
if(result < 0)
perror("pthread_join()");
}
pthread_mutex_destroy(&mutex); // mutex 해제
for(i = 0; i < count; i++)
printf("%s : %d \n", searchword[i], wordcount[i]);
for(i = 0; i < count; i++) //동적메모리해제
free(searchword[i]);
free(searchword);
free(buffer);
free(tid);
free(strlength);
free(wordcount);
free(status);
fclose(fp);
return 0;
}
params a; // new a for each loop, previous a no longer exists
a.num1 = searchstart;
a.num2 = searchend;
pthread_mutex_init(&mutex, NULL);
result = pthread_create(&tid[i], NULL, childfunc, (void *)&a);
if(result < 0){
perror("pthread_create()");
}
} // a goes out of scope here
You pass each thread the address of a, but then a goes out of scope immediately after you create the thread. So the thread now has the address of some random leftover junk on the stack.
You need to have some conception of ownership of any object that's accessed by more than one thread like a is here. It can be owned by the thread that called pthread_create, owned by the newly-created thread, or jointly owned. But you have to be consistent. You have neither of these, why?
Is a owned only be the thread that called pthread_create? No, because the newly-created thread has a pointer to it and accesses it through that pointer. So the thread that called pthread_create cannot destroy it.
Is a owned only be the thread created by pthread_create? No, because it's on the stack of the thread that called pthread_create and will cease to exist when the next loop comes around.
Is a jointly owned? Well, no, because the thread that called pthread_create can destroy the object before the newly-created thread accesses it.
So no sane model of multi-thread use is followed by a. It's broken.
One common solution to this problem is to allocate a new structure (using malloc or new) for each thread, fill it in, and pass the thread the address of the structure. Let the thread free (or delete) the structure when it's done with it.

VC++ threads deadlocked

The following program goes into a deadlock. Can anyone please tell me why?
#include<cstdlib>
#include<windows.h>
#include<iostream>
using namespace std;
class CircularQueue
{
public:
CircularQueue(int s)
{
size = s;
array = (int*)malloc(sizeof(int) * size);
head = tail = -1;
InitializeCriticalSection(&critical_section);
}
CircularQueue(void)
{
size = default_size;
array = (int*)malloc(sizeof(int) * size);
head = tail = -1;
InitializeCriticalSection(&critical_section);
}
void initialize(int s)
{
EnterCriticalSection(&critical_section);
size = s;
array = (int*)realloc(array, sizeof(int) * size);
head = tail = -1;
LeaveCriticalSection(&critical_section);
}
void enqueue(int n)
{
EnterCriticalSection(&critical_section);
tail = (tail + 1) % size;
array[tail] = n;
LeaveCriticalSection(&critical_section);
}
int dequeue(void)
{
EnterCriticalSection(&critical_section);
head = (head + 1) % size;
return array[head];
LeaveCriticalSection(&critical_section);
}
private:
int *array;
int size;
int head, tail;
CRITICAL_SECTION critical_section;
bool initialized;
static const int default_size = 10;
};
DWORD WINAPI thread1(LPVOID param)
{
CircularQueue* cqueue = (CircularQueue*)param;
cqueue->enqueue(2);
cout << cqueue->dequeue() << endl;
return 0;
}
DWORD WINAPI thread2(LPVOID param)
{
CircularQueue* cqueue = (CircularQueue*)param;
cqueue->enqueue(3);
cout << cqueue->dequeue() << endl;
return 0;
}
int main(void)
{
HANDLE thread1_handle;
HANDLE thread2_handle;
CircularQueue cqueue;
HANDLE array[2];
thread1_handle = CreateThread(NULL, 0, thread1, &cqueue, 0, NULL);
thread2_handle = CreateThread(NULL, 0, thread2, &cqueue, 0, NULL);
array[0] = thread1_handle;
array[1] = thread2_handle;
WaitForMultipleObjects(1, array, TRUE, INFINITE);
CloseHandle(thread1_handle);
CloseHandle(thread2_handle);
printf("end\n");
return 0;
}
In dequeue(), you have a return statement before the LeaveCriticalSection() call. If you had compiler warnings turned up higher, it would probably have told you about this!

How to parallelize Sudoku solver using Grand Central Dispatch?

As a programming exercise, I just finished writing a Sudoku solver that uses the backtracking algorithm (see Wikipedia for a simple example written in C).
To take this a step further, I would like to use Snow Leopard's GCD to parallelize this so that it runs on all of my machine's cores. Can someone give me pointers on how I should go about doing this and what code changes I should make? Thanks!
Matt
Please let me know if you end up using it. It is run of the mill ANSI C, so should run on everything. See other post for usage.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
short sudoku[9][9];
unsigned long long cubeSolutions=0;
void* cubeValues[10];
const unsigned char oneLookup[64] = {0x8b, 0x80, 0, 0x80, 0, 0, 0, 0x80, 0, 0,0,0,0,0,0, 0x80, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0x80,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
int ifOne(int val) {
if ( oneLookup[(val-1) >> 3] & (1 << ((val-1) & 0x7)) )
return val;
return 0;
}
void init_sudoku() {
int i,j;
for (i=0; i<9; i++)
for (j=0; j<9; j++)
sudoku[i][j]=0x1ff;
}
void set_sudoku( char* initialValues) {
int i;
if ( strlen (initialValues) != 81 ) {
printf("Error: inputString should have length=81, length is %2.2d\n", strlen(initialValues) );
exit (-12);
}
for (i=0; i < 81; i++)
if ((initialValues[i] > 0x30) && (initialValues[i] <= 0x3a))
sudoku[i/9][i%9] = 1 << (initialValues[i] - 0x31) ;
}
void print_sudoku ( int style ) {
int i, j, k;
for (i=0; i < 9; i++) {
for (j=0; j < 9; j++) {
if ( ifOne(sudoku[i][j]) || !style) {
for (k=0; k < 9; k++)
if (sudoku[i][j] & 1<<k)
printf("%d", k+1);
} else
printf("*");
if ( !((j+1)%3) )
printf("\t");
else
printf(",");
}
printf("\n");
if (!((i+1) % 3) )
printf("\n");
}
}
void print_HTML_sudoku () {
int i, j, k, l, m;
printf("<TABLE>\n");
for (i=0; i<3; i++) {
printf(" <TR>\n");
for (j=0; j<3; j++) {
printf(" <TD><TABLE>\n");
for (l=0; l<3; l++) { printf(" <TR>"); for (m=0; m<3; m++) { printf("<TD>"); for (k=0; k < 9; k++) { if (sudoku[i*3+l][j*3+m] & 1<<k)
printf("%d", k+1);
}
printf("</TD>");
}
printf("</TR>\n");
}
printf(" </TABLE></TD>\n");
}
printf(" </TR>\n");
}
printf("</TABLE>");
}
int doRow () {
int count=0, new_value, row_value, i, j;
for (i=0; i<9; i++) {
row_value=0x1ff;
for (j=0; j<9; j++)
row_value&=~ifOne(sudoku[i][j]);
for (j=0; j<9; j++) {
new_value=sudoku[i][j] & row_value;
if (new_value && (new_value != sudoku[i][j]) ) {
count++;
sudoku[i][j] = new_value;
}
}
}
return count;
}
int doCol () {
int count=0, new_value, col_value, i, j;
for (i=0; i<9; i++) {
col_value=0x1ff;
for (j=0; j<9; j++)
col_value&=~ifOne(sudoku[j][i]);
for (j=0; j<9; j++) {
new_value=sudoku[j][i] & col_value;
if (new_value && (new_value != sudoku[j][i]) ) {
count++;
sudoku[j][i] = new_value;
}
}
}
return count;
}
int doCube () {
int count=0, new_value, cube_value, i, j, l, m;
for (i=0; i<3; i++)
for (j=0; j<3; j++) {
cube_value=0x1ff;
for (l=0; l<3; l++)
for (m=0; m<3; m++)
cube_value&=~ifOne(sudoku[i*3+l][j*3+m]);
for (l=0; l<3; l++)
for (m=0; m<3; m++) {
new_value=sudoku[i*3+l][j*3+m] & cube_value;
if (new_value && (new_value != sudoku[i*3+l][j*3+m]) ) {
count++;
sudoku[i*3+l][j*3+m] = new_value;
}
}
}
return count;
}
#define FALSE -1
#define TRUE 1
#define INCOMPLETE 0
int validCube () {
int i, j, l, m, r, c;
int pigeon;
int solved=TRUE;
//check horizontal
for (i=0; i<9; i++) {
pigeon=0;
for (j=0; j<9; j++)
if (ifOne(sudoku[i][j])) {
if (pigeon & sudoku[i][j]) return FALSE;
pigeon |= sudoku[i][j];
} else {
solved=INCOMPLETE;
}
}
//check vertical
for (i=0; i<9; i++) {
pigeon=0;
for (j=0; j<9; j++)
if (ifOne(sudoku[j][i])) {
if (pigeon & sudoku[j][i]) return FALSE;
pigeon |= sudoku[j][i];
}
else {
solved=INCOMPLETE;
}
}
//check cube
for (i=0; i<3; i++)
for (j=0; j<3; j++) {
pigeon=0;
r=j*3; c=i*3;
for (l=0; l<3; l++)
for (m=0; m<3; m++)
if (ifOne(sudoku[r+l][c+m])) {
if (pigeon & sudoku[r+l][c+m]) return FALSE;
pigeon |= sudoku[r+l][c+m];
}
else {
solved=INCOMPLETE;
}
}
return solved;
}
int solveSudoku(int position ) {
int status, i, k;
short oldCube[9][9];
for (i=position; i < 81; i++) {
while ( doCube() + doRow() + doCol() );
status = validCube() ;
if ((status == TRUE) || (status == FALSE))
return status;
if ((status == INCOMPLETE) && !ifOne(sudoku[i/9][i%9]) ) {
memcpy( &oldCube, &sudoku, sizeof(short) * 81) ;
for (k=0; k < 9; k++) {
if ( sudoku[i/9][i%9] & (1<<k) ) {
sudoku[i/9][i%9] = 1 << k ;
if (solveSudoku(i+1) == TRUE ) {
/* return TRUE; */
/* Or look for entire set of solutions */
if (cubeSolutions < 10) {
cubeValues[cubeSolutions] = malloc ( sizeof(short) * 81 ) ;
memcpy( cubeValues[cubeSolutions], &sudoku, sizeof(short) * 81) ;
}
cubeSolutions++;
if ((cubeSolutions & 0x3ffff) == 0x3ffff ) {
printf ("cubeSolutions = %llx\n", cubeSolutions+1 );
}
//if ( cubeSolutions > 10 )
// return TRUE;
}
memcpy( &sudoku, &oldCube, sizeof(short) * 81) ;
}
if (k==8)
return FALSE;
}
}
}
return FALSE;
}
int main ( int argc, char** argv) {
int i;
if (argc != 2) {
printf("Error: number of arguments on command line is incorrect\n");
exit (-12);
}
init_sudoku();
set_sudoku(argv[1]);
printf("[----------------------- Input Data ------------------------]\n\n");
print_sudoku(1);
solveSudoku(0);
if ((validCube()==1) && !cubeSolutions) {
// If sudoku is effectively already solved, cubeSolutions will not be set
printf ("\n This is a trivial sudoku. \n\n");
print_sudoku(1);
}
if (!cubeSolutions && validCube()!=1)
printf("Not Solvable\n");
if (cubeSolutions > 1) {
if (cubeSolutions >= 10)
printf("10+ Solutions, returning first 10 (%lld) [%llx] \n", cubeSolutions, cubeSolutions);
else
printf("%llx Solutions. \n", cubeSolutions);
}
for (i=0; (i < cubeSolutions) && (i < 10); i++) {
memcpy ( &sudoku, cubeValues[i], sizeof(short) * 81 );
printf("[----------------------- Solution %2.2d ------------------------]\n\n", i+1);
print_sudoku(0);
//print_HTML_sudoku();
}
return 0;
}
For one, since backtracking is a depth-first search it is not directly parallelizable, since any newly computed result cannot be used be directly used by another thread. Instead, you must divide the problem early, i.e. thread #1 starts with the first combination for a node in the backtracking graph, and proceeds to search the rest of that subgraph. Thread #2 starts with the second possible combination at the first and so forth. In short, for n threads find the n possible combinations on the top level of the search space (do not "forward-track"), then assign these n starting points to n threads.
However I think the idea is fundamentally flawed: Many sudoku permutations are solved in a matter of a couple thousands of forward+backtracking steps, and are solved within milliseconds on a single thread. This is in fact so fast that even the small coordination required for a few threads (assume that n threads reduce computation time to 1/n of original time) on a multi-core/multi-CPU is not negligible compared to the total running time, thus it is not by any chance a more efficient solution.
Are you sure you want to do that? Like, what problem are you trying to solve? If you want to use all cores, use threads. If you want a fast sudoku solver, I can give you one I wrote, see output below. If you want to make work for yourself, go ahead and use GCD ;).
Update:
I don't think GCD is bad, it just isn't terribly relevant to the task of solving sudoku. GCD is a technology to tie GUI events to code. Essentially, GCD solves two problems, a Quirk in how the MacOS X updates windows, and, it provides an improved method (as compared to threads) of tying code to GUI events.
It doesn't apply to this problem because Sudoku can be solved significantly faster than a person can think (in my humble opinion). That being said, if your goal was to solve Sudoku faster, you would want to use threads, because you would want to directly use more than one processor.
[bear#bear scripts]$ time ./a.out ..1..4.......6.3.5...9.....8.....7.3.......285...7.6..3...8...6..92......4...1...
[----------------------- Input Data ------------------------]
*,*,1 *,*,4 *,*,*
*,*,* *,6,* 3,*,5
*,*,* 9,*,* *,*,*
8,*,* *,*,* 7,*,3
*,*,* *,*,* *,2,8
5,*,* *,7,* 6,*,*
3,*,* *,8,* *,*,6
*,*,9 2,*,* *,*,*
*,4,* *,*,1 *,*,*
[----------------------- Solution 01 ------------------------]
7,6,1 3,5,4 2,8,9
2,9,8 1,6,7 3,4,5
4,5,3 9,2,8 1,6,7
8,1,2 6,4,9 7,5,3
9,7,6 5,1,3 4,2,8
5,3,4 8,7,2 6,9,1
3,2,7 4,8,5 9,1,6
1,8,9 2,3,6 5,7,4
6,4,5 7,9,1 8,3,2
real 0m0.044s
user 0m0.041s
sys 0m0.001s

Resources