I am struggling with this code. It's about counting some patterns from a text file. I tried to use thread(divide and conquer) processing, but it return a wrong value. I used mutex value to synchronize critical section.. The code is below. First main argument is number of threads, second is the name of a text file I want counting patterns from, and following patterns I want to look up on the text.
please save me..
Code is below
char *buffer;
int fsize, count;
char **searchword;
int *wordcount;
int *strlength;
typedef struct _params{
int num1;
int num2;
}params;
pthread_mutex_t mutex;
void *childfunc(void *arg)
{
int size, i, j, k, t, start, end, len, flag = -1;
int result;
params *a = (params *)arg;
start = a->num1;
end = a->num2;
while(1){
if(start == 0 || start == fsize)
break;
if(buffer[start]!= ' ' && buffer[start] != '\n' && buffer[start] != '\t')
start++;
else
break;
}
while(1){
if(end == fsize)
break;
if(buffer[end] != ' ' && buffer[end] != '\n' && buffer[end] != '\t')
end++;
else
break;
}
for(i = 0; i < count; i++){
len = strlength[i];
for(j = start; j<(end - len + 1); j++){
if(buffer[j] == searchword[i][0]){
flag = 0;
for(k = j +1; k<j + len; k++){
if(buffer[k] != searchword[i][k-j])
{
flag = 1;
break;
}
}
if(flag == 0){
pthread_mutex_lock(&mutex);
wordcount[i]++;
pthread_mutex_unlock(&mutex);// mutex unlocking
sleep(1);
flag = -1;
}
}
}
}
}
int main(int argc, char **argv){
FILE *fp;
char *inputFile;
pthread_t *tid;
int *status;
int inputNumber, i, j, diff, searchstart, searchend;
int result = 0;
count = argc -3;
inputNumber = atoi(argv[1]);
inputFile = argv[2];
searchword = (char **)malloc(sizeof(char *)*count);
tid = malloc(sizeof(pthread_t)*inputNumber);
strlength = (int *)malloc(4*count);
status = (int *)malloc(4*inputNumber);
wordcount = (int *)malloc(4*count);
for(i = 0; i < count; i++)
searchword[i] = (char*)malloc(sizeof(char)*(strlen(argv[i+3]) + 1));
for(i = 3; i < argc; i++)
strcpy(searchword[i-3], argv[i]);
fp = fopen(inputFile, "r");
fseek(fp, 0, SEEK_END);
fsize = ftell(fp);
rewind(fp);
buffer = (char *)malloc(1*fsize);
fread(buffer, fsize, 1, fp);
diff = fsize / inputNumber;
if(diff == 0)
diff = 1;
for(i = 0; i < count ; i++){
strlength[i] = strlen(searchword[i]);
wordcount[i] = 0;
}
for(i = 0; i < inputNumber; i++){
searchstart = 0 + i*diff;
searchend = searchstart + diff;
if(searchstart > fsize)
searchstart = fsize;
if(searchend > fsize)
searchend = fsize;
if( i == inputNumber -1)
searchend = fsize;
params a;
a.num1 = searchstart;
a.num2 = searchend;
pthread_mutex_init(&mutex, NULL);
result = pthread_create(&tid[i], NULL, childfunc, (void *)&a);
if(result < 0){
perror("pthread_create()");
}
}
//스레드 받는 부분
for(i = 0; i < inputNumber; i++){
result = pthread_join(tid[i], (void **)status);
if(result < 0)
perror("pthread_join()");
}
pthread_mutex_destroy(&mutex); // mutex 해제
for(i = 0; i < count; i++)
printf("%s : %d \n", searchword[i], wordcount[i]);
for(i = 0; i < count; i++) //동적메모리해제
free(searchword[i]);
free(searchword);
free(buffer);
free(tid);
free(strlength);
free(wordcount);
free(status);
fclose(fp);
return 0;
}
params a; // new a for each loop, previous a no longer exists
a.num1 = searchstart;
a.num2 = searchend;
pthread_mutex_init(&mutex, NULL);
result = pthread_create(&tid[i], NULL, childfunc, (void *)&a);
if(result < 0){
perror("pthread_create()");
}
} // a goes out of scope here
You pass each thread the address of a, but then a goes out of scope immediately after you create the thread. So the thread now has the address of some random leftover junk on the stack.
You need to have some conception of ownership of any object that's accessed by more than one thread like a is here. It can be owned by the thread that called pthread_create, owned by the newly-created thread, or jointly owned. But you have to be consistent. You have neither of these, why?
Is a owned only be the thread that called pthread_create? No, because the newly-created thread has a pointer to it and accesses it through that pointer. So the thread that called pthread_create cannot destroy it.
Is a owned only be the thread created by pthread_create? No, because it's on the stack of the thread that called pthread_create and will cease to exist when the next loop comes around.
Is a jointly owned? Well, no, because the thread that called pthread_create can destroy the object before the newly-created thread accesses it.
So no sane model of multi-thread use is followed by a. It's broken.
One common solution to this problem is to allocate a new structure (using malloc or new) for each thread, fill it in, and pass the thread the address of the structure. Let the thread free (or delete) the structure when it's done with it.
As a programming exercise, I just finished writing a Sudoku solver that uses the backtracking algorithm (see Wikipedia for a simple example written in C).
To take this a step further, I would like to use Snow Leopard's GCD to parallelize this so that it runs on all of my machine's cores. Can someone give me pointers on how I should go about doing this and what code changes I should make? Thanks!
Matt
Please let me know if you end up using it. It is run of the mill ANSI C, so should run on everything. See other post for usage.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
short sudoku[9][9];
unsigned long long cubeSolutions=0;
void* cubeValues[10];
const unsigned char oneLookup[64] = {0x8b, 0x80, 0, 0x80, 0, 0, 0, 0x80, 0, 0,0,0,0,0,0, 0x80, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0x80,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
int ifOne(int val) {
if ( oneLookup[(val-1) >> 3] & (1 << ((val-1) & 0x7)) )
return val;
return 0;
}
void init_sudoku() {
int i,j;
for (i=0; i<9; i++)
for (j=0; j<9; j++)
sudoku[i][j]=0x1ff;
}
void set_sudoku( char* initialValues) {
int i;
if ( strlen (initialValues) != 81 ) {
printf("Error: inputString should have length=81, length is %2.2d\n", strlen(initialValues) );
exit (-12);
}
for (i=0; i < 81; i++)
if ((initialValues[i] > 0x30) && (initialValues[i] <= 0x3a))
sudoku[i/9][i%9] = 1 << (initialValues[i] - 0x31) ;
}
void print_sudoku ( int style ) {
int i, j, k;
for (i=0; i < 9; i++) {
for (j=0; j < 9; j++) {
if ( ifOne(sudoku[i][j]) || !style) {
for (k=0; k < 9; k++)
if (sudoku[i][j] & 1<<k)
printf("%d", k+1);
} else
printf("*");
if ( !((j+1)%3) )
printf("\t");
else
printf(",");
}
printf("\n");
if (!((i+1) % 3) )
printf("\n");
}
}
void print_HTML_sudoku () {
int i, j, k, l, m;
printf("<TABLE>\n");
for (i=0; i<3; i++) {
printf(" <TR>\n");
for (j=0; j<3; j++) {
printf(" <TD><TABLE>\n");
for (l=0; l<3; l++) { printf(" <TR>"); for (m=0; m<3; m++) { printf("<TD>"); for (k=0; k < 9; k++) { if (sudoku[i*3+l][j*3+m] & 1<<k)
printf("%d", k+1);
}
printf("</TD>");
}
printf("</TR>\n");
}
printf(" </TABLE></TD>\n");
}
printf(" </TR>\n");
}
printf("</TABLE>");
}
int doRow () {
int count=0, new_value, row_value, i, j;
for (i=0; i<9; i++) {
row_value=0x1ff;
for (j=0; j<9; j++)
row_value&=~ifOne(sudoku[i][j]);
for (j=0; j<9; j++) {
new_value=sudoku[i][j] & row_value;
if (new_value && (new_value != sudoku[i][j]) ) {
count++;
sudoku[i][j] = new_value;
}
}
}
return count;
}
int doCol () {
int count=0, new_value, col_value, i, j;
for (i=0; i<9; i++) {
col_value=0x1ff;
for (j=0; j<9; j++)
col_value&=~ifOne(sudoku[j][i]);
for (j=0; j<9; j++) {
new_value=sudoku[j][i] & col_value;
if (new_value && (new_value != sudoku[j][i]) ) {
count++;
sudoku[j][i] = new_value;
}
}
}
return count;
}
int doCube () {
int count=0, new_value, cube_value, i, j, l, m;
for (i=0; i<3; i++)
for (j=0; j<3; j++) {
cube_value=0x1ff;
for (l=0; l<3; l++)
for (m=0; m<3; m++)
cube_value&=~ifOne(sudoku[i*3+l][j*3+m]);
for (l=0; l<3; l++)
for (m=0; m<3; m++) {
new_value=sudoku[i*3+l][j*3+m] & cube_value;
if (new_value && (new_value != sudoku[i*3+l][j*3+m]) ) {
count++;
sudoku[i*3+l][j*3+m] = new_value;
}
}
}
return count;
}
#define FALSE -1
#define TRUE 1
#define INCOMPLETE 0
int validCube () {
int i, j, l, m, r, c;
int pigeon;
int solved=TRUE;
//check horizontal
for (i=0; i<9; i++) {
pigeon=0;
for (j=0; j<9; j++)
if (ifOne(sudoku[i][j])) {
if (pigeon & sudoku[i][j]) return FALSE;
pigeon |= sudoku[i][j];
} else {
solved=INCOMPLETE;
}
}
//check vertical
for (i=0; i<9; i++) {
pigeon=0;
for (j=0; j<9; j++)
if (ifOne(sudoku[j][i])) {
if (pigeon & sudoku[j][i]) return FALSE;
pigeon |= sudoku[j][i];
}
else {
solved=INCOMPLETE;
}
}
//check cube
for (i=0; i<3; i++)
for (j=0; j<3; j++) {
pigeon=0;
r=j*3; c=i*3;
for (l=0; l<3; l++)
for (m=0; m<3; m++)
if (ifOne(sudoku[r+l][c+m])) {
if (pigeon & sudoku[r+l][c+m]) return FALSE;
pigeon |= sudoku[r+l][c+m];
}
else {
solved=INCOMPLETE;
}
}
return solved;
}
int solveSudoku(int position ) {
int status, i, k;
short oldCube[9][9];
for (i=position; i < 81; i++) {
while ( doCube() + doRow() + doCol() );
status = validCube() ;
if ((status == TRUE) || (status == FALSE))
return status;
if ((status == INCOMPLETE) && !ifOne(sudoku[i/9][i%9]) ) {
memcpy( &oldCube, &sudoku, sizeof(short) * 81) ;
for (k=0; k < 9; k++) {
if ( sudoku[i/9][i%9] & (1<<k) ) {
sudoku[i/9][i%9] = 1 << k ;
if (solveSudoku(i+1) == TRUE ) {
/* return TRUE; */
/* Or look for entire set of solutions */
if (cubeSolutions < 10) {
cubeValues[cubeSolutions] = malloc ( sizeof(short) * 81 ) ;
memcpy( cubeValues[cubeSolutions], &sudoku, sizeof(short) * 81) ;
}
cubeSolutions++;
if ((cubeSolutions & 0x3ffff) == 0x3ffff ) {
printf ("cubeSolutions = %llx\n", cubeSolutions+1 );
}
//if ( cubeSolutions > 10 )
// return TRUE;
}
memcpy( &sudoku, &oldCube, sizeof(short) * 81) ;
}
if (k==8)
return FALSE;
}
}
}
return FALSE;
}
int main ( int argc, char** argv) {
int i;
if (argc != 2) {
printf("Error: number of arguments on command line is incorrect\n");
exit (-12);
}
init_sudoku();
set_sudoku(argv[1]);
printf("[----------------------- Input Data ------------------------]\n\n");
print_sudoku(1);
solveSudoku(0);
if ((validCube()==1) && !cubeSolutions) {
// If sudoku is effectively already solved, cubeSolutions will not be set
printf ("\n This is a trivial sudoku. \n\n");
print_sudoku(1);
}
if (!cubeSolutions && validCube()!=1)
printf("Not Solvable\n");
if (cubeSolutions > 1) {
if (cubeSolutions >= 10)
printf("10+ Solutions, returning first 10 (%lld) [%llx] \n", cubeSolutions, cubeSolutions);
else
printf("%llx Solutions. \n", cubeSolutions);
}
for (i=0; (i < cubeSolutions) && (i < 10); i++) {
memcpy ( &sudoku, cubeValues[i], sizeof(short) * 81 );
printf("[----------------------- Solution %2.2d ------------------------]\n\n", i+1);
print_sudoku(0);
//print_HTML_sudoku();
}
return 0;
}
For one, since backtracking is a depth-first search it is not directly parallelizable, since any newly computed result cannot be used be directly used by another thread. Instead, you must divide the problem early, i.e. thread #1 starts with the first combination for a node in the backtracking graph, and proceeds to search the rest of that subgraph. Thread #2 starts with the second possible combination at the first and so forth. In short, for n threads find the n possible combinations on the top level of the search space (do not "forward-track"), then assign these n starting points to n threads.
However I think the idea is fundamentally flawed: Many sudoku permutations are solved in a matter of a couple thousands of forward+backtracking steps, and are solved within milliseconds on a single thread. This is in fact so fast that even the small coordination required for a few threads (assume that n threads reduce computation time to 1/n of original time) on a multi-core/multi-CPU is not negligible compared to the total running time, thus it is not by any chance a more efficient solution.
Are you sure you want to do that? Like, what problem are you trying to solve? If you want to use all cores, use threads. If you want a fast sudoku solver, I can give you one I wrote, see output below. If you want to make work for yourself, go ahead and use GCD ;).
Update:
I don't think GCD is bad, it just isn't terribly relevant to the task of solving sudoku. GCD is a technology to tie GUI events to code. Essentially, GCD solves two problems, a Quirk in how the MacOS X updates windows, and, it provides an improved method (as compared to threads) of tying code to GUI events.
It doesn't apply to this problem because Sudoku can be solved significantly faster than a person can think (in my humble opinion). That being said, if your goal was to solve Sudoku faster, you would want to use threads, because you would want to directly use more than one processor.
[bear#bear scripts]$ time ./a.out ..1..4.......6.3.5...9.....8.....7.3.......285...7.6..3...8...6..92......4...1...
[----------------------- Input Data ------------------------]
*,*,1 *,*,4 *,*,*
*,*,* *,6,* 3,*,5
*,*,* 9,*,* *,*,*
8,*,* *,*,* 7,*,3
*,*,* *,*,* *,2,8
5,*,* *,7,* 6,*,*
3,*,* *,8,* *,*,6
*,*,9 2,*,* *,*,*
*,4,* *,*,1 *,*,*
[----------------------- Solution 01 ------------------------]
7,6,1 3,5,4 2,8,9
2,9,8 1,6,7 3,4,5
4,5,3 9,2,8 1,6,7
8,1,2 6,4,9 7,5,3
9,7,6 5,1,3 4,2,8
5,3,4 8,7,2 6,9,1
3,2,7 4,8,5 9,1,6
1,8,9 2,3,6 5,7,4
6,4,5 7,9,1 8,3,2
real 0m0.044s
user 0m0.041s
sys 0m0.001s