String comparison performance in Java and C

I need to measure the performance of a function that compares two strings. My task is to write it in Java and in C and compare the execution times. For testing purposes I generated a text file with 100000 random strings of 100 to 200 characters each. Using them, I invoke the comparison function 20'000'000 times. In Java this takes ~500 ms, while in C the execution time is 0 ms (I'm running exactly the same tests on exactly the same data in both languages). Even if I increase it to 20'000'000'000 calls in C, it still measures a duration of 0 ms. How is this possible? Am I missing something important?
implementation in Java
public class StringComparer {
public static boolean compareStrings(String string1, String string2) {
if(string1.length() != string2.length()) {
return false;
}
for (int i = 0; i < string1.length(); i++) {
if(string1.charAt(i) != string2.charAt(i)) {
return false;
}
}
return true;
}
}
implementation in C
bool string_compare(char* s1, char* s2)
{
int i = 0;
while (s1[i] != '\0' && s1[i] == s2[i])
i++;
return s1[i] == s2[i];
}
This is the code I use to test performance in C
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
#include <windows.h>
#define NUMBER_OF_WORDS 100000
#define MAX_WORD_LENGTH 200
long long milliseconds_now() {
static LARGE_INTEGER s_frequency;
BOOL s_use_qpc = QueryPerformanceFrequency(&s_frequency);
if (s_use_qpc) {
LARGE_INTEGER now;
QueryPerformanceCounter(&now);
return (1000LL * now.QuadPart) / s_frequency.QuadPart;
}
else {
return GetTickCount();
}
}
int main()
{
char* fileName = "tests.txt";
FILE *file = fopen(fileName, "r");
char* words[NUMBER_OF_WORDS];
long long i, j;
for (i = 0; i < NUMBER_OF_WORDS; i++) {
words[i] = (char*)malloc((MAX_WORD_LENGTH + 1) * sizeof(char));
fgets(words[i], MAX_WORD_LENGTH + 1, file);
}
long long repeats = 10000000000 / NUMBER_OF_WORDS;
long long start = milliseconds_now();
for (i = 0; i < repeats; i++)
{
for (j = 0; j < NUMBER_OF_WORDS - 1; j++)
{
;
}
}
long long loopDuration = milliseconds_now() - start;
start = milliseconds_now();
for (i = 0; i < repeats; i++)
{
for (j = 0; j < NUMBER_OF_WORDS - 1; j++)
{
string_compare(words[j], words[j + 1]); //compare different strings
string_compare(words[j], words[j]); //compare the same strings
}
}
long long customFunctionDuration = milliseconds_now() - start;
printf("Loop duration: %lld\n", loopDuration);
printf("Custom function duration: %lld - %lld = %lld ms", customFunctionDuration, loopDuration, customFunctionDuration - loopDuration);
return 0;
}

Your code's observable behavior is precisely the same as code that does nothing at all. You need to make the result of the string comparisons part of your program's observable behavior so they can't be optimized to nothing. Try tallying the number of times strings match and the number of times the strings don't match and outputting that number.
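One way to follow this advice is sketched below, assuming the same comparison logic as in the question. The helper name count_matches is hypothetical; the point is only that the tally it returns gets printed later, so the comparisons become part of the observable behavior. An aggressive optimizer may still simplify work that is repeated identically, so treat this as a starting point rather than a rigorous benchmark.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same comparison as in the question, with '\0' as the terminator. */
static bool string_compare(const char *s1, const char *s2)
{
    size_t i = 0;
    while (s1[i] != '\0' && s1[i] == s2[i])
        i++;
    return s1[i] == s2[i];
}

/* Hypothetical helper: compares each word with its neighbour and with
   itself, and returns how many comparisons reported equality. Because the
   caller uses the count, the calls cannot simply be discarded. */
long long count_matches(char *const words[], size_t n, long long repeats)
{
    long long matches = 0;
    for (long long r = 0; r < repeats; r++) {
        for (size_t j = 0; j + 1 < n; j++) {
            matches += string_compare(words[j], words[j + 1]); /* different strings */
            matches += string_compare(words[j], words[j]);     /* the same string */
        }
    }
    return matches;
}
```

Calling count_matches(words, NUMBER_OF_WORDS, repeats) between the two milliseconds_now() readings and printing the returned value afterwards keeps the comparisons in the timed region.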

Why is my multithreaded C program not working on macOS, but completely fine on Linux?

I have written a multithreaded program in C using pthreads to solve the N-queens problem. It uses the producer consumer programming model. One producer who creates all possible combinations and consumers who evaluate if the combination is valid. I use a shared buffer that can hold one combination at a time.
Once I have 2+ consumers the program starts to behave strangely: I get more consumptions than productions, at a ratio of approximately 1.5:1 (it should be 1:1). The interesting part is that this only happens on my MacBook and never when I run it on the Linux machine (Red Hat Enterprise Linux Workstation release 6.10 (Santiago)) that I have access to over SSH.
I'm quite sure that my implementation is correct with locks and conditional variables too, the program runs for 10+ seconds which should reveal if there are any mistakes with the synchronization.
I compile with GCC (Apple clang version 12.0.5) via xcode developer tools on my MacBook Pro (2020, x86_64) and GCC on Linux too, but version 4.4.7 20120313 (Red Hat 4.4.7-23).
compile: gcc -o 8q 8q.c
run: ./8q <workers> <N> (NxN chess board, N queens to place)
parameters: ./8q 2 4 is enough to highlight the problem (it should yield 2 solutions, but every other run yields 3+ solutions, i.e. duplicate solutions exist)
note: print(printouts) Visualizes the valid solutions (duplicates shown)
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <assert.h>
typedef struct stack_buf {
int positions[8];
int top;
} stack_buf;
typedef struct global_buf {
int positions[8];
volatile int buf_empty;
volatile long done;
} global_buf;
typedef struct print_buf {
int qpositions[100][8];
int top;
} print_buf;
stack_buf queen_comb = { {0}, 0 };
global_buf global = { {0}, 1, 0 };
print_buf printouts = { {{0}}, -1 };
int N; //NxN board and N queens to place
clock_t start, stop, diff;
pthread_mutex_t buffer_mutex, print_mutex;
pthread_cond_t empty, filled;
/* ##########################################################################################
################################## VALIDATION FUNCTIONS ##################################
########################################################################################## */
/* Validate that no queens are placed on the same row */
int valid_rows(int qpositions[]) {
int rows[N];
memset(rows, 0, N * sizeof(int));
int row;
for (int i = 0; i < N; i++) {
row = qpositions[i] / N;
if (rows[row] == 0) rows[row] = 1;
else return 0;
}
return 1;
}
/* Validate that no queens are placed in the same column */
int valid_columns(int qpositions[]) {
int columns[N];
memset(columns, 0, N*sizeof(int));
int column;
for (int i = 0; i < N; i++) {
column = qpositions[i] % N;
if (columns[column] == 0) columns[column] = 1;
else return 0;
}
return 1;
}
/* Validate that left and right diagonals aren't used by another queen */
int valid_diagonals(int qpositions[]) {
int left_bottom_diagonals[N];
int right_bottom_diagonals[N];
int row, col, temp_col, temp_row, fill_value, index;
for (int queen = 0; queen < N; queen++) {
row = qpositions[queen] / N;
col = qpositions[queen] % N;
/* position --> left down diagonal endpoint (index) */
fill_value = col < row ? col : row; //min of col and row
temp_row = row - fill_value;
temp_col = col - fill_value;
index = temp_row * N + temp_col; // position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (left_bottom_diagonals[i] == index) return 0;
}
left_bottom_diagonals[queen] = index; // no interference
/* position --> right down diagonal endpoint (index) */
fill_value = (N-1) - col < row ? N - col - 1 : row; // closest to bottom or right wall
temp_row = row - fill_value;
temp_col = col + fill_value;
index = temp_row * N + temp_col; // position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (right_bottom_diagonals[i] == index) return 0;
}
right_bottom_diagonals[queen] = index; // no interference
};
return 1;
}
/* ##########################################################################################
#################################### HELPER FUNCTIONS ####################################
########################################################################################## */
/* print the collected solutions */
void print(print_buf printouts) {
static int solution_number = 1;
int placement;
for (int sol = 0; sol <= printouts.top; sol++) { // number of solutions
printf("Solution %d: [ ", solution_number++);
for (int pos = 0; pos < N; pos++) {
printf("%d ", printouts.qpositions[sol][pos]+1);
}
printf("]\n");
printf("Placement:\n");
for (int i = 1; i <= N; i++) { // rows
printf("[ ");
placement = printouts.qpositions[sol][N-i];
for (int j = (N-i)*N; j < (N-i)*N+N; j++) { // physical position
if (j == placement) {
printf(" Q ");
} else printf("%2d ", j+1);
}
printf("]\n");
}
printf("\n");
}
}
/* push value to top of list instance */
void push(stack_buf *instance, int value) {
assert(instance->top >= 0 && instance->top < 8);
instance->positions[instance->top++] = value;
}
/* pop top element of list instance */
void pop(stack_buf *instance) {
assert(instance->top > 0);
instance->positions[--instance->top] = -1;
}
/* ##########################################################################################
#################################### THREAD FUNCTIONS ####################################
########################################################################################## */
static int consumptions = 0;
/* entry point for each worker (consumer)
workers will check each queen's row, column and
diagonal to evaluate satisfactory placements */
void *eval_positioning(void *id) {
long thr_id = (long)id;
int qpositions[N];
while (!global.done) {
pthread_mutex_lock(&buffer_mutex);
while (global.buf_empty == 1) {
if (global.done) break; // consumers who didn't get last production
pthread_cond_wait(&filled, &buffer_mutex);
}
if (global.done) break;
consumptions++;
memcpy(qpositions, global.positions, N * sizeof(int)); // retrieve queen combination
global.buf_empty = 1;
pthread_cond_signal(&empty);
pthread_mutex_unlock(&buffer_mutex);
if (valid_rows(qpositions) && valid_columns(qpositions) && valid_diagonals(qpositions)) {
/* save for printing later */
pthread_mutex_lock(&print_mutex);
memcpy(printouts.qpositions[++printouts.top], qpositions, N * sizeof(int));
pthread_mutex_unlock(&print_mutex);
}
}
return NULL;
}
static int productions = 0;
/* recursively generate all possible queen_combs */
void rec_positions(int pos, int queens) {
if (queens == 0) { // base case
pthread_mutex_lock(&buffer_mutex);
while (global.buf_empty == 0) {
pthread_cond_wait(&empty, &buffer_mutex);
}
productions++;
memcpy(global.positions, queen_comb.positions, N * sizeof(int));
global.buf_empty = 0;
pthread_mutex_unlock(&buffer_mutex);
pthread_cond_broadcast(&filled); // wake all waiting workers
return;
}
for (int i = pos; i <= N*N - queens; i++) {
push(&queen_comb, i); // physical chess box
rec_positions(i+1, queens-1);
pop(&queen_comb);
}
}
/* binomial coefficient | without order, without replacement
8 queens on 8x8 board: 4'426'165'368 queen combinations */
void *generate_positions(void *arg) {
rec_positions(0, N);
return (void*)1;
}
/* ##########################################################################################
########################################## MAIN ##########################################
########################################################################################## */
/* main procedure of the program */
int main(int argc, char *argv[]) {
if (argc < 3) {
printf("usage: ./8q <workers> <board width/height>\n");
exit(1);
}
int workers = atoi(argv[1]);
N = atoi(argv[2]);
pthread_t thr[workers];
pthread_t producer;
// int sol1[] = {5,8,20,25,39,42,54,59};
// int sol2[] = {2,12,17,31,32,46,51,61};
printf("\n");
start = (float)clock()/CLOCKS_PER_SEC;
pthread_create(&producer, NULL, generate_positions, NULL);
for (long i = 0; i < workers; i++) {
pthread_create(&thr[i], NULL, eval_positioning, (void*)i+1);
}
pthread_join(producer, (void*)&global.done);
pthread_cond_broadcast(&filled);
for (int i = 0; i < workers; i++) {
pthread_join(thr[i], NULL);
}
stop = clock();
diff = (double)(stop - start) / CLOCKS_PER_SEC;
/* go through all valid solutions and print */
print(printouts);
printf("board: %dx%d, workers: %d (+1), exec time: %ld, solutions: %d\n", N, N, workers, diff, printouts.top+1);
printf("productions: %d\nconsumptions: %d\n", productions, consumptions);
return 0;
}
EDIT: I have reworked the synchronization around prod_done and added a new shared variable, last_done. When the producer is done, it sets prod_done, and the currently active thread either returns (the last element was already validated) or captures the last element and sets last_done to inform the other consumers.
Although I believe this resolves the data race, I still have problems with the shared combination. I have spent a lot of time on the synchronization and keep coming back to the feeling that it should work, but it clearly doesn't when I run it.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <string.h>
#include <assert.h>
typedef struct stack_buf {
int positions[8];
int top;
} stack_buf;
typedef struct global_buf {
int positions[8];
volatile int buf_empty;
volatile long prod_done;
volatile int last_done;
} global_buf;
typedef struct print_buf {
int qpositions[100][8];
int top;
} print_buf;
stack_buf queen_comb = { {0}, 0 };
global_buf global = { {0}, 1, 0, 0 };
print_buf printouts = { {{0}}, -1 };
int N; //NxN board and N queens to place
long productions, consumptions = 0;
clock_t start, stop, diff;
pthread_mutex_t buffer_mutex, print_mutex;
pthread_cond_t empty, filled;
/* ##########################################################################################
################################## VALIDATION FUNCTIONS ##################################
########################################################################################## */
/* Validate that no queens are placed on the same row */
int valid_rows(int qpositions[]) {
int rows[N];
memset(rows, 0, N*sizeof(int));
int row;
for (int i = 0; i < N; i++) {
row = qpositions[i] / N;
if (rows[row] == 0) rows[row] = 1;
else return 0;
}
return 1;
}
/* Validate that no queens are placed in the same column */
int valid_columns(int qpositions[]) {
int columns[N];
memset(columns, 0, N*sizeof(int));
int column;
for (int i = 0; i < N; i++) {
column = qpositions[i] % N;
if (columns[column] == 0) columns[column] = 1;
else return 0;
}
return 1;
}
/* Validate that left and right diagonals aren't used by another queen */
int valid_diagonals(int qpositions[]) {
int left_bottom_diagonals[N];
int right_bottom_diagonals[N];
int row, col, temp_col, temp_row, fill_value, index;
for (int queen = 0; queen < N; queen++) {
row = qpositions[queen] / N;
col = qpositions[queen] % N;
/* position --> left down diagonal endpoint (index) */
fill_value = col < row ? col : row; // closest to bottom or left wall
temp_row = row - fill_value;
temp_col = col - fill_value;
index = temp_row * N + temp_col; // board position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (left_bottom_diagonals[i] == index) return 0;
}
left_bottom_diagonals[queen] = index; // no interference
/* position --> right down diagonal endpoint (index) */
fill_value = (N-1) - col < row ? N - col - 1 : row; // closest to bottom or right wall
temp_row = row - fill_value;
temp_col = col + fill_value;
index = temp_row * N + temp_col; // board position
for (int i = 0; i < queen; i++) { // check if interference occurs
if (right_bottom_diagonals[i] == index) return 0;
}
right_bottom_diagonals[queen] = index; // no interference
}
return 1;
}
/* ##########################################################################################
#################################### HELPER FUNCTIONS ####################################
########################################################################################## */
/* print the collected solutions */
void print(print_buf printouts) {
static int solution_number = 1;
int placement;
for (int sol = 0; sol <= printouts.top; sol++) { // number of solutions
printf("Solution %d: [ ", solution_number++);
for (int pos = 0; pos < N; pos++) {
printf("%d ", printouts.qpositions[sol][pos]+1);
}
printf("]\n");
printf("Placement:\n");
for (int i = 1; i <= N; i++) { // rows
printf("[ ");
placement = printouts.qpositions[sol][N-i];
for (int j = (N-i)*N; j < (N-i)*N+N; j++) { // physical position
if (j == placement) {
printf(" Q ");
} else printf("%2d ", j+1);
}
printf("]\n");
}
printf("\n");
}
}
/* ##########################################################################################
#################################### THREAD FUNCTIONS ####################################
########################################################################################## */
/* entry point for each worker (consumer)
workers will check each queen's row, column and
diagonal to evaluate satisfactory placements */
void *eval_positioning(void *id) {
long thr_id = (long)id;
int qpositions[N];
pthread_mutex_lock(&buffer_mutex);
while (!global.last_done) {
while (global.buf_empty == 1) {
pthread_cond_wait(&filled, &buffer_mutex);
if (global.last_done) { // last_done ==> prod_done, so thread returns
pthread_mutex_unlock(&buffer_mutex);
return NULL;
}
if (global.prod_done) { // prod done, current thread takes last elem produced
global.last_done = 1;
break;
}
}
if (!global.last_done) consumptions++;
memcpy(qpositions, global.positions, N*sizeof(int)); // retrieve queen combination
global.buf_empty = 1;
pthread_mutex_unlock(&buffer_mutex);
pthread_cond_signal(&empty);
if (valid_rows(qpositions) && valid_columns(qpositions) && valid_diagonals(qpositions)) {
/* save for printing later */
pthread_mutex_lock(&print_mutex);
memcpy(printouts.qpositions[++printouts.top], qpositions, N*sizeof(int));
pthread_mutex_unlock(&print_mutex);
}
pthread_mutex_lock(&buffer_mutex);
}
pthread_mutex_unlock(&buffer_mutex);
return NULL;
}
/* recursively generate all possible queen_combs */
void rec_positions(int pos, int queens) {
if (queens == 0) { // base case
pthread_mutex_lock(&buffer_mutex);
while (global.buf_empty == 0) {
pthread_cond_wait(&empty, &buffer_mutex);
}
productions++;
memcpy(global.positions, queen_comb.positions, N*sizeof(int));
global.buf_empty = 0;
pthread_mutex_unlock(&buffer_mutex);
pthread_cond_signal(&filled);
return;
}
for (int i = pos; i <= N*N - queens; i++) {
queen_comb.positions[queen_comb.top++] = i;
rec_positions(i+1, queens-1);
queen_comb.top--;
}
}
/* binomial coefficient | without order, without replacement
8 queens on 8x8 board: 4'426'165'368 queen combinations */
void *generate_positions(void *arg) {
rec_positions(0, N);
return (void*)1;
}
/* ##########################################################################################
########################################## MAIN ##########################################
########################################################################################## */
/* main procedure of the program */
int main(int argc, char *argv[]) {
if (argc < 3) {
printf("usage: ./8q <workers> <board width/height>\n");
exit(1);
}
int workers = atoi(argv[1]);
N = atoi(argv[2]);
pthread_t thr[workers];
pthread_t producer;
printf("\n");
start = (float)clock()/CLOCKS_PER_SEC;
pthread_create(&producer, NULL, generate_positions, NULL);
for (long i = 0; i < workers; i++) {
pthread_create(&thr[i], NULL, eval_positioning, (void*)i+1);
}
pthread_join(producer, (void*)&global.prod_done);
pthread_cond_broadcast(&filled);
for (int i = 0; i < workers; i++) {
printf("thread #%d done\n", i+1);
pthread_join(thr[i], NULL);
pthread_cond_broadcast(&filled);
}
stop = clock();
diff = (double)(stop - start) / CLOCKS_PER_SEC;
/* go through all valid solutions and print */
print(printouts);
printf("board: %dx%d, workers: %d (+1), exec time: %ld, solutions: %d\n", N, N, workers, diff, printouts.top+1);
printf("productions: %ld\nconsumptions: %ld\n", productions, consumptions);
return 0;
}
"I'm quite sure that my implementation is correct with locks and conditional variables"
That is a bold statement, and it's provably false. Your program hangs on Linux when run with clang -g q.c -o 8q && ./8q 2 4.
When I look at the state of the program, I see one thread here:
#4 __pthread_cond_wait (cond=0x404da8 <filled>, mutex=0x404d80 <buffer_mutex>) at pthread_cond_wait.c:619
#5 0x000000000040196b in eval_positioning (id=0x1) at q.c:163
#6 0x00007ffff7f8cd80 in start_thread (arg=0x7ffff75b6640) at pthread_create.c:481
#7 0x00007ffff7eb7b6f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
and the main thread trying to join the above thread. All other threads have exited, so there is nothing to signal the condition.
One immediate problem I see is this:
void *eval_positioning(void *id) {
long thr_id = (long)id;
int qpositions[N];
while (!global.done) {
...
int main(int argc, char *argv[]) {
...
pthread_join(producer, (void*)&global.done);
If the producer thread finishes before the eval_positioning starts, then eval_positioning will do nothing at all.
You should set global.done when all positions have been evaluated, not when the producer thread is done.
Another obvious problem is that global.done is accessed without any mutexes held, yielding a data race (undefined behavior -- anything can happen).
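The points above can be illustrated with a reduced producer/consumer sketch. It is not the N-queens program: the names slot_t and run_pipeline are hypothetical, and the "work" is only counting items. It assumes a one-slot buffer as in the question, reads and writes the done flag only while the mutex is held, and lets consumers exit only when the producer is finished and the slot is empty. It also initializes the mutex and condition variables explicitly, which POSIX requires and which the posted code does not do.

```c
#include <pthread.h>
#include <stdlib.h>

/* One-slot buffer. full, done, item and consumed are only touched with
   `lock` held; to_produce is set before any thread is started. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  filled;   /* signalled when the slot holds an item */
    pthread_cond_t  empty;    /* signalled when the slot is free */
    int  item;
    int  full;                /* 1 while `item` waits to be consumed */
    int  done;                /* 1 once the producer publishes nothing more */
    long consumed;            /* number of items taken by consumers */
    int  to_produce;
} slot_t;

static void *consumer(void *arg)
{
    slot_t *s = arg;
    pthread_mutex_lock(&s->lock);
    for (;;) {
        while (!s->full && !s->done)
            pthread_cond_wait(&s->filled, &s->lock);
        if (!s->full)             /* producer is done and nothing is left */
            break;
        s->full = 0;              /* take the item */
        s->consumed++;
        pthread_cond_signal(&s->empty);
        /* a real worker would copy the item here, unlock, validate it,
           and lock again before the next iteration */
    }
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

static void *producer(void *arg)
{
    slot_t *s = arg;
    for (int i = 0; i < s->to_produce; i++) {
        pthread_mutex_lock(&s->lock);
        while (s->full)
            pthread_cond_wait(&s->empty, &s->lock);
        s->item = i;
        s->full = 1;
        pthread_cond_signal(&s->filled);
        pthread_mutex_unlock(&s->lock);
    }
    pthread_mutex_lock(&s->lock);
    s->done = 1;                  /* set under the lock, then wake everyone */
    pthread_cond_broadcast(&s->filled);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Runs one producer and `workers` consumers; returns the number of items
   consumed, or -1 if the thread array could not be allocated. */
long run_pipeline(int workers, int items)
{
    slot_t s = { .full = 0, .done = 0, .consumed = 0, .to_produce = items };
    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.filled, NULL);
    pthread_cond_init(&s.empty, NULL);

    pthread_t prod;
    pthread_t *thr = malloc((size_t)workers * sizeof *thr);
    if (thr == NULL)
        return -1;
    pthread_create(&prod, NULL, producer, &s);
    for (int i = 0; i < workers; i++)
        pthread_create(&thr[i], NULL, consumer, &s);
    pthread_join(prod, NULL);
    for (int i = 0; i < workers; i++)
        pthread_join(thr[i], NULL);
    free(thr);

    pthread_cond_destroy(&s.empty);
    pthread_cond_destroy(&s.filled);
    pthread_mutex_destroy(&s.lock);
    return s.consumed;
}
```

With this structure the number of consumptions equals the number of productions regardless of how many workers run, because an item can only be taken while full is set and the lock is held.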

Beginner question: how do I call my string-reversing method from main()?

The class below implements "reversing a string"; it is correct and works on the LeetCode website. Now I want to run it myself by passing in my own input (see the int main() part below), but I still can't get it to run after thinking about it for a long time. As a beginner I would sincerely appreciate advice, and if you could also attach your own version so that I can learn from it, thank you.
#include <iostream>
#include <string>
using namespace std;
class Solution
{
public:
string reverseWords(string s)
{
if (s.size() == 0)
{
return s;
}
int front = 0, back = 0;
for (int i = 0; i < s.size() - 1; i++)
{
if (s[i] != ' ')
{
back++;
}
else
{
reverse(s.begin() + front, s.begin() + back);
front = back + 1;
back = front;
}
}
back++;
reverse(s.begin() + front, s.begin() + back);
return s;
}
};
int main()
{
Solution word01;
string s1= "Hello caterpillar";
word01 s1;
cout << s1.reverseWords();
}
Your code is pretty good; however, we want to reverse the order of the words, not the characters, and for that we can use a while loop.
Similarly, using two pointers, this would pass just fine:
// The following block might trivially improve the exec time;
// Can be removed;
static const auto __optimize__ = []() {
std::ios::sync_with_stdio(false);
std::cin.tie(NULL);
std::cout.tie(NULL);
return 0;
}();
// Most of headers are already included;
// Can be removed;
#include <cstdint>
#include <string>
#include <algorithm>
static const struct Solution {
using ValueType = std::uint_fast16_t;
std::string reverseWords(std::string s) {
std::reverse(std::begin(s), std::end(s));
ValueType len = std::size(s);
ValueType index = 0;
for (auto left = 0; left < len; ++left) {
if (s[left] != ' ') {
if (index) {
s[index++] = ' ';
}
ValueType right = left;
while (right < len && s[right] != ' ') {
s[index++] = s[right++];
}
std::reverse(std::begin(s) + index - (right - left), std::begin(s) + index);
left = right;
}
}
s.erase(std::begin(s) + index, std::end(s));
return s;
}
};
Here is LeetCode's solution with comments:
class Solution {
public:
string reverseWords(string s) {
// reverse the whole string
reverse(s.begin(), s.end());
int n = s.size();
int idx = 0;
for (int start = 0; start < n; ++start) {
if (s[start] != ' ') {
// go to the beginning of the word
if (idx != 0) s[idx++] = ' ';
// go to the end of the word
int end = start;
while (end < n && s[end] != ' ') s[idx++] = s[end++];
// reverse the word
reverse(s.begin() + idx - (end - start), s.begin() + idx);
// move to the next word
start = end;
}
}
s.erase(s.begin() + idx, s.end());
return s;
}
};
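The same approach (reverse the whole string, then copy each word forward and reverse it back) can also be sketched in plain C on a mutable buffer. The function names reverse_range and reverse_words are chosen here for illustration and are not part of either solution above; the sketch assumes words are separated by space characters only.

```c
#include <stddef.h>
#include <string.h>

/* Reverse the characters in the half-open range s[lo, hi). */
static void reverse_range(char *s, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        char tmp = s[lo];
        s[lo++] = s[--hi];
        s[hi] = tmp;
    }
}

/* Reverse the order of the words in s in place. Leading, trailing and
   repeated spaces are collapsed to single separators. */
void reverse_words(char *s)
{
    size_t n = strlen(s);
    size_t idx = 0;                     /* next write position */

    reverse_range(s, 0, n);             /* reverse the whole string */
    for (size_t start = 0; start < n; start++) {
        if (s[start] == ' ')
            continue;                   /* skip separators */
        if (idx != 0)
            s[idx++] = ' ';             /* one space between words */
        size_t end = start;
        while (end < n && s[end] != ' ')
            s[idx++] = s[end++];        /* copy the word forward */
        reverse_range(s, idx - (end - start), idx); /* restore its letters */
        start = end;
    }
    s[idx] = '\0';                      /* drop any leftover characters */
}
```

Because the function writes into its argument, it has to be called on a character array such as char buf[] = "Hello caterpillar"; and not on a string literal.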
References
For additional details, please see the Discussion Board where you can find plenty of well-explained accepted solutions with a variety of languages including low-complexity algorithms and asymptotic runtime/memory analysis.

CS50 Plurality Problem, error: use of undeclared identifier 'i'

I'm trying to solve the pset3 Plurality problem for the CS50 class. Line 93 of my code has been the issue: I'm having trouble with the last part of the problem set, printing the winner.
I think the vote totals section is okay, but I can't get the code right for the winners section. When I run the code I receive the following error message:
error: use of undeclared identifier 'i'
printf("%s\n", candidates[i].name);
#include <cs50.h>
#include <stdio.h>
#include <string.h>
// Max number of candidates
#define MAX 9
// Candidates have name and vote count
typedef struct
{
string name;
int votes;
}
candidate;
// Array of candidates
candidate candidates[MAX];
// Number of candidates
int candidate_count;
// Function prototypes
bool vote(string name);
void print_winner(void);
int main(int argc, string argv[])
{
// Check for invalid usage
if (argc < 2)
{
printf("Usage: plurality [candidate ...]\n");
return 1;
}
// Populate array of candidates
candidate_count = argc - 1;
if (candidate_count > MAX)
{
printf("Maximum number of candidates is %i\n", MAX);
return 2;
}
for (int i = 0; i < candidate_count; i++)
{
candidates[i].name = argv[i + 1];
candidates[i].votes = 0;
}
int voter_count = get_int("Number of voters: ");
// Loop over all voters
for (int i = 0; i < voter_count; i++)
{
string name = get_string("Vote: ");
// Check for invalid vote
if (!vote(name))
{
printf("Invalid vote.\n");
}
}
// Display winner of election
print_winner();
}
// Update vote totals given a new vote
bool vote(string name)
{
for (int i = 0; i < candidate_count; i++)
{
if (strcmp(name, candidates[i].name) == 0)
candidates[i].votes++;
}
return true;
return false;
}
// Print the winner (or winners) of the election
void print_winner(void)
{
int maxvote = 0;
for (int i = 0; i < candidate_count; i++)
{
if (candidates[i].votes > maxvote)
maxvote = candidates[i].votes;
}
printf("%s\n", candidates[i].name);
return;
}
The i variable is defined only within the scope of your loop. Once the loop is over, your print statement tries to print candidates[i].name, but i no longer exists. Just as you save the maximum number of votes, you also need to save the candidate's index in a variable declared outside the loop.
int maxvote = 0;
int winnerIndex = 0; // initialized so it is valid even if nobody received a vote
for (int i = 0; i < candidate_count; i++)
{
if (candidates[i].votes > maxvote) {
maxvote = candidates[i].votes;
winnerIndex = i;
}
}
printf("%s\n", candidates[winnerIndex].name);
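Since the function's comment mentions "winner (or winners)", a tie may also need handling. A minimal two-pass sketch follows, assuming a simplified candidate struct without the CS50 library; the helpers highest_vote, find_winners and print_winners are hypothetical names, not part of the problem set's distribution code.

```c
#include <stdio.h>

typedef struct {
    const char *name;
    int votes;
} candidate;

/* Returns the largest vote count among the first `count` candidates. */
static int highest_vote(const candidate c[], int count)
{
    int maxvote = 0;
    for (int i = 0; i < count; i++)
        if (c[i].votes > maxvote)
            maxvote = c[i].votes;
    return maxvote;
}

/* Stores the index of every candidate tied for the highest vote count in
   `winners` (which must have room for `count` entries) and returns how
   many indices were stored. */
int find_winners(const candidate c[], int count, int winners[])
{
    int maxvote = highest_vote(c, count);
    int n = 0;
    for (int i = 0; i < count; i++)
        if (c[i].votes == maxvote)
            winners[n++] = i;
    return n;
}

/* Prints the name of each winner on its own line. */
void print_winners(const candidate c[], int count)
{
    int maxvote = highest_vote(c, count);
    for (int i = 0; i < count; i++)
        if (c[i].votes == maxvote)
            printf("%s\n", c[i].name);
}
```

The first pass finds the maximum and the second pass reports everyone who reached it, so no loop index has to outlive its loop.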

stringstream serialization, on centOS, of large set of floats is faster than 4 pthread-ts which serialize chunks. std::threads are faster on Windows

I have the task to optimize the serialization of large sets of floats on a hard-disk.
My initial approach has the following:
class StringStreamDataSerializer
{
public:
void serializeRawData(const vector<float>& data);
void saveToFileStream(std::fstream& file);
private:
stringstream _stringStream;
};
void StringStreamDataSerializer::serializeRawData(const vector<float>& data)
{
for (float currentFloat : data)
_stringStream << currentFloat;
}
void StringStreamDataSerializer::saveToFileStream(std::fstream& file)
{
file << _stringStream.str().c_str();
file.close();
}
I wanted to split the serialization across 4 threads to make it faster. Here's how:
struct st_args
{
const vector<float>* data;
size_t from;
size_t to;
size_t segment;
} ;
string outputs[4];
std::mutex g_display_mutex;
void serializeLocal(void *context)
{
struct st_args *readParams = (st_args*)context;
for (auto i = readParams->from; i < readParams->to; i++)
{
string currentFloat = std::to_string( readParams->data->at(i));
currentFloat.erase(currentFloat.find_last_not_of('0') + 1,
std::string::npos);
outputs[readParams->segment] += currentFloat;
}
}
void SImplePThreadedSerializer::serializeRawData(const vector<float>& data)
{
const int N = 4;
size_t totalFloats = data.size();
st_args* seg;
pthread_t* chunk;
chunk = (pthread_t *) malloc(N*sizeof(pthread_t));
seg = (st_args *) malloc(N*sizeof(st_args));
size_t from = 0;
for(int i = 0; i < N; i++)
{
seg[i].from = 0;
seg[i].data = &data;
}
int i = 0;
for (; i < N - 1; ++i)
{
seg[i].from = from;
seg[i].to = seg[i].from + totalFloats / N;
seg[i].segment = i;
pthread_create(&chunk[i], NULL, (void *(*)(void *)) serializeLocal,
(void *) &(seg[i]));
from += totalFloats / N;
}
seg[i].from = from;
seg[i].to = totalFloats;
seg[i].segment = i;
pthread_create(&chunk[i], NULL, (void *(*)(void *)) serializeLocal, (void *)
&(seg[i]));
size_t totalBuffered = 0;
for (int k = 0; k < N; k++)
{
pthread_join(chunk[k], NULL);
totalBuffered += outputs[k].size();
}
str.reserve(totalBuffered);
for (int k = 0; k < N; k++)
{
str+= outputs[k];
}
free(chunk);
free(seg);
}
It turns out that the stringstream is faster than even 4 threads on Linux. On Windows I achieve a speedup with the presented approach (using std::thread), but on Linux I get the opposite result. Any explanation why would be helpful and appreciated.
Here are the results on centOS:
* Serialization of 10000000 floats on the hard disk *
StringStreamDataSerializer flushes data in file in 0.55 seconds.
StringStreamDataSerializer Finished in 3.28 seconds.
SImplePThreadedSerializer flushes data in file in 0.46 seconds.
SImplePThreadedSerializer Finished in 6.96 seconds.
On Windows, the multithreaded serialization is done by 4 std::threads, and they actually speed up the serialization:
static void serializeChunk(string& output, const vector<float>& data, size_t
from, size_t to)
{
for (auto i = from; i < to; i++)
{
string currentFloat = std::to_string(data[i]);
// trim the trailing zeroes
currentFloat.erase(currentFloat.find_last_not_of('0') + 1,
std::string::npos);
output += currentFloat;
}
}
void SimpleMultiThreadedSerializer::serializeRawData(const vector<float>&
data)
{
const int N = 4;
thread t[N]; // say, 4 CPUs.
string outputs[N];
size_t totalFloats = data.size();
size_t from = 0;
int i = 0;
for (; i < N - 1; ++i)
{
t[i] = thread(serializeChunk, std::ref(outputs[i]), data, from, from +
totalFloats / N);
from += totalFloats / N;
}
t[i] = thread(serializeChunk, std::ref(outputs[i]), data, from,
totalFloats);
for (i = 0; i < N; ++i)
t[i].join();
size_t totalBuffered = 0;
for (int i = 0; i < N; ++i)
totalBuffered += outputs[i].size();
str.reserve(totalBuffered);
for (int i = 0; i < N; ++i)
str += outputs[i];
}
And the results:
* Serialization of 1000000 floats on the hard disk *
StringStreamDataSerializer flushes data in file in 0.116 seconds.
StringStreamDataSerializer Finished in 10.236 seconds.
SimpleMultiThreadedSerializer flushes data in file in 0.105 seconds.
SimpleMultiThreadedSerializer Finished in 3.01 seconds.
Conversion between binary floating point and decimal output is very expensive. If performance is a concern, you should serialize the data in binary (possibly after endianness conversion, so that you get at least interoperability across IEEE 754 systems).
Regarding the poor threading performance on GNU/Linux, this is likely a known performance issue with locale object handling. In multi-threaded mode, stringstream currently uses a process-wide, heavily contended reference counter for locale handling.
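As a rough illustration of the binary route, the sketch below writes and reads a float array with fwrite and fread. The function names save_floats and load_floats are hypothetical, and the sketch deliberately skips the endianness conversion mentioned above, so the file is only portable between machines with the same byte order and float format.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes `count` floats to `path` in the host's native binary layout.
   Returns 0 on success and -1 on failure. */
int save_floats(const char *path, const float *data, size_t count)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(data, sizeof *data, count, f);
    int close_failed = fclose(f);   /* fclose flushes; its result matters */
    return (written == count && close_failed == 0) ? 0 : -1;
}

/* Reads exactly `count` floats from `path` into `data`.
   Returns 0 on success and -1 on failure or a short file. */
int load_floats(const char *path, float *data, size_t count)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(data, sizeof *data, count, f);
    fclose(f);
    return got == count ? 0 : -1;
}
```

No text conversion happens at all here: each value is stored as its raw bytes, so the values read back are bit-for-bit identical to the ones written.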

C, convert hex number to decimal number without functions

I'm trying to convert a hexadecimal number to a decimal number. What I've come up with so far is:
#include <unistd.h>
#include <stdio.h>
long convert(char *input, short int *status){
int length = 0;
while(input[length])
{
length++;
}
if(length = 0)
{
*status = 0;
return 0;
}
else
{
int index;
int converter;
int result = 0;
int lastNumber = length-1;
int currentNumber;
for(index = 0; index < length; index++){
if(index == 0)
{
converter = 1;
}
else if(index == 1)
{
converter = 16;
}
else{
converter *= 16;
}
if(input[lastNumber] < 45 || input[lastNumber] > 57)
{
*status = 0;
return 0;
}
else if(input[lastNumber] > 45 && input[lastNumber] < 48)
{
*status = 0;
return 0;
}
else{
if(input[lastNumber] == 45)
{
*status = -1;
return result *= -1;
}
currentNumber = input[lastNumber] - 48;
result += currentNumber * converter;
lastNumber--;
}
}
*status = -1;
return result;
}
}
int main(int argc, char **argv)
{
char *input=0;
short int status=0;
long rezult=0;
if(argc!=2)
{
status=0;
}
else
{
input=argv[1];
rezult=convert(input,&status);
}
printf("result: %ld\n", rezult);
printf("status: %d\n", status);
return 0;
}
Somehow I always get the result 0. I am also not allowed to use any other external functions (except printf). What could be wrong with my code above?
This:
if(length = 0)
{
*status = 0;
return 0;
}
is not merely testing length; it first sets it to 0. The assignment evaluates to 0, which is false, so the else clause runs, but with length equal to 0. The loop therefore never executes and the result stays 0, which is not the expected outcome.
You should just use == to compare, of course.
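Beyond that fix, the posted loop only accepts the characters '0' to '9', so the letters A to F would be rejected as well. A compact left-to-right alternative is sketched below. The function name convert_hex is hypothetical; it keeps the question's convention of setting *status to -1 on success and 0 on failure, calls no library functions, and does not check for overflow.

```c
/* Parses an optionally negative hexadecimal string such as "1A" or "-ff".
   Sets *status to -1 on success and to 0 on invalid or empty input. */
long convert_hex(const char *input, short *status)
{
    long result = 0;
    int negative = 0;
    int i = 0;

    if (input[i] == '-') {          /* optional leading minus sign */
        negative = 1;
        i++;
    }
    if (input[i] == '\0') {         /* no digits at all */
        *status = 0;
        return 0;
    }

    for (; input[i] != '\0'; i++) {
        char c = input[i];
        int digit;
        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else {                      /* not a hexadecimal digit */
            *status = 0;
            return 0;
        }
        result = result * 16 + digit;   /* shift left one hex place, add digit */
    }

    *status = -1;
    return negative ? -result : result;
}
```

Processing the digits from left to right removes the need to compute the length first or to track a separate power of 16.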
