// Function prototype with default parameters
void func(int p1, int p2 = 0, int p3 = 0);
// Function body
void func(int p1, int p2, int p3)
{
    printf("The # of explicit parameters is %d\n", ??);
    printf("The # of implicit parameters is %d\n", ??);
}
// Use in code
int main(int argc, char* argv[])
{
    func(0); // Should print 1 explicit, 2 implicit
    func(0, 0); // Should print 2 explicit, 1 implicit
    func(0, 0, 0); // Should print 3 explicit, 0 implicit
    return 0;
}
In the above example, how can I get the actual number of parameters passed in explicitly? And the number of parameters passed in implicitly?

As @Igor pointed out in a comment, it is not possible to detect inside the function whether a parameter took its default value or was actually passed.
One possible way to emulate it is to switch from default parameters to overloads.
Another is to make the default value distinguishable from any explicit value. This can be done with std::optional, as long as you never pass an empty optional explicitly:
#include <stdio.h>
#include <optional>
void func(std::optional<int> p1 = std::nullopt, std::optional<int> p2 = std::nullopt, std::optional<int> p3 = std::nullopt);
// Function body
void func(std::optional<int> p1, std::optional<int> p2, std::optional<int> p3)
{
    printf("The # of explicit parameters is %d\n", p1.has_value() + p2.has_value() + p3.has_value());
    printf("The # of implicit parameters is %d\n", !p1.has_value() + !p2.has_value() + !p3.has_value());
}
// Use in code
int main(int argc, char* argv[])
{
    func(0); // Should print 1 explicit, 2 implicit
    func(0, 0); // Should print 2 explicit, 1 implicit
    func(0, 0, 0); // Should print 3 explicit, 0 implicit
    return 0;
}
Also it can be emulated with variadic template parameters.


Is there a way to perform signed division in eBPF?

I am trying to perform signed division in eBPF, but LLVM throws an "unsupported" error. Is there any other way (direct or indirect) to perform signed division in eBPF?
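The base eBPF instruction set doesn't have a signed division instruction. (A signed division instruction was added later, in ISA version 4, which needs a recent kernel and compiling with -mcpu=v4.)
You can still work around it though. Signed division is unsigned division of the magnitudes, with the sign of the result being the XOR of the operand signs. That is, the output is negative if exactly one of the operands is negative, while dividing a negative number by a negative number gives a positive result.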
This is what I came up with:
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>
int32_t sdiv(int32_t a, int32_t b) {
    bool aneg = a < 0;
    bool bneg = b < 0;
    // get the absolute value of both; negating as unsigned avoids overflow for INT32_MIN
    uint32_t adiv = aneg ? -(uint32_t)a : (uint32_t)a;
    uint32_t bdiv = bneg ? -(uint32_t)b : (uint32_t)b;
    // Do udiv
    uint32_t out = adiv / bdiv;
    // Make output negative if one or the other is negative, not both
    return (int32_t)(aneg != bneg ? -out : out);
}
int main()
{
    printf("%d\n", sdiv(100, 5));
    printf("%d\n", sdiv(-100, 5));
    printf("%d\n", sdiv(100, -5));
    printf("%d\n", sdiv(-100, -5));
    return 0;
}
I am sure there are better ways to do it, but this seems to work.

Non-collective write using a file view

When writing blocks to a file, with the blocks unevenly distributed across my processes, one can use MPI_File_write_at with the right offset. As this function is not a collective operation, this works well.
Example:
#include <cstdio>
#include <cstdlib>
#include <string>
#include <mpi.h>
int main(int argc, char* argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int global = 7; // prime helps have unbalanced procs
    int local = (global/size) + (global%size>rank?1:0);
    int strsize = 5;
    MPI_File fh;
    // The snippet did not show the open call; file name and mode here are assumptions
    MPI_File_open(MPI_COMM_WORLD, "output.txt", MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    for (int i=0; i<local; ++i)
    {
        size_t idx = i * size + rank;
        std::string buffer = std::string(strsize, static_cast<char>('a' + idx));
        size_t offset = buffer.size() * idx;
        MPI_File_write_at(fh, offset, buffer.c_str(), buffer.size(), MPI_CHAR, MPI_STATUS_IGNORE);
    }
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
However, for more complex writes, particularly when writing multidimensional data like raw images, one may want to create a view on the file with MPI_Type_create_subarray. When I use this method with a plain MPI_File_write (which is supposed to be non-collective), I run into deadlocks. Example:
#include <cstdio>
#include <cstdlib>
#include <string>
#include <mpi.h>
int main(int argc, char* argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    int global = 7; // prime helps have unbalanced procs
    int local = (global/size) + (global%size>rank?1:0);
    int strsize = 5;
    MPI_File fh;
    // The snippet did not show the open call; file name and mode here are assumptions
    MPI_File_open(MPI_COMM_WORLD, "output.txt", MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    for (int i=0; i<local; ++i)
    {
        size_t idx = i * size + rank;
        std::string buffer = std::string(strsize, static_cast<char>('a' + idx));
        int dim = 2;
        int gsizes[2] = { static_cast<int>(buffer.size()), global };
        int lsizes[2] = { static_cast<int>(buffer.size()), 1 };
        int offset[2] = { 0, static_cast<int>(idx) };
        MPI_Datatype filetype;
        MPI_Type_create_subarray(dim, gsizes, lsizes, offset, MPI_ORDER_C, MPI_CHAR, &filetype);
        MPI_Type_commit(&filetype); // a datatype must be committed before it is used in a view
        MPI_File_set_view(fh, 0, MPI_CHAR, filetype, "native", MPI_INFO_NULL);
        MPI_File_write(fh, buffer.c_str(), buffer.size(), MPI_CHAR, MPI_STATUS_IGNORE);
        MPI_Type_free(&filetype);
    }
    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
How can I keep such code from deadlocking? Keep in mind that my real code really does use the multidimensional capabilities of MPI_Type_create_subarray and cannot just use MPI_File_write_at.
Also, it is difficult for me to know the maximum number of blocks in a process, so I'd like to avoid doing an all-reduce and then looping up to the maximum number of blocks with empty writes when localnb <= id < maxnb.
You don't use MPI_REDUCE when you have a variable number of blocks per process. You use MPI_SCAN or MPI_EXSCAN; see the question "MPI IO Writing a file when offset is not known".
MPI_File_set_view is collective, so if 'local' is different on each process, you'll find yourself calling a collective routine from fewer than all processes in the communicator, which is why the code hangs. If you really, really need to do so, open the file with MPI_COMM_SELF.
The MPI_SCAN approach means each process can set the file view as needed, and then you can call the collective MPI_File_write_at_all (even if some processes have zero work, they still need to participate) and take advantage of whatever clever optimizations your MPI-IO implementation provides.

Pass multiple args to thread using struct (pthread)

I'm learning to program with pthreads by writing an adder program. After looking at several reference codes, I still don't get how to pass multiple arguments into a thread using a struct. Here is my buggy program:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <pthread.h>
typedef struct s_addition {
int num1;
int num2;
int sum;
} addition;
void *thread_add_function (void *ad)
{
    printf ("ad.num1:%d, ad.num2:%d\n",ad.num1, ad.num2);
    ad.sum = ad.num1 + ad.num2;
}
int main()
{
    int N = 5;
    int a[N], b[N], c[N];
    srand (time(NULL));
    // fill them with random numbers
    for ( int j = 0; j < N; j++ ) {
        a[j] = rand() % 392;
        b[j] = rand() % 321;
    }
    addition ad1;
    pthread_t thread[N];
    for (int i = 0; i < N; i++) {
        ad1.num1 = a[i];
        ad1.num2 = b[i];
        printf ("ad1.num1:%d, ad1.num2:%d\n",ad1.num1, ad1.num2);
        pthread_create (&thread[i], NULL, thread_add_function, &ad1);
        pthread_join(thread[i], NULL);
        c[i] = ad.sum;
    }
    printf( "This is the result of using pthread.\n");
    for ( int i = 0; i < N; i++) {
        printf( "%d + %d = %d\n", a[i], b[i], c[i]);
    }
}
But when compiling I got the following error:
vecadd_parallel.c:15:39: error: member reference base type 'void *' is not a
structure or union
printf ("ad.num1:%d, ad.num2:%d\n",ad.num1, ad.num2);
I tried but still cannot get a clue, what I am doing wrong with it?
It seems the problem is that you are trying to access struct members through a void pointer.
You need to add a line that casts the parameter of thread_add_function to the correct type, similar to addition* add = (addition*)ad;, and then use this variable in your function (note that you also have to change your .'s to -> because it is a pointer).
You should also make sure that the data you pass to a thread outlives the thread. Heap data from malloc() is the safe choice, as stack-allocated data may be gone or overwritten while the thread still uses it. It should be fine for the current implementation, because each thread is joined before ad1 is reused, but later changes could easily give strange, unpredictable behaviour.

Pthread create function

I am having difficulty understanding the creation of a pthread.
This is the function I declared in the beginning of my code
void *mini(void *numbers); //Thread calls this function
Initialization of thread
pthread_t minThread;
pthread_create(&minThread, NULL, (void *) mini, NULL);
void *mini(void *numbers)
{
    min = (numbers[0]);
    for (i = 0; i < 8; i++)
        if ( numbers[i] < min )
            min = numbers[i];
}
numbers is an array of integers
int numbers[8];
I'm not sure if I created the pthread correctly.
In the function, mini, I get the following error about setting min (declared as an int) equal to numbers[0]:
Assigning to 'int' from incompatible type 'void'
My objective is to compute the minimum value in numbers[ ] (min) in this thread and use that value later to pass it to another thread to display it. Thanks for any help I can get.
You need to pass 'numbers' as the last argument to pthread_create(). The new thread then calls 'mini' on its own stack with 'numbers' as the argument.
In 'mini', you should cast the void* back to an int pointer in order to dereference it correctly. You cannot dereference a void* directly, because it carries no information about the type it points to :)
Also, it's very confusing to have multiple variables in different threads with the name 'numbers'.
There are some minor improprieties in this program, but it illustrates basically what you want to do. You should play around with it, break it and improve it.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
void *mini(void *numbs)
{
    int *numbers = (int *) numbs;
    int *min = malloc(sizeof(int));
    *min = (numbers[0]);
    for (int i = 0; i < 8; i++)
        if (numbers[i] < *min )
            *min = numbers[i];
    return min; // handed back to the joining thread through pthread_join
}
int main(int argc, char *argv[])
{
    pthread_t minThread;
    int *min;
    int numbers[8] = {28, 47, 36, 45, 14, 23, 32, 16};
    pthread_create(&minThread, NULL, mini, numbers);
    pthread_join(minThread, (void **) &min);
    printf("min: %d\n", *min);
    free(min);
    return 0;
}

Can anyone show me an example of the dspr function in OpenBLAS?

I am trying to use the dspr function from OpenBLAS with Rcpp.
The purpose of dspr is:
A := alpha*x*x**T + A
In my case, I first define A as a matrix whose elements are all 0, with alpha=1 and x=(1,3), so the final matrix A should be {(1,3),(3,9)}. However, I never get the right result. I set the parameters as follows:
cblas_dspr(CblasColMajor,CblasUpper,2, 1, &(x[0]),1, &(A[0]));
Can anyone tell me how to set the right parameters of dspr? Thanks.
The file /usr/include/cblas.h on my machine shows the following signature for the C interface of the BLAS:
void cblas_dspr(const enum CBLAS_ORDER order, const enum CBLAS_UPLO Uplo,
const int N, const double alpha, const double *X,
const int incX, double *Ap);
Try that. You get the beginning of Rcpp vectors via x.begin() or via &(x[0]) as you did.
There is nothing pertinent to Rcpp here though.
Repeated from your own post: The BLAS dyadic product performs
A := alpha*x*x' + A
So A needs to be initialized with zero values.
In addition, do not forget that A is an upper triangular matrix in packed storage: dspr expects only the n*(n+1)/2 elements of the triangle, not a full n-by-n array.
For further reading I recommend these links:
However, you wanted an example. Here goes:
/** dspr_demo.cpp */
#include <cblas.h>
#include <iostream>
#include <cstdlib>
using namespace std;
int main(int argc, char** argv)
{
    int n=2;
    double* x = (double*)malloc(n*sizeof(double));
    double* upperTriangleResult = (double*)malloc(n*(n+1)*sizeof(double)/2);
    for (int j=0;j<n*(n+1)/2;j++) upperTriangleResult[j] = 0;
    x[0] = 1; x[1] = 3;
    double*& A = upperTriangleResult;
    // A := 1.0 * x * x' + A, with the upper triangle in packed column-major storage
    cblas_dspr(CblasColMajor, CblasUpper, n, 1.0, x, 1, A);
    cout << A[0] << "\t" << A[1] << endl << "*\t" << A[2] << endl;
    free(upperTriangleResult); free(x);
    return 0;
}
