RcppParallel: trying to implement an approach of lapply but in Parallel


I'm trying to implement something like R's lapply, but in parallel, using Rcpp and RcppParallel in C++. I want to apply a function to every element of a List in parallel. My problem is with the function I pass to std::transform: it comes from R, and I get an error when I try to compile the program.
The serial version, which avoids explicit loops by using std::transform and passing a 'Function f', compiles and works perfectly. This is the code:
List lapplyCppSerial(List input, Function f) {
    // one output slot per input element
    List output(input.size());
    std::transform(input.begin(), input.end(), output.begin(), f);
    return output;
}
When I try to do the same in parallel with parallelFor(), I have to create a Worker class that can iterate over all the elements. In operator() I want to use std::transform() again, but I get an error when passing the function as a parameter as before.
I think something is missing, but I can't find it. I hope someone can help me.
Here is the code:
#include <Rcpp.h>
#include <RcppParallel.h>
// [[Rcpp::depends(RcppParallel)]]
using namespace RcppParallel;
using namespace Rcpp;
struct Fun : public Worker
{
    // source vector
    const RVector<double> input;
    Rcpp::Function f;
    // output values
    RVector<double> output;
    Fun(const NumericVector input, NumericVector output, Function fun)
        : input(input), output(output), f(fun) {}
    void operator()(std::size_t begin, std::size_t end, Rcpp::Function f) {
        std::transform(input.begin() + begin, input.end() + end,
                       output.begin() + begin, as<double>(f)); //Here I have PROBLEMS!!!
    }
};
// [[Rcpp::export]]
List lapplyCppParallel(List input, Function f) {
    List output(input.size());
    Fun fun(input, output, f);
    parallelFor(0, input.size(), fun);
    return output;
}
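A few things stand out in the worker above. parallelFor calls operator() with exactly two arguments, (std::size_t begin, std::size_t end), so a third parameter cannot be added there. The end of the range should be input.begin() + end rather than input.end() + end. And the RcppParallel documentation states that code inside a worker must not call the R or Rcpp API, so an Rcpp::Function cannot safely be invoked from worker threads even once it compiles. The following is a minimal sketch of the worker pattern in plain C++ (no Rcpp, so it builds standalone), assuming the per-element function can be written in C++. The names square, SquareWorker, parallel_for_sketch and square_all are hypothetical, and parallel_for_sketch is only a stand-in for parallelFor, not its implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-element function: plain C++, no R API involved.
inline double square(double x) { return x * x; }

// Mirrors the shape of an RcppParallel Worker: operator() takes only a
// [begin, end) index range; the function to apply is baked into the worker.
struct SquareWorker {
    const std::vector<double>& input;
    std::vector<double>& output;

    SquareWorker(const std::vector<double>& in, std::vector<double>& out)
        : input(in), output(out) {}

    void operator()(std::size_t begin, std::size_t end) {
        // Both iterators are offsets from begin(): begin() + end, not end() + end.
        std::transform(input.begin() + begin, input.begin() + end,
                       output.begin() + begin, square);
    }
};

// Hypothetical stand-in for parallelFor: splits [begin, end) into chunks
// and runs the worker on each chunk in its own thread.
template <typename W>
void parallel_for_sketch(std::size_t begin, std::size_t end, W& worker,
                         std::size_t n_threads = 4) {
    assert(begin <= end && n_threads > 0);
    const std::size_t total = end - begin;
    const std::size_t chunk = (total + n_threads - 1) / n_threads;
    std::vector<std::thread> threads;
    for (std::size_t lo = begin; lo < end; lo += chunk) {
        const std::size_t hi = std::min(lo + chunk, end);
        threads.emplace_back([&worker, lo, hi] { worker(lo, hi); });
    }
    for (auto& t : threads) t.join();
}

// Convenience wrapper: applies square() to every element in parallel.
inline std::vector<double> square_all(const std::vector<double>& in) {
    std::vector<double> out(in.size());
    SquareWorker worker(in, out);
    parallel_for_sketch(0, in.size(), worker);
    return out;
}
```

Each chunk writes to a disjoint slice of the output, so no locking is needed. With RcppParallel the same structure would use RVector<double> members and the real parallelFor.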

Related

Passing class as argument in Rcpp function

I was reading the awesome Rcpp vignette on exposing C++ classes and functions using Rcpp modules. In that context, is it possible to create an Rcpp function that takes an object of class Uniform as one of its arguments and that is not part of the particular module being exported? Below is just a model of what I was thinking. The example is taken from the same vignette. The answer might already be there. It would be great if someone could point me to the right place.
#include <RcppArmadillo.h>
using namespace Rcpp;
class Uniform {
public:
    Uniform(double min_, double max_) :
        min(min_), max(max_) {}
    NumericVector draw(int n) const {
        RNGScope scope;
        return runif(n, min, max);
    }
    double min, max;
};
double uniformRange(Uniform* w) {
    return w->max - w->min;
}
RCPP_MODULE(unif_module) {
    class_<Uniform>("Uniform")
    .constructor<double,double>()
    .field("min", &Uniform::min)
    .field("max", &Uniform::max)
    .method("draw", &Uniform::draw)
    .method("range", &uniformRange)
    ;
}
/// JUST AN EXAMPLE: WON'T RUN
// [[Rcpp::export]]
double test(double z, Uniform* w) {
    return z + w->max ;
}

Following Dirk's comment, I am posting a possible solution. The idea is to allocate a new instance of the class, wrap the pointer to it in an external pointer (XPtr), and pass that external pointer as a function argument. Here is what I gathered from his post:
#include <RcppArmadillo.h>
using namespace Rcpp;
class Uniform {
public:
    Uniform(double min_, double max_) :
        min(min_), max(max_) {}
    NumericVector draw(int n) const {
        RNGScope scope;
        return runif(n, min, max);
    }
    double min, max;
};
// create external pointer to a Uniform object
// [[Rcpp::export]]
XPtr<Uniform> getUniform(double min, double max) {
    // create pointer to a Uniform object and
    // wrap it as an external pointer
    Rcpp::XPtr<Uniform> ptr(new Uniform( min, max ), true);
    // return the external pointer to the R side
    return ptr;
}
/// CAN RUN IT NOW:
// [[Rcpp::export]]
double test(double z, XPtr<Uniform> xp) {
    double k = z + xp ->max;
    return k;
}

Couldn't template<typename> deduce pointer type?

I have the following program; it compiles and runs without problems:
#include <thread>
#include <future>
#include <iostream>
#include <iterator>   // std::begin, std::end
#include <numeric>    // std::accumulate
void f(int* first,
       int* last,
       std::promise<int> accumulate_promise)
{
    int sum = std::accumulate(first, last, 0);
    accumulate_promise.set_value(sum); // Notify future
}
int main()
{
    int numbers[] = { 1, 2, 3, 4, 5, 6 };
    std::promise<int> accumulate_promise;
    std::future<int> accumulate_future = accumulate_promise.get_future();
    std::thread work_thread(f, std::begin(numbers), std::end(numbers),
                            std::move(accumulate_promise));
    accumulate_future.wait(); // wait for result
    std::cout << "result=" << accumulate_future.get() << '\n';
    work_thread.join(); // wait for thread completion
}
But if I change "f" into a template:
template<typename Iterator>
void f(Iterator first,
       Iterator last,
       std::promise<int> accumulate_promise)
{
    int sum = std::accumulate(first, last, 0);
    accumulate_promise.set_value(sum); // Notify future
}
Then compilation fails; gcc reports that the std::thread constructor cannot find a matching overload:
error: no matching function for call to 'std::thread::thread(<unresolved overloaded function type>, int*, int*, std::remove_reference<std::promise<int>&>::type)'
What does the message indicate? Is anything wrong with my template?
How can I fix it?
Thanks.

f is a template.
std::thread work_thread(f, begin(numbers), end(numbers),
                        std::move(accumulate_promise));
To put it in loose terms, std::thread's first parameter is either a function pointer or something that acts like a function pointer. It doesn't take a template as the first parameter.
A template becomes a class, or a function, when it is instantiated. The template gets instantiated when it gets used. So, given this template definition, using it like this:
f(something.begin(), something.end(), some_kind_of_a_promise);
instantiates the template and calls it. To name a specific instantiation without calling it, spell out the template argument:
f<int *>
This names an instantiated function, which can be passed to std::thread. The following works:
std::thread work_thread(f<int *>, std::begin(numbers),
                        std::end(numbers),
                        std::move(accumulate_promise));
Tested with gcc 5.3.1
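An alternative to writing f<int *> by hand is to wrap the call in a lambda. A lambda is an ordinary callable object, so std::thread accepts it, and Iterator is deduced inside the lambda body where f is actually called. The following is a minimal sketch of that idea; sum_in_thread is a hypothetical helper name, not part of the original question.

```cpp
#include <cassert>
#include <future>
#include <iterator>
#include <numeric>
#include <thread>
#include <utility>

template <typename Iterator>
void f(Iterator first, Iterator last, std::promise<int> accumulate_promise)
{
    int sum = std::accumulate(first, last, 0);
    accumulate_promise.set_value(sum); // Notify future
}

// Hypothetical helper: runs f on a worker thread and returns the sum.
template <typename Iterator>
int sum_in_thread(Iterator first, Iterator last)
{
    std::promise<int> p;
    std::future<int> result = p.get_future();
    assert(result.valid());

    // The lambda is a concrete object, so there is no overload set for
    // std::thread to resolve; f's template argument is deduced in the body.
    std::thread worker(
        [first, last](std::promise<int> pr) { f(first, last, std::move(pr)); },
        std::move(p));

    int sum = result.get(); // blocks until the worker sets the value
    worker.join();
    return sum;
}
```

This keeps the call site generic: the same helper works for int*, std::vector<int>::iterator, or any other iterator type over int-convertible values.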

Returning multiple values from a function with std::future

According to this Q&A, std::future works if a function returns a value, but you can't pass references and get multiple values. So a function like this will give no results with std::future:
void doSomething(int &a, int &b) { a = 1; b = 2; }
My idea was to create a structure and have the function return the structure:
#include <cstdio>   // printf
#include <iostream>
#include <future>
using namespace std;
struct myData
{
    int a;
    int b;
};
myData doSomething()
{
    myData d;
    d.a = 1;
    d.b = 2;
    return d;
}
int main()
{
    future<myData> t1 = async(launch::deferred, doSomething);
    printf("A=%d, B=%d\n", t1.get().a, t1.get().b);
    return 0;
}
So, how can I get two or more values from a std::future? Is there a better method than this?

"but you can't pass references and get multiple values."
Not true. As explained in the answers to the linked question, you can pass references; you just need to wrap them in std::ref so that they are passed as references rather than copied. So to call void doSomething(int &a, int &b) you would use:
int a;
int b;
auto fut = std::async(std::launch::deferred, doSomething, std::ref(a), std::ref(b));
fut.get(); // wait for future to be ready
std::printf("A=%d, B=%d\n", a, b);
But that function doesn't return multiple values; it uses out parameters to set multiple variables. For a function to return multiple values you do need to return some composite type such as a struct, but that has nothing to do with std::future; that's how C++ works. Functions have a single return type.
Your solution returning a struct is the idiomatic way, although your code is broken because it calls t1.get() twice, and the result can only be retrieved from a std::future once. After the first get() the future is no longer valid, and a second call is undefined behavior (implementations typically throw std::future_error). To access the result twice, either move the result into a new variable:
auto result = t1.get();
or convert the future to a std::shared_future which allows the result to be accessed multiple times:
auto t2 = t1.share();
But you don't need to use a custom structure to return multiple values, you can just use a pair or tuple:
#include <cstdio>
#include <future>
#include <tuple>
std::tuple<int, int> doSomething()
{
    return std::make_tuple(1, 2);
}
int main()
{
    auto fut = std::async(std::launch::deferred, doSomething);
    auto result = fut.get();
    std::printf("A=%d, B=%d\n", std::get<0>(result), std::get<1>(result));
}
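The std::shared_future route mentioned above can be sketched in the same way. This is a minimal example, assuming the same tuple-returning doSomething; sum_via_shared_future is a hypothetical helper name used only for illustration.

```cpp
#include <cassert>
#include <future>
#include <tuple>

std::tuple<int, int> doSomething()
{
    return std::make_tuple(1, 2);
}

// Hypothetical helper: reads the same result twice through a shared_future.
int sum_via_shared_future()
{
    // share() transfers the shared state from the future to a shared_future.
    std::shared_future<std::tuple<int, int>> fut =
        std::async(std::launch::deferred, doSomething).share();

    // Unlike std::future::get(), shared_future::get() returns a const
    // reference to the stored value and may be called repeatedly.
    int a = std::get<0>(fut.get());
    int b = std::get<1>(fut.get());
    assert(fut.valid()); // still valid after get()
    return a + b;
}
```

A shared_future is also copyable, so several threads can each hold a copy and wait on the same result.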

The error you get has nothing to do with your implementation; the linker simply doesn't link with the pthread library by default.
Add the -pthread flag to both the compiler and the linker (if you're using GCC or Clang) and it should work.
Alternatively, add the pthread library as a linker library with the -l linker flag (-lpthread).

can't pass std::vector<std::unique_ptr<>> to std::thread

I created a threadpool which captures a function and arguments into tuples and then perfect forwards when the task is dequeued.
However I am unable to pass a vector of unique_ptr's to the thread by rvalue. A simplified project is below:
#include <future>
#include <memory>
#include <vector>
template <typename F, typename... Args>
typename std::result_of<F(Args...)>::type pushTask(F&& f, Args&&... args)
{
    using result_type = typename std::result_of<F(Args...)>::type;
    // create a functional object of the passed function with the signature std::function<result_type(void)> by creating a
    // bound Functor lambda which will bind the arguments to the function call through perfect forwarding and lambda capture
    auto boundFunctor = [func = std::move(std::forward<F>(f)),
                         argsTuple = std::move(std::make_tuple(std::forward<Args>(args)...))](void) mutable->result_type
    {
        // forward function and turn variadic arguments into a tuple
        return result_type();
    };
    // create a packaged task of the function object
    std::packaged_task<result_type(void)> taskFunctor{ std::move(boundFunctor) };
}
int main(int argc, char *argv [])
{
    auto testvup = [](std::vector<std::unique_ptr<int>>&& vup)
    {
    };
    std::vector<std::unique_ptr<int>> vup;
    pushTask(testvup, std::move(vup));
}
I get the following compiler error with VS2015, whether I use std::function or std::packaged_task:
error C2280: 'std::unique_ptr<int,std::default_delete<_Ty>>::unique_ptr(const std::unique_ptr<_Ty,std::default_delete<_Ty>> &)': attempting to reference a deleted function (Project: Stack, File: xmemory0, Line: 659)
Passing other arguments by rvalue, including std::vector, works.
Has anyone else run across this, or does anyone have suggestions?

C++ Standard section §20.9.11.2.1 [func.wrap.func]
template<class F> function(F f);
template <class F, class A> function(allocator_arg_t, const A& a, F f);
Requires: F shall be CopyConstructible. f shall be Callable for argument types ArgTypes and return type R. The copy constructor and destructor of A shall not throw exceptions.
Your lambda boundFunctor is a move-only type, because it captures a move-only object (std::unique_ptr cannot be copied, so neither can a vector of them).
Hence, boundFunctor is not copyable and is not suitable as an argument to std::function.
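One common workaround, which is not part of the answer above, is to move the move-only state into a std::shared_ptr and capture that instead. Copying the lambda then copies only the shared_ptr, so the CopyConstructible requirement is met. The sketch below assumes the task just needs to read the vector; makeTask and IntPtrVec are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

using IntPtrVec = std::vector<std::unique_ptr<int>>;

// Hypothetical helper: wraps a move-only argument so that the resulting
// callable is copyable and can be stored in a std::function.
std::function<int()> makeTask(IntPtrVec vup)
{
    // The vector is moved exactly once, into shared storage.
    auto shared = std::make_shared<IntPtrVec>(std::move(vup));

    // Copying this lambda copies the shared_ptr, never the unique_ptrs.
    return [shared]() -> int {
        int sum = 0;
        for (const auto& p : *shared) {
            assert(p != nullptr);
            sum += *p;
        }
        return sum;
    };
}
```

The trade-off is that all copies of the task share one vector, so a task that consumes or mutates its arguments should be invoked only once, or the shared state should be protected.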

Thread safety of a member function pointer

A class t_analyser contains a function which performs some operations on a t_data object and outputs the results.
I would like to add a filter capability to t_analyser: a small and fast function
bool filter(const t_data & d) {...}
which allows the analysis (and the output) to be skipped if some conditions are met for that particular data. The filter should be easy to set up from main, so I was thinking of storing a shared function pointer in t_analyser and using a lambda to initialize it.
Is this a good approach? My concern is that many analysers may call the same filter function simultaneously in different threads. Could this be a problem? Can I simply copy the pointer in t_analyser's copy constructor? Any hint would be much appreciated.

This could be a problem if your filter function had side effects. Its signature is simple and says that it just makes a decision by reading data from t_data, so make sure that the t_data object is not modified by another thread at the same time and you'll be fine.
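As a rough illustration of that point, here is a minimal sketch in which each thread holds its own copy of an analyser and all of them call the same stateless filter on read-only data. The layout of t_data, the use of std::function, and the names analyse and count_analysed are assumptions made for the example, not the asker's actual code.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

struct t_data { int value; };  // assumed layout, for illustration only

class t_analyser {
public:
    using filter_t = std::function<bool(const t_data&)>;

    explicit t_analyser(filter_t filter) : filter_(std::move(filter)) {
        assert(static_cast<bool>(filter_)); // an empty std::function throws when called
    }

    // Returns true if the data was analysed, false if the filter skipped it.
    bool analyse(const t_data& d) const {
        if (filter_(d)) return false;
        // ... analysis and output would go here ...
        return true;
    }

private:
    filter_t filter_;  // copied along with the analyser
};

// Hypothetical driver: several threads, each with its own analyser copy.
int count_analysed(const std::vector<t_data>& data, int n_threads) {
    // The filter has no side effects: it only reads its argument.
    t_analyser prototype([](const t_data& d) { return d.value < 0; });
    std::atomic<int> analysed{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t) {
        threads.emplace_back([&data, &analysed, prototype] {
            for (const auto& d : data)
                if (prototype.analyse(d)) ++analysed;
        });
    }
    for (auto& th : threads) th.join();
    return analysed.load();
}
```

The only shared mutable state is the atomic counter; the filter itself needs no synchronization because it only reads data that no thread modifies.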

Consider the following program:
#include <iostream>
struct X
{
    void foo1(){ std::cout << "foo1" << std::endl; }
    void foo2(){ std::cout << "foo2" << std::endl; }
};
typedef void (X::*MemberFunctionPointer)();
struct ByRef
{
    ByRef( MemberFunctionPointer& f )
        : f_( f )
    {
    }
    void operator()()
    {
        X x;
        (x.*f_)();
    }
    MemberFunctionPointer& f_;
};
struct ByValue
{
    ByValue( MemberFunctionPointer f )
        : f_( f )
    {
    }
    void operator()()
    {
        X x;
        (x.*f_)();
    }
    MemberFunctionPointer f_;
};
int main()
{
    MemberFunctionPointer p = &X::foo1;
    ByRef byRef( p );
    ByValue byValue( p );
    byRef();
    byValue();
    p = &X::foo2;
    byRef();
    byValue();
    return 0;
}
Output:
foo1
foo1
foo2
foo1
From this you will notice that in one case the member function pointer is passed by value (and not shared), and in the other it is passed by reference (and shared). When using the syntax:
foo( void( X::*f)() )...
the pointer to member function is passed by value: the callee receives its own copy, which later changes to the caller's pointer cannot affect.

You can declare the function pointer as static and thread-specific (this is MSVC-specific syntax):
static __declspec(thread) FUNC_TYPE filterFunc;
Each thread that modifies filterFunc then works on its own copy of the pointer.
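Since C++11 the same effect is available portably through the thread_local keyword. The following is a minimal sketch, assuming the filter is a plain function pointer; filter_fn, accept_all, skip_negative and main_thread_filter_unchanged are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <thread>

struct t_data { int value; };  // assumed layout, for illustration only
using filter_fn = bool (*)(const t_data&);

bool accept_all(const t_data&) { return false; }           // never skip
bool skip_negative(const t_data& d) { return d.value < 0; } // skip negatives

// Each thread gets its own copy, initialised to accept_all.
static thread_local filter_fn filterFunc = accept_all;

// Hypothetical check: a worker thread changes its own filterFunc and the
// calling thread's copy is left untouched.
bool main_thread_filter_unchanged() {
    bool worker_saw_own_change = false;
    std::thread worker([&worker_saw_own_change] {
        filterFunc = skip_negative;  // modifies this thread's copy only
        worker_saw_own_change = filterFunc(t_data{-1});
    });
    worker.join();
    assert(worker_saw_own_change);
    return filterFunc == accept_all;
}
```

Note that every new thread starts from the initial value, so a filter chosen in main has to be set again inside each worker thread.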
