I am learning to use RcppArmadillo(Rcpp) to make the code run faster.
In practice, I usually encounter that I want to call some functions from given packages.
The below is a little example.
I want to calculate the thresholds of lasso (or scad, mcp). In R we can use
the thresh_est function in library(HSDiC), which reads
library(HSDiC) # acquire the thresh_est() function
# usage: thresh_est(z, lambda, tau, a = 3, penalty = c("MCP", "SCAD", "lasso"))
# help: ?thresh_est
z = seq(-5,5,length=500)
thresh = thresh_est(z,lambda=1,tau=1,penalty="lasso")
# thresh = thresh_est(z,lambda=1,tau=1,a=3,penalty="MCP")
# thresh = thresh_est(z,lambda=1,tau=1,a=3.7,penalty="SCAD")
plot(z,thresh)
Then I try to realize the above via RcppArmadillo(Rcpp).
According to the answers of
teuder as well as
coatless
and coatless,
I write the code (which is saved as thresh_arma.cpp) below:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp; //with this, Rcpp::Function can be Function for short, etc.
using namespace arma; // with this, arma::vec can be vec for short, etc.
// using namespace std; //std::string
// [[Rcpp::export]]
vec thresh_arma(vec z, double lambda, double a=3, double tau=1, std::string penalty) {
// Obtaining namespace of the package
Environment pkg = namespace_env("HSDiC");
// Environment HSDiC("package:HSDiC");
// Picking up the function from the package
Function f = pkg["thresh_est"];
// Function f = HSDiC["thresh_est"];
vec thresh;
// In R: thresh_est(z, lambda, tau, a = 3, penalty = c("MCP", "SCAD", "lasso"))
if (penalty == "lasso") {
thresh = f(_["z"]=z,_["lambda"]=lambda,_["a"]=a,_["tau"]=tau,_["penalty"]="lasso");
} else if (penalty == "scad") {
thresh = f(_["z"]=z,_["lambda"]=lambda,_["a"]=a,_["tau"]=tau,_["penalty"]="SCAD");
} else if (penalty == "mcp") {
thresh = f(_["z"]=z,_["lambda"]=lambda,_["a"]=a,_["tau"]=tau,_["penalty"]="MCP");
}
return thresh;
}
Then I do the compilation as
library(Rcpp)
library(RcppArmadillo)
sourceCpp("thresh_arma.cpp")
# z = seq(-5,5,length=500)
# thresh = thresh_arma(z,lambda=1,tau=1,penalty="lasso")
# # thresh = thresh_arma(z,lambda=1,a=3,tau=1,penalty="mcp")
# # thresh = thresh_arma(z,lambda=1,a=3.7,tau=1,penalty="scad")
# plot(z,thresh)
However, the compilation fails and I have no idea about the reasons.
Related
I want check the BIC usage in statistics.
My little example, which is saved as check_bic.cpp, is presented as follows:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;
// [[Rcpp::export]]
List check_bic(const int N = 10, const int p = 20, const double seed=0){
arma_rng::set_seed(seed); // for reproducibility
arma::mat Beta = randu(p,N); //randu/randn:random values(uniform and normal distributions)
arma::vec Bic = randu(N);
uvec ii = find(Bic == min(Bic)); // may be just one or several elements
int id = ii(ii.n_elem); // fetch the last one element
vec behat = Beta.col(id); // fetch the id column of matrix Beta
List ret;
ret["Bic"] = Bic;
ret["ii"] = ii;
ret["id"] = id;
ret["Beta"] = Beta;
ret["behat"] = behat;
return ret;
}
Then I compile check_bic.cpp in R by
library(Rcpp)
library(RcppArmadillo);
sourceCpp("check_bic.cpp")
and the compilation can pass successfully.
However, when I ran
check_bic(10,20,0)
in R, it shows errors as
error: Mat::operator(): index out of bounds
Error in check_bic(10, 20, 0) : Mat::operator(): index out of bounds
I check the .cpp code line by line, and guess the problems probably
happen at
uvec ii = find(Bic == min(Bic)); // may be just one or several elements
int id = ii(ii.n_elem); // fetch the last one element
since if uvec ii only has one element, then ii.n_elem may be NaN or something
else in Rcpp (while it's ok in Matlab), while I dont konw how to
deal with case. Any help?
In this simple example I would like to subset a matrix by row and pass it to another cpp function; the example demonstrates this works by passing an input array to the other function first.
#include "RcppArrayFire.h"
using namespace Rcpp;
af::array theta_check_cpp( af::array theta){
if(*theta(1).host<double>() >= 1){
theta(1) = 0;
}
return theta;
}
// [[Rcpp::export]]
af::array theta_check(RcppArrayFire::typed_array<f64> theta){
const int theta_size = theta.dims()[0];
af::array X(2, theta_size);
X(0, af::seq(theta_size)) = theta_check_cpp( theta );
X(1, af::seq(theta_size)) = theta;
// return X;
Rcpp::Rcout << " works till here";
return theta_check_cpp( X.row(1) );
}
/*** R
theta <- c( 2, 2, 2)
theta_check(theta)
*/
The constructor you are using to create X has an argument ty for the data type, which defaults to f32. Therefore X uses 32 bit floats and you cannot extract a 64 bit host pointer from that. Either use
af::array X(2, theta_size, f64);
to create an array using 64 bit doubles, or extract a 32 bit host pointer via
if(*theta(1).host<float>() >= 1){
...
I am experiencing trouble getting this example to run correctly. Currently it produces the same random sample for every iteration and seed input, despite the seed changing as shown by af::getSeed().
#include "RcppArrayFire.h"
#include <random>
using namespace Rcpp;
using namespace RcppArrayFire;
// [[Rcpp::export]]
af::array random_test(RcppArrayFire::typed_array<f64> theta, int counts, int seed){
const int theta_size = theta.dims()[0];
af::array out(counts, theta_size, f64);
for(int f = 0; f < counts; f++){
af::randomEngine engine;
af_set_seed(seed + f);
//Rcpp::Rcout << af::getSeed();
af::array out_temp = af::randu(theta_size, u8, engine);
out(f, af::span) = out_temp;
// out(f, af::span) = theta(out_temp);
}
return out;
}
/*** R
theta <- 1:10
random_test(theta, 5, 1)
random_test(theta, 5, 2)
*/
The immediate problem is that you are creating a random engine within each iteration of the loop but set the seed of the global random engine. Either you set the seed of the local engine via engine.setSeed(seed), or you get rid of the local engine all together, letting af::randu default to using the global engine.
However, it would still be "unusual" to change the seed during each step of the loop. Normally one sets the seed only once, e.g.:
// [[Rcpp::depends(RcppArrayFire)]]
#include "RcppArrayFire.h"
// [[Rcpp::export]]
af::array random_test(RcppArrayFire::typed_array<f64> theta, int counts, int seed){
const int theta_size = theta.dims()[0];
af::array out(counts, theta_size, f64);
af::setSeed(seed);
for(int f = 0; f < counts; f++){
af::array out_temp = af::randu(theta_size, u8);
out(f, af::span) = out_temp;
}
return out;
}
BTW, it makes sense to parallelize this as long as your device has enough memory. For example, you could generate a random counts x theta_size matrix in one go using af::randu(counts, theta_size, u8).
I have been experimenting with the RcppArrayFire Package, mostly rewriting some cost functions from RcppArmadillo and can't seem to get over "no viable conversion from 'af::array' to 'float'. I have also been getting some backend errors, the example below seems free of these.
This cov-var example is written poorly just to use all relevant coding pieces from my actual cost function. As of now it is the only addition in a package generated by, "RcppArrayFire.package.skeleton".
#include "RcppArrayFire.h"
#include <Rcpp.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
float example_ols(const RcppArrayFire::typed_array<f32>& X_vect, const RcppArrayFire::typed_array<f32>& Y_vect){
int Len = X_vect.dims()[0];
int Len_Y = Y_vect.dims()[0];
while( Len_Y < Len){
Len --;
}
float mean_X = af::sum(X_vect)/Len;
float mean_Y = af::sum(Y_vect)/Len;
RcppArrayFire::typed_array<f32> temp(Len);
RcppArrayFire::typed_array<f32> temp_x(Len);
for( int f = 0; f < Len; f++){
temp(f) = (X_vect(f) - mean_X)*(Y_vect(f) - mean_Y);
temp_x(f) = af::pow(X_vect(f) -mean_X, 2);
}
return af::sum(temp)/af::sum(temp_x);
}
/*** R
X <- 1:10
Y <- 2*X +rnorm(10, mean = 0, sd = 1)
example_ols(X, Y)
*/
The first thing to consider is the af::sum function, which comes in different forms: An sf::sum(af::array) that returns an af::array in device memory and a templated af::sum<T>(af::array) that returns a T in host memory. So the minimal change to your example would be using af::sum<float>:
#include "RcppArrayFire.h"
#include <Rcpp.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
float example_ols(const RcppArrayFire::typed_array<f32>& X_vect,
const RcppArrayFire::typed_array<f32>& Y_vect){
int Len = X_vect.dims()[0];
int Len_Y = Y_vect.dims()[0];
while( Len_Y < Len){
Len --;
}
float mean_X = af::sum<float>(X_vect)/Len;
float mean_Y = af::sum<float>(Y_vect)/Len;
RcppArrayFire::typed_array<f32> temp(Len);
RcppArrayFire::typed_array<f32> temp_x(Len);
for( int f = 0; f < Len; f++){
temp(f) = (X_vect(f) - mean_X)*(Y_vect(f) - mean_Y);
temp_x(f) = af::pow(X_vect(f) -mean_X, 2);
}
return af::sum<float>(temp)/af::sum<float>(temp_x);
}
/*** R
set.seed(1)
X <- 1:10
Y <- 2*X +rnorm(10, mean = 0, sd = 1)
example_ols(X, Y)
*/
However, there are more things one can improve. In no particular order:
You don't need to include Rcpp.h.
There is an af::mean function for computing the mean of an af::array.
In general RcppArrayFire::typed_array<T> is only needed for getting arrays from R into C++. Within C++ and for the way back you can use af::array.
Even when your device does not support double, you can still use double values on the host.
In order to get good performance, you should avoid for loops and use vectorized functions, just like in R. You have to impose equal dimensions for X and Y, though.
Interestingly I get a different result when I use vectorized functions. Right now I am not sure why this is the case, but the following form makes more sense to me. You should verify that the result is what you want to get:
#include <RcppArrayFire.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
double example_ols(const RcppArrayFire::typed_array<f32>& X_vect,
const RcppArrayFire::typed_array<f32>& Y_vect){
double mean_X = af::mean<double>(X_vect);
double mean_Y = af::mean<double>(Y_vect);
af::array temp = (X_vect - mean_X) * (Y_vect - mean_Y);
af::array temp_x = af::pow(X_vect - mean_X, 2.0);
return af::sum<double>(temp)/af::sum<double>(temp_x);
}
/*** R
set.seed(1)
X <- 1:10
Y <- 2*X +rnorm(10, mean = 0, sd = 1)
example_ols(X, Y)
*/
BTW, an even shorter version would be:
#include <RcppArrayFire.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
af::array example_ols(const RcppArrayFire::typed_array<f32>& X_vect,
const RcppArrayFire::typed_array<f32>& Y_vect){
return af::cov(X_vect, Y_vect) / af::var(X_vect);
}
Generally it is a good idea to use the in-build functions as much as possible.
Is there a base function in Rcpp that:
Fills entirely by a single value if size of a vector is 1.
Fills the other vector completely if same length.
Fills with an NA value if neither Vector are the same length nor a vector is of size 1.
I've written the above criteria as a function below using a NumericVector as an example. If there isn't a base function in Rcpp that performs said operations there should be a way to template the function so that given any type of vector (e.g. numeric, character and so on) the above logic would be able to be executed.
// [[Rcpp::export]]
NumericVector cppvectorize(NumericVector x,NumericVector y) {
NumericVector y_out(y.size());
if(x.size() == 1) {
for(int i = 0; i < y_out.size(); i++) {
y_out[i] = x[0];
}
} else if(x.size() == y_out.size()) {
for(int i = 0; i < y_out.size(); i++) {
y_out[i] = x[i];
}
} else {
for(int i = 0; i < y_out.size(); i++) {
y_out[i] = NA_REAL;
}
}
return y_out;
}
Unfortunately, the closest you will come to such a function is one of the rep variants that Rcpp supports. However, none of the variants match the desired output. Therefore, the only option is to really implement a templated version of your desired function.
To create the templated function, we will first create a routing function that handles the dispatch of SEXP objects. The rationale behind the routing function is SEXP objects are able to be retrieved from and surfaced into R using Rcpp Attributes whereas a templated version is not. As a result, we need to specify the SEXTYPE (used as RTYPE) dispatches that are possible. The TYPEOF() macro retrieves the coded number. Using a switch statement, we can dispatch this number into the appropriate cases.
After dispatching, we arrive at the templated function. The templated function makes use of the base Vector class of Rcpp to simplify the data flow. From here, the notable novelty will be the use of ::traits::get_na<RTYPE>() to dynamically retrieve the appropriate NA value and fill it.
With the plan in place, let's look at the code:
#include <Rcpp.h>
using namespace Rcpp;
// ---- Templated Function
template <int RTYPE>
Vector<RTYPE> vec_helper(const Vector<RTYPE>& x, const Vector<RTYPE>& y) {
Vector<RTYPE> y_out(y.size());
if(x.size() == 1){
y_out.fill(x[0]);
} else if (x.size() == y.size()) {
y_out = x;
} else {
y_out.fill(::traits::get_na<RTYPE>());
}
return y_out;
}
// ---- Dispatch function
// [[Rcpp::export]]
SEXP cppvectorize(SEXP x, SEXP y) {
switch (TYPEOF(x)) {
case INTSXP: return vec_helper<INTSXP>(x, y);
case REALSXP: return vec_helper<REALSXP>(x, y);
case STRSXP: return vec_helper<STRSXP>(x, y);
default: Rcpp::stop("SEXP Type Not Supported.");
}
// Need to return a value even though this will never be triggered
// to quiet the compiler.
return R_NilValue;
}
Sample Tests
Here we conduct a few sample tests on each of the supported data
# Case 1: x == 1
x = 1:5
y = 2
cppvectorize(x, y)
## [1] NA
# Case 2: x == y
x = letters[1:5]
y = letters[6:10]
cppvectorize(x, y)
## [1] "a" "b" "c" "d" "e"
# Case 3: x != y && x > 1
x = 1.5
y = 2.5:6.5
cppvectorize(x, y)
## [1] 1.5 1.5 1.5 1.5 1.5