Compute density using Rcpp sugar function [duplicate]

Compute density using Rcpp sugar function [duplicate] - rcpp

From R, I'm trying to run sourceCpp on this file:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace arma;
using namespace Rcpp;
// [[Rcpp::export]]
vec dnormLog(vec x, vec means, vec sds) {
int n = x.size();
vec res(n);
for(int i = 0; i < n; i++) {
res[i] = log(dnorm(x[i], means[i], sds[i]));
}
return res;
}
See this answer to see where I got the function from. This throws the error:
no matching function for call to 'dnorm4'
Which is the exact error I was hoping to prevent by using the loop, since the referenced answer mentions that dnorm is only vectorized with respect to its first argument. I fear the answer is obvious, but I've tried adding R:: before the dnorm, tried using NumericVector instead of vec, without using log() in front. No luck. However, adding R:: before dnorm does produce a separate error:
too few arguments to function call, expected 4, have 3; did you mean '::dnorm4'?
Which is not fixed by replacing dnorm above with R::dnorm4.

There are two nice teachable moments here:
Pay attention to namespaces. If in doubt, don't go global.
Check the headers for the actual definitions. You missed the fourth argument in the scalar version R::dnorm().
Here is a repaired version, included is a second variant you may find interesting:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
arma::vec dnormLog(arma::vec x, arma::vec means, arma::vec sds) {
int n = x.size();
arma::vec res(n);
for(int i = 0; i < n; i++) {
res[i] = std::log(R::dnorm(x[i], means[i], sds[i], FALSE));
}
return res;
}
// [[Rcpp::export]]
arma::vec dnormLog2(arma::vec x, arma::vec means, arma::vec sds) {
int n = x.size();
arma::vec res(n);
for(int i = 0; i < n; i++) {
res[i] = R::dnorm(x[i], means[i], sds[i], TRUE);
}
return res;
}
/*** R
dnormLog( c(0.1,0.2,0.3), rep(0.0, 3), rep(1.0, 3))
dnormLog2(c(0.1,0.2,0.3), rep(0.0, 3), rep(1.0, 3))
*/
When we source this, both return the same result because the R API allows us to ask for logarithms to be taken.
R> sourceCpp("/tmp/dnorm.cpp")
R> dnormLog( c(0.1,0.2,0.3), rep(0.0, 3), rep(1.0, 3))
[,1]
[1,] -0.923939
[2,] -0.938939
[3,] -0.963939
R> dnormLog2(c(0.1,0.2,0.3), rep(0.0, 3), rep(1.0, 3))
[,1]
[1,] -0.923939
[2,] -0.938939
[3,] -0.963939
R>

Related

Conversion of Rcpp::NumericMatrix to Eigen::MatrixXd

I am facing a very similar issue to these questions:
convert Rcpp::NumericVector to Eigen::VectorXd
Converting between NumericVector/Matrix and VectorXd/MatrixXd in Rcpp(Eigen) to perform Cholesky solve
I am writing an R-package, that uses the RcppEigen library for Matrix arithmetic. My problem is that the project is not compiling because of an error in the conversion of the Rcpp::NumericMatrix input to an Eigen::MatrixXd.
The file looks like this:
#include <map>
#include <Rcpp.h>
...
using namespace Eigen;
...
// [[Rcpp::export]]
Rcpp::List my_function(Rcpp::NumericMatrix input_matrix)
{
...
Map<MatrixXd> GATE_matrixx(Rcpp::as<Map<MatrixXd> >(GATE_matrix));
...
}
This gives me the following error:
Myfile.cpp:40:65: required from here
C:/Users/User/AppData/Local/Programs/R/R-4.2.2/library/Rcpp/include/Rcpp/internal/Exporter.h:31:31:error:
matching function for call to 'Eigen::Map<Eigen::Matrix<double, -1,
-1> >::Map(SEXPREC*&) 31 | Exporter( SEXP x ) : t(x) | ^ In file included from
C:/Users/User/AppData/Local/Programs/R/R-4.2.2/library/RcppEigen/include/Eigen/Core:19
from
C:/Users/User/AppData/Local/Programs/R/R-4.2.2/library/RcppEigen/include/Eigen/SparseCore:11
from
C:/Users/User/AppData/Local/Programs/R/R-4.2.2/library/RcppEigen/include/Eigen/Sparse:26
from Myfile.cpp:8
I have also tried to change the line to:
MatrixXd input_matrix_eigen(Rcpp::as\<MatrixXd\>(input_matrix));
This gives me the equivalent error :
Myfile.cpp:40:54: required from here
C:/Users/User/AppData/Local/Programs/R/R
4.2.2/library/RcppEigen/include/Eigen/src/Core/Matrix.h:332:31:error: matching function for call to 'Eigen::Matrix<double, -1,
-1>::_init1<SEXPREC*>(SEXPREC* const&)
Do you have any ideas?
If more information is required to evaluate the issue, just let me know.

If you use the command
RcppEigen::RcppEigen.package.skeleton("demoPackage")
an example package demoPackage is created for you which you can install the usual way. It contains example functions to create amd return a matrix:
// [[Rcpp::export]]
Eigen::MatrixXd rcppeigen_hello_world() {
Eigen::MatrixXd m1 = Eigen::MatrixXd::Identity(3, 3);
// Eigen::MatrixXd m2 = Eigen::MatrixXd::Random(3, 3);
// Do not use Random() here to not promote use of a non-R RNG
Eigen::MatrixXd m2 = Eigen::MatrixXd::Zero(3, 3);
for (auto i=0; i<m2.rows(); i++)
for (auto j=0; j<m2.cols(); j++)
m2(i,j) = R::rnorm(0, 1);
return m1 + 3 * (m1 + m2);
}
If you set the same seed as I do you should get the same matrix
> set.seed(42)
> demoPackage::rcppeigen_hello_world()
[,1] [,2] [,3]
[1,] 8.11288 -1.694095 1.089385
[2,] 1.89859 5.212805 -0.318374
[3,] 4.53457 -0.283977 10.055271
>

Ignore missing values while finding min in Rcpp

I am trying to find the minimum of 2 values from 2 vectors in Rcpp. But the following does not compile:
#include <cmath>
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector timesTwo(int time_length, double BXadd,
NumericVector vn_complete, NumericVector vn1_complete) {
// Empty vectors
NumericVector BX (time_length);
for(int t = 0; t < time_length; t++) {
BX[t] = BXadd * sqrt(std::min(na_omit(vn_complete[t], vn1_complete[t])));
}
return BX;
// return vn_complete[0];
}
Error 1 occurred building shared library.
It works if I don't use na_omit.
R code for running the function:
Rcpp::sourceCpp("test.cpp")
timesTwo(5, 2, 5:9, 1:5)

What follows below is a slightly non-sensical answer (as it only works with vectors not containing NA) but it has all the component your code has, and it compiles.
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector foo(int time_length, double BXadd,
NumericVector vn_complete, NumericVector vn1_complete) {
// Empty vectors
NumericVector BX (time_length);
vn_complete = na_omit(vn_complete);
vn1_complete = na_omit(vn1_complete);
for(int t = 0; t < time_length; t++) {
double a = vn_complete[t];
double b = vn1_complete[t];
BX[t] = BXadd * std::sqrt(std::min(a,b));
}
return BX;
}
// Edited version with new function added
// [[Rcpp::export]]
NumericVector foo2(double BXadd, NumericVector vn_complete,
NumericVector vn1_complete) {
return BXadd * sqrt(pmin(vn_complete, vn1_complete));
}
/*** R
foo(5, 2, 5:9, 1:5)
foo2(5, 2, 5:9, 1:5)
*/
For a real solution you will have to think harder about what the removal of NA is supposed to do as it will alter the lenth of your vectors too. So this still needs work.
Lastly, your whole function can be written in R as
2 * sqrt(pmin(5:9, 1:5))
and I think you could write that same expression using Rcpp sugar too as we have pmin() and sqrt():
// [[Rcpp::export]]
NumericVector foo2(double BXadd, NumericVector vn_complete,
NumericVector vn1_complete) {
return BXadd * sqrt(pmin(vn_complete, vn1_complete));
}

af::array conversion to float or double

I have been experimenting with the RcppArrayFire Package, mostly rewriting some cost functions from RcppArmadillo and can't seem to get over "no viable conversion from 'af::array' to 'float'. I have also been getting some backend errors, the example below seems free of these.
This cov-var example is written poorly just to use all relevant coding pieces from my actual cost function. As of now it is the only addition in a package generated by, "RcppArrayFire.package.skeleton".
#include "RcppArrayFire.h"
#include <Rcpp.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
float example_ols(const RcppArrayFire::typed_array<f32>& X_vect, const RcppArrayFire::typed_array<f32>& Y_vect){
int Len = X_vect.dims()[0];
int Len_Y = Y_vect.dims()[0];
while( Len_Y < Len){
Len --;
}
float mean_X = af::sum(X_vect)/Len;
float mean_Y = af::sum(Y_vect)/Len;
RcppArrayFire::typed_array<f32> temp(Len);
RcppArrayFire::typed_array<f32> temp_x(Len);
for( int f = 0; f < Len; f++){
temp(f) = (X_vect(f) - mean_X)*(Y_vect(f) - mean_Y);
temp_x(f) = af::pow(X_vect(f) -mean_X, 2);
}
return af::sum(temp)/af::sum(temp_x);
}
/*** R
X <- 1:10
Y <- 2*X +rnorm(10, mean = 0, sd = 1)
example_ols(X, Y)
*/

The first thing to consider is the af::sum function, which comes in different forms: An sf::sum(af::array) that returns an af::array in device memory and a templated af::sum<T>(af::array) that returns a T in host memory. So the minimal change to your example would be using af::sum<float>:
#include "RcppArrayFire.h"
#include <Rcpp.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
float example_ols(const RcppArrayFire::typed_array<f32>& X_vect,
const RcppArrayFire::typed_array<f32>& Y_vect){
int Len = X_vect.dims()[0];
int Len_Y = Y_vect.dims()[0];
while( Len_Y < Len){
Len --;
}
float mean_X = af::sum<float>(X_vect)/Len;
float mean_Y = af::sum<float>(Y_vect)/Len;
RcppArrayFire::typed_array<f32> temp(Len);
RcppArrayFire::typed_array<f32> temp_x(Len);
for( int f = 0; f < Len; f++){
temp(f) = (X_vect(f) - mean_X)*(Y_vect(f) - mean_Y);
temp_x(f) = af::pow(X_vect(f) -mean_X, 2);
}
return af::sum<float>(temp)/af::sum<float>(temp_x);
}
/*** R
set.seed(1)
X <- 1:10
Y <- 2*X +rnorm(10, mean = 0, sd = 1)
example_ols(X, Y)
*/
However, there are more things one can improve. In no particular order:
You don't need to include Rcpp.h.
There is an af::mean function for computing the mean of an af::array.
In general RcppArrayFire::typed_array<T> is only needed for getting arrays from R into C++. Within C++ and for the way back you can use af::array.
Even when your device does not support double, you can still use double values on the host.
In order to get good performance, you should avoid for loops and use vectorized functions, just like in R. You have to impose equal dimensions for X and Y, though.
Interestingly I get a different result when I use vectorized functions. Right now I am not sure why this is the case, but the following form makes more sense to me. You should verify that the result is what you want to get:
#include <RcppArrayFire.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
double example_ols(const RcppArrayFire::typed_array<f32>& X_vect,
const RcppArrayFire::typed_array<f32>& Y_vect){
double mean_X = af::mean<double>(X_vect);
double mean_Y = af::mean<double>(Y_vect);
af::array temp = (X_vect - mean_X) * (Y_vect - mean_Y);
af::array temp_x = af::pow(X_vect - mean_X, 2.0);
return af::sum<double>(temp)/af::sum<double>(temp_x);
}
/*** R
set.seed(1)
X <- 1:10
Y <- 2*X +rnorm(10, mean = 0, sd = 1)
example_ols(X, Y)
*/
BTW, an even shorter version would be:
#include <RcppArrayFire.h>
// [[Rcpp::depends(RcppArrayFire)]]
// [[Rcpp::export]]
af::array example_ols(const RcppArrayFire::typed_array<f32>& X_vect,
const RcppArrayFire::typed_array<f32>& Y_vect){
return af::cov(X_vect, Y_vect) / af::var(X_vect);
}
Generally it is a good idea to use the in-build functions as much as possible.

Rcpp sugar unique of List

I have a list of Numeric Vector and I need a List of unique elements. I tried Rcpp:unique fonction. It works very well when apply to a Numeric Vector but not to List. This is the code and the error I got.
List h(List x){
return Rcpp::unique(x);
}
Error in dyn.load("/tmp/RtmpDdKvcH/sourceCpp-x86_64-pc-linux-gnu-1.0.0/sourcecpp_272635d5289/sourceCpp_10.so") :
unable to load shared object '/tmp/RtmpDdKvcH/sourceCpp-x86_64-pc-linux-gnu-1.0.0/sourcecpp_272635d5289/sourceCpp_10.so':
/tmp/RtmpDdKvcH/sourceCpp-x86_64-pc-linux-gnu-1.0.0/sourcecpp_272635d5289/sourceCpp_10.so: undefined symbol: _ZNK4Rcpp5sugar9IndexHashILi19EE8get_addrEP7SEXPREC

It is unclear what you are doing wrong, and it is an incomplete / irreproducible question.
But there is a unit test that does just what you do, and we can do it by hand too:
R> Rcpp::cppFunction("NumericVector uq(NumericVector x) { return Rcpp::unique(x); }")
R> uq(c(1.1, 2.2, 2.2, 3.3, 27))
[1] 27.0 1.1 3.3 2.2
R>

Even if there isn't a matching Rcpp sugar function, you can call R functions from within C++. Example:
#include <Rcpp.h>
using namespace Rcpp;
Rcpp::Environment base("package:base");
Function do_unique = base["unique"];
// [[Rcpp::export]]
List myfunc(List x) {
return do_unique(x);
}

Thank you for being interested to this issue.
As I notified that, my List contains only NumericVector. I propose this code that works very well and faster than unique function in R. However its efficiency decreases when the list is large. Maybe this can help someone. Moreover, someone can also optimise this code.
List uniqueList(List& x) {
int xsize = x.size();
List xunique(x);
int s = 1;
for(int i(1); i<xsize; ++i){
NumericVector xi = x[i];
int l = 0;
for(int j(0); j<s; ++j){
NumericVector xj = x[j];
int xisize = xi.size();
int xjsize = xj.size();
if(xisize != xjsize){
++l;
}
else{
if((sum(xi == xj) == xisize)){
goto notkeep;
}
else{
++l;
}
}
}
if(l == s){
xunique[s] = xi;
++s;
}
notkeep: 0;
}
return head(xunique, s);
}
/***R
x <- list(1,42, 1, 1:3, 42)
uniqueList(x)
[[1]]
[1] 1
[[2]]
[1] 42
[[3]]
[1] 1 2 3
microbenchmark::microbenchmark(uniqueList(x), unique(x))
Unit: microseconds
expr min lq mean median uq max neval
uniqueList(x) 2.382 2.633 3.05103 2.720 2.8995 29.307 100
unique(x) 2.864 3.110 3.50900 3.254 3.4145 24.039 100
But R function becomes faster when the List is large. I am sure that someone can optimise this code.

Rcpp and "optional" arguments on functions [duplicate]

I created a cumsum function in an R package with rcpp which will cumulatively sum a vector until it hits the user defined ceiling or floor. However, if one wants the cumsum to be bounded above, the user must still specify a floor.
Example:
a = c(1, 1, 1, 1, 1, 1, 1)
If i wanted to cumsum a and have an upper bound of 3, I could cumsum_bounded(a, lower = 1, upper = 3). I would rather not have to specify the lower bound.
My code:
#include <Rcpp.h>
#include <float.h>
#include <cmath>
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector cumsum_bounded(NumericVector x, int upper, int lower) {
NumericVector res(x.size());
double acc = 0;
for (int i=0; i < x.size(); ++i) {
acc += x[i];
if (acc < lower) acc = lower;
else if (acc > upper) acc = upper;
res[i] = acc;
}
return res;
}
What I would like:
#include <Rcpp.h>
#include <float.h>
#include <cmath>
#include <climits> //for LLONG_MIN and LLONG_MAX
using namespace Rcpp;
// [[Rcpp::export]]
NumericVector cumsum_bounded(NumericVector x, long long int upper = LLONG_MAX, long long int lower = LLONG_MIN) {
NumericVector res(x.size());
double acc = 0;
for (int i=0; i < x.size(); ++i) {
acc += x[i];
if (acc < lower) acc = lower;
else if (acc > upper) acc = upper;
res[i] = acc;
}
return res;
}

In short, yes its possible but it requires finesse that involves creating an intermediary function or embedding sorting logic within the main function.
In long, Rcpp attributes only supports a limit feature set of values. These values are listed in the Rcpp FAQ 3.12 entry
String literals delimited by quotes (e.g. "foo")
Integer and Decimal numeric values (e.g. 10 or 4.5)
Pre-defined constants including:
Booleans: true and false
Null Values: R_NilValue, NA_STRING, NA_INTEGER, NA_REAL, and NA_LOGICAL.
Selected vector types can be instantiated using the
empty form of the ::create static member function.
CharacterVector, IntegerVector, and NumericVector
Matrix types instantiated using the rows, cols constructor Rcpp::Matrix n(rows,cols)
CharacterMatrix, IntegerMatrix, and NumericMatrix)
If you were to specify numerical values for LLONG_MAX and LLONG_MIN this would meet the criteria to directly use Rcpp attributes on the function. However, these values are implementation specific. Thus, it would not be ideal to hardcode them. Thus, we have to seek an outside solution: the Rcpp::Nullable<T> class to enable the default NULL value. The reason why we have to wrap the parameter type with Rcpp::Nullable<T> is that NULL is a very special and can cause heartache if not careful.
The NULL value, unlike others on the real number line, will not be used to bound your values in this case. As a result, it is the perfect candidate to use on the function call. There are two choices you then have to make: use Rcpp::Nullable<T> as the parameters on the main function or create a "logic" helper function that has the correct parameters and can be used elsewhere within your application without worry. I've opted for the later below.
#include <Rcpp.h>
#include <float.h>
#include <cmath>
#include <climits> //for LLONG_MIN and LLONG_MAX
using namespace Rcpp;
NumericVector cumsum_bounded_logic(NumericVector x,
long long int upper = LLONG_MAX,
long long int lower = LLONG_MIN) {
NumericVector res(x.size());
double acc = 0;
for (int i=0; i < x.size(); ++i) {
acc += x[i];
if (acc < lower) acc = lower;
else if (acc > upper) acc = upper;
res[i] = acc;
}
return res;
}
// [[Rcpp::export]]
NumericVector cumsum_bounded(NumericVector x,
Rcpp::Nullable<long long int> upper = R_NilValue,
Rcpp::Nullable<long long int> lower = R_NilValue) {
if(upper.isNotNull() && lower.isNotNull()){
return cumsum_bounded_logic(x, Rcpp::as< long long int >(upper), Rcpp::as< long long int >(lower));
} else if(upper.isNull() && lower.isNotNull()){
return cumsum_bounded_logic(x, LLONG_MAX, Rcpp::as< long long int >(lower));
} else if(upper.isNotNull() && lower.isNull()) {
return cumsum_bounded_logic(x, Rcpp::as< long long int >(upper), LLONG_MIN);
} else {
return cumsum_bounded_logic(x, LLONG_MAX, LLONG_MIN);
}
// Required to quiet compiler
return x;
}
Test Output
cumsum_bounded(a, 5)
## [1] 1 2 3 4 5 5 5
cumsum_bounded(a, 5, 2)
## [1] 2 3 4 5 5 5 5

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Compute density using Rcpp sugar function [duplicate] - rcpp

Related

Conversion of Rcpp::NumericMatrix to Eigen::MatrixXd

Ignore missing values while finding min in Rcpp

af::array conversion to float or double

Rcpp sugar unique of List

Rcpp and "optional" arguments on functions [duplicate]

Categories

Resources