How to slice Rcpp NumericVector for elements 2 to 101? - rcpp

Hi, I'm trying to slice an Rcpp NumericVector to get elements 2 to 101.
In R, I would do this:
array[2:101]
How do I do the same in Rcpp?
I tried looking here: http://gallery.rcpp.org/articles/subsetting/
But the example there lists every index explicitly using IntegerVector::create(). However, ::create() is limited in the number of elements it accepts (in addition to being tedious). Is there any way to slice a vector given two indices?

This is possible with Rcpp's Range class, which generates the equivalent C++ positional index sequence (inclusive at both ends). E.g.,
Rcpp::Range(0, 3)
would give:
0 1 2 3
Note: C++ indices begin at 0 not 1!
Example:
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericVector subset_range(Rcpp::NumericVector x,
int start = 1, int end = 100) {
// Use the Range function to create a positional index sequence
return x[Rcpp::Range(start, end)];
}
/***R
x = rnorm(101)
# Note: C++ indices start at 0 not 1!
all.equal(x[2:101], subset_range(x, 1, 100))
*/
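The only subtle step is the index shift: R's x[2:101] is 1-based and inclusive, while Rcpp::Range(1, 100) is 0-based and inclusive. As a minimal sketch of that conversion, here is the same idea in plain C++, with std::vector standing in for NumericVector so that it compiles without R. The name slice_r_style is hypothetical and not part of Rcpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Rcpp function): returns x[first..last] using
// R-style bounds, i.e. 1-based and inclusive at both ends.
std::vector<double> slice_r_style(const std::vector<double>& x,
                                  std::size_t first, std::size_t last) {
    assert(first >= 1 && first <= last && last <= x.size());
    // Shift to 0-based: R position `first` sits at offset first - 1, and the
    // half-open iterator range stops one past offset last - 1, i.e. at `last`.
    const auto begin = x.begin() + static_cast<std::ptrdiff_t>(first - 1);
    const auto end = x.begin() + static_cast<std::ptrdiff_t>(last);
    return std::vector<double>(begin, end);
}
```

With a 101-element vector, slice_r_style(x, 2, 101) returns 100 elements, which corresponds to x[2:101] in R and to subset_range(x, 1, 100) above.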

Related

Fill a NumericMatrix with a single value on construction

I'm trying to fill a NumericMatrix with a single value on construction. As an example, consider the following:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
void test() {
NumericMatrix res(1, 1, NA_REAL);
}
This is throwing the error of:
error: call to constructor of 'Vector<14, PreserveStorage>' is ambiguous
VECTOR( start, start + (static_cast<R_xlen_t>(nrows_)*ncols) ),
^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
file46e92f4e027d.cpp:6:17: note: in instantiation of function template specialization 'Rcpp::Matrix<14, PreserveStorage>::Matrix<double>' requested here
NumericMatrix res(1, 1, NA_REAL);
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/Rcpp/vector/Vector.h:88:5: note: candidate constructor [with T = double]
Vector( const T& size, const stored_type& u,
^
/Library/Frameworks/R.framework/Versions/4.0/Resources/library/Rcpp/include/Rcpp/vector/Vector.h:211:5: note: candidate constructor [with InputIterator = double]
Vector( InputIterator first, InputIterator last){
^
Why is a NumericMatrix unable to be instantiated with a single value alongside fixed dimensions?
So in short this works (one longer line broken in three for display):
> Rcpp::cppFunction("NumericVector fp() {
+ NumericVector res(3,NA_REAL);
+ return res;}")
> fp()
[1] NA NA NA
>
but matrices have no matching constructor that takes rows, columns, and a fill value. So you have to use what vectors give you above, and set the dimensions by hand.
For example via (where I had it all in one line, which I broke up here for exposition)
> Rcpp::cppFunction("NumericMatrix fp(int n, int k) {
+ NumericVector res(n*k,NA_REAL);
+ res.attr(\"dim\") = IntegerVector::create(n,k);
+ return NumericMatrix(res);}")
> fp(2,3)
[,1] [,2] [,3]
[1,] NA NA NA
[2,] NA NA NA
>
Not to usurp Dirk, but there isn't a need to set the dimensions of the matrix with .attr().
Filling a matrix, unlike a vector, requires supplying an iterator with n * p elements alongside dimensions for the constructor.
Matrix(const int& nrows_, const int& ncols, Iterator start)
For other constructors, please see: inst/include/Rcpp/vector/Matrix.h
With this in mind, the original example can be changed to:
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
Rcpp::NumericMatrix matrix_fill_by_vec(int n, int p) {
// fill matrix using a vector
Rcpp::NumericVector A = Rcpp::NumericVector(n * p, NA_REAL);
Rcpp::NumericMatrix B = Rcpp::NumericMatrix(n, p, A.begin());
return B;
}
Taking it for a test drive, we get:
matrix_fill_by_vec(3, 2)
# [,1] [,2]
# [1,] NA NA
# [2,] NA NA
# [3,] NA NA
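Both answers rely on the fact that R stores a matrix as one flat vector in column-major order, with the dimensions kept alongside; that is why a filled vector of length n * p can be turned into a filled n x p matrix. The following plain C++ sketch illustrates that layout without Rcpp. FlatMatrix and its members are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an R matrix: flat storage plus dimensions.
struct FlatMatrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<double> data;  // column-major, length nrow * ncol

    // Every cell starts out as `value`, like filling the vector first.
    FlatMatrix(std::size_t n, std::size_t p, double value)
        : nrow(n), ncol(p), data(n * p, value) {}

    // Element (i, j), 0-based: column j starts at offset j * nrow.
    double& at(std::size_t i, std::size_t j) { return data[i + j * nrow]; }
};
```

In this layout, at(2, 1) of a 3 x 2 matrix is the last element of the flat vector, which is the same ordering R uses when it prints matrix(1:6, 3, 2).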

Find the number of ways you can form a string of size N, given an unlimited number of 0s and 1s

The question below was asked in an Atlassian online test. I don't have the test cases; I took the question from this link:
Find the number of ways you can form a string of size N, given an unlimited number of 0s and 1s. But
you cannot have D consecutive 0s or T consecutive 1s. N, D, T are given as inputs.
Please help me with this problem; any approach for how to proceed would be appreciated.
My approach was simply to apply recursion, try all possibilities, and then memoize using a hash map.
But it seems to me there must be a combinatorial approach that solves this in less time and space. For debugging purposes I am also printing the strings generated during the recursion. If there is a flaw in my approach, please tell me.
#include <bits/stdc++.h>
using namespace std;
unordered_map<string,int>dp;
int recurse(int d,int t,int n,int oldd,int oldt,string s)
{
if(d<=0)
return 0;
if(t<=0)
return 0;
cout<<s<<"\n";
if(n==0&&d>0&&t>0)
return 1;
string h=to_string(d)+" "+to_string(t)+" "+to_string(n);
if(dp.find(h)!=dp.end())
return dp[h];
int ans=0;
ans+=recurse(d-1,oldt,n-1,oldd,oldt,s+'0')+recurse(oldd,t-1,n-1,oldd,oldt,s+'1');
return dp[h]=ans;
}
int main()
{
int n,d,t;
cin>>n>>d>>t;
dp.clear();
cout<<recurse(d,t,n,d,t,"")<<"\n";
return 0;
}
You are right: instead of generating strings, it is worth considering a combinatorial approach using (a kind of) dynamic programming.
A "good" sequence of length K may end with a run of 1..D-1 zeros or a run of 1..T-1 ones.
To make a good sequence of length K+1, you can append a zero to every sequence except those ending in D-1 zeros. Sequences ending in zeros (the first kind) then end in 2..D-1 zeros, and sequences ending in ones (the second kind) end in exactly 1 zero.
Similarly, you can append a one to every sequence of the first kind, and to every sequence of the second kind except those ending in T-1 ones. The first kind then ends in exactly 1 one, and the second kind in 2..T-1 ones.
Make two tables
Zeros[N][D] and Ones[N][T]
where Zeros[K][C] counts the good sequences of length K that end in exactly C zeros, and Ones[K][C] does the same for ones.
Fill the first row with zeros, except for Zeros[1][1] = 1 and Ones[1][1] = 1 (assuming D > 1 and T > 1).
Fill row by row using the rules above.
Zeros[K][1] = Sum(Ones[K-1][C=1..T-1])
for C in 2..D-1:
    Zeros[K][C] = Zeros[K-1][C-1]
Ones[K][1] = Sum(Zeros[K-1][C=1..D-1])
for C in 2..T-1:
    Ones[K][C] = Ones[K-1][C-1]
The result is the sum of the last row in both tables.
Also note that you really need only two active rows of each table, so you can reduce the size to Zeros[2][D] and Ones[2][T] after debugging.
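The recurrence above can be sketched bottom-up as follows, keeping only the current row of each table. This is one possible reading of the answer, not the answerer's own code: count_strings is a hypothetical name, and the use of a 64-bit counter without a modulus is an assumption, since the problem statement does not say how large N can get.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

// Counts binary strings of length n that contain neither d consecutive zeros
// nor t consecutive ones. The count overflows uint64_t for large n.
std::uint64_t count_strings(int n, int d, int t) {
    assert(n >= 1 && d >= 1 && t >= 1);
    const std::size_t D = static_cast<std::size_t>(d);
    const std::size_t T = static_cast<std::size_t>(t);
    // zeros[c]: strings of the current length ending in a run of exactly
    // c zeros, for c in 1..D-1 (index 0 is unused). Likewise for ones[c].
    std::vector<std::uint64_t> zeros(D, 0), ones(T, 0);
    if (D > 1) zeros[1] = 1;  // the string "0"
    if (T > 1) ones[1] = 1;   // the string "1"

    for (int k = 2; k <= n; ++k) {
        const std::uint64_t sum_zeros =
            std::accumulate(zeros.begin(), zeros.end(), std::uint64_t{0});
        const std::uint64_t sum_ones =
            std::accumulate(ones.begin(), ones.end(), std::uint64_t{0});
        std::vector<std::uint64_t> next_zeros(D, 0), next_ones(T, 0);
        if (D > 1) next_zeros[1] = sum_ones;  // a zero after any run of ones
        for (std::size_t c = 2; c < D; ++c) next_zeros[c] = zeros[c - 1];
        if (T > 1) next_ones[1] = sum_zeros;  // a one after any run of zeros
        for (std::size_t c = 2; c < T; ++c) next_ones[c] = ones[c - 1];
        zeros = std::move(next_zeros);
        ones = std::move(next_ones);
    }
    return std::accumulate(zeros.begin(), zeros.end(), std::uint64_t{0}) +
           std::accumulate(ones.begin(), ones.end(), std::uint64_t{0});
}
```

For N = 4, D = 2, T = 3 the valid strings are 0101, 0110, 1010, 1011 and 1101, so count_strings(4, 2, 3) returns 5. The work per row is O(D + T), giving O(N x (D + T)) time and O(D + T) space overall.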
This can be solved using dynamic programming. I'll give a recursive, memoized solution; it is similar to generating a binary string.
The states are:
i: the index of the character we are about to append to the string.
cnt: the number of consecutive identical characters immediately before i.
bit: the character that was repeated cnt times before i; its value is either 0 or 1.
Base case: return 1 when i reaches n, since we start at 0 and end at n-1.
Size the dp array accordingly. The time complexity is O(2 x N x max(D, T)).
#include<bits/stdc++.h>
using namespace std;
int dp[1000][1000][2];
int n, d, t;
int count(int i, int cnt, int bit) {
if (i == n) {
return 1;
}
int &ans = dp[i][cnt][bit];
if (ans != -1) return ans;
ans = 0;
if (bit == 0) {
ans += count(i+1, 1, 1);
if (cnt != d - 1) {
ans += count(i+1, cnt + 1, 0);
}
} else {
// bit == 1
ans += count(i+1, 1, 0);
if (cnt != t-1) {
ans += count(i+1, cnt + 1, 1);
}
}
return ans;
}
signed main() {
ios_base::sync_with_stdio(false), cin.tie(nullptr);
cin >> n >> d >> t;
memset(dp, -1, sizeof dp);
cout << count(0, 0, 0);
return 0;
}
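Since the question mentions printing strings for debugging, a brute-force counter is a convenient way to check any of the solutions above on small inputs. The sketch below enumerates every bit pattern of length n and rejects those with a forbidden run; brute_force_count is a hypothetical name, and the cap of n <= 20 is an arbitrary choice to keep the 2^n enumeration fast.

```cpp
#include <cassert>
#include <cstdint>

// Reference implementation for testing only: exponential in n.
// Counts strings with no run of d zeros and no run of t ones.
std::uint64_t brute_force_count(int n, int d, int t) {
    assert(n >= 1 && n <= 20 && d >= 1 && t >= 1);
    std::uint64_t good = 0;
    for (std::uint32_t mask = 0; mask < (std::uint32_t{1} << n); ++mask) {
        int run = 0;    // length of the run ending at the current position
        int prev = -1;  // previous bit, -1 before the first position
        bool ok = true;
        for (int i = 0; i < n && ok; ++i) {
            const int bit = static_cast<int>((mask >> i) & 1u);
            run = (bit == prev) ? run + 1 : 1;
            prev = bit;
            // A run that reaches the limit makes the whole string invalid.
            if ((bit == 0 && run >= d) || (bit == 1 && run >= t)) ok = false;
        }
        if (ok) ++good;
    }
    return good;
}
```

Comparing its output with the DP for all small N, D and T is usually enough to catch off-by-one mistakes in the run limits.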

Creating a Templated Function to Fill a Vector with another depending on Size

Is there a base function in Rcpp that:
Fills the output entirely with a single value if the size of the input vector is 1.
Fills the output with the input vector if both have the same length.
Fills the output with NA values if the lengths differ and the input vector is not of size 1.
I've written the above criteria as a function below, using a NumericVector as an example. If there isn't a base function in Rcpp that performs these operations, there should be a way to template the function so that the same logic can be applied to any type of vector (e.g. numeric, character, and so on).
// [[Rcpp::export]]
NumericVector cppvectorize(NumericVector x,NumericVector y) {
NumericVector y_out(y.size());
if(x.size() == 1) {
for(int i = 0; i < y_out.size(); i++) {
y_out[i] = x[0];
}
} else if(x.size() == y_out.size()) {
for(int i = 0; i < y_out.size(); i++) {
y_out[i] = x[i];
}
} else {
for(int i = 0; i < y_out.size(); i++) {
y_out[i] = NA_REAL;
}
}
return y_out;
}
Unfortunately, the closest you will come to such a function is one of the rep variants that Rcpp supports. However, none of the variants match the desired output. Therefore, the only real option is to implement a templated version of your desired function.
To create the templated function, we first create a routing function that handles the dispatch of SEXP objects. The rationale is that Rcpp Attributes can pass SEXP objects between R and C++, whereas a templated function cannot be exported directly. As a result, we need to list the SEXPTYPE values (used as RTYPE) that we are willing to dispatch on. The TYPEOF() macro retrieves the type code of an object, and a switch statement routes that code to the matching case.
After dispatching, we arrive at the templated function. It uses Rcpp's base Vector class to simplify the data flow. The notable novelty is the use of ::traits::get_na<RTYPE>() to retrieve the NA value appropriate to the type and fill with it.
With the plan in place, let's look at the code:
#include <Rcpp.h>
using namespace Rcpp;
// ---- Templated Function
template <int RTYPE>
Vector<RTYPE> vec_helper(const Vector<RTYPE>& x, const Vector<RTYPE>& y) {
Vector<RTYPE> y_out(y.size());
if(x.size() == 1){
y_out.fill(x[0]);
} else if (x.size() == y.size()) {
y_out = x;
} else {
y_out.fill(::traits::get_na<RTYPE>());
}
return y_out;
}
// ---- Dispatch function
// [[Rcpp::export]]
SEXP cppvectorize(SEXP x, SEXP y) {
switch (TYPEOF(x)) {
case INTSXP: return vec_helper<INTSXP>(x, y);
case REALSXP: return vec_helper<REALSXP>(x, y);
case STRSXP: return vec_helper<STRSXP>(x, y);
default: Rcpp::stop("SEXP Type Not Supported.");
}
// Need to return a value even though this will never be triggered
// to quiet the compiler.
return R_NilValue;
}
Sample Tests
Here we conduct a few sample tests, one for each of the supported data types.
# Case 1 (integer): lengths differ and length(x) > 1, so the result is NA
x = 1:5
y = 2
cppvectorize(x, y)
## [1] NA
# Case 2 (character): length(x) == length(y), so x is copied
x = letters[1:5]
y = letters[6:10]
cppvectorize(x, y)
## [1] "a" "b" "c" "d" "e"
# Case 3 (numeric): length(x) == 1, so x is recycled to length(y)
x = 1.5
y = 2.5:6.5
cppvectorize(x, y)
## [1] 1.5 1.5 1.5 1.5 1.5

RcppArmadillo and C++ division issue

A very simple question regarding RcppArmadillo. Trying to multiply a vector by a scalar, and getting different results depending on small changes in syntax.
Any ideas?
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::export]]
arma::vec funtemp(arma::vec x)
{
// return(x/10); // this works
// return((1/10)*x); // this does not work
return(x*(1/10)); // this does not work
}
Ahh, the good ol' integer vs. double division problem in C++. Before we begin, note that arma::vec stores doubles, whereas 1, 10, and 1/10 are all ints...
Let's take a look at your functions separately:
#include <RcppArmadillo.h>
// [[Rcpp::depends("RcppArmadillo")]]
// [[Rcpp::export]]
arma::vec funtemp_one(arma::vec x)
{
return(x/10); // this works
}
// [[Rcpp::export]]
arma::vec funtemp_two(arma::vec x)
{
return((1/10)*x); // this does not work
}
// [[Rcpp::export]]
arma::vec funtemp_three(arma::vec x)
{
return(x*(1/10)); // this does not work
}
Therefore, when we run through your problem we get:
> funtemp_one(1)
[,1]
[1,] 0.1
> funtemp_two(1)
[,1]
[1,] 0
> funtemp_three(1)
[,1]
[1,] 0
In the latter functions (i.e. those using 1/10), the operator/ being used is integer division: two ints go in and one int comes out. Integer division discards the fractional part, so 1/10 evaluates to 0, and multiplying x by 0 gives the zeros shown above.
To get floating-point division, which returns a double, at least one of the operands must be a double. This happens automatically in the first case, since x/10 divides the doubles stored in an arma::vec by an int. The second and third cases have an int/int structure, which can be dealt with in two ways: 1. write one of the literals with a trailing .0, or 2. explicitly cast one value to double with double(int).
e.g.
// [[Rcpp::export]]
arma::vec funtemp_four(arma::vec x)
{
return(x*(1/10.0)); // this works
}
// [[Rcpp::export]]
arma::vec funtemp_five(arma::vec x)
{
return(x*(1/double(10))); // this works
}
Will give what you expect:
> funtemp_four(1)
[,1]
[1,] 0.1
> funtemp_five(1)
[,1]
[1,] 0.1
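The same behaviour can be reproduced without Armadillo, which makes it clear that the cause is the C++ language rule and not the library. In this sketch the function names are hypothetical, and each function applies the scaling to a single double instead of an arma::vec.

```cpp
// int / int is integer division: the fractional part is discarded.
static_assert(1 / 10 == 0, "int / int truncates toward zero");

// Mirrors funtemp_three for one element: the factor is the int 0.
double scale_int_division(double x) { return x * (1 / 10); }

// Mirrors funtemp_four: 10.0 is a double, so the factor is 0.1.
double scale_double_division(double x) { return x * (1 / 10.0); }

// Mirrors funtemp_five: an explicit cast has the same effect as 10.0.
double scale_cast_division(double x) {
    return x * (1 / static_cast<double>(10));
}
```

Only the first function returns zero for every input; the other two scale by one tenth as intended.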

Rcpp Armadillo, submatrices and subvectors

I am trying to translate some R code into RcppArmadillo, and as part of that I would like to do the following:
Assume there is a nonnegative vector v and a matrix M, both with for example m rows. I would like to get rid of all rows in the matrix M whenever there is a zero in the corresponding row of the vector v and afterwards also get rid of all entries that are zero in the vector v. Using R this is simply just the following:
M = M[v>0,]
v = v[v>0]
So my question is whether there is a way to do this in RcppArmadillo. Since I am quite new to programming, I was not able to find anything that solves my problem, although I doubt I am the first to ask this probably quite easy question.
Of course there is a way to go about subsetting elements in both Rcpp (subsetting with Rcpp) and RcppArmadillo (Armadillo subsetting).
Here is a way to replicate the behavior of R subsets in Armadillo.
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// Isolate by Row
// [[Rcpp::export]]
arma::mat vec_subset_mat(const arma::mat& x, const arma::uvec& idx) {
return x.rows(find(idx > 0));
}
// Isolate by Element
// [[Rcpp::export]]
arma::vec subset_vec(const arma::vec& x) {
return x.elem(find(x > 0));
}
/*** R
set.seed(1334)
m = matrix(rnorm(100), 10, 10)
v = sample(0:1, 10, replace = T)
all.equal(m[v>0,], vec_subset_mat(m,v))
all.equal(v[v>0], as.numeric(subset_vec(v)))
*/

Resources