I am implementing a bootstrap procedure using Rcpp and seem to have hit a performance bottleneck (?). Basically, I am creating a boot matrix using RcppArmadillo, calling the fit_power_law function from igraph and estimating the power law coefficient. The code is as follows:
#include <RcppArmadillo.h>
#include <RcppArmadilloExtensions/sample.h>
using namespace Rcpp;
// [[Rcpp::depends(RcppArmadillo)]]
double alphaPLFit(const arma::vec& x) {
Environment igraph("package:igraph");
Function fit_power_law = igraph["fit_power_law"];
List p = fit_power_law(Named("x", x));
return p["alpha"];
}
arma::mat bootstraps(const arma::vec& x, int times) {
arma::mat bootMatrix;
for (int t = 0; t < times; ++t) {
arma::vec resample = RcppArmadillo::sample(x, x.n_elem, true);
bootMatrix.insert_cols(t, resample);
}
return bootMatrix;
}
// [[Rcpp::export]]
arma::vec bootAlphaPLFit(const arma::vec& x, int times) {
arma::mat bootMatrix = bootstraps(x, times);
arma::vec alphas(times);
for (int i = 0; i < times; ++i) {
double alpha = alphaPLFit(bootMatrix.col(i));
alphas[i] = alpha;
}
return alphas;
}
/***R
library(igraph)
sampleData <- degree(sample_pa(n = 1e5, m = 3, directed = FALSE))
system.time(bootAlphaPLFit(sampleData, 500))
*/
This takes about a minute to run on my computer. As a comparison, the boot function from the boot package takes about 30 seconds, with a custom statistic that calls igraph::fit_power_law and extracts alpha.
Is there any way to speed this up?
Related
I am developing an R package using RcppArmadillo. I was writing a few util functions, which manipulate arma::mat and arma::vec objects. So I was trying to use pointer of arma::mat (or arma::vec) as arguments of those functions. Just like the following C++ example (https://onlinegdb.com/mNczwaPaV), I just want to pass the address of object, then manipulate the object value:
#include <iostream>
using namespace std;
void plus_one(int *x){
*x = *x + 1;
}
int main(){
int x = 1;
plus_one(&x);
printf("%d", x);
return 0;
}
2
...Program finished with exit code 0
Press ENTER to exit console.
Here is a toy example I was trying. RStudio gave me the error message "called object type 'arma::vec *' (aka 'Col *') is not a function or function pointer."
#include <RcppArmadillo.h>
using namespace Rcpp;
//
// [[Rcpp::depends(RcppArmadillo)]]
void f2(arma::vec *v){
*v = (*v)%log(*v) + (1-(*v))*log(1-(*v));
}
void trim(arma::vec *v, double tol){
*v(find(*v<=0.0)).fill(tol);
*v(find(*v>=1.0)).fill(1-tol);
}
// [[Rcpp::export]]
arma::vec f1(arma::vec v){
trim(&v, 1e-8);
return(f2(&v));
}
/*** R
f1(seq(0,1,0.2))
*/
I don't think v.memptr() allows me to manipulate the vector by R-like vector operations. For example,
double* v_mem = v.memptr();
*v_mem+1;
does not give the entrywise addition result. (Here, I want is v+1 in R). Do you have any suggestions?
Thank you!
I was reading the awesome Rcpp vignette on exposing c++ classes and functions using Rcpp modules. In that context, is it possible to create an Rcpp function that has a class of type Uniform as one of the arguments and that is not part of the particular module being exported? Below here is just a model of what I was thinking. The example is taken from the same vignette. The answer might be already there. It would be great if someone can point to the right place.
#include <RcppArmadillo.h>
using namespace Rcpp;
class Uniform {
public:
Uniform(double min_, double max_) :
min(min_), max(max_) {}
NumericVector draw(int n) const {
RNGScope scope;
return runif(n, min, max);
}
double min, max;
};
double uniformRange(Uniform* w) {
return w->max - w->min;
}
RCPP_MODULE(unif_module) {
class_<Uniform>("Uniform")
.constructor<double,double>()
.field("min", &Uniform::min)
.field("max", &Uniform::max)
.method("draw", &Uniform::draw)
.method("range", &uniformRange)
;
}
/// JUST AN EXAMPLE: WON'T RUN
// [[Rcpp::export]]
double test(double z, Uniform* w) {
return z + w->max ;
}
Following Dirk's comment, I am posting a possible solution. The idea would be to create a new instance of a class object with a pointer on it and create an external pointer that can be further passed as an argument of a function. Below here is what I have gathered from his post:
#include <RcppArmadillo.h>
using namespace Rcpp;
class Uniform {
public:
Uniform(double min_, double max_) :
min(min_), max(max_) {}
NumericVector draw(int n) const {
RNGScope scope;
return runif(n, min, max);
}
double min, max;
};
// create external pointer to a Uniform object
// [[Rcpp::export]]
XPtr<Uniform> getUniform(double min, double max) {
// create pointer to an Uniform object and
// wrap it as an external pointer
Rcpp::XPtr<Uniform> ptr(new Uniform( min, max ), true);
// return the external pointer to the R side
return ptr;
}
/// CAN RUN IT NOW:
// [[Rcpp::export]]
double test(double z, XPtr<Uniform> xp) {
double k = z + xp ->max;
return k;
}
I am trying to compile the following code. Please see below what I have tried so far. Is there anything that I am missing. Any help would be appreciated.
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
List beta(const arma::rowvec beta,
const int n,
const int L1,
const int p,
const arma::mat YWeight1,
const arma::mat z){
double S0=0;
for(int i = 0;i < n; ++i){
arma::rowvec zr = z.rows(i,i);
S0 += exp(arma::as_scalar(beta*zr.t()));
}
List res;
res["1"] = S0;
return(res);
}
I can't copy the error but this is what I am getting.
no match for call to '(Rcpp::traits::input_parameter<const arma::Row<double>
and so on...
There is a rowvec converter. The issue here is:
filece5923f317b2.cpp:39:34: error: type 'Rcpp::traits::input_parameter::type' (aka 'ConstInputParameter >') does not provide a call operator
rcpp_result_gen = Rcpp::wrap(beta(beta, n, L1, p, YWeight1, z));
Few thoughts: 1. There is already a function called beta() and 2. there is a variable named beta that might be causing havoc with Rcpp attributes.
Solution:
Remove the using namespace Rcpp;
Rename the function away from beta() to beta_estimator().
Specify the length of Rcpp::List
Access by numeric index instead of string.
Corrected code:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::export]]
Rcpp::List beta_estimator( // renamed function
const arma::rowvec beta,
const int n,
const int L1,
const int p,
const arma::mat YWeight1,
const arma::mat z){
double S0 = 0;
for(int i = 0;i < n; ++i){
arma::rowvec zr = z.rows(i,i);
S0 += exp(arma::as_scalar(beta*zr.t()));
}
// Specify list length
Rcpp::List res(1);
res[0] = S0;
return(res);
}
Right now i'm trying to run very large for loops for some task, nearly about 8e+12 iterations. I tried using c++11 threading, but it do not seems to be working that fast as required. I am using system with 8 gb ram, i5 cpu and intel graphics 4000 card. If i use openmp would it be better or i have to use nvidia gpu and use cuda for this task? My code is as below:
#include <ros/ros.h>
// PCL specific includes
#include <sensor_msgs/PointCloud2.h>
#include <pcl_conversions/pcl_conversions.h>
#include <pcl/point_types.h>
#include <pcl/filters/voxel_grid.h>
#include <visualization_msgs/Marker.h>
#include <rosbag/bag.h>
#include <std_msgs/Int32.h>
#include <rosbag/view.h>
#include <boost/foreach.hpp>
#define foreach BOOST_FOREACH
#include <fstream>
#include <pcl/point_cloud.h>
#include <pcl/octree/octree_pointcloud_changedetector.h>
#include <pcl/io/pcd_io.h>
#include <iostream>
#include <vector>
#include <ctime>
#include <thread>
ros::Publisher marker_publisher;
int frame_index = 0;
using namespace std;
int x[200000];
void thread_function(pcl::PointCloud<pcl::PointXYZRGB>::ConstPtr cloudB,vector<int> v,int p0) {
for(size_t p1=0;p1<v.size() && ros::ok();++p1) {
int p0p1 = sqrt( pow(cloudB->points[v[p1]].x-cloudB->points[v[p0]].x,2)
+pow(cloudB->points[v[p1]].y-cloudB->points[v[p0]].y,2)
+pow(cloudB->points[v[p1]].z-cloudB->points[v[p0]].z,2) ) * 1000;
if(p0p1>10) {
for(size_t p2=0;p2<v.size() && ros::ok();++p2) {
int p0p2 = sqrt( pow(cloudB->points[v[p2]].x-cloudB->points[v[p0]].x,2)
+pow(cloudB->points[v[p2]].y-cloudB->points[v[p0]].y,2)
+pow(cloudB->points[v[p2]].z-cloudB->points[v[p0]].z,2) ) * 1000;
int p1p2 = sqrt( pow(cloudB->points[v[p2]].x-cloudB->points[v[p1]].x,2)
+pow(cloudB->points[v[p2]].y-cloudB->points[v[p1]].y,2)
+pow(cloudB->points[v[p2]].z-cloudB->points[v[p1]].z,2) ) * 1000;
if(p0p2>10 && p1p2>10) {
}
}
}
}
x[p0] = 3;
cout<<"ended thread="<<p0<<endl;
}
void cloud_cb (const sensor_msgs::PointCloud2ConstPtr& input)
{
frame_index++;
pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloudB (new pcl::PointCloud<pcl::PointXYZRGB> );
pcl::fromROSMsg(*input,*cloudB);
// Initializing Marker parameters which will be used in rviz
vector<visualization_msgs::Marker> line_list, marker, text_view_facing;
line_list.resize(4); marker.resize(4); text_view_facing.resize(4);
for(int i=0;i<line_list.size();i++) {
marker[i].header.frame_id = line_list[i].header.frame_id = text_view_facing[i].header.frame_id = "/X3/base_link";
marker[i].header.stamp = line_list[i].header.stamp = text_view_facing[i].header.stamp =ros::Time();
marker[i].ns = line_list[i].ns = text_view_facing[i].ns ="lines";
marker[i].action = line_list[i].action = text_view_facing[i].action = visualization_msgs::Marker::ADD;
marker[i].pose.orientation.w = line_list[i].pose.orientation.w = text_view_facing[i].pose.orientation.w = 1;
marker[i].id = i+4;
line_list[i].id = i;
marker[i].type = visualization_msgs::Marker::POINTS;
line_list[i].type = visualization_msgs::Marker::LINE_LIST;
line_list[i].color.r = 1; line_list[i].color.g = 1; line_list[i].color. b = 1; line_list[i].color.a = 1;
marker[i].scale.x = 0.003;
marker[i].scale.y = 0.003;
marker[i].scale.z = 0.003;
text_view_facing[i].id = i+8;
text_view_facing[i].type = visualization_msgs::Marker::TEXT_VIEW_FACING;
text_view_facing[i].color.b = 1; text_view_facing[i].color.a = 1.0; text_view_facing[i].color.g = 1.0; text_view_facing[i].color.r = 1.0;
text_view_facing[i].scale.z = 0.015;
}
marker[3].scale.x = 0.05;
marker[3].scale.y = 0.05;
marker[3].scale.z = 0.05;
if(frame_index==10) // Saving the point cloud for only one time to find moved object in it
{
pcl::io::savePCDFileASCII ("test_pcd.pcd", *cloudB);
}
if(frame_index>10) // Reading above point cloud file after saving for once to compare it with newly arriving point clouds
{
pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud (new pcl::PointCloud<pcl::PointXYZRGB>);
if (pcl::io::loadPCDFile<pcl::PointXYZRGB> ("test_pcd.pcd", *cloud) == -1) //* load the file
{
PCL_ERROR ("Couldn't read file test_pcd.pcd \n");
}
else {
srand ((unsigned int) time (NULL));
// Octree resolution - side length of octree voxels
double resolution = 0.1;
// Instantiate octree-based point cloud change detection class
pcl::octree::OctreePointCloudChangeDetector<pcl::PointXYZRGB> octree (resolution);
// Add points from cloudA to octree
octree.setInputCloud (cloud);
octree.addPointsFromInputCloud ();
// Switch octree buffers: This resets octree but keeps previous tree structure in memory.
octree.switchBuffers ();
// Add points from cloudB to octree
octree.setInputCloud (cloudB);
octree.addPointsFromInputCloud ();
std::vector<int> newPointIdxVector;
// Get vector of point indices from octree voxels which did not exist in previous buffer
octree.getPointIndicesFromNewVoxels (newPointIdxVector);
geometry_msgs::Point p; std_msgs::ColorRGBA c;
for (size_t i = 0; i < newPointIdxVector.size (); ++i)
{
p.x = cloudB->points[newPointIdxVector[i]].x;
p.y = cloudB->points[newPointIdxVector[i]].y;
p.z = cloudB->points[newPointIdxVector[i]].z;
c.r = cloudB->points[newPointIdxVector[i]].r/255.0;
c.g = cloudB->points[newPointIdxVector[i]].g/255.0;
c.b = cloudB->points[newPointIdxVector[i]].b/255.0;
c.a = 1;
//cout<<newPointIdxVector.size()<<"\t"<<p.x<<"\t"<<p.y<<"\t"<<p.z<<endl;
if(!isnan(p.x) && !isnan(p.y) && !isnan(p.z)) {
marker[3].points.push_back(p);
marker[3].colors.push_back(c);
}
}
marker_publisher.publish(marker[3]);
pcl::PointCloud<pcl::PointXYZRGB> P;
thread t[newPointIdxVector.size()];
for(int p0=0;p0<newPointIdxVector.size();++p0) { // For each voxel in moved object
t[p0] = thread(thread_function,cloudB,newPointIdxVector,p0);
}
for(int p0=0;p0<newPointIdxVector.size();++p0) { // For each voxel in moved object
t[p0].join();
cout<<"joined"<<"\t"<<p0<<"\t"<<x[p0]<<endl;
}
}
}
}
int main (int argc, char** argv)
{
ros::init (argc, argv, "training");
ros::NodeHandle nh;
ros::Subscriber sub = nh.subscribe<sensor_msgs::PointCloud2> ("input", 1, cloud_cb);
marker_publisher = nh.advertise<visualization_msgs::Marker> ("visualization_marker",1);
// Spin
ros::spin ();
}
This task is really important for my algorithm to complete. I need a suggestion how to make this loops run very fast.
In above code the thread_function is the main function where i'm putting the for loops currentely. Is their any way to increase its performance in above code?
OpenMP is the easiest to implement and try. Just add a couple of lines at your CMakeLists.txt, an include and the famous #pragma omp parallel for line just before your for loop.
Threading itself is not necessarily a guarantee for speed. If your process is mostly linear, there is nothing to be done in parallel. In your case, it looks like you have a loop and each iteration might be able to be done independently in parallel, but because each loop is so small and mostly simple mathematical operations, the overhead for making each item its own thread might not save you much (if any) time. The algorithm itself might need an overhaul (i.e. doing this an entirely different way), but threading could potentially solve your issue if your loop is huge and you can break it into, say, 4 chunks and parallel process the 4 chunks (i.e. one thread does items 0-100, another 101-200, etc). Just be aware that one process might finish before another and if some other process is relying on the completion of the whole set of data, then you'll need to ensure that you're done with all 4 threads before continuing. And if you do any kind of manipulation of the data (i.e. shifting elements, adding, removing) in the parallel processes, then you could end up screwing up a parallel thread. Hope that helps!
I appreciate any help, and would like to thank you in advance. I'm working on a project for one of my classes. Essentially performing merge sort using multithreading and reference classes. In main I'm just trying to create an initial thread that will begin the recursive mergesort. Each time the array is split a new thread is spawned to handle that subroutine. I don't need all of it done, i just don't under stand why my Thread constructor and ThreadStart delegate are not working. Thanks again!!
#include <iostream>
#include <vector>
#include <string>
#include <time.h>
#include <cstdlib>
using namespace System;
using namespace System::Threading;
public ref class MergeSort
{
private: int cnt;
public: MergeSort()
{
cnt = 0;
}
public: void mergeSort(char a[], int from, int to)
{
Thread^ current = Thread::CurrentThread;
if(from == to)
return;
int mid = (from + to)/2;
//Sort the first and the second half
//addThread(a, from, mid);
//addThread(a, mid+1, to);
//threads[0]->Join();
//threads[1]->Join();
merge(a, from, mid, to);
}
public: void merge(char a[], int from, int mid, int to)
{
Thread^ current = Thread::CurrentThread;
while (current ->ThreadState == ThreadState::Running)
{
int n = to-from + 1; // Size of range to be merged
std::vector<char> b(n);
int i1 = from; //Next element to consider in the first half
int i2 = mid + 1; //Next element to consider in the second half
int j = 0; //Next open position in b
//As long as neight i1 or i2 is past the end, move the smaller element into b
while(i1 <= mid && i2 <= to)
{
if(a[i1] < a[i2])
{
b[j] = a[i1];
i1++;
}
else
{
b[j] = a[i2];
i2++;
}
j++;
}
//Copy any remaining entries of the first half
while(i1 <= mid)
{
b[j] = a[i1];
i1++;
j++;
}
while(i2 <= to)
{
b[j] = a[i2];
i2++;
j++;
}
//Copy back from temporary vector
for(j = 0; j < n; j++)
a[from+j] = b[j];
}
}
};
void main()
{
char A[10];
for(int i = 0; i < 10; i++)
{
A[i] = ((char) ((rand() % (122-65)) + 65));
}
array<Thread^>^ tr = gcnew array<Thread^>(10);
MergeSort^ ms1 = gcnew MergeSort();
ThreadStart^ TS = gcnew ThreadStart(ms1, &MergeSort::mergeSort(A, 0, 10));
tr[0] = gcnew Thread(TS);
tr[0] -> Start();
system("pause");
}
The issue you are facing here is how to construct a ThreadStart delegate. You are trying to do too many things in the ThreadStart constructor. You cannot pass in arguments at this point because all it is looking for is a start location for the thread.
The delegate should be:
ThreadStart^ TS = gcnew ThreadStart(ms1, &MergeSort::mergeSort);
Since however you are passing in some state, I would recommend doing a bit more research on how that is done using C++\CLI. This MSDN topic should give you a start.
Edit:
Never mind, the problem was that I had to change the parameter of the method I tried to pass from Int32 to Object^.
I´m having a similar issue, though i think my problem are not the arguments. I´m passing those through during thread->Start().
I think my problem is rather that I´m trying to start the thread using a method of a ref class.
invalid delegate initializer -- function does not match the delegate type
Is the error I´m getting. Any Ideas?
void AddForcesAll() {
for (int index = 0; index < n; index++) {
Thread^ thread = gcnew Thread (gcnew ParameterizedThreadStart(this, &Bodies::AddForces));
thread->Start(index);
}
The Syntax worked fine for me for non referenced classes.