Rcpp with structured library in /src

Rcpp with structured library in /src - rcpp

I'm trying to write a wrapper for a C++ function I've written, making use of the Point Clouds Library (PCL). This is my first try interfacing R and C++, so I apologise if any solution is too trivial. My goal is to make a few functions available for myself and my colleagues directly in R, on mac and windows. My example function cloudSize is included at the bottom of the text. I will try to be as clear as possible.
I've installed PCL with the vcpkg package manager for winx64 at C:\src\vcpkg\vcpkg.
This is added to my Environmental Variable Path for my user.
I created an empty R-package with Rcpp.package.skeleton():
C:/User/csvi0001/Desktop/GitHub/RPCLpackage/PCLR
PCL is a massive library, but thankfully modular,and so I only #include the headers that are needed to compile the executable: pcl/io/pcd_io.h, pcl/point_types.h, pcl/registration/icp.h.
Now, since I'd like this to work on more than one OS - and therefore compile on install (?) - I should use a dynamic library? I'll presume that the person installing my package already has a compiled copy of pcl. However, I do not know how to find a flag showing that pcl is installed - how do I find these for inclusion in Makevars(?). CMake must find them when testing the C++ function in VSCode after adding an include path. In lieu of this:
I copy the pcl folder installed by vcpkg to ./src . When I tried copying all the .h files, they seemed to lose track of one another as they refer to eachother through which module they are placed in, e.g. <pcl/memory.h> cannot be found if memory.h is placed directly in ./src. However, flattening the structure of the modules means that every single dependency and #include must be manually changed, in some cases there are also files with the same name in different folders. e.g. pcl/kdtree.h and pcl/search/kdtree.h. After this, it must be done again when replacing < > with " " for each header.
Is there any way of telling Rcpp that the library included in /src is structured?
I'm working on Win 10 winx64.
Since I'm making use of the depends RcppEigen and BH; and I must have C++14 or higher (choice: C++17) I add to my DESCRIPTION file:
LinkingTo: Rcpp, RcppEigen, BH
SystemRequirements: C++17
My actual C++ function:
//PCL requires at least C++14
//[[Rcpp::plugins(cpp17)]]
//[[Rcpp::depends(RcppEigen)]]
//[[Rcpp::depends(BH)]]
#include <Rcpp.h>
#include <iostream>
#include "pcl/io/pcd_io.h"
#include "pcl/point_types.h"
#include "pcl/registration/icp.h"
//[[Rcpp::export]]
int cloudSize(Rcpp::DataFrame x)
{
pcl::PointCloud<pcl::PointXYZ> sourceCloud;
for(int i=0;i<x.nrows();i++)
{
sourceCloud.push_back(pcl::PointXYZ(x[0][i],x[1][i],x[2][i])); //This way of referring to elements in a Rcpp::DataFrame may be erroneous.
}
int cloudSize = sourceCloud->size();
return (cloudSize);
}

That is a non-trivial question. In the simplest case, use a 'hook' offered by configure and configure.win to pre-build a (static) library you ship in your sources and then link your package to that.
That said, the Writing R Extensions manual and/or the CRAN Repository Policy (both of which are the references here) expressed more of a preference for an external library -- which may not be an option here if PCL is too exotic.
As the topic comes up with Rcpp, I wrote a short paper about it (at arXiv here) which is also included as a vignette in the package. It requires a few pages to cover the common cases but even then it cannot cover all.
Your main source of reference may be CRAN. The are lots of packages in this space. A few of mine use external libraries, I contributed to package nloptr which uses a hybrid approach ("use system library if found, else build") and some like httpuv always build (a small-ish library).

Related

How to install the "FastAD" C++ library in Rcpp?

I am trying to install a C++ library called FastAD ( https://github.com/JamesYang007/FastAD#user-guide) in Rcpp but the installation instructions are generic (not specifically for Rcpp).
It would be greatly appreciated if someone coule give me some guidance for how to install and be able to #include the files?

FastAD is a header-only library which only depends on Eigen3. That makes for a pretty straightforward application for Rcpp and friends.
First, rely on the RcppEigen.package.skeleton() function to create the barebones RcppEigen-using package.
Second, we copy the FastAD library into inst/include. We add a PKG_CPPFLAGS variable in src/Makevars to let R know to source it from there. That is all the compiler needs with a header-only library. (Edit: We also set CXX_STD=CXX17 unless one has a new enough compiler or R (currently: r-devel) which already default to C++17.)
Third, we create a simple example in src/ based on an example in FastAD. We picked the Black-Scholes example.
Fourth, minor cleanups like removing the hello* stanza files.
That is pretty much it. In its embryonic form the package is now here on GitHub. An example is
> library(RcppFastAD)
> blackScholesExamples()
56.5136
0.773818
51.4109
-0.226182
>

Building R Package that uses RcppArmadillo, RcppEigen and depends on Cpp11 Plugins

I followed all the procedures explained so far about this matter either in this website or published notes by Dirk, Hadley or others. However, I still have problems in building my package due to the issue regarding cpp11 plugin.
I used RcppArmadillo.package.skeleton() function. I put my cpp file in the src directory. The NAMESPACE file looks as it should which contains importFrom(Rcpp, sourceCpp) line. I also edited DESCRIPTION file and in the LinkingTo section, I added RcppEigen and other packages I use. I finally ran the compileAttributes(verbose=TRUE) function in R and everything looked OK. Therefore, I think I have done everything as I should. I have to also mention that when I compile my code in R using sourceCpp(), it works perfect and is compiled with no errors!
To illustrate better what my dependencies are, I put the first block of my code here:
#include <RcppArmadillo.h>
#include <RcppNumerical.h>
#include <RcppArmadilloExtensions/sample.h>
#include <Eigen/LU>
#include <algorithm>
// [[Rcpp::depends(RcppArmadillo)]]
// [[Rcpp::depends(RcppEigen)]]
// [[Rcpp::depends(RcppNumerical)]]
// [[Rcpp::plugins(cpp11)]]
The problem is when I build my package and I get errors and warnings for the lines I have auto type which relates to cpp11 plugin.
After searching similar posts on this website, I concluded that I have to force my R compiler to use c++11 and there fore I edited my Makvars file located at ~/.R/Makevars and since I use MAC I added this line:
CXX=clang++ -std=c++11 to that file. However, when I do that those 3 errors go away but 50 new errors are generated as all of the Armadillo variable types, such as mat, uvec, etc are not recognized any more. So I don't know how to fix this.
I think basically putting // [[Rcpp::plugins(cpp11)]] should take care of it as the new version of Rcpp supports this plug in and probably that's why when I run sourceCpp in R I get no errors and everything looks fine. But I don't know what happens when building my package. My Rcpp version is 0.12.8 .
Thank you in advance for any sorts of help.

Plugins for both dependencies (ie other headers) and compiler options are for use by sourceCpp().
Packages do this with LinkingTo: and, for the C++11 directive, either src/Makevars or SystemRequirements. See Writing R Extensions which documents this.

How can I include additional modules in a NodeJS custom binary?

I am building a custom binary of NodeJS from the latest code base for an embedded system. I have a couple modules that I would like to ship as standard with the binary - or even run a custom script the is compiled into the binary and can be invoked through a command line option.
So two questions:
1) I vaguely remember that node allowed to include custom modules during build time but I went through the latest 5.9.0 configure script and I can't see anything related - or maybe I am missing it.
2) Did someone already do something similar? If yes, what were the best practices you came up with?
I am not looking for something like Electron or other binary bundlers but actually building into the node binary.
Thanks,
Andy

So I guess I figure it out much faster that I thought.
For anyone else, you can add any NPM module to it and just add the actual source files to the node.gyp configuration file.
Compile it and run the custom binary. It's all in there now.
> var cmu = require("cmu");
undefined
> cmu
{ version: [Function] }
> cmu.version()
'It worked!'
> `

After studying this for quite a while, I have to say that the flyandi's answer is not quite true. You cannot add any NPM module just by adding it to the node.gyp.
You can only add pure JavaScript modules this way. To be able to embed a C++ module (I deliberately don't use the word "native", because that one is quite ambiguous in nodeJS terminology - just look at the sources).
To summarize this:
To embed a JS module to your custom nodejs, just add it in the library_files section of the node.gyp file. Also note that it should be placed within the lib folder, otherwise you'll have troubles requiring the module. That's because the name/path listed in node.gyp / library_files is used to encode the id of the module in the node_javascript.cc intermediate file which is then used when searching for the built-in modules.
To embed a native module is much more difficult. The best way I have found so far is to build the module as a static library instead of dynamic, which for cmake(-js) based module you can achieve by changing the SHARED parameter to STATIC like this:
add_library(${PROJECT_NAME} STATIC ${SRC})
instead of:
add_library(${PROJECT_NAME} SHARED ${SRC})
And also changing the suffix:
set_target_properties(
${PROJECT_NAME}
PROPERTIES
PREFIX ""
SUFFIX ".lib") /* instead of .node */
Then you can link it from node.gyp by adding this section:
'link_settings': {
'libraries' : [
"path/to/my/library.lib",
#...add other static dependencies
],
},
(how to do this with node-gyp based project should be quite ease to google)
This allows you to build the module, but you won't be able to require it, because require() function in node can only be used to load built-in JS modules, external JS modules or external dynamic node modules. But now we have a built-in C++ module. Well, lot of node integrated modules are C++, but they always have a JS wrapper in /lib, and those wrappers they use process.binding() to load the C++ module. That is, process.binding() is sort of a require() function for integrated C++ modules.
That said, we also need to call require.binding() instead of require to load our integrated module. To be able to do that, we have to make our module "built-in" first.
We can do that by replacing
NODE_MODULE(mymodule, InitAll)
int the module definition with
NODE_BUILTIN_MODULE_CONTEXT_AWARE(mymodule, InitAll)
which will register it as internal module and from now on we can process.binding() it.
Note that NODE_BUILTIN_MODULE_CONTEXT_AWARE is not defined in node.h as NODE_MODULE but in node_internals.h so you either have to include that one, or copy the macro definition over to your cpp file (the first one is of course better because the nodejs API tends to change quite often...).
The last thing we need to do is to list our newly integrated module among the others so that the node knows to initialize them (that is include them within the list of modules used when searching for the modules loaded with process.binding()). In node_internals.h there is this macro:
#define NODE_BUILTIN_STANDARD_MODULES(V) \
V(async_wrap) \
V(buffer) \
V(cares_wrap) \
...
So just add the your module to the list the same way as the others V(mymodule).
I might have forgotten some step, so ask in the comments if you think I have missed something.
If you wonder why would anyone even want to do this... You can come up with several reasons, but here's one most important to me: Those package managers used to pack your project within one executable (like pkg or nexe) work only with node-gyp based modules. If you, like me, need to use cmake based module, the final executable won't work...

Why are there so many libraries in MSVC and why do I have to recompile the code again

In every platform there are various versions of a given library: multi-threaded, debug, dynamic, etc..
Correct me if I am wrong here, but in Linux an object can link to any version of a library just fine, regardless of how its compiled. For example, there is no need to use any special flags at compile time to specify whether the link will eventually be to a dynamic or a static version of the run-time libraries (clarification: I am not talking about creating dynamic/static libraries, I am talking about linking to them - so -fPIC doesn't apply). Same goes for debug or optimized version of libraries.
Why in MSVC (Windows in general with other compilers. true?) I need to recompile the code every time in order to link to different versions of libraries? I am talking the /MD, /MT, /MTd, /MDd, etc flags. Is the code actually using different system headers each time. If so, why?
I would really appreciate any pointers to solid documentation that discusses these library matters in Windows for a C/C++ programmer..
thanks!

The compiler setting does very little other than simple change some macro definitions. Its microsoft's c-runtime header files that change their behaviour based on the runtime selected.
First, the header files use a # pragma directive to embed in the object file a directive specifying which .lib file to include, choosing one of: msvcrt.lib, msvcrtd.lib, libcmt.lib and mibcmtd.lib
The directives look like this
#ifdef <release dll runtime>
#pragma comment(lib,"msvcrt.lib")
#endif
Next, it also modifies a macro definition used on all c-rt functions that adds the __declspec(dllimport) directive if a dll runtime was selected. the effect of this directive is to change the imported symbol from, say, '_strcmp' to '__imp__strcmp'.
The dll import libraries (msvcrt.lib and msvcrtd.lib) export their symbols (to the linker) as __imp_<function name>, which means that, in the Visual C++ world, once you have compiled code to link against the dll runtimes you cannot change your mind - they will NOT link against a static runtime.
Of course, the reverse is not the case - dll import libraries actually export their public symbols both ways: with and without the __imp_ prefix.
Which means that code built against a static runtime CAN be later co-erced into linking with the dll or static runtimes.
If you are building a static library for other consumers, you should ensure that your compiler settings include:
One of the static library settings, so that consumers of your .lib can choose themselves which c-runtime to use, and
Set the 'Omit Default Library Name' (/Zl)flag. This tells the compiler to ignore the #pragma comment(lib,... directives, so the obj files and resulting lib does NOT have any kind of implicit runtime dependency. If you don't do this, users of your lib who choose a different runtime setting will see confusing messages about duplicate symbols in libc.lib and msvcrt.lib which they will have to bypass by using the ignore default libraries flag.

These using these compiler options have two effects. The automatically #define a macro that may be used by header files (and your own code) to do different things. This effects only a small part of the C runtime, and you can check the headers to see if it's happening in your case.
The other thing is that the C++ compiler embeds a comment in your object file that tells the linker to automatically include a particular flavor of the MSVC runtime, whether you specify that library at link time or not.
This is convenient for small programs, where you simply type at a command prompt cl myprogram.cpp to compile and link, producing myprogram.exe.
You can defeat automatic linking of the commented-in flavor of the c-runtime by passing /nodefaultlib to the linker. And then specify a different flavor of the c-runtime instead. This will work if you are careful not to depend on the #defines for _MT and
_DLL (keep in mind that the standard C headers might be looking at these also).
I don't recommend this, but if you have a reason to need to do this, it can be made to work in most cases.
If you want to know what parts of the C header files behave differently, you should just search for _MT and _DLL in the headers and see.

All of the options use the same header files, however they all imply different #define which affect the header files. So they need to be recompiled.
The switches also link to the appropriate library, but the recompile is not because of the linking.
See here for a list of what is defined when you use each.

Link libraries with dependencies in Visual C++ without getting LNK4006

I have a set of statically-compiled libraries, with fairly deep-running dependencies between the libraries. For example, the executable X uses libraries A and B, A uses library C, and B uses libraries C and D:
X -> A
A -> C
X -> B
B -> C
B -> D
When I link X with A and B, I don't want to get errors if C and D were not also added to the list of libraries—the fact that A and B use these libraries internally is an implementation detail that X should not need to know about. Also, when new dependencies are added anywhere in the dependency tree, the project file of any program that uses A or B would have to be reconfigured. For a deep dependency tree, the list of required libraries can become really long and hard to maintain.
So, I am using the "Additional Dependencies" setting of the Librarian section in the A project, adding C.lib. And in the same section of B's project, I add C.lib and D.lib. The effect of this is that the librarian bundles C.lib into A.lib, and C.lib and D.lib into B.lib.
When I link X, however, both A.lib and B.lib contain their own copy of C.lib. This leads to tons of warnings along the lines of
A.lib(c.obj) : warning LNK4006 "symbol" (_symbol) already defined in B.lib(c.obj); second definition ignored.
How can I accomplish this without getting warnings? Is there a way to simply disable the warning, or is there a better way?
EDIT: I have seen more than one answer suggesting that, for the lack of a better alternative, I simply disable the warning. Well, this is part of the problem: I don't even know how to disable it!

As far as I know you can't disable linker warnings.
However, you can ignore some of them, using command line parameter of linker eg. /ignore:4006
Put it in your project properties under linker->command line setting (don't remember exact location).
Also read this:
Link /ignore
MSDN Forum - hiding LNK warnings
Wacek

Update If you can build all involved project in single solution, try this:
Put all project in one sln.
Remove all references to static libraries from projects' linker or librarian properties.
There is "Project Dependencies..." option in context menu for each project in Solution Explorer. Use it to define dependencies between project.
It should work. It doesn't invalidate anything I said before, the basic model of building C/C++ programs stays the same. VS (at least 2005 and newer) is simply smart enough to add all needed static libraries to linker command line. You can see it in project properties.
Of course this method won't help if you need to use already compiled static libraries. Then you need to add them all to exe or dll project that directly or indirectly uses them.
I don't think you can do anything about that. You should remove references to other static libs from static libs projects and add all needed static libs projects as dependences of exe or dll projects. You will just have to live with fact that any project that includes A.lib or B.lib also needs to include C.lib.
As an alternative you can turn your libraries into dlls which provide a richer model.
Statically compiled libraries simply aren't real libraries with dependency information, etc, like dlls. See how, when you build them, you don't really need to provide libraries they depend on? Headers are all that's needed. See? You can't even really say static libraries depend on something.
Static library is just an archive of compiled and not yet linked object code. It's not consistent whole. Each object file is compiled separately and remains separate entity inside the library. Linking happens when you build exe or dll. That's when you need to provide all object code. That's when all the symbol and dependency resolving happens.
If you add other static libraries to static library dependencies, librarian will simply copy all code together. Then, when building exe, linker will give you lots of warnings about duplicate symbols. You might be able to block those warnings (I don't know how) but be careful. It may conceal real problems like real duplicate symbols with differing definitions. And if you have static data defined in libraries, it probably won't work anyway.

Microsoft (R) Incremental Linker Version 9.00.x (link.exe) knows argument /ignore:4006

You could create one library which contains A, B, C & D and then link X against that.
Since it's a library, only object modules which are actually referenced will get linked into the final executable.

Note that one way of getting this warning is to define a member function in a header without the inline statement:
// Foo.h
class Foo
{
void someFunction();
};
void Foo:someFunction() // Warning! - should be "inline void Foo::someFunction()"
{
// do stuff
}

The problem is you are not localizing library C's symbols. So you have a ODR violation when you link in A and B. You need to have a way to make these private. By default all symbols are exported. One way to do this is to have a special linker definition file for both A and B that explicitly mention which files need to be exported.
[1] ODR = One Definition Rule.

I think the best course of action here will be to ignore/disable the linker warnings(LNK4006) since C.lib needs to be part of both A.Lib and B.lib and A.Lib does not need to know that B.lib itself uses C.Lib.

This may not fix your link error, but it might help with your dependency tree issue.
What I do, is just use a #pragma to include a lib in the .cpp file that needs it. For example:
#pragma comment(lib:"wsock32")
Like I said, I'm not sure it would keep the symbols in that object file, I'd have to whip up an example to try it out.

Poor flodin seems frustrated that nobody will explain how to disable the linker warnings. Well, I've had a similar problem, and for years I have simply lived with the fact that several hundred warnings were displayed. Now, however, thanks to the info from Link /ignore, I figured out how to disable the linker warnings.
I'm using Visual Studio 2008. In Project -> Settings -> Configuration Properties -> Librarian -> Command Line -> Additional Options, I added "/ignore:4006" (without the quotes). Now my warnings are gone!

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string