Rcpp: are objects safely PROTECTed if I cast to/from SEXP? - rcpp

When creating an R extension in C or C++, objects created in the compiled code need to be manually protected and unprotected from the garbage collector - e.g.:
SEXP some_func(SEXP inp)
{
SEXP some_vec = PROTECT(Rf_allocVector(REALSXP, 100));
// do something with 'some_vec' ...
UNPROTECT(1);
return some_vec;
}
Rcpp OTOH offers convenient classes which automatically handle the PROTECT and UNPROTECT calls, so one can safely do something like:
Rcpp::NumericVector some_func(SEXP inp)
{
Rcpp::NumericVector some_vec(100);
// do something with 'some_vec' ...
return some_vec;
}
Now, if I were to cast the Rcpp objects to generic SEXP, are they still auto protected? Example:
SEXP some_func(SEXP inp)
{
SEXP some_vec = Rcpp::NumericVector(100);
// do something with 'some_vec' ...
return some_vec;
}
Since some_vec was allocated as a specialized NumericVector but then casted to SEXP, does it still undergo automatic PROTECT/UNPROTECT, or do I need to do it manually?
And what about casting the other way around? E.g. is this safe?
Rcpp::NumericVector some_func(SEXP inp)
{
Rcpp::NumericVector some_vec = Rf_allocVector(REALSXP, 100);
// do something with 'some_vec' ...
return some_vec;
}

Yes as the only interface that there is from R uses SEXP .Call(SEXP ...). So every Rcpp object automatically gets transformed (a lot with implicit conversion but also via Rcpp::wrap(...)) and the protection is part of that.
You can easily verify that by for once creating a SEXP manually without the protection layer and R will tell you in R CMD check and/or crash.
From that you can conclude that Rcpp objects are correctly protected as this does not happen to them.

Related

Code analysis C26408 — Replacing the m_pszHelpFilePath variable in InitInstance

In my application's InitInstance function, I have the following code to rewrite the location of the CHM Help Documentation:
CString strHelp = GetProgramPath();
strHelp += _T("MeetSchedAssist.CHM");
free((void*)m_pszHelpFilePath);
m_pszHelpFilePath = _tcsdup(strHelp);
It is all functional but it gives me a code analysis warning:
C26408 Avoid malloc() and free(), prefer the nothrow version of new with delete (r.10).
When you look at the official documentation for m_pszHelpFilePath it does state:
If you assign a value to m_pszHelpFilePath, it must be dynamically allocated on the heap. The CWinApp destructor calls free( ) with this pointer. You many want to use the _tcsdup( ) run-time library function to do the allocating. Also, free the memory associated with the current pointer before assigning a new value.
Is it possible to rewrite this code to avoid the code analysis warning, or must I add a __pragma?
You could (should?) use a smart pointer to wrap your reallocated m_pszHelpFilePath buffer. However, although this is not trivial, it can be accomplished without too much trouble.
First, declare an appropriate std::unique_ptr member in your derived application class:
class MyApp : public CWinApp // Presumably
{
// Add this member...
public:
std::unique_ptr<TCHAR[]> spHelpPath;
// ...
};
Then, you will need to modify the code that constructs and assigns the help path as follows (I've changed your C-style cast to an arguably better C++ cast):
// First three (almost) lines as before ...
CString strHelp = GetProgramPath();
strHelp += _T("MeetSchedAssist.CHM");
free(const_cast<TCHAR *>(m_pszHelpFilePath));
// Next, allocate the shared pointer data and copy the string...
size_t strSize = static_cast<size_t>(strHelp.GetLength() + 1);
spHelpPath std::make_unique<TCHAR[]>(strSize);
_tcscpy_s(spHelpPath.get(), strHelp.GetString()); // Use the "_s" 'safe' version!
// Now, we can use the embedded raw pointer for m_pszHelpFilePath ...
m_pszHelpFilePath = spHelpPath.get();
So far, so good. The data allocated in the smart pointer will be automatically freed when your application object is destroyed, and the code analysis warnings should disappear. However, there is one last modification we need to make, to prevent the MFC framework from attempting to free our assigned m_pszHelpFilePath pointer. This can be done by setting that to nullptr in the MyApp class override of ExitInstance:
int MyApp::ExitInstance()
{
// <your other exit-time code>
m_pszHelpFilePath = nullptr;
return CWinApp::ExitInstance(); // Call base class
}
However, this may seem like much ado about nothing and, as others have said, you may be justified in simply supressing the warning.
Technically, you can take advantage of the fact that new / delete map to usual malloc/free by default in Visual C++, and just go ahead and replace. The portability won't suffer much as MFC is not portable anyway. Sure you can use unique_ptr<TCHAR[]> instead of direct new / delete, like this:
CString strHelp = GetProgramPath();
strHelp += _T("MeetSchedAssist.CHM");
std::unique_ptr<TCHAR[]> str_old(m_pszHelpFilePath);
auto str_new = std::make_unique<TCHAR[]>(strHelp.GetLength() + 1);
_tcscpy_s(str_new.get(), strHelp.GetLength() + 1, strHelp.GetString());
m_pszHelpFilePath = str_new.release();
str_old.reset();
For robustness for replaced new operator, and for least surprise principle, you should keep free / strdup.
If you replace multiple of those CWinApp strings, suggest writing a function for them, so that there's a single place with free / strdup with suppressed warnings.

Generating XPtr from Rcpp function

I am writing an R package in which one of the functions takes Rcpp::XPtr as an input (as a SEXP). However, the creation of XPtr from Rcpp::Function is something I want to do inside the package (i.e., the user should be able to input Function).
e.g, my package takes input generated as follows, which requires the user to write an additional function (here putFunPtrInXPtr()) and run the function in R to generate the XPtr (here my_ptr).
#include <Rcpp.h>
using namespace Rcpp;
typedef NumericVector (*funcPtr) (NumericVector y);
// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
return x * 2;
}
// [[Rcpp::export]]
XPtr<funcPtr> putFunPtrInXPtr() {
XPtr<funcPtr> testptr(new funcPtr(&timesTwo), false);
return testptr;
}
/*** R
my_xptr <- putFunPtrInXPtr()
*/
How can I write something in which the user provides Function user_fun and I create the XPtr?
I tried
XPtr<funcPtr> package_fun(Function user_fun_input){
XPtr<funcPtr> testptr(new funcPtr(&user_fun_input), false);
}
user_fun_input is the parameter name inside the package function, but I am getting the following error
cannot initialize a new value of type 'funcPtr' (aka 'Vector<14> (*) (Vector<14>') with an rvalue of type 'Rcpp::Function *' (aka 'Function_Impl<PreserveStorage> *')
Also, there is an R step involved in creating the pointer, I am not sure how to implement that in the package (my .cpp file).
I think the creation of XPtr from Function could be confusing to the user, so better to just take Function as input and create the pointer to it, inside the package. I do use the XPtr in my package to gain speed.
Suggestions are most appreciated!

Another weird issue with Garbage Collection?

OK, so here's the culprit method :
class FunctionDecl
{
// More code...
override void execute()
{
//...
writeln("Before setting... " ~ name);
Glob.functions.set(name,this);
writeln("After setting." ~ name);
//...
}
}
And here's what happens :
If omit the writeln("After setting." ~ name); line, the program crashes, just at this point
If I keep it in (using the name attribute is the key, not the writeln itself), it works just fine.
So, I suppose this is automatically garbage collected? Why is that? (A pointer to some readable reference related to GC and D would be awesome)
How can I solve that?
UPDATE :
Just tried a GC.disable() at the very beginning of my code. And... automagically, everything works again! So, that was the culprit as I had suspected. The thing is : how is this solvable without totally eliminating Garbage Collection?
UPDATE II :
Here's the full code of functionDecl.d - "unnecessary" code omitted :
//================================================
// Imports
//================================================
// ...
//================================================
// C Interface for Bison
//================================================
extern (C)
{
void* FunctionDecl_new(char* n, Expressions i, Statements s) { return cast(void*)(new FunctionDecl(to!string(n),i,s)); }
void* FunctionDecl_newFromReference(char* n, Expressions i, Expression r) { return cast(void*)(new FunctionDecl(to!string(n),i,r)); }
}
//================================================
// Functions
//================================================
class FunctionDecl : Statement
{
// .. class variables ..
this(string n, Expressions i, Statements s)
{
this(n, new Identifiers(i), s);
}
this(string n, Expressions i, Expression r)
{
this(n, new Identifiers(i), r);
}
this(string n, Identifiers i, Statements s)
{
// .. implementation ..
}
this(string n, Identifiers i, Expression r)
{
// .. implementation ..
}
// .. other unrelated methods ..
override void execute()
{
if (Glob.currentModule !is null) parentModule = Glob.currentModule.name;
Glob.functions.set(name,this);
}
}
Now as for what Glob.functions.set(name,this); does :
Glob is an instance holding global definitions
function is the class instance dealing with defined functions (it comes with a FunctionDecl[] list
set simply does that : list ~= func;
P.S. I'm 99% sure it has something to do with this one : Super-weird issue triggering "Segmentation Fault", though I'm still not sure what went wrong this time...
I think the problem is that the C function is allocating the object, but D doesn't keep a reference. If FunctionDecl_new is called back-to-back in a tight memory environment, here's what would happen:
the first one calls, creating a new object. That pointer goes into the land of C, where the D GC can't see it.
The second one goes, allocating another new object. Since memory is tight (as far as the GC pool is concerned), it tries to run a collection cycle. It finds the object from (1), but cannot find any live pointers to it, so it frees it.
The C function uses that freed object, causing the segfault.
The segfault won't always run because if there's memory to spare, the GC won't free the object when you allocate the second one, it will just use its free memory instead of collecting. That's why omitting the writeln can get rid of the crash: the ~ operator allocates, which might just put you over the edge of that memory line, triggering a collection (and, of course, running the ~ gives the gc a chance to run in the first place. If you never GC allocate, you never GC collect either - the function looks kinda like gc_allocate() { if(memory_low) gc_collect(); return GC_malloc(...); })
There's three solutions:
Immediately store a reference in the FunctionDecl_new function in a D structure, before returning:
FunctionDecl[] fdReferences;
void* FunctionDecl_new(...) {
auto n = new FunctionDecl(...);
fdReferences ~= n; // keep the reference for later so the GC can see it
return cast(void*) n;
}
Call GC.addRoot on the pointer right before you return it to C. (I don't like this solution, I think the array is better, a lot simpler.)
Use malloc to create the object to give to C:
void* FunctionDecl_new(...) {
import std.conv : emplace;
import core.stdc.stdlib : malloc;
enum size = __traits(classInstanceSize, FunctionDecl);
auto memory = malloc(size)[0 .. size]; // need to slice so we know the size
auto ref = emplace!FunctionDecl(memory, /* args to ctor */); // create the object in the malloc'd block
return memory.ptr; // give the pointer to C
}
Then, of course, you ought to free the pointer when you know it is no longer going to be used, though if you don't, it isn't really wrong.
The general rule I follow btw is any memory that crosses language barriers for storage (usage is different) ought to be allocated similarly to what that language expects: So if you pass data to C or C++, allocate it in a C fashion, e.g. with malloc. This will lead to the least surprising friction as it gets stored.
If the object is just being temporarily used, it is fine to pass a plain pointer to it, since a temp usage isn't stored or freed by the receiving function so there's less danger there. Your reference will still exist too, if nothing else, on the call stack.

duck typing in D

I'm new to D, and I was wondering whether it's possible to conveniently do compile-time-checked duck typing.
For instance, I'd like to define a set of methods, and require that those methods be defined for the type that's being passed into a function. It's slightly different from interface in D because I wouldn't have to declare that "type X implements interface Y" anywhere - the methods would just be found, or compilation would fail. Also, it would be good to allow this to happen on any type, not just structs and classes. The only resource I could find was this email thread, which suggests that the following approach would be a decent way to do this:
void process(T)(T s)
if( __traits(hasMember, T, "shittyNameThatProbablyGetsRefactored"))
// and presumably something to check the signature of that method
{
writeln("normal processing");
}
... and suggests that you could make it into a library call Implements so that the following would be possible:
struct Interface {
bool foo(int, float);
static void boo(float);
...
}
static assert (Implements!(S, Interface));
struct S {
bool foo(int i, float f) { ... }
static void boo(float f) { ... }
...
}
void process(T)(T s) if (Implements!(T, Interface)) { ... }
Is is possible to do this for functions which are not defined in a class or struct? Are there other/new ways to do it? Has anything similar been done?
Obviously, this set of constraints is similar to Go's type system. I'm not trying to start any flame wars - I'm just using D in a way that Go would also work well for.
This is actually a very common thing to do in D. It's how ranges work. For instance, the most basic type of range - the input range - must have 3 functions:
bool empty(); //Whether the range is empty
T front(); // Get the first element in the range
void popFront(); //pop the first element off of the range
Templated functions then use std.range.isInputRange to check whether a type is a valid range. For instance, the most basic overload of std.algorithm.find looks like
R find(alias pred = "a == b", R, E)(R haystack, E needle)
if (isInputRange!R &&
is(typeof(binaryFun!pred(haystack.front, needle)) : bool))
{ ... }
isInputRange!R is true if R is a valid input range, and is(typeof(binaryFun!pred(haystack.front, needle)) : bool) is true if pred accepts haystack.front and needle and returns a type which is implicitly convertible to bool. So, this overload is based entirely on static duck typing.
As for isInputRange itself, it looks something like
template isInputRange(R)
{
enum bool isInputRange = is(typeof(
{
R r = void; // can define a range object
if (r.empty) {} // can test for empty
r.popFront(); // can invoke popFront()
auto h = r.front; // can get the front of the range
}));
}
It's an eponymous template, so when it's used, it gets replaced with the symbol with its name, which in this case is an enum of type bool. And that bool is true if the type of the expression is non-void. typeof(x) results in void if the expression is invalid; otherwise, it's the type of the expression x. And is(y) results in true if y is non-void. So, isInputRange will end up being true if the code in the typeof expression compiles, and false otherwise.
The expression in isInputRange verifies that you can declare a variable of type R, that R has a member (be it a function, variable, or whatever) named empty which can be used in a condition, that R has a function named popFront which takes no arguments, and that R has a member front which returns a value. This is the API expected of an input range, and the expression inside of typeof will compile if R follows that API, and therefore, isInputRange will be true for that type. Otherwise, it will be false.
D's standard library has quite a few such eponymous templates (typically called traits) and makes heavy use of them in its template constraints. std.traits in particular has quite a few of them. So, if you want more examples of how such traits are written, you can look in there (though some of them are fairly complicated). The internals of such traits are not always particularly pretty, but they do encapsulate the duck typing tests nicely so that template constraints are much cleaner and more understandable (they'd be much, much uglier if such tests were inserted in them directly).
So, that's the normal approach for static duck typing in D. It does take a bit of practice to figure out how to write them well, but that's the standard way to do it, and it works. There have been people who have suggested trying to come up with something similar to your Implements!(S, Interface) suggestion, but nothing has really come of that of yet, and such an approach would actually be less flexible, making it ill-suited for a lot of traits (though it could certainly be made to work with basic ones). Regardless, the approach that I've described here is currently the standard way to do it.
Also, if you don't know much about ranges, I'd suggest reading this.
Implements!(S, Interface) is possible but did not get enough attention to get into standard library or get better language support. Probably if I won't be the only one telling it is the way to go for duck typing, we will have a chance to have it :)
Proof of concept implementation to tinker around:
http://dpaste.1azy.net/6d8f2dc4
import std.traits;
bool Implements(T, Interface)()
if (is(Interface == interface))
{
foreach (method; __traits(allMembers, Interface))
{
foreach (compareTo; MemberFunctionsTuple!(Interface, method))
{
bool found = false;
static if ( !hasMember!(T, method) )
{
pragma(msg, T, " has no member ", method);
return false;
}
else
{
foreach (compareWhat; __traits(getOverloads, T, method))
{
if (is(typeof(compareTo) == typeof(compareWhat)))
{
found = true;
break;
}
}
if (!found)
{
return false;
}
}
}
}
return true;
}
interface Test
{
bool foo(int, double);
void boo();
}
struct Tested
{
bool foo(int, double);
// void boo();
}
pragma(msg, Implements!(Tested, Test)());
void main()
{
}

MFC multithreading with delete[] , dbgheap.c

I've got a strange problem and really don't understand what's going on.
I made my application multi-threaded using the MFC multithreadclasses.
Everything works well so far, but now:
Somewhere in the beginning of the code I create the threads:
m_bucketCreator = new BucketCreator(128,128,32);
CEvent* updateEvent = new CEvent(FALSE, FALSE);
CWinThread** threads = new CWinThread*[numThreads];
for(int i=0; i<8; i++){
threads[i]=AfxBeginThread(&MyClass::threadfunction, updateEvent);
m_activeRenderThreads++;
}
this creates 8 threads working on this function:
UINT MyClass::threadfunction( LPVOID params ) //executed in new Thread
{
Bucket* bucket=m_bucketCreator.getNextBucket();
...do something with bucket...
delete bucket;
}
m_bucketCreator is a static member. Now I get some thread error in the deconstructor of Bucket on the attempt to delete a buffer (however, the way I understand it this buffer should be in the memory of this thread, so I don't get why there is an error). On the attempt of delete[] buffer, the error happens in _CrtIsValidHeapPointer() in dbgheap.c.
Visual studio outputs the message that it trapped a halting point and this can be either due to heap corruption or because the user pressed f12 (I didn't ;) )
class BucketCreator {
public:
BucketCreator();
~BucketCreator(void);
void init(int resX, int resY, int bucketSize);
Bucket* getNextBucket(){
Bucket* bucket=NULL;
//enter critical section
CSingleLock singleLock(&m_criticalSection);
singleLock.Lock();
int height = min(m_resolutionY-m_nextY,m_bucketSize);
int width = min(m_resolutionX-m_nextX,m_bucketSize);
bucket = new Bucket(width, height);
//leave critical section
singleLock.Unlock();
return bucket;
}
private:
int m_resolutionX;
int m_resolutionY;
int m_bucketSize;
int m_nextX;
int m_nextY;
//multithreading:
CCriticalSection m_criticalSection;
};
and class Bucket:
class Bucket : public CObject{
DECLARE_DYNAMIC(RenderBucket)
public:
Bucket(int a_resX, int a_resY){
resX = a_resX;
resY = a_resY;
buffer = new float[3 * resX * resY];
int buffersize = 3*resX * resY;
for (int i=0; i<buffersize; i++){
buffer[i] = 0;
}
}
~Bucket(void){
delete[] buffer;
buffer=NULL;
}
int getResX(){return resX;}
int getResY(){return resY;}
float* getBuffer(){return buffer;}
private:
int resX;
int resY;
float* buffer;
Bucket& operator = (const Bucket& other) { /*..*/}
Bucket(const Bucket& other) {/*..*/}
};
Can anyone tell me what could be the problem here?
edit: this is the other static function I'm calling from the threads. Is this safe to do?
static std::vector<Vector3> generate_poisson(double width, double height, double min_dist, int k, std::vector<std::vector<Vector3> > existingPoints)
{
CSingleLock singleLock(&m_criticalSection);
singleLock.Lock();
std::vector<Vector3> samplePoints = std::vector<Vector3>();
...fill the vector...
singleLock.Unlock();
return samplePoints;
}
All the previous replies are sound. For the copy constructor, make sure that it doesn't just copy the buffer pointer, otherwise that will cause the problem. It needs to allocate a new buffer, not the pointer value, which would cause an error in 'delete'. But I don't get the impression that the copy contructor will get called in your code.
I've looked at the code and I am not seeing any error in it as is. Note that the thread synchronization isn't even necessary in this GetNextBucket code, since it's returning a local variable and those are pre-thread.
Errors in ValidateHeapPointer occur because something has corrupted the heap, which happens when a pointer writes past a block of memory. Often it's a for() loop that goes too far, a buffer that wasn't allocated large enough, etc.
The error is reported during a call to 'delete' because that's when the heap is validated for bugs in debug mode. However, the error has occurred before that time, it just happens that the heap is checked only in 'new' and 'delete'. Also, it isn't necessarily related to the 'Bucket' class.
What you need to need to find this bug, short of using tools like BoundsChecker or HeapValidator, is comment out sections of your code until it goes away, and then you'll find the offending code.
There is another method to narrow down the problem. In debug mode, include in your code, and sprinkle calls to _CrtCheckMemory() at various points of interest. That will generate the error when the heap is corrupted. Simply move the calls in your code to narrow down at what point the corruption begins to occur.
I don't know which version of Visual C++ you are using. If you're using a earlier one like VC++ 6.0, make sure that you are using the Multitreaded DLL version of the C Run Time Library in the compiler option.
You're constructing a RenderBucket. Are you sure you're calling the 'Bucket' class's constructor from there? It should look like this:
class RenderBucket : public Bucket {
RenderBucket( int a_resX, int a_resY )
: Bucket( a_resX, a_resY )
{
}
}
Initializers in the Bucket class to set the buffer to NULL is a good idea... Also making the Default constructor and copy constructor private will help to make double sure those aren't being used. Remember.. the compiler will create these automatically if you don't:
Bucket(); <-- default constructor
Bucket( int a_resx = 0, int a_resy = 0 ) <-- Another way to make your default constructor
Bucket(const class Bucket &B) <-- copy constructor
You haven't made a private copy constructor, or any default constructor. If class Bucket is constructed via one of these implicitly-defined methods, buffer will either be uninitialized, or it will be a copied pointer made by a copy constructor.
The copy constructor for class Bucket is Bucket(const Bucket &B) -- if you do not explicitly declare a copy constructor, the compiler will generate a "naive" copy constructor for you.
In particular, if this object is assigned, returned, or otherwise copied, the copy constructor will copy the pointer to a new object. Eventually, both objects' destructors will attempt to delete[] the same pointer and the second attempt will be a double deletion, a type of heap corruption.
I recommend you make class Bucket's copy constructor private, which will cause attempted copy construction to generate a compile error. As an alternative, you could implement a copy constructor which allocates new space for the copied buffer.
Exactly the same applies to the assignment operator, operator=.
The need for a copy constructor is one of the 55 tips in Scott Meyer's excellent book, Effective C++: 55 Specific Ways to Improve Your Programs and Designs:
This book should be required reading for all C++ programmers.
If you add:
class Bucket {
/* Existing code as-is ... */
private:
Bucket() { buffer = NULL; } // No default construction
Bucket(const Bucket &B) { ; } // No copy construction
Bucket& operator= (const Bucket &B) {;} // No assignment
}
and re-compile, you are likely to find your problem.
There is also another possibility: If your code contains other uses of new and delete, then it is possible these other uses of allocated memory are corrupting the linked-list structure which defines the heap memory. It is common to detect this corruption during a call to delete, because delete must utilize these data structures.

Resources