Disabling the default class constructor in Rcpp modules - rcpp

I would like to disable the default (zero argument) constructor for a C++ class exposed to R using RCPP_MODULE so that calls to new(class) without any further arguments in R give an error. Here are the options I've tried:
1) Specifying a default constructor and throwing an error in the function body: this does what I need it to, but means specifying a default constructor that sets dummy values for all const member variables (which is tedious for my real use case)
2) Specifying that the default constructor is private (without a definition): this means the code won't compile if .constructor() is used in the Rcpp module, but has no effect if .constructor() is not used in the Rcpp module
3) Explicitly using delete on the default constructor: requires C++11 and seems to have the same (lack of) effect as (2)
I'm sure that I am missing something obvious, but can't for the life of me work out what it is. Does anyone have any ideas?
Thanks in advance,
Matt
Minimal example code (run in R):
inc <- '
using namespace Rcpp;
class Foo
{
public:
Foo()
{
stop("Disallowed default constructor");
}
Foo(int arg)
{
Rprintf("Foo OK\\n");
}
};
class Bar
{
private:
Bar();
// Also has no effect:
// Bar() = delete;
public:
Bar(int arg)
{
Rprintf("Bar OK\\n");
}
};
RCPP_MODULE(mod) {
class_<Foo>("Foo")
.constructor("Disallowed default constructor")
.constructor<int>("Intended 1-argument constructor")
;
class_<Bar>("Bar")
// Wont compile unless this line is commented out:
// .constructor("Private default constructor")
.constructor<int>("Intended 1-argument constructor")
;
}
'
library('Rcpp')
library('inline')
fx <- cxxfunction(signature(), plugin="Rcpp", include=inc)
mod <- Module("mod", getDynLib(fx))
# OK as expected:
new(mod$Foo, 1)
# Fails as expected:
new(mod$Foo)
# OK as expected:
new(mod$Bar, 1)
# Unexpectedly succeeds:
new(mod$Bar)
How can I get new(mod$Bar) to fail without resorting to the solution used for Foo?
EDIT
I have discovered that my question is actually a symptom of something else:
#include <Rcpp.h>
class Foo {
public:
int m_Var;
Foo() {Rcpp::stop("Disallowed default constructor"); m_Var=0;}
Foo(int arg) {Rprintf("1-argument constructor\n"); m_Var=1;}
int GetVar() {return m_Var;}
};
RCPP_MODULE(mod) {
Rcpp::class_<Foo>("Foo")
.constructor<int>("Intended 1-argument constructor")
.property("m_Var", &Foo::GetVar, "Get value assigned to m_Var")
;
}
/*** R
# OK as expected:
f1 <- new(Foo, 1)
# Value set in the 1-parameter constructor as expected:
f1$m_Var
# Unexpectedly succeeds without the error message:
tryCatch(f0 <- new(Foo), error = print)
# This is the type of error I was expecting to see:
tryCatch(f2 <- new(Foo, 1, 2), error = print)
# Note that f0 is not viable (and sometimes brings down my R session):
tryCatch(f0$m_Var, error = print)
*/
[With acknowledgements to @RalfStubner for the improved code]
So in fact it seems that new(Foo) is not actually calling any C++ constructor at all for Foo, so my question was somewhat off-base ... sorry.
I guess there is no way to prevent this happening at the C++ level, so maybe it makes most sense to use a wrapper function around the call to new(Foo) at the R level, or continue to specify a default constructor that throws an error. Both of these solutions will work fine - I was just curious as to exactly why my expectation regarding the absent default constructor was wrong :)
As a follow-up question: does anybody know exactly what is happening in f0 <- new(Foo) above? My limited understanding suggests that although f0 is created in R, the associated pointer leads to something that has not been (correctly/fully) allocated in C++?

After a bit of experimentation I have found a simple solution to my problem that is obvious in retrospect...! All I needed to do was use .factory for the default constructor along with a function that takes no arguments and just throws an error. The default constructor for the Class is never actually referenced so doesn't need to be defined, but it obtains the desired behaviour in R (i.e. an error if the user mistakenly calls new with no additional arguments).
Here is an example showing the solution (Foo_A) and a clearer illustration of the problem (Foo_B):
#include <Rcpp.h>
class Foo {
private:
Foo(); // Or for C++11: Foo() = delete;
const int m_Var;
public:
Foo(int arg) : m_Var(arg) {Rcpp::Rcout << "Constructor with value " << m_Var << "\n";}
void ptrAdd() const {Rcpp::Rcout << "Pointer: " << (void*) this << "\n";}
};
Foo* dummy_factory() {Rcpp::stop("Default constructor is disabled for this class"); return 0;}
RCPP_MODULE(mod) {
Rcpp::class_<Foo>("Foo_A")
.factory(dummy_factory) // Disable the default constructor
.constructor<int>("Intended 1-argument constructor")
.method("ptrAdd", &Foo::ptrAdd, "Show the pointer address")
;
Rcpp::class_<Foo>("Foo_B")
.constructor<int>("Intended 1-argument constructor")
.method("ptrAdd", &Foo::ptrAdd, "Show the pointer address")
;
}
/*** R
# OK as expected:
fa1 <- new(Foo_A, 1)
# Error as expected:
tryCatch(fa0 <- new(Foo_A), error = print)
# OK as expected:
fb1 <- new(Foo_B, 1)
# No error:
tryCatch(fb0 <- new(Foo_B), error = print)
# But this terminates R with the following (quite helpful!) message:
# terminating with uncaught exception of type Rcpp::not_initialized: C++ object not initialized. (Missing default constructor?)
tryCatch(fb0$ptrAdd(), error = print)
*/
As was suggested to me in a comment I have started a discussion at https://github.com/RcppCore/Rcpp/issues/970 relating to this.

Related

Function Signatures/Interfaces from Pybind11 Module (IDE Suggestions)

Let's assume we have a simple module called _sample built with pybind11:
/* py_bindings.cpp */
#include <pybind11/pybind11.h>
namespace py = pybind11;
PYBIND11_MODULE(_sample, m) {
m.def("add", [](int a, int b) { return a + b; });
m.def("add", [](const std::string& lhs, const std::string& rhs) { return lhs + rhs; });
}
This produces a dynamic module file _sample.pyd (Windows) or _sample.so (Linux), which we can then import in the actual module sample:
## sample\__init__.py ##
from ._sample import *
So that we can write:
## script.py ##
import sample as s
print(s.add(4, 2)) # 6
print(s.add('AB', 'C')) # ABC
The above code works fine, but the IDE does not know which functions are included in _sample until the code is actually run. And as a result, there are no function suggestions at all (and no function signature suggestions either).
As I would like to help the users of my library, my question is: how do I include function suggestions (or "function hints") in my module?
I've tried including the below code in sample\__init__.py as I thought the ... might work as a "hint". But unfortunately, this overrides the original add function from _sample.
def add(arg0: int, arg1: int) -> int:
    ...
Are there ways to hint the function signatures to a Python IDE?
Of course, I want to extend this to classes, class functions & module attributes too. I just picked functions as a starting point.
I think what you're looking for is a stub or interface (pyi) file. The IDE can understand the signature of functions and classes from this file.
If you're using pybind11, check out pybind11-stubgen for automatic generation of a stub file.

COleDateTime::SetDateTime and noexcept (code analysis)

See this code:
COleDateTime CSpecialEventDlg::GetSpecialEventDate() noexcept
{
COleDateTime datEvent;
if (datEvent.SetDateTime(m_datEvent.GetYear(),
m_datEvent.GetMonth(),
m_datEvent.GetDay(),
0, 0, 0) != 0)
{
// Error
}
return datEvent;
}
My code analysis said I could add noexcept (which I did). But I decided to investigate a little more. I noticed that SetDateTime returns a value:
Zero if the value of this COleDateTime object was set successfully; otherwise, 1. This return value is based on the DateTimeStatus enumerated type. For more information, see the SetStatus member function.
For that latter function (SetStatus) it states:
The status parameter value is defined by the DateTimeStatus enumerated type, which is defined within the COleDateTime class. See COleDateTime::GetStatus for details.
Now, the documentation for GetStatus is documented to have a definition of:
DateTimeStatus GetStatus() const throw();
So, it throws an exception if there is an error. Therefore, I decided to look at the MFC source for SetDateTime:
inline int COleDateTime::SetDateTime(
_In_ int nYear,
_In_ int nMonth,
_In_ int nDay,
_In_ int nHour,
_In_ int nMin,
_In_ int nSec) throw()
{
SYSTEMTIME st;
::ZeroMemory(&st, sizeof(SYSTEMTIME));
st.wYear = WORD(nYear);
st.wMonth = WORD(nMonth);
st.wDay = WORD(nDay);
st.wHour = WORD(nHour);
st.wMinute = WORD(nMin);
st.wSecond = WORD(nSec);
m_status = ConvertSystemTimeToVariantTime(st) ? valid : invalid;
return m_status;
}
It uses ConvertSystemTimeToVariantTime to set the status. That uses:
inline BOOL COleDateTime::ConvertSystemTimeToVariantTime(_In_ const SYSTEMTIME& systimeSrc)
{
return AtlConvertSystemTimeToVariantTime(systimeSrc,&m_dt);
}
And that uses:
inline BOOL AtlConvertSystemTimeToVariantTime(
_In_ const SYSTEMTIME& systimeSrc,
_Out_ double* pVarDtTm)
{
ATLENSURE(pVarDtTm!=NULL);
//Convert using ::SystemTimeToVariantTime and store the result in pVarDtTm then
//convert variant time back to system time and compare to original system time.
BOOL ok = ::SystemTimeToVariantTime(const_cast<SYSTEMTIME*>(&systimeSrc), pVarDtTm);
SYSTEMTIME sysTime;
::ZeroMemory(&sysTime, sizeof(SYSTEMTIME));
ok = ok && ::VariantTimeToSystemTime(*pVarDtTm, &sysTime);
ok = ok && (systimeSrc.wYear == sysTime.wYear &&
systimeSrc.wMonth == sysTime.wMonth &&
systimeSrc.wDay == sysTime.wDay &&
systimeSrc.wHour == sysTime.wHour &&
systimeSrc.wMinute == sysTime.wMinute &&
systimeSrc.wSecond == sysTime.wSecond);
return ok;
}
At this point I have got lost. In short, I am assuming that SetDateTime does not throw an exception.
I understand this much, if I decide to make a call to GetStatus inside the if clause then we do have a potential for exception to be thrown.
All three MFC functions for which you have shown the source code (COleDateTime::SetDateTime, COleDateTime::ConvertSystemTimeToVariantTime and AtlConvertSystemTimeToVariantTime) have – according to their declarations – the potential to throw exceptions (because they have no specification to the contrary, such as noexcept).
However, that doesn't mean that they will (or are even likely to). Digging a bit further into the MFC source code, the only place I can see, in those three functions, where an exception could be thrown is in the ATLENSURE(pVarDtTm!=NULL); line (in the third function).
The ATLENSURE macro is defined as follows:
#define ATLENSURE(expr) ATLENSURE_THROW(expr, E_FAIL)
And ATLENSURE_THROW is, in turn, defined thus:
#define ATLENSURE_THROW(expr, hr) \
do { \
int __atl_condVal=!!(expr); \
ATLASSUME(__atl_condVal); \
if(!(__atl_condVal)) AtlThrow(hr); \
} __pragma(warning(suppress:4127)) while (0)
So, it would seem that, in your code, an exception will be thrown if expr (in the above snippets) is null (the double-bang, !! pseudo-operator makes any non-zero value into 1 and leaves a zero as zero). That expr is the result of the pVarDtTm!=NULL expression, which can only be 0 (false) if the &m_dt argument in the call in your second MFC source excerpt is itself NULL – and, as the address of a member of the class object through which it is being called, that seems improbable (if not impossible).
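To make the macro's behaviour concrete, here is a minimal portable sketch of the same pattern. It is not the ATL code: ENSURE_THROW, storeValue, throwsFor and Holder are hypothetical names, std::runtime_error stands in for AtlThrow(hr), and the ATLASSUME hint is left out.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for ATLENSURE_THROW: std::runtime_error replaces
// AtlThrow(hr), and the ATLASSUME compiler hint is omitted.
#define ENSURE_THROW(expr, msg)                                        \
    do {                                                               \
        int condVal = !!(expr); /* any non-zero value becomes 1 */     \
        if (!condVal) throw std::runtime_error(msg);                   \
    } while (0)

// Mirrors the shape of AtlConvertSystemTimeToVariantTime: the only throwing
// path is the null check on the output parameter.
bool storeValue(double* out) {
    ENSURE_THROW(out != nullptr, "null output pointer");
    *out = 42.0;
    return true;
}

// Returns true if storeValue threw for the given pointer.
bool throwsFor(double* out) {
    try {
        storeValue(out);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}

struct Holder {
    double m_dt = 0.0;
    // &m_dt is the address of a member of a live object, so it is never null.
    bool set() { return storeValue(&m_dt); }
};

// Usage: the member-address call succeeds; only a null pointer throws.
void demo() {
    Holder h;
    assert(h.set());
    assert(throwsFor(nullptr));
}
```

The Holder::set call corresponds to passing &m_dt from inside the class, which is why the throwing branch is not reachable from that call path.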
Another issue you have is that you appear to misunderstand what the throw() specification in the DateTimeStatus GetStatus() const throw(); declaration means. As described here, it is actually (since C++17) an alias for noexcept (or noexcept(true), to be more precise). To specify that a function can throw any type of exception, the throw(...) or noexcept(false) specifications should be used (or use no except/throw specification at all).
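If you want to check what a declaration promises, the noexcept operator can tell you at compile time. The sketch below uses a hypothetical Status struct, not COleDateTime. The older throw() spelling is left out because it was removed in C++20.

```cpp
#include <cassert>

struct Status {
    int getStatus() const noexcept { return 0; }         // promises not to throw
    int mayThrow() const noexcept(false) { return 0; }   // explicitly allowed to throw
    int unspecified() const { return 0; }                // no specification: may throw
};

// The noexcept operator reports what each declaration promises. It looks only
// at the declaration, not at whether the body could actually throw.
static_assert(noexcept(Status().getStatus()), "noexcept means non-throwing");
static_assert(!noexcept(Status().mayThrow()), "noexcept(false) is potentially throwing");
static_assert(!noexcept(Status().unspecified()), "no specification is potentially throwing");
```

Note that unspecified() is reported as potentially throwing even though its body clearly cannot throw.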
So, your last sentence is not really true:
I understand this much, if I decide to make a call to GetStatus inside the if clause then we do have a potential for exception to be thrown.
Because the GetStatus() function is specified explicitly as non-throwing, and you already have a call to the SetDateTime member function, which (as described above) can throw an exception (but won't, in your code).

Problems using malloc in D language: why writeln call the destructor twice in this example

I am trying to write a D wrapper for a C library (libmpdec) that stores its data using the standard C malloc function. But there are some nasty bugs in my programs that I don't know how to solve.
So I have written the following test example, trying to understand this. The idea is to create structure holding a pointer to a memory area allocated using malloc in the constructor and that contains a zero-terminated C string, and free the area using the destructor. Also I can print the string using printf. The problem arises when I try to implement a method toString() so that I can use the standard D function writeln. For some reason that I don't understand the destructor seems to be called twice! (one after writeln) and so a segmentation fault occurs.
import std.stdio;
import core.stdc.stdlib;
import std.string;
import core.stdc.string;
struct Prueba {
char* pointer;
string name;
this(string given_name)
{
writeln("calling the constructor");
pointer= cast (char*) malloc(char.sizeof*10);
name=given_name;
char* p= pointer;
*p= 'a';
p++;
*p= 'b';
p++;
*p= '\n';
p++;
*p= '\0';
}
~this()
{
writeln("\n calling the destructor");
free(pointer);
}
void print()
{
printf("Using printf %s \n",pointer);
}
string toString()
{
ulong len=strlen(pointer);
return cast(string) pointer[0..len];
}
}
void main()
{
writeln("version 1");
Prueba p=Prueba("a");
writeln("using writeln ",p);
p.print();
}
But if I store the result in a string variable like
string s=p.toString();
writeln("using writeln ",s);
The program just works! I cannot figure out why!
You can see both versions of my test program at
https://github.com/pdenapo/example_programs_in_D/tree/master/using_malloc
Many thanks for any help!
Update: It seems that writeln plays no role here. And I can get the same result with something like
void probando(Prueba q)
{
q.print();
}
probando(p);
The problem seems to be that a copy of p is created when calling a function.
In cases like this, it's often a good idea to see if it's the same instance being destroyed. Adding &this to the writeln calls, I get this output:
version 1
calling the constructor at 6FBB70F960
Instance on stack: 6FBB70F960
using writeln ab
calling the destructor at 6FBB70F820
calling the destructor at 6FBB70F7F0
As we can see, the pointers are different, so there's two instances.
D structs are value types, and so are copied and moved. When you call a function taking a class parameter, a pointer is what's actually being passed, and it basically says 'the class instance you're looking for is over there'. With structs a copy is created, and suddenly you have two independent objects living their separate lives.
Of course, that's not what you want: Prueba isn't actually a copyable type, since having two copies will result in two calls to the destructor, and thus a double free. To mark it as non-copyable, simply add @disable this(this); to disable the postblit, and the compiler will helpfully report an error wherever a copy would be created.
This will cause a compiler error on the writeln line, and you will have to manually call toString, e.g.: writeln("using writeln ", p.toString());
Note that a non-copyable struct may be passed to functions as ref, since that doesn't create a new copy. We can't really modify writeln to do that, but it's worth knowing for your own functions.
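For readers more at home in C++, the same idea can be sketched there. This is only an analogy, not D code: OwnedCString and lengthOf are hypothetical names, deleting the copy operations plays the role of @disable this(this), and a reference parameter plays the role of ref.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Hypothetical C++ counterpart of Prueba: owns a malloc'd C string.
class OwnedCString {
public:
    explicit OwnedCString(const char* text)
        : pointer_(static_cast<char*>(std::malloc(std::strlen(text) + 1))) {
        if (pointer_) std::strcpy(pointer_, text);
    }
    ~OwnedCString() { std::free(pointer_); }

    // Counterpart of D's `@disable this(this);`: copying is a compile error.
    OwnedCString(const OwnedCString&) = delete;
    OwnedCString& operator=(const OwnedCString&) = delete;

    std::size_t length() const { return pointer_ ? std::strlen(pointer_) : 0; }

private:
    char* pointer_;
};

// Taking the parameter by reference (like D's `ref`) creates no copy,
// so the destructor runs exactly once, for the original object.
std::size_t lengthOf(const OwnedCString& s) { return s.length(); }

static_assert(!std::is_copy_constructible<OwnedCString>::value,
              "copies would lead to a double free");
```

Passing an OwnedCString by value to a function would fail to compile here, just as the writeln line does in D once the postblit is disabled.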

C++ link error, symbol redefinition

I came across a problem recently.
I have three files, A.h, B.cpp, C.cpp:
A.h
#ifndef __A_H__
#define __A_H__
int M()
{
return 1;
}
#endif // __A_H__
B.cpp
#include "A.h"
C.cpp
#include "A.h"
When I compile these files with MSVC, I get an error:
C.obj : error LNK2005: "int __cdecl M(void)" (?M@@YAHXZ) already defined in B.obj
This is easy to understand: B.obj has a symbol named "M", and so does C.obj.
Hence the error.
However, if I move M into a class as a member function, like this:
A.h
#ifndef __A_H__
#define __A_H__
class CA
{
public:
int M()
{
return 1;
}
};
#endif // __A_H__
there are no more errors! Could somebody tell me what is happening?
If B.cpp and C.cpp include A.h, then both are compiled with your definition of M, so both object files will contain code for M. When the linker gathers all the functions, it sees that M is defined in two object files and does not know which one to use. Thus the linker raises an LNK2005.
If you put your function M into a class declaration, then the compiler treats M as an inline function. This information is written into the object file. The linker sees that both object files contain a definition for an inline version of CA::M, so it assumes that both are equal and picks one of the two definitions arbitrarily.
If you had written
class CA {
public:
int M();
};
int CA::M()
{
return 1;
}
this would have caused the same problems (LNK2005) as your initial version, because then CA::M would not have been inline any more.
So as you might guess by now, there are two solutions for you. If you want M to be inlined, then change your code to
__inline int M()
{
return 1;
}
If you don't care about inlining, then please do it the standard way and put the function declaration into the header file:
extern int M();
And put the function definition into a cpp file (for A.h this would ideally be A.cpp):
int M()
{
return 1;
}
Please note that the extern is not really necessary in the header file.
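As a rough sketch of the inline route, here is how A.h might look with the standard inline keyword in place of the MSVC-specific __inline. The guard name A_H and the useBoth and check helpers are my own choices, not from the question. Names containing a double underscore, such as __A_H__, are reserved for the implementation in C++, which is why the guard is renamed.

```cpp
#include <cassert>  // only needed for the usage check at the bottom

// ---- sketch of A.h ----
#ifndef A_H
#define A_H

// `inline` allows this definition to appear in every translation unit
// that includes the header; the linker merges the copies.
inline int M() { return 1; }

class CA {
public:
    // Defined inside the class body, so implicitly inline.
    int M() { return 1; }
};

#endif // A_H
// ---- end of A.h ----

// What B.cpp or C.cpp might do after including the header (hypothetical).
int useBoth() {
    CA a;
    return M() + a.M();
}

void check() { assert(useBoth() == 2); }
```

With this header, both B.cpp and C.cpp can include A.h without triggering LNK2005, provided every translation unit sees the same definition.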
Another user suggested that you write
static int M()
{
return 1;
}
I'd not recommend this. This would mean that the compiler puts M into both of your object files and marks M as being a function that is only visible in each object file itself. If the linker sees that a function in B.cpp calls M, it finds M in B.obj and in C.obj. Both have M marked as static, so the linker ignores M in C.obj and picks the M from B.obj. Vice versa if a function in C.cpp calls M, the linker picks the M from C.obj. You will end up with multiple definitions of M, all with the same implementation. This is a waste of space.
See http://faculty.cs.niu.edu/~mcmahon/CS241/c241man/node90.html for how to write include guards. You have to start with #ifndef before the #define.
Edit: Ah no, while your guard is wrong that's not the issue. Put static in front of your function to make it work. Classes are different because they define types.
I don't know what's under the hood, but if you don't need a class I guess that the compiler will automatically add the "extern" key to your functions, so you'll get the error including the header 2 times.
You can add the static keyword to M() method so you'll have only one copy of that function in memory and no errors at compile time.
By the way: I see you have a #endif, but not a #ifdef or #ifndef, is it a copy/paste error?

MFC multithreading with delete[] , dbgheap.c

I've got a strange problem and really don't understand what's going on.
I made my application multi-threaded using the MFC multithreading classes.
Everything works well so far, but now:
Somewhere in the beginning of the code I create the threads:
m_bucketCreator = new BucketCreator(128,128,32);
CEvent* updateEvent = new CEvent(FALSE, FALSE);
CWinThread** threads = new CWinThread*[numThreads];
for(int i=0; i<8; i++){
threads[i]=AfxBeginThread(&MyClass::threadfunction, updateEvent);
m_activeRenderThreads++;
}
this creates 8 threads working on this function:
UINT MyClass::threadfunction( LPVOID params ) //executed in new Thread
{
Bucket* bucket=m_bucketCreator.getNextBucket();
...do something with bucket...
delete bucket;
}
m_bucketCreator is a static member. Now I get an error in the destructor of Bucket when it tries to delete a buffer (however, the way I understand it, this buffer should belong to this thread, so I don't see why there is an error). On the delete[] buffer call, the error happens in _CrtIsValidHeapPointer() in dbgheap.c.
Visual studio outputs the message that it trapped a halting point and this can be either due to heap corruption or because the user pressed f12 (I didn't ;) )
class BucketCreator {
public:
BucketCreator();
~BucketCreator(void);
void init(int resX, int resY, int bucketSize);
Bucket* getNextBucket(){
Bucket* bucket=NULL;
//enter critical section
CSingleLock singleLock(&m_criticalSection);
singleLock.Lock();
int height = min(m_resolutionY-m_nextY,m_bucketSize);
int width = min(m_resolutionX-m_nextX,m_bucketSize);
bucket = new Bucket(width, height);
//leave critical section
singleLock.Unlock();
return bucket;
}
private:
int m_resolutionX;
int m_resolutionY;
int m_bucketSize;
int m_nextX;
int m_nextY;
//multithreading:
CCriticalSection m_criticalSection;
};
and class Bucket:
class Bucket : public CObject{
DECLARE_DYNAMIC(RenderBucket)
public:
Bucket(int a_resX, int a_resY){
resX = a_resX;
resY = a_resY;
buffer = new float[3 * resX * resY];
int buffersize = 3*resX * resY;
for (int i=0; i<buffersize; i++){
buffer[i] = 0;
}
}
~Bucket(void){
delete[] buffer;
buffer=NULL;
}
int getResX(){return resX;}
int getResY(){return resY;}
float* getBuffer(){return buffer;}
private:
int resX;
int resY;
float* buffer;
Bucket& operator = (const Bucket& other) { /*..*/}
Bucket(const Bucket& other) {/*..*/}
};
Can anyone tell me what could be the problem here?
edit: this is the other static function I'm calling from the threads. Is this safe to do?
static std::vector<Vector3> generate_poisson(double width, double height, double min_dist, int k, std::vector<std::vector<Vector3> > existingPoints)
{
CSingleLock singleLock(&m_criticalSection);
singleLock.Lock();
std::vector<Vector3> samplePoints = std::vector<Vector3>();
...fill the vector...
singleLock.Unlock();
return samplePoints;
}
All the previous replies are sound. For the copy constructor, make sure that it doesn't just copy the buffer pointer, otherwise that will cause the problem. It needs to allocate a new buffer rather than copy the pointer value, which would cause an error in 'delete'. But I don't get the impression that the copy constructor will get called in your code.
I've looked at the code and I am not seeing any error in it as is. Note that the thread synchronization isn't even necessary in this getNextBucket code, since it's returning a local variable and those are per-thread.
Errors in ValidateHeapPointer occur because something has corrupted the heap, which happens when a pointer writes past a block of memory. Often it's a for() loop that goes too far, a buffer that wasn't allocated large enough, etc.
The error is reported during a call to 'delete' because that's when the heap is validated for bugs in debug mode. However, the error has occurred before that time, it just happens that the heap is checked only in 'new' and 'delete'. Also, it isn't necessarily related to the 'Bucket' class.
What you need to do to find this bug, short of using tools like BoundsChecker or HeapValidator, is comment out sections of your code until the error goes away; that will point you to the offending code.
There is another method to narrow down the problem. In debug mode, include <crtdbg.h> in your code, and sprinkle calls to _CrtCheckMemory() at various points of interest. That will generate the error when the heap is corrupted. Simply move the calls around in your code to narrow down the point at which the corruption begins.
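A minimal sketch of how such checks might be sprinkled around suspect code follows. HEAP_CHECK and fillAndSum are hypothetical names; the macro wraps _CrtCheckMemory() from <crtdbg.h> on MSVC. The fallback for other compilers is my own assumption, so that the sketch builds anywhere.

```cpp
#include <cassert>
#include <vector>

#if defined(_MSC_VER)
#include <crtdbg.h>   // declares _CrtCheckMemory and _ASSERTE
// In debug builds _CrtCheckMemory() validates the heap and returns FALSE on
// corruption; in release builds the check is compiled out.
#define HEAP_CHECK() _ASSERTE(_CrtCheckMemory())
#else
// Assumption: on other toolchains this sketch simply does nothing.
#define HEAP_CHECK() ((void)0)
#endif

// Hypothetical workload standing in for the code under suspicion.
int fillAndSum(int n) {
    HEAP_CHECK();                    // is the heap still intact before the work?
    std::vector<int> values(n, 1);
    int sum = 0;
    for (int v : values) sum += v;
    HEAP_CHECK();                    // ...and after it?
    return sum;
}
```

If the first check passes and the second one fires, the corruption happened somewhere between the two calls, and you can move them closer together from there.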
I don't know which version of Visual C++ you are using. If you're using an earlier one like VC++ 6.0, make sure that you are using the Multithreaded DLL version of the C Run Time Library in the compiler options.
You're constructing a RenderBucket. Are you sure you're calling the 'Bucket' class's constructor from there? It should look like this:
class RenderBucket : public Bucket {
public:
RenderBucket( int a_resX, int a_resY )
: Bucket( a_resX, a_resY )
{
}
};
Initializers in the Bucket class to set the buffer to NULL is a good idea... Also making the Default constructor and copy constructor private will help to make double sure those aren't being used. Remember.. the compiler will create these automatically if you don't:
Bucket(); <-- default constructor
Bucket( int a_resx = 0, int a_resy = 0 ) <-- Another way to make your default constructor
Bucket(const class Bucket &B) <-- copy constructor
You haven't made a private copy constructor, or any default constructor. If class Bucket is constructed via one of these implicitly-defined methods, buffer will either be uninitialized, or it will be a copied pointer made by a copy constructor.
The copy constructor for class Bucket is Bucket(const Bucket &B) -- if you do not explicitly declare a copy constructor, the compiler will generate a "naive" copy constructor for you.
In particular, if this object is assigned, returned, or otherwise copied, the copy constructor will copy the pointer to a new object. Eventually, both objects' destructors will attempt to delete[] the same pointer and the second attempt will be a double deletion, a type of heap corruption.
I recommend you make class Bucket's copy constructor private, which will cause attempted copy construction to generate a compile error. As an alternative, you could implement a copy constructor which allocates new space for the copied buffer.
Exactly the same applies to the assignment operator, operator=.
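For the alternative of implementing the copy operations, a minimal sketch might look like this. SafeBucket is a hypothetical simplified version of Bucket (the CObject base is left out), and copy-and-swap is just one common way to write the assignment operator. If copying is never needed, declaring both operations private (or = delete in C++11) remains the simpler option.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical variant of Bucket with copy semantics defined explicitly.
class SafeBucket {
public:
    SafeBucket(int resX, int resY)
        : resX_(resX), resY_(resY), buffer_(new float[size()]()) {}  // () zero-fills

    // Deep copy: allocate a separate buffer instead of sharing the pointer.
    SafeBucket(const SafeBucket& other)
        : resX_(other.resX_), resY_(other.resY_), buffer_(new float[size()]) {
        std::copy(other.buffer_, other.buffer_ + size(), buffer_);
    }

    // Copy-and-swap: `other` is already a copy; its destructor frees our old buffer.
    SafeBucket& operator=(SafeBucket other) {
        std::swap(resX_, other.resX_);
        std::swap(resY_, other.resY_);
        std::swap(buffer_, other.buffer_);
        return *this;
    }

    ~SafeBucket() { delete[] buffer_; }

    float* getBuffer() { return buffer_; }
    std::size_t size() const { return static_cast<std::size_t>(3) * resX_ * resY_; }

private:
    int resX_;
    int resY_;
    float* buffer_;
};
```

Each SafeBucket now owns its own buffer, so destroying a copy no longer frees memory that the original still points to.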
The need for a copy constructor is one of the 55 tips in Scott Meyers's excellent book, Effective C++: 55 Specific Ways to Improve Your Programs and Designs. This book should be required reading for all C++ programmers.
If you add:
class Bucket {
/* Existing code as-is ... */
private:
Bucket() { buffer = NULL; } // No default construction
Bucket(const Bucket &B) { ; } // No copy construction
Bucket& operator= (const Bucket &B) {;} // No assignment
};
and re-compile, you are likely to find your problem.
There is also another possibility: If your code contains other uses of new and delete, then it is possible these other uses of allocated memory are corrupting the linked-list structure which defines the heap memory. It is common to detect this corruption during a call to delete, because delete must utilize these data structures.
