GCHandle, AppDomains managed code and 3rd party dll - c#-4.0

I have looking at many threads about the exception "cannot pass a GCHandle across AppDomains" but I still don't get it....
I'm working with an RFID Reader which is driven by a DLL. I don't have source code for this DLL but only a sample to show how to use it.
The sample works great but I have to copy some code in another project to add the reader to the middleware Microsoft Biztalk.
The problem is that the process of Microsoft Biztalk works in another AppDomain. The reader handle events when a tag is read. But when I run it under Microsoft Biztalk I got this annoying exception.
I can't see any solution on how to make it work...
Here is some code that may be interesting :
// Let's connecting the result handlers.
// The reader calls a command-specific result handler if a command is done and the answer is ready to send.
// So let's tell the reader which functions should be called if a result is ready to send.
// result handler for reading EPCs synchronous
Reader.KSRWSetResultHandlerSyncGetEPCs(ResultHandlerSyncGetEPCs);
[...]
var readerErrorCode = Reader.KSRWSyncGetEPCs();
if (readerErrorCode == tKSRWReaderErrorCode.KSRW_REC_NoError)
{
// No error occurs while sending the command to the reader. Let's wait until the result handler was called.
if (ResultHandlerEvent.WaitOne(TimeSpan.FromSeconds(10)))
{
// The reader's work is done and the result handler was called. Let's check the result flag to make sure everything is ok.
if (_readerResultFlag == tKSRWResultFlag.KSRW_RF_NoError)
{
// The command was successfully processed by the reader.
// We'll display the result in the result handler.
}
else
{
// The command can't be proccessed by the reader. To know why check the result flag.
logger.error("Command \"KSRWSyncGetEPCs\" returns with error {0}", _readerResultFlag);
}
}
else
{
// We're getting no answer from the reader within 10 seconds.
logger.error("Command \"KSRWSyncGetEPCs\" timed out");
}
}
[...]
private static void ResultHandlerSyncGetEPCs(object sender, tKSRWResultFlag resultFlag, tKSRWExtendedResultFlag extendedResultFlag, tKSRWEPCListEntry[] epcList)
{
if (Reader == sender)
{
// Let's store the result flag in a global variable to get access from everywhere.
_readerResultFlag = resultFlag;
// Display all available epcs in the antenna field.
Console.ForegroundColor = ConsoleColor.White;
foreach (var resultListEntry in epcList)
{
handleTagEvent(resultListEntry);
}
// Let's set the event so that the calling process knows the command was processed by reader and the result is ready to get processed.
ResultHandlerEvent.Set();
}
}

You are having a problem with the gcroot<> helper class. It is used in the code that nobody can see, inside that DLL. It is frequently used by C++ code that was designed to interop with managed code, gcroot<> stores a reference to a managed object. The class uses the GCHandle type to add the reference. The GCHandle.ToIntPtr() method returns a pointer that the C++ code can store. The operation that fails is GCHandle.FromIntPtr(), used by the C++ code to recover the reference to the object.
There are two basic explanations for getting this exception:
It can be accurate. Which will happen when you initialized the code in the DLL from one AppDomain and use it in another. It isn't clear from the snippet where the Reader class object gets initialized so there are non-zero odds that this is the explanation. Be sure to keep it close to the code that uses the Reader class.
It can be caused by another bug, present in the C++ code inside the DLL. Unmanaged code often suffers from pointer bugs, the kind of bug that can accidentally overwrite memory. If that happens with the field that stores the gcroot<> object then nothing goes wrong for a while. Until the code tries to recover the object reference again. At that point the CLR notices that the corrupted pointer value no longer matches an actual object handle and generates this exception. This is certainly the hard kind of bug to solve since this happens in code you cannot fix and showing the programmer that worked on it a repro for the bug is very difficult, such memory corruption problems never repro well.
Chase bullet #1 first. There are decent odds that Biztalk runs your C# code in a separate AppDomain. And that the DLL gets loaded too soon, before or while the AppDomain is created. Something you can see with SysInternals' ProcMon. Create a repro of this by writing a little test program that creates an AppDomain and runs the test code. If that reproduces the crash then you'll have a very good way to demonstrate the issue to the RFID vendor and some hope that they'll use it and work on a fix.
Having a good working relationship with the RFID reader vendor to get to a resolution is going to be very important. That's never not a problem, always a good reason to go shopping elsewhere.

Related

Getting error "attempting to detach while still running code" when calling JavaVm->DetachCurrentThread [duplicate]

I have an Android app that uses NDK - a regular Android Java app with regular UI and C++ core. There are places in the core where I need to call Java methods, which means I need a JNIEnv* for that thread, which in turn means that I need to call JavaVM->AttachCurrentThread() to get the valid env.
Previously, was just doing AttachCurrentThread and didn't bother to detach at all. It worked fine in Dalvik, but ART aborts the application as soon as a thread that has called AttachCurrentThread exits without calling DetachCurrentThread. So I've read the JNI reference, and indeed it says that I must call DetachCurrentThread. But when I do that, ART aborts the app with the following message:
attempting to detach while still running code
What's the problem here, and how to call DetachCurrentThread properly?
Dalvik will also abort if the thread exits without detaching. This is implemented through a pthread key -- see threadExitCheck() in Thread.cpp.
A thread may not detach unless its call stack is empty. The reasoning behind this is to ensure that any resources like monitor locks (i.e. synchronized statements) are properly released as the stack unwinds.
The second and subsequent attach calls are, as defined by the spec, low-cost no-ops. There's no reference counting, so detach always detaches, no matter how many attaches have happened. One solution is to add your own reference-counted wrapper.
Another approach is to attach and detach every time. This is used by the app framework on certain callbacks. This wasn't so much a deliberate choice as a side-effect of wrapping Java sources around code developed primarily in C++, and trying to shoe-horn the functionality in. If you look at SurfaceTexture.cpp, particularly JNISurfaceTextureContext::onFrameAvailable(), you can see that when SurfaceTexture needs to invoke a Java-language callback function, it will attach the thread, invoke the callback, and then if the thread was just attached it will immediately detach it. The "needsDetach" flag is set by calling GetEnv to see if the thread was previously attached.
This isn't a great thing performance-wise, as each attach needs to allocate a Thread object and do some internal VM housekeeping, but it does yield the correct behavior.
I'll try a direct and practical approach (with sample code, without use of classes) answering this question for the occasional developer that came up with this error in android, in cases where they had it working and after a OS or framework update (Qt?) it started to give problems with that error and message.
JNIEXPORT void Java_com_package_class_function(JNIEnv* env.... {
JavaVM* jvm;
env->GetJavaVM(&jvm);
JNIEnv* myNewEnv; // as the code to run might be in a different thread (connections to signals for example) we will have a 'new one'
JavaVMAttachArgs jvmArgs;
jvmArgs.version = JNI_VERSION_1_6;
int attachedHere = 0; // know if detaching at the end is necessary
jint res = jvm->GetEnv((void**)&myNewEnv, JNI_VERSION_1_6); // checks if current env needs attaching or it is already attached
if (JNI_EDETACHED == res) {
// Supported but not attached yet, needs to call AttachCurrentThread
res = jvm->AttachCurrentThread(reinterpret_cast<JNIEnv **>(&myNewEnv), &jvmArgs);
if (JNI_OK == res) {
attachedHere = 1;
} else {
// Failed to attach, cancel
return;
}
} else if (JNI_OK == res) {
// Current thread already attached, do not attach 'again' (just to save the attachedHere flag)
// We make sure to keep attachedHere = 0
} else {
// JNI_EVERSION, specified version is not supported cancel this..
return;
}
// Execute code using myNewEnv
// ...
if (attachedHere) { // Key check
jvm->DetachCurrentThread(); // Done only when attachment was done here
}
}
Everything made sense after seeing the The Invocation API docs for GetEnv:
RETURNS:
If the current thread is not attached to the VM, sets *env to NULL, and returns JNI_EDETACHED. If the specified version is not supported, sets *env to NULL, and returns JNI_EVERSION. Otherwise, sets *env to the appropriate interface, and returns JNI_OK.
Credits to:
- This question Getting error "attempting to detach while still running code" when calling JavaVm->DetachCurrentThread that in its example made it clear that it was necessary to double check every time (even though before calling detach it doesn't do it).
- #Michael that in this question comments he notes it clearly about not calling detach.
- What #fadden said: "There's no reference counting, so detach always detaches, no matter how many attaches have happened."

Passing managed reference (this) to unmanaged code and call managed callback

this is the source of the issue. My answer there was deleted with the hint to start a new question. So, here we go:
I want to pass the managed reference of this to unmanaged code. And then call the managed callback out of the unmanaged callback.
public ref class CReader
with a private field
private:
[...]
void *m_pTag;
[...]
In the constructor of the managed class I initialize the m_pTag like this:
m_pTag = new gcroot<CReader ^>(this);
Later, I pass this void *m_pTag to the unmanaged code. If the unmanaged callback is called, I'm casting the void *m_pTag back to managed reference and call the managed callback
(*(gcroot<CReader ^>*)pTag)->MANAGEDCALLBACKFUNCTION
and there is an exception thrown, if the DLL is used under another AppDomain. The Debugger stops in gcroot.h on line
// don't return T& here because & to gc pointer not yet implemented
// (T should be a pointer anyway).
T operator->() const {
// gcroot is typesafe, so use static_cast
return static_cast<T>(__VOIDPTR_TO_GCHANDLE(_handle).Target);
}
with Cannot pass a GCHandle across AppDomains.
My question is, what should I do?
Sincerly,
Sebastian
==================== EDIT ====================
I am now able to reproduce the problem. I've took some screenshots, to show the issue.
1st screenshot: constructor
snd screenshot: callback
The problem is, that the value-member of the struct gcroot in the callback is empty.
Thank you.
Sebastian
==================== EDIT ====================
Push.
The gcroot<> C++ class is a wrapper that uses the GCHandle class. The constructor calls GCHandle.ToIntPtr() to turn the handle into an opaque pointer, one that you can safely store as a member of an unmanaged struct or C++ class.
The cast then, later, converts that raw pointer back to the handle with the GCHandle.FromIntPtr() method. The GCHandle.Target property gives you the managed object reference back.
GCHandle.FromIntPtr() can indeed fail, the generic exception message is "Cannot pass a GCHandle across AppDomains". This message only fingers the common reason that this fails, it assumes that GCHandle is used in safe code and simply used incorrectly.
The message does not cover the most common reason that this fails in unsafe code. Code that invariably dies with undiagnosable exceptions due to heap corruption. In other words, the gcroot<> member of the C++ class or struct getting overwritten with an arbitrary value. Which of course dooms GCHandle.FromIntPtr(), the CLR can no longer find the handle back from a junk pointer value.
You diagnose this bug the way you diagnose any heap corruption problem. You first make sure that you get a good repro so you can trip the exception reliably. And you set a data breakpoint on the gcroot member. The debugger automatically breaks when the member is written inappropriately, the call stack gives you good idea why this happened.

Dispatcher xps memory leak

I'm call a .net 4.0 dll from a vb6 app using com interop. In .net I create an xps document, via a xaml fixed document and save it to disk. This causes and memory leak and I've found a great solution here.
Saving a FixedDocument to an XPS file causes memory leak
The solution above, that worked for me, involves this line of code:
Dispatcher.CurrentDispatcher.Invoke(DispatcherPriority.SystemIdle, new DispatcherOperationCallback(delegate { return null; }), null);
What exactly is happening with this line of code. Is that by setting the delegate to null this disposes the Dispatcher object?
While it initially appears that the supplied code does nothing, it actually has a non-obvious side effect that is resolving your problem. Let's break it down into steps:
Dispatcher.CurrentDispatcher Gets the dispatcher for the current thread.
Invoke synchronously executes the supplied delegate on the dispatcher's thread (the current one)
DispatcherPriority.SystemIdle sets the execution priority
new DispatcherOperationCallback(delegate { return null; }) creates a delegate that does nothing
null is passed as an argument to the delegate
All together, this looks like it does nothing, and indeed it actually does do "nothing". The important part is that it waits until the dispatcher for the current thread has cleared any scheduled tasks that are higher priority than SystemIdle before doing the "nothing". This allows the scheduled cleanup work to happen before you return to your vb6 app.

How can I implement callback functions in a QObject-derived class which are called from non-Qt multi-threaded libraries?

(Pseudo-)Code
Here is a non-compilable code-sketch of the concepts I am having trouble with:
struct Data {};
struct A {};
struct B {};
struct C {};
/* and many many more...*/
template<typename T>
class Listener {
public:
Listener(MyObject* worker):worker(worker)
{ /* do some magic to register with RTI DDS */ };
public:
// This function is used ass a callback from RTI DDS, i.e. it will be
// called from other threads when new Data is available
void callBackFunction(Data d)
{
T t = extractFromData(d);
// Option 1: direct function call
// works somewhat, but shows "QObject::startTimer: timers cannot be started
// from another thread" at the console...
worker->doSomeWorkWithData(t); //
// Option 2: Use invokeMethod:
// seems to fail, as the macro expands including '"T"' and that type isn't
// registered with the QMetaType system...
// QMetaObject::invokeMethod(worker,"doSomeGraphicsWork",Qt::AutoConnection,
// Q_ARG(T, t)
// );
// Option 3: use signals slots
// fails as I can't make Listener, a template class, a QObject...
// emit workNeedsToBeDone(t);
}
private:
MyObject* worker;
T extractFromData(Data d){ return T(d);};
};
class MyObject : public QObject {
Q_OBJECT
public Q_SLOTS:
void doSomeWorkWithData(A a); // This one affects some QGraphicsItems.
void doSomeWorkWithData(B b){};
void doSomeWorkWithData(C c){};
public:
MyObject():QObject(nullptr){};
void init()
{
// listeners are not created in the constructor, but they should have the
// same thread affinity as the MyObject instance that creates them...
// (which in this example--and in my actual code--would be the main GUI
// thread...)
new Listener<A>(this);
new Listener<B>(this);
new Listener<C>(this);
};
};
main()
{
QApplication app;
/* plenty of stuff to set up RTI DDS and other things... */
auto myObject = new MyObject();
/* stuff resulting in the need to separate "construction" and "initialization" */
myObject.init();
return app.exec();
};
Some more details from the actual code:
The Listener in the example is a RTI DataReaderListener, the callback
function is onDataAvailable()
What I would like to accomplish
I am trying to write a little distributed program that uses RTI's Connext DDS for communication and Qt5 for the GUI stuff--however, I don't believe those details do matter much as the problem, as far as I understood it, boils down to the following:
I have a QObject-derived object myObject whose thread affinity might or might not be with the main GUI thread (but for simplicity, let's assume that is the case.)
I want that object to react to event's which happen in another, non-Qt 3rd-party library (in my example code above represented by the functions doSomeWorkWithData().
What I understand so far as to why this is problematic
Disclaimer: As usual, there is always more than one new thing one learns when starting a new project. For me, the new things here are/were RTI's Connext and (apparently) my first time where I myself have to deal with threads.
From reading about threading in Qt (1,2,3,4, and 5 ) it seems to me that
QObjects in general are not thread safe, i.e. I have to be a little careful about things
Using the right way of "communicating" with QObjects should allow me to avoid having to deal with mutexes etc myself, i.e. somebody else (Qt?) can take care of serializing access for me.
As a result from that, I can't simply have (random) calls to MyClass::doSomeWorkWithData() but I need to serialize that. One, presumably easy, way to do so is to post an event to the event queue myObject lives in which--when time is available--will trigger the execution of the desired method, MyClass::doSomeWorkWithData() in my case.
What I have tried to make things work
I have confirmed that myObject, when instantiated similarly as in the sample code above, is affiliated with the main GUI thread, i.e. myObject.thread() == QApplication::instance()->thread().
With that given, I have tried three options so far:
Option 1: Directly calling the function
This approach is based upon the fact that
- myObject lives in the GUI thread
- All the created listeners are also affiliated with the GUI thread as they are
created by `myObject' and inherit its thread that way
This actually results in the fact that doSomeWorkWithData() is executed. However,
some of those functions manipulate QGraphicsItems and whenever that is the case I get
error messages reading: "QObject::startTimer: timers cannot be started from another
thread".
Option 2: Posting an event via QMetaObject::invokeMethod()
Trying to circumvent this problem by properly posting an event for myObject, I
tried to mark MyObject::doSomeWorkWithData() with Q_INVOKABLE, but I failed at invoking the
method as I need to pass arguments with Q_ARG. I properly registered and declared my custom types
represented by struct A, etc. in the example), but I failed at the fact the
Q_ARG expanded to include a literal of the type of the argument, which in the
templated case didn't work ("T" isn't a registered or declared type).
Trying to use conventional signals and slots
This approach essentially directly failed at the fact that the QMeta system doesn't
work with templates, i.e. it seems to me that there simply can't be any templated QObjects.
What I would like help with
After spending about a week on attempting to fix this, reading up on threads (and uncovering some other issues in my code), I would really like to get this done right.
As such, I would really appreciate if :
somebody could show me a generic way of how a QObject's member function can be called via a callback function from another 3rd-party library (or anything else for that matter) from a different, non QThread-controlled, thread.
somebody could explain to me why Option 1 works if I simply don't create a GUI, i.e. do all the same work, just without a QGraphcisScene visualizing it (and the project's app being a QCoreApplication instead of a QApplication and all the graphics related work #defineed out).
Any, and I mean absolutely any, straw I could grasp on is truly appreciated.
Update
Based on the accepted answer I altered my code to deal with callbacks from other threads: I introduced a thread check at the beginning of my void doSomeWorkWithData() functions:
void doSomeWorkWithData(A a)
{
if( QThread::currentThread() != this->thread() )
{
QMetaObject::invokeMethod( this,"doSomeWorkWithData"
,Qt::QueuedConnection
,Q_ARG(A, a) );
return;
}
/* The actual work this function does would be below here... */
};
Some related thoughts:
I was contemplating to introduce a QMutexLocker before the if statement, but decided against it: the only part of the function that is potentially used in parallel (anything above the return; in the if statement) is--as far as I understand--thread safe.
Setting the connection type manually to Qt::QueuedConnection: technically, if I understand the documentation correctly, Qt should do the right thing and the default, Qt::AutoConnection, should end up becoming a Qt::QueuedConnection. But since would always be the case when that statement is reached, I decided to put explicitly in there to remind myself about why this is there.
putting the queuing code directly in the function and not hiding it in an interim function: I could have opted to put the call to invokeMethod in another interim function, say queueDoSomeWorkWithData()', which would be called by the callback in the listener and then usesinvokeMethodwith anQt::AutoConnection' on doSomeWorkWithData(). I decided against this as there seems no way for me to auto-code this interim function via templates (templates and the Meta system was part of the original problem), so "the user" of my code (i.e. the person who implements doSomeWorkWithData(XYZ xyz)) would have to hand type the interim function as well (as that is how the templated type names are correctly resolved). Including the check in the actual function seems to me to safe typing an extra function header, keeps the MyClass interface a little cleaner, and better reminds readers of doSomeWorkWithData() that there might be a threading issue lurking in the dark.
It is ok to call a public function on a subclass of QObject from another thread if you know for certain that the individual function will perform only thread-safe actions.
One nice thing about Qt is that it will handle foreign threads just as well as it handles QThreads. So, one option is to create a threadSafeDoSomeWorkWithData function for each doSomeWorkWithData that does nothing but QMetaMethod::invoke the non-threadsafe one.
public:
void threadSafeDoSomeWorkWithData(A a) {
QMetaMethod::invoke("doSomeWorkWithData", Q_ARG(A,a));
}
Q_INVOKABLE void doSomeWorkWithData(A a);
Alternatively, Sergey Tachenov suggests an interesting way of doing more or less the same thing in his answer here. He combines the two functions I suggested into one.
void Obj2::ping() {
if (QThread::currentThread() != this->thread()) {
// not sure how efficient it is
QMetaObject::invoke(this, "ping", Qt::QueuedConnection);
return;
}
// thread unsafe code goes here
}
As to why you see normal behaviour when not creating a GUI? Perhaps you're not doing anything else that is unsafe, aside from manipulating GUI objects. Or, perhaps they're the only place in which your thread-safety problems are obvious.

How to make an assert window in QT

I currently have a very long running GUI-Application in QT. Later that application is going to be tested and run on an embedded device without a keyboard in full screen.
For easier debugging I have a custom assert macro, which allows me to ignore certain asserts (may include known buggy parts I have to work around for now) etc. For the time being I just print something on the console such as "Assertion: XXXX failed; abort/ignore". This is fine when I start the application within a console, but ultimately fails when I run it on the final device. In that case the assert will just block the main thread waiting for input and make the GUI hang badly without hope for recovery.
Now I am thinking about how to remedy this situation. One Idea is to just have the assert crash, as the standard assert does. But I do not really like that Idea, since there are a lot of know problems, and I've always found ignorable asserts very helpful when testing applications. Also I would have to put the messages into a separate file, so I can later see what happened while testing. Reading these files afterwards is possible, but I would prefer a simpler way to find out what went wrong.
The other idea was to make a window instead. However the asserts may be triggered in any thread and I can only create new windows in the GUI thread. Also the main event loop may be blocked by the assert, so I cannot be sure that it will handle the events correctly. I would somehow need a fully responsive stand-alone window in a separate thread, which only handles a few buttons.
Is this somehow possible in QT4?
You may post events to main thread to display dialog and wait for an answer from non-gui threads, or just display dialog if current thread is app thread
int result = -1;
if ( QTrhead::currentThread() == QCoreApplication::instance()->thread() )
{
result = AssertHandler->ShowDialog(where, what, condition);
}
else
{
QMetaObject::invokeMethod(AssertHandler, "ShowDialog", Qt::QueuedBlockingConnection, Q_RETURN_ARG(int, result), Q_ARG(QString, where), Q_ARG(QString, what), Q_ARG(QString, condition);
}
if (result != 0)
{
// handle assert
}
The AssertHandler is QObject based class with slot int ShowDialog(const QString &where, const QString what, const QString &condition). It should display dialog with assert data and buttons assert/ignore. Returns 0 when user pressed ignore, otherwise returns non zero value.

Resources