Pthread mutex assertion error - linux

I'm encountering the following error at unpredictable times in a linux-based (arm) communications application:
pthread_mutex_lock.c:82: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Google turns up a lot of references to that error, but little information that seems relevant to my situation. I was wondering if anyone can give me some ideas about how to troubleshoot this error. Does anyone know of a common cause for this assertion?
Thanks in advance.

Rock solid for 4 days straight. I'm declaring victory on this one. The answer is "stupid user error" (see comments above). A mutex should only be unlocked by the thread that locked it. Thanks for bearing with me.

TLDR: Make sure you are not locking a mutex that has been destroyed / hasn't been initialized.
Although the OP has his answer, I thought I would share my issue in case anyone else has the same problem I did.
Notice that the assertion is in __pthread_mutex_lock and not in the unlock. This, to me, suggests that most other people having this issue are not unlocking a mutex in a different thread than the one that locked it; they are just locking a mutex that has been destroyed.
For me, I had a class (Let's call it Foo) that registered a static callback function with some other class (Let's call it Bar). The callback was being passed a reference to Foo and would occasionally lock/unlock a mutex that was a member of Foo.
This problem occurred after the Foo instance was destroyed while the Bar instance was still using the callback. The callback was being passed a reference to an object that no longer existed and, therefore, was calling __pthread_mutex_lock on garbage memory.
Note, I was using C++11's std::mutex and std::lock_guard<std::mutex>, but, since I was on Linux, the problem was exactly the same.

I was faced with the same problem and google sent me here. The problem with my program was that in some situations I was not initializing the mutex before locking it.
Although the statement in the accepted answer is legitimate, I think it is not the cause of this failed assertion. Because the error is reported on pthread_mutex_lock (and not unlock).
Also, as always, it is more likely that the error is in the programmers source code rather than the compiler.

The quick bit of Googling I've done often blames this on a compiler mis-optimization. A decent summation is here. It might be worth looking at the assembly output to see if gcc is producing the right code.
Either that or you are managing to stomp on the memory used by the pthread library... those sort of problems are rather tricky to find.

I was having same problem
in my case inside the thread i was connecting vertica db with odbc
adding following setting to /etc/odbcinst.ini solved my problem. dont geting the exception so far.
[ODBC]
Threading = 1
credits to : hynek

I have just fought my way through this one and thought it might help others.
In my case the issue occured in a very simple method that locked the mutex, checked a shared variable and then returned.
The method is an override of the base class which creates a worker thread.
The problem in this instance was that the base class was creating the thread in the constructor. The thread then started executing and the derived classes implementation of the method was called. Unfortunately the derived class had not yet completed constructing and the mutex in the derived class had uninitialised data as the mutex owner. This made it look like it was actually locked when it wasn't.
The solution is really simple. Add a protected method to the base class called StartThread(). This needs to be called in the derived classes constructor, not from the base class.

In case you are using C++ and std::unique_lock, check this answer: https://stackoverflow.com/a/9240466/9057530

adding Threading=0 in /etc/odbcinst.ini file fixed this issue

Related

OpenFileDialoug Current Thread Must Be STA before OLE calls made

Can someone explain to me what this error I'm seeing is?
Current thread must be set to single thread apartment (STA) mode before OLE calls can be made.
Specifically, I'm trying to open the SaveFileDialog/OpenFileDialog within C++/CLI on a form.
SaveFileDialog^ saveFileDialog1 = gcnew SaveFileDialog;
saveFileDialog1->ShowDialog();
if (saveFileDialog1->ShowDialog() == System::Windows::Forms::DialogResult::OK)
{
s = saveFileDialog1->OpenFile();
}
s->Close();
}
The error that is throwing is
An unhandled exception of type 'System.Threading.ThreadStateException' occurred in System.Windows.Forms.dll
Additional information: Current thread must be set to single thread apartment (STA) mode before OLE calls can be made. Ensure that your Main function has STAThreadAttribute marked on it. This exception is only raised if a debugger is attached to the process.
I'm not really familiar with what this error is saying. I know just a bit about threading, but I'm not sure how threading would be an issue here. I've seen some people reference things like STAThread without providing a clear explanation as to what it does, and Microsoft's documentation makes no mention of having this exception thrown when calling SaveFileDialog/OpenFileDialog, or how to handle it.
Thanks!
When you use OpenFileDialog then a lot of code gets loaded into your process. Not just the operating system component that implements the dialog but also shell extensions. Plugins that programmers write to add functionality to Windows Explorer. They work in that dialog as well. There are many, one you are surely familiar with is the extension that makes a .zip file look like a folder.
One thing Microsoft did when they designed the plug-in interface is to not force an extension to be thread-safe. Because that is very hard to do and often a major source of bugs. They made the promise that the thread that creates the plugin instance is also the thread on which any call to the plugin is made. Thus ensuring that the plugin is always used in a thread-safe manner.
That however requires a little help from you. You have to make a promise that your thread, the one that calls OpenFileDialog::Show(), observes the requirements of a single-threaded apartment. STA for short. You make the promise with the [STAThread] attribute on your program's Main() entrypoint. Or if it is a thread that you created yourself then by calling Thread::SetApartmentState() before you start it.
That's just a promise however, you also have to implement what you promised. Takes two things, you promise to never block the thread and you promise to pump a message loop. Application::Run() in a .NET program. The never-block promise ensures that you won't cause deadlock. And the message loop promise says that you implement a solution to the producer-consumer problem.
This should never be a problem, it is very unclear how this got fumbled in your project. Another implicit requirement for a dialog is that it must have an owner. Another window on which it can be on top of. If it doesn't have one then there are very high odds that the user never sees the dialog. Covered by another program's window, the user can only ever find it back by accident. When you create windows then you always also must call Application::Run() so the windows can respond to user input. Use the boilerplate code in a C++/CLI app so this is done correctly.

Creating promise in one thread and setting it in another

Can I have an boost::promise<void> created in a thread and set its value in another different thread through boost::promise<void>::set_value().
I think I am having a crash because of this, probably, so I must guess that no, but I would need confirmation. Thanks in advance.
P.S.: Note that I am using boost implementation.
Yes, you can do that, but you must ensure that the call to set_value() does not conflict with anything in the other thread, such as the completion of the constructor or the start of the destructor.
(According to the C++ standard you cannot even make potentially concurrent calls to set_value() and get_future() but that is a defect and should get fixed.)
To give a more precise answer it would be necessary to see exactly what your code is doing.

wxwidgets and threads

In my application I run wglGetCurrentDC() and wglGetCurrentContext() from onThread function
(this function should be called as declared here - EVT_THREAD(wxID_ANY,MyCanvas::onThread))
and I get NULL in both cases. When I run it not from onThread it is ok…
What is work around in order to solve the problem – (I have to run them when getting event from the thread!)
As Alex suggested I changed to wxPostEvent to redirect the event to main thread, which catches the event in its onThread function.In this onThread function I have wglGetCurrentDC() and wglGetCurrentContext() calls ...They still return null.Please explain me what I am doing wrong. And how to solve he problem.
Maybe I'm misunderstanding, but should you not be using wxGLCanvas and wxGLContext rather than the windows-specific code? At the very least it's probably more compatible with other wxWidget code.
Anyway, from the wglGetCurrentDC documentation, the function returns NULL if a DC for the current window doesn't exist. This suggests that either the context was destroyed somehow or you're not calling it from the window you think you're calling it from (perhaps because of your threading?). I would reiterate what Alex said; don't call UI code from any thread besides the main one.
If you could post some code showing how you're returning from the thread it might help identify the problem. It seems likely that you're doing UI stuff from the thread and just not realizing it. (Hard to tell without seeing any code, though.)
Don't touch any UI-related stuff from a worker thread. This is general requirement for all UI frameworks. Use wxPostEvent to redirect a work to the main application thread.

Best way to prevent early garbage collection in CLR

I have written a managed class that wraps around an unmanaged C++ object, but I found that - when using it in C# - the GC kicks in early while I'm executing a method on the object. I have read up on garbage collection and how to prevent it from happening early. One way is to use a "using" statement to control when the object is disposed, but this puts the responsibility on the client of the managed object. I could add to the managed class:
MyManagedObject::MyMethod()
{
System::Runtime::InteropServices::GCHandle handle =
System::Runtime::InteropServices::GCHandle::Alloc(this);
// access unmanaged member
handle.Free();
}
This appears to work. Being new to .NET, how do other people deal with this problem?
Thank you,
Johan
You might like to take a look at this article: http://www.codeproject.com/Tips/246372/Premature-NET-garbage-collection-or-Dude-wheres-my. I believe it describes your situation exactly. In short, the remedies are either ausing block or a GC.KeepAlive. However, I agree that in many cases you will not wish to pass this burden onto the client of the unmanaged object; in this case, a call to GC.KeepAlive(this) at the end of every wrapper method is a good solution.
You can use GC.KeepAlive(this) in your method's body if you want to keep the finalizer from being called. As others noted correctly in the comments, if your this reference is not live during the method call, it is possible for the finalizer to be called and for memory to be reclaimed during the call.
See http://blogs.microsoft.co.il/blogs/sasha/archive/2008/07/28/finalizer-vs-application-a-race-condition-from-hell.aspx for a detailed case study.

Delphi Win API CreateTimerQueueTimer threads and thread safe FormatDateTime crashes

This is a bit of a long question, but here we go. There is a version of FormatDateTime that is said to be thread safe in that you use
GetLocaleFormatSettings(3081, FormatSettings);
to get a value and then you can use it like so;
FormatDateTime('yyyy', 0, FormatSettings);
Now imagine two timers, one using TTimer (interval say 1000ms) and then another timer created like so (10ms interval);
CreateTimerQueueTimer
(
FQueueTimer,
0,
TimerCallback,
nil,
10,
10,
WT_EXECUTEINTIMERTHREAD
);
Now the narly bit, if in the call back and also the timer event you have the following code;
for i := 1 to 10000 do
begin
FormatDateTime('yyyy', 0, FormatSettings);
end;
Note there is no assignment. This produces access violations almost immediatley, sometimes 20 minutes later, whatever, at random places. Now if you write that code in C++Builder it never crashes. The header conversion we are using is the JEDI JwaXXXX ones. Even if we put locks in the Delphi version around the code, it only delays the inevitable. We've looked at the original C header files and it all looks good, is there some different way that C++ uses the Delphi runtime? The thread safe version of FormatDatTime looks to be re-entrant. Any ideas or thoughts from anyone who may have seen this before.
UPDATE:
To narrow this down a bit, FormatSettings is passed in as a const, so does it matter if they use the same copy (as it turns out passing a local version within the function call yeilds the same problem)? Also the version of FormatDateTime that takes the FormatSettings doesn't call GetThreadLocale, because it already has the Locale information in the FormatSettings structure (I double checked by stepping through the code).
I made mention of no assignment to make it clear that no shared storage is being accessed, so no locking is required.
WT_EXECUTEINTIMERTHREAD is used to simplify the problem. I was under the impression you should only use it for very short tasks because it may mean it'll miss the next interval if it is running something long?
If you use a plain old TThread the problem doesn't occur. What I am getting at here I suppose is that using a TThread or a TTimer works but using a thread created outside the VCL doesn't, that's why I asked if there was a difference in the way C++ Builder uses the VCL/Delphi RTL.
As an aside this code as mentioned before also fails (but takes longer), after a while, CS := TCriticalSection.Create;
CS.Acquire;
for i := 1 to LoopCount do
begin
FormatDateTime('yyyy', 0, FormatSettings);
end;
CS.Release;
And now for the bit I really don't understand, I wrote this as suggested;
function ReturnAString: string;
begin
Result := 'Test';
UniqueString(Result);
end;
and then inside each type of timer the code is;
for i := 1 to 10000 do
begin
ReturnAString;
end;
This causes the same kinds of failiures, as I said before the fault is never in the same place inside the CPU window etc. Sometimes it's an access violation and sometimes it might be an invalid pointer operation. I am using Delphi 2009 btw.
UPDATE 2:
Roddy (below) points out the Ontimer event (and unfortunately also Winsock, i.e. TClientSocket) use the windows message pump (as an aside it would be nice to have some nice Winsock2 components using IOCP and Overlapping IO), hence the push to get away from it. However does anyone know how to see what sort of thread local storage is setup on the CreateQueueTimerQueue?
Thanks for taking the time to think and answer this problem.
I am not sure if it is good form to post an "Answer" to my own question but it seemed logical, let me know if that is uncool.
I think I have found the problem, the thread local storage idea lead me to follow a bunch of leads and I found this magical line;
IsMultiThread := True;
From the help;
"IsMultiThread is set to true to indicate that the memory manager should support multiple threads. IsMultiThread is set to true by BeginThread and class factories."
This of course is not set by using a single Main VCL thread using a TTimer, However it is set for you when you use TThread. If I set it manually the problem goes away.
In C++Builder, I do not use a TThread but it appears by using the following code;
if (IsMultiThread) {
ShowMessage("IsMultiThread is True!");
}
that is it set for you somewhere automatically.
I am really glad for peoples input so that I could find this and I hope vainly it might help someone else.
As DateTimeToString which FormatDateTime calls uses GetThreadLocale, you may wish to try having a local FormatSettings variable for each thread, maybe even setting up FormatSettings in a local variable before the loop.
It may also be the WT_EXECUTEINTIMERTHREAD parameter which causes this. Note that it states it should only be used for very short tasks.
If the problem persists the problem may actually be elsewhere, which was my first hunch when I saw this but I don't have enough information about what the code does to really determine that.
Details about where the access violation occurs may help.
Are you sure this actually has anything to do with FormatDateTime? You made a point of mentioning that there is no assignment statement there; is that an important aspect of your question? What happens if you call some other string-returning function instead? (Make sure it's not a constant string. Write your own function that calls UniqueString(Result) before returning.)
Is the FormatSettings variable thread-specific? That's the point of having the extra parameter for FormatDateTime, so each thread has its own private copy that is guaranteed not to be modified by any other thread while the function is active.
Is the timer queue important to this question? Or do you get the same results when you use a plain old TThread and run your loop in the Execute method?
You did warn that it was a long question, but it seems there are a few things you could do to make it smaller, to narrow down the scope of the issue.
I wonder if the RTL/VCL calls you're making are expecting access to some thread-local storage (TLS) variables that aren't correctly set up whn you invoke your code via the timer queue?
This isn't the answer to your problem, but are you aware that TTimer OnTimer events just run as part of the normal message loop in the main VCL thread?
You found your answer - IsMultiThread. This has to be used anytime to revert to using the API and create threads. From MSDN: CreateTimerQueueTimer is creating a thread pool to handle this functionality so you have an outside thread working with the main VCL thread with no protection. (Note: your CS.acquire/release doesn't do anything at all unless other parts of the code respect this lock.)
Re. your last question about Winsock and overlapping I/O: You should look closely at Indy.
Indy uses blocking I/O, and is a great choice when you want high performance network IO regardless of what the main thread is doing. Now you've cracked the multi-threading issue, you should just create another thread (or more) to use indy to handle your I/O.
The problem with Indy is that if you need many connections it's not effiecient at all. It requires one thread per connection (blocking I/O) which doesn't scale at all, hence the benefit of IOCP and Overlapping IO, it's pretty much the only scalable way on Windows.
For update2 :
There is a free IOCP socket components : http://www.torry.net/authorsmore.php?id=7131 (source code included)
"By Naberegnyh Sergey N.. High
performance socket server based on
Windows Completion Port and with using
Windows Socket Extensions. IPv6
supported. "
i've found it while looking better components/library to rearchitecture my little instant messaging server. I haven't tried it yet but it looks good coded as a first impression.

Resources