I have a native Visual C++ COM object and I need to make it completely thread-safe to be able to legally mark it as "free-threaded" in the system registry. Specifically, I need to make sure that no more than one thread ever accesses any member variable of the object simultaneously.
The catch is I'm almost sure that no sane consumer of my COM object will ever try to simultaneously use the object from more than one thread. So I want the solution as simple as possible as long as it meets the requirement above.
Here's what I came up with. I add a mutex or critical section as a member variable of the object. Every COM-exposed method will acquire the mutex/section at the beginning and release before returning control.
I understand that this solution doesn't provide fine-grained access and that it might slow execution down, but since I don't expect simultaneous access to really occur, I don't care about this.
Will this solution suffice? Is there a simpler solution?
This solution should work, but I'd recommend mutexes over critical sections as they handle time-outs, which provide some level of fallback in case of deadlock. You also want to be very careful that a function locking a mutex does not call another function that locks the same mutex in the same thread. This shouldn't be a problem for your COM interface, so long as you don't add extra functionality on top of your mutex to the interface. You could hit issues if the COM object includes callbacks.
If you are certain that actual concurrent access is not going to happen in practice, then mutexing the entire execution is not an unreasonable approach.
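A minimal sketch of the "one lock per object" idea, written against standard C++ so it runs anywhere. The class `Widget` and its methods are hypothetical stand-ins for the COM object and its exposed methods; a real Visual C++ object would implement its COM interfaces and could use a Win32 mutex or critical section in place of `std::mutex`. Every public method takes the lock on entry, and shared logic lives in a private helper that assumes the lock is already held, which avoids a method re-locking the same mutex through another public method.

```cpp
#include <cassert>
#include <mutex>
#include <string>

// Hypothetical stand-in for the COM object. Only the locking pattern matters.
class Widget {
public:
    // Each exposed method acquires the object-wide lock first. The guard
    // releases it on every return path, including exceptions.
    void SetName(const std::string& name) {
        std::lock_guard<std::mutex> guard(lock_);
        SetNameUnlocked(name);
    }

    std::string GetName() const {
        std::lock_guard<std::mutex> guard(lock_);
        return name_;
    }

    // Reset needs the same logic as SetName. It calls the private helper
    // instead of the public method, so the non-recursive std::mutex is
    // never locked twice by the same thread.
    void Reset() {
        std::lock_guard<std::mutex> guard(lock_);
        SetNameUnlocked("");
    }

    int Revision() const {
        std::lock_guard<std::mutex> guard(lock_);
        return revision_;
    }

private:
    // Precondition: the caller already holds lock_.
    void SetNameUnlocked(const std::string& name) {
        name_ = name;
        ++revision_;
    }

    mutable std::mutex lock_;  // mutable so const getters can lock
    std::string name_;
    int revision_ = 0;
};
```

The cost is one uncontended lock acquisition per call, which is usually small next to the overhead of a COM call itself.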
Given a situation where thread A has to dispatch work to thread B, is there any synchronisation mechanism that allows thread A to not return, yet remain usable for other tasks, until thread B is done, at which point thread A can return?
This is not language specific, but plain C would be a great choice for an answer.
This could be absolutely counterintuitive; it actually sounds as such, but I have to ask before presuming...
Please Note This is a made up hypothetical situation that I'm interested in. I am not looking for a solution to an existing problem, so alternative concurrency solutions are completely pointless. I have no code for it, and if I were in it I can think of a few alternative code engineering solutions to avoid this setup. I just wish to know if a thread can be usable, in some way, while waiting for a signal from another thread, and what synchronisation mechanism to use for that.
UPDATE
As I mentioned above, I know how to synchronise threads etc. I'm only interested in the situation that I have presented here. Mutexes, semaphores, locks and all kinds of mechanisms will synchronise access to resources, synchronise the order of events, and solve all kinds of concurrency issues, yes. But I'm not interested in how to do it properly. I just have this made-up situation, and I wish to know if it can be addressed with a mechanism as described above.
UPDATE 2
It seems I have opened up a portal for people who think they are experts in concurrency to teleport in and lecture, at every chance, on how they think the rest of the world does not know how threading works. I simply asked if there is a mechanism for this situation, not a workaround, not 'the proper way to synchronise', not a better way to do it. I already know what I would do so as never to be in this made-up situation. It's simply hypothetical.
After much research, thought, and overview, I have come to the conclusion that it's like asking:
Does a calculator have a mode where I simply enter a series of 5 digits and automatically get their sum on the screen?
No, it does not have such a mode ready. But I can still get the sum with a few extra clicks using the plus and eventually the equal button.
If I really wanted a thread that can continue while listening for a condition of some sort, I could easily implement my own class or object around the OS/kernel/SDK thread or whatever and make use of that.
• So at a low level, my answer is no, there is no such mechanism •
If a thread is waiting, then it's waiting. If it can continue executing then it is not really 'waiting', in the concurrency meaning of waiting. Otherwise there would be some other term for this state (Alert Waiting, anyone?). This is not to say it is not possible, just not with one simple low level predefined mechanism similar to a mutex or semaphore etc. One could wrap the required functionality in some class or object etc.
Having said that, there are Interrupts and Interrupt handlers, which come close to addressing this situation. However, an interrupt has to be defined, with its handler. The interrupts may actually be running on another thread (not to say a thread per interrupt). So a number of objects are involved here.
You have a misunderstanding about how mutexes are typically used.
If you want to do some work, you acquire the mutex to figure out what work you need to do. You do this because "what work you need to do" is shared between the thread that decides what work needs to be done and the thread that's going to do the work. But then you release the mutex that protects "what work you need to do" while you do the work.
Then, when you finish the work, you acquire the mutex that protects your report that the work is done. This is needed because the status of the work is shared with other threads. You set that status to "done" and then you release the mutex.
Notice that no thread holds the mutex for very long, just for the microscopic fraction of a second it needs to check on or modify shared state. So to see if work is done, you can acquire the mutex that protects the reporting of the status of that work, check the status, and then release the mutex. The thread doing the work will not hold that mutex for longer than the tiny fraction of a second it needs to change that status.
If you're holding mutexes so long that you worry at all about waiting for them to be released, you're either doing something wrong or using mutexes in a very atypical way.
So use a mutex to protect the status of the work. If you need to wait for work to be done, also use a condition variable. Only hold that mutex while changing, or checking, the status of the work.
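As a rough illustration of that last point, here is a sketch in standard C++ (the question asked for C, but `std::mutex` and `std::condition_variable` map closely onto the pthread equivalents). The names `WorkStatus`, `run_worker`, `wait_for_result` and `run_and_wait` are made up for this example. The worker holds the mutex only while it publishes the result, and the waiter sleeps on the condition variable until the status says "done".

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Shared status of one piece of work, protected by its own mutex.
struct WorkStatus {
    std::mutex mutex;
    std::condition_variable changed;
    bool done = false;
    int result = 0;
};

// Worker side: do the work without the lock, then publish under the lock.
void run_worker(WorkStatus& status) {
    int sum = 0;
    for (int i = 1; i <= 100; ++i) sum += i;  // the "work"; mutex not held
    {
        std::lock_guard<std::mutex> guard(status.mutex);
        status.result = sum;
        status.done = true;
    }
    status.changed.notify_all();  // wake anyone waiting for completion
}

// Waiter side: block until done. The predicate form handles spurious
// wakeups and the case where the work finished before we started waiting.
int wait_for_result(WorkStatus& status) {
    std::unique_lock<std::mutex> lock(status.mutex);
    status.changed.wait(lock, [&status] { return status.done; });
    return status.result;
}

// Dispatch the work to another thread and wait for its result.
int run_and_wait() {
    WorkStatus status;
    std::thread worker(run_worker, std::ref(status));
    int result = wait_for_result(status);
    worker.join();
    return result;
}
```

Calling `run_and_wait()` returns 5050, the sum of 1 through 100 computed on the worker thread.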
But if a thread attempts to acquire an already acquired mutex, that thread will be forced to wait until the thread that originally acquired the mutex releases it. So, while that thread is waiting, can it actually be usable? This is where my question is.
If you consider any case where one thread might slow another thread down to be "waiting", then you can never avoid waiting. All that has to happen is one thread accesses memory and that might slow another thread down. So what do you do, never access memory?
When we talk about one thread "waiting" for another, what we mean is waiting for the thread to do actual work. We don't worry about the microscopic overhead of inter-thread synchronization both because there's nothing we can do about it and because it's negligible.
If you literally want to find some way that one thread can never, ever slow another thread down, you'll have to re-design pretty much everything we use threads for.
Update:
For example, consider some code that has a mutex and a boolean. The boolean indicates whether or not the work is done. The "assign work" flow looks like this:
Create a work object with a mutex and a boolean. Set the boolean to false.
Dispatch a thread to work on that object.
The "do work" flow looks like this:
Do work. (The mutex is not held here.)
Acquire mutex.
Set boolean to true.
Release mutex.
The "is work done" flow looks like this:
Acquire mutex.
Copy boolean.
Release mutex.
Look at copied value.
This allows one thread to do work and another thread to check if the work is done any time it wants to while doing other things. The only case where one thread waits for the other is the one-in-a-million case where a thread that needs to check if the work is done happens to check right at the instant that the work has just finished. Even in that case, it will typically block for less than a microsecond as the thread that holds the mutex only needs to set one boolean and release the mutex. And if even that bothers you, most mutexes have a non-blocking "try to lock" function (which you would use in the "check if work is done" flow so that the checking thread never blocks).
And this is the normal way mutexes are used. Actual contention is the exception, not the rule.
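The three flows above could be sketched roughly like this in standard C++. All names (`WorkItem`, `finish_work`, `is_work_done`, `dispatch_and_keep_busy`) are invented for the example, and the counter merely stands in for "other tasks". The check uses the non-blocking "try to lock" variant, so the dispatching thread never blocks on the mutex: if the lock happens to be busy, it reports "not done yet" and asks again later.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>

// Work object: a mutex and a boolean, as in the "assign work" flow.
struct WorkItem {
    std::mutex mutex;
    bool done = false;
};

// "Do work" flow: work without the mutex, then flip the flag under it.
void finish_work(WorkItem& item) {
    // ... the actual work would happen here, with the mutex not held ...
    std::lock_guard<std::mutex> guard(item.mutex);
    item.done = true;
}

// "Is work done" flow, non-blocking variant.
bool is_work_done(WorkItem& item) {
    std::unique_lock<std::mutex> lock(item.mutex, std::try_to_lock);
    if (!lock.owns_lock()) return false;  // mutex busy: treat as "not yet"
    bool copy = item.done;                // copy the boolean under the lock
    lock.unlock();                        // release before using the copy
    return copy;
}

// Thread A: dispatch the work, stay usable for other tasks, and return
// only once thread B has reported completion.
int dispatch_and_keep_busy() {
    WorkItem item;
    std::thread worker(finish_work, std::ref(item));
    int other_tasks = 0;
    while (!is_work_done(item)) {
        ++other_tasks;               // stand-in for useful work on thread A
        std::this_thread::yield();
    }
    worker.join();
    return other_tasks;
}
```

How many "other tasks" run before completion depends on scheduling, so the returned count varies from run to run.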
I know that you need synchronize (yourprocedure) to set e.g. a label's text.
But what about:
Reading a label's text.
Toggle/Set the label's enabled property.
Call other labels procedures/functions (e.g. onclick event).
Is there an easy rule to know/remember when I need to use synchronize?
PS.: Is synchronize similar to PostMessage/SendMessage?
Easy rule of thumb: ANY access to VCL UI components needs to be synchronized. That includes both reading and writing of UI control properties. Win32 UIs, most notably dialogs like MessageBox() and TaskDialog(), can be used directly in worker threads without synchronizing.
TThread.Synchronize() is similar to SendMessage() (in fact, it used to be implemented using SendMessage() internally in Delphi 5 and earlier). TThread.Queue() is similar to PostMessage().
Any time you access a VCL UI component, you need to implement some type of thread safety measure. This is also, typically, the case when you're accessing a variable or procedure that exists or will be accessed by another thread. However, you don't need to use the Synchronize method in all of these situations. There are other tools at your disposal, and Synchronize is not always your best solution.
Synchronize blocks both the main thread and the calling thread while it's performing the procedure that you pass to it, so overusing it can detract from the benefits of multi-threading. Synchronize is probably most commonly used for updating your UI, but if you find that you're having to use it really frequently, then it might not be a bad idea to check whether you can restructure your code. For example, do you really need to read labels from within your thread? Can you read the label before starting the thread and pass it into the thread's constructor? Can you handle any of these tasks in the thread's OnTerminate event handler?
Can you suggest an approach when design-time components can be accessed both from general code (VCL or other) and from my own threads?
The problem is that when I have full control over my own threads, I know exactly when I should access mutexes. In the case of design-time elements I have no control, at least not over the VCL-related code.
One of the variants would be to wrap HandleMessage in a mutex access code. The idea behind this is that almost everything related to VCL comes from message processing code (the exception is direct SendMessage handling). But looking at the sources I see no "official" way to wrap message handling in any code fragment.
Don't even try to go there. Google for "global interpreter lock" (Python specific) to see what a bad idea such a bottleneck is.
If you need synchronized access to data, try to make the locked access as short as possible, and lock not any higher in the call chain than you absolutely must. If you have objects that are to be accessed from multiple threads, then synchronize inside their methods.
I'm wondering what is the "best" way to make data thread-safe.
Specifically, I need to protect a linked-list across multiple threads -- one thread might try to read from it while another thread adds/removes data from it, or even frees the entire list. I've been reading about locks; they seem to be the most commonly used approach, but apparently they can be problematic (deadlocks). I've also read about atomic-operations as well as thread-local storage.
In your opinion, what would be my best course of action? What's the approach that most programmers use, and for what reason?
One approach that is not heavily used, but quite sound, is to designate one special purpose thread to own every "shared" structure. That thread generally sits waiting on a (thread-safe;-) queue, e.g. in Python a Queue.Queue instance, for work requests (reading or changing the shared structure), including both ones that request a response (they'll pass their own queue on which the response is placed when ready) and ones that don't. This approach entirely serializes all access to the shared resource, remaps easily to a multi-process or distributed architecture (almost brainlessly, in Python, with multiprocessing;-), and absolutely guarantees soundness and lack of deadlocks as well as race conditions as long as the underlying queue object is well-programmed once and for all.
It basically turns the hell of shared data structures into the paradise of message-passing concurrency architectures.
OTOH, it may be a tad higher-overhead than slugging it out the hard way with locks &c;-).
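For the linked-list case in the question, the owner-thread idea might look something like the following C++ sketch. It is only an illustration under my own assumptions: `ListOwner` and its methods are hypothetical names, requests are plain closures, and a `std::future` plays the role of the per-request response queue described above. Only the owner thread ever touches the list; every other thread just posts requests.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <future>
#include <list>
#include <memory>
#include <mutex>
#include <thread>

class ListOwner {
public:
    ListOwner() : thread_([this] { run(); }) {}

    ~ListOwner() {
        post([this] { stop_ = true; });  // processed after earlier requests
        thread_.join();
    }

    // Request that needs no response.
    void push_back(int value) {
        post([this, value] { items_.push_back(value); });
    }

    // Request with a response: the future is the caller's "reply queue".
    std::future<std::size_t> size() {
        auto reply = std::make_shared<std::promise<std::size_t>>();
        std::future<std::size_t> result = reply->get_future();
        post([this, reply] { reply->set_value(items_.size()); });
        return result;
    }

private:
    void post(std::function<void()> request) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            requests_.push_back(std::move(request));
        }
        ready_.notify_one();
    }

    // Owner thread: take one request at a time and run it. The queue lock
    // is held only while dequeuing, never while touching items_.
    void run() {
        while (!stop_) {
            std::function<void()> request;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return !requests_.empty(); });
                request = std::move(requests_.front());
                requests_.pop_front();
            }
            request();
        }
    }

    std::mutex mutex_;                             // protects requests_ only
    std::condition_variable ready_;
    std::deque<std::function<void()>> requests_;
    std::list<int> items_;                         // owner thread only
    bool stop_ = false;                            // owner thread only
    std::thread thread_;                           // last: starts after the rest exist
};
```

A caller uses it as `owner.push_back(1);` followed by `owner.size().get()`, which blocks until the owner thread has answered.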
You could consider an immutable collection. Much like how a string in .net has methods such as Replace, Insert, etc. It doesn't modify the string but instead creates a new one, a LinkedList collection can be designed to be immutable as well. In fact, a LinkedList is actually fairly simple to implement this way as compared to some other collection data structures.
Here's a link to a blog post discussing immutable collections and a link to some implementations in .NET.
http://blogs.msdn.com/jaredpar/archive/2009/04/06/immutable-vs-mutable-collection-performance.aspx
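To give a feel for why a linked list is easy to make immutable, here is a small C++ sketch (not taken from the linked post; `ImmutableList` and its methods are names I made up). Nodes are never modified after construction, so "adding" an element returns a new list that shares its tail with the old one, and any thread holding an older version can keep reading it safely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

template <typename T>
class ImmutableList {
    struct Node {
        T value;
        std::shared_ptr<const Node> next;
    };

    std::shared_ptr<const Node> head_;

    explicit ImmutableList(std::shared_ptr<const Node> head)
        : head_(std::move(head)) {}

public:
    ImmutableList() = default;  // the empty list

    bool empty() const { return !head_; }

    // Precondition: the list is not empty.
    const T& front() const { return head_->value; }

    // Returns a new list. The existing list is left untouched and its
    // nodes are shared with the new one rather than copied.
    ImmutableList push_front(T value) const {
        return ImmutableList(
            std::make_shared<Node>(Node{std::move(value), head_}));
    }

    // Precondition: the list is not empty. Returns the tail as a new list.
    ImmutableList pop_front() const { return ImmutableList(head_->next); }

    std::size_t size() const {
        std::size_t count = 0;
        for (const Node* node = head_.get(); node; node = node->next.get())
            ++count;
        return count;
    }
};
```

Note that publishing a new version to other threads (replacing the "current list" variable) still needs a lock or an atomic swap; immutability only guarantees that a version you already hold never changes underneath you.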
Always remember the most important rule of thread safety: know all the critical sections of your code inside out. And by that, I mean know them like your ABCs. Only if you can identify them at once when asked will you know which areas to apply your thread-safety mechanisms to.
After that, remember the rules of thumb:
Look out for all your global variables / variables on the heap.
Make sure your subroutines are re-entrant.
Make sure access to shared data is serialized.
Make sure there are no indirect accesses through pointers.
(I'm sure others can add more.)
The "best" way, from a safety point of view, is to put a lock on the entire data structure, so that only one thread can touch it at a time.
Once you decide to lock less than the entire structure, presumably for performance reasons, the details of doing this are messy and differ for every data structure, and even variants of the same structure.
My suggestion is to
Start with a global lock on your data structure. Profile your program to see if it's really a problem.
If it is a problem, consider whether there's some other way to distribute the problem. Can you minimize the amount of data in the data structure in question, so that it need not be accessed so often or for so long? If it's a queuing system, for example, perhaps you can keep a local queue per thread, and only move things into or out of a global queue when a local queue becomes over- or under-loaded.
Look at data structures designed to help reduce contention for the particular type of thing you're doing, and implement them carefully and precisely, erring on the side of safety. For the queuing example, work-stealing queues might be what you need.
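For the first step, a global lock on the list could be as simple as the following C++ sketch. `LockedList` is a hypothetical wrapper around `std::list`, not an existing library type. Readers take a snapshot under the lock and iterate over their private copy, so no thread ever walks the live list while another adds, removes or frees nodes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

template <typename T>
class LockedList {
public:
    void push_back(T value) {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.push_back(std::move(value));
    }

    // Removes the first element equal to value; returns false if absent.
    bool remove_first(const T& value) {
        std::lock_guard<std::mutex> guard(mutex_);
        auto it = std::find(items_.begin(), items_.end(), value);
        if (it == items_.end()) return false;
        items_.erase(it);
        return true;
    }

    // "Frees the entire list" while holding the same lock as everyone else.
    void clear() {
        std::lock_guard<std::mutex> guard(mutex_);
        items_.clear();
    }

    // Copy the contents under the lock; the caller iterates the copy.
    std::vector<T> snapshot() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return std::vector<T>(items_.begin(), items_.end());
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;  // one lock for the whole structure
    std::list<T> items_;
};
```

Copying on every read is deliberately naive. It is the kind of cost that profiling in the first step would either confirm as harmless or flag as the thing to optimise.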
I am writing a BHO (an extension for IE) that receives events on another thread. When I access the DOM from that other thread, IE crashes. Is it possible to have the DOM accessed from the same thread as the main BHO thread so that it does not crash?
It seems like a general COM multithreading problem, which I don't understand much.
Look into using CoMarshalInterface or CoMarshalInterThreadInterfaceInStream
These will give you a wrapped interface to an STA COM object that is thread safe.
I don't know much about IE extensions, but it sounds like some COM object needs to be marked as a Single Threaded Apartment (STA) object, so that the COM runtime ensures that it is run on the same thread that called it initially. If you can't alter the other object, you could probably route your calls to the DOM through a separate COM object marked as STA to achieve the same effect. Hope this helps... I know a bit about COM multithreading, but not much about IE extensions.
ah, fun fun fun multithreading with COM.
Gerald's answer looks right if you want to transfer an interface pointer from one thread to another exactly once. I've found that the GIT (global interface table) is a big help for this kind of thing if you're in a multithreaded system... basically you don't keep around interface pointers but rather DWORD cookies used by the GIT to get an appropriately-marshaled interface pointer for whatever thread you are using it. (you have to register the object in question with the GIT first, and unregister it later when you are done or your object is finished)
Be careful though. Performance can become a serious issue.
If you're just playing around to learn about BHOs, you can use the STA to make the object implementing ::SetSite() operate as if it were single-threaded (this allows you to let other threads pull your BHO's pointer out of the GlobalInterfaceTable, as #JasonS mentions).
If you're doing something that is expected to be part of a product, I highly recommend you seriously consider going MTA everywhere you can and handling the concurrency and thread-safety issues yourself. In this case you would only need to ensure that the threads interoperating with your BHO COM object were themselves initialized for COM.
For example, if you want to monitor the incoming/outgoing data of a website looking for things (either dangerous or sensitive), then you do NOT want to force all of those threads down the throat of an STA object because, using Yahoo as an example, more than 30 requests will launch and your BHO will start locking up IE.