Questions about COM multithreading and STA / MTA

Hi, I am a beginner in COM. I want to test a COM DLL in both STA and MTA modes. My first question is: is it possible for a COM object to support both STA and MTA?
Now I imagine the STA code snippet below:
// this is the main thread
m_IFoo;
CoInitializeEx(STA); // initialize COM in main thread
CreateInstance(m_IFoo);
m_IFoo->Bar();
CreateThread(ThreadA); // start ThreadA
// this is the secondary thread
ThreadA()
{
    CoInitializeEx(STA);
    m_IFoo->Buz(); // call m_IFoo's method directly
}
Will this code work? Am I missing any fundamental things? I know the main thread needs a window message loop to let calls from other threads be executed. Do I have to do anything about it?
Now I move on to test MTA. If I merely replace "STA" with "MTA" in the above code, will it work?
Another question: since a thread with a GUI must be STA, does that mean I cannot initialize and test MTA in a GUI thread?
Thanks in advance, and sorry for being naive about COM and threading.

Your code is not legal COM, because you are passing a pointer directly from one STA to another, which COM doesn't allow.
In COM, interface pointers have "apartment affinity": they can only be used within the apartment they were obtained in. To pass a pointer from one STA to another, or between an STA and the MTA, you have to "marshal" the pointer into a safe representation, which the receiving thread then unmarshals.
The simplest way to do this is using the Global Interface Table; you register the interface with it in one thread and get back a DWORD, which you then use in the other thread to get back a version of the interface that the other thread can use.
If both threads are MTA, you can avoid doing this. While STAs are one-per-thread (each STA thread has its own apartment), the MTA is shared by all MTA threads. This means that MTA threads can pass COM pointers between themselves freely. (But they still need to marshal when passing pointers to or from STA threads.)
Generally speaking, you don't switch code between STA and MTA; you usually decide this once at the outset. If the thread has UI, then it needs a message loop and is usually STA. If there's no UI, you may decide to use MTA. But once you make that decision and write your code, it's rare to change to the other later, since each choice carries different requirements and assumptions that affect the code. Change from STA to MTA or vice versa and you'd have to carefully review the code to see whether things like pointer assignments need to be changed.
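To make the register / look up / revoke flow of the Global Interface Table concrete, here is a minimal portable C++ sketch. It is only a model under stated assumptions: `Foo`, `CookieTable`, `registerObject`, `get` and `revoke` are hypothetical names, not COM APIs. The real table is reached through IGlobalInterfaceTable (RegisterInterfaceInGlobal, GetInterfaceFromGlobal, RevokeInterfaceFromGlobal) and hands back a pointer that is valid for the calling apartment, which this model does not attempt.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a COM object; not a real interface.
struct Foo {
    std::string name;
};

// Portable model of a cookie table: threads exchange a plain integer
// instead of a raw pointer and look the object up when they need it.
class CookieTable {
public:
    // Stores the object and returns an opaque cookie (never 0).
    std::uint32_t registerObject(std::shared_ptr<Foo> obj) {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::uint32_t cookie = next_++;
        objects_[cookie] = std::move(obj);
        return cookie;
    }

    // Returns the object for a cookie, or nullptr if the cookie is unknown
    // or has already been revoked.
    std::shared_ptr<Foo> get(std::uint32_t cookie) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = objects_.find(cookie);
        if (it == objects_.end()) {
            return nullptr;
        }
        return it->second;
    }

    // Removes the entry; returns false if the cookie was unknown.
    bool revoke(std::uint32_t cookie) {
        std::lock_guard<std::mutex> lock(mutex_);
        return objects_.erase(cookie) == 1;
    }

private:
    mutable std::mutex mutex_;
    std::uint32_t next_ = 1;
    std::unordered_map<std::uint32_t, std::shared_ptr<Foo>> objects_;
};
```

The creating thread would call `registerObject` once and hand only the returned cookie to the worker thread; the worker calls `get` when it needs the object, and the owner calls `revoke` when the object is finished.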

Whether you can switch from "MTA" to "STA", and what the consequences of such a switch are, depends on how the object is registered in the system registry. For the object to "support" both cases without marshalling, it has to have ThreadingModel set to Both.
Please see this great answer - Both means "either Free or Apartment depending on how the caller initializes COM". That's exactly what you want.
As to using the "STA" mode - yes, the thread the object belongs to will have to run a message loop by calling GetMessage(), TranslateMessage() and DispatchMessage() in a loop. In any case, the object's methods won't be called directly from the second thread - they will go through the proxy. Please see this very good article for a thorough explanation.

Related

Microsoft's Apartment Analogy (STA, MTA): Need help understanding it

I've read lots about Microsoft's threaded apartment model, but I'm still having a little trouble visualizing it.
Microsoft uses the analogy of living things living in an apartment. So, for STA, consider the following (I know it's a little silly).
Assume thread = person and COMObject = bacteria. The person lives in the apartment, and the bacteria lives inside the person. So in STA-Land, a thread lives in the STA and the COMObject lives inside the thread, so in order to interact with the COMObject, one must do so by running code on the COMObject's thread.
Assume thread = person and COMObject = cat. The person lives in the apartment, and the cat lives in the apartment with the person. So in STA-Land, the thread and the COMObject are at the same hierarchical level.
Q1. Which analogy above is correct, or if neither is correct, how would you describe the STA?
Q2. How would you describe the MTA?

I do not like these analogies. They are confusing.
You create an apartment.
If it is an STA there will be only one thread in the apartment, so all the objects in that apartment will be executed on that single thread (so there is no concurrent execution of the objects in that apartment).
If it is an MTA there can be multiple threads in that apartment. So the objects in the MTA need to implement the synchronization explicitly if needed.
An object lives in one apartment. There can be multiple objects in the same apartment.
A very good read here

It is not a great term. It actually describes thread behavior. A thread tells COM how it behaves in the CoInitializeEx() call, selecting between STA and MTA. By using STA, the thread promises that it behaves in a manner that is suitable for code that is not thread-safe. The hard promises it makes are:
Never blocks execution
Pumps a message loop
Using MTA means a thread can do whatever it wants and does not make any effort to support code that is not thread-safe.
This matters first when a COM object gets created. Such an object has an entry in the registry that describes what kind of thread-safety it implements: the ThreadingModel value. By far the most common setting is "Apartment" (or the value is missing), telling COM that the object doesn't support threading at all and that any calls on the object must be made from the same thread.
If the thread that creates such an object is in an STA then everything is happy. After all, the thread promised to support single-threaded objects. If the thread is in the MTA then there's a problem: the thread said it makes no effort to support code that isn't thread-safe, yet it created an object that isn't thread-safe. COM steps in and creates a new thread, an STA thread that can support code that isn't thread-safe. The code gets a proxy to the object. Any calls made on the object go through that proxy. The proxy code intercepts the call and makes it run on the STA thread that was created, thus ensuring the call is made in a thread-safe way.
As you can imagine, the job done by the proxy isn't cheap. It involves two thread context switches, and a stack frame must be constructed from the function arguments to make the call. It must also wait until the thread is ready to execute the call. This is called marshaling; it is easily three orders of magnitude slower than making a call that doesn't have to be marshaled. This perhaps also explains why an STA thread has the two requirements listed above. It cannot block, because as long as it blocks the marshaled call cannot be made, which makes deadlock very likely. And it must pump a message loop, because that loop is what makes injecting a call into another thread possible.
So making a thread join the MTA is easy programming for you, but deadly to performance when the objects it uses are apartment-threaded. An STA is efficient for such objects.
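The proxy hand-off described above can be modelled in portable C++ without any COM at all. The sketch below is an assumption-laden illustration, not how COM is implemented: `ApartmentModel`, `invoke` and `ownerId` are hypothetical names, a plain queue stands in for the window message loop, and a blocked caller here cannot service incoming calls the way a real STA thread can.

```cpp
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Portable model of an STA: one owner thread pumps a queue, and every call
// made through invoke() is posted to that queue instead of running directly.
class ApartmentModel {
public:
    ApartmentModel() : owner_([this] { pump(); }) {}

    ~ApartmentModel() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quit_ = true;
        }
        wake_.notify_one();
        owner_.join();
    }

    std::thread::id ownerId() const { return owner_.get_id(); }

    // Plays the role of the proxy: package the call, post it to the owner
    // thread, and block the caller until the owner thread has run it.
    // Calling this from the owner thread itself would deadlock in this model.
    template <typename F>
    auto invoke(F f) -> decltype(f()) {
        using Result = decltype(f());
        auto task = std::make_shared<std::packaged_task<Result()>>(std::move(f));
        std::future<Result> done = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            calls_.push([task] { (*task)(); });
        }
        wake_.notify_one();
        return done.get();
    }

private:
    // The "message loop": run posted calls one at a time, in order.
    void pump() {
        for (;;) {
            std::function<void()> call;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return quit_ || !calls_.empty(); });
                if (calls_.empty()) {
                    return;  // quit requested and nothing left to run
                }
                call = std::move(calls_.front());
                calls_.pop();
            }
            call();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> calls_;
    bool quit_ = false;
    std::thread owner_;  // declared last so the other members exist before it starts
};
```

Every `invoke` costs a hand-off to the owner thread and back, and it cannot complete while the owner is busy or blocked, which is the same cost and deadlock risk described above for marshaled calls.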

Difference between "free-threaded" and "thread-safe"

Sometimes I see the term "free-threaded" to describe a class or a method. It seems to have a similar or identical meaning to "thread-safe".
Is there a difference between the two terms?

There may well be other things meant in other contexts, but in cases I've worked with in the past, "free threaded" means it works, or at least can work, across different threads without any marshalling between apartments.
Apartment-threading in contrast blocks off different "apartments" with separate copies of "global" data (which hence isn't really global, when you think about it) and either allows only one thread to operate in the apartment, or allows several but which will still be separate from those using other apartments.
Now, because the apartment model offers some thread-safety of its own, some (but not all) concerns about thread-safety go away. A piece of code that is designed to operate in an apartment model will be thread-safe, but some or all of that thread-safety is coming from the apartment model.
A free threaded piece of code will have to provide full guarantees of whatever degree of thread-safety it is claiming itself.
Which means it pretty much does mean the same thing as thread-safe, for any intents and purposes where you don't also have to consider the thread-safety of the use of apartment-model code.

I've just done some research on what "free-threaded" might mean, and I also ended up with COM. Let me first cite two passages from the 1998 edition of Don Box's book Essential COM. (The book actually contains more sections about the free-threaded model, but I'll leave it at that for now.)
A thread executes in exactly one apartment at a time. Before a thread can use COM, it must first enter an apartment. […] COM defines two types of apartments: multithreaded apartments (MTAs) and singlethreaded apartments (STAs). Each process has at most one MTA; however, a process can contain multiple STAs. As their names imply, multiple threads can execute in an MTA concurrently, whereas only one thread can execute in an STA. […]
— from pages 200-201. (Emphasis added by me.)
Each CLSID in a DLL can have its own distinct ThreadingModel. […]
ThreadingModel="Both" indicates that the class can execute in either an MTA or an STA.
ThreadingModel="Free" indicates that the class can execute only in an MTA.
ThreadingModel="Apartment" indicates that the class can execute only in an STA.
The absence of a ThreadingModel value implies that the class can run only on the main STA. The main STA is defined as the first STA to be initialized in the process.
— from page 204. (Formatting and emphasis added by me.)
I take this to mean that a component (class) that is declared as free-threaded runs in an MTA, where several threads can execute concurrently and calls to the component from different threads are explicitly allowed; i.e. a free-threaded component supports a multithreaded environment. It would obviously have to be thread-safe in order to achieve this.
The opposite would be a component that is designed for an STA, i.e. one that only allows calls from one particular thread. Such a class would not have to be thread-safe, because COM makes sure that no thread other than the one that "entered" / set up the STA can use the component in the first place; that is, COM protects the component from concurrent access.
Conclusion: COM's term "free-threaded" essentially seems to have the same implications as the more general term "thread-safe".
P.S.: This answer assumes that "thread-safe" basically means something like, "can deal with concurrent accesses (possibly by different threads)."
P.P.S.: I do wonder whether "free-threaded" is the opposite of "having thread affinity".
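The four ThreadingModel rules quoted above can be written down as a small decision function. This is my reading of those rules for in-process classes, not an API: `ThreadingModel`, `CallerApartment`, `Placement`, `placeObject` and `needsProxy` are hypothetical names, and `HostSta` stands for an STA thread that COM creates itself when the caller is in the MTA.

```cpp
// Registry setting of the class (None = ThreadingModel value absent).
enum class ThreadingModel { None, Apartment, Free, Both };

// Apartment of the thread that asks for the object to be created.
enum class CallerApartment { MainSta, OtherSta, Mta };

// Apartment in which the new object ends up living.
enum class Placement { SameAsCaller, MainSta, HostSta, Mta };

// Sketch of the placement rules quoted from Essential COM (in-process case).
Placement placeObject(ThreadingModel model, CallerApartment caller) {
    const bool callerIsSta = caller != CallerApartment::Mta;
    switch (model) {
        case ThreadingModel::Both:
            return Placement::SameAsCaller;  // runs in either kind of apartment
        case ThreadingModel::Free:
            return callerIsSta ? Placement::Mta : Placement::SameAsCaller;
        case ThreadingModel::Apartment:
            return callerIsSta ? Placement::SameAsCaller : Placement::HostSta;
        case ThreadingModel::None:
            return caller == CallerApartment::MainSta ? Placement::SameAsCaller
                                                      : Placement::MainSta;
    }
    return Placement::SameAsCaller;  // not reached; keeps compilers quiet
}

// The caller gets a direct pointer only when the object lives in its own
// apartment; in every other case calls go through a proxy.
bool needsProxy(ThreadingModel model, CallerApartment caller) {
    return placeObject(model, caller) != Placement::SameAsCaller;
}
```

For example, `needsProxy(ThreadingModel::Apartment, CallerApartment::Mta)` is true, while the same class created from an STA thread is used directly.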

“Free-threaded” and “thread-safe” have different meanings. For example, ASP.NET can use "application state" to share data between different web sessions and users of an application (i.e. an IIS virtual directory and all its subdirectories). The Microsoft documentation says:
https://learn.microsoft.com/en-us/previous-versions/aspnet/ms178594(v=vs.100)
Application state is free-threaded, which means that application state data can be accessed simultaneously by many threads. Therefore, it is important to ensure that when you update application state data, you do so in a thread-safe manner by including built-in synchronization support. You can use the Lock and UnLock methods to ensure data integrity by locking the data for writing by only one source at a time.
So, as I understand it, "application state" can be read simultaneously by many threads; but to update its contents, your ASP.NET code must wrap the update in Lock and UnLock calls.
The same understanding extends to COM. “Free-threaded” for the MTA means that threads can read COM data simultaneously; to update COM data in the MTA, your code has to provide the synchronization itself. An STA, on the other hand, is made “thread-safe” by COM itself, so your program can access STA data directly and easily.
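A small portable C++ sketch of that distinction, under stated assumptions: `SharedState` and `countHits` are hypothetical names that only imitate the Lock / UnLock pattern from the quoted documentation; this is not the ASP.NET or COM API. The store lets any thread in at any time (free-threaded), and the update is correct only because each caller takes the lock around it.

```cpp
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical free-threaded store: it never restricts which thread calls it,
// so callers must bracket updates with lock() / unlock() themselves.
class SharedState {
public:
    void lock() { mutex_.lock(); }
    void unlock() { mutex_.unlock(); }

    // Unsynchronized access; only safe while the caller holds the lock.
    int& value(const std::string& key) { return values_[key]; }

private:
    std::mutex mutex_;
    std::map<std::string, int> values_;
};

// Increments one shared counter from several threads and returns the total.
int countHits(int threads, int hitsPerThread) {
    SharedState state;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&state, hitsPerThread] {
            for (int i = 0; i < hitsPerThread; ++i) {
                // lock_guard calls state.lock() now and state.unlock() on scope exit.
                std::lock_guard<SharedState> guard(state);
                ++state.value("hits");
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    std::lock_guard<SharedState> guard(state);
    return state.value("hits");
}
```

Without the guard inside the loop the increments would race; with it, every update is serialized by the caller rather than by the store.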

VC++ thread marshalling and COM : The application called an interface that was marshalled for a different thread

My VC++ 2005 Dialog based application initializes a COM object in the dialog class and uses it in the worker thread.
I called CoInitialize(NULL) at the start of the application and at the start of the worker thread. But when a COM method is called, the error "The application called an interface that was marshalled for a different thread" follows.
If I use CoInitializeEx(0,COINIT_MULTITHREADED) then I get the same error message.
Please help me in finding the root cause.
Thanks.

You created two single-threaded apartments by calling CoInitialize(NULL). An interface pointer must be marshaled from one apartment to the other before it is usable. Initializing the worker thread as MTA doesn't solve the problem. The original interface pointer was still created in a single-threaded apartment and is thus not thread-safe. In other words, you cannot call the interface methods directly from another thread. Those calls have to be marshaled to the thread that created the interface. Marshaling the interface pointer sets up the plumbing that makes that possible.
The only time you don't have to marshal is when both threads are MTA. That's almost never possible, because your main thread must be STA if it creates any windows. And the COM server would actually have to be thread-safe, which servers very rarely are. They advertise what they need with the ThreadingModel value in the registry. COM will actually create an STA thread if necessary to find a good home for the server.
You must marshal the pointer with CoMarshalInterThreadInterfaceInStream() to avoid the error. That's a fairly unfriendly function; IGlobalInterfaceTable is easier to use. The COM server also has to support it: you typically need a proxy/stub DLL that takes care of the marshaling. You'll get E_NOINTERFACE if it doesn't.
Also beware the overhead, marshaling a call from the worker thread to the main thread is pretty expensive and subject to how responsive your main thread is. In other words, if you wrote the thread to speed up your program or to avoid blocking the user interface then this won't actually work. It is the 'there is no free lunch' principle.

Probably CoMarshalInterface() and CoUnmarshalInterface() are the simplest way to do this.
Look at http://support.microsoft.com/kb/206076. You can download the code example and find different implementations of your requirements in Client.cpp.

I think one of the ways to access COM objects inside another thread would be to use the Global Interface Table (GIT). After initialization, register the interface pointer with the GIT and pass the resulting dwCookie value to the thread as its parameter. Then, inside the thread, reinterpret-cast the parameter pointer back to a DWORD and pass it to the GIT to get our COM pointer.
Thanks

Has anyone seen a programming language that handles threads like this?

Most of the multithreaded work I have done has been in C/C++, Python, or Delphi (Object Pascal). All on Windows. I'll use Delphi for my discussion here. Delphi has a nice class called TThread which abstracts the thread creation process. The class provides an Execute method which is the created thread's thread function. You override that method and typically create a loop within it that exits when the thread is terminated. You do the thread's work inside the loop.
One of the recurring tasks that crops up is keeping track (carefully) of which code gets executed in the thread's context and which code gets executed by external threads in an external context, with synchronization objects guarding data shared by the threads. All basic thread programming stuff. One of the recurring annoyances is creating functions to allow external threads to submit or retrieve data, and moving data from public thread-safe memory objects to those private to the thread and back the other way.
I was wondering if anyone has ever seen a programming language that makes this simpler? Here's what I would love as a programming thread idiom. Let's take a Delphi TThread sub-class created for this discussion. Suppose I could mark class methods with one of three keywords like Private, PublicExecuteInAnyContext, or PublicExecuteInPrivateContext. Here's how they would work.
Private: Private methods would only execute in the thread's context. The compiler would automatically add code that would raise an exception if a code path led to that method being executed in a context outside of the host thread. (E.g. - "Error, attempt to execute method private to thread $AEB from thread $EE0").
PublicExecuteInAnyContext: methods marked as such could be called by the thread that owns the method and any external thread. Any data objects referenced in these methods would automatically be guarded with synchronization objects, with the option to override the default choice and supply your own. (Mutex or semaphore instead of Critical Section, etc.)
PublicExecuteInPrivateContext: Methods marked with this keyword would execute in the thread's context, but are callable by any thread. This option would allow for two strategies for dealing with calls to such methods by external threads:
1) Mode 1 - Block calling thread: the calling thread would block until the method returned. In other words the compiler would automatically write code to make the calling thread block. Any parameters passed by the calling thread would be copied into variables private to the host thread. The method would not execute until the host thread received control. When the host thread exited the method the calling thread would be released and would have any results returned by the method copied into its own private variable space.
2) Mode 2 - Do not block the calling thread: this would allow for the additional argument of a callback function. Any parameters passed to the method by an external thread would be copied into the thread's private variable space. The compiler would again hold the execution of the method until the host thread received control, but would let the calling thread continue without blocking. When the host thread completed execution of the method, if a callback function was specified, it would call that function ** but in the context of the original thread that called the method, not in the host thread context **.
A language that provided this kind of automatic thread handling would, at least to me, be a lot more fun to do multithreading with than the way I have to do it now. Has anybody seen a programming language, or a module/hack for one of the mainstream languages, that provides this kind of multi-threading model?
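For reference, the "Private" behaviour described above can be approximated by hand in standard C++ with a thread-id check. This is only a sketch of the idea, not a language feature: `Worker`, `privateStep` and `rejectedFromOtherThread` are hypothetical names, and the sketch assumes the owning thread is the one that constructs the object.

```cpp
#include <stdexcept>
#include <thread>

// Hypothetical worker whose "private" method may run only on its owner thread.
class Worker {
public:
    // Binds the object to the thread that constructs it (an assumption of this sketch).
    Worker() : owner_(std::this_thread::get_id()) {}

    // Emulates the proposed Private keyword with a hand-written check:
    // throws if the calling thread is not the owner.
    int privateStep() {
        if (std::this_thread::get_id() != owner_) {
            throw std::logic_error("private method called from a foreign thread");
        }
        return ++steps_;
    }

private:
    std::thread::id owner_;
    int steps_ = 0;
};

// Calls privateStep() from a second thread and reports whether it was rejected.
bool rejectedFromOtherThread(Worker& worker) {
    bool rejected = false;
    std::thread other([&] {
        try {
            worker.privateStep();
        } catch (const std::logic_error&) {
            rejected = true;
        }
    });
    other.join();  // join() makes the write to `rejected` visible here
    return rejected;
}
```

The check runs before any state is touched, so a rejected call leaves the step counter unchanged; what the wished-for language would add is generating this check automatically.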

I don't think it matches your description exactly, but Erlang provides one of the easiest concurrency models I've seen. It uses a share-nothing approach, which may or may not sound appealing to you, but I found it really interesting. It's also really easy to create distributed systems with it, if you ever have the need. Check out "Getting Started with Erlang", it's a great tutorial that covers almost all parts of the language.

Wanted to mention. LabVIEW really makes starting with multi threading a breeze. I just found some article on the Internet which actually explains this in a better way:
If you have programmed in a traditional, textual language before, the data-flow paradigm of Labview can be somewhat hard to embrace. The data-flow paradigm stipulates that it does not matter where on the 2D surface of the block diagram a particular component is placed in relation to other components, but what other components it is wired to. A particular component-node does not execute until all its inputs are available; it executes, however, as soon as all its inputs are available, regardless of what else might be executing at the same time, that is - in parallel with potentially many other things.
You need extensive experience with multi-threading programming in a traditional language, however, to truly appreciate how easy it is to achieve multi-threading with LabVIEW and how naturally the parallelism arises - whether intentionally, or not. While LabVIEW does not magically dissolve all the challenges related to parallel processing, it certainly makes it considerably simpler to get started with it. In fact, in contrast with the traditionally sequential textual programming languages, it is the sequential execution that comes at a price in LabVIEW - parallelism is almost free.
Taken from http://saberrobotics.org/?id=34

COM: calling from other thread causes crashes, how to make it run on the same thread?

I am writing a BHO (an extension for IE) that receives events on another thread. When I access the DOM from that other thread, IE crashes. Is it possible to have the DOM accessed from the same thread as the main BHO thread so that it does not crash?
It seems like a general COM multithreading problem, which I don't understand much.

Look into using CoMarshalInterface or CoMarshalInterThreadInterfaceInStream.
These will give you a wrapped interface to an STA COM object that is safe to call from another thread.

I don't know much about IE extensions, but it sounds like some COM object needs to be marked as Single Threaded Apartment, so that the COM runtime system ensures that it is run on the same thread that called it initially. If you can't alter the other object, you could probably route your calls to the DOM through a separate COM object marked as STA to achieve the same effect. Hope this helps... I know a bit about COM multithreading, but not much about IE extensions.

ah, fun fun fun multithreading with COM.
Gerald's answer looks right if you want to transfer an interface pointer from one thread to another exactly once. I've found that the GIT (global interface table) is a big help for this kind of thing if you're in a multithreaded system... basically you don't keep around interface pointers but rather DWORD cookies used by the GIT to get an appropriately-marshaled interface pointer for whatever thread you are using it. (you have to register the object in question with the GIT first, and unregister it later when you are done or your object is finished)

Be careful though. Performance can become a serious issue.
If you're just playing around to learn about BHOs, you can use the STA to make the object implementing ::SetSite() operate as if it were single-threaded (this allows you to let other threads pull your BHO's pointer out of the GlobalInterfaceTable, as @JasonS mentions).
If you're doing something that is expected to be part of a product, I highly recommend you very carefully consider going MTA everywhere you can and handling the concurrency and thread-safety issues yourself. In this case you would only need to ensure that the threads interoperating with your BHO COM object are themselves initialized for COM.
For example, if you want to monitor incoming/outgoing data of website looking for things (either dangerous or sensitive) - then you do NOT want to force all of those threads down the throat of an STA object because, using Yahoo as an example, more than 30 requests will launch and your BHO will start locking up IE.
