Premature leak in constructor - multithreading

The Java docs state the following regarding synchronization of constructors:
Note that constructors cannot be synchronized — using the synchronized keyword with a constructor is a syntax error. Synchronizing constructors doesn't make sense, because only the thread that creates an object should have access to it while it is being constructed.
Warning: When constructing an object that will be shared between
threads, be very careful that a reference to the object does not
"leak" prematurely. For example, suppose you want to maintain a List
called instances containing every instance of class. You might be
tempted to add the following line to your constructor:
instances.add(this); But then other threads can use instances to
access the object before construction of the object is complete.
I am not able to understand this whole block. First it states that only the thread that creates an object has access to constructor. Then it warns of a premature leak, which may cause issues if other threads access the object before construction is complete. Aren't these two things in contradiction? If only the creating thread can access the constructor, then how can other threads prematurely access the object, given that it can only be accessed once the constructor has run fully?
Any input would be of great help.

Imagine two threads that both have access to a global List (called "instances") holding instances of the class in question. Thread 1 continuously cycles through the list and does something with each instance. Thread 2 goes its own merry way, and occasionally constructs a new instance of the class. If the class added itself to the List in its constructor (using instances.add(this)), Thread 1 would immediately get access to the instance and could do things with it before it is fully constructed, resulting in unpredictable behavior.
There may be a misunderstanding of the word "should". You wrote: "First it states that only the thread that creates an object has access to constructor. " However, the Java docs say: "only the thread that creates an object should have access to it while it is being constructed", which means that you should take care that only one thread has access to the object while it is being constructed.
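To make the scenario concrete, here is a minimal sketch of such a leak. The names LeakDemo and Widget and the two latches are made up for illustration; the latches only hold the constructor open long enough for the second thread to look, so that the race shows up every time instead of once in a blue moon.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

final class LeakDemo {
    // Hypothetical shared registry, standing in for the "instances" list from the docs.
    static final List<Widget> instances = new CopyOnWriteArrayList<>();

    static final class Widget {
        String name; // deliberately non-final so the half-built state is observable

        // Leaky constructor: publishes "this" before the object is fully initialized.
        Widget(String name, CountDownLatch published, CountDownLatch observed) {
            instances.add(this);     // the premature leak
            published.countDown();   // tell the observer the reference is now visible
            awaitQuietly(observed);  // artificially hold the window open for the demo
            this.name = name;        // construction only completes here
        }
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns what a second thread saw in Widget.name while the constructor was
    // still running, followed by the value after construction finished.
    static String observeDuringConstruction() {
        instances.clear();
        CountDownLatch published = new CountDownLatch(1);
        CountDownLatch observed = new CountDownLatch(1);
        String[] seen = {"not observed"};

        Thread observer = new Thread(() -> {
            awaitQuietly(published);
            seen[0] = String.valueOf(instances.get(0).name); // reads the half-built object
            observed.countDown();
        });
        observer.start();

        Widget w = new Widget("ready", published, observed); // blocks until the observer has looked
        try {
            observer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0] + " -> " + w.name;
    }

    public static void main(String[] args) {
        System.out.println(observeDuringConstruction()); // prints "null -> ready"
    }
}
```

The usual way out is to keep the constructor free of instances.add(this) and register the object only after new has returned, for example from a static factory method.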

Related

Why pass parameters through thread function?

When I create a new thread in a program... in its thread handle function, why do I pass variables that I want that thread to use through the thread function prototype as parameters (as a void pointer)? Since threads share the same memory segments (except for stack) as the main program, shouldn't I be able to just use the variables directly instead of passing parameters from main program to new thread?
Well, yes, you could use the variables directly. Maybe. Assuming that they aren't changed by some other thread before your thread starts running.
Also, a big part of passing parameters to functions (including thread functions) is to limit the amount of information the called function has to know about the outside world. If you pass the thread function everything it needs in order to do its work, then you can change the rest of the program with relative impunity and the thread will still continue to work. If, however, you force the thread to know that there is a global list of strings called MyStringList, then you can't change that global list without also affecting the thread.
Information hiding. Encapsulation. Separation of concerns. Etc.
You cannot pass parameters to a thread function in any kind of normal register/stack manner because thread functions are not called by the creating thread. They are given execution directly by the underlying OS, and the APIs that do this copy a fixed number of parameters (usually only one void pointer) to the new and different stack of the new thread.
As Jim says, failure to understand this mechanism often results in disaster. There are numerous questions on SO where the variables that developers hoped a new thread would use are destroyed by RAII before the new thread even starts.
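The question is about native threads and a void pointer, but the same idea can be sketched in Java, where the Runnable's constructor plays the role of that single pointer argument. ThreadArgsDemo and SumTask are hypothetical names. The point of the sketch is that the worker owns a private copy of its input, so it no longer matters what the creating thread does with its own data afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ThreadArgsDemo {
    // The worker receives everything it needs through its constructor,
    // loosely analogous to the single void* handed to a native thread function.
    static final class SumTask implements Runnable {
        private final List<Integer> input; // private, unmodifiable snapshot
        private volatile int result;

        SumTask(List<Integer> input) {
            // Copy the data so later changes by the caller cannot affect this task.
            this.input = Collections.unmodifiableList(new ArrayList<>(input));
        }

        @Override
        public void run() {
            int sum = 0;
            for (int v : input) {
                sum += v;
            }
            result = sum;
        }

        int result() {
            return result;
        }
    }

    static int sumOnWorker() {
        List<Integer> shared = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        SumTask task = new SumTask(shared);
        shared.clear(); // the creator reuses its data before the thread even starts

        Thread t = new Thread(task);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return task.result();
    }

    public static void main(String[] args) {
        System.out.println(sumOnWorker()); // prints 10
    }
}
```

Had SumTask kept a reference to the caller's list instead of copying it, the clear() call would have raced with the worker, which is the Java cousin of the destroyed-before-start problem described above.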

How to automatically initialize / uninitialize something globally for every thread?

I have a unit with an initialization and finalization section. This unit contains a complex object which is instantiated in the initialization and destroyed in the finalization. However, this object also contains an ADO Connection. That makes it an issue when using this across threads, because ADO is COM, and needs to be initialized for every thread.
This is how I currently handle this global object instance:
uses
  ActiveX;
...
initialization
  CoInitialize(nil);
  _MyObject := TMyObject.Create;
finalization
  _MyObject.Free;
  CoUninitialize;
end.
This only works on the main thread. Any other thread that tries to use it gets the exception "CoInitialize has not been called".
How do I get around this to make this unit thread-safe? I would need a way to hook every creation/destruction of any thread created, and each thread would need to refer to a different instance of this object. But how to go about doing so?
Well, as you already say yourself, each thread needs to call CoInitialize separately. And in addition, each thread needs to have its own ADOConnection too.
I think you need to leave the idea of using the single global object/connection from that unit. Just repeat that object creation and destruction in each thread. When the thread types are different, then you could design a base thread class on top of them. If the object is too big (has overhead with regard to the thread) or does not 'fit' completely in the thread, then split the object design.
For now, your question sounds like just wanting to keep convenience, but if it is really necessary to centralize the ADO connection involvement, then maybe you could implement multi-cast events for the connection events of both main thread and the other threads. Logging in should not be a problem for successive connections: just store the login values and feed them to the threads.
While another design might be a better solution, you can declare _MyObject as threadvar to have a separate instance for each thread. In addition, you can move the CoInitialize/CoUninitialize calls into the constructor/destructor of TMyObject.
I cannot give advice on when to create and free these instances as I have no idea how your threads are created and freed.
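For readers who do not use Delphi, the "one instance per thread" idea behind threadvar can be sketched in Java with ThreadLocal. This is only an analogue: there is no COM here, and PerThreadDemo and Connection are hypothetical stand-ins for the unit and its connection-owning object.

```java
final class PerThreadDemo {
    // Hypothetical stand-in for the object that owns a thread-affine connection.
    static final class Connection {
        // Records which thread created this instance (per-thread setup would go here).
        final String owner = Thread.currentThread().getName();
    }

    // Each thread that calls CONNECTION.get() lazily receives its own instance.
    static final ThreadLocal<Connection> CONNECTION = ThreadLocal.withInitial(Connection::new);

    static boolean eachThreadGetsItsOwn() {
        Connection mine = CONNECTION.get();
        Connection[] theirs = new Connection[1];

        Thread t = new Thread(() -> {
            try {
                theirs[0] = CONNECTION.get(); // created on first use in this thread
            } finally {
                // Drops this thread's instance; a real version would close it first.
                CONNECTION.remove();
            }
        }, "worker");
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }

        return mine == CONNECTION.get()          // same thread, same instance
                && theirs[0] != null
                && theirs[0] != mine             // other thread, different instance
                && "worker".equals(theirs[0].owner);
    }

    public static void main(String[] args) {
        System.out.println(eachThreadGetsItsOwn()); // prints true
    }
}
```

As with threadvar, this does nothing on its own about when a thread's instance is torn down; the try/finally in the worker is where that decision has to be made.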

Is it safe to call Dispose on an instance from event handler?

public class MyTask : IDisposable { ... }

MyTask task = new MyTask(() => SomeTask);
task.Completed += (s, e) =>
{
    // do something with result
    ...
    // dispose of this instance
    ((MyTask)s).Dispose();
};
// execute the task
task.Execute();
Clearly I cannot tell when the task will be completed, so the only place where, as I see it, I can dispose of this instance is in the Completed event.
Is this safe to do?
There is, alas, no general rule as to when it is safe to call Dispose. If Microsoft had specified that Dispose must be safe to call at any time when an object isn't in use, complying with such a rule would seldom have been difficult; in cases where a class might not always be able to perform all necessary cleanup immediately(*), it would generally be possible for it to set a flag and/or otherwise arrange to have necessary cleanup performed at the next opportunity. Unfortunately, Microsoft does not specify that Dispose implementations have to handle asynchronous Dispose requests, nor is there any general way for an object which holds the last useful reference to an IDisposable instance to ask for notification when it would be safe to dispose.
Despite the general lack of assurance as to when it is safe to call Dispose, many particular classes which implement Dispose do offer guarantees as to when it may safely be called. If one knows that a particular object is of a type which can be safely disposed in a particular context, one may dispose it then. Especially in cases where an event from an object may be the only opportunity to Dispose it in a threading context it could know about, and where disposing an object within an event handler would make sense, it should be safe to dispose of the object. Any properly-written event handlers should be prepared for the possibility that the object sending the event may be disposed between the time the system decides that they should run, and the time it actually runs them.
(*) The essential purpose of IDisposable is to allow an object to notify entities which are outside it but are acting on its behalf to the detriment of other entities, that they should no longer do so [e.g. to tell a file system that it should no longer grant an object exclusive access to a file]. Such action is referred to as "releasing resources". The fact that someone holds the last surviving reference to an object may imply that no other thread can be using that object, but does not imply that no other thread is using any non-thread-safe entities whose resources need to be released.
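The "set a flag and clean up at the next opportunity" approach mentioned above can be sketched as follows. This is a Java analogue, with AutoCloseable standing in for IDisposable, and DeferredCloseTask is a made-up class, not how any particular .NET type behaves. If close() arrives while the completion handler is still running, the actual release is postponed until the handler has returned.

```java
import java.util.function.Consumer;

final class DeferredCloseTask implements AutoCloseable {
    private final Consumer<DeferredCloseTask> onCompleted; // plays the role of the Completed event
    private boolean running;
    private boolean closeRequested;
    private boolean released;
    private int releases; // how many times the real cleanup actually ran

    DeferredCloseTask(Consumer<DeferredCloseTask> onCompleted) {
        this.onCompleted = onCompleted;
    }

    void execute() {
        synchronized (this) {
            if (released) {
                throw new IllegalStateException("task already closed");
            }
            running = true;
        }
        try {
            onCompleted.accept(this); // the handler may call close() on us
        } finally {
            boolean releaseNow;
            synchronized (this) {
                running = false;
                releaseNow = closeRequested;
            }
            if (releaseNow) {
                release(); // deferred cleanup, now that the handler has returned
            }
        }
    }

    @Override
    public void close() {
        synchronized (this) {
            if (running) {
                closeRequested = true; // too early: just remember the request
                return;
            }
        }
        release();
    }

    private synchronized void release() {
        if (!released) { // idempotent: repeated calls do nothing
            released = true;
            releases++;
        }
    }

    synchronized boolean isReleased() {
        return released;
    }

    synchronized int releases() {
        return releases;
    }

    static String demo() {
        StringBuilder log = new StringBuilder();
        DeferredCloseTask task = new DeferredCloseTask(t -> {
            t.close(); // "dispose" from inside the event handler
            log.append("inHandler=").append(t.isReleased());
        });
        task.execute();
        task.close(); // a second close is harmless
        log.append(" afterExecute=").append(task.isReleased())
           .append(" releases=").append(task.releases());
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "inHandler=false afterExecute=true releases=1"
    }
}
```

Whether a given real class offers this kind of guarantee is exactly what its documentation has to tell you; the sketch only shows that providing it is not hard.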

Is synchronization needed inside FinalConstruct()/FinalRelease()?

In my free-threaded in-proc COM object using ATL I want to add a member variable that will be set only in FinalConstruct() and read only in FinalRelease(). No other code will ever manipulate that member variable.
I doubt whether I need synchronization when accessing that member variable. I carefully read the ATL sources, and it looks like those methods are always called no more than once and therefore from one thread only.
Is that a correct assumption? Can I omit synchronization?
Yes, the assumption is correct. Think of it as an extension of the C++ constructor and destructor. In theory, you could call a method on a COM object from a different thread, while FinalRelease() is executing. Although, that is undefined behaviour, and not an expected occurrence. You shouldn't try to protect yourself from it, just as you wouldn't try to protect yourself from other threads in a destructor. If you have to protect yourself in the destructor, the design is generally broken (it would indicate that you do not have a proper termination protocol between your threads).
The only way FinalRelease() could be called from another thread is when the client code does not hold a valid reference to your object, or when some other part of your code releases twice. This is a hard error, and will probably end up in a crash anyway, totally unrelated to any synchronization errors you might have. The ATL code for managing the object reference count is thread safe, and will not leave any race conditions open.
As for FinalConstruct(), a reference to the object is not returned to any client before FinalConstruct() has finished with a successful return code.
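The reasoning can be illustrated with a small reference-counting sketch. It is written in Java as an analogue only and is not ATL's implementation; RefCountedDemo and its members are hypothetical. With an atomic counter, only the thread that drops the count to zero runs the final cleanup, so a member written during construction and read during that cleanup needs no lock of its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RefCountedDemo {
    static final class RefCounted {
        private final AtomicInteger refs = new AtomicInteger(1); // creator holds the first reference
        private final String resource; // written only during construction
        final AtomicInteger finalReleaseCalls = new AtomicInteger();

        RefCounted(String resource) {
            this.resource = resource;
        }

        void addRef() {
            refs.incrementAndGet();
        }

        void release() {
            // Only the thread that drops the count to zero runs the final cleanup.
            if (refs.decrementAndGet() == 0) {
                finalRelease();
            }
        }

        private void finalRelease() {
            // No lock needed: exactly one thread gets here, after every other release.
            if (resource != null) {
                finalReleaseCalls.incrementAndGet();
            }
        }
    }

    // Hands one reference to each worker thread, releases them all concurrently,
    // and reports how many times the final cleanup ran.
    static int finalReleaseCallsAfterConcurrentRelease(int threads) {
        RefCounted obj = new RefCounted("handle");
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            obj.addRef(); // taken on behalf of the worker before it starts
            workers[i] = new Thread(obj::release);
        }
        for (Thread w : workers) {
            w.start();
        }
        obj.release(); // the creator drops its own reference
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return obj.finalReleaseCalls.get();
    }

    public static void main(String[] args) {
        System.out.println(finalReleaseCallsAfterConcurrentRelease(8)); // prints 1
    }
}
```

A double release breaks this guarantee in the sketch just as it does in COM, which is the hard error described above and not something a lock inside the cleanup method would fix.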

Are class-level properties or variables thread safe?

I always had this specific scenario worry me for eons. Let's say my class looks like this
public class Person {
    public Address Address { get; set; }
    public string someMethod()
    {}
}
My question is: I was told by my fellow developers that the Address property, of type Address, is not thread safe.
From a web request perspective, every request runs on a separate thread, and every time the thread processes the following line in my business object or code-behind, for example
var p = new Person();
it creates a new instance of the Person object on the heap, so the instance is accessed only by the requesting thread, unless I spawn multiple threads in my application.
If I am wrong, please explain to me why I am wrong and why the public property (Address) is not thread safe?
Any help will be much appreciated.
Thanks.
If the reference to your Person instance is shared among multiple threads then multiple threads could potentially change Address causing a race condition. However unless you are holding that reference in a static field or in Session (some sort of globally accessible place) then you don't have anything to be worried about.
If you are creating references to objects in your code like you have shown above (var p = new Person();) then you are perfectly thread safe, as other threads will not be able to access the reference to these objects without resorting to nasty and malicious tricks.
Your property is not thread safe, because you have no locking to prevent multiple writes to the property stepping on each other's toes.
However, in your scenario where you are not sharing an instance of your class between multiple threads, the property doesn't need to be thread safe.
When objects are shared between multiple threads and each thread can change their state, all state changes need to be protected so that only one thread at a time can modify the object.
You should be fine with this, however there are a few things I'd worry about...
If your Person object were modified by several threads, or held some disposable resources, one of the threads could find the object in an inconsistent or unusable state when it reads it. To prevent this, you will need to lock the object before reading or writing it, to ensure it won't be trampled on by other threads. The easiest way is by using the lock{} construct.
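Here is a rough sketch of what that locking looks like when a Person really is shared. It is in Java, where a synchronized block plays the role of C#'s lock, and SharedPersonDemo with its moveTo and moves members is invented for the example. Every read and write of the shared state goes through the same lock object.

```java
final class SharedPersonDemo {
    static final class Person {
        private final Object lock = new Object(); // guards address and moves together
        private String address = "";
        private int moves;

        void moveTo(String newAddress) {
            synchronized (lock) { // only one thread at a time may change the state
                address = newAddress;
                moves++;
            }
        }

        String address() {
            synchronized (lock) {
                return address;
            }
        }

        int moves() {
            synchronized (lock) {
                return moves;
            }
        }
    }

    // Several threads update one shared Person; returns the recorded number of moves.
    static int movesAfterConcurrentUpdates(int threads, int perThread) {
        Person shared = new Person();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    shared.moveTo("street " + id + "/" + n);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.moves();
    }

    public static void main(String[] args) {
        System.out.println(movesAfterConcurrentUpdates(4, 1000)); // prints 4000
    }
}
```

Without the synchronized blocks the increment could lose updates and the count would come out short on some runs. For a Person created per request and never shared, none of this is needed.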

Resources