Does garbage collection lead to bad software design?

I have heard that garbage collection leads to bad software design. Is that true? It is true that we don't care about the lifetime of objects in garbage-collected languages, but does that have an effect on program design?

If an object asks other objects to do something on its behalf until further notice, the object's owner should notify it when its services are not required. Having code abandon such objects without notifying them that their services won't be required anymore would be bad design, but that would be just as true in a garbage-collected framework as in a non-GC framework.
Garbage-collected frameworks, used properly, offer two advantages:
In many cases, objects are created for the purpose of encapsulating values therein, and references to the objects are passed around as proxies for that data. Code receiving such references shouldn't care about whether other references exist to those same objects or whether it holds the last surviving references. As long as someone holds a reference to an object, the data should be kept. Once nobody needs the data anymore, it should cease to exist, but nobody should particularly notice.
In non-GC frameworks, an attempt to use a disposed object will usually generate Undefined Behavior that cannot be reliably trapped (and may allow code to violate security policies). In many GC frameworks, it's possible to ensure that attempts to use disposed resources will be trapped deterministically and cannot undermine security.
In some cases, garbage collection will allow a programmer to "get away with" designs that are sloppier than would be tolerable in a non-GC system. A GC-based framework will also, however, allow the use of many good programming patterns which could not be implemented as efficiently in a non-GC system. For example, if a program uses multiple worker threads to find the optimal solution for a problem, and has a UI thread which periodically wants to show the best solution found so far, the UI thread would want to know that when it asks for a status update it will get a solution that has been found, but won't want to burden the worker threads with the synchronization necessary to ensure that it has the absolute-latest solution.
In a non-GC system, thread synchronization would be unavoidable, since the UI thread and worker thread would have to coordinate who was going to delete a status object that becomes obsolete while it's being shown. In a GC-based system, however, the GC would be able to tell whether a UI thread was able to grab a reference to a status object before it got replaced, and thus resolve whether the object needed to be kept alive long enough for the UI thread to display it. The GC would sometimes have to force thread synchronization to find all reachable references, but occasional synchronization for the GC may pose less of a performance drain than the frequent thread synchronization required in a non-GC system.
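The status-object pattern described above might look roughly like this in Python. This is only a sketch under assumptions: `Solution`, `BestSoFar` and `run_workers` are made-up names, the scoring formula is a stand-in for real work, and the lock-free read relies on CPython rebinding a single attribute reference atomically. The workers still take a small lock among themselves so that a worse result never overwrites a better one; the point is that the reader takes none.

```python
import threading
from typing import NamedTuple, Optional


class Solution(NamedTuple):
    """Immutable snapshot of a candidate solution (hypothetical type)."""
    worker_id: int
    score: int


class BestSoFar:
    """Holds a reference to the best solution published so far."""

    def __init__(self) -> None:
        self._best: Optional[Solution] = None
        self._writer_lock = threading.Lock()  # taken by workers only

    def offer(self, candidate: Solution) -> None:
        # Workers coordinate among themselves so that a worse result
        # never replaces a better one.
        with self._writer_lock:
            if self._best is None or candidate.score > self._best.score:
                self._best = candidate  # publish by swapping one reference

    def snapshot(self) -> Optional[Solution]:
        # The reader takes no lock: it copies a single reference. The
        # object it gets stays alive until the reader drops it, even if
        # a worker replaces self._best a moment later.
        return self._best


def run_workers(n_workers: int = 4, steps: int = 1000) -> BestSoFar:
    best = BestSoFar()

    def work(worker_id: int) -> None:
        for step in range(steps):
            # Stand-in for a real search: a deterministic pseudo-score.
            score = (step * 31 + worker_id * 17) % 1009
            best.offer(Solution(worker_id, score))

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return best
```

Nobody has to decide who deletes an obsolete `Solution`: once neither the holder nor any reader refers to it, the runtime reclaims it.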


Why can I not block the main thread in WinRT (Windows Store App)?

This question is not about "should I block my main thread", as it is generally a bad idea to block a main/STA/UI thread that handles messaging and UI operations, but about why WinRT C++/CX doesn't allow any blocking of the main thread, in contrast to iOS, Android, and even C# (although await doesn't actually block).
Is there a fundamental difference in the way Android or iOS block the main thread? Why is WinRT the only platform that doesn't allow any form of blocking synchronization?
EDIT: I'm aware of co_await in VS2015, but due to backward compatibility my company still uses VS2013.
Big topic, at break-neck speed. This continues a tradition that started a long time ago in COM. WinRT inherits about all of the same concepts, it did get cleaned-up considerably. The fundamental design consideration is that thread-safety is one of the most difficult aspects of library design. And that any library has classes that are fundamentally thread-unsafe and if the consumer of the library is not aware of it then he'll easily create a nasty bug that is excessively difficult to diagnose.
This is an ugly problem for a company that relies on a closed-source business model and a 1-800 support phone number. Such phone calls can be very unpleasant, threading bugs invariably require telling a programmer "you can't do that, you'll have to rewrite your code". Rarely an acceptable answer, not at SO either :)
So thread-safety is not treated as an afterthought that the programmer needs to get right by himself. A WinRT class explicitly specifies whether or not it is thread-safe (the ThreadingModel attribute) and, if it is used in an unsafe way anyway, what should happen to make it thread-safe (the MarshalingBehavior attribute). This is mostly a runtime detail, but do note how compiler warning C4451 can even make these attributes produce a compile-time diagnostic.
The "used in an unsafe way anyway" clause is what you are asking about. WinRT can make a class that is not thread-safe safe by itself but there is one detail that it can't figure out by itself. To make it safe, it needs to know whether the thread that creates an object of the class can support the operating system provided way to make the object safe. And if the thread doesn't then the OS has to create a thread by itself to give the object a safe home. Solves the problem but that is pretty inefficient since every method call has to be marshalled.
You have to make a promise, cross-your-heart-hope-to-die style. The operating system can avoid creating a thread if your thread solves the producer-consumer problem. Better known as "pumping the message loop" in Windows vernacular. Something the OS can't figure out by itself since you typically don't start to pump until after you created a thread-unsafe object.
And just one more promise you make: you also promise that the consumer doesn't block and stop accepting messages from the message queue. Blocking is bad, implicit is that worker threads can't continue while the consumer is blocking. And worse, much worse, blocking is pretty likely to cause deadlock. The threading problem that's always a significant risk when there are two synchronization objects involved. One that you block on, the other that's hidden inside the OS that is waiting for the call to complete. Diagnosing a deadlock when you can't see the state of one of the sync objects that caused the deadlock is generally unpleasant.
Emphasis on promise, there isn't anything the OS can do if you break the promise and block anyway. It will let you, and it doesn't necessarily have to be fatal. It often isn't and doesn't cause anything more than an unresponsive UI. Different in managed code that runs on the CLR, if it blocks then the CLR will pump. Mostly works, but can cause some pretty bewildering re-entrancy bugs. That mechanism doesn't exist in native C++. Deadlock isn't actually that hard to diagnose, but you do have to track down the thread that's waiting for the STA thread to get back to business. Its stack trace tells the tale.
Do beware of these attributes when you use C++/CX. Unless you explicitly provide them, you'll create a class that's always considered thread-safe (ThreadingModel = Both, MarshalingType = Standard). An aspect that is not often actually tested, it will be the client code that ruins that expectation. Well, you'll get a phone call and you have to give an unpleasant answer :) Also note that OSX and Android are hardly the only examples of runtime systems that don't provide the WinRT guarantees, the .NET Framework does not either.
In a nutshell: because the policy for WinRT apps was "thou shalt not block the UI thread" and the C++ PPL runtime enforces this policy whilst the .NET runtime does not -- look at ppltasks.h and search for "prevent Windows Runtime STA threads from blocking the UI". (Note that although .NET doesn't enforce this policy, it lets you accidentally deadlock yourself instead.)
If you have to block the thread, there are ways to do it using Win32 IPC mechanisms (like waiting on an event that will be signaled by your completion handler) but the general guidance is still "don't do that" because it has a poor UX.

Thread Locking in Large Parallel Applications

I have a slightly more general question about parallelisation and thread-locking synchronisation in large applications. I am working on an application with a large number of object types with a deep architecture that also utilises parallelisation of most key tasks. At present synchronisation is done with thread locking management inside each object of the system. The problem is that the locking scope is only as large as each object, whereas the object attributes are being passed through many different objects, where they lose their synchronisation protection.
What is best-practice on thread management, 'synchronization contexts' &c. in large applications? It seems the only foolproof solution is to make data synchronization application wide such that data can be consumed safely by any object at any time, but this seems to violate object oriented coding concepts.
How is this problem best managed?
One approach is to make your objects read-only; a read-only object doesn't need any synchronization because there is no chance of any thread reading it while another thread writes to it (because no thread ever writes to it). Object lifetime issues can be handled using lock-free reference counting (using atomic counters for thread safety).
Of course the down side is that if you actually want to change an object's state you can't; you have to create a new object that is a copy of the old object except for the changed part. Depending on what your application does, that overhead may or may not be acceptable.
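As a rough illustration of the read-only approach, here is a minimal Python sketch. `Account` and `deposit` are hypothetical names chosen for the example; in Python the runtime already takes care of object lifetime, so the reference-counting part needs no code of its own here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Account:
    """Read-only value object: fields cannot be reassigned after creation."""
    owner: str
    balance: int


def deposit(account: Account, amount: int) -> Account:
    # No in-place mutation: build a new object that is a copy of the old
    # one except for the changed field.
    return replace(account, balance=account.balance + amount)
```

Any thread that still holds the old `Account` keeps seeing a consistent value; the cost is one new object per change, which is the overhead mentioned above.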

How does MonoTouch garbage collect?

Are details of the MonoTouch garbage collection published anywhere? I am interested in knowing how it works on the iPhone. I'd like to know:
How often it runs, and are there any constraints that might stop it running.
Whether it is completely thread safe, so objects passed from one thread to another are handled properly, or if there are constraints we should be aware of.
If there is any benefit in manually calling the garbage collector before initiating an action that will use memory.
How does it handle low memory notifications, and running out of memory.
Such information would help us understand the stacks and thread information that we have from application logs.
[Edit] I've now found the information at Hans Boehm's site, but that is very generic and lists various options and choices the implementer has, including how threads are handled. Specific MonoTouch information is what I am wanting here.
The garbage collector is the same one used in Mono, the source code is here:
It is completely thread safe, and multi-core safe, which means that multiple threads can allocate objects and it can garbage collect in the presence of multiple threads.
That being said, your question is a little bit tricky, because you are not really asking about the garbage collector when you say "so objects passed from one thread to another are handled properly, or if there are constraints we should be aware of".
That is not really a garbage collector question, but an API question, and the answer depends heavily on the API that you are calling. The rules are the same as for .NET: instance methods are never thread safe, and static methods are thread safe by default unless the API explicitly states that they are not.
Now, UI APIs like UIKit or CoreGraphics are no different from any other GUI toolkit available in the world. UI toolkits are not thread safe, so you cannot assume that a UILabel created on the main thread can safely be accessed from another thread. That is why you have to call "BeginInvokeOnMainThread" on an NSObject to ensure that any methods that you call on UIKit objects are only executed on the main thread.
That is just one example.
Check for more information
Low memory notifications are delivered by the operating system to your UIViewControllers, not to Mono's GC, so you need to take appropriate action in those cases.

Are "benaphores" worth implementing on modern OS's?

Back in my days as a BeOS programmer, I read this article by Benoit Schillings, describing how to create a "benaphore": a method of using an atomic variable to enforce a critical section that avoids the need to acquire/release a mutex in the common (no-contention) case.
I thought that was rather clever, and it seems like you could do the same trick on any platform that supports atomic-increment/decrement.
On the other hand, this looks like something that could just as easily be included in the standard mutex implementation itself... in which case implementing this logic in my program would be redundant and wouldn't provide any benefit.
Does anyone know if modern locking APIs (e.g. pthread_mutex_lock()/pthread_mutex_unlock()) use this trick internally? And if not, why not?
What your article describes is in common use today. Most often it's called "Critical Section", and it consists of an interlocked variable, a bunch of flags and an internal synchronization object (Mutex, if I remember correctly). Generally, in the scenarios with little contention, the Critical Section executes entirely in user mode, without involving the kernel synchronization object. This guarantees fast execution. When the contention is high, the kernel object is used for waiting, which releases the time slice and is conducive to faster turnaround.
Generally, there is very little sense in implementing synchronization primitives in this day and age. Operating systems come with a big variety of such objects, and they are optimized and tested in a significantly wider range of scenarios than a single programmer can imagine. It literally takes years to invent, implement and test a good synchronization mechanism. That's not to say that there is no value in trying :)
Java's AbstractQueuedSynchronizer (and its sibling AbstractQueuedLongSynchronizer) works similarly, or at least it could be implemented similarly. These types form the basis for several concurrency primitives in the Java library, such as ReentrantLock and FutureTask.
It works by way of using an atomic integer to represent state. A lock may define the value 0 as unlocked, and 1 as locked. Any thread wishing to acquire the lock attempts to change the lock state from 0 to 1 via an atomic compare-and-set operation; if the attempt fails, the current state is not 0, which means that the lock is owned by some other thread.
AbstractQueuedSynchronizer also facilitates waiting on locks and notification of conditions by maintaining CLH queues, which are lock-free linked lists representing the line of threads waiting either to acquire the lock or to receive notification via a condition. Such notification moves one or all of the threads waiting on the condition to the head of the queue of those waiting to acquire the related lock.
Most of this machinery can be implemented in terms of an atomic integer representing the state as well as a couple of atomic pointers for each waiting queue. The actual scheduling of which threads will contend to inspect and change the state variable (via, say, AbstractQueuedSynchronizer#tryAcquire(int)) is outside the scope of such a library and falls to the host system's scheduler.

Thread safety... what's my "best" course of action?

I'm wondering what is the "best" way to make data thread-safe.
Specifically, I need to protect a linked-list across multiple threads -- one thread might try to read from it while another thread adds/removes data from it, or even frees the entire list. I've been reading about locks; they seem to be the most commonly used approach, but apparently they can be problematic (deadlocks). I've also read about atomic-operations as well as thread-local storage.
In your opinion, what would be my best course of action? What's the approach that most programmers use, and for what reason?
One approach that is not heavily used, but quite sound, is to designate one special purpose thread to own every "shared" structure. That thread generally sits waiting on a (thread-safe;-) queue, e.g. in Python a Queue.Queue instance, for work requests (reading or changing the shared structure), including both ones that request a response (they'll pass their own queue on which the response is placed when ready) and ones that don't. This approach entirely serializes all access to the shared resource, remaps easily to a multi-process or distributed architecture (almost brainlessly, in Python, with multiprocessing;-), and absolutely guarantees soundness and lack of deadlocks as well as race conditions as long as the underlying queue object is well-programmed once and for all.
It basically turns the hell of shared data structures into the paradise of message-passing concurrency architectures.
OTOH, it may be a tad higher-overhead than slugging it out the hard way with locks &c;-).
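A minimal sketch of that owner-thread idea in Python 3 (where the module is spelled `queue`). The names `ListOwner`, `cast` and `call` are invented for this example, and error handling is left out: if a request raises, the owner thread dies and later `call`s would wait forever.

```python
import queue
import threading

_STOP = object()  # sentinel asking the owner thread to exit


class ListOwner:
    """One dedicated thread owns the shared list; others send it requests."""

    def __init__(self) -> None:
        self._requests: queue.Queue = queue.Queue()
        self._items: list = []  # touched only by the owner thread
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self) -> None:
        while True:
            request = self._requests.get()
            if request is _STOP:
                break
            func, reply = request
            result = func(self._items)  # all access is serialized here
            if reply is not None:
                reply.put(result)

    def cast(self, func) -> None:
        """Request that needs no response."""
        self._requests.put((func, None))

    def call(self, func):
        """Request that passes its own queue and waits for the response."""
        reply: queue.Queue = queue.Queue(maxsize=1)
        self._requests.put((func, reply))
        return reply.get()

    def stop(self) -> None:
        self._requests.put(_STOP)
        self._thread.join()
```

Usage would be along the lines of `owner.cast(lambda items: items.append(42))` followed by `owner.call(lambda items: list(items))`; returning a copy rather than the list itself keeps the structure private to its owner.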
You could consider an immutable collection. A string in .NET has methods such as Replace and Insert that don't modify the string but instead create a new one; a LinkedList collection can be designed to be immutable in the same way. In fact, a LinkedList is fairly simple to implement this way compared to some other collection data structures.
Here's a link to a blog post discussing immutable collections and a link to some implementations in .NET.
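To give a feel for how simple an immutable linked list can be, here is a small Python sketch (not the .NET implementations referred to above). `Node`, `push`, `remove` and `iterate` are names made up for the example, and the recursive `remove` is only suitable for short lists because of Python's recursion limit.

```python
from typing import Any, Iterator, NamedTuple, Optional


class Node(NamedTuple):
    """Immutable list cell; None stands for the empty list."""
    value: Any
    next: Optional["Node"]


def push(head: Optional[Node], value: Any) -> Node:
    # The new list shares every existing node with the old one.
    return Node(value, head)


def remove(head: Optional[Node], value: Any) -> Optional[Node]:
    """Return a list without the first occurrence of value."""
    if head is None:
        return None
    if head.value == value:
        return head.next  # reuse the whole tail unchanged
    # Copy only the nodes in front of the removed one.
    return Node(head.value, remove(head.next, value))


def iterate(head: Optional[Node]) -> Iterator[Any]:
    while head is not None:
        yield head.value
        head = head.next
```

A reader that holds an old head can keep walking it safely while other threads build new versions; publishing the current head is then just a matter of swapping one reference.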
Always remember the most important rule of thread safety: know all the critical sections of your code inside out, like your ABCs. Only if you can identify them at once when asked will you know which areas need your thread-safety mechanisms.
After that, remember the rules of thumb:
Look out for all your global variables / variables on the heap.
Make sure your subroutines are
Make sure access to shared data is
Make sure there are no indirect accesses through pointers.
(I'm sure others can add more.)
The "best" way, from a safety point of view, is to put a lock on the entire data structure, so that only one thread can touch it at a time.
Once you decide to lock less than the entire structure, presumably for performance reasons, the details of doing this are messy and differ for every data structure, and even variants of the same structure.
My suggestion is to
Start with a global lock on your data structure. Profile your program to see if it's really a problem.
If it is a problem, consider whether there's some other way to distribute the problem. Can you minimize the amount of data in the data structure in question, so that it need not be accessed so often or for so long? If it's a queuing system, for example, perhaps you can keep a local queue per thread, and only move things into or out of a global queue when a local queue becomes over- or under-loaded.
Look at data structures designed to help reduce contention for the particular type of thing you're doing, and implement them carefully and precisely, erring on the side of safety. For the queuing example, work-stealing queues might be what you need.
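The first step above, one global lock around the whole structure, could be sketched like this in Python. `LockedList` is a hypothetical wrapper, and a `deque` stands in for the linked list.

```python
import threading
from collections import deque


class LockedList:
    """Every operation takes the same lock, so only one thread touches
    the underlying structure at a time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._items = deque()

    def append(self, value) -> None:
        with self._lock:
            self._items.append(value)

    def remove(self, value) -> bool:
        with self._lock:
            try:
                self._items.remove(value)
                return True
            except ValueError:
                return False  # value was not present

    def clear(self) -> None:
        with self._lock:
            self._items.clear()

    def snapshot(self) -> list:
        # Readers get a private copy, so they never iterate over the
        # structure while another thread is changing it.
        with self._lock:
            return list(self._items)
```

Only if profiling shows this lock to be a real bottleneck is it worth moving on to the finer-grained options.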
