Multi-threading best practices - multithreading

I have an application I've written in C#, although any similar language would apply here.
The application has the job of drawing a graphical scene to a window on a Form in real-time based on data it receives over various UDP and TCP sockets. Each UDP and TCP connection uses its own thread: these threads each modify various objects in memory which in turn modify the graphical display. I also have a user interface thread which is capable of receiving user events (button clicks, etc) which in turn modify those same objects and the display. Finally, I also have many timers that I fire which launch their own threads which modify those same objects and the display.
The objects in memory that are being modified consist of about 15 different classes.
Everything works pretty reliably, but with all of those different classes being modified by different threads, I've had to add a lot of synchronization locks. I've had to look at each class individually to determine which memory might be altered by more than one thread.
It seems very easy in this situation to miss one of those spots: to forget to add synchronization somewhere it's needed.
I'm curious as to whether others would implement this the way I did, or if there's some more elegant way: perhaps somehow putting all of the modification of class A on its own thread or something?
(P.S. I'm deathly afraid of asking a question here after things didn't go so well the first time. But I don't think my query here is super-obvious so I'm hoping you won't either. ;o)

I believe there is no straight-forward answer for this.
I have helped other to change the design to deal with similar situation. One of the most commonly used technique is to introduce a better abstraction.
For example, Assume that you have multiple thread that needs to update a Map containing Users, and another Set containing active user, instead of having locks for the User Map and Active User Set and have your threads acquire the locks manually, I'll suggest introducing an abstraction call UserRepository, in which contains the User map and Active User Set. UserRepository will provide some business-meaningful methods for other to manipulate the UserRepository. Locks are acquired in the methods of UserRepository, instead by the caller explicitly.
From my past experience, over 80% of complicated synchronization can be greatly simplified by having better design like the above mentioned example.
There are also other technique possible. For example, if the update is ok to do asynchronously, instead of having your threads update the resources directly, you may create command objects and put in a producer-consumer queue, and have a dedicate thread performing the update.
Also sometimes it is much easier to handle to have fewer locks. For example, when updating several resources, instead of having one lock for each resource, we can see the update as a whole action, and use only one lock for the coordination between threads. Of course it will increase contention, but there are cases that contention is not a big problem but we want maintainability instead.
I believe there are lots of other way to deal with similar situation, I am just sharing some of my previous experiences (which worked :P )

Related

How to synchronize insert/removal of elements to/from a data structure, the Functional Way?

I have a data structure, say a MinHeap. It has methods like peek(), removeElement() and addElement(). removeElement() and addElement() can produce inconsistent states if they are not made thread safe (because they involve increasing/decreasing the currentHeapSize).
Now, I want to implement this data structure, the functional way. I have read that in functional programming immutability is the key which leads to thread safety. How do I implement that here? Should I avoid incrementing/decrementing the currentHeapSize? If so, how? I would like some direction with this.
Edit #1
#YuvalItzchakov and #Dima have pointed out saying that I need to return a new collection everytime I do an insert/delete, which makes sense. But wouldn't that hamper the performance critically?
My use case is that I will be getting a stream of data and I keep adding it to the heap. When ever someone requests data, the root of the min heap is returned. So here insertion happens very rapidly. Wouldn't creating a new Heap for every insert prove to be costly? I think it would. If so, how does functional programming really help? Is it just a theoretical concept or does it have practical implications as well?
The problem of parallel access to the same data structure is twofold. First, we need to serialize parallel updates. #Tim gave a comprehensive answer to this. Second, in the case there are many readers, we may want to allow them to read in parallel with writing. And in this case immutability plays its role. Without it, writers have to wait the readers to finish.
There isn't really a "functional" way to have a data structure that can be updated by multiple threads. In fact one reason that functional programming is so good in a multi-threaded environment is because there aren't any shared data structures like this.
However in the real world this problem comes up all the time, so you need some way to serialise access to the shared data structure. The most crude way is simply to put a big lock around the whole code and only allow one thread to run at once (e.g. with a Mutex). With clever design this can be made reasonably efficient, but it can be difficult to get right and complicated to maintain.
A more sophisticated approach is to have a thread-safe queue of requests to your data structure and a single worker thread that processes these requests one-by-one. One popular framework that supports this model is Akka Actors. You wrap your data structure in an Actor which then receives requests to read or modify the data structure. The Akka framework will ensure that only one message is processed at once.
In your case your the actor would manage the heap and receive updates from the stream which would go into the heap. Other threads can then make requests that will be processes in a sequential, thread-safe way. It is best if these request perform specific queries on the heap, rather than just returning the whole heap every time.
You may use cats Ref type class
https://typelevel.org/cats-effect/concurrency/ref.html
But it is that AtomicReference realization, or write some java.util.concurent.ConcurentHashMap wrapper

Why is it wrong to access GUI elements from another thread? [duplicate]

This question already has answers here:
Why are most UI frameworks single threaded?
(6 answers)
Closed 7 years ago.
In every GUI library I've used (Swing, Android, Windows Forms, WPF) there's this golden rule saying that one cannot access or modify GUI elements from another thread (other than the GUI thread). I suppose this rule applies to any GUI library. Breaking this rule will most likely cause application to crash. However, I've been wondering recently, why is it so? I couldn't find any profound explanation. So what is the low-level explanation of this rule?
No piece of software is thread-safe unless it is explicitly designed and build to be so.
A GUI is a complex and stateful beast, making it thread-safe would be 'prohibitively expensive'.
There is a very simple reason for this. Usually UI functions are not thread-safe (as making them thread-safe would pessimize performance).
Of those you listed, some may be wrappers around existing mechanisms, so you have to answer the question indirectly via the underlying GUI framework. In case of multi-platform GUI frameworks like e.g. Qt, you will also have the lowest-common denominator that determines what is possible and what isn't.
Now, why is access to the GUI not thread-safe? In the cases where I'm most familiar with (win32 and X11), accesses are often performed indirectly by sending requests and sometimes waiting for the according answer. This usually works in an atomic way, even across process boundaries, so that is not directly cause of the problem. However, if you do so from multiple threads, the worst that can happen is that data is modified in an uncoordinated way. For example, if you read, modify and write the same widget from two threads, these operations might be interleaved, so that only one thread's modifications will actually be applied.
There are other reasons for not supporting cross-thread access:
In win32, the queue with the messages is thread-local, which means that only the thread that created a window will actually find and be able to handle messages for that window. I guess this a legacy from times where processes were single-threaded and the message queue was simply a global. Making it thread-local is the same approach as the one used for making errno thread-safe.
Another reason is that support objects are created inside a process that represent some GUI element. For example, the MFC (on top of win32) use a map from the OS' widget handle to a C++ object representing that object. That map is stored in thread-local storage (which follows the thread-local message queue) and the access to the C++ objects is not guarded by a mutex. Accessing these objects from different threads is bad, not because they represent GUI objects but because they are not synchronized, simple as that.
If you think about modifying the structure of a widget tree (like e.g. the DOM tree in a browser), you either have very detailed knowledge of what other parts of the application are doing or you need to lock access to the whole tree before every operation just to be safe. Needless to say, this effectively prevents any parallel operations, so you can also take the next step and require all operations to come from one thread and thus save the whole multithreading overhead.
That said, I believe that Qt and C# (and probably others) actually do support some cross-thread operations. They will work some (more or less obscure) magic that forwards the calls to the GUI thread and forwards the results back to the calling thread again. In other words, they try to make the necessary inter-thread communication more convenient for the programmer, while retaining the efficiency and simplicity of the single-threaded GUI. This is not restricted to GUI handling though but rather a general approach, only that it is especially important for the GUI.
As far as I know, that is simply not true: Every object in Java might be accesed concurrently, as far as thread-safe techniques are correctly applied. The fact is that Java Swing objects are mostly not prepared for multithreading, so you'll have to perform external synchronization.
There are several instances in which you need several threads to interoperate in a GUI: Games, visual effects, user events...
More information about the GUI and multithreading:
https://docs.oracle.com/javase/tutorial/uiswing/concurrency/dispatch.html

Why must/should UI frameworks be single threaded?

Closely related questions have been asked before:
Why are most UI frameworks single threaded?.
Should all event-driven frameworks be single-threaded?
But the answers to those questions still leave me unclear on some points.
The asker of the first question asked if multi-threading would help performance, and the answerers mostly said that it would not, because it is very unlikely that the GUI would be the bottleneck in a 2D application on modern hardware. But this seems to me a sneaky debating tactic. Sure, if you have carefully structured your application to do nothing other than UI calls on the UI thread you won't have a bottleneck. But that might take a lot of work and make your code more complicated, and if you had a faster core or could make UI calls from multiple threads, maybe it wouldn't be worth doing.
A commonly advocated architectural design is to have view components that don't have callbacks and don't need to lock anything except maybe their descendants. Under such an architecture, can't you let any thread invoke methods on view objects, using per-object locks, without fear of deadlock?
I am less confident about the situation with UI controls, but as long their callbacks are only invoked by the system, why should they cause any special deadlock issues? After all, if the callbacks need to do anything time consuming, they will delegate to another thread, and then we're right back in the multiple threads case.
How much of the benefit of a multi-threaded UI would you get if you could just block on the UI thread? Because various emerging abstractions over async in effect let you do that.
Almost all of the discussion I have seen assumes that concurrency will be dealt with using manual locking, but there is a broad consensus that manual locking is a bad way to manage concurrency in most contexts. How does the discussion change when we take into consideration the concurrency primitives that the experts are advising us to use more, such as software transactional memory, or eschewing shared memory in favor of message passing (possibly with synchronization, as in go)?
TL;DR
It is a simple way to force sequencing to occur in an activity that is going to ultimately be in sequence anyway (the screen draw X times per second, in order).
Discussion
Handling long-held resources which have a single identity within a system is typically done by representing them with a single thread, process, "object" or whatever else represents an atomic unit with regard to concurrency in a given language. Back in the non-emptive, negligent-kernel, non-timeshared, One True Thread days this was managed manually by polling/cycling or writing your own scheduling system. In such a system you still had a 1::1 mapping between function/object/thingy and singular resources (or you went mad before 8th grade).
This is the same approach used with handling network sockets, or any other long-lived resource. The GUI itself is but one of many such resources a typical program manages, and typically long-lived resources are places where the ordering of events matters.
For example, in a chat program you would usually not write a single thread. You would have a GUI thread, a network thread, and maybe some other thread that deals with logging resources or whatever. It is not uncommon for a typical system to be so fast that its easier to just put the logging and input into the same thread that makes GUI updates, but this is not always the case. In all cases, though, each category of resources is most easily reasoned about by granting them a single thread, and that means one thread for the network, one thread for the GUI, and however many other threads are necessary for long-lived operations or resources to be managed without blocking the others.
To make life easier its common to not share data directly among these threads as much as possible. Queues are much easier to reason about than resource locks and can guarantee sequencing. Most GUI libraries either queue events to be handled (so they can be evaluated in order) or commit data changes required by events immediately, but get a lock on the state of the GUI prior to each pass of the repaint loop. It doesn't matter what happened before, the only thing that matters when painting the screen is the state of the world right then. This is slightly different than the typical network case where all the data needs to be sent in order and forgetting about some of it is not an option.
So GUI frameworks are not multi-threaded, per se, it is the GUI loop that needs to be a single thread to sanely manage that single long-held resource. Programming examples, typically being trivial by nature, are often single-threaded with all the program logic running in the same process/thread as the GUI loop, but this is not typical in more complex programs.
To sum up
Because scheduling is hard, shared data management is even harder, and a single resource can only be accessed serially anyway, a single thread used to represent each long-held resource and each long-running procedure is a typical way to structure code. GUIs are only one resource among several that a typical program will manage. So "GUI programs" are by no means single-threaded, but GUI libraries typically are.
In trivial programs there is no realized penalty to putting other program logic in the GUI thread, but this approach falls apart when significant loads are experienced or resource management requires either a lot of blocking or polling, which is why you will often see event queue, signal-slot message abstractions or other approaches to multi-threading/processing mentioned in the dusty corners of nearly any GUI library (and here I'm including game libraries -- while game libs typically expect that you want to essentially build your own widgets around your own UI concept, the basic principles are very similar, just a bit lower-level).
[As an aside, I've been doing a lot of Qt/C++ and Wx/Erlang lately. The Qt docs do a good job of explaining approaches to multi-threading, the role of the GUI loop, and where Qt's signal/slot approach fits into the abstraction (so you don't have to think about concurrency/locking/sequencing/scheduling/etc very much). Erlang is inherently concurrent, but wx itself is typically started as a single OS process that manages a GUI update loop and Erlang posts update events to it as messages, and GUI events are sent to the Erlang side as messages -- thus permitting normal Erlang concurrent coding, but providing a single point of GUI event sequencing so that wx can do its GUI update looping thing.]
Because the GUI main thread code is old. Very old and therefore very much designed for low resource usage. If someone would write everything from scratch again (and even Android as the most recent GUI OS didn't) it would be working well and be better in multithreading.
For example the best two improvements that would help for MT are
Now we have MVVM (Model-View-ViewModel) pattern, this is an extra duplication of data. When the toolskits were developed even a single duplication in a MVC was highly debated. MVVM makes multithreading much easier. IMHO this was the main reason for Microsoft to invent it in the first place in .NET not the data binding.
The scene graph approach. Android, iOS, Windows UWP (based on CoreWindow not hWnd until Windows11 Project Reunion), Gtk4 is decoupling the GPU part from the model. Yes it is in fact a MVVMGM now (Model-View-ViewModel-GPUModel). So another memory intense layer. If you duplicate stuff you need less synchronisation. Combine on Android and SwiftUI on MacOS/iOS is using immutability of GUI widgets now to further improve this View->GPUModel.
Especially with the GPU Model/Scene Graph, the statement that GUIs are single threaded is not true anymore.
Two reasons, as far as I can tell:
It is much easier to reason about single-threaded code; thus the event loop model reduces the likelihood of bugs.
2D User interfaces are not CPU intensive. An old computer with a wimpy graphics card can smoothly render all the windows, frames, widgets, etc. you could possibly desire without skipping a beat.
Basically, if single-threaded code is easier and tends to have fewer bugs, favor that over multithreaded code unless you have a compelling need for parallelization or speed. Your typical GUI frameworks don't have this need.
Now, of course we've all experienced lagginess and freezes from GUI applications before. I'd argue that the vast majority of the time, this is the fault of the developer: putting long-running synchronous code for an event that should have been handled asynchronously (which is a mechanism all the major UI frameworks have).

Multiple UI threads on the same window

I don't want multiple windows, each with its own UI thread, nor events raised on a single UI thread, not background workers and notifications, none of that Invoke, BeginInvoke stuff either.
I'm interested in a platform that allows multiple threads to update the same window in a safe manner. Something like first thread creates three buttons, the second thread another five, and they both can access them,change their properties and delete them without any unwanted consequences.
I want safe multi-threaded access to the UI without Invoking, a platform where the UI objects can be accessed directly from any thread without raising errors like "The object can only be accessed from the thread that created it". To let me do the synchronizing if I have to, not prevent me from cross-tread accessing the UI in a direct manner.
I'm gonna get down voted but ... Go Go Gadget Soapbox.
Multi threaded GUI are not possible in the general case. It has been attempted time and time again and it never comes out well. It is not a coincidence that all of the major windowing frameworks follow the single threaded ui model. They weren't copying each other, it's just that the constraints of the problem lead them to the same answer. Many people smarter than you or i have tried to solve this.
It might be possible to implement a multi-thread ui for a particular project. I'm only saying that it can't be done in the general case. That means it's unlikely you'll find a framework to do what you want.
The gist of the problem is this. Envision the gui components as a chain (in reality it's more like a tree, but a chain is simple to describe). The button connects to the frame, connects to the box, connects to the window. There are two source of events for a gui the system/OS and the user. The system/OS event originate at the bottom of the chain (the windowing system), the user event originate at the top of the chain (the button). Both of these events must move through the gui chain. If two threads are pushing these events simultaneously they must be mutex protected. However, there is no known algorithm for concurrently traversing a double linked list in both directions. It is prone to dead lock. GUI experts tried and tried to figure out ways to get around the deadlocking problem, and eventually arrived at the solution we use today called Model/View/Controller, aka one thread runs the UI.
You could make a thread-safe Producer/Consumer queue of delegates.
Any thread that wants to update a UI component would create a delegate encapsulating the operations to be performed, and add it to the queue.
The UI thread (assuming all components were created on the same thread) would then periodically pull an item from the queue, and execute the delegate.
I don't believe a platform like that exists per se
There is nothing stopping you from saying taking .Net and creating all new controls which are thread safe and can work like that(or maybe just the subset of what you need) which shouldn't be an extremely large job(though definitely no small job) because you can just derive from the base controls and override any thread-unsafe methods or properties.
The real question though is why? It would definitely be slower because of all the locking. Say your in one thread that is doing something with the UI, well it has to lock the window it's working on else it could be changed without it knowing by the other thread. So with all the locking, you will spend most of your drawing time and such waiting on locks and (expensive) context switches from threads. You could maybe make it async, but that just doesn't seem safe(and probably isn't) because controls that you supposedly just created may or may not exist and would be about like
Panel p=new Panel();
Button b=new Button();
WaitForControlsCreated(); //waits until the current control queue is cleared
p.Controls.Add(b);
which is probably just as slow..
So the real question here is why? The only "good" way of doing it is just having an invoke abstracted away so that it appears you can add controls from a non-UI thread.
I think you are misunderstanding how threads really work and what it takes to actually make an object thread safe
Accept that any code updating the GUI has to be on the GUI thread.
Learn to use BeginInvoke().
On Windows, Window handles have thread affinity. This is a limitation of the Window manager. It's a bad idea to have multiple threads accessing the same window on Windows.
I'm surprised to see these answers.
Only the higher level language frameworks like C# have thread restrictions on GUI elements.
Windows, at the SDK layer, is 100% application controlled and there are no restrictions on threads except at insignificant nitty gritty level. For example if multiple threads want to write to a window, you need to lock on a mutex, get the device context, draw, then release the context, then unlock the mutex. Getting and releasing a device context for a moment of drawing needs to be on the same thread... but those are typically within 10 lines of code from each other.
There isn't even a dedicated thread that windows messages come down on, whatever thread calls "DispatchMessage()" is the thread the WINPROC will be called on.
Another minor thread restriction is that you can only "PeekMessage" or "GetMessage" a window that was created on the current thread. But really this is very minor, and how many message pumps do you need anyway.
Drawing is completely disconnected from threads in Windows, just mutex your DC's for drawing. You can draw anytime, from anywhere, not just on a WM_PAINT message.
BeOS / Haiku OS
Based on my guessing of your requirement, you want a single Windows Form and having ways to execute certain routines asynchronously (like multi-threading), yes?
Typically (for the case of .NET WinForms) Control.Invoke / Control.BeginInvoke is used to a certain effect what I think you want.
Here's an interesting article which might help: http://www.yoda.arachsys.com/csharp/threads/winforms.shtml

Thread safety... what's my "best" course of action?

I'm wondering what is the "best" way to make data thread-safe.
Specifically, I need to protect a linked-list across multiple threads -- one thread might try to read from it while another thread adds/removes data from it, or even frees the entire list. I've been reading about locks; they seem to be the most commonly used approach, but apparently they can be problematic (deadlocks). I've also read about atomic-operations as well as thread-local storage.
In your opinion, what would be my best course of action? What's the approach that most programmers use, and for what reason?
One approach that is not heavily used, but quite sound, is to designate one special purpose thread to own every "shared" structure. That thread generally sits waiting on a (thread-safe;-) queue, e.g. in Python a Queue.Queue instance, for work requests (reading or changing the shared structure), including both ones that request a response (they'll pass their own queue on which the response is placed when ready) and ones that don't. This approach entirely serializes all access to the shared resource, remaps easily to a multi-process or distributed architecture (almost brainlessly, in Python, with multiprocessing;-), and absolutely guarantees soundness and lack of deadlocks as well as race conditions as long as the underlying queue object is well-programmed once and for all.
It basically turns the hell of shared data structures into the paradise of message-passing concurrency architectures.
OTOH, it may be a tad higher-overhead than slugging it out the hard way with locks &c;-).
You could consider an immutable collection. Much like how a string in .net has methods such as Replace, Insert, etc. It doesn't modify the string but instead creates a new one, a LinkedList collection can be designed to be immutable as well. In fact, a LinkedList is actually fairly simple to implement this way as compared to some other collection data structures.
Here's a link to a blog post discussing immutable collections and a link to some implementations in .NET.
http://blogs.msdn.com/jaredpar/archive/2009/04/06/immutable-vs-mutable-collection-performance.aspx
Always remember the most important rule of thread safety. Know all the critical sections of your code inside out. And by that, know them like your ABCs. Only if you can identify them at go once asked will you know which areas to operate your thread safety mechanisms on.
After that, remember the rules of thumb:
Look out for all your global
variables / variables on the heap.
Make sure your subroutines are
re-entrant.
Make sure access to shared data is
serialized.
Make sure there are no indirect
accesses through pointers.
(I'm sure others can add more.)
The "best" way, from a safety point of view, is to put a lock on the entire data structure, so that only one thread can touch it at a time.
Once you decide to lock less than the entire structure, presumably for performance reasons, the details of doing this are messy and differ for every data structure, and even variants of the same structure.
My suggestion is to
Start with a global lock on your data structure. Profile your program to see if it's really a problem.
If it is a problem, consider whether there's some other way to distribute the problem. Can you minimize the amount of data in the data structure in question, so that it need not be accessed so often or for so long? If it's a queuing system, for example, perhaps you can keep a local queue per thread, and only move things into or out of a global queue when a local queue becomes over- or under-loaded.
Look at data structures designed to help reduce contention for the particular type of thing you're doing, and implement them carefully and precisely, erring on the side of safety. For the queuing example, work-stealing queues might be what you need.

Resources