Progress bar requirement as sufficient justification for multithreading via BackgroundWorker? - multithreading

Taken as given that multi-threaded applications are debugging hell and should be avoided at all costs...
Is the requirement to display of a progress bar a sufficient reason to enter multi-threaded land?
Specifically, let's say a C# windows forms .NET 3.0 application needs to download a 100MB file. Is it right to multi-thread (via a BackgroundWorker component) to do the download on that thread so that the UI thread is free to update a progress bar showing progress?
Sorry if this question is a bit fuzzy. I've not done anything multi-threaded before.

You have not said anything about what language are you using; in C#, for instance, you could use WebClient with DownloadFileAsync and DownloadProgressChanged
While this is a multi-threaded program, I consider it as low complexity program.

When you're creating a GUI application, the main problem with doing everything on the UI thread, is that it might freeze your application while for example waiting for I/O. So if you don't have anything that might freeze the application for long enough to annoy the user (and that time is very subjective of course), then go ahead with single threaded applications. But otherwise, I guess the question is how much you value the user experience.

Related

Doing UI on a background thread

The SDL documentation for threading states:
NOTE: You should not expect to be able to create a window, render, or receive events on any thread other than the main one.
The glfw documentation for glfwCreateWindow states:
Thread safety: This function must only be called from the main thread.
I have read about issues regarding the glut library from people who have tried to run the windowing functions on a second thread.
I could go on with these examples, but I think you get the point I'm trying to make. A lot of cross-platform libraries don't allow you to create a window on a background thread.
Now, two of the libraries I mentioned are designed with OpenGL in mind, and I get that OpenGL is not designed for multithreading and you shouldn't do rendering on multiple threads. That's fine. The thing that I don't understand is why the rendering thread (the single thread that does all the rendering) has to be the main one of the application.
As far as I know, neither Windows nor Linux nor MacOS impose any restrictions on which threads can create windows. I do know that windows have affinity to the thread that creates them (only that thread can receive input for them, etc.); but still that thread does not need to be the main one.
So, I have three questions:
Why do these libraries impose such restrictions? Is it because there is some obscure operating system that mandates that all windows be created on the main thread, and so all operating systems have to pay the price? (Or did I get it wrong?)
Why do we have this imposition that you should not do UI on a background thread? What do threads have to do with windowing, anyways? Is it not a bad abstraction to tie your logic to a specific thread?
If this is what we have and can't get rid of it, how do I overcome this limitation? Do I make a ThreadManager class and yield the main thread to it so it can schedule what needs to be done in the main thread and what can be done in a background thread?
It would be amazing if someone could shed some light on this topic. All the advice I see thrown around is to just do input and UI both on the main thread. But that's just an arbitrary restriction if there isn't a technical reason why it isn't possible to do otherwise.
PS: Please note that I am looking for a cross platform solution. If it can't be found, I'll stick to doing UI on the main thread.
While I'm not quite up to date on the latest releases of MacOS/iOS, as of 2020 Apple UIKit and AppKit were not thread safe. Only one thread can safely change UI objects, and unless you go to a lot of trouble that's going to be the main thread. Even if you do go to all the trouble of closing the window manager connection etc etc you're still going to end up with one thread only doing UI. So the limitation still applies on at least one major system.
While it's possibly unsafe to directly modify the contents of a window from any other thread, you can do software rendering to an offscreen bitmap image from any thread you like, taking as long as you like. Then hand the finished image over to the main thread for rendering. (The possibly is why cross platform toolkits disallow/tell you not to. Sometimes it might work, but you can't say why, or even that it will keep working.)
With Vulkan and DirectX 12 (and I think but am not sure Metal) you can render from multiple threads. Woohoo! Of course now you have to figure out how to do all the coordination and locking and cross-synching without making the whole thing slower than single threaded, but at least you have the option to try.
Adding to the excellent answer by Matt, with Qt programs you can use invokeMethod and postEvent to have background threads update the UI safely.
It's highly unlikely that any of these frameworks actually care about which thread is the 'main thread', i.e., the one that called the entry point to your code. The real restriction is that you have to do all your UI work on the thread that initialized the framework, i.e., the one that called SDL_Init in your case. You will usually do this in your main thread. Why not?
Multithreaded code is difficult to write and difficult to understand, and in UI work, introducing multithreading makes it difficult to reason about when things happen. A UI is a very stateful thing, and when you're writing UI code, you usually need to have a very good idea about what has happened already and what will happen next -- those things are often undefined when multithreading is involved. Also, users are slow, so multithreading the UI is not really necessary for performance in normal cases. Because of all this, making a UI framework thread-safe isn't usually considered beneficial. (multithreading compute-intensive parts of your rendering pipeline is a different thing)
Single-threaded UI frameworks have a dispatcher of some sort that you can use to enqueue activities that should happen on the main thread when it next has time. In SDL, you use SDL_PushEvent for this. You can call that from any thread.

What are some of the core principles needed to master multi-threading using Delphi?

I am kind of new to programming in general (about 8 months with on and off in Delphi and a little Python here and there) and I am in the process of buying some books.
I am interested in learning about concurrent programming and building multi-threaded apps using Delphi. Whenever I do a search for "multithreading Delphi" or "Delphi multithreading tutorial" I seem to get conflicting results as some of the stuff is about using certain libraries (Omnithread library) and other stuff seems to be more geared towards programmers with more experience.
I have studied quite a few books on Delphi and for the most part they seem to kind of skim the surface and not really go into depth on the subject. I have a friend who is a programmer (he uses c++) who recommends I learn what is actually going on with the underlying system when using threads as opposed to jumping into how to actually implement them in my programs first.
On Amazon.com there are quite a few books on concurrent programming but none of them seem to be made with Delphi in mind.
Basically I need to know what are the main things I should be focused on learning before jumping into using threads, if I can/should attempt to learn them using books that are not specifically aimed at Delphi developers (don't want to confuse myself reading books with a bunch of code examples in other languages right now) and if there are any reliable resources/books on the subject that anyone here could recommend.
Short answer
Go to OmnyThreadLibrary install it and read everything on the site.
Longer answer
You asked for some info so here goes:
Here's some stuff to read:
http://delphi.about.com/od/kbthread/Threading_in_Delphi.htm
I personally like: Multithreading - The Delphi Way.
(It's old, but the basics still apply)
Basic principles:
Your basic VCL application is single threaded.
The VCL was not build with multi-threading in mind, rather thread-support is bolted on so that most VCL components are not thread-safe.
The way in which this is done is by making the CPU wait, so if you want a fast application be careful when and how to communicate with the VCL.
Communicating with the VCL
Your basic thread is a decendent of TThread with its own members.
These are per thread variables. As long as you use these you don't have any problems.
My favorite way of communicating with the main window is by using custom windows Messages and postmessage to communicate asynchronically.
If you want to communicate synchronically you will need to use a critical section or a synchonize method.
See this article for example: http://edn.embarcadero.com/article/22411
Communicating between threads
This is where things get tricky, because you can run into all sorts of hard to debug synchonization issues.
My advice: use OmnithreadLibrary, also see this question: Cross thread communication in Delphi
Some people will tell you that reading and writing integers is atomic on x86, but this is not 100% true, so don't use those in a naive way, because you'll most likely get subtle issues wrong and end up with hard to debug code.
Starting and stopping threads
In old Delphi versions Thread.suspend and Thread.resume were used, however these are no longer recommended and should be avoided (in the context of thread synchronization).
See this question: With what delphi Code should I replace my calls to deprecated TThread method Suspend?
Also have a look at this question although the answers are more vague: TThread.resume is deprecated in Delphi-2010 what should be used in place?
You can use suspend and resume to pause and restart threads, just don't use them for thread synchronization.
Performance issues
Putting wait_for... , synchonize etc code in your thread effectively stops your thread until the action it's waiting for has occured.
In my opinion this defeats a big purpose of threads: speed
So if you want to be fast you'll have to get creative.
A long time ago I wrote an application called Life32.
Its a display program for conways game of life. That can generate patterns very fast (millions of generations per second on small patterns).
It used a separate thread for calculation and a separate thread for display.
Displaying is a very slow operation that does not need to be done every generation.
The generation thread included display code that removes stuff from the display (when in view) and the display thread simply sets a boolean that tells the generation thread to also display the added stuff.
The generation code writes directly to the video memory using DirectX, no VCL or Windows calls required and no synchronization of any kind.
If you move the main window the application will keep on displaying on the old location until you pause the generation, thereby stopping the generation thread, at which point it's safe to update the thread variables.
If the threads are not 100% synchronized the display happens a generation too late, no big deal.
It also features a custom memory manager that avoids the thread-safe slowness that's in the standard memory manager.
By avoiding any and all forms of thread synchronization I was able to eliminate the overhead from 90%+ (on smallish patterns) to 0.
You really shouldn't get me started on this, but anyway, my suggestions:
Try hard to not use the following:
TThread.Synchronize
TThread.WaitFor
TThread.OnTerminate
TThread.Suspend
TThread.Resume, (except at the end of constructors in some Delphi versions)
TApplication.ProcessMessages
Use the PostMessage API to communicate to the main thread - post objects in lParam, say.
Use a producer-consumer queue to communicate to secondary threads, (not a Windows message queue - only one thread can wait on a WMQ, making thread pooling impossible).
Do not write directly from one thread to fields in another - use message-passing.
Try very hard indeed to create threads at application startup and to not explicitly terminate them at all.
Do use object pools instead of continually creating and freeing objects for inter-thread communication.
The result will be an app that performs well, does not leak, does not deadlock and shuts down immediately when you close the main form.
What Delphi should have had built-in:
TWinControl.PostObject(anObject:TObject) and TWinControl.OnObjectRx(anObject:TObject) - methods to post objects from a secondary thread and fire a main-thread event with them. A trivial PostMessage wrap to replace the poor performing, deadlock-generating, continually-rewritten TThread.Synchronize.
A simple, unbounded producer-consumer class that actually works for multiple producers/consumers. This is, like, 20 lines of TObjectQueue descendant but Borland/Embarcadero could not manage it. If you have object pools, there is no need for complex bounded queues.
A simple thread-safe, blocking, object pool class - again, really simple with Delphi since it has class variables and virtual constructors, eg. creating a lot of buffer objects:
myPool:=TobjectPool.create(1024,TmyBuffer);
I thought it might be useful to actually try to compile a list of things that one should know about multithreading.
Synchronization primitives: mutexes, semaphores, monitors
Delphi implementations of synchronization primitives: TCriticalSection, TMREWSync, TEvent
Atomic operations: some knowledge about what operations are atomic and what not (discussed in this question)
Windows API multithreading capabilities: InterlockedIncrement, InterlockedExchange, ...
OmniThreadLibrary
Of course this is far from complete. I made this community wiki so that everyone can edit.
Appending to all the other answers I strongly suggest reading a book like:
"Modern Operating Systems" or any other one going into multithreading details.
This seems to be an overkill but it would make you a better programmer and
you defenitely get a very good insight
into threading/processes in an abstract way - so you learn why and how to
use critical section or semaphores on examples (like the
dining philosophers problem or the sleeping barber problem)

Is firing off a Thread a valid answer to simplifying code?

As multi-processor and multi-core computers become more and more ubiquitous, is simply firing off a new thread a (relatively) simple and painless way of simplifying code? For instance, in a current personal project, I have a network server listening on a port. Since this is just a personal project, it's just a desktop app, with a GUI integrated into it for configuration. So, the app reads something like this:
Main()
Read configuration
Start listener thread
Run GUI
Listener Thread
While the app is running
Wait for a new connection
Run a client thread for the new connection
Client Thread
Write synchronously
Read synchronously
ad inifinitum, or till they disconnect
This approach means that while I have to worry about alot of locking, with the potential issues that involves, I avoid alot of spaghetti code from assynchronous calls, etc.
A slightly more insidious version of this came up today when I was working on the startup code. The startup was quick, but it was using lazy loading for alot of the configuration, which meant that while startup was quick, actually connecting to and using the service was difficult because of the lag while it loaded different sections (this was actually measurable in real time, up to 3-10 seconds sometimes). So I moved to a different strategy, on startup, loop through everything and force the lazy loading to kick in... but this made it start prohibitively slow; get up, go get a coffee slow. Final solution: throw the loop into a seperate thread with feedback in the system tray while it's still loading.
Is this "Meh, throw it in another thread, it'll be fine" attitude ok? At what point do you start getting diminishing returns and/or even reduced performance?
Multithreading does a lot of things, but I don't think "simplification" is ever one of them.
It's a great way to introduce bugs into code.
Using multiple threads properly is not easy. It should not be attempted by new developers.
In my opinion, multi-threaded programming is pretty high up on the difficulty (and complexity) scale, along with memory management. To me, the "Meh, throw it in another thread, it'll be fine" attitude is a bit too casual. Think long and hard you must, before forking threads you do.
No.
Plainly and simply, multithreading increases complexity and is a nearly trivial way to add bugs to code. There are concurrency issues such as synchronization, deadlock, race conditions, and priority inversion to name a few.
Secondly, the performance gains are not automatic. Recently, there was an excellent article in MSDN Magazine along these lines. The salient details are that a certain operation was taking 46 seconds per ten iterations coded as a single-threaded operation. The author parallelized the operation naively (one thread per four cores) and the operation dropped to 30 seconds per ten iterations. Sounds great until you take into consideration that the operation now eats 300% more processing power but only experienced a 34% gain in efficiency. It's not worth consuming all available processing power for a gain like that.
This gives you the extra job of debugging race conditions, and handling locks and sycronisation issues.
I would not use this unless there was a real need.
Read up on Amdahl's law, best summarized by "The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program."
As it turns out, if only a small part of your app can run in parallel you won't get much gains, but potentially many hard-to-debug bugs.
I don't mean to be flip but what's in that configuration file that it takes so long to load? That's the origin of your problem, right?
Before spawning another thread to handle it, perhaps it can be parred down? Reduced, perhaps put in another data format that would be quicker, etc?
How often does it change? Is it something you can parse once at the beginning of the day and put the variables in shared memory so subsequent runs of your main program can just attach and get the needed values from there?
While I agree with everyone else here in saying that multithreading does not simplify code, it can be used to greatly simplify the user experience of your application.
Consider an application that has a lot of interactive widgets (I am currently developing one where this helps) - in the workflow of my application, a user can "build" the current project they are working on. This requires disabling the interactive widgets my application presents to the user and presenting a dialog with a indeterminate progress bar and a friendly "please wait" message.
The "build" occurs on a background thread; if it were to happen on the UI thread it would make the user experience less enjoyable - after all, it's no fun not being able to tell whether or not you are able to click on a widget in an application while a background task is running (cough, Visual Studio). Not to say that VS doesn't use background threads, I'm just saying their user experience could use some improvement. But I digress.
The one thing I take issue with in the title of your post is that you think of firing off threads when you need to perform tasks - I generally prefer to reuse threads - in .NET, I generally favor using the system thread pool over creating a new thread each time I want to do something, for the sake of performance.
I'm going to provide some balance against the unanimous "no".
DISCLAIMER: Yes, threads are complicated and can cause a whole bunch of problems. Everyone else has pointed this out.
From experience, a sequence of blocking reads/writes to a socket (which requires a separate thead) is much simpler than non-blocking ones. With blocking calls, you can tell the state of the connection just by looking at where you are in the function. With non-blocking calls, you need a bunch of variables to record the state of the connection, and check and modify them every time you interact with the connection. With blocking calls, you can just say "read the next X bytes" or "read until you find X" and it will actually do it (or fail). With non-blocking calls, you have to deal with fragmented data which usually requires keeping temporary buffers and filling them as necessary. You also end up checking if you've received enough data every time you receive little more. Plus you have to keep a list of open connections and handle unexpected closes for all of them.
It doesn't get much simpler than this:
void WorkerThreadMain(Connection connection) {
Request request = ReadRequest(connection);
if(!request) return;
Reply reply = ProcessRequest(request);
if(!connection.isOpen) return;
SendReply(reply, connection);
connection.close();
}
I'd like to note that this "listener spawns off a worker thread per connection" pattern is how web servers are designed, and I assume it's how a lot of request/response soft of server applications are designed.
So in conclusion, I have experienced the asynchronous socket spaghetti code you mentioned, and spawning off worker threads for every connection ended up being a good solution. Having said all this, throwing threads at a problem should usually be your last resort.
I think your have no choice but to deal with threads especially with networking and concurrent connections. Do threads make code simpler? I don't think so. But without them how would you program a server that can handle more than 1 client at the same time?

Practical uses for threads [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 14 years ago.
In your work, what specifically have you used threads for?
(Please give a description of the application and how the thread helped/enhanced the application.)
Threads are critical for most UI work. Otherwise, any time you want to do a calculation or anything that takes a while, you will freeze the UI.
Therefore, most GUI frameworks have UI threads that handles the event loop (and some drawing activities), but most user code takes place in another thread.
Threads are also useful for occasionally checking things or making episodic changes to the state of the system.
(Less serious answer) I like to use threads in any situation where I want the system to fall on its arse in interesting and unobvious ways, while still having plausible deniability as to how I could have let the problem slip through.
Or, in the words of Rasmus Lerdorf, "People aren't smart enough to write thread-safe code".
Handling concurrent client requests in a server.
Threads are fundamental for most I/O bound applications, and for any reasonably complex server side application. Consider an application that acts as an exchange for information with multiple sources of data. You need to be able to deal with this information in independent threads, in particular if operations on this data is subject to latencies or require a signifigant amount of time to complete.
In most occasions threads often help to decouple various concerns within the application. A single thread dispatching events to interested parties will not scale well in the vast majority of occasions.
All but the simplest of applications will require threading to some degree.
Most common use is for resposive UI like showing a progress bar for a long running background task.
Background tasks:
Handling network connections and protocols.
Doing Sound Synthesis running in the background of a multimedia application.
Doing File-loading in the background in a multimedia application (CD streaming)
Other uses:
Accelerating certain algorithms by running two instances of the same code in two different threads.
I know that most of the time I use threads, what I actually want to do is to launch some asynchronous lump of work - i.e. I want something to happen in the mythical "background". Unfortunately thinking about threads isn't really the correct level of abstraction for doing "launch some lump of work", because you're not putting something into the background. With a thread API you're creating another place to run stuff as a sibling of the process's original thread, and need to worry about what information is shared between them, and how, and so on. That's why I'm enjoying newer API such as Cocoa's NSOperation and NSOperationQueue. In the case of that API, launching some lump of work is just some single line, and the library takes care of whether a new thread should be launched or an old one reused.
To scan directories looking for changed files. It is a lot quicker to spawn a thread per sub directory then do it in a single thread.
We have been using threads for several applications where the main screen was comprised of a workflow tailored to the currently logged on user.
Getting the workflow can be time intensive. Various parts of the workflow get loaded by different threads. For our main application BP/GeNA, about 11 threads are fired, each running a database query.
Regards,
Lieven
I most commonly use threads when I want to do something with a bunch of resources that I know will take a long time, when there's no interdependence between the work to process the elements, and particularly if the bottleneck isn't a local resource (like CPU of disk). For example, if I've got a bunch of URLs to retrieve, each of those will go into a separate thread.
This is a very general question. I've used "threads" to take potentially blocking work off of the UI thread, whether that work is local or network i/o or that work is computationally intensive tasks which will tend to "block" depending on the hardware it is run on.
I think it is more interesting to ask about a particular problem or pattern which help alleviates it and the applicability on threads, i.e.:
how are threads relevant to model
view controller?
how or why should I
take work off of a UI thread to
ensure that the UI doesn't even think
of blocking?
how can a threadpool be
useful for recursive (network)
directory traversal as someone else
alluded to?
Should I be affinitizing
threads for cooperative scheduling of
computationally intensive tasks, or
should I be using a ThreadPool and
let the OS preemptively schedule the
threads as it sees fit.
It's a pretty broad space and more clarity would likely help.
I build web applications, so all code that I write is executed in multiple threads.
Our app is a web service, so we spawn off a thread per request. Technically, JNI spawns off the thread, but the code has to be thread-safe regardless. We've run into some interesting (FSVO) issues with both Hibernate and our ESB-based infrastructure, but for the most part keeping things in ThreadLocals and synchronizing on subsystem entry points has worked pretty well. We haven't tried more than a couple dozen simultaneous requests, so there may well be some race conditions we haven't identified yet, but overall we're performant and give the right answers.
I wrote a function that spawns a thread that beeps (from the speaker) at regular intervals to alert a test operator that something needs attention. After the modal dialog is responded to, the thread is killed.
Not employment related, but I'm doing some side work on the Netflix Prize. My computer has 8 cores and 20 GB of RAM ... running only 1 thread would be an utter waste, so I typically start 16 threads or so.

Better multi-threaded debugging in the Delphi

Leading on from the answer to another question about bugs in the Delphi IDE, does anyone know if there is a way to improve the multi-threaded debugging functionality of the IDE, or if not, at least why it is so bad on occasion?
When you have multiple threads within a program, stepping through the code with F7 or F8 can often lead to either very long pauses, or the whole IDE just locks ups. This is especially apparent when you leave or enter a method or procedure. The debugger always seems to be fine for single threaded applications.
PS. The version I'm using is 2007
From my experience multi threaded debugging is much nicer using Vista and Delphi 2009 than XP with Delphi 2007.
First, the ide is significantly more stable.
Second, in Delphi 2009 on vista the debugger can show you where deadlocks are occurring.
If you have to use Delphi 2007, I would strongly recommend debugging your code in a single threaded unit test if possible, then using your by now tested code in the main program. ;)
When the application itself has not deadlocked, try to be very aware of which thread you're in. Keep the thread list up in the debugger, and consider using named threads.
There are times when it will be impossible to interactively debug an application which itself deadlocks. When this happens, you can use tools such as WinDbg and Adplus in order to work with memory dumps. Yes, this is a lot harder than using the interactive debugger, but it beats having no debugger at all. There are sample applications, demos, and instructions, on Tess Ferrandez's blog. I would start with this page. The labs are .NET-centric, but don't let that keep you away; the ideas are the same.
When I want to debug a multithreaded operation I often use a log file (that I analyse after the application has run) instead of the interactive Debugger.
For example with the function 'OutputDebugString'. The output comes in the event log of Delphi. If you start your program outside of Delphi, you can use DebugView from SysInternals to display the log. Take care to add the Thread-ID to each output (GetCurrentThreadID).
Be aware that there could be a thread switch just before writing to the log. But at places where several threads interact you will probably have a critical session (or another synchronization object) so that it should be a problem.
Yes, debugging a multithreaded application is a hassle. Because you are constantly swapping from one thread to another.
Another Idea that I have never tried because I've just thought about it: if you are interested in debugging one thread and just want to avoid being disturbed by the other threads, it might be possible to suspend some threads temporaly.
Process Explorer from SysInternals offers a possibility to suspend and resume threads (in the tab called "Threads" in the properties of a process). But as I said I've never tested it until now.

Resources