When is multi-threading not a good idea? [closed]

I was recently working on an application that sent and received messages over Ethernet and Serial. I was then tasked to add the monitoring of DIO discretes. I thought, "No reason to interrupt the main thread, which is involved in message processing; I'll just create another thread that monitors DIO."
This decision, however, proved to be poor. Sometimes the main thread would be interrupted between a Send and a Receive serial message. This interruption would disrupt the timing and alas, messages would be lost (forever).
I found another way to monitor the DIO without using another thread and Ethernet and Serial communication were restored to their correct functionality.
The whole fiasco, however, got me thinking. Are there any general guidelines about when not to use multiple threads, and/or does anyone have any more examples of situations where using multiple threads is not a good idea?
EDIT: Based on your comments and after scouring the internet for information, I have composed a blog post entitled When is multi-threading not a good idea?

On a single-processor machine running a desktop application, you use multiple threads so you don't freeze the app, but not for much else.
On a single-processor server running a web-based app, there is no need for multi-threading because the web server handles most of it.
On a multi-processor machine running a desktop app, you are encouraged to use multiple threads and parallel programming. Make roughly as many threads as there are processors.
On a multi-processor server running a web-based app, again there is no need for multiple threads because the web server handles it.
In short, if you use multiple threads for anything other than keeping a desktop app responsive, you will make the app slower on a single-core machine because the threads keep interrupting each other.
Why? Because of context switches. It takes time for the hardware to switch between threads. On a multi-core box, go ahead and use one thread per core and you will see a real speed-up.

To paraphrase an old quote: A programmer had a problem. He thought, "I know, I'll use threads." Now the programmer has two problems. (Often attributed to JWZ, but it seems to predate his use of it talking about regexes.)
A good rule of thumb is "Don't use threads, unless there's a very compelling reason to use threads." Multiple threads are asking for trouble. Try to find a good way to solve the problem without using multiple threads, and only fall back to using threads if avoiding it is as much trouble as the extra effort to use threads. Also, consider switching to multiple threads if you're running on a multi-core/multi-CPU machine, and performance testing of the single threaded version shows that you need the performance of the extra cores.

Multi-threading is a bad idea if:
Several threads access and update the same resource (set a variable, write to a file), and you don't understand thread safety (see the sketch after this list).
Several threads interact with each other and you don't understand mutexes and similar thread-management tools.
Your program uses static variables (threads typically share them by default).
You haven't debugged concurrency issues.
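To make the first two points concrete, here is a minimal C++ sketch (my own illustration, not code from the answer): two threads update a shared counter, which is exactly the kind of shared-resource update the list warns about. Remove the lock and the final count is usually wrong; with it, the result is always 200000.

#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;          // shared resource (statics/globals have the same problem)
std::mutex counterMutex;  // remove the lock below to see the race in action

void increment() {
    for (int i = 0; i < 100000; ++i) {
        std::lock_guard<std::mutex> lock(counterMutex); // protects the read-modify-write
        ++counter;
    }
}

int main() {
    std::thread t1(increment), t2(increment);
    t1.join();
    t2.join();
    std::cout << counter << '\n'; // 200000 with the lock; typically less without it
}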

Actually, multi-threading is not scalable and is hard to debug, so it should not be used in any case where you can avoid it. There are a few cases where it is mandatory: when performance on a multi-CPU machine matters, or when you deal with a server that has a lot of clients taking a long time to answer.
In other cases, you can use alternatives such as a queue plus cron jobs.

You might want to take a look at Dan Kegel's "The C10K problem" web page about handling multiple data sources/sinks.
Basically it is best to use a minimal number of threads, which for sockets can be done in most OSes with some event system (or asynchronously on Windows using IOCP).
When you run into the case where the OS and/or libraries do not offer a way to perform communication in a non-blocking manner, it is best to use a thread-pool to handle them while reporting back to the same event loop.
Example diagram of layout:
Per CPU [*] EVENT LOOP ------ handles non-blocking I/O using OS/library utilities
              |
              \___ Thread pool for various blocking events
              \___ Thread pool for handling the I/O messages that would take long
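As a rough illustration of that layout (my own C++ sketch, not the answerer's code): the main loop stays single-threaded for the non-blocking work and hands blocking calls off to std::async, polling the futures with a zero timeout so the loop itself never blocks. Here std::async stands in for the thread pool, and blockingLookup is a hypothetical slow operation.

#include <chrono>
#include <future>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical blocking operation (e.g. a DNS lookup or a slow file read).
std::string blockingLookup(int id) {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
    return "result " + std::to_string(id);
}

int main() {
    std::vector<std::future<std::string>> pending;
    for (int i = 0; i < 4; ++i)
        pending.push_back(std::async(std::launch::async, blockingLookup, i));

    // Event loop: non-blocking sources would be polled here, and the futures
    // are checked with a zero timeout so the loop itself never blocks.
    while (!pending.empty()) {
        for (auto it = pending.begin(); it != pending.end();) {
            if (it->wait_for(std::chrono::seconds(0)) == std::future_status::ready) {
                std::cout << it->get() << '\n';   // handle the completed work
                it = pending.erase(it);
            } else {
                ++it;
            }
        }
        // ... handle any ready non-blocking I/O here ...
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}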

Multithreading is bad except in the single case where it is good. That case is when:
The work is CPU-bound, or parts of it are CPU-bound.
The work is parallelisable.
If either or both of these conditions are missing, multithreading is not going to be a winning strategy.
If the work is not CPU bound, then you are not waiting on threads to finish work, but rather for some external event, such as network activity, for the process to complete its work. Using threads, there is the additional cost of context switches between threads, the cost of synchronization (mutexes, etc.), and the irregularity of thread preemption. The alternative in most common use is asynchronous IO, in which a single thread listens to several IO ports and acts on whichever happens to be ready now, one at a time. If by some chance these slow channels all happen to become ready at the same time, it might seem like you will experience a slow-down, but in practice this is rarely true. The cost of handling each port individually is often comparable to or better than the cost of synchronizing state on multiple threads as each channel is emptied.
Many tasks may be compute bound, yet still not practical candidates for a multithreaded approach because the process must synchronise on the entire state. Such a program cannot benefit from multithreading because no work can be performed concurrently. Fortunately, most programs that require enormous amounts of CPU can be parallelized to some level.

Multi-threading is not a good idea if you need to guarantee precise physical timing (like in your example). Other cons include intensive data exchange between threads. I would say multi-threading is good for really parallel tasks if you don't care much about their relative speed/priority/timing.

A recent application I wrote that had to use multithreading (although not unbounded number of threads) was one where I had to communicate in several directions over two protocols, plus monitoring a third resource for changes. Both protocol libraries required a thread to run the respective event loop in, and when those were accounted for, it was easy to create a third loop for the resource monitoring. In addition to the event loop requirements, the messages going through the wires had strict timing requirements, and one loop couldn't be risked blocking the other, something that was further alleviated by using a multicore CPU (SPARC).
There were further discussions on whether each message processing should be considered a job that was given to a thread from a thread pool, but in the end that was an extension that wasn't worth the work.
All in all, threads should if possible only be considered when you can partition the work into well-defined jobs (or series of jobs) such that the semantics are relatively easy to document and implement, and you can put an upper bound on the number of threads you use and that need to interact. Systems where this is best applied are almost always message-passing systems.

In principle, every time there is no overhead for the caller to wait in a queue.

A couple more possible reasons to use threads:
Your platform lacks asynchronous I/O operations, e.g. Windows ME (no completion ports or overlapped I/O, a pain when porting XP applications that use them), or Java 1.3 and earlier.
A third-party library function that can hang, e.g. if a remote server is down, and the library provides no way to cancel the operation and you can't modify it.
Keeping a GUI responsive during intensive processing doesn't always require additional threads. A single callback function is usually sufficient.
If none of the above apply and I still want parallelism for some reason, I prefer to launch an independent process if possible.

I would say multi-threading is generally used to:
Allow data processing in the background while a GUI remains responsive
Split very big data analysis onto multiple processing units so that you can get your results quicker.
When you're receiving data from some hardware and need something to continuously add it to a buffer while some other element decides what to do with it (write to disk, display on a GUI etc.).
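As a rough sketch of that last use case (my own C++ illustration, not code from the answer), one thread stands in for the hardware and appends samples to a buffer while a consumer thread decides what to do with them, using the standard queue-plus-condition-variable pattern:

#include <condition_variable>
#include <deque>
#include <iostream>
#include <mutex>
#include <thread>

std::deque<int> buffer;
std::mutex m;
std::condition_variable dataReady;
bool done = false;

void acquisitionThread() {               // stands in for reading samples from a device
    for (int sample = 0; sample < 10; ++sample) {
        {
            std::lock_guard<std::mutex> lock(m);
            buffer.push_back(sample);
        }
        dataReady.notify_one();
    }
    { std::lock_guard<std::mutex> lock(m); done = true; }
    dataReady.notify_one();
}

void consumerThread() {                   // e.g. writing to disk or updating a GUI
    std::unique_lock<std::mutex> lock(m);
    for (;;) {
        dataReady.wait(lock, [] { return !buffer.empty() || done; });
        while (!buffer.empty()) {
            std::cout << "handled sample " << buffer.front() << '\n';
            buffer.pop_front();
        }
        if (done) break;
    }
}

int main() {
    std::thread producer(acquisitionThread), consumer(consumerThread);
    producer.join();
    consumer.join();
}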
So if you're not solving one of those issues, it's unlikely that adding threads will make your life easier. In fact it'll almost certainly make it harder because, as others have mentioned, debugging multithreaded applications is considerably more work than a single-threaded solution.
Security might be a reason to avoid using multiple threads (over multiple processes). See Google chrome for an example of multi-process safety features.

Multi-threading is scalable, and will allow your UI to maintain its responsiveness while doing very complicated things in the background. I don't understand where other responses are acquiring their information on multi-threading.
"When you shouldn't multi-thread" is a misleading question for your problem. Your problem is this: why did multi-threading my application cause serial/Ethernet communications to fail?
The answer to that question will depend on the implementation, which should be discussed in another question. I know for a fact that you can have both ethernet and serial communications happening in a multi-threaded application at the same time as numerous other tasks without causing any data loss.
The one reason to not use multi-threading is:
There is one task, and no user interface with which the task will interfere.
The reasons to use multi-threading are:
Provides superior responsiveness to the user
Performs multiple tasks at the same time to decrease overall execution time
Uses more of the current multi-core CPUs, and the many-core CPUs of the future.
There are three basic methods of multi-threaded programming that make thread safety easy to implement - you only need to use one for success:
Thread Safe Data types passed between threads.
Thread Safe Methods in the threaded object to modify data passed between.
PostMessage capabilities to communicate between threads.

Are the processes parallel? Is performance a real concern? Are there multiple 'threads' of execution like on a web server? I don't think there is a finite answer.

A common source of threading issues is the usual approach employed to synchronize data. Having threads share state and then implementing locking at all the appropriate places is a major source of complexity for both design and debugging. Getting the locking right to balance stability, performance, and scalability is always a hard problem to solve. Even the most experienced experts get it wrong frequently. Alternative techniques to deal with threading can alleviate much of this complexity. The Clojure programming language implements several interesting techniques for dealing with concurrency.

Related

Why must/should UI frameworks be single threaded?

Closely related questions have been asked before:
Why are most UI frameworks single threaded?
Should all event-driven frameworks be single-threaded?
But the answers to those questions still leave me unclear on some points.
The asker of the first question asked if multi-threading would help performance, and the answerers mostly said that it would not, because it is very unlikely that the GUI would be the bottleneck in a 2D application on modern hardware. But this seems to me a sneaky debating tactic. Sure, if you have carefully structured your application to do nothing other than UI calls on the UI thread you won't have a bottleneck. But that might take a lot of work and make your code more complicated, and if you had a faster core or could make UI calls from multiple threads, maybe it wouldn't be worth doing.
A commonly advocated architectural design is to have view components that don't have callbacks and don't need to lock anything except maybe their descendants. Under such an architecture, can't you let any thread invoke methods on view objects, using per-object locks, without fear of deadlock?
I am less confident about the situation with UI controls, but as long as their callbacks are only invoked by the system, why should they cause any special deadlock issues? After all, if the callbacks need to do anything time-consuming, they will delegate to another thread, and then we're right back in the multiple-threads case.
How much of the benefit of a multi-threaded UI would you get if you could just block on the UI thread? Because various emerging abstractions over async in effect let you do that.
Almost all of the discussion I have seen assumes that concurrency will be dealt with using manual locking, but there is a broad consensus that manual locking is a bad way to manage concurrency in most contexts. How does the discussion change when we take into consideration the concurrency primitives that the experts are advising us to use more, such as software transactional memory, or eschewing shared memory in favor of message passing (possibly with synchronization, as in go)?
TL;DR
It is a simple way to force sequencing to occur in an activity that is going to ultimately be in sequence anyway (the screen draws X times per second, in order).
Discussion
Handling long-held resources which have a single identity within a system is typically done by representing them with a single thread, process, "object" or whatever else represents an atomic unit with regard to concurrency in a given language. Back in the non-preemptive, negligent-kernel, non-timeshared, One True Thread days this was managed manually by polling/cycling or writing your own scheduling system. In such a system you still had a 1::1 mapping between function/object/thingy and singular resources (or you went mad before 8th grade).
This is the same approach used with handling network sockets, or any other long-lived resource. The GUI itself is but one of many such resources a typical program manages, and typically long-lived resources are places where the ordering of events matters.
For example, in a chat program you would usually not write a single thread. You would have a GUI thread, a network thread, and maybe some other thread that deals with logging resources or whatever. It is not uncommon for a typical system to be so fast that it's easier to just put the logging and input into the same thread that makes GUI updates, but this is not always the case. In all cases, though, each category of resources is most easily reasoned about by granting it a single thread, and that means one thread for the network, one thread for the GUI, and however many other threads are necessary for long-lived operations or resources to be managed without blocking the others.
To make life easier it's common to avoid sharing data directly among these threads as much as possible. Queues are much easier to reason about than resource locks and can guarantee sequencing. Most GUI libraries either queue events to be handled (so they can be evaluated in order) or commit data changes required by events immediately, but get a lock on the state of the GUI prior to each pass of the repaint loop. It doesn't matter what happened before; the only thing that matters when painting the screen is the state of the world right then. This is slightly different from the typical network case, where all the data needs to be sent in order and forgetting about some of it is not an option.
So GUI frameworks are not multi-threaded, per se, it is the GUI loop that needs to be a single thread to sanely manage that single long-held resource. Programming examples, typically being trivial by nature, are often single-threaded with all the program logic running in the same process/thread as the GUI loop, but this is not typical in more complex programs.
To sum up
Because scheduling is hard, shared data management is even harder, and a single resource can only be accessed serially anyway, a single thread used to represent each long-held resource and each long-running procedure is a typical way to structure code. GUIs are only one resource among several that a typical program will manage. So "GUI programs" are by no means single-threaded, but GUI libraries typically are.
In trivial programs there is no realized penalty to putting other program logic in the GUI thread, but this approach falls apart when significant loads are experienced or resource management requires either a lot of blocking or polling, which is why you will often see event queue, signal-slot message abstractions or other approaches to multi-threading/processing mentioned in the dusty corners of nearly any GUI library (and here I'm including game libraries -- while game libs typically expect that you want to essentially build your own widgets around your own UI concept, the basic principles are very similar, just a bit lower-level).
[As an aside, I've been doing a lot of Qt/C++ and Wx/Erlang lately. The Qt docs do a good job of explaining approaches to multi-threading, the role of the GUI loop, and where Qt's signal/slot approach fits into the abstraction (so you don't have to think about concurrency/locking/sequencing/scheduling/etc very much). Erlang is inherently concurrent, but wx itself is typically started as a single OS process that manages a GUI update loop and Erlang posts update events to it as messages, and GUI events are sent to the Erlang side as messages -- thus permitting normal Erlang concurrent coding, but providing a single point of GUI event sequencing so that wx can do its GUI update looping thing.]
Because the GUI main-thread code is old. Very old, and therefore very much designed for low resource usage. If someone were to write everything from scratch again (and even Android, the most recent GUI OS, didn't), it would work well and be better at multithreading.
For example, the two improvements that would help most with multithreading are:
Now we have the MVVM (Model-View-ViewModel) pattern, which is an extra duplication of data. When the toolkits were developed, even a single duplication in MVC was highly debated. MVVM makes multithreading much easier. IMHO this was the main reason for Microsoft to invent it in .NET in the first place, not the data binding.
The scene graph approach. Android, iOS, Windows UWP (based on CoreWindow, not hWnd, until Windows 11 / Project Reunion) and Gtk4 are decoupling the GPU part from the model. Yes, it is in fact MVVMGM now (Model-View-ViewModel-GPUModel). So another memory-intensive layer. If you duplicate stuff you need less synchronisation. Compose on Android and SwiftUI on macOS/iOS are now using immutability of GUI widgets to further improve this View->GPUModel.
Especially with the GPU Model/Scene Graph, the statement that GUIs are single threaded is not true anymore.
Two reasons, as far as I can tell:
It is much easier to reason about single-threaded code; thus the event loop model reduces the likelihood of bugs.
2D User interfaces are not CPU intensive. An old computer with a wimpy graphics card can smoothly render all the windows, frames, widgets, etc. you could possibly desire without skipping a beat.
Basically, if single-threaded code is easier and tends to have fewer bugs, favor that over multithreaded code unless you have a compelling need for parallelization or speed. Your typical GUI frameworks don't have this need.
Now, of course, we've all experienced lagginess and freezes from GUI applications before. I'd argue that the vast majority of the time this is the fault of the developer: putting long-running synchronous code in the handler for an event that should have been handled asynchronously (a mechanism all the major UI frameworks have).

Instant Messaging Server Design

Let's suppose we have an instant messaging application, client-server based, not p2p. The actual protocol doesn't matter; what matters is the server architecture. The server can be coded to operate in single-threaded, non-parallel mode using non-blocking sockets, which by definition allow us to perform operations like read and write effectively immediately (or instantly). This very feature of non-blocking sockets allows us to use some sort of select/poll function at the very core of the server and waste next to no time in the actual socket read/write operations, spending the time instead on processing all this information. Properly coded, this can be very fast, as far as I understand. But there is a second approach, and that is to multithread aggressively, creating a new thread (obviously using some sort of thread pool, because that very operation can be (very) slow on some platforms and under some circumstances), and letting those threads work in parallel, while the main background thread handles accept() and stuff. I've seen this approach explained in various places over the Net, so it obviously does exist.
Now the question is, if we have non-blocking sockets, and immediate read/write operations, and a simple, easily coded design, why does the second variant even exist? What problems are we trying to overcome with the second design, i.e. threads? AFAIK those are usually used to work around some slow and possibly blocking operations, but no such operations seem to be present there!
I'm assuming you're not talking about having a thread per client, as such a design is usually chosen for completely different reasons, but rather a pool of threads, each handling several concurrent clients.
The reason for that architecture vs a single-threaded server is simply to take advantage of multiple processors. You're doing more work than simply I/O. You have to parse the messages, do various work, maybe even run some more heavyweight crypto algorithms. All this takes CPU. If you want to scale, taking advantage of multiple processors will allow you to scale even more, and/or keep the latency even lower per client.
Some of the gain in such a design can be a bit offset by the fact that you might need more locking in a multithreaded environment, but done right, and certainly depending on what you're doing, it can be a huge win - at the expense of more complexity.
Also, this might help overcome OS limitations. The I/O paths in the kernel might get more distributed among the processors. Not all operating systems are able to fully thread the I/O from a single-threaded application. Back in the old days there weren't all the great alternatives to the old *nix select(), which usually had a file descriptor limit of 1024, and similar APIs severely started degrading once you told them to monitor too many sockets. Spreading all those clients across multiple threads or processes helped overcome that limit.
As for a 1:1 mapping between threads and clients, there are several reasons to implement that architecture:
Easier programming model, which might lead to fewer hard-to-find bugs, and faster implementation.
Support for blocking APIs. These are all over the place. Having a thread handle many/all of the clients and then go on to make a blocking call to a database is going to stall everyone. Even reading files can block your application, and you usually can't monitor regular file handles/descriptors for I/O events - or when you can, the programming model is often exceptionally complicated.
The drawback here is that it won't scale, at least not with the most widely used languages/frameworks. Having thousands of native threads will hurt performance. Though some languages provide a much more lightweight approach here, such as Erlang and, to some extent, Go.

How to find out the optimal amount of threads?

I'm planning to make a piece of software with a lot of peer-to-peer-like network connections. Normally I would create a thread of its own for every connection to send and receive data, but in this case with 300-500+ connections it would mean continuously creating and destroying a lot of threads, which would be a big overhead I guess. And making one thread that handles all the connections sequentially could probably slow things down a little. (I'm not really sure about this.)
The question is: how many threads would be optimal to handle this kind of problem? Would it be possible to calculate it in the software so it can decide by itself to create fewer threads on an old computer without many resources and more on new ones?
It's a theoretical question; I wouldn't like to make it implementation- or language-dependent. However, I think a lot of people would advise something like "Just use a ThreadPool, it will handle stuff like that", so let's say it will not be a .NET application. (I'll probably have to use some parts of the code from an old Delphi project, so the language will probably be Delphi or maybe C++, but it's not decided yet.)
Understanding the performance of your application under load is key; as mentioned before, profiling, measurement, and re-testing are the way to go.
As a general guide, Goetz suggests
threads = number of CPUs + 1
for CPU-bound applications, and
threads = number of CPUs * (1 + wait time / service time)
for I/O-bound contexts.
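A small C++ sketch of those two rules of thumb (my own illustration; the formulas are just the ones quoted above, and hardware_concurrency() can return 0 on some platforms, hence the fallback):

#include <algorithm>
#include <cstdio>
#include <thread>

unsigned cpuBoundThreads() {
    unsigned cpus = std::max(1u, std::thread::hardware_concurrency());
    return cpus + 1;                                  // threads = CPUs + 1
}

unsigned ioBoundThreads(double waitTime, double serviceTime) {
    unsigned cpus = std::max(1u, std::thread::hardware_concurrency());
    return static_cast<unsigned>(cpus * (1.0 + waitTime / serviceTime));
}

int main() {
    std::printf("CPU bound: %u threads\n", cpuBoundThreads());
    // e.g. each request waits 50 ms on the network for every 5 ms of CPU work
    std::printf("IO bound:  %u threads\n", ioBoundThreads(50.0, 5.0));
}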
If this is Windows (you did mention .Net?), you should definitely implement this using I/O completion ports. This is the most efficient way to do Windows sockets I/O. There is an I/O-specific discussion of thread pool size at that documentation link.
The most important property of an I/O completion port to consider carefully is the concurrency value. The concurrency value of a completion port is specified when it is created with CreateIoCompletionPort via the NumberOfConcurrentThreads parameter. This value limits the number of runnable threads associated with the completion port. When the total number of runnable threads associated with the completion port reaches the concurrency value, the system blocks the execution of any subsequent threads associated with that completion port until the number of runnable threads drops below the concurrency value.
Basically, your reads and writes are all asynchronous and are serviced by a thread pool whose size you can modify. But try it with the default first.
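For illustration only (my own sketch, not from the answer, and Windows-only): the snippet below creates a completion port with the default concurrency value, starts a small pool of workers blocked in GetQueuedCompletionStatus, and feeds it completions with PostQueuedCompletionStatus. Real socket I/O would instead have completions arrive from WSARecv/WSASend calls made with OVERLAPPED structures, but the worker-pool side looks the same.

#include <windows.h>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    // Concurrency value 0 means "as many concurrently runnable threads as CPUs".
    HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);

    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        workers.emplace_back([iocp, i] {
            DWORD bytes; ULONG_PTR key; OVERLAPPED* ov;
            while (GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE)) {
                if (key == 0) break;                  // key 0 used here as a shutdown signal
                std::printf("worker %d handled completion key %llu\n",
                            i, static_cast<unsigned long long>(key));
            }
        });
    }

    for (ULONG_PTR k = 1; k <= 20; ++k)               // simulate 20 completed I/O operations
        PostQueuedCompletionStatus(iocp, 0, k, NULL);
    for (size_t i = 0; i < workers.size(); ++i)       // one shutdown post per worker
        PostQueuedCompletionStatus(iocp, 0, 0, NULL);
    for (auto& w : workers) w.join();
    CloseHandle(iocp);
}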
A good, free example of how to do this is at the Free Framework. There are some gotchas that looking at working code could help you short-circuit.
You could do a calculation based on cpu speed, cores, and memory space in your install and set a constant somewhere to tell your application how many threads to use. Semaphores and thread pools come to mind.
Personally I would separate the listening sockets from the sending ones and open sending sockets at runtime instead of running them as daemons; listening sockets can run as daemons.
Multithreading can be its own headache and introduce many bugs. The best thing to do is make a thread do one thing and block when processing to avoid undesired and unpredictable results.
1. Make the number of threads configurable.
2. Target a few specific configurations that are the most common ones that you expect to support.
3. Get a good performance profiler / instrument your code, then rigorously test with different values of 1 for all the different configurations in 2 until you find an optimal value that works for each configuration.
I know this might seem like a not-so-smart way to do things, but I think when it comes to performance, benchmarking the results via testing is the only sure-fire way to really know how well (or badly) it will work.
Edit: +1 to the question whose link is posted by paxDiablo above as a comment. It's almost the same question and there's loads of information there, including a very detailed reply by paxDiablo himself.
One thread per CPU, processing several (hundreds) connections.

Why should I use a thread vs. using a process?

Separating different parts of a program into different processes seems (to me) to make a more elegant program than just threading everything. In what scenario would it make sense to make things run on a thread vs. separating the program into different processes? When should I use a thread?
Edit
Anything on how (or if) they act differently with single-core and multi-core would also be helpful.
You'd prefer multiple threads over multiple processes for two reasons:
Inter-thread communication (sharing data etc.) is significantly simpler to program than inter-process communication.
Context switches between threads are faster than between processes. That is, it's quicker for the OS to stop one thread and start running another than do the same with two processes.
Example:
Applications with GUIs typically use one thread for the GUI and others for background computation. The spellchecker in MS Office, for example, is a separate thread from the one running the Office user interface. In such applications, using multiple processes instead would result in slower performance and code that's tough to write and maintain.
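A hedged, console-only sketch of that split (the Office spellchecker example is the answerer's; this code is just an illustration): one "UI" thread keeps responding while a worker does the heavy computation in the background, signalling completion through an atomic flag.

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

std::atomic<bool> checkDone{false};

void spellcheckLikeWork() {                    // stands in for the heavy background task
    std::this_thread::sleep_for(std::chrono::seconds(1));
    checkDone = true;
}

int main() {
    std::thread worker(spellcheckLikeWork);
    while (!checkDone) {                       // the "UI" loop stays responsive meanwhile
        std::cout << "handling user input...\n";
        std::this_thread::sleep_for(std::chrono::milliseconds(200));
    }
    worker.join();
    std::cout << "background check finished\n";
}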
Well, apart from the advantages of using threads over processes, like:
Advantages:
Much quicker to create a thread than a process.
Much quicker to switch between threads than to switch between processes.
Threads share data easily.
Consider a few disadvantages too:
No security between threads.
One thread can stomp on another thread's data.
If one thread blocks, all threads in the task block.
As to the important part of your question, "When should I use a thread?":
Well, you should consider the fact that threads should not alter the semantics of a program. They simply change the timing of operations. As a result, they are almost always used as an elegant solution to performance-related problems. Here are some examples of situations where you might use threads:
Doing lengthy processing: When a Windows application is calculating, it cannot process any more messages. As a result, the display cannot be updated.
Doing background processing: Some tasks may not be time critical, but need to execute continuously.
Doing I/O work: I/O to disk or to network can have unpredictable delays. Threads allow you to ensure that I/O latency does not delay unrelated parts of your application.
I assume you already know you need a thread or a process, so I'd say the main reason to pick one over the other would be data sharing.
Use of a process means you also need Inter Process Communication (IPC) to get data in and out of the process. This is a good thing if the process is to be isolated though.
You sure don't sound like a newbie. It's an excellent observation that processes are, in many ways, more elegant. Threads are basically an optimization to avoid too many transitions or too much communication between memory spaces.
Superficially, using threads may also seem like it makes your program easier to read and write, because you can share variables and memory between the threads freely. In practice, doing that requires very careful attention to avoid race conditions and deadlocks.
There are operating-system kernels (most notably L4) that try very hard to improve the efficiency of inter-process communication. For such systems one could probably make a convincing argument that threads are pointless.
I would like to answer this in a different way. "It depends on your application's working scenario and performance SLA" would be my answer.
For instance, threads share the same address space, and communication between threads may be faster and easier, but it is also possible that under certain conditions threads deadlock; then what do you think would happen to your process?
Even if you are a programming whiz and have used all the fancy thread synchronization mechanisms to prevent deadlocks, it is not rocket science to see that avoiding them reliably requires a deterministic model. That may be the case with hard real-time systems running on real-time OSes, where you have a certain degree of control over thread priorities and can expect the OS to respect them, but it may not be the case with general-purpose OSes like Windows.
From a Design perspective too you might want to isolate your functionality into independent self contained modules where they may not really need to share the same address space or memory or even talk to each other. This is a case where processes will make sense.
Take the case of Google Chrome where multiple processes are spawned as opposed to most browsers which use a multi-threaded model.
Each tab in Chrome can be talking to a different server and rendering a different website. Imagine what would happen if one website stopped responding and if you had a thread stalled due to this, the entire browser would either slow down or come to a stop.
So Google decided to spawn multiple processes and that is why even if one tab freezes you can still continue using other tabs of your Chrome browser.
I agree with most of the answers above. But speaking from a design perspective, I would rather go for a thread when I want a set of logically co-related operations to be carried out in parallel. For example, if you run a word processor, there will be one thread running in the foreground as an editor and another thread running in the background auto-saving the document at regular intervals; no one would design a separate process to do that auto-saving task.
In addition to the other answers, maintaining and deploying a single process is a lot simpler than having a few executables.
One would use multiple processes/executables to provide a well-defined interface/decoupling so that one part or the other can be reused or reimplemented more easily than keeping all the functionality in one process.
Came across this post. Interesting discussion, but I felt one point was missing or only indirectly made.
Creating a new process is costly because of all of the data structures that must be allocated and initialized. The process is subdivided into different threads of control to achieve multithreading inside the process.
Using a thread or a process to achieve the target is based on your program usage requirements and resource utilization.

What kinds of applications need to be multi-threaded?

What are some concrete examples of applications that need to be multi-threaded, or don't need to be, but are much better that way?
Answers would be best if in the form of one application per post that way the most applicable will float to the top.
There is no hard and fast answer, but most of the time you will not see any advantage for systems where the workflow/calculation is sequential. If however the problem can be broken down into tasks that can be run in parallel (or the problem itself is massively parallel [as some mathematics or analytical problems are]), you can see large improvements.
If your target hardware is a single processor/core, you're unlikely to see any improvement with multi-threaded solutions (as only one thread can run at a time anyway!)
Writing multi-threaded code is often harder as you may have to invest time in creating thread management logic.
Some examples
Image processing can often be done in parallel (e.g. split the image into 4 and do the work in 1/4 of the time) but it depends upon the algorithm being run to see if that makes sense.
Rendering of animation (from 3DMax, etc.) is massively parallel as each frame can be rendered independently of the others -- meaning that tens or hundreds of computers can be chained together to help out.
GUI programming often benefits from having at least two threads when doing something slow, e.g. processing a large number of files - this allows the interface to remain responsive whilst the worker does the hard work (in C# the BackgroundWorker is an example of this).
GUIs are an interesting area as the "responsiveness" of the interface can be maintained without multi-threading if the worker algorithm keeps the main GUI "alive" by giving it time; in Windows API terms (before .NET, etc.) this could be achieved by a primitive loop and no need for threading:
MSG msg;
while (GetMessage(&msg, hwnd, 0, 0))
{
    TranslateMessage(&msg);
    DispatchMessage(&msg);
    // do some stuff here and then release, the loop will come back
    // almost immediately (unless the user has quit)
}
Servers are typically multi-threaded (web servers, radius servers, email servers, any server): you usually want to be able to handle multiple requests simultaneously. If you do not want to wait for a request to end before you start to handle a new request, then you mainly have two options:
Run a process with multiple threads
Run multiple processes
Launching a process is usually more resource-intensive than launching a thread (or picking one in a thread pool), so servers are usually multi-threaded. Moreover, threads can communicate directly since they share the same memory space.
The problem with multiple threads is that they are usually harder to code right than multiple processes.
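To illustrate "picking one in a thread pool" (a generic C++ sketch, not any particular server's code): requests are queued as std::function tasks and served by a few long-lived threads instead of one process or one thread per request.

#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class ThreadPool {
public:
    explicit ThreadPool(unsigned n) {
        for (unsigned i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~ThreadPool() {                                   // drains the queue, then joins
        { std::lock_guard<std::mutex> lock(m_); stopping_ = true; }
        cv_.notify_all();
        for (auto& w : workers_) w.join();
    }
    void submit(std::function<void()> task) {         // enqueue one "request"
        { std::lock_guard<std::mutex> lock(m_); tasks_.push(std::move(task)); }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (stopping_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();                                   // handle the request
        }
    }
    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex m_;
    std::condition_variable cv_;
    bool stopping_ = false;
};

int main() {
    ThreadPool pool(4);
    for (int i = 0; i < 8; ++i)
        pool.submit([i] { std::cout << "served request " << i << "\n"; });
}   // pool destructor waits for all queued requests to finish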
There are really three classes of reasons that multithreading would be applied:
Execution concurrency to improve compute performance: If you have a problem that can be broken down into pieces and you also have more than one execution unit (processor core) available, then dispatching the pieces into separate threads is the path to being able to use two or more cores at once.
Concurrency of CPU and IO Operations: This is similar in thinking to the first one but in this case the objective is to keep the CPU busy AND also IO operations (ie: disk I/O) moving in parallel rather than alternating between them.
Program Design and Responsiveness: Many types of programs can take advantage of threading as a program design benefit to make the program more responsive to the user. For example the program can be interacting via the GUI and also doing something in the background.
Concrete Examples:
Microsoft Word: Edit document while the background grammar and spell checker works to add all the green and red squiggle underlines.
Microsoft Excel: Automatic background recalculations after cell edits
Web Browser: Dispatch multiple threads to load each of the several HTML references in parallel during a single page load. Speeds page loads and maximizes TCP/IP data throughput.
These days, the answer should be Any application that can be.
The speed of execution for a single thread pretty much peaked years ago - processors have been getting faster by adding cores, not by increasing clock speeds. There have been some architectural improvements that make better use of the available clock cycles, but really, the future is taking advantage of threading.
There is a ton of research going on into finding ways of parallelizing activities that we traditionally wouldn't think of parallelizing. Even something as simple as finding a substring within a string can be parallelized.
Basically there are two reasons to multi-thread:
To be able to do processing tasks in parallel. This only applies if you have multiple cores/processors, otherwise on a single core/processor computer you will slow the task down compared to the version without threads.
I/O whether that be networked I/O or file I/O. Normally if you call a blocking I/O call, the process has to wait for the call to complete. Since the processor/memory are several orders of magnitude quicker than a disk drive (and a network is even slower) it means the processor will be waiting a long time. The computer will be working on other things but your application will not be making any progress. However if you have multiple threads, the computer will schedule your application and the other threads can execute. One common use is a GUI application. Then while the application is doing I/O the GUI thread can keep refreshing the screen without looking like the app is frozen or not responding. Even on a single processor putting I/O in a different thread will tend to speed up the application.
The single-threaded alternative to 2 is to use asynchronous calls, which return immediately while you keep control of your program. Then you have to check when the I/O completes and handle it. It is often simpler just to use a thread to do the I/O with synchronous calls, as they tend to be easier.
The reason to use threads instead of separate processes is that threads can share data more easily than multiple processes. And sometimes switching between threads is less expensive than switching between processes.
As another note, for #1 Python threads won't work, because in Python only one Python instruction can be executed at a time (this is known as the GIL, or Global Interpreter Lock). I use that as an example, but you need to check for your language. In Python, if you want to do parallel calculations, you need to use separate processes.
Many GUI frameworks are multi-threaded. This allows you to have a more responsive interface. For example, you can click on a "Cancel" button at any time while a long calculation is running.
Note that there are other solutions for this (for example the program can pause the calculation every half-a-second to check whether you clicked on the Cancel button or not), but they do not offer the same level of responsiveness (the GUI might seem to freeze for a few seconds while a file is being read or a calculation being done).
All the answers so far are focusing on the fact that multi-threading or multi-processing are necessary to make the best use of modern hardware.
There is, however, also the fact that multithreading can make life much easier for the programmer. At work I program software to control manufacturing and testing equipment, where a single machine often consists of several positions that work in parallel. Using multiple threads for that kind of software is a natural fit, as the parallel threads model the physical reality quite well. The threads mostly do not need to exchange any data, so the need to synchronize threads is rare, and many of the reasons multithreading is difficult therefore do not apply.
Edit:
This is not really about a performance improvement, as the (maybe 5, maybe 10) threads are all mostly sleeping. It is however a huge improvement for the program structure when the various parallel processes can be coded as sequences of actions that do not know of each other. I have very bad memories from the times of 16 bit Windows, when I would create a state machine for each machine position, make sure that nothing would take longer than a few milliseconds, and constantly pass the control to the next state machine. When there were hardware events that needed to be serviced on time, and also computations that took a while (like FFT), then things would get ugly real fast.
Not directly answering your question, I believe in the very near future, almost every application will need to be multithreaded. The CPU performance is not growing that fast these days, which is compensated for by the increasing number of cores. Thus, if we will want our applications to stay on the top performance-wise, we'll need to find ways to utilize all your computer's CPUs and keep them busy, which is quite a hard job.
This can be done by telling your programs what to do instead of telling them exactly how. Now, this is a topic I have personally found very interesting recently. Some functional languages, like F#, are able to parallelize many tasks quite easily. Well, not THAT easily, but still without the infrastructure needed in more procedural-style environments.
Please take this as additional information to think about, not an attempt to answer your question.
The kind of applications that need to be threaded are the ones where you want to do more than one thing at once. Other than that no application needs to be multi-threaded.
Applications with a large workload which can easily be made parallel. The difficulty of taking your application and doing that should not be underestimated. It is easy when the data you're manipulating is not dependent on other data, but very hard to schedule the cross-thread work when there is a dependency.
Some examples I've done which are good multithreaded candidates..
running scenarios (eg stock derivative pricing, statistics)
bulk updating data files (eg adding a value / entry to 10,000 records)
other mathematical processes
E.g., you want your programs to be multithreaded when you want to utilize multiple cores and/or CPUs, even when the programs don't necessarily do many things at the same time.
EDIT: using multiple processes is the same thing. Which technique to use depends on the platform and how you are going to do communications within your program, etc.
Although frivolous, games in general are becoming more and more threaded every year. At work our game uses around 10 threads doing physics, AI, animation, rendering, network and IO.
Just want to add that caution must be taken with threads if you're sharing any resources, as this can lead to some very strange behavior, with your code not working correctly or even the threads locking each other out.
A mutex will help you there, as you can use mutex locks for protected code regions; an example of a protected code region would be reading or writing shared memory between threads.
just my 2 cents worth.
The main purpose of multithreading is to separate time domains. So the uses are everywhere where you want several things to happen in their own distinctly separate time domains.
HERE IS A PERFECT USE CASE
If you like affiliate marketing, multi-threading is essential. Kick the entire process off via a multi-threaded application.
Download merchant files via FTP, unzip the files, enumerate through each file performing cleanup like converting Unix EOL terminators to PC CRLF, then slam each into SQL Server via bulk inserts; then, when all threads are complete, create the full-text search indexes for an environment instance to go live tomorrow, and you're done. All automated to kick off at, say, 11:00 pm.
BOOM! Fast as lightning. Heck, you have so much time left you can even download merchant images locally for the products you download, save the images as webp, and set the product URLs to use the local images.
Yep, I did it. Wrote it in C#. Works like a charm. Purchase an AMD Ryzen Threadripper 64-core with 256 GB memory and fast drives like NVMe, get lunch, come back and see it all done, or just stay around and watch all cores peg to 95%+, listen to the PC's fans kick in, warm up the room, and look outside as the neighbors' lights flicker from the power drain while you get shit done.
Future would be to push processing to GPU's as well.
Ok, well, I am pushing it a little bit with the neighbors' lights flickering, but all else was absolutely true. :)
