What kinds of applications need to be multi-threaded? - multithreading

What are some concrete examples of applications that need to be multi-threaded, or don't need to be, but are much better that way?
Answers would be best if in the form of one application per post that way the most applicable will float to the top.

There is no hard and fast answer, but most of the time you will not see any advantage for systems where the workflow/calculation is sequential. If however the problem can be broken down into tasks that can be run in parallel (or the problem itself is massively parallel [as some mathematics or analytical problems are]), you can see large improvements.
If your target hardware is single processor/core, you're unlikely to see any improvement with multi-threaded solutions (as there is only one thread at a time run anyway!)
Writing multi-threaded code is often harder as you may have to invest time in creating thread management logic.
Some examples
Image processing can often be done in parallel (e.g. split the image into 4 and do the work in 1/4 of the time) but it depends upon the algorithm being run to see if that makes sense.
Rendering of animation (from 3DMax,etc.) is massively parallel as each frame can be rendered independently to others -- meaning that 10's or 100's of computers can be chained together to help out.
GUI programming often helps to have at least two threads when doing something slow, e.g. processing large number of files - this allows the interface to remain responsive whilst the worker does the hard work (in C# the BackgroundWorker is an example of this)
GUI's are an interesting area as the "responsiveness" of the interface can be maintained without multi-threading if the worker algorithm keeps the main GUI "alive" by giving it time, in Windows API terms (before .NET, etc) this could be achieved by a primitive loop and no need for threading:
MSG msg;
while(GetMessage(&msg, hwnd, 0, 0))
{
TranslateMessage(&msg);
DispatchMessage(&msg);
// do some stuff here and then release, the loop will come back
// almost immediately (unless the user has quit)
}

Servers are typically multi-threaded (web servers, radius servers, email servers, any server): you usually want to be able to handle multiple requests simultaneously. If you do not want to wait for a request to end before you start to handle a new request, then you mainly have two options:
Run a process with multiple threads
Run multiple processes
Launching a process is usually more resource-intensive than lauching a thread (or picking one in a thread-pool), so servers are usually multi-threaded. Moreover, threads can communicate directly since they share the same memory space.
The problem with multiple threads is that they are usually harder to code right than multiple processes.

There are really three classes of reasons that multithreading would be applied:
Execution Concurrency to improve compute performance: If you have a problem that can be broken down into pieces and you also have more than one execution unit (processor core) available then dispatching the pieces into separate threads is the path to being able to simultaneously use two or more cores at once.
Concurrency of CPU and IO Operations: This is similar in thinking to the first one but in this case the objective is to keep the CPU busy AND also IO operations (ie: disk I/O) moving in parallel rather than alternating between them.
Program Design and Responsiveness: Many types of programs can take advantage of threading as a program design benefit to make the program more responsive to the user. For example the program can be interacting via the GUI and also doing something in the background.
Concrete Examples:
Microsoft Word: Edit document while the background grammar and spell checker works to add all the green and red squiggle underlines.
Microsoft Excel: Automatic background recalculations after cell edits
Web Browser: Dispatch multiple threads to load each of the several HTML references in parallel during a single page load. Speeds page loads and maximizes TCP/IP data throughput.

These days, the answer should be Any application that can be.
The speed of execution for a single thread pretty much peaked years ago - processors have been getting faster by adding cores, not by increasing clock speeds. There have been some architectural improvements that make better use of the available clock cycles, but really, the future is taking advantage of threading.
There is a ton of research going on into finding ways of parallelizing activities that we traditionally wouldn't think of parallelizing. Even something as simple as finding a substring within a string can be parallelized.

Basically there are two reasons to multi-thread:
To be able to do processing tasks in parallel. This only applies if you have multiple cores/processors, otherwise on a single core/processor computer you will slow the task down compared to the version without threads.
I/O whether that be networked I/O or file I/O. Normally if you call a blocking I/O call, the process has to wait for the call to complete. Since the processor/memory are several orders of magnitude quicker than a disk drive (and a network is even slower) it means the processor will be waiting a long time. The computer will be working on other things but your application will not be making any progress. However if you have multiple threads, the computer will schedule your application and the other threads can execute. One common use is a GUI application. Then while the application is doing I/O the GUI thread can keep refreshing the screen without looking like the app is frozen or not responding. Even on a single processor putting I/O in a different thread will tend to speed up the application.
The single threaded alternative to 2 is to use asynchronous calls where they return immediately and you keep controlling your program. Then you have to see when the I/O completes and manage using it. It is often simpler just to use a thread to do the I/O using the synchronous calls as they tend to be easier.
The reason to use threads instead of separate processes is because threads should be able to share data easier than multiple processes. And sometimes switching between threads is less expensive than switching between processes.
As another note, for #1 Python threads won't work because in Python only one python instruction can be executed at a time (known as the GIL or Global Interpreter Lock). I use that as an example but you need to check around your language. In python if you want to do parallel calculations, you need to do separate processes.

Many GUI frameworks are multi-threaded. This allows you to have a more responsive interface. For example, you can click on a "Cancel" button at any time while a long calculation is running.
Note that there are other solutions for this (for example the program can pause the calculation every half-a-second to check whether you clicked on the Cancel button or not), but they do not offer the same level of responsiveness (the GUI might seem to freeze for a few seconds while a file is being read or a calculation being done).

All the answers so far are focusing on the fact that multi-threading or multi-processing are necessary to make the best use of modern hardware.
There is however also the fact that multithreading can make life much easier for the programmer. At work I program software to control manufacturing and testing equipment, where a single machine often consists of several positions that work in parallel. Using multiple threads for that kind of software is a natural fit, as the parallel threads model the physical reality quite well. The threads do mostly not need to exchange any data, so the need to synchronize threads is rare, and many of the reasons for multithreading being difficult do therefore not apply.
Edit:
This is not really about a performance improvement, as the (maybe 5, maybe 10) threads are all mostly sleeping. It is however a huge improvement for the program structure when the various parallel processes can be coded as sequences of actions that do not know of each other. I have very bad memories from the times of 16 bit Windows, when I would create a state machine for each machine position, make sure that nothing would take longer than a few milliseconds, and constantly pass the control to the next state machine. When there were hardware events that needed to be serviced on time, and also computations that took a while (like FFT), then things would get ugly real fast.

Not directly answering your question, I believe in the very near future, almost every application will need to be multithreaded. The CPU performance is not growing that fast these days, which is compensated for by the increasing number of cores. Thus, if we will want our applications to stay on the top performance-wise, we'll need to find ways to utilize all your computer's CPUs and keep them busy, which is quite a hard job.
This can be done via telling your programs what to do instead of telling them exactly how. Now, this is a topic I personally find very interesting recently. Some functional languages, like F#, are able to parallelize many tasks quite easily. Well, not THAT easily, but still without the necessary infrastructure needed in more procedural-style environments.
Please take this as additional information to think about, not an attempt to answer your question.

The kind of applications that need to be threaded are the ones where you want to do more than one thing at once. Other than that no application needs to be multi-threaded.

Applications with a large workload which can be easily made parallel. The difficulty of taking your application and doing that should not be underestimated. It is easy when your data you're manipulating is not dependent upon other data but v. hard to schedule the cross thread work when there is a dependency.
Some examples I've done which are good multithreaded candidates..
running scenarios (eg stock derivative pricing, statistics)
bulk updating data files (eg adding a value / entry to 10,000 records)
other mathematical processes

E.g., you want your programs to be multithreaded when you want to utilize multiple cores and/or CPUs, even when the programs don't necessarily do many things at the same time.
EDIT: using multiple processes is the same thing. Which technique to use depends on the platform and how you are going to do communications within your program, etc.

Although frivolous, games, in general are becomming more and more threaded every year. At work our game uses around 10 threads doing physics, AI, animation, redering, network and IO.

Just want to add that caution must be taken with treads if your sharing any resources as this can lead to some very strange behavior, and your code not working correctly or even the threads locking each other out.
mutex will help you there as you can use mutex locks for protected code regions, a example of protected code regions would be reading or writing to shared memory between threads.
just my 2 cents worth.

The main purpose of multithreading is to separate time domains. So the uses are everywhere where you want several things to happen in their own distinctly separate time domains.

HERE IS A PERFECT USE CASE
If you like affiliate marketing multi-threading is essential. Kick the entire process off via a multi-threaded application.
Download merchant files via FTP, unzipping the files, enumerating through each file performing cleanup like EOL terminators from Unix to PC CRLF then slam each into SQL Server via Bulk Inserts then when all threads are complete create the full text search indexes for a environmental instance to be live tomorrow and your done. All automated to kick off at say 11:00 pm.
BOOM! Fast as lightening. Heck you have so much time left you can even download merchant images locally for the products you download, save the images as webp and set the product urls to use local images.
Yep I did it. Wrote it in C#. Works like a charm. Purchase a AMD Ryzen Threadripper 64-core with 256gb memory and fast drives like nvme, get lunch come back and see it all done or just stay around and watch all cores peg to 95%+, listen to the pc's fans kick, warm up the room and the look outside as the neighbors lights flicker from the power drain as you get shit done.
Future would be to push processing to GPU's as well.
Ok well I am pushing it a little bit with the neighbors lights flickering but all else was absolutely true. :)

Related

How to articulate the difference between asynchronous and parallel programming?

Many platforms promote asynchrony and parallelism as means for improving responsiveness. I understand the difference generally, but often find it difficult to articulate in my own mind, as well as for others.
I am a workaday programmer and use async & callbacks fairly often. Parallelism feels exotic.
But I feel like they are easily conflated, especially at the language design level. Would love a clear description of how they relate (or don't), and the classes of programs where each is best applied.
When you run something asynchronously it means it is non-blocking, you execute it without waiting for it to complete and carry on with other things. Parallelism means to run multiple things at the same time, in parallel. Parallelism works well when you can separate tasks into independent pieces of work.
Take for example rendering frames of a 3D animation. To render the animation takes a long time so if you were to launch that render from within your animation editing software you would make sure it was running asynchronously so it didn't lock up your UI and you could continue doing other things. Now, each frame of that animation can also be considered as an individual task. If we have multiple CPUs/Cores or multiple machines available, we can render multiple frames in parallel to speed up the overall workload.
I believe the main distinction is between concurrency and parallelism.
Async and Callbacks are generally a way (tool or mechanism) to express concurrency i.e. a set of entities possibly talking to each other and sharing resources.
In the case of async or callback communication is implicit while sharing of resources is optional (consider RMI where results are computed in a remote machine).
As correctly noted this is usually done with responsiveness in mind; to not wait for long latency events.
Parallel programming has usually throughput as the main objective while latency, i.e. the completion time for a single element, might be worse than a equivalent sequential program.
To better understand the distinction between concurrency and parallelism I am going to quote from Probabilistic models for concurrency of Daniele Varacca which is a good set of notes for theory of concurrency:
A model of computation is a model for concurrency when it is able to represent systems as composed of independent autonomous components, possibly communicating with each other. The notion of concurrency should not be confused with the notion of parallelism. Parallel computations usually involve a central control which distributes the work among several processors. In concurrency we stress the independence of the components, and the fact that they communicate with each other. Parallelism is like ancient Egypt, where the Pharaoh decides and the slaves work. Concurrency is like modern Italy, where everybody does what they want, and all use mobile phones.
In conclusion, parallel programming is somewhat a special case of concurrency where separate entities collaborate to obtain high performance and throughput (generally).
Async and Callbacks are just a mechanism that allows the programmer to express concurrency.
Consider that well-known parallel programming design patterns such as master/worker or map/reduce are implemented by frameworks that use such lower level mechanisms (async) to implement more complex centralized interactions.
This article explains it very well: http://urda.cc/blog/2010/10/04/asynchronous-versus-parallel-programming
It has this about asynchronous programming:
Asynchronous calls are used to prevent “blocking” within an application. [Such a] call will spin-off in an already existing thread (such as an I/O thread) and do its task when it can.
this about parallel programming:
In parallel programming you still break up work or tasks, but the key differences is that you spin up new threads for each chunk of work
and this in summary:
asynchronous calls will use threads already in use by the system and parallel programming requires the developer to break the work up, spinup, and teardown threads needed.
async: Do this by yourself somewhere else and notify me when you complete(callback). By the time i can continue to do my thing.
parallel: Hire as many guys(threads) as you wish and split the job to them to complete quicker and let me know(callback) when you complete. By the time i might continue to do my other stuff.
the main difference is parallelism mostly depends on hardware.
My basic understanding is:
Asynchonous programming solves the problem of waiting around for an expensive operation to complete before you can do anything else. If you can get other stuff done while you're waiting for the operation to complete then that's a good thing. Example: keeping a UI running while you go and retrieve more data from a web service.
Parallel programming is related but is more concerned with breaking a large task into smaller chunks that can be computed at the same time. The results of the smaller chunks can then be combined to produce the overall result. Example: ray-tracing where the colour of individual pixels is essentially independent.
It's probably more complicated than that, but I think that's the basic distinction.
I tend to think of the difference in these terms:
Asynchronous: Go away and do this task, when you're finished come back and tell me and bring the results. I'll be getting on with other things in the mean time.
Parallel: I want you to do this task. If it makes it easier, get some folks in to help. This is urgent though, so I'll wait here until you come back with the results. I can do nothing else until you come back.
Of course an asynchronous task might make use of parallelism, but the differentiation - to my mind at least - is whether you get on with other things while the operation is being carried out or if you stop everything completely until the results are in.
It is a question of order of execution.
If A is asynchronous with B, then I cannot predict beforehand when subparts of A will happen with respect to subparts of B.
If A is parallel with B, then things in A are happening at the same time as things in B. However, an order of execution may still be defined.
Perhaps the difficulty is that the word asynchronous is equivocal.
I execute an asynchronous task when I tell my butler to run to the store for more wine and cheese, and then forget about him and work on my novel until he knocks on the study door again. Parallelism is happening here, but the butler and I are engaged in fundamentally different tasks and of different social classes, so we don't apply that label here.
My team of maids is working in parallel when each of them is washing a different window.
My race car support team is asynchronously parallel in that each team works on a different tire and they don't need to communicate with each other or manage shared resources while they do their job.
My football (aka soccer) team does parallel work as each player independently processes information about the field and moves about on it, but they are not fully asynchronous because they must communicate and respond to the communication of others.
My marching band is also parallel as each player reads music and controls their instrument, but they are highly synchronous: they play and march in time to each other.
A cammed gatling gun could be considered parallel, but everything is 100% synchronous, so it is as though one process is moving forward.
Why Asynchronous ?
With today's application's growing more and more connected and also potentially
long running tasks or blocking operations such as Network I/O or Database Operations.So it's very important to hide the latency of these operations by starting them in background and returning back to the user interface quickly as possible. Here Asynchronous come in to the picture, Responsiveness.
Why parallel programming?
With today's data sets growing larger and computations growing more complex. So it's very important to reduce the execution time of these CPU-bound operations, in this case, by dividing the workload into chunks and then executing those chunks simultaneously. We can call this as "Parallel" .
Obviously it will give high Performance to our application.
Asynchronous
Let's say you are the point of contact for your client and you need to be responsive i.e. you need to share status, complexity of operation, resources required etc whenever asked. Now you have a time-consuming operation to be done and hence cannot take this up as you need to be responsive to the client 24/7. Hence, you delegate the time-consuming operation to someone else so that you can be responsive. This is asynchronous.
Parallel programming
Let's say you have a task to read, say, 100 lines from a text file, and reading one line takes 1 second. Hence, you'll require 100 seconds to read the text file. Now you're worried that the client must wait for 100 seconds for the operation to finish. Hence you create 9 more clones and make each of them read 10 lines from the text file. Now the time taken is only 10 seconds to read 100 lines. Hence you have better performance.
To sum up, asynchronous coding is done to achieve responsiveness and parallel programming is done for performance.
Asynchronous: Running a method or task in background, without blocking. May not necessorily run on a separate thread. Uses Context Switching / time scheduling.
Parallel Tasks: Each task runs parallally. Does not use context switching / time scheduling.
I came here fairly comfortable with the two concepts, but with something not clear to me about them.
After reading through some of the answers, I think I have a correct and helpful metaphor to describe the difference.
If you think of your individual lines of code as separate but ordered playing cards (stop me if I am explaining how old-school punch cards work), then for each separate procedure written, you will have a unique stack of cards (don't copy & paste!) and the difference between what normally goes on when run code normally and asynchronously depends on whether you care or not.
When you run the code, you hand the OS a set of single operations (that your compiler or interpreter broke your "higher" level code into) to be passed to the processor. With one processor, only one line of code can be executed at any one time. So, in order to accomplish the illusion of running multiple processes at the same time, the OS uses a technique in which it sends the processor only a few lines from a given process at a time, switching between all the processes according to how it sees fit. The result is multiple processes showing progress to the end user at what seems to be the same time.
For our metaphor, the relationship is that the OS always shuffles the cards before sending them to the processor. If your stack of cards doesn't depend on another stack, you don't notice that your stack stopped getting selected from while another stack became active. So if you don't care, it doesn't matter.
However, if you do care (e.g., there are multiple processes - or stacks of cards - that do depend on each other), then the OS's shuffling will screw up your results.
Writing asynchronous code requires handling the dependencies between the order of execution regardless of what that ordering ends up being. This is why constructs like "call-backs" are used. They say to the processor, "the next thing to do is tell the other stack what we did". By using such tools, you can be assured that the other stack gets notified before it allows the OS to run any more of its instructions. ("If called_back == false: send(no_operation)" - not sure if this is actually how it is implemented, but logically, I think it is consistent.)
For parallel processes, the difference is that you have two stacks that don't care about each other and two workers to process them. At the end of the day, you may need to combine the results from the two stacks, which would then be a matter of synchronicity but, for execution, you don't care again.
Not sure if this helps but, I always find multiple explanations helpful. Also, note that asynchronous execution is not constrained to an individual computer and its processors. Generally speaking, it deals with time, or (even more generally speaking) an order of events. So if you send dependent stack A to network node X and its coupled stack B to Y, the correct asynchronous code should be able to account for the situation as if it was running locally on your laptop.
Generally, there are only two ways you can do more than one thing each time. One is asynchronous, the other is parallel.
From the high level, like the popular server NGINX and famous Python library Tornado, they both fully utilize asynchronous paradigm which is Single thread server could simultaneously serve thousands of clients (some IOloop and callback). Using ECF(exception control follow) which could implement the asynchronous programming paradigm. so asynchronous sometimes doesn't really do thing simultaneous, but some io bound work, asynchronous could really promotes the performance.
The parallel paradigm always refers multi-threading, and multiprocessing. This can fully utilize multi-core processors, do things really simultaneously.
Summary of all above answers
parallel computing:
▪ solves throughput issue.
Concerned with breaking a large task into smaller chunks
▪ is machine related (multi machine/core/cpu/processor needed), eg: master slave, map reduce.
Parallel computations usually involve a central control which distributes the work among several processors
asynchronous:
▪ solves latency issue
ie, the problem of 'waiting around' for an expensive operation to complete before you can do anything else
▪ is thread related (multi thread needed)
Threading (using Thread, Runnable, Executor) is one fundamental way to perform asynchronous operations in Java

Is there a point to multithreading?

I don’t want to make this subjective...
If I/O and other input/output-related bottlenecks are not of concern, then do we need to write multithreaded code? Theoretically the single threaded code will fare better since it will get all the CPU cycles. Right?
Would JavaScript or ActionScript have fared any better, had they been multithreaded?
I am just trying to understand the real need for multithreading.
I don't know if you have payed any attention to trends in hardware lately (last 5 years) but we are heading to a multicore world.
A general wake-up call was this "The free lunch is over" article.
On a dual core PC, a single-threaded app will only get half the CPU cycles. And CPUs are not getting faster anymore, that part of Moores law has died.
In the words of Herb Sutter The free lunch is over, i.e. the future performance path for computing will be in terms of more cores not higher clockspeeds. The thing is that adding more cores typically does not scale the performance of software that is not multithreaded, and even then it depends entirely on the correct use of multithreaded programming techniques, hence multithreading is a big deal.
Another obvious reason is maintaining a responsive GUI, when e.g. a click of a button initiates substantial computations, or I/O operations that may take a while, as you point out yourself.
The primary reason I use multithreading these days is to keep the UI responsive while the program does something time-consuming. Sure, it's not high-tech, but it keeps the users happy :-)
Most CPUs these days are multi-core. Put simply, that means they have several processors on the same chip.
If you only have a single thread, you can only use one of the cores - the other cores will either idle or be used for other tasks that are running. If you have multiple threads, each can run on its own core. You can divide your problem into X parts, and, assuming each part can run indepedently, you can finish the calculations in close to 1/Xth of the time it would normally take.
By definition, the fastest algorithm running in parallel will spend at least as much CPU time as the fastest sequential algorithm - that is, parallelizing does not decrease the amount of work required - but the work is distributed across several independent units, leading to a decrease in the real-time spent solving the problem. That means the user doesn't have to wait as long for the answer, and they can move on quicker.
10 years ago, when multi-core was unheard of, then it's true: you'd gain nothing if we disregard I/O delays, because there was only one unit to do the execution. However, the race to increase clock speeds has stopped; and we're instead looking at multi-core to increase the amount of computing power available. With companies like Intel looking at 80-core CPUs, it becomes more and more important that you look at parallelization to reduce the time solving a problem - if you only have a single thread, you can only use that one core, and the other 79 cores will be doing something else instead of helping you finish sooner.
Much of the multithreading is done just to make the programming model easier when doing blocking operations while maintaining concurrency in the program - sometimes languages/libraries/apis give you little other choice, or alternatives makes the programming model too hard and error prone.
Other than that the main benefit of multi threading is to take advantage of multiple CPUs/cores - one thread can only run at one processor/core at a time.
No. You can't continue to gain the new CPU cycles, because they exist on a different core and the core that your single-threaded app exists on is not going to get any faster. A multi-threaded app, on the other hand, will benefit from another core. Well-written parallel code can go up to about 95% faster- on a dual core, which is all the new CPUs in the last five years. That's double that again for a quad core. So while your single-threaded app isn't getting any more cycles than it did five years ago, my quad-threaded app has four times as many and is vastly outstripping yours in terms of response time and performance.
Your question would be valid had we only had single cores. The things is though, we mostly have multicore CPU's these days. If you have a quadcore and write a single threaded program, you will have three cores which is not used by your program.
So actually you will have at most 25% of the CPU cycles and not 100%. Since the technology today is to add more cores and less clockspeed, threading will be more and more crucial for performance.
That's kind of like asking whether a screwdriver is necessary if I only need to drive this nail. Multithreading is another tool in your toolbox to be used in situations that can benefit from it. It isn't necessarily appropriate in every programming situation.
Here are some answers:
You write "If input/output related problems are not bottlenecks...". That's a big "if". Many programs do have issues like that, remembering that networking issues are included in "IO", and in those cases multithreading is clearly worthwhile. If you are writing one of those rare apps that does no IO and no communication then multithreading might not be an issue
"The single threaded code will get all the CPU cycles". Not necessarily. A multi-threaded code might well get more cycles than a single threaded app. These days an app is hardly ever the only app running on a system.
Multithreading allows you to take advantage of multicore systems, which are becoming almost universal these days.
Multithreading allows you to keep a GUI responsive while some action is taking place. Even if you don't want two user-initiated actions to be taking place simultaneously you might want the GUI to be able to repaint and respond to other events while a calculation is taking place.
So in short, yes there are applications that don't need multithreading, but they are fairly rare and becoming rarer.
First, modern processors have multiple cores, so a single thraed will never get all the CPU cycles.
On a dualcore system, a single thread will utilize only half the CPU. On a 8-core CPU, it'll use only 1/8th.
So from a plain performance point of view, you need multiple threads to utilize the CPU.
Beyond that, some tasks are also easier to express using multithreading.
Some tasks are conceptually independent, and so it is more natural to code them as separate threads running in parallel, than to write a singlethreaded application which interleaves the two tasks and switches between them as necessary.
For example, you typically want the GUI of your application to stay responsive, even if pressing a button starts some CPU-heavy work process that might go for several minutes. In that time, you still want the GUI to work. The natural way to express this is to put the two tasks in separate threads.
Most of the answers here make the conclusion multicore => multithreading look inevitable. However, there is another way of utilizing multiple processors - multi-processing. On Linux especially, where, AFAIK, threads are implemented as just processes perhaps with some restrictions, and processes are cheap as opposed to Windows, there are good reasons to avoid multithreading. So, there are software architecture issues here that should not be neglected.
Of course, if the concurrent lines of execution (either threads or processes) need to operate on the common data, threads have an advantage. But this is also the main reason for headache with threads. Can such program be designed such that the pieces are as much autonomous and independent as possible, so we can use processes? Again, a software architecture issue.
I'd speculate that multi-threading today is what memory management was in the days of C:
it's quite hard to do it right, and quite easy to mess up.
thread-safety bugs, same as memory leaks, are nasty and hard to find
Finally, you may find this article interesting (follow this first link on the page). I admit that I've read only the abstract, though.

What are the tell-tale signs that my code needs to make use of multi-threading?

I am using a third party API which performs what I would assume are expensive operations in terms of time/resources used (image recognition, etc). What tell-tale signs are there that the code under test should be made to use threads to increase performance?
I have a profiler and will be profiling the code I write which will rely on this API.
Thanks
If you have two distinct sequences of events that don't depend on one-another, then consider it. If you have to write bunches of logic just to make sure that two operations aren't getting in each-others way, it pays off by making the two pieces of code clearer.
If on the other hand you find that, in attempting to make something multithreaded, you have to add gobs of code to communicate results between the threads, because one (or both) can't proceed without some information from the other, that's a good sign that you are trying to make threads where they don't make sense.
One case where it makes sense to go multi-threaded, even when you have to add communication to do it, is when you have one task that needs to stay available for input, and another to do heavy computing. One thread may poll for input from somewhere, blocking when none is available, so that when input is available it is responded to in a timely manner, and feed jobs to another 'worker' thread, so that processing continues at all times, not just when there's input.
One other thing to consider, is that even when a job is 'embarrassingly parallel' (i.e., requiring little or no communication between the parallelized parts), there are cases where multithreading may not be worthwhile. If your CPU can assign different threads to different cores, multithreading will give you a speed up, by allowing multiple cores to chew through the work simultaneously. But on a single core processor, or even a multi-core one with an unfortunate OS, having multiple threads will not speed things up, as the one core will still have to get through all the work.
Image processing is often cpu-bound. However, if your image-processing api already is designed to leverage multiple cpus, multi-threading probably won't help you. The strategy I usually consider for quickly determining if multi-threading will help is to write a simple program which does the relevant processing over and over again. Then, I will run it on a set of data, then run two instances of the process simultaneously,each on half of the data. There is no need to ensure the data is equalized for such a test; if one process runs out it will just run one instance for anything left. Timing is done via wall-clock time. I mean this literally; pick a large enough data set that it will take at least a full minute to run, but ideally 5 minutes or longer).
If running two copies at the same time improves throughput significantly, multi-threading is probably a good idea. Obviously this strategy is only practical in certain instances and in some cases multi-threading can involve leveraging shared output in ways this trick can't emulate. But, it's an absurdly easy test to run, and rarely requires much, if any, code to be written.

Multithreading in .NET 4.0 and performance

I've been toying around with the Parallel library in .NET 4.0. Recently, I developed a custom ORM for some unusual read/write operations one of our large systems has to use. This allows me to decorate an object with attributes and have reflection figure out what columns it has to pull from the database, as well as what XML it has to output on writes.
Since I envision this wrapper to be reused in many projects, I'd like to squeeze as much speed out of it as possible. This library will mostly be used in .NET web applications. I'm testing the framework using a throwaway console application to poke at the classes I've created.
I've now learned a lesson of the overhead that multithreading comes with. Multithreading causes it to run slower. From reading around, it seems like it's intuitive to people who've been doing it for a long time, but it's actually counter-intuitive to me: how can running a method 30 times at the same time be slower than running it 30 times sequentially?
I don't think I'm causing problems by multiple threads having to fight over the same shared object (though I'm not good enough at it yet to tell for sure or not), so I assume the slowdown is coming from the overhead of spawning all those threads and the runtime keeping them all straight. So:
Though I'm doing it mainly as a learning exercise, is this pessimization? For trivial, non-IO tasks, is multithreading overkill? My main goal is speed, not responsiveness of the UI or anything.
Would running the same multithreading code in IIS cause it to speed up because of already-created threads in the thread pool, whereas right now I'm using a console app, which I assume would be single-threaded until I told it otherwise? I'm about to run some tests, but I figure there's some base knowledge I'm missing to know why it would be one way or the other. My console app is also running on my desktop with two cores, whereas a server for a web app would have more, so I might have to use that as a variable as well.
Thread's don't actually all run concurrently.
On a desktop machine I'm presuming you have a dual core CPU, (maybe a quad at most). This means only 2/4 threads can be running at the same time.
If you have spawned 30 threads, the OS is going to have to context switch between those 30 threads to keep them all running. Context switches are quite costly, so hence the slowdown.
As a basic suggestion, I'd aim for 1 thread per CPU if you are trying to optimise calculations. Any more than this and you're not really doing any extra work, you are just swapping threads in an out on the same CPU. Try to think of your computer as having a limited number of workers inside, you can't do more work concurrently than the number of workers you have available.
Some of the new features in the .net 4.0 parallel task library allow you to do things that account for scalability in the number of threads. For example you can create a bunch of tasks and the task parallel library will internally figure out how many CPUs you have available, and optimise the number of threads is creates/uses so as not to overload the CPUs, so you could create 30 tasks, but on a dual core machine the TP library would still only create 2 threads, and queue the . Obviously, this will scale very nicely when you get to run it on a bigger machine. Or you can use something like ThreadPool.QueueUserWorkItem(...) to queue up a bunch of tasks, and the pool will automatically manage how many threads is uses to perform those tasks.
Yes there is a lot of overhead to thread creation, but if you are using the .net thread pool, (or the parallel task library in 4.0) .net will be managing your thread creation, and you may actually find it creates less threads than the number of tasks you have created. It will internally swap your tasks around on the available threads. If you actually want to control explicit creation of actual threads you would need to use the Thread class.
[Some cpu's can do clever stuff with threads and can have multiple Threads running per CPU - see hyperthreading - but check out your task manager, I'd be very surprised if you have more than 4-8 virtual CPUs on today's desktops]
There are so many issues with this that it pays to understand what is happening under the covers. I would highly recommend the "Concurrent Programming on Windows" book by Joe Duffy and the "Java Concurrency in Practice" book. The latter talks about processor architecture at the level you need to understand it when writing multithreaded code. One issue you are going to hit that's going to hurt your code is caching, or more likely the lack of it.
As has been stated there is an overhead to scheduling and running threads, but you may find that there is a larger overhead when you share data across threads. That data may be flushed from the processor cache into main memory, and that will cause serious slow downs to your code.
This is the sort of low-level stuff that managed environments are supposed to protect us from, however, when writing highly parallel code, this is exactly the sort of issue you have to deal with.
A colleague of mine recorded a screencast about the performance issue with Parallel.For and Parallel.ForEach which may help:
http://rocksolidknowledge.com/ScreenCasts.mvc/Watch?video=ParallelLoops.wmv
You're speaking of an ORM, so I presume some amount of I/O is going on. If this is the case, the overhead of thread creation and context switching is going to be comparatively non-existent.
Most likely, you're experiencing I/O contention: it can be slower (particularly on rotational hard drives, but also on other storage devices) to read the same set of data if you read it out of order than if you read it in-order. So, if you're executing 30 database queries, it's possible they'll run faster sequentially than in parallel if they're all backed by the same I/O device and the queries aren't in cache. Running them in parallel may cause the system to have a bunch of I/O read requests almost simultaneously, which may cause the OS to read little bits of each in turn - causing your drive head to jump back and forth, wasting precious milliseconds.
But that's just a guess; it's not possible to really determine what's causing your slowdown without knowing more.
Although thread creation is "extremely expensive" when compared to say adding two numbers, it's not usually something you'll easily overdo. If your operations are extremely short (say, a millisecond or less), using a thread-pool rather than new threads will noticeably save time. Generally though, if your operations are that short, you should reconsider the granularity of parallelism anyhow; perhaps you're better off splitting the computation into bigger chunks: for instance, by having a fairly low number of worker tasks which handle entire batches of smaller work-items at a time rather than each item separately.

Why should I use a thread vs. using a process?

Separating different parts of a program into different processes seems (to me) to make a more elegant program than just threading everything. In what scenario would it make sense to make things run on a thread vs. separating the program into different processes? When should I use a thread?
Edit
Anything on how (or if) they act differently with single-core and multi-core would also be helpful.
You'd prefer multiple threads over multiple processes for two reasons:
Inter-thread communication (sharing data etc.) is significantly simpler to program than inter-process communication.
Context switches between threads are faster than between processes. That is, it's quicker for the OS to stop one thread and start running another than do the same with two processes.
Example:
Applications with GUIs typically use one thread for the GUI and others for background computation. The spellchecker in MS Office, for example, is a separate thread from the one running the Office user interface. In such applications, using multiple processes instead would result in slower performance and code that's tough to write and maintain.
Well apart from advantages of using thread over process, like:
Advantages:
Much quicker to create a thread than
a process.
Much quicker to switch
between threads than to switch
between processes.
Threads share data
easily
Consider few disadvantages too:
No security between threads.
One thread can stomp on another thread's
data.
If one thread blocks, all
threads in task block.
As to the important part of your question "When should I use a thread?"
Well you should consider few facts that a threads should not alter the semantics of a program. They simply change the timing of operations. As a result, they are almost always used as an elegant solution to performance related problems. Here are some examples of situations where you might use threads:
Doing lengthy processing: When a windows application is calculating it cannot process any more messages. As a result, the display cannot be updated.
Doing background processing: Some
tasks may not be time critical, but
need to execute continuously.
Doing I/O work: I/O to disk or to
network can have unpredictable
delays. Threads allow you to ensure
that I/O latency does not delay
unrelated parts of your application.
I assume you already know you need a thread or a process, so I'd say the main reason to pick one over the other would be data sharing.
Use of a process means you also need Inter Process Communication (IPC) to get data in and out of the process. This is a good thing if the process is to be isolated though.
You sure don't sound like a newbie. It's an excellent observation that processes are, in many ways, more elegant. Threads are basically an optimization to avoid too many transitions or too much communication between memory spaces.
Superficially using threads may also seem like it makes your program easier to read and write, because you can share variables and memory between the threads freely. In practice, doing that requires very careful attention to avoid race conditions or deadlocks.
There are operating-system kernels (most notably L4) that try very hard to improve the efficiency of inter-process communication. For such systems one could probably make a convincing argument that threads are pointless.
I would like to answer this in a different way. "It depends on your application's working scenario and performance SLA" would be my answer.
For instance threads may be sharing the same address space and communication between threads may be faster and easier but it is also possible that under certain conditions threads deadlock and then what do you think would happen to your process.
Even if you are a programming whiz and have used all the fancy thread synchronization mechanisms to prevent deadlocks it certainly is not rocket science to see that unless a deterministic model is followed which may be the case with hard real time systems running on Real Time OSes where you have a certain degree of control over thread priorities and can expect the OS to respect these priorities it may not be the case with General Purpose OSes like Windows.
From a Design perspective too you might want to isolate your functionality into independent self contained modules where they may not really need to share the same address space or memory or even talk to each other. This is a case where processes will make sense.
Take the case of Google Chrome where multiple processes are spawned as opposed to most browsers which use a multi-threaded model.
Each tab in Chrome can be talking to a different server and rendering a different website. Imagine what would happen if one website stopped responding and if you had a thread stalled due to this, the entire browser would either slow down or come to a stop.
So Google decided to spawn multiple processes and that is why even if one tab freezes you can still continue using other tabs of your Chrome browser.
Read more about it here
and also look here
I agree to most of the answers above. But speaking from design perspective i would rather go for a thread when i want set of logically co-related operations to be carried out parallel. For example if you run a word processor there will be one thread running in foreground as an editor and other thread running in background auto saving the document at regular intervals so no one would design a process to do that auto saving task separately.
In addition to the other answers, maintaining and deploying a single process is a lot simpler than having a few executables.
One would use multiple processes/executables to provide a well-defined interface/decoupling so that one part or the other can be reused or reimplemented more easily than keeping all the functionality in one process.
Came across this post. Interesting discussion. but I felt one point is missing or indirectly pointed.
Creating a new process is costly because of all of the
data structures that must be allocated and initialized. The process is subdivided into different threads of control to achieve multithreading inside the process.
Using a thread or a process to achieve the target is based on your program usage requirements and resource utilization.

Resources