QProcess, QEventLoop - of any use for parallel-processing

QProcess, QEventLoop - of any use for parallel-processing - multithreading

I wonder whether I could use QEventLoop (QProcess?) to parallelize multiple calls to same function with Qt. What is precisely the difference with QtConcurrent or QThread? What is a process and an event loop more precisely? I read that QCoreApplication must exec() as early as possible in main() method, so that I wonder why it is different from main Thread.
could you point as some efficient reference to processes and thread with Qt? I came through the official doc and those things remain unclear.
Thanks and regards.

Process and thread are not Qt-specific concepts. You can search for "process vs. thread" anywhere for that distinction to be explained. For instance: What resources are shared between threads?
Though related concepts, spawning a new process is a more "heavyweight" form of parallelism than spawning a new thread within your existing process. Processes are protected from each other by default, while threads of execution within a process can read and write each other's memory directly. The protection you get from spawning processes comes at a greater run-time cost...and since independent processes can't read each other's memory, you have to share data between them using methods of inter-process communication.
Odds are that you want threads, because they're simpler to use in a case where one is writing all the code in a program. Given all the complexities in multithreaded programming, I'd suggest looking at a good book or reading some websites to start with. See: What are some good resources for learning threaded programming?
But if you want to dive in and just get a feel for how threading in Qt looks, you can spend time looking at the examples:
http://qt-project.org/doc/qt-4.8/examples-threadandconcurrent.html
QtConcurrent is an abstraction library that makes it easier to implement some kinds of parallel programming patterns. It's built on top of the QThread abstractions, and there's nothing it can do that you couldn't code yourself by writing to QThread directly. But it might make your code easier to write and less prone to errors.
As for an event loop...that is merely a generic term for how any given thread of execution in your program waits for work items to process, processes them, and can decide when it is no longer needed. If a thread's job were merely to start up, do some math, and exit...then it wouldn't need an event loop. But starting and stopping a thread takes time and churns resources. So typically threads live for longer periods of time, and have an event loop that knows how to wait for events it needs to respond to.
If you build on top of QtConcurrent, you won't have to worry about an event loop in your worker threads because they are managed automatically in a thread pool. The word count example is pretty simple to see:
http://qt-project.org/doc/qt-4.8/qtconcurrent-wordcount-main-cpp.html

Related

Does multithreading actually work in uniprocessor environment

Suppose we have a process with multiple threads in a uniprocessor.
Now I know that if we have several processes, only one of them will be processed at a time in a uniprocessor and hence the processes are not concurrent.
If my understanding is correct, similarly each thread will be processed at a time and not concurrent in a uniprocessor. Is this statement true? If so then does multithreading mean having more than one thread in a process and does not mean running multiple threads at a time? And does that mean there's no benefit of creating user threads in a uniprocessor environment?

TL;DR: threads are switching more often than processes and in real time we have an effect of concurrency because it is happens really fast.
when you wrote:
each thread will be processed at a time and not concurrent in a uni processor
Notice the word "concurrent", there is no real concurrency in uni processor, there is only effect of that thanks to the multiple number of context switches between processes.
Let's clarify something here, the single core of the CPU can handle one thread at a given time, each process has a main thread and (if needed) more threads running together. If a process A is now running and it has 3 threads: A1(main thread), A2, A3 all three will be running as long as process A is being processed by the CPU core. When a context switch occur process A is no longer running and now process B will run with his threads.
About this statement:
there's no benefit of creating user threads in a uni processor environment
That is not true. there is a benefit in creating threads, they are easier to create ("spawn" as in the books) and shearing the process heap memory. Creating a sub process ("child" as in the books) is a overhead comparing to a thread because a process need to have his own memory. For example each google chrome tab is a process not a thread, but this tab has multiple threads running concurrency with little responsibility.

If you are still somehow running a computer with just one, single-core, CPU, then you would be correct to observe that only one thread can be physically executing at one time. But that does not negate the value of breaking up the application into multiple threads and/or processes.
The essential benefit is concurrency. When one thread is waiting (e.g. for an input/output operation to complete), there is something else for the CPU to be doing in the meantime: it can be running a different thread that isn't waiting. With a carefully designed application, you can get much better utilization of every part of the hardware, more parallelism, and thus, more throughput.
My favorite go-to example is a fast food restaurant. About a dozen workers, each one doing different things, cooperate to bring your order to you. Even if one of them (say, "the fry guy") is standing around, someone else always has something to do. Several orders are in-process at once. This overlap, this "concurrency," is what you are shooting for – regardless of how many CPUs you have.
Multithreading is also commonly used with GUI applications that also need to do some kind of "heavy lifting." One thread handles the GUI interaction (and has no other real responsibilities) while other threads, with a slightly inferior priority (or "niceness") do the lifting. When a GUI event comes in, the GUI thread pre-empts the others and responds to it immediately, then of course goes right back to sleep again. But in this way the GUI always remains very responsive – even though the other threads are doing "heavy lifting" things, GUI messages are still handled very promptly. (I scooped-up about a 25% performance improvement by re-tooling an older application to use this approach, because the application was no longer "polling" for GUI events.)

The first question I ask about any thread is, "what does it wait for?" To me, a thread is defined by what event it waits for and what it does when that event happens.
Threads were in wide-spread use for at least a decade before multi-processor computers became commercially available. They are useful when you want to write a program that has to respond to un-synchronized events that come from multiple different sources. There's a few different ways to model a program like that. One way is to have a different thread to wait on each different event source. The next most popular is an event driven architecture in which there's a main loop that waits for all events and calls different event handler functions for each of the different kinds of event.
The multi-threaded style of program often is easier to read* because there's usually different activities going on inside the program, and the state of each activity can be implicit in the context (i.e., registers and call stack) of the thread that's driving it, while in the event-driven model, each activity's state must be explicitly encoded in some object.
The implicit-in-the-context way of keeping the state is much closer to the procedural style of coding a single activity that we learn as beginners.
*Easier to read does not mean that the code is easy to write without making bad and non-obvious mistakes!!

The main impetus for developing threads was Ada compliance. Prior to that, different operating systems had their own ways of handing multiple things at once. In eunuchs, the way to do more than one thing was to spin off a new process. In VMS, software interrupts (aka Asynchronous System Traps or Asynchronous Procedure Calls in Windoze). In those days (1970's) multiprocessor systems were rare.
One of the goals of Ada was to have a system independent way of doing things. It adopted the "task" which is effectively a thread. In order to support Ada, compiler developers had to include task (thread) libraries.
With the rise of multiprocessors, operating systems started to make threads (rather than processes) the basic schedulable unit in a system.
Threads then give a way for programs to handle multiple things simultaneously, even if there is only one processor. Sadly, support for threads in programming languages has been woefully lacking. Ada is the only major language I can think of that has real support for threads (tasks). Thread support in Java, for example, is a complete, sick joke. The result is threads are not as effective in practice as they could be.

How does process blocking apply to a multi-threaded process?

I've learned that a process has running, ready, blocked, and suspended states. Threads also have these states except for suspended because it lives in the process's address space.
A process blocks most of the time when it is doing a blocking i/o or waiting for an event.
I can easily picture out a process getting blocked if its single-threaded or if it follows a one-to-many model, but how does it work if the process is multi-threaded?
For example:
I have a process with two threads in a system that follows a one-to-one model. One handles the gui and the other handles the blocking i/o. I know the process remains responsive because the other thread handles the i/o.
So is there by any chance the process gets blocked or should I just rule it out in this case?
I'm just getting into these stuff so forgive me If I haven't understand some of the important details yet.

Let's say you have a work queue where the UI thread schedules work to be done and the I\O thread looks there for work to do. The work queue itself is data that is read and modified from both threads, therefor you must synchronize access somehow or race conditions result.
The naive approach is to synchronize access to the queue using a lock (aka critical section). If the I\O thread acquires the lock and then blocks, the UI thread will only remain responsive until it decides it needs to schedule work and tries to acquire the lock. A better approach is to use a lock-free queue about which much has been written and you can easily search for more info.
But to answer your question, yes, it is still much easier than you might think to cause UI to stutter / hang even when using multiple threads. There are various libraries that make it easier or harder to solve this problem, so depending on your OS and language of choice, there may be something better than just OS primitives. Win32 (from what I remember) doesn't it make it very easy at all despite having all sorts of synchronization primitives. Pthreads and Boost never seemed very straightforward to me either. Apple's GCD makes it semantically much easier to express what you want (in my opinion), though there are still pitfalls one must be aware of (such as scheduling too many blocking operations on a single work queue to be done in parallel and causing the processor to thrash when they all wake up at the same time).
My advice is to just dive in and write lots of multithreaded code. It can be tough to debug but you will learn a lot and eventually it becomes second nature.

should I create threads before hand to save time?

I am using python 2.7 .I am using multi-threading.Now if a thread dies I again
create one to compensate for it.So should I create a lot of threads before hand and store them
and use from them when one or more existing threads die or should I create one when some thread dies??
Which is more efficient in terms of time ??

When you say a thread "dies", do you mean you intentionally terminate it or it fails due to error?
If you're intentionally terminating it and you're worried about the time required to spawn a new thread, why not keep the thread persistent and simply have it do the job that the new thread would have done? This is a pretty standard approach - maintain a pool of "worker" threads and have a work queue with pending items to execute. They all run an identical loop which is to pull an item off the queue and execute it. These items can be objects with methods which contain the code to execute if it's convenient to work that way - if the tasks are all very similar then it might be easier to put the code into the thread's own function instead.
If you're talking about threads failing due to error, I wouldn't have imagined this was common enough to worry about it. If it is, you probably need to look at making your code more robust.
In either case, spawning a thread on most systems should be a lightweight activity - a lot more lightweight than spawning a whole new process, for example. As a result, I really wouldn't worry about keeping a pool of threads in reserve to use - that really sounds like early optimisation to me.
Even if spawning threads were slow, consider what you would be doing by spawning threads in advance - you would be taking up more memory (some memory in the OS to keep track of a the thread, some in Python for the objects that it uses to track the thread), although not a great deal; you'd also be spending more time at the start of your program creating all these threads. So, you might save a little time while you were running, but instead your program takes significantly longer to start. That doesn't sound like a sensible trade-off to me unless the speed and latency of your code is absolutely critical while it's running, and if speed is that critical then I'm not sure a pure Python solution is the right approach anyway. Something like C/C++ is going to give you better control of scheduling, at the expense of much more complexity.
In summary: seriously, don't worry about it, just spawn threads as you need them. Trust me, there will be much bigger speed problems elsewhere in your code which are much more deserving of your time.

What are multi-threading DOs and DONTs? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I am applying my new found knowledge of threading everywhere and getting lots of surprises
Example:
I used threads to add numbers in an
array. And outcome was different every
time. The problem was that all of my
threads were updating the same
variable and were not synchronized.
What are some known thread issues?
What care should be taken while using
threads?
What are good multithreading resources.
Please provide examples.
sidenote:(I renamed my program thread_add.java to thread_random_number_generator.java:-)

In a multithreading environment you have to take care of synchronization so two threads doesn't clobber the state by simultaneously performing modifications. Otherwise you can have race conditions in your code (for an example see the infamous Therac-25 accident.) You also have to schedule the threads to perform various tasks. You then have to make sure that your synchronization and scheduling doesn't cause a deadlock where multiple threads will wait for each other indefinitely.
Synchronization
Something as simple as increasing a counter requires synchronization:
counter += 1;
Assume this sequence of events:
counter is initialized to 0
thread A retrieves counter from memory to cpu (0)
context switch
thread B retrieves counter from memory to cpu (0)
thread B increases counter on cpu
thread B writes back counter from cpu to memory (1)
context switch
thread A increases counter on cpu
thread A writes back counter from cpu to memory (1)
At this point the counter is 1, but both threads did try to increase it. Access to the counter has to be synchronized by some kind of locking mechanism:
lock (myLock) {
counter += 1;
}
Only one thread is allowed to execute the code inside the locked block. Two threads executing this code might result in this sequence of events:
counter is initialized to 0
thread A acquires myLock
context switch
thread B tries to acquire myLock but has to wait
context switch
thread A retrieves counter from memory to cpu (0)
thread A increases counter on cpu
thread A writes back counter from cpu to memory (1)
thread A releases myLock
context switch
thread B acquires myLock
thread B retrieves counter from memory to cpu (1)
thread B increases counter on cpu
thread B writes back counter from cpu to memory (2)
thread B releases myLock
At this point counter is 2.
Scheduling
Scheduling is another form of synchronization and you have to you use thread synchronization mechanisms like events, semaphores, message passing etc. to start and stop threads. Here is a simplified example in C#:
AutoResetEvent taskEvent = new AutoResetEvent(false);
Task task;
// Called by the main thread.
public void StartTask(Task task) {
this.task = task;
// Signal the worker thread to perform the task.
this.taskEvent.Set();
// Return and let the task execute on another thread.
}
// Called by the worker thread.
void ThreadProc() {
while (true) {
// Wait for the event to become signaled.
this.taskEvent.WaitOne();
// Perform the task.
}
}
You will notice that access to this.task probably isn't synchronized correctly, that the worker thread isn't able to return results back to the main thread, and that there is no way to signal the worker thread to terminate. All this can be corrected in a more elaborate example.
Deadlock
A common example of deadlock is when you have two locks and you are not careful how you acquire them. At one point you acquire lock1 before lock2:
public void f() {
lock (lock1) {
lock (lock2) {
// Do something
}
}
}
At another point you acquire lock2 before lock1:
public void g() {
lock (lock2) {
lock (lock1) {
// Do something else
}
}
}
Let's see how this might deadlock:
thread A calls f
thread A acquires lock1
context switch
thread B calls g
thread B acquires lock2
thread B tries to acquire lock1 but has to wait
context switch
thread A tries to acquire lock2 but has to wait
context switch
At this point thread A and B are waiting for each other and are deadlocked.

There are two kinds of people that do not use multi threading.
1) Those that do not understand the concept and have no clue how to program it.
2) Those that completely understand the concept and know how difficult it is to get it right.

I'd make a very blatant statement:
DON'T use shared memory.
DO use message passing.
As a general advice, try to limit the amount of shared state and prefer more event-driven architectures.

I can't give you examples besides pointing you at Google. Search for threading basics, thread synchronisation and you'll get more hits than you know.
The basic problem with threading is that threads don't know about each other - so they will happily tread on each others toes, like 2 people trying to get through 1 door, sometimes they will pass though one after the other, but sometimes they will both try to get through at the same time and will get stuck. This is difficult to reproduce, difficult to debug, and sometimes causes problems. If you have threads and see "random" failures, this is probably the problem.
So care needs to be taken with shared resources. If you and your friend want a coffee, but there's only 1 spoon you cannot both use it at the same time, one of you will have to wait for the other. The technique used to 'synchronise' this access to the shared spoon is locking. You make sure you get a lock on the shared resource before you use it, and let go of it afterwards. If someone else has the lock, you wait until they release it.
Next problem comes with those locks, sometimes you can have a program that is complex, so much that you get a lock, do something else then access another resource and try to get a lock for that - but some other thread has that 2nd resource, so you sit and wait... but if that 2nd thread is waiting for the lock you hold for the 1st resource.. it's going to sit and wait. And your app just sits there. This is called deadlock, 2 threads both waiting for each other.
Those 2 are the vast majority of thread issues. The answer is generally to lock for as short a time as possible, and only hold 1 lock at a time.

I notice you are writing in java and that nobody else mentioned books so Java Concurrency In Practice should be your multi-threaded bible.

-- What are some known thread issues? --
Race conditions.
Deadlocks.
Livelocks.
Thread starvation.
-- What care should be taken while using threads? --
Using multi-threading on a single-processor machine to process multiple tasks where each task takes approximately the same time isn’t always very effective.For example, you might decide to spawn ten threads within your program in order to process ten separate tasks. If each task takes approximately 1 minute to process, and you use ten threads to do this processing, you won’t have access to any of the task results for the whole 10 minutes. If instead you processed the same tasks using just a single thread, you would see the first result in 1 minute, the next result 1 minute later, and so on. If you can make use of each result without having to rely on all of the results being ready simultaneously, the single
thread might be the better way of implementing the program.
If you launch a large number of threads within a process, the overhead of thread housekeeping and context switching can become significant. The processor will spend considerable time in switching between threads, and many of the threads won’t be able to make progress. In addition, a single process with a large number of threads means that threads in other processes will be scheduled less frequently and won’t receive a reasonable share of processor time.
If multiple threads have to share many of the same resources, you’re unlikely to see performance benefits from multi-threading your application. Many developers see multi-threading as some sort of magic wand that gives automatic performance benefits. Unfortunately multi-threading isn’t the magic wand that it’s sometimes perceived to be. If you’re using multi-threading for performance reasons, you should measure your application’s performance very closely in several different situations, rather than just relying on some non-existent magic.
Coordinating thread access to common data can be a big performance killer. Achieving good performance with multiple threads isn’t easy when using a coarse locking plan, because this leads to low concurrency and threads waiting for access. Alternatively, a fine-grained locking strategy increases the complexity and can also slow down performance unless you perform some sophisticated tuning.
Using multiple threads to exploit a machine with multiple processors sounds like a good idea in theory, but in practice you need to be careful. To gain any significant performance benefits, you might need to get to grips with thread balancing.
-- Please provide examples. --
For example, imagine an application that receives incoming price information from
the network, aggregates and sorts that information, and then displays the results
on the screen for the end user.
With a dual-core machine, it makes sense to split the task into, say, three threads. The first thread deals with storing the incoming price information, the second thread processes the prices, and the final thread handles the display of the results.
After implementing this solution, suppose you find that the price processing is by far the longest stage, so you decide to rewrite that thread’s code to improve its performance by a factor of three. Unfortunately, this performance benefit in a single thread may not be reflected across your whole application. This is because the other two threads may not be able to keep pace with the improved thread. If the user interface thread is unable to keep up with the faster flow of processed information, the other threads now have to wait around for the new bottleneck in the system.
And yes, this example comes directly from my own experience :-)

DONT use global variables
DONT use many locks (at best none at all - though practically impossible)
DONT try to be a hero, implementing sophisticated difficult MT protocols
DO use simple paradigms. I.e share the processing of an array to n slices of the same size - where n should be equal to the number of processors
DO test your code on different machines (using one, two, many processors)
DO use atomic operations (such as InterlockedIncrement() and the like)

YAGNI
The most important thing to remember is: do you really need multithreading?

I agree with pretty much all the answers so far.
A good coding strategy is to minimise or eliminate the amount of data that is shared between threads as much as humanly possible. You can do this by:
Using thread-static variables (although don't go overboard on this, it will eat more memory per thread, depending on your O/S).
Packaging up all state used by each thread into a class, then guaranteeing that each thread gets exactly one state class instance to itself. Think of this as "roll your own thread-static", but with more control over the process.
Marshalling data by value between threads instead of sharing the same data. Either make your data transfer classes immutable, or guarantee that all cross-thread calls are synchronous, or both.
Try not to have multiple threads competing for the exact same I/O "resource", whether it's a disk file, a database table, a web service call, or whatever. This will cause contention as multiple threads fight over the same resource.
Here's an extremely contrived OTT example. In a real app you would cap the number of threads to reduce scheduling overhead:
All UI - one thread.
Background calcs - one thread.
Logging errors to a disk file - one thread.
Calling a web service - one thread per unique physical host.
Querying the database - one thread per independent group of tables that need updating.
Rather than guessing how to do divvy up the tasks, profile your app and isolate those bits that are (a) very slow, and (b) could be done asynchronously. Those are good candidates for a separate thread.
And here's what you should avoid:
Calcs, database hits, service calls, etc - all in one thread, but spun up multiple times "to improve performance".

Don't start new threads unless you really need to. Starting threads is not cheap and for short running tasks starting the thread may actually take more time than executing the task itself. If you're on .NET take a look at the built in thread pool, which is useful in a lot of (but not all) cases. By reusing the threads the cost of starting threads is reduced.
EDIT: A few notes on creating threads vs. using thread pool (.NET specific)
Generally try to use the thread pool. Exceptions:
Long running CPU bound tasks and blocking tasks are not ideal run on the thread pool cause they will force the pool to create additional threads.
All thread pool threads are background threads, so if you need your thread to be foreground, you have to start it yourself.
If you need a thread with different priority.
If your thread needs more (or less) than the standard 1 MB stack space.
If you need to be able to control the life time of the thread.
If you need different behavior for creating threads than that offered by the thread pool (e.g. the pool will throttle creating of new threads, which may or may not be what you want).
There are probably more exceptions and I am not claiming that this is the definitive answer. It is just what I could think of atm.

I am applying my new found knowledge of threading everywhere
[Emphasis added]
DO remember that a little knowledge is dangerous. Knowing the threading API of your platform is the easy bit. Knowing why and when you need to use synchronisation is the hard part. Reading up on "deadlocks", "race-conditions", "priority inversion" will start you in understanding why.
The details of when to use synchronisation are both simple (shared data needs synchronisation) and complex (atomic data types used in the right way don't need synchronisation, which data is really shared): a lifetime of learning and very solution specific.

An important thing to take care of (with multiple cores and CPUs) is cache coherency.

I am surprised that no one has pointed out Herb Sutter's Effective Concurrency columns yet. In my opinion, this is a must read if you want to go anywhere near threads.

a) Always make only 1 thread responsible for a resource's lifetime. That way thread A won't delete a resource thread B needs - if B has ownership of the resource
b) Expect the unexpected

DO think about how you will test your code and set aside plenty of time for this. Unit tests become more complicated. You may not be able to manually test your code - at least not reliably.
DO think about thread lifetime and how threads will exit. Don't kill threads. Provide a mechanism so that they exit gracefully.
DO add some kind of debug logging to your code - so that you can see that your threads are behaving correctly both in development and in production when things break down.
DO use a good library for handling threading rather than rolling your own solution (if you can). E.g. java.util.concurrency
DON'T assume a shared resource is thread safe.
DON'T DO IT. E.g. use an application container that can take care of threading issues for you. Use messaging.

In .Net one thing that surprised me when I started trying to get into multi-threading is that you cannot straightforwardly update the UI controls from any thread other than the thread that the UI controls were created on.
There is a way around this, which is to use the Control.Invoke method to update the control on the other thread, but it is not 100% obvious the first time around!

Don't be fooled into thinking you understand the difficulties of concurrency until you've split your head into a real project.
All the examples of deadlocks, livelocks, synchronization, etc, seem simple, and they are. But they will mislead you, because the "difficulty" in implementing concurrency that everyone is talking about is when it is used in a real project, where you don't control everything.

While your initial differences in sums of numbers are, as several respondents have pointed out, likely to be the result of lack of synchronisation, if you get deeper into the topic, be aware that, in general, you will not be able to reproduce exactly the numeric results you get on a serial program with those from a parallel version of the same program. Floating-point arithmetic is not strictly commutative, associative, or distributive; heck, it's not even closed.
And I'd beg to differ with what, I think, is the majority opinion here. If you are writing multi-threaded programs for a desktop with one or more multi-core CPUs, then you are working on a shared-memory computer and should tackle shared-memory programming. Java has all the features to do this.
Without knowing a lot more about the type of problem you are tackling, I'd hesitate to write that 'you should do this' or 'you should not do that'.

Why might threads be considered "evil"?

I was reading the SQLite FAQ, and came upon this passage:
Threads are evil. Avoid them.
I don't quite understand the statement "Thread are evil". If that is true, then what is the alternative?
My superficial understanding of threads is:
Threads make concurrence happen. Otherwise, the CPU horsepower will be wasted, waiting for (e.g.) slow I/O.
But the bad thing is that you must synchronize your logic to avoid contention and you have to protect shared resources.
Note: As I am not familiar with threads on Windows, I hope the discussion will be limited to Linux/Unix threads.

When people say that "threads are evil", the usually do so in the context of saying "processes are good". Threads implicitly share all application state and handles (and thread locals are opt-in). This means that there are plenty of opportunities to forget to synchronize (or not even understand that you need to synchronize!) while accessing that shared data.
Processes have separate memory space, and any communication between them is explicit. Furthermore, primitives used for interprocess communication are often such that you don't need to synchronize at all (e.g. pipes). And you can still share state directly if you need to, using shared memory, but that is also explicit in every given instance. So there are fewer opportunities to make mistakes, and the intent of the code is more explicit.

Simple answer the way I understand it...
Most threading models use "shared state concurrency," which means that two execution processes can share the same memory at the same time. If one thread doesn't know what the other is doing, it can modify the data in a way that the other thread doesn't expect. This causes bugs.
Threads are "evil" because you need to wrap your mind around n threads all working on the same memory at the same time, and all of the fun things that go with it (deadlocks, racing conditions, etc).
You might read up about the Clojure (immutable data structures) and Erlang (message passsing) concurrency models for alternative ideas on how to achieve similar ends.

What makes threads "evil" is that once you introduce more than one stream of execution into your program, you can no longer count on your program to behave in a deterministic manner.
That is to say: Given the same set of inputs, a single-threaded program will (in most cases) always do the same thing.
A multi-threaded program, given the same set of inputs, may well do something different every time it is run, unless it is very carefully controlled. That is because the order in which the different threads run different bits of code is determined by the OS's thread scheduler combined with a system timer, and this introduces a good deal of "randomness" into what the program does when it runs.
The upshot is: debugging a multi-threaded program can be much harder than debugging a single-threaded program, because if you don't know what you are doing it can be very easy to end up with a race condition or deadlock bug that only appears (seemingly) at random once or twice a month. The program will look fine to your QA department (since they don't have a month to run it) but once it's out in the field, you'll be hearing from customers that the program crashed, and nobody can reproduce the crash.... bleah.
To sum up, threads aren't really "evil", but they are strong juju and should not be used unless (a) you really need them and (b) you know what you are getting yourself into. If you do use them, use them as sparingly as possible, and try to make their behavior as stupid-simple as you possibly can. Especially with multithreading, if anything can go wrong, it (sooner or later) will.

I would interpret it another way. It's not that threads are evil, it's that side-effects are evil in a multithreaded context (which is a lot less catchy to say).
A side effect in this context is something that affects state shared by more than one thread, be it global or just shared. I recently wrote a review of Spring Batch and one of the code snippets used is:
private static Map<Long, JobExecution> executionsById = TransactionAwareProxyFactory.createTransactionalMap();
private static long currentId = 0;
public void saveJobExecution(JobExecution jobExecution) {
Assert.isTrue(jobExecution.getId() == null);
Long newId = currentId++;
jobExecution.setId(newId);
jobExecution.incrementVersion();
executionsById.put(newId, copy(jobExecution));
}
Now there are at least three serious threading issues in less than 10 lines of code here. An example of a side effect in this context would be updating the currentId static variable.
Functional programming (Haskell, Scheme, Ocaml, Lisp, others) tend to espouse "pure" functions. A pure function is one with no side effects. Many imperative languages (eg Java, C#) also encourage the use of immutable objects (an immutable object is one whose state cannot change once created).
The reason for (or at least the effect of) both of these things is largely the same: they make multithreaded code much easier. A pure function by definition is threadsafe. An immutable object by definition is threadsafe.
The advantage processes have is that there is less shared state (generally). In traditional UNIX C programming, doing a fork() to create a new process would result in shared process state and this was used as a means of IPC (inter-process communication) but generally that state is replaced (with exec()) with something else.
But threads are much cheaper to create and destroy and they take less system resources (in fact, the operating itself may have no concept of threads yet you can still create multithreaded programs). These are called green threads.

The paper you linked to seems to explain itself very well. Did you read it?
Keep in mind that a thread can refer to the programming-language construct (as in most procedural or OOP languages, you create a thread manually, and tell it to executed a function), or they can refer to the hardware construct (Each CPU core executes one thread at a time).
The hardware-level thread is obviously unavoidable, it's just how the CPU works. But the CPU doesn't care how the concurrency is expressed in your source code. It doesn't have to be by a "beginthread" function call, for example. The OS and the CPU just have to be told which instruction threads should be executed.
His point is that if we used better languages than C or Java with a programming model designed for concurrency, we could get concurrency basically for free. If we'd used a message-passing language, or a functional one with no side-effects, the compiler would be able to parallelize our code for us. And it would work.

Threads aren't any more "evil" than hammers or screwdrivers or any other tools; they just require skill to utilize. The solution isn't to avoid them; it's to educate yourself and up your skill set.

Creating a lot of threads without constraint is indeed evil.. using a pooling mechanisme (threadpool) will mitigate this problem.
Another way threads are 'evil' is that most framework code is not designed to deal with multiple threads, so you have to manage your own locking mechanisme for those datastructures.
Threads are good, but you have to think about how and when you use them and remember to measure if there really is a performance benefit.

A thread is a bit like a light weight process. Think of it as an independent path of execution within an application. The thread runs in the same memory space as the application and therefore has access to all the same resources, global objects and global variables.
The good thing about them: you can parallelise a program to improve performance. Some examples, 1) In an image editing program a thread may run the filter processing independently of the GUI. 2) Some algorithms lend themselves to multiple threads.
Whats bad about them? if a program is poorly designed they can lead to deadlock issues where both threads are waiting on each other to access the same resource. And secondly, program design can me more complex because of this. Also, some class libraries don't support threading. e.g. the c library function "strtok" is not "thread safe". In other words, if two threads were to use it at the same time they would clobber each others results. Fortunately, there are often thread safe alternatives... e.g. boost library.
Threads are not evil, they can be very useful indeed.
Under Linux/Unix, threading hasn't been well supported in the past although I believe Linux now has Posix thread support and other unices support threading now via libraries or natively. i.e. pthreads.
The most common alternative to threading under Linux/Unix platforms is fork. Fork is simply a copy of a program including it's open file handles and global variables. fork() returns 0 to the child process and the process id to the parent. It's an older way of doing things under Linux/Unix but still well used. Threads use less memory than fork and are quicker to start up. Also, inter process communications is more work than simple threads.

In a simple sense you can think of a thread as another instruction pointer in the current process. In other words it points the IP of another processor to some code in the same executable. So instead of having one instruction pointer moving through the code there are two or more IP's executing instructions from the same executable and address space simultaneously.
Remember the executable has it's own address space with data / stack etc... So now that two or more instructions are being executed simultaneously you can imagine what happens when more than one of the instructions wants to read/write to the same memory address at the same time.
The catch is that threads are operating within the process address space and are not afforded protection mechanisms from the processor that full blown processes are. (Forking a process on UNIX is standard practice and simply creates another process.)
Out of control threads can consume CPU cycles, chew up RAM, cause execeptions etc.. etc.. and the only way to stop them is to tell the OS process scheduler to forcibly terminate the thread by nullifying it's instruction pointer (i.e. stop executing). If you forcibly tell a CPU to stop executing a sequence of instructions what happens to the resources that have been allocated or are being operated on by those instructions? Are they left in a stable state? Are they properly freed? etc...
So, yes, threads require more thought and responsibility than executing a process because of the shared resources.

For any application that requires stable and secure execution for long periods of time without failure or maintenance, threads are always a tempting mistake. They invariably turn out to be more trouble than they are worth. They produce rapid results and prototypes that seem to be performing correctly but after a couple weeks or months running you discover that they have critical flaws.
As mentioned by another poster, once you use even a single thread in your program you have now opened a non-deterministic path of code execution that can produce an almost infinite number of conflicts in timing, memory sharing and race conditions. Most expressions of confidence in solving these problems are expressed by people who have learned the principles of multithreaded programming but have yet to experience the difficulties in solving them.
Threads are evil. Good programmers avoid them wherever humanly possible. The alternative of forking was offered here and it is often a good strategy for many applications. The notion of breaking your code down into separate execution processes which run with some form of loose coupling often turns out to be an excellent strategy on platforms that support it. Threads running together in a single program is not a solution. It is usually the creation of a fatal architectural flaw in your design that can only be truly remedied by rewriting the entire program.
The recent drift towards event oriented concurrency is an excellent development innovation. These kinds of programs usually prove to have great endurance after they are deployed.
I've never met a young engineer who didn't think threads were great. I've never met an older engineer who didn't shun them like the plague.

Being an older engineer, I heartily agree with the answer by Texas Arcane.
Threads are very evil because they cause bugs that are extremely difficult to solve. I have literally spent months solving sporadic race-conditions. One example caused trams to suddenly stop about once a month in the middle of the road and block traffic until towed away. Luckily I didn't create the bug, but I did get to spend 4 months full-time to solve it...
It's a tad late to add to this thread, but I would like to mention a very interesting alternative to threads: asynchronous programming with co-routines and event loops. This is being supported by more and more languages, and does not have the problem of race conditions like multi-threading has.
It can replace multi-threading in cases where it is used to wait on events from multiple sources, but not where calculations need to be performed in parallel on multiple CPU cores.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string