Singlethread, Multithread, Synchronous, Asynchronous - How do these combine?

Every illustration and explanation of this topic that I have come across seems inconsistent with the others.
I have illustrated my understanding of how these concepts combine. Can anyone confirm whether it is correct or erroneous?
The execution timeline in the illustration goes from left to right.

This is an extended comment, not an answer.
Your pictures show threads executing tasks. IMO, that is unnecessary.
I think that part of what you are struggling with is the fact that threads belong to a lower layer of abstraction than anything that is called async in any programming system (i.e., in any language or library or framework). If I may offer a weak analogy: if an async system is like a parcel delivery service, then tasks are like the packages that need to be delivered, and threads are like the trucks that carry them. If you want to understand how a truck works, you don't need to know specifics of the cargo that it carries. But if you want to know how FedEx works, then talking about the packages—where they are supposed to go, when they are supposed to arrive—is the very heart of the matter.
Note: I am not saying that every async feature in every programming system that has one is built on a lower layer of threads, but I would not be surprised if that was true in at least a few cases.
Anyway, my point is, if you are trying to draw a picture that illustrates the scheduling of N threads on M processors (where N > M), then there is no need to mention "task" anywhere. It only complicates the picture.
P.S.: You said "asynchronous," but you did not mention any specific programming language or library or framework. IMO, "asynchronous" is a vague idea—not nearly as well defined as "thread." If you want to know more about some specific async feature of some language or library or framework, then you should mention it by name. They don't necessarily all work in the same way.

Related

What is the best way to understand and analyze a multithreading code?

I'm not looking for programming techniques. My question is rather about the best way to understand code developed by a third party.
I have the code for an application in a specific language (it could be C/C++, Java, etc.). This code uses several threads to control different processes. The application generates a log that shows all calls to the relevant functions for each thread.
I have to analyze this code to understand how it operates so that I can improve its algorithm. I have worked little with threads, so I do not know the most convenient way to start the analysis and follow the execution of each thread.
Could you give me any recommendation?
If you are able to contact any of the code's original developers, having a conversation with them (by voice or by email) and asking them to describe how they intended things to work is always preferable to only trying to reverse-engineer their intent by looking at the code. If you can't contact the developers directly, then perhaps there is a library-specific developer's forum or other on-line resource where you can discuss the library's structure with people who have experience using/debugging it.
If that's not an option (or if you've done that and still don't feel like you understand things well enough), then I often find that profiling (either via a profiling tool, or just by temporarily putting printf() [or similar] tracing-calls into the codebase at various places and seeing what gets printed when) is a good way to find out which parts of the code are actually being used at which stages of the program's execution. That will help you confirm (or disprove) your theories about how the codebase works. Knowing where and when each thread is spawned, where its entry-function is, and where/when it gets joined again by its parent thread are particularly useful.
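To make that concrete, here is a minimal sketch of the printf-style tracing idea in Java (the Trace class and the method name in the usage comment are invented for illustration):

    // Minimal tracing sketch: tag every trace line with the current thread's
    // name and a timestamp, so the log reveals which thread executes which
    // function, and in what order the calls interleave.
    final class Trace {
        static void log(String where) {
            System.out.printf("[%d] %s: %s%n",
                    System.nanoTime(), Thread.currentThread().getName(), where);
        }
    }

    // Usage: drop calls like this at the entry of functions you care about.
    //     void handleRequest() {
    //         Trace.log("Worker.handleRequest enter");
    //         ...
    //     }

Grepping and sorting the resulting log by timestamp gives you a cheap per-thread execution trace without a profiler.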
Finally, start looking at the various pieces of data (e.g. objects and member variables) each thread examines and/or modifies, and how access to each of those pieces of data is synchronized/serialized. Assuming the code isn't buggy, the critical sections of the codebase are good indicators of where inter-thread communication is happening.

Functional approaches to designing the discrete side of hybrid systems

I'm working on developing controllers for hybrid systems in Haskell.
FRP libraries (right now I'm using netwire, but there are several good ones and a lot of interesting research on future ones) provide a great solution for the continuous-time side of the problem. Augmenting them with signal names, dimensions, preferred units, and so forth gets you a system that has modularity, is self-describing, and has a straightforward path to confidence in correctness.
I'm looking for information, folklore, or papers that provide similar properties for the discrete-time side. In some sense the problem is much easier: state machines are well studied and simple. In other senses it's more difficult, and I'll briefly explain how.
Correctness is obviously the most important thing, and thankfully it's also straightforward.
Self-description is more of a problem. You'd like the controller not just to be in the correct state, but to be capable of telling you what state it's in, how it got there, and where it might go next. So you can tack names onto everything, and it works, but it conflicts somewhat with modularity. You'd also like to be able to build complex discrete-time behaviors from simpler ones. But when you ask the system what state it's in, the high-level answer is generally more interesting than (or at least as interesting as) the low-level answer. How do you get this cleanly? I've tried a few naive approaches and have wrapped myself in spaghetti a few different ways, but it seems like there must be an elegant solution.
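To make the tension concrete, a naive version of the "tack names on everything" approach might look like the following (sketched in Java for neutrality; every name in it is invented). Each machine reports its own state plus the path of whatever sub-machine is active:

    import java.util.List;

    // Naive sketch of self-describing, composable state machines: asking the
    // top-level machine yields a path from high-level to low-level state.
    interface Describable {
        List<String> statePath(); // e.g. ["Mission", "GotoWaypoint", "Turning"]
    }

    final class GotoWaypoint implements Describable {
        private String state = "Turning"; // low-level state of this sub-machine
        public List<String> statePath() { return List.of("GotoWaypoint", state); }
    }

    final class Mission implements Describable {
        private final Describable current = new GotoWaypoint(); // active sub-behavior
        public List<String> statePath() {
            List<String> path = new java.util.ArrayList<>();
            path.add("Mission");
            path.addAll(current.statePath());
            return path; // ["Mission", "GotoWaypoint", "Turning"]
        }
    }

This gives the high-level answer first with the low-level detail appended; the spaghetti tends to appear once transitions have to flow both up and down this hierarchy.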
Another problem I've had with self-description is that I'd like to have a list of self-describing conditions (generally comparisons: has it been 10 seconds? am I within 3 feet of the next waypoint? has the battery power fallen below 15%? etc.) that are being monitored and might trigger the next state transition. There are tricky questions about what the desirable semantics even are here, since it seems like some of these events are better handled "from the bottom up" (e.g. expected termination conditions of whatever low-level step you are performing) and some "from the top down" (e.g. equipment failure detection, geofencing, ...). This can lead to spaghetti of its own even if you relax the goal of self-description.
In addition to diagnostics, accurate self-description information here could also be very useful for abstract interpretation, projecting the state of the system into the future by guessing which events are likely to occur when. Many of the event conditions lend themselves to fairly simple guesses (e.g. using velocity made good, fuel consumption rate, timers). Others are more complicated but might still be worth the effort to develop projections for in some applications (e.g. expected orders from operators, weather forecasts, projected tracks for moving objects of interest). It would be nice to find a design that annotates conditions not only with names, but also with functions for this sort of thing.
Does anyone have experience with this that they are willing to share?
Okay, so I would say the "real" answer to your question is that some of the things you are asking for are open areas of research. In particular, I think some of the self-describing features you desire may necessitate some degree of "spaghetti" simply because the problem you are trying to solve is inherently complicated.
That being said, your focus on modularity is exactly the right approach. I would say, take a look at KeYmaera, as I believe it has the features you are looking for despite being in Java. I would also recommend looking at the publications page on the KeYmaera website, as this should provide you with valuable insight into the problem in general.
If you do not like KeYmaera's approach, you can also look into Timed Automata, another modeling direction that should be sufficient for your problem description.

For reliable code, NModel, Spec Explorer, F# or other?

I've got a business app in C#, with unit tests. Can I increase the reliability and cut down on my testing time and expense by using NModel or Spec Explorer? Alternately, if I were to rewrite it in F# (or even Haskell), what kinds (if any) of reliability increase might I see?
Code Contracts? ASML?
I realize this is subjective, and possibly argumentative, so please back up your answers with data, if possible. :) Or maybe a worked example, such as Eric Evans' Cargo Shipping System?
If we consider unit tests to be specific and strong theorems, checked quasi-statically on particular “interesting instances”; types to be general but weak theorems (usually checked statically); and contracts to be general and strong theorems, checked dynamically for particular instances that occur during regular program operation (from B. Pierce's Types Considered Harmful), where do these other tools fit?
We could pose the analogous question for Java, using Java PathFinder, Scala, etc.
Reliability is a function of several variables, including the general architecture of the software, the capability of the programmers, the quality of the requirements and the maturity of your configuration management and general QA processes. All these will affect the reliability of a rewrite.
Having said that, language certainly has a significant impact. All other things being equal:
Defects are roughly proportional to SLOC count. Languages that are terser see fewer coding errors. Haskell seems to require about 10% of the SLOC required by C++, Erlang about 14%, Java around 50%. I guess C# probably fits alongside Java on this scale.
Type systems are not born equal. Languages with type inference (e.g. Haskell, and to a lesser extent OCaml) will have fewer defects. Haskell in particular will allow you to encode invariants in the type system, so that a program will only compile if they can be proven true. Doing so requires extra work, so consider the trade-off on a case-by-case basis.
Managing state is a source of many defects. Functional languages, and especially pure functional languages, avoid this problem.
QuickCheck and its relatives allow you to write unit and system tests that verify general properties rather than individual test cases. This can greatly reduce the work required to test the code, especially if you are aiming for high test coverage metrics. A set of QuickCheck properties resembles a formal specification, and this concept fits nicely with Test Driven Development (write your tests first, and when the code passes them you are done).
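For comparison on the JVM, the same property style is available there too; here is a minimal sketch using the jqwik library (the reverse-twice property is just an illustration, not from the original question):

    import net.jqwik.api.ForAll;
    import net.jqwik.api.Property;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    class ReverseProperties {
        // One property replaces many hand-written example-based tests:
        // jqwik generates hundreds of inputs and shrinks any failing case.
        @Property
        boolean reversingTwiceRestoresTheList(@ForAll List<Integer> xs) {
            List<Integer> copy = new ArrayList<>(xs);
            Collections.reverse(copy);
            Collections.reverse(copy);
            return copy.equals(xs);
        }
    }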
Put all of these things together and you should have a powerful toolkit for driving quality through the development lifecycle. Unfortunately I'm not aware of any robust studies that actually prove this. All the factors I listed at the start would confound any real study, and you would need a lot of data before an unambiguous pattern showed itself.
Some comments on the quote, in the context of C# which is my "first" language:
Unit tests to be specific and strong theorems,
Yes, but they might not give you first-order logic checks, like "for all x there exists a y where f(y)"; more like "there exists a y, here it is (!), f(y)", aka setup, act, assert. ;)*
checked quasi-statically on particular “interesting instances” and Types to be general but weak theorems (usually checked statically),
Types are not necessarily that weak**.
and contracts to be general and strong theorems, checked dynamically for particular instances that occur during regular program operation. (from B. Pierce's Types Considered Harmful)
Unit Testing
Pex + Moles, I think, is getting closer to the first-order-logic type of checking, as it generates the edge cases and uses the Z3 solver for integer constraint solving. I would really like to see more Moles tutorials (Moles is for replacing implementations), specifically together with some sort of inversion-of-control container that can leverage whatever stub and real implementations of abstract classes and interfaces already exist.
Weak Types
In C# they are fairly weak, sure: generic types allow you to add protocol semantics for one operation -- i.e. constraining types to implement interfaces, which are in some sense protocols that implementing classes agree to. However, the static typing of the protocol covers just one operation.
Example: Reactive Extensions API
Let's take Reactive Extensions as a discussion topic.
The contract implemented by the consumer (the observer) and invoked by the observable:
    interface IObserver<in T>
    {
        void OnNext(T value);
        void OnCompleted();
        void OnError(System.Exception error);
    }
There is more to the protocol than this interface shows: methods called on an IObserver<in T> instance must follow this protocol:
Ordering:
OnNext{0,n} (OnCompleted | OnError){0, 1}
Furthermore, on another axis, the time dimension:
Time:
for all invocation timestamps t: t(OnNext) < t(OnCompleted) and t(OnNext) < t(OnError)
i.e. no invocation of OnNext may occur after an invocation of OnCompleted or OnError.
Furthermore, the axis of parallelism:
Parallelism:
no invocation to OnNext may be done in parallel
i.e. there's a scheduling constraint that needs to be followed by implementers of IObservable: no IObservable may push from multiple threads at the same time without first synchronizing the invocations around a context.
How do you test that this contract holds, in an easy way? With C#, I don't know.
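One low-tech option is a runtime contract checker: wrap the observer in a decorator that asserts the grammar and the no-parallel-OnNext rule. Here is a sketch against Java's analogous java.util.concurrent.Flow.Subscriber API (an assumption on my part that the same protocol shape carries over; this is not Rx itself):

    import java.util.concurrent.Flow;
    import java.util.concurrent.atomic.AtomicBoolean;

    // Sketch: a decorator that checks the protocol at runtime, i.e. the
    // OnNext* (OnCompleted | OnError)? grammar plus "no parallel signals".
    final class ContractCheckingSubscriber<T> implements Flow.Subscriber<T> {
        private final Flow.Subscriber<T> inner;
        private final AtomicBoolean inSignal = new AtomicBoolean();
        private volatile boolean terminated = false;

        ContractCheckingSubscriber(Flow.Subscriber<T> inner) { this.inner = inner; }

        private void enter() {
            // compareAndSet fails if another signal is currently executing
            if (!inSignal.compareAndSet(false, true))
                throw new AssertionError("signals invoked in parallel");
            if (terminated)
                throw new AssertionError("signal after onComplete/onError");
        }

        public void onSubscribe(Flow.Subscription s) { inner.onSubscribe(s); }

        public void onNext(T item) {
            enter();
            try { inner.onNext(item); } finally { inSignal.set(false); }
        }

        public void onError(Throwable t) {
            enter();
            terminated = true;
            try { inner.onError(t); } finally { inSignal.set(false); }
        }

        public void onComplete() {
            enter();
            terminated = true;
            try { inner.onComplete(); } finally { inSignal.set(false); }
        }
    }

Run your test suite with every subscriber wrapped like this, and protocol violations surface as assertion errors instead of silent misbehavior.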
Consumer of API
From the consuming side of the application, there might be interactions between different contexts, such as Dispatcher, Background/other threads, and preferably we'd like to give guarantees that we don't end up in a deadlock.
Further, there is the requirement to handle deterministic disposal of the observables. It might not always be clear when an extension method's returned IObservable instance takes care of the method's argument IObservable instances and disposes of them, so there's a requirement to know about the inner workings of the black box (alternatively, you can let the references go in a "reasonable way" and the GC will take them at some point).
<<< Without Reactive Extensions, it's not necessarily easier:
There is the task pool, on top of which the TPL is implemented. In the task pool we have a work-stealing queue of delegates to invoke on the worker threads.
Using the APM/begin/end or the async pattern (which queues to the task pool) can leave us open to callback-ordering bugs if we are mutating state. Also, the protocol of begin-invocations and their callbacks might be too convoluted, and hence impossible to follow. I read a post-mortem the other day about a Silverlight project having problems seeing the business-logic forest for all the callback trees. Then there's the possibility of implementing the poor man's async monad: an IEnumerable with an async 'manager' iterating through it and calling MoveNext() every time a yielded IAsyncResult completes.
...and don't get me started on the nuuuumerous hidden protocols in IAsyncResult.
Another problem, without using Reactive Extensions, is the turtles problem: once you decide that you want an IO-blocking operation to be async, there need to be turtles all the way down to the p/invoke call that places the associated Win32 thread on an IO completion port! If you have three layers, plus some logic inside your topmost layer, you need to make all three layers implement the APM pattern and fulfil the numerous contract obligations of IAsyncResult (or leave it partially broken) -- and there's no default public AsyncResult implementation in the base class library.
>>>
Working with exceptions from the interface
Even with the above memory-management + parallelism + contract + protocol items covered, there are still exceptions to be handled (not just received and forgotten about) in a good, reliable application. I want to give an example.
Context
Let's say that we find ourselves catching an exception from the contract/interface (not necessarily from Reactive Extensions' IObservable implementations here, which have monadic rather than stack-frame-based exception handling).
Hopefully the programmer was diligent and documented the possible exceptions, but there might be exception possibilities all the way down. If everything is correctly defined with code contracts, at least we can be sure we are capable of catching a few of the exceptions; but many different causes may be lumped together inside one exception type, and once an exception is thrown, how do we ensure that the smallest possible unit of work is rectified?
Aim
Say that we are pushing data records from a message-bus consumer in our application, and receiving them on a background thread which decides what to do with them.
Example
A real-life example here could be Spotify, which I'm using every day.
My $100 router/access point throws in the towel at random times. I guess it has a cache-bug or some sort of stack overflow bug, as it happens every time I push more than 2 MB/s LAN/WAN data through it.
I have two NICs up: the wifi and the ethernet card. Ethernet's connection goes down. The sockets of Spotify's event-handler loop return an invalid code (I think it's C or C++) or throw exceptions. Spotify has to handle it, but it doesn't know what my network topology looks like (and there is no code to try all routes/update the routing table, and hence the interface to be used); I still have a route to the internet, just not on the same interface. Spotify crashes.
A thesis
Exceptions are simply not semantic enough. I believe one can look at exceptions from the perspective of the Error monad in Haskell. We either continue or break: unwinding the stack, executing the catches, executing the finallys, and praying we don't end up with race conditions on either other exception handlers or the GC, or with async exceptions for outstanding IO-completion ports.
But when the connection/route on one of my interfaces goes down, Spotify freezes.
Now we have SEH (Structured Exception Handling), but I think we will have SEH2 in the future, where each source of exceptions gives, along with the actual exception, a discriminated union (i.e. statically typed to the linked library/assembly) of possible compensating actions. In this example, I could imagine Windows' network API telling the application to execute a compensating action to open the same socket on another interface, or to handle it on its own (like now), or to retry the socket with some kernel-managed retry policy. Each of these options is part of a discriminated union type, so the implementer must use one of them.
I think that, when we have SEH2, it won't be called exceptions anymore.
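To illustrate what such a statically typed set of compensating actions might look like, here is a purely hypothetical sketch using Java 21 sealed types (every type and action name is invented; nothing like this exists in Windows' API today):

    // Hypothetical sketch of the "SEH2" idea: the failure value carries a
    // closed set of compensating actions, and the exhaustive switch forces
    // the caller to pick one at compile time.
    sealed interface SocketFailure permits NoRouteOnInterface, TransientFault {}
    record NoRouteOnInterface(String downedInterface, java.util.List<String> alternatives)
            implements SocketFailure {}
    record TransientFault(java.time.Duration suggestedBackoff) implements SocketFailure {}

    class Compensation {
        static String decide(SocketFailure f) {
            return switch (f) { // compiler rejects a missing case
                case NoRouteOnInterface n -> "reopen socket on " + n.alternatives().get(0);
                case TransientFault t -> "retry after " + t.suggestedBackoff();
            };
        }
    }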
Anyway, I have digressed too much already.
Instead of reading my thoughts, listen to some of Erik Meijer's -- there is a very good round-table discussion between him and Joe Duffy in which they discuss handling the side effects of calls. Or have a look at this search listing.
I find myself in a position today, as a consultant, of maintaining a system where stronger static semantics could be good, and I'm looking for tools that can give me both speed of programming and correctness verification at a level that is accurate and precise. I haven't found them yet.
I simply think we are another 20 years, if not more, away from developer-oriented reliable computing. There are just too many languages, frameworks, marketing BS and concepts in the air right now for the ordinary developer to stay on top of things.
Why is this under the heading of "weak types"?
Because I find that the type system will be part of the solution; types need not be weak! Terse code and strong type systems (think Haskell) help programmers build reliable software.

Are there any practical alternatives to threads?

While reading up on SQLite, I stumbled upon this quote in the FAQ: "Threads are evil. Avoid them."
I have a lot of respect for SQLite, so I couldn't just disregard this. I got to thinking about what else I could use instead, per the "avoid them" policy, in order to parallelize my tasks. As an example, the application I'm currently working on requires a user interface that is always responsive, and needs to poll several websites from time to time (a process which takes at least 30 seconds per website).
So I opened up the PDF linked from that FAQ, and essentially it seems that the paper suggests several techniques to be applied together with threads, such as barriers or transactional memory - rather than any techniques to replace threads altogether.
Given that these techniques do not fully dispense with threads (unless I misunderstood what the paper is saying), I can see two options: either the SQLite FAQ does not literally mean what it says, or there exist practical approaches that actually avoid the use of threads altogether. Are there any?
Just a quick note on tasklets/cooperative scheduling as an alternative - this looks great in small examples, but I wonder whether a large-ish UI-heavy application can be practically parallelized in a solely cooperative way. If you have done this successfully or know of such examples this certainly qualifies as a valid answer!
Note: This answer no longer accurately reflects what I think about this subject. I don't like its overly dramatic, somewhat nasty tone. Also, I am not so certain that the quest for provably correct software has been so useless as I seemed to think back then. I am leaving this answer up because it is accepted, and up-voted, and to edit it into something I currently believe would pretty much vandalize it.
I finally got around to reading the paper. Where do I start?
The author is singing an old song, which goes something like this: "If you can't prove the program is correct, we're all doomed!" It sounds best when screamed loudly accompanied by overmodulated electric guitars and a rapid drum beat. Academics started singing that song when computer science was in the domain of mathematics, a world where if you don't have a proof, you don't have anything. Even after the first computer science department was cleaved from the mathematics department, they kept singing that song. They are singing that song today, and nobody is listening. Why? Because the rest of us are busy creating useful things, good things, out of software that can't be proved correct.
The presence of threads makes it even more difficult to prove a program correct, but who cares? Even without threads, only the most trivial of programs can be proved correct. Why do I care if my non-trivial program, which could not be proved correct, is even more unprovable after I use threading? I don't.
If you weren't sure the author was living in an academic dreamworld, you can be sure of it after he maintains that the coordination language he suggests as an alternative to threads could best be expressed with a "visual syntax" (drawing graphs on the screen). I've never heard that suggestion before, except every year of my career. A language that can only be manipulated by GUI and does not play with any of the programmer's usual tools is not an improvement. The author goes on to cite UML as a shining example of a visual syntax which is "routinely combined with C++ and Java." Routinely in what world?
In the meantime, I and many other programmers go on using threads without all that much trouble. How to use threads well and safely is pretty much a solved problem, as long as you don't get all hung up on provability.
Look. Threading is a big kid's toy, and you do need to know some theory and usage patterns to use them well. Just as with databases, distributed processing, or any of the other beyond-grade-school devices that programmers successfully use every day. But just because you can't prove it correct doesn't mean it's wrong.
The statement in the SQLite FAQ, as I read it, is just a comment on how difficult threading can be to the uninitiated. It is the author's opinion, and it might be a valid one. But saying you should never use threads is throwing the baby out with the bath water, in my opinion. Threads are a tool. Like all tools, they can be used and they can be abused. I can read his paper and be convinced that threads are the devil, but I have used them successfully, without killing kittens.
Keep in mind that SQLite is written to be as lightweight and easy to understand (from a coding standpoint) as possible, so I would imagine that threading is kind of the antithesis to this lightweight approach.
Also, SQLite is not meant to be used in a highly-concurrent environment. If you have one of these, you might be better off working with a more enterprisey database like Postgres.
Evil, but a necessary evil. High level abstractions of threads (Tasks in .NET for example) are becoming more common but for the most part the industry is not trying to find a way to avoid threads, just making it easier to deal with the complexities that come with any kind of concurrent programming.
One trend I've noticed, at least in the Cocoa domain, is help from the framework. Apple has gone to great lengths to help developers with the relatively difficult concept of concurrent programming. Some things I've seen:
Different granularities of threading. Cocoa supports everything from POSIX threads (low level), to object-oriented threading with NSLock and NSThread, to high-level parallelism such as NSOperation. Depending on your task, using a high-level tool like NSOperation is easier and gets the job done.
Threading behind the scenes via an API. Lots of the UI and animation stuff in Cocoa is hidden behind an API. You are responsible for calling an API method and providing an asynchronous callback that is executed when the secondary thread completes (for example, at the end of some animation).
OpenMP. There are tools like OpenMP that let you provide pragmas describing to the compiler that some task may be safely parallelized, for example iterating over a set of items in an independent way.
It seems like a big push in this industry is to make things simple for application developers and leave the gory thread details to the system and framework developers. There is also a push in academia to formalize parallel patterns. As mentioned, you can't always avoid threading, but there is an increasing number of tools in your arsenal to make it as painless as possible.
If you really want to live without threads, you can, so long as you don't call any functions that can potentially block. This may not be possible.
One alternative is to implement the tasks you would have made into threads as finite state machines. Basically, the task does what it can do immediately, then goes to its next state, waiting for an event, such as input arriving on a file or a timer going off. X Windows, as well as most GUI toolkits, support this style. When something happens, they call a callback, which does what it needs to do and returns. For a FSM, the callback checks to see what state the task is in and what the event is to determine what to do immediately and what the next state will be.
Say you have an app that needs to accept socket connections, and for each connection, parse command lines, execute some code, and return the results. A task would then be whatever listens to a socket. When select() (or Gtk+, or whatever) tells you the socket has something to read, you read it into a buffer, then check whether you have enough input buffered to do something. If so, you advance to a "start doing something" state; otherwise you stay in the "reading a line" state. (What you "do" could be multiple states.) When done, your task drops the line from the buffer and goes back to the "reading a line" state. No threads or preemption needed.
This lets you act multithreaded by way of being event-driven. If your state machines are complicated, however, your code can get hard to maintain pretty fast, and you'll need to work up some kind of FSM-management library to separate the grunt work of running the FSM from the code that actually does things.
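As a concrete sketch of this event-driven style, here is a minimal single-threaded server using Java NIO's Selector (the port number and the trivial echo-a-line "action" are invented for illustration):

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.*;
    import java.util.Iterator;

    // Single-threaded, event-driven server: one FSM per connection, driven
    // by a Selector instead of by threads. The attached buffer is the
    // per-connection state of the "reading a line" machine.
    public class FsmServer {
        public static void main(String[] args) throws IOException {
            Selector selector = Selector.open();
            ServerSocketChannel server = ServerSocketChannel.open();
            server.bind(new InetSocketAddress(9000));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            while (true) {
                selector.select(); // block until some socket is ready; no threads
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel c = server.accept();
                        c.configureBlocking(false);
                        // enter the "reading a line" state for this connection
                        c.register(selector, SelectionKey.OP_READ, ByteBuffer.allocate(1024));
                    } else if (key.isReadable()) {
                        SocketChannel c = (SocketChannel) key.channel();
                        ByteBuffer buf = (ByteBuffer) key.attachment();
                        if (c.read(buf) == -1) { key.cancel(); c.close(); continue; }
                        if (hasNewline(buf)) {  // enough input: "do something" state
                            buf.flip();
                            c.write(buf);       // the "something": echo the line back
                            buf.clear();        // back to "reading a line"
                        }
                    }
                }
            }
        }

        static boolean hasNewline(ByteBuffer buf) {
            for (int i = 0; i < buf.position(); i++)
                if (buf.get(i) == '\n') return true;
            return false;
        }
    }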
P.S. Another way to get threads without really using threads is the GNU Pth library. It doesn't do preemption, but it is another option if you really don't want to deal with threads.
Another approach to this may be to use a different concurrency model rather than avoid multithreading altogether (you have to utilize all these CPU cores in parallel somehow).
Take a look at mechanisms used in Clojure (e.g. agents, software transactional memory).
Software Transactional Memory (STM) is a good alternative for concurrency control. It scales well with multiple processors and does not have most of the problems of conventional concurrency-control mechanisms. It is implemented as part of the Haskell language and is worth giving a try. I do not know how applicable it is in the context of SQLite, though.
Alternatives to threads:
coroutines
goroutines
MapReduce
worker pools
Apple's Grand Central Dispatch + lambdas
OpenCL
Erlang
(Interesting to note that half of those technologies were invented or popularised by Google.)
Another thing: many web frameworks transparently use multiple threads/processes to handle requests, usually in a way that mostly eliminates the problems associated with multithreading (for the user of the framework), or at least makes the threading rather invisible. The web being stateless, the only shared state is session state (which isn't really a problem, since by definition a single session isn't going to be doing concurrent things) and data in a database that already has its multithreading nonsense sorted out for you.
It's somewhat important to note, though, that these are all abstractions. The underlying implementations of these things still use threads. But this is still incredibly useful. In the same way you wouldn't use assembler to write a web application, you wouldn't use raw threads to write any important application. Designing an application to use threads directly is too complicated to leave to a human.
Threading is not the only model of concurrency. The actor model (Erlang, Scala) is an example of a somewhat different approach.
http://www.scala-lang.org/node/242
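The essence of the model fits in a few lines (sketched in Java; real actor systems such as Erlang's or Akka add supervision, distribution, and much more): an actor is a mailbox plus a single loop that processes messages one at a time, so the actor's private state needs no locks.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.function.Consumer;

    // Minimal actor sketch: callers only ever touch send(); the behavior
    // runs on one dedicated loop, so messages are processed sequentially.
    final class Actor<M> {
        private final BlockingQueue<M> mailbox = new LinkedBlockingQueue<>();

        Actor(Consumer<M> behavior) {
            Thread loop = new Thread(() -> {
                try {
                    while (true) behavior.accept(mailbox.take());
                } catch (InterruptedException e) { /* actor shut down */ }
            });
            loop.setDaemon(true);
            loop.start();
        }

        void send(M message) { mailbox.add(message); } // asynchronous, never blocks
    }

Usage: new Actor<String>(System.out::println).send("hello"); sending is fire-and-forget, and all mutation lives inside the behavior.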
If your task is really, really easily isolatable, you can use processes instead of threads, like Chrome does for its tabs.
Otherwise, inside a single process, there is no way to achieve real parallelism without threads: you need at least two threads of execution if you want two things to happen at the same time (assuming you have multiple processors/cores at hand, of course; otherwise real parallelism is simply not possible).
The complexity of threading a program is always relative to the degree of isolation of the tasks the threads will perform. There's no trouble in running several threads if you know for sure these will never use the same variables. Then again, multiple high-level constructs exist in modern languages to help synchronize access to shared resources.
It's really a matter of the application. If your task is simple enough to fit in some kind of high-level Task object (this depends on your development platform; your mileage may vary), then using a task queue is your best bet. My rule of thumb: if you can't find a cool name for your thread, then its task is not important enough to justify a thread (instead of a task going onto an operation queue).
Threads give you the opportunity to do some evil things, specifically sharing state among different execution paths. But they offer a lot of convenience; you don't have to do expensive communication across process boundaries. Plus, they come with less overhead. So I think they're perfectly fine, used correctly.
I think the key is to share as little data as possible among the threads; just stick to synchronization data. If you try to share more than that, you have to engage in complex code that is hard to get right the first time around.
One method of avoiding threads is multiplexing - in essence you make a lightweight mechanism similar to threads which you manage yourself.
The thing is, this is not always viable. In your case, can the 30 s of polling per website be split into sixty 0.5 s pieces, in between which you can stuff calls to the UI? If not, sorry.
Threads aren't evil; they are just easy to shoot your foot with. If doing Query A takes 30 s and doing Query B takes another 30 s, doing them simultaneously in threads may take 120 s instead of 60 due to thread overhead, fighting for disk access, and various bottlenecks.
But if Operation A consists of 5s of activity and 55 seconds of waiting, mixed randomly, and Operation B takes 60s of actual work, doing them in threads will take maybe 70s, compared to plain 120 when you execute them in sequence.
The rule of thumb is: threads should idle and wait most of the time. They are good for I/O, slow reads, low-priority work and so on. If you want performance, use multiplexing, which requires more work but is faster, more efficient and has way less caveats. (synchronizing threads and avoiding race conditions is a whole different chapter of thread headaches...)

Sample Problems for Multithreading Practice

I'm about to tackle what I see as a hard problem: I need to multi-thread a pipeline of producers and consumers.
So I want to start small. What are some practice problems, in varying levels of difficulty, that would be good for multi-threading practice? (And not contrived, impractical examples you see in books not dedicated to concurrency).
What books or references would you recommend that focus on concurrency and give in-depth problems and cases?
(I'd rather not focus on the problem I want to solve. I just want to ask for good references and sample problems. This would be more useful to other users. I'm not stuck on the problem.)
The Little Book of Semaphores is a good free book. The author takes the unusual approach of first posing a problem and then presenting hints before answering it. The problems increase in difficulty gradually, and the book isn't written for any particular language but covers general multithreading concepts.
If you have enough time to invest, I would recommend the book "Concurrency: State Models & Java Programs, 2nd Edition" by Jeff Magee and Jeff Kramer, John Wiley & Sons, 2006.
You can ignore the Java part if you are using some other language.
There's a language called FSP for modeling processes and concurrent processes. It takes some time and energy to become proficient in, but there's a companion tool, LTSA (both are free and available as an Eclipse plugin or a stand-alone app), which verifies your models and makes you pretty sure that your model is correct from the standpoint of concurrent execution.
Translating these models into your language's constructs is then just a question of programming technique and a few design patterns.
Most textbook problems, like readers-writers, producers-consumers, or dining philosophers, are all illustrations of mutual exclusion. I would prefer to model a prototype that is a simplistic approximation of the bigger problem and go from there.
I have sometimes seen situations where deadlock avoidance is what is needed but deadlock prevention measures are being used. It is always a good idea to analyse whether the Banker's algorithm would suit the case or not.
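For reference, the core of the Banker's safety check is small. Here is a sketch for a single resource type (the example numbers are made up; the full algorithm generalizes to vectors of resources):

    // Banker's algorithm safety check: grant a request only if some order
    // still exists in which every process can obtain its maximum and finish.
    final class Banker {
        static boolean isSafe(int available, int[] allocation, int[] maximum) {
            int n = allocation.length;
            boolean[] finished = new boolean[n];
            int free = available;
            for (int done = 0; done < n; ) {
                boolean progress = false;
                for (int i = 0; i < n; i++) {
                    // process i can finish if its remaining need fits in free
                    if (!finished[i] && maximum[i] - allocation[i] <= free) {
                        free += allocation[i];   // it finishes, releasing its share
                        finished[i] = true;
                        done++;
                        progress = true;
                    }
                }
                if (!progress) return false;     // no safe completion order exists
            }
            return true;
        }

        public static void main(String[] args) {
            int[] alloc = {1, 4, 5};
            int[] max   = {4, 6, 8};
            System.out.println(isSafe(2, alloc, max)); // true: order 1, 0, 2 works
        }
    }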
Completely ignoring your request, I'll suggest that you should look at SEDA (staged event driven architecture) as a way to think about setting up a multi-threaded pipeline of producers and consumers.
I'm not sure what you are looking for, but in real-world enterprise situations we usually use some kind of messaging framework when doing producer-consumer work. Typically in Java that's JMS, and you can use the excellent Spring Framework to help you along.
If you're working with Java at all (and possibly even if you're not), you should definitely read Java Concurrency In Practice.
To be honest, many real-world multithreading programs are not doing much more than reading/writing some value (whether string or int) -- circular buffers (as a network connection might need), readers/writers of log files, etc.
In fact, I'd say that if you implement (or find) a solid (and generic) circular buffer, and then run all thread-to-thread communication through those buffers as the only contact point, that'll cover a very large portion of any multithread syncing you might need to do. (Unless you're working in a buzzword-compliant environment, and need to tack "enterprise", "messaging", or whatever onto the buzzword list... or you're writing a database or operating system.)
(Note that "circular buffer" is a fairly C-centric term, being rooted in the relatively direct manipulation of a block of memory. Python's Queue class implements the same basic principle in a list-centric way, and I'm sure that numerous other languages have conceptually similar constructs under slightly different names...)
