waveOut (Win32 API) and multithreading

I cannot find any information about the thread-safety of the waveOut API.
After creating a new waveOut handle, I have these threads:
Thread 1: Buffer handling. Uses these API functions:
waveOutPrepareHeader
waveOutWrite
waveOutUnprepareHeader
Thread 2: GUI/controller thread. Uses these API functions:
waveOutPause
waveOutRestart
waveOutReset
waveOutBreakLoop
These two threads run concurrently and use the same waveOut handle.
In my tests I didn't see any problem with the functionality, but that doesn't mean it is safe.
Is this architecture thread-safe?
Is there any documentation about the thread safety of the waveOut API?
Any other suggestions about waveOut API thread safety?
Thanks.

In general the waveOut API should be thread-safe, because waveOutOpen() usually creates its own thread and all waveOut* functions send messages to that thread. But I cannot give you proof of that.
However, you can change your application to make it safe in any case:
Start your thread for buffer management and remember its ID as dwBufferThreadId.
From the GUI thread, call waveOutOpen with dwCallback set to dwBufferThreadId and fdwOpen set to CALLBACK_THREAD.
In your buffer management thread, waveOutWrite some buffers in advance, then loop on GetMessage().
Whenever a buffer is finished and a new buffer is required, the thread receives a WOM_DONE notification (delivered as the MM_WOM_DONE thread message when CALLBACK_THREAD is used); this is the moment to waveOutWrite a new buffer from within that thread.
Make your calls to waveOutPause, waveOutRestart and so on from the GUI thread (nothing in MSDN speaks against it, and all examples do this, even if the buffers are filled from another thread).
example 1
If you want to be 100% sure, you could pick a Windows message (WM_USER+0) and call PostThreadMessage(dwBufferThreadId, WM_USER+0, MY_CTL_PAUSE, 0); then, upon receiving that message in your buffering thread, you call waveOutPause() there. Windows message queues save you some work on writing your own message queues ;-)

I didn't see any documentation either, but I can't imagine that calling waveOutWrite concurrently with waveOutRestart on the same handle would be considered safe.
If you're using VS2010 Beta2 I would look at the various walkthroughs for the Agents Library and attempt to turn this into a producer consumer problem where you are passing messages like write,pause,restart, etc.
If you aren't using Visual Studio 2010 (or can't), I would encourage you to find a way to break this into a producer-consumer problem using threads and some sort of internally synchronized queue that stores the commands to process. If the messages aren't that frequent, and given that you only have 2 threads working on this queue, you may be able to get away with putting a plain old Win32 critical section around a std::queue...
Hope this helps.
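The producer-consumer idea above could be sketched roughly as follows. This is a minimal sketch in portable C++, not real waveOut code: std::mutex stands in for a Win32 critical section, and `Command`, `CommandQueue`, `run_audio_thread` and `demo_command_queue` are hypothetical names made up for illustration. The point is that only one thread ever touches the device handle; every other thread just queues a request.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical command set; a real player would also carry buffer pointers.
enum class Command { Write, Pause, Restart, Reset, Quit };

// Internally synchronized queue: any thread may push, one thread pops.
class CommandQueue {
public:
    void push(Command c) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(c);
        }
        ready_.notify_one();
    }

    // Blocks until a command is available.
    Command pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !queue_.empty(); });
        Command c = queue_.front();
        queue_.pop();
        return c;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Command> queue_;
};

// Body of the only thread that would touch the device handle.
// Returns the commands in the order they were handled.
std::vector<Command> run_audio_thread(CommandQueue& commands) {
    std::vector<Command> handled;
    for (;;) {
        Command c = commands.pop();
        handled.push_back(c);
        if (c == Command::Quit) break;
        // Assumption: this is where waveOutWrite, waveOutPause,
        // waveOutRestart or waveOutReset would be called on Windows.
    }
    return handled;
}

// Usage sketch: the "GUI" side queues commands, the audio thread drains them.
std::vector<Command> demo_command_queue() {
    CommandQueue commands;
    std::vector<Command> handled;
    std::thread audio([&] { handled = run_audio_thread(commands); });
    commands.push(Command::Write);
    commands.push(Command::Pause);
    commands.push(Command::Restart);
    commands.push(Command::Quit);
    audio.join();  // after join it is safe to read `handled`
    assert(handled.size() == 4);
    return handled;
}
```

Because commands are handled strictly in FIFO order on one thread, a pause can never overlap a write, whatever the API itself guarantees.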

Sadly, it's not safe even in a single threaded environment. Look at this question for a discussion:
Why would waveOutWrite() cause an exception in the debug heap?
Attempts to report this to Microsoft resulted in them closing the bug. They're not going to fix it.

It may be thread safe, but if neither you nor I can find any official documentation stating that it is, then assume it isn't and add your own thread synchronization. A lightweight EnterCriticalSection / LeaveCriticalSection implementation is probably no more than a dozen lines of code.
No amount of testing can ever assure you that the API is thread safe: problems may only occur on some architectures with some CPU or bus speeds or with some sound cards. Neither you (nor Microsoft) has the ability to test all possible configurations.
You also shouldn't make any assumptions about what Microsoft or Intel or a sound card manufacturer or driver writer will do in some future implementation.
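The "dozen lines" of locking might look something like the sketch below. It is written in portable C++ so it does not depend on winmm: std::mutex plays the role of the critical section, and `FakeDevice`, `Serialized` and `demo_serialized_device` are hypothetical names standing in for the waveOut handle and its wrapper.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the waveOut handle and its state.
struct FakeDevice {
    int buffers_written = 0;
    bool paused = false;
};

// Funnels every access to the wrapped object through one mutex, so calls
// from the buffer thread and the GUI thread can never overlap.
template <typename T>
class Serialized {
public:
    template <typename Fn>
    auto with_lock(Fn&& fn) -> decltype(fn(std::declval<T&>())) {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::forward<Fn>(fn)(value_);
    }

private:
    std::mutex mutex_;
    T value_{};
};

// Usage sketch: one thread "writes" buffers while another "pauses".
int demo_serialized_device() {
    Serialized<FakeDevice> device;
    std::thread buffers([&] {
        for (int i = 0; i < 1000; ++i)
            device.with_lock([](FakeDevice& d) { ++d.buffers_written; });
    });
    device.with_lock([](FakeDevice& d) { d.paused = true; });
    buffers.join();
    bool paused = device.with_lock([](FakeDevice& d) { return d.paused; });
    assert(paused);
    return device.with_lock([](FakeDevice& d) { return d.buffers_written; });
}
```

One caveat with this style: never hold the lock while waiting for something that itself needs the lock (a callback, for instance), or the two sides will deadlock.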

Related

"Multi-process" vs. "single-process multi-threading" for software modules communicating via messaging

We need to build a software framework (or middleware) that will enable messaging between different software components (or modules) running on a single machine. This framework will provide such features:
Communication between modules is through 'messaging'.
Each module will have its own message queue and message handler thread that will synchronously handle each incoming message.
With the above requirements, which of the following approaches is the correct one (and why)?
Implementing modules as processes, and messaging through shared memory
Implementing modules as threads in a single process, and messaging by pushing message objects to the destination module's message queue.
Of course, there are some apparent pros and cons:
In Option-2, if one module causes segmentation fault, the process (thus the whole application) will crash. And one module can access/mutate another module's memory directly, which can lead to difficult-to-debug runtime errors.
But with Option-1, you need to take care of the states where a module you need to communicate with has just crashed. If there are N modules in the software, there can be 2^N possible alive/crashed states of the system that affect the algorithms running on the modules.
Again in Option-1, sender cannot assume that the receiver has received the message, because it might have crashed at that moment. (But the system can alert all the modules that a particular module has crashed; that way, sender can conclude that the receiver will not be able to handle the message, even though it has successfully received it)
I am in favor of Option-2, but I am not sure whether my arguments are solid enough or not. What are your opinions?
EDIT: Upon requests for clarification, here are more specification details:
This is an embedded application that is going to run on Linux OS.
Unfortunately, I cannot tell you about the project itself, but I can say that there are multiple components in the project, each component will be developed by its own team (of 3-4 people), and it has been decided that the communication between these components/modules is through some kind of messaging framework.
C/C++ will be used as programming language.
What the 'Module Interface API' will automatically provide to the developers of a module is: (1) a message/event handler thread loop, (2) a synchronous message queue, (3) a function pointer member variable where you can set your message handler function.
Here is what I could come up with:
Multi-process(1) vs. Single-process, multi-threaded(2):
Impact of segmentation faults: In (2), if one module causes a segmentation fault, the whole application crashes. In (1), modules have different memory regions and thus only the module that causes the segmentation fault will crash.
Message delivery guarantee: In (2), you can assume that message delivery is guaranteed. In (1), the receiving module may crash before receipt or during handling of the message.
Sharing memory between modules: In (2), the whole memory is shared by all modules, so you can directly send message objects. In (1), you need to use 'Shared Memory' between modules.
Messaging implementation: In (2), you can send message objects between modules, in (1) you need to use either of network socket, unix socket, pipes, or message objects stored in a Shared Memory. For the sake of efficiency, storing message objects in a Shared Memory seems to be the best choice.
Pointer usage between modules: In (2), you can use pointers in your message objects. The ownership of heap objects (accessed by pointers in the messages) can be transferred to the receiving module. In (1), you need to manually manage the memory (with custom malloc/free functions) in the 'Shared Memory' region.
Module management: In (2), you are managing just one process. In (1), you need to manage a pool of processes each representing one module.
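For option (2), the three pieces of the 'Module Interface API' (handler thread loop, message queue, settable handler function) could be sketched as below. This is only an illustration under assumptions: `Message`, `Module` and `demo_module` are hypothetical names, and std::unique_ptr is used to show the ownership transfer of heap objects mentioned in the comparison.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical message type; real modules would define their own payloads.
struct Message {
    int type;
    std::string payload;
};

// One module = one inbox + one handler thread.
class Module {
public:
    using Handler = std::function<void(const Message&)>;

    explicit Module(Handler handler)
        : handler_(std::move(handler)), worker_([this] { run(); }) {}

    ~Module() {
        post(nullptr);  // a null message asks the handler thread to stop
        worker_.join();
    }

    // Callable from any thread; ownership of the message moves to this module.
    void post(std::unique_ptr<Message> msg) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            inbox_.push(std::move(msg));
        }
        ready_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::unique_ptr<Message> msg;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                ready_.wait(lock, [this] { return !inbox_.empty(); });
                msg = std::move(inbox_.front());
                inbox_.pop();
            }
            if (!msg) return;  // stop request
            handler_(*msg);    // messages are handled one at a time, in order
        }
    }

    Handler handler_;
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::unique_ptr<Message>> inbox_;
    std::thread worker_;  // declared last so the members above exist before run() starts
};

// Usage sketch: post four messages, then let the destructor drain and join.
int demo_module() {
    int sum = 0;  // touched only by the handler thread until the module is gone
    {
        Module adder([&sum](const Message& m) { sum += m.type; });
        for (int i = 1; i <= 4; ++i)
            adder.post(std::unique_ptr<Message>(new Message{i, "tick"}));
    }
    assert(sum == 10);
    return sum;
}
```

Note that nothing here protects one module from another module's stray pointer; that isolation is exactly what option (1) buys at the cost of shared-memory plumbing.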
Sounds like you're implementing Communicating Sequential Processes. Excellent!
Tackling threads vs processes first, I would stick to threads; the context switch times are faster (especially on Windows where process context switches are quite slow).
Second, shared memory vs a message queue; if you're doing full synchronous message passing it'll make no difference to performance. The shared memory approach involves a shared buffer that gets copied to by the sender and copied from by the reader. That's the same amount of work as is required for a message queue. So for simplicity's sake I would stick with the message queue.
In fact you might like to consider using a pipe instead of a message queue. You have to write code to make the pipe synchronous (pipes are normally asynchronous, which would be the Actor Model; message queues can often be set to zero length, which makes them synchronous and properly CSP), but then you could just as easily use a socket instead. Your program can then become multi-machine distributed should the need arise, without any change to the architecture. Named pipes between processes are an equivalent option too, so on platforms where process context switch times are good (e.g. Linux) the whole thread vs process question goes away. So working a bit harder to use a pipe gives you very significant scalability options.
Regarding crashing; if you go the multiprocess route and you want to be able to gracefully handle the failure of a process you're going to have to do a bit of work. Essentially you will need a thread at each end of the messaging channel simply to monitor the responsiveness of the other end (perhaps by bouncing a keep-awake message back and forth between themselves). These threads need to feed status info into their corresponding main thread to tell it when the other end has failed to send a keep-awake on schedule. The main thread can then act accordingly. When I did this I had the monitor thread automatically reconnect as and when it could (e.g. the remote process has come back to life), and tell the main thread that too. This means that bits of my system can come and go and the rest of it just copes nicely.
Finally, your actual application processes will end up as a loop, with something like select() at the top to wait for message inputs from all the different channels (and monitor threads) that it is expecting to hear from.
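The "select() at the top of the loop" step might look like the following sketch. It is POSIX-only (which fits the Linux target mentioned in the question), and `wait_for_input` and `demo_select` are hypothetical helper names; two pipes stand in for two module channels.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/select.h>
#include <unistd.h>
#include <vector>

// Waits up to timeout_ms for any descriptor in fds to become readable.
// Returns the index of the first readable descriptor, or -1 on timeout or error.
int wait_for_input(const std::vector<int>& fds, int timeout_ms) {
    fd_set readable;
    FD_ZERO(&readable);
    int max_fd = -1;
    for (int fd : fds) {
        FD_SET(fd, &readable);
        if (fd > max_fd) max_fd = fd;
    }
    timeval timeout;
    timeout.tv_sec = timeout_ms / 1000;
    timeout.tv_usec = (timeout_ms % 1000) * 1000;

    int n = select(max_fd + 1, &readable, nullptr, nullptr, &timeout);
    if (n <= 0) return -1;
    for (std::size_t i = 0; i < fds.size(); ++i)
        if (FD_ISSET(fds[i], &readable)) return static_cast<int>(i);
    return -1;
}

// Usage sketch: only the second channel has data waiting.
int demo_select() {
    int a[2], b[2];
    if (pipe(a) != 0 || pipe(b) != 0) return -2;
    char byte = 'x';
    ssize_t written = write(b[1], &byte, 1);
    assert(written == 1);
    int ready = wait_for_input({a[0], b[0]}, 1000);
    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return ready;
}
```

A real module loop would call something like `wait_for_input` repeatedly, read from whichever channel is ready, and treat the monitor threads' status pipes as just more descriptors in the list.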
By the way, this sort of thing is frustratingly hard to implement in Windows. There's just no proper equivalent of select() anywhere in any Microsoft language. There is a select() for sockets, but you can't use it on pipes, etc. like you can in Unix. The Cygwin guys had real problems implementing their version of select(). I think they ended up with a polling thread per file descriptor; massively inefficient.
Good luck!
Your question lacks a description of how the "modules" are implemented and what they do, and possibly a description of the environment in which you are planning to implement all of this.
For example:
If the modules themselves have requirements which make them hard to implement as threads (e.g. they use non-thread-safe 3rd party libraries, have global variables, etc.), your message delivery system will also not be implementable with threads.
If you are using an environment such as Python which does not handle thread parallelism very well (because of its global interpreter lock), and running on Linux, you will not gain any performance benefits with threads over processes.
There are more things to consider. If you are just passing data between modules, who says your system needs to use either multiple threads or multiple processes? There are other architectures which do the same thing without either of them, such as event-driven with callbacks (a message receiver can register a callback with your system, which is invoked when a message generator generates a message). This approach will be absolutely the fastest in any case where parallelism isn't important and where receiving code can be invoked in the execution context of the caller.
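The event-driven alternative with callbacks can be sketched without any threads at all. This is a minimal illustration, with `EventBus`, `subscribe`, `publish` and `demo_event_bus` as hypothetical names: the receiver registers a callback, and the sender's call runs it directly in the sender's own execution context.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

class EventBus {
public:
    using Callback = std::function<void(const std::string&)>;

    // A message receiver registers interest in a topic.
    void subscribe(const std::string& topic, Callback cb) {
        subscribers_[topic].push_back(std::move(cb));
    }

    // Runs every callback for the topic on the caller's thread.
    // Returns how many callbacks were invoked.
    std::size_t publish(const std::string& topic, const std::string& payload) {
        auto it = subscribers_.find(topic);
        if (it == subscribers_.end()) return 0;
        for (const auto& cb : it->second) cb(payload);
        return it->second.size();
    }

private:
    std::map<std::string, std::vector<Callback>> subscribers_;
};

// Usage sketch: the callback has already run by the time publish returns.
std::size_t demo_event_bus() {
    EventBus bus;
    std::string last;
    bus.subscribe("sensor", [&last](const std::string& p) { last = p; });
    std::size_t delivered = bus.publish("sensor", "42");
    assert(last == "42");
    return delivered;
}
```

There is no queue and no context switch, which is why this is fast; the price is that a slow or crashing receiver stalls or takes down the sender.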
tl;dr: you have only scratched the surface with your question :)

What are some of the core principles needed to master multi-threading using Delphi?

I am kind of new to programming in general (about 8 months with on and off in Delphi and a little Python here and there) and I am in the process of buying some books.
I am interested in learning about concurrent programming and building multi-threaded apps using Delphi. Whenever I do a search for "multithreading Delphi" or "Delphi multithreading tutorial" I seem to get conflicting results as some of the stuff is about using certain libraries (Omnithread library) and other stuff seems to be more geared towards programmers with more experience.
I have studied quite a few books on Delphi and for the most part they seem to kind of skim the surface and not really go into depth on the subject. I have a friend who is a programmer (he uses C++) who recommends I learn what is actually going on with the underlying system when using threads, as opposed to jumping into how to actually implement them in my programs first.
On Amazon.com there are quite a few books on concurrent programming but none of them seem to be made with Delphi in mind.
Basically I need to know what are the main things I should be focused on learning before jumping into using threads, if I can/should attempt to learn them using books that are not specifically aimed at Delphi developers (don't want to confuse myself reading books with a bunch of code examples in other languages right now) and if there are any reliable resources/books on the subject that anyone here could recommend.
Short answer
Go to OmniThreadLibrary, install it, and read everything on the site.
Longer answer
You asked for some info so here goes:
Here's some stuff to read:
http://delphi.about.com/od/kbthread/Threading_in_Delphi.htm
I personally like: Multithreading - The Delphi Way.
(It's old, but the basics still apply)
Basic principles:
Your basic VCL application is single threaded.
The VCL was not built with multithreading in mind; thread support is bolted on, so most VCL components are not thread-safe.
The way this is done is by making threads wait, so if you want a fast application, be careful about when and how you communicate with the VCL.
Communicating with the VCL
Your basic thread is a descendant of TThread with its own members.
These are per-thread variables. As long as you use these you don't have any problems.
My favorite way of communicating with the main window is by using custom Windows messages and PostMessage to communicate asynchronously.
If you want to communicate synchronously you will need to use a critical section or the Synchronize method.
See this article for example: http://edn.embarcadero.com/article/22411
Communicating between threads
This is where things get tricky, because you can run into all sorts of hard-to-debug synchronization issues.
My advice: use OmniThreadLibrary; also see this question: Cross thread communication in Delphi
Some people will tell you that reading and writing integers is atomic on x86, but this is not 100% true, so don't use those in a naive way, because you'll most likely get subtle issues wrong and end up with hard to debug code.
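To make the point about "naive" integer access concrete, here is a small sketch. It is in C++ rather than Delphi (an assumption made for the sake of a runnable example); `fetch_add` on a std::atomic plays roughly the role that InterlockedIncrement plays in the Windows API, and `count_atomically` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Increments a shared counter from two threads. Each increment is one
// indivisible read-modify-write, so no update is lost. With a plain int and
// `++counter` this would be a data race, and increments could be lost even
// though each individual load and store might look atomic.
int count_atomically(int per_thread) {
    assert(per_thread >= 0);
    std::atomic<int> counter{0};
    auto work = [&] {
        for (int i = 0; i < per_thread; ++i) counter.fetch_add(1);
    };
    std::thread a(work), b(work);
    a.join();
    b.join();
    return counter.load();
}
```

The subtle part is that incrementing is three steps (read, add, write); atomicity of the individual read and write does not make the combination safe.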
Starting and stopping threads
In old Delphi versions TThread.Suspend and TThread.Resume were used; however, these are no longer recommended and should be avoided (in the context of thread synchronization).
See this question: With what delphi Code should I replace my calls to deprecated TThread method Suspend?
Also have a look at this question although the answers are more vague: TThread.resume is deprecated in Delphi-2010 what should be used in place?
You can use suspend and resume to pause and restart threads, just don't use them for thread synchronization.
Performance issues
Putting wait_for..., Synchronize, etc. code in your thread effectively stops your thread until the action it's waiting for has occurred.
In my opinion this defeats a big purpose of threads: speed.
So if you want to be fast you'll have to get creative.
A long time ago I wrote an application called Life32.
It's a display program for Conway's Game of Life that can generate patterns very fast (millions of generations per second on small patterns).
It used a separate thread for calculation and a separate thread for display.
Displaying is a very slow operation that does not need to be done every generation.
The generation thread included display code that removes stuff from the display (when in view) and the display thread simply sets a boolean that tells the generation thread to also display the added stuff.
The generation code writes directly to the video memory using DirectX, no VCL or Windows calls required and no synchronization of any kind.
If you move the main window the application will keep on displaying on the old location until you pause the generation, thereby stopping the generation thread, at which point it's safe to update the thread variables.
If the threads are not 100% synchronized the display happens a generation too late, no big deal.
It also features a custom memory manager that avoids the thread-safe slowness that's in the standard memory manager.
By avoiding any and all forms of thread synchronization I was able to cut the synchronization overhead from 90%+ (on smallish patterns) to 0.
You really shouldn't get me started on this, but anyway, my suggestions:
Try hard to not use the following:
TThread.Synchronize
TThread.WaitFor
TThread.OnTerminate
TThread.Suspend
TThread.Resume, (except at the end of constructors in some Delphi versions)
TApplication.ProcessMessages
Use the PostMessage API to communicate to the main thread - post objects in lParam, say.
Use a producer-consumer queue to communicate to secondary threads, (not a Windows message queue - only one thread can wait on a WMQ, making thread pooling impossible).
Do not write directly from one thread to fields in another - use message-passing.
Try very hard indeed to create threads at application startup and to not explicitly terminate them at all.
Do use object pools instead of continually creating and freeing objects for inter-thread communication.
The result will be an app that performs well, does not leak, does not deadlock and shuts down immediately when you close the main form.
What Delphi should have had built-in:
TWinControl.PostObject(anObject:TObject) and TWinControl.OnObjectRx(anObject:TObject) - methods to post objects from a secondary thread and fire a main-thread event with them. A trivial PostMessage wrap to replace the poor performing, deadlock-generating, continually-rewritten TThread.Synchronize.
A simple, unbounded producer-consumer class that actually works for multiple producers/consumers. This is, like, 20 lines of TObjectQueue descendant but Borland/Embarcadero could not manage it. If you have object pools, there is no need for complex bounded queues.
A simple thread-safe, blocking, object pool class - again, really simple with Delphi since it has class variables and virtual constructors, eg. creating a lot of buffer objects:
myPool:=TobjectPool.create(1024,TmyBuffer);
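A thread-safe, blocking object pool of the kind described above might be sketched like this. It is written in C++ rather than Delphi, and `ObjectPool`, `acquire`, `release` and `available` are hypothetical names; the constructor argument corresponds to the 1024 in the Delphi line above.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Creates all objects up front; threads borrow and return them instead of
// allocating and freeing per message.
template <typename T>
class ObjectPool {
public:
    explicit ObjectPool(std::size_t count) {
        for (std::size_t i = 0; i < count; ++i)
            free_.push_back(std::unique_ptr<T>(new T()));
    }

    // Blocks until an object is free, then hands ownership to the caller.
    std::unique_ptr<T> acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return !free_.empty(); });
        std::unique_ptr<T> obj = std::move(free_.back());
        free_.pop_back();
        return obj;
    }

    // Returns an object to the pool and wakes one waiting thread.
    void release(std::unique_ptr<T> obj) {
        assert(obj);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(std::move(obj));
        }
        available_.notify_one();
    }

    // Number of objects currently sitting in the pool.
    std::size_t available() {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    std::mutex mutex_;
    std::condition_variable available_;
    std::vector<std::unique_ptr<T>> free_;
};
```

Because `acquire` blocks when the pool is empty, the pool also acts as flow control: a fast producer simply stalls until consumers hand buffers back, which is why no bounded queue is needed on top of it.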
I thought it might be useful to actually try to compile a list of things that one should know about multithreading.
Synchronization primitives: mutexes, semaphores, monitors
Delphi implementations of synchronization primitives: TCriticalSection, TMREWSync, TEvent
Atomic operations: some knowledge about what operations are atomic and what not (discussed in this question)
Windows API multithreading capabilities: InterlockedIncrement, InterlockedExchange, ...
OmniThreadLibrary
Of course this is far from complete. I made this community wiki so that everyone can edit.
In addition to all the other answers, I strongly suggest reading a book like "Modern Operating Systems" or any other one that goes into multithreading details. This may seem like overkill, but it would make you a better programmer, and you definitely get very good insight into threading and processes in an abstract way, so you learn why and how to use critical sections or semaphores through examples (like the dining philosophers problem or the sleeping barber problem).

Can you receive Events in a secondary thread in Delphi XE?

I would like to have three threads in a sample application.
Thread #1 (Main Thread) - User Interface/GUI
Thread #2 - Tied to a serial port device receiving data via events passing to a data queue.
Thread #3 - Activated when a queue entry is made, process data node, frees data object.
The goal is to
a) Prevent the loss of data when a button or the form is held by the mouse on the main form.
b) Quickly get the data from the event, stuff it in the queue, go back to sleep
c) Process data when we have it, otherwise sleep.
Can packages like AsyncPro tie event handling to a non-main thread?
I've never done much with event-driven serial port apps; most of what I've worked with is polled, and I want to do some testing.
You can definitely tie event handling to a non-main thread. What you can't do is tie screen updating to a non-main thread. The Windows API is not threadsafe, and so the Delphi VCL, which is built on top of the Windows API, isn't either. But your design is basically a good, workable idea; just remember to use the Synchronize or Queue methods of TThread to send any UI updates back to be executed on the main thread.
The easiest approach should be to define some user messages, then send them from the sub-threads to the main thread.
It's perfectly thread-safe, and even process-safe.
Use PostMessage() with the Handle of the main form. But don't broadcast this WM_USER+n message to the whole UI, because you could confuse some part of the VCL which defines its own custom messages.
If you want to copy some textual data across threads or processes, you can look at WM_COPYDATA. In practice, this is very fast, faster than named pipes for small messages.
For the user interface, I discovered that a stateless implementation is sometimes a good idea. That is, you don't call back the main thread via a Synchronize() call or a GDI message; instead, your main GUI thread has a timer which checks a shared memory buffer for pending updates. This is how the web works, and in practice it's pretty easy to work with: you don't have to write any callback, and each thread is independent, does its own stuff, and refreshes when necessary.
But of course, the solution depends on your exact project architecture.
For a simple but proven library, see AsyncCalls, working from Delphi 5 up to XE. For latest versions of the IDE (Delphi 2007 and later), take a look at OmniThreadLibrary. By using such libraries, you'll ensure that your software implementation won't break anywhere: it's very common for a multi-threaded application to work as expected most of the time and then, for unknown reasons, to go into an endless loop. And, of course, it happens only on the customer side, not yours... If you don't want to spend hours debugging your program, just trust those proven libraries, which are known to be well designed and debugged.
Sure you can do this, one way or another. I've not used Apro since D5 - the Apro I have does not work on my D2009 (Unicode/string/AnsiString issues), and I have my own serial classes. Most of the available serial components have the option of firing dataRx events on either the rx thread or the main GUI thread - obviously in your case you should select the rx thread (Thread #2). Shove the rx data into some buffer class and push it onto a producer-consumer queue to Thread #3. Process it there. If you need to do a GUI update from there, PostMessage the reference to the GUI thread and handle it in a user-defined message-handler procedure.
Done this sort of stuff loadsa times - it will work OK.
Rgds,
Martin

How to output data from a thread to another thread without locking?

I'm developing a DirectShow application. I've encountered a deadlock that seems to be caused by acquiring a lock in a callback function called from a thread. This is the question I asked in the MSDN forum:
http://social.msdn.microsoft.com/Forums/en-US/windowsdirectshowdevelopment/thread/f9430f17-6274-45fc-abd1-11ef14ef4c6a
Now I have to avoid acquiring a lock in that thread. But the problem is that I have to output the audio to another thread. How can I pass data to another thread without a lock?
Someone told me that I can use PostMessage from the Win32 SDK to post data to another thread. However, to get the message, I have to run a Windows message loop. My program is a Python C++ extension module, so it might be very difficult to add a loop to pull messages. So I am thinking of another way to pass data among threads without locking.
(Actually, the producer thread can't block on a lock, but the consumer thread can.)
To lock or not to lock, that's the question.
So the question is: how do I do it?
Thanks.
------EDIT------
I think I know why I got a deadlock, and it might not be a DirectShow problem.
The main thread is owned by Python. It calls stop while holding the GIL, and stop waits for the DirectShow callback in the other thread to return. But the callback tries to acquire the GIL.
It looks like this:
Main(Hold GIL) -> Stop(Wait callback) -> Callback(Wait GIL) -> GIL(Hold by Main thread)
Damn it! That's why I don't like multithreading so much.
No matter what, thanks for your help.
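The cycle described in the edit can be modeled with an ordinary mutex standing in for the GIL. This is only a model under that assumption, with hypothetical names (`gil`, `stop_without_deadlock`); in a real CPython extension the usual way to release the GIL around a blocking call is the Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS macro pair.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex gil;  // stand-in for Python's global interpreter lock

// Models "Stop": the caller holds the lock, and the callback thread needs it
// before it can finish. Waiting for the callback while still holding the lock
// would hang forever, so the lock is released around the wait.
bool stop_without_deadlock() {
    std::unique_lock<std::mutex> held(gil);  // main thread "holds the GIL"
    bool callback_ran = false;

    std::thread callback([&] {
        std::lock_guard<std::mutex> lock(gil);  // callback waits for the GIL
        callback_ran = true;
    });

    held.unlock();    // release before blocking, so the callback can proceed
    callback.join();  // wait for the callback to return
    held.lock();      // take the lock back before returning to "Python"

    assert(callback_ran);
    return callback_ran;
}
```

Moving `callback.join()` above `held.unlock()` reproduces the deadlock exactly as drawn in the chain above.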
If you were doing this in pure Python, I'd use a Queue object; these buffer up data which is written but block on read until something is available, and do any necessary locking under the hood.
This is an extremely common data type, and some equivalent should always be available, whatever your current language or toolchain; there's std::queue in the C++ standard library, for instance, but the standard doesn't specify thread-safety characteristics for it (so see your local implementation docs).
Well, theoretically locks can be avoided if both of your threads can work on duplicate copies of the same data. After reading your question in the MSDN forum...
"So to avoid deadlock, I should not acquire any lock in the graber callback function? How can I do if I want to output audio to another thread?"
I think that you should be able to deposit your audio data in a deque (std::deque, an STL class) and then fetch this data from another thread. This other thread can then process your audio data.
I am glad that your problem has been resolved. The reason I asked about your OS was that the documentation you referred to said that you should not wait on other threads because of some problem with the Win16Mutex. There is no Win16Mutex on Windows XP (except when programs are running under NTVDM/WOW16), so you should be able to use locks to synchronize these threads.
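For the original "producer must never block" requirement, one commonly used alternative that none of the answers spell out is a single-producer, single-consumer ring buffer built on atomics. The sketch below is an illustration under that assumption (exactly one producer thread and one consumer thread); `SpscRing`, `try_push` and `try_pop` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Single-producer, single-consumer ring buffer. Neither side takes a lock:
// try_push fails instead of waiting when full, try_pop fails when empty.
template <typename T>
class SpscRing {
public:
    explicit SpscRing(std::size_t capacity)
        : slots_(capacity + 1), head_(0), tail_(0) {
        assert(capacity > 0);
    }

    // Call from the producer thread only (e.g. the audio callback).
    bool try_push(const T& value) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        std::size_t next = (tail + 1) % slots_.size();
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        slots_[tail] = value;
        tail_.store(next, std::memory_order_release);  // publish the slot
        return true;
    }

    // Call from the consumer thread only.
    bool try_pop(T& out) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return false;  // empty
        out = slots_[head];
        head_.store((head + 1) % slots_.size(), std::memory_order_release);
        return true;
    }

private:
    std::vector<T> slots_;           // one slot is kept empty to tell full from empty
    std::atomic<std::size_t> head_;  // next slot to read
    std::atomic<std::size_t> tail_;  // next slot to write
};
```

When the ring is full the producer has to drop or overwrite data rather than wait, so the capacity should be sized for the worst-case delay of the consumer.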

How to send a message to a TThread from main thread in Delphi?

I want to send a message to a thread and handle it in the thread. How can I do this in Delphi? I guess PostMessage is the way to go, but the examples I've seen so far are describing the other way, i.e. from the thread to main thread.
I won't even try and explain or write any code. Just look at this tutorial. It's a little old, but very good imho.
Multithreading - The Delphi Way
You can either have a message loop (possibly with a hidden notification window) in your thread and send a Windows message to it, or you can use a more native (less-GUI) way of doing it, such as a queue protected by a critical section combined with a manual-reset event that the thread waits on and the sending thread signals.
A more general solution is a producer-consumer queue, which in the classic implementation uses a couple of semaphores to keep track of consumers and producers and a third semaphore for mutually exclusive access to the queue; however, more optimal producer-consumer queues are available on the net.
Why would you need to do it? There is only one reason I have ever had to create a message loop in a secondary thread, and that is because the thread used COM objects. The calls to OleInitialize() and OleUninitialize() are a sign that you need a standard GetMessage() loop. In that case it's also necessary to just post messages to that thread, using PostThreadMessage(), because normal blocking synchronization calls would interfere with the message loop. Otherwise, just don't do it.
If you are at Delphi 2007 or 2009, be sure to look into OmniThreadLibrary by Primož Gabrijelčič, it should make your job much easier.

Resources