I'm considering a multi-threaded architecture for a processing pipeline. My main processing module has an input queue, from which it receives data packets. It then performs transformations on these packets (decryption, etc.) and places them into an output queue.
The threading comes in where many input packets can have their contents transformed independently from one another.
However, the punchline is that the output queue must have the same ordering as the input queue (i.e., the first pulled off the input queue must be the first pushed onto the output queue, regardless of whether its transformations finished first.)
Naturally, there will be some kind of synchronisation at the output queue, so my question is: what would be the best way of ensuring that this ordering is maintained?

Have a single thread read the input queue, post a placeholder on the output queue, and then hand the item over to a worker thread to process. When the data is ready the worker thread updates the placeholder. When the thread that needs the value from the output queue reads the placeholder it can then block until the associated data is ready.
Because only a single thread reads the input queue, and this thread immediately puts the placeholder on the output queue, the order in the output queue is the same as that in the input. The worker threads can be numerous, and can do the transformations in any order.
On platforms that support futures, they are ideal as the placeholder. On other systems you can use an event, monitor or condition variable.
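A minimal sketch of the placeholder idea in Python, using concurrent.futures. The names transform and run_pipeline are hypothetical, and the sleep only simulates packets finishing out of order; this illustrates the approach under those assumptions and is not a tuned implementation.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor

def transform(packet):
    # Stand-in for the real work (decryption, etc.). The sleep makes
    # some later packets finish before earlier ones.
    time.sleep(0.01 * (3 - packet % 3))
    return packet * 2

def run_pipeline(packets, workers=4):
    output_queue = queue.Queue()
    results = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Single reader: hand each packet to a worker and immediately
        # enqueue its future, so output order equals input order.
        for packet in packets:
            output_queue.put(pool.submit(transform, packet))
        # Consumer: block on each placeholder in turn.
        while not output_queue.empty():
            results.append(output_queue.get().result())
    return results
```

Here the future plays the role of the placeholder: result() returns at once if the worker has finished and blocks otherwise.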

With the following assumptions:
  there should be one input queue, one output queue and one working queue
  there should be only one input queue
  output message should contain a wait handle and a pointer to worker/output data
  there may be an arbitrary number of worker threads
I would consider the following flow:
Input queue listener does these steps:
  extracts input message;
  creates output message:
    initializes worker data struct
    resets the wait handle
  enqueues the pointer to the output message into the working queue
  enqueues the pointer to the output message into the output queue
Worker thread does the following:
  waits on a working queue to extract a pointer to an output message from it
  processes the message based on the given data and sets the event when done
Consumer does the following:
  waits on an output queue to extract a pointer to an output message from it
  waits on a handle until the output data is ready
  does something with the data
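The flow above could be sketched in Python roughly as follows, with threading.Event standing in for the wait handle. OutputMessage, worker and run_in_order are hypothetical names, and the None sentinel used to stop the workers is an assumption that the original flow does not mention.

```python
import queue
import threading

class OutputMessage:
    def __init__(self, payload):
        self.payload = payload          # worker input
        self.result = None              # worker output
        self.ready = threading.Event()  # wait handle, initially not set

def worker(working_queue, process):
    while True:
        msg = working_queue.get()       # wait for work
        if msg is None:                 # sentinel (assumed): shut down
            break
        msg.result = process(msg.payload)
        msg.ready.set()                 # signal that the output is ready

def run_in_order(inputs, process, n_workers=3):
    working_queue, output_queue = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(working_queue, process))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    # Input queue listener: one message goes into both queues.
    for item in inputs:
        msg = OutputMessage(item)
        working_queue.put(msg)
        output_queue.put(msg)
    # Consumer: take messages in input order, wait on each handle.
    results = []
    for _ in range(len(inputs)):
        msg = output_queue.get()
        msg.ready.wait()
        results.append(msg.result)
    for _ in threads:
        working_queue.put(None)
    for t in threads:
        t.join()
    return results
```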

That's going to be implementation-specific. One general solution is to number the input items and preserve the numbering so you can later sort the output items. This could be done once the output queue is filled, or it could be done as part of filling it. In other words, you could insert them into their proper position and only allow the queue to be read when the next available item is sequential.
I'm going to sketch out a basic scheme, trying to keep it simple by using the appropriate primitives:
Instead of queueing a Packet into the input queue, we create a future value around it and enqueue that into both the input and output queues. In C#, you could write it like this:
var future = new Lazy<Packet>(delegate() { return Process(packet); }, LazyThreadSafetyMode.ExecutionAndPublication);
A thread from the pool of workers dequeues a future from the input queue and executes future.Value, which causes the delegate to run JIT and returns once the delegate is done processing the packet.
One or more consumers dequeue a future from the output queue. Whenever they need the value of the packet, they call future.Value, which returns immediately if a worker thread has already run the delegate to completion; if a worker is still running it, the call blocks until the result is ready.
Simple, but works.

If you are using a windowed approach (a known number of elements), use an array for the output queue, for example when streaming media and discarding packets that haven't been processed quickly enough.
Otherwise, use a priority queue (typically a heap, often implemented on top of a fixed-size array) for the output items.
You need to add to each data packet a sequence number, or any datum on which you can sort the items. A priority queue is a tree-like structure that always hands back the smallest remaining item, regardless of insertion order.
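As a rough illustration of the priority-queue variant, here is a small reorder buffer built on Python's heapq. ReorderBuffer is a hypothetical name, and the sketch assumes sequence numbers are unique and contiguous from first_seq; items are released only when the next expected number is at the top of the heap.

```python
import heapq
import threading

class ReorderBuffer:
    def __init__(self, first_seq=0):
        self._heap = []             # min-heap of (seq, item) pairs
        self._next = first_seq      # next sequence number to release
        self._lock = threading.Lock()

    def push(self, seq, item):
        """Insert a finished item; return the items now releasable in order."""
        with self._lock:
            heapq.heappush(self._heap, (seq, item))
            released = []
            # Release the run of consecutive items starting at _next.
            while self._heap and self._heap[0][0] == self._next:
                released.append(heapq.heappop(self._heap)[1])
                self._next += 1
            return released
```

Worker threads would call push(seq, result) as they finish, and whatever comes back can go straight onto the output queue.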


Sending variables from one python thread to another

Let's say I have a function that will run in its own thread, since it's getting serial data through a port.
def serialDataIncoming():
    device = Radar()
    device.connect(port=1, baudrate=256000)
    serialdata = device.startscan
    for count, scan in enumerate(serialdata):
        distance = device.distance
        sector = device.angle
Now I want to run this in its own thread
# error handling here
Now I want to add a line to serialDataIncoming() that sends the distance and sector to another function, to be processed and then sent somewhere else. Here is the issue: the data coming from "device" is sent continuously, so I may experience a delay or even lose some data if I spend time inside the loop on other work. So I want to create a new thread, and from that thread run a function that receives data from the first thread and processes it.
def dataProcessing():
    # random code here where I process the data
    pass
However, my issue is: how do I send both variables from one thread to the second thread? As I picture it, the second thread would have to wait until it receives the variables and then start working. A lot of data is going to be sent at once, so I might have to introduce a third thread that holds the data and then sends it to the thread that processes it.
So the question is basically this: how would I write, in Python, sending two variables to another thread, and how would that be written in the function run on the second thread?
To pass arguments to the thread function you can do:
import threading

def thread_fn(a, b, c):
    print(a, b, c)

# args are passed positionally to thread_fn when the thread starts
threading.Thread(target=thread_fn, args=("asdsd", 123, False)).start()
The arguments must be passed as a tuple or list. However, in CPython the global interpreter lock means only one thread executes Python bytecode at a time, so it may actually be more reliable (and simpler) to work out a way to do this with one thread. From the sounds of it you are polling the data, so this is not like file access where the OS will notify the thread when it can wake up again once the file operation has completed (hence you won't get the kind of gains you would from multithreaded file access).
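If a second thread is used after all, one common way to hand values over continuously is queue.Queue from the standard library. The sketch below is only an illustration: reader, processor and run_handoff are hypothetical names, the readings list stands in for the device scan, and the None sentinel and the doubling step are placeholders.

```python
import queue
import threading

def reader(readings, out_queue):
    # Stand-in for the serial loop: `readings` replaces the device scan.
    for distance, sector in readings:
        out_queue.put((distance, sector))   # hand off and keep reading
    out_queue.put(None)                     # sentinel (assumed): no more data

def processor(in_queue, results):
    while True:
        item = in_queue.get()               # blocks until data arrives
        if item is None:
            break
        distance, sector = item
        results.append((sector, distance * 2))  # placeholder processing

def run_handoff(readings):
    q = queue.Queue()
    results = []
    t1 = threading.Thread(target=reader, args=(readings, q))
    t2 = threading.Thread(target=processor, args=(q, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results
```

The queue itself buffers the data, so a separate third thread to hold it is not needed in this arrangement.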

Serial Dispatch Queue with Asynchronous Blocks

Is there ever any reason to add blocks to a serial dispatch queue asynchronously as opposed to synchronously?
As I understand it, a serial dispatch queue only starts executing the next task in the queue once the preceding task has completed executing. If this is the case, I can't see what you would gain by submitting some blocks asynchronously: the act of submission may not block the thread (since it returns straight away), but the task won't be executed until the last task finishes, so it seems to me that you don't really gain anything.
This question has been prompted by the following code - taken from a book chapter on design patterns. To prevent the underlying data array from being modified simultaneously by two separate threads, all modification tasks are added to a serial dispatch queue. But note that returnToPool adds tasks to this queue asynchronously, whereas getFromPool adds its tasks synchronously.
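class Pool<T> {
    private var data = [T]();
    // Create a serial dispatch queue
    private let arrayQ = dispatch_queue_create("arrayQ", DISPATCH_QUEUE_SERIAL);
    private let semaphore:dispatch_semaphore_t;

    init(items:[T]) {
        for item in items {
            data.append(item);  // assumed: copy the items into the pool
        }
        semaphore = dispatch_semaphore_create(items.count);
    }

    func getFromPool() -> T? {
        var result:T?;
        if (dispatch_semaphore_wait(semaphore, DISPATCH_TIME_FOREVER) == 0) {
            dispatch_sync(arrayQ, {() in
                result = self.data.removeAtIndex(0);  // assumed: take the first item
            })
        }
        return result;
    }

    func returnToPool(item:T) {
        dispatch_async(arrayQ, {() in
            self.data.append(item);  // assumed: put the item back
            dispatch_semaphore_signal(self.semaphore);  // assumed: wake a waiter
        });
    }
}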
Because there's no need to make the caller of returnToPool() block. It could perhaps continue on doing other useful work.
The thread which called returnToPool() is presumably not just working with this pool. It presumably has other stuff it could be doing. That stuff could be done simultaneously with the work in the asynchronously-submitted task.
Typical modern computers have multiple CPU cores, so a design like this improves the chances that CPU cores are utilized efficiently and useful work is completed sooner. The question isn't whether tasks submitted to the serial queue operate simultaneously — they can't because of the nature of serial queues — it's whether other work can be done simultaneously.
Yes, there are reasons why you'd add tasks to a serial queue asynchronously. It's actually extremely common.
The most common example would be when you're doing something in the background and want to update the UI. You'll often dispatch that UI update asynchronously back to the main queue (which is a serial queue). That way the background thread doesn't have to wait for the main thread to perform its UI update, but rather it can carry on processing in the background.
Another common example is the one you've demonstrated: using a GCD queue to synchronize interaction with some object. If you're dealing with mutable objects, you can dispatch updates asynchronously to this synchronization queue (there is no reason to make the current thread wait, so let it carry on). You'll do reads synchronously (because you're obviously going to wait until you get the synchronized value back), but writes can be done asynchronously.
(You actually see this latter example frequently implemented with the "reader-writer" pattern and a custom concurrent queue, where reads are performed synchronously on a concurrent queue with dispatch_sync, but writes are performed asynchronously with a barrier using dispatch_barrier_async. But the idea is equally applicable to serial queues, too.)
The choice of synchronous v asynchronous dispatch has nothing to do with whether the destination queue is serial or concurrent. It's simply a question of whether you have to block the current queue until that other one finishes its task or not.
Regarding your code sample, that is correct. getFromPool should dispatch synchronously (because you have to wait for the synchronization queue to actually return the value), but returnToPool can safely dispatch asynchronously. Obviously, I'm wary of seeing code waiting for semaphores if that might be called from the main thread (so make sure you don't call getFromPool from the main thread!), but with that one caveat, this code should achieve the desired purpose, offering reasonably efficient synchronization of this pool object, with a getFromPool that will block if the pool is empty until something is added to the pool.
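For readers more at home in Python, the same sync-versus-async distinction can be sketched with a single-worker ThreadPoolExecutor standing in for the serial queue. This is an analogy rather than GCD: Pool here is a hypothetical class, submit() without waiting plays the role of dispatch_async, and submit(...).result() plays the role of dispatch_sync.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class Pool:
    def __init__(self, items):
        self._data = list(items)
        # Counts the items currently available in the pool.
        self._available = threading.Semaphore(len(self._data))
        # One worker: submitted tasks run one at a time, in submission order.
        self._serial = ThreadPoolExecutor(max_workers=1)

    def get_from_pool(self):
        self._available.acquire()       # block while the pool is empty
        # "Synchronous" use: wait for the serial worker to return the value.
        return self._serial.submit(self._data.pop, 0).result()

    def return_to_pool(self, item):
        def put_back():
            self._data.append(item)
            self._available.release()
        # "Asynchronous" use: the caller does not wait for the append.
        self._serial.submit(put_back)

    def close(self):
        self._serial.shutdown(wait=True)
```

The caller of return_to_pool carries on immediately, while get_from_pool has to wait because it needs the value back.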

Message ordinal-number by enqueuing order

My application (.NET-based) gets messages from a queue in a multithreaded fashion and I'm worried about the fact that I may receive messages in an out-of-order manner because one thread can be quicker than the other, for instance, given the following queue state:
[Message-5 | Message-4 | Message-3 | Message-2 | Message-1]
In a multithreaded operation, msg #2 may arrive before msg #1, even though msg #1 was first in the queue, due to many threading issues (thread time slices, thread scheduling etc).
In such a situation, it would be great if each message in the queue were already stamped with an ordinal/sequence number when it was enqueued, so that even if I get the messages out of order, I can still order them at some point within my application using that ordinal-number attribute.
Any known mechanism to achieve it in a Websphere MQ environment?
You have 2 choices:
(1) Use Message Grouping in MQ as whitfiea mentioned or
(2) Change your application to be single threaded.
Note: If the sending application does not set the MQMD MsgId field then the queue manager will generate a unique number (based on queue manager name, date & time) and store it in the message's MQMD MsgID field.
You can obtain the MessageSequenceNumber from the MQMessage if the messages are put to the queue in a message group. The MessageSequenceNumber will either be the order in which the messages were put to the queue (the default) or be defined by the application that put the messages to the queue.
See the MessageSequenceNumber here for more details
Yes, if the originating message has an ordinal then as you receive your data you could:
Use a thread safe dictionary:

Should Storm Spouts only emit output using the thread calling Spout.nextTuple?

The ISpout.nextTuple() javadoc specifies that nextTuple(), ack(...) and fail(...) are called on the same thread.
However, the actual collector upon which emit(...) is called is supplied earlier, as a parameter on open(..., collector).
Question is whether a background thread that sees some new data must always enqueue the data for nextTuple() to dequeue and emit. What would happen if the background thread emits the data immediately? Is that supported? If that is allowed, what's the recommended way to implement the "sleep for a short amount of time" in nextTuple()?
The implicit meaning of "nextTuple()/ack()/fail() are called on the same thread" is that the task (a background Java thread) running on machine 'A' that emits a tuple is the same task, running on 'A', on which ack()/fail() is later called, depending on whether processing of the tuple in the topology (by Bolts running on 'B' or 'C') succeeded or failed.
As long as the messageId is not null and the Bolt tasks call ack(tuple) in their execute() method, the Storm framework keeps track of the tuple's traversal within the topology and calls ack()/fail() on the tuple's owning task.
Before answering your question, here is a brief introduction to how the background task thread works. The background task thread has an in-memory structure/buffer for emitted tuples, and a few other in-memory structures for status, pending tuples, etc. The buffer fills up as the Spout/Bolt emits data and is freed up as tuples are processed, i.e. after ack()/fail() is called. Essentially, the background thread calls nextTuple() when the buffer has room and stops calling nextTuple() once the buffer is full. In simple words, emit(), whether called in open(), nextTuple() or close(), fills the background thread's buffer, and ack()/fail() frees it up.
Given the above, the background thread is unaware of new/incoming data. It's up to the logic within nextTuple() to read the data from the source (Twitter/JMS providers/ESB/AMQP compliant servers/RDBMS) and emit it. So, depending on the background thread's buffer size, Storm calls nextTuple() as explained above.
As for your other question, it should be OK to sleep for a short duration if required. Note that nextTuple() need not emit a value; it can return without emitting anything.
It is my understanding that you shouldn't emit data unless requested by Storm by calling your nextTuple() method. Consequently, your background thread must enqueue new data, so that it is emitted when requested. Your nextTuple() method should sleep briefly only if there are no tuples to emit when the method is called.
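The enqueue-then-emit pattern described here can be sketched in a language-neutral way. The Python below is not the Storm API: QueueBackedSpout, on_new_data, next_tuple and ListCollector are hypothetical stand-ins, and the 1 ms idle sleep is an arbitrary assumption.

```python
import queue
import time

class ListCollector:
    """Hypothetical stand-in for the framework's output collector."""
    def __init__(self):
        self.emitted = []

    def emit(self, value):
        self.emitted.append(value)

class QueueBackedSpout:
    def __init__(self, collector, idle_sleep=0.001):
        self._collector = collector
        self._pending = queue.Queue()   # thread-safe hand-off buffer
        self._idle_sleep = idle_sleep

    def on_new_data(self, value):
        # Called from the background thread: only enqueue, never emit.
        self._pending.put(value)

    def next_tuple(self):
        # Called by the framework's thread: emit at most one item.
        try:
            value = self._pending.get_nowait()
        except queue.Empty:
            time.sleep(self._idle_sleep)  # nothing to emit: back off briefly
            return
        self._collector.emit(value)
```

All emits then happen on the thread that calls next_tuple, and the sleep only occurs when there is nothing to emit.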

Multithreading: several producers + one consumer

I've got the following problem
I have several threads (producers) calculating positions of moving objects and one thread (consumer) that prints calculation results. Every thread has its own time scale. The synchronization problem is that the consumer can print results only when all of the producers have calculated the position for the moment being printed. In other words, the consumer has to compare its current time with that of each producer and decide whether the results can be printed or not. I found a similar example where synchronization was done with a semaphore, but there was only one producer there. Does anyone know a smart solution?
Consumer loop:
    wait n times
    collect data
    do its thing
    signal all n producers
Producer loop (n in parallel):
    do its thing
    make data available
    signal to consumer
(Sorry, don't know anything about QT, so just the general algorithm)
EDIT: If the producers have a buffer rather than wait to synchronise, then you can do this:
Consumer loop:
    wait, then check all buffers; repeat while any buffer is empty
    collect data; if any buffer was full, signal to that producer
    do its thing
Producer loop (n in parallel):
    do its thing
    wait if buffer is full
    queue data
    signal to consumer
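The first, unbuffered scheme above could look roughly like this in Python with semaphores. run_lockstep is a hypothetical name, each producer's "calculation" is just a placeholder tuple, and giving each producer its own go-ahead semaphore is an assumption made here so that a fast producer cannot consume another producer's signal.

```python
import threading

def run_lockstep(n_producers, n_steps):
    data_ready = threading.Semaphore(0)     # producers -> consumer
    go = [threading.Semaphore(0) for _ in range(n_producers)]  # consumer -> producer i
    slots = [None] * n_producers            # latest value from each producer
    printed = []

    def producer(i):
        for step in range(n_steps):
            slots[i] = (i, step)            # do its thing, make data available
            data_ready.release()            # signal to consumer
            go[i].acquire()                 # wait until the consumer has collected

    def consumer():
        for _ in range(n_steps):
            for _ in range(n_producers):
                data_ready.acquire()        # wait n times
            printed.append(list(slots))     # collect data, do its thing
            for g in go:
                g.release()                 # signal all n producers

    threads = [threading.Thread(target=producer, args=(i,))
               for i in range(n_producers)]
    threads.append(threading.Thread(target=consumer))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return printed
```

Each producer can signal only once per step before blocking, so when the consumer has waited n times it knows every producer has reached the same step.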
