Design Pattern for multithreaded observers - multithreading

In a digital signal acquisition system, data is often pushed to an observer in the system by a single thread.
Example from Wikipedia/Observer_pattern:
foreach (IObserver observer in observers)
    observer.Update(message);
When, for example, a user action on the GUI thread requires the data to stop flowing, you want to break the subject-observer connection, and perhaps even dispose of the observer altogether.
One may argue: you should just stop the data source, and wait for a sentinel value to dispose of the connection. But that would incur more latency in the system.
Of course, if the data pumping thread has just asked for the address of the observer, it might find it's sending a message to a destroyed object.
Has someone created an 'official' Design Pattern countering this situation? Shouldn't they?

If you want the data source to always be on the safe side of concurrency, it should have at least one pointer that is always safe for it to use.
So the Observer object should have a lifetime that doesn't end before that of the data source.
This can be done by only adding Observers, but never removing them.
You could have each observer not do the core implementation itself, but have it delegate this task to an ObserverImpl object.
You lock access to this impl object. This is no big deal: it just means the GUI unsubscriber is blocked for a little while if the observer is busy using the ObserverImpl object. If GUI responsiveness is an issue, you can use some kind of concurrent job-queue mechanism and push an unsubscription job onto it (like PostMessage in Windows).
When unsubscribing, you just substitute a dummy implementation for the core implementation. Again, this operation should grab the lock. This does introduce some waiting for the data source, but since it's just a [ lock - pointer swap - unlock ] you could say that this is fast enough for real-time applications.
If you want to avoid accumulating Observer objects that just contain a dummy, you have to do some kind of bookkeeping, but this could boil down to something trivial, like an object holding a pointer to the Observer object it needs from the list.
Optimization:
If you also keep the implementations (the real one and the dummy) alive as long as the Observer itself, you can do this without an actual lock and use something like InterlockedExchangePointer to swap the pointers.
Worst-case scenario: a delegating call is in progress while the pointer is swapped. That is no big deal: all objects stay alive and the delegating call can continue. The next delegating call will go to the new implementation object (barring any new swaps, of course).
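The delegation idea above could be sketched roughly as follows. This is a minimal Python sketch of the lock-based variant, not a definitive implementation; the names SwappableObserver, RealImpl and DummyImpl are hypothetical, and a plain lock stands in for the [ lock - pointer swap - unlock ] step.

```python
import threading


class DummyImpl:
    """No-op implementation swapped in when the observer unsubscribes."""

    def update(self, message):
        pass


class RealImpl:
    """Hypothetical real implementation: records every message it receives."""

    def __init__(self):
        self.received = []

    def update(self, message):
        self.received.append(message)


class SwappableObserver:
    """Stays registered forever; only the implementation behind it changes."""

    def __init__(self, impl):
        self._lock = threading.Lock()
        self._impl = impl

    def update(self, message):
        # Called by the data-pumping thread. Holding the lock for the whole
        # call means unsubscribe() cannot swap the implementation mid-update.
        with self._lock:
            self._impl.update(message)

    def unsubscribe(self):
        # Called by e.g. the GUI thread: lock - pointer swap - unlock.
        with self._lock:
            self._impl = DummyImpl()


real = RealImpl()
observer = SwappableObserver(real)
observer.update("sample 1")
observer.unsubscribe()
observer.update("sample 2")  # silently dropped by the dummy
print(real.received)  # ['sample 1']
```

Because the data source only ever holds the SwappableObserver, it never dereferences a destroyed object; the real implementation can be released once unsubscribe() has returned.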

You could send a message to all observers informing them the data source is terminating and let the observers remove themselves from the list.
In response to the comment, the implementation of the subject-observer pattern should allow for dynamic addition / removal of observers. In C#, the event system is a subject/observer pattern where observers are added using event += observer and removed using event -= observer.
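One common way to allow observers to be added or removed while notifications are in flight is to iterate over a snapshot of the list. The following is only a Python sketch under that assumption (the Subject class and its method names are hypothetical, and this is not how the C# event system is implemented internally):

```python
import threading


class Subject:
    """Subject whose observer list can change while notifications are running."""

    def __init__(self):
        self._lock = threading.Lock()
        self._observers = []

    def add(self, observer):
        with self._lock:
            self._observers.append(observer)

    def remove(self, observer):
        with self._lock:
            if observer in self._observers:
                self._observers.remove(observer)

    def notify(self, message):
        # Copy the list under the lock, then call observers outside it, so an
        # observer may call remove() on itself without deadlocking or
        # disturbing the iteration.
        with self._lock:
            snapshot = list(self._observers)
        for observer in snapshot:
            observer(message)


subject = Subject()
log = []


def one_shot(message):
    log.append(("one_shot", message))
    subject.remove(one_shot)  # unsubscribe from inside the callback


def steady(message):
    log.append(("steady", message))


subject.add(one_shot)
subject.add(steady)
subject.notify("a")
subject.notify("b")
print(log)  # [('one_shot', 'a'), ('steady', 'a'), ('steady', 'b')]
```

Note the trade-off: an observer removed from another thread may still receive one more notification from a snapshot that was taken just before the removal, so it must stay alive until that call has finished.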

Related

<Spring Batch> Why does making an ItemReader thread-safe lead to losing restartability?

I have a multi-threaded batch job reading from a DB, and I am concerned about different threads re-reading records, as ItemReader is not thread-safe in Spring Batch. I went through the Spring Batch FAQ section, which states that
You can synchronize the read() method (e.g. by wrapping it in a delegator that does the synchronization). Remember that you will lose restartability, so best practice is to mark the step as not restartable and to be safe (and efficient) you can also set saveState=false on the reader.
I want to know why I will lose restartability in this case. What has restartability got to do with synchronizing my read operations? It can always try again, right?
Also, will this piece of code be enough for synchronizing the reader?
public class SynchronizedItemReader<T> implements ItemReader<T> {
    private final ItemReader<T> delegate;
    public SynchronizedItemReader(ItemReader<T> delegate) {
        this.delegate = delegate;
    }
    // ItemReader.read() declares checked exceptions, so they are propagated here
    public synchronized T read() throws Exception {
        return delegate.read();
    }
}
When using an ItemReader with multiple threads, the lack of restartability is not about the read itself. It's about saving the state of the reader, which occurs in the update method. The issue is that there needs to be coordination between the calls to read() (the method providing the data) and update() (the method persisting the state). When you use multiple threads, the internal state of the reader (and therefore the update() call) may or may not reflect the work that has been done. Take for example the FlatFileItemReader using a chunk size of 5 and running on multiple threads. You could have thread 1 having read 5 items (time to update), yet thread 2 could have read an additional 3. This means that the call to update would save that 8 items have been read. If the chunk on thread 2 fails, the saved state would be incorrect and the restart would miss the three items that were already read.
This is not to say that it is impossible to write a thread-safe ItemReader. However, as your example above illustrates, if delegate is a stateful ItemReader (one that implements ItemStream as well), the state will not be persisted correctly with calls to update (in fact, your example above doesn't even take the ItemStream aspect of stateful readers into account).
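The chunk-size-5 scenario above can be replayed as a small simulation. This is a Python sketch and not Spring Batch code: CountingReader is a hypothetical stand-in for a stateful reader, and the two threads are replayed sequentially so that the interleaving is deterministic.

```python
class CountingReader:
    """Hypothetical stateful reader: hands out items and counts how many were read."""

    def __init__(self, items, start=0):
        self.items = items
        self.position = start  # shared internal state, like a line counter

    def read(self):
        item = self.items[self.position]
        self.position += 1
        return item

    def update(self, saved_state):
        # Persists the reader's position, whichever thread advanced it.
        saved_state["position"] = self.position


items = list(range(10))
saved_state = {}
reader = CountingReader(items)

# Interleaving from the text, replayed sequentially for determinism:
chunk_thread1 = [reader.read() for _ in range(5)]  # thread 1 reads a full chunk
chunk_thread2 = [reader.read() for _ in range(3)]  # thread 2 has read 3 so far
reader.update(saved_state)  # thread 1 commits: saves 8, not 5

# Thread 2's chunk now fails, so items 5, 6 and 7 were never processed.
processed = list(chunk_thread1)

# Restart from the saved state.
restarted = CountingReader(items, start=saved_state["position"])
while restarted.position < len(items):
    processed.append(restarted.read())

missing = sorted(set(items) - set(processed))
print(saved_state["position"], missing)  # 8 [5, 6, 7]
```

Synchronizing read() would not change this outcome: each call is already atomic in the simulation, and the saved position still runs ahead of the work that was actually committed.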
If you want to make your job restartable while processing items in parallel, you can persist each item that the reader has read, together with the processing state of that item, yourself.

Golang: Best way to read from a hashmap w/ mutex

This is a continuation from here: Golang: Shared communication in async http server
Assuming I have a hashmap w/ locking:
//create async hashmap for inter request communication
type state struct {
    *sync.Mutex // inherits locking methods
    AsyncResponses map[string]string // map ids to values
}
var State = &state{&sync.Mutex{}, map[string]string{}}
Functions that write to this will place a lock. My question is, what is the best / fastest way to have another function check for a value without blocking writes to the hashmap? I'd like to know the instant a value is present on it.
MyVal = State.AsyncResponses[MyId]
Reading a shared map without blocking writers is the very definition of a data race. Actually, semantically it is a race even when the writers are blocked during the read, because as soon as you finish reading the value and unblock the writers, the value may no longer exist in the map.
Anyway, it's not very likely that proper syncing would be a bottleneck in many programs. Acquiring an uncontended {RW,}Mutex probably takes on the order of < 20 ns, even on mid-range CPUs. I suggest postponing optimization not only until the program is correct, but also until you have measured where most of the time is being spent.
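For comparison with the other sketches on this page, here is the same idea in Python rather than Go: every read goes through the same lock as the writes, and a condition variable (a technique swapped in here, not something the answer above proposes) lets a reader learn that a value has arrived without polling. The State class and its method names are hypothetical.

```python
import threading


class State:
    """Dict guarded by a condition variable (which wraps a lock)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._responses = {}

    def put(self, key, value):
        with self._cond:
            self._responses[key] = value
            self._cond.notify_all()  # wake any reader waiting for a key

    def get(self, key):
        # Plain synchronized read: returns (value, found).
        with self._cond:
            if key in self._responses:
                return self._responses[key], True
            return None, False

    def wait_for(self, key, timeout=None):
        # Blocks until the key is present or the timeout expires. The lock
        # is released while waiting, so writers are not held up.
        with self._cond:
            if self._cond.wait_for(lambda: key in self._responses, timeout):
                return self._responses[key]
            return None


state = State()
writer = threading.Timer(0.05, state.put, args=("my-id", "done"))
writer.start()
value = state.wait_for("my-id", timeout=5)
writer.join()
print(value)  # done
```

The value returned is still only a snapshot: as noted above, another writer may change or delete the entry as soon as the lock is released.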

Is it safe to call Dispose on an instance from event handler?

public class MyTask : IDisposable { ... }
MyTask task = new MyTask(() => SomeTask);
task.Completed += (s, e) =>
{
    // do something with result
    ...
    // dispose of this instance
    ((MyTask)s).Dispose();
};
// execute the task
task.Execute();
Clearly I cannot tell when the task will be completed, so the only place, as I see it, where I can dispose of this instance is in the Completed event.
Is this safe to do?
There is, alas, no general rule as to when it is safe to call Dispose. If Microsoft had specified that Dispose must be safe to call at any time when an object isn't in use, complying with such a rule would seldom have been difficult; in cases where a class might not always be able to perform all necessary cleanup immediately(*), it would generally be possible for it to set a flag and/or otherwise arrange to have necessary cleanup performed at the next opportunity. Unfortunately, Microsoft does not specify that Dispose implementations have to handle asynchronous Dispose requests, nor is there any general way for an object which holds the last useful reference to an IDisposable instance to ask for notification when it would be safe to dispose.
Despite the general lack of assurance as to when it is safe to call Dispose, many particular classes which implement Dispose do offer guarantees as to when it may safely be called. If one knows that a particular object is of a type which can be safely disposed in a particular context, one may dispose it then. Especially in cases where an event from an object may be the only opportunity to Dispose it in a threading context it could know about, and where disposing an object within an event handler would make sense, it should be safe to dispose of the object. Any properly-written event handlers should be prepared for the possibility that the object sending the event may be disposed between the time the system decides that they should run, and the time it actually runs them.
(*) The essential purpose of IDisposable is to allow an object to notify entities which are outside it but are acting on its behalf to the detriment of other entities, that they should no longer do so [e.g. to tell a file system that it should no longer grant an object exclusive access to a file]. Such action is referred to as "releasing resources". The fact that someone holds the last surviving reference to an object may imply that no other thread can be using that object, but does not imply that no other thread is using any non-thread-safe entities whose resources need to be released.
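The "set a flag and perform the cleanup at the next opportunity" approach mentioned above could look roughly like this. It is a single-threaded Python sketch, not .NET code: the MyTask class here is a hypothetical stand-in, and dispose() plays the role of IDisposable.Dispose.

```python
class MyTask:
    """Hypothetical task that tolerates dispose() from inside its own event."""

    def __init__(self, work):
        self._work = work
        self.completed = []  # handlers called as handler(sender, result)
        self._raising = False  # True while completed handlers are running
        self._dispose_requested = False
        self.disposed = False

    def execute(self):
        result = self._work()
        self._raising = True
        try:
            for handler in list(self.completed):
                handler(self, result)
        finally:
            self._raising = False
        # Perform any cleanup that a handler asked for while we were busy.
        if self._dispose_requested:
            self._release()

    def dispose(self):
        if self.disposed:
            return  # repeated calls are harmless
        if self._raising:
            # Not safe to tear down yet: set a flag and clean up later.
            self._dispose_requested = True
        else:
            self._release()

    def _release(self):
        self.completed.clear()
        self.disposed = True


events = []


def on_completed(sender, result):
    events.append(result)
    sender.dispose()  # requested from inside the handler
    events.append(sender.disposed)  # still False: cleanup is deferred


task = MyTask(lambda: 42)
task.completed.append(on_completed)
task.execute()
print(events, task.disposed)  # [42, False] True
```

A class written this way makes disposal from its own event handler safe by construction; for a class that offers no such guarantee, the caveats above still apply.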

Should Observers be notified in separate threads each one?

I know it sounds heavyweight, but I'm trying to solve a hypothetical situation. Imagine you have N observers of some object, each one interested in the object's state. When applying the Observer Pattern, the observable object tends to iterate through its observer list, invoking each observer's notify()|update() method.
Now imagine that a specific observer has a lot of work to do with the state of the observable object. That will slow down the last notification, for example.
So, in order to avoid slowing down notifications to all observers, one thing we can do is to notify each observer in a separate thread. For that to work, I suppose that a thread for each observer is needed. That is a painful overhead to accept in order to avoid the notification slowdown caused by heavy work. Worse than the slowdown, if the thread approach is used, are dead threads caused by infinite loops. It would be great to read what experienced programmers think about this one.
What do people with years of experience in design issues think?
Is this a problem without a substantial solution?
Is it a really bad idea? Why?
Example
This is a vague example, which I haven't even tested, to demonstrate and hopefully clarify the basic idea:
from queue import Queue
from threading import Thread
class Observable(object):
    def __init__(self):
        self.queues = {}
    def addObserver(self, observer):
        if observer not in self.queues:
            self.queues[observer] = Queue()
            ot = ObserverThread(observer, self.queues[observer])
            ot.start()
    def removeObserver(self, observer):
        if observer in self.queues:
            # tell the observer's thread to stop, then forget its queue
            self.queues[observer].put('die')
            del self.queues[observer]
    def notifyObservers(self, state):
        for queue in self.queues.values():
            queue.put(state)
class ObserverThread(Thread):
    def __init__(self, observer, queue):
        Thread.__init__(self)  # required before start() can be called
        self.observer = observer
        self.queue = queue
    def run(self):
        running = True
        while running:
            state = self.queue.get()
            if state == 'die':
                running = False
            else:
                self.observer.stateChanged(state)
You're on the right track.
It is common for each observer to own its own input-queue and its own message handling thread (or better: the queue would own the thread, and the observer would own the queue). See Active object pattern.
There are some pitfalls however:
If you have hundreds or thousands of observers, you may need to use a thread pool pattern.
Note that you'll lose control over the order in which events are going to be processed (which observer handles the event first). This may be a non-issue, or it may open a Pandora's box of very-hard-to-detect bugs. It depends on your specific application.
You may have to deal with situations where observers are deleted before notifiers. This can be somewhat tricky to handle correctly.
You'll need to implement messages instead of calling functions. Message generation may require more resources, as you may need to allocate memory, copy objects, etc. You may even want to optimize by implementing a message pool for common message types (you may as well choose to implement a message factory that wraps such pools).
To further optimize, you'll probably want to generate one message and send it to all observers (instead of generating many copies of the same message). You may need to use some reference counting mechanism for your messages.
Let each observer decide for itself whether its reaction is heavyweight and, if so, start a thread or submit a task to a thread pool. Making notifications in a single separate thread is not a good solution: while it frees the observable object, it limits the processing power available for notifications to a single thread. If you do not trust your observers, then create a thread pool and, for each notification, create a task and submit it to the pool.
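The thread-pool variant could be sketched as follows, using Python's standard concurrent.futures module. The Observable class and its method names are hypothetical, and observers are assumed to be plain callables.

```python
from concurrent.futures import ThreadPoolExecutor, wait


class Observable:
    """Submits one task per observer per notification to a shared pool."""

    def __init__(self, pool):
        self._pool = pool
        self._observers = []

    def add_observer(self, observer):
        self._observers.append(observer)

    def notify_observers(self, state):
        # Returns immediately with the futures; slow observers only occupy
        # pool workers, not the notifying thread.
        return [self._pool.submit(observer, state) for observer in self._observers]


results = []  # list.append is atomic in CPython, so no extra lock is used here

with ThreadPoolExecutor(max_workers=4) as pool:
    observable = Observable(pool)
    observable.add_observer(lambda state: results.append(("fast", state)))
    observable.add_observer(lambda state: results.append(("slow", state)))
    futures = observable.notify_observers("state-1")
    wait(futures)  # only for the demo; a real subject would not block here

print(sorted(results))  # [('fast', 'state-1'), ('slow', 'state-1')]
```

The output is sorted because the order in which the observers actually run is not guaranteed, which is exactly the ordering pitfall mentioned earlier in this thread.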
In my opinion, when you have a large number of Observers for an Observable, and they do heavy processing, the best thing to do is to have a notify() method in the Observer.
Use of notify(): just set the dirty flag in the Observer to true. Then, whenever the Observer thread finds it appropriate, it queries the Observable for the required updates.
This does not require heavy processing on the Observable side and shifts the load to the Observer side.
Now it is up to the Observers to decide when to observe.
The answer of @Pathai is valid in a lot of cases.
One is that you are observing changes in a database. In many ways you can't reconstruct the final state from the snapshots alone, especially if your state is fetched as a complex query from the database, and the snapshot is an update to the database.
To implement it, I'd suggest using an Event object:
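import threading
class Observer:
    def __init__(self):
        self.event = threading.Event()
# in observer (self is the Observer instance):
while self.event.wait():
    # clear first, so that a set() arriving during the work is not lost
    self.event.clear()
    # do something
# in observable:
observer.event.set()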

Worker thread doesn't have a message loop (MFC, Windows). Can we make it receive messages?

MFC provides both worker and UI threads. A UI thread can receive messages (send, post). Is it possible to let a worker thread receive messages too?
Call CWinThread::PumpMessage() repeatedly until it receives a WM_QUIT message (at which point it returns FALSE).
It seems you need a thread that can handle multiple messages from other threads. The other threads would add messages to this thread's message queue. In that case you can call PeekMessage once at startup, which forces the system to create the thread's message queue, and then use GetMessage in a loop to retrieve the messages. The other threads would use PostThreadMessage with the ID of the receiving thread (the one calling Peek/GetMessage), the message code, WPARAM and LPARAM.
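It would look something like this (a sketch):
void TheProcessor()
{
    MSG msg;
    // Force the system to create this thread's message queue.
    PeekMessage(&msg, NULL, WM_USER, WM_USER, PM_NOREMOVE);
    // GetMessage returns 0 on WM_QUIT and -1 on error.
    while (GetMessage(&msg, NULL, 0, 0) > 0)
    { /* switch (msg.message) case here */ }
}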
The threads would call PostThreadMessage - See MSDN for more info.
When you need to send more data than LPARAM/WPARAM can hold, you need to allocate it on the heap and delete it AFTER processing the message in your custom message loop. This would be cumbersome and error-prone.
But... I would suggest writing your own class on top of std::queue/deque or another data structure, with methods such as AddMessage/PushMessage and PopMessage (or whatever names you like). You need SetEvent and WaitForSingleObject to signal a new message to the loop (see one of the implementations here). You can write it for one data type, or make it a class template that supports any data type (the underlying queue would use the same type). You then need not worry about heap allocations and deletions, which is less error-prone. You may, however, have to handle multithreading issues.
Using Windows events involves a kernel-mode transition (since events are named/kernel objects), so you may prefer condition variables, which are user-mode objects. Or you may straightaway use the unbounded_buffer class from the Concurrency Runtime library available in VC10. See this article (jump to unbounded_buffer).
Yes, you can create a message queue on a worker thread. You will need to run a message pump on that thread.
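The custom queue class suggested above, built on a deque and a condition variable, could be sketched like this. The sketch is in Python rather than C++ to match the other examples on this page; MessageQueue, push, pop and the QUIT sentinel are hypothetical names.

```python
import threading
from collections import deque


class MessageQueue:
    """Thread-safe FIFO; consumers block in pop() until a message arrives."""

    def __init__(self):
        self._cond = threading.Condition()
        self._messages = deque()

    def push(self, message):
        with self._cond:
            self._messages.append(message)
            self._cond.notify()  # wake one waiting consumer

    def pop(self):
        with self._cond:
            while not self._messages:
                self._cond.wait()
            return self._messages.popleft()


QUIT = object()  # sentinel playing the role of WM_QUIT
handled = []


def worker(queue):
    # The worker thread's "message loop".
    while True:
        message = queue.pop()
        if message is QUIT:
            break
        handled.append(message)


queue = MessageQueue()
thread = threading.Thread(target=worker, args=(queue,))
thread.start()
for message in ("open", {"id": 7, "payload": [1, 2, 3]}, "close"):
    queue.push(message)
queue.push(QUIT)
thread.join()
print(handled)  # ['open', {'id': 7, 'payload': [1, 2, 3]}, 'close']
```

Because whole objects travel through the queue, there is no WPARAM/LPARAM size limit and no manual heap bookkeeping. In real Python code the standard queue.Queue class already provides this behaviour; the hand-written version is shown only to mirror the C++ design described above.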
