I have a legacy event-based object that seems like a perfect fit for Rx: after being connected to a network source, it raises events when a message is received, and may terminate with either a single error (the connection dies, etc.) or (rarely) an indication that there will be no more messages. This object also has a couple of projections -- most users are interested in only a subset of the messages received, so there are alternate events raised only when well-known message subtypes show up.
So, in the process of learning more about reactive programming, I built the following wrapper:
class LegacyReactiveWrapper : IConnectableObservable<TopLevelMessage>
{
    private LegacyType _Legacy;
    private IConnectableObservable<TopLevelMessage> _Impl;

    public LegacyReactiveWrapper(LegacyType t)
    {
        _Legacy = t;
        var observable = Observable.Create<TopLevelMessage>((observer) =>
        {
            LegacyTopLevelMessageHandler tlmHandler = (sender, tlm) => observer.OnNext(tlm);
            LegacyErrorHandler errHandler = (sender, err) => observer.OnError(new ApplicationException(err.Message));
            LegacyCompleteHandler doneHandler = (sender) => observer.OnCompleted();

            _Legacy.TopLevelMessage += tlmHandler;
            _Legacy.Error += errHandler;
            _Legacy.Complete += doneHandler;

            return Disposable.Create(() =>
            {
                _Legacy.TopLevelMessage -= tlmHandler;
                _Legacy.Error -= errHandler;
                _Legacy.Complete -= doneHandler;
            });
        });
        _Impl = observable.Publish();
    }

    public IDisposable Subscribe(IObserver<TopLevelMessage> observer)
    {
        return _Impl.RefCount().Subscribe(observer);
    }

    public IDisposable Connect()
    {
        _Legacy.ConnectToMessageSource();
        return Disposable.Create(() => _Legacy.DisconnectFromMessageSource());
    }

    public IObservable<SubMessageA> MessageA
    {
        get
        {
            // This is the moral equivalent of the projection behavior
            // that already exists in the legacy type. We don't hook
            // the LegacyType.MessageA event directly.
            return _Impl.RefCount()
                        .Where((tlm) => tlm.MessageType == MessageType.MessageA)
                        .Select((tlm) => tlm.SubMessageA);
        }
    }

    public IObservable<SubMessageB> MessageB
    {
        get
        {
            return _Impl.RefCount()
                        .Where((tlm) => tlm.MessageType == MessageType.MessageB)
                        .Select((tlm) => tlm.SubMessageB);
        }
    }
}
Something about how it's used elsewhere feels... off... somehow, though. Here's sample usage, which works but feels strange. The UI context for my test application is WinForms, but it doesn't really matter.
// in Program.Main...
MainForm frm = new MainForm();

// Updates the UI based on a stream of SubMessageA's
IObserver<SubMessageA> uiManager = new MainFormUiManager(frm);

LegacyType lt = new LegacyType();
// ... setup lt...

var w = new LegacyReactiveWrapper(lt);

var uiUpdateSubscription = (from msgA in w.MessageA
                            where SomeCondition(msgA)
                            select msgA).ObserveOn(frm).Subscribe(uiManager);

var nonUiSubscription = (from msgB in w.MessageB
                         where msgB.SubType == MessageBType.SomeSubType
                         select msgB).Subscribe(
    m => Console.WriteLine("Got MsgB: {0}", m),
    ex => Console.WriteLine("MsgB error: {0}", ex.Message),
    () => Console.WriteLine("MsgB complete")
);

IDisposable unsubscribeAtExit = null;

frm.Load += (sender, e) =>
{
    var connectionSubscription = w.Connect();
    unsubscribeAtExit = new CompositeDisposable(
        uiUpdateSubscription,
        nonUiSubscription,
        connectionSubscription);
};

frm.FormClosing += (sender, e) =>
{
    if (unsubscribeAtExit != null) { unsubscribeAtExit.Dispose(); }
};

Application.Run(frm);
This WORKS -- the form launches, the UI updates, and when I close it the subscriptions get cleaned up and the process exits (which it won't do if the LegacyType's network connection is still open). Strictly speaking, it's enough to dispose of just connectionSubscription.

However, the explicit Connect feels weird to me. Since RefCount is supposed to do that for you, I tried modifying the wrapper so that, rather than using _Impl.RefCount in MessageA and MessageB and explicitly connecting later, it used this.RefCount, and I moved the calls to Subscribe into the Load handler. That had a different problem -- the second subscription triggered another call to LegacyReactiveWrapper.Connect. I thought the idea behind Publish/RefCount was "first-in triggers connection, last-out disposes connection."
I guess I have three questions:
1. Do I fundamentally misunderstand Publish/RefCount?
2. Is there some preferred way to implement IConnectableObservable<T> that doesn't involve delegating to one obtained via IObservable<T>.Publish? I know you're not supposed to implement IObservable<T> yourself, but I don't understand how to inject connection logic into the IConnectableObservable<T> that Observable.Create().Publish() gives you. Is Connect supposed to be idempotent?
3. Would someone more familiar with Rx/reactive programming look at the sample of how the wrapper is used and say "that's ugly and broken," or is this not as weird as it seems?
I'm not sure that you need to expose Connect directly as you have. I would simplify as follows, using Publish().RefCount() as an encapsulated implementation detail; it causes the legacy connection to be made only as required. Here the first subscriber in causes connection, and the last one out causes disconnection. Also note this correctly shares a single RefCount across all subscribers, whereas your implementation uses a RefCount per message type, which probably isn't what was intended. Users are not required to call Connect explicitly:
public class LegacyReactiveWrapper
{
    private IObservable<TopLevelMessage> _legacyRx;

    public LegacyReactiveWrapper(LegacyType legacy)
    {
        _legacyRx = WrapLegacy(legacy).Publish().RefCount();
    }

    private static IObservable<TopLevelMessage> WrapLegacy(LegacyType legacy)
    {
        return Observable.Create<TopLevelMessage>(observer =>
        {
            LegacyTopLevelMessageHandler tlmHandler = (sender, tlm) => observer.OnNext(tlm);
            LegacyErrorHandler errHandler = (sender, err) => observer.OnError(new ApplicationException(err.Message));
            LegacyCompleteHandler doneHandler = sender => observer.OnCompleted();

            legacy.TopLevelMessage += tlmHandler;
            legacy.Error += errHandler;
            legacy.Complete += doneHandler;
            legacy.ConnectToMessageSource();

            return Disposable.Create(() =>
            {
                legacy.DisconnectFromMessageSource();
                legacy.TopLevelMessage -= tlmHandler;
                legacy.Error -= errHandler;
                legacy.Complete -= doneHandler;
            });
        });
    }

    public IObservable<TopLevelMessage> TopLevelMessage
    {
        get { return _legacyRx; }
    }

    public IObservable<SubMessageA> MessageA
    {
        get
        {
            return _legacyRx.Where(tlm => tlm.MessageType == MessageType.MessageA)
                            .Select(tlm => tlm.SubMessageA);
        }
    }

    public IObservable<SubMessageB> MessageB
    {
        get
        {
            return _legacyRx.Where(tlm => tlm.MessageType == MessageType.MessageB)
                            .Select(tlm => tlm.SubMessageB);
        }
    }
}
An additional observation is that Publish().RefCount() will drop the underlying subscription when its subscriber count reaches zero. Typically I only use Connect over this choice when I need to maintain a subscription even when the subscriber count on the published source drops to zero (and may go back up again later). It's rare to need this, though -- only when connecting is more expensive than holding on to the subscription resource when you might not need it.
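To make the drop-at-zero versus keep-alive distinction concrete, here is a minimal sketch (assuming source is any cold IObservable<T>):

// RefCount: the connection's lifetime follows the subscribers.
var shared = source.Publish().RefCount();
var s1 = shared.Subscribe(x => { }); // first in: subscribes to source
var s2 = shared.Subscribe(x => { }); // piggybacks on that same subscription
s1.Dispose();
s2.Dispose();                        // last out: underlying subscription dropped

// Connect: the connection's lifetime is controlled explicitly.
var published = source.Publish();
var connection = published.Connect(); // subscribes to source now...
// ...and stays subscribed even with zero subscribers, until:
connection.Dispose();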
Your understanding is not entirely wrong, but you do appear to have some points of misunderstanding.
You seem to be under the belief that multiple calls to RefCount on the same source IObservable will result in a shared reference count. They do not; each RefCount instance keeps its own count. As such, you are causing multiple subscriptions to _Impl, one per call to Subscribe and one per access of the Message properties.
You also may be expecting that making _Impl an IConnectableObservable somehow causes your Connect method to be called (since you seem surprised you needed to call Connect in your consuming code). All Publish does is cause subscribers to the published object (returned from the .Publish() call) to share a single subscription to the underlying source observable (in this case, the observable created by your call to Observable.Create).
Typically, I see Publish and RefCount used immediately together (e.g. as source.Publish().RefCount()) to get the shared-subscription effect described above, or to make a cold observable hot without needing to call Connect to start the subscription to the original source. However, this relies on handing the same object returned from .Publish().RefCount() to all subscribers (as noted above).
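To illustrate against the question's wrapper (an IConnectableObservable<TopLevelMessage> whose Connect is not idempotent), a sketch of the difference:

// Two independent RefCounts over the same connectable: each keeps its own
// count, so each one's first subscriber triggers its own call to Connect().
wrapper.RefCount().Subscribe(tlm => { }); // count 0 -> 1: Connect()
wrapper.RefCount().Subscribe(tlm => { }); // a separate count 0 -> 1: Connect() again

// One shared RefCount gives the intended first-in/last-out behavior:
var shared = wrapper.RefCount();
shared.Subscribe(tlm => { }); // connects
shared.Subscribe(tlm => { }); // shares the existing connection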
Your implementation of Connect seems reasonable. I don't know of any recommendation on whether Connect should be idempotent, but I would not personally expect it to be. If you wanted it to be, you would just need to track the calls to it and the disposals of its return value so that they balance.
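For instance, one rough way to keep them balanced, reusing the question's _Legacy field (a sketch, not a prescription):

private readonly object _gate = new object();
private int _connectCount;

public IDisposable Connect()
{
    lock (_gate)
    {
        if (++_connectCount == 1)
            _Legacy.ConnectToMessageSource(); // only the first Connect dials out
    }
    return Disposable.Create(() =>
    {
        lock (_gate)
        {
            if (--_connectCount == 0)
                _Legacy.DisconnectFromMessageSource(); // last disposal hangs up
        }
    });
}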
I don't think you need to use Publish the way you are, unless there is some reason to avoid multiple event handlers being attached to the legacy object. If you do need to avoid that, I would recommend changing _Impl to a plain IObservable and follow the Publish with a RefCount.
Your MessageA and MessageB properties have the potential to be a source of confusion for users, since they return an IObservable but still require a call to Connect on the base object before messages start flowing. I would either change them to IConnectableObservables that somehow delegate to the original Connect (at which point the idempotency discussion becomes more relevant) or drop the properties and just let users make the (fairly simple) projections themselves when needed.
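If you drop the properties, the projection is easy for consumers to write themselves; since the wrapper is itself an IObservable<TopLevelMessage>, it's just:

// What a consumer would write instead of the MessageA property:
var messageA = wrapper.Where(tlm => tlm.MessageType == MessageType.MessageA)
                      .Select(tlm => tlm.SubMessageA);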
I have a function in my Controller that takes user input, and then, using an infinite loop, queries a database and sends the object returned from the database to a webpage. This all works fine, except that I needed to introduce concurrency in order to both run this logic and render the webpage.
The code is given by:
def getSearchResult = Action { request =>
  val search = request.queryString.get("searchInput").head.head
  val databaseSupport = new InteractWithDatabase(comm, db)
  val put = Future {
    while (true) {
      val data = databaseSupport.getFromDatabase(search)
      if (data.nonEmpty) {
        if (data.head.vendorId.equals(search)) {
          comm.communicator ! data.head
        }
      }
    }
  }
  Ok(views.html.singleElement.render)
}
The issue arises when I want to call this again, but with a different input. Because the first thread is in an infinite loop, it never ceases to run and is still running even when I start the second thread. Therefore, both objects are being sent to the webpage at the same time in two separate threads.
How can I stop the first thread once I call this function again? Or, is there a better implementation of this whole idea so that I could do it without using multithreading?
Note: I tried removing the concurrency from this function (as multithreading has been the thing giving me all of these problems) and instead moving it to the web socket itself, but this posed problems as the web socket is connected to a router, and everything connects to the web socket through the router.
Try AsyncAction, where you return a Future[Result] as the result, and make the database call inside that Future. E.g. (pseudo code):
def getSearchResult = AsyncAction { request =>
  val search = request.queryString.get("searchInput").head.head
  val databaseSupport = new InteractWithDatabase(comm, db)
  Future {
    val data = databaseSupport.getFromDatabase(search)
    if (data.nonEmpty) {
      if (data.head.vendorId.equals(search)) {
        comm.communicator ! data.head // A
      }
    }
    Ok(views.html.singleElement.render)
  }
}
It would be better if databaseSupport.getFromDatabase(search) returned a Future, but that is a story for another day. The tricky part is figuring out how to deal with the Actor at "A". Just remember that at the exit it must return a Future[Result] result type.
I'm connecting to Azure Redis, and they show me the number of open connections to my Redis server. I've got the following C# code that encloses all my Redis sets and gets. Should this be leaking connections?
using (var connectionMultiplexer = ConnectionMultiplexer.Connect(connectionString))
{
    lock (Locker)
    {
        redis = connectionMultiplexer.GetDatabase();
    }

    var o = CacheSerializer.Deserialize<T>(redis.StringGet(cacheKeyName));
    if (o != null)
    {
        return o;
    }

    lock (Locker)
    {
        // get lock but release if it takes more than 60 seconds to complete
        // to avoid deadlock if this app crashes before release
        //using (redis.AcquireLock(cacheKeyName + "-lock", TimeSpan.FromSeconds(60)))
        var lockKey = cacheKeyName + "-lock";
        if (redis.LockTake(lockKey, Environment.MachineName, TimeSpan.FromSeconds(10)))
        {
            try
            {
                o = CacheSerializer.Deserialize<T>(redis.StringGet(cacheKeyName));
                if (o == null)
                {
                    o = func();
                    redis.StringSet(cacheKeyName, CacheSerializer.Serialize(o),
                        TimeSpan.FromSeconds(cacheTimeOutSeconds));
                }
                return o; // the finally block releases the lock exactly once
            }
            finally
            {
                redis.LockRelease(lockKey, Environment.MachineName);
            }
        }
        return o;
    }
}
You can keep the connectionMultiplexer in a static variable and not create it for every get/set. That will keep one connection to Redis always open and make your operations faster.
Update:
Please, have a look at StackExchange.Redis basic usage:
https://github.com/StackExchange/StackExchange.Redis/blob/master/Docs/Basics.md
"Note that ConnectionMultiplexer implements IDisposable and can be disposed when no longer required, but I am deliberately not showing using statement usage, because it is exceptionally rare that you would want to use a ConnectionMultiplexer briefly, as the idea is to re-use this object."
It works nicely for me, keeping a single connection to Azure Redis (sometimes it creates 2 connections, but this is by design). Hope it will help you.
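For illustration, a minimal sketch of that pattern (the connection string here is a placeholder; in practice read it from your configuration):

public static class RedisConnection
{
    // Placeholder connection string; read yours from configuration.
    private static readonly string ConnectionString = "contoso.redis.cache.windows.net,password=...";

    // One multiplexer for the whole process, created lazily on first use.
    private static readonly Lazy<ConnectionMultiplexer> LazyConnection =
        new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(ConnectionString));

    public static ConnectionMultiplexer Connection
    {
        get { return LazyConnection.Value; }
    }
}

// usage: var db = RedisConnection.Connection.GetDatabase();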
I would suggest trying the Close (or CloseAsync) method explicitly. In a test setting you may be using different connections for different test cases and not want to share a single multiplexer. A search of public code using the Redis client shows a pattern of Close followed by Dispose calls.
Note that in the XML method documentation of the Redis client, the Close method is described as doing more:
//
// Summary:
// Close all connections and release all resources associated with this object
//
// Parameters:
// allowCommandsToComplete:
// Whether to allow all in-queue commands to complete first.
public void Close(bool allowCommandsToComplete = true);
//
// Summary:
// Close all connections and release all resources associated with this object
//
// Parameters:
// allowCommandsToComplete:
// Whether to allow all in-queue commands to complete first.
[AsyncStateMachine(typeof(<CloseAsync>d__183))]
public Task CloseAsync(bool allowCommandsToComplete = true);
...
//
// Summary:
// Release all resources associated with this object
public void Dispose();
And then I looked up the code for the client, found it here:
https://github.com/StackExchange/StackExchange.Redis/blob/master/src/StackExchange.Redis/ConnectionMultiplexer.cs
And we can see the Dispose method calling Close (not the usual overridable protected Dispose(bool)), furthermore with the wait for connections to close set to true. It appears to be an atypical dispose-pattern implementation: by attempting all of those closures and waiting on them, it risks throwing an exception, while the Dispose method contract says it should never throw one.
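So in a test teardown, the Close-then-Dispose pattern mentioned above might look like:

// Let in-queue commands drain, then release everything deterministically.
connectionMultiplexer.Close(allowCommandsToComplete: true);
connectionMultiplexer.Dispose();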
Note: My question is about the way of including/passing the dispatcher instance around, not about how the pattern is useful.
I am studying the Flux Architecture and I cannot get my head around the concept of the dispatcher (instance) potentially being included everywhere...
What if I want to trigger an Action from my Model Layer? It feels weird to me to include an instance of an object in my Model files... I feel like this is missing some injection pattern...
I have the impression that the exact PHP equivalent is something (that feels) horribly similar to:
<?php
$dispatcher = require '../dispatcher_instance.php';

class MyModel {
    ...
    public function someMethod() {
        ...
        $dispatcher->...
    }
}
I think my question is not really specific to the Flux Architecture, but more about the NodeJS "way of doing things"/practices in general.
TLDR:
No, it is not bad practice to pass around the instance of the dispatcher in your stores
All data stores should have a reference to the dispatcher
The invoking/consuming code (in React, this is usually the view) should only have references to the action-creators, not the dispatcher
Your code doesn't quite align with React because you are creating a public mutable function on your data store.
The ONLY way to communicate with a store in Flux is via message passing which always flows through the dispatcher.
For example:
var Dispatcher = require('MyAppDispatcher');
var ExampleActions = require('ExampleActions');

var _data = 10;

var ExampleStore = assign({}, EventEmitter.prototype, {
  getData() {
    return _data;
  },

  emitChange() {
    this.emit('change');
  },

  dispatcherKey: Dispatcher.register(payload => {
    var {action} = payload;
    switch (action.type) {
      case ACTIONS.ADD_1:
        _data += 1;
        ExampleStore.emitChange();
        ExampleActions.doThatOtherThing();
        break;
    }
  })
});

module.exports = ExampleStore;
By closing over _data instead of having a data property directly on the store, you can enforce the message passing rule. It's a private member.
Also important to note, although you can call Dispatcher.emit() directly, it's not a good idea.
There are two main reasons to go through the action-creators:
Consistency - This is how your views and other consuming code interact with the stores
Easier Refactoring - If you ever remove the ADD_1 action from your app, this code will throw an exception rather than silently failing by sending a message that doesn't match any of the switch statements in any of the stores
Main Advantages to this Approach
Loose coupling - Adding and removing features is a breeze. Stores can respond to any event in the system by adding one line of code.
Less complexity - One-way data flow makes wrapping your head around the data flow a lot easier. Fewer interdependencies.
Easier debugging - You can debug every change in your system with a few lines of code.
debugging example:
var MyAppDispatcher = require('MyAppDispatcher');

MyAppDispatcher.register(payload => {
  console.debug(payload);
});
I'm currently trying to reduce the number of similar requests being processed in a business layer by:
Caching the requests a method receives
Performing the slow processing task (once for all similar requests)
Returning the result to each requesting method call
Things to note are that:
The original method calls are not currently in an async BeginMethod() / EndMethod(IAsyncResult) pattern
The requests arrive faster than the time it takes to generate the output
I'm trying to use TPL where possible, as I am currently trying to learn more about this library
e.g. improving the following:
byte[] RequestSlowOperation(string operationParameter)
{
    // Perform slow task here...
}
Any thoughts?
Follow up:
class SomeClass
{
    private int _threadCount;

    public SomeClass(int threadCount)
    {
        _threadCount = threadCount;
        int parameter = 0;
        var taskFactory = Task<int>.Factory;
        for (int i = 0; i < threadCount; i++)
        {
            int i1 = i;
            taskFactory
                .StartNew(() => RequestSlowOperation(parameter))
                .ContinueWith(result => Console.WriteLine("Result {0} : {1}", result.Result, i1));
        }
    }

    private int RequestSlowOperation(int parameter)
    {
        Lazy<int> result2;
        var result = _cacheMap.GetOrAdd(parameter, new Lazy<int>(() => RequestSlowOperation2(parameter))).Value;
        //_cacheMap.TryRemove(parameter, out result2); <<<<< Thought I could remove immediately, but this causes blobby behaviour
        return result;
    }

    static ConcurrentDictionary<int, Lazy<int>> _cacheMap = new ConcurrentDictionary<int, Lazy<int>>();

    private int RequestSlowOperation2(int parameter)
    {
        Console.WriteLine("Evaluating");
        Thread.Sleep(100);
        return parameter;
    }
}
Here is a fast, safe and maintainable way to do this:
static readonly ConcurrentDictionary<string, Lazy<byte[]>> cacheMap =
    new ConcurrentDictionary<string, Lazy<byte[]>>();

byte[] RequestSlowOperation(string operationParameter)
{
    // GetOrAdd's value factory receives the key; only one Lazy wins publication.
    return cacheMap.GetOrAdd(operationParameter,
        key => new Lazy<byte[]>(() => RequestSlowOperation2(operationParameter))).Value;
}

byte[] RequestSlowOperation2(string operationParameter)
{
    // Perform slow task here...
}
This will execute RequestSlowOperation2 at most once per key. Please be aware that the memory held by the dictionary will never be released.
The user delegate passed to the ConcurrentDictionary is not executed under lock, meaning that it could execute multiple times! My solution allows multiple lazies to be created but only one of them will ever be published and materialized.
Regarding locking: this solution will take locks, but it does not matter because the work items are far more expensive than the (few) lock operations.
Honestly, the use of TPL as a technology here is not really important, this is just a straight up concurrency problem. You're trying to protect access to a shared resource (the cached data) and, to do that, the only approach is to lock. Either that or, if the cache entry does not already exist, you could allow all incoming threads to generate it and then subsequent requesters benefit from the cached value once it's stored, but there's little value in that if the resource is slow/expensive to generate and cache.
Perhaps some more details will make it clear exactly why you're trying to accomplish this without a lock. I'll happily revise my answer if more detail makes it clearer what you're trying to do.
My code runs 4 functions to fill in information (using Invoke) into a class, such as:
class Person
{
    int Age;
    string name;
    long ID;
    bool isVegetarian;

    public static Person GetPerson(int LocalID)
    {
        Person person;
        Parallel.Invoke(() => { GetAgeFromWebServiceX(person); },
                        () => { GetNameFromWebServiceY(person); },
                        () => { GetIDFromWebServiceZ(person); },
                        () =>
                        {
                            // connect to my database and get information if vegetarian (using LocalID)
                            ....
                            if (!person.isVegetarian)
                                return null;
                            ....
                        });
    }
}
My question is: I cannot return null if he's not a vegetarian, but I want to be able to stop all threads, stop processing, and just return null. How can that be achieved?
To exit the Parallel.Invoke as early as possible, you'd have to do three things:
1. Schedule the action that detects whether you want to exit early as the first action. It's then scheduled sooner (maybe first, but that's not guaranteed), so you'll know sooner whether you want to exit.
2. Throw an exception when you detect the error and catch an AggregateException, as Jon's answer indicates.
3. Use cancellation tokens. However, this only makes sense if you have an opportunity to check their IsCancellationRequested property.
Your code would then look as follows:
var cts = new CancellationTokenSource();
try
{
    Parallel.Invoke(
        new ParallelOptions { CancellationToken = cts.Token },
        () =>
        {
            if (!person.IsVegetarian)
            {
                cts.Cancel();
                throw new PersonIsNotVegetarianException();
            }
        },
        () => { GetAgeFromWebServiceX(person, cts.Token); },
        () => { GetNameFromWebServiceY(person, cts.Token); },
        () => { GetIDFromWebServiceZ(person, cts.Token); }
    );
}
catch (AggregateException e)
{
    var cause = e.InnerExceptions[0];
    // Check if cause is a PersonIsNotVegetarianException.
}
However, as I said, cancellation tokens only make sense if you can check them. So there should be an opportunity inside GetAgeFromWebServiceX to check the cancellation token and exit early; otherwise, passing tokens to these methods doesn't make sense.
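For example, a hypothetical body for GetAgeFromWebServiceX with such an opportunity (the chunked loop and helpers are invented for illustration):

static void GetAgeFromWebServiceX(Person person, CancellationToken token)
{
    foreach (var chunk in FetchAgeDataInChunks()) // hypothetical helper
    {
        // Throws OperationCanceledException once cts.Cancel() has run;
        // it surfaces inside the AggregateException caught above.
        token.ThrowIfCancellationRequested();
        ApplyChunk(person, chunk); // hypothetical helper
    }
}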
Well, you can throw an exception from your action, catch AggregateException in GetPerson (i.e. put a try/catch block around Parallel.Invoke), check for it being the right kind of exception, and return null.
That fulfils everything except stopping all the threads. I think it's unlikely that you'll easily be able to stop already running tasks unless you start getting into cancellation tokens. You could stop further tasks from executing by keeping a boolean value to indicate whether any of the tasks so far has failed, and make each task check that before starting... it's somewhat ugly, but it will work.
I suspect that using "full" tasks instead of Parallel.Invoke would make all of this more elegant though.
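A rough sketch of that idea (hypothetical names; it assumes the web-service calls observe the token as discussed above):

static async Task<Person> GetPersonAsync(int localId)
{
    var person = new Person();
    var cts = new CancellationTokenSource();

    var tasks = new[]
    {
        Task.Run(() =>
        {
            if (!LoadIsVegetarian(person, localId)) // hypothetical DB check
                cts.Cancel();                       // ask the other calls to stop
        }),
        Task.Run(() => GetAgeFromWebServiceX(person, cts.Token), cts.Token),
        Task.Run(() => GetNameFromWebServiceY(person, cts.Token), cts.Token),
        Task.Run(() => GetIDFromWebServiceZ(person, cts.Token), cts.Token),
    };

    try
    {
        await Task.WhenAll(tasks);
    }
    catch (OperationCanceledException)
    {
        // at least one call observed the cancellation
    }

    return cts.IsCancellationRequested ? null : person;
}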
Surely you need to load your Person from the database first anyway? As it is, your code calls the Web services with a null.
If your logic really is sequential, do it sequentially and only do in parallel what makes sense.