Parallel.Invoke - Exception handling - c#-4.0

My code runs 4 function to fill in information (Using Invoke) to a class such as:
class Person
{
int Age;
string name;
long ID;
bool isVegeterian
public static Person GetPerson(int LocalID)
{
Person person;
Parallel.Invoke(() => {GetAgeFromWebServiceX(person)},
() => {GetNameFromWebServiceY(person)},
() => {GetIDFromWebServiceZ(person)},
() =>
{
// connect to my database and get information if vegeterian (using LocalID)
....
if (!person.isVegetrian)
return null
....
});
}
}
My question is: I can not return null if he's not a vegeterian, but I want to able to stop all threads, stop processing and just return null. How can it be achieved?

To exit the Parallel.Invoke as early as possible you'd have to do three things:
Schedule the action that detects whether you want to exit early as the first action. It's then scheduled sooner (maybe as first, but that's not guaranteed) so you'll know sooner whether you want to exit.
Throw an exception when you detect the error and catch an AggregateException as Jon's answer indicates.
Use cancellation tokens. However, this only makes sense if you have an opportunity to check their IsCancellationRequested property.
Your code would then look as follows:
var cts = new CancellationTokenSource();
try
{
Parallel.Invoke(
new ParallelOptions { CancellationToken = cts.Token },
() =>
{
if (!person.IsVegetarian)
{
cts.Cancel();
throw new PersonIsNotVegetarianException();
}
},
() => { GetAgeFromWebServiceX(person, cts.Token) },
() => { GetNameFromWebServiceY(person, cts.Token) },
() => { GetIDFromWebServiceZ(person, cts.Token) }
);
}
catch (AggregateException e)
{
var cause = e.InnerExceptions[0];
// Check if cause is a PersonIsNotVegetarianException.
}
However, as I said, cancellation tokens only make sense if you can check them. So there should be an opportunity inside GetAgeFromWebServiceX to check the cancellation token and exit early, otherwise, passing tokens to these methods doesn't make sense.

Well, you can throw an exception from your action, catch AggregateException in GetPerson (i.e. put a try/catch block around Parallel.Invoke), check for it being the right kind of exception, and return null.
That fulfils everything except stopping all the threads. I think it's unlikely that you'll easily be able to stop already running tasks unless you start getting into cancellation tokens. You could stop further tasks from executing by keeping a boolean value to indicate whether any of the tasks so far has failed, and make each task check that before starting... it's somewhat ugly, but it will work.
I suspect that using "full" tasks instead of Parallel.Invoke would make all of this more elegant though.

Surely you need to load your Person from the database first anyway? As it is your code calls the Web services with a null.
If your logic really is sequential, do it sequentially and only do in parallel what makes sense.

Related

Node.js: How to implement a simple and functional Mutex mechanism to avoid racing conditions that bypass the guard statement in simultaneous actions

In the following class, the _busy field acts as a semaphore; but, in "simultaneous" situations it fails to guard!
class Task {
_busy = false;
async run(s) {
try {
if (this._busy)
return;
this._busy = true;
await payload();
} finally {
this._busy = false;
}
}
}
The sole purpose of the run() is to execute the payload() exclusively, denying all the other invocations while it's still being carried on. In other words, when "any" of the invocations reach to to the run() method, I want it to only allow the first one to go through and lock it down (denying all the others) until it's done with its payload; "finally", it opens up once it's done.
In the implementation above, the racing condition do occur by invoking the run() method simultaneously through various parts of the app. Some of the invocations (more than 1) make it past through the "guarding" if statement, since none of them are yet reached to the this._busy = true to lock it down (they get past simultaneously). So, the current implementation doesn't cut it!
I just want to deny the simultaneous invocations while one of them is already being carried out. I'm looking for a simple solution to only resolve this issue. I've designated the async-mutex library as a last resort!
So, how to implement a simple "locking" mechanism to avoid racing conditions that bypass the guard statement in simultaneous actions?
For more clarification, as per the comments below, the following is almost the actual Task class (without the irrelevant).
class Task {
_cb;
_busy = false;
_count = 0;
constructor(cb) {
this._cb = cb;
}
async run(params = []) {
try {
if (this._busy)
return;
this._busy = true;
this._count++;
if (this._count > 1) {
console.log('Race condition!', 'count:', this._count);
this._count--;
return;
}
await this._cb(...params);
} catch (err) {
await someLoggingRoutine();
} finally {
this._busy = false;
this._count--;
}
}
}
I do encounter with the Race condition! log. Also, all the task instances are local to a simple driver file (the instances are not passed down to any other function, they only wander as local instances in a single function.) They are created in the following form:
const t1 = new Task(async () => { await doSth1(); /*...*/ });
const t2 = new Task(async () => { await doSth2(); /*...*/ });
const t3 = new Task(async () => { await doSth3(); /*...*/ });
// ...
I do call them in the various library events, some of which happen concurrently and causing the "race condition" issue; e.g.:
someLib.on('some-event', async function() { /*...*/ t1.run().then(); /*...*/ });
anotherLib.on('event-2', async function() { /*...*/ t1.run().then(); /*...*/ });
Oh god, now I see it. How could I have missed this so long! Here is your implemenation:
async run() {
try {
if (this._busy)
return;
...
} finally {
this._busy = false;
}
}
As per documentations:
The Statements in the finally block are executed before control flow exits the try...catch...finally construct. These statements execute regardless of whether an exception was thrown or caught.
Thus, when it's busy and the flow reaches the guarding if, and then, logically encounters the return statement. The return statement causes the flow to exit the try...catch...finally construct; thus, as per the documentations, the statements in the finally block are executed whatsoever: setting the this._busy = false;, opening the thing up!
So, the first call of run() sets this._busy as being true; then enters the critical section with its longrunning callback. While this callback is running, just another event causes the run() to be invoked. This second call is rationally blocked from entering the critical section by the guarding if statement:
if (this._busy) return;
Encountering the return statement to exit the function (and thus exiting the try...catch...finally construct) causes the statements in the finally block to be executed; thus, this._busy = false resets the flag, even though the first callback is still running! Now suppose a third call to the run() from yet another event is invoked! Since this._busy is just set to false, the flow happily enters the critical section again, even though the first callback is still running! In turn, it sets this._busy back to true. In the meantime, the first callback finishes, and reaches the finally block, where it set this._busy = false again; even though the other callback is still running. So the next call to run() can enter the critical section again with no problems... And so on and so forth...
So to resolve the issue, the check for the critical section should be outside of the try block:
async run() {
if (this._busy) return;
this._busy = true;
try { ... }
finally {
this._busy = false;
}
}

How to use await keyword inside a method without changing the method async

I am developing a scheduled job to send message to Message queue using Quartz.net. The Execute method of IJob is not async. so I can't use async Task. But I want to call a method with await keyword.
Please find below my code. Not sure whether I am doing correct. Can anyone please help me with this?
private async Task PublishToQueue(ChangeDetected changeDetected)
{
_logProvider.Info("Publish to Queue started");
try
{
await _busControl.Publish(changeDetected);
_logProvider.Info($"ChangeDetected message published to RabbitMq. Message");
}
catch (Exception ex)
{
_logProvider.Error("Error publishing message to queue: ", ex);
throw;
}
}
public class ChangedNotificatonJob : IJob
{
public void Execute(IJobExecutionContext context)
{
//Publish message to queue
Policy
.Handle<Exception>()
.RetryAsync(3, (exception, count) =>
{
//Do something for each retry
})
.ExecuteAsync(async () =>
{
await PublishToQueue(message);
});
}
}
Is this correct way? I have used .GetAwaiter();
Policy
.Handle<Exception>()
.RetryAsync(_configReader.RetryLimit, (exception, count) =>
{
//Do something for each retry
})
.ExecuteAsync(async () =>
{
await PublishToQueue(message);
}).GetAwaiter()
Polly's .ExecuteAsync() returns a Task. With any Task, you can just call .Wait() on it (or other blocking methods) to block synchronously until it completes, or throws an exception.
As you have observed, since IJob.Execute(...) isn't async, you can't use await, so you have no choice but to block synchronously on the task, if you want to discover the success-or-otherwise of publishing before IJob.Execute(...) returns.
.Wait() will cause any exception from the task to be rethrown, wrapped in an AggregateException. This will occur if all Polly-orchestrated retries fail.
You'll need to decide what to do with that exception:
If you want the caller to handle it, rethrow it or don't catch it and let it cascade outside the Quartz job.
If you want to handle it before returning from IJob.Execute(...), you'll need a try {} catch {} around the whole .ExecuteAsync(...).Wait(). Or consider Polly's .ExecuteAndCaptureAsync(...) syntax: it avoids you having to provide that outer try-catch, by instead placing the final outcome of the execution into a PolicyResult instance. See the Polly doco.
There is a further alternative if your only intention is to log somewhere that message publishing failed, and you don't care whether that logging happens before IJob.Execute(...) returns or not. In that case, instead of using .Wait(), you could chain a continuation task on to ExecuteAsync() using .ContinueWith(...), and handle any logging in there. We adopt this approach, and capture failed message publishing to a special 'message hospital' - capturing enough information so that we can choose whether to republish that message again later, if appropriate. Whether this approach is valuable depends on how important it is to you never to lose a message.
EDIT: GetAwaiter() is irrelevant. It won't magically let you start using await inside a non-async method.

Function/Code Design with Concurrency in Swift

I'm trying to create my first app in Swift which involves making multiple requests to a website. These requests are each done using the block
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in ... }
task.resume()
From what I understand this block uses a thread different to the main thread.
My question is, what is the best way to design code that relies on the values in that block? For instance, the ideal design (however not possible due to the fact that the thread executing these blocks is not the main thread) is
func prepareEmails() {
var names = getNames()
var emails = getEmails()
...
sendEmails()
}
func getNames() -> NSArray {
var names = nil
....
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
names = ...
})
task.resume()
return names
}
func getEmails() -> NSArray {
var emails = nil
....
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
emails = ...
})
task.resume()
return emails
}
However in the above design, most likely getNames() and getEmails() will return nil, as the the task will not have updated emails/name by the time it returns.
The alternative design (which I currently implement) is by effectively removing the 'prepareEmails' function and doing everything sequentially in the task functions
func prepareEmails() {
getNames()
}
func getNames() {
...
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
getEmails(names)
})
task.resume()
}
func getEmails(names: NSArray) {
...
var task = NSURLSession.sharedSession().dataTaskWithRequest(request, completionHandler: {data, response, error -> Void in
sendEmails(emails, names)
})
task.resume()
}
Is there a more effective design than the latter? This is my first experience with concurrency, so any advice would be greatly appreciated.
The typical pattern when calling an asynchronous method that has a completionHandler parameter is to use the completionHandler closure pattern, yourself. So the methods don't return anything, but rather call a closure with the returned information as a parameter:
func getNames(completionHandler:(NSArray!)->()) {
....
let task = NSURLSession.sharedSession().dataTaskWithRequest(request) {data, response, error -> Void in
let names = ...
completionHandler(names)
}
task.resume()
}
func getEmails(completionHandler:(NSArray!)->()) {
....
let task = NSURLSession.sharedSession().dataTaskWithRequest(request) {data, response, error -> Void in
let emails = ...
completionHandler(emails)
}
task.resume()
}
Then, if you need to perform these sequentially, as suggested by your code sample (i.e. if the retrieval of emails was dependent upon the names returned by getNames), you could do something like:
func prepareEmails() {
getNames() { names in
getEmails() {emails in
sendEmails(names, emails) // I'm assuming the names and emails are in the input to this method
}
}
}
Or, if they can run concurrently, then you should do so, as it will be faster. The trick is how to make a third task dependent upon two other asynchronous tasks. The two traditional alternatives include
Wrapping each of these asynchronous tasks in its own asynchronous NSOperation, and then create a third task dependent upon those other two operations. This is probably beyond the scope of the question, but you can refer to the Operation Queue section of the Concurrency Programming Guide or see the Asynchronous vs Synchronous Operations and Subclassing Notes sections of the NSOperation Class Reference.
Use dispatch groups, entering the group before each request, leaving the group within the completion handler of each request, and then adding a dispatch group notification block (called when all of the group "enter" calls are matched by their corresponding "leave" calls):
func prepareEmails() {
let group = dispatch_group_create()
var emails: NSArray!
var names: NSArray!
dispatch_group_enter(group)
getNames() { results in
names = results
dispatch_group_leave(group)
}
dispatch_group_enter(group)
getEmails() {results in
emails = results
dispatch_group_leave(group)
}
dispatch_group_notify(group, dispatch_get_main_queue()) {
if names != nil && emails != nil {
self.sendEmails(names, emails)
} else {
// one or both of those requests failed; tell the user
}
}
}
Frankly, if there's any way to retrieve both the emails and names in a single network request, that's going to be far more efficient. But if you're stuck with two separate requests, you could do something like the above.
Note, I wouldn't generally use NSArray in my Swift code, but rather use an array of String objects (e.g. [String]). Furthermore, I'd put in error handling where I return the nature of the error if either of these fail. But hopefully this illustrates the concepts involved in (a) writing your own methods with completionHandler blocks; and (b) invoking a third bit of code dependent upon the completion of two other asynchronous tasks.
The answers above (particularly Rob's DispatchQueue based answer) describe the concurrency concepts necessary to run two tasks in parallel and then respond to the result. The answers lack error handling for clarity because traditionally, correct solutions to concurrency problems are quite verbose.
Not so with HoneyBee.
HoneyBee.start()
.setErrorHandler(handleErrorFunc)
.branch {
$0.chain(getNames)
+
$0.chain(getEmails)
}
.chain(sendEmails)
This code snippet manages all of the concurrency, routes all errors to handleErrorFunc and looks like the concurrent pattern that is desired.

Wrapping legacy object in IConnectableObservable

I have a legacy event-based object that seems like a perfect fit for RX: after being connected to a network source, it raises events when a message is received, and may terminate with either a single error (connection dies, etc.) or (rarely) an indication that there will be no more messages. This object also has a couple projections -- most users are interested in only a subset of the messages received, so there are alternate events raised only when well-known message subtypes show up.
So, in the process of learning more about reactive programming, I built the following wrapper:
class LegacyReactiveWrapper : IConnectableObservable<TopLevelMessage>
{
private LegacyType _Legacy;
private IConnectableObservable<TopLevelMessage> _Impl;
public LegacyReactiveWrapper(LegacyType t)
{
_Legacy = t;
var observable = Observable.Create<TopLevelMessage>((observer) =>
{
LegacyTopLevelMessageHandler tlmHandler = (sender, tlm) => observer.OnNext(tlm);
LegacyErrorHandler errHandler = (sender, err) => observer.OnError(new ApplicationException(err.Message));
LegacyCompleteHandler doneHandler = (sender) => observer.OnCompleted();
_Legacy.TopLevelMessage += tlmHandler;
_Legacy.Error += errHandler;
_Legacy.Complete += doneHandler;
return Disposable.Create(() =>
{
_Legacy.TopLevelMessage -= tlmHandler;
_Legacy.Error -= errHandler;
_Legacy.Complete -= doneHandler;
});
});
_Impl = observable.Publish();
}
public IDisposable Subscribe(IObserver<TopLevelMessage> observer)
{
return _Impl.RefCount().Subscribe(observer);
}
public IDisposable Connect()
{
_Legacy.ConnectToMessageSource();
return Disposable.Create(() => _Legacy.DisconnectFromMessageSource());
}
public IObservable<SubMessageA> MessageA
{
get
{
// This is the moral equivalent of the projection behavior
// that already exists in the legacy type. We don't hook
// the LegacyType.MessageA event directly.
return _Impl.RefCount()
.Where((tlm) => tlm.MessageType == MessageType.MessageA)
.Select((tlm) => tlm.SubMessageA);
}
}
public IObservable<SubMessageB> MessageB
{
get
{
return _Impl.RefCount()
.Where((tlm) => tlm.MessageType == MessageType.MessageB)
.Select((tlm) => tlm.SubMessageB);
}
}
}
Something about how it's used elsewhere feels... off... somehow, though. Here's sample usage, which works but feels strange. The UI context for my test application is WinForms, but it doesn't really matter.
// in Program.Main...
MainForm frm = new MainForm();
// Updates the UI based on a stream of SubMessageA's
IObserver<SubMessageA> uiManager = new MainFormUiManager(frm);
LegacyType lt = new LegacyType();
// ... setup lt...
var w = new LegacyReactiveWrapper(lt);
var uiUpdateSubscription = (from msgA in w.MessageA
where SomeCondition(msgA)
select msgA).ObserveOn(frm).Subscribe(uiManager);
var nonUiSubscription = (from msgB in w.MessageB
where msgB.SubType == MessageBType.SomeSubType
select msgB).Subscribe(
m => Console.WriteLine("Got MsgB: {0}", m),
ex => Console.WriteLine("MsgB error: {0}", ex.Message),
() => Console.WriteLine("MsgB complete")
);
IDisposable unsubscribeAtExit = null;
frm.Load += (sender, e) =>
{
var connectionSubscription = w.Connect();
unsubscribeAtExit = new CompositeDisposable(
uiUpdateSubscription,
nonUiSubscription,
connectionSubscription);
};
frm.FormClosing += (sender, e) =>
{
if(unsubscribeAtExit != null) { unsubscribeAtExit.Dispose(); }
};
Application.Run(frm);
This WORKS -- The form launches, the UI updates, and when I close it the subscriptions get cleaned up and the process exits (which it won't do if the LegacyType's network connection is still connected). Strictly speaking, it's enough to dispose just connectionSubscription. However, the explicit Connect feels weird to me. Since RefCount is supposed to do that for you, I tried modifying the wrapper such that rather than using _Impl.RefCount in MessageA and MessageB and explicitly connecting later, I used this.RefCount instead and moved the calls to Subscribe to the Load handler. That had a different problem -- the second subscription triggered another call to LegacyReactiveWrapper.Connect. I thought the idea behind Publish/RefCount was "first-in triggers connection, last-out disposes connection."
I guess I have three questions:
Do I fundamentally misunderstand Publish/RefCount?
Is there some preferred way to implement IConnectableObservable<T> that doesn't involve delegation to one obtained via IObservable<T>.Publish? I know you're not supposed to implement IObservable<T> yourself, but I don't understand how to inject connection logic into the IConnectableObservable<T> that Observable.Create().Publish() gives you. Is Connect supposed to be idempotent?
Would someone more familiar with RX/reactive programming look at the sample for how the wrapper is used and say "that's ugly and broken" or is this not as weird as it seems?
I'm not sure that you need to expose Connect directly as you have. I would simplify as follows, using Publish().RefCount() as an encapsulated implementation detail; it would cause the legacy connection to be made only as required. Here the first subscriber in causes connection, and the last one out causes disconnection. Also note this correctly shares a single RefCount across all subscribers, whereas your implementation uses a RefCount per message type, which isn't probably what was intended. Users are not required to Connect explicitly:
public class LegacyReactiveWrapper
{
private IObservable<TopLevelMessage> _legacyRx;
public LegacyReactiveWrapper(LegacyType legacy)
{
_legacyRx = WrapLegacy(legacy).Publish().RefCount();
}
private static IObservable<TopLevelMessage> WrapLegacy(LegacyType legacy)
{
return Observable.Create<TopLevelMessage>(observer =>
{
LegacyTopLevelMessageHandler tlmHandler = (sender, tlm) => observer.OnNext(tlm);
LegacyErrorHandler errHandler = (sender, err) => observer.OnError(new ApplicationException(err.Message));
LegacyCompleteHandler doneHandler = sender => observer.OnCompleted();
legacy.TopLevelMessage += tlmHandler;
legacy.Error += errHandler;
legacy.Complete += doneHandler;
legacy.ConnectToMessageSource();
return Disposable.Create(() =>
{
legacy.DisconnectFromMessageSource();
legacy.TopLevelMessage -= tlmHandler;
legacy.Error -= errHandler;
legacy.Complete -= doneHandler;
});
});
}
public IObservable<TopLevelMessage> TopLevelMessage
{
get
{
return _legacyRx;
}
}
public IObservable<SubMessageA> MessageA
{
get
{
return _legacyRx.Where(tlm => tlm.MessageType == MessageType.MessageA)
.Select(tlm => tlm.SubMessageA);
}
}
public IObservable<SubMessageB> MessageB
{
get
{
return _legacyRx.Where(tlm => tlm.MessageType == MessageType.MessageB)
.Select(tlm => tlm.SubMessageB);
}
}
}
An additional observation is that Publish().RefCount() will drop the underlying subscription when it's subscriber count reaches 0. Typically I only use Connect over this choice when I need to maintain a subscription even when the subscriber count on the published source drops to zero (and may go back up again later). It's rare to need to do this though - only when connecting is more expensive than holding on to the subscription resource when you might not need to.
Your understanding is not entirely wrong, but you do appear to have some points of misunderstanding.
You seem to be under the belief that multiple calls to RefCount on the same source IObservable will result in a shared reference count. They do not; each instance keeps its own count. As such, you are causing multiple subscriptions to _Impl, one per call to subscribe or call to the Message properties.
You also may be expecting that making _Impl an IConnectableObservable somehow causes your Connect method to be called (since you seem surprised you needed to call Connect in your consuming code). All Publish does is cause subscribers to the published object (returned from the .Publish() call) to share a single subscription to the underlying source observable (in this case, the object made from your call to Observable.Create).
Typically, I see Publish and RefCount used immediately together (eg as source.Publish().RefCount()) to get the shared subscription effect described above or to make a cold observable hot without needing to call Connect to start the subscription to the original source. However, this relies on using the same object returned from the .Publish().RefCount() for all subscribers (as noted above).
Your implementation of Connect seems reasonable. I don't know of any recommendations for if Connect should be idempotent, but I would not personally expect it to be. If you wanted it to be, you would just need to track calls to it the disposal of its return value to get the right balance.
I don't think you need to use Publish the way you are, unless there is some reason to avoid multiple event handlers being attached to the legacy object. If you do need to avoid that, I would recommend changing _Impl to a plain IObservable and follow the Publish with a RefCount.
Your MessageA and MessageB properties have potential to be a source of confusion for users, since they return an IObservable, but still require a call to Connect on the base object to start receiving messages. I would either change them to IConnectableObservables that somehow delegate to the original Connect (at which point the idempotency discussion becomes more relevant) or drop the properties and just let the users make the (fairly simple) projections themselves when needed.

Async Logger. Can I lose/delay log entries?

I'm implementing my own logging framework. Following is my BaseLogger which receives the log entries and push it to the actual Logger which implements the abstract Log method.
I use the C# TPL for logging in an Async manner. I use Threads instead of TPL. (TPL task doesn't hold a real thread. So if all threads of the application end, tasks will stop as well, which will cause all 'waiting' log entries to be lost.)
public abstract class BaseLogger
{
// ... Omitted properties constructor .etc. ... //
public virtual void AddLogEntry(LogEntry entry)
{
if (!AsyncSupported)
{
// the underlying logger doesn't support Async.
// Simply call the log method and return.
Log(entry);
return;
}
// Logger supports Async.
LogAsync(entry);
}
private void LogAsync(LogEntry entry)
{
lock (LogQueueSyncRoot) // Make sure we ave a lock before accessing the queue.
{
LogQueue.Enqueue(entry);
}
if (LogThread == null || LogThread.ThreadState == ThreadState.Stopped)
{ // either the thread is completed, or this is the first time we're logging to this logger.
LogTask = new new Thread(new ThreadStart(() =>
{
while (true)
{
LogEntry logEntry;
lock (LogQueueSyncRoot)
{
if (LogQueue.Count > 0)
{
logEntry = LogQueue.Dequeue();
}
else
{
break;
// is it possible for a message to be added,
// right after the break and I leanve the lock {} but
// before I exit the loop and task gets 'completed' ??
}
}
Log(logEntry);
}
}));
LogThread.Start();
}
}
// Actual logger implimentations will impliment this method.
protected abstract void Log(LogEntry entry);
}
Note that AddLogEntry can be called from multiple threads at the same time.
My question is, is it possible for this implementation to lose log entries ?
I'm worried that, is it possible to add a log entry to the queue, right after my thread exists the loop with the break statement and exits the lock block, and which is in the else clause, and the thread is still in the 'Running' state.
I do realize that, because I'm using a queue, even if I miss an entry, the next request to log, will push the missed entry as well. But this is not acceptable, specially if this happens for the last log entry of the application.
Also, please let me know whether and how I can implement the same, but using the new C# 5.0 async and await keywords with a cleaner code. I don't mind requiring .NET 4.5.
Thanks in Advance.
While you could likely get this to work, in my experience, I'd recommend, if possible, use an existing logging framework :) For instance, there are various options for async logging/appenders with log4net, such as this async appender wrapper thingy.
Otherwise, IMHO since you're going to be blocking a threadpool thread during your logging operation anyway, I would instead just start a dedicated thread for your logging. You seem to be kind-of going for that approach already, just via Task so that you'd not hold a threadpool thread when nothing is logging. However, the simplification in implementation I think benefits just having the dedicated thread.
Once you have a dedicated logging thread, you then only need have an intermediate ConcurrentQueue. At that point, your log method just adds to the queue and your dedicated logging thread just does that while loop you already have. You can wrap with BlockingCollection if you need blocking/bounded behavior.
By having the dedicated thread as the only thing that writes, it eliminates any possibility of having multiple threads/tasks pulling off queue entries and trying to write log entries at the same time (painful race condition). Since the log method is now just adding to a collection, it doesn't need to be async and you don't need to deal with the TPL at all, making it simpler and easier to reason about (and hopefully in the category of 'obviously correct' or thereabouts :)
This 'dedicated logging thread' approach is what I believe the log4net appender I linked to does as well, FWIW, in case that helps serve as an example.
I see two race conditions off the top of my head:
You can spin up more than one Thread if multiple threads call AddLogEntry. This won't cause lost events but is inefficient.
Yes, an event can be queued while the Thread is exiting, and in that case it would be "lost".
Also, there's a serious performance issue here: unless you're logging constantly (thousands of times a second), you're going to be spinning up a new Thread for each log entry. That will get expensive quickly.
Like James, I agree that you should use an established logging library. Logging is not as trivial as it seems, and there are already many solutions.
That said, if you want a nice .NET 4.5-based approach, it's pretty easy:
public abstract class BaseLogger
{
private readonly ActionBlock<LogEntry> block;
protected BaseLogger(int maxDegreeOfParallelism = 1)
{
block = new ActionBlock<LogEntry>(
entry =>
{
Log(entry);
},
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = maxDegreeOfParallelism,
});
}
public virtual void AddLogEntry(LogEntry entry)
{
block.Post(entry);
}
protected abstract void Log(LogEntry entry);
}
Regarding the loosing waiting messages on app crush because of unhandled exception, I've bound a handler to the event AppDomain.CurrentDomain.DomainUnload. Goes like this:
protected ManualResetEvent flushing = new ManualResetEvent(true);
protected AsyncLogger() // ctor of logger
{
AppDomain.CurrentDomain.DomainUnload += CurrentDomain_DomainUnload;
}
protected void CurrentDomain_DomainUnload(object sender, EventArgs e)
{
if (!IsEmpty)
{
flushing.WaitOne();
}
}
Maybe not too clean, but works.

Resources