Getting MessageLockLost exception on brokered message even though lock was not expired

Getting MessageLockLost exception on brokered message even though lock was not expired - azure

I have implemented a service bus trigger, my sample code is as below.
public static async Task Run([ServiceBusTrigger("myqueue", Connection =
"myservicebus:cs")]BrokeredMessage myQueueItem, TraceWriter log)
{
try
{
if (myQueueItem.LockedUntilUtc <= DateTime.UtcNow)
{
log.Info($"Lock expired.");
return;
}
//renew lock if message lock is about to expire.
if ((myQueueItem.LockedUntilUtc - DateTime.UtcNow).TotalSeconds <= defaultLock)
{
await myQueueItem.RenewLockAsync();
return true;
}
//Process message logic
await myQueueItem.CompleteAsync();
}
catch (MessageLockLostException ex)
{
log.Error($"Message Lock lost. Exception: {ex}");
}
catch(CustomException ex)
{
//forcefully dead letter if custom exception occurs
await myQueueItem.DeadLetterAsync();
}
catch (Exception ex)
{
log.Error($"Error occurred while processing message. Exception: {ex}");
await myQueueItem.AbandonAsync();
}
}
I have set the default lock duration on the queue as 5 minutes.
I'm getting message lock lost exception for few requests, even though lock was actually not expired.
Request process timings as below:
Service bus trigger fired at: May 07 07:02:14 +00:00 2018 utc
LockedUntilUtc in Brokered message : May 07 07:07:08.0149905 utc
Message lock lost exception occurred at : May 07 07:02:18 +00:00 2018 utc
Can anybody help me to find actually what is wrong with my code.
Thanks.

You should not explicitly call CompleteAsync and AbandonAsync in your code. Those methods will be called by Azure Functions runtime automatically based on the result of your function execution (Complete if no exception occurred, Abandon otherwise).
Renewing the lock manually shouldn't be necessary either, runtime should manage that for you. If you are running on Consumption plan, the max duration of function execution is 5 minutes (by default) anyway.
Try removing all the plumbing code and leave only //Process message logic, see if that helps.

Related

Akka.net Ask timeout when used in Azure WebJob

At work we have some code in a Azure WebJob where we use Rabbit
The basic workflow is this
A message arrives on RabbitMQ Queue
We have a message handler for the incoming message
Within the message handler we start a top level (user) supervisor actor where we "ask" it to handle the message
The supervisor actor hierarchy is like this
And the relevant top level code is something like this (this is the WebJob code)
static void Main(string[] args)
{
try
{
//Bootstrap akka IoC resolver well ahead of any actor usages
new AutoFacDependencyResolver(ContainerOperations.Instance.Container, ContainerOperations.Instance.Container.Resolve<ActorSystem>());
var system = ContainerOperations.Instance.Container.Resolve<ActorSystem>();
var busQueueReader = ContainerOperations.Instance.Container.Resolve<IBusQueueReader>();
var dateTime = ContainerOperations.Instance.Container.Resolve<IDateTime>();
busQueueReader.AddHandler<ProgramCalculationMessage>("RabbitQueue", x =>
{
//This is code that gets called whenever we have a RabbitMQ message arrive
//This is code that gets called whenever we have a RabbitMQ message arrive
//This is code that gets called whenever we have a RabbitMQ message arrive
//This is code that gets called whenever we have a RabbitMQ message arrive
//This is code that gets called whenever we have a RabbitMQ message arrive
try
{
//SupervisorActor is a singleton
var supervisorActor = ContainerOperations.Instance.Container.ResolveNamed<IActorRef>("SupervisorActor");
var actorMessage = new SomeActorMessage();
var supervisorRunTask = runModelSupervisorActor.Ask(actorMessage, TimeSpan.FromMinutes(25));
//we want to wait this guy out
var supervisorRunResult = supervisorRunTask.GetAwaiter().GetResult();
switch (supervisorRunResult)
{
case CompletedEvent completed:
{
break;
}
case FailedEvent failed:
{
throw failed.Exception;
}
}
}
catch (Exception ex)
{
_log.Error(ex, "Error found in Webjob");
//throw it for the actual RabbitMqQueueReader Handler so message gets NACK
throw;
}
});
Thread.Sleep(Timeout.Infinite);
}
catch (Exception ex)
{
_log.Error(ex, "Error found");
throw;
}
}
And this is the relevant IOC code (we are using Autofac + Akka.NET DI for Autofac)
builder.RegisterType<SupervisorActor>();
_actorSystem = new Lazy<ActorSystem>(() =>
{
var akkaconf = ActorUtil.LoadConfig(_akkaConfigPath).WithFallback(ConfigurationFactory.Default());
return ActorSystem.Create("WebJobSystem", akkaconf);
});
builder.Register<ActorSystem>(cont => _actorSystem.Value);
builder.Register(cont =>
{
var system = cont.Resolve<ActorSystem>();
return system.ActorOf(system.DI().Props<SupervisorActor>(),"SupervisorActor");
})
.SingleInstance()
.Named<IActorRef>("SupervisorActor");
The problem
So the code is working fine and doing what we want it to, apart from the Akka.Net "ask" timeout shown above in the WebJob code.
Annoyingly this seems to work fine if I try and run the webjob locally. Where I can simulate a "ask" timeout by providing a new supervisorActor that simply doesn't EVER respond with a message back to the "Sender".
This works perfectly running on my machine, but when we run this code in Azure, we DO NOT see a Timeout for the "ask" even though one of our workflow runs exceeded the "ask" timeout by a mile.
I just don't know what could be causing this behavior, does anyone have any ideas?
Could there be some Azure specific config value for the WebJob that I need to set.

The answer to this was to use the async rabbit handlers which apparently came out in V5.0 of the C# rabbit client. The offical docs still show the sync usage (sadly).
This article is quite good : https://gigi.nullneuron.net/gigilabs/asynchronous-rabbitmq-consumers-in-net/
Once we did this, all was good

IEventProcessor processes message two times

Hi I am using Event hub for ingesting data at the frequency of 1 second.
I am continuously pushing simulated data from console application to event hub and then storing into the SQL data base.
Now its been more than 5 days and I found every day some times my receiver process data two times that why i got duplicate records into the database.
Since it happen only once or twice in a day so I am not even able to trace.
Can any one faced such situation so far ?
Or is it possible then host can process same messages twice ?
Or is it an issue of async behavior of receiver ?
Looking forward for the help....
Code snippet :
public class SimpleEventProcessor : IEventProcessor
{
Stopwatch checkpointStopWatch;
async Task IEventProcessor.CloseAsync(PartitionContext context, CloseReason reason)
{
Console.WriteLine("Processor Shutting Down. Partition '{0}', Reason: '{1}'.", context.Lease.PartitionId, reason);
if (reason == CloseReason.Shutdown)
{
await context.CheckpointAsync();
}
}
Task IEventProcessor.OpenAsync(PartitionContext context)
{
Console.WriteLine("SimpleEventProcessor initialized. Partition: '{0}', Offset: '{1}'", context.Lease.PartitionId, context.Lease.Offset);
this.checkpointStopWatch = new Stopwatch();
this.checkpointStopWatch.Start();
return Task.FromResult<object>(null);
}
async Task IEventProcessor.ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
{
foreach (EventData eventData in messages)
{
string data = Encoding.UTF8.GetString(eventData.GetBytes());
// store data into SQL database / database call.
}
// Call checkpoint every 5 minutes, so that worker can resume processing from 5 minutes back if it restarts.
if (this.checkpointStopWatch.Elapsed > TimeSpan.FromMinutes(0))
{
await context.CheckpointAsync();
this.checkpointStopWatch.Restart();
}
if (messages.Count() > 0)
await context.CheckpointAsync();
}
}

Event Hub guarantees at least once delivery:
It has the following characteristics:
low latency
capable of receiving and processing millions of events per second
at least once delivery
So you can expect this to happen.
Also take in account the situation that checkpointing just has occurred, then some more message (lets call them A and B) are processed and then the process fails. The next time the reading process is started again after the failure message consumption will start at the last checkpointed message, so in other words, message A and B will be processed again.

ScriptHost error occured despite catching an exception

I have a simple function that takes a message from a queue and saves it to a storage table. I expect that in some cases a table entity with the same data can already exist. Because of that, I added an exception handling to skip this type of situation and mark the queue message as processed. Despite the fact that exception is handled now, the scripthost informs me about an error and the message is still in the queue.
I suppose it is caused by the fact that I'm using table binding that is on edge between host and my code. Am I right? Should I use a table client within my code instead of binding? Is there a different approach?
Sample code to generate this situation:
[FunctionName("MyFunction")]
public static async Task Run([QueueTrigger("myqueue", Connection = "Conn")]string msg, [Table("mytable", Connection = "Conn")] IAsyncCollector<DataEntity> dataEntity, TraceWriter log)
{
try
{
await dataEntity.AddAsync(new DataEntity()
{
PartitionKey = "1",
RowKey = "1",
Data = msg
});
await dataEntity.FlushAsync();
}
catch (StorageException e)
{
// when it is an exception that informs "entity already exists" skip it
}
}

When a queue trigger function fails, Azure Functions retries the function up to five times for a given queue message, including the first try.
If all five attempts fail, the functions runtime adds a message to a queue named <originalqueuename>-poison.
You can write a function to process messages from the poison queue by logging them or sending a notification that manual attention is needed.
The host.json file contains settings that control queue trigger behavior:
{
"queues": {
"maxPollingInterval": 2000,
"visibilityTimeout" : "00:00:30",
"batchSize": 16,
"maxDequeueCount": 1,
"newBatchThreshold": 8
}
}
Note: maxDequeueCount default is 5. The number of times to try processing a message before moving it to the poison queue. For your need, you could set the "maxDequeueCount":1.
Also these settings are host wide and apply to all functions. You can't control these per function currently.

How to use await keyword inside a method without changing the method async

I am developing a scheduled job to send message to Message queue using Quartz.net. The Execute method of IJob is not async. so I can't use async Task. But I want to call a method with await keyword.
Please find below my code. Not sure whether I am doing correct. Can anyone please help me with this?
private async Task PublishToQueue(ChangeDetected changeDetected)
{
_logProvider.Info("Publish to Queue started");
try
{
await _busControl.Publish(changeDetected);
_logProvider.Info($"ChangeDetected message published to RabbitMq. Message");
}
catch (Exception ex)
{
_logProvider.Error("Error publishing message to queue: ", ex);
throw;
}
}
public class ChangedNotificatonJob : IJob
{
public void Execute(IJobExecutionContext context)
{
//Publish message to queue
Policy
.Handle<Exception>()
.RetryAsync(3, (exception, count) =>
{
//Do something for each retry
})
.ExecuteAsync(async () =>
{
await PublishToQueue(message);
});
}
}
Is this correct way? I have used .GetAwaiter();
Policy
.Handle<Exception>()
.RetryAsync(_configReader.RetryLimit, (exception, count) =>
{
//Do something for each retry
})
.ExecuteAsync(async () =>
{
await PublishToQueue(message);
}).GetAwaiter()

Polly's .ExecuteAsync() returns a Task. With any Task, you can just call .Wait() on it (or other blocking methods) to block synchronously until it completes, or throws an exception.
As you have observed, since IJob.Execute(...) isn't async, you can't use await, so you have no choice but to block synchronously on the task, if you want to discover the success-or-otherwise of publishing before IJob.Execute(...) returns.
.Wait() will cause any exception from the task to be rethrown, wrapped in an AggregateException. This will occur if all Polly-orchestrated retries fail.
You'll need to decide what to do with that exception:
If you want the caller to handle it, rethrow it or don't catch it and let it cascade outside the Quartz job.
If you want to handle it before returning from IJob.Execute(...), you'll need a try {} catch {} around the whole .ExecuteAsync(...).Wait(). Or consider Polly's .ExecuteAndCaptureAsync(...) syntax: it avoids you having to provide that outer try-catch, by instead placing the final outcome of the execution into a PolicyResult instance. See the Polly doco.
There is a further alternative if your only intention is to log somewhere that message publishing failed, and you don't care whether that logging happens before IJob.Execute(...) returns or not. In that case, instead of using .Wait(), you could chain a continuation task on to ExecuteAsync() using .ContinueWith(...), and handle any logging in there. We adopt this approach, and capture failed message publishing to a special 'message hospital' - capturing enough information so that we can choose whether to republish that message again later, if appropriate. Whether this approach is valuable depends on how important it is to you never to lose a message.
EDIT: GetAwaiter() is irrelevant. It won't magically let you start using await inside a non-async method.

Message being retried when operation takes time

I have a messaging system using Azure ServiceBus but I'm using Nimbus on top of that. I have an endpoint that sends a command to another endpoint and at one point the handler class on the other side picks it up, so it is all working fine.
When the operation takes time, roughly more than 20 second or so, the handler gets 'another' call with the same message. It looks like Nimbus is retrying the message that is already being handled by an other (even the same) instance of the handler, I don't see any exceptions being thrown and I could easily repro this with the following handler:
public class Synchronizer : IHandleCommand<RequestSynchronization>
{
public async Task Handle(RequestSynchronization synchronizeInfo)
{
Console.WriteLine("Received Synchronization");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate long running process
Console.WriteLine("Got through first timeout");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate another long running process
Console.WriteLine("Got through second timeout");
}
}
My question is: How do I disable this behavior? I am happy for the transaction take time as it is a heavy process that I have off-loaded from my website, which was the whole point of going with this architecture in the first place.
In other words, I was expecting the message to not to be picked up by another handler while one has picked it up and is processing it, unless there's an exception and the message goes back to the queue and eventually gets picked up for a retry.
Any ideas how to do this? Anything I'm missing?

By default, ASB/WSB will give you a message lock of 30 seconds. The idea is that you pop a BrokeredMessage off the head of the queue but have to either .Complete() or .Abandon() that message within the lock timeout.
If you don't do that, the service bus assumes that you've crashed or otherwise failed and it will return that message to the queue to be re-processed.
You have a couple of options:
1) Implement ILongRunningHandler on your handler. Nimbus will pay attention to the remaining lock time and automatically renew your message lock. Caution: The maximum message lock time supported by ASB/WSB is five minutes no matter how many times you renew so if your handler takes longer than that then you might want option #2.
public class Synchronizer : IHandleCommand<RequestSynchronization>, ILongRunningTask
{
public async Task Handle(RequestSynchronization synchronizeInfo)
{
Console.WriteLine("Received Synchronization");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate long running process
Console.WriteLine("Got through first timeout");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate another long running process
Console.WriteLine("Got through second timeout");
}
}
2) In your handler, call a Task.Run(() => SomeService(yourMessage)) and return. If you do this, be careful about lifetime scoping of dependencies if your handler takes any. If you need an IFoo, take a dependency on a Func> (or equivalent depending on your container) and resolve that within your handling task.
public class Synchronizer : IHandleCommand<RequestSynchronization>
{
private readonly Func<Owned<IFoo>> fooFunc;
public Synchronizer(Func<Owned<IFoo>> fooFunc)
{
_fooFunc = fooFunc;
}
public async Task Handle(RequestSynchronization synchronizeInfo)
{
// don't await!
Task.Run(() => {
using (var foo = _fooFunc())
{
Console.WriteLine("Received Synchronization");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate long running process
Console.WriteLine("Got through first timeout");
await Task.Delay(TimeSpan.FromSeconds(30)); //Simulate another long running process
Console.WriteLine("Got through second timeout");
}
});
}
}

I think you are looking for the code here: http://www.uglybugger.org/software/post/support_for_long_running_handlers_in_nimbus

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Getting MessageLockLost exception on brokered message even though lock was not expired - azure

Related

Akka.net Ask timeout when used in Azure WebJob

IEventProcessor processes message two times

ScriptHost error occured despite catching an exception

How to use await keyword inside a method without changing the method async

Message being retried when operation takes time

Categories

Resources