Weird behaviour with Task Parallel Library Framework and Azure Instances

I need some help solving a problem involving the Task Parallel Library with Azure instances. Below is code for my Worker Role.
Whenever I upload multiple files, a request is inserted into the queue, and the worker process continuously polls the queue and gets the message. Once a message is retrieved, I run a long-running process. I used the task scheduler so that multiple requests are served by multiple task instances on multiple instances.
Now the question: one instance takes a message from the queue, assigns it to a task, and processes it, but I then see another instance retrieve the same message from the queue and process it too. Because of that, my tasks are executed multiple times.
Please help me with this problem. My requirement is that only one Azure instance, on one core, handles a given task operation, not multiple tasks.
public override void Run()
{
    // Step 1: Get the message from the queue
    // Step 2: Process the message on a task
    try
    {
        var task = Task<string>.Factory.StartNew(() =>
        {
            try
            {
                // Message deleted from the queue
                PopulateBlobtoTable(uri, localStoragePath);
            }
            catch (Exception ex)
            {
                Trace.WriteLine(ex.Message);
                throw;
            }
            return "Finished!";
        });
        // Waiting on the task is what surfaces a failure as an AggregateException
        task.Wait();
    }
    catch (AggregateException ae)
    {
        foreach (var exception in ae.InnerExceptions)
        {
            Trace.WriteLine(exception.Message);
        }
    }
}

I'm assuming you are using Windows Azure Storage queues, which have a default invisibility timeout of 90 seconds, when using the storage client APIs. If your message is not completely processed and explicitly deleted within that time period, it will reappear on the queue.
While you can increase this invisibility timeout to up to seven days when you add the message to the queue, you should be using operations that are idempotent, meaning it doesn't matter if the message is processed multiple times. It's your job to ensure idempotence, perhaps by recording a unique id (in table storage, SQL database, etc.) associated with each message and ignoring the message if you see it a second time and you find it's already been marked complete.
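To make the "record a unique id and skip it the second time" idea concrete, here is a minimal sketch. It is not the storage client API: ProcessedMessageLog and ProcessOnce are hypothetical names, and the in-memory dictionary is only a stand-in for a durable store such as table storage or SQL that all instances would share. It also skips messages that another caller is still working on, which is an assumption beyond what is described above.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: an in-memory stand-in for a durable, shared store
// (table storage, SQL database, ...) that records which message ids were handled.
public sealed class ProcessedMessageLog
{
    private enum Status { InProgress, Complete }

    private readonly ConcurrentDictionary<string, Status> _entries =
        new ConcurrentDictionary<string, Status>();

    // Runs 'work' only if 'messageId' has not been claimed or completed before.
    // Returns true if the work ran, false if the message was skipped.
    public bool ProcessOnce(string messageId, Action work)
    {
        // TryAdd is atomic, so only one caller can claim a given id.
        if (!_entries.TryAdd(messageId, Status.InProgress))
        {
            return false;
        }

        try
        {
            work();
            _entries[messageId] = Status.Complete;
            return true;
        }
        catch
        {
            // Release the claim so a later redelivery can try again.
            _entries.TryRemove(messageId, out _);
            throw;
        }
    }
}
```

With this in place, a message that reappears after its invisibility timeout expires is handed to ProcessOnce again with the same id and simply returns false instead of repeating the work.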
You might also look at Windows Azure Queues and Windows Azure Service Bus Queues - Compared and Contrasted. You'll note Service Bus queues have some additional constructs you can use to guarantee at-most-once (and at-least-once) delivery.

Now the question: one instance takes a message from the queue, assigns it to a task, and processes it, but I then see another instance retrieve the same message from the queue and process it too. Because of that, my tasks are executed multiple times.
Are you getting the messages via "GET" semantics? If so, what visibility timeout have you set for your messages? When you "GET" a message, it should become invisible to other callers (read "instances" in your case) for a period of time that you specify with the visibility timeout. See the documentation here: http://msdn.microsoft.com/en-us/library/windowsazure/ee758454.aspx

Related

How to persist Saga instances using storage engines and avoid race condition

I tried persisting Saga Instances using RedisSagaRepository; I wanted to run Saga in load balancing setup, so I cannot use InMemorySagaRepository.
However, after I switched, I noticed that some of the events published by Consumers were not getting processed by Saga. I checked the queue and did not see any messages.
What I noticed is that it is likely to occur when the Consumer takes little to no time to process the command and publish the event.
This issue does not occur if I use InMemorySagaRepository or add Task.Delay() in Consumer.Consume().
Am I using it incorrectly?
Also, I want to run the Saga in a load-balanced setup, and the Saga needs to send multiple commands of the same type, using a dictionary to track completeness (similar logic to Handling transition to state for multiple events). When multiple Consumers publish events at the same time, would I have a race condition if two Saga instances process two different events at the same time? In that case, would the Dictionary in the State object be set correctly?
The code is available here
SagaService.ConfigureSagaEndPoint() is where I switch between InMemorySagaRepository and RedisSagaRepository
private void ConfigureSagaEndPoint(IRabbitMqReceiveEndpointConfigurator endpointConfigurator)
{
var stateMachine = new MySagaStateMachine();
try
{
var redisConnectionString = "192.168.99.100:6379";
var redis = ConnectionMultiplexer.Connect(redisConnectionString);
///If we switch to RedisSagaRepository and Consumer publish its response too quick,
///It seems like the consumer published event reached Saga instance before the state is updated
///When it happened, Saga will not process the response event because it is not in the "Processing" state
//var repository = new RedisSagaRepository<SagaState>(() => redis.GetDatabase());
var repository = new InMemorySagaRepository<SagaState>();
endpointConfigurator.StateMachineSaga(stateMachine, repository);
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
LeafConsumer.Consume is where we add the Task.Delay()
public class LeafConsumer : IConsumer<IConsumerRequest>
{
public async Task Consume(ConsumeContext<IConsumerRequest> context)
{
///If MySaga project is using RedisSagaRepository, uncomment await Task.Delay() below
///Otherwise, it seems that the Publish message from Consumer will not be processed
///If using InMemorySagaRepository, code will work without needing Task.Delay
///Maybe I am doing something wrong here with these projects
///Or in real life, we probably have code in Consumer that will take a few milliseconds to complete
///However, we cannot predict latency between Saga and Redis
//await Task.Delay(1000);
Console.WriteLine($"Consuming CorrelationId = {context.Message.CorrelationId}");
await context.Publish<IConsumerProcessed>(new
{
context.Message.CorrelationId,
});
}
}
When you have events published in this manner, and are using multiple service instances with a non-transactional saga repository (such as Redis), you need to design your saga such that a unique identifier is used and enforced by Redis. This prevents multiple instances of the same saga from being created.
You also need to accept the events in more than the "expected" state. For instance, expecting to receive a Start, which puts the saga into a processing state, before receiving another event only in processing, is likely to fail. Allowing the saga to be started (Initially, in Automatonymous) by any of the sequence of events is recommended, to avoid out-of-order message delivery issues. As long as the events all move the dial from the left to the right, the eventual state will be reached. If an earlier event is received after a later event, it shouldn't move the state backwards (or to the left, in this example) but only add information to the saga instance and leave it at the later state.
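The "only move the dial to the right" rule can be sketched in plain C#. This is not Automatonymous or MassTransit code: SagaStage, SagaInstanceSketch and Apply are hypothetical names chosen for illustration, and the three stages are an assumption about what a saga like this might track.

```csharp
using System;

// Hypothetical stages, ordered from left (earliest) to right (latest).
public enum SagaStage { Initial = 0, Processing = 1, Completed = 2 }

// Hypothetical saga instance that accepts events in any order.
public sealed class SagaInstanceSketch
{
    public SagaStage Stage { get; private set; } = SagaStage.Initial;

    // Stands in for "information added to the saga instance" by each event.
    public int EventsApplied { get; private set; }

    // Applies an event that corresponds to reaching 'stageReached'.
    // The event is always recorded, but the stage never moves backwards.
    public void Apply(SagaStage stageReached)
    {
        EventsApplied++;
        if (stageReached > Stage)
        {
            Stage = stageReached;
        }
    }
}
```

If the Completed event arrives before the Processing event, the instance ends up in Completed with both events recorded, which is the same final state as the in-order case.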
If two events are processed on separate service instances, they'll both try to insert the saga instance to Redis, which will fail as a duplicate. The message should then retry (add UseMessageRetry() to your receive endpoint), which would then pick up the now existing saga instance and apply the event.

With the retry options in durable functions, what happens after the last attempt?

I'm using a durable function that's triggered off a queue. I'm sending messages off the queue to a service that is pretty flaky, so I set up the RetryPolicy. Even still, I'd like to be able to see the failed messages even if the max retries has been exhausted.
Do I need to manually throw those to a dead-letter queue (and if so, it's not clear to me how I know when a message has been retried any number of times), or will the function naturally throw those to some kind of dead-letter/poison queue?
When an activity fails in Durable Functions, an exception is marshalled back to the orchestration with FunctionFailedException thrown. It doesn't matter whether you used automatic retry or not - at the very end, the whole activity fails and it's up to you to handle the situation. As per documentation:
try
{
await context.CallActivityAsync("CreditAccount",
new
{
Account = transferDetails.DestinationAccount,
Amount = transferDetails.Amount
});
}
catch (Exception)
{
// Refund the source account.
// Another try/catch could be used here based on the needs of the application.
await context.CallActivityAsync("CreditAccount",
new
{
Account = transferDetails.SourceAccount,
Amount = transferDetails.Amount
});
}
The only thing retry changes is how transient errors are handled (so you do not have to take the safe route each time you have, e.g., network issues).
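Since the final failure is yours to handle, one option is to park the message somewhere you can inspect it later. The sketch below shows that pattern with plain tasks; it is not the Durable Functions API. RetryWithDeadLetter, RunAsync and the deadLetters collection are hypothetical names, and the collection stands in for whatever poison or dead-letter queue you choose.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: retries a handler, then parks the message after the last attempt.
public static class RetryWithDeadLetter
{
    // Returns true if the handler succeeded, false if the message was parked.
    public static async Task<bool> RunAsync<T>(
        T message,
        Func<T, Task> handler,
        int maxAttempts,
        TimeSpan delayBetweenAttempts,
        ICollection<T> deadLetters)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                await handler(message);
                return true;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Treated as transient: wait and try again.
                await Task.Delay(delayBetweenAttempts);
            }
            catch (Exception)
            {
                // Last attempt failed: keep the message so it can be inspected later.
                deadLetters.Add(message);
                return false;
            }
        }

        return false; // Not reached; keeps the compiler satisfied.
    }
}
```

In an orchestrator, the equivalent place for the "park it" step would be the catch block shown in the documentation sample above.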

Run Web Job in parallel

We have a series of 4 Service Bus queues, each queue has a web job that processes messages and passes it on to the next queue. Though we're running on a single core, each webjob is async and allows the other jobs to continue while it queries a database or endpoint.
We have set MaxConcurrentCalls = 3 in the ServiceBusConfiguration.
However, now all the messages are in the final queue, it's not spinning up multiple instances of the final Web Job to process them faster and instead executing synchronously. I'd like to know how to configure my Web Jobs to run the same web job in parallel.
I notice this article from 2014 which suggests we have to implement our own parallel processing but more recent articles contradict this information saying it is supported OOTB.
Scaling out to multiple instances is available only for continuous WebJobs.
That setting determines whether the program or script runs on all instances or just one instance.
The option to run on multiple instances doesn't apply to the Free or Shared pricing tiers.
In your webjob, you will find an instance of the JobHostConfiguration object. This object is used to configure the properties of your webjob.
Here is a configuration:
static void Main()
{
var config = new JobHostConfiguration();
config.UseTimers();
config.Queues.MaxDequeueCount = 2;
config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(4);
config.Queues.BatchSize = 2;
var host = new JobHost(config);
host.RunAndBlock();
}
So let's break the items down one by one:
config.UseTimers();
config.UseTimers(); allows us to use a timer trigger in our functions.
config.Queues.MaxDequeueCount = 2;
MaxDequeueCount is the number of times your function will try to process a message if it errors out.
config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(4);
MaxPollingInterval is the longest interval the WebJob will wait between checks of the queue.
If the default is not desirable, you can change this setting as I have above so that the WebJob checks the queue at least every 4 seconds.
config.Queues.BatchSize = 2;
The BatchSize property is the number of items your WebJob will process at the same time. The items will be processed asynchronously.
So if there are 2 items in the queue, they will be processed in parallel. If you set this to 1, you are creating a synchronous flow, as it will only take one item out of the queue at a time.
For more detail, you could refer to this article to learn how to run a WebJob in parallel.
Update:
The method BeginReceiveBatch/EndReceiveBatch allows you to retrieve multiple "items" from Queue (Async) and then use AsParallel to convert the IEnumerable returned by the previous methods and process the messages in multiple threads.
var messages = await Task.Factory.FromAsync<IEnumerable<BrokeredMessage>>(Client.BeginReceiveBatch(3, null, null), Client.EndReceiveBatch);
messages.AsParallel().WithDegreeOfParallelism(3).ForAll(item =>
{
ProcessMessage(item);
});
That code retrieves 3 messages from the queue and processes them in "3 threads" (Note: it is not guaranteed that it will use 3 threads; .NET will analyze the system resources and use up to 3 threads if necessary.)
For more details, you could refer to this case.
By setting ServiceBusConfiguration.PrefetchCount and ServiceBusConfiguration.MessageOptions.MaxConcurrentCalls, I have been able to see that a single webjob will dequeue multiple messages and process them in parallel.
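To illustrate what a concurrency cap like MaxConcurrentCalls means for async handlers, here is a small sketch using only the base class library. It is not how the WebJobs SDK implements the setting: BoundedProcessor and ProcessAllAsync are hypothetical names, and the handler stands in for your message-processing function.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs async handlers with at most N in flight at once.
public static class BoundedProcessor
{
    public static async Task ProcessAllAsync<T>(
        IEnumerable<T> messages,
        Func<T, Task> handler,
        int maxConcurrentCalls)
    {
        using (var gate = new SemaphoreSlim(maxConcurrentCalls))
        {
            // ToList() starts every wrapper task; each one waits at the gate for its turn.
            var tasks = messages.Select(async message =>
            {
                await gate.WaitAsync();
                try
                {
                    await handler(message);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            await Task.WhenAll(tasks);
        }
    }
}
```

Because the handlers are awaited rather than blocked on, this kind of overlap works even on a single core, as long as the handler spends most of its time waiting on a database or endpoint.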

Setup webjob ServiceBusTriggers or queue names at runtime (without hard-coded attributes)?

Is there any way to configure triggers without attributes? I cannot know the queue names ahead of time.
Let me explain my scenario here.. I have one service bus queue, and for various reasons (complicated duplicate-suppression business logic), the queue messages have to be processed one at a time, so I have ServiceBusConfiguration.OnMessageOptions.MaxConcurrentCalls set to 1. So processing a message holds up the whole queue until it is finished. Needless to say, this is suboptimal.
This 'one at a time' policy isn't so simple. The messages could be processed in parallel, they just have to be divided into groups (based on a field in message), say A and B. Group A can process its messages one at a time, and group B can process its own one at a time, etc. A and B are processed in parallel, all is good.
So I can create a queue for each group, A, B, C, ... etc. There are about 50 groups, so 50 queues.
I can create a queue for each, but how to make this work with the Azure Webjobs SDK? I don't want to copy-paste a method for each queue with a different ServiceBusTrigger for the SDK to discover, just to enforce one-at-a-time per queue/group, then update the code with another copy-paste whenever another group is needed. Fetching a list of queues at startup and tying to the function is preferable.
I have looked around and I don't see any way to do what I want. The ITypeLocator interface is pretty hard-set to look for attributes. I could probably abuse the INameResolver, but it seems like I'd still have to have a bunch of near-duplicate methods around. Could I somehow create what the SDK is looking for at startup/runtime?
(To be clear, I know how to use INameResolver to get queue name as at How to set Azure WebJob queue name at runtime? but though similar this isn't my problem. I want to setup triggers for multiple queues at startup for the same function to get the one-at-a-time per queue processing, without using the trigger attribute 50 times repeatedly. I figured I'd ask again since the SDK repo is fairly active and it's been a year..).
Or am I going about this all wrong? Being dumb? Missing something? Any advice on this dilemma would be welcome.
The Azure WebJob host discovers and indexes the functions with the ServiceBusTrigger attribute when it starts, so there is no way to set up the queues to trigger at runtime.
The simpler solution for you is to create a long-running job and implement it manually:
public class Program
{
    private static void Main()
    {
        var host = new JobHost();
        host.CallAsync(typeof(Program).GetMethod("Process"));
        host.RunAndBlock();
    }

    [NoAutomaticTrigger]
    public static async Task Process(TextWriter log, CancellationToken token)
    {
        var connectionString = "myconnectionstring";
        // You can also get the queue names from app settings or an Azure table
        var queueNames = new[] { "queueA", "queueB" };
        var messagingFactory = MessagingFactory.CreateFromConnectionString(connectionString);
        foreach (var queueName in queueNames)
        {
            var receiver = messagingFactory.CreateMessageReceiver(queueName);
            receiver.OnMessage(message =>
            {
                try
                {
                    // do something
                    // ...
                    // Complete the message
                    message.Complete();
                }
                catch (Exception ex)
                {
                    // Log the error
                    log.WriteLine(ex.ToString());
                    // Abandon the message so that it can be retried
                    message.Abandon();
                }
            }, new OnMessageOptions() { MaxConcurrentCalls = 1 });
        }
        // Wait until the job stops or restarts
        await Task.Delay(Timeout.InfiniteTimeSpan, token);
    }
}
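The underlying idea, one at a time within a group and groups in parallel, can also be sketched without Service Bus at all. This is only an illustration under assumptions: GroupSerialDispatcher and Enqueue are hypothetical names, the group key would come from the field in your message, and the dictionary of groups is never trimmed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper: work items in the same group run one after another,
// while different groups run independently of each other.
public sealed class GroupSerialDispatcher
{
    private readonly object _sync = new object();

    // Last scheduled task per group; new work is chained behind it.
    private readonly Dictionary<string, Task> _tails = new Dictionary<string, Task>();

    public Task Enqueue(string group, Func<Task> work)
    {
        lock (_sync)
        {
            Task previous;
            if (!_tails.TryGetValue(group, out previous))
            {
                previous = Task.CompletedTask;
            }

            Task next = RunAfter(previous, work);
            _tails[group] = next;
            return next;
        }
    }

    private static async Task RunAfter(Task previous, Func<Task> work)
    {
        // Yield first so that no work runs while the caller still holds the lock.
        await Task.Yield();

        try
        {
            await previous.ConfigureAwait(false);
        }
        catch
        {
            // A failure in earlier work must not block the rest of the group.
        }

        await work().ConfigureAwait(false);
    }
}
```

A single receive loop could then call Enqueue with the group taken from each message, instead of needing one trigger method per queue.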
Otherwise, if you don't want to deal with multiple queues, you can have a look at azure servicebus topic/subscription and create SqlFilter to send your message to the right subscription.
Another option could be to create your own trigger: The azure webjob SDK provides extensibility points to create your own trigger binding :
Binding Extensions Overview
Good Luck !
Based on my understanding, you need to build a message batch system that runs in parallel. @Thomas's solution is good, but I think the Azure Batch service with Table storage may be better and could replace the more complex solution of Service Bus queues plus WebJobs with a trigger.
Using Azure Batch with Table storage, you can control task creation, execute tasks in parallel and at scale, and even monitor those tasks. Please refer to the tutorial to learn how.

How to guarantee azure queue FIFO

I understand that the MS Azure Queue service document http://msdn.microsoft.com/en-us/library/windowsazure/dd179363.aspx says first-in, first-out (FIFO) behavior is not guaranteed.
However, our application is such that ALL the messages have to be read and processed in FIFO order. Could anyone please suggest how to achieve a guaranteed FIFO using Azure Queue Service?
Thank you.
The docs say for Azure Storage queues that:
Messages in Storage queues are typically first-in-first-out, but sometimes they can be out of order; for example, when a message's visibility timeout duration expires (for example, as a result of a client application crashing during processing). When the visibility timeout expires, the message becomes visible again on the queue for another worker to dequeue it. At that point, the newly visible message might be placed in the queue (to be dequeued again) after a message that was originally enqueued after it.
Maybe that is good enough for you? Else use Service bus.
The latest Service Bus release offers reliable messaging queuing: Queues, topics and subscriptions
Adding to @RichBower's answer, check out Azure Storage Queues vs. Azure Service Bus Queues:
MSDN (link retired)
http://msdn.microsoft.com/en-us/library/windowsazure/hh767287.aspx
learn.microsoft.com
https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-azure-and-service-bus-queues-compared-contrasted
Unfortunately, many answers point to Service Bus queues, but I assume from the tags that the question is about Storage queues. In Azure Storage queues, FIFO is not guaranteed, whereas in Service Bus, FIFO message ordering is guaranteed, and even then only through the use of a concept called Sessions.
A simple scenario: if one consumer receives a message from the queue, that message is not visible to you as the second receiver. So you assume the message you received is actually the first message (which is where FIFO fails :P).
Consider using Service Bus if this does not meet your requirement.
I don't know how fast you want to process the messages, but if you need real FIFO, don't allow the Azure queue to fetch more than one message at a time.
Use this in your "program.cs", at the top of the function:
static void Main()
{
    var config = new JobHostConfiguration();
    if (config.IsDevelopment)
    {
        config.UseDevelopmentSettings();
    }
    config.Queues.BatchSize = 1; // Number of messages to dequeue at the same time.
    config.Queues.MaxPollingInterval = TimeSpan.FromMilliseconds(100); // Polling interval for the queue.
    JobHost host = new JobHost(config);
    // ....your initial information...
    // The following code ensures that the WebJob will be running continuously
    host.RunAndBlock();
}
This will get one message at a time with a wait period of 100 milliseconds.
This is working perfectly with a logger WebJob that writes the trace information to files.
As mentioned here https://www.jayway.com/2013/12/20/message-ordering-on-windows-azure-service-bus-queues/ ordering is not guaranteed in Service Bus either, except when using receive-and-delete mode, which is risky.
You just need to follow the steps below to ensure message ordering:
1) Create a Queue with session enabled=false.
2) When saving a message to the queue, provide the session id as below:
var message = new BrokeredMessage(item);
message.SessionId = "LB";
Console.WriteLine("Response from Central Scoring System : " + item);
client.Send(message);
3) When creating the receiver for receiving messages:
queueClient.OnMessage(s =>
{
var body = s.GetBody<string>();
var messageId = s.MessageId;
Console.WriteLine("Message Body:" + body);
Console.WriteLine("Message Id:" + messageId);
});
4) While having the same session id, it would automatically ensure order and give the ordered message.
Thanks!!

Resources