Waiting for an Azure Functions durable orchestration to complete

I'm currently working on a project where I use a storage queue to pick up items for processing. A Storage Queue triggered function picks up each item from the queue and starts a durable orchestration. According to the documentation, the queue trigger picks up 16 messages (by default) in parallel for processing (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue). But since the function only starts the orchestration (a simple and quick step), a large backlog in the queue means I end up with a lot of orchestrations running at the same time. To avoid overloading my systems, I would like to start the orchestration and wait for it to complete before the next batch of messages is picked up for processing. The solution I came up with, which seems to work, is:
public class QueueTrigger
{
    [FunctionName(nameof(QueueTrigger))]
    public async Task Run(
        [QueueTrigger("queue-processing-test", Connection = "AzureWebJobsStorage")] Activity activity,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        log.LogInformation($"C# Queue trigger function processed: {activity.ActivityId}");
        string instanceId = await starter.StartNewAsync<Activity>(nameof(ActivityProcessingOrchestrator), activity);
        log.LogInformation($"Started orchestration with ID = '{instanceId}'.");
        var status = await starter.GetStatusAsync(instanceId);
        // Keep checking the status until the orchestration leaves the Pending/Running states.
        do
        {
            status = await starter.GetStatusAsync(instanceId);
        } while (status.RuntimeStatus == OrchestrationRuntimeStatus.Running || status.RuntimeStatus == OrchestrationRuntimeStatus.Pending);
    }
}
This picks up the message, starts the orchestration, and then waits in a do/while loop while the status is Pending or Running.
Am I missing something here, or is there a better way of doing this? (I could not find much online.)
Thanks in advance for your comments or suggestions!

This might not work: you could either hit timeouts, causing duplicate orchestration runs, or just force your function app to scale out, defeating the purpose of your code altogether.
Instead, you could rely on the concurrency throttles that Durable Functions comes with. The queue trigger would still queue up orchestration runs, but only the defined maximum would run at any time on a single instance of the function app.
This would still cause your function app to scale out, so you would have to consider that when setting this limit. You could also set the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting to control how many instances your function app can scale out to.
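As a rough sketch of what those throttles can look like, the fragment below sets them in host.json, assuming the version 2.0 host.json layout and the Durable Functions extension. The numeric values are illustrative assumptions, not recommendations. The queues section is an optional addition (not part of the answer above) that limits how many messages each instance dequeues at once.

```json
{
  "version": "2.0",
  "extensions": {
    "queues": {
      "batchSize": 4,
      "newBatchThreshold": 2
    },
    "durableTask": {
      "maxConcurrentOrchestratorFunctions": 5,
      "maxConcurrentActivityFunctions": 10
    }
  }
}
```

These limits apply per instance, so the effective total is roughly the per-instance limit multiplied by the number of instances the app scales out to.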

It could be that the Function app's built-in scale throttling does not reduce load on downstream services, because it is per app and will just cause the app to scale out more. What is needed then is a distributed maximum instance count that all app instances adhere to. I have built this functionality into my Durable Functions orchestration app with a scaleGroupId and its maximum instance count. An API call saves this information, and the scaleGroupId is a string that can be set to anything that describes the resource you want to protect from overloading. Here is my app that can do this:
Microflow

Related

Run Web Job in parallel

We have a series of 4 Service Bus queues, each queue has a web job that processes messages and passes it on to the next queue. Though we're running on a single core, each webjob is async and allows the other jobs to continue while it queries a database or endpoint.
We have set MaxConcurrentCalls = 3 in the ServiceBusConfiguration.
However, now that all the messages are in the final queue, it isn't spinning up multiple instances of the final WebJob to process them faster; it is executing them synchronously instead. I'd like to know how to configure my WebJobs to run the same WebJob in parallel.
I noticed this article from 2014, which suggests we have to implement our own parallel processing, but more recent articles contradict it and say this is supported out of the box.
Scaling out to multiple instances is only available for continuous WebJobs.
For those, a setting determines whether the program or script runs on all instances or on just one instance.
The option to run on multiple instances doesn't apply to the Free or Shared pricing tiers.
In your webjob, you will find an instance of the JobHostConfiguration object. This object is used to configure the properties of your webjob.
Here is a configuration:
static void Main()
{
    var config = new JobHostConfiguration();
    config.UseTimers();
    config.Queues.MaxDequeueCount = 2;
    config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(4);
    config.Queues.BatchSize = 2;
    var host = new JobHost(config);
    host.RunAndBlock();
}
So let's break the configuration down piece by piece:
config.UseTimers();
config.UseTimers() allows us to use a timer trigger in our functions.
config.Queues.MaxDequeueCount = 2;
MaxDequeueCount is the number of times your function will try to process a message if it errors out.
config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(4);
MaxPollingInterval is the maximum amount of time the WebJob waits between checks of the queue.
If the default is not desirable, you can change this setting as I have above, so that the WebJob checks the queue at least every 4 seconds.
config.Queues.BatchSize = 2;
The BatchSize property is the number of items your WebJob will process at the same time. The items will be processed asynchronously.
So if there are 2 items in the queue, they will be processed in parallel. If you set this to 1, you are creating a synchronous flow, as it will only take one item out of the queue at a time.
For more detail, you could refer to this article on running WebJobs in parallel.
Update:
The BeginReceiveBatch/EndReceiveBatch methods let you retrieve multiple items from the queue asynchronously. You can then call AsParallel on the returned IEnumerable to process the messages on multiple threads.
var messages = await Task.Factory.FromAsync<IEnumerable<BrokeredMessage>>(Client.BeginReceiveBatch(3, null, null), Client.EndReceiveBatch);
messages.AsParallel().WithDegreeOfParallelism(3).ForAll(item =>
{
    ProcessMessage(item);
});
That code retrieves 3 messages from the queue and processes them in "3 threads". (Note: it is not guaranteed to use 3 threads; .NET will analyze the system resources and use up to 3 threads if necessary.)
For more details, you could refer to this case.
By setting ServiceBusConfiguration.PrefetchCount and ServiceBusConfiguration.MessageOptions.MaxConcurrentCalls, I have been able to see that a single webjob will dequeue multiple messages and process them in parallel.
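A minimal sketch of that configuration, assuming the WebJobs SDK 2.x with the Microsoft.Azure.WebJobs.ServiceBus package. The values 16 and 8 are illustrative assumptions, not settings taken from the post above.

```csharp
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.ServiceBus;

public class Program
{
    public static void Main()
    {
        var config = new JobHostConfiguration();

        var serviceBusConfig = new ServiceBusConfiguration
        {
            // Number of messages fetched ahead of processing (illustrative value).
            PrefetchCount = 16
        };
        // Upper bound on messages one host processes concurrently (illustrative value).
        serviceBusConfig.MessageOptions.MaxConcurrentCalls = 8;

        // Register the Service Bus extension with these settings.
        config.UseServiceBus(serviceBusConfig);

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}
```

With MaxConcurrentCalls above 1, a single WebJob instance can work on several messages at once without scaling out to more instances.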

DurableOrchestrationClient.GetStatusAsync() always null for Durable Azure Function

I have a queue-triggered Azure Function with a DurableOrchestrationClient. I am able to start a new execution of my orchestration function, which triggers multiple ActivityTrigger functions and waits for them all to finish. Everything works great.
My issue is that I am unable to check the status of my orchestration function ("TestFunction"). GetStatusAsync always returns null. I need to know when the orchestration function is actually complete so that I can process the return object (a bool).
public static async void Run(
    [QueueTrigger("photostodownload", Connection = "QueueStorage")] PhotoInfo photoInfo,
    [OrchestrationClient] DurableOrchestrationClient starter,
    TraceWriter log)
{
    var x = await starter.StartNewAsync("TestFunction", photoInfo);
    Thread.Sleep(2 * 1000);
    var y = await starter.GetStatusAsync(x);
}
StartNewAsync enqueues a message into the control queue; it doesn't mean that the orchestration starts immediately.
GetStatusAsync returns null if the instance either doesn't exist or has not yet started running. So the orchestration has probably just not started yet during the 2 seconds of sleep that you have.
Rather than using a fixed wait, you should either poll the status of your orchestration periodically or send something like a Done event from the orchestration as the last step of the workflow.
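The periodic-polling option could be sketched roughly as below. StatusPoller and PollUntilAsync are hypothetical names introduced for illustration, not part of the Durable Functions API. The helper is generic so that it does not depend on any Azure package, and the commented usage assumes the starter and instance ID from the question.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StatusPoller
{
    // Calls getStatus repeatedly until isDone returns true or the timeout elapses.
    public static async Task<T> PollUntilAsync<T>(
        Func<Task<T>> getStatus,
        Func<T, bool> isDone,
        TimeSpan interval,
        TimeSpan timeout,
        CancellationToken token = default(CancellationToken))
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            T status = await getStatus();
            if (isDone(status))
            {
                return status;
            }
            if (DateTime.UtcNow >= deadline)
            {
                throw new TimeoutException("Status did not reach a final state in time.");
            }
            // Pause between checks instead of spinning in a tight loop.
            await Task.Delay(interval, token);
        }
    }
}

// Hypothetical usage inside the queue-triggered function (x is the instance ID):
// var status = await StatusPoller.PollUntilAsync(
//     () => starter.GetStatusAsync(x),
//     s => s != null
//          && s.RuntimeStatus != OrchestrationRuntimeStatus.Pending
//          && s.RuntimeStatus != OrchestrationRuntimeStatus.Running
//          && s.RuntimeStatus != OrchestrationRuntimeStatus.ContinuedAsNew,
//     TimeSpan.FromSeconds(2),
//     TimeSpan.FromMinutes(5));
```

Treating a null status as "not done yet" covers the window in which the orchestration has been enqueued but has not started. The polling function is still subject to its own execution timeout, so the timeout passed here should stay well below it.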
Are you using function 1.0 or 2.0? A similar issue has been reported for Function 2.0 runtime on Github.
https://github.com/Azure/azure-functions-durable-extension/issues/126
Also when you say everything works great do you mean activityTrigger functions complete execution?
Are you running functions locally or is it deployed on Azure?

How to get runtime status of queue triggered azure function?

My Azure Function calculates the results of certain request jobs (approx. 5 s to 5 min each), where each job has a unique jobId based on the hash of the request message. Execution leads to deterministic results, so it is functionally a "pure function". We therefore cache the results of already evaluated jobs in blob storage, keyed by the jobId. All great so far.
Now, when a request for a jobId comes in, three scenarios are possible:
Result is in the cache already => then it is served from the cache.
Result is not in the cache and no function is running the evaluation => new invocation
Result is not in the cache, but some function is already working on it => wait for result
We do some custom, Table storage based progress-tracking magic to tell whether a function is already working on a given jobId.
It works somewhat, up to the point of "5 x restart -> poison queue" scenarios. There we are quite hopeless.
I feel like we are hacking around a feature that is already reliably implemented in the Azure Functions internals, because exactly the same information can be seen on the monitor page in the Azure portal and used to be visible on the Kudu WebJobs monitor page.
How to reliably find out in c# if a given message (jobId) is currently being processed by some function and when it is not?
Azure Durable Functions provides a mechanism to track the progress of execution of smaller tasks.
https://learn.microsoft.com/en-us/azure/azure-functions/durable-functions-overview
According to "Pattern #3: Async HTTP APIs", the orchestrator can provide information about the function status in a form like this:
{"runtimeStatus":"Running","lastUpdatedTime":"2017-03-16T21:20:47Z", ...}
This solves my problem about finding if given message is being processed.
How to reliably find out in c# if a given message (jobId) is currently being processed by some function and when it is not?
If you’d like to detect which message is being processed and get the message ID in queue triggered Azure function, you can try the following code:
#r "Microsoft.WindowsAzure.Storage"
using System;
using Microsoft.WindowsAzure.Storage.Queue;

public static void Run(CloudQueueMessage myQueueItem, TraceWriter log)
{
    log.Info($"messageid: {myQueueItem.Id}, messagebody: {myQueueItem.AsString}");
}

WebJob QueueTrigger - limit per function

I've got a few WebJobs in place, each of which responds to a number of QueueTriggers, e.g.
public static void ProcessMessage([QueueTrigger("XXXXXXX")] string message, TextWriter log)
{
    //processing message
}
public static void ProcessMessage([QueueTrigger("YYYYYY")] string message, TextWriter log)
{
    //processing message
}
Should I be separating out each trigger into a separate job? Are there any reasons why continuing on this path is a bad idea, i.e. the more queues it can trigger on, the fewer functions get executed due to thread limits?
What you are doing is the common approach - the WebJobs SDK JobHost is designed to handle many different job functions all within the same application. It is true that all the job functions within a single host will share the same process/memory space and limits, but for most scenarios this isn't a problem and is the recommended approach.
For QueueTrigger specifically, each of your functions will efficiently poll for new work, and when work is available each will pull messages in batches of 16 (configurable via JobHostConfiguration.Queues) and process them in parallel.
You can also scale out if needed by increasing the number of instances your WebJob runs on. Each instance will then work cooperatively with the others to handle more load.

Setup webjob ServiceBusTriggers or queue names at runtime (without hard-coded attributes)?

Is there any way to configure triggers without attributes? I cannot know the queue names ahead of time.
Let me explain my scenario here.. I have one service bus queue, and for various reasons (complicated duplicate-suppression business logic), the queue messages have to be processed one at a time, so I have ServiceBusConfiguration.OnMessageOptions.MaxConcurrentCalls set to 1. So processing a message holds up the whole queue until it is finished. Needless to say, this is suboptimal.
This 'one at a time' policy isn't so simple. The messages could be processed in parallel, they just have to be divided into groups (based on a field in message), say A and B. Group A can process its messages one at a time, and group B can process its own one at a time, etc. A and B are processed in parallel, all is good.
So I can create a queue for each group, A, B, C, ... etc. There are about 50 groups, so 50 queues.
I can create a queue for each, but how to make this work with the Azure Webjobs SDK? I don't want to copy-paste a method for each queue with a different ServiceBusTrigger for the SDK to discover, just to enforce one-at-a-time per queue/group, then update the code with another copy-paste whenever another group is needed. Fetching a list of queues at startup and tying to the function is preferable.
I have looked around and I don't see any way to do what I want. The ITypeLocator interface is pretty hard-set to look for attributes. I could probably abuse the INameResolver, but it seems like I'd still have to have a bunch of near-duplicate methods around. Could I somehow create what the SDK is looking for at startup/runtime?
(To be clear, I know how to use INameResolver to get queue name as at How to set Azure WebJob queue name at runtime? but though similar this isn't my problem. I want to setup triggers for multiple queues at startup for the same function to get the one-at-a-time per queue processing, without using the trigger attribute 50 times repeatedly. I figured I'd ask again since the SDK repo is fairly active and it's been a year..).
Or am I going about this all wrong? Being dumb? Missing something? Any advice on this dilemma would be welcome.
The Azure WebJobs host discovers and indexes the functions with the ServiceBusTrigger attribute when it starts, so there is no way to set up the queues to trigger on at runtime.
The simpler solution for you is to create a long-running job and implement it manually:
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.ServiceBus.Messaging;

public class Program
{
    private static void Main()
    {
        var host = new JobHost();
        // Start the long-running Process method, then keep the host alive.
        host.CallAsync(typeof(Program).GetMethod("Process"));
        host.RunAndBlock();
    }

    [NoAutomaticTriggerAttribute]
    public static async Task Process(TextWriter log, CancellationToken token)
    {
        var connectionString = "myconnectionstring";
        // You can also get the queue names from app settings or an Azure table.
        var queueNames = new[] { "queueA", "queueB" };
        var messagingFactory = MessagingFactory.CreateFromConnectionString(connectionString);
        foreach (var queueName in queueNames)
        {
            // One receiver per queue, each handling a single message at a time.
            var receiver = messagingFactory.CreateMessageReceiver(queueName);
            receiver.OnMessage(message =>
            {
                try
                {
                    // do something
                    // ....
                    // Complete the message
                    message.Complete();
                }
                catch (Exception ex)
                {
                    // Log the error
                    log.WriteLine(ex.ToString());
                    // Abandon the message so that it can be retried.
                    message.Abandon();
                }
            }, new OnMessageOptions() { MaxConcurrentCalls = 1 });
        }
        // Wait until the job is stopped or restarted.
        await Task.Delay(Timeout.InfiniteTimeSpan, token);
    }
}
Otherwise, if you don't want to deal with multiple queues, you can have a look at Azure Service Bus topics and subscriptions and create a SqlFilter per subscription so that each message is routed to the right subscription.
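A rough sketch of the topic/subscription idea, assuming the WindowsAzure.ServiceBus package (the same one that provides MessagingFactory) and an existing topic. The GroupSubscriptions class, its method names, and the GroupName message property are hypothetical names chosen for illustration.

```csharp
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

public static class GroupSubscriptions
{
    // Creates one subscription per group on an existing topic. Each subscription only
    // receives messages whose GroupName property equals its group name.
    // Group names are assumed to be trusted values (they are embedded in the filter text).
    public static void EnsureSubscriptions(string connectionString, string topicPath, string[] groups)
    {
        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
        foreach (var group in groups)
        {
            if (!namespaceManager.SubscriptionExists(topicPath, group))
            {
                var filter = new SqlFilter("GroupName = '" + group + "'");
                namespaceManager.CreateSubscription(topicPath, group, filter);
            }
        }
    }

    // Sender side: tag each message with its group so the filters can route it.
    public static BrokeredMessage CreateMessage(object body, string group)
    {
        var message = new BrokeredMessage(body);
        message.Properties["GroupName"] = group;
        return message;
    }
}
```

Each subscription would still need its own receiver with MaxConcurrentCalls = 1 to keep processing one at a time within a group.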
Another option could be to create your own trigger. The Azure WebJobs SDK provides extensibility points for creating your own trigger binding:
Binding Extensions Overview
Good Luck !
Based on my understanding, you seem to need a system that processes batches of messages in parallel. @Thomas's solution is good, but I think the Azure Batch service with Table storage may be a better fit and could replace the more complex combination of Service Bus queues and triggered WebJobs.
Using Azure Batch with Table storage, you can control task creation, execute tasks in parallel and at scale, and even monitor those tasks. Please refer to the tutorial to learn how.
