I have an issue with an Azure Function Service Bus trigger.
The problem is that the function does not wait for one message to finish before processing the next one. It processes messages in parallel and does not wait 5 seconds before picking up the next message, but I need it to process them sequentially (as in the image below).
How can I do that?
[FunctionName("HttpStartSingle")]
public static void Run(
[ServiceBusTrigger("MyServiceBusQueue", Connection = "Connection")]string myQueueItem,
[OrchestrationClient] DurableOrchestrationClient starter,
ILogger log)
{
Console.WriteLine($"MessageId={myQueueItem}");
Thread.Sleep(5000);
}
I resolved my problem by using this configuration in my host.json:
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}
There are two approaches you can take to accomplish this:
(1) Use a Durable Function with function chaining (see the sketch after this list).
For background jobs you often need to ensure that only one instance of a particular orchestrator runs at a time. This can be done in Durable Functions by assigning a specific instance ID to an orchestrator when creating it.
(2) Partition the data based on the messages you are writing to the queue. The partitioning then takes care of message ordering automatically, so you don't have to handle it manually in the Azure Function.
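Here is a minimal sketch of approach (1); the fixed instance ID and the orchestrator name are purely illustrative placeholders and are not part of the original question:
[FunctionName("SequentialStarter")]
public static async Task RunSingleton(
    [ServiceBusTrigger("MyServiceBusQueue", Connection = "Connection")] string myQueueItem,
    [OrchestrationClient] DurableOrchestrationClient starter,
    ILogger log)
{
    // Hypothetical fixed instance ID used as a singleton lock.
    const string instanceId = "MySingletonOrchestration";

    // Only start a new orchestration if no instance with this ID is still active.
    var existing = await starter.GetStatusAsync(instanceId);
    if (existing == null
        || existing.RuntimeStatus == OrchestrationRuntimeStatus.Completed
        || existing.RuntimeStatus == OrchestrationRuntimeStatus.Failed
        || existing.RuntimeStatus == OrchestrationRuntimeStatus.Terminated)
    {
        // "ProcessMessageOrchestrator" is a placeholder orchestrator name.
        await starter.StartNewAsync("ProcessMessageOrchestrator", instanceId, myQueueItem);
        log.LogInformation($"Started orchestration with ID = '{instanceId}'.");
    }
    else
    {
        log.LogInformation($"Orchestration '{instanceId}' is already running; not starting another.");
    }
}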
In general, ordered messaging is not something I'd strive to implement, since the order can, and at some point will, be distorted. That said, in some scenarios it's required. For that, you should either use a Durable Function to orchestrate your messages or use Service Bus message sessions.
Azure Functions has recently added support for ordered message delivery (emphasis on the delivery part, as processing can still fail). It's almost the same as a normal Function, with the slight change that you need to instruct the SDK to utilize sessions.
public async Task Run(
    [ServiceBusTrigger("queue",
        Connection = "ServiceBusConnectionString",
        IsSessionsEnabled = true)] Message message, // Enable Sessions
    ILogger log)
{
    // message.Body is a byte[]; MessageId is already a string.
    log.LogInformation($"C# ServiceBus queue trigger function processed message: {Encoding.UTF8.GetString(message.Body)}");
    await _cosmosDbClient.Save(...);
}
Here's a post for more details.
Warning: using sessions will require messages to be sent with a session ID, potentially requiring a change on the sending side.
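For illustration, here's a minimal sketch of what the sending side could look like with the Microsoft.Azure.ServiceBus package; the helper, queue name, and session key are assumptions, not part of the original answer:
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

public static class OrderedSender
{
    // Hypothetical helper: the queue name and session key are placeholders.
    public static async Task SendAsync(string connectionString, string payload, string sessionId)
    {
        var client = new QueueClient(connectionString, "queue");

        var message = new Message(Encoding.UTF8.GetBytes(payload))
        {
            // Messages that share the same SessionId are delivered in order
            // to a single session receiver.
            SessionId = sessionId
        };

        await client.SendAsync(message);
        await client.CloseAsync();
    }
}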
Related
I'm currently working on a project where I'm using a storage queue to pick up items for processing. The storage-queue-triggered function picks up an item from the queue and starts a durable orchestration.
Normally, according to the documentation, the storage queue trigger picks up 16 messages (by default) in parallel for processing (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue). Since the orchestration is just being started (a simple and quick process), if I have a lot of messages in the queue I will end up with a lot of orchestrations running at the same time. I would like to start the orchestration and wait for it to complete before the next batch of messages is picked up for processing, in order to avoid overloading my systems.
The solution I came up with, which seems to work, is:
public class QueueTrigger
{
    [FunctionName(nameof(QueueTrigger))]
    public async Task Run(
        [QueueTrigger("queue-processing-test", Connection = "AzureWebJobsStorage")] Activity activity,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        log.LogInformation($"C# Queue trigger function processed: {activity.ActivityId}");
        string instanceId = await starter.StartNewAsync<Activity>(nameof(ActivityProcessingOrchestrator), activity);
        log.LogInformation($"Started orchestration with ID = '{instanceId}'.");

        var status = await starter.GetStatusAsync(instanceId);
        do
        {
            status = await starter.GetStatusAsync(instanceId);
        } while (status.RuntimeStatus == OrchestrationRuntimeStatus.Running || status.RuntimeStatus == OrchestrationRuntimeStatus.Pending);
    }
}
which basically picks up the message, starts the orchestration, and then waits in a do/while loop while the status is Pending or Running.
Am I missing something here, or is there a better way of doing this? (I could not find much online.)
Thanks in advance for your comments or suggestions!
This might not work, since you could either hit timeouts (causing duplicate orchestration runs) or simply force your function app to scale out, defeating the purpose of your code altogether.
Instead, you could rely on the concurrency throttles that Durable Functions come with. While the queue trigger would queue up orchestration runs, only the defined maximum would run at any time on a single instance of the function app.
This would still cause your function app to scale out, so you would have to consider that as well when setting this limit, and you could also set the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting to control how many instances your function app can scale out to.
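As a rough sketch, the Durable Functions throttles live under extensions.durableTask in host.json; the values below are placeholders, not recommendations:
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentOrchestratorFunctions": 1,
      "maxConcurrentActivityFunctions": 1
    }
  }
}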
It could be that the function app's built-in scaling throttling does not reduce the load on downstream services, because it is per app and will just cause the app to scale out further. What is needed then is a distributed maximum instance count that all app instances adhere to. I have built this functionality into my Durable Function orchestration app with a scaleGroupId and its max instance count. It has an API call to save this info, and the scaleGroupId is a string that can be set to anything that describes the resource you want to protect from overloading. Here is my app that can do this:
Microflow
We have an Azure Storage queue which triggers an Azure Function once a payload/message hits the queue. The queue-triggered function invokes another durable function to process the message/payload.
Here is the code snippet:
[FunctionName("QueueTriggerFunction")]
public Task QueueTriggerFunction(
[QueueTrigger("MyQueue", Connection = "MyStorage")]string item,
[OrchestrationClient] DurableOrchestrationClient client,
ILogger log)
=> client.StartNewAsync("Processor", JsonConvert.DeserializeObject<MyObject>(item));
And the durable function looks like the following code sample:
[FunctionName("Processor")]
public async Task ConcurrencyProcessorAsync(
[OrchestrationTrigger] DurableOrchestrationContext context,
ILogger log)
{
var myObject= context.GetInput<MyObject>();
if(ObjectProcessor(myObject) == false)
{
throw new Exception("Processor failed");
}
}
I'd like the payload to end up in the poison-message queue if the exception above is raised when ObjectProcessor fails, but in reality that's not happening, because the exception does not bubble up through the orchestration client. Any suggestions on how to make this exception propagate back to the calling queue-triggered function so that the payload lands in the poison-message queue?
You can't.
The QueueTriggerFunction just starts the orchestration. After that, its life cycle ends.
I believe you can directly add your payload to the poison queue using either the Azure Storage Services REST API or this .NET library.
Please note that the name of the poison queue is $"{queueName}-poison".
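For illustration, a minimal sketch of manually enqueuing the failed payload; this assumes the Azure.Storage.Queues package (not necessarily the library the answer links to), and the helper name is hypothetical:
using System.Threading.Tasks;
using Azure.Storage.Queues;

public static class PoisonQueueHelper
{
    // Hypothetical helper: manually move a failed payload to "{queueName}-poison".
    public static async Task SendToPoisonQueueAsync(string connectionString, string queueName, string payload)
    {
        var poisonQueue = new QueueClient(connectionString, $"{queueName}-poison");

        // The poison queue may not exist yet if nothing has ever been dead-lettered.
        await poisonQueue.CreateIfNotExistsAsync();

        // Note: depending on the runtime version, the queue trigger may expect Base64-encoded message bodies.
        await poisonQueue.SendMessageAsync(payload);
    }
}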
I'm using an Azure Function that sends an array of around 200 documents to Cosmos DB via the output binding. That function gets triggered roughly 1000 times at once by queue messages.
In some cases I get the "Request rate is large" error and the function execution fails. The documentation says that when this error occurs, I can retry the execution after a few milliseconds, but I suspect the Azure Functions runtime is doing that for me. I couldn't find any documentation explicitly saying that when the output binding throws that exception it will retry automatically (like with the .NET LINQ library).
Can someone point me to something that confirms whether this is the case?
The Output binding uses SDK 1.13.2 which already has the retry mechanism in place.
Assuming you are using Azure Functions v1: if you are using IAsyncCollector, the Function will do an UpsertDocumentAsync for each AddAsync; if you are using a single document output, then the UpsertDocumentAsync should happen once.
In any case, the SDK retries a throttled request 9 times by default; after that, the exception is bubbled up and your Function will error. The message should go back to the queue for retrying, as per the QueueTrigger design, and after a couple of iterations it goes to the dead-letter (poison) queue.
If you want more granular control of the flow, you could obtain the DocumentClient and do the UpsertDocumentAsync yourself with a try/catch. If it fails more than 9 times, you can opt to send the document to another queue or retry another set of times. Something like:
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;

[FunctionName("CosmosDBSample")]
public static async Task Run(
    [QueueTrigger("my-queue")] MyPOCOClass myMessage,
    [DocumentDB("test", "test", ConnectionStringSetting = "CosmosDB")] DocumentClient client,
    TraceWriter log)
{
    try
    {
        // UpsertDocumentAsync needs the collection link along with the document.
        await client.UpsertDocumentAsync(UriFactory.CreateDocumentCollectionUri("test", "test"), myMessage);
    }
    catch (DocumentClientException ex)
    {
        // retry / queue somewhere else?
        log.Warning($"DocumentClientException {ex.Message} in document {myMessage.Id}.");
    }
}
This is my first time using Azure Functions and Service Bus.
I'm trying to build a function app in Visual Studio, and I am able to connect to the queue topic (which I don't control). The problem is that every time I enable a breakpoint, tens of messages are processed at once in VS, which makes local testing very difficult (not to mention the problems arising from database pools).
How do I ensure that only one message gets processed at a time, until I complete it?
public static void Run([ServiceBusTrigger("xxx", "yyy", AccessRights.Manage)] BrokeredMessage msg, TraceWriter log)
{
    // do something here for one message at a time.
}
Set maxConcurrentCalls to 1 in the host.json. You can also find this in the host.json reference for Azure Functions:
maxConcurrentCalls (default: 16) — The maximum number of concurrent calls to the callback that the message pump should initiate. By default, the Functions runtime processes multiple messages concurrently. To direct the runtime to process only a single queue or topic message at a time, set maxConcurrentCalls to 1.
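For a Functions 1.x app, the setting sits directly under serviceBus in host.json; a minimal sketch, assuming the v1 schema:
{
  "serviceBus": {
    "maxConcurrentCalls": 1
  }
}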
For function apps 2.0 you need to update the host.json like this:
{
  "version": "2.0",
  ...
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}
Since the host.json file is usually published to Azure along with the function itself, it is preferable not to modify it for debugging and development purposes. Any host.json value can instead be overridden in local.settings.json.
Here is how to set maxConcurrentCalls to 1:
{
  "IsEncrypted": false,
  "Values": {
    ...
    "AzureFunctionsJobHost:Extensions:ServiceBus:MessageHandlerOptions:MaxConcurrentCalls": 1
  }
}
This override functionality is described here: https://learn.microsoft.com/en-us/azure/azure-functions/functions-host-json#override-hostjson-values
I've got a few WebJobs in place, each of which responds to a number of QueueTriggers, e.g.
public static void ProcessMessage([QueueTrigger("XXXXXXX")] string message, TextWriter log)
{
    // processing message
}

public static void ProcessMessage([QueueTrigger("YYYYYY")] string message, TextWriter log)
{
    // processing message
}
Should I be separating each trigger out into a separate job? Are there any reasons why continuing on this path is a bad idea, e.g. the more queues a job can trigger on, the fewer functions get executed due to thread limits?
What you are doing is the common approach - the WebJobs SDK JobHost is designed to handle many different job functions all within the same application. It is true that all the job functions within a single host will share the same process/memory space and limits, but for most scenarios this isn't a problem and is the recommended approach.
For QueueTrigger specifically, each of your functions will efficiently poll for new work, and when work is available each will pull messages in batches of 16 (configurable via JobHostConfiguration.Queues) and process them in parallel.
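For reference, here's a minimal sketch of tuning those queue settings when building the JobHost (WebJobs SDK 2.x; the values are only illustrative, not recommendations):
using System;
using Microsoft.Azure.WebJobs;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();

        // Number of queue messages fetched per poll (default 16, maximum 32).
        config.Queues.BatchSize = 16;

        // How many additional messages may be fetched before the current batch completes.
        config.Queues.NewBatchThreshold = 8;

        // Longest interval between polls when the queue is idle.
        config.Queues.MaxPollingInterval = TimeSpan.FromSeconds(30);

        var host = new JobHost(config);
        host.RunAndBlock();
    }
}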
You can also scale out if needed by increasing the number of instances your WebJob runs on. Each instance will then work cooperatively with the others to handle more load.