I have an application that executes rules using some rules engine. I have around 500+ rules and our application will receive around 10,000 entries and all of those 10,000 entries should go through those 500 rules for validation individually. We are currently planning to migrate all our rules into Azure functions. That is each Azure function will correspond to a single rule.
So I was trying a proof of concept and created couple of Azure functions with Task.delay to mock real rule validation. I need to execute all 500 rules for each and every request. So I created a Durable function as orchestrator calling all the 500 activity triggers (rules) using Task.WhenAll, but that didn't work. Azure functions got hung with just one request. So instead I created individual HttpTrigger function for the rules. From a C# class library I called all the 500 functions using Task.WhenAll for one request and it worked like a charm. However, when I tried calling azure functions for all entries (starting with 50) then Task.WhenAll started throwing errors saying a task got cancelled or TimeoutException for HTTP call. Below is the code:
var tasksForCallingAzureFunctions = new List<List<Task<HttpResponseMessage>>>();
Parallel.For(0, 50, (index) =>
{
var tasksForUser = new List<Task<HttpResponseMessage>>();
//call the mocked azure function 500 times
for (int j = 0; j < 500; j++)
{
tasksForUser.Add(client.SendAsync(CreateRequestObject()));
}
//add the tasks for each user into main list to call all the requests in parallel.
tasksForCallingAzureFunctions.Add(tasksForUser);
});
// call all the requests - 50 * 500 requests in parallel.
Parallel.ForEach(tasksForCallingAzureFunctions, (tasks) => Task.WhenAll(tasks).GetAwaiter().GetResult());
So my question is, Is this a recommended approach? I'm making 50 * 500 I/O calls. Requirement is all the entries (10000) should go through 500 rules.
Other option I thought was send all the entries to each Azure function, so that Azure function will have logic to loop through each entry and validate them. That way I only have to make 500 calls?
Please advise.
I would implement this scenario with queue-triggered Functions. Send a queue message per entity per rule, each function will pick one up, do its work and save the result to some storage. This should be more resilient than doing a gazillion sync HTTP requests.
Now, Durable Functions basically do the same behind the scenes, so you were on the right path. Maybe add a new question with minimal repro for your hanging problem.
Related
I have Azure durable function run by timer trigger, which runs another function (UploadActivity) that does some http call to the external to Azure REST service. We know for sure that small percentage of all UploadActivity invocations end up in http error and exception risen, the rest are exception-free and upload some data to the remote http resource. Interesting finding I got is that Azure Insight's 'requests' collection contains only failed requests, and no successful one recorded
// gives no results
requests
| where success == "True"
// gives no results
requests
| where success <> "False"
// gives results
requests
| where success == "False"
I can't realize why. Here are some attributes of one of returned request with success=='False' if it helps to find why
operation_Name:
UploadActivity
appName:
/subscriptions/1b3e7d9e-e73b-4061-bde1-628b728b43b7/resourcegroups/myazuretest-rg/providers/microsoft.insights/components/myazuretest-ai
sdkVersion:
azurefunctions: 4.0.1.16815
'resource' is defined in Azure as http call to http-triggered function, but I have no http triggered functions in my app which makes things even more confusing, I think maybe these requests belong to Azure Insights calls, that could be also built based on Azure Functions
For a timer triggered function it is normal that there are no records in the requests collection of Application Insights. If it would be an http triggered function you would have 1. Only the request that triggers the function is recorded as a request in Application Insights. A timer trigger does not respond to a request.
Once the function is triggered all http requests (and all kind of other communication like calls to service busses etc.) executed by that function will be recorded as a dependency in the dependencies collection. This is by design and is how Application Insight works.
Currently working on a project where I'm using the storage queue to pick up items for processing. The Storage Queue triggered function is picking up the item from the queue and starts a durable orchestration. Normally the according to the documentation the storage queue picks up 16 messages (by default) in parallel for processing (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue), but since the orchestration is just being started (simple and quick process), in case I have a lot of messages in the queue I will end up with a lot of orchestrations running at the same time. I would like to be able to start the orchestration and wait for it to complete before the next batch of messages are being picked up for processing in order to avoid overloading my systems. The solution I came up with and seems to work is:
public class QueueTrigger
{
[FunctionName(nameof(QueueTrigger))]
public async Task Run([QueueTrigger("queue-processing-test", Connection = "AzureWebJobsStorage")]Activity activity, [DurableClient] IDurableOrchestrationClient starter,
ILogger log)
{
log.LogInformation($"C# Queue trigger function processed: {activity.ActivityId}");
string instanceId = await starter.StartNewAsync<Activity>(nameof(ActivityProcessingOrchestrator), activity);
log.LogInformation($"Started orchestration with ID = '{instanceId}'.");
var status = await starter.GetStatusAsync(instanceId);
do
{
status = await starter.GetStatusAsync(instanceId);
} while (status.RuntimeStatus == OrchestrationRuntimeStatus.Running || status.RuntimeStatus == OrchestrationRuntimeStatus.Pending);
}
which basically picks up the message, starts the orchestration and then in a do/while loop waits while the staus is Pending or Running.
Am I missing something here or is there any better way of doing this (I could not find much online).
Thanks in advance your comments or suggestions!
This might not work since you could either hit timeouts causing duplicate orchestration runs or just force your function app to scale out defeating the purpose of your code all together.
Instead, you could rely on the concurrency throttles that Durable Functions come with. While the queue trigger would queue up orchestrations runs, only the max defined would run at any time on a single instance of a function.
This would still cause your function app to scale out, so you would have to consider that as well when setting this limit and you could also set the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting to control how many instances you function app can scale out to.
It could be that the Function app's built in scaling throttling does not reduce load on downstream services because it is per app and will just cause the app to scale more. Then what is needed is a distributed max instance count that all app instances adhere to. I have built this functionality into my Durable Function orchestration app with a scaleGroupId and it`s max instance count. It has an Api call to save this info and the scaleGroupId is a string that can be set to anything that describes the resource you want to protect from overloading. Here is my app that can do this:
Microflow
My azure function is calculating results of certain request jobs (cca. 5s-5min) where each job has unique jobId based on the hash of the request message. Execution leads to deterministic results. So it is functionally "pure function". Therefore we are caching results of already evaluated jobs in a blob storage based on the jobId. All great so far.
Now if a request for jobId comes three scenarios are possible.
Result is in the cache already => then it is served from the cache.
Result is not in the cache and no function is running the evaluation => new invocation
Result is not in the cache, but some function is already working on it => wait for result
We do some custom table storage based progress tracking magic to tell if function is working on given jobId or not yet.
It works somehow, up to the point of 5 x restart -> poison queue scenarios. There we are quite hopeless.
I feel like we are hacking around some of already reliably implemented feature of Azure Functions internals, because exactly the same info can be seen in the monitor page in azure portal or used to be visible in kudu webjobs monitor page.
How to reliably find out in c# if a given message (jobId) is currently being processed by some function and when it is not?
Azure Durable Functions provide a mechanism how to track progress of execution of smaller tasks.
https://learn.microsoft.com/en-us/azure/azure-functions/durable-functions-overview
Accroding to the "Pattern #3: Async HTTP APIs" the orchestrator can provide information about the function status in form like this:
{"runtimeStatus":"Running","lastUpdatedTime":"2017-03-16T21:20:47Z", ...}
This solves my problem about finding if given message is being processed.
How to reliably find out in c# if a given message (jobId) is currently being processed by some function and when it is not?
If you’d like to detect which message is being processed and get the message ID in queue triggered Azure function, you can try the following code:
#r "Microsoft.WindowsAzure.Storage"
using System;
using Microsoft.WindowsAzure.Storage.Queue;
public static void Run(CloudQueueMessage myQueueItem, TraceWriter log)
{
log.Info($"messageid: {myQueueItem.Id}, messagebody: {myQueueItem.AsString}");
}
Like the title describes - I have an Azure Function on the App Service Plan, configured for Always On and no functionTimeout set in my host.json, and it appears to timeout / not finish anytime after 30 minutes to 1 hour.(...but I feel this may be a false positive...)
The HTTP Triggered function can sometimes take over 1-2 hours to complete. I understand that this probably isn't the best design and according to the Azure Function Best Practices I should break this out into smaller / more manageable pieces - I get that. However, I expect the Function on the App Service plan to work as advertised - no hard limit on execution time. Perhaps this is the same question as Unexpected azure-function timeouts on app-service-plan, but that has no answer and I am using an HTTP Trigger instead.
Currently, the HTTP Triggered method does not return until the work is complete. (Is this a problem - the HTTP trigger needs to return quicker?)
According to the Kudu Function Invocation Logs, this case reports "Never Finished", and when I click on the Toggle Output button to view the logs, they never come in.
When I viewed this function's run in the Logs section of that trigger, it seems like the function just stopped, and the log stream just reports no new trace:
2017-07-26T16:36:43.116 [INFO] [Class1] Update operation started processing 790 sales records ...
2017-07-26T16:36:43.116 [DBUG] [Class2] Matching and updating ids from the map...
2017-07-26T16:38:07 No new trace in the past 1 min(s).
2017-07-26T16:39:07 No new trace in the past 2 min(s).
2017-07-26T16:40:07 No new trace in the past 3 min(s).
2017-07-26T16:41:07 No new trace in the past 4 min(s).
So not sure why this function just seemed to stop - or perhaps it stopped collecting log statements (there are many), and for some reason, the function never completed.
Any ideas?
Approx time: 2017-07-26T16:00:00 UTC
InvocationID: d856c107-f1ee-455a-892b-ed970dcad128 (I think?)
If it is indeed being timed out, is there any way for us to know, (Exception? App Insights? etc.)
Based on my test, I found azure function will not stop your function if you don't set the timeout.
Here is my test, I create a ManualTrigger function which will log the message every 10 minutes.
The codes like below:
public static void Run(string input, TraceWriter log)
{
for (int i = 0; i < 100; i++)
{
log.Info( "Worked " + i*10 + " minutes ");
Thread.Sleep(600000);
}
}
The log details:
In the log, you could find my function executed 70 minutes.It still works well.
The no trace means there are no new requests send to the azure function.
Currently, the HTTP Triggered method does not return until the work is complete. (Is this a problem - the HTTP trigger needs to return quicker?)
As Jesse Carter says, you couldn't execute long time function when you used HTTP Triggered method.
Since your client-side(send request) will have a timeout value. It will wait for the function's response.
Normally, if we want to execute long time function, I suggest you could use http trigger to get the request. In the http trigger function you could add a queue message to the azure storage queue.
Then you could write a queue trigger function which will execute the long time work.
If your HTTP method takes more than a minute, you should be offloading it to a Queue. Period. (I know the other answers have said this, but it's worth repeating).
Http connections are a limited resource.
While Azure Functions as an execution engine can handle long running
operations (as demonstrated by queue / service bus support), the
http pipeline may cut off / timeout long running requests.
Queue triggers can easily run for 30+ minutes. If your job is longer than that, you really should split it into multiple queue messages.
Also check out Durable Function support: https://github.com/Azure/azure-functions-durable-extension/
Regardless of the function app timeout setting, 230 seconds is the maximum amount of time that an HTTP triggered function can take to respond to a request. This is because of the default idle timeout of Azure Load Balancer. For longer processing times, consider using the Durable Functions async pattern or defer the actual work and return an immediate response.
Function app timeout duration: Check Notes
Is there any way to configure triggers without attributes? I cannot know the queue names ahead of time.
Let me explain my scenario here.. I have one service bus queue, and for various reasons (complicated duplicate-suppression business logic), the queue messages have to be processed one at a time, so I have ServiceBusConfiguration.OnMessageOptions.MaxConcurrentCalls set to 1. So processing a message holds up the whole queue until it is finished. Needless to say, this is suboptimal.
This 'one at a time' policy isn't so simple. The messages could be processed in parallel, they just have to be divided into groups (based on a field in message), say A and B. Group A can process its messages one at a time, and group B can process its own one at a time, etc. A and B are processed in parallel, all is good.
So I can create a queue for each group, A, B, C, ... etc. There are about 50 groups, so 50 queues.
I can create a queue for each, but how to make this work with the Azure Webjobs SDK? I don't want to copy-paste a method for each queue with a different ServiceBusTrigger for the SDK to discover, just to enforce one-at-a-time per queue/group, then update the code with another copy-paste whenever another group is needed. Fetching a list of queues at startup and tying to the function is preferable.
I have looked around and I don't see any way to do what I want. The ITypeLocator interface is pretty hard-set to look for attributes. I could probably abuse the INameResolver, but it seems like I'd still have to have a bunch of near-duplicate methods around. Could I somehow create what the SDK is looking for at startup/runtime?
(To be clear, I know how to use INameResolver to get queue name as at How to set Azure WebJob queue name at runtime? but though similar this isn't my problem. I want to setup triggers for multiple queues at startup for the same function to get the one-at-a-time per queue processing, without using the trigger attribute 50 times repeatedly. I figured I'd ask again since the SDK repo is fairly active and it's been a year..).
Or am I going about this all wrong? Being dumb? Missing something? Any advice on this dilemma would be welcome.
The Azure Webjob Host discovers and indexes the functions with the ServiceBusTrigger attribute when it starts. So there is no way to set up the queues to trigger at the runtime.
The simpler solution for you is to create a long time running job and implement it manually:
public class Program
{
private static void Main()
{
var host = new JobHost();
host.CallAsync(typeof(Program).GetMethod("Process"));
host.RunAndBlock();
}
[NoAutomaticTriggerAttribute]
public static async Task Process(TextWriter log, CancellationToken token)
{
var connectionString = "myconnectionstring";
// You can also get the queue name from app settings or azure table ??
var queueNames = new[] {"queueA", "queueA" };
var messagingFactory = MessagingFactory.CreateFromConnectionString(connectionString);
foreach (var queueName in queueNames)
{
var receiver = messagingFactory.CreateMessageReceiver(queueName);
receiver.OnMessage(message =>
{
try
{
// do something
....
// Complete the message
message.Complete();
}
catch (Exception ex)
{
// Log the error
log.WriteLine(ex.ToString());
// Abandon the message so that it can be retry.
message.Abandon();
}
}, new OnMessageOptions() { MaxConcurrentCalls = 1});
}
// await until the job stop or restart
await Task.Delay(Timeout.InfiniteTimeSpan, token);
}
}
Otherwise, if you don't want to deal with multiple queues, you can have a look at azure servicebus topic/subscription and create SqlFilter to send your message to the right subscription.
Another option could be to create your own trigger: The azure webjob SDK provides extensibility points to create your own trigger binding :
Binding Extensions Overview
Good Luck !
Based on my understanding, your needs seems to be building a message batch system in parallel. The #Thomas solution is good, but I think Azure Batch service with Table storage may be better and could be instead of the complex solution of ServiceBus queue + WebJobs with a trigger.
Using Azure Batch with Table storage, you can control the task creation and execute the task in parallel and at scale, even monitor these tasks, please refer to the tutorial to know how to.