Durable Functions: Return Result from Orchestrator - azure

I have a use case which fits well with the durable functions sequence example: push a json payload through three functions, each of which modifies the json graph and forwards it to the next function.
In the sequence example the result of the sequence is retrieved by issuing a query to the orchestrator.
In my use case I want to directly return the result of the three functions, essentially as the response of the third function.
Is there a way to do this? Is it even wise?

This is certainly doable. You can start with an HTTP trigger to start the orchestration and use the GetStatusAsync API inside your function to poll and wait for it to complete. Once completed, you can return the result from your HTTP trigger.
Something like this, perhaps:
public static async Task<JObject> Run(JObject input, DurableOrchestrationClient client)
{
    string instanceId = await client.StartAsync("MyOrchestration", input);

    // Poll once per second, for up to 60 seconds.
    for (int i = 0; i < 60; i++)
    {
        var status = await client.GetStatusAsync(instanceId);
        if (status?.RuntimeStatus == "Completed")
        {
            return (JObject)status.Output;
        }

        // handle other status conditions, like failure
        await Task.Delay(TimeSpan.FromSeconds(1));
    }

    // handle timeouts; throwing keeps every code path returning or failing
    throw new TimeoutException($"Orchestration {instanceId} did not complete in time.");
}
As you can see from the code, the issue you'll have is dealing with error conditions. For example, what does your function do if the orchestration fails? Also, what if it takes a long time to finish? Those are things you can certainly figure out, but you'll want to code defensively to handle these cases.
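To make those defensive cases concrete, here is a minimal sketch of the same polling loop, written in JavaScript. It does not use the Durable Functions API: `getStatus` is a hypothetical async callback standing in for the status query, and the `runtimeStatus` and `output` property names and the status strings are assumptions for illustration.

```javascript
// Minimal sketch: poll a status source until it reports a terminal state.
// `getStatus` is a hypothetical async callback, not a Durable Functions API.
async function waitForResult(getStatus, { attempts = 60, delayMs = 1000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    const status = await getStatus();
    if (status && status.runtimeStatus === 'Completed') {
      return status.output;
    }
    if (status && (status.runtimeStatus === 'Failed' || status.runtimeStatus === 'Terminated')) {
      // Surface failures right away instead of polling until the timeout.
      throw new Error(`Orchestration ended with status ${status.runtimeStatus}`);
    }
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  // Still running after all attempts: tell the caller explicitly.
  throw new Error(`Timed out after ${attempts} polling attempts`);
}
```

The caller can then map the two error cases to different responses, for example a 500 for a failed orchestration and a 202 with the instance ID for one that is still running.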

Related

How to avoid conflicts when two tasks start at the same time in C# while using async/await

When we call an async method from another async method, some of the tasks sometimes conflict.
Suppose two EventReceiverAsync tasks run at the same time, so that getData(1) and getData(2) run in parallel.
If the getData(2) response arrives before the getData(1) response, objPayload ends up holding the getData(2) response even for task 1.
So there is sometimes a chance of a conflict.
private async Task EventReceiverAsync(string ID)
{
    Payload objPayload = await getData(ID);
}

private async Task<Payload> getData(string ID)
{
    Payload objOutput = await CallAPI("/api/getapidata", ID);
    return objOutput;
}
Is there any way to categorize output based on task through any Id or something?
"For example getData(1) and getData(2) run in parallel. If the getData(2) response is faster than getData(1), objPayload takes the getData(2) response even for task 1."
This is not possible with the code you posted. There are two invocations of EventReceiverAsync, each with its own objPayload variable. The objPayload for task 1 will receive the result of getData(1), and the objPayload for task 2 will receive the result of getData(2), regardless of completion order.
As long as your code returns its results, mixing results will not happen. However, if your code sets some shared variable as its "result", then that sharing will have to be managed by you to avoid mixing up the results.
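The difference between returned results and a shared variable can be sketched as follows. This is an illustration in JavaScript rather than C#, on the assumption that the reasoning about local versus shared state carries over; `fakeApi`, `receiveLocal`, `receiveShared` and `lastPayload` are hypothetical names, and the delays are artificial.

```javascript
// Minimal sketch: a result returned from an async function stays with the
// caller that awaited it; only shared state can get mixed up.
// `fakeApi` is a hypothetical stand-in for CallAPI with an artificial delay.
const fakeApi = (id, delayMs) =>
  new Promise((resolve) => setTimeout(() => resolve(`payload-${id}`), delayMs));

let lastPayload = null; // shared variable: written by every invocation

async function receiveLocal(id, delayMs) {
  const payload = await fakeApi(id, delayMs); // local to this invocation
  return { id, payload };
}

async function receiveShared(id, delayMs) {
  lastPayload = await fakeApi(id, delayMs);
  await new Promise((resolve) => setTimeout(resolve, 50)); // some other work
  return { id, payload: lastPayload }; // may see another invocation's value
}
```

With `receiveLocal`, each call gets the payload for its own id regardless of completion order. With `receiveShared`, a call that finishes its API request first can later read a value written by the other call, which is the kind of mixing the question describes.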

Order Process with Azure Durable Functions or not

I am creating an architecture to process orders from an ecommerce website that gets 10,000 or more orders every hour. We use an external third-party order fulfillment service, which has about 5 steps/APIs that we have to run and that depend on each other.
I was thinking of using a fan-out/fan-in approach with durable functions.
My plan
Once the order is created on our end, we store it in a table with an "order completed" flag.
Run a timer-triggered Azure function that runs the durable function orchestrator, which calls the activity functions for each step.
If it fails, the timer will pick up the order again until it is completed. But my question is: should we put the order on Service Bus and pick it up from there instead of using a timer trigger?
Because there can be more than 10,000 records each hour, the timer-triggered function would have to query for orders that are not completed and start the durable orchestrator 10,000 times in a loop. My first question: can I run the durable function in parallel for 10,000 records?
If I use a Service Bus trigger to start the durable orchestrator, it will automatically run the Azure function and the orchestrator 10,000 times in parallel, right? But in that case I will have to build a dead-letter queue function/process so that, if it fails, we are able to move the message back to the active topic.
Questions:
Are durable functions the correct approach, or is there a better and easier one?
If yes, is a timer trigger or a Service Bus trigger better for starting the orchestrator function?
Can I run the durable function orchestrator in parallel from a timer-triggered Azure function? I am not talking about calling the activity functions in parallel; those cannot run in parallel because the output of one is the input of the next.
This use case fits function chaining. It can be done as follows:
Have the ordering system put a message on a queue (Storage or Service Bus).
Create an Azure function with a Storage queue trigger or a Service Bus trigger. This is also the client function that starts the orchestration function.
Create an orchestration function that invokes the five step APIs, one activity function for each (similar to the function chaining example).
Create five activity functions, one for each API.
Ordering system
var clientOptions = new ServiceBusClientOptions
{
    TransportType = ServiceBusTransportType.AmqpWebSockets
};
//TODO: Replace the "<NAMESPACE-NAME>" and "<QUEUE-NAME>" placeholders.
client = new ServiceBusClient(
    "<NAMESPACE-NAME>.servicebus.windows.net",
    new DefaultAzureCredential(),
    clientOptions);
sender = client.CreateSender("<QUEUE-NAME>");
var message = new ServiceBusMessage($"{orderId}");
await sender.SendMessageAsync(message);
Client function
public static class OrderFulfilment
{
    [FunctionName("OrderFulfilment")]
    public static async Task<string> Run(
        [ServiceBusTrigger("<QUEUE-NAME>", Connection = "ServiceBusConnection")] string orderId,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger logger)
    {
        logger.LogInformation(orderId);
        // The explicit <string> selects the overload that passes orderId as the
        // orchestration input rather than as the instance ID.
        return await starter.StartNewAsync<string>("ChainedApiCalls", orderId);
    }
}
Orchestration function
[FunctionName("ChainedApiCalls")]
public static async Task<object> Run([OrchestrationTrigger] IDurableOrchestrationContext context)
{
    try
    {
        // .... get order with orderId
        string orderId = context.GetInput<string>();
        var a = await context.CallActivityAsync<object>("ApiCaller1", orderId);
        var b = await context.CallActivityAsync<object>("ApiCaller2", a);
        var c = await context.CallActivityAsync<object>("ApiCaller3", b);
        var d = await context.CallActivityAsync<object>("ApiCaller4", c);
        return await context.CallActivityAsync<object>("ApiCaller5", d);
    }
    catch (Exception)
    {
        // Error handling or compensation goes here.
        throw; // rethrow so the orchestration is recorded as failed
    }
}
Activity functions
[FunctionName("ApiCaller1")]
public static string ApiCaller1([ActivityTrigger] IDurableActivityContext fulfillmentApiContext)
{
    string input = fulfillmentApiContext.GetInput<string>();
    return "API1 result";
}

[FunctionName("ApiCaller2")]
public static string ApiCaller2([ActivityTrigger] IDurableActivityContext fulfillmentApiContext)
{
    string input = fulfillmentApiContext.GetInput<string>();
    return "API2 result";
}
// Repeat 3 more times...

Node.js Spawning multiple threads within a class method

How can I run a single method multiple times multi-threaded when called as a method of a class?
At first I tried to use the cluster module, but I realize it just re-runs the whole process from the start, rightfully so.
How can I achieve something like what's outlined below?
I want a class's method to spawn n processes, and when the parallel tasks are completed, I can resolve a promise which the method returns.
The problem with the code below is that calling cluster.fork() will fork index.js process.
index.js
const Person = require('./Person.js');
var Mary = new Person('Mary');
Mary.run(5).then(() => {...});
console.log('I should only run once, but I am called 5 times too many');
Person.js
const cluster = require('cluster');
class Person{
    run(distance){
        var completed = 0;
        return new Promise((resolve, reject) => {
            for(var i = 0; i < distance; i++) {
                // run a separate process for each
                cluster.fork().send(i).on('message', message => {
                    if (message === 'completed') { ++completed; }
                    if (completed === distance) { resolve(); }
                });
            }
        });
    }
}
I think the short answer is that this is impossible. Worse, it has nothing to do with JS. To run a method in multiple processes or threads, you essentially need a copy of the object in each one, since the method may need access to its fields. You would have to either initialize the object in every process or share memory. I don't think cluster provides shared memory, and it is not trivial in other languages in every use case either.
If the calculation is independent of the Person I suggest you extract it, and use the usual (in index.js):
if(cluster.isWorker) {
    //Use the i for calculation
} else {
    //Create Person, then fork children in for loop
}
You then collect the results and change the Person as needed. You will be copying index.js, but this is standard and you only run what you need.
The problem is if results are dependent on Person. If these are constant for all i you can still send them to your forks independently. Otherwise what you have is the only way to fork. In general forking in cluster is not meant for methods, but for the app itself, which is the standard forking behavior.
Another solution
Following your comment, I suggest you check out child_process.execFile or child_process.exec on the same file.
This way you can spawn a totally independent process on the fly. Now instead of calling cluster.fork you call execFile. You can use either the exit code or stdout as return values (stderr etc.). Promise is now replaced with:
var results = [];
for (var i = 0; i < distance; i++) {
    // run a separate process for each
    results.push(child_process.execFile('node', ['mymethod.js', String(i)]));
}
//... catch the exit event from all results or return a callback using results.
Inside mymethod.js, put the code that takes i and returns what you want, either in the exit code or through stdout; both are available from the returned child process. This is a bit un-Node.js-y since you're waiting on asynchronous calls, but your requirements are non-standard. Since I'm not sure how you use this, perhaps returning a callback with the array is a better idea.
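The execFile approach can also be wrapped in promises so that the method resolves once every child has exited. The following is a minimal sketch under some assumptions: `runChild` and `runAll` are hypothetical helper names, and the inline `childScript` is only a stand-in for mymethod.js so that the example is self-contained.

```javascript
// Minimal sketch: run `count` independent child processes and resolve once
// all of them have exited. The inline child script is a hypothetical stand-in
// for mymethod.js; it prints the square of its last argument.
const { execFile } = require('child_process');

const childScript = 'const i = Number(process.argv.slice(-1)[0]); console.log(i * i);';

function runChild(i) {
  return new Promise((resolve, reject) => {
    // process.execPath is the node binary that is currently running.
    execFile(process.execPath, ['-e', childScript, String(i)], (err, stdout) => {
      if (err) return reject(err); // non-zero exit code or spawn failure
      resolve(stdout.trim());
    });
  });
}

function runAll(count) {
  const children = [];
  for (let i = 0; i < count; i++) {
    children.push(runChild(i));
  }
  return Promise.all(children); // results keep the order of `i`
}
```

A class method such as run(distance) could then simply return runAll(distance), and the caller's .then() fires once, after all child processes have finished.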

Express Node Request For Loop Issue [duplicate]

With node.js I want to http.get a number of remote urls in a way that only 10 (or n) runs at a time.
I also want to retry a request (up to m times) if an exception occurs locally, but when the status code indicates an error (5XX, 4XX, etc.) the request still counts as valid.
This is really hard for me to wrap my head around.
Problems:
Cannot try-catch http.get as it is async.
Need a way to retry a request on failure.
I need some kind of semaphore that keeps track of the currently active request count.
When all requests finished I want to get the list of all request urls and response status codes in a list which I want to sort/group/manipulate, so I need to wait for all requests to finish.
It seems promises are recommended for every async problem, but I end up nesting too many promises and the code quickly becomes indecipherable.
There are lots of ways to approach the 10 requests running at a time.
Async Library - Use the async library with the .parallelLimit() method where you can specify the number of requests you want running at one time.
Bluebird Promise Library - Use the Bluebird promise library and the request library to wrap your http.get() into something that can return a promise and then use Promise.map() with a concurrency option set to 10.
Manually coded - Code your requests manually to start up 10 and then each time one completes, start another one.
In all cases, you will have to write some retry code manually and, as with all retry code, decide very carefully which types of errors you retry, how soon you retry them, how much you back off between retry attempts, and when you eventually give up (all things you have not specified).
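As a rough illustration of the "manually coded" option with retries, here is a minimal sketch. `runLimited` and `fetchOne` are hypothetical names, not part of any library, and the retry policy (retry every thrown error after a fixed delay) is an assumption you would replace with your own.

```javascript
// Minimal sketch of the "manually coded" option: keep at most `limit` tasks
// in flight and retry each one up to `maxRetries` times on a thrown error.
// `fetchOne` is a hypothetical async function (url) => result.
async function runLimited(urls, fetchOne, { limit = 10, maxRetries = 3, retryDelayMs = 500 } = {}) {
  const results = new Array(urls.length);
  let next = 0; // index of the next URL to start

  async function attempt(url) {
    for (let tries = 0; ; tries++) {
      try {
        return { url, result: await fetchOne(url) };
      } catch (err) {
        // Give up after the initial try plus `maxRetries` retries.
        if (tries >= maxRetries) return { url, error: err };
        await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
      }
    }
  }

  async function worker() {
    // Each worker pulls the next unstarted URL until none are left.
    while (next < urls.length) {
      const index = next++;
      results[index] = await attempt(urls[index]);
    }
  }

  const workers = Array.from({ length: Math.min(limit, urls.length) }, () => worker());
  await Promise.all(workers);
  return results; // same order as `urls`
}
```

Because a retry happens inside the worker that owns the URL, retries never push the number of in-flight requests above `limit`. An HTTP response with a 4XX or 5XX status does not throw in this sketch unless `fetchOne` chooses to throw, which matches the requirement that such responses count as valid.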
Other related answers:
How to make millions of parallel http requests from nodejs app?
Million requests, 10 at a time - manually coded example
My preferred method is with Bluebird and promises. Including retry and result collection in order, that could look something like this:
const request = require('request');
const Promise = require('bluebird');
const get = Promise.promisify(request.get);

let remoteUrls = [/* ... */]; // large array of URLs
const maxRetryCnt = 3;
const retryDelay = 500;

// Hypothetical helper: decide which errors are of a retryable type.
function shouldRetry(err) {
    return true; // placeholder policy (an assumption); narrow this down
}

Promise.map(remoteUrls, function(url) {
    let retryCnt = 0;
    function run() {
        return get(url).then(function(result) {
            // do whatever you want with the result here
            return result;
        }).catch(function(err) {
            // decide what your retry strategy is here
            // catch all errors here so other URLs continue to execute
            if (shouldRetry(err) && retryCnt < maxRetryCnt) {
                ++retryCnt;
                // try again after a short delay
                // chain onto previous promise so Promise.map() is still
                // respecting our concurrency value
                return Promise.delay(retryDelay).then(run);
            }
            // make value be null if no retries succeeded
            return null;
        });
    }
    return run();
}, {concurrency: 10}).then(function(allResults) {
    // everything done here and allResults contains results with null for err URLs
});
The simple way is to use the async library; it has a .parallelLimit() method that does exactly what you need.

Can I queue multiple items from a single run of an Azure Function?

I have a Node.js timerTrigger Azure function that processes a collection and queues the processing results for further processing (by a Node.js queueTrigger function).
The code is something like the following:
module.exports = function (context, myTimer) {
    collection.forEach(function (item) {
        var items = [];
        // do some work and fill 'items'
        var toBeQueued = { items: items };
        context.bindings.myQueue = toBeQueued;
    });
    context.done();
};
This code will only queue the last toBeQueued and not each one I'm trying to queue.
Is there any way to queue more than one item?
Update
To be clear, I'm talking about queueing a toBeQueued in each iteration of forEach, not just queueing an array. Yes, there is an issue with Azure Functions because of which I cannot queue an array, but I have a workaround for it; i.e., { items: items }.
Not yet, but we'll address that within the week, stay tuned :) You'll be able to pass an array to the binding as you're trying to do above.
We have an issue tracking this in our public repo here. Thanks for reporting.
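Once array output is supported as described, the loop from the question could collect its messages and assign them in one go. This is a minimal sketch under that assumption (one queue message per array element); `queueAll` is a hypothetical helper name, and `collection` is passed in as a parameter only to keep the example self-contained.

```javascript
// Minimal sketch, assuming the queue output binding accepts an array and
// writes one queue message per element. The binding name myQueue comes from
// the question's code; queueAll is a hypothetical helper.
function queueAll(context, collection) {
  const messages = [];
  collection.forEach(function (item) {
    const items = [];
    // do some work and fill 'items' (placeholder work below)
    items.push(item);
    messages.push({ items: items });
  });
  // Assign once, after the loop, instead of overwriting on every iteration.
  context.bindings.myQueue = messages;
  context.done();
}
```

The timer function would then be `module.exports = function (context, myTimer) { queueAll(context, collection); };`, with each `{ items: ... }` object arriving as its own message in the queueTrigger function.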
Mathewc's answer is the correct one wrt Node.
For C# you can do this today by specifying ICollector<T> as the type of your output queue parameter.
Below is an example I have of two output queues, one of which I add via a for loop.
public static void Run(Item inbound, DateTimeOffset InsertionTime, ICollector<Item> outbound, ICollector<LogItem> telemetry, TraceWriter log)
{
    log.Verbose($"C# Queue trigger function processed: {inbound}");
    telemetry.Add(new LogItem(inbound, InsertionTime));
    if (inbound.current_generation < inbound.max_generation)
    {
        for (int i = 0; i < inbound.multiplier; i++)
        {
            outbound.Add(Item.nextGen(inbound));
        }
    }
}
