Is the orchestrator client function reliable? - azure

I am new to Azure Durable Functions. According to the documentation, orchestrator functions are reliable, but I am wondering whether the starter (client) function is reliable too. Suppose that I have an HTTP-triggered orchestrator function. I guess that when the Durable Functions framework (or something running in the backend) detects that an HTTP request matches an orchestrator function, it runs the starter function to trigger the orchestrator function. Where does the starter function run? On a VM? Can that VM fail? I cannot find much documentation on MSDN.

Orchestrator functions reliably maintain their execution state by using the event sourcing design pattern. Instead of directly storing the current state of an orchestration, the Durable Task Framework uses an append-only store to record the full series of actions the function orchestration takes. An append-only store has many benefits compared to "dumping" the full runtime state. Benefits include increased performance, scalability, and responsiveness. You also get eventual consistency for transactional data and full audit trails and history. The audit trails support reliable compensating actions.
Durable Functions uses event sourcing transparently. Behind the scenes, the await (C#) or yield (JavaScript) operator in an orchestrator function yields control of the orchestrator thread back to the Durable Task Framework dispatcher. The dispatcher then commits any new actions that the orchestrator function scheduled (such as calling one or more child functions or scheduling a durable timer) to storage. The transparent commit action appends to the execution history of the orchestration instance. The history is stored in a storage table. The commit action then adds messages to a queue to schedule the actual work. At this point, the orchestrator function can be unloaded from memory.
When an orchestration function is given more work to do (for example, a response message is received or a durable timer expires), the orchestrator wakes up and re-executes the entire function from the start to rebuild the local state. During the replay, if the code tries to call a function (or do any other async work), the Durable Task Framework consults the execution history of the current orchestration. If it finds that the activity function has already executed and yielded a result, it replays that function's result and the orchestrator code continues to run. Replay continues until the function code is finished or until it has scheduled new async work.
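The replay behaviour described above can be illustrated with a small simulation. This is only a sketch of the idea in plain Python, not the Durable Task Framework itself: the names `orchestrator`, `run_activity`, `resume`, and the in-memory `history` list are all hypothetical stand-ins for the real orchestrator function, activity dispatch, dispatcher, and storage table.

```python
from typing import Any, Generator, List, Tuple

# Records every *real* activity execution so we can see that replay does not re-run them.
executions: List[str] = []

def run_activity(name: str, arg: Any) -> str:
    """Hypothetical activity dispatch: pretend to do work and return a result."""
    executions.append(name)
    return f"{name}({arg})"

def orchestrator() -> Generator[Tuple[str, Any], Any, str]:
    """Hypothetical orchestrator: each yield schedules one activity."""
    a = yield ("step_a", 1)
    b = yield ("step_b", a)
    return b

def resume(history: List[Any]) -> Tuple[bool, Any]:
    """Re-run the orchestrator from the start, feeding it recorded results.

    Returns (True, result) when the orchestrator finishes, otherwise executes
    the next newly scheduled activity, appends its result to the history,
    and returns (False, None).
    """
    gen = orchestrator()
    try:
        action = next(gen)
        for recorded in history:
            # Replay: the recorded result is returned without running the activity.
            action = gen.send(recorded)
    except StopIteration as stop:
        return True, stop.value
    name, arg = action
    history.append(run_activity(name, arg))  # append-only "commit"
    return False, None

history: List[Any] = []
done, result = resume(history)
while not done:
    done, result = resume(history)
```

In this sketch the orchestrator body is entered three times, but each activity is executed only once, because earlier results are served from the history.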
Official doc

Related

In Azure Function App how can I reduce downtime between each yield in orchestration

In a Node.js Function App on Azure I have an orchestration (generator*) function that calls a series of activity functions. Between each successful activity function there seems to be a period of dead time before the orchestrator is resumed and the next activity begins. Sometimes it is very long, as much as several minutes. Is there anything I can do to make the orchestrator turn over faster? I am on the Premium plan.
Something I have experienced is that, if the output of an activity function is big, it creates dead time before the orchestrator is resumed and the next activity begins.
The Durable Task Framework saves the execution history (including each activity function's output) in table storage. If the output of an activity function is large (more than 64 KB), it does not fit in a table column, so the result is saved in blob storage instead.
Every time the orchestrator function is resumed after waiting for a task to complete, the Durable Task Framework reruns the orchestrator function from scratch, deserializes the blob, and binds the result to your variable again. This adds a lot of overhead and slows down the orchestration.

Calling functions from other functions in Azure serverless function app

I have a timer function that runs by itself once per minute.
Is it possible to invoke this function from another type of function if I want to call it at an arbitrary time (not on its cron schedule)?
From:
An orchestrator function?
An activity function?
Also, is it possible to call an orchestrator directly from an activity function? I have heard that you can do "sub orchestrations" from an orchestrator, but what about directly from an activity function?
You cannot call the timer-triggered function directly, but you can extract its logic into a class library and share it with an HTTP-triggered function running in the same Azure Function App as the timer-triggered one.
About the Durable part, it's been a while since the last time I worked with that, but as far as I know, the orchestrator can call sub orchestrators and activities.

How to stop replay activity in azure durable functions?

I created one activity function and call it from an orchestrator function using the PowerShell cmdlet Invoke-DurableActivity. But when I execute the orchestrator function, I see that the code runs asynchronously and that replay causes the code to re-run. How can I avoid this re-run, since we don't need it?
The re-run is part of the design of the framework. The good news is that once the activity finishes, the orchestrator checks its internal state (the recorded result of the activity) and moves on to the next step of your workflow.

How to check running status and stop Durable function

I want to process millions of records on demand, which takes approximately 2-3 hours. I want to go serverless, which is why I tried Durable Functions (for the first time). To check how long a durable function can run, I created 3 functions:
Http function to kick start Orchestrator function
Orchestrator function
Activity function
My durable function has been running and emitting logs in Application Insights for the last 5 days, and based on my code it would take 15 more days to complete.
How can I stop the orchestrator function manually?
I can see thousands of entries in the Application Insights requests table for a single execution. Is there any way to check how many durable functions are running in the backend, and how much time a single execution has taken?
I can see some information about the orchestrator function in the "DurableFunctionHubInstance" table, but Microsoft recommends not relying on that table.
Since Durable Functions does a lot of checkpointing and replays the orchestration, normal logging might not always be very insightful.
Getting the status
There are several ways to query for the status of orchestrations. One of them is through the Azure Functions Core tools as George Chen mentioned.
Another way to query the status is by using the HTTP API of Durable Functions directly:
GET <rooturl>/runtime/webhooks/durableTask/instances?
taskHub={taskHub}
&connection={connectionName}
&code={systemKey}
&createdTimeFrom={timestamp}
&createdTimeTo={timestamp}
&runtimeStatus={runtimeStatus1,runtimeStatus2,...}
&showInput=[true|false]
&top={integer}
More info in the docs.
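As a rough illustration, the query string above could be assembled like this. This is a sketch only: the helper name `build_status_query`, the example host, and the key value are made up, and only a few of the optional parameters are covered.

```python
from typing import Dict, List, Optional
from urllib.parse import urlencode

def build_status_query(
    root_url: str,
    system_key: str,
    task_hub: Optional[str] = None,
    runtime_status: Optional[List[str]] = None,
    top: Optional[int] = None,
    show_input: Optional[bool] = None,
) -> str:
    """Build the 'get all instances' URL, including only the parameters that were supplied."""
    params: Dict[str, str] = {"code": system_key}
    if task_hub is not None:
        params["taskHub"] = task_hub
    if runtime_status:
        # Multiple statuses are passed as one comma-separated value.
        params["runtimeStatus"] = ",".join(runtime_status)
    if show_input is not None:
        params["showInput"] = "true" if show_input else "false"
    if top is not None:
        params["top"] = str(top)
    # safe="," keeps the commas in runtimeStatus readable instead of %2C.
    query = urlencode(params, safe=",")
    return f"{root_url.rstrip('/')}/runtime/webhooks/durableTask/instances?{query}"

# Hypothetical host and key, for illustration only.
url = build_status_query(
    "https://example.azurewebsites.net/",
    "KEY",
    runtime_status=["Running", "Pending"],
    top=10,
)
```

The resulting URL can then be requested with any HTTP client using GET.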
The HTTP API also has methods to purge orchestrations. Either a single one by ID or multiple by datetime/status.
DELETE <rooturl>/runtime/webhooks/durabletask/instances/{instanceId}
?taskHub={taskHub}
&connection={connection}
&code={systemKey}
Finally you can also manage your instances using the DurableOrchestrationClient API in C#. Here's a sample on GitHub: HttpGetStatusForMany.cs
I have written & vlogged about using the DurableOrchestrationClient API in case you want to know more about how to use this in C#.
Custom status
Small addition: it's possible to add a custom status object to the orchestration so you can add enriched information about the progress of the orchestration.
Getting the duration
When you query the status of an orchestration instance you get back a DurableOrchestrationStatus object. This contains two properties:
CreatedTime
LastUpdatedTime
I'm guessing you can subtract those and get a reasonable indication of the time it has taken.
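A minimal sketch of that subtraction, assuming the status has been fetched as JSON through the HTTP API and exposes the two timestamps as ISO 8601 strings under the keys `createdTime` and `lastUpdatedTime` (the sample values below are made up):

```python
from datetime import datetime, timedelta

def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that ends in 'Z' into an aware datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))

def orchestration_duration(status: dict) -> timedelta:
    """Approximate the elapsed time as lastUpdatedTime minus createdTime."""
    return parse_utc(status["lastUpdatedTime"]) - parse_utc(status["createdTime"])

# Made-up status payload, for illustration only.
sample_status = {
    "runtimeStatus": "Running",
    "createdTime": "2018-03-10T13:57:31Z",
    "lastUpdatedTime": "2018-03-10T16:12:01Z",
}
duration = orchestration_duration(sample_status)
```

For an instance that is still running, this gives the time up to the last recorded update rather than the time up to now.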
You could manage the Durable Functions orchestration instances with Azure Functions Core Tools.
Terminate instances:
func durable terminate --id 0ab8c55a66644d68a3a8b220b12d209c --reason "It was time to be done."
Query instances with filters: you can add the runtime-status parameter to return only the running instances.
func durable get-instances --created-after 2018-03-10T13:57:31Z --created-before 2018-03-10T23:59Z --top 15
As for the time an orchestration took, it looks like that is not supported directly. The closest command is get-history.

Azure Function Event Hub Trigger reliability

I'm a bit confused regarding the EventHubTrigger for Azure functions.
I've got an IoT Hub, and am using its eventhub-compatible endpoint to trigger an Azure function that is going to process and store the received data.
However, if my function fails (= throws an exception), that message (or messages) being processed during that function call will get lost. I actually would expect the Azure function runtime to process the messages at a later time again. Specifically, I would expect this behavior because the EventHubTrigger is keeping checkpoints in the Function Apps storage account in order to keep track of where in the event stream it has to continue.
The documentation of the EventHubTrigger even states that:
If all function executions succeed without errors, checkpoints are added to the associated storage account
But still, even when I deliberately throw exceptions in my function, the checkpoints will get updated and the messages will not get received again.
Is my understanding of the EventHubTriggers documentation wrong, or is the EventHubTriggers implementation (or its documentation) wrong?
This piece of documentation seems confusing indeed. I guess they mean the errors of Function App host itself, not of your code. An exception inside function execution doesn't stop the processing and checkpointing progress.
The fact is that Event Hubs are not designed for individual message retries. The processor works in batches, and it can either mark the whole batch as processed (i.e. create a checkpoint after it), or retry the whole batch (e.g. if the process crashed).
See this forum question and answer.
If you still need to re-process failed events from Event Hub (and errors don't happen too often), you could implement such mechanism yourself. E.g.
Add an output Queue binding to your Azure Function.
Add try-catch around processing code.
If exception is thrown, add the problematic event to the Queue.
Have another Function with Queue trigger to process those events.
Note that the downside of this is that you will lose the ordering guarantee provided by Event Hubs (since the queued message will be processed later than its neighbors).
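The steps above can be sketched as follows. This is plain Python with no Azure bindings: `process_batch`, `store_reading`, and the in-memory `retry_queue` list are hypothetical stand-ins for the Event Hub-triggered function, the processing code, and the output Queue binding.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]

def process_batch(
    events: List[Event],
    handler: Callable[[Event], None],
    retry_queue: List[Event],
) -> int:
    """Process a batch; failed events go to retry_queue instead of aborting the batch."""
    succeeded = 0
    for event in events:
        try:
            handler(event)
            succeeded += 1
        except Exception:
            # Stand-in for writing the event to the output Queue binding.
            retry_queue.append(event)
    return succeeded

def store_reading(event: Event) -> None:
    """Hypothetical processing step that fails on events without a value."""
    if event.get("value") is None:
        raise ValueError(f"event {event['id']} has no value")

batch = [{"id": 1, "value": 10}, {"id": 2, "value": None}, {"id": 3, "value": 30}]
retry_queue: List[Event] = []
processed = process_batch(batch, store_reading, retry_queue)
```

A second, queue-triggered function would then pick up whatever lands in the queue, which is where the ordering guarantee is lost.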
Quick fix: a retry policy will not help if the downstream system is down for a few hours. Instead, you can call Process.GetCurrentProcess().Kill(); in the exception handler, which stops the checkpoint from moving forward. I have tested this with a Consumption-plan function app. You will not see anything in the logs, so I added an email notification to report that something went wrong and that I killed the function instance to avoid data loss.
Hope this helps.
I plan to write a blog post about this and about the other part of the workflow, where I use a Logic App to stop the function when the downstream system keeps failing.
