Azure Logic Apps - Timeout issue - azure

I have created an Azure Logic App that pulls data from a REST API and populates an Azure SQL Database, which is used to process the data and push the results to Dynamics 365. The REST API returns around 6000 rows, and I have created 2 logic apps: the main one pulls the data in pages (10 records per page) and uses a Do Until loop to process each page. From the Do Until loop it calls a second logic app, passing the paged records, and that logic app inserts the records into the SQL Database.
The issue I'm encountering is that the main logic app times out after 2 minutes (it processes around 600 rows and then times out).
I came across this article, which explains various patterns for managing long-running processes.
https://learn.microsoft.com/en-us/azure/logic-apps/logic-apps-create-api-app
What would be the best approach to executing long-running tasks without timeout issues?

Your REST API should follow the async pattern by returning 202 with Retry-After and Location headers; see more at: https://learn.microsoft.com/azure/logic-apps/logic-apps-create-api-app
Or, your REST API can be of the webhook kind, so that Logic Apps provides a callback URL for you to invoke once the processing is completed.
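To make the 202 pattern concrete, here is a minimal Python sketch of what a polling caller does with those headers. The Logic Apps HTTP action performs this polling for you, so the sketch only illustrates the protocol. The `send` transport, the `/start` and `/status/42` URLs and the fake service are hypothetical, and Retry-After is assumed to be a number of seconds (it can also be an HTTP date, which this sketch does not handle).

```python
import time
from typing import Callable, Dict, Tuple

# A response is modeled as (status_code, headers, body).
Response = Tuple[int, Dict[str, str], str]


def poll_until_done(send: Callable[[str], Response], start_url: str,
                    max_polls: int = 50,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Follow 202 + Location + Retry-After until a non-202 response arrives."""
    url = start_url
    for _ in range(max_polls):
        status, headers, body = send(url)
        if status != 202:
            return status, headers, body  # work finished (or failed)
        # Still running: poll the status URL after the suggested delay.
        url = headers.get("Location", url)
        sleep(float(headers.get("Retry-After", "1")))
    raise TimeoutError("operation did not finish within max_polls")


def fake_service() -> Callable[[str], Response]:
    """Hypothetical API: answers 202 twice, then 200 with the result."""
    calls = {"n": 0}

    def send(url: str) -> Response:
        calls["n"] += 1
        if calls["n"] < 3:
            return 202, {"Location": "/status/42", "Retry-After": "0"}, ""
        return 200, {}, "done"

    return send


result = poll_until_done(fake_service(), "/start", sleep=lambda s: None)
print(result)
```

The point of the pattern is that the initial request returns immediately, so no single HTTP call has to stay open for the whole duration of the work.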

Related

Node js REST Client Scaling the Data collection

I have a scenario where my Node.js client collects data from a REST API.
Scenario: my API endpoint looks like this: http://url/{project}
where project is a parameter. The project names come from a database table.
Here is my procedure:
I get all the project names from the database into a list.
Using a loop, I call the REST endpoint for every project in the list.
My query: if there are only a few projects in the database, this procedure works fine, but if I have around 1000 projects to collect, the requests take a long time and sometimes fail with timeout errors.
How can I scale this process so that it finishes collecting the data in a reasonable amount of time?

Azure Function with ServiceBusTrigger circuit breaker pattern

I have an Azure function with ServiceBusTrigger which will post the message content to a webservice behind an Azure API Manager. In some cases the load of the (3rd party) webserver backend is too high and it collapses returning error 500.
I'm looking for a proper way to implement circuit breaker here.
I've considered the following:
Disable the Azure Function, but it might result in data loss due to multiple messages held in memory (serviceBus.prefetchCount)
Implement API Manager with rate-limit policy, but this seems counter productive as it runs fine in most cases
Re-architecting the 3rd party webservice is out of scope :)
Set the queue to ReceiveDisabled, this is the preferred solution, but it results in my InputBinding throwing a huge amount of MessagingEntityDisabledExceptions which I'm (so far) unable to catch and handle myself. I've checked the docs for host.json, ServiceBusTrigger and the Run parameters but was unable to find a useful setting there.
Keep some sort of responsecode resultset and increase retry time, not ideal in a serverless scenario with multiple parallel functions.
Let API manager map 500 errors to 429 and reschedule those later, will probably work but since we send a lot of messages it will hammer the service for some time. In addition it's hard to distinguish between a temporary 500 error or a consecutive one.
Note that this question is not about deciding whether or not to trigger the circuit breaker, merely about how to handle the appropriate action afterwards.
Additional info
Azure Functions V2, .NET Core 3.1, running in a Consumption plan
API Manager runs Basic SKU
Service Bus runs in premium tier
Messagecount: 300.000

How to check running status and stop Durable function

I want to process millions of records on demand, which takes approximately 2-3 hours. I want to go serverless, which is why I tried Durable Functions (for the first time). To check how long a durable function can run, I created 3 functions:
Http function to kick start Orchestrator function
Orchestrator function
Activity function
My durable function has been running and emitting logs in Application Insights for the last 5 days, and based on my code it would take 15 more days to complete.
I want to know how to stop the Orchestrator function manually.
I can see thousands of entries in the Application Insights requests table for a single execution. Is there any way to check how many durable functions are running in the backend, and how much time a single execution has taken?
I can see some information regarding the orchestrator function in the "DurableFunctionHubInstance" table, but MS recommends not relying on that table.
Since Durable Functions does a lot of checkpointing and replays the orchestration, normal logging might not always be very insightful.
Getting the status
There are several ways to query for the status of orchestrations. One of them is through the Azure Functions Core tools as George Chen mentioned.
Another way to query the status is by using the HTTP API of Durable Functions directly:
GET <rooturl>/runtime/webhooks/durableTask/instances?
taskHub={taskHub}
&connection={connectionName}
&code={systemKey}
&createdTimeFrom={timestamp}
&createdTimeTo={timestamp}
&runtimeStatus={runtimeStatus1,runtimeStatus2,...}
&showInput=[true|false]
&top={integer}
More info in the docs.
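As a rough illustration, the query string above can be assembled like this in Python. This is only a sketch: `build_instances_query` is a hypothetical helper, and the root URL, task hub, connection name and key in the usage line are placeholders, not real values.

```python
from urllib.parse import urlencode


def build_instances_query(root_url, task_hub, connection, system_key,
                          runtime_statuses=(), top=None, show_input=False):
    """Build the 'query all instances' URL from the parameters shown above."""
    params = {
        "taskHub": task_hub,
        "connection": connection,
        "code": system_key,
        "showInput": "true" if show_input else "false",
    }
    if runtime_statuses:
        # Several statuses are passed as one comma-separated value.
        params["runtimeStatus"] = ",".join(runtime_statuses)
    if top is not None:
        params["top"] = str(top)
    # safe="," keeps the comma-separated status list readable in the URL.
    query = urlencode(params, safe=",")
    return f"{root_url.rstrip('/')}/runtime/webhooks/durableTask/instances?{query}"


# Placeholder values, for illustration only.
url = build_instances_query("https://example.invalid", "MyTaskHub", "Storage",
                            "SYSTEM_KEY", runtime_statuses=("Running", "Pending"),
                            top=15)
print(url)
```

Filtering on runtimeStatus this way answers the "how many are running" part of the question: the number of items in the returned list is the number of matching instances, up to the `top` limit.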
The HTTP API also has methods to purge orchestrations. Either a single one by ID or multiple by datetime/status.
DELETE <rooturl>/runtime/webhooks/durabletask/instances/{instanceId}
?taskHub={taskHub}
&connection={connection}
&code={systemKey}
Finally you can also manage your instances using the DurableOrchestrationClient API in C#. Here's a sample on GitHub: HttpGetStatusForMany.cs
I have written & vlogged about using the DurableOrchestrationClient API in case you want to know more about how to use this in C#.
Custom status
Small addition: it's possible to add a custom status object to the orchestration so you can add enriched information about the progress of the orchestration.
Getting the duration
When you query the status of an orchestration instance you get back a DurableOrchestrationStatus object. Among other things, it contains two properties:
CreatedTime
LastUpdatedTime
I'm guessing you can subtract those and get a reasonable indication of the time it has taken.
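A minimal sketch of that subtraction in Python, assuming the status comes back as JSON with camelCase `createdTime` and `lastUpdatedTime` fields holding ISO 8601 UTC timestamps (check the actual payload of your version). The sample values are made up.

```python
from datetime import datetime, timedelta


def parse_utc(timestamp: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def orchestration_duration(status: dict) -> timedelta:
    """Time between creation and the last update of an instance."""
    return parse_utc(status["lastUpdatedTime"]) - parse_utc(status["createdTime"])


# Made-up sample status, for illustration only.
sample = {
    "runtimeStatus": "Running",
    "createdTime": "2018-03-10T13:57:31Z",
    "lastUpdatedTime": "2018-03-10T14:02:01Z",
}
duration = orchestration_duration(sample)
print(duration.total_seconds())  # 270.0
```

For an instance that is still running, the result is only the time elapsed up to the last update, not the final duration.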
You could manage the Durable Functions orchestration instances with Azure Functions Core Tools.
Terminate instances:
func durable terminate --id 0ab8c55a66644d68a3a8b220b12d209c --reason "It was time to be done."
Query instances with filters: you can add the runtime-status parameter to filter for the running instances.
func durable get-instances --created-after 2018-03-10T13:57:31Z --created-before 2018-03-10T23:59Z --top 15
As for the time the functions took, it doesn't look like that is supported. The closest command is get-history.

Fetching bot answers from a database

I'm using Azure Cosmos DB with MongoDB for storing the answers that my Microsoft Bot Framework-based chatbot will give to different dialogs.
My issue is that I don't know if it's best to do a query for each response or do one large query to fetch everything in the DB once the code runs and store it in arrays.
The Azure Cosmos DB pricing uses the unit Request Units per second (RU/s).
In terms of cost and speed, I'm thinking of doing one query whenever the bot service is run (in my case, that would be when app.js is run on my Azure Web App).
This query fetches all the data in my database and stores the results in different arrays in my code. Inside my bot.dialog()s I will use these arrays to fetch the answer that I want the bot to return to the end user.
I would load all the data from the DB into the bot when the app starts up, and if you manipulate the data you can write it back into the DB when the bot shuts down. This would mean that you have one single big query at the beginning of your bot's life and another one at the end. But this also depends on the amount of memory that your app has allocated and how big the DB is.
From a Cosmos DB perspective, fewer requests that yield larger datasets will typically be faster/cheaper in terms of RUs than more requests fetching smaller datasets. Roundtrips are expensive. But it depends on the complexity of the queries too: aggregation pipelines are more expensive than find() with filters. Everything else should be a client-side consideration.
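The load-once approach can be sketched as follows. The sketch is in Python rather than the bot's Node.js, and `AnswerCache`, `fetch_all_answers` and the sample dialog IDs are hypothetical stand-ins for the real database query and data.

```python
from typing import Callable, Dict, Iterable, Tuple


class AnswerCache:
    """Loads every answer with one query at startup and serves them from memory."""

    def __init__(self, fetch_all: Callable[[], Iterable[Tuple[str, str]]]):
        self._fetch_all = fetch_all
        self._answers: Dict[str, str] = {}
        self.queries_made = 0  # counts round trips to the database

    def load(self) -> None:
        # One large query instead of one small query per dialog.
        self._answers = dict(self._fetch_all())
        self.queries_made += 1

    def get(self, dialog_id: str, default: str = "Sorry, I have no answer for that.") -> str:
        # No database access here, only a dictionary lookup.
        return self._answers.get(dialog_id, default)


def fetch_all_answers():
    """Hypothetical stand-in for a single find() over the answers collection."""
    return [("greeting", "Hello!"), ("hours", "We are open 9 to 5.")]


cache = AnswerCache(fetch_all_answers)
cache.load()  # run once, when the app starts
print(cache.get("greeting"))
print(cache.get("unknown"))
```

The trade-off is the one mentioned above: the whole dataset has to fit in the app's memory, and changes made in the database after startup are not seen until the cache is reloaded.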

Programmatically get the amount of instances running for a Function App

I'm running an Azure Function app on a Consumption Plan and I want to monitor the number of instances currently running. Using a REST API endpoint of the format
https://management.azure.com/subscriptions/{subscr}/resourceGroups/{rg}
/providers/Microsoft.Web/sites/{appname}/instances?api-version=2015-08-01
I'm able to retrieve the instances. However, the result doesn't match the information that I see in Application Insights / Live Metrics Stream.
For example, right now App Insights shows 4 servers online, while API call returns just one (the GUID of this 1 instance is also among App Insights guids).
Who can I trust? Is there a better way to get instance count (e.g. from App Insights)?
UPDATE: It looks like data from REST API are wrong.
I was sending 10000 messages to the queue, logging each function call with respective instance ID which processed the request.
While messages keep coming in and the backlog grows, instance count from REST API seems to be correct (scaled from 1 to 12). After sending stops, the reported instance count rapidly goes down (eventually back to 1, while processors are still busy).
But based on the speed and the execution logs I can tell that the actual instance count kept growing and ended up at 15 instances at the moment of last message processed.
UPDATE2: It looks like the SDK refuses to report more than 20 servers. The metric flattens out at 20, while App Insights kept growing steadily and is already showing 41.
Who can I trust? Is there a better way to get instance count (e.g. from App Insights)?
Based on my understanding, we need to use the REST API endpoint to retrieve the instances. App Insights could be configured for multiple WebApps, so the number of servers online in App Insights may cover multiple WebApps.
Updated:
Based on my test, the number shown in Application Insights may not be real time.
During my test, when the Function App scaled out, I could get multiple instances with the REST API, and I could also check the number of servers online in App Insights.
https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourcegroup}/providers/Microsoft.Web/sites/{functionname}/instances?api-version=2016-08-01
But after I finished the test, the number of instances returned by the REST API was 1; based on my understanding, that is the right result.
At the same time, the number of servers online in Application Insights was still the maximum number reached during my test.
After a while, the number of servers online in Application Insights also became 1.
So if we want to get the number of instances for an Azure Function, my suggestion is to use the REST API.
Update2:
As DavidEbbo mentioned, the REST API is not always reliable:
Unfortunately, the REST API is not always reliable. Specifically, when a Function App scales across multiple scale units, only the instances from the 'home' scale unit are reflected. You probably will not see this in a smallish test, but likely will if you start scaling out widely (say over 20 instances).
