How to use callbacks for long-running Azure Functions in a DevOps pipeline? - azure

I have an Azure DevOps release pipeline that deploys a Python Azure function and then invokes it. The Python function does some heavy lifting, so it takes a few minutes to execute.
There are two options for the completion event for the Invoke Azure Function task: Api Response and Callback.
The maximum response time when using Api Response is 20 seconds, so I need to use Callback. OK, fine. Using this documentation, I implemented an Azure function that returns an HTTPResponse immediately, and then posts completion data to the specified endpoint. Here's the complete code for my Azure function:
import logging
import time
import threading
import requests
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    # Start the work on a background thread and respond right away.
    t = threading.Thread(target=do_work, args=(req,))
    t.start()
    return func.HttpResponse("Succeeded", status_code=200)

def do_work(req: func.HttpRequest):
    logging.info("Starting asynchronous worker task")
    #time.sleep(21)
    try:
        # Callback details that the Invoke Azure Function task passes as headers.
        planUrl = req.headers["PlanUrl"]
        projectId = req.headers["ProjectId"]
        hubName = req.headers["HubName"]
        planId = req.headers["PlanId"]
        taskInstanceId = req.headers["TaskInstanceId"]
        jobId = req.headers["JobId"]
        endpoint = f"{planUrl}/{projectId}/_apis/distributedtask/hubs/{hubName}/plans/{planId}/events?api-version=2.0-preview.1"
        body = {
            "name": "TaskCompleted",
            "taskId": taskInstanceId,
            "jobId": jobId,
            "result": "succeeded"
        }
        logging.info(endpoint)
        logging.info(body)
        requests.post(endpoint, data=body)
    except Exception:
        # Log the failure instead of silently swallowing it.
        logging.exception("Posting the completion event failed")
    finally:
        logging.info("Completed asynchronous worker task")
Now, the Invoke Azure Function task doesn't time out, but it doesn't complete either. It just looks like it's waiting for something else to happen.
Not sure what I'm supposed to do here. I'm also following this thread on GitHub, but it's not leading to a resolution.

Unless you are using Durable Functions (which shouldn't be necessary here), running background jobs as part of a function execution like this is not supported in Azure Functions.
Once a function returns a response, that channel is closed, so any background operations still running will not work as expected (the exception being Durable Functions).
If you need something to work asynchronously like this and you are OK with not waiting for a response, the recommended approach is to use multiple functions.
In this case, your HttpTrigger could drop a message in a queue (or any other messaging service) and return a response immediately. A queue trigger (or other event-based trigger) would then pick up the message, do the heavy lifting and, once completed, post to the endpoint in your example.
If you want to implement this with just one function, you could drop a message directly into a messaging service from your DevOps pipeline and have your function trigger off of that.
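The two-function split described above can be sketched roughly as follows. This is only an illustration in plain Python: the trigger and queue bindings are left out, and `to_queue_message`, `handle_queue_message`, `do_work` and `post_completion` are hypothetical names. The point it shows is that the queue-triggered function never sees the original HTTP request, so the callback headers have to travel inside the queue message. The "failed" result value and the inclusion of the `AuthToken` header (mentioned in another answer in this thread) are assumptions.

```python
import json

# Headers the pipeline task sends with the request; the worker needs them
# later to report completion. AuthToken is included as an assumption.
CALLBACK_HEADERS = (
    "PlanUrl", "ProjectId", "HubName", "PlanId",
    "TaskInstanceId", "JobId", "AuthToken",
)


def to_queue_message(headers, payload=None):
    """HTTP-triggered side: pack the callback headers and the job payload
    into a JSON string that can be written to the queue."""
    missing = [name for name in CALLBACK_HEADERS if name not in headers]
    if missing:
        raise KeyError(f"missing pipeline headers: {missing}")
    callback = {name: headers[name] for name in CALLBACK_HEADERS}
    return json.dumps({"callback": callback, "payload": payload})


def handle_queue_message(message, do_work, post_completion):
    """Queue-triggered side: run the job, then report the outcome.

    do_work and post_completion are hypothetical callables supplied by the
    caller; post_completion receives the stored headers and the result.
    """
    job = json.loads(message)
    try:
        do_work(job["payload"])
        result = "succeeded"
    except Exception:
        result = "failed"
    post_completion(job["callback"], result)
    return result
```

The HTTP-triggered function would write the string returned by `to_queue_message` to its queue output and return its response immediately; the queue-triggered function would pass the message body to `handle_queue_message`.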
Hope this helps!

I can't find it explicitly documented anywhere, but it appears you need to add an Authorization header (at least when running locally).
Debugging my (very similar C#) function, the callback response gives me a 204 status code with an empty body when I include the header, and a 203 response with an Azure DevOps sign-in page in the body when I do not.
The bearer token is passed in the same way as the other system variables, as AuthToken, so you need to add the Authorization header with a value of "Bearer " + req.headers["AuthToken"].
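For the Python function in the question, the callback with the extra header might look like the sketch below. It uses only the standard library and builds the request without sending it, so it can be inspected first. `build_callback_request` is a hypothetical helper, and sending the body as JSON (rather than form-encoded) is an assumption; the header names and the endpoint format are taken from the question.

```python
import json
import urllib.request


def build_callback_request(headers, result="succeeded"):
    """Build (but do not send) the TaskCompleted callback request.

    headers is a mapping holding the values the pipeline task sent with
    the original request, including AuthToken.
    """
    plan_url = headers["PlanUrl"].rstrip("/")  # avoid a double slash
    endpoint = (
        f"{plan_url}/{headers['ProjectId']}/_apis/distributedtask/hubs/"
        f"{headers['HubName']}/plans/{headers['PlanId']}/events"
        "?api-version=2.0-preview.1"
    )
    body = {
        "name": "TaskCompleted",
        "taskId": headers["TaskInstanceId"],
        "jobId": headers["JobId"],
        "result": result,
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),  # JSON body is an assumption
        headers={
            "Content-Type": "application/json",
            # Without this header the callback is answered with a sign-in page.
            "Authorization": "Bearer " + headers["AuthToken"],
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; per the answer above, a 204 status with an empty body indicates the callback was accepted.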

Related

Azure insights: 'requests' item type are only stored with success=='False'

I have an Azure durable function run by a timer trigger, which runs another function (UploadActivity) that makes an HTTP call to a REST service outside Azure. We know for sure that a small percentage of all UploadActivity invocations end in an HTTP error and a raised exception; the rest are exception-free and upload some data to the remote HTTP resource. An interesting finding is that the 'requests' collection in Azure Insights contains only failed requests, with no successful ones recorded:
// gives no results
requests
| where success == "True"
// gives no results
requests
| where success <> "False"
// gives results
requests
| where success == "False"
I can't figure out why. Here are some attributes of one of the returned requests with success=='False', in case they help explain it:
operation_Name: UploadActivity
appName: /subscriptions/1b3e7d9e-e73b-4061-bde1-628b728b43b7/resourcegroups/myazuretest-rg/providers/microsoft.insights/components/myazuretest-ai
sdkVersion: azurefunctions: 4.0.1.16815
A 'request' is defined in Azure as an HTTP call to an HTTP-triggered function, but I have no HTTP-triggered functions in my app, which makes things even more confusing. Maybe these requests belong to Azure Insights' own calls, which could also be built on Azure Functions.
For a timer-triggered function it is normal that there are no records in the requests collection of Application Insights. If it were an HTTP-triggered function, you would have one. Only the request that triggers the function is recorded as a request in Application Insights, and a timer trigger does not respond to a request.
Once the function is triggered, all HTTP requests (and all other kinds of communication, such as calls to service buses) executed by that function are recorded as dependencies in the dependencies collection. This is by design and is how Application Insights works.

How can I execute asynchronous tasks in the background as scheduled in Google Cloud Platform?

Problem
I want to fetch a lot of game data at 9 o'clock every morning, so I use App Engine and a cron job. I would also like to add Cloud Tasks, but I don't know how to do it.
Question
How can I execute asynchronous tasks in the background as scheduled in Google Cloud Platform?
Which is the more natural implementation: (Cloud Scheduler + Cloud Tasks) or (cron job + Cloud Tasks)?
Development Environment
App Engine Python (Flexible environment).
Python 3.6
Best regards,
Cloud Tasks are asynchronous by design. As you mentioned, the best way would be to pair them with Cloud Scheduler.
Since Cloud Scheduler needs either a Pub/Sub topic or an HTTP endpoint to call when it runs a job, I recommend creating an App Engine handler for Cloud Scheduler to call, which creates and sends the task.
You can do so by following this documentation. First create a queue, and then deploy a simple application that has a handler to create the tasks. A small example:
from google.cloud import tasks_v2beta3
from flask import Flask, request

app = Flask(__name__)

@app.route('/createTask', methods=['POST'])
def create_task_handler():
    client = tasks_v2beta3.CloudTasksClient()
    # Fully qualified name of the queue that will hold the task.
    parent = client.queue_path('YOUR_PROJECT', 'PROJECT_LOCATION', 'YOUR_QUEUE_NAME')
    task = {
        'app_engine_http_request': {
            'http_method': 'POST',
            'relative_uri': '/handler_to_call'
        }
    }
    response = client.create_task(parent=parent, task=task)
    # Return the task name; the Task object itself is not a valid Flask response.
    return response.name
Here 'relative_uri' is the handler that the task will call and that processes your data.
Once that is done, follow the Cloud Scheduler documentation to create a job: set the target to App Engine HTTP, the URL to '/createTask', the service to whichever one handles that URL, and the HTTP method to POST. Fill in the remaining parameters as required; for a run at 9 o'clock every morning, you can set the frequency to 'every day 09:00'.

How to avoid callback-hell using asyncio in python

I have the following situation.
I have 3 services, JobInitiator, Mediator and Executor, that talk to each other in the following manner:
Once every X minutes, the JobInitiator publishes a requested job to a queue (RabbitMQ).
Every Y minutes, the Executor service sends a REST API call to the Mediator service and asks whether there are any jobs to be done. If so, the Mediator pulls a message from the queue and returns it to the Executor service in the response.
After the Executor finishes executing the job, it posts the job results to an API in the Mediator service, which publishes them to a queue that the JobInitiator listens to.
Side notes, restrictions and limitations:
The Mediator service is just a REST API wrapper around my queue. The main issue is that the Executor service can't be accessed publicly; only outgoing API calls are allowed.
I cannot connect the queue directly from the JobInitiator to the Executor service.
Up until now, there is nothing really special about this process. What I was wondering is whether it's possible to write this with asyncio in Python so I won't have to deal with callback hell. Something like this (pseudocode):
class JobInitiator(object):
    async def do_job(self):
        token = await get_token()
        applicative_results = await get_results(token=token)
where get_token() and get_results() both run through the process described above.
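One way to get this shape, sketched under assumptions: give every published job a correlation ID, park an `asyncio` future under that ID, and let the consumer of the results queue resolve it. The `publish` coroutine, the message layout and the `request`/`on_result` names are all hypothetical, and the `demo` function replaces RabbitMQ, the Mediator and the Executor with an in-memory stand-in.

```python
import asyncio
import uuid


class JobInitiator:
    """Matches results to requests through a correlation ID."""

    def __init__(self, publish):
        # publish: hypothetical coroutine that sends a job to the request queue.
        self._publish = publish
        self._pending = {}  # correlation ID -> future awaiting that result

    async def request(self, job_type, **params):
        correlation_id = str(uuid.uuid4())
        future = asyncio.get_running_loop().create_future()
        self._pending[correlation_id] = future
        try:
            await self._publish(
                {"id": correlation_id, "type": job_type, "params": params}
            )
            return await future  # suspended until on_result() resolves it
        finally:
            self._pending.pop(correlation_id, None)

    def on_result(self, message):
        # Called by the results-queue consumer for every incoming message.
        future = self._pending.get(message["id"])
        if future is not None and not future.done():
            future.set_result(message["result"])

    async def do_job(self):
        token = await self.request("get_token")
        return await self.request("get_results", token=token)


async def demo():
    # In-memory stand-in for the queue, Mediator and Executor round trip.
    async def fake_publish(job):
        if job["type"] == "get_token":
            result = "tok-1"
        else:
            result = f"results for {job['params']['token']}"
        asyncio.get_running_loop().call_later(
            0.01, initiator.on_result, {"id": job["id"], "result": result}
        )

    initiator = JobInitiator(fake_publish)
    return await initiator.do_job()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a real broker, the results consumer would call `on_result` on the same event loop; a timeout around `await future` (for example `asyncio.wait_for`) would be advisable in case a result never arrives.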

DurableOrchestrationClient.GetStatusAsync() always null for Durable Azure Function

I have a queue trigger azure function with DurableOrchestrationClient. I am able to start a new execution of my orchestration function, which triggers multiple activitytrigger functions and waits for them all to process. Everything works great.
My issue is that I am unable to check on the status of my orchestration function("TestFunction"). GetStatusAsync always returns as null. I need to know when the orchestration function is actually complete and process the return object (bool).
public static async void Run(
    [QueueTrigger("photostodownload", Connection = "QueueStorage")]PhotoInfo photoInfo,
    [OrchestrationClient]DurableOrchestrationClient starter,
    TraceWriter log)
{
    var x = await starter.StartNewAsync("TestFunction", photoInfo);
    Thread.Sleep(2 * 1000);
    var y = await starter.GetStatusAsync(x);
}
StartNewAsync enqueues a message into the control queue; it doesn't mean that the orchestration starts immediately.
GetStatusAsync returns null if the instance either doesn't exist or has not yet started running. So the orchestration probably just hasn't started yet during the 2 seconds of sleep that you have.
Rather than using a fixed wait, you should either poll the status of your orchestration periodically, or send something like a Done event from the orchestration as the last step of the workflow.
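The polling option can be sketched as a small loop. The sketch below is in Python rather than C#, and it is generic: `get_status` is a hypothetical coroutine standing in for the status call, assumed to return `None` or a dictionary with a `runtimeStatus` key. The set of terminal status names is also an assumption.

```python
import asyncio

# Status names assumed to mean the orchestration will not run any further.
TERMINAL_STATUSES = {"Completed", "Failed", "Terminated", "Canceled"}


async def wait_for_completion(get_status, instance_id, interval=1.0, timeout=60.0):
    """Poll get_status(instance_id) until the instance reaches a terminal
    status, and raise TimeoutError if that takes longer than timeout seconds."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await get_status(instance_id)
        # None means the instance does not exist or has not started yet,
        # so keep polling instead of treating it as an error.
        if status is not None and status.get("runtimeStatus") in TERMINAL_STATUSES:
            return status
        if loop.time() >= deadline:
            raise TimeoutError(f"instance {instance_id} did not finish in time")
        await asyncio.sleep(interval)
```

The same structure carries over to the C# client: call GetStatusAsync in a loop with a delay between attempts, and treat a null result as "not started yet".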
Are you using Functions 1.0 or 2.0? A similar issue has been reported for the Functions 2.0 runtime on GitHub.
https://github.com/Azure/azure-functions-durable-extension/issues/126
Also, when you say everything works great, do you mean the activityTrigger functions complete execution?
Are you running functions locally or is it deployed on Azure?

How to get runtime status of queue triggered azure function?

My Azure function calculates the results of certain request jobs (approx. 5 s to 5 min), where each job has a unique jobId based on the hash of the request message. Execution leads to deterministic results, so it is functionally a "pure function". Therefore we cache the results of already evaluated jobs in blob storage based on the jobId. All great so far.
Now if a request for jobId comes three scenarios are possible.
Result is in the cache already => then it is served from the cache.
Result is not in the cache and no function is running the evaluation => new invocation
Result is not in the cache, but some function is already working on it => wait for result
We do some custom table storage based progress tracking magic to tell if function is working on given jobId or not yet.
It works somehow, up to the point of 5 x restart -> poison queue scenarios. There we are quite hopeless.
I feel like we are hacking around some of already reliably implemented feature of Azure Functions internals, because exactly the same info can be seen in the monitor page in azure portal or used to be visible in kudu webjobs monitor page.
How to reliably find out in c# if a given message (jobId) is currently being processed by some function and when it is not?
Azure Durable Functions provide a mechanism for tracking the progress of execution of smaller tasks.
https://learn.microsoft.com/en-us/azure/azure-functions/durable-functions-overview
According to "Pattern #3: Async HTTP APIs", the orchestrator can provide information about the function status in a form like this:
{"runtimeStatus":"Running","lastUpdatedTime":"2017-03-16T21:20:47Z", ...}
This solves my problem about finding if given message is being processed.
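As a rough illustration of how such a status document could drive the three scenarios from the question, here is a small Python sketch (the question asks about C#, so this only shows the logic). It assumes the jobId is used to look up the status document, that the listed status names mean "still in progress", and that `decide` and its return values are hypothetical.

```python
import json

# Status names assumed to mean that the job is still being worked on.
ACTIVE_STATUSES = {"Pending", "Running", "ContinuedAsNew"}


def decide(job_id, cached_job_ids, status_json):
    """Pick one of the three scenarios for an incoming request.

    cached_job_ids: collection of jobIds whose results are already cached.
    status_json: the status document for this jobId as a JSON string,
    or None if no status exists for it.
    """
    if job_id in cached_job_ids:
        return "serve_from_cache"
    if status_json is not None:
        runtime_status = json.loads(status_json).get("runtimeStatus")
        if runtime_status in ACTIVE_STATUSES:
            return "wait_for_result"
    return "start_new_invocation"
```

For example, `decide("job-1", set(), '{"runtimeStatus": "Running"}')` selects the "wait" scenario, while a missing status document with no cached result leads to a new invocation.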
How to reliably find out in c# if a given message (jobId) is currently being processed by some function and when it is not?
If you'd like to detect which message is being processed and get the message ID in a queue-triggered Azure function, you can try the following code:
#r "Microsoft.WindowsAzure.Storage"
using System;
using Microsoft.WindowsAzure.Storage.Queue;

public static void Run(CloudQueueMessage myQueueItem, TraceWriter log)
{
    log.Info($"messageid: {myQueueItem.Id}, messagebody: {myQueueItem.AsString}");
}
