How to avoid callback-hell using asyncio in python - python-3.x

I have the following situation.
I have three services, JobInitiator, Mediator, and Executor, that talk to each other in the following manner.
The JobInitiator, once every X minutes, publishes a requested job to a queue (RabbitMQ).
The Executor service, every Y minutes, sends a REST API call to the Mediator service and asks if there are any jobs to be done. If so, the Mediator pulls a message from the queue and returns it to the Executor service in the response.
After the Executor finishes executing the job, it posts the job results to an API in the Mediator service, which publishes them to a queue that the JobInitiator listens to.
Side notes, restrictions and limitations:
The Mediator service is just a REST API wrapper around my queue. The main issue is that the Executor service can't be accessed publicly - only outgoing API calls are allowed.
I cannot connect the queue directly from the JobInitiator to the Executor service.
Up until now - nothing really special about this process. The thing I was wondering about is whether it's possible to write this with asyncio in Python so I won't have to deal with callback hell. Something like this (pseudocode):
class JobInitiator(object):
    async def do_job(self):
        token = await get_token()
        applicative_results = await get_results(token=token)
where get_token() and get_results() both run through the process described above.
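One way this can be sketched with asyncio: wrap each poll of the Mediator in a coroutine, so do_job() reads top to bottom with no callbacks. This is a minimal, self-contained sketch - the Mediator is simulated with sleeps and canned responses, and all names (_call_mediator, the paths, the payload) are made up; in a real Executor you would replace _call_mediator with an outgoing async HTTP call (e.g. via aiohttp) to the Mediator's REST API.

```python
import asyncio

async def _call_mediator(path):
    # stand-in for an outgoing REST call to the Mediator service
    await asyncio.sleep(0.01)
    return {"path": path, "payload": "job-data"}

async def get_token():
    # poll the Mediator until it hands back a job
    while True:
        response = await _call_mediator("/jobs/next")
        if response is not None:
            return response["payload"]
        await asyncio.sleep(1)  # back off between polls

async def get_results(token):
    # post the finished work and await the applicative results
    response = await _call_mediator(f"/jobs/{token}/results")
    return response["payload"]

async def do_job():
    token = await get_token()
    return await get_results(token=token)

print(asyncio.run(do_job()))
```

The polling loop is hidden inside the coroutines, so the calling code stays linear even though the underlying process is the request/poll/response dance described above.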

Related

Azure insights: 'requests' item type are only stored with success=='False'

I have an Azure durable function run by a timer trigger, which runs another function (UploadActivity) that makes an http call to a REST service external to Azure. We know for sure that a small percentage of all UploadActivity invocations end up in an http error with an exception raised; the rest are exception-free and upload some data to the remote http resource. The interesting finding is that Azure Insights' 'requests' collection contains only failed requests, with no successful ones recorded:
// gives no results
requests
| where success == "True"
// gives no results
requests
| where success <> "False"
// gives results
requests
| where success == "False"
I can't figure out why. Here are some attributes of one of the returned requests with success=='False', if it helps:
operation_Name:
UploadActivity
appName:
/subscriptions/1b3e7d9e-e73b-4061-bde1-628b728b43b7/resourcegroups/myazuretest-rg/providers/microsoft.insights/components/myazuretest-ai
sdkVersion:
azurefunctions: 4.0.1.16815
A 'request' is defined in Azure as an http call to an http-triggered function, but I have no http-triggered functions in my app, which makes things even more confusing. I think maybe these requests belong to Azure Insights calls, which could also be built on top of Azure Functions.
For a timer-triggered function it is normal that there are no records in the requests collection of Application Insights. If it were an http-triggered function, you would have one: only the request that triggers the function is recorded as a request in Application Insights, and a timer trigger does not respond to a request.
Once the function is triggered, all http requests (and all other kinds of communication, like calls to service buses etc.) executed by that function will be recorded as dependencies in the dependencies collection. This is by design and is how Application Insights works.

Waiting for an azure function durable orchestration to complete

Currently working on a project where I'm using a storage queue to pick up items for processing. The storage-queue-triggered function picks up the item from the queue and starts a durable orchestration. Normally, according to the documentation (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-queue), the storage queue trigger picks up 16 messages (by default) in parallel for processing, but since starting the orchestration is a simple and quick process, if I have a lot of messages in the queue I will end up with a lot of orchestrations running at the same time. I would like to start the orchestration and wait for it to complete before the next batch of messages is picked up, in order to avoid overloading my systems. The solution I came up with, and which seems to work, is:
public class QueueTrigger
{
    [FunctionName(nameof(QueueTrigger))]
    public async Task Run(
        [QueueTrigger("queue-processing-test", Connection = "AzureWebJobsStorage")] Activity activity,
        [DurableClient] IDurableOrchestrationClient starter,
        ILogger log)
    {
        log.LogInformation($"C# Queue trigger function processed: {activity.ActivityId}");
        string instanceId = await starter.StartNewAsync<Activity>(nameof(ActivityProcessingOrchestrator), activity);
        log.LogInformation($"Started orchestration with ID = '{instanceId}'.");
        var status = await starter.GetStatusAsync(instanceId);
        do
        {
            status = await starter.GetStatusAsync(instanceId);
        } while (status.RuntimeStatus == OrchestrationRuntimeStatus.Running || status.RuntimeStatus == OrchestrationRuntimeStatus.Pending);
    }
}
which basically picks up the message, starts the orchestration, and then waits in a do/while loop while the status is Pending or Running.
Am I missing something here, or is there a better way of doing this? (I could not find much online.)
Thanks in advance for your comments or suggestions!
This might not work, since you could either hit timeouts causing duplicate orchestration runs, or just force your function app to scale out, defeating the purpose of your code altogether.
Instead, you could rely on the concurrency throttles that Durable Functions comes with. While the queue trigger would queue up orchestration runs, only the defined maximum would run at any time on a single instance of the function app.
This would still cause your function app to scale out, so you would have to consider that as well when setting this limit; you could also set the WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT app setting to control how many instances your function app can scale out to.
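The concurrency throttles mentioned above are configured in the function app's host.json, under the durableTask extension. A sketch, with illustrative values only - tune them to what your downstream systems can absorb:

```json
{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentOrchestratorFunctions": 5,
      "maxConcurrentActivityFunctions": 10
    }
  }
}
```

With this in place, the queue trigger can keep starting orchestrations freely; each instance of the app will only actively run up to the configured number at a time.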
It could be that the function app's built-in scaling throttling does not reduce load on downstream services, because it is per app and will just cause the app to scale more. What is needed then is a distributed max instance count that all app instances adhere to. I have built this functionality into my Durable Functions orchestration app with a scaleGroupId and its max instance count. It has an API call to save this info, and the scaleGroupId is a string that can be set to anything that describes the resource you want to protect from overloading. Here is my app that can do this:
Microflow

How to use callbacks for long-running Azure Functions in a DevOps pipeline?

I have an Azure DevOps release pipeline that deploys a Python Azure function and then invokes it. The Python function does some heavy lifting, so it takes a few minutes to execute.
There are two options for the completion event of the Invoke Azure Function task: Api Response and Callback.
The maximum response time when using Api Response is 20 seconds, so I need to use Callback. OK, fine. Using this documentation, I implemented an Azure function that returns an HttpResponse immediately and then posts completion data to the specified endpoint. Here's the complete code for my Azure function:
import logging
import time
import threading

import azure.functions as func
import requests

def main(req: func.HttpRequest) -> func.HttpResponse:
    t = threading.Thread(target=do_work, args=(req,))
    t.start()
    return func.HttpResponse("Succeeded", status_code=200)

def do_work(req: func.HttpRequest):
    logging.info("Starting asynchronous worker task")
    #time.sleep(21)
    try:
        planUrl = req.headers["PlanUrl"]
        projectId = req.headers["ProjectId"]
        hubName = req.headers["HubName"]
        planId = req.headers["PlanId"]
        taskInstanceId = req.headers["TaskInstanceId"]
        jobId = req.headers["JobId"]
        endpoint = f"{planUrl}/{projectId}/_apis/distributedtask/hubs/{hubName}/plans/{planId}/events?api-version=2.0-preview.1"
        body = {
            "name": "TaskCompleted",
            "taskId": taskInstanceId,
            "jobId": jobId,
            "result": "succeeded"
        }
        logging.info(endpoint)
        logging.info(body)
        requests.post(endpoint, data=body)
    except Exception:
        logging.exception("Asynchronous worker task failed")
    finally:
        logging.info("Completed asynchronous worker task")
Now, the Invoke Azure Function task doesn't time out, but it doesn't complete either. It just looks like it's waiting for something else to happen:
Not sure what I'm supposed to do here. I'm also following this thread on GitHub, but it's not leading to a resolution.
Unless you are using Durable Functions (which shouldn't be necessary here), background jobs as part of a function execution like this are not supported with Azure Functions.
Once a response is returned by a function, that channel is closed, in such a way that any background operations would not work as expected (the exception being Durable Functions).
If you need something to work asynchronously like this, and you are ok with not waiting for a response, the recommended approach would be to use multiple functions.
In this case, your HttpTrigger could drop a message in a queue (or any other messaging service) and return a response immediately. Then you'd have a queue trigger (or other event-based trigger) pick up events from the queue, do your heavy lifting, and once completed, post to the endpoint in your example.
If you want to implement this with just one function, then from your devops pipeline, you could directly drop a message to a messaging service and have your function trigger off of that.
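The two-function split described above can be sketched framework-free in a few lines. Everything here is made up for illustration - in Azure you would use a Storage Queue output binding and a queue trigger rather than an in-process queue and thread, but the shape of the handoff is the same:

```python
import queue
import threading

work_queue = queue.Queue()
results = []

def http_trigger(payload):
    # equivalent of the HttpTrigger: enqueue the work and return immediately
    work_queue.put(payload)
    return "Accepted"

def queue_worker():
    # equivalent of the queue-triggered function: do the heavy lifting,
    # then post the completion event to the callback endpoint
    while True:
        payload = work_queue.get()
        if payload is None:  # sentinel used to stop the worker in this sketch
            break
        results.append(f"processed {payload}")  # heavy lifting + callback post here
        work_queue.task_done()

worker = threading.Thread(target=queue_worker)
worker.start()
print(http_trigger("job-1"))  # returns immediately, before the work is done
work_queue.put(None)
worker.join()
print(results)
```

The key property is that the HTTP response channel is closed immediately, and the long-running work (and the final TaskCompleted callback) happens in a separately triggered execution.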
Hope this helps!
I can't find it explicitly documented anywhere, but it appears you need to add an authorization header (at least when running locally).
Debugging my (very similar C#) function, the callback response gives me a 204 status code with an empty body when I include the header, and a 203 response with an Azure DevOps sign-in page in the body when I do not.
The bearer token is passed in the same way as the other system variables, as AuthToken, so you need to add the Authorization header with a value of "Bearer " + req.headers["AuthToken"].
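Concretely, in the Python function from the question, that header could be built like this (a sketch; build_callback_headers is a made-up helper name, and AuthToken is the header named in the answer above):

```python
def build_callback_headers(req_headers):
    # Azure DevOps passes the bearer token alongside the other
    # system variables, in the AuthToken header
    return {"Authorization": "Bearer " + req_headers["AuthToken"]}

# then the callback post becomes:
#   requests.post(endpoint, data=body, headers=build_callback_headers(req.headers))
print(build_callback_headers({"AuthToken": "abc123"}))
```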

How can I execute asynchronous tasks in the background as scheduled in Google Cloud Platform?

Problem
I want to get a lot of game data at 9 o'clock every morning, so I use App Engine and a cron job. However, I would like to add Cloud Tasks, and I don't know how to do it.
Question
How can I execute asynchronous tasks in the background as scheduled in Google Cloud Platform?
Which is more natural to implement: (Cloud Scheduler + Cloud Tasks) or (cron job + Cloud Tasks)?
Development Environment
App Engine Python (Flexible environment).
Python 3.6
Best regards,
Cloud Tasks are asynchronous by design. As you mentioned, the best way would be to pair them with Cloud Scheduler.
First of all, since Cloud Scheduler needs either a Pub/Sub topic or an HTTP endpoint to call once it runs the job, I recommend that you create an App Engine handler, which Cloud Scheduler will call, that creates and sends the task.
You can do so by following this documentation. First you will have to create a queue; afterwards, I recommend you deploy a simple application that has a handler to create the tasks. A small example:
from google.cloud import tasks_v2beta3
from flask import Flask, request

app = Flask(__name__)

@app.route('/createTask', methods=['POST'])
def create_task_handler():
    client = tasks_v2beta3.CloudTasksClient()
    parent = client.queue_path('YOUR_PROJECT', 'PROJECT_LOCATION', 'YOUR_QUEUE_NAME')
    task = {
        'app_engine_http_request': {
            'http_method': 'POST',
            'relative_uri': '/handler_to_call'
        }
    }
    response = client.create_task(parent, task)
    return response
Where 'relative_uri' is the handler that the task will call to process your data.
Once that is done, follow the Cloud Scheduler documentation to create jobs: specify the target as App Engine HTTP, set the URL to '/createTask', the service to whichever one is handling that URL, and the HTTP method to POST. Fill in the rest of the parameters as required; you can set the frequency to 'every monday 09:00'.

Whether to use Task or Parallel class methods

I am developing an ASP.NET Core component which has an interface for receiving requests for process execution. The request itself is a sync one: it is accepted and a submission token is returned to the caller. The requests are added to a queue and processed asynchronously. Each request execution involves making some REST calls to fetch data, execute the process, etc.
How do I process multiple requests from the queue in parallel: should I use the Task or the Parallel class?
What you are describing is a series of I/O bound REST calls. What you should do is loop over those calls, awaiting each. Something like
public async Task X()
{
    // add requests to queue
    ...
    foreach (var request in queue)
    {
        await ExecuteRequest(request);
    }
}
Parallelism in the classic sense is about threads and CPU-bound work, and so isn't suitable for your scenario.