Loading up clients in Azure Functions

I'm creating an Azure Function that will run in consumption mode and will get triggered by messages in a queue.
The function will typically need to make a database call when it gets triggered. I assume the function gets launched and loaded into memory when it is triggered, and that it gets terminated when it goes idle, because it's running on the consumption plan.
Based on this assumption, I don't think I can load up a singleton instance of my back-end client, which contains the logic for making database calls.
Is new'ing up my back-end client every time I need to perform a back-end operation then the right approach?

This is a wrong assumption. Your function will be loaded during the first call, and will be unloaded only after an idle timeout (5 or 10 minutes).
You will not pay for idling, but you will pay for the whole time that your function was running, including the wait time during the database calls (or other IO).
Singletons and statics work just fine, and you should reuse instances like HttpClient between calls.
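For instance, a minimal Node.js sketch of that pattern (assuming the v3 Node.js programming model and a hypothetical createBackendClient helper; in C# the equivalent is a static field or singleton-registered service holding, say, a shared HttpClient):

// index.js -- the client is created once per host instance, not per invocation
const { createBackendClient } = require('./backendClient'); // hypothetical module

// module scope: reused for every message this instance processes while warm
const client = createBackendClient();

module.exports = async function (context, queueItem) {
    // the same client (and its underlying connections) serves each invocation
    const result = await client.getRecord(queueItem.id);
    context.log('Processed queue item', queueItem.id, result);
};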

Related

Azure Function Queue Trigger Gets Triggered Twice, Two Instances Running?

I have an Azure Function with a queue trigger. The function must process one queue message at a time, one by one, sequentially. This is important, since the function uses an external OAuth API with various limitations on requesting new access and refresh tokens.
In order to process the queue sequentially, I have the following settings:
host.json
"queues": {
"batchSize": 1,
"newBatchThreshold": 0
}
Application settings
FUNCTIONS_WORKER_PROCESS_COUNT = 1
WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT = 1
Despite these settings, it sometimes happens that the function gets triggered multiple times. Not in my tests, where I dump a lot of messages in the queue, but when the function is live.
I've dived into the Application Insights log, and I found that the function gets restarted a lot during the day, and when it restarts, it sometimes starts the function twice.
For example, the log shows two instances being started. I'm guessing this causes my double triggers from the queue.
How can I make sure the function is triggered once?
I believe the only way to make it even more certain that only one message will be processed at a time is to use sessions. Give every message the same sessionId to guarantee your desired behavior. Be aware that some things, like dead-lettering, behave differently if you switch to sessions.
You can also use the [Singleton] attribute if that fits your function app. It should also work, but can incur additional costs when you use the consumption plan. So you need to validate how it affects costs given your workload.
See also my other answer here: azure servicebus maxConcurrentCalls totally ignored
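To illustrate the sessions approach on the sending side, here is a minimal sketch, assuming the queue is an Azure Service Bus queue with sessions enabled and the @azure/service-bus Node.js SDK (the connection string and queue name settings are placeholders):

const { ServiceBusClient } = require('@azure/service-bus');

async function enqueue(body) {
    const sbClient = new ServiceBusClient(process.env.SERVICE_BUS_CONNECTION);
    const sender = sbClient.createSender(process.env.QUEUE_NAME);
    try {
        // every message carries the same sessionId, so the queue delivers them
        // to a single session consumer, one at a time, in order
        await sender.sendMessages({ body, sessionId: 'single-consumer' });
    } finally {
        await sender.close();
        await sbClient.close();
    }
}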

Is it possible to stop MassTransit service after a saga/consumer completes

So I know this runs a bit counter to MassTransit style, but I want to take advantage of some key features of MT such as message broker connection management, sagas, scheduled messages.
However, I know the service will be rarely used. This is a fairly large data take from an API which has a throttle of 12,000 requests per hour. Once every 24 hours a saga will start to take data and move it into Data Lake. The service will run for some minutes until the throttle is hit, then start again where it left off (state) when enough time has passed, maybe something like 30 minutes later. The amount of data means this will repeat for several hours (2 to 4).
The fit for a saga and a scheduled message seems pretty good. But it would be better if the service did not have to incur operating costs for being awake 24x7. There will only ever be one request at a time for one set of API credentials. There may come a time when we might have multiple sets of credentials.
Is there a way to nicely close down the service when the saga completes?
As this is likely to be implemented with a container instance, I propose to start an instance from a queue-triggered function or similar.
Assuming that this is the approach you want to take (versus just an Azure Web Job, triggered by Azure Scheduler), there are a number of options:
Publish an event when the saga completes, consume that event, use Task.Run() or whatever to stop the bus.
Use a receive observer to keep track of in-flight messages; when the count reaches zero and stays there for n seconds, stop the bus and exit the function.
Though I wonder why not just use a scheduled job via Azure; that seems easier unless MassTransit is being used for more than just scheduling.

How does a Cloud Function instance handle multiple requests?

I'm trying to wrap my head around Cloud Function instances and how they work.
I'm asking about an example of an HTTP function, but I think the concept applies to any kind of function.
Let's say I have this cloud function that handles SSR for my app, named ssrApp.
And let's assume that it takes 1 second to complete every time it gets a request.
When Cloud Functions receives the 1st request, it will spin up an instance to respond to it.
QUESTION
How does that instance behave when multiple requests are coming?
From: https://cloud.google.com/functions/docs/concepts/exec
Each instance of a function handles only one concurrent request at a time. This means that while your code is processing one request, there is no possibility of a second request being routed to the same instance. Thus the original request can use the full amount of resources (CPU and memory) that you requested.
Does it mean that during that 1 second when my ssrApp function is running, if somebody hits my app URL, it is guaranteed that Cloud Function will spin up another instance for that second request? Does it matter if the function does only sync calls or some async calls in its execution? What I mean is, could an async call free the instance to respond to another request in parallel?
Does it mean that during that 1 second when my ssrApp function is running, if somebody hits my app URL, it is guaranteed that Cloud Function will spin up another instance for that second request?
That's the general behavior, although there are no guarantees around scheduling.
Does it matter if the function does only sync calls or some async calls in its execution? What I mean is, could an async call free the instance to respond to another request in parallel?
No, that makes no difference. If the container is waiting for an async call, it is still considered to be in use.
2022 Update
For future searchers, Cloud Functions Gen2 now supports concurrency: https://cloud.google.com/functions/docs/2nd-gen/configuration-settings#concurrency

Using mysql pool on amazon lambda

I am trying to use a MySQL connection pool in my Node.js service that is running on AWS Lambda.
This is the beginning of my module that works with database:
console.log('init database module ...');
var settings = require('./settings.json');
var mysql = require('mysql');
var pool = mysql.createPool(settings);
As the logs in the AWS console show, this piece of code is executed very often:
If I just deployed the service and executed 10 requests simultaneously - all these 10 requests execute this piece of code.
If I again execute 10 requests simultaneously immediately after first series - they don't execute this code.
If some time is passed from last query - then some of the requests re-execute that code.
Even if I use global, this decreases but does not eliminate the duplicates:
if (!global.pool) {
console.log('init database module ...');
var settings = require('./settings.json');
var mysql = require('mysql');
global.pool = mysql.createPool(settings);
}
Moreover, if a request execution ends with an error, this piece of code is executed after that request, and global.pool is null at that moment.
So, does this mean that using a pool in AWS Lambda is not possible?
Is there any way to make Lambda use the same pool instance every time?
Each time a Lambda function is invoked, it runs in its own, independent container. If no idle containers are available, a new one is automatically created by the service. Hence:
If I just deployed the service and executed 10 requests simultaneously - all these 10 requests execute this piece of code.
If a container is available, it may be, and very likely will be, reused. When that happens, the process is already running, so the global section doesn't run again -- the invocation starts with the handler. Therefore:
If I again execute 10 requests simultaneously immediately after first series - they don't execute this code.
After each invocation is complete, the container that was used is frozen, and will ultimately be either thawed and reused for a subsequent invocation, or if it isn't needed after a few minutes, it is destroyed. Thus:
If some time is passed from last query - then some of the requests re-execute that code.
Makes sense, now, right?
The only "catch" is that the amount of time that must elapse before a container is destroyed is not a fixed value. Anecdotally, it appears to be about 15 minutes, but I don't believe it's documented, since most likely the timer is adaptive... the service can (at its descretion) use heuristics to predict whether recent activity was a spike or likely to be sustained, and probably considers other factors.
(Lambda#Edge, which is Lambda integrated with CloudFront for HTTP header manipulation, seems to operate with different timing. Idle containers seem to persist much longer, at least in small quantities, but this makes sense because they are always very small containers... and again this observation is anecdotal.)
The global section of your code only runs when a new container is created.
Pooling doesn't make sense because nothing is shared during an invocation -- each invocation is the only one running in its container -- one per process -- at any one time.
What you will want to do, though, is change the wait_timeout on the connections. MySQL Server doesn't have an effective way to "discover" that an idle connection has gone away entirely, so when your connection goes away when the container is destroyed, the server just sits there, and the connection remains in the Sleep state until the wait_timeout expires. The default is 28800 seconds, or 8 hours, which is too long. You can change this on the server, or send the query SET @@wait_timeout = 900 (though you'll need to experiment with an appropriate value).
Or, you can establish and destroy the connection inside the handler for each invocation. This will take a little bit more time, of course, but it's a sensible approach if your function isn't going to be running very often. The MySQL client/server protocol's connection/handshake sequence is reasonably lightweight, and frequent connect/disconnect doesn't impose as much load on the server as you might expect... although you would not want to do that on an RDS server that uses IAM token authentication, which is more resource-intensive.
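A minimal sketch of that connect-inside-the-handler approach, reusing the mysql package and settings.json from the question (the query is just a placeholder):

const mysql = require('mysql');
const settings = require('./settings.json');

exports.handler = (event, context, callback) => {
    // open a fresh connection for this invocation only
    const connection = mysql.createConnection(settings);
    connection.query('SELECT 1 AS ok', (err, results) => {
        // always close the connection before finishing the invocation,
        // so no sleeping connection is left behind on the server
        connection.end();
        if (err) return callback(err);
        callback(null, results);
    });
};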

Node.js: using child_process to run a function until it returns

I want this kind of structure:
An Express backend gets a request and runs a function; this function will get data from different APIs and save it to the DB. Because this could take minutes, I want it to run in parallel while my web server continues processing requests.
I want this because of this scenario:
The user has a dashboard; after logging in, the app starts to collect data from the APIs and prepares the dashboard for the user. During that time the user can navigate through the site, or even close the browser, but the function has to keep running until it finishes fetching data. Once it finishes, all the data will be saved to the DB and the dashboard is ready for the user.
How can I do this using child_process, or any other kind of structure in Node.js?
Since what you're describing is all async I/O (networking or disk) and is not CPU intensive, you don't need multiple child processes in order to effectively serve multiple requests. This is the beauty of node.js. With async I/O, node.js can be working on many different requests over the same period of time.
Let's suppose part of your process is downloading an image. Your node.js code sends a request to fetch an image. That request is sent off via TCP. Immediately, there is nothing else to do on that request. It's winging its way to the destination server and the destination server is preparing the response. While all that is going on, your node.js server is completely free to pull other events from its event queue and start working on other requests. Those other requests do something similar (they start async operations and then wait for events to happen sometime later).
Your server might get 10 different async operations started and "in flight" before the first one actually starts getting a response. When a response starts coming in, the system puts an event into the node.js event queue. When node.js has a moment between other requests, it pulls the next event out of the event queue and processes it. If the processing has further async operations (like saving it to disk), the whole async and event-driven process starts over again as node.js requests a write to disk and node.js is again free to serve other events. In this manner, events are pulled from the event queue one at a time as they become available and lots of different operations can all get worked on in the idle time between async operations (of which there is a lot).
The only thing that upsets the apple cart and ruins the ability of node.js to juggle lots of different things all at once is an operation that takes a lot of CPU cycles (like say some unusually heavy duty crypto). If you had something like that, it would "hog" too much of the CPU and the CPU couldn't be effectively shared among lots of other operations. If that were the case, then you would want to move the CPU-intensive operations to a group of child processes. But, just doing async I/O (disk, networking, other hardware ports, etc...) does not hog the CPU - in fact it barely uses much node.js CPU.
So, the next question is often "how do I know if I have too much stuff that uses the CPU". The only way to really know is to just code your server properly using async I/O and then measure its performance under load and see how things go. If you're doing async things appropriately and the CPU still spikes to 100%, then you have too much CPU load and you'll want to either use generic clustering or move specific CPU-heavy operations to a group of child processes.
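As a concrete sketch of that fire-and-forget approach with plain async I/O (an Express route plus a hypothetical collectDashboardData helper; error handling kept minimal):

const express = require('express');
const app = express();
app.use(express.json());

// hypothetical helper: calls the external APIs and saves the results to the DB
async function collectDashboardData(userId) {
    // ... fetch from several APIs, then write to the database
}

app.post('/dashboard/refresh', (req, res) => {
    // start the long-running async work but do NOT await it, so the
    // response goes out immediately and the server keeps handling requests
    collectDashboardData(req.body.userId)
        .catch(err => console.error('dashboard refresh failed', err));

    res.status(202).json({ status: 'collecting' });
});

app.listen(3000);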
