How to persist state in an Azure Function (the cheap way)?

How can I persist a small amount of data between Azure Function executions? Like in a global variable? The function runs on a Timer Trigger.
I need to store the result of one Azure Function execution and use this as input of the next execution of the same function. What is the cheapest (not necessarily simplest) way of storing data between function executions?
(Currently I'm using the free grant of Azure Functions executions that everyone gets, and now I'd like to save state in a similarly free or cheap way.)

There are a couple of options - I'd recommend that you store your state in a blob.
You could use a blob input binding to read global state for every execution, and a blob output binding to update that state.
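A minimal sketch of the blob-binding approach (in-process C# model; the container, blob name, and schedule here are assumptions):

using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class StatefulTimer
{
    [FunctionName("StatefulTimer")]
    public static void Run(
        [TimerTrigger("0 */5 * * * *")] TimerInfo timer,
        [Blob("function-state/state.txt", FileAccess.Read)] string currentState,
        [Blob("function-state/state.txt", FileAccess.Write)] out string newState,
        ILogger log)
    {
        // currentState is null on the very first run, before the blob exists.
        int counter = string.IsNullOrEmpty(currentState) ? 0 : int.Parse(currentState);
        counter++;
        log.LogInformation($"Execution #{counter}");
        newState = counter.ToString(); // written back to the blob on return
    }
}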
You could also remove the timer trigger and use queues, with the state stored in the queue message and a visibility timeout on the message to set the schedule (i.e. the next execution time).
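A sketch of that queue-based schedule (Azure.Storage.Queues; the queue name, app setting, and interval are assumptions, and depending on your SDK/runtime versions you may need Base64 message encoding):

using System;
using System.Threading.Tasks;
using Azure.Storage.Queues;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class QueueScheduled
{
    [FunctionName("QueueScheduled")]
    public static async Task Run(
        [QueueTrigger("schedule-queue")] string state,
        ILogger log)
    {
        log.LogInformation($"Running with state: {state}");
        string newState = state; // replace with your real calculation

        // Re-enqueue the state; the message stays invisible until the timeout
        // elapses, which effectively schedules the next execution.
        var queue = new QueueClient(
            Environment.GetEnvironmentVariable("AzureWebJobsStorage"), "schedule-queue");
        await queue.SendMessageAsync(newState, visibilityTimeout: TimeSpan.FromMinutes(5));
    }
}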
Finally, you could use a file on the file system, as it is shared across the function app.
If you can accept the possibility of data loss and only care about state at the instance level, you can:
- maintain a static data structure
- write to instance-local storage

Durable entities are now available to handle persistence of state.
https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-entities?tabs=csharp
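A minimal counter entity in the function-based syntax (the entity and operation names are illustrative):

using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;

public static class Counter
{
    [FunctionName("Counter")]
    public static void Run([EntityTrigger] IDurableEntityContext ctx)
    {
        switch (ctx.OperationName.ToLowerInvariant())
        {
            case "add":
                ctx.SetState(ctx.GetState<int>() + ctx.GetInput<int>());
                break;
            case "get":
                ctx.Return(ctx.GetState<int>());
                break;
        }
    }
}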

This is an old thread, but it's worth sharing the newer way to handle state in an Azure Function.
We now have the Durable Functions approach from Microsoft itself, where we can maintain function state easily and effectively. Please refer to the documentation below.
https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-overview


StackExchange.Redis issue in Azure Function (ResetNonConnected)

I have an Azure Function called by a Logic App and the goal of this function is to read an Azure Redis Cache to find a particular key.
I use the StackExchange.Redis lib to read and write to the Redis cache.
The function can be called in parallel because multiple Logic App instances can be executed at the same time.
I sometimes (but not rarely) get a "ResetNonConnected" error while reading the contents of the Redis cache, and once this issue starts, I need to stop and start the Function App to make the function work again.
The code of my function is very simple: I read a key (StringGet) and, if it exists, I compare it with a value I received as an input to my function. If the value received is greater than the one already in the cache, or if the key is not in the cache, then I update the value in the cache (StringSet, then KeyExpire).
The errors are not occurring under high load: I limited the number of parallel instances of my Logic App to 10, and I still get this error.
Are there any known issues with this lib in an Azure function? What is the alternative to using a Redis Cache in an Azure function so to be sure that it will work just fine?
I won't help you with the Redis error, but:
What is the alternative to using a Redis Cache in an Azure function so to be sure that it will work just fine?
For read-key/write-key operations, the typical approach would be Azure Table Storage. It has a similar key-based API, and it also has native support in Azure Functions in the form of the Table Storage binding. Most probably you won't have to write any custom code to read/write values, and can do everything directly with binding parameters.
Key lookups on Table Storage scale linearly to any load and cost almost nothing.
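A sketch of the same read-compare-write logic against a Table binding (in-process model; the table name, partition key, and entity shape are assumptions). Note that, like the Redis version, this read-modify-write is not guarded against races; use an ETag-conditioned Replace if that matters:

using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.Cosmos.Table;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;

public class CounterEntity : TableEntity
{
    public long Value { get; set; }
}

public static class UpsertValue
{
    [FunctionName("UpsertValue")]
    public static async Task<IActionResult> Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
        [Table("Values")] CloudTable table)
    {
        string key = req.Query["key"];
        long value = long.Parse(req.Query["value"]);

        // Read the current entity for the key, if any.
        var fetch = await table.ExecuteAsync(
            TableOperation.Retrieve<CounterEntity>("values", key));
        var existing = fetch.Result as CounterEntity;

        // Update only when the key is missing or the new value is greater.
        if (existing == null || value > existing.Value)
        {
            await table.ExecuteAsync(TableOperation.InsertOrReplace(
                new CounterEntity { PartitionKey = "values", RowKey = key, Value = value }));
        }
        return new OkResult();
    }
}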
I found the root cause of my issue: my ConnectionMultiplexer was not declared static and was created each time the function was called.
By declaring it static, as is well documented (my bad), it works just fine.
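The documented pattern, for reference (the app setting name is an assumption):

using System;
using StackExchange.Redis;

public static class RedisConnection
{
    // One multiplexer per process, created lazily and shared by every
    // function invocation instead of reconnecting on each call.
    private static readonly Lazy<ConnectionMultiplexer> LazyConnection =
        new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(
            Environment.GetEnvironmentVariable("RedisConnectionString")));

    public static ConnectionMultiplexer Connection => LazyConnection.Value;
}

// Usage inside the function body:
// var db = RedisConnection.Connection.GetDatabase();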

How to notify all the instances of an Azure Function of an event

We have a Function App which scales to a couple of hundred instances under peak load. We now need a way to notify all the instances when a particular event happens (for example, a new message in a queue). What are the potential approaches to achieve this? Please advise.
One pattern is to have a common marker (a blob) which changes when the common state is updated. Each instance can proactively check the blob's ETag to determine whether the state has changed, and if so, it knows to reload its state (a sketch follows the notes below).
Note that:
- It's important for the instance to check proactively (rather than wait for a change notification) because most notification mechanisms can lag. A blob trigger, for example, can lag by several minutes.
- There's no way to bypass the load balancer and send a message to a specific instance, so you can't proactively push an invalidation message to each instance.
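A sketch of the ETag check (Azure.Storage.Blobs; the marker blob and ReloadStateAsync are assumptions):

using System.Threading.Tasks;
using Azure;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

public static class StateWatcher
{
    private static ETag _lastSeenETag;

    // Call at the start of each execution: if the marker blob's ETag changed,
    // another instance updated the shared state and it should be reloaded.
    public static async Task CheckMarkerAsync(BlobClient marker)
    {
        BlobProperties props = await marker.GetPropertiesAsync();
        if (props.ETag != _lastSeenETag)
        {
            _lastSeenETag = props.ETag;
            await ReloadStateAsync(); // hypothetical: re-read the shared state
        }
    }

    private static Task ReloadStateAsync() => Task.CompletedTask; // placeholder
}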
Another pattern is to have your state fully externalized in something like Redis, which is easy to invalidate and update. (Although that's essentially just a special case of the prior suggestion.)

Azure functions with complex c# code

I have complex C# code with multiple classes (and the classes have several methods), and I want to implement it as an Azure Function. The problem is the architecture: streaming data comes into the function as input, and after complex calculations inside the class methods, I need to return the calculated values as a stream again. The values to be returned are produced inside the class methods, and I'm having trouble finding a way to return them to the Run function. Is there an easy way to do this?
The structure is like this:

public static void Run(string myQueueItem, TraceWriter log)
{
    // Gets data from Service Bus once per second
    new Class1().Function1(myQueueItem);
}

public class Class1
{
    public void Function1(string input)
    {
        new Class2().Function2(input);
    }
}

public class Class2
{
    public void Function2(string input)
    {
        // The output of interest is produced in here: the program creates an
        // output after roughly 30 seconds, then a new one about every 20 seconds.
    }
}
Many thanks
Your question is not very clear, but based on your comments I understood that your processing logic and calculation results depend on multiple incoming messages.
Azure Function (Run method) will be called once per each Service Bus message in your queue. That means that you need to persist some state, e.g. the previous messages, across Function calls.
Azure Functions themselves don't provide any reliable in-process state storage, so you need to make use of external storage (e.g. SQL Database, Cosmos DB, Table Storage etc).
Your function flow would look something like this:
1. Run is called for an incoming Service Bus message.
2. You load the previous state of the function from external storage.
3. You instantiate your Class1/Class2/etc. hierarchy.
4. You pass the incoming message AND your state to Class1/Class2.
5. Your logic produces an output message, a new state, or both.
6. You persist the state back to the storage.
7. You return the output message to the output binding.
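A minimal sketch of that flow, assuming Table Storage for the state; StateEntity and Class1.Process are hypothetical stand-ins for your own types:

using System.Threading.Tasks;
using Microsoft.Azure.Cosmos.Table;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class StateEntity : TableEntity
{
    public string Payload { get; set; } // whatever state your calculation needs
}

public static class ProcessMessage
{
    [FunctionName("ProcessMessage")]
    public static async Task Run(
        [ServiceBusTrigger("input-queue")] string myQueueItem,
        [Table("FunctionState")] CloudTable table,
        [Queue("output-queue")] IAsyncCollector<string> output,
        ILogger log)
    {
        // Steps 1-2: load the previous state (null on the very first run).
        var fetch = await table.ExecuteAsync(
            TableOperation.Retrieve<StateEntity>("state", "singleton"));
        var state = fetch.Result as StateEntity
            ?? new StateEntity { PartitionKey = "state", RowKey = "singleton" };

        // Steps 3-5: run the existing calculation with the message and state.
        var (result, newState) = new Class1().Process(myQueueItem, state);

        // Step 6: persist the new state.
        await table.ExecuteAsync(TableOperation.InsertOrReplace(newState));

        // Step 7: hand the output message to the output binding, if any.
        if (result != null)
            await output.AddAsync(result);
    }
}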
In case you don't want any external state, Azure Functions might not be the right service for you. For example, you could use a Web Job, which runs continuously and can keep a series of messages in memory.
Edit: As #Gaurav suggested, you should take a look at Durable Functions, but they are still at an early preview stage.
You should take a look at the recently announced Azure Durable Functions. Unfortunately I have only read about them and not used them, so I won't be able to propose exactly how they would solve your problem.
One neat thing I liked is that, unlike regular Functions, they are stateful in nature, which lets you persist local state.
Another thing I liked is that they are intended for long-running tasks, which is what you're after.
Looking at your question, I believe the Function Chaining pattern could be useful to you.
First of all, Azure Functions are not designed for complex processing. Consider Worker Roles instead, or a microservice if your solution is already on Service Fabric.

How should I replicate Firebase Queue data properties in Cloud Functions for Firebase?

Firebase Queue uses the following properties to track processing:
_state, _state_changed, _owner, _progress, _error_details, _id
I am migrating some of my Firebase Queue code to Cloud Functions for firebase. How would I obtain an equivalent data property for the _owner property?
Alternatively, considering this property is primarily to alleviate issues with concurrency, does Cloud Functions already resolve this issue in another way making my implementation unnecessary?
The semantics are a little different between the two systems. The primary reason for the _owner property in Firebase Queue is that it can't kill user code once the lease for an operation expires, and so we have to guard any side-effects of the processing function with a check. Cloud Functions controls the execution environment and can kill user code if needed, so we can assume if it's writing, then it still has the lease.
In order to emulate the behavior, it would be possible to generate a v4 UUID at the start of every Cloud Function execution, write that back to the database somewhere with a timestamp so you can timeout leases (guarded by a transaction so you don't clobber other owners), then compare those with the current UUID and time in a transaction every time you write back to the database in your function.
The _state and _state_changed properties should be more than adequate to resolve concurrency issues (specifically, race-conditions) within Google Cloud Functions for Firebase (I'm still investigating how Google resolves this internally).
However, an _owner property would be additionally helpful in setups with task workers across multiple platforms.

Azure Queue Storage Proper Call Pattern in 2015?

What is the proper call/code pattern to write to Azure Queue Storage in a performant way?
Right now, the pseudo-code is:
Create a static class with StorageCredentials and CloudStorageAccount properties. At application startup, read values from the config file into {get;}-only properties.
Create a class with an async Task method that takes my application message type as an input parameter. The method serializes the type and creates a new CloudQueueMessage, a new CloudQueueClient, and a new CloudQueue reference. Where configuration information is needed, it is read from the static class. My code then calls:
await Task.Run(() => theref.AddMessage(themessage));
It looks to me as if I have some redundancy in the code, and I'm not sure if/how connections might be pooled to the queue, or whether I need retry logic as I would with database (SQL Server etc.) connectivity.
I am trying to understand which queue accessing steps can be reduced or optimized in any way.
All ideas appreciated.
Using .NET 4.5.2, C#. Code is executing in a Cloud Service (Worker Role).
Thanks.
Azure Storage Client Library already retries for you by default in case of service/network errors. It will retry up to 3 times per operation.
You can change your call to await theref.AddMessageAsync(themessage) instead of blocking on the synchronous AddMessage call on a separate thread.
As of the latest library, you can reuse the CloudQueueClient object to get a new reference to CloudQueue.
As long as you are calling AddMessageAsync sequentially, the same connection will be reused whenever possible. If you call it concurrently, more connections will be created, up to ServicePointManager.DefaultConnectionLimit connections. So, if you would like concurrent access to the queue, you might want to increase this number.
Disabling Nagle algorithm through ServicePointManager.UseNagleAlgorithm is also recommended given the size of queue messages.
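For example, in a Worker Role you would apply both settings once at startup, before any storage calls are made (the connection limit value here is an illustrative assumption):

using System.Net;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Raise the per-host connection cap for concurrent AddMessageAsync calls.
        ServicePointManager.DefaultConnectionLimit = 100;
        // Small queue messages suffer extra latency under Nagle.
        ServicePointManager.UseNagleAlgorithm = false;
        return base.OnStart();
    }
}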
I would cache your CloudQueue reference and reuse it. Each time you add a message to the queue, this class constructs a REST call using HttpClient. Since your credentials and storage/queue URI are already known, this could save a few cycles.
Also, using AddMessageAsync instead of AddMessage is recommended.
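A sketch of the cached reference with async sends (the setting and queue names are assumptions):

using System.Threading.Tasks;
using Microsoft.Azure;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static class QueueSender
{
    // Parse the account and resolve the queue once; reuse for every send.
    private static readonly CloudQueue Queue =
        CloudStorageAccount.Parse(
            CloudConfigurationManager.GetSetting("StorageConnectionString"))
        .CreateCloudQueueClient()
        .GetQueueReference("myqueue");

    public static Task SendAsync(string payload) =>
        Queue.AddMessageAsync(new CloudQueueMessage(payload));
}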
As a reference, you can see the implementation in the storage client libraries here.
