We have a C# Azure Function App on the Consumption plan, which scales dynamically, i.e. new instances are added depending on the load. We now have an object which needs to be used across all instances of this Azure Function. As the object is complex (it has delegates), I cannot serialize it and put it in an external cache so that all instances can access it from there. Please suggest the best way to handle this scenario.
Azure Function instances run on different servers and do not share the same memory, so there is no way for all of them to access a single .NET object without remote calls (and thus serialization).
There's something fundamentally contradictory in your question: you can't have both (a) a function that scales out to many machines and (b) all instances sharing in-memory objects as if they were running on a single machine.
It goes the other way too: serverless means that if nobody is using your function, all infrastructure can be shut down (hence saving you money), but that implicitly means you'd lose your in-memory state and need to be able to serialize it.
Minimize the shared state, make it serializable (there are patterns for dealing with things like delegates [1]), and use external storage (Azure Storage, or a cache like Redis).
[1] For delegates, one trick is to maintain a dictionary of handles to delegates, and then serialize the handle. Similarly, for polymorphism, you might serialize the type name.
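A minimal sketch of that handle trick, assuming every instance registers the same delegates at startup; the DelegateRegistry and WorkItemState types and the "uppercase" key are purely illustrative, not part of any Azure SDK:

```csharp
using System;
using System.Collections.Concurrent;

// Illustrative registry: every instance registers the same delegates under
// well-known keys at startup, so only the key ever has to cross the wire.
public static class DelegateRegistry
{
    private static readonly ConcurrentDictionary<string, Func<string, string>> Handlers =
        new ConcurrentDictionary<string, Func<string, string>>();

    public static void Register(string key, Func<string, string> handler) =>
        Handlers[key] = handler;

    public static Func<string, string> Resolve(string key) => Handlers[key];
}

// The state that goes to Redis/Azure Storage carries only the handle (key),
// which is trivially serializable.
public class WorkItemState
{
    public string Payload { get; set; }
    public string HandlerKey { get; set; }
}

public static class Example
{
    public static void Main()
    {
        // Done once per instance, e.g. in a startup hook.
        DelegateRegistry.Register("uppercase", s => s.ToUpperInvariant());

        // Deserialized from the external cache on any instance...
        var state = new WorkItemState { Payload = "hello", HandlerKey = "uppercase" };

        // ...and the delegate is re-attached locally.
        var handler = DelegateRegistry.Resolve(state.HandlerKey);
        Console.WriteLine(handler(state.Payload)); // HELLO
    }
}
```

Only WorkItemState ever needs to go to Redis or Azure Storage; the delegates themselves never leave the process.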
Here is the essence of what I want:
There is a service X which other services can use to stream files that service X stores. E.g. GET /files/8c267d1c-2b6d-4fe3-969f-4820fe8b3a9c, which returns foo.txt's content, might be requested by services A, B, and C, multiple times by each.
Service X should be implemented using some serverless technology such as Azure Functions.
Obviously, I want some type of caching system so that I don't have to stream the file from its stored location every time. Instead of a 2-part stream like
A <------ Instance of X Application <----- X's storage (e.g. Azure Files)
it would be great to have it like
A <------ [TBD]
Obviously, I want this to scale
Ideally, I want to use out-of-box solutions
Ideally, I don't want to expose the storage mechanism to the client services A, B, C for them to access directly.
Are these goals incompatible?
Not sure if you have already considered a CDN. There are quite a few providers in the market and most of them are globally distributed. A lot depends on how frequently you update this content. If it is not really static and is updated frequently, then a CDN may not fit the purpose.
It does take care of your first two goals, but a CDN has a predefined and well-published architecture for how it operates, so I'm not really sure about your point regarding not exposing the storage mechanism. You can control the access policies on this content, though.
I am playing with node.js, porting a web API currently implemented in Delphi and learning how to do things the node way.
My application uses two instances of a database backend, each connected to a different DB. Service code can get a connection to each from one of two global instances of a connection pool.
How would one do this in node.js?
I know how to write the pool as a module exporting a constructor so I can create two instances, each configured with a different connection string. I can hold these in global variables in the main server file, or in a module that holds references to various services and does the routing of requests.
But how would the individual service modules that want to get a connection, do something with it, and then return it to the pool get access to these instances?
So far I have been happy with require()-ing such shared modules everywhere I use them, so they behave as singletons and the state is shared across all the places they are require()-ed. But how do I do that if I want two (or n) instances that are configured differently?
All I can think of so far is actively injecting a reference to them everywhere. That would work, but is it the proper solution?
P.S.: I did not know until now that Doubletons are a thing.
While exploring SF Reliable Services, I want to make sure that the following basic statements are true.
The Reliable Services default communication stack (DefaultStack) and the Reliable Actors communication stack (using ServiceProxy/ActorProxy) can only be used for communicating inside the SF cluster. Clients from outside must use the Web API/WCF stacks.
ServicePartitionResolver, CommunicationClientFactory, and ServicePartitionClient are already implemented inside the DefaultStack. I don't have to worry about them if I use only the DefaultStack.
A stateful service can have more than one partition, and I want, for example, to post an item so it gets processed. It is not SF's responsibility to decide which partition the posting client should use. I need to manually implement an algorithm that resolves a partition key or name and use it in the ServiceProxy constructor (for the DefaultStack).
You're correct on all those points.
If you want to communicate with the service from outside Service Fabric, you need to use something like an OwinCommunicationListener (see here).
You’d only have to implement those if you wanted to plug in your own communication stack.
Yep, you’d need to define the partition key when you’re creating a ServiceProxy.
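For illustration, a minimal client-side sketch of that, assuming an Int64-range-partitioned stateful service exposed over the default remoting stack. The IProcessingService contract, the fabric:/MyApp/ProcessingService URI and the partition count of 4 are placeholders, and the ServiceProxy.Create overload shown is the one from the newer Services.Remoting client library:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Client;
using Microsoft.ServiceFabric.Services.Remoting;
using Microsoft.ServiceFabric.Services.Remoting.Client;

// Illustrative remoting contract exposed by the stateful service.
public interface IProcessingService : IService
{
    Task PostItemAsync(string itemId, string payload);
}

public static class ProcessingClient
{
    // Map the item onto one of the service's Int64-range partitions. Any stable
    // scheme works as long as every caller uses the same one; string.GetHashCode
    // is not stable across processes, so a simple deterministic hash is used here.
    private static long ResolvePartitionKey(string itemId, int partitionCount)
    {
        long hash = 17;
        foreach (char c in itemId)
        {
            hash = unchecked(hash * 31 + c);
        }
        return (hash & long.MaxValue) % partitionCount;
    }

    public static Task PostAsync(string itemId, string payload)
    {
        long key = ResolvePartitionKey(itemId, partitionCount: 4); // 4 partitions assumed

        var proxy = ServiceProxy.Create<IProcessingService>(
            new Uri("fabric:/MyApp/ProcessingService"),
            new ServicePartitionKey(key));

        return proxy.PostItemAsync(itemId, payload);
    }
}
```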
The Azure Service Fabric appears to be focused on scenarios in which all data can fit within RAM and persistence is used as a backing store. Reliable Services are designed to store information in Reliable Collections, which use a log-checkpoint system where logged information is written into RAM. Meanwhile, for Reliable Actors, the default actor state provider is "the distributed Key-Value store provided by Service Fabric platform." This seems to indicate that the same limitations would apply.
There may, however, be situations in which one would like to use the Service Fabric for "hot data" but write "cold data" to some form of permanent storage. What are best practices for handling this transition?
In Orleans, this seems to be handled automatically, using a persistence store such as Azure tables. But it seems that a principal design purpose of Service Fabric and the Reliable Collections is to avoid needing external services, thus enhancing data locality. The current documentation anticipates the possibility that one would want to move data into some permanent store for disaster recovery and analytics, but it does not discuss the possibility of moving data back and forth between persistence-backed in-memory actors and more permanent forms of storage.
A possible answer is that the Service Fabric already does this. Maybe a Reliable Dictionary has some built-in mechanism for switching between persistence-backed in-memory storage and permanent storage.
Or, maybe the answer is that one must manage this oneself. One approach might be for an Actor to keep track of how "hot" it is and switch its persistence store as necessary. But this sacrifices one of the benefits of the Actor model, the automatic allocation and deallocation of actors. Similarly, we might periodically remove items from the Reliable Dictionary, add them to some other persistence store, and then add them back. Again, though, this requires knowledge of when it makes sense to make the transition.
A couple of examples may help crystallize this:
(1) Suppose that we are implementing a multiplayer game with many different "rooms." We don't need all the rooms in memory at once, but we need to move them into memory and use local persistence as a backup once players join them.
(2) Suppose that we are implementing an append-only B-Tree as part of a database. The temptation would be to have each B-Tree node be a stateful actor. We would like hot b-trees to remain in memory but of course the entire index can't be in memory. It seems that this is a core scenario that is already implemented for things like DocumentDB, but it's not clear to me from the documentation how one would do this.
A related question that I found is here. But that question focuses on when to use Azure Service Fabric vs. external services. My question is on whether there is a need to transition between them, or whether Azure Service Fabric already has all the capability needed here.
The Key-Value store state provider does not require everything to be kept in memory. This provider actually stores the state of all actors on the local disk and the state is also replicated to the local disk on other nodes. So the KVS store is considered a persistent and reliable store.
In addition to that, the state of active actors is also stored in memory. When an actor hasn't been used in a while, it gets deactivated and garbage collected. When this happens, the in-memory copy is freed and only the copy on disk remains. When the actor is activated again, the state is fetched from disk and remains in memory as long as the actor is active.
Also, KVS is not the only built-in state provider. We also have the VolatileActorStateProvider (http://azure.microsoft.com/en-gb/documentation/articles/service-fabric-reliable-actors-platform/#actor-state-provider-choices). This is the state provider that keeps everything in memory.
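If it helps to see that choice in code: in more recent Reliable Actors SDKs the provider is usually selected with the StatePersistence attribute on the actor class rather than by naming the provider type directly (the IRoomActor contract below is just a placeholder):

```csharp
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Runtime;

// Placeholder actor contract.
public interface IRoomActor : IActor
{
    Task JoinAsync(string playerId);
}

// Persisted: state is written to local disk and replicated, i.e. the KVS
// behaviour described above. StatePersistence.Volatile keeps replicated state
// in memory only; StatePersistence.None keeps no replicated state at all.
[StatePersistence(StatePersistence.Persisted)]
internal class RoomActor : Actor, IRoomActor
{
    public RoomActor(ActorService actorService, ActorId actorId)
        : base(actorService, actorId)
    {
    }

    public Task JoinAsync(string playerId) =>
        StateManager.AddOrUpdateStateAsync(playerId, 1, (key, joins) => joins + 1);
}
```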
The KvsActorStateProvider does indeed store actor state in a KeyValueStore, which is a structure similar to the ReliableDictionary.
The first question I'd ask is whether you really need to relegate old actors' state to cold storage. The limitation of keeping everything in memory doesn't restrict you to a total number of actors, but to a total number per replica. So you must first consider the partitioning strategy so that your actors are distributed across a number of different replicas. As your demands grow, you can then add more machines to the cluster and Service Fabric will orchestrate movement of the replicas to the new machines. For more information on partitioning of the Actor service, see http://azure.microsoft.com/en-gb/documentation/articles/service-fabric-reliable-actors-platform/
If you do want to use cold storage after some time, then you have a couple of options. Firstly, you could decorate your actors with a custom ActorStateProviderAttribute that returns your own implementation of an IActorStateProvider that can handle persistence as you decide.
Alternatively, you could handle it entirely within your Actor implementation. Hook into the actor lifecycle: in OnDeactivateAsync, when the instance is about to be deactivated and garbage collected (or from an Actor Reminder at some specified time in the future), serialise the state, store it in cold storage such as blob or table storage, and null out the State property. The OnActivateAsync override can then be used to retrieve this state from offline storage and deserialise it.
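A rough sketch of that second approach using the lifecycle overrides. The IColdStore abstraction and the IArchivingActor contract are hypothetical, and the state-manager calls shown belong to the newer Reliable Actors API rather than the older State property:

```csharp
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Runtime;

// Hypothetical cold-storage abstraction (e.g. backed by blob or table storage).
public interface IColdStore
{
    Task SaveAsync(string key, string serializedState);
    Task<string> TryLoadAsync(string key); // returns null when nothing is archived
}

// Placeholder actor contract.
public interface IArchivingActor : IActor
{
    Task<string> GetDataAsync();
}

[StatePersistence(StatePersistence.Persisted)]
internal class ArchivingActor : Actor, IArchivingActor
{
    private const string StateKey = "data";
    private readonly IColdStore coldStore;

    // The IColdStore is supplied via an actor factory at registration time.
    public ArchivingActor(ActorService actorService, ActorId actorId, IColdStore coldStore)
        : base(actorService, actorId)
    {
        this.coldStore = coldStore;
    }

    protected override async Task OnActivateAsync()
    {
        // If the hot state is missing, try to rehydrate it from cold storage.
        var hot = await StateManager.TryGetStateAsync<string>(StateKey);
        if (!hot.HasValue)
        {
            var archived = await coldStore.TryLoadAsync(Id.ToString());
            if (archived != null)
            {
                await StateManager.SetStateAsync(StateKey, archived);
            }
        }
    }

    protected override async Task OnDeactivateAsync()
    {
        // On deactivation, push the state to cold storage and drop the hot copy.
        var hot = await StateManager.TryGetStateAsync<string>(StateKey);
        if (hot.HasValue)
        {
            await coldStore.SaveAsync(Id.ToString(), hot.Value);
            await StateManager.TryRemoveStateAsync(StateKey);
            await SaveStateAsync(); // make sure the removal is persisted before teardown
        }
    }

    public async Task<string> GetDataAsync()
    {
        var value = await StateManager.TryGetStateAsync<string>(StateKey);
        return value.HasValue ? value.Value : null;
    }
}
```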
I have a lot of Singleton implementations in an ASP.NET application and want to move my application to an IIS Web Garden environment for performance reasons.
Correct me if I'm wrong: moving to an IIS Web Garden with n worker processes, there will be one singleton object created in each worker process, which makes it not a single object anymore because n > 1.
Can I make all those singleton objects behave as a single object again in an IIS Web Garden?
I don't believe you can (unless you can get those IIS workers to use objects in shared memory somehow).
This is a scope issue. Your singleton instance uses process space as its scope. And like you've said, your implementation now spans multiple processes. By definition, on most operating systems, a singleton will be tied to a certain process space, since it's tied to a single class instance or object.
Do you really need a singleton? That's a very important question to ask before using that pattern. As Wikipedia says, some consider it an anti-pattern (or code smell, etc.).
Examples of alternate designs that may work include...
1. Have multiple objects synchronize against a central store or with each other.
2. Use object serialization if applicable.
3. Use a Windows Service and some form of IPC, e.g. System.Runtime.Remoting.Channels.Ipc.
I like option 3 for large websites. A companion Windows Service is very helpful in general: lots of things like sending mail, batch jobs, etc. should already be decoupled from the front-end worker process. You can push the singleton server object into that process and use client objects in your IIS worker processes.
If your singleton class works with multiple objects that share state or just share initial state, then options 1 and 2 should work respectively.
Edit
From your comments it sounds like the first option, in the form of a distributed cache, should work for you; a rough sketch follows the list below.
There are lots of distributed cache implementations out there.
Microsoft AppFabric (formerly called Velocity) is their very recent move into this space.
Memcached ASP.Net Provider
NCache (MSDN article): a custom ASP.NET cache provider with OutProc support. There should be other custom cache providers out there.
Roll your own distributed cache using Windows Services and IPC (option 3).
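As a rough sketch of what the distributed-cache option looks like from the application side, with ICacheClient standing in for whichever provider you choose (AppFabric, memcached, NCache, or your own):

```csharp
// Placeholder for whichever distributed cache client you choose
// (AppFabric/Velocity, memcached, NCache, ...). Every worker process in the
// web garden talks to the same out-of-process cache.
public interface ICacheClient
{
    T Get<T>(string key);
    void Put<T>(string key, T value);
}

// The "singleton" becomes a thin per-process accessor; the shared state itself
// lives in the distributed cache, so every worker sees the same value.
public sealed class SharedCounter
{
    private const string Key = "shared-counter";
    private readonly ICacheClient cache;

    public SharedCounter(ICacheClient cache)
    {
        this.cache = cache;
    }

    public int Current => cache.Get<int>(Key);

    public void Increment()
    {
        // NOTE: a real implementation needs the provider's atomic or locking
        // primitives here; a plain get-then-put is racy across worker processes.
        cache.Put(Key, cache.Get<int>(Key) + 1);
    }
}
```

The per-process object stays cheap to construct; all coordination happens in the cache, which is what keeps the value consistent across the web garden's worker processes.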
PS: Since you're specifically looking into chat, I'd definitely recommend researching Comet (Comet implementation for ASP.NET?, WebSync, etc.).