Azure Service Fabric Reliable Collection and other Persistent Store


I am very new to Service Fabric.
Does Service Fabric recommend using only Reliable Collections to store ALL the data for an application?
What if I use SQL DB to persist all my business data and use a Reliable Collection to lazily persist to SQL DB for integration purposes? Following DDD, I would persist my aggregate to SQL DB and leave an entry in a Reliable Collection to communicate with other bounded contexts. Does this approach have any issues?

Service Fabric does NOT recommend storing all the data in Reliable Collections. It's your choice. Service Fabric gives you freedom in how to do things, on many levels.
You can use an external DB (such as SQL DB, DocumentDB, or anything else) and use the stateful service as a cache. Or use the stateful service as primary storage and don't use an external DB at all.
Even though a Reliable Collection is a bit limited in usage (it's a key/value store with no effective query interface other than looping over all the data), it has the advantage of being stored inside the service (performance), and it has good fail-safe mechanisms (you define secondary replicas, as many as you want). The partitioning capabilities should not be forgotten either.
Personally, I tend to minimize external dependencies. An external DB is a dependency. But if your application's requirements call for extensive query capabilities, go for it.
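To make the "external DB plus stateful service as a cache" option concrete, here is a minimal Python sketch of a read-through/write-through cache. It is only an illustration under assumptions: a plain dict stands in for a Reliable Dictionary, an in-memory SQLite database stands in for the external SQL DB, and the names CachedStore, put, get and db_reads are hypothetical, not Service Fabric APIs.

```python
import sqlite3


class CachedStore:
    """Read-through / write-through cache in front of an external database."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}      # stand-in for a Reliable Dictionary (assumption)
        self.db_reads = 0    # counts reads that had to go to the external store
        conn.execute(
            "CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key, value):
        # Write to the external store first, then refresh the cached copy.
        self.conn.execute(
            "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)", (key, value)
        )
        self.conn.commit()
        self.cache[key] = value

    def get(self, key):
        # Serve from the local cache when possible; fall back to the database.
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        row = self.conn.execute(
            "SELECT value FROM items WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        self.cache[key] = row[0]
        return row[0]


store = CachedStore(sqlite3.connect(":memory:"))
store.put("order-1", "pending")
print(store.get("order-1"), store.db_reads)
```

Reads that hit the cache never touch the database, which is where the performance benefit comes from; the external store remains the place to run richer queries.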

According to Microsoft's list of Reliable Actors anti-patterns:
Treat Reliable Actors as a transactional system. Service Fabric
Reliable Actors is not a two phase commit-based system offering ACID.
If we do not implement the optional persistence, and the machine the
actor is running on dies, its current state will go with it. The actor
will be coming up on another node very fast, but unless we have
implemented the backing persistence, the state will be gone. However,
between leveraging retries, duplicate filtering, and/or idempotent
design, you can achieve a high level of reliability and consistency.
https://acom-feature-videos-twitter-card.azurewebsites.net/en-us/documentation/articles/service-fabric-reliable-actors-anti-patterns/
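The "duplicate filtering and/or idempotent design" mentioned in the quote can be sketched in a few lines of Python. This is a toy model, not the Reliable Actors API: the Account class, the message_id parameter and the in-memory set of seen IDs are all assumptions for illustration. In a real actor, the set of seen IDs would have to be persisted together with the balance.

```python
class Account:
    """Applies each deposit at most once, even if the caller retries."""

    def __init__(self):
        self.balance = 0
        self.seen = set()  # IDs of messages that were already applied

    def deposit(self, message_id, amount):
        # A retried message carries the same ID, so it is filtered out here.
        if message_id in self.seen:
            return False
        self.balance += amount
        self.seen.add(message_id)
        return True


account = Account()
account.deposit("msg-1", 50)
account.deposit("msg-1", 50)  # retry of the same message, ignored
print(account.balance)
```

With this design a caller can safely retry after a timeout without knowing whether the first attempt succeeded.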

Related

Service Fabric stateful services in-memory vs external storage

I have a Service Fabric application that contains two services, one stateless and one stateful.
Stateless Service: contains the API endpoints that communicate with the stateful service.
Stateful Service: stores the data in Reliable Collections, i.e. in-memory storage.
I have around 15 Service Fabric microservices that communicate with each other as required. I'm ending up with a lot of proxy calls between the services, which is one of the major causes of poor performance.
To mitigate this, I am thinking of removing the stateful service (in-memory storage with Reliable Dictionaries) and using external storage such as Azure Cosmos DB as the data store.
In the new approach, my application will have one stateless service that communicates with the external data store (e.g. Cosmos DB).
Stateless Service: contains the API endpoints that communicate with the storage provider (e.g. Cosmos DB).
Can anyone tell us whether Service Fabric in-memory storage or external storage gives better performance?
Apart from the performance issues with in-memory storage, it is becoming very challenging to implement complex queries, do any kind of elastic search, or create reports, because we have dependencies between the services.
Is there a better approach that can really resolve these kinds of issues?

The whole point of using stateful services is to bring the data to where the compute (your service) is. The benefit of this is performance, as there is no network latency for getting the data.
Now, what you are doing is effectively throwing this benefit away by using a stateful service as a central datastore for other services to get data from.
There are at least two options I can think of. The first is to use an external datastore like Cosmos DB and have all services connect to that datastore. The second option is to convert your stateless services to stateful services and copy/distribute to each service only the portions of the data that it needs. To make it easier to report on the data, you could create read models.
Currently, we have a database and are moving all database tables into microservices. In order to implement stored procedures/views, we are fetching data from a few services into a single service and implementing the logic there. Do we have an alternative approach for the stored procedures/views?
You should not try to map a database and its views/stored procedures onto some logic and microservices. Instead, take a fresh look at the design. Let each service put its own data into one or more Reliable Collections. If there is a need for a data store with data combined from several services, have those services update a so-called read model (you'll probably end up having more than one read model).
Look up terms like CQRS and read models; they will help with a microservices architecture.
Or have all services connect to, for example, a SQL Server, giving you the benefits of stored procedures and views. But do mind that once you use a centralized database, whether it is a SQL database or a Cosmos DB database, your microservices are no longer independent services, as they all share a single database schema.
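The read-model idea described above can be sketched roughly as follows, in Python for brevity. Everything here is a hypothetical illustration: OrderService, CustomerSpendReadModel, the OrderPlaced event and the publish callback are made-up names, plain dicts stand in for Reliable Collections, and a direct function call stands in for whatever messaging the services would really use.

```python
from collections import defaultdict


class OrderService:
    """Owns its own order data and announces changes as events."""

    def __init__(self, publish):
        self.orders = {}        # private state, stand-in for a Reliable Dictionary
        self.publish = publish  # callback that delivers events to subscribers

    def place_order(self, order_id, customer_id, total):
        self.orders[order_id] = {"customer_id": customer_id, "total": total}
        self.publish(
            {"type": "OrderPlaced", "customer_id": customer_id, "total": total}
        )


class CustomerSpendReadModel:
    """Query-side view that replaces a cross-service SQL view or stored procedure."""

    def __init__(self):
        self.total_by_customer = defaultdict(float)

    def handle(self, event):
        # Keep a precomputed aggregate up to date as events arrive.
        if event["type"] == "OrderPlaced":
            self.total_by_customer[event["customer_id"]] += event["total"]


read_model = CustomerSpendReadModel()
orders = OrderService(publish=read_model.handle)
orders.place_order("o1", "c1", 10.0)
orders.place_order("o2", "c1", 5.5)
print(read_model.total_by_customer["c1"])
```

A report then queries the read model directly instead of calling several services and joining their data at request time.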

Service Fabric Application with Entity Framework?

I have started learning Service Fabric applications and am a little confused about stateful Reliable Services.
In stateful Reliable Services, does "state" mean the data that would be stored in tables in a normal database application, or something else?
Is it possible to use EF with stateful Reliable Services?
How can we store/retrieve data to/from a database (like Products, Categories, Employees, etc.) using EF in Reliable Services?
Any tutorial or help would be much appreciated.
Thanks in advance

There are 2 flavors of reliable services, stateless and stateful. The main difference being that stateful services give access to reliable collections to store your data.
TL;DR
If you are planning to use Entity Framework (EF) and you have no plan for storing data using reliable collections, stick to stateless services.
Q1
In stateful Reliable Services, does "state" mean the data that would be stored in tables in a normal database application, or something else?
It means you are planning to store the data in Reliable Collections.
Q2
Is it possible to use EF with stateful Reliable Services?
Yes. Even when you use a stateful service, you can write logic to store data using EF and optionally store data in Reliable Collections (see the use case presented by Oleg in the comments, for example). But if you only want to use EF, go for a stateless service. A stateful service only makes sense if you use Reliable Collections.
Q3
How can we store/retrieve data to/from a database (like Products, Categories, Employees, etc.) using EF in Reliable Services?
Create a stateless service, add the EF NuGet packages, and write the code as you normally would.
Additional information
From this quickstart
A stateless service is a type of service that is currently the norm in cloud applications. It is considered stateless because the service itself does not contain data that needs to be stored reliably or made highly available. If an instance of a stateless service shuts down, all of its internal state is lost. In this type of service, state must be persisted to an external store, such as Azure Tables or a SQL database, for it to be made highly available and reliable.
and
Service Fabric introduces a new kind of service that is stateful. A stateful service can maintain state reliably within the service itself, co-located with the code that's using it. State is made highly available by Service Fabric without the need to persist state to an external store.
Reliable Collections can best be described as a NoSQL data store. It is up to you whether to use them, or to mix stateful and stateless services.
For a more in-depth overview of Reliable Collections, read this doc

Service fabric Stateful service - Scaling without partitioning?

I am planning to migrate my existing monolithic RESTful Web API cloud service to Service Fabric in three steps.
An in-process memory cache is heavily used in my cloud service.
Step 1) Migrate the cloud service to an SF stateful service with 1 replica and a single partition. The cache code stays as it is. No use of Reliable Collections.
Step 2) Scale the SF monolithic stateful service horizontally to 5 replicas and a single partition. The cache code is modified to use Reliable Collections.
Step 3) Break the SF monolithic service down into microservices (stateless/stateful).
Is the above approach clean? Any recommendations? Any drawbacks?
More on Step 2) Horizontal scaling of the SF stateful service
I am not planning to use the SF partitioning strategy, as I could not think of a way to distribute data uniformly in my application.
By adding more replicas and no partitioning to the SF stateful service, I am just making my service more reliable (availability). Is my understanding correct?
I will modify the cache code to use a Reliable Collection (Dictionary). The same state data will be available in all replicas.
I understand that a GET can be executed on any replica, but do updates/writes need to be executed on the primary replica?
How can I scale my SF stateful service without partitioning?
Can all of the replicas, including secondaries, listen to client requests and respond to them? GET should be able to execute, but how do PUT and POST calls work?
Should I prefer an external cache store (Redis) over Reliable Collections at this step, and use a stateless service?

This document has a good overview of options for scaling a particular workload in Service Fabric and some examples of when you'd want to use each.
Option 2 (creating more service instances, dynamically or upfront) sounds like it would map to your workload pretty well. Whether you decide to use a custom stateful service as your cache or use an external store depends on a few things:
Whether you have the space in your main compute machines to store the cached data
Whether your service can get away with a simple cache or whether it needs more advanced features provided by other caching services
Whether your service needs the performance improvement of a cache in the same set of nodes as the web tier or whether it can afford the latency of calling out to a remote service
Whether you can afford to pay for a caching service, or whether you want to make do with the memory, compute, and local storage you're already paying for with the VMs
Whether you really want to take on building and running your own cache
To answer some of your other questions:
Yes, adding more replicas increases availability/reliability, not scale. In fact it can have a negative impact on performance (for writes) since changes have to be written to more replicas.
The state data isn't guaranteed to be the same in all replicas, just a majority of them. Some secondaries can even be ahead, which is why reading from secondaries is discouraged.
So to your next question, the recommendation is for all reads and writes to always be performed against the primary so that you're seeing consistent quorum committed data.
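The points about replicas can be illustrated with a toy quorum model in Python. This is a deliberately simplified sketch, not Service Fabric's replicator: the ReplicaSet class, the majority rule written as size // 2 + 1, and the acking_secondaries parameter are assumptions made for illustration, and the model covers only secondaries that lag behind the committed state.

```python
class ReplicaSet:
    """Toy replica set: index 0 is the primary, the rest are secondaries."""

    def __init__(self, size):
        self.replicas = [dict() for _ in range(size)]
        self.quorum = size // 2 + 1  # smallest majority, counting the primary

    def write(self, key, value, acking_secondaries):
        # The write goes to the primary plus the secondaries that acknowledge it.
        targets = {0} | set(acking_secondaries)
        if len(targets) < self.quorum:
            return False  # no majority: in this toy model nothing is applied
        for index in targets:
            self.replicas[index][key] = value
        return True

    def read(self, key, replica=0):
        # Reading from the primary (the default) returns committed data.
        return self.replicas[replica].get(key)


rs = ReplicaSet(5)
rs.write("cart", "v1", acking_secondaries=[1, 2])
print(rs.read("cart"), rs.read("cart", replica=4))
```

With five replicas the write needs three acknowledgements, so adding replicas adds write work, and a secondary outside the acknowledging majority (replica 4 here) can return stale or missing data.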

What type of services are best fit as Reliable Actors?

I am going through Reliable Services and Reliable Actors. I have gone through the online documentation, but a few concepts are not clear to me.
What I have understood so far:
(1) Reliable Services is a programming model that comprises stateless and stateful services. Reliable Services provide a highly available set of classes called Reliable Collections.
(2) Reliable Actors is a programming model that comprises stateful services which use a single thread for execution. Reliable Actors cannot be stateless.
I want to know when to use:
(a) Stateless Service
(b) Stateful Service, and
(c) Reliable Actors
What types of services are best suited to run as single-threaded applications?

It will depend on your application and how it is structured. I assume you are talking about Service Fabric here, since you only added a tag and did not mention it in the question itself.
For Reliable Services you have two options. Stateful services: imagine a web page that wants to keep the state of the client (if you are used to .NET, think of persistent session state, or of having the affinity cookie enabled on your application). Basically, you are persisting the information that comes through the service, and to do that you should use Reliable Collections.
Stateless services are basic API calls that get a response from a service, or do some work, and return the response. The classic case: a service that sums two numbers does not need to keep state; it does the work and returns the response. Stateless services can also be used when the state is stored outside the service itself.
Reliable Actors are built on top of stateful Reliable Services. They are an implementation of the actor model and add some utilities on top of stateful services.
You can read more details on the Service Fabric implementation of those models at: https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-overview
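To give a feel for the single-threaded nature of actors mentioned in the question, here is a rough Python sketch of turn-based execution: each actor handles one call at a time, so its state needs no further synchronization. This is an analogy, not the Reliable Actors runtime; the CounterActor class and the per-actor lock are assumptions chosen for illustration.

```python
import threading


class CounterActor:
    """Processes one call at a time, like a turn-based actor."""

    def __init__(self):
        self._turn = threading.Lock()  # only one caller gets the turn at a time
        self.count = 0

    def increment(self):
        with self._turn:
            # Inside a turn, no other call can observe or change the state.
            current = self.count
            self.count = current + 1
            return self.count


def hammer(actor, times):
    for _ in range(times):
        actor.increment()


actor = CounterActor()
workers = [threading.Thread(target=hammer, args=(actor, 1000)) for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(actor.count)
```

Workloads made of many small, independent units of state and logic (one actor per user, device, or session, for example) tend to fit this model; state shared across many callers becomes a bottleneck because every call waits for its turn.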

How to store temporary data in an Azure multi-instance (scale set) virtual machine?

We developed a server service that (in a few words) supports communication between two devices. We want to take advantage of the scalability offered by an Azure Scale Set (multi-instance VM), but we are not sure how to share memory between the instances.
Our service basically stores temporary data on the local virtual machine, and this data is read, modified, and sent to the devices connected to the server.
If the data is stored locally in one of the instances, the other instances cannot access it and do not have the same information. Is that correct?
If one of the devices starts making requests to the server, the instance that processes each request will not always be the same, so the data ends up spread across instances.
So the question might be: how can we share memory between Azure instances?
Thanks

Depending on the type of data you want to share and how much latency matters, you have several options besides Service Fabric (low latency, but you need to re-architect/rebuild bits of your solution). You could look at a shared back-end repository: Redis Cache is ideal as a distributed cache; SQL Azure if you want a relational database to store the data; storage queues or blob storage; or File storage in a storage account (this lets you write to a mounted network drive from both VM instances). DocumentDB is another option, suited to storing JSON data.

You could use Service Fabric and take advantage of Reliable Collections to have your state automagically replicated across all instances.
From https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reliable-services-reliable-collections/:
The classes in the Microsoft.ServiceFabric.Data.Collections namespace provide a set of out-of-the-box collections that automatically make your state highly available. Developers need to program only to the Reliable Collection APIs and let Reliable Collections manage the replicated and local state.
The key difference between Reliable Collections and other high-availability technologies (such as Redis, Azure Table service, and Azure Queue service) is that the state is kept locally in the service instance while also being made highly available.
Reliable Collections can be thought of as the natural evolution of the System.Collections classes: a new set of collections that are designed for the cloud and multi-computer applications without increasing complexity for the developer. As such, Reliable Collections are:
Replicated: State changes are replicated for high availability.
Persisted: Data is persisted to disk for durability against large-scale outages (for example, a datacenter power outage).
Asynchronous: APIs are asynchronous to ensure that threads are not blocked when incurring IO.
Transactional: APIs utilize the abstraction of transactions so you can manage multiple Reliable Collections within a service easily.
Working with Reliable Collections -
https://azure.microsoft.com/en-us/documentation/articles/service-fabric-work-with-reliable-collections/
