Service Fabric stateful services in-memory vs external storage - azure

I have a Service Fabric application that contains two services, one stateless and one stateful.
Stateless service: contains the API endpoints that communicate with the stateful service.
Stateful service: stores the data in Reliable Collections, i.e. in-memory storage.
I have around 15 Service Fabric microservices that communicate with each other as required. I'm ending up with a lot of service proxy calls between them, which is one of the major causes of the poor performance.
To mitigate this issue, I'm considering removing the stateful service (in-memory storage with Reliable Dictionaries) and using external storage such as Azure Cosmos DB as the data store.
In the new approach, my application will have a single stateless service whose API endpoints communicate directly with the external data store (e.g. Cosmos DB).
Can anyone tell me whether Service Fabric in-memory storage or external storage gives better performance?
Apart from the performance issues, with the in-memory storage it is becoming very challenging to implement complex queries, do any Elasticsearch-style searching, or create reports, because we have dependencies between the services.
Is there a better approach that can resolve these kinds of issues?

The whole point of using stateful services is to bring the data to where the compute (your service) is. The benefit of this is performance, as there is no network latency for getting the data.
Now, what you are doing is effectively throwing this benefit away by using a stateful service as a central datastore for other services to get data from.
There are at least two options I can think of. The first is to use an external datastore like Cosmos DB and have all services connect to that datastore. The second option is to convert your stateless services to stateful services and copy/distribute to each service only the portions of the data it needs. To make it easier to report on the data, you could create read models.
Currently, we have a database and are moving all the database tables into microservices. In order to implement stored procedures/views, we are fetching data from a few services into a single service and implementing the logic there. Is there an alternative approach for the stored procedures/views?
You should not try to map a database and its views/stored procedures onto logic and microservices. Instead, take a fresh view of it. Let each service put its own data into one or more Reliable Collections. If there is a need for a data store that combines data from several services, have those services update a so-called read model (you'll probably end up having more than one read model).
Look up terms like CQRS and read models; they will help with a microservices architecture (a sketch follows below).
Or have all services connect to, for example, a SQL Server database, giving you the benefits of stored procedures and views. But do mind that once you use a centralized database, whether it is a SQL database or a Cosmos DB database, your microservices are no longer independent services, as they all share a single database schema.
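To make the read-model idea concrete, here is a minimal C# sketch. The OrderReadModel type, its fields and the collection name are hypothetical, and it assumes change events reach the reporting service through some messaging channel:

using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Data.Collections;

// Hypothetical denormalized row: data copied from the order and
// customer services so reports never have to call those services.
public class OrderReadModel
{
    public string OrderId { get; set; }
    public string CustomerName { get; set; }
    public decimal Total { get; set; }
}

// Inside a stateful reporting service: upsert the read model whenever
// the owning services publish a change.
public async Task HandleOrderChangedAsync(IReliableStateManager stateManager, OrderReadModel row)
{
    var readModel = await stateManager
        .GetOrAddAsync<IReliableDictionary<string, OrderReadModel>>("orderReadModel");

    using (var tx = stateManager.CreateTransaction())
    {
        // SetAsync is an upsert; reports then query this local, replicated copy.
        await readModel.SetAsync(tx, row.OrderId, row);
        await tx.CommitAsync();
    }
}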

Related

Service Fabric Application with Entity Framework?

I started learning Service Fabric applications, and I am a little confused about stateful Reliable Services.
In stateful Reliable Services, does 'state' mean the data that would be stored in tables in our normal database applications, or something else?
Is it possible to use EF with stateful Reliable Services?
How can we store/retrieve data to/from a database (like Products, Categories, Employees, etc.) using EF in Reliable Services?
Any tutorial/help will be much appreciated.
Thanks in advance
There are two flavors of Reliable Services: stateless and stateful. The main difference is that stateful services give you access to Reliable Collections to store your data.
TL;DR
If you are planning to use Entity Framework (EF) and you have no plan for storing data using reliable collections, stick to stateless services.
Q1
In stateful Reliable Services, does 'state' mean the data that would be stored in tables in our normal database applications, or something else?
It means you are planning to store the data in Reliable Collections.
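Concretely, a minimal sketch of what that looks like inside a stateful service (the dictionary name and counter logic are placeholders based on the standard project template):

// Inside a class deriving from StatefulService
// (using Microsoft.ServiceFabric.Data.Collections):
protected override async Task RunAsync(CancellationToken cancellationToken)
{
    // The state manager persists and replicates this dictionary for you.
    var counters = await this.StateManager
        .GetOrAddAsync<IReliableDictionary<string, long>>("counters");

    using (var tx = this.StateManager.CreateTransaction())
    {
        // Reads and writes happen under a transaction; the commit returns
        // only after the change is replicated to the secondary replicas.
        await counters.AddOrUpdateAsync(tx, "requests", 1, (key, value) => value + 1);
        await tx.CommitAsync();
    }
}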
Q2
Is it possible to use EF with stateful Reliable Services?
Yes. Even when you use a stateful service you can write logic that stores data via EF, and optionally store data in Reliable Collections as well (see the use case presented by Oleg in the comments, for example). But if you only want to use EF, go for a stateless service; a stateful service only makes sense if you use Reliable Collections.
Q3
How can we store/retrieve data to/from a database (like Products, Categories, Employees, etc.) using EF in Reliable Services?
Create a stateless service, add the EF NuGet packages and write the code as you would normally do.
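For example, with EF Core (classic EF 6 works similarly; the Product entity, context and connection string below are placeholder assumptions, not from the original answer):

using System.Linq;
using Microsoft.EntityFrameworkCore; // requires the EF Core NuGet packages

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Product> Products { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) =>
        // Placeholder connection string to an external SQL database;
        // UseSqlServer comes from Microsoft.EntityFrameworkCore.SqlServer.
        optionsBuilder.UseSqlServer("Server=tcp:myserver.database.windows.net;Database=Shop;...");
}

// Called from the stateless service's API endpoints like in any other .NET app:
public static class ProductQueries
{
    public static string[] GetProductNames()
    {
        using (var db = new ShopContext())
        {
            return db.Products.Select(p => p.Name).ToArray();
        }
    }
}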
Additional information
From this quickstart
A stateless service is a type of service that is currently the norm in cloud applications. It is considered stateless because the service itself does not contain data that needs to be stored reliably or made highly available. If an instance of a stateless service shuts down, all of its internal state is lost. In this type of service, state must be persisted to an external store, such as Azure Tables or a SQL database, for it to be made highly available and reliable.
and
Service Fabric introduces a new kind of service that is stateful. A stateful service can maintain state reliably within the service itself, co-located with the code that's using it. State is made highly available by Service Fabric without the need to persist state to an external store.
A Reliable Collection can best be described as a NoSQL data store. It is up to you whether you want to use it, or have a mix of stateful and stateless services.
For a more in-depth overview of Reliable Collections, read this doc

Azure Service Fabric Application

Can you explain:
A Service Fabric application can be packaged with multiple services to be shipped, but how do you reuse some of these services in another application?
Is there a way a Reliable Dictionary or Reliable Queue can be shared among services deployed on the same cluster?
I tried reading up on Google but got no clear understanding. Your help will be really appreciated.
... how do you reuse some of these services in another application?
What do you mean by reuse? Sharing the code? You could have a service in Application A talk to a service in Application B instead of having the same service in Application A.
Is there a way a Reliable Dictionary or Reliable Queue can be shared among services deployed on the same cluster?
No, there is not. A Reliable Dictionary or Reliable Queue provides data locality to a service, removing the need for additional network calls. As soon as you need the same data in multiple services, you should consider using other storage solutions like Cosmos DB, Blob storage or another database.
If you are looking for some kind of distributed cache you can take a look at Azure Redis.
It is, however, entirely possible to expose the data of a Reliable Dictionary or Reliable Queue using a service. Then that service acts like a data provider / repository. You can expose methods like Add() or Delete() in such a service that results in an update of the Reliable Dictionary or Reliable Queue.
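A minimal sketch of such a provider service using service remoting; the IInventoryStore interface, the key/value types and the application URI are all hypothetical:

using System;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data.Collections;
using Microsoft.ServiceFabric.Services.Remoting;

// Contract other services call; they never touch the dictionary directly.
public interface IInventoryStore : IService
{
    Task AddAsync(string sku, int quantity);
    Task DeleteAsync(string sku);
}

// Inside the stateful service that owns the data and implements IInventoryStore:
public async Task AddAsync(string sku, int quantity)
{
    var items = await this.StateManager
        .GetOrAddAsync<IReliableDictionary<string, int>>("inventory");

    using (var tx = this.StateManager.CreateTransaction())
    {
        await items.SetAsync(tx, sku, quantity); // upsert
        await tx.CommitAsync();
    }
}

// A consumer elsewhere in the cluster resolves the service by its URI
// (Microsoft.ServiceFabric.Services.Remoting.Client / .Client namespaces):
// var store = ServiceProxy.Create<IInventoryStore>(
//     new Uri("fabric:/MyApp/InventoryService"), new ServicePartitionKey(0));
// await store.AddAsync("sku-123", 5);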

Database location in Microservices Architecture

We have a monolithic application which we are now converting to microservice architecture using containers.
Our microservices are stateful (i.e. they need to insert/retrieve data from a database). As per microservice architecture, each microservice should have its own data (i.e. its own database in our case).
My question is where the database of each microservice should be deployed: on the same host on which the microservice is deployed, in the same container in which the microservice is deployed, or on a separate server such as an Azure database?
What would be the pros & cons of each approach and what is the best approach according to microservice best practices?
You are correct, each microservice should use its own data store that fits its needs best. There might be a service that wants to store its data in blob storage, another may store its data in table storage, DocumentDB or SQL Database.
You probably want to use Database-as-a-Service rather than hosting your own database, so you don't have to worry about availability, scaling, backups...
Martin's answer is good, but I want to add that, because you are using a containerized application, you should definitely deploy the database separately from your service containers. The reason is that your services can evolve independently, and one of the biggest benefits of stateless service containers is that, if you have a cluster of them, you can apply rolling updates without any impact on your application's availability. Updates to stateful database services are more difficult, but also expected to be less frequent (and new technologies like CockroachDB are on the horizon). Good read.
It doesn't matter so much where, except that it cannot be within the same container as your application, as stated earlier in this thread.
The important part is that only one (1) microservice has ownership of the data. If more than one microservice needs access to the data, they must access it through an API provided by the microservice that owns that data.
You could structure it like this:
"Sql Microservice" - handles all traffic to and from SQL Server. All microservices that needs data from Sql talks to this guys. You will have a similar microservice for TableStorage.
If "microservice A" uses a datastore other from Sql/TableStorage and that datastore is local to Microservice A, I would create 2 microservice.
Microservice A1 would be where your code runs
Microservice A2 has an API that exposes the database operation to A1.
When A1 needs data he talks to A2.
Besides allowing you to scale your data layer independently of the application nodes, this pattern also ensures that the data is owned by only one (1) microservice, and that is the key.
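As a sketch of that last point, assuming A2 exposes an HTTP API and the Service Fabric reverse proxy is enabled on its default port 19081 (the application, service and route names are made up):

using System.Net.Http;
using System.Threading.Tasks;

// Microservice A1 never opens a connection to A2's datastore; it asks A2.
public class CustomerClient
{
    private static readonly HttpClient Http = new HttpClient();

    public Task<string> GetCustomerJsonAsync(string id) =>
        // Reverse-proxy URL format: http://localhost:19081/<app>/<service>/<route>
        Http.GetStringAsync($"http://localhost:19081/MyApp/MicroserviceA2/api/customers/{id}");
}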

How to store temporary data in an Azure multi-instance (scale set) virtual machine?

We developed a server service that (in a few words) supports the communication between two devices. We want to take advantage of the scalability offered by an Azure scale set (multi-instance VM), but we are not sure how to share memory between the instances.
Our service basically stores temporary data on the local virtual machine, and this data is read, modified and sent to the devices connected to this server.
If this data is stored locally on one of the instances, the other instances cannot access it and do not have the same information. Is that correct?
If one of the devices starts making requests to the server, the instance that processes the request will not always be the same, so in the end the data is spread across instances.
So the question might be, how to share memory between Azure instances?
Thanks
Depending on the type of data you want to share and how much latency matters, as well as Service Fabric (low latency, but you need to re-architect/rebuild bits of your solution), you could look at a shared back-end repository: Redis Cache is ideal as a distributed cache; SQL Azure if you want a relational database to store the data; a storage queue/blob storage; or File storage in a storage account (which lets you simply write to a mounted network drive from both VM instances). DocumentDB is another option, suited to storing JSON data.
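For the distributed-cache route, a sketch using Azure Redis Cache with the StackExchange.Redis client; the cache host name, key and payload are placeholders:

using System;
using StackExchange.Redis;

class SharedStateDemo
{
    static void Main()
    {
        // Every VM instance connects to the same cache.
        var redis = ConnectionMultiplexer.Connect(
            "mycache.redis.cache.windows.net:6380,password=<key>,ssl=True");
        IDatabase db = redis.GetDatabase();

        // One instance writes the device's session state with a TTL...
        db.StringSet("device:42:session", "{\"state\":\"connected\"}",
            TimeSpan.FromMinutes(5));

        // ...and any other instance can read the same entry.
        string session = db.StringGet("device:42:session");
        Console.WriteLine(session);
    }
}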
You could use Service Fabric and take advantage of Reliable Collections to have your state automagically replicated across all instances.
From https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reliable-services-reliable-collections/:
The classes in the Microsoft.ServiceFabric.Data.Collections namespace provide a set of out-of-the-box collections that automatically make your state highly available. Developers need to program only to the Reliable Collection APIs and let Reliable Collections manage the replicated and local state.
The key difference between Reliable Collections and other high-availability technologies (such as Redis, Azure Table service, and Azure Queue service) is that the state is kept locally in the service instance while also being made highly available.
Reliable Collections can be thought of as the natural evolution of the System.Collections classes: a new set of collections that are designed for the cloud and multi-computer applications without increasing complexity for the developer. As such, Reliable Collections are:
Replicated: State changes are replicated for high availability.
Persisted: Data is persisted to disk for durability against large-scale outages (for example, a datacenter power outage).
Asynchronous: APIs are asynchronous to ensure that threads are not blocked when incurring IO.
Transactional: APIs utilize the abstraction of transactions so you can manage multiple Reliable Collections within a service easily.
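The transactional point is worth illustrating: several Reliable Collections can be updated atomically within one transaction. A sketch, inside a stateful service (the Order type and collection names are invented):

// Hypothetical order type:
public class Order { public string Id { get; set; } public decimal Total { get; set; } }

// Store an order and queue a notification; both commit or roll back together.
public async Task PlaceOrderAsync(Order order)
{
    var orders = await this.StateManager
        .GetOrAddAsync<IReliableDictionary<string, Order>>("orders");
    var outbox = await this.StateManager
        .GetOrAddAsync<IReliableQueue<string>>("outbox");

    using (var tx = this.StateManager.CreateTransaction())
    {
        await orders.AddAsync(tx, order.Id, order);
        await outbox.EnqueueAsync(tx, order.Id);
        await tx.CommitAsync(); // atomic across both collections
    }
}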
Working with Reliable Collections -
https://azure.microsoft.com/en-us/documentation/articles/service-fabric-work-with-reliable-collections/

Azure Service Fabric Reliable Collection and other Persistent Store

I am very new to Service Fabric.
Does Service Fabric recommend using only Reliable Collections to store ALL the data for an application?
What if I use a SQL DB to persist all my business data and use a Reliable Collection to lazily persist to the SQL DB for integration purposes? Following DDD, if I persist my aggregate to the SQL DB and leave an entry in a Reliable Collection to communicate with other Bounded Contexts, will this approach have any issues?
Service Fabric does NOT recommend storing all the data in Reliable Collections. It's your choice: Service Fabric gives you freedom in how to do things, on many levels.
You can use an external DB (like SQL DB or DocumentDB or anything else) and use the stateful service as a cache, or use the stateful service as the primary storage and not use an external DB at all.
Even though a Reliable Collection is somewhat limited in usage (it's a key/value store with no effective query interface other than looping over all the data), it has the advantage of being stored locally (performance) and it has good fail-safe mechanisms (secondary replicas, as many as you want). The partitioning capabilities should not be forgotten either.
Personally I tend to minimize the external dependencies. An external DB is a dependency. But if your requirements for your application specify extensive query capabilities, go for it.
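A sketch of the cache-style usage mentioned above, where a Reliable Dictionary fronts an external database; the Customer type and the externalDb repository are hypothetical:

using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data.Collections;

// Hypothetical types standing in for your real model and DB access:
public class Customer { public string Id { get; set; } public string Name { get; set; } }
public interface ICustomerDb { Task<Customer> LoadCustomerAsync(string id); }

// Inside the stateful service (externalDb would be injected):
private readonly ICustomerDb externalDb;

public async Task<Customer> GetCustomerAsync(string id)
{
    var cache = await this.StateManager
        .GetOrAddAsync<IReliableDictionary<string, Customer>>("customerCache");

    using (var tx = this.StateManager.CreateTransaction())
    {
        var cached = await cache.TryGetValueAsync(tx, id);
        if (cached.HasValue)
            return cached.Value; // served from local replicated state, no network hop

        // Cache miss: fall back to the external database, then remember the result.
        Customer customer = await externalDb.LoadCustomerAsync(id);
        await cache.SetAsync(tx, id, customer);
        await tx.CommitAsync();
        return customer;
    }
}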
According to Microsoft
Treat Reliable Actors as a transactional system. Service Fabric Reliable Actors is not a two-phase-commit-based system offering ACID. If we do not implement the optional persistence, and the machine the actor is running on dies, its current state will go with it. The actor will be coming up on another node very fast, but unless we have implemented the backing persistence, the state will be gone. However, between leveraging retries, duplicate filtering, and/or idempotent design, you can achieve a high level of reliability and consistency.
https://acom-feature-videos-twitter-card.azurewebsites.net/en-us/documentation/articles/service-fabric-reliable-actors-anti-patterns/
