Service Fabric Reliable Services: Communication and Partitioning essentials

While exploring Service Fabric Reliable Services, I want to make sure that the following basic statements are true.
The Reliable Services default communication stack (DefaultStack) and the Reliable Actors communication stack (using ServiceProxy/ActorProxy) can only be used for communication inside an SF cluster. Clients from outside must use the WebAPI/WCF stacks.
ServicePartitionResolver, CommunicationClientFactory, and ServicePartitionClient are already implemented inside DefaultStack. I don't have to worry about them if I only use DefaultStack.
A stateful service may have more than one partition, and I want, for example, to post an item for processing. It is not SF's responsibility to decide which partition the posting client should use. I need to implement an algorithm that resolves the partition key or name myself and use it in the ServiceProxy constructor (for DefaultStack).

You're correct on all those points:
If you want to communicate outside Service Fabric, you need to use something like an OwinCommunicationListener (see here).
You’d only have to implement those if you wanted to plug in your own communication stack.
Yep, you’d need to define the partition key when you’re creating a ServiceProxy.
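To make the last point concrete, here is a minimal sketch (not from the original answer) of a client that derives a partition key from the item being posted and passes it to ServiceProxy. The IItemProcessor interface, the fabric:/ URI, and the four-partition ranged layout are illustrative assumptions:

```csharp
// Minimal sketch: resolve a partition key on the client and call a
// partitioned stateful service through the default remoting stack.
// IItemProcessor, the fabric:/ URI, and the 4-partition layout are
// assumptions for illustration, not from the original question.
using System;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Services.Client;
using Microsoft.ServiceFabric.Services.Remoting;
using Microsoft.ServiceFabric.Services.Remoting.Client;

public interface IItemProcessor : IService
{
    Task PostItemAsync(string itemId);
}

public static class ItemClient
{
    // Map the item id onto one of 4 ranged partitions. Note: GetHashCode is
    // not stable across processes; a real client should use a stable hash.
    private static long ToPartitionKey(string itemId) =>
        (itemId.GetHashCode() & 0x7FFFFFFF) % 4;

    public static Task PostAsync(string itemId)
    {
        var proxy = ServiceProxy.Create<IItemProcessor>(
            new Uri("fabric:/MyApp/ItemProcessor"),
            new ServicePartitionKey(ToPartitionKey(itemId)));
        return proxy.PostItemAsync(itemId);
    }
}
```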

Questions pertaining to micro-service architecture

I have a couple of questions around microservice architecture. For example, take the following services:
orders,
account,
communication &
management
Question 1: From what I read, I understand that each service is supposed to own the data pertaining to that service, so orders would have an orders database. How important is that data ownership? Would microservices make sense if they all shared one traditional database, such that all data pertaining to the services lived in one database? If so, are there any implications of structuring the services this way?
Question 2: Services should be able to communicate with one another. How would that be any different from simply curling an existing API and basing the logic on that response? Is calling a service more efficient than simply curling the API?
Question 3: Is it worth it? I understand this is a massive generality and that it's fundamentally predicated on the needs of the business. But when that discussion has been had, was the rebuild worth it? And what challenges can you expect to face?
I will try to answer all the questions.
Regarding all services using the same database: if you do so, you have two main problems. First, the database becomes a bottleneck, because all requests go to the same point. Second, you couple all your services together, so if the database goes down or needs an update, all your services are affected (the database becomes a single point of failure).
The communication between services can be whatever your services need (synchronous, asynchronous, via message passing through a message broker, etc.); it all depends on the use cases you have to support. The recommended way to avoid temporal coupling is to use a message broker like Kafka: your services don't have to know about each other, and if some of them go down, the others keep working; when the failed ones come back up, they can continue processing their pending messages. However, if your services need to respond synchronously, you can define synchronous communication between services and use a circuit breaker to behave properly in case the callee service is down.
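As a concrete illustration (not part of the original answer), here is a minimal C# sketch of the broker approach using the Confluent.Kafka client; the broker address, topic name, and message shape are assumptions:

```csharp
// Minimal sketch: the orders service publishes an event to Kafka instead of
// calling other services directly. Assumes the Confluent.Kafka NuGet package
// and a broker at localhost:9092; topic and payload are illustrative.
using System.Threading.Tasks;
using Confluent.Kafka;

public static class OrderEvents
{
    public static async Task PublishOrderPlacedAsync(string orderId)
    {
        var config = new ProducerConfig { BootstrapServers = "localhost:9092" };
        using var producer = new ProducerBuilder<string, string>(config).Build();

        // Subscribers (account, communication, ...) consume "orders.placed"
        // at their own pace; if one is down, it catches up when it returns.
        await producer.ProduceAsync("orders.placed", new Message<string, string>
        {
            Key = orderId,
            Value = $"{{\"orderId\":\"{orderId}\"}}"
        });
    }
}
```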
A microservices architecture is far more complicated to make work, monitor, and debug than a traditional monolithic architecture, so it is only worth it if you have very demanding scalability and availability requirements, and/or the system is so large that it requires several teams working on different parts while avoiding dependencies among them, so each team can work at its own pace and deploy its own services.

Specify primary/secondary actors properly (UML Use-Case Diagram)

Consider the following case:
I have a webservice that provides information about orders in an online shop. On another machine, a Windows service retrieves orders from the webservice once an hour and writes the data to a database. A Windows service is used instead of a scheduled task because it provides a TCP endpoint, so a client can manually (using a simple desktop application) command the service to retrieve the data of a specific order.
I am unsure how to place the Windows service. It is a primary actor when it calls the webservice at a given interval, but a secondary actor when it reacts to a client's command.
How should I proceed in creating a use-case diagram for this scenario?
The answer depends on what you consider as your system.
One system
If your system contains both the webservice and the Windows service as parts of one (multi-tiered) system, then neither is an actor. The functionality offered by the Windows service will be one use case (or more, depending on the complexity of the service). If you wish, the webservice can become a second use case that is included by the Windows service's use case (a rare construction, but it works here).
The mere fact that those parts run on separate machines doesn't change a thing. It's a common approach for the database to have its own machine, yet no reasonable person considers it separate from the system itself.
Two systems
If you treat the Windows service as a separate system, then you will actually have two use-case diagrams, one for each system.
In this case the use-case diagram of the Windows service will have the client as a primary actor and the system containing the webservice as a secondary actor.
In the use-case diagram of the system with the webservice, your primary actor would be the Windows service system (again, the system as a whole, not the service itself). In this diagram the client is not depicted at all, as it does not interact with the system.
Component as a system
Even if you consider the Windows service and the webservice to be a single system, you may still depict use cases of the components rather than of the system as a whole. In that case the approach will be similar to the two-system situation.
In addition to what @Ister said: draw a boundary that represents your system under consideration. Now think about what is inside (the use-case bubbles) and what is outside (the actors). For the latter, the convention is to place primary actors on the left and secondary ones on the right. Primary actors are usually the ones that start a workflow, while secondary ones are triggered or informed in the course of such a workflow.

CQRS and Event Sourcing Guide

I want to create a CQRS and Event Sourcing architecture that is very cheap, very flexible, and very uncomplicated.
I want to make sure that events never fail to at least reach the publisher/event store, ever, because that's where the business is.
Now, i have several options in mind:
Azure
With Azure, I'm not sure what to use.
Azure Service Bus
Azure Functions
Azure WebJobs (I suppose these can be replaced by Azure Functions)
?? (something else I forgot or don't know?)
How reliable are these Azure serverless solutions?
Custom
For this I am thinking of using RabbitMQ; the problem is the cost of a virtual machine to run it.
All in all, I want:
Ability to replay the messages/events in case of failure.
Ability to easily add subscribers.
Ability to select the subscribers upon which to replay the messages.
The event store should be able to store very large event messages (or how else shall I queue an image or a file?).
The event store MUST NEVER EVER get choked, or sleep.
Speed of implementation/prototyping would be an added advantage.
What does your experience suggest?
What about other alternatives (e.g., Apache Kafka)?
Why not run Event Store? It was created by Greg Young himself. Host it wherever you need.
I am a Java user and have long used HornetQ (now Artemis, which I don't use), an alternative to RabbitMQ; the only problem is that it does not support replication, but it gets the job done for event sourcing. For your custom scenario, RabbitMQ is a good choice, but try running it on a DigitalOcean instance for low costs. If you are looking for simplicity and flexibility you have only two choices: build your own, or forgo simplicity and pick up Apache Kafka with all its complexity, which will give you flexibility. You can also build an event store with MongoDB: https://www.mongodb.com/blog/post/event-sourcing-with-mongodb
Your requirements are too vague to make the optimal choice. You need to consider a lot of things; one of them, for instance, is the number of events per aggregate and the number of aggregates (note that these are statistical estimates). These matter primarily because if you allow tens of thousands of events per aggregate, you will need snapshotting, which adds complexity you might not need.
But for regular use cases you could just use a relational database like Postgres as your (linearizable) event store. It also has LISTEN/NOTIFY functionality, so you would not really need a message bus either, and your application could be written in a reactive way.
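A minimal sketch of that idea (illustrative, not from the original answer), assuming Npgsql, an events table with a UNIQUE(stream_id, version) constraint, and a new_events notification channel:

```csharp
// Minimal sketch: append-only event store on Postgres with LISTEN/NOTIFY.
// Assumes the Npgsql driver; table, channel, and connection string are
// illustrative assumptions.
using System;
using System.Threading.Tasks;
using Npgsql;

public static class PgEventStore
{
    const string ConnString = "Host=localhost;Database=es;Username=app;Password=app";

    public static async Task AppendAsync(Guid streamId, int version, string payload)
    {
        await using var conn = new NpgsqlConnection(ConnString);
        await conn.OpenAsync();
        // UNIQUE(stream_id, version) gives optimistic concurrency: a stale
        // writer violates the constraint instead of overwriting history.
        await using var cmd = new NpgsqlCommand(
            "INSERT INTO events (stream_id, version, payload) VALUES (@s, @v, @p); " +
            "NOTIFY new_events;", conn);
        cmd.Parameters.AddWithValue("s", streamId);
        cmd.Parameters.AddWithValue("v", version);
        cmd.Parameters.AddWithValue("p", payload);
        await cmd.ExecuteNonQueryAsync();
    }

    public static async Task ListenAsync()
    {
        await using var conn = new NpgsqlConnection(ConnString);
        await conn.OpenAsync();
        conn.Notification += (_, e) =>
            Console.WriteLine($"new events signalled on '{e.Channel}'");
        await using (var cmd = new NpgsqlCommand("LISTEN new_events;", conn))
            await cmd.ExecuteNonQueryAsync();
        while (true)
            await conn.WaitAsync(); // blocks until a NOTIFY arrives
    }
}
```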

How many read operations can a Reliable Actor handle without problems?

Aim: Suppose I have a very popular page (let's say 1 million visitors per 5 minutes) on my Azure Service Fabric-based web application. I want to put some kind of cache layer between the data layer and the frontend API layer.
Solution: For this purpose, I chose a Reliable Actor exposing only one method for a read-only operation: GetFrequentlyAskedPage(). The actor uses the volatile state provider and a 5-minute idle timeout before it is deactivated and garbage collected.
Questions:
How many read operations can the actor handle before it falls over?
Should I use the "read from secondary replicas" option for this actor?
Or maybe my reasoning is totally wrong and I should change the implementation approach.
I would not recommend using actors as a cache. Actor instances force single-threaded turn-based access, meaning an actor instance can only service one request at a time. This obviously will not perform well as a cache. See here for more info: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reliable-actors-introduction/
Instead I would recommend using a stateful Reliable Service with a Reliable Dictionary to cache data, or better yet, use a stateful Reliable Service as your data layer, in which case you don't need this cache at all.
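A minimal sketch of that approach (the service name, dictionary name, and data-layer helper are illustrative assumptions):

```csharp
// Minimal sketch: cache pages in a Reliable Dictionary inside a stateful
// Reliable Service. PageCacheService, "pageCache", and the data-layer call
// are illustrative assumptions.
using System.Fabric;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data.Collections;
using Microsoft.ServiceFabric.Services.Runtime;

internal sealed class PageCacheService : StatefulService
{
    public PageCacheService(StatefulServiceContext context) : base(context) { }

    public async Task<string> GetPageAsync(string key)
    {
        var cache = await StateManager
            .GetOrAddAsync<IReliableDictionary<string, string>>("pageCache");

        using (var tx = StateManager.CreateTransaction())
        {
            var cached = await cache.TryGetValueAsync(tx, key);
            if (cached.HasValue)
                return cached.Value; // cache hit; read-only tx is discarded

            var page = await LoadPageFromDataLayerAsync(key); // hypothetical
            await cache.SetAsync(tx, key, page);
            await tx.CommitAsync();
            return page;
        }
    }

    private Task<string> LoadPageFromDataLayerAsync(string key) =>
        Task.FromResult($"<html><!-- page {key} --></html>");
}
```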

Using Hazelcast as a service directory?

I am exploring the notion of using Hazelcast (or another caching framework) to advertise services within a cluster. Ideally, when a cluster member departs, its services (or the objects advertising them) should be removed from the cache.
Is this at all possible?
It is certainly possible.
The question is which solution you prefer.
If the services can be stored in a map, you could create a map with a TTL of, say, a few minutes, and have each member periodically refresh its entries to keep them from expiring (see the sketch below).
An alternative is to listen for member changes with a MembershipListener and, once a member leaves, remove the services that belong to that member from the map.
If you like neither of these, you could create your own SPI-based implementation. The SPI is the lower-level infrastructure Hazelcast uses to build its distributed data structures. That is a lot more work, but also gives a lot of flexibility.
So there are many solutions.
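For the TTL-map option, a best-effort sketch assuming the Hazelcast .NET client (the Hazelcast.Net package); the exact client API names, the map name, and the endpoint value are assumptions, not verified against a specific client version:

```csharp
// Best-effort sketch of the TTL approach with the Hazelcast .NET client.
// API names are assumptions; map/service names are illustrative.
using System;
using System.Threading.Tasks;
using Hazelcast;

public static class ServiceDirectory
{
    public static async Task AdvertiseAsync(string serviceName, string endpoint)
    {
        await using var client = await HazelcastClientFactory.StartNewClientAsync();
        var directory = await client.GetMapAsync<string, string>("service-directory");

        // The entry expires after 2 minutes unless its owner refreshes it, so
        // a departed member's services simply age out of the directory.
        await directory.SetAsync(serviceName, endpoint, TimeSpan.FromMinutes(2));
    }
}
```

Each member would call this periodically (e.g., every 30 seconds) so that live entries keep being renewed while stale ones expire.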
