I'm hoping someone can point me in the right direction. I see a chat example for Actors at https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reliable-actors-pattern-distributed-networks-and-graphs/#smart-cache-code-sample-groupchat but that example only shows part of the chat story. Chat's a good example for me because it's similar to the problem I'm trying to solve.
For my problem, clients that push messages into the Actor network also need to receive updates when the state of that network changes. I believe the obvious tool for this is SignalR, but I'm kind of stuck at that point. The Actor SDK doesn't seem to provide a reliable way of streaming state changes out of an Actor. And from what I read, Actor Events don't seem to be reliable enough for this scenario (I'm guessing that's the case, since the documentation says "best effort").
So take the chat example on the SF site. Where do I go from there? How do I subscribe to Actor updates from an ASP.NET SignalR Hub?
The best answer that I've come up with so far is to move pub/sub stuff out of Service Fabric Actors and over to another tool. For example, for my particular problem, I use Redis's pub/sub feature.
Typically you want a 1-to-1 relationship between client and actor, or a 1-to-many relationship between client and actors. With that in mind, I would design it so that the client sends messages to the actor, and the actor is responsible for ensuring each message is delivered to the "session" (via a reminder). I view the session as a stateful service backed by a reliable collection for that "session". As updates reach the session, it calls the hub method to broadcast the changes to the appropriate clients.
Related
I have a couple of questions that exist around micro service architecture, for example take the following services:
orders,
account,
communication &
management
Question 1: From what I read, I understand that each service is supposed to own the data pertaining to that service, so orders would have an orders database. How important is that data ownership? Would microservices make sense if they all read from one traditional database, such that all data pertaining to the services lived in one database? If so, are there any implications of structuring the services this way?
Question 2: Services should be able to communicate with one another. How is that any different from simply curling an existing API and basing the logic on that response? Is calling a service more efficient than simply curling the API?
Question 3: Is it worth it? I understand this is a massive generality, and it's fundamentally predicated on the needs of the business. But once that discussion has been had, was the rebuild worth it, and what challenges can you expect to face?
I will try to answer all the questions.
With respect to all services using the same database: if you do so, you have two main problems. First, the database becomes a bottleneck, because all requests go to the same point. Second, you have coupled all your services, so if the database goes down or needs an update, all your services are affected. (The database becomes a single point of failure.)
The communication between services can be whatever your services need (synchronous, asynchronous, via message passing through a message broker, etc.); it all depends on the use cases you have to support. The recommended way to avoid temporal coupling is to use a message broker like Kafka. That way your services don't have to know each other, and if some of them go down the others keep working. When the failed services come back up, they can continue processing the messages that are pending. However, if your services need to respond synchronously, you can define synchronous communication between them and use a circuit breaker to behave properly when the callee service is down.
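To make the circuit breaker idea concrete, here is a minimal sketch in Python. The class name, the thresholds, and the injectable `clock` parameter are my own assumptions for illustration, not the API of any particular library. It fails fast while the callee is assumed to be down and lets one trial call through once a cool-down has passed.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call because the callee looks down."""


class CircuitBreaker:
    # Hypothetical helper: names and defaults are illustrative assumptions.
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("callee assumed down; failing fast")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each synchronous request, for example `breaker.call(fetch_account, account_id)` where `fetch_account` is whatever function performs the remote call, and treat `CircuitOpenError` as the signal to return a fallback response.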
A microservices architecture is far more complicated to build, monitor, and debug than a traditional monolith, so it is only worth it if you have very demanding scalability and availability requirements, and/or if the system is very large, requires several teams working on different parts of it, and should avoid dependencies among them. That way each team can work at its own pace, deploying its own services.
I want to create a CQRS and Event Sourcing architecture that is very cheap and very flexible and very uncomplicated.
I want to make sure that events never fail to at least reach the publisher/event store, ever, ever, because that's where business is.
Now, I have several options in mind:
Azure
With Azure, I don't know what to use.
Azure service bus
Azure Function
Azure WebJob (I suppose this can be replaced with Azure Functions)
?? (something else I forgot or don't know?)
How reliable are these azure server-less solutions??
Custom
For this I am thinking of using RabbitMQ; the problem is the cost of a virtual machine to run it.
All in all, I want:
Ability to replay the messages/events in case of failure.
Ability to easily add subscribers.
Ability to select the subscribers upon which to replay the messages.
The event store should be able to store very large event messages (or how else should I queue an image or file?).
The event store MUST NEVER EVER get choked, or sleep.
Speed of implementation/prototyping would be an added advantage.
What does your experience suggest?
What about other alternatives? (eg: apache-kafka)?
Why not run Event Store? Created by Greg Young himself. Host where you need.
I am a Java user. I have been using HornetQ (now known as Artemis, which I don't use), an alternative to RabbitMQ, for a long time; the only problem is that it does not support replication, but it gets the job done when it comes to event sourcing. For your custom scenario, RabbitMQ is a good choice, but try running it on a DigitalOcean instance to keep costs low. If you are looking for simplicity and flexibility, you have only two choices: build your own, or forgo simplicity and pick up Apache Kafka, which brings all its complexities but gives you flexibility. You can also build an event store with MongoDB: https://www.mongodb.com/blog/post/event-sourcing-with-mongodb
Your requirements are too vague to make the optimal choice. You need to consider a lot of things, for instance the number of events per aggregate and the number of aggregates (note that these have to be statistical estimates). Those matter primarily because if you allow tens of thousands of events for each aggregate, you will need snapshotting, which adds complexity you might not otherwise need.
But for regular use cases you could just use a relational database like Postgres as your (linearizable) event store. It also has LISTEN/NOTIFY functionality, so you would not really need a message bus either, and your application could be written in a reactive way.
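As a rough illustration of a relational table acting as an event store, here is a minimal sketch. It uses Python's built-in `sqlite3` as a stand-in for Postgres so that it runs without a server; the table layout and the function names (`open_store`, `append`, `replay`) are assumptions of mine, and LISTEN/NOTIFY is not shown. The unique constraint on `(aggregate_id, version)` gives optimistic concurrency, and the monotonically increasing `seq` column lets a subscriber replay from any point.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    # Hypothetical schema: one append-only table holding every event.
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               seq INTEGER PRIMARY KEY AUTOINCREMENT,
               aggregate_id TEXT NOT NULL,
               version INTEGER NOT NULL,
               type TEXT NOT NULL,
               payload TEXT NOT NULL,
               UNIQUE (aggregate_id, version)
           )"""
    )
    return conn


def append(conn, aggregate_id, expected_version, event_type, payload):
    """Append one event; fails if another writer already wrote this version."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events (aggregate_id, version, type, payload) "
                "VALUES (?, ?, ?, ?)",
                (aggregate_id, expected_version + 1, event_type, json.dumps(payload)),
            )
    except sqlite3.IntegrityError as exc:
        raise RuntimeError("concurrent write detected") from exc


def replay(conn, after_seq=0):
    """Yield events in commit order, starting after the given sequence number."""
    rows = conn.execute(
        "SELECT seq, aggregate_id, type, payload FROM events "
        "WHERE seq > ? ORDER BY seq",
        (after_seq,),
    )
    for seq, aggregate_id, event_type, payload in rows:
        yield seq, aggregate_id, event_type, json.loads(payload)
```

Each subscriber only has to remember the last `seq` it processed; replaying for one subscriber is then a call to `replay(conn, after_seq=...)` with that subscriber's own position.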
I have the following model.
Events are published by my back-end services. Webhooks are processed by a web app that simply queues jobs based on the event that is fired. The queueing service can subsequently do things such as make calls to other back-end services. The issue with this model is that events that are missed, e.g. when the webhook processor app experiences downtime, are lost. What are best practices for tracking and replaying these missed events?
I know to avoid the anti-pattern database as a queue, and it seems like the solution would involve some kind of message queue. Hoping that someone that has solved this problem could shed some light.
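One common shape for this is a durable log plus a committed offset per consumer, so that a consumer that was down simply resumes from where it left off. The sketch below shows the idea with an in-memory stand-in for the broker; `EventLog` and `drain` are hypothetical names, and a real system would keep both the log and the offsets in durable storage.

```python
class EventLog:
    """In-memory stand-in for a durable, append-only log (e.g. a broker topic)."""

    def __init__(self):
        self._events = []
        self._offsets = {}  # consumer name -> index of the next event to process

    def publish(self, event):
        self._events.append(event)

    def poll(self, consumer):
        # Return every (offset, event) pair the consumer has not committed yet.
        start = self._offsets.get(consumer, 0)
        return list(enumerate(self._events[start:], start=start))

    def commit(self, consumer, offset):
        self._offsets[consumer] = offset + 1


def drain(log, consumer, handler):
    """Process pending events, committing only after the handler succeeds."""
    for offset, event in log.poll(consumer):
        handler(event)  # if this raises, the offset stays uncommitted
        log.commit(consumer, offset)
```

Because the offset is committed only after the handler returns, an event whose processing failed or never started is delivered again on the next `drain`. The flip side is that a handler may see the same event twice, so handlers should tolerate duplicates.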
I have an Event Hub with heavy ingress of telemetry-style data, and a worker that receives these events and delegates the small processing work to an Actor on Service Fabric. Although there is no data consistency problem when storing data from these Actors, they were chosen mainly as a scalable and reliable way of handling these events. If I now decide to move away from Actors to self-developed logic running in containers, what would be the best approach for processing huge volumes of data with easy checkpointing on Event Hub?
I would like to know which problems I would need to handle, or which Actor behaviors I would be missing.
Reading the documentation, Azure EventHubs is meant for:
Application instrumentation
User experience or workflow processing
Internet of Things (IoT) scenarios
Can this be used for any transactional data, handling revenue or application sensitive data?
Based on what I read, it looks like it is meant for data where one need not worry about data loss. Is this the case?
It is mainly designed for large-scale ingestion of data. That is why typical scenarios include IoT solutions, which consist of a multitude of devices sending massive amounts of telemetry data.
To allow for this kind of scale, it does not include some features that other messaging services, like Azure Service Bus, do have. I think this blog does a good job of listing the differences. The Use Case section in particular explains things very well:
From a target use case perspective if we consider some of our typical enterprise integration patterns then if you are implementing a pattern which uses a Command Message, or a Request/Reply Message then you probably want to use Azure Service Bus Messaging. RPC patterns can be implemented using Request/Reply messages on Azure Service Bus using a response queue. These are really about ESB and EAI style messaging patterns where you want to send messages between applications and probably want to use other features such as property based routing.
Azure Event Hubs is more likely to be used if you’re implementing patterns with Event Messages and you want somewhere reliable to send them that is capable of dealing with a massive scale but will allow you to do stuff with the events out of process.
With these core target use cases in mind it is easy to see where the scale differences come into play. For messaging it’s about one application telling one or more apps to DO SOMETHING or GIVE ME SOMETHING. The alternative is that in eventing the applications are saying SOMETHING HAS HAPPENED. When you consider this in typical application scenarios and you put events into the telemetry and logging space you can quickly see that the SOMETHING HAS HAPPENED scenario will produce a lot more traffic than the other.
Now I’m not saying that you can’t implement some messaging type functions using event hubs and that you can’t push events to a Service Bus topic as in integration there are always different requirements which result in different implementation scenarios, but I think if you follow the above as a general rule then you will usually be on the right path.
That does not mean, however, that it is only suitable for data whose loss you can tolerate. Data is stored for a configurable amount of time and, if necessary, can be read again from an earlier point in time.
Now, given your scenario, I do not think Event Hub is the best fit. But truth be told, I am not sure, because you would have to elaborate on what exactly you want to do.
Addition
The idea behind Event Hubs is that you will get at least once delivery at great scale. (Source). See also this question: Does Azure Event Hub guarantees at least once delivery?
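At-least-once delivery means a consumer can receive the same event more than once, which matters for the revenue-style data asked about. A usual way to cope is to make the handler idempotent by remembering which event IDs have already been applied. This is a minimal sketch under that assumption; `IdempotentHandler` is a hypothetical name, and the set of seen IDs would need to live in durable storage in a real system.

```python
class IdempotentHandler:
    """Applies each event at most once, keyed by a unique event ID."""

    def __init__(self, apply):
        self._apply = apply
        self._seen = set()  # assumption: kept in memory only for this sketch

    def handle(self, event_id, payload):
        if event_id in self._seen:
            return False  # duplicate delivery: skip the side effect
        self._apply(payload)
        self._seen.add(event_id)
        return True
```

With this in place, a redelivered payment event updates the balance only the first time it arrives; later copies are acknowledged and ignored.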