Sharing EventHub between Azure Fabric reliable actors

I have an application where I map devices from the physical world to Reliable Actors in Azure Fabric. Each time I receive a message from a device, I want to push a message to an event hub.
What I'm doing right now is creating/using/closing the EventHubClient object for each message.
This is very inefficient (it takes about 1500ms) but it solves an issue I had in the past where I was keeping the EventHubClient in memory. When I have a lot of devices, the underlying virtual machine can quickly run out of network connections.
I'm thinking about creating a new actor that would be responsible for pushing data to the EventHub (by keeping the EventHubClient alive). Because of the turn-based concurrency model of Reliable Actors, I'm not sure it's a good idea. If I get 10 000 devices pushing data "at the same time", each of their actors will block while pushing its message to the new actor that pushes messages to the EventHub.
What is the recommended approach for this scenario?
Thanks,

One approach would be to create a stateless service that is responsible for pushing messages to the EventHub. Each time an Actor receives a message from the device (by the way, how are they communicating with actors?) the Actor calls the stateless service. The stateless service in turn would be responsible for creating, maintaining and disposing of one EventHubClient per service. A Reliable Service would not introduce the same 'overhead' when it comes to handling incoming messages as a Reliable Actor would. If it is important for your application that the messages reach the EventHub in strictly the same order that they were produced in, then you would have to do this with a Stateful Service and a Reliable Queue. (Note, on the other hand, that there is no guarantee that Actors would be able to finish handling incoming messages in the same order as they are produced.)
You could then fine-tune the solution by experimenting with the instance count (https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-availability-services) to make sure you have enough instances to handle the throughput of incoming messages. The right number of instances is roughly determined by the number of nodes and cores per node, although other factors may also affect it.
Devices communicate with your Actors, the Actors in turn communicate with the Service (may be Stateless or Stateful if you want to queue message, see below), each Service manages an EventHubClient that can push messages to the EventHub.
If your cluster is unable to support an instance count for this service that is high enough (a little simplified: more instances = higher throughput), then you may need to create it as a Stateful Service instead, put messages in a Reliable Queue in the Service, and then have the RunAsync for the Service process the queue in order. This could take the pressure off during peaks in load.
The Service Fabric Azure-Samples WordCount shows how you work with different Partitions to make the messages from Actors target different instances (or really partitions).
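As a rough illustration of the partition-targeting idea (this is not Service Fabric code), a caller can derive a stable partition index from the device id so that the same device always reaches the same partition. The function name `partition_for` and the use of SHA-256 are assumptions made for this sketch, and Python is used only for illustration.

```python
import hashlib


def partition_for(device_id: str, partition_count: int) -> int:
    """Map a device id to a partition index in [0, partition_count).

    A cryptographic digest is used instead of the built-in hash() because
    hash() of a str is randomized per process and would not be stable.
    """
    if partition_count <= 0:
        raise ValueError("partition_count must be positive")
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer and fold it
    # into the available partitions.
    return int.from_bytes(digest[:8], "big") % partition_count
```

Because the mapping depends only on the id, messages from one device stay on one partition, while different devices spread across all of them.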
A general tip would be to not try to use Actors for everything (for the right things they are great and reduce complexity a lot). The Reliable Services model supports a lot more scenarios and requirements and could really complement your Actors (rather than trying to make Actors do something they are not really designed for).

You could use a pub/sub pattern here (use the BrokerService).
By decoupling event publishing from event processing, you don't need to worry about the turn-based concurrency model.
Publishers:
The Actor sends out messages by simply publishing them to a BrokerService.
Subscribers
Then you use one or more Stateless Services or (different) Actors as subscribers of the events.
They would send them into EventHub at their own pace.
Event Hub Client
Using this approach you'd have full control over the EventHubClient instance counts and lifetimes.
You could increase event processing power by simply adding more subscribers.

In my opinion you should call the event hub directly from your actors, using a background thread with an internal in-memory queue. You should aggregate messages and use SendBatch to improve performance.
The event hub is able to handle the load by itself.
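A minimal sketch of the "background thread plus in-memory queue" idea, written in Python for illustration. The class `BatchingSender` and the `send_batch` callable are hypothetical; `send_batch` stands in for whatever batch-send call your Event Hub client offers, and no real client API is used here.

```python
import queue
import threading
from typing import Callable, List


class BatchingSender:
    """Buffers messages in memory and flushes them in batches on one thread."""

    def __init__(self, send_batch: Callable[[List[str]], None],
                 max_batch: int = 100, max_wait_seconds: float = 0.05) -> None:
        self._send_batch = send_batch   # hypothetical stand-in for a batch send
        self._max_batch = max_batch
        self._max_wait = max_wait_seconds
        self._queue: queue.Queue = queue.Queue()
        self._stop = object()           # sentinel that tells the worker to exit
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, message: str) -> None:
        """Called by producers; returns immediately."""
        self._queue.put(message)

    def close(self) -> None:
        """Flush everything enqueued so far and stop the worker."""
        self._queue.put(self._stop)
        self._worker.join()

    def _run(self) -> None:
        stopping = False
        while not stopping:
            item = self._queue.get()    # block until there is something to send
            if item is self._stop:
                break
            batch = [item]
            # Keep adding messages until the batch is full or the queue stays
            # empty for max_wait seconds.
            while len(batch) < self._max_batch:
                try:
                    item = self._queue.get(timeout=self._max_wait)
                except queue.Empty:
                    break
                if item is self._stop:
                    stopping = True
                    break
                batch.append(item)
            self._send_batch(batch)
```

The buffer lives only in memory, so anything not yet flushed is lost if the process dies; whether that is acceptable depends on your delivery requirements.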

Related

Replaying Messages in Order

I am implementing a consumer which does processing of messages from a queue where order of messages is of importance. I would like to implement a mechanism using NodeJS where:
the consumer function is consuming messages m1, m2, ..., mN from the queue
doing an IO intensive operation and process the messages. m -> m'
Storing the result m' in a redis cache.
acknowledging the queue after each message process (2)
In a different function, I am listening to the message from the cache
sending the processed messages m' to an external system
if the external system was able to process the message, then delete the processed message from the cache
If the external system rejects the processed message, then stop sending messages, discard the unsent processed messages in the cache and reset the offset to the last accepted message in the queue. For example if m12' was the last message accepted by the system, and I have acknowledged m23 from the queue, then I have to discard m13' to m23' and reset the offset so that the consumer can read and start processing from m13 again.
Few assumptions:
The processing m to m' is intensive and I am processing them optimistically, knowing that most of the times there won't be a failure
With the current assumptions and goals, is there any way I can achieve this with RabbitMQ or any Azure equivalent? My client doesn't prefer Kafka or any Azure equivalent of Kafka (Azure Event Hub).
In scenarios where the messages will always be generated in sequence then a simple queue is probably all you need.
Azure Queues are pretty simple to get into, but the general mode of operation for queues is to remove the messages as they are processed successfully.
If you can avoid the scenario where you must "roll back" or re-process from an earlier point, that is, if you can avoid the orchestration aspect, then this would be a much simpler option.
It's the "go back and replay" that you will struggle with. If you can implement two queues in a sequential pattern, where processing messages from one queue successfully pushes the message into the next queue, then we never need to go back, because the secondary consumer can never process ahead of the primary.
With Azure Event Hubs it is much easier to reset the offset for processing, because the messages stay in the bucket regardless of their read state, (in fact any given message does not have such a state) and the consumer maintains the offset pointer itself. It also has support for multiple consumer groups, which will make a copy of the message available to each consumer.
You can up your plan to maintain the data for up to 7 days without blowing the budget.
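To make the consumer-owned offset idea concrete, here is a small Python sketch over an in-memory list standing in for the retained log. It processes a window of messages optimistically and, on rejection, discards the unsent results and resumes from the rejected message. The function name and the parameters `lookahead` and `max_rejections` are assumptions for illustration, not part of any Event Hubs API.

```python
from typing import Callable, List, Sequence


def replay_from_last_accepted(
    log: Sequence[str],
    process: Callable[[str], str],
    deliver: Callable[[str], bool],
    lookahead: int = 4,
    max_rejections: int = 3,
) -> List[str]:
    """Deliver processed messages in order, rewinding after a rejection."""
    accepted: List[str] = []
    offset = 0        # index of the first message not yet accepted downstream
    rejections = 0
    while offset < len(log):
        # Optimistically process a window ahead of the confirmed offset.
        window = [process(m) for m in log[offset:offset + lookahead]]
        for result in window:
            if deliver(result):
                accepted.append(result)
                offset += 1
            else:
                rejections += 1
                if rejections > max_rejections:
                    raise RuntimeError(f"giving up at offset {offset}")
                # Drop the rest of the window; offset still points at the
                # rejected message, so the next pass reprocesses from there.
                break
    return accepted
```

This only works because the log keeps messages after they are read; with a queue that deletes on acknowledgement there is nothing left to rewind to.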
There are two problems with large-scale telemetry ingestion services like Azure Event Hubs for your use case:
The order of receipt is less reliable for messages that arrive extremely close together. The Hub is designed to receive many messages from many sources concurrently, so its internal architecture cares a lot less about preserving the precise order. It records the precise receipt timestamp on each message, but it does not guarantee that the overall sequence of records will exactly match what you would get by sorting on the receipt timestamp. (It's a subtle but important distinction.)
Event Hubs (and many client processing code examples) are designed around at-least-once delivery across multiple concurrent consuming threads, not strictly sequential exactly-once processing. Again, the consumers are encouraged to be asynchronous, and failed processing attempts are expected to be retried by the next available thread.
So you could use Event Hubs, but you would have to bypass or disable a lot of its features, which is generally a strong signal that it is not the correct fit for your purpose. If you want to explore it though, you would want to limit the concurrency aspects:
minimise the partition count
You probably want 1 partition for each message producer, or at least for each sequential set; maintaining sequence is simpler inside a single partition
make sure your message sender (producer) only sends to a specific partition
Each producer MUST use a unique partition key
create a consumer group for each of your consumers
process messages one at a time, not in batches
process with a single thread
I have a lot of experience in designing MS Azure based solutions for Industrial IoT (Telemetry from PLCs) and Agricultural IoT (Raspberry Pi) device implementations. In almost all cases we think that the order of messaging is important, but unless you are maintaining real-time two-way command and control, you can usually get away with an optimistic approach where each message and any derivatives are or were correct at the time of transmission.
If there is the remote possibility that a device can be offline for any period of time, then dealing with the stale data flushing through the system when a device comes back online can really play havoc with sequential logic programming.
Take a step back to analyse your solution. EventHubs does offer a convenient way to roll back the processing to a previous offset, as long as that record is still in the bucket, but can you re-design your logic flow so that you do not have to re-process old data?
What is the requirement that drives this sequence? If it is so important to maintain the sequence, then you should probably process the data with a single consumer that does everything, or look at chaining the queues in a sequential manner.

Node.js application acting as producer and consumer

I am now working on the application saving data into the database using the REST API. The basic flow is: REST API -> object -> save to database. I wanted to introduce the queue to the application, having in mind the idea of the producer and consumer being a part of one, abovementioned application.
Is it possible for the Node.js application to act as both producer and consumer of the queue? Knowing that Node.js is single-threaded language, does it give me any other choice instead of creating two applications - one producing to the queue and the second one - waiting actively for messages in a queue and saving to the database?
Also, the requirement here would be for an application to process any item that hasn't been acknowledged on the queue on the restart. That also makes me think that the 'two applications' architecture is the best idea here.
Thank you for the help.
Yes, nodejs is able to do that and is well suited for every I/O intensive application use case. The point here is "what are you trying to achieve"? Message queues are meant to make different applications communicate with each other, whereas if all you need is an in-process event bus, a message queue is total overkill. There are many easier and more efficient ways to propagate messages between decoupled components of the same nodejs app; one of these ways is EventEmitter, which lets your components collaborate in a pubsub fashion.
If you are convinced that an AMQP broker is your solution, you just need to:
Define a "producer" class that publishes data on an exchange myExchange
Define a "consumer" class that declares a queue myQueue
Create a binding at application startup between myExchange and myQueue, based on some routing key. Then, when a message is received from "consumer" you need to acknowledge after db saving. When a message is acked, it will be destroyed since it's already been consumed. You can decide, after an error, to recover the message via NACK
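The "acknowledge only after the database save" rule can be sketched with a toy in-memory queue. This is Python for illustration and not an AMQP client: `InMemoryQueue`, `publish` and `consume_one` are hypothetical names, and a real broker would also handle redelivery after a crash, which this sketch does not.

```python
from collections import deque
from typing import Callable, Deque


class InMemoryQueue:
    """Toy stand-in for a broker queue with manual acknowledgements."""

    def __init__(self) -> None:
        self._ready: Deque[str] = deque()

    def publish(self, body: str) -> None:
        self._ready.append(body)

    def consume_one(self, handler: Callable[[str], None]) -> str:
        """Deliver one message and return 'empty', 'ack' or 'nack'."""
        if not self._ready:
            return "empty"
        body = self._ready.popleft()
        try:
            handler(body)                  # e.g. save to the database
        except Exception:
            # nack: put the message back at the front so order is preserved
            # and it will be delivered again.
            self._ready.appendleft(body)
            return "nack"
        # ack: the handler succeeded, so the message is gone for good.
        return "ack"
```

The important part is the ordering: the message leaves the queue permanently only after the handler has returned without error.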
There are nodejs libraries that make code easier, such as Rascal
Short answer: YES and use two separate connections for publishing and consuming
Is it possible for the NodeJS application to act as both producer and consumer of the queue?
I would even state that it is a good use case, matching extremely well with the NodeJS philosophy and threading mechanism.
Knowing that Node.js is single-threaded language, does it give me any other choice instead of creating two applications - one producing to the queue and the second one - waiting actively for messages in a queue and saving to the database?
You can have one application handling both; just be aware that if your client publishes too fast for the server to handle, RabbitMQ can apply back pressure on the TCP connection, and consuming on a back-pressured TCP connection would greatly affect consumer performance.

Should single micro-service listen to single azure bus topic/queue?

We have an Azure Service Fabric micro-service which listens to multiple Azure Service Bus topics (Topic A, Topic B).
Topic A has more than 10 times the message traffic of Topic B, and to handle the scalability of the service we will create multiple instances of the service.
My first question is, In most of the services instance will not get the message in Topic B, As Topic B has less traffic, So will it be waste of resources ?
2 Is it better to create different micro-services for Topic A and Topic B listeners, and create 10x instance of micro-service which listen to topic A and x instance of topic B listener service ?
Is create a message listener in azure service bus, keep on pulling message every time ? means continuously looking/ checking for message, message is there or not.
Thanks Guys for your supports.
If one service receives messages from 2 topics, there's little waste of resources. Listening for messages is not a very resource intensive process.
This depends on your application requirements.
This depends on whether you are using SBMP / SOAP (default) or AMQP as the communication protocol. AMQP is connection based. SBMP does (long) polling.
Microservices advocate the idea of loosely coupled services, where each micro-service handles its own domain.
Following the microservices approach, if you found that you had to create two different topics to publish your messages, it is probably because they have different scopes\domains, each needing its own micro-service.
From your description it is hard to tell whether the domains of TopicA and TopicB are related, so we cannot offer a good suggestion.
In any case, suppose one service listens to both topics, with TopicA handling 1000 messages per second and TopicB handling 100.
If you have to publish a new version of your application to handle changes to TopicB messages, you would have to stop handling TopicA as well, which should not be necessary. You are coupling services that should have been two independent services to begin with; otherwise both topics should be handled as a single one.
Regarding your questions:
1 My first question is, In most of the services instance will not get
the message in Topic B, As Topic B has less traffic, So will it be
waste of resources ?
Whether resources are wasted depends on how you design your application. It might be wasteful if your service both listens to the queue\topic and handles the messages itself, and uses too much memory to stay running all the time. In that scenario, it would make sense to split it into a queue\topic listener and a separate message handler that receives the messages to process; if the handler goes too long without processing messages, you shut it down and leave just the listener running. You could also use actors instead of a service.
2 Is it better to create different micro-services for Topic A and
Topic B listeners, and create 10x instance of micro-service which
listen to topic A and x instance of topic B listener service ?
Yes for the services. The number of instances should be driven by the size of the queue; otherwise you would have too many listeners, which also wastes resources. If you follow the approach of splitting the services, you would need one listener that receives the messages from the queue\topic and delivers them to multiple message handlers (service instances\actors), with the queue\topic listener controlling the number of instances running at the same time.
3 Is create a message listener in azure service bus, keep on pulling
message every time ? means continuously looking/ checking for message,
message is there or not.
It is not the only approach, but it is correct.

Detect and Delete Orphaned Queues, Topics, or Subscriptions on Azure Service Bus

If there are no longer any publishers or subscribers reading nor writing to a Queue, Topic, or Subscription, because of crashes or other abnormal terminations (instance restart, etc.), is that Queue/Topic/Subscription effectively orphaned?
I tested this by creating a few Queues, and then terminating the applications. Those Queues were still on the Service Bus a long time later. It seems that they will just stay there forever. That would be wonderful if we WANTED that behavior, but in this case, we do not.
How can we detect and delete these Queues, Topics, and Subscriptions? They will count towards Azure limits, etc, and we cannot have these orphaned processes every time an instance is restarted/patched/crashes.
If it helps make the question clearer, this is a unique situation in which the Queues/Topics/Subscriptions have special names, or special Filters, and a very limited set of publishers (1) and subscribers (1) for a limited time. This is not a case where we want survivability. These are instance-specific response channels. Whether we use Queues or Subscriptions is immaterial. If the instance is gone, so is the need for that Queue (or Subscription).
This is part of a solution where each web role has a dedicated response channel that it monitors. At any time, this web role may have dozens of requests pending via other messaging channels (Queues/Topics), and it is waiting for the answers on multiple threads. We need the response to come back to the thread that placed the message, so that the web role can respond to the caller. It is no good in this situation to simply have a Subscription based on the machine, because it will be receiving messages for other threads. We need each publishing thread to establish a dedicated response channel, so that the only thing on that channel is the response for that thread.
Even if we use Subscriptions (with some kind of instance-related filter) to do a long-polling receive operation on the Subscription, if the web role instance dies, that Subscription will be orphaned, correct?
This question can be boiled down like so:
If there are no more publishers or subscribers to a Queue/Topic/Subscription, then that service is effectively orphaned. How can those orphans be detected and cleaned up?
In this scenario you are looking for the Queue/Subscriptions to be "dynamic" in nature. They would be created and removed based on use as opposed to the current explicit provisioning model for these entities. Service Bus provides you with the APIs to perform create/delete operations so you can plug these on role OnStart/OnStop events appropriately. If those operations fail for some reason then the orphaned entities will exist. Again you can run clean up operation on them based on some unique identifier for the name of the entities. An example of this can be seen here: http://windowsazurecat.com/2011/08/how-to-simplify-scale-inter-role-communication-using-windows-azure-service-bus/
In the near future we will add more metadata and query capabilities to Queues/Topics/Subscriptions so you can see when they were last accessed and make cleanup decisions.
Service Bus Queues are built using the “brokered messaging” infrastructure designed to integrate applications or application components that may span multiple communication protocols, data contracts, trust domains, and/or network environments. This allows for a mechanism to communicate reliably with durable messaging.
If a client (publisher) sends a message to a service bus queue and then crashes, the message will be stored on the queue until a consumer reads the message off the queue. Also, if your consumer dies and restarts, it will just poll the queue and pick up any work that is waiting for it (you can scale out and have multiple consumers reading from the queue to increase throughput). Service Bus Queues allow you to decouple your applications via a durable cloud gateway analogous to MSMQ on-premises (or other queuing technology).
What I'm really trying to say is that you won't get an orphaned queue; you might get poisoned messages that you will need to handle. This blog post gives some very detailed information re: Service Bus Queues and their Capacity and Quotas, which might give you a better understanding: http://msdn.microsoft.com/en-us/library/windowsazure/hh767287.aspx
Re: Queue Management, you can do this via Visual Studio (1.7 SDK & Tools) or there is an excellent tool called Service Bus Explorer that will make your life easier for queue management: http://code.msdn.microsoft.com/windowsazure/Service-Bus-Explorer-f2abca5a
*Note the default maximum number of queues is 10,000 (per service namespace, this can be increased via a support call)
As Abhishek Lai mentioned, there is no orphan detection capability supported.
Orphan detection can be implemented externally in multiple ways.
For example, whenever you send/receive a message, update a timestamp in an SQL database to indicate that the queue/topic/subscription is still active. This timestamp can then be used to determine orphans.
If your process crashes, which is very much possible, message delivery within the queue may be affected, but the queue will still be available to process your requests. Handling application crashes and unreadable messages with Windows Azure Service Bus queues is described here:
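A minimal Python sketch of the timestamp approach, keeping the "last active" times in a dictionary where the answer suggests an SQL table. `ActivityTracker`, `touch` and `orphans` are hypothetical names; deleting the entities it reports would still go through the Service Bus management API, which is not shown.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


class ActivityTracker:
    """Records when each queue/topic/subscription was last used."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, datetime] = {}

    def touch(self, entity_name: str, now: Optional[datetime] = None) -> None:
        """Call on every send/receive to mark the entity as still active."""
        self._last_seen[entity_name] = now or datetime.now(timezone.utc)

    def orphans(self, max_idle: timedelta,
                now: Optional[datetime] = None) -> List[str]:
        """Return the entities that have been idle for longer than max_idle."""
        current = now or datetime.now(timezone.utc)
        return sorted(
            name for name, seen in self._last_seen.items()
            if current - seen > max_idle
        )
```

A periodic cleanup job could call `orphans` with a threshold comfortably larger than the longest expected quiet period, so that a slow but live instance is not mistaken for a dead one.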
The Service Bus provides functionality to help you gracefully recover from errors in your application or difficulties processing a message. If a receiver application is unable to process the message for some reason, then it can call the Abandon method on the received message (instead of the Complete method). This will cause the Service Bus to unlock the message within the queue and make it available to be received again, either by the same consuming application or by another consuming application.
In the event that the application crashes after processing the message but before the Complete request is issued, then the message will be redelivered to the application when it restarts. This is often called At Least Once Processing, that is, each message will be processed at least once but in certain situations the same message may be redelivered. If the scenario cannot tolerate duplicate processing, then application developers should add additional logic to their application to handle duplicate message delivery. This is often achieved using the MessageId property of the message, which will remain constant across delivery attempts.
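The duplicate-handling logic mentioned above can be sketched as follows, in Python for illustration. `IdempotentHandler` is a hypothetical wrapper, and the in-memory set is an assumption made for brevity; to survive restarts the seen ids would need to live in durable storage.

```python
from typing import Callable, Set


class IdempotentHandler:
    """Skips messages whose id has already been processed successfully."""

    def __init__(self, handler: Callable[[str], None]) -> None:
        self._handler = handler
        self._seen: Set[str] = set()

    def handle(self, message_id: str, body: str) -> bool:
        """Return True if the body was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._handler(body)
        # Record the id only after the handler succeeds, so that a failed
        # attempt is processed again when the message is redelivered.
        self._seen.add(message_id)
        return True
```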
If there are no longer any processes reading nor writing to a queue, because of crashes or other abnormal terminations (instance restart, etc.), is that queue effectively orphaned?
No, the queue is in place to allow communication to occur via Brokered Messages. If all your apps die for some reason then the queue still exists and will be there when they come back to life; it's the communication channel for loosely coupled applications. Regarding billing: 'Messages are charged based on the number of messages sent to, or delivered by, the Service Bus during the billing month', so you won't be charged if a queue exists but nobody is using it.
I tested this by creating a few queues, and then terminating the
applications. Those queues were still on the machine a long time
later.
The whole point of the queue is to guarantee message delivery between loosely coupled applications. Think of the queue as an entity or application in its own right with high availability (SLA), as it's hosted in Azure; your producers/consumers can die/restart and the queue will still be active in Azure. *Note I got a bit confused with your wording re: "still on the machine a long time later": the queue doesn't actually live on your machine, it sits up in Azure in a designated service bus namespace. You can view and manage the queues via the tools I pointed out in the previous answer.
How can we detect and delete these queues, as they will count towards
Azure limits, etc.
As stated above, the default maximum number of queues is 10,000 (per service namespace; this can be increased via a support call), and queue management can be done via the tools stated in the other answer. You should only be looking to delete queues when you no longer have producers/consumers looking to write to them (i.e. never again). You can of course create and delete queues in your producer/consumer applications, checking via namespaceManager.QueueExists; more information here: How to Use Service Bus Queues
If it helps make the question clearer, this is a unique situation in which the queues have special names, and a very limited set of publishers (1) and subscribers (1) for a limited time.
It sounds like you need to use Topics & Subscriptions (How to Use Service Bus Topics/Subscriptions); this link also has a section on 'How to Delete Topics and Subscriptions'. If you have a very limited lifetime then you could handle topic creation/deletion in your apps; otherwise you could have a separate Queue/Topic/Subscription setup/deletion script to handle this logic...

How to design a service that processes messages arriving in a queue

I have a design question for a multi-threaded windows service that processes messages from multiple clients.
The rules are
Each message is to process something for an entity (with a unique id) and can be different i.e DoA, DoB, DoC etc. Entity id is in the payload of the message.
The processing may take some time (up to few seconds).
Messages must be processed in the order they arrive for each entity (with same id).
Messages can however be processed for another entity concurrently (i.e as long as they are not the same entity id)
The no of concurrent processing is configurable (generally 8)
Messages can not be lost. If there is an error in processing a message then that message and all other messages for the same entity must be stored for future processing manually.
The messages arrive in a transactional MSMQ queue.
How would you design the service. I have a working solution but would like to know how others would tackle this.
First thing you do is step back, and think about how critical performance is for this application. Do you really need to process messages concurrently? Is it mission critical? Or do you just think that you need it? Have you run a profiler on your service to find the real bottlenecks of the process and optimized those?
The reason I ask is because you mention you want 8 concurrent processes. However, if you make this app single-threaded, it will greatly reduce the complexity and the development and testing time... And since you only want 8, it almost seems not worth it...
Secondly, since you can only process concurrent messages on the same entity, how often will you really get concurrent requests from your client to process the same entity? Is it worth adding so many layers of complexity for a use case that might not come up very often?
I would KISS. I'd use MSMQ via WCF, and keep my WCF service as a singleton. Now you have the power and ordered reliability of MSMQ, and you are meeting your actual requirements. Then I'd test it at high load with realistic data, and run a profiler to find bottlenecks if I found it was too slow. Only then would I go through all the extra trouble of building a much more complex app to manage concurrency for only specific use cases...
One design to consider is creating a central 'gate keeper' or 'service bus' service that receives all the messages from the clients, and then passes these messages down to the actual worker service(s). When it gets a request, it finds out whether one of its workers is already processing a message for the same entity; if so, it sends the request to the same service it sent the other message to. This way you can process the same messages for a given entity concurrently and nothing more... And you have ease of seamless scalability... However, I would only do this if I absolutely had to and it was proved out via profiling and testing, and not because 'we think we needed it' (see the YAGNI principle :))
My approach would be the following:
Create a threadpool with your configurable number of threads.
Keep map of entity ids and associate each id with a queue of messages.
When you receive a message place it in the queue of the corresponding entity id.
Each thread will only look at the entity id dedicated to it (e.g. make a class that is initialized as such Service(EntityID id)).
Let the thread only process messages from the queue of its dedicated entity id.
Once all the messages are processed for the given entity id remove the id from the map and exit the loop of the thread.
If there is room in the threadpool, then add a new thread to deal with the next available entity id.
You'll have to manage the messages that can't be processed at the time, including the situations where the message processing fails. Create a backlog of messages, etc.
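The steps above might look roughly like this in Python. It is a sketch under several assumptions: `EntityDispatcher` and its members are hypothetical names, messages are plain strings, "stored for manual processing" is modelled as an in-memory `parked` list, and the transactional MSMQ receive is left out entirely.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Deque, Dict, List, Set, Tuple


class EntityDispatcher:
    """Processes messages in order per entity, concurrently across entities."""

    def __init__(self, handler: Callable[[str, str], None],
                 max_workers: int = 8) -> None:
        self._handler = handler
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        # entity id -> queued messages; a key exists only while a worker
        # is draining that entity.
        self._pending: Dict[str, Deque[str]] = {}
        self._failed: Set[str] = set()
        self.parked: List[Tuple[str, str]] = []   # kept for manual handling

    def submit(self, entity_id: str, message: str) -> None:
        with self._lock:
            if entity_id in self._failed:
                self.parked.append((entity_id, message))
                return
            existing = self._pending.get(entity_id)
            if existing is not None:
                existing.append(message)   # the active worker will pick it up
                return
            self._pending[entity_id] = deque([message])
        self._pool.submit(self._drain, entity_id)

    def _drain(self, entity_id: str) -> None:
        while True:
            with self._lock:
                messages = self._pending[entity_id]
                if not messages:
                    del self._pending[entity_id]   # done with this entity
                    return
                message = messages.popleft()
            try:
                self._handler(entity_id, message)
            except Exception:
                with self._lock:
                    # Park the failed message and everything queued behind it.
                    self._failed.add(entity_id)
                    self.parked.append((entity_id, message))
                    remaining = self._pending.pop(entity_id)
                    self.parked.extend((entity_id, m) for m in remaining)
                return

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)
```

One lock guards the whole map here, which matches the suggestion of wrapping the map in a structure with appropriate locking; a worker stays with its entity until that entity's queue is empty.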
If you have access to a concurrent map (a lock-free/wait-free map), then you can have multiple readers and writers to the map without the need for locking or waiting. If you can't get a concurrent map, then all the contention will be on the map: whenever you add messages to a queue in the map or you add new entity ids you have to lock it. The best thing to do is wrap the map in a structure that offers methods for reading and writing with appropriate locking.
I don't think you will see any significant performance impact from locking, but if you do start seeing one I would suggest that you create your own lock-free hash map: http://www.azulsystems.com/events/javaone_2007/2007_LockFreeHash.pdf
Implementing this system will not be a rudimentary task, so take my comments as a general guideline... it's up to the engineer to implement the ideas that apply.
While my requirements were different from yours, I did have to deal with the concurrent processing from a message queue. My solution was to have a service which would look at each incoming message and hand it off to an agent process to consume. The service has a setting which controls how many agents it can have running.
I would look at having n threads, each reading from its own thread-safe queue. I would then hash the EntityId to decide which queue to put an incoming message on.
Sometimes, some threads will have nothing to do, but is this a problem if you have a few more threads than CPUs?
(Also you may wish to group entities by type into the queues so as to reduce the number of locking conflicts in your database.)
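A small Python sketch of this hashed-queue idea, assuming one queue per worker thread. `HashedWorkers` is a hypothetical name, CRC32 is just one convenient stable hash, and error handling (parking failed messages) is deliberately omitted.

```python
import queue
import threading
import zlib
from typing import Callable, List


class HashedWorkers:
    """Fixed set of worker threads; each entity always maps to one worker."""

    def __init__(self, handler: Callable[[str, str], None],
                 worker_count: int = 8) -> None:
        self._handler = handler
        self._queues: List[queue.Queue] = [queue.Queue() for _ in range(worker_count)]
        self._threads = [
            threading.Thread(target=self._run, args=(q,), daemon=True)
            for q in self._queues
        ]
        for thread in self._threads:
            thread.start()

    def submit(self, entity_id: str, message: str) -> None:
        # A stable hash sends every message for an entity to the same queue,
        # so per-entity order follows from the queue's FIFO order.
        index = zlib.crc32(entity_id.encode("utf-8")) % len(self._queues)
        self._queues[index].put((entity_id, message))

    def close(self) -> None:
        """Let each worker finish its backlog, then stop it."""
        for q in self._queues:
            q.put(None)
        for thread in self._threads:
            thread.join()

    def _run(self, q: queue.Queue) -> None:
        while True:
            item = q.get()
            if item is None:      # shutdown marker
                return
            entity_id, message = item
            self._handler(entity_id, message)
```

Compared with a per-entity map this needs no shared lock on the hot path, at the cost that a slow entity delays the other entities that hash to the same worker.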
