Is Azure Service Bus slower with SessionId than without, or is the speed the same?

I'm using Azure Service Bus to send messages, and I was wondering whether using SessionId affects the speed of sending messages compared with not using it.
I know that SessionId preserves the order, but what about the overall speed?
Thanks

Sending a message will not be much slower when you specify a session ID. Processing will be, but this is the wrong terminology to use. You can't compare handling messages w/o a session by multiple concurrent consumers and sessioned messages where the intent is to process those messages in the order they were sent in. Different business requirements that have different justifications, right? If you plan to use sessions, processing will be somewhat slower due to only a single active consumer being able to process all the messages from a given session. And that has to be backed up by a requirement, probably.
Take, for example, handling items scanned at a grocery checkout. If you want to know what items are purchased in general, competing consumers is the way to go. However, if you want to know what items were bought per purchase, you can't use a competing consumer and have to use sessions to ensure only items for a given purchase are included and nothing else. Will the latter be somewhat slower? Yes, but you can't accomplish it with a competing consumer, and if the business wants it, they'll accept the cost of slightly slower processing to gain the insights. Note that there are always multiple ways to solve the problem, and maybe sessions are not what's needed at all.
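To make the checkout example concrete, here is a minimal sketch in plain Python. It does not use the Service Bus SDK; the message tuples, the purchase IDs standing in for session IDs, and both function names are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical scanned-item messages: (session_id, item).
# The session_id stands in for a purchase ID; the data is made up.
messages = [
    ("purchase-1", "milk"),
    ("purchase-2", "bread"),
    ("purchase-1", "eggs"),
    ("purchase-2", "milk"),
    ("purchase-1", "bread"),
]

def items_overall(msgs):
    """Competing-consumer view: only totals matter, so any consumer
    can take any message in any order."""
    return Counter(item for _, item in msgs)

def items_per_purchase(msgs):
    """Session view: messages with the same session_id are kept together
    and in arrival order, as a single session consumer would see them."""
    baskets = defaultdict(list)
    for session_id, item in msgs:
        baskets[session_id].append(item)
    return dict(baskets)
```

The first function can be split across as many workers as you like; the second only works if everything for one purchase ends up with the same consumer, which is the constraint sessions impose.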

Related

Azure Service Bus - is it a good solution for peer-to-peer messaging platform?

We are designing a system where users can exchange "messages" (let's say XML files for simplicity's sake). This system is peer-to-peer by design, meaning only directed messages are supported. User A can only send a message to User B; it is not possible to send messages to "groups" of users, etc. FIFO order is a mandatory requirement as well.
This must be a reliable solution - so we started looking into Azure and its services. And Service Bus does look like the right solution to me. It offers all bells and whistles we are looking for:
FIFO order is guaranteed
Dead-letter queue with timeouts
Geo-redundancy
Transactions
and so on
So naturally, I started playing with it. And the first idea I had was to give each user of my system a QUEUE from the service bus. It will act as an INBOX for them. Other users send messages to the user (let's say using the unique USER_ID as a queue ID, for example), messages accumulate in the queue, and when the user decides to check the inbox, they will get all the messages in the correct order. This way we "outsource" all routing, security, etc. to the service bus itself, thus considerably simplifying the app logic.
But there is a serious caveat in this approach - Service Bus supports only up to 10,000 queues: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-azure-and-service-bus-queues-compared-contrasted#capacity-and-quotas and the number of users in my system can reach tens of thousands (but max out at 100,000 or so). So I'm somewhat in the range but not really. Therefore, I have questions:
Is there a flaw in my approach? Overall, is that a good idea to give a queue to the user exclusively? Or perhaps I should implement some kind of metadata and route messages based on it?
Am I looking at the right solution? I want to use SaaS as much as possible, so I don't want to start building RabbitMQ clusters on VMs, etc., but are there built-in alternatives? Maybe a different approach should be considered?
As for the numbers, I'm looking to start with 2,000 users and 200,000 messages a day - not a high load by any means. But if things work out, I see how these numbers can increase by 20x - 30x (but no more).
I would appreciate any options on this. Thank you.

Architecture issue - Azure servicebus and message order guarantee

OK, so I'm relatively new to Service Bus. I'm working on a project where we use Azure Service Bus for queueing messages. Our architecture roughly looks like the following:
The idea is that all kinds of things happen in our SourceSystem, which leads to messages being put on the Service Bus topics. Our responsibility is syncing these events to the external client so they are aware of what we are doing.
The issue is that we currently don't use Service Bus sessions, so message order isn't guaranteed. Also consider the following scenario:
OrderCreated
OrderUpdate 1
OrderUpdate 2
OrderClosed
What happens now is that if the external client's API is down for, say, OrderUpdate 1 and OrderUpdate 2, we could potentially send the messages in this order: OrderCreated, OrderClosed, OrderUpdate 1, OrderUpdate 2.
Currently we just retry a message a few times and then it moves into the deadletter queue for manual reprocessing.
What steps should we take to better guarantee message order? I feel like in the scope of an order, message order needs to be guaranteed.
Should we force the source system to put all messages for an order in a Service Bus session? But how can we handle this with multiple topics? And what do we do if message 1 from a session ends up in the dead-letter queue?
There are a lot of considerations here. Should we use a single topic so it's easier to manage the sessions? But doesn't that open up other problems, with different message structures being in a single topic?
I'd love to hear your opinions on this.
Have a look at Durable Functions in Azure. You can use the 'Async Http API' or one of the other patterns to achieve the orchestration you need to do.
NServiceBus Sagas might also be a good option; here is an article that does a very good comparison between NServiceBus and Durable Functions.
If the external client has to receive all those events and order matters, sending those messages to multiple topics where a topic is per message type will make your mission extremely hard to accomplish. For ordered messaging first you need to use a single entity (queue or topic) with Sessions enabled. That way you can guarantee ordered message processing. In case you have multiple external clients, you'd need to have a session-enabled entity (topic) per external client.
Another option is to implement a pattern known as Process Manager. The process manager would be responsible to make the decisions about the incoming messages and conclude when the work for a given order is completed or not.
There are also libraries (MassTransit, NServiceBus, etc) that can help you. NServiceBus implements Process Manager via a feature called Saga (tutorial) and MassTransit has it as well (documentation).
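As a rough illustration of the Process Manager idea, here is a minimal sketch in plain Python. It assumes each event carries a per-order sequence number, which the question does not mention; the class name, the `handle` method, and the `forwarded` list (standing in for the call to the external client) are all hypothetical. It is not how NServiceBus or MassTransit implement sagas.

```python
class OrderProcessManager:
    """Tracks one order and releases its events strictly in sequence,
    buffering any event that arrives before its predecessors."""

    def __init__(self):
        self.next_seq = 1       # sequence number we are waiting for
        self.pending = {}       # early arrivals, keyed by sequence number
        self.forwarded = []     # stand-in for "sent to the external client"
        self.completed = False  # set once OrderClosed has been forwarded

    def handle(self, seq, event):
        self.pending[seq] = event
        # Release every event that is now contiguous with what was sent.
        while self.next_seq in self.pending:
            ready = self.pending.pop(self.next_seq)
            self.forwarded.append(ready)
            if ready == "OrderClosed":
                self.completed = True
            self.next_seq += 1
```

With the scenario from the question, feeding in OrderCreated (1), OrderClosed (4), OrderUpdate 1 (2), OrderUpdate 2 (3) holds OrderClosed back until both updates have gone out.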

SocketIO scaling architecture and large rooms requirements

We are using socketIO on a large chat application.
At some points we want to dispatch "presence" (user availability) to all other users.
io.in('room1').emit('availability:update', {userid: 'xxx', isAvailable: false});
room1 may contain a lot of users (500 max). We observe a significant rise in our NodeJS load when many availability updates are triggered.
The idea was to use something similar to the Redis store with Socket.IO, and have web browser clients connect to different NodeJS servers.
When we want to emit to a room, we dispatch the "emit to room1" payload to all other NodeJS processes using Redis PubSub, ZeroMQ, or even RabbitMQ for persistence. Each process will then call its own io.in('room1').emit to target its subset of connected users.
One of the concerns with this setup is that the inter-process communication may become quite busy, and I was wondering if it may become a problem in the future.
Here is the architecture I have in mind.
Could you batch changes and only distribute them every 5 seconds or so? In other words, on each node server, simply take a 'snapshot' every X seconds of the current state of all users (e.g. 'connected', 'idle', etc.) and then send that to the other relevant servers in your cluster.
Each server then does the same, every 5 seconds or so it sends the same message - of only the changes in user state - as one batch object array to all connected clients.
Right now, I'm rather surprised you are attempting to send information about each user as a packet. Batching seems like it would solve your problem quite well, as it would also make better use of standard packet sizes that are normally transmitted via routers and switches.
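The batching idea could look roughly like the sketch below. It is written in plain Python rather than Node.js to keep it self-contained; `PresenceBatcher`, `record`, and `flush` are made-up names, and the timer that would call `flush` every 5 seconds or so is left out.

```python
class PresenceBatcher:
    """Collects availability changes and hands them out as one batch.
    Only the latest state per user is kept between flushes."""

    def __init__(self):
        self._changes = {}  # user id -> latest isAvailable value

    def record(self, user_id, is_available):
        # A later change for the same user overwrites the earlier one,
        # so flapping users cost one entry per batch, not one per change.
        self._changes[user_id] = is_available

    def flush(self):
        """Return the accumulated changes as a list and start a new batch."""
        batch = [
            {"userid": user_id, "isAvailable": is_available}
            for user_id, is_available in self._changes.items()
        ]
        self._changes = {}
        return batch
```

Each server would call `flush()` on its timer and emit the returned list as a single event to the room, skipping the emit when the list is empty.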
You are looking for this library:
https://github.com/automattic/socket.io-redis
Which can be used with this emitter:
https://github.com/Automattic/socket.io-emitter
As for the available-users function, I think there are two alternatives: you can create a "users queue" that contains the "public data" of connected users, or you can use exchange binding information to show which users are connected. If you use a "users queue", it will be the same for each "room", and you could update it when a user leaves by "popping" that user's state message from the queue (although you would have to "reorganize" all the queue's messages to do so).
Nevertheless, I think that RabbitMQ is designed for asynchronous communication, and keeping a register of user presence in it is not a very useful approach. I think it is better suited to applications where you don't know when the user will receive the message or what their "real availability" is ("fire and forget" architectures). ZeroMQ requires more work from scratch, but you could implement something more specific to your situation with better performance.
A publish/subscribe example from the RabbitMQ site could be a good starting point for a new design like yours, where a message is sent to several users at the same time. In summary, I would create two queues per user (one for receiving and one for sending messages) and use a specific exchange for each chat room, tracking which users are in each room through the exchange binding information. You always have two queues per user, and you create exchanges to bind them to one or more chat rooms.
I hope this answer is useful to you, and sorry for my bad English.
This is the common approach for sharing data across several Socket.io processes. You have done well, so far, with a single process and a single thread. I could lamely assume that you could pick any of the mentioned technologies for communicating shared data without hitting any performance issues.
If all you need is IPC, you could perhaps have a look at Faye. If, however, you need to have some data persisted, you could start a Redis cluster with as many Redis masters as you have CPUs, though this will add minor networking noise for Pub/Sub.

Azure Service Bus - Determine Number of Active Connections (Topic/Queue)

Since Azure Service Bus limits the maximum number of concurrent connections to a Queue or Topic to 100, is there a method that we can use to query our Queues/Topics to determine how many concurrent connections there are?
We are aware that we can capture the throttling events, but would very much prefer an active approach, where we can proactively increase or decrease the number of Queues/Topics when the system is under a heavy load.
The use case here is a process waiting for a reply message, where the reply is coming from a long-running process, and the subscription is using a Correlation Filter to facilitate two-way communication between the Publisher and Subscriber. Thus, we must have a BeginReceive() going in order to await the response, and each such Publisher will be consuming a connection for the duration of their wait time. The system already balances load across multiple Topics, but we need a way to be proactive about how many Topics are created, so that we do not get throttled too often, but at the same time not have an excess of Topics for this purpose.
I don't believe it is currently possible to query the listener counts. I think that the subscriber object also figures into that, so in theory, if you have up to 2000 subscribers per topic and each allows up to 100 connections, that's a lot of potential connections. We just need to keep in mind that subscribers are cooperative (each gets a copy of all messages) and receivers on subscribers are competitive (only one gets it).
I've also seen unconfirmed reports of performance delays when you start running > 1,000 subscribers so make sure you test this scenario.
But... given your scenario, I'd deduce that processing time likely isn't the biggest factor (you have long-running processes already). So introducing a couple of seconds of lag into the workflow likely won't be critical. If that's the case, I'd set the timeout for your BeginReceive to something fairly short (a couple of seconds) and have a sleep/wait delay between attempts. This gives other listeners an opportunity to get messages as well. We might also want to consider an approach where we attempt to receive multiple messages and then assign them out to other processes for processing (correlation in this case?).
Just some thoughts.
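The short-timeout-plus-pause suggestion can be sketched as below. This is plain Python with a standard `queue.Queue` standing in for the subscription; it is not the Service Bus API, and `poll_for_reply` and its parameters are hypothetical names with arbitrary defaults.

```python
import queue
import time

def poll_for_reply(replies, receive_timeout=2.0, pause=1.0, max_attempts=5):
    """Wait for one reply using short receives separated by pauses,
    instead of holding a single long-lived receive open."""
    for _ in range(max_attempts):
        try:
            # Short blocking receive: give up after receive_timeout seconds.
            return replies.get(timeout=receive_timeout)
        except queue.Empty:
            # Nothing yet: back off so other listeners get a turn.
            time.sleep(pause)
    return None  # no reply within the allowed attempts
```

The trade-off is the one described above: a reply can sit unread for up to `pause` seconds, in exchange for each waiter occupying a receive slot only part of the time.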

How to design a service that processes messages arriving in a queue

I have a design question for a multi-threaded windows service that processes messages from multiple clients.
The rules are
Each message is to process something for an entity (with a unique id) and can be different i.e DoA, DoB, DoC etc. Entity id is in the payload of the message.
The processing may take some time (up to few seconds).
Messages must be processed in the order they arrive for each entity (with same id).
Messages can however be processed for another entity concurrently (i.e as long as they are not the same entity id)
The number of messages processed concurrently is configurable (generally 8).
Messages can not be lost. If there is an error in processing a message then that message and all other messages for the same entity must be stored for future processing manually.
The messages arrive in a transactional MSMQ queue.
How would you design the service? I have a working solution but would like to know how others would tackle this.
The first thing to do is step back and think about how critical performance is for this application. Do you really need to process messages concurrently? Is it mission critical? Or do you just think that you need it? Have you run a profiler on your service to find the real bottlenecks of the process and optimized those?
The reason I ask is because you mention you want 8 concurrent processes. However, if you make this app single-threaded, it will greatly reduce the complexity, development, and testing time... And since you only want 8, it almost seems not worth it...
Secondly, since you can only process concurrent messages on the same entity, how often will you really get concurrent requests from your client to process the same entity? Is it worth adding so many layers of complexity for a use case that might not come up very often?
I would KISS. I'd use MSMQ via WCF, and keep my WCF service as a singleton. Now you have the power and ordered reliability of MSMQ, and you are meeting your actual requirements. Then I'd test it at high load with realistic data, and run a profiler to find bottlenecks if I found it was too slow. Only then would I go through all the extra trouble of building a much more complex app to manage concurrency for only specific use cases...
One design to consider is creating a central 'gatekeeper' or 'service bus' service that receives all the messages from the clients and then passes these messages down to the actual worker service(s). When it gets a request, it checks whether one of its workers is already processing a message for the same entity; if so, it sends the request to the same service it sent the other message to. This way you can process the same messages for a given entity concurrently and nothing more... And you have ease of seamless scalability... However, I would only do this if I absolutely had to and it was proved out via profiling and testing, and not because 'we think we needed it' (see the YAGNI principle :))
My approach would be the following:
Create a threadpool with your configurable number of threads.
Keep map of entity ids and associate each id with a queue of messages.
When you receive a message place it in the queue of the corresponding entity id.
Each thread will only look at the entity id dedicated to it (e.g. make a class that is initialized as such Service(EntityID id)).
Let the thread only process messages from the queue of its dedicated entity id.
Once all the messages are processed for the given entity id remove the id from the map and exit the loop of the thread.
If there is room in the threadpool, then add a new thread to deal with the next available entity id.
You'll have to manage the messages that can't be processed at the time, including the situations where the message processing fails. Create a backlog of messages, etc.
If you have access to a concurrent map (a lock-free/wait-free map), then you can have multiple readers and writers to the map without the need for locking or waiting. If you can't get a concurrent map, then all the contention will be on the map: whenever you add messages to a queue in the map or add new entity ids, you have to lock it. The best thing to do is wrap the map in a structure that offers methods for reading and writing with appropriate locking.
I don't think you will see any significant performance impact from locking, but if you do start seeing one I would suggest that you create your own lock-free hash map: http://www.azulsystems.com/events/javaone_2007/2007_LockFreeHash.pdf
Implementing this system will not be a rudimentary task, so take my comments as a general guideline... it's up to the engineer to implement the ideas that apply.
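The approach in this answer (a thread pool plus a lock-protected map of per-entity queues) might be sketched as follows. It is plain Python rather than .NET, `EntityDispatcher` and its methods are invented names, and the failure handling and backlog mentioned above are deliberately left out: if the handler raises, that entity's queue is simply left stranded in this sketch.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor

class EntityDispatcher:
    """Processes messages in arrival order per entity id, while different
    entities are handled concurrently by a bounded thread pool."""

    def __init__(self, handler, max_workers=8):
        self._handler = handler          # callable(entity_id, message)
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()    # guards the map and its queues
        self._queues = {}                # entity id -> deque of messages

    def submit(self, entity_id, message):
        with self._lock:
            pending = self._queues.get(entity_id)
            if pending is not None:
                # A worker already owns this entity; just enqueue.
                pending.append(message)
                return
            self._queues[entity_id] = deque([message])
        # First message for this entity: dedicate a pool task to it.
        self._pool.submit(self._drain, entity_id)

    def _drain(self, entity_id):
        while True:
            with self._lock:
                pending = self._queues[entity_id]
                if not pending:
                    # Queue exhausted: remove the id and free the thread.
                    del self._queues[entity_id]
                    return
                message = pending.popleft()
            # Run the handler outside the lock so entities do not block
            # each other while a message is being processed.
            self._handler(entity_id, message)

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

Because the emptiness check and the removal from the map happen under the same lock as `submit`, a message cannot be appended to a queue that is about to be discarded.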
While my requirements were different from yours, I did have to deal with the concurrent processing from a message queue. My solution was to have a service which would look at each incoming message and hand it off to an agent process to consume. The service has a setting which controls how many agents it can have running.
I would look at having n threads, each reading from its own thread-safe queue. I would then hash the EntityId to decide which queue to put an incoming message on.
Sometimes some threads will have nothing to do, but is this a problem if you have a few more threads than CPUs?
(Also, you may wish to group entities by type into the queues so as to reduce the number of locking conflicts in your database.)
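A minimal sketch of the hash-to-queue idea, in plain Python: n worker threads, one queue each, and a stable hash of the entity id choosing the queue. All function names here are made up, and `zlib.crc32` is just one convenient stable hash (Python's built-in `hash()` for strings varies between runs).

```python
import queue
import threading
import zlib

def queue_index(entity_id, n):
    """Map an entity id to one of n queues; the same id always maps
    to the same queue, so its messages stay in order."""
    return zlib.crc32(entity_id.encode("utf-8")) % n

def start_workers(n, handler):
    """Start n threads, each draining its own queue in FIFO order."""
    queues = [queue.Queue() for _ in range(n)]

    def worker(q):
        while True:
            item = q.get()
            if item is None:      # sentinel: stop this worker
                break
            handler(*item)        # item is (entity_id, message)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    return queues, threads

def dispatch(queues, entity_id, message):
    queues[queue_index(entity_id, len(queues))].put((entity_id, message))

def stop_workers(queues, threads):
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
```

Compared with a per-entity map, this needs no shared bookkeeping, at the cost of uneven load when several busy entities hash to the same queue.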
