Does Azure Queue Storage behave the same as MSMQ?

I have built a few applications that leverage MSMQ on a Windows Server. I would like to port these to Azure-based Web Applications, which would help me with some of the security and trust-barrier challenges I have faced in the past.
My question is: can I expect the message queue aspects of Windows Azure storage to behave the same as MSMQ?

In general, not really. While both are queueing solutions, Windows Azure queues do not offer the same guarantees as MSMQ (for good reason). For instance, WAQ supports an 'at least once' delivery model rather than 'exactly once'. Consuming a message is a two-step operation: you get the message, which hides it from other consumers for a visibility timeout, and then you explicitly delete it; if you never delete it, it reappears and is delivered again. WAQ also cannot take part in transactions the way MSMQ can. For cloud queuing scenarios, however, some of these considerations matter less. There are other nuances that differ as well (message TTL, invisibility, renewals, storage time, etc.).
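The get-then-delete flow and the 'at least once' behaviour it produces can be sketched with a small in-memory stand-in. This is not the Azure SDK: `VisibilityQueue`, `process_next`, and the injectable `clock` are hypothetical names used only for illustration, and the duplicate check by message id is one possible way to make a consumer idempotent.

```python
import time
import uuid


class VisibilityQueue:
    """In-memory stand-in (not the Azure SDK) for get-then-delete semantics."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._messages = {}  # message id -> [body, visible_at]

    def put(self, body):
        msg_id = str(uuid.uuid4())
        self._messages[msg_id] = [body, 0.0]  # visible immediately
        return msg_id

    def get(self, visibility_timeout=30.0):
        """Return (id, body) of a visible message and hide it, or None."""
        now = self._clock()
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + visibility_timeout  # hide, do not remove
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Second step: remove the message once it has been handled."""
        self._messages.pop(msg_id, None)


def process_next(queue, handler, seen_ids, visibility_timeout=30.0):
    """Handle one message, skipping ids already handled, then delete it."""
    item = queue.get(visibility_timeout)
    if item is None:
        return False
    msg_id, body = item
    if msg_id not in seen_ids:  # a redelivered message is not handled twice
        handler(body)
        seen_ids.add(msg_id)
    queue.delete(msg_id)
    return True
```

If a consumer crashes between `get` and `delete`, the message becomes visible again after the timeout, which is why handlers written against this kind of queue generally need to tolerate duplicates.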
It might be easier to answer more completely if you explain what features your queueing needs must have. Also, keep in mind that AppFabric has a queueing service that also might be appropriate depending on what you really need.

Related

CQRS and Event Sourcing Guide

I want to create a CQRS and Event Sourcing architecture that is very cheap, very flexible, and very uncomplicated.
I want to make sure that events never, ever fail to at least reach the publisher/event store, because that's where the business is.
Now, I have several options in mind:
Azure
With Azure, I am not sure what to use.
Azure Service Bus
Azure Functions
Azure WebJobs (I suppose this can be replaced with Azure Functions)
?? (something else I forgot or don't know?)
How reliable are these Azure serverless solutions?
Custom
For this I am thinking of using RabbitMQ; the problem is the cost of a virtual machine to run it.
All in all, I want:
Ability to replay the messages/events in case of failure.
Ability to easily add subscribers.
Ability to select the subscribers upon which to replay the messages.
The event store should be able to store very large event messages (or how else would I queue an image or file?).
The event store MUST NEVER EVER get choked, or sleep.
Speed of implementation/prototyping would be an added advantage.
What does your experience suggest?
What about other alternatives (e.g. Apache Kafka)?
Why not run Event Store? It was created by Greg Young himself, and you can host it wherever you need.
I am a Java user, and for the longest time I have used HornetQ (now Artemis, which I don't use) as an alternative to RabbitMQ. Its only problem is that it does not support replication, but it gets the job done when it comes to event sourcing. For your custom scenario, RabbitMQ is a good choice, but try running it on a DigitalOcean instance to keep costs low. If you are looking for simplicity and flexibility, you have only two choices: build your own, or forgo simplicity and pick up Apache Kafka, which brings complexity but gives you flexibility. You can also build an event store with MongoDB: https://www.mongodb.com/blog/post/event-sourcing-with-mongodb
Your requirements are too vague to make the optimal choice. You need to consider a lot of things, for instance the number of events per aggregate and the number of aggregates (these have to be statistical estimates). They matter primarily because if you allow tens of thousands of events per aggregate, you will need snapshotting, which adds complexity you might not otherwise need.
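To make the snapshotting trade-off concrete, here is a minimal sketch of rebuilding an aggregate from a snapshot plus the events recorded after it. The `Event` and `Snapshot` types, the balance payload, and the `every` threshold are all hypothetical, and the events are assumed to arrive ordered by version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    aggregate_id: str
    version: int
    amount: int  # hypothetical payload: a change to a balance


@dataclass(frozen=True)
class Snapshot:
    aggregate_id: str
    version: int
    balance: int


def load_balance(events, snapshot=None):
    """Rebuild state from the snapshot (if any) plus newer events."""
    balance = snapshot.balance if snapshot else 0
    version = snapshot.version if snapshot else 0
    for event in events:  # assumed to be ordered by version
        if event.version > version:  # skip events the snapshot already covers
            balance += event.amount
            version = event.version
    return balance, version


def maybe_snapshot(aggregate_id, balance, version, every=1000):
    """Return a snapshot on every `every`-th version, otherwise None."""
    if version > 0 and version % every == 0:
        return Snapshot(aggregate_id, version, balance)
    return None
```

With few events per aggregate, `load_balance(events)` alone is enough; the snapshot path only pays off once replaying the full history becomes slow.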
But for regular use cases you could just use a relational database like Postgres as your (linearizable) event store. It also has listen/notify functionality, so you would not really need a message bus either, and your application could be written in a reactive way.
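A relational event store can be very small. The sketch below uses SQLite from the Python standard library as a stand-in so that it runs without a server; it is not Postgres, and it does not show listen/notify, which SQLite lacks. The table layout and the names `open_store`, `append_event`, and `read_events` are assumptions made for illustration. The primary key on (aggregate_id, version) is what rejects two writers appending the same version.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open the store and create the append-only events table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               aggregate_id TEXT    NOT NULL,
               version      INTEGER NOT NULL,
               payload      TEXT    NOT NULL,
               PRIMARY KEY (aggregate_id, version)
           )"""
    )
    return conn


def append_event(conn, aggregate_id, expected_version, payload):
    """Append at expected_version + 1; return False if that version exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO events VALUES (?, ?, ?)",
                (aggregate_id, expected_version + 1, json.dumps(payload)),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # another writer already appended this version


def read_events(conn, aggregate_id, after_version=0):
    """Return (version, payload) pairs in order, for replay."""
    rows = conn.execute(
        "SELECT version, payload FROM events "
        "WHERE aggregate_id = ? AND version > ? ORDER BY version",
        (aggregate_id, after_version),
    )
    return [(version, json.loads(payload)) for version, payload in rows]
```

Replaying to a new subscriber is then a matter of calling `read_events` from whatever version that subscriber last processed.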

Azure ServiceBus vs ServiceRemoting, HTTP and WCF

The Service Fabric documentation recommends service remoting, ICommunicationClient or WcfCommunicationClient for communication between the microservices.
The ServiceBus, which I always used for inter-service communication, is not even mentioned. Why?
I think you misinterpreted the docs. It does not recommend any protocol or service (the word is not even present on the page). What it does do is list the built-in communication options and appropriate situations of when to use them.
There is nothing that prevents you from using Service Bus for inter-service communication. In fact, if you google around you will find some projects like this one
The ability to plug in any desired service or protocol is one of the great things about SF, but they leave the implementation to you.
There are many approaches to service-to-service communication; if they had to document all of them, they would spend more time writing about the possible approaches than building the actual communication.
They probably decided on the ones most closely related to the platform, but they could have written about any of them; it is just a matter of preference.
I could name a few from many just to have an idea:
Http
Remoting
WCF
Service Bus
Event Hub
AMQP
MQTT
gRPC + protobuf
TCP
UDP
Pipes
And many more. Imagine if they had to document all of them.
The communication is flexible enough to let you implement using any communication mechanism.
Regarding the ones you mentioned,
I always opt for HTTP because it is platform agnostic and widely implemented: whether it is .NET, Java, Node.js, Windows, or Linux, they all speak the same language. The others are tightly tied to .NET and Windows and force every other solution to be tied or adapted to them as well. There is also the fact that some options are synchronous and others, like Service Bus, are asynchronous.
Then, when performance is an issue, I evaluate the other options.

Can Azure EventHub be used for critical transactional data in production?

Reading the documentation, Azure EventHubs is meant for:
Application instrumentation
User experience or workflow processing
Internet of Things (IoT) scenarios
Can this be used for any transactional data, handling revenue or application sensitive data?
Based on what I read, it looks like it is meant for data where losing some of it is not a concern. Is this the case?
It is mainly designed for large-scale ingestion of data. That is why typical scenarios include IoT solutions, which consist of a multitude of devices sending large amounts of telemetry data.
To allow for this kind of scale, it does not include some features that other messaging services, like Azure Service Bus, do have. I think this blog does a good job of listing the differences. The Use Case section in particular explains things very well:
From a target use case perspective if we consider some of our typical enterprise integration patterns then if you are implementing a pattern which uses a Command Message, or a Request/Reply Message then you probably want to use Azure Service Bus Messaging.  RPC patterns can be implemented using Request/Reply messages on Azure Service Bus using a response queue.  These are really about ESB and EAI style messaging patterns where you want to send messages between applications and probably want to use other features such as property based routing.
Azure Event Hubs is more likely to be used if you’re implementing patterns with Event Messages and you want somewhere reliable to send them that is capable of dealing with a massive scale but will allow you to do stuff with the events out of process.
With these core target use cases in mind it is easy to see where the scale differences come into play.  For messaging it’s about one application telling one or more apps to DO SOMETHING or GIVE ME SOMETHING.  The alternative is that in eventing the applications are saying SOMETHING HAS HAPPENED.  When you consider this in typical application scenarios and you put events into the telemetry and logging space you can quickly see that the SOMETHING HAS HAPPENED scenario will produce a lot more traffic than the other.
Now I’m not saying that you can’t implement some messaging type functions using event hubs and that you can’t push events to a Service Bus topic as in integration there are always different requirements which result in different implementation scenarios, but I think if you follow the above as a general rule then you will usually be on the right path.
That does not mean, however, that it is only suitable for data whose loss you can tolerate. Data is stored for a configurable amount of time and, if necessary, can be read again from an earlier point in time.
Now, given your scenario, I do not think Event Hubs is the best fit. But truth be told, I am not sure, because you would have to elaborate on what exactly you want to do.
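The retention-and-replay idea can be illustrated with a toy append-only log. This is only a sketch of the concept, not the Event Hubs API: `RetainedLog`, its offsets, and the explicit `now` argument are hypothetical and exist to keep the example deterministic.

```python
from collections import deque


class RetainedLog:
    """Toy append-only log that keeps events for a retention window."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._events = deque()  # entries are (offset, timestamp, body)
        self._next_offset = 0

    def append(self, body, now):
        """Store an event and return the offset assigned to it."""
        offset = self._next_offset
        self._events.append((offset, now, body))
        self._next_offset += 1
        self._expire(now)
        return offset

    def _expire(self, now):
        # Drop events older than the retention window, oldest first.
        while self._events and now - self._events[0][1] > self.retention_seconds:
            self._events.popleft()

    def read_from(self, offset, now):
        """Return every retained (offset, body) at or after `offset`."""
        self._expire(now)
        return [(o, body) for o, _, body in self._events if o >= offset]
```

Reading does not remove anything, so a consumer can go back to an earlier offset and reprocess, as long as the events are still inside the retention window.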
Addition
The idea behind Event Hubs is that you will get at least once delivery at great scale. (Source). See also this question: Does Azure Event Hub guarantees at least once delivery?

Are there disadvantages of using large number of entities in Azure ServiceBus

In other words, if I create a messaging layout that uses a rather large number of messaging entities (several thousand) instead of a smaller number, is there something in Azure Service Bus that gets irritated by that and performs less than ideally, or generates significantly different costs? Let us assume that the number of messages remains roughly the same in both scenarios.
To be clear, I am not asking whether a messaging layout with many entities is sound from the application's point of view, but rather whether something in Azure performs badly in such situations. If there are advantages to it (perhaps Azure can scale it more easily), that would also be interesting.
I am aware of the 10000 entities limit in a single Service Bus namespace.
I think it is more a matter of programming and the architecture of the solution. For example, we saw problems with ACS (the authentication mechanism): SB sometimes started to throttle the client when there were many requests. Take a look at the guidance on SB high availability; it lists some issues to consider when you have a lot of load.
You also always have other options that may be more suitable for high-load scenarios, for example Azure Event Hubs, a more lightweight mechanism intended for extremely high message volumes.

Messaging bus + event storage + PubSub

I'm looking at building an application which has many data sources, each of which put events into my system. Events have a well defined data structure and could be encoded using JSON or XML.
I would like to be able to guarantee that events are saved persistently, and that the events are used as a part of a publish/subscribe bus with multiple subscribers possible per event.
For the database, availability is very important even as it scales to multiple nodes, and partition tolerance is important so that I can scale the number of places which can store my events. Eventual consistency is good enough for me.
I was thinking of using a JMS enterprise messaging bus (e.g. Mule) or an AMQP enterprise messaging bus (such as RabbitMQ or ZeroMQ).
But for my application, it seems that if I could set up a publish/subscribe system with CouchDB or something similar, it would solve my problem without having to integrate an enterprise messaging bus and a persistent storage system.
Which would work better: CouchDB + scaling + load balancing + some kind of PubSub mechanism, or an explicit PubSub messaging system with attached eventually consistent, available, partition-tolerant storage? Which one is easier to set up, administer, and operate? Which solution will have higher throughput for a given cost? Why?
Also, are there any more questions I should ask before selecting my technologies? (BTW, Java is the server-side and client-side language).
I am using a CouchDB message queue in production. (It is not pub/sub, so I do not consider this answer complete.)
Currently (June 2011), CouchDB has huge potential as a messaging substrate:
Good data persistence
Well-poised for clustering (on a LAN, using BigCouch or Lounge)
Well-poised for distribution (between data centers, world-wide)
Good platform. Despite the shortcomings listed below, I love CQS because I can re-use my DB and it works from Erlang, NodeJS, and every web browser.
The _changes query
Continuous feeds, instant delivery without polling
Network going down is no problem, just retry later from the previous position
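The "retry later from the previous position" behaviour can be sketched as a consumer that checkpoints the last sequence number it processed. This is an in-memory illustration, not the CouchDB client API: `ChangesFeed`, `since`, and `consume` are hypothetical names, and `ConnectionError` stands in for whatever a real network failure would raise.

```python
class ChangesFeed:
    """Hypothetical in-memory feed; each change gets an increasing seq."""

    def __init__(self):
        self._changes = []  # list of (seq, doc)

    def publish(self, doc):
        self._changes.append((len(self._changes) + 1, doc))

    def since(self, seq):
        """Return all changes with a sequence number greater than `seq`."""
        return [change for change in self._changes if change[0] > seq]


def consume(feed, handler, checkpoint):
    """Process changes after `checkpoint` and return the new checkpoint.

    If the handler hits a connection error, stop and keep the checkpoint
    at the last change that was processed successfully.
    """
    for seq, doc in feed.since(checkpoint):
        try:
            handler(doc)
        except ConnectionError:
            break  # retry later from the same position
        checkpoint = seq
    return checkpoint
```

Calling `consume` again with the returned checkpoint picks up exactly where the previous run stopped, so a failed change is retried rather than skipped.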
Still, even a low-volume message system in CouchDB requires careful planning and maintenance. CouchDB is potentially a great messaging server. (It is inspired by Lotus Notes, which handles high email volume.)
However, these are the challenges with CouchDB:
Append-only database files grow fast
Be mindful about disk capacity
Be mindful about disk i/o. Compaction will read and re-write all live documents
Deleted documents are not really deleted. They are marked _deleted: true and kept forever, even after compaction! This is in fact uniquely good about CouchDB, because the delete action will propagate through the cluster, even if the network goes down for a time.
Propagating (replicating) deletes is great, but what about the buildup of deleted docs? Eventually it will outstrip everything else. The solution is to purge them, which actually removes them from disk. Unfortunately, if you do 2 or more purges before querying a map/reduce view, the view will completely rebuild itself. That may take too much time, depending on your needs.
As usual, we hear NoSQL databases shouting "free lunch!", "free lunch!" while CouchDB says "you are going to have to work for this."
Unfortunately, unless you have compelling pressure to re-use CouchDB, I would use a dedicated messaging platform. (I had a good experience with ejabberd as a messaging platform and for communicating to/from Google App Engine.)
I think that the best solution would be CouchDB + Jabber/XMPP server (ejabberd) + book: http://professionalxmpp.com
JSON is the natural storing mechanism for CouchDB
Jabber/XMPP server includes pubsub support
The book is a must read
While you can use a database as an alternative to a message queueing system, no database is a message queueing system, not even CouchDB. A message queueing system such as an AMQP broker provides more than just message persistence. In fact, with RabbitMQ, persistence is just an invisible service under the hood that takes care of all the challenges you would have to deal with yourself on CouchDB.
Take a good look at the RabbitMQ website where there is lots of information about AMQP and how to make use of it. They have done a great job of collecting together articles and blogs about message queueing.

Resources