How to route Event Hub messages to different Azure functions based on their message type - azure

I have an Azure Event Hub over which I would like to send various types of messages. Each message should be handled by a separate Azure Function, based on its message type. What is the best way to accomplish this?
Actually, I could create some JSON container with a type and payload property and let one parent Azure Function dispatch all the messages payloads - based on their type - to other functions, but that feels a bit hacky.
This question basically asks the same thing; however, the answer there explains how to do it using IoT Hub and message routing. In the Event Hub configuration, though, I cannot find any setting to configure message routing.
Or should I switch to an Azure Message Queue to get this functionality?

I would use Azure Stream Analytics (ASA) to route the messages to the different Azure Functions. An ASA job lets you specify an Event Hub as a source and several sinks (outputs), and each sink can be a different Azure Function. You can read more about setting up Azure Stream Analytics through the Azure Portal here. You'll need to set up the Event Hub as your source (docs). You'll also need to set up your sinks (docs). You then write a query in an SQL-like language to route the messages to the various sinks. However, ASA is costly relative to other services since you're paying for a fixed amount of compute.
I put some pseudo code below. You'll have to adapt it based on how you configure your ASA job, using the information from the linked MS documentation.
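-- Route messages matching the first condition to the first output
SELECT
*
INTO
[YourOutputAlias]
FROM
[YourInputAlias]
WHERE
[CONDITION]
-- Route messages matching a different condition to a second output
SELECT
*
INTO
[YourAlternateOutputAlias]
FROM
[YourInputAlias]
WHERE
[CONDITION]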

Based on your additional info about the business requirements and assuming that the event size < 64KB (1MB in preview), the following screen snippet shows an example of your solution:
The above solution is based on pushing a batch of events to the Event Domain endpoint of Azure Event Grid (AEG). The Event Hub trigger function is responsible for mapping each event's message type in the batch to a domain topic before publishing it to the AEG.
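The mapping step could be sketched roughly as follows. This is only an illustration, not the code behind the screen snippet: it assumes each Event Hub message body is JSON with a `type` and a `payload` property (as suggested in the question), and the function name `to_domain_events`, the `default_topic` value, and the subject format are made up for the example. The resulting list would then be posted to the Event Domain endpoint.

```python
import json
import uuid
from datetime import datetime, timezone


def to_domain_events(batch, default_topic="unknown"):
    """Map raw Event Hub message bodies to Event Grid-schema events.

    Each outgoing event carries a `topic` naming the domain topic it should
    land on. Here the topic is simply the message's `type` field, which is an
    assumption about the message format.
    """
    events = []
    for body in batch:
        message = json.loads(body)
        # Messages without a type fall back to a catch-all topic.
        message_type = message.get("type") or default_topic
        events.append({
            "id": str(uuid.uuid4()),
            "topic": message_type,
            "subject": f"messages/{message_type}",
            "eventType": message_type,
            "eventTime": datetime.now(timezone.utc).isoformat(),
            "dataVersion": "1.0",
            "data": message.get("payload"),
        })
    return events
```

Subscribers can then attach to individual domain topics, so each message type ends up at its own handler without the publisher knowing about them.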
Note that if you use Azure IoT Hub for ingesting the events, the AEG can be integrated directly with the IoT Hub, and each event message can be distributed in a loosely coupled Pub/Sub manner. Besides that, for these business requirements the B1 scale tier of IoT Hub ($10/month) can be used, compared to the Basic tier of Event Hubs ($11.16).
The IoT Hub has a built-in message routing mechanism (with some limitations), but a recent feature of the IoT Hub/AEG integration, publishing device telemetry messages, gives good support for a serverless architecture.

I ended up using Azure Durable Functions using the Fan Out/Fan In pattern.
In this approach, all events are handled by a single Orchestrator Function which in fact is a Durable Azure Function (F1). This deserializes incoming JSON to the correct DTO. Based on the content of the DTO, a corresponding activity function (F2) is invoked which processes it.
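A minimal sketch of the deserialize-and-dispatch step, written as plain Python rather than actual Durable Functions code. The DTO classes, the `type`/`payload` envelope, and the activity names are all hypothetical; in a real orchestrator the returned activity name and DTO would be handed to the framework's activity call.

```python
import json
from dataclasses import dataclass


@dataclass
class OrderCreated:
    order_id: str
    amount: float


@dataclass
class DeviceReading:
    device_id: str
    temperature: float


# Message type -> (DTO class, activity function name). All names are hypothetical.
ROUTES = {
    "order_created": (OrderCreated, "ProcessOrderCreated"),
    "device_reading": (DeviceReading, "ProcessDeviceReading"),
}


def plan_activity(raw_json):
    """Deserialize an incoming message and pick the activity that handles it."""
    envelope = json.loads(raw_json)
    try:
        dto_class, activity_name = ROUTES[envelope["type"]]
    except KeyError:
        # Covers both a missing "type" field and an unregistered type.
        raise ValueError(f"unsupported message type: {envelope.get('type')!r}")
    return activity_name, dto_class(**envelope["payload"])
```

Keeping the routing table in one place means adding a new message type only requires a new DTO, a new activity function, and one extra entry in `ROUTES`.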

Related

EventGrid vs EventHub

I am working on a Service Fabric application and want to publish a few events from this application and subscribe to or process those published events in another application.
I have tried Event Grid and observed that there is a delay between publishing and processing the events. So now I am looking for alternatives such as Event Hub or queues.
If anyone has already used Event Grid, Event Hub, or queues, please suggest which one gives better performance when dealing with a large number of events.
Design Approach
We have migrated the tables from SQL service to Service Fabric. There is a view in SQL Service, and we are planning to implement that as a service in service fabric.
The implementation logic follows below.
Table 1 is implemented as a service, and we publish an event for each CRUD operation to Event Grid/Event Hub.
Table 2 is implemented as a service, and we publish an event for each CRUD operation to Event Grid/Event Hub.
We have created a view service that listens for events; when any event is sent to Event Grid/Event Hub, it performs the required calculations and stores the result in the ViewService (it is a background job).
We are looking for a messaging service which gives better performance.
Have you seen this comparison and this one?
Anyway, can you clarify your requirements in terms of throughput and performance? It depends on a lot of factors including, but not limited to, the message size and the amount of messages.
Having used both Event Grid and Event Hub, I'd say Event Hub works very well for many messages per second, say data streams from IoT devices, but the performance of the downstream processing can be a bottleneck. You have to process them very fast in order to receive new events. Then there are partitions and consumer groups, which can help balance the load and let different processors work on the same data with different views of the data stream (for example, a fast processor for live display of sensor data and a slower one that stores the data for later analysis).
If you're talking about a few events generated by an application that trigger other apps to start doing some work based on those events, Event Grid is a good fit. I haven't experienced much delay in receiving those events.
But bottom line, I think all services (Event Grid, Event Hub, Service Bus etc) support different use cases and that should be your first decision point.
Can you describe your publisher, subscriber, etc. and show your metrics of the Azure Event Grid usage?
You can use the portal screen snippets on the topic (publisher) and subscription (subscriber).
The following screen snippets are from my tester, where a few events were fired manually.
Publisher side:
Subscriber side:
Metrics on the portal:
As you can see, the delivery destination processing time is ~1ms. The latency time on the publisher side (custom topic) is between 2-4ms.
Note that the AEG is a PUSH->PUSH-ACK or PUSH->PULL-ACK loosely coupled Pub/Sub eventing model, unlike the Event Hub model, which is based on the PUSH->PULL mechanism. In other words, with Event Hub you need to host a listener and receiver that pull events from the partition.

How to consume events delivered by Azure Event Grid to GCP

Basically, what I understood from a few Azure topics is as below:
Azure Event Hub - where data is received initially and converted into events
Service Bus- acting as a queue
Azure Event Grid - where the events converted in the hub are transferred to
so the connection is like below:
Hub -> Service Bus -> Event Grid -> Pub Sub -> Storage
I understood this concept. My problem is I want data to be pushed from the event grid to GCP (subscription / topics). My questions are:
How can I establish this using PUSH method?
What do I need to develop exactly?
How can I push things from grid to pubsub/subscriptions?
I found this link where data is getting published into Event Grid, but I want to push data from the event grid to GCP. Can anybody explain where I am going wrong or what exactly I should start with? I am new to this and it's very confusing, so I just need a little bit of guidance here.
I have the following doubts:
Is there any direct subscriber option available with the event grid listener? I mean, can I directly link my Google storage account with this listener so that whenever an event is triggered it is pushed directly to my GCP account? (I don't have an Azure account right now because an access issue is still being resolved, so I can't check myself; that's why I am asking here.)
Suppose I have 20 columns in my data but I want only 16 columns to be pushed to GCP. Is there any customization possible while sending data from event grid/event hub to pub/sub?
If I write custom connector code as per the links provided in the answers below, how can I run it? I mean, where can I deploy those scripts in the cloud so that they are triggered automatically whenever an event is triggered?
Can I implement webhooks in this scenario (as an alternative to connectors)? If yes, how can I do it and on which side do I need to create it?
Also, I read some articles and learned from a few people that they experienced data loss in this entire process. How likely is that here, and how can it be avoided?
Can anybody explain me where am I going wrong or what exactly should I start with.
It's right here:
so the connection is like below:
Hub -> Service Bus -> Event Grid -> Pub Sub -> Storage
Although this might be the case, it sounds very much as if you're looking at one (very) specific scenario where data flows in this exact way.
Azure Event Hub, Azure Service Bus and Azure Event Grid can work together, but can also be used completely separate from each other.
Event Grid
The purpose of Event Grid is to enable Reactive programming. Use this when you want to react to (status) changes.
Event Hubs
Event Hubs facilitate a big data pipeline. Use this when you need telemetry and distributed data streaming.
Service Bus
The purpose of Service bus is to enable High-value enterprise messaging. Use this when you want to do something like Order processing and financial transactions.
In some cases, you use the services side by side to fulfill distinct roles. For example, an ecommerce site can use Service Bus to process the order, Event Hubs to capture site telemetry, and Event Grid to respond to events like an item was shipped.
In other cases, you link them together to form an event and data pipeline. You use Event Grid to respond to events in the other services. For an example of using Event Grid with Event Hubs to migrate data to a data warehouse, see Stream big data into a data warehouse.
Taken from the very interesting and important documentation article Choose between Azure messaging services - Event Grid, Event Hubs, and Service Bus
EDIT
My problem is I want data to be pushed from event grid to GCP (subscription / topics). So how can I establish this using PUSH method??
Possibly the simplest solution is to have an Event Grid Event trigger a webhook (which might run an Azure Function or a Google Cloud Function) which in turn puts the event/message on the GCP Topic.
Publishing messages is quite well documented. There are examples of how to do so with a REST call, the command line, C#, Go, Java, Node.js, PHP, Python and Ruby.
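As a rough sketch of what the webhook could do before calling the Pub/Sub REST `publish` method: the request body is a `messages` list in which each message carries base64-encoded `data` and optional string `attributes`. The function name `build_pubsub_publish_body` and the `keep_fields` parameter are made up for this example; it assumes the event's `data` is a JSON object, and the field filter illustrates how only some columns (say 16 of 20) could be forwarded.

```python
import base64
import json


def build_pubsub_publish_body(event, keep_fields=None):
    """Build the JSON body for a Pub/Sub REST publish call from one event.

    `event` is assumed to be an Event Grid event whose `data` is a JSON
    object. `keep_fields` optionally restricts which data fields are sent.
    """
    data = event.get("data", {})
    if keep_fields is not None:
        # Drop every field that is not explicitly requested.
        data = {key: value for key, value in data.items() if key in keep_fields}
    # Pub/Sub expects the message payload as a base64-encoded string.
    encoded = base64.b64encode(json.dumps(data).encode("utf-8")).decode("ascii")
    return {
        "messages": [
            {
                "data": encoded,
                # Attribute values must be strings.
                "attributes": {"eventType": str(event.get("eventType", ""))},
            }
        ]
    }
```

The returned dictionary would be sent as the JSON body of an authenticated POST to the topic's `:publish` URL; authentication and the HTTP call itself are left out here.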
EDIT 2
What you need to do is create an Event Grid Subscription to listen to and handle Event Grid Events.
Here's an example screenshot on how to listen for events for a specific Storage Account and call a WebHook whenever such an event occurs:
Pay attention to the "Endpoint Details": that's where you can specify to, for instance, call a webhook every time an event is triggered.
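One detail to be aware of when pointing a subscription at your own webhook: Event Grid first sends a subscription validation event, and the endpoint has to echo the validation code back before regular events are delivered. A minimal, framework-free sketch of that handling is shown below; `handle_event_grid_post` and the `forward` callback are hypothetical names, and the function returns a status code and response body for whatever HTTP host you deploy it in.

```python
import json

VALIDATION_EVENT = "Microsoft.EventGrid.SubscriptionValidationEvent"


def handle_event_grid_post(request_body, forward):
    """Handle one webhook POST from Event Grid.

    `request_body` is the raw JSON array of events. `forward` is a callback
    (hypothetical) that pushes a single event on to its destination, for
    example a GCP topic.
    """
    events = json.loads(request_body)
    for event in events:
        if event.get("eventType") == VALIDATION_EVENT:
            # Echo the code back so Event Grid accepts this endpoint.
            code = event["data"]["validationCode"]
            return 200, json.dumps({"validationResponse": code})
    for event in events:
        forward(event)
    return 200, ""
```

Returning a non-2xx status instead would make Event Grid retry the delivery, which can help reduce data loss when the downstream publish fails.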
The easiest way to transfer the EventHub generated events would probably be to create an EventHub event receiver in Node.js (which you mentioned in your comments) as described here, which receives events and publishes them to Cloud Pub/Sub directly, as described in the Cloud Pub/Sub publisher documentation for Node.js.

Route and transform data from Azure IoT Hub

In our use case, we receive messages on Azure IoT Hub and would like to route the data to different Event Hubs or Service Bus topics.
IoT Hub routes and endpoints are not an option, because the data is binary (protobuf) and only 10 different endpoints are possible (we need more).
Our requirements are:
Splitting the message
Transform the data (maybe json)
Routing to different endpoints based on the payload (different parts of message could be routed to different endpoints)
(optional) enrich the data with additional payload
I see different options:
Azure Stream Analytics
Azure Functions
Spark or Flink
Do it yourself (write an application and run it in Service Fabric or Kubernetes)
Which technology would you recommend?
Regards,
Markus
There is also another option for your scenario: using Azure Event Grid. In this case, the telemetry data from the Azure IoT Hub is pushed to the Event Grid via its custom topic endpoint. Note that there is a size limit of 64KB for an event message; see more details here.
The Event Grid allows an unlimited number of Event Hubs to subscribe; more details about the Event Grid are here and here.
Based on the above, the following screen snippet shows another option for routing small telemetry data to more than 10 Event Hubs, or basically to any kind of subscriber.

Azure Service Fabric routing

I would like to get some recommendations for designing the routing of IoT messages in Azure.
I have the following scenario:
Sensors send messages to Azure IoT Hub in Google Protobuf format. Depending on the type of a message, I want to route the message to different applications inside a Service Fabric cluster.
My current approach is to use a Service Fabric application to receive all messages from the IoT Hub, parse the protobuf message, and send each message, depending on its type (an attribute inside the protobuf), to a type-specific Azure event hub. The applications then fetch the messages from their "own" event hub and process them.
I'm not sure if this is the best approach. I don't like having one event hub for each type of message. Service Bus Topics are probably not an option, because I have a lot of messages (~30k per second).
Do I really need an event hub to decouple this process, or does it make sense to send the messages from the "routing application" directly to the different "type applications"?
What do you think?
Regards,
Markus
If you really need high performance you should take a look at IoT Hub and Event Hubs. Azure Event Hubs is a highly scalable data streaming platform and event ingestion service capable of receiving and processing millions of events per second. Event Hubs can process and store events, data, or telemetry produced by distributed software and devices. Data sent to an event hub can be transformed and stored using any real-time analytics provider or batching/storage adapters.
On the other hand, if you only need 30k messages per second, you can go with Premium Messaging.
Comparison of Azure IoT Hub and Azure Event Hubs
Premium Messaging: How fast is it?
What is Event Hubs?

Which Azure service to use for processing data from Event Hub?

I would appreciate some help picking out the best suited Azure services for my scenario - I am just beginning with Azure services and my knowledge is pretty limited.
I have data from multiple sources, and of different shapes, coming into an Event Hub. I need to subscribe to the events from the Event Hub and, based on their format, process them and ultimately save them into an SQL Database. All components - events consumers, the SQL Database - need to be hosted in the cloud.
How would I implement this in an "Azure Orientated Architecture"?
In an off cloud application, I would have competing consumers subscribing to the Event Hub. They would be some console applications or Windows services, and each would be processing the events asynchronously (this is further simplified by the event processing being idempotent).
Ideally, the Azure equivalent of the above consumers would scale up and down automatically, so I would like to not have to use VMs that host console applications (where I would need to keep an eye on the VM's resources myself). Scaling and deployment wise they would have to behave like App Services, however I'm under the impression that those are just for web applications. I've also briefly looked at Web Jobs, but those seem to be polling data at various intervals, whereas I need a proper event subscriber that the Event Hub pushes data into.
Any help will be greatly appreciated!
Thank you.
Later Edit:
I've looked into Web Jobs and they do allow continuous processing, hence it looks like they can be used as automatically scaling subscribers.
Ideally I would like to write the code for the subscribers in F#. C# is the other option if that is not available.
You can see my post regarding IoT Hub. It's basically the same for Event Hub.
(each of the examples in the post can be used on Event Hubs).
https://stackoverflow.com/a/38682324/6659347
In addition, for Event Hub you can also use an Azure Function with an Event Hub trigger: a function that runs whenever an event hub receives a new event. It will also meet your scaling requirement.
Make sure that, if you are working with multiple consumers, you make use of Event Hub consumer groups so each consumer can read the stream independently.
I'd say use a WebJob in combination with an EventProcessor. I wrote some demo code that can easily be transferred to a WebJob: https://github.com/DeHeerSoftware/SemanticLogging.EventHub/tree/master/SemanticLogging.EventHub.Processor
See https://azure.microsoft.com/en-us/documentation/articles/event-hubs-csharp-ephcs-getstarted/#receive-messages-with-eventprocessorhost for official documentation.
I've created a WebJob myself using this approach. Works like a charm.

Resources