Pipe Events from Azure Event Hub to Azure Service Bus

I am listening to an Event Hub for various events.
Each event is high value and cannot be missed.
Events are partitioned by device ID.
Events from a single device ID are sparse and not very frequent (a couple of events every few days); they only occur in response to infrequent user actions.
The number of devices is huge, so I will have a lot of events across a wide variety of device IDs.
For each event, I need to make 3-4 API calls to systems which are not very reliable, and since some of these are cross-geo calls they can take some time.
I am planning to take the events from Event Hub and put them into Service Bus. My reasons are as follows.
Event Hub can be scaled to only 32 partitions, and if one event takes time to process, the entire partition is blocked.
Service Bus, on the other hand, is more horizontally scalable: if throughput drops I can just add more consumers to the Service Bus.
I have been looking for patterns like this, but I have not seen patterns where data is taken from a log-based messaging system and pushed into a queue-based one.
Is there a better approach to handle such scenarios?

I think you can use an Event Hub trigger and a Service Bus output binding to achieve what you want.
For example, to monitor Event Hub 'test' using the C# library:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
namespace FunctionApp68
{
    public static class Function1
    {
        [FunctionName("Function1")]
        [return: ServiceBus("test1", Connection = "ServiceBusConnection")]
        public static string Run([EventHubTrigger("test", Connection = "str")] EventData[] events, ILogger log)
        {
            var exceptions = new List<Exception>();
            string combinedMessageBody = "";

            foreach (EventData eventData in events)
            {
                try
                {
                    string messageBody = Encoding.UTF8.GetString(eventData.Body.Array, eventData.Body.Offset, eventData.Body.Count);
                    combinedMessageBody += messageBody;

                    // Replace this with your processing logic.
                    log.LogInformation($"C# Event Hub trigger function processed a message: {messageBody}");
                }
                catch (Exception e)
                {
                    // We need to keep processing the rest of the batch - capture this exception and continue.
                    // Also, consider capturing details of the message that failed processing so it can be processed again later.
                    exceptions.Add(e);
                }
            }

            // Once processing of the batch is complete, if any messages in the batch failed processing,
            // throw an exception so that there is a record of the failure.
            if (exceptions.Count > 1)
                throw new AggregateException(exceptions);
            if (exceptions.Count == 1)
                throw exceptions.Single();

            // The returned string is sent to Service Bus queue 'test1' via the output binding.
            return combinedMessageBody;
        }
    }
}
The above code collects events from Event Hub 'test' and sends them to the Service Bus queue 'test1'.
Have a look at these docs:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-event-hubs-trigger?tabs=csharp
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-service-bus-output?tabs=csharp#example

What you actually need is a private queue per device ID. As soon as an event arrives at Event Hub, pull it and put it into that device ID's private queue, then process that queue serially.
How to build a queue per device ID:
A simple way to build such a queue is to use a SQL database (this mostly works if requests per second are not very high; around 100 requests/second is normal for SQL DB). A minimal sketch of this option follows below.
Another, horizontally scalable way is to use Azure append blobs (if your event processors are stateless).
You can also use a more advanced approach such as an Azure Service Fabric reliable queue.
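As a rough illustration of the SQL-database option, here is a minimal sketch of a per-device queue backed by a single table. The table name (DeviceEvents), its columns, and the connection string are assumptions made for this example, not something from the original answer: enqueue each event as it arrives from Event Hub, then have a worker drain each device's rows in order.

using System;
using Microsoft.Data.SqlClient;

// Assumed schema:
// CREATE TABLE DeviceEvents (Id BIGINT IDENTITY PRIMARY KEY, DeviceId NVARCHAR(64),
//                            Payload NVARCHAR(MAX), ProcessedUtc DATETIME2 NULL)
public class SqlDeviceQueue
{
    private readonly string _connectionString;
    public SqlDeviceQueue(string connectionString) => _connectionString = connectionString;

    // Append an event to the device's queue.
    public void Enqueue(string deviceId, string payload)
    {
        using var conn = new SqlConnection(_connectionString);
        conn.Open();
        using var cmd = new SqlCommand(
            "INSERT INTO DeviceEvents (DeviceId, Payload) VALUES (@d, @p)", conn);
        cmd.Parameters.AddWithValue("@d", deviceId);
        cmd.Parameters.AddWithValue("@p", payload);
        cmd.ExecuteNonQuery();
    }

    // Read the oldest unprocessed event for a device so events are handled serially per device.
    public (long Id, string Payload)? PeekOldest(string deviceId)
    {
        using var conn = new SqlConnection(_connectionString);
        conn.Open();
        using var cmd = new SqlCommand(
            "SELECT TOP 1 Id, Payload FROM DeviceEvents " +
            "WHERE DeviceId = @d AND ProcessedUtc IS NULL ORDER BY Id", conn);
        cmd.Parameters.AddWithValue("@d", deviceId);
        using var reader = cmd.ExecuteReader();
        if (!reader.Read()) return null;
        return (reader.GetInt64(0), reader.GetString(1));
    }

    // Mark an event as done so the next one for that device can be picked up.
    public void MarkProcessed(long id)
    {
        using var conn = new SqlConnection(_connectionString);
        conn.Open();
        using var cmd = new SqlCommand(
            "UPDATE DeviceEvents SET ProcessedUtc = SYSUTCDATETIME() WHERE Id = @id", conn);
        cmd.Parameters.AddWithValue("@id", id);
        cmd.ExecuteNonQuery();
    }
}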

Related

Read messages from Azure Service Bus Queue BUT NOT real time

I have a specific business case where I need to read an Azure Service Bus queue, but reading this queue should not be real time.
This is my setup:
I have an Azure Function that does some work, and part of that processing is to read some messages from a Service Bus queue at the end of the day. This function is timer triggered.
I have a Service Bus topic which auto-forwards messages to a Service Bus queue. This happens in real time, so over a 7-hour working period the messages pile up in this queue (on average about 20 messages per day).
At the end of the day the function then reads the messages (about 20) in the Service Bus queue (not in real time) and produces a report.
All the code snippets I've seen online are triggered in real time, as they all register an event that fires as soon as a message is sent to the queue.
I had this code snippet in my application but noticed that as soon as a message is added to the queue it is pulled immediately, which I don't want. I want the messages to remain in the queue until the end of the day.
public async Task<IEnumerable<ChangeNotification>> ReadChangeNotificationMessagesAsync()
{
    processor = client.CreateProcessor(serviceBusOptions.TopicName, serviceBusOptions.SubscriptionName, serviceBusProcessorOptions);
    processor.ProcessMessageAsync += AddNotificationToQueueEventAsync;
    processor.ProcessErrorAsync += ProcessErrorEventAsync;
    await processor.StartProcessingAsync();
}

private async Task AddNotificationToQueueEventAsync(ProcessMessageEventArgs args)
{
    var changeNotification = args.Message.Body.ToObjectFromJson<ChangeNotification>(
        new JsonSerializerOptions { PropertyNameCaseInsensitive = true });
    // do some stuff
}

private Task ProcessErrorEventAsync(ProcessErrorEventArgs arg)
{
    // log error
    return Task.CompletedTask;
}

serviceBusProcessorOptions = new ServiceBusProcessorOptions
{
    MaxConcurrentCalls = serviceBusOptions.Value.MaxConcurrentCalls,
    AutoCompleteMessages = serviceBusOptions.Value.AutoCompleteMessages
};
Can someone provide a code snippet that will allow me to read the queue, but not in real time?
You can use a timer-triggered Azure Function and schedule it to run once a day. In your function code you can use the Service Bus SDK to read messages from the Service Bus and process them.
UPDATE
I noticed that you are using ServiceBusProcessor to process the messages, which provides an event-based model for message processing.
Instead of using that, you can simply use a ServiceBusReceiver and read messages manually using ReceiveMessagesAsync(Int32, Nullable<TimeSpan>, CancellationToken).
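For example, here is a minimal sketch of that approach, using the same Azure.Messaging.ServiceBus client that your processor code already uses. The queue name ("reportqueue"), the 6 PM daily schedule, and the batch size are placeholders for illustration, not values from your setup.

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class DailyQueueReader
{
    [FunctionName("DailyQueueReader")]
    public static async Task Run(
        [TimerTrigger("0 0 18 * * *")] TimerInfo timer, // once a day at 18:00 (assumed schedule)
        ILogger log)
    {
        string connectionString = Environment.GetEnvironmentVariable("ServiceBusConnection");
        await using var client = new ServiceBusClient(connectionString);
        ServiceBusReceiver receiver = client.CreateReceiver("reportqueue"); // assumed queue name

        while (true)
        {
            // Pull a batch; the short wait time lets the loop exit once the queue is drained.
            IReadOnlyList<ServiceBusReceivedMessage> messages = await receiver.ReceiveMessagesAsync(
                maxMessages: 50, maxWaitTime: TimeSpan.FromSeconds(5));
            if (messages.Count == 0)
                break;

            foreach (ServiceBusReceivedMessage message in messages)
            {
                log.LogInformation("Processing message {id}", message.MessageId);
                // ... add message.Body to the daily report here ...
                await receiver.CompleteMessageAsync(message);
            }
        }
    }
}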

Sequence processing with Azure Function & Service Bus

I have an issue with the Azure Function Service Bus trigger.
The issue is that the Azure Function does not wait for one message to finish before processing a new one. It processes messages in parallel and does not wait 5 seconds before getting the next message, but I need it to process them sequentially.
How can I do that?
[FunctionName("HttpStartSingle")]
public static void Run(
[ServiceBusTrigger("MyServiceBusQueue", Connection = "Connection")]string myQueueItem,
[OrchestrationClient] DurableOrchestrationClient starter,
ILogger log)
{
Console.WriteLine($"MessageId={myQueueItem}");
Thread.Sleep(5000);
}
I resolved my problem by using this config in my host.json
{
  "version": "2.0",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxConcurrentCalls": 1
      }
    }
  }
}
There are two approaches you can take to accomplish this:
(1) You are looking for Durable Functions with function chaining. For background jobs you often need to ensure that only one instance of a particular orchestrator runs at a time. This can be done in Durable Functions by assigning a specific instance ID to an orchestrator when creating it (a minimal sketch follows after this list).
(2) Based on the messages that you are writing to the queue, you can partition the data; that automatically handles the ordering of messages so you do not need to handle it manually in the Azure Function.
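For approach (1), here is a minimal sketch of the singleton pattern the quote describes, written against the same [OrchestrationClient]/DurableOrchestrationClient API your trigger code already uses. The orchestrator name ("ProcessMessageOrchestrator") and the fixed instance ID are assumptions for illustration.

using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class QueueOrchestrationStarter
{
    [FunctionName("QueueOrchestrationStarter")]
    public static async Task Run(
        [ServiceBusTrigger("MyServiceBusQueue", Connection = "Connection")] string myQueueItem,
        [OrchestrationClient] DurableOrchestrationClient starter,
        ILogger log)
    {
        // A fixed instance ID means at most one orchestration processes messages at a time.
        const string instanceId = "single-message-processor";

        var existing = await starter.GetStatusAsync(instanceId);
        if (existing == null
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Completed
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Failed
            || existing.RuntimeStatus == OrchestrationRuntimeStatus.Terminated)
        {
            await starter.StartNewAsync("ProcessMessageOrchestrator", instanceId, myQueueItem);
            log.LogInformation($"Started orchestration {instanceId}");
        }
        else
        {
            // An instance is already running; in a real solution you would defer or
            // re-enqueue the message here rather than dropping it.
            log.LogInformation($"Orchestration {instanceId} is already running.");
        }
    }
}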
In general, ordered messaging is not something I'd strive to implement, since the order can, and at some point will, be disrupted. That said, in some scenarios it is required. For that, you should either use a Durable Function to orchestrate your messages or use Service Bus message sessions.
Azure Functions has recently added support for ordered message delivery (emphasis on the delivery part, as processing can still fail). It's almost the same as a normal Function, with the slight change that you need to instruct the SDK to utilize sessions.
public async Task Run(
    [ServiceBusTrigger("queue",
        Connection = "ServiceBusConnectionString",
        IsSessionsEnabled = true)] Message message, // Enable sessions
    ILogger log)
{
    log.LogInformation($"C# ServiceBus queue trigger function processed message: {Encoding.UTF8.GetString(message.Body)}");
    await _cosmosDbClient.Save(...);
}
Here's a post with more details.
Warning: using sessions will require messages to be sent with a session ID, potentially requiring a change on the sending side.
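As a minimal illustration of that sending-side change, here is a sketch using the Azure.Messaging.ServiceBus client (with the older Microsoft.Azure.ServiceBus library the property on Message is likewise called SessionId). The queue name and the use of a device/group ID as the session key are assumptions for illustration.

using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

public static class SessionSender
{
    public static async Task SendAsync(string connectionString, string deviceId, string payload)
    {
        await using var client = new ServiceBusClient(connectionString);
        ServiceBusSender sender = client.CreateSender("queue"); // must be a session-enabled queue

        var message = new ServiceBusMessage(payload)
        {
            // Messages sharing a SessionId are delivered in order to a single session receiver.
            SessionId = deviceId
        };

        await sender.SendMessageAsync(message);
    }
}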

Process Azure IoT hub events from a single device only

I'm trying to solve for having thousands of IoT devices deployed, all logging events to Azure IoT Hub, and then being able to read the events created by a single device ID only.
I have been playing with EventProcessorHost to get something like this working, but so far I can only see a way to read all messages from all devices.
It's not feasible to read all the messages and filter client-side, as there may be millions of messages.
The major purpose of Azure IoT Hub is the ingestion of mass events from devices into a cloud stream pipeline for real-time analysis. The default telemetry path (hot path) goes via a built-in Event Hub, where all events are temporarily stored in the Event Hub partitions.
Besides that default endpoint (events), there is also the capability to route event messages to custom endpoints based on rules (conditions).
Note that the number of custom endpoints is limited to 10 and the number of rules to 100. If this limit fits your business model, you can very easily stream 10 devices individually, as described in Davis' answer.
However, splitting the telemetry stream pipeline by source (device) beyond this limit (10+1) requires additional Azure components.
One solution for splitting the telemetry stream pipeline per device is a Pub/Sub push model.
That solution is based on forwarding the stream events to Azure Event Grid using a custom topic publisher; the event schema used by Event Grid is described in the Event Grid documentation.
The custom topic publisher for Event Grid is an Azure Function with an EventHubTrigger, where each stream event is mapped to an Event Grid event message whose subject identifies the registered device.
Azure Event Grid is a loosely coupled Pub/Sub model, where events are delivered to subscribers based on their subscriptions. In other words, if no subscription matches, the event message is dropped.
Note that Event Grid can route 10 million events per second per region, and the number of subscriptions is limited to 1,000 per region.
Using the REST API, subscriptions can be dynamically created, updated, deleted, etc.
The following code snippet shows an example of the Azure Function implementation that maps a stream event to an Event Grid event message. As you can see, it is a very straightforward implementation:
run.csx:
#r "Newtonsoft.Json"
#r "Microsoft.ServiceBus"
using System.Configuration;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.EventGrid.Models;
using Microsoft.ServiceBus.Messaging;
using Newtonsoft.Json;
// reusable client proxy
static HttpClient client = HttpClientHelper.Client(ConfigurationManager.AppSettings["TopicEndpointEventGrid"], ConfigurationManager.AppSettings["aeg-sas-key"]);
// AF
public static async Task Run(EventData ed, TraceWriter log)
{
log.Info($"C# Event Hub trigger function processed a message:{ed.SequenceNumber}");
// fire EventGrid Custom Topic
var egevent = new EventGridEvent()
{
Id = ed.SequenceNumber.ToString(),
Subject = $"/iothub/events/{ed.SystemProperties["iothub-message-source"] ?? "?"}/{ed.SystemProperties["iothub-connection-device-id"] ?? "?"}",
EventType = "telemetryDataInserted",
EventTime = ed.EnqueuedTimeUtc,
Data = new
{
sysproperties = ed.SystemProperties,
properties = ed.Properties,
body = JsonConvert.DeserializeObject(Encoding.UTF8.GetString(ed.GetBytes()))
}
};
await client.PostAsJsonAsync("", new[] { egevent });
}
// helper
class HttpClientHelper
{
public static HttpClient Client(string address, string key)
{
var client = new HttpClient() { BaseAddress = new Uri(address) };
client.DefaultRequestHeaders.Add("aeg-sas-key", key);
return client;
}
}
function.json:
{
  "bindings": [
    {
      "type": "eventHubTrigger",
      "name": "ed",
      "direction": "in",
      "path": "<yourEventHubName>",
      "connection": "<yourIoTHUB>",
      "consumerGroup": "<yourGroup>",
      "cardinality": "many"
    }
  ],
  "disabled": false
}
project.json:
{
  "frameworks": {
    "net46": {
      "dependencies": {
        "Microsoft.Azure.EventGrid": "1.1.0-preview"
      }
    }
  }
}
Finally, the Azure Function subscriber for Device1 receives the corresponding Event Grid event message.
If you're ok with Java/Scala, this example shows how to create a client and filter messages by device Id:
https://github.com/Azure/toketi-iothubreact/blob/master/samples-scala/src/main/scala/A_APIUSage/Demo.scala#L266
The underlying client reads all the messages from the hub though.
You could also consider using IoT Hub message routing, more info here:
https://azure.microsoft.com/blog/azure-iot-hub-message-routing-enhances-device-telemetry-and-optimizes-iot-infrastructure-resources
https://azure.microsoft.com/blog/iot-hub-message-routing-now-with-routing-on-message-body

Setup webjob ServiceBusTriggers or queue names at runtime (without hard-coded attributes)?

Is there any way to configure triggers without attributes? I cannot know the queue names ahead of time.
Let me explain my scenario here.. I have one service bus queue, and for various reasons (complicated duplicate-suppression business logic), the queue messages have to be processed one at a time, so I have ServiceBusConfiguration.OnMessageOptions.MaxConcurrentCalls set to 1. So processing a message holds up the whole queue until it is finished. Needless to say, this is suboptimal.
This 'one at a time' policy isn't so simple. The messages could be processed in parallel, they just have to be divided into groups (based on a field in message), say A and B. Group A can process its messages one at a time, and group B can process its own one at a time, etc. A and B are processed in parallel, all is good.
So I can create a queue for each group, A, B, C, ... etc. There are about 50 groups, so 50 queues.
I can create a queue for each group, but how do I make this work with the Azure WebJobs SDK? I don't want to copy-paste a method for each queue with a different ServiceBusTrigger for the SDK to discover, just to enforce one-at-a-time per queue/group, then update the code with another copy-paste whenever another group is needed. Fetching a list of queues at startup and tying them to the function is preferable.
I have looked around and I don't see any way to do what I want. The ITypeLocator interface is pretty hard-set to look for attributes. I could probably abuse the INameResolver, but it seems like I'd still have to have a bunch of near-duplicate methods around. Could I somehow create what the SDK is looking for at startup/runtime?
(To be clear, I know how to use INameResolver to get queue name as at How to set Azure WebJob queue name at runtime? but though similar this isn't my problem. I want to setup triggers for multiple queues at startup for the same function to get the one-at-a-time per queue processing, without using the trigger attribute 50 times repeatedly. I figured I'd ask again since the SDK repo is fairly active and it's been a year..).
Or am I going about this all wrong? Being dumb? Missing something? Any advice on this dilemma would be welcome.
The Azure WebJobs host discovers and indexes the functions with the ServiceBusTrigger attribute when it starts, so there is no way to set up queue triggers at runtime.
The simpler solution for you is to create a long-running job and wire up the receivers manually:
public class Program
{
    private static void Main()
    {
        var host = new JobHost();
        host.CallAsync(typeof(Program).GetMethod("Process"));
        host.RunAndBlock();
    }

    [NoAutomaticTrigger]
    public static async Task Process(TextWriter log, CancellationToken token)
    {
        var connectionString = "myconnectionstring";
        // You could also get the queue names from app settings or an Azure table.
        var queueNames = new[] { "queueA", "queueB" };
        var messagingFactory = MessagingFactory.CreateFromConnectionString(connectionString);

        foreach (var queueName in queueNames)
        {
            var receiver = messagingFactory.CreateMessageReceiver(queueName);
            receiver.OnMessage(message =>
            {
                try
                {
                    // do something with the message
                    // ...

                    // Complete the message
                    message.Complete();
                }
                catch (Exception ex)
                {
                    // Log the error
                    log.WriteLine(ex.ToString());
                    // Abandon the message so that it can be retried
                    message.Abandon();
                }
            }, new OnMessageOptions { MaxConcurrentCalls = 1 });
        }

        // Wait until the job stops or restarts
        await Task.Delay(Timeout.InfiniteTimeSpan, token);
    }
}
Otherwise, if you don't want to deal with multiple queues, you can have a look at Azure Service Bus topics/subscriptions and create a SqlFilter to send each message to the right subscription, as sketched below.
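Here is a rough sketch of that topic/subscription approach, using the same Microsoft.ServiceBus library as the code above. The topic name, the use of the group name as the subscription name, and the "Group" message property are assumptions for illustration: each group gets its own subscription filtered on a message property, and senders must set that property.

using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

public static class GroupSubscriptionSetup
{
    // Create one filtered subscription per group (idempotent).
    public static void EnsureSubscriptions(string connectionString, string[] groups)
    {
        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);

        if (!namespaceManager.TopicExists("grouped-messages"))
            namespaceManager.CreateTopic("grouped-messages");

        foreach (var group in groups)
        {
            // Only messages whose 'Group' property matches land in this subscription.
            if (!namespaceManager.SubscriptionExists("grouped-messages", group))
                namespaceManager.CreateSubscription(
                    "grouped-messages", group, new SqlFilter($"Group = '{group}'"));
        }
    }

    // Senders stamp each message with its group so the SqlFilter can route it.
    public static void Send(MessagingFactory factory, string group, string body)
    {
        var sender = factory.CreateMessageSender("grouped-messages");
        var message = new BrokeredMessage(body);
        message.Properties["Group"] = group;
        sender.Send(message);
    }
}

Each group's subscription can then be processed with MaxConcurrentCalls = 1, just like the per-queue receivers above.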
Another option could be to create your own trigger: the Azure WebJobs SDK provides extensibility points for creating your own trigger binding:
Binding Extensions Overview
Good luck!
Based on my understanding, what you need is to build a message batch system that processes in parallel. The solution from @Thomas is good, but I think the Azure Batch service with Table storage may be a better fit and could replace the more complex combination of a Service Bus queue + WebJobs with a trigger.
Using Azure Batch with Table storage, you can control task creation, execute tasks in parallel and at scale, and even monitor those tasks; please refer to the tutorial to learn how.

CloudQueueClient.ResponseReceived Event broken?

I'm trying to build an event-driven Azure queue where an event is fired every time a message is put into the queue. With AzureXplorer I can see that the messages are put into the queue properly, but the CloudQueueClient.ResponseReceived event never fires. I'm using Azure SDK v1.4. This is the code from my worker role:
public class WorkerRole : RoleEntryPoint
{
    public override void Run()
    {
        while (true)
        {
            Thread.Sleep(10000);
        }
    }

    public override bool OnStart()
    {
        // Set the maximum number of concurrent connections
        ServicePointManager.DefaultConnectionLimit = 12;

        var queueDataSource = new AzureQueueDataSource();
        queueDataSource.GetCloudQueueClient().ResponseReceived += new EventHandler<ResponseReceivedEventArgs>(WorkerRole_ResponseReceived);

        // For information on handling configuration changes
        // see the MSDN topic at http://go.microsoft.com/fwlink/?LinkId=166357.
        return base.OnStart();
    }

    void WorkerRole_ResponseReceived(object sender, ResponseReceivedEventArgs e)
    {
        var i = 1; // Breakpoint here is never hit
    }
}
Windows Azure Queues need to be polled for new messages. See SDK samples or code here for examples on how to query queues for new messages.
Quick list of things to take into account (a minimal polling sketch follows below):
- Because polling is counted as a transaction in Windows Azure, you will be paying for those.
- It is usually better to implement some kind of retry mechanism if no messages are found (e.g. exponential back-off).
- It is usually good to retrieve messages in batches (fewer round trips, fewer transactions, etc.).
- Remember that messages can be delivered more than once (plan for duplicate messages).
- Use the DequeueCount property to deal with poison messages.
There's plenty of coverage on all these. See the documentation/samples in the link above. This article is pretty good too: http://blogs.msdn.com/b/appfabriccat/archive/2010/12/20/best-practices-for-maximizing-scalability-and-cost-effectiveness-of-queue-based-messaging-solutions-on-windows-azure.aspx
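To make the list above concrete, here is a minimal polling sketch assuming the classic Microsoft.WindowsAzure.Storage.Queue client (the modern equivalent is Azure.Storage.Queues). The queue name, batch size, back-off limits, and poison threshold are placeholders.

using System;
using System.Threading;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

class QueuePoller
{
    static void Main()
    {
        var account = CloudStorageAccount.Parse("<your-storage-connection-string>");
        CloudQueue queue = account.CreateCloudQueueClient().GetQueueReference("myqueue");

        TimeSpan delay = TimeSpan.FromSeconds(1);
        while (true)
        {
            // Fetch a batch to reduce round trips and transactions.
            var messages = queue.GetMessages(32);
            bool foundAny = false;
            foreach (CloudQueueMessage msg in messages)
            {
                foundAny = true;
                if (msg.DequeueCount > 5)
                {
                    // Poison message: park or log it instead of retrying forever.
                    queue.DeleteMessage(msg);
                    continue;
                }

                // ... process msg.AsString here; expect possible duplicate deliveries ...
                queue.DeleteMessage(msg);
            }

            // Exponential back-off when the queue is empty; reset once work is found.
            delay = foundAny
                ? TimeSpan.FromSeconds(1)
                : TimeSpan.FromSeconds(Math.Min(delay.TotalSeconds * 2, 60));
            Thread.Sleep(delay);
        }
    }
}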
