I hope someone can clarify this for me:
I have 2 consumers in the same consumer group. My understanding is that they should coordinate with each other, but both consumers are receiving all the messages. My code is pretty simple:
const connectionString =...";
const eventHubName = "my-hub-dev";
const consumerGroup = "processor";
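const { EventHubConsumerClient } = require("@azure/event-hubs");
async function main() {
  const consumerClient = new EventHubConsumerClient(consumerGroup, connectionString, eventHubName);
  const subscription = consumerClient.subscribe({
    processEvents: async (events, context) => {
      for (const event of events) {
        console.log(`Received event...`, event);
      }
    },
    // Error handler, part of the same handlers object as processEvents
    processError: async (err, context) => {
      console.error(err);
    },
  });
}
main().catch(console.error);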
If I run two instances of this consumer code and publish an event, both instances will receive the event.
So my questions are:
Am I correct in my understanding that only 1 consumer should receive the message?
Is there anything I am missing here?
The EventHubConsumerClient requires a CheckpointStore that facilitates coordination between multiple clients. You can pass this to the EventHubConsumerClient constructor when you instantiate it.
The @azure/eventhubs-checkpointstore-blob package uses Azure Storage Blob to store the metadata required to coordinate multiple consumers using the same consumer group. It also stores checkpoint data: you can call context.updateCheckpoint with an event, and if you stop and start a new receiver, it will continue from the last checkpointed event in the partition that event was associated with.
There's a full sample using @azure/eventhubs-checkpointstore-blob here: https://github.com/Azure/azure-sdk-for-js/blob/master/sdk/eventhub/eventhubs-checkpointstore-blob/samples/javascript/receiveEventsUsingCheckpointStore.js
Clarification: The Event Hubs service doesn't enforce a single owner for a partition when reading from a consumer group unless the client has specified an ownerLevel. The highest ownerLevel "wins". You can set this in the options bag you pass to subscribe, but if you want the CheckpointStore to handle coordination for you it's best not to set it.
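As a rough sketch of the wiring described above (not a replacement for the linked sample), the consumer could be given a blob-backed checkpoint store as below. The environment variable names and the container name "eventhub-checkpoints" are placeholders I made up; the hub and consumer group names are taken from the question. The SDK packages are loaded inside main() only so that the handlers can be tried with plain objects.

```javascript
// Handlers are kept separate from the SDK wiring so they can be exercised with plain objects.
function createHandlers(log = console.log) {
  return {
    // Called with a batch of events from a single partition.
    processEvents: async (events, context) => {
      if (events.length === 0) return; // nothing to process or checkpoint
      for (const event of events) {
        log(`Partition ${context.partitionId}:`, event.body);
      }
      // Record progress so a restarted consumer resumes after this event.
      await context.updateCheckpoint(events[events.length - 1]);
    },
    processError: async (err, context) => {
      log(`Error on partition ${context.partitionId}:`, err.message);
    },
  };
}

async function main() {
  // Loaded lazily; requires the three packages to be installed.
  const { EventHubConsumerClient } = require("@azure/event-hubs");
  const { ContainerClient } = require("@azure/storage-blob");
  const { BlobCheckpointStore } = require("@azure/eventhubs-checkpointstore-blob");

  // Placeholder settings (assumptions): replace with your own values.
  const connectionString = process.env.EVENTHUB_CONNECTION_STRING;
  const storageConnectionString = process.env.STORAGE_CONNECTION_STRING;
  const containerName = "eventhub-checkpoints";

  const containerClient = new ContainerClient(storageConnectionString, containerName);
  if (!(await containerClient.exists())) {
    await containerClient.create();
  }
  const checkpointStore = new BlobCheckpointStore(containerClient);

  // Passing the checkpoint store lets instances in the same group share partitions.
  const consumerClient = new EventHubConsumerClient(
    "processor",
    connectionString,
    "my-hub-dev",
    checkpointStore
  );
  return consumerClient.subscribe(createHandlers());
}
```

Calling main() in two processes that point at the same storage container should, after a short balancing period, leave each partition owned by only one of them.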
I am trying sample code for the Azure Event Hubs producer to send some messages to an event hub.
The event hub and its policy are correctly configured for sending and listening. I am using a .NET Core 3.1 console application. However, the code doesn't move beyond the CreateBatchAsync() call. I tried debugging, and the breakpoint never reaches the next line. I also tried try-catch-finally, with no progress. Please advise what I am doing wrong here. The event hub in Azure shows some number of successful incoming requests.
using System;
using System.Text;
using System.Threading.Tasks;
using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;
class Program
{
    private const string connectionString = "<event_hub_connection_string>";
    private const string eventHubName = "<event_hub_name>";
    static async Task Main()
    {
        // Create a producer client that you can use to send events to an event hub
        await using (var producerClient = new EventHubProducerClient(connectionString, eventHubName))
        {
            // Create a batch of events
            using EventDataBatch eventBatch = await producerClient.CreateBatchAsync();
            // Add events to the batch. An event is represented by a collection of bytes and metadata.
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("First event")));
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("Second event")));
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("Third event")));
            // Use the producer client to send the batch of events to the event hub
            await producerClient.SendAsync(eventBatch);
            Console.WriteLine("A batch of 3 events has been published.");
        }
    }
}
The call to CreateBatchAsync is the first operation that needs to create a connection to Event Hubs. A hang at that point indicates that you're likely experiencing a connectivity or authorization issue.
In the default configuration you're using, the default network timeout is 60 seconds and up to 3 retries are possible, with some back-off between them.
Because of this, a failure to connect or authorize may take up to roughly 5 minutes before it manifests. That said, the majority of connection errors are not eligible for retries, so the failure would normally surface after roughly 1 minute.
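To make the arithmetic behind those figures concrete, here is a small back-of-the-envelope sketch. The 60 second timeout and 3 retries come from the paragraph above; the back-off values passed in are illustrative guesses of mine, not the SDK's actual schedule, and the function name is made up.

```javascript
// Worst-case wait before a failure surfaces: every attempt runs to its timeout,
// and a back-off delay is taken before each retry.
function worstCaseSeconds(tryTimeoutSeconds, maximumRetries, backoffSeconds) {
  const attempts = maximumRetries + 1; // the initial try plus each retry
  let total = attempts * tryTimeoutSeconds;
  for (let i = 0; i < maximumRetries; i++) {
    total += backoffSeconds[i] || 0; // missing entries count as no delay
  }
  return total;
}

// Default-like settings with assumed back-off delays of 1, 2 and 4 seconds.
console.log(worstCaseSeconds(60, 3, [1, 2, 4])); // 247
// A non-retriable error fails after a single timed-out attempt.
console.log(worstCaseSeconds(60, 0, []));        // 60
// Shorter timeout and no retries, for faster feedback while debugging.
console.log(worstCaseSeconds(15, 0, []));        // 15
```

With longer back-off delays the first figure moves closer to the 5 minute mark mentioned above.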
To aid in your debugging, I'd suggest tweaking the default retry policy to speed things up and surface an exception more quickly so that you have the information needed to troubleshoot and make adjustments. The options to do so are discussed in this sample and would look something like:
var connectionString = "<< CONNECTION STRING FOR THE EVENT HUBS NAMESPACE >>";
var eventHubName = "<< NAME OF THE EVENT HUB >>";
var options = new EventHubProducerClientOptions
{
    RetryOptions = new EventHubsRetryOptions
    {
        // Allow the network operation only 15 seconds to complete.
        TryTimeout = TimeSpan.FromSeconds(15),
        // Turn off retries
        MaximumRetries = 0,
        Mode = EventHubsRetryMode.Fixed,
        Delay = TimeSpan.FromMilliseconds(10),
        MaximumDelay = TimeSpan.FromSeconds(1)
    }
};
await using var producer = new EventHubProducerClient(
    connectionString,
    eventHubName,
    options);
What is the correct way to add a correlation-id to Azure events?
Right now, I send the events as follows:
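const { EventHubProducerClient } = require('@azure/event-hubs');
async function main() {
  const producer = new EventHubProducerClient(connectionString, eventHubName);
  const batch = await producer.createBatch();
  batch.tryAdd({
    body: {
      foo: "bar"
    }
  });
  await producer.sendBatch(batch);
  // Release the underlying connection once sending is done
  await producer.close();
}
main().catch(console.error);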
Of course as a workaround I could just add my own field to the body. However, I suspect that there is a built-in mechanism or default approach to do this.
The latest release exposes a correlationId property on EventData, which corresponds to the correlation-id field of the message properties section of the underlying AMQP message.
One important call-out is that the correlationId is intended to enable tracing of data within an application, such as an event's path from producer to consumer. It has no meaning to the Event Hubs service or within a distributed tracing/AppInsights/OpenTelemetry context.
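A minimal sketch of how that property might be used on the sending side, assuming a release of @azure/event-hubs that exposes correlationId on EventData as described above. The helper names withCorrelationId and sendCorrelated are mine, and the SDK is loaded inside the send function so the helper can be tried on its own.

```javascript
const { randomUUID } = require("crypto");

// Builds an EventData-shaped object; a random ID is generated when none is supplied.
function withCorrelationId(body, correlationId = randomUUID()) {
  return { body, correlationId };
}

// Sends a single event carrying the correlation ID (requires @azure/event-hubs).
async function sendCorrelated(connectionString, eventHubName, body, correlationId) {
  const { EventHubProducerClient } = require("@azure/event-hubs");
  const producer = new EventHubProducerClient(connectionString, eventHubName);
  try {
    const batch = await producer.createBatch();
    if (!batch.tryAdd(withCorrelationId(body, correlationId))) {
      throw new Error("Event does not fit into an empty batch");
    }
    await producer.sendBatch(batch);
  } finally {
    await producer.close();
  }
}

console.log(withCorrelationId({ foo: "bar" }, "order-42"));
// { body: { foo: 'bar' }, correlationId: 'order-42' }
```

On the receiving side the same value should then be readable as event.correlationId inside processEvents.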
I have created an Event Hub Namespace and 2 event hubs. I defined a Shared Access Policy (SAP) on the Event Hub Namespace. However, when I use the connection string defined on the namespace, I am able to send events to only one of the hubs, even though I create the client using the correct event hub name:
private static async Task SendEvent(string connectionString, string eventHubName)
{
    await using (var producerClient = new EventHubProducerClient(connectionString, eventHubName))
    {
        // Create a batch of events
        using EventDataBatch eventBatch = await producerClient.CreateBatchAsync();
        // entity and entityName come from the calling code (not shown)
        var payload = GetEventModel(entity, entityName);
        // Add events to the batch. An event is represented by a collection of bytes and metadata.
        eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes(payload.ToString())));
        // Use the producer client to send the batch of events to the event hub
        await producerClient.SendAsync(eventBatch);
        System.Diagnostics.Debug.WriteLine($"Event for {entity} sent to Hub {eventHubName}");
    }
}
The above code is called for sending events to Hub1 and Hub2. When I use the connection string from the SAP defined on the Namespace, I can only send events to Hub1 or Hub2 whichever happens to be called first. I am specifying the eventHubName as Hub1 or Hub2 as appropriate.
I call the function SendEvent in my calling code.
The only way I can send to both hubs is to define a SAP on each hub and use that connection string when creating the EventHubProducerClient.
Am I missing something or is this by design?
I ran a quick test on my side, and it works as expected.
Please try the code below, and let me know if it does not meet your needs:
class Program
{
    // The namespace-level SAS
    private const string connectionString = "Endpoint=sb://yyeventhubns.servicebus.windows.net/;SharedAccessKeyName=mysas;SharedAccessKey=xxxx";
    // I try to send data to the following 2 event hub instances.
    private const string hub1 = "yyeventhub1";
    private const string hub2 = "yyeventhub2";
    static async Task Main()
    {
        // Await each send so that failures surface and "completed" is printed only afterwards
        await SendEvent(connectionString, hub1);
        await SendEvent(connectionString, hub2);
        Console.WriteLine("**completed**");
        Console.ReadLine();
    }
    private static async Task SendEvent(string connectionString, string eventHubName)
    {
        // Create a producer client that you can use to send events to an event hub
        await using (var producerClient = new EventHubProducerClient(connectionString, eventHubName))
        {
            // Create a batch of events
            using EventDataBatch eventBatch = await producerClient.CreateBatchAsync();
            // Add events to the batch. An event is represented by a collection of bytes and metadata.
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("First event: " + eventHubName)));
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("Second event: " + eventHubName)));
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("Third event: " + eventHubName)));
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("Fourth event: " + eventHubName)));
            eventBatch.TryAdd(new EventData(Encoding.UTF8.GetBytes("Fifth event: " + eventHubName)));
            // Use the producer client to send the batch of events to the event hub
            await producerClient.SendAsync(eventBatch);
            Console.WriteLine("A batch of 5 events has been published to: " + eventHubName);
        }
    }
}
After running the code, I can see that the data is sent to both event hub instances.
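One thing that may be worth ruling out (this is a guess, since the question does not show the actual string) is that the connection string in use is really the namespace-level one. A connection string copied from a policy on a single hub carries an EntityPath segment naming that hub, while a namespace-level one, like the one in the test above, does not. The helper names below are made up for illustration.

```javascript
// Splits "Key=Value;Key=Value" pairs; values may themselves contain "=" (e.g. base64 keys).
function parseConnectionString(connectionString) {
  const parts = {};
  for (const segment of connectionString.split(";")) {
    const idx = segment.indexOf("=");
    if (idx <= 0) continue; // skip empty or malformed segments
    parts[segment.slice(0, idx).trim()] = segment.slice(idx + 1).trim();
  }
  return parts;
}

// Returns the hub a connection string is pinned to, or undefined for a namespace-level string.
function pinnedEventHub(connectionString) {
  return parseConnectionString(connectionString).EntityPath;
}

const namespaceLevel =
  "Endpoint=sb://yyeventhubns.servicebus.windows.net/;SharedAccessKeyName=mysas;SharedAccessKey=xxxx";
const hubLevel = namespaceLevel + ";EntityPath=yyeventhub1";

console.log(pinnedEventHub(namespaceLevel)); // undefined
console.log(pinnedEventHub(hubLevel));       // yyeventhub1
```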
We followed this example (http://masstransit-project.com/MassTransit/usage/azure-functions.html) to try to set up Azure Functions as Azure Service Bus event (topic) subscribers using MassTransit (for .Net CORE 2.1, Azure Functions 2.0).
When using Azure WebJobs, this is as simple as using RabbitMQ: configure the publisher, let the subscriber configure and set up its queue, and have MassTransit automatically create one topic per event, redirect to the queue, and move messages to "queue_error" after all retries have failed. You do not have to set up anything manually.
But with Azure Functions we seem to have to manually (through Service Bus Explorer or ARM templates) add the subscribers to the topic (which is created by the publisher on the first event it publishes) and the queues as well (though these don't even seem to be necessary, since the events are handled directly by the consuming Azure Function topic subscribers).
Maybe we are doing something wrong. I cannot see from the docs that MT will not, as it normally does, set up the subscriber and create the queues when using Azure Functions. But it works, except when the consumer throws an exception and all configured retries have been executed. We simply do not get the event in the dead-letter queue, and the error queue that MT normally generates does not even get created.
So how do we get MT to create the error queues, and MOVE the failed events there?
Our code:
[FunctionName("OrderShippedConsumer")]
public static Task OrderShippedConsumer(
    [ServiceBusTrigger("xyz.events.order/iordershipped", "ordershippedconsumer-queue", Connection = "AzureServiceBus")] Message message,
    IBinder binder,
    ILogger logger,
    CancellationToken cancellationToken,
    ExecutionContext context)
{
    var config = CreateConfig(context);
    var handler = Bus.Factory.CreateBrokeredMessageReceiver(binder, cfg =>
    {
        var serviceBusEndpoint = Parse.ConnectionString(config["AzureServiceBus"])["Endpoint"];
        cfg.CancellationToken = cancellationToken;
        cfg.SetLog(logger);
        cfg.InputAddress = new Uri($"{serviceBusEndpoint}{QueueName}");
        cfg.UseRetry(x => x.Intervals(TimeSpan.FromSeconds(5)));
        cfg.Consumer(() => new OrderShippedConsumer(cfg.Log, config));
    });
    return handler.Handle(message);
}
And the Consumer code:
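public class OrderShippedConsumer : IConsumer<IOrderShipped>
{
    private readonly IConfigurationRoot config;
    private readonly ILog log;
    public OrderShippedConsumer(ILog log, IConfigurationRoot config)
    {
        this.config = config;
        this.log = log;
    }
    public async Task Consume(ConsumeContext<IOrderShipped> context)
    {
        // Handle the event
    }
}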
For my app I need to organize a circular (ring) queue. It means that any processed message immediately goes to the end of the queue for continuous processing.
For example:
1. Queue: A, B, C.
2. Receiver processes A.
3. Queue: B, C, A.
2 and 3 should be performed atomically. So we never lose A or any other message.
Another requirement is to ignore duplicates: there should always be a single A in the queue, even if a sender pushes another A item. Here A refers to some unique (primary) key of the message.
I looked into using Azure Service Bus, but I cannot find how to meet both requirements with it. Is it possible to implement this scenario with Service Bus? If not, what are the best alternatives?
This kind of queue can be implemented with Service Bus sessions. Sessions provide "group by" mechanics, so we can assign our unique key to SessionId of the message and then receive messages in groups ignoring all messages in a group except the first one.
Implementation
1) Create a queue with RequiresSession set to true:
var queueDescription = new QueueDescription("CircularQueue")
{
    RequiresSession = true,
};
await namespaceManager.CreateQueueAsync(queueDescription);
2) When sending message to the queue, set SessionId to your unique key value:
var message = new BrokeredMessage($"Message body")
{
    MessageId = "MESSAGE_UNIQUE_KEY",
    SessionId = "MESSAGE_UNIQUE_KEY"
};
await queueClient.SendAsync(message);
3) Receive messages using sessions:
while (true)
{
    var session = await queueClient.AcceptMessageSessionAsync(TimeSpan.FromSeconds(10));
    if (session == null)
        continue;
    try
    {
        var messages = (await session.ReceiveBatchAsync(100)).ToList();
        if (messages.Count == 0)
            continue;
        var message = messages[0];
        ProcessMessage(message);
        await queueClient.SendAsync(message.Clone());
        await session.CompleteBatchAsync(messages.Select(msg => msg.LockToken));
    }
    finally
    {
        await session.CloseAsync();
    }
}
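The effect of the loop above can be pictured with a small in-memory model. This is only a sketch of the idea, not Service Bus code: the RingQueue class is hypothetical, it always picks the session of the oldest message (the real service may hand out sessions in a different order), and it treats re-send and completion as one step.

```javascript
// In-memory model of the session-based ring queue (not the Service Bus API).
class RingQueue {
  constructor() {
    this.messages = [];
  }

  // The unique key doubles as the session ID, as in step 2 above.
  send(key, body) {
    this.messages.push({ sessionId: key, body });
  }

  // One iteration of the receive loop; returns the processed message or null.
  step(processMessage) {
    if (this.messages.length === 0) return null;
    const sessionId = this.messages[0].sessionId; // accept the next session
    const batch = this.messages.filter((m) => m.sessionId === sessionId);
    const first = batch[0];
    processMessage(first);
    // Complete the whole batch (which drops duplicates) and re-send a clone of the first message.
    this.messages = this.messages.filter((m) => m.sessionId !== sessionId);
    this.messages.push({ ...first });
    return first;
  }

  keys() {
    return this.messages.map((m) => m.sessionId);
  }
}

const q = new RingQueue();
q.send("A", "a1");
q.send("B", "b1");
q.send("A", "a2"); // duplicate key pushed by a sender
q.send("C", "c1");
q.step(() => {});
console.log(q.keys()); // [ 'B', 'C', 'A' ]
```

After one step the duplicate A is gone and the single remaining A sits at the end of the queue, which matches the A, B, C to B, C, A example in the question.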
Based on the little I know about Azure Service Bus, I believe both of these requirements can be fulfilled with it individually, though I am not sure how both of them can be fulfilled together.
Message Cycling
It is my understanding that Azure Service Bus supports First-In-First-Out (FIFO) behavior. What you could do is fetch the message (say A) from the top of the queue in Receive and Delete mode and then reinsert the message back in the queue. Since you're creating a new message, it will be posted to the end of the queue.
Avoid Duplicate Messages
Service Bus queues have a boolean property called RequiresDuplicateDetection; setting it to true prevents duplicate messages from being inserted. A simple search for Azure Service Bus Duplicate Detection will lead you to many examples.
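To illustrate why combining the two ideas needs some care, here is a toy model of duplicate detection keyed on a message ID within a time window. The DedupQueue class and the 1000 ms window are assumptions for illustration only; the point is that re-inserting A under the same ID shortly after receiving it would itself be treated as a duplicate.

```javascript
// Toy model: a send is dropped when the same message ID was accepted within the window.
class DedupQueue {
  constructor(windowMs) {
    this.windowMs = windowMs;
    this.seen = new Map(); // messageId -> time the ID was last accepted
    this.messages = [];
  }

  // Returns true when the message is enqueued, false when it is dropped as a duplicate.
  send(messageId, body, nowMs) {
    const last = this.seen.get(messageId);
    if (last !== undefined && nowMs - last < this.windowMs) {
      return false;
    }
    this.seen.set(messageId, nowMs);
    this.messages.push({ messageId, body });
    return true;
  }
}

const queue = new DedupQueue(1000);
console.log(queue.send("A", "first", 0));      // true
console.log(queue.send("A", "re-insert", 500)); // false (still inside the window)
console.log(queue.send("A", "later", 1500));    // true (window has passed)
```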