How to read Azure Service Bus messages from multiple queues with one worker

I have three queues and one worker that I want to monitor all three of them (or only two of them).
One queue is qPirate
One queue is qShips
One queue is qPassengers
The idea is that workers will be looking at all three of them, two of them, or just one, and doing different things depending on what the message says.
The key, though, is isolation under failure. Say messages are failing because ship1 is offline: everything in qShips will keep cycling through retries, and workers watching that queue alongside the others will get slightly hung up, trying to process that queue's messages while only glancing at their other queues. Meanwhile, the workers that skip qShips and watch the other two queues will continue processing messages without holdup or delays.
public static void GotMessage([ServiceBusTrigger("%LookAtAllQueuesintheservicebus%")] BrokeredMessage message)
{
    var handler = new MessageHandler();
    var manager = new MessageManager(handler, "PirateShips");
    manager.ProcessMessageViaHandler(message);
}
Looking around online, I'm guessing this isn't something that's possible, but it seems like it should be. Thanks in advance either way!
Edit 1: I'll add the JobHost code as well to clarify things a bit.
JobHostConfiguration config = new JobHostConfiguration()
{
    DashboardConnectionString = "DefaultEndpointsProtocol=https;AccountName=PiratesAreUs;AccountKey=Yarr",
    StorageConnectionString = "DefaultEndpointsProtocol=https;AccountName=PiratesAreUs;AccountKey=Yarr",
    NameResolver = new QueueNameResolver()
};
ServiceBusConfiguration serviceBusConfig = new ServiceBusConfiguration()
{
    ConnectionString = "Endpoint=AllPirateQueuesLocatedHere;SharedAccessKeyName=PiratesAreUs;SharedAccessKey=Yarr"
};
serviceBusConfig.MessageOptions.AutoComplete = false;
serviceBusConfig.MessageOptions.AutoRenewTimeout = TimeSpan.FromMinutes(1);
serviceBusConfig.MessageOptions.MaxConcurrentCalls = 1;
config.UseServiceBus(serviceBusConfig);
JobHost host = new JobHost(config);
host.RunAndBlock();
Also, the QueueNameResolver class is simply:
public class QueueNameResolver : INameResolver
{
    public string Resolve(string name)
    {
        return name;
    }
}
I don't appear to have any way to make the NameResolver resolve multiple queues. While I can point the JobHost at a particular Service Bus namespace, I don't know how to tell it to look at all the queues within that namespace.
In other words, I want multiple ServiceBusTriggers on this worker, so that if one message is sent to qPirate1 and another to qShips1, both located in the Service Bus namespace AllPirateQueuesHere, the worker can pick up the message in qPirate1, process it, then pick up the message in qShips1 and process it.

Figured out the answer... This is possible, and it's simpler than I thought; I'm not sure why I didn't connect the dots earlier, but I'm still curious why there isn't more documentation about this. To have one worker watch multiple queues, you simply write one function per queue. So if you had three queues, you'd want something like the code below (and you can handle each message differently).
public static void GotMessage1([ServiceBusTrigger("%qPirate1%")] BrokeredMessage message)
{
    var handler = new MessageHandler();
    var manager = new MessageManager(handler, "Pirates");
    manager.ProcessMessageViaHandler(message);
}

public static void GotMessage2([ServiceBusTrigger("%qShip1%")] BrokeredMessage message)
{
    var handler = new MessageHandler();
    var manager = new MessageManager(handler, "Ships");
    manager.ProcessMessageViaHandler(message);
}

public static void GotBooty([ServiceBusTrigger("%qBooty%")] BrokeredMessage message)
{
    var handler = new MessageHandler();
    var manager = new MessageManager(handler, "Booty");
    manager.ProcessMessageViaHandler(message);
}
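A note on why this works, based on how the WebJobs SDK generally behaves rather than anything specific to this snippet: when RunAndBlock starts, the JobHost reflects over the public static methods in your functions class, finds every method carrying a trigger attribute, and listens on all of those queues concurrently. That's why no extra configuration is needed for one worker to watch several queues.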

Related

Event Hub Consumer in Service Fabric

I'm trying to get a Service Fabric service to consistently pull messages from an Azure Event Hub. I seem to have everything wired up, but I notice that my consumer just stops pulling events.
I have a hub with a couple thousand events I've pushed to it. I configured the hub with 1 partition, and my Service Fabric service also has only 1 partition, to ease debugging.
The service starts, creates the EventHubClient, and from there uses it to create a PartitionReceiver. The receiver is passed to an "EventLoop" that enters an "infinite" while loop that calls receiver.ReceiveAsync. The code for the EventLoop is below.
What I am observing is that the first time through the loop I almost always get 1 message. The second time through I get somewhere between 103 and 200ish messages. After that, I get no messages. It also seems that if I restart the service, I get the same messages again - but that's because when I restart the service I'm having it start back at the beginning of the stream.
I would expect this to keep running until my 2000 messages were consumed, and then to wait for me (polling occasionally).
Is there something specific I need to do with the Azure.Messaging.EventHubs 5.3.0 package to make it keep pulling events?
// Here is how I am creating the EventHubClient:
var connectionString = "something secret";
var connectionStringBuilder = new EventHubsConnectionStringBuilder(connectionString)
{
    EntityPath = "NameOfMyEventHub"
};
try
{
    m_eventHubClient = EventHubClient.Create(connectionStringBuilder);
}
catch (Exception)
{
    // the catch body was trimmed from the original question; log/rethrow as appropriate
    throw;
}
// Here is how I am getting the partition receiver:
var receiver = m_eventHubClient.CreateReceiver("$Default", m_partitionId, EventPosition.FromStart());
// The event loop which the receiver is passed to
private async Task EventLoop(PartitionReceiver receiver)
{
    m_started = true;
    while (m_keepRunning)
    {
        var events = await receiver.ReceiveAsync(m_options.BatchSize, TimeSpan.FromSeconds(5));
        if (events != null) // First 2/3 times events aren't null. After that, always null, and I know there are more in the partition.
        {
            var eventsArray = events as EventData[] ?? events.ToArray();
            m_state.NumProcessedSinceLastSave += eventsArray.Count();
            foreach (var evt in eventsArray)
            {
                // Process the event
                await m_options.Processor.ProcessMessageAsync(evt, null);
                string lastOffset = evt.SystemProperties.Offset;
                if (m_state.NumProcessedSinceLastSave >= m_options.BatchSize)
                {
                    m_state.Offset = lastOffset;
                    m_state.NumProcessedSinceLastSave = 0;
                    await m_state.SaveAsync();
                }
            }
        }
    }
    m_started = false;
}
EDIT: a question was asked about the number of partitions. The event hub has a single partition, and the SF service also has a single one.
I intend to use Service Fabric state to keep track of my offset into the hub, but that's not the concern for now.
Partition listeners are created for each partition. I get the partitions like this:
public async Task StartAsync()
{
    // slice the pie according to distribution:
    // this partition can get one or more assigned Event Hub partition ids
    string[] eventHubPartitionIds = (await m_eventHubClient.GetRuntimeInformationAsync()).PartitionIds;
    string[] resolvedEventHubPartitionIds = m_options.ResolveAssignedEventHubPartitions(eventHubPartitionIds);
    foreach (var resolvedPartition in resolvedEventHubPartitionIds)
    {
        var partitionReceiver = new EventHubListenerPartitionReceiver(m_eventHubClient, resolvedPartition, m_options);
        await partitionReceiver.StartAsync();
        m_partitionReceivers.Add(partitionReceiver);
    }
}
When partitionListener.StartAsync is called, it actually creates the PartitionReceiver, like this (there's a bit more to it than this, but this is the branch taken):
m_eventHubClient.CreateReceiver(m_options.EventHubConsumerGroupName, m_partitionId, EventPosition.FromStart());
Thanks for any tips.
Will
How many partitions do you have? I can't see in your code how you make sure you read all partitions in the default consumer group.
Any specific reason why you are using a PartitionReceiver instead of an EventProcessorHost?
To me, SF seems like a perfect fit for using the event processor host. I see there is already an SF-integrated solution that uses stateful services for checkpointing.
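For reference, a minimal sketch of the EventProcessorHost approach suggested above, using the Microsoft.Azure.EventHubs.Processor package (the hub name, connection strings, and lease container name are placeholders, not values from the question):
public class SimpleEventProcessor : IEventProcessor
{
    public Task OpenAsync(PartitionContext context) => Task.CompletedTask;

    public Task CloseAsync(PartitionContext context, CloseReason reason) => Task.CompletedTask;

    public Task ProcessErrorAsync(PartitionContext context, Exception error) => Task.CompletedTask;

    public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
    {
        foreach (var eventData in messages)
        {
            // handle each event here
        }
        // Checkpoint to blob storage so a restart resumes where it left off,
        // instead of EventPosition.FromStart() replaying everything.
        await context.CheckpointAsync();
    }
}

public static class ProcessorBootstrap
{
    public static Task RunAsync()
    {
        // All five arguments below are placeholders for this sketch.
        var host = new EventProcessorHost(
            "NameOfMyEventHub",
            PartitionReceiver.DefaultConsumerGroupName,
            "eventHubConnectionString",
            "storageConnectionString",  // used for lease/checkpoint blobs
            "lease-container");
        // The host leases partitions and keeps receivers alive for you,
        // so there is no manual PartitionReceiver management.
        return host.RegisterEventProcessorAsync<SimpleEventProcessor>();
    }
}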

MSMQ architecture with dedicated processors per database

I have a web application in ASP.NET MVC and C#, and I have a specific use case that takes a long time to process; users have to wait until the process is complete. I want to use MSMQ and relay the heavy work to a dedicated MSMQ consumer/service. Our application has multiple clients, and each client has their own SQL database, so 100 clients means 100 separate SQL databases. The real challenge is to make the process faster using MSMQ while ensuring the workload of one client does not affect the performance of the others. I see two solutions:
Option 1: A unique MSMQ private queue per database, so in my case 100 queues and growing, plus one dedicated console application listening to each queue, so 100 processors (console applications).
Option 2: One big MSMQ private queue for all databases, with either:
A: one dedicated MSMQ consumer per database, so 100 processors, or
B: one MSMQ consumer that listens to the big queue.
I want to stick with Option 1, but I would like to know: is this a feasible, enterprise-grade solution?
You actually have two questions.
First, how do you assign processor affinity for SQL Server? Select the database in SQL Server Management Studio, right-click it, and follow the affinity settings from there.
Clean your database regularly:
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
For MSMQ, turn on journaling, but also consider another queuing technology (RabbitMQ etc.), or write a simple in-process queue to enqueue the jobs, as in this sample:
public class MultiThreadQueue
{
    BlockingCollection<string> _jobs = new BlockingCollection<string>();

    public MultiThreadQueue(int numThreads)
    {
        for (int i = 0; i < numThreads; i++)
        {
            var thread = new Thread(OnHandlerStart)
            { IsBackground = true }; // Mark 'false' if you want to prevent program exit until jobs finish
            thread.Start();
        }
    }

    public void Enqueue(string job)
    {
        if (!_jobs.IsAddingCompleted)
        {
            _jobs.Add(job);
        }
    }

    public void Stop()
    {
        // This will cause '_jobs.GetConsumingEnumerable' to stop blocking and exit when it's empty
        _jobs.CompleteAdding();
    }

    private void OnHandlerStart()
    {
        foreach (var job in _jobs.GetConsumingEnumerable(CancellationToken.None))
        {
            Console.WriteLine(job);
            Thread.Sleep(10);
        }
    }
}
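For completeness, a hypothetical usage of the queue above might look like this (the job strings are placeholders):
// Spin up four background workers draining one in-process job queue.
var queue = new MultiThreadQueue(4);
queue.Enqueue("process database client-001");
queue.Enqueue("process database client-002");
// ... later, when shutting down:
queue.Stop(); // unblocks GetConsumingEnumerable once the queue is drained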
Hope this helps :)
The question has been reworded; he meant something else when he said "processors".
Update: added a consumer pattern with OnPeek.
You really need to post some code!
Consider using the OnPeekCompleted method. If there is an error, you can leave the message on the queue.
If you have some kind of header that identifies the message, you can switch to a different dedicated handler/thread.
private static void OnPeekCompleted(Object sourceQueue, PeekCompletedEventArgs asyncResult)
{
    // Set up and connect to the queue.
    MessageQueue mq = (MessageQueue)sourceQueue;
    // gets a new transaction going
    using (var txn = new MessageQueueTransaction())
    {
        try
        {
            // retrieve message and process
            txn.Begin();
            // End the asynchronous peek operation.
            var message = mq.Receive(txn);
#if DEBUG
            // Display message information on the screen.
            if (message != null)
            {
                Console.WriteLine("{0}: {1}", message.Label, (string)message.Body);
            }
#endif
            // message will be removed on txn.Commit.
            txn.Commit();
        }
        catch (Exception ex)
        {
            // If there is an error, leave the message on the queue (don't remove it)
            Console.WriteLine(ex.ToString());
            txn.Abort();
        }
    }
    // Restart the asynchronous peek operation.
    mq.BeginPeek();
}
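For reference, the handler above would typically be wired up something like this; the queue path is a placeholder, and the queue is assumed to be transactional:
// Subscribe and start the first asynchronous peek; each completed peek
// calls OnPeekCompleted, which receives, processes, and re-arms BeginPeek.
var mq = new MessageQueue(@".\private$\client-db-1");
mq.PeekCompleted += OnPeekCompleted;
mq.BeginPeek();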
You can also use SQL Server Service Broker.

Akka.net Ask timeout when used in Azure WebJob

At work we have some code in an Azure WebJob where we use RabbitMQ.
The basic workflow is this:
A message arrives on a RabbitMQ queue.
We have a message handler for the incoming message.
Within the message handler we start a top-level (user) supervisor actor and "ask" it to handle the message.
The supervisor actor hierarchy is like this (diagram omitted).
And the relevant top-level code is something like this (this is the WebJob code):
static void Main(string[] args)
{
    try
    {
        // Bootstrap the Akka IoC resolver well ahead of any actor usages
        new AutoFacDependencyResolver(ContainerOperations.Instance.Container, ContainerOperations.Instance.Container.Resolve<ActorSystem>());
        var system = ContainerOperations.Instance.Container.Resolve<ActorSystem>();
        var busQueueReader = ContainerOperations.Instance.Container.Resolve<IBusQueueReader>();
        var dateTime = ContainerOperations.Instance.Container.Resolve<IDateTime>();
        busQueueReader.AddHandler<ProgramCalculationMessage>("RabbitQueue", x =>
        {
            // This is code that gets called whenever we have a RabbitMQ message arrive
            try
            {
                // SupervisorActor is a singleton
                var supervisorActor = ContainerOperations.Instance.Container.ResolveNamed<IActorRef>("SupervisorActor");
                var actorMessage = new SomeActorMessage();
                var supervisorRunTask = supervisorActor.Ask(actorMessage, TimeSpan.FromMinutes(25));
                // we want to wait this guy out
                var supervisorRunResult = supervisorRunTask.GetAwaiter().GetResult();
                switch (supervisorRunResult)
                {
                    case CompletedEvent completed:
                        break;
                    case FailedEvent failed:
                        throw failed.Exception;
                }
            }
            catch (Exception ex)
            {
                _log.Error(ex, "Error found in WebJob");
                // rethrow for the actual RabbitMqQueueReader handler so the message gets NACKed
                throw;
            }
        });
        Thread.Sleep(Timeout.Infinite);
    }
    catch (Exception ex)
    {
        _log.Error(ex, "Error found");
        throw;
    }
}
And this is the relevant IoC code (we are using Autofac plus Akka.NET DI for Autofac):
builder.RegisterType<SupervisorActor>();
_actorSystem = new Lazy<ActorSystem>(() =>
{
    var akkaconf = ActorUtil.LoadConfig(_akkaConfigPath).WithFallback(ConfigurationFactory.Default());
    return ActorSystem.Create("WebJobSystem", akkaconf);
});
builder.Register<ActorSystem>(cont => _actorSystem.Value);
builder.Register(cont =>
{
    var system = cont.Resolve<ActorSystem>();
    return system.ActorOf(system.DI().Props<SupervisorActor>(), "SupervisorActor");
})
.SingleInstance()
.Named<IActorRef>("SupervisorActor");
The problem
So the code is working fine and doing what we want, apart from the Akka.NET "ask" timeout shown above in the WebJob code.
Annoyingly, this seems to work fine if I run the WebJob locally, where I can simulate an "ask" timeout by providing a supervisor actor that simply doesn't EVER respond with a message back to the "Sender".
That works perfectly on my machine, but when we run this code in Azure, we DO NOT see a timeout for the "ask", even though one of our workflow runs exceeded the "ask" timeout by a mile.
I just don't know what could be causing this behavior; does anyone have any ideas?
Could there be some Azure-specific config value for the WebJob that I need to set?
The answer to this was to use the async Rabbit handlers, which apparently came out in v5.0 of the C# RabbitMQ client. The official docs still (sadly) show the sync usage.
This article is quite good: https://gigi.nullneuron.net/gigilabs/asynchronous-rabbitmq-consumers-in-net/
Once we did this, all was good.
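For anyone landing here, a minimal sketch of that async consumer pattern with RabbitMQ.Client 5.x looks roughly like the following; the host name and queue name are placeholders, and the handler body stands in for the real actor work:
using System;
using System.Threading.Tasks;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

class AsyncConsumerSketch
{
    static void Main()
    {
        var factory = new ConnectionFactory
        {
            HostName = "localhost",       // placeholder
            DispatchConsumersAsync = true // required, or async handlers won't fire
        };
        using (var connection = factory.CreateConnection())
        using (var channel = connection.CreateModel())
        {
            var consumer = new AsyncEventingBasicConsumer(channel);
            consumer.Received += async (sender, ea) =>
            {
                // await the actor "ask" here instead of blocking with GetResult()
                await Task.Yield(); // stands in for the real async work
                channel.BasicAck(ea.DeliveryTag, multiple: false);
            };
            channel.BasicConsume("RabbitQueue", autoAck: false, consumer);
            Console.ReadLine(); // keep the consumer alive in this sketch
        }
    }
}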

Getting Data from EventHub is delayed

I have an EventHub configured in Azure, and a consumer group for reading the data. It was working fine for some days. Suddenly, I see a delay in incoming data (around 3 days). I use a Windows Service on my server to consume the data, and I have around 500 incoming messages per minute. Can anyone help me figure this out?
It might be that you are processing the items too slowly, so the backlog of work grows and you lag further and further behind.
To get some insight into where you are in the event stream, you can use code like this:
private void LogProgressRecord(PartitionContext context)
{
    if (namespaceManager == null)
        return;
    var currentSeqNo = context.Lease.SequenceNumber;
    var lastSeqNo = namespaceManager.GetEventHubPartition(context.EventHubPath, context.ConsumerGroupName, context.Lease.PartitionId).EndSequenceNumber;
    var delta = lastSeqNo - currentSeqNo;
    logWriter.Write(
        $"Last processed seqnr for partition {context.Lease.PartitionId}: {currentSeqNo} of {lastSeqNo} in consumergroup '{context.ConsumerGroupName}' (lag: {delta})",
        EventLevel.Informational);
}
The namespaceManager is built like this:
namespaceManager = NamespaceManager.CreateFromConnectionString("Endpoint=sb://xxx.servicebus.windows.net/;SharedAccessKeyName=yyy;SharedAccessKey=zzz");
I call this logging method in the CloseAsync method:
public Task CloseAsync(PartitionContext context, CloseReason reason)
{
    LogProgressRecord(context);
    return Task.CompletedTask;
}
logWriter is just some logging class I have used to write info to blob storage.
It now outputs messages like
Last processed seqnr for partition 3: 32780931 of 32823804 in consumergroup 'telemetry' (lag: 42873)
So when the lag is very high, you could be processing events that occurred a long time ago. In that case you need to scale your processor up or out.
If you notice a lag, you should measure how long it takes to process a given number of items. You can then try to optimize performance and see whether it improves. We did it like this:
public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> events)
{
    try
    {
        stopwatch.Restart();
        // process items here
        stopwatch.Stop();
        await CheckPointAsync(context);
        logWriter.Write(
            $"Processed {events.Count()} events in {stopwatch.ElapsedMilliseconds}ms using partition {context.Lease.PartitionId} in consumergroup {context.ConsumerGroupName}.",
            EventLevel.Informational);
    }
    catch (Exception ex)
    {
        // the catch body was trimmed from the original answer; log and decide whether to rethrow
        logWriter.Write(ex.ToString(), EventLevel.Error);
    }
}

How to do Async in Azure WebJob function

I have an async method that gets API data from a server. When I run this code on my local machine in a console app, it performs at high speed, pushing through a few hundred HTTP calls per minute in the async function. When I put the same code in an Azure WebJob triggered from a queue message, however, it seems to operate synchronously and my numbers crawl. I'm sure I am missing something simple in my approach - any assistance appreciated.
(1) The WebJob function that listens for a message on the queue and kicks off the API get process when a message is received:
public class Functions
{
    // This function will get triggered/executed when a new message is written
    // on an Azure Queue called myqueue.
    public static async Task ProcessQueueMessage([QueueTrigger("myqueue")] string message, TextWriter log)
    {
        var getAPIData = new GetData();
        getAPIData.DoIt(message).Wait();
        log.WriteLine("*** done: " + message);
    }
}
(2) The class that, outside Azure, works in async mode at speed:
class GetData
{
    // wrapper that is called by the message function trigger
    public async Task DoIt(string MessageFile)
    {
        await CallAPI(MessageFile);
    }

    public async Task<string> CallAPI(string MessageFile)
    {
        // create a list of sample APIs to call...
        var apiCallList = new List<string>();
        apiCallList.Add("localhost/?q=1");
        apiCallList.Add("localhost/?q=2");
        apiCallList.Add("localhost/?q=3");
        apiCallList.Add("localhost/?q=4");
        apiCallList.Add("localhost/?q=5");

        // set up the HttpClient
        HttpClient client = new HttpClient() { MaxResponseContentBufferSize = 10000000 };
        var timeout = new TimeSpan(0, 5, 0); // 5 min timeout
        client.Timeout = timeout;

        // create a list of http api get Tasks...
        IEnumerable<Task<string>> allResults = apiCallList.Select(str => ProcessURLPageAsync(str, client));
        // wait for them all to complete, then move on...
        await Task.WhenAll(allResults);
        return allResults.ToString();
    }

    async Task<string> ProcessURLPageAsync(string APIAddressString, HttpClient client)
    {
        string page = "";
        HttpResponseMessage resX;
        try
        {
            // set the address to call
            Uri URL = new Uri(APIAddressString);
            // execute the call
            resX = await client.GetAsync(URL);
            page = await resX.Content.ReadAsStringAsync();
            string rslt = page;
            // do something with the api response data
        }
        catch (Exception ex)
        {
            // log error
        }
        return page;
    }
}
First, because your triggered function is async, you should use await rather than .Wait(). Wait will block the current thread.
public static async Task ProcessQueueMessage([QueueTrigger("myqueue")] string message, TextWriter log)
{
    var getAPIData = new GetData();
    await getAPIData.DoIt(message);
    log.WriteLine("*** done: " + message);
}
Anyway, you'll be able to find useful information in the documentation:
Parallel execution
If you have multiple functions listening on different queues, the SDK will call them in parallel when messages are received simultaneously.
The same is true when multiple messages are received for a single queue. By default, the SDK gets a batch of 16 queue messages at a time and executes the function that processes them in parallel. The batch size is configurable. When the number being processed gets down to half of the batch size, the SDK gets another batch and starts processing those messages. Therefore the maximum number of concurrent messages being processed per function is one and a half times the batch size. This limit applies separately to each function that has a QueueTrigger attribute.
Here is some sample code to configure the batch size:
var config = new JobHostConfiguration();
config.Queues.BatchSize = 50;
var host = new JobHost(config);
host.RunAndBlock();
However, it is not always a good option to have too many threads running at the same time; it can lead to bad performance.
Another option is to scale out your webjob:
Multiple instances
If your web app runs on multiple instances, a continuous WebJob runs on each machine, and each machine will wait for triggers and attempt to run functions. The WebJobs SDK queue trigger automatically prevents a function from processing a queue message multiple times; functions do not have to be written to be idempotent. However, if you want to ensure that only one instance of a function runs even when there are multiple instances of the host web app, you can use the Singleton attribute.
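As a rough illustration (the queue name is a placeholder), the attribute just decorates the triggered function:
// With [Singleton], only one invocation of this function runs at a time,
// even across multiple scaled-out instances of the host.
[Singleton]
public static async Task ProcessQueueMessage([QueueTrigger("myqueue")] string message, TextWriter log)
{
    await log.WriteLineAsync("processing: " + message);
}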
Have a read of this Webjobs SDK documentation - the behaviour you should expect is that your process will run and process one message at a time, but will scale up if more instances are created (of your app service). If you had multiple queues, they will trigger in parallel.
In order to improve the performance, see the configurations settings section in the link I sent you, which refers to the number of messages that can be triggered in a batch.
If you want to process multiple messages in parallel, though, and don't want to rely on instance scaling, then you need to use threading instead (async isn't about multi-threaded parallelism, but about making more efficient use of the thread you're on). So your queue trigger function should read the message from the queue, then create a thread, "fire and forget" that thread, and return from the trigger function. This will mark the message as processed and allow the next message on the queue to be processed, even though in theory you're still processing the earlier one. Note you will need to include your own logic for error handling and for ensuring that the data won't get lost if your thread throws an exception or can't process the message (e.g. put it on a poison queue), as in the sketch below.
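A rough sketch of that idea, under the stated caveats; DoLongRunningWorkAsync and the poison-queue handling are hypothetical stand-ins:
public static void ProcessQueueMessage([QueueTrigger("myqueue")] string message)
{
    // Fire and forget: the trigger returns immediately, so the SDK completes
    // the message and frees the listener for the next one.
    Task.Run(async () =>
    {
        try
        {
            await DoLongRunningWorkAsync(message); // hypothetical worker method
        }
        catch (Exception ex)
        {
            // The SDK can no longer retry for you at this point, so record the
            // failure yourself, e.g. by writing to your own poison queue.
            Console.Error.WriteLine("failed to process '" + message + "': " + ex);
        }
    });
}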
The other option is to not use the [QueueTrigger] attribute at all, and instead use the Azure Storage Queues SDK directly to connect and process the messages per your requirements.
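A minimal polling loop with the classic WindowsAzure.Storage SDK might look like this; the connection string, queue name, and backoff interval are all placeholders:
using System;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Queue;

public static async Task PollQueueAsync()
{
    var account = CloudStorageAccount.Parse("storageConnectionString"); // placeholder
    var queue = account.CreateCloudQueueClient().GetQueueReference("myqueue");
    while (true)
    {
        var msg = await queue.GetMessageAsync(); // message becomes invisible while leased
        if (msg == null)
        {
            await Task.Delay(TimeSpan.FromSeconds(5)); // queue empty, back off
            continue;
        }
        // process msg.AsString here, then delete so it is not redelivered
        await queue.DeleteMessageAsync(msg);
    }
}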
