I'm working on standing up the Azure Service Bus messaging infrastructure for my team, and I'm trying to establish best practices for developing Service Bus message receivers. We are building a new service to consume the Service Bus messages; its startup script will instantiate the message receivers and start their message reception.
The pattern I'm setting up for my team is to extend a base receiver class and implement an abstract method that starts the message receiver in streaming fashion.
I'm curious whether there are any notable differences between receiving messages using ServiceBusReceiver::subscribe vs ServiceBusReceiver::receiveMessages (stream vs loop)? I'm suggesting that my team use ServiceBusReceiver::subscribe, since it registers the message handlers indefinitely and seems to handle errors more gracefully.
I've noticed two differences between the stream and loop approaches:
ServiceBusReceiver::receiveMessages is asynchronous. This means that in my startup script I would need to use Promise.all or Promise.allSettled to start the receivers in parallel. Because of the limited error handling with loop-based reception, I noticed that if the receiver hits an error, it halts message processing. This would require our team to restart the service whenever any receiver hits an error, which is a drawback for us.
The streaming method is synchronous, so my startup script can register the subscriptions, save the return values, and close the subscriptions on shutdown.
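Here's a minimal sketch of how I'm wiring up the two approaches (using the @azure/service-bus v7 API; the queue name and connection string are placeholders):

import { ServiceBusClient } from "@azure/service-bus";

async function main() {
  const client = new ServiceBusClient("<connection string>");
  const receiver = client.createReceiver("<queue name>");

  // Stream: register handlers once; the SDK pushes messages to processMessage
  // and routes failures to processError without tearing the subscription down.
  const subscription = receiver.subscribe({
    processMessage: async (message) => {
      console.log("received:", message.body);
    },
    processError: async (args) => {
      console.error(`error from ${args.errorSource}:`, args.error);
    },
  });

  // Loop: pull batches explicitly; each call resolves with up to 10 messages
  // or whatever arrived within maxWaitTimeInMs.
  const messages = await receiver.receiveMessages(10, { maxWaitTimeInMs: 5000 });
  for (const message of messages) {
    await receiver.completeMessage(message);
  }

  // Shutdown: close the handle returned by subscribe, then the receiver/client.
  await subscription.close();
  await receiver.close();
  await client.close();
}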
If I refer to the object's properties via this inside the ServiceBusReceiver::subscribe callback functions, I get an error that the property is undefined. It seems like the callback functions lose the this context of the object?
Thanks in advance
Streaming is definitely the intended way of receiving messages for the messaging services, though both ways of receiving work just fine with the Service Bus JS SDK.
receiveMessages (loop) exists mainly for the convenience of users who just want to receive messages simply and don't want to deal with callbacks, handlers, etc.
Internally, receiveMessages also does streaming to receive the messages and waits for the given duration before returning the array of messages.
Hope that might clarify your doubts.
If I refer to the object's properties via this inside the ServiceBusReceiver::subscribe callback functions, I get an error that the property is undefined. It seems like the callback functions lose the this context of the object?
You can perhaps use arrow functions. For reference, please check this part of an unrelated subscribe test...
https://github.com/Azure/azure-sdk-for-js/blob/d417e93b53450b2660c34965ffa177f3d4d2f947/sdk/servicebus/perf-tests/service-bus/test/subscribe.spec.ts#L72
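To illustrate (a made-up sketch, not taken from the test linked above; the class and property names are hypothetical):

import { ServiceBusReceiver, ServiceBusReceivedMessage, ProcessErrorArgs } from "@azure/service-bus";

class OrderReceiver {
  private processedCount = 0;

  start(receiver: ServiceBusReceiver) {
    receiver.subscribe({
      // With a regular function expression, `this` inside the callback is not
      // the OrderReceiver instance, so `this.processedCount` is undefined:
      //   processMessage: async function (message) { this.processedCount++; }
      // An arrow function captures `this` lexically from the enclosing class:
      processMessage: async (message: ServiceBusReceivedMessage) => {
        this.processedCount++;
        console.log(`processed ${this.processedCount}:`, message.body);
      },
      processError: async (args: ProcessErrorArgs) => {
        console.error(args.error);
      },
    });
  }
}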
I'm a bit confused regarding the EventHubTrigger for Azure functions.
I've got an IoT Hub, and am using its eventhub-compatible endpoint to trigger an Azure function that is going to process and store the received data.
However, if my function fails (i.e. throws an exception), the message (or messages) being processed during that invocation get lost. I would actually expect the Azure Functions runtime to process those messages again at a later time. Specifically, I would expect this behavior because the EventHubTrigger keeps checkpoints in the Function App's storage account in order to track where in the event stream it has to continue.
The documentation of the EventHubTrigger even states that
If all function executions succeed without errors, checkpoints are added to the associated storage account
But still, even when I deliberately throw exceptions in my function, the checkpoints will get updated and the messages will not get received again.
Is my understanding of the EventHubTrigger's documentation wrong, or is the EventHubTrigger's implementation (or its documentation) wrong?
This piece of documentation does seem confusing. I guess they mean errors of the Function App host itself, not of your code. An exception inside a function execution doesn't stop the processing and checkpointing progress.
The fact is that Event Hubs are not designed for individual message retries. The processor works in batches, and it can either mark the whole batch as processed (i.e. create a checkpoint after it), or retry the whole batch (e.g. if the process crashed).
See this forum question and answer.
If you still need to re-process failed events from Event Hub (and errors don't happen too often), you could implement such a mechanism yourself, e.g. (a sketch follows below):
Add an output Queue binding to your Azure Function.
Add try-catch around processing code.
If an exception is thrown, add the problematic event to the Queue.
Have another Function with Queue trigger to process those events.
Note that the downside of this is that you will lose the ordering guarantee provided by Event Hubs (since the Queue message will be processed later than its neighbors).
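A rough sketch of steps 1-4 in a JavaScript/TypeScript function (the binding name retryQueue and the handle helper are assumptions; the trigger and output bindings themselves would be declared in function.json):

import { AzureFunction, Context } from "@azure/functions";

const eventHubTrigger: AzureFunction = async function (context: Context, eventHubMessages: unknown[]): Promise<void> {
  const failed: unknown[] = [];
  for (const message of eventHubMessages) {
    try {
      handle(message); // your real processing logic
    } catch (err) {
      // Don't rethrow: let the checkpoint advance, but capture the event.
      context.log.error("processing failed, routing event to retry queue", err);
      failed.push(message);
    }
  }
  // "retryQueue" is a Storage Queue output binding; assigning an array
  // enqueues one queue message per failed event, for a separate
  // queue-triggered function to process.
  if (failed.length > 0) {
    context.bindings.retryQueue = failed;
  }
};

function handle(message: unknown): void {
  // placeholder for the actual processing
}

export default eventHubTrigger;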
Quick fix: since a retry policy would not help if the downstream system is down for a few hours, you can call Process.GetCurrentProcess().Kill(); in your exception handling. This stops the checkpoint from moving forward. I have tested this with a consumption-based function app. You will not see anything in the logs, but I added an email notification that something went wrong and that, to avoid data loss, I killed the function instance.
Hope this helps.
I will put up a blog post about it, and about the other part of the workflow, where I use a Logic App to stop the function in case of continuous failures on the downstream system.
A really common pattern that I need in multi-instance web applications is invalidating MemoryCaches across all instances and waiting for confirmation that this has been done (because otherwise, after a refresh, a user might suddenly see old data from another instance).
We can build this with a combination of:
Azure Service Bus,
Sending message to a topic
other instances send message back with ReplyTo to the original instance
have a wait loop for waiting on the messages back,
be aware of how many other instances there are in the first place.
probably some timeout because what happens if an instance crashes in between?
I think working out all these little edge cases might be a lot of work - so before we reinvent the wheel - is there already a common pattern or library for this?
(of course one solution would be using a shared cache like Redis, but in some situations a MemoryCache is a lot faster)
Have a look at Azure Durable Functions, e.g. Fan-In/Fan-Out scenario. They use Azure Storage Queues underneath, but provide higher-level abstractions.
Note that Durable Functions are still in early preview (as of August 2017), so not suitable for production use yet.
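For a rough idea of the fan-out/fan-in shape, here is a sketch using the durable-functions JavaScript package (activity names like GetInstanceIds and InvalidateCache are made up for illustration):

import * as df from "durable-functions";

// Fan out one "InvalidateCache" activity per instance, then fan in by
// waiting for all confirmations before continuing.
const orchestrator = df.orchestrator(function* (context) {
  const instanceIds: string[] = yield context.df.callActivity("GetInstanceIds");

  // Fan out: start all invalidations in parallel.
  const tasks = instanceIds.map((id) => context.df.callActivity("InvalidateCache", id));

  // Fan in: resumes only once every instance has confirmed.
  yield context.df.Task.all(tasks);

  return "all caches invalidated";
});

export default orchestrator;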
I think working out all these little edge cases might be a lot of work - so before we reinvent the wheel - is there already a common pattern or library for this?
Indeed. This sounds like a candidate for a middleware framework such as NServiceBus or MassTransit.
Azure Service Bus
Both NServiceBus and MassTransit support Azure Service Bus as the transport.
Sending message to a topic
Both NServiceBus and MassTransit can Publish messages (events) to topics.
other instances send message back with ReplyTo to the original instance
Both NServiceBus and MassTransit can send messages to a specific destination. NServiceBus can also Reply to the originator of an incoming message using a request/reply pattern.
have a wait loop for waiting on the messages back
Both NServiceBus and MassTransit support Sagas, also known as Process Coordinator pattern.
be aware of how many other instances there are in the first place.
Not sure about this requirement. When you scale out, you're running with competing consumers and shouldn't care about the number of instances of an endpoint.
probably some timeout because what happens if an instance crashes in between?
If you refer to retries and recovery, then both NServiceBus and MassTransit support retries.
You can use the Azure Redis Cache pub/sub model to do this.
1) Subscribe to the Redis multiplexer on each instance:
var connectionMultiplexer = ConnectionMultiplexer.Connect("redis connection string");
connectionMultiplexer.GetSubscriber().Subscribe(
    "SubscribeChannelName",
    (channel, message) => {
        // invalidate the local cache here, then publish the confirmation:
        connectionMultiplexer.GetSubscriber().PublishAsync("PublishChannelName", "Cache invalidated for instance").Wait();
    });
2) Publish the cache invalidation and subscribe for confirmation from instances
var connection = ConnectionMultiplexer.Connect("redis connection string");
var redisSubscriber = connection.GetSubscriber();
redisSubscriber.Subscribe(
"PublishChannelName",
(channel, message) => {
// write logic to verify if all instances notified about cache invalidation.
});
redisSubscriber.PublishAsync("SubscribeChannelName", "invalidate cache").Wait();
Per Azure Functions Service Bus bindings:
Trigger behavior
...
PeekLock behavior - The Functions runtime receives a message in PeekLock mode and calls Complete on the message if the function finishes successfully, or calls Abandon if the function fails. If the function runs longer than the PeekLock timeout, the lock is automatically renewed.
I am assuming that when the Azure function calls Complete on the message, it will be removed from the queue.
What should I do in my function if I want my function to spy on the message but never delete it?
Unsuccessful processing of a message, i.e. the function throwing an exception or an explicit abandon operation on the message, will not complete the message.
Saying that, I see a problem with this approach. You're not truly "spying" on the messages, but actively processing them. That means a given message will be re-delivered and eventually end up in the dead-letter queue. If you want to spy, you should peek at the messages, but the Azure Service Bus trigger doesn't do that.
If you need a wiretap implementation, it's probably not a bad idea to use a topic and have two subscriptions: one to consume the messages and another to duplicate all the messages for your wiretap function (which perhaps does some sort of analysis or logging). Without understanding the full scope of what you're doing, it's hard to provide an answer.
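If peeking is what you're after, outside of the trigger it looks roughly like this with the Service Bus JS SDK (a sketch; the queue name is a placeholder):

import { ServiceBusClient } from "@azure/service-bus";

async function spyOnQueue(connectionString: string): Promise<void> {
  const client = new ServiceBusClient(connectionString);
  const receiver = client.createReceiver("<queue name>");
  try {
    // peekMessages reads without locking or settling: nothing is completed,
    // abandoned, or dead-lettered, and the messages stay on the queue.
    const peeked = await receiver.peekMessages(10);
    for (const message of peeked) {
      console.log(message.messageId, message.body);
    }
  } finally {
    await receiver.close();
    await client.close();
  }
}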
I am using the Simple Service Bus from Codeplex and have a handler that provides me with a message and an IMessageContext.
public void Handle(MyEnquiryMessage message, IMessageContext context)
I store both these in a list and let the handler complete. At some point in the future I do some processing and try to send a reply by taking the context that I stored and calling:
context.Endpoint.MessageBus.Reply(myResponse)
Unfortunately this throws an exception “Object reference not set to an instance of an object”. Is this asynchronous way of replying possible or can “reply” only be used within the handler?
I don't know Simple Service Bus but I would guess that your context is only valid in the handler. If you want to send back a response you need to gather all the data you need from the context and simply do a 'send' at that later stage.
Even so, it sounds a bit strange to perform processing 'later' when it could probably be handled in another endpoint that processes a relevant message type. Without more information it is difficult to tell, but your design may not be optimal.
When a Web Role places a message onto a Storage Queue, how can it poll for a specific, correlated response? I would like the back-end Worker Role to place a message onto a response queue, with the intent being that the caller would pick the response up and go from there.
Our intent is to leverage the Queue in order to offload some heavy processing onto the back-end Worker Roles in order to ensure high performance on the Web Roles. However, we do not wish to respond to the HTTP requests until the back-end Workers are finished and have responded.
I am actually in the middle of making a similar decision. In my case I have a WCF service running in a web role which should offload calculations to worker roles. When the result has been computed, the web role will return the answer to the client.
My basic data structure knowledge tells me that I should avoid using something that is designed as a queue in a non-queue way. That means a queue should always be serviced in a FIFO-like manner. So if you use queues for both requests and responses, the threads waiting to return data to the client will have to wait until the calculation message is at the "top" of the response queue, which is not optimal. If you store the responses in Azure tables instead, the threads have to poll for results, creating unnecessary overhead.
What I believe is a possible solution to this problem is using a queue for the requests. This enables use of the competing consumers pattern and thereby load balancing. On messages sent into this queue you set the correlationId property of the message. For the reply, the pub/sub part ("topics") of Azure Service Bus is used together with a correlation filter. When your back-end has processed the request, it publishes a result to a "responseSubject" with the correlationId given in the original request. This response can then be retrieved by your client by calling CreateSubscription (sorry, I can't post more than two links apparently, google it) using that correlation filter, and it should get notified when the answer is published. Notice that the CreateSubscription part should be done just once, in the OnStart method. Then you can do an async BeginReceive on that subscription, and the role will be notified in the given callback when a response to one of its requests is available. The correlationId will tell you which request the response is for. So your last challenge is giving this response back to the thread holding the client connection.
This could be achieved by creating a Dictionary with the correlationIds (probably GUIDs) as keys and responses as values. When your web role gets a request, it creates the GUID, sets it as the correlationId, adds it to the dictionary, fires the message to the queue, and then calls Monitor.Wait() on the GUID object. Then have the receive method invoked by the topic subscription add the response to the dictionary and call Monitor.Notify() on that same GUID object. This awakens your original request thread and you can now return the answer to your client. (Basically you just want your thread to sleep and not consume any resources while waiting.)
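For what it's worth, here is the same correlation idea sketched in TypeScript terms, where you'd park a promise per correlationId instead of blocking a thread (all names here are made up):

import { randomUUID } from "node:crypto";

// correlationId -> resolver for the request that is awaiting a response.
const pending = new Map<string, (response: unknown) => void>();

// Called by the web front end: send the request, then await the reply.
async function sendAndAwaitReply(
  sendToQueue: (correlationId: string, payload: unknown) => Promise<void>,
  payload: unknown,
  timeoutMs = 30000
): Promise<unknown> {
  const correlationId = randomUUID();
  const reply = new Promise<unknown>((resolve, reject) => {
    pending.set(correlationId, resolve);
    setTimeout(() => {
      // Give up if the worker never responds (e.g. it crashed).
      if (pending.delete(correlationId)) reject(new Error("reply timed out"));
    }, timeoutMs);
  });
  await sendToQueue(correlationId, payload);
  return reply;
}

// Called by the subscription's receive callback when a response arrives.
function onReply(correlationId: string, response: unknown): void {
  const resolve = pending.get(correlationId);
  if (resolve) {
    pending.delete(correlationId);
    resolve(response);
  }
}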
The queues on the Azure Service Bus have a lot more capabilities and paradigms, including pub/sub capabilities, which can address issues dealing with queue servicing across multiple instances.
One approach with pub/sub is to have one queue for requests and one for the responses. Each requesting instance would also subscribe to the response queue with a filter on the header such that it would only receive the responses targeted for it. The request message would, of course, contain the value to be placed in the response header to drive the filter.
For a Service Bus based solution, there are samples available for implementing the Request/Response pattern with Queues and Topics (pub/sub).
Let the worker role keep polling and processing the messages. As soon as a message is processed, add an entry to Table storage with the required correlationId (RowKey) and the processing result, before deleting the processed message from the queue.
Then the Web Roles just need to do a lookup in the Table with the desired correlationId (RowKey) and PartitionKey.
Have a look at using SignalR between the worker role and the browser client. Your web role puts a message on the queue, returns a result to the browser (something simple like 'waiting...'), and the worker role notifies the browser via SignalR. That way your web role carries on doing other stuff and doesn't have to wait for a result from asynchronous processing; only the browser needs to.
There is nothing intrinsic to Windows Azure queues that does what you are asking. However, you could build this yourself fairly easily. Include a message ID (GUID) in your push to the queue, and when processing is complete, have the worker push a new message with that message ID into a response channel queue. Your web app can poll this queue to determine when processing has completed for a given command.
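As a sketch of that polling loop with the @azure/storage-queue package (queue name and message shape are assumptions; note the shared-response-queue caveat discussed in another answer, since receiving temporarily hides messages that belong to other callers):

import { QueueServiceClient } from "@azure/storage-queue";

async function awaitResponse(connectionString: string, correlationId: string, timeoutMs = 60000): Promise<string> {
  const service = QueueServiceClient.fromConnectionString(connectionString);
  const responses = service.getQueueClient("responses");
  const deadline = Date.now() + timeoutMs;

  while (Date.now() < deadline) {
    const batch = await responses.receiveMessages({ numberOfMessages: 32 });
    for (const msg of batch.receivedMessageItems) {
      const body = JSON.parse(msg.messageText);
      if (body.correlationId === correlationId) {
        // Found our reply: remove it from the queue and return the payload.
        await responses.deleteMessage(msg.messageId, msg.popReceipt);
        return body.result;
      }
      // Not ours: leave it; its visibility timeout will expire and its
      // owner can receive it again.
    }
    await new Promise((r) => setTimeout(r, 1000)); // back off before re-polling
  }
  throw new Error("timed out waiting for response");
}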
We have done something similar and are looking to use something like SignalR to help reply back to the client when commands are completed.