Azure ServiceBus Queues -- taking a long time to get local messages?

I'm working through a basic tutorial on the ServiceBus. A web role adds objects to a ServiceBus Queue, while a worker role reads those messages off the queue and marks them complete. This is all within the local environment (compute emulator).
It seems like it should be incredibly simple, but I'm seeing the following behavior:
The call QueueClient.Receive() is always timing out.
Some messages are just hanging out in the queue and are not being picked up by the worker.
What could be going on? How can I debug the state of these messages?

You can check the length of the queue from the portal, or by looking at the MessageCount property.
Another possibility is that the messages have been dead-lettered. You can read from the dead-letter subqueue, as in the sketch below.
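For example, something along these lines (a minimal sketch assuming the Microsoft.ServiceBus.Messaging SDK; the connection string and queue name are placeholders):

using System;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class QueueInspector
{
    static void Main()
    {
        string connectionString = "Endpoint=sb://...";  // placeholder namespace connection string
        string queuePath = "myqueue";                   // placeholder queue name

        // Check how many messages are sitting in the queue.
        var namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
        Console.WriteLine("MessageCount: {0}", namespaceManager.GetQueue(queuePath).MessageCount);

        // Drain the dead-letter subqueue to see if messages ended up there.
        string deadLetterPath = QueueClient.FormatDeadLetterPath(queuePath);
        var deadLetterClient = QueueClient.CreateFromConnectionString(connectionString, deadLetterPath);

        BrokeredMessage message;
        while ((message = deadLetterClient.Receive(TimeSpan.FromSeconds(5))) != null)
        {
            // The DeadLetterReason / DeadLetterErrorDescription properties explain why.
            Console.WriteLine("Dead-lettered: {0}", message.MessageId);
            message.Complete();
        }
    }
}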

First of all, please make sure you indeed have some messages in the queue. I would suggest running the completed solution from this tutorial: http://msdn.microsoft.com/en-us/WAZPlatformTrainingCourse_ServiceBusMessaging. If that works fine, compare your code with the sample code. If that doesn't work either, it is likely a configuration or network issue. In that case, check whether you have properly configured the Service Bus account, and whether you're able to access the internet from your machine.
Best Regards,
Ming Xu.

Related

AddMessage in batch process for a queue

I want to add a lot of records to an Azure queue.
I don't want to add them one by one; I would like to do it as a batch process, so that I know whether something went wrong, because I need to run a rollback process if it did.
For example: run a batch of 50, and if the queue receives all 50 records, get a success; if something went wrong, get that information.
I know I can include records in a table in a batch way with this command:
cloudTable.ExecuteBatchAsync(tableBatchOperation);
I also saw a way on the internet to do batch processing for queues, but I think that post is about performance rather than about whether the batch succeeded or not.
Any ideas? Any magic library?
AFAIK, it's not possible to send messages to a storage queue in a batch.
Azure Service Bus on the other hand supports this functionality. You might want to look into it if batching is important for you.
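To illustrate, here is a minimal sketch of batched sending with the Microsoft.ServiceBus.Messaging SDK (the connection string and queue name are placeholders); as I understand it, SendBatchAsync either enqueues the whole batch or throws, which gives you the per-batch success/failure signal you are after:

using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.ServiceBus.Messaging;

class BatchSender
{
    static async Task SendFiftyAsync()
    {
        var client = QueueClient.CreateFromConnectionString("Endpoint=sb://...", "myqueue");

        // Build a batch of 50 messages.
        var batch = Enumerable.Range(0, 50)
                              .Select(i => new BrokeredMessage("record " + i))
                              .ToList();
        try
        {
            await client.SendBatchAsync(batch);   // all 50 sent in one operation
            Console.WriteLine("Batch succeeded.");
        }
        catch (Exception ex)
        {
            Console.WriteLine("Batch failed: {0}", ex.Message);  // trigger your rollback here
        }
    }
}

Note that the whole batch counts against the maximum message size, so a very large batch may need to be split.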

Azure worker role instance got stuck

I have a continuously running worker role that executes multiple jobs. The jobs process queue messages. Normally, if there is an exception or any problem, the job fails, the queued message goes back into the queue, and the job tries to reprocess it.
But since last month I have been facing a weird issue: no messages had been processed in the past day or so. I investigated on the Azure Portal and saw that the worker role instance still had a "running" status. For some reason, the job did not time out or quit, but all the messages were sitting in the queue, unprocessed.
There were also no logs or exceptions/errors thrown (I have a decent amount of logging and exception handling in the method).
I restarted the worker role via the Azure Portal, and once that happened, all of the backed up queue messages began processing immediately.
Can anyone help with the solutions or suggestions to handle this case?
RDP to the VM and troubleshoot it just like you would troubleshoot it on-prem. What do performance counters show you? Is your process (or any other) consuming CPU? Anything in the event logs? Take a hang dump of WaWorkerHost.exe and check the callstacks to see what your code is doing or if it is stuck in something like a deadlock or infinite loop.
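For the hang dump, if the Sysinternals ProcDump tool is on the VM, one command produces a full dump you can open in WinDbg or Visual Studio to inspect the callstacks:

procdump -ma WaWorkerHost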
You can also check the guest agent and host bootstrapper logs (see https://blogs.msdn.microsoft.com/kwill/2013/08/09/windows-azure-paas-compute-diagnostics-data/), but since you said the portal was reporting that the instance was in the Ready state, I don't think you will find anything there. It sounds like 'Azure' (the role host processes) is working fine and it is something within WaWorkerHost.exe (your code) that is the problem.

Azure ServiceBus Queue Timing Out

Encountering a strange issue with one of our queues (for production, no less). When I try to put a message onto the queue, it's throwing an exception that simply states:
A timeout has occurred during the operation
The messages do seem to be making it onto the queue, as evidenced by the fact that I can see the queue length increasing in the management portal. However, the client application is not receiving any messages.
The management portal shows that there have been several failed requests, and also several internal server exceptions; though unfortunately I don't see any way to get more details about those failed requests and errors.
I'm somewhat at a loss as to what may have caused this, how to get more information about what's wrong, and how to move ahead in troubleshooting this. Any help would be greatly appreciated.
Edit: I should mention, just for completeness' sake, that I did not make any changes to the clients that I'm aware of; this issue just sort of started happening all of a sudden.
Edit #2: I woke up this morning and things have magically returned to normal. Still not sure what happened, so I'd like to change the tone of the question to solicit suggestions as to how this kind of thing may be mitigated and/or troubleshooted (troubleshot? troubleshat? :) ) better.
I have experienced this scenario too. When I created a new Service Bus namespace and pointed my app to it, it worked for me. This suggests that it might be some hardware failure going on (on the node where your sb-namespace resides).
Be sure to use transient failure handling, for example http://www.nuget.org/packages/EnterpriseLibrary.WindowsAzure.TransientFaultHandling/ (a sketch follows below).
But you might also need a "second-level retry" for errors that are not transient; that you have to code yourself.
To be more fault-tolerant you can also use the new paired namespaces feature. Here is a good resource: http://msdn.microsoft.com/en-us/library/dn292562.aspx
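A minimal sketch of the transient fault handling block mentioned above (exact namespaces may vary slightly by package version; connection string and queue name are placeholders):

using System;
using Microsoft.Practices.TransientFaultHandling;
using Microsoft.Practices.EnterpriseLibrary.WindowsAzure.TransientFaultHandling.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class ReliableSender
{
    static void Main()
    {
        var client = QueueClient.CreateFromConnectionString("Endpoint=sb://...", "myqueue");

        // Retry up to 5 times with exponential back-off (1 s min, 30 s max, 2 s delta)
        // for errors the strategy classifies as transient (timeouts, throttling, ...).
        var retryPolicy = new RetryPolicy<ServiceBusTransientErrorDetectionStrategy>(
            new ExponentialBackoff(5, TimeSpan.FromSeconds(1),
                                   TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(2)));

        retryPolicy.ExecuteAction(() => client.Send(new BrokeredMessage("hello")));
    }
}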
Hth
//Peter

Deleting dead topics in Azure Service Bus

I've tried to do my homework on this issue but no searches I can make have gotten me closer to the answer. Closest hit was Detect and Delete Orphaned Queues, Topics, or Subscriptions on Azure Service Bus.
My scenario:
I have multiple services running (standard win service). At startup these processes starts to subscribe to a given topic in Azure Service Bus. Let's call the topic "Messages".
When the service is shut down it unsubscribes in a nice way.
But sometimes stuff happens and the service crashes, causing the unsubscription to fail and the subscription then is left hanging.
My questions:
1) From what I'm seeing, each dead subscription still gets a copy when a message is sent to the topic, even if no one is ever going to pick it up. Fact or fiction?
2) Is there any way to remove subscriptions that haven't been checked for a while, for example in the last 24h? Preferably via a PowerShell script?
I've raised this issue directly with Microsoft but haven't received any answer yet. Surely, I can't be the first to experience this. I'll also update this if I get any third party info.
Thanks
Johan
In the Azure SDK 2.0 release we have addressed this scenario with the AutoDeleteOnIdle feature. It lets you set a timespan on a Queue/Topic/Subscription, and when no activity is detected for the specified duration, the entity is automatically deleted. See details here, and the property to set is here.
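For example (a minimal sketch with NamespaceManager; the connection string and subscription name are placeholders for the questioner's setup):

using System;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class SubscriptionSetup
{
    static void Main()
    {
        var ns = NamespaceManager.CreateFromConnectionString("Endpoint=sb://...");

        var description = new SubscriptionDescription("Messages", "my-service-subscription")
        {
            // If nothing touches the subscription for 24 hours, delete it automatically,
            // so a crashed service cannot leave it hanging forever.
            AutoDeleteOnIdle = TimeSpan.FromHours(24)
        };

        if (!ns.SubscriptionExists(description.TopicPath, description.Name))
            ns.CreateSubscription(description);
    }
}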
On your question 1): yes, messages sent to a topic are delivered to every matching subscription, even one that is idle (by your own definition). A subscription is a permanent artifact that you create and that is open to receive messages, even when no service is dequeuing them.
To clean out subscriptions, you can probably use the AccessedAt property of the SubscriptionDescription to check when someone last read from the subscription (via a Receive operation).
http://msdn.microsoft.com/en-us/library/microsoft.servicebus.messaging.subscriptiondescription.accessedat.aspx
If you use that logic, you can build your own 'cleansing' mechanism, along the lines of the sketch below.
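A minimal sketch of such a cleansing pass (the same calls could equally be scripted from PowerShell against the client assembly; the 24h cutoff and topic name are assumptions):

using System;
using Microsoft.ServiceBus;

class SubscriptionCleaner
{
    static void Main()
    {
        var ns = NamespaceManager.CreateFromConnectionString("Endpoint=sb://...");
        var cutoff = DateTime.UtcNow.AddHours(-24);

        foreach (var sub in ns.GetSubscriptions("Messages"))
        {
            // AccessedAt reflects the last time the subscription was read from.
            if (sub.AccessedAt < cutoff)
            {
                Console.WriteLine("Deleting idle subscription: {0}", sub.Name);
                ns.DeleteSubscription("Messages", sub.Name);
            }
        }
    }
}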
HTH

WF4 Affinity on Windows Azure and other NLB environments

I'm using Windows Azure and WF4, and my workflow service is hosted in a web role (with N instances). My job now is to find out how to implement affinity, in a way that I can send messages to the right workflow instance. To explain the scenario: my workflow (attached) starts with a "StartWorkflow" Receive activity, creates 3 "Person" objects and, in a parallel for-each, waits for the confirmation of these 3 people ("ConfirmCreation" Receive activity).
I then started researching how affinity is handled in other NLB environments (mainly looking for information about how this works on Windows Server AppFabric), but I didn't find a precise answer. So how is it done in other NLB environments?
My next task is to find out how I could implement a system to handle this affinity on Windows Azure, and how much this solution would cost (in price, time and amount of work), to see whether it's viable or whether it's better to work with only one web-role instance while we wait for the WF4 host for the Azure AppFabric. The only way I found was to persist the workflow instance. Are there other ways of doing this?
My third, but not last, task is to find out how WF4 handles multiple messages received at the same time. In my scenario, this means: how would it handle the case where the 3 people confirm at the same time and the confirmation messages are also received at the same time? Since the most logical answer to this problem seems to be a queue, I started looking for information about queues in WF4 and found people talking about MSMQ. But what is WF4's native message handling system? Is this handler really a queue, or is it another system? How is this concurrency handled?
You shouldn't need any affinity. In fact, that's kinda the whole point of durable workflows. Whilst your workflow is waiting for this confirmation, it should be persisted and unloaded from any one server.
As far as persistence goes for Windows Azure you would either need to hack the standard SQL persistence scripts so that they work on SQL Azure or write your own InstanceStore implementation that sits on top of Azure Storage. We have done the latter for a workflow we're running in Azure, but I'm unable to share the code. On a scale of 1 to 10 for effort, I'd rank it around an 8.
As far as multiple messages, what will happen is the messages will be received and delivered to the workflow instance one message at a time. Now, it's possible that every one of those messages goes to the same server or maybe each one goes to a diff. server. No matter how it happens, the workflow runtime will attempt to load the workflow from the instance store, see that it is currently locked and block/retry until the workflow becomes available to process the next message. So you don't have to worry about concurrent access to the same workflow instance as long as you configure everything correctly and the InstanceStore implementation is doing its job.
Here's a few other suggestions:
Make sure you use the PersistBeforeSend option on your SendReply activities
Configure the following workflow service options
<workflowIdle timeToUnload="00:00:00" />
<sqlWorkflowInstanceStore ... instanceLockedExceptionAction="AggressiveRetry" />
Using the out-of-the-box SQL instance store with SQL Azure is a bit of a problem at the moment with the Azure 1.3 SDK, as each deployment, even with zero code changes, results in a new service deployment, meaning that already-persisted workflows can't continue. That is a bug that will be solved, but it's a PITA for now.
As Drew said your workflow instance should just move from server to server as needed, no need to pin it to a specific machine. And even if you could that would hurt scalability and reliability so something to be avoided.
Sending messages through MSMQ using the WCF NetMsmqBinding works just fine. Internally, WF uses a completely different mechanism called bookmarks that allows a workflow to stop and resume. Each Receive activity, as well as others like Delay, will create a bookmark and wait for it to be resumed. You can only resume existing bookmarks. Even resuming a bookmark is not a direct action: it is put into an internal queue (not MSMQ) by the workflow scheduler and executed through a SynchronizationContext. You get no control over the scheduler, but you can replace the SynchronizationContext when using the WorkflowApplication, and so get some control over how and where activities are executed.
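To make that concrete, here is a minimal sketch of resuming a bookmark through WorkflowApplication (the bookmark name comes from the question's scenario; the instance store connection string and workflow definition are placeholders):

using System;
using System.Activities;
using System.Activities.DurableInstancing;
using System.Threading;

class ConfirmationHost
{
    static void ResumeConfirmation(Activity workflowDefinition, Guid instanceId, object confirmation)
    {
        var app = new WorkflowApplication(workflowDefinition)
        {
            // Shared instance store so any server in the NLB farm can load the instance.
            InstanceStore = new SqlWorkflowInstanceStore("<persistence connection string>"),
            // Replacing the SynchronizationContext gives some control over
            // how and where activities are executed.
            SynchronizationContext = SynchronizationContext.Current
        };

        // Load the persisted instance from the shared store...
        app.Load(instanceId);

        // ...and resume the bookmark created by the "ConfirmCreation" Receive activity.
        app.ResumeBookmark("ConfirmCreation", confirmation);
    }
}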