ServiceStack Message queue .outq max size is 100? - servicestack

I'm setting up a message queue using ServiceStack-v3 that looks like this
ClaimImport -> Validation -> Success
I've added hundreds of ClaimImports with no problem, the .inq count is correct. The issue is I want to see how many claims were imported by checking the ClaimsImport.outq. It never seems to go past 101. Is there some other way I could check this, or is this max limit intentional?

This is the default limit added on RedisMessageQueueClient.MaxSuccessQueueSize. The purpose of the .outq is to be a rolling log of recently processed messages.
Clients can subscribe to the QueueNames.TopicOut to get notified when a message is published to the .outq.

Related

Azure servicebus ReceiveBatch only returns 2 messages

I am trying to periodically receive all messages in a servicebus queue. But when I call ReceiveBatch(1000) I max get 2 messages back.
This question is kind of related to this question, except he would get a lot more by calling ReceiveBatch multiple times, I do not.
How do I get all messages on a servicebus queue?
The name ReceiveBatch(maximumNumber) is somewhat misleading. You don't get a batch, you get a collection of up-to maximum number of messages. This means you can receive less than maximuNumber as well. If you wish to receive a specific amount, you'd need to loop through the receiving operation until you get that number of messages (and potentially slightly more).

Google PubSub: drop nacked message after n retries

Is there way to configure pull subscription in the way that messages which caused error and were nacked, were re-queued (and so that redelivered) no more than n times?
Ideally on the last processing if it also failed I would like to handle this case (for example, log that this message is given up to process and will be dropped).
Or probably it's possible to find out, how much times received message was tried to be processed before?
I use node.js. I can see a lot of different options in the source code by am not sure how should I achieve desired behaviour.
Cloud Pub/Sub supports Dead Letter Queues that can be used to drop nacked messages after a configurable number of retries.
Currently, there is no way in Google Cloud Pub/Sub to automatically drop messages that were redelivered some designated number of times. The message will stop being delivered once the retention deadline has passed for that message (by default, seven days). Likewise, Pub/Sub does not keep track of or report the number of times a message was delivered.
If you want to handle these kinds of messages, you'd need to maintain a persistent storage keyed by message ID that you could use to keep track of the delivery count. If the delivery count exceeds your desired threshold, you could write the message to a separate topic that you use as a dead letter queue and then acknowledge original message.

Docusign Bulk Send Api Queue/Sent/Failed count is not accurate

Recently I am heavily dealing with Docusign Api. Especially Bulk Send Rest api method since we have requirements to send 30K envelope in 3-4 hours. Given the Api Rule Limits, led us to leverage bulk send feature.
Since Bulk send has some limitation, like it has its own queue mechanism where queue size can not exceed 2000, I am implementing my solution by respecting to this limit.
To do that, I divided my bulk recipient file (30 K recipient) into 30 CSV file.
Then I initiate for each loop and inside the loop I am controlling queue size if the queued item count became 0 for the batch. During my many tests even though all email has been reached to my inbox, I have never seen queued property to become 0. If it would become 0, then I would send the next batch. But I could never do that
Below is the screenshot I took from ApiExplorer.
If I look deeper for Trial 3 to see what are those 24 queued items as seen below.
I am getting following response.
As you can see from the latest screenshot, even though queued property indicates that there are some pending items, resultSetSize property shows 0 although I just queried queued items.
For this reason I am not able to build my logic based on sent, queued, failed property value. I thought, I could rely on them to successfully build my logic. If not, how can I overcome this problem ? Any help would be appreciated.
Thank you in advance

How does Azure Service Bus identify a duplicate message?

I understand that Azure Service Bus has a duplicate message detection feature which will remove messages it believes are duplicates of other messages. I'd like to use this feature to help protect against some duplicate delivery.
What I'm curious about is how the service determines two messages are actually duplicates:
What properties of the message are considered?
Is the content of the message considered?
If I send two messages with the same content, but different message properties, are they considered duplicates?
The duplicate detection is looking at the MessageId property of the brokered message. So, if you set the message Id to something that should be unique per message coming in the duplicate detection can catch it. As far as I know only the message Id is used for detection. The contents of the message are NOT looked at, so if you have two messages sent that have the same actual content, but have different message IDs they will not be detected as duplicate.
References:
MSDN Documentation: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-queues-topics-subscriptions
If the scenario cannot tolerate duplicate processing, then additional
logic is required in the application to detect duplicates which can be
achieved based upon the MessageId property of the message which will
remain constant across delivery attempts. This is known as Exactly
Once processing.
There is also a Brokered Message Duplication Detection code sample on WindowsAzure.com that should be exactly what you are looking for as far as proving it out.
I also quickly tested this out and sent in 5 messages to a queue with RequiresDuplicateDetection set to true, all with the exact same content but different MessageIds. I then retrieved all five messages. I then did the reverse where I had matching MessageIds but different payloads, and only one message was retrieved.
In my case I have to apply ScheduledEnqueueTimeUtc on top of MessageId.
Because most of the time the first message already got pickup by worker, before the sub-sequence duplicate message were arrive in the Queue.
By adding ScheduledEnqueueTimeUtc. We tell the Service bus to hold on the the message for some time before letting worker them up.
var message = new BrokeredMessage(json)
{
MessageId = GetMessageId(input, extra)
};
// Delay 30 seconds for Message to process
// So that Duplication Detection Engine has enought time to reject duplicated message
message.ScheduledEnqueueTimeUtc = DateTime.UtcNow.AddSeconds(30);
Another important property to be considered while dealing with 'RequiresDuplicateDetection' property of a Azure Service Bus entity is 'DuplicateDetectionHistoryTimeWindow', the time frame within which message with duplicate message id will be rejected.
Default value of duplicate detection time history now is 30 seconds, the value can range between 20 seconds and 7 days.
Enabling duplicate detection helps keep track of the application-controlled MessageId of all messages sent into a queue or topic during a specified time window. If any new message is sent carrying a MessageId that has already been logged during the time window, the message is reported as accepted (the send operation succeeds), but the newly sent message is instantly ignored and dropped. No other parts of the message other than the MessageId are considered.

Windows Azure staging <--> production causing conflicts & errors on table storage

We had a terrible problem/experience yesterday when trying to swap our staging <--> production role.
Here is our setup:
We have a workerrole picking up messages from the queue. These messages are processed on the role. (Table Storage inserts, db selects etc ). This can take maybe 1-3 seconds per queue message depending on how many table storage posts he needs to make. He will delete the message when everything is finished.
Problem when swapping:
When our staging project went online our production workerrole started erroring.
When the role wanted to process queue messsage it gave a constant stream of 'EntityAlreadyExists' errors. Because of these errors queue messages weren't getting deleted. This caused the queue messages to be put back in the queue and back to processing and so on....
When looking inside these queue messages and analysing what would happend with them we saw they were actually processed but not deleted.
The problem wasn't over when deleting these faulty messages. Newly queue messages weren't processed as well while these weren't processed yet and no table storage records were added, which sounds very strange.
When deleting both staging and producting and publishing to production again everything started to work just fine.
Possible problem(s)?
We have litle 2 no idea what happened actually.
Maybe both the roles picked up the same messages and one did the post and one errored?
...???
Possible solution(s)?
We have some idea's on how to solve this 'problem'.
Make a poison message fail over system? When the dequeue count gets over X we should just delete that queue message or place it into a separate 'poisonqueue'.
Catch the EntityAlreadyExists error and just delete that queue message or put it in a separate queue.
...????
Multiple roles
I suppose we will have the same problem when putting up multiple roles?
Many thanks.
EDIT 24/02/2012 - Extra information
We actually use the GetMessage()
Every item in the queue is unique and will generate unique messages in table Storage. Little more information about the process: A user posts something and will have to be distributed to certain other users. The message generate from that user will have a unique Id (guid). This message will be posted into the queue and picked up by the worker role. The message is distributed over several other tables (partitionkey -> UserId, rowkey -> Some timestamp in ticks & the unique message id. So there is almost no chance the same messages will be posted in a normal situation.
The invisibility time out COULD be a logical explanation because some messages could be distributed to like 10-20 tables. This means 10-20 insert without the batch option. Can you set or expand this invisibility time out?
Not deleting the queue message because of an exception COULD be a explanation as well because we didn't implement any poison message fail over YET ;).
Regardless of the Staging vs. Production issue, having a mechanism that handles poison messages is critical. We've implemented an abstraction layer over Azure queues that automatically moves messages over to a poison queue once they've been attempted to be processed some configurable amount of times.
You clearly have a fault on handling double messages. The fact that your ID is unique doesn't mean that the message will not be processed twice in some occasions like:
The role dying and with partially finished work, so the message will re-appear for processing in the queue
The role crashing unexpected, so the message ends up back in the queue
The FC migrating moving your role and you don't have code to handle this situation, so the message ends up back in the queue
In all cases, you need code that handles the fact that the message will re-appear. One way is to use the DequeueCount property and check how many times the message was removed from a Queue and received for processing. Make sure you have code that handles partial processing of a message.
Now what probably happened during swapping was, when the production environment became the staging and staging became production, both of them were trying to receive the same messages so they were basically competing each other fro those messages, which is probably not bad because this is a known pattern to work anyway but when you killed your old production (staging) every message that was received for processing and wasn't finished, ended up back in the Queue and your new production environment picked the message for processing again. Having no code logic to handle this scenario and a message was that partially processed, some records in the tables existed and it started causing the behavior you noticed.
There are a few possible causes:
How are you reading the queue messages? If you are doing a Peek Message then the message will still be visible to be picked up by another role instance (or your staging environment) before the message is deleted. You want to make sure you are using Get Message so the message is invisible until it can be deleted.
Is it possible that your first role crashed after doing the work for the message but prior to deleting the message? This would cause the message to become visible again and get picked up by another role instance. At that point the message will be a poison message which will cause your instances to constantly crash.
This problem almost certainly has nothing to do with Staging vs Production, but is most likely caused by having multiple instances reading from the same queue. You can probably reproduce the same problem by specifying 2 instances, or by deploying the same code to 2 different production services, or by running the code locally on your dev machine (still pointing to Azure storage) using 2 instances.
In general you do need to handle poison messages so you need to implement that logic anyways, but I would suggest getting to the root cause of this problem first, otherwise you are just going to run into a lot more problems later on.
With queues you need to code with idempotency in mind and expect and handle the ‘EntityAlreadyExists’ as a viable response.
As others have suggested, causes could be
Multiple message in the queue with the same identifier.
Are peeking for the message and not reading it form the queue and so not making them invisible.
Not deleting the message because an exception was thrown before you can delete them.
Taking too long to process the message so it cannot be deleted (because invisibility was timed out) and appears again
Without looking at the code I am guessing that it is either the 3 or 4 option that is occurring.
If you cannot detect the issue with a code review, you may consider adding time based logging and try/catch wrappers to get a better understanding.
Using queues effectively, in a multi-role environment, requires a slightly different mindset and running into such issues early is actually a blessing in disguise.
Appended 2/24
Just to clarify, modifying the invisibility time out is not a generic solution to this type of problem. Also, note that this feature although available on the REST API, may not be available on the queue client.
Other options involve writing to table storage in an asynchronous manner to speed up your processing time, but again this is a stop gap measures which does not really address the underlying paradigm of working with queues.
So, the bottom line is to be idempotent. You can try using the table storage upsert (update or insert) feature to avoid getting the ‘EntitiyAlreadyExists’ error, if that works for your code. If all you are doing is inserting new entities to azure table storage then the upsert should solve your problem with minimal code change.
If you are doing updates then it is a different ball game all together. One pattern is to pair updates with dummy inserts in the same table with the same partition key so as to error out if the update occurred previously and so skip the update. Later after the message is deleted, you can delete the dummy inserts. However, all this adds to the complexity, so it is much better to revisit the architecture of the product; for example, do you really need to insert/update into so many tables?
Without knowing what your worker role is actually doing I'm taking a guess here, but it sounds like when you have two instances of your worker role running you are getting conflicts while trying to write to an Azure table. It is likely to be because you have code that looks something like this:
var queueMessage = GetNextMessageFromQueue();
Foo myFoo = GetFooFromTableStorage(queueMessage.FooId);
if (myFoo == null)
{
myFoo = new Foo {
PartitionKey = queueMessage.FooId
};
AddFooToTableStorage(myFoo);
}
DeleteMessageFromQueue(queueMessage);
If you have two adjacent messages in the queue with the same FooId it is quite likely that you'll end up with both of the instances checking to see if the Foo exists, not finding it then trying to create it. Whichever instance is the last to try and save the item will get the "Entity already exists" error. Because it errored it never gets to the delete message part of the code and therefore it becomes visible back on the queue after a period of time.
As others have said, dealing with poison messages is a really good idea.
Update 27/02
If it's not subsequent messages (which based on your partition/row key scheme I would say it's unlikely), then my next bet would be it's the same message appearing back in the queue after the visibility timeout. By default if you're using .GetMessage() the timeout is 30 seconds. It has an overload which allows you to specify how long that time frame is. There is also the .UpdateMessage() function that allows you to update that timeout as you're processing the message. For example you could set the initial visibility to 1 minute, then if you're still processing the message 50 seconds later, extent it for another minute.

Resources