Azure queues getMessages method in sdk not working as expected

I have created a queue in Azure Queue storage and enqueued two items in it. Using the Node.js SDK, I create a timer that fires every 5 seconds and calls:
azure.createQueueService("precondevqueues", "<key>").getMessages(queueName, {numOfMessages : 1, visibilityTimeout: 1 }, callback)
I expect the same one of the two messages in the queue to show up every 5 seconds, but that does not seem to be the case. The output of this call alternates between the two messages.
This should not happen, since visibilityTimeout is set to 1: after 1 second, the message dequeued in the first call should be visible again before the next getMessages call is made.

As noted here, FIFO ordering is not guaranteed. Messages may be fetched in FIFO order most of the time, but that is not guaranteed, and Azure can give you the messages in whatever order is best for their implementation.
Messages are generally added to the end of the queue and retrieved from the front of the queue, although first in, first out (FIFO) behavior is not guaranteed.

Aha, my mistake! I read the getMessages documentation again very carefully and realized that getMessages dequeues the message but retains an invisible copy outside of the queue. If the message processor does not delete the message before the visibility timeout expires, the copy is re-enqueued, and it therefore goes to the end of the queue.
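To see why the output alternates, here is a toy in-memory model of the behaviour described above. It is only a sketch of that explanation, not the Azure SDK: the ToyQueue class, its method names, and the simulated clock are all made up for illustration.

```javascript
// Toy model only: NOT the Azure SDK. ToyQueue and its methods are hypothetical.
class ToyQueue {
  constructor(items) {
    this.visible = [...items]; // messages that can be dequeued, front first
    this.hidden = [];          // { body, visibleAt } copies awaiting delete or timeout
  }

  // Put copies whose visibility timeout has expired back at the END of the queue.
  requeueExpired(now) {
    const stillHidden = [];
    for (const m of this.hidden) {
      if (m.visibleAt <= now) this.visible.push(m.body);
      else stillHidden.push(m);
    }
    this.hidden = stillHidden;
  }

  // Dequeue one message and hide it for visibilityTimeoutSec seconds.
  getMessage(now, visibilityTimeoutSec) {
    this.requeueExpired(now);
    if (this.visible.length === 0) return null;
    const body = this.visible.shift();
    this.hidden.push({ body, visibleAt: now + visibilityTimeoutSec });
    return body;
  }
}

// Poll every 5 simulated seconds with a 1 second visibility timeout, never deleting.
const queue = new ToyQueue(['A', 'B']);
const seen = [];
for (let t = 0; t < 20; t += 5) {
  seen.push(queue.getMessage(t, 1));
}
console.log(seen.join(',')); // A,B,A,B
```

Under this model the message that was read goes to the back when it reappears, so the other message is always at the front on the next poll.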

Related

What good is a QueueTriggerAttribute when you can't dequeue the exact queued message? Or can you?

I'm trying to understand something about the Azure Functions QueueTriggerAttribute for use with Queue Storage. I can see that the QueueTriggerAttribute allows me to bind an Azure Function to a Queue Storage event for when a new item is added to the queue: my Function is called with the contents of that new message.
How is this useful, though?
There is still no way to dequeue that exact Queue Storage item within that triggered Function, right? The best that you could do is just pop whatever the next available item is off of the Queue, which may not be the one that triggered the Function.
I guess in theory 1 single push to the Queue Storage would trigger 1 single Function call in which you could make 1 single pop call. So at the end of the day you can still leverage these triggers to process all of the items in the queue, so long as there are no interruptions or anything else that would cause a trigger to go unhandled, resulting in items stuck in the queue.
Am I missing something here? I'm looking at Queue Storage in conjunction with Azure Functions and a QueueTrigger. I'm trying to conceptualize a queue-driven workflow that executes functions, but I feel like this doesn't seem correct - or I'm not understanding something here.

It seems you are understanding something wrong, but I'm not exactly sure what your confusion is.
When you send 1 single message to a Queue, the Function listening to that queue will fire, and the contents of that message will be passed as input parameter to the code that you wrote.
Your code needn't and shouldn't "pop" anything from the queue explicitly; this is done behind the scenes by the Azure Functions runtime. One invocation will process just that single message and will quit as soon as possible.
Look at this code:
[FunctionName("QueueTrigger")]
public static void QueueTrigger(
    [QueueTrigger("myqueue-items")] string myQueueItem,
    TraceWriter log)
{
    log.Info($"C# function processed: {myQueueItem}");
}
Apart from attributes, no code works with the Queue. It just gets triggered, once per message.

RabbitMQ: how to limit consuming rate

I need to limit the rate of consuming messages from a RabbitMQ queue.
I have found many suggestions, but most of them propose using the prefetch option. That option doesn't do what I need: even if I set prefetch to 1, the rate is about 6000 messages/sec. This is too many for the consumer.
I need to limit it to, for example, about 70 to 200 messages per second. This means consuming one message every 5-14 ms, with no simultaneous messages.
I'm using Node.js with the amqp.node library.

Implementing a token bucket might help:
https://en.wikipedia.org/wiki/Token_bucket
You can write a producer that publishes to the "token bucket queue" at a fixed rate, with a TTL on the message (maybe expiring after a second?), or just set a maximum queue size equal to your rate per second. Consumers that receive a "normal queue" message must also receive a "token bucket queue" message before they may process it, which effectively rate limits the application.
NodeJS + amqplib Example:
var queueName = 'my_token_bucket';
rabbitChannel.assertQueue(queueName, {durable: true, messageTtl: 1000, maxLength: bucket.ratePerSecond});
writeToken();
function writeToken() {
    // Buffer.from replaces the deprecated new Buffer(...) constructor
    rabbitChannel.sendToQueue(queueName, Buffer.from(new Date().toISOString()), {persistent: true});
    setTimeout(writeToken, 1000 / bucket.ratePerSecond);
}
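For comparison, the token bucket idea itself can be sketched in-process, without a broker. This is not the queue-based variant above: the TokenBucket class and its parameters are hypothetical, and time is passed in explicitly so the behaviour is easy to check.

```javascript
// In-process token bucket sketch. TokenBucket is a hypothetical helper,
// not part of amqplib or RabbitMQ.
class TokenBucket {
  constructor(ratePerSecond, capacity, nowMs) {
    this.ratePerSecond = ratePerSecond; // tokens added per second
    this.capacity = capacity;           // maximum tokens the bucket can hold
    this.tokens = capacity;             // start full
    this.lastRefillMs = nowMs;
  }

  // Add the tokens earned since the last refill, capped at capacity.
  refill(nowMs) {
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSecond);
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes a token if one is available, otherwise false.
  tryTake(nowMs) {
    this.refill(nowMs);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// 100 tokens per second, burst of at most 100.
const tokenBucket = new TokenBucket(100, 100, 0);
let acceptedAtStart = 0;
for (let i = 0; i < 250; i++) {
  if (tokenBucket.tryTake(0)) acceptedAtStart++;
}
console.log(acceptedAtStart); // 100: the rest of the burst is rejected
```

A consumer would call tryTake(Date.now()) before processing each message and hold off (for example by delaying the ack) whenever it returns false.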

I've already found a solution.
I use the nanotimer module from npm for timing the delays.
Then I calculate delay = 1 / [messages_per_second], expressed in nanoseconds.
Then I consume messages with prefetch = 1.
Then I calculate the actual delay as delay - [message_processing_time].
Then I wait for that actual delay before sending the ack for the message.
It works perfectly. Thanks to all.
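The delay arithmetic from those steps can be sketched as a small helper. This version works in milliseconds rather than nanoseconds and does not use nanotimer; the function name ackDelayMs is made up for illustration.

```javascript
// Hypothetical helper: how long to wait before acking so that, with prefetch = 1,
// at most messagesPerSecond messages are consumed per second.
function ackDelayMs(messagesPerSecond, processingMs) {
  const intervalMs = 1000 / messagesPerSecond; // time budget per message
  // If processing already used up the budget, ack immediately.
  return Math.max(0, intervalMs - processingMs);
}

console.log(ackDelayMs(100, 4));  // 6
console.log(ackDelayMs(200, 10)); // 0
```

In a consumer, the result would be passed to setTimeout around channel.ack(msg), with the processing time measured per message.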

See 'Fair Dispatch' in RabbitMQ Documentation.
For example in a situation with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work. Well, RabbitMQ doesn't know anything about that and will still dispatch messages evenly.
This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn't look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.
In order to defeat that we can use the prefetch method with the value of 1. This tells RabbitMQ not to give more than one message to a worker at a time. Or, in other words, don't dispatch a new message to a worker until it has processed and acknowledged the previous one. Instead, it will dispatch it to the next worker that is not still busy.

I don't think RabbitMQ can provide you this feature out of the box.
If you have only one consumer, then the whole thing is pretty easy, you just let it sleep between consuming messages.
If you have multiple consumers, I would recommend using some "shared memory" to keep the rate. For example, you might have 10 consumers consuming messages. To keep a rate of 70-200 messages per second across all of them, each consumer makes a call to Redis to see whether it is eligible to process a message. If yes, it updates Redis to show the other consumers that one message is currently in process.
If you have no control over the consumer, then implement option 1 or 2 and publish the message back to Rabbit. This way the original consumer will consume messages at the desired pace.

This is how I fixed mine, with just setTimeout.
I delay each ack by 200 ms, which works out to about 5 messages consumed per second. My handler does an update if the record already exists.
channel.consume(transactionQueueName, async (data) => {
    let dataNew = JSON.parse(data.content);
    const processedTransaction = await seperateATransaction(dataNew);
    // Delay the ack to avoid duplicate entries. Important: do not remove the setTimeout.
    setTimeout(function(){
        channel.ack(data);
    },200);
});
Done

Manually publish messages to dead-letter queue?

Why would someone want to do that? I have to unit-test the exception handling mechanism in our application.
I presumed that the dead-letter queue is literally an Azure Service Bus queue, to which I could publish messages using QueueClient:
string dlQ = @"sb://**.servicebus.windows.net/**/Subscriptions/DefaultSubscription/$DeadLetterQueue";
string connectionString = CloudConfigurationManager.GetSetting("Microsoft.ServiceBus.ConnectionString");
NamespaceManager _namespaceManager = NamespaceManager.CreateFromConnectionString(connectionString);
QueueDescription qd = _namespaceManager.GetQueue(dataPromotionDLQ);
var queueClient = QueueClient.CreateFromConnectionString(connectionString, "DefaultSubscription/$DeadLetterQueue");
BrokeredMessage brokeredMessage = new BrokeredMessage("Message to PublishToDLQ");
try
{
queueClient.Send(brokeredMessage);
}
catch (Exception)
{
}
But I get MessagingEntityNotFoundException. What could be wrong?

You would never want to publish directly to a dead letter queue. It's where poisoned messages that can't be processed are placed.
There are two ways of placing messages onto the dead letter queue. The service bus itself dead-letters messages that have exceeded the maximum number of delivery attempts. You can also explicitly dead-letter a message that you have received using the DeadLetter() method.

Create your messages with a very short TTL via the BrokeredMessage.TimeToLive property.
The Subscription must have EnableDeadLetteringOnMessageExpiration set to true.

Though late here, adding to the answers of @Mikee and @Ben Morris may help someone. You can follow @Mikee's suggestion and use message.DeadLetter() or message.DeadLetterAsync() to dead-letter a message. Another option is to set a very short, or 0 second, TimeToLive so that the messages move to the dead-letter queue.
After you do either of these and try to view the messages in the active queue, you may sometimes find that the message is still available there (which is what you are currently facing). The reason is that messages dead-lettered because of TTLExpiredException, HeaderSizeExceeded or any other system-defined error, as well as messages dead-lettered manually with methods like DeadLetter(), are cleaned up periodically by an asynchronous "garbage collection" program. This does not happen immediately, as we might expect it to.
When you perform a Peek operation, you can still see the message in the active queue. You have to wait for the garbage collector to run, or you can perform a Receive operation, which forces the garbage collector to run first and thereby moves the messages to the dead-letter queue before retrieval is done.

Wait for messages processed by Service Bus OnMessage to finish

I'm using the Azure Service Bus SubscriptionClient.OnMessage method, configured to process up to 5 messages concurrently.
Within the code I need to wait for all messages to finish processing before I can continue (to properly shut down an Azure Worker Role). How do I do this?
Will SubscriptionClient.Close() block until all messages have finished processing?

Calling Close on SubscriptionClient or QueueClient will not block. Calling Close closes off the entity immediately as far as I can tell. I tested quickly using the Worker Role With Service Bus Queue project template that shipped with Windows Azure SDK 2.0. I added a thread sleep of many seconds in the message process action and then shut down the role while it was running. I saw the Close method get called while the messages were processing in their thread sleep, but it certainly did not wait for the message processing to complete; the role simply closed down.
To handle this gracefully you'll need to do the same thing we did when dealing with any worker role that was processing messages (Service Bus, Azure Storage queue or anything else): keep track of what is being worked on and shut down when it is complete. There are several ways to deal with that but all of them are manual and made messy in this case because of the multiple threads involved.
Given the way that OnMessage works, you'll need to add something in the action that checks whether the role has been told to shut down and, if so, does not do any processing. The problem is that when the OnMessage action is executed it already HAS a message. You'd probably need to abandon the message but not exit the OnMessage action, otherwise it will keep getting messages if there are any in the queue. You can't simply abandon the message and let the execution leave the action, because then the system will be handed another message (possibly the same one), and several threads doing this may drive up the dequeue counts until messages get dead-lettered. Also, you can't call Close on the SubscriptionClient or QueueClient, which would stop the receive loop internally, because once you call Close any outstanding message processing will throw an exception when .Complete, .Abandon, etc. is called on the message, since the message entity is now closed. This means you can't stop the incoming messages easily.
The main issue here is that you are using OnMessage and setting up concurrent message handling through MaxConcurrentCalls on the OnMessageOptions, which means the code that starts and manages the threads is buried in the QueueClient and SubscriptionClient, and you don't have control over it. You don't have a way to reduce the number of threads, stop the threads individually, and so on. You'll need to create a way to put the OnMessage action threads into a state where they are aware that the system is being told to shut down, so that they complete their current message and then do not exit the action, in order not to be continuously assigned new messages. This means you'll likely also need to set the OnMessageOptions to not use AutoComplete and manually call Complete in your OnMessage action.
Having to do all of this may severely reduce the actual benefit of using the OnMessage helper. Behind the scenes, OnMessage is simply setting up a loop that calls Receive with the default timeout and hands off messages to another thread to run the action (loose description). So what you get by using the OnMessage approach is freedom from having to write that handler on your own, but the problem you are having arises precisely because you didn't write that handler yourself, so you don't have control over those threads. Catch-22. If you really need to stop gracefully, you may want to step away from the OnMessage approach and write your own Receive loop with threading; within the main loop, stop receiving new messages and wait for all the workers to end.
One option, especially if the messages are idempotent (which means processing them more than once yields the same result, something you should be mindful of anyway), is to do nothing special: if they are stopped in mid-processing, they will simply reappear on the queue to be processed by another instance later. If the work itself isn't resource intensive and the operations are idempotent, then this really can be an option. It is no different from an instance failing due to hardware failure or other issues. Sure, it's not graceful or elegant, but it removes all the complexity I've mentioned, and it is something that can happen anyway due to other failures.
Note that the OnStop is called when an instance is told to shut down. You've got 5 minutes you can delay this until the fabric just shuts it off, so if your messages take longer than five minutes to process it won't really matter if you attempt to shut down gracefully or not, some will be cut off during processing.

You can tweak OnMessageAsync to wait for processing of messages to complete, and block new messages from beginning to be processed:
Here is the implementation:
_subscriptionClient.OnMessageAsync(async message =>
{
if (_stopRequested)
{
// Block processing of new messages. We want to wait for old messages to complete and exit.
await Task.Delay(_waitForExecutionCompletionTimeout);
}
else
{
try
{
// Track executing messages
_activeTaskCollection[message.MessageId] = message;
await messageHandler(message);
await message.CompleteAsync();
}
catch (Exception e)
{
// handle error by disposing or doing nothing to force a retry
}
finally
{
BrokeredMessage savedMessage;
if (!_activeTaskCollection.TryRemove(message.MessageId, out savedMessage))
{
// savedMessage is null when TryRemove fails, so log the id from the original message
_logger.LogWarning("Attempt to remove message id {0} failed.", message.MessageId);
}
}
}
}, onMessageOptions);
And an implementation of Stop that waits for completion:
public async Task Stop()
{
_stopRequested = true;
DateTime startWaitTime = DateTime.UtcNow;
while (DateTime.UtcNow - startWaitTime < _waitForExecutionCompletionTimeout && _activeTaskCollection.Count > 0)
{
await Task.Delay(_waitForExecutionCompletionSleepBetweenIterations);
}
await _subscriptionClient.CloseAsync();
}
Note that _activeTaskCollection is a ConcurrentDictionary. We could also use a counter with Interlocked to count the number of in-progress messages, but using a dictionary lets you investigate what happened more easily in case of errors.
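The counter variant mentioned above is language-neutral. Here is a minimal sketch of it in JavaScript rather than C#, assuming a single-threaded event loop so that no interlocking is needed. The InFlightTracker class and its method names are hypothetical and not part of any Service Bus SDK.

```javascript
// Hypothetical helper: counts in-progress messages and lets a shutdown
// routine wait until they have all finished, or until a timeout elapses.
class InFlightTracker {
  constructor() {
    this.active = 0;            // number of handlers currently running
    this.stopRequested = false; // set once shutdown begins
    this.waiters = [];          // callbacks to run when active reaches 0
  }

  // Call at the start of a handler; returns false once stop was requested.
  tryStart() {
    if (this.stopRequested) return false;
    this.active += 1;
    return true;
  }

  // Call in a finally block when a handler ends, whether it succeeded or not.
  finish() {
    this.active -= 1;
    if (this.active === 0) {
      for (const notify of this.waiters.splice(0)) notify(true);
    }
  }

  // Stop accepting work. Resolves true when drained, false on timeout.
  drain(timeoutMs) {
    this.stopRequested = true;
    if (this.active === 0) return Promise.resolve(true);
    return new Promise((resolve) => {
      const timer = setTimeout(() => resolve(false), timeoutMs);
      this.waiters.push((ok) => { clearTimeout(timer); resolve(ok); });
    });
  }
}
```

A handler would wrap its work in if (tracker.tryStart()) { try { ... } finally { tracker.finish(); } }, and the shutdown path would await tracker.drain(timeout) before closing the client.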

Azure Queue Storage - Mark messages as visible immediately after calling CloudQueue.GetMessages()

Problem:
I am reading messages from an Azure Storage Queue and then inserting them into a Storage Table using a Worker Role.
I want to read in messages but only process them if there are at least 100 (this is to optimize the Storage Table batch insert that follows). If there are fewer than 100 messages, I want to cancel the message processing and make them immediately visible on the queue again for the next queue read.
Question:
Is it possible to mark a message which has just been read by CloudQueue.GetMessages(...) as visible without having to wait for the timeout to expire?
Code: (in WorkerRole.cs)
public override void Run()
{
while (true)
{
var messages = queue.GetMessages(100);
if (messages.Count() >= 100)
{
// This will process, insert into a table, and delete from the queue
ProcessMessages(messages);
}
else
{
//!!! MARK MESSAGES AS VISIBLE ON THE QUEUE
System.Threading.Thread.Sleep(1000);
}
}
}
Thanks

You can check the queue's 'ApproximateMessageCount' property (details here), which will give you a rough idea of how many messages are waiting in the queue.
Also: you can set a message's invisibility timeout to something small (maybe 5-10 seconds?). After that period, the message becomes visible again. You can also modify invisibility timeout to something shorter after you read it.
Just remember that reading from the queue counts as a transaction, as does updating messages (e.g. updating invisibility timeout).
Waiting for 100 messages may be a non-optimal optimization. Oh, and GetMessages() (details here) is limited to 32 messages per call, so it doesn't make sense to wait for 100 from a single call. Also: transactions are really, really cheap (a penny per 100K transactions). I don't necessarily see the value here.
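To make the 32-message limit concrete: collecting 100 messages would take at least four reads. A small sketch of that arithmetic follows; the batchSizes helper is hypothetical and only computes request sizes, it does not call any storage API.

```javascript
// Hypothetical helper: split a desired message count into per-call request
// sizes, given the 32-message cap per read mentioned above.
function batchSizes(total, maxPerCall = 32) {
  const sizes = [];
  for (let remaining = total; remaining > 0; remaining -= maxPerCall) {
    sizes.push(Math.min(remaining, maxPerCall));
  }
  return sizes;
}

console.log(batchSizes(100)); // [ 32, 32, 32, 4 ]
```

Each of those reads counts as its own transaction, which is why reading whatever is available and processing it is usually simpler than holding out for a full batch.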

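Reset the message's visibility timeout to zero, for example with CloudQueue.UpdateMessage(message, TimeSpan.Zero, MessageUpdateFields.Visibility). That will hopefully do the trick.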
