Azure ServiceBus AbandonMessageAsync releasing message at inconsistent times

I need to inspect a dead-letter queue and, if some condition holds (for example, the message is older than 30 days), archive the message to a data store rather than just remove it. The plan is to receive the messages and, for each one, either save it to the store and complete/delete it when it meets the condition, or abandon it when it does not. I have a console app that receives the messages from the DLQ, and it seems to work, but when I run it over and over I see inconsistent results in the number of messages returned. It returns all of them for a few iterations (7 in my example), but then it starts returning only 6, 0, or 1, and eventually goes back to the full number in the DLQ (about 30 seconds later, which I think is the default lock duration for peek lock). I would expect to get all the messages on every run, because I abandoned them on the previous run.
I'm using Azure.Messaging.ServiceBus 7.8.1, and it seems you just pass the message object to the abandon method. Any suggestions would be great!
Code in github: https://github.com/ndn2323/bustest
using Azure.Messaging.ServiceBus;
using System.Text;

namespace BusReceiver
{
    public class TaskRunner
    {
        public TaskRunner() { }

        public async Task Run() {
            const string DLQPATH = "/$deadletterqueue";
            var maxMsgCount = 50;
            var connectionString = "[ConnectionString]";
            var topicName = "testtopic1";
            var subscriberName = "testsub1";
            var subscriberDlqName = subscriberName + DLQPATH;
            var client = new ServiceBusClient(connectionString);
            var options = new ServiceBusReceiverOptions();
            options.ReceiveMode = ServiceBusReceiveMode.PeekLock;
            var receiver = client.CreateReceiver(topicName, subscriberName, options);
            var receiverDlq = client.CreateReceiver(topicName, subscriberDlqName, options);

            Log("Starting receive from regular queue");
            var msgList = await receiver.ReceiveMessagesAsync(maxMsgCount, TimeSpan.FromMilliseconds(500));
            Log(msgList.Count.ToString() + " messages found");
            foreach (var msg in msgList)
            {
                await receiver.DeadLetterMessageAsync(msg);
            }

            Log("Starting receive from dead letter queue");
            var msgListDlq = await receiverDlq.ReceiveMessagesAsync(maxMsgCount, TimeSpan.FromMilliseconds(500));
            Log(msgListDlq.Count.ToString() + " messages found in dlq");
            foreach (var msg in msgListDlq) {
                Log("MessageId: " + msg.MessageId + " Body: " + Encoding.ASCII.GetString(msg.Body));
                // if some condition, archive message to some data store, else abandon it to be picked up again
                // for this test I'm abandoning all messages
                await receiverDlq.AbandonMessageAsync(msg);
            }

            await receiver.CloseAsync();
            await receiverDlq.CloseAsync();
        }

        private void Log(string msg) {
            Console.WriteLine(DateTime.Now.ToString() + ": " + msg);
        }
    }
}
Example of output:
C:\GitHub\ndn2323\bustest\BusReceiver\bin\Debug\net6.0>BusReceiver.exe
5/29/2022 11:45:36 PM: Starting receive from regular queue
5/29/2022 11:45:37 PM: 0 messages found
5/29/2022 11:45:37 PM: Starting receive from dead letter queue
5/29/2022 11:45:37 PM: 7 messages found in dlq
5/29/2022 11:45:37 PM: MessageId: 9e9f390655af44a8b93866920a6de77c Body: TestMessage
5/29/2022 11:45:37 PM: MessageId: 3aacffe40ab5473fb34412684bcd1907 Body: TestMessage
5/29/2022 11:45:37 PM: MessageId: a47f83d4a12845088ade427e084d8e39 Body: TestMessage
5/29/2022 11:45:37 PM: MessageId: 47ff6dd4f4134661a3616a9210670be5 Body: TestMessage
5/29/2022 11:45:37 PM: MessageId: d10b3602f57047f1bf613675e35793e0 Body: TestMessage
5/29/2022 11:45:37 PM: MessageId: 08a45405375e46ffb99db9812c3e3d78 Body: TestMessage
5/29/2022 11:45:37 PM: MessageId: d21cff4ae5b6453f9077b3805ace4e09 Body: TestMessage
C:\GitHub\ndn2323\bustest\BusReceiver\bin\Debug\net6.0>BusReceiver.exe
5/29/2022 11:45:42 PM: Starting receive from regular queue
5/29/2022 11:45:43 PM: 0 messages found
5/29/2022 11:45:43 PM: Starting receive from dead letter queue
5/29/2022 11:45:43 PM: 7 messages found in dlq
5/29/2022 11:45:43 PM: MessageId: 9e9f390655af44a8b93866920a6de77c Body: TestMessage
5/29/2022 11:45:43 PM: MessageId: 3aacffe40ab5473fb34412684bcd1907 Body: TestMessage
5/29/2022 11:45:43 PM: MessageId: a47f83d4a12845088ade427e084d8e39 Body: TestMessage
5/29/2022 11:45:43 PM: MessageId: 47ff6dd4f4134661a3616a9210670be5 Body: TestMessage
5/29/2022 11:45:43 PM: MessageId: d10b3602f57047f1bf613675e35793e0 Body: TestMessage
5/29/2022 11:45:43 PM: MessageId: 08a45405375e46ffb99db9812c3e3d78 Body: TestMessage
5/29/2022 11:45:43 PM: MessageId: d21cff4ae5b6453f9077b3805ace4e09 Body: TestMessage
C:\GitHub\ndn2323\bustest\BusReceiver\bin\Debug\net6.0>BusReceiver.exe
5/29/2022 11:45:48 PM: Starting receive from regular queue
5/29/2022 11:45:49 PM: 0 messages found
5/29/2022 11:45:49 PM: Starting receive from dead letter queue
5/29/2022 11:45:49 PM: 1 messages found in dlq
5/29/2022 11:45:49 PM: MessageId: d21cff4ae5b6453f9077b3805ace4e09 Body: TestMessage
C:\GitHub\ndn2323\bustest\BusReceiver\bin\Debug\net6.0>BusReceiver.exe
5/29/2022 11:46:03 PM: Starting receive from regular queue
5/29/2022 11:46:04 PM: 0 messages found
5/29/2022 11:46:04 PM: Starting receive from dead letter queue
5/29/2022 11:46:04 PM: 1 messages found in dlq
5/29/2022 11:46:04 PM: MessageId: d21cff4ae5b6453f9077b3805ace4e09 Body: TestMessage

Due to variations in the network, service, and your application, it is normal to see batches of inconsistent size returned when calling ReceiveMessagesAsync.
When receiving, there is no minimum batch size. The receiver will add enough credits to the link to allow maxMessageCount to flow from the service but will not wait in an attempt to build a batch of that size. Once any messages have been transferred from the service, they will be returned as the batch. Because you specified a maxWaitTime, if no messages were available on the service within that time, an empty batch will be returned.
Setting a PrefetchCount in your ServiceBusReceiverOptions can help to smooth out the batch sizes. That said, it is important to be aware that locks are held for messages in the prefetch queue and are not automatically renewed, so setting the prefetch count too high will result in expired locks.
In your example, the best approach may be to just run your receive loop repeatedly until you see one or more consecutive empty batches. That would be a strong indicator that the queue is empty.
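The receive-until-empty loop described above could be sketched roughly as follows. This is a minimal JavaScript sketch rather than the C# from the question, and it is not tied to any SDK: `drainQueue`, `receiveBatch`, `handleMessage`, and `emptyBatchesToStop` are hypothetical names, and `receiveBatch` stands in for whatever call returns one batch of messages from the receiver.

```javascript
// Repeatedly receive batches until several consecutive batches come back empty.
// `receiveBatch` is a hypothetical async function that returns an array of
// messages; in a real app it would wrap the receiver's batch receive call.
async function drainQueue(receiveBatch, handleMessage, emptyBatchesToStop = 2) {
  let consecutiveEmpty = 0;
  let total = 0;
  while (consecutiveEmpty < emptyBatchesToStop) {
    const batch = await receiveBatch();
    if (batch.length === 0) {
      // A single empty batch may only mean nothing arrived within the wait
      // time, so require a few in a row before treating the queue as empty.
      consecutiveEmpty += 1;
      continue;
    }
    consecutiveEmpty = 0;
    for (const message of batch) {
      await handleMessage(message); // e.g. archive and complete the message
      total += 1;
    }
  }
  return total;
}
```

One caveat with this shape: abandoning a message releases its lock and makes it available again, so a handler that abandons messages inside the loop may see the same messages in later batches. The sketch assumes the handler either completes a message or leaves it locked until the run ends.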

Related

Camel error handling fixed redelivery delay for azure storage queue not working correctly

I have an Azure app service that reads from an Azure Storage queue through a Camel route (Camel version 3.14.0). Below is my code:
Queue code:
QueueServiceClient client = new QueueServiceClientBuilder()
    .connectionString(storageAccountConnectionString)
    .buildClient();
getContext().getRegistry().bind("client", client);

errorHandler(deadLetterChannel(SEND_TO_POISON_QUEUE)
    .useOriginalBody()
    .log("Message sent to poison queue for handling")
    .retryWhile(method(new RetryRuleset(), "shouldRetry"))
    .maximumRedeliveries(24)
    .asyncDelayedRedelivery()
    .redeliveryDelay(3600 * 1000L) // initial delay
);

// Route to retrieve a message from storage queue.
from("azure-storage-queue:" + storageAccountName + "/" + QUEUE_NAME + "?serviceClient=#client&maxMessages=1&visibilityTimeout=P2D")
    .id(QUEUE_ROUTE_CONSUMER)
    .log("Message received from queue with messageId: ${headers.CamelAzureStorageQueueMessageId} and ${headers.CamelAzureStorageQueueInsertionTime} in UTC")
    .bean(cliFacilityService, "processMessage(${body}, ${headers.CamelAzureStorageQueueInsertionTime})")
    .end();
RetryRuleset code:
public boolean shouldRetry(@Header(QueueConstants.MESSAGE_ID) String messageId,
                           @Header(Exchange.REDELIVERY_COUNTER) Integer counter,
                           @Header(QueueConstants.INSERTION_TIME) OffsetDateTime insertionTime) {
    OffsetDateTime futureRetryOffsetDateTime = OffsetDateTime.now(Clock.systemUTC()).plusHours(1); // because redelivery delay is 1hr
    OffsetDateTime insertionTimePlus24hrs = insertionTime.plusHours(24);
    if (futureRetryOffsetDateTime.isAfter(insertionTimePlus24hrs)) {
        log.info("Facility queue message: {} done retrying because next time to retry {}. Redelivery count: {}, enqueue time: {}",
                messageId, futureRetryOffsetDateTime, counter, insertionTime);
        return false;
    }
    return true;
}
The redeliveryDelay is 1 hour and maximumRedeliveries is 24 because I want to retry once an hour for about 24 hours. It does not necessarily need to be 24 times, just as many as fit within 24 hours. Once 24 hours have passed, the message should go to the poison queue (this check is in the retry ruleset).
The problem is that the app service retries normally, once an hour, for the first 2 to 5 attempts, but then the next retry happens 2 days later. By then the message has expired, so the ruleset stops retrying and the message is sent to the poison queue. Sometimes the app service does the first read from the queue and the very next retry is 2 days later, so the behavior is very unstable. In total it retries 1 to 10 times at most, and the last retry is always 2 days after the first attempt, at the same time of day.
Is there anything I am doing wrong?
Thank you for your help!

Frequent timeout errors from Azure ServiceBus

Using Azure SDK example code as inspiration I have coded a publishing function that sends a message to the specified queue name.
export const publisher = async (message: ServiceBusMessage, queueName: string) => {
  let sbClient, sender;
  try {
    sbClient = new ServiceBusClient(SB_CONNECTION_STRING);
    sender = sbClient.createSender(queueName);
    await sender.sendMessages([message]);
    await sender.close();
  } catch (err) {
    console.log(`[Service Bus] error sending message ${queueName}`, err);
    console.log("retrying message publish...");
    publisher(message, queueName);
  } finally {
    await sbClient.close();
  }
};
Most of the time this code works flawlessly, but occasionally the connection times out and I retry sending within the catch block, which so far has always worked.
The messages I'm sending are quite small:
{
body: {
type: PROCESS_FILE,
data: { type: CURRENT, directory: PENDING_DIRECTORY }
}
}
An example of the log output, including the error thrown by the Azure SDK:
[08/03/2022 12:40:55.191] [LOG] [X] Processed task
[08/03/2022 12:40:55.346] [LOG] [X] Processed task
[08/03/2022 12:40:55.545] [LOG] [X] Processed task
[08/03/2022 12:41:27.840] [LOG] [Service Bus] error sending message local.importer.process { ServiceBusError: ETIMEDOUT: connect ETIMEDOUT 40.127.7.243:5671
at translateServiceBusError (/usr/share/app/node_modules/@azure/service-bus/src/serviceBusError.ts:174:12)
at MessageSender.open (/usr/share/app/node_modules/@azure/service-bus/src/core/messageSender.ts:304:31)
at process._tickCallback (internal/process/next_tick.js:68:7)
name: 'MessagingError',
retryable: false,
address: '40.127.7.243',
code: 'GeneralError',
errno: 'ETIMEDOUT',
port: 5671,
syscall: 'connect' }
[08/03/2022 12:41:27.840] [LOG] retrying message publish...
[08/03/2022 12:41:28.756] [LOG] [X] Processed task
I am not sure how to proceed. The Azure documentation recommends retrying the message in the case of a timeout, which I am doing; however, the timeouts are so frequent that they concern me.
Does anyone have some insight into this from previous experience? I am using "@azure/service-bus": "^7.3.0".
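As an aside on the retry in the catch block: the recursive call there is not awaited and has no upper bound. A bounded retry with backoff is one possible alternative. The following is a minimal sketch, not the SDK's own retry mechanism; `sendWithRetry`, `maxAttempts`, and `baseDelayMs` are hypothetical names, and `send` stands in for a call such as `() => sender.sendMessages([message])`, assuming the client and sender are created once and reused rather than rebuilt for every message.

```javascript
// Retry an async operation a bounded number of times with exponential backoff.
// `send` is a hypothetical zero-argument async function supplied by the caller.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function sendWithRetry(send, maxAttempts = 3, baseDelayMs = 200) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await send();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Wait 1x, 2x, 4x, ... the base delay before the next attempt.
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Because the function rethrows after the last attempt, the caller can decide whether to log, alert, or drop the message instead of retrying forever.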

How can I make the master wait for workers to complete in nodejs cluster

I have an application deployed on a Node.js cluster, where a master forks off workers. The workers do some database activity (which could take a while) and then have to send some result back to the master. Here is the skeleton of what I have. When I run this, some of the messages from the workers are not received by the master. How can I make the master wait until it has received messages from all the workers?
if ( isMaster ) {
  for (let k = 0; k < nodes; k++)
  {
    cluster.fork();
    console.log("Started Node-" + k);
  }
  for (const id in cluster.workers) {
    var worker = cluster.workers[id];
    worker.on('exit', () => {
      console.log('worker', id, 'Exited');
    });
    worker.on('message', (msg) => {
      console.log("msg recvd by id:", id, 'msg:', msg);
      consumeMsg(msg);
    });
  }
} else { // isWorker
  // do some database work, potentially long running (in tens of seconds)
  .....
  process.send( { results: dbResults, ID: cluster.worker.id} );
}
Here is a simple version of what you want for one worker:
const { Worker, isMainThread, parentPort } = require('worker_threads');
const MESSAGE_COUNT = 10;

function main() {
  if (isMainThread) {
    // instantiate worker
    const worker = new Worker(__filename);
    // receive worker responses
    worker.on('message', (message) => {
      console.log('worker response', message);
    });
    // send some messages
    for (let i = 0; i < MESSAGE_COUNT; i++) {
      worker.postMessage('message ' + parseInt(i));
    }
    // ask the worker to stop
    worker.postMessage('exit');
  } else {
    // receive parent message
    parentPort.on('message', (message) => {
      // handle exit case
      if (message == 'exit') {
        parentPort.unref();
      }
      // do some work
      console.log('worker: message from main', message);
      // send result back
      parentPort.postMessage('response to:' + message);
    });
    parentPort.on('exit', () => {
      // do some closing actions
      console.log('worker closing');
    });
  }
};

main();
output :
worker: message from main message 0
worker response response to:message 0
worker response response to:message 1
worker response response to:message 2
worker response response to:message 3
worker response response to:message 4
worker response response to:message 5
worker response response to:message 6
worker response response to:message 7
worker response response to:message 8
worker response response to:message 9
worker response response to:exit
worker: message from main message 1
worker: message from main message 2
worker: message from main message 3
worker: message from main message 4
worker: message from main message 5
worker: message from main message 6
worker: message from main message 7
worker: message from main message 8
worker: message from main message 9
worker: message from main exit
For several workers you can do the same:
const { Worker, isMainThread, parentPort } = require('worker_threads');
const MESSAGE_COUNT = 10;

function createWorker(name) {
  // instantiate worker
  const worker = new Worker(__filename);
  // receive worker responses
  worker.on('message', (message) => {
    console.log('worker ' + name + ' response', message);
  });
  // send some messages
  for (let i = 0; i < MESSAGE_COUNT; i++) {
    worker.postMessage(name + ' message ' + parseInt(i));
  }
  // ask the worker to stop
  worker.postMessage('exit');
}

function main() {
  if (isMainThread) {
    createWorker('one');
    createWorker('two');
  } else {
    // receive parent message
    parentPort.on('message', (message) => {
      // handle exit case
      if (message == 'exit') {
        parentPort.unref();
      }
      // do some work
      console.log('worker: message from main', message);
      // send result back
      parentPort.postMessage('response to:' + message);
    });
    parentPort.on('exit', () => {
      // do some closing actions
      console.log('worker closing');
    });
  }
};

main();
output :
worker: message from main one message 0
worker one response response to:one message 0
worker one response response to:one message 1
worker one response response to:one message 2
worker one response response to:one message 3
worker one response response to:one message 4
worker one response response to:one message 5
worker one response response to:one message 6
worker one response response to:one message 7
worker one response response to:one message 8
worker one response response to:one message 9
worker one response response to:exit
worker: message from main two message 0
worker two response response to:two message 0
worker two response response to:two message 1
worker two response response to:two message 2
worker two response response to:two message 3
worker two response response to:two message 4
worker two response response to:two message 5
worker two response response to:two message 6
worker two response response to:two message 7
worker two response response to:two message 8
worker two response response to:two message 9
worker two response response to:exit
worker: message from main one message 1
worker: message from main one message 2
worker: message from main one message 3
worker: message from main one message 4
worker: message from main one message 5
worker: message from main one message 6
worker: message from main one message 7
worker: message from main one message 8
worker: message from main one message 9
worker: message from main exit
worker: message from main two message 1
worker: message from main two message 2
worker: message from main two message 3
worker: message from main two message 4
worker: message from main two message 5
worker: message from main two message 6
worker: message from main two message 7
worker: message from main two message 8
worker: message from main two message 9
worker: message from main exit
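For the cluster-based skeleton in the question, one possible way to make the master wait is to wrap each worker's result message in a promise and await all of them together. This is a minimal sketch under stated assumptions: `waitForAllWorkers` is a hypothetical helper, each worker is assumed to send exactly one result message, and a worker that exits before sending anything is treated as a failure.

```javascript
// Resolve with one result per worker once every worker has sent a message.
// `workers` is any iterable of EventEmitter-like objects, for example
// Object.values(cluster.workers) in the master process.
function waitForAllWorkers(workers) {
  const pending = Array.from(workers, (worker) =>
    new Promise((resolve, reject) => {
      worker.once('message', resolve);
      // Assumption: exiting before sending a result counts as an error.
      // If the message arrived first, this later reject is a no-op.
      worker.once('exit', (code) =>
        reject(new Error('worker exited before sending a result, code ' + code)));
    })
  );
  // Results come back in the same order as the workers were passed in.
  return Promise.all(pending);
}
```

In the master this might be used as `waitForAllWorkers(Object.values(cluster.workers)).then((results) => results.forEach(consumeMsg))`, called right after the fork loop so that the listeners are attached before any worker replies.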

Why are jobs not waiting in the waiting queue in Bull?

I am using Bull as a job queue and have set this rate limit on the queue:
const opts = {
  limiter: {
    max: 1,
    duration: 100,
    bounceBack: false
  }
};
let queue = new Queue('FetchQueue', opts);
My understanding of bounceBack: false is that it puts the job in the delayed queue or the waiting queue if it can't be processed immediately. But I am getting this error intermittently:
StatusCodeError: 403 - {"error":{"code":403,"message":"User Rate Limit Exceeded","errors":[{"message":"User Rate Limit Exceeded","domain":"usageLimits","reason":"userRateLimitExceeded"}]}}
What is the correct setting for bounceBack to make jobs wait in the job queue?

Node.js event loop too busy to pick up HTTP request tasks from the event queue, and the requests time out with ETIMEDOUT

Because of the single-threaded nature of the Node.js event loop, Node.js times out on its own before the HTTP request is even sent to the downstream application, so the request never reaches it. That is, by the time the event loop picks this task up from the event queue, the timeout has already occurred (30 ms have already elapsed) and the request to the downstream application is never triggered.
Ideally the timer would start after the HTTP request is triggered, but in this case it seems the timer starts immediately, while the task is still in the event queue waiting to be picked up by the event loop.
Meanwhile, I increased the timeout from 30 ms to 120 ms, and the self-timeouts dropped from ~2.5K per hour to 2-8 per hour.
let options = {
  url: Url, //URL to hit
  qs: params, //Query string data
  method: "GET", //Specify the method
  timeout: 30
};
request(options, (error, response, body) => {
  if (error) {
    return callback(new ErrorResponse(500, {error: error}));
  } else {
    return callback(null, response, body);
  }
});
Environment details below
node 5.9.1
npm 3.7.x
request ~2.67.0
OS Ubuntu 14.04.4 LTS
My understanding of event-loop & event-queue is as illustrated in the below diagram.
Can someone help me figure out what the exact issue is here?
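One way to check whether a busy event loop is plausible as the cause is to measure how late a timer fires compared with when it was scheduled. This is a minimal sketch, not a diagnosis: `measureLoopLag` is a hypothetical helper, and the syntax is modern JavaScript that may need adjusting for the Node version listed above.

```javascript
// Measure event-loop lag: schedule a timer and report how much later than
// requested it actually fired. Large values suggest that callbacks are
// waiting behind synchronous work on the main thread.
function measureLoopLag(intervalMs = 10) {
  return new Promise((resolve) => {
    const start = process.hrtime();
    setTimeout(() => {
      const [sec, nano] = process.hrtime(start);
      const elapsedMs = sec * 1000 + nano / 1e6;
      // Clamp at zero so small timer jitter never reports negative lag.
      resolve(Math.max(0, elapsedMs - intervalMs));
    }, intervalMs);
  });
}
```

Sampling this periodically under production load and logging the result would show whether the lag gets anywhere near the 30 ms timeout; if it does, timers and request callbacks are probably being delayed by the same blocking work.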

Resources