How to listen to a queue using Azure Service Bus with Node.js?

Background
I have several clients sending messages to an Azure Service Bus queue. To match this, I need several machines reading from that queue and consuming the messages as they arrive, using Node.js.
Research
I have read the Azure Service Bus queues tutorial and I am aware I can use receiveQueueMessage to read a message from the queue.
However, the tutorial does not mention how one can listen to a queue and read messages as soon as they arrive.
I know I can simply poll the queue for messages, but this spams the servers with requests for no real benefit.
After searching on SO, I found a discussion where someone had a similar issue:
Listen to Queue (Event Driven no polling) Service-Bus / Storage Queue
And I know they ended up using the C# async method ReceiveAsync, but it is not clear to me whether:
that method is available for Node.js;
that method reads messages from the queue as soon as they arrive, like I need.
Problem
The documentation for Node.js is close to non-existent, with that one tutorial being the only major document I found.
Question
How can my workers be notified of an incoming message in Azure Service Bus queues?

Answer
According to Azure support, it is not possible to be notified when a queue receives a message. This is valid for every language.
Workarounds
There are two main workarounds for this issue:
Use Azure topics and subscriptions. This way you can have all clients subscribed to an event new-message and have them check the queue once they receive the notification. This has several problems though: first, you have to pay for yet another Azure service, and second, you can end up with multiple clients trying to read the same message.
Continuous polling. Have the clients check the queue every X seconds. This solution is horrible, as you end up paying for the network traffic you generate and you spam the service with useless requests. To help minimize this there is a concept called long polling (see the sketch below this list), which is so poorly documented it might as well not exist. I did find this NPM module though: https://www.npmjs.com/package/azure-awesome-queue
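For what it's worth, here is a minimal long-polling sketch using the newer @azure/service-bus (v7) package, which postdates this answer; the connection string, queue name, and wait time are placeholders. receiveMessages waits server-side for up to maxWaitTimeInSeconds, so the client is not firing a request every few milliseconds:

const { ServiceBusClient } = require("@azure/service-bus");

async function pollForever() {
  const sbClient = new ServiceBusClient("<CONNECTION STRING>");
  const receiver = sbClient.createReceiver("<QUEUE NAME>"); // peek-lock mode by default

  while (true) {
    // long poll: the call holds the request open for up to 60 seconds waiting for messages
    const messages = await receiver.receiveMessages(10, { maxWaitTimeInSeconds: 60 });
    for (const message of messages) {
      console.log(message.body);
      await receiver.completeMessage(message); // settle the message once processed
    }
  }
}

pollForever().catch(console.error);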
Alternatives
Honestly, at this point, you may be wondering why you should be using this service. I agree...
As an alternative there is RabbitMQ which is free, has a community, good documentation and a ton more features.
The downside here is that maintaining a RabbitMQ fault tolerant cluster is not exactly trivial.
Another alternative is Apache Kafka which is also very reliable.

You can receive messages from the Service Bus queue via the subscribe method, which listens to a stream of values. Example from the Azure documentation below:
const { delay, ServiceBusClient, ServiceBusMessage } = require("@azure/service-bus");

// connection string to your Service Bus namespace
const connectionString = "<CONNECTION STRING TO SERVICE BUS NAMESPACE>";

// name of the queue
const queueName = "<QUEUE NAME>";

async function main() {
  // create a Service Bus client using the connection string to the Service Bus namespace
  const sbClient = new ServiceBusClient(connectionString);

  // createReceiver() can also be used to create a receiver for a subscription.
  const receiver = sbClient.createReceiver(queueName);

  // function to handle messages
  const myMessageHandler = async (messageReceived) => {
    console.log(`Received message: ${messageReceived.body}`);
  };

  // function to handle any errors
  const myErrorHandler = async (error) => {
    console.log(error);
  };

  // subscribe and specify the message and error handlers
  receiver.subscribe({
    processMessage: myMessageHandler,
    processError: myErrorHandler
  });

  // wait long enough for messages to arrive before closing the receiver
  await delay(20000);

  await receiver.close();
  await sbClient.close();
}

// call the main function
main().catch((err) => {
  console.log("Error occurred: ", err);
  process.exit(1);
});
Source: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-nodejs-how-to-use-queues
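Note that the delay and close calls are only there so the sample terminates; in a long-running worker you would keep the receiver subscribed and only close it when shutting down.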

I asked myself the same question; here is what I found.
Use Google Pub/Sub, it does exactly what you are looking for.
If you want to stay with Azure, the following is possible:
cloud functions can be triggered by SBS messages
trigger an Event Hub event with that cloud function
receive the event and fetch the message from SBS

You can make use of serverless functions with a Service Bus queue trigger ("ServiceBusQueueTrigger"); they are invoked as soon as a message arrives in the queue.
It's pretty straightforward to do in Node.js: you need a binding defined in function.json whose type is "serviceBusTrigger".
This article (https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-service-bus#trigger---javascript-example) probably covers this in more detail, and a sketch is shown below as well.
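As a rough sketch of what that looks like (the queue name, connection setting name, and binding name here are placeholders, not values from the answer), the binding and the matching Node.js handler could be:

// function.json (binding definition; queueName and connection are assumptions)
{
  "bindings": [
    {
      "name": "mySbMsg",
      "type": "serviceBusTrigger",
      "direction": "in",
      "queueName": "myqueue",
      "connection": "MyServiceBusConnectionSetting"
    }
  ]
}

// index.js (the runtime invokes this once per queue message)
module.exports = async function (context, mySbMsg) {
  context.log("Service Bus queue trigger processed message:", mySbMsg);
};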

Related

Google Pub/Sub with distributed subscribers in Node.js

We are attempting to migrate a message processing app from Kafka to Google Pub/Sub and it's just not working as expected.
We are running in Kubernetes (Google Cloud) where there may be multiple pods processing messages on the same subscription. Topics and subscriptions are all created using terraform and are more or less permanent. They are not created/destroyed on the fly by the application.
In our development environment, where message throughput is rather low, everything works just fine. But when we scale up to production levels, everything seems to fall apart. We get big backlogs of unacked messages, and yet some pods are not receiving any messages at all. And then, all of a sudden, the backlog will just go away, but then climb again.
We are using the Node.js client library provided by Google: @google-cloud/pubsub:3.1.0
Each instance of the application subscribes to the same named subscription, and according to the documentation, messages should be distributed to each subscriber. But that is not happening. Some pods will be consuming messages rapidly, while others sit idle.
Every message is processed in a try/catch block and we are not observing any errors being thrown. So, as far as we know, every received message is getting acked.
I am suspicious that, as pods are terminated by autoscaling or updated deployments, we are not properly closing subscriptions, but there are no examples addressing a distributed environment and I have not found any document that specifically addresses how to properly manage resources. It is also worth mentioning that the app has multiple subscriptions to different topics.
When a pod shuts down, what actions should be taken on the Subscription object and the PubSub client object? Maybe that's not even the issue, but it seems like a reasonable place to start.
When we start a subscription we do something like this:
private exampleSubscribe(): Subscription {
  // one suggestion for having multiple subscriptions in the same app
  // was to use separate clients for each
  const pubSubClient = new PubSub({
    // use a regional endpoint for message ordering
    apiEndpoint: 'us-central1-pubsub.googleapis.com:443',
  });
  pubSubClient.projectId = 'my-project-id';
  const sub = pubSubClient.subscription('my-subscription-name', {
    // have tried various values for maxMessages from 5 to the default of 1000
    flowControl: { maxMessages: 250, allowExcessMessages: false },
    ackDeadline: 30,
  });
  sub.on('message', async (message) => {
    await this.exampleMessageProcessing(message);
  });
  return sub;
}
private async exampleMessageProcessing(message: Message): Promise<void> {
  try {
    // do some cool stuff
  } catch (error) {
    // log the error
  } finally {
    message.ack();
  }
}
Upon termination of a pod, we do this:
private async exampleCloseSub(sub: Subscription) {
  try {
    sub.removeAllListeners('message');
    await sub.close();
    // note that we do nothing with the PubSub
    // client object -- should it also be closed?
  } catch (error) {
    // ignore error, we are shutting down
  }
}
When running with Kafka, we can easily keep up with the message pace with usually no more than 2 pods. So I know that we are not running into issues of it simply taking too long to process each message.
Why are messages being left unacked? Why are pods not receiving messages when there is clearly a large backlog? What is the correct way to shut down one subscriber on a shared subscription?
It turns out that the issue was an improper implementation of message ordering.
The official docs for message ordering in Pub/Sub are rather brief:
https://cloud.google.com/pubsub/docs/ordering
Not much there regarding how to implement an ordering key or the implications of message ordering on horizontal scaling.
Though they do link to some external resources, one of which is this blog post:
https://medium.com/google-cloud/google-cloud-pub-sub-ordered-delivery-1e4181f60bc8
In our case, we did not have enough distinct ordering keys to allow for proper distribution of messages across subscribers/pods.
So this was definitely an RTFM situation, or more accurately: Read The Fine Blog Post Referred To By The Manual. I would have much preferred that the important details were actually in the official documentation. Is that too much to ask for?
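For illustration only (the topic name and key below are made up, not from the original post): when message ordering is enabled, all messages sharing an ordering key are delivered to a single subscriber in order, so the number of distinct keys effectively caps how far work can fan out across pods. Publishing with an ordering key looks roughly like this in @google-cloud/pubsub:

const { PubSub } = require('@google-cloud/pubsub');

// the regional endpoint should match the one used by the subscribers
const pubSubClient = new PubSub({ apiEndpoint: 'us-central1-pubsub.googleapis.com:443' });

async function publishOrdered(customerId, payload) {
  // messageOrdering must be enabled on the topic publisher
  const topic = pubSubClient.topic('my-topic', { messageOrdering: true });

  // messages with the same orderingKey are delivered in order to one subscriber;
  // a fine-grained key (e.g. per customer) lets different keys be processed in parallel
  await topic.publishMessage({
    data: Buffer.from(JSON.stringify(payload)),
    orderingKey: customerId,
  });
}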

How to avoid memory leak when using pub sub to call function?

I'm stuck on a performance issue when using Pub/Sub to trigger the function.
// this is called from index.ts
export function downloadService() {
  // References an existing subscription
  const subscription = pubsub.subscription("DOWNLOAD-sub");

  // Create an event handler to handle messages
  // let messageCount = 0;
  const messageHandler = async (message: any) => {
    console.log(`Received message ${message.id}:`);
    console.log(`\tData: ${message.data}`);
    console.log(`\tAttributes: ${message.attributes.type}`);

    // "Ack" (acknowledge receipt of) the message
    message.ack();

    await exportExcel(message); // my function
    // messageCount += 1;
  };

  // Listen for new messages until timeout is hit
  subscription.on("message", messageHandler);
}

async function exportExcel(message: any) {
  // get data from database
  const movies = await Sales.findAll({
    attributes: [
      "SALES_STORE",
      "SALES_CTRNO",
      "SALES_TRANSNO",
      "SALES_STATUS",
    ],
    raw: true,
  });
  ... processing to excel // 800k rows
  ... bucket.upload to gcs
}
The function above works fine if I trigger ONLY one Pub/Sub message.
However, the function hits a memory leak issue or database connection timeout issue if I trigger many Pub/Sub messages in a short period of time.
The problem I found is that the first message hasn't finished processing yet, but further requests from Pub/Sub call the function again and are processed at the same time.
I have no idea how to resolve this, but I was thinking that implementing a queue worker or Google Cloud Tasks would solve the problem?
As mentioned by @chovy in the comments, there is a need to queue up the exportExcel function calls since the function's execution is not keeping up with the rate of invocation. One of the modules that can be used to queue function calls is async. Please note that the async module is not officially supported by Google.
As an alternative, you can employ flow control features on the subscriber side. Data pipelines often receive sporadic spikes in published traffic which can overwhelm subscribers in an effort to catch up. The usual response to high published throughput on a subscription would be to dynamically autoscale subscriber resources to consume more messages. However, this can incur unwanted costs — for instance, you may need to use more VM’s — which can lead to additional capacity planning. Flow control features on the subscriber side can help control the unhealthy behavior of these tasks on the pipeline by allowing the subscriber to regulate the rate at which messages are ingested. Please refer to this blog for more information on flow control features.
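A rough sketch of that queuing idea, reusing the question's pubsub, downloadService, and exportExcel names together with the async module (an illustration under those assumptions, not a tested fix):

import async from "async";

// process exports one at a time; raise the concurrency only if memory and the
// database connection pool can cope with parallel exports
const exportQueue = async.queue(async (message: any) => {
  await exportExcel(message);
}, 1);

export function downloadService() {
  const subscription = pubsub.subscription("DOWNLOAD-sub");
  subscription.on("message", (message: any) => {
    message.ack(); // acked up front, as in the question
    // enqueue instead of calling exportExcel directly, so exports run sequentially
    exportQueue.push(message);
  });
}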

Azure Session Consumer seems to be locked to same Worker Service even when Message gets Abandoned

I have an Azure Message Bus Topic.
I have one "Session enabled" Azure Message Bus Consumer for this Topic.
I have around 3 Worker Services that are using the same Consumer. So the work is shared between these 3 Workers.
The messages which are sent to the consumer need to be ordered; that's why I am using the "Session Feature" on the Consumer.
I believe that on a first message, the session of the message gets bound to a Worker Service.
For certain reasons I want to abandon not only a message but also the session, so that it can be picked up by another of the 3 Worker Services.
My questions:
Is this possible?
If yes, how can I do this in the code?
Is there something like an "Accept Session Or Not" handler which kicks in when a message is received?
See code below:
private void SetupServiceBusSessionProcessors2()
{
    var busProcessorOptions = new ServiceBusSessionProcessorOptions();
    var busProcessor = _busClient.CreateSessionProcessor("fooTopic", "fooSubscription", busProcessorOptions);
    busProcessor.ProcessMessageAsync += args => ProcessSessionMessageHandler2(args);
}

private async Task ProcessSessionMessageHandler2(ProcessSessionMessageEventArgs args)
{
    if (false) // condition here which abandons the Message AND the Session
    {
        // the following line of code seems only to abandon the Message,
        // but it seems like the session is locked to this service;
        // I want other services listening via the same consumer to be able to try to handle the session
        await args.AbandonMessageAsync(args.Message);
    }
}
This is possible in version 7.3.0-beta.1 using the ReleaseSession method on the event args. Note that this is a beta version so the API is subject to change before there is a stable release.

How to send service bus message to deadletter queue in NodeJS?

How can I send message to deadletter queue?
serviceBusService.receiveQueueMessage(MESSAGE_QUEUE, { isPeekLock: true }, (error, message) => {
  ...... // want to put message to deadletter queue if there is exception
  serviceBusService.deleteMessage(message, error => {
  });
});
Mostly, you'd want to rely on the system to decide when to move a message to the DLQ and make it the messaging engine's responsibility as much as possible (rather than explicitly putting a message on the DLQ). It also appears that the guidance for this scenario is provided in the documentation here: How to handle application crashes and unreadable messages
Looks like you are using the older azure-sb package that relies on the HTTP REST apis. If you instead use the newer #azure/service-bus package which uses the faster AMQP implementation, there is a deadletter() method on the message you receive that you can use to send the message to the dead letter queue.
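As a rough sketch with the current @azure/service-bus (v7) API, where dead-lettering goes through the receiver rather than a method on the message (handleMessage, the connection string, and the queue name are placeholders):

const { ServiceBusClient } = require("@azure/service-bus");

const sbClient = new ServiceBusClient("<CONNECTION STRING>");
// peek-lock is the default receive mode, so each message must be settled explicitly
const receiver = sbClient.createReceiver("<QUEUE NAME>");

receiver.subscribe({
  processMessage: async (message) => {
    try {
      await handleMessage(message); // placeholder for your own processing
      await receiver.completeMessage(message);
    } catch (err) {
      // on failure, move the message to the dead-letter queue with a reason attached
      await receiver.deadLetterMessage(message, {
        deadLetterReason: "ProcessingError",
        deadLetterErrorDescription: String(err),
      });
    }
  },
  processError: async (err) => {
    console.log(err);
  },
});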

How to guarantee azure queue FIFO

I understand that the MS Azure Queue service document http://msdn.microsoft.com/en-us/library/windowsazure/dd179363.aspx says first-in, first-out (FIFO) behavior is not guaranteed.
However, our application is such that ALL the messages have to be read and processed in FIFO order. Could anyone please suggest how to achieve a guaranteed FIFO using Azure Queue Service?
Thank you.
The docs say for Azure Storage queues that:
Messages in Storage queues are typically first-in-first-out, but sometimes they can be out of order; for example, when a message's visibility timeout duration expires (for example, as a result of a client application crashing during processing). When the visibility timeout expires, the message becomes visible again on the queue for another worker to dequeue it. At that point, the newly visible message might be placed in the queue (to be dequeued again) after a message that was originally enqueued after it.
Maybe that is good enough for you? Else use Service Bus.
The latest Service Bus release offers reliable messaging queuing: Queues, topics and subscriptions
Adding to @RichBower's answer... check out this: Azure Storage Queues vs. Azure Service Bus Queues
MSDN (link retired)
http://msdn.microsoft.com/en-us/library/windowsazure/hh767287.aspx
learn.microsoft.com
https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-azure-and-service-bus-queues-compared-contrasted
Unfortunately, many answers mislead toward Service Bus queues, but from the tags mentioned I assume the question is about Storage queues. In Azure Storage Queues, FIFO is not guaranteed, whereas in Service Bus, FIFO message ordering is guaranteed, and even then only with the use of a concept called Sessions.
A simple scenario could be: if another consumer receives a message from the queue, it is no longer visible to you as the second receiver, so you assume the next message you receive is actually the first one (and that is where FIFO fails :P).
Consider using Service Bus if this behavior does not meet your requirement; a Node.js sketch of session-based ordering follows below.
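Since the main question on this page is about Node.js, here is a minimal sketch of session-based FIFO with @azure/service-bus (v7); the connection string, queue name, and session id are placeholders, and the queue is assumed to have been created with sessions enabled:

const { ServiceBusClient } = require("@azure/service-bus");

async function main() {
  const sbClient = new ServiceBusClient("<CONNECTION STRING>");

  // sender side: stamp every message with the same sessionId so they stay in order
  const sender = sbClient.createSender("<SESSION-ENABLED QUEUE>");
  await sender.sendMessages([
    { body: "first", sessionId: "order-42" },
    { body: "second", sessionId: "order-42" },
  ]);

  // receiver side: accept that session and read its messages back in FIFO order
  const receiver = await sbClient.acceptSession("<SESSION-ENABLED QUEUE>", "order-42");
  const messages = await receiver.receiveMessages(10, { maxWaitTimeInSeconds: 5 });
  for (const message of messages) {
    console.log(message.body);
    await receiver.completeMessage(message);
  }

  await receiver.close();
  await sbClient.close();
}

main().catch(console.error);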
I don't know how fast you want to process the messages, but if you need real FIFO, don't allow Azure's queue to hand out more than one message at a time.
Use this in your "Program.cs" at the top of the function.
static void Main()
{
    var config = new JobHostConfiguration();

    if (config.IsDevelopment)
    {
        config.UseDevelopmentSettings();
    }

    config.Queues.BatchSize = 1; // Number of messages to dequeue at the same time.
    config.Queues.MaxPollingInterval = TimeSpan.FromMilliseconds(100); // Polling interval for the queue.

    JobHost host = new JobHost(config);

    ....your initial information...

    // The following code ensures that the WebJob will be running continuously
    host.RunAndBlock();
}
This will fetch one message at a time with a wait period of 100 milliseconds.
This is working perfectly with a logger WebJob that writes the trace information to files.
As mentioned here https://www.jayway.com/2013/12/20/message-ordering-on-windows-azure-service-bus-queues/, ordering is not guaranteed in Service Bus either, except when using receive-and-delete mode, which is risky.
You just need to follow the steps below to ensure message ordering:
1) Create a Queue with session enabled=false.
2) While saving a message to the queue, provide the session id like below:
var message = new BrokeredMessage(item);
message.SessionId = "LB";
Console.WriteLine("Response from Central Scoring System : " + item);
client.Send(message);
3) While creating the receiver for receiving messages:
queueClient.OnMessage(s =>
{
    var body = s.GetBody<string>();
    var messageId = s.MessageId;
    Console.WriteLine("Message Body:" + body);
    Console.WriteLine("Message Id:" + messageId);
});
4) With the same session id on every message, this automatically ensures the order and gives you the ordered messages.
Thanks!!
