Parallel consumption of messages on different queues (RabbitMQ + Node.js)

It looks like the only library available for writing Node.js apps on top of RabbitMQ is node-amqp:
https://github.com/postwait/node-amqp
I have a producer that posts messages to multiple queues at a very fast rate, and in the consumer I create a subscription for each queue.
connection.on('ready', function () {
  for (var i = 0; i < queues.length; i++)
    connection.queue(queues[i], { autoDelete: false }, qOnReady);
});
function qOnReady(q) {
  // Catch all messages
  logger("Q " + q.name + " is ready");
  q.bind('#');
  // Receive messages
  q.subscribe(subscriber);
}
But when I run the consumer, all the messages consumed come from one particular queue, and the subscription on the next queue doesn't start until that queue is exhausted. How do I consume messages from all the queues in parallel?

You might want to use the async library to run the work for your queues in parallel, though I'm not sure it fits your needs.
https://github.com/caolan/async#paralleltasks-callback
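It may also be worth checking the subscribe options. In no-ack mode the broker pushes messages without waiting for acknowledgements, so the first queue to start delivering can keep the consumer busy. The following is only a sketch, assuming node-amqp's `ack` and `prefetchCount` subscribe options and `q.shift()` behave as its README describes; `subscribeAll` and `handler` are hypothetical names, and it has not been tested against the setup in the question:

```javascript
// Hypothetical helper: subscribe to every queue, taking one unacknowledged
// message per queue at a time so that no single queue monopolises the consumer.
// `connection` is assumed to be a ready node-amqp connection.
function subscribeAll(connection, queueNames, handler) {
  queueNames.forEach(function (name) {
    connection.queue(name, { autoDelete: false }, function (q) {
      q.bind('#'); // catch all messages, as in the question
      // ack: true asks the broker to wait for an acknowledgement;
      // prefetchCount: 1 limits this queue to one in-flight message.
      q.subscribe({ ack: true, prefetchCount: 1 }, function (message) {
        // The handler calls done() when finished; q.shift() acknowledges
        // the message and lets the broker deliver the next one.
        handler(q.name, message, function done() {
          q.shift();
        });
      });
    });
  });
}
```

With this shape, a slow handler on one queue only holds back that queue's next message; the other subscriptions keep receiving.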

Related

How To Rate-Limit Google Cloud Pub/Sub Queue

I'm using Google's Pub/Sub queue to handle messages between services. Some of the subscribers connect to rate-limited APIs.
For example, I'm pushing street addresses onto a pub/sub topic. I have a Cloud function which subscribes (via push) to that topic, and calls out to an external rate-limited geocoding service. Ideally, my street addresses could be pushed onto the topic with no delay, and the topic would retain those messages - calling the subscriber in a rate-limited fashion.
Is there any way to configure such a delay, or a message distribution rate limit? Increasing the Ack window doesn't really help: I've architected this system to prevent long-running functions.
Because there's no answer so far describing workarounds, I'm going to answer this now by stating that there is currently no way to do this. There are workarounds (see the comments on the question that explain how to create a queueing system using Cloud Scheduler), but there's no way to just set a setting on a pull subscription that creates a rate limit between it and its topic.
I opened a feature request for this though. Please speak up on the tracked issue if you'd like this feature.
https://issuetracker.google.com/issues/197906331
One approach to solving your problem is to use async.queue.
It takes a concurrency parameter with which you can manage the rate limit.
// create a queue object with concurrency 2
var q = async.queue(function (task, callback) {
  console.log('hello ' + task.name);
  callback();
}, 2);
// assign a callback (property style from older async versions;
// in async v3, drain is a method: q.drain(function () { ... }))
q.drain = function () {
  console.log('all items have been processed');
};
// add some items to the queue
q.push({ name: 'foo' }, function (err) {
  console.log('finished processing foo');
});
// quoted from async documentation
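Note that `concurrency` caps how many tasks run at once, not how many start per second. To make the mechanism concrete, here is a dependency-free sketch of the same idea; `createLimitedQueue` is a hypothetical name, and this is not the async library's implementation:

```javascript
// Minimal concurrency-limited queue: at most `concurrency` tasks are
// being worked on at any moment; the rest wait in FIFO order.
function createLimitedQueue(worker, concurrency) {
  const waiting = [];
  let running = 0;

  function next() {
    while (running < concurrency && waiting.length > 0) {
      const item = waiting.shift();
      running++;
      let finished = false;
      worker(item.task, function done(err) {
        if (finished) return; // ignore repeated done() calls
        finished = true;
        running--;
        if (item.callback) item.callback(err);
        next(); // a slot is free, start the next waiting task
      });
    }
  }

  return {
    push(task, callback) {
      waiting.push({ task: task, callback: callback });
      next();
    },
    running: function () { return running; },
    pending: function () { return waiting.length; }
  };
}

// Usage: with concurrency 2, the third task starts only after one finishes.
const limited = createLimitedQueue(function (task, done) {
  console.log('hello ' + task.name);
  setTimeout(done, 100); // simulate a slow external API call
}, 2);
['foo', 'bar', 'baz'].forEach(function (name) {
  limited.push({ name: name }, function () {
    console.log('finished processing ' + name);
  });
});
```

If the external API limits requests per second rather than simultaneous requests, the worker would additionally need to space out its calls, for example by delaying `done` as the usage above does.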
Google Cloud Tasks queues let you limit the rate at which tasks are dispatched. Check this doc

Are a Node.js worker thread and a RabbitMQ consumer the same thing?

I'd like to understand how RabbitMQ queuing and Node.js master/worker threads can be combined.
Node.js master/worker threads are different from RabbitMQ queuing: RabbitMQ stores tasks in a queue so that they can be consumed by a worker process whenever a worker is free. Combining the two only makes sense for very specific use cases and is generally not needed.
A combined implementation needs a couple of things, mainly the node-amqp client and cluster. cluster is a built-in Node module that provides the API for master and worker processes. Without RabbitMQ you would generally distribute the tasks from one master process, i.e. the master sends tasks to the worker processes and the workers listen for them.
Since you want to use RabbitMQ, you first bind a queue to an exchange to listen for tasks, and when you receive a task you pass it to your worker process. Below is a small snippet to give the gist of the explanation.
// Declared up front so the assignments below do not create implicit globals
var _exchange, _queue, _consumerTag;
connection.on('ready', function () {
  connection.exchange('exchange-name', function (exchange) {
    _exchange = exchange;
    connection.queue('queue-name', function (queue) {
      _queue = queue;
      // Bind to the exchange
      queue.bind('exchange-name', 'routing-key');
      // Subscribe to the queue
      queue
        .subscribe(function (message) {
          // When receiving the message call the worker thread to complete the task
          console.log('Got message', message);
          queue.shift(false, false);
        })
        .addCallback(function (res) {
          // Hold on to the consumer tag so we can unsubscribe later
          _consumerTag = res.consumerTag;
        });
    });
  });
});
Message exchange between master and worker: instead of sending a message directly to the master, the worker puts its success message on a queue. The master listens to that queue to receive acknowledgements and success messages.

Most effective way to poll an Amazon SQS queue using Node

My question is short, but I think it is interesting:
I have a queue on the Amazon SQS service, and I'm polling the queue every second. When there's a message, I process it and then go back to polling the queue.
Is there a better way to do this, such as some sort of trigger? Which approach would be best in your opinion, and why?
Thanks!
A useful and easy-to-use library for consuming messages from SQS is sqs-consumer:
const Consumer = require('sqs-consumer');
const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done();
  }
});
app.on('error', (err) => {
  console.log(err.message);
});
app.start();
It's well documented if you need more information. You can find the docs at:
https://github.com/bbc/sqs-consumer
Yes, there is:
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html
You can configure the SQS queue with a "receive message wait time" and do long polling.
For example, if you set it to 10 seconds, the call returns as soon as a message is available, or after the 10-second timeout expires. In this scenario you can poll the queue continuously in a loop.
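A minimal sketch of such a loop, with the SQS calls injected as functions so that it runs without AWS credentials. `pollQueue` and its parameters are hypothetical names; in a real consumer, `receive` would wrap the SDK's receive-message call with `WaitTimeSeconds` set (for example to 10) and `remove` would wrap the delete-message call:

```javascript
// Hypothetical long-polling loop. `receive` resolves to an array of messages
// (possibly empty when the wait time expires), `handle` processes one message,
// and `remove` deletes it from the queue once handling has succeeded.
async function pollQueue({ receive, handle, remove, shouldStop }) {
  while (!shouldStop()) {
    // With long polling this call waits on the server side until a
    // message arrives or the wait time elapses, so no sleep is needed here.
    const messages = await receive();
    for (const message of messages) {
      try {
        await handle(message);
        await remove(message);
      } catch (err) {
        // Leave the message on the queue; it becomes visible again
        // after its visibility timeout and can be retried.
        console.error('Failed to process message:', err.message);
      }
    }
  }
}

// Usage with an in-memory stand-in for the queue.
const batches = [[{ Body: 'first' }], [], [{ Body: 'second' }]];
pollQueue({
  receive: async () => batches.shift() || [],
  handle: async (m) => console.log('Processing message:', m.Body),
  remove: async () => {},
  shouldStop: () => batches.length === 0
});
```

Deleting only after `handle` succeeds means a crash mid-processing does not lose the message.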
As mentioned by Mircea, long polling is one option.
When you ask for a 'trigger', I believe you are looking for something other than continuously polling SQS on your own. If that is the case, I'd suggest you look at AWS Lambda. It lets you put code in the cloud that is triggered automatically by the events you configure, such as an SNS event, a file pushed to S3, etc.
http://aws.amazon.com/lambda/

NodeJS and AWS SQS

Folks,
I would like to set up a message queue between our Java API and NodeJS API.
After reading several examples of using aws-sdk, I am not sure how to make the service watch the queue.
For instance, this article Using SQS with Node: Receiving Messages Example Code tells me to use the sqs.receiveMessage() to receive and sqs.deleteMessage() to delete a message.
What I am not clear about, is how to wrap this into a service that runs continuously, which constantly takes the messages off the sqs queue, passes them to the model, stores them in mongo, etc.
Hope my question is not entirely vague. My experience with Node lies primarily with Express.js.
Is the answer as simple as using something like sqs-poller? How would I implement the same into an already running NodeJS Express app? Quite possibly I should look into SNS to not have any delay in message transfers.
Thanks!
For a start, Amazon SQS is a pseudo queue that guarantees availability of messages but not their sequence in FIFO fashion. You have to implement sequencing logic into your app if you want it to work that way.
Coming back to your question: SQS has to be polled from within your app to check whether new messages are available. I implemented this in an app using setInterval(). I would poll the queue for items; if none were found, I would delay the next call, and if some were found, the next call would be made immediately, bypassing the setInterval(). This is obviously a very raw implementation, and you can look into alternatives. How about a child process on your server that pings your NodeJS app when a new item is found in SQS? I think you could implement the child process as a watcher in Bash without using NodeJS. You can also check whether an npm module for this already exists.
In short, there are many ways you can poll but polling has to be done one way or the other if you are working with Amazon SQS.
I am not sure about this but if you want to be notified of items, you might want to look into Amazon SNS.
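The adaptive polling described above can be sketched roughly as follows. `nextDelay`, `startPolling`, and the `fetchItems` callback are hypothetical names, and the idle delay is whatever value suits the app:

```javascript
// Decide how long to wait before the next poll: poll again immediately
// when the last poll returned items, otherwise back off.
function nextDelay(itemCount, idleDelayMs) {
  return itemCount > 0 ? 0 : idleDelayMs;
}

// Poll repeatedly using setTimeout so the delay can change on every round.
// `fetchItems(callback)` is assumed to call back with (err, items).
function startPolling(fetchItems, onItems, idleDelayMs) {
  let stopped = false;
  function poll() {
    if (stopped) return;
    fetchItems(function (err, items) {
      const found = !err && items ? items : [];
      if (found.length > 0) onItems(found);
      setTimeout(poll, nextDelay(found.length, idleDelayMs));
    });
  }
  poll();
  // Returns a function that stops scheduling further polls.
  return function stop() { stopped = true; };
}
```

Using `setTimeout` rather than `setInterval` means there is no fixed interval to bypass: each round schedules the next one with the delay it needs.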
When writing applications to consume messages from SQS I use sqs-consumer:
const Consumer = require('sqs-consumer');
const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done();
  }
});
app.on('error', (err) => {
  console.log(err.message);
});
app.start();
See the docs for more information (well documented):
https://github.com/bbc/sqs-consumer

Why is the performance of redis+socket.io better than just socket.io?

I earlier had all my code in a socket.io+node.js server. I recently converted all the code to redis+socket.io+node.js after noticing slow performance when too many users send messages across the server.
As I understand it, socket.io alone was slow because it is not multi-threaded, so it handles one request or emit at a time.
What redis does is distribute these requests or emits across channels. Clients subscribe to different channels, and when a message is published on a channel, all the clients subscribed to it receive the message. It does this via this piece of code:
sub.on("message", function (channel, message) {
  client.emit("message", message);
});
The client.on("message", function () {}) handler takes it from here to publish messages to different channels.
Here is a brief piece of code showing what I am doing with redis:
io.sockets.on('connection', function (client) {
  var pub = redis.createClient();
  var sub = redis.createClient();
  sub.on("message", function (channel, message) {
    client.emit('message', message);
  });
  client.on("message", function (msg) {
    if (msg.type == "chat") {
      pub.publish("channel." + msg.tousername, msg.message);
      pub.publish("channel." + msg.user, msg.message);
    }
    else if (msg.type == "setUsername") {
      sub.subscribe("channel." + msg.user);
    }
  });
});
As redis stores the channel information, we can have different servers publish to the same channel.
So, what I don't understand is: if sub.on("message") is called every time a request or emit is sent, why is redis supposed to give better performance? I suppose even the sub.on("message") method is not multi-threaded.
As you might know, Redis allows you to scale out to multiple node instances. So the performance gain actually comes after the fact. Using the Pub/Sub method is not faster in itself. It's technically slower, because you have to communicate with Redis for every Pub/Sub signal. The "better performance" is only really true once you start to scale out horizontally.
For example, say you have one node instance (a simple chat room) that can handle a maximum of 200 active users. You are not using Redis yet because there is no need. Now, what if you want to have 400 active users? With the Redis setup from your example above, you can add a second instance and reach the 400-user mark, which is a "performance increase" in the sense that you can now handle more users, but not really a speed increase. If that makes sense. Hope this helps!
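To illustrate the scaling argument, here is a toy simulation in which a Node `EventEmitter` stands in for Redis pub/sub and plain objects stand in for server instances and sockets. All names (`createInstance`, `connectUser`, `sendChat`) are hypothetical, and no real Redis or socket.io calls are made:

```javascript
const { EventEmitter } = require('events');

// Stand-in for Redis: every instance publishes to and subscribes on this bus.
const bus = new EventEmitter();

// Stand-in for one Node process running socket.io.
function createInstance(name) {
  return {
    name: name,
    // Register a user on this instance and deliver their channel's messages.
    connectUser(user, deliver) {
      bus.on('channel.' + user, function (message) {
        deliver(name, message);
      });
    },
    // Publish through the shared bus rather than delivering locally, so the
    // recipient may be connected to any instance.
    sendChat(toUser, message) {
      bus.emit('channel.' + toUser, message);
    }
  };
}

// Two instances, each holding some of the users.
const instanceA = createInstance('A');
const instanceB = createInstance('B');
const inbox = [];
instanceB.connectUser('bob', function (via, message) {
  inbox.push(via + ': ' + message);
});

// Alice is connected to instance A, yet Bob on instance B gets the message.
instanceA.sendChat('bob', 'hi from alice');
console.log(inbox); // [ 'B: hi from alice' ]
```

Every message takes an extra hop through the shared bus, which is the per-message cost mentioned above; the gain is that users can be spread over several instances.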
