Most effective way to poll an Amazon SQS queue using Node - node.js

My question is short, but I think it is interesting:
I have an Amazon SQS queue, and I'm polling it every second. When there's a message, I process it and then go back to polling the queue.
Is there a better way to do this, such as some sort of trigger? Which approach would be best in your opinion, and why?
Thanks!

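A useful and easy-to-use library for consuming messages from SQS is sqs-consumer:
// Note: this uses the older callback-style API. Newer releases export
// { Consumer } and expect an async handleMessage without `done`.
const Consumer = require('sqs-consumer');
const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done(); // signal that processing succeeded so the message is deleted
  }
});
app.on('error', (err) => {
  console.log(err.message);
});
app.start();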
It's well documented if you need more information. You can find the docs at:
https://github.com/bbc/sqs-consumer

Yes, there is:
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html
You can configure an SQS queue with a "receive message wait time" to enable long polling.
For example, if you set it to 10 seconds (the maximum is 20), the call returns as soon as a message is available, or after the 10-second timeout expires. You can poll the queue continuously in this scenario.
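A minimal sketch of such a long-polling loop, assuming an aws-sdk v2 style client (one where `receiveMessage(...).promise()` and `deleteMessage(...).promise()` exist). The names `pollOnce`, `pollForever`, `handler`, and `shouldStop` are made up for illustration and are not part of the SDK.

```javascript
// Sketch only: `sqs` is assumed to be an aws-sdk v2 style client, e.g. new AWS.SQS().
// `pollOnce`, `pollForever`, and `handler` are illustrative names, not library APIs.

// Make one long-polling receive call and process whatever comes back.
async function pollOnce(sqs, queueUrl, handler, waitTimeSeconds = 10) {
  const { Messages = [] } = await sqs.receiveMessage({
    QueueUrl: queueUrl,
    MaxNumberOfMessages: 10,          // SQS returns at most 10 messages per call
    WaitTimeSeconds: waitTimeSeconds  // long polling: wait up to this long (max 20)
  }).promise();

  for (const message of Messages) {
    await handler(message);
    // Delete only after successful processing; otherwise the message
    // becomes visible again once its visibility timeout expires.
    await sqs.deleteMessage({
      QueueUrl: queueUrl,
      ReceiptHandle: message.ReceiptHandle
    }).promise();
  }
  return Messages.length;
}

// Poll back to back; each call waits on the SQS side while the queue is empty.
async function pollForever(sqs, queueUrl, handler, shouldStop = () => false) {
  while (!shouldStop()) {
    try {
      await pollOnce(sqs, queueUrl, handler);
    } catch (err) {
      console.error('poll failed:', err.message);
      // Brief pause before retrying so a persistent error does not spin the loop.
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}
```

Because the wait happens inside the receive call, there is no need for a separate one-second timer: an empty queue costs one request per wait period instead of one per second.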

As mentioned by Mircea, long polling is one option.
When you ask for a 'trigger', I believe you are looking for something other than continuously polling SQS on your own. If that is the case, I'd suggest you look at AWS Lambda. It lets you put code in the cloud that is automatically triggered by the events you configure, such as an SNS notification or a file pushed to S3.
http://aws.amazon.com/lambda/
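Lambda can also be attached directly to an SQS queue as an event source, in which case Lambda does the polling for you. A minimal sketch of what such a handler might look like, assuming the standard SQS event shape (`event.Records`, each with a string `body`); `processBody` and `sqsLambdaHandler` are hypothetical names standing in for your own code.

```javascript
// Sketch of a Lambda handler for an SQS event source. Lambda polls the queue
// and invokes the function with a batch of messages in event.Records.
// `processBody` is a hypothetical placeholder for your own logic.
async function processBody(body) {
  console.log('Processing message:', body);
}

async function sqsLambdaHandler(event) {
  for (const record of event.Records) {
    // record.body holds the message body as a string.
    await processBody(record.body);
  }
  // Returning normally marks the batch as processed and Lambda deletes the
  // messages; throwing makes them visible again for a retry.
  return { processed: event.Records.length };
}

// In a real deployment this would be exported, e.g.:
// exports.handler = sqsLambdaHandler;
```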

Related

How to process data from AWS SQS?

I'm having trouble understanding how to work with AWS SQS in NodeJS. I'm creating a simple endpoint which receives data sent to my server and adds it to an SQS queue:
export const receiveMessageFromSDK = async (req: Request, res: Response) => {
  const payload = req.body;
  try {
    await sqs.sendMessage({
      QueueUrl: process.env.SQS_QUEUE_URL,
      // MessageBody must be a string; use JSON.stringify(payload) if req.body is a parsed object
      MessageBody: payload
    }).promise();
  } catch (error) {
    //something else
  }
}
OK, this works, and the MessageBody is added to my SQS queue. But now I have a problem with processing the data from this queue. How do I do this?
In Google Cloud, I create a simple request queue and send, in the body payload, the URL of the endpoint which should process data from the queue. Google Cloud then sends this body to that endpoint, and my server starts the business logic on the data. But how do I do this in SQS? How do I receive data from the queue and start processing it on my side?
I was thinking about the QueueUrl param, but the docs say that this value should be a URL from the AWS console like https://sqs.eu-central-1.amazonaws.com... so I have no idea how to process this data from the queue on my side.
So, can anybody help me?
Thanks in advance for any help!
Should I run this function from cron, or can AWS, e.g., send a request with this data to an endpoint that I provide?
SQS is for pulling data, which means that you need a cron job (or any equivalent system in your app) that repeatedly polls your queue for messages using receiveMessage, e.g. every 10 seconds.
If you want a push-type messaging system, then you have to use SNS instead of SQS. SNS can push messages to your HTTP/HTTPS endpoint automatically if your application exposes such an endpoint for the messages.
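A minimal sketch of an HTTP endpoint that SNS could push to, using only Node's built-in modules. It assumes the usual SNS HTTP delivery format: an `x-amz-sns-message-type` header and a JSON body carrying `SubscribeURL` (for the one-time confirmation) or `Message` (for notifications). `handleSnsPost`, `createSnsServer`, `onNotification`, and `confirm` are hypothetical names, and signature verification is left out for brevity.

```javascript
const http = require('http');
const https = require('https');

// Sketch only: signature verification of the SNS payload is omitted here and
// should be added before trusting messages in production.
// `onNotification` and `confirm` are hypothetical callbacks supplied by the caller.
async function handleSnsPost(messageType, rawBody, { onNotification, confirm }) {
  const payload = JSON.parse(rawBody);
  if (messageType === 'SubscriptionConfirmation') {
    // SNS sends this once; requesting SubscribeURL activates the subscription.
    await confirm(payload.SubscribeURL);
    return 'confirmed';
  }
  if (messageType === 'Notification') {
    await onNotification(payload.Message, payload);
    return 'processed';
  }
  return 'ignored';
}

function createSnsServer(onNotification) {
  // Confirm a subscription by issuing a GET request to the URL SNS provided.
  const confirm = (url) => new Promise((resolve, reject) => {
    https.get(url, (res) => { res.resume(); res.on('end', resolve); }).on('error', reject);
  });
  return http.createServer((req, res) => {
    let rawBody = '';
    req.on('data', (chunk) => { rawBody += chunk; });
    req.on('end', async () => {
      try {
        await handleSnsPost(req.headers['x-amz-sns-message-type'], rawBody, { onNotification, confirm });
        res.statusCode = 200;
      } catch (err) {
        res.statusCode = 500; // a 5xx response lets SNS retry per its delivery policy
      }
      res.end();
    });
  });
}
```

Usage would be something like `createSnsServer(async (message) => { /* business logic */ }).listen(3000)`, where the port is an arbitrary choice. If you need both buffering and push delivery, an SNS topic can also deliver into an SQS queue.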

How To Rate-Limit Google Cloud Pub/Sub Queue

I'm using Google's Pub/Sub queue to handle messages between services. Some of the subscribers connect to rate-limited APIs.
For example, I'm pushing street addresses onto a pub/sub topic. I have a Cloud function which subscribes (via push) to that topic, and calls out to an external rate-limited geocoding service. Ideally, my street addresses could be pushed onto the topic with no delay, and the topic would retain those messages - calling the subscriber in a rate-limited fashion.
Is there any way to configure such a delay, or a message distribution rate limit? Increasing the Ack window doesn't really help: I've architected this system to prevent long-running functions.
Because there's no answer so far describing workarounds, I'm going to answer this now by stating that there is currently no way to do this. There are workarounds (see the comments on the question that explain how to create a queueing system using Cloud Scheduler), but there's no way to just set a setting on a pull subscription that creates a rate limit between it and its topic.
I opened a feature request for this though. Please speak up on the tracked issue if you'd like this feature.
https://issuetracker.google.com/issues/197906331
One approach to solving your problem is to use async.queue.
It has a concurrency setting that you can use to manage the rate limit.
var async = require('async');
// create a queue object with concurrency 2
var q = async.queue(function(task, callback) {
  console.log('hello ' + task.name);
  callback();
}, 2);
// assign a callback
// (async v2 style; in async v3, drain is a method: q.drain(function() { ... }))
q.drain = function() {
  console.log('all items have been processed');
};
// add some items to the queue
q.push({name: 'foo'}, function(err) {
  console.log('finished processing foo');
});
// quoted from async documentation
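Note that the concurrency setting of async.queue caps how many tasks run at the same time, not how many start per second. If the external API limits requests per time window, spacing the calls may be closer to what is needed. A minimal sketch in plain JavaScript follows; `createRateLimiter` is a hypothetical helper written for this example, not part of async or of any Google Cloud library.

```javascript
// Sketch only: `createRateLimiter` is an illustrative helper, not part of async
// or of any Google Cloud library. It starts queued tasks at most once every
// `minIntervalMs` milliseconds, in FIFO order.
function createRateLimiter(minIntervalMs) {
  const pending = [];
  let timer = null;
  let lastStart = -Infinity;

  function runNext() {
    timer = null;
    const job = pending.shift();
    if (!job) return;
    lastStart = Date.now();
    // Run the task and settle the caller's promise with its outcome.
    Promise.resolve().then(job.task).then(job.resolve, job.reject);
    schedule();
  }

  function schedule() {
    if (timer !== null || pending.length === 0) return;
    const wait = Math.max(0, lastStart + minIntervalMs - Date.now());
    timer = setTimeout(runNext, wait);
  }

  // Queue a task (a function returning a value or promise); resolves with its result.
  return function limit(task) {
    return new Promise((resolve, reject) => {
      pending.push({ task, resolve, reject });
      schedule();
    });
  };
}
```

For example, `const limit = createRateLimiter(200)` followed by `limit(() => geocode(address))` would start at most five calls per second, where `geocode` stands for whatever function calls the rate-limited service. This only limits calls within a single process, so it does not replace a queue-level setting when several function instances run in parallel.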
Google Cloud Tasks queues let you limit the rate at which tasks are dispatched and the number of concurrent dispatches. Check this doc

NodeJS and AWS SQS

Folks,
I would like to set up a message queue between our Java API and NodeJS API.
After reading several examples of using aws-sdk, I am not sure how to make the service watch the queue.
For instance, this article Using SQS with Node: Receiving Messages Example Code tells me to use the sqs.receiveMessage() to receive and sqs.deleteMessage() to delete a message.
What I am not clear about is how to wrap this into a service that runs continuously, constantly takes messages off the SQS queue, passes them to the model, stores them in Mongo, etc.
Hope my question is not entirely vague. My experience with Node lies primarily with Express.js.
Is the answer as simple as using something like sqs-poller? How would I implement the same into an already running NodeJS Express app? Quite possibly I should look into SNS to not have any delay in message transfers.
Thanks!
For a start, a standard Amazon SQS queue is a pseudo queue that guarantees availability of messages but not that they arrive in FIFO order. You have to implement sequencing logic in your app if you want it to work that way (or use an SQS FIFO queue, which is a separate queue type that preserves order).
Coming back to your question, SQS has to be polled within your app to check if there are new messages available. I implemented this in an app using setInterval(). I would poll the queue for items; if none were found, I would delay the next call, and if some were found, the next call would be immediate, bypassing the setInterval(). This is obviously a very raw implementation and you can look into alternatives. How about a child process on your server that notifies your NodeJS app when a new item is found in SQS? I think you could implement the child process as a watcher in Bash without using NodeJS. You can also check whether there is already an npm module for this.
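The "poll again immediately if there was work, otherwise wait" idea described above can be sketched roughly as follows. This version uses a self-rescheduling setTimeout rather than setInterval, which is a substitution on my part, and `startAdaptivePolling`, `fetchBatch`, and `handleBatch` are hypothetical names. `fetchBatch` would typically wrap `sqs.receiveMessage` and resolve to an array of messages.

```javascript
// Sketch only: `fetchBatch` and `handleBatch` are hypothetical callbacks.
// `fetchBatch` would wrap sqs.receiveMessage and resolve to an array of messages.
function startAdaptivePolling(fetchBatch, handleBatch, idleDelayMs = 5000) {
  let stopped = false;
  let timer = null;

  async function tick() {
    if (stopped) return;
    let delay = idleDelayMs;
    try {
      const messages = await fetchBatch();
      if (messages.length > 0) {
        await handleBatch(messages);
        delay = 0; // more work may be waiting, so poll again right away
      }
    } catch (err) {
      console.error('poll failed:', err.message); // keep the idle delay after an error
    }
    if (!stopped) timer = setTimeout(tick, delay);
  }

  tick();
  // Call the returned function to stop polling.
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}
```

In an existing Express app, this could be started once after `app.listen(...)`; the polling runs on the same event loop, so handlers should stay asynchronous.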
In short, there are many ways you can poll but polling has to be done one way or the other if you are working with Amazon SQS.
I am not sure about this but if you want to be notified of items, you might want to look into Amazon SNS.
When writing applications to consume messages from SQS I use sqs-consumer:
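// Note: this uses the older callback-style API. Newer releases export
// { Consumer } and expect an async handleMessage without `done`.
const Consumer = require('sqs-consumer');
const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done(); // signal that processing succeeded so the message is deleted
  }
});
app.on('error', (err) => {
  console.log(err.message);
});
app.start();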
See the docs for more information (well documented):
https://github.com/bbc/sqs-consumer

why is performance of redis+socket.io better than just socket.io?

I previously had all my code in a socket.io+node.js server. I recently converted it to redis+socket.io+node.js after noticing slow performance when too many users send messages across the server.
As I understand it, socket.io alone was slow because it is not multi-threaded, so it handles one request or emit at a time.
What redis does is distribute these requests or emits across channels. Clients subscribe to different channels, and when a message is published on a channel, all the clients subscribed to it receive the message. It does this via this piece of code:
sub.on("message", function (channel, message) {
  client.emit("message", message);
});
The client.on('emit',function(){}) takes it from here to publish messages to different channels.
Here is a short piece of code showing what I am doing with redis:
io.sockets.on('connection', function (client) {
  var pub = redis.createClient();
  var sub = redis.createClient();
  sub.on("message", function (channel, message) {
    client.emit('message', message);
  });
  client.on("message", function (msg) {
    if (msg.type == "chat") {
      pub.publish("channel." + msg.tousername, msg.message);
      pub.publish("channel." + msg.user, msg.message);
    }
    else if (msg.type == "setUsername") {
      sub.subscribe("channel." + msg.user);
    }
  });
});
As redis stores the channel information, we can have different servers publish to the same channel.
So, what I don't understand is: if sub.on("message") gets called every time a request or emit is sent, why is redis supposed to give better performance? I suppose even the sub.on("message") method is not multi-threaded.
As you might know, Redis allows you to scale across multiple node instances, so the performance gain comes from that scaling. Using the Pub/Sub method is not faster in itself. It's technically slower, because every Pub/Sub message requires a round trip to Redis. "Giving better performance" is only really true once you start to scale out horizontally.
For example, say you have one node instance (a simple chat room) that can handle a maximum of 200 active users. You are not using Redis yet because there is no need. Now, what if you want to have 400 active users? With a second instance and your example above, you can now reach that 400-user mark, which is a "performance increase" in the sense that you can handle more users, but not really a speed increase, if that makes sense. Hope this helps!
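To illustrate the scaling point, here is a toy model in which an in-process EventEmitter stands in for Redis pub/sub, so it runs without a Redis server. `createInstance` is a hypothetical model of one node process, not socket.io or Redis API. The point is only that a user connected to one instance can be reached by a message published from another.

```javascript
const { EventEmitter } = require('events');

// Sketch only: an in-process EventEmitter stands in for Redis pub/sub so the
// example runs without a Redis server. `createInstance` is a hypothetical
// model of one Node/socket.io process.
function createInstance(broker) {
  return {
    // A user connects to this instance; subscribe to that user's channel.
    // `deliver` plays the role of client.emit('message', ...).
    connect(username, deliver) {
      broker.on('channel.' + username, deliver);
    },
    // Any instance can publish; the broker routes the message to whichever
    // instance holds the subscription for that user.
    send(toUsername, message) {
      broker.emit('channel.' + toUsername, message);
    }
  };
}
```

With `const broker = new EventEmitter()` and two instances created from it, a user connected through the first instance receives a message sent through the second. Each instance only holds its own share of the connections, which is where the extra capacity comes from.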

Parallel consumption of messages on different queues +rabbitmq+nodejs

It looks like the only library that's there for writing nodejs apps over rabbitmq is the
https://github.com/postwait/node-amqp
I have a producer which is posting messages to multiple queues at a very fast rate, and in the consumer I'm creating subscriptions for each queue.
connection.on('ready', function () {
  for (var i = 0; i < queues.length; i++)
    connection.queue(queues[i], {autoDelete: false}, qOnReady);
});
function qOnReady(q) {
  // Catch all messages
  logger("Q " + q.name + " is ready");
  q.bind('#');
  // Receive messages
  q.subscribe(subscriber);
}
But when I run the consumer, all the messages consumed come from one particular queue, and the subscription on the next queue doesn't start until that queue is exhausted. How do I consume messages from the queues in parallel?
You might want to use Async Library to achieve parallel execution of your queues.
Though not sure if it fits your needs.
https://github.com/caolan/async#paralleltasks-callback
