NodeJS and AWS SQS

Folks,
I would like to set up a message queue between our Java API and NodeJS API.
After reading several examples of using aws-sdk, I am not sure how to make the service watch the queue.
For instance, the article Using SQS with Node: Receiving Messages Example Code tells me to use sqs.receiveMessage() to receive messages and sqs.deleteMessage() to delete them.
What I am not clear about is how to wrap this into a service that runs continuously, constantly taking messages off the SQS queue, passing them to the model, storing them in Mongo, and so on.
Hope my question is not entirely vague. My experience with Node lies primarily with Express.js.
Is the answer as simple as using something like sqs-poller? How would I implement it in an already running NodeJS Express app? Quite possibly I should look into SNS to avoid any delay in message transfers.
Thanks!

For a start, Amazon SQS is a pseudo-queue that guarantees availability of messages but not their delivery in FIFO order. You have to implement sequencing logic in your app if you need it to work that way.
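For illustration, a toy sketch (not from this answer) of what such sequencing logic could look like: the producer stamps each message with a sequence number, and the consumer buffers out-of-order arrivals until the next expected number shows up. All names here are hypothetical.

// toy re-sequencing buffer; deliver() is a stand-in for your real handler
function deliver(body) {
  console.log('in order:', body);
}

let nextExpected = 0;
const buffered = new Map(); // sequence number -> message body

function handleIncoming(seq, body) {
  buffered.set(seq, body);
  // flush every message that is now in order
  while (buffered.has(nextExpected)) {
    deliver(buffered.get(nextExpected));
    buffered.delete(nextExpected);
    nextExpected += 1;
  }
}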
Coming back to your question: SQS has to be polled from within your app to check whether new messages are available. I implemented this in an app using setInterval(). I would poll the queue for items; if none were found, I would delay the next call, and if some were found, the next call would be immediate, bypassing the setInterval(). This is obviously a very raw implementation, and you can look into alternatives. How about a child process on your server that pings your NodeJS app when a new item is found in SQS? I think you could implement the child process as a watcher in Bash without using NodeJS. You can also look into npm modules to see whether one already exists for this.
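A minimal sketch of that poll-and-back-off pattern, using the aws-sdk receiveMessage()/deleteMessage() calls from the question (queue URL and region are placeholders):

const AWS = require('aws-sdk');
const sqs = new AWS.SQS({ region: 'us-east-1' });

const QUEUE_URL = 'https://sqs.us-east-1.amazonaws.com/account-id/queue-name';
const IDLE_DELAY_MS = 5000; // back off this long when the queue is empty

function poll() {
  sqs.receiveMessage({ QueueUrl: QUEUE_URL, MaxNumberOfMessages: 10 }, (err, data) => {
    if (err) {
      console.error(err);
      return setTimeout(poll, IDLE_DELAY_MS);
    }
    const messages = data.Messages || [];
    messages.forEach((msg) => {
      // hand msg off to your model / Mongo here, then delete it from the queue
      sqs.deleteMessage({ QueueUrl: QUEUE_URL, ReceiptHandle: msg.ReceiptHandle }, () => {});
    });
    // poll again immediately when there was work, otherwise wait a bit
    setTimeout(poll, messages.length > 0 ? 0 : IDLE_DELAY_MS);
  });
}

poll();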
In short, there are many ways you can poll but polling has to be done one way or the other if you are working with Amazon SQS.
I am not sure about this but if you want to be notified of items, you might want to look into Amazon SNS.

When writing applications that consume messages from SQS, I use sqs-consumer:
const Consumer = require('sqs-consumer');

const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done();
  }
});

app.on('error', (err) => {
  console.log(err.message);
});

app.start();
See the docs for more information (well documented):
https://github.com/bbc/sqs-consumer

Related

API that will continuously return data

Beginner here. I'm using the Firebase Realtime Database and I need my API to constantly return values as they are added; see my code below.
apiCalls.get('/api/getallusers', function (req, res) {
  userFunc.getAllUsers(function (err, result) {
    if (err) return res.status(500).send('internal server error!');
    res.status(200).write(JSON.stringify(result));
    res.end();
    return res;
  })
})
this will return the error
Error [ERR_STREAM_WRITE_AFTER_END]: write after end
but if I remove res.end() it will show one record and keep loading until the page times out.
Is what I'm doing possible, or are there different ways to do it?
Also, I'm using Firebase Cloud Functions for this API.
UPDATE:
I uploaded the API but it does not return anything...
Here is the link: https://us-central1-testproject-e6819.cloudfunctions.net/api1/api/getUser
I tried Axios and EventSource.
Firebase Functions logs the values but does not return them.
If you're viewing the API response like a web page, your browser buffers the data it receives until there's enough to render a full page. Your browser expects content that ends, not an endless stream of data.
You should remove .end() if you expect to be able to continue to write to the output stream.
Also, I recommend using the Server-Sent Events (SSE) protocol for this. https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events It provides a nice standards-based abstraction that makes it very easy to handle event streams client-side.
const eventSource = new EventSource('https://api.example.com/someApi');
eventSource.addEventListener('userupdate', (e) => {
  console.log(e.data);
});
Server-side, there are a couple Express-based middlewares to make this even easier than it already is.
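You can also hand-roll the server side without middleware. A minimal Express sketch, assuming a long-lived server process (not Cloud Functions; see the next answer) and a hypothetical userEvents emitter standing in for whatever produces your updates:

const express = require('express');
const { EventEmitter } = require('events');

const app = express();
const userEvents = new EventEmitter(); // stand-in for your data source

app.get('/someApi', (req, res) => {
  res.set({
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  res.flushHeaders();

  // each SSE message: an event name, a data line, then a blank line
  const onUpdate = (data) =>
    res.write(`event: userupdate\ndata: ${JSON.stringify(data)}\n\n`);
  userEvents.on('update', onUpdate);

  // stop writing once the client disconnects
  req.on('close', () => userEvents.off('update', onUpdate));
});

app.listen(3000);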
Operations in Cloud Functions must be relatively short-lived and end deterministically. There is no way to keep a connection open from Cloud Functions to the client.
Typically, consider what triggers the need to send new data. For example, if it is the fact that a new user registered, you can trigger your Cloud Function from Firebase Authentication. The function could then write to the Realtime Database (or Cloud Firestore), and your client/app listens to the database for realtime updates. That way you're using all the pieces of Firebase the way they're designed: Cloud Functions for short-lived updates triggered by events in the system, and the Realtime Database or Cloud Firestore for sending realtime updates.
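A hedged sketch of that flow with the Firebase Admin SDK (the /users path is an assumption): a function triggered by Firebase Authentication on user creation writes to the Realtime Database, which clients already listen to.

const functions = require('firebase-functions');
const admin = require('firebase-admin');
admin.initializeApp();

// runs once per new Firebase Authentication user; short-lived and deterministic
exports.onUserCreated = functions.auth.user().onCreate((user) => {
  // clients listening on /users receive this as a realtime update
  return admin.database().ref(`/users/${user.uid}`).set({
    email: user.email || null,
    createdAt: admin.database.ServerValue.TIMESTAMP,
  });
});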
If that doesn't work for your use-case, you'll need a runtime environment that allows you to keep processes alive. Something like App Engine flex, Kubernetes, or many other options come to mind for that.

How To Rate-Limit Google Cloud Pub/Sub Queue

I'm using Google's Pub/Sub queue to handle messages between services. Some of the subscribers connect to rate-limit APIs.
For example, I'm pushing street addresses onto a pub/sub topic. I have a Cloud function which subscribes (via push) to that topic, and calls out to an external rate-limited geocoding service. Ideally, my street addresses could be pushed onto the topic with no delay, and the topic would retain those messages - calling the subscriber in a rate-limited fashion.
Is there any way to configure such a delay, or a message distribution rate limit? Increasing the ack window doesn't really help: I've architected this system to prevent long-running functions.
Because there's no answer so far describing workarounds, I'm going to answer this now by stating that there is currently no way to do this. There are workarounds (see the comments on the question that explain how to create a queueing system using Cloud Scheduler), but there's no way to just set a setting on a pull subscription that creates a rate limit between it and its topic.
I opened a feature request for this though. Please speak up on the tracked issue if you'd like this feature.
https://issuetracker.google.com/issues/197906331
One approach to solving your problem is to use async.queue.
It has a concurrency attribute with which you can manage the rate limit.
// create a queue object with concurrency 2
var q = async.queue(function (task, callback) {
  console.log('hello ' + task.name);
  callback();
}, 2);

// assign a callback
q.drain = function () {
  console.log('all items have been processed');
};

// add some items to the queue
q.push({ name: 'foo' }, function (err) {
  console.log('finished processing foo');
});
// quoted from async documentation
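A sketch of how that could look against the question's setup, with a pull subscription feeding the rate-limited geocoder through the queue. The subscription name and geocode() are hypothetical, and note that concurrency limits parallelism, not requests per second.

const { PubSub } = require('@google-cloud/pubsub');
const async = require('async');

// hypothetical rate-limited external call
const geocode = (address) => Promise.resolve({ address, lat: 0, lng: 0 });

// at most 2 geocoding calls in flight at any time
const geocodeQueue = async.queue((address, callback) => {
  geocode(address).then(() => callback()).catch(callback);
}, 2);

const subscription = new PubSub().subscription('street-addresses');
subscription.on('message', (message) => {
  geocodeQueue.push(message.data.toString(), (err) => {
    if (!err) message.ack(); // ack only after a successful geocode
  });
});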
GCP Cloud Tasks queues enable you to limit the rate at which tasks are dispatched; check the Cloud Tasks documentation.

Node JS Socket.IO Emitter (and redis)

I'll give a small premise of what I'm trying to do. I have a game concept in mind which requires multiple players sitting around a table somewhat like poker.
The normal interaction between different players is easy to handle via socket.io in conjunction with node js.
What I'm having a hard time figuring out is this: I have a cron job running in another process which fetches new information every minute, and that information then needs to be sent to each of those players. Since it's a different process, I'm not sure how to send this information to specific clients.
socket.io does have information for this and I'm quoting it below:
In some cases, you might want to emit events to sockets in Socket.IO namespaces / rooms from outside the context of your Socket.IO processes.
There’s several ways to tackle this problem, like implementing your own channel to send messages into the process.
To facilitate this use case, we created two modules:
socket.io-redis
socket.io-emitter
From what I understand, I need these two modules to do what I mentioned earlier. What I do not understand, however, is why Redis is in the equation when I just need to send some messages.
Is it used just to store the messages temporarily?
Any help will be appreciated.
There are several ways to achieve this if you just need to emit after an external event. It depends on what you're using to get the new data to send:
/* if the other process is an incoming http post, you can use for example
express and use your io object in a custom middleware: */
// pass the io in the req object
app.use('/incoming', (req, res, next) => {
  req.io = io;
  next(); // without next() the request would never reach the route handler
});

// then you can do:
app.post('/incoming', (req, res, next) => {
  req.io.emit('incoming', req.body);
  res.send('data received from http post request then sent over the socket');
});
// if you fetch data every minute, why don't you just emit after your job:
var job = scheduledJob('* */1 * * * *', io => {
  axios.get('/myApi/someResource').then(data => io.emit('newData', data.data));
});
As for socket.io pointing you at those two modules: you do actually need both together, though this isn't necessarily what you want. And yes, Redis is probably just used to store data temporarily; it does a really good job at that, being close to what a message queue does.
Your cron then wouldn't need a message queue or similar behaviour.
My suggestion, though, would be to run the cron with some node package from within your process as a child_process, hook onto its readable stream, and then push directly to your sockets.
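For completeness, a minimal sketch of the two modules from the question working together; the Redis host/port and room name are assumptions. The cron process only needs socket.io-emitter and a Redis connection:

// in the cron process: publish through Redis, no socket.io server needed here
const emitter = require('socket.io-emitter')({ host: '127.0.0.1', port: 6379 });
emitter.to('table-42').emit('gameUpdate', { pot: 1200 });

// in the socket.io server process: the redis adapter picks the event up
// from Redis and forwards it to the connected clients in that room
const io = require('socket.io')(3000);
const redisAdapter = require('socket.io-redis');
io.adapter(redisAdapter({ host: '127.0.0.1', port: 6379 }));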
If the cron job process is also a Node.js process, you can exchange data through Redis's pub/sub client mechanism.
Let me know what your cron job process is written in, and whether you need further help with the pub/sub mechanism.
Redis is also one of the memory stores socket.io can use (if you configure it that way).
You only need Redis if you have a multi-server configuration (cluster), to establish a connection and room/namespace sync between those Node.js instances. It has nothing to do with storing data in this case; it works as a pub/sub machine.

Most effective way to poll an Amazon SQS queue using Node

My question is short, but I think it's interesting:
I have a queue from the Amazon SQS service, and I'm polling it every second. When there's a message, I process it and then go back to polling the queue.
Is there a better way to do this, some sort of trigger? Which approach would be best in your opinion, and why?
Thanks!
A useful and easy-to-use library for consuming messages from SQS is sqs-consumer:
const Consumer = require('sqs-consumer');

const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done();
  }
});

app.on('error', (err) => {
  console.log(err.message);
});

app.start();
It's well documented if you need more information. You can find the docs at:
https://github.com/bbc/sqs-consumer
Yes, there is:
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html
You can configure SQS queues with a "receive message wait time" and do long polling.
If you set it to, say, 10 seconds, the call will return only when you have a message or after the 10-second timeout expires. You can continuously poll the queue in this scenario.
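A minimal long-polling sketch with the aws-sdk (queue URL and region are placeholders); WaitTimeSeconds makes each receiveMessage call block server-side for up to 20 seconds:

const AWS = require('aws-sdk');
const sqs = new AWS.SQS({ region: 'us-east-1' });

function longPoll() {
  sqs.receiveMessage({
    QueueUrl: 'https://sqs.us-east-1.amazonaws.com/account-id/queue-name',
    WaitTimeSeconds: 20, // long poll: returns early as soon as a message arrives
    MaxNumberOfMessages: 10,
  }, (err, data) => {
    if (!err && data.Messages) {
      data.Messages.forEach((msg) => console.log('Received:', msg.Body));
    }
    longPoll(); // loop; the waiting happens on the SQS side, not in a tight loop
  });
}

longPoll();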
As mentioned by Mircea, long polling is one option.
When you ask for a 'trigger', I believe you are looking for something other than continuously polling SQS on your own. If that is the case, I'd suggest you look at AWS Lambda. It allows you to put code in the cloud that automatically gets triggered by the events you configure, such as an SNS notification or a file pushed to S3.
http://aws.amazon.com/lambda/

why is performance of redis+socket.io better than just socket.io?

I earlier had all my code in a socket.io+node.js server. I recently converted everything to redis+socket.io+node.js after noticing slow performance when too many users sent messages across the server.
So, why was socket.io alone slow? Because it is not multi-threaded, it handles one request or emit at a time.
What Redis does is distribute these requests or emits across channels. Clients subscribe to different channels, and when a message is published on a channel, all the clients subscribed to it receive the message. It does this via this piece of code:
sub.on("message", function (channel, message) {
client.emit("message",message);
});
The client.on('message', function () {}) handler takes it from there, publishing messages to the different channels.
Here is brief code explaining what I am doing with Redis:
io.sockets.on('connection', function (client) {
  var pub = redis.createClient();
  var sub = redis.createClient();

  sub.on("message", function (channel, message) {
    client.emit('message', message);
  });

  client.on("message", function (msg) {
    if (msg.type == "chat") {
      pub.publish("channel." + msg.tousername, msg.message);
      pub.publish("channel." + msg.user, msg.message);
    } else if (msg.type == "setUsername") {
      sub.subscribe("channel." + msg.user);
    }
  });
});
As Redis stores the channel information, we can have different servers publish to the same channel.
So, what I don't understand is: if sub.on("message") is getting called every time a request or emit is sent, why is Redis supposed to give better performance? I suppose even the sub.on("message") method is not multi-threaded.
As you might know, Redis allows you to scale across multiple node instances, so the performance gain actually comes after the fact. Using the pub/sub method is not faster; it's technically slower, because you have to round-trip through Redis for every pub/sub signal. The "better performance" is only really true when you start to scale out horizontally.
For example, say one node instance (a simple chat room) can handle a maximum of 200 active users, and you are not using Redis yet because there is no need. Now, what if you want 400 active users? Using your example above, you can now reach that 400-user mark, which is a "performance increase" in the sense that you can handle more users, but not really a speed increase. If that makes sense. Hope this helps!
