Ensure PubSub batch requests are sent in Google Cloud Run Node App - node.js

I have a NodeJS server running in a container in Google Cloud Run. It publishes messages to PubSub.
// during handling a request,
const topic = pubsub.topic(topicName, {
  batching: {
    maxMessages: 1000,
    maxMilliseconds: 1000,
  },
});
// publish some messages
someArray.forEach((item) => {
  topic.publishJSON(item);
});
Let's assume someArray.length is less than maxMessages.
What happens if Node sends the response before maxMilliseconds has elapsed? Are the messages sent? Does Google Cloud Run kill the container upon the HTTP response, or does it somehow know that the PubSub library has set a timeout?

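Cloud Run doesn't kill the container immediately, but it limits the container's access to CPU once the request has been processed, so a batch that is still waiting on its timer may not be sent promptly. As the documentation suggests, finish all asynchronous tasks while handling the request: use async/await, as in the documentation's example, and wait for the publish calls to resolve before sending the response.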
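A minimal sketch of what "finish the asynchronous work inside the request" could look like. It assumes `topic` exposes a promise-returning `publishMessage({ json })`, as recent versions of @google-cloud/pubsub do (the question's `publishJSON` also returns a promise). `publishAll`, `handleRequest`, and the `req.body.items` shape are hypothetical names used only for illustration.

```javascript
// Publish every item and resolve only after all publishes have been acknowledged.
// `topic` is assumed to be a @google-cloud/pubsub Topic, or any object with the
// same promise-returning publishMessage method. publishAll is a hypothetical helper.
async function publishAll(topic, items) {
  const messageIds = await Promise.all(
    items.map((item) => topic.publishMessage({ json: item }))
  );
  return messageIds;
}

// Hypothetical Express-style handler: the response is sent only after the
// publishes settle, so the batch goes out while the request is still active.
async function handleRequest(topic, req, res) {
  try {
    const ids = await publishAll(topic, req.body.items);
    res.status(200).json({ published: ids.length });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
}
```

With batching enabled, each publish promise resolves only once its batch has been sent, so a request with fewer than maxMessages items can take up to maxMilliseconds longer. Lowering maxMilliseconds is one way to trade batching efficiency for latency.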

Related

How to let the frontend know when a background job is done?

In Heroku long requests can cause H12 timeout errors.
The request must then be processed...by your application...within 30 seconds to
avoid the timeout.
src
Heroku suggests moving long tasks to background jobs.
Sending an email...Accessing a remote API...
Web scraping / crawling...you should move this heavy lifting into a background job which can run asynchronously from your web request.
src
Heroku's docs say requests shouldn't take longer than 500ms to return a response.
It’s important for a web application to serve end-user requests as
fast as possible. A good rule of thumb is to avoid web requests which
run longer than 500ms. If you find that your app has requests that
take one, two, or more seconds to complete, then you should consider
using a background job instead.
src
So if I have a background job, how do I tell the frontend when the background job is done and what the job returns?
On Heroku their example code just returns the background job id. But this won't give the frontend the information it needs.
app.post('/job', async (req, res) => {
  let job = await workQueue.add();
  res.json({ id: job.id });
});
For example this method won't tell the frontend when an image is done being uploaded. Or the frontend won't know when a call to an API, like an external exchange rate API, returns a result, like an exchange rate, and what that result is.
Someone suggested using job.finished(), but doesn't this get you back where you started? Now your requests are waiting for the queue to finish in order to respond. So your requests take the same length of time as when there was no queue, and this could lead to timeout errors again.
const result = await job.finished();
res.send(result);
This example uses Bull, Redis, and Node.js.
Someone suggested websockets. I didn't find an example of this yet.
The idea of using a queue for long tasks is that you post the task and
then return immediately. I guess you are updating the database as last
step in your job, and only use the completed event for notifying the
clients. What you need to do in this case is to implement either a
websocket or similar realtime communication and push the notification
to relevant clients. This can become complicated so you can save some
time with a solution like https://pusher.com/ or similar...
https://github.com/OptimalBits/bull/issues/1901
I also saw a solution in heroku's full example, which I didn't originally see:
frontend
// Fetch updates for each job
async function updateJobs() {
  for (let id of Object.keys(jobs)) {
    let res = await fetch(`/job/${id}`);
    let result = await res.json();
    if (!!jobs[id]) {
      jobs[id] = result;
    }
    render();
  }
}
// Attach click handlers and kick off background processes
window.onload = function() {
  document.querySelector("#add-job").addEventListener("click", addJob);
  document.querySelector("#clear").addEventListener("click", clear);
  setInterval(updateJobs, 200);
};
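The frontend loop polls `/job/${id}`, so the server needs a route that reports a job's current state. A minimal in-memory sketch of that idea follows. It is not Heroku's code: the `jobs` Map, `startJob`, and `getJobStatus` are hypothetical names, and a real app would read the state from its queue (for example Bull) rather than from process memory.

```javascript
// In-memory job registry: a stand-in for a real queue such as Bull (assumption).
const jobs = new Map();
let nextId = 1;

// Start a task in the background and return its id immediately.
function startJob(task) {
  const id = String(nextId++);
  jobs.set(id, { id, state: 'active', result: null, error: null });
  // Deliberately not awaited: the HTTP request can return before the task finishes.
  Promise.resolve()
    .then(task)
    .then((result) => jobs.set(id, { id, state: 'completed', result, error: null }))
    .catch((err) => jobs.set(id, { id, state: 'failed', result: null, error: err.message }));
  return id;
}

// What a GET /job/:id route would send back; null means the id is unknown.
function getJobStatus(id) {
  return jobs.get(id) || null;
}
```

In this sketch, a POST /job route would call startJob and respond with the id, and a GET /job/:id route would respond with getJobStatus(req.params.id). The frontend keeps polling until the state is 'completed' or 'failed', at which point the result (or error) is in the response body.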

How to process data from AWS SQS?

I'm having trouble understanding how to work with AWS SQS in NodeJS. I've created a simple endpoint that receives data sent to my server and adds it to an SQS queue:
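import { Request, Response } from 'express';
import * as AWS from 'aws-sdk';

// AWS SDK v2 client; region and credentials are assumed to come from the environment
const sqs = new AWS.SQS();

export const receiveMessageFromSDK = async (req: Request, res: Response) => {
  const payload = req.body;
  try {
    await sqs.sendMessage({
      QueueUrl: process.env.SQS_QUEUE_URL as string,
      // MessageBody must be a string, so serialize the parsed request body
      MessageBody: JSON.stringify(payload)
    }).promise();
    res.sendStatus(202);
  } catch (error) {
    //something else
    res.sendStatus(500);
  }
}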
This works, and the MessageBody is added to my SQS queue. But now I have a problem with processing the data from this queue. How do I do this?
In Google Cloud I create a simple request queue and include in the body payload the URL of the endpoint that should process the data from the queue; gcloud then sends the body to that endpoint, and my server starts the business logic on this data. How do I do this with SQS? How do I receive data from the queue and start processing it on my side?
I was thinking about the QueueUrl param, but the docs say this value should be a URL from the AWS console, like https://sqs.eu-central-1.amazonaws.com..., so I have no idea how to process the data from the queue on my side.
So, can anybody help me?
Thanks in advance for any help!
Should I run this function from cron, or can AWS, for example, send a request with this data to an endpoint that I provide?
SQS is a pull-based service, which means you need a cron job (or an equivalent mechanism in your app) that repeatedly polls your queue for messages using receiveMessage, e.g. every 10 seconds.
If you want a push-type messaging system, you have to use SNS instead of SQS. SNS can push messages to your HTTP/HTTPS endpoint automatically if your application exposes such an endpoint for the messages.
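A minimal sketch of one polling pass with receiveMessage, assuming an AWS SDK v2 client (the `.promise()` style used in the question) and assuming the sender serialized the body as JSON. `processBatch` and `handler` are hypothetical names; the cron job or timer that calls this function repeatedly is left out.

```javascript
// Receive one batch from SQS, run the handler on each message, then delete it.
// `sqs` is assumed to be an AWS SDK v2 client (new AWS.SQS());
// `handler` stands for your own business logic.
async function processBatch(sqs, queueUrl, handler) {
  // The response has no Messages property when the queue is empty.
  const { Messages = [] } = await sqs
    .receiveMessage({ QueueUrl: queueUrl, MaxNumberOfMessages: 10 })
    .promise();

  for (const message of Messages) {
    // Assumes the producer sent JSON.stringify(payload) as the MessageBody.
    await handler(JSON.parse(message.Body));
    // Delete only after successful processing; a message that is not deleted
    // becomes visible again after its visibility timeout and is retried.
    await sqs
      .deleteMessage({ QueueUrl: queueUrl, ReceiptHandle: message.ReceiptHandle })
      .promise();
  }
  return Messages.length;
}
```

The QueueUrl here is the same console URL used for sendMessage: the consumer identifies the queue with it and pulls from it, rather than SQS calling an endpoint of yours.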

Concurrency in node js express app for get request with setTimeout

const express = require('express');
const app = express();
const port = 4444;
app.get('/', async (req, res) => {
  console.log('got request');
  await new Promise(resolve => setTimeout(resolve, 10000));
  console.log('done');
  res.send('Hello World!');
});
app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
If I hit http://localhost:4444 with three concurrent GET requests, it logs the following:
got request
done
got request
done
got request
done
Shouldn't the output instead look like the listing below, given Node's event loop and the callback queues that live outside the main thread? (Maybe I am wrong, but I would like to understand Node's internals and its external APIs; see the attached image of the JavaScript runtime environment.)
got request
got request
got request
done
done
done
Thanks to https://stackoverflow.com/users/5330340/phani-kumar
I found the reason why it was blocking. I was testing this in Chrome: I was making the GET requests from the Chrome browser, and when I tried the same in Firefox it worked as expected.
The reason is this:
Chrome locks the cache and waits to see the result of one request before requesting the same resource again.
Chrome stalls when making multiple requests to same resource?
It is returning the response like this:
Node.js is an event-driven runtime. To understand the concurrency here, look at how Node executes this code. JavaScript in Node runs on a single thread (although Node uses multiple threads internally), and Node accepts requests as they arrive. In this case, Node accepts the request and registers a callback for the promise; while it waits for the event loop to run that callback, it keeps accepting as many requests as its resources (memory, CPU, etc.) allow. Because the event loop has a timer queue for setTimeout, all of these callbacks are registered there, and once each timer expires the event loop runs its callback.
Single-threaded event loop model, processing steps:
1. Clients send requests to the Node.js server.
2. Node.js internally maintains a limited (configurable) thread pool to serve client requests.
3. Node.js receives those requests and places them into a queue known as the "event queue".
4. Node.js internally has a component known as the "event loop". It gets this name because it uses an indefinite loop to receive requests and process them.
5. The event loop uses a single thread only. It is the heart of the Node.js processing model.
6. The event loop checks whether any client request is in the event queue. If not, it waits indefinitely for incoming requests.
7. If yes, it picks up one client request from the event queue and:
   - starts processing that client request;
   - if the request does not require any blocking I/O operations, processes everything, prepares the response, and sends it back to the client;
   - if the request requires blocking I/O operations, such as interacting with a database, the file system, or external services, follows a different approach:
     - checks thread availability in the internal thread pool;
     - picks up one thread and assigns the client request to that thread;
     - that thread takes the request, processes it, performs the blocking I/O operations, prepares the response, and sends it back to the event loop.
You can check here for more details (very well explained).
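The interleaving the question expected can be reproduced without Express or a browser, which removes Chrome's per-resource request queuing from the picture. This is a small sketch: `handle` and `runThree` are hypothetical stand-ins for the route handler and for three independent clients.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Stand-in for the route handler in the question (no Express, no browser).
async function handle(id, delayMs, log) {
  log.push(`got request ${id}`);
  await sleep(delayMs); // the handler is suspended here; the event loop stays free
  log.push(`done ${id}`);
}

// Start three "requests" at the same time, as three independent clients would.
async function runThree(delayMs = 10000) {
  const log = [];
  await Promise.all([1, 2, 3].map((id) => handle(id, delayMs, log)));
  return log;
}

runThree(100).then((log) => console.log(log.join('\n')));
```

All three "got request" lines are logged before any "done" line, because each handler gives control back to the event loop at its await instead of blocking the thread for the duration of the timer.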

Most effective way to poll an Amazon SQS queue using Node

My question is short, but I think is interesting:
I have a queue in the Amazon SQS service, and I'm polling the queue every second. When there's a message, I process it and then go back to polling the queue.
Is there a better way to do this, such as some sort of trigger? Which approach would be best in your opinion, and why?
Thanks!
A useful and easy-to-use library for consuming messages from SQS is sqs-consumer:
// Callback-style API of older sqs-consumer releases. Newer releases export
// { Consumer } and expect handleMessage to be an async function without `done`.
const Consumer = require('sqs-consumer');
const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done();
  }
});
app.on('error', (err) => {
  console.log(err.message);
});
app.start();
It's well documented if you need more information. You can find the docs at:
https://github.com/bbc/sqs-consumer
Yes, there is:
http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-long-polling.html
You can configure an SQS queue with a "receive message wait time" and do long polling.
For example, if you set it to 10 seconds, the call returns only when a message is available or after the 10-second timeout expires. You can then poll the queue continuously.
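A sketch of the continuous long-polling loop described above, assuming an AWS SDK v2 client. Here the wait time is set per request with WaitTimeSeconds instead of on the queue. `longPoll`, `handleMessage`, and the `shouldStop` hook are hypothetical names introduced for illustration.

```javascript
// Long-poll loop: each receiveMessage call waits up to WaitTimeSeconds for a
// message instead of returning immediately, so no sleep is needed between calls.
// `sqs` is assumed to be an AWS SDK v2 client; `shouldStop` is a hypothetical
// hook that lets the caller end the loop cleanly.
async function longPoll(sqs, queueUrl, handleMessage, shouldStop = () => false) {
  while (!shouldStop()) {
    const { Messages = [] } = await sqs
      .receiveMessage({
        QueueUrl: queueUrl,
        WaitTimeSeconds: 10, // 0 means short polling; 20 is the maximum
        MaxNumberOfMessages: 10, // SQS returns at most 10 messages per call
      })
      .promise();

    for (const message of Messages) {
      await handleMessage(message);
      await sqs
        .deleteMessage({ QueueUrl: queueUrl, ReceiptHandle: message.ReceiptHandle })
        .promise();
    }
  }
}
```

Compared with polling every second, an idle queue costs one request per wait period rather than one per second, and a message that arrives during the wait is returned without waiting for the next tick.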
As mentioned by Mircea, long polling is one option.
When you ask for a 'trigger', I believe you are looking for something other than continuously polling SQS on your own. If that is the case, I'd suggest you look at AWS Lambda. It allows you to put code in the cloud that is triggered automatically by the events you configure, such as an SNS event, a file pushed to S3, etc.
http://aws.amazon.com/lambda/

NodeJS and AWS SQS

Folks,
I would like to set up a message queue between our Java API and NodeJS API.
After reading several examples of using aws-sdk, I am not sure how to make the service watch the queue.
For instance, this article Using SQS with Node: Receiving Messages Example Code tells me to use the sqs.receiveMessage() to receive and sqs.deleteMessage() to delete a message.
What I am not clear about, is how to wrap this into a service that runs continuously, which constantly takes the messages off the sqs queue, passes them to the model, stores them in mongo, etc.
Hope my question is not entirely vague. My experience with Node lies primarily with Express.js.
Is the answer as simple as using something like sqs-poller? How would I implement the same into an already running NodeJS Express app? Quite possibly I should look into SNS to not have any delay in message transfers.
Thanks!
For a start, a standard Amazon SQS queue is a pseudo-queue: it guarantees availability of messages but not FIFO ordering. You have to implement sequencing logic in your app if you need messages in order (or use an SQS FIFO queue).
Coming back to your question, your app has to poll SQS to check whether new messages are available. I implemented this in an app using setInterval(): I polled the queue for items; if none were found, I delayed the next call, and if some were found, the next call was made immediately, bypassing the setInterval(). This is obviously a very raw implementation, and you can look into alternatives. How about a child process on your server that pings your NodeJS app when a new item is found in SQS? I think you could implement the child process as a watcher in Bash without using NodeJS. You can also check whether an npm module for this already exists.
In short, there are many ways you can poll but polling has to be done one way or the other if you are working with Amazon SQS.
I am not sure about this but if you want to be notified of items, you might want to look into Amazon SNS.
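The "poll again immediately when items were found, otherwise delay" idea from this answer could be sketched as below. Note one swapped-in detail: it chains setTimeout calls rather than using setInterval, so that two polls never overlap. `startAdaptivePolling` and `pollOnce` are hypothetical names; `pollOnce` is assumed to resolve to the number of messages it processed.

```javascript
// Adaptive polling: poll again right away while the queue has items,
// and back off for `idleDelayMs` when a poll comes back empty.
// `pollOnce` is a hypothetical async function that resolves to the number of
// messages it processed (for example a wrapper around sqs.receiveMessage).
function startAdaptivePolling(pollOnce, idleDelayMs = 5000) {
  let stopped = false;
  let timer = null;

  async function tick() {
    if (stopped) return;
    let processed = 0;
    try {
      processed = await pollOnce();
    } catch (err) {
      console.error('poll failed:', err.message); // treated like an empty poll
    }
    if (stopped) return;
    timer = setTimeout(tick, processed > 0 ? 0 : idleDelayMs);
  }

  tick();
  // Return a stop function so the caller can shut the loop down.
  return () => {
    stopped = true;
    clearTimeout(timer);
  };
}
```

In an existing Express app this could be started once next to app.listen(), with the returned stop function called on shutdown.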
When writing applications to consume messages from SQS I use sqs-consumer:
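// Callback-style API of older sqs-consumer releases. Newer releases export
// { Consumer } and expect handleMessage to be an async function without `done`.
const Consumer = require('sqs-consumer');
const app = Consumer.create({
  queueUrl: 'https://sqs.eu-west-1.amazonaws.com/account-id/queue-name',
  handleMessage: (message, done) => {
    console.log('Processing message: ', message);
    done();
  }
});
app.on('error', (err) => {
  console.log(err.message);
});
app.start();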
See the docs for more information (well documented):
https://github.com/bbc/sqs-consumer
