AWS HTTP API Gateway + Lambda (Node/Express) Internal Server Error - node.js

I get an Internal Server Error when I have a long-running query.
I have to fetch historic data through an API, which can sometimes take longer than 30 seconds, depending on how complex the query is. It can take 1 minute as well.
I'm not sure, but my guess is that the API Gateway timeout is set to 30 seconds (and I can't increase it) while my query execution time is more than 30 seconds, so I get an Internal Server Error.
Why do I believe this?
Because if I run the same query locally (node/express, started with npm run start), it works fine even if it takes 1 minute; the response always comes back.
But when I deploy the node/express code to a Lambda function, it throws an error whenever a query takes a long time to execute.
I have the following node/express setup:
const express = require("express");
const serverless = require("serverless-http");
const cors = require("cors"); // was missing from the snippet
const userRoute = require("./routes/user.route"); // path assumed; not shown in the question

const app = express();
app.use(cors());
app.use((req, res, next) => {
  res.setHeader('Connection', 'keep-alive'); // added as suggested in some post, but not helping
  res.setHeader('Keep-Alive', 'timeout=30'); // added as suggested in some post, but not helping
  res.setHeader("Access-Control-Allow-Headers", "X-Requested-With,content-type");
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Methods", "GET, POST, OPTIONS, PUT, PATCH, DELETE");
  res.setHeader("Access-Control-Allow-Credentials", true);
  next();
});
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
app.use(`/api-end-point/user`, userRoute); // note: the mount path needs a leading slash
....
if (process.env.NODE_ENV !== "lambda") {
  const PORT = process.env.PORT || 7000;
  const server = app.listen(PORT, () => {
    console.log(`node-express server running in ${process.env.NODE_ENV} mode on ${PORT}`);
  });
  server.timeout = 0;
} else {
  module.exports.handler = serverless(app); // this is for the Lambda function
}
I deploy this code to an AWS Lambda function.
The HTTP API Gateway is configured with two routes: /ANY, /{proxy+}
TIMEOUT
API Gateway is set to the default 30 seconds. [I cannot increase this, as AWS does not allow it.]
Lambda is set to 10 **minutes**.
CORS
I really have no idea how to fix this problem.
How can I increase the API Gateway timeout, or how can I keep the connection alive?

You cannot increase the API Gateway timeout beyond 30 seconds, as has already been mentioned.
The only solution I know of at this time is to run your Lambda asynchronously, but this cannot be done with an HTTP API. If you're willing to change it to a REST API, it can be done by combining Lambda proxy integration in the REST API with invoking the Lambda asynchronously via the X-Amz-Invocation-Type invoke header. This will allow your Lambda to run asynchronously (up to 15 minutes) from an API call.
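The client side of that approach can be sketched as follows. This is an illustration, not the poster's code: the request-options helper and the endpoint URL in the comment are my own placeholders; the only piece taken from the answer is the `X-Amz-Invocation-Type: Event` header.

```javascript
// Sketch: build a fire-and-forget request for a REST API with Lambda proxy
// integration. The header tells Lambda to invoke asynchronously, so the
// caller gets an immediate acknowledgement instead of waiting ~30-60s.
function buildAsyncInvokeRequest(body) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Asynchronous ("Event") invocation instead of the default synchronous one
      "X-Amz-Invocation-Type": "Event",
    },
    body: JSON.stringify(body),
  };
}

const req = buildAsyncInvokeRequest({ query: "history-last-60-days" });
console.log(req.headers["X-Amz-Invocation-Type"]); // -> Event
// In the browser (URL is a placeholder):
// fetch("https://example.execute-api.us-east-1.amazonaws.com/history", req)
```

Because the Lambda no longer returns the result in the same HTTP call, the result has to be delivered some other way (polling, WebSocket, etc.), as the other answers describe.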

https://docs.aws.amazon.com/apigateway/latest/developerguide/limits.html
Since the timeout cannot be increased, you might change from a single HTTP request to the
https://en.wikipedia.org/wiki/Post/Redirect/Get pattern:
Client POSTs the query
Server responds with a URL for the result
Client GETs the URL multiple times -- it will return 200 OK once the result is ready
or a WebSocket.
The documentation says the Idle Connection Timeout for WebSocket APIs is up to 10 minutes.
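The "GET the URL multiple times" step can be sketched as a small client-side helper. This is my own illustration (the function name and the stub are not from the answer); `checkFn` stands in for the real GET request so the polling logic itself stays transport-agnostic.

```javascript
// Poll an injected check function until it reports the result is ready,
// waiting intervalMs between attempts and giving up after maxAttempts.
async function pollUntilReady(checkFn, { intervalMs = 1000, maxAttempts = 60 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await checkFn();
    if (result.ready) return result.data; // the 200-OK-with-payload case
    await new Promise((r) => setTimeout(r, intervalMs)); // still pending
  }
  throw new Error("gave up waiting for the result");
}

// Demo with a stub that becomes "ready" on the third attempt:
let calls = 0;
pollUntilReady(
  async () => (++calls < 3 ? { ready: false } : { ready: true, data: 42 }),
  { intervalMs: 10 }
).then((data) => console.log(data)); // -> 42
```

In the real flow, `checkFn` would `fetch` the result URL that the server returned from the initial POST.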

Using Lambda means subscribing to the patterns of the serverless catalog/philosophy, which means using async whenever possible.
As far as I understand, your Lambda receives a request and then makes another call to something (not specified) which takes 30-60s.
API Gateway has a hardcoded timeout of 29s (a hard limit).
To solve this problem, the application would need to be re-architected:
Trigger the Lambda asynchronously from the frontend using X-Amz-Invocation-Type: Event.
The Lambda calls the history API and stores the result in some storage (DynamoDB, S3, RDS, ...).
The frontend polls the backend until the data is available (or use WebSockets).
This way the historic API call can take up to 15 minutes, and the results can be cached in the storage to speed up further calls. If it needs more than 15 minutes, I would ask the historic API team to re-architect.
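The worker Lambda in the middle step can be sketched with injected dependencies. All names here (`handleHistoryJob`, `fetchHistory`, `saveResult`, the job shape) are my own; the answer only specifies the idea of "call the slow API, then persist the result somewhere pollable".

```javascript
// Sketch of the async worker: fetch the slow history API, persist the
// result so the frontend can poll for it, and report what was done.
async function handleHistoryJob(event, deps) {
  const { fetchHistory, saveResult } = deps;    // injected; real code would
  const data = await fetchHistory(event.query); // call the history API here
  await saveResult(event.jobId, data);          // e.g. DynamoDB PutItem / S3 PutObject
  return { jobId: event.jobId, status: "stored" };
}

// Demo with in-memory stubs standing in for the history API and the storage:
const store = new Map();
handleHistoryJob(
  { jobId: "job-1", query: "last-60-days" },
  {
    fetchHistory: async () => [{ day: 1, value: 100 }],
    saveResult: async (id, data) => store.set(id, data),
  }
).then((out) => console.log(out.status, store.has("job-1"))); // -> stored true
```

Injecting the two calls keeps the handler testable locally; in the deployed Lambda they would be AWS SDK / HTTP calls.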

Related

AWS HTTP API Gateway + Lambda (Node/Express) 503 Service Unavailable

I don't have much knowledge of AWS.
I have the following setup:
const express = require("express");
const serverless = require("serverless-http");
const cors = require("cors"); // was missing from the snippet
const userRoute = require("./routes/user.route"); // path assumed; not shown in the question

const app = express();
app.use(cors());
app.use((req, res, next) => {
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('Keep-Alive', 'timeout=30');
  res.setHeader("Access-Control-Allow-Headers", "X-Requested-With,content-type");
  res.setHeader("Access-Control-Allow-Origin", "*");
  res.setHeader("Access-Control-Allow-Methods", "GET, POST, OPTIONS, PUT, PATCH, DELETE");
  res.setHeader("Access-Control-Allow-Credentials", true);
  next();
});
app.use(express.json());
app.use(express.urlencoded({ extended: true }));
app.use(`/api-end-point/user`, userRoute); // note: the mount path needs a leading slash
....
if (process.env.NODE_ENV !== "lambda") {
  const PORT = process.env.PORT || 7000;
  const server = app.listen(PORT, () => {
    console.log(`node-express server running in ${process.env.NODE_ENV} mode on ${PORT}`);
  });
  server.timeout = 0;
} else {
  module.exports.handler = serverless(app);
}
user.controller.js
const AWS = require("aws-sdk");
const CryptoJS = require("crypto-js");
const Base64 = require("crypto-js/enc-base64");
const config = require("../core/config/config.json");
...
...
async getUser(token, cb) {
  const params = {
    AccessToken: token /* required */
  };
  try {
    this.cognitoIdentity.getUser(params, (err, data) => {
      if (err) cb(err);    // an error occurred
      else cb(null, data); // successful response
    });
  } catch (e) {
    console.log(e); // was `console.log(error)` -- `error` is not defined here
    return false;
  }
}
And of course, I have a Cognito user pool from which I get the requested user's information using the getUser above, by passing the Authorization token (after login).
Local Environment
When I run this and all the other APIs, believe me, I haven't received an error a single time; all APIs work just fine without any problem.
API Gateway + Lambda Environment
The problem happens when this code goes to the Lambda function.
API Gateway routes are configured as /ANY, /{proxy+}
Also, the CORS settings are as below,
And the entire code goes to the LAMBDA function, and the Lambda is associated with the API Gateway.
When I hit `API gateway + Lambda (+ cognito)` to get **user information**, it works fine most of the time, but there are instances where it fails and (all of a sudden) returns
503 Service Unavailable
Now, I really have no idea what is going wrong.
TIMEOUT
The API Gateway timeout is set to the default, which is 30 seconds.
The Lambda timeout is set to 10 MINUTES. (I just increased the Lambda timeout to try something.)
BUT THIS ISSUE KEEPS OCCURRING
Sometimes (NOT every time) I get a CORS issue (out of nowhere, all of a sudden).
Is there any way I can increase the API Gateway timeout?
What should I actually do to make it work every time?
NOTE: the getUser function is very straightforward. I don't think it takes that much time to get user details from the Cognito user pool; something is wrong that I'm not able to figure out.
Please, please help.
You could try checking the integration timeout for the API Gateway in the AWS Management Console, under the settings for the resource and method that invoke your Lambda function. Note, however, that the default API Gateway timeout is 29 seconds, which might not be enough time for the Lambda function to complete and return a response, and it cannot be raised beyond that.
Answer to question 1:
Seeing your code, I suspect you are using the Serverless Framework. If your code runs normally when you invoke it locally, I would suggest trying to reserve concurrency for this function (or asking AWS to increase your account limit).
You can find it in the Concurrency tab, and I would suggest reserving some concurrency for your function. Even though your account has a 1000-concurrent-executions limit, if you are logging in to your AWS client account using IAM or deploying on shared resources, the capacity pool might be shared. I would suggest checking on that as well.
Lambda usually returns 503 when:
The number of function executions exceeded one of the quotas (formerly known as limits) that Lambda sets to throttle executions in an AWS Region (concurrent executions or invocation frequency),
or when
The function exceeded the Lambda function timeout quota.
You can read more about it here.
But if the worst-case scenario happens, meaning your account concurrency limit is hit, you would need to ask AWS Support to raise your total allowed concurrency, which involves going to the Service Quotas dashboard. For a more detailed guide, you can refer to this.
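Reserving concurrency can also be done programmatically with the AWS SDK's `putFunctionConcurrency` call. A minimal sketch, assuming a function named "user-api" (the name and the value 50 are placeholders; the actual SDK call is commented out because it needs credentials and a deployed function):

```javascript
// Parameters for Lambda's PutFunctionConcurrency API (AWS SDK for JavaScript v2).
const params = {
  FunctionName: "user-api",         // assumption: your deployed function's name
  ReservedConcurrentExecutions: 50, // carved out of the account-wide pool
};

// Real call (requires credentials; shown for reference only):
// const AWS = require("aws-sdk");
// new AWS.Lambda().putFunctionConcurrency(params).promise()
//   .then(() => console.log("concurrency reserved"));

console.log(params.ReservedConcurrentExecutions); // -> 50
```

Reserving concurrency guarantees the function that many concurrent executions even when other functions in the account are consuming the shared pool, which addresses exactly the 503-from-throttling case described above.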
Answer to question 2:
It would be hard to answer without knowing whether one path or all of your paths got CORS errors. If all paths got CORS errors, it could be because your OPTIONS requests get no response when your server does not run (because it hit the concurrency limit or something similar). If concurrency really is the problem you have, this should resolve itself.
Answer to question 3:
Unfortunately, you can't raise the API Gateway timeout limit; it is fixed at 29 seconds and AWS does not allow you to change it.
You can read more details here.

What limits the number of outbound requests an express server inside a Cloud Run container instance can make?

I have an express server running inside a Cloud Run container.
I know a single Cloud Run instance may handle up to 250 requests concurrently.
I also know that an express server may handle many requests at a time.
But what about the outbound external requests that the request handler functions of that server make? What limits those?
Imagine I have an API route, that needs to make a lot of external API calls. For example:
/api/update-all-db-with-external-info
And the handler for that route is something like this:
updateAll.ts
import fetch from "cross-fetch";

async function updateAll() {
  const all1000promises = [];
  for (const obj of WHOLE_DB_1000_ITEMS) {
    all1000promises.push(fetch("SOME_EXTERNAL_API"));
  }
  const values = await Promise.all(all1000promises); // was an un-awaited .then
  console.log(values);
  // UPDATE DB...
  // DO OTHER STUFF...
}
What will limit this kind of code? Is it the memory allocated to the instance? Or is there something like an outbound-request concurrency limit for either the Express server or the Cloud Run instance?
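Whatever the platform's ceiling turns out to be, one common mitigation is to cap the in-flight requests yourself rather than firing all 1000 at once. A sketch (my own helper, not from the question; the worker stub stands in for `fetch("SOME_EXTERNAL_API")`):

```javascript
// Map over items with at most `limit` concurrent invocations of `worker`.
// Several "runner" loops share a cursor and drain the queue cooperatively;
// JavaScript's single thread makes the `next++` claim race-free.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function run() {
    while (next < items.length) {
      const i = next++; // claim the next item
      results[i] = await worker(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, run));
  return results;
}

// Demo: 5 items, never more than 2 in flight at once.
mapWithConcurrency([1, 2, 3, 4, 5], 2, async (n) => n * 10)
  .then((out) => console.log(out)); // -> [ 10, 20, 30, 40, 50 ]
```

This keeps memory and open-socket counts bounded regardless of what limit the container or runtime would otherwise impose.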

Firebase cloud functions - what happens with multiple HTTP triggers at once

I have a firebase cloud function that is an endpoint for an external API, and it handles a POST request.
This external API POSTs data to my cloud function endpoint at random intervals (the cloud function gets pinged with a POST request whenever a result is returned from this external API; there can be multiple at once, and it's unpredictable).
exports.handleResults = functions.https.onRequest((req, res) => {
  if (req.method === 'POST') {
    // run code here that handles the POST payload
  }
});
What happens when more than one POST request comes in at the same time?
Is there a queue? Does it finish the first request before moving on to the next?
Or if another request comes in while the function is running, does it block/ignore the request until the function is done?
Cloud Functions will automatically scale up the server instances running your functions when it determines that more capacity is needed. Those instances will run your function concurrently. The instances will be scaled down when they are no longer needed. The exact behavior is not documented - it should be considered an implementation detail that may change over time.
To learn more about this, watch my video about Cloud Functions scaling and isolation.

Node app that fetches, processes, and formats data for consumption by a frontend app on another server

I currently have a frontend-only app that fetches 5-6 different JSON feeds, grabs some necessary data from each of them, and then renders a page based on said data. I'd like to move the data fetching / processing part of the app to a server-side node application which outputs one simple JSON file which the frontend app can fetch and easily render.
There are two noteworthy complications for this project:
1) The new backend app will have to live on a different server than its frontend counterpart
2) Some of the feeds change fairly often, so I'll need the backend processing to constantly check for changes (every 5-10 seconds). Currently with the frontend-only app, the browser fetches the latest versions of the feeds on load. I'd like to replicate this behavior as closely as possible
My thought process for solving this took me in two directions:
The first is to setup an express application that uses setTimeout to constantly check for new data to process. This data is then sent as a response to a simple GET request:
const express = require('express');
let app = express();
const port = process.env.PORT || 3000; // was undefined in the snippet

let processedData = {};
const getData = () => {...} // returns a promise that fetches and processes data

/* use an immediately invoked function with setTimeout to fetch the data
 * when the program starts and then once every 5 seconds after that */
(function refreshData() {
  getData().then((data) => { // was `getData.then`, which would throw
    processedData = data;
  });
  setTimeout(refreshData, 5000);
})();

app.get('/', (req, res) => {
  res.send(processedData);
});

app.listen(port, () => {
  console.log(`Started on port ${port}`);
});
I would then run a simple get request from the client (after properly adjusting CORS headers) to get the JSON object.
My questions about this approach are pretty generic: Is this even a good solution to this problem? Will this drive up hosting costs based on processing / client GET requests? Is setTimeout a good way to have a task run repeatedly on the server?
The other solution I'm considering would deal with setting up an AWS Lambda that writes the resulting JSON to an s3 bucket. It looks like the minimum interval for scheduling an AWS Lambda function is 1 minute, however. I imagine I could set up 3 or 4 identical Lambda functions and offset them by 10-15 seconds, however that seems so hacky that it makes me physically uncomfortable.
Any suggestions / pointers / solutions would be greatly appreciated. I am not yet a super experienced backend developer, so please ELI5 wherever you deem fit.
A few pointers.
Use cron tasks for periodic processing of data. This is far preferable, especially if you are formatting a lot of data.
Don't set up multiple Lambda functions for the same task. It's going to be messy to maintain all those functions.
After processing/fetching the feed, you can store the JSON file on your own server or in S3. Note that with S3 you are paying for, and waiting on, a network operation. You can read the file from your Express app and just send the response back to your clients.
Depending on the file size and the load on your server, you might want to add a caching layer so that you can cache the response until new JSON data is available.
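The last pointer, caching the response until new data is available, can be sketched as a small TTL wrapper. This is my own illustration (the helper name and TTL value are placeholders); `fetchFn` stands in for the real feed-processing call.

```javascript
// Wrap an async fetcher so repeated calls within ttlMs reuse the last
// result instead of refetching/reprocessing the feeds each time.
function makeCachedFetcher(fetchFn, ttlMs) {
  let cached = null;
  let fetchedAt = 0;
  return async function () {
    const now = Date.now();
    if (cached !== null && now - fetchedAt < ttlMs) return cached; // still fresh
    cached = await fetchFn(); // refresh the cache
    fetchedAt = now;
    return cached;
  };
}

// Demo: the second call inside the TTL window does not hit fetchFn again.
let hits = 0;
const getFeed = makeCachedFetcher(async () => ({ version: ++hits }), 5000);
getFeed()
  .then(() => getFeed())
  .then((data) => console.log(data.version, hits)); // -> 1 1
```

In the Express app above, the `/` route handler would call `getFeed()` instead of reading `processedData` directly, getting the same every-few-seconds freshness without refetching on every client request.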

Node server, socket, request and response timeouts

Problem
Node's default configuration times out requests after 2 minutes. I would like to change the request timeouts to:
1 minute for 'normal' requests
5 minutes for requests that serve static files (big assets in this case)
8 hours for uploads (couple of thousand pictures per request)
Research
Reading through Node's documentation, I've discovered that there are numerous ways of defining timeouts.
server.setTimeout
socket.setTimeout
request.setTimeout
response.setTimeout
I'm using Express, which also provides middleware to define timeouts for (specific) routes. I've tried that, without success.
Question
I'm confused about how to properly configure the timeout limit globally and per route. Should I configure all of the above timeouts? How is setting the server's timeout different to setting the socket's or request's timeout?
As I saw on your other question concerning the usage of the timeout middleware, you are using it somewhat differently.
See the documentation of the connect-timeout middleware.
Add your errorHandler function as an event listener on the request, which is an EventEmitter; the middleware causes it to emit the timeout event:
req.on("timeout", function (evt) {
  if (req.timedout) {
    if (!res.headersSent) {
      res.status(408).send({
        success: false, // a timeout is not a success
        message: 'Timeout error'
      });
    }
  }
});
This is called outside of the middleware stack, so calling next(err) there would be invalid. Also keep in mind that if the timeout fires while the request is still hanging server-side, you have to prevent your server code from processing the request any further (the headers have already been sent and the underlying connection will no longer be available).
Summary
The Node.js timeout APIs are all inactivity timeouts.
The expressjs/timeout package is a response hard timeout.
Node.js timeout APIs
server.timeout -- inactivity/idle timeout; equal to the socket timeout; default 2 min
server.setTimeout() -- inactivity/idle timeout; equal to the socket timeout; default 2 min; takes a callback
socket.setTimeout() -- inactivity/idle timeout; the callback is responsible for calling end()/destroy() on the socket; default: no timeout
response.setTimeout() -- front end for socket.setTimeout()
request.setTimeout() -- front end for socket.setTimeout()
expressjs/timeout package
response hard timeout (vs. inactivity); takes a callback
Conclusion
For a maximum time allowed per action (request + response), the express/timeout package is needed.
This is probably what you need, but the callback has to end the request/response: the timeout only triggers the callback; it does not change any state or interfere with the connection. That is the callback's job.
For idle timeouts, set the Node.js request/response timeout APIs.
I don't recommend touching these, as it is not necessary in most cases, unless you want to allow a connection to idle (no traffic) for over 2 minutes.
There is already Connect middleware for timeout support. You can try this middleware.
var timeout = express.timeout // express v3 and below
var timeout = require('connect-timeout'); // express v4
app.use(timeout(120000)); // change to your desired time
app.use(haltOnTimedout);
function haltOnTimedout(req, res, next) {
  if (!req.timedout) next();
}