Invoking multiple AWS Lambdas doesn't make parallel processes - node.js

I am trying to invoke multiple lambda functions (a single lambda function that should run as separate parallel processes) from another lambda function. The first one runs as a cron lambda that just queries docs from the db and then invokes the second lambda with each doc's params. This cron lambda runs every five minutes and queries the docs correctly. I was testing the second lambda with two documents. The problem is that every time the second lambda gets invoked, it processes only one document: each time, it processes the one it did not process on the previous invocation:
Ex:
doc 1
doc 2
First invoke of second lambda -> process doc 1
Second invoke of second lambda -> process doc 2
Third invoke of second lambda -> process doc 1
Fourth invoke of second lambda -> process doc 2
etc...
First (cron) lambda code:
aws.config.update({
region : env.lambdaRegion,
accessKeyId: env.lambdaAccessKeyId,
secretAccessKey: env.lambdaSecretAccessKey,
});
const lambda = new aws.Lambda({
region: env.lambdaRegion,
});
exports.handler = async (event: any, context: any) => {
context.callbackWaitsForEmptyEventLoop = false;
return new Promise(async (resolve, reject) => {
for (let i = 0; i < 100; i++) {
const doc = await mongo.db.collection('docs').
findOneAndUpdate(
{
status: 1,
lambdaProcessing: null,
},
{ $set: { lambdaProcessing: new Date() } },
{
sort: { processedAt: 1 },
returnNewDocument: true,
},
);
if (doc.value && doc.value._id) {
const params = {
FunctionName: env.lambdaName,
InvocationType: 'Event',
Payload: JSON.stringify({ docId: doc.value._id }),
};
lambda.invoke(params);
} else {
if (doc.lastErrorObject && doc.lastErrorObject.n === 0) {
break;
}
}
}
resolve();
});
};
Second lambda function:
exports.handler = async (event: any, ctx: any) => {
ctx.callbackWaitsForEmptyEventLoop = false;
if (event && event.docId) {
const doc = await mongo.db.collection('docs').findById(event.docId);
return await processDoc(doc);
} else {
throw new Error('doc ID is not present.');
}
};

To run multiple lambdas in parallel without an "ugly" cron job solution, I would recommend using AWS Step Functions with a state of type Parallel. You can set up the logic in your serverless.yml; the steps themselves are lambda functions. You can pass data on through the second argument of callback. If the data is larger than 32 KB, though, I would recommend using an S3 bucket or a database instead.
Example serverless.yml
stepFunctions:
  stateMachines:
    test:
      name: 'test'
      definition:
        Comment: "Testing tips-like state structure"
        StartAt: GatherData
        States:
          GatherData:
            Type: Parallel
            Branches:
              -
                StartAt: GatherDataA
                States:
                  GatherDataA:
                    Type: Task
                    Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:${self:service}-${opt:stage, self:provider.stage}-firstA"
                    TimeoutSeconds: 15
                    End: true
              -
                StartAt: GatherDataB
                States:
                  GatherDataB:
                    Type: Task
                    Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:${self:service}-${opt:stage, self:provider.stage}-firstB"
                    TimeoutSeconds: 15
                    End: true
            Next: ResolveData
          ResolveData:
            Type: Task
            Resource: "arn:aws:lambda:#{AWS::Region}:#{AWS::AccountId}:function:${self:service}-${opt:stage, self:provider.stage}-resolveAB"
            TimeoutSeconds: 15
            End: true
Example handlers
module.exports.firstA = (event, context, callback) => {
const data = {
id: 3,
somethingElse: ['Hello', 'World'],
};
callback(null, data);
};
module.exports.firstB = (event, context, callback) => {
const data = {
id: 12,
somethingElse: ['olleH', 'dlroW'],
};
callback(null, data);
};
module.exports.resolveAB = (event, context, callback) => {
console.log("resolving data from a and b: ", event);
const [dataFromA, dataFromB] = event;
callback(null, event);
};
For more information, see:
https://serverless.com/plugins/serverless-step-functions/
https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-common-fields.html

The key was to create a new, separate aws.Lambda() instance for every lambda we want to invoke, and then to resolve and await every invocation (via a promises array). This is OK if the results of the invoked lambdas don't need to be awaited, so we don't waste processing time on AWS: the invoked lambda starts processing and the invoke call resolves without waiting for its response, so the main (cron) lambda can resolve.
Fixed (cron) lambda handler:
aws.config.update({
region : env.lambdaRegion,
accessKeyId: env.lambdaAccessKeyId,
secretAccessKey: env.lambdaSecretAccessKey,
});
exports.handler = async (event: any, context: any) => {
context.callbackWaitsForEmptyEventLoop = false;
return new Promise(async (resolve, reject) => {
const promises: any = [];
for (let i = 0; i < 100; i++) {
const doc = await global['mongo'].db.collection('docs').
findOneAndUpdate(
{
status: 1,
lambdaProcessing: null,
},
{ $set: { lambdaProcessing: new Date() } },
{
sort: { processedAt: 1 },
returnNewDocument: true,
},
);
if (doc.value && doc.value._id) {
const params = {
FunctionName: env.lambdaName,
InvocationType: 'Event',
Payload: JSON.stringify({ docId: doc.value._id }),
};
const lambda = new aws.Lambda({
region: env.lambdaRegion,
maxRetries: 0,
});
promises.push(
new Promise((invokeResolve, invokeReject) => {
lambda.invoke(params, (error, data) => {
if (error) { console.error('ERROR: ', error); }
if (data) { console.log('SUCCESS:', data); }
// Resolve invoke promise in any case.
invokeResolve();
});
}),
);
} else {
if (doc.lastErrorObject && doc.lastErrorObject.n === 0) {
break;
}
}
}
await Promise.all(promises);
resolve();
});
};
Second (processing) lambda:
exports.handler = async (event: any, ctx: any) => {
ctx.callbackWaitsForEmptyEventLoop = false;
if (event && event.docId) {
const doc = await mongo.db.collection('docs').findById(event.docId);
processDoc(doc);
return ctx.succeed('Completed.');
} else {
throw new Error('Doc ID is not present.');
}
};
I don't know if there is any better way of achieving this using strictly lambda functions, but this works.
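One detail that may explain the original behaviour: in the AWS SDK for JavaScript v2, calling lambda.invoke(params) without a callback only builds a request object. Nothing is sent until a callback is supplied or .send() or .promise() is called on it. Below is a minimal sketch of the same fan-out using .promise(), assuming SDK v2. The helper name invokeAll and the injected lambda parameter are illustrative and not part of the original code.

```javascript
// Fan out one asynchronous ('Event') invocation per document ID and wait
// until every request has been handed to the Lambda service.
// `lambda` is any object with the SDK v2 shape invoke(params).promise().
async function invokeAll(lambda, functionName, docIds) {
  return Promise.all(
    docIds.map((docId) =>
      lambda
        .invoke({
          FunctionName: functionName,
          InvocationType: 'Event',
          Payload: JSON.stringify({ docId }),
        })
        .promise()
        // Record failures instead of rejecting, so one bad invoke
        // does not hide the outcome of the others.
        .then(
          (data) => ({ docId, ok: true, data }),
          (error) => ({ docId, ok: false, error }),
        ),
    ),
  );
}
```

With a real client this would be called as invokeAll(new aws.Lambda({ region }), env.lambdaName, ids). A single shared client should be enough here, since each invoke call creates its own request.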

Related

Jest - How to mock aws-sdk sqs.receiveMessage method

I am trying to mock the sqs.receiveMessage function, which is imported from aws-sdk.
Here is my code (sqsHelper.js):
const AWS = require("aws-sdk");
export default class SqsHelper {
static SqsGetMessagesTest = () => {
const sqs = new AWS.SQS({
apiVersion: serviceConfig.sqs.api_version,
region: serviceConfig.sqs.region,
});
const queueURL =
"https://sqs.us-west-2.amazonaws.com/<1234>/<4567>";
const params = {
AttributeNames: ["SentTimestamp"],
MaxNumberOfMessages: 10,
MessageAttributeNames: ["All"],
QueueUrl: queueURL,
VisibilityTimeout: 20,
WaitTimeSeconds: 20,
};
return new Promise((resolve, reject) => {
sqs.receiveMessage(params, async (recErr, recData) => {
if (recErr) {
reject(recErr);
} else if (recData.Messages) {
console.info(`Message count: ${recData.Messages.length}`);
resolve(recData.Messages);
}
});
});
};
}
And here is the test file (sqsHelper.test.js):
import SqsHelper from "../../helpers/sqsHelper.js";
import { SQS } from "aws-sdk";
const dumyData = { Messages: [{ name: "123", lastName: "456" }] };
const sqs = new SQS();
describe("Test SQS helper", () => {
test("Recieve message", async () => {
jest.spyOn(sqs, 'receiveMessage').mockReturnValue(dumyData);
// check 1
const res1 = await sqs.receiveMessage();
console.log(`res: ${JSON.stringify(res1, null, 2)}`)
expect(res1).toEqual(dumyData);
// check 2
const res2 = await SqsHelper.SqsGetMessagesTest();
console.log(`res2: ${JSON.stringify(res2, null, 2)}`);
expect(res2).toBe(dumyData);
});
});
The problem is that in the first check (where I call the function directly from the test file) I can see that receiveMessage has been mocked and the result is as expected.
But in the second check (where the function is called from the second module, "sqsHelper.js") it looks like the mock does not work: the original receiveMessage is called, and it still asks me for credentials.
This is the error:
InvalidClientTokenId: The security token included in the request is invalid.
What am I doing wrong?
Thanks
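receiveMessage reports its result through the callback passed along with the params; it does not return a Promise. Also, the spy in your test is attached to the sqs instance created in the test file, whereas sqsHelper.js creates its own instance, so the helper never sees the mock.
Try something like this: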
const dummyData = { Messages: [{ name: "123", lastName: "456" }] };
const mockReceiveMessage = jest.fn().mockImplementation((params, callback) => callback(null, dummyData));
jest.mock("aws-sdk", () => {
const originalModule = jest.requireActual("aws-sdk");
return {
...originalModule,
SQS: function() { // needs to be function as it will be used as constructor
return {
receiveMessage: mockReceiveMessage
}
}
};
})
describe("Test SQS helper", () => {
test("Recieve message", async () => {
const res = await SqsHelper.SqsGetMessagesTest();
expect(res).toBe(dummyData.Messages);
});
test("Error response", async () => {
mockReceiveMessage.mockImplementation((params, callback) => callback("some error"));
await expect(SqsHelper.SqsGetMessagesTest()).rejects.toEqual("some error");
});
});
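An alternative that avoids module mocking altogether is to pass the SQS client into the helper, so a test can supply a plain stub. The sketch below is one way to do that; the function name receiveMessages and the injected sqs parameter are hypothetical. It also resolves with an empty array when the response carries no Messages field, whereas the helper in the question leaves its promise pending in that case.

```javascript
// Wrap the callback-style receiveMessage in a promise. The SQS client is
// passed in, so a test can hand over a stub instead of mocking 'aws-sdk'.
function receiveMessages(sqs, params) {
  return new Promise((resolve, reject) => {
    sqs.receiveMessage(params, (err, data) => {
      if (err) {
        reject(err);
      } else {
        // A poll that finds nothing has no Messages field; resolve with []
        // rather than leaving the promise pending forever.
        resolve(data.Messages || []);
      }
    });
  });
}
```

In production code the caller would pass new AWS.SQS(...) as the first argument. In a test, an object with a single receiveMessage function is sufficient.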

Using promises in a Lambda node for loop

I'm trying to pre-load a DynamoDB table with records. I have about 1500 records to do. I've tried various ways to loop through only 5 but only one gets entered each time. Here is what I have so far.
'use strict';
var AWS = require('aws-sdk'), documentClient = new AWS.DynamoDB.DocumentClient();
var params = {};
exports.handler = function(event, ctx, callback) {
Promise.all(
event.map(e => {
var params = {
Item: {
UID: ctx.awsRequestId,
AccountName: e.accountname,
AccountStatus: e.accountstatus,
MainNumber: e.mainnumber,
FaxNumber: e.faxnumber,
EmergencyNumber: e.emergencynumber,
EPAPERNO: e.epaperno,
BGB: e.bgb,
WebID: e.webid,
BoxProgram: e.boxprogram,
ReportGroup: e.reportgroup,
CreditLimit: e.creditlimit,
Customer: e.customer,
Transporter: e.transporter,
TSDF: e.tsdf,
Permit: e.permit,
Created: e.created,
Author: e.author,
Modified: e.modified,
Editor: e.editor
},
TableName: 'Accounts'
};
documentClient.put(params, function (err, data){
if(err){
console.log(err);
}else{
console.log(params);
}
});
})
).then(console.log("Done"));
};
Any help would be appreciated.
I recommend using a for loop instead of map. Promise.all issues many requests in parallel, which may put DynamoDB under stress.
Convert your handler to an async function, then simply await each DynamoDB operation until it finishes.
Ref: Convert aws-sdk callback to Promise
'use strict';
var AWS = require('aws-sdk'), documentClient = new AWS.DynamoDB.DocumentClient();
// ctx is passed in explicitly; it is not in scope outside the handler
const insertAccount = async (e, ctx) => {
const params = {
Item: {
UID: ctx.awsRequestId,
AccountName: e.accountname,
AccountStatus: e.accountstatus,
MainNumber: e.mainnumber,
FaxNumber: e.faxnumber,
EmergencyNumber: e.emergencynumber,
EPAPERNO: e.epaperno,
BGB: e.bgb,
WebID: e.webid,
BoxProgram: e.boxprogram,
ReportGroup: e.reportgroup,
CreditLimit: e.creditlimit,
Customer: e.customer,
Transporter: e.transporter,
TSDF: e.tsdf,
Permit: e.permit,
Created: e.created,
Author: e.author,
Modified: e.modified,
Editor: e.editor
},
TableName: 'Accounts'
};
return documentClient.put(params).promise(); // convert to Promise
}
exports.handler = async (event, ctx) => { // Async function
for (const item of event) {
console.log(item);
await insertAccount(item, ctx) // wait until it finishes, then go to the next item
.catch((error) => {
console.log(error);
// throw error; don't care about this error, just continue
});
}
console.log("Done");
};
Have you tried this:
'use strict';
var AWS = require('aws-sdk'), documentClient = new AWS.DynamoDB.DocumentClient();
exports.handler = async(event, ctx) => {
return Promise.all(
event.map(e => {
var params = {
Item: {
UID: ctx.awsRequestId,
AccountName: e.accountname,
AccountStatus: e.accountstatus,
MainNumber: e.mainnumber,
FaxNumber: e.faxnumber,
EmergencyNumber: e.emergencynumber,
EPAPERNO: e.epaperno,
BGB: e.bgb,
WebID: e.webid,
BoxProgram: e.boxprogram,
ReportGroup: e.reportgroup,
CreditLimit: e.creditlimit,
Customer: e.customer,
Transporter: e.transporter,
TSDF: e.tsdf,
Permit: e.permit,
Created: e.created,
Author: e.author,
Modified: e.modified,
Editor: e.editor
},
TableName: 'Accounts'
};
return documentClient.put(params).promise().then(data=>{ // return the promise so Promise.all waits for it
console.log(params);
})
})
).then(()=>{
console.log("Done")
}).catch(e=>{
console.log(e)
})
};
Let me know if you run into a problem with it; I haven't tried it.
Another idea is to use BatchWriteItem to batch the puts, with a maximum of 25 items per batch.
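The BatchWriteItem idea could look roughly like the sketch below, assuming the SDK v2 DocumentClient, whose batchWrite method takes a RequestItems map. The helper names chunk and batchPut are illustrative. Note also that ctx.awsRequestId is the same for every item within one invocation, so if UID is the table's key, each put would overwrite the previous one; the sketch assumes every item already carries its own unique key.

```javascript
// DynamoDB's BatchWriteItem accepts at most 25 put/delete requests per call.
const MAX_BATCH_SIZE = 25;

// Split an array into consecutive slices of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// `documentClient` is expected to look like AWS.DynamoDB.DocumentClient (SDK v2).
// Batches are written one after another. The function returns whatever
// DynamoDB reported as unprocessed, so the caller can decide how to retry.
async function batchPut(documentClient, tableName, items) {
  const unprocessed = [];
  for (const batch of chunk(items, MAX_BATCH_SIZE)) {
    const result = await documentClient
      .batchWrite({
        RequestItems: {
          [tableName]: batch.map((Item) => ({ PutRequest: { Item } })),
        },
      })
      .promise();
    const leftover = (result.UnprocessedItems || {})[tableName] || [];
    unprocessed.push(...leftover);
  }
  return unprocessed;
}
```

For the roughly 1500 records mentioned in the question this comes to 60 sequential calls. Retrying unprocessed items with a backoff is left out of the sketch.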

Lambda Invoke not triggering second lambda

I have gone through similar threads to fix this issue but I have had no luck. Both lambdas can be triggered independently of one another, and I am able to invoke the second Lambda through the command line, but my code does not work.
'use strict'
/* eslint max-statements: ['error', 100, { 'ignoreTopLevelFunctions': true }] */
const RespHelper = require('../../lib/response')
const { uuid } = require('uuidv4')
const AWS = require('aws-sdk')
const DB = require('./dynamo')
const respHelper = new RespHelper()
const Dynamo = new DB()
const lambda = new AWS.Lambda({
region: 'us-west-2'
})
const secondLambda = async (lambdaData) => {
var params = {
LogType: 'Tail',
FunctionName: 'second_lambda_name',
InvocationType: 'RequestResponse',
Payload: JSON.stringify(lambdaData)
}
lambda.invoke(params, function (err, data) {
if (err) {
console.log(err)
} else {
console.log(`Success: ${data.Payload}`)
}
})
}
exports.handler = async event => {
const id = uuid()
let bodyData = {
uuid: id,
user: 'owner#email.com',
processingStatus: 'IN_PROGRESS'
}
let payloadData = {
uuid: id,
user: 'owner#email.com',
processingStatus: 'COMPLETE'
}
try {
await Dynamo.writeRecordToDB(bodyData)
await secondLambda(payloadData)
return respHelper.sendResponse(200, { message: bodyData })
} catch (err) {
console.log(`Failure: ${err}`)
return respHelper.sendResponse(400, { message: 'ERROR' })
}
}
I have double checked the lambda role and it has the Invoke Lambda and Invoke Asynchronous Invoke permission on all resources. Console outputs don't give me any indication of why this is not working. Any help is appreciated.
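You're awaiting a callback when you need to await a promise. secondLambda is declared async, but the callback form of invoke gives it nothing to wait for, so the handler returns before the invocation completes. Return the promise instead: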
const secondLambda = async lambdaData =>
lambda
.invoke({
LogType: 'Tail',
FunctionName: 'second_lambda_name',
InvocationType: 'RequestResponse',
Payload: JSON.stringify(lambdaData),
})
.promise()
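One more thing to watch for with InvocationType 'RequestResponse': if the second function itself throws, the invoke promise still resolves, and the failure is reported in the FunctionError field of the response. Below is a small sketch of a wrapper that surfaces this, assuming SDK v2. The name invokeAndParse is hypothetical, and the client is injected so the wrapper can be exercised with a stub.

```javascript
// Invoke a function synchronously and return its parsed JSON payload.
// `lambda` is any object with the SDK v2 shape invoke(params).promise().
async function invokeAndParse(lambda, functionName, payload) {
  const response = await lambda
    .invoke({
      FunctionName: functionName,
      InvocationType: 'RequestResponse',
      Payload: JSON.stringify(payload),
    })
    .promise();
  // An error thrown inside the invoked function does not reject the promise;
  // it is reported through the FunctionError field instead.
  if (response.FunctionError) {
    throw new Error(`Invoked function failed: ${response.Payload}`);
  }
  return JSON.parse(response.Payload);
}
```

In the handler from the question this would replace the secondLambda call, for example await invokeAndParse(lambda, 'second_lambda_name', payloadData) inside the existing try/catch.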

Chained AWS Lambda calls in an orchestration

I am attempting to create an orchestration AWS lambda that calls two other AWS lambdas. These two other AWS lambdas can be invoked in their own right but in certain cases, there is a need for orchestration.
My orchestration lambda looks like this:
module.exports.orchestration = async (event, context, callback) => {
const lambdaAPromise = lambdaA();
const lambdaBPromise = lambdaB();
const lambdaAResponse = await lambdaAPromise;
const lambdaBResponse = await lambdaBPromise;
if (lambdaAResponse && lambdaBResponse) {
console.log(
"Both streams responsed with: ",
lambdaAResponse,
lambdaBResponse
);
var orchestrationResponse = [];
orchestrationResponse.push(lambdaAResponse);
orchestrationResponse.push(lambdaBResponse);
const orchestrationSucceeded = {
statusCode: 200,
isBase64Encoded: false,
body: orchestrationResponse
};
callback(null, orchestrationSucceeded);
} else {
console.log(
"At least one stream not responded: ",
lambdaAResponse,
lambdaBResponse
);
const orchestrationFailed = {
statusCode: 400,
isBase64Encoded: false,
body: someresponse
};
callback(null, orchestrationFailed);
}
};
function lambdaA() {
var payload = {
groupNumber: requestBody.groupNumber
};
var params = {
FunctionName: process.env.CCE_FUNCTION_NAME,
InvocationType: "RequestResponse",
LogType: "Tail",
Payload: JSON.stringify(payload)
};
return lambda
.invoke(params)
.promise()
.then(({ Payload }) => {
var payload = JSON.parse(Payload);
return payload.body;
});
}
function lambdaB() {
var payload = {
groupNumber: requestBody.groupNumber
};
var params = {
FunctionName: process.env.CCE_FUNCTION_NAME,
InvocationType: "RequestResponse",
LogType: "Tail",
Payload: JSON.stringify(payload)
};
return lambda
.invoke(params)
.promise()
.then(({ Payload }) => {
var payload = JSON.parse(Payload);
return payload.body;
});
}
Both lambdaA and lambdaB functions look like this:
module.exports.lambdaA = (event) => {
return new Promise((resolve) => {
// do something ...
resolve(true); // resolve with a boolean value
});
};
My issue was that the await did not take effect because I had an incorrect signature (I was still using a callback rather than a promise). I have updated the code snippets, which now work correctly.
Just wrapping up from the comments:
The issue was that lambdaA and lambdaB used callbacks, hence you could not await them. [From Snippet #1]
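Once both calls return promises, the orchestration itself can be condensed with Promise.all. The sketch below is one way to write it, not the original author's code: callA and callB are hypothetical functions standing in for the lambdaA() and lambdaB() wrappers, and the response shape mirrors the one in the question.

```javascript
// Run both calls concurrently and build one response from their results.
// callA and callB are functions that return promises.
async function orchestrate(callA, callB) {
  // Both calls are started before either result is awaited.
  const [a, b] = await Promise.all([callA(), callB()]);
  const ok = Boolean(a && b);
  return {
    statusCode: ok ? 200 : 400,
    isBase64Encoded: false,
    body: [a, b],
  };
}
```

Promise.all rejects as soon as either call rejects, so a caller that wants a 400 response in that case as well would wrap the call in try/catch.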

AWS Cloudwatch Metric and callbackWaitsForEmptyEventLoop do not work together?

Below is a simplification of my code.
const AWS = require('aws-sdk');
exports.handler = async (event, context, callback) => {
context.callbackWaitsForEmptyEventLoop = true;
AWS.config.update({region: 'cn-north-1'});
// Create CloudWatch service object
var cw = new AWS.CloudWatch({apiVersion: '2010-08-01'});
var params = {
MetricData: [
{
MetricName: 'PAGES_VISITED',
Dimensions: [
{
Name: 'UNIQUE_PAGES',
Value: 'URLS'
},
],
Unit: 'None',
Value: 1.0
},
],
Namespace: 'MyNewNameSpace'
};
cw.putMetricData(params, function(err, data) {
if (err) {
console.log("Error", err);
} else {
console.log("Success", JSON.stringify(data));
}
});
callback(null, "the result");
};
It seems that once I set callbackWaitsForEmptyEventLoop = false, the metric is not put to CloudWatch. I do not understand this conflict.
If you set callbackWaitsForEmptyEventLoop = false, your function execution terminates before all pending callbacks are done. In this case, the function terminates before the callback from cw.putMetricData is ever called, so the code in that callback is not executed. It is likely that the operation on CloudWatch actually happens, but you never see the callback run.
Here's your function, using the async/await model, without callbacks and without callbackWaitsForEmptyEventLoop:
const AWS = require('aws-sdk');
exports.handler = async event => {
AWS.config.update({region: 'cn-north-1'});
// Create CloudWatch service object
var cw = new AWS.CloudWatch({apiVersion: '2010-08-01'});
var params = {...};
await cw.putMetricData(params)
.promise()
.then(data => {console.log("Success", JSON.stringify(data));})
.catch(err => {console.log("Error", err);})
return "the result";
};
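The ordering problem can be reproduced without AWS at all. The sketch below uses a hypothetical stand-in, fakePutMetricData, for a callback-style SDK call and compares a handler that does not wait for it with one that awaits a promisified version. All names here are illustrative.

```javascript
const { promisify } = require('util');

// Hypothetical stand-in for a callback-style SDK call such as cw.putMetricData.
function fakePutMetricData(params, callback) {
  setImmediate(() => callback(null, { accepted: params.MetricData.length }));
}

// Fire-and-forget: the handler's promise settles before the callback has run.
async function handlerWithoutAwait(log) {
  fakePutMetricData({ MetricData: [{}] }, () => log.push('callback'));
  log.push('return');
  return 'the result';
}

// Awaiting the promisified call guarantees the work is done before returning.
async function handlerWithAwait(log) {
  await promisify(fakePutMetricData)({ MetricData: [{}] });
  log.push('callback');
  log.push('return');
  return 'the result';
}
```

At the moment the first handler's promise resolves, its log holds only the 'return' entry; the second handler's log holds 'callback' followed by 'return'.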

Resources