Why is it not going inside the function? It only outputs "Finish" and "Start" on AWS Lambda - node.js

I'm writing an HTML-to-PDF function using phantom-html-to-pdf on AWS Lambda with Node.js, but execution never gets inside the conversion callback. The output only shows "Start" and "Finish"; "Done" never appears, which means the callback isn't being reached. What is the issue here?
var fs = require('fs')
var conversion = require("phantom-html-to-pdf")();

exports.handler = async (event) => {
    console.log("Start")
    conversion({ html: "<h1>Hello World</h1>" },
        async (err, pdf) => {
            var output = fs.createWriteStream('output.pdf')
            console.log(pdf.logs);
            console.log(pdf.numberOfPages);
            // since pdf.stream is a node.js stream you can use it
            // to save the pdf to a file (like in this example) or to
            // respond an http request.
            pdf.stream.pipe(output);
            console.log("Done")
        });
    console.log("Finish")
};

The problem is that you have marked your Lambda handler as async, which means it is expected to return a promise, and in your case you are not returning one. So you have two choices here:
1. Convert the conversion call from callback style to promise based (a sketch of this is shown after the callback example below).
2. Instead of marking the function async, accept the callback parameter and invoke it, something like this:
const fs = require("fs");
const conversion = require("phantom-html-to-pdf")();
exports.handler = (event, context, callback) => {
console.log("Start");
conversion({"html": "<h1>Hello World</h1>"},
// eslint-disable-next-line handle-callback-err
async (err, pdf) => {
const output = fs.createWriteStream("output.pdf");
console.log(pdf.logs);
console.log(pdf.numberOfPages);
// since pdf.stream is a node.js stream you can use it
// to save the pdf to a file (like in this example) or to
// respond an http request.
pdf.stream.pipe(output);
console.log("Done");
callback(null, "done");
});
};
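For the first option, here is a minimal sketch of wrapping the callback-style conversion call in a promise so the async handler can await it. The convertToPdf helper name is mine, and writing to /tmp is an assumption since that is the only writable path in a Lambda container:

const fs = require("fs");
const conversion = require("phantom-html-to-pdf")();

// Sketch of option 1: wrap the callback-style conversion in a promise
// so the async handler can await it (convertToPdf is an illustrative name).
function convertToPdf(html) {
    return new Promise((resolve, reject) => {
        conversion({ html }, (err, pdf) => {
            if (err) return reject(err);
            resolve(pdf);
        });
    });
}

exports.handler = async (event) => {
    console.log("Start");
    const pdf = await convertToPdf("<h1>Hello World</h1>");
    console.log(pdf.logs);
    console.log(pdf.numberOfPages);
    // /tmp is assumed here because it is the only writable path on Lambda
    const output = fs.createWriteStream("/tmp/output.pdf");
    // wait for the write stream to finish before ending the handler
    await new Promise((resolve, reject) => {
        pdf.stream.pipe(output).on("finish", resolve).on("error", reject);
    });
    console.log("Done");
    return "done";
};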

Related

How to await inside NodeJS request?

In Node.js I have the following code inside a function:
var request = require('request-promise');
exports.quotetest = functions.https.onRequest(async (req, res) => {
    // ... code ...
    await request(options, function (error, response, body) {
        const jsondata = JSON.parse(body).response;
        firestore.collection('Partite').doc('Serie A').update({
            [m]: {"home": jsondata[0].bookmakers[0]}
        });
    })
});
And it gives me the following error: TypeError: Cannot read properties of undefined (reading 'bookmakers'). I think that's because jsondata isn't ready yet when I update Firestore. How can I wait for jsondata to be ready? If I try const jsondata = await JSON.parse(body).response; I receive the following error: SyntaxError: await is only valid in async functions and the top level bodies of modules
If I try const jsondata = await JSON.parse(body).response; I receive the following error:
SyntaxError: await is only valid in async functions and the top level bodies of modules
This does not work because, as the error clearly states, await is only valid inside an async function, and awaiting JSON.parse(...) would only make sense if it were asynchronous, which it is not.
Coming to your problem: you need to introduce a slight delay between your function call and the moment the response is sent. Try calling the function below, which provides that delay:
async function hangOn() {
    return new Promise((resolve) => setTimeout(resolve, 100));
}
Call this function between the call that creates the JSON data and your response, like this:
await hangOn()
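As a rough sketch of where that call would sit in the handler from the question (the surrounding code is copied from the question; res.send is a placeholder for however you send your response, and the delay length is arbitrary):

exports.quotetest = functions.https.onRequest(async (req, res) => {
    // ... code ...
    await request(options, function (error, response, body) {
        const jsondata = JSON.parse(body).response;
        firestore.collection('Partite').doc('Serie A').update({
            [m]: {"home": jsondata[0].bookmakers[0]}
        });
    });
    // Slight delay before responding, as suggested above
    await hangOn();
    res.send("done"); // placeholder response
});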
I recently ran into a similar problem, which is discussed in the thread below:
Call async function via PUT API call

Why is my upload incomplete in a NodeJS express app

I need to upload a V8 heap dump into an AWS S3 bucket after it's generated, however the file that is uploaded is either 0KB or 256KB. The file on the server is over 70MB in size, so it appears the request isn't waiting until the heap dump is completely flushed to disk. I'm guessing the readable stream that is getting piped into fs.createWriteStream is happening in an async manner and the await on the function call isn't actually waiting. I'm using the v3 version of the AWS Node.js SDK. What am I doing incorrectly?
Code
async function createHeapSnapshot (fileName) {
    const snapshotStream = v8.getHeapSnapshot();
    // It's important that the filename end with `.heapsnapshot`,
    // otherwise Chrome DevTools won't open it.
    const fileStream = fs.createWriteStream(fileName);
    snapshotStream.pipe(fileStream);
}

async function pushHeapSnapshotToS3(fileName) {
    const heapDump = fs.createReadStream(fileName);
    const s3Client = new S3Client();
    const putCommand = new PutObjectCommand(
        {
            Bucket: "my-bucket",
            Key: `heapdumps/${fileName}`,
            Body: heapDump
        }
    )
    return s3Client.send(putCommand);
}

app.get('/heapdump', asyncMiddleware(async (req, res) => {
    const currentDateTime = Date.now();
    const fileName = `${currentDateTime}.heapsnapshot`;
    await createHeapSnapshot(fileName);
    await pushHeapSnapshotToS3(fileName);
    res.send({
        heapdumpFileName: `${currentDateTime}.heapsnapshot`
    });
}));
Your guess is correct. The createHeapSnapshot() returns a promise, but that promise has NO connection at all to when the stream is done. Therefore, when the caller uses await on that promise, the promise is resolved long before the stream is actually done. async functions have no magic in them to somehow know when a non-promisified asynchronous operation like .pipe() is done. So, your async function returns a promise that has no connection at all to the stream functions.
Since streams don't have very much native support for promises, you can manually promisify the completion and errors of the streams:
function createHeapSnapshot (fileName) {
    return new Promise((resolve, reject) => {
        const snapshotStream = v8.getHeapSnapshot();
        // It's important that the filename end with `.heapsnapshot`,
        // otherwise Chrome DevTools won't open it.
        const fileStream = fs.createWriteStream(fileName);
        fileStream.on('error', reject).on('finish', resolve);
        snapshotStream.on('error', reject);
        snapshotStream.pipe(fileStream);
    });
}
Alternatively, you could use the newer pipeline() function, which does support promises (built-in promise support was added in Node.js v15), replaces .pipe(), and has built-in error monitoring to reject the promise:
const { pipeline } = require('stream/promises');

function createHeapSnapshot (fileName) {
    const snapshotStream = v8.getHeapSnapshot();
    // It's important that the filename end with `.heapsnapshot`,
    // otherwise Chrome DevTools won't open it.
    return pipeline(snapshotStream, fs.createWriteStream(fileName))
}

Using async/await with util.promisify(fs.readFile)?

I'm trying to learn async/await and your feedback would help a lot.
I'm simply using fs.readFile() as a specific example of a function that has not been modernized with Promises and async/await.
(I'm aware of fs.readFileSync() but I want to learn the concepts.)
Is the pattern below an ok pattern? Are there any issues with it?
const fs = require('fs');
const util = require('util');

//promisify converts fs.readFile to a Promised version
const readFilePr = util.promisify(fs.readFile); //returns a Promise which can then be used in async await

async function getFileAsync(filename) {
    try {
        const contents = await readFilePr(filename, 'utf-8'); //put the resolved results of readFilePr into contents
        console.log('✔️ ', filename, 'is successfully read: ', contents);
    }
    catch (err) { //if readFilePr returns errors, we catch it here
        console.error('⛔ We could not read', filename)
        console.error('⛔ This is the error: ', err);
    }
}

getFileAsync('abc.txt');
Import from fs/promises instead, like this:
const { readFile } = require('fs/promises')
This version already returns the promise you want to use, so you don't need to wrap readFile in a promise manually.
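For example, here is a minimal sketch of the same getFileAsync rewritten on top of fs/promises (same behavior as the original, just without the manual promisify step):

const { readFile } = require('fs/promises');

async function getFileAsync(filename) {
    try {
        // readFile from fs/promises already returns a promise
        const contents = await readFile(filename, 'utf-8');
        console.log('✔️ ', filename, 'is successfully read: ', contents);
    } catch (err) {
        console.error('⛔ We could not read', filename);
        console.error('⛔ This is the error: ', err);
    }
}

getFileAsync('abc.txt');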
Here are some more ways of using async/await.
EDITED: as #jfriend00 pointed out in the comments, for built-in methods like fs.readFile you should of course use the standard Node.js features. So I changed the fs method in the code below to something custom, where you can define your own promise.
// Create your async function manually
const asyncFn = data => {
    // Instead of a result, return a promise
    return new Promise((resolve, reject) => {
        // Here we have two methods: resolve and reject.
        // To end the promise with success, use resolve,
        // or reject in the opposite case.
        //
        // Here we do some task that can take time.
        // For example purposes we will emulate it with
        // a setTimeout delay of 3 sec.
        setTimeout(() => {
            // After some processing time we are done
            // and can resolve the promise
            resolve(`Task completed! Result is ${data * data}`);
        }, 3000);
    });
}

// Create a function from which we will
// call our asyncFn in a chained way
const myFunct = () => {
    console.log(`myFunct: started...`);
    // We will call asyncFn with chained methods
    asyncFn(2)
        // chain error handler
        .catch(error => console.log(error))
        // chain result handler
        .then(data => console.log(`myFunct: log from chain call: ${data}`));
    // The chained call will continue execution
    // here without pausing
    console.log(`myFunct: Continue process while chain task still working.`);
}

// Create an ASYNC function to use it
// with await
const myFunct2 = async () => {
    console.log(`myFunct2: started...`);
    // Call asyncFn and wait for the result
    const data = await asyncFn(3);
    // Use your result inline after the promise has resolved
    console.log(`myFunct2: log from async call: ${data}`);
    console.log(`myFunct2: continue process after async task completed.`);
}

// Run both functions
myFunct();
myFunct2();

AWS Lambda function flow

I'm having some issues with how my functions flow in Lambda. I'm trying to grab a value stored in S3, increment it, and put it back. However, my program doesn't flow the way I think it should. I'm using async.waterfall to run the flow of my functions.
Here's my code:
let AWS = require('aws-sdk');
let async = require('async');
let bucket = "MY_BUCKET";
let key = "MY_FILE.txt";

exports.handler = async (event) => {
    let s3 = new AWS.S3();
    async.waterfall([
        download,
        increment,
        upload
    ], function (err) {
        if (err) {
            console.error(err);
        } else {
            console.log("Increment successful");
        }
        console.log("test4");
        return null;
    });
    console.log("test5");

    function download(next) {
        console.log("test");
        s3.getObject({
            Bucket: bucket,
            Key: key
        },
        next);
    }

    function increment(response, next) {
        console.log("test2");
        console.log(response.Body);
        let newID = parseInt(response.Body, 10) + 1;
        next(response.ContentType, newID);
    }

    function upload(contentType, data, next) {
        console.log("test3");
        s3.putObject({
            Bucket: bucket,
            Key: key,
            Body: data,
            ContentType: contentType
        },
        next);
    }
};
I'm only getting test and test5 on my log. I was under the impression that after the download function, increment should run if it was okay or the callback function at the end of the waterfall should run if there was an error. The program doesn't give an error on execution and it doesn't appear to go to either function.
Could someone guide me to what I'm missing in my understanding?
EDIT: So it seems my issue was related to my function declaration. The default template declared it as async (event). I thought this was odd, as usually handlers are declared as (event, context, callback). Switching to the latter (or even just (event) without the async) fixed this. It looks like my issue is with declaring the function as asynchronous. Did this block the waterfall's async calls?? Can anyone elaborate on this?
Your problem is that your handler is declared as an async function, which creates a promise for you automatically, but since you are not awaiting anything your function essentially ends synchronously.
There are a couple of ways to solve this, all of which we'll go over:
1. Do not use promises; use callbacks, as the async library is designed for.
2. Do not use the async library or callbacks and instead use async/await.
3. Mix both together: make your own promise and resolve/reject it manually.
1. Do not use promises
In this solution, you would remove the async keyword and add the callback parameter Lambda is passing to you. Simply calling it will end the Lambda; passing it an error will signal that the function failed.
// Include the callback parameter ────┐
exports.handler = (event, context, callback) => {
    const params = [
        download,
        increment,
        upload
    ]
    async.waterfall(params, (err) => {
        // To end the lambda call the callback here ──────┐
        if (err) return callback(err);  // error case   ──┤
        callback(null, { ok: true });   // success case ──┘
    });
};
2. Use async/await
The idea here is to not use callback style but to instead use promise-based async/await. If you return a promise, Lambda will use that promise to handle completion instead of the callback.
A function declared with the async keyword automatically returns a promise; this is transparent to your code.
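For example, a tiny illustration of that point:

// An async function always returns a promise, even when it
// appears to return a plain value.
async function fortyTwo() {
    return 42;
}

const result = fortyTwo();
console.log(result instanceof Promise); // true
result.then(value => console.log(value)); // logs 42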
To do this we need to modify your code to no longer use the async library and to make your other functions async as well.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();
const Bucket = "MY_BUCKET";
const Key = "MY_FILE.txt";

async function download() {
    const params = {
        Bucket,
        Key
    }
    return s3.getObject(params).promise(); // You can await or return a promise
}

function increment(response) {
    // This function is synchronous, no need for promises or callbacks
    const { ContentType: contentType, Body } = response;
    const newId = parseInt(Body, 10) + 1;
    return { contentType, newId };
}

async function upload({ contentType: ContentType, newId: Body }) {
    const params = {
        Bucket,
        Key,
        Body,
        ContentType
    };
    return s3.putObject(params).promise();
}

exports.handler = async (event) => {
    const obj = await download(); // await the promise completion
    const data = increment(obj);  // call synchronously without await
    await upload(data)
    // The handlers promise will be resolved after the above are
    // all completed, the return result will be the lambdas return value.
    return { ok: true };
};
3. Mix promises and callbacks
In this approach we are still using the async library, which is callback based, but our outer function is promise based. This is fine, but in this scenario we need to create our own promise manually and resolve or reject it in the waterfall handler.
exports.handler = async (event) => {
    // In an async function you can either use one or more `await`'s or
    // return a promise, or both.
    return new Promise((resolve, reject) => {
        const steps = [
            download,
            increment,
            upload
        ];
        async.waterfall(steps, function (err) {
            // Instead of a callback we are calling resolve or reject
            // given to us by the promise we are running in.
            if (err) return reject(err);
            resolve({ ok: true });
        });
    });
};
Misc
In addition to the main callbacks-vs-promises problem you are encountering, you have a few minor issues I noticed:
Misc 1
You should be using const rather than let most of the time. The only time you should use let is if you intend to reassign the variable, and most of the time you shouldn't do that. I would challenge you to find ways to write code that never requires let; it will help improve your code in general.
Misc 2
You have an issue in one of your waterfall steps where you are passing response.ContentType as the first argument to next. This is a bug, because async will interpret that as an error. The signature for the callback is next(err, result), so you should be doing this in your increment and upload functions:
function increment(response, next) {
    const { ContentType: contentType, Body: body } = response;
    const newId = parseInt(body, 10) + 1;
    next(null, { contentType, newId }); // pass null for err
}

function upload(result, next) {
    const { contentType, newId } = result;
    s3.putObject({
        Bucket: bucket,
        Key: key,
        Body: newId,
        ContentType: contentType
    },
    next);
}
If you don't pass null or undefined for err when calling next, async will interpret that value as an error, skip the rest of the waterfall, and go right to the completion handler, passing in that error.
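A small standalone illustration of that behavior (not part of your code; the step bodies here are made up):

const async = require('async');

async.waterfall([
    (next) => next(null, 'first result'),       // null err: continue to the next step
    (result, next) => next('something broke'),  // truthy err: skip the remaining steps
    (result, next) => next(null, 'never runs')  // this step is skipped entirely
], (err, result) => {
    console.log(err);    // 'something broke'
    console.log(result); // undefined
});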
Misc 3
What you need to know about context.callbackWaitsForEmptyEventLoop is that even if you complete the function correctly in one of the ways discussed above, your lambda may still hang open and eventually time out rather than complete successfully. Based on your code sample you probably won't need to worry about that, but the reason it can happen is that something isn't closed properly, such as a persistent connection to a database or a websocket. Setting this flag to false at the beginning of your lambda execution will cause the process to exit regardless of anything keeping the event loop alive, forcing those things to close ungracefully.
In the case below your lambda can do the work successfully and even return a success result, but it will hang open until it times out and be reported as an error. It can even be re-invoked over and over depending on how it's triggered.
exports.handler = async (event) => {
    const db = await connect()
    await db.write(data)
    // await db.close() // Whoops forgot to close my connection!
    return { ok: true }
}
In that case simply calling db.close() would solve the issue, but sometimes it's not obvious what is hanging around in the event loop and you just need a sledgehammer-type solution to close the lambda, which is what context.callbackWaitsForEmptyEventLoop = false is for!
exports.handler = async (event, context) => {
    context.callbackWaitsForEmptyEventLoop = false
    const db = await connect()
    await db.write(data)
    return { ok: true }
}
The above will complete the lambda as soon as the function returns, killing any connections or anything else still living in the event loop.
Your function terminates before the waterfall is resolved. That is, the asynchronous calls aren't executed at all. That is why you don't see any of the console.log calls you have within the waterfall functions, and only see the one that is called synchronously immediately after the call to async.waterfall.
Not sure how well async.waterfall is supported by AWS Lambda, but since promises are natively supported and perform the same functionality (with fewer loc), you could use promises instead. Your code would look something like this:
module.exports.handler = (event, context) =>
    s3.getObject({
        Bucket: bucket,
        Key: key
    }).promise()
        .then(response => ({
            Body: parseInt(response.Body, 10) + 1,
            ContentType: response.ContentType,
        }))
        .then(modifiedResponse => s3.putObject({
            Bucket: bucket,
            Key: key,
            Body: modifiedResponse.Body,
            ContentType: modifiedResponse.ContentType
        }).promise())
        .catch(err => console.error(err));

How come async/await doesn't work in my code?

How come this async/await doesn't work?
I've spent all day trying different combinations, watching videos and reading about async/await to find why this doesn't work before posting this here.
I'm trying to make a second Node.js app that will run on a different port, and my main app will call it so it scrapes some data and saves it to the db as a cache.
What it's suppose to do:
Take a keyword and send it to a method called scrapSearch; this method creates a complete URI link and sends it to the method that actually gets the webpage and returns it up to the first caller.
What is happening:
The console.log below the initial call is triggered before the results are returned.
Console output
Requesting : https://www.google.ca/?q=mykeyword
TypeError: Cannot read property 'substr' of undefined
at /DarkHawk/srv/NodesProjects/_scraper/node_scrapper.js:34:18
at <anonymous>
app.js:
'use strict';
var koa = require('koa');
var fs = require('fs');
var app = new koa();
var Router = require('koa-router');
var router = new Router();

app
    .use(router.routes())
    .use(router.allowedMethods());

app.listen(3002, 'localhost');

router.get('/scraptest', async function(ctx, next) {
    var sfn = require('./scrap-functions.js');
    var scrapFunctions = new sfn();
    var html = await scrapFunctions.scrapSearch("mykeyword");
    console.log(html.substr(0, 20));
    // Normally here I'll be calling my other method to extract content
    let json_extracted = scrapFunctions.exGg('mykeywords', html);
    // Save to db
});
scrap-functions.js:
'use strict';
var request = require('request');
var cheerio = require('cheerio');

function Scraper() {
    // I tried saving html in here but the main script
    // seems to have issues retrieving that
    this.html = '';
    this.kw = {};
    this.tr = {};
}

// Search G0000000gle
Scraper.prototype.scrapSearch = async function(keyword) {
    let url = "https://www.google.ca/?q=" + keyword;
    let html = await this.urlRequest(url);
    return html;
};

// Get a url's content
Scraper.prototype.urlRequest = async function(url) {
    console.log("Requesting : " + url);
    await request(url, await function(error, response, html) {
        if(error) console.error(error);
        return response;
    });
};

module.exports = Scraper;
I tried a lot of things but I finally gave up. I tried putting await/async before each method, but that didn't work either.
Why isn't that working?
Edit: the wrong function name is because I created 2 different projects for testing and mixed up the files while copy/pasting.
You are not returning anything from urlRequest. Because it is an async function, it will still create a promise, but it will resolve with undefined. Therefore your html is undefined as seen in the error.
The problematic part is the request function, which is a callback-style function, but you're treating it as a promise. Using await on any value that is not a promise won't do anything (technically it creates a promise that resolves directly with the value, but the resulting value remains the same). Both awaits within urlRequest are unnecessary.
request(url, function(error, response, html) {
    if(error) console.error(error);
    // This return is for the callback function, not the outer function
    return response;
});
You cannot return a value from within the callback. As it's asynchronous, your function will already have finished by the time the callback is called. With the callback style you would do the work inside the callback.
But you can turn it into a promise. You have to create a new promise and return it from urlRequest. Inside the promise you do the asynchronous work (request) and either resolve with the value (the response) or reject with the error.
Scraper.prototype.urlRequest = function(url) {
    console.log("Requesting : " + url);
    return new Promise((resolve, reject) => {
        request(url, (err, response) => {
            if (err) {
                return reject(err);
            }
            resolve(response);
        });
    });
};
When an error occurs you want to return from the callback, so the rest (the success path) is not executed. I also removed the async keyword, because the function now creates the promise manually.
If you're using Node 8, you can promisify the request function with the built-in util.promisify.
const util = require('util');
const request = require('request');

const requestPromise = util.promisify(request);

Scraper.prototype.urlRequest = function(url) {
    console.log("Requesting : " + url);
    return requestPromise(url);
};
Both versions will resolve with the response and to get the HTML you need to use response.body.
Scraper.prototype.scrapSearch = async function(keyword) {
    let url = "https://www.google.ca/?q=" + keyword;
    let response = await this.urlRequest(url);
    return response.body;
};
You still need to handle errors from the promise, either with .catch() on the promise, or using try/catch when you await it.
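For instance, a minimal sketch of the route from the question with the await wrapped in try/catch (the route body is otherwise copied from the question):

router.get('/scraptest', async function(ctx, next) {
    var sfn = require('./scrap-functions.js');
    var scrapFunctions = new sfn();
    try {
        var html = await scrapFunctions.scrapSearch("mykeyword");
        console.log(html.substr(0, 20));
    } catch (err) {
        // A rejection from urlRequest / requestPromise lands here
        console.error("Scraping failed:", err);
    }
});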
It is absolutely essential to understand promises when using async/await, because it's syntactic sugar on top of promises, to make it look more like synchronous code.
See also:
Understand promises before you start using async/await
Async functions - making promises friendly
Exploring ES6 - Promises for asynchronous programming

Resources