Promisifying multiparty - node.js

I am promisifying multiparty to use its form.parse. It works fine but form.parse does not return a promise whose then/catch value I can use.
var Promise = require('bluebird');
var multiparty = Promise.promisifyAll(require('multiparty'), {multiArgs:true})
var form = new multiparty.Form();
form.parse({}).then((data)=>{console.log(data)});

Here is my solution using the built-in Promise:
const multiparty = require('multiparty');

const promisifyUpload = (req) => new Promise((resolve, reject) => {
  const form = new multiparty.Form();
  form.parse(req, function (err, fields, files) {
    if (err) return reject(err);
    return resolve([fields, files]);
  });
});
And usage:
const [fields, files] = await promisifyUpload(req)
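For completeness on the bluebird attempt above: promisifyAll does not make form.parse itself return a promise; it adds Async-suffixed copies of the methods. A rough sketch of how the promisified call could look, assuming the same multiArgs option (not the accepted approach above, just an illustration):

var Promise = require('bluebird');
var multiparty = Promise.promisifyAll(require('multiparty'), { multiArgs: true });

var form = new multiparty.Form();
// parseAsync is the copy added by promisifyAll; with multiArgs it resolves to [fields, files]
form.parseAsync(req)
  .then(([fields, files]) => console.log(fields, files))
  .catch(err => console.error(err));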

My solution for waiting until all the parts are read:
const multipartParser = new Form();
multipartParser.on('error', error => { /* do something sensible */ });

const partLatches: Latch<void, Error>[] = [];
multipartParser.on('part', async part => {
  // Latch must be created and pushed *before* any async/await activity!
  const partLatch = createLatch();
  partLatches.push(partLatch);
  const bodyPart = await readPart(part);
  // do something with the body part
  partLatch.resolve();
});

const bodyLatch = createLatch();
multipartParser.on('close', () => {
  logger.debug('Done parsing whole body');
  bodyLatch.resolve();
});

multipartParser.parse(req);
await bodyLatch.promise;
await Promise.all(partLatches.map(latch => latch.promise));
This can be handy in cases where you want to process the parts further, for example parse and validate them, perhaps store them in a database.
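createLatch and readPart are helpers that the snippet above does not show. A latch here is essentially a deferred promise whose resolve/reject are exposed; a minimal plain-JavaScript sketch of createLatch (an assumption, not the original implementation):

// Minimal "latch": a promise plus externally callable resolve/reject.
function createLatch() {
  let resolve, reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}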

Related

Comparing two arrays from async functions?

I have read a lot of posts on how to solve this problem, but I cannot understand it.
I have a database (psql) and a CSV file, and two functions: one reads a list of domains from psql, and the other reads a different list of domains from the CSV.
Both functions are async operations that live in separate modules.
Goal: bring the results of both reader functions (which are arrays) into the same file and compare the two lists for duplicates.
Currently, I have made progress using Promise.all. However, I cannot seem to isolate the two separate arrays so I can use them.
Solution Function (not working):
This is where I am trying to read in both lists into two separate arrays.
The CSVList variable has a console.log that logs the array when CSVList.filter is not present, which leads me to believe that the array is actually there. Maybe?
const allData = async function () {
  let [result1, result2] = await Promise.all([readCSV, DBList]);

  const DBLists = result2(async (domainlist) => {
    return domainlist;
  });

  const CSVList = result1(async (csv) => {
    const csvArr = await csv.map((x) => {
      return x[0];
    });
    console.log(csvArr);
    return csvArr;
  });

  // this doesn't work. it says that filter is not a function. I understand why
  // filter is not a function. What I do not understand is why the array is not
  // being returned?
  const main = await CSVList.filter((val) => !DBLists.includes(vals));
};
allData();
psql reader:
const { pool } = require("./pgConnect");

const DBList = async (callback) => {
  await pool
    .query(
      `
      SELECT website
      FROM domains
      limit 5
      `
    )
    .then(async (data) => {
      const domainList = await data.rows.map((x) => {
        return x.website;
      });
      callback(domainList);
    });
};
csv reader:
const { parseFile } = require("@fast-csv/parse");
const path = require("path");
const fs = require("fs");

const domainPath = path.join(__dirname, "domains.csv");

// reads initial domain list and pushes the domains to an array
// on end, calls a callback function with the domain data
const readCSV = async (callback) => {
  let domainList = [];
  let csvStream = parseFile(domainPath, { headers: false })
    .on("data", (data) => {
      // push csv data to domainList array
      domainList.push(data);
      // console.log(data);
    })
    .on("end", () => {
      callback(domainList);
    });
};
I took jfriend00's advice and updated my code a bit.
The biggest issue was the readCSV function. fast-csv is event-based rather than promise-based, so I wrapped it in a new Promise manually and resolved that promise with the domain list.
updated CSV Reader:
const readCSV2 = new Promise((resolve, reject) => {
  let domainList = [];
  let csvStream = parseFile(domainPath, { headers: false })
    .on("data", (data) => {
      // push csv data to domainList array
      domainList.push(data[0]);
      // console.log(data);
    })
    .on("end", () => {
      resolve(domainList);
    });
});
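The updated solution below calls result2() and awaits it, so DBList also needs to return a promise rather than take a callback. That change isn't shown in the post; a minimal sketch of what it could look like (pg's pool.query already returns a promise when called without a callback):

const { pool } = require("./pgConnect");

// Sketch: promise-returning variant of the DB reader (no callback parameter).
const DBList = async () => {
  const data = await pool.query(`
    SELECT website
    FROM domains
    limit 5
  `);
  return data.rows.map((x) => x.website);
};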
Updated Solution for comparing the two lists
const allData = async function () {
  // get the values from the DB and CSV in one place
  let [result1, result2] = await Promise.all([readCSV2, DBList]);
  const CSVDomains = await result1;
  const DBDomains = await result2();

  // final list compares the two lists and returns the list of non-duplicated domains.
  const finalList = await CSVDomains.filter(
    (val) => !DBDomains.includes(val)
  );
  console.log("The new list is: " + finalList);
};
Quick aside: I could have accomplished the same result by using psql ON CONFLICT DO NOTHING. This would have ignored duplicates when inserting into the database, because I have a UNIQUE constraint on the domain column.
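For illustration, that alternative would look roughly like this (column name assumed from the query above; not part of the original post):

// Sketch (inside an async function): skip duplicates at insert time instead of
// filtering in JS. Relies on the UNIQUE constraint on domains.website mentioned above.
await pool.query(
  "INSERT INTO domains (website) VALUES ($1) ON CONFLICT (website) DO NOTHING",
  [domain]
);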

Promise not waiting for firebase query to complete and gets resolved too soon

I have created a promise that takes an array of Firebase keys as input and loops over them to query the Firebase Realtime Database. My issue is that even though I use async/await to wait for Firebase to return results, the promise gets resolved too soon.
function firebaseQuery(keys){
  const result = [];
  return new Promise((resolve, reject) => {
    keys.forEach(async (key) => {
      const snap = app.child(key).once('value');
      const snapJSON = await snap.then(snapshot => snapshot.toJSON());
      result.push({ key: key, post: snapJSON });
      console.log(result);
    });
    resolve(result);
  });
}
forEach does not pause for await statements, so it won't work like this (https://codeburst.io/javascript-async-await-with-foreach-b6ba62bbf404). It is better to map the keys into an array of promises and then use Promise.all() to wait until they all resolve. Something like this (just make sure to handle your errors):
async function firebaseQuery(keys){
  const result = await Promise.all(keys.map(async key => {
    const snap = app.child(key).once('value');
    const snapJSON = await snap.then(snapshot => snapshot.toJSON());
    const returnValue = { key: key, post: snapJSON };
    console.log(returnValue);
    return returnValue;
  }));
  return result;
}
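Usage from the caller's side, for illustration (hypothetical keys):

// Sketch: await the promisified query from another async function.
async function showPosts() {
  const posts = await firebaseQuery(['key1', 'key2']); // hypothetical keys
  console.log(posts);
}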

Node.js Lambda Async return Undefined

Simple call to EC2 describing security groups and returning the security group ID, using async/await. But when logging the return value, I get undefined. I fully admit I'm coming from Python and I've tried my hardest to wrap my brain around async calls. I thought I had it nailed, but I'm obviously missing something.
'use strict';
// Load Modules
const AWS = require('aws-sdk')
// Set the region
AWS.config.update({region: 'us-west-2'});
// Call AWS Resources
const ec2 = new AWS.EC2();

// Get Security Group ID From Event
const getSgIdFromEvent = async (event) => {
  var ec2params = { Filters: [{Name: 'tag:t_whitelist', Values: [event['site']]}]};
  await ec2.describeSecurityGroups(ec2params, function (err, response) {
    if (err) {return console.error(err.message)}
    else {
      var sgId = response.SecurityGroups[0].GroupId;
      return sgId;
    };
  });
};

// MAIN FUNCTION
exports.handler = (event, context) => {
  getSgIdFromEvent(event)
    .then(sgId => {console.log(sgId)});
}
"sgId" should return the security group ID. It does print out fine in the original function before the return.
Typically, if it is an async call, you want to handle it similar to this, without using a callback:
// Load Modules
const AWS = require('aws-sdk')
// Set the region
AWS.config.update({ region: 'us-west-2' });
// Call AWS Resources
const ec2 = new AWS.EC2();

// Get Security Group ID From Event
const getSgIdFromEvent = async (event) => {
  var ec2params = { Filters: [{ Name: 'tag:t_whitelist', Values: [event['site']] }] };
  try {
    const securityGroupsDesc = await ec2.describeSecurityGroups(ec2params).promise();
    const sgId = securityGroupsDesc.SecurityGroups[0].GroupId;
    // do something with the returned result
    return sgId;
  }
  catch (error) {
    console.log('handle error');
    // throw error;
  }
};

// MAIN FUNCTION
exports.handler = (event, context) => {
  getSgIdFromEvent(event)
    .then(sgId => { console.log(sgId) });
}
However, if it doesn't support async, you just use the callback to handle the returned data or error, without an async function. Reading the AWS docs, you can find that ec2.describeSecurityGroups() returns an AWS.Request, which has a promise() method that needs to be invoked to send the request and get a promise back. Note that the try/catch here is not strictly needed, but it is good to have in case an error occurs during the process.
As I said in the comment, chances are that describeSecurityGroups doesn't return a Promise. Try wrapping it explicitly in a Promise instead:
const promiseResponse = await new Promise((res, rej) => {
  ec2.describeSecurityGroups(ec2params, function (err, response) {
    if (err) {return rej(err.message)}
    else {
      var sgId = response.SecurityGroups[0].GroupId;
      res(sgId);
    };
  })
});
// promiseResponse is now equal to sgId inside the callback
return promiseResponse; // this will work because the function is async
Note: You can drop the else keyword
Here is the code that worked using async/await. Thanks to @Cristian Traina I realized ec2.describeSecurityGroups wasn't returning a promise; it was returning an AWS.Request.
// Get Security Group ID From Event
const getSgIdFromEvent = async (event) => {
  console.log('Getting Security Group ID')
  var params = { Filters: [{Name: 'tag:t_whitelist', Values: [event['site']]}]};
  const describeSG = await ec2.describeSecurityGroups(params).promise();
  return describeSG.SecurityGroups[0].GroupId;
};
// Get Ingress Rules from Security Group
const getSgIngressRules = async (sgId) => {
  console.log(`Getting SG Ingress rules for ${sgId}`)
  var params = { GroupIds: [sgId] };
  try {
    const ingressRules = await ec2.describeSecurityGroups(params).promise();
    return ingressRules;
  }
  catch (error) {
    console.log("Something went wrong getting Ingress Rules");
  }
};
// MAIN FUNCTION
exports.handler = (event, context) => {
  getSgIdFromEvent(event)
    .then(sgId => {return getSgIngressRules(sgId);})
    .then(ingressRules => {console.log(ingressRules);});
}
I submitted this as the answer since the getSgIdFromEvent function I have is only 8 lines and still uses async/await like I wanted.
What I was missing was the .promise() on the end of the function and returning that promise.
Thanks for all the responses!

Export a dynamic variable

I'm trying to export a variable in node.js like this:
let news = [];

const fetchNews = new Promise((resolve, reject) => {
  let query = 'SELECT id, name FROM news';
  mysql.query(query, [], (error, results) => {
    if (error)
      reject({error: `DB Error: ${error.code} (${error.sqlState})`})
    results = JSON.parse(JSON.stringify(results));
    news = results;
    resolve(results);
  });
});

if (!news.length)
  fetchNews
    .then(results => {news = results})
    .catch(err => {console.log('Unable to fetch news', err)});

exports.news = news;
When I use this code in some other module like this:
const news = require('./news.js').news;
console.log(news);
//returns [];
Can somebody point out my mistake in the first piece of code?
There are a couple of things that seem odd in the way you are doing this:
You have an async operation but you want just the value without actually awaiting on the operation to complete. Try something like this:
module.exports = new Promise((resolve, reject) => {
  mysql.query('SELECT id, name FROM news', (error, results) => {
    if (error)
      reject({error: `DB Error: ${error.code} (${error.sqlState})`})
    resolve(JSON.parse(JSON.stringify(results)));
  });
});
Then to get the news:
var getNewsAsync = require('./news')
getNewsAsync.then(news => console.log(news))
It would be cleaner/shorter if you actually utilize async/await with the mysql lib.
Update:
With Node 8 and above you should be able to promisify the mySQL lib methods, although there might be better npm options out there to get this to work. Here is an untested version:
const mysql = require('mysql');
const util = require('util');

const conn = mysql.createConnection({yourHOST/USER/PW/DB});
const query = util.promisify(conn.query).bind(conn);

module.exports = async () => {
  try {return await query('SELECT id, name FROM news')} finally {conn.end()}
}
To get the news (from inside an async function, since await isn't available at the top level of a CommonJS module):
var getNewsAsync = require('./news')
console.log(await getNewsAsync())

NodeJS, promises, streams - processing large CSV files

I need to build a function for processing large CSV files for use in a bluebird.map() call. Given the potential sizes of the file, I'd like to use streaming.
This function should accept a stream (a CSV file) and a function (that processes the chunks from the stream) and return a promise when the file is read to end (resolved) or errors (rejected).
So, I start with:
'use strict';

var _ = require('lodash');
var promise = require('bluebird');
var csv = require('csv');
var stream = require('stream');
var pgp = require('pg-promise')({promiseLib: promise});

api.parsers.processCsvStream = function(passedStream, processor) {
  var parser = csv.parse(passedStream, {trim: true});
  passedStream.pipe(parser);

  // use readable or data event?
  parser.on('readable', function() {
    // call processor, which may be async
    // how do I throttle the amount of promises generated
  });

  var db = pgp(api.config.mailroom.fileMakerDbConfig);

  return new Promise(function(resolve, reject) {
    parser.on('end', resolve);
    parser.on('error', reject);
  });
}
Now, I have two inter-related issues:
I need to throttle the actual amount of data being processed, so as to not create memory pressures.
The function passed as the processor param is going to often be async, such as saving the contents of the file to the db via a library that is promise-based (right now: pg-promise). As such, it will create a promise in memory and move on, repeatedly.
The pg-promise library has functions to manage this, like page(), but I'm not able to wrap my head around how to mix stream event handlers with these promise methods. Right now, I return a promise in the readable handler after each read(), which means I create a huge number of promised database operations and eventually fault out because I hit a process memory limit.
Does anyone have a working example of this that I can use as a jumping point?
UPDATE: Probably more than one way to skin the cat, but this works:
'use strict';

var _ = require('lodash');
var promise = require('bluebird');
var csv = require('csv');
var stream = require('stream');
var pgp = require('pg-promise')({promiseLib: promise});

api.parsers.processCsvStream = function(passedStream, processor) {
  // some checks trimmed out for example
  var db = pgp(api.config.mailroom.fileMakerDbConfig);

  var parser = csv.parse(passedStream, {trim: true});
  passedStream.pipe(parser);

  var readDataFromStream = function(index, data, delay) {
    var records = [];
    var record;
    do {
      record = parser.read();
      if (record != null)
        records.push(record);
    } while (record != null && (records.length < api.config.mailroom.fileParserConcurrency))
    parser.pause();
    if (records.length)
      return records;
  };

  var processData = function(index, data, delay) {
    console.log('processData(' + index + ') > data: ', data);
    parser.resume();
  };

  parser.on('readable', function() {
    db.task(function(tsk) {
      this.page(readDataFromStream, processData);
    });
  });

  return new Promise(function(resolve, reject) {
    parser.on('end', resolve);
    parser.on('error', reject);
  });
}
Anyone sees a potential problem with this approach?
You might want to look at promise-streams
var ps = require('promise-streams');

passedStream
  .pipe(csv.parse({trim: true}))
  .pipe(ps.map({concurrent: 4}, row => processRowDataWhichMightBeAsyncAndReturnPromise(row)))
  .wait().then(_ => {
    console.log("All done!");
  });
Works with backpressure and everything.
Find below a complete application that correctly executes the same kind of task as you want: It reads a file as a stream, parses it as a CSV and inserts each row into the database.
const fs = require('fs');
const promise = require('bluebird');
const csv = require('csv-parse');
const pgp = require('pg-promise')({promiseLib: promise});

const cn = "postgres://postgres:password@localhost:5432/test_db";
const rs = fs.createReadStream('primes.csv');

const db = pgp(cn);

function receiver(_, data) {
  function source(index) {
    if (index < data.length) {
      // here we insert just the first column value that contains a prime number;
      return this.none('insert into primes values($1)', data[index][0]);
    }
  }
  return this.sequence(source);
}

db.task(t => {
    return pgp.spex.stream.read.call(t, rs.pipe(csv()), receiver);
  })
  .then(data => {
    console.log('DATA:', data);
  })
  .catch(error => {
    console.log('ERROR:', error);
  });
Note the only things I changed: using library csv-parse instead of csv, as a better alternative, and adding use of method stream.read from the spex library, which properly serves a Readable stream for use with promises.
I found a slightly better way of doing the same thing, with more control. This is a minimal skeleton with precise parallelism control: with a parallel value of one, all records are processed in sequence without holding the entire file in memory; we can increase the parallel value for faster processing.
const csv = require('csv');
const csvParser = require('csv-parser')
const fs = require('fs');

const readStream = fs.createReadStream('IN');
const writeStream = fs.createWriteStream('OUT');

const transform = csv.transform({ parallel: 1 }, (record, done) => {
  asyncTask(...) // return Promise
    .then(result => {
      // ... do something when success
      return done(null, record);
    }, (err) => {
      // ... do something when error
      return done(null, record);
    });
});

readStream
  .pipe(csvParser())
  .pipe(transform)
  .pipe(csv.stringify())
  .pipe(writeStream);
This allows doing an async task for each record.
To return a promise instead, we can create the promise up front and resolve it when the stream finishes, for example in the 'end' handler:
.on('end', function() {
  // do something with csvData
  console.log(csvData);
});
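A minimal sketch of that idea, assuming csvData is the array the stream handlers fill (names here are hypothetical):

// Sketch: wrap the stream in a promise and resolve it on 'end'.
function readCsvData(stream) {
  const csvData = [];
  return new Promise((resolve, reject) => {
    stream
      .on('data', row => csvData.push(row))
      .on('error', reject)
      .on('end', () => resolve(csvData));
  });
}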
So you're saying you don't want streaming but some kind of data chunks? ;-)
Do you know https://github.com/substack/stream-handbook?
I think the simplest approach without changing your architecture would be some kind of promise pool, e.g. https://github.com/timdp/es6-promise-pool
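A rough sketch of that promise-pool idea, assuming es6-promise-pool's producer/concurrency API; the work items and async task below are stand-ins, not part of the original answer:

const PromisePool = require('es6-promise-pool');

// Stand-in work items and async task; in the real case these would be parsed CSV
// rows and the database insert.
const rows = [{ id: 1 }, { id: 2 }, { id: 3 }];
const processRow = row => new Promise(res => setTimeout(() => res(row), 10));

let i = 0;
// Producer: return the next promise to run, or null when there is no more work.
const producer = () => (i < rows.length ? processRow(rows[i++]) : null);

// Run at most 4 tasks concurrently.
const pool = new PromisePool(producer, 4);
pool.start().then(() => console.log('All rows processed'));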
