Comparing two arrays from async functions? - node.js

I have read a lot of posts on how to solve this problem, but I cannot understand it.
I have a database (psql) and a csv. I have two functions: one reads a list of domains from psql, and the other reads a different list of domains from the csv.
Both functions are async operations that live in separate modules.
Goal: to bring the results of both reader functions (which are arrays) into the same file and compare the arrays for duplicates.
Currently, I have made progress using Promise.all. However, I cannot seem to isolate the two separate arrays so I can use them.
Solution Function (not working):
This is where I am trying to read in both lists into two separate arrays.
The CSVList variable has a console.log that logs the array when the CSVList.filter is not present. Which leads me to believe that the array is actually there? Maybe?
const allData = async function () {
let [result1, result2] = await Promise.all([readCSV, DBList]);
const DBLists = result2(async (domainlist) => {
return domainlist;
});
const CSVList = result1(async (csv) => {
const csvArr = await csv.map((x) => {
return x[0];
});
console.log(csvArr);
return csvArr;
});
const main = await CSVList.filter((val) => !DBLists.includes(vals)); // this doesn't work. it says that filter is not a function. I understand why filter is not a function. What I do not understand is why the array is not being returned?
};
allData();
psql reader:
const { pool } = require("./pgConnect");
//
const DBList = async (callback) => {
await pool
.query(
`
SELECT website
FROM domains
limit 5
`
)
.then(async (data) => {
const domainList = await data.rows.map((x) => {
return x.website;
});
callback(domainList);
});
};
csv reader:
const { parseFile } = require("@fast-csv/parse");
const path = require("path");
const fs = require("fs");
const domainPath = path.join(__dirname, "domains.csv");
//reads initial domain list and pushes the domains to an array
//on end, calls a callback function with the domain data
const readCSV = async (callback) => {
let domainList = [];
let csvStream = parseFile(domainPath, { headers: false })
.on("data", (data) => {
//push csv data to domainList array
domainList.push(data);
// console.log(data);
})
.on("end", () => {
callback(domainList);
});
};

I took jFriend00's advice and updated my code a bit.
The biggest issue was the readCSV function. fast-csv is event-based and does not return a promise, so I wrapped it in a new Promise manually and resolved that promise with the domain list.
updated CSV Reader:
const readCSV2 = new Promise((resolve, reject) => {
let domainList = [];
let csvStream = parseFile(domainPath, { headers: false })
.on("data", (data) => {
//push csv data to domainList array
domainList.push(data[0]);
// console.log(data);
})
.on("end", () => {
resolve(domainList);
});
});
Updated Solution for comparing the two lists
const allData = async function () {
// get the values from the DB and CSV in one place
let [result1, result2] = await Promise.all([readCSV2, DBList]);
const CSVDomains = await result1;
const DBDomains = await result2();
//final list compares the two lists and returns the list of non duplicated domains.
const finalList = await CSVDomains.filter(
(val) => !DBDomains.includes(val)
);
console.log("The new list is: " + finalList);
};
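For comparison, here is a minimal sketch of the same flow with both readers returning promises directly, so no callbacks are needed. The two reader functions are hypothetical stand-ins: they resolve hard-coded lists after a timer instead of querying PostgreSQL or parsing a CSV file, and the domain values are made up for illustration.

```javascript
// Hypothetical stand-in for the database reader: resolves with a list of domains.
const readDbDomains = () =>
  new Promise((resolve) => {
    setTimeout(() => resolve(["a.com", "b.com", "c.com"]), 20);
  });

// Hypothetical stand-in for the CSV reader: resolves with the first column of each row.
const readCsvDomains = () =>
  new Promise((resolve) => {
    setTimeout(() => resolve(["b.com", "d.com", "c.com", "e.com"]), 10);
  });

// Run both readers concurrently, then keep the CSV domains not already in the DB.
const findNewDomains = async () => {
  const [csvDomains, dbDomains] = await Promise.all([
    readCsvDomains(), // note the calls: Promise.all needs promises, not functions
    readDbDomains(),
  ]);
  const known = new Set(dbDomains); // Set lookup avoids rescanning the array
  return csvDomains.filter((domain) => !known.has(domain));
};

findNewDomains().then((list) => console.log("New domains:", list));
```

Note that filter is synchronous, so it does not need an await once both arrays are available.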
Quick aside: I could have accomplished the same result with PostgreSQL's ON CONFLICT DO NOTHING clause. It would have skipped duplicates when inserting into the database, because I have a UNIQUE constraint on the domain column.

Related

Store data returned from database into variable in nodejs express

I want to fetch subject Contents based on subject code and then inside each subject content, I want to fetch its sub contents as well and then store the main contents and sub contents in one array as object and return the data to react.
Please help me with this.
Node express API code
app.post('/api/teacher/courses/maintopics', (req, res) =>
{
let SubCode = req.body.data;
let teacher = new Teacher();
teacher.getCoursesMainContent(SubCode).then(result =>
{
let Contiants = [];
result.forEach(element =>
{
SubContent = [];
element.forEach(e =>
{
let contentCode = e.ContentCode;
teacher.getCoursesSubContent(contentCode).then()
.then(res => {
SubContent.push(res)
// here I want to store the sub content inside SubContent array
});
})
});
res.json(Contiants);
});
});
The problem is that when res.json(Contiants) is executed, the promises returned by getCoursesSubContent have not resolved yet.
You need to use await, as jax-p said.
Also note that forEach does not wait for promises: you can use await inside a forEach callback, but the loop will not wait for it (see: Using async/await with a forEach loop). Use for...of instead.
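app.post('/api/teacher/courses/maintopics', async (req, res, next) =>
{
try {
let SubCode = req.body.data;
let teacher = new Teacher();
const results = await teacher.getCoursesMainContent(SubCode);
let Contiants = [];
for (let element of results) {
// declared per element so it is not an implicit global
let SubContent = [];
for (let e of element) {
let contentCode = e.ContentCode;
// named subContent to avoid shadowing the Express `res`
let subContent = await teacher.getCoursesSubContent(contentCode);
SubContent.push(subContent)
}
// assumed result shape; without a push, Contiants would stay empty
Contiants.push({ content: element, subContent: SubContent });
}
res.json(Contiants);
} catch(err) {
next(err);
}
});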
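The loop above awaits each sub-content request one at a time. If the requests are independent, they can also run concurrently with Promise.all and map. The sketch below assumes a hypothetical fakeTeacher object that mimics the two methods from the question with in-memory data, and it assumes getCoursesMainContent resolves to a flat array of rows. The { content, subContent } result shape is also an assumption.

```javascript
// Hypothetical in-memory stand-in for the Teacher class from the question.
const fakeTeacher = {
  getCoursesMainContent: async (subCode) => [
    { ContentCode: `${subCode}-1` },
    { ContentCode: `${subCode}-2` },
  ],
  getCoursesSubContent: async (contentCode) => [
    `${contentCode}-a`,
    `${contentCode}-b`,
  ],
};

// Fetch the main content rows, then fetch all sub contents concurrently.
async function loadContents(teacher, subCode) {
  const mainContents = await teacher.getCoursesMainContent(subCode);
  return Promise.all(
    mainContents.map(async (content) => ({
      content,
      subContent: await teacher.getCoursesSubContent(content.ContentCode),
    }))
  );
}

loadContents(fakeTeacher, "CS101").then((contents) =>
  console.log(JSON.stringify(contents, null, 2))
);
```

Promise.all preserves input order, so the results line up with the main content rows even if the requests finish out of order.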

Can't add key from function to dictionary

My code:
var price = {};
function getPrice(price) {
const https = require('https');
var item = ('M4A1-S | Decimator (Field-Tested)')
var body = '';
var price = {};
https.get('https://steamcommunity.com/market/priceoverview/?appid=730&market_hash_name=' + item, res => {
res.on('data', data => {
body += data;
})
res.on('end', () => price ['value'] = parseFloat(JSON.parse(body).median_price.substr(1))); //doesnt add to dict
}).on('error', error => console.error(error.message));
}
price['test'] = "123" //adds to dict fine
getPrice(price)
console.log(price);
Output:
{ test: '123' }
as you can see, the "test: 123" gets added, but the "value: xxx" from the function doesn't. Why is that?
There are two main problems here:
You're redeclaring the variable inside your function, so you're modifying a separate, new variable, and the higher-scoped variable never gets your .value property.
You're assigning the property inside an asynchronous callback that runs some time after your function has returned, so the console.log() runs before the value has been obtained. This is a classic issue with returning asynchronously obtained data from a function in JavaScript. You will need to communicate that data back with a callback or with a promise.
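Both problems can be reproduced without any network access. In this sketch, setTimeout stands in for the HTTP request, the value 5.2 is a placeholder, and the function names are made up for illustration.

```javascript
const prices = {};

// Both problems together: a shadowing local variable, filled in too late.
function getPriceBroken() {
  const prices = {}; // shadows the outer `prices`
  setTimeout(() => {
    prices.value = 5.2; // writes to the inner object only
  }, 10);
}

// Fix: return a promise that resolves once the value is available.
function getPriceFixed() {
  return new Promise((resolve) => {
    setTimeout(() => resolve(5.2), 10);
  });
}

getPriceBroken();
console.log(prices); // {} because nothing was written to the outer object

getPriceFixed().then((value) => {
  prices.value = value;
  console.log(prices); // { value: 5.2 }
});
```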
I would also suggest that you use a higher level library that supports promises for getting your http request and parsing the results. There are many that already support promises, already read the whole response, already offer JSON parsing built-in, do appropriate error detection and propagation, etc... You don't need to write all that yourself. My favorite library for this is got(), but you can see a list of many good choices here. I would strongly advise that you use promises to communicate back your asynchronous result.
My suggestion for fixing this would be this code:
const got = require('got');
async function getPrice() {
const item = 'M4A1-S | Decimator (Field-Tested)';
const url = 'https://steamcommunity.com/market/priceoverview/?appid=730&market_hash_name=' + item;
const body = await got(url).json();
if (!body.success || !body.median_price) {
throw new Error('Could not obtain price');
}
return parseFloat(body.median_price.substr(1));
}
getPrice().then(value => {
// use value here
console.log(value);
}).catch(err => {
console.log(err);
});
When I run this, it logs 5.2.
You're calling console.log(price) before .value is set; .value isn't set until the asynchronous callback fires.
You are declaring price again inside the function and also not waiting for the asynchronous task to finish.
const https = require("https");
const getPrice = () =>
new Promise((resolve, reject) => {
const item = "M4A1-S | Decimator (Field-Tested)";
let body = "";
return https
.get(
`https://steamcommunity.com/market/priceoverview/?appid=730&market_hash_name=${item}`,
res => {
res.on("data", data => {
body += data;
});
res.on("end", () =>
resolve(
parseFloat(JSON.parse(body).median_price.substr(1))
)
);
}
)
.on("error", error => reject(error));
});
const main = async () => {
try{
const price = await getPrice();
//use the price value to do something
}catch(error){
console.error(error);
}
};
main();

Why would promisify cause a loop to take massively longer?

I have an app which has to extract color info from a video and it does this by analyzing each frame. First I extract the frames and then load an array of their locations in memory. As you might imagine, for even a small video it can be in the thousands.
The function I use to extract each frame's color info returns a promise, so I opted to batch an array of promises with Promise.all.
With each file's absolute path, I read the file with fs and then pass it along to be processed. I've done this with many standalone images and know the process only takes about a second, but suddenly it was taking almost 20 min to process 1 image. I finally figured out that using promisify on fs.readFile was what caused the bottleneck. What I don't understand is why?
In the first one fs.readFile is transformed inside of the promise that's returned while in the second one fs.readFile is just used as it normally would be and I wait for resolve to be called. I don't mind using the non-promise one, I'm just curious why this would cause such a slow down?
The second I stopped using promisify the app sped back up to 1 frame / second
The slow code:
async analyzeVideo(){
await this._saveVideo();
await this._extractFrames();
await this._removeVideo();
const colorPromises = this.frameExtractor.frames.map(file => {
return new Promise(resolve => {
//transform image into data
const readFile = promisify(fs.readFile);
readFile(file)
.then(data => {
const analyzer = new ColorAnalyzer(data);
analyzer.init()
.then(colors => {
resolve(colors)
})
})
.catch((e)=> console.log(e));
})
});
const colors = await runAllQueries(colorPromises);
await this._removeFrames();
this.colors = colors;
async function runAllQueries(promises) {
const batches = _.chunk(promises, 50);
const results = [];
while (batches.length) {
const batch = batches.shift();
const result = await Promise.all(batch)
.catch(e=>console.log(e));
results.push(result)
}
return _.flatten(results);
}
}
The fast code:
async analyzeVideo(){
await this._saveVideo();
await this._extractFrames();
await this._removeVideo();
const colorPromises = this.frameExtractor.frames.map(file => {
return new Promise(resolve => {
//transform image into data
fs.readFile(file, (err, data) => {
const analyzer = new ColorAnalyzer(data);
analyzer.init()
.then(colors => {
resolve(colors)
})
});
})
});
const colors = await runAllQueries(colorPromises);
await this._removeFrames();
this.colors = colors;
async function runAllQueries(promises) {
const batches = _.chunk(promises, 50);
const results = [];
while (batches.length) {
const batch = batches.shift();
const result = await Promise.all(batch)
.catch(e=>console.log(e));
results.push(result)
}
return _.flatten(results);
}
}
You don't need to call promisify in each loop iteration; do it once at the top of the module.
Most likely the issue is caused by promises that never settle. You are not handling errors correctly, so Promise.all may never finish if an error is thrown.
Instead of only logging the error in .catch, you have to reject as well, or at least resolve if you don't care about the errors. Also, errors from analyzer.init() are not being caught (if that function can reject).
const readFile = promisify(fs.readFile);
// ...
const colorPromises = this.frameExtractor.frames.map(file => {
return new Promise((resolve, reject) => {
//transform image into data
// const readFile = promisify(fs.readFile);
readFile(file)
.then(data => {
const analyzer = new ColorAnalyzer(data);
return analyzer.init()
})
.then(resolve) // colors
.catch((e)=> {
reject(e);
console.log(e)
});
})
})
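Aside from that, runAllQueries is not doing what you think it's doing. A promise starts its work as soon as it is created, so by the time you chunk the array all the reads are already running; the batching only controls how the results are awaited.
I recommend you use p-limit instead: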
const pLimit = require('p-limit');
const limit = pLimit(50);
/* ... */
const colorPromises = this.frameExtractor.frames.map(file => {
return limit(() => {
return readFile(file)
.then(data => {
const analyzer = new ColorAnalyzer(data);
return analyzer.init()
}) // resolves with colors; there is no resolve() here because nothing is wrapped in new Promise
})
})
const colors = await Promise.all(colorPromises);
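To illustrate why chunking an array of promises does not limit concurrency, while chunking an array of functions does, here is a dependency-free sketch. The fakeAnalyze task, the counters, and the batch size are all hypothetical; a timer stands in for reading and analyzing a frame.

```javascript
let running = 0;
let peak = 0;

// Hypothetical task that tracks how many tasks are in flight at once.
const fakeAnalyze = (id) =>
  new Promise((resolve) => {
    running += 1;
    peak = Math.max(peak, running);
    setTimeout(() => {
      running -= 1;
      resolve(id);
    }, 5);
  });

// Run task *factories* in batches; work starts only when a factory is called.
async function runInBatches(factories, batchSize) {
  const results = [];
  for (let i = 0; i < factories.length; i += batchSize) {
    const batch = factories.slice(i, i + batchSize).map((start) => start());
    results.push(...(await Promise.all(batch)));
  }
  return results;
}

async function demo() {
  const ids = Array.from({ length: 20 }, (_, i) => i);

  // Eager: every promise is created (and started) by map itself.
  peak = 0;
  await Promise.all(ids.map((id) => fakeAnalyze(id)));
  const eagerPeak = peak;

  // Lazy: map only builds functions, so at most 5 tasks run at a time.
  peak = 0;
  const results = await runInBatches(ids.map((id) => () => fakeAnalyze(id)), 5);
  const batchedPeak = peak;

  return { eagerPeak, batchedPeak, results };
}

const demoResult = demo();
demoResult.then(({ eagerPeak, batchedPeak }) =>
  console.log({ eagerPeak, batchedPeak }) // { eagerPeak: 20, batchedPeak: 5 }
);
```

Unlike p-limit, this batch approach waits for the slowest task in each batch before starting the next one, so it is simpler but can leave slots idle.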
Furthermore, if you're executing 50 reads at a time, you should increase the value of UV_THREADPOOL_SIZE which defaults to 4.
At your entry point, before any require:
process.env.UV_THREADPOOL_SIZE = 64 // maximum is 128 on older libuv versions, 1024 since libuv 1.30
Or call the script as: UV_THREADPOOL_SIZE=64 node index.js

How to get data using async and await in node js

I have two collections, products (152 documents) and categories (10 documents). I connect to the DB to retrieve the data: first I fetch the products collection, then the categories collection, using async/await. However, the categories data is printed first and the product data second. How can I fix this?
product.js
async function product_data(collection) {
let mongodb = await MongoDB.connect(collection)
let result = await mongodb.findAll()
return result
}
module.exports.product_data = product_data
category.js
async function category_data(collection) {
let mongodb = await MongoDB.connect(collection)
let result = await mongodb.findAll()
return result
}
module.exports.category_data = category_data
app.js
const {product_data} = require("./product")
const {category_data} = require("./category")
async function updatedb() {
let product_data = await product_data("ecomm_product")
console.log(product_data)
let category_data = await category_data("ecomm_category")
console.log(category_data)
}
Actual result
It prints category_data first, then product_data.
Expected result
It prints product_data first, then category_data.
I can't reproduce this at all, not even with explicit delays in which products take longer to resolve than categories. Collapsing your code to a single file, and using proper JS conventions for naming and case:
function getProductData(collection) {
return new Promise(resolve => {
setTimeout(() => resolve('product'), 2000);
});
}
function getCategoryData(collection) {
return new Promise(resolve => {
setTimeout(() => resolve('category'), 1000);
});
}
async function updatedb() {
let product_data = await getProductData("ecomm_product")
console.log(product_data)
let category_data = await getCategoryData("ecomm_category")
console.log(category_data)
}
updatedb();
This simply yields the following output, every time:
$ node test.js
product
category
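One detail of the original app.js may be worth checking: inside updatedb, the line `let product_data = await product_data(...)` declares a local variable with the same name as the imported function. Within that function body the name refers to the new, not yet initialized variable, so the call throws a ReferenceError instead of running the query. The sketch below reproduces this with a hypothetical stub in place of the MongoDB call.

```javascript
// Hypothetical stub standing in for the imported product_data function.
const product_data = async (collection) => [`row from ${collection}`];

async function shadowed() {
  // The local `product_data` shadows the outer one and is still in its
  // temporal dead zone when the right-hand side runs.
  let product_data = await product_data("ecomm_product");
  return product_data;
}

async function renamed() {
  // A distinct local name avoids the clash.
  const products = await product_data("ecomm_product");
  return products;
}

shadowed().catch((err) => console.log(err.name)); // ReferenceError
renamed().then((rows) => console.log(rows));
```

Renaming either the function or the local variable, as the answer above does, removes the clash.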

NodeJS - read CSV file to array returns []

I'm trying to use the promised-csv module (https://www.npmjs.com/package/promised-csv) to read the rows of a CSV file to an array of strings for a unit test:
const inputFile = '.\\test\\example_result.csv';
const CsvReader = require('promised-csv');
function readCSV(inputFile){
var reader = new CsvReader();
var output = [];
reader.on('row', function (data) {
//console.log(data);
output.push(data[0]);
});
reader.read(inputFile, output);
return output;
}
I would like to call this function later in a unit test.
it("Should store the elements of the array", async () => {
var resultSet = readCSV(inputFile);
console.log(resultSet);
});
However, resultSet yields an empty array. I am also open to use any other modules, as long as I can get an array of strings as a result.
The code should look something like this, according to the docs.
const inputFile = './test/example_result.csv';
const CsvReader = require('promised-csv');
function readCSV(inputFile) {
return new Promise((resolve, reject) => {
var reader = new CsvReader();
var output = [];
reader.on('row', data => {
// data is an array of data. You should
// concatenate it to the data set to compile it.
output = output.concat(data);
});
reader.on('done', () => {
// output will be the compiled data set.
resolve(output);
});
reader.on('error', err => reject(err));
reader.read(inputFile);
});
}
it("Should store the elements of the array", async () => {
var resultSet = await readCSV(inputFile);
console.log(resultSet);
});
readCSV() returns a Promise. There are two ways that you can access the data it returns upon completion.
As Roland Starke suggests, use async and await.
var resultSet = await readCSV(inputFile);
This will wait for the Promise to resolve before returning a value.
More here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/await
Use Promise.prototype.then() - this is similar to async/await, but can also be chained with other promises and Promise.prototype.catch().
The most important thing to remember is that the function passed to .then() will not be executed until the promise returned by readCSV() has resolved.
readCSV(inputFile).then((data) => { console.log(data); }).catch((err) => { console.log(err); })
More here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/then
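The same wrapping pattern applies to any event-based reader. As a library-independent sketch, the example below uses a plain Node.js EventEmitter as a hypothetical stand-in for the CSV reader (the fakeReader function and its rows are invented for illustration) and Node's built-in events.once helper to wait for the 'done' event.

```javascript
const { EventEmitter, once } = require("events");

// Hypothetical reader: emits two 'row' events, then 'done', asynchronously.
function fakeReader() {
  const reader = new EventEmitter();
  setTimeout(() => {
    reader.emit("row", ["alpha"]);
    reader.emit("row", ["beta"]);
    reader.emit("done");
  }, 10);
  return reader;
}

async function readRows() {
  const reader = fakeReader();
  const output = [];
  reader.on("row", (data) => output.push(data[0]));
  await once(reader, "done"); // rejects if the emitter emits 'error' first
  return output;
}

readRows().then((rows) => console.log(rows)); // [ 'alpha', 'beta' ]
```

Returning the array before the 'done' event fires is what produced the empty result in the question; awaiting the event closes that gap.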
