I'm taking my first steps with Node.js and I don't understand async code. I want to write code that downloads 1,000 files from a list of links. I used the package "node-downloader-helper", and it works for one or two files, but when the loop is bigger it downloads at most about 70 files, and all of them are downloaded incorrectly. This is my code:
let uniq = [array with 1000 links];

async.forEachOf(uniq, function (value, key, callback) {
    const dl = new DownloaderHelper(value, 'E:/XAMPP/htdocs/mydownloads/pdf/', { fileName: key + ".pdf" });
    dl.on('end', () => console.log('Download Completed'));
    dl.start();
}, function (err) {
    if (err) console.error(err.message);
});
Solution:
const save = async (link, index) => {
    const dl = new DownloaderHelper(link, linkToDirectory, { fileName: index + ".pdf" });
    // dl.start() returns a promise, so awaiting it pauses until this download finishes
    await dl.start();
};

const forLoop = async _ => {
    console.log('Start');
    for (let index = 0; index < uniq.length; index++) {
        const link = uniq[index];
        await save(link, index);
        console.log('Downloaded ' + (index + 1) + '/' + uniq.length);
    }
    console.log('End');
};

forLoop();
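If one-at-a-time turns out to be too slow for 1,000 files, a middle ground is downloading in small parallel batches. A minimal sketch, reusing uniq and linkToDirectory from the solution above; the batch size of 10 and the function name are my own assumptions:

const downloadInBatches = async (links, batchSize = 10) => {
    for (let i = 0; i < links.length; i += batchSize) {
        const batch = links.slice(i, i + batchSize);
        // start up to batchSize downloads in parallel, then wait for all of them
        await Promise.all(batch.map((link, j) => {
            const dl = new DownloaderHelper(link, linkToDirectory, { fileName: (i + j) + ".pdf" });
            return dl.start();
        }));
        console.log(`Finished ${Math.min(i + batchSize, links.length)}/${links.length}`);
    }
};

downloadInBatches(uniq);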
I have an endpoint at http://chucklets.no/getOnlineTime. When I click this I can see the JSON and it looks fine, but when using fetch it doesn't work.
API Node.js code:
app.get('/getOnlineTime', (req, res) => {
    console.log("Reading rows from the Table...");
    const arr = [];
    connection.execSql(new Request('SELECT * FROM OnlineTime', function (err, rowCount, rows) {
        if (err) {
            console.error(err);
        }
    }).on('doneInProc', function (rowCount, more, rows) {
        for (let i = 0; i < rows.length; i++) {
            const row = {};
            for (let j = 0; j < rows[i].length; j++) {
                row[rows[i][j].metadata.colName] = rows[i][j].value;
            }
            arr.push(row);
        }
        res.json(arr);
    }));
});
React code:
const [playerD, setPlayerData] = useState([]);

useEffect(() => {
    const fetchData = async () => {
        const response = await fetch('http://chucklets.no/getOnlineTime', { method: 'GET' });
        if (!response.ok) {
            throw new Error(response.status);
        }
        const data = await response.text();
        setPlayerData(data);
    };
    fetchData();
}, []);
When using fetch on https://nba-players.herokuapp.com/players-stats I get a nice JSON. Any input would be greatly appreciated.
Because it works when you open the link in the browser but doesn't work via JS fetch, I suspect there might be a CORS issue and the Node.js server isn't even responding. That's why the default value of the playerD variable in React is never updated and stays an empty array.
Try using the cors middleware:
Express CORS Middleware
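A minimal sketch of wiring it in, assuming the standard cors package (npm install cors); the route is the one from the question:

const express = require('express');
const cors = require('cors');

const app = express();
// allow cross-origin requests (pass { origin: 'http://your-frontend' } to restrict)
app.use(cors());

app.get('/getOnlineTime', (req, res) => {
    // ...same Tedious handler as above...
});

app.listen(3000);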
This is my code:
const execute = async () => {
    const dir = `c://users/xxx/desktop/test/`;
    const ftp = new PromiseFtp();
    try {
        await ftp.connect(config);
        for (let word of word_name) {
            let lists = await ftp.list(`/loc/${word}/`);
            let file = lists.slice(-2)[0]; // get most recent file
            let file_path = `${dir}${file.name}`;
            const stream = await ftp.get(`/loc/${word}/${file.name}`);
            await new Promise((resolve, reject) => {
                stream.once('end', resolve);
                stream.once('error', reject);
                stream.pipe(fs.createWriteStream(file_path));
            });
            let byte = fs.readFileSync(file_path);
            byte = iconv.decode(byte, 'big5');
            console.log(byte);
        }
    } catch (e) {
        console.error(e);
    } finally {
        return ftp.end();
    }
};
I use the promise-ftp module. Sometimes every byte value in the loop comes back correct, but sometimes one or more of them are undefined. I think the reason is the asynchronous stream. I tried stream.once('finish') and other approaches, but that didn't work either. Can anyone explain why? Thanks.
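For context, the read stream's 'end' event fires when the data has been read from the server, not when the write stream has flushed it to disk, so readFileSync can run against a half-written file. Also, 'finish' is only emitted by writable streams, which is likely why attaching it to stream did nothing. A minimal sketch that resolves on the write stream instead:

await new Promise((resolve, reject) => {
    const out = fs.createWriteStream(file_path);
    out.once('finish', resolve); // fires once all data has been flushed to the file
    out.once('error', reject);
    stream.once('error', reject);
    stream.pipe(out);
});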
I am trying to upload bulk files to S3. If I upload with a callback it works properly, but I want to push all the returned data into an array and then do something afterwards, and that doesn't work. I looked online and saw answers suggesting async/await or recursion, but it's still not working. I even tried reduce, but no luck there either.

Example of my reduce:
return files.reduce((accumulator, current) => {
    const { path, buffer } = current;
    const s3 = new AWS.S3();
    // https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
    s3.putObject(awsS3sdkParams(path, buffer), function (err, data) {
        const { protocol, host } = this.request.httpRequest.endpoint;
        data.params = this.request.params;
        data.params.url = `${protocol}//${host}/${data.params.Key}`;
        return [...accumulator, data];
    });
}, []);
Example using recursion:
const result = [];
const helper = (files) => {
    const { path, buffer } = files[0];
    const s3 = new AWS.S3();
    // https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html#putObject-property
    s3.putObject(UploadService.awsS3sdkParams(path, buffer), function (err, data) {
        const { protocol, host } = this.request.httpRequest.endpoint;
        data.params = this.request.params;
        data.params.url = `${protocol}//${host}/${data.params.Key}`;
        UtilsService.clDebug(data, 'data');
        result.push(data);
        files.shift();
        if (files.length > 0) return helper(files);
    });
};
helper(files);
return result;
Example using promises:
const result = [];
for (let { path, buffer } of files) {
    const s3 = new AWS.S3();
    s3.putObject(awsS3sdkParams(path, buffer)).promise()
        .then(file => {
            result.push(file);
        })
        .catch(err => {
            console.log(err, 'errs');
        });
}
I can pretty much understand why result is always [], but how can I make it work? The reason I cannot just use async/await is that when I tried, files were either uploaded with data so bad that I cannot even open them, or the keys would be the same...

Does anyone have any other suggestions or advice? Thanks in advance for any help.
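For what it's worth, a pattern that avoids both the callback timing and the empty-array problem is to map each file to an upload promise and await them all together. A minimal sketch, reusing the awsS3sdkParams helper from the question; note that this.request isn't available with the promise form, so here the params are kept from the helper's return value instead:

const s3 = new AWS.S3();

const uploadAll = async (files) => {
    const uploads = files.map(async ({ path, buffer }) => {
        const params = awsS3sdkParams(path, buffer);
        const data = await s3.putObject(params).promise();
        data.params = params; // keep the request params alongside the response
        return data;
    });
    // resolves only after every putObject has finished, preserving input order
    return Promise.all(uploads);
};

Usage would be const result = await uploadAll(files); so result is only read once every upload has settled.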
I'm building a simple Node.js web scraper, and I want to re-run the function, like a for loop, until pageNum equals totalNumberOfPages. I'm having a brain fart and am unable to re-run the function from inside itself, since it returns an array fragment and kills itself. Could someone help me overcome this obstacle? I'm pretty sure it's very simple.

I looked at this and this but didn't figure it out...
const cheerio = require("cheerio");
const axios = require("axios");

let pageNum = 0;
let siteUrl = "https://whatever.com?&page=" + pageNum + "&viewAll=true";

let productArray = [];
let vendor = [];
let productTitle = [];
let plantType = [];
let thcRange = [];
let cbdRange = [];
let price = [];
let totalNumberOfPages = undefined;

// called by getResults()
const fetchData = async () => {
    const result = await axios.get(siteUrl);
    return cheerio.load(result.data);
};

// this function is called from index.js
const getResults = async () => {
    // >>>>>>>>>>>>>>>>>> HOW DO I RERUN FROM HERE <<<<<<<<<<<<<<<<<<<<<<<<<<<
    const $ = await fetchData();
    // first check how many total pages there are
    totalNumberOfPages = parseInt($('.pagination li:nth-last-child(2)').text());
    // use fetched data to grab elements (and their text) and push into arrays defined above
    $('.product-tile__vendor').each((index, element) => {
        vendor.push($(element).text());
    });
    $('.product-tile__title').each((index, element) => {
        productTitle.push($(element).text());
    });
    $('.product-tile__plant-type').each((index, element) => {
        plantType.push($(element).text());
    });
    $('.product-tile__properties li:nth-child(2) p').each((index, element) => {
        thcRange.push($(element).text());
    });
    $('.product-tile__properties li:nth-child(3) p').each((index, element) => {
        cbdRange.push($(element).text());
    });
    $('.product-tile__price').each((index, element) => {
        price.push($(element).text());
    });
    // increment page number to get more products if the page count is less than total number of pages
    if (pageNum < totalNumberOfPages) {
        pageNum++;
    }
    // convert to an array so that we can sort the results
    productArray.push({
        vendors: [...vendor],
        productTitle: [...productTitle],
        plantType: [...plantType],
        thcRange: [...thcRange],
        cbdRange: [...cbdRange],
        price: [...price],
        pageNum
    });
    // >>>>>>>>>>>>>>>>>> UNTIL HERE I THINK <<<<<<<<<<<<<<<<<<<<<<<<<<<
    return productArray;
};

module.exports = getResults;
You can use recursion in your code, which means the function will call itself. So what you can do is:
const getResults = async () => {
    const $ = await fetchData();
    // first check how many total pages there are
    totalNumberOfPages = parseInt($('.pagination li:nth-last-child(2)').text());
    // use fetched data to grab elements (and their text) and push into arrays defined above
    $('.product-tile__vendor').each((index, element) => {
        vendor.push($(element).text());
    });
    $('.product-tile__title').each((index, element) => {
        productTitle.push($(element).text());
    });
    $('.product-tile__plant-type').each((index, element) => {
        plantType.push($(element).text());
    });
    $('.product-tile__properties li:nth-child(2) p').each((index, element) => {
        thcRange.push($(element).text());
    });
    $('.product-tile__properties li:nth-child(3) p').each((index, element) => {
        cbdRange.push($(element).text());
    });
    $('.product-tile__price').each((index, element) => {
        price.push($(element).text());
    });
    // increment page number to get more products if the page count is less than total number of pages
    if (pageNum < totalNumberOfPages) {
        pageNum++;
    }
    // convert to an array so that we can sort the results
    productArray.push({
        vendors: [...vendor],
        productTitle: [...productTitle],
        plantType: [...plantType],
        thcRange: [...thcRange],
        cbdRange: [...cbdRange],
        price: [...price],
        pageNum
    });
    // recurse while pages remain; rebuild the URL so fetchData requests the new
    // page, and return the recursive call so the caller gets the complete array
    if (pageNum < totalNumberOfPages) {
        siteUrl = "https://whatever.com?&page=" + pageNum + "&viewAll=true";
        return getResults();
    }
    return productArray;
};
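An alternative, if the recursion feels awkward: drive the pagination with a plain loop inside getResults. A sketch under the same assumptions (the placeholder URL and selectors from the question), scraping only the vendor field to stay short:

const getResults = async () => {
    do {
        // rebuild the URL each pass so fetchData requests the current page
        siteUrl = "https://whatever.com?&page=" + pageNum + "&viewAll=true";
        const $ = await fetchData();
        totalNumberOfPages = parseInt($('.pagination li:nth-last-child(2)').text());
        $('.product-tile__vendor').each((index, element) => {
            vendor.push($(element).text());
        });
        // (repeat for the other selectors, as in the recursive version)
        productArray.push({ vendors: [...vendor], pageNum });
        pageNum++;
    } while (pageNum < totalNumberOfPages);
    return productArray;
};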
I need to download multiple files from URLs. I have a list of them in a file. How should I do that? I already made an attempt, but it's not working. I need to wait until the last download is done before starting the next one. How can I do that?
You want to call the download function from the callback of the file before it. I threw together something; do not consider it pretty nor production ready, please ;-)
var http = require('http-get');

var files = { 'url': 'local-location', 'repeat-this': 'as often as you want' };

var MultiLoader = function (files, finalcb) {
    var load_next_file = function (files) {
        if (Object.keys(files).length === 0) {
            finalcb(null);
            return;
        }
        var nexturl = Object.keys(files)[0];
        var nextfnname = files[nexturl];
        console.log('will load ' + nexturl);
        http.get(nexturl, nextfnname, function (err, result) {
            console.log('loaded ' + nexturl);
            delete files[nexturl];
            load_next_file(files);
        });
    };
    load_next_file(JSON.parse(JSON.stringify(files)));
};

MultiLoader(files, function () { console.log('finalcb'); });
http-get is not a standard Node module; you can install it via npm install http-get.
I think this is what you're looking for.
const fs = require('fs')
const https = require('https')

const downloadFolderPath = 'downloads'

const urls = [
    'url 1',
    'url 2'
]

const downloadFile = url => {
    return new Promise((resolve, reject) => {
        const splitUrl = url.split('/')
        const filename = splitUrl[splitUrl.length - 1]
        const outputPath = `${downloadFolderPath}/${filename}`
        const file = fs.createWriteStream(outputPath)
        https.get(url, res => {
            if (res.statusCode === 200) {
                res.pipe(file).on('close', resolve)
            } else {
                reject(res.statusCode)
            }
        }).on('error', reject) // fail the promise on network errors too
    })
}

if (!fs.existsSync(downloadFolderPath)) {
    fs.mkdirSync(downloadFolderPath)
}

// forEach with an async callback fires all downloads at once; a plain for...of
// loop awaits each one, so the next download only starts when the last is done
const run = async () => {
    let downloadedFiles = 0
    for (const url of urls) {
        await downloadFile(url)
        downloadedFiles++
        console.log(`${downloadedFiles}/${urls.length} downloaded`)
    }
}

run()
You can read the file using fs (var fs = require('fs');) in Node.js:

fs.readFile('<filepath>', "utf8", function (err, data) {
    if (err) throw err;
    console.log(data);
});
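Putting the two together, a sketch that reads one URL per line from the list file (urls.txt is an assumed filename) and builds the array for the sequential loop above:

const fs = require('fs')

fs.readFile('urls.txt', 'utf8', (err, data) => {
    if (err) throw err
    // one URL per line; drop blank lines and stray whitespace
    const urls = data.split('\n').map(line => line.trim()).filter(Boolean)
    console.log(`Found ${urls.length} URLs to download`)
    // feed `urls` to the for...of download loop shown above
})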