What is the ideal way to loop API requests with fetch? - node.js

I'm relatively new to working with Node.js, and I'm doing a practice project using the YouTube API to get some data on a user's videos. The YouTube API returns a list of videos with a page token, so to collect all of a user's videos you have to make several API requests, each with a different page token. When you reach the end of these requests, there is no new page token in the response, so you can move on. Doing it in a for or while loop seemed like the way to handle this, but a plain loop does not wait for the promise started in each iteration, so I had to look for an alternative.
I looked at a few previous answers to similar questions, including the ones here and here. I got the general idea of the code in the answers, but I couldn't quite figure out how to get it working fully myself. The request I am making is already chained in a .then() of a previous API call - I would like to complete the recursive fetch calls with new page tokens, and then move onto another .then(). Right now, when I run my code, it moves onto the next .then() without the requests that use the tokens being complete. Is there any way to stop this from happening? I know async/await may be a solution, but I've decided to post here just to see if there are any possible solutions without having to go down that route in the hope I learn a bit about fetch/promises in general. Any other suggestions/advice about the way the code is structured is welcome too, as I'm pretty conscious that this is probably not the best way to handle making all of these API calls.
Code :
let body = req.body
let resData = {}
let channelId = body.channelId
let videoData = []
let pageToken = ''
const fetchWithToken = (nextPageToken) => {
let uploadedVideosUrlWithToken = `https://youtube.googleapis.com/youtube/v3/playlistItems?part=ContentDetails&playlistId=${uploadedVideosPlaylistId}&pageToken=${nextPageToken}&maxResults=50&key=${apiKey}`
fetch(uploadedVideosUrlWithToken)
.then(res => res.json())
.then(uploadedVideosTokenPart => {
let {items} = uploadedVideosTokenPart
videoData.push(...items.map(v => v.contentDetails.videoId))
pageToken = (uploadedVideosTokenPart.nextPageToken) ? uploadedVideosTokenPart.nextPageToken : ''
if (pageToken) {
fetchWithToken(pageToken)
} else {
// tried to return a promise so I can chain .then() to it?
// return new Promise((resolve) => {
// return(resolve(true))
// })
}
})
}
const channelDataUrl = `https://youtube.googleapis.com/youtube/v3/channels?part=snippet%2CcontentDetails%2Cstatistics&id=${channelId}&key=${apiKey}`
// promise for channel data
// get channel data then store it in variable (resData) that will eventually be sent as a response,
// contentDetails.relatedPlaylists.uploads is the playlist ID which will be used to get individual video data.
fetch(channelDataUrl)
.then(res => res.json())
.then(channelData => {
let {snippet, contentDetails, statistics } = channelData.items[0]
resData.snippet = snippet
resData.statistics = statistics
resData.uploadedVideos = contentDetails.relatedPlaylists.uploads
return resData.uploadedVideos
})
.then(uploadedVideosPlaylistId => {
// initial call to get first set of videos + first page token
let uploadedVideosUrl = `https://youtube.googleapis.com/youtube/v3/playlistItems?part=ContentDetails&playlistId=${uploadedVideosPlaylistId}&maxResults=50&key=${apiKey}`
fetch(uploadedVideosUrl)
.then(res => res.json())
.then(uploadedVideosPart => {
let {nextPageToken, items} = uploadedVideosPart
videoData.push(...items.map(v => v.contentDetails.videoId))
// idea is to do api calls until pageToken is non existent, and add the video id's to the existing array.
fetchWithToken(nextPageToken)
})
})
.then(() => {
// can't seem to get here synchronously - code in this block will happen before all the fetchWithToken's are complete - need to figure this out
})
Thanks to anyone who takes the time out to read this.
Edit:
After some trial and error, this seemed to work, although it is a complete mess. The way I understand it, this function now recursively creates promises that resolve to true only when there is no page token in the API response, which lets me return this function's result from a .then() and move on to the next .then() in order. I am still interested in better solutions, or just suggestions to make this code more readable, as I don't think it's very good at all.
const fetchWithToken = (playlistId, nextPageToken) => {
let uploadedVideosUrlWithToken = `https://youtube.googleapis.com/youtube/v3/playlistItems?part=ContentDetails&playlistId=${playlistId}&pageToken=${nextPageToken}&maxResults=50&key=${apiKey}`
return new Promise((resolve) => {
resolve( new Promise((res) => {
fetch(uploadedVideosUrlWithToken)
.then(res => res.json())
.then(uploadedVideosTokenPart => {
let {items} = uploadedVideosTokenPart
videoData.push(...items.map(v => v.contentDetails.videoId))
pageToken = (uploadedVideosTokenPart.nextPageToken) ? uploadedVideosTokenPart.nextPageToken : ''
// tried to return a promise so I can chain .then() to it?
if (pageToken) {
res(fetchWithToken(playlistId, pageToken))
} else {
res(new Promise(r => r(true)))
}
})
}))
})
}

You would be much better off using async/await, which is syntactic sugar over promises. Promise chaining, which is what you are doing with the nested thens, can get messy and confusing.
I converted your code to use async/await so hopefully this will help you see how to solve your problem. Good luck!
Your initial code:
let { body } = req
let resData = {}
let { channelId } = body
let videoData = []
// Fetch one page of playlist items and recurse while the API returns a nextPageToken.
const fetchWithToken = async (playlistId, nextPageToken) => {
  const res = await fetch(
    `https://youtube.googleapis.com/youtube/v3/playlistItems?part=ContentDetails&playlistId=${playlistId}&pageToken=${nextPageToken}&maxResults=50&key=${apiKey}`,
  )
  // res.json() returns a promise too, so it needs its own await
  const someData = await res.json()
  let { items } = someData
  videoData.push(...items.map((v) => v.contentDetails.videoId))
  if (someData.nextPageToken) {
    await fetchWithToken(playlistId, someData.nextPageToken)
  }
}
const MainMethod = async () => {
  const channelRes = await fetch(
    `https://youtube.googleapis.com/youtube/v3/channels?part=snippet%2CcontentDetails%2Cstatistics&id=${channelId}&key=${apiKey}`,
  )
  const channelData = await channelRes.json()
  let { snippet, contentDetails, statistics } = channelData.items[0]
  resData.snippet = snippet
  resData.statistics = statistics
  resData.uploadedVideos = contentDetails.relatedPlaylists.uploads
  const uploadedVideosPlaylistId = resData.uploadedVideos
  const uploadedVideosRes = await fetch(
    `https://youtube.googleapis.com/youtube/v3/playlistItems?part=ContentDetails&playlistId=${uploadedVideosPlaylistId}&maxResults=50&key=${apiKey}`,
  )
  const uploadedVideosPart = await uploadedVideosRes.json()
  let { nextPageToken, items } = uploadedVideosPart
  videoData.push(...items.map((v) => v.contentDetails.videoId))
  // only keep paging if the first response actually carried a token
  if (nextPageToken) {
    await fetchWithToken(uploadedVideosPlaylistId, nextPageToken)
  }
  // every page has been fetched by the time execution reaches this line
}
MainMethod()
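The recursion can also be written as a plain loop, which is what the question originally reached for. This is only a minimal sketch: it assumes a hypothetical `fetchPage(pageToken)` helper (not part of the original code) that resolves to one parsed API response of the shape `{ items, nextPageToken }`.

```javascript
// Collect the video IDs from every page of a paginated endpoint.
// `fetchPage` is a hypothetical helper: it takes a page token (undefined for
// the first page) and resolves to { items, nextPageToken }.
async function collectAllVideoIds(fetchPage) {
  const videoIds = []
  let pageToken
  do {
    // await pauses the loop until this page has arrived
    const page = await fetchPage(pageToken)
    videoIds.push(...page.items.map((v) => v.contentDetails.videoId))
    pageToken = page.nextPageToken // undefined on the last page ends the loop
  } while (pageToken)
  return videoIds
}
```

With the YouTube API, `fetchPage` could build the playlistItems URL used above, append `pageToken` only when it is defined, and return `fetch(url).then((res) => res.json())`.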
Your Edit:
// An async function already returns a promise, so no manual new Promise wrapping is needed.
const fetchWithToken = async (playlistId, nextPageToken) => {
  const res = await fetch(
    `https://youtube.googleapis.com/youtube/v3/playlistItems?part=ContentDetails&playlistId=${playlistId}&pageToken=${nextPageToken}&maxResults=50&key=${apiKey}`,
  )
  // res.json() returns a promise too, so it needs its own await
  const uploadedVideosTokenPart = await res.json()
  let { items } = uploadedVideosTokenPart
  videoData.push(...items.map((v) => v.contentDetails.videoId))
  const pageToken = uploadedVideosTokenPart.nextPageToken
  if (pageToken) {
    // returning the recursive call makes the caller wait for the remaining pages
    return fetchWithToken(playlistId, pageToken)
  }
  return true
}
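Since the question asked for a solution without async/await: the first version in the question fails only because neither `fetchWithToken` nor the `.then()` callback that calls it returns the promise it creates, so the outer chain has nothing to wait for. The following is a sketch of the same recursion with plain promise chaining. `fetchPage` is a hypothetical helper (not in the original code) that resolves to a parsed `{ items, nextPageToken }` response.

```javascript
// Recursively walk the pages using only .then().
// `fetchPage` is a hypothetical helper resolving to { items, nextPageToken }.
function collectVideoIds(fetchPage, pageToken, collected = []) {
  // Returning the chain is what lets a caller wait for the whole recursion.
  return fetchPage(pageToken).then((page) => {
    const ids = collected.concat(page.items.map((v) => v.contentDetails.videoId))
    // A promise returned from .then() is adopted by the outer promise,
    // so the chain settles only after the last page has been handled.
    return page.nextPageToken
      ? collectVideoIds(fetchPage, page.nextPageToken, ids)
      : ids
  })
}
```

Inside an existing chain, a `.then()` callback would `return collectVideoIds(...)`, and the following `.then()` would receive the complete array of IDs.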

Related

how to use await instead of then in promise?

How do I correctly wait on a Promise.all(...)? After resolving a promise that wraps a set of asynchronous requests (simple SQL queries against Supabase/Postgres), I iterate over one of the results in a forEach and make a new request for each element.
I try to save what each of those requests returns in a new array. It prints fine in the console, but the response comes back empty. I understand that the response is being sent before the promises have finished resolving, but I don't understand why.
In an answer to a previous question I was told to use await before the then, but I didn't quite understand how to do it.
What am I doing wrong?
export const getReportMonthly = async(req: Request & any, res: Response, next: NextFunction) => {
try {
let usersxData: UsersxModalidadxRolxJob[] = [];
let data_monthly: HoursActivityWeeklySummary[] = [];
let attendance_schedule: AttendanceSchedule[] = [];
let time_off_request: TimeOffRequestRpc[] = [];
let configs: IndicatorConfigs[] = [];
const supabaseService = new SupabaseService();
const promises = [
supabaseService.getSummaryWeekRpcWihoutFreelancers(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
data_monthly = dataFromDB as any;
}),
supabaseService.getUsersEntity(res).then(dataFromDB => {
usersxData = dataFromDB as any;
}),
supabaseService.getAttendaceScheduleRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
attendance_schedule = dataFromDB as any;
}),
supabaseService.getTimeOffRequestRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
time_off_request = dataFromDB as any;
}),
supabaseService.getConfigs(res).then(dataFromDB => {
configs = dataFromDB;
}),
];
let attendanceInMonthly = new Array();
await Promise.all(promises).then(() => {
attendance_schedule.forEach(element => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
supabaseService.getTrackedByDateAndIDArray(start_date, end_date).then(item => {
console.log(item);
attendanceInMonthly.push(item);
});
});
})
res.json(attendanceInMonthly)
} catch (error) {
console.log(error);
res.status(500).json({
title: 'API-CIT Error',
message: 'Internal server error'
});
}
If you await a promise, you can store its resolved value in a variable and work with it normally.
So instead of your current code, you could use the following changed code:
export const getReportMonthly = async(req: Request & any, res: Response, next: NextFunction) => {
try {
let usersxData: UsersxModalidadxRolxJob[] = [];
let data_monthly: HoursActivityWeeklySummary[] = [];
let attendance_schedule: AttendanceSchedule[] = [];
let time_off_request: TimeOffRequestRpc[] = [];
let configs: IndicatorConfigs[] = [];
const supabaseService = new SupabaseService();
const promises = [
supabaseService.getSummaryWeekRpcWihoutFreelancers(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
data_monthly = dataFromDB as any;
}),
supabaseService.getUsersEntity(res).then(dataFromDB => {
usersxData = dataFromDB as any;
}),
supabaseService.getAttendaceScheduleRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
attendance_schedule = dataFromDB as any;
}),
supabaseService.getTimeOffRequestRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
time_off_request = dataFromDB as any;
}),
supabaseService.getConfigs(res).then(dataFromDB => {
configs = dataFromDB;
}),
];
const resolvedPromises = await Promise.all(promises)
const attendanceInMonthly = await Promise.all(
resolvedPromises.map(
async (element) => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
return supabaseService.getTrackedByDateAndIDArray(start_date, end_date)
}
)
)
console.log(attendanceInMonthly) // this should be your finaly resolved promise
res.json(attendanceInMonthly)
} catch (error) {
console.log(error);
res.status(500).json({
title: 'API-CIT Error',
message: 'Internal server error'
});
}
Your code should look something like this. I am not sure it solves your problem exactly, because your code has some syntax errors which you will have to fix yourself.
If I understand correctly, you launch a few requests, among which one (getAttendaceScheduleRpc, which assigns attendance_schedule) is used to launch some extra requests again, and you need to wait for all of these (including the extra requests) before returning?
In that case, the immediate issue is that you perform your extra requests in "subqueries", but you do not wait for them.
A very simple solution would be to properly separate those 2 steps, somehow like in DerHerrGammler's answer, but using attendance_schedule instead of resolvedPromises as input for the 2nd step:
let attendanceInMonthly = new Array();
await Promise.all(promises);
await Promise.all(attendance_schedule.map(async (element) => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
const item = await supabaseService.getTrackedByDateAndIDArray(start_date, end_date);
console.log(item);
attendanceInMonthly.push(item);
}));
res.json(attendanceInMonthly);
If you are really looking to fine-tune your performance, you could take advantage of the fact that your extra requests depend only on the result of one of your initial requests (getAttendaceScheduleRpc), so you could launch them as soon as the latter is fulfilled, instead of waiting for all the promises of the 1st step:
let attendance_schedule: AttendanceSchedule[] = [];
let attendanceInMonthly = new Array();
const promises = [
supabaseService.getAttendaceScheduleRpc(req.query.fecha_inicio, req.query.fecha_final).then(dataFromDB => {
attendance_schedule = dataFromDB as any;
// Immediately launch your extra (2nd step) requests, without waiting for other 1st step requests
// Make sure to return a Promise that fulfills once all the extra requests are done.
return Promise.all(attendance_schedule.map(async (element) => {
let start_date = element.date_start.toString();
let end_date = element.date_end.toString();
const item = await supabaseService.getTrackedByDateAndIDArray(start_date, end_date);
console.log(item);
attendanceInMonthly.push(item);
}));
}),
// etc. for the rest of 1st step requests
];
await Promise.all(promises);
res.json(attendanceInMonthly);
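One detail worth noting about the snippets above: pushing into a shared array stores the items in completion order, which may differ from the order of `attendance_schedule`. `Promise.all` resolves to an array in input order, so returning each item keeps the two aligned. This is a minimal sketch in plain JavaScript, where `getTracked` is a hypothetical stand-in for `supabaseService.getTrackedByDateAndIDArray`:

```javascript
// Run one lookup per schedule entry and keep the results in schedule order.
// `getTracked(start, end)` is a hypothetical stand-in for the real service call.
async function loadAttendance(schedule, getTracked) {
  return Promise.all(
    schedule.map((element) =>
      getTracked(element.date_start.toString(), element.date_end.toString())
    )
  )
}
```

In the handler this would be called after the first `Promise.all(promises)` has filled `attendance_schedule`, and its result passed to `res.json(...)`.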

Download files before build in gatsby wordpress

I have a client who needs his PDFs to be readable in the browser, without the user having to download them first. It turned out not to be possible through WordPress, so I thought I could download them in Gatsby before every build if they don't already exist. Is this possible?
I found this repo: https://github.com/jamstack-cms/jamstack-ecommerce
that shows a way to do it with this code:
function getImageKey(url) {
const split = url.split('/')
const key = split[split.length - 1]
const keyItems = key.split('?')
const imageKey = keyItems[0]
return imageKey
}
function getPathName(url, pathName = 'downloads') {
let reqPath = path.join(__dirname, '..')
let key = getImageKey(url)
key = key.replace(/%/g, "")
const rawPath = `${reqPath}/public/${pathName}/${key}`
return rawPath
}
async function downloadImage (url) {
return new Promise(async (resolve, reject) => {
const path = getPathName(url)
const writer = fs.createWriteStream(path)
const response = await axios({
url,
method: 'GET',
responseType: 'stream'
})
response.data.pipe(writer)
writer.on('finish', resolve)
writer.on('error', reject)
})
}
But it doesn't seem to work if I put it in my createPages, and I can't use it outside createPages either, because I don't have access to graphql there to query the data first.
Any idea how to do this?
The WordPress source example defines createPages as async:
exports.createPages = async ({ graphql, actions }) => {
... so you can already use await to download your file(s) right after querying the data (and before the createPage() call). It should (NOT TESTED) be as easy as:
// Check for any errors
if (result.errors) {
console.error(result.errors)
}
// Access query results via object destructuring
const { allWordpressPage, allWordpressPost } = result.data
const pageTemplate = path.resolve(`./src/templates/page.js`)
// for...of instead of forEach: await is a syntax error inside a non-async forEach callback
for (const edge of allWordpressPage.edges) {
// for one file per edge
// url taken/constructed from some edge property
await downloadImage(url);
createPage({
Of course, for multiple files you should use Promise.all to wait for all the downloads to finish before creating the page:
for (const edge of allWordpressPage.edges) {
// for multiple files per edge (page)
// url taken/constructed from some edge properties in a loop
// adapt the paths of the iterable (edge.xxx.yyy...)
// and/or the downloadImage(image) argument, e.g. 'image.someUrl'
await Promise.all(
edge.node.someImageArrayNode.map(image => downloadImage(image))
);
createPage({
If you need to pass/update image nodes (for use in components), you should be able to mutate the nodes, e.g.:
await Promise.all(
edge.node.someImageArrayNode.map(image => {
image["fullUrl"] = `/publicPath/${image.url}`;
return downloadImage(image.url); // return the Promise at the end
})
);
createPage({
path: slugify(item.name),
component: ItemView,
context: {
content: item,
title: item.name,
firstImageUrl: edge.node.someImageArrayNode[0].fullUrl,
images: edge.node.someImageArrayNode
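The question also asks to download only the files that don't already exist. That check can be sketched with Node's built-in `fs` module. `downloadIfMissing` and its `download(url, destPath)` parameter are hypothetical names introduced here for illustration; an adapted `downloadImage` that accepts a target path could be passed in.

```javascript
const fs = require('fs')
const path = require('path')

// Download a file only when it is not already on disk.
// `download(url, destPath)` is a hypothetical function that saves the URL to
// destPath and returns a promise. Resolves to true if a download happened.
async function downloadIfMissing(url, destPath, download) {
  try {
    await fs.promises.access(destPath) // resolves when the file exists
    return false // already present, nothing to do
  } catch (err) {
    if (err.code !== 'ENOENT') throw err // rethrow anything unexpected
  }
  // make sure the target folder exists before writing into it
  await fs.promises.mkdir(path.dirname(destPath), { recursive: true })
  await download(url, destPath)
  return true
}
```

Inside createPages, each `await downloadImage(url)` call could then be replaced by an awaited call to this helper.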

How much time does each function take in Promise.all? [duplicate]

This question already has answers here:
How to measure the execution time of a promise?
(5 answers)
Closed 3 years ago.
I have some URLs and I want to call all of them simultaneously. How can I find out how much time each request takes?
My code looks like this:
var urls = ["http://req0.com","http://req1.com","http://req2.com"];
Promise.all(urls.map(e => axios.post(e, {test: "test"}).catch(err => e))).then(
(values) => {
console.log(values[0]);
console.log(values[1]);
console.log(values[2]);
})
What I want is something like this:
console.log(values[0].responseTime);
console.log(values[1].responseTime);
console.log(values[2].responseTime);
Is there any way to get this time?
Pretty simple: your .map callback provides a closure that captures the start time of each axios request, so the time taken can be calculated by subtraction in the request's .then callback.
var urls = ["http://req0.com","http://req1.com","http://req2.com"];
Promise.all(urls.map(e => {
let start = Date.now();
return axios.post(e, {test:'test'})
.then(value => ( { value, t: Date.now() - start} ));
}))
.then((timedValues) => {
let times = timedValues.map(x => x.t);
let values = timedValues.map(x => x.value);
console.log(times);
console.log(values);
});
If you wish to include the timing of errors, then it's only slightly more complicated:
var urls=["http://req0.com","http://req1.com","http://req2.com"];
Promise.all(urls.map(e => {
let t = Date.now();
return axios.post(e, {test:"test"})
.then(value => ( { outcome:'success', value, t:Date.now() - t} ))
.catch(error => ( { outcome:'error', error, t:Date.now() - t} ));
}))
.then((timedOutcomes) => {
let times = timedOutcomes.map(x => x.t);
let values = timedOutcomes.filter(x => x.outcome === 'success').map(x => x.value);
let errors = timedOutcomes.filter(x => x.outcome === 'error').map(x => x.error);
console.log(times);
console.log(values);
console.log(errors);
});
You can use async/await and measure the time with console.time() and console.timeEnd().
async function getPost() {
  const url = 'https://jsonplaceholder.typicode.com/posts?_start=1';
  console.time('getPost');
  const post = await axios.get(url);
  console.timeEnd('getPost');
  return post;
}
// getPost() returns a promise, so wait for it before logging the result
getPost().then(post => console.log('post', post.data));
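The closure idea from the first answer can be pulled into a small reusable wrapper that is not tied to axios. This is a sketch only; `timed` and `timeAll` are hypothetical names introduced here, and each task is assumed to be a function that returns a promise.

```javascript
// Wrap a promise-returning task so its result carries the elapsed time.
function timed(task) {
  const start = Date.now()
  return task().then(
    (value) => ({ outcome: 'success', value, ms: Date.now() - start }),
    (error) => ({ outcome: 'error', error, ms: Date.now() - start })
  )
}

// Start all tasks at once; every entry reports its own duration,
// and a failed task does not reject the combined promise.
function timeAll(tasks) {
  return Promise.all(tasks.map((task) => timed(task)))
}
```

For the URLs in the question, the tasks would be `urls.map((u) => () => axios.post(u, { test: 'test' }))`.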

node.js Get.Request & Pagination & Async

I'm having a tremendously tough time organizing the flow here as I'm self-taught so wondering if someone might be able to assist.
var channelIds = ['XYZ','ABC','QRS']
var playlistIds = [];
var videoIds = [];
ORDER OF PROCESS
1. Get All Playlist IDs: If returning Get Request JSON contains nextPageToken run Get Request again with that page before going to (2)
2. Get All Video IDs: If returning Get Request JSON contains nextPageToken run Get Request again with that page before going to (3)
3. Aggregate into Final Array: I need put all in an array such as:
var ArrFinal = [{channelId,playlistID,videoId},{channelId,playlistID,videoId},{channelId,playlistID,videoId}];
I don't necessarily need someone to write the whole thing. I'm trying to better understand the most efficient way to know when the previous step is done, but also handle the nextPageToken iteration.
I'm not familiar with the YouTube API, but what you basically need is a get function for each endpoint. This function should also take care of the nextPageToken.
Something like this (not tested):
'use strict';
const Promise = require('bluebird');
const request = Promise.promisifyAll(require('request'));
const playlistEndpoint = '/youtube/v3/playlists';
const baseUrl = 'https://www.googleapis.com'
const channelIds = ['xy', 'ab', 'cd'];
const getPlaylist = async (channelId, pageToken, playlists) => {
const url = `${baseUrl}${playlistEndpoint}`;
const qs = { 
channelId,
maxResults: 25,
pageToken
};
try {
// json: true makes request parse the response body, so body.nextPageToken is readable
const playlistRequest = await request.getAsync({ url, qs, json: true });
const nextPageToken = playlistRequest.body.nextPageToken;
// if we already had items, combine with the new ones
const items = playlists ? playlists.concat(playlistRequest.body.items) : playlistRequest.body.items;
if (nextPageToken) {
// if token, do the same again and pass results to function
return getPlaylist(channelId, nextPageToken, items);
}
// if no token we are finished
return items;
}
catch (e) {
console.log(e.message);
}
};
const getVideos = async (playlistId, pageToken, videos) => {
// pretty much the same as above
}
async function awesome(channelIds) {
const fancyArray = [];
await Promise.map(channelIds, async (channelId) => {
const playlists = await getPlaylist(channelId);
const videos = await Promise.map(playlists, async (playlistId) => {
const videos = await getVideos(playlistId);
videos.forEach(videoId => {
fancyArray.push({ channelId, playlistId, videoId })
})
});
});
return fancyArray;
}
awesome(channelIds)
// UPDATE
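This may result in a lot of concurrent requests. You can limit them with bluebird's concurrency option; the callback has to return the promise for the limit to take effect:
Promise.map(items, item => somefunction(item), { concurrency: 5 });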
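If bluebird is not available, the same kind of limit can be approximated with native promises. This is a minimal sketch, not bluebird's implementation; `mapWithConcurrency` is a hypothetical helper name, and `limit` is assumed to be at least 1.

```javascript
// Map over items with at most `limit` calls to `fn` in flight at once.
// Results keep the order of `items`, like Promise.all.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length)
  let next = 0
  // Each worker repeatedly claims the next unprocessed index.
  async function worker() {
    while (next < items.length) {
      const index = next++
      results[index] = await fn(items[index], index)
    }
  }
  const workerCount = Math.min(limit, items.length)
  await Promise.all(Array.from({ length: workerCount }, () => worker()))
  return results
}
```

For example, `mapWithConcurrency(channelIds, 5, (channelId) => getPlaylist(channelId))` would run at most five playlist lookups at a time.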

Use restful api to invoke nightmare scraping multi site with promise wrapper

I would like to scrape multiple sites with one RESTful API, and I use Express to implement it.
But I only triggered nightmare successfully in first time with my api,
when I call again my api I can't trigger nightmare any more :(
Have any idea?
another question, in below case, I need to instantiate new Nightmare object individually , so that I can scrap three different site, have any smarter way to achieve that?
Below, getScrap is my API controller function, used as the Express Router GET callback.
You can also check it in this gist:
https://gist.github.com/sevenLee/7091f8c56ccad3c0551b512f725af7da
import Nightmare from 'nightmare';
import cheerio from 'cheerio';
let nightmare = Nightmare({show: false});
let nightmare2 = Nightmare({show: false});
let nightmare3 = Nightmare({show: false});
const urlObject = {
site1: 'http://www.site1.com',
site2: 'http://www.site2.com',
site3: 'http://www.site3.com'
};
export function getScrap(req, res){
let result = {};
result.site1 = {
topList: []
};
result.site2 = {
topList: []
};
result.site3 = {
topList: []
};
const pro1 = Promise.resolve(
nightmare
.goto(urlObject.site1)
.wait(200)
.evaluate(() => {
console.log('site1 into evaluate');
return document.querySelector('.ninenine').innerHTML;
})
.end()
)
.then((html) => {
let $ = cheerio.load(html);
let tt = $('.horizontal-li');
let sections = $(".section-board-title");
sections.each((index, elm) => {
if($(elm).text() === 'TopList'){
$(elm).next('ul').find('li').each((index, elm_li) => {
let title =$(elm_li).find('.cabinet-instruction').text();
let price =$(elm_li).find('.cabinet-middle .price').text();
let imgSrc = $(elm_li).find('.cabinet-img').attr('data-temp-src');
if(title !== '' && price !==''){
result.site1.topList.push({
title,
price,
imgSrc
});
}
});
}
});
})
.catch((err) => {
console.log('site1 scrap err:', err);
return res.status(400).send({reason:'site1 scrap err'});
});
const pro2 = Promise.resolve(
nightmare2
.goto(urlObject.site2)
.wait(200)
.evaluate(() => {
return document.querySelector('.ninenine').innerHTML;
})
.end()
)
.then((html) => {
let $ = cheerio.load(html);
let tt = $('.horizontal-li');
let sections = $(".section-board-title");
sections.each((index, elm) => {
if($(elm).text() === 'TopList'){
$(elm).next('ul').find('li').each((index, elm_li) => {
let title =$(elm_li).find('.cabinet-instruction').text();
let price =$(elm_li).find('.cabinet-middle .price').text();
let imgSrc = $(elm_li).find('.cabinet-img').attr('data-temp-src');
if(title !== '' && price !==''){
result.site2.topList.push({
title,
price,
imgSrc
});
}
});
}
});
})
.catch((err) => {
console.log('site2 scrap err:', err);
return res.status(400).send({reason:'site2 scrap err'});
});
const pro3 = Promise.resolve(
nightmare3
.goto(urlObject.site3)
.wait(200)
.evaluate(() => {
return document.querySelector('#layout').innerHTML;
})
.end()
)
.then((html) => {
let $ = cheerio.load(html);
let sections = $(".pditem");
sections.each((index, elm) => {
let title = $(elm).find('.name').text();
let price = $(elm).find('.price').find('span').eq(1).text();
let imgSrc = ['www.site3.com',$(elm).find('li').eq(1).find('img').attr('src')].join('');
result.site3.topList.push({
title,
price,
imgSrc
});
});
})
.catch((err) => {
console.log('site3 scrap err:', err);
return res.status(400).send({reason:'site3 scrap err'});
});
Promise.all([pro1, pro2, pro3])
.then(values => {
res.json(result);
})
.catch((err) => {
return res.status(500).send({reason:err.toString()});
});
}
(From my original answer at segmentio/nightmare#715.)
But I only triggered nightmare successfully in first time with my api,
when I call again my api I can't trigger nightmare any more
It looks like you're defining your instances outside of getScrap(), then calling .end() inside of getScrap(), which will end and destroy the Nightmare/Electron instances. Once they are ended, they can no longer be used. Try moving the creation of your Nightmare instances inside of the getScrap() method.
another question, in below case, I need to instantiate new Nightmare object individually , so that I can scrap three different site, have any smarter way to achieve that?
Depends on what your use case is. You could use a single Nightmare instance and iterate over the URLs, but that will take more time as Nightmare execution must be sequential. If you're curious on how to do such a thing, this article from nightmare-examples might be worth reading.
Finally, it's probably worth pointing out that based on your above code, you don't have to use cheerio. You could use .evaluate() and CSS queries to accomplish what you want, I think.
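To illustrate the first point, here is a sketch that creates a fresh instance per site on every call, so `.end()` never leaves a dead instance behind for the next request. `scrapeAll` and the `createBrowser` factory are hypothetical names; with Nightmare the factory could be `() => Nightmare({ show: false })`. Only the `goto`, `wait`, `evaluate` and `end` calls already used in the question are assumed.

```javascript
// Scrape several sites concurrently, one fresh browser instance per site.
// `sites` maps a name to { url, extract }, where `extract` is the function
// handed to evaluate(). `createBrowser` is a hypothetical factory.
function scrapeAll(sites, createBrowser) {
  const jobs = Object.entries(sites).map(([name, { url, extract }]) =>
    // Promise.resolve turns the thenable chain into a real promise.
    Promise.resolve(
      createBrowser().goto(url).wait(200).evaluate(extract).end()
    ).then((data) => [name, data])
  )
  // Rebuild an object keyed by site name once every job has finished.
  return Promise.all(jobs).then((entries) => Object.fromEntries(entries))
}
```

Inside `getScrap`, the result could be sent with `res.json(...)` in a single `.then()`, and errors handled in a single `.catch()`, which also avoids sending more than one response when several sites fail.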

Resources