Is there a way to change browser in puppeteer?

I want to switch browsers within a single run.
Is there any way to do this?
For example:
At first, launch the Chrome browser.
When a variable becomes a multiple of 2, switch to Edge;
when it becomes a multiple of 3, switch to Firefox.
I tried this:
const puppeteer = require('puppeteer');

(async () => {
  var browser = await puppeteer.launch({
    executablePath: "chrome path",
  });
  var page = await browser.newPage();
  for (let i = 0; i < 10; i++) {

    // change browser to edge
    if (i % 2 == 0) {
      await browser.close();
      browser = await puppeteer.launch({
        executablePath: "edge path",
      });
      page = await browser.newPage();

    // change browser to firefox
    } else if (i % 3 == 0) {
      await browser.close();
      browser = await puppeteer.launch({
        product: 'firefox',
      });
      page = await browser.newPage();
    }
  }
})();
Error Message
Protocol error (Page.navigate): Session closed. Most likely the page has been closed.

Please use
const puppeteer = require('puppeteer-core');
and then change executablePath to point at the browser you want for each launch.
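As a rough sketch with puppeteer-core (the executable paths below are placeholders, not from the original question; point them at the browsers installed on your machine, and note that launching Firefox may also need product: 'firefox' depending on your Puppeteer version):

const puppeteer = require('puppeteer-core');

// Placeholder paths -- replace with the real locations of Chrome, Edge and Firefox.
const launchOptions = {
  chrome: { executablePath: '/path/to/chrome' },
  edge: { executablePath: '/path/to/msedge' },
  firefox: { executablePath: '/path/to/firefox', product: 'firefox' },
};

async function switchTo(name, currentBrowser) {
  // Close the previous browser (and all of its pages) before launching the next one.
  if (currentBrowser) await currentBrowser.close();
  const browser = await puppeteer.launch(launchOptions[name]);
  const page = await browser.newPage();
  return { browser, page };
}

(async () => {
  let { browser, page } = await switchTo('chrome', null);
  for (let i = 1; i <= 10; i++) {
    if (i % 2 === 0) {
      ({ browser, page } = await switchTo('edge', browser));
    } else if (i % 3 === 0) {
      ({ browser, page } = await switchTo('firefox', browser));
    }
    // Always use the page created after the most recent launch;
    // reusing a page from a closed browser is what produces "Session closed".
    await page.goto('https://example.com');
  }
  await browser.close();
})();

The key point is that every page dies with browser.close(), so you must create a fresh page after each launch and never reuse the old one.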

Related

How to catch a tab drop in puppeteer-extra and refresh the page?

I have a small application built on puppeteer-extra; it works through a proxy server. Sometimes the proxy server crashes and I get an error on the page.
If you click the "reload" button, the page refreshes and everything is fine again.
But how can I do that programmatically?
How do I catch such a tab drop?
require('dotenv').config();
const puppeteer = require('puppeteer-extra');
const PuppeteerExtraPluginProxy = require('puppeteer-extra-plugin-proxy2');
const pluginStealth = require('puppeteer-extra-plugin-stealth');
const sleep = require('./src/ToolsSleep');

async function main() {
  puppeteer.use(PuppeteerExtraPluginProxy({
    proxy: 'socks://username:password#gateproxy.com:6969',
  }));
  puppeteer.use(pluginStealth());
  let file_link = await fetchLinkPage();
  let browser = await puppeteer.launch({
    headless: false,
    userDataDir: './var/prof',
    args: [
      '--window-size=1200,1400',
      '--window-position=000,000',
      '--no-sandbox',
      '--disable-dev-shm-usage',
      '--disable-web-security',
      '--disable-features=IsolateOrigins',
      '--disable-site-isolation-trials'
    ]
  });
  let page = await browser.newPage();
  await page.setExtraHTTPHeaders({ referer: file_link.referer });
  await page.goto(file_link.link);
  let pages = await browser.pages();
  while (true) {
    for await (let tab of pages) {
      await sleep(1500);
      if (await isDesiredPage(tab)) {
        await DesiredPage(tab);
      } else {
        // we will close the ad if it is in other tabs
        await tab.close();
      }
    }
    await sleep(500);
  }
}

main().catch((e) => {
  throw e;
});
I want the "reload" button to be pressed automatically when the tab drops. How do I do this?
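One approach, as a sketch only (not tested against your proxy setup): wrap the work for each tab in a try/catch and reload the tab when it fails, which is the programmatic equivalent of pressing the reload button. DesiredPage is the helper from your own code; the retry count is arbitrary.

async function processWithReload(tab, attempts = 3) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      await DesiredPage(tab);
      return;
    } catch (err) {
      console.warn(`Tab failed (attempt ${attempt}): ${err.message}`);
      // Same effect as clicking the browser's reload button.
      await tab.reload({ waitUntil: 'domcontentloaded' });
    }
  }
}

You could also listen for failed main-frame responses (page.on('response', ...) and check response.ok()) and trigger tab.reload() from there.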

Puppeteer not clicking button with text

I have a simple function that tries to accept the cookies.
Here's my code:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://www.sport1.de/live/darts-sport');
  await page.click('button[text=AKZEPTIEREN]');
  // await page.screenshot({ path: 'example.png' });
  // await browser.close();
})();
The cookie popup is placed in an iframe. You have to switch to the iframe via contentFrame to be able to click the accept button.
Also, if you want to filter by textContent, you need to use XPath; with a CSS selector you can't select elements by their textContent.
const cookiePopUpIframeElement = await page.$("iframe[id='sp_message_iframe_373079']");
const cookiePopUpIframe = await cookiePopUpIframeElement.contentFrame();
const acceptElementToClick = await cookiePopUpIframe.$x("//button[text()='AKZEPTIEREN']");
await acceptElementToClick[0].click();

Bypass Cloudflare with puppeteer

I am trying to scrape some startup data from a site with Puppeteer, and when I try to navigate to the next page the Cloudflare waiting screen comes up and disrupts the scraper. I tried changing the IP but it's still the same. Is there a way to bypass it with Puppeteer?
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
  });
  const page = await browser.newPage();
  page.setDefaultNavigationTimeout(0);
  let links = [];
  // initial page
  await page.goto(`https://www.startupranking.com/top/india`, {
    waitUntil: "networkidle0",
  });
  // looping through the url to different pages
  for (let i = 2; i <= 7; i++) {
    if (i === 3) {
      console.log("waiting");
      await page.waitFor(20000);
      console.log("waited");
    }
    const onPageLinks = await page.$$eval("tr .name a", (arr) =>
      arr.map((cur) => cur.href)
    );
    links = links.concat(onPageLinks);
    console.log(onPageLinks, "inside loop");
    await page.goto(`https://www.startupranking.com/top/india/${i}`, {
      waitUntil: "networkidle0",
    });
  }
  console.log(links, links.length, "outside loop");
})();
Since the check only happens on the first loop iteration, I put in a waitFor to ride out the time the check takes. It works fine on some IPs, but on others it presents challenges to solve. I have to run this on a server, so I am thinking of bypassing it completely.
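One common mitigation (no guarantee, since Cloudflare changes its checks often) is to launch through puppeteer-extra with the stealth plugin, the same packages used in the proxy question above, so the browser looks less like an automated one. A minimal sketch:

const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

puppeteer.use(StealthPlugin());

(async () => {
  const browser = await puppeteer.launch({ headless: false, defaultViewport: null });
  const page = await browser.newPage();
  // Same scraping loop as above; the stealth plugin only changes the browser fingerprint.
  await page.goto('https://www.startupranking.com/top/india', { waitUntil: 'networkidle0' });
  await browser.close();
})();

If a challenge still appears, solving it once manually with headless: false and reusing the same userDataDir can help, but there is no reliable way to bypass it from Puppeteer alone.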

How to handle a new page on button click in Puppeteer?

I am using Puppeteer in a project to test a web page. The page has several buttons that open a new tab in the browser. How can I handle that using Puppeteer?
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ defaultViewport: null });
  const page = await browser.newPage();
  // go to the URL
  await page.goto('https://example.com/', { waitUntil: 'networkidle2' });
  await page.click('.btnId'); // opens new tab with Page 2
  // handle Page 2
  // process Page 2
  // close Page 2
  // go back to Page 1
  await browser.close();
})();
How can I handle Page 2?
await page.waitFor(3 * 1000) // wait for new page to open
const pages = await browser.pages() // get all pages
const page2 = pages[pages.length - 1] // get the new page
// process the new page
await page2.close()
Hope this helps in solving the problem.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ defaultViewport: null });
  const page = await browser.newPage();
  // go to the URL
  await page.goto('https://example.com/', { waitUntil: 'networkidle2' });
  await page.click('.btnId'); // opens new tab with Page 2
  // you can make this as dynamic as you want, depending on the website and use case
  const [tabOne, tabTwo] = await browser.pages();
  // use the tab Page objects as needed
  console.log("Tab One Title ", await tabOne.title());
  console.log("Tab Two Title ", await tabTwo.title());
  // you can close each tab individually when it's done
  await browser.close();
})();
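A more deterministic alternative to a fixed wait (sketch only, with the selector reused from the question) is to wait for the target that the click opens, using browser.waitForTarget:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ defaultViewport: null });
  const page = await browser.newPage();
  await page.goto('https://example.com/', { waitUntil: 'networkidle2' });

  // Resolve as soon as the browser reports a new target opened by our page.
  const newTargetPromise = browser.waitForTarget(
    (target) => target.opener() === page.target()
  );
  await page.click('.btnId'); // opens new tab with Page 2
  const page2 = await (await newTargetPromise).page();

  // The new tab may still be loading; wait for whatever marks it as ready before interacting.
  console.log('Page 2 title:', await page2.title());
  await page2.close();

  await browser.close();
})();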

Screenshots location while running the puppeteer script

I have created a Puppeteer script to run offline, and I use the code below to take a screenshot. When running the offline-login-check.js script from the command prompt, could someone please advise where the screenshots are saved?
const puppeteer = require("puppeteer");

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    chromeWebSecurity: false,
    args: ['--no-sandbox']
  });
  try {
    // Create a new page
    const page = await browser.newPage();
    // Connect to Chrome DevTools
    const client = await page.target().createCDPSession();
    // Navigate and take a screenshot
    await page.waitFor(3000);
    await page.goto('https://sometestsite.net/home', { waitUntil: 'networkidle0' });
    // await page.goto(url, {waitUntil: 'networkidle0'});
    await page.evaluate('navigator.serviceWorker.ready');
    console.log('Going offline');
    await page.setOfflineMode(true);
    // Does === true for the main page but the fallback content isn't being served.
    page.on('response', r => console.log(r.fromServiceWorker()));
    await page.reload({ waitUntil: 'networkidle0' });
    await page.waitFor(5000);
    await page.screenshot({ path: 'screenshot.png', fullPage: true });
    await page.waitForSelector('mat-card[id="route-tile-card"]');
    await page.click('mat-card[id="route-tile-card"]');
    await page.waitFor(3000);
  } catch (e) {
    // handle initialization error
    console.log("Timeout or other error: ", e);
  }
  await browser.close();
})();
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    chromeWebSecurity: false,
    args: ['--no-sandbox']
  });
  try {
    // Create a new page
    const page = await browser.newPage();
    // Connect to Chrome DevTools
    const client = await page.target().createCDPSession();
    // Navigate and take a screenshot
    await page.goto('https://example.com', { waitUntil: 'networkidle0' });
    // await page.evaluate('navigator.serviceWorker.ready');
    console.log('Going offline');
    await page.setOfflineMode(true);
    // Does === true for the main page but the fallback content isn't being served.
    page.on('response', r => console.log(r.fromServiceWorker()));
    await page.reload({ waitUntil: 'networkidle0' });
    await page.screenshot({ path: 'screenshot2.png', fullPage: true });
    // await page.waitForSelector('mat-card[id="route-tile-card"]');
    // await page.click('mat-card[id="route-tile-card"]');
  } catch (e) {
    // handle initialization error
    console.log("Timeout or other error: ", e);
  }
  await browser.close();
})();
Then in the command line run ls | grep .png and you should see the screenshot there. Be aware that I removed await page.evaluate('navigator.serviceWorker.ready');, which might be specific to your website.
Your script is perfect. There is no problem with it!
The screenshot.png should be in the directory from which you run the node offline-login-check.js command.
If it's not there, you are probably getting an error or timeout before the page.screenshot command runs. Since your script is OK, this can be caused by network issues or issues with the page. For example, if your page has a never-ending connection (like a WebSocket), change "networkidle0" to "networkidle2" or "load"; otherwise the first page.goto will get stuck.
Again, your script is perfect. You don't have to change it.
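If you want to be explicit about where the file ends up, you can pass an absolute path to page.screenshot (small sketch, using Node's built-in path module):

const path = require('path');

// Writes the file next to the script itself instead of the current working directory.
await page.screenshot({
  path: path.join(__dirname, 'screenshot.png'),
  fullPage: true,
});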
