How to use pkg with puppeteer? - node.js

I've been trying to make this simple code work with pkg.
const puppeteer = require("puppeteer");
async function scraper(url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url);
  const title = await page.title();
  await browser.close();
  return title;
}
scraper("http://example.com").then(console.log);
But the packaged exe closes immediately.
I know the problem is that no path to Chromium is specified. I searched a lot and tried many different things, but nothing worked.
One thing I tried:
const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});
But that did not work either.

Related

puppeteer overridePermissions clipboard-read not working on createIncognitoBrowserContext()

The following code works for reading the clipboard in both headless and headful mode:
var context = await client.defaultBrowserContext();
await context.overridePermissions('http://localhost', ['clipboard-read']);
page = await browser.newPage();
await page.goto( 'http://localhost/test/', {waitUntil: 'load', timeout: 35000});
// click button for clipboard..
let clipboard = await page.evaluate(`(async () => await navigator.clipboard.readText())()`);
But when you later open an incognito context, it no longer works:
const incognito = await client.createIncognitoBrowserContext();
page = await incognito.newPage();
and you get:
DOMException: Read permission denied.
I am currently trying to figure out how to combine the two. Does anybody know how to set overridePermissions inside the new incognito window?
Please note that I do not want to use the incognito Chrome argument at startup. I want to manually create new incognito pages inside my scripts with the correct overridePermissions.
I am having the very same issue. Here's a minimal reproducible example.
Node.js version: v16.13.1
puppeteer version: puppeteer@14.4.1
'use strict';
const puppeteer = require('puppeteer');
const URL = 'https://google.com';
(async () => {
  const browser = await puppeteer.launch();
  const context = browser.defaultBrowserContext();
  // overridePermissions returns a promise, so wait for it before navigating
  await context.overridePermissions(URL, ['clipboard-read', 'clipboard-write']);
  const page = await browser.newPage();
  await page.goto(URL, {
    waitUntil: 'networkidle2',
  });
  await page.evaluate(() => navigator.clipboard.writeText("Injected"));
  const value = await page.evaluate(() => navigator.clipboard.readText());
  console.log(value);
})();
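A possible way to combine the two, as a minimal sketch: overridePermissions is a method of the browser context, so it should be callable on the incognito context returned by createIncognitoBrowserContext() (the call used in the question; newer puppeteer releases rename it to createBrowserContext()). The function name openIncognitoPageWithClipboard is made up for illustration, and I have not checked this against every puppeteer version.

```javascript
// Sketch: grant clipboard permissions on the incognito context itself.
// `browser` is a puppeteer Browser, `origin` is the site that needs access.
async function openIncognitoPageWithClipboard(browser, origin) {
  const incognito = await browser.createIncognitoBrowserContext();

  // Permissions are tracked per browser context, so they are overridden here
  // rather than on browser.defaultBrowserContext().
  await incognito.overridePermissions(origin, ['clipboard-read']);

  // Pages opened from this context share the permissions granted above.
  return incognito.newPage();
}
```

Usage would look like: page = await openIncognitoPageWithClipboard(browser, 'http://localhost');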

Puppeteer doesn't recognize selector with just type and class but accepts full selector

I'm trying to click through a cookie wall on a webpage, but Puppeteer refuses to recognize the short selector that uses just the type and class (button.button-action). Switching to the full CSS selector fixes the problem, but that isn't a viable solution because any change in the parent elements can break the selector. As far as I know this shouldn't be a problem, because on the page in question document.querySelector("button.button-action") also returns the element I'm trying to click.
The code that doesn't work:
const puppeteer = require('puppeteer');
const main = async () => {
  const browser = await puppeteer.launch({headless: false,});
  const page = await browser.newPage();
  await page.goto("https://www.euclaim.nl/check-uw-vlucht#/problem", { waitUntil: 'networkidle2' });
  const cookiewall = await page.waitForSelector("button.button-action", {visible: true});
  await cookiewall.click();
};
main();
The code that does work:
const puppeteer = require('puppeteer');
const main = async () => {
  const browser = await puppeteer.launch({headless: false,});
  const page = await browser.newPage();
  await page.goto("https://www.euclaim.nl/check-uw-vlucht#/problem", { waitUntil: 'networkidle2' });
  const cookiewall = await page.waitForSelector("#InfoPopupContainer > div.ipBody > div > div > div.row.actionButtonContainer.mobileText > button", {visible: true});
  await cookiewall.click();
};
main();
The problem is that there are three button.button-action elements on the page, and the first match is not visible.
One thing you could do is call waitForSelector without the visible option (with it, only the first matching button is checked).
Then iterate through all the matches and click the first one that is clickable (boundingBox() returns null for an element that is not visible).
await page.waitForSelector("button.button-action");
const actions = await page.$$("button.button-action");
for (let action of actions) {
  if (await action.boundingBox()) {
    await action.click();
    break;
  }
}

nodejs puppeteer How can I type input inside an iframe?

I am trying to type my username on a website, but the input box is inside an iframe. I tried the code below to locate the element inside the iframe, but I keep getting the error "JSHandles can be evaluated only in the context they were created!"
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();
  await page.goto("www.examplesite.com", { waitUntil: 'networkidle0' })
  await sleep(1000)
  const myframe = await page.frames()[2];
  const userselector = await myframe.$('input[name="usernameinput"]')
  await page.type(userselector, "myusername")
  await page.screenshot({path: 'example.png'});
  await browser.close();
})();
The type function expects a selector string, not a handle. Since the frame also has a type function, you could do:
await myframe.type('input[name="usernameinput"]', "myusername")
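Picking the frame by index (page.frames()[2]) can break when the page adds or reorders frames. A minimal sketch of an alternative, assuming the iframe can be recognized by part of its URL: look the frame up with frame.url(), wait for the input inside that frame, and type through the element handle's own type method. The names typeIntoFrame and urlFragment are made up for illustration.

```javascript
// Hypothetical helper: type into an input that lives inside an iframe.
// `urlFragment` is any substring that identifies the iframe's URL.
async function typeIntoFrame(page, urlFragment, selector, text) {
  const frame = page.frames().find((f) => f.url().includes(urlFragment));
  if (!frame) {
    throw new Error(`No frame whose URL contains "${urlFragment}"`);
  }
  // The handle is created in the frame's own context and is only used there.
  const input = await frame.waitForSelector(selector);
  await input.type(text);
}
```

Usage would look like: await typeIntoFrame(page, 'login', 'input[name="usernameinput"]', 'myusername'); where 'login' stands in for whatever the real iframe URL contains.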

How can I get all the items, like src, title and URL, from a specific page using this code?

I have been working on web scraping code in Node.js, using the puppeteer npm package, to get the URL, image and title of each news item on the page, but I was only able to get the URL, image and title of the first one.
const puppeteer = require('puppeteer');
(async () => {
  const brower = await puppeteer.launch();
  const page = await brower.newPage();
  const url = 'https://es.cointelegraph.com/category/latest';
  await page.goto(url, { waitUntil: 'load' });
  const datos = await page.evaluate(() => Array.from(document.querySelectorAll('.categories-page__list'))
    .map( info => ({
      titulo: info.querySelector('.post-preview-item-inline__title').innerText.trim(),
      link: info.querySelector('.post-preview-item-inline__title-link').href,
      imagen: info.querySelector('.post-preview-item-inline__figure .lazy-image__wrp img ').src
    }))
  )
  console.log(datos);
  await page.close();
  await brower.close();
})()
That is because there is just one .categories-page__list element on the page, while there are many .post-preview-list-inline__item elements.
You map over the array built from document.querySelectorAll('.categories-page__list'), but that array has just one element, so the map callback correctly runs just once.
So, replace
document.querySelectorAll('.categories-page__list')
with
document.querySelectorAll('.post-preview-list-inline__item')
and everything works.
Here you can find a working example.
Let me know if you need some more help 😉
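Put together, the extraction with the replaced selector could look like the sketch below. The null checks are my own addition, on the assumption that some items may lack a title link or an image; the function name extractPosts is made up, and the selectors are the ones from the question.

```javascript
// Runs in the browser context when passed to page.evaluate(), so it may only
// use page globals such as `document`.
function extractPosts() {
  const items = document.querySelectorAll('.post-preview-list-inline__item');
  return Array.from(items).map((item) => {
    const title = item.querySelector('.post-preview-item-inline__title');
    const link = item.querySelector('.post-preview-item-inline__title-link');
    const image = item.querySelector('.post-preview-item-inline__figure .lazy-image__wrp img');
    return {
      titulo: title ? title.innerText.trim() : null,
      link: link ? link.href : null,
      imagen: image ? image.src : null, // assumption: some items may have no image
    };
  });
}
```

In the original script this would be called as: const datos = await page.evaluate(extractPosts);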

Getting Puppeteer to wait for a given text to appear/render on a page?

I want to load a page, and then wait for the text (or class in this case) to be rendered before I get the content.
This example works.
async function test() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://sunnythailand.com');
  // Read the initial HTML so the loop below has something to check
  let content = await page.content();
  // Wait until the page is fully rendered
  while (content.indexOf("scrapebot_description") < 0) {
    console.log("looking for scrapebot_description")
    await new Promise((resolve)=>setTimeout(()=> resolve() ,1000));
    content = await page.content();
  }
  console.log("scrapebot_description FOUND!!!")
  await browser.close();
}
My question is: is there an easier way to do this with puppeteer?
I tried this:
await page.waitForFunction('document.querySelector("scrapebot_description")');
But that just hangs forever and nothing ever happens.
(To be honest, I don't understand what querySelector is, so perhaps the problem is there.)
I also tried this:
var checkText = "scrapebot_description"
await page.evaluate((checkText) => {
  console.log("scrapebot_description FOUND IT!!");
},{checkText});
This also does not work.
This is the last element to render on the page, and it is the one I'm waiting for:
<span class="hide scrapebot_description ng-binding" ng-bind="'transFrontDescription' | translate">
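querySelector takes a CSS selector, so a class name needs a leading dot; "scrapebot_description" without the dot looks for a <scrapebot_description> element and never matches. Waiting for the class selector works: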
async function test() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://sunnythailand.com');
  const selector = '.scrapebot_description' // or #scrapebot_description
  await page.waitForSelector(selector)
  console.log("scrapebot_description FOUND!!!")
  await browser.close();
}
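If you need to wait for specific text rather than just for the element to exist, page.waitForFunction can do that. This is a minimal sketch: the helper name waitForText and the 30 second default timeout are my own choices, and the text you wait for depends on what the page actually renders.

```javascript
// Hypothetical helper: resolve once `selector` exists and contains `text`.
async function waitForText(page, selector, text, timeout = 30000) {
  await page.waitForFunction(
    // This callback runs inside the page, so it only sees its own arguments.
    (sel, expected) => {
      const el = document.querySelector(sel);
      return Boolean(el && el.textContent.includes(expected));
    },
    { timeout }, // the callback is re-run until it returns true or the timeout expires
    selector,
    text
  );
}
```

Usage would look like: await waitForText(page, '.scrapebot_description', 'some expected text');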
