TimeoutError: Navigation timeout of 30000 ms exceeded in Puppeteer on Ubuntu (Node.js)

There is no issue on Windows, but on my production Ubuntu server I get this error after the goto call:
const browser = await puppeteer.launch({
  headless: true,
  args: ['--no-sandbox', '--disable-setuid-sandbox'],
});
const url: string = login.url;
const page: any = await browser.newPage();
await page.setUserAgent('Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36');
await page.goto(url, { waitUntil: 'networkidle2' });
await page.setViewport({
  width: 1520,
  height: 800,
  deviceScaleFactor: 1,
  isMobile: false
});
chromium-browser and puppeteer are installed, along with some other packages such as libgbm-dev.
Can anyone tell me what the issue is?
If you need any more information, please comment.

In my case, I was running an Ubuntu server with 512 MB of memory that could not handle my scripts. I figured this out by writing a simple scraper that visited Google, which worked fine. I then ran my more intensive scrapers and watched memory usage via htop while they failed to execute and gave me a timeout error.
I upgraded the server to 2 GB of memory, and everything worked fine. You might not need to upgrade all the way to 2 GB, but I did just in case.
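If you want a rough check from inside Node before launching the browser, something like the sketch below may help. It only uses the built-in os module; the 1 GiB threshold and the memoryReport name are my own assumptions for illustration, not anything Puppeteer requires.

```javascript
const os = require('os');

// Hypothetical threshold: warn if less than 1 GiB is free.
const MIN_FREE_BYTES = 1024 * 1024 * 1024;

// Returns a small report on current memory as seen by Node.
function memoryReport(minFreeBytes = MIN_FREE_BYTES) {
  const freeBytes = os.freemem();
  const totalBytes = os.totalmem();
  return {
    freeMb: Math.round(freeBytes / 1024 / 1024),
    totalMb: Math.round(totalBytes / 1024 / 1024),
    enough: freeBytes >= minFreeBytes,
  };
}

const report = memoryReport();
console.log(`free: ${report.freeMb} MB of ${report.totalMb} MB`);
if (!report.enough) {
  console.warn('Low memory: page.goto may time out on heavy pages');
}
```

This is not a substitute for watching htop while the scraper runs, but it makes the low-memory case visible in your logs.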

Puppeteer sometimes needs a lot of time to respond.
In my case (Puppeteer 19.4.1, Ubuntu 20.04.1 LTS server with 1 GB of RAM), I solved the issue just by increasing the page.goto timeout to 2 minutes:
await page.goto(url, {'timeout': 120000});
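If a single longer timeout is still not enough on a slow server, you could combine it with a couple of retries. This is only a sketch: gotoWithRetry is a hypothetical helper name, and the retry count and the networkidle2 setting are assumptions you should tune. It takes the page as a parameter, so it works with any object that has a Puppeteer-style goto(url, options) method.

```javascript
// Sketch: call page.goto with a longer timeout and retry on failure.
// `page` is expected to expose goto(url, options) like a Puppeteer Page.
async function gotoWithRetry(page, url, { timeout = 120000, retries = 2 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    try {
      return await page.goto(url, { timeout, waitUntil: 'networkidle2' });
    } catch (err) {
      lastError = err;
      console.warn(`goto attempt ${attempt} failed: ${err.message}`);
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}

// Usage inside an async function, with a real Puppeteer page:
// await gotoWithRetry(page, url, { timeout: 120000, retries: 2 });
```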

Related

Error while trying to create a PDF of Size around 7 MB (500 pages) from HTML using Node JS, Puppeteer in AWS EC2 server. Works fine in local

I am trying to create a PDF from HTML (generated with Handlebars) using Node.js and Puppeteer. The resulting PDF is about 7 MB and around 500 pages. The code works fine on my local machine, but I am facing issues on cloud servers. I have tried a Heroku free dyno (no longer available after 29th November 2022) and an AWS EC2 server.
On Heroku I was getting the error below:
Error R15 (Memory quota vastly exceeded)
On the AWS EC2 server:
If I use the --single-process flag when launching the Puppeteer browser, I get the error below:
ProtocolError: Protocol error (Runtime.callFunctionOn): Target closed.
at /home/ec2-user/revisor_template_engine/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:329:24
at new Promise (<anonymous>)
at CDPSessionImpl.send (/home/ec2-user/revisor_template_engine/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Connection.js:325:16)
at ExecutionContext._ExecutionContext_evaluate (/home/ec2-user/revisor_template_engine/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:211:46)
at processTicksAndRejections (internal/process/task_queues.js:95:5)
at async ExecutionContext.evaluate (/home/ec2-user/revisor_template_engine/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/ExecutionContext.js:107:16)
at async IsolatedWorld.setContent (/home/ec2-user/revisor_template_engine/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/IsolatedWorld.js:217:9)
at async CDPPage.setContent (/home/ec2-user/revisor_template_engine/node_modules/puppeteer/node_modules/puppeteer-core/lib/cjs/puppeteer/common/Page.js:996:9)
If I don't use the --single-process flag, the code gets stuck after the page.setContent() call.
Here is the code snippet:
Note: the HTML is generated from Handlebars.
const browser = await puppeteer.launch({
  args: [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-gpu',
    '--disable-dev-shm-usage',
    '--no-first-run',
    '--no-zygote',
    '--single-process',
    '--shm-size=2gb',
    '--disable-features=site-per-process',
    '--disable-features=IsolateOrigins',
    '--disable-site-isolation-trials',
    '--unlimited-storage',
    '--force-gpu-mem-available-mb',
    '--full-memory-crash-report'
  ],
  headless: true,
  timeout: 300000,
  devtools: true
});
const page = await browser.newPage();
await page.setContent(html, {timeout: 300000, waitUntil: ['domcontentloaded', 'load', "networkidle0"]}); // This line throws error
let buffer = await page.pdf({...options, timeout: 300000});
The above code works fine on my local machine but throws an error on cloud servers.
Puppeteer version used: 19.2.2
Node version on AWS EC2: v14.21.1
If I try with a small HTML string (~300 KB), the same code works fine on AWS EC2.
I have followed this link https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md#running-puppeteer-on-aws-ec2-instance-running-amazon-linux
I have also tried
https://github.com/puppeteer/puppeteer/issues/1175
Puppeteer - Protocol error (Page.navigate): Target closed
https://github.com/puppeteer/puppeteer/issues/3683
The same question was posted on Stack Overflow but has no answer:
Puppeteer Protocol error (Runtime.callFunctionOn): Target closed

Puppeteer Waiting for target frame Ubuntu digitalocean

I have been building a web scraper in Node.js and running it on a DigitalOcean Ubuntu server. Puppeteer only has issues on Ubuntu for my program.
I originally had an issue running Puppeteer as the root user, so I switched to a new account I created on the server, and now I have this new issue.
Version: HeadlessChrome/105.0.5173.0
Error: Waiting for target frame D0E4A57B880331E15F232D467A28499A failed
at Timeout._onTimeout (/home/pricepal/priceServer-deployment/price-server/node_modules/puppeteer/lib/cjs/puppeteer/common/util.js:447:18)
at listOnTimeout (node:internal/timers:564:17)
at process.processTimers (node:internal/timers:507:7)
Node.js v18.7.0
Here is the block of code that the program stops at and eventually errors out:
try {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto(link)
  const content = await page.content()
  await browser.close()
  return content
} catch (error) {
  console.log(error)
}
It takes a little longer than normal to launch the headless browser, but the error stems from a timeout at page.goto(link). All of the links fail to load, not just one in particular.
The links I am using work when run on my M1 Mac with the same Chromium and Node versions.
I have been researching and trying new things all day, but I cannot get it fixed and have found few resources relating to this issue.
I had the exact same problem and have been pulling my hair out looking for answers for the past few days. I know it's not exactly a proper answer (mods, sorry if you have to delete this), but I found that switching from Ubuntu to Debian 10 magically fixed everything. FWIW, the line causing the error is:
const page = await browser.newPage()
I suspect the issue lies somewhere within the version of Chromium that Puppeteer downloads, and its interaction with the OS. What exactly though I couldn't say. My results are as follows:
Didn't work:
Ubuntu 22.04
Ubuntu 20.04
Debian 11
Worked:
Debian 10

Run Puppeteer on Chrome, not Chromium

I want to open the Telegram site with Puppeteer, but there is a problem.
The Telegram session only exists in my regular Chrome, so I have to log in again every time Puppeteer starts.
Is there a way for Puppeteer to run on the already-running Chrome so that it picks up the existing session?
const browser = await puppeteer.launch({
  headless : false,
  executablePath: "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe",
  args: ["--lang=en-US,en", '--no-sandbox', '--disable-setuid-sandbox', '--disable-extensions']
})
This code runs properly, but the browser it opens still behaves like Chromium: the Telegram session is not there.
Yes, it's possible to run a Puppeteer instance on top of a pre-existing Chrome process.
To achieve this, first start the Chrome process with the remote-debugging-port option, usually given as: --remote-debugging-port=9222
This Medium article explains how to do so in detail, but to summarize:
macOS:
Run:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 --no-first-run --no-default-browser-check --user-data-dir=$(mktemp -d -t 'chrome-remote_data_dir')
Windows:
Right-click your Google Chrome shortcut icon => Properties
In the Target field, add --remote-debugging-port=9222 at the very end
It should look something like:
"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
Then, you'll be able to navigate to http://localhost:9222/json/version (use the same port you defined above) and see output like this:
{
  "Browser": "HeadlessChrome/87.0.4280.66",
  "Protocol-Version": "1.3",
  "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/87.0.4280.66 Safari/537.36",
  "V8-Version": "8.7.220.25",
  "WebKit-Version": "537.36 (#fd98a29dd59b36f71e4741332c9ad5bda42094bf)",
  "webSocketDebuggerUrl": "ws://localhost:9222/devtools/browser/000aaaa-bb08-55af-a8e3-760dd9998fc7"
}
Then, you can use the puppeteer connect() method (instead of the launch() method) like this:
const browser = await puppeteer.connect({
  browserWSEndpoint: "ws://localhost:9222/devtools/browser/000aaaa-bb08-55af-a8e3-760dd9998fc7",
});
// now, 'browser' is connected to your Chrome window.
// get the opened pages
const openedPages = await browser.pages();
// pick the one you want (Telegram); find() returns the first match or undefined.
// not sure this is the best way to do it, please test it yourself
const telegramPage = openedPages.find(page => page.url().includes("telegram"));
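Note that the browser ID in webSocketDebuggerUrl changes every time Chrome restarts, so hard-coding it gets tedious. As a sketch, you could read it from /json/version at runtime instead. This assumes Node 18+ (for the global fetch) and that Chrome was started with --remote-debugging-port as described above; getBrowserWSEndpoint is a hypothetical helper name.

```javascript
// Sketch: look up the current webSocketDebuggerUrl instead of hard-coding it.
// fetchFn is injectable so the helper can be tested without a running Chrome.
async function getBrowserWSEndpoint(port = 9222, fetchFn = fetch) {
  const res = await fetchFn(`http://localhost:${port}/json/version`);
  if (!res.ok) {
    throw new Error(`Chrome debugging endpoint returned HTTP ${res.status}`);
  }
  const info = await res.json();
  return info.webSocketDebuggerUrl;
}

// Usage inside an async function (needs puppeteer and a running Chrome):
// const browser = await puppeteer.connect({
//   browserWSEndpoint: await getBrowserWSEndpoint(9222),
// });
```

As far as I know, puppeteer.connect() also accepts a browserURL option (for example http://localhost:9222) that performs this lookup for you, so check whether that is enough for your version before adding a helper.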

Puppeteer is failing to launch the browser in local

I am getting this error again and again while launching the application. I have reinstalled Puppeteer 8-9 times and even installed all the dependencies listed in the Troubleshooting link.
Error: Failed to launch the browser process! spawn /home/......./NodeJs/Scraping/code3/node_modules/puppeteer/.local-chromium/linux-756035/chrome-linux/chrome ENOENT
TROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md
This code just takes a screenshot of google.com.
Node.js version: 14.0.0
Puppeteer version: 4.0.1
Ubuntu version: 20.04
I am using the Chromium that is bundled with Puppeteer.
const puppeteer = require("puppeteer");
const chalk = require("chalk");
// MY OCD of colorful console.logs for debugging... IT HELPS
const error = chalk.bold.red;
const success = chalk.keyword("green");
(async () => {
  // declared outside try so the catch block can check it
  let browser;
  try {
    // open the headless browser
    browser = await puppeteer.launch({ headless: false });
    // open a new page
    const page = await browser.newPage();
    // enter url in page
    await page.goto(`https://www.google.com/`);
    // Google Say Cheese!!
    await page.screenshot({ path: "example.png" });
    await browser.close();
    console.log(success("Browser Closed"));
  } catch (err) {
    // Catch and display errors
    console.log(error(err));
    // browser is still undefined if launch itself failed
    if (browser) {
      await browser.close();
      console.log(error("Browser Closed"));
    }
  }
})();
As you said, Puppeteer 2.x.x works perfectly for you but 4.x.x doesn't. It seems to be a Linux dependency issue, which has occurred more often since Puppeteer 3.x.x (usually libgbm1 is the culprit).
If you are not sure where your Chrome executable is located, first run:
whereis chrome
(e.g.: /usr/bin/chrome)
Then to find your missing dependencies run:
ldd /usr/bin/chrome | grep not
sudo apt-get install the listed dependencies.
After this, you should be able to do a clean npm install of your project with the latest Puppeteer as well (5.0.0 as of today).
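The same ldd check can be scripted from Node if you prefer. The sketch below assumes ldd is on the PATH and that missing libraries are printed in the usual "libname => not found" form; parseMissingLibs and findMissingLibs are hypothetical helper names.

```javascript
const { execFileSync } = require('child_process');

// Extract library names that ldd reports as "not found".
function parseMissingLibs(lddOutput) {
  return lddOutput
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.endsWith('=> not found'))
    .map((line) => line.split('=>')[0].trim());
}

// Run ldd on a binary and return the names of its missing shared libraries.
function findMissingLibs(binaryPath) {
  const output = execFileSync('ldd', [binaryPath], { encoding: 'utf8' });
  return parseMissingLibs(output);
}

// Usage on Linux, with the path from `whereis chrome`:
// console.log(findMissingLibs('/usr/bin/chrome'));
```

The output gives library file names, not package names, so you still have to map each one to the package that provides it before running apt-get install.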

Unhandled promise rejection (rejection id: 1): Error: kill ESRCH

I've done some research on the web and on Stack Overflow, but found nothing really helpful about this error.
I installed Node and Puppeteer in Bash on Ubuntu on Windows 10 but couldn't get it to work, although I did get it working on Windows without Bash on another machine.
My command is :
node index.js
My index.js tries to take a screenshot of a page :
const puppeteer = require('puppeteer');
async function run() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://github.com');
  await page.screenshot({ path: 'screenshots/github.png' });
  browser.close();
}
run();
Does anybody know the way I could fix this "Error: kill ESRCH" error?
I had the same issue, this worked for me.
Try updating your script to the following:
const puppeteer = require('puppeteer');
async function run() {
  //const browser = await puppeteer.launch();
  const browser = await puppeteer.launch({headless: true, args: ['--no-sandbox'] }); //WSL's chrome support is very new, and requires sandbox to be disabled in a lot of cases.
  const page = await browser.newPage();
  await page.goto('https://github.com');
  await page.screenshot({ path: 'screenshots/github.png' });
  await browser.close(); //As @Md. Abu Taher suggested
}
run();
const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
If you want to read all the details on this, this ticket has them (or links to them).
https://github.com/Microsoft/WSL/issues/648
Other puppeteer users with similar issues:
https://github.com/GoogleChrome/puppeteer/issues/290#issuecomment-322851507
I just fixed this issue. What you need to do is the following:
1) Install Debian dependencies
You can find them in this doc:
https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md
sudo apt-get install all of those bad boys.
2) Add '--no-sandbox' flag when launching puppeteer
3) Make sure your Windows 10 is up to date. I was missing an important update that allows Chrome to launch.
Points to consider:
Windows bash is not a complete drop-in replacement for Ubuntu bash (yet). There are many cases where GUI-based apps do not work properly. Also, the script might be confused by Bash on Windows 10: it could think that the OS is Linux instead of Windows.
Windows 10 bash only supports 64-bit binaries, so make sure the Node and Chrome versions used inside it are 64-bit. Puppeteer is using -child.pid to kill the child processes instead of child.pid on the Windows version. Make sure Puppeteer is not getting confused by all this Bash/Windows mixing.
Back to your case.
You are using browser.close() in the function, but it should be await browser.close(); otherwise it does not execute in the proper order.
Also, you should try adding await page.close(); before browser.close();.
So the code should be:
await page.close();
await browser.close();
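To make sure the browser is always closed in order, even when navigation or the screenshot throws, you could wrap the work in try/finally. This is only a sketch: withBrowser is a hypothetical helper, and it takes the launch function as a parameter so it does not depend on any particular Puppeteer version.

```javascript
// Sketch: run `work` with a browser and always await browser.close() afterwards.
// `launch` is a function returning a Puppeteer-style browser,
// e.g. () => puppeteer.launch({ args: ['--no-sandbox'] }).
async function withBrowser(launch, work) {
  const browser = await launch();
  try {
    return await work(browser);
  } finally {
    // Runs on success and on error, so no Chrome process is left behind.
    await browser.close();
  }
}

// Usage inside an async function, with real Puppeteer:
// await withBrowser(() => puppeteer.launch({ args: ['--no-sandbox'] }), async (browser) => {
//   const page = await browser.newPage();
//   await page.goto('https://github.com');
//   await page.screenshot({ path: 'screenshots/github.png' });
//   await page.close();
// });
```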
I worked around it by symlinking chrome.exe to node_modules/puppeteer/.../chrome, as below:
ln -s /mnt/c/Program\ Files\ \(x86\)/Google/Chrome/Application/chrome.exe node_modules/puppeteer/.local-chromium/linux-515411/chrome-linux/chrome
