Puppeteer not working with google cloud functions in Node 10 runtime


I am trying to launch an instance of Google Chrome in headless mode but I get the following error:
Failed to launch the browser process!
/workspace/node_modules/puppeteer/.local-chromium/linux-737027/chrome-linux/chrome: error while loading shared libraries: libgbm.so.1: cannot open shared object file: No such file or directory
I am using puppeteer v3.0.0 with the Node.js 10 runtime.
How can I resolve this error?

Since the post is about Cloud Functions, the above will not work once the function is deployed. This problem is a known issue.
There's some discussion of reverting to Puppeteer 2.1.0:
github puppeteer issues 5674
or there's a workaround:
github puppeteer issues 5704
My experience:
I tried the workaround but it did not work. Maybe it needs tweaking, but I did not have time to debug.
I reverted to puppeteer 2.1.0, deployed the function with --memory 2048MB and it worked successfully.

Related

Cloud Functions Puppeteer cannot open browser

My setup in GCF:
run npm install --save puppeteer from the project Cloud Shell
edit package.json like so:
{ "dependencies": { "puppeteer": "^19.2.2" } }
paste code from medium.com into index.js:
https://gist.githubusercontent.com/Alezco/b9b7ce4ec7ee7f208818e395225fcbbe/raw/8554acc8b311a10e272f5d1b98dce3400945bb00/index.js
deploy with 2 GB RAM, 0-3 instances, max 500s timeout
I get these errors after building or opening the URL:
Internal Server Error
Could not find Chromium (rev. 1056772). This can occur if either 1. you did not perform an installation before running the script (e.g. npm install) or 2. your cache path is incorrectly configured (which is: /workspace/.cache/puppeteer). For (2), check out our guide on configuring puppeteer at https://pptr.dev/guides/configuration.
When I run npm list, both webdriver and puppeteer are installed. I suspect there is an issue with this path, but I cannot figure out where it should point.
If I knew, I could pass the executablePath argument to puppeteer.launch(), which might solve the problem.
I tried reinstalling puppeteer and changing the configuration. No luck.

In addition to adding a .puppeteerrc.cjs per Kristofer's answer, I added a postinstall script in my package.json:
"scripts": {
  ...
  "postinstall": "node node_modules/puppeteer/install.js"
},
This fixed the problem and I was able to deploy my Google Cloud Function. This is a temporary fix until issue #9128 is fixed.

I had the exact same issue and it seems to be related to this issue: https://github.com/puppeteer/puppeteer/issues/9128
I'm using Firebase and don't have complete control over the build process when I deploy my functions, but I still have access to the build logs. From the issue above, I realized I needed to handle the cache directory and the NPM version for this to work.
As far as I can tell, the problem is that the build step installs the Chrome browser needed for Puppeteer in a cache directory outside of the final image that is used for the actual function. In that context the error message makes more sense: it can't find the browser, therefore it doesn't work.
I was using Node 14 in my cloud functions which used NPM 6.14.17 in the build steps. According to the issue you need to use NPM > 7 so I upgraded my function to use Node 16.
Then I added the .puppeteerrc.cjs from https://pptr.dev/guides/configuration/#examples. When testing locally, this adds a .cache directory that contains the Chrome installation. That directory has to be ignored when deploying the cloud function, or the deploy will fail due to size.
In the firebase.json add:
"functions": {
  "ignore": [
    ".cache"
  ]
},
The last step is pretty specific for Firebase and I'm not sure how this applies to your build steps etc. But this solved my issue that had the exact same error messages as you had. So double check the following:
The NPM version in the build step needs to be at least v7; Node 16 should provide this.
The cache directory is specified in the .puppeteerrc.cjs.
It also looks like you're using an old Puppeteer version; I used 19.3.

I got this same (very misleading) error in my Puppeteer project, which I run in a Google Cloud Function. The issue was that the function was finishing (exiting) before the async Puppeteer script had finished.
I resolved this by changing the "await browser.close();" to a Promise and creating the response message in promise.then().
... only to hit the next problem: my script is not downloading the CSV file as expected. It works locally though...

How to reduce Puppeteer size

I'm using Puppeteer for web scraping in a small Node.js web app that I made. The app is hosted on Heroku and uses jontewks/puppeteer-heroku-buildpack to work.
The problem I'm facing is that my app does not build anymore because of the Heroku slug size limit:
Compiled slug size: 537.4M is too large (max is 500M).
I've tried several things:
Using Firefox instead of Chromium
It's a "no go" for me because of a current issue with puppeteer/firefox.
Reducing the size of Chromium by removing the file interactive_ui_tests.exe
I can't do this because Heroku uses Linux instead of Windows, and this file does not exist in the Linux Chromium distribution.
Using headless_shell instead of Chromium
I'm stuck with this (like here) as I do not understand how to make it work. I found the file to use here, but I'm facing the same issue as the comment from 07/09/2018.
Using Playwright instead of Puppeteer
It might be a solution, but I'm using things like puppeteer-extra and puppeteer-extra-plugin-stealth, so I would rather not switch.
Reducing the size of Chromium by removing the locales folder
It helps a bit, but not much.
Using an older version of Puppeteer (2.1.1), which uses an older, slightly lighter version of Chromium
At the moment, it's the only working solution that I have.
Running the commands heroku repo:gc -a myapp and heroku builds:cache:purge -a myapp
My last three points reduced the size of my slug to 490M. So my app is working, but it's not great for the near future, for example for keeping Puppeteer up to date.
So here I am, asking for help, as I do not have any more ideas at the moment.
Thank you very much for your help 🙏

In the end, I switched to Playwright.
With this buildpack, the build of my app is only 250 MB!
Here are the steps I followed:
Install playwright-chromium with npm to download only Chromium.
Set the PLAYWRIGHT_BUILDPACK_BROWSERS env variable to chromium in Heroku to install only the Chromium dependencies.
Put this buildpack before the Node.js buildpack in Heroku.
With this trick you can use most of the features of puppeteer-stealth.
If you want, you can block resources like in Puppeteer:
await page.route('**/*', route => ([
  'stylesheet',
  'image',
  'media',
  'font',
  // 'script',
  'texttrack',
  'xhr',
  'fetch',
  'eventsource',
  'websocket',
  'manifest',
  'other',
].includes(route.request().resourceType()) ? route.abort() : route.continue()))
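To show where that route handler fits, here is a minimal sketch of launching Chromium through the playwright-chromium package and applying the blocking rule before navigating. The helper names handleRoute and openPage and the shortened list of blocked types are assumptions for illustration; adjust the list to match the one above as needed.

```javascript
// Resource types to skip; a shortened, illustrative subset.
const BLOCKED_TYPES = new Set(['stylesheet', 'image', 'media', 'font']);

// Abort requests for blocked resource types, let everything else through.
function handleRoute(route) {
  return BLOCKED_TYPES.has(route.request().resourceType())
    ? route.abort()
    : route.continue();
}

// Hypothetical helper: launch Chromium, install the route handler, open url.
async function openPage(url) {
  // Loaded lazily so handleRoute can be used without Playwright installed.
  const { chromium } = require('playwright-chromium');
  const browser = await chromium.launch();
  const page = await browser.newPage();
  await page.route('**/*', handleRoute);
  await page.goto(url);
  return { browser, page };
}
```

The caller is responsible for calling browser.close() on the returned browser when done.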

ChromeNotInstalledError while using chrome-launcher npm package

I am using chrome-launcher to run Lighthouse programmatically. It works fine locally, but when I run it on Azure I get an error.
On this statement const chrome = await chromeLauncher.launch({chromeFlags: ['--headless']}); I am getting the following error:
ChromeNotInstalledError
at new LauncherError (C:\home\site\wwwroot\node_modules\chrome-launcher\dist\utils.js:37:22)
at new ChromeNotInstalledError (C:\home\site\wwwroot\node_modules\chrome-launcher\dist\utils.js:68:9) {
message: 'No Chrome installations found.',
code: 'ERR_LAUNCHER_NOT_INSTALLED'
}
How can I solve this?

You need to install Chrome on the Azure Function app somehow.
One way to do this is by using an npm dependency that installs Chrome as part of its install process. Examples of this are puppeteer and playwright, although you then end up with some unnecessary dependencies.
You could also have a startup script or similar that installs Chrome before running chrome-launcher/lighthouse. If Chrome is not installed in a standard place, you'll need to tell chrome-launcher where it is, using the chromePath option or the CHROME_PATH environment variable.
You also have to ensure you do a Remote Build for the Function app.
You will also run into this error, which has a possible workaround: https://github.com/GoogleChrome/chrome-launcher/issues/188
Overall it's not easy. I actually ended up moving my workflow to GitHub Actions instead as Chrome is already installed on their runner images.
See:
https://anthonychu.ca/post/azure-functions-headless-chromium-puppeteer-playwright/
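Pointing chrome-launcher at a non-standard Chrome location, as described above, could be sketched as follows. The helper names buildLaunchOptions and launchChrome are hypothetical, and the chrome-launcher module is passed in by the caller so the sketch does not assume how it is loaded.

```javascript
// Build chrome-launcher options, passing an explicit Chrome location when
// one is configured. CHROME_PATH is the variable chrome-launcher itself
// reads; setting chromePath as well just makes the choice explicit.
function buildLaunchOptions(env = process.env) {
  const options = { chromeFlags: ['--headless'] };
  if (env.CHROME_PATH) {
    options.chromePath = env.CHROME_PATH;
  }
  return options;
}

// `chromeLauncher` is the already-loaded chrome-launcher module.
async function launchChrome(chromeLauncher, env = process.env) {
  return chromeLauncher.launch(buildLaunchOptions(env));
}
```

With CHROME_PATH unset, this behaves like the launch call in the question and still depends on Chrome being discoverable in a standard location.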

Jest with puppeteer using existing chrome browser

I am trying to make puppeteer work with Jest for e2e testing while using an existing Chrome browser.
I chose puppeteer version "5.1.0" for Chrome browser version "84.0.4147" from the list of supported browsers.
I am trying to configure Jest with puppeteer using the information available at the following link:
Jest Puppeteer configuration using jest-puppeteer
Apparently the puppeteer library tries to download a Chromium browser binary, which I have skipped because I would like to use the existing Chrome browser, and I am having a hard time configuring that.
There is some help in the jest-puppeteer preset documentation, but not enough on how to use an existing browser.
I assume the configuration for using the existing Chrome should go in jest-puppeteer.config.js, but I don't know yet how to do it!
Right now my jest-puppeteer.config.js looks as follows:
module.exports = {
  launch: {
    headless: false,
    slowMo: false,
    devtools: true
  },
  browser: 'chromium',
  browserContext: 'default'
}
When I run my tests, I get the following error:
Error: Could not find browser revision 800071. Run "PUPPETEER_PRODUCT=firefox npm install" or "PUPPETEER_PRODUCT=firefox yarn install" to download a supported Firefox browser binary.

https://developers.google.com/web/tools/puppeteer/get-started
By default, Puppeteer downloads and uses a specific version of Chromium so its API is guaranteed to work out of the box. To use Puppeteer with a different version of Chrome or Chromium, pass in the executable's path when creating a Browser instance:
const browser = await puppeteer.launch({executablePath: '/path/to/Chrome'});

I was still getting the same error with the above answer, but the following solution worked for me:
https://docs.percy.io/docs/skipping-puppeteer-chromium-download

Failed to launch the browser process on Heroku

I built an app using Node, Express and sulla (which imports puppeteer) as its core.
Basically I scrape some data and use sulla to send it via WhatsApp.
It works fine locally, but when I deploy it on Heroku I'm faced with this issue:
Failed to launch the browser
process!\n[0601/222716.792459:FATAL:zygote_host_impl_linux.cc(116)] No
usable sandbox! Update your kernel or see
https://chromium.googlesource.com/chromium/src/+/master/docs/linux_suid_sandbox_development.md
for more information on developing with the SUID sandbox. If you want
to live dangerously and need an immediate workaround, you can try
using --no-sandbox ...... Core file will not be generated.
TROUBLESHOOTING:
https://github.com/puppeteer/puppeteer/blob/master/docs/troubleshooting.md
I've already added the following buildpacks to my Heroku app:
https://github.com/jontewks/puppeteer-heroku-buildpack.git
heroku/nodejs
https://github.com/heroku/heroku-buildpack-chromedriver
I've seen solutions like https://stackoverflow.com/a/52228855, but I can't apply them since I'm not using puppeteer directly. I also tried clearing the Heroku caches, without success.

As long as you are using the current npm package of sulla, unfortunately it won't work for you on Heroku. As the linked question says, you need to launch puppeteer with --no-sandbox (the --disable-setuid-sandbox arg is not mandatory for Heroku):
await puppeteer.launch({ args: ['--no-sandbox'] })
Sulla lacks this arg in the npm package (launch) (current launch config) (not used launch config that would work with Heroku).
It is very good that you've already added the buildpacks; those are needed if puppeteer is running as a dependency in the background.
I.) You could try a fork of sulla called sulla-hotfix: https://www.npmjs.com/package/@jprustv/sulla-hotfix if it suits your needs. This one still uses the previous sulla puppeteer config, which apparently contains the --no-sandbox launch arg.
The same is true for the original project sulla was forked from: @open-wa/wa-automate. It may work on Heroku with the buildpacks.
II.) Or you could publish a modified version of sulla under the MIT license, containing the right launch parameter.
