How to run NodeJS puppeteer with Chromium v77 in Docker? - node.js

Alpine's edge branch has carried Chromium v77 since 8 October.
Reference: https://pkgs.alpinelinux.org/packages?name=chromium&branch=edge
I followed the steps to download Chromium v77 and run Puppeteer v1.20, but I get this error when running it:
Error for printPdf()
{}
Error: Failed to launch chrome!
Error relocating /usr/lib/chromium/chrome: _ZNSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEEC1Ev: symbol not found
Error relocating /usr/lib/chromium/chrome: _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev: symbol not found
Error relocating /usr/lib/chromium/chrome: hb_subset_input_set_retain_gids: symbol not found
Error relocating /usr/lib/chromium/chrome: _ZNSt19_Sp_make_shared_tag5_S_eqERKSt9type_info: symbol not found
TROUBLESHOOTING: https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md
at onClose (/usr/src/app/node_modules/puppeteer/lib/Launcher.js:348:14)
at Interface.<anonymous> (/usr/src/app/node_modules/puppeteer/lib/Launcher.js:337:50)
at Interface.emit (events.js:214:15)
at Interface.close (readline.js:403:8)
at Socket.onend (readline.js:180:10)
at Socket.emit (events.js:214:15)
at endReadableNT (_stream_readable.js:1178:12)
at processTicksAndRejections (internal/process/task_queues.js:77:11)
Dockerfile:
FROM node:12-alpine
ENV CHROME_BIN="/usr/bin/chromium-browser" \
    PUPPETEER_SKIP_CHROMIUM_DOWNLOAD="true"
RUN set -x \
    && apk update \
    && apk upgrade \
    && echo "127.0.0.1 localhost" >> /etc/hosts \
    && echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" > /etc/apk/repositories \
    && echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories \
    && echo "http://dl-cdn.alpinelinux.org/alpine/edge/main" >> /etc/apk/repositories \
    && apk add --no-cache g++ chromium \
    && npm install puppeteer@1.20.0 puppeteer-core@1.20.0
...

I think the issue is with your installation. You can try this as a base image.
FROM zenika/alpine-chrome:77-with-node
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
RUN npm install puppeteer@1.20.0 puppeteer-core@1.20.0
COPY my_script.js /usr/src/app/
CMD ["node","my_script.js"]
my_script.js test code:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    executablePath: '/usr/bin/chromium-browser',
    args: ['--no-sandbox', '--headless', '--disable-gpu']
  });
  const page = await browser.newPage();
  await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
  await page.pdf({path: 'hn.pdf', format: 'A4'});
  await browser.close();
})();
If you want to build from scratch, you can use this Dockerfile and modify it as needed.

Related

Docker returns error DPI-1047: cannot locate a 64-bit Oracle library: libclntsh.so. Node.js Windows 10

I'm using a Docker container on Windows 10 with Node.js. When I try to get data from an Oracle database (the connection is made in the Node.js code), I get this message:
DPI-1047: Cannot locate a 64-bit Oracle Client library: "libclntsh.so: cannot open shared object file: No such file or directory". See https://oracle.github.io/node-oracledb/INSTALL.html for help
When I make the same request without the container (running the server directly), the data is returned fine.
Dockerfile:
FROM node:latest
WORKDIR /app
COPY package*.json app.js ./
RUN npm install
COPY . .
EXPOSE 9000
CMD ["npm", "start"]
connection to oracle:
async function send2db(sql_command, res) {
  console.log("IN");
  console.log(sql_command);
  try {
    await oracledb.createPool({
      user: dbConfig.user,
      password: dbConfig.password,
      connectString: dbConfig.connectString,
    });
    console.log("Connection pool started");
    const result = await executeSQLCommand(sql_command
      // { outFormat: oracledb.OUT_FORMAT_OBJECT }
    );
    return result;
  } catch (err) {
    // console.log("init() error: " + err.message);
    throw err;
  }
}
From Docker for Oracle Database Applications in Node.js and Python here is one solution:
FROM node:12-buster-slim
WORKDIR /opt/oracle
RUN apt-get update && \
    apt-get install -y libaio1 unzip wget
RUN wget https://download.oracle.com/otn_software/linux/instantclient/instantclient-basiclite-linuxx64.zip && \
    unzip instantclient-basiclite-linuxx64.zip && \
    rm -f instantclient-basiclite-linuxx64.zip && \
    cd instantclient* && \
    rm -f *jdbc* *occi* *mysql* *jar uidrvci genezi adrci && \
    echo /opt/oracle/instantclient* > /etc/ld.so.conf.d/oracle-instantclient.conf && \
    ldconfig
You would want to use a later Node.js version now. The referenced link shows installs on other platforms too.
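A side note on the question's send2db() above: it calls oracledb.createPool() on every request, which can fail once a default pool already exists. A minimal sketch of creating the pool only once follows; the driver object is passed in as a parameter here purely for illustration (in real code you would require('oracledb') directly), so treat this as a pattern, not the node-oracledb API.

```javascript
// Sketch: memoize the pool so createPool() runs at most once, instead of
// on every request as in the question's send2db().
let poolPromise = null;

function getPool(oracledb, dbConfig) {
  if (!poolPromise) {
    // First caller creates the pool; everyone else reuses the same promise.
    poolPromise = oracledb.createPool({
      user: dbConfig.user,
      password: dbConfig.password,
      connectString: dbConfig.connectString,
    });
  }
  return poolPromise;
}
```

Request handlers can then `await getPool(...)` freely without re-initializing the connection pool.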

Error: Jest: Got error running globalSetup - /home/pptruser/app/node_modules/jest-environment-puppeteer/setup.js

When running locally, the tests pass. The error below is thrown when run from Docker locally. I am in the process of setting up my code for my Puppeteer tests.
I have also included my package.json, jest-puppeteer.config, and jest.config files below. I haven't included my test files.
Could someone please help? Thanks.
Error: Jest: Got error running globalSetup - /home/pptruser/app/node_modules/jest-environment-puppeteer/setup.js, reason: Could not find expected browser (chrome) locally. Run `npm install` to download the correct Chromium revision (1022525).
at ChromeLauncher.launch (/home/pptruser/app/node_modules/puppeteer/lib/cjs/puppeteer/node/ChromeLauncher.js:70:23)
at async Promise.all (index 0)
at async setup (/home/pptruser/app/node_modules/jest-environment-puppeteer/lib/global.js:37:16)
at async /home/pptruser/app/node_modules/@jest/core/build/runGlobalHook.js:125:13
at async waitForPromiseWithCleanup (/home/pptruser/app/node_modules/@jest/transform/build/ScriptTransformer.js:209:5)
at async runGlobalHook (/home/pptruser/app/node_modules/@jest/core/build/runGlobalHook.js:116:9)
at async runJest (/home/pptruser/app/node_modules/@jest/core/build/runJest.js:369:5)
at async _run10000 (/home/pptruser/app/node_modules/@jest/core/build/cli/index.js:320:7)
at async runCLI (/home/pptruser/app/node_modules/@jest/core/build/cli/index.js:173:3)
at async Object.run (/home/pptruser/app/node_modules/jest-cli/build/cli/index.js:155:37)
jest.config.js:
module.exports = {
  preset: "jest-puppeteer",
  notifyMode: "always",
  maxConcurrency: 150,
  maxWorkers: 1,
  bail: 1,
  collectCoverage: true,
  testRunner: "jest-jasmine2",
  timers: "fake",
  testTimeout: 9000000,
  watchman: false,
};
package.json:
"devDependencies": {
  "@babel/preset-env": "^7.18.2",
  "babel-jest": "^27.0.6",
  "dotenv": "^16.0.1",
  "jest": "^27.5.1",
  "jest-cli": "^27.5.1",
  "jest-jasmine2": "^27.2.3",
  "jest-puppeteer": "^6.1.0",
  "prettier": "2.5.1",
  "puppeteer": "^14.1.2"
}
jest-puppeteer.config.js:
module.exports = {
  launch: {
    headless: true,
    slowMo: 0,
    defaultViewport: null,
    args: [
      "--window-size=1920,1080",
      "--incognito",
      "--start-maximized",
      "--disable-extensions",
      "--no-sandbox",
      "--disable-setuid-sandbox",
      "--no-first-run",
      "--no-zygote"
    ],
    setDefaultNavigationTimeout: 8000000,
    setDefaultTimeout: 8000000,
  },
  browserContext: "default",
};
My Dockerfile:
FROM docker-remote.artifactory.oci.oraclecorp.com/oraclelinux:7-slim
COPY --from=odo-docker-signed-local.artifactory.oci.oraclecorp.com/odo/base-image-support:ol7x-1.6 / /
RUN yum-config-manager --add-repo https://artifactory.oci.oraclecorp.com/io-ol7-nodejs16-yum-local/ \
    --add-repo https://artifactory.oci.oraclecorp.com/io-ol7-oracle-instant-client-yum-local/
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true \
    PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser \
    ORACLE_NPM=https://artifactory.oci.oraclecorp.com/api/npm/npm-remote/
RUN yum -y update \
    && yum -y install nodejs-16.14.2-1.0.1.el7.x86_64 chromium-102.0.5005.115-1.el7.x86_64
RUN groupadd -r pptruser \
    && useradd -r -g pptruser -G audio,video pptruser \
    && mkdir -p /home/pptruser/Downloads \
    && mkdir -p /home/pptruser/app \
    && chown -R pptruser:pptruser /home/pptruser
WORKDIR /home/pptruser/app
COPY --chown=pptruser:pptruser . .
CMD ["npm", "run-script", "smoke"]
The way we solved this was to download Chromium from https://download-chromium.appspot.com/ and unzip it into the specific folder Puppeteer expects: .\node_modules\puppeteer\.local-chromium\win64-722234
Some other answers suggest specifying the path to your Chromium like so:
const launchOptions = {
  // other options (headless, args, etc.)
  executablePath: '/home/jack/repos/my-repo/node_modules/puppeteer/.local-chromium/linux-901912/chrome-linux/chrome'
}
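Hardcoding a revision-specific .local-chromium path is brittle. A small hypothetical helper (not a Puppeteer API) sketches the usual compromise: prefer an explicit environment override such as the PUPPETEER_EXECUTABLE_PATH variable set in the Docker image above, and fall back to a known path otherwise.

```javascript
// Hypothetical helper: resolve the browser binary from an env override
// first, then a fallback path. Both values here are illustrative.
function resolveExecutablePath(env, fallbackPath) {
  return env.PUPPETEER_EXECUTABLE_PATH || fallbackPath;
}

const launchOptions = {
  // other options (headless, args, etc.)
  executablePath: resolveExecutablePath(process.env, "/usr/bin/chromium-browser"),
};
```

This way the same script runs unchanged on a developer machine (env var unset) and inside the container (env var set by the Dockerfile).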

AWS lambda function throwing error "newPromise is not defined"

I am using an AWS Lambda function with the code below:
'use strict';
var newPromise = require('es6-promise').Promise;
const childProcess = require("child_process");
const path = require("path");

const backupDatabase = () => {
  const scriptFilePath = path.resolve(__dirname, "./backup.sh");
  return newPromise((resolve, reject) => {
    childProcess.execFile(scriptFilePath, (error) => {
      if (error) {
        console.error(error);
        resolve(false);
      }
      resolve(true);
    });
  });
};

module.exports.handler = async (event) => {
  const isBackupSuccessful = await backupDatabase();
  if (isBackupSuccessful) {
    return {
      status: "success",
      message: "Database backup completed successfully!"
    };
  }
  return {
    status: "failed",
    message: "Failed to backup the database! Check out the logs for more details"
  };
};
The code above runs within the Docker container and tries to run the backup script below:
#!/bin/bash
#
# Author: Bruno Coimbra <bbcoimbra@gmail.com>
#
# Backups database located in DB_HOST, DB_PORT, DB_NAME
# and can be accessed using DB_USER. Password should be
# located in $HOME/.pgpass and this file should be
# chmod 0600[1].
#
# Target bucket should be set in BACKUP_BUCKET variable.
#
# AWS credentials should be available as needed by aws-cli[2].
#
# Dependencies:
#
# * pg_dump executable (can be found in postgresql-client-<version> package)
# * aws-cli (with python environment configured execute 'pip install awscli')
#
#
# References
# [1] - http://www.postgresql.org/docs/9.3/static/libpq-pgpass.html
# [2] - http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html
#
#
###############
### Variables
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
DB_HOST=
DB_PORT="5432"
DB_USER="postgres"
BACKUP_BUCKET=
###############
#
# **RISK ZONE** DON'T TOUCH below this line unless you know
# exactly what you are doing.
#
###############
set -e
export PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
### Variables
S3_BACKUP_BUCKET=${BACKUP_BUCKET:-test-db-backup-bucket}
TEMPFILE_PREFIX="db-$DB_NAME-backup"
TEMPFILE="$(mktemp -t $TEMPFILE_PREFIX-XXXXXXXX)"
DATE="$(date +%Y-%m-%d)"
TIMESTAMP="$(date +%s)"
BACKUPFILE="backup-$DB_NAME-$TIMESTAMP.sql.gz"
LOGTAG="DB $DB_NAME Backup"
### Validations
if [[ ! -r "$HOME/.pgpass" ]]; then
    logger -t "$LOGTAG" "$0: Can't find database credentials. $HOME/.pgpass file isn't readable. Aborted."
    exit 1
fi
if ! which pg_dump > /dev/null; then
    logger -t "$LOGTAG" "$0: Can't find 'pg_dump' executable. Aborted."
    exit 1
fi
if ! which aws > /dev/null; then
    logger -t "$LOGTAG" "$0: Can't find 'aws cli' executable. Aborted."
    exit 1
fi
logger -t "$LOGTAG" "$0: remove any previous dirty backup file"
rm -f /tmp/$TEMPFILE_PREFIX*
### Generate dump and compress it
logger -t "$LOGTAG" "Dumping Database..."
pg_dump -O -x -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -w "$DB_NAME" > "$TEMPFILE"
logger -t "$LOGTAG" "Dumped."
logger -t "$LOGTAG" "Compressing file..."
nice gzip -9 "$TEMPFILE"
logger -t "$LOGTAG" "Compressed."
mv "$TEMPFILE.gz" "$BACKUPFILE"
### Upload it to S3 Bucket and cleanup
logger -t "$LOGTAG" "Uploading '$BACKUPFILE' to S3..."
aws s3 cp "$BACKUPFILE" "s3://$S3_BACKUP_BUCKET/$DATE/$BACKUPFILE"
logger -t "$LOGTAG" "Uploaded."
logger -t "$LOGTAG" "Clean-up..."
rm -f $TEMPFILE
rm -f $BACKUPFILE
rm -f /tmp/$TEMPFILE_PREFIX*
logger -t "$LOGTAG" "Finished."
if [ $? -eq 0 ]; then
    echo "script passed"
    exit 0
else
    echo "script failed"
    exit 1
fi
I created a Docker image with the above app.js content and backup.sh using the Dockerfile below:
ARG FUNCTION_DIR="/function"
FROM node:14-buster
RUN apt-get update && \
    apt install -y \
    g++ \
    make \
    cmake \
    autoconf \
    libtool \
    wget \
    openssh-client \
    gnupg2
RUN wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - && \
    echo "deb http://apt.postgresql.org/pub/repos/apt/ buster-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list && \
    apt-get update && apt-get -y install postgresql-client-12
ARG FUNCTION_DIR
RUN mkdir -p ${FUNCTION_DIR} && chmod -R 755 ${FUNCTION_DIR}
WORKDIR ${FUNCTION_DIR}
COPY package.json .
RUN npm install
COPY backup.sh .
RUN chmod +x backup.sh
COPY app.js .
ENTRYPOINT ["/usr/local/bin/npx", "aws-lambda-ric"]
CMD ["app.handler"]
I run the Docker container created from the image built with the above Dockerfile:
docker run -v ~/aws:/aws -it --rm -p 9000:8080 --entrypoint /aws/aws-lambda-rie backup-db:v1 /usr/local/bin/npx aws-lambda-ric app.handler
And try to hit that container with the curl command below:
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{}'
When I run the curl command I see the below error:
"newPromise is not defined","trace":["ReferenceError: newPromise is not defined"," at backupDatabase (/function/app.js:9:3)","
I tried adding the variable var newPromise = require('es6-promise').Promise;, but that gave a new error: "Cannot set property 'scqfkjngu7o' of undefined","trace"
Could someone help me fix the error? My expected output is the message described in the handler, but I am seeing the errors above.
Thank you
Node 14 supports promises natively. You should do:
return new Promise((resolve, reject) => {
  childProcess.execFile(scriptFilePath, (error) => {
    if (error) {
      console.error(error);
      return resolve(false);
    }
    resolve(true);
  });
});
Note the space between new and Promise: Promise is the built-in object and you are calling its constructor, so there is no need to import any module.

net::ERR_ADDRESS_UNREACHABLE at {URL}

I am using puppeteer v1.19.0 in Node.js and get an unreachable-address error after building and running in Docker.
This is the JS file:
await puppeteer.launch({
  executablePath: '/usr/bin/chromium-browser',
  args: ['--no-sandbox', '--disable-setuid-sandbox', '--headless'],
}).then(async (browser) => {
  const url = `${thisUrl}analisa-jabatan/pdf/${_id}`
  const page = await browser.newPage()
  await page.goto(url, { waitUntil: 'networkidle0' })
  // await page.evaluate(() => { window.scrollBy(0, window.innerHeight) })
  await page.setViewport({
    width: 1123,
    height: 794,
  })
  setTimeout(async () => {
    const buffer = await page.pdf({
      path: `uploads/analisa-jabatan.pdf`,
      displayHeaderFooter: true,
      headerTemplate: '',
      footerTemplate: '',
      printBackground: true,
      format: 'A4',
      landscape: true,
      margin: {
        top: 20,
        bottom: 20,
        left: 20,
        right: 20,
      },
    })
    let base64data = buffer.toString('base64')
    await res.status(200).send(base64data)
    // await res.download(process.cwd() + '/uploads/analisa-jabatan.pdf')
    await browser.close()
  }, 2000)
})
}
And this is the Dockerfile:
FROM aria/alpine-nodejs:3.10
#FROM node:12-alpine
LABEL maintainer="Aria <aryamuktadir22@gmail.com>"
# ENVIRONMENT VARIABLES
# NODE_ENV
ENV NODE_ENV=production
# SERVER Configuration
ENV HOST=0.0.0.0
ENV PORT=3001
ENV SESSION_SECRET=thisissecret
# CORS Configuration
ENV CORS_ORIGIN=http://117.54.250.109:8081
ENV CORS_METHOD=GET,POST,PUT,DELETE,PATCH,OPTIONS,HEAD
ENV CORS_ALLOWED_HEADERS=Authorization,Content-Type,Access-Control-Request-Method,X-Requested-With
ENV CORS_MAX_AGE=600
ENV CORS_CREDENTIALS=false
# DATABASE Configuration
ENV DB_HOST=anjabdb
ENV DB_PORT=27017
ENV DB_NAME=anjab
# Tell Puppeteer to skip installing Chrome. We'll be using the installed package.
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true \
PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium-browser
# SET WORKDIR
WORKDIR /usr/local/app
# INSTALL REQUIRED DEPENDENCIES
RUN apk update && apk upgrade && \
    apk add --update --no-cache \
    gcc g++ make autoconf automake pngquant \
    python2 \
    chromium \
    udev \
    nss \
    freetype \
    freetype-dev \
    harfbuzz \
    ca-certificates \
    ttf-freefont ca-certificates \
    nodejs \
    yarn \
    libpng libpng-dev lcms2 lcms2-dev
# COPY SOURCE TO CONTAINER
ADD deploy/etc/ /etc
ADD package.json app.js server.js process.yml ./
ADD lib ./lib
ADD middlewares ./middlewares
ADD models ./models
ADD modules ./modules
ADD uploads ./uploads
ADD assets ./assets
ADD views ./views
COPY keycloak.js.prod ./keycloak.js
# INSTALL NODE DEPENDENCIES
RUN npm cache clean --force
RUN npm config set unsafe-perm true
RUN npm -g install pm2 phantomjs html-pdf
RUN yarn && yarn install --production=true && sleep 3 && \
    yarn cache clean
RUN set -ex \
    && apk add --no-cache --virtual .build-deps ca-certificates openssl \
    && wget -qO- "https://github.com/dustinblackman/phantomized/releases/download/2.1.1/dockerized-phantomjs.tar.gz" | tar xz -C / \
    && npm install -g phantomjs \
    && apk del .build-deps
EXPOSE 3001
And this is the result:
Error: net::ERR_ADDRESS_UNREACHABLE at http://117.54.250.109:8089/analisa-jabatan/pdf/5ee9e6a15ff81d00c7c3a614
at navigate (/usr/local/app/node_modules/puppeteer/lib/FrameManager.js:120:37)
at process._tickCallback (internal/process/next_tick.js:68:7)
-- ASYNC --
at Frame.<anonymous> (/usr/local/app/node_modules/puppeteer/lib/helper.js:111:15)
at Page.goto (/usr/local/app/node_modules/puppeteer/lib/Page.js:674:49)
at Page.<anonymous> (/usr/local/app/node_modules/puppeteer/lib/helper.js:112:23)
at puppeteer.launch.then (/usr/local/app/modules/analisajabatan/methods/pdfpuppeteer.js:60:20)
at process._tickCallback (internal/process/next_tick.js:68:7)
From the puppeteer docs:
page.goto will not throw an error when any valid HTTP status code is returned by the remote server, including 404 "Not Found" and 500 "Internal Server Error"
so provided the URL is valid, the server doesn't seem to be sending a response.
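If the address is only transiently unreachable (for example, the target service in another container is still starting up), a small retry wrapper around page.goto can help. This is a generic sketch, not a Puppeteer API; `page` is anything exposing a goto(url, options) method, and retries/delayMs are illustrative defaults:

```javascript
// Retry page.goto a few times before giving up, pausing between attempts.
async function gotoWithRetry(page, url, { retries = 3, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await page.goto(url, { waitUntil: "networkidle0" });
    } catch (err) {
      lastError = err;
      // brief pause before the next attempt
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError; // all attempts failed; surface the last navigation error
}
```

If every attempt still throws net::ERR_ADDRESS_UNREACHABLE, the problem is the container's network route to the host, not timing.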

puppeteer - how to set download location

I was able to successfully download a file with Puppeteer, but it was just saving to my /Downloads folder. I've been looking around and can't find anything in the API or forums to set this location.
My downloads basically just happen by going to the link:
await page.goto(url);
Update for newer Puppeteer versions (~June 2022):
As mentioned by @Daniel here, you have to create the CDP session yourself:
const client = await page.target().createCDPSession()
await client.send('Page.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath: './myAwesomeDownloadFolder',
})
Original Answer
This is how you can set the download path in latest puppeteer v0.13.
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './myAwesomeDownloadFolder'});
This behaviour is experimental and might be removed or changed later.
Pst, you can try more tricks listed here, on your own risk :).
In newer versions of Puppeteer (I'm using v14.1), the Accepted Answer no longer works:
await page._client.send('Page.setDownloadBehavior', {behavior: 'allow', downloadPath: './myAwesomeDownloadFolder'});
> TypeError: page._client.send is not a function
Instead, I had to explicitly create a new CDPSession:
const client = await page.target().createCDPSession()
await client.send('Page.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath: './myAwesomeDownloadFolder',
})
I realize this is an old thread, but it popped up first for me when looking for how to set Puppeteer's default download location.
I was able to set the download location using the following code:
let customChrome = path.resolve(__dirname, './customChrome');
let prefs = fs.readFileSync(customChrome + '/Default/Preferences');
let obj = JSON.parse(prefs);
// guard against these keys missing from a fresh Preferences file
obj.savefile = obj.savefile || {};
obj.download = obj.download || {};
obj.savefile.default_directory = path.resolve(__dirname, './downloads');
obj.download.default_directory = path.resolve(__dirname, './downloads');
fs.writeFileSync(customChrome + '/Default/Preferences', JSON.stringify(obj));
const browser = await puppeteer.launch({
  userDataDir: customChrome,
  headless: false,
  args: ['--disable-features=site-per-process', '--no-sandbox']
});
This sets the default download directory for files before the process starts. Essentially, Puppeteer creates a custom profile each time it runs; we can override that profile and define the download directory.
The first time you run the code above, you will have to comment out the lines from fs.readFileSync through fs.writeFileSync, because the user data directory is only created the first time Chrome starts.
All profile-related data is then stored in the customChrome/Default folder.
None of the given solutions were working for me with the newer puppeteer 15.5.0.
Using puppeteer-extra with the puppeteer-extra-plugin-user-preferences plugin did the trick.
// make sure puppeteer-extra & puppeteer-extra-plugin-user-preferences are installed
const UserPreferencesPlugin = require("puppeteer-extra-plugin-user-preferences");
const downloadImageDirectoryPath = process.cwd()

puppeteer.use(
  UserPreferencesPlugin({
    userPrefs: {
      download: {
        prompt_for_download: false,
        open_pdf_in_system_reader: true,
        default_directory: downloadImageDirectoryPath,
      },
      plugins: {
        always_open_pdf_externally: true,
      },
    },
  })
);
The answer from Muhammad Uzair solved my similar issue of setting the Chromium user preference to enforce PDF file downloads, but I ran into an issue of setting things up since I am using Puppeteer, Jest, and Jest-Puppeteer, where Jest-Puppeteer handles the initial setup behind the scenes.
This Github post from Macil helped with how to apply the puppeteer-extra-plugin-user-preferences plugin within the jest-puppeteer.config.js file.
For example, this is my jest-puppeteer.config.js file:
const puppeteer = require('puppeteer-extra');
const UserPreferencesPlugin = require('puppeteer-extra-plugin-user-preferences');

const userPreferenceOptions = {
  userPrefs: {
    plugins: {
      always_open_pdf_externally: true,
    },
    download: {
      open_pdf_in_system_reader: false,
      prompt_for_download: false,
    },
  }
};

puppeteer.use(UserPreferencesPlugin(userPreferenceOptions));
require.cache[require.resolve('puppeteer')] = require.cache[require.resolve('puppeteer-extra')];

module.exports = {
  launch: {
    // https://github.com/puppeteer/puppeteer/blob/v13.3.2/docs/api.md#puppeteerlaunchoptions
    headless: true, // opens a browser instance
    slowMo: 25, // millis to slow each step
    devtools: false, // auto opens the devtools in the browser
    defaultViewport: {
      width: 1820,
      height: 980,
      deviceScaleFactor: 1,
      isMobile: false,
      isLandscape: true,
    },
    product: "chrome", // can also specify firefox
    browserContext: 'incognito',
    args: [
      // Chromium browser arguments: https://peter.sh/experiments/chromium-command-line-switches/
      '--ignore-certificate-errors',
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--window-size=1920,1080',
    ],
  },
};
Finally, I run the program in a Docker container with Chrome installed:
FROM node:18-slim
# ------------------------------------------
# install extension
# ------------------------------------------
RUN apt-get update -y && apt-get install -y \
    # chromium \
    # libnss3 lsb-release xdg-utils wget \
    wget gnupg \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst fonts-freefont-ttf libxss1 \
    --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*
# ------------------------------------------
# set global config
# ------------------------------------------
ENV TZ Asia/Hong_Kong
# ------------------------------------------
# change the work directory
# ------------------------------------------
COPY source /root/source
WORKDIR /root/source
# ------------------------------------------
# upgrade npm
# ------------------------------------------
RUN npm install npm -g
RUN npm upgrade
# install node packages
RUN if test -e package-lock.json ; then npm ci ; else npm i ; fi
RUN npm run build
ENTRYPOINT ["npm", "run"]
CMD ["start"]
and
use await page.target().createCDPSession() as the client for downloading the file
(
ref:
https://github.com/puppeteer/puppeteer/issues/1478#issuecomment-358826932
https://pptr.dev/api/puppeteer.cdpsession
)
const downloadPath = `[YOUR OWN DOWNLOAD PATH]`
const browser = await puppeteer.launch({
  headless: true,
  args: ['--no-sandbox', '--disable-setuid-sandbox'],
  executablePath: '/usr/bin/google-chrome-stable'
})
const page = await browser.newPage()
const client = await page.target().createCDPSession()
await client.send('Page.setDownloadBehavior', {
  behavior: 'allow',
  downloadPath
})
Update 30-07-2022:
An update has been made and this feature has been removed from previous versions of the package as well; if you downloaded the package before 10-06-2022 it should still work. That is the case for me, but it is puzzling why previous versions were changed too.
