puppeteer ignoring args pass for proxy pac url - node.js

I am trying to use a PAC file as an argument for Puppeteer's proxy settings, as defined here: https://www.chromium.org/developers/design-documents/network-settings
--proxy-pac-url=<pac-file-url>
Here is my code:
const puppeteer = require('puppeteer');
(async() => {
const proxypacUrl = 'http://{IPAddress}:{Port}';
const browser = await puppeteer.launch({
args: [`--proxy-pac=url=${proxypacUrl}`],
headless: false,
});
const page = await browser.newPage();
await page.goto('https://stackoverflow.com/');
await browser.close();
})();
However, when I execute the code, it runs and visits stackoverflow.com but completely ignores
--proxy-pac=url=${proxypacUrl}
I know this because I can monitor the proxy logs. The PAC file specifically says to use the proxy for all traffic.
Here is my PAC file:
function FindProxyForURL(url, host) {
return "PROXY IP:PORT; PROXY IP:PORT";
}
When I change --proxy-pac-url=<pac-file-url> to --proxy-server and specify the IP and port directly, the traffic goes through the proxy.
Can someone please tell me what I am doing wrong with the proxy PAC URL?

You have an error in your code:
--proxy-pac=url=${proxypacUrl} should be --proxy-pac-url=${proxypacUrl}.
Unfortunately, fixing it won't help, because proxy PAC files aren't supported in headless Chromium; there is an issue about this.
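To avoid mistyping the switch name, the arguments can be built by a small helper. This is only a sketch: the name `buildProxyArgs` and the example URL are assumptions for illustration, while `--proxy-pac-url` and `--proxy-server` are the Chromium switches discussed above. Whether the PAC switch takes effect still depends on the Chromium mode, as noted in the answer.

```javascript
// Hypothetical helper (not part of Puppeteer): builds Chromium proxy switches.
function buildProxyArgs({ pacUrl, server } = {}) {
  if (pacUrl) {
    // PAC file URL, e.g. http://host:port/proxy.pac
    return [`--proxy-pac-url=${pacUrl}`];
  }
  if (server) {
    // Fixed proxy, e.g. host:port
    return [`--proxy-server=${server}`];
  }
  return [];
}

// Usage sketch (assumes the puppeteer package is installed; the URL is a placeholder):
// const browser = await puppeteer.launch({
//   headless: false,
//   args: buildProxyArgs({ pacUrl: 'http://192.0.2.10:8080/proxy.pac' }),
// });
```

Keeping the switch names in one place means a typo such as `--proxy-pac=url` can only happen once and is easy to spot.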

Related

How to close connection with proxy-server in chrome extension

When I want to set the proxy in my extension, I use chrome.proxy.settings.set().
Then I register an auth listener:
// Define the callback before registering it; a const cannot be used before its declaration
const callbackFn = (details: any) => {
  const username = 'someUser';
  const password = 'somePass';
  return {authCredentials: {username, password}};
};
chrome.webRequest.onAuthRequired.addListener(callbackFn, {urls: ['<all_urls>']}, ['blocking']);
But after 5 minutes I want to use different user credentials. When I call proxy.settings.clear({}), the proxy is cleared and I am back on my default IP. After that I set the proxy again and register a new onAuthRequired listener, but Chrome has saved my first credentials somewhere, and I can't change them through onAuthRequired because Chrome already sends the first credentials in the headers to the proxy server.
How can I delete the credentials I set before from Chrome?
I think Chrome keeps the connection to the server open, because the proxy asks for credentials again only after Chrome is reopened.
How can I close the connection to the proxy server (through a Chrome API)?
So I have found a solution to the problem.
After the server's 407 response, onAuthRequired gets the credentials, and they are saved to a cookie.
Then the browser adds an authentication header with the credentials from the cookie to every request.
This code removes the credential cookies, so after a new proxy connection the server will ask for new credentials.
let options = {};
const rootDomain = 'your.proxy.host'; // for example domain.com
options.origins = [];
options.origins.push("http://"+ rootDomain);
options.origins.push("https://"+ rootDomain);
let types = {"cookies": true};
chrome.browsingData.remove(options, types, function(){
// some code for callback function
});
For example, if your extension has authentication, you can add this code to the logout function, as I did in my application.
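As a sketch of how that could be wrapped for a logout function: the helper name `clearProxyCookies` is hypothetical, and the `browsingData` object is passed in as a parameter (in an extension it would be `chrome.browsingData`, which requires the `browsingData` permission in the manifest).

```javascript
// Hypothetical helper: removes cookies for the proxy host's http and https origins.
// `browsingData` is expected to behave like chrome.browsingData.
function clearProxyCookies(browsingData, rootDomain) {
  const options = {
    origins: [`http://${rootDomain}`, `https://${rootDomain}`],
  };
  const types = { cookies: true };
  // Wrap the callback-style API in a Promise so callers can await it
  return new Promise((resolve) => {
    browsingData.remove(options, types, () => resolve());
  });
}

// Usage sketch inside an extension (the host name is a placeholder):
// await clearProxyCookies(chrome.browsingData, 'your.proxy.host');
// ...then set the new proxy and register the new onAuthRequired listener.
```

Passing the API object in as a parameter keeps the helper easy to exercise outside the browser with a stub.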

How to print the raw devtools request sent by Puppeteer?

I see that Puppeteer uses the DevTools protocol. I want to see which requests Puppeteer sends.
https://github.com/puppeteer/puppeteer
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
await page.screenshot({ path: 'example.png' });
await browser.close();
})();
How can I modify the above simple program to print the devtools requests sent by Puppeteer?
Edit
As the code is in Nodejs, I added the tag nodejs because the solution may be in Nodejs instead of Puppeteer.
Edit
Fiddler is mentioned as relevant. So I add this tag as well.
You could use chrome-protocol-proxy; it captures all the CDP messages. There are a few extra steps involved here:
Run Google Chrome in debug mode
Start chrome-protocol-proxy
Start Puppeteer using puppeteer.connect()
Run the following commands; you may have to change them accordingly:
google-chrome-stable --remote-debugging-port=9222 --headless # run chrome
chrome-protocol-proxy # to display CDP messages
Remove this line from your code
const browser = await puppeteer.launch();
Add this line
const browser = await puppeteer.connect({ browserURL: 'http://localhost:9223' });
Instead of browserURL you can pass browserWSEndpoint, which you can get by running cURL against localhost:9223/json/version
If you want to go into more detail on CDP and Puppeteer, you might want to look at Getting Started with CDP
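The endpoint lookup mentioned above can also be sketched in Node.js instead of cURL. The function names here are hypothetical and port 9223 is taken from the answer; `webSocketDebuggerUrl` is the field that Chrome's `/json/version` endpoint returns.

```javascript
const http = require('http');

// Extracts the WebSocket endpoint from the body of a /json/version response.
function extractWsEndpoint(body) {
  const info = JSON.parse(body);
  if (!info.webSocketDebuggerUrl) {
    throw new Error('webSocketDebuggerUrl not found in response');
  }
  return info.webSocketDebuggerUrl;
}

// Fetches /json/version from a debugging endpoint such as http://localhost:9223.
function fetchWsEndpoint(baseUrl) {
  return new Promise((resolve, reject) => {
    http.get(`${baseUrl}/json/version`, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        try {
          resolve(extractWsEndpoint(body));
        } catch (err) {
          reject(err);
        }
      });
    }).on('error', reject);
  });
}

// Usage sketch (assumes puppeteer is installed and the proxy is running):
// const browserWSEndpoint = await fetchWsEndpoint('http://localhost:9223');
// const browser = await puppeteer.connect({ browserWSEndpoint });
```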

Connect Testcafe to AWS Devicefarm

We recently decided to add E2E tests to our front-end pipeline, and we are using TestCafe for that. Since we use AWS as our SaaS, we're being asked to use Device Farm for remote testing, and I'm having trouble connecting the two. I based my code on the Selenium implementations and the TestCafe documentation, but it doesn't seem able to establish a connection between them. Does anyone have any idea why?
My code:
const AWS = require('aws-sdk');
const createTestCafe = require('testcafe');
const PROJECT_ARN = "arn:aws:devicefarm:us-west-2:XXXXXXXX:testgrid-project:XXXXXX-XXXX-XXXX-XXXX-XXXXXXXX";
const devicefarm = new AWS.DeviceFarm({ region: "us-west-2", credentials: AWS.config.credentials });
(async () => {
const testGridUrlResult = await devicefarm.createTestGridUrl({
projectArn: PROJECT_ARN,
expiresInSeconds: 6000
}).promise();
const url = new URL(testGridUrlResult.url || '');
const testCafe = await createTestCafe(url.host, 443);
const runner = testCafe.createRunner();
const remoteConnection = await testCafe.createBrowserConnection();
remoteConnection.once('ready', async () => {
await runner
.src(['./load-mfc.ts'])
.browsers('devicefarm:firefox')
.run();
await testCafe.close();
});
})().catch((e) => console.error(e));
According to the AWS Device Farm documentation about the CreateTestGridUrl method:
Creates a signed, short-term URL that can be passed to a Selenium RemoteWebDriver constructor.
This URL can't be passed to the hostname property of the createTestCafe method because TestCafe doesn't implement the WebDriver protocol. TestCafe works differently than Selenium. Once you have created a remote browser connection using TestCafe, you should navigate your remote browser to the browserConnection.url URL to connect to a TestCafe server instance and start test execution in this browser.
As the AWS Device Farm service uses the WebDriver protocol to operate remote browsers, you would need to write a custom Browser Provider Plugin for TestCafe. It may be helpful to read more in the TestCafe documentation about this topic and see how a similar thing is implemented in the BrowserStack provider.

How to determine http vs https in nodejs / nextjs api handler

In order to properly build my urls in my xml sitemaps and rss feeds I want to determine if the webpage is currently served over http or https, so it also works locally in development.
export default function handler(req, res) {
const host = req.headers.host;
const proto = req.connection.encrypted ? "https" : "http";
//construct url for xml sitemaps
}
With the above code, however, it still shows as being served over http even on Vercel. I would expect https there. Is there a better way to figure out http vs https?
As Next.js API routes run behind a proxy that offloads TLS and forwards the request over http, the protocol the handler sees is http.
By changing the code to the following, I was able to check which protocol the proxy itself is serving.
const proto = req.headers["x-forwarded-proto"];
However, this breaks in development, where you are not running behind a proxy, and with any other deployment method that does not involve a proxy. To support both use cases, I eventually ended up with the following code.
// Parentheses are required: || binds more tightly than the ?: operator
const proto =
  req.headers["x-forwarded-proto"] ||
  (req.connection.encrypted ? "https" : "http");
Whenever the x-forwarded-proto header is not present (undefined), we fall back to req.connection.encrypted to determine whether we are serving over http or https.
Now it works on localhost as well as on a Vercel deployment.
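A minimal sketch of that fallback as a reusable helper. The name `getProtocol` is hypothetical. It also assumes the header may carry a comma-separated list when several proxies are chained (in which case the first entry is used), and it prefers `req.socket`, of which `req.connection` is a deprecated alias in recent Node.js versions.

```javascript
// Hypothetical helper: returns the protocol ("http" or "https") for a request.
function getProtocol(req) {
  const forwarded = req.headers['x-forwarded-proto'];
  if (forwarded) {
    // Assumption: chained proxies may produce a list such as "https,http"
    const first = String(forwarded).split(',')[0].trim();
    if (first) return first;
  }
  // No proxy header: inspect the socket directly (local development)
  const socket = req.socket || req.connection;
  return socket && socket.encrypted ? 'https' : 'http';
}

// Usage sketch inside an API handler:
// const baseUrl = `${getProtocol(req)}://${req.headers.host}`;
```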
My solution:
import type { GetServerSideProps } from "next";
export const getServerSideProps: GetServerSideProps = async (context) => {
  // The Referer header is optional, so guard against it being absent
  const reqUrl = context.req.headers["referer"];
  const protocol = reqUrl ? new URL(reqUrl).protocol : null;
  console.log('====================================');
  console.log(protocol); // e.g. "http:"
  console.log('====================================');
  // Pass the protocol to the page via props
  return { props: { protocol } };
}

Tell Puppeteer to open Chrome tab instead of window

If I have an existing Google Chrome window open, I'd like to tell Puppeteer to open a new tab instead of opening a new window. Is there a way to do that? Is there some option or flag I can pass to Puppeteer to accomplish this?
I have:
const puppeteer = require('puppeteer');
(async function () {
const b = await puppeteer.launch({
devtools: true,
openInExistingWindow: true /// ? something like this?
});
const page = await b.newPage();
await page.goto('https://example.com');
})();
const browser = await puppeteer.launch();
const page = await browser.newPage();
This will open a new tab (Puppeteer calls them "pages") in your current browser instance. You can check out the Page class docs here and the Browser class docs here.
You'll need to start Chrome with:
/usr/bin/google-chrome-stable --remote-debugging-port=9220
to get the WebSocket endpoint for debugging, which can then be passed to Puppeteer:
await puppeteer.connect({browserWSEndpoint: chromeWebsocket})
