I'm trying to use Puppeteer for scraping, and I need to use my existing Chrome installation to keep all my credentials. However, the Chrome instance launched by Puppeteer doesn't remember the previous session, so I have to click the login button every time. By contrast, it does remember the saved credentials. Is there a way to make it keep the session?
I'm actually using:
Node v12.16.1
chrome 80.0.3987.132 (Official Build) (64-bit) (cohort: Stable)
puppeteer-core 2.1.0 // see: https://github.com/puppeteer/puppeteer/blob/v2.1.0/docs/api.md
test.js:
const pptr = require('puppeteer-core');
(async () => {
const browser = await pptr.launch({
executablePath: 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe',//path to your chrome
headless: false,
args:[
'--user-data-dir=D:/Users/xxx/AppData/Local/Google/Chrome/User Data2',
]
});
const page = await browser.newPage();
await page.goto('https://hostloc.com');
await page.screenshot({path: 'example.png'});
await page.waitFor(10000);
await browser.close();
})();
You should use cookies so that you can restore the data from the previous session. The Puppeteer documentation covers setting cookies with page.setCookie().
Below is an example of how to set cookies in Puppeteer. It sets the "login_email" cookie for PayPal so that the login screen is pre-filled with an email address.
const cookie = {
name: 'login_email',
value: 'set_by_cookie@domain.com',
domain: '.paypal.com',
url: 'https://www.paypal.com/',
path: '/',
httpOnly: true,
secure: true
}
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch()
const page = await browser.newPage()
await page.setCookie(cookie)
await page.goto('https://www.paypal.com/signin')
await page.screenshot({ path: 'paypal_login.png' })
await browser.close()
})()
To get the cookies, you can create a Chrome DevTools Protocol session on the page target using target.createCDPSession(). Then you can send Network.getAllCookies to obtain a list of all browser cookies.
The page.cookies() function will only return cookies for the current URL. So we can filter out the current page cookies from all of the browser cookies to obtain a list of third-party cookies only.
const client = await page.target().createCDPSession();
const all_browser_cookies = (await client.send('Network.getAllCookies')).cookies;
const current_url_cookies = await page.cookies();
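// Compare against every current-page cookie domain (also safe when the page has no cookies)
const current_domains = new Set(current_url_cookies.map(cookie => cookie.domain));
const third_party_cookies = all_browser_cookies.filter(cookie => !current_domains.has(cookie.domain));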
console.log(all_browser_cookies); // All Browser Cookies
console.log(current_url_cookies); // Current URL Cookies
console.log(third_party_cookies); // Third-Party Cookies
For example, to get all of the cookies:
const puppeteer = require('puppeteer');
(async() => {
const browser = await puppeteer.launch({});
const page = await browser.newPage();
await page.goto('https://stackoverflow.com', {waitUntil : 'networkidle2' });
// Here we can get all of the cookies
console.log(await page._client.send('Network.getAllCookies'));
})();
I hope this will help you.
I am trying to log in to my Instagram account with Puppeteer. Everything goes well, except that Instagram sends me a PIN code every time I log in, I guess because it detects that it's an automated browser...
I tried using my installed Chrome instead of Chromium, but the issue is still there (I'm on a Mac). Is there any way I can bypass this?
const puppeteer = require('puppeteer');
(async()=>{
const browser = await puppeteer.launch({
headless: false,
defaultViewport: null, // null disables the default viewport
executablePath : `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`
})
const page = await browser.newPage();
const url = `https://www.instagram.com`;
await page.goto(url, {"waitUntil": "domcontentloaded"});
const username = await page.waitForSelector('input[type="text"]');
await username.type('my_username');
const password = await page.waitForSelector('input[type="password"]');
await password.type('my_password');
await password.press('Enter');
})();
I am trying to use Puppeteer to sign in to TikTok. However, each time I try, TikTok says "You are visiting our site too frequently".
[Screenshot: TikTok after running the code]
Here are the things I've tried:
using puppeteer stealth
using firefox puppeteer
using both puppeteer stealth and firefox puppeteer
using a VPN
logging into the account on a different device, logging out on that device, and then running the code
waiting 4 hours between running the code
Puppeteer doesn't throw any errors either
Let me know what you guys think!
Here is the code too:
const puppeteer = require("puppeteer-extra");
const StealthPlugin = require("puppeteer-extra-plugin-stealth");
puppeteer.use(StealthPlugin());
(async () => {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto("https://www.tiktok.com/login/phone-or-email/email");
await page.type("input[name=email]", EMAIL, { delay: 20 });
await page.type("input[name=password]", PASSWORD, { delay: 20 }); // log in w email and password
await page.evaluate(() => {
document.querySelector("button[type=submit]").click();
}); // press login button
await page.screenshot({ path: "example.png" });
await browser.close();
})();
Log in manually one time and then do a cookie injection for each subsequent login :)
(of course, you'll need to save down all the cookies to do so, but only once!)
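A minimal sketch of that save-once, inject-later idea, assuming Puppeteer's page.cookies() and page.setCookie(). The file name cookies.json and the helper names (writeCookieFile, readCookieFile, saveCookies, restoreCookies) are hypothetical, and whether a site accepts a replayed session depends on the site.

```javascript
const fs = require('fs');

// Hypothetical location for the saved session; any writable path works.
const COOKIE_FILE = 'cookies.json';

// Write an array of cookie objects (as returned by page.cookies()) to disk.
function writeCookieFile(cookies, file = COOKIE_FILE) {
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2));
}

// Read the saved cookies back; returns [] if nothing has been saved yet.
function readCookieFile(file = COOKIE_FILE) {
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Call once, after logging in manually in the Puppeteer-controlled window.
async function saveCookies(page, file = COOKIE_FILE) {
  writeCookieFile(await page.cookies(), file);
}

// Call on later runs, before navigating to the site.
// Resolves to true if any cookies were injected.
async function restoreCookies(page, file = COOKIE_FILE) {
  const cookies = readCookieFile(file);
  if (cookies.length > 0) await page.setCookie(...cookies);
  return cookies.length > 0;
}
```

On the first run you would log in by hand and call saveCookies(page); on later runs, call restoreCookies(page) before page.goto() and skip the login form if it resolves to true.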
I'm attempting to use Playwright (https://github.com/microsoft/playwright) and I'm met by the location popup when I try to test the library. Is there a way to bypass this popup or at least click either "Block" or "Allow"? I've tried using the Page.on("popup") event but it isn't quite working the way I was expecting it to.
You have to use the grantPermissions function to grant geolocation for the site.
await context.grantPermissions(['geolocation'], { origin: 'https://www.bestbuy.com' });
This is how I grant the geolocation permission in my script:
const { chromium } = require("playwright");
(async () => {
// const browser = await chromium.launch({ headless: false});
const browser = await chromium.launch();
const context = await browser.newContext();
await context.grantPermissions(['geolocation'], { origin: 'https://yourPage.com' }); // origin needs a scheme
const page = await context.newPage();
await page.goto('https://yourPage.com'); // goto needs a full URL
await browser.close();
})();
Here is the documentation: playwright.dev
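As a possible alternative, Playwright's browser.newContext() accepts `permissions` and `geolocation` options, so the permission can be granted when the context is created. The sketch below assumes that approach; the helper names geolocationContextOptions and openWithLocation are hypothetical, and the coordinates you pass are up to you.

```javascript
// Build the options object for browser.newContext().
// Rejects coordinates outside the valid latitude/longitude ranges.
function geolocationContextOptions(latitude, longitude) {
  if (latitude < -90 || latitude > 90) {
    throw new RangeError('latitude must be between -90 and 90');
  }
  if (longitude < -180 || longitude > 180) {
    throw new RangeError('longitude must be between -180 and 180');
  }
  return {
    geolocation: { latitude, longitude },
    permissions: ['geolocation'], // granted up front, so no prompt is shown
  };
}

// Open a page with the geolocation permission already granted.
async function openWithLocation(url, latitude, longitude) {
  const { chromium } = require('playwright'); // loaded lazily, only when called
  const browser = await chromium.launch();
  const context = await browser.newContext(
    geolocationContextOptions(latitude, longitude)
  );
  const page = await context.newPage();
  await page.goto(url);
  return { browser, page };
}
```

Permissions passed this way apply to the whole context rather than a single origin, so grantPermissions with an `origin` remains the narrower choice.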
Here is a simple program on puppeteer:
const puppeteer = require('puppeteer');
async function run() {
const browser = await puppeteer.launch({
headless: false,
args:[ `--proxy-server=104.233.50.38:3199`]
});
const page = await browser.newPage();
await page.authenticate({
username: 'myusername',
password: 'mypassword'
})
await page.goto('https://google.com')
};
run();
Note: I have tried similar with over 10 proxies and none of them are working in puppeteer
The credentials are exactly what is provided to me, I have checked multiple times.
This is what I get:
And this is the console of the page:
Why is this happening?
I checked the addresses and username, password multiple times. There is no other error message except this.
page.authenticate doesn't seem to work for me either. Instead, you can use page.setExtraHTTPHeaders:
async function run() {
const browser = await puppeteer.launch({
ignoreHTTPSErrors: true,
args: ['--proxy-server=104.233.50.38:3199']
});
const page = await browser.newPage();
await page.setExtraHTTPHeaders({
'Proxy-Authorization': 'Basic ' + Buffer.from('username:password').toString('base64'),
});
};
run();
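The header value above is just "Basic " followed by the base64 encoding of username:password. A small sketch that builds it from separate values is shown below; the helper name proxyAuthHeader is hypothetical. Note that this header is attached to the page's own requests, so it may not help for HTTPS targets where the proxy expects credentials during the tunnel handshake.

```javascript
// Build the header object expected by page.setExtraHTTPHeaders().
function proxyAuthHeader(username, password) {
  // Basic auth: base64("username:password")
  const token = Buffer.from(`${username}:${password}`, 'utf8').toString('base64');
  return { 'Proxy-Authorization': `Basic ${token}` };
}

// Usage inside an async function, with an existing Puppeteer page:
//   await page.setExtraHTTPHeaders(proxyAuthHeader('myusername', 'mypassword'));
```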
You can use puppeteer-page-proxy, which makes username and password authentication easy. It supports http, https, socks4 and socks5 proxies. https://github.com/Cuadrix/puppeteer-page-proxy
You can define the proxy this way:
const proxy = 'http://login:pass@IP:Port';
or
const proxy = 'socks5://IP:Port';
Then you can use it per request:
const useProxy = require('puppeteer-page-proxy');
await page.setRequestInterception(true);
page.on('request', req => {
useProxy(req, proxy);
});
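If you would rather stay with plain Puppeteer, keep in mind that Chrome's --proxy-server flag takes only the scheme, host and port, not credentials. A rough sketch for splitting a proxy URL with embedded credentials into the two parts is shown below; the helper name splitProxyUrl is hypothetical, and it relies only on Node's built-in URL class.

```javascript
// Split a proxy URL into the part for --proxy-server and the
// credentials for page.authenticate().
function splitProxyUrl(proxyUrl) {
  const u = new URL(proxyUrl);
  return {
    // scheme://host:port, without any user info
    server: `${u.protocol}//${u.host}`,
    // null when the URL carries no credentials
    credentials: u.username
      ? {
          username: decodeURIComponent(u.username),
          password: decodeURIComponent(u.password),
        }
      : null,
  };
}

// Usage inside an async function:
//   const { server, credentials } = splitProxyUrl(proxy);
//   const browser = await puppeteer.launch({ args: [`--proxy-server=${server}`] });
//   const page = await browser.newPage();
//   if (credentials) await page.authenticate(credentials);
```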
I am trying to browse google.com with puppeteer using proxies but Chromium seems to block me.
Code example:
const puppeteer = require('puppeteer');
(async() => {
const proxyUrl = 'http://gate.smartproxy.com:7000';
const username = 'xxxxx';
const password = 'xxxxx';
const browser = await puppeteer.launch({
args: [`--proxy-server=${proxyUrl}`],
headless: false,
});
const page = await browser.newPage();
await page.authenticate({ username, password });
await page.goto('https://google.com/');
const html = await page.$eval('body', e => e.innerHTML);
console.log(html);
await browser.close();
})();
Error:
(node:6673) UnhandledPromiseRejectionWarning: Error: net::ERR_TUNNEL_CONNECTION_FAILED at https://google.com/...
I already checked on the proxy side and they are working.
If it's not possible with puppeteer (since they are using Chromium), do you have any alternative ideas on how to browse Google with proxies?
Thanks,
Try replacing https with http, and check with the proxy service to see what documentation or advice they can offer. Alternatively, find out what kind of proxy it is and how it normally behaves, and give us more info.
Try using pluginProxy:
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
const pluginProxy = require('puppeteer-extra-plugin-proxy');
(async() => {
puppeteer.use(StealthPlugin()); // Recommended
puppeteer.use(pluginProxy({
address: <proxy-host> ,
port: <proxy-port> ,
credentials: {
username: <proxy-user> ,
password: <proxy-pwd> ,
}
}));
let browser = await puppeteer.launch({
headless: false,
ignoreHTTPSErrors: true // Some proxies need it
});
let page = await browser.newPage();
await page.goto('https://google.com/');
const html = await page.$eval('body', e => e.innerHTML);
console.log(html);
await browser.close();
})();