LinkedIn Like bot - node.js

I have a bot that logs in as different users and likes the latest post by a company. Three weeks ago it stopped working. I didn't notice, because everything seemed fine on my end when I ran it, but if you go to the LinkedIn pages, the articles have not been liked. It was working perfectly up until that point.
I'm convinced it has something to do with LinkedIn changing something.
Here is the main code:
const config = require('./config')
const logger = require('./logger')
const { all } = require('./options')
const { sanitize } = require('./util')

module.exports = async (page, company) => {
  logger.info(`Go to ${company} page...`)
  await page.goto(`${config.get('url')}/company/${company}/`)
  logger.info('Waiting for new articles...')
  const feed = await page.waitForSelector('#organization-feed')
  await feed.hover()
  let article
  while (
    (article = await page.waitForSelector('#organization-feed .feed-shared-update-v2').catch(() => null))
  ) {
    await article.hover()
    const button = await article.$('.feed-shared-social-action-bar [aria-label="Like"]')
    if (button === null) {
      await page.evaluate(node => node.remove(), article)
      await page.waitFor(config.get('sleep'))
      await page.evaluate(() => window.scrollBy({ top: -100 }))
      await page.waitFor(100)
      await page.evaluate(() => window.scrollBy({ top: 1000 }))
      continue
    }
    await button.hover()
    const liked = await page.evaluate(node => node.getAttribute('aria-pressed') === 'true', button)
    const text = await page.evaluate(node => node.querySelector('.feed-shared-text').innerText, article)
    if (!liked) {
      logger.info(`Like → ${sanitize(text)}...`)
      await button.click({ delay: 20 })
    } else if (!all) {
      break
    }
    await page.evaluate(node => node.remove(), article)
    await page.waitFor(config.get('sleep'))
    await page.evaluate(() => window.scrollBy({ top: -115 }))
    await page.waitFor(100)
    await page.evaluate(() => window.scrollBy({ top: 1000 }))
  }
}
When I watch what's happening, the bot opens the browser, logs in as the user, goes to the company page and starts scrolling through the articles. It used to click the like button at this point, but it seems it can't find the like button anymore.
Thanks in advance!

LinkedIn changed their aria labels from [aria-label="Like"] to [aria-label="Like company-name's post"]. I just had to update my code to match.
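If you'd rather not hard-code each company name, a prefix attribute selector should still match the renamed labels (a sketch; the exact label text depends on LinkedIn's current markup):
// [aria-label^="Like"] matches any aria-label that starts with "Like",
// e.g. "Like Acme Corp's post", so no company name is hard-coded
const button = await article.$('.feed-shared-social-action-bar [aria-label^="Like"]')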

Related

Can't click link using puppeteer - Thingiverse

I'm trying to automate away downloading multiple files on Thingiverse. I choose an object at random, but I'm having a hard time locating the link I need, clicking it, and then downloading. Has someone run into this before? Can I get some help?
I've tried several other variations.
import puppeteer from 'puppeteer';

async function main() {
  const browser = await puppeteer.launch({
    headless: true,
  });
  const page = await browser.newPage();
  const response = await page.goto('https://www.thingiverse.com/thing:2033856/files');
  const buttons = await page.$x(`//a[contains(text(), 'Download')]`);
  if (buttons.length > 0) {
    console.log(buttons.length);
  } else {
    console.log('no buttons');
  }
  await wait(5000);
  await browser.close();
  return 'Finish';
}

async function wait(time: number) {
  return new Promise(function (resolve) {
    setTimeout(resolve, time);
  });
}

function start() {
  main()
    .then((test) => console.log('DONE'))
    .catch((reason) => console.log('Error: ', reason));
}

start();
I was able to get it to work.
The selector is: a[class^="ThingFile__download"]
Puppeteer is: const puppeteer = require('puppeteer-extra');
Before the await page.goto() I always recommend setting the viewport:
await page.setViewport({width: 1920, height: 720});
After that is set, change the await page.goto() to have a waitUntil option:
const response = await page.goto('https://www.thingiverse.com/thing:2033856/files', { waitUntil: 'networkidle0' }); // wait until page load
Next, and this is a very important part: you have to wait for the element with waitForSelector() or waitForFunction().
I added both of these lines of code after the const response:
await page.waitForSelector('a[class^="ThingFile__download"]', {visible: true})
await page.waitForFunction("document.querySelector('a[class^=\"ThingFile__download\"]') && document.querySelector('a[class^=\"ThingFile__download\"]').clientHeight != 0");
Next, get the buttons. For my testing I just grabbed the button href.
const buttons = await page.$eval('a[class^="ThingFile__download"]', anchor => anchor.getAttribute('href'));
Lastly, do not check the .length of this variable: here we are just returning the href value, which is a string. You get an ElementHandle (or null) when you grab just the button with page.$:
const button = await page.$('a[class^="ThingFile__download"]');
console.log(button)
if (button) { ... }
Now if you change that page.$ to page.$$, it resolves to an Array<ElementHandle>, and you will be able to use .length there:
const buttonsAll = await page.$$('a[class^="ThingFile__download"]');
console.log(buttonsAll)
if (buttonsAll.length > 0) { ... }
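Putting those pieces together, here is a minimal end-to-end sketch (assuming puppeteer-extra is installed; the viewport, waitUntil option and selector are the ones from the steps above):
const puppeteer = require('puppeteer-extra');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.setViewport({ width: 1920, height: 720 });
  await page.goto('https://www.thingiverse.com/thing:2033856/files', { waitUntil: 'networkidle0' });
  // Wait until the download links exist and are actually rendered
  await page.waitForSelector('a[class^="ThingFile__download"]', { visible: true });
  // Grab the href of every download link on the page
  const hrefs = await page.$$eval('a[class^="ThingFile__download"]', anchors =>
    anchors.map(a => a.getAttribute('href')));
  console.log(hrefs);
  await browser.close();
})();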
Hopefully this helps, and if you can't figure it out I can post my full source later if I have time to make it look better.

Puppeteer Failing for more than 11 URLs

I would like to ask: what's the best way to capture screenshots of more than 20 different URLs?
I have tried the following code.
async function sCapture(url, site_name) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 720 })
  await page.goto(url);
  await page.screenshot({
    path: `statusImage/${site_name}.jpg`
  });
  await browser.close();
}
I'm getting the URLs from my DB like this:
db_connection.promise()
  .execute("SELECT * FROM `urls`")
  .then(([rows]) => {
    rows.forEach(user => {
      const url = user.link;
      const name = user.link_name;
      console.log(name);
      sCapture(url, name)
    });
    db_connection.end();
  }).catch(err => {
    console.log(err);
  });
Because my DB table contains more than 50 URLs, I was getting this error:
MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 exit listeners added. Use emitter.setMaxListeners() to increase limit
After I added the line below, the script just kills my server and I have to do a manual reboot for my site to work again:
require('events').EventEmitter.prototype._maxListeners = 100;
I will appreciate any help rendered.
I think your current code starts a new browser instance for each URL you want to fetch, and you don't need to do that: a separate page per URL is enough. Also, you are currently firing all those requests in parallel, which taxes your machine far more than doing them in sequence. Note that rows.forEach(async ...) doesn't actually wait between captures, so the version below uses a for...of loop instead. Putting these changes together gives you something like this:
const puppeteer = require('puppeteer');

let browser;

async function sCapture(url, site_name) {
  const page = await browser.newPage();
  await page.setViewport({ width: 1280, height: 720 })
  await page.goto(url);
  await page.screenshot({
    path: `statusImage/${site_name}.jpg`
  });
  await page.close(); // free the page before moving on to the next URL
}

const doit = () => {
  return db_connection.promise()
    .execute("SELECT * FROM `urls`")
    .then(async ([rows]) => {
      // for...of awaits each capture before starting the next;
      // rows.forEach(async ...) would fire them all at once
      for (const user of rows) {
        console.log(user.link_name);
        await sCapture(user.link, user.link_name);
      }
      db_connection.end();
    }).catch(err => {
      console.log(err);
    });
};

(async () => {
  browser = await puppeteer.launch();
  await doit(); // wait for every screenshot before closing the browser
  await browser.close();
})();
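If strictly sequential turns out to be too slow for 50+ URLs, a middle ground is to screenshot in small batches, replacing the for...of loop above (a sketch; the batch size of 5 is an arbitrary assumption to tune for your server):
// Process the rows a few at a time instead of one-by-one or all at once
const BATCH_SIZE = 5;
for (let i = 0; i < rows.length; i += BATCH_SIZE) {
  const batch = rows.slice(i, i + BATCH_SIZE);
  // run this batch in parallel, then wait before starting the next batch
  await Promise.all(batch.map(user => sCapture(user.link, user.link_name)));
}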

How to download pdf file that opens in new tab with puppeteer?

I am trying to download an invoice from a website using Puppeteer; I just started learning it. I am using Node to create and execute the code. I have managed to log in and navigate to the invoice page, but the invoice opens in a new tab, so the code does not detect it since it's not the active tab. This is the code I used:
const puppeteer = require('puppeteer')

const SECRET_EMAIL = 'emailid'
const SECRET_PASSWORD = 'password'

const main = async () => {
  const browser = await puppeteer.launch({
    headless: false,
  })
  const page = await browser.newPage()
  await page.goto('https://my.apify.com/sign-in', { waitUntil: 'networkidle2' })
  await page.waitForSelector('div.sign_shared__SignForm-sc-1jf30gt-2.kFKpB')
  await page.type('input#email', SECRET_EMAIL)
  await page.type('input#password', SECRET_PASSWORD)
  await page.click('input[type="submit"]')
  await page.waitForSelector('#logged-user')
  await page.goto('https://my.apify.com/billing#/invoices', { waitUntil: 'networkidle2' })
  await page.waitForSelector('#reactive-table-1')
  await page.click('#reactive-table-1 > tbody > tr:nth-child(1) > td.number > a')
  const newPagePromise = new Promise(x => browser.once('targetcreated', target => x(target.page())))
  const page2 = await newPagePromise
  await page2.bringToFront()
  await page2.screenshot({ path: 'apify1.png' })
  //await browser.close()
}

main()
In the above code I am just trying to take a screenshot. Can anyone help me?
Here is an example of a work-around for the Chromium issue mentioned in the comments above. Adapt it to fit your specific needs and use case. Basically, you need to capture the new page (target) and then do whatever you need to do to download the file, possibly passing it to Node as a buffer, as in the example below, if nothing else works for you (including a direct request to the download location via fetch, or ideally some request library on the back-end).
// Capture the new tab (target) as soon as the click opens it.
// Note: ATT_page is the answerer's page handle; in the question's code it would be "page".
const [PDF_page] = await Promise.all([
  browser
    .waitForTarget(target => target.url().includes('my.apify.com/account/invoices/'))
    .then(target => target.page()),
  ATT_page.click('#reactive-table-1 > tbody > tr:nth-child(1) > td.number > a'),
]);
// Reload so the invoice response can be captured and inspected
const asyncRes = PDF_page.waitForResponse(response =>
  response
    .request()
    .url()
    .includes('my.apify.com/account/invoices'));
await PDF_page.reload();
const res = await asyncRes;
const url = res.url();
const headers = res.headers();
if (!headers['content-type'].includes('application/pdf')) {
  await PDF_page.close();
  return null;
}
const options = {
  // target request options
};
// Fetch the PDF inside the page context and hand it back to Node as base64
const pdfAb = await PDF_page.evaluate(
  async (url, options) => {
    function bufferToBase64(buffer) {
      return btoa(
        new Uint8Array(buffer).reduce((data, byte) => {
          return data + String.fromCharCode(byte);
        }, ''),
      );
    }
    return await fetch(url, options)
      .then(response => response.arrayBuffer())
      .then(arrayBuffer => bufferToBase64(arrayBuffer));
  },
  url,
  options,
);
const pdf = Buffer.from(pdfAb, 'base64');
await PDF_page.close();
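From there you can persist the buffer however you like, for example (assuming Node's built-in fs module and a writable working directory):
const fs = require('fs');
// write the downloaded invoice to disk
fs.writeFileSync('invoice.pdf', pdf);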

Why am I not able to navigate through iFrames using Apify/Puppeteer?

I'm trying to manipulate forms on sites with iframes in them using Puppeteer. I've tried different ways to reach a specific iframe, or even just to count the iframes on a page, with no success.
Why isn't Puppeteer's page object recognizing the iframes / child frames of the page I'm trying to navigate?
It's happening with other pages as well, such as https://www.veiculos.itau.com.br/simulacao
const Apify = require('apify');
const sleep = require('sleep-promise');

Apify.main(async () => {
  // Launch the web browser.
  const browser = await Apify.launchPuppeteer();
  // Create and navigate new page
  console.log('Open target page');
  const page = await browser.newPage();
  await page.goto('https://www.credlineitau.com.br/');
  await sleep(15 * 1000);
  for (const frame of page.mainFrame().childFrames()) {
    console.log('test');
  }
  await browser.close();
});
Perhaps you'll find some helpful inspiration below.
// Note: TIMEOUTS, SELECTORS and waitForSelector(scope, selector) below are the
// answerer's own constants and helper, not Puppeteer built-ins.
const waitForIframeContent = async (page, frameSelector, contentSelector) => {
  await page.waitForFunction((frameSelector, contentSelector) => {
    const frame = document.querySelector(frameSelector);
    const node = frame.contentDocument.querySelector(contentSelector);
    return node && node.innerText;
  }, {
    timeout: TIMEOUTS.ten,
  }, frameSelector, contentSelector);
};

const $frame = await waitForSelector(page, SELECTORS.frame.iframeNode).catch(() => null);
if ($frame) {
  const frame = page.frames().find(frame => frame.name() === 'content-iframe');
  const $cancelStatus = await waitForSelector(frame, SELECTORS.frame.membership.cancelStatus).catch(() => null);
  await waitForIframeContent(page, SELECTORS.frame.iframeNode, SELECTORS.frame.membership.cancelStatus);
}
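As a quick sanity check you can also enumerate every frame Puppeteer currently sees; page.frames() returns the main frame plus every attached child frame:
// Log the name and URL of every frame Puppeteer has attached
for (const frame of page.frames()) {
  console.log(frame.name(), frame.url());
}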
Give it a shot.

Click anywhere on page using Puppeteer

Currently I'm using Puppeteer to fetch cookies & headers from a page; however, the site uses a bot-prevention system that is only bypassed by clicking on the page, and I don't want the clicks to be sequential, which would make them detectable.
How can I have Puppeteer click anywhere on the page at random, regardless of whether it clicks a link, button, etc.?
I've currently got this code:
const getCookies = async (state) => {
  try {
    state.browser = await launch_browser(state);
    state.context = await state.browser.createIncognitoBrowserContext();
    state.page = await state.context.newPage();
    await state.page.authenticate({
      username: proxies.username(),
      password: proxies.password(),
    });
    await state.page.setViewport(functions.get_viewport());
    state.page.on('response', response => handle_response(response, state));
    await state.page.goto('https://www.website.com', {
      waitUntil: 'networkidle0',
    });
    await state.page.waitFor('.unlockLink a', {
      timeout: 5000
    });
    await state.page.click('.unlockLink a');
    await state.page.waitFor('input[id="nondevice"]', {
      timeout: 5000
    });
    state.publicIpv4Address = await state.page.evaluate(() => {
      return sessionStorage.getItem("publicIpv4Address");
    });
    state.csrfToken = await state.page.evaluate(() => {
      return sessionStorage.getItem("csrf-token");
    });
    //I NEED TO CLICK HERE! CAN BE WHITESPACE, LINK, IMAGE
    state.browser_cookies = await state.page.cookies();
    state.browser.close();
    for (const cookie of state.browser_cookies) {
      if (cookie.name === "dtPC") {
        state.dtpc = cookie.value;
      }
      await state.jar.setCookie(
        `${cookie.name}=${cookie.value}`,
        'https://www.website.com'
      )
    }
    return state;
  } catch (error) {
    if (state.browser) {
      state.browser.close();
    }
    throw new Error(error);
  }
};
The simplest way I can think of off the top of my head to choose a random element from the DOM would be to use querySelectorAll(), which returns a NodeList of all <div>s in your document (or choose any other element, like <p>). You can then easily call click() on a random one from the result, for example:
await page.evaluate(() => {
  // grab every <div> on the page and click one at random
  const allDivs = document.querySelectorAll('div');
  const randomElement = allDivs[Math.floor(Math.random() * allDivs.length)];
  randomElement.click();
});
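Alternatively, if you'd rather not depend on the DOM at all, Puppeteer's mouse API can click at random coordinates inside the viewport (a sketch; the 1280x720 bounds are an assumption, so use whatever functions.get_viewport() actually returns):
// Click a random point inside an assumed 1280x720 viewport
const x = Math.floor(Math.random() * 1280);
const y = Math.floor(Math.random() * 720);
await state.page.mouse.click(x, y);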
