NOTE: There is no issue with the code here. The problem was the script file location in my function.json: it was pointing to an old script file. As soon as I pointed it to the new one, it started working.
I am not sure why this is happening. I have a try-catch block and the function never hits the catch block, but the image I am trying to upload never shows up in the container.
I am new to Node.js. Since I can't achieve the same thing using C# functions, I decided to write it in Node.
Problem: An Azure Function with a Service Bus topic trigger takes the message payload and grabs a screenshot of the page using Puppeteer. The screenshot is returned as a buffer, and I am trying to upload it to an Azure Storage blob.
import { AzureFunction, Context } from "@azure/functions";
import { ServiceBusMessage } from "@azure/service-bus";
import * as puppeteer from 'puppeteer';
import * as BlobServiceClient from "azure-storage";
import { Readable } from 'stream';
const serviceBusTopicTrigger: AzureFunction = async function (context: Context, mySbMsg: ServiceBusMessage): Promise<void> {
    try {
        const promotionId = context.bindingData.userProperties.promotionId;
        context.log('Player Screen Grabber ServiceBus topic trigger function processing message started', promotionId);
        const playerURL = process.env['playerURL'] + promotionId + '/';
        let browser = await puppeteer.launch({ headless: true });
        let page = await browser.newPage();
        await page.goto(playerURL, { waitUntil: 'networkidle2' });
        await page.setViewport({ width: 1920, height: 1080 });
        const screenshotBuffer = await page.screenshot({
            encoding: 'binary'
        });
        await page.close();
        await browser.close();
        const newPlayerScreenShotStream = new Readable({
            read() {
                this.push(screenshotBuffer);
            },
        });
        var fileName = promotionId + ".png";
        context.bindings.fileName = fileName;
        context.bindings.storage = screenshotBuffer;
        context.done();
        context.log('Player Screen Grabber ServiceBus topic trigger function processing message ended', promotionId);
    }
    catch (error) {
        throw error;
    }
};
According to the information you provided, you want to use a dynamic name in the Azure Functions Blob storage output binding. If so, it cannot be implemented by assigning to context.bindings.<name>.
To implement it, you have the following two choices.
Using Azure Functions binding expression patterns
If the message body is JSON, its properties can be read directly with a binding expression in function.json. For example, the {fileName} token in the blob path below is resolved from the fileName property of the JSON message body.
function.json
{
    "bindings": [
        {
            "name": "mySbMsg",
            "type": "serviceBusTrigger",
            "direction": "in",
            "topicName": "",
            "subscriptionName": "",
            "connection": "MYSERVICEBUS"
        },
        {
            "type": "blob",
            "direction": "out",
            "name": "outputBlob",
            "path": "outcontainer/{fileName}.png",
            "connection": "AzureWebJobsStorage"
        }
    ],
    "scriptFile": "../dist/ServiceBusTopicTrigger1/index.js"
}
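To make the binding expression concrete, here is a minimal sketch of how the {fileName} token in the blob path would be filled in from a JSON message body. The resolveBlobPath helper and the sample payload are hypothetical illustrations: the Azure Functions runtime performs this substitution internally, and this is not its implementation.

```typescript
// Hypothetical illustration only: the Azure Functions runtime resolves
// binding expressions itself; this helper just mimics the substitution.
type MessageBody = Record<string, unknown>;

function resolveBlobPath(template: string, body: MessageBody): string {
  // Replace each {token} with the matching top-level property of the body.
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = body[key];
    if (value === undefined || value === null) {
      throw new Error(`Message body has no "${key}" property`);
    }
    return String(value);
  });
}

// Assumed sample Service Bus message body (JSON); the value is made up.
const sampleBody: MessageBody = JSON.parse('{"fileName": "promo-123"}');
const blobPath = resolveBlobPath("outcontainer/{fileName}.png", sampleBody);
console.log(blobPath);
```

With a binding like this, the function only has to assign the screenshot buffer to context.bindings.outputBlob, and the sender has to include the fileName property in the JSON body of each message.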
Function code
import { AzureFunction, Context } from "@azure/functions";
import * as puppeteer from "puppeteer";
const serviceBusTopicTrigger: AzureFunction = async function (
    context: Context,
    mySbMsg: any
): Promise<void> {
    try {
        context.log("ServiceBus topic trigger function processed message", mySbMsg);
        const promotionId = context.bindingData.userProperties.promotionId;
        const playerURL =
            "https://learn.microsoft.com/en-us/azure/azure-functions/functions-reference-node?tabs=v2";
        let browser = await puppeteer.launch({ headless: true });
        let page = await browser.newPage();
        await page.goto(playerURL, { waitUntil: "networkidle2" });
        await page.setViewport({ width: 1920, height: 1080 });
        const screenshotBuffer = await page.screenshot({
            encoding: "binary",
        });
        await page.close();
        await browser.close();
        context.bindings.outputBlob = screenshotBuffer;
    } catch (error) {
        throw error;
    }
};
export default serviceBusTopicTrigger;
Using the Azure Blob Storage SDK
Function code
import { AzureFunction, Context } from "@azure/functions";
import * as puppeteer from "puppeteer";
import { BlobServiceClient } from "@azure/storage-blob";
const serviceBusTopicTrigger: AzureFunction = async function (
    context: Context,
    mySbMsg: any
): Promise<void> {
    try {
        context.log("ServiceBus topic trigger function processed message", mySbMsg);
        const promotionId = context.bindingData.userProperties.promotionId;
        const playerURL =
            "https://learn.microsoft.com/en-us/azure/azure-functions/functions-reference-node?tabs=v2";
        let browser = await puppeteer.launch({ headless: true });
        let page = await browser.newPage();
        await page.goto(playerURL, { waitUntil: "networkidle2" });
        await page.setViewport({ width: 1920, height: 1080 });
        const screenshotBuffer = await page.screenshot({
            encoding: "binary",
        });
        await page.close();
        await browser.close();
        // the storage account connection string
        const constr = process.env["AzureWebJobsStorage"];
        const blobserviceClient = BlobServiceClient.fromConnectionString(constr);
        const containerClient = blobserviceClient.getContainerClient("output");
        const blob = containerClient.getBlockBlobClient(`${promotionId}.png`);
        await blob.uploadData(screenshotBuffer);
    } catch (error) {
        throw error;
    }
};
export default serviceBusTopicTrigger;
I'm converting my Puppeteer code to puppeteer-cluster. It was working just fine, but now I'm getting the error "page.solveRecaptchas is not a function" when trying to use 2captcha to solve an hCaptcha.
This is the complete code that I wrote. It takes data from an Excel file and fills it into the website.
The number of pages varies.
`
const xlsx = require('xlsx')
const puppeteer = require('puppeteer-extra')
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
const RecaptchaPlugin = require('puppeteer-extra-plugin-recaptcha')
puppeteer.use(StealthPlugin())
puppeteer.use(
RecaptchaPlugin({
provider: {
id: '2captcha',
token: 'xxxxxxxxxxxx'
},
visualFeedback: true
})
)
const {executablePath} = require('puppeteer')
const { Cluster } = require('puppeteer-cluster');
(async () => {
const cluster = await Cluster.launch({
concurrency: Cluster.CONCURRENCY_PAGE,
maxConcurrency: 10,
timeout: 150 * 1000 ,
puppeteerOptions: {
headless: false,
args: ["--no-sandbox", "--disable-setuid-sandbox","--disable-web-security"],
defaultViewport: null,
executablePath: executablePath()
},
});
cluster.on('taskerror', (err, url) => {
console.error((new Date()).toJSON() + ` Error crawling ${url}: ${err.message}`);
});
//get excele data
let fileURL = 'C:/xxxx/xxxx/xxxxx/clients2.xlsx'
let workbook = xlsx.readFile(fileURL)
const sheet_name_list = workbook.SheetNames;
let clientsArr = xlsx.utils.sheet_to_json(workbook.Sheets[sheet_name_list[0]])
console.log(clientsArr);
await cluster.task(async ({ page, data: [email , password,appiontment, firstName , lastName ] }) => {
await page.goto('https://website.com/')
await page.waitForTimeout(1000)
// close popup 1
await page.waitForSelector('#IDBodyPanelapp > div.popup-appCloseIcon');
await page.click('#IDBodyPanelapp > div.popup-appCloseIcon')
//choose region
await page.waitForSelector('#juridiction');
if(region == 'ALGER'){
region = "15#Al#10"
await page.select('#juridiction', region);
}
else{
region = "14#Ora#9"
await page.select('#juridiction', region);
}
// click to get 2nd otp
page.$eval(`#verification_code`, element =>
element.click()
)
// close popup 2
await page.waitForTimeout(1500)
await page.waitForSelector('#IDBodyPanelapp > div.popup-appCloseIcon');
await page.click('#IDBodyPanelapp > div.popup-appCloseIcon')
//solve hcaptcha and submit form
await page.waitForTimeout(2000)
await page.waitForSelector('#category');
if(appiontment == 'Normal'){
appiontment = "Normal"
await page.select('#category', appiontment);
}
else{
appiontment = "Premuim"
await page.select('#category', appiontment);
}
await page.waitForTimeout(15000)
await page.solveRecaptchas()
await Promise.all([
page.waitForNavigation(),
//click submit
page.click(`#em_tr > div.col-sm-6 > input`)
])
await page.screenshot({ path: 'screenshot.png', fullPage: true })
});
clientsArr.map((data)=>{
cluster.execute([data.email, data.password , data.appiontment, data.firstname , data.lastPrenom ]);
})
// await cluster.idle();
// await cluster.close();
})();
`
I have already searched but found no solutions.
Any help is appreciated, thank you.
I have a Next.js page that contains a React video player, which plays a YouTube video based on an id passed in the URL. The YouTube video is fetched in getServerSideProps based on that id. On the client side I then call /api/some-route to take a screenshot of the video player div using Puppeteer. The problem is that when the API route opens a browser with Puppeteer at that URL, getServerSideProps is called again and the page calls my /api/some-route again. This creates a loop that never finishes. How do I stop this?
My page:
export default function Home({ params }) {
  useEffect(() => {
    // typeof returns a string, so compare against "undefined"
    if (typeof window === "undefined") {
      return;
    }
    const url = window.location.href;
    setTimeout(() => {
      fetch(`/api/scrapper?url=${url}`)
        // return the parsed body so the next .then receives it
        .then((res) => res.json())
        .then((data) => {
          console.log(data);
        });
    }, 10000);
  }, [params.slug[0]]);
  return (
    <>
      <Layout>
        <Frame id="capture" />
      </Layout>
    </>
  );
}
export const getServerSideProps = async ({ params }) => {
  return {
    props: { params, serverData },
  };
}
/api/scrapper.js
import puppeteer from "puppeteer";
export default async function My(req, res) {
const url = req.query.url;
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto(url);
const img = await page.screenshot({ path: "output.png" });
console.log("img", img);
await page.close();
await browser.close();
return res.json("done");
}
Is there a way to add a command so that the bot posts this message every day in a specific channel on the server? It's for CS:GO match statistics.
const BaseCommand = require('../../utils/structures/BaseCommand');
const Discord = require("discord.js");
const puppeteer = require('puppeteer');
module.exports = class LinkCommand extends BaseCommand {
constructor() {
super('mec', 'fun', []);
}
async run(client, message, args) {
const browser = await puppeteer.launch({defaultViewport: null});
const page = await browser.newPage();
await page.setViewport({
width: 1920,
height: 1080,
deviceScaleFactor: 1,
});
await page.setDefaultNavigationTimeout(0);
await page._client.send('Network.getAllCookies');
await page.goto('https://pro.eslgaming.com/csgo/proleague/schedule/#?matchday=2');
console.log(await page.content());
await page.screenshot({path: 'screenhhhhshot.png'});
let screenshot = await page.screenshot();
await browser.close();
let today = new Date().toISOString().slice(0, 10)
message.channel.send(`${today}`, {files: [screenshot]});
message.delete();
}
}
What I want is a command that makes the bot send this message every day in a specific channel on the Discord server.
I got it
setInterval(() => {
    message.channel.send(`${today}`, {files: [screenshot]});
    message.delete();
}, 60000);
The 60000 at the end is in milliseconds; just change it to 24 hours in milliseconds (86400000) and boom :D
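As a rough sketch of the interval arithmetic, the snippet below spells out one day in milliseconds and adds a hypothetical helper, msUntilNextRun, that computes the delay until a chosen local hour so the first post lands at a fixed time of day. The scheduleDaily wrapper and its job callback are assumptions for illustration; job stands in for whatever code takes a fresh screenshot and sends it.

```typescript
// One day expressed in milliseconds: 24 h * 60 min * 60 s * 1000 ms.
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: milliseconds from `now` until the next occurrence of
// the given local hour (0-23).
function msUntilNextRun(now: Date, hour: number): number {
  const next = new Date(now);
  next.setHours(hour, 0, 0, 0);
  if (next.getTime() <= now.getTime()) {
    // That hour has already passed today, so target tomorrow.
    next.setDate(next.getDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Hypothetical wiring: run `job` at the next occurrence of `hour`,
// then repeat it every 24 hours.
function scheduleDaily(job: () => void, hour: number): void {
  setTimeout(() => {
    job();
    setInterval(job, DAY_MS);
  }, msUntilNextRun(new Date(), hour));
}
```

Taking the screenshot inside job, rather than once outside the timer, would keep each daily post current instead of resending the first image and date.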
When the download button is clicked, a new tab is opened where the user can view a PDF statement.
This new tab has a URL starting with blob:, e.g.: blob:https://some-domain.com/statement-id.
How could I download this PDF statement to the file system?
Note: I'm using { headless: false } mode.
Trying to simulate the case:
import puppeteer from 'puppeteer';
import { writeFileSync } from 'fs';
// Minimal PDF from https://github.com/mathiasbynens/small#documents
const minimalPdf = `%PDF-1.
1 0 obj<</Pages 2 0 R>>endobj
2 0 obj<</Kids[3 0 R]/Count 1>>endobj
3 0 obj<</Parent 2 0 R>>endobj
trailer <</Root 1 0 R>>`;
const browser = await puppeteer.launch({ headless: false, defaultViewport: null });
try {
const [page] = await browser.pages();
await page.goto('http://example.com/');
await page.evaluate((pdf) => {
const url = URL.createObjectURL(new Blob([pdf], {type: 'application/pdf'}));
window.open(url);
}, minimalPdf);
const newTarget = await page.browserContext().waitForTarget(
target => target.url().startsWith('blob:')
);
const newPage = await newTarget.page();
const blobUrl = newPage.url();
page.once('response', async (response) => {
console.log(response.url());
const pdfBuffer = await response.buffer();
console.log(pdfBuffer.toString());
console.log('same:', pdfBuffer.toString() === minimalPdf);
writeFileSync('minimal.pdf', pdfBuffer);
});
await page.evaluate((url) => { fetch(url); }, blobUrl);
} catch(err) { console.error(err); } finally { /* await browser.close(); */ }
I'm just getting started with Puppeteer. I can launch the browser, go to a URL, run a bunch of actions and then close the browser. What I'd like to know is whether I can open the browser and loop over a set of actions in the same session.
I have a JSON object with the URLs I want to visit, so I want to loop over that.
// teams.js
module.exports = {
premier_league: [
{ team_name: "Team 1", url: "https://url-of-site/team_1"},
{ team_name: "Team 2", url: "https://url-of-site/team_2"}
]
}
My script to launch puppeteer is as follows
// index.js
const TEAM = require('./teams');
const puppeteer = require('puppeteer');
(async () => {
// Initialise Browser
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
await page.setViewport({
width: 1280,
height: 800
});
await page.goto('login page');
await page.click('login_box');
await page.keyboard.type('username');
await page.click('login_password');
await page.keyboard.type('password');
await page.click('login_button');
await page.waitForNavigation();
// Go To Team URL
await page.goto('Team URL')
await browser.close();
})();
So to loop over my JSON object I can use
Object.keys(TEAM['premier_league']).forEach(function(key) {
// Output url of each team
console.log(TEAM['premier_league'][key]['url'])
});
If I wrap the page.goto call in my loop, then page is no longer accessible
// index.js
const TEAM = require('./teams');
const puppeteer = require('puppeteer');
(async () => {
// Initialise Browser
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
await page.setViewport({
width: 1280,
height: 800
});
await page.goto('login page');
await page.click('login_box');
await page.keyboard.type('username');
await page.click('login_password');
await page.keyboard.type('password');
await page.click('login_button');
await page.waitForNavigation();
Object.keys(TEAM['premier_league']).forEach(function(key) {
// Go To Team URL
await page.goto(TEAM['premier_league'][key]['url'])
});
await browser.close();
})();
The actual error is
await page.goto(TEAM[args][key]['url'])
^^^^
SyntaxError: Unexpected identifier
Your Object.keys callback function needs to be async as well in order to use await inside it. Try changing it as below:
Object.keys(TEAM['premier_league']).forEach(async function(key) {
    // Go To Team URL
    await page.goto(TEAM['premier_league'][key]['url'])
});
Hope it helps
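One caveat worth noting: forEach does not wait for the promises returned by an async callback, so the loop returns immediately and browser.close() can run while navigations are still pending. Below is a minimal sketch of a sequential alternative using for...of. The PageLike interface and the fakePage stub are hypothetical stand-ins so the example runs without Puppeteer; the team data mirrors teams.js from the question.

```typescript
// Hypothetical stand-in for the part of Puppeteer's Page used here.
interface PageLike {
  goto(url: string): Promise<unknown>;
}

interface Team {
  team_name: string;
  url: string;
}

// Visit each team URL one after another on the same page.
// for...of lets `await` pause the enclosing async function on each iteration.
async function visitTeams(page: PageLike, teams: Team[]): Promise<string[]> {
  const visited: string[] = [];
  for (const team of teams) {
    await page.goto(team.url);
    visited.push(team.team_name);
  }
  return visited;
}

// Stub page that records the order in which navigations complete.
const navigationLog: string[] = [];
const fakePage: PageLike = {
  async goto(url: string): Promise<void> {
    await new Promise<void>((resolve) => setTimeout(resolve, 5));
    navigationLog.push(url);
  },
};

const premierLeague: Team[] = [
  { team_name: "Team 1", url: "https://url-of-site/team_1" },
  { team_name: "Team 2", url: "https://url-of-site/team_2" },
];
```

In the real script, the Puppeteer page and TEAM.premier_league would be passed in, with browser.close() called only after await visitTeams(...) has resolved.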