How do I set multiple custom HTTP headers in puppeteer? - node.js

I am trying to log in using puppeteer at https://kith.com/account/login?return_url=%2Faccount
When I log in and solve the captcha with audio, it detects me as a bot, so I am trying to change the request headers to see if that helps, but I cannot find anything on how to change them.
I found this, but it only shows one header:
await page.setRequestInterception(true)
page.on('request', (request) => {
const headers = request.headers();
headers['X-Just-Must-Be-Request-In-All-Requests'] = '1';
request.continue({
headers
});
});

You can also set multiple HTTP headers with the dedicated puppeteer method page.setExtraHTTPHeaders.
E.g.:
await page.setExtraHTTPHeaders({
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36',
'upgrade-insecure-requests': '1',
'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'en-US,en;q=0.9,en;q=0.8'
})
await page.goto('...')
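One caveat, as far as I know: a user-agent passed through page.setExtraHTTPHeaders only changes the HTTP header, while page.setUserAgent also changes what scripts on the page see in navigator.userAgent. Below is a minimal sketch of separating the two. splitUserAgent is a hypothetical helper name, not part of puppeteer.

```javascript
// Hypothetical helper: pulls the User-Agent out of a header map so it can be
// applied with page.setUserAgent, leaving the rest for page.setExtraHTTPHeaders.
function splitUserAgent(headers) {
  let userAgent;
  const extraHeaders = {};
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === 'user-agent') {
      userAgent = value;
    } else {
      extraHeaders[name] = value;
    }
  }
  return { userAgent, extraHeaders };
}

// Intended use inside an async function with a puppeteer `page`:
//   const { userAgent, extraHeaders } = splitUserAgent(allHeaders);
//   if (userAgent) await page.setUserAgent(userAgent);
//   await page.setExtraHTTPHeaders(extraHeaders);
```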

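headers is a plain object (not an array), so you can add as many entries as you want. Request interception must already be enabled with page.setRequestInterception(true):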
page.on('request', (request) => {
const headers = request.headers();
headers['X-Just-Must-Be-Request-In-All-Requests'] = '1';
headers['foo'] = 'bar';
headers['foo2'] = 'bar2';
request.continue({
headers
});
});
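Since request.headers() reports header names in lower case, a mixed-case key such as 'Accept' would be added next to the existing 'accept' instead of replacing it. Here is a small sketch of a merge step that avoids this. mergeHeaders is a hypothetical helper, not a puppeteer API.

```javascript
// Hypothetical helper: returns a new header object with `extra` merged in.
// Names are lower-cased so an override replaces the existing entry instead of
// being added alongside it under a different spelling.
function mergeHeaders(original, extra) {
  const merged = { ...original };
  for (const [name, value] of Object.entries(extra)) {
    merged[name.toLowerCase()] = value;
  }
  return merged;
}

// Intended use inside the 'request' handler:
//   request.continue({ headers: mergeHeaders(request.headers(), { foo: 'bar' }) });
```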

Related

Axios post form urlencoded requests application/x-www-form-urlencoded

I'm trying to make a POST request with axios, sending the form data from a checkbox and a submit button, but I don't know how to do this correctly with axios. I would appreciate your help.
const URI = "https://www.guadeloupe.gouv.fr/booking/create/12828/0"
const data = "condition=on&nextButton=Effectuer+une+demande+de+rendez-vous"
const headers = {
'Content-Type': 'application/x-www-form-urlencoded',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36',
'Accept-Encoding': 'gzip, deflate, br',
'Origin': 'https://www.guadeloupe.gouv.fr'
}
const resp = await axios.post(URI,data,headers)
With Insomnia: [screenshot of the Insomnia POST request]
Checkbox: [screenshot of the checkbox and submit button]
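I can't test against the real server, but I can suggest this code.
It's based on your image and code. Note that the third argument of axios.post is a config object, so the headers must be passed as { headers: {...} } rather than as the header map itself.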
const resp = await axios.post(URI,
new URLSearchParams({
'condition': 'condition',
'nextButton': 'Effectuer une demande de rendez vous'
}),
{
headers:
{
'Content-Type': 'application/x-www-form-urlencoded'
}
})
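To check what body URLSearchParams will actually send, it can be serialized locally before making the request. The field values below are copied from the raw data string in the question (condition=on), on the assumption that this is what the form really submits.

```javascript
// Values copied from the question's raw `data` string (assumed to match the form).
const params = new URLSearchParams({
  condition: 'on',
  nextButton: 'Effectuer une demande de rendez-vous',
});

// In form encoding, spaces become "+" and hyphens are left as they are.
const encoded = params.toString();
console.log(encoded);
// prints: condition=on&nextButton=Effectuer+une+demande+de+rendez-vous
```

If the printed string differs from what the browser or Insomnia sends, the field values are the first thing to compare.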

Web scraping using fetch - promise doesn't resolve

I am trying to fetch a particular website. I already mimic all the request headers that Chrome sends, but I still get a pending promise that never resolves.
Here is my current code and headers:
const fetch = require('node-fetch');
(async () => {
console.log('Starting fetch');
const fetchResponse = await fetch('https://www.g2a.com/rocket-league-pc-steam-key-global-i10000003107015', {
method: 'GET',
headers: {
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36',
'Accept-Language': 'en-US;q=0.7,en;q=0.3',
'Accept-Encoding': 'gzip, deflate, br'
}
})
console.log('I never see this console.log: ', fetchResponse);
if(fetchResponse.ok){
console.log('ok');
}else {
console.log('not ok');
}
console.log('Leaving...');
})();
These are the console logs I see:
Starting fetch
This is a pending promise: Promise { <pending> }
not ok
Leaving...
Is there something I can do here? I noticed in similar questions that for this specific website only the Accept-Language header is needed; I already tried that, but the promise still never resolves.
I also read in another question that they have protection against Node.js requests; maybe I need to use another language?
You'll have a better time using async functions and await instead of then here.
I'm assuming your Node.js doesn't support top-level await, hence the last .then.
const fetch = require("node-fetch");
const headers = {
"User-Agent":
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
"Accept-Language": "en-US;q=0.7,en;q=0.3",
"Accept-Encoding": "gzip, deflate, br",
};
async function doFetch(url) {
console.log("Starting fetch");
const fetchResponse = await fetch(url, {
method: "GET",
headers,
});
console.log(fetchResponse);
if (!fetchResponse.ok) {
throw new Error("Response not OK");
}
const data = await fetchResponse.json();
return data;
}
doFetch("https://www.g2a.com/rocket-league-pc-steam-key-global-i10000003107015").then((data) => {
console.log("All done", data);
});
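If the request really does hang, it may help to make it fail after a deadline instead of staying pending forever. Below is a minimal sketch using only built-in promises. withTimeout is a hypothetical helper, and it only stops waiting; it does not cancel the underlying network request.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; the timer is cleared either way.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Intended use with the doFetch function above:
//   withTimeout(doFetch(url), 10000).then(console.log, console.error);
```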

Node.js: How to download a file via an HTTPS/POST request

I'm trying to download a file from www.borsaistanbul.com.
For some files (like the ones under https://www.borsaistanbul.com/veriler/verileralt/hisse-senetleri-piyasasi-verileri/bulten-verileri ) they provide the file paths, so I was able to download them easily via https.get(downloadLink).
But for the files under https://www.borsaistanbul.com/veriler/verileralt/hisse-senetleri-piyasasi-verileri/piyasa-verileri they don't provide the paths or the download links.
I'm trying to download the one named "Üye Bazında Seanslık İşlem Sıralaması" (the one in the 2nd row).
I might be wrong, but as far as I understand, when you click the download image next to it, your browser makes a POST request, which triggers something on the server side, and the server then serves the file to you.
I found the POST request with the help of Chrome DevTools and tried to simulate it, but it does not seem to work.
Could anyone help and show me how to download this file?
Here is the sample code I've tried:
const fs = require('fs');
const request = require('request');
/* Create an empty file where we can save data */
let file = fs.createWriteStream(`denemePost.zip`);
/* Using Promises so that we can use the ASYNC AWAIT syntax */
new Promise((resolve, reject) => {
let stream = request.post({
/* Here you should specify the exact link to the file you are trying to download */
uri: 'https://www.borsaistanbul.com/veriler/verileralt/hisse-senetleri-piyasasi-verileri/bulten-verileri',
headers: {
// 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Accept' : 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Encoding': 'gzip, deflate, br',
// 'Accept-Language': 'en-US,en;q=0.9,fr;q=0.8,ro;q=0.7,ru;q=0.6,la;q=0.5,pt;q=0.4,de;q=0.3',
'Accept-Language' : 'en-US,en;q=0.9',
'Cache-Control': 'max-age=0',
'Connection': 'keep-alive',
'Content-Length' : '7511',
'Content-Type' : 'application/x-www-form-urlencoded',
'Cookie' : 'ASP.NET_SessionId=vugebk1zob2fw2hgxiftjg1z; cPER=!SmE/fvI1sjF1DqtSzYfA84hhMFmKdR+VmPTaX1WlhB8KHfkS3iP2fO2FK2iyUzwiDyupy85iZItfoeo=; _ga=GA1.2.534681471.1587587675; _gid=GA1.2.113108587.1588205109',
'Host': 'www.borsaistanbul.com',
'Origin' : 'null',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode' : 'navigate',
'Sec-Fetch-Site' : 'same-origin',
'Sec-Fetch-User': '?1',
'Upgrade-Insecure-Requests': '1',
// 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36'
'User-Agent' : 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.122 Safari/537.36'
},
/* GZIP true for most of the websites now, disable it if you don't need it */
gzip: true
})
.pipe(file)
.on('finish', () => {
console.log(`The file is finished downloading.`);
resolve();
})
.on('error', (error) => {
reject(error);
})
})
.catch(error => {
console.log(`Something happened: ${error}`);
});
Any help would be much appreciated,
Thanks in advance
I found a workaround, in case anyone is trying to accomplish something similar.
I downloaded the file with the puppeteer library.
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({headless: false,slowMo: 250});
const page = await browser.newPage();
await page.goto('https://www.borsaistanbul.com/veriler/verileralt/hisse-senetleri-piyasasi-verileri/piyasa-verileri');
page.once('load', () => console.log('Page loaded!'));
await page.waitForSelector('#TextContent_C001_lbtnUyeBazindaGunlukIslemSiralamasi');
await page.click('#TextContent_C001_lbtnUyeBazindaGunlukIslemSiralamasi');
await browser.close();
})();
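Regarding the original request.post attempt: it declares Content-Length: 7511 but, as written, sends no body, which may be part of why the simulated POST fails. Here is a minimal sketch of building a form body and deriving the length from it. buildFormRequest and the field names in the usage comment are hypothetical; the real fields would have to be copied from the request shown in the browser's developer tools.

```javascript
// Hypothetical helper: encodes form fields and computes matching headers,
// so Content-Length always agrees with the body that is actually sent.
function buildFormRequest(fields) {
  const body = new URLSearchParams(fields).toString();
  return {
    body,
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      'Content-Length': String(Buffer.byteLength(body)),
    },
  };
}

// Intended use (field names are placeholders, not taken from the site):
//   const { body, headers } = buildFormRequest({ someHiddenField: '...', someButton: '...' });
//   request.post({ uri, headers, body });
```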

CookieJars obtaining all cookies, nodeJS using request-promise

I am struggling to make a successful request using the request-promise npm package on a site that requires a cookie for the page to be viewed or for the request to succeed.
So I have looked into cookie jars in order to store all the cookies that are given in the response after the request has been made.
const rp = require("request-promise")
var cookieJar = rp.jar()
function grabcfToken(){
let token = ""
let options = {
url : 'https://www.off---white.com/en/GB',
method: "GET",
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36',
resolveWithFullResponse : true
}
rp(options)
.then((response)=>{
console.log(response)
})
.catch((error)=>{
console.log(error)
})
}
Can someone tell me why the request isn't going through successfully? How do I apply the cookies that I initially get before being timed out?
const rp = require("request-promise")
var cookieJar = rp.jar()
function grabcfToken(){
let token = ""
let options = {
url : 'https://www.off---white.com/en/GB',
method: "GET",
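// request only reads headers from the `headers` option; a top-level 'user-agent' key is ignored
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'
},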
resolveWithFullResponse : true,
jar: cookieJar
}
rp(options)
.then((response)=>{
console.log(response)
})
.catch((error)=>{
console.log(error)
})
}
If you're asking how to send the cookies collected in your jar along with the request, you have to add jar: cookieJar to your options object before sending it.
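To illustrate what the jar does between requests, here is a deliberately simplified sketch that turns Set-Cookie response headers into a Cookie request header. cookieHeaderFromSetCookie is a hypothetical helper; unlike a real jar such as rp.jar(), it ignores attributes like Path, Domain and Expires.

```javascript
// Simplified sketch: keeps only the name=value part of each Set-Cookie line.
// A later cookie with the same name overwrites the earlier value.
function cookieHeaderFromSetCookie(setCookieHeaders) {
  const cookies = new Map();
  for (const line of setCookieHeaders) {
    const pair = line.split(';')[0].trim();
    const eq = pair.indexOf('=');
    if (eq > 0) {
      cookies.set(pair.slice(0, eq), pair.slice(eq + 1));
    }
  }
  return [...cookies].map(([name, value]) => `${name}=${value}`).join('; ');
}

console.log(cookieHeaderFromSetCookie(['a=1; Path=/', 'b=2; HttpOnly', 'a=3']));
// prints: a=3; b=2
```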

NodeJs request.get() function not working while the url is accessible from the browser

I am using the request npm module. I want to retrieve an image from a URL. The request.get(url) function returns a '400 Bad Request', whereas the image is accessible from the browser.
The URL I am hitting is: http://indiatribune.com/wp-content/uploads/2017/09/health.jpg
You could try adding some headers:
const request = require('request');
request.get({
url: 'http://indiatribune.com/wp-content/uploads/2017/09/health.jpg',
headers: {
Accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
'Accept-Encoding': 'gzip, deflate',
'Accept-Language': 'en-GB,en;q=0.8,en-US;q=0.6,hu;q=0.4',
'Cache-Control': 'max-age=0',
Connection: 'keep-alive',
Host: 'indiatribune.com',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.113 Safari/537.36',
},
}, (err, response, data) => {
console.log(response, data);
});
The User-Agent seems to be enough.
Use the download module. It's pretty simple.
const fs = require('fs');
const download = require('download');
download('http://indiatribune.com/wp-content/uploads/2017/09/health.jpg').pipe(fs.createWriteStream('foo.jpg'));
