How to use cookies with request (request, tough-cookie, Node.js) - node.js

I'd like to know how to use cookies with request (https://github.com/mikeal/request).
I need to set a cookie that is sent to every subdomain, something like
*.examples.com
with a path that covers every page, something like
/
so that the server side can read the cookie data correctly, something like
test=1234
Cookies that are set by a response work fine.
I added a custom jar to store the cookies, like this:
var theJar = request.jar();
var theRequest = request.defaults({
  headers: {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.116 Safari/537.36'
  },
  jar: theJar
});
But the cookies that I set myself on the request side are only sent to the same domain,
and I can't find a way to set a cookie with more options.
For now, if I want one cookie to be available on three subdomains,
I have to set it up like this:
theJar.setCookie('test=1234', 'http://www.examples.com/', {"ignoreError":true});
theJar.setCookie('test=1234', 'http://member.examples.com/', {"ignoreError":true});
theJar.setCookie('test=1234', 'http://api.examples.com/', {"ignoreError":true});
Is there a better way to set a cookie from request
so that it is available on every subdomain?

I just found the solution: put the path and domain attributes in the cookie string itself.
theJar.setCookie('test=1234; path=/; domain=examples.com', 'http://examples.com/');
I have to say, the documentation for request is not very good.
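To illustrate why the solution above works, here is a minimal sketch of cookie domain matching, loosely following RFC 6265 section 5.1.3. The function names `domainMatches` and `cookieApplies` and the `hostOnly` flag on the plain objects are hypothetical names chosen for this example; they are not the request or tough-cookie API, and the sketch leaves out IP addresses and public-suffix checks that a real cookie jar also handles.

```javascript
// Simplified sketch of cookie domain matching (not the tough-cookie implementation).
function domainMatches(requestHost, cookieDomain) {
  const host = requestHost.toLowerCase();
  // A leading dot in the Domain attribute is ignored.
  const domain = cookieDomain.toLowerCase().replace(/^\./, '');
  if (host === domain) return true;
  // A subdomain matches when the host ends with "." + domain.
  return host.endsWith('.' + domain);
}

// A cookie stored WITHOUT a Domain attribute is "host-only":
// it is sent back only to the exact host that it was set for.
function cookieApplies(cookie, requestHost) {
  return cookie.hostOnly
    ? requestHost.toLowerCase() === cookie.domain
    : domainMatches(requestHost, cookie.domain);
}

// 'test=1234' set for http://www.examples.com/ with no Domain attribute.
const hostOnlyCookie = { domain: 'www.examples.com', hostOnly: true };
// 'test=1234; path=/; domain=examples.com'
const domainCookie = { domain: 'examples.com', hostOnly: false };

console.log(cookieApplies(hostOnlyCookie, 'api.examples.com'));  // false
console.log(cookieApplies(domainCookie, 'api.examples.com'));    // true
console.log(cookieApplies(domainCookie, 'member.examples.com')); // true
```

Under these assumptions, a single cookie with an explicit domain attribute covers www, member and api, which is why the three separate setCookie calls are no longer needed.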

Related

Using proxy to make request results in bad request (400) error code

I'm using node-fetch and https-proxy-agent to make a request through a proxy. However, I get a 400 error code from the site I'm scraping, but only when I pass the agent; without it, everything works fine.
import fetch from 'node-fetch';
import Proxy from 'https-proxy-agent';

const ip = PROXIES[Math.floor(Math.random() * PROXIES.length)]; // PROXIES is a list of ips
const proxyAgent = Proxy(`http://${ip}`);

fetch(url, {
  agent: proxyAgent,
  headers: {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.72 Safari/537.36'
  }
}).then(res => res.text()).then(console.log)
This results in a 400 error code.
I have absolutely no idea why this is happening. If you want to reproduce the issue, I'm scraping https://azlyrics.com. Please let me know what is wrong.
The issue has been fixed. I had not noticed that I was requesting an HTTPS site through an HTTP proxy: the site uses the https protocol, but the proxies were http only. Switching to https proxies works. Thank you.

API GET request not reflecting changes in DB, delayed by 5min

I am trying to build a web app that notifies users when new vaccine slots appear on the government portal, using the public APIs it provides.
I need to call the API every minute and check whether slots have been added to the database. But the response I get is stale: the new sessions detected by my app (and also in Chrome) were about 5 minutes old. I know this because some Telegram channels show updates earlier than my app does.
Also, when I hit the same API with Postman, the response I get is fresh.
The issue: the Chrome/app response does not reflect the updated database, but Postman shows the updated one. Chrome gets the updated response about 5 minutes after it shows up in Postman.
Public API: https://cdn-api.co-vin.in/api/v2/appointment/sessions/public/calendarByDistrict?district_id=141&date=06-07-2021
let response = await fetch(`https://cdn-api.co-vin.in/api/v2/appointment/sessions/public/calendarByDistrict?district_id=${id}&date=${today}`, {
  method: 'GET',
  headers: {
    'Content-Type': 'application/json',
    'Connection': 'keep-alive',
  },
})
Do I need to change some headers, or anything else, in my GET requests?
Please help me fix it.
A couple of things.
First, use the Find by district API instead of the Calendar by district API. It is more accurate.
https://cdn-api.co-vin.in/api/v2/appointment/sessions/public/findByDistrict?district_id=512&date=31-03-2021
Second, pass a user agent along with no-cache headers. This is in PHP, but you can port it to any other language.
$header = array(
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Pragma: no-cache",
    "Cache-Control: no-cache",
    "Accept-Language: en-us",
    "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15",
    "Upgrade-Insecure-Requests: 1"
);
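Since the question uses JavaScript, the PHP headers above could be ported roughly as follows. This is only a sketch under stated assumptions: `buildFindByDistrictUrl`, `buildFreshRequestOptions` and `fetchSessions` are hypothetical helper names, the `Accept` value is changed to JSON because the question expects a JSON response, and a browser may ignore a custom `User-Agent` header, so that part mainly matters for server-side callers.

```javascript
// Hypothetical helper: builds the "Find by district" URL from the answer above.
function buildFindByDistrictUrl(districtId, date) {
  const url = new URL('https://cdn-api.co-vin.in/api/v2/appointment/sessions/public/findByDistrict');
  url.searchParams.set('district_id', String(districtId));
  url.searchParams.set('date', date); // expected format: DD-MM-YYYY
  return url.toString();
}

// Hypothetical helper: request options that ask for a fresh, uncached response.
function buildFreshRequestOptions(userAgent) {
  return {
    method: 'GET',
    cache: 'no-store', // standard fetch option: do not use the client's HTTP cache
    headers: {
      'Accept': 'application/json',
      'Pragma': 'no-cache',
      'Cache-Control': 'no-cache',
      'Accept-Language': 'en-us',
      'User-Agent': userAgent,
    },
  };
}

// Not called here; shows how the two helpers fit together.
async function fetchSessions(districtId, date, userAgent) {
  const response = await fetch(
    buildFindByDistrictUrl(districtId, date),
    buildFreshRequestOptions(userAgent)
  );
  return response.json();
}
```

Whether these headers actually bypass the stale copy depends on how the API's CDN is configured, so treat them as something to try rather than a guaranteed fix.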

Request blocked if it is sent by Node.js axios

I am using axios with an API (the CoWIN API, https://apisetu.gov.in/public/marketplace/api/cowin/cowin-public-v2) that has fairly strong protection against automated web requests.
When I got a 403 error on my dev machine (Windows), I solved it by just adding a 'User-Agent' header.
But after deploying to Heroku, I still get the same error.
const { data } = await axios.get(url, {
  headers: {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36',
  },
})
Using a fake user-agent in your headers can help with this problem, but there are other variables you may want to consider.
For example, if you are making multiple HTTP requests, you may want to keep a list of fake user-agents and pick one at random for every request. This can help limit the chances of your scraper being detected.
If that still doesn't work, you may want to optimize your headers further. Besides sending requests with a randomized user-agent, you can imitate a browser's request headers more closely by adding more headers than just "user-agent", making sure that the selected user-agent is consistent with the information sent in the rest of the headers.
You can check out here for more information.
On the site it will not only provide information on how to optimize your headers consistently with the user-agent, but also provide more solutions in case the above mentioned still was unsuccessful.
In my case, I had to bypass Cloudflare. You can tell whether this applies to you by logging the error to the terminal and checking whether the "server" key says "cloudflare". If so, you can use this documentation for further assistance.
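The suggestions above can be sketched as follows. Everything here is an illustration under assumptions: `HEADER_PROFILES`, `pickHeaderProfile` and `isCloudflareResponse` are hypothetical names, the `Accept` and `Accept-Language` values are example choices, and the Cloudflare check assumes response headers arrive as a plain object with lower-cased keys.

```javascript
// Hypothetical header profiles: each one keeps the User-Agent
// consistent with the other headers that are sent alongside it.
const HEADER_PROFILES = [
  {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.9',
  },
  {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-us',
  },
];

// Picks one whole profile at random, so headers are never mixed across browsers.
// The random source is injectable to keep the function easy to test.
function pickHeaderProfile(random = Math.random) {
  const index = Math.floor(random() * HEADER_PROFILES.length);
  return { ...HEADER_PROFILES[index] };
}

// Returns true when the "server" response header mentions Cloudflare.
function isCloudflareResponse(responseHeaders) {
  const server = String((responseHeaders && responseHeaders['server']) || '');
  return server.toLowerCase().includes('cloudflare');
}
```

With the axios call from the question, this would be used as `axios.get(url, { headers: pickHeaderProfile() })`, and in the catch block `isCloudflareResponse(error.response.headers)` could be checked when `error.response` exists.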

Getting 403 forbidden status through python requests

I am trying to scrape a website's content and I get a 403 Forbidden status. I have tried solutions such as using sessions for cookies and mimicking a browser through a 'User-Agent' header. Here is the code I have been using:
import requests

session = requests.Session()
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
}
page = session.get('https://www.sizeofficial.nl/product/zwart-new-balance-992/343646_sizenl/', headers=headers)
Note that this approach works on other websites; it is just this one that does not work. I have even tried sending the other headers that my browser sends, without success. Another approach I tried was to first create a session cookie and then pass that cookie to session.get, which still doesn't work for me. Is scraping this website not allowed, or am I still missing something?
I am using the requests library with Python 3.8 for this.

How to get the current Windows user from Express?

Hi, I am new to Node.js and the Express framework.
I am implementing a simple CRUD application, and users are expected to visit the page from MS Windows. I wish to log the current Windows user name.
I've tried logging the User-Agent string on the page, and it seems that User-Agent does not contain the Windows user name. Is this true? And what is the right way to implement this?
res.render('search', {user: req.get('User-Agent')});
Then in jade,
body
  p welcome, #{user}!
Here is what I got:
Welcome, Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36!
The User-Agent doesn't include the Windows user name. Have a look at Wikipedia for further information.
A possible solution to your problem is NTLM authentication. To add it, install express-ntlm and optionally save it as a dependency:
npm install express-ntlm [--save]
Then require it and add it as middleware to Express:
var ntlm = require('express-ntlm');
app.use(ntlm());
You will then be able to use the UserName in jade:
body
  p welcome, #{ntlm.UserName}!
If you want to do real NTLM authentication and validate the credentials against Active Directory, you can do that as well:
app.use(ntlm({
  domain: 'MYDOMAIN',
  domaincontroller: 'ldap://myad.example',
}));
