Log user browser details in Node.js

Is there a way to log which browser/OS/etc. the user is using from my Node.js app?
Thanks

I believe you want the information stored in the "User-Agent" request header:
var useragent = request.headers['user-agent']; // Node.js lowercases incoming header names
For example, my user-agent in Chrome is: "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.220 Safari/535.1"

Related

Error 403 while scraping a website in Python using requests and Selenium

I am trying to scrape the website "https://coinatmradar.com/". I am using requests, BeautifulSoup, and Selenium (wherever required) to scrape data. But after a while, my IP got blocked by the website, as it uses Cloudflare protection.
import requests
from bs4 import BeautifulSoup

country_url = "https://coinatmradar.com/country/226/bitcoin-atm-united-states/"
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
response = requests.get(country_url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
This is the part of the code that I am using. I am getting a 403 response. Is there another way around this to make it work with both requests and Selenium?
Try setting your headers like this:
headers = {'Cookie':'_gcar_id=0696b46733edeac962b24561ce67970199ee8668', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}
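Combined with the code from the question, that looks like the sketch below. Note that the _gcar_id cookie value is tied to one browsing session, so substitute a current value captured from your own browser's developer tools:
import requests
from bs4 import BeautifulSoup

headers = {
    'Cookie': '_gcar_id=0696b46733edeac962b24561ce67970199ee8668',  # session-specific; capture your own
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'
}

country_url = "https://coinatmradar.com/country/226/bitcoin-atm-united-states/"
response = requests.get(country_url, headers=headers)
print(response.status_code)  # 403 here means Cloudflare is still blocking the request
soup = BeautifulSoup(response.content, 'lxml')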

How can I send an HTTP POST request in Python with protobuf text as params?

I want to send an HTTP request using protobuf as params in Python. I copied the protobuf data from Charles Proxy (a web debugging proxy tool).
The protobuf text request data was: 1 { 1: "2345654456765" }
I tried this, but it is not working:
import requests

r = requests.post('https://api.website.com/version/auth/login?locale=en',
                  data={1: {1: '2345654456765'}},
                  headers={'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4181.9 Safari/537.36',
                           'platform': 'web'})
print(r.content)
I have no idea how to pass this as a param. I have always worked with JSON data. Does anyone know the solution?
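One thing to note: passing a dict to requests' data= sends form-encoded fields, not protobuf. A protobuf body has to be serialized to bytes first, typically with a class generated by protoc from a matching .proto file. A minimal sketch, assuming a hypothetical login_pb2 module whose message mirrors the structure Charles showed (the module, message, and field names below are placeholders, not known values):
import requests
import login_pb2  # hypothetical module generated by protoc from your .proto

msg = login_pb2.LoginRequest()    # hypothetical message name
msg.auth.token = '2345654456765'  # hypothetical field names for `1 { 1: "..." }`

r = requests.post(
    'https://api.website.com/version/auth/login?locale=en',
    data=msg.SerializeToString(),  # raw protobuf bytes as the request body
    headers={
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4181.9 Safari/537.36',
        'platform': 'web',
        'content-type': 'application/x-protobuf',  # assumption: copy the real value from Charles
    })
print(r.content)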

Google reCAPTCHA cannot be solved in Electron BrowserWindow

In my Electron app I try to open an external website (e.g. BrowserWindow.loadURL('www.abc.xyz')), which is protected by Google's reCAPTCHA. The browser window with the page is open, so the user could solve the captcha, and it does not act like a bot.
But somehow, the only response for the reCAPTCHA validation request is
)]}'
["rresp",null,null,null,null,null,1]
Also, no reCAPTCHA popup for "street sign" or "crossing" selection appears.
Additionally, I get a warning in the console:
A cookie associated with a cross-site resource at http://google.com/ was set without the `SameSite` attribute.
A future release of Chrome will only deliver cookies with cross-site requests if they are set with `SameSite=None` and `Secure`.
You can review cookies in developer tools under Application>Storage>Cookies and see more details at https://www.chromestatus.com/feature/5088147346030592 and https://www.chromestatus.com/feature/5633521622188032.
I could solve the problem by adding the user agent to every request separately:
const userAgent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36';

newSession.webRequest.onBeforeSendHeaders((details, callback: (response: Electron.BeforeSendResponse) => void) => {
  // The header key must be 'User-Agent'; 'userAgent' is not a valid header name
  details.requestHeaders['User-Agent'] = userAgent;
  callback({cancel: false, requestHeaders: details.requestHeaders});
});

Python: Access Denied at Random Points When Using Requests

I am using requests and BeautifulSoup to go through the popular comic store Comixology in order to make a list of all comic titles, issues, and release dates, so I am requesting a massive number of web pages. Unfortunately, partway through I will get the error:
you do not have access to (URL) on this server
I tried using a function that recursively retries the request, but this isn't working.
I'm not putting the whole code in because it is very long.
def getUrl(url):
    try:
        page = requests.get(url)
    except:
        getUrl(url)
    return page
The User-Agent request header contains a characteristic string that allows the network protocol peers to identify the application type, operating system, software vendor, or software version of the requesting software user agent. Validating the User-Agent header on the server side is a common operation, so be sure to use a valid browser's User-Agent string to avoid getting blocked.
(Source: http://go-colly.org/articles/scraping_related_http_headers/)
The only thing you need to do is set a legitimate user-agent, so add headers to emulate a browser:
# This is a standard user-agent for Chrome running on Windows 10
headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36' }
Example:
from bs4 import BeautifulSoup
import requests
headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
resp = requests.get('http://example.com', headers=headers).text
soup = BeautifulSoup(resp, 'html.parser')
Additionally, you can add another set of headers to look even more like a legitimate browser. Add some more headers like this:
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.5',
    'Accept-Encoding': 'gzip',
    'DNT': '1',  # Do Not Track request header
    'Connection': 'close'
}
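Applied to the question's getUrl helper, a bounded retry loop that sends these headers could look like the sketch below (the retry count, delay, and timeout values are illustrative, not part of the original answer):
import time
import requests

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}

def get_url(url, retries=3, delay=2):
    # Bounded retry loop instead of unbounded recursion; `page` is always defined.
    for attempt in range(retries):
        try:
            page = requests.get(url, headers=headers, timeout=10)
            page.raise_for_status()  # raises on 403 so the except branch retries
            return page
        except requests.RequestException:
            if attempt == retries - 1:
                raise
            time.sleep(delay)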

Request Headers: 41d9251ae3b6e89193fe, what does it mean?

As we know, typical request headers look like "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36" or "Accept: application/json, text/plain".
But this time, I found a request header of "41d9251ae3b6e89193fe:1b237fc9847ec56e144031e03cc72d704777ef4167026f236eb3dd8d2c5b15ad837a20c8ce14459ae9f5c36e581b1322229b548178cce6cdf07ebbea7765f2df", and I have never seen a request header like that.
What does it mean?
