Login to Website using Python and __RequestVerificationToken - python-3.x

I've been trying to login to a website using python so I can access some photos that are behind a login screen. I've seen a bunch of examples here, but none of them seem to work. Here is my code:
#!/usr/local/bin/python3
import requests
from bs4 import BeautifulSoup

if __name__ == "__main__":
    s = requests.Session()
    url = 'http://www.usafawebguy.com/Account/Login'
    g = s.get(url)
    token = BeautifulSoup(g.text, 'html.parser').find(
        'input', {'name': '__RequestVerificationToken'})['value']
    print(token)
    payload = {'UserName': 'username',
               'Password': 'password',
               'RememberMe': 'true',
               'ReturnURL': '/Photos/2022?page=1',
               '__RequestVerificationToken': token}
    p = s.post(url, data=payload)
    soup = BeautifulSoup(p.text, 'html.parser')
    print(soup.title)
    #print(p.text)
    r = s.get('https://www.usafawebguy.com/Photos/2022?page=1')
    soup = BeautifulSoup(r.text, 'html.parser')
    print(soup.title)
    #print(r.text)
It always brings me back to the login screen. Thanks in advance!

A one-character change: the url variable needs to be set to "https://....", not "http://....". Live and learn.
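The token-extraction step itself can be sanity-checked offline. Assuming the login page embeds the anti-forgery token in a hidden input as above (the HTML snippet below is a made-up stand-in, not the real page):

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the login page's HTML; the real page embeds
# the anti-forgery token in a hidden input with this name.
html = '''
<form action="/Account/Login" method="post">
  <input name="__RequestVerificationToken" type="hidden" value="abc123token" />
  <input name="UserName" type="text" />
</form>
'''

token = BeautifulSoup(html, 'html.parser').find(
    'input', {'name': '__RequestVerificationToken'})['value']
print(token)  # abc123token
```

If this works against the live page but the login still fails, the problem is in the POST step (scheme, cookies, or payload), not the parsing.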

Related

Why Is JSON Truncated During Linux HTML Response Parsing?

import requests
from bs4 import BeautifulSoup

url = "https://music.163.com/discover/toplist?id=19723756"
headers = {
    'User-Agent': "PostmanRuntime/7.15.2",
}
response = requests.request("GET", url, headers=headers)
soup = BeautifulSoup(response.text, "lxml")
textarea = soup.find('textarea', attrs={'id': 'song-list-pre-data'}).get_text()
print(textarea)
In the Linux environment, the matched JSON result is truncated.
The textarea ends with: xxxxxx ee":0,"album":{"id":158052587,"name":"Sakana~( ˵>ㅿㅿ
I think it's probably because of the special characters.
How do you deal with this situation?
You need to convert the string into a list of JSON objects; then each song can be printed.
I tested on Ubuntu 20.04 and in the VS Code terminal on Windows, and both work.
Code
import requests
import json
from bs4 import BeautifulSoup

url = "https://music.163.com/discover/toplist?id=19723756"
headers = {
    'User-Agent': "PostmanRuntime/7.15.2",
}
response = requests.request("GET", url, headers=headers)
soup = BeautifulSoup(response.text, "lxml")
textarea = soup.find('textarea', attrs={'id': 'song-list-pre-data'}).get_text()
json_list = json.loads(textarea)
for song in json_list:
    print("album:", song['album']['name'], ", artists: ", song['artists'][0]['name'], "duration: ", song['duration'])
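The parsing step can be checked without the network request. Assuming the textarea holds a JSON array of song objects (the two-song sample below is made up, shaped to match the fields the loop reads):

```python
import json

# Made-up sample shaped like the real 'song-list-pre-data' payload
textarea = '''[
  {"album": {"id": 1, "name": "Album A"},
   "artists": [{"id": 10, "name": "Artist A"}],
   "duration": 201000},
  {"album": {"id": 2, "name": "Album B"},
   "artists": [{"id": 20, "name": "Artist B"}],
   "duration": 185000}
]'''

# json.loads turns the string into Python lists/dicts, so nested
# fields can be indexed directly instead of sliced out of raw text.
json_list = json.loads(textarea)
for song in json_list:
    print("album:", song['album']['name'],
          ", artists:", song['artists'][0]['name'],
          ", duration:", song['duration'])
```

If json.loads raises an error on the real page, the string really was truncated before it reached the parser, which points at the terminal's display rather than the HTTP response.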
Result on Ubuntu 20.04
Result in the VS Code terminal

python using requests and a webpage with a login issue

I'm trying to log in to a website via Python to print the info, so I don't have to keep logging in to multiple accounts.
In the tutorial I followed, he just had a login and password, but this one has extra form data:
Website Form Data
Do the _wp attributes change on each login?
The code I use:
import requests

mffloginurl = 'https://myforexfunds.com/login-signup/'
mffsecureurl = 'https://myforexfunds.com/account-2'

payload = {
    'log': '*****@gmail.com',
    'pwd': '*****',
    'brandnestor_action': 'login',
    '_wpnonce': '9d1753c0b6',
    '_wp_http_referer': '/login-signup/',
}

r = requests.post(mffloginurl, data=payload)
print(r.text)
Using the correct details, of course, but it doesn't log in.
I tried without the extra WordPress elements and also with them, but it still just goes to the sign-in page.
Python output (different site addresses, different login details)
Yeah, the nonce will change with every new visit to the page.
I would use requests.Session() so that it automatically stores session cookies and all that good stuff.
Do a session.get('some_login_page.com'), parse the response content with BeautifulSoup to retrieve the nonce, then add that into the payload of your POST request when you log in.
A very quick and dirty example:
import requests
from bs4 import BeautifulSoup as bs

email = 'test@email.com'
password = 'password1234'
url = 'https://myforexfunds.com/account-2/'

# Start a session
with requests.Session() as session:
    # Send a GET request to the login page
    r = session.get(url)
    # Check if the request was successful
    if r.status_code != 200:
        print("Get Request Failed")

    # Parse the HTML content of the page
    soup = bs(r.content, 'lxml')
    # Extract the value of the nonce from the HTML
    nonce = soup.find(id='woocommerce-login-nonce')['value']

    # Set up the login form data
    params = {
        "username": email,
        "password": password,
        "woocommerce-login-nonce": nonce,
        "_wp_http_referer": "/account-2/",
        "login": "Log+in"
    }

    # Send a POST request with the login form data
    # (form fields belong in `data`, not `params`, so they go in the body)
    r = session.post(url, data=params)
    # Check if the request was successful
    if r.status_code != 200:
        print("Login Failed")
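One caveat with the example above: a failed WordPress/WooCommerce login usually still returns status 200 with the login form re-rendered, so checking status_code alone can report success when the login actually failed. A rough offline sketch of a stronger check (the two HTML snippets are made-up stand-ins):

```python
from bs4 import BeautifulSoup

def looks_logged_in(html):
    # If the response still contains the login nonce field, we are
    # almost certainly looking at the login form again.
    soup = BeautifulSoup(html, 'html.parser')
    return soup.find(id='woocommerce-login-nonce') is None

# Hypothetical responses: the login form vs. an account page
login_page = '<form><input id="woocommerce-login-nonce" value="abc"/></form>'
account_page = '<div class="woocommerce-MyAccount-content">Welcome back</div>'

print(looks_logged_in(login_page))   # False
print(looks_logged_in(account_page)) # True
```

Checking for an element that only exists on the logged-in page (or inspecting r.url after redirects) is more reliable than the status code.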

How to use Python to log in to GitLab through LDAP? (with version ce-8.1.4)

My code is below:
import sys

import gitlab
import requests
from bs4 import BeautifulSoup
from fake_useragent import UserAgent

domain = "https://git.mydomian.com"
SIGN_IN_URL = '{domain}/users/sign_in'.format(domain=domain)
LOGIN_URL = '{domain}/users/auth/ldapmain/callback'.format(domain=domain)

session = requests.Session()
session.verify = False
sign_in_page = session.get(SIGN_IN_URL).text
soup = BeautifulSoup(sign_in_page, 'html.parser')
token = soup.find_all('input')[1]['value']

ua = UserAgent()
headers = {'User-Agent': str(ua.chrome)}
payload = {'username': "myusername(forLDAP)",
           'password': "mypassword(forLDAP)",
           'authenticity_token': token}
response = session.post(LOGIN_URL, data=payload, headers=headers)
print(response.text)
if response.status_code != 200:
    print('Failed to log in')
    sys.exit(1)

gl = gitlab.Gitlab(domain, ssl_verify=False, session=session)
gl.projects.list()
The response shows the login page, but gl has no data and no projects...
Can anyone help me out of this? Thanks.
I switched the python-gitlab version from 2.6 to 1.4 and then it works...
(Don't forget to add the api_version parameter.)
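Separately, grabbing the token with find_all('input')[1] is fragile: it silently breaks if the sign-in form ever adds or reorders fields. Selecting the input by name is safer. A quick offline check against a made-up stand-in for the sign-in page:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the GitLab sign-in page's form
sign_in_page = '''
<form action="/users/sign_in" method="post">
  <input name="utf8" type="hidden" value="check" />
  <input name="authenticity_token" type="hidden" value="tok-42" />
</form>
'''

soup = BeautifulSoup(sign_in_page, 'html.parser')
# Look the token up by its name attribute instead of its position
token = soup.find('input', {'name': 'authenticity_token'})['value']
print(token)  # tok-42
```

With the positional index, inserting one extra hidden input before the token would make the code post the wrong value; the named lookup is unaffected.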

How to authenticate a site with Python using Requests?

How can I get this to work? I've been trying for days and keep hitting the same dead ends. I've tried all the examples I could find to no avail.
import requests
s = requests.Session()
payload = {'login_username': 'login', 'login_password': 'password'}
url = 'http://rustorka.com/forum/login.php'
requests.post(url, data=payload)
r2 = requests.get('http://rustorka.com/forum/index.php')
print(r2.text)
s.cookies
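No answer was recorded here, but one problem is visible in the snippet itself: it creates a Session s and then calls requests.post / requests.get at module level, so the login cookie is never stored or re-sent. Every bare requests.* call builds a fresh session with an empty cookie jar, as this offline sketch shows:

```python
import requests

s = requests.Session()
# Cookies set on a session persist across that session's requests...
s.cookies.set('session_id', 'abc123')
assert s.cookies.get('session_id') == 'abc123'

# ...but a brand-new session (which is what every bare requests.get /
# requests.post call creates internally) starts with an empty jar.
fresh = requests.Session()
assert fresh.cookies.get('session_id') is None

# So the fix is to send both the login POST and the follow-up GET
# through the same object: s.post(url, data=payload), then
# s.get('http://rustorka.com/forum/index.php').
```

This is the same session-reuse pattern the other answers on this page rely on.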

Python3 requests.get ignoring part of my URL (BEAUTIFULSOUP + PYTHON WEBSCRAPING)

I'm using requests.get like so:
import urllib3
import requests
from bs4 import BeautifulSoup

urllib3.disable_warnings()

cookies = {
    'PYPF': '3OyMLS2-xJlxKilWEOSvMQXAhyCgIhvAxYfbB8S_5lGBxxAS18Z7I8Q',
    '_ga': 'GA1.2.227320333.1496647453',
    '_gat': '1',
    '_gid': 'GA1.2.75815641.1496647453'
}
params = {
    'platform': 'xbox'
}
page = requests.get("http://www.rl-trades.com/#pf=xbox", headers={'Platform': 'Xbox'},
                    verify=False, cookies=cookies, params=params).text
soup = BeautifulSoup(page, 'html.parser')
... etc.
But, from my results in testing, it seems requests.get is ignoring '/#pf=xbox' in 'http://www.rl-trades.com/#pf=xbox'.
Is this because I am having to set verify to false? What is going on here?
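Most likely this has nothing to do with verify=False: the #pf=xbox part is a URL fragment, and fragments are a client-side construct that browsers never send to the server (the site's JavaScript reads it after the page loads to filter the view). The split is easy to see with the standard library:

```python
from urllib.parse import urldefrag, urlencode

# Split the URL into the part sent to the server and the fragment
url, fragment = urldefrag("http://www.rl-trades.com/#pf=xbox")
print(url)       # http://www.rl-trades.com/
print(fragment)  # pf=xbox

# Only query strings travel to the server; whether this site honors a
# ?platform=xbox parameter server-side is an assumption to verify.
query_url = url + '?' + urlencode({'platform': 'xbox'})
print(query_url)  # http://www.rl-trades.com/?platform=xbox
```

So the server sees the same request either way; to get the filtered data you'd need to find the endpoint the page's JavaScript calls, or render the page with a browser-based tool.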
