Python requests not receiving response cookies - python-3.x

I am sending a GET request to this URL (a mobile user-agent is needed). When I send this request from my phone or in Postman, it returns a cookie called oidc.sid, but when I do the same with Python requests, no cookies come back.
Here is my requests code:
import requests

headers = {
    "user-agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 12_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/70.0.3538.75 Mobile/15E148 Safari/605.1",
}
get_resp = requests.get("https://www.uniqlo.com/ca/auth/v1/login", headers=headers)
Any help would be appreciated. Thank you

It is easy to understand why you see this: get_resp is the final response after redirects have been followed. The site sets the cookie on the first response in the chain, so it never shows up on get_resp. Setting allow_redirects=False solves the problem:
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 12_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/70.0.3538.75 Mobile/15E148 Safari/605.1",
}
get_resp = requests.get("https://www.uniqlo.com/ca/auth/v1/login", headers=headers, allow_redirects=False)
print(get_resp.cookies)
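If you do need to follow the redirects, a requests.Session is another option: it accumulates cookies from every response in the chain, and response.history exposes the intermediate responses. A minimal sketch (reusing the headers above):

import requests

session = requests.Session()
resp = session.get("https://www.uniqlo.com/ca/auth/v1/login", headers=headers)

# Intermediate (redirect) responses are kept on resp.history, so a
# cookie set during the redirect chain can still be inspected there.
for r in resp.history:
    print(r.status_code, r.cookies.get_dict())

# The session itself has collected the cookies from every hop.
print(session.cookies.get_dict())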

Related

Cannot download Excel file using Python requests: I can't get the third step, the POST request that downloads the Excel file.

Here is my attempt to download the Excel file. How do I make it work? Can someone please help me fix the last call?
import requests
from bs4 import BeautifulSoup

url = "http://lijekovi.almbih.gov.ba:8090/SpisakLijekova.aspx"
useragent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36 Edg/97.0.1072.76"
headers = {
    "User-Agent": useragent
}
session = requests.session()  # session
r = session.get(url, headers=headers)  # request to get cookies
soup = BeautifulSoup(r.text, "html.parser")  # parse the hidden form fields
viewstate = soup.find('input', {'id': '__VIEWSTATE'}).get('value')
viewstategenerator = soup.find('input', {'id': '__VIEWSTATEGENERATOR'}).get('value')
eventvalidation = soup.find('input', {'id': '__EVENTVALIDATION'}).get('value')
cookies = session.cookies.get_dict()
cookie = ""
for k, v in cookies.items():
    cookie += k + "=" + v + ";"
cookie = cookie[:-1]
# headers copied from the browser's requests
headers = {
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.9',
    'Connection': 'keep-alive',
    'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36 Edg/97.0.1072.76',
    'X-KL-Ajax-Request': 'Ajax_Request',
    'X-MicrosoftAjax': 'Delta=true',
    'X-Requested-With': 'XMLHttpRequest',
    'Cookie': cookie
}
# POST data for the grid postback
data = {
    'ctl00$smMain': 'ctl00$MainContent$ReportGrid$ctl103$ReportGrid_top_4',
    '__EVENTTARGET': 'ctl00$MainContent$ReportGrid$ctl103$ReportGrid_top_4',
    '__VIEWSTATE': viewstate,
    '__VIEWSTATEGENERATOR': viewstategenerator,
    '__EVENTVALIDATION': eventvalidation,
    '__ASYNCPOST': 'true'
}
# need help with this part: the postback carries form data, so it has
# to be a POST through the session
result = session.post(url, headers=headers, data=data)
print(result.headers)
data = {
    '__EVENTTARGET': 'ctl00$MainContent$btnExport',
    '__VIEWSTATE': viewstate,
}
# remove the AJAX headers for the last call to download the Excel file
del headers['X-KL-Ajax-Request']
del headers['X-MicrosoftAjax']
del headers['X-Requested-With']
result = session.post(url, headers=headers, data=data, allow_redirects=True)
print(result.headers)
print(result.status_code)
# print(result.text)
with open("test.xlsx", "wb") as f:
    f.write(result.content)
I am trying to export the Excel file without Selenium, but I cannot get the last step to work. I need help converting the XMLHttpRequest calls to pure Python requests, without any Selenium.
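One common stumbling block with ASP.NET WebForms pages is that __VIEWSTATE and __EVENTVALIDATION rotate on every postback, so the values parsed from the first GET are usually stale by the time the export POST is sent. A hedged sketch, not a confirmed fix, of re-parsing them before the last call, continuing from the snippet above and assuming the intermediate response is full HTML (an __ASYNCPOST delta reply embeds the new values in a pipe-delimited format instead):

# Sketch: re-extract the rotated hidden fields from the previous
# response before firing the export postback. Assumes result.text is
# full HTML rather than an __ASYNCPOST delta.
soup = BeautifulSoup(result.text, "html.parser")
data = {
    '__EVENTTARGET': 'ctl00$MainContent$btnExport',
    '__VIEWSTATE': soup.find('input', {'id': '__VIEWSTATE'}).get('value'),
    '__VIEWSTATEGENERATOR': soup.find('input', {'id': '__VIEWSTATEGENERATOR'}).get('value'),
    '__EVENTVALIDATION': soup.find('input', {'id': '__EVENTVALIDATION'}).get('value'),
}
result = session.post(url, headers=headers, data=data)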

Getting response 403 when trying to crawl; user agent doesn't work in Python 3

I'm trying to crawl this website and I get the message:
"You don't have permission to access"
Is there a way to bypass this? I have already tried user agents and urlopen.
Here is my code:
import requests
from bs4 import BeautifulSoup
import json
import pandas as pd
from urllib.request import Request, urlopen
url = 'https://www.oref.org.il/12481-he/Pakar.aspx'
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36'}
res = requests.get(url, headers=header)
soup = BeautifulSoup(res.content, 'html.parser')
print(res)
output:
<Response [403]>
I also tried this:
req = Request(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36'})
webpage = urlopen(req).read()
output:
HTTP Error 403: Forbidden
I am still blocked and get response 403. Can anyone help?
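No accepted fix is recorded here. As a hedged sketch only: one thing commonly tried before reaching for a headless browser is sending a fuller set of browser-like headers, since some firewalls check more than the User-Agent. There is no guarantee against active bot detection, and the extra header values below are illustrative, not taken from the question:

import requests

# Sketch: a fuller browser-like header set. Accept, Accept-Language and
# Referer values here are assumptions, not from the original question.
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.76 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.9',
    'Referer': 'https://www.oref.org.il/',
}
res = requests.get('https://www.oref.org.il/12481-he/Pakar.aspx', headers=headers)
print(res.status_code)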

How to access the Medium.com final URL from link.medium.com, using Axios npm

Accessing https://link.medium.com/C1hxgiphAcb in a browser redirects to https://medium.com/javascript-in-plain-english/add-size-limit-to-github-actions-551c8fe9e7d7
From the backend, I am trying to figure out the final URL, given the shortened URL.
I am using the Axios package on the backend (Node.js).
var mediumRequest = await Axios.get('https://link.medium.com/C1hxgiphAcb')
console.log(mediumRequest.request.res.responseUrl)
>> 'https://rsci.app.link/C1hxgiphAcb?_p=c21634dc9a016ceeeb1d90f4e8'
But that is not the actual final URL.
Am I missing something?
This did the job: adding a User-Agent header that represents a browser.
url = 'https://link.medium.com/C1hxgiphAcb';
headers = { 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36' };
var x = await Axios.get(url, { headers: headers } )
console.log(x.request.res.responseUrl)
Output = https://medium.com/javascript-in-plain-english/add-size-limit-to-github-actions-551c8fe9e7d7
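For comparison with the Python requests library this page is tagged for (an aside, not part of the original answer): the final URL after all redirects is exposed as response.url, and the same User-Agent trick applies. A minimal sketch:

import requests

# Sketch: resp.url holds the URL of the final response after requests
# has followed every redirect in the chain.
headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'}
resp = requests.get('https://link.medium.com/C1hxgiphAcb', headers=headers)
print(resp.url)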

Forbidden (403) when requesting a page via python3

I want to request this URL via Python 3.
This is the code:
import requests
bizportal_company_url = "https://www.bizportal.co.il/realestates/quote/generalview/373019"
page = requests.get(bizportal_company_url)
and I get:
<Response [403]>
When I add verify=False to the get command, I get:
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
<Response [403]>
How can I fix it? When I access the URL in a browser, there is no password or anything.
You are likely missing two things: a browser-like User-Agent and the cookies the site sets on its landing page. A Session carries those cookies over to the follow-up request. Try this:
import requests
headers = {
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36"
}
session = requests.Session()
response = session.get("https://www.bizportal.co.il", headers=headers)
url = "https://www.bizportal.co.il/realestates/quote/generalview/373019"
print(session.get(url, headers=headers).status_code)
This should print:
200
which means the request succeeded.
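If you're curious why the extra landing-page request matters, a quick check (a sketch, continuing from the code above) is to look at what the first response stored in the session:

# Sketch: the landing-page response seeds cookies into the session;
# those cookies ride along automatically on the follow-up request.
print(session.cookies.get_dict())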

Hangs on opening a URL with urllib (Python 3)

I am trying to open a URL with Python 3:
import urllib.request
fp = urllib.request.urlopen("http://lebed.com/")
mybytes = fp.read()
mystr = mybytes.decode("utf8")
fp.close()
print(mystr)
But it hangs on the second line.
What is the reason for this problem, and how can I fix it?
I suppose the reason is that the site does not allow robot visits. You need to fake a browser visit by sending browser-like headers along with your request:
import urllib.request

url = "http://lebed.com/"
req = urllib.request.Request(
    url,
    data=None,
    headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
    }
)
f = urllib.request.urlopen(req)
Tried this one on my system and it works.
Agree with Arpit Solanki. Shown below are the raw request headers for a failed request vs. a successful one.
Failed
GET / HTTP/1.1
Accept-Encoding: identity
Host: www.lebed.com
Connection: close
User-Agent: Python-urllib/3.5
Success
GET / HTTP/1.1
Accept-Encoding: identity
Host: www.lebed.com
Connection: close
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36
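The answer does not say how those raw request lines were captured; one way to reproduce them (a sketch, using only the standard library) is to build an opener with a debugging HTTP handler:

import urllib.request

# Sketch: HTTPHandler(debuglevel=1) makes urllib print each outgoing
# request, including its headers, to stdout.
opener = urllib.request.build_opener(urllib.request.HTTPHandler(debuglevel=1))
opener.open("http://lebed.com/")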
