I tried to execute the following code but invariably get a "TooManyRedirects" error. What am I doing incorrectly?
My code:
import requests, json
Address = '100 W Grant Street'
City = 'Orlando'
State = 'FL'
url = 'https://tools.usps.com/tools/app/ziplookup/zipByAddress'
data = {'company': '', 'address1': Address, 'address2': '', 'city': City, 'state': State, 'zip': ''}
raw = requests.post(url, data=data)
Here's the massive error message I get:
Traceback (most recent call last):
File "<pyshell#1347>", line 1, in <module>
raw = requests.post(url, data=data)
File "C:\Users\Karun\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 112, in post
return request('post', url, data=data, json=json, **kwargs)
File "C:\Users\Karun\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\Karun\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 508, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\Karun\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 640, in send
history = [resp for resp in gen] if allow_redirects else []
File "C:\Users\Karun\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 640, in <listcomp>
history = [resp for resp in gen] if allow_redirects else []
File "C:\Users\Karun\AppData\Local\Programs\Python\Python36\lib\site-packages\requests\sessions.py", line 140, in resolve_redirects
raise TooManyRedirects('Exceeded %s redirects.' % self.max_redirects, response=resp)
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
This particular URL, for some reason, also requires a User-Agent header in the requests.post call. With that added, I get an appropriate response. Here's the new code:
import requests
s = requests.Session()
url = 'https://tools.usps.com/tools/app/ziplookup/zipByAddress'
payload = {'companyName':'', 'address1':'10570 Main St', 'address2':'', 'city':'Fairfax', 'state':'VA', 'zip':''}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'}
r = s.post(url, data=payload, headers=headers)
Related
I am a newbie to Python, trying to extract data from APIs. When I test locally with Postman, the data comes back fine, but when I use Python requests I get a connection-aborted error. Can someone please help me understand this issue?
Below is the code that I have tried:
import requests
from requests import request
url = "https://abcd/smart_general_codes?category=BANK"
payload = {}
headers = {
    'TenantId': 'IN0XXX',
    'Accept-Language': 'en_us',
    'Transfer-Encoding': 'chunked',
    'fileType': 'json',
    'Authorization': 'Basic XXXXXXXXXX'
}
response = requests.get(url, headers=headers, data=payload, verify=False)
print(response.status_code)
print(response.text)
Code2:
import http.client
conn = http.client.HTTPSConnection("main.com")
payload = ''
headers = {
    'powerpayTenantId': 'IN0XXX',
    'Accept-Language': 'en_us',
    'Transfer-Encoding': 'chunked',
    'fileType': 'json',
    'Authorization': 'Basic XXXXXXXXXX'
}
conn.request("GET", "/abcd/smart_general_codes?category=BANK", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))
Both the http.client and requests versions throw the error below:
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "login_2.py", line 20, in <module>
response = requests.get(url, headers=headers, data=payload, verify=False)
File "/usr/lib/python3/dist-packages/requests/api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "/usr/lib/python3/dist-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 520, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 630, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 490, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
I have solved the issue: in Postman the Accept-Language header showed en_us, but updating it to en_US worked.
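For reference, here is what the corrected request looks like (the header names are the ones from my question, with the placeholder values kept; fetch_codes is just an illustrative wrapper):

```python
import requests

# Headers from the question, with the Accept-Language value corrected:
# this server matched the language tag case-sensitively, so 'en_us' was
# rejected while 'en_US' is accepted.
headers = {
    'TenantId': 'IN0XXX',
    'Accept-Language': 'en_US',  # was 'en_us'
    'fileType': 'json',
    'Authorization': 'Basic XXXXXXXXXX',
}

def fetch_codes(url):
    # Plain GET with no request body; verify=False kept from the question
    # (only appropriate when the endpoint's certificate cannot be validated).
    return requests.get(url, headers=headers, verify=False)
```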
Try adding a fake user-agent (such as Chrome's) and, if needed, your cookies to the headers, like this:
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
    'cookie': '<Your cookie>'
}
By the way, you can get your cookies by typing document.cookie in your browser's developer console.
Please also remove the data=payload from the GET request and try again.
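Why removing the body can help: some servers close the connection when a GET request carries a payload. You can see exactly what requests would send, without any network traffic, by building a prepared request (the URL below is a placeholder):

```python
import requests

url = 'https://example.com/api'  # placeholder

# A GET with form data attached carries a body, which some servers
# reject by closing the connection.
with_body = requests.Request('GET', url, data={'category': 'BANK'}).prepare()
print(with_body.body)  # 'category=BANK'

# The same GET without data= sends no body at all.
without_body = requests.Request('GET', url).prepare()
print(without_body.body)  # None
```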
I have stumbled upon an error which I cannot resolve. When I run my Python code, this error occurs, and only when I am making an API call using the requests package.
Code calling the API:
import json
import requests

def getAccs(id):
    accountid = ''
    url = "{}/{}".format(acc_api, id)
    req = requests.get(url, headers=head)
    result = json.loads(req.text)
    if result['id'] is None:
        # Fetches accountid from another API call after updating
        accountid = updateCorp(result['name'], id)
    else:
        accountid = result['id']
    return accountid

if __name__ == "__main__":
    ### Get data from appSettings.json
    with open('appSettingsStg.json') as app:
        data = json.load(app)
    acc_api = data['Urls']['Accounts']
    # Header
    head = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36",
        "Content-Type": "application/json"
    }
Error:
Traceback (most recent call last):
File "insert.py", line 313, in <module>
acc.append(res)
File "insert.py", line 102, in getAccs
req = requests.get(url, headers=head)
File "/home/dev/.local/lib/python3.7/site-packages/requests/api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "/home/dev/.local/lib/python3.7/site-packages/requests/api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "/home/dev/.local/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
resp = self.send(prep, **send_kwargs)
File "/home/dev/.local/lib/python3.7/site-packages/requests/sessions.py", line 685, in send
r.content
File "/home/dev/.local/lib/python3.7/site-packages/requests/models.py", line 829, in content
self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
File "/home/dev/.local/lib/python3.7/site-packages/requests/models.py", line 751, in generate
for chunk in self.raw.stream(chunk_size, decode_content=True):
File "/home/dev/.local/lib/python3.7/site-packages/urllib3/response.py", line 571, in stream
for line in self.read_chunked(amt, decode_content=decode_content):
File "/home/dev/.local/lib/python3.7/site-packages/urllib3/response.py", line 738, in read_chunked
self._init_decoder()
File "/home/dev/.local/lib/python3.7/site-packages/urllib3/response.py", line 376, in _init_decoder
self._decoder = _get_decoder(content_encoding)
File "/home/dev/.local/lib/python3.7/site-packages/urllib3/response.py", line 147, in _get_decoder
return GzipDecoder()
File "/home/dev/.local/lib/python3.7/site-packages/urllib3/response.py", line 74, in __init__
self._obj = zlib.decompressobj(16 + zlib.MAX_WBITS)
ValueError: Invalid initialization option
Some things I have tried:
Importing zlib and urllib3 (as mentioned in this post)
Upgrading the requests package (v2.24.0 as of 25/08/2020)
Reinstalling requests and urllib3 (didn't work, but I thought it was at least worth a try)
Any advice is much appreciated!
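One way to narrow this down (a diagnostic sketch, not a fix): the traceback ends in zlib.decompressobj(16 + zlib.MAX_WBITS), which is exactly how urllib3 initializes its gzip decoder. Running that single call directly in the failing interpreter shows whether the interpreter's zlib module itself is broken, independent of requests and urllib3:

```python
import zlib

# urllib3's GzipDecoder is initialized with this exact call; if it raises
# ValueError here too, the problem is the Python build's zlib module
# (e.g. a mismatched zlib shared library in the environment), not
# requests or urllib3.
try:
    decoder = zlib.decompressobj(16 + zlib.MAX_WBITS)
    print('zlib OK:', decoder is not None)
except ValueError as exc:
    print('broken zlib:', exc)
```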
I'm writing an asynchronous scraper for RSS feeds, and sometimes the following error occurs with some sites, for example:
In [1]: import requests_async as requests
In [2]: headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36'}
In [3]: r = await requests.get('https://albumorientedpodcast.com/category/album-oriented/feed/', headers=headers)
Here is the full traceback of this error:
Traceback (most recent call last):
File "rss_parser.py", line 55, in rss_downloader
response = await requests.get(rss, headers=headers)
File "C:\Python3\lib\site-packages\requests_async\api.py", line 11, in get
return await request("get", url, params=params, **kwargs)
File "C:\Python3\lib\site-packages\requests_async\api.py", line 6, in request
return await session.request(method=method, url=url, **kwargs)
File "C:\Python3\lib\site-packages\requests_async\sessions.py", line 79, in request
resp = await self.send(prep, **send_kwargs)
File "C:\Python3\lib\site-packages\requests_async\sessions.py", line 157, in send
async for resp in self.resolve_redirects(r, request, **kwargs):
File "C:\Python3\lib\site-packages\requests_async\sessions.py", line 290, in resolve_redirects
resp = await self.send(
File "C:\Python3\lib\site-packages\requests_async\sessions.py", line 136, in send
r = await adapter.send(request, **kwargs)
File "C:\Python3\lib\site-packages\requests_async\adapters.py", line 48, in send
response = await self.pool.request(
File "C:\Python3\lib\site-packages\http3\interfaces.py", line 49, in request
return await self.send(request, verify=verify, cert=cert, timeout=timeout)
File "C:\Python3\lib\site-packages\http3\dispatch\connection_pool.py", line 130, in send
raise exc
File "C:\Python3\lib\site-packages\http3\dispatch\connection_pool.py", line 120, in send
response = await connection.send(
File "C:\Python3\lib\site-packages\http3\dispatch\connection.py", line 56, in send
response = await self.h2_connection.send(request, timeout=timeout)
File "C:\Python3\lib\site-packages\http3\dispatch\http2.py", line 52, in send
status_code, headers = await self.receive_response(stream_id, timeout)
File "C:\Python3\lib\site-packages\http3\dispatch\http2.py", line 126, in receive_response
event = await self.receive_event(stream_id, timeout)
File "C:\Python3\lib\site-packages\http3\dispatch\http2.py", line 159, in receive_event
events = self.h2_state.receive_data(data)
File "C:\Python3\lib\site-packages\h2\connection.py", line 1463, in receive_data
events.extend(self._receive_frame(frame))
File "C:\Python3\lib\site-packages\h2\connection.py", line 1486, in _receive_frame
frames, events = self._frame_dispatch_table[frame.__class__](frame)
File "C:\Python3\lib\site-packages\h2\connection.py", line 1560, in _receive_headers_frame
frames, stream_events = stream.receive_headers(
File "C:\Python3\lib\site-packages\h2\stream.py", line 1055, in receive_headers
events[0].headers = self._process_received_headers(
File "C:\Python3\lib\site-packages\h2\stream.py", line 1298, in _process_received_headers
return list(headers)
File "C:\Python3\lib\site-packages\h2\utilities.py", line 335, in _reject_pseudo_header_fields
for header in headers:
File "C:\Python3\lib\site-packages\h2\utilities.py", line 291, in _reject_connection_header
for header in headers:
File "C:\Python3\lib\site-packages\h2\utilities.py", line 275, in _reject_te
for header in headers:
File "C:\Python3\lib\site-packages\h2\utilities.py", line 264, in _reject_surrounding_whitespace
raise ProtocolError(
h2.exceptions.ProtocolError: Received header value surrounded by whitespace b'3.vie _dca '
At the same time, the same site loads normally through the regular requests library:
In [1]: import requests
In [2]: headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36'}
In [3]: r = requests.get('https://albumorientedpodcast.com/category/album-oriented/feed/', headers=headers)
In [4]: r
Out[4]: <Response [200]>
I tried to find at least some information on this error, but found nothing. Can someone tell me what I can do to avoid this error and load the site normally?
requests-async has been archived, but its GitHub page contains a link to its successor, httpx.
httpx has similar syntax and is actively maintained.
Consider trying it: many bugs may have been fixed there.
I am trying to connect to a .onion site using python. I have tor running on port 9050 and I am getting the following error:
Traceback (most recent call last):
File "/Users/jane/code/test/test.py", line 15, in main
res = await fetch(session, id)
File "/Users/jane/code/test/test.py", line 9, in fetch
async with session.get(url) as res:
File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 1005, in __aenter__
self._resp = await self._coro
File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 476, in _request
timeout=real_timeout
File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 522, in connect
proto = await self._create_connection(req, traces, timeout)
File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 854, in _create_connection
req, traces, timeout)
File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 959, in _create_direct_connection
raise ClientConnectorError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host intelex7ny6coqno.onion:80 ssl:None [nodename nor servname provided, or not known]
The code:
import asyncio
import aiohttp
from aiohttp_socks import SocksConnector
async def fetch(session, id):
    print('Starting {}'.format(id))
    url = 'http://intelex7ny6coqno.onion/topic/{}'.format(id)
    async with session.get(url) as res:
        return await res.text()

async def main(id):
    connector = SocksConnector.from_url('socks5://localhost:9050')
    async with aiohttp.ClientSession(connector=connector) as session:
        res = await fetch(session, id)
        print(res)

if __name__ == '__main__':
    ids = ['10', '11', '12']
    loop = asyncio.get_event_loop()
    future = [asyncio.ensure_future(main(id)) for id in ids]
    loop.run_until_complete(asyncio.wait(future))
This code works fine:
import requests
session = requests.session()
session.proxies['http'] = 'socks5h://localhost:9050'
session.proxies['https'] = 'socks5h://localhost:9050'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; rv:60.0) Gecko/20100101 Firefox/60.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
}
res = session.get(url, headers=headers)
print(res)
Why am I getting Cannot connect to host intelex7ny6coqno.onion:80 ssl:None [nodename nor servname provided, or not known]?
What am I missing here?
By default, aiohttp appears to use the local DNS resolver to resolve hostnames, which cannot resolve .onion addresses. When using socks5h with requests, you get DNS resolution over SOCKS (Tor) instead.
Adding rdns=True appears to work for .onion addresses:
connector = SocksConnector.from_url('socks5://localhost:9050', rdns=True)
I'm trying to get a website's source data after logging in, but am having trouble logging in to reach the source. The url is the webpage I see after logging in; i.e., if I log in on Chrome, I can use url to go to where I need to get the source data.
I keep getting multiple errors, primarily handshake errors: "sslv3 alert handshake failure", "bad handshake", and "urllib3.exceptions.MaxRetryError". I think the primary error is:
Traceback (most recent call last):
File "C:\Users\bwayne\AppData\Local\Programs\Python\Python36-32\lib\site-packages\urllib3\contrib\pyopenssl.py", line 441, in wrap_socket
cnx.do_handshake()
File "C:\Users\bwayne\AppData\Local\Programs\Python\Python36-32\lib\site-packages\OpenSSL\SSL.py", line 1716, in do_handshake
self._raise_ssl_error(self._ssl, result)
File "C:\Users\bwayne\AppData\Local\Programs\Python\Python36-32\lib\site-packages\OpenSSL\SSL.py", line 1456, in _raise_ssl_error
_raise_current_error()
File "C:\Users\bwayne\AppData\Local\Programs\Python\Python36-32\lib\site-packages\OpenSSL\_util.py", line 54, in exception_from_error_queue
raise exception_type(errors)
OpenSSL.SSL.Error: [('SSL routines', 'ssl3_read_bytes', 'sslv3 alert handshake failure')]
During handling of the above exception, another exception occurred:
import requests, sys
import ssl
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.poolmanager import PoolManager

ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
ctx.options |= ssl.OP_NO_SSLv2
ctx.options |= ssl.OP_NO_SSLv3
ctx.options |= ssl.OP_NO_TLSv1
ctx.options |= ssl.OP_NO_TLSv1_1

class Ssl3HttpAdapter(HTTPAdapter):
    def init_poolmanager(self, connections, maxsize, block=False):
        self.poolmanager = PoolManager(num_pools=connections,
                                       maxsize=maxsize,
                                       block=block,
                                       ssl_version=ssl.PROTOCOL_TLSv1)

url = "https://www.thewebsite.com"

def do_requests(url):
    payload = {'Username': 'myName', 'Password': 'myPass'}
    headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Mobile Safari/537.36'}
    with requests.Session() as s:
        s.mount(url, Ssl3HttpAdapter())
        p = s.post(url, headers=headers, data=payload, verify=False)

def main(url):
    do_requests(url)

main(url)
How can I log in? I've double- and triple-checked that the HTML field names are correct.