Using proxies with TLS in python - python-3.x

I am currently trying to force TLS 1.2 (and newer) with python requests by disabling TLS 1.0 and 1.1. So far I've been using this snippet:
import ssl

import requests
from requests.adapters import HTTPAdapter
from urllib3.poolmanager import PoolManager
from urllib3.util import ssl_

CIPHERS = (
    'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-SHA256:AES256-SHA'
)

class TlsAdapter(HTTPAdapter):
    def __init__(self, ssl_options=0, **kwargs):
        self.ssl_options = ssl_options
        super(TlsAdapter, self).__init__(**kwargs)

    def init_poolmanager(self, *pool_args, **pool_kwargs):
        ctx = ssl_.create_urllib3_context(ciphers=CIPHERS, cert_reqs=ssl.CERT_REQUIRED, options=self.ssl_options)
        self.poolmanager = PoolManager(*pool_args, ssl_context=ctx, **pool_kwargs)

# (the following runs inside an enclosing class, hence self.session)
self.session = requests.session()
adapter = TlsAdapter(ssl.OP_NO_TLSv1 | ssl.OP_NO_TLSv1_1)
self.session.mount("https://", adapter)
It works fine if I run it without an HTTP proxy, but does not seem to work/create a TLS connection when I do set an HTTP proxy in my session, like so:
proxies = {
    'http': 'http://{}:{}@{}:{}'.format(user, password, ip, port),
    'https': 'http://{}:{}@{}:{}'.format(user, password, ip, port)
}
self.session.proxies.update(proxies = proxies)
Any ideas on how I could get it to still work with proxies? I am not too sure what I am doing wrong/need to change. Thank you!

It's the way the proxies are being set. Calling dict.update(proxies = proxies) just adds a key literally named 'proxies' to the dictionary, so the current code isn't setting the proxy at all.
Instead of using this:
self.session.proxies.update(proxies = proxies)
It should be:
self.session.proxies.update(proxies)
https://2.python-requests.org/en/master/user/advanced/#proxies
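Putting it together with the adapter from the question, the corrected setup would look something like this (a sketch; user, password, ip, and port are placeholders):

session = requests.session()
session.mount("https://", TlsAdapter(ssl.OP_NO_TLSv1 | ssl.OP_NO_TLSv1_1))
proxies = {
    'http': 'http://{}:{}@{}:{}'.format(user, password, ip, port),
    'https': 'http://{}:{}@{}:{}'.format(user, password, ip, port),
}
session.proxies.update(proxies)  # positional argument, not proxies=proxies
r = session.get('https://example.org')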

Related

Python module that will handle connections and add in a Proxy? [duplicate]

I'd like to manually (using the socket and ssl modules) make an HTTPS request through a proxy which itself uses HTTPS.
I can perform the initial CONNECT exchange just fine:
import ssl, socket

PROXY_ADDR = ("proxy-addr", 443)
CONNECT = "CONNECT example.com:443 HTTP/1.1\r\n\r\n"

sock = socket.create_connection(PROXY_ADDR)
sock = ssl.wrap_socket(sock)
sock.sendall(CONNECT)
s = ""
while s[-4:] != "\r\n\r\n":
    s += sock.recv(1)
print repr(s)
The above code prints HTTP/1.1 200 Connection established plus some headers, which is what I expect. So now I should be ready to make the request, e.g.
sock.sendall("GET / HTTP/1.1\r\n\r\n")
but the above code returns
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
Reason: You're speaking plain HTTP to an SSL-enabled server port.<br />
Instead use the HTTPS scheme to access this URL, please.<br />
</body></html>
This makes sense too, since I still need to do an SSL handshake with the example.com server to which I'm tunneling. However, if instead of immediately sending the GET request I say
sock = ssl.wrap_socket(sock)
to do the handshake with the remote server, then I get an exception:
Traceback (most recent call last):
  File "so_test.py", line 18, in <module>
    ssl.wrap_socket(sock)
  File "/usr/lib/python2.6/ssl.py", line 350, in wrap_socket
    suppress_ragged_eofs=suppress_ragged_eofs)
  File "/usr/lib/python2.6/ssl.py", line 118, in __init__
    self.do_handshake()
  File "/usr/lib/python2.6/ssl.py", line 293, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLError: [Errno 1] _ssl.c:480: error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol
So how can I do the SSL handshake with the remote example.com server?
EDIT: I'm pretty sure that no additional data is available before my second call to wrap_socket because calling sock.recv(1) blocks indefinitely.
This should work if the CONNECT string is rewritten as follows:
CONNECT = "CONNECT %s:%s HTTP/1.0\r\nConnection: close\r\n\r\n" % (server, port)
Not sure why this works, but maybe it has something to do with the proxy I'm using. Here's some example code:
from OpenSSL import SSL
import socket

def verify_cb(conn, cert, errun, depth, ok):
    return True

server = 'mail.google.com'
port = 443
PROXY_ADDR = ("proxy.example.com", 3128)
CONNECT = "CONNECT %s:%s HTTP/1.0\r\nConnection: close\r\n\r\n" % (server, port)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(PROXY_ADDR)
s.send(CONNECT)
print s.recv(4096)

ctx = SSL.Context(SSL.SSLv23_METHOD)
ctx.set_verify(SSL.VERIFY_PEER, verify_cb)
ss = SSL.Connection(ctx, s)
ss.set_connect_state()
ss.do_handshake()
cert = ss.get_peer_certificate()
print cert.get_subject()
ss.shutdown()
ss.close()
Note how the socket is first opened and the open socket is then placed in the SSL context, after which I manually initiate the SSL handshake. And the output:
HTTP/1.1 200 Connection established
<X509Name object '/C=US/ST=California/L=Mountain View/O=Google Inc/CN=mail.google.com'>
It's based on pyOpenSSL because I needed to fetch invalid certificates too, and Python's built-in ssl module will always try to verify any certificate it receives.
Judging from the APIs of the OpenSSL and GnuTLS libraries, stacking an SSLSocket onto an SSLSocket is actually not straightforwardly possible: they provide special read/write functions to implement the encryption, and they are not able to use those functions themselves when wrapping a pre-existing SSLSocket.
The error is therefore caused by the inner SSLSocket reading directly from the system socket rather than from the outer SSLSocket. That sends data which does not belong to the outer SSL session, which ends badly and certainly never returns a valid ServerHello.
Concluding from that, I would say there is no simple way to implement what you (and, actually, I myself) would like to accomplish.
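That said, modern Python 3 exposes ssl.MemoryBIO and SSLContext.wrap_bio, which decouple the TLS engine from the socket, so the inner handshake can be pumped through the outer SSLSocket by hand. A rough sketch of that idea, assuming Python 3 (this is not from the original answers):

import ssl

def wrap_inner(outer_sock, hostname):
    # Run a second TLS session entirely in memory and shuttle its
    # ciphertext through the already-encrypted outer socket.
    ctx = ssl.create_default_context()
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    while True:
        try:
            tls.do_handshake()
            break
        except ssl.SSLWantReadError:
            # Send whatever the inner engine produced, then feed it the reply.
            outer_sock.sendall(outgoing.read())
            incoming.write(outer_sock.recv(4096))
    if outgoing.pending:
        outer_sock.sendall(outgoing.read())
    return tls  # an ssl.SSLObject: use .write()/.read() with the same pumping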
Finally I got somewhere by expanding on @kravietz's and @02strich's answers.
Here's the code
import threading
import select
import socket
import ssl

server = 'mail.google.com'
port = 443
PROXY = ("localhost", 4433)
CONNECT = "CONNECT %s:%s HTTP/1.0\r\nConnection: close\r\n\r\n" % (server, port)

class ForwardedSocket(threading.Thread):
    def __init__(self, s, **kwargs):
        threading.Thread.__init__(self)
        self.dest = s
        self.oursraw, self.theirsraw = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        self.theirs = socket.socket(_sock=self.theirsraw)
        self.start()
        self.ours = ssl.wrap_socket(socket.socket(_sock=self.oursraw), **kwargs)

    def run(self):
        rl, wl, xl = select.select([self.dest, self.theirs], [], [], 1)
        print rl, wl, xl
        # FIXME: write may block
        if self.theirs in rl:
            self.dest.send(self.theirs.recv(4096))
        if self.dest in rl:
            self.theirs.send(self.dest.recv(4096))

    def recv(self, *args):
        return self.ours.recv(*args)

    def send(self, *args):
        return self.ours.send(*args)

def test():
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(PROXY)
    s = ssl.wrap_socket(s, ciphers="ALL:aNULL:eNULL")
    s.send(CONNECT)
    resp = s.read(4096)
    print (resp, )

    fs = ForwardedSocket(s, ciphers="ALL:aNULL:eNULL")
    fs.send("foobar")
Don't mind the custom ciphers=; that's only because I didn't want to deal with certificates.
And here's the depth-1 ssl output, showing the CONNECT, my response to it ("sagq"), and the depth-2 ssl negotiation plus binary rubbish:
[dima#bmg ~]$ openssl s_server -nocert -cipher "ALL:aNULL:eNULL"
Using default temp DH parameters
Using default temp ECDH parameters
ACCEPT
-----BEGIN SSL SESSION PARAMETERS-----
MHUCAQECAgMDBALAGQQgmn6XfJt8ru+edj6BXljltJf43Sz6AmacYM/dSmrhgl4E
MOztEauhPoixCwS84DL29MD/OxuxuvG5tnkN59ikoqtfrnCKsk8Y9JtUU9zuaDFV
ZaEGAgRSnJ81ogQCAgEspAYEBAEAAAA=
-----END SSL SESSION PARAMETERS-----
Shared ciphers: [snipped]
CIPHER is AECDH-AES256-SHA
Secure Renegotiation IS supported
CONNECT mail.google.com:443 HTTP/1.0
Connection: close
sagq
�u\�0�,�(�$��
�"�!��kj98���� �m:��2�.�*�&���=5�����
��/�+�'�#�� ����g#32��ED���l4�F�1�-�)�%���</�A������
�� ������
�;��A��q�J&O��y�l
It doesn't sound like there's anything wrong with what you're doing; it's certainly possible to call wrap_socket() on an existing SSLSocket.
The 'unknown protocol' error can occur (amongst other reasons) if there's extra data waiting to be read on the socket at the point you call wrap_socket(), for instance an extra \r\n or an HTTP error (due to a missing cert on the server end, for instance). Are you certain you've read everything available at that point?
If you can force the first SSL channel to use a "plain" RSA cipher (i.e. non-Diffie-Hellman) then you may be able to use Wireshark to decrypt the stream to see what's going on.
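For instance, the outer wrap could pin a static-RSA suite (a sketch; assumes both ends still accept this legacy cipher):

# TLS_RSA_WITH_AES_256_CBC_SHA ("AES256-SHA") uses plain RSA key exchange,
# so the capture can later be decrypted in Wireshark with the server's private key.
sock = ssl.wrap_socket(sock, ciphers="AES256-SHA")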
Building on @kravietz's answer, here is a version that works in Python 3 through a Squid proxy:
from OpenSSL import SSL
import socket

def verify_cb(conn, cert, errun, depth, ok):
    return True

server = 'mail.google.com'
port = 443
PROXY_ADDR = ("<proxy_server>", 3128)
CONNECT = "CONNECT %s:%s HTTP/1.0\r\nConnection: close\r\n\r\n" % (server, port)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(PROXY_ADDR)
s.send(str.encode(CONNECT))
s.recv(4096)

ctx = SSL.Context(SSL.SSLv23_METHOD)
ctx.set_verify(SSL.VERIFY_PEER, verify_cb)
ss = SSL.Connection(ctx, s)
ss.set_connect_state()
ss.do_handshake()
cert = ss.get_peer_certificate()
print(cert.get_subject())
ss.shutdown()
ss.close()
This works in Python 2 also.

Python smpplib truncating smpp credentials

I am using the Python SMPP lib to send an SMS. When I try to connect to the SMPP server with longer credentials, the username and password get truncated and authorisation fails.
Auth failing case:
- Username/password longer than 16 characters
Passing case:
- Username/password not longer than 16 characters
Because of the above observation I am sure there is no issue with the SMPP gateway itself. The gateway I am communicating with accepts a username/password of any length.
The following is my code, which wraps the smpplib into a custom class:

import smpplib
import smpplib.consts
import smpplib.gsm
from smpplib.client import Client

class SmppClientConfig(object):
    def __init__(self, host, port, username, password, source, target, on_delivered_cb, on_sent_cb):
        self.HOST = host
        self.PORT = port
        self.USERNAME = username
        self.PASSWORD = password
        self.SOURCE = source
        self.TARGET = target
        self.ON_RECEIVED_CALLBACK = on_sent_cb
        self.ON_DELIVERED_CALLBACK = on_delivered_cb

class SmppSenderClient(object):
    def __init__(self, config: SmppClientConfig):
        print('Creating SMPP client config with host: ' + config.HOST + ' port: ' + str(config.PORT))
        self._config = config
        self._client = Client(config.HOST, config.PORT)
        self._on_delivered_callback = config.ON_DELIVERED_CALLBACK
        self._on_received_callback = config.ON_RECEIVED_CALLBACK
        self._init_client()

    def _init_client(self):
        print('Initializing SmppSender client with username: ' + self._config.USERNAME)
        self._register_events()
        self._client.connect()
        self._client.bind_transmitter(system_id=self._config.USERNAME, password=self._config.PASSWORD)

    def _register_events(self):
        print('Registering Smpp events')
        self._client.set_message_sent_handler(self._config.ON_DELIVERED_CALLBACK)
        self._client.set_message_received_handler(self._config.ON_RECEIVED_CALLBACK)

    def send_message(self, message: str):
        print('Sending SMS message to target: ' + self._config.TARGET)
        parts, encoding_flag, msg_type_flag = smpplib.gsm.make_parts(message)
        for part in parts:
            self._client.send_message(
                source_addr_ton=smpplib.consts.SMPP_TON_INTL,
                source_addr_npi=smpplib.consts.SMPP_NPI_ISDN,
                source_addr=self._config.SOURCE,
                dest_addr_ton=smpplib.consts.SMPP_TON_INTL,
                dest_addr_npi=smpplib.consts.SMPP_NPI_ISDN,
                destination_addr=self._config.TARGET,
                short_message=part,
                data_coding=encoding_flag,
                esm_class=msg_type_flag,
                registered_delivery=True,
            )
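For reference, a hypothetical way to drive this wrapper (host, port, credentials, and phone numbers are placeholders; print stands in for real callbacks):

config = SmppClientConfig('smpp.example.com', 2775, 'longusername18chars', 'secret',
                          source='12345', target='447700900000',
                          on_delivered_cb=print, on_sent_cb=print)
client = SmppSenderClient(config)
client.send_message('Hello from smpplib')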
I am not sure if it's expected behaviour of the library or a limitation. I have tried to find the documentation for this lib but could not find anything other than this.
Please advise if you have experienced a similar issue, whether any workaround is possible, or whether this is expected behaviour in the SMPP protocol (which seems very unlikely).
Update:
I found the limitation in the source code:
class BindTransmitter(Command):
    """Bind as a transmitter command"""
    params = {
        'system_id': Param(type=str, max=16),
        'password': Param(type=str, max=9),
        'system_type': Param(type=str, max=13),
        'interface_version': Param(type=int, size=1),
        'addr_ton': Param(type=int, size=1),
        'addr_npi': Param(type=int, size=1),
        'address_range': Param(type=str, max=41),
    }

    # Order is important, but params dictionary is unordered
    params_order = (
        'system_id', 'password', 'system_type',
        'interface_version', 'addr_ton', 'addr_npi', 'address_range',
    )

    def __init__(self, command, **kwargs):
        super(BindTransmitter, self).__init__(command, need_sequence=False, **kwargs)
        self._set_vars(**(dict.fromkeys(self.params)))
        self.interface_version = consts.SMPP_VERSION_34
As you can see, the BindTransmitter constructor (__init__) declares system_id with a maximum length of 16 and password with a maximum of 9, so longer values get truncated. Not sure why this was done this way.
I checked the official SMPP protocol spec. According to the spec, system_id may be at most 16 characters and password at most 9 for connection type: transmitter.
So in conclusion, the library implementation of the spec is correct.
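Given those limits, a caller can fail fast instead of letting the library truncate credentials silently. A minimal pre-bind check (limits copied from the Param declarations above):

MAX_SYSTEM_ID = 16  # from BindTransmitter's Param declarations
MAX_PASSWORD = 9

def check_bind_credentials(system_id, password):
    # Raise before binding rather than authenticating with truncated values.
    if len(system_id) > MAX_SYSTEM_ID:
        raise ValueError('system_id longer than %d characters' % MAX_SYSTEM_ID)
    if len(password) > MAX_PASSWORD:
        raise ValueError('password longer than %d characters' % MAX_PASSWORD)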

Set a passthrough using browsermob proxy?

So I'm using browsermob proxy to log selenium tests in and get past IAP for Google Cloud. But that only gets the user to the site; they still need to log in to the site itself using a Firebase login form. IAP has me adding an Authorization header through browsermob so you can reach the site, but when you try to log in through the Firebase form you get a 401 error: "Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential."
I thought I could get around this using the whitelist or blacklist feature to just not apply those headers to URLs related to the Firebase login, but it seems that whitelist and blacklist just return status codes and empty responses for calls that match the regex.
Is there a way to just pass through certain calls based on the host? Or, on the off chance I'm doing something wrong, let me know. Code below:
import time
import unittest

from browsermobproxy import Server
from selenium import webdriver

class ExampleTest(unittest.TestCase):
    def setUp(self):
        server = Server("env/bin/browsermob-proxy/bin/browsermob-proxy")
        server.start()
        proxy = server.create_proxy()

        bearer_header = {}
        bearer_header['Authorization'] = 'Bearer xxxxxxxxexamplexxxxxxxx'
        proxy.headers({"Authorization": bearer_header["Authorization"]})

        profile = webdriver.FirefoxProfile()
        proxy_info = proxy.selenium_proxy()
        profile.set_proxy(proxy_info)

        proxy.whitelist('.*googleapis.*, .*google.com.*', 200)  # This fakes 200 from urls on regex match
        # proxy.blacklist('.*googleapis.*', 200)  # This fakes 200 from urls if not regex match

        self.driver = webdriver.Firefox(firefox_profile=profile)
        proxy.new_har("file-example")

    def test_wait(self):
        self.driver.get("https://example.com/login/")
        time.sleep(3)

    def tearDown(self):
        self.driver.close()
Figured this out a bit later. There isn't anything built into the BrowserMob proxy/client to do this, but you can achieve it through webdriver's proxy settings.
Chrome
self.chrome_options = webdriver.ChromeOptions()
proxy_address = '{}:{}'.format(server.host, proxy.port)
self.chrome_options.add_argument('--proxy-server=%s' % proxy_address)
no_proxy = ['google.com', 'example.com']  # bypass hosts; left undefined in the original snippet
no_proxy_string = ''
for item in no_proxy:
    no_proxy_string += '*' + item + ';'
self.chrome_options.add_argument('--proxy-bypass-list=%s' % no_proxy_string)
self.desired_capabilities = webdriver.DesiredCapabilities.CHROME
self.desired_capabilities['acceptInsecureCerts'] = True
Firefox
self.desired_capabilities = webdriver.DesiredCapabilities.FIREFOX
proxy_address = '{}:{}'.format(server.host, proxy.port)
self.desired_capabilities['proxy'] = {
    'proxyType': "MANUAL",
    'httpProxy': proxy_address,
    'sslProxy': proxy_address,
    'noProxy': ['google.com', 'example.com']
}
self.desired_capabilities['acceptInsecureCerts'] = True
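Either capabilities dict then gets passed when creating the driver; with Selenium 3-style constructors that would look something like this (a sketch):

self.driver = webdriver.Firefox(capabilities=self.desired_capabilities)
# or, for Chrome:
self.driver = webdriver.Chrome(options=self.chrome_options,
                               desired_capabilities=self.desired_capabilities)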

Get requests body using selenium and proxies

I want to be able to get the body of a specific subrequest using Selenium behind a proxy.
Right now I'm using python + selenium + chromedriver. With logging I'm able to get each subrequest's headers, but not the body. My logging settings:
caps['loggingPrefs'] = {'performance': 'ALL',
                        'browser': 'ALL'}
caps['perfLoggingPrefs'] = {"enableNetwork": True,
                            "enablePage": True,
                            "enableTimeline": True}
I know there are several options to form a HAR with selenium:

- Use geckodriver and har-export-trigger. I tried to make it work with the following code:

window.foo = HAR.triggerExport().then(harLog => { return(harLog); });
return window.foo;

Unfortunately, I don't see the body of the response in the returned data.

- Use browsermob proxy. The solution seems totally fine, but I didn't find a way to make browsermob proxy itself work from behind another proxy.
So the question is: how can I get the body of the specific network response on the request made during the downloading of the webpage with selenium AND use proxies.
UPD: Actually, with har-export-trigger I do get response bodies, but not all of them: the response body I need is JSON, its MIME type is 'text/html; charset=utf-8', and it is missing from the HAR file I generate, so the solution is still missing.
UPD2: After further investigation, I realized that the response body is missing even on my desktop Firefox when the har-export-trigger add-on is turned on, so this solution may be a dead end (issue on GitHub).
UPD3: This bug can be seen only with the latest version of har-export-trigger. With version 0.6.0 everything works just fine.
So, for future googlers: you may use har-export-trigger v0.6.0 or the approach from the accepted answer.
I actually just finished implementing a selenium HAR script with the tools you mentioned in the question. The HARs from both har-export-trigger and BrowserMob were verified with Google HAR Analyser.
A class using selenium, gecko driver and har-export-trigger:
import json
import os

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.firefox.options import Options as FirefoxOptions

import my_logger  # the author's own logging helper

class MyWebDriver(object):
    # an inner class to implement a custom wait
    class PageIsLoaded(object):
        def __call__(self, driver):
            state = driver.execute_script('return document.readyState;')
            MyWebDriver._LOGGER.debug("checking document state: " + state)
            return state == "complete"

    _FIREFOX_DRIVER = "geckodriver"
    # load HAR_EXPORT_TRIGGER extension
    _HAR_TRIGGER_EXT_PATH = os.path.abspath(
        "har_export_trigger-0.6.1-an+fx_orig.xpi")
    _PROFILE = webdriver.FirefoxProfile()
    _PROFILE.set_preference("devtools.toolbox.selectedTool", "netmonitor")
    _CAP = DesiredCapabilities().FIREFOX
    _OPTIONS = FirefoxOptions()
    # add runtime argument to run with devtools opened
    _OPTIONS.add_argument("-devtools")
    _LOGGER = my_logger.get_custom_logger(os.path.basename(__file__))

    def __init__(self, log_body=False):
        self.browser = None
        self.log_body = log_body

    # return the webdriver instance
    def get_instance(self):
        if self.browser is None:
            self.browser = webdriver.Firefox(capabilities=MyWebDriver._CAP,
                                             executable_path=MyWebDriver._FIREFOX_DRIVER,
                                             firefox_options=MyWebDriver._OPTIONS,
                                             firefox_profile=MyWebDriver._PROFILE)
            self.browser.install_addon(MyWebDriver._HAR_TRIGGER_EXT_PATH,
                                       temporary=True)
            MyWebDriver._LOGGER.info("Web Driver initialized.")
        return self.browser

    def get_har(self):
        # JSON.stringify has to be called to return as a string
        har_harvest = "myString = HAR.triggerExport().then(" \
                      "harLog => {return JSON.stringify(harLog);});" \
                      "return myString;"
        har_dict = dict()
        har_dict['log'] = json.loads(self.browser.execute_script(har_harvest))
        # remove content body
        if self.log_body is False:
            for entry in har_dict['log']['entries']:
                temp_dict = entry['response']['content']
                try:
                    temp_dict.pop("text")
                except KeyError:
                    pass
        return har_dict

    def quit(self):
        self.browser.quit()
        MyWebDriver._LOGGER.warning("Web Driver closed.")
A subclass adding BrowserMob proxy for your reference as well:
from browsermobproxy import Server

class MyWebDriverWithProxy(MyWebDriver):
    _PROXY_EXECUTABLE = os.path.join(os.getcwd(), "venv", "lib",
                                     "browsermob-proxy-2.1.4", "bin",
                                     "browsermob-proxy")

    def __init__(self, url, log_body=False):
        super().__init__(log_body=log_body)
        self.server = Server(MyWebDriverWithProxy._PROXY_EXECUTABLE)
        self.server.start()
        self.proxy = self.server.create_proxy()
        self.proxy.new_har(url,
                           options={'captureHeaders': True,
                                    'captureContent': self.log_body})
        super()._LOGGER.info("BrowserMob server started")
        super()._PROFILE.set_proxy(self.proxy.selenium_proxy())

    def get_har(self):
        return self.proxy.har

    def quit(self):
        self.browser.quit()
        self.proxy.close()
        MyWebDriver._LOGGER.info("BrowserMob server and Web Driver closed.")
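A hypothetical run of the proxied variant, for reference (the URL is a placeholder):

driver = MyWebDriverWithProxy("https://example.org", log_body=True)
browser = driver.get_instance()
browser.get("https://example.org")
har = driver.get_har()  # dict with the standard 'log' -> 'entries' HAR layout
driver.quit()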

How to write these 3 sentences with requests instead of urllib2? [duplicate]

Just a short, simple one about the excellent Requests module for Python.
I can't seem to find in the documentation what the variable 'proxies' should contain. When I sent it a dict with a standard "IP:PORT" value, it rejected it, asking for 2 values.
So, I guess (because this doesn't seem to be covered in the docs) that the first value is the ip and the second the port?
The docs mention this only:
proxies – (optional) Dictionary mapping protocol to the URL of the proxy.
So I tried this... what should I be doing?
proxy = { ip: port}
and should I convert these to some type before putting them in the dict?
r = requests.get(url,headers=headers,proxies=proxy)
The proxies dict syntax is {"protocol": "scheme://ip:port", ...}. With it you can specify different (or the same) proxies for requests using the http, https, and ftp protocols:
http_proxy = "http://10.10.1.10:3128"
https_proxy = "https://10.10.1.11:1080"
ftp_proxy = "ftp://10.10.1.10:3128"

proxies = {
    "http": http_proxy,
    "https": https_proxy,
    "ftp": ftp_proxy
}

r = requests.get(url, headers=headers, proxies=proxies)
Deduced from the requests documentation:
Parameters:
method – method for the new Request object.
url – URL for the new Request object.
...
proxies – (optional) Dictionary mapping protocol to the URL of the proxy.
...
On Linux you can also do this via the HTTP_PROXY, HTTPS_PROXY, and FTP_PROXY environment variables:
export HTTP_PROXY=10.10.1.10:3128
export HTTPS_PROXY=10.10.1.11:1080
export FTP_PROXY=10.10.1.10:3128
On Windows:
set http_proxy=10.10.1.10:3128
set https_proxy=10.10.1.11:1080
set ftp_proxy=10.10.1.10:3128
You can refer to the proxy documentation here.
If you need to use a proxy, you can configure individual requests with the proxies argument to any request method:
import requests

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "https://10.10.1.10:1080",
}

requests.get("http://example.org", proxies=proxies)

To use HTTP Basic Auth with your proxy, use the http://user:password@host.com/ syntax:

proxies = {
    "http": "http://user:pass@10.10.1.10:3128/"
}
I have found that urllib has some really good code to pick up the system's proxy settings, and they happen to be in the correct form to use directly. You can use it like:

import urllib.request
...
r = requests.get('http://example.org', proxies=urllib.request.getproxies())

It works really well, and urllib also knows how to get the Mac OS X and Windows settings.
The accepted answer was a good start for me, but I kept getting the following error:
AssertionError: Not supported proxy scheme None
The fix was to specify the http:// in the proxy URL, like so:
http_proxy = "http://194.62.145.248:8080"
https_proxy = "https://194.62.145.248:8080"
ftp_proxy = "10.10.1.10:3128"

proxyDict = {
    "http": http_proxy,
    "https": https_proxy,
    "ftp": ftp_proxy
}
I'd be interested as to why the original works for some people but not me.
Edit: I see the main answer is now updated to reflect this :)
If you'd like to persist cookies and session data, you'd best do it like this:

import requests

proxies = {
    'http': 'http://user:pass@10.10.1.0:3128',
    'https': 'https://user:pass@10.10.1.0:3128',
}

# Create the session and set the proxies.
s = requests.Session()
s.proxies = proxies

# Make the HTTP request through the session.
r = s.get('http://www.showmemyip.com/')
8 years late. But I like:
import os
import requests
os.environ['HTTP_PROXY'] = os.environ['http_proxy'] = 'http://http-connect-proxy:3128/'
os.environ['HTTPS_PROXY'] = os.environ['https_proxy'] = 'http://http-connect-proxy:3128/'
os.environ['NO_PROXY'] = os.environ['no_proxy'] = '127.0.0.1,localhost,.local'
r = requests.get('https://example.com') # , verify=False
The documentation gives a very clear example of proxies usage:
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
requests.get('http://example.org', proxies=proxies)
What isn't documented, however, is the fact that you can even configure proxies for individual URLs, even if the scheme is the same!
This comes in handy when you want to use different proxies for different websites you wish to scrape.
proxies = {
    'http://example.org': 'http://10.10.1.10:3128',
    'http://something.test': 'http://10.10.1.10:1080',
}
requests.get('http://something.test/some/url', proxies=proxies)
Additionally, requests.get essentially uses a requests.Session under the hood, so if you need more control, use it directly:
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
session = requests.Session()
session.proxies.update(proxies)
session.get('http://example.org')
I use it to set a fallback (a default proxy) that handles all traffic that doesn't match the schemes/URLs specified in the dictionary:
import requests
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
session = requests.Session()
session.proxies.setdefault('http', 'http://127.0.0.1:9009')
session.proxies.update(proxies)
session.get('http://example.org')
I just made a proxy grabber that can also connect through the same grabbed proxy without any input. Here it is:

#Import Modules
from termcolor import colored
from selenium import webdriver
import requests
import os
import sys
import time

#Proxy Grab
options = webdriver.ChromeOptions()
options.add_argument('headless')
driver = webdriver.Chrome(chrome_options=options)
driver.get("https://www.sslproxies.org/")
tbody = driver.find_element_by_tag_name("tbody")
cell = tbody.find_elements_by_tag_name("tr")
for column in cell:
    column = column.text.split(" ")
    print(colored(column[0] + ":" + column[1], 'yellow'))
driver.quit()
print("")

os.system('clear')
os.system('cls')

#Proxy Connection
print(colored('Getting Proxies from grabber...', 'green'))
time.sleep(2)
os.system('clear')
os.system('cls')

# 'column' still holds the last parsed row from the loop above
proxy = {"http": "http://" + column[0] + ":" + column[1]}
url = 'https://mobile.facebook.com/login'
r = requests.get(url, proxies=proxy)
print("")
print(colored('Connecting using proxy', 'green'))
print("")
sts = r.status_code
Here is my basic class in Python for the requests module, with some proxy configs and a stopwatch!
import requests
import time

class BaseCheck():
    def __init__(self, url):
        self.http_proxy = "http://user:pw@proxy:8080"
        self.https_proxy = "http://user:pw@proxy:8080"
        self.ftp_proxy = "http://user:pw@proxy:8080"
        self.proxyDict = {
            "http": self.http_proxy,
            "https": self.https_proxy,
            "ftp": self.ftp_proxy
        }
        self.url = url

        def makearr(tsteps):
            global stemps
            global steps
            stemps = {}
            for step in tsteps:
                stemps[step] = {'start': 0, 'end': 0}
            steps = tsteps
        makearr(['init', 'check'])

        def starttime(typ=""):
            for stemp in stemps:
                if typ == "":
                    stemps[stemp]['start'] = time.time()
                else:
                    stemps[stemp][typ] = time.time()
        starttime()

    def __str__(self):
        return str(self.url)

    def getrequests(self):
        g = requests.get(self.url, proxies=self.proxyDict)
        print g.status_code
        print g.content
        print self.url
        stemps['init']['end'] = time.time()
        # print stemps['init']['end'] - stemps['init']['start']
        x = stemps['init']['end'] - stemps['init']['start']
        print x

test = BaseCheck(url='http://google.com')
test.getrequests()
It's a bit late, but here is a wrapper class that simplifies scraping proxies and then making an HTTP POST or GET:
ProxyRequests
https://github.com/rootVIII/proxy_requests
Already tested, the following code works. You need to use HTTPProxyAuth.

import requests
from requests.auth import HTTPProxyAuth

USE_PROXY = True
proxy_user = "aaa"
proxy_password = "bbb"
http_proxy = "http://your_proxy_server:8080"
https_proxy = "http://your_proxy_server:8080"
proxies = {
    "http": http_proxy,
    "https": https_proxy
}

def test(name):
    print(f'Hi, {name}')
    # Create the session and set the proxies.
    session = requests.Session()
    if USE_PROXY:
        session.trust_env = False
        session.proxies = proxies
        session.auth = HTTPProxyAuth(proxy_user, proxy_password)
    r = session.get('https://www.stackoverflow.com')
    print(r.status_code)

if __name__ == '__main__':
    test('aaa')
I'm sharing some code for how to fetch proxies from the site "https://free-proxy-list.net" and store the data in a file compatible with tools like "Elite Proxy Switcher" (format IP:PORT):
##PROXY_UPDATER - get free proxies from https://free-proxy-list.net/
from lxml.html import fromstring
import requests
from itertools import cycle
import traceback
import re

######################FIND PROXIES#########################################
def get_proxies():
    url = 'https://free-proxy-list.net/'
    response = requests.get(url)
    parser = fromstring(response.text)
    proxies = set()
    for i in parser.xpath('//tbody/tr')[:299]:  # 299 proxies max
        proxy = ":".join([i.xpath('.//td[1]/text()')[0],
                          i.xpath('.//td[2]/text()')[0]])
        proxies.add(proxy)
    return proxies

######################write to file in format IP:PORT######################
try:
    proxies = get_proxies()
    f = open('proxy_list.txt', 'w')
    for proxy in proxies:
        f.write(proxy + '\n')
    f.close()
    print("DONE")
except:
    print("MAJOR ERROR")
