Python Requests module - does it use system level (on windows) proxy settings?

Background
I've got an app using the requests module to handle connecting to a remote webserver. This works perfectly, but I want to deploy it within an organisation that uses an enterprise proxy server. The machines in the organisation have the proxy configured at the operating system level (i.e. the Windows system proxy setting).
I'd prefer to have my app automatically use the already configured OS proxy settings, rather than have to ask them for the info (especially as they use basic authentication, so I'd have to securely store a username/password, not just the proxy host/port).
Question
Does Requests automatically use the operating system's proxy settings if you do not specify a proxy directly yourself?
I couldn't find a definitive answer to this after reading the documentation for Requests or for the underlying urllib3.
On my dev machine I don't have a proxy to test with, and so would like to know the answer before I go and code manual proxy handling in my app that might not actually be necessary...
Some more info
As a bit of comparison, urllib does do this - see https://docs.python.org/3/library/urllib.request.html#urllib.request.ProxyHandler ...if no proxy is specified it will use the system-configured one.
On my initial review of the Requests documentation, it seemed that it didn't use the system configuration and only used environment variables if they were set: https://2.python-requests.org/en/master/user/advanced/#proxies
But, after a bit more digging, I found a way to at least obtain the OS proxy configuration, using urllib.request.getproxies(): https://stackoverflow.com/a/16311657/9423009
At this point I thought I'd at least be able to use the above at run time to get the OS proxy config, and pass that to requests...
...but then I found this post, which states that requests will use the OS level configuration if nothing is specified: How to use requests library without system-configured proxies
So, at this point, I can't find a definitive answer in the documentation for either requests or urllib3, but I do have an SO post stating that requests will use the OS level config by calling urllib.request.getproxies() itself.
...so can anyone confirm/deny this is the case?
thanks!

There are two aspects to your question.
1. does requests use urllib.request.getproxies ?
As of requests 2.25.1, the Session.request source shows that, if proxies are not provided, proxy information is obtained from self.merge_environment_settings:
if self.trust_env:
    # Set environment's proxies.
    no_proxy = proxies.get('no_proxy') if proxies is not None else None
    env_proxies = get_environ_proxies(url, no_proxy=no_proxy)
And get_environ_proxies uses getproxies that is either imported from urllib (py2) or from urllib.request (py3).
So the answer is YES
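The lookup described above can be approximated with the standard library alone. This is only a sketch, not the requests source: environ_proxies_sketch is a hypothetical name, and it assumes the helper amounts to "return no proxies if the host is excluded by no_proxy, otherwise return whatever getproxies() finds".

```python
from urllib.parse import urlparse
from urllib.request import getproxies, proxy_bypass


def environ_proxies_sketch(url):
    """Hypothetical stand-in for the lookup requests does when trust_env is True."""
    host = urlparse(url).hostname or ""
    # Hosts excluded via no_proxy (or the OS bypass list) get no proxy at all.
    if host and proxy_bypass(host):
        return {}
    # Otherwise: proxy environment variables first, then the OS-level settings.
    return getproxies()


if __name__ == "__main__":
    # Prints whatever this machine exposes, e.g. an empty dict with no proxy set.
    print(environ_proxies_sketch("https://example.com/"))
```

Running this on the target machine is a quick way to see what requests would probably pick up there without writing any proxy handling first.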
2. is urllib.request.getproxies able to pick up the OS proxy configuration on windows ?
As far as I know, "the OS configured one" is not reliable on Windows. At least on my corporate machine, urllib.request.getproxies does not pick up the proxy. Its documentation, like that of ProxyHandler, states:
If no proxy environment variables are set, then in a Windows environment proxy settings are obtained from the registry’s Internet Settings section, and in a Mac OS X environment proxy information is retrieved from the OS X System Configuration Framework.
From the source code I see that it reads the ProxyEnable and ProxyServer values under HKEY_CURRENT_USER > 'Software\Microsoft\Windows\CurrentVersion\Internet Settings'. On my machine, which does have a proxy configured, these are empty; the settings seem to be stored somewhere in Internet Explorer / the .NET stack instead.
Note that very often in corporate environments the proxy is set from a .pac (proxy auto-config) file.
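To check what a given Windows machine actually has in that registry key, something like the following sketch could be used. read_windows_proxy_settings is a hypothetical helper name. ProxyEnable and ProxyServer are the values mentioned above; AutoConfigURL is my assumption for where a .pac URL would show up, so treat it as something to verify rather than a given.

```python
import sys

# Per-user key that urllib.request.getproxies() consults on Windows.
INTERNET_SETTINGS = r"Software\Microsoft\Windows\CurrentVersion\Internet Settings"


def read_windows_proxy_settings():
    """Return the raw per-user proxy values, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # only importable on Windows

    settings = {}
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, INTERNET_SETTINGS) as key:
        # AutoConfigURL is an assumption: the value expected to hold a .pac URL.
        for name in ("ProxyEnable", "ProxyServer", "AutoConfigURL"):
            try:
                settings[name] = winreg.QueryValueEx(key, name)[0]
            except OSError:
                settings[name] = None  # value not present in the key
    return settings


if __name__ == "__main__":
    print(read_windows_proxy_settings())
```

If ProxyServer comes back empty while the machine clearly uses a proxy, that matches the situation described here, and getproxies() will not help on that machine.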
So, to conclude: on Windows at least, as of today, we cannot reliably trust urllib.request.getproxies. This is why I developed envswitch, to make it extremely easy for me and my colleagues to switch all the proxy-related environment variables back and forth in one click (home-train-plane/office). At least urllib (and requests) use them reliably when they are set. (Note: the tool works fine even if there is a "build failed" badge on the tool's doc page :) )
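For a single Python process, the same idea of relying on environment variables can be sketched as below. This is not how envswitch works; proxy_env is a hypothetical helper, and the proxy URL in the example is a placeholder.

```python
import os
from contextlib import contextmanager
from urllib.request import getproxies


@contextmanager
def proxy_env(proxy_url, no_proxy=None):
    """Temporarily set the proxy environment variables for this process."""
    names = ("http_proxy", "https_proxy", "no_proxy")
    saved = {name: os.environ.get(name) for name in names}
    os.environ["http_proxy"] = proxy_url
    os.environ["https_proxy"] = proxy_url
    if no_proxy is not None:
        os.environ["no_proxy"] = no_proxy
    try:
        yield
    finally:
        # Restore whatever was there before, removing variables we introduced.
        for name, value in saved.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value


if __name__ == "__main__":
    # Placeholder proxy address, for illustration only.
    with proxy_env("http://proxy.example:3128"):
        print(getproxies())
```

A requests call made inside the with block should pick these values up, as long as trust_env is left at its default.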

Related

Is there any way to force a program/software to use system proxy in Linux?

I am working on a Java project in Intellij Idea (Linux) that needs to access websites through a proxy. I have a personal proxy subscription to use and I can request through it programmatically with something like -
HttpHost proxy = new HttpHost("PROXY_SERVER", PORT);
String res = Executor.newInstance()
.auth(proxy, "USER_NAME", "PASSWORD")
.execute(Request.Get("https://example.com").viaProxy(proxy))
.returnContent().asString();
System.out.println(res);
However, if I set the proxy in /etc/environment with http(s)_proxy, or through the Ubuntu network proxy settings, my browsers and some system programs (such as Chrome and Firefox) use the proxy when making requests, but IntelliJ IDEA doesn't follow the system proxy. I've tried to set it manually from the IDEA settings, but it doesn't work. The requests always go out from my current IP. So I was curious whether it is possible to force a program in Linux to use the system proxy somehow. I should mention that I have tried proxychains, but it didn't work; my server wasn't recognized. Any kind of help/suggestion will be appreciated, as I have little or no experience in networking.

How to capture full http requests on express to request them again to my localhost

I have a problem with an Express.js service running on production that I'm not able to replicate on my localhost. I have already tried requesting all the urls to production again to my local machine, but on my machine everything works fine. So I suspect that the problem comes with the data on the http headers (cookies, user agents, languages...).
So, is there a way, (some express module, or sniffer that runs on ubuntu) that allows me to easily create a dump on the server with the whole header so I can later repeat those exact requests to my localhost?

You can capture network packets with https://www.wireshark.org/, analyze them and maybe find the difference between your local environment and the production one.

You can try to use a proxy tool like Charles (https://www.charlesproxy.com/) or Fiddler (http://www.telerik.com/fiddler) to log your browser requests.

Observe any XHR request using a node process?

I am building a development tool, and would like to monitor requests to specific domains on a user's system.
I have already written a MITM proxy server. I would like to observe all requests to, say, api.twitter.com, without requiring users to change their code to point to my proxy server. This might be called an HTTP sniffer, I'm not sure.
I have considered:
Browser plugin (eg: chrome dev tools plugin)
/etc/hosts (but this can't map domain to domain I think, and if you did you wouldn't be able to get to the original one)
Native OSX app (learning curve)
Is there a way to observe system HTTP requests using node? I don't know where to start.

Debugging all HTTP[S] on node.js

I'm having fits accomplishing something and after scouring google & SO, throwing my hands up after a few days. Trying to do something that I think is pretty common: debug / examine all HTTP traffic while developing a node.js app.
In Windows it is as simple as firing up Fiddler and I can see all HTTP & HTTPS traffic from all processes. But I've switched platforms over to OSX and trying to make the same work.
I've tried using Charles & MITMPROXY, but all I'm seeing is the traffic to, with the response, my node.js app. My node.js app is calling external services, some using the popular request package (which I have seen how to set up) but also using other packages, like azure-storage. What's troubling me is I can't get any of the debugging proxies to show me what the azure-storage package is sending to / receiving from the endpoints it is calling.
Conceptually I think I get it... I have to tell these different things (like node.js, request & azure-storage) to go through the proxy each of these tools uses... but how can you do that without modifying their source? Can't you, like how Fiddler works on Windows, do something to say "all traffic goes through this proxy"?
I'd use Fiddler on OSX but it is currently not working with no ETA in sight after talking to Telerik.

So the problem I was having is what I thought... in my specific instance the module that I was using to access Azure storage was not using the default proxy. I found a package (global-tunnel) that hijacked everything that used the request package to control it going through a proxy. Now I saw stuff show up in the HTTP debuggers I was using.
The problem now is when I am trying to reach an HTTPS endpoint... using something like Charles, it used its own SSL cert which wasn't trusted by Azure so the connections were refused. Back to the drawing board...

Forwarding or exporting a client certificate in IIS6/7

Currently, our program runs on JBoss and sits behind an apache reverse proxy. Apache handles verifying the client certificate. We have the +ExportCertData option set in apache, and then we use
RequestHeader set SSL_CLIENT_CERT "%{SSL_CLIENT_CERT}e"
to put the cert in the header field SSL_CLIENT_CERT before forwarding to JBoss. Our application in JBoss then reads the cert looking for the SubjectAltName to get the e-mail address, which we use to save the user the step of entering it.
Now, we will have to live behind IIS, and will need similar functionality to this. What we really care about is extracting the email address from the SubjectAltName. In an ideal world, IIS would provide the same information as apache, so we wouldn't have to modify our application code too much. But if it's not possible, other options are good as well.
Some other notes:
We will probably need to support IIS6 and IIS7. It would be nice to have one solution that works across both, but not necessary
We are currently using IIRF to forward requests that go to a certain virtual directory, but I would be interested in hearing other solutions that could accomplish what we're looking for along with forwarding to our application server.
Just throwing apache in front of IIS isn't going to be a solution because we have to share the box with other programs that use IIS and they might be wary of such a solution. Also, we can't just run on a different port because firewall restrictions only allow port 80 and port 443.
Any ideas how to make this possible? Let me know if there's any more information I can provide.
