Why doesn't my Python program reach the intended website? - python-3.x

I am attempting to make a web scraper in Python 3. I keep getting WinError 10060, which states that the connection attempt failed because the connected party did not properly respond or the connected host failed to respond. Both urllib and requests produce the 10060 error. With requests, the error also states that the max retries were exceeded with the URL.
import urllib.request
with urllib.request.urlopen('http://python.org') as response:
    html = response.read()
People have mentioned that it is likely a proxy or firewall issue as I am attempting to do this on my work network.

Turns out this was a proxy authentication error. Passing the proxy, with your credentials, to the get call solved this.
proxies = {'http': 'http://user:pass@url:8080', 'https': 'http://user:pass@url:8080'}
page = requests.get(webpage, proxies=proxies)
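A minimal sketch of building that proxies dictionary, assuming the proxy uses basic authentication over plain HTTP. The helper name build_proxies and the host proxy.example.com are hypothetical. Percent-encoding the credentials matters when a password contains characters such as '@' or ':', which would otherwise be read as URL delimiters.

```python
from urllib.parse import quote


def build_proxies(user, password, host, port):
    """Return a requests-style proxies dict for an authenticating HTTP proxy.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    # Percent-encode the credentials so '@', ':' and '/' cannot break the URL.
    auth = "{}:{}".format(quote(user, safe=""), quote(password, safe=""))
    proxy_url = "http://{}@{}:{}".format(auth, host, port)
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("user", "p@ss", "proxy.example.com", 8080)
print(proxies["http"])
# With requests installed, the dict is passed by keyword:
# page = requests.get("http://python.org", proxies=proxies, timeout=10)
```

Passing the dictionary by keyword is the important part: the second positional argument of requests.get is params, so a positional proxies dict would be sent as query parameters rather than used as proxy settings.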

Related

Detailed HTTP message with failing POST request _make_request in Requests package of Python 3?

Suppose a POST request fails
self._make_request(method="POST", json=dict(), path=f"test/buy")
returning an error such as 400, but you don't know what is happening under the hood. You want the detailed HTTP message to diagnose the problem.
How can you get the detailed HTTP message with _make_request of Python 3 requests?
One option is to investigate in debug mode, like
from http.client import HTTPConnection
HTTPConnection.debuglevel = 1
where you see the request and its response with each piece of information on a new line.
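A slightly fuller sketch of the same idea, assuming the requests call goes through urllib3 as it does in a standard requests install. The function name enable_http_debug is hypothetical. It turns on the http.client wire trace and also raises the urllib3 logger to DEBUG so connection-level events are logged.

```python
import logging
from http.client import HTTPConnection


def enable_http_debug():
    """Print raw request/response headers and log urllib3 activity."""
    # http.client prints the request line, headers and reply headers to stdout.
    HTTPConnection.debuglevel = 1
    # Route log records to the console and raise urllib3's logger to DEBUG.
    logging.basicConfig(level=logging.DEBUG)
    urllib3_log = logging.getLogger("urllib3")
    urllib3_log.setLevel(logging.DEBUG)
    urllib3_log.propagate = True


enable_http_debug()
```

For a 400 response, the body often carries the server's explanation, so printing the response object's status_code and text after the call is usually worth doing as well.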

How do I fix HTTPSConnectionPool - Read Timed Out error when connecting to a database server

I am trying to connect to a FileMaker Database server via a Python script. My code was working before but has suddenly stopped, and I didn't make any changes to the portion of code that no longer works. I am encountering the following error:
Request error: HTTPSConnectionPool(host='**.**.*.*', port=443): Read timed out. (read timeout=30)
I stripped the script down to the code that creates the server instance, logs in, and then logs out without making any changes to the database, and I am still receiving the same error. However, I can connect to the FileMaker server and database via the FileMaker application with no issues, and I can connect to the server using Telnet. I am on Windows 10 and writing the code in PyCharm CE. I have reinstalled PyCharm, created a new virtual environment, and tried reinstalling the fmrest module, as well as using older versions. I have also increased the timeout to give the login more time, which hasn't worked. I'm basically stumped on why I can no longer log in via the script, when it has been working perfectly in testing for the past couple of weeks. My code is below.
import fmrest
from fmrest.exceptions import FileMakerError
from fmrest.exceptions import RequestException
import sys
import requests
# connect to the FileMaker Server
requests.packages.urllib3.disable_warnings()
fmrest.utils.TIMEOUT = 30
try:
    fms = fmrest.Server('https://**.**.*.*',
                        user='***',
                        password='******',
                        database='Hangtag Order Management',
                        layout='OrderAdmin',
                        verify_ssl=False)
except ValueError as err:
    print('Failed to connect to server. Please check server credentials and status and try again\n\n' + str(err))
    sys.exit()
print(fms)
print('Attempting to connect to FileMaker Server...')
try:
    fms.login()
    print('Login Successful\n')
except FileMakerError as err:
    print(err)
    sys.exit()
except RequestException as err:
    print('There was an error connecting to the server, the request timed out\n\n' + str(err))
    sys.exit()
fms.logout()
This should successfully login to the database, print 'login successful' and log out. Calling print(fms) returns
<Server logged_in=False database=Hangtag Order Management layout=OrderAdmin>
but I receive the connection error upon the login attempt. I am assuming the error is server side, but I don't know enough about servers to troubleshoot accurately. Could the server have blacklisted my IP for making so many login attempts during my testing? And if so, where would I undo that and prevent it from happening again?
A couple of server reboots fixed the error, not really sure of the ultimate cause.
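A read timeout means the TCP connection was opened but the server did not send a reply in time, which fits the Telnet check succeeding while the login hangs. A minimal sketch of that Telnet-style check in Python, using only the standard library; the function name can_connect is hypothetical.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False

# Example: can_connect('server-address', 443) tests only that the port is
# reachable, not that the service behind it answers requests.
```

If this returns True while the HTTPS request still times out, the network path is fine and the service itself is not answering, which points at the server rather than the client script.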

Connecting to Sharepoint Via Python - UrlLib Error

The following code (with variables changed) is what I am trying to use to contact a SharePoint server (and ultimately download files from it). But I appear to be getting an error related to urllib, and I can't work out the reason for it.
from sharepoint import SharePointSite, basic_auth_opener
server_url = "http://sharepoint.domain.com/"
opener = basic_auth_opener(server_url, "domain/username", "password")
site = SharePointSite(server_url, opener)
sp_list = site.lists
for sp_list in site.lists:
    print(sp_list.id, sp_list.meta['Title'])
The error is as follows
urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Could you try the code below?
import requests
from requests_ntlm import HttpNtlmAuth
requests.get("http://sharepoint.domain.com", auth=HttpNtlmAuth('DOMAIN\\USERNAME','PASSWORD'))

Download documents from https site Error: sslv3 alert handshake failure

Anaconda - Python 3.6
OpenSSL 1.0.2
Operating System: Windows 7
Phase 1 (Completed): Using selenium: launched, navigated, and extracted various data elements including a table from site. Extracted Hyperlinks contained in table that are direct links to documents.
Phase 2: Taking extracted hyperlink from table I need to download the files to a specified folder on the shared drive.
Tried:
import urllib.request
url = 'tts website/test.doc'
urllib.request.urlretrieve(url, r'C:\Users\User\Desktop\test.doc')
Error I get is sslv3 alert handshake failure
With the site opened, I have clicked on the Lock icon and clicked "Install Certificate". I have saved the certificate to my "Trusted Root Certification Authorities" in the Certificate store.
I can see the certificate's name (from the installation step above) among the 58 CA certificates listed by running the following code:
import socket
import ssl
context = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
context.verify_mode = ssl.CERT_REQUIRED
context.load_default_certs()
ca_certs = context.get_ca_certs()
print('{} CA Certificates loaded: '.format(len(ca_certs)))
for cert_dict in ca_certs:
    print(cert_dict['subject'])
    print()
I can't figure out how to establish a secure SSL connection to the website/server in order to download the file from each of the hyperlinks. This website uses Single Sign On (SSO) and automatically logs me in when I first launch the website.
I have tried to connect to the server at server.net on port 443, but can't seem to get the scripting right to connect and retrieve the document.
I have connected directly to the server and extracted the certificate details as shown here:
HOST, PORT = 'server.net', 443
ctx = ssl.create_default_context()
s = ctx.wrap_socket(socket.socket(), server_hostname=HOST)
s.connect((HOST, PORT))
cert = s.getpeercert()
print(cert)
When I run urlretrieve I still get the same handshake error. Reviewing my CA certificates, I see a Personal certificate for my Windows login (username) listed there; that must be how the site automatically logs me in using SSO. How do I take all of this information, connect to the website using my SSO, and retrieve the documents?
Latest UPDATE:
I am finding pycurl to be promising, however I feel like I need a little assistance making a few tweaks to get it working.
import sys
import traceback
import pycurl
fp = open('Test.doc', 'wb')
curl = pycurl.Curl()
curl.setopt(pycurl.URL, url)  # url is the direct link to the Word document
curl.setopt(pycurl.FOLLOWLOCATION, 1)
curl.setopt(pycurl.MAXREDIRS, 5)
curl.setopt(pycurl.CONNECTTIMEOUT, 30)
curl.setopt(pycurl.TIMEOUT, 300)
try:
    curl.setopt(pycurl.WRITEDATA, fp)
    curl.perform()
except Exception:
    # Report the failure but still fall through to the cleanup below.
    traceback.print_exc(file=sys.stderr)
    sys.stderr.flush()
curl.close()
fp.close()
This code raises no error; however, the Word document it creates contains a rendering of the website's logon page rather than the requested file.
Main Problem: HTTPS connection using Single Signon connection behind a corporate network proxy server.
I have been trying to validate against a CA bundle with these options:
curl.setopt(pycurl.SSL_VERIFYPEER, 1)
curl.setopt(pycurl.SSL_VERIFYHOST, 2)
curl.setopt(pycurl.CAINFO, certifi.where())
but now I am getting ERROR: 51, CERT_TRUST_IS_UNTRUSTED_ROOT
How do I add a proxy, if that is what is causing the error? And secondly, how do I attach the CA certificate file directly?
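As an illustration of the two pieces being asked about, here is a minimal standard-library sketch (not pycurl) that routes requests through a proxy and verifies TLS against a chosen CA file. The function name, the proxy URL and the CA file path are hypothetical placeholders. It does not handle the SSO login, which would still need to be solved separately.

```python
import ssl
import urllib.request


def build_opener_with_proxy(proxy_url, cafile=None):
    """Return a urllib opener that uses proxy_url and verifies TLS.

    cafile is a path to a PEM bundle; when None, the system store is used.
    """
    context = ssl.create_default_context(cafile=cafile)
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    https_handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(proxy_handler, https_handler)


# Placeholder values; substitute the real proxy and, if needed, a CA file.
opener = build_opener_with_proxy("http://user:pass@proxy.example.com:8080")
# opener.open(url) would then fetch through the proxy with that TLS context.
```

An untrusted-root error generally means the chain presented by the server, or by an intercepting corporate proxy, ends in a root that is missing from the bundle in use, so pointing cafile at a PEM export of the corporate root is one thing to try.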

Connection refused when using zeep SSL

I am trying to access a SOAP server using zeep. My server uses SSL with a custom certificate, and connection to that server works, with my cert, or ignoring it:
python -mzeep "https://<server-ip>/servicemanager/1?wsdl" --no-verify
I get a long list of Prefixes, Global elements, Global types, Bindings and Service. The latter one says:
Service: ServiceManager
Port: servicemanager_1 (Soap11Binding: {http://soap.client.<snipped>.at}servicemanager_1Binding)
Operations:
getServices() -> return: ns0:service[]
So, as far as I can tell, I can create a client object and call its service named getServices().
from zeep import CachingClient as Client
from zeep.wsse.signature import Signature
from zeep.transports import Transport
from requests import Session, Request
session = Session()
session.verify = False
transport = Transport(session=session)
c = Client('https://<server-ip>/servicemanager/1?wsdl', transport=transport)
c.service.getServices()
But that leads to an error in urllib3 (~/.virtualenvs/soap/lib/python3.5/site-packages/urllib3/util/connection.py):
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
[...]
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='localhost',
port=443): Max retries exceeded with url: /servicemanager/1 (Caused by
NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object
at 0x7f4e2a6f7d30>: Failed to establish a new connection: [Errno 111]
Connection refused',))
It does not matter whether I ignore SSL verification or provide a CA_BUNDLE. Both are accepted and the client is created, but I can't call the getServices() method.
What did I forget here? I don't think this is a zeep problem, as the underlying urllib3 throws the exception. But I tried for hours and searched the internet for a solution, without success.
A part of the XML I get from the endpoint is:
<service name="ServiceManager">
  <port name="servicemanager_1" binding="tns:servicemanager_1Binding">
    <soap:address location="http://localhost/servicemanager/1"/>
  </port>
</service>
And I don't know why it returns a "localhost" there - is zeep using that for its call? Then I would understand why permanent errors occur.
Any hints?
To change the endpoint address I use it this way:
client.service._binding_options['address'] = 'https://mynewaddress.com/service.wsdl'
As always, after days of searching, the moment I asked on Stack Overflow, the answer came up through other channels.
If anyone has the same problem, here is the solution. My server provides me with the WSDL file, as shown above:
<service name="ServiceManager">
  <port name="servicemanager_1" binding="tns:servicemanager_1Binding">
    <soap:address location="http://localhost/servicemanager/1"/>
  </port>
</service>
And there it is: localhost. Zeep (IMHO correctly) then uses that service endpoint to communicate with the server.
What I did for testing: I SSH-tunnelled the ports 80/443 to localhost, so zeep thought it talked to localhost.
And Shazaam, it worked.
So my server was the culprit - too bad I can't change that, as I have no control over it.
But now a workaround is possible.
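To spot this kind of problem quickly, the endpoint addresses can be read straight out of the WSDL. A minimal sketch with the standard library, assuming a SOAP 1.1 binding; the function name is hypothetical, and the sample document wraps the fragment above in a root element with the standard WSDL namespaces plus a made-up tns namespace.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 binding extension elements in a WSDL.
SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/"

# Sample WSDL: the service fragment from above inside an assumed root element.
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="urn:example">
  <service name="ServiceManager">
    <port name="servicemanager_1" binding="tns:servicemanager_1Binding">
      <soap:address location="http://localhost/servicemanager/1"/>
    </port>
  </service>
</definitions>"""


def soap_addresses(wsdl_text):
    """Return the location of every soap:address element in the WSDL."""
    root = ET.fromstring(wsdl_text)
    return [el.get("location") for el in root.iter("{%s}address" % SOAP_NS)]


print(soap_addresses(SAMPLE_WSDL))  # ['http://localhost/servicemanager/1']
```

If the advertised address is wrong and the server cannot be changed, zeep's documentation also describes client.create_service(binding_name, address) for binding a service proxy to a different endpoint, which may avoid the SSH tunnel. The binding name would be the qualified one printed by python -mzeep.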
