Trouble retrieving information from a web page using HTTP protocol - python-3.x

This is my first question on this forum, so here we go.
In an assignment for Coursera's Python for Everybody, I modified the URL in a .py file as instructed to retrieve the document from the link provided. But after doing all that and running it in cmd, I get a 'socket.gaierror: [Errno 11001] getaddrinfo failed' error. I also can't get this to work with the alternative methods, i.e. the browser developer console and telnet. Telnet throws a 'Could not open connection to the host, on port 80: Connect failed' error.
I searched Google but couldn't find a clear answer. It would really help if someone could solve this issue for me.
import socket

mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysock.connect(('http://data.pr4e.org/intro-short.txt', 80))
cmd = 'GET http://data.pr4e.org/intro-short.txt HTTP/1.0\r\n\r\n'.encode()
mysock.send(cmd)

while True:
    data = mysock.recv(512)
    if len(data) < 1:
        break
    print(data.decode(), end='')

mysock.close()
I expect to get the metadata and contents of the URL.

Sockets don't understand URLs, only hostnames. You need to change
mysock.connect(('http://data.pr4e.org/intro-short.txt', 80))
to
mysock.connect(('data.pr4e.org', 80))
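Putting the fix together, here is a minimal sketch of the corrected script. The request line carries only the path, and a Host header is added since many servers use virtual hosting (the host and path match the assignment's URL):

```python
import socket

def build_request(host, path):
    # A minimal HTTP/1.0 request; the Host header keeps
    # virtually-hosted servers happy.
    return f'GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n'.encode()

def fetch(host, path, port=80):
    # connect() wants a bare hostname, never a full URL
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.connect((host, port))
        s.sendall(build_request(host, path))
        chunks = []
        while True:
            data = s.recv(512)
            if len(data) < 1:
                break
            chunks.append(data)
    return b''.join(chunks).decode()

# fetch('data.pr4e.org', '/intro-short.txt') would return the
# response headers followed by the document body.
```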

Related

How do I fix HTTPSConnectionPool - Read Timed Out Error when connecting to database server

I am trying to connect to a FileMaker database server via a Python script. My code was working before but has suddenly stopped, and I didn't make any changes to the portion of code that no longer works. I am encountering the following error:
Request error: HTTPSConnectionPool(host='**.**.*.*', port=443): Read timed out. (read timeout=30)
I have taken out the code that creates the server instance, connects/logs in, and then logs out without making any changes in the database, and I am still receiving the same error. However, I can connect to the FileMaker server and database via the FileMaker application with no issues, and I can connect to the server using telnet commands. I am on Windows 10 and writing the code in PyCharm CE. I have reinstalled PyCharm, created a new virtual environment, and tried reinstalling the fmrest module, as well as using older versions. I have also increased the timeout to give more time to log in, which hasn't worked. I'm basically stumped as to why I can no longer log in via the script when it has been working perfectly in testing for the past couple of weeks. My code is below.
import fmrest
from fmrest.exceptions import FileMakerError
from fmrest.exceptions import RequestException
import sys
import requests

# connect to the FileMaker Server
requests.packages.urllib3.disable_warnings()
fmrest.utils.TIMEOUT = 30
try:
    fms = fmrest.Server('https://**.**.*.*',
                        user='***',
                        password='******',
                        database='Hangtag Order Management',
                        layout='OrderAdmin',
                        verify_ssl=False)
except ValueError as err:
    print('Failed to connect to server. Please check server credentials and status and try again\n\n' + str(err))
    sys.exit()

print(fms)
print('Attempting to connect to FileMaker Server...')
try:
    fms.login()
    print('Login Successful\n')
except FileMakerError as err:
    print(err)
    sys.exit()
except RequestException as err:
    print('There was an error connecting to the server, the request timed out\n\n' + str(err))
    sys.exit()

fms.logout()
This should successfully log in to the database, print 'Login Successful', and log out. Calling print(fms) returns
<Server logged_in=False database=Hangtag Order Management layout=OrderAdmin>
but I receive the connection error upon the login attempt. I am assuming the error is server-side, but I don't know enough about servers to troubleshoot accurately. Could the server have blacklisted my IP for making so many login attempts during my testing? And if so, where would I undo that/prevent it from happening again?
A couple of server reboots fixed the error, not really sure of the ultimate cause.
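Since the timeouts here turned out to be transient (gone after a reboot), one workaround while diagnosing is to retry the flaky call a few times before giving up. This is a generic sketch, not part of fmrest; you would wrap something like fms.login in it:

```python
import time

def with_retries(action, attempts=3, delay=2.0, retriable=(Exception,)):
    # Call `action()` up to `attempts` times, sleeping between failures;
    # re-raise the last exception if every attempt fails.
    last = None
    for i in range(attempts):
        try:
            return action()
        except retriable as err:
            last = err
            if i < attempts - 1:
                time.sleep(delay)
    raise last

# e.g. with_retries(fms.login, retriable=(RequestException,))
```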

nodename nor servname provided, or not known on tcp connection in Python

I am trying to understand socket connections in Python, and every time I try to connect to a URL it gives me this error:
nodename nor servname provided, or not known
which I have no idea why. And sometimes it only shows a 301 and never a 200 status!
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = "Whatever url i am giving"
server_address = (host, 80)
request_header = 'GET / HTTP/1.1\r\nHost: ' + host + '\r\n\r\n'
try:
    s.connect(server_address)
    s.send(request_header.encode('utf-8'))
    result = s.recv(4096).decode('utf-8')
    while len(result) > 0:
        print(result)
        result = s.recv(4096).decode('utf-8')
except Exception as ex:
    print("Unexpected error:", ex)
s.close()
I know there are other questions, but they don't satisfy my query. Can someone point out what's happening here?
You don't connect to a URL. You connect to a host. When I assign host = "stackoverflow.com", for example, your code works fine.
The socket layer itself knows nothing about URLs. A URL includes the path you supply to the host's HTTP server after you've connected. So, if you wish to retrieve, say, the URL "http://stackoverflow.com/questions", you connect to the host "stackoverflow.com", then provide this as the first line in the HTTP request:
GET /questions HTTP/1.1\r\n
This request (to stackoverflow.com) will in fact deliver a 301 response. 301 is a redirect response, telling you that the document you seek is available from a different host or service. This is an increasingly common response as most "http" sites now redirect the client to their corresponding "https" service.
If the host name you provide is not a valid hostname (for example, if you attempt to connect to "szackoverflow.com"), the hostname lookup that's being done automatically on your behalf will fail, resulting in a socket.gaierror exception ("gai" => getaddrinfo). On my Linux system, that looks like this:
Unexpected error: [Errno -2] Name or service not known
On a different operating system, the text provided with that error might be worded differently.
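The lookup failure is easy to reproduce and catch directly. A small sketch (the hostnames below are just examples; ".invalid" is a reserved TLD that never resolves):

```python
import socket

def resolves(host):
    # connect() runs getaddrinfo() under the hood; an unknown hostname
    # raises socket.gaierror before any connection is even attempted.
    try:
        socket.getaddrinfo(host, 80, socket.AF_INET, socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        return False

print(resolves('localhost'))              # True
print(resolves('szackoverflow.invalid'))  # False
```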

Connecting to Sharepoint Via Python - UrlLib Error

The following code (with variables changed) is what I am trying to use to contact a SharePoint server (and ultimately download files from it). But I appear to be getting an error related to urllib, and I can't work out the reason.
from sharepoint import SharePointSite, basic_auth_opener

server_url = "http://sharepoint.domain.com/"
opener = basic_auth_opener(server_url, "domain/username", "password")
site = SharePointSite(server_url, opener)

for sp_list in site.lists:
    print(sp_list.id, sp_list.meta['Title'])
The error is as follows
urllib.error.URLError: <urlopen error [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
Could you try the code below?
import requests
from requests_ntlm import HttpNtlmAuth

requests.get("http://sharepoint.domain.com", auth=HttpNtlmAuth('DOMAIN\\USERNAME', 'PASSWORD'))

SCTP server has an abnormal behaviour when connecting with a client

I have this code on a small server that receives requests from a client over an SCTP connection, and I keep getting this error every now and then:
BlockingIOError: [Errno 11] Resource temporarily unavailable
I know that I can avoid it with try/except, but I want a deeper understanding of the issue. Any help?
My code is below. This is the server:
server = ('', 29168)
sk = sctpsocket_tcp(socket.AF_INET)
sk.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sk.bindx([server])
sk.listen(5)

connection, addr = sk.accept()
c = True  # make sure the loop condition is defined on the first pass
while c:
    a, b, c, d = connection.sctp_recv(1024)
    print(c)
After going through the SCTP library again, I found a closed issue on GitHub with the solution.
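For context: [Errno 11] (EAGAIN) is raised when recv() is called on a non-blocking socket that has no data queued yet. Rather than swallowing the exception, you can wait until the socket is readable first. A sketch using select() on a plain connected socket pair (the same idea applies to the sctp_recv() call above):

```python
import select
import socket

def recv_when_ready(conn, nbytes, timeout=5.0):
    # Wait up to `timeout` seconds for `conn` to become readable, then
    # read; avoids BlockingIOError (EAGAIN) on non-blocking sockets.
    ready, _, _ = select.select([conn], [], [], timeout)
    if not ready:
        return None  # nothing arrived in time
    return conn.recv(nbytes)

# Demo with a connected socket pair:
a, b = socket.socketpair()
b.setblocking(False)
a.sendall(b'hello')
print(recv_when_ready(b, 1024))  # b'hello'
a.close()
b.close()
```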

Why doesn't my Python program reach the intended website?

I am attempting to make a web scraper in Python 3. I keep getting a WinError 10060 stating that the connection failed because the connected party did not properly respond, or the connected host failed to respond. Both urllib and requests produce the 10060 error; with requests, the error states that the max retries were exceeded with the URL.
import urllib.request

with urllib.request.urlopen('http://python.org') as response:
    html = response.read()
People have mentioned that it is likely a proxy or firewall issue as I am attempting to do this on my work network.
Turns out this was a proxy authentication error. Simply adding the proxy with your credentials in the get command solved this.
proxies = {'http': 'http://user:pass@url:8080', 'https': 'http://user:pass@url:8080'}
page = requests.get(webpage, proxies=proxies)
