I'm using urllib3 and I want to see the headers that are sent.
I found this in the documentation, but it doesn't print the headers:
urllib3.add_stderr_logger(1)
Is there any way of doing this?
Right now, the best way to get really verbose logging from urllib3, including the headers it sends, is to override the default value in httplib (which urllib3 uses internally).
For Python 3:
# You'll need to do this before urllib3 creates any http connection objects
import http.client
http.client.HTTPConnection.debuglevel = 5
# Now you can use urllib3 as normal
import urllib3
http = urllib3.PoolManager()
r = http.request('GET', ...)
In Python 2, the HTTPConnection object lives under the httplib module.
This will turn on verbose logging for anything that uses httplib. Note that this is not using the documented API for httplib, but it's monkeypatching the default value for the HTTPConnection class.
The goal is to add better urllib3-native logging for these kinds of things, but it hasn't been implemented yet. Related issue: https://github.com/shazow/urllib3/issues/107
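Until that lands, if you only want the extra output around specific calls, you can scope the monkeypatch yourself. A minimal sketch (this context-manager helper is my own, not part of urllib3):

```python
import contextlib
import http.client

@contextlib.contextmanager
def http_debug(level=1):
    """Temporarily set httplib's wire-level debug output, then restore it."""
    old = http.client.HTTPConnection.debuglevel
    http.client.HTTPConnection.debuglevel = level
    try:
        yield
    finally:
        http.client.HTTPConnection.debuglevel = old
```

Any request made inside `with http_debug(5):` then prints the sent headers, while code outside the block stays quiet.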
I'm trying to use the dask.distributed Python API to start a scheduler. The example provided in http://distributed.dask.org/en/latest/setup.html#using-the-python-api works as expected, but it does not explain how to supply the options needed to start the Bokeh web interface.
From inspecting the dask.distributed source code, I understand I need to provide the Bokeh options using Scheduler(services={}). Unfortunately, I have not been able to find the correct dictionary format for services={}.
Below is the code for my dask scheduler function.
import collections
import dask.distributed as daskd
import tornado
import threading

def create_dask_scheduler(scheduler_options_dict):
    # Define and start tornado
    tornado_loop = tornado.ioloop.IOLoop.current()
    tornado_thread = threading.Thread(target=tornado_loop.start, daemon=True)
    tornado_thread.start()
    # Define and start scheduler
    dask_scheduler = daskd.Scheduler(
        loop=tornado_loop,
        synchronize_worker_interval=scheduler_options_dict['synchronize_worker_interval'],
        allowed_failures=scheduler_options_dict['allowed_failures'],
        services=scheduler_options_dict['services'])
    dask_scheduler.start('tcp://:8786')
    return dask_scheduler

scheduler_options_dict = collections.OrderedDict()
scheduler_options_dict = {'synchronize_worker_interval': 60, 'allowed_failures': 3,
                          'services': {('http://hpcsrv', 8787): 8787}}
dask_scheduler = create_dask_scheduler(scheduler_options_dict)
The error I get is:
Exception in thread Thread-4: Traceback (most recent call last):
/uf5a/nbobolea/bin/anaconda2019.03_python3.7/envs/optimization/lib/python3.7/site-packages/ipykernel_launcher.py:18:
UserWarning: Could not launch service 'http://hpcsrv' on port 8787.
Got the following message: 'int' object is not callable
distributed.scheduler - INFO - Scheduler at:
tcp://xxx.xxx.xxx.xxx:8786
Help and insight are very much appreciated.
You want
'services': {('bokeh', dashboard_address): (BokehScheduler, {})}
where dashboard_address is something like "localhost:8787" and BokehScheduler is in distributed.bokeh.scheduler. You will need to read up on the Bokeh server to see what additional kwargs could be passed in that empty dictionary.
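Put together, the asker's options dict would become something like the sketch below. I use a stand-in class so the shape is clear without a running scheduler; in real code you would import BokehScheduler from distributed.bokeh.scheduler as described above:

```python
# Sketch: the services mapping dask.distributed expects.
# Each value is a (ServiceClass, kwargs) tuple; the failing example mapped
# ('http://hpcsrv', 8787) to a bare int, hence "'int' object is not callable".

class BokehScheduler:  # stand-in for distributed.bokeh.scheduler.BokehScheduler
    def __init__(self, scheduler, io_loop=None, **kwargs):
        self.scheduler = scheduler

scheduler_options_dict = {
    'synchronize_worker_interval': 60,
    'allowed_failures': 3,
    'services': {('bokeh', 8787): (BokehScheduler, {})},
}
```

The key point is that the scheduler calls the first element of the tuple to construct the service, which is why a plain port number in that position fails.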
I have numerous Bokeh server files in a directory, say /dir/bokeh/; assume the Bokeh servers are called bokeh1.py, bokeh2.py, bokeh3.py.
The file structure is like so:
|--dir
|---flask.py
|---bokeh
|--bokeh1.py
|--bokeh2.py
I am deploying them all on Flask like so:
files = []
for file in os.listdir("/dir/bokeh/"):
    if file.endswith('.py'):
        file = "bokeh/" + file
        files.append(file)

argvs = {}
urls = []
for i in files:
    argvs[i] = None
    urls.append(i.split('\\')[-1].split('.')[0])
host = 'myhost.com'

apps = build_single_handler_applications(files, argvs)
bokeh_tornado = BokehTornado(apps, extra_websocket_origins=["myhost.com"])
bokeh_http = HTTPServer(bokeh_tornado)
sockets, port = bind_sockets("myhost.com", 0)
bokeh_http.add_sockets(sockets)
On updating to Tornado 6.0.2 and deploying Flask, I get the RuntimeError "There is no current event loop in thread 'Thread-1'". On deeper research: Tornado now uses asyncio by default and imposes some restrictions, so I added asyncio.set_event_loop(asyncio.new_event_loop()) as follows.
def bk_worker():
    asyncio.set_event_loop(asyncio.new_event_loop())  ####
    server = BaseServer(IOLoop.current(), bokeh_tornado, bokeh_http)
    server.start()
    server.io_loop.start()
    gc.collect()

from threading import Thread
Thread(target=bk_worker).start()
However, upon opening a Bokeh server URL through Flask, the selected Bokeh server (any of them) does not load and simply returns a blank page. How can I circumvent this?
Setting asyncio.set_event_loop_policy(AnyThreadEventLoopPolicy()) yields the same result.
Edit: the previous code works with Python 2/3 and Tornado 4.5.3.
I think this is a known Bokeh issue. The best way for now is to downgrade to Tornado 4.5.3.
pip install tornado==4.5.3
VeriBlock has no Python gRPC example. The return information may not be available due to encoding problems; I'm not sure. I hope someone can provide an example. Thank you very much.
I'm working on a more comprehensive example, but for connecting via gRPC and displaying the current block number and node info, this should get you started.
from __future__ import print_function
import json
import grpc
import veriblock_pb2 as vbk
import veriblock_pb2_grpc as vbkrpc

channel = grpc.insecure_channel('localhost:10500')
stub = vbkrpc.AdminStub(channel)

def GetStateInfoRequest():
    response = stub.GetStateInfo(vbk.GetStateInfoRequest())
    response = json.dumps({"connected_peer_count": response.connected_peer_count,
                           "network_height": response.network_height,
                           "local_blockchain_height": response.local_blockchain_height,
                           "network_version": response.network_version,
                           "program_version": response.program_version,
                           "nodecore_starttime": response.nodecore_starttime,
                           "wallet_cache_sync_height": response.wallet_cache_sync_height})
    print(response)

def getBlock():
    response = stub.GetInfo(vbk.GetInfoRequest())
    response = (response.number_of_blocks - 1)
    print(response)

getBlock()
GetStateInfoRequest()
Hope it helps.
Is there a specific Python question, like calling a function or API, or expected output?
VeriBlock NodeCore does support python, via grpc (https://grpc.io/docs/tutorials/basic/python.html)
FWIW, there is a pre-compiled output for gRPC that includes Python:
https://github.com/VeriBlock/nodecore-releases/releases/tag/v0.4.1-grpc
python/
    veriblock_pb2.py
    veriblock_pb2_grpc.py
There is a C# example here: https://github.com/VeriBlock/VeriBlock.Demo.Rpc.Client (obviously not python, but maybe useful as a conceptual example)
Referencing not getting all cookie info using python requests module
The OP saw many cookies being set on Chrome, but does not see most of those cookies in his Python Requests code. The reason given was that "The cookies being set are from other pages/resources, probably loaded by JavaScript code."
This is the function I'm using to try and get cookies that are loaded when a URL is accessed:
from requests import get
from requests.exceptions import RequestException
from contextlib import closing

def get_cookies(url):
    """
    Returns the cookies from the response of `url` when making a HTTP GET request.
    """
    try:
        with closing(get(url, stream=True)) as resp:
            return resp.cookies
    except RequestException as e:
        print('Error during requests to {0} : {1}'.format(url, str(e)))
        return None
But using this function, I only see cookies set by the URL, and not others like advertisement cookies. Given this setup, how do I view the other cookies, exactly as how Chrome sees them? I.e. how do I see all cookies being set when a GET request is made, including those from other pages/resources?
Took a bit of work, but I managed to get it working.
Basically, I needed Selenium and Chrome to actually load the website and all the 3rd-party stuff. One of the outputs is a sqlite3 database of cookies at ./chrome_dir/Default/Cookies, which you can fetch for your own use.
from selenium import webdriver
import sqlite3

def get_cookies(url):
    """
    Returns the cookies set while loading `url`, including those set
    by the other resources the page loads.
    """
    co = webdriver.ChromeOptions()
    co.add_argument("--user-data-dir=chrome_dir")  # creates a directory to store all the chrome data
    driver = webdriver.Chrome(chrome_options=co)
    driver.get(url)
    driver.quit()
    conn = sqlite3.connect(r'./chrome_dir/Default/Cookies')
    c = conn.cursor()
    c.execute("SELECT * FROM 'cookies'")
    return c.fetchall()
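If you only need host/name/value triples rather than every column, you can select specific fields instead of `*`. A sketch (the column names `host_key`, `name`, and `value` follow a typical Chrome schema and may differ between Chrome versions; modern Chrome also often leaves `value` empty and stores the payload encrypted in `encrypted_value`):

```python
import sqlite3

def read_cookie_pairs(db_path):
    """Return (host, name, value) rows from a Chrome Cookies sqlite file.

    Column names are assumptions based on a typical Chrome schema and
    can differ between Chrome versions.
    """
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT host_key, name, value FROM cookies").fetchall()
    finally:
        conn.close()
```

Point it at the same `./chrome_dir/Default/Cookies` file the function above produces.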
I know that it is not advisable to use GET this way, however I am not in control of how this server works and have very little experience with requests.
I'm looking to send a dictionary via a GET request and was told that the server has been set up to accept this, but I'm not sure how that works. I have tried using
import requests
r = requests.get('www.url.com', data='foo:bar')
but this leaves the webpage unaltered, any ideas?
To send a request body with a GET request, you must override the method on a POST request, e.g.:
request_header = {
    'X-HTTP-Method-Override': 'GET'
}
response = requests.post(request_uri, request_body, headers=request_header)
Use requests like this, passing the data in the data field of the request:
requests.get(url, headers=head, data=json.dumps({"user_id": 436186}))
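For what it's worth, you can check what requests would actually put on the wire without hitting the network by preparing the request locally (the URL and payload below are placeholders, not from the question's server):

```python
import json
import requests

# Build a GET carrying a JSON body and prepare it locally, so we can
# inspect the method and body that would be sent.
req = requests.Request(
    'GET',
    'http://example.com/endpoint',          # placeholder URL
    data=json.dumps({'user_id': 436186}),
)
prepared = req.prepare()
print(prepared.method, prepared.body)
```

This shows that requests will happily attach a body to a GET; whether the server reads it is another matter.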
It seems that you are using the wrong parameters for the get request. The doc for requests.get() is here.
You should use params instead of data as the parameter.
You are missing the http in the url.
The following should work:
import requests
r = requests.get('http://www.url.com', params={'foo': 'bar'})
print(r.content)
The actual request can be inspected via r.request.url, it should be like this:
http://www.url.com?foo=bar
If you're not sure about how the server works, you should send a POST request, like so:
import requests
data = {'name': 'value'}
requests.post('http://www.example.com', data=data)
If you absolutely need to send data with a GET request, make sure that data is in a dictionary and instead pass information with params keyword.
You may find the requests documentation helpful.