How to create a tornado unit test with file upload? - python-3.x

I am trying to create a unit test where I need to upload a CSV file. Here is a snippet of what I am trying to do:
from tornado.testing import AsyncHTTPTestCase
import json
class TestCSV(AsyncHTTPTestCase):
    def test_post_with_duplicates_csv_returns_400(self, *args, **kwargs):
        dup_file = open("test.csv", 'r')
        body = {'upload': dup_file.read()}
        request_config = {
            'method': 'POST',
            'headers': {
                'Content-Type': 'application/json',
                'Origin': 'localhost'
            },
            'body': json.dumps(body)
        }
        response = self.fetch('http://localhost/file_upload', **request_config)
        self.assertEqual(response.code, 400)
and the actual code looks for the uploaded file like this:
...
file = self.request.files['upload'][0]
...
This returns a 500 status code with the following message:
HTTPServerRequest(protocol='http', host='127.0.0.1:46243', method='POST', uri='/v2/files/merchants/MWBVGS/product_stock_behaviors', version='HTTP/1.1', remote_ip='127.0.0.1')
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/tornado/web.py", line 1699, in _execute
result = await result
File "/usr/local/lib/python3.6/site-packages/tornado/gen.py", line 191, in wrapper
result = func(*args, **kwargs)
File "/usr/app/src/handlers/merchants.py", line 463, in post
file = self.request.files['upload'][0]
KeyError: 'upload'
Can someone help me understand why the file is not getting detected?
Env: Python 3.6, tornado

You're encoding the file as JSON, but the request.files fields are used for HTML multipart uploads. You need to decide which format you want to use (in addition to those formats, you can often just upload the file as the HTTP PUT body directly) and use the same format in the code and the test.
Tornado doesn't currently provide any tools for producing multipart uploads, but the python standard library's email.mime package does.
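For illustration, here is a minimal sketch of hand-building such a multipart body in the test. This is one option among several, not the only way: the boundary string is arbitrary, and the handler is assumed to read self.request.files['upload'] as in the question (tornado only populates request.files for multipart/form-data bodies).
# inside TestCSV(AsyncHTTPTestCase)
def test_post_with_duplicates_csv_returns_400(self):
    boundary = 'csv-test-boundary'  # any string that does not occur in the file
    with open('test.csv', 'rb') as f:
        file_data = f.read()
    body = (
        b'--' + boundary.encode() + b'\r\n'
        b'Content-Disposition: form-data; name="upload"; filename="test.csv"\r\n'
        b'Content-Type: text/csv\r\n'
        b'\r\n' + file_data + b'\r\n'
        b'--' + boundary.encode() + b'--\r\n'
    )
    response = self.fetch(
        '/file_upload',
        method='POST',
        headers={'Content-Type': 'multipart/form-data; boundary=' + boundary},
        body=body,
    )
    self.assertEqual(response.code, 400)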

Related

Why can't I upload a file into Dropbox with a proxy?

The urllib library installed in my OS:
pip list | grep urllib
urllib3 1.25.11
I want to upload a local file into Dropbox through a proxy:
import dropbox
access_token = "xxxxxx"
file_from = "local_file"
file_to = "/directory_in_dropbox"
proxyDict = {
    "http": "http://127.0.0.1:8123",
    "https": "https://127.0.0.1:8123"
}
mysesh = dropbox.create_session(1, proxyDict)
dbx = dropbox.Dropbox(access_token, session=mysesh)
with open(file_from, 'rb') as f:
    dbx.files_upload(f.read(), file_to)
It encounters this error:
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/base.py", line 3208, in files_upload
r = self.request(
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/dropbox_client.py", line 326, in request
res = self.request_json_string_with_retry(host,
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/dropbox_client.py", line 476, in request_json_string_with_retry
return self.request_json_string(host,
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/dropbox_client.py", line 589, in request_json_string
r = self._session.post(url,
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 696, in urlopen
self._prepare_proxy(conn)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 966, in _prepare_proxy
conn.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 359, in connect
conn = self._connect_tls_proxy(hostname, conn)
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 500, in _connect_tls_proxy
return ssl_wrap_socket(
File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 453, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
File "/usr/lib/python3/dist-packages/urllib3/util/ssl_.py", line 495, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock)
File "/usr/lib/python3.9/ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "/usr/lib/python3.9/ssl.py", line 997, in _create
raise ValueError("check_hostname requires server_hostname")
ValueError: check_hostname requires server_hostname
Writing the proxy dict as below doesn't help either:
proxyDict = {
    "http": "http://127.0.0.1:8123",
    "https": "http://127.0.0.1:8123"
}
The proxy 127.0.0.1:8123 works fine; I can download resources from the web through a proxy with the youtube-dl command:
youtube-dl --proxy http://127.0.0.1:8118 $url
Updated for Paulo's advice:
Updated for Markus' advice:
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
ssl.SSLContext.verify_mode = property(lambda self: ssl.CERT_NONE, lambda self, newval: None)
import dropbox
access_token = "xxxxxxxx"
file_from = "/home/debian/sample.sql"
file_to = "/mydoc"
proxyDict = {
    "http": "http://127.0.0.1:8123",
    "https": "https://127.0.0.1:8123"
}
mysesh = dropbox.create_session(1, proxyDict)
dbx = dropbox.Dropbox(access_token, session=mysesh)
with open(file_from, 'rb') as f:
    dbx.files_upload(f.read(), file_to)
It encounters the error below:
/home/debian/.local/lib/python3.9/site-packages/urllib3/connectionpool.py:981: InsecureRequestWarning: Unverified HTTPS request is being made to host '127.0.0.1'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
warnings.warn(
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/base.py", line 3208, in files_upload
r = self.request(
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/dropbox_client.py", line 326, in request
res = self.request_json_string_with_retry(host,
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/dropbox_client.py", line 476, in request_json_string_with_retry
return self.request_json_string(host,
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/dropbox_client.py", line 596, in request_json_string
self.raise_dropbox_error_for_resp(r)
File "/home/debian/.local/lib/python3.9/site-packages/dropbox/dropbox_client.py", line 639, in raise_dropbox_error_for_resp
raise AuthError(request_id, err)
dropbox.exceptions.AuthError: AuthError('xxxxxxxxxxxxxxxxxxxxxx', AuthError('invalid_access_token', None))
Update for Life is complex's advice:
I tried many times to get mysesh = dropbox.create_session(1,proxyDict) to work correctly.
I decided to look at the code for dropbox-sdk-python and noted that it is calling requests.Session(). So I decided to use that over dropbox.create_session()
import requests
from dropbox import Dropbox
from dropbox.files import WriteMode

access_token = "my_access_token"
file_from = 'test.docx'
file_to = '/test.docx'

# https://free-proxy-list.net
proxyDict = {
    "http": "http://50.218.57.65:80",
    "https": "https://83.229.73.175:80"
}

s = requests.Session()
s.proxies = proxyDict

dbx = Dropbox(access_token, session=s)
with open(file_from, 'rb') as f:
    file_content = f.read()
    dbx.files_upload(f=file_content, path=file_to, mode=WriteMode.overwrite, mute=False)
Here is a screenshot of the file being written to DropBox.
I have tried this code with multiple proxy servers and it works each time.
TL;DR:
So far, my understanding is it may be
misuse of urllib, or
bad HTTPS certificates.
Solution (maybe)
urllib format
If I remember well, urllib changed its format at some point from
proxyDict = {
    'http': '8.88.888.8:8888',
    'https': '8.88.888.8:8888'
}
to
proxyDict = {
    'https': 'https://8.88.888.8:8888',
    'http': 'http://8.88.888.8:8888',
}
Have you tried both formats?
You must have a problem with
your proxy not forwarding some stuff the right way, or
your access token being wrong, or
the Dropbox app having the wrong permissions set,
because this code (which is basically what you have in your question - even without disabling SSL certificate check!) works just fine with my access token put into the environment variable DROPBOX_ACCESS_TOKEN.
import dropbox
import sys
import os

DROPBOX_ACCESS_TOKEN = os.getenv('DROPBOX_ACCESS_TOKEN')

def uploadFile(fromFilePath, toFilePath):
    proxy = '127.0.0.1:3128'  # locally installed squid proxy server
    proxyDict = {
        "http": "http://" + proxy,
        "https": "http://" + proxy  # connection to proxy is http!!
    }
    session = dropbox.create_session(1, proxyDict)
    client = dropbox.Dropbox(DROPBOX_ACCESS_TOKEN, session)
    client.files_upload(open(fromFilePath, "rb").read(), toFilePath)
    print("Done uploading {} to {}".format(fromFilePath, toFilePath))

if __name__ == "__main__":
    uploadFile(sys.argv[1], sys.argv[2])
Be aware though, that the access token - once it is generated - has the permissions that were in effect at the time of token generation. If you change the app's permissions AFTER generating the token, the token will still have the original permissions!
EDIT: It looks like the Dropbox API is clever enough NOT to use the proxy if it can reach the target directly. Thus this code works with ANYTHING you put into the proxyDict, and it is not at all clear whether the code would still work if it really had to go through the proxy. I am working on verifying that and will update the answer accordingly.
Update: I installed squid on my MacBook and used http://127.0.0.1:3128 as the proxy in the above code, but the logs showed the code never even tried to go through the proxy. But once I set the environment variables http_proxy and https_proxy to "http://127.0.0.1:3128", the request WOULD go through the proxy and proceed successfully. So... either there is something going on that I don't fully understand, or the Dropbox API has some problem with the proxy definitions in the create_session call. Time to look at the API source code, I guess...
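For reference, a minimal sketch of the environment-variable approach described above (assuming the same locally installed squid proxy; requests, which the Dropbox SDK uses underneath, honors these variables at request time when no explicit proxies are configured):
import os

# Assumed local squid proxy from the answer above.
os.environ['http_proxy'] = 'http://127.0.0.1:3128'
os.environ['https_proxy'] = 'http://127.0.0.1:3128'

import dropbox

# No proxies passed explicitly; the underlying requests session picks the
# proxy up from the environment variables instead.
dbx = dropbox.Dropbox(os.getenv('DROPBOX_ACCESS_TOKEN'))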
Thanks to Life is complex's code, I added permissions on Files and folders.
I then regenerated the Dropbox token and executed the same code (nothing changed) with the new token. Done!
It had nothing to do with the proxy settings, just the Dropbox settings!

Flask restx multipart/form request with file and body with swagger

I am trying to implement, using flask-restx, an endpoint which will take both formData (a list of files, to be more precise) and a body as JSON. My code looks as follows:
Multiple files param in some module:
def authorization_param(ns: Namespace, parser: Optional[RequestParser] = None) -> RequestParser:
    if not parser:
        parser = ns.parser()
    parser.add_argument('Authorization', location='headers', required=False, default='Bearer ')
    return parser

def multiple_file_param(arg_name: str, ns: Namespace, parser: Optional[RequestParser] = None) -> RequestParser:
    if not parser:
        parser = ns.parser()
    parser.add_argument(arg_name, type=FileStorage, location='files', required=True, action='append')
    return parser
Model:
some_form_model = api.model('form', {'field': fields.String()})
And the endpoint itself:
ns = Namespace('sth', description='Some stuff')
auth_param = authorization_param(ns=ns)
file_param = multiple_file_param(arg_name='File', ns=ns)

@ns.route('/files')
@ns.expect(auth_param)
class PreprocessFiles(Resource):
    @ns.response(code=201, description='Job created', model=some_model)
    @ns.response(code=400, description='Bad request', model=None)
    @ns.response(code=401, description='Authentication Error', model=None)
    @ns.response(code=403, description='Forbidden', model=None)
    @ns.response(
        code=422,
        description='Input data validation Error',
        model=some_model
    )
    @ns.expect(some_form_model)
    @ns.expect(file_param)
    def post(self):
        payload = request.get_json()
        # do some stuff..
        return {'text': 'ok'}, 201
The endpoint is registered in an API object:
api.add_namespace(ns)
My problem is that in swagger I get either the input body or the file parameter, depending on the order of the decorators I use. If I try to pass both the form model and the file param into one ns.expect like so:
@ns.expect(some_form_model, file_param)
I get the following error in the console and the schema is not rendered:
2022-08-26 12:19:45.764 ERROR flask_restx.api api.__schema__: Unable to render schema
Traceback (most recent call last):
File "D:\Project\venv\lib\site-packages\flask_restx\api.py", line 571, in __schema__
self._schema = Swagger(self).as_dict()
File "D:\Project\venv\lib\site-packages\flask_restx\swagger.py", line 239, in as_dict
serialized = self.serialize_resource(
File "D:\Project\venv\lib\site-packages\flask_restx\swagger.py", line 446, in serialize_resource
path[method] = self.serialize_operation(doc, method)
File "D:\Project\venv\lib\site-packages\flask_restx\swagger.py", line 469, in serialize_operation
if any(p["type"] == "file" for p in all_params):
File "D:\Project\venv\lib\site-packages\flask_restx\swagger.py", line 469, in <genexpr>
if any(p["type"] == "file" for p in all_params):
KeyError: 'type'
Is there any way to get around this? I would really like to have good swagger docs for the frontend folks.
Thanks in advance!
Best,
Mateusz
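No answer is recorded here, but one commonly suggested workaround is to skip the JSON body model entirely and declare every field on the same RequestParser, with the scalar fields as form data, so swagger renders a single multipart/form-data operation. A rough sketch under that assumption, reusing the question's names (not verified against every flask-restx version):
from flask_restx import Namespace, Resource
from werkzeug.datastructures import FileStorage

ns = Namespace('sth', description='Some stuff')

upload_parser = ns.parser()
upload_parser.add_argument('File', type=FileStorage, location='files',
                           required=True, action='append')
# Body fields become ordinary form fields instead of a JSON model.
upload_parser.add_argument('field', type=str, location='form')

@ns.route('/files')
class PreprocessFiles(Resource):
    @ns.expect(upload_parser)
    def post(self):
        args = upload_parser.parse_args()
        files = args['File']   # list of FileStorage objects
        field = args['field']  # the former JSON body field, now form data
        # do some stuff..
        return {'text': 'ok'}, 201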

How to fix error urllib.error.HTTPError: HTTP Error 400: BAD REQUEST?

I have a script (test.py) to test some API, like this:
def get_response(fct, data, method=GET):
    """
    Performs the query to the server and returns a string containing the
    response.
    """
    assert(method in (GET, POST))
    url = f'http://{hostname}:{port}/{fct}'
    if method == GET:
        encode_data = parse.urlencode(data)
        response = request.urlopen(f'{url}?{encode_data}')
    elif method == POST:
        response = request.urlopen(url, parse.urlencode(data).encode('ascii'))
    return response.read()
In terminal I call:
python test.py -H 0.0.0.0 -P 5000 --add-data
The traceback:
Traceback (most recent call last):
File "test.py", line 256, in <module>
add_plays()
File "test.py", line 82, in add_plays
get_response("add_channel", {"name": channel}, method=POST)
File "test.py", line 43, in get_response
response = request.urlopen(url, parse.urlencode(data).encode('ascii'))
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: BAD REQUEST
The data is {"name": "Channel1"}. I can't understand what is wrong. Can someone please give me a tip or show what's wrong?
When I call it using curl, it works:
curl -X POST -H "Content-Type: application/json" -d '{"name": "Channel1"}' http://0.0.0.0:5000/add_channel
I solved the problem by changing the test script:
The API expected JSON_MIME_TYPE = 'application/json', so I added that header to the request, as shown below.
The script was also using the wrong encoding, because some text in the JSON couldn't be encoded in ASCII. E.g. "Omö" encoded as ASCII raises the exception UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 1: ordinal not in range(128). So I changed it to UTF-8.
Here is the fixed code:
def get_response(fct, data, method=GET):
    """
    Performs the query to the server and returns a string containing the
    response.
    """
    assert(method in (GET, POST))
    url = f'http://{hostname}:{port}/{fct}'
    if method == GET:
        encode_data = parse.urlencode(data)
        req = request.Request(f'{url}?{encode_data}',
                              headers={'content-type': 'application/json'})
        response = request.urlopen(req)
    elif method == POST:
        params = json.dumps(data)
        binary_data = params.encode('utf8')
        req = request.Request(url,
                              data=binary_data,
                              headers={'content-type': 'application/json'})
        response = request.urlopen(req)
    x = response.read()
    return x
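A hypothetical call mirroring the curl command above (GET, POST, hostname, and port are defined elsewhere in the original script; the values below are assumptions for the sake of a runnable snippet):
GET, POST = 'GET', 'POST'         # assumed flag values
hostname, port = '0.0.0.0', 5000  # matching the curl example

# Sends {"name": "Channel1"} as UTF-8 JSON with the right content type.
print(get_response('add_channel', {'name': 'Channel1'}, method=POST))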

How do I handle aiohttp exceptions that occur when handling other exceptions?

I have a Python server set up with aiohttp that accepts files POST'd to a specific endpoint. I only want to accept a JSON body or gzip'd JSON files. My code is as follows:
class Uploader(View):
    async def post(self):
        if not self.request.can_read_body:
            return json_response({'message': 'Cannot read body'}, status=400)
        elif self.request.content_type != 'application/json' and self.request.content_type != 'multipart/form-data':
            return json_response({'message': 'Incorrect data type sent to the server'}, status=400)
        try:
            json_body = await self.request.json()
            # Other bits of code using the json body
        except RequestPayloadError as e:
            # Internal logging here
            return json_response({'message': 'Unable to read payload'}, status=400)
        # Other code for handling ValidationError, JSONDecodeError, Exception
        return json_response({'message': 'File successfully uploaded'}, status=201)
When I test this by uploading something that isn't json or gzip'd json, the RequestPayloadError exception is correctly being hit, the internal logging is being done as expected, and the client is being returned the expected response. However, I'm also seeing the following unhandled exception:
Unhandled exception
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/aiohttp/web_protocol.py", line 428, in start
await payload.readany()
File "/usr/local/lib/python3.6/site-packages/aiohttp/streams.py", line 325, in readany
raise self._exception
File "/web_api/app/views/resources/Uploader.py", line 49, in post
json_body = await self.request.json()
File "/usr/local/lib/python3.6/site-packages/aiohttp/web_request.py", line 512, in json
body = await self.text()
File "/usr/local/lib/python3.6/site-packages/aiohttp/web_request.py", line 506, in text
bytes_body = await self.read()
File "/usr/local/lib/python3.6/site-packages/aiohttp/web_request.py", line 494, in read
chunk = await self._payload.readany()
File "/usr/local/lib/python3.6/site-packages/aiohttp/streams.py", line 325, in readany
raise self._exception
aiohttp.web_protocol.RequestPayloadError: 400, message='Can not decode content-encoding: gzip'
How am I supposed to handle this currently unhandled exception given that it doesn't seem to be originating in my code, and I'm already handling the one that I'm expecting? Can I suppress aiohttp exceptions somehow?
EDIT: I'm using version 3.1.1 of aiohttp
Can not decode content-encoding: gzip points to the source of the problem.
Your peer sends data with a Content-Encoding: gzip HTTP header, but the data is not actually gzip-compressed (another compressor was used, or no compressor at all).
As a result, aiohttp fails to decompress the data and raises the RequestPayloadError exception.
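To illustrate the diagnosis, here is a sketch of what a well-behaved client would send: the body really is gzip-compressed when the header claims it is. The URL and payload are placeholders.
import gzip
import json

import aiohttp

async def upload(session: aiohttp.ClientSession, url: str, payload: dict) -> int:
    raw = json.dumps(payload).encode('utf-8')
    compressed = gzip.compress(raw)  # body now actually matches the header below
    headers = {
        'Content-Type': 'application/json',
        'Content-Encoding': 'gzip',
    }
    async with session.post(url, data=compressed, headers=headers) as resp:
        return resp.status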

How to create delegate/nested async context manager for aiohttp?

I want to create a custom request manager for a crawler, with dynamic waiting.
My crawler needs to make requests to sites that prohibit parallel requests from the same IP address. If such blocking occurs, the requests return HTTP error codes 403, 503, 429, etc.
In case of an error I want to wait some time and repeat the request, but to keep the parsers simple they should just call get and receive the correct page.
I want to use aiohttp and the new async with syntax of Python 3.5, so my parser classes can use async with on my requester class the same way as if they were using aiohttp.ClientSession, like this:
# somewhere in a parser
async def get_page(self, requester, page_index):
    async with requester.get(URL_FMT.format(page_index)) as response:
        html_content = await response.read()
        result = self.parsing_page(html_content)
        return result
If requester is aiohttp.ClientSession, then response is aiohttp.ClientResponse, which has __aenter__ and __aexit__ methods, so async with works as expected.
But if I put my requester class in the middle, it no longer works.
Traceback (most recent call last):
File "/opt/project/api/tornado_runner.py", line 6, in <module>
from api import app
File "/opt/project/api/api.py", line 20, in <module>
loop.run_until_complete(session.login())
File "/usr/local/lib/python3.5/asyncio/base_events.py", line 337, in run_until_complete
return future.result()
File "/usr/local/lib/python3.5/asyncio/futures.py", line 274, in result
raise self._exception
File "/usr/local/lib/python3.5/asyncio/tasks.py", line 239, in _step
result = coro.send(None)
File "/opt/project/api/viudata/session.py", line 72, in login
async with self.get('https://www.viudata.com') as resp:
AttributeError: __aexit__
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x7f44f61ef240>
It looks like this.
class Requester:
    def __init__(self, session: aiohttp.ClientSession):
        self.session = session

    async def get(self, *args, **kwargs):
        is_result_successful = False
        while not is_result_successful:
            response = await self.session.get(*args, **kwargs)
            if response.status in [503, 403, 429]:
                await self.wait_some_time()
            else:
                is_result_successful = True
        return response
From my understanding, self.session.get is a coroutine function, so I await it. The result is an aiohttp.ClientResponse, which has __aenter__ and __aexit__. But if I return it, the parser's async with block raises the odd error above.
Can you tell me what I need to change so that async with works with my requester class the same way it does with aiohttp.ClientSession?
You should write additional code to support the async with protocol.
See client.request() and _RequestContextManager for inspiration.
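Following that pointer, here is a minimal sketch of such a wrapper, modeled loosely on _RequestContextManager. The retry loop is the one from the question; wait_some_time is given a placeholder implementation since the question leaves it unspecified.
import asyncio

import aiohttp

class _RequesterContextManager:
    """Context manager so `async with requester.get(...)` works."""

    def __init__(self, coro):
        self._coro = coro
        self._response = None

    async def __aenter__(self):
        self._response = await self._coro
        return self._response

    async def __aexit__(self, exc_type, exc, tb):
        self._response.release()

class Requester:
    def __init__(self, session: aiohttp.ClientSession):
        self.session = session

    def get(self, *args, **kwargs):
        # Deliberately NOT a coroutine: return the context manager
        # synchronously so `async with` finds __aenter__/__aexit__ on it.
        return _RequesterContextManager(self._get(*args, **kwargs))

    async def _get(self, *args, **kwargs):
        while True:
            response = await self.session.get(*args, **kwargs)
            if response.status not in (503, 403, 429):
                return response
            response.release()  # free the connection before retrying
            await self.wait_some_time()

    async def wait_some_time(self):
        await asyncio.sleep(1)  # placeholder backoff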
