I have been studying the following code in CherryPy web development:
if returnpage != '':
    raise cherrypy.InternalRedirect(returnpage)
else:
    raise cherrypy.HTTPRedirect("/hqc")
Google didn't help much in this case after I did some research. I've checked cherrypy's __doc__, but the documentation there is very terse.
>>> print(cherrypy.InternalRedirect.__doc__)
Exception raised to switch to the handler for a different URL.
This exception will redirect processing to another path within the site
(without informing the client). Provide the new path as an argument when
raising the exception. Provide any params in the querystring for the new URL.
>>> print(cherrypy.HTTPRedirect.__doc__)
Exception raised when the request should be redirected.
This exception will force a HTTP redirect to the URL or URL's you give it.
The new URL must be passed as the first argument to the Exception,
e.g., HTTPRedirect(newUrl). Multiple URLs are allowed in a list.
If a URL is absolute, it will be used as-is. If it is relative, it is
assumed to be relative to the current cherrypy.request.path_info.
If one of the provided URL is a unicode object, it will be encoded
using the default encoding or the one passed in parameter.
There are multiple types of redirect, from which you can select via the
``status`` argument. If you do not provide a ``status`` arg, it defaults to
303 (or 302 if responding with HTTP/1.0).
Examples::
raise cherrypy.HTTPRedirect("")
raise cherrypy.HTTPRedirect("/abs/path", 307)
raise cherrypy.HTTPRedirect(["path1", "path2?a=1&b=2"], 301)
See :ref:`redirectingpost` for additional caveats.
My questions are:
- Why bother with redirect when you can simply invoke another handler?
- What are some practical scenarios for the two redirect exceptions, respectively?
InternalRedirect is handled only on the server side, which means the client is not aware of the redirection: in terms of the HTTP protocol mediating the session between the client and the server, nothing changed. By server side I mean ONLY CherryPy will be aware of the redirection; if you have an intermediate server (like an nginx reverse proxy) it will not see anything different.
For example, if the client visits the URL /page_one and you use raise InternalRedirect('/page_two'), the client (browser) will receive the content from the /page_two handler at the /page_one URL. If you raise a regular HTTPRedirect instead, the server ends the first request with an HTTP status code of 303 (or any other status you passed to the exception) and a Location header pointing to /page_two. It is then the client that initiates another request to /page_two, so basically everybody is aware of the redirection (more info about HTTP redirection). Most of the time this is the better alternative.
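As a minimal sketch of the two behaviours side by side (the /page_one and /page_two paths are the hypothetical ones from above; the page_one_http handler name is just made up for the comparison):
import cherrypy

class Pages:

    @cherrypy.expose
    def page_one(self):
        # Server-side only: the browser keeps the /page_one URL but
        # receives the content produced by the /page_two handler.
        raise cherrypy.InternalRedirect('/page_two')

    @cherrypy.expose
    def page_one_http(self):
        # Client-visible: the browser gets a 303 with Location: /page_two
        # and issues a second request to /page_two itself.
        raise cherrypy.HTTPRedirect('/page_two')

    @cherrypy.expose
    def page_two(self):
        return "content of page two"

cherrypy.quickstart(Pages())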
Additionally, you can detect whether the request came from a previous InternalRedirect by checking the cherrypy.request.prev attribute. It holds the previous cherrypy.request object, or None.
As a possible (maybe not the best) example of using an InternalRedirect, check out this production/beta example page; in addition, I added a tool to prevent the client from reaching those handlers directly.
The client will see different content on the same page /. Note that the access log CherryPy generates will record the URL of the handler that ends up handling the request; in this case you will see /_beta or /_production.
import random

import cherrypy


@cherrypy.tools.register('before_handler')
def private_handler():
    """End the request with HTTP 404 not found if the client
    tries to reach the handler directly instead of being
    internally redirected from another handler.
    """
    if cherrypy.request.prev is None:
        raise cherrypy.NotFound()


class MainApp:

    @cherrypy.expose
    def index(self):
        # 50/50 chance of receiving production or the new SHINY beta page
        use_beta = random.randint(0, 1)
        if use_beta:
            raise cherrypy.InternalRedirect('/_beta')
        else:
            raise cherrypy.InternalRedirect('/_production')

    @cherrypy.tools.private_handler()
    @cherrypy.expose
    def _production(self):
        return (
            "<html>"
            "<h2>{}</h2>"
            "</html>"
        ).format(
            "Welcome to our awesome site!"
        )

    @cherrypy.tools.private_handler()
    @cherrypy.expose
    def _beta(self):
        return (
            "<html>"
            '<h1 style="color: blue">{}</h1>'
            "<p>{}</p>"
            "</html>"
        ).format(
            "Welcome to our awesome site!",
            "Here is our new beta content..."
        )


cherrypy.quickstart(MainApp())
I have a Pyramid web app with fail2ban set up to jail IPs after ten consecutive 404 statuses (i.e. bots that probe for vulnerabilities), Sentry error logging and, as far as I know, no security vulnerabilities. However, every few days I get a notification of a 502 caused by a null byte attack. This is harmless, but it has become very tiresome, and as a result I ignored a bizarre but legitimate human-user-generated 502 status.
A null byte attack in Pyramid, in my set-up, raises a URLDecodeError ('utf-8' codec can't decode byte 0xc0 in position 16: invalid start byte) at the URL dispatch level, so it is not routed to the notfound_view_config-decorated view.
Is there any way to capture %EF/%BF in requests in Pyramid or should I block them in Apache?
Comment by Steve Piercy converted into an Answer:
A search in the Pyramid issue tracker yields several related results. The first hit provides one way to deal with it.
In brief, the view configuration decorator exception_view_config(ExceptionClass, renderer) captures it, behaving like notfound_view_config or forbidden_view_config (which, in contrast to view_config, aren't tied to declared routes).
So the 404 view could look like:
from pyramid.view import notfound_view_config
from pyramid.exceptions import URLDecodeError
from pyramid.view import exception_view_config


@exception_view_config(context=URLDecodeError, renderer='json')
@notfound_view_config(renderer='json')
def notfound_view(request):
    request.response.status = 404
    return {"status": "error"}
This can be tested by visiting http://0.0.0.0:👾👾/%EF%BF in the browser (where 👾👾 is the port being served on).
However, there are two additional considerations.
It does not play well with the debug toolbar (pyramid.includes = pyramid_debugtoolbar in the local configuration ini file).
Also, an error gets raised if any dynamic attribute like request.path_info is accessed. So either the response is kept minimal, or request.environ['PATH_INFO'] is assigned a new value before any other operation in the view (e.g. for usage data etc.).
The view call happens after the debug toolbar error is raised, however, so the first point still stands even with request.environ['PATH_INFO'] = 'hacked'.
Bonus
As this is unequivocally an attack, the view could be customised to play well with fail2ban to block the attacker's IP, as described here, by using a unique status code, say 418, on the first occurrence.
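A minimal sketch of that variation, assuming fail2ban is configured to match 418 responses in the server log (the view name and the replacement PATH_INFO value are just placeholders):
from pyramid.view import exception_view_config
from pyramid.exceptions import URLDecodeError


@exception_view_config(context=URLDecodeError, renderer='json')
def urldecode_attack_view(request):
    # Avoid reading request.path_info here: decoding it is exactly what
    # raised the error. Overwrite the raw environ value instead.
    request.environ['PATH_INFO'] = '/null-byte-attack'
    # 418 is only used as an easy-to-match marker for a fail2ban filter.
    request.response.status = 418
    return {"status": "error"}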
import socket
import ssl

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("privnote.com", 80))
#s = ssl.wrap_socket(s, keyfile=None, certfile=None, server_side=False, cert_reqs=ssl.CERT_NONE, ssl_version=ssl.PROTOCOL_SSLv23)

def claim_note(note_url):
    s.sendall(b'DELETE /'+note_url.encode()+b'HTTP/1.1\r\nX-Requested-With: XMLHttpRequest\r\nHost: privnote.com\r\n')
    print(s.recv(4096))
This is my code. Let me first say that I have tried many different things apart from this: I've tried the HTTPS and HTTP ports, 443 and 80, and I've commented and uncommented the statement that wraps the socket with SSL, all with the same outcome. Either the API returns absolutely nothing or it tells me the request couldn't be understood by the server. I was looking at a GitHub repo where only one header was used, X-Requested-With, because it was for an Ajax call. I tried adding User-Agent and Content-Type, and now I'm just using Host and X-Requested-With. It's a DELETE request and the URL is the first 8 chars after the link. I've also tried adding \r\n\r\n at the end and even tried Content-Length. I don't know what else to do. I want to know why the server is saying that.
There are multiple problems with your code. If you actually print out the request you are trying to send, it will look like this:
b'DELETE /node_urlHTTP/1.1\r\nX-Requested-With: XMLHttpRequest\r\nHost: privnote.com\r\n'
There are two problems with this line: a missing space between /node_url and HTTP/1.1, and a missing final \r\n as end-of-header marker at the end. Once these are fixed, you get a successful response - a 302 redirect to the HTTPS version:
b'HTTP/1.1 302 Found\r\nDate:...\r\nLocation: https://privnote.com/node_url ...
When repeating the request with HTTPS and a valid node_url (with an invalid node_url you get an error that DELETE is not an allowed method):
s.connect(("privnote.com", 443))
s = ssl.wrap_socket(s)
...
b'HTTP/1.1 200 OK\r\n ...
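Putting the fixes together, a corrected sketch might look like this (the note_url handling is unchanged from the question; note the space before HTTP/1.1 and the blank line that ends the headers; ssl.create_default_context() is used in place of the deprecated ssl.wrap_socket):
import socket
import ssl

def claim_note(note_url):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("privnote.com", 443))
    s = ssl.create_default_context().wrap_socket(s, server_hostname="privnote.com")
    request = (
        b'DELETE /' + note_url.encode() + b' HTTP/1.1\r\n'
        b'X-Requested-With: XMLHttpRequest\r\n'
        b'Host: privnote.com\r\n'
        b'\r\n'  # blank line marks the end of the headers
    )
    s.sendall(request)
    print(s.recv(4096))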
I'm new to Flask and am currently converting an existing WSGI application to run through Flask, as long term it'll make life easier.
All requests are POSTs to specific routes; however, the current application inspects the POST data prior to executing the route to see whether the request needs to be run at all (i.e. whether an identifier supplied in the POST data already exists in our database).
If it does exist, a 200 code and JSON are returned "early" and no other action is taken; if not, the application continues to route as normal.
I think I can replicate this behaviour at the right point using before_request(), but I'm not sure whether returning a Flask Response object from before_request() would terminate the request adequately at that point, or whether there's a better way of doing this.
NB: I must return this as a 200; other examples I've seen result in a redirect or 4xx error handling (a close parallel to this activity is authentication), so ultimately I'm doing this at the end of before_request():
if check_request_in_progress(post_data) is True:
    response = jsonify({'request_status': 'already_running'})
    response.status_code = 200
    return response
else:
    add_to_requests_in_progress(post_data)
Should this work (return and prevent further routing)?
If not how can I prevent further routing after calling before_request()?
Is there a better way?
Based on what the documentation says, it should do what you want it to do.
The function will be called without any arguments. If the function returns a non-None value, it’s handled as if it was the return value from the view and further request handling is stopped.
(source)
from flask import Flask, request

app = Flask(__name__)


@app.route("/<name>")
def index(name):
    return f"hello {name}"


@app.before_request
def thing():
    if "john" in request.path:
        return "before ran"
With the above code, if there is a "john" in the URL path, we will see "before ran" in the output, not the actual intended view; for any other string you will see "hello X".
So yes, using before_request and returning something, anything other than None, will stop Flask from serving your actual view. You can redirect the user or send them a proper response.
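Applied to the early-200 case from the question, a minimal sketch might look like this (the stub implementations of check_request_in_progress and add_to_requests_in_progress, the 'id' key, and the way the POST data is read are all placeholders standing in for the asker's own code):
from flask import Flask, jsonify, request

app = Flask(__name__)

# Placeholder implementations standing in for the asker's own logic.
_in_progress = set()

def check_request_in_progress(post_data):
    return post_data.get('id') in _in_progress

def add_to_requests_in_progress(post_data):
    _in_progress.add(post_data.get('id'))


@app.before_request
def short_circuit_known_requests():
    post_data = request.get_json(silent=True) or request.form
    if check_request_in_progress(post_data):
        # Returning a response here stops Flask from routing any further.
        response = jsonify({'request_status': 'already_running'})
        response.status_code = 200
        return response
    add_to_requests_in_progress(post_data)
    # Falling through (returning None) lets the request reach its view.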
I am using the RequestsLibrary of Robot Framework to upload files to a server. The file RequestsKeywords.py has the line
logger.info('Post Request using : alias=%s, uri=%s, data=%s, headers=%s, files=%s, allow_redirects=%s '
            % (alias, uri, dataStr, headers, files, redir))
This prints the whole contents of my upload file inside the request in my log file. Now, I could get rid of this log message by changing the log level; however, my goal is to still see the log but have it truncated to 80 characters, so I am not browsing through lines of hex values. Any idea how this could be done?
A solution would be to create a wrapper keyword that temporarily disables the logging and enables it back once completed.
The flow is - get an instance of the RequestsLibrary, call RF's Set Log Level with argument "ERROR" (so at least an error gets through, if needed), call the original keyword, set the log level back to what it was, and return the result.
Here's how it looks in Python:
from robot.libraries.BuiltIn import BuiltIn


def post_request_no_log(*args, **kwargs):
    req_lib = BuiltIn().get_library_instance('RequestsLibrary')
    current_level = BuiltIn().set_log_level('ERROR')
    try:
        result = req_lib.post_request(*args, **kwargs)
    except Exception as ex:
        raise ex
    finally:
        BuiltIn().set_log_level(current_level)
    return result
And the same, in Robot Framework syntax:
Post Request With No Logging
    [Documentation]    Runs RequestsLibrary's Post Request, with its logging suppressed
    [Arguments]    @{args}    &{kwargs}
    ${current level}=    Set Log Level    ERROR
    ${result}=    Post Request    @{args}    &{kwargs}
    [Return]    ${result}
    [Teardown]    Set Log Level    ${current level}
The Python version is bound to be milliseconds faster - no need to parse and match the text in the RF syntax - which with heavy usage may add up.
Perhaps not the answer you're looking for, but after having looked at the source of the RequestsLibrary I think this is indeed undesirable and should be corrected. It makes sense to have the file contents when running in a debug or trace setting, but not during regular operation.
As I consider this a bug, I'd recommend registering an issue with the GitHub project page or correcting it yourself and providing a pull request. In my opinion the code should be refactored to send the file name under the info setting and the file contents under the trace/debug setting:
logger.info('Post Request using : alias=%s, uri=%s, data=%s, headers=%s, allow_redirects=%s' % ...
logger.trace('Post Request files : files=%s' % ...
In the meantime you have two options. As you correctly said, you can temporarily reduce the log level in your Robot code. If you can't change the script, then using a Robot Framework listener can help with that. Granted, it would be more work than making the change in the RequestsLibrary yourself.
A temporary alternative could be to use the RequestsLibrary Post keyword, which is deprecated but still present.
If you look at the method in the RequestsKeywords library, it's only calling self._body_request() at the end. What we ended up doing was writing another keyword identical to the original, except for the part where it calls logger.info(): we modified it to log files=%.80s, which truncates the file contents to 80 chars.
def post_request_truncated_logs(
        self,
        alias,
        uri,
        data=None,
        params=None,
        headers=None,
        files=None,
        allow_redirects=None,
        timeout=None):
    session = self._cache.switch(alias)
    if not files:
        data = self._format_data_according_to_header(session, data, headers)
    redir = True if allow_redirects is None else allow_redirects
    response = self._body_request(
        "post",
        session,
        uri,
        data,
        params,
        files,
        headers,
        redir,
        timeout)
    dataStr = self._format_data_to_log_string_according_to_header(data, headers)
    # files=%.80s truncates the logged file contents to 80 characters
    logger.info('Post Request using : alias=%s, uri=%s, data=%s, headers=%s, files=%.80s, allow_redirects=%s '
                % (alias, uri, dataStr, headers, files, redir))
    return response
I'm trying to use Python's requests library to log in to a website. It's pretty simple code, and you can really get the gist of requests just by going to its website. I, however, want to check whether I'm successfully logged in via the URL. The problem I've encountered is that when I initiate the POST request and assign it to the variable p, whether the HTML has changed or not, I'm still shown the same URL when I print(p.url). Is there any way for me to refresh the browser or update the URL to whatever it's currently set to?
(I can add a line for checking the url against itself later, but for now I just want to get the correct url)
#!/usr/bin/env python3
import requests

payload = {'login': 'USERNAME',
           'password': 'PASSWORD'}

with requests.Session() as s:
    p = s.post('WEBSITE', data=payload)
    # print(p.text)
    print(p.url)
The usage of python-requests may not be as complex as you think. It will automatically handle the redirects of your session.post() (or session.get()).
Here, the session.post() method returns a Response object:
r = s.post('website', data=payload)
which means r.url is the current URL you are looking for.
If you still want to refresh the current page, just use:
s.get(r.url)
To verify whether you have logged in successfully, one solution is to first do the login in your browser. Based on the title or content of the webpage returned (i.e. using the content in r.text), you can then judge whether you have made it.
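For example, a minimal sketch of such a check (the login URL, the form field names, and the "Log out" marker text are assumptions about the particular site):
import requests

payload = {'login': 'USERNAME', 'password': 'PASSWORD'}

with requests.Session() as s:
    r = s.post('https://example.com/login', data=payload)
    # r.url reflects the final URL after any redirects were followed.
    print(r.url)
    # Heuristic check: a logged-in page usually contains something a
    # logged-out page does not, e.g. a "Log out" link.
    if 'Log out' in r.text:
        print('Logged in successfully')
    else:
        print('Login appears to have failed')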
BTW, python-requests is a great library, enjoy it.