Why is the request body blank when the content-type is application/x-www-form-urlencoded? - cherrypy

I am receiving a request with content-type application/x-www-form-urlencoded. When I try to read the body of the request using cherrypy.request.body.read(), the result is b''.
I seem to be able to access the request form parameters using any of these:
cherrypy.request.params
cherrypy.request.body.params
cherrypy.request.body.request_params
But this is inconvenient for my use case: I want to be able to obtain the raw request body regardless of content-type. Also, the three attributes above give me a dictionary, which isn't the exact format the request had in its body. Is there a way to do that with CherryPy, or is this functionality hidden?

Not sure what you are trying to accomplish by not using the already parsed body that corresponds to the declared Content-Type... but you can process the body of the request yourself by configuring cherrypy.request.process_request_body = False and reading the body with something like:
cherrypy.request.rfile.read(int(cherrypy.request.headers['Content-Length']))
For more information see: https://github.com/cherrypy/cherrypy/blob/master/cherrypy/_cprequest.py#L292-L315
Fragment of the relevant parts of that URL:
rfile = None
"""
If the request included an entity (body), it will be available
as a stream in this attribute. However, the rfile will normally
be read for you between the 'before_request_body' hook and the
'before_handler' hook, and the resulting string is placed into
either request.params or the request.body attribute.
You may disable the automatic consumption of the rfile by setting
request.process_request_body to False, either in config for the desired
path, or in an 'on_start_resource' or 'before_request_body' hook.
WARNING: In almost every case, you should not attempt to read from the
rfile stream after CherryPy's automatic mechanism has read it. If you
turn off the automatic parsing of rfile, you should read exactly the
number of bytes specified in request.headers['Content-Length'].
Ignoring either of these warnings may result in a hung request thread
or in corruption of the next (pipelined) request.
"""
process_request_body = True
"""
If True, the rfile (if any) is automatically read and parsed,
and the result placed into request.params or request.body.
"""
body = None
"""
If the request Content-Type is 'application/x-www-form-urlencoded'
or multipart, this will be None. Otherwise, this will be an instance
of :class:`RequestBody<cherrypy._cpreqbody.RequestBody>` (which you
can .read()); this value is set between the 'before_request_body' and
'before_handler' hooks (assuming that process_request_body is True)."""
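Putting the advice above together, here is a minimal, self-contained sketch of that approach (the class, handler and path names are made up for illustration; the config key is the one described in the docstring):

import cherrypy

class RawBodyEcho:
    @cherrypy.expose
    def echo(self):
        # Content-Length arrives as a string header; read exactly that many bytes, per the warning above
        length = int(cherrypy.request.headers.get('Content-Length', 0))
        raw = cherrypy.request.rfile.read(length)
        return raw  # e.g. b'username=scott&password=secret', untouched by CherryPy

config = {
    '/echo': {
        # disable CherryPy's automatic consumption/parsing of the body for this path
        'request.process_request_body': False,
    }
}

if __name__ == '__main__':
    cherrypy.quickstart(RawBodyEcho(), '/', config)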

Related

What happens if I do not use body parser or express.json()?

I am new to the whole backend stuff. I understand that both body-parser and express.json() will parse the incoming request body from the client into the request object.
But what happens if I do not parse the incoming request from the client?
Without middleware parsing your requests, your req.body will not be populated. You will then need to manually dig into the req object and find out how to get the values you want.
body-parser acts as an interpreter, transforming the HTTP request into an easily accessible format based on your needs.
You can read more about HTTP requests here (you can even write your own HTTP server):
https://nodejs.org/api/http.html#http_class_http_incomingmessage
You will just lose the data, and the request.body field will be empty.
The data is still sent to you, so it is transferred to the server, but since you have not processed it, you won't have access to it.
You can parse it yourself, by the way. The request is a Readable stream, so you can listen for the data and end events to collect and then parse the data.
You get what you asked for: in scenarios where you do not convert the data, you receive the raw data, which looks somewhat like this: username=scott&password=secret&website=stackabuse.com. Now this isn't that bad, but you will manually have to work out what is a param, what is a query, and, inside those two, where the data is.
Unless it is a project requirement, all that heavy lifting is taken care of by Express, and you get a nicely formatted object looking like this:
{
username: 'scott',
password: 'secret',
website: 'stackabuse.com'
}
For situations where you DO need the raw data, Express gives you a convenient way of accessing that as well; all you need to do is use
express.raw( [options] ) along with express.json( [options] )
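Purely to illustrate what that parsing step amounts to for a raw urlencoded string, here is the same transformation sketched with Python's standard library (Python only because it is the language of the other examples on this page; express.urlencoded() does this work for you in Express):

from urllib.parse import parse_qs

raw = 'username=scott&password=secret&website=stackabuse.com'
parsed = parse_qs(raw)  # {'username': ['scott'], 'password': ['secret'], 'website': ['stackabuse.com']}
flat = {key: values[0] for key, values in parsed.items()}  # collapse the single-value lists
print(flat)  # {'username': 'scott', 'password': 'secret', 'website': 'stackabuse.com'}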

Scrapy - is it possible to extract the request payload from a Response?

Is it possible to extract, and store in a variable, the request payload that was sent in order to receive a particular response?
You can access the request object in the callback function via response.request.
This object is the request object itself, so it contains everything you passed in the request. It doesn't have a "payload" attribute though.
The equivalent should be response.request.body, assuming you had a body in the request. Everything else is still there: headers, cookies, meta, method, etc.
More on the params of request here.
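To make that concrete, here is a minimal sketch (spider name, URL and form fields are placeholders) of reading the sent payload back inside the callback:

import scrapy

class PayloadSpider(scrapy.Spider):
    name = 'payload_example'

    def start_requests(self):
        # FormRequest url-encodes formdata into the request body
        yield scrapy.FormRequest(
            'https://example.com/search',
            formdata={'query': 'foo', 'page': '1'},
            callback=self.parse,
        )

    def parse(self, response):
        payload = response.request.body  # bytes, exactly as sent, e.g. b'query=foo&page=1'
        self.logger.info('Sent payload: %s', payload)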

Truncating logging of Post Request in RobotFramework

I am using the Requests library of Robot Framework to upload files to a server. The file RequestsKeywords.py has the line:
logger.info('Post Request using : alias=%s, uri=%s, data=%s, headers=%s, files=%s, allow_redirects=%s '
            % (alias, uri, dataStr, headers, files, redir))
This prints out the whole contents of my upload file inside the request in my log file. Now, I could get rid of this log by changing the log level; however, my goal is to be able to see the log entry but truncate it to 80 characters, so I am not browsing through lines of hex values. Any idea how this could be done?
A solution would be to create a wrapper method that temporarily disables the logging and enables it back once completed.
The flow is: get an instance of RequestsLibrary, call RF's Set Log Level with the argument "ERROR" (so at least errors get through, if needed), call the original keyword, set the log level back to what it was, and return the result.
Here's how it looks in Python:
from robot.libraries.BuiltIn import BuiltIn

def post_request_no_log(*args, **kwargs):
    req_lib = BuiltIn().get_library_instance('RequestsLibrary')
    # Set Log Level returns the previous level, so it can be restored afterwards
    current_level = BuiltIn().set_log_level('ERROR')
    try:
        result = req_lib.post_request(*args, **kwargs)
    finally:
        BuiltIn().set_log_level(current_level)
    return result
And the same, in Robot Framework syntax:
Post Request With No Logging
    [Documentation]    Runs RequestsLibrary's Post Request, with its logging suppressed
    [Arguments]    @{args}    &{kwargs}
    ${current level}=    Set Log Level    ERROR
    ${result}=    Post Request    @{args}    &{kwargs}
    [Return]    ${result}
    [Teardown]    Set Log Level    ${current level}
The Python version is bound to be milliseconds faster - there is no need to parse and match the text in the RF syntax, which with heavy usage may add up.
Perhaps not the answer you're looking for, but after having looked at the source of the RequestsLibrary I think this is indeed undesirable and should be corrected. It makes sense to have the file contents when running in a debug or trace setting, but not during regular operation.
As I consider this a bug, I'd recommend registering an issue with the GitHub project page or correcting it yourself and providing a pull request. In my opinion the code should be refactored to send the file name under the info setting and the file contents under the trace/debug setting:
logger.info('Post Request using : alias=%s, uri=%s, data=%s, headers=%s, allow_redirects=%s' % ...
logger.trace('Post Request files : files=%s' % ...
In the meantime you have two options. As you correctly said, you can temporarily reduce the log level settings in your Robot code. If you can't change the script, then using a Robot Framework listener can help with that. Granted, it would be more work than making the change in RequestsLibrary yourself.
A temporary alternative could be to use RequestsLibrary's Post keyword, which is deprecated but still present.
If you look at the method in the RequestsKeywords library, it's only calling self._body_request() at the end. What we ended up doing was writing another keyword that was identical to the original except for the part where it called logger.info(). We modified it to log files=%.80s, which truncates the file to 80 chars.
def post_request_truncated_logs(
        self,
        alias,
        uri,
        data=None,
        params=None,
        headers=None,
        files=None,
        allow_redirects=None,
        timeout=None):
    session = self._cache.switch(alias)
    if not files:
        data = self._format_data_according_to_header(session, data, headers)
    redir = True if allow_redirects is None else allow_redirects
    response = self._body_request(
        "post",
        session,
        uri,
        data,
        params,
        files,
        headers,
        redir,
        timeout)
    dataStr = self._format_data_to_log_string_according_to_header(data, headers)
    # files=%.80s truncates the logged file contents to 80 characters
    logger.info('Post Request using : alias=%s, uri=%s, data=%s, headers=%s, files=%.80s, allow_redirects=%s '
                % (alias, uri, dataStr, headers, files, redir))
    return response  # return the response so the keyword behaves like the original Post Request

How to post data using node-curl?

I'm very new to Linux and working with Node.js. It's just my 2nd day. I use node-curl for cURL requests. In the link below I found an example with a GET request. Can anybody provide me a POST request example using node-curl?
https://github.com/jiangmiao/node-curl/blob/master/examples/low-level.js
You need to use setopt in order to specify POST options for a cURL request. The options you should start looking at first are CURLOPT_POST and CURLOPT_POSTFIELDS. From the libcurl documentation linked from node-curl:
CURLOPT_POST
A parameter set to 1 tells the library to do a regular HTTP post. This will also make the library use a "Content-Type: application/x-www-form-urlencoded" header. (This is by far the most commonly used POST method).
Use one of CURLOPT_POSTFIELDS or CURLOPT_COPYPOSTFIELDS options to specify what data to post and CURLOPT_POSTFIELDSIZE or CURLOPT_POSTFIELDSIZE_LARGE to set the data size.
Optionally, you can provide data to POST using the CURLOPT_READFUNCTION and CURLOPT_READDATA options but then you must make sure to not set CURLOPT_POSTFIELDS to anything but NULL. When providing data with a callback, you must transmit it using chunked transfer-encoding or you must set the size of the data with the CURLOPT_POSTFIELDSIZE or CURLOPT_POSTFIELDSIZE_LARGE option. To enable chunked encoding, you simply pass in the appropriate Transfer-Encoding header, see the post-callback.c example.
CURLOPT_POSTFIELDS
... [this] should be the full data to post in a HTTP POST operation. You must make sure that the data is formatted the way you want the server to receive it. libcurl will not convert or encode it for you. Most web servers will assume this data to be url-encoded.
This POST is a normal application/x-www-form-urlencoded kind (and libcurl will set that Content-Type by default when this option is used), which is the most commonly used one by HTML forms. See also the CURLOPT_POST. Using CURLOPT_POSTFIELDS implies CURLOPT_POST.
If you want to do a zero-byte POST, you need to set CURLOPT_POSTFIELDSIZE explicitly to zero, as simply setting CURLOPT_POSTFIELDS to NULL or "" just effectively disables the sending of the specified string. libcurl will instead assume that you'll send the POST data using the read callback!
With that information, you should be able to add the following options to the low-level example to have it make a POST request:
var fieldsStr = '{}';
curl.setopt('CURLOPT_POST', 1); // true?
curl.setopt('CURLOPT_POSTFIELDS', fieldsStr);
You will need to tweak the contents of fieldsStr to match the format the server is expecting. Per the documentation, you may also need to url-encode the data, which should be as simple as using encodeURIComponent, according to this post.
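If it helps to see those two libcurl options in action outside node-curl, here is a small sketch using pycurl (Python) against a placeholder endpoint; node-curl's setopt calls map onto the same CURLOPT_* options:

import pycurl
from io import BytesIO

buf = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, 'https://httpbin.org/post')  # placeholder endpoint
c.setopt(pycurl.POST, 1)  # CURLOPT_POST
c.setopt(pycurl.POSTFIELDS, 'username=scott&password=secret')  # CURLOPT_POSTFIELDS, already url-encoded
c.setopt(pycurl.WRITEFUNCTION, buf.write)  # collect the response body instead of printing it
c.perform()
c.close()
print(buf.getvalue().decode())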

What exactly does "Response already committed" mean? How to handle exceptions then?

I know writing business logic in getters and setters is a very bad programming practice, but is there any way to handle exceptions if the response is already committed?
What exactly is the meaning of "Response already committed" and "Headers are already sent to the client"?
There's no nice way to handle exceptions if the response is already committed. An HTTP response basically consists of headers and a body. The headers instruct the client (the web browser) how exactly it should deal with the response, e.g. the content type, the content length, the character encoding, the body encoding, the cache instructions, etcetera.
You can see the headers in the HTTP traffic monitor of the web browser's developer toolset. Press F12 in Chrome/IE9+/Firefox 23+ and check the "Network" tab (the "Response" tab there shows the response body).
The response body is the actual content, usually in the flavor of a bunch of HTML code. The server usually has a fixed-size buffer to write the response to. The buffer size depends on the server make/version and configuration and is usually 2KB~10KB. If this buffer overflows, it is flushed to the other end of the connection, the client. This is the commit of a response: the client has then already obtained the first part of the response, usually representing the whole bunch of headers and maybe a part of the body.
The commit of a response is a point of no return. The server cannot take the already sent bytes back. It's too late to change the response headers (for example, a redirect is basically instructed by a Location header containing the new URL), let alone the response body. The best you can do is append the error information to the already written response body. But this may end up in some weird-looking HTML, as it's not known which HTML tags need to be closed at that point. The browser may fail to present it in a proper manner.
Apart from avoiding business logic in getters so that exceptions are not thrown while rendering the response, another way to avoid an already committed response is to configure the response buffer size to be as large as the largest page your webapp can serve. How to do that depends on the server make/version. In Tomcat, for example, you can configure it via the bufferSize attribute of the <Connector> element. Note that this won't prevent flushing if your own code is (implicitly) calling flush() on the response output stream.
Good explanation, BalusC. I would add that PrimeFaces has an issue in its exception handler: it tries to redirect to the error page after the response has already been committed. And as you said, the only solution I found is to add some extra content to the response body. I override the handler and add this code:
if ( extContext.isResponseCommitted() ) {
    // too late to redirect; inject a script into the partial response that navigates the browser
    PartialResponseWriter writer = context.getPartialViewContext().getPartialResponseWriter();
    writer.startElement( "script", null );
    writer.write( "window.location.href = '" + errorPageUrl + "';" );
    writer.endElement( "script" );
    writer.getWrapped().endCDATA();
    writer.endElement( "update" );
    writer.getWrapped().endDocument();
}
else {
    // the response is not committed yet, so a normal redirect still works
    extContext.redirect( errorPageUrl );
    context.responseComplete();
}
