parse http response set cookie string in python

parse http response set cookie string in python - python-3.x

audit=value1; Version=1; Max-Age=31535999; Expires=Thu, 25-Jul-2019 17:20:26 GMT,c=1; httponly; Path=/
I am a beginner in Python. I have a string containing the http.response.set_cookie value captured with Wireshark (shown above). I want to parse the cookies and save them in a database.
I used the code suggested by falsetrue in "Python - convert set-cookies response to array of cookies",
but unfortunately I could not work out how to retrieve the other cookie attributes such as "httponly", "domain" and "secure".
Appreciate any help!
Thanks.
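One way to get at those attributes, as a minimal sketch rather than a full RFC 6265 parser: split the captured string into individual cookies, then split each cookie into its name/value pair and its attributes. The function name parse_set_cookie and the dictionary layout are assumptions made for this illustration, and the comma split assumes that no cookie value contains a comma followed by "name=".

```python
import re


def parse_set_cookie(header_value):
    """Split a folded Set-Cookie value into one dict per cookie (illustrative helper)."""
    # Split only on commas that are followed by "name=", so the comma inside
    # "Expires=Thu, 25-Jul-2019 ..." is left alone.
    raw_cookies = re.split(r",(?=\s*[^;,=\s]+=)", header_value)
    cookies = []
    for raw in raw_cookies:
        parts = [p.strip() for p in raw.split(";") if p.strip()]
        name, _, value = parts[0].partition("=")
        cookie = {"name": name, "value": value}
        for attr in parts[1:]:
            key, sep, val = attr.partition("=")
            # Flag attributes such as HttpOnly and Secure carry no value.
            cookie[key.strip().lower()] = val.strip() if sep else True
        cookies.append(cookie)
    return cookies


captured = (
    "audit=value1; Version=1; Max-Age=31535999; "
    "Expires=Thu, 25-Jul-2019 17:20:26 GMT,c=1; httponly; Path=/"
)
for cookie in parse_set_cookie(captured):
    print(cookie)
```

Attribute names are lower-cased so that "HttpOnly" and "httponly" end up under the same key; each dict can then be mapped onto the columns of a database row.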

Related

Python requests: chunked post request

I am trying to send a POST request with the requests module and headers["Transfer-Encoding"] = "chunked", but I am getting back:
<BODY><h2>Bad Request - Invalid Content Length</h2><hr><p>HTTP Error 400. There is an invalid content length or chunk length in the request.</p>
I am sending a JSON string, and headers["Content-Type"] = "application/json" is also set.
Does anybody know if I am missing some setting? Maybe I should set the chunk size somewhere?
Analysing the headers of the request attached to the response, I actually see a non-zero Content-Length header.
I also tried to create a custom generator from the JSON string and pass it to the post method as data=, but it seems to simply hang (even beyond the given timeout=).

Your error says the request was not built properly (it is a 4xx error, not a 5xx, which would indicate a server issue).
Transfer-Encoding: chunked is for sending data in chunks, that is, when the body of your message consists of an unspecified number of chunks that you send as a stream. I would suggest reading this.
Each chunk should be preceded by its size (in hexadecimal). For instance:
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked
9\r\n
Some data\r\n
6\r\n
Python\r\n
0\r\n
\r\n
If you want to send chunked requests with the Python requests module, you probably need a generator for that. Please see this. With so little information I can't help you more.
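The chunk framing described above can be sketched in a few lines of standard-library Python. This only illustrates the wire format (hexadecimal size, CRLF, data, CRLF, then a terminating zero-length chunk); the helper name encode_chunked is made up for this example, and an HTTP client library would normally do this framing for you.

```python
def encode_chunked(chunks):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body (illustrative helper)."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would terminate the body early
        # Chunk size in hexadecimal, CRLF, the data itself, CRLF.
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    # The last chunk has size 0 and is followed by an empty line.
    out += b"0\r\n\r\n"
    return bytes(out)


body = encode_chunked([b"Some data", b"Python"])
print(body)
```

Because the total length is not known up front, a chunked message carries no Content-Length header; sending both headers together is one plausible way to end up with an "invalid content length" error.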

Four space characters in the beginning of a random name HTTP header

I've discovered a domain name (web site and API) which adds a header like this to each HTTP response:
    XTVOpalL: Gtm; path=/; Max-Age=900
The header name looks random. Here are a few other examples:
    XRQOJalT: LtZ; path=/; Max-Age=900
    XYjOzalA: Ntx; path=/; Max-Age=900
    XykOMalm: ytD; path=/; Max-Age=900
Note the leading 4 spaces. And compare to other response headers:
HTTP/1.1 301 Moved Permanently
Date: Sat, 05 May 2018 11:52:25 GMT
Server: Apache
Location: http://example.com/wp/
Content-Length: 229
Content-Type: text/html; charset=iso-8859-1
Set-Cookie: visid_incap_993094=GuEL85vzTDKQUJ9jfphhgvma7VoAAAAAQUIPAAAAAACgWz3NlkG3smvkXeB6Ewyl; expires=Sun, 05 May 2019 08:21:45 GMT; path=/; Domain=.example.com
Set-Cookie: nlbi_993094=z0NWEcMl0wAVBr8CiwzebQAAAACu2KRRlrUCoWpyWKTrUAJF; path=/; Domain=.example.com
Set-Cookie: incap_ses_115_993094=/xoUXc5Kags3fAFBHpCYAfma7VoAAAAABT/i1XAh1J4D/02wGnXO9w==; path=/; Domain=.example.com
Set-Cookie: ___utmvmicuVtwf=peInjtBXhca; path=/; Max-Age=900
Set-Cookie: ___utmvaicuVtwf=wYxmyOU; path=/; Max-Age=900
Set-Cookie: ___utmvbicuVtwf=TZr
    XYjOzalA: Ntx; path=/; Max-Age=900
X-Iinfo: 13-63374213-63374214 NNNN CT(222 -1 0) RT(1525521145044 0) q(0 0 2 0) r(5 5) U11
X-CDN: Incapsula
The main problem is that this header is sometimes the first header in the response, which, in turn, is considered a vulnerability.
In my case it looks like this:
HTTP/1.1 301 Moved Permanently
    XYjOzalA: Ntx; path=/; Max-Age=900
Date: Sat, 05 May 2018 11:52:25 GMT
Server: Apache
Location: http://example.com/wp/
...
Quoting the RFC of HTTP 1.1 https://www.rfc-editor.org/rfc/rfc7230#section-3
A sender MUST NOT send whitespace between the start-line and the first header field.
...
The presence of such whitespace in a request might be an attempt to
trick a server into ignoring that field or processing the line after
it as a new request, either of which might result in a security
vulnerability if other implementations within the request chain
interpret the same message differently. Likewise, the presence of
such whitespace in a response might be ignored by some clients or
cause others to cease parsing.
As a result, Node.js throws an error when it tries to parse these HTTP responses. The error code is HPE_INVALID_HEADER_TOKEN, which is thrown only when the HTTP headers are malformed.
Question: What is it? Who's doing it? Why?

"What is it?"
This is a server-side bug, as it violates the HTTP protocol.
In fact, this was discussed in the HTTP working group in 2013 in connection with a bug in a Python library, and I think Julian Reschke's conclusion is correct:
It's not a legal field name, thus not a legal start of a header field line.
...
It's forbidden by the grammar, so it's invalid.
"Who's doing it? Why?"
When the developer generates the random HTTP header name, he or she introduces these four leading whitespace characters by accident.
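To check a captured response for this problem, a small sketch along the following lines may help. It assumes you have the raw header section as text with CRLF line endings; the helper name find_leading_whitespace_headers is invented for this example. A line starting with a space or tab was historically treated as a continuation of the previous header (the obsolete line folding of RFC 7230), which is why parsers disagree about such lines.

```python
def find_leading_whitespace_headers(raw_head):
    """Return (line_number, line) pairs for header lines that begin with SP or HTAB."""
    lines = raw_head.split("\r\n")
    offending = []
    # Line 1 is the status line, so header lines start at line 2.
    for number, line in enumerate(lines[1:], start=2):
        if line == "":
            break  # an empty line ends the header section
        if line[0] in " \t":
            offending.append((number, line))
    return offending


raw = (
    "HTTP/1.1 301 Moved Permanently\r\n"
    "    XYjOzalA: Ntx; path=/; Max-Age=900\r\n"
    "Date: Sat, 05 May 2018 11:52:25 GMT\r\n"
    "Server: Apache\r\n"
    "\r\n"
)
print(find_leading_whitespace_headers(raw))
```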

Check if a large file exists without downloading it

Not sure if this is possible, but I would like to check the status code of an HTTP request to a large file without downloading it; I just want to check if it's present on the server.
Is it possible to do this with Python's requests? I already know how to check the status code but I can only do that after the file has been downloaded.
I guess what I'm asking is: can you issue a GET request and stop it as soon as you've received the response headers?

Use requests.head(). It returns only the response headers, not the content: the server sends no message body, but you still get all the information carried in the headers.
The HEAD method is identical to GET except that the server MUST NOT
return a message-body in the response. The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request. This method can
be used for obtaining metainformation about the entity implied by the
request without transferring the entity-body itself. This method is
often used for testing hypertext links for validity, accessibility,
and recent modification.
For example:
import requests
url = 'http://lmsotfy.com/so.png'
r = requests.head(url)
print(r.headers)
Output:
{'Content-Type': 'image/png', 'Content-Length': '6347', 'ETag': '"18cb-4f7c2f94011da"', 'Accept-Ranges': 'bytes', 'Date': 'Mon, 09 Jan 2017 11:23:53 GMT', 'Last-Modified': 'Thu, 24 Apr 2014 05:18:04 GMT', 'Server': 'Apache', 'Keep-Alive': 'timeout=2, max=100', 'Connection': 'Keep-Alive'}
This code does not download the picture; it returns only the headers of the response, which contain the size, type and date. If the picture does not exist, the status code will indicate an error (typically 404) and this information will not be there.

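Use the HEAD method.
For example, with urllib (url is the address of the file to check):
import urllib.request
# Without method='HEAD', urlopen would issue a GET and fetch the body.
request = urllib.request.Request(url, method='HEAD')
response = urllib.request.urlopen(request)
if response.getcode() == 200:
    print(response.headers['content-length'])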
In your case, with requests:
import requests
response = requests.head(url)
if response.status_code == 200:
    print(response.headers['content-length'])

Normally, you use the HEAD method instead of GET for this sort of thing. If you query some random server on the web, be prepared for it to return inconsistent results (this is typical for servers requiring registration). In such cases you may want to use a GET request with a Range header to download only a small number of bytes.
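The Range fallback mentioned above might look like the following sketch, using only the standard library. The helper names build_range_probe and file_exists are made up for this example, and it assumes that a 200 or 206 status is enough evidence that the file is present. A server that ignores Range answers 200 with the full body, which is why the body is never read here.

```python
import urllib.error
import urllib.request


def build_range_probe(url):
    """Build a GET request that asks for the first byte of the resource only."""
    return urllib.request.Request(url, headers={"Range": "bytes=0-0"}, method="GET")


def file_exists(url, timeout=10):
    """Return True if the server answers the one-byte probe with 200 or 206."""
    try:
        with urllib.request.urlopen(build_range_probe(url), timeout=timeout) as response:
            # 206 means the range was honoured; 200 means it was ignored.
            # Either way the body is left unread and the connection is closed.
            return response.status in (200, 206)
    except urllib.error.HTTPError:
        return False  # for example 404 Not Found
```

Calling file_exists with the address of the large file then performs a single short exchange instead of a full download.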

Setting multiple cookie headers in Koa

I am trying to clear two cookies in my client's browser via the following:
this.response.set('Set-Cookie', 'mycookie1=; Path=/; expires=Thu, 01 Jan 1970 00:00:00 GMT; ,mycookie1.sig=; Path=/; expires=Thu, 01 Jan 1970 00:00:00 GMT;');
I can only seem to get rid of mycookie1 and not mycookie1.sig.

It's more about the protocol (HTTP). You should split it into two header fields (Set-Cookie for each cookie).
By RFC6265:
An origin server can include multiple
Set-Cookie header fields in a single response. ... Origin servers SHOULD NOT fold multiple Set-Cookie header fields into
a single header field.
There is a better way to set cookies with Koa than writing the raw header: simply call this.cookies.set once per cookie (see the docs for the possible options):
function *() {
  this.cookies.set('mycookie1', 'value1', options);
  this.cookies.set('mycookie2', 'value2', options);
}

wsgi cookies - no middleware

Sounds simple enough
import random
import string
import Cookie  # Python 2 module; it is http.cookies in Python 3
def create_cookie():
    bag = string.ascii_uppercase + string.ascii_lowercase + string.digits
    cookie = Cookie.SimpleCookie()
    cookie['sessionid'] = ''.join(random.sample(bag,24))
    cookie['sessionid']['expires'] = 600
    return 'Set-Cookie: ', cookie.output().replace('Set-Cookie: ', '', 1)
cookie.output() is Set-Cookie: sessionid=YmsrvCMFapXk6wAt4EVKz2uU; expires=Sun, 14-Aug-2011 21:48:19 GMT
headers.append(('Content-type', 'text/html'))
headers.append(('Content-Length', str(output_len)))
headers.append(create_cookie())
This is my response
('200 OK', [('Content-type', 'text/html'), ('Content-Length', '1204'), ('Set-Cookie', 'sessionid=YmsrvCMFapXk6wAt4EVKz2uU; expires=Sun, 14-Aug-2011 21:48:19 GMT')], 'html stuff')
This is what I get from environ:
HTTP_COOKIE: sessionid=YmsrvCMFapXk6wAt4EVKz2uU
And when I click another link on my page, no more HTTP_COOKIE
Using the chrome dev console I can see the request cookie and the page header contains:
Cookie:: sessionid=YmsrvCMFapXk6wAt4EVKz2uU
Now, this bothers me a bit. First of all, why does it have a double colon (::)? I tried using 'Set-Cookie' instead of 'Set-Cookie: ' in the create_cookie function, but then I didn't get any HTTP_COOKIE at all from environ.
So after lots of searching on the web, with everyone just talking about middleware (please don't suggest I use one; I'm doing this to learn WSGI), I've come up empty.

Invisible default behavior ftw...
After some intensive debugging I noticed that the following request didn't include HTTP_COOKIE, which made it conclusively a problem on the browser's side: it was not sending the cookie, even though I could otherwise find it in the browser.
Some digging around revealed that the default path and domain behavior was spoiling my efforts. The difference between /action/login (where the cookie was set) and /display/data (where the cookie wasn't sent) was fixed by setting the path to '/' in this case.
"yay"
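For reference, the path fix could look roughly like this in Python 3, where the Cookie module is called http.cookies. This is a sketch, not the original code: the function name create_cookie_header is invented, secrets.token_urlsafe is swapped in for random.sample, and Max-Age is used in place of expires.

```python
import secrets
from http.cookies import SimpleCookie


def create_cookie_header(max_age=600):
    """Return a WSGI header tuple for a session cookie valid on every path."""
    cookie = SimpleCookie()
    cookie["sessionid"] = secrets.token_urlsafe(18)  # 24 URL-safe characters
    # Without an explicit path the browser defaults to the directory of the
    # URL that set the cookie (such as /action/), so other paths never get it.
    cookie["sessionid"]["path"] = "/"
    cookie["sessionid"]["max-age"] = max_age
    # OutputString() omits the "Set-Cookie: " prefix, so the header name
    # can be given separately and without a trailing colon.
    return ("Set-Cookie", cookie["sessionid"].OutputString())


headers = [("Content-type", "text/html")]
headers.append(create_cookie_header())
print(headers)
```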

You could try:
return [tuple(line.split(': ',1)) for line in cookie.output().split('\r\n')]
This also works for multiple entries in cookie.
Of course, you need to use extend instead of append:
headers.extend(create_cookie())
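An alternative to splitting the output string, sketched here for Python 3's http.cookies, is to build one header tuple per cookie directly from the morsels. The function name cookie_headers is an assumption made for this example.

```python
from http.cookies import SimpleCookie


def cookie_headers(cookie):
    """Return one ('Set-Cookie', value) tuple per cookie in the jar."""
    # Each morsel renders itself without the "Set-Cookie: " prefix.
    return [("Set-Cookie", morsel.OutputString()) for morsel in cookie.values()]


jar = SimpleCookie()
jar["sessionid"] = "abc123"
jar["theme"] = "dark"

headers = [("Content-type", "text/html")]
headers.extend(cookie_headers(jar))
print(headers)
```

This keeps every cookie in its own Set-Cookie header, which is also what RFC 6265 asks of servers.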
