Python POST request to retrieve a base64-encoded file - python-3.x

I'm trying to make a POST request with Python to retrieve a specific file. Since the URL is behind a server with authorized access, there's no use posting it here.
However, the form data contains a lengthy field called base64, and I can't figure out whether it's an ordinary form-data value or a base64 encoding of the POST request.
Here are the browser parameters:
General:
Request URL: http://exampleapi.com/api/Document/Export
Request Method: POST
Status Code: 200 OK
Remote Address: XX.XXX.XXX.XX:XX
Referrer Policy: no-referrer-when-downgrade
Response Headers:
Access-Control-Allow-Origin: http://example.com
Cache-Control: no-cache
Content-Disposition: attachment; filename=location-downloads.xlsx
Content-Length: 7148
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
Date: Tue, 23 Jul 2019 21:00:18 GMT
Expires: -1
Pragma: no-cache
Server: Microsoft-IIS/7.5
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Request Headers:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
Accept-Encoding: gzip, deflate
Accept-Language: en-US,en;q=0.9
Cache-Control: max-age=0
Connection: keep-alive
Content-Length: 10162
Content-Type: application/x-www-form-urlencoded
Cookie: abcConnection=!UA7tkC3iZCmVNGRUyRpDWARVBWk/lY6SZvgxLlaygsQKk+vuwA1NxvhwE9ph4i+3NZlKeepIfuHhUvyQjl68fhhrT9ueqMx/3mBKUDcT
DNT: 1
Host: exampleapi.com
Origin: http://example.com
Referer: http://example.com/
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36
Form Data:
fileName: location-downloads.xlsx
contentType: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
base64: UEsDBAoAAAAAAAh4904AAAAAAAAAAAAAAAAJAAAAZG9jUHJvcHMvUEsDBAoAAAAIAAh490(shortened for simplicity)
Here is what I tried:
import json

import requests
import urllib3
from bs4 import BeautifulSoup

url = 'http://example.com'
urllib3.disable_warnings()
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "User-Agent": "Mozilla/5.0",
}
with requests.session() as s:
    r = s.get(url, headers={"User-Agent": "Mozilla/5.0"}, verify=False)
    data = r.content
    soup = BeautifulSoup(data, 'html.parser')
    form_data = {
        "fileName": "location-downloads.xlsx",
        "contentType": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
    }
    r2 = s.post('http://exampleapi.com/api/Document/Export',
                data=json.dumps(form_data, ensure_ascii=True).encode('utf-8'),
                headers=headers, verify=False)
    print(r2.status_code)
Any idea how I should proceed? My status code also shows 500 for the POST here.
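Two things stand out in the attempt above, assuming the endpoint really expects application/x-www-form-urlencoded as the browser capture shows: the body is sent as a JSON string under a form-urlencoded Content-Type, and the base64 field from the captured form data is missing entirely. A minimal sketch of posting the same three fields as form data (the base64 value is a placeholder, since the real one is truncated above):

import requests

# The captured base64 value, truncated here; it appears to be the
# base64-encoded .xlsx content itself (it starts with UEsDBA, the
# base64 of a ZIP file header, and .xlsx files are ZIP archives).
b64_value = 'UEsDBAoAAAAAAAh4904...'

form_data = {
    "fileName": "location-downloads.xlsx",
    "contentType": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "base64": b64_value,
}

with requests.Session() as s:
    # Passing a dict to data= makes requests form-encode the body and set
    # Content-Type: application/x-www-form-urlencoded automatically.
    r = s.post('http://exampleapi.com/api/Document/Export',
               data=form_data, verify=False)
    r.raise_for_status()
    with open('location-downloads.xlsx', 'wb') as f:
        f.write(r.content)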

Related

Node.js express server with compression not working

From a Lighthouse test in Chrome:
URL: /three.module.min.js - Transfer Size: 630.3 KiB - Potential Savings: 477.3 KiB
I minified, but performance still needs compression.
...
function shouldCompress (req, res) {
  if (req.headers['x-no-compression']) {
    // don't compress responses with this request header
    return false
  }
  // fallback to standard filter function
  return compression.filter(req, res)
}
hostingHTTP.use(compression({ filter: shouldCompress }))
...
My request :
Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9,ru;q=0.8
Cache-Control: no-cache
Connection: keep-alive
Host: maximumroulette.com
Pragma: no-cache
Referer: https://maximumroulette.com/apps/magic/public/module.html
sec-ch-ua: "Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Windows"
Sec-Fetch-Dest: script
Sec-Fetch-Mode: no-cors
Sec-Fetch-Site: same-origin
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36
My response:
Accept-Ranges: bytes
Cache-Control: public, max-age=0
Connection: keep-alive
Content-Length: 394590
Content-Type: application/javascript; charset=UTF-8
Date: Mon, 13 Feb 2023 20:39:00 GMT
ETag: W/"6055e-1864c7208a7"
Keep-Alive: timeout=5
Last-Modified: Mon, 13 Feb 2023 20:23:07 GMT
X-Powered-By: Express
Do I need some action on the client part for decompression?
If I set res.set('Content-Encoding', 'deflate'); // gzip, deflate, br on the server, I get an error in Chrome:
net::ERR_CONTENT_DECODING_FAILED 200 (OK)
Any suggestion?
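Two observations, offered as a sketch rather than a definitive diagnosis. The response above has no Content-Encoding header at all, which means the compression middleware did not compress that response (a common cause is a static-file middleware being registered before compression). And setting Content-Encoding by hand, without actually compressing the body, is exactly what produces net::ERR_CONTENT_DECODING_FAILED: the browser tries to inflate plain bytes. No client-side action is needed; browsers decompress automatically whenever the header matches the body. A quick way to check the server from outside the browser, using Python's requests (URL path assumed for illustration):

import requests

# Request the file the way a browser would, advertising compression support.
r = requests.get(
    'https://maximumroulette.com/three.module.min.js',  # assumed path
    headers={'Accept-Encoding': 'gzip, deflate'},
)

# If the middleware compressed the response, the server says so here;
# requests decompresses the body transparently in either case.
print(r.headers.get('Content-Encoding'))  # expect 'gzip'; None means uncompressed
print(len(r.content))                     # size after decompression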

After getting a Set-Cookie in the response, the cookie is not saved and transmitted in requests

Basically, after an auth I set a cookie, but apparently after a page refresh only the cookie that was set by Cloudflare is saved.
The cookie that I transmitted with Set-Cookie is not used in the requests that follow.
# Response headers
HTTP/2.0 200 OK
date: Thu, 18 Jul 2019 10:03:25 GMT
content-type: application/json; charset=utf-8
content-length: 29
set-cookie: __cfduid=d578c7a5e4378dc1b1946964a08ebc4ec1563444205; expires=Fri, 17-Jul-20 10:03:25 GMT; path=/; domain=.doc.io; HttpOnly; Secure
set-cookie: __doc=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJlIjoiNzQ2NTczNzQ0MDY2NjE3Mzc0NmQ2MTY5NmMyZTZkNzgiLCJwIjoiNTUzMjQ2NzM2NDQ3NTY2YjU4MzEzODMzNTU0NjUwNTU2ZjRiMzkzMTY3NDUzNDY5NDc3MzM3MzgzOTU5MzczMDUxNjk0ZjQxNjQ0OTM5Nzg0YjZiNzU1Njc3Nzk0NDc0NjE3NDMxNTE0NzcwMzE0YjQxNmY1MjU5MzM3YTZhNDU2NDJiNmU0ZTc0NGE3NTMyNTQ1ODc2NjI1YTczNDc1MTQ1Njc0MjVhNGQ0MTNkM2QiLCJkIjoiMzEzNTM2MzMzNDM0MzQzMjMwMzUzNTM2MzMiLCJpYXQiOjE1NjM0NDQyMDV9.go1jDpc2rBe5FjK2sKX4ybW4PhCPFq1xT1WIX-mSI84; Domain=.doc.io; Path=/; Expires=Thu, 18 Jul 2019 16:03:25 GMT; HttpOnly; Secure
expect-ct: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
server: cloudflare
cf-ray: 4f83a02a3dc36455-FRA
X-Firefox-Spdy: h2
reply
  .code(200)
  .header('Access-Control-Allow-Origin', '*')
  .header('Content-Type', 'application/json; charset=utf-8')
  .setCookie('__doc', token, {
    domain: '.doc.io',
    path: '/',
    secure: true,
    httpOnly: true,
    expires: new Date(new Date().setHours(new Date().getHours() + 6))
  })
  .send({ 'success': 'Sign In success' })
All my websites are HTTPS.
First I do a POST request for auth on /auth (you can see the response headers above), and then I do a GET on /page (trying to load the page) to read the cookies, but with reply.log.info(request.cookies) I see only the cookies from Cloudflare. I also tried refreshing and opening the address in a different tab; there are no cookies at all except the ones from Cloudflare.
# Request headers
Host: test.doc.io
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:67.0) Gecko/20100101 Firefox/67.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
DNT: 1
Connection: keep-alive
Cookie: __cfduid=d578c7a5e4378dc1b1946964a08ebc4ec1563444205
Upgrade-Insecure-Requests: 1
TE: Trailers
After I tested it with an actual page whose JS code makes an XHR request, it works fine. The cookie was only displayed, but not actually saved to storage, when I was sending the request directly from Firefox's inspector. I lost a day figuring out that requests created in the inspector apparently prevent the cookie from being installed.
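For anyone hitting the same thing: one way to confirm the server's Set-Cookie is well-formed, independent of any browser dev-tools quirk, is to replay the flow with a cookie-aware HTTP client. A minimal sketch with Python's requests; the /auth and /page paths come from the description above, and the credentials payload is hypothetical:

import requests

with requests.Session() as s:
    # POST /auth: the session's cookie jar stores every Set-Cookie it receives.
    s.post('https://test.doc.io/auth',
           json={'email': 'user@example.com', 'password': 'secret'})  # hypothetical body

    # Both __cfduid and __doc should show up here if the cookie attributes
    # (Domain=.doc.io, Secure, Expires) are acceptable to the client.
    print(s.cookies.get_dict(domain='.doc.io'))

    # GET /page: stored cookies are attached automatically.
    r = s.get('https://test.doc.io/page')
    print(r.status_code)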

Expressjs Route contains weird characters

What could possibly be the reason for an Express route to return the following data? I am expecting it to return JSON data. I am making an Ajax call to the server (Express), which gives me the data below with weird characters. Is this data gzipped? I have set the headers and contentType as follows:
headers: {"Access-Control-Allow-Origin":"*"}
contentType: 'application/json; charset=utf-8'
�=O�0�b��K�)�%7�܈9���G��%NOU���O'6��k�~6��S.���,��/�wأ%6�K�)��e�
The HTTP response is as follows:
General:
Request URL: http://localhost/expressRoute.js
Request Method: GET
Status Code: 200 OK
Remote Address: [::1]:80
Referrer Policy: no-referrer-when-downgrade
Response Headers:
Accept-Ranges: bytes
Connection: Keep-Alive
Content-Length: 29396
Content-Type: application/javascript
Date: Thu, 22 Nov 2018 00:50:36 GMT
ETag: "72d4-57b124e0c372e"
Keep-Alive: timeout=5, max=100
Last-Modified: Tue, 20 Nov 2018 05:57:12 GMT
Server: Apache/2.4.34 (Win32) OpenSSL/1.1.0i PHP/7.2.10
Request Headers:
Accept: */*
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cache-Control: no-cache
Connection: keep-alive
Host: localhost
Pragma: no-cache
Referer: http://localhost/index.html
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36
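One quick check, as a sketch: gzip streams always start with the magic bytes 0x1f 0x8b, so a script can tell whether the body really is compressed data that arrived without a matching Content-Encoding header. (Also note the response above comes from Apache on port 80 with Content-Type: application/javascript, so it looks like the request fetched the server-side file /expressRoute.js as a static asset rather than hitting the Express route.) In Python:

import gzip
import requests

r = requests.get('http://localhost/expressRoute.js')

# gzip data begins with the magic bytes 1f 8b; if they are present, the
# body is compressed even though no Content-Encoding header was sent.
if r.content[:2] == b'\x1f\x8b':
    print('body is gzip-compressed; first 200 decompressed bytes:')
    print(gzip.decompress(r.content)[:200])
else:
    print('body is not gzip; first bytes:', r.content[:20])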

Access to a web page via a robot

I need to occasionally access an HTML page to update a database. This page is easily accessible via a web browser, but when I try to access it via a Node.js application it doesn't work (the website detects that the request is made by a robot). However:
The robot request contains the same headers (including the User-Agent) as the web browser request.
The robot request doesn't contain a Referer header or a Cookie header, but neither does the browser request.
The IP of the robot is the same as the IP that I use to browse the website.
In my eyes the robot request and the browser request are strictly identical. Nevertheless they are processed differently.
I'm running out of ideas... Maybe the request contains metadata like "this request was sent by node.js", but that would be really weird.
EDIT: here is a code sample:
const https = require('https');

// callback (error, responseContent)
function getPage (callback){
  let options = {
    protocol : 'https:',
    hostname : 'xxx.yyy.fr',
    port : 443,
    path : '/abc/def',
    agent : false,
    headers : {
      'Accept' : 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
      // note: with Accept-Encoding set, a compressed response would need
      // inflating before being read as utf-8 below
      'Accept-Encoding' : 'gzip, deflate, br',
      'Accept-Language' : 'fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3',
      'Cache-Control' : 'no-cache',
      'Connection' : 'keep-alive',
      'DNT' : '1',
      'Host' : 'ooshop.carrefour.fr',
      'Pragma' : 'no-cache',
      'Upgrade-Insecure-Requests' : '1',
      'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0'
    }
  };
  https.get (options, function (res){
    if (res.statusCode !== 200){
      res.resume ();
      callback ('Error : res code != 200, res code = ' + res.statusCode);
      return;
    }
    res.setEncoding ('utf-8');
    let content = '';
    res.on ('data', chunk => content += chunk);
    res.on ('end', () => callback (null, content));
  }).on ('error', e => callback (e));
}
EDIT: here is a comparison of the requests/responses:
Mozilla Firefox
request headers:
GET /3274080001005/eau-de-source-cristaline HTTP/1.1
Host: ooshop.carrefour.fr
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
DNT: 1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
Pragma: no-cache
Cache-Control: no-cache
response headers:
HTTP/2.0 200 OK
date: Wed, 11 Jul 2018 21:25:25 GMT
server: Unknown
content-type: text/html; charset=UTF-8
age: 0
x-varnish-cache: MISS
accept-ranges: bytes
set-cookie: visid_incap_1213048=G8a0mWzmQYi0GKuT2Ht7YeQ9QVsAAAAAQkIPAAAAAADvVZnsZHK18dQQxHakBprg; expires=Thu, 11 Jul 2019 11:17:56 GMT; path=/; Domain=.carrefour.fr
incap_ses_466_1213048=/2NKHS4HXU0T7FpkwpJ3BsV1RlsAAAAAAY3wbUkXacAceu2NkgUrhw==; path=/; Domain=.carrefour.fr
x-iinfo: 7-11020186-11020187 NNNN CT(1 2 0) RT(1531344324722 0) q(0 0 0 0) r(4 4) U12
x-cdn: Incapsula
content-encoding: gzip
X-Firefox-Spdy: h2
response content : expected HTML page
Node.js robot
request headers:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3
Cache-Control: no-cache
Connection: keep-alive
DNT: 1
Host: ooshop.carrefour.fr
Pragma: no-cache
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0
response headers:
content-type: text/html
connection: close, close
cache-control: no-cache
content-length: 210
x-iinfo: 1-17862634-0 0NNN RT(1531344295049 65) q(0 -1 -1 0) r(0 -1) B10(4,314,0) U19
set-cookie: incap_ses_466_1213048=j34jMBWkPFYT7FpkwpJ3Bqd1RlsAAAAAVBfoZBShAvoun/M8UFxPPA==; path=/; Domain=.carrefour.fr
response content:
<html>
<head>
<META NAME="robots" CONTENT="noindex,nofollow">
<script src="/_Incapsula_Resource?SWJIYLWA=5074a744e2e3d891814e9a2dace20bd4,719d34d31c8e3a6e6fffd425f7e032f3">
</script>
<body>
</body></html>
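The blocked response above is an Incapsula JavaScript challenge (note x-cdn: Incapsula and the /_Incapsula_Resource script), so the detection happens at the CDN rather than from anything in your HTTP headers; TLS fingerprinting and the JS challenge itself can distinguish clients even when headers match exactly. One common workaround, sketched here in Python with hypothetical cookie values, is to copy the visid_incap_*/incap_ses_* cookies from a real browser session that has already passed the challenge:

import requests

# Hypothetical values: copy these from the browser after it has passed
# the Incapsula challenge (names taken from the Set-Cookie headers above).
cookies = {
    'visid_incap_1213048': '<value from the browser>',
    'incap_ses_466_1213048': '<value from the browser>',
}

r = requests.get(
    'https://ooshop.carrefour.fr/3274080001005/eau-de-source-cristaline',
    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) '
                           'Gecko/20100101 Firefox/60.0'},
    cookies=cookies,
)
print(r.status_code, r.headers.get('content-type'))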

Server headers - 302 temporary redirect

I have just checked the server headers for my website and this is what I got:
1 Server Response: http://www.pjnsports.co.uk
HTTP/1.1 302 Moved Temporarily
Content-Length: 0
Location: /?6690d3e0
I haven't set up any 302 redirects - I assume this is my host doing it. Is this normal practice? Will it have a negative effect on search results, site load speed, etc.? Basically, should I be going to them and telling them to do something about it?
Cheers
Paul
I'm not getting any 302 / 301 going to that site:
GET / HTTP/1.1
Host: www.pjnsports.co.uk
Connection: keep-alive
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.107 Safari/534.13
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: PHPSESSID=emr10g0gs9srtjccadb4k7t846; language=en; currency=GBP; __utmz=239376578.1300041169.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=239376578.1365490247.1300041169.1300041169.1300041169.1; __utmc=239376578; __utmb=239376578.1.10.1300041169
HTTP/1.1 200 OK
Date: Sun, 13 Mar 2011 18:33:41 GMT
Server: Apache
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 6531
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=utf-8
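One visible difference between the two captures: the 200 response above was fetched with an existing PHPSESSID cookie, while a header-checker tool (or a crawler on its first visit) arrives with no cookies at all, which may be when the 302 to /?6690d3e0 appears. A sketch of reproducing both cases with Python's requests:

import requests

# First visit: no cookies, no redirect following. This is roughly what a
# header-checker or a crawler sees on its first request.
r = requests.get('http://www.pjnsports.co.uk', allow_redirects=False)
print(r.status_code, r.headers.get('Location'))

# Same URL with a session: cookies from the first response are reused,
# so follow-up requests behave like the browser capture above.
with requests.Session() as s:
    r2 = s.get('http://www.pjnsports.co.uk')  # follows redirects by default
    print(r2.status_code, r2.url)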
