Python requests module GET method: handling pagination token in params containing % - python-3.x

I am trying to handle an API response with pagination. The first page returns a pagination token for requesting the next page, but when I feed this token back into the params argument of requests.get, it appears to be encoded incorrectly.
My attempt to retrieve the next page (using the response output of the first requests.get method):
# Initial request
response = requests.get(url=url, headers=headers, params=params)
params.update({"paginationToken": response.json()["paginationToken"]})
# Next page
response = requests.get(url=url, headers=headers, params=params)
This fails with status 500: Internal Server Error and message Padding is invalid and cannot be removed.
An example pagination token:
gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d
The url attribute of the response shows a slightly different token if you look carefully, especially around the '%' signs:
https://www.wikiart.org/en/Api/2/DictionariesByGroup?group=1&paginationToken=gyuqfh%252bqyNrV9SI1%252bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%253d
For example, the pagination token and the URL end differently: 226M%3d versus 226M%253d. When I manually copy the first part of the URL and append the correct pagination token, it does retrieve the information in a browser.
Am I missing some kind of encoding I should apply to the requests.get parameters before feeding them back into a new request?

You are right that it is a form of encoding: percent-encoding, to be precise. It is frequently used to encode URLs, and it is easy to decode:
from urllib.parse import unquote
pagination_token="gyuqfh%252bqyNrV9SI1%252bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%253d"
pagination_token = unquote(pagination_token)
print(pagination_token)
Outputs:
gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d
But I expect that is only half your problem. Use a requests Session object (https://requests.readthedocs.io/en/master/user/advanced/#session-objects) to make the requests, as there is most likely a cookie that must be sent along with the pagination token. I cannot tell for sure, as the website is currently down.
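The double encoding can be reproduced (and undone) with urllib.parse; a minimal sketch using the token from the question:

```python
from urllib.parse import quote, unquote

# The token exactly as returned by the API -- already percent-encoded once
token = ("gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9"
         "bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d")

# requests percent-encodes params values again, turning each '%' into '%25'
double_encoded = quote(token, safe="")
print(double_encoded.endswith("226M%253d"))  # True: the mangled form seen in the URL

# Decoding once before reuse restores the token the server expects
print(unquote(double_encoded) == token)      # True
```

So unquoting the token before putting it back into params, combined with a Session, should produce the request the server expects.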

Related

HERE Maps URL not being decoded

I'm trying to query the HERE Maps API with JavaScript to calculate a route with waypoints, where the waypoints are of type "passThrough". The actual produced URL is (I just removed the API key):
https://router.hereapi.com/v8/routes?xnlp=CL_JSMv3.1.21.3&apikey={API_KEY_HERE}&routingMode=fast&transportMode=truck&origin=25.900672%2C-80.253709&destination=40.213615%2C-97.188347&unit=imperial&truck=%5Bobject%20Object%5D&return=polyline%2CtravelSummary&via=40.052839%2C-87.410475!passThrough%3Dtrue
This query returns an error response, even though I'm following the documentation. Here is the problem I found: if I paste this URL into the browser and replace the "%3D" after "passThrough" with a literal "=", the API returns the expected response. I have to clarify that the URL above does work with curl -X GET. So I really think the HERE Maps API is not decoding the URL, even though they say that special characters have to be encoded.
Any clue on this?
Am I wrong?
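Although the question's code is JavaScript, the difference between the encoded and literal forms is easy to reproduce with Python's urllib.parse; a quick sketch using the "via" value from the URL above:

```python
from urllib.parse import quote, unquote

# The "via" waypoint value from the question's URL, in its decoded form
via = "40.052839,-87.410475!passThrough=true"

# Encoding the whole value turns '=' into '%3D', which the API appears to reject here
print(quote(via, safe=""))  # 40.052839%2C-87.410475%21passThrough%3Dtrue

# Decoding restores the literal '=' that works in the browser
print(unquote("passThrough%3Dtrue"))  # passThrough=true
```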

Error when posting data with Python requests

While I am trying to post data with Python requests, an error is raised.
Actual form data from browser inspection console:
{"params":"query=&hitsPerPage=1000&facetFilters=%5B%5B%22catalogs%3A000buyvallencom%22%5D%2C%22active%3Atrue%22%2C%22slug%3A3m-05710-superbuff-pad-adapter-p8hg1vv3b6b2%22%2C%22active%3Atrue%22%5D"}
I had tried the following:
session = requests.Session()
data = {
    "params": "query=&hitsPerPage=1000&facetFilters=%5B%5B%22catalogs%3A000buyvallencom%22%5D%2C%22active%3Atrue%22%2C%22slug%3A3m-05710-superbuff-pad-adapter-p8hg1vv3b6b2%22%2C%22active%3Atrue%22%5D"
}
response = session.post('https://lcz09p4p1r-dsn.algolia.net/1/indexes/ecomm_production_products/query?x-algolia-agent=Algolia%20for%20AngularJS%203.32.0&x-algolia-application-id=LCZ09P4P1R&x-algolia-api-key=2d74cf84e190a2f9cd8f4fe6d32613cc', data=data)
print(response.text)
But while posting, I get the following error:
{"message":"lexical error: invalid char in json text. Around 'params=que' near line:1 column:1","status":400}
The API accepts JSON-encoded POST data. Change data=data to json=data in your post request.
From the documentation
Instead of encoding the dict yourself, you can also pass it directly
using the json parameter (added in version 2.4.2) and it will be
encoded automatically:
>>> url = 'https://api.github.com/some/endpoint'
>>> payload = {'some': 'data'}
>>> r = requests.post(url, json=payload)
Note, the json parameter is ignored if either data or files is passed.
Using the json parameter in the request will change the Content-Type
in the header to application/json.
Code
import requests

session = requests.Session()
url = 'https://lcz09p4p1r-dsn.algolia.net/1/indexes/ecomm_production_products/query?x-algolia-agent=Algolia%20for%20AngularJS%203.32.0&x-algolia-application-id=LCZ09P4P1R&x-algolia-api-key=2d74cf84e190a2f9cd8f4fe6d32613cc'
data = {
    "params": "query=&hitsPerPage=1000&facetFilters=%5B%5B%22catalogs%3A000buyvallencom%22%5D%2C%22active%3Atrue%22%2C%22slug%3A3m-05710-superbuff-pad-adapter-p8hg1vv3b6b2%22%2C%22active%3Atrue%22%5D"
}
response = session.post(url, json=data)
print(response.text)
Output
{"hits":[{"active":true,"heading":"05710 Superbuff Pad Adapter","heading_reversed":"Adapter Pad Superbuff 05710","subheading":"","features":"Our 3M™ Adaptors for Polishers are designed for saving time and hassle in collision repair jobs requiring double-sided screw-on compounding and polishing pads. It is part of a fast, effective assembly that incorporates our 3M™ Perfect-It™ Backup Pad and wool polishing pads, allowing users to quickly attach them without removing the adaptor. This durable adaptor is used with all polishers.<br>• Part of a complete assembly for compounding and polishing<br>• Designed to attach buffing pads or backup pads to machine polishers<br>• Helps reduce wear and vibration<br>• Users can change screw-on pads without removing the adaptor, saving time<br>• Provides hassle-free centering with 3M double-sided wool compounding and polishing pads","product_id":"P8HG1VV3B6B2","product_number":"IDG05114405710","brand":"","keywords":"sanding, polishing, buffing","image":"G8HI043XOD5X.jpg","image_type":"illustration","unspsc":"31191505","system":"sxe","cost":9.0039,"catalogs":["000BuyVallenCom"],"vendor":{"name":"3M","slug":"3m","vendor_id":"VACF1JS0AAP0","image":"G8HIP6V1J7UJ.jpg"},"taxonomy":{"department":{"name":"Paint, Marking & Tape","slug":"paint-marking-tape"},"category":{"name":"Filling, Polishing, Buffing","slug":"filling-polishing-buffing"},"style":{"name":"Adapters","slug":"adapters"},"type":{"name":"Pads","slug":"pads"},"vendor":{"name":"3M","slug":"3m"}},"slug":"3m-05710-superbuff-pad-adapter-p8hg1vv3b6b2","color":null,"material":null,"model":null,"model_number":null,"shape":null,"size":null,"display_brand":null,"style":null,"purpose":null,"product_type":null,"specifications":[],"item_specifications":[],"batch_id":"000BuyVallenCom-1551410144451","status":"Stk","erp":"05114405710","iref":null,"cpn":null,"gtin":"00051144057108","description":"05710 ADAPTOR 5/8 SHAFT 
SUPERBUFF","sequence":10,"item_id":"I8HG1VV6JL3B","vpn":"05710","uom":"Ea","specification_values":[],"objectID":"000BuyVallenCom-P8HG1VV3B6B2-I8HG1VV6JL3B","_highlightResult":{"heading_reversed":{"value":"Adapter Pad Superbuff 05710","matchLevel":"none","matchedWords":[]},"subheading":{"value":"","matchLevel":"none","matchedWords":[]},"brand":{"value":"","matchLevel":"none","matchedWords":[]},"taxonomy":{"style":{"name":{"value":"Adapters","matchLevel":"none","matchedWords":[]}}}}}],"nbHits":1,"page":0,"nbPages":1,"hitsPerPage":1000,"processingTimeMS":1,"exhaustiveNbHits":true,"query":"","params":"query=&hitsPerPage=1000&facetFilters=%5B%5B%22catalogs%3A000buyvallencom%22%5D%2C%22active%3Atrue%22%2C%22slug%3A3m-05710-superbuff-pad-adapter-p8hg1vv3b6b2%22%2C%22active%3Atrue%22%5D"}
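The difference between the two keyword arguments can be seen without any network call; a small stdlib-only sketch of the two body formats:

```python
import json
from urllib.parse import urlencode

payload = {"params": "query=&hitsPerPage=1000"}

# What data= sends: a form-encoded body, which Algolia's JSON parser rejects
form_body = urlencode(payload)
print(form_body)  # params=query%3D%26hitsPerPage%3D1000

# What json= sends: a JSON document, with Content-Type set to application/json
json_body = json.dumps(payload)
print(json_body)  # {"params": "query=&hitsPerPage=1000"}
```

Note that the server's error, "invalid char in json text. Around 'params=que'", is exactly the start of the form-encoded body.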
Documentation
More complicated POST requests
Algolia API

Sending data in GET request Python

I know that using GET is not an advisable solution, but I am not in control of how this server works and have very little experience with requests.
I'm looking to send a dictionary via a GET request and was told that the server had been set up to accept this, but I'm not sure how that works. I have tried using
import requests
r = requests.get('www.url.com', data = 'foo:bar')
but this leaves the webpage unaltered, any ideas?
To send a request body with a GET request, one workaround is to override the method via a header, e.g.
request_header = {
    'X-HTTP-Method-Override': 'GET'
}
response = requests.post(request_uri, request_body, headers=request_header)
Use requests like this, passing the data in the data field of the request:
requests.get(url, headers=head, data=json.dumps({"user_id": 436186}))
It seems that you are using the wrong parameters for the get request. The doc for requests.get() is here.
You should use params instead of data as the parameter.
You are missing the http in the url.
The following should work:
import requests
r = requests.get('http://www.url.com', params = {'foo': 'bar'})
print(r.content)
The actual request can be inspected via r.request.url, it should be like this:
http://www.url.com?foo=bar
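Under the hood, params is serialized much like urllib.parse.urlencode; a minimal sketch of the URL that gets built:

```python
from urllib.parse import urlencode

base = "http://www.url.com"
params = {"foo": "bar"}

# requests appends the encoded params as the query string, roughly like this
url = f"{base}?{urlencode(params)}"
print(url)  # http://www.url.com?foo=bar
```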
If you're not sure about how the server works, you should send a POST request, like so:
import requests
data = {'name': 'value'}
requests.post('http://www.example.com', data=data)
If you absolutely need to send data with a GET request, make sure the data is in a dictionary and pass it with the params keyword instead.
You may find the requests documentation helpful.

Response URL different from initial browser URL

I'm getting a different URL from what was initially displayed when I tried it in a browser.
Facebook's docs say that a Login Request should have a particular format, so using requests and urllib.parse I tried getting the response URL:
import requests, facebook, logging

# REQUIRED AUTHENTICATION PARAMS
APP_ID = '1976346389294466'
APP_SECRET = '*************************'
REDIRECT_URI = 'https://www.facebook.com/connect/login_success.html/'

logging.basicConfig(level=logging.DEBUG)

perms = ['manage_pages', 'publish_pages']
fb_login_url = facebook.auth_url(app_id=APP_ID, canvas_url=REDIRECT_URI, perms=perms)
logging.debug("-----LOGIN URL:" + fb_login_url)

response = requests.get(fb_login_url, params={'response_type': 'token'}, allow_redirects=True)
try:
    response.raise_for_status()
except Exception as exc:
    print("There was a problem: %s" % exc)
response = requests.get(response.url)
logging.debug("-----Response URL: " + response.url)
I'm expecting a return URL in the format
https://www.facebook.com/connect/login_success.html#access_token=ACCESS_TOKEN...
However, I'm only getting the correct response when I use a browser; in my program the response returns a URL of an entirely different format:
https://www.facebook.com/login.php?skip_api_login=1&api_key=xxxxxxxxx&signed_next=1&next=https%3A%2F%2Fwww.facebook.com%2Fv2.11%2Fdialog%2Foauth%3Fredirect_uri%3Dhttps%253A%252F%252Fwww.facebook.com%252Fconnect%252Flogin_success.html%252F%26scope%3Dmanage_pages%252Cpublish_pages%26response_type%3Dtoken%26client_id%xxxxxxxxxxx%26ret%3Dlogin%26logger_id%xxxxxxxxxxxxxxx&cancel_url=https%3A%2F%2Fwww.facebook.com%2Fconnect%2Flogin_success.html%2F%3Ferror%3Daccess_denied%26error_code%3D200%26error_description%3DPermissions%2Berror%26error_reason%3Duser_denied%23_%3D_&display=page&locale=en_US&logger_id=xxx-xxxxx-xxxxx-xxxxxxxx
When I GET the last redirect URL from response.history, the response returns a URL to itself, so I'm not sure how to capture the initial value of the URL the way I do when I use the browser. The thing is, I'm not looking for anything else from the response besides the URL itself.
Additional Notes:
- In the browser, after getting the response URL, I think JavaScript also changes the URL to blank after a brief moment for security reasons.
- When I enter the wrongly formatted URL into the browser, it redirects to the right value, so is something handling the response differently when I'm using the browser? If so, how do I grab the right URL?
Simply put
When I enter fb_login_url in a browser I get...
https://www.facebook.com/connect/login_success.html#access_token=ACCESS_TOKEN...
which is what I want, but
when I do it in the app with requests...
either with requests.get(fb_login_url).url
OR (because of a 303) something like
for r in response.history:
    requests.get(r.url).url
I get the wrong URL, which is
https://www.facebook.com/login.php?skip_api_login=1&api_key=xxxxxxxxx&signed_next=1&n....

Python Requests Refresh

I'm trying to use Python's requests library to log in to a website. It's pretty simple code, and you can really get the gist of requests just by going to its website. I, however, want to check via the URL whether I'm successfully logged in. The problem I've encountered is that when I initiate the POST request and assign it to the variable p, I'm still shown the same URL when I type print(p.url), whether the HTML has changed or not. Is there any way for me to refresh the browser or update the URL to whatever it's currently set to?
(I can add a line for checking the url against itself later, but for now I just want to get the correct url)
#!/usr/bin/env python3
import requests

payload = {'login': 'USERNAME',
           'password': 'PASSWORD'}

with requests.Session() as s:
    p = s.post('WEBSITE', data=payload)
    # print(p.text)
    print(p.url)
The usage of python-requests may not be as complex as you think. It automatically handles the redirects of your post (or session.get()).
Here, the session.post() method returns a response object:
r = s.post('website', data=payload)
which means r.url is the current URL you are looking for.
If you still want to refresh the current page, just use:
s.get(r.url)
To verify whether you have logged in successfully, one solution is to do the login in your browser first. Then, based on the title or content of the webpage returned (i.e., the content in r.text), you can judge whether you have made it.
BTW, python-requests is a great library, enjoy it.
