How to handle ezproxy authorization to download data through API? - python-3.x

I have a token that gives me access to download large files from UN Comtrade. The original site is http://comtrade.un.org/, but I have premium access through my university library subscription. When I want to use the premium features, the website automatically redirects me to a login page, and after I press the login button the URL is https://ezproxy.nu.edu.kz:5588/data/dev/portal/. I am trying to send requests and download files through the API (using requests). I get a response from http://comtrade.un.org/, but to download I need to go through https://ezproxy.nu.edu.kz:5588/data/dev/portal/, and when I tried to download, this error appeared: urllib.error.HTTPError: HTTP Error 401: Unauthorized. How can I handle this problem?
import json
import urllib.request
import requests
# Note: url, dwld_url and token are used below but were not shown in the original post.
px = 'px=HS&'  # classification
freq = 'freq=A&'  # annual
type = 'type=C&'  # commodity
auth = 'https://comtrade.un.org/api/getUserInfo?token=ZF5TSW8giRQMFHuPmS5JwQLZ5FB%2BNO0NCcjxFQUJADrLzCRDCkG5F0ZPnZTYQWO3MPgj96gZNF7Z9iN8BwscUMYBbXuDVYVDvsTAVNzAJ6FNC2dnN7gtB1rt9qJShAO467zBegHTLwvmlRIBSpjjwg%3D%3D'
with open('reporterAreas.json') as json_file:
    data = json.load(json_file)
ls = data['results']
list_year = [*range(2011, 2021, 1)]
for years in list_year:
    print(years)
    ps = 'ps=' + str(years) + '&'
    for item in ls:
        r = item['id']  # report_country_id
        report_country_txt = item['text']
        if r == 'all':
            req_url = 'r=' + r + '&' + px + ps + type + freq + token
            request = url + req_url
            response = requests.get(request)
            if response.status_code == 200:
                print("Response is OK!")
                data = response.json()[0]
                download_url = dwld_url + data['downloadUri']
                print(download_url)
                filename = str(years) + '_' + report_country_txt + '.zip'
                urllib.request.urlretrieve(url, filename)

I'm not sure whether EZproxy provides an API or SDK for authenticating a request, but I don't think it does.
What you can do is pass your EZproxy session along with your request. Because the request then carries a valid session, it will not be treated as unauthorized.
Note that you can retrieve your EZproxy session ID from your browser cookies.
Finally, you have to make your request against the starting point URL.
Alternatively, you can use Selenium to fill in the login form automatically and retrieve the EZproxy session ID to pass to your requests.
I hope this helps!
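A minimal sketch of the session-cookie approach, assuming the session ID has been copied from the browser's cookies after logging in. The cookie name `ezproxy` and the helper names `build_proxied_request` and `download` are assumptions made for illustration; check the cookie list in your browser for the name your library's EZproxy instance actually sets. It uses `urllib.request`, as the download step in the question does.

```python
import urllib.request

# Proxy host taken from the question; requests must go through it, not comtrade.un.org.
EZPROXY_BASE = "https://ezproxy.nu.edu.kz:5588"


def build_proxied_request(path, session_id, cookie_name="ezproxy"):
    """Return a Request for `path` on the proxy host carrying the session cookie.

    `cookie_name` is an assumption; use whatever name your browser shows.
    """
    request = urllib.request.Request(EZPROXY_BASE + path)
    request.add_header("Cookie", f"{cookie_name}={session_id}")
    return request


def download(path, session_id, filename):
    """Fetch `path` through the proxy and write the response body to `filename`."""
    request = build_proxied_request(path, session_id)
    with urllib.request.urlopen(request) as response, open(filename, "wb") as out:
        out.write(response.read())
```

The session expires, so a long-running download loop may need a fresh session ID from time to time.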

Related

Python Akamai Sensor Data Generation with valid Cookie abck

I am trying to send two POST requests to www.footlocker.it:
sess = requests.session()
print("start-Point")
bot = BotDetector()
payload = "{\"sensor_data\":\"" + bot.generatesensordata() + "\"}"
d = sess.post(url_ak, headers=headers_ak, data=payload, verify=False, timeout=15)
bot.cookie = sess.cookies["_abck"]
payload = "{\"sensor_data\":\"" + bot.generatesensordata1() + "\"}"
d = sess.post(url_ak, headers=headers_ak, data=payload, verify=False, timeout=15)
print('Status code {},'.format(d.status_code))
print('Header {},'.format(d.headers))
The goal is to get a valid _abck cookie and a successful status code.
I have written some custom code for BotDetector, but I can't get past the check with a good result.
This most likely means your sensor data is bad. Take a look at the Akamai script for the site and compare it to what you generate now.

Sending URL inside another URL

So I'm building a Telegram bot with Python and I need to send the user a URL. I'm using this Telegram URL to send text:
https://api.telegram.org/bot{bot_token}/sendMessage?chat_id={chat_id}&parse_mode=Markdown&text={message}
but the URL that I'm using:
https://www.amazon.es/RASPBERRY-Placa-Modelo-SDRAM-1822096/dp/B07TC2BK1X/ref=sr_1_3?__mk_es_ES=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=YJ6X8FN3V801&keywords=raspberry+pi+4&qid=1577853490&sprefix=raspberr%2Caps%2C195&sr=8-3
contains special characters such as & that prevent the message from being sent with the full URL. With this URL I only receive this:
https://www.amazon.es/RASPBERRY-Placa-Modelo-SDRAM-1822096/dp/B07TC2BK1X/ref=sr13?mkesES=ÅMÅŽÕÑ
I tried replacing characters such as & with their escaped equivalents, but Python converts them back to the literal character, so I had to drop the idea.
In case you want to check out what I tried here is the code snippet:
url = url.replace('&', u"\x26")
So is there any way I could fix this?
Encode the URL with urllib.parse.quote():
import requests
import urllib.parse
link = "https://www.amazon.es/RASPBERRY-Placa-Modelo-SDRAM-1822096/dp/B07TC2BK1X/ref=sr_1_3?__mk_es_ES=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=YJ6X8FN3V801&keywords=raspberry+pi+4&qid=1577853490&sprefix=raspberr%2Caps%2C195&sr=8-3"
markdownMsg = "[Click me!](" + urllib.parse.quote(link) + ")"
url = "https://api.telegram.org/bot<TOKEN>/sendMessage?chat_id=<ID>&text=" + markdownMsg + "&parse_mode=MarkDown"
response = requests.request("GET", url, headers={}, data ={})
print(response.text.encode('utf8'))
This also works for &parse_mode=HTML:
htmlMsg = '<a href="' + urllib.parse.quote(link) + '">Click me!</a>'
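An alternative sketch that percent-encodes the whole query string at once with `urllib.parse.urlencode`, so every `&`, `=` and `?` inside the message is escaped and Telegram receives the link intact. The helper name `build_send_message_url` is an assumption made for illustration, and the link is a shortened version of the one in the question.

```python
from urllib.parse import urlencode


def build_send_message_url(bot_token, chat_id, text, parse_mode="Markdown"):
    """Build a sendMessage URL with every query value percent-encoded."""
    query = urlencode({"chat_id": chat_id, "parse_mode": parse_mode, "text": text})
    return f"https://api.telegram.org/bot{bot_token}/sendMessage?{query}"


# Shortened, illustrative link; any URL containing & and = behaves the same way.
link = "https://www.amazon.es/dp/B07TC2BK1X/ref=sr_1_3?crid=YJ6X8FN3V801&keywords=raspberry+pi+4&sr=8-3"
message = f"[Click me!]({link})"
url = build_send_message_url("<TOKEN>", "<ID>", message)
print(url)
```

With `requests`, passing the same dictionary as `params=` to `requests.get` performs the equivalent encoding.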

call OAUTH2 api in python script

I found this interesting article: https://developer.byu.edu/docs/consume-api/use-api/oauth-20/oauth-20-python-sample-code
It shows an example of calling an OAuth2 API using the authorization_code flow. The problem with this approach is that you need to open a browser, get the code, and paste it into the script. I would like to open the URL and get the code directly from the Python script. Is that possible?
print("go to the following url on the browser and enter the code from the returned url: ")
print("--- " + authorization_redirect_url + " ---")
access_token = input('access_token: ')
I have been battling with this same problem today and found that the following worked for me. You'll need:
An API ID
A secret key
the access token url
I then used requests_oauthlib: https://github.com/requests/requests-oauthlib
from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session
# Both id and secret should be provided by whoever owns the api
myid = 'ID_Supplied'
my_secret = 'Secret_pass'
access_token_url = "https://example.com/connect/token" # suffix is also an example
# BackendApplicationClient uses the client credentials grant, so no browser step is needed
client = BackendApplicationClient(client_id=myid)
oauth = OAuth2Session(client=client)
token = oauth.fetch_token(token_url=access_token_url, client_id=myid,
                          client_secret=my_secret)
print(token)
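For reference, here is a rough sketch of the token request that the client credentials grant sends, built with the standard library only and not actually sent. The helper name `build_token_request` is an assumption, and sending the ID and secret via HTTP Basic authentication is one common option; some providers expect them in the form body instead. This grant only works if the API owner has enabled it for your client.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret):
    """Build (but do not send) a client credentials token request."""
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(token_url, data=body, method="POST")
    request.add_header("Authorization", f"Basic {basic}")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request


# Placeholder values from the answer above.
token_request = build_token_request(
    "https://example.com/connect/token", "ID_Supplied", "Secret_pass"
)
```

Passing `token_request` to `urllib.request.urlopen` would send it; the JSON response normally contains the `access_token` field.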

Unable to get the response in POST method in Python

I am facing a unique problem.
Following is my code.
url = 'ABCD.com'
cookies={'cookies':'xyz'}
r = requests.post(url,cookies=cookies)
print(r.status_code)
json_data = json.loads(r.text)
print("Printing = ",json_data)
When I use the URL and cookie in Postman and send a POST request, I get a JSON response. But when I use the above code with a POST request in Python, I get
404
Printing = {'title': 'init', 'description': "Error: couldn't find a device with id: xxxxxxxxx in ABCD: d1"}
But when I use the following code, i.e. with the GET request method,
url = 'ABCD.com'
cookies={'cookies':'xyz'}
r = requests.get(url,cookies=cookies)
print(r.status_code)
json_data = json.loads(r.text)
print("Printing = ",json_data)
I get
200
Printing = {'apiVersion': '0.4.0'}
I am not sure why the POST method returns a JSON response in Postman but does not work when I try it from Python. I use Python 3.6.4.
I finally found what was wrong. The following is the correct way:
url = 'ABCD.com'
cookies = 'cookies=xyz'  # raw Cookie header value, as a string
r = requests.post(url, headers={'Cookie': cookies})
print(r.status_code)
json_data = json.loads(r.text)
print("Printing = ",json_data)
The web page was expecting the cookie to be sent as a header, and with that I got the correct response.
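If the cookies are already held in a dictionary, a small helper can join them into the raw `Cookie` header value. This is only a sketch: the helper name `cookie_header` is an assumption, and it does no escaping, so it suits simple name/value pairs like the one in the question.

```python
def cookie_header(cookies):
    """Join a dict of cookie names and values into a raw Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


# Same placeholder cookie as in the question.
headers = {"Cookie": cookie_header({"cookies": "xyz"})}
print(headers)
```

The resulting `headers` dictionary can be passed to `requests.post(url, headers=headers)`.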

Get list of commits by user with the GitLab API

I can successfully access info about a user with this command:
curl http://gitlab.$INTERNAL_SERVER.com/api/v3/\
users/$USER_ID\?private_token\=$GITLAB_TOKEN
However, I can not find the API endpoint for getting a list of the commits that the user has pushed to the GitLab server. Does a URL with this info exist?
To the best of my knowledge, such an API endpoint does not exist. The best I've been able to come up with is this flow:
find all the projects the user is involved with (not 100% simple in itself),
then get the commits for each project,
then filter those commits by the user's email.
I am using java-gitlab-api to access the GitLab server, so I don't have curl samples handy (sorry!).
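The last step of that flow can be sketched in Python as a plain filter over commit records that have already been fetched. The helper name `commits_by_author` and the sample records are made up for illustration; it assumes each commit is a dictionary with an `author_email` key, as in the JSON returned by the project commits endpoint.

```python
def commits_by_author(commits, author_email):
    """Return the commits whose author_email matches, ignoring case."""
    wanted = author_email.lower()
    return [c for c in commits if c.get("author_email", "").lower() == wanted]


# Made-up sample data standing in for one project's commit list.
sample_commits = [
    {"id": "a1", "author_email": "alice@example.com"},
    {"id": "b2", "author_email": "bob@example.com"},
    {"id": "c3", "author_email": "Alice@Example.com"},
]
mine = commits_by_author(sample_commits, "alice@example.com")
print([c["id"] for c in mine])
```

Matching on email rather than display name avoids missing commits made under a differently spelled name, though a user may still have committed under several addresses.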
It looks like you can get a list of commits by using the Events endpoint:
params = {"action": "pushed"}
data = requests.get(host + "/api/v4/users/{id}/events".format(id=user_id),
                    params=params).json()
You can chain requests by updating params:
params.update({"before": before_date})
where before_date can be taken from the last element in data, and you can loop to get all commits by the user from a specific date.
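A sketch of a loop that collects every page of results. Note that it uses GitLab's `page` and `per_page` pagination parameters rather than the `before` chaining described above, which is a swapped-in technique. The names `collect_all_pages` and `fetch_page` are assumptions; `fetch_page` stands for any function that performs the HTTP request for one page and returns the decoded JSON list.

```python
def collect_all_pages(fetch_page, per_page=20):
    """Call fetch_page(page, per_page) until a short or empty page is returned."""
    items = []
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        if not batch:
            break
        items.extend(batch)
        if len(batch) < per_page:
            # A short page means there is nothing left to fetch.
            break
        page += 1
    return items


# Stand-in for the real HTTP call: serves slices of a fixed list of fake events.
fake_events = [{"id": n, "action_name": "pushed to"} for n in range(45)]


def fake_fetch(page, per_page):
    start = (page - 1) * per_page
    return fake_events[start:start + per_page]


all_events = collect_all_pages(fake_fetch, per_page=20)
print(len(all_events))
```

In real use, `fetch_page` would wrap the `requests.get` call against the events endpoint and add `page` and `per_page` to its `params`.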
I have written a Python script that does what @demaniak suggests. Enjoy!
import requests
import ujson as json
header ={...}
def get_all_commits_gitlab(project_id, username):
    json_loads_of_commit = []
    f_date = "2022-01-01T00:00:42.000+01:00"
    params = {"until": f_date}
    url_p = "https://gitlab.xxx.xx/api/v4/projects/%d/repository/commits" % project_id
    r = requests.get(url_p, params, headers=header)
    c = 0
    newDate = f_date
    while r.status_code == 200:
        jsLoad = json.loads(r.content)
        if not jsLoad:  # no commits returned, nothing more to page through
            break
        newDate = jsLoad[-1]["committed_date"]
        if params["until"] == newDate:  # no progress since the last request
            break
        # keep only the commits authored by the requested user
        user_commits = []
        for cm in jsLoad:
            if cm["author_name"] == username:
                user_commits.append(cm)
                c += 1
        json_loads_of_commit.append(user_commits)
        # ask for the next batch of older commits
        params["until"] = newDate
        r = requests.get(url_p, params, headers=header)
    print("project %d: %d commits by user %s, the first one %s" % (project_id, c, username, newDate))
    return json_loads_of_commit
