When I use pycurl to execute a curl command, I get error 3 "illegal characters found in URL" but when pasting said URL in Chrome, it can be resolved - python-3.x

Good day. I'm writing a python program that requests some posts from my Facebook page. To do so, Facebook offers a tool that they call "Graph API Explorer". Using something similar to a GET request, I can get anything that I want (granted that I have access and a valid token). I've come up with my own solution for the Graph API Explorer and that is generating my URLs. After generating a URL, I use pycurl to get a JSON object from Facebook that contains all of my data.
When I use pycurl, I get the following error:
pycurl.error: (3, 'Illegal characters found in URL')
but when printing said URL and pasting it to a browser, I got a valid response.
URL: https://graph.facebook.com/v7.0/me?fields=posts%7Bmessage%2Cfrom%7D&access_token=<and my access token which is valid>
My code looks like this:
from io import BytesIO
import certifi
import pycurl
def get_posts_curl(nodes=['posts'], fields=[['message', 'from']], token_file='Facebook/token.txt'):
    curl = pycurl.Curl()
    response = BytesIO()
    token = get_token_from_file(token_file)
    # Construct the request URL (the helper functions are defined elsewhere in my program).
    url = parse_facebook_url_request(nodes, fields, token)
    url = convert_to_curl(url)
    print("---URL---: " + url)
    # curl session and settings.
    curl.setopt(curl.CAINFO, certifi.where())
    curl.setopt(curl.URL, url)
    curl.setopt(curl.WRITEDATA, response)
    curl.perform()
    curl.close()
    return response.getvalue().decode('utf-8')
The error pops up at curl.perform()
Some info that might be relevant:
All was working great a while ago. After transferring my program from my workstation (that is running Windows 10) to my server (Ubuntu 18.04 Server) still, all was working fine and I placed that project to the side. Only now that error pops up and I haven't touched the project in a while.

It seems that the token is causing the issue. I've tried about 100 tokens; some cause the problem and some don't. What fixed it in every case was passing the URL through urllib.parse.unquote:
from urllib.parse import unquote
...
url = unquote(url)
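To make the fix above concrete, here is a minimal sketch of what unquote does to a URL shaped like the one in the question. The build_graph_url helper and the TOKEN placeholder are hypothetical (the asker's own parse_facebook_url_request and convert_to_curl are not shown in the thread). The sketch only illustrates the encoding round trip; it does not establish why some tokens triggered the error and others did not.

```python
from urllib.parse import unquote, urlencode

GRAPH_BASE = "https://graph.facebook.com/v7.0/me"  # base URL taken from the question


def build_graph_url(fields, token):
    """Hypothetical helper: build a percent-encoded Graph API URL."""
    query = urlencode({"fields": fields, "access_token": token})
    return GRAPH_BASE + "?" + query


encoded = build_graph_url("posts{message,from}", "TOKEN")  # TOKEN is a placeholder
decoded = unquote(encoded)

print(encoded)  # braces and comma appear as %7B, %2C, %7D
print(decoded)  # braces and comma appear literally again
```

Printing repr(url) rather than url just before curl.perform() would also expose invisible characters, such as a trailing newline picked up when the token is read from a file, which may be worth ruling out.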

Related

Can MechanicalSoup log into page requiring SAML Auth?

I'm trying to download some files from behind a SSO (Single Sign-On) site. It seems to be SAML authenticated, that's where I'm stuck. Once authenticated I'll be able to perform API requests that return JSON, so no need to interpret/scrape.
Not really sure how to deal with that in mechanicalsoup (and relatively unfamiliar with web-programming in general), help would be much appreciated.
Here's what I've got so far:
import mechanicalsoup
from getpass import getpass
import json
login_url = ...
verbose = True  # debug flag used below
br = mechanicalsoup.StatefulBrowser()
response = br.open(login_url)
if verbose: print(response)
# provide the username + password
br.select_form('form[id="loginForm"]')
br.get_current_form().print_summary()  # Just to see what's there (print_summary() prints by itself).
br['UserName'] = input('Email: ')
br['Password'] = getpass()
response = br.submit_selected().text
if verbose: print(response)
At this point I get a page telling me javascript is disabled and that I must click submit to continue. So I do:
br.select_form()
response = br.submit_selected().text
if verbose: print(response)
That's where I get a complaint about state information being lost.
Output:
<h2>State information lost</h2>
State information lost, and no way to restart the request<h3>Suggestions for resolving this problem:</h3><ul><li>Go back to the previous page and try again.</li><li>Close the web browser, and try again.</li></ul><h3>This error may be caused by:</h3><ul><li>Using the back and forward buttons in the web browser.</li><li>Opened the web browser with tabs saved from the previous session.</li><li>Cookies may be disabled in the web browser.</li></ul>
The only hits I've found on scraping behind SAML logins are all going with a selenium approach (and sometimes dropping down to requests).
Is this possible with mechanicalsoup?
My situation turned out to require JavaScript for login. My original question about getting through SAML auth did not describe the true environment, so this question has not truly been answered.
Thanks to @Daniel Hemberger for helping me figure that out in the comments.
In this situation MechanicalSoup is not the correct tool (due to JavaScript), and I ended up using selenium to get through authentication and then using requests.
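The selenium-then-requests hand-off described above is commonly done by copying the browser's cookies into the HTTP client once login has completed. A minimal sketch, assuming the site tracks the authenticated session through cookies (the thread does not confirm this): cookie_header_from_selenium is a hypothetical helper, and the sample cookie names and values are invented. The list-of-dicts shape is the one selenium's driver.get_cookies() returns.

```python
def cookie_header_from_selenium(cookies):
    """Join selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join(f"{c['name']}={c['value']}" for c in cookies)


# Example data in the shape returned by driver.get_cookies(); values are made up.
example_cookies = [
    {"name": "session_id", "value": "abc123", "domain": "example.com", "path": "/"},
    {"name": "auth_token", "value": "xyz789", "domain": "example.com", "path": "/"},
]

header = cookie_header_from_selenium(example_cookies)
print(header)
```

The resulting string could then be sent with requests.get(api_url, headers={'Cookie': header}), where api_url stands for whichever JSON endpoint sits behind the login.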

PRIVACY Card API Authorization

I have recently been working with APIs, but I am stuck on one thing and it's been holding me back for a few days.
I am trying to work with Privacy's API and I do not understand the Authentication/Authorization process. When I enter the URL in a browser I get the error "message": "Please provide API key in Authorization header", even when I use the correct format of Authorization. I also get an error when I make a request in Python. The format I'm using for the URL is https://api.privacy.com/v1/card "Authorization: api-key:".
Could someone explain how this works, or simply give an example of how I would make a request with Python 3? The API information is in the link below.
Thank you in advance.
https://developer.privacy.com/docs
This is the code I am using in Python. After I run this I receive a 401 status code.
import requests
headers={'Authorization': 'api-key:200e6036-6894-xxxx-xxxx-xxxx'}
url = 'https://api.privacy.com/v1/card'
r = requests.get(url)
print("Status code:", r.status_code)
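You need to pass the authentication header to the get call. Defining it in a headers variable is not enough; you have to hand those headers to requests through the headers argument. Note that the example below also writes the value as api-key followed by a space and the key, without a colon: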
import requests
response = requests.get('https://api.privacy.com/v1/card', headers={'Authorization': 'api-key 65a9566c-XXXXXXXXXXXX'})
print(response.json())
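The point of the answer, that the header has to travel with the request object itself, can be checked without touching the network. The sketch below swaps requests for the standard library's urllib.request purely so the attached header can be inspected offline. build_card_request is a hypothetical helper, and YOUR-API-KEY is a placeholder.

```python
import urllib.request

CARD_URL = "https://api.privacy.com/v1/card"  # endpoint from the question


def build_card_request(api_key):
    """Hypothetical helper: attach the Authorization header to the request object."""
    headers = {"Authorization": "api-key " + api_key}
    return urllib.request.Request(CARD_URL, headers=headers)


request = build_card_request("YOUR-API-KEY")  # placeholder, not a real key
print(request.get_header("Authorization"))
```

Nothing is sent until the request is passed to urllib.request.urlopen(request), so the header can be verified first.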

Spotipy - Cannot log in to authenticate (Authorization Code Flow)

I am working with the Spotipy Python library to connect to the Spotify Web API. I want to get access to my Spotify user account via the Authorization Code Flow. I am using Python 3.5, Spotipy 2.4.4, Google Chrome 55.0.2883.95 (64-bit) and Mac OS Sierra 10.12.2.
First, I went to the Spotify Developer website to register the program to get a Client ID, a Client Secret key and enter a redirect URI (https://www.google.com) on my white-list.
Second, I set the environment variables from terminal:
export SPOTIPY_CLIENT_ID='my_spotify_client_id'
export SPOTIPY_CLIENT_SECRET='my_spotify_client_secret'
export SPOTIPY_REDIRECT_URI='https://www.google.com'
Then I try to run the example code typing 'python3.5 script.py my_username' from terminal. Here is the script:
import sys
import spotipy
import spotipy.util as util
scope = 'user-library-read'
if len(sys.argv) > 1:
    username = sys.argv[1]
else:
    print("Usage: %s username" % (sys.argv[0],))
    sys.exit()
token = util.prompt_for_user_token(username, scope)
if token:
    sp = spotipy.Spotify(auth=token)
    results = sp.current_user_saved_tracks()
    for item in results['items']:
        track = item['track']
        print(track['name'] + ' - ' + track['artists'][0]['name'])
else:
    print("Can't get token for", username)
When running this code, it takes me to a login screen in my browser. I enter my Spotify credentials to grant access to my app, but when I finally click 'log in' (or 'Iniciar sesión' in Spanish) nothing happens. I tried to log in with my Facebook account, but that does not work either. It seems that I get a Bad Request every time I click 'log in'.
The process is incomplete because when trying to enter the redirect URI back on terminal I get another Bad Request:
raise SpotifyOauthError(response.reason)
spotipy.oauth2.SpotifyOauthError: Bad Request
I tried restarting my computer, clearing my browser cookies and using a different browser, but nothing worked. It seems that I am not the only one having this problem. Is it maybe a bug? Please avoid answers like "read the documentation of the API going here". Thank you.
Solved it. In order to run the Authorization Code Flow example code from the Spotipy documentation correctly, I specified the redirect URI in line 13 of the example script when calling util.prompt_for_user_token, even though I had already set it through the environment variables:
token = util.prompt_for_user_token(username, scope, redirect_uri = 'https://example.com/callback/')
Likewise, do not use https://www.google.com or a similar web address as your redirect URI. Instead, try 'https://example.com/callback/' or 'http://localhost/' as suggested here. Do not forget that the URL you are redirected to after logging in must contain the word code, as a code=... query parameter.
Cheers,
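To check that a redirected URL really carries the code before pasting it back into the terminal, the query string can be parsed with the standard library. This is a minimal sketch: extract_auth_code is a hypothetical helper, and the sample URLs and code value are invented.

```python
from urllib.parse import parse_qs, urlparse


def extract_auth_code(redirected_url):
    """Return the value of the 'code' query parameter, or None if it is absent."""
    query = parse_qs(urlparse(redirected_url).query)
    codes = query.get("code")
    return codes[0] if codes else None


good = extract_auth_code("https://example.com/callback/?code=AQD123&state=xyz")
missing = extract_auth_code("https://example.com/callback/?error=access_denied")

print(good)
print(missing)
```

If the helper returns None, the login step did not complete and the URL should not be handed back to Spotipy.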

Unicode-objects must be encoded before hashing when requesting data using Flask-OAuth

I'm integrating Google's login with a Flask site using Flask-OAuth.
Everything is working fine. I can authorise the login and get a token back etc without any difficulties. But when I use Flask-OAuth's get method to request the logged in user's email address I get an error saying:
TypeError: Unicode-objects must be encoded before hashing
I'm using Python3 and this has the smell of a Python version issue but I can't figure out what I'd need to change.
The code I'm using is this:
def get_additional_data(self):
    access_token = session.get('oauth_token')
    headers = {'Authorization': 'OAuth ' + access_token[0]}
    return self.service.get(
        'https://www.googleapis.com/oauth2/v1/userinfo', None,
        headers=headers)
I'm not sure what I can encode in that request. Even if I don't pass the headers I get the same error (rather than an invalid request or something like that).
I've run 2to3 on oauth2/__init__.py and the tweaks it suggests are very minor and shouldn't prevent the code from running in Python 3. Also, everything else OAuth2 related is working.
The bad news is that the solution to this problem is switching to Flask-OAuthlib.
The good news is it required very few changes from Flask-OAuth to get it working.
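For context, the TypeError in the question is the one Python 3's hashlib raises when it is handed a str instead of bytes. The exact wording of the message varies between Python versions. It is an assumption that the failing call sits in the request-signing code of the oauth2 package underneath Flask-OAuth, since the thread does not show the traceback. The sketch below reproduces the error in isolation; sha1_hex is a hypothetical helper.

```python
import hashlib


def sha1_hex(text):
    """Encode a str to bytes before hashing, as Python 3 requires."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


try:
    hashlib.sha1("some-token")  # str, not bytes: rejected on Python 3
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

print(sha1_hex("some-token"))
```

Code written for Python 2 often hashes plain strings this way, which is why the error tends to surface deep inside a library rather than in the calling code.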

Using SSL with the python-instagram and localhost on sample-app.py

I plan on using the sample-app.py as a baseline of what I am building out and then expanding it from there. Just want to get comfortable with the instagram API and build out from there.
I am trying to use the sample-app.py provided with python-instagram. I have registered an application on Instagram's website. I set it up using the default redirect URI from sample-app.py:
http://localhost:8515/oauth_callback .
I was able to authorize my Instagram account to use the app, but when I click on any of the links, I get an error about the access token.
In the Python command-line window that stays open, I see the following error:
"check_hostname needs a SSL context with either CERT_OPTIONAL or CERT_REQUIRED"
It appears that when the sample app processes the lines below, it tries to connect to Instagram but cannot, because SSL on localhost is not set up properly. How do I set up SSL so I do not get the above error?
access_token, user_info = unauthenticated_api.exchange_code_for_access_token(code)
if not access_token:
    return 'Could not get access token'
api = client.InstagramAPI(access_token=access_token)
request.session['access_token'] = access_token
print("access token=" + access_token)
There are a few steps to solve this problem (it appears that several separate problems combine to cause it):
1. Use OpenSSL to create an SSL certificate and save it in the same location as your Python script. OpenSSL for Windows can be downloaded here: http://slproweb.com/products/Win32OpenSSL.html
2. Tweak bottle so that it supports SSL, by adding the following lines to the run method of class WSGIRefServer(ServerAdapter):
import ssl
srv.socket = ssl.wrap_socket(
    srv.socket,
    certfile='server.pem',  # path to certificate
    server_side=True)
3. There is a bug in Python 3 (https://github.com/jcgregorio/httplib2/issues/173). I am using 3.4, so it may be fixed in 3.5. As a workaround, in the instagram/oauth2.py file, change every disable_ssl_certificate_validation=False to True.
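The ssl.wrap_socket function used in the answer above was deprecated in Python 3.7 and removed in Python 3.12. On newer interpreters the same wrapping can be sketched with an SSLContext. wrap_server_socket is a hypothetical helper, and server.pem (the file name from the answer) is assumed to contain both the certificate and its private key.

```python
import ssl


def wrap_server_socket(sock, certfile="server.pem"):
    """Wrap a listening socket in TLS using an explicit server-side context."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile)  # raises OSError if the file is missing
    return context.wrap_socket(sock, server_side=True)


# Inside bottle's WSGIRefServer.run this would replace the ssl.wrap_socket call:
#     srv.socket = wrap_server_socket(srv.socket)
```

Loading the certificate before wrapping means a wrong path fails immediately at startup instead of on the first incoming connection.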
