Python 3 GitHub authorizations OAuth2 not working - python-3.x

I'm trying to implement an OAuth2 client in Python 3 so that I can upload files to GitHub. As a very basic start, I'm trying to get a list of authorizations using the API.
This code works:
from subprocess import Popen, PIPE

user = 'MYUSERNAME'
pw = 'MYPASSWORD'
git_url = "https://api.github.com/authorizations"
res = Popen(['curl', '--user', user + ':' + pw, git_url], stdout=PIPE, stderr=PIPE).communicate()[0]
print(res)
This code does not work:
user = 'MYUSERNAME'
pw = 'MYPASSWORD'
git_url = "https://api.github.com/authorizations"
import urllib.request
# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = urllib.request.HTTPBasicAuthHandler()
auth_handler.add_password(realm=None,
                          uri=git_url,
                          user=user,
                          passwd=pw)
opener = urllib.request.build_opener(auth_handler)
f = opener.open(git_url)
print(f.read())
In fact, it generates this error:
Traceback (most recent call last):
File "demo.py", line 18, in <module>
f = opener.open("https://api.github.com/authorizations")
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 375, in open
response = meth(req, response)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 487, in http_response
'http', request, response, code, msg, hdrs)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 413, in error
return self._call_chain(*args)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 347, in _call_chain
result = func(*args)
File "/opt/local/Library/Frameworks/Python.framework/Versions/3.2/lib/python3.2/urllib/request.py", line 495, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
I know that there is an existing OAuth2 implementation in Python, but it's Python 2, not Python 3, and it does a lot more than I need.
I also know that I could just have my Python program call curl, and that's my fallback.
I'd really like to know what I'm doing wrong.
Thanks.

I have just posted an answer to another question with a full example using urllib2 from Python 2. Obviously you are interested in Python 3, but it shouldn't be too difficult to migrate the code.
Hope that helps,
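A plausible explanation for the 404 (an inference from urllib's behavior, not something the traceback proves): HTTPBasicAuthHandler only sends credentials after the server answers with a 401 challenge, but the GitHub API replies to the unauthenticated first request with 404, so the handler never fires. curl with --user sends the credentials preemptively, which is why it works. You can do the same with urllib by attaching the Authorization header up front; a sketch using the placeholders from the question:

```python
import base64
import urllib.request

user = 'MYUSERNAME'   # placeholders from the question
pw = 'MYPASSWORD'
git_url = "https://api.github.com/authorizations"

# Send the Basic credentials preemptively instead of waiting for a
# 401 challenge that never arrives.
req = urllib.request.Request(git_url)
token = base64.b64encode((user + ':' + pw).encode('ascii')).decode('ascii')
req.add_header('Authorization', 'Basic ' + token)

# response = urllib.request.urlopen(req)  # uncomment to perform the call
# print(response.read())
```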

Related

Downloading a csv file with python

I'm trying to download historical stock prices from Yahoo Finance with Python, using the following code:
import urllib.request
import ssl
import os
url = 'https://query1.finance.yahoo.com/v7/finance/download/%5ENSEI?period1=1537097203&period2=1568633203&interval=1d&events=history&crumb=0PVssBOEZBk'
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
connection = urllib.request.urlopen(url,context = ctx)
data = connection.read()
with urllib.request.urlopen(url) as testfile, open('data.csv', 'w') as f:
    f.write(testfile.read().decode())
However, I'm getting the traceback below:
Traceback (most recent call last):
File "C:/Users/jmirand/Desktop/test.py", line 11, in <module>
connection = urllib.request.urlopen(url,context = ctx)
File "C:\Users\jmirand\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\jmirand\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\jmirand\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\jmirand\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\jmirand\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\jmirand\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 401: Unauthorized
I assume this has to do with HTTPS and Python not having enough certificates installed by default.
The webpage I'm on is here: Yahoo Finance NSEI historical prices. It's the 'Download Data' tab that you click on, and the data automatically gets downloaded as a CSV file.
Can you please help me rectify the code?
The Yahoo API expects cookies from your browser to authenticate. I copied the cookies from my browser and passed them through Python requests:
import requests
import csv
url = "https://query1.finance.yahoo.com/v7/finance/download/%5ENSEI?period1=1537099135&period2=1568635135&interval=1d&events=history&crumb=MMDwV5mvf2J"
cookies = {
    "APID": "UP26c2bef4-bc0b-11e9-936a-066776ea83e8",
    "APIDTS": "1568635136",
    "B": "d10v5dhekvhhg&b=3&s=m6",
    "GUC": "AQEBAQFda2VeMUIeqgS6&s=AQAAAICdZvpJ&g=XWoXdg",
    "PRF": "t%3D%255ENSEI",
    "cmp": "t=1568635133&j=0",
    "thamba": "2"
}
with requests.Session() as s:
    download = s.get(url, cookies=cookies)
    decoded_content = download.content.decode('utf-8')
    cr = csv.reader(decoded_content.splitlines(), delimiter=',')
    my_list = list(cr)
    for row in my_list:
        print(row)
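To finish the original goal of saving the data to data.csv rather than printing it, the parsed rows can be written back out with csv.writer. A minimal sketch, with made-up rows standing in for the downloaded data:

```python
import csv

# Stand-in rows; in the answer above these come from csv.reader.
rows = [["Date", "Open", "Close"],
        ["2019-09-16", "11003.05", "10817.60"]]

# newline='' prevents csv from inserting blank lines on Windows.
with open('data.csv', 'w', newline='') as f:
    csv.writer(f).writerows(rows)
```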

My code is not working - I've tried everything with the urllib package

>>> import re
>>> import urllib.request
>>> url="https://www.google.com/search?q=googlestock"
>>> print(url)
https://www.google.com/search?q=googlestock
>>> data=urllib.request.urlopen(url).read()
I get an error, even though the URL works fine when opened manually. The error is:
File "<pyshell#4>", line 1, in <module>
data=urllib.request.urlopen(url).read()
File "C:\Users\SHARM\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\SHARM\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Users\SHARM\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Users\SHARM\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Users\SHARM\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\SHARM\AppData\Local\Programs\Python\Python37-32\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
If you want to do web scraping from Google, you can use the "google" library.
On your command prompt, run pip install google (it is literally "pip install google").
Then try something like this:
from googlesearch import search

for s in search("googlestock"):
    print(s)
This will print all of the results from the Google search "googlestock". To learn more about this library, see: https://pypi.org/project/google/
I hope it helps,
BR
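Alternatively, the 403 from urllib alone often comes from Google rejecting urllib's default User-Agent string ("Python-urllib/3.x"); sending a browser-like User-Agent header with the request may be enough. A sketch, with the actual fetch left commented out:

```python
import urllib.request

url = "https://www.google.com/search?q=googlestock"

# A browser-like User-Agent; many sites answer 403 Forbidden to
# urllib's default agent string.
req = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})

# data = urllib.request.urlopen(req).read()  # uncomment to actually fetch
```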

HTTP error while scraping images using urllib in python 3

I have a list of URLs and I am using the following code to scrape images from websites, using urllib in Python 3.
import urllib.request

import requests
from bs4 import BeautifulSoup

# urllink is one URL from my list
i = 0
all_image_links = []
r = requests.get(urllink)
data = r.text
soup = BeautifulSoup(data, "lxml")
name = soup.find('title')
name = name.text
for link in soup.find_all('img'):
    image_link = link.get('src')
    final_link = urllink + image_link
    all_image_links.append(final_link)
for each in all_image_links:
    urllib.request.urlretrieve(each, name + str(i))
    i = i + 1
I am encountered with the following error:
Traceback (most recent call last):
File "j1.py", line 91, in <module>
import_personal_images(each)
File "j1.py", line 63, in import_personal_images
urllib.request.urlretrieve(each,name+str(i))
File "/usr/lib/python3.5/urllib/request.py", line 188, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.5/urllib/request.py", line 472, in open
response = meth(req, response)
File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.5/urllib/request.py", line 510, in error
return self._call_chain(*args)
File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
result = func(*args)
File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
I found a few solutions on the web and changed the code to:
1):
all_image_links = []
i = 0
req = Request(urllink, headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
r = webpage.decode('utf-8')
soup = BeautifulSoup(r, "lxml")
for link in soup.find_all('img'):
    image_link = link.get('src')
    all_image_links.append(urllink + image_link)
for each in all_image_links:
    urllib.request.urlretrieve(each, str(i))
    i = i + 1
2):
all_image_links = []
i = 0
headers = {'User-Agent': 'Mozilla/5.0'}
page = requests.get(urllink)
soup = BeautifulSoup(page.text, "html.parser")
for link in soup.find_all('img'):
    image_link = link.get('src')
    print(image_link)
    all_image_links.append(urllink + image_link)
for each in all_image_links:
    urllib.request.urlretrieve(each, str(i))
    i = i + 1
and I am still getting the same error. Can someone explain where my code is incorrect?
HTTP Error 403: Forbidden means the server understood the request but is refusing to fulfill it.
The server is actively denying you access to this file: either you have exceeded a rate limit, or you are not logged in and are attempting to access a privileged resource.
No amount of code, other than authorization/authentication handling, will resolve this error.
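That said, there is one concrete gap in both attempts above: urlretrieve uses urllib's global opener, so the browser-like User-Agent set on the page request never applies to the image downloads. If the server is merely rejecting urllib's default agent string, installing the header globally may resolve it (a sketch, not guaranteed for every server):

```python
import urllib.request

# Install a global opener whose requests carry a browser-like User-Agent.
# urlretrieve() goes through this opener, so the header applies to the
# image downloads as well, not just the initial page fetch.
opener = urllib.request.build_opener()
opener.addheaders = [('User-Agent', 'Mozilla/5.0')]
urllib.request.install_opener(opener)

# urllib.request.urlretrieve(image_url, filename)  # now sends the header
```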

Python bot using PRAW started returning the error "received 403 HTTP response"

I am writing a Twitter bot that is supposed to take the top post in the hot section of reddit's r/dankmemes, find the .png file, download it, then post it to Twitter using tweepy. I am not a very experienced coder and this is my first real project. I have had other issues that I found answers to on Google, but this one has me stumped. The part of the code that throws the error was working fine until I added the os.path.join() call to this line:
urllib.request.urlretrive(reddit_image_url, os.path.join('~/t-r_dankmemes__bot/jpg_folder', submission.title + '.jpg'))
Now it throws this error
Traceback (most recent call last):
File "t-r_dankmemes_bot.py", line 25, in <module>
for submission in hot_dankmemes:
File "/usr/local/lib/python3.6/site-packages/praw/models/listing/generator.py", line 52, in __next__
self._next_batch()
File "/usr/local/lib/python3.6/site-packages/praw/models/listing/generator.py", line 62, in _next_batch
self._listing = self._reddit.get(self.url, params=self.params)
File "/usr/local/lib/python3.6/site-packages/praw/reddit.py", line 367, in get
data = self.request('GET', path, params=params)
File "/usr/local/lib/python3.6/site-packages/praw/reddit.py", line 472, in request
params=params)
File "/usr/local/lib/python3.6/site-packages/prawcore/sessions.py", line 179, in request
params=params, url=url)
File "/usr/local/lib/python3.6/site-packages/prawcore/sessions.py", line 110, in _request_with_retries
data, files, json, method, params, retries, url)
File "/usr/local/lib/python3.6/site-packages/prawcore/sessions.py", line 95, in _make_request
params=params)
File "/usr/local/lib/python3.6/site-packages/prawcore/rate_limit.py", line 32, in call
kwargs['headers'] = set_header_callback()
File "/usr/local/lib/python3.6/site-packages/prawcore/sessions.py", line 139, in _set_header_callback
self._authorizer.refresh()
File "/usr/local/lib/python3.6/site-packages/prawcore/auth.py", line 328, in refresh
password=self._password)
File "/usr/local/lib/python3.6/site-packages/prawcore/auth.py", line 138, in _request_token
response = self._authenticator._post(url, **data)
File "/usr/local/lib/python3.6/site-packages/prawcore/auth.py", line 31, in _post
raise ResponseException(response)
prawcore.exceptions.ResponseException: received 403 HTTP response
I'm sorry if this question is dumb; as I said, I'm new to coding apart from the very basic Java course I took at school last year. If somebody could also explain how to "read" error messages like this, so I can trace back where the issue is myself next time, I would appreciate it.
import praw
import tweepy
import urllib.request
import os

###reddit API setup (praw)###
reddit = praw.Reddit(client_id = '******', client_secret = '******', username = '*******', password = '******', user_agent = '******')
subreddit = reddit.subreddit('dankmemes')
hot_dankmemes = subreddit.hot(limit = 5)

###twitter API setup (tweepy)###
ckey = '******'
csecret = '******'
akey = '*******'
asecret = '******'
auth = tweepy.OAuthHandler(ckey, csecret)
auth.set_access_token(akey, asecret)
TWEEPYAPI = tweepy.API(auth)

###MAIN###
for submission in hot_dankmemes:
    print('Checking ' + submission.title)
    if submission.ups >= 5000:
        print('Found Post')
        print('Checking for image in ' + submission.title)
        with urllib.request.urlopen(submission.shortlink) as pageurl:
            for line in pageurl:
                line = line.decode('utf-8')
                if 'data-url=' in line:
                    print('Found Image')
                    reddit_image_url = line[line.index('data-url="') + 10:line.index('" data-permalink=')]
                    print(reddit_image_url)
                    urllib.request.urlretrive(reddit_image_url, os.path.join('~/t-r_dankmemes__bot/jpg_folder', submission.title + '.jpg'))
                    image_path = str('~/t-r_dankmemes_bot/jpg_folder/' + submission.title + '.jpg')
                    TWEEPYAPI.update_with_media(image_path)
                    print('Image successfully posted')
                    break
        print('Post did not contain image')
        break
Thank you for your time, Tristan :)
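Two asides on the download line, separate from the 403 itself (the traceback shows the exception is raised while PRAW requests its OAuth token from reddit, at the `for submission in hot_dankmemes:` line, so reddit is rejecting the credentials before the download code ever runs): the function is spelled urlretrieve, and a leading ~ in a path is not expanded automatically. A minimal sketch of building the target path with os.path.expanduser, reusing the folder name from the question:

```python
import os

# open()/urlretrieve() treat '~' literally; expand it explicitly first.
folder = os.path.expanduser('~/t-r_dankmemes__bot/jpg_folder')
image_path = os.path.join(folder, 'some_title' + '.jpg')
print(image_path)
```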

urllib cannot read https

(Python 3.4.2)
Would anyone be able to help me fetch https pages with urllib? I've spent hours trying to figure this out.
Here's what I'm trying to do (pretty basic):
import urllib.request
url = "".join((baseurl, other_string, midurl, query))
response = urllib.request.urlopen(url)
html = response.read()
Here's my error output when I run it:
File "./script.py", line 124, in <module>
response = urllib.request.urlopen(url)
File "/usr/lib/python3.4/urllib/request.py", line 153, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 455, in open
response = self._open(req, data)
File "/usr/lib/python3.4/urllib/request.py", line 478, in _open
'unknown_open', req)
File "/usr/lib/python3.4/urllib/request.py", line 433, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 1244, in unknown_open
raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: 'https>
I've also tried using data=None to no avail:
response = urllib.request.urlopen(url, data=None)
I've also tried this:
import urllib.request, ssl
https_sslv3_handler = urllib.request.HTTPSHandler(context=ssl.SSLContext(ssl.PROTOCOL_SSLv3))
opener = urllib.request.build_opener(https_sslv3_handler)
urllib.request.install_opener(opener)
resp = opener.open(url)
html = resp.read().decode('utf-8')
print(html)
A similar error occurs with this second script, where it is raised on the "resp = ..." line and again complains that 'https' is an unknown URL type.
Python was compiled with SSL support on my computer (Arch Linux). I've tried reinstalling python3 and openssl a few times, but that doesn't help. I haven't tried to uninstall python completely and then reinstall because I would also need to uninstall a lot of other programs on my computer.
Anyone know what's going on?
-----EDIT-----
I figured it out, thanks to help from Andrew Stevlov's answer. My url had a ":" in it, and I guess urllib didn't like that. I replaced it with "%3A" and now it's working. Thanks so much guys!!!
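As a general way to handle this, special characters inside a URL component can be percent-encoded with urllib.parse.quote; passing safe='' ensures ':' is escaped as well. A small illustration:

```python
from urllib.parse import quote

# With safe='' no characters are exempt, so ':' becomes %3A.
encoded = quote('port:8080', safe='')
print(encoded)  # port%3A8080
```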
This may help:
# Ignore SSL certificate errors
import ssl
import urllib.request

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
url = input('Enter - ')
html = urllib.request.urlopen(url, context=ctx).read()
Double-check your compilation options; it looks like something is wrong with your box.
At least the following code works for me:
from urllib.request import urlopen
resp = urlopen('https://github.com')
print(resp.read())
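A quick way to verify that an interpreter was built with SSL support (just a diagnostic sketch): on a build without it, the import itself fails with ModuleNotFoundError: No module named '_ssl', while a healthy build exposes the linked OpenSSL version as a constant:

```python
import ssl

# On a build without SSL support, "import ssl" raises
# ModuleNotFoundError: No module named '_ssl' instead.
print(ssl.OPENSSL_VERSION)
```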
urllib.error.URLError: <urlopen error unknown url type: 'https>
The 'https (with a leading quote, rather than plain https) in the error message indicates that you did not request an https:// URL but a 'https:// one, which of course is not a valid scheme. Check how you construct your URL.
I had the same error when I tried to open a url with https, but no errors with http.
>>> from urllib.request import urlopen
>>> urlopen('http://google.com')
<http.client.HTTPResponse object at 0xb770252c>
>>> urlopen('https://google.com')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.7/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/usr/local/lib/python3.7/urllib/request.py", line 525, in open
response = self._open(req, data)
File "/usr/local/lib/python3.7/urllib/request.py", line 548, in _open
'unknown_open', req)
File "/usr/local/lib/python3.7/urllib/request.py", line 503, in _call_chain
result = func(*args)
File "/usr/local/lib/python3.7/urllib/request.py", line 1387, in unknown_open
raise URLError('unknown url type: %s' % type)
urllib.error.URLError: <urlopen error unknown url type: https>
This was done on Ubuntu 16.04 using Python 3.7. The Ubuntu-native Python defaults to 3.5 in /usr/bin; I had previously downloaded the source and built 3.7 into /usr/local/bin. The fact that 3.5 showed no error pointed to the 3.7 build not being linked against OpenSSL correctly, which is also evident below:
>>> import ssl
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.7/ssl.py", line 98, in <module>
import _ssl # if we can't import it, let the error propagate
ModuleNotFoundError: No module named '_ssl'
By consulting this link, I changed SSL=/usr/local/ssl to SSL=/usr in the 3.7 source dir's Modules/Setup.dist, copied it to Setup, and then rebuilt Python 3.7:
$ ./configure
$ make
$ make install
Now it is fixed:
>>> import ssl
>>> ssl.OPENSSL_VERSION
'OpenSSL 1.0.2g 1 Mar 2016'
>>> urlopen('https://www.google.com')
<http.client.HTTPResponse object at 0xb74c4ecc>
>>> urlopen('https://www.google.com').read()
b'<!doctype html>...
and 3.7 has been compiled with OpenSSL support successfully. Note that the Ubuntu command "openssl version" succeeding is not conclusive on its own; the real test is whether ssl imports inside Python.
