Proper Use Of Python 3.x AMFY Module

How am I supposed to use the Amfy module? I try to use it like the JSON module (amfy.loads or amfy.load), but it just gives me errors:
C:\Users\Other>"C:\Users\Other\Desktop\Python3.5.2\test amf.py"
Traceback (most recent call last):
  File "C:\Users\Other\Desktop\Python3.5.2\test amf.py", line 4, in <module>
    print(amfy.load(cn_rsp.text))
  File "C:\Users\Other\Desktop\Python3.5.2\lib\site-packages\amfy\__init__.py", line 9, in load
    return Loader().load(input, proto=proto)
  File "C:\Users\Other\Desktop\Python3.5.2\lib\site-packages\amfy\core.py", line 33, in load
    return self._read_item3(stream, context)
  File "C:\Users\Other\Desktop\Python3.5.2\lib\site-packages\amfy\core.py", line 52, in _read_item3
    marker = stream.read(1)[0]
AttributeError: 'str' object has no attribute 'read'
This is what I wrote:
import requests
import amfy
cn_rsp = requests.get("http://realm498.c10.castle.rykaiju.com/api/locales/en/get_serialized_new")
print(amfy.load(cn_rsp.text))

After tinkering around and googling some stuff, I found a fix.
New code:
import amfy, requests, json
url = "http://realm416.c9.castle.rykaiju.com/api/locales/en/get_serialized_static"
req = requests.get(url)
if req.status_code == 200:
    # Parse JSON responses directly; otherwise hand the raw bytes to amfy.
    ret = req.json() if "json" in req.headers.get("content-type", "") else amfy.loads(req.content)
else:
    ret = {"failed": req.reason}
with open("doa manifest.txt", 'w', encoding='utf-8') as dump:
    # json.dump writes to the file object; json.dumps only returns a string.
    json.dump(ret, dump)
The terminal throws a UnicodeEncodeError, but I was able to fix that by entering chcp 65001 and then set PYTHONIOENCODING=utf-8

The load method expects an input stream, but you are passing it a string. Convert your string into an in-memory buffer that supports the read method, like this:
import io
print(amfy.load(io.BytesIO(cn_rsp.text.encode())))
Unfortunately, deserialization fails when using this. Is there another URL where it would work, a test URL maybe?
  File "C:\Python34\lib\site-packages\amfy\core.py", line 146, in _read_vli
    byte = stream.read(1)[0]
IndexError: index out of range
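The following is a stdlib-only sketch of why the input type matters in this thread. It does not call amfy: first_marker is a hypothetical stand-in that copies the stream.read(1)[0] pattern visible in the tracebacks, and the three-byte payload and the latin-1 decoding are made up for illustration. It suggests that a str has no read method, that an exhausted byte stream produces the IndexError, and that decoding binary data to text and re-encoding it may not give the original bytes back, which could be one reason to prefer the raw response bytes over the decoded text.

```python
import io


def first_marker(stream):
    """Hypothetical stand-in for a binary parser: read one byte, return it as an int."""
    return stream.read(1)[0]


def failure_name(source):
    """Return the name of the exception first_marker raises, or None on success."""
    try:
        first_marker(source)
    except Exception as exc:
        return type(exc).__name__
    return None


raw_body = bytes([0x0A, 0x81, 0xFF])  # made-up binary payload (assumption)

# Decoding binary data to text and re-encoding it can change the bytes.
# latin-1 is only an illustrative choice of decoding here.
round_tripped = raw_body.decode("latin-1").encode("utf-8")

print(failure_name("plain string"))        # a str has no read()
print(failure_name(io.BytesIO(b"")))       # empty stream: read(1) returns b""
print(failure_name(io.BytesIO(raw_body)))  # bytes stream with data succeeds
print(round_tripped == raw_body)
```

If the binary payload is available as bytes (as with req.content in the asker's fix), wrapping those bytes in io.BytesIO avoids the text round trip entirely.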

Related

Having trouble modifying python 2 code to python 3

I've been trying to translate python 2.7 code to python 3. I believe everything above checkpoint 1 should be correct. But I'm getting an error I associated with the second half. I can always download the file I need straight from the link, but I'd like to know what's breaking here.
import urllib
from urllib.request import urlopen
import tarfile
import os
path = 'https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tar.gz'
url = urlopen(path)
#checkpoint 1
os.chdir('..')
tfile = tarfile.open(url, "r:gz")
tfile.extractall(".")
Error:
Traceback (most recent call last):
  File "startup.py", line 43, in <module>
    tfile = tarfile.open(url, "r:gz")
  File "/anaconda3/lib/python3.6/tarfile.py", line 1589, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/anaconda3/lib/python3.6/tarfile.py", line 1636, in gzopen
    fileobj = gzip.GzipFile(name, mode + "b", compresslevel, fileobj)
  File "/anaconda3/lib/python3.6/gzip.py", line 163, in __init__
    fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
TypeError: expected str, bytes or os.PathLike object, not HTTPResponse
When confronted with an error like this, closely look at the traceback, and read the documentation for the functions and objects involved.
urllib.request.urlopen returns an HTTPResponse object.
If you look at the error message, you see that tarfile.open expects a str, bytes or os.PathLike object for the parameter name.
However, tarfile.open also accepts a file object as its third argument, fileobj, and HTTPResponse implements the io.BufferedIOBase interface. The classes in io are basically the file objects that the open function returns.
So you should be able to do this:
tfile = tarfile.open(None, "r:gz", url)
or
tarfile.open(fileobj=url, mode="r:gz")
The latter could be considered more Pythonic ("explicit is better than implicit").
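An HTTP response can only be read forwards, so one option worth knowing is tarfile's stream mode, "r|gz", which is documented for non-seekable file objects. The sketch below is an illustration under assumptions, not the answerer's method: ForwardOnlyReader is a hypothetical stand-in for the HTTP response, and the archive is built in memory so the example runs without network access.

```python
import io
import tarfile


def build_sample_archive():
    """Create a small .tar.gz in memory holding one text file (illustrative data)."""
    payload = b"hello from the archive\n"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="greeting.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


class ForwardOnlyReader:
    """Hypothetical stand-in for an HTTP response: it offers read() but no seek()."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def read(self, size=-1):
        return self._inner.read(size)


def read_members(fileobj):
    """Read every regular file from a gzip-compressed tar stream, in order."""
    contents = {}
    # "r|gz" (with a pipe) processes the archive as a forward-only stream.
    with tarfile.open(fileobj=fileobj, mode="r|gz") as archive:
        for member in archive:
            if member.isfile():
                contents[member.name] = archive.extractfile(member).read()
    return contents


members = read_members(ForwardOnlyReader(build_sample_archive()))
print(members)
```

With a real download, the object returned by urlopen would take the place of ForwardOnlyReader; in stream mode each member has to be read before moving on to the next one.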
os.chdir('..')
tfile = tarfile.open("enron_mail_20150507.tar.gz", "r:gz")
Instead of doing the above two steps, can you pass the fully qualified file name as the parameter to tarfile.open? That would rule out the possibility that the path is incorrect.

Unable to take input from a text file in Python crawler

I have created a basic crawler in Python, and I want to take its input from a text file.
I used open/raw_input, but there was an error.
When I used the input("") function, it prompted for input and worked fine.
The problem occurs only when reading from a file.
import re
import urllib.request
url = open('input.txt', 'r')
data = urllib.request.urlopen(url).read()
data1 = data.decode("utf8")
print(data1)
file = open('output.txt', 'w')
file.write(data1)
file.close()
The error output is below.
Traceback (most recent call last):
  File "scrape.py", line 8, in <module>
    data = urllib.request.urlopen(url).read()
  File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.6/urllib/request.py", line 518, in open
    protocol = req.type
AttributeError: '_io.TextIOWrapper' object has no attribute 'type'
The open function returns a file object, not the content of the file as a string. If you want url to contain the content as a string, change the line to the following (strip() removes the trailing newline that a text file usually ends with):
url = open('input.txt', 'r').read().strip()
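A slightly more defensive way to read the input file is sketched below. It is only an illustration: read_urls is a hypothetical helper, the example.com and example.org addresses are placeholders, and a temporary file stands in for input.txt so the snippet runs on its own. The with block closes the file, and each line is stripped so newlines do not end up inside a URL.

```python
import tempfile
from pathlib import Path


def read_urls(path):
    """Return the non-empty lines of a text file with surrounding whitespace removed."""
    with open(path, "r", encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


# Self-contained demonstration with a temporary stand-in for input.txt.
with tempfile.TemporaryDirectory() as workdir:
    sample = Path(workdir) / "input.txt"
    sample.write_text(
        "http://example.com/\n\nhttp://example.org/page\n", encoding="utf-8"
    )
    urls = read_urls(sample)

print(urls)
```

Each string in the returned list could then be passed to urllib.request.urlopen one at a time.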

Editing PDF metadata fields with Python3 and pdfrw

I'm trying to edit the metadata Title field of PDFs, to include the ASCII equivalents when possible. I'm using Python3 and the module pdfrw.
How can I do string operations that replace the metadata fields?
My test code is here:
from pdfrw import PdfReader, PdfWriter, PdfString
import unicodedata
def edit_title_metadata(inpdf):
    trailer = PdfReader(inpdf)
    # this statement is breaking pdfrw
    trailer.Info.Title = unicode_normalize(trailer.Info.Title)
    # also have tried:
    #trailer.Info.Title = PdfString(unicode_normalize(trailer.Info.Title))
    PdfWriter("test.pdf", trailer=trailer).write()
    return
def unicode_normalize(s):
    return unicodedata.normalize('NFKD', s).encode('ascii', 'ignore')
if __name__ == "__main__":
    edit_title_metadata('Anadon-2011-Scientific Opinion on the safety e.pdf')
And the traceback is:
Traceback (most recent call last):
  File "get_metadata.py", line 68, in <module>
    main()
  File "get_metadata.py", line 54, in main
    edit_title_metadata(pdf)
  File "get_metadata.py", line 11, in edit_title_metadata
    trailer.Info.Title = PdfString(unicode_normalize(trailer.Info.Title))
  File "get_metadata.py", line 18, in unicode_normalize
    return unicodedata.normalize('NFKD', s).encode('ascii', 'ignore')
  File "/path_to_python/python3.7/site-packages/pdfrw/objects/pdfstring.py", line 550, in encode
    if isinstance(source, uni_type):
TypeError: isinstance() arg 2 must be a type or tuple of types
Notes:
This issue at GitHub may be related.
FWIW, I'm also getting the same error with Python 3.6.
I've shared the PDF (which has non-ASCII hyphens, Unicode char \u2010):
wget https://gist.github.com/philshem/71507d4e8ecfabad252fbdf4d9f8bdd2/raw/cce346ab39dd6ecb3a718ad3f92c9f546761e87b/Anadon-2011-Scientific%2520Opinion%2520on%2520the%2520safety%2520e.pdf
You have to use the .decode() method on the metadata fields:
trailer.Info.Title = unicode_normalize(trailer.Info.Title.decode())
And full working code:
from pdfrw import PdfReader, PdfWriter
import unicodedata
def edit_title_metadata(inpdf):
    trailer = PdfReader(inpdf)
    # decode() turns the pdfrw string object into a plain Python str first
    trailer.Info.Title = unicode_normalize(trailer.Info.Title.decode())
    PdfWriter("test.pdf", trailer=trailer).write()
    return
def unicode_normalize(s):
    return unicodedata.normalize('NFKD', s).encode('ascii', 'ignore')
if __name__ == "__main__":
    edit_title_metadata('Anadon-2011-Scientific Opinion on the safety e.pdf')
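One detail about the normalization step itself, shown with the standard library only: U+2010 (HYPHEN) has no decomposition under NFKD, so encode('ascii', 'ignore') drops it rather than turning it into an ASCII hyphen. The sketch below is one possible adjustment, not part of the answer above; to_ascii and DASH_MAP are hypothetical names, and the choice of which dash characters to map is an assumption. It also returns a str instead of bytes; how pdfrw expects the value to be wrapped is not covered here.

```python
import unicodedata

# Assumed mapping: dash-like characters that NFKD leaves untouched.
DASH_MAP = str.maketrans({"\u2010": "-", "\u2013": "-", "\u2014": "-"})


def to_ascii(text):
    """Return an ASCII approximation of text, keeping dashes as '-'."""
    replaced = text.translate(DASH_MAP)
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Accents become separate combining marks, which 'ignore' then drops.
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Without the mapping, the hyphen simply disappears.
dropped = unicodedata.normalize("NFKD", "safety\u2010e").encode("ascii", "ignore")

print(dropped)
print(to_ascii("safety\u2010e"))
print(to_ascii("caf\u00e9"))
```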

TypeError: Can't convert 'bytes' object to str implicitly for tweepy

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
ckey=''
csecret=''
atoken=''
asecret=''
class listener(StreamListener):
    def on_data(self,data):
        print(data)
        return True
    def on_error(self,status):
        print(status)
auth = OAuthHandler(ckey,csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(track="cricket")
This code filters the Twitter stream based on the track keyword, but I am getting the following traceback after running it. Can somebody please help?
Traceback (most recent call last):
  File "lab.py", line 23, in <module>
    twitterStream.filter(track="car".strip())
  File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 430, in filter
    self._start(async)
  File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 346, in _start
    self._run()
  File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 286, in _run
    raise exception
  File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 255, in _run
    self._read_loop(resp)
  File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 298, in _read_loop
    line = buf.read_line().strip()
  File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 171, in read_line
    self._buffer += self._stream.read(self._chunk_size)
TypeError: Can't convert 'bytes' object to str implicitly
I'm assuming you're using tweepy 3.4.0. The issue you've raised is 'open' on GitHub (https://github.com/tweepy/tweepy/issues/615).
Two workarounds:
1)
In streaming.py:
I changed line 161 to
self._buffer += self._stream.read(read_len).decode('UTF-8', 'ignore')
and line 171 to
self._buffer += self._stream.read(self._chunk_size).decode('UTF-8', 'ignore')
and then reinstalled via python3 setup.py install on my local copy of tweepy.
2)
Remove the tweepy 3.4.0 module and install 3.3.0 using the command: pip install -I tweepy==3.3.0
Hope that helps,
-A
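The underlying error is that Python 3 refuses to append bytes to a str buffer. The stdlib-only sketch below reproduces that with a made-up byte stream and compares two ways of decoding; it does not use tweepy, and all function names are hypothetical. One caveat it illustrates: decoding each chunk on its own with 'ignore' can silently drop a multi-byte character that happens to be split across two chunks, whereas an incremental decoder carries the partial bytes over to the next chunk.

```python
import codecs
import io


def append_raw(stream, chunk_size):
    """Reproduce the failure: adding bytes to a str buffer raises TypeError."""
    buffer = ""
    try:
        buffer += stream.read(chunk_size)
    except TypeError as exc:
        return type(exc).__name__
    return buffer


def decode_per_chunk(stream, chunk_size):
    """Decode every chunk independently, ignoring undecodable bytes."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return buffer
        buffer += chunk.decode("utf-8", "ignore")


def decode_incrementally(stream, chunk_size):
    """Decode with an incremental decoder that remembers partial characters."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return buffer + decoder.decode(b"", final=True)
        buffer += decoder.decode(chunk)


payload = "h\u00e9llo".encode("utf-8")  # the accented letter takes two bytes

print(append_raw(io.BytesIO(payload), 2))
print(decode_per_chunk(io.BytesIO(payload), 2))
print(decode_incrementally(io.BytesIO(payload), 2))
```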
You can't do twitterStream.filter(track="car".strip()). Why are you adding strip()? It serves no purpose there.
track must be a str before you open a connection to Twitter's Streaming API, and tweepy is preventing that connection because you're trying to add strip().
If for some reason you need it, you can do track_word='car'.strip() and then track=track_word, but even that is unnecessary because:
>>> print('car'.strip())
car
Also, the error you're getting does not match the code you have listed; the code in your question should work fine.

AttributeError: 'module' object has no attribute 'urlretrieve'

I am trying to write a program that will download mp3's off of a website then join them together but whenever I try to download the files I get this error:
Traceback (most recent call last):
  File "/home/tesla/PycharmProjects/OldSpice/Voicemail.py", line 214, in <module>
    main()
  File "/home/tesla/PycharmProjects/OldSpice/Voicemail.py", line 209, in main
    getMp3s()
  File "/home/tesla/PycharmProjects/OldSpice/Voicemail.py", line 134, in getMp3s
    raw_mp3.add = urllib.urlretrieve("http://www-scf.usc.edu/~chiso/oldspice/m-b1-hello.mp3")
AttributeError: 'module' object has no attribute 'urlretrieve'
The line that is causing this problem is
raw_mp3.add = urllib.urlretrieve("http://www-scf.usc.edu/~chiso/oldspice/m-b1-hello.mp3")
As you're using Python 3, urlretrieve is no longer available directly on the urllib module. urllib has been split into several submodules, and urlretrieve now lives in urllib.request.
This would be equivalent to urlretrieve:
import urllib.request
data = urllib.request.urlretrieve("http://...")
urlretrieve behaves the same way as it did in Python 2.x, so it'll work just fine.
Basically:
urlretrieve saves the file to a temporary file and returns a tuple (filename, headers)
urlopen returns a response object (an http.client.HTTPResponse for HTTP URLs) whose read method returns a bytestring containing the file contents
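The difference between the two calls can be sketched offline by pointing both at a file:// URL. This is an illustration under assumptions: fetch_both_ways is a hypothetical helper, and a temporary local file stands in for the remote mp3 so no network access is needed. An explicit destination path is passed to urlretrieve here instead of relying on a temporary file.

```python
import tempfile
from pathlib import Path
from urllib.request import urlopen, urlretrieve


def fetch_both_ways(url, destination):
    """Fetch url once with urlretrieve (to a file) and once with urlopen (to memory)."""
    # urlretrieve writes the data to destination and returns (filename, headers).
    saved_path, _headers = urlretrieve(url, destination)
    # urlopen returns a response object; read() gives the body as bytes.
    with urlopen(url) as response:
        body = response.read()
    return saved_path, body


# Self-contained demonstration: a local file plays the role of the remote resource.
with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "source.bin"
    source.write_bytes(b"sample payload")
    target = Path(workdir) / "copy.bin"
    saved_path, body = fetch_both_ways(source.as_uri(), str(target))
    copied = target.read_bytes()

print(saved_path, body, copied)
```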
A Python 2+3 compatible solution is:
import sys
if sys.version_info[0] >= 3:
    from urllib.request import urlretrieve
else:
    # Not Python 3 - today, it is most likely to be Python 2
    # But note that this might need an update when Python 4
    # might be around one day
    from urllib import urlretrieve
# Get file from URL like this:
urlretrieve("http://www-scf.usc.edu/~chiso/oldspice/m-b1-hello.mp3")
Suppose you have the following lines of code:
MyUrl = "http://www.google.com" #Your url goes here (include the scheme, e.g. http://)
urllib.urlretrieve(MyUrl)
If you are receiving the following error message:
AttributeError: module 'urllib' has no attribute 'urlretrieve'
then you should try the following code to fix the issue:
import urllib.request
MyUrl = "http://www.google.com" #Your url goes here (include the scheme, e.g. http://)
urllib.request.urlretrieve(MyUrl)
