undefined unicode in python3

I am trying to follow the Scrapy docs under Python 3,
using scrapy shell "any_website":
from scrapy.loader.processors import MapCompose, Join
MapCompose(unicode.strip)([u' I',u' am\n'])
I am getting this error:
Traceback (most recent call last):
File "/usr/lib/python3.6/code.py", line 91, in runcode
exec(code, self.locals)
File "<console>", line 1, in <module>
NameError: name 'unicode' is not defined
This also affects my Scrapy ItemLoader, where the same error occurs:
l = ItemLoader(item=PropertiesItem(), response=response)
l.add_xpath('title', '//*[@itemprop="name"][1]/text()', MapCompose(unicode.strip, unicode.title))
The example in the Scrapy docs is pretty straightforward, but I am getting this error. Is it because I use Python 3?

In Python 2.x:
item = unicode(item, 'utf-8')
In Python 3.x (where item is a bytes object):
item = str(item, 'utf-8')
Python 3 renamed the unicode type to str; the old str type has been replaced by bytes.
Replacing the occurrences of unicode with str will work, for example MapCompose(str.strip, str.title).
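A minimal sketch of that renaming, using only the standard library. Scrapy's MapCompose is not used here; apply_each is a hypothetical stand-in that runs each function on each value in turn, just to show that str.strip and str.title take the place of unicode.strip and unicode.title:

```python
# Python 3: text is str, raw data is bytes; there is no built-in named unicode.
raw = b" am\n"                 # bytes, e.g. as read from a file opened in binary mode
text = raw.decode("utf-8")     # bytes -> str, equivalent to str(raw, "utf-8")


def apply_each(functions, values):
    """Hypothetical stand-in for a processor chain: run every function on every value."""
    results = []
    for value in values:
        for function in functions:
            value = function(value)
        results.append(value)
    return results


cleaned = apply_each([str.strip, str.title], [" I", text])
print(cleaned)
```

The u'' prefix from the question is still accepted in Python 3 (3.3 and later), so only the unicode name itself needs to change.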

Related

How can I execute a Python module on Python 3 if it uses print without parentheses

I want to run the pybrain tests on Python 3, but I get this error:
Traceback (most recent call last):
File "runtests.py", line 107, in <module>
runner.run(make_test_suite())
File "runtests.py", line 72, in make_test_suite
test_package = __import__(test_package_path, fromlist=module_names)
File "B:\msys64\mingw64\bin\WinPython\Python373\lib\site-packages\pybrain\tests\__init__.py", line 1, in <module>
from helpers import gradientCheck, buildAppropriateDataset, xmlInvariance, \
File "B:\msys64\mingw64\bin\WinPython\Python373\Lib\site-packages\pybrain\tests\helpers.py", line 42
print 'Module has no parameters'
^
SyntaxError: Missing parentheses in call to 'print'. Did you mean print('Module has no parameters')?
I looked at helpers.py and found that the print calls have no parentheses (print used as a statement, as in Python 2, I think). How can I fix that? Can I import some module to
run code like this, for example six? I don't know what it does.

Having trouble modifying python 2 code to python 3

I've been trying to translate python 2.7 code to python 3. I believe everything above checkpoint 1 should be correct. But I'm getting an error I associated with the second half. I can always download the file I need straight from the link, but I'd like to know what's breaking here.
import urllib
from urllib.request import urlopen
import tarfile
import os
path = 'https://www.cs.cmu.edu/~./enron/enron_mail_20150507.tar.gz'
url = urlopen(path)
#checkpoint 1
os.chdir('..')
tfile = tarfile.open(url, "r:gz")
tfile.extractall(".")
Error:
Traceback (most recent call last):
File "startup.py", line 43, in <module>
tfile = tarfile.open(url, "r:gz")
File "/anaconda3/lib/python3.6/tarfile.py", line 1589, in open
return func(name, filemode, fileobj, **kwargs)
File "/anaconda3/lib/python3.6/tarfile.py", line 1636, in gzopen
fileobj = gzip.GzipFile(name, mode + "b", compresslevel, fileobj)
File "/anaconda3/lib/python3.6/gzip.py", line 163, in __init__
fileobj = self.myfileobj = builtins.open(filename, mode or 'rb')
TypeError: expected str, bytes or os.PathLike object, not HTTPResponse
When confronted with an error like this, closely look at the traceback, and read the documentation for the functions and objects involved.
urllib.request.urlopen returns an HTTPResponse object.
If you look at the error message, you see that tarfile.open expects a str, bytes or os.PathLike object for the parameter name.
However, tarfile.open supports using a file object as a third argument fileobj, and HTTPResponse implements the io.BufferedIOBase interface. The classes in io are basically the file objects that the open function returns.
So you should be able to do this:
tfile = tarfile.open(None, "r:gz", url)
or
tarfile.open(fileobj=url, mode="r:gz")
The latter could be considered more Pythonic ("explicit is better than implicit").
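A minimal sketch of the fileobj approach. To keep it runnable without network access, an in-memory io.BytesIO archive stands in for the HTTPResponse; build_sample_archive and list_members are hypothetical helper names, not part of tarfile:

```python
import io
import tarfile


def build_sample_archive():
    """Create a small .tar.gz in memory; it stands in for the downloaded response."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        payload = b"hello"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)  # rewind so the reader starts at the beginning
    return buf


def list_members(fileobj):
    """Open a gzip-compressed tar archive from a file object, not from a path."""
    with tarfile.open(fileobj=fileobj, mode="r:gz") as tar:
        return tar.getnames()


names = list_members(build_sample_archive())
print(names)
```

For a stream that cannot seek backwards, the tarfile documentation also describes the stream mode "r|gz", which may be the safer choice with a live HTTP response.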
os.chdir('..')
tfile = tarfile.open("enron_mail_20150507.tar.gz", "r:gz")
Instead of doing the above two steps, can you just pass the fully qualified file name as the parameter to tarfile.open? That would rule out the possibility that the path is incorrect.

dicttoxml throws AttributeError

I am trying to check how dicttoxml works, but I receive this error from within the dicttoxml module.
I am starting the program from Geany.
Can anybody help?
Thanks!
import dicttoxml
myDict = {'myKey':"theirValue"};
xml =dicttoxml.dicttoxml(myDict);
Output is:
martin@saturn:~/it/python/python_work$ /bin/sh /tmp/geany_run_script_Q2NH3Z.sh
0.32000000000000006
1.6666666666666667
['1', '6666666666666667']
Traceback (most recent call last):
File "dicttoxmlExmp.py", line 4, in <module>
xml =dicttoxml.dicttoxml(myDict);
File "/home/martin/.local/lib/python3.6/site-packages/dicttoxml.py", line 393, in dicttoxml
convert(obj, ids, attr_type, item_func, cdata, parent=custom_root),
File "/home/martin/.local/lib/python3.6/site-packages/dicttoxml.py", line 176, in convert
if isinstance(obj, numbers.Number) or type(obj) in (str, unicode):
AttributeError: module 'numbers' has no attribute 'Number'
You can simply run:
python yourscript
instead of running it as a shell script.

Why does importing regular expressions fail with a traceback error?

I have an assignment, "Extracting Data With Regular Expressions". For this I'm importing re, but the code is not working. What is my mistake?
I checked the code without the import and it works: lines 2-7 run fine. But I get a traceback error on "import re" in line 1.
import re
fname = input('Enter file: ')
if len(fname) < 1 : fname = "sample.txt"
hand = open(fname)
hd = hand.read()
for line in hand:
    line = line.rstrip()
    nm = re.findall('[0-9]+',line)
    print(nm)
C:\Users\Desktop\new>re.py
Enter file:
Traceback (most recent call last):
File "C:\Users\Desktop\new\re.py", line 1, in <module>
import re
File "C:\Users\Desktop\new\re.py", line 9, in <module>
nm = re.findall('[0-9]+',line)
AttributeError: module 're' has no attribute 'findall'
Because you have called your file re.py, the import will actually import this file instead of the built-in module for regular expressions.
Just rename your file to something different and it should work as expected.
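One way to confirm this kind of name clash is to ask Python which file it would load for a module name. A minimal sketch using the standard library (module_origin is a hypothetical helper name):

```python
import importlib.util


def module_origin(name):
    """Return the location Python would load for `name`, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


origin = module_origin("re")
print(origin)
```

If the printed path points into your own project directory rather than the Python installation, a local file is shadowing the standard library module.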

TypeError: Can't convert 'bytes' object to str implicitly for tweepy

from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
ckey=''
csecret=''
atoken=''
asecret=''
class listener(StreamListener):
    def on_data(self,data):
        print(data)
        return True
    def on_error(self,status):
        print(status)
auth = OAuthHandler(ckey,csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(track="cricket")
This code filters the Twitter stream based on the track keyword, but I get the following traceback when I run it. Can somebody please help?
Traceback (most recent call last):
File "lab.py", line 23, in <module>
twitterStream.filter(track="car".strip())
File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 430, in filter
self._start(async)
File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 346, in _start
self._run()
File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 286, in _run
raise exception
File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 255, in _run
self._read_loop(resp)
File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 298, in _read_loop
line = buf.read_line().strip()
File "C:\Python34\lib\site-packages\tweepy\streaming.py", line 171, in read_line
self._buffer += self._stream.read(self._chunk_size)
TypeError: Can't convert 'bytes' object to str implicitly
I'm assuming you're using tweepy 3.4.0. The issue you've raised is open on GitHub (https://github.com/tweepy/tweepy/issues/615).
Two workarounds:
1)
In streaming.py:
I changed line 161 to
self._buffer += self._stream.read(read_len).decode('UTF-8', 'ignore')
and line 171 to
self._buffer += self._stream.read(self._chunk_size).decode('UTF-8', 'ignore')
and then reinstalled via python3 setup.py install on my local copy of tweepy.
2)
Remove the tweepy 3.4.0 module and install 3.3.0 using the command: pip install -I tweepy==3.3.0
Hope that helps,
-A
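The first workaround relies on the fact that Python 3 will not concatenate str and bytes implicitly; the bytes must be decoded first. A minimal sketch of that behaviour, independent of tweepy (buffer and chunk are illustrative names, and the sample bytes are made up):

```python
buffer = ""                      # text buffer (str)
chunk = b"caf\xc3\xa9\r\n"       # bytes, as they would arrive from a network stream

try:
    buffer += chunk              # str + bytes is not allowed in Python 3
except TypeError as exc:
    print("TypeError:", exc)

# Decoding turns the bytes into str; "ignore" drops any undecodable bytes.
buffer += chunk.decode("utf-8", "ignore")
print(repr(buffer))
```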
You can't do twitterStream.filter(track="car".strip()). Why are you adding strip()? It serves no purpose there.
track must be a str before you open a connection to Twitter's Streaming API, and tweepy is preventing that connection because you're trying to add strip().
If for some reason you need it, you can do track_word='car'.strip() and then track=track_word, but even that is unnecessary because:
>>> print('car'.strip())
car
Also, the error you're getting does not match the code you have listed; the code in your question should work fine.
