I have a Windows path stored in a variable called "a". When I print it or use it in my code, some special characters are added to the string.
>>> import re
>>> from pathlib import Path
>>>
>>>
>>> a = "E:\POC\testing\functionalities\logs\timer.logs"
>>> a
'E:\\POC\testing\x0cunctionalities\\logs\timer.logs'
>>>
>>> Path(a)
WindowsPath('E:/POC\testing\x0cunctionalities/logs\timer.logs')
>>> Path.absolute(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "c:\program files (x86)\python38-32\lib\pathlib.py", line 1159, in absolute
if self._closed:
AttributeError: 'str' object has no attribute '_closed'
>>>
>>> re.escape(a)
'E:\\\\POC\\\testing\\\x0cunctionalities\\\\logs\\\timer\\.logs'
>>>
>>> a.replace("\\", "/")
'E:/POC\testing\x0cunctionalities/logs\timer.logs'
>>> a.__repr__()
"'E:\\\\POC\\testing\\x0cunctionalities\\\\logs\\timer.logs'"
>>>
I can handle the other special characters, but \f is somehow changed to \x0c.
One solution is adding the r prefix to the string, but my path is stored in a variable. How can I achieve that? I'm using Python 3.8.5 on Windows 10.
>>> a = r"E:\POC\testing\functionalities\logs\timer.logs"
>>> a
'E:\\POC\\testing\\functionalities\\logs\\timer.logs'
>>>
>>>
>>> a = "E:\POC\testing\functionalities\logs\timer.logs"
>>> a = r"" + a
>>> a
'E:\\POC\testing\x0cunctionalities\\logs\timer.logs'
>>>
Use a raw string or escape the backslashes:
a = r"E:\POC\testing\functionalities\logs\timer.logs"
or
a = "E:\\POC\\testing\\functionalities\\logs\\timer.logs"
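As a quick check, here is a small sketch showing that the two spellings above produce the same string. The variable names raw, escaped and forward are only for illustration. Forward slashes are a third option that needs no escaping, and pathlib treats them as separators in Windows paths.

```python
from pathlib import PureWindowsPath

# Raw string: backslashes are kept literally.
raw = r"E:\POC\testing\functionalities\logs\timer.logs"
# Regular string: every backslash is doubled.
escaped = "E:\\POC\\testing\\functionalities\\logs\\timer.logs"
# Forward slashes need no escaping at all.
forward = "E:/POC/testing/functionalities/logs/timer.logs"

print(raw == escaped)
# PureWindowsPath parses both separators, so the two paths compare equal.
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```

PureWindowsPath is used here instead of Path so the comparison behaves the same on any operating system.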
Based on your comment under @user8086906's post, couldn't you just do
a.replace('\\', '\')
? I see you tried a.replace("\\", "/") above - could you explain what the desired behavior is? On my machine, the first snippet I posted works.
EDIT:
Thanks @Gopirengaraj C - I see what the issue is now. The problem is that \f is an escape sequence in Python string literals; more specifically, it produces a "form feed" character (\x0c). I think a good way to get around this would then be to avoid replace and do something like this instead:
a = r'{0}'.format(a)
Lmk if that works.
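One caveat worth checking, sketched below: escape sequences are resolved when Python reads the literal, so by the time the value is stored in a it already contains a form feed and tabs. Passing it through a raw format template appears to hand back exactly the same characters. A path that arrives at run time (from input(), a file, or an environment variable) never goes through literal parsing, so it should not need any fixing. In this sketch \\POC and \\logs are written with doubled backslashes only to avoid invalid-escape warnings; the resulting string is the same as in the question.

```python
# \t becomes a tab and \f becomes a form feed while the literal is parsed.
a = "E:\\POC\testing\functionalities\\logs\timer.logs"

# Formatting through a raw template copies the characters unchanged.
b = r"{0}".format(a)

print(b == a)        # the formatted string is identical to the input
print("\x0c" in b)   # the form feed character is still present
print("\t" in b)     # and so are the tabs
```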
I use the code below to read a pickle file made in Python 2:
import pickle
with open('data.pkl', 'rb') as fin:
    data_df = pickle.load(fin, encoding='latin1')
Everything works well except the column containing Japanese characters.
For example, a string that is supposed to be "東京都" may become something like "æ±äº¬é".
I think Python 3 reads the byte string as str. How can I convert it back?
Here is a test I did in Python 3:
>>> a='\xe6\x9d\xb1\xe4\xba\xac\xe9\x83\xbd'
>>> b=b'\xe6\x9d\xb1\xe4\xba\xac\xe9\x83\xbd'
>>> a
'æ\x9d±äº¬é\x83½'
>>> b
b'\xe6\x9d\xb1\xe4\xba\xac\xe9\x83\xbd'
>>> print(a)
æ±äº¬é
>>> print(b)
b'\xe6\x9d\xb1\xe4\xba\xac\xe9\x83\xbd'
>>> a.decode('utf-8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'decode'
>>> b.decode('utf-8')
'東京都'
I think pickle.load reads the UTF-8 bytes as str (like the a case above).
[EDIT]
The reason I set the pickle.load encoding to latin1 is that there is a column in datetime format. It causes an error if I set encoding='utf-8'.
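If the strings really are UTF-8 bytes that were decoded as Latin-1, one way to undo that, as a sketch, is to encode them back to Latin-1 (which maps code points 0-255 one-to-one onto bytes) and then decode as UTF-8. The helper name fix_mojibake is hypothetical, and whether it suits every value in the column is an assumption about the data.

```python
def fix_mojibake(text):
    """Re-interpret a str that was decoded as Latin-1 as UTF-8 instead."""
    try:
        return text.encode('latin1').decode('utf-8')
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not Latin-1-decoded UTF-8 after all; leave the value unchanged.
        return text

a = '\xe6\x9d\xb1\xe4\xba\xac\xe9\x83\xbd'
print(fix_mojibake(a))
```

Values that are already correct text, or plain ASCII, pass through unchanged, so the helper can be applied to a whole column without special-casing.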
I have the following code and would like to make it compatible with both Python 2.7 and Python 3.6:
from re import sub, findall
return sub(r' ', ' ', sub(r'(\s){2,}', ' ', sub(r'[^a-z|\s|,]|_|(x)\1{1,}', '', x.lower())))
I received the following error:
TypeError: cannot use a string pattern on a bytes-like object
I understand that Python 3 distinguishes bytes from strings (Unicode), but I'm not sure how to proceed.
Thanks.
I tried the following, but it does not work:
return sub(rb' ', b' ', sub(rb'(\s){2,}', b' ',sub(rb'[^a-z|\s|,]|_|(x)\1{1,}', b'', x.lower())))
Have you tried using re.findall? For instance:
import re
respdata = b''  # placeholder: the data you are reading
content = re.findall(r'#findall from and too#', str(respdata))  # search the data as a string
for contents in content:
    print(contents)  # print results
The "string" you have must be a series of bytes, which you can convert to a real string using x.decode('utf-8'). You can see the problem with a simple example:
>>> import re
>>> s = bytes('hello', 'utf-8')
>>> s
b'hello'
>>> re.search(r'[he]', s)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/re.py", line 183, in search
return _compile(pattern, flags).search(string)
TypeError: cannot use a string pattern on a bytes-like object
>>> s.decode('utf-8')
'hello'
>>> re.search(r'[he]', s.decode('utf-8'))
<re.Match object; span=(0, 1), match='h'>
I'm assuming your bytes represent UTF-8 data, but if you're working with a different encoding then just pass its name to decode() instead.
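For code that has to run on both Python 2.7 and 3.6, one possible approach is to decode bytes once at the boundary and keep the patterns as text. This is a sketch that assumes UTF-8 input; the function name normalize and the simplified patterns are illustrative, not the ones from the question.

```python
import re

def normalize(x, encoding='utf-8'):
    """Lowercase x, drop everything except a-z, whitespace and commas,
    then collapse runs of whitespace into a single space."""
    if isinstance(x, bytes):
        # On Python 2, bytes is str, so plain strings are decoded here too.
        x = x.decode(encoding)
    x = re.sub(r'[^a-z\s,]', '', x.lower())
    return re.sub(r'\s{2,}', ' ', x)

print(normalize(b'Hello,   World_42'))
```

Decoding first means the same text patterns work for both bytes and str input, so there is no need for separate rb'...' versions.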
I'm reading the book "How to Think Like a Computer Scientist: Learning with Python". I usually have no difficulty translating its examples from Python 2 to Python 3, but in chapter 11, Files & Exceptions, I encountered this snippet:
>>> import pickle
>>> f = open("test.pck", "w")
>>> pickle.dump(12.3, f)
>>> pickle.dump([1,2,3], f)
>>> f.close()
which, when I run it with Python 3.5.2, gives this error:
Traceback (most recent call last):
  File "/(myDirs)/files.py", line 3, in <module>
    pickle.dump(3.14, f)
TypeError: write() argument must be str, not bytes
I am not a good docs reader, so I would be grateful if you could help me solve this riddle.
You need to open the file in binary mode.
In line 2:
f = open("test.pck", "wb")
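A fuller sketch of the same fix, including reading the values back: pickle writes bytes, so the file is opened with "wb" for dumping and "rb" for loading. The temporary directory is only there so the example does not leave a file behind in the working directory.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.pck")

# Binary write mode, because pickle.dump produces bytes.
with open(path, "wb") as f:
    pickle.dump(12.3, f)
    pickle.dump([1, 2, 3], f)

# Binary read mode; objects come back in the order they were dumped.
with open(path, "rb") as f:
    first = pickle.load(f)
    second = pickle.load(f)

print(first, second)
```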
Hi guys, I'm using Python 3 and installed the googlefinance module (https://pypi.python.org/pypi/googlefinance). The example says this works:
>>> from googlefinance import getQuotes
>>> import json
>>> print json.dumps(getQuotes('AAPL'), indent=2)
but when I type this code in my terminal using python3, I get:
>>> from googlefinance import getQuotes
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/googlefinance/__init__.py", line 55
print "url: ", url
^
SyntaxError: Missing parentheses in call to 'print'
So what's the problem? Please help me.
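In Python 3, print is a function, so its arguments must be in parentheses; that is why you get a syntax error. Note that the traceback points at line 55 of the package's own __init__.py, so the Python 2 print statement is inside googlefinance itself, not in your code. The correct Python 3 syntax for that line is print("url: ", url).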
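For reference, here is a sketch of the example from the question written with the print function. Because the installed package may not import under Python 3, quotes is a hypothetical stand-in for whatever getQuotes('AAPL') would return, and its key is made up.

```python
import json

# Hypothetical stand-in for the result of getQuotes('AAPL').
quotes = [{"symbol": "AAPL"}]

# Python 2 statement form:  print json.dumps(quotes, indent=2)
# Python 3 function form:
text = json.dumps(quotes, indent=2)
print(text)
```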
>>> help(bytearray.count)
Help on method_descriptor:
count(...)
B.count(sub[, start[, end]]) -> int
Return the number of non-overlapping occurrences of subsection sub in
bytes B[start:end]. Optional arguments start and end are interpreted
as in slice notation.
>>> b = bytearray(b'abcd')
>>> b
bytearray(b'abcd')
>>> b.count('a')
Traceback (most recent call last):
File "<console>", line 1, in <module>
TypeError: Type str doesn't support the buffer API
Question> How do I use count with a bytearray?
You need to pass a bytes-like object (bytes or another bytearray) to b.count:
>>> b.count(b'a')
You can search for bytes, not Unicode strings:
>>> b.count(b'a')
1
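A short sketch of the ways to pass the search value: a bytes literal, a str encoded to bytes, or (on Python 3.3 and later) an integer byte value. The variable names are only for illustration.

```python
b = bytearray(b'abcd')

# A bytes literal works directly.
n_bytes = b.count(b'a')

# A str must be encoded to bytes first.
n_encoded = b.count('a'.encode('ascii'))

# Python 3.3+ also accepts an integer in the range 0 to 255.
n_int = b.count(ord('a'))

print(n_bytes, n_encoded, n_int)
```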