subprocess.Popen is returning the output as bytes, as shown below:
b'Caption FreeSpace Size \r\r\nC: 807194624 63869808640 \r\r\nD: \r\r\nY: 216847310848 2748779065344 \r\r\n\r\r\n'
How can I remove all occurrences of \r\r\n, or convert this to a Python string or array?
my_str = "hello world"
Converting string to bytes
my_str_as_bytes = str.encode(my_str)
Converting bytes to string
my_decoded_str = my_str_as_bytes.decode()
You can print it by:
print(b'Caption FreeSpace ... \r\r\n\r\r\n'.decode())
Result:
Caption FreeSpace ...
You can remove the characters with translate like so
b'Caption FreeSpace ... \r\r\n\r\r\n'.translate(None, b'\r\n')
which results in
b'Caption FreeSpace Size C: 807194624 63869808640 D: Y: 216847310848 2748779065344 '
If you know the encoding of the returned data, you may want to use decode, which gives you a string for further processing.
For example, assuming it is encoded in UTF-8, you can call decode with its default encoding and then call split on the result to split on whitespace and get an array like this:
b'Caption FreeSpace ... \r\r\n\r\r\n'.translate(None, b'\r\n').decode().split()
Result
['Caption', 'FreeSpace', 'Size', 'C:', '807194624', '63869808640', 'D:', 'Y:', '216847310848', '2748779065344']
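If the row structure matters, note that a flat split() loses it: the D: row above has no FreeSpace or Size values. Here is a minimal sketch that keeps one record per row, assuming the output is ASCII/UTF-8 and the columns are separated by whitespace. The helper name parse_drive_table is made up for illustration.

```python
def parse_drive_table(raw, encoding="utf-8"):
    """Decode command output and return one dict per non-empty data row."""
    text = raw.decode(encoding)
    # splitlines() treats \r, \n and \r\n as line breaks, so the odd
    # \r\r\n sequence only yields extra empty lines, which are skipped here.
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, *records = rows
    # zip() stops at the shorter sequence, so short rows keep only the
    # columns they actually have.
    return [dict(zip(header, record)) for record in records]


raw = (b'Caption FreeSpace Size \r\r\nC: 807194624 63869808640 \r\r\n'
       b'D: \r\r\nY: 216847310848 2748779065344 \r\r\n\r\r\n')
drives = parse_drive_table(raw)
for drive in drives:
    print(drive)
```

This assumes missing values only occur at the end of a row; a row with a Size but no FreeSpace would be misaligned. Passing text=True (Python 3.7+) or universal_newlines=True to subprocess is another way to receive str instead of bytes in the first place.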
Related
I read some lines from a text file that was inside a zip file, and I have to convert them to a readable string.
For example, one line I get from the file looks like this:
byte_code = b"\x000\x002\x002\x008\x007\x00:\x00,\x00'\x001\x004\x00.\x001\x002\x00.\x002\x000\x001\x009\x00 \x002\x000\x00:\x002\x008\x00:\x002\x007\x00'\x00,\x00$\x000\x001\x00F\x00B\x00,\x00,\x00,\x00,\x00\r\x00\n"
If I decode and print it, I get a readable result (sorry, the output has some null bytes in it and I could not enter it here):
print(byte_code.decode('latin-1'))
I would like to get the readable result that print shows into a normal string variable, without null bytes in it.
This is the line I expect:
02287:,'14.12.2019 20:28:27',$01FB,,,,
But if I assign the decoded line to a string variable, I get this unreadable string:
mystr = byte_code.decode('latin-1')
mystr
Out[55]: "\x000\x002\x002\x008\x007\x00:\x00,\x00'\x001\x004\x00.\x001\x002\x00.\x002\x000\x001\x009\x00 \x002\x000\x00:\x002\x008\x00:\x002\x007\x00'\x00,\x00$\x000\x001\x00F\x00B\x00,\x00,\x00,\x00,\x00\r\x00\n"
Am I decoding the byte string with the correct encoding?
How do I get the correct readable string without null bytes?
This might not be the perfect answer, but it might be a quick solution until something better comes up.
byteLine = b"\xff\xfe0" + byte_code + b"\x00"
strLine = byteLine.decode('utf16')
The strLine value is then:
In [1] : strLine
Out[2] : "002287:,'14.12.2019 20:28:27',$01FB,,,,\r\n"
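The alternating \x00 bytes suggest the file is UTF-16 with the high byte first (big-endian) and no byte order mark. That is an inference from the sample, not something the question confirms. Under that assumption, decoding with 'utf-16-be' avoids padding the data by hand. The sample below is a shortened, illustrative version of the line.

```python
# Shortened sample in the same layout as the question's line:
# each character is preceded by a \x00 byte (UTF-16 big-endian, no BOM).
sample = b"\x000\x002\x002\x008\x007\x00:\x00,\x00\r\x00\n"

line = sample.decode("utf-16-be")
print(repr(line))

# Strip the trailing line ending if it is not wanted.
clean = line.rstrip("\r\n")
print(clean)
```

Note that the workaround above prepends b"\xff\xfe0", so its result starts with an extra 0 (002287 instead of the expected 02287).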
I'm using Python 3 and I've got a string which is displayed as bytes:
strategyName=\xe7\x99\xbe\xe5\xba\xa6
I need to turn it into readable Chinese characters by decoding it:
orig=b'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result=orig.decode('UTF-8')
print(result)
which shows the following, and that is what I want:
strategyName=百度
But if I store it in a str first, it behaves differently:
str0='strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte=str0.encode('UTF-8')
result_str=result_byte.decode('UTF-8')
print(result_str)
strategyName=ç¾åº¦é£é©çç¥
Please help me understand why this is happening and how I can fix it.
Thanks a lot.
Your problem is using a str literal when you're trying to store the UTF-8 encoded bytes of your string. You should just use the bytes literal, but if that str form is necessary, the correct approach is to encode in latin-1 (which is a 1-1 converter for all ordinals below 256 to the matching byte value) to get the bytes with utf-8 encoded data, then decode as utf-8:
str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte = str0.encode('latin-1') # Only changed line
result_str = result_byte.decode('UTF-8')
print(result_str)
Of course, the other approach could be to just type the Unicode escapes you wanted in the first place instead of byte level escapes that correspond to a UTF-8 encoding:
result_str = 'strategyName=\u767e\u5ea6'
No rigmarole needed.
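To see why the original round trip produced mojibake, it may help to compare what the two encodings do to the same str. This is only an illustrative sketch; the variable names as_utf8 and as_latin1 are not from the question.

```python
str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'

# Encoding the str as UTF-8 turns each code point in U+0080..U+00FF into
# two bytes, so the original byte values are not recovered.
as_utf8 = str0.encode('utf-8')

# Latin-1 maps code points 0..255 straight to the same byte values.
as_latin1 = str0.encode('latin-1')

print(as_utf8)
print(as_latin1)

# Only the Latin-1 bytes decode to the intended text.
result_str = as_latin1.decode('utf-8')
print(result_str == 'strategyName=\u767e\u5ea6')
```

Decoding as_utf8 as UTF-8 simply gives back str0, which is why the question's version printed the garbled characters.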
AttributeError: 'builtin_function_or_method' object has no attribute 'encode'
I'm trying to make a text-to-code converter as an example for an assignment, and this is some code based on code I found in my research:
import binascii
text = input('Message Input: ')
data = binascii.b2a_base64.encode(text)
text = binascii.a2b_base64.encode(data)
print (text), "<=>", repr(data)
data = binascii.b2a_uu(text)
text = binascii.a2b_uu(data)
print (text), "<=>", repr(data)
data = binascii.b2a_hqx(text)
text = binascii.a2b_hqx(data)
print (text), "<=>", repr(data)
Can anyone help me get it working? It's supposed to take an input, convert it into hex and other formats, and display those.
I am using Python 3.6, but I am also a little out of practice.
TL;DR:
data = binascii.b2a_base64(text.encode())
text = binascii.a2b_base64(data).decode()
print (text, "<=>", repr(data))
You've hit a common problem in Python 3: the str object versus the bytes object. A bytes object contains a sequence of bytes, and each byte holds a number from 0 to 255. Values below 128 are often displayed through the ASCII table as characters such as English letters. In Python you should generally use bytes when working with binary data.
On the other hand, a str object contains a sequence of code points. One code point usually represents one character printed on your screen when you call print. To store or transmit a str it has to be encoded into bytes; for example, the Chinese character 的 becomes a 3-byte sequence when encoded as UTF-8.
Now to your problem. The function requires a bytes object as input, but input() gives you a str object. To convert a str into bytes, call the encode() method on the str object.
data = binascii.b2a_base64(text.encode())
Your original call binascii.b2a_base64.encode(text) means: call the method encode of the object binascii.b2a_base64 with the parameter text. That object is a built-in function, which has no encode attribute, hence the AttributeError.
The function binascii.b2a_base64 returns a bytes object containing the original input encoded with the Base64 algorithm. To get the original str back from the encoded data, call this:
# Take base64 encoded data and return it decoded as bytes object
decoded_data = binascii.a2b_base64(data)
# Convert bytes object into str
text = decoded_data.decode()
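It can be written as one line:
text = binascii.a2b_base64(data).decode()
WARNING: Your print call does not do what you expect in Python 3. The statement print (text), "<=>", repr(data) prints only text and then builds a tuple that is thrown away (the interactive console echoes that tuple, which is why it may seem to work there). Pass everything as arguments instead, as in the TL;DR above.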
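Putting the pieces together, a Python 3 version of the converter might look like the sketch below. It uses a fixed string instead of input() so that it runs unattended, and the helper names encode_all and decode_all are made up for illustration. Hex is used in place of the hqx step because, as far as I know, the hqx functions were removed from binascii in Python 3.11.

```python
import binascii


def encode_all(text):
    """Return the text encoded as Base64 and as hex, both as bytes objects."""
    raw = text.encode("utf-8")  # str -> bytes
    return {
        "base64": binascii.b2a_base64(raw),
        "hex": binascii.hexlify(raw),
    }


def decode_all(encoded):
    """Reverse encode_all and return the recovered text for each format."""
    return {
        "base64": binascii.a2b_base64(encoded["base64"]).decode("utf-8"),
        "hex": binascii.unhexlify(encoded["hex"]).decode("utf-8"),
    }


encoded = encode_all("hi")
decoded = decode_all(encoded)
for name in encoded:
    print(decoded[name], "<=>", repr(encoded[name]))
```

Note that b2a_base64 appends a newline to its result by default, which shows up in the repr.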
Suppose there is a string:
String str="Hello";
How can I get the ASCII values of the string mentioned above?
Given your comment, it sounds like all you need is:
char[] chars = str.ToCharArray();
Array.Sort(chars);
A char value in .NET is actually a UTF-16 code unit, but for all ASCII characters, the UTF-16 code unit value is the same as the ASCII value anyway.
You can create a new string from the array like this:
string sortedText = new string(chars);
Console.WriteLine(sortedText);
As it happens, "Hello" is already in ascending ASCII order...
byte[] asciiBytes = Encoding.ASCII.GetBytes(str);
You now have an array containing the ASCII value of each character, one byte per character.
I'm using Python 3.3.2 and I want to convert a hex value to a string.
This is my code:
junk = "\x41" * 50 # A
eip = pack("<L", 0x0015FCC4)
buffer = junk + eip
I've tried using
>>> binascii.unhexlify("4142")
b'AB'
... but I want the output "AB", not "b'AB'". What can I do?
Edit:
buffer = junk + binascii.unhexlify(eip).decode('ascii')
binascii.Error: Non-hexadecimal digit found
The problem is I can't concatenate junk + eip.
Thank you.
The b prefix denotes that the value is an instance of the bytes class, i.e. a string of bytes. If you want to convert it into a string, use the decode method.
>>> type(binascii.unhexlify(b"4142"))
<class 'bytes'>
>>> binascii.unhexlify(b"4142").decode('ascii')
'AB'
This results in a str, which is a string of Unicode characters.
Edit:
If you want to work purely with binary data, don't decode; stick with the bytes type. In your edited example:
>>> from struct import pack
>>> #- junk = "\x41" * 50 # A
>>> junk = b"\x41" * 50 # A
>>> eip = pack("<L", 0x0015FCC4)
>>> buffer = junk + eip
>>> buffer
b'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\xc4\xfc\x15\x00'
Note the b in b"\x41", which marks the literal as a byte string (the standard string type in Python 2), i.e. literally a string of bytes rather than a string of Unicode characters. The two are completely different things.
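As a small sketch of both routes (text versus binary), the built-in bytes.fromhex() can stand in for binascii.unhexlify(), and the payload can stay as bytes throughout. One assumption to flag: bytes.hex() needs Python 3.5 or later, so it would not run on the 3.3.2 mentioned in the question.

```python
from struct import pack

# bytes.fromhex() is a built-in alternative to binascii.unhexlify().
raw = bytes.fromhex("4142")
as_text = raw.decode("ascii")  # bytes -> str, only when text is wanted
print(raw, as_text)

# Keep everything as bytes when building a binary payload.
junk = b"\x41" * 50
eip = pack("<L", 0x0015FCC4)  # 4-byte little-endian unsigned value
buffer = junk + eip
print(len(buffer), buffer[-4:].hex())
```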
That's just a literal representation. Don't worry about the b, as it's not actually part of the string itself.
See What does the 'b' character do in front of a string literal?