Why are all my strings printed as string literals? [duplicate] - linux

How do I print a bytes string without the b' prefix in Python 3?
>>> print(b'hello')
b'hello'

Use decode:
>>> print(b'hello'.decode())
hello

If the bytes are already in the encoding your terminal expects, you can write them directly:
sys.stdout.buffer.write(data)
or
nwritten = os.write(sys.stdout.fileno(), data) # NOTE: it may write less than len(data) bytes
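Since os.write may write fewer bytes than requested, a caller that needs everything written has to loop. A minimal sketch, assuming a blocking file descriptor; the helper name write_all is made up for illustration:

```python
import os


def write_all(fd: int, data: bytes) -> int:
    """Write every byte of data to fd, retrying after partial writes."""
    view = memoryview(data)  # slicing a memoryview avoids copying the data
    total = 0
    while total < len(view):
        # os.write returns how many bytes were actually written
        total += os.write(fd, view[total:])
    return total
```

For example, write_all(sys.stdout.fileno(), b"hello\n") after importing sys. If you also use print(), call sys.stdout.flush() first so the output does not appear out of order.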

If the data is in a UTF-8 compatible format, you can convert the bytes to a string.
>>> print(str(b"hello", "utf-8"))
hello
Optionally, convert to hex first if the data is not UTF-8 compatible (e.g. data is raw bytes).
>>> from binascii import hexlify
>>> print(hexlify(b"\x13\x37"))
b'1337'
>>> print(str(hexlify(b"\x13\x37"), "utf-8"))
1337
>>> from codecs import encode # alternative
>>> print(str(encode(b"\x13\x37", "hex"), "utf-8"))
1337

According to the source for bytes.__repr__, the b'' is baked into the method.
One workaround is to manually slice off the b'' from the resulting repr():
>>> x = b'\x01\x02\x03\x04'
>>> print(repr(x))
b'\x01\x02\x03\x04'
>>> print(repr(x)[2:-1])
\x01\x02\x03\x04

To show or print:
<byte_object>.decode("utf-8")
To encode or save:
<str_object>.encode('utf-8')

I am a little late, but with Python 3.9.1 this worked for me and removed the b' prefix:
print(outputCode.decode())

It's simple: you just need to use base64.
(With it, you can encode the bytes inside a dictionary or list, and then serialize the result with json.dump / json.dumps.)
import base64
data = b"Hello world!" # Bytes
data = base64.b64encode(data).decode() # Returns a base64 string, which can be decoded without error.
print(data)
Some bytes cannot be decoded to text directly (image data, for example), so base64 re-encodes them as bytes that can always be decoded to a string. To retrieve the original bytes, use:
data = base64.b64decode(data.encode())
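The JSON remark above could look roughly like this. It is only a sketch: the record dictionary and its keys are hypothetical sample data, not something from the original answer.

```python
import base64
import json

# Hypothetical record containing raw bytes, which json.dumps cannot serialize
record = {"name": "pixel", "blob": b"\x89PNG\r\n\x1a\n"}

# Replace the bytes value with base64 text so the dict becomes JSON-safe
encoded = dict(record, blob=base64.b64encode(record["blob"]).decode("ascii"))
text = json.dumps(encoded)

# Reverse the steps to recover the original bytes
decoded = json.loads(text)
decoded["blob"] = base64.b64decode(decoded["blob"])
```

After the round trip, decoded compares equal to record.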

Use decode() instead of encode() for converting bytes to a string.
>>> import curses
>>> print(curses.version.decode())
2.2

Related

Python3 How to get raw bytes string without encode?

I want to get a string of the original bytes (assembly code) without converting it to another encoding. Since the bytes are shellcode, I do not need to encode them and want to write them out directly as raw bytes.
Put simply, I want to convert "b'\xb7\x00\x00\x00'" to "\xb7\x00\x00\x00" and get the string representation of the raw bytes.
For example:
>> byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
>> uc_str = str(byte_code)[2:-1]
>> print(byte_code, uc_str)
b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00' \xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00
Currently I have only two ugly methods:
>> uc_str = str(byte_code)[2:-1]
>> uc_str = "".join('\\x{:02x}'.format(c) for c in byte_code)
Raw bytes usage:
>> my_template = "const char byte_code[] = 'TPL'"
>> uc_str = str(byte_code)[2:-1]
>> my_code = my_template.replace("TPL", uc_str)
# then write my_code to xx.h
Is there any pythonic way to do this?
Your first method is broken, because any byte that can be represented as printable ASCII is shown as that character rather than as a \x escape. For example:
>>> str(b'\x00\x20\x41\x42\x43\x20\x00')[2:-1]
'\\x00 ABC \\x00'
The second method is actually okay. Since this feature appears to be missing from stdlib I've published all-escapes which provides it.
pip install all-escapes
Example usage:
>>> b"\xb7\x00\x00\x00".decode("all-escapes")
'\\xb7\\x00\\x00\\x00'
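If installing a package is not an option, the asker's second method can be wrapped in a small helper using only the standard library. A minimal sketch; the name escape_all is hypothetical, and the f-string needs Python 3.6 or newer:

```python
def escape_all(data: bytes) -> str:
    """Return data with every byte written as a \\xNN escape, printable or not."""
    # Iterating over bytes yields integers in range(256)
    return "".join(f"\\x{byte:02x}" for byte in data)


byte_code = b"\xb7\x00\x00\x00"
uc_str = escape_all(byte_code)
```

Unlike slicing str(byte_code), this also escapes printable bytes such as spaces and letters.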
I came across this trying to do something similar with some SNMP code.
byte_code = b'\xb7\x00\x00\x00\x05\x00\x00\x00\x95\x00\x00\x00\x00\x00\x00\x00'
text = byte_code.decode('raw_unicode_escape')
writer_func(text)
This worked for sending an SNMP hex string as an OctetString when there was no helper support for hex.
See also the standard-encodings and bytes.decode documentation, and the SNMP Set types for anyone looking into those.
The basic conversion between bytes and str is:
>>> b"abc".decode()
'abc'
>>>
or:
>>> sb = b"abc"
>>> s = sb.decode()
>>> s
'abc'
>>>
The inverse is:
>>> "abc".encode()
b'abc'
>>>
or:
>>> s="abc"
>>> sb=s.encode()
>>> sb
b'abc'
>>>
In your case, you should pass the errors argument:
>>> b"\xb7".decode(errors="replace")
'�'
>>>
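For comparison, here is how the standard error handlers treat an undecodable byte. This is a sketch with a made-up input; which handler fits depends on whether you need to get the original bytes back.

```python
data = b"caf\xb7"  # hypothetical input: a lone 0xb7 byte is not valid UTF-8

replaced = data.decode("utf-8", errors="replace")          # bad byte becomes U+FFFD
escaped = data.decode("utf-8", errors="backslashreplace")  # bad byte becomes the text \xb7
ignored = data.decode("utf-8", errors="ignore")            # bad byte is dropped
latin = data.decode("latin-1")                             # every byte maps to one code point

# Only the latin-1 result can be turned back into the exact original bytes
round_trip = latin.encode("latin-1")
```

The backslashreplace handler works for decoding from Python 3.5 onwards.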

Is there any way to get the direct hexadecimal value in bytes instead of getting string value?

In Python 3.5 I need to convert a string to an IPFIX-supported field value for a UDP packet. When I send the string bytes in a UDP packet, I am unable to recover the string data again; Wireshark reports "Malformed data".
I found that IPFIX supports only ASCII for strings, so I converted the ASCII value to hex and then to bytes. But when I convert hex 4B to a byte, I get the character K in the bytes object instead of my hex value.
I tried the following in the Python console. I need the exact byte I entered, but b'\x4B' is displayed as b'K'. I am using Python 3.5.
b'\x4B'
b'K'
Code: "K".encode("ascii")
Actual output: b'K'
Expected output: b'\x4B'
There are multiple ways to do this:
1. The hex method (python 3.5 and up)
>>> 'K'.encode('ascii').hex()
'4b' # type str
2. Using binascii
>>> import binascii
>>> binascii.hexlify('K'.encode('ascii'))
b'4b' # type bytes
3. Using str.format
>>> ''.join('{:02x}'.format(x) for x in 'K'.encode('ascii'))
'4b' # type str
4. Using format
>>> ''.join(format(x, '02x') for x in 'K'.encode('ascii'))
'4b' # type str
Note: the methods that use format are comparatively slow.
If you really care about the \x prefix, you will have to use format, e.g.:
>>> print(''.join('\\x{:02x}'.format(x) for x in 'K'.encode('ascii')))
\x4b
>>> print(''.join('\\x{:02x}'.format(x) for x in 'KK'.encode('ascii')))
\x4b\x4b
If you care about uppercase then you can use X instead of x, eg:
>>> ''.join('{:02X}'.format(x) for x in 'K'.encode('ascii'))
'4B'
>>> ''.join(format(x, '02X') for x in 'K'.encode('ascii'))
'4B'
Uppercase and with \x:
>>> print(''.join('\\x{:02X}'.format(x) for x in 'Hello'.encode('ascii')))
\x48\x65\x6C\x6C\x6F
If you want bytes instead of str then just encode the output to ascii again:
>>> print(''.join('\\x{:02X}'.format(x) for x in 'Hello'.encode('ascii')).encode('ascii'))
b'\\x48\\x65\\x6C\\x6C\\x6F'
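It may also help to see that b'\x4B' and b'K' are the same single byte. Only the display differs, so the bytes sent on the wire are identical either way. A small sketch using built-in bytes methods:

```python
a = b"\x4B"
b = "K".encode("ascii")

same = a == b         # both objects hold the single byte 0x4B
value = a[0]          # indexing bytes gives the integer 75, i.e. 0x4B
hex_text = a.hex()    # hexadecimal text, for display only
restored = bytes.fromhex(hex_text)  # back from hex text to the original byte
```

The hex forms above are therefore only needed when a human-readable dump is wanted, not to change what is transmitted.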

Incorporate Base64 encoded data in Python Web Service call

I am trying to make a web service call in Python 3. A subset of the request includes a base64 encoded string, which is coming from a list of Python dictionaries.
So I dump the list and encode the string:
j = json.dumps(dataDictList, indent=4, default = myconverter)
encodedData = base64.b64encode(j.encode('ASCII'))
Then, when I build my request, I add in that string. Because it comes back in bytes I need to change it to string:
...
\"data\": \"''' + str(encodedData) + '''\"
...
The response I'm getting from the web service is that my request is malformed. When I print out str(encodedData) I get:
b'WwogICAgewogICAgICAgICJEQVlfREFURSI6ICIyMDEyLTAzLTMxIDAwOjAwOjAwIiwKICAgICAgICAiQ0FMTF9DVFJfSUQiOiA1LAogICAgICAgICJUT1RfRE9MTEFSX1NBTEVTIjogMTk5MS4wLAogICAgICAgICJUT1RfVU5JVF9TQUxFUyI6IDQ0LjAsCiAgICAgICAgIlRPVF9DT1NUIjogMTYxOC4xMDM3MDAwMDAwMDA2LAogICAgICAgICJHUk9TU19ET0xMQVJfU0FMRVMiOiAxOTkxLjAKICAgIH0KXQ=='
If I copy this into a base64 decoder, I get gibberish until I remove the b' at the beginning as well as the last single quote. I think those are causing my request to fail. According to this note, though, I would think that the b' is ignored: What does the 'b' character do in front of a string literal?
I'll appreciate any advice.
Thank you.
Passing a bytes object to str formats it for display; it does not convert the bytes into a string (you need to know the encoding for that to work):
In [1]: x = b'hello'
In [2]: str(x)
Out[2]: "b'hello'"
Note that str(x) actually starts with b' and ends with '. If you want to decode the bytes into a string, use bytes.decode:
In [5]: x = base64.b64encode(b'hello')
In [6]: x
Out[6]: b'aGVsbG8='
In [7]: x.decode('ascii')
Out[7]: 'aGVsbG8='
You can safely decode the base64 bytes as ASCII. Also, your JSON should be encoded as UTF-8, not ASCII. The following changes should work:
j = json.dumps(dataDictList, indent=4, default=myconverter)
encodedData = base64.b64encode(j.encode('utf-8')).decode('ascii')
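One way to avoid quoting mistakes altogether is to build the request body with json.dumps instead of string concatenation. A sketch only: the rows below are sample data, and the "data" field name is taken from the fragment in the question, so the real request may need more fields.

```python
import base64
import json

# Sample rows standing in for dataDictList
rows = [{"DAY_DATE": "2012-03-31 00:00:00", "CALL_CTR_ID": 5}]

# JSON text -> UTF-8 bytes -> base64 bytes -> ASCII str
encoded = base64.b64encode(json.dumps(rows).encode("utf-8")).decode("ascii")

# Let json.dumps handle quoting and escaping of the outer request
body = json.dumps({"data": encoded})
```

Because encoded is a str, no b' prefix can leak into body.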

Python, base64, float

Do you have any idea how to encode and decode a float with base64 in Python?
I am trying to use
response='64.000000'
base64.b64decode(response)
The expected output is 'AAAAAAAALkA=', but I do not get any output for float numbers.
Thank you.
Base64 encoding is only defined for byte strings, so you have to convert your number into a sequence of bytes using struct.pack and then base64 encode that. The example you give looks like a base64 encoded little-endian double. So (for Python 2):
>>> import struct
>>> struct.pack('<d', 64.0).encode('base64')
'AAAAAAAAUEA=\n'
For the reverse direction you base64 decode and then unpack it:
>>> struct.unpack('<d', 'AAAAAAAALkA='.decode('base64'))
(15.0,)
So it looks like your example is 15.0 rather than 64.0.
For Python 3 you need to also use the base64 module:
>>> import struct
>>> import base64
>>> base64.encodebytes(struct.pack('<d', 64.0))
b'AAAAAAAAUEA=\n'
>>> struct.unpack('<d', base64.decodebytes(b'AAAAAAAALkA='))
(15.0,)
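The Python 3 steps can be wrapped in two helpers. A minimal sketch, assuming a little-endian double as above; the function names are hypothetical, and base64.b64encode is used here instead of encodebytes so that no trailing newline is added:

```python
import base64
import struct


def float_to_b64(value: float) -> str:
    """Pack value as a little-endian double and return it as base64 text."""
    return base64.b64encode(struct.pack("<d", value)).decode("ascii")


def b64_to_float(text: str) -> float:
    """Decode base64 text and unpack it as a little-endian double."""
    (value,) = struct.unpack("<d", base64.b64decode(text))
    return value
```

With these, float_to_b64(64.0) gives 'AAAAAAAAUEA=' and b64_to_float('AAAAAAAALkA=') gives 15.0, matching the results shown above.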

Decoding of a encoded base64 string

I have a base64-encoded string S="aGVsbG8=" and I want to decode it into ASCII, UTF-8, UTF-16, UTF-32, CP-1256, ISO-8859-1, ISO-8859-2, ISO-8859-6, ISO-8859-15 and Windows-1252. How can I decode the string into these formats? For UTF-16 I tried the following code, but it gave the error "'bytes' object has no attribute 'deocde'".
base64.b64decode(encodedBase64String).deocde('utf-8')
Please read the doc or docstring for the 3.x base64 module. The module works with bytes, not text, so your base64-encoded 'string' would be a byte string B = b"aGVsbG8=". The result of base64.decodebytes(B) is bytes: binary data in whatever encoding it has (text or image or ...). In this case it is b'hello', which can be viewed as ASCII-encoded text. To change to another encoding, first decode to Unicode text and then encode to bytes in the encoding you want. Most of the encodings you list will produce the same bytes.
>>> B=b"aGVsbG8="
>>> b = base64.decodebytes(B)
>>> b
b'hello'
>>> t = b.decode()
>>> t
'hello'
>>> t.encode('utf-8')
b'hello'
>>> t.encode('utf-16')
b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'
>>> t.encode('utf-32')
b'\xff\xfe\x00\x00h\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00'
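To cover every encoding named in the question, the same decode-then-encode step can be run in a loop. A sketch, assuming the codec names below are the Python spellings of the encodings the asker listed. Note also that the AttributeError in the question comes from the misspelt method name deocde; the method is decode.

```python
import base64

# Python codec names for the encodings listed in the question
CODECS = [
    "ascii", "utf-8", "utf-16", "utf-32", "cp1256",
    "iso-8859-1", "iso-8859-2", "iso-8859-6", "iso-8859-15", "cp1252",
]

# base64 -> bytes -> text (the payload here is plain ASCII)
text = base64.b64decode("aGVsbG8=").decode("ascii")

# text -> bytes in each target encoding
encoded = {name: text.encode(name) for name in CODECS}
```

For this ASCII-only text, every entry except the UTF-16 and UTF-32 ones is b'hello'.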
