How do I format this from str to byte-like object? - python-3.x

Ahoy, I'm having trouble decoding these filenames (they're encoded as base64). I know they need to be bytes-like objects, but I can't for the life of me make it so. Please help, much love.
for filename in os.listdir('./Files'):
    name, typeId = base64.b64decode(filename.replace('.png', '')).split('_!_')
Error:
name, typeId = base64.b64decode(filename.replace('.png', '')).split('_!_')
TypeError: a bytes-like object is required, not 'str'

TypeError: a bytes-like object is required, not 'str'
You're probably going to get this error from two places:
1. b64decode(filename.replace('.png', ''))
   As you've mentioned, b64decode expects a bytes-like object.
   But filename is a str and filename.replace will also return a str.
2. .split('_!_')
   Since b64decode will return bytes, you also have to pass a bytes-like object to split.
Try this:
import base64
import os

for fname in os.listdir('./Files'):
    fname_bytes = os.fsencode(fname.replace('.png', ''))
    dec = base64.b64decode(fname_bytes)
    parts = dec.split(b"_!_")
To solve 1., you can use fsencode as noted in the os.listdir docs:
Note: To encode str filenames to bytes, use fsencode().
To solve 2., you can prefix a "b" to the string to make it a byte literal:
Bytes literals are always prefixed with 'b' or 'B'; they produce an instance of the bytes type instead of the str type.
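To check the decode-and-split logic without touching the file system, a minimal sketch along the following lines may help. The helper name decode_filename and the sample name are made up for illustration, and the sketch assumes the decoded fields are UTF-8 text. It uses str.encode("ascii") rather than os.fsencode, which should give the same bytes here because base64 text is pure ASCII.

```python
import base64

def decode_filename(filename, suffix=".png", sep=b"_!_"):
    """Decode a base64-encoded file name and split it into text fields.

    Hypothetical helper: the suffix and separator defaults mirror the question.
    """
    # Strip the extension only if it is actually at the end of the name.
    stem = filename[:-len(suffix)] if filename.endswith(suffix) else filename
    # str -> bytes, then base64-decode to the original bytes.
    raw = base64.b64decode(stem.encode("ascii"))
    # Split with a bytes separator, then turn each field back into str.
    return [part.decode("utf-8") for part in raw.split(sep)]

# Build a sample name the way the files were presumably created (assumption).
sample = base64.b64encode(b"sword_!_42").decode("ascii") + ".png"
name, type_id = decode_filename(sample)
print(name, type_id)
```

Decoding each part at the end gives ordinary str values, so the rest of the program does not have to deal with bytes.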

Related

Using Protocol Buffer to Serialize Bytes Python3

I am trying to serialize a bytes object - which is an initialization vector for my program's encryption. But, the Google Protocol Buffer only accepts strings. It seems like the error starts with casting bytes to string. Am I using the correct method to do this? Thank you for any help or guidance!
Or also, can I make the Initialization Vector a string object for AES-CBC mode encryption?
Code
Cast the bytes to a string
string_iv = str(bytes_iv, 'utf-8')
Serialize the string using SerializeToString():
serialized_iv = IV.SerializeToString()
Use ParseFromString() to recover the string:
IV.ParseFromString(serialized_iv)
And finally, UTF-8 encode the string back to bytes:
bytes_iv = bytes(IV.string_iv, encoding='utf-8')
Error
string_iv = str(bytes_iv, 'utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9b in position 3: invalid start byte
If you must cast an arbitrary bytes object to str, these are your options:
Simply call str() on the object. This gives its repr form, i.e. something that could be parsed as a bytes literal, e.g. "b'abc\x00\xffabc'".
Decode with "latin1". This will always work, even though it technically makes no sense if the data isn't text encoded with Latin-1.
Use base64 or base85 encoding (the standard library has a base64 module which covers both).
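The base64 and Latin-1 options can be sketched as a round trip. This is only an illustration: os.urandom stands in for a real initialization vector, and the variable names follow the question. If the message definition can be changed, a protobuf field of type bytes would avoid the conversion altogether.

```python
import base64
import os

bytes_iv = os.urandom(16)  # stand-in for a real AES-CBC initialization vector

# bytes -> ASCII-only str, safe to store in a string field
string_iv = base64.b64encode(bytes_iv).decode("ascii")

# str -> the original bytes
recovered_iv = base64.b64decode(string_iv)

# Latin-1 maps every byte value 0..255 to one code point, so it also round-trips
latin1_iv = bytes_iv.decode("latin-1")
recovered_from_latin1 = latin1_iv.encode("latin-1")

print(recovered_iv == bytes_iv, recovered_from_latin1 == bytes_iv)
```

The base64 form is larger than the raw bytes but contains only printable ASCII, which tends to survive text-only channels better than the Latin-1 form.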

How to ignore/delete undefined characters on string decoded

I'm reading from a bus that delivers a bytes-like sequence of characters, and I need to decode it into a string. When I use the decode method, the output shows undefined characters that I need to delete or ignore.
Thanks, all, for the help.
I have already tried decode(encoding='utf-8', errors='ignore'), and also encoding='ascii', but I get the same result.
x = ser.read_until(b'\x03', None)
string = x.decode(encoding='utf-8', errors='ignore')
This is the actual result: xx423711B552000083x (x = undefined character)
And I expected to have: 423711B552000083
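One possible reading, which the post does not confirm, is that the undefined characters are control bytes such as STX (0x02) and ETX (0x03). Those are valid ASCII and UTF-8, so errors='ignore' leaves them in place; it only drops bytes that cannot be decoded at all. Under that assumption, a sketch like the following filters them out after decoding. The sample frame is hypothetical.

```python
# Hypothetical frame: two control bytes, the payload, then the ETX terminator.
x = b"\x02\x00423711B552000083\x03"

# Control bytes decode without error, so they are still present in the text.
text = x.decode("ascii", errors="ignore")

# Keep only printable characters; str.isprintable() is False for control characters.
cleaned = "".join(ch for ch in text if ch.isprintable())

print(cleaned)  # 423711B552000083
```

Printing repr(text) on the real data would show which byte values actually arrive, which is worth checking before settling on a filter.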

Python how to trim a bytes string

I want to trim a bytes string before an index found by locating $$$,
trimmed_bytes_stream = padded_bytes_stream[:padded_bytes_stream.index('$$$')]
but got an error:
TypeError: a bytes-like object is required, not 'str'
Are there equivalent methods on bytes objects to do that? Or do I have to convert the bytes to a string, use the string methods, and convert back to bytes after trimming?
Prefix your search item with b so that it is a bytes literal:
trimmed_bytes_stream = padded_bytes_stream[:padded_bytes_stream.index(b'$$$')]
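A self-contained sketch of the same idea, with made-up sample data. It also shows bytes.partition as an alternative, since index raises ValueError when the marker is missing.

```python
padded_bytes_stream = b"payload-bytes$$$\x00\x00\x00"  # made-up sample data

# index() raises ValueError if the marker is not present
trimmed_bytes_stream = padded_bytes_stream[:padded_bytes_stream.index(b"$$$")]

# partition() never raises; without a marker it returns the whole input as head
head, marker, _tail = padded_bytes_stream.partition(b"$$$")

print(trimmed_bytes_stream)  # b'payload-bytes'
print(head == trimmed_bytes_stream)  # True
```

Most str methods (find, index, split, partition, startswith, and so on) have bytes counterparts, as long as the arguments are bytes too.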

Why is passing bytes to class str constructor special?

The official Python 3 docs say this about passing bytes to the single-argument constructor for class str:
Passing a bytes object to str() without the encoding or errors
arguments falls under the first case of returning the informal string
representation (see also the -b command-line option to Python).
Ref: https://docs.python.org/3/library/stdtypes.html#str
informal string representation -> Huh?
Using the Python console (REPL), I see the following weirdness:
>>> ''
''
>>> b''
b''
>>> str()
''
>>> str('')
''
>>> str(b'')
"b''" # What the heck is this?
>>> str(b'abc')
"b'abc'"
>>> "x" + str(b'')
"xb''" # Woah.
(The question title can be improved -- I'm struggling to find a better one. Please help to clarify.)
The concept behind str seems to be that it returns a "nicely printable" string, usually in a human understandable form. The documentation actually uses the phrase "nicely printable":
If neither encoding nor errors is given, str(object) returns
object.__str__(), which is the “informal” or nicely printable string
representation of object. For string objects, this is the string
itself. If object does not have a __str__() method, then str() falls
back to returning repr(object).
With that in mind, note that str of a tuple or list produces string versions such as:
>>> str( (1, 2) )
'(1, 2)'
>>> str( [1, 3, 5] )
'[1, 3, 5]'
Python considers the above to be the "nicely printable" form for these objects. With that as background, the following seems a bit more reasonable:
>>> str(b'abc')
"b'abc'"
With no encoding provided, the bytes b'abc' are just bytes, not characters. Thus, str falls back to the "nicely printable" form, and the six-character string "b'abc'" is that form.
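The difference between the two forms of the constructor can be seen side by side in a short sketch (variable names are illustrative only):

```python
data = b"abc"

as_repr = str(data)               # no encoding: the repr-like form, quotes and prefix included
as_text = str(data, "utf-8")      # encoding given: the bytes are decoded to text
same_text = data.decode("utf-8")  # equivalent to the line above

print(as_repr)       # b'abc'
print(len(as_repr))  # 6
print(as_text)       # abc
```

So when the goal is the text itself rather than a printable description of the object, pass an encoding or call decode.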

Converting a string to and from Base 64 [duplicate]

I am trying to write two programs: one that converts a string to base64, and another that takes a base64-encoded string and converts it back to a string.
So far I can't get past the base64 encoding part, as I keep getting the error
TypeError: expected bytes, not str
My code looks like this so far:
def convertToBase64(stringToBeEncoded):
    import base64
    EncodedString = base64.b64encode(stringToBeEncoded)
    return EncodedString
A string is already 'decoded', so the str class has no 'decode' method. Hence:
AttributeError: type object 'str' has no attribute 'decode'
If you want to decode a byte array and turn it into a string call:
the_thing.decode(encoding)
If you want to encode a string (turn it into a byte array) call:
the_string.encode(encoding)
In terms of the base 64 stuff:
Using 'base64' as the value for encoding above yields the error:
LookupError: unknown encoding: base64
Open a console and type in the following:
import base64
help(base64)
You will see that base64 has two very handy functions, namely b64decode and b64encode. b64decode returns a bytes object and b64encode requires a bytes-like object.
To convert a string into its base64 representation you first need to convert it to bytes. I like utf-8 but use whatever encoding you need...
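import base64

def stringToBase64(s):
    # Encode the text to bytes first; the result is base64 as bytes, not str.
    return base64.b64encode(s.encode('utf-8'))

def base64ToString(b):
    # Decode the base64 data, then decode the resulting bytes as UTF-8 text.
    return base64.b64decode(b).decode('utf-8')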
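Since the question asks for string in and string out, a variant that also decodes the base64 bytes to str might look like the sketch below. The snake_case function names and the encoding parameter are illustrative choices, not part of the answer above.

```python
import base64

def string_to_base64(s, encoding="utf-8"):
    """Return the base64 form of s as a str rather than bytes."""
    # base64 output is pure ASCII, so decoding it as ASCII is always safe.
    return base64.b64encode(s.encode(encoding)).decode("ascii")

def base64_to_string(b64_text, encoding="utf-8"):
    """Return the original text from base64 given as str or bytes."""
    return base64.b64decode(b64_text).decode(encoding)

encoded = string_to_base64("hello")
print(encoded)                    # aGVsbG8=
print(base64_to_string(encoded))  # hello
```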
