Converting a string to and from Base 64 [duplicate] - python-3.x

This question already has answers here:
Why do I need 'b' to encode a string with Base64?
(5 answers)
Closed 6 years ago.
I am trying to write two programs one that converts a string to base64 and then another that takes a base64 encoded string and converts it back to a string.
so far i cant get past the base64 encoding part as i keep getting the error
TypeError: expected bytes, not str
my code looks like this so far
def convertToBase64(stringToBeEncoded):
import base64
EncodedString= base64.b64encode(stringToBeEncoded)
return(EncodedString)

A string is already 'decoded', thus the str class has no 'decode' function.Thus:
AttributeError: type object 'str' has no attribute 'decode'
If you want to decode a byte array and turn it into a string call:
the_thing.decode(encoding)
If you want to encode a string (turn it into a byte array) call:
the_string.encode(encoding)
In terms of the base 64 stuff:
Using 'base64' as the value for encoding above yields the error:
LookupError: unknown encoding: base64
Open a console and type in the following:
import base64
help(base64)
You will see that base64 has two very handy functions, namely b64decode and b64encode. b64 decode returns a byte array and b64encode requires a bytes array.
To convert a string into it's base64 representation you first need to convert it to bytes. I like utf-8 but use whatever encoding you need...
import base64
def stringToBase64(s):
return base64.b64encode(s.encode('utf-8'))
def base64ToString(b):
return base64.b64decode(b).decode('utf-8')

Related

Converting 16-digit hexadecimal string into double value in Python 3

I want to convert 16-digit hexadecimal numbers into doubles. I actually did the reverse of this before which worked fine:
import struct
import wrap
def double_to_hex(doublein):
return hex(struct.unpack('<Q', struct.pack('<d', doublein))[0])
for i in modified_list:
encoded_list.append(double_to_hex(i))
modified_list.clear()
encoded_msg = ''.join(encoded_list).replace('0x', '')
encoded_list.clear()
print_command('encode', encoded_message)
And now I want to sort of do the reverse. I tried this without success:
from textwrap import wrap
import struct
import binascii
MESSAGE = 'c030a85d52ae57eac0129263c4fffc34'
#Splitting up message into n 16-bit strings
MSGLIST = wrap(MESSAGE, 16)
doubles = []
print(MSGLIST)
for msg in MSGLIST:
doubles.append(struct.unpack('d', binascii.unhexlify(msg)))
print(doubles)
However, when I run this, I get crazy values, which are of course not what I put in:
[(-1.8561629252326087e+204,), (1.8922789420412524e-53,)]
Were your original numbers -16.657673995556173 and -4.642958715557189 ?
If so, then the problem is that your hex strings contain the big-endian (most-significant byte first) representation of the double, but the 'd' format string in your unpack call specifies conversion using your system's native format, which happens to be little-endian (least-significant byte first). The result is that unpack reads and processes the bytes of the unhexlify'ed string from the wrong end. Unsurprisingly, that will produce the wrong value.
To fix, do one of:
convert the hex string into little-endian format (reverse the bytes, so c030a85d52ae57ea becomes ea57ae525da830c0) before passing it to binascii.unhexlify, or
reverse the bytes produced by unhexlify (change binascii.unhexlify(msg) to binascii.unhexlify(msg)[::-1]) before you pass them to unpack, or
tell unpack to do the conversion using big-endian order (replace the format string 'd' with '>d')
I'd go with the last one, replacing the format string.

Using Protocol Buffer to Serialize Bytes Python3

I am trying to serialize a bytes object - which is an initialization vector for my program's encryption. But, the Google Protocol Buffer only accepts strings. It seems like the error starts with casting bytes to string. Am I using the correct method to do this? Thank you for any help or guidance!
Or also, can I make the Initialization Vector a string object for AES-CBC mode encryption?
Code
Cast the bytes to a string
string_iv = str(bytes_iv, 'utf-8')
Serialize the string using SerializeToString():
serialized_iv = IV.SerializeToString()
Use ParseToString() to recover the string:
IV.ParseFromString( serialized_iv )
And finally, UTF-8 encode the string back to bytes:
bytes_iv = bytes(IV.string_iv, encoding= 'utf-8')
Error
string_iv = str(bytes_iv, 'utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9b in position 3: invalid start byte
If you must cast an arbitrary bytes object to str, these are your option:
simply call str() on the object. It will turn it into repr form, ie. something that could be parsed as a bytes literal, eg. "b'abc\x00\xffabc'"
decode with "latin1". This will always work, even though it technically makes no sense if the data isn't text encoded with Latin-1.
use base64 or base85 encoding (the standard library has a base64 module wich covers both)

How to Turn string into bytes?

Using python3 and I've got a string which displayed as bytes
strategyName=\xe7\x99\xbe\xe5\xba\xa6
I need to change it into readable chinese letter through decode
orig=b'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result=orig.decode('UTF-8')
print()
which shows like this and it is what I want
strategyName=百度
But if I save it in another string,it works different
str0='strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte=str0.encode('UTF-8')
result_str=result_byte.decode('UTF-8')
print(result_str)
strategyName=ç¾åº¦é£é©ç­ç¥
Please help me about why this happening,and how can I fix it.
Thanks a lot
Your problem is using a str literal when you're trying to store the UTF-8 encoded bytes of your string. You should just use the bytes literal, but if that str form is necessary, the correct approach is to encode in latin-1 (which is a 1-1 converter for all ordinals below 256 to the matching byte value) to get the bytes with utf-8 encoded data, then decode as utf-8:
str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte = str0.encode('latin-1') # Only changed line
result_str = result_byte.decode('UTF-8')
print(result_str)
Of course, the other approach could be to just type the Unicode escapes you wanted in the first place instead of byte level escapes that correspond to a UTF-8 encoding:
result_str = 'strategyName=\u767e\u5ea6'
No rigmarole needed.

Why is this error appearing?

AttributeError: 'builtin_function_or_method' object has no attribute 'encode'
I'm trying to make a text to code converter as an example for an assignment and this is some code based off of some I found in my research,
import binascii
text = input('Message Input: ')
data = binascii.b2a_base64.encode(text)
text = binascii.a2b_base64.encode(data)
print (text), "<=>", repr(data)
data = binascii.b2a_uu(text)
text = binascii.a2b_uu(data)
print (text), "<=>", repr(data)
data = binascii.b2a_hqx(text)
text = binascii.a2b_hqx(data)
print (text), "<=>", repr(data)
can anyone help me get it working? it's supposed to take an input in and then convert it into hex and others and display those...
I am using Python 3.6 but I am also a little out of practice...
TL;DR:
data = binascii.b2a_base64(text.encode())
text = binascii.a2b_base64(data).decode()
print (text, "<=>", repr(data))
You've hit on a common problem in the Python3 - str object vs bytes object. The bytes object contains sequence of bytes. One byte can contain any number from 0 to 255. Usually those number are translated through the ASCII table into a characters like english letters. Usually in the Python you should use bytes for working with binary data.
On the other hand the str object contains sequence of code points. One code point usually represent one character printed on your screen when you call print. Internally it is sequence of bytes so the Chinese symbol 的 is internally saved as 3 bytes long sequence.
Now to the your problem. The function requires as input the bytes object but you've got a str object from the function input. To convert str into bytes you have to call str.encode() method on the str object.
data = binascii.b2a_base64(text.encode())
Your original call binascii.b2a_base64.encode(text) means call method encode of the object binascii.b2a_base64 with parameter text.
The function binascii.b2a_base64 returns bytes contains original input encoded with the base64 algorithms. Now to get back the original str from encoded data you have to call this:
# Take base64 encoded data and return it decoded as bytes object
decoded_data = binascii.a2b_base64(data)
# Convert bytes object into str
text = decoded_data.decode()
It can be written as one line
decoded_data = binascii.a2b_base64(data).decode()
WARNING: Your call of print is invalid for Python 3 (it will work only in the python console)

Decoding of a encoded base64 string

I have a base64 encoded string S="aGVsbG8=", now i want to decode the string into ASCII, UTF-8, UTF-16, UTF-32, CP-1256, ISO-8659-1, ISO-8659-2, ISO-8659-6, ISO-8659-15 and Windows-1252, How i can decode the string into the mentioned format. For UTF-16 I tried following code, but it was giving error "'bytes' object has no attribute 'deocde'".
base64.b64decode(encodedBase64String).deocde('utf-8')
Please read the doc or docstring for the 3.x base64 module. The module works with bytes, not text. So your base64 encoded 'string' would be a byte string B = b"aGVsbG8". The result of base64.decodebytes(B) is bytes; binary data with whatever encoding it has (text or image or ...). In this case, it is b'hello', which can be viewed as ascii-encoded text. To change to other encodings, first decode to unicode text and then encode to bytes in whatever other encoding you want. Most of the encodings you list above will have the same bytes.
>>> B=b"aGVsbG8="
>>> b = base64.decodebytes(B)
>>> b
b'hello'
>>> t = b.decode()
>>> t
'hello'
>>> t.encode('utf-8')
b'hello'
>>> t.encode('utf-16')
b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'
>>> t.encode('utf-32')
b'\xff\xfe\x00\x00h\x00\x00\x00e\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00'

Resources