Python 3.6 - Splitting hex data

I am trying to read a binary file and extract a header that is UTF-8 encoded. The rest of the file contains byte values above decimal 127, so I cannot convert the whole file to a string. I need to split off the text up to the first ; (0x3B), and I cannot get it to work.
import binascii
with open("test_qifs_single_frame.qifs", "rb") as file:
    data = file.read()
print(binascii.hexlify(data))
I cannot read it in as a string either, because Python reports that it cannot decode byte 0x81 as UTF-8. I understand why: it falls outside the ASCII range. What can I do to solve this?

You can read the file byte by byte until you reach the stop character, then decode the data that you have read.
Create some sample data
>>> from random import randint
>>> header = 'Heaðer;'.encode('utf-8')
>>> bs = b''.join(bytes.fromhex('{:0>2x}'.format(randint(0, 255))) for _ in range(56))
>>> with open('test_qifs_single_frame.qifs', 'wb') as f:
...     f.write(header + bs)
>>>
Read the header from the file
>>> # Create a bytearray to hold the bytes that we read.
>>> ba = bytearray()
>>> import functools
>>> with open('test_qifs_single_frame.qifs', 'rb') as f:
...     breader = functools.partial(f.read, 1)
...     for b in iter(breader, b';'):
...         ba += b
...
>>> ba
bytearray(b'Hea\xc3\xb0er')
>>> ba.decode('utf-8')
'Heaðer'
If the iter builtin is passed a callable and a value, it will call the callable until it returns the value. In the code we use functools.partial to create a function that reads the file one byte at a time, then pass this to iter.
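One caveat with the loop above: at end of file, f.read(1) keeps returning b'', so the loop never ends if the file contains no ';'. When the file is small enough to read in one go, a simpler alternative is to split the bytes at the first stop byte with bytes.partition. The sketch below assumes that; split_header and read_header are hypothetical helper names, and the sample bytes are made up. Splitting on 0x3B is safe for UTF-8 text because every byte of a multi-byte UTF-8 sequence is 0x80 or higher.

```python
def split_header(data, stop=b';', encoding='utf-8'):
    """Split raw bytes into (decoded header, remaining bytes) at the first stop byte."""
    header, sep, rest = data.partition(stop)
    if not sep:
        # partition returns an empty separator when the stop byte is absent
        raise ValueError('stop byte not found')
    return header.decode(encoding), rest


def read_header(path, stop=b';', encoding='utf-8'):
    """Read the whole file at path and return only its decoded header."""
    with open(path, 'rb') as f:
        return split_header(f.read(), stop, encoding)[0]


# Made-up sample: a UTF-8 header, the stop byte, then bytes that are not valid UTF-8.
sample = 'Heaðer;'.encode('utf-8') + bytes([0x81, 0xFF, 0x3B, 0x00])
header, rest = split_header(sample)
```

Here header is the decoded text 'Heaðer' and rest holds the untouched binary remainder, including the later 0x3B byte, because only the first occurrence is used as the split point.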

Related

Converts strings of binary to binary

I have a text file and I would like to read it in binary mode so I can transform its content into hexadecimal characters.
Then, I need to replace '20' with '0' and '80', 'e2', '8f' with '1'.
This would create a string of 0s and 1s (basically binary).
Finally, I need to convert this binary string into ASCII characters.
I'm almost finished, but I'm struggling with the last part:
import binascii
import sys
bin_file = 'TheMessage.txt'
with open(bin_file, 'rb') as file:
    file_content = file.read().hex()
file_content = file_content.replace('20', '0').replace('80', '1').replace('e2', '1').replace('8f', '1')
print(file_content)
text_bin = binascii.a2b_uu(file_content)
The last line produces an error (I do not fully understand strings/hex/binary interpretation in python):
Traceback (most recent call last):
  File "binary_to_string.py", line 34, in <module>
    text_bin = binascii.a2b_uu(file_content)
binascii.Error: Trailing garbage
Could you give me a hand?
I'm working on this file: blank_file
I think you're looking for something like this. See the comments for why I did what I did.
import binascii
import sys
bin_file = 'TheMessage.txt'
with open(bin_file, 'rb') as file:
    file_content = file.read().hex()
    file_content = file_content.replace('20', '0').replace('80', '1').replace('e2', '1').replace('8f', '1')
# First we must split the string into a list so we can get bytes more easily.
bin_list = []
for i in range(0, len(file_content), 8):  # 8 bits in a byte!
    bin_list.append(file_content[i:i+8])
message = ""
for binary_value in bin_list:
    binary_integer = int(binary_value, 2)  # Parse the base-2 string into an integer
    ascii_character = chr(binary_integer)  # Convert the integer to its character
    message += ascii_character
print(message)
One thing I noticed while working with this is that, using your solution and file, the result is 2620 bits, which is not divisible by 8, so it cannot be split cleanly into bytes.
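To make the leftover bits visible instead of silently decoding a short final chunk, the chunking step can report what does not fit. This is a minimal sketch; bits_to_text is a hypothetical helper name and the input string is made up. With 2620 bits, 2620 % 8 leaves 4 bits over.

```python
def bits_to_text(bits):
    """Decode a string of '0'/'1' characters, 8 bits per character.

    Returns (decoded text, leftover bits that did not fill a whole byte).
    """
    usable = len(bits) - len(bits) % 8  # length of the part made of whole bytes
    chars = [chr(int(bits[i:i + 8], 2)) for i in range(0, usable, 8)]
    return ''.join(chars), bits[usable:]


# Two full bytes ('H' and 'i') followed by three stray bits.
text, leftover = bits_to_text('0100100001101001' + '101')
print(text, leftover)
```

A non-empty leftover usually means the replacement step produced more or fewer bits than intended, so it is worth checking before trusting the decoded message.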

converting bytes to string gives b' prefix

I'm trying to convert some old Python 2 code to Python 3, and I'm facing a problem with strings vs. bytes.
In the old code, this line was executed:
'0x' + binascii.hexlify(bytes_reg1)
In Python 2, binascii.hexlify(bytes_reg1) returned a string, but in Python 3 it returns bytes, so it cannot be concatenated to "0x":
TypeError: can only concatenate str (not "bytes") to str
I tried converting it to string:
'0x' + str(binascii.hexlify(bytes_reg1))
But what I get as a result is:
"0xb'23'"
And it should be:
"0x23"
How can I convert the bytes to just 23 instead of b'23' so when concatenating '0x' I get the correct string?
Can you try this and let me know whether it works for you:
'0x' + binascii.hexlify(bytes_reg1).decode("utf-8")
# or
'0x' + str(binascii.hexlify(bytes_reg1), encoding="utf-8")
Note: if you can provide a sample of bytes_reg1, it will be easier to provide a solution.
Decode is the way forward, as @Satya says.
You can access the hex string in another way:
>>> import binascii
>>> import struct
>>>
>>> some_bytes = struct.pack(">H", 12345)
>>>
>>> h = binascii.hexlify(some_bytes)
>>> print(h)
b'3039'
>>>
>>> a = h.decode('ascii')
>>> print(a)
3039
>>>
>>> as_hex = hex(int(a, 16))
>>> print(as_hex)
0x3039
>>>
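For comparison, here is a small sketch of three ways to build the "0x..." string in Python 3. The value of bytes_reg1 is made up for illustration. Note that going through int drops leading zeros, while hexlify and bytes.hex keep two hex digits per byte.

```python
import binascii

bytes_reg1 = bytes([0x00, 0x23])  # hypothetical register contents

# hexlify returns bytes, so decode before concatenating with a str
via_hexlify = '0x' + binascii.hexlify(bytes_reg1).decode('ascii')

# bytes.hex() (Python 3.5+) returns a str directly
via_method = '0x' + bytes_reg1.hex()

# Converting through an integer normalizes the value and drops leading zeros
via_int = hex(int.from_bytes(bytes_reg1, 'big'))

print(via_hexlify, via_method, via_int)
```

Which form is appropriate depends on whether the string is meant to show the raw byte layout or the numeric value.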

I have a list of bytes and I want a list of strings

I want to read lines from a file opened from a URL, search for a specific string, and then split the line that contains it.
request=urllib.request.Request(url)
response=urllib.request.urlopen(request)
input_file=response.readlines()
for l in input_file:
    if "target" in l:
        dum, stat = l.split(":")
        stat = stat.strip()
I expect to get stat="StationX".
Instead I get
TypeError: a bytes-like object is required, not 'str'
because input_file is a list of bytes objects instead of a list of strings.
I don't know how to either bring input_file in as strings (I thought that's what readlines() vs. read() did?) or convert the list of bytes objects to a list of strings.
The urllib.request package has a small nuance, highlighted below. One might expect the return type of .read() to be a string, but it is actually raw bytes that you have to decode.
>>> import urllib.request
>>> req = urllib.request.Request("http://www.voidspace.org.uk")
>>> res = urllib.request.urlopen(req)
>>> raw_contents = res.read()
>>> type(raw_contents)
<class 'bytes'>
>>> page = raw_contents.decode()
>>> type(page)
<class 'str'>
Now in your case
request = urllib.request.Request(url)
response = urllib.request.urlopen(request)
raw_lines = response.readlines()
for raw_line in raw_lines:
    line = raw_line.decode()
    if "target" in line:
        dum, stat = line.split(":")
        stat = stat.strip()
Alternatively,
for line in map(lambda x: x.decode(), raw_lines):
    ...  # etc
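The same idea can be tried without a network connection. In the sketch below, raw_lines is a made-up stand-in for what response.readlines() returns, find_station is a hypothetical helper name, and UTF-8 is assumed as the encoding (the real response may declare a different one).

```python
def find_station(raw_lines, key="target", encoding="utf-8"):
    """Return the text after the first ':' on the first line containing key."""
    for raw_line in raw_lines:
        line = raw_line.decode(encoding)  # bytes -> str
        if key in line:
            # maxsplit=1 keeps any later ':' characters in the value
            _, stat = line.split(":", 1)
            return stat.strip()
    return None  # no line contained the key


# Stand-in for response.readlines(): a list of bytes objects.
raw_lines = [b"header\n", b"target: StationX\n", b"other: 1\n"]
print(find_station(raw_lines))
```

Passing maxsplit=1 to split avoids a "too many values to unpack" error when the value itself contains a colon.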

Convert string of hex to hex value

How can I print ♠ in the terminal when I read the string u"\u2660" from data.txt?
data = "./data.txt"
with open(data, 'r') as source:
    for info in source:
        print(info)
u"\u2660" is what I get in the terminal
The string u"\u2660" is just plain text in a txt file. It needs to be interpreted by the Python interpreter to become a string containing the Unicode character. You can use eval to do that.
>>> a=r'u"\u2660"'
>>> print(a)
u"\u2660"
>>> b = eval(a)
>>> print(b)
♠
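Because eval executes whatever expression the file contains, a more cautious variant for untrusted files is ast.literal_eval, which only accepts Python literals. This is a sketch that assumes each line holds exactly one string literal; the variable names are illustrative.

```python
import ast

raw = r'u"\u2660"' + "\n"  # what a line read from data.txt would look like

# literal_eval parses the text as a Python literal without running arbitrary code
text = ast.literal_eval(raw.strip())

print(len(text), hex(ord(text)))
```

After this, text is the one-character string for U+2660, and print(text) shows the spade symbol on a terminal that supports it.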

How to create a representable hex int by string concatenation?

Consider:
>>> a = '\xe3'
>>> a
'ã'
>>> a.encode('cp1252')
b'\xe3'
I would like to recreate the a variable when the user inputs the string e3:
>>> from_user = 'e3'
>>> a = '\x' + from_user
File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: end of string in escape sequence
>>> a = '\\x' + from_user
>>> a
'\\xe3'
>>> a.encode('cp1252')
b'\\xe3'
With the string from_user, how might I create the a variable such that I could use it just like I did in the first example?
This should give you an idea (in Python 3 the function is chr; unichr exists only in Python 2):
chr(int('e3', 16)).encode('cp1252')
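In Python 3, chr plays the role that unichr had in Python 2. Below is a minimal sketch of two ways to turn the user's hex digits into a character. They agree for e3, but they answer slightly different questions: chr treats the number as a Unicode code point, while bytes.fromhex followed by decode treats it as a cp1252 byte. The variable names are illustrative.

```python
from_user = 'e3'

# Interpret the digits as a Unicode code point
a = chr(int(from_user, 16))

# Interpret the digits as a raw byte, then decode that byte as cp1252
raw = bytes.fromhex(from_user)
b = raw.decode('cp1252')

print(a == b, a.encode('cp1252') == raw)
```

The two approaches differ for values such as 80: as a cp1252 byte it decodes to the euro sign (U+20AC), whereas chr(0x80) is a control character.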
