Python 3.6 ASCII inside hex bytes

I am receiving binary data, such as
data = b'\xaa\x44\x12\x1c\x2a'
When I try to parse each byte, what I actually see is
b'\xaaD\x12\x1c*'
Is there a reason why bytes 44 and 2a are converted from hex to ASCII?
Is there a way to prevent this conversion?
I have tried:
hex_str = data.hex()
print(hex_str)
# prints aa44121c2a
This more or less preserves the format, but it converts the data to a string, so I can only iterate over individual characters rather than over bytes.
Any suggestions?
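A minimal sketch of what is going on, assuming the payload from the question. The D and * only appear in the printed representation, because Python displays any byte that maps to a printable ASCII character as that character. The stored values are unchanged, and iterating over a bytes object yields integers. The variable names below are illustrative.

```python
data = b'\xaa\x44\x12\x1c\x2a'

# The repr shows printable ASCII bytes as characters, but the content is identical.
same = data == b'\xaaD\x12\x1c*'

# Iterating over bytes yields one int (0-255) per byte.
values = list(data)

# Format each byte as two hex digits for display.
hex_bytes = [f'{b:02x}' for b in data]

print(same)       # True
print(values)     # [170, 68, 18, 28, 42]
print(hex_bytes)  # ['aa', '44', '12', '1c', '2a']
```

So no conversion needs to be prevented: parse the integers directly and use hex formatting only when displaying them.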

Related

Converting bytes data to hex with special character '\'

Given example of bytes retrieved from packet capture:
b'\x18\x05'
how can I hexlify it properly, considering the special character '\'?
When I hexlify it with Python I get b'1805', but when I manually remove the special character '\' (b'x18x05') I get what I think is the proper value, b'783138783035'.
According to online hex encoders (for example https://www.hexator.com/), the result for b'\x18\x05' is 62275c7831385c78303527.
Thank you in advance
b'\x18\x05' is the two bytes 0x18 and 0x05. That's the "proper value". b'' is just the default display notation for a Python bytes object, and \xnn is an escape code representing a single byte by its hexadecimal value.
For display you can use the following (the sep argument of hex() requires Python 3.8 or later):
data = b'\x18\x15'
print(data)
print(data.hex())
print(data.hex(sep=' '))
Output:
b'\x18\x15'
1815
18 15
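The three different results in the question can be reproduced with a short sketch. It assumes the online encoder was given the literal's source text (the characters b, ', \, x and so on) rather than the two bytes themselves; the variable names are illustrative.

```python
import binascii

data = b'\x18\x05'

# Hex of the two actual bytes.
real_hex = binascii.hexlify(data)

# Hex of the six ASCII characters x, 1, 8, x, 0, 5 (no escapes left).
stripped_hex = binascii.hexlify(b'x18x05')

# Hex of the literal's source text, quotes and backslashes included.
literal_hex = repr(data).encode('ascii').hex()

print(real_hex)      # b'1805'
print(stripped_hex)  # b'783138783035'
print(literal_hex)   # 62275c7831385c78303527
```

Only the first result describes the captured packet data; the other two describe text that happens to look like it.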

Bytes Object is Comma Separated Decimal Values (Python 3)

A machine I interface with at my work returns its frequency as a bytes object like this:
b'192,232,206,0'
This little-endian (I think that's right, I'm not great at remembering which is which) bytes object is supposed to translate to the hex bytes \x00\xCE\xE8\xC0, which is 13560000 in decimal. I have found that Python has int.from_bytes(), which takes a bytes object and turns it into an integer, but when I apply that to my comma-separated bytes object, where each byte is written as a decimal value, I get an astronomically large number (3816634650710199623094969186609 to be exact). Can anyone help me out here?
It's a bit convoluted but you can convert the decimal numbers to hex strings and combine their byte values to assemble a number:
freq = b'192,232,206,0'
freq_bytes = b''
for decimal in freq.split(b','):
    # Zero-pad to two hex digits: bytes.fromhex() rejects an odd number of
    # digits, which hex() would produce for every value below 16, not only 0.
    hex_str = format(int(decimal), '02x')
    freq_bytes += bytes.fromhex(hex_str)
freq_int = int.from_bytes(freq_bytes, 'little')
print(freq_int)
gives 13560000
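A shorter variant that skips the hex strings entirely, as a sketch under the same assumption that every comma-separated field is one byte value (0 to 255) written in decimal. The bytes() constructor accepts an iterable of integers, so the fields can be converted directly.

```python
freq = b'192,232,206,0'

# Each comma-separated field is one byte value written in decimal.
freq_bytes = bytes(int(field) for field in freq.split(b','))

# The first field is the least significant byte, hence 'little'.
freq_int = int.from_bytes(freq_bytes, 'little')

print(freq_bytes.hex())  # c0e8ce00
print(freq_int)          # 13560000
```

bytes() raises ValueError if a field falls outside 0 to 255, which is a useful check on malformed responses.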

How to Turn string into bytes?

I'm using Python 3 and I've got a string that is displayed as bytes:
strategyName=\xe7\x99\xbe\xe5\xba\xa6
I need to turn it into readable Chinese characters by decoding it:
orig=b'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result=orig.decode('UTF-8')
print(result)
which prints the following, and this is what I want:
strategyName=百度
But if I store it in a str instead, it behaves differently:
str0='strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte=str0.encode('UTF-8')
result_str=result_byte.decode('UTF-8')
print(result_str)
strategyName=ç¾åº¦é£é©ç­ç¥
Please help me understand why this is happening and how I can fix it.
Thanks a lot
Your problem is using a str literal when you're trying to store the UTF-8 encoded bytes of your string. You should just use the bytes literal, but if that str form is necessary, the correct approach is to encode in latin-1 (which is a 1-1 converter for all ordinals below 256 to the matching byte value) to get the bytes with utf-8 encoded data, then decode as utf-8:
str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'
result_byte = str0.encode('latin-1') # Only changed line
result_str = result_byte.decode('UTF-8')
print(result_str)
Of course, the other approach could be to just type the Unicode escapes you wanted in the first place instead of byte level escapes that correspond to a UTF-8 encoding:
result_str = 'strategyName=\u767e\u5ea6'
No rigmarole needed.
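To see why the two encodings behave differently, a small sketch using the string from the question can compare the bytes each one produces. The variable names are illustrative.

```python
str0 = 'strategyName=\xe7\x99\xbe\xe5\xba\xa6'

# UTF-8 encodes each of the six non-ASCII characters as two bytes,
# so the original byte values are lost (double encoding).
double_encoded = str0.encode('utf-8')

# latin-1 maps each code point below 256 to the single byte with the same value.
raw_bytes = str0.encode('latin-1')

fixed = raw_bytes.decode('utf-8')

print(len(double_encoded))  # 25
print(len(raw_bytes))       # 19
print(fixed)                # strategyName=百度
```

The 13 ASCII characters are one byte each in both cases; the difference of six bytes comes entirely from the six escaped characters.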

Python bytes representation

I'm writing a hex viewer in Python for examining raw packet bytes, using the dpkt module.
I assumed that each byte would be shown as a hex value between 0x00 and 0xFF. However, I've noticed that Python's bytes representation looks different:
b'\x8a\n\x1e+\x1f\x84V\xf2\xca$\xb1'
I don't understand what these symbols mean. How can I translate them back to the original 1-byte values so they can be shown in a hex viewer?
\xhh denotes a single byte with the hex value hh, i.e. it is how Python 3 writes 0xhh inside a bytes literal.
See https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals
The b at the start of the string indicates that the literal is of bytes type rather than str. The above link also covers that. The \n is a newline character (0x0a). Bytes that correspond to printable ASCII characters are shown as those characters: + is 0x2b, V is 0x56 and $ is 0x24.
You can use bytearray to store and access the data. Here's an example using the byte string in your question.
example_bytes = b'\x8a\n\x1e+\x1f\x84V\xf2\xca$\xb1'
encoded_array = bytearray(example_bytes)
print(encoded_array)
# Output: bytearray(b'\x8a\n\x1e+\x1f\x84V\xf2\xca$\xb1')

# Print the value of \x8a, which is 138 in decimal.
print(encoded_array[0])
# Output: 138

# Show the value as hex.
print(hex(encoded_array[0]))
# Output: 0x8a
Hope this helps.
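For the hex viewer itself, the indexing shown above extends naturally to a dump routine. The following is only a sketch: hex_dump and its width parameter are hypothetical names, and the layout (offset, hex bytes, printable text) is one common convention rather than anything dpkt prescribes.

```python
def hex_dump(data, width=8):
    """Return lines of 'offset  hex bytes  printable text' for a bytes object."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = ' '.join(f'{b:02x}' for b in chunk)
        # Show printable ASCII (0x20-0x7e) as-is and everything else as a dot.
        text_part = ''.join(chr(b) if 0x20 <= b <= 0x7e else '.' for b in chunk)
        # Pad the hex column so the text column lines up on a short last row.
        lines.append(f'{offset:04x}  {hex_part:<{width * 3 - 1}}  {text_part}')
    return lines


example_bytes = b'\x8a\n\x1e+\x1f\x84V\xf2\xca$\xb1'
for line in hex_dump(example_bytes):
    print(line)
```

Every byte is always rendered as two hex digits here, regardless of whether Python's own repr would show it as an escape or as a character.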

Lua: Read hex values from binary

I'm trying to read hex values from a binary file. I don't have a problem with extracting a string and converting letters to hex values, but how can I do that with control characters and other non-printable characters? Is there a way of reading the string directly in hex values without the need of converting it?
Have a look over here:
As a last example, the following program makes a dump of a binary file. Again, the first program argument is the input file name; the output goes to the standard output. The program reads the file in chunks of 10 bytes. For each chunk, it writes the hexadecimal representation of each byte, and then it writes the chunk as text, changing control characters to dots.
local f = assert(io.open(arg[1], "rb"))
local block = 10
while true do
  local bytes = f:read(block)
  if not bytes then break end
  -- string.gfind was renamed to string.gmatch in Lua 5.1
  for b in string.gmatch(bytes, ".") do
    io.write(string.format("%02X ", string.byte(b)))
  end
  io.write(string.rep("   ", block - string.len(bytes) + 1))
  io.write(string.gsub(bytes, "%c", "."), "\n")
end
From your question it's not clear what exactly you aim to do, so I'll give 2 approaches.
Either you have a file full of hex values, and read it like this:
s='ABCDEF1234567890'
t={}
for val in s:lower():gmatch'(%x%x)' do
  -- convert each pair of hex digits to the byte it encodes
  t[#t+1]=string.char(tonumber(val, 16))
end
Or you have a binary file, and you convert it to hex values:
s='kl978331asdfjhvkasdf'
t={s:byte(1,-1)}
