Convert string of hex to hex value - python-3.x

How can I print ♠ in the terminal when I read the string u"\u2660" from data.txt?
data = "./data.txt"
with open(data, 'r') as source:
    for info in source:
        print(info)
u"\u2660" is what I get in the terminal

The string u"\u2660" is just plain text in the file. Python has to evaluate it as a string literal before it becomes a string containing the Unicode character. You can use eval to do that.
>>> a=r'u"\u2660"'
>>> print(a)
u"\u2660"
>>> b = eval(a)
>>> print(b)
♠
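Since eval executes any expression it is given, a more cautious variant for text read from a file is ast.literal_eval, which only accepts Python literals. The following is a minimal sketch, not part of the original answer; the helper name decode_literal is hypothetical, and it assumes each line of the file holds exactly one string literal.

```python
import ast

def decode_literal(text):
    # Hypothetical helper: evaluate text that holds a single Python literal.
    # strip() removes the trailing newline that a line read from a file carries.
    return ast.literal_eval(text.strip())

# Simulates one line of data.txt as described in the question.
line = r'u"\u2660"' + "\n"
print(decode_literal(line))  # ♠
```

If the text is not a valid literal, ast.literal_eval raises an exception instead of running code.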

Related

converting bytes to string gives b' prefix

I'm trying to convert old Python 2 code to Python 3, and I'm facing a problem with strings vs. bytes.
In the old code, this line was executed:
'0x' + binascii.hexlify(bytes_reg1)
In Python 2, binascii.hexlify(bytes_reg1) returned a string, but in Python 3 it returns bytes, so it cannot be concatenated to "0x":
TypeError: can only concatenate str (not "bytes") to str
I tried converting it to string:
'0x' + str(binascii.hexlify(bytes_reg1))
But what I get as a result is:
"0xb'23'"
And it should be:
"0x23"
How can I convert the bytes to just 23 instead of b'23' so when concatenating '0x' I get the correct string?
Try either of these:
'0x' + binascii.hexlify(bytes_reg1).decode("utf-8")
# or
'0x' + str(binascii.hexlify(bytes_reg1), encoding="utf-8")
Note: if you can provide a sample of bytes_reg1, it will be easier to suggest a solution.
Decode is the way forward, as @Satya says.
You can access the hex string in another way:
>>> import binascii
>>> import struct
>>>
>>> some_bytes = struct.pack(">H", 12345)
>>>
>>> h = binascii.hexlify(some_bytes)
>>> print(h)
b'3039'
>>>
>>> a = h.decode('ascii')
>>> print(a)
3039
>>>
>>> as_hex = hex(int(a, 16))
>>> print(as_hex)
0x3039
>>>
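As a possible shortcut, bytes objects in Python 3.5 and later have a hex() method that returns a str directly, so no decode step is needed. This is a sketch rather than part of the answers above; to_hex_string is a hypothetical name. It also shows that going through int() and hex() drops leading zeros, which may or may not be what you want.

```python
def to_hex_string(raw):
    # bytes.hex() returns a str, so it concatenates with "0x" directly.
    return "0x" + raw.hex()

print(to_hex_string(b"\x23"))      # 0x23
print(to_hex_string(b"\x00\x23"))  # 0x0023

# Round-tripping through int() loses the leading zero byte.
print(hex(int("0023", 16)))        # 0x23
```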

Special characters are printed only as a part of string, but not independently (python3)

I work with strings containing diacritics. When I print the string, it is printed correctly:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
s = "ˈtau̯rum"
print(s)
> ˈtau̯rum
However, when I iterate over the string and print each character independently, some of the characters are not printed:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
s = "ˈtau̯rum"
for c in s:
    print(c)
>
ˈ
t
a
u
r
u
m
As the comment suggested, the printing issue is most likely due to how your terminal displays Unicode characters. You can check that each character is what you expect by encoding it to UTF-8 bytes, or by using the ord() built-in. From the Python documentation for ord():
Given a string representing one Unicode character, return an integer representing the Unicode code point of that character. For example, ord('a') returns the integer 97 and ord('€') (Euro sign) returns 8364. This is the inverse of chr().
For example:
Python 3.7.1 (default, Oct 23 2018, 19:19:42)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.1.1 -- An enhanced Interactive Python. Type '?' for help.
In [1]: s = "ˈtau̯rum"
In [2]: print(s)
ˈtau̯rum
In [3]: for c in s:
   ...:     print(c, c.encode('utf-8'), ord(c))
   ...:
ˈ b'\xcb\x88' 712
t b't' 116
a b'a' 97
u b'u' 117
̯ b'\xcc\xaf' 815
r b'r' 114
u b'u' 117
m b'm' 109
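You can use this code:
import unicodedata
s = "ˈtau̯rum"
a = ""
for c in s:
    if unicodedata.combining(c):
        # Combining mark: attach it to the pending base character.
        a += c
    else:
        # New base character: print the previous cluster, if any.
        if a:
            print(a)
        a = c
else:
    # Loop finished: print the last cluster.
    print(a)
This way, each combining character is printed together with the base character it modifies. Instead of calling print(a), you can append a to a list to collect the code points that must be kept together.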
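The list-based variant mentioned above could look like the following sketch. The function name group_combining is hypothetical, and the approach is a simplification: it only attaches characters with a non-zero combining class to the preceding character and does not implement full Unicode grapheme cluster segmentation.

```python
import unicodedata

def group_combining(text):
    # Collect each base character together with the combining marks after it.
    clusters = []
    for ch in text:
        if unicodedata.combining(ch) and clusters:
            clusters[-1] += ch   # attach the mark to the previous cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters

# Same string as in the question, written with escapes:
# U+02C8 is the stress mark, U+032F the combining inverted breve below.
s = "\u02c8tau\u032frum"
for cluster in group_combining(s):
    print(cluster)
```

The string has eight code points but yields seven clusters, because U+032F stays attached to the preceding u.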

Python 3.6 - Splitting hex data

I am trying to read a binary file and get out a header which is in utf-8 format. However the rest of the file has byte values that go over decimal 127, so I cannot convert that to a string. I have to split the text until ; (or 0x3B) and I cannot get it to work.
import binascii

with open("test_qifs_single_frame.qifs", "rb") as file:
    data = file.read()
print(binascii.hexlify(data))
I cannot read it in as a string either, because Python tells me that it cannot decode byte 0x81 as UTF-8. I understand why: the byte falls outside the ASCII range. What can I do to solve this?
You can read the file byte by byte until you reach the stop character, then decode the data that you have read.
Create some sample data:
>>> from random import randint
>>> header = 'Heaðer;'.encode('utf-8')
>>> bs = b''.join(bytes.fromhex('{:0>2x}'.format(randint(0, 255))) for _ in range(56))
>>> with open('test_qifs_single_frame.qifs', 'wb') as f:
...     f.write(header + bs)
>>>
Read the header from the file:
>>> # Create a bytearray to hold the bytes that we read.
>>> ba = bytearray()
>>> import functools
>>> with open('test_qifs_single_frame.qifs', 'rb') as f:
...     breader = functools.partial(f.read, 1)
...     for b in iter(breader, b';'):
...         ba += b
...
>>> ba
bytearray(b'Hea\xc3\xb0er')
>>> ba.decode('utf-8')
'Heaðer'
If the iter builtin is passed a callable and a value, it will call the callable until it returns the value. In the code we use functools.partial to create a function that reads the file one byte at a time, then pass this to iter.
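If the whole file already fits in memory, as in the question's file.read() call, another option is to split the bytes at the first separator with bytes.partition and decode only the header. This is a sketch under that assumption; split_header is a hypothetical name and the sample bytes are made up. It also raises an error when no separator exists, whereas the byte-by-byte loop above would never terminate in that case, because f.read(1) keeps returning b'' at end of file.

```python
def split_header(data, sep=b';'):
    # partition() splits at the first occurrence of sep only,
    # so later 0x3B bytes in the binary payload are left alone.
    header, found, rest = data.partition(sep)
    if not found:
        raise ValueError('separator not found')
    return header.decode('utf-8'), rest

# Made-up sample: a UTF-8 header followed by bytes that are not valid UTF-8.
sample = 'Hea\u00f0er;'.encode('utf-8') + bytes([0x81, 0xff, 0x3b, 0x00])
text, payload = split_header(sample)
print(text)     # Heaðer
print(payload)  # b'\x81\xff;\x00'
```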

print recursive pattern without quotes in python

My code for the pattern:
def pattern(n):
    if n==1:
        return '1'
    else:
        return pattern(n-int(n/2))*2+str(n)
print(pattern(n))
I need:
>>> pattern(1)
1
>>> pattern(2)
112
>>> pattern(4)
1121124
>>> pattern(8)
112112411211248
But I get:
>>> pattern(1)
'1'
>>> pattern(2)
'112'
>>> pattern(4)
'1121124'
>>> pattern(8)
'112112411211248'
I have tried a lot, but nothing gets rid of those pesky quotes.
The quotes are from the REPL printing the representation of the result of the function call, which is a string. If you do not want the representation then just print the result explicitly instead.
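To illustrate the difference, here is a short sketch that reuses the function from the question: print shows the string itself, while repr shows what the REPL echoes for a bare expression.

```python
def pattern(n):
    # Same recursion as in the question.
    if n == 1:
        return '1'
    return pattern(n - int(n / 2)) * 2 + str(n)

result = pattern(4)
print(result)        # 1121124
print(repr(result))  # '1121124'
```

The returned value is the same string in both cases; only the way it is displayed differs.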

How to create a representable hex int by string concatenation?

Consider:
>>> a = '\xe3'
>>> a
'ã'
>>> a.encode('cp1252')
b'\xe3'
I would like to recreate the a variable when the user inputs the string e3:
>>> from_user = 'e3'
>>> a = '\x' + from_user
File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: end of string in escape sequence
>>> a = '\\x' + from_user
>>> a
'\\xe3'
>>> a.encode('cp1252')
b'\\xe3'
With the string from_user, how might I create the a variable such that I could use it just like I did in the first example?
This should give you an idea (in Python 3, chr replaces Python 2's unichr):
chr(int('e3', 16)).encode('cp1252')
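Applied to the from_user variable from the question, the idea might be sketched as follows. The helper name char_from_hex is hypothetical. If only the byte is needed, bytes.fromhex produces it without going through a string.

```python
def char_from_hex(hex_digits):
    # int(..., 16) parses the hex digits; chr() maps the code point to a str.
    return chr(int(hex_digits, 16))

from_user = 'e3'
a = char_from_hex(from_user)
print(a)                         # ã
print(a.encode('cp1252'))        # b'\xe3'

# Alternative when only the raw byte is wanted.
print(bytes.fromhex(from_user))  # b'\xe3'
```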
