I want to create a new binary file with Python in the following format:
< Part1: 8 bytes > < Part2: 4 bytes > < Part3: 16 bytes >
I will write a value to each part, and if the value is shorter than the part, the rest of that part should be filled with zeros.
I am looking for the best and most efficient way to do this.
I read online that I can do something like this:
f = open('file', 'w+b')
res = struct.pack(">l", 0000)
f.write(res)
but I am not sure whether this approach lets me reserve the space for each part in advance.
Let's start with some terminology when working with Python data before getting to your code to write a binary file.
Note: The experiments below are using the Python REPL
An integer in Python can be written as a denary/decimal number (e.g. 1):
>>> type(1)
<class 'int'>
It can also be written in hex by adding a leading 0x:
>>> 0x1
1
>>> type(0x1)
<class 'int'>
Leading zeros on a hex integer have no effect, while in denary they give an error:
>>> x = 0x0001
>>> print(x)
1
>>> x = 0001
    x = 0001
        ^^^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
When writing to a binary file it is bytes that need to get written. If the content is an integer it can be converted to bytes with either the int.to_bytes functionality or the struct library functionality.
To convert 1 to bytes using int.to_bytes:
>>> int(1).to_bytes(length=1, byteorder='little')
b'\x01'
With a length of 1 the byte order (endianness) is not important. For numbers stored in more bytes it is important.
>>> int(1).to_bytes(length=4, byteorder='little')
b'\x01\x00\x00\x00'
>>> int(1).to_bytes(length=4, byteorder='big')
b'\x00\x00\x00\x01'
The same result can be achieved with the struct library:
>>> import struct
>>> struct.pack('<l', 1)
b'\x01\x00\x00\x00'
>>> struct.pack('>l', 1)
b'\x00\x00\x00\x01'
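struct can also lay out several fixed-width fields in a single call, which maps onto the 8/4/16-byte layout in the question. The following is a minimal sketch, assuming Part1 and Part2 hold unsigned integers and Part3 holds raw bytes (the question does not say what the parts contain, so the format string and the name pack_record are illustrative only):

```python
import struct

# '>' = big-endian with no alignment padding.
# Q = 8-byte unsigned int, I = 4-byte unsigned int,
# 16s = 16-byte field; short values are padded with zero bytes on the right.
RECORD = struct.Struct('>QI16s')


def pack_record(part1, part2, part3):
    """Return the 28-byte record for the three parts (hypothetical helper)."""
    return RECORD.pack(part1, part2, part3)


record = pack_record(1, 2, b'abc')
print(RECORD.size)   # 28
print(record.hex())
```

Note that '16s' silently truncates values longer than 16 bytes, so a length check may be worth adding if that matters.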
The other common way to see values written is as a hex string. In 4 bytes, the denary value 1 is written as 01000000 in little endian or 00000001 in big endian.
>>> int(1).to_bytes(length=4, byteorder='big').hex()
'00000001'
>>> int(1).to_bytes(length=4, byteorder='little').hex()
'01000000'
In your question you have written 0000 for the value to be converted using struct and written to a file.
f = open('file', 'w+b')
res = struct.pack(">l", 0000)
f.write(res)
0000 will work, but 0001 will give the SyntaxError shown above (leading zeros in decimal integer literals are not permitted).
I think what you have in your question is the hex string representation of the value you want written.
If it is a hex string you are trying to input then the following will work:
f = open('file', 'w+b')
res = bytes.fromhex('0001')
f.write(res)
The other part of your question was about padding the values to a certain byte length.
If your hex string represents the correct byte length then you are good.
However the example you gave was only 2 bytes long:
>>> bytes.fromhex('0001')
b'\x00\x01'
>>> len(bytes.fromhex('0001'))
2
You wanted fields of 4, 8, or 16 bytes, in which case the bytes have to be "padded" with zero-valued bytes to get the correct length, e.g.
>>> bytes.fromhex('0001').rjust(4, b'\x00')
b'\x00\x00\x00\x01'
>>> bytes.fromhex('0001').rjust(8, b'\x00')
b'\x00\x00\x00\x00\x00\x00\x00\x01'
>>> bytes.fromhex('0001').rjust(16, b'\x00')
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01'
If the hex string is in little endian format then ljust would be required:
>>> bytes.fromhex('0100').ljust(4, b'\x00')
b'\x01\x00\x00\x00'
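Putting the padding together for the three-part layout in the question could look like the sketch below. It assumes each part arrives as a big-endian hex string; the names FIELD_SIZES, pad_field and build_record are made up for illustration, and io.BytesIO stands in for the real file so the example has no side effects.

```python
import io

# Field widths from the question: Part1, Part2, Part3.
FIELD_SIZES = (8, 4, 16)


def pad_field(value, size):
    """Left-pad big-endian bytes with zeros to exactly `size` bytes."""
    if len(value) > size:
        raise ValueError(f'{len(value)} bytes do not fit in a {size}-byte field')
    return value.rjust(size, b'\x00')


def build_record(hex_parts):
    """Join the zero-padded fields into one 28-byte record."""
    return b''.join(
        pad_field(bytes.fromhex(part), size)
        for part, size in zip(hex_parts, FIELD_SIZES)
    )


record = build_record(['0001', 'ff', 'abcd'])

# Replace io.BytesIO() with open('file', 'wb') to write a real file.
with io.BytesIO() as f:
    f.write(record)
    written = f.getvalue()

print(len(written))  # 28
```

Because every field is padded to its full width before writing, each part always starts at the same offset (0, 8 and 12), whatever the length of the values.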
In an attempt to write simple code to convert a hex string to base64, my thought was: hex -> integer -> binary -> base64.
So I wrote this little code:
import string

def bit(integer):
    # To binary (bin() returns a string such as '0b1001', so strip the prefix)
    return bin(integer)[2:]

# Hex digits are weighted by powers of 16 depending on position:
# 0xAB = A*(16**1) + B*(16**0) = 10*16 + 11
# 0x3a2f = 3*(16**3) + 10*(16**2) + 2*(16**1) + 15

def removeletter(list):
    # "abcdef" = "10 11 12 13 14 15"
    for i, letter in enumerate(list):
        if letter in hextable.keys():
            list[i] = hextable[letter]
    return list

def todecimal(h):
    power = 0
    l = [num for num in str(h)]  # ['3', 'a', '2', 'f']
    l = removeletter(l)
    l.reverse()  # ['f', '2', 'a', '3']
    for i, n in enumerate(l):
        number = int(n)
        l[i] = number*(16**power)
        power += 1
    l.reverse()
    return sum(l)

lowers = string.ascii_lowercase
hextable = {}
for number, letter in enumerate(lowers[:6]):
    hextable[letter] = number + 10
In this little challenge I am doing, it says:
The string:
49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d
Should produce:
SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t
ok,
print(bit(todecimal('49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d')))
This should print the binary form of the hex string, which, if I put it through a binary-to-base64 converter, should return SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t. Or so I thought.
└─$ python3 hextobase64.py
10010010010011101101101001000000110101101101001011011000110110001101001011011100110011100100000011110010110111101110101011100100010000001100010011100100110000101101001011011100010000001101100011010010110101101100101001000000110000100100000011100000110111101101001011100110110111101101110011011110111010101110011001000000110110101110101011100110110100001110010011011110110111101101101
After checking with a hex-to-binary converter, I can see that the binary is correct.
Now, if I put this through a binary-to-base64 converter, it should return SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t, right?
The thing is, the binary-to-base64 converter gave me kk7aQNbS2NjS3M5A8t7q5EDE5MLS3EDY0tbKQMJA4N7S5t7c3urmQNrq5tDk3t7a, which is odd to me, so I must be doing something wrong. From my understanding, hexadecimal can be represented in binary, and so can base64: base64 takes the binary and groups it into 6-bit chunks to produce its own representation. So if I have the binary, the two should be interchangeable, but something is wrong.
What am I doing wrong?
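A likely cause, judging from the output above: bin() drops leading zeros, and the first hex byte 0x49 is 01001001 in binary, so the printed bit string is one bit short and every 6-bit group is shifted by one position. A minimal sketch using only the standard library (bytes.fromhex and base64.b64encode), which avoids the integer conversion altogether:

```python
import base64

hex_string = '49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d'

# Two hex digits -> one byte, leading zero bits included.
raw = bytes.fromhex(hex_string)
encoded = base64.b64encode(raw).decode('ascii')
print(encoded)

# If the bit string itself is needed, pad it to 4 bits per hex digit.
bits = format(int(hex_string, 16), '0{}b'.format(len(hex_string) * 4))
print(bits[:8])  # 01001001
```

With the leading zero restored, the first 6-bit groups are 010010 010010 011101 101101, which map to S, S, d, t as expected.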
I have a hex value of 16 digits (i.e., length 16) that I want to convert to binary as the following:
n = '5851F42D00000000' # length = 16
bin_ = bin(int(n, 16))[2:]
print(len(bin_))
I get 63 here, but I expect the length of the resulting binary string to be 64. Am I doing something wrong?
Thank you
bin doesn't give you leading zeroes, because it has no way of knowing how many leading zeroes you would want. You can get the desired behaviour using string formatting:
>>> n = '5851F42D00000000'
>>> '{:064b}'.format(int(n, 16))
'0101100001010001111101000010110100000000000000000000000000000000'
>>> len(_)
64
The format string {:064b} formats an integer in binary, with leading zeroes up to length 64.
This number in binary has a leading zero:
>>> f'{int(n, 16):064b}'
'0101100001010001111101000010110100000000000000000000000000000000'
bin will return the shortest possible representation, not including leading zeros:
>>> bin(int(n, 16))
'0b101100001010001111101000010110100000000000000000000000000000000'
   ^ there's technically a zero before that
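When the hex string is not always 16 digits long, the width can be derived from the input instead of hard-coding 64. A small sketch (hex_to_bits is a made-up helper name):

```python
def hex_to_bits(hex_string):
    """Binary string with 4 bits per hex digit, leading zeros kept."""
    width = 4 * len(hex_string)
    return format(int(hex_string, 16), '0{}b'.format(width))


bits = hex_to_bits('5851F42D00000000')
print(len(bits))  # 64
```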
I'm having an issue getting from hexadecimal string to hexadecimal integer in Python 3.
When you write hex(12) you get the output 0xc, which is of class str. However, when you type e.g. int = 0x55, the class/type is an integer.
How do you go from "0x55" to 0x55 (as an integer)?
Thank you :)
You might be confusing a number's representation with its value. For example, the following evaluates to True in Python: 83 == 0o123 == 0x53. The decimal representation is 83, the octal representation is 123, and the hexadecimal representation is 53. However, the value of all those representations is the same. For the sake of explanation, I will give the value using its decimal representation: the value is 83.
If you are trying to convert a string into a number, you may want to look at the eval and ast.literal_eval functions. Here is a demonstration of how you might use the second:
>>> import ast
>>> number = 12345
>>> hex_string = hex(number)
>>> oct_string = oct(number)
>>> print('hex_string =', repr(hex_string))
hex_string = '0x3039'
>>> print('oct_string =', repr(oct_string))
oct_string = '0o30071'
>>> number == ast.literal_eval(hex_string)
True
>>> number == ast.literal_eval(oct_string)
True
>>>
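For a string that is known to hold a hexadecimal number, the built-in int() with an explicit base is another option. A short sketch (the variable names are illustrative):

```python
hex_string = '0x55'

# Base 16: the '0x' prefix is accepted but not required.
value = int(hex_string, 16)

# Base 0: the base is inferred from the prefix (0x, 0o, 0b).
auto = int(hex_string, 0)

print(value)            # 85
print(value == 0x55)    # True
print(int('0o123', 0))  # 83
```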
I am trying to write a specific number of bytes of a string to a file. In C, this would be trivial: since each character is 1 byte, I would simply write however many characters from the string I want.
In Python, however, since apparently each character/string is an object, they are of varying sizes, and I have not been able to find how to slice the string at byte-level specificity.
Things I have tried:
Bytearray:
(For $, read >>>, which messes up the formatting.)
$ barray = bytearray('a')
$ import sys
$ sys.getsizeof(barray[0])
24
So turning a character into a bytearray doesn't turn it into an array of bytes as I expected and it's not clear to me how to isolate individual bytes.
Slicing byte objects as described here:
$ value = b'a'
$ sys.getsizeof(value[:1])
34
Again, a size of 34 is clearly not 1 byte.
memoryview:
$ value = b'a'
$ mv = memoryview(value)
$ sys.getsizeof(mv[0])
34
$ sys.getsizeof(mv[0][0])
34
ord():
$ n = ord('a')
$ sys.getsizeof(n)
24
$ sys.getsizeof(n[0])
Traceback (most recent call last):
File "<pyshell#29>", line 1, in <module>
sys.getsizeof(n[0])
TypeError: 'int' object has no attribute '__getitem__'
So how can I slice a string into a particular number of bytes? I don't care if slicing the string actually leads to individual characters being preserved or anything as with C; it just has to be the same each time.
Make sure the string is encoded as a byte string (in Python 2.7 a str already is one).
Then just slice the string object and write the result to the file.
In [26]: s = '一二三四'
In [27]: len(s)
Out[27]: 12
In [28]: with open('test', 'wb') as f:
   ....:     f.write(s[:2])
   ....:
In [29]: !ls -lh test
-rw-r--r-- 1 satoru wheel 2B Aug 24 08:41 test
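In Python 3, where str holds Unicode text rather than bytes, the same idea needs an explicit encode step first. A minimal sketch, assuming UTF-8 is the wanted encoding; write_prefix is a hypothetical helper and io.BytesIO stands in for a file opened with 'wb':

```python
import io


def write_prefix(fileobj, text, nbytes, encoding='utf-8'):
    """Encode `text` and write only its first `nbytes` bytes."""
    data = text.encode(encoding)
    return fileobj.write(data[:nbytes])


buffer = io.BytesIO()  # stands in for open('test', 'wb')
written = write_prefix(buffer, '一二三四', 2)

print(written)                           # 2
print(len('一二三四'.encode('utf-8')))   # 12
```

len() on a bytes object gives the number of bytes that will be written, whereas sys.getsizeof reports the size of the whole Python object including its overhead, which is why it never shows 1. Note that a byte-level slice can cut through the middle of a multi-byte character.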
import binascii
import struct

file_1 = (r'res\test.png')
with open(file_1, 'rb') as file_1_:
    file_1_read = file_1_.read()
file_1_hex = binascii.hexlify(file_1_read)
print ('Hexlifying test.png..')
pack = ("test.packet")
file_1_size_bytes = len(file_1_read)
print (("test.png is"),(file_1_size_bytes),("bytes."))
struct.pack( 'i', file_1_size_bytes)  # result is discarded
file_1_size_bytes_hex = binascii.hexlify(struct.pack( '>i', file_1_size_bytes))
print (("Hexlifyed length - ("),(file_1_size_bytes_hex),(")."))
with open(pack, 'ab') as header_1_:
    header_1_.write(binascii.unhexlify(file_1_size_bytes_hex))
print (("("),(binascii.unhexlify(file_1_size_bytes_hex)),(")"))
with open(pack, 'ab') as header_head_1:
    header_head_1.write(binascii.unhexlify("0000020000000D007200650073002F00000074006500730074002E0070006E006700000000"))
print ("Header part 1 added.")
So this writes "0000020000000D007200650073002F00000074006500730074002E0070006E006700000000(00)" to the pack, unhexlified.
There's an extra "00" byte at the end. This is messing up everything I'm trying to do, because the packet's length is referred back to when loading it, and I have about 13 extra "00" bytes at the end of each string I write to the file. So in turn my file is 13 bytes longer than it should be. Not to mention that the header's byte length isn't being read properly because the padding is off by 1 byte.
You seem to be saying that binascii.unhexlify does not really condense the input string. I have trouble believing that. Here is a minimal complete runnable example and the output I get with 3.4.2 on Win 7.
import binascii
import io
b = binascii.unhexlify(
"000000030000000100000000000000040041004E0049004D00000000000000")
print(b) # bytes
bf = io.BytesIO()
bf.write(b)
print(bf.getvalue())
>>>
b'\x00\x00\x00\x03\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x04\x00A\x00N\x00I\x00M\x00\x00\x00\x00\x00\x00\x00'
b'\x00\x00\x00\x03\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x04\x00A\x00N\x00I\x00M\x00\x00\x00\x00\x00\x00\x00'
Unhexlify has converted each pair of hex characters to the byte expected.
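One way to check where extra bytes come from is to compare the number of bytes written against the number expected. The sketch below assumes a 4-byte big-endian length followed by a hex-encoded header, as in the question; build_header, the payload size 1234 and the short header string are placeholders, and io.BytesIO stands in for the packet file.

```python
import binascii
import io
import struct


def build_header(payload_size, header_hex):
    """Return a 4-byte big-endian length followed by the decoded header bytes."""
    return struct.pack('>i', payload_size) + binascii.unhexlify(header_hex)


header_hex = "0041004E0049004D"  # illustrative value, not the real header
header = build_header(1234, header_hex)

out = io.BytesIO()  # stands in for the packet file
out.write(header)

# Two hex characters become exactly one byte, so the sizes must match.
expected = 4 + len(header_hex) // 2
print(len(out.getvalue()) == expected)  # True
```

If the sizes match here but the file on disk is still longer, one thing worth ruling out is the 'ab' mode, which appends to whatever an earlier run already left in the file.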