How to create a dump file in hex format from python - python-3.x

I have an array of integers that I want to dump into a binary file (a HEX file, to be specific) using a Python script.
I have written this code:
MemDump = Debug.readMemory(ic.IConnectDebug.fRealTime, 0, 0xB0009CC4, 0xCFF, 1)
MemData = MemDump[:3321]
# Create New file in binary mode and open for writing
fp = open("MON.dmp", 'w')
sys.stdout = fp
for byte in MemData:
    print(byte)
Here MemDump contains an array of integer values. I want to dump the first 3321 bytes of this array to the file.
I am getting the output in the file MON.dmp, but in ASCII format.
If I instead open the file in binary mode using
fp = open("MON.dmp", 'wb')
the print(byte) command gives me an error saying
'str' does not support the buffer interface
Thank you in advance.

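You need to convert the integers to a bytes-like object before you can write them to a file opened in 'wb' mode. print() always produces text, so it cannot be used on a binary file; call fp.write() instead. The bytearray() function (or bytes()) accepts a sequence of integers in the range 0 to 255. Note that bytearray(byte) with a single integer would instead create a zero-filled array of that length, so convert the whole slice at once:
with open("MON.dmp", 'wb') as fp:
    fp.write(bytearray(MemData))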
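As a self-contained illustration of writing integers as raw bytes, here is a minimal sketch. The sample values, the file name MON_example.dmp, and the helper dump_ints are made up for the example; a plain list stands in for the result of Debug.readMemory, and every value is assumed to fit in one byte.

```python
import os
import tempfile


def dump_ints(values, path):
    """Write a sequence of integers (each 0..255) to path as raw bytes."""
    data = bytes(values)  # raises ValueError if any value is outside 0..255
    with open(path, "wb") as fp:
        fp.write(data)
    return data


# Hypothetical sample data standing in for the memory dump.
mem_data = [0x00, 0x41, 0x7F, 0xFF]
dump_path = os.path.join(tempfile.gettempdir(), "MON_example.dmp")
written = dump_ints(mem_data, dump_path)

# Read the file back and show its contents as hexadecimal text.
with open(dump_path, "rb") as fp:
    round_trip = fp.read()
print(round_trip.hex())
```

If a human-readable hex listing is wanted rather than raw bytes, the string returned by .hex() can be written to a text file instead.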

Related

generate a hex file by converting strings into hex from a text file in Python

I have a Python tool that generates a text file in which each line is a string. I want to generate a hex file from this text file. The file has lines like the following:
-5.139488050547036391e-01
3.181812818225058681e+00
475.465798764
abc[0]
abc[0]*abc[10]
I tried using binascii.hexlify(b'<String>'), which works when I manually enter the strings, but when I do this:
import binascii
with open("strings.txt", "r") as a_file:
    for line in a_file:
        if not line.strip():
            continue
        stripped_line = line.strip()
        hex_= binascii.hexlify(b'<'+ stripped_line +'>')
        print(hex_)
I get this error:
TypeError: can't concat str to bytes
How can I convert those strings of different types into hex and generate a .hex file?
To convert a string (a line in the file) to a bytes object, you have to encode it. In your case, that only means replacing this line in your code
hex_= binascii.hexlify(b'<'+ stripped_line +'>')
with this line
hex_= binascii.hexlify(stripped_line.encode())
Your code raised an error because you tried to concatenate b'<' (a bytes object) with stripped_line (a str object), which Python does not allow.
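The encode-then-hexlify step can be sketched end to end as follows. The list of lines stands in for the contents of strings.txt, and the output is simply one hex string per input line; whether the target ".hex" file needs a specific layout (such as Intel HEX records) is not stated in the question, so that part is an assumption.

```python
import binascii

# Sample lines standing in for the contents of strings.txt.
lines = ["-5.139488050547036391e-01", "", "abc[0]", "abc[0]*abc[10]"]

hex_lines = []
for line in lines:
    stripped = line.strip()
    if not stripped:
        continue  # skip blank lines
    # encode() turns str into bytes (UTF-8 by default); hexlify() returns
    # bytes, so decode the result to get ordinary text for an output file.
    hex_lines.append(binascii.hexlify(stripped.encode()).decode("ascii"))

print("\n".join(hex_lines))
```

The resulting text can then be written with an ordinary open(path, "w") call, and binascii.unhexlify() reverses the conversion.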

How to convert Hex to original file format?

I have a .tgz file that was formatted as shell code, it looks like this (Hex):
"\x1F\x8B\x08\x00\x44\x7A\x91\x4F\x00\x03\xED\x59\xED\x72.."
It was generated this way (python3):
import os
def main():
    dump_src = "MyPlugin.tgz"
    fc = ""
    try:
        with open(dump_src, 'rb') as fd:
            fcr = fd.read()
            for byte in bytearray(fcr):
                fc += "\\x{:02x}".format(byte)
    except:
        fcr = dump_src
        for byte in bytearray(fcr):
            fc += "\\x{:02x}".format(byte)
    print(fc)
    # failed attempt:
    fcback = bytes(int(fc[i+2:i+4], 16) for i in range(0, len(fc), 4))
    print (fcback)
if __name__ == "__main__":
    main()
How can I convert this back to the original tgz archive?
Edit: failed attempt in the last section outputs this:
b'\x8b\x00\x10]\x03\x93o0\x85%\xe2!\xa4H\xf1Fi\xa7\x15\xf61&\x13N\xd9[\xfag\x11V\x97\xd3\xfb%\xf7\xe3\\\xae\xc2\xff\xa4>\xaf\x11\xcc\x93\xf1\x0c\x93\xa4\x1b\xefxj\xc3?\xf9\xc1\xe8\xd1\xd9\x01\x97qB"\x1a\x08\x9cO\x7f\xe9\x19\xe3\x9c\x05\xf2\x04a\xaa\x00A,\x15"RN-\xb6\x18K\x85\xa1\x11\x83\xac/\xffR\x8a\xa19\xde\x10\x0b\x08\x85\x93\xfc]\x8a^\xd2-T\x92\x9a\xcc-W\xc7|\xba\x9c\xb3\xa6V0V H1\x98\xde\x03#\x14\'\n 1Y\xf7R\x14\xe2#\xbe*:\xe0\xc8\xbb\xc9\x0bo\x8bm\xed.\xfd\xae\xef\x9fT&\xa1\xf4\xcf\xa7F\xf4\xef\xbb"8"\xb5\xab,\x9c\xbb\xfc3\x8b\xf5\x88\xf4A\x0ek%5eO\xf4:f\x0b\xd6\x1bi\xb6\xf3\xbf\xf7\xf9\xad\xb5[\xdba7\xb8\xf9\xcd\xba\xdd,;c\x0b\xaaT"\xd4\x96\x17\xda\x07\x87& \xceH\xd6\xbf\xd2\xeb\xb4\xaf\xbd\xc2\xee\xfc\'3zU\x17>\xde\x06u\xe3G\x7f\x1e\xf3\xdf\xb6\x04\x10A\x04\x10A\x04\x10A\x04\x10A\xff\x9f\xab\xe8(\x00'
And when I output it to a file (e.g. via python3 main.py > MyFile.tgz) the file is corrupted.
Since you know the format of the data (each byte is encoded as a string of 4 characters in the format "\xAB") it's easy to revert the conversion and get the original bytes again. It'll only take one line of Python code:
data = bytes(int(fc[i+2:i+4], 16) for i in range(0, len(fc), 4))
This uses:
range(start, stop, step) with step 4 to iterate in groups of 4 characters through your string
slicing to get each group of 2 hexadecimal digits
int(x, base) to convert the hexadecimal string to an integer
a generator expression to immediately pass the converted elements to:
bytes() to create a bytes object with the data
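The variable data is now of type bytes, and you could write it directly to a file (to decompress with an external archive tool) or pass it to gzip.decompress() (to process it further in Python; a .tgz file is a gzip-compressed tar archive, which plain zlib.decompress() rejects with its default settings).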
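The decoding steps listed above can be checked with a small in-memory round trip. This is only a sketch: a gzip-compressed payload created on the spot stands in for MyPlugin.tgz, and the bytes.fromhex() variant is an alternative added for comparison, not part of the original answer.

```python
import gzip

# Stand-in for the archive contents: gzip-compress a small payload in memory.
original = gzip.compress(b"hello archive")

# Encode each byte as the four characters \xAB, as in the question.
fc = "".join("\\x{:02x}".format(byte) for byte in original)

# Decode: take the two hex digits of every 4-character group.
data = bytes(int(fc[i + 2:i + 4], 16) for i in range(0, len(fc), 4))

# Equivalent decoding that strips the \x markers and uses bytes.fromhex().
data_alt = bytes.fromhex(fc.replace("\\x", ""))

print(fc[:8])                 # the gzip magic number, as text
print(gzip.decompress(data))  # the original payload
```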
UPDATE (follow-up on the comments and updated question):
Firstly, I have tested the above code and it does result in the same bytes as the input. Are you really sure that the example output in your question is the actual result of the code in your question? Please try to be careful when copying code and/or output. A few remarks:
Your code is not properly formatted, so I cannot run it without making modifications. And when I have made modifications to the code, I might run different code than you do, yielding different results. So next time please copy-paste your exact (working, tested) code without modifications.
The format string in your code uses lowercase hexadecimal format, and your first example output uses uppercase. So that output cannot be from this code.
I don't have access to your file "MyPlugin.tgz", but when I test your code with another .tgz file (after fixing the IndentationErrors), my output is correct. It starts with \x1f\x8b as expected (this is the magic number in the gzip header). I can't explain why your output is different...
Secondly, it seems like you don't fully understand how bytes and string representations work. When you write print(fcback), a string representation of the Python object fcback (in this case a bytes object) is printed. The string representation of a bytes object is not the same as the binary data! When printing a bytes object, each byte that corresponds to a printable ASCII character is replaced by that character, other bytes are escaped (similar to the formatted string that your code generates). Also, it starts with b' and ends with '.
You cannot print binary data to your terminal and then pipe the output to a file. This will result in a different file. The correct way to write the data to a file is using file.write(data) in your Python code.
Here's a fully working example:
def binary_to_text(data):
    """Convert a bytes object to a formatted text string."""
    text = ""
    for byte in data:
        text += "\\x{:02x}".format(byte)
    return text
def text_to_binary(text):
    """Convert a formatted text string to a bytes object."""
    return bytes(int(text[i+2:i+4], 16) for i in range(0, len(text), 4))
def main():
    # Read the binary data from input file:
    with open('MyPlugin.tgz', 'rb') as input_file:
        input_data = input_file.read()
    # Convert binary to text (based on your original code):
    text = binary_to_text(input_data)
    print(text[0:100])
    # Convert the text back to binary:
    output_data = text_to_binary(text)
    print(output_data[0:100])
    # Write the binary data back to a file:
    with open('MyPlugin-restored.tgz', 'wb') as output_file:
        output_file.write(output_data)
if __name__ == '__main__':
    main()
Note that I only print the first 100 elements to keep the output short. Also notice that the second print-statement prints a much longer text. This is because the first print gets 100 characters (which are printed "as is"), while the second print gets 100 bytes (of which most bytes are escaped, causing the output to be longer).
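The difference between a bytes object and its printed representation can be seen on a handful of bytes. The five sample bytes below are arbitrary and chosen only to mix non-printable values with one printable ASCII character.

```python
data = bytes([0x1F, 0x8B, 0x08, 0x00, 0x41])

# repr() is the text that print(data) displays: escapes plus the b'...' wrapper.
shown = repr(data)
print(shown)

# Five bytes of data, but a much longer piece of text on screen.
print(len(data), len(shown))
```

Redirecting that printed text into a file therefore stores the escaped representation, not the five original bytes.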

Need to open and read a .bin file in Python. Getting error: 'utf-8' codec can't decode byte 0x81 in position 11: invalid start byte

I am trying to read and convert binary into text that anyone could read. I am having trouble with the error message:
'utf-8' codec can't decode byte 0x81 in position 11: invalid start byte
I have gone through "Reading binary file and looping over each byte",
trying multiple ways of opening and reading the binary file. After reading about this error message, I found that most people either had trouble with .csv files or had to change utf-8 to utf-16. But according to https://en.wikipedia.org/wiki/UTF-16#Byte_order_encoding_schemes , Python does not use UTF-16 anymore.
Also, if I add encoding = utf-16/32, the error states: binary mode doesn't take an encoding argument
Here is my code:
with open(b"P:\Projects\2018\1809-0068-R\Bin_Files\snap-pac-eb1-R10.0d.bin", "rb") as f:
    byte = f.read(1)
    while byte != b"":
        byte = f.read(1)
    print(f)
I am expecting to be able to read and write to the binary file. I would like to translate it to Hex and then to text (or to legible text somehow), but I think I have to go through this step before. If anyone could help with what I am missing, that would be greatly appreciated! Any way to open and read a binary file would be accepted. Thank you for your time!
I am not sure, but this might help:
import binascii
with open('snap-pac-eb1-R10.0d.bin', 'rb') as f:
    header = f.read(6)
    b = bytearray(header)
    binary=[bin(i)[2:].zfill(8) for i in b]
    n = int('0b'+''.join(binary), 2)
    nn = binascii.unhexlify('%x' % n)
    nnn=nn.decode("ascii")[0:-1]
    result='.'.join(str(ord(c)) for c in nnn[0:-1])
    print(result)
Output:
16.0.8.0
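A likely cause of the error in the question is the path literal rather than the file contents. In a non-raw literal, the \201 in "\2018" is an octal escape that produces the byte 0x81, and it sits at index 11 of the path, which matches the error message. A raw literal avoids this. The sketch below demonstrates that, then shows one way to view binary data as hex and as filtered text. The sample bytes are made up (the real .bin file is not available), and reading the first four bytes as a dotted version number is an assumption based on the 16.0.8.0 output above.

```python
# The octal escape \201 inside a non-raw bytes literal becomes byte 0x81.
bad_path = b"P:\\Projects\2018"   # backslash before P written explicitly
raw_path = rb"P:\Projects\2018"   # raw literal keeps the backslash and digits

print(hex(bad_path[11]))
print(raw_path[11:])

# Made-up file contents standing in for the start of the .bin file.
sample = bytes([0x10, 0x00, 0x08, 0x00, 0x81, 0x41])

hex_view = sample.hex(" ")  # separator argument needs Python 3.8+
text_view = "".join(chr(b) if 32 <= b < 127 else "." for b in sample)
version = ".".join(str(b) for b in sample[:4])

print(hex_view)
print(text_view)
print(version)
```

With a real file, open it with a raw string path, for example open(r"...", "rb"), and apply the same views to the result of f.read().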

Writing to files in ASCII with Python3, not UTF8

I have a program that I created with two sections.
The first one copies a text file, putting an integer in the middle of the file name, in this format:
file = "Filename" + "str(int)" + ".txt"
The user can create as many copies of the file as they would like.
The second part of the program is what I am having the problem with. There is an integer at the very bottom of the file that is to correspond with the integer in the file name. After the first part is done, I open each file one at a time in "r+" read/write format. So I can file.seek(1000) to about where the integer is in the file.
Now in my opinion the next part should be easy. I should just simply have to write str(int) into the file right here. But it wasn't that easy. It worked just fine doing it like that in Linux at home, but at work on Windows it proved difficult. What I ended up having to do after file.seek(1000) is write to the file using Unicode UTF-8. I accomplished this with this code snippet of the rest of the program. I will document it so that it is able to be understood what is going on. Instead of having to write this in Unicode, I would love to be able to write this in good old regular English ASCII characters. Eventually this program will be expanded to include a lot more data at the bottom of each file. Having to write the data in Unicode is going to make things extremely difficult. If I just write the data without turning it into Unicode this is the result. This string is supposed to say #2 =1534, instead it says #2 =ㄠ㌵433.
If someone can show me what I am doing wrong that would be great. I would love to just use something like file.write('1534') to write the data to the file instead of having to do it in Unicode UTF-8.
while a1 < d1 :
    file = "file" + str(a1) + ".par"
    f = open(file, "r+")
    f.seek(1011)
    data = f.read() #reads the data from that point in the file into a variable.
    numList= list(str(a1)) # "a1" is the integer in the file name. I had to turn the integer into a list to accomplish the next task.
    replaceData = '\x00' + numList[0] + '\x00' + numList[1] + '\x00' + numList[2] + '\x00' + numList[3] + '\x00' #This line turns the integer into Utf 8 Unicode. I am by no means a Unicode expert.
    currentData = data #probably didn't need to be done now that I'm looking at this.
    data = data.replace(currentData, replaceData) #replaces the Utf 8 string in the "data" variable with the new Utf 8 string in "replaceData."
    f.seek(1011) # Return to where I need to be in the file to write the data.
    f.write(data) # Write the new Unicode data to the file
    f.close() #close the file
    f.close() #make sure the file is closed (sometimes it seems that this fails in Windows.)
    a1 += 1 #advances the integer, and then return to the top of the loop
This is an example of writing to a file in ASCII. You need to open the file in binary mode; the .encode() method of strings is a convenient way to get the bytes you want.
s = '12345'
ascii = s.encode('ascii')
with open('somefile', 'wb') as f:
    f.write(ascii)
In your case, since the file already exists, you can also open it in 'rb+' (binary read and write) mode.
with open('somefile', 'rb+') as f:
    existing = f.read()
    f.write(b'ascii without encoding!')
You can also pass bytes literals with the b prefix directly, as shown in the second example; they may contain only ASCII characters.
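The '\x00' bytes interleaved with the digits in the question suggest that the .par files are stored as UTF-16 rather than ASCII, which would also explain why plain ASCII digits show up as unrelated characters; this is an inference, not something the question confirms. Under that assumption, a value can be patched in place in binary mode as sketched below. The file name, its contents, and the "#2 =0000" placeholder are all made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.gettempdir(), "file_example.par")

# Build a stand-in file; the real .par layout is unknown, so this is made up.
text = "header\n#2 =0000"
with open(path, "wb") as f:
    f.write(text.encode("utf-16-le"))

# Each of these characters takes 2 bytes in UTF-16, so the offset is doubled.
offset = text.index("0000") * 2
with open(path, "rb+") as f:
    f.seek(offset)
    f.write("1534".encode("utf-16-le"))  # overwrites exactly 8 bytes in place

with open(path, "rb") as f:
    patched = f.read().decode("utf-16-le")
print(patched)
```

If the files really are plain ASCII, the same pattern works with .encode('ascii') and an undoubled offset.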

Convert Binary data from file to readable string

I have binary data stored in a file. I am doing this:
byte[] fileBytes = File.ReadAllBytes(@"c:\carlist.dat");
string ascii = Encoding.ASCII.GetString(fileBytes);
This is giving me the following result, with a lot of invalid characters. What am I doing wrong?
?D{F ?x#??4????? NBR-OF-CARSNUMBER-OF-CARS!"#??? NBR-OF-CARS$%??1y0#123?G??#$ NBR-OF-CARS%45??1y#  NUMBER-OF-CARSd?
Hmm... it seems the file was saved from a byte buffer in which some numeric data was written after NBR-OF-CARS. If you have access to the code that saves the file, could you check whether numbers are written there and, if so, whether the code converts them to strings before writing the values into the binary stream?
