Pickle in python3, error on concatenating string to bytes - python-3.x

I am converting some code from Python 2 to 3 and hit an error that 2to3 did not catch on this line:
pickle.dumps(('predskew', predskewData[0])) + pickleSep
That produces an error in python3:
pickledPredskewData = pickle.dumps(('predskew', predskewData[0])) + pickleSep
TypeError: can't concat str to bytes
I know from other posts on Stack Overflow that I could perhaps use encode() or decode(), but I wasn't sure which or where. So I tried this in Python 2:
pickleSep = ":::::"
pickle.dumps(('predskew',0)) + pickleSep
Which produces:
"(S'predskew'\np0\nI0\ntp1\n.:::::"
Also,
pickle.dumps(('predskew',0)) + pickleSep.encode()
Gives the same result.
Now if I try the same line in Python 3, I get what looks like vastly different output:
pickle.dumps(('predskew', 0)) + pickleSep.encode()
Gives the output of:
b'\x80\x04\x95\x10\x00\x00\x00\x00\x00\x00\x00\x8c\x08predskew\x94K\x00\x86\x94.:::::'
So I am not sure my encode fix is the right approach, as the results look different (unless the print is just showing me the raw bytes?).
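One way to check whether the encode fix is sound is to round-trip the data. The sketch below assumes the separator is only appended once, at the end of the payload; the names PICKLE_SEP, record and payload are illustrative and not taken from the original code. The output looks different from Python 2 because Python 3 defaults to a binary pickle protocol, while Python 2 defaulted to the text-based protocol 0.

```python
import pickle

PICKLE_SEP = b":::::"  # a bytes literal, so no .encode() call is needed

record = ("predskew", 0)
payload = pickle.dumps(record) + PICKLE_SEP  # bytes + bytes works in Python 3

# Strip the trailing separator and confirm the tuple survives the round trip.
restored = pickle.loads(payload[: -len(PICKLE_SEP)])

# Protocol 0 is the text-based format that Python 2 produced by default.
legacy = pickle.dumps(record, protocol=0)
restored_legacy = pickle.loads(legacy)
```

If a Python 2 process has to read the data, passing protocol=2 or lower to pickle.dumps is probably needed, since Python 2 does not understand the newer protocols. Note also that a binary pickle could in principle contain the separator bytes itself, so splitting a stream on ":::::" may not be reliable.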

Related

Python : stripping, converting bytes type

Under Python 3.10, I have a UDP socket that listens to a COM port.
I receive data like this:
b'SENDPKT: "STN1" "" "SH/DX\r"\x98\x00'
The SH/DX part before the "\r" can change and varies in length, and I need to extract it.
.strip('b\r') doesn't work.
I tried converting the bytes to a string with .decode() and str() for easier manipulation, but that doesn't work either.
I get the error "invalid start byte" at position 27 for 0x98.
Any idea how I can solve this?
Thanks,
For sophisticated input you can try ignoring errors while decoding:
b = b'SENDPKT: "STN1" "" "SH/DX\r"\x98\x00'
s = b.decode(errors='ignore')
res = s[20:s.find('\r')] # 'SH/DX'
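The fixed offset of 20 above only works while the first field keeps the same length. A possible alternative, sketched here under the assumption that every packet wraps its fields in double quotes as in the sample, is to work on the bytes directly and decode only the part that is needed. The variable names are illustrative.

```python
import re

packet = b'SENDPKT: "STN1" "" "SH/DX\r"\x98\x00'

# Collect the contents of every double-quoted field, still as bytes.
fields = re.findall(rb'"([^"]*)"', packet)

# The last field carries the command; drop the trailing carriage return
# before decoding, so the undecodable 0x98 byte is never touched.
command = fields[-1].rstrip(b"\r").decode("ascii")
```

Because the regular expression is a bytes pattern, the trailing \x98\x00 never has to be decoded at all.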

error trying to recreate php's dechex function in python3

I have a PHP file that takes a simple 8-digit ID and converts it to hex using
dechex(intval($id))
I am now trying to do the same thing in Python. I start by grabbing my list of IDs from the web; these are returned as strings such as
00274956, 00002645, 00000217
I then convert them to integers and hex them using
hex(int(item_id))
but I am getting the error
ValueError: invalid literal for int() with base 10: 'init'
Here is the code; the ID comes directly from an HTTP GET request:
FILE_NUMBER = int(ITEM_ID)
FILE_HEX = hex(FILE_NUMBER)
FILE_NEW = FILE_HEX + ".pdf"
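The traceback suggests that at least one value reaching int() is the literal string 'init' and not a numeric ID. A minimal sketch of a defensive conversion follows; the helper name dechex and the choice to return None for non-numeric input are assumptions made for illustration. It also uses format(..., "x") because Python's hex() adds a "0x" prefix that PHP's dechex() does not.

```python
def dechex(item_id):
    """Return the lowercase hex form of a numeric ID string, or None if it is not numeric."""
    text = item_id.strip()
    if not (text.isascii() and text.isdigit()):
        return None  # e.g. 'init' or an empty string
    return format(int(text), "x")  # no "0x" prefix, like PHP's dechex()


ids = ["00274956", "00002645", "00000217", "init"]
file_names = [dechex(i) + ".pdf" for i in ids if dechex(i) is not None]
```

Skipping non-numeric values is only one option; logging them may help to find out where 'init' comes from in the HTTP response.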

Angr can't solve the googlectf beginner problem

I am a student learning angr for the first time.
I am looking at the code from this URL:
https://github.com/Dvd848/CTFs/blob/master/2020_GoogleCTF/Beginner.md
import angr
import claripy
FLAG_LEN = 15
STDIN_FD = 0
base_addr = 0x100000 # To match addresses to Ghidra
proj = angr.Project("./a.out", main_opts={'base_addr': base_addr})
flag_chars = [claripy.BVS('flag_%d' % i, 8) for i in range(FLAG_LEN)]
flag = claripy.Concat( *flag_chars + [claripy.BVV(b'\n')]) # Add \n for scanf() to accept the input
state = proj.factory.full_init_state(
    args=['./a.out'],
    add_options=angr.options.unicorn,
    stdin=flag,
)
# Add constraints that all characters are printable
for k in flag_chars:
    state.solver.add(k >= ord('!'))
    state.solver.add(k <= ord('~'))
simgr = proj.factory.simulation_manager(state)
find_addr = 0x101124 # SUCCESS
avoid_addr = 0x10110d # FAILURE
simgr.explore(find=find_addr, avoid=avoid_addr)
if (len(simgr.found) > 0):
    for found in simgr.found:
        print(found.posix.dumps(STDIN_FD))
https://github.com/google/google-ctf/tree/master/2020/quals/reversing-beginner/attachments
The code above is meant to be a solution to the Google CTF beginner challenge linked above.
However, it does not work for me and does not give me the answer.
When I execute it, the output is empty, and I want to know why.
I run the code with python3 on Ubuntu 20.04 in WSL2.
Thank you.
I believe this script isn't printing anything because angr fails to find a solution and then exits. You can prove this by appending the following to your script:
else:
    raise Exception('Could not find the solution')
If the exception is raised, a valid solution was not found.
As for why it doesn't work, this code looks like it was copied and pasted from a few different sources, so it's fairly convoluted.
For example, the way the flag symbol is passed to stdin is not ideal. By default, stdin is a SimPackets, so it's best to keep it that way.
The following script solves the challenge; I have commented it to help you understand. You will notice that changing stdin=angr.SimPackets(name='stdin', content=[(flag, 15)]) to stdin=flag will cause the script to fail, for the reason mentioned above.
import angr
import claripy
base = 0x400000 # Default angr base
project = angr.Project("./a.out")
flag = claripy.BVS("flag", 15 * 8) # length is expected in bits here
initial_state = project.factory.full_init_state(
    stdin=angr.SimPackets(name='stdin', content=[(flag, 15)]), # provide symbol and length (in bytes)
    add_options={
        angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
)
# constrain flag to common alphanumeric / punctuation characters
[initial_state.solver.add(byte >= 0x20, byte <= 0x7f) for byte in flag.chop(8)]
sim = project.factory.simgr(initial_state)
sim.explore(
    find=lambda s: b"SUCCESS" in s.posix.dumps(1), # search for a state with this result
    avoid=lambda s: b"FAILURE" in s.posix.dumps(1) # states that meet this constraint will be added to the avoid stash
)
if sim.found:
    solution_state = sim.found[0]
    print(f"[+] Success! Solution is: {solution_state.posix.dumps(0)}") # dump whatever was sent to stdin to reach this state
else:
    raise Exception('Could not find the solution') # Tell us if angr failed to find a solution state
A bit of trivia: there are actually multiple 'solutions' that the program would accept, though I guess the CTF flag server only accepts one.
❯ echo -ne 'CTF{\x00\xe0MD\x17\xd1\x93\x1b\x00n)' | ./a.out
Flag: SUCCESS

Decode byte message in python3

I'm trying to decode the message below, but I keep getting an error. I have tried everything I could find on Google, with no success.
b'6362561400022,B,,\x00\x04\x14\x01\x0bPQ=\n\x15(3\x19\x1a<\x1e\x80\x00\x00\xc8\x04\r\xc6\xb1"\xc4\xf2D\xff\xcb\x02\x0c\xfe\x02\x00\x00\x00\nR\x00\x17\x00\x00\x00\x01'
UPDATE. Found the solution
int("0x" + ''.join([hex(x)[2:] for x in byte_string]), base=16)
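One caveat with the one-liner: hex(x)[2:] drops the leading zero of any byte below 0x10, so the joined digits may not line up with the original bytes. If the goal is to read the whole byte string as a single big-endian integer (an assumption, since the question does not say what the message encodes), int.from_bytes avoids the issue. The two-byte sample below is illustrative only.

```python
byte_string = b"\x01\x02"

# The posted one-liner: each byte loses its leading zero, giving "0x12".
joined = int("0x" + "".join(hex(x)[2:] for x in byte_string), base=16)

# Zero-padded hex digits, two per byte, giving "0102".
padded = int(byte_string.hex(), 16)

# Direct conversion without building a string first.
direct = int.from_bytes(byte_string, "big")
```

Here joined is 18 while padded and direct are both 258, which shows how the results diverge as soon as a small byte value appears.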

MafftCommandline and io.StringIO

I've been trying to use the Mafft alignment tool from Bio.Align.Applications. Currently, I've had success writing my sequence information out to temporary text files that are then read by MafftCommandline(). However, I'd like to avoid redundant steps as much as possible, so I've been trying to write to a memory file instead using io.StringIO(). This is where I've been having problems. I can't get MafftCommandline() to read internal files made by io.StringIO(). I've confirmed that the internal files are compatible with functions such as AlignIO.read(). The following is my test code:
from Bio.Align.Applications import MafftCommandline
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
import io
from Bio import AlignIO
sequences1 = ["AGGGGC",
"AGGGC",
"AGGGGGC",
"AGGAGC",
"AGGGGG"]
longest_length = max(len(s) for s in sequences1)
padded_sequences = [s.ljust(longest_length, '-') for s in sequences1] #padded sequences used to test compatibility with AlignIO
ioSeq = ''
for items in padded_sequences:
    ioSeq += '>unknown\n'
    ioSeq += items + '\n'
newC = io.StringIO(ioSeq)
cLoc = str(newC).strip()
cLocEdit = cLoc[:len(cLoc)] #create string to remove < and >
test1Handle = AlignIO.read(newC, "fasta")
#test1HandleString = AlignIO.read(cLocEdit, "fasta") #fails to interpret cLocEdit string
records = (SeqRecord(Seq(s)) for s in padded_sequences)
SeqIO.write(records, "msa_example.fasta", "fasta")
test1Handle1 = AlignIO.read("msa_example.fasta", "fasta") #alignIO same for both #demonstrates working AlignIO
in_file = '.../msa_example.fasta'
mafft_exe = '/usr/local/bin/mafft'
mafft_cline = MafftCommandline(mafft_exe, input=in_file) #have to change file path
mafft_cline1 = MafftCommandline(mafft_exe, input=cLocEdit) #fails to read string (same as AlignIO)
mafft_cline2 = MafftCommandline(mafft_exe, input=newC)
stdout, stderr = mafft_cline()
print(stdout) #corresponds to MafftCommandline with input file
stdout1, stderr1 = mafft_cline1()
print(stdout1) #corresponds to MafftCommandline with internal file
I get the following error messages:
ApplicationError: Non-zero return code 2 from '/usr/local/bin/mafft <_io.StringIO object at 0x10f439798>', message "/bin/sh: -c: line 0: syntax error near unexpected token `newline'"
I believe this is caused by the angle brackets ('<' and '>') present in the file path.
ApplicationError: Non-zero return code 1 from '/usr/local/bin/mafft "_io.StringIO object at 0x10f439af8"', message '/usr/local/bin/mafft: Cannot open _io.StringIO object at 0x10f439af8.'
Attempting to remove the angle brackets by converting the file path to a string and slicing it resulted in the error above.
Ultimately my goal is to reduce computation time. I hope to accomplish this by calling internal memory instead of writing out to a separate text file. Any advice or feedback regarding my goal is much appreciated. Thanks in advance.
"I can't get MafftCommandline() to read internal files made by io.StringIO()."
This is not surprising for a couple of reasons:
1. As you're aware, Biopython doesn't implement Mafft, it simply provides a convenient interface to set up a call to mafft in /usr/local/bin. The mafft executable runs as a separate process that does not have access to your Python program's internal memory, including your StringIO file.
2. The mafft program only works with an input file, it doesn't even allow stdin as a data source. (Though it does allow stdout as a data sink.) So ultimately, there must be a file in the file system for mafft to open. Thus the need for your temporary file.
Perhaps tempfile.NamedTemporaryFile() or tempfile.mkstemp() might be a reasonable compromise.
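A minimal sketch of the tempfile approach, using only the standard library: the helper name write_temp_fasta and the seqN record labels are made up for illustration, and the call to mafft itself is left as a comment because it depends on the local installation.

```python
import os
import tempfile


def write_temp_fasta(sequences):
    """Write the sequences to a temporary FASTA file and return its path."""
    with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as handle:
        for index, seq in enumerate(sequences):
            handle.write(f">seq{index}\n{seq}\n")
        return handle.name  # the file is closed when the with block exits


fasta_path = write_temp_fasta(["AGGGGC", "AGGGC", "AGGGGGC"])
try:
    # The path can now be handed to an external tool, for example
    # MafftCommandline(mafft_exe, input=fasta_path) as in the question.
    with open(fasta_path) as handle:
        fasta_text = handle.read()
finally:
    os.remove(fasta_path)  # clean up, since delete=False keeps the file around
```

Passing delete=False keeps the file on disk after it is closed, so a separate process can open it by name; the explicit os.remove then takes care of the cleanup.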
