Getting owner of file from smb share, by using python on linux - python-3.x

I need to find out for a script I'm writing who is the true owner of a file in an smb share (mounted using mount -t cifs of course on my server and using net use through windows machines).
Turns out it is a real challenge finding this information out using python on a linux server.
I tried using many many smb libraries (such as smbprotocol, smbclient and others), nothing worked.
I find few solutions for windows, they all use pywin32 or another windows specific package.
And I also managed to do it from bash using smbcalcs but couldn't do it cleanly but using subprocess.popen('smbcacls')..
Any idea on how to solve it?

This was unbelievably not a trivial task, and unfortunately the answer isn't simple as I hoped it would be..
I'm posting this answer if someone will be stuck with this same problem in the future, but hope maybe someone would post a better solution earlier
In order to find the owner I used this library with its examples:
from smb.SMBConnection import SMBConnection
conn = SMBConnection(username='<username>', password='<password>', domain=<domain>', my_name='<some pc name>', remote_name='<server name>')
conn.connect('<server name>')
sec_att = conn.getSecurity('<share name>', r'\some\file\path')
owner_sid = sec_att.owner
The problem is that pysmb package will only give you the owner's SID and not his name.
In order to get his name you need to make an ldap query like in this answer (reposting the code):
from ldap3 import Server, Connection, ALL
from ldap3.utils.conv import escape_bytes
s = Server('my_server', get_info=ALL)
c = Connection(s, 'my_user', 'my_password')
c.bind()
binary_sid = b'....' # your sid must be in binary format
c.search('my_base', '(objectsid=' + escape_bytes(binary_sid) + ')', attributes=['objectsid', 'samaccountname'])
print(c.entries)
But of course nothing will be easy, it took me hours to find a way to convert a string SID to binary SID in python, and in the end this solved it:
# posting the needed functions and omitting the class part
def byte(strsid):
'''
Convert a SID into bytes
strdsid - SID to convert into bytes
'''
sid = str.split(strsid, '-')
ret = bytearray()
sid.remove('S')
for i in range(len(sid)):
sid[i] = int(sid[i])
sid.insert(1, len(sid)-2)
ret += longToByte(sid[0], size=1)
ret += longToByte(sid[1], size=1)
ret += longToByte(sid[2], False, 6)
for i in range(3, len(sid)):
ret += cls.longToByte(sid[i])
return ret
def byteToLong(byte, little_endian=True):
'''
Convert bytes into a Python integer
byte - bytes to convert
little_endian - True (default) or False for little or big endian
'''
if len(byte) > 8:
raise Exception('Bytes too long. Needs to be <= 8 or 64bit')
else:
if little_endian:
a = byte.ljust(8, b'\x00')
return struct.unpack('<q', a)[0]
else:
a = byte.rjust(8, b'\x00')
return struct.unpack('>q', a)[0]
... AND finally you have the full solution! enjoy :(

I'm adding this answer to let you know of the option of using smbprotocol; as well as expand in case of misunderstood terminology.
SMBProtocol Owner Info
It is possible to get the SID using the smbprotocol library as well (just like with the pysmb library).
This was brought up in the github issues section of the smbprotocol repo, along with an example of how to do it. The example provided is fantastic and works perfectly. An extremely stripped down version
However, this also just retrieves a SID and will need a secondary library to perform a lookup.
Here's a function to get the owner SID (just wrapped what's in the gist in a function. Including here in case the gist is deleted or lost for any reason).
import smbclient
from ldap3 import Server, Connection, ALL,NTLM,SUBTREE
def getFileOwner(smb: smbclient, conn: Connection, filePath: str):
from smbprotocol.file_info import InfoType
from smbprotocol.open import FilePipePrinterAccessMask,SMB2QueryInfoRequest, SMB2QueryInfoResponse
from smbprotocol.security_descriptor import SMB2CreateSDBuffer
class SecurityInfo:
# 100% just pulled from gist example
Owner = 0x00000001
Group = 0x00000002
Dacl = 0x00000004
Sacl = 0x00000008
Label = 0x00000010
Attribute = 0x00000020
Scope = 0x00000040
Backup = 0x00010000
def guid2hex(text_sid):
"""convert the text string SID to a hex encoded string"""
s = ['\\{:02X}'.format(ord(x)) for x in text_sid]
return ''.join(s)
def get_sd(fd, info):
""" Get the Security Descriptor for the opened file. """
query_req = SMB2QueryInfoRequest()
query_req['info_type'] = InfoType.SMB2_0_INFO_SECURITY
query_req['output_buffer_length'] = 65535
query_req['additional_information'] = info
query_req['file_id'] = fd.file_id
req = fd.connection.send(query_req, sid=fd.tree_connect.session.session_id, tid=fd.tree_connect.tree_connect_id)
resp = fd.connection.receive(req)
query_resp = SMB2QueryInfoResponse()
query_resp.unpack(resp['data'].get_value())
security_descriptor = SMB2CreateSDBuffer()
security_descriptor.unpack(query_resp['buffer'].get_value())
return security_descriptor
with smbclient.open_file(filePath, mode='rb', buffering=0,
desired_access=FilePipePrinterAccessMask.READ_CONTROL) as fd:
sd = get_sd(fd.fd, SecurityInfo.Owner | SecurityInfo.Dacl)
# returns SID
_sid = sd.get_owner()
try:
# Don't forget to convert the SID string-like object to a string
# or you get an error related to "0" not existing
sid = guid2hex(str(_sid))
except:
print(f"Failed to convert SID {_sid} to HEX")
raise
conn.search('DC=dell,DC=com',f"(&(objectSid={sid}))",SUBTREE)
# Will return an empty array if no results are found
return [res['dn'].split(",")[0].replace("CN=","") for res in conn.response if 'dn' in res]
to use:
# Client config is required if on linux, not if running on windows
smbclient.ClientConfig(username=username, password=password)
# Setup LDAP session
server = Server('mydomain.com',get_info=ALL,use_ssl = True)
# you can turn off raise_exceptions, or leave it out of the ldap connection
# but I prefer to know when there are issues vs. silently failing
conn = Connection(server, user="domain\username", password=password, raise_exceptions=True,authentication=NTLM)
conn.start_tls()
conn.open()
conn.bind()
# Run the check
fileCheck = r"\\shareserver.server.com\someNetworkShare\someFile.txt"
owner = getFileOwner(smbclient, conn, fileCheck)
# Unbind ldap session
# I'm not clear if this is 100% required, I don't THINK so
# but better safe than sorry
conn.unbind()
# Print results
print(owner)
Now, this isn't super efficient. It takes 6 seconds for me to run this one a SINGLE file. So if you wanted to run some kind of ownership scan, then you probably want to just write the program in C++ or some other low-level language instead of trying to use python. But for something quick and dirty this does work. You could also setup a threading pool and run batches. The piece that takes longest is connecting to the file itself, not running the ldap query, so if you can find a more efficient way to do that you'll be golden.
Terminology Warning, Owner != Creator/Author
Last note on this. Owner != File Author. Many domain environments, and in particular SMB shares, automatically alter ownership from the creator to a group. In my case the results of the above is:
What I was actually looking for was the creator of the file. File creator and modifier aren't attributes which windows keeps track of by default. An administrator can enable policies to audit file changes in a share, or auditing can be enabled on a file-by-file basis using the Security->Advanced->Auditing functionality for an individual file (which does nothing to help you determine the creator).
That being said, some applications store that information for themselves. For example, if you're looking for Excel this answer provides a method for which to get the creator of any xls or xlsx files (doesn't work for xlsb due to the binary nature of the files). Unfortunately few files store this kind of information. In my case I was hoping to get that info for tblu, pbix, and other reporting type files. However, they don't contain this information (which is good from a privacy perspective).
So in case anyone finds this answer trying to solve the same kind of thing I did - Your best bet (to get actual authorship information) is to work with your domain IT administrators to get auditing setup.

Related

Custom file generation using cuckoo sandbox

I am trying to generate custom result file which will contain system information gathered using python psutil library.I searched, followed the documentation but couldn't get the answer. All I want to do is, I have written code to collect system information in python psutil library, this code should run along with malware file in cuckoo sandbox and result should be saved in ubuntu host machine.
I don't know where should i put my code and how to save the results.
Here is my code to collect system information
import psutil
pids = psutil.pids()
max_pid = max(pids)
total_id = len(pids)
psutil.cpu_times_percent()
cputimes = psutil.cpu_times_percent(0.1)
usertime = cputimes.user
systemtime = cputimes.system
memuse = psutil.virtual_memory().used
swapuse = psutil.swap_memory().used
net = psutil.net_io_counters()
packsent = net.packets_sent
packrecv = net.packets_recv
bytessent = net.bytes_sent
bytesrecv = net.bytes_recv
result={"maxpid":max_pip,"totalid":total_id,"usertime":usertime,"systemtime":systemtime,"memuse":memuse,"swapuse":swapuse,"packsent":packsent,"packrecv":packrecv,"bytessent":bytessent,"bytesrecv":bytesrecv}
I want to save this result dictionary to be saved in json file in host machine.

Angr can't solve the googlectf beginner problem

I am a student studying angr, first time.
I'm watching the code in this url.
https://github.com/Dvd848/CTFs/blob/master/2020_GoogleCTF/Beginner.md
import angr
import claripy
FLAG_LEN = 15
STDIN_FD = 0
base_addr = 0x100000 # To match addresses to Ghidra
proj = angr.Project("./a.out", main_opts={'base_addr': base_addr})
flag_chars = [claripy.BVS('flag_%d' % i, 8) for i in range(FLAG_LEN)]
flag = claripy.Concat( *flag_chars + [claripy.BVV(b'\n')]) # Add \n for scanf() to accept the input
state = proj.factory.full_init_state(
args=['./a.out'],
add_options=angr.options.unicorn,
stdin=flag,
)
# Add constraints that all characters are printable
for k in flag_chars:
state.solver.add(k >= ord('!'))
state.solver.add(k <= ord('~'))
simgr = proj.factory.simulation_manager(state)
find_addr = 0x101124 # SUCCESS
avoid_addr = 0x10110d # FAILURE
simgr.explore(find=find_addr, avoid=avoid_addr)
if (len(simgr.found) > 0):
for found in simgr.found:
print(found.posix.dumps(STDIN_FD))
https://github.com/google/google-ctf/tree/master/2020/quals/reversing-beginner/attachments
Which is the answer of googlectf beginner.
But, the above code does not work. It doesn't give me the answer.
I want to know why the code is not working.
When I execute this code, the output was empty.
I run the code with python3 in Ubuntu 20.04 in wsl2
Thank you.
I believe this script isn't printing anything because angr fails to find a solution and then exits. You can prove this by appending the following to your script:
else:
raise Exception('Could not find the solution')
If the exception raises, a valid solution was not found.
In terms of why it doesn't work, this code looks like copy & paste from a few different sources, and so it's fairly convoluted.
For example, the way the flag symbol is passed to stdin is not ideal. By default, stdin is a SimPackets, so it's best to keep it that way.
The following script solves the challenge, I have commented it to help you understand. You will notice that changing stdin=angr.SimPackets(name='stdin', content=[(flag, 15)]) to stdin=flag will cause the script to fail, due to the reason mentioned above.
import angr
import claripy
base = 0x400000 # Default angr base
project = angr.Project("./a.out")
flag = claripy.BVS("flag", 15 * 8) # length is expected in bits here
initial_state = project.factory.full_init_state(
stdin=angr.SimPackets(name='stdin', content=[(flag, 15)]), # provide symbol and length (in bytes)
add_options ={
angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
}
)
# constrain flag to common alphanumeric / punctuation characters
[initial_state.solver.add(byte >= 0x20, byte <= 0x7f) for byte in flag.chop(8)]
sim = project.factory.simgr(initial_state)
sim.explore(
find=lambda s: b"SUCCESS" in s.posix.dumps(1), # search for a state with this result
avoid=lambda s: b"FAILURE" in s.posix.dumps(1) # states that meet this constraint will be added to the avoid stash
)
if sim.found:
solution_state = sim.found[0]
print(f"[+] Success! Solution is: {solution_state.posix.dumps(0)}") # dump whatever was sent to stdin to reach this state
else:
raise Exception('Could not find the solution') # Tell us if angr failed to find a solution state
A bit of Trivia - there are actually multiple 'solutions' that the program would accept, I guess the CTF flag server only accepts one though.
❯ echo -ne 'CTF{\x00\xe0MD\x17\xd1\x93\x1b\x00n)' | ./a.out
Flag: SUCCESS

From SSH not decoded from bytes to ASCII?

Good afternoon.
I get the example below from SSH:
b"rxmop:moty=rxotg;\x1b[61C\r\nRADIO X-CEIVER ADMINISTRATION\x1b[50C\r\nMANAGED OBJECT DATA\x1b[60C\r\n\x1b[79C\r\nMO\x1b[9;19HRSITE\x1b[9;55HCOMB FHOP MODEL\x1b[8C\r\nRXOTG-58\x1b[10;19H54045_1800\x1b[10;55HHYB"
I process ssh.recv (99999) .decode ('ASCII')
but some characters are not decoded for example:
\x1b[61C
\x1b[50C
\x1b[9;55H
\x1b[9;19H
The article below explains that these are ANSI escape codes that appear since I use invoke_shell. Previously everything worked until it moved to another server.
Is there a simple way to get rid of junk values that come when you SSH using Python's Paramiko library and fetch output from CLI of a remote machine?
When I write to the file, I also get:
rxmop:moty=rxotg;[61C
RADIO X-CEIVER ADMINISTRATION[50C
MANAGED OBJECT DATA[60C
[79C
MO[9;19HRSITE[9;55HCOMB FHOP MODEL[8C
RXOTG-58[10;19H54045_1800[10;55HHYB
If you use PuTTY everything is clear and beautiful.
I can't get away from invoke_shell because the connection is being thrown from one server to another.
Sample code below:
# coding:ascii
import paramiko
port = 22
data = ""
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(hostname=host, username=user, password=secret, port=port, timeout=10)
ssh = client.invoke_shell()
ssh.send("rxmop:moty=rxotg;\n")
while data.find("<") == -1:
time.sleep(0.1)
data += ssh.recv(99999).decode('ascii')
ssh.close()
client.close()
f = open('text.txt', 'w')
f.write(data)
f.close()
The normal output is below:
MO RSITE COMB FHOP MODEL
RXOTG-58 54045_1800 HYB BB G12
SWVERREPL SWVERDLD SWVERACT TMODE
B1314R081D TDM
CONFMD CONFACT TRACO ABISALLOC CLUSTERID SCGR
NODEL 4 POOL FLEXIBLE
DAMRCR CLTGINST CCCHCMD SWVERCHG
NORMAL UNLOCKED
PTA JBSDL PAL JBPTA
TGFID SIGDEL BSSWANTED PACKALG
H'0001-19B3 NORMAL
What can you recommend in order to return normal output, so that all characters are processed?
Regular expressions do not help, since the structure of the record is shifted, then characters from certain positions are selected in the code.
PS try to use ssh.invoke_shell (term='xterm') don't work.
There is an answer here:
How can I remove the ANSI escape sequences from a string in python
There are other ways...
https://unix.stackexchange.com/questions/14684/removing-control-chars-including-console-codes-colours-from-script-output
Essentially, you are 'screen-scraping' input, and you need to strip the ANSI codes. So, grab the input, and then strip the codes.
import re
... (your ssh connection here)
data = ""
while data.find("<") == -1:
time.sleep(0.1)
chunk = ssh.recv(99999)
data += chunk
... (your ssh connection cleanup here)
ansi_escape = re.compile(r'\x1B(?:[#-Z\\-_]|\[[0-?]*[ -/]*[#-~])')
data = ansi_escape.sub('', data)

How to return pointer string with ctypes in python 2.7

I am working on implementation of new fiscal device. And it is using OPOS / UPOS library for communication. I am very new to ctypes and have no experience with C at all. However, I have managed to make it work, mostly.
But I am having issues with returning a string from generalist method DirectIO. documentation says: "This command should be used immediately after EndFiscalReceipt() to retrieve unique ID of latest receipt"
" Parameters:
– Data [in]
Ignored.
– Obj [in/out]
Value to be read."
And adds .NET example under it:
int iData = 0;
string strReferenceID = "";
fiscalPrinter.EndFiscalReceipt();
fiscalPrinter.DirectIO(CMD_EKASA_RECEIPT_ID, ref iData, ref strReferenceID);
// strReferenceID will contain latest receipt ID, e.g. "O−7DBCDA8A56EE426DBCDA8A56EE426D1A"
the first parameter (CMD_EKASA_RECEIPT_ID) is the command executed, thats why its not listed above.
However, python is not .NET and I have never been working with .NET.
I have been following instructions in ctypes doku (https://docs.python.org/2.7/library/ctypes.html), defiend this methods arguments and return in init:
self.libc.DirectIO.argtypes = [ctypes.c_int32, ctypes.c_int32, ctypes.c_char_p]
self.libc.DirectIO.restype = ctypes.c_char_p
Than tried different ways to retrieve reply string, but neither of these does work in my case:
s = ""
c_s = ctypes.c_char_p(s)
result = self.send_command(CMD_EKASA_RECEIPT_ID, 0, c_s)
p = ctypes.create_string_buffer(40)
poin = ctypes.pointer(p)
result = self.send_command(CMD_EKASA_RECEIPT_ID, 0, poin)
p = ctypes.create_string_buffer(40)
result = self.send_command(CMD_EKASA_RECEIPT_ID, 0, p)
s = ctypes.create_string_buffer('\000' * 32)
result = self.send_command(CMD_EKASA_RECEIPT_ID, 0, s)
the string object I have created is allways empty, a.k.a. "" after caling the Cmethod, just like I have created it.
However, there is one more thing, that does not make sense to me. My colleague showed me, how you can see method arguments and return in header file. For this one, there is this:
int DirectIO(int iCommand, int* piData, const char* pccString);
Which means, it returns integer? If I am not mistaken.
so, what I am thinking is, that I have to pass to the method some pointer to a string, created in python, and C will change it, into what I should read. Thus, I think my way of thinking about solution is right.
I have also tried this approach, but that does not work for me either How to pass pointer back in ctypes?
and I am starting to feel desperate. Not sure if I understand the problem correctly and looking for a solution is right place.
I have solved my problem. The whole thing was, in allocating of memory. Every example on the net that I have readed did create empty string, like s = "". But, that is not correct!
When allocated empty string "" C library have had no memory where to write result.
this was almost correct approach,
self.libc = ctypes.cdll.LoadLibrary(LIB_PATH)
self.libc.DirectIO.argtypes = [ctypes.c_int32, ctypes.c_int32, ctypes.c_char_p]
result_s = ctypes.c_char_p("")
log.info('S address: %s | S.value: "%s"' % (result_s, result_s.value))
self.libc.DirectIO(CMD_EKASA_RECEIPT_ID, 0, result_s)
log.info('S address: %s | S.value: "%s"' % (result_s, result_s.value))
returns:
S address: c_char_p(140192115373700) | S.value: ""
S address: c_char_p(140192115373700) | S.value: ""
it needed just a small modification:
self.libc = ctypes.cdll.LoadLibrary(LIB_PATH)
self.libc.DirectIO.argtypes = [ctypes.c_int32, ctypes.c_int32, ctypes.c_char_p]
result_s = ctypes.c_char_p(" " * 10)
log.info('S address: %s | S.value: %s' % (result_s, result_s.value))
self.libc.DirectIO(CMD_EKASA_RECEIPT_ID, 0, result_s)
log.info('S address: %s | S.value: %s' % (result_s, result_s.value))
now, printing result_s after calling self.libc.DirectIO does return different string, than it was before call.
S address: c_char_p(140532072777764) | S.value: " "
S address: c_char_p(140532072777764) | S.value: "0-C12A22F5"
There is linux in the tag, but OPOS does not work on linux.
Or are you working in an emulation environment such as Wine?
In any case, if you don't have the right environment, you can get into trouble with a little bit of nothing.
First, work in a Windows 32-bit environment, create something that works there, and then port it to another environment.
Since OPOS uses OLE/COM technology, the first package to use is win32com or comtypes.
UnifiedPOS is a conceptual specification and there are no actual software components.
There are three types of software that actually run: OPOS, JavaPOS, and POS for.NET.
OPOS and POS for.NET only work in a Windows 32-bit environment.
Only JavaPOS can work in a Linux environment, and it is usually only available from Java.
If you want to make something in Python, you need to create a Wrapper (or glue) library that calls Java from Python.
If the C interface UnifiedPOS(OPOS) is running on Linux without using the Windows emulator or the Wrapper for Java, it may be an original library/component created by the printer vendor with reference to UnifiedPOS.
In that case, I think that the detailed specification can only be heard from the vendor who created it.
To explain, DirectIO method and DirectIOEvent are defined as method/event that vendors can freely define and use.
Therefore, only method/event names and parameters are defined in the UnifiedPOS document.
It is necessary to ask the vendor who provides the DirectIO method/DirectIOEvent what function the specific vendor's service object has, and it is up to the vendor to determine what the parameter means is.
The OPOS specification was absorbed by UnifiedPOS from the middle, but until then it existed as a single specification.
The rest of the name is here. MCS: OPOS Releases
This is the root of the return value of the method of your library being integer.
By the way, this is the latest UnifiedPOS specification for now.
Document -- retail/17-07-32 (UnifiedPOS Retail Peripheral Architecture, Version 1.14.1)

MafftCommandline and io.StringIO

I've been trying to use the Mafft alignment tool from Bio.Align.Applications. Currently, I've had success writing my sequence information out to temporary text files that are then read by MafftCommandline(). However, I'd like to avoid redundant steps as much as possible, so I've been trying to write to a memory file instead using io.StringIO(). This is where I've been having problems. I can't get MafftCommandline() to read internal files made by io.StringIO(). I've confirmed that the internal files are compatible with functions such as AlignIO.read(). The following is my test code:
from Bio.Align.Applications import MafftCommandline
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
import io
from Bio import AlignIO
sequences1 = ["AGGGGC",
"AGGGC",
"AGGGGGC",
"AGGAGC",
"AGGGGG"]
longest_length = max(len(s) for s in sequences1)
padded_sequences = [s.ljust(longest_length, '-') for s in sequences1] #padded sequences used to test compatibilty with AlignIO
ioSeq = ''
for items in padded_sequences:
ioSeq += '>unknown\n'
ioSeq += items + '\n'
newC = io.StringIO(ioSeq)
cLoc = str(newC).strip()
cLocEdit = cLoc[:len(cLoc)] #create string to remove < and >
test1Handle = AlignIO.read(newC, "fasta")
#test1HandleString = AlignIO.read(cLocEdit, "fasta") #fails to interpret cLocEdit string
records = (SeqRecord(Seq(s)) for s in padded_sequences)
SeqIO.write(records, "msa_example.fasta", "fasta")
test1Handle1 = AlignIO.read("msa_example.fasta", "fasta") #alignIO same for both #demonstrates working AlignIO
in_file = '.../msa_example.fasta'
mafft_exe = '/usr/local/bin/mafft'
mafft_cline = MafftCommandline(mafft_exe, input=in_file) #have to change file path
mafft_cline1 = MafftCommandline(mafft_exe, input=cLocEdit) #fails to read string (same as AlignIO)
mafft_cline2 = MafftCommandline(mafft_exe, input=newC)
stdout, stderr = mafft_cline()
print(stdout) #corresponds to MafftCommandline with input file
stdout1, stderr1 = mafft_cline1()
print(stdout1) #corresponds to MafftCommandline with internal file
I get the following error messages:
ApplicationError: Non-zero return code 2 from '/usr/local/bin/mafft <_io.StringIO object at 0x10f439798>', message "/bin/sh: -c: line 0: syntax error near unexpected token `newline'"
I believe this results due to the arrows ('<' and '>') present in the file path.
ApplicationError: Non-zero return code 1 from '/usr/local/bin/mafft "_io.StringIO object at 0x10f439af8"', message '/usr/local/bin/mafft: Cannot open _io.StringIO object at 0x10f439af8.'
Attempting to remove the arrows by converting the file path to a string and indexing resulted in the above error.
Ultimately my goal is to reduce computation time. I hope to accomplish this by calling internal memory instead of writing out to a separate text file. Any advice or feedback regarding my goal is much appreciated. Thanks in advance.
I can't get MafftCommandline() to read internal files made by
io.StringIO().
This is not surprising for a couple of reasons:
As you're aware, Biopython doesn't implement Mafft, it simply
provides a convenient interface to setup a call to mafft in
/usr/local/bin. The mafft executable runs as a separate process
that does not have access to your Python program's internal memory,
including your StringIO file.
The mafft program only works with an input file, it doesn't even
allow stdin as a data source. (Though it does allow stdout as a
data sink.) So ultimately, there must be a file in the file system
for mafft to open. Thus the need for your temporary file.
Perhaps tempfile.NamedTemporaryFile() or tempfile.mkstemp() might be a reasonable compromise.

Resources