Python3 - Sanitizing user input for shell use

I am writing a Python 3 script that requires user input; the input is used as parameters in commands passed to the shell.
The script is only intended to be used by trusted internal users; however, I'd rather have some contingencies in place to ensure the valid execution of commands.
Example 1:
import subprocess
user_input = '/tmp/file.txt'
subprocess.Popen(['cat', user_input])
This will output the contents of '/tmp/file.txt'
Example 2:
import subprocess
user_input = '/tmp/file.txt && rm -rf /'
subprocess.Popen(['cat', user_input])
Results in (as expected):
cat: /tmp/file.txt && rm -rf /: No such file or directory
Is this an acceptable method of sanitizing input? Is there anything else, per best practice, I should be doing in addition to this?

The approach you have chosen,
import subprocess
user_input = 'string'
subprocess.Popen(['command', user_input])
is quite good as command is static and user_input is passed as one single argument to command. As long as you don't do something really stupid like
subprocess.Popen(['bash', '-c', user_input])
you should be on the safe side.
For commands that require multiple arguments, I'd recommend that you request multiple inputs from the user, e.g. do this
subprocess.Popen(['cp', user_input1, user_input2])
instead of this
user_input="file1.txt file2.txt"
subprocess.Popen(['cp'] + user_input.split())
If you want to increase security further, you could:
explicitly set shell=False (to ensure you never run shell commands; this is already the current default, but defaults may change over time):
subprocess.Popen(['command', user_input], shell=False)
use absolute paths for command (to prevent injection of malicious executables via PATH):
subprocess.Popen(['/usr/bin/command', user_input])
explicitly instruct commands that support it to stop parsing options, e.g.
subprocess.Popen(['rm', '--', user_input1, user_input2])
do as much as you can natively, e.g. cat /tmp/file.txt could be accomplished with a few lines of Python code instead (which would also increase portability if that should be a factor)
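As a rough sketch of that last point, the cat example can be done without any child process at all. The helper name read_file is made up for this illustration, and the UTF-8 decoding is an assumption about the files involved:

```python
import sys

def read_file(path):
    """Return the contents of `path` without spawning `cat` (hypothetical helper)."""
    # No shell and no child process are involved, so shell metacharacters
    # in `path` are simply part of a (most likely nonexistent) file name.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return handle.read()

user_input = "/tmp/file.txt && rm -rf /"
try:
    sys.stdout.write(read_file(user_input))
except OSError as exc:
    # Same outcome as Example 2: the whole string is treated as one path.
    print(f"cannot read {user_input!r}: {exc}", file=sys.stderr)
```

Because nothing is ever handed to a shell, there is nothing to sanitize beyond deciding which paths the user is allowed to read.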


How to comment all the uncommented lines in a file using puppet module

I have a sshd_config configuration file which contains commented as well as uncommented lines. I want to comment all the uncommented lines in that file using puppet. Is there any optimal/simple way to do this? Or is there a way to run bash command (maybe sed to replace) via puppet? I am not sure that using bash command is a right approach.
It would be really helpful if someone could guide me with this. Thanks in advance!
Is there any optimal/simple way to do this?
There is no built-in resource type or well-known module that specifically ensures that non-blank lines of a file start with a # character.
Or is there a way to run bash command (maybe sed to replace) via puppet?
Yes, the Exec resource type. That's your best bet short of writing a custom resource type.
I am not sure that using bash command is a right approach.
In a general sense, it's not. Appropriate, specific resource types are better than Exec. But when you don't have a suitable one and can't be bothered to make one, Exec is available.
It might look like this:
# The file to work with, so that we don't have to repeat ourselves
$target_file = '/etc/ssh/sshd_config'

exec { "Comment uncommented ${target_file} lines":
  # Specifying the command in array form avoids complicated quoting or any
  # risk of Puppet word-splitting the command incorrectly
  command  => ['sed', '-i', '-e', '/^[[:space:]]*[^#]/ s/^/# /', $target_file],
  # If we didn't specify a search path then we would need to use fully-qualified
  # command names in 'command' above and 'onlyif' below
  path     => ['/bin', '/usr/bin', '/sbin', '/usr/sbin'],
  # The file needs to be modified only if it contains any non-blank, uncommented
  # lines. Testing that via an 'onlyif' ensures that Puppet will not
  # run 'sed' or (more importantly) report the file changed when it does
  # not initially contain any lines that need to be commented
  onlyif   => [['grep', '-q', '^[[:space:]]*[^#]', $target_file]],
  # This is the default provider for any target node where the rest of this
  # resource would work anyway. Specifying it explicitly will lead to a more
  # informative diagnostic if there is an attempt to apply this resource to
  # a system to which it is unsuited.
  provider => 'posix',
}
That does not rely on bash or any other shell to run the commands, but it does rely on sed and grep being available in one of the specified directories. In fact, it relies specifically on GNU sed or one that supports an -i option with the same semantics. Notably, that does not include BSD-style sed, such as you will find on macOS.
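If you want to preview what the transformation is meant to do before letting Puppet loose on a real file, here is a small Python sketch of the intent described in the comments above (comment every non-blank, uncommented line). It is not a byte-for-byte port of the sed expression, the function name is invented, and the sample content is made up:

```python
def comment_uncommented(text):
    """Prefix '# ' to every non-blank line that is not already a comment."""
    result = []
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        # Skip blank lines and lines whose first non-space character is '#'.
        if stripped and not stripped.startswith("#"):
            line = "# " + line
        result.append(line)
    return "".join(result)

# Invented sample content, not taken from a real sshd_config.
sample = "Port 22\n\n# PermitRootLogin no\n  PasswordAuthentication yes\n"
print(comment_uncommented(sample), end="")
```

Running the function a second time on its own output changes nothing, which is the same idempotence the onlyif check is there to provide.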

Can I pass a variable from python to bash file?

I have a bash file with a bunch of sed commands like this :
sed -i 's/hello my name is Thibault/hello my name is Louis/g' "$1"
So for now I'm doing all of this "by hand". However, I have a Python script with a tkinter GUI and several input fields for the user. I would like to find a trick so that if the user inputs "hello my name is Olivia" in the text field, the sed command would look like this:
sed -i 's/hello my name is Thibault/hello my name is Olivia/g' "$1"
So I was thinking that I could store the Python text input result in a variable, so that the sed command looks like this:
sed -i 's/hello my name is Thibault/$my_variable/g' "$1"
but I don't know how, or whether this is even possible. Lastly, I want to mention that I know I could just ask for the user input in the bash script, but this is for my first internship and I have to go through the Python GUI.
Edit: I'm on Windows 10, if that matters.
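Try it like this:
import os
original_text = 'hello my name is Thibault'
new_text = 'hello my name is Louis'
filename = 'test.txt'
# Note: the strings are pasted into the shell command unescaped, so they must
# not contain quotes, slashes or other characters special to sed or the shell.
os.system(f'sed -i "s/{original_text}/{new_text}/g" {filename}')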
To pass data (in your case, a string) from your Python program to a subprocess running a bash script, you have the same options as when calling one bash script from another. One is to design the called script to expect positional parameters (use it as $1, for example) and pass the string as a parameter. For instance, if the string is stored in the Python variable parameter, it would look like:
import subprocess
subprocess.run(['bash', './script_to_be_called', parameter])
The other possibility is to design the bash script so that it expects the string to be stored in a variable of a certain name (use it as $PARSTRING, for instance) and pass the data via the environment:
import os
import subprocess
os.environ['PARSTRING'] = parameter
subprocess.run(['bash', './script_to_be_called'])
If the "script" executes only a single command, you could alternatively synthesize the command line in your Python program. Assuming that you have a string bashcommand which already holds the complete command to be executed by bash, you could do:
import subprocess
subprocess.run(['bash', '-c', bashcommand])
While this should answer your question, I can't help pointing out that for executing a single external command, I would not create a shell process but would invoke the program directly as a child process. Also, don't forget that spawning a child process takes time; if you have many such invocations, it might make sense to redesign your approach, for instance by doing everything inside Python, or by having only one child process which gets as input the data for all the substitutions to be performed (typically via a file).
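For the "doing everything inside Python" route, a minimal sketch could look like the following. The function name replace_in_file is invented for the example, and it assumes the file is UTF-8 text that fits comfortably in memory:

```python
from pathlib import Path

def replace_in_file(filename, original_text, new_text):
    """Replace every literal occurrence of original_text in the file, in place."""
    path = Path(filename)
    content = path.read_text(encoding="utf-8")
    # str.replace treats both strings literally, so characters that are
    # special to sed or the shell (/, &, quotes, $) need no escaping.
    path.write_text(content.replace(original_text, new_text), encoding="utf-8")
```

From the GUI you would then call something like replace_in_file('test.txt', 'hello my name is Thibault', text_from_the_input_field), where the last argument is whatever string you read from your tkinter widget. This also sidesteps the question of whether sed and bash are installed on a Windows 10 machine.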

Is there a way to know how the user invoked a program from bash?

Here's the problem: I have this script, and if the user invokes it without the --bar option, I'd like to display the following error message:
Please add the --bar option to your command, like so:
python --bar
Now, the tricky part is that there are several ways the user might have invoked the command:
They may have used python like in the example
They may have used /usr/bin/
They may have a shell alias frob='python', and actually ran frob
Maybe it's even a git alias flab=!/usr/bin/, and they used git flab
In every case, I'd like the message to reflect how the user invoked the command, so that the example I'm providing would make sense.
sys.argv always contains, and /proc/$$/cmdline doesn't know about aliases. It seems to me that the only possible source for this information would be bash itself, but I don't know how to ask it.
Any ideas?
UPDATE How about if we limit possible scenarios to only those listed above?
UPDATE 2: Plenty of people wrote very good explanations of why this is not possible in the general case, so I would like to limit my question to this:
Under the following assumptions:
The script was started interactively, from bash
The script was started in one of these 3 ways:
1. foo <args> where foo is a symbolic link /usr/bin/foo ->
2. git foo where alias.foo=!/usr/bin/foo in ~/.gitconfig
3. git baz where alias.baz=!/usr/bin/foo in ~/.gitconfig
Is there a way to distinguish between 1 and (2,3) from within the script? Is there a way to distinguish between 2 and 3 from within the script?
I know this is a long shot, so I'm accepting Charles Duffy's answer for now.
UPDATE 3: So far, the most promising angle was suggested by Charles Duffy in the comments below. If I can get my users to have
trap 'export LAST_BASH_COMMAND=$(history 1)' DEBUG
in their .bashrc, then I can use something like this in my code:
import os
like_so = None
# .get() returns None instead of raising KeyError when the variable is unset
cmd = os.environ.get('LAST_BASH_COMMAND')
if cmd is not None:
    cmd = cmd[8:] # Remove the history counter
    if cmd.startswith("foo "):
        like_so = "foo --bar " + cmd[4:]
    elif cmd.startswith("git foo "):
        like_so = "git foo --bar " + cmd[8:]
    elif cmd.startswith("git baz "):
        like_so = "git baz --bar " + cmd[8:]
if like_so is not None:
    print("Please add the --bar option to your command, like so:")
    print(" " + like_so)
else:
    print("Please add the --bar option to your command.")
This way, I show the general message if I don't manage to get their invocation method. Of course, if I'm going to rely on changing my users' environment I might as well ensure that the various aliases export their own environment variables that I can look at, but at least this way allows me to use the same technique for any other script I might add later.
No, there is no way to see the original text (before aliases/functions/etc).
Starting a program in UNIX is done as follows at the underlying syscall level:
int execve(const char *path, char *const argv[], char *const envp[]);
Notably, there are three arguments:
The path to the executable
An argv array (the first item of which -- argv[0] or $0 -- is passed to that executable to reflect the name under which it was started)
A list of environment variables
Nowhere in here is there a string that provides the original user-entered shell command from which the new process's invocation was requested. This is particularly true since not all programs are started from a shell at all; consider the case where your program is started from another Python script with shell=False.
It's completely conventional on UNIX to assume that your program was started through whatever name is given in argv[0]; this works for symlinks.
You can even see standard UNIX tools doing this:
$ ls '*.txt' # sample command to generate an error message; note "ls:" at the front
ls: *.txt: No such file or directory
$ (exec -a foobar ls '*.txt') # again, but tell it that its name is "foobar"
foobar: *.txt: No such file or directory
$ alias somesuch=ls # this **doesn't** happen with an alias
$ somesuch '*.txt' # ...the program still sees its real name, not the alias!
ls: *.txt: No such file or directory
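In a Python script the same convention amounts to building the hint from sys.argv. The sketch below is only an illustration: the helper name suggest_with_bar is invented, and it can show no more than what the process was given in argv, so the interpreter name, aliases and git aliases remain invisible to it:

```python
import shlex
import sys

def suggest_with_bar(argv):
    """Return a copy-pasteable command line with --bar inserted after the program name."""
    program, *args = argv
    # shlex.quote keeps arguments containing spaces or metacharacters intact.
    return " ".join(shlex.quote(part) for part in [program, "--bar", *args])

if "--bar" not in sys.argv[1:]:
    print("Please add the --bar option to your command, like so:")
    print("    " + suggest_with_bar(sys.argv))
```

If the script is reached through a symlink named foo, sys.argv[0] is the symlink's path, so the message follows the name the user typed in that case.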
If you do want to generate a UNIX command line, use pipes.quote() (Python 2) or shlex.quote() (Python 3) to do it safely.
try:
    from pipes import quote # Python 2
except ImportError:
    from shlex import quote # Python 3
cmd = ' '.join(quote(s) for s in open('/proc/self/cmdline', 'r').read().split('\0')[:-1])
print("We were called as: {}".format(cmd))
Again, this won't "un-expand" aliases, revert to the code that was invoked to call a function that invoked your command, etc; there is no un-ringing that bell.
That can be used to look for a git instance in your parent process tree, and discover its argument list:
import re
# quote() is pipes.quote / shlex.quote, imported as in the previous snippet
def find_cmdline(pid):
    return open('/proc/%d/cmdline' % (pid,), 'r').read().split('\0')[:-1]
def find_ppid(pid):
    stat_data = open('/proc/%d/stat' % (pid,), 'r').read()
    # The process name field is parenthesized and may contain spaces; mask it
    stat_data_sanitized = re.sub('[(]([^)]+)[)]', '_', stat_data)
    return int(stat_data_sanitized.split(' ')[3])
def all_parent_cmdlines(pid):
    while pid > 0:
        yield find_cmdline(pid)
        pid = find_ppid(pid)
def find_git_parent(pid):
    for cmdline in all_parent_cmdlines(pid):
        if cmdline[0] == 'git':
            return ' '.join(quote(s) for s in cmdline)
    return None
See the Note at the bottom regarding the originally proposed wrapper script.
A new more flexible approach is for the python script to provide a new command line option, permitting users to specify a custom string they would prefer to see in error messages.
For example, if a user prefers to call the python script '' via an alias, they can change the alias definition from this:
alias myAlias=' $@'
to this:
alias myAlias=' --caller=myAlias $@'
If they prefer to call the python script from a shell script, it can use the additional command line option like so:
exec "$@" --caller=${0##*/}
Other possible applications of this approach:
bash -c --caller="bash -c"
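On the Python side, supporting such an option could be sketched roughly as follows. The option names --bar and --caller come from this thread; the function names and the 'fallback-name' placeholder are invented for the example:

```python
import argparse

def build_parser():
    """Parser with the real --bar option plus the proposed --caller option."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--bar", action="store_true")
    # --caller lets an alias or wrapper say how the user actually invoked us.
    parser.add_argument("--caller", default=None)
    return parser

def missing_bar_message(args, argv0):
    """Build the hint, preferring the caller-supplied string over argv[0]."""
    shown = args.caller or argv0
    return ("Please add the --bar option to your command, like so:\n"
            "    " + shown + " --bar")

# Simulate what the script would see when run through an alias that adds
# --caller=frob; 'fallback-name' stands in for sys.argv[0].
args = build_parser().parse_args(["--caller=frob"])
if not args.bar:
    print(missing_bar_message(args, "fallback-name"))
```

In a real script you would call parse_args() without arguments and pass sys.argv[0] as the fallback, so users who do not bother with --caller still get a sensible message.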
For listing expanded command lines, here's a script '', based on feedback by @CharlesDuffy, that lists cmdline for the running python script, as well as the parent process that spawned it.
If the new --caller argument is used, it will appear in the command line, although aliases will have been expanded, etc.
#!/usr/bin/env python
import os, re
# Fields of /proc/self/stat: field 0 is our pid, field 3 is the parent pid
with open("/proc/self/stat", "r") as myfile:
    data = [x.strip() for x in str.split(myfile.readlines()[0], ' ')]
pid = data[0]
ppid = data[3]
def commandLine(pid):
    # cmdline is NUL-separated and NUL-terminated, hence the [0:-1]
    with open("/proc/" + pid + "/cmdline", "r") as myfile:
        return [x.strip() for x in str.split(myfile.readlines()[0], '\x00')][0:-1]
pid_cmdline = commandLine(pid)
ppid_cmdline = commandLine(ppid)
print("%r" % pid_cmdline)
print("%r" % ppid_cmdline)
After saving this to a file named '', and then calling it from a bash script called '' with various arguments, here's the output:
$ ./ a b "c d" e
['python', './']
['/bin/bash', './', 'a', 'b', 'c d', 'e']
NOTE: criticisms of the original wrapper script were valid. Although the existence of a pre-defined alias is part of the specification of the question, and may be presumed to exist in the user environment, the proposal defined the alias (creating the misleading impression that it was part of the recommendation rather than a specified part of the user's environment), and it didn't show how the wrapper would communicate with the called python script. In practice, the user would either have to source the wrapper or define the alias within the wrapper, and the python script would have to delegate the printing of error messages to multiple custom calling scripts (where the calling information resided), and clients would have to call the wrapper scripts. Solving those problems led to a simpler approach, that is expandable to any number of additional use cases.
Here's a less confusing version of the original script, for reference:
shopt -s expand_aliases
alias myAlias=''
# called like this:
set -o history
myAlias $@
CALL_HISTORY=( `history` )
case "$_EXITCODE" in
0) # no error message required
echo "customized error message #1 [$_CALLING_MODE]" 1>&2
echo "customized error message #2 [$_CALLING_MODE]" 1>&2
Here's the output:
$ 1 2 3
['./', '1', '2', '3']
customized error message #2 [myAlias]
There is no way to distinguish between when an interpreter for a script is explicitly specified on the command line and when it is deduced by the OS from the hashbang line.
$ cat
#!/usr/bin/env bash
ps -o command $$
$ bash ./
bash ./
$ ./
bash ./
This prevents you from detecting the difference between the first two cases in your list.
I am also confident that there is no reasonable way of identifying the other (mediated) ways of calling a command.
I can see two ways to do this:
The simplest, as suggested by 3sky, would be to parse the command line from inside the python script. argparse can be used to do so in a reliable way. This only works if you can change that script.
A more complex way, slightly more generic and involved, would be to change the python executable on your system.
Since the first option is well documented, here are a bit more details on the second one:
Regardless of the way your script is called, python is run. The goal here is to replace the python executable with a script that checks if is among the arguments, and if it is, checks whether --bar is as well. If not, print the message and return.
In every other case, execute the real python executable.
Now, hopefully, running python is done through the following shebang: #!/usr/bin/env python3, or through python, rather than a variant of #!/usr/bin/python or /usr/bin/python. That way, you can change the $PATH variable and prepend a directory where your false python resides.
In the other case, you can replace the /usr/bin/python executable, at the risk of not playing nice with updates.
A more complex way of doing this would probably be with namespaces and mounts, but the above is probably enough, especially if you have admin rights.
Example of what could work as a script:
#!/usr/bin/env bash
function checkbar
{
    for i in "$@"; do
        if [ "$i" = "--bar" ]; then
            echo "Well done, you added --bar!"
            return 0
        fi
    done
    return 1
}
command=$(basename ${1:-none})
if [ $command = "" ]; then
    if ! checkbar "$@"; then
        echo "Please add --bar to the command line, like so:"
        printf "%q " $0
        printf "%q " "$@"
        printf -- "--bar\n"
        exit 1
    fi
fi
/path/to/real/python "$@"
However, after re-reading your question, it is obvious that I misunderstood it. In my opinion, it is all right to just print either " must be called like --bar", "please add bar to your arguments" or "please try (instead of )", regardless of what the user entered:
If that's an (git) alias, this is a one time error, and the user will try their alias after creating it, so they know where to put the --bar part
with either with /usr/bin/ or python
If the user is not really command line-savvy, they can just paste the working command that is displayed, even if they don't know the difference
If they are, they should be able to understand the message without trouble, and adjust their command line.
I know it's a bash task, but I think the easiest way is to modify ''. Of course, it depends on how complicated the script is, but maybe it will fit. Here is sample code:
import sys
if len(sys.argv) > 1 and sys.argv[1] == '--bar':
    print('make magic')
else:
    print('Please add the --bar option to your command, like so:')
    print(' python --bar')
In this case, it does not matter how the user runs this code.
$ ./
Please add the --bar option to your command, like so:
python --bar
$ ./ -dua
Please add the --bar option to your command, like so:
python --bar
$ ./ --bar
make magic
$ python --t
Please add the --bar option to your command, like so:
python --bar
$ /home/3sky/test/
Please add the --bar option to your command, like so:
python --bar
$ alias a='python'
$ a
Please add the --bar option to your command, like so:
python --bar
$ a --bar
make magic

How to get the complete calling command of a BASH script from inside the script (not just the arguments)

I have a BASH script that has a long set of arguments and two ways of calling it:
my_script --option1 value --option2 value ... etc
my_script val1 val2 val3 ..... valn
This script in turn compiles and runs a large FORTRAN code suite that eventually produces a netcdf file as output. I already have all the metadata in the netcdf output global attributes, but it would be really nice to also include the full run command one used to create that experiment. Thus another user who receives the netcdf file could simply reenter the run command to rerun the experiment, without having to piece together all the options.
So that is a long way of saying, in my BASH script, how do I get the last command entered from the parent shell and put it in a variable? i.e. the script is asking "how was I called?"
I could try to piece it together from the option list, but the very long option list and two interface methods would make this long and arduous, and I am sure there is a simple way.
I found this helpful page:
BASH: echoing the last command run
but this only seems to work to get the last command executed within the script itself. The asker also refers to use of history, but the answers seem to imply that the history will only contain the command after the programme has completed.
Many thanks if any of you have any idea.
You can try the following:
myInvocation="$(printf %q "$BASH_SOURCE")$((($#)) && printf ' %q' "$@")"
$BASH_SOURCE refers to the running script (as invoked), and $@ is the array of arguments; (($#)) && ensures that the following printf command is only executed if at least 1 argument was passed; printf %q is explained below.
While this won't always be a verbatim copy of your command line, it'll be equivalent - the string you get is reusable as a shell command.
chepner points out in a comment that this approach will only capture what the original arguments were ultimately expanded to:
For instance, if the original command was my_script $USER "$(date +%s)", $myInvocation will not reflect these arguments as-is, but will rather contain what the shell expanded them to; e.g., my_script jdoe 1460644812
chepner also points out that getting the actual raw command line as received by the parent process will be (next to) impossible. Do tell me if you know of a way.
However, if you're prepared to ask users to do extra work when invoking your script or you can get them to invoke your script through an alias you define - which is obviously tricky - there is a solution; see bottom.
Note that use of printf %q is crucial to preserving the boundaries between arguments - if your original arguments had embedded spaces, something like $0 $* would result in a different command.
printf %q also protects against other shell metacharacters (e.g., |) embedded in arguments.
printf %q quotes the given argument for reuse as a single argument in a shell command, applying the necessary quoting; e.g.:
$ printf %q 'a |b'
a\ \|b
a\ \|b is equivalent to single-quoted string 'a |b' from the shell's perspective, but this example shows how the resulting representation is not necessarily the same as the input representation.
Incidentally, ksh and zsh also support printf %q, and ksh actually outputs 'a |b' in this case.
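The same boundary-preserving idea exists outside the shell. As a side illustration only (it is not part of the bash solution), Python's shlex.join, available from Python 3.8, plays the role of printf %q; the argument list below is made up:

```python
import shlex

# Invented argument list; one argument contains a space and a pipe character.
argv = ["my_script", "--option1", "a |b", "--option2", "plain"]

naive = " ".join(argv)      # comparable to echoing $0 $*
quoted = shlex.join(argv)   # comparable to applying printf %q to each argument

print(naive)
print(quoted)

# Only the quoted form splits back into the original arguments.
assert shlex.split(quoted) == argv
assert shlex.split(naive) != argv
```

As with printf %q, the quoted string is equivalent to the original command line, but it is not necessarily written the way the user typed it.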
If you're prepared to modify how your script is invoked, you can pass $BASH_COMMAND as an extra argument: $BASH_COMMAND contains the raw[1] command line of the currently executing command.
For simplicity of processing inside the script, pass it as the first argument (note that the double quotes are required to preserve the value as a single argument):
my_script "$BASH_COMMAND" --option1 value --option2
Inside your script:
# The *first* argument is what "$BASH_COMMAND" expanded to,
# i.e., the entire (alias-expanded) command line.
myInvocation=$1 # Save the command line in a variable...
shift # ... and remove it from "$@".
# Now process "$@", as you normally would.
Unfortunately, there are only two options when it comes to ensuring that your script is invoked this way, and they're both suboptimal:
The end user has to invoke the script this way - which is obviously tricky and fragile (you could however, check in your script whether the first argument contains the script name and error out, if not).
Alternatively, provide an alias that wraps the passing of $BASH_COMMAND as follows:
alias my_script='/path/to/my_script "$BASH_COMMAND"'
The tricky part is that this alias must be defined in all end users' shell initialization files to ensure that it's available.
Also, inside your script, you'd have to do extra work to re-transform the alias-expanded version of the command line into its aliased form:
# The *first* argument is what "$BASH_COMMAND" expanded to,
# i.e., the entire (alias-expanded) command line.
# Here we also re-transform the alias-expanded command line to
# its original aliased form, by replacing everything up to and including
# "$BASH_COMMAND" with the alias name.
myInvocation=$(sed 's/^.* "\$BASH_COMMAND"/my_script/' <<<"$1")
shift # Remove the first argument from "$@".
# Now process "$@", as you normally would.
Sadly, wrapping the invocation via a script or function is not an option, because the $BASH_COMMAND truly only ever reports the current command's command line, which in the case of a script or function wrapper would be the line inside that wrapper.
[1] The only thing that gets expanded are aliases, so if you invoked your script via an alias, you'll still see the underlying script in $BASH_COMMAND, but that's generally desirable, given that aliases are user-specific.
All other arguments and even input/output redirections, including process substitutions <(...), are reflected as-is.
"$0" contains the script's name, "$@" contains the parameters.
Do you mean something like echo $0 $*?

How to prevent execution of command in ZSH?

I wrote a hook for the command line:
# Transforms command 'ls?' to 'man ls'
function question_to_man() {
    if [[ $2 =~ '^\w+\?$' ]]; then
        man ${2[0,-2]}
    fi
}
autoload -Uz add-zsh-hook
add-zsh-hook preexec question_to_man
But when I do:
> ls?
After exiting from man I get:
> zsh: no matches found: ls?
How can I get rid of the message about the wrong command?
? is special to zsh and is the wildcard for a single character. That means that if you type ls?, zsh tries to find matching file names in the current directory (any three-letter name starting with "ls").
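If the wildcard rule is unfamiliar, Python's fnmatch module follows the same convention for ? and can be used to see which names a pattern like ls? would pick up. This is only an illustration of the single-character wildcard, not of zsh's full globbing, and the file names are invented:

```python
import fnmatch

# Invented directory listing.
names = ["ls", "lsx", "ls1", "lsof"]

# '?' stands for exactly one character, so only three-letter names match.
matches = fnmatch.filter(names, "ls?")
print(matches)  # ['lsx', 'ls1']
```

With no such file in the current directory the list is empty, which is the situation in which zsh prints "no matches found".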
There are two ways to work around that:
1. You can make "?" "unspecial" by quoting it: ls\?, 'ls?' or "ls?".
2. You make zsh handle the cases where it does not match better:
The default behaviour if no match can be found is to print an error. This can be changed by disabling the NOMATCH option (also NULL_GLOB must not be set):
setopt NO_NOMATCH
setopt NO_NULL_GLOB
This will leave the word untouched, if there is no matching file.
Caution: In the (maybe unlikely) case that there is a file with a matching name, zsh will try to execute a command with the name of the first matching file. That is if there is a file named "lsx", then ls? will be replaced by lsx and zsh will try to run it. This may or may not fail, but will most likely not be the desired effect.
Both methods have their pros and cons. 1. is probably not exactly what you are looking for, and 2. does not work every time and also changes your shell's behaviour.
Also (as @chepner noted in his comment), preexec runs in addition to a command, not instead of it. That means you may get the help for ls, but zsh will still try to run ls? or even lsx (or another matching name).
To avoid that, I would suggest defining a command_not_found_handler function instead of preexec. From the zsh manual:
If no external command is found but a function command_not_found_handler exists the shell executes this function with all command line arguments. The function should return status zero if it successfully handled the command, or non-zero status if it failed. In the latter case the standard handling is applied: ‘command not found’ is printed to standard error and the shell exits with status 127. Note that the handler is executed in a subshell forked to execute an external command, hence changes to directories, shell parameters, etc. have no effect on the main shell.
So this should do the trick:
command_not_found_handler () {
    if [[ $1 =~ '\?$' ]]; then
        man ${1%\?}
        return 0
    fi
    return 1
}
If you have a lot of matching file names but seldom mistype commands (the usual reason for "Command not found" errors), you might want to consider using this instead:
command_not_found_handler () {
    man ${1%?}
}
This does not check for "?" at the end, but just cuts away any last character (note the missing "\" in ${1%?}) and tries to run man on the rest. So even if a file name matches, man will be run unless there is indeed a command with the same name as the matched file.
Note: This will interfere with other tools using command_not_found_handler for example the command-not-found tool from Ubuntu (if enabled for zsh).
That all being said, zsh has a widget called run-help which can be bound to a key (in Emacs mode it is by default bound to Alt+H) and then runs man for the current command.
The main advantages of using run-help over the above are:
You can call it any time while typing a longer command, as long as the command name is complete.
After you leave the manpage, the command is still there unchanged, so you can continue writing on it.
You can even bind it to Alt+? to make it more similar: bindkey '^[?' run-help
