Python subprocess (shell=True), not working for postgres command - linux

Using the command line, I can confirm that the following command executes correctly:
echo '\c mydatabase;\i db-reset.sql' | psql -U postgres -h localhost
However, in Python, the following lines do absolutely nothing and return a status code of 0:
import subprocess
code = subprocess.call(r"echo '\c mydatabase;\i db-reset.sql' | psql -U postgres -h localhost", shell=True)
assert code == 0 # This comes to true
Essentially, why is the command invoked using subprocess not actually doing anything?

It works, but you need more backslashes.
Also, I would recommend that you don't use shell=True here.
This does the same thing without a shell:
p = subprocess.Popen(['psql', '-U', 'postgres', '-h', 'localhost'], shell=False, stdin=subprocess.PIPE)
# communicate() expects bytes for a pipe opened in binary mode (Python 3)
p.communicate(br"\c mydatabase;\i db-reset.sql")
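One plausible explanation for the "more backslashes" remark, offered as an assumption rather than something confirmed in the question: with shell=True, subprocess runs the command through /bin/sh, and on systems where /bin/sh is dash (typical on Debian and Ubuntu) the builtin echo interprets backslash escapes, with \c meaning "stop output here". printf with a %s format leaves the backslashes alone. A minimal sketch to compare the two on your own machine (shell_output is a hypothetical helper name):

```python
import subprocess

def shell_output(command):
    """Run command through /bin/sh (what shell=True uses on POSIX) and return its stdout."""
    return subprocess.check_output(command, shell=True).decode()

# echo's handling of backslashes differs between shells, so this result
# depends on which shell /bin/sh is on the machine.
echo_result = shell_output(r"echo '\c mydatabase'")

# printf with a %s format does not interpret backslashes in its arguments.
printf_result = shell_output(r"printf '%s\n' '\c mydatabase'")

print(repr(echo_result))
print(repr(printf_result))
```

If the first line printed is an empty string, psql never received the commands, which would match a silent run with exit status 0.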

Igor has the right approach without a doubt - though it'd be a good idea to close the session afterwards. However, there's a bigger picture issue here, which is that you should not generally be invoking psql to communicate with PostgreSQL from Python.
Use the psycopg2 module, which is widespread and available almost everywhere, to talk to PostgreSQL directly. This will immensely simplify your database communications.
For cases where you actually need psql, like running scripts, please use psql -f and a database argument. Your command in this case should be:
try:
    subprocess.check_call([
        'psql', '-q',
        '-U', 'postgres',
        '-h', 'localhost',
        '-f', 'db-reset.sql',
        'mydatabase'
    ])
except subprocess.CalledProcessError as ex:
    print("Failed to invoke psql: {0}".format(ex))
... or even better, use check_output if you're on a new enough Python version, so you capture error output too. Note the -q (quiet mode) flag, too.
(Note that subprocess builds and quotes the command line itself on platforms like Windows, which have no sensible execv-style system calls or equivalents. So when you pass a list of arguments, you don't need to care about painful shell escaping quirks.)
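A rough sketch of the check_output variant mentioned above, with stderr merged into the captured output. The helper name run_and_capture is made up for illustration, and a small Python child stands in for psql so the example runs without a database:

```python
import subprocess
import sys

def run_and_capture(command):
    """Return (ok, output), with stderr merged into the captured output."""
    try:
        output = subprocess.check_output(command, stderr=subprocess.STDOUT)
        return True, output.decode()
    except subprocess.CalledProcessError as ex:
        # ex.output holds whatever the command printed before failing.
        return False, ex.output.decode()

# For the real call, the command would be the psql argument list shown above.
# Here a small Python child stands in for psql so the sketch runs anywhere.
failing = [sys.executable, "-c",
           "import sys; sys.stderr.write('boom\\n'); sys.exit(3)"]
ok, output = run_and_capture(failing)
print(ok, output)
```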

Related

Python3 - Sanitizing user input for shell use

I am busy writing a Python3 script which requires user input, the input is used as parameters in commands passed to the shell.
The script is only intended to be used by trusted internal users - however I'd rather have some contingencies in place to ensure the valid execution of commands.
Example 1:
import subprocess
user_input = '/tmp/file.txt'
subprocess.Popen(['cat', user_input])
This will output the contents of '/tmp/file.txt'
Example 2:
import subprocess
user_input = '/tmp/file.txt && rm -rf /'
subprocess.Popen(['cat', user_input])
Results in (as expected):
cat: /tmp/file.txt && rm -rf /: No such file or directory
Is this an acceptable method of sanitizing input? Is there anything else, per best practice, I should be doing in addition to this?
The approach you have chosen,
import subprocess
user_input = 'string'
subprocess.Popen(['command', user_input])
is quite good as command is static and user_input is passed as one single argument to command. As long as you don't do something really stupid like
subprocess.Popen(['bash', '-c', user_input])
you should be on the safe side.
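To see why the list form is safe, it can help to look at what the child process actually receives. The sketch below uses a small Python child instead of cat (argv_seen_by_child is a hypothetical helper) and shows the whole user string arriving as a single argument, with no shell involved to interpret && or spaces:

```python
import subprocess
import sys

def argv_seen_by_child(user_input):
    """Start a child process and report the arguments it actually received."""
    child_code = "import sys; print(repr(sys.argv[1:]))"
    output = subprocess.check_output([sys.executable, "-c", child_code, user_input])
    return output.decode().strip()

print(argv_seen_by_child("/tmp/file.txt && rm -rf /"))
```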
For commands that require multiple arguments, I'd recommend that you request multiple inputs from the user, e.g. do this
user_input1='file1.txt'
user_input2='file2.txt'
subprocess.Popen(['cp', user_input1, user_input2])
instead of this
user_input="file1.txt file2.txt"
subprocess.Popen(['cp'] + user_input.split())
If you want to increase security further, you could:
explicitly set shell=False (to ensure you never run shell commands; this is already the current default, but defaults may change over time):
subprocess.Popen(['command', user_input], shell=False)
use absolute paths for command (to prevent injection of malicious executables via PATH):
subprocess.Popen(['/usr/bin/command', user_input])
explicitly instruct commands that support it to stop parsing options, e.g.
subprocess.Popen(['rm', '--', user_input1, user_input2])
do as much as you can natively, e.g. cat /tmp/file.txt could be accomplished with a few lines of Python code instead (which would also increase portability if that should be a factor)
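As an illustration of the last point, here is a minimal sketch of a cat replacement in plain Python. The function name print_file and its error message are assumptions for this example, not part of any library:

```python
import shutil
import sys

def print_file(path, out=None):
    """Copy the file at path to out (stdout by default), like cat."""
    out = out if out is not None else sys.stdout
    try:
        with open(path, "r") as handle:
            shutil.copyfileobj(handle, out)
        return True
    except OSError as ex:
        sys.stderr.write("cannot read {0}: {1}\n".format(path, ex))
        return False
```

Because no external command runs at all, a string such as '/tmp/file.txt && rm -rf /' is only ever treated as a file name that fails to open.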

How to call and pipe multiple postgres commands from python

In order to copy a file-like object to a postgres database, I take the following steps:
~$ sudo psql -U postgres
password for root:
password for user postgres:
postgres=# \c migration_v0
You are now connected to database "migration_v0" as user "postgres".
migration_v0=# cat file.csv | \copy table1 from stdin csv
I want to take the exact same steps, but from within Python and want to pass a StringIO buffer instead of a literal file. My first attempt consisted of the following steps:
# test.py
fmt = r"copy table1 FROM stdin csv"
sql = fmt.format(string_io)
psql = ['psql', '-U', 'postgres', '-c', sql]
output = subprocess.check_output(psql)
print(output)
The command is executed (a prompt pops up to type the password for the user postgres) but I get the following error:
ERROR: relation "table1" does not exist
This happens because I am currently trying to execute \copy on the default database postgres instead of migration_v0. Thus, I want to include both commands in the subprocess call (\c migration_v0 and \copy ...), and I don't know how to do this, since psql's -c flag takes only a single command.
I looked up a workaround and came across this command line example:
\c migration_v0 \\ \copy ... | psql -U postgres
but I have no idea how to port this to Python code.
Any suggestions on how I can pull this off?
Edit 1
I realized the flag -d also enables switching databases so now I don't need to run multiple commands. My code now looks like this:
p = subprocess.Popen([
'psql', '-U', 'postgres',
'-d', 'migration_v0',
'-c', '\copy table1 FROM stdin csv'],
shell=False,
stdin=string_io)
but I get the following error:
io.UnsupportedOperation: fileno
Apparently StringIO doesn't implement fileno. At this point I'm wondering if it's even possible to achieve what I want through a subprocess call.
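The stdin argument of Popen needs an object backed by a real file descriptor, which an in-memory StringIO does not have. One possible way around this, sketched here under the assumption that the buffer fits comfortably in memory, is to open a pipe with stdin=subprocess.PIPE and hand the buffer's contents to communicate(). The helper name pipe_buffer is hypothetical, the psql argument list is copied from the question and has not been run against a database, and a small Python child stands in for psql so the example is runnable:

```python
import io
import subprocess
import sys

def pipe_buffer(command, buffer):
    """Run command, send the StringIO contents to its stdin, return its stdout."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # communicate() writes the bytes to the pipe, closes it, and waits for exit.
    stdout, _ = proc.communicate(buffer.getvalue().encode("utf-8"))
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, command)
    return stdout.decode("utf-8")

# The psql invocation from the question (raw string keeps the backslash intact).
psql_command = ["psql", "-U", "postgres", "-d", "migration_v0",
                "-c", r"\copy table1 FROM stdin csv"]

# Stand-in child that copies stdin to stdout, so the sketch runs without psql.
echo_command = [sys.executable, "-c",
                "import sys; sys.stdout.write(sys.stdin.read())"]
print(pipe_buffer(echo_command, io.StringIO("1,alice\n2,bob\n")))
```

With a database available, pipe_buffer(psql_command, string_io) would be the call to try.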

Curl execution in python failed

I want to download files for around 300 items. An example is below:
curl 'http://genome.jgi.doe.gov/ext-api/downloads/get-directory?organism=Absrep1' -b cookies > Absrep1.xml
This opens the page, downloads the content, and stores it as an XML file on my end.
I tried to write a batch script in Perl using the system command, like
system('curl 'http://genome.jgi.doe.gov/ext-api/downloads/get-directory?organism=Absrep1'
-b cookies > Absrep1.xml');
But it did not work. There was a syntax error, which I guess is due to the single quotes.
I also tried with Python:
import subprocess
bash_com = 'curl "http://genome.jgi.doe.gov/ext-api/downloads/get-directory?organism=Absrep1" '
subprocess.Popen(bash_com)
output = subprocess.check_output(['bash','-c', bash_com])
It did not work. I get the error "File does not exist". Even if it works, how can I include the
-b cookies > Absrep1.xml'
part in it?
Please help. Thanks in Advance,
AP
In Perl, you should be able to use this:
system(q{curl 'http://genome.jgi.doe.gov/ext-api/downloads/get-directory?organism=Absrep1' -b cookies > Absrep1.xml});
However, you might be better off using LWP or possibly even HTTP::Tiny (unless you need the cookies) instead of shelling out. For more advanced uses, there is also WWW::Mechanize.
The syntax error is almost certainly down to the quotes in the system call:
system('curl 'http://genome.jgi.doe.gov/ext-api/downloads/get-directory?organism=Absrep1' -b cookies > Absrep1.xml');
Either the single quotes need to be escaped, or alternative delimiters can be used, such as double quotes or custom delimiters with q or qq, e.g.:
system(q{curl 'http://genome.jgi.doe.gov/ext-api/downloads/get-directory?organism=Absrep1' -b cookies > Absrep1.xml});
It's hard to tell from the context given, but wrapping the curl call in perl or python would likely be a less than optimal approach. Perl has LWP, Python has requests, and the bash shell is already well equipped to run simple batch jobs. It might be best to stick to a single interpreter unless there's a good reason not to.
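If shelling out to curl from Python is still what you want, the "-b cookies > Absrep1.xml" part can be expressed without a shell: -b cookies becomes two more list elements, and the redirection becomes an open file passed as stdout. This is a sketch only; curl_command and run_to_file are hypothetical helper names, and the loop at the end assumes a list of organism names that you would supply:

```python
import subprocess
from urllib.parse import urlencode

BASE_URL = "http://genome.jgi.doe.gov/ext-api/downloads/get-directory"

def curl_command(organism, cookie_file="cookies"):
    """Build the curl argument list; no shell quoting is needed in list form."""
    url = BASE_URL + "?" + urlencode({"organism": organism})
    return ["curl", url, "-b", cookie_file]

def run_to_file(command, out_path):
    """Run command with stdout written to out_path (the '> file' of the shell)."""
    with open(out_path, "wb") as out:
        return subprocess.call(command, stdout=out)

# Intended use for the ~300 organisms (requires curl and the cookies file):
# for organism in organisms:
#     run_to_file(curl_command(organism), organism + ".xml")
print(curl_command("Absrep1"))
```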

python3 subprocess in Oracle Linux (wget -o)

I see there are several posts on python subprocess invoking bash shell commands. But I can't find an answer to my problem unless someone has a link that I'm missing.
So here is a start of my code.
import os;
import subprocess;
subprocess.call("wget ‐O /home/oracle/Downloads/puppet-repo.rpm https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm");
When I do
wget ‐O /home/oracle/Downloads/puppet-repo.rpm https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm
straight up in terminal, it works.
But my IDE gives me FileNotFoundError: [Errno 2] No such file or directory: 'wget'
Again, I'm new to invoking os/subprocess module within python and I would appreciate any insight on how to use these modules effectively.
{UPDATE: with miindlek's answer, I get these errors. 1st - subprocess.call(["wget", "‐O", "/home/oracle/Downloads/puppet-repo.rpm", "https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm"])}
--2015-06-07 17:14:37-- http://%E2%80%90o/
Resolving ‐o... failed: Temporary failure in name resolution.
wget: unable to resolve host address “‐o”
/home/oracle/Downloads/puppet-repo.rpm: Scheme missing.
--2015-06-07 17:14:52-- https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm
{with 2nd bash method subprocess.call("wget ‐O /home/oracle/Downloads/puppet-repo.rpm https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm", shell=True)}
Resolving yum.puppetlabs.com... 198.58.114.168, 2600:3c00::f03c:91ff:fe69:6bf0
Connecting to yum.puppetlabs.com|198.58.114.168|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10184 (9.9K) [application/x-redhat-package-manager]
Saving to: “puppetlabs-release-el-6.noarch.rpm.1”
0K ......... 100% 1.86M=0.005s
2015-06-07 17:14:53 (1.86 MB/s) - “puppetlabs-release-el-6.noarch.rpm.1” saved [10184/10184]
FINISHED --2015-06-07 17:14:53--
Downloaded: 1 files, 9.9K in 0.005s (1.86 MB/s)
Process finished with exit code 0
You should split your command string into a list of arguments:
import subprocess
subprocess.call(["wget", "-O", "/home/oracle/Downloads/puppet-repo.rpm", "https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm"])
You could also use the shell option as an alternative:
import subprocess
subprocess.call("wget -O /home/oracle/Downloads/puppet-repo.rpm https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm", shell=True)
By the way, in Python you don't need to add semicolons at the end of a line.
Update
The dash in your -O option is a UTF-8 encoded Unicode hyphen character (U+2010), not an ASCII dash. See for example:
>>> a = "‐" # utf8 hyphen
>>> b = "-" # dash
>>> str(a)
'\xe2\x80\x90'
>>> str(b)
'-'
You should delete your old dash and replace it with a normal one. I updated the source code above, so you can also copy it from there.
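Since the two characters look identical in most fonts, a small check can catch them before the command runs. The following is a sketch, not a complete list of look-alike characters; the names UNICODE_DASHES, find_unicode_dashes and normalize_dashes are made up for this example:

```python
# Dash-like characters that often sneak in when copying commands from web pages.
UNICODE_DASHES = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"

def find_unicode_dashes(args):
    """Return (index, argument) pairs for arguments containing a non-ASCII dash."""
    return [(i, arg) for i, arg in enumerate(args)
            if any(ch in UNICODE_DASHES for ch in arg)]

def normalize_dashes(arg):
    """Replace the dash look-alikes with the ASCII hyphen-minus."""
    return "".join("-" if ch in UNICODE_DASHES else ch for ch in arg)

args = ["wget", "\u2010O", "/home/oracle/Downloads/puppet-repo.rpm"]
print(find_unicode_dashes(args))
print([normalize_dashes(arg) for arg in args])
```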
This sounds like it is primarily because your IDE launches that Python subprocess differently from running it 'straight up in a terminal'.
This will be a reading suggestion, rather than a direct answer to only this problem.
Check your IDE; read docs about how it launches stuff.
1 - in terminal
type $ env where you tested $ wget
2 - in IDE
import os ; print(os.environ)
3 - read here about shell and Popen
https://docs.python.org/3/library/subprocess.html
Begin the learning process from there.
I would even suggest replacing
subprocess.call("wget -O /home/oracle/Downloads/puppet-repo.rpm https://yum.puppetlabs.com/puppetlabs-release-el-6.noarch.rpm", shell=True)
With a clear declaration of what 'shell' you want to use
subprocess.Popen(['/bin/sh', '-c', 'wget <stuff>'])
to mitigate future IDE/shell/env assumption problems.
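To compare the terminal and the IDE as suggested in steps 1 and 2, it may be enough to check how a program name resolves from inside the Python process. A minimal sketch using the standard library (describe_lookup is a hypothetical helper name):

```python
import os
import shutil

def describe_lookup(program):
    """Report where program would be found using this process's PATH."""
    return {
        "program": program,
        "resolved": shutil.which(program),  # None if it is not on PATH
        "path_entries": os.environ.get("PATH", "").split(os.pathsep),
    }

info = describe_lookup("wget")
print(info["resolved"])
for entry in info["path_entries"]:
    print(entry)
```

Running this once from the terminal and once from the IDE shows whether the two environments see the same PATH.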

How to write a bash script to give another program response

I have a bash script that does several tasks, including python manage.py syncdb on a fresh database. This command asks for input, like the login info for the admin. Currently, I just type this into the command line every time. Is there a way I can automatically provide these replies as part of the bash script?
Thanks, I don't really know anything about bash.
I'm using Ubuntu 10.10.
I answered a similar question on SF, but this one is more general, and it's good to have on SO.
"You want to use expect for this. It's probably already on your machine [try which expect]. It's the standard tool for any kind of interactive command-line automation. It's a Tcl library, so you'll get some Tcl skills along the way for free. Beware; it's addictive."
I should mention in this case that there is also pexpect, which is a Python expect-alike.
#!/path/to/expect
spawn python manage.py syncdb
expect "login:*"
send -- "myuser\r"
expect "*ssword:*"
send -- "mypass\r"
interact
If the program in question cannot read its input from stdin, as in:
echo "some input" | your_program
then you'll need to look at something like expect and/or autoexpect.
You can give default values to the variables. In lines 4 and 5, if the variables RSRC and LOCAL aren't set, they are set to those default values. This way you can pass the options to your script or use the default ones.
#!/bin/bash
RSRC=$1
LOCAL=$2
: ${RSRC:="/var/www"}
: ${LOCAL:="/disk2/backup/remote/hot"}
rsync -avz -e 'ssh ' user@myserver:$RSRC $LOCAL
You can do it like this, given an example login.py script:
if __name__ == '__main__':
    import sys
    # Read the username and password as two lines from stdin.
    user = sys.stdin.readline().strip()
    passwd = sys.stdin.readline().strip()
    if user == 'root' and passwd == 'password':
        print('Login successful')
        sys.exit(0)
    sys.stderr.write('error: invalid username or password\n')
    sys.exit(1)
good-credentials.txt
root
password
bad-credentials.txt
user
foo
Then you can do the login automatically using:
$ cat good-credentials.txt | python login.py
Login successful
$ cat bad-credentials.txt | python login.py
error: invalid username or password
The down-side of this approach is you're storing your password in plain text, which isn't great practice.
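One way to avoid the credentials file, sketched here in Python rather than bash, is to take the replies from environment variables and write them to the program's stdin. The variable names LOGIN_USER and LOGIN_PASSWORD and the helper login_with_env are assumptions for this example, and whether environment variables are an acceptable place for a password depends on your setup:

```python
import os
import subprocess
import sys

def login_with_env(command, env=os.environ):
    """Send two lines (user, password) from the environment to command's stdin."""
    replies = "{0}\n{1}\n".format(env["LOGIN_USER"], env["LOGIN_PASSWORD"])
    proc = subprocess.Popen(command, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate(replies.encode("utf-8"))
    return proc.returncode, stdout.decode("utf-8"), stderr.decode("utf-8")

# Usage, assuming login.py from above is in the current directory:
# code, out, err = login_with_env([sys.executable, "login.py"])
```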

Resources