To copy a file into a Postgres database, I take the following steps:
~$ sudo psql -U postgres
password for root:
password for user postgres:
postgres=# \c migration_v0
You are now connected to database "migration_v0" as user "postgres".
migration_v0=# \copy table1 FROM 'file.csv' csv
I want to take the exact same steps from within Python, but passing a StringIO buffer instead of a literal file. My first attempt consisted of the following:
# test.py
import subprocess

fmt = r"\copy table1 FROM stdin csv"
sql = fmt.format(string_io)  # no placeholder in fmt, so this is a no-op
psql = ['psql', '-U', 'postgres', '-c', sql]
output = subprocess.check_output(psql)
print(output)
The command executes (a prompt pops up asking for the postgres user's password), but I get the following error:
ERROR: relation "table1" does not exist
This happens because I am running \copy against the default database postgres instead of migration_v0. So I want to include both commands (\c migration_v0 and \copy ...) in the subprocess call, but I don't know how, since psql's -c flag takes only a single command.
I looked up a workaround and came across this command-line example:
echo '\c migration_v0 \\ \copy ...' | psql -U postgres
but I have no idea how to port it to Python code.
Any suggestions on how I can pull this off?
Edit 1
I realized that the -d flag selects the database to connect to, so I don't need to run multiple commands after all. My code now looks like this:
p = subprocess.Popen(
    ['psql', '-U', 'postgres',
     '-d', 'migration_v0',
     '-c', r'\copy table1 FROM stdin csv'],
    shell=False,
    stdin=string_io)
but I get the following error:
io.UnsupportedOperation: fileno
Apparently StringIO doesn't implement fileno(). At this point I'm wondering whether it's even possible to achieve what I want through a subprocess call.
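Edit 2
For future readers, a sketch of what should work (untested as of this edit): keep the -d flag, open stdin as a pipe, and push the buffer's contents through communicate(), which writes them to psql's stdin and closes it. The sample data here is made up:
import subprocess
from io import StringIO

string_io = StringIO("1,foo\n2,bar\n")  # stand-in for the real buffer

p = subprocess.Popen(
    ['psql', '-U', 'postgres', '-d', 'migration_v0',
     '-c', r'\copy table1 FROM stdin csv'],
    stdin=subprocess.PIPE)
# communicate() writes the bytes to psql's stdin and closes the pipe,
# which \copy ... FROM stdin needs in order to finish.
p.communicate(input=string_io.getvalue().encode())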
Related
How do I call a SQL query from a bash shell script? I tried the script below, but there seems to be a syntax error:
#!/bin/sh
LogDir='/albt/dev/test1/test2/logs' # log file
USER='test' #Enter Oracle DB User name
PASSWORD='test' #Enter Oracle DB Password
SID='test' #Enter SID
sqlplus -s << EOF > ${LogDir}/sql.log
${DB_USER_NAME}/${DB_PASSWORD}#${DB_SID}
SELECT count(1) FROM dual; # SQL script here to get executed
EOF
var=$(SELECT count(1) FROM dual)
I'm getting an "unexpected token" error.
#!/bin/sh
user="test"
pass="test"
var="$1"
sqlplus -S $user/$pass <<EOF
SELECT * FROM tableName WHERE username=$var;
exit;
EOF
When I run the above script, I'm getting sqlplus: command not found.
Can anyone guide me?
In your first script, one error is this line:
var=$(SELECT count(1) FROM dual)
It tells the shell to execute a program named SELECT with the arguments count(1), FROM, and dual, and to store its stdout in the variable var. What you want instead is to feed the SELECT statement into sqlplus, i.e. something like
var=$(sqlplus .... )
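For illustration, here is a sketch of your first script with that fix applied (untested; the SET options silence sqlplus's headers and feedback so that only the count lands in var):
#!/bin/sh
user='test'   # Oracle DB user name
pass='test'   # Oracle DB password
sid='test'    # SID

var=$(sqlplus -s "${user}/${pass}@${sid}" <<EOF
SET HEADING OFF FEEDBACK OFF PAGESIZE 0
SELECT count(1) FROM dual;
EXIT;
EOF
)
echo "count: $var"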
In your second script, the error message simply means that sqlplus cannot be found in your PATH. Either add its directory to the PATH or invoke the command by its absolute path.
I need to generate a Postgres schema from a dataframe. I found the csvkit library to come closest to matching the datatypes. I can run csvkit and generate a Postgres schema from a CSV on my desktop via the terminal with this command, found in the docs:
csvsql -i postgresql myFile.csv
csvkit docs - https://csvkit.readthedocs.io/en/stable/scripts/csvsql.html
And I can run the terminal command in my script via this code:
import os
a=os.popen("csvsql -i postgresql Desktop/myFile.csv").read()
However, I have a dataframe that I have converted to a CSV string, and I need to generate the schema from that string, like so:
csvstr = df.to_csv()
Under positional arguments, the docs say:
The CSV file(s) to operate on. If omitted, will accept
input on STDIN
How do I pass my variable csvstr into a=os.popen("csvsql -i postgresql csvstr").read(), in place of the filename?
I tried the line of code below but got OSError: [Errno 7] Argument list too long: '/bin/sh':
a=os.popen("csvsql -i postgresql {}".format(csvstr)).read()
Thank you in advance
You can't pass such a big string via the command line! You have to save the data to a file and pass its path to csvsql.
csvstr = df.to_csv()

with open('my_cool_df.csv', 'w', newline='') as csvfile:
    # csvstr is already serialized CSV text, so write it out as-is;
    # csv.writer.writerows() would treat each character as a row.
    csvfile.write(csvstr)
And later:
a=os.popen("csvsql -i postgresql my_cool_df.csv").read()
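Alternatively, since the docs you quote say csvsql falls back to STDIN when the file argument is omitted, you can skip the temporary file and pipe the string in. A sketch using subprocess (Python 3.5+ for subprocess.run); df is your dataframe:
import subprocess

csvstr = df.to_csv(index=False)

# No file argument, so csvsql reads the CSV text from STDIN.
result = subprocess.run(
    ['csvsql', '-i', 'postgresql'],
    input=csvstr.encode(),
    stdout=subprocess.PIPE,
    check=True)
print(result.stdout.decode())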
I am using the command below to get the result of my SQL query.
su - postgres -c 'psql -d dbname' with stdin "COPY ( my SQL query ) TO STDOUT WITH CSV HEADER"
This works fine on my server, but on a different machine it prints bash warnings along with the output of the SQL query. For example:
/etc/profile: line 46: HISTSIZE: readonly variable
/etc/profile: line 50: HISTCONTROL: readonly variable
/etc/profile.d/20-tmout.sh: line 1: TMOUT: readonly variable
/etc/profile.d/history.sh: line 6: hcmnt_tty: readonly variable
name
abc
Please let me know a way to skip the warning messages above so that I only get the data.
If /dev/null is the right tool here, how do I modify the above command so that I get the data only?
If what you mean is "how do I discard only the error output?", the way to go is to redirect the standard error stream to oblivion (/dev/null), like so:
your-command 2>/dev/null
That way, whatever the command writes to standard output passes through, but anything written to standard error is discarded, so you won't see those warning messages.
By the way, 2 here is the file descriptor of the standard error stream.
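Applied to the command in your question, that might look like the line below (a sketch: I've folded the COPY statement into psql's -c flag, since the trailing "with stdin" isn't valid shell):
su - postgres -c 'psql -d dbname -c "COPY ( my SQL query ) TO STDOUT WITH CSV HEADER"' 2>/dev/null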
Sorry, this is untested, but I hit this same error: your DB session isn't read/write. You can echo the statements into psql to force a proper session, as follows. I'm unsure how stdin may be affected:
echo 'SET TRANSACTION READ WRITE; SET SESSION CHARACTERISTICS AS TRANSACTION READ WRITE ; COPY ( my SQL query ) TO STDOUT WITH CSV HEADER' | su - postgres -c 'psql -d dbname' with stdin
Caution: a bash hack:
su - postgres -c 'psql -d dbname' with stdin "COPY ( my SQL query ) TO STDOUT WITH CSV HEADER" | grep -v "readonly"
Using the command line, I can confirm that the following command executes correctly:
echo '\c mydatabase;\i db-reset.sql' | psql -U postgres -h localhost
However, in Python, I can confirm that the following lines do absolutely nothing and return a status code of 0.
import subprocess
code = subprocess.call(r"echo '\c mydatabase;\i db-reset.sql' | psql -U postgres -h localhost", shell=True)
assert code == 0  # This passes
Essentially, why is the command invoked using subprocess not actually doing anything?
It works, but you need more backslashes.
Also, I would recommend against shell=True here.
This is what you're doing, but without the shell:
p = subprocess.Popen(
    ['psql', '-U', 'postgres', '-h', 'localhost'],
    shell=False, stdin=subprocess.PIPE)
p.communicate(input=r"\c mydatabase;\i db-reset.sql".encode())
(On Python 3, communicate() needs bytes here, hence the .encode().)
Igor has the right approach, without a doubt, though it'd be a good idea to close the session afterwards. However, there's a bigger-picture issue here: you should not generally be invoking psql to communicate with PostgreSQL from Python.
Use the psycopg2 module, which is widespread and available almost everywhere, to talk to PostgreSQL directly. This will immensely simplify your database communications.
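For instance, feeding a buffer into COPY, which the shell detours above struggle with, takes only a few lines with psycopg2's copy_expert, which accepts any file-like object (a sketch; the table name and connection parameters are illustrative):
import psycopg2
from io import StringIO

string_io = StringIO("1,foo\n2,bar\n")  # any file-like object works

conn = psycopg2.connect(dbname='mydatabase', user='postgres', host='localhost')
with conn:  # commits on success, rolls back on error
    with conn.cursor() as cur:
        # copy_expert streams the file-like object straight into COPY
        cur.copy_expert("COPY table1 FROM STDIN WITH CSV", string_io)
conn.close()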
For cases where you actually need psql, like running scripts, use psql -f with a database argument. Your command in this case should be:
import subprocess

try:
    subprocess.check_call([
        'psql', '-q',
        '-U', 'postgres',
        '-h', 'localhost',
        '-f', 'db-reset.sql',
        'mydatabase'
    ])
except subprocess.CalledProcessError as ex:
    print("Failed to invoke psql: {0}".format(ex))
... or even better, use check_output if you're on a new enough Python version (2.7+), so that you capture error output too. Note the -q (quiet mode) flag, too.
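A sketch of that variant (check_output exists on Python 2.7+; stderr is folded into the captured output):
import subprocess

try:
    output = subprocess.check_output(
        ['psql', '-q',
         '-U', 'postgres',
         '-h', 'localhost',
         '-f', 'db-reset.sql',
         'mydatabase'],
        stderr=subprocess.STDOUT)  # capture psql's error output too
except subprocess.CalledProcessError as ex:
    print("psql failed with status {0}:\n{1}".format(ex.returncode, ex.output))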
(Note that subprocess does its own escaping when you're running on a platform like Windows, where there are no sensible execv-variant system calls or equivalents, so you don't need to worry about painful shell-escaping quirks.)
In PostgreSQL and bash (Linux), is there a way to directly import a file from the filesystem, like:
[execute.sh]
psql .... -f insert.txt
[insert.txt]
insert into table(id,file) values(1,import('/path/to/file'))
There seems to be no import function; there's no bytea_import either, and lo_import stores the file and returns an int (an OID), and I don't know how to get the file contents back out. (These files are small, so lo_import seems inappropriate anyway.)
And how do I feed the statement in insert.txt to PostgreSQL?
I'm not sure what you're after, but if you have a script with SQL statements, for example the insert statement that you mention, you can run psql and then execute the script from within psql. For example:
postgres#server:~$ psql dbname
psql (8.4.1)
Type "help" for help.
dbname=# \i /tmp/queries.sql
This will run the statements in /tmp/queries.sql.
Hope this was what you asked for.
In case you need more detailed parameters:
$ psql -h [host] -p [port] -d [databaseName] -U [user] -f [/absolute/path/to/file]
The manual has some examples:
testdb=> \set content '''' `cat my_file.txt` ''''
testdb=> INSERT INTO my_table VALUES (:content);
See http://www.postgresql.org/docs/8.4/interactive/app-psql.html
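For what it's worth, newer versions of psql let you skip the quadruple-quote dance: \set the raw file contents and interpolate them with :'content', which psql expands as a properly quoted string literal (check the docs for your version before relying on this):
testdb=> \set content `cat my_file.txt`
testdb=> INSERT INTO my_table VALUES (:'content');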