Can I execute a partitioned procedure in all partitions (using @GetPartitionKeys) from the SQL terminal? - voltdb

I need to execute a one-time task to update all rows in a large database. I want to do this as quickly as possible. Each row needs to be read, have a value in a column modified by an algorithm, and then updated with the transformed value. I have written a single partitioned stored procedure to do this.
I am aware of the examples on this page: https://docs.voltdb.com/UsingVoltDB/sysprocgetpartitionkeys.php
I would rather not have to write a Java client to execute this procedure in each partition, and would ideally like to call @GetPartitionKeys, then execute the stored procedure on each value, in sqlcmd.

I found a way to do it using pipelines in bash:
echo "exec #GetPartitionKeys INTEGER;" | sqlcmd --output-skip-metadata | awk '{print $2}' | xargs -n 1 sh -c 'echo "exec YourPartitionedProcedure $0;" | sqlcmd'
This will take the partition key column and then execute YourPartitionedProcedure once for each partition key returned. You can add additional parameters as needed after the $0. You can switch INTEGER to STRING as needed as well.
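For example, if the procedure takes the partition value plus one extra parameter, a hedged sketch of the same pipeline (YourPartitionedProcedure and the literal 42 are placeholders, not names from the VoltDB docs) would be:
echo "exec @GetPartitionKeys INTEGER;" | sqlcmd --output-skip-metadata | awk '{print $2}' | xargs -n 1 sh -c 'echo "exec YourPartitionedProcedure $0, 42;" | sqlcmd'
Here $0 is the partition key supplied by xargs, and 42 stands in for whatever constant second parameter your procedure expects.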

Related

I need to execute one SQL query against two DBs at a time and export the data to CSV files [duplicate]

I have a file1.sh script which internally needs to execute one SQL query against two Oracle DBs at the same time and export the data to CSV files; below is the sample shell script which executes the query against the two DBs.
....
#!bin/bash
set -X
sqlplus -S ${user1}@${DBCONNECTIONNAME_1}/${Pwd} Datesquery.sql & >> ${Targetdirectory}/csvfile1.csv
sqlplus -S ${user1}@${DBCONNECTIONNAME_2}/${Pwd} Datesquery.sql & >> ${Targetdirectory}/csvfile2.csv
sed 1d csvfile2.csv > file2noheader.csv
cat csvfile1.csv file2noheader.csv > ${Targetdirectory}/Expod.csv
....
But it does not connect to the DB and does not execute any query; it simply displays the sqlplus manual on how to use the connection string. Please let me know how to run one query against two DBs in parallel and write the output to separate CSV files.
A given sqlplus session can only connect to one db at a time, so your requirement 'at the same time' is essentially a non-starter. If 'at the same time' really means 'sequentially, in the same script', then you are back to fixing your connect string. And at that you 'have more errors than an early Mets game' (with apologies to any NY Mets fans).
First, your script indicates that your sqlplus command is the very first actual command following specification of your shell processor and 'set -x'. Yet you make heavy use of environment variables as substitutions for username, password, and connection name - without ever setting those variables.
Second, your use of an '&' in the command line is totally confusing to both me and the parser.
Third, you need to precede your reference to the sql script with '@'.
Fourth, your order of elements in the command line is all wrong.
Try this
#!/bin/bash
orauser1=<supply user name for first database here>
orapw1=<supply password for first database here>
oradb_1=<supply connection name of first database>
#
orauser2=<supply user name for second database here>
orapw2=<supply password for second database here>
oradb_2=<supply connection name of second database>
#
Targetdirectory=<supply value here>
#
sqlplus -S ${orauser1}/${orapw1}@${oradb_1} @Datesquery.sql >> ${Targetdirectory}/csvfile1.csv
sqlplus -S ${orauser2}/${orapw2}@${oradb_2} @Datesquery.sql >> ${Targetdirectory}/csvfile2.csv
Or create a database link from one DB to the other and then run both queries in one DB, one over the DB link.
select * from tab1
union
select * from tab1@db_link
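If the goal really is to run the two extracts concurrently (each sqlplus session still connects to only one database), one untested sketch, reusing the variables from the corrected script above and the merge steps from the question, is to background each process and wait for both before combining the files:
sqlplus -S ${orauser1}/${orapw1}@${oradb_1} @Datesquery.sql >> ${Targetdirectory}/csvfile1.csv &
sqlplus -S ${orauser2}/${orapw2}@${oradb_2} @Datesquery.sql >> ${Targetdirectory}/csvfile2.csv &
wait  # block until both background extracts have finished
sed 1d ${Targetdirectory}/csvfile2.csv > ${Targetdirectory}/file2noheader.csv
cat ${Targetdirectory}/csvfile1.csv ${Targetdirectory}/file2noheader.csv > ${Targetdirectory}/Expod.csv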

Storing oracle query results into bash variable

declare -a result=`$ORACLE_HOME/bin/sqlplus -silent $DBUSER/$DBPASSWORD@$DB << EOF
$SQLPLUSOPTIONS $roam_query
exit;
EOF`
I am trying to pull data from an oracle database and populate a bash variable. The select query works however it returns multiple rows and those rows are returned as a long continuous string. I want to capture each row from the database in an array index for example:
index[0] = row 1 information
index[1] = row 2 information
Please help. All suggestions are appreciated. I checked all the documentation with no luck. Thank you. I am using Solaris Unix.
If you have bash version 4, you can use the readarray -t command to do this. Any vaguely recent linux should have bash v4, but I don't know about Solaris.
BTW, I'd also recommend putting double-quotes around variable references (e.g. "$DBUSER/$DBPASSWORD@$DB" instead of just $DBUSER/$DBPASSWORD@$DB) (except in here-documents), using $( ) instead of backticks, and using lower- or mixed-case variable names (there are a bunch of all-caps names with special meanings, and if you use one of those by accident, weird things can happen).
I'm not sure I have the here-document (the SQL commands) right, but here's roughly how I'd do it:
readarray -t result < <("$oracle_home/bin/sqlplus" -silent "$dbuser/$dbpassword@$db" << EOF
$sqlplusoptions $roam_query
exit;
EOF
)
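As a quick check that each row landed in its own element (a hypothetical follow-up, not part of the original answer), you can loop over the array indices:
for i in "${!result[@]}"; do
    printf 'index[%d] = %s\n' "$i" "${result[$i]}"   # prints index[0] = row 1 information, etc.
done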

CQL3.2: DROP TABLE with certain prefix?

I have a Cassandra 2.1.8 database with a bunch of tables, all in the form of either "prefix1_tablename" or "prefix2_tablename".
I want to DROP every table that begins with prefix1_ and leave anything else alone.
I know I can grab table names using the query:
SELECT columnfamily_name FROM system.schema_columnfamilies
WHERE keyspace_name='mykeyspace'
And I thought about filtering the results somehow to get only prefix1_ tables, putting them into a table with DROP TABLE prepended to each one, then executing all the statements in my new table. It was similar thinking to strategies I've seen for people solving the same problem with MySQL or Oracle.
With CQL3.2 though, I don't have access to User-Defined Functions (at least according to the docs I've read...) and I don't know how to do something like execute statements off of a table query result, as well as even how to filter out prefix1_ tables with no LIKE operator in Cassandra.
Is there a way to accomplish this?
I came up with a Bash shell script to solve my own issue. Once I realized that I could export the column families table to a CSV file, it made more sense to me to perform the filtering and text manipulation with grep and awk as opposed to finding a 'pure' cqlsh method.
The script I used:
#!/bin/bash
# No need for a USE command by making delimiter a period
cqlsh -e "COPY system.schema_columnfamilies (keyspace_name, columnfamily_name)
TO 'alltables.csv' WITH DELIMITER = '.';"
cat alltables.csv | grep -e '^mykeyspace.prefix1_' \
| awk '{print "DROP TABLE " $0 ";"}' >> remove_prefix1.cql
cqlsh -f 'remove_prefix1.cql'
rm alltables.csv remove_prefix1.cql
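For illustration, with hypothetical tables mykeyspace.prefix1_users and mykeyspace.prefix1_events, the generated remove_prefix1.cql would contain lines like:
DROP TABLE mykeyspace.prefix1_users;
DROP TABLE mykeyspace.prefix1_events;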

Cassandra selective copy

I want to copy selected rows from a columnfamily to a .csv file. The COPY command can only dump a column or an entire table to a file, with no WHERE clause. Is there a way to use a WHERE clause in the COPY command?
Another way I thought of was,
Do "Insert into table2 () values ( select * from table1 where <where_clause>);" and then dump the table2 to .csv , which is also not possible.
Any help would be much appreciated.
There is no way to use a where clause in COPY, but you can use this method:
echo "select c1,c2.... FROM keySpace.Table where ;" | bin/cqlsh > output.csv
It allows you to save your result in the output.csv file.
No, there is no built-in support for a "where" clause when exporting to a CSV file.
One alternative would be to write your own script using one of the drivers. In the script you would do the "select", then read the results and write out to a CSV file.
In addition to Amine CHERIFI's answer:
| sed -E 's/^\s+//; s/\s*\|\s*/,/g; /^-{3,}|^$|^\(.+\)$/d'
Removes spaces
Replaces | with ,
Removes header separator, empty and summary lines
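Putting the two answers together, an end-to-end sketch (the column, keyspace, table name, and filter value are placeholders) might look like:
echo "select c1,c2 FROM keySpace.Table where c1 = 'value';" | bin/cqlsh | sed -E 's/^\s+//; s/\s*\|\s*/,/g; /^-{3,}|^$|^\(.+\)$/d' > output.csv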
Other ways to run the SQL with a filter and redirect the response to CSV:
1) Inside cqlsh, use the CAPTURE command to redirect the output to a file. You need to turn tracing on before executing the command.
Example: CAPTURE 'output.txt' -- output of the SQL executed after this command gets captured into the output.txt file
2) In case if you would like to redirect the SQL output to a file from outside of cqlsh
./cqlsh -e'select * from keyspaceName.tableName' > fileName.txt -- hostname

Extracting sql insert with sed, line cuts

I've been reading on Stack Overflow about the use of sed for extracting data from SQL dumps; to be more accurate, the final purpose is to extract the inserts for a specific table in order to restore only that table.
I’m using this:
sed -n '/LOCK TABLES `TABLE_NAME`/,/UNLOCK TABLES/p' dump.sql > output.sql
The problem I'm having is that we have inserts on one line that are more than 50 MB long, so while extracting the insert, the output gets cut before the end of the line.
like:
......
(4
3458,'0Y25565137SEOEJ','001','PREPAR',1330525937741,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL),
(43459,'666
I tried to use awk and even simple grep and the result is the same, the line gets cut.
Edit: I'm using this on an SQL dump from MySQL, and the system I'm working on is CentOS 5.2.
You can try awk and see if it's better (I think so):
awk '/LOCK TABLES `TABLE_NAME`/,/UNLOCK TABLES/' dump.sql > output.sql
But if it's a dump file created with exp, you can import only the needed tables with
imp user/pass tables=table1,table2 ...
