CQL3.2: DROP TABLE with certain prefix? - cassandra

I have a Cassandra 2.1.8 database with a bunch of tables, all in the form of either "prefix1_tablename" or "prefix2_tablename".
I want to DROP every table that begins with prefix1_ and leave anything else alone.
I know I can grab table names using the query:
SELECT columnfamily_name FROM system.schema_columnfamilies
WHERE keyspace_name='mykeyspace'
And I thought about filtering the results somehow to get only prefix1_ tables, putting them into a table with DROP TABLE prepended to each one, then executing all the statements in my new table. It was similar thinking to strategies I've seen for people solving the same problem with MySQL or Oracle.
With CQL 3.2, though, I don't have access to user-defined functions (at least according to the docs I've read). I also don't know how to execute statements generated from a query result, or how to filter for prefix1_ tables when Cassandra has no LIKE operator.
Is there a way to accomplish this?

I came up with a Bash shell script to solve my own issue. Once I realized that I could export the column families table to a CSV file, it made more sense to me to perform the filtering and text manipulation with grep and awk as opposed to finding a 'pure' cqlsh method.
The script I used:
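#!/bin/bash
# Export every keyspace/table pair. Using a period as the delimiter yields
# fully qualified names, so no USE command is needed.
cqlsh -e "COPY system.schema_columnfamilies (keyspace_name, columnfamily_name)
TO 'alltables.csv' WITH DELIMITER = '.';"
# Keep only the prefix1_ tables in mykeyspace (the dot is escaped so that it
# matches literally) and turn each line into a DROP TABLE statement.
grep -e '^mykeyspace\.prefix1_' alltables.csv \
| awk '{print "DROP TABLE " $0 ";"}' > remove_prefix1.cql
# Run the generated statements, then remove the temporary files.
cqlsh -f 'remove_prefix1.cql'
rm alltables.csv remove_prefix1.cql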
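The filtering step can also be written so that the keyspace and prefix are matched as literal text rather than as a regular expression. This is a minimal sketch, assuming input lines in the `keyspace.table` form produced by the COPY command above; the function name `gen_drop_statements` is hypothetical, and the example feeds it inline data instead of a real export.

```shell
# Hypothetical helper: reads "keyspace.table" lines on stdin and prints a
# DROP TABLE statement for every line that starts with "<keyspace>.<prefix>".
# index() compares plain text, so "." and "_" are not regex metacharacters.
gen_drop_statements() {
    awk -v want="$1.$2" 'index($0, want) == 1 { print "DROP TABLE " $0 ";" }'
}

# Example with inline data; only the prefix1_ table in mykeyspace is selected.
printf '%s\n' 'mykeyspace.prefix1_users' 'mykeyspace.prefix2_users' \
    | gen_drop_statements mykeyspace prefix1_
```

Because DROP TABLE is destructive, it may be worth reading the generated statements before passing them to `cqlsh -f`.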

Related

Can I execute a partitioned procedure in all partitions (using @GetPartitionKeys) from the SQL terminal?

I need to execute a one-time task to update all rows in a large database. I want to do this as quickly as possible. Each row needs to be read, have a value in a column modified by an algorithm, and then updated with the transformed value. I have written a single partitioned stored procedure to do this.
I am aware of the examples on this page: https://docs.voltdb.com/UsingVoltDB/sysprocgetpartitionkeys.php
I would rather not have to write a Java client to execute this procedure in each partition. Ideally, I would like to call @GetPartitionKeys and then execute the stored procedure on each value, all from sqlcmd.
I found a way to do it using pipelines in bash:
echo "exec @GetPartitionKeys INTEGER;" | sqlcmd --output-skip-metadata | awk '{print $2}' | xargs -n 1 sh -c 'echo "exec YourPartitionedProcedure $0;" | sqlcmd'
This will take the partition key column and then execute YourPartitionedProcedure once for each partition key returned. You can add additional parameters as needed after the $0. You can switch INTEGER to STRING as needed as well.
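A variation on the same idea is to generate all of the exec statements first and send them to a single sqlcmd session. This is a sketch under the same assumption the pipeline above makes, namely that each output row has the partition key in its second whitespace-separated column; the function name `gen_exec_statements` is hypothetical.

```shell
# Hypothetical helper: reads rows of "PARTITION_ID PARTITION_KEY" on stdin
# and prints one exec statement per partition key for the named procedure.
# Rows with fewer than two columns (such as blank lines) are skipped.
gen_exec_statements() {
    awk -v proc="$1" 'NF >= 2 { print "exec " proc " " $2 ";" }'
}

# Example with inline rows. In practice the input would come from the
# GetPartitionKeys call piped through sqlcmd --output-skip-metadata, and the
# generated statements would be piped into one sqlcmd invocation.
printf '%s\n' '0 15' '1 7' | gen_exec_statements YourPartitionedProcedure
```

Sending every statement to one sqlcmd process should avoid starting a new client for each partition key.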

Updating all Cassandra tables starting with a specific name

I am trying to alter all of my Cassandra tables whose names start with a specific prefix.
The tables are named sample_1, sample_2, sample_13567, sample_adgf, and so on.
The suffixes are random, but every name starts with the same prefix.
I want to add a new column to all of these tables.
Can someone suggest an update query that uses a regex for the table names?
If you are using Linux, you can do this in two steps:
First, generate all of the ALTER commands into a file, like this:
for i in {1..13567}; do echo "ALTER TABLE sample_$i ADD test text;"; done > alter.cql
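The command above writes one ALTER statement for each of the tables sample_1 through sample_13567, each adding a text column named test, and stores them in the file alter.cql. Note that it only covers numeric suffixes; a table such as sample_adgf needs its own statement.
Now you can load the CQL file into cqlsh like this: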
cqlsh 127.0.0.1 -u cassandra -p cassandra -k ashraful_test -f alter.cql
Here
-u username
-p password
-k keyspace_name
-f file name to load
By the way, having too many tables is not a good idea.
See https://stackoverflow.com/a/33389204/2320144
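Since the question mentions non-numeric suffixes such as sample_adgf, the statements can instead be generated from the real table list. This is a minimal sketch, assuming the table names are available one per line on stdin (for example, exported from the schema tables); the function name `gen_alter_statements` is hypothetical.

```shell
# Hypothetical helper: reads table names on stdin, one per line, and prints
# an ALTER TABLE ... ADD statement for each name that starts with the prefix.
# $1 is the prefix (matched literally), $2 is the column definition.
gen_alter_statements() {
    local prefix="$1" column_def="$2" table
    while IFS= read -r table; do
        case "$table" in
            "$prefix"*) printf 'ALTER TABLE %s ADD %s;\n' "$table" "$column_def" ;;
        esac
    done
}

# Example with inline names; other_table is skipped.
printf '%s\n' sample_1 sample_adgf other_table \
    | gen_alter_statements sample_ 'test text'
```

The output can be redirected to alter.cql and loaded with `cqlsh -f` exactly as in the second step above.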

Cassandra selective copy

I want to copy selected rows from a column family to a .csv file. The COPY command can only dump selected columns or an entire table to a file; it does not accept a WHERE clause. Is there a way to use a WHERE clause with the COPY command?
Another approach I thought of was to run
"Insert into table2 () values ( select * from table1 where <where_clause>);" and then dump table2 to .csv, but that is not possible either.
Any help would be much appreciated.
There is no way to use a WHERE clause in COPY, but you can use this method:
echo "select c1,c2.... FROM keySpace.Table where ;" | bin/cqlsh > output.csv
It allows you to save your result in the output.csv file.
No, there is no built-in support for a "where" clause when exporting to a CSV file.
One alternative would be to write your own script using one of the drivers. In the script you would do the "select", then read the results and write out to a CSV file.
In addition to Amine CHERIFI's answer:
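| sed -E 's/^\s+//; s_\s*\|\s*_,_g; /^-{3,}|^$|^\(.+\)$/d'
The expression needs extended regular expressions (-E, or -r on older GNU sed) and does three things:
Removes leading whitespace
Replaces each | and the whitespace around it with ,
Removes the header separator, empty lines, and the summary line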
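The cleanup step can be wrapped in a small function and tried on sample output. This is a sketch, assuming the usual cqlsh table layout (a header row, a dashed separator, data rows, and a trailing "(N rows)" line); the function name `cqlsh_to_csv` is hypothetical, and POSIX character classes replace `\s` so that the expression does not depend on GNU extensions.

```shell
# Hypothetical helper: converts cqlsh's tabular output on stdin to
# comma-separated lines. It strips leading whitespace, turns each "|" plus
# its surrounding whitespace into ",", and deletes the dashed separator,
# empty lines, and the "(N rows)" summary.
cqlsh_to_csv() {
    sed -E 's/^[[:space:]]+//; s/[[:space:]]*\|[[:space:]]*/,/g; /^-{3,}|^$|^\(.+\)$/d'
}

# Example with inline text shaped like cqlsh output.
printf '%s\n' '' ' c1 | c2' '----+----' '  1 |  a' '' '(1 rows)' | cqlsh_to_csv
```

Values that themselves contain commas or pipes are not quoted or escaped by this approach, so it is best suited to simple data.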
Other ways to run a filtered query and redirect the result to a file:
1) Inside cqlsh, use the CAPTURE command to redirect query output to a file.
Example: CAPTURE 'output.txt' -- the output of queries executed after this command is captured in output.txt
2) To redirect the query output to a file from outside cqlsh:
./cqlsh -e'select * from keyspaceName.tableName' > fileName.txt -- hostname

Extracting sql insert with sed, line cuts

I've been reading on Stack Overflow about using sed to extract data from SQL dumps. More precisely, the goal is to extract the inserts for a specific table in order to restore only that table.
I’m using this:
sed -n '/LOCK TABLES `TABLE_NAME`/,/UNLOCK TABLES/p' dump.sql > output.sql
The problem I'm having is that some inserts sit on a single line that is more than 50 MB long, so when I extract the insert, the output gets cut before the end of the line.
like:
......
(4
3458,'0Y25565137SEOEJ','001','PREPAR',1330525937741,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL),
(43459,'666
I tried to use awk and even simple grep and the result is the same, the line gets cut.
Edit: I'm using this on a SQL dump from MySQL, and the system I'm working on is CentOS 5.2.
You can try awk and see whether it handles the long lines better (I think it does):
awk '/LOCK TABLES `TABLE_NAME`/,/UNLOCK TABLES/' dump.sql > output.sql
But if it's a dump file created with exp, you can import only the needed tables with
imp user/pass tables=table1,table2 ...
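The range extraction can also be sketched as an awk function that matches the table name as literal text instead of as a regular expression. This only illustrates the technique and is not a confirmed fix for the truncation; the function name `extract_table_block` is hypothetical, and the LOCK TABLES and UNLOCK TABLES markers are the ones used in the sed command from the question.

```shell
# Hypothetical helper: prints the lines of a dump file from the
# "LOCK TABLES `<table>`" line through the next "UNLOCK TABLES" line.
# $1 is the table name (matched literally), $2 is the dump file.
extract_table_block() {
    local table="$1" dump="$2"
    awk -v start="LOCK TABLES \`${table}\`" '
        index($0, start) == 1     { inside = 1 }   # block for this table begins
        inside                    { print }
        inside && /^UNLOCK TABLES/ { inside = 0 }  # block ends
    ' "$dump"
}

# Usage (file names as in the question):
#   extract_table_block TABLE_NAME dump.sql > output.sql
```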

Is it possible to load a subset of columns using Sybase 15 bcp?

I have a CSV file with 20 or so columns and I want to load it into a table with only 9 columns - I want to throw away the rest.
Can I do it directly with bcp or do I need to preprocess the file to strip it down to just what I need?
The manual does not seem to detail it.
But then I seem to have options that aren't in the manual, e.g. -labeled?
Thanks in advance, Chris
No, this isn't possible with bcp.
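You can combine named pipes, awk and bcp.
For example (data.csv and the column list $1, $2, $3 are placeholders for your input file and the columns you want to keep):
In the first shell:
mknod bcp.pipe p
awk -F',' -v OFS=',' '{ print $1, $2, $3 }' data.csv > bcp.pipe
In the second shell: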
bcp db..table in bcp.pipe -c -U ...
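The column-stripping step can be sketched as a reusable function that takes the list of columns to keep. This is a minimal illustration, assuming a plain comma-separated file with no quoted fields that contain commas; the function name `select_columns` is hypothetical.

```shell
# Hypothetical helper: reads CSV on stdin and prints only the columns whose
# 1-based numbers are listed in $1 (comma-separated, e.g. "1,3,5"), in the
# order given. Fields are split on every comma; quoting is not handled.
select_columns() {
    awk -F',' -v OFS=',' -v keep="$1" '
        BEGIN { n = split(keep, cols, ",") }
        {
            line = $(cols[1])
            for (i = 2; i <= n; i++) line = line OFS $(cols[i])
            print line
        }
    '
}

# Example with inline data: keep the first and third columns.
printf '%s\n' 'a,b,c,d' '1,2,3,4' | select_columns 1,3
```

The function's output can be redirected to the named pipe in place of the awk command in the first shell.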
You could create a view on the table which only includes the columns you want. Then bcp out the view instead of the table.
