Cassandra selective copy

I want to copy selected rows from a column family to a .csv file. The COPY command only dumps a column or an entire table to a file; it does not accept a where clause. Is there a way to use a where clause with COPY?
Another way I thought of was to run
"Insert into table2 () values ( select * from table1 where <where_clause>);" and then dump table2 to .csv, but that is not possible either.
Any help would be much appreciated.

There is no way to use a where clause with COPY, but you can use this method:
echo "select c1,c2.... FROM keySpace.Table where <condition>;" | bin/cqlsh > output.csv
This saves the query result in the output.csv file.

No, there is no built-in support for a "where" clause when exporting to a CSV file.
One alternative would be to write your own script using one of the drivers. In the script you would do the "select", then read the results and write out to a CSV file.

In addition to Amine CHERIFI's answer, the output can be turned into proper CSV by piping it through sed:
| sed -e 's/^\s+//; s_\s*\|\s*_,_g; /^-{3,}|^$|^\(.+\)$/d'
This:
- removes leading spaces,
- replaces the | column separators with commas,
- removes the header separator, empty lines and the summary line.
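Since regex syntax differs between sed builds, here is an equivalent variant using `-E` (extended regexps), run against a hypothetical cqlsh result so the cleanup can be checked without a live cluster:

```shell
# Hypothetical raw cqlsh output for a two-column query
raw=' c1 | c2
----+----
  a |  b
  b |  c

(2 rows)'

# Strip leading whitespace, turn "|" separators into commas, and drop
# the dashed rule, blank lines, and the "(N rows)" summary line
echo "$raw" | sed -E 's/^[[:space:]]+//; s/[[:space:]]*[|][[:space:]]*/,/g; /^-{3,}/d; /^$/d; /^\(.+\)$/d'
```

This prints `c1,c2`, `a,b`, `b,c` — one clean CSV line per row.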

Other ways to run CQL with a filter and redirect the output to a file:
1) Inside cqlsh, use the CAPTURE command; the output of every query executed after it is redirected to the given file (set TRACING ON first if you also want trace output captured).
Example: CAPTURE 'output.txt' -- output of the queries executed after this command is captured into the output.txt file
2) If you would like to redirect query output to a file from outside cqlsh:
./cqlsh hostname -e 'select * from keyspaceName.tableName' > fileName.txt

Related

How to execute multiple statements in Presto?

I need to run multiple lines of code in Presto together. Here is example.
drop table if exists table_a
drop table if exists table_b
The above gives me error:
SQL Error [1]: Query failed (#20190820_190638_03672_kzuv6): line 2:1: mismatched input 'drop'. Expecting: '.', <EOF>
I already tried adding ";", but no luck.
Is it possible to stack multiple statements, or do I need to execute them line by line? My actual example involves many other commands, such as CREATE TABLE, etc.
You can use the Presto command-line client to submit a SQL file that may contain many SQL statements.
/presto/executable/path/presto client --file $filename
Example:
/usr/lib/presto/bin/presto client --file /my/presto/sql/file.sql
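A minimal sketch of that workflow, assuming a writable /tmp and illustrative table names; the presto invocation itself needs a reachable coordinator, so it is shown commented out:

```shell
# Write the multi-statement file; each statement ends with a semicolon
cat > /tmp/drop_tables.sql <<'SQL'
drop table if exists table_a;
drop table if exists table_b;
SQL

# Submit the whole file in one call (the binary path and server are examples):
#   /usr/lib/presto/bin/presto --server localhost:8080 --file /tmp/drop_tables.sql
cat /tmp/drop_tables.sql
```

The client splits the file on semicolons and runs the statements in order, which is why the semicolon terminators matter here even though they fail in some query editors.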

Using mysqldump to backup database to file with full data

I'm trying to use mysqldump to backup my databases - data and all. I can use this command to dump the data on the command line:
mysqldump -u username -ppassword --skip-lock-tables --databases database
That works great, and I get a full mass INSERT statement with all the data. If I do this, however:
mysqldump -u username -ppassword --skip-lock-tables --databases database > /var/backup/$(date +\%d-\%m-\%Y)_dump.sql
to send the output to a file (I've checked that the file-name scheme works), I only get the CREATE and UPDATE commands, plus one row of data for each table. I've also tried this without --skip-lock-tables, to make sure the problem wasn't caused by a lock. This will eventually go into a cron job, so I'd like to keep it to one line if possible.
The output on the command line is quite long, but here is an example of part of the mass insert statement:
/*!40000 ALTER TABLE `clients` DISABLE KEYS */;
INSERT INTO `clients` VALUES (1,'nicholas','sallis','it#konditormeister.com','11 hunnewell circle ','newton','02458','ma','7818491970','2016-05-10 16:17:55','2016-05-10 16:17:55')
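As an aside, the `\%` escaping in the command above is needed because `%` is special in crontab lines; the date-stamped filename can be sanity-checked in a plain shell, where an unescaped `%` is fine:

```shell
# Build the same date-stamped path outside cron; the directory is illustrative
backup_file="/var/backup/$(date +%d-%m-%Y)_dump.sql"
echo "$backup_file"   # e.g. /var/backup/14-09-2016_dump.sql

# The cron-ready one-liner would then be:
#   mysqldump -u username -ppassword --skip-lock-tables --databases database > "$backup_file"
```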

CQL3.2: DROP TABLE with certain prefix?

I have a Cassandra 2.1.8 database with a bunch of tables, all in the form of either "prefix1_tablename" or "prefix2_tablename".
I want to DROP every table that begins with prefix1_ and leave anything else alone.
I know I can grab table names using the query:
SELECT columnfamily_name FROM system.schema_columnfamilies
WHERE keyspace_name='mykeyspace'
And I thought about filtering the results somehow to get only prefix1_ tables, putting them into a table with DROP TABLE prepended to each one, then executing all the statements in my new table. It was similar thinking to strategies I've seen for people solving the same problem with MySQL or Oracle.
With CQL3.2 though, I don't have access to User-Defined Functions (at least according to the docs I've read...) and I don't know how to do something like execute statements off of a table query result, as well as even how to filter out prefix1_ tables with no LIKE operator in Cassandra.
Is there a way to accomplish this?
I came up with a Bash shell script to solve my own issue. Once I realized that I could export the column families table to a CSV file, it made more sense to me to perform the filtering and text manipulation with grep and awk as opposed to finding a 'pure' cqlsh method.
The script I used:
#!/bin/bash
# No need for a USE command by making delimiter a period
cqlsh -e "COPY system.schema_columnfamilies (keyspace_name, columnfamily_name)
TO 'alltables.csv' WITH DELIMITER = '.';"
cat alltables.csv | grep -e '^mykeyspace.prefix1_' \
| awk '{print "DROP TABLE " $0 ";"}' >> remove_prefix1.cql
cqlsh -f 'remove_prefix1.cql'
rm alltables.csv remove_prefix1.cql
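The grep/awk step can be exercised on its own with a made-up keyspace.table listing, which shows exactly what ends up in remove_prefix1.cql (the table names below are hypothetical):

```shell
# Stand-in for alltables.csv: one keyspace.table per line
printf 'mykeyspace.prefix1_users\nmykeyspace.prefix2_logs\nmykeyspace.prefix1_orders\n' \
  | grep -e '^mykeyspace\.prefix1_' \
  | awk '{print "DROP TABLE " $0 ";"}'
```

This prints `DROP TABLE mykeyspace.prefix1_users;` and `DROP TABLE mykeyspace.prefix1_orders;` while the prefix2_ table is left untouched.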

Extracting sql insert with sed, line cuts

I've been reading on Stack Overflow about using sed to extract data from SQL dumps; to be precise, the end goal is to extract the INSERTs for a specific table in order to restore only that table.
I'm using this:
sed -n '/LOCK TABLES `TABLE_NAME`/,/UNLOCK TABLES/p' dump.sql > output.sql
The problem I'm having is that we have single-line INSERTs that are more than 50 MB long, so while extracting an INSERT the output gets cut off before the end of the line, like:
......
(4
3458,'0Y25565137SEOEJ','001','PREPAR',1330525937741,
NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL),
(43459,'666
I tried awk and even plain grep, and the result is the same: the line gets cut off.
Edit: I'm using this on a SQL dump from MySQL, and the system I'm working on is CentOS 5.2.
You can try awk and see if it handles the long lines better (I think it will):
awk '/LOCK TABLES `TABLE_NAME`/,/UNLOCK TABLES/' dump.sql > output.sql
But if it's a dump file created with Oracle's exp, you can import only the needed tables with:
imp user/pass tables=table1,table2 ...
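The awk range pattern prints every line between the two delimiters, delimiters included; a toy dump shows the behaviour (the table contents are invented for the demo):

```shell
# Tiny stand-in for dump.sql; the real file is a full mysqldump
cat > /tmp/dump.sql <<'EOF'
LOCK TABLES `other` WRITE;
INSERT INTO `other` VALUES (1);
UNLOCK TABLES;
LOCK TABLES `TABLE_NAME` WRITE;
INSERT INTO `TABLE_NAME` VALUES (1,'a'),(2,'b');
UNLOCK TABLES;
EOF

# Print only the TABLE_NAME block, delimiters included
awk '/LOCK TABLES `TABLE_NAME`/,/UNLOCK TABLES/' /tmp/dump.sql
```

Only the three TABLE_NAME lines are printed; the `other` table is skipped because its LOCK line does not match the start pattern.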

Is it possible to load a subset of columns using Sybase 15 bcp?

I have a CSV file with 20 or so columns and I want to load it into a table with only 9 columns - I want to throw away the rest.
Can I do it directly with bcp or do I need to preprocess the file to strip it down to just what I need?
The manual does not seem to detail it.
But then I seem to have options that aren't in the manual, e.g. -labeled?
Thanks in advance, Chris
No, this isn't possible with bcp alone.
You can combine named pipes, awk and bcp.
For example, in the first shell:
mknod bcp.pipe p
cat data.csv | awk -F',' 'BEGIN{OFS=","} {print $1,$3,$5}' > bcp.pipe   # pick out only the wanted columns; data.csv and the field numbers are illustrative
In the second shell:
bcp db..table in bcp.pipe -c -U ...
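The column-picking awk stage can be tried standalone; which fields to keep depends on your table, so the 1/3/5 selection below is purely illustrative:

```shell
# Keep columns 1, 3 and 5 of a hypothetical 5-column CSV
printf 'a,b,c,d,e\n1,2,3,4,5\n' | awk -F',' 'BEGIN{OFS=","} {print $1,$3,$5}'
```

This prints `a,c,e` and `1,3,5`; the same program, fed through the named pipe, is what bcp would load in the second shell.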
You could create a view on the table which only includes the columns you want. Then bcp out the view instead of the table.
