cassandra cqlsh <computed_ttl> error

I'm learning Cassandra CQL from the CQL 3.1 documentation manual on a Mac, with Cassandra installed from Homebrew (cqlsh 4.0.0 | Cassandra 2.0.0 | CQL spec 3.1.0 | Thrift protocol 19.37.0). From cqlsh, when I enter collections map example number 7:
UPDATE users USING TTL <computed_ttl> SET todo['2012-10-1'] = 'find water' WHERE user_id = 'frodo';
I'm getting this error:
Bad Request: line 1:22 no viable alternative at input '<'
So, are the docs wrong, or am I doing something wrong?

You need to replace <computed_ttl> with an actual TTL value, e.g.
UPDATE users USING TTL 100 SET todo['2012-10-1'] = 'find water' WHERE user_id = 'frodo';
which would cause the value to expire after 100 seconds.
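To see the effect, you can read the map entry back before and after the TTL elapses. A minimal check, assuming the users table and todo map column from the documentation example:
SELECT todo FROM users WHERE user_id = 'frodo';
Once the 100 seconds have passed, the '2012-10-1' entry should no longer appear in the result.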

Related

How can I describe a table in a Cassandra database?

$describe = new Cassandra\SimpleStatement(<<<EOD
describe keyspace.tablename
EOD
);
$session->execute($describe);
I used the above code but it is not working.
How can I fetch the field names and their data types from a Cassandra table?
Refer to the CQL documentation. DESCRIBE expects a table/schema/keyspace:
describe table keyspace.tablename
It's also a cqlsh command, not an actual CQL command. To get this information, query the system tables. Try:
select * from system.schema_columns;
or, for more recent versions:
select * from system_schema.columns ;
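To get just the field names and their data types for a single table, you can filter that system table. A sketch assuming Cassandra 3.0+ (which has system_schema) and hypothetical keyspace/table names; on older versions, system.schema_columns exposes the type through its validator column:
SELECT column_name, type
FROM system_schema.columns
WHERE keyspace_name = 'mykeyspace' AND table_name = 'mytable';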
If you're using the PHP driver, you may want to check out http://datastax.github.io/php-driver/features/#schema-metadata
Try desc table keyspace.tablename;

Unknown property 'compaction_strategy_class' with CQL 3 and Cassandra 2.0.1

Using this configuration of Cassandra:
Connected to Test Cluster at localhost:9161.
[cqlsh 4.0.1 | Cassandra 2.0.1 | CQL spec 3.1.1 | Thrift protocol 19.37.0]
When I tried to do:
ALTER TABLE snpSearch WITH compaction_strategy_class='SizeTieredCompactionStrategy'
I obtain this error:
Bad Request: Unknown property 'compaction_strategy_class'
I know that SizeTieredCompactionStrategy is the default strategy, but I also want to modify the SSTable size, and this:
ALTER TABLE snpSearch WITH compaction_strategy_class='SizeTieredCompactionStrategy' AND compaction_strategy_options:sstable_size_in_mb:10;
gives me this error:
Bad Request: line 1:116 mismatched input ':' expecting '='
I read the CQL documentation and this should be correct; does anyone know what the problem could be?
Thanks
The correct format is:
ALTER TABLE snpSearch WITH compaction={'class':'SizeTieredCompactionStrategy'};
The format for WITH options of the ALTER command is described here. The important part is:
[...] The supported (and syntax) are the same as for the CREATE TABLE statement [...]
And the example from the CQL3.1 documentation shows how the compaction and compression strategies can be set.
( Tested on [cqlsh 4.0.1 | Cassandra 2.0.1 | CQL spec 3.1.1 | Thrift protocol 19.37.0].)
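Since the question also wanted to set the SSTable size: sub-options go into the same compaction map. A hedged sketch (as far as I know, sstable_size_in_mb belongs to LeveledCompactionStrategy, while SizeTieredCompactionStrategy takes options such as min_threshold):
ALTER TABLE snpSearch WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'min_threshold': 6};
ALTER TABLE snpSearch WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 10};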

RPC timeout error while exporting data from CQL

I am trying to export data from Cassandra using the CQL client. A column family has about 100,000 rows in it. When I copy the data into a CSV file using the COPY TO command, I get the following rpc_timeout error.
copy mycolfamily to '/root/mycolfamily.csv'
Request did not complete within rpc_timeout.
I am running in:
[cqlsh 3.1.6 | Cassandra 1.2.8 | CQL spec 3.0.0 | Thrift protocol 19.36.0]
How can I increase RPC timeout limit?
I tried adding rpc_timeout_in_ms: 20000 (the default is 10000) in my conf/cassandra.yaml file, but while restarting Cassandra I get:
[root@user ~]# null; Can't construct a java object for tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=Cannot create property=rpc_timeout_in_ms for JavaBean=org.apache.cassandra.config.Config@71bfc4fc; Unable to find property 'rpc_timeout_in_ms' on class: org.apache.cassandra.config.Config
Invalid yaml; unable to start server. See log for stacktrace.
The COPY command currently does the same thing as a SELECT with LIMIT 99999999, so it will eventually time out as your data grows. Here's the export function:
https://github.com/apache/cassandra/blob/trunk/bin/cqlsh#L1524
I'm doing the same export in production. What I'm doing is the following (a rough CQL sketch follows these steps):
run select * from table where timeuuid = someTimeuuid limit 10000
write the result set to a CSV file in append (>>) mode
run the next selects with respect to the last timeuuid
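A rough CQL sketch of that chunked export, with hypothetical keyspace/table/column names; the token()-based restart shown here is one way to approximate the "continue from the last row seen" step:
-- first chunk
SELECT * FROM mykeyspace.mycolfamily LIMIT 10000;
-- next chunk: restart from the last partition key returned by the previous query
SELECT * FROM mykeyspace.mycolfamily WHERE token(user_id) > token('last_id_seen') LIMIT 10000;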
You can pipe a command into cqlsh like this:
echo "{$cql}" | /usr/bin/cqlsh -u user -p password localhost 9160 > file.csv
You can use automatic pagination by specifying a fetch size in the DataStax Java driver:
Statement stmt = new SimpleStatement("SELECT id FROM mycolfamily;");
stmt.setFetchSize(500);
ResultSet result = session.execute(stmt);
// iterate the ResultSet directly so the driver fetches further pages as needed;
// calling result.all() would load every row into memory and defeat the paging
for (Row r : result) {
    // write r to file
}
I encountered the same problem a few minutes ago; then I found CAPTURE, and it worked:
First start capturing in cqlsh and then run your query with whatever LIMIT you choose.
http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/capture_r.html
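A minimal cqlsh sketch of that approach; the file path, table name, and LIMIT are placeholders:
CAPTURE '/root/mycolfamily_export.txt';
SELECT * FROM mykeyspace.mycolfamily LIMIT 10000;
CAPTURE OFF;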
The best way to export the data is to use the nodetool snapshot option. It returns immediately, and the snapshot can be restored later. The only issue is that this export is per node and for the entire cluster.
Example:
nodetool -h localhost -p 7199 snapshot
See reference:
http://docs.datastax.com/en/archived/cassandra/1.1/docs/backup_restore.html#taking-a-snapshot

Cassandra Pig example failing with wide row input enabled

Using Cassandra 1.1.6, Pig 0.10.0 and Hadoop 1.1.0, I can successfully run the pig_cassandra example script provided with Cassandra in examples/pig.
But when I change
rows = LOAD 'cassandra://PigTest/SomeApp' USING CassandraStorage();
to:
rows = LOAD 'cassandra://PigTest/SomeApp?widerows=true' USING CassandraStorage();
I get the following error:
java.lang.IndexOutOfBoundsException: Index: 8, Size: 2
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.get(ArrayList.java:382)
at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:156)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.processInputBag(POProject.java:579)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:248)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:316)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:332)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:284)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POPreCombinerLocalRearrange.getNext(POPreCombinerLocalRearrange.java:126)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:233)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:290)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLocalRearrange.getNext(POLocalRearrange.java:256)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:271)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:266)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
This happens when running in both local and mapreduce mode, and also if I set PIG_WIDEROW_INPUT=true.
The following Pig Latin script will fail with the "widerows=true" parameter present.
rows = LOAD 'cassandra://PigTest/SomeApp?widerows=true' USING CassandraStorage();
cols = FOREACH rows GENERATE flatten(columns.name);
DUMP cols;
I can't seem to get beyond this, nor read the static columns in the SomeApp column family when using wide-row input. The same issue is present with other column families.
I had a similar issue. It may be because of bugs in get_paged_slices which were fixed in later 1.1.x releases. The solution would be to upgrade Cassandra to 1.1.8 or 1.1.9.
See:
CASSANDRA-4919: StorageProxy.getRangeSlice sometimes returns incorrect number of columns
CASSANDRA-4816: Broken get_paged_slice
CASSANDRA-5098: CassandraStorage doesn't decode name in widerow mode

How do I delete all data in a Cassandra column family?

I'm looking for a way to delete all of the rows from a given column family in Cassandra.
This is the equivalent of TRUNCATE TABLE in SQL.
You can use the truncate thrift call, or the TRUNCATE <table> command in CQL.
http://www.datastax.com/docs/1.0/references/cql/TRUNCATE
You can also do this via Cassandra CQL.
$ cqlsh
Connected to Test Cluster at localhost:9160.
[cqlsh 4.1.1 | Cassandra 2.0.6 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> TRUNCATE my_keyspace.my_column_family;
It's very simple in Astyanax. Just a single-line statement:
/* keyspace variable is Keyspace Type */
keyspace.truncateColumnFamily(ColumnFamilyName);
If you are using Hector it is easy as well:
cluster.truncate("our keyspace name here", "your column family name here");
If you are using cqlsh, then you can do it in either of two ways:
use keyspace; and then truncate column_family;
truncate keyspace.column_family;
If you want to use the DataStax Java driver, you can look at
http://www.datastax.com/drivers/java/1.0/com/datastax/driver/core/querybuilder/QueryBuilder.html
or
http://www.datastax.com/drivers/java/2.0/com/datastax/driver/core/querybuilder/Truncate.html
depending on your version.
If you are working with a cluster setup, TRUNCATE can only be used when all the nodes of the cluster are UP.
By using TRUNCATE we will lose the data (and we may not be sure how important it is), so a safer way, as well as a trick to delete the data, is to use the COPY command:
1) Back up the data using the cqlsh COPY command:
copy tablename to 'path'
2) Duplicate the file using the Linux cp command:
cp 'src path' 'dst path'
3) Edit the duplicate file in the dst path and delete all lines except the first line.
Save the file.
4) Use the cqlsh COPY command to import it back:
copy tablename from 'dst path'
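A cqlsh sketch of steps 1 and 4, with hypothetical names and paths; steps 2 and 3 (duplicating and trimming the file) happen outside cqlsh:
COPY mykeyspace.mytable TO '/root/mytable_backup.csv';
-- ... cp the file, trim the duplicate down to its first line, then re-import ...
COPY mykeyspace.mytable FROM '/root/mytable_trimmed.csv';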
