How to execute the nodetool status command in Python? - cassandra

nodetool status
I can get my database status by using this command in a shell. How can I execute this command through Python? Can you help me out?

Have a look at the Python subprocess module. It has what you need to run nodetool and read its output.
Depending on what you're after, another option in Cassandra 4+ is to use a Python client driver and query the virtual tables.
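For the subprocess route, a minimal sketch (assuming nodetool is on the PATH of the machine the script runs on):

import subprocess

# Run "nodetool status" and capture its output as text.
result = subprocess.run(
    ["nodetool", "status"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)

Keep in mind that nodetool output is formatted for humans, so parsing it programmatically tends to be brittle across Cassandra versions.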

The nodetool utility is designed for use on the command line so you can either get information or perform operations on the cluster. It isn't intended to be run programmatically.
Your question doesn't make much sense. Perhaps if you provide a bit more detail on what outcome you're after, we'd be able to give you a better answer. Cheers!

Related

Cassandra TLP-Stress Tarball Installation

I have just downloaded the tlp-stress tarball for Cassandra, extracted it, and found some JARs. What is the next step to run a stress test?
Thanks.
The quick start guide says:
from within tlp-stress run the following command to execute 10,000 queries:
bin/tlp-stress run KeyValue -n 10000
You can specify additional options, such as --host, or use a different workload (you can get the list of implemented workloads via the bin/tlp-stress list command).

What is the difference between backing up data using nodetool and the cqlsh COPY command?

Currently we have two options to take a backup of the tables in a Cassandra keyspace. We can either use nodetool commands or use the COPY command from the cqlsh terminal.
1) What are the differences between these commands ?
2) Which one is most appropriate ?
3) Also, if we are using nodetool to take a backup, we would generally flush the data from memtables to SSTables before we issue the nodetool snapshot command. So my question is: should we employ the same technique of flushing the data if we use the cqlsh COPY command?
Any help is appreciated.
Thanks very much.
GREAT question!
1) What are the differences between these commands ?
Running a nodetool snapshot creates hard links to the SSTable files of the requested keyspace. It's the same as running this from the (Linux) command line:
ln {source} {link}
A cqlsh COPY is essentially the same as doing a SELECT * FROM on a table. It'll create a text file with the table's data in whichever format you have specified.
In terms of their differences in a backup context, a file created using cqlsh COPY will contain data from all nodes, whereas nodetool snapshot needs to be run on each node in the cluster. In clusters where the number of nodes is greater than the replication factor, each snapshot will only be valid for the node on which it was taken.
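To illustrate, cqlsh COPY ... TO behaves roughly like this hedged Python driver sketch (the contact point, keyspace, and table names are placeholders):

import csv
from cassandra.cluster import Cluster   # pip install cassandra-driver

# Connect to any one node; the driver discovers the rest of the cluster.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")          # placeholder keyspace

# A full-table read, which is effectively what cqlsh COPY ... TO performs.
rows = session.execute("SELECT * FROM my_table")  # placeholder table

with open("my_table.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for row in rows:          # each row is a namedtuple of column values
        writer.writerow(row)

cluster.shutdown()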
2) Which one is most appropriate ?
It depends on what you're trying to do. If you simply need backups for a node/cluster, then nodetool snapshot is the way to go. If you're trying to export/import data into a new table or cluster, then COPY is the better approach.
Also worth noting, cqlsh COPY takes a while to run (depending on the amount of data in a table), and can be subject to timeouts if not properly configured. nodetool snapshot is nigh instantaneous; although the process of compressing and SCPing snapshot files to an off-cluster instance will take some time.
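As a rough, hedged sketch of that compress-and-copy step (the data directory, snapshot name, and destination host below are assumptions for illustration):

import glob
import subprocess
import tarfile

# Snapshot files typically live under <data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>.
data_dir = "/var/lib/cassandra/data"     # assumed default data directory
snapshot_name = "my_backup"              # the name passed to nodetool snapshot -t
snapshot_dirs = glob.glob(f"{data_dir}/*/*/snapshots/{snapshot_name}")

# Compress all snapshot directories into one archive...
with tarfile.open("snapshot.tar.gz", "w:gz") as tar:
    for d in snapshot_dirs:
        tar.add(d)

# ...then copy it to an off-cluster host.
subprocess.run(["scp", "snapshot.tar.gz", "backup-host:/backups/"], check=True)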
3) Should we employ the same technique of flushing the data if we use the cqlsh copy command ?
No, that's not necessary. As cqlsh COPY works just like a SELECT, it will follow the normal Cassandra read path, which will check structures both in RAM and on-disk.
nodetool snapshot is a good approach for any amount of data, and it creates hard links within seconds. The COPY command will take much more time, depending on the size of the data and the cluster. For small amounts of data and for testing you may use the COPY command, but for production nodetool snapshot is recommended.

Most common Cassandra CQL commands

I am looking at exploring optimization of Cassandra for a limited set of commands. For that I wanted to know which among SELECT, INSERT, UPDATE, DELETE & BATCH is the CQL command with the highest frequency of use in real-time systems. Any pointers and thoughts on this would be a great help.
There is no such thing as the most common CQL commands; it all depends on the use case for which Cassandra is deployed.
So instead of optimizing commands, you could go for use-case-based optimization:
E.g., use case: a write-oriented workload:
Optimize INSERT and UPDATE commands.
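For example, one standard write-path optimization is preparing the INSERT once and reusing it; a minimal sketch with placeholder keyspace, table, and column names:

from cassandra.cluster import Cluster   # pip install cassandra-driver

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("my_keyspace")   # placeholder keyspace

# Prepare the statement once; Cassandra parses it a single time and
# subsequent executions only ship the bound values over the wire.
insert = session.prepare("INSERT INTO users (id, name) VALUES (?, ?)")

for user_id, name in [(1, "alice"), (2, "bob")]:
    session.execute(insert, (user_id, name))

cluster.shutdown()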

Error while executing query on shark shell with DSE 4.5

I am using DataStax Enterprise 4.5 and trying to use Shark. I am able to open the Shark shell, but queries are not working. The error is:
shark> use company2;
OK
Time taken: 0.126 seconds
shark> select count(*) from nhanes;
java.lang.RuntimeException: Could not get input splits
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:158)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:347)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:240)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
FAILED: Execution Error, return code -101 from shark.execution.SparkTask
Any idea about this error?
My second question is related to backup.
I am using OpsCenter for taking backups, but is it reliable in production, or should I go for nodetool backups and schedule them on each individual node?
Thanks
Check "Could not get input splits" Error, with Hive-Cassandra-CqlStorageHandler. You can first test it using hive. If it fails in hive, you need check you keyspace partitioner. I would suggest to create a clean new keyspace and table to test it. Most likely it's something wrong with your KS settings. You can also check the replication of the keyspace, make sure it's replicated to the datacenter the cassandra node starts.
For the second question, it's recommended to use OpsCenter for backups, which is fully tested and easy to use. You can also back up manually by using nodetool on each node, but that approach is more prone to human error.

How to execute cassandra queries from a file

Is it possible to run more than one Cassandra query from a single Cassandra file?
So that if I share that file, others can run it to replicate the database on all systems.
The easiest way is to pass the file containing the CQL statements to cqlsh (using the -f option) or to run it with DevCenter.
If you are using Java, the Achilles framework has a class called ScriptExecutor that you can use to run CQL statements from a file and even plug in parameters to dynamically change the statements during execution.
ScriptExecutor documentation
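If you would rather run the file from Python, here is a minimal, hedged sketch (it assumes one statement per semicolon, no semicolons inside string literals, and the file name is a placeholder):

from cassandra.cluster import Cluster   # pip install cassandra-driver

cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

# Naive splitting on ';' works for simple schema/seed files,
# but breaks if a statement contains ';' inside a string literal.
with open("schema.cql") as f:
    statements = [s.strip() for s in f.read().split(";") if s.strip()]

for stmt in statements:
    session.execute(stmt)

cluster.shutdown()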
