I have just downloaded the tlp-stress tarball for Cassandra, extracted it, and found some jars. What is the next step to run a stress test?
Thanks.
The quick start guide says:
From within the tlp-stress directory, run the following command to execute 10,000 queries:
bin/tlp-stress run KeyValue -n 10000
You can specify additional options, such as --host, or use a different workload (you can get the list of implemented workloads via the bin/tlp-stress list command).
nodetool command
I can get my database status by using this command in a shell. How can I execute this command through Python? Can you help me out?
Have a look at the Python subprocess module. It has what you need to run nodetool and read its output.
Depending on what you're after, another option in Cassandra 4+ will be to use a Python client driver and query virtual tables.
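As a minimal subprocess sketch: the wrapper below runs nodetool and captures its output, and the parsing helper is my own illustration (the two-letter status codes it matches reflect the usual `nodetool status` layout, not a documented API):

```python
import subprocess

def nodetool(*args):
    """Run nodetool with the given arguments and return its stdout as text."""
    result = subprocess.run(["nodetool", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout

def parse_status(output):
    """Pull (state, address) pairs out of `nodetool status` output.

    Node lines start with a two-letter code: U/D (up/down) plus
    N/L/J/M (normal/leaving/joining/moving), e.g. 'UN 10.0.0.1 ...'.
    """
    nodes = []
    for line in output.splitlines():
        parts = line.split()
        if parts and parts[0] in {"UN", "UL", "UJ", "UM", "DN", "DL", "DJ", "DM"}:
            nodes.append((parts[0], parts[1]))
    return nodes

# Live usage (requires nodetool on PATH):
#     parse_status(nodetool("status"))
```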
The nodetool utility is designed for use on the command line, so you can either get information about or perform operations on the cluster. It isn't intended to be run programmatically.
Your question doesn't make much sense. Perhaps if you provide a bit more detail on what outcome you're after, we'd be able to give you a better answer. Cheers!
Is there any option in ignitevisorcmd where I can see which entries (key/value details) are present on a particular node? I tried the cache -scan -c=mycache -id8=12345678 command, but it prints entries from all the other nodes for mycache as well, instead of printing data for node 12345678 only.
The current version of Visor Cmd does not support this, but I think it would be easy to implement. I created an issue in the Ignite JIRA, which you may track or even contribute to.
I have two different independent machines running Cassandra and I want to migrate the data from one machine to the other.
Thus, I first took a snapshot of my Cassandra cluster on machine 1 according to the DataStax documentation.
Then I moved the data to machine 2, where I'm trying to import it with sstableloader.
As a note: the keyspace (open_weather) and table name (raw_weather_data) on machine 2 have been created and are the same as on machine 1.
The command I'm using looks as follows:
bin/sstableloader -d localhost "path_to_snapshot"/open_weather/raw_weather_data
And then I get the following error:
Established connection to initial hosts
Opening sstables and calculating sections to stream
For input string: "CompressionInfo.db"
java.lang.NumberFormatException: For input string: "CompressionInfo.db"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:276)
at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:235)
at org.apache.cassandra.io.sstable.Component.fromFilename(Component.java:120)
at org.apache.cassandra.io.sstable.SSTable.tryComponentFromFilename(SSTable.java:160)
at org.apache.cassandra.io.sstable.SSTableLoader$1.accept(SSTableLoader.java:84)
at java.io.File.list(File.java:1161)
at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:78)
at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:162)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
Unfortunately, I have no idea why.
I'm not sure if it's related to the issue, but somehow on machine 1 my *.db files are named rather "strangely" compared to the *.db files I already have on machine 2.
*.db files from machine 1:
la-53-big-CompressionInfo.db
la-53-big-Data.db
...
la-54-big-CompressionInfo.db
...
*.db files from machine 2:
open_weather-raw_weather_data-ka-5-CompressionInfo.db
open_weather-raw_weather_data-ka-5-Data.db
What am I missing? Any help would be highly appreciated. I'm also open to any other suggestions. The COPY command will most probably not work, since it is limited to 99999999 rows as far as I know.
P.S. I didn't want to create an overly huge post, but if you need any further information to help me out, just let me know.
EDIT:
Note that I'm using Cassandra in the stand-alone mode.
EDIT2:
After installing the same version, 2.1.4, on my destination machine (machine 2), I still get the same errors. With sstableloader I still get the above-mentioned error, and when copying the files manually (as described by LHWizard), I still get empty tables after restarting Cassandra and performing a SELECT.
Regarding the initial tokens, I get a huge list of tokens if I run nodetool ring on machine 1. I'm not sure what to do with those.
Your data is already in the form of a snapshot (or backup). What I have done in the past is the following:
1. Install the same version of Cassandra on the restore node.
2. Edit cassandra.yaml on the restore node - make sure that cluster_name and snitch are the same.
3. Edit the seeds: list and any other properties that were altered in the original node.
4. Get the schema from the original node using cqlsh DESC KEYSPACE.
5. Start Cassandra on the restore node and import the schema.
(Steps 6 & 7 may not be completely necessary, but this is what I do.)
6. Stop Cassandra; delete the contents of the /var/lib/cassandra/data/, commitlog/, and saved_caches/ folders.
7. Restart Cassandra on the restore node to recreate the correct folders, then stop it.
8. Copy the contents of the snapshots folder to each corresponding table folder on the restore node, then start Cassandra. You probably want to run nodetool repair.
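The final copy step (snapshot sstables into the live table folder) can be sketched in Python; the paths in the usage comment are hypothetical examples, not taken from the question:

```python
import shutil
from pathlib import Path

def restore_snapshot(snapshot_dir, table_dir):
    """Copy every sstable component from a snapshot directory into the
    corresponding live table directory, returning the copied file names."""
    table_dir = Path(table_dir)
    table_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for component in sorted(Path(snapshot_dir).iterdir()):
        if component.is_file():
            shutil.copy2(component, table_dir / component.name)
            copied.append(component.name)
    return copied

# Example (hypothetical paths):
# restore_snapshot(
#     "/var/lib/cassandra/data/open_weather/raw_weather_data/snapshots/snap1",
#     "/var/lib/cassandra/data/open_weather/raw_weather_data")
```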
You don't really need to bulk import the data; it's already in the correct format if you are using the same version of Cassandra, although you didn't specify that in your original question.
I am using DataStax 4.5 and trying to use Shark. I am able to open the Shark shell, but queries are not working. The error is:
shark> use company2;
OK
Time taken: 0.126 seconds
shark> select count(*) from nhanes;
java.lang.RuntimeException: Could not get input splits
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:158)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
at shark.SharkCliDriver.processCmd(SharkCliDriver.scala:347)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
at shark.SharkCliDriver$.main(SharkCliDriver.scala:240)
at shark.SharkCliDriver.main(SharkCliDriver.scala)
FAILED: Execution Error, return code -101 from shark.execution.SparkTask
Any idea about this error?
My second question is related to backups.
I am using OpsCenter for taking backups, but is it reliable in production, or should I go for nodetool backups and schedule them on each individual node?
Thanks
Check the "Could not get input splits" error with the Hive-Cassandra CqlStorageHandler. You can first test it using Hive. If it fails in Hive, you need to check your keyspace partitioner; I would suggest creating a clean new keyspace and table to test with. Most likely something is wrong with your keyspace settings. You can also check the replication of the keyspace and make sure it is replicated to the datacenter where the Cassandra node starts.
For the second question, it's recommended to use OpsCenter for backups, as it is fully tested and easy to use. You can also back up manually by using nodetool on each node, but that introduces the possibility of human error.
Is it possible to run more than one Cassandra query from a single file?
That way, if I share the file, others can run it to replicate the database on all systems.
The easiest way is to pass the file containing CQL statements to cqlsh (using the -f option), or to use DevCenter.
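As a sketch of the idea, here is a multi-statement CQL script plus a small Python wrapper that hands such a file to cqlsh -f. The keyspace/table names are made up for illustration, and the splitting helper is a naive demonstration of "several statements in one file" (it ignores semicolons inside string literals):

```python
import subprocess

# Hypothetical script with three statements in one file.
SCRIPT = """
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text);
INSERT INTO demo.users (id, name) VALUES (1, 'alice');
"""

def split_statements(script):
    """Naively split a CQL script into statements on ';'
    (ignores semicolons inside string literals -- illustration only)."""
    return [s.strip() for s in script.split(";") if s.strip()]

def run_cql_file(path):
    """Hand a CQL file to cqlsh for execution (cqlsh must be on PATH)."""
    subprocess.run(["cqlsh", "-f", path], check=True)
```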
If you are using Java, the Achilles framework has a class called ScriptExecutor that you can use to run CQL statements from a file and even plug in parameters to dynamically change the statements during execution.
ScriptExecutor documentation