How to create a Cassandra copy to a test machine? - cassandra

We have a staging environment which runs a one node cluster completely separate from our production environment. What I'd like to do is copy this one node cluster over to a test machine that I have for the sole purpose of testing.
What is the correct way to do this? Both the staging server and the test server run CentOS 6.x with DSE 4.5.1 (Cassandra 2.0.8.39).

All you need is to follow the steps described in this document:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_restore_c.html.
If your test cluster's topology differs from the original cluster's, you will need a tool such as sstableloader:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_backup_snapshot_restore_t.html
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_move_cluster.html
http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsBulkloader_t.html
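
For the sstableloader route, a rough sketch of the bookkeeping involved is below. It assumes a snapshot was already taken on the staging node with `nodetool snapshot -t <tag> <keyspace>` and that the data directory follows the Cassandra 2.0 layout `<data_dir>/<keyspace>/<table>/snapshots/<tag>`. The keyspace name, tag, staging path and host are placeholders, not values from the question, and the helper names are hypothetical. The script only locates snapshot directories and builds `sstableloader -d <host> <dir>` argument lists; it does not run anything.

```python
import os
import tempfile


def find_snapshot_dirs(data_dir, keyspace, tag):
    """Return (table, snapshot_path) pairs for one keyspace and snapshot tag."""
    keyspace_dir = os.path.join(data_dir, keyspace)
    found = []
    for table in sorted(os.listdir(keyspace_dir)):
        snapshot = os.path.join(keyspace_dir, table, "snapshots", tag)
        if os.path.isdir(snapshot):
            found.append((table, snapshot))
    return found


def sstableloader_command(target_host, staged_table_dir):
    """Build the sstableloader argument list for one staged table directory.

    sstableloader infers keyspace and table from the last two path
    components, so staged_table_dir must end in <keyspace>/<table>.
    """
    return ["sstableloader", "-d", target_host, staged_table_dir]


if __name__ == "__main__":
    # Demo on a throwaway directory tree. On a real node data_dir would be
    # something like /var/lib/cassandra/data (an assumed package default).
    data_dir = tempfile.mkdtemp()
    os.makedirs(os.path.join(data_dir, "my_keyspace", "users", "snapshots", "copy1"))
    for table, path in find_snapshot_dirs(data_dir, "my_keyspace", "copy1"):
        # Snapshot files must first be copied to a <keyspace>/<table> directory.
        staged = os.path.join("/tmp/stage", "my_keyspace", table)
        print("copy", path, "->", staged)
        print(" ".join(sstableloader_command("test-host", staged)))
```

The keyspace and table schema must already exist on the test machine before sstableloader streams data into it. When both sides are single-node clusters with the same layout, copying the snapshot files into the matching table directory on the test node and running `nodetool refresh <keyspace> <table>` may be enough.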

Related

Is Apache Spark recommended to run on windows?

I have a requirement to run Spark on Windows in a production environment. I would like advice on whether running Apache Spark on Windows is recommended and, if it is not, the reasons why.

Is it possible to fake Cassandra connection?

I have been given the task of configuring a Cassandra DB for the project. We are facing a problem: every environment has a dedicated server for Cassandra except DEV. For the DEV environment, the client does not want to provide a separate server, and the current DEV servers are already at full capacity, so we can't afford to install Cassandra on them.
My question is: is there any way to fake a connection to Cassandra in an environment? I've created a CassandraConfiguration.java class and configured the session, cluster, etc. It all works smoothly on the other environments, but on DEV it fails because it cannot connect: there is no Cassandra there. Committing the CassandraConfiguration file will break DEV.
You can use scassandra (simulated Cassandra) or Simulacron, both of which emulate Cassandra. Alternatively, you can use cassandra-unit, which runs Cassandra in the same JVM as your test.

What is the best way to test Cassandra applications?

I am currently using Achilles Embedded to spin up a local, temporary Cassandra instance and test my functionality there. While this works to some extent, there seems to be a memory leak: the more tests I run, the more I see messages like PS Scavenge GC in xx ms, and my system slows to a crawl, even freezing the mouse pointer.
So, is there a better way to automatically spin up a small Cassandra instance to run my tests against?
The tool I use for quickly creating a local Cassandra cluster is the ccm (Cassandra Cluster Manager) utility. You can easily create a multi-node cluster on your local machine for any release. See more information here.
I believe some of the Cassandra developers use ccm for their development work, so ccm is kept up to date with the newest releases.
I agree, you can use CCM. If you have a test cluster, try the cassandra-stress tool (either standalone or with a YAML profile). If I understand your question correctly, it will solve your problem.
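
The ccm workflow mentioned in these answers can be sketched as a small wrapper for a test run. This is only an illustration: it assumes ccm is installed and on the PATH, the cluster name, version and node count are placeholders, and `ccm_commands` is a hypothetical helper. The ccm invocations themselves (`ccm create <name> -v <version> -n <nodes> -s`, `ccm status`, `ccm remove <name>`) are the standard ones.

```python
import subprocess


def ccm_commands(name, version, nodes):
    """Return the ccm command lines for a throwaway local cluster, in order."""
    return [
        # Download/build the given version, create <nodes> nodes, start them (-s).
        ["ccm", "create", name, "-v", version, "-n", str(nodes), "-s"],
        # Show which nodes are up.
        ["ccm", "status"],
        # Stop the cluster and delete its data.
        ["ccm", "remove", name],
    ]


def run_all(commands):
    """Run each command in order, raising if any of them fails."""
    for command in commands:
        subprocess.check_call(command)


if __name__ == "__main__":
    # Only print the commands here; call run_all(...) on a machine with ccm.
    for command in ccm_commands("test", "2.0.8", 3):
        print(" ".join(command))
```

In a test suite the create step would typically run in a setup hook and the remove step in a teardown hook, so that each run starts from an empty cluster.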

Cassandra 2+ HPC Deployment

I am trying to deploy Cassandra on a Linux-based HPC cluster and I need some guidelines, if possible. Specifically, what is the difference between running Cassandra locally and running it on a cluster?
When managing it locally (in which case it runs smoothly), we duplicate the original files for every node inside our Cassandra directory and apply the appropriate changes for IP address, RPC, JMX, etc. However, when managing a network, which files do we need to install on each node: the whole package with all the files, or just some of the required ones, like bin/cassandra.in.sh, conf/cassandra.yaml and bin/cassandra?
I am a little confused about what to store on each node separately in order to start working on the cluster.
You need to install Cassandra on each node (VM), i.e. the whole package, and then update the config files as necessary. As described here, to configure a cluster in a single data center you need to:
Install Cassandra on each node
Configure cluster name
Configure seeds
Configure snitch, if needed
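
As a rough illustration of the steps above, the settings that usually differ in conf/cassandra.yaml can be generated per node. The sketch below is an assumption-laden example, not a complete configuration: the cluster name and IP addresses are placeholders, `node_settings` is a hypothetical helper, and only the `cluster_name`, `seed_provider`, `listen_address`, `rpc_address` and `endpoint_snitch` keys are shown. Everything else in the stock cassandra.yaml would stay as shipped.

```python
def node_settings(cluster_name, seeds, node_ip, snitch="SimpleSnitch"):
    """Return the cassandra.yaml lines that typically differ per cluster/node.

    cluster_name and seeds must be identical on every node; node_ip is the
    address of the node this fragment is written for.
    """
    seed_list = ",".join(seeds)
    return "\n".join([
        "cluster_name: '%s'" % cluster_name,
        "seed_provider:",
        "    - class_name: org.apache.cassandra.locator.SimpleSeedProvider",
        "      parameters:",
        "          - seeds: \"%s\"" % seed_list,
        "listen_address: %s" % node_ip,
        "rpc_address: %s" % node_ip,
        "endpoint_snitch: %s" % snitch,
    ])


if __name__ == "__main__":
    # Placeholder addresses: the first node acts as the seed for the others.
    nodes = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    for ip in nodes:
        print("# settings for node %s" % ip)
        print(node_settings("HPC Test Cluster", nodes[:1], ip))
        print()
```

Only the node's own addresses change from one machine to the next; the cluster name and seed list are shared, which is why the full package plus a small per-node edit is enough.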

titan rexster with external cassandra instance

I have a Cassandra cluster (2.1.0) running fine.
After installing Titan 5.1 and editing titan-cassandra.properties to point to the cluster's hostname list rather than localhost, I run the following:
titan.sh -c conf/titan-cassandra.properties start
It recognizes the running Cassandra instance and starts Elasticsearch, but times out while connecting to Rexster.
If I run it with a local Cassandra, everything works fine using the following:
titan.sh start
Do I need to make any change in the Rexster properties to point to the running Cassandra cluster?
Thanks in advance
Titan Server, as started by titan.sh, is a quick way to get started with Titan/Rexster/ES. It is designed to simplify running all of those components with one startup script. Once you start breaking things apart (e.g. a separate Cassandra cluster), you might not want to use titan.sh anymore, because it still forks a Cassandra process when it starts up. Presumably you don't need that, given that you have a separate Cassandra cluster.
Given the more advanced nature of your environment, I would simply download Rexster and configure it to connect to your cluster.
