Upgrading Cassandra without losing the current data - cassandra

My current version of Cassandra is 2.2.4 and I want to upgrade it to 3.0.10 without losing any data. How is that possible?
My cluster consists of 3 nodes with a replication factor of 2. Will this upgrade affect my cluster architecture?

Steps to upgrade the Cassandra version
1. Run nodetool drain before shutting down the existing Cassandra service.
nodetool drain -h hostname
2. Stop cassandra services.
service cassandra stop
3. Back up your Cassandra configuration files from the old installation to a safe place.
4. Update the Java version.
apt-get update
apt-get install oracle-java8-set-default
java -version
5. Install the new version of Apache Cassandra.
apt-get update
apt-get install cassandra=3.7.0
If you are running Cassandra from a tarball, you should download the latest tar.gz instead of using the package manager.
6. Configure the new product. Review, compare, merge and/or update any modifications you have previously made into the new configuration files for the new version (cassandra.yaml, cassandra-env.sh, etc.).
7. Start the cassandra services.
service cassandra start
Check the logs for warnings, errors, and exceptions.
tail -f /var/log/cassandra/system.log # or the path where you set your logs.
8. Run nodetool upgradesstables
nodetool upgradesstables
9. Check the logs for warnings, errors, and exceptions.
tail -f /var/log/cassandra/system.log # or the path where you set your logs.
10. Check the status of the cluster
nodetool -h hostname status
11. Repeat these upgrade steps on each node in the cluster.
For more details, see the linked guide on upgrading Cassandra to the latest version.
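The per-node sequence above can be sketched as a dry run that only prints the commands, so you can review them before typing anything on a live node. This is a minimal sketch, not a tested upgrade script: the function name print_upgrade_plan and the host node1.example.com are hypothetical, the apt pin syntax assumes a Debian-based install, and the backup and configuration-merge steps are left as manual work.

```shell
#!/bin/sh
# Print (do not run) the per-node upgrade commands from the steps above.
# Backing up and merging configuration files is deliberately not automated.
print_upgrade_plan() {
    host="$1"      # node to upgrade (placeholder value in the example below)
    version="$2"   # target package version, e.g. 3.0.10
    cat <<EOF
nodetool -h $host drain
service cassandra stop
apt-get update
apt-get install cassandra=$version
service cassandra start
nodetool -h $host upgradesstables
nodetool -h $host status
EOF
}

# Example: review the plan for one node before running anything by hand.
print_upgrade_plan node1.example.com 3.0.10
```

Printing the plan instead of executing it keeps the operator in the loop for the steps that need judgment, such as merging cassandra.yaml.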

I answered a similar question on dba.stackexchange, with data based on the DataStax upgrade documentation. Yes, you can upgrade your cluster without losing existing data, and yes, there is a direct upgrade path from 2.2 to 3. The idea is to use a rolling-upgrade approach. Essentially, you'll want to follow these steps to upgrade:
Stop the node.
Back up your configuration files. Depending on how you install the product, these files may be overwritten with default values during the installation.
Install the binaries (via tarball, apt-get, yum, etc...) for the new version of Cassandra.
Configure the new product. Using the backups you made of your configuration files, merge any modifications you have previously made into the new configuration files for the new version. Configuration options change often, so be sure to double check the version restrictions for additional steps and changes regarding configuration. This is necessary when upgrading to Cassandra 3, as you cannot use the existing config files from 2.2 or lower.
Start the node.
Upgrade the sstables on each node: $ nodetool upgradesstables
Check the logs for warnings, errors and exceptions. Repeat on each node in the cluster. The upgradesstables step can be run on each node after the fact. Cassandra can read the sstables for one version lower, but you'll need to complete that step on all nodes to get the full benefits of the new Cassandra 3 storage engine.
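For the log check, a small filter can be easier to read than tailing the whole file. The following is a rough sketch under two assumptions: the log lives at /var/log/cassandra/system.log unless you pass another path, and each log line starts with its level (as in the default logging pattern). The function name scan_cassandra_log is hypothetical.

```shell
#!/bin/sh
# Print WARN and ERROR lines, plus any line that names an exception.
# Default path is an assumption for package installs; pass your own as $1.
scan_cassandra_log() {
    log_file="${1:-/var/log/cassandra/system.log}"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # grep exits 1 when nothing matches; treat that as a clean log.
    grep -E '^(WARN|ERROR)|Exception' "$log_file" || true
}
```

Run it as scan_cassandra_log after starting the node and again after nodetool upgradesstables finishes; empty output means nothing matched the filter.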
Edit 20170518
Can you please explain the step 2. Where to install and how to install?
Since you are upgrading, it depends on how the initial install was done, which also depends on the OS and package manager (if any) used.
Debian-based Linux (Debian, Ubuntu, Knoppix)
sudo dpkg -S cassandra should tell you where it is installed.
Red Hat-based Linux (CentOS, Fedora, RHEL)
sudo rpm -ql cassandra should tell you where it is installed.
If neither of those work, then your nodes were probably built with the tarball install process. And seriously, that's like anybody's guess as to where the binaries were installed. Common locations are /etc/cassandra, /opt/cassandra/ and /usr/local/cassandra.
Once you figure that out, you should be able to invoke an upgrade with your package manager using apt-get (Debian):
sudo apt-get update
sudo apt-get install cassandra
For yum (Red Hat), right now I think you still need to download the RPM, as they don't quite have that in the correct repos yet:
sudo rpm -Uvh cassandra-3.10-noarch.rpm
And if you're running on a tarball install, what I like to do is rename the directory before downloading and untarring the new binaries:
sudo mv /etc/cassandra /etc/cassandra_20170510
sudo mv ~/Downloads/apache-cassandra-3.10.tar.gz /etc/
cd /etc
sudo tar -zxvf apache-cassandra-3.10.tar.gz
sudo mv /etc/apache-cassandra-3.10 /etc/cassandra
And don't forget to change ownership on the new dir to match the previous install.
More information on the specifics behind this process (and each method) can be found on the Apache Cassandra Download page.
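The ownership step for a tarball install can be sketched as below. It assumes GNU stat (the -c flag, as found on Linux) and that the renamed old directory still carries the ownership you want; the function name match_ownership is hypothetical.

```shell
#!/bin/sh
# Copy the numeric owner and group of the old install directory
# onto the new one, recursively. Assumes GNU stat (-c format flag).
match_ownership() {
    old_dir="$1"   # e.g. /etc/cassandra_20170510
    new_dir="$2"   # e.g. /etc/cassandra
    owner=$(stat -c '%u:%g' "$old_dir") || return 1
    chown -R "$owner" "$new_dir"
}
```

With the directory names used above, this would be invoked as root as match_ownership /etc/cassandra_20170510 /etc/cassandra.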

Related

Steps required to install apache cassandra-3.11.x as a service on centos offline?

I want to install Apache Cassandra as a process.
Can we turn a process into a service, and if so, what needs to be done?
Or do I have to proceed with an RPM installation of the Cassandra package?
The steps below install Cassandra as a service.
Download the latest RPM package:
https://www.apache.org/dist/cassandra/redhat/311x/cassandra-3.11.4-1.noarch.rpm
Go to the directory where you downloaded the RPM package and run the command below as a superuser:
sudo rpm -ivh cassandra-3.11.4-1.noarch.rpm
This installs Cassandra and creates its directories in several system locations. With the default configuration (no changes to the cassandra.yaml file), the important directories to check are:
data dir- /var/lib/cassandra/
log dir- /var/log/cassandra/
configuration dir- /etc/cassandra/conf/
Now run the command
sudo service cassandra start
This starts Cassandra as a service. You can check nodetool status for more information.

How to install Apache Cassandra on CentOS 7?

The Cassandra installation documentation mentions installation from a tarball or as a Debian package. Is there a way to install it using yum, now that DataStax no longer provides the distribution?
CentOS uses RPM as package format as far as I can remember. So look on http://cassandra.apache.org/download/:
Installation from RPM packages
For the <release series> specify the major version number, without dot, and with an appended x. The latest <release series> is 311x. For older releases, the <release series> can be one of 30x, 22x, or 21x.
(Not all versions of Apache Cassandra are available, since building RPMs is a recent addition to the project.)
Add the Apache repository of Cassandra to /etc/yum.repos.d/cassandra.repo, for example for the latest 3.11 version:
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
Install Cassandra, accepting the gpg key import prompts:
sudo yum install cassandra
Start Cassandra (will not start automatically):
service cassandra start
On systemd-based distributions you may need to run systemctl daemon-reload once to make Cassandra available as a systemd service. This should happen automatically when running the command above.
Make Cassandra start automatically after reboot:
chkconfig cassandra on
Please note that official RPMs for Apache Cassandra only have been available recently and are not tested thoroughly on all platforms yet. We appreciate your feedback and support and ask you to post details on any issues in the corresponding Jira ticket.
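The release-series naming rule (major and minor version without the dot, plus a trailing x) can be expressed as a small helper that also prints the repo stanza shown above. This is only a sketch: the function names release_series and print_repo_file are hypothetical, and the input is assumed to be a full X.Y.Z version string.

```shell
#!/bin/sh
# Derive the repository release series (e.g. 311x) from an X.Y.Z version.
release_series() {
    version="$1"
    major=${version%%.*}   # text before the first dot
    rest=${version#*.}     # text after the first dot
    minor=${rest%%.*}      # text up to the next dot
    printf '%s%sx\n' "$major" "$minor"
}

# Print the yum repo stanza from the quoted instructions for that series.
print_repo_file() {
    series=$(release_series "$1")
    cat <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://www.apache.org/dist/cassandra/redhat/$series/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://www.apache.org/dist/cassandra/KEYS
EOF
}
```

One way to use it is print_repo_file 3.11.4 | sudo tee /etc/yum.repos.d/cassandra.repo, followed by the yum install step.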

Missing hadoop package in Bigtop (centos) - installation issue

I am trying to install bigtop on centos6 (VM using virtualbox).
I am following links given below with little modifications to get latest versions (bigtop 1.1.0) -
http://www.dummies.com/how-to/content/set-up-the-hadoop-environment-with-apache-bigtop.html
https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop+0.5.0
To be precise, I have run following commands till now -
wget -O /etc/yum.repos.d/bigtop.repo http://www.apache.org/dist/bigtop/bigtop-1.1.0/repos/centos6/bigtop.repo
yum install hadoop\* mahout\* oozie\* hbase\* hive\* hue\* pig\* zookeeper\*
Now the problem is, it says -
No package hadoop* available.
No package hue* available.
No package zookeeper* available.
I am new to Linux and don't completely understand what exactly these commands are doing. I have wasted an entire day on this. As I am just trying to explore Hadoop on my VM, I am fine with some older version of Bigtop too, but I would prefer to get at least Hadoop 2.0 or above.
Can someone help on this?
Thanks.
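You have to refresh the package metadata between adding a new repository and installing packages from it. On CentOS that means sudo yum clean all followed by sudo yum makecache; sudo apt-get update is the equivalent on Debian-based systems.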

How To Restore a Missing Redis Service

I installed an older version of Redis on a CentOS server. I tried to remove that old version and update it to latest version, but it seems that the redis service is gone and the new version installation doesn't reproduce it. Is there any way I can uninstall the Redis completely and make a fresh install? Otherwise, is there any way I can reinstall Redis service? When I check service list, I see redis in the list but when I execute service Redis restart, it says "unrecognized service".
If you want to remove the old Redis package, you can use the yum remove command as below.
yum remove redis
Then check whether it is still installed:
rpm -qi redis
and also check its files:
rpm -ql redis
If it is still there, you can remove it as below.
rpm -e redis
(or you can give the full package name with its version)
Then you can install the new version you want.
wget -r --no-parent -A 'epel-release-*.rpm' http://dl.fedoraproject.org/pub/epel/7/x86_64/e/
rpm -Uvh dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-*.rpm
Then run
yum install redis
or you can download the RPM and install it as below
rpm -ivh redis-"version".rpm
but it is better to use yum because it resolves all dependencies.
You might try init 1 then init 5 to take the system to single-user mode and then back to the GUI, thus restarting all services, in case your Redis relies on another service. Also do this as the superuser.

Installing cassandra in ubuntu?

I have already installed Cassandra on Ubuntu using the wiki.
The problem is that I have no control over which version to install and upgrade to in the future.
I want to be able to install a specific version, not just the latest, because I have a machine running 0.6.2 and now I want another node, on which I want to install 0.6.2 as well.
How can I install the Debian package for a specific version instead of the latest one?
To install a specific version of Cassandra you can do something like this.
In this case I want to install Cassandra 1.2.8:
sudo apt-get clean
sudo apt-get update
sudo apt-get install cassandra=1.2.8
The best way I have found so far to do something like this is pinning. This is a little inconvenient at the moment because you have to manually create the pinning preferences (and change them if necessary). Also, the pinning will not work with aptitude, in case you use that.
Another example is the pinning I have done for php here. However, you have to make sure that whatever version you want to have is available in the repos/ppas that you have configured in your sources.list (sources.list.d).
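As an illustration of the pinning approach, the stanza below holds a package at one version. It is a sketch rather than the answerer's own configuration: the function name print_pin_stanza is hypothetical, and the priority 1001 is an assumption chosen because a priority above 1000 lets apt install the pinned version even when that means a downgrade.

```shell
#!/bin/sh
# Print an apt preferences stanza that pins a package to one version.
# Priority 1001 is an assumption; above 1000 permits downgrades.
print_pin_stanza() {
    package="$1"
    version="$2"
    cat <<EOF
Package: $package
Pin: version $version
Pin-Priority: 1001
EOF
}
```

For example, print_pin_stanza cassandra 1.2.8 | sudo tee /etc/apt/preferences.d/cassandra writes the pin, after which sudo apt-get update and sudo apt-get install cassandra should select that version if it is available in your configured repositories.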

Resources