Cassandra service not starting - cassandra

I'm trying to start Cassandra but don't see any response.
hduser@vagrant:~/cassandra/apache-cassandra-2.1.12-src/bin$ cassandra -f
hduser@vagrant:~/cassandra/apache-cassandra-2.1.12-src/bin$
I downloaded Cassandra into this folder:
/home/hduser/cassandra/apache-cassandra-2.1.12-src

It looks like you downloaded the source distribution of Cassandra. Try downloading the binary ('bin') distribution from the download page instead.
Otherwise, you can compile the source distribution with Apache Ant: run ant from the root directory of the extracted archive, and then run cassandra as usual.
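To tell which kind of archive was extracted, a small check along the following lines may help. It is only a heuristic sketch: it assumes that a source tree contains build.xml and a src/ directory, and that a binary distribution ships a prebuilt apache-cassandra-*.jar under lib/. The function name classify_cassandra_dist is made up for this example.

```shell
# classify_cassandra_dist DIR
# Prints "source", "binary", or "unknown" for an extracted Cassandra archive.
# Heuristic only (an assumption, not an official check).
classify_cassandra_dist() {
  dist_dir="$1"
  if [ -f "$dist_dir/build.xml" ] && [ -d "$dist_dir/src" ]; then
    # Ant build file plus a src/ tree: looks like the source distribution.
    echo "source"
  elif ls "$dist_dir"/lib/apache-cassandra-*.jar >/dev/null 2>&1; then
    # A prebuilt Cassandra jar is present: looks like the binary distribution.
    echo "binary"
  else
    echo "unknown"
  fi
}

# Example call, using the folder from the question:
#   classify_cassandra_dist /home/hduser/cassandra/apache-cassandra-2.1.12-src
# If it prints "source", run ant in that directory before starting cassandra.
```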

Related

Cloudera Quick Start VM lacks Spark 2.0 or greater

To test and learn Spark functions, developers need a recent version of Spark, because many APIs and methods from before version 2.0 are obsolete and no longer work in newer versions. This poses a challenge: developers are forced to install Spark manually, which wastes a considerable amount of development time.
How do I use a later version of Spark on the Quickstart VM?
Nobody should have to waste the setup time I did, so here is the solution.
SPARK 2.2 Installation Setup on Cloudera VM
Step 1: Download a quickstart_vm from the link:
Prefer the VMware platform, as it is easy to use, although all the options are viable.
The entire tar file is around 5.4 GB. You need to provide a business email address, as personal email addresses are not accepted.
Step 2: The virtual environment requires around 8 GB of RAM; allocate sufficient memory to avoid performance glitches.
Step 3: Please open the terminal and switch to root user as:
su root
password: cloudera
Step 4: Cloudera provides Java 1.7.0_67, which is old and does not meet our needs. To avoid Java-related exceptions, install Java with the following commands:
Downloading Java:
wget -c --header "Cookie: oraclelicense=accept-securebackup-cookie" http://download.oracle.com/otn-pub/java/jdk/8u131-b11/d54c1d3a095b4ff2b6607d096fa80163/jdk-8u131-linux-x64.tar.gz
Switch to the /usr/java/ directory with the "cd /usr/java/" command.
Copy (cp) the downloaded Java tar file to the /usr/java/ directory.
Untar the file with "tar -zxvf jdk-8u131-linux-x64.tar.gz".
Open the profile file with the command "vi ~/.bash_profile".
Set JAVA_HOME to the new Java directory:
export JAVA_HOME=/usr/java/jdk1.8.0_131
Save and Exit.
For the change to take effect, run the following command in the shell:
source ~/.bash_profile
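Before relying on the new JAVA_HOME, it can be worth confirming that the directory really contains a Java binary. The following is a minimal sketch, not part of the original steps: the helper name use_jdk is invented here, and prepending bin/ to PATH is an extra assumption (the steps above only set JAVA_HOME).

```shell
# use_jdk DIR
# Export JAVA_HOME and put its bin/ first on PATH, but only if DIR
# actually contains an executable bin/java. Returns non-zero otherwise.
use_jdk() {
  jdk_dir="$1"
  if [ ! -x "$jdk_dir/bin/java" ]; then
    echo "no executable java under $jdk_dir/bin" >&2
    return 1
  fi
  export JAVA_HOME="$jdk_dir"
  # Assumption: we also want this JDK's java to win on the command line.
  export PATH="$JAVA_HOME/bin:$PATH"
}

# Example call, using the directory from the steps above:
#   use_jdk /usr/java/jdk1.8.0_131 && java -version
```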
The Cloudera VM provides Spark 1.6 by default. However, the 1.6 APIs are old and do not match production environments, so we need to download and install Spark 2.2 manually.
Switch to /opt/ directory with the command:
cd /opt/
Download spark with the command:
wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz
Untar the spark tar with the following command:
tar -zxvf spark-2.2.0-bin-hadoop2.7.tgz
We need to define some environment variables as default settings:
Please open a file with the following command:
vi /opt/spark-2.2.0-bin-hadoop2.7/conf/spark-env.sh
Paste the following configurations in the file:
SPARK_MASTER_IP=192.168.50.1
SPARK_EXECUTOR_MEMORY=512m
SPARK_DRIVER_MEMORY=512m
SPARK_WORKER_MEMORY=512m
SPARK_DAEMON_MEMORY=512m
Save and exit
We need to start spark with the following command:
/opt/spark-2.2.0-bin-hadoop2.7/sbin/start-all.sh
Export SPARK_HOME:
export SPARK_HOME=/opt/spark-2.2.0-bin-hadoop2.7/
Change the permissions of the directory:
chmod 777 -R /tmp/hive
Try "spark-shell"; it should work.
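The export above only lasts for the current shell session, and typing a bare spark-shell also requires $SPARK_HOME/bin to be on PATH. One way to make both permanent is sketched below; the helper name add_spark_to_profile is hypothetical, and adding the bin directory to PATH is an assumption beyond the original steps.

```shell
# add_spark_to_profile PROFILE_FILE SPARK_DIR
# Append SPARK_HOME and PATH exports to a profile file, once only.
add_spark_to_profile() {
  profile="$1"
  spark_dir="$2"
  # Skip if a SPARK_HOME export is already present, so reruns are harmless.
  if grep -q '^export SPARK_HOME=' "$profile" 2>/dev/null; then
    return 0
  fi
  {
    echo "export SPARK_HOME=$spark_dir"
    # Single quotes keep $SPARK_HOME and $PATH unexpanded in the file.
    echo 'export PATH="$SPARK_HOME/bin:$PATH"'
  } >> "$profile"
}

# Example call, using the paths from the steps above:
#   add_spark_to_profile ~/.bash_profile /opt/spark-2.2.0-bin-hadoop2.7
#   source ~/.bash_profile
```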

Elastic search installation questions on RHEL

I am installing Elasticsearch on a Linux box. As I understand it, there are a couple of options, such as tar and RPM, but I am not sure of the difference between the two. I find the tar very easy to download and extract. Please help explain when you would choose tar vs. RPM or other options.
Also, I have multiple JRE versions on my servers. Is there a way to specify the JRE path in the Elasticsearch configuration? So far I have exported JAVA_HOME before starting Elasticsearch.
A tar file is a (usually compressed) archive containing the binaries, config, and other files your application requires.
An RPM is a package installed through a package manager (rpm or yum), which allows easier installation of the same files that would otherwise come in one or more tar files.
Using a package manager is usually preferable, as it can install dependencies and allows cleaner removal or updating of applications.
After installation, I was also facing "bootstrap checks failed" every time I tried to set network.host to the machine's IP.
The changes below solved the problem:
network.host: 0.0.0.0
http.port: 9200
transport.host: localhost
transport.tcp.port: 9300
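On the JRE question: rather than exporting JAVA_HOME for the whole session, it can be set for a single command only, which keeps the other JRE versions on the server untouched. The sketch below assumes the Elasticsearch start script honors JAVA_HOME (as the question suggests it does); run_with_jre is a made-up helper name and the paths in the example call are placeholders.

```shell
# run_with_jre JRE_DIR COMMAND [ARGS...]
# Run COMMAND with JAVA_HOME pointing at JRE_DIR for that process only.
run_with_jre() {
  jre_dir="$1"
  shift
  if [ ! -x "$jre_dir/bin/java" ]; then
    echo "no executable java under $jre_dir/bin" >&2
    return 1
  fi
  # env sets the variable for the child process; the calling shell's
  # JAVA_HOME is left as it was.
  env JAVA_HOME="$jre_dir" "$@"
}

# Example call (placeholder paths):
#   run_with_jre /usr/java/jdk1.8.0_131 ./bin/elasticsearch
```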

How to upgrade Spark to newer version?

I have a virtual machine with Spark 1.3 on it, but I want to upgrade it to Spark 1.5, primarily due to certain supported functionality that was not in 1.3. Is it possible to upgrade Spark from 1.3 to 1.5, and if so, how can I do that?
Pre-built Spark distributions, like the one I believe you are using based on another question of yours, are rather straightforward to "upgrade", since Spark is not actually "installed". Actually, all you have to do is:
Download the appropriate Spark distro (pre-built for Hadoop 2.6 and later, in your case)
Unzip the tar file in the appropriate directory (i.e. where the folder spark-1.3.1-bin-hadoop2.6 already is)
Update your SPARK_HOME (and possibly some other environment variables depending on your setup) accordingly
Here is what I just did myself, to go from 1.3.1 to 1.5.2, in a setting similar to yours (vagrant VM running Ubuntu):
1) Download the tar file in the appropriate directory
vagrant@sparkvm2:~$ cd $SPARK_HOME
vagrant@sparkvm2:/usr/local/bin/spark-1.3.1-bin-hadoop2.6$ cd ..
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema
ipcontroller ipengine2 ipython pygmentize
vagrant@sparkvm2:/usr/local/bin$ sudo wget http://apache.tsl.gr/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
[...]
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6.tgz
ipcontroller ipengine2 ipython pygmentize
Note that the exact mirror you should use with wget will probably differ from mine, depending on your location; you will get it by clicking the "Download Spark" link on the download page, after you have selected the package type to download.
2) Unpack the tgz file with
vagrant@sparkvm2:/usr/local/bin$ sudo tar -xzf spark-1.*.tgz
vagrant@sparkvm2:/usr/local/bin$ ls
ipcluster ipcontroller2 iptest ipython2 spark-1.3.1-bin-hadoop2.6
ipcluster2 ipengine iptest2 jsonschema spark-1.5.2-bin-hadoop2.6
ipcontroller ipengine2 ipython pygmentize spark-1.5.2-bin-hadoop2.6.tgz
You can see that now you have a new folder, spark-1.5.2-bin-hadoop2.6.
3) Update SPARK_HOME (and possibly other environment variables you are using) to point to this new directory instead of the previous one.
And you should be done, after restarting your machine.
Notice that:
You don't need to remove the previous Spark distribution, as long as all the relevant environment variables point to the new one. That way, you may even quickly move "back-and-forth" between the old and new version, in case you want to test things (i.e. you just have to change the relevant environment variables).
sudo was necessary in my case; it may be unnecessary for you depending on your settings.
After ensuring that everything works fine, it's a good idea to delete the downloaded tgz file.
You can use the exact same procedure to upgrade to future versions of Spark, as they come out (rather fast). If you do this, either make sure that previous tgz files have been deleted, or modify the tar command above to point to a specific file (i.e. no * wildcards as above).
1. Set your SPARK_HOME to /opt/spark.
2. Download the latest pre-built binary, e.g. spark-2.2.1-bin-hadoop2.7.tgz (wget works for this).
3. Create a symlink to the latest download: ln -s /opt/spark-2.2.1 /opt/spark
4. Edit the files in $SPARK_HOME/conf accordingly.
For every new version you download, just create the symlink to it (step 3):
ln -s /opt/spark-x.x.x /opt/spark
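One caveat with the symlink approach: if /opt/spark already exists as a link to an older version, a plain ln -s will not replace it and instead creates the new link inside the old directory. A sketch that repoints the link in place is shown below; the helper name switch_spark is invented for this example.

```shell
# switch_spark TARGET_DIR LINK
# Point LINK at TARGET_DIR, replacing any existing symlink at LINK.
switch_spark() {
  target="$1"
  link="$2"
  if [ ! -d "$target" ]; then
    echo "not a directory: $target" >&2
    return 1
  fi
  # -s: symbolic link, -f: replace an existing link,
  # -n: treat an existing link to a directory as a file, not a directory.
  ln -sfn "$target" "$link"
}

# Example call (directory name depends on the archive you extracted):
#   switch_spark /opt/spark-2.2.1-bin-hadoop2.7 /opt/spark
```

Switching back to the previous version is the same call with the old directory as the first argument.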

How to move data to another folder in memsql

I am working with a MemSQL cluster as the primary storage. By default, data files are installed in a location like the following on CentOS 6.x:
/var/lib/memsql-ops/data/installs/MI9dfcc72a5b044f2694b5f7028803a21e
Is there any way to relocate the data path to another folder on the same machine?
This is not the best way, but it works. I just reinstalled MemSQL to another directory:
sudo mkdir /data/memsql
sudo ./install.sh --root-dir /data/memsql
In this case MemSQL Ops will still be in /var/lib/memsql-ops, but all nodes will be installed to the /data/memsql directory (look at the symlink /var/lib/memsql), and all data will be inside this directory too.
P.S. You can find additional installation options with the memsql-ops agent-install --help command.

rpmbuild differences in RHEL 5.7 and RHEL 6.1

I'm trying to build an RPM using rpmbuild, which would work for both RHEL 5.7 machines and RHEL 6.1 machines, and I'm having some trouble understanding how to structure my rpmbuild/SOURCE directory.
According to what I understood, if my package name is XXX, then I need to prepare rpmbuild/SOURCE/XXX.tar.gz, a tarball which contains:
1. A directory named XXX;
2. In it, all the directories and files I'm installing should be ordered as if their paths are relative to the root directory (i.e. /)
For instance, if I want to install a file called foo.sh to /tmp/XXXdir/, I need to have rpmbuild/SOURCE/XXX.tar.gz contain XXX/tmp/XXXdir/foo.sh
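As an illustration of the layout described here, the following sketch builds such a tarball in a scratch directory so its contents can be listed. It only reproduces the tarball structure; how those paths are mapped at install time depends on the spec file, which is not shown in the question. The function name make_sample_tarball is made up for the example.

```shell
# make_sample_tarball
# Builds XXX.tar.gz with the layout XXX/tmp/XXXdir/foo.sh in a fresh
# scratch directory and prints the path of the resulting tarball.
make_sample_tarball() {
  work=$(mktemp -d)
  mkdir -p "$work/XXX/tmp/XXXdir"
  printf '#!/bin/sh\necho foo\n' > "$work/XXX/tmp/XXXdir/foo.sh"
  # -C makes the stored paths relative to $work, so they start with XXX/.
  tar -czf "$work/XXX.tar.gz" -C "$work" XXX
  echo "$work/XXX.tar.gz"
}

# Example: list the stored paths to confirm the layout.
#   tar -tzf "$(make_sample_tarball)"
```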
This is what I understood, and this is what works when I install my RPM on my RHEL 5.7 machine (i.e. in the example above the file is installed to /tmp/XXXdir/foo.sh).
However, on an RHEL 6.1 machine I get the undesired behaviour of having my files installed to a newly created /XXX directory, and from there I get the same tree structure I wanted for / (i.e. in the example above I get the file at /XXX/tmp/XXXdir/foo.sh).
Any idea why this happens? Perhaps I've got it wrong and my rpmbuild/SOURCE structure is not as it should be? Any insights would be very helpful.
Thanks a lot in advance,
Lior

Resources