Can CCM create command use a locally installed version?

I'm trying to create a Cassandra cluster locally on a single Windows 64-bit machine and followed these instructions.
I already have Cassandra 3.7 installed locally and was assuming there'd be a way to make use of that installation through ccm. But it looks like ccm always tries to download and install the Cassandra version itself. Looking into the ccm create [options] didn't give me a pointer.
Does this need to be followed instead for an already installed version?

You can create a cluster with ccm by using the --install-dir= parameter as described in the README.
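For example, a minimal sketch (the cluster name, node count, and install path are placeholders; point --install-dir at your own Cassandra 3.7 directory):
# Reuse an existing local install instead of letting ccm download one
ccm create local37 --install-dir=C:\apache-cassandra-3.7 -n 3
ccm start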

Related

Eclipse Hono - Installation (Version 1.1.1)

I am not sure of the exact instructions to install Hono 1.1.1 locally. Following the documentation, I was able to build the project with Maven, but I am not sure how to proceed.
I was using version 0.9 before, in which I managed to run Hono with Docker Swarm by running the swarm_deploy.sh script located in the deploy folder after building the project with Maven. In Hono 1.1.1, the deploy folder has services.sh instead of swarm_deploy.sh.
I would like to know how I could run it with Docker Swarm as in version 0.9. Are there any major drawbacks to this approach?
Note: I am looking for a simple way to install Hono locally, as it's a small experimental project; I am not aiming at a fully scalable setup such as Kubernetes yet.
Sorry, but we no longer support deployment to plain Docker Swarm. You shouldn't have any issues installing Hono 1.1.1 using the Helm chart to a local minikube or kind (single node) cluster, though. There is no big difference in resource consumption compared to plain Docker Swarm, in particular if you are using kind.
Using this approach there also is no need to compile Hono from source. Just follow the Hono chart's README.
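As a rough sketch, assuming Helm 3 and a kind or minikube cluster that is already running (the chart repository URL and the release/namespace names follow the Eclipse IoT Packages documentation; check the chart's README for the current values):
# Add the Eclipse IoT Packages chart repository and install the Hono chart
helm repo add eclipse-iot https://eclipse.org/packages/charts
helm repo update
kubectl create namespace hono
helm install eclipse-hono -n hono eclipse-iot/hono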

Datastax Enterprise Installation on Virtual Box CentOS

Can anyone please guide me step by step through a DataStax Enterprise installation on VirtualBox CentOS?
I checked the DataStax documentation, but I am getting a little confused by a few of the steps, which is why I am not satisfied. I also checked other resources but was not able to understand them completely.
So please help me understand the installation process step by step, with all the basic steps.
Thanks in advance.
You may have an easier time using OpsCenter's Lifecycle Manager to deploy DSE. (Disclaimer, I am a Lifecycle Manager dev so am biased.)
First you need to install OpsCenter in a separate VM or Centos box. If you're able to get through the Java install and yum repository setup parts of DSE setup, this won't be difficult: https://docs.datastax.com/en/opscenter/6.0/opsc/install/opscInstallRHEL_t.html
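As a rough sketch of the repo-file approach from that guide (the baseurl below is a placeholder; use the exact repository URL and any credentials from the linked page):
# /etc/yum.repos.d/datastax.repo
[opscenter]
name = DataStax Repository
baseurl = https://rpm.datastax.com/enterprise
enabled = 1
gpgcheck = 0
# then install and start OpsCenter
sudo yum install opscenter
sudo service opscenterd start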
Then run an install job from LCM: https://docs.datastax.com/en/opscenter/6.0/opsc/LCM/opscLCMinstallJob.html Examine the pre-requisites section of that page carefully. It will show you the things you need to do in LCM to get ready to run the job; it's all point-and-click, though.
The only pre-requisites on your target DSE machine are "python" (usually installed by default) and, for the moment, "which", though we'll be removing that dependency in an upcoming version.
Note that at the end of this process, you'll need to provide cqlsh an IP address, username, and password to connect to the cluster, even when making a "local" connection from your DSE VM. For example:
cqlsh 192.168.1.100 -u cassandra -p the-password-you-chose-during-lcm-install

Using container for Linux applications?

I am experimenting with multiple versions of QEMU.
This involves downloading different versions and variants of source code, and running the usual: configure, make and make install.
The problem is I can't install multiple versions simultaneously because they use the same install script. I need to uninstall (make uninstall) before I install another one. This only works if I have kept the makefile of the installed binaries.
I think what I would like to do is something similar to Python's virtualenv. A standalone Linux user(?) environment for each application that I can easily remove.
Is there such a thing? Or is my approach completely flawed?
I think the best approach for such cases is a Docker container. Docker is a container-based virtualization technology in which you can build a customized Linux-based environment and host your application inside it. That means you have containerized your application, and it's ready to be distributed and run easily.
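For example, a rough sketch of that idea: build each QEMU version in its own container and snapshot it as an image, so removing a version is just a docker rmi. The image tag, container name, and build dependencies below are illustrative and vary by QEMU version:
# Start a throwaway build environment
docker run -it --name qemu-build ubuntu:14.04 bash
# inside the container: install the toolchain and deps, then do the usual build
#   apt-get update && apt-get install -y build-essential libglib2.0-dev zlib1g-dev
#   ./configure && make && make install
# back on the host: snapshot the container as a versioned image
docker commit qemu-build qemu:2.5
docker run -it qemu:2.5 qemu-system-x86_64 --version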

Connecting SparkR to the spark cluster

I have a spark cluster running on 10 machines (1 - 10) with the master at machine 1. All of these run on CentOS 6.4.
I am trying to connect a JupyterHub installation (which is running inside an Ubuntu Docker container because of issues with installing on CentOS), using SparkR, to the cluster and get the Spark context.
The code I am using is
Sys.setenv(SPARK_HOME="/usr/local/spark-1.4.1-bin-hadoop2.4")
library(SparkR)
sc <- sparkR.init(master="spark://<master-ip>:7077")
The output I get is
attaching package: ‘SparkR’
The following object is masked from ‘package:stats’:
filter
The following objects are masked from ‘package:base’:
intersect, sample, table
Launching java with spark-submit command spark-submit sparkr-shell /tmp/Rtmpzo6esw/backend_port29e74b83c7b3
Error in sparkR.init(master = "spark://10.10.5.51:7077"): JVM is not ready after 10 seconds
Error in sparkRSQL.init(sc): object 'sc' not found
I am using Spark 1.4.1. The spark cluster is also running CDH 5.
The JupyterHub installation can connect to the cluster via pyspark, and I have Python notebooks which use pyspark.
Can someone tell me what I am doing wrong?
I have a similar problem and have been searching all around but found no solutions. Can you please tell me what you mean by "jupyterhub installation (which is running inside a ubuntu docker because of issues with installing on CentOS)"?
We have 4 clusters too, on CentOS 6.4. Another problem of mine is how to use an IDE like IPython or RStudio to interact with these 4 servers. Do I use my laptop to connect to these servers remotely (if yes, then how?), and if not, what could be the other solution?
Now, to answer your question, I can give it a try. I think you have to use the --yarn-cluster option as stated here. I hope this helps you solve the problem.
Cheers,
Ashish
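For what it's worth, a minimal sketch of that idea from the shell (the Spark path follows the question; yarn-client is the mode a SparkR shell would typically use against a YARN/CDH cluster):
# assumes HADOOP_CONF_DIR points at your CDH configuration
/usr/local/spark-1.4.1-bin-hadoop2.4/bin/sparkR --master yarn-client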

Hadoop multi-node cluster manual installation over Ubuntu 14.04

I am a newcomer to Hadoop. For my college project we were given 4 VMs. I need to configure a multi-node Hadoop cluster on these (1 master, 3 slaves) and run my webapp on it. I will be using HBase in my project. Usually CentOS is used for installation and deployment of HDP, whereas I was given Ubuntu. I cannot use the Apache Ambari plugin for installation, as it is not supported on Ubuntu. I need to deploy manually, hence I tried looking for tutorials.
I looked for a tutorial to install HDP multi-node clusters on Ubuntu and found this: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
But it's too outdated (2010).
I have the official documentation here, and I tried following it, but I am not able to follow it properly:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.1-latest/bk_installing_manually_book/content/rpm-chap2-3.html
Could someone suggest some up-to-date links, ideally a tutorial with a decent number of screenshots, for installing multi-node clusters on Ubuntu 14.04 (12.04 is also fine)?
Thanks a lot!
The Michael Noll tutorial is too old, I think. I found this site:
https://www.digitalocean.com/community/tutorials/how-to-install-hadoop-on-ubuntu-13-10
I have a mini cluster (with 5 slaves and a master) in my university lab, running Hadoop 2.5.0 on Ubuntu 12.04. Furthermore, I have a VM cluster on my laptop (2 slaves and a master) running Hadoop 1.2.1, also on Ubuntu 12.04.
But I couldn't install Hadoop (any version) on Ubuntu 14.04. I don't remember the cause, but I think it was some problem with the Java version (I didn't check that).
I hope that helps!
I came across the same issue installing HDP 2.2 on Ubuntu 14.04, and found a solution.
I documented everything here: http://www.swiss-scalability.com/2014/12/install-hdp-22-on-ubuntu-1404-trusty.html
In a nutshell, the magic happens here:
sed -e "s/14.04/12.04/g" -i /etc/*-release
And then you can install or restart ambari-agent; it will be able to communicate with ambari-server.
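Putting it together, a rough sketch (run as root; the package and service names follow standard Ambari installs, so check the linked post for the exact steps):
# after masking the release as 12.04 (see the sed above), install or restart the agent
apt-get install ambari-agent
service ambari-agent restart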
