Solr: How to upgrade SolrCloud 5.5.1 to 6.6.0 - Linux

I want to upgrade SolrCloud to the newest version on CentOS. Is there a step-by-step guide for this process? I also don't want to lose my indexes.

Let's answer your question with an example. Assume the following is your SolrCloud structure:
Shard1
  node1 (20 docs) (5.5.1)
  node2 (20 docs) (5.5.1)
Shard2
  node3 (20 docs) (5.5.1)
  node4 (20 docs) (5.5.1)
Now you will add a replica to each of the shards.
First, you add two new nodes, registering them with ZooKeeper, but both of these nodes will run Solr 6.6.0.
bin/solr start -cloud -s /Users/XYZ/Downloads/solr-6.6.0/example/cloud/node6/solr -p 8957 -z localhost:9983
Add one replica to both shards:
localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=collectionName&shard=shard1
localhost:8983/solr/admin/collections?action=ADDREPLICA&collection=collectionName&shard=shard2
After adding the replicas, your SolrCloud structure will look like this:
Shard1
  node1 (20 docs) (5.5.1)
  node2 (20 docs) (5.5.1)
  node5 (20 docs) (6.6.0)
Shard2
  node3 (20 docs) (5.5.1)
  node4 (20 docs) (5.5.1)
  node6 (20 docs) (6.6.0)
Once the new replicas have finished syncing, remove node1, node2, node3 and node4. Your cloud will then be running entirely on Solr 6.6.0.
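For completeness, here is a minimal sketch of that clean-up step, assuming the collection is called collectionName as in the URLs above; the replica names (core_node1 and so on) are placeholders that you would first read from the CLUSTERSTATUS response:
# Check that the new 6.6.0 replicas report state "active" before removing anything
curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=collectionName"
# Then drop the old 5.5.1 replicas one by one (replica names are placeholders)
curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=collectionName&shard=shard1&replica=core_node1"
curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=collectionName&shard=shard1&replica=core_node2"
curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=collectionName&shard=shard2&replica=core_node3"
curl "http://localhost:8983/solr/admin/collections?action=DELETEREPLICA&collection=collectionName&shard=shard2&replica=core_node4"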

Related

Best approach to remove cassandra-topology.properties file in running cluster nodes

There is a 3-node Cassandra cluster running and serving production traffic. In cassandra.yaml, "endpoint_snitch: GossipingPropertyFileSnitch" is configured, but somehow we forgot to remove the cassandra-topology.properties file from the Cassandra conf directory. As per the Cassandra documentation, if you are using GossipingPropertyFileSnitch you should remove the cassandra-topology.properties file.
As all three nodes are running and serving production traffic, can I remove this file on all three nodes while they are up, or do I have to shut the nodes down one by one before removing it?
The Apache Cassandra version is 3.11.2.
./bin/nodetool status
Datacenter: dc1
================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN x.x.x.x1 409.39 GiB 256 62.9% cshdkd-6065-4813-ae53-sdh89hs98so RAC1
UN x.x.x.x2 546.33 GiB 256 67.8% jfdsdk-f18f-4d46-af95-33jw9yhfcsd RAC2
UN x.x.x.x3 594.73 GiB 256 69.3% 7s9skk-a27f-4875-a410-sdsiudw9eww RAC3
If the cluster has already migrated to GossipingPropertyFileSnitch, then you can safely remove that file without stopping the cluster nodes. See item 7 in the DSE 5.1 documentation (compatible with Cassandra 3.11).
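As a rough sketch, assuming a package-style install with the config under /etc/cassandra (adjust the path to your layout), the check and removal on each live node would look like this:
# Confirm the configured snitch on this node
grep endpoint_snitch /etc/cassandra/cassandra.yaml
# Remove the stale file; per the answer above, no restart is needed when the cluster is already on GossipingPropertyFileSnitch
rm /etc/cassandra/cassandra-topology.properties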

Can't Connect to Cassandra's default cluster (Test Cluster) Using OpsCenter

My error is as below:
OpsCenter was not able to add the cluster: OpsCenter was unable to resolve the ip for Test_Cluster, please specify seed nodes using the rpc_address
My OS is CentOS 7
I installed DSE 6.
I found that DataStax had blocked my IP.
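As a hedged sketch of what the error message hints at, look up the client address Cassandra is actually bound to and give that IP (rather than the cluster name) to OpsCenter when adding the cluster; the config path below assumes a DSE package install and may differ on your setup:
# Addresses OpsCenter can use as contact/seed points
grep -E "^(listen_address|rpc_address|broadcast_rpc_address)" /etc/dse/cassandra/cassandra.yaml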

Cassandra reads slow when adding a new node

I'm using Cassandra 3.11.0 on Ubuntu 16.04 (VM).
With 2 nodes in the same cluster, select * from emp takes 0.8s.
But when I add a new node to the same cluster, it takes 2.0s.
Keyspace
class: SimpleStrategy
replication_factor: 2
Table
CREATE TABLE emp (key int PRIMARY KEY,empname text);
Cassandra.yaml
Node1 and Node2 have the same config
autobootstrap: false
seed provider: "node1,node2"
num_tokens: 256
endpoint_snitch: GossipingPropertyFileSnitch
Node3 (new node) config
autobootstrap: true
seed provider: "node1,node2"
num_tokens: 256
endpoint_snitch: GossipingPropertyFileSnitch
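For reference, a keyspace with the replication settings described above would be created roughly as follows; the keyspace name emp_ks is a placeholder, since the question does not name it:
# Hypothetical keyspace matching "class: SimpleStrategy, replication_factor: 2"
cqlsh -e "CREATE KEYSPACE IF NOT EXISTS emp_ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};"
# Per-keyspace ownership after the third node joins (keyspace name again a placeholder)
nodetool status emp_ks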

Datastax Spark worker is always looking for master at 127.0.0.1

I am trying to bring up DataStax Cassandra in analytics mode using "dse cassandra -k -s". I am using the DSE 5.0 sandbox on a single-node setup.
I have configured the spark-env.sh with SPARK_MASTER_IP as well as SPARK_LOCAL_IP to point to my LAN IP.
export SPARK_LOCAL_IP="172.40.9.79"
export SPARK_MASTER_HOST="172.40.9.79"
export SPARK_WORKER_HOST="172.40.9.79"
export SPARK_MASTER_IP="172.40.9.79"
All above variables are setup in spark-env.sh.
Despite these, the worker will not come up. It is always looking for a master at 127.0.0.1. This is the error I am seeing in /var/log/cassandra/system.log:
WARN [worker-register-master-threadpool-8] 2016-10-04 08:02:45,832 SPARK-WORKER Logging.scala:91 - Failed to connect to master 127.0.0.1:7077
java.io.IOException: Failed to connect to /127.0.0.1:7077
The result from dse client-tool also shows 127.0.0.1:
$ dse client-tool -u cassandra -p cassandra spark master-address
spark://127.0.0.1:7077
However, I am able to access the Spark web UI at the LAN IP 172.40.9.79.
Spark Web UI screenshot
Any help is greatly appreciated.
Try adding this parameter in spark-defaults.conf: spark.master local[*] or spark.master spark://172.40.9.79:7077. Maybe this solves your problem.
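A minimal sketch of that change, assuming a package-style DSE install where the file lives under /etc/dse/spark (the path and service name may differ on the sandbox):
# Point the worker at the standalone master on the LAN IP instead of 127.0.0.1
echo "spark.master spark://172.40.9.79:7077" | sudo tee -a /etc/dse/spark/spark-defaults.conf
# Restart DSE in analytics mode so the setting is picked up
sudo service dse restart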

Spark Executors off-heap memory usage keeps increasing

The off-heap memory usage of the 3 Spark executor processes keeps increasing constantly until the boundaries of the physical RAM are hit. This happened two weeks ago, at which point the system came to a grinding halt because it was unable to spawn new processes. At such a moment, restarting Spark is the obvious solution. In the collectd memory usage graph below we see two moments when we restarted Spark: last week, when we upgraded Spark from 1.4.1 to 1.5.1, and two weeks ago, when the physical memory was exhausted.
As can be seen below, the Spark executor process uses approx. 62GB of memory, while the heap size max is set to 20GB. This means the off-heap memory usage is approx. 42GB.
$ ps aux | grep 40724
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
apache-+ 40724 140 47.1 75678780 62181644 ? Sl Nov06 11782:27 /usr/lib/jvm/java-7-oracle/jre/bin/java -cp /opt/spark-1.5.1-bin-hadoop2.4/conf/:/opt/spark-1.5.1-bin-hadoop2.4/lib/spark-assembly-1.5.1-hadoop2.4.0.jar:/opt/spark-1.5.1-bin-hadoop2.4/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.5.1-bin-hadoop2.4/lib/datanucleus-api-jdo-3.2.6.jar:/opt/spark-1.5.1-bin-hadoop2.4/lib/datanucleus-core-3.2.10.jar -Xms20480M -Xmx20480M -Dspark.driver.port=7201 -Dspark.blockManager.port=7206 -Dspark.executor.port=7202 -Dspark.broadcast.port=7204 -Dspark.fileserver.port=7203 -Dspark.replClassServer.port=7205 -XX:MaxPermSize=256m org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url akka.tcp://sparkDriver@xxx.xxx.xxx.xxx:7201/user/CoarseGrainedScheduler --executor-id 2 --hostname xxx.xxx.xxx.xxx --cores 10 --app-id app-20151106125547-0000 --worker-url akka.tcp://sparkWorker@xxx.xxx.xxx.xxx:7200/user/Worker
$ sudo -u apache-spark jps
40724 CoarseGrainedExecutorBackend
40517 Worker
30664 Jps
$ sudo -u apache-spark jstat -gc 40724
S0C S1C S0U S1U EC EU OC OU PC PU YGC YGCT FGC FGCT GCT
158720.0 157184.0 110339.8 0.0 6674944.0 1708036.1 13981184.0 2733206.2 59904.0 59551.9 41944 1737.864 39 13.464 1751.328
$ sudo -u apache-spark jps -v
40724 CoarseGrainedExecutorBackend -Xms20480M -Xmx20480M -Dspark.driver.port=7201 -Dspark.blockManager.port=7206 -Dspark.executor.port=7202 -Dspark.broadcast.port=7204 -Dspark.fileserver.port=7203 -Dspark.replClassServer.port=7205 -XX:MaxPermSize=256m
40517 Worker -Xms2048m -Xmx2048m -XX:MaxPermSize=256m
10693 Jps -Dapplication.home=/usr/lib/jvm/java-7-oracle -Xms8m
Some info:
We use Spark Streaming lib.
Our code is written in Java.
We run Oracle Java v1.7.0_76
Data is read from Kafka (Kafka runs on different boxes).
Data is written to Cassandra (Cassandra runs on different boxes).
1 Spark master and 3 Spark executors/workers, running on 4 separate boxes.
We recently upgraded Spark to 1.4.1 and 1.5.1 and the memory usage pattern is identical on all those versions.
What can be the cause of this ever-increasing off-heap memory use?
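For what it's worth, the off-heap figure quoted above (RSS minus the 20GB heap) can be cross-checked directly against the executor process; PID 40724 is taken from the ps output above:
# The last line of pmap shows the total mapped size and RSS (in kB) for the executor JVM
sudo -u apache-spark pmap -x 40724 | tail -n 1
# RSS minus the -Xmx heap (20480M) approximates the off-heap usage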
