In a clustered environment, if there is an items.xml change, is it enough to do Update Running System on one node and just clear the cache on the other nodes?

Assume that I have Node1 and Node2. If I add an attribute in items.xml and build-and-deploy it to both Node1 and Node2, is it enough for me to do Update Running System on Node1 and just clear the cache on Node2? Or do I also need to do Update Running System on Node2?

Ideally, it's good to restart the other nodes once you are done with an Update Running System. If zero downtime is not a requirement, take downtime to get one node ready with an update of the running system, and then make the other nodes ready.
Another way:
Take Node1 out of the cluster, get it ready with all the configuration, the system update, etc. Add it back to the cluster and take all the other nodes down. Do the deployment on the remaining nodes and bring them back up. This way you do not need to run Update Running System or clear the cache on them, as shown in the sketch below.
If downtime is really critical for the business, look into a rolling update on the cluster to implement a proper solution.
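As a rough sketch of that second approach, the per-node sequence could look like the following; the install path and the load-balancer step are assumptions about your environment, while ant updatesystem is the platform target behind Update Running System:

    # On Node1, after taking it out of the load balancer (mechanism is environment-specific):
    cd /opt/hybris/bin/platform      # assumed install path
    . ./setantenv.sh                 # set up the ant environment
    ant updatesystem                 # apply the items.xml change to the shared type system

    # Add Node1 back, take the remaining nodes out of the load balancer,
    # deploy the new binaries to them, and restart them. On startup they read
    # the already-updated type system from the database, so no further
    # Update Running System or cache clear is needed on them.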

Related

Cassandra cluster - Migrating all hosts in cluster

I am using Cassandra (3.5) with 20 nodes: data center 1 with 10 nodes and data center 2 with 10 nodes, holding a large amount of data. All current hosts are legacy hosts, and we now have newer-generation hosts, say generation-2.
I have tried adding new nodes and decommissioning old ones, but this is time-consuming.
Q1: How can I migrate all hosts from the legacy hosts to the generation-2 hosts? What is the best approach for that?
Q2: What would the rollback strategy be?
Q3: Finally, how can I validate the data once I have migrated to the generation-2 hosts?
If you are just replacing the nodes with newer hardware, keeping the same number of nodes, then it's simple (these operations should be done on every node):
1. Prepare the new installation on every node, with configuration identical to the existing nodes but with different IP addresses; don't start the new nodes yet.
2. (optional) Disable autocompaction with nodetool disableautocompaction - this can help step 5 execute faster.
3. Copy the data from the old node to the new node using rsync (this can take a long time).
4. Execute nodetool drain and stop the old node.
5. Use rsync again to synchronize the changes that happened since the initial copy (this should be relatively fast).
6. Make sure that the old node won't start again (for example, remove the Cassandra package) - otherwise chaos could ensue.
7. Start the new node.
This works because a Cassandra node is identified by a UUID stored in the local system table, so changing the IP address doesn't affect operations.
P.S. In the future, if you need to replace a node that has completely died (rather than a planned migration as described here), use the node-replacement procedure; that way you won't stream the data twice, as happens when you decommission and then re-add a node. A condensed sketch of the copy procedure follows.
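A minimal shell sketch of steps 2-7 for a single node; the hostname newnode, the data path /var/lib/cassandra, and the systemd service name are assumptions, so adjust them to your installation:

    # On the old node: stop compactions so the data directory changes less during the copy
    nodetool disableautocompaction

    # Initial bulk copy to the new node (can take a long time; the old node keeps serving traffic)
    rsync -a /var/lib/cassandra/ newnode:/var/lib/cassandra/

    # Flush memtables, stop accepting writes, then stop Cassandra
    nodetool drain
    sudo systemctl stop cassandra

    # Incremental pass: only transfers what changed since the first copy
    rsync -a --delete /var/lib/cassandra/ newnode:/var/lib/cassandra/

    # Make sure the old node cannot come back (assumed Debian-style package install)
    sudo apt-get remove -y cassandra

    # On the new node: start Cassandra; it keeps the old node's identity via the stored host UUID
    sudo systemctl start cassandra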

How to update configuration of a Cassandra cluster

I have a 3-node Cassandra cluster and I want to make some adjustments to cassandra.yaml.
My question is: how should I perform this? One node at a time, or is there a way to make it happen without shutting down nodes?
Btw, I am using Cassandra 2.2 and this is a production cluster.
There are multiple approaches here:
If you edit the cassandra.yaml file, you need to restart Cassandra to re-read the contents of that file. If you restart all nodes at once, your cluster will be unavailable. Restarting one node at a time is almost always safe (provided you have sane replication factors and consistency levels), and if your cluster is configured to survive a rack or datacenter outage, you can safely restart more nodes concurrently. A rolling-restart sketch follows this list.
Many settings can be changed without a restart via JMX, though I don't have a documentation link handy. Changing a value via JMX WON'T update cassandra.yaml, though, so you'll also need to edit the file, or your change will revert to what's in the file when the node restarts.
If you're using DSE, OpsCenter's Lifecycle Manager feature makes updating configs a simple point-and-click affair (disclaimer: I'm biased, as I'm an LCM dev).
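A minimal sketch of the first approach for one node, assuming a systemd-managed service named cassandra (older installs may use service(8) instead):

    # On the node being restarted:
    nodetool drain                      # flush memtables and stop accepting writes
    sudo systemctl restart cassandra    # cassandra.yaml is re-read on startup

    # Before moving to the next node, confirm every node reports UN (Up/Normal):
    nodetool status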

Changing Snitch on live cluster in datastax 4.5

I have 8 nodes in one region and now I want to add a new node in another region. Presently I am using Ec2Snitch; after adding the node in the new region I need to change the snitch on all nodes to Ec2MultiRegionSnitch.
Now my question is: will this change impact my currently running cluster? And what would be the best practice for doing this?
Thanks
You should do a rolling restart, changing to the EC2 multi-region snitch, before adding the new node. It should not impact your running cluster, though I would suggest you briefly bring up a test cluster to test making the change.
To perform a rolling restart from OpsCenter:
Click Nodes in the left pane.
In the contextual menu, select Restart from the Cluster Actions dropdown.
Set the amount of time to wait after restarting each node, select whether each node should be drained before stopping, and then click Restart Cluster.
See more details here:
http://www.datastax.com/documentation/opscenter/5.0/opsc/online_help/opscRestartingCluster_t.html
Here is a link to the DataStax documentation for switching snitches. I found that to be useful when I switched to the GossipingPropertyFileSnitch. I also had to edit cassandra-rackdc.properties on all nodes before doing the rolling restart.
Even though my topology didn't change, I followed the instructions in the reference: stopped all the nodes, restarted them (starting with the seeds), then ran 'nodetool repair' and 'nodetool cleanup' on all nodes.
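For the GossipingPropertyFileSnitch case described above, the per-node edits might look like this; the file paths assume a package install, so verify them on your system:

    # /etc/cassandra/cassandra.yaml -- switch the snitch:
    #   endpoint_snitch: GossipingPropertyFileSnitch

    # /etc/cassandra/cassandra-rackdc.properties -- declare this node's location:
    #   dc=DC1
    #   rack=RAC1

    # After the rolling restart (seeds first), on every node:
    nodetool repair
    nodetool cleanup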

Adding a new node to existing cluster

Is it possible to add a new node to an existing cluster in Cassandra 1.2 without running nodetool cleanup on each individual node once data has been added?
It probably isn't, but I need to ask, because I'm trying to create an application where each user's machine is a server, allowing for endless scaling.
Any advice would be appreciated.
Yes, it is possible. But you should be aware of the side-effects of not doing so.
nodetool cleanup purges keys that are no longer allocated to that node. According to the Apache docs, these keys count against the allocated data for that node, which can cause the auto-bootstrap process for the next node not to balance the ring properly. So depending on how you are bringing new user machines into the ring, this may or may not be a problem.
Also keep in mind that nodetool cleanup only needs to be run on the nodes that lost part of their range to the new node - i.e. the adjacent nodes, not all the nodes in the cluster (see the sketch below).
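A short sketch of that post-bootstrap cleanup, run only on the nodes that gave up ranges:

    # On each node adjacent to the newly bootstrapped node:
    nodetool cleanup    # rewrites SSTables, dropping keys the node no longer owns

    # cleanup is I/O-intensive: run it one node at a time, during a quiet period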

Best way to shrink a Cassandra cluster

So there is a fair amount of documentation on how to scale up a Cassandra cluster, but is there a good resource on how to "unscale" Cassandra and remove nodes from the cluster? Is it as simple as turning off a node, letting the cluster sync up again, and repeating?
The reason is for a site that expects high spikes of traffic, climbing from the daily few thousand hits to hundreds of thousands over a few days. The site will be "ramped up" beforehand, starting multiple instances of the web server, Cassandra, etc. After the torrent of requests subsides, the goal is to turn off the instances that are no longer used, rather than pay for servers that are just sitting around.
If you just shut the nodes down and rebalance the cluster, you risk losing data that exists only on the removed nodes and hasn't been replicated yet.
Safe cluster shrinking can easily be done with nodetool. First, run:
nodetool drain
... on the node being removed, to stop it accepting writes and to flush the memtables. Then:
nodetool decommission
... to move the node's data to the other nodes, and shut the node down. If the node went down before decommission completed, run on some other node:
nodetool removetoken
... to remove the node from the cluster completely. Detailed documentation can be found here: http://wiki.apache.org/cassandra/NodeTool
From my experience, I'd recommend removing nodes one by one, not in batches. It takes more time, but it is much safer in case of network outages or hardware failures. A condensed sketch follows.
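Put together as a minimal sketch (the removetoken fallback is only for a node that went down before it could finish decommissioning):

    # On the node being removed: stop accepting writes and flush memtables
    nodetool drain

    # Stream this node's data to the remaining replicas and leave the ring
    nodetool decommission

    # Only if the node died mid-decommission, run on any live node:
    # nodetool removetoken <token>    (nodetool removenode <host-id> on Cassandra 1.2+)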
When you remove nodes, you may have to re-balance the cluster by moving some nodes to new tokens. In a planned downscale, you want to:
1 - minimize the number of moves;
2 - if you have to move a node, minimize the amount of transferred data.
There's an article about cluster balancing that may be helpful:
Balancing Your Cassandra Cluster
Also, the beginning of this video is about adding and removing nodes and the best strategies for minimizing the cluster impact of each of these operations.
Hopefully, these two references will give you enough information to plan your downscale.
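Hypothetically, inspecting and correcting the balance could look like this (the token value is just a placeholder, and nodetool move applies to single-token nodes):

    # Inspect current token ownership per node
    nodetool ring

    # Move a node to a newly calculated token (streams data; do one node at a time)
    nodetool move 85070591730234615865843651857942052864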
First, on the node which will be removed, flush the memtables to SSTables on disk:
nodetool flush
Second, run the command to leave the cluster:
nodetool decommission
This command assigns the ranges the node was responsible for to other nodes and replicates the data appropriately.
To monitor the process, you can use:
nodetool netstats
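To keep an eye on the streaming without retyping the command, something like this works (the 10-second interval is arbitrary):

    # Refresh streaming status every 10 seconds until all streams complete
    watch -n 10 nodetool netstats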
Found an article on how to remove nodes from Cassandra; it was helpful for me when scaling down Cassandra. All the actions are described there step by step.
