Currently the commitlog directory is pointing to Directory1. I want to change it to a different directory, D2. How should the migration be done?
This is how we did it. We have a load-balanced client that talks to Cassandra 1.1.2, and an instance of the client lives on each Cassandra node.
Drain your service.
Wait for the load balancer to remove the node.
Stop your service on the local node to halt direct Client-Cassandra writes: systemctl stop <your service name>
At this point there should be no more writes and greatly reduced disk activity:
iostat 2 - Disk activity should be near zero
nodetool gossipinfo - verify the node's gossip state
Disable Cassandra gossip protocol to mark the node dead and halt Cassandra-Cassandra writes: nodetool disablegossip
Flush all contents of the commit log into SSTables: nodetool flush
Drain the node – this command is more important than nodetool flush (and might include all the behaviour of nodetool flush): nodetool drain
Stop the cassandra process: systemctl stop cassandra
Modify the Cassandra config file(s) to point commitlog_directory at the new location, e.g. vi /etc/cassandra/default.conf/cassandra.yaml
Start Cassandra: systemctl start cassandra
Wait 10-20 minutes. Tail Cassandra logs to follow along, e.g. tail -F /var/log/cassandra/system.log
Confirm ring is healthy before moving on to next node: nodetool ring
Re-start client service: systemctl start <your service here>
Note that there was no need for us to copy the commitlog files themselves manually; flushing and draining took care of that. The files then slowly reappeared in the new commitlog_directory location. The per-node sequence is condensed into the sketch below.
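Roughly, the steps above condense to the following sketch; the service name, config path, and new commit log location are placeholders for your environment.

```
# Condensed per-node sketch; <your-service> and /mnt/disk2/commitlog are placeholders.
systemctl stop <your-service>           # halt direct client writes
nodetool disablegossip                  # mark the node down to its peers
nodetool flush                          # flush memtables to SSTables
nodetool drain                          # stop accepting writes and flush the commit log
systemctl stop cassandra
# edit /etc/cassandra/default.conf/cassandra.yaml:
#   commitlog_directory: /mnt/disk2/commitlog
systemctl start cassandra
tail -F /var/log/cassandra/system.log   # wait for the node to finish starting
nodetool ring                           # confirm the ring is healthy
systemctl start <your-service>
```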
You can change the commit log directory in cassandra.yaml (key: "commitlog_directory") and copy all logs to the new destination (see the docs quoted below; a command sketch follows them):
commitlog_directory
The directory where the commit log is stored. Default locations:
Package installations: /var/lib/cassandra/commitlog
Tarball installations: install_location/data/commitlog
For optimal write performance, place the commit log on a separate disk partition, or (ideally) a separate physical device from the data file directories. Because the commit log is append only, an HDD is acceptable for this purpose.
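As a sketch (the new path /mnt/commitlog-disk is an assumption, and the node should be stopped while you do this):

```
# Sketch only: stop the node, copy the existing segments, then point
# commitlog_directory at the new location (user/group may differ on your install).
sudo systemctl stop cassandra
sudo mkdir -p /mnt/commitlog-disk/commitlog
sudo cp -a /var/lib/cassandra/commitlog/. /mnt/commitlog-disk/commitlog/
sudo chown -R cassandra:cassandra /mnt/commitlog-disk/commitlog
# in cassandra.yaml:
#   commitlog_directory: /mnt/commitlog-disk/commitlog
sudo systemctl start cassandra
```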
If you are using bitnami/cassandra containers, this should be done using this env var (see docs):
CASSANDRA_COMMITLOG_DIR: Directory where the commit logs will be
stored. Default: /bitnami/cassandra/data/commitlog
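For example, a minimal docker run sketch (the host path and the container mount path are assumptions for illustration):

```
# Sketch: point the commit log at a dedicated volume via the bitnami env var.
docker run -d --name cassandra \
  -v /mnt/commitlog-disk:/bitnami/cassandra/commitlog \
  -e CASSANDRA_COMMITLOG_DIR=/bitnami/cassandra/commitlog \
  bitnami/cassandra:latest
```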
I'm working on a SoM running a 5.4.80+g521466cb0f2c kernel. My goal is for journald logs to persist indefinitely (up to the size limit). The steps I followed to set up persistent systemd logs were:
mkdir /var/log/journal
systemd-tmpfiles --create --prefix /var/log/journal
systemctl restart systemd-journald.service
I left the journald.conf settings at auto (Storage=auto).
After that, everything seemed fine. /var/log/journal/8293ab5eb5f64ef3937cbb9ed84b4d3e was created and logs appeared to be written correctly; I could still access new logs with journalctl. Then I rebooted the system and suddenly /var/log/journal does not exist, and logs only exist from the latest power-up. Is there a step I'm missing to make my logs actually persist through a power cycle?
Revisiting the data locality for Spark on Kubernetes question: if the Spark pods are colocated on the same nodes as the HDFS datanode pods, does data locality work?
The Q&A session here: https://www.youtube.com/watch?v=5-4X3HylQQo seems to suggest it doesn't.
Locality is an issue for Spark on Kubernetes. Basic data locality does work if the Kubernetes provider supplies the network topology plugins required to resolve where the data is and where the Spark nodes should run, and you have built Kubernetes to include the code here.
There is a method to test this data locality. I have copied it here for completeness:
Here's how one can check if data locality in the namenode works.
Launch a HDFS client pod and go inside the pod.
$ kubectl run -i --tty hadoop --image=uhopper/hadoop:2.7.2 --generator="run-pod/v1" --command -- /bin/bash
Inside the pod, create a simple text file on HDFS.
$ hadoop fs -fs hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local -cp file:/etc/hosts /hosts
Set the number of replicas for the file to the number of your cluster nodes. This ensures that there will be a copy of the file in the cluster node that your client pod is running on. Wait some time until this happens.
`$ hadoop fs -setrep NUM-REPLICAS /hosts`
Run the following hdfs cat command. From the debug messages, see which datanode is being used. Make sure it is your local datanode. (You can get this from $ kubectl get pods hadoop -o json | grep hostIP. Do this outside the pod)
$ hadoop --loglevel DEBUG fs -fs hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local -cat /hosts
...
17/04/24 20:51:28 DEBUG hdfs.DFSClient: Connecting to datanode 10.128.0.4:50010
...
If not, check whether your local datanode is even in the list in the debug messages above (an example excerpt is shown below). If it is not there, step (3), the setrep command, has not finished yet; wait longer. (You can use a smaller cluster for this test if that is possible.)
`17/04/24 20:51:28 DEBUG hdfs.DFSClient: newInfo = LocatedBlocks{ fileLength=199 underConstruction=false blocks=[LocatedBlock{BP-347555225-10.128.0.2-1493066928989:blk_1073741825_1001; getBlockSize()=199; corrupt=false; offset=0; locs=[DatanodeInfoWithStorage[10.128.0.4:50010,DS-d2de9d29-6962-4435-a4b4-aadf4ea67e46,DISK], DatanodeInfoWithStorage[10.128.0.3:50010,DS-0728ffcf-f400-4919-86bf-af0f9af36685,DISK], DatanodeInfoWithStorage[10.128.0.2:50010,DS-3a881114-af08-47de-89cf-37dec051c5c2,DISK]]}] lastLocatedBlock=LocatedBlock{BP-347555225-10.128.0.2-1493066928989:blk_1073741825_1001;`
Repeat the hdfs cat command multiple times. Check if the same datanode is being consistently used.
I'm trying to start DSE 5.0.1 Cassandra (single node) locally and am getting the error below:
CassandraDaemon.java:698 - Cannot start node if snitch's data center
(Cassandra) differs from previous data center (Graph). Please fix the
snitch configuration, decommission and rebootstrap this node or use
the flag -Dcassandra.ignore_dc=true
If you are using GossipingPropertyFileSnitch, start Cassandra with the option
-Dcassandra.ignore_dc=true
If it starts successfully, execute:
nodetool repair
nodetool cleanup
Afterwards, Cassandra should be able to start normally without the ignore option.
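One way to pass the flag, as a sketch only (it assumes a package install where /etc/dse/cassandra/cassandra-env.sh is sourced at startup; adjust the path for your setup):

```
# Append the flag temporarily, restart, then run the follow-up commands.
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.ignore_dc=true"' | sudo tee -a /etc/dse/cassandra/cassandra-env.sh
sudo service dse restart
nodetool repair
nodetool cleanup
# Remove the appended line afterwards so the datacenter check stays active on future starts.
```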
This occurs when the node starts and sees information indicating that it was previously part of a different datacenter, i.e. the datacenter name was different on a prior boot and was then changed.
In your case you are most likely using DseSimpleSnitch, which names the datacenter based on the workload of the node. Previously the node was started with Graph enabled, which set the name to Graph. Now starting it without Graph enabled names the datacenter Cassandra, which is the default.
Using the -Dcassandra.ignore_dc=true flag will allow you to proceed, but a better solution would be to switch to GossipingPropertyFileSnitch and give this machine a dedicated datacenter name.
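For example, a sketch of the dedicated datacenter name (DC1 and RACK1 are placeholder names; where the snitch itself is enabled depends on your DSE configuration):

```
# /etc/dse/cassandra/cassandra-rackdc.properties (placeholder values):
#   dc=DC1
#   rack=RACK1
# After switching the snitch to GossipingPropertyFileSnitch, restart DSE:
sudo service dse restart
```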
Another option (if you are just testing) is to wipe out the data directory, as this will clear out the information previously labeling the datacenter for the node. This will most likely be sudo rm -R /var/lib/cassandra/
This issue happens when you change the datacenter name in /etc/dse/cassandra/cassandra-rackdc.properties.
To resolve it, follow these three steps:
Clear the directories below (note: if you have data you need, take a backup first, e.g. with cp):
cd /var/lib/cassandra/commitlog
sudo rm -rf *
cd /var/lib/cassandra/data
sudo rm -rf *
Now start the DSE service:
service dse start
Check the nodes' status:
nodetool -h ::FFFF:127.0.0.1 status
I have a serious problem with my CouchDB. I installed CouchDB on a virtual machine running Ubuntu with a 10 GB disk. I had CouchDB collecting data from Twitter and was not aware that it had filled all the disk storage on the system. To free some space to run CouchDB, I deleted the system log. Then I ran sudo service couchdb start; it reports that it started, but http://127.0.0.1:9000/_utils/ cannot be opened. The full output is shown below. Can anyone help? I am really anxious, because all my data is stored only in this CouchDB.
ubuntu@election2:~$ sudo service couchdb status
couchdb stop/waiting
ubuntu@election2:~$ sudo service couchdb start
couchdb start/running, process 1325
ubuntu@election2:~$ sudo service couchdb status
couchdb stop/waiting
ubuntu@election2:~$ sudo service couchdb stop
stop: Unknown instance:
ubuntu@election2:~$ sudo service couchdb restart
stop: Unknown instance:
couchdb start/running, process 1601
ubuntu@election2:~$
If this CouchDB is difficult to repair, can anyone tell me how I can remove the data in this CouchDB without starting it? There must be some tangible documentation. Thank you in advance!
There are two CouchDB configuration files, default.ini and local.ini. My default.ini contains an entry in the [couchdb] section called database_dir which specifies where couch databases are stored. To find your configuration files run: couchdb -c. Your database .couch file will have the same name as you would have seen in the web interface. To remove the database, delete the file. If you want to keep the data, move it to a different location.
Shut down couch using: couchdb -d or sudo service couchdb stop depending on how you started it.
Once shut down, you can copy or move the CouchDB database directory to a location with more space. Change your database_dir setting to the new location and restart CouchDB with couchdb -b or sudo service couchdb start.
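As a sketch, assuming the databases currently live under /var/lib/couchdb and the larger disk is mounted at /mnt/bigdisk (both paths and the couchdb user/group are assumptions; check database_dir in your own config first):

```
# Move the database files to the bigger disk and repoint database_dir.
sudo service couchdb stop
sudo mkdir -p /mnt/bigdisk/couchdb
sudo cp -a /var/lib/couchdb/. /mnt/bigdisk/couchdb/
sudo chown -R couchdb:couchdb /mnt/bigdisk/couchdb
# In the [couchdb] section of local.ini (overrides default.ini):
#   database_dir = /mnt/bigdisk/couchdb
sudo service couchdb start
```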
The data is stored as a .couch file, named after the CouchDB database, in a directory that is specified in local.ini as database_dir. Delete the file and the data is gone.
The running CouchDB can be killed with kill <pid>, where <pid> is the process id, which you can find with ps -ax | grep couchdb
I ran the spark-ec2 script with --ebs-vol-size=1000 (and the 1000 GB volumes are attached), but hadoop dfsadmin -report shows only:
Configured Capacity: 396251299840 (369.04 GB)
per node. How do I increase the space or tell HDFS to use the full capacity?
Run lsblk and see where the volume is mounted; it is probably /vol0. In your hdfs-site.xml, append /vol0 to the dfs.data.dir value, comma-separated after the existing default. Copy this file to all slaves and restart the cluster. You should see the full capacity now.
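For illustration, the change might look like the sketch below; the existing data path, the /vol0 subdirectory, and the slaves-list and conf paths are assumptions, so take the first entry from your current hdfs-site.xml and the mount point from lsblk.

```
# Sketch only: in hdfs-site.xml, make dfs.data.dir a comma-separated list, e.g.
#   <property>
#     <name>dfs.data.dir</name>
#     <value>/mnt/ephemeral-hdfs/data,/vol0/hdfs/data</value>
#   </property>
# Then push the edited file to every slave and restart HDFS so the datanodes pick it up.
for host in $(cat /path/to/slaves-list); do      # replace with your slaves file
  scp hdfs-site.xml "$host":/path/to/hadoop/conf/hdfs-site.xml
done
```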