I'm new to Cassandra and Gremlin. I am using Gremlin to enter and retrieve data from Cassandra. I want to take a backup and restore it on a new node. I took a snapshot using nodetool. Please help me with some links or documents.
I used the second approach from this post: How do I replicate a Cassandra's local node for other Cassandra's remote node?
If the structure of the tables is the same, you could create two bash scripts like the ones below:
1. Export the data using these commands:
nodetool flush <your-keyspace-name>
nodetool cleanup <your-keyspace-name>
nodetool -h localhost -p 7199 snapshot <your-keyspace-name>
zip -r /tmp/bkp.zip /var/lib/cassandra/data/<your-keyspace-name>/
sshpass -p <password> scp -v /tmp/bkp.zip root@<ip>:/tmp
2. Import the data:
unzip /tmp/bkp.zip
nodetool cleanup <your-keyspace-name>
cd /var/lib/cassandra/data/<your-keyspace-name>/ && find /var/lib/cassandra/data/<your-keyspace-name>/ -maxdepth 5 -type d -exec sstableloader -v --nodes 127.0.0.1 {} \;
If you notice that the process is slow, please check this other post: Cassandra's sstableloader too slow in import data
Important: You should adapt this information to your own setup.
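For convenience, the same steps could be put together into two small scripts, roughly like this (the keyspace name, destination IP and password remain placeholders you would replace with your own values; extracting the zip under / is an assumption based on zip storing the paths relative to the root):

#!/bin/bash
# export.sh - run on the source node
set -e
KEYSPACE="<your-keyspace-name>"   # placeholder
DEST_IP="<ip>"                    # placeholder: destination node
DEST_PASSWORD="<password>"        # placeholder

nodetool flush "$KEYSPACE"
nodetool cleanup "$KEYSPACE"
nodetool -h localhost -p 7199 snapshot "$KEYSPACE"
zip -r /tmp/bkp.zip "/var/lib/cassandra/data/$KEYSPACE/"
sshpass -p "$DEST_PASSWORD" scp -v /tmp/bkp.zip "root@$DEST_IP:/tmp"

#!/bin/bash
# import.sh - run on the destination node
set -e
KEYSPACE="<your-keyspace-name>"   # placeholder

# the zip stores paths relative to /, so extract them back under /
unzip -o /tmp/bkp.zip -d /
nodetool cleanup "$KEYSPACE"
find "/var/lib/cassandra/data/$KEYSPACE/" -maxdepth 5 -type d -exec sstableloader -v --nodes 127.0.0.1 {} \;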
Related
I need to transfer around 50 databases from one server to another. I tried two different backup options.
1. pg_dump -h XX.XXX.XX.XXX -p pppp -U username -Fc -d db -f db.cust      # all backups in the backup1 folder, total size 10 GB
2. pg_dump -h XX.XXX.XX.XXX -p pppp -U username -j 4 -Fd -d db -f db.dir  # all backups in the backup2 folder, total size 10 GB
Then I transferred them to the other server for restoration using scp:
scp -r backup1 postgres@YY.YYYY.YY.YYYY:/backup/
scp -r backup2 postgres@YY.YYYY.YY.YYYY:/backup/
I noticed a strange thing: although both backup folders are the same size, they take very different times to transfer with scp. The directory-format backup takes about 4 times as long as the custom-format backup. Both scp transfers were done on the same network and tried multiple times, with the same result; I also tried rsync, but it made no difference.
Please suggest what could be the reason for the slowness and how I can speed it up. I am open to using any other method to transfer the data.
Thanks
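Not something from the thread, just a guess worth testing: the directory-format (-Fd) backup consists of many individual files, and scp/rsync pay a per-file overhead, whereas the custom-format (-Fc) backup is a single file, which can make the same amount of data transfer much faster. One way around it is to stream the whole directory as a single tar archive over ssh (the paths, user and host below are placeholders):

# stream the directory-format backup as one tar archive over ssh
tar -C backup2 -cf - . | ssh postgres@YY.YYYY.YY.YYYY 'mkdir -p /backup/backup2 && tar -C /backup/backup2 -xf -'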
We have a 3-node Cassandra cluster (version 3.16). On one of the nodes (10.0.4.4) we need to move the data from one drive to another drive on the same node. On SO and other web sites, we have seen the following procedure:
Start by rsyncing the data from the old location to the new location:
sudo rsync -avzP --delete /var/lib/cassandra/data /datadrive/cassandra/data
Repeat until it is fast enough. Then flush and drain the node:
nodetool flush
nodetool drain
Stop the Cassandra service. We are running it under systemd, so we issue:
sudo systemctl stop cassandra
Run rsync again:
sudo rsync -avzP --delete /var/lib/cassandra/data /datadrive/cassandra/data
Update the cassandra.yaml and set data_file_directories:
data_file_directories:
- /datadrive/cassandra/data
Ensure permissions are set:
chown -R cassandra:cassandra /datadrive/cassandra/data
Restart the node:
sudo systemctl start cassandra
When doing this sequence, at start-up we get the following error:
ERROR [main] 2020-04-18 14:51:51,742 CassandraDaemon.java:774 - Exception encountered during startup
Apr 18 14:51:51 i1 cassandra: java.lang.RuntimeException: A node with address /10.0.4.4 already exists,
cancelling join. Use cassandra.replace_address if you want to replace this node.
What are we missing? The node is not being replaced, so I am hesitant to set replace_address.
What is the right way to change the data folder of a running node that already has data?
Thank you.
Well, I finally found the issue. I am posting the answer as I think it may help people who make the same mistake. The problem was this line:
sudo rsync -avzP --delete /var/lib/cassandra/data /datadrive/cassandra/data
The syntax to sync the contents of dir1 to dir2 on the same system is:
rsync -r dir1/ dir2
Notice the trailing / after dir1. So the corrected rsync command is:
sudo rsync -avzP --delete /var/lib/cassandra/data/ /datadrive/cassandra/data
And once this is done, everything works perfectly...
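If it helps to see what the trailing slash changes, here is a tiny self-contained demo (hypothetical /tmp/demo paths, safe to run and delete):

mkdir -p /tmp/demo/dir1 /tmp/demo/dest_a /tmp/demo/dest_b
touch /tmp/demo/dir1/file.db

# no trailing slash: dir1 itself is created inside the destination
rsync -a /tmp/demo/dir1 /tmp/demo/dest_a
ls /tmp/demo/dest_a        # -> dir1

# trailing slash: only the contents of dir1 are copied into the destination
rsync -a /tmp/demo/dir1/ /tmp/demo/dest_b
ls /tmp/demo/dest_b        # -> file.db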
I am trying to build an HA two-node cluster with Pacemaker and Corosync for postgresql-9.3. I am using the link below as my guide.
http://kb.techtaco.org/#!linux/postgresql/building_a_highly_available_multi-node_cluster_with_pacemaker_&_corosync.md
However, I cannot get past the part where I need to do pg_basebackup, as shown below.
[root@TKS-PSQL01 ~]# runuser -l postgres -c 'pg_basebackup -D /var/lib/pgsql/9.3/data -l `date +"%m-%d-%y"`_initial_cloning -P -h TKS-PSQL02 -p 5432 -U replicator -W -X stream'
pg_basebackup: directory "/var/lib/pgsql/9.3/data" exists but is not empty
/var/lib/pgsql/9.3/data on TKS-PSQL02 is confirmed to be empty:
[root@TKS-PSQL02 9.3]# ls -l /var/lib/pgsql/9.3/data/
total 0
Any idea why I am getting this error? And is there a better way to do PostgreSQL HA? Note: I am not using shared storage for the database, so I could not proceed with Red Hat clustering.
Appreciate all the answers in advance.
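Not an answer from the thread, but one thing that may be worth double-checking: pg_basebackup writes to the -D directory on the machine where it is executed, so running it on TKS-PSQL01 makes it complain about the local /var/lib/pgsql/9.3/data, regardless of how empty the directory on TKS-PSQL02 is. A sketch of the usual pattern, assuming the base backup is meant to seed the standby TKS-PSQL02 from the primary TKS-PSQL01:

# run on the standby (TKS-PSQL02), pulling from the primary (TKS-PSQL01);
# the local data directory must be empty before this runs
runuser -l postgres -c 'pg_basebackup -D /var/lib/pgsql/9.3/data -l `date +"%m-%d-%y"`_initial_cloning -P -h TKS-PSQL01 -p 5432 -U replicator -W -X stream'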
I am updating CouchDB from 1.4 to 2.0 on my Windows 8 system.
I have taken a backup of my data and view files from /var/lib/couchdb and uninstalled 1.4.
I installed 2.0 successfully and it is running.
Now I have copied all the data to /var/lib/couchdb and the /data folder, but Futon is not showing any databases.
I created a new database "test" and it is accessible in Futon, but I could not find it in the /data dir.
Configuration:
default.ini:
[couchdb]
uuid =
database_dir = ./data
view_index_dir = ./data
Also, I want to understand: will the upgrade require re-indexing?
You might want to look at the local port of the node you copied the data into: when you just copy the data files, it will likely work, but the databases appear on another port (5986 instead of 5984).
What this means is: when you copy the database file (those residing in the directory specified in /_config/couchdb/database_dir and ending with .couch; quoting https://blog.couchdb.org/2016/08/17/migrating-to-couchdb-2-0/ here) into the data directory of one of the nodes of the CouchDB 2.0 cluster (e.g., lib/node1/data), the database will appear in http://localhost:5986/_all_dbs (note 5986 instead of 5984: this is the so-called local port never intended for production use but helpful here).
As the local port is not a permanent solution, you can now start a replication from the local port to a clustered port (still quoting https://blog.couchdb.org/2016/08/17/migrating-to-couchdb-2-0/ - assuming you're dealing with a database named mydb resulting in a filename mydb.couch):
# create a clustered new mydb on CouchDB 2.0
curl -X PUT 'http://machine2:5984/mydb'
# replicate data (local 2 cluster)
curl -X POST 'http://machine2:5984/_replicate' -H 'Content-type: application/json' -d '{"source": "http://machine2:5986/mydb", "target": "http://machine2:5984/mydb"}'
# trigger re-build index(es) of somedoc with someview;
# do for all to speed up first use of application
curl -X GET 'http://machine2:5984/mydb/_design/somedoc/_view/someview?stale=update_after'
As an alternative, you could also replicate from the old CouchDB (still running) to the new one, as you can replicate between 1.x and 2.0 just as you could between 1.x and 1.x.
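A minimal sketch of that direct replication, assuming the old 1.x instance is still reachable at machine1:5984, the new 2.0 instance at machine2:5984, and a database named mydb (all placeholder names):

# create the target database on the 2.0 instance
curl -X PUT 'http://machine2:5984/mydb'
# one-time replication from the old 1.x instance into the new 2.0 instance
curl -X POST 'http://machine2:5984/_replicate' -H 'Content-Type: application/json' -d '{"source": "http://machine1:5984/mydb", "target": "http://machine2:5984/mydb"}'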
Use this to migrate all databases residing in CouchDB's database_dir, e.g. /var/lib/couchdb
# cd to the database dir, where all .couch files reside
cd /var/lib/couchdb
# create new databases in the target instance
for i in ./*.couch; do curl -X PUT http://machine2:5986$( echo $i | grep -oP '[^.]+(?=.couch)'); done
# one-time replication of each database from source to target instance
for i in ./*.couch; do curl -X POST http://machine1:5984/_replicate -H "Content-type: application/json" -d '{"source": "'"$( echo $i | grep -oP '[^./]+(?=.couch)')"'", "target": "http://machine2:5986'$( echo $i | grep -oP '[^.]+(?=.couch)')'"}'; done
If you are running both the source and the target CouchDB in Docker containers on the same Docker host, you might first check the Docker host IP that is mapped into the source container, in order to allow the source container to access the target container:
/sbin/ip route|awk '/default/ { print $3 }'
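For example (hypothetical values): if that command, run inside the source container, prints 172.17.0.1 and the target container publishes port 5986 on the Docker host, the target URL in the replication calls above would use that IP instead of machine2:

# run inside the source container
DOCKER_HOST_IP=$(/sbin/ip route | awk '/default/ { print $3 }')
curl -X POST 'http://localhost:5984/_replicate' -H 'Content-Type: application/json' -d '{"source": "mydb", "target": "http://'"$DOCKER_HOST_IP"':5986/mydb"}'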
What would be the most efficient way for me to export 200 databases with a total of 40 GB of data and import them into another server? I was originally planning on running a script that would export each DB to its own SQL file and then import them into the new server. If this is the best way, are there some additional flags I can pass to mysqldump that will speed it up?
The other option I saw was to directly pipe the mysqldump into an import over SSH. Would this be a better option? If so, could you provide some info on what the script might look like?
If the servers can ping each other, you could use pipes to do so:
mysqldump -hHOST_FROM -uUSER -p db-name | mysql -hHOST_TO -uUSER -p db-name
Straightforward!
[EDIT]
Answer to your question:
mysqldump -hHOST_FROM -uUSER -p --all-databases | mysql -hHOST_TO -uUSER -p
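Regarding the over-SSH variant mentioned in the question: if the MySQL ports are not open between the two servers but SSH is, a common pattern is to pipe the dump through ssh (user, host and passwords below are placeholders; passwords on the command line are visible in the process list, so a ~/.my.cnf file is preferable in practice):

# dump everything locally, compress it, and feed it straight into mysql on the target over ssh
mysqldump -u root -pSOURCE_PASSWORD --all-databases | gzip | ssh user@HOST_TO "gunzip | mysql -u root -pTARGET_PASSWORD"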
The quickest way is to use Percona XtraBackup to take hot backups. It is very fast and you can use it on a live system, whereas mysqldump can cause locking. Please avoid copying the /var/lib/mysql directory to the other server in the case of InnoDB; this can have very bad effects.
Try Percona XtraBackup; here is some more information on installation and configuration. Link here.
If both MySQL servers will have the same DBs and config, I think the best method is to copy the /var/lib/mysql dir using rsync. Stop the servers before doing the copy to avoid table corruption.
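A rough sketch of that approach, assuming both servers run the same MySQL version and the destination host is a placeholder; the service name may differ (mysql, mysqld or mariadb) depending on the distribution:

# stop MySQL on both servers first
sudo systemctl stop mysql

# copy the whole data directory to the destination (run on the source server)
sudo rsync -avzP /var/lib/mysql/ root@HOST_TO:/var/lib/mysql/

# then, on the destination: fix ownership and start MySQL again
#   sudo chown -R mysql:mysql /var/lib/mysql
#   sudo systemctl start mysql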
Export the MySQL database over SSH with the command:
mysqldump -p -u username database_name > dbname.sql
Move the dump using wget from the new server's SSH session:
wget http://www.domainname.com/dbname.sql
Import the MySQL database over SSH with the command:
mysql -p -u username database_name < dbname.sql
Done!!