Restoring data after upgrading Cassandra

I'm trying to upgrade from Cassandra to the latest DataStax Enterprise, and everything went fine except that I can't get my data back.
Basically, I had a clean Cassandra after the upgrade, so I recreated the schema and am now trying to somehow link the files left over from the old db to the new db.
This is what I have right now in the /var/lib/cassandra/data/wowch directory, for example:
drwxr-x--- 4 cassandra cassandra 4.0K Feb 27 13:05 users-247834809d2011e58d82b7a748b1d9c2/
drwxr-xr-x 2 cassandra cassandra 4.0K Feb 27 18:53 users-f41a5300dd5611e58bc7b7a748b1d9c2/
As I understand it, the older directory is what was in the db before the upgrade. It contains some db files:
total 144K
drwxr-x--- 4 cassandra cassandra 4.0K Feb 27 13:05 ./
drwxr-x--- 60 cassandra cassandra 20K Feb 27 14:35 ../
drwxr-x--- 2 cassandra cassandra 4.0K Dec 7 21:21 backups/
-rwxr-x--- 2 cassandra cassandra 51 Jan 20 00:05 ma-46-big-CompressionInfo.db*
-rwxr-x--- 2 cassandra cassandra 828 Jan 20 00:05 ma-46-big-Data.db*
-rwxr-x--- 2 cassandra cassandra 10 Jan 20 00:05 ma-46-big-Digest.crc32*
-rwxr-x--- 2 cassandra cassandra 16 Jan 20 00:05 ma-46-big-Filter.db*
-rwxr-x--- 2 cassandra cassandra 83 Jan 20 00:05 ma-46-big-Index.db*
-rwxr-x--- 2 cassandra cassandra 4.9K Jan 20 00:05 ma-46-big-Statistics.db*
-rwxr-x--- 2 cassandra cassandra 92 Jan 20 00:05 ma-46-big-Summary.db*
-rwxr-x--- 2 cassandra cassandra 92 Jan 20 00:05 ma-46-big-TOC.txt*
-rwxr-x--- 2 cassandra cassandra 43 Feb 12 15:05 ma-47-big-CompressionInfo.db*
-rwxr-x--- 2 cassandra cassandra 41 Feb 12 15:05 ma-47-big-Data.db*
-rwxr-x--- 2 cassandra cassandra 10 Feb 12 15:05 ma-47-big-Digest.crc32*
-rwxr-x--- 2 cassandra cassandra 16 Feb 12 15:05 ma-47-big-Filter.db*
-rwxr-x--- 2 cassandra cassandra 20 Feb 12 15:05 ma-47-big-Index.db*
-rwxr-x--- 2 cassandra cassandra 4.5K Feb 12 15:05 ma-47-big-Statistics.db*
-rwxr-x--- 2 cassandra cassandra 92 Feb 12 15:05 ma-47-big-Summary.db*
-rwxr-x--- 2 cassandra cassandra 92 Feb 12 15:05 ma-47-big-TOC.txt*
-rwxr-x--- 2 cassandra cassandra 43 Feb 12 16:05 ma-48-big-CompressionInfo.db*
-rwxr-x--- 2 cassandra cassandra 169 Feb 12 16:05 ma-48-big-Data.db*
-rwxr-x--- 2 cassandra cassandra 10 Feb 12 16:05 ma-48-big-Digest.crc32*
-rwxr-x--- 2 cassandra cassandra 16 Feb 12 16:05 ma-48-big-Filter.db*
-rwxr-x--- 2 cassandra cassandra 20 Feb 12 16:05 ma-48-big-Index.db*
-rwxr-x--- 2 cassandra cassandra 4.9K Feb 12 16:05 ma-48-big-Statistics.db*
-rwxr-x--- 2 cassandra cassandra 92 Feb 12 16:05 ma-48-big-Summary.db*
-rwxr-x--- 2 cassandra cassandra 92 Feb 12 16:05 ma-48-big-TOC.txt*
-rwxr-x--- 1 cassandra cassandra 31 Dec 7 21:26 manifest.json*
drwxr-x--- 3 cassandra cassandra 4.0K Feb 27 13:05 snapshots/
I tried copying everything from here into the users-f41a5300dd5611e58bc7b7a748b1d9c2/ directory and running nodetool repair or nodetool refresh -- wowch users, but had no success: the data is still not loaded.
Did I forget something? What is the right way of doing this, and how do I get the data back?

Depending on the version of Cassandra/DSE you were previously running, you may need to run nodetool upgradesstables. You can see the documentation here:
https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsUpgradeSstables.html
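For reference, a minimal invocation looks like this (run on each node after the binary upgrade; wowch and users are just the names from the question):
nodetool upgradesstables                  # rewrite SSTables for all keyspaces
nodetool upgradesstables wowch users      # or limit it to a single table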

It's possible that you've run into this issue, but without more info I can't say for sure.
You also haven't said which version you started from and which you ended on; a little more info would be very helpful. Can you also clarify: are you upgrading from community Cassandra to DSE? I couldn't tell from the way your question was worded.
Stuff to check:
Do you have the token assignments from the old version? I wasn't using vnodes, and I found that I had to manually set initial_token in cassandra.yaml after a backup/restore of my cluster.
Make sure that cassandra owns all of the directories and files.
After you import the schema, stop DSE and then empty the contents of the commitlog directory.
Move your data, if necessary, into the new folders and then restart DSE (a rough sketch of the whole sequence follows). Hope this helps.
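Concretely, assuming a single node, a packaged DSE install, and the directory names from the question (the UUID suffixes will differ on your machine), that sequence might look like:
sudo service dse stop
sudo rm -f /var/lib/cassandra/commitlog/*        # empty the commitlog
sudo cp /var/lib/cassandra/data/wowch/users-247834809d2011e58d82b7a748b1d9c2/ma-* /var/lib/cassandra/data/wowch/users-f41a5300dd5611e58bc7b7a748b1d9c2/
sudo chown -R cassandra:cassandra /var/lib/cassandra/data/wowch
sudo service dse start
nodetool refresh wowch users                     # pick up the newly placed SSTables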

Related

ZooKeeper log + snapshot files are created very frequently

Under the folder /var/hadoop/zookeeper/version-2/
we can see that ZooKeeper transaction logs and snapshot files are created very frequently (multiple files every minute), and they fill up the filesystem in a very short time.
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:24 log.570021fa92
-rw-r--r-- 1 zookeeper hadoop 490656299 Jul 28 17:24 snapshot.5700232ffa
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:29 log.5700232ffc
-rw-r--r-- 1 zookeeper hadoop 490656389 Jul 28 17:29 snapshot.5700249d7f
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:33 log.5700249d78
-rw-r--r-- 1 zookeeper hadoop 490656275 Jul 28 17:33 snapshot.570025fdaf
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:36 log.570025fdae
-rw-r--r-- 1 zookeeper hadoop 490656275 Jul 28 17:36 snapshot.570026c447
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:40 log.570026c449
-rw-r--r-- 1 zookeeper hadoop 490658969 Jul 28 17:40 snapshot.570027caed
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:43 log.570027caef
-rw-r--r-- 1 zookeeper hadoop 490658981 Jul 28 17:43 snapshot.570028a0d0
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:48 log.570028a0d2
-rw-r--r-- 1 zookeeper hadoop 165081088 Jul 28 17:48 snapshot.57002a0268
-rw-r--r-- 1 zookeeper hadoop 67108880 Jul 28 17:48 log.57002a026b
...
When we opened one of the logs, log.57002a026b, we saw what looked like encrypted content.
Any suggestion on how to decode these logs?
Or how to find out which application is creating or modifying the znodes so frequently?
PROBLEM
Zookeeper transaction logs and snapshot files are created very frequently (multiple files in every minute) and that fills up the FileSystem in a very short time.
ROOT CAUSE
One or more applications are creating or modifying znodes too frequently, causing too many transactions in a short duration. This leads to the creation of too many transaction log files and snapshot files, since they are rolled over after 100,000 entries by default (as defined by the ZooKeeper property 'snapCount').
RESOLUTION
The resolution for such cases involves reviewing the ZooKeeper transaction logs to find the znodes that are updated or created most frequently, using the following command on one of the ZooKeeper servers:
# cd /usr/hdp/current/zookeeper-server
# java -cp zookeeper.jar:lib/* org.apache.zookeeper.server.LogFormatter /hadoop/zookeeper/version-2/logxxx
(where 'dataDir' is set to '/hadoop/zookeeper' in the ZooKeeper configuration)
Once the frequently updated znodes are identified using the above command, continue by fixing the related application that is creating such a large number of updates on ZooKeeper.
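If it helps, here is one rough way to aggregate that output across log files and surface the hottest znodes (the exact record format varies between ZooKeeper versions, so the grep pattern may need adjusting):
cd /usr/hdp/current/zookeeper-server
for f in /hadoop/zookeeper/version-2/log.*; do
  java -cp zookeeper.jar:lib/* org.apache.zookeeper.server.LogFormatter "$f"
done | grep -oE "'/[^,']+" | sort | uniq -c | sort -rn | head -20   # top 20 znode paths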
An example of an application that can cause this problem is HBase, when a very large number of regions are stuck in transition and repeatedly fail to come online.

Cassandra - Node stuck in joining as another node is down

I am trying to add another node to a production Cassandra cluster because disk space utilization across the nodes is reaching over 90%. However, the node has been in the joining state for over 2 days. I also noticed that one of the nodes went down (DN) because it is at 100% disk space utilization; the Cassandra server is unable to run on that instance!
Will this affect the bootstrapping completion of the new node?
Are there any immediate solutions for restoring space on the node that went down?
If I remove this node from the ring, that may add more data load and increase disk usage on the other nodes.
Can I temporarily move some SSTables (like the files listed below) off the instance, bring up the server, perform cleanup, and then add these files back?
-rw-r--r--. 1 polkitd input 5551459 Sep 17 2020 mc-572-big-CompressionInfo.db
-rw-r--r--. 1 polkitd input 15859691072 Sep 17 2020 mc-572-big-Data.db
-rw-r--r--. 1 polkitd input 8 Sep 17 2020 mc-572-big-Digest.crc32
-rw-r--r--. 1 polkitd input 22608920 Sep 17 2020 mc-572-big-Filter.db
-rw-r--r--. 1 polkitd input 5634549206 Sep 17 2020 mc-572-big-Index.db
-rw-r--r--. 1 polkitd input 12538 Sep 17 2020 mc-572-big-Statistics.db
-rw-r--r--. 1 polkitd input 44510338 Sep 17 2020 mc-572-big-Summary.db
-rw-r--r--. 1 polkitd input 92 Sep 17 2020 mc-572-big-TOC.txt
If you are using vnodes then the downed node will surely impact bootstrapping. For immediate relief, identify tables which are not used by live traffic and move those tables' SSTables out to a backup location.
I resolved this by temporarily increasing the EBS volume (disk space) on that node, bringing up the server, then removing the node from the cluster, clearing out the Cassandra data folders, decreasing the EBS volume, and then adding the node back to the cluster.
One thing that I noticed was that removing the node from the cluster increased disk usage on the other nodes. So I added additional nodes to distribute the load, then ran cleanup on all the other nodes before removing the node from the cluster.
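For anyone following the same route, the nodetool side of that sequence is roughly this (a sketch; run each command on the appropriate node and check the ring in between):
nodetool cleanup         # on each existing node, after the new nodes have joined
nodetool decommission    # on the node being removed; streams its ranges to the others
nodetool status          # confirm ownership and load afterwards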

Cache accumulation of long running spark application

We have a long-running Spark Streaming application in our Hadoop cluster. The problem is that the cache directory size grows continuously until we stop the Spark application.
Directory: yarn/local/usercache
For now, we restart the application periodically. Not a smart way...
Can we limit the size of this directory?
File list example
-r-x------ 1 yarn hadoop 169M Sep 20 14:53 /data/hadoop/yarn/local/usercache/username/filecache/81/appname-SNAPSHOT.jar
-r-x------ 1 yarn hadoop 169M Sep 20 15:55 /data/hadoop/yarn/local/usercache/username/filecache/84/appname-SNAPSHOT.jar
-r-x------ 1 yarn hadoop 169M Sep 20 16:02 /data/hadoop/yarn/local/usercache/username/filecache/87/appname-SNAPSHOT.jar
-r-x------ 1 yarn hadoop 169M Sep 20 17:30 /data/hadoop/yarn/local/usercache/username/filecache/90/appname-SNAPSHOT.jar
-r-x------ 1 yarn hadoop 169M Sep 21 10:55 /data/hadoop/yarn/local/usercache/username/filecache/93/appname-SNAPSHOT.jar
-r-x------ 1 yarn hadoop 169M Sep 21 11:01 /data/hadoop/yarn/local/usercache/username/filecache/96/appname-SNAPSHOT.jar
-r-x------ 1 yarn hadoop 169M Sep 21 12:14 /data/hadoop/yarn/local/usercache/username/filecache/99/appname-SNAPSHOT.jar
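One thing worth checking (an assumption on my part, not verified for this setup): the NodeManager's localizer cache is bounded by two yarn-site.xml properties, so lowering the target size may keep the filecache from growing without bound. Note that resources still referenced by a running application are not deleted, so this mainly reclaims space left behind by finished work:
<property>
  <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
  <value>1024</value>      <!-- target cache size in MB; the default is 10240 -->
</property>
<property>
  <name>yarn.nodemanager.localizer.cache.cleanup.interval-ms</name>
  <value>600000</value>    <!-- how often the cleanup service runs; default 600000 (10 min) -->
</property>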

Cassandra keyspace fails when using symbolic link

Need: create keyspace on alternate device
Problem: the service aborts on startup with the directory-create failure messages below.
INFO [main] 2017-01-06 00:45:03,300 ViewManager.java:137 - Not submitting build tasks for views in keyspace system_schema as storage service is not initialized
ERROR [main] 2017-01-06 00:45:03,393 Directories.java:239 - Failed to create /var/lib/cassandra/data/opus/aa-15be7240d3db11e6ad0eed0a1d791016 directory
ERROR [main] 2017-01-06 00:45:03,397 DefaultFSErrorHandler.java:92 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
Context: Cassandra 3.9 single-node ubuntu 16.04; directory perms are below.
01:52 opus/ cd /var/lib/cassandra/data
01:52 opus/ ls -l
total 24
drwxr-xr-x 3 cassandra cassandra 4096 Jan 6 00:41 opus
drwxr-xr-x 24 cassandra cassandra 4096 Jan 5 23:49 system
drwxr-xr-x 6 cassandra cassandra 4096 Jan 5 23:50 system_auth
drwxr-xr-x 5 cassandra cassandra 4096 Jan 5 23:50 system_distributed
drwxr-xr-x 12 cassandra cassandra 4096 Jan 5 23:50 system_schema
drwxr-xr-x 4 cassandra cassandra 4096 Jan 5 23:50 system_traces
01:52 opus/ cd opus
01:52 opus/ ls -l
total 4
drwxr-xr-x 3 cassandra cassandra 4096 Jan 6 00:41 aa-15be7240d3db11e6ad0eed0a1d791016
When the link is installed:
01:57 data/ ls -l
total 20
lrwxrwxrwx 1 root root 35 Jan 6 01:57 opus -> /media/opus/quantdrive/opus
Steps:
Vanilla install of cassandra 3.9;
Create keyspace in cqlsh create keyspace opus with replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
Create table use opus; create table aa(aa int, primary key(aa));
Stop cassandra
Move keyspace dir mv /var/lib/cassandra/data/opus /media/opus/quantdrive
Create symbolic link ln -s /media/opus/quantdrive/opus /var/lib/cassandra/opus
Start cassandra [FAILS AS ABOVE] with a directory-create error, even though the directory is already present
No change in perms on opus keyspace directory, I just moved it. When I move it back, cassandra starts fine.
I would be grateful for any help with this, and I apologize in advance if the solution to my problem is described elsewhere or if I'm missing the obvious.
Move the mount point for the target drive from a user-owned directory to a root-owned one. In my case I moved the mount point from /media/opus/quantdrive, which is owned by user opus, to /mnt/quantdrive, which is owned by root, and everything worked fine.
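A sketch of that change, assuming the drive is /dev/sdb1 (the device name is hypothetical; substitute your own):
sudo umount /media/opus/quantdrive
sudo mkdir -p /mnt/quantdrive                   # root-owned mount point
sudo mount /dev/sdb1 /mnt/quantdrive
sudo rm /var/lib/cassandra/data/opus            # drop the old symlink
sudo ln -s /mnt/quantdrive/opus /var/lib/cassandra/data/opus
sudo chown -h cassandra:cassandra /var/lib/cassandra/data/opus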

Linux Joomla Can't write to file with 755 permission

Hello, I am trying to set up Joomla. When I try to change some settings through the Global Settings Manager and then save, I keep getting an error saying I can't write to that file.
I have tried playing around with the settings and file permissions, even changing them to 755, and it still won't let me write to the file.
I have the owner set to 'root.root' and am running Fedora 18.
I have it installed on localhost, and not through FTP.
Why can't I write to these files (like configuration.php)? Is there something I am missing?
Joomla does not tell me which file it is trying to write to, but I assume that editing the Global Settings writes to configuration.php.
here is the output of ls -l /var/www/html/joomla
total 116
-rw-r--r--. 1 apache apache 17816 Nov 6 15:18 LICENSE.txt
-rw-r--r--. 1 apache apache 4300 Nov 6 15:18 README.txt
drwxr-xr-x. 10 apache apache 4096 Nov 6 15:18 administrator
drwxr-xr-x. 2 apache apache 4096 Nov 6 15:18 bin
drwxr-xr-x. 2 apache apache 4096 Nov 6 15:18 cache
drwxr-xr-x. 2 apache apache 4096 Nov 6 15:18 cli
drwxr-xr-x. 17 apache apache 4096 Nov 6 15:18 components
-rw-r--r--. 1 apache apache 2018 Dec 6 05:56 configuration.php
-rw-r--r--. 1 apache apache 3118 Nov 6 15:18 htaccess.txt
drwxr-xr-x. 5 apache apache 4096 Nov 6 15:18 images
drwxr-xr-x. 2 apache apache 4096 Nov 6 15:18 includes
-rw-r--r--. 1 apache apache 1011 Nov 6 15:18 index.php
-rw-r--r--. 1 apache apache 1909 Nov 6 15:20 joomla.xml
drwxr-xr-x. 4 apache apache 4096 Nov 6 15:18 language
drwxr-xr-x. 4 apache apache 4096 Nov 6 15:18 layouts
drwxr-xr-x. 12 apache apache 4096 Nov 6 15:18 libraries
drwxr-xr-x. 2 apache apache 4096 Dec 6 04:51 logs
drwxr-xr-x. 18 apache apache 4096 Nov 6 15:18 media
drwxr-xr-x. 28 apache apache 4096 Nov 6 15:18 modules
drwxr-xr-x. 14 apache apache 4096 Nov 6 15:18 plugins
-rw-r--r--. 1 apache apache 901 Nov 6 15:18 robots.txt.dist
drwxr-xr-x. 5 apache apache 4096 Dec 6 04:39 templates
drwsr-xr-x. 2 apache apache 4096 Dec 6 04:44 tmp
-rw-r--r--. 1 apache apache 1715 Nov 6 15:18 web.config.txt
And output of ls -ld joomla/
drwxr-xr-x. 18 apache apache 4096 Dec 6 05:57 joomla/
Also, running the command tail -f /var/log/httpd/error_log I get this
PHP Warning: file_put_contents(/var/www/html/joomla/configuration.php): failed to open stream: Permission denied in /var/www/html/joomla/libraries/joomla/filesystem/file.php on line 422, referer: http://localhost/administrator/index.php?option=com_config
After digging a bit deeper into the problem, I discovered that SELinux was blocking read/write access for httpd. This could be seen by running
ls -aLZ joomla
Running that command showed that all files were labeled
httpd_sys_content_t
When they really should be
httpd_sys_rw_content_t
Running a simple
chcon -R -t httpd_sys_rw_content_t /var/www/html/joomla/
AND VOILA! Problem Solved.
Thank you everyone for the help, and I hope someone else can benefit from this in the near future.
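One follow-up worth knowing: chcon changes can be lost when the filesystem is relabeled. To make the context persistent, record it in the SELinux policy and then relabel (assuming the semanage tool from policycoreutils is installed):
semanage fcontext -a -t httpd_sys_rw_content_t '/var/www/html/joomla(/.*)?'
restorecon -Rv /var/www/html/joomla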
Try restarting the webserver, so that the permission change gets reflected.
