hdfs + what would be the main cause of UNDER REPLICA blocks - linux

I'm just curious to know what would be the main cause of UNDER REPLICA blocks
We have ambari cluster with HDP version - 2.6.5
Number of data nodes machines are - 5
As it would always have at least three copies, I thought this would be hard to happen (but happens)
If HDFS can't create one copy or detects corruption, wouldn't it try to recover by copying a good one into another DataNode?
Or once a file was properly created in HDFS, does it never check if file is corrupted or not until HDFS is restarted?
To fix the under replica we can use the following steps:
su hdfs
hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' >> /tmp/under_replicated_files
for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ; hadoop fs -setrep 3 $hdfsfile; done
how hadoop fs -setrep 3 , is works?

Related

how to get Linux to automatically remove old pages from pagecache?

Ever since upgrading to Linux kernel 5.8 and later, I've been having problems with my system freezing up from running out of RAM, and it's all going to the pagecache.
I have a program that reorganizes the data from the OpenStreetMap planet_latest.osm.pbf file into a structure that's more efficient for my usage. However, because this file is larger than the amount of RAM on my system (60GB file versus 48GB RAM), the page cache fills up. Before kernel 5.8, the cache would reach full, and then keep chugging along (with an increase in disk thrashing). Since 5.8, the system freezes because it won't ever automatically release a page from the page cache (such as 30GB earlier in my sequential read of the planet_latest.osm.pbf file). I don't need to use my reorganizing program to hang the system; I found the following unprivileged command would do it:
cat planet_latest.osm.pbf >/dev/null
I have tried using the fadvise64() system service to manually force releases of pages in the planet file I have already passed; it helps, but doesn't entirely solve the problem with the various output files my program creates (especially when those temporary output files are randomly read back later).
So, what does it take to get the 5.8 through 5.10 Linux kernel to actually automatically release old pages from the page cache when system RAM gets low?
To work around the problem, I have been using a script to monitor cache size and write to /proc/sys/vm/drop_caches when the cache gets too large, but of course that also releases new pages I am currently using along with obsolete pages.
while true ; do
H=`free | head -2 | tail -1 | awk '{print $6}'`
if [ $H -gt 35000000 ]; then
echo -n $H " # " ; date
echo 1 >/proc/sys/vm/drop_caches
sensors | grep '°C'
H=`free | head -2 | tail -1 | awk '{print $6}'`
echo -n $H " # "; date
fi
sleep 30
done
(the sensors stuff is to watch out for CPU overheating in stages of my program that are multi-threaded CPU-intensive rather than disk-intensive).
I have filed a bug report at kernel.org, but they haven't looked at it yet.

cassandra - Take snapshot to different location

I want to take Cassandra backup at every 1hr interval and move it to Shared location.
Cassandra taking the snapshot in the default location, how can I take the snapshot on /opt/backup location?
You can't (with snapshots).
nodetool snapshot -t <tag> <keyspace> is a quite simple tool - it just creates hard links for every file in your keyspace data directories to snapshots/<tag>.
Since these are hard links they have to be on the same filesystem. Benefit of those hard links is that a snapshot is quite fast and doesn't consume additional disk space initially (when sstables got compacted / deleted the files remain in the snapshot).
If you want those backups in a different location use -t <tag> while creating your snapshot. I made up a demo with demosnapshot and a simple script (not fully elaborated but shows the idea:
$ cat cassandrabackup.sh
#!/bin/bash
TAG=`date +%Y%m%d%H%M%S`
BACKUP_LOC=/tmp/backup/`hostname`
KEYSPACE=demokeyspace
echo creating snapshot $TAG
nodetool snapshot -t $TAG $KEYSPACE
echo sync to backup location $BACKUP_LOC
find /var/lib/cassandra -type f -path "*snapshots/$TAG*" -printf %P\\0 | rsync -avP --files-from=- --from0 /var/lib/cassandra/ $BACKUP_LOC
echo removing snapshot $TAG
nodetool clearsnapshot -t $TAG
The script creates a snaphot with a specific tag (datetime), rsyncs the contents to a backup location and then removes the snapshot. If KEYSPACE is not defined all keyspaces are backuped.
Result is like this:
$ ./cassandrabackup.sh
creating snapshot 20170823132936
Requested creating snapshot(s) for [demokeyspace] with snapshot name [20170823132936] and options {skipFlush=false}
Snapshot directory: 20170823132936
sync to backup location /tmp/backup/host1.domain.tld
building file list ...
6 files to consider
data1/
data1/demokeyspace/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936/
data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936/manifest.json
13 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=0/6)
sent 305 bytes received 50 bytes 710.00 bytes/sec
total size is 13 speedup is 0.04
removing snapshot 20170823132936
Requested clearing snapshot(s) for [all keyspaces] with snapshot name [20170823132936]
$ ifjke#fsca01:~$ find /tmp/backup/
/tmp/backup/
/tmp/backup/host1.domain.tld
/tmp/backup/host1.domain.tld/data2
/tmp/backup/host1.domain.tld/data2/demokeyspace
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823125951
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823125951/manifest.json
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823130014
/tmp/backup/host1.domain.tld/data2/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823130014/manifest.json
/tmp/backup/host1.domain.tld/data1
/tmp/backup/host1.domain.tld/data1/demokeyspace
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936
/tmp/backup/host1.domain.tld/data1/demokeyspace/demotable-0bbb579087ef11e7aa786377cd3ba823/snapshots/20170823132936/manifest.json
$
As I did that error by myself in the past - include hostname in the backups ;)
Apart from that there is also an incremental backup feature in cassandra:
http://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsBackupIncremental.html

To find Cassandra disk space usage

I am using Jconsole for monitoring Cassandra. I can get value like how much load each keyspace is having.
I want to find out disk space usage for each node in a cluster by remotely.
Is there any way to do so?
A shell script can do the trick
for i in node1_ip node2_ip ... nodeN_ip
do
ssh user#$i "du -sh /var/lib/cassandra/data" >> /tmp/disk_usage.txt
done
Replace /var/lib/cassandra/data if your data folder is put somewhere else

Too many open files - KairosDB

on running this query:
{ "start_absolute":1359695700000, "end_absolute":1422853200000,
"metrics":[{"tags":{"Building_id":["100"]},"name":"meterreadings","group_by":[{"name":"time","group_count":"12","range_size":{"value":"1","unit":"MONTHS"}}],"aggregators":[{"name":"sum","align_sampling":true,"sampling":{"value":"1","unit":"Months"}}]}]}
I am getting the following response:
500 {"errors":["Too many open files"]}
Here this link it is written that increase the size of file-max.
My file-max output is:
cat /proc/sys/fs/file-max
382994
it is already very large, do I need to increase its limit
What version are you using? Are you using a lot of grou-by in your queries?
You may need to restart kairosDB as a workaround.
Can you check if you have deleted (ghost) files handles (replace by kairosDB process ID in the command line below)?
ls -l /proc/<PID>/fd | grep kairos_cache | grep -v '(delete)' | wc -l
THere was a fix in 0.9.5 for unclosed file handles.
There's a fix pending for next release (1.0.1).
cf. https://github.com/kairosdb/kairosdb/pull/180, https://github.com/kairosdb/kairosdb/issues/132, and https://github.com/kairosdb/kairosdb/issues/175.

Remote linux server to remote linux server large sparse files copy - How To?

I have two twins CentOS 5.4 servers with VMware Server installed on each.
What is the most reliable and fast method for copying virtual machines files from one server to the other, assuming that I always use sparse file for my vmware virtual machines?
The vm's files are a pain to copy since they are very large (50 GB) but since they are sparse files I think something can be done to improve the speed of the copy.
If you want to copy large data quickly, rsync over SSH is not for you. As running an rsync daemon for quick one-shot copying is also overkill, yer olde tar and nc do the trick as follows.
Create the process that will serve the files over network:
tar cSf - /path/to/files | nc -l 5000
Note that it may take tar a long time to examine sparse files, so it's normal to see no progress for a while.
And receive the files with the following at the other end:
nc hostname_or_ip 5000 | tar xSf -
Alternatively, if you want to get all fancy, use pv to display progress:
tar cSf - /path/to/files \
| pv -s `du -sb /path/to/files | awk '{ print $1 }'` \
| nc -l 5000
Wait a little until you see that pv reports that some bytes have passed by, then start the receiver at the other end:
nc hostname_or_ip 5000 | pv -btr | tar xSf -
Have you tried rsync with the option --sparse(possibly over ssh)?
From man rsync:
Try to handle sparse files efficiently so they take up less
space on the destination. Conflicts with --inplace because it’s
not possible to overwrite data in a sparse fashion.
Since rsync is terribly slow at copying sparse file, I usually resort using tar over ssh :
tar Scjf - my-src-files | ssh sylvain#my.dest.host tar Sxjf - -C /the/target/directory
You could have a look at http://www.barricane.com/virtsync
(Disclaimer: I am the author.)

Resources