I'm trying to set up Cassandra 3.11.6.1 (I also tried 3.11.4.1) but can't get it to work. In the YAML configuration I set the directories for commitlog, hints, data and saved caches to another root directory, but the logs suggest that Cassandra ignores these settings, because it tries to open directories under the default root:
WARN [HintsWriteExecutor:1] 2020-05-06 11:30:34,864 NativeLibrary.java:306 - open(/var/lib/cassandra/hints, O_RDONLY) failed, errno (2).
ERROR [HintsWriteExecutor:1] 2020-05-06 11:30:34,864 HintsCatalog.java:167 - Unable to open directory /var/lib/cassandra/hints
The group and owner are set correctly, and the mode is 0777 to rule out any permission problems.
The last thing I tried was creating a symlink /var/lib/cassandra pointing to my datastore directory, but that didn't change anything.
Is it possible to use a directory configuration other than the default one?
Has anyone faced this problem and solved it? (And how, please?)
The problem was that Cassandra reads /etc/cassandra/default.conf/cassandra.yaml even if you create your own cassandra.yaml in /etc/cassandra.
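One way to confirm which file is actually in effect is to print the directory settings from each candidate cassandra.yaml and compare them with the paths in the log. A minimal sketch, assuming the two locations mentioned above; show_cassandra_dirs is a hypothetical helper name, not something that ships with Cassandra:

```shell
# Hypothetical helper: print the directory-related settings of one cassandra.yaml.
# This is a plain text scan with awk, not a real YAML parser.
show_cassandra_dirs() {
    # $1: path to a cassandra.yaml file
    awk '
        # Print the setting line itself and remember that we are inside it.
        /^(data_file_directories|commitlog_directory|hints_directory|saved_caches_directory):/ {
            print; in_key = 1; next
        }
        # Print indented "- /path" list entries that follow (data_file_directories is a list).
        in_key && /^[ \t]+-/ { print; next }
        # Any other line ends the current setting.
        { in_key = 0 }
    ' "$1"
}

# Usage (paths taken from the post; adjust to your installation):
#   show_cassandra_dirs /etc/cassandra/default.conf/cassandra.yaml
#   show_cassandra_dirs /etc/cassandra/cassandra.yaml
```

If the first file still shows the /var/lib/cassandra paths, that would match the log lines above.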
We are facing a peculiar problem in one of our two environments. A PutFile processor throws the following error:
PutFile[id=xxx] Penalizing StandardFlowFileRecord[uuid=xxx,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=xxx, container=default, section=1012], offset=94495, length=9778],offset=0,name=xxxxxxxxxxxxxxxx_PROD_20200513020001.json.gz,size=9778] and routing to 'failure' because the output directory /data/home/datadelivery/OUT/Test does not exist and Processor is configured not to create missing directories
After enabling the creation of missing directories, the error changes to:
Could not set create directory with permissions 664 because /data/home/datadelivery/OUT/Test: java.nio.file.AccessDeniedException: /data/home/datadelivery/OUT/TestPutFile[id=xxx...
Based on the error message one would think that it is an issue with file and folder permissions, however, the path /data/home/datadelivery/OUT/Test exists, and the nifi user can access and create files and folders in there as well (verified from the command line). The same folder permissions and ownership rights are configured on our DEV environment, where the PutFile processor works as expected. We could change the configuration to use a different location, but I'd rather find the root cause instead.
Where should I start debugging?
Thank you for your help in advance!
Kind regards, Julius
Strange issue. I would try setting full permissions on the folder/file you want to write to (i.e. chmod 777 and chown nifi:nifi, recursively) and see whether the error is still there. If it goes away, that's at least a starting point.
Restarting the NiFi service solved the problem. The issue was that the Unix user (nifi) was modified months after starting the NiFi service. Most probably this was the reason the PutFile processor wasn't able to access a folder which the nifi unix user could.
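A running process keeps the user and group IDs it was started with, so changes made to the account afterwards do not reach it until a restart. A rough sketch for spotting such a mismatch on Linux; compare_process_identity is a hypothetical helper name, and the pgrep pattern in the usage line is an assumption about how the NiFi process shows up:

```shell
# Hypothetical helper: show the IDs held by a running process next to the
# current account database entry for a user. Linux only (reads /proc).
compare_process_identity() {
    # $1: process ID, $2: user name
    echo "process $1:"
    # Uid/Gid/Groups as the kernel recorded them for this process
    grep -E '^(Uid|Gid|Groups):' "/proc/$1/status"
    echo "account $2 now:"
    # What the account looks like today
    id "$2"
}

# Usage (assumes the oldest process whose command line contains "nifi" is the service):
#   compare_process_identity "$(pgrep -o -f nifi)" nifi
```

If the numeric IDs in the two halves differ, the service is most likely still running with the old identity.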
After starting Cassandra and beginning batch writes, the system disk fills up, as df -h shows. But I can't find which files are using the space; inspecting with du -h turned up nothing. The problem persists after restarting the machine.
When I delete some files and start Cassandra again, I get about 11 GB available.
Any advice on how to solve this problem?
Thanks
For data and commit log files, check the following locations. You can configure them in the cassandra.yaml file.
data_file_directories
The directory location where table data (SSTables) is stored. Cassandra distributes data evenly across the location, subject to the granularity of the configured compaction strategy. Default locations:
Package installations: /var/lib/cassandra/data
Tarball installations: install_location/data/data
commitlog_directory
The directory where the commit log is stored. Default locations:
Package installations: /var/lib/cassandra/commitlog
Tarball installations: install_location/data/commitlog
For more information read this.
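To see how much space these locations take, something like the following could be used. It is only a sketch: report_cassandra_usage is a hypothetical helper name, the defaults are the package-installation paths listed above, and reading those directories may require root.

```shell
# Hypothetical helper: print the total size of each given directory.
report_cassandra_usage() {
    # With no arguments, fall back to the package-installation defaults.
    if [ "$#" -eq 0 ]; then
        set -- /var/lib/cassandra/data /var/lib/cassandra/commitlog
    fi
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # -s: one total per directory, -h: human-readable sizes
            du -sh "$dir"
        else
            echo "missing: $dir"
        fi
    done
}

# Usage:
#   report_cassandra_usage
#   report_cassandra_usage /my/data/dir /my/commitlog/dir
```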
My Issue
I am having trouble removing MongoDB warnings about Transparent Huge Pages (THP) on an OVH CentOS 7 installation, and the issue appears to be the inability to write to /sys/kernel/mm as root.
First, I realize the OVH kernel is customized, and I know many of you will say to go with a fresh non-customized kernel, but that's not an option right now. I need to solve this problem for the current OS.
MongoDB Warnings:
2016-03-09T00:31:45.889-0500 W CONTROL [initandlisten] Failed to probe "/sys/kernel/mm/transparent_hugepage": Permission denied
2016-03-09T00:31:45.889-0500 W CONTROL [initandlisten] Failed to probe "/sys/kernel/mm/transparent_hugepage": Permission denied
MongoDB is trying to read the transparent_hugepage files (below), but they do not exist:
/sys/kernel/mm/transparent_hugepage/enabled
/sys/kernel/mm/transparent_hugepage/defrag
Cannot Create the Files
All of the solutions I've seen involve creating the files and populating them with never, including the script in the MongoDB documentation. In all of the solutions, this is the key part:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
However, the files do not exist, and I cannot create anything under /sys/kernel/mm as root.
root#myhost [~]# echo never > /sys/kernel/mm/transparent_hugepage/enabled
-bash: /sys/kernel/mm/transparent_hugepage/enabled: No such file or directory
root#myhost [~]# mkdir -p /sys/kernel/mm/transparent_hugepage
mkdir: cannot create directory ‘/sys/kernel/mm/transparent_hugepage’: Operation not permitted
The owner and group of directory /sys/kernel/mm are root, and I have temporarily changed the permissions from 700 to 777, yet I still cannot create the directory as root.
Tuned Profile Also Doesn't Help
To be thorough, I have also created the custom Tuned profile (per instructions in MongoDB link above) and activated it, but it generates the error WARNING tuned.plugins.plugin_vm: Option 'transparent_hugepages' is not supported on current hardware.
Tuned Profile (/etc/tuned/no-thp/tuned.conf):
[main]
include=virtual-guest
[vm]
transparent_hugepages=never
Error in Tuned log:
WARNING tuned.plugins.plugin_vm: Option 'transparent_hugepages' is not supported on current hardware.
Some Solution in MongoDB Itself?
It seems like the best solution would be to somehow explicitly configure MongoDB not to use THP so that it wouldn't have to check for the missing files, but I've seen nothing like this. If there is a way, even if it involves customizing MongoDB (and repeating after every update), I'm willing to do it.
I've installed CentOS 7 on OVH. They use /boot/bzImage-3.14.32-xxxx-grs-ipv6-64, a kernel that implements grsecurity (https://grsecurity.net), which blocks access to some folders.
The MongoDB warnings about huge pages can be resolved simply by replacing the kernel. The procedure for CentOS 7 is as follows:
Download the required kernel from the OVH FTP server (ftp://ftp.ovh.net/made-in-ovh/bzImage2) into the /boot folder.
Edit /etc/grub2.cfg:
# linux /boot/bzImage-3.14.32-xxxx-grs-ipv6-64 root=/dev/md1 ro net.ifnames=0
linux /boot/bzImage-4.8.17-xxxx-std-ipv6-64 root=/dev/md1 ro net.ifnames=0
Here I replaced the default bzImage-3.14.32-xxxx-grs-ipv6-64 with bzImage-4.8.17-xxxx-std-ipv6-64, which is built without grsecurity.
Now reboot and check that the new kernel is running:
root#ns506846 ~]# uname -r
4.8.17-xxxx-std-ipv6-64
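Files under /sys are created by the kernel itself, which is why mkdir fails there even for root. Once a kernel that exposes the THP interface is running, the echo commands from the question can be applied. A cautious sketch that writes only when the files exist; disable_thp is a hypothetical helper name, and its optional directory argument is there only so the function can be tried against a scratch directory:

```shell
# Hypothetical helper: write "never" to the THP control files, but only
# when the running kernel actually exposes them.
disable_thp() {
    # $1 (optional): directory holding the control files; defaults to the sysfs path from the question
    thp_dir="${1:-/sys/kernel/mm/transparent_hugepage}"
    for name in enabled defrag; do
        if [ -w "$thp_dir/$name" ]; then
            echo never > "$thp_dir/$name"
        else
            # Do not try to create the file: sysfs entries cannot be created by hand.
            echo "skipped: $thp_dir/$name is missing or not writable" >&2
        fi
    done
}

# Usage (as root):
#   disable_thp
```

The setting does not survive a reboot, so it still has to be reapplied at boot, for example by the init script from the MongoDB documentation mentioned in the question.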
Regarding the location of the data files and system files that Cassandra creates: I need to move the "commitlog_directory", "data_file_directories" and "saved_caches_directory", which are set in the "cassandra.yaml" config file. They are currently at the default location "/var/lib/cassandra". The data is only some test data plus, of course, the system-generated keyspaces, which are
dse_perf
dse_system
OpsCenter
system
system_traces
There are also the commitlog and saved_caches.db to move.
I am thinking of moving the keyspace directories with linux shell commands but I'm very unsure if they will become corrupt somehow. There is simply no space in the default drive and we need to move everything to the secondary and tertiary mounted drives.
Right now I'm in the process of moving all the files and resetting the yaml settings.
I have two questions:
1. Besides the cassandra.yaml file, are there any other files that depend on the location of commitlog_directory, data_file_directories and saved_caches_directory, and whose 'wrong location' will cause a failure once I move all these files? I am also concerned that the files inside the tables themselves (like the db files) hold references to their own location and will cause a failure once they are moved.
2. If I just change the three settings commitlog_directory, data_file_directories and saved_caches_directory, will DSE/Cassandra actually create all the system keyspaces (system_traces, dse_perf, system, OpsCenter, dse_system), the commitlog and the saved_caches.db, and will any other upstream config files be out of sync with that (same as the first part of question 1)?
It is a very new installation, so reinstalling would not be the end of the world, but I really don't want to because we have Kerberos and all kinds of other stuff on top of this cluster now.
The OS is Ubuntu 14.04 and the DSE version is 4.7.
I just finished doing this. My instances are in AWS EC2 so your process may vary, but in essence:
create a new volume and attach it to the instance. My new device was /dev/xvdg.
create a new mount point: sudo mkdir /new_data
format the new volume: sudo mkfs -t ext4 /dev/xvdg
edit /etc/fstab so that your mount will survive reboots, and add this line: /dev/xvdg /new_data ext4 defaults,nofail,nobootwait 0 2
mount the new volume: sudo mount -a
make the new directories: sudo mkdir -p /new_data/lib/cassandra/commitlog
change the ownership: sudo chown -R cassandra:cassandra /new_data/lib/cassandra
change cassandra.yaml to point to the new dirs
drain the node. If you're moving the data dir, copy the data over from the old location to the new location. If you're moving the commitlog only, just restart Cassandra.
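The last steps could be laid out roughly as below. This is a sketch, not a tested procedure: plan_cassandra_move is a hypothetical helper that only prints the commands for review, and the service name (cassandra by default; DSE package installs typically use dse) and the use of rsync for the copy are assumptions.

```shell
# Hypothetical helper: print (do not run) the commands for moving the
# Cassandra directories from one root to another.
plan_cassandra_move() {
    old_dir="$1"            # e.g. /var/lib/cassandra
    new_dir="$2"            # e.g. /new_data/lib/cassandra
    svc="${3:-cassandra}"   # service name is an assumption; pass "dse" if that is what you run
    cat <<EOF
nodetool drain
sudo service $svc stop
sudo rsync -a $old_dir/ $new_dir/
sudo chown -R cassandra:cassandra $new_dir
# edit cassandra.yaml so the directory settings point under $new_dir
sudo service $svc start
EOF
}

# Usage:
#   plan_cassandra_move /var/lib/cassandra /new_data/lib/cassandra
```

Printing the plan first makes it easy to check the paths before anything on the node is touched.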
I was able to move all the files and the commitlog as well. I changed the yaml to point to where I wanted everything to go. Remember to run the following command afterward:
chown -R cassandra:cassandra
And voila! Everything is reading/writing as it should. Cassandra is neato.
I have a single-instance Hadoop cluster configured to run with an IP address (instead of localhost) on CentOS Linux. I was able to run an example MapReduce job correctly, which tells me that the Hadoop setup appears to be fine.
I have also added a couple of data files to HDFS under the "/data" folder, and they are visible through the "dfs" command:
bin/hadoop dfs -ls /data
I am trying to connect to this HDFS system from PDI/Kettle. In the HDFS file browser, if I enter the HDFS connection parameters incorrectly (e.g. a wrong port), it says it cannot connect to the HDFS server. If I enter all the parameters correctly (server, port, user, password) and click 'connect', it gives no error, meaning it is able to connect. But the file list shows only "/".
It doesn't show the data folder. What could be going wrong?
I've already tried the following:
ran chmod 777 on the data files using "bin/hadoop dfs -chmod -R 777 /data"
tried using root and also the hdfs Linux user in the PDI file browser
tried adding the data files in some other location
reformatted HDFS several times and added the data files again
copied the hadoop-core jar file from the Hadoop installation to the PDI extlib folder
but it still does not list the files in the PDI browser. I cannot see anything in the PDI log either... Need quick help... thanks!
-abhay
I got past this issue. On Windows, PDI was not logging anything to the log file. I tried the same thing on Linux, where the log showed that it was missing an Apache library, commons-configuration. I downloaded the latest version of it, put it under the extlib/pentaho folder, and boom! It worked!