Changing data file directories in Cassandra

I'm trying to change the Cassandra data, commit log and saved caches directories by defining a custom shell script for CASSANDRA_INCLUDE. I'm modifying the properties in the script as follows:
***
data_file_directories = "/usr/pic1/kearanky/cassandra/data"
commitlog_directory = "/usr/pic1/kearanky/cassandra/commitlog"
saved_caches_directory: "/usr/pic1/kearanky/cassandra/saved_caches"
***
When I run cassandra I get the error "data_file_directories: command not found". How can I modify the directories correctly?
PS: I don't have write access to cassandra.yaml and can't create the default directories it uses.

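The include file is sourced as a shell script, so a line such as data_file_directories = "..." is parsed as a command, which is why you get "command not found". Those settings belong in cassandra.yaml.
Refer to this answer: make your own cassandra.yaml with your custom directories, then run cassandra with the -D flag, setting cassandra.config to the location of that file as a URI (for example -Dcassandra.config=file:///path/to/cassandra.yaml).
Alternatively, set the $CASSANDRA_HOME variable in your .bashrc and then run cassandra.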
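As a rough sketch of the first suggestion, the snippet below builds a launch command that points Cassandra at a private copy of cassandra.yaml. The config path is a hypothetical location chosen for illustration, and the helper name cassandra_command is made up here, not part of Cassandra. It assumes the cassandra launcher script passes -D options through to the JVM and that cassandra.config is expected to be a file: URI.

```python
from pathlib import Path


def cassandra_command(config_path):
    """Build an argv list that starts Cassandra with a custom config file."""
    # Convert the absolute path into a file:/// URI for cassandra.config.
    config_uri = Path(config_path).expanduser().resolve().as_uri()
    # -f keeps Cassandra in the foreground; -D sets a JVM system property.
    return ["cassandra", "-f", f"-Dcassandra.config={config_uri}"]


# Hypothetical location of a writable copy of cassandra.yaml.
cmd = cassandra_command("/usr/pic1/kearanky/cassandra/conf/cassandra.yaml")
print(" ".join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run if you prefer to launch it from Python.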

Related

Error while trying to cleanup Cassandra node

I am facing an issue when launching the cleanup command with nodetool.
Cleanup worked fine until now. I didn't find any modification to my configuration, and I have no clue what could have changed.
nodetool > cleanup
error: Expecting URI in variable: [cassandra.config]. Found[cassandra.yaml]. Please prefix the file with [file:///] for local files and [file://<server>/] for remote files. If you are executing this from an external tool, it needs to set Config.setClientMode(true) to avoid loading configuration.
-- StackTrace --
org.apache.cassandra.exceptions.ConfigurationException: Expecting URI in variable: [cassandra.config]. Found[cassandra.yaml]. Please prefix the file with [file:///] for local files and [file://<server>/] for remote files. If you are executing this from an external tool, it needs to set Config.setClientMode(true) to avoid loading configuration.
at org.apache.cassandra.config.YamlConfigurationLoader.getStorageConfigURL(YamlConfigurationLoader.java:80)
at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:100)
at org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:262)
at org.apache.cassandra.config.DatabaseDescriptor.toolInitialization(DatabaseDescriptor.java:180)
at org.apache.cassandra.config.DatabaseDescriptor.toolInitialization(DatabaseDescriptor.java:151)
at org.apache.cassandra.tools.NodeProbe.checkJobs(NodeProbe.java:281)
at org.apache.cassandra.tools.NodeProbe.forceKeyspaceCleanup(NodeProbe.java:288)
at org.apache.cassandra.tools.nodetool.Cleanup.execute(Cleanup.java:55)
at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:255)
at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:169)
Any ideas?
Regards,
Nicolas
Nodetool uses cassandra.yaml to find the number of concurrent compactors. Since you have cassandra.config set, it tries to load the cassandra.yaml named there, but cassandra.config has an invalid value (a bare file name rather than a URI), so nodetool is choking on it.
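To illustrate what the error message is asking for, here is a minimal sketch that turns a bare file name into a file: URI and leaves values that already carry a scheme untouched. The function name as_config_uri is hypothetical, and the check is a simplification: it only looks for a URI scheme and does not verify that the file exists.

```python
from pathlib import Path
from urllib.parse import urlparse


def as_config_uri(value):
    """Return value unchanged if it has a URI scheme, else a file: URI."""
    if urlparse(value).scheme:
        # Already looks like file:///... or another URI; keep it as-is.
        return value
    # A bare name such as "cassandra.yaml" is resolved against the
    # current working directory and converted to file:///...
    return Path(value).resolve().as_uri()


print(as_config_uri("cassandra.yaml"))
print(as_config_uri("file:///etc/cassandra/cassandra.yaml"))
```

Note that on Windows a path like C:\conf\cassandra.yaml would be mistaken for a URI with scheme "c" by this simple check, so the sketch is only meant for POSIX-style paths.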
I did find a solution that works for me (but I think it's specific to my prod environment).
Somewhere, something must have been using the /etc/alternatives/cassandra symlink, which was pointing to my empty default Cassandra configuration.
After I corrected it:
[root@myhost ~]# ll /etc/alternatives/
total 116
lrwxrwxrwx 1 root root 30 Oct 2 12:40 cassandra -> /etc/cassandra/my-cassandra-instance
That done, I was able to run cleanup on my Cassandra clusters.
I didn't find time to look into this case any further.
Anyway, thanks for the clues; it was a context issue.
Nicolas

Spark cluster with Jupyter Notebook tmp directory settings

I have a problem (rather a requirement) that all temporary files are written to a specific directory.
I currently set:
spark.local.dir /path/to/my/other/tmp/directory
spark.eventLog.dir /path/to/foo/bar
This kinda works, but I still get some files in the default /tmp folder.
Some <some hash>_resources, a folder called hive, a lot of liblz4-java<some hash>.so and snappy-<version>-<hash>-libsnappyjava.so files.
I would like to set the path for these temporary files. Any ideas? And what would be the best practice for something like this?

Starting Cassandra on the foreground

If I start the Cassandra service everything is OK, but when I try to start Cassandra in the foreground using "cassandra -f" I get the following error:
Error: Could not find or load main class
Files\DataStax-DDC\apache-cassandra.logs.gc.log
Do I need to configure anything in particular to run Cassandra in the foreground?
It looks like the space in your "Program Files" directory is not escaped in your CASSANDRA_HOME environment variable, so everything after the space is treated as a separate argument. The variable gets set in your cassandra-env.ps1 config file (in conf/); you could set it manually there.

How to change Flink's log directory

I understand that Flink uses log4j to manage logging, so I changed the log settings in log4j.properties, where I set the output location. However, when I start the job master, the log location it reports is not the one I configured. So how can I change Flink's log location cleanly?
The default log directory is set in bin/config.sh. Look for FLINK_LOG_DIR. You can just update the script to change the default log directory.
Add the following line in flink-conf.yaml that can be found in conf directory of Flink installation:
env.log.dir: /var/log/flink
Where /var/log/flink is the directory you want to use for logs.
Note that Flink does not seem to support full YML syntax, so
env:
  log:
    dir: /var/log/flink
will not work!
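If your settings already live in a nested structure, a small helper can flatten them into the dotted keys described above. This is only an illustrative sketch: the flatten function is a hypothetical helper, not part of Flink, and it assumes every nested level is a plain dict.

```python
def flatten(settings, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in settings.items():
        dotted = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the dotted prefix.
            flat.update(flatten(value, dotted))
        else:
            flat[dotted] = value
    return flat


nested = {"env": {"log": {"dir": "/var/log/flink"}}}
for key, value in flatten(nested).items():
    print(f"{key}: {value}")  # prints "env.log.dir: /var/log/flink"
```

Each printed line has the flat "key: value" form that can be added to flink-conf.yaml.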
Since 1.0.3 you can set env.log.dir to change the directory where the logs are saved.

Cannot create KeySpace using Cassandra-CLI

I have installed and started up a Cassandra server (1.0.8). I can connect to the server using the CLI application, but as soon as I try to create a keyspace "twissandra" following the steps from CassandraCLI
I end up getting the following error
I can see the cassandra.yaml file in "config" directory of the installation.
EDIT - THE ANSWER
OK, so after a few days of short chats with libjack, the error was tracked down.
REM set CLASSPATH="%CASSANDRA_HOME%\conf"
The line above is a remark (a comment, if you will), so the CLASSPATH was never set. I had to go through the whole BAT file line by line before finally removing the REM prefix.
The cassandra.yaml is expected to be found in the classpath. By default, the cassandra.bat (my version from 1.07 zip) adds the $CASSANDRA_HOME\conf directory to the classpath (as well as other necessary Jars)
If CASSANDRA_HOME is not set, it uses the directory above the script location.
To test, perhaps modify the cassandra.bat to echo out all the commands and see where things get messed up.

Resources