cassandra copy [Errno 13] Permission denied - cassandra

Cassandra newbie here. I have just set up a proof-of-concept single-node machine on Red Hat Linux. I finally got all of the permissions correct and started up the node. I then created a keyspace called harvard, issued the USE command to switch into harvard, and then created a table called hmxpc.
I then wanted to import a .csv file. I placed the .csv file in the cassandra folder just for simplicity, ran chmod 755 on the file, and issued the following:
copy hmxpc (course_id, userid_di, certified, explored, final_cc_cname_di, gender, grade, incomplete_flag, last_event_di, loe_di, nchapters, ndays_act, nevents, nforum_posts, nplay_video, registered, roles, start_time_di, viewed, yob) from 'cassandra/HMXPC.csv' with header=true;
When I run it, I get the following error:
[Errno 13] Permission denied: 'import_harvard_hmxpc.err'
What am I doing wrong?

I just had the same issue and figured it out by using the --debug flag.
My floats had ',' instead of '.', so my CSV couldn't be parsed. cqlsh tried to write an .err file describing the issue, but I was in /root, which Cassandra can't write to. So I cd'ed to /tmp and ran the same command; this time I got errors showing that my floats couldn't be parsed.
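As a minimal sketch of that workflow (assuming the harvard keyspace and hmxpc table from the question, and that the CSV has been copied to /tmp), switch to a directory cqlsh can write to and enable debugging:
cd /tmp        # a directory the cqlsh process can write its .err file to
cqlsh --debug
cqlsh> USE harvard;
cqlsh:harvard> COPY hmxpc FROM '/tmp/HMXPC.csv' WITH HEADER = true;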

The problem ended up being a Red Hat permissions issue and had nothing to do with Cassandra. Thanks for looking.

I was getting the same error, as shown in screenshot_Errored. I moved the .csv file to the .cassandra directory and was able to execute the same cqlsh command, as shown in screenshot_worked.

Beyond the cases described in the other answers, the error may also appear, as shown below, if the COPY command specifies an incorrect ordering or number of columns.
For example, consider a CSV file with the following header line:
actor, added date, video id, character name, description, encoding, tags, title, user id
If I use the following COPY command:
cqlsh:killrvideo> COPY videos_by_actor(actor, added_date, character_name, description, encoding, tags, title, user_id, video_id) FROM 'videos_by_actor.csv' WITH HEADER = true;
I will get the Error 13:
Using 7 child processes
Starting copy of killrvideo.videos_by_actor with columns [actor, added_date, character_name, description, encoding, tags, title, user_id, video_id].
[Errno 13] Permission denied: 'import_killrvideo_videos_by_actor.err'
If I set the names of the columns correctly in the COPY command, as follows:
cqlsh:killrvideo>COPY videos_by_actor(actor, added_date, video_id, character_name, description, encoding, tags, title, user_id ) FROM 'videos_by_actor.csv' WITH HEADER = true
then the command completes successfully:
Using 7 child processes
Starting copy of killrvideo.videos_by_actor with columns [actor, added_date, video_id, character_name, description, encoding, tags, title, user_id].
Processed: 81659 rows; Rate: 5149 rows/s; Avg. rate: 2520 rows/s
81659 rows imported from 1 files in 32.399 seconds (0 skipped).

Here's my checklist for this rather non-specific (catch-all) cqlsh error "[Errno 13] Permission denied" in containerized use cases (e.g. when using bitnami/cassandra:latest):
Make sure the path you are supplying to the COPY command is the internal (container) path, not an external one (host, PVC, etc.).
Make sure the CSV file has correct read permissions for the internal container user ID, not the external one (host, PVC, etc.), especially if the CSV was created by another containerized app (e.g. a Jupyter Notebook).
If the file contains a header, make sure your COPY command ends in WITH HEADER = true; (yes, omitting it also raises the permission denied error...)
For example, assuming you have run your Cassandra container like this:
$ docker run -d --rm --name cassandra -v /tmp:/bitnami -u 1001 bitnami/cassandra:4.0
Then the COPY command issued in cqlsh to import a /tmp/random_data1.csv from the host should be:
> COPY dicts.dict1 (key, value) FROM '/bitnami/random_data1.csv' WITH HEADER = true;
and the /tmp/random_data1.csv file should be owned by user 1001 or accessible for reading for all users.
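One way to arrange that (a sketch; adjust the path and UID to your setup) is to fix ownership or permissions on the host before running the COPY:
$ sudo chown 1001 /tmp/random_data1.csv     # match the container user from the docker run above
$ sudo chmod a+r /tmp/random_data1.csv      # or simply make the file world-readable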

The most bizarre reason for this error is lack of write access... to the errors file (the path to which is left empty in the default config file). This is particularly likely if running Cassandra (container) as a non-root user.
To solve it, one needs to pass a custom config file (e.g. /bitnami/cqlshrc) when executing the client:
$ cqlsh -u <user> -p <pass> --cqlshrc=/bitnami/cqlshrc
There should be a sample config cqlshrc.sample shipped with your installation (use cd / && find | grep cqlshrc to find it).
Changes to be made in your custom config file:
# uncomment this section:
; [copy-from] -> [copy-from]
# uncomment and specify custom error file
; errfile = -> errfile = /bitnami/cassandra-errfile.txt
More info on the setting in question: errfile.
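For reference, a minimal sketch of the resulting section in the custom config file (the /bitnami path is just the one used in this example):
[copy-from]
errfile = /bitnami/cassandra-errfile.txt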

Related

Zip compress without root folders

My problem is that I have to generate a zip file using the linux zip console command. My command is as follows:
zip -r /folder1/folder2/EXP_45.zip /folder1/folder2/EXP_45/
That returns a correct zip, except that it includes the parent folders, which I don't want.
It returns:
EXP_45.zip
-folder1
--folder2
---EXP_45
...
I want
EXP_45.zip
-EXP_45
...
EXP_45 is a folder that can contain files and folders and they must be present in the zip. I just want the tree structure to start with the EXP_45 folder.
Is there any solution?
The reason why I need it to be a single command is that it is the action of a job in a PL/SQL block like:
BEGIN
DBMS_SCHEDULER.CREATE_JOB (
JOB_NAME=>'compress_files', --- job name
JOB_ACTION=>'/usr/bin/zip', --- executable file with path
JOB_TYPE=>'executable', ----- job type
NUMBER_OF_ARGUMENTS=>4, -- parameters in numbers
AUTO_DROP =>false,
CREDENTIAL_NAME=>'credentials' -- name of the credential you created beforehand
);
dbms_scheduler.set_job_argument_value('compress_files',1,'-r');
dbms_scheduler.set_job_argument_value('compress_files',2,'-m');
dbms_scheduler.set_job_argument_value('compress_files',3,'/folder1/folder2/EXP_45.zip');
dbms_scheduler.set_job_argument_value('compress_files',4,'/folder1/folder2/EXP_45/');
DBMS_SCHEDULER.RUN_JOB('compress_files');
END;
I haven't been able to find a solution to this problem using zip but I have found it using jar. The command would be:
jar cMf /folder1/folder2/EXP_45.zip -C /folder1/folder2/EXP_45 .
Also, the solution using a job in pl sql in case it works for someone would be:
BEGIN
DBMS_SCHEDULER.CREATE_JOB (
JOB_NAME=>'compress_files', --- job name
JOB_ACTION=>'/usr/bin/jar', --- executable file with path
JOB_TYPE=>'executable', ----- job type
NUMBER_OF_ARGUMENTS=>5, -- parameters in numbers
AUTO_DROP =>false,
CREDENTIAL_NAME=>'credentials' -- name of the credential you created beforehand
);
dbms_scheduler.set_job_argument_value('compress_files',1,'cMf');
dbms_scheduler.set_job_argument_value('compress_files',2,'/folder1/folder2/EXP_45.zip');
dbms_scheduler.set_job_argument_value('compress_files',3,'-C');
dbms_scheduler.set_job_argument_value('compress_files',4,'/folder1/folder2/EXP_45');
dbms_scheduler.set_job_argument_value('compress_files',5,'.');
DBMS_SCHEDULER.RUN_JOB('compress_files');
END;
You want to use the -j (or --junk-paths) option when you are creating the zip file. Below is from the zip man page.
-j
--junk-paths
Store just the name of a saved file (junk the path), and do not store directory names.
By default, zip will store the full path (relative to the current directory).
Update following Question Clarification
Why not put the equivalent of the code below in a shell script and get the SQL function to invoke that? You just need to pass the directory name to cd into and the name of the output zip.
cd /folder1/folder2
zip -r /tmp/EXP_45.zip EXP_45
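A sketch of such a wrapper (the script name, location, and positional parameters are assumptions; adapt as needed):
#!/bin/sh
# /usr/local/bin/zip_dir.sh  <directory-to-zip>  <output-zip>
# cd into the parent of the target directory so the archive
# contains only the directory itself, not its full path.
set -e
dir=$1
out=$2
cd "$(dirname "$dir")"
zip -r "$out" "$(basename "$dir")"
The DBMS_SCHEDULER job would then call /usr/local/bin/zip_dir.sh with '/folder1/folder2/EXP_45' and '/folder1/folder2/EXP_45.zip' as its two arguments instead of calling /usr/bin/zip directly.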

How do I use Nagios to monitor a log file that generates a random ID

This is the log file that I want to monitor:
/test/James-2018-11-16_15215125111115-16.15.41.111-appserver0.log
I want Nagios to read this log file so I can monitor it for a specific string.
The issue is with 15215125111115: this is the random ID that gets generated as part of the filename.
Here is the part of my script where Nagios checks for the logfile path:
Variables:
HOSTNAMEIP=$(/bin/hostname -i)
DATE=$(date +%F)
..
CHECK=$(/usr/lib64/nagios/plugins/check_logfiles/check_logfiles
--tag='failorder' --logfile=/test/james-${date +"%F"}_-${HOSTNAMEIP}-appserver0.log
....
I am getting the following output in nagios:
could not find logfile /test/James-2018-11-16_-16.15.41.111-appserver0.log
This number, 15215125111115, is always generated randomly, and I don't know how to get Nagios to match it. Is there a way to add a variable or wildcard for this? I tried adding an asterisk "*" but that didn't work.
Any ideas would be much appreciated.
--tag failorder --type rotating::uniform --logfile /test/dummy \
--rotation "James-${DATE}_\d+-${HOSTNAMEIP}-appserver0.log"
If you add a "-v" you can see what happens inside. Type rotating::uniform tells check_logfiles that the rotation scheme makes no difference between current log and rotated archives regarding the filename. (You frequently find something like xyz..log). What check_logfile does is to look into the directory where the logfiles are supposed to be. From /test/dummy it only uses the directory part. Then it takes all the files inside /test and compares the filenames with the --rotation argument. Those files which match are sorted by modification time. So check_logfiles knows which of the files in question was updated recently and the newest is considered to be the current logfile. And inside this file check_logfiles searches the criticalpattern.
Gerhard

zip command not working

I am trying to zip files using a shell script. I am using the following command:
zip ./test/step1.zip $FILES
where $FILES contains all the input files. But I am getting a warning as follows:
zip warning: name not matched: myfile.dat
One more thing I observed: the file that is last in the list of files gets the above warning and is not added to the zip.
Can anyone explain why this is happening? I am new to the shell scripting world.
zip warning: name not matched: myfile.dat
This means the file myfile.dat does not exist.
You will get the same error if the file is a symlink pointing to a non-existent file.
As you say, whatever file is last in $FILES, it will not be added to the zip, along with the warning. So I think something's wrong with the way you create $FILES. Chances are there is a newline, carriage return, space, tab, or other invisible character at the end of the last filename, resulting in a name that doesn't exist. Try this for example:
for f in $FILES; do echo :$f:; done
I bet the last line will be incorrect, for example:
:myfile.dat :
...or something like that instead of :myfile.dat: with no characters before the last :
UPDATE
If you say the script started working after running dos2unix on it, that confirms what everybody suspected already: somehow there was a carriage return at the end of your $FILES list.
od -c will show the \r carriage return; try echo "$FILES" | od -c
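If fixing the source of the list isn't practical, a quick workaround (a sketch, assuming the stray characters really are carriage returns) is to strip them before calling zip:
FILES=$(printf '%s' "$FILES" | tr -d '\r')   # drop CR characters left over from DOS line endings
zip ./test/step1.zip $FILES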
Another possible cause of the zip warning: name not matched: error is having one of zip's environment variables set incorrectly.
From the man page:
ENVIRONMENT
The following environment variables are read and used by zip as described.
ZIPOPT
contains default options that will be used when running zip. The contents of this environment variable will get added to the command line just after the zip command.
ZIP
[Not on RISC OS and VMS] see ZIPOPT
Zip$Options
[RISC OS] see ZIPOPT
Zip$Exts
[RISC OS] contains extensions separated by a : that will cause native filenames with one of the specified extensions to be added to the zip file with basename and extension swapped.
ZIP_OPTS
[VMS] see ZIPOPT
In my case, I was using zip in a script and had the binary location in an environment variable ZIP so that we could change to a different zip binary easily without making tonnes of changes in the script.
Example:
ZIP=/usr/bin/zip
...
${ZIP} -r folder.zip folder
This is then processed as:
/usr/bin/zip /usr/bin/zip -r folder.zip folder
And generates the errors:
zip warning: name not matched: folder.zip
zip I/O error: Operation not permitted
zip error: Could not create output file (/usr/bin/zip.zip)
The first because it's now trying to add folder.zip to the archive instead of using it as the archive. The second and third because it's trying to use the file /usr/bin/zip.zip as the archive which is (fortunately) not writable by a normal user.
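A simple fix (a sketch; the variable name is arbitrary) is to use a name that zip does not treat specially:
ZIP_BIN=/usr/bin/zip          # anything other than ZIP, ZIPOPT, ZIP_OPTS, ...
"${ZIP_BIN}" -r folder.zip folder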
Note: This is a really old question, but I didn't find this answer anywhere, so I'm posting it to help future searchers (my future self included).
eebbesen hit the nail on the head in his comment for my case (but I cannot vote on comments).
Another possible reason, missed in the other answers, is a file exceeding the file size limit (4 GB).
I converted my script to the Unix format using the dos2unix command and executed it as ./myscript.sh instead of bash myscript.sh.
I just discovered another potential cause for this. If the permissions of the directory/subdirectory don't allow zip to find the file, it will report this error. In fact, if you run chmod -R 444 on the directory and then try to zip it, you will reproduce this error, along with a "stored 0%" report, like this:
zip warning: name not matched: borrar/enviar
adding: borrar/ (stored 0%)
Hence, try changing the permissions of the file. If you are trying to send files by email and the email filters (like Gmail's) impose silly rules such as rejecting executables, don't forget that making permissions very strict before zipping can be the cause of the "name not matched" error you are seeing.
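A sketch of the fix (borrar is just the directory from the example output above): restore read and traverse permissions before zipping:
chmod -R u+rX borrar          # r for files, X to re-enable entering directories
zip -r borrar.zip borrar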
Filenames containing spaces will also cause this: unquoted expansion splits them into separate, non-existent names.
It can also fail if there is more than one file in $FILES, unless you quote the names or add them in a loop, as sketched below.
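A minimal sketch of that approach (the file names are hypothetical), using a bash array so that names with spaces stay intact:
files=("my file.dat" "other.dat")
for f in "${files[@]}"; do
  zip ./test/step1.zip "$f"      # quoting keeps spaces inside a single name
done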
I also encountered this issue. In my case, the line separator in my zip shell script was CRLF, which caused the problem. Using LF fixed it.

Need help - Getting an error: xrealloc: subst.c:4072: cannot reallocate 1073741824 bytes (0 bytes allocated)

Checking if anybody else has had a similar issue.
Code in the shell script:
## Convert file into Unix format first.
## THIS is IMPORTANT.
#####################
dos2unix "${file}" "${file}";
#####################
## Actual DB Change
db_change_run_op="$(ssh -qn ${db_ssh_user}@${dbserver} "sqlplus $dbuser/${pswd}@${dbname} <<ENDSQL
@${file}
ENDSQL
")";
Summary:
1. From a shell script (on a SunOS source server) I'm running a sqlplus session via ssh on a target machine to run a .sql script.
2. Output of this target ssh session (running sqlplus) is getting stored in a variable within the shell script. Variable name: db_change_run_op (as shown above in the code snapshot).
3. Most of the .sql scripts (whose path the variable "${file}" holds) run fine through the shell script, and it returns me the output of the .sql file (run on the target server via ssh from the source server), provided the .sql file contains only commands that don't take much time to complete and generate a reasonable amount of output log lines.
For example, let's assume the .sql I want to run does the following; then it runs fine:
select * from database123;
update table....
alter table..
insert ....
...some procedure .... which doesn't take much time to create....
...some more sql commands which complete..within few minutes to an hour....
4. Now, the issue I'm facing is:
Let's assume I have a .sql file where a single select command on a table returns a couple of hundred thousand up to 1-5 million lines, i.e.
select * from database321;
assume the above produces the condition described in point 4.
In this case, I'm getting the following error message thrown by the shell script (running on the source server).
Error:
./db_change_load.sh: xrealloc: subst.c:4072: cannot reallocate 1073741824 bytes (0 bytes allocated)
My questions:
1. Did the .sql script complete? I assume yes. But how can I get the output LOG file of the .sql run, generated directly on the target server? If this can be done, then I won't need the variable to hold the output of the whole ssh sqlplus session and then create a log file on the source server via echo "${db_change_run_op}" > sql.${file}.log.
2. I assume the error occurs because the output (i.e. the number of lines) generated by the ssh session, i.e. by sqlplus, is so big that it doesn't fit within a Unix/Linux bash variable's limit, hence the xrealloc error.
Please advise on the above 2 questions if you have any experience, or suggest how I can solve this.
I'll try using "| tee /path/on.target.ssh.server/sql.${file}.log" right after <<ENDSQL, or after the final closing ENDSQL (the heredoc keyword); wondering whether that would work or not.
OK, got it working. No more storing the output in a variable and then echoing $var to a file.
Luckily, I had the same mount point on both the source and target servers, i.e. if I go to /scm on the source and on the target, the mount (df -kvh .) shows the same Share/NAS mount value.
Filesystem size used avail capacity Mounted on
ServerNAS02:/vol/vol1/scm 700G 560G 140G 81% /scm
Now, instead of using a variable to store the whole output of the ssh session calling sqlplus, all I did was create a file on the remote server using the following code.
## Actual DB Change
#db_change_run_op="$(ssh -qn ${pdt_usshu_dbs}@${dbs} "sqlplus $dbu/${pswd}@$dbn <<ENDSQL | tee "${sql_run_output_file}".ssh.log
#set echo off
#set echo on
#set timing on
#set time on
#set serveroutput on size unlimited
#@${file}
#ENDSQL
#")";
ssh -qn ${pdt_usshu_dbs}@${dbs} "sqlplus $dbu/${pswd}@$dbn <<ENDSQL | tee "${sql_run_output_file}".ssh.log
set echo off
set echo on
set timing on
set time on
set serveroutput on size 1000000
@${file}
ENDSQL
"
It seems "unlimited" doesn't work in 11g, so I had to use the 1000000 value (these small SQL*Plus settings show each command with its output, the clock time for each output line, etc.).
But basically, in the above code, I'm calling the ssh command directly, without the variable="$(...)" construct, and piping the sqlplus output through tee right after the <<ENDSQL heredoc.
Even if I didn't have the same mount, I could have tee'd the output to a file on a remote server path (not reachable from the source server); at least I can then see how far the .sql command got or what output it generated, since the output now goes directly to a file on the remote server, and Unix/Linux doesn't care much about the file size as long as there is space left.
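A sketch of that fallback (the host, user, script name, and log path are placeholders): tee to a remote file and copy it back afterwards instead of holding the output in a shell variable:
ssh -qn dbuser@dbserver "sqlplus user/pass@dbname <<ENDSQL | tee /tmp/sql_run.ssh.log
@myscript.sql
ENDSQL
"
scp dbuser@dbserver:/tmp/sql_run.ssh.log ./sql_run.ssh.log   # bring the log back to the source server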

D2RQ parameters for generate-mapping

We are currently working on a project involving an "ordinary" relational database, but we wish to enable SPARQL requests towards this database.
d2rq.org is a tool that enables SPARQL to be run towards the database with the help of a .ttl file which defines the database to RDF mapping.
This .ttl file can be built automatically with a D2RQ tool named "generate-mapping".
http://d2rq.org/generate-mapping takes quite a few arguments, some preceded with a single dash "-" and some with a double dash "--". My challenge is that any argument preceded with a double dash generates this error:
Command:
./generate-mapping -u root -p password -o testmappingLocal.ttl --verbose jdbc:mysql:///iswc
Result:
Exception in thread "main" java.lang.IllegalArgumentException: Unknown argument: --verbose
at jena.cmdline.CommandLine.handleUnrecognizedArg(CommandLine.java:215)
at jena.cmdline.CommandLine.process(CommandLine.java:177)
at d2rq.generate_mapping.main(generate_mapping.java:41)
Any help with the double-dash arguments will be greatly appreciated.
OS: Ubuntu Linux, D2RQ version: 0.8
Using D2RQ with a MySQL database: generate the mapping file and then the RDF file.
1) Generate the mapping file:
./generate-mapping -u root -p root -o /home/bigtapp/Documents/d2rqgenerate_mapping/mapfile.ttl jdbc:mysql://localhost:3306/d2rq
Notes: 1. root -p root -> the MySQL username & password.
2. /home/bigtapp/Documents/d2rqgenerate_mapping/mapfile.ttl -> the path the output file is saved to.
3. jdbc:mysql://localhost:3306 -> the MySQL JDBC URL.
4. /d2rq -> the database name.
2) Create the RDF dump using the mapping file with the following command.
The -f option sets the RDF syntax to use for output. Supported syntaxes are "TURTLE", "RDF/XML", "RDF/XML-ABBREV", "N3", and "N-TRIPLE" (the default); "N-TRIPLE" works best for large databases.
Command:
./dump-rdf -f RDF/XML -b localhost:3306 -o /home/bigtapp/Documents/d2rqgenerate_mapping/dumpfile.rdf /home/bigtapp/Documents/d2rqgenerate_mapping/mapfile.ttl
Then, in apache-jena-fuseki, create a dataset, upload the RDF file to the server, and run your SPARQL query to get the result.
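As an illustration only (the port and dataset name are assumptions; adjust them to your Fuseki setup), once the RDF is loaded into a Fuseki dataset you can query it over HTTP:
curl -G http://localhost:3030/d2rq/sparql \
  --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 10'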
