Hadoop/Hive commands not working from Crontab - linux

I am not able to execute hadoop/hive commands from Crontab. Basically i have scheduled a perl script in crontab which contains system commands which are setting PATH before my operations.
I am aware that, the env of running from cron could be different from your regular shell. That's the reason i am setting paths like below. IS there any other way to make it work?
system(". /home/ciber/.bash_profile");
system("export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64");
system("export HADOOP_INSTALL=~/poc/install/hadoop-1.0.3");
system("export PATH=$PATH:$HADOOP_INSTALL/bin");
system("export HADOOP_HOME=$HADOOP_INSTALL");
system("export HIVE_INSTALL=~/poc/install/hive-0.9.0");
system("export PATH=$PATH:$HIVE_INSTALL/bin");
#Jingguo Yao: Do you have any idea abt this?

You can absolute path of commands in crontab . Also, you can declare env variable in crontab simply . for example
foo=bar

If it is executing properly from terminal and not in crontab, loading user bash profile in the script should do the work, as below
. ~/.bash_profile
or
. /home/<user>/.bash_profile
Usually #!/bin/bash is included in bash_profile and it will have user specific configurations as well.

Related

crontab is returning error

I'm trying to run some crawler with Linux crontab.
This should go to the Python environment with
pyenv shell jake-crawler
Here is my crontab -e
*/10 * * * * /home/ammt/apps/crawler/scripts/bat_start.sh
This will run every 10 minutes. This command line works fine when I type
(jake-crawler) [jake#KIBA_OM crawler]$ /home/jake/apps/crawler/scripts/bat_start.sh
[DEBUG|run.py:30] 2017-09-24 19:55:49,980 > BATCH_SN:1, COLL_SN:1, 1955 equal 0908 = False
Inside of bat_start.sh I have init.sh which changes the environment to Python.
Here is my init.sh
#!/usr/bin/env bash
export PATH="${HOME}/.pyenv/scripts:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
pyenv shell jake-crawler
This has no problem when I personally run it from command line. But when cron run it by itself, it cannot find the pyenv command.
I think that you can specify which user should run that script in cron configuration file.
So, if that script is working with your user, then define it in your cron configuration filr.
See this answer for example... https://stackoverflow.com/a/8475757/3827004.
There are two things that diferentiate when you launch an application from the terminal, and when you do from a crontab file:
the environment is not the same, at least if you don't execute your .profile script from your cron job.
You don't have access to a terminal. Cron jobs don't use a terminal, so you will not be able, for example to open /dev/tty. You will have to be very careful on how redirections are handled, as you have them all directed to your tty when running on an interactive session, but all of them will be redirected possibly to a pipe, when run from cron(8).
This makes your environment quite different and is normally a source of errors. Read crontab(1) man page for details.

source ~/.bashrc from bash script does not work

I am trying to create a script to be reload bashrc once but it did not work.
reloader.sh
#!bin/bash
source ~/.bashrc
rm reloader.sh
I had the same problem. The issue is that only interactive shells can access whatever you have defined in your .bashrc(Aliases and so on)
To make your shell-script interactive use a shebang with parameter:
#!/bin/bash -i
You need to use source to run the script:
source reloader.sh
If you just run it as a command, it will run in a new process, so none of the changes that .bashrc makes will affect your original shell process.

Script runs from terminal, but not cron. What edits to this script do I need to make?

I have a script used for zipping a database and site files, then dumps the output into a backup folder on the server. The script runs fine from the command line, but it will not work through cron.
After much research, I am thinking that cron cannot run it in its current form because it runs in a different environment.
Here is the script, saved as file_name.sh
#!/bin/bash
NOW=$(date +"%Y-%m-%d-%H%M")
FILE="website.com.$NOW.tar"
BACKUP_DIR="/backupfolder"
WWW_DIR="/var/www/website/"
DB_USER="dbuser"
DB_PASS="dbpw"
DB_NAME="dbname"
DB_FILE="website.com.$NOW.sql"
WWW_TRANSFORM='s,^var/www/website,www,'
DB_TRANSFORM='s,^backupfolder,database,'
tar -cvf $BACKUP_DIR/$FILE --transform $WWW_TRANSFORM $WWW_DIR
mysqldump -u$DB_USER -p$DB_PASS $DB_NAME > $BACKUP_DIR/$DB_FILE
tar --append --file=$BACKUP_DIR/$FILE --transform $DB_TRANSFORM $BACKUP_DIR/$DB_FILE
rm $BACKUP_DIR/$DB_FILE
gzip -9 $BACKUP_DIR/$FILE
I currently have the script stored in /usr/local/scripts/
Is there something wrong with the above code that does not allow it to run through cron?
Which crontab should it go in? crontab -e from terminal, or /etc/crontab? They are two different files.
Several things come to mind: first, one of the most common problems with cron jobs is that generally crond runs things with a very minimal PATH (usually just /usr/bin:/bin), so if the script uses any commands from some other binaries directory, it'll fail. Where is mysqldump on your system (run which mysqldump if you aren't sure)? If this is the problem, adding PATH=/usr/local/bin:/usr/bin:/bin (or whatever's appropriate in your case) at the beginning of your script should fix it. Alternately, you can set PATH in the crontab file (put this line before the entry that runs your script).
If that's not the problem, my next step would be to capture the script's output, with something like:
1 1 * * * /usr/local/scripts/file_name.sh >/tmp/file_name.log 2>&1
... and see if the output is informative. BTW, as #tripleee mentioned, the format of your cron entry is suitable for the files crontab -e edits, but not for /etc/crontab. The /etc version has an additional field specifying which user to run the job as, e.g.
1 1 * * * eric /usr/local/scripts/file_name.sh >/tmp/file_name.log 2>&1
Best practice is to always use crontab -e (the resultant files are usually in /var/spool/cron/) and this works on every unix and linux platform I ever worked on.
Other common issues with cron execution are missing environment variables. Any environment variables set in .bash_profile (or .profile if you use korn shell) will not necessarily be present in the cron environment. This can be overcome by including them in your script.
As Gordon said, paths are another suspect. You can always full path you executables in your script (eg /bin/mysqldump). Some of the more cynical of us do this anyway to make sure we are executing what we intended as apposed to some other file of the same name in the current path.
I can only guess at your specific problem since you fixed it by creating /scripts, that perhaps the permissions on /usr/local/scripts directory did not allow execution by the cron user?
I have had to remove the extension (.sh) for cron to run in some instances.
So I fixed it. Not sure what the problem was, but this worked for me.
I originally had the scripts located in /usr/local/scripts/
I created a new directory here - /scripts/ and moved the scripts there. The new crontab -e command looked like this:
1 1 * * * bash /scripts/file_name.sh
Works perfectly. Again, I am not sure what the issue was before, but it works now.

Setting of Environmental Variables which persist in Linux

I have added the following:
export SQOOP_HOME=/usr/bin/
to my /etc/profile file. However when I run an install.sh script it keeps saying the environment variable is not set. I have also added similar lines to the bash_profile.
Any ideas what I could be doing wrong?
When running a shell script, it runs (by default) non-login and non-interactive--see my answer to another question on Unix.SE for a rundown of when and where bash looks for config files. You will probably want to add the -l option to the shebang line to make it a login shell.
You need to do a login before you can see the changes in /etc/profile. Try:
bash -l
for example.

Cron does not run from /root

If I run a script from /home/<user>/<dir>/script.sh, as root, the cron works pretty well. But If I run the script from /root/<dir>/script.sh (as root, again), the cron does not seem to work.
Having run afoul of various default $PATHs in the past when using 'cron', I always spell in full the absolute $PATH for each executable file and each target file. I always assume that 'cron' has NO $PATH set and has NO current-working-directory.
In other words don't use a command like
"myprocess abc*.txt"
but do it in full like
"/usr/localbin/myprocess /home/jvs/abc*.txt".
Alternatively, create a bash script which does the job, and call that bash script with a full absolute path, such as
"/usr/local/bin/myprocess_abc_txts".
If you need to have some flexibility in the script, use environment variables which are set specifically within the bash script you call with 'cron'.
I think you need to add a little more information. I'd guess it is a permissions thing though. Add the permissions of the file, the directories, and the line in your crontab so we can help. Also, if you are putting this in /root, are you running this in root's crontab?
Remember the environment - especially when run by cron rather than by root. When cron runs something, you probably don't have anything much set of your environment, unlike when you run a command via at. It is also not clear what your current directory will be. So, for commands that will be run by cron, use a script (as you're already doing) and make sure it sets enough of the environment for it to run. And make sure your environment setting code is not interactive!
On my machines, I have a mechanism such that the cron entry reads (for example):
23 1 * * 1-5 /usr/bin/ksh /work1/jleffler/bin/Cron/weekday
The weekday script in the Cron directory is a link to a standard script that first sets the environment and then runs the command /work1/jleffler/bin/weekday (in this case - it uses the name of the command to determine what to run).
The actual script in the Cron directory is:
: "$Id: runcron.sh,v 2.1 2001/02/27 00:53:22 jleffler Exp $"
#
# Commands to be performed by Cron (no debugging options)
# Set environment -- not done by cron (usually switches HOME)
. $HOME/.cronfile
base=`basename $0`
cmd=${REAL_HOME:-/real/home}/bin/$base
if [ ! -x $cmd ]
then cmd=${HOME}/bin/$base
fi
exec $cmd ${#:+"$#"}
I've been using it a while now - this version since 2001 - and it works a treat for me. I'm using a basic (Sun Solaris 10) implementation of cron; there may be new features in new versions of cron on other platforms to make some of this unnecessary. (The $REAL_HOME stuff is a weirdness of mine; pretend it says $HOME - though that makes some of the script unnecessary for you.) The .cronfile is responsible for the environment setting - it does quite a lot, but that's my problem, not yours.
It could be because you're looking for relative directories/files in the script which are located when running it from /home/ but not from /root, because /root is not in /home/root nor would it look like a users homefolder in /home/
Can you check and see if it is looking for relative files, or post the script?
On another note, why don't you just set it to run from a user's homefolder then?
Another way to run sh script is place your bash script in /usr/bin directory and simply run command bash yourscript.sh without adding /usr/bin/ directory

Resources