Linux cronjob wget to file instead of STDOUT - linux

I have a cronjob that is specified like this:
0 * * * * root bash /data/daily.sh
Inside daily.sh is this call: /data/get.sh https://www.xxxxxxx.com/ccc/ 0
As you can see, get.sh takes two arguments: the URL and the recursion depth. The script calls get.sh again with an incremented depth counter and a different URL, which is scraped from the result of the first run, and stops once it reaches a certain depth.
Inside get.sh, I am scraping a website with this command:
wget -O- $1 > main.htm
The problem is that main.htm is not created when this script is run via crontab. The log says the output is saved to 'STDOUT', whereas when I run the script manually it is saved to 'main.htm'. How can I solve this?

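Write the output directly to a file with -O, and give the file an absolute path. A relative name such as main.htm is resolved against the working directory cron starts the job in (typically the user's home directory), not the directory you run the script from by hand.
wget -O {output-filename} "$1"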
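As a minimal sketch of the idea above, get.sh could build an absolute output path once and pass it to wget. The /data directory is taken from the question; the out_dir, out_file and fetch_page names are only illustrative assumptions.

```shell
#!/bin/bash
# Assumption: the output should live in /data, next to get.sh.
out_dir=/data
out_file="$out_dir/main.htm"

fetch_page() {
    # $1 = URL to download. Quoting keeps URLs containing '?' or '&' intact.
    # -q silences the progress log, -O names the output file explicitly.
    wget -q -O "$out_file" "$1"
}
```

Calling fetch_page "$1" inside get.sh would then write to /data/main.htm regardless of the directory cron starts the job in.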

Related

Unable to output script results with column/table formatting

Answered - previously titled 'Cron job for shell script not running'
I recently downloaded Speedtest onto my Raspberry Pi, and wrote a script to output the results in csv format to a CSV file.
I'm trying to do this regularly via a cron job, but for some reason, it won't execute the shell script as intended.
Here's the script below. I've commented/cut out a lot to try and find the issue
#!/bin/bash
# Commented out if statement detects presence of data file and creates one if it doesn't exist. Was going to adjust later to include variables/input options if I wanted to used script on alternate systems, but commented out while working on main issue.
file='/home/User/Documents/speedtestdata.csv'
# have tried this with and without quotes, does not seem to make a difference either way
#HEADERS='/usr/bin/speedtest-cli --csv-header'
SPEEDTEST='/usr/bin/speedtest-cli --csv'
# Used absolute path for the executable
#LOG=/home/User/scripts/testreclog.txt
#DATE=$( date )
# Was using the above to log steps of script running successfully with timestamp, commented out
#if [ ! -f $file ]
#then
# echo "Creating results file">>$LOG
# touch $file
# $HEADERS > $file
#fi
#echo "Running speedtest">>$LOG
$SPEEDTEST >> $file
#echo "Formatting results">>$LOG
#column -s, -t < $file
# this step was used to format the log file neatly
#echo "Time completed ",$DATE>>$LOG
And here's how the crontab currently looks
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
*/5 * * * * /bin/bash /home/User/scripts/testandrec.sh
# 2> /home/User/scripts/testrecerror.txt
# Was attempting to log errors to this file, nothing seen so commented out on a newline.
#* * * * * /home/User/scripts/testscript.sh test to verify cron works (it does)
I've added my scripts folder to the end of my path, but for some reason this only shows up when I'm using the Pi directly, when I ssh in I'm missing the scripts folder on the end.
However, given that I've used absolute path for everything I'm not sure why this would be an issue.
First I tested whether a simple Cron job would work, so I created testscript.sh, which simply returned 'Test' and a timestamp to a specific file and used the same shebang, and used the absolute paths, and functioned as intended.
I have checked systemctl for Cron, restarted Cron with sudo service cron restart and made sure a new line is in place in the crontab.
I have tried with and without /bin/bash in the cron tab entry, it seemingly hasn't made a difference.
I tried cd /home/User/scripts && ./testandrec.sh but no luck.
I changed the run time to every 5 then every 10 minutes, which has not worked.
I have noticed that when I run the script manually with column -s, -t < $file left in, the results file is formatted as intended when I cat it.
However, the next time the cron job should run, the file reverts to CSV with , as the delimiter, so clearly something is running.
To confuse matters further, I think the script may be firing once after cron restarts and then not running at the subsequent scheduled times. When I leave the column line in, this appears to just revert the formatting, but if I comment it out, it appears to run a speed test and append the results, but only once. However, I may be wrong about this and about reproducing it.
If I instead try 0 * * * * /usr/bin/speedtest-cli --csv >> /home/User/Documents/speedtestdata.csv && column -s, -t < /home/User/Documents/speedtestdata.csv, it appeared to perform/append speedtest but does not action the column command.
I would much rather neatly tie up the process in a shell script, however, rather than have the above which isn't very DRY code.
I've looked extensively, but none of the solutions I've found on this site or others have fixed the issue.
Any troubleshooting suggestions/help would be greatly appreciated.
Here you go - the solution is simple:
#!/bin/bash
# Commented out if statement detects presence of data file and creates one if it doesn't exist. Was going to adjust later to include variables/input options if I wanted to used script on alternate systems, but commented out while working on main issue.
file='/home/User/Documents/speedtestdata.csv'
# have tried this with and without quotes, does not seem to make a difference either way
#HEADERS='/usr/bin/speedtest-cli --csv-header'
SPEEDTEST='/usr/bin/speedtest-cli --csv'
# Used absolute path for the executable
#LOG=/home/User/scripts/testreclog.txt
#DATE=$( date )
# Was using the above to log steps of script running successfully with timestamp, commented out
#if [ ! -f $file ]
#then
# echo "Creating results file">>$LOG
# touch $file
# $HEADERS > $file
#fi
#echo "Running speedtest">>$LOG
$SPEEDTEST | column -s, -t >> $file
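Just check the last line ;) The speedtest output is now piped through column before it is appended to $file, so the formatted rows end up in the file itself.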
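An alternative worth considering, sketched here under assumptions: keep the raw CSV untouched and rebuild a separate aligned view from it. Under cron, the output of a bare column command goes to stdout, which cron mails to the user rather than writing back into the file. The table_file path and both function names are hypothetical and not from the original script.

```shell
#!/bin/bash
# csv_file matches the question; table_file is an assumed second file.
csv_file="${CSV_FILE:-/home/User/Documents/speedtestdata.csv}"
table_file="${TABLE_FILE:-/home/User/Documents/speedtestdata.txt}"

record_result() {
    # "$@" is the command that prints one CSV line,
    # e.g. /usr/bin/speedtest-cli --csv
    "$@" >> "$csv_file"
}

render_table() {
    # Rebuild the aligned view from the raw CSV on every run.
    column -s, -t < "$csv_file" > "$table_file"
}
```

The script body would then be record_result /usr/bin/speedtest-cli --csv followed by render_table, which leaves the CSV usable by other tools.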

Cronjob to add datestamp to file not running

Good day everyone.
I have an issue that Googling has not helped me solve. Basically, I have the following requirements:
cronjob that runs 1st script, output is written to a file
file that is created, to have a date stamp
2nd script executes, mail the generated file as an attachment
The issue is with adding the timestamp: if I set the cron job to just create a file with a generic filename, it runs fine.
I have tried the following:
0 8-17/1 * * * python /usr/local/bin/script1.py >> /usr/local/bin/file_`date +\%Y-%m-%d`.txt 2>&1 && python /usr/local/bin/email_script.py
0 8-17/1 * * * python /usr/local/bin/acme_transcoding_check.py >> /usr/local/bin/file_$(date +"%Y-%m-%d").txt 2>&1 && python /usr/local/bin/email_script.py
Server is running Ubuntu 16.04
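You need to escape every percent sign (%) in the command with a backslash, because cron treats an unescaped % as a newline. In your first attempt only the first % is escaped, and in the second none are. This is explained in more detail in another answer (not mine).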
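Another way around the escaping, shown as a sketch rather than the only fix: move the date call into a small wrapper script, since cron never parses the lines inside a script and % needs no backslash there. The stamped_name function is a hypothetical helper.

```shell
#!/bin/sh
# Inside a script, % is passed to date unchanged; only the command
# field of a crontab line gives % a special meaning.
stamped_name() {
    # $1 = directory, $2 = file name prefix
    printf '%s/%s_%s.txt\n' "$1" "$2" "$(date +%Y-%m-%d)"
}
```

If you would rather keep everything in the crontab, the same date call there needs every percent sign escaped, as in `file_$(date +\%Y-\%m-\%d).txt`.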

PHP iconv error using Zend Lucene when executing script via cron, but not on commandline

I am executing a PHP script via the command line which, for a specific user, runs fine when executed on the commandline, but when the exact same command is put into the same user's crontab, a PHP iconv error is returned.
The commandline is utilising the Yii framework and the Zend Lucene library, but I'm not sure if that's pertinent.
I've made all executable and script paths absolute in the crontab line and can verify that it works when executed directly on the commandline.
I wrapped the actual PHP invocation in a one-line shell script, as I read elsewhere here that this solved a similar problem for someone, but no joy.
The command successfully executed on the commandline is:
/bin/sh /var/www/yii-projects/projectname/protected/scripts/buildIndex.sh >> /var/lucene/lucene.log
The content of the buildIndex.sh script is:
/usr/bin/php /var/www/yii-projects/projectname/protected/scripts/cron.php lucene buildIndex
And the crontab line is:
*/10 * * * * /bin/sh /var/www/yii-projects/projectname/protected/scripts/buildIndex.sh >> /var/lucene/lucene.log
The error shown in the log when the crontab executes is:
PHP Error[8]: iconv(): Detected an illegal character in input string
in file /var/www/yii-projects/projectname/protected/vendors/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php at line 58
0 /var/www/yii-projects/projectname/protected/vendors/Zend/Search/Lucene/Analysis/Analyzer/Common/Text.php(58): iconv()
1 /var/www/yii-projects/projectname/protected/vendors/Zend/Search/Lucene/Analysis/Analyzer.php(125): Zend_Search_Lucene_Analysis_Analyzer_Common_Text_CaseInsensitive->reset()
2 /var/www/yii-projects/projectname/protected/vendors/Zend/Search/Lucene/Index/SegmentWriter/DocumentWriter.php(98): Zend_Search_Lucene_Analysis_Analyzer_Common_Text_CaseInsensitive->setInput()
3 /var/www/yii-projects/projectname/protected/vendors/Zend/Search/Lucene/Index/Writer.php(244): Zend_Search_Lucene_Index_SegmentWriter_DocumentWriter->addDocument()
4 /var/www/yii-projects/projectname/protected/vendors/Zend/Search/Lucene.php(1410): Zend_Search_Lucene_Index_Writer->addDocument()
5 /var/www/yii-projects/projectname/protected/vendors/Zend/Search/Lucene/Proxy.php(500): Zend_Search_Lucene->addDocument()
6 /var/www/yii-projects/projectname/protected/commands/LuceneCommand.php(97): Zend_Search_Lucene_Proxy->addDocument()
7 unknown(0): LuceneCommand->actionBuildIndex()
8 /var/www/yii-projects/yii-1.1.12.b600af/framework/console/CConsoleCommand.php(173): ReflectionMethod->invokeArgs()
9 /var/www/yii-projects/yii-1.1.12.b600af/framework/console/CConsoleCommandRunner.php(68): LuceneCommand->run()
10 /var/www/yii-projects/yii-1.1.12.b600af/framework/console/CConsoleApplication.php(92): CConsoleCommandRunner->run()
11 /var/www/yii-projects/yii-1.1.12.b600af/framework/base/CApplication.php(162): CConsoleApplication->processRequest()
12 /var/www/yii-projects/projectname/protected/scripts/cron.php(14): CConsoleApplication->run()
I cannot think of any reason why there is any difference, given the measures taken, and the fact that the user is the same in both cases.
Please help!
Thanks
Edit - I should also confirm that the underlying data that is being indexed is not changing - I've executed both scenarios alternately many times and get the above results consistently.
Try with the -f switch and directly from crontab:
/usr/bin/php -f /var/www/yii-projects/projectname/protected/scripts/cron.php lucene buildIndex
Also, are you sure that the text of the command you are passing is in UTF-8? Could there be some other symbol in there, maybe a BOM? You can check this with a hex editor: open your shell script, ignore all the letters, and see what's left. A UTF-8 BOM is EF BB BF, but it may not be a BOM at all. Just check.
The necessary shell environment variables are not available to a crontab job, so I added this to the irishhp user's crontab:
PATH=/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
LANG=en_US.UTF-8
which resolved the problem.
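To reproduce this kind of difference from an interactive shell, one common approach is to run the command with a stripped-down environment. The sketch below only approximates cron: the exact variables and the PATH value that cron sets vary between systems, so the values here are assumptions, and run_like_cron is a hypothetical helper.

```shell
#!/bin/sh
run_like_cron() {
    # $1 = command line to run. env -i drops every inherited variable,
    # including LANG, so only the handful listed here is visible.
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un 2>/dev/null)}" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        /bin/sh -c "$1"
}
```

Running the buildIndex.sh command line through run_like_cron should surface a locale-dependent failure on the command line, without waiting for the next scheduled run.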

Variables in crontab?

How can I store variables in my crontab? I realize it's not shell but say I want to have some constants like a path to my app or something?
In Vixie cron, which is possibly the most common, you can do this almost exactly like a shell script.
VARIABLE=value
PATH=/bin:/path/to/doathing
0 0 * * * doathing.sh $VARIABLE
The man page says:
An active line in a crontab will be either an environment setting or a cron command. An environment setting is of the form,
name = value
where the spaces around the equal-sign (=) are optional, and any subsequent non-leading spaces in value will be part of the value assigned
to name. The value string may be placed in quotes (single or double, but matching) to preserve leading or trailing blanks. The name
string may also be placed in quote (single or double, but matching) to preserve leading, trailing or inner blanks.
You can tell if you have Vixie cron by checking the man page for crontab; the author will be Paul Vixie. Different crons may or may not support this (BusyBox's cron for example, does not), in which case your best option is to wrap your command in a shell script and run that script from cron instead. In fact, this is a good thing to do for anything complicated.
To keep my crontab clean, I would just call a shell script and do the fun stuff in the script.
I think the important fact to point out here (as stated in an earlier comment by Pierre D, Mar 25, 2015 at 18:58) is that variable declarations are not expanded/interpolated and so cannot embed other variable values.
Variables are only expanded/interpolated in the commands themselves.
So:
var1 = bar
var2 = foo${var1}
42 17 * * * /path/to/command ${var2}
Results in: /path/to/command foo${var1}
While:
var1 = bar
var2 = foo
42 17 * * * /path/to/command ${var2}${var1}
Results in: /path/to/command foobar
So in my case the following works fine, no wrapping in shell scripts required:
SHELL=/bin/bash
timestamp=date +20%y_%m_%d_%H_%M_%S
logdir=/my/log/dir
0 2 * * * /my/command/path/mycmd param >> ${logdir}/myfile_$(${timestamp}).log
versus something like this, which does not work:
logfile = /my/log/dir/myfile_${timestamp}.log
since the latter is not expanded, but is instead taken literally, including "${" and "}" as part of the string.
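The working crontab above relies on a shell mechanism that can be tried outside cron. In this sketch the variable holds a command string rather than a timestamp; the unquoted ${timestamp} is split into words and executed by $( ). The logfile variable is only illustrative. The % characters sit in the variable's value rather than in the command field, which appears to be why they need no backslash there.

```shell
#!/bin/bash
# The variable stores the text of a command, exactly as in the crontab.
timestamp='date +20%y_%m_%d_%H_%M_%S'
logdir=/my/log/dir

# Unquoted expansion splits the value into "date" and its format argument,
# and $( ) runs the result and captures its output.
logfile="${logdir}/myfile_$(${timestamp}).log"
printf '%s\n' "$logfile"
```

This depends on default word splitting, so it works in sh and bash but not in a shell such as zsh with its default settings.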
Just a working example of using variables in the crontab file and their substitution in the strings:
CURRENT_TIME=date +%Y.%m.%d_%H:%M:%S.%3N
CURRENT_DATE=date +%Y_%m_%d
SIMPLE_VAR=the_simple_var
LOG_DIR=/var/log/cron
* * * * * /bin/echo "simple variable test! ${SIMPLE_VAR}__test!" >> "${LOG_DIR}/test.log"
* * * * * /bin/echo "complex variable test! $(${CURRENT_TIME})__test!" >> "${LOG_DIR}/test.log"
Tested on this Docker image (paste the above crontab to the crontab.txt):
FROM debian:10-slim
# Install CRON, plus curl and CA certificates, which the slim image lacks:
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates curl cron
# Install docker (Yep, this is a docker in docker):
RUN curl -fsSL https://get.docker.com -o get-docker.sh && sh get-docker.sh
# Install the tasks from crontab.txt:
COPY crontab.txt /var/crontab.txt
RUN crontab /var/crontab.txt
ENTRYPOINT ["cron", "-f"]
Use a command like this in a crontab entry to run commands inside another Docker container:
/usr/bin/docker exec container_name ls -la
If you have a few environment variables you want to set for a particular job, just inline those into the sh snippet.
42 17 * * * myvariable='value' path/to/command
In case it's not obvious, the sh syntax var=value command sets var to value for the duration of command. You can have several of these if you need more than one.
42 17 * * * firstname='Fred' lastname='Muggs' path/to/command
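The scope of such a prefix assignment can be checked in any POSIX shell. This is a small sketch; the inside and outside variable names are only there for the demonstration.

```shell
#!/bin/sh
# The assignment prefix applies only to the single command that follows it.
inside=$(myvariable='value' sh -c 'printf %s "$myvariable"')

# Back in the calling shell, the variable was never set.
outside=${myvariable:-unset}

printf 'inside=%s outside=%s\n' "$inside" "$outside"
```

The child process sees the value, while the calling shell does not, which is why this form is handy for a single cron job.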
If you have nontrivial variables you want to access from several places, probably put them in a separate file, and source (.) that file from your shell startup script and your cron jobs.
Let's say you have a file $HOME/bin/myenv with settings like
myvar=$(output of some complex command)
another='another
complex
variable'
then you can add
. $HOME/bin/myenv
to your .profile (or .zshrc or .bash_profile or what have you; but .profile is portable, and also used by sh) and
42 17 * * * . $HOME/bin/myenv; path/to/command
in your crontab.
Notice the lone dot before the space before the file name; this is the dot command (also known as source in e.g. Bash) which reads the file into the current shell instance as if you had typed in the things in the file here.
Tangentially, the $HOME/ part is strictly speaking redundant; cron jobs will always run in the invoking user's home directory.
Obviously, if you want a variable to be true in your entire crontab, set it at the top, before the scheduled jobs.
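The dot command described above can be tried in isolation. In this sketch a temporary file stands in for $HOME/bin/myenv, and the printf call stands in for "some complex command"; both are assumptions made for the demonstration.

```shell
#!/bin/sh
# Create a stand-in for $HOME/bin/myenv.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
myvar=$(printf 'computed')
another='another
complex
variable'
EOF

# The dot command reads the file into the current shell, so the
# assignments (including the command substitution) take effect here.
. "$env_file"
rm -f "$env_file"

printf '%s\n' "$myvar"
```

Because the file is evaluated by the shell, it can compute values and hold multi-line strings, which plain crontab variable lines cannot do.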

Creating a Named Cron Job

How do you create a cron job from the command line, so that it shows up with a name in gnome-schedule?
I know how to create a cron job using crontab. However, all my jobs show up with a blank name. I'd like to better document my jobs so I can easily identify them in gnome-schedule, or similar cron wrapper.
Well, just made a cronjob in Scheduler, and took a look at my crontab file, and it looked like this:
0 0 * * * ls >/dev/null 2>&1 # JOB_ID_1
Notice the JOB_ID_1 at the end.
I went into ~/.gnome/gnome-scheduler/, looked at the files there, and there was one named just 1 (as in the number "one") which had a bit of info, including the name
ver=3
title=Hello
desc=
nooutput=1
So, I made a second cronjob:
0 0 * * * ls -al >/dev/null 2>&1 # JOB_ID_2
Copied the file 1 to 2 to match the JOB_ID_2, and changed the title, so the file reads:
ver=3
title=This is a test
desc=
nooutput=1
Then I switched over to Gnome-Schedule, and it had added the cronjob, and had the name updated.
Follow the same steps, and you should be able to manually name any cronjob you want.
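The manual steps above could be scripted roughly as follows. This is only a sketch based on the file layout observed in this answer (a file named after the job number, holding ver, title, desc and nooutput); whether gnome-schedule accepts other values is not something I have verified. The write_job_meta and cron_line functions are hypothetical helpers.

```shell
#!/bin/sh
write_job_meta() {
    # $1 = metadata directory, $2 = job id, $3 = title
    mkdir -p "$1" &&
    printf 'ver=3\ntitle=%s\ndesc=\nnooutput=1\n' "$3" > "$1/$2"
}

cron_line() {
    # $1 = schedule, $2 = command, $3 = job id
    # Prints a crontab line tagged with the matching JOB_ID comment.
    printf '%s %s # JOB_ID_%s\n' "$1" "$2" "$3"
}
```

For example, write_job_meta ~/.gnome/gnome-scheduler 2 'This is a test' writes the metadata file, and the line printed by cron_line '0 0 * * *' 'ls -al >/dev/null 2>&1' 2 can be added through crontab -e.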

Resources