Bash empty variable after running cron, but runs manually - linux

I have a simple bash script that works when run manually at the terminal, but the variable comes out empty when it's run from cron.
#!/bin/bash
gwip=`/usr/bin/nmcli dev list iface eth0 | grep IP4-SETTINGS.GATEWAY: | awk '{ print $2}'`
printf '%s\n' "$(date) =- $gwip -= " >> /var/log/looog.log
run: /bin/bash /test.bash
output in file /var/log/looog.log:
1 monday 2016 14:17:36 +0300 =- 23.18.117.254 -=
When I run it through cron, the variable is empty.
*/1 * * * * root /bin/bash /test.bash
output in file /var/log/looog.log:
1 monday 2016 14:19:13 +0300 =- -=
Why is the variable $gwip empty? How can I fix it?

Qualifying /usr/bin/nmcli isn't enough -- you're calling a bunch of other tools that also need to be found via the PATH.
Also, in general -- when debugging a cron job, arrange for its stderr to go to a file, like so:
#!/bin/bash
# log stdout and stderr to two different files
exec >>/var/log/looog.log 2>>/var/log/looog.err.log
# ...and log every command we try to execute to stderr (aka looog.err.log)
set -x
# set a PATH variable
export PATH=/bin:/usr/bin
# original code here, using modern POSIX $() syntax, vs old hard-to-nest ``
gwip=$(nmcli dev list iface eth0 | awk '/IP4-SETTINGS[.]GATEWAY:/ { print $2}')
printf '%s\n' "$(date) =- $gwip -= "
The key things here are the explicitly-set PATH (not having the value you expect set in PATH is a common issue in cron jobs), and the stderr log (which ensures that any other issues can be identified by reading its contents).
Note the use of a single redirection to looog.log up-front. This doesn't make a significant difference when you're literally only running one print statement, but if you extend this script to write more than one line, it's more efficient to open your output file only once than to re-open it every time you have something to write.
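For illustration only (a hypothetical extension of the script above, not part of the original), every subsequent write reuses the file descriptors that exec opened once:
exec >>/var/log/looog.log 2>>/var/log/looog.err.log
printf '%s\n' "$(date) starting gateway check"     # first write: log already open via exec
gwip=$(nmcli dev list iface eth0 | awk '/IP4-SETTINGS[.]GATEWAY:/ { print $2}')
printf '%s\n' "$(date) =- $gwip -= "               # second write: same descriptor, no re-open
echo "diagnostics go to the separate error log" >&2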

Related

sed output in bash script works in CLI, output different in cron

A simple script to list sites on my Webinoly Ubuntu 18 server works in the CLI but fails in cron. Webinoly is a site management script and has a command to list the sites it is managing:
site -list
The output from this command in the CLI looks like this:
- catalyst.dk39ecbk3.com
- siteexample3.com
- webinoly.dk39ecbk3.com
The script I'm having trouble with (below) should remove the control characters, the " - " from the beginning of each line, and the blank lines using sed:
#!/bin/bash
# create array of sites using webinoly 'site' command
# webinoly site -list command returns lines that start with hyphens and spaces, along with control chars, which need to be removed
# SED s/[\x01-\x1F\x7F]//g removes control characters
# SED s/.{7}// removes first seven chars and s/.{5}$// removes last 5 chars
# SED /^\s*$/d removes blank lines
# removing characters http://www.theunixschool.com/2014/08/sed-examples-remove-delete-chars-from-line-file.html
# removing empty lines https://stackoverflow.com/questions/16414410/delete-empty-lines-using-sed
SITELIST=($(/usr/bin/site -list | sed -r "s/[\x01-\x1F\x7F]//g;s/.{7}//;s/.{5}$//;/^\s*$/d"))
#print site list
for SITE in ${SITELIST[@]}; do
echo "$SITE"
done
Here's the desired output, which I see in the CLI:
root@server1 ~/scripts # ./gdrive-backup-test.sh
catalyst.dk39ecbk3.com
siteexample3.com
webinoly.dk39ecbk3.com
The trouble happens when the script runs in cron. Here's the cron file:
root@server1 ~/scripts # crontab -l
SHELL=/bin/bash
MAILTO=myemail@gmail.com
15 3 * * 7 certbot renew --post-hook "service nginx restart"
47 01 * * * /root/scripts/gdrive-backup-test.sh > /root/scripts/output-gdrive-backup.txt
and here's the output-gdrive-backup.txt file generated by the cron command:
root@server1 ~/scripts # cat output-gdrive-backup.txt
lyst.dk39ecbk3
example3
noly.dk39ecbk3
The first four characters of each line are missing, as are the last four (the .com).
I've researched and made sure to force use of bash in the cron file as well as at the beginning of the script.
With the following input:
$ cat site
- catalyst.dk39ecbk3.com
- siteexample3.com
- webinoly.dk39ecbk3.com
- webinoly.dk39ecbk3.com
you can use the following sed command to reach your output:
$ cat site | sed -e "s/^\s*-\s*//g;/^\s*$/d"
catalyst.dk39ecbk3.com
siteexample3.com
webinoly.dk39ecbk3.com
webinoly.dk39ecbk3.com
Replace the cat site by the command you want to filter the output from.
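Applied to the original script, that might look like the sketch below (assuming site -list produces the same hyphen-prefixed lines when run from cron; if it also emits colour codes there, keep the control-character substitution from the original sed):
SITELIST=($(/usr/bin/site -list | sed -e "s/^\s*-\s*//g;/^\s*$/d"))
for SITE in "${SITELIST[@]}"; do
    echo "$SITE"
done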
The answer turned out to be a failure to specify TERM in my cron file. Setting it solved the main issue I was having. This was a strange issue -- hard to research and figure out.
There were a few other problems -- one was that the path for one of the commands wasn't part of the PATH cron uses, even though it was in root's PATH in the CLI. For more info on the TERM issue, see "tput: No value for $TERM and no -T specified" error logged by CRON process.
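For reference, one way to apply that fix is to declare TERM at the top of the crontab, alongside the existing SHELL and MAILTO lines; the value xterm below is only an illustrative assumption (any terminal type known on the system works):
SHELL=/bin/bash
TERM=xterm
MAILTO=myemail@gmail.com
15 3 * * 7 certbot renew --post-hook "service nginx restart"
47 01 * * * /root/scripts/gdrive-backup-test.sh > /root/scripts/output-gdrive-backup.txt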

stdout all at once instead of line by line

I wrote a script that gets load and mem information for a list of servers by ssh'ing to each server. However, since there are around 20 servers, it's not very efficient to wait for the script to end. That's why I thought it might be interesting to make a crontab that writes the output of the script to a file, so all I need to do is cat this file whenever I need to know load and mem information for the 20 servers. However, when I cat this file during the execution of the crontab it will give me incomplete information. That's because the output of my script is written line by line to the file instead of all at once at termination. I wonder what needs to be done to make this work...
My crontab:
* * * * * (date;~/bin/RUP_ssh) &> ~/bin/RUP.out
My bash script (RUP_ssh):
for comp in `cat ~/bin/servers`; do
ssh $comp ~/bin/ca
done
Thanks,
niefpaarschoenen
You can buffer the output to a temporary file and then output all at once like this:
outputbuffer=`mktemp` # Create a new temporary file, usually in /tmp/
trap "rm '$outputbuffer'" EXIT # Remove the temporary file if we exit early.
for comp in `cat ~/bin/servers`; do
ssh $comp ~/bin/ca >> "$outputbuffer" # gather info to buffer file
done
cat "$outputbuffer" # print buffer to stdout
# rm "$outputbuffer" # delete temporary file, not necessary when using trap
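If readers of RUP.out should never see a partially written file at all, a variation on the same idea (a sketch, assuming the crontab entry just runs ~/bin/RUP_ssh without redirecting its output) is to build the buffer and then move it into place:
#!/bin/bash
outputbuffer=$(mktemp ~/bin/RUP.out.XXXXXX)  # create the temp file next to RUP.out so mv stays on one filesystem
trap 'rm -f "$outputbuffer"' EXIT            # -f keeps the trap harmless after the mv below
{
    date
    for comp in $(cat ~/bin/servers); do
        ssh "$comp" ~/bin/ca
    done
} >> "$outputbuffer"
mv "$outputbuffer" ~/bin/RUP.out             # atomic replace: cat sees the old or the new file, never a partial one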
Assuming there is a string identifying which host the mem/load data came from, you can update your txt file as each result comes in. Assuming the data block is one line long, you could use:
for comp in `cat ~/bin/servers`; do
output=$( ssh $comp ~/bin/ca )
# remove old mem/load data for $comp from RUP.out
sed -i '/'"$comp"'/d' RUP.out # this assumes that the string "$comp" is
# integrated into the output from ca, and
# not elsewhere
echo "$output" >> RUP.out
done
This can be adapted depending on the output of ca. There is lots of help on sed across the net.

Bash script to capture input, run commands, and print to file

I am trying to do a homework assignment and it is very confusing. I am not sure if the professor's example is in Perl or bash, since it has no header. Basically, I just need help with the meat of the problem: capturing the input and outputting it. Here is the assignment:
In the session, provide a command prompt that includes the working directory, e.g.,
$./logger/home/it244/it244/hw8$
Accept user’s commands, execute them, and display the output on the screen.
During the session, create a temporary file “PID.cmd” (PID is the process ID) to store the command history in the following format (index: command):
1: ls
2: ls -l
If the script is aborted by CTRL+C (signal 2), output a message “aborted by ctrl+c”.
When you quit the logging session (either by “exit” or CTRL+C),
a. Delete the temporary file
b. Print out the total number of the commands in the session and the numbers of successful/failed commands (according to the exit status).
Here is my code so far (which did not go well, I would not try to run it):
#!/bin/sh
trap 'exit 1' 2
trap 'ctrl-c' 2
echo $(pwd)
while true
do
read -p command
echo "$command:" $command >> PID.cmd
done
Currently when I run this script I get
command read: 10: arg count
What is causing that?
======UPDATE=========
OK, I made some progress. It's not quite working all the way; it doesn't like my bashtrap function or the incremental index.
#!/bin/sh
index=0
trap bashtrap INT
bashtrap(){
echo "CTRL+C aborting bash script"
}
echo "starting to log"
while :
do
read -p "command:" inputline
if [ $inputline="exit" ]
then
echo "Aborting with Exit"
break
else
echo "$index: $inputline" > output
$inputline 2>&1 | tee output
(( index++ ))
fi
done
This can be achieved in bash, perl, or other languages.
Some hints to get you started in bash:
question 1 : command prompt /logger/home/it244/it244/hw8
1) make sure of the prompt format in the user's .bashrc setup file: see the PS1 variable for Debian-like distros.
2) cd into that directory within your bash script.
question 2 : run the user command
1) get the user input
read -p "command : " input_cmd
2) run the user command to STDOUT
bash -c "$input_cmd"
3) Track the user input command exit code
echo $?
Should exit with "0" if everything worked fine (you can also find exit codes in the command man pages).
3) Track the command PID if the exit code is Ok
echo $$ >> /tmp/pid_Ok
But take care: the assignment asks you to keep the user's command input, not the PID itself as shown here.
4) trap on exit
see man trap, as you've misunderstood its use: you can create a function that is called on the caught exit or CTRL+C signals.
5) increment the index in your while loop (on the exit code condition)
index=0
while ...
do
...
((index++))
done
I guess you have enough to start your homework.
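Putting those hints together, a minimal sketch (illustrative only; it doesn't yet handle the working-directory prompt, the PID.cmd history file, or the traps) might look like:
#!/bin/bash
index=0 success=0 fail=0
while read -p "command : " input_cmd; do
    [ "$input_cmd" = "exit" ] && break
    ((index++))
    if bash -c "$input_cmd"; then
        ((success++))
    else
        ((fail++))
    fi
done
echo "$index commands: $success successful, $fail failed"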
Since the example posted used sh, I'll use that in my reply. You need to break down each requirement into its specific lines of supporting code. For example, in order to "provide a command prompt that includes the working directory" you need to actually print the current working directory as the prompt string for the read command, not set the $PS1 variable. This leads to a read command that looks like:
read -p "`pwd -P`\$ " _command
(I use leading underscores for private variables - just a matter of style.)
Similarly, the requirement to do several things on either a trap or a normal exit suggests a function should be created which could then either be called by the trap or to exit the loop based on user input. If you wanted to pretty-print the exit message, you might also wrap it in echo commands and it might look like this:
_cleanup() {
rm -f $_LOG
echo
echo $0 ended with $_success successful commands and $_fail unsuccessful commands.
echo
exit 0
}
So after analyzing each of the requirements, you'd need a few counters and a little bit of glue code such as a while loop to wrap them in. The result might look like this:
#!/usr/bin/sh
# Define a function to call on exit
_cleanup() {
# Remove the log file as per specification #5a
rm -f $_LOG
# Display success/fail counts as per specification #5b
echo
echo $0 ended with $_success successful commands and $_fail unsuccessful commands.
echo
exit 0
}
# Where are we? Get absolute path of $0
_abs_path=$( cd -P -- "$(dirname -- "$(command -v -- "$0")")" && pwd -P )
# Set the log file name based on the path & PID
# Keep this constant so the log file doesn't wander
# around with the user if they enter a cd command
_LOG=${_abs_path}/$$.cmd
# Print ctrl+c msg per specification #4
# Then run the cleanup function
trap "echo aborted by ctrl+c;_cleanup" 2
# Initialize counters
_line=0
_fail=0
_success=0
while true
do
# Count lines to support required logging format per specification #3
((_line++))
# Set prompt per specification #1 and read command
read -p "`pwd -P`\$ " _command
# Echo command to log file as per specification #3
echo "$_line: $_command" >>$_LOG
# Arrange to exit on user input with value 'exit' as per specification #5
if [[ "$_command" == "exit" ]]
then
_cleanup
fi
# Execute whatever command was entered as per specification #2
eval $_command
# Capture the success/fail counts to support specification #5b
_status=$?
if [ $_status -eq 0 ]
then
((_success++))
else
((_fail++))
fi
done

Bash script and cron anomaly

I have a bash script that I run to check whether one of my programs has hung, and if it has, kill it. The script works fine if run from the command line, but if I schedule it with cron it does something very strange.
Basically the script (below) gets the PID of my program and gets its created date/time from its entry in the /proc/ directory. It then gets the current date/time from the system and converts these two values into seconds since 1970 with the "date" command, before finally subtracting the two. This usually ends up with a total of 2100 seconds or something like that, which equates to 35 minutes.
#!/bin/bash
THEDATE=$(date +%s)
MYPID=$(ps aux|grep -v grep|egrep "MyProgram.exe"|awk '{print $2}')
if (( ${#MYPID} > 0 )); then
STARTTIME=$(ls -ld /proc/$MYPID|date +%s -d"$(awk '{print $6, $7}')")
TOTALMINS=$(( ($THEDATE - $STARTTIME) / 60 ))
if (( $TOTALMINS >= 30 )); then
kill -9 $MYPID
logger -t "[KillLongRunningProcesses] Killed my program which had been running for $TOTALMINS minutes"
fi
fi
When run from the command line, the two date variables (THEDATE and STARTTIME) both get the correct values. But when run by cron the STARTTIME is wrong. It has the correct date, but seems to ignore the time part and set it to midnight, i.e. "2009-12-14 00:00:00" is obtained instead of "2009-12-14 13:23:00", which throws off all the calculations.
Any ideas? Thanks.
First off, never parse the output of ls; read THIS to understand why. Next, your script can be greatly improved by using pgrep rather than using awk to parse the PID from a grep on 'ps aux'. Also, your script breaks horribly in the case where more than one PID is returned. And finally, when writing shell scripts try not to use CAPITALS for variable names; that convention is reserved for variables that you export into your environment.
The following script attempts to solve the problems mentioned above. It is as efficient as I could make it and it will handle the case where you have multiple PIDs. It also checks to make sure that the PID still exists before we kill it because it's possible that when we kill the parent it may take out the child processes.
#!/bin/bash
prog_name="MyProgram.exe"
the_date=$(date +%s)
my_pids=( $(pgrep "$prog_name") )
for ((i=0; i < ${#my_pids[@]}; i++)); do
if [[ -d /proc/${my_pids[i]} ]]; then
start_time=$(stat --printf=%Y /proc/${my_pids[i]})
total_mins=$(( (the_date - start_time) / 60 ))
if (( $total_mins >= 30 )); then
kill -9 ${my_pids[i]}
logger -t "Your custom message here"
fi
fi
done
That's why you can't rely on parsing ls. You should use stat if your system has it.
stat --printf=%Y /proc/$MYPID
If not, perhaps your find can do it for you:
find /proc -maxdepth 1 -name $MYPID -printf "%T@"
For what it's worth I used to get caught out with the PATH and other environment variables not being set automatically.
Worth checking. If it's not that then comment and I'll delete my answer.
I've figured out how to do it. If I use the "--time-style=long-iso" argument with ls, it returns the timestamp in the format I need. Apparently cron has different defaults for some commands than a normal user session.
I take your point Dennis that "ls" is unreliable for parsing, but the configuration for the computer I'm using is never going to change, so now that it works I will leave it as is. In future I will likely do things your way.
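For reference, the working ls-based line presumably ended up looking something like this (a sketch; with --time-style=long-iso the date and time still land in awk fields 6 and 7):
STARTTIME=$(ls -ld --time-style=long-iso /proc/$MYPID | date +%s -d "$(awk '{print $6, $7}')")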

Trace of executed programs called by a Bash script

A script is misbehaving. I need to know who calls that script, and who calls the calling script, and so on, only by modifying the misbehaving script.
This is similar to a stack-trace, but I am not interested in a call stack of function calls within a single bash script.
Instead, I need the chain of executed programs/scripts that is initiated by my script.
A simple script I wrote some days ago...
# FILE : sctrace.sh
# LICENSE : GPL v2.0 (only)
# PURPOSE : print the recursive callers' list for a script
# (sort of a process backtrace)
# USAGE : [in a script] source sctrace.sh
#
# TESTED ON :
# - Linux, x86 32-bit, Bash 3.2.39(1)-release
# REFERENCES:
# [1]: http://tldp.org/LDP/abs/html/internalvariables.html#PROCCID
# [2]: http://linux.die.net/man/5/proc
# [3]: http://linux.about.com/library/cmd/blcmdl1_tac.htm
#! /bin/bash
TRACE=""
CP=$$ # PID of the script itself [1]
while true # safe because "all starts with init..."
do
CMDLINE=$(cat /proc/$CP/cmdline)
PP=$(grep PPid /proc/$CP/status | awk '{ print $2; }') # [2]
TRACE="$TRACE [$CP]:$CMDLINE\n"
if [ "$CP" == "1" ]; then # we reach 'init' [PID 1] => backtrace end
break
fi
CP=$PP
done
echo "Backtrace of '$0'"
echo -en "$TRACE" | tac | grep -n ":" # using tac to "print in reverse" [3]
... and a simple test.
I hope you like it.
You can use Bash Debugger http://bashdb.sourceforge.net/
Or, as mentioned in the previous comments, the caller bash built-in. See: http://wiki.bash-hackers.org/commands/builtin/caller
i=0; while caller $i ;do ((i++)) ;done
Or as a bash function:
dump_stack(){
local i=0
local line_no
local function_name
local file_name
while caller $i ;do ((i++)) ;done | while read line_no function_name file_name;do echo -e "\t$file_name:$line_no\t$function_name" ;done >&2
}
Another way to do it is to change PS4 and enable xtrace:
PS4='+$(date "+%F %T") ${FUNCNAME[0]}() $BASH_SOURCE:${BASH_LINENO[0]}+ '
set -o xtrace # Comment this line to disable tracing.
~$ help caller
caller: caller [EXPR]
Returns the context of the current subroutine call.
Without EXPR, returns "$line $filename". With EXPR,
returns "$line $subroutine $filename"; this extra information
can be used to provide a stack trace.
The value of EXPR indicates how many call frames to go back before the
current one; the top frame is frame 0.
Since you say you can edit the script itself, simply put a:
ps -ef >/tmp/bash_stack_trace.$$
in it, where the problem is occurring.
This will create a number of files in your tmp directory that show the entire process list at the time it happened.
You can then work out which process called which other process by examining this output. This can either be done manually, or automated with something like awk, since the output is regular - you just use those PID and PPID columns to work out the relationships between all the processes you're interested in.
You'll need to keep an eye on the files, since you'll get one per process so they may have to be managed. Since this is something that should only be done during debugging, most of the time that line will be commented out (preceded by #), so the files won't be created.
To clean them up, you can simply do:
rm /tmp/bash_stack_trace.*
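If you want to automate that, you can walk the PID/PPID columns of one snapshot with awk; a rough sketch (the PID 12345 and the snapshot filename are illustrative, and the field numbers assume the default ps -ef layout):
awk -v pid=12345 '
    NR > 1 { ppid[$2] = $3; cmd[$2] = $8 }                     # $2=PID, $3=PPID, $8=command
    END { while (pid in ppid) { print pid, cmd[pid]; pid = ppid[pid] } }
' /tmp/bash_stack_trace.12345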
UPDATE:
The code below should work. I now have a newer answer with an updated version of the code that allows a message to be inserted in the stack trace.
IIRC I just couldn't find this answer to update it at the time. I've since decided the code is better kept in git, so the latest version of the code below should be in this gist.
original code-corrected answer below:
There was another answer about this somewhere, but here is a function for getting a stack trace in the sense used, for example, in the Java programming language. You call the function and it puts the stack trace into the variable $STACK. It shows the code points that led to get_stack being called. This is mostly useful for complicated executions where a single shell sources multiple script snippets with nesting.
function get_stack () {
STACK=""
# to avoid noise we start with 1 to skip get_stack caller
local i
local stack_size=${#FUNCNAME[@]}
for (( i=1; i<$stack_size ; i++ )); do
local func="${FUNCNAME[$i]}"
[ x$func = x ] && func=MAIN
local linen="${BASH_LINENO[(( i - 1 ))]}"
local src="${BASH_SOURCE[$i]}"
[ x"$src" = x ] && src=non_file_source
STACK+=$'\n'" "$func" "$src" "$linen
done
}
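A typical way to use it (illustrative) is from an error handler, so the point of failure reports how it was reached:
die() {
    get_stack
    echo "fatal: $*" >&2
    echo "$STACK" >&2
    exit 1
}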
Adding pstree -p -u `whoami` >> output in your script will probably get you the information you need.
The simplest script which returns a stack trace with all callers:
i=0; while caller $i ;do ((i++)) ;done
You could try something like
strace -f -e execve script.sh
