How do I manage log verbosity inside a shell script? - linux

I have a pretty long bash script that invokes quite a few external commands (git clone, wget, apt-get and others) that print a lot of stuff to the standard output.
I want the script to have a few verbosity options so it prints everything from the external commands, a summarized version of it (e.g. "Installing dependencies...", "Compiling...", etc.) or nothing at all. But how can I do it without cluttering up all my code?
I've thought about two possible solutions to this. One is to create a wrapper function that runs the external commands and prints what's needed to the standard output, depending on the options set at the start. This one seems easier to implement, but it means adding a lot of extra clutter to the code.
The other solution is to send all the output to a couple of external files and, when parsing the arguments at the start of the script, run tail -f on those files if verbosity is specified. This would be very easy to implement, but it seems pretty hacky to me and I'm concerned about its performance impact.
Which one is better? I'm also open to other solutions.

Improving on @Fred's idea a little bit more, we could build a small logging library this way:
declare -A _log_levels=([FATAL]=0 [ERROR]=1 [WARNING]=2 [INFO]=3 [DEBUG]=4 [VERBOSE]=5)
declare -i _log_level=3
set_log_level() {
    local level="${1:-INFO}"
    _log_level="${_log_levels[$level]}"
}
log_execute() {
    local level=${1:-INFO}
    if (( _log_level >= ${_log_levels[$level]} )); then
        "${@:2}"
    else
        "${@:2}" >/dev/null
    fi
}
log_fatal()   { (( _log_level >= ${_log_levels[FATAL]} ))   && echo "$(date) FATAL $*"; }
log_error()   { (( _log_level >= ${_log_levels[ERROR]} ))   && echo "$(date) ERROR $*"; }
log_warning() { (( _log_level >= ${_log_levels[WARNING]} )) && echo "$(date) WARNING $*"; }
log_info()    { (( _log_level >= ${_log_levels[INFO]} ))    && echo "$(date) INFO $*"; }
log_debug()   { (( _log_level >= ${_log_levels[DEBUG]} ))   && echo "$(date) DEBUG $*"; }
log_verbose() { (( _log_level >= ${_log_levels[VERBOSE]} )) && echo "$(date) VERBOSE $*"; }
# functions for logging command output
log_debug_file()   { (( _log_level >= ${_log_levels[DEBUG]} ))   && [[ -f $1 ]] && echo "=== command output start ===" && cat "$1" && echo "=== command output end ==="; }
log_verbose_file() { (( _log_level >= ${_log_levels[VERBOSE]} )) && [[ -f $1 ]] && echo "=== command output start ===" && cat "$1" && echo "=== command output end ==="; }
Let's say the above code is in a library file called logging_lib.sh; we could use it in a regular shell script this way:
#!/bin/bash
source /path/to/lib/logging_lib.sh
set_log_level DEBUG
log_info "Starting the script..."
# method 1 of controlling a command's output based on log level
log_execute INFO date
# method 2 of controlling the output based on log level
date &> date.out
log_debug_file date.out
log_debug "This is a debug statement"
...
log_error "This is an error"
...
log_warning "This is a warning"
...
log_fatal "This is a fatal error"
...
log_verbose "This is a verbose log!"
This will result in the following output:
Fri Feb 24 06:48:18 UTC 2017 INFO Starting the script...
Fri Feb 24 06:48:18 UTC 2017
=== command output start ===
Fri Feb 24 06:48:18 UTC 2017
=== command output end ===
Fri Feb 24 06:48:18 UTC 2017 DEBUG This is a debug statement
Fri Feb 24 06:48:18 UTC 2017 ERROR This is an error
Fri Feb 24 06:48:18 UTC 2017 WARNING This is a warning
Fri Feb 24 06:48:18 UTC 2017 FATAL This is a fatal error
As we can see, log_verbose didn't produce any output since the log level is at DEBUG, one level below VERBOSE. However, log_debug_file date.out did produce the output and so did log_execute INFO, since log level is set to DEBUG, which is >= INFO.
Using this as the base, we could also write command wrappers if we need even more fine tuning:
git_wrapper() {
# run git command and print the output based on log level
}
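For example, one possible sketch of such a wrapper that reuses log_debug_file from the library above (the temp-file handling is just an illustrative choice, not something the library mandates):
git_wrapper() {
    local out rc
    out=$(mktemp) || return
    git "$@" &> "$out"      # capture the command's combined output
    rc=$?
    log_debug_file "$out"   # shown only when the log level is DEBUG or higher
    rm -f "$out"
    return "$rc"
}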
With these in place, the script could be enhanced to take an argument --log-level level that can determine the log verbosity it should run with.
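A minimal sketch of that argument handling (the option name is the one suggested above; everything else is illustrative):
# Sketch: parse a --log-level option before anything else.
while (( $# > 0 )); do
    case $1 in
        --log-level) set_log_level "$2"; shift 2 ;;
        *) break ;;
    esac
done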
Here is a complete implementation of logging for Bash, rich with multiple loggers:
https://github.com/codeforester/base/blob/master/lib/stdlib.sh
If anyone is curious about why some variables are named with a leading underscore in the code above, see this post:
Correct Bash and shell script variable capitalization

You already have what seems to be the cleanest idea in your question (a wrapper function), but you seem to think it would be messy. I would suggest you reconsider. It could look like the following (not necessarily a full-fledged solution, just to give you the basic idea):
#!/bin/bash
# Argument 1 : Logging level for that command
# Arguments 2... : Command to execute
# Output suppressed if command level >= current logging level
log()
{
  if
    (( $1 >= logging_level ))
  then
    "${@:2}" >/dev/null 2>&1
  else
    "${@:2}"
  fi
}
logging_level=2
log 1 command1 and its args
log 2 command2 and its args
log 3 command4 and its args
You can arrange for any required redirection (with file descriptors if you want) to be handled in the wrapper function, so that the rest of the script remains readable and free from redirections and conditions depending on the selected logging level.
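For instance, a variant sketch of log() that keeps suppressed output in a log file instead of discarding it (the fd number 3 and the file name script.log are arbitrary choices, not part of the idea above):
exec 3>>script.log   # open the log file once, up front
log()
{
  if
    (( $1 >= logging_level ))
  then
    "${@:2}" >&3 2>&1   # too verbose for the console: keep it in the log file
  else
    "${@:2}"
  fi
}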

Solution 1.
Consider using additional file descriptors.
Redirect required file descriptors to STDOUT or /dev/null depending on selected verbosity.
Redirect output of every statement in your script to a file descriptor corresponding to its importance (a sketch follows below).
Have a look at https://unix.stackexchange.com/a/218355 .
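For example, a rough sketch of this approach (the fd numbers, the verbosity convention, and the apt-get line are illustrative assumptions only):
# fd 3 carries summary messages, fd 4 carries detailed command output.
verbosity=${1:-1}   # 0 = silent, 1 = summaries only, 2 = everything
if (( verbosity >= 1 )); then exec 3>&1; else exec 3>/dev/null; fi
if (( verbosity >= 2 )); then exec 4>&1; else exec 4>/dev/null; fi
echo "Installing dependencies..." >&3
apt-get install -y build-essential >&4 2>&4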

Solution 2.
Set $required_verbosity and pipe STDOUT of every statement in your script to a helper script with two parameters, something like this:
statement | logger actual_verbosity $required_verbosity
In a logger script echo STDIN to STDOUT (or log file, whatever) if $actual_verbosity >= $required_verbosity.
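A minimal sketch of that helper (the variable names follow the description above; treat it as illustrative only):
#!/bin/bash
# logger (sketch): usage: statement | logger <actual_verbosity> <required_verbosity>
# Forwards stdin to stdout only when actual_verbosity >= required_verbosity.
actual_verbosity=$1
required_verbosity=$2
if (( actual_verbosity >= required_verbosity )); then
    cat              # pass the statement's output through
else
    cat >/dev/null   # swallow it
fi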

Related

How to display content of a file added in the past 5 mins without scanning the whole file in Linux?

I have a DB error log file that grows continuously.
Now I want to set up some error monitoring on that file every 5 minutes.
The problem is I don't want to scan the whole file every 5 minutes (when the monitoring cron runs), because it may grow very big in the future. Scanning the whole (big) file every 5 minutes would consume quite a few resources.
So I just want to scan only the lines that were written to the log during the last 5-minute interval.
Each error recorded in the log has a timestamp prepended to it, like below:
180418 23:45:00 [ERROR] mysql got signal 11.
So I want to search for the pattern [ERROR] only in lines that were added in the last 5 minutes (not the whole file) and place the output in another file.
Please help me here.
Feel free to ask if you need more clarification on my question.
I'm using RHEL 7 and I'm trying to implement the above monitoring through a bash shell script.
Serializing the Byte Offset
This picks up where the last instance left off. If you run it every 5 minutes, then, it'll scan 5 minutes of data.
Note that this implementation knowingly can scan data added during an invocation's run twice. This is a little sloppy, but it's much safer to scan overlapping data twice than to never read it at all, which is a risk that can be run if relying on cron to run your program on schedule (likewise, sleeps can run over the requested time if the system is busy).
#!/usr/bin/env bash
file=$1; shift                     # first input: filename
grep_opts=( "$@" )                 # remaining inputs: grep options
dir=$(dirname -- "$file")          # extract directory name to use for offset storage
basename=${file##*/}               # pick up file name w/o directory
size_file="$dir/.$basename.size"   # generate filename to use to store offset
if [[ -s $size_file ]]; then       # ...if we already have a file with an offset...
  old_size=$(<"$size_file")        # ...read it from that file
else
  old_size=0                       # ...otherwise start at the front.
fi
new_size=$(stat --format=%s -- "$file") || exit  # Figure out current size
if (( new_size < old_size )); then
  old_size=0                       # file was truncated, so we can't trust old_size
elif (( new_size == old_size )); then
  exit 0                           # no new contents, so no point in trying to search
fi
# read starting at old_size and grep only that content
dd iflag=skip_bytes skip="$old_size" if="$file" | grep "${grep_opts[@]}"
# capture both exit codes right away, before a later command resets PIPESTATUS
pipe_status=( "${PIPESTATUS[@]}" ); grep_retval=${pipe_status[1]}
# if the read failed, don't store an updated offset
(( pipe_status[0] != 0 )) && exit 1
# create a new tempfile to store offset in
tempfile=$(mktemp -- "${size_file}.XXXXXX") || exit
# write to that temporary file...
printf '%s\n' "$new_size" > "$tempfile" || { rm -f "$tempfile"; exit 1; }
# ...and if that write succeeded, overwrite the last place where we serialized output.
mv -- "$tempfile" "$size_file" || exit
exit "$grep_retval"
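A hypothetical crontab entry (the script name, log paths, and grep options here are assumptions, not part of the script above) could then run the scan every 5 minutes and collect matches in a separate file:
*/5 * * * * /path/to/scan-new-lines.sh /var/lib/mysql/error.log -F '[ERROR]' >> /var/log/mysql-recent-errors.log 2>&1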
Alternate Mode: Bisect For The Timestamp
Note that this can miss content if you're relying on, say, cron to invoke your code every 5 minutes on-the-dot; storing byte offsets can thus be more accurate.
Using the bsearch tool by Ole Tange:
#!/usr/bin/env bash
file=$1; shift
start_date=$(date -d 'now - 5 minutes' '+%y%m%d %H:%M:%S')
byte_offset=$(bsearch --byte-offset "$file" "$start_date")
dd iflag=skip_bytes skip="$byte_offset" if="$file" | grep "$@"
Another approach could be something like this:
DB_FILE="FULL_PATH_TO_YOUR_DB_FILE"
STATE_DIR="SOME_PATH_OF_YOUR_CHOICE"
current_db_size=$(du -b "$DB_FILE" | cut -f 1)
if [[ ! -e "$STATE_DIR/last_size_db_file" ]] ; then
    tail --bytes "$current_db_size" "$DB_FILE" > "$STATE_DIR/log-file_$(date +%Y-%m-%d_%H-%M-%S)"
else
    if [[ $(cat "$STATE_DIR/last_size_db_file") -gt $current_db_size ]] ; then
        previously_read_bytes=0
    else
        previously_read_bytes=$(cat "$STATE_DIR/last_size_db_file")
    fi
    new_bytes=$((current_db_size - previously_read_bytes))
    tail --bytes "$new_bytes" "$DB_FILE" > "$STATE_DIR/log-file_$(date +%Y-%m-%d_%H-%M-%S)"
fi
printf '%s' "$current_db_size" > "$STATE_DIR/last_size_db_file"
This prints all bytes of DB_FILE not previously printed to SOME_PATH_OF_YOUR_CHOICE/log-file_$(date +%Y-%m-%d_%H-%M-%S).
Note that $(date +%Y-%m-%d_%H-%M-%S) will be the current 'full' date at the time the log file is created.
You can make this a script and use cron to execute it every five minutes; something like this:
*/5 * * * * PATH_TO_YOUR_SCRIPT
Here is my approach:
First, read the whole log once so far.
If you reach the end, collect and read new lines for a timespan (in my example 9 seconds, for faster testing, while my dummy server appends to the logfile every 3 seconds).
After the timespan, echo the cache, clear the cache (an array arr), loop and sleep for some time, so that this process doesn't consume all CPU time.
First, my dummy logfile writer:
#!/bin/bash
#
# dummy logfile writer
#
while true
do
    s=$(( $(date +%s) % 3600 ))
    echo $s server msg
    sleep 3
done >> seconds.log
Started via ./seconds-out.sh &.
Now the more complicated part:
#!/bin/bash
#
# consume a logfile as written so far. Then, collect every new line
# and show it in an interval of $interval
#
interval=9 # 9 seconds
#
printf -v secnow '%(%s)T' -1
start=$(( secnow % (3600*24*365) ))
declare -a arr
init=false
while true
do
    read line
    printf -v secnow '%(%s)T' -1
    now=$(( secnow % (3600*24*365) ))
    # consume every line created in the past
    if (( ! init ))
    then
        # assume reading a line might not take longer than a second (rounded to whole seconds)
        while (( ${#line} > 0 && (now - start) < 2 ))
        do
            read line
            start=$now
            echo -n "." # for debugging purpose, remove
            printf -v secnow '%(%s)T' -1
            now=$(( secnow % (3600*24*365) ))
        done
        init=1
        echo "init=$init" # for debugging purpose, remove
    # collect new lines, display them every $interval seconds
    else
        if (( ${#line} > 0 ))
        then
            echo -n "-" # for debugging purpose, remove
            arr+=("read: $line \n")
        fi
        if (( (now - start) > interval ))
        then
            echo -e "${arr[@]}"
            arr=()
            start=$now
        fi
    fi
    sleep .1
done < seconds.log
Output with the logfile generator writing every 3 seconds and running for some time before the read-seconds.sh script is started, with debugging output activated:
./read-seconds.sh
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................init=1
---read: 1688 server msg
read: 1691 server msg
read: 1694 server msg
---read: 1697 server msg
read: 1700 server msg
read: 1703 server msg
----read: 1706 server msg
read: 1709 server msg
read: 1712 server msg
read: 1715 server msg
^C
Every dot represents a logfile line from the past and is therefore skipped.
Every dash represents a logfile line collected.

How can I detect a sequence of "hollows" (holes, lines not matching a pattern) bigger than n in a text file?

Case scenario:
$ cat Status.txt
1,connected
2,connected
3,connected
4,connected
5,connected
6,connected
7,disconnected
8,disconnected
9,disconnected
10,disconnected
11,disconnected
12,disconnected
13,disconnected
14,connected
15,connected
16,connected
17,disconnected
18,connected
19,connected
20,connected
21,disconnected
22,disconnected
23,disconnected
24,disconnected
25,disconnected
26,disconnected
27,disconnected
28,disconnected
29,disconnected
30,connected
As can be seen, there are "hollows", understanding them as lines with the "disconnected" value inside the sequence file.
I want, in fact, to detect these "holes", but it would be useful if I could set a minimum n of missing numbers in the sequence.
I.e., for n=5 a detectable hole would be the 7...13 part, as there are at least 5 "disconnected" in a row in the sequence. However, the single miss at 17 should not be considered detectable in this case. Again, at line 21 we get a valid disconnection.
Something like:
$ detector Status.txt -n 5 --pattern connected
7
21
... that could be interpreted like:
- Missing more than 5 "connected" starting at 7.
- Missing more than 5 "connected" starting at 21.
I need to script this in a Linux shell, so I was thinking about programming some loop, parsing strings and so on, but I feel like this could be done using Linux shell tools and maybe some simpler programming. Is there a way?
Even though small programs like csvtool are a valid solution, more common Linux commands (like grep, cut, awk, sed, wc, etc.) would be preferable for me when working with embedded devices.
#!/usr/bin/env bash
last_connected=0
min_hole_size=${1:-5} # default to 5, or take an argument from the command line
while IFS=, read -r num state; do
  if [[ $state = connected ]]; then
    if (( (num - last_connected) > (min_hole_size + 1) )); then
      echo "Found a hole running from $((last_connected + 1)) to $((num - 1))"
    fi
    last_connected=$num
  fi
done
# Special case: Need to also handle a hole that's still open at EOF.
if [[ $state != connected ]] && (( num - last_connected > min_hole_size )); then
  echo "Found a hole running from $((last_connected + 1)) to $num"
fi
...emits, given your file on stdin (./detect-holes <in.txt):
Found a hole running from 7 to 13
Found a hole running from 21 to 29
See:
BashFAQ #1 - How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
The conditional expression -- the [[ ]] syntax used to make it safe to do string comparisons without quoting expansions.
Arithmetic comparison syntax -- valid in $(( )) in all POSIX-compliant shells; also available without the expansion side effects as (( )) as a bash extension.
This is the perfect use case for awk, since the machinery of line reading, column splitting, and matching is all built in. The only tricky bit is getting the command line argument to your script, but it's not too bad:
#!/usr/bin/env bash
awk -v window="$1" -F, '
BEGIN { if (window=="") {window = 1} }
$2=="disconnected"{if (consecutive==0){start=NR}; consecutive++}
$2!="disconnected"{if (consecutive>window){print start}; consecutive=0}
END {if (consecutive>window){print start}}'
The window value is supplied as the first command line argument; left out, it defaults to 1, which means "display the start of gaps with at least two consecutive disconnections". It could probably have a better name. You can give it 0 to include single disconnections. Sample output below. (Note that I added a series of 2 disconnections at the end to test the failure that Charles mentions.)
njv@organon:~/tmp$ ./tst.sh 0 < status.txt # any number of disconnections
7
17
21
31
njv@organon:~/tmp$ ./tst.sh < status.txt # at least 2 disconnections
7
21
31
njv@organon:~/tmp$ ./tst.sh 8 < status.txt # at least 9 disconnections
21
Awk solution:
detector.awk script:
#!/bin/awk -f
BEGIN { FS="," }
$2 == "disconnected" {
    if (f && NR-c==nr) c++;
    else { f=1; c++; nr=NR }
}
$2 == "connected" {
    if (f) {
        if (c > n) {
            printf "- Missing more than 5 \042connected\042 starting at %d.\n", nr
        }
        f=c=0
    }
}
Usage:
awk -f detector.awk -v n=5 status.txt
The output:
- Missing more than 5 "connected" starting at 7.
- Missing more than 5 "connected" starting at 21.

Write output of subprocess launched by a `screen` CLI Command to a log file?

I am launching a bunch of instances of the same script (generate_records.php) into screens. I am doing this to easily parallelize the processes. I would like to write the output of each of the PHP processes to a log file using something like &> log_$i (stdout and stderr).
My shell scripting is weak sauce, and I can't get the syntax correct. I keep getting the output of the screen, which is empty.
Example: launch_processes_in_screens.sh
max_record_id=300000000
# number of parallel processors to run
total_processors=10
# max staging companies per processor
(( num_records_per_processor = $max_record_id / $total_processors ))
i=0
while [ $i -lt $total_processors ]
do
    (( starting_id = $i * $num_records_per_processor + 1 ))
    (( ending_id = $starting_id + $num_records_per_processor - 1 ))
    printf "\n - Starting processor #%s starting at ID:%s and ending at ID: %s" "$i" "$starting_id" "$ending_id"
    screen -d -m -S "process_$i" php generate_records.php "$starting_id" "$num_records_per_processor" "FALSE"
    ((i++))
done
If the only reason you're using screen is to launch many processes in parallel, you can avoid it entirely and use & to start them in the background:
php generate_records.php "$starting_id" "$num_records_per_processor" FALSE &
You may also be able to remove some code by using parallel.
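If you go that route, a sketch of the relevant lines (the log file naming follows the &> log_$i idea from the question):
# inside the while loop, replacing the screen invocation:
php generate_records.php "$starting_id" "$num_records_per_processor" FALSE &> "log_$i" &
# after the loop, optionally wait until every background job has finished:
wait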

Generate time series in ISO 8601 format using the date command; how to deal with server system date origin offset?

I have the following bash function that generates an epoch list in ISO 8601 format on a machine that runs Ubuntu, and it works fine (isdate and isint are bash functions that test the input).
gen_epoch()
{
    ## USAGE: gen_epoch [start_date_iso] [end_date_iso] [increment_in_seconds]
    ##
    ## TASK : generate an epoch list (epoch list in isodate format).
    ## result on STDOUT: [epoch_list]
    ## error_code : 2 0
    ## test arguments
    if [ "$#" -ne 3 ]; then echo "$FUNCNAME: input error [nb_of_input]"; return 2
    elif [ $( isdate $1 &> /dev/null; echo $? ) -eq 2 ]; then echo "$FUNCNAME: argument error [$1]"; return 2
    elif [ $( isdate $2 &> /dev/null; echo $? ) -eq 2 ]; then echo "$FUNCNAME: argument error [$2]"; return 2
    elif [ $( isint $3 &> /dev/null; echo $? ) -eq 2 ]; then echo "$FUNCNAME: argument error [$3]"; return 2
    else local beg=$( TZ=UTC date --date="$1" +%s ); local end=$( TZ=UTC date --date="$2" +%s ); local inc=$3; fi
    ## generate epoch
    while [ $beg -le $end ]
    do
        local date_out=$( TZ=UTC date --date="UTC 1970-01-01 $beg secs" --iso-8601=seconds ); beg=$(( $beg + $inc ))
        echo ${date_out%+*}
    done
}
It generates the expected values for this command line example:
gen_epoch 2014-04-01T00:00:00 2014-04-01T07:00:00 3600
expected values:
2014-04-01T00:00:00
2014-04-01T01:00:00
2014-04-01T02:00:00
2014-04-01T03:00:00
2014-04-01T04:00:00
2014-04-01T05:00:00
2014-04-01T06:00:00
2014-04-01T07:00:00
However, I have tried this function on a server where I have no root privileges, and I found the following results:
2014-03-31T17:00:00
2014-03-31T18:00:00
2014-03-31T19:00:00
2014-03-31T20:00:00
2014-03-31T21:00:00
2014-03-31T22:00:00
2014-03-31T23:00:00
2014-04-01T00:00:00
and I have seen that the server's time origin is not at 1970-01-01T00:00:00.
Typing the TZ=UTC date --date="1970-01-01T00:00:00" +%s command gives the value -25200, which corresponds to a 7-hour lag, while it should give 0.
My question is: how could this problem be corrected on the server?
Could you help me find an equivalent solution for this function, assuming that I don't know which machine I am running it on, so I have no a priori knowledge of whether the system time is correct or not?
Not a complete answer but too long for a comment.
I guess that this particular server was incorrectly configured during setup. The problem is that the BIOS clock is set to local time while the system thinks it's in UTC (or vice versa); use hwclock to query the hardware clock settings.
If the system is configured incorrectly and you can't fix it for any reason (you don't have a superuser account or whatever), I'd suggest providing a "fixing" timezone description file with your software and specifying it in the TZ variable like this: TZ=:/path/to/fixing/timezone date --date="1970-01-01T00:00:00" +%s. Obviously you have to pre-calculate which TZ description file fixes the problem and use the proper one. Available timezones are usually stored in /usr/share/zoneinfo.
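A couple of quick checks that don't require root can confirm the misconfiguration (timedatectl assumes the server runs systemd; the second command is the same sanity check mentioned in the question):
timedatectl                                    # shows local time, universal time, RTC time and time zone settings
TZ=UTC date --date='1970-01-01T00:00:00' +%s   # prints 0 on a correctly configured machine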

Run cron job in non-silent mode?

I created a simple Linux script that essentially calls sqlplus and puts the results in variable X. I then analyze X and determine whether or not I need to send out a syslog message.
The script works perfectly when I run it from the command line as "oracle"; however when I use crontab as "oracle" and add it to my job, X isn't getting filled.
I could be wrong, but I believe the issue is since cron runs things in silent mode, X isn't actually getting filled, but when I run it manually it is.
Here's my crontab -l result (as oracle):
0,30 * * * * /scripts/isOracleUp.sh syslog
Here's my full script:
#Created by: hatguy
#Created date: May 8, 2012
#File Attributes: Must be executable by "oracle"
#Description: This script is used to determine if Oracle is up
# and running. It does a simple select on dual to check this.
DATE=`date`
USER=$(whoami)
if [ "$USER" != "oracle" ]; then
#note: $0 is the full path of whatever script is being run.
echo "You must run this as oracle. Try \"su - oracle -c $0\" instead"
exit;
fi
X=`sqlplus -s '/ as sysdba'<<eof
set serveroutput on;
set feedback off;
set linesize 1000;
select count(*) as count_col from dual;
EXIT;
eof`
#This COULD be more elegant. The issue I'm having is that I can't figure out
#which hidden characters are getting fed into X, so instead what I did was
#check the string length (26) and checked that COUNT_COL and 1 were where I
#expected.
if [ ${#X} -eq 26 ] && [ ${X:1:10} = "COUNT_COL" ] && [ ${X:24:3} = "1" ] ; then
    echo "Connected"
    #log to a text file that we checked and confirmed connection
    if [ "$1" == "syslog" ]; then
        echo "$DATE: Connected" >> /scripts/log/isOracleUp.log
    fi
else
    echo "Not Connected"
    echo "Details: $X"
    if [ "$1" == "syslog" ]; then
        echo "Sending this to syslog"
        echo "==========================================================" >> /scripts/log/isOracleUp.log
        echo "$DATE: Disconnected" >> /scripts/log/isOracleUp.log
        echo "Message from sqlplus: $X" >> /scripts/log/isOracleUp.log
        /scripts/sendMessageToSyslog.sh "PROD Oracle is DOWN!!!"
        /scripts/sendMessageToSyslog.sh "PROD Details: $X"
    fi
fi
Here's output when run as oracle from terminal:
Wed May 9 10:03:07 MDT 2012: Disconnected
Message from sqlplus: select count(*) as count_col from dual
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0
Here's my log output when run through oracle's crontab job:
Wed May 9 11:00:04 MDT 2012: Disconnected
Message from sqlplus:
And to syslog:
PROD Details:
PROD Oracle is DOWN!!!
Any help would be appreciated as I'm a new linux user and this is my first linux script.
Thanks!
My Oracle DB skills are pretty limited, but don't you need to set ORACLE_SID and ORACLE_HOME?
Check these variables from the command line, set them within your cron environment (or at the top of the script), and retry.
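For example, a sketch of what could go near the top of isOracleUp.sh (the ORACLE_HOME path and SID below are placeholders; copy the real values from a working interactive session, e.g. with echo $ORACLE_HOME):
# Placeholders -- substitute the values from your own environment.
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/dbhome_1
export ORACLE_SID=PROD
export PATH="$ORACLE_HOME/bin:$PATH"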
