Strange output in shell script - linux

I'm encountering a very strange thing while writing a shell script.
Initially I have the following script.
OUTPUT_FOLDER=/home/user/output
HOST_FILE=/home/user/host.txt
LOG_DIR=/home/user/log
SLEEPTIME=60
mkdir -p $OUTPUT_FOLDER
while true
do
    rm -f $OUTPUT_FOLDER/*
    DATE=`date +"%Y%m%d"`
    DATETIME=`date +"%Y%m%d_%H%M%S"`
    pssh -h $HOST_FILE -o $OUTPUT_FOLDER
    while read host
    do
        numline=0
        while read line
        do
            if [[ $numline == 0 ]]
            then
                FIELD1=`echo $line | cut -d':' -f2 | cut =d' ' -f2`
            fi
            ((numline+=1))
        done < ${OUTPUT_FOLDER}/${host}
        echo "$DATETIME,$FIELD1" >> ${LOG_DIR}/${DATE}.log
    done < $HOST_FILE
    sleep $SLEEPTIME
done
When I run this script, I see the $DATETIME,$FIELD1 values in my log file every 60 seconds, as expected.
What is very strange is that after the first minute has passed, every 30 seconds or so I also see a line like 20160920_120232,, meaning there was output where there shouldn't have been; I guess FIELD1 is empty because I had deleted the contents of my output folder.
What is even more strange is that while trying to debug this, I added a few more echo statements printing to my log file, and then deleted those lines. However, they continued to be printed every 30 seconds or so after the first minute had passed.
What is stranger still is that I then commented out everything inside the while true block, that is,
OUTPUT_FOLDER=/home/user/output
HOST_FILE=/home/user/host.txt
LOG_DIR=/home/user/log
SLEEPTIME=60
mkdir -p $OUTPUT_FOLDER
while true
do
    : << 'END'
    rm -f $OUTPUT_FOLDER/*
    DATE=`date +"%Y%m%d"`
    DATETIME=`date +"%Y%m%d_%H%M%S"`
    pssh -h $HOST_FILE -o $OUTPUT_FOLDER
    while read host
    do
        numline=0
        while read line
        do
            if [[ $numline == 0 ]]
            then
                FIELD1=`echo $line | cut -d':' -f2 | cut =d' ' -f2`
            fi
            ((numline+=1))
        done < ${OUTPUT_FOLDER}/${host}
        echo "$DATETIME,$FIELD1" >> ${LOG_DIR}/${DATE}.log
    done < $HOST_FILE
    sleep $SLEEPTIME
END
done
Even with this script, where I expect nothing to be printed to my log file, I still see the output of the echo statements I had deleted, as well as the lines with the empty FIELD1. I have checked that I'm running the correct version of the script each time.
What is going on?

I'm not sure whether it is the actual reason for the mess, but you have an incorrect usage of cut on the line that sets FIELD1, which could have mangled its value inadvertently:
FIELD1=`echo $line | cut -d':' -f2 | cut =d' ' -f2`
#                                        ^ incorrect usage of the delimiter '-d' flag
It should have been used as:
FIELD1=`echo $line | cut -d':' -f2 | cut -d' ' -f2`
#                                        ^ proper usage of the '-d' flag with a space delimiter
I tried to capture the exit codes of the piped commands to see whether the last command could have succeeded, and below is my observation:
echo $line | cut -d':' -f2 | cut =d' ' -f2; echo "${PIPESTATUS[@]}"
cut: =d : No such file or directory   # --> error from the failed 'cut' command
0 0 1                                 # --> return codes of each of the piped commands
Also, a bash purist would have avoided the legacy `` style of command substitution and used the $(cmd) syntax, like:
FIELD1=$(echo "$line" | cut -d':' -f2 | cut -d' ' -f2)
Another, simpler way would be to avoid cut and echo altogether and use pure bash string manipulation.
Assuming your $line contains a colon- and space-delimited string, e.g. "What Hell:Hello World", and you are trying to extract the World part as far as I can see, just do
FIELD1="${line##* }"   # strips off everything up to and including the last space
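For illustration, here is how this and the related expansions behave on a made-up string (a quick sketch, not part of the original answer):
line="What Hell:Hello World"
echo "${line##* }"   # -> World         (strip the longest prefix ending in a space)
echo "${line#*:}"    # -> Hello World   (strip the shortest prefix ending at the first ':')
echo "${line%%:*}"   # -> What Hell     (strip the longest suffix starting at the first ':')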

I tried to reproduce it and couldn't. There is no reason for this script (commented out as you have it) to produce any output.
There are a couple of possibilities:
- As others pointed out, some other script may be writing to your log file.
- You may have left a few instances of this script running in other terminals.
- You may be running this script from crontab, e.g. once every 5 minutes.
To rule these out, do a "ps -aef" and check for any processes resembling the name of your script, as sketched below. To catch things launched from crontab, you may have to watch the output of "ps -aef" a bit longer (or just check the crontab entries).
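For example, something along these lines should surface leftover copies (a sketch; myscript.sh is a placeholder for your actual script name):
pgrep -af myscript.sh          # list PIDs and full command lines of matching processes
ps -aef | grep '[m]yscript'    # the [m] trick keeps grep from matching itself
crontab -l | grep myscript     # check your own crontab entries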


Grep function not stopping with head pipe

So I'm currently trying to grep a single result from a random file in a specific directory. The grepping works just fine and the expected output file is populated as expected, but for some reason, even after the output file has been filled, the process won't stop. This is the grep command where the program seems to get stuck:
searchFILE(){
    case $2 in
    pref)
        echo "Populating output file: $3-$1.data.out"
        dataOutputFile="$3-$1.data.out"
        zgrep -a "\"someParameter\"\:\"$1\"" /folder/anotherFolder/filetemplate.log.* | zgrep -a "\"parameter2\"\:\"$3\"" | head -1 > $dataOutputFile
        ;;
    *)
        echo "Unrecognized command"
        ;;
    esac
    echo "Query finished"
}
What is currently happening is that the output file is being populated as expected with the head pipe, but for some reason I'm not getting the "Query finished" message, and the process seems not to stop at all.
grep does not know that head -n1 is no longer reading from the pipe until it attempts to write to the pipe, which it will only do if another match is found; there is no direct communication between the processes. grep will eventually stop, but only once all the input has been read, or a second match is found and the write fails with EPIPE, or some other error occurs.
You can watch this happen in a simple pipeline like this:
cat /dev/urandom | grep -ao "12[0-9]" | head -n1
With a sufficiently rare pattern, you will observe a delay between output and exit.
One solution is to change your stop condition. Instead of waiting for SIGPIPE as your pipeline does, wait for grep to match once using the -m1 option:
cat /dev/urandom | grep -ao -m1 "12[0-9]"
I saw better performance results with the zcat myZippedFile | grep whatever approach...
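Applied to the pipeline from the question, the -m1 idea might look something like this (a sketch keeping the original placeholder paths; note that only the first command needs to be zgrep, and -m1 on the final grep makes it exit after the first match, which then terminates the upstream commands with EPIPE on their next write):
zgrep -a "\"someParameter\":\"$1\"" /folder/anotherFolder/filetemplate.log.* \
    | grep -a -m1 "\"parameter2\":\"$3\"" > "$dataOutputFile"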
The first thing to try differently is to pipe through | head -z --lines=1
The reason is that it works with null-terminated lines instead of newlines (just in case).
My example script below worked (you could drop the case statement to make it simpler). If I hold onto $1 $2 inside functions, things go wrong. I use parameter $names and only touch $1 $2 $@ once, because it also goes wrong for me if I don't; and in any case you can then shift over $@ and catch arguments. The $@ in the script itself is not the same as the arguments inside bash functions.
grep searching for two or more parameters in any order means using grep twice; in your case zgrep | grep. The second grep is a normal grep! You only need the first grep to be zgrep to do the unzip. Your question is simpler if you drop the case statement, as bash case scares people off: bash was always an ugly lady that works well for short scripts.
zgrep searches text or compressed text, but LINUX-style and WINDOWS-style newlines are not the same, so use dos2unix to convert files so that newline matching works. I use a compressed file simply because it is strange and rare to see zgrep, so here it is demonstrated in a shell script with a compressed file! It works for me. I changed a few things, like >> and "sort -u", but you can obviously change them back.
#!/usr/bin/env bash
# Search for egA AND egB using option go
# COMMAND LINE: ./zgrp egA go egB
A="$1"
cOPT="$2"   # expecting case go
B="$3"
LOG="./filetemplate.log"   # use parameters for long names.
# Generate some data with gzip and delete the temporary file.
echo "\"pramA\":\"$A\" \"pramB\":\"$B\"" >> $B$A.tmp
rm -f ${LOG}.A; tar czf ${LOG}.A $B$A.tmp
rm -f $B$A.tmp
# Use parameterised $names, not $1 etc., because you may want to shift etc.
searchFILE()
{
    outFile="$B-$A.data.out"
    case $cOPT in
    go) # This is zgrep | grep, NOT zgrep | zgrep
        zgrep -a "\"pramA\":\"$A\"" ${LOG}.* | grep -a "\"pramB\":\"$B\"" | head -z --lines=1 >> $outFile
        sort -u $outFile > ${outFile}.sorted   # sort unique on your output.
        ;;
    *)  echo -e "ERROR second argument must be go.\n Usage: ./zgrp egA go egB"
        exit 9
        ;;
    esac
    echo -e "\n ============ Done: $0 $@ Fin. ============="
}
searchFILE "$@"
cat ${outFile}.sorted

how to remove the extension of multiple files using cut in a shell script?

I'm studying how to use 'cut'.
#!/bin/bash
for file in *.c; do
    name = `$file | cut -d'.' -f1`
    gcc $file -o $name
done
What's wrong with this code?
There are a number of problems on this line:
name = `$file | cut -d'.' -f1`
First, to assign a shell variable, there should be no whitespace around the assignment operator:
name=`$file | cut -d'.' -f1`
Secondly, you want to pass $file to cut as a string, but what you're actually doing is trying to run $file as if it were an executable program (which it may very well be, but that's beside the point). Instead, use echo or shell redirection to pass it:
name=`echo $file | cut -d. -f1`
Or:
name=`cut -d. -f1 <<< $file`
I would actually recommend that you approach it slightly differently. Your solution will break if you get a file like foo.bar.c. Instead, you can use shell expansion to strip off the trailing extension:
name=${file%.c}
Or you can use the basename utility:
name=`basename $file .c`
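Putting that together, the fixed loop might look like this (a sketch using the parameter-expansion form):
#!/bin/bash
for file in *.c; do
    name=${file%.c}        # strip the trailing .c extension
    gcc "$file" -o "$name"
done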
You should use command substitution (https://www.gnu.org/software/bash/manual/bashref.html#Command-Substitution) to execute a command within a script.
With that, the code looks like this:
#!/bin/bash
for file in *.c; do
    name=$(echo "$file" | cut -f 1 -d '.')
    gcc $file -o $name
done
The echo sends $file to standard output.
The pipe then feeds that output to the next command.
The cut command with the '.' delimiter splits the file name and keeps the first part.
This is assigned to the name variable.
Hope this answer helps

Bash - Piping output of command into while loop

I'm writing a Bash script where I need to look through the output of a command and do certain actions based on that output. For clarity, this command will output a few million lines of text and it may take roughly an hour or so to do so.
Currently, I'm executing the command and piping it into a while loop that reads a line at a time and looks for certain criteria. If a line meets the criteria, the loop updates a .dat file and reprints to the screen. Below is a snippet of the script.
eval "$command"| while read line ; do
if grep -Fq "Specific :: Criterion"; then
#pull the sixth word from the line which will have the data I need
temp=$(echo "$line" | awk '{ printf $6 }')
#sanity check the data
echo "\$line = $line"
echo "\$temp = $temp"
#then push $temp through a case statement that does what I need it to do.
fi
done
So here's the problem: the sanity check on the data is showing weird results. It is printing lines that don't contain the grep criteria.
To make sure that my grep statement is working properly, I grep the log file that contains a record of the text that is output by the command and it outputs only the lines that contain the specified criteria.
I'm still fairly new to Bash so I'm not sure what's going on. Could it be that the command is force feeding the while loop a new $line before it can process the $line that met the grep criteria?
Any ideas would be much appreciated!
How does grep know what $line looks like? As written, it is never given the line at all; it reads from the loop's stdin and silently consumes input meant for read. Pipe the line to it explicitly:
if ( printf '%s\n' "$line" | grep -Fq "Specific :: Criterion"); then
But I can't help feeling that you are overcomplicating this a lot.
function process() {
    echo "I can do anything I want"
    echo " per element $1"
    echo " that I want here"
}
export -f process
$command | grep -F "Specific :: Criterion" | awk '{print $6}' | xargs -I % -n 1 bash -c "process %";
Run the command, filter only matching lines, and pull the sixth element. Then, if you need to run arbitrary code on each value, send it to a function via xargs (the export -f makes the function visible in subprocesses).
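One caveat with the xargs -I % ... bash -c "process %" form: the matched text is substituted directly into the command string, so whitespace or shell metacharacters in it can break the command or even execute unintended code. A slightly safer variant (my suggestion, not part of the original answer) passes the value as a positional argument instead:
$command | grep -F "Specific :: Criterion" | awk '{print $6}' \
    | xargs -I {} -n 1 bash -c 'process "$1"' _ {}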
What are you applying the grep on?
Modify
if grep -Fq "Specific :: Criterion"; then
as below
if ( echo $line | grep -Fq "Specific :: Criterion" ); then
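Or, avoiding the extra echo (and the word-splitting of the unquoted $line), with a here-string:
if grep -Fq "Specific :: Criterion" <<< "$line"; then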

bash: grep in loop does not grep

I have a (probably obvious/stupid) problem:
I want to loop over a list of paths, cut them and use the strings to grep in log files.
While every step works fine on its own, and processed manually it results in hits, grep does not find anything when run in the loop.
for FILE in `awk -F "/" '{print $13}' /tmp/files_not_visible.uniq`; do
    echo -e "\n\n$FILE\n";
    grep "$FILE" /var/log/PATH/FILENAME-2015.12.*;
done
I also tried a while loop as the reverse exercise, but it fails with the same non-result:
while read FILE; do
    echo $FILE;
    echo $FILE | awk -F "/" '{print $13}' | grep -f - /var/log/PATH/FILENAME-2015.12.* ;
done < /tmp/files_not_visible.uniq
So I guess there is some systematic issue with how I handle the search string in grep?
Found it: the list of files contained an invisible character as the last character of each line! Probably the user who sent me the list of files created it on some other OS, and of course I copied only the visible characters when testing by hand.
Fixed the loop by cutting the last character of each line with
sed -e 's/.$//'
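That invisible trailing character is typically a carriage return from a file with Windows (CRLF) line endings. If that is what it was, stripping \r specifically (or converting the file once with dos2unix) is a bit more targeted than unconditionally deleting the last character of every line:
tr -d '\r' < /tmp/files_not_visible.uniq > /tmp/files_not_visible.clean
# or, inside the loop, per line:
FILE=${FILE%$'\r'}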

What's an alternative to echo grep in parsing a running log?

I'm currently figuring my way around a bash script (sorry, can't use other languages like perl) to keep track of a running log during a server startup. Basically, I have to trigger certain events depending on whether or not I run into certain strings or patterns while the log is being written. Currently, I have this code:
LOG=path_to_logfile
LINE1="[1-9][0-9]* some string"
LINE2="another string"
LINE3="third string"
tail -fn0 $LOG | \
while read line
do
    echo $line | grep "$LINE1" || echo $line | grep "$LINE2" || echo $line | grep "$LINE3"
    if [ $? = 0 ]
    then
        TMP=<echo line above>
        ... bunch of conditional statements ...
    fi
done
However, this is kinda slow; by the time the line I need to track is detected by the chain of echo/grep combinations, it's waaaay after the server has already started up. What's a good alternative to the above? I've read that awk should be used, but when I tried writing it in awk, either I wrote it wrong or the processing was also taking too much time to finish.
Any help will be appreciated. Thanks!
Rather than calling grep (potentially several times) on each line, let bash do the regular expression matching.
LOG=path_to_logfile
LINE1="[1-9][0-9]* some string"
LINE2="another string"
LINE3="third string"
tail -fn0 $LOG | while read line
do
    if [[ $line =~ $LINE1|$LINE2|$LINE3 ]]; then
        TMP=<echo line above>
        ... bunch of conditional statements ...
    fi
done
I'd try something like this instead:
tail -fn0 $LOG | egrep "$LINE1|$LINE2|$LINE3" | \
while read TMP
do
    ...
done
That way, the while read loop, which at a guess is going to be the slowest part of this whole operation, is only invoked when egrep actually finds a matching line in the input log.
You can pass grep multiple -e match patterns, which are ORed together to decide whether a line matches:
tail -f -n0 "$LOG" | grep -e "$LINE1" -e "$LINE2" -e "$LINE3" | while IFS= read -r line
do
# Do something with each matching $line
done
