I have a bash script I use to check if certain forever processes are running. The script is basically:
#!/bin/bash
processIsRunning=$(forever list | grep -q 'process/index.js')
if [ -n $processIsRunning]; then
echo 'Processes are running'
else
echo 'Processes are not running'
fi
I get this error though:
events.js:72
throw er; // Unhandled 'error' event
^
Error: write EPIPE
at errnoException (net.js:904:11)
at Object.afterWrite (net.js:720:19)
If I remove the '-q' flag from my grep command in line 3 then I do not get a pipe error, but instead I get an error about it trying to run the grep result as a command instead of just checking for the length of the output to be greater than 0.
Does anyone know why the -q parameter would cause an EPIPE error?
UPDATE BASED ON COMMENTS:
My mistake, I'm pretty new to bash and was trying to learn how to use if statements. I originally had it directly in the if statement but took it out into a variable because it wasn't working (turns out it was failing because of my lack of spaces, didn't realize they are a requirement in bash). I clearly didn't port it out properly. I'm currently using just grep without -q and then checking the length of the output and that is working well.
Try something like this:
forever list | grep -q 'process/index.js'
if [ $? -eq 0 ]; then
echo 'Processes are running'
else
echo 'Processes are not running'
fi
grep -q tells grep not to write anything to standard output; its exit status is 0 when a match is found and non-zero otherwise.
$? holds the exit status of the last executed command.
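For comparison, here is a minimal sketch of the approach the question's update settles on: capture grep's output and test whether it is non-empty. Because grep without -q reads the listing to the end, the producer is not left writing into a pipe that grep has already closed, which is the usual way a -q pipeline ends in EPIPE. list_processes is a hypothetical stand-in for forever list, and its sample lines are invented, so the sketch runs without forever installed.

```shell
#!/bin/bash

# Hypothetical stand-in for `forever list`; the lines are invented sample data.
list_processes() {
    printf '%s\n' \
        'data: [0] abcd /usr/bin/node process/index.js' \
        'data: [1] efgh /usr/bin/node other/app.js'
}

# Succeed if the listing contains the fixed string given as $1.
# The matches are captured, so grep consumes the whole listing.
process_is_running() {
    local matches
    matches=$(list_processes | grep -F -- "$1")
    [ -n "$matches" ]
}

if process_is_running 'process/index.js'; then
    echo 'Processes are running'
else
    echo 'Processes are not running'
fi
```

Calling the function directly in the if keeps the check and the decision on one line, so there is no need to inspect $? afterwards.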
The script below is not behaving as expected:
if docker pull docker.pkg.github.com/private-repo/centos7 | grep -q 'Error response from daemon: unauthorized'; then
echo "matched"
else
echo "unmatched"
fi
output
Error response from daemon: unauthorized
unmatched
expected output
matched
I have followed this post
What I have tried:
I replaced "docker pull docker.pkg.github.com/private-repo/centos7" with echo "Error response from daemon: unauthorized", and it gives the expected output, matched.
So what I understand is that the output of the command "docker pull docker.pkg.github.com/private-repo/centos7" is not captured by grep, but I don't understand why.
I've also tried this but same result:
docker pull docker.pkg.github.com/private-repo/centos7 | grep 'Error response from daemon: unauthorized' &> /dev/null
if [ $? == 0 ]; then
echo "matched"
else
echo "unmatched"
fi
Working solution suggested by @Gordon Davisson:
docker pull docker.pkg.github.com/private-repo/centos7 2>&1 | grep 'Error response from daemon: unauthorized' &> /dev/null
if [ $? == 0 ]; then
echo "matched"
else
echo "unmatched"
fi
output:
matched
It’s just as @Gordon Davisson said; please give him the answer credit if he chooses to claim it. I’m just making the answer more visible.
This is an oversimplification, but it will get the point across. All “outputs” are sent to the terminal through stdout and stderr.
When you use the basic pipe-syntax (|), the only thing actually being processed by the pipe is the stdout. The stderr will still be printed to the terminal. In your case this is undesirable behavior.
The fix is to force the stderr into the stdout BEFORE the pipe; the syntax for this is 2>&1 (or, in Bash 4 and later, the shorthand |&). This works around the pipe’s limitation of only being able to process stdout, and it also prevents the stderr leak into the terminal.
if docker pull… 2>&1 | grep -q…
<SNIPPED>
OR IN BASH
if docker pull… |& grep -q…
<SNIPPED>
The reason your 2nd-attempted solution didn’t work was because pipes and redirections are processed in-order from left-to-right.
if docker pull… | grep… &> /dev/null
# ^ LEAK HAPPENS HERE, FIX COMES TOO LATE
<SNIPPED>
In other words, the stderr leak into the terminal had already occurred BEFORE you redirected grep’s output, and the error message never came from grep in the first place.
You might have some luck with just searching for Error instead of the whole string and see if you got something wrong with the way you typed out the string.
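The behaviour described in this answer can be reproduced without docker. In the sketch below, fake_pull is a hypothetical stand-in for the failing docker pull: it only assumes that the real command reports its error on stderr, as the question's output suggests.

```shell
#!/bin/bash

# Hypothetical stand-in for the failing `docker pull`: the error goes to stderr.
fake_pull() {
    echo 'Error response from daemon: unauthorized' >&2
    return 1
}

# Plain pipe: only stdout reaches grep, so there is nothing to match.
matches_stdout_only() {
    fake_pull | grep -q 'Error response from daemon: unauthorized'
}

# 2>&1 merges stderr into stdout before the pipe, so grep sees the message.
matches_merged() {
    fake_pull 2>&1 | grep -q 'Error response from daemon: unauthorized'
}

if matches_stdout_only; then echo 'stdout only: matched'; else echo 'stdout only: unmatched'; fi
if matches_merged; then echo 'merged: matched'; else echo 'merged: unmatched'; fi
```

The first check still prints the error on the terminal and reports unmatched; the second swallows the message into the pipe and reports matched.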
I need to write a shell script that starts a process in the background and parses its output until it can confirm that the output does not contain any error. The process keeps running in the background because it needs to listen on ports. If the process output contains an error, the script should exit.
If the output of the previous process contained no errors (that is, the process was able to establish a connection to the DB), the script should run the next command.
I have tried many approaches suggested on Stack Overflow, including:
https://unix.stackexchange.com/questions/12075/best-way-to-follow-a-log-and-execute-a-command-when-some-text-appears-in-the-log
https://unix.stackexchange.com/questions/45941/tail-f-until-text-is-seen
https://unix.stackexchange.com/questions/137030/how-do-i-extract-the-content-of-quoted-strings-from-the-output-of-a-command
/home/build/a_process 2>&1 | tee "output_$(date +"%Y_%m_%d").log"
tail -fn0 "output_$(date +"%Y_%m_%d").log" | \
while read line ; do
if [ echo "$line" | grep "Listening" ]
then
/home/build/b_process 2>&1 | tee "output_script_$(date +"%Y_%m_%d").log"
elif [ echo "$line" | grep "error occurred in load configuration" ] || [ echo "$line" | grep "Binding Failure" ]
then
sl -e
fi
done
The problem is that the process keeps running even after it has printed the text I was searching for, so the script gets stuck parsing the string and never stops watching or tailing the output. As a result, it never executes the next command.
On the surface, the issue is with the tee command (a_process ... | tee).
Recall that, for a pipeline, the shell will:
1. Create the pipes between the commands
2. Wait for the LAST command to finish
Since tee will not finish until a_process is done, and since a_process is a daemon, your script may wait forever (at least, until a_process exits).
In this case, consider sending the whole pipeline to the background.
log_file=output_$(date +"%Y_%m_%d").log
( /home/build/a_process 2>&1 | tee "$log_file" ) &
tail -fn0 "$log_file" |
...
Side note: consider setting the log file into a variable. This will make it easier to maintain (and understand) the script.
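As a rough sketch of the whole flow, the example below starts a long-running process in the background and then waits for either a success marker or a failure marker in its log. It deliberately swaps tail -f for a simple polling loop with a timeout, and writes the log with a plain redirect instead of tee, so it is a variation on the approach above rather than the same technique. fake_daemon, wait_for_startup, the port number and the timeout are all hypothetical; the marker strings come from the question.

```shell
#!/bin/bash

# Hypothetical stand-in for a_process: logs a startup line, then keeps running.
fake_daemon() {
    echo 'Connecting to DB'
    sleep 1
    echo 'Listening on port 8080'
    sleep 5
}

# Poll log file $1 until a marker appears or $2 seconds have passed.
# Returns 0 on success, 1 on a failure marker, 2 on timeout.
wait_for_startup() {
    local log_file=$1 timeout=$2 waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if grep -q -e 'error occurred in load configuration' \
                   -e 'Binding Failure' "$log_file" 2>/dev/null; then
            return 1
        fi
        if grep -q 'Listening' "$log_file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 2
}

log_file=$(mktemp)
fake_daemon > "$log_file" 2>&1 &   # background, so the script does not block
daemon_pid=$!

if wait_for_startup "$log_file" 10; then
    echo 'started; safe to run the next command'
else
    echo 'startup failed or timed out' >&2
fi

# Cleanup for this demo only; a real daemon would be left running.
kill "$daemon_pid" 2>/dev/null || true
wait "$daemon_pid" 2>/dev/null || true
rm -f "$log_file"
```

Polling re-reads the whole log each second, which is fine for a short startup log but wasteful for a large one; the timeout keeps the script from hanging if neither marker ever appears.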
I need to have a .sh file that will echo 0 if my python service is not running. I know that pgrep is the command I want to use, but I am getting errors using it.
if [ [ ! $(pgrep -f service.py) ] ]; then
echo 0
fi
Is what I found online, and I keep getting the error
./test_if_running.sh: line 3: syntax error near unexpected token `fi'
./test_if_running.sh: line 3: `fi;'
When I type
./test_if_running.sh
The issue in your code is the nested [ ... ]. Also, as @agc has noted, what we need to check here is the exit code of pgrep and not its output. So, the right way to write the if is:
if ! pgrep -f service.py &> /dev/null; then
# service.py is not running
echo 0
fi
This is a bit simple, but why not just print a NOT'd exit code, like so:
! pgrep -f service.py &> /dev/null ; echo $?
As a bonus it'll print 1 if the service is running.
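Both answers rely only on pgrep's exit status. A minimal sketch that wraps the check in a function follows; running_flag is a hypothetical name. Note that pgrep -f treats its argument as an extended regular expression matched against the full command line, so the dot in service.py matches any character.

```shell
#!/bin/bash

# Print 1 if a process whose command line matches pattern $1 exists, else 0.
# Only pgrep's exit status is used; its output (the PIDs) is discarded.
running_flag() {
    if pgrep -f "$1" > /dev/null 2>&1; then
        echo 1
    else
        echo 0
    fi
}

running_flag 'service.py'
```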
I'm writing a shell script to parse through log file and pull out all instances where sudo succeeded and/or failed. I'm realizing now that this probably would've been easier with shell's equivalent of regex, but I didn't want to take the time to dig around (and now I'm paying the price). Anyway:
sudobool=0
sudoCount=0
for i in `cat /var/log/auth.log`;
do
for word in $i;
do
if $word == "sudo:"
then
echo "sudo found"
sudobool=1;
sudoCount=`expr $sudoCount + 1`;
fi
done
sudobool=0;
done
echo "There were " $sudoCount " attempts to use sudo, " $sudoFailCount " of which failed."
So, my understanding of the code I've written: read auth.log and split it up line by line, which are stored in i. Each word in i is checked to see if it is sudo:, if it is, we flip the bool and increment. Once we've finished parsing the line, reset the bool and move to the next line.
However, judging by my output, the shell is trying to execute the individual words of the log file, typically returning '$word : not found'.
Why don't you use grep for this?
grep sudo /var/log/auth.log
If you want a count, pipe it to wc -l:
grep sudo /var/log/auth.log | wc -l
Better still, use grep's -c option, which prints how many lines containing sudo were found:
grep -c sudo /var/log/auth.log
Or maybe I am missing something simple here?
EDIT: After scrolling I saw $sudoFailCount. Do you want to count how many failed attempts were made to use sudo? You have not defined any value for $sudoFailCount in your script, so it will print nothing. You are also missing the test brackets [[ ]] around the condition in your if statement.
Expanding on Sudhi's answer, here's a one-liner:
$ echo "There were $(grep -c ' sudo: ' /var/log/auth.log) attempts to use sudo, $(grep -c ' sudo: .*authentication failure' /var/log/auth.log) of which failed."
There were 17 attempts to use sudo, 1 of which failed.
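To try the same two patterns without touching /var/log/auth.log, the sketch below runs them against a small temporary file. The log lines are invented sample data, and sudo_summary is a hypothetical helper name. grep -c counts matching lines, so this is a line count rather than a count of distinct sudo invocations.

```shell
#!/bin/bash

# Print a one-line summary for the log file given as $1.
sudo_summary() {
    local log=$1 total failed
    total=$(grep -c ' sudo: ' "$log")
    failed=$(grep -c ' sudo: .*authentication failure' "$log")
    echo "There were $total attempts to use sudo, $failed of which failed."
}

# Invented sample data standing in for /var/log/auth.log.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Jan  6 10:01:02 host sudo:    alice : TTY=pts/0 ; PWD=/home/alice ; USER=root ; COMMAND=/bin/ls
Jan  6 10:02:10 host sshd[411]: Accepted publickey for alice
Jan  6 10:03:44 host sudo: pam_unix(sudo:auth): authentication failure; logname=bob uid=1001
Jan  6 10:04:00 host sudo:      bob : TTY=pts/1 ; PWD=/home/bob ; USER=root ; COMMAND=/bin/cat /etc/hosts
EOF

sudo_summary "$sample_log"
# prints: There were 3 attempts to use sudo, 1 of which failed.
rm -f "$sample_log"
```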
Your error message arises because the condition in your if statement is not wrapped in a test: you need to put it in [[ ... ]] brackets. Without them, the shell tries to run the expanded $word as a command.
Using the pattern matching in bash:
#!/bin/bash
sudoCount=0
while IFS= read -r line; do
sudoBool=0
if [[ "$line" = *sudo:* ]]; then
sudoBool=1
(( sudoCount++ ))
# do something with sudoBool ?
fi
done < /var/log/auth.log
echo "There were $sudoCount attempts to use sudo."
I'm not intimately familiar with auth.log. What is the pattern to determine success or failure?
I have a bash script that checks some log files created by a cron job that have time stamps in the filename (down to the second). It uses the following code:
CRON_LOG=$(ls -1 $LOGS_DIR/fetch_cron_{true,false}_$CRON_DATE*.log 2> /dev/null | sed 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*/\1 &/' | sort -n | cut -d ' ' -f2- | tail -1 )
if [ -f "$CRON_LOG" ]; then
printf "Checking $CRON_LOG for errors\n"
else
printf "\n${txtred}Error: cron log for $CRON_NOW does not exist.${txtrst}\n"
printf "Either the specified date is too old for the log to still be around or there is a problem.\n"
exit 1
fi
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
if [ -z "$CRIT_ERRS" ]; then
printf "%74s[${txtgrn}PASS${txtrst}]\n"
else
printf "%74s[${txtred}FAIL${txtrst}]\n"
printf "Critical errors detected! Outputting to console...\n"
echo $CRIT_ERRS
fi
So this bit of code works fine, but I'm trying to clean up my scripts now and implement set -e at the top of all of them. When I do it to this script, it exits with error code 1. Note that I have errors from the first statement dumping to /dev/null. This is because some days the file has the word "true" and other days "false" in it. Anyway, I don't think this is my problem because the script outputs "Checking xxxxx.log for errors." before exiting when I add set -e to the top.
Note: the $CRON_DATE variable is derived from user input. I can run the exact same statement from the command line "$./checkcron.sh 01/06/2010" and it works fine without the set -e statement at the top of the script.
UPDATE: I added "set -x" to my script and narrowed the problem down. The last bit of output is:
Checking /map/etl/tektronix/logs/fetch_cron_false_010710054501.log for errors
++ cat /map/etl/tektronix/logs/fetch_cron_false_010710054501.log
++ grep ERROR
++ grep -v 'Duplicate tracking code'
+ CRIT_ERRS=
[1]+ Exit 1 ./checkLoad.sh...
So it looks like the problem is occurring on this line:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
Any help is appreciated. :)
Thanks,
Ryan
Adding set -x, which prints a trace of the script's execution, may help you diagnose the source of the error.
Edit:
Your grep is returning an exit code of 1 since it's not finding the "ERROR" string.
Edit 2:
My apologies regarding the colon. I didn't test it.
However, the following works (I tested this one before spouting off) and avoids calling the external cat. Because you're setting a variable using the results of a subshell and set -e looks at the subshell as a whole, you can do this:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code"; true)
bash -c 'f=`false`; echo $?'
1
bash -c 'f=`true`; echo $?'
0
bash -e -c 'f=`false`; echo $?'
bash -e -c 'f=`true`; echo $?'
0
Note that backticks (and $()) "return" the error code of the last command they run. Solution:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code" | cat)
Redirecting error messages to /dev/null does nothing about the exit status returned by the script. The reason your ls command isn't causing the error is because it's part of a pipeline, and the exit status of the pipeline is the return value of the last command in it (unless pipefail is enabled).
Given your update, it looks like the command that's failing is the last grep in the pipeline. grep only returns 0 if it finds a match; otherwise it returns 1, and if it encounters an error, it returns 2. This is a danger of set -e: things can fail even when you don't expect them to, because commands like grep return a non-zero status even if there hasn't been an error. set -e also fails to exit on errors earlier in a pipeline, and so may miss some errors.
The solutions given by geocar or ephemient (piping through cat or using || : to ensure that the last command in the pipe returns successfully) should help you get around this, if you really want to use set -e.
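A minimal sketch of the `|| true` variant under set -e follows. critical_errors is a hypothetical helper name, and the log lines are invented sample data. One trade-off to be aware of: `|| true` also hides grep's exit status 2, so a genuine grep error would be masked along with the harmless "no match" case.

```shell
#!/bin/bash
set -e

# Print ERROR lines from log file $1, minus the known-benign duplicates.
# `|| true` makes the pipeline succeed when the last grep selects no lines
# (exit status 1), so an empty result does not abort the script.
critical_errors() {
    grep 'ERROR' "$1" | grep -v 'Duplicate tracking code' || true
}

# Invented sample data: the only ERROR line is a benign duplicate.
log=$(mktemp)
printf '%s\n' 'INFO fetch started' 'ERROR Duplicate tracking code 42' > "$log"

crit_errs=$(critical_errors "$log")
if [ -z "$crit_errs" ]; then
    echo 'PASS'
else
    echo 'FAIL'
    printf '%s\n' "$crit_errs"
fi
rm -f "$log"
```

Printing the captured errors with printf '%s\n' "$crit_errs" rather than an unquoted echo keeps each error on its own line.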
Asking for set -e makes the script exit as soon as a simple command exits with a non-zero exit status. This combines perniciously with your ls command, which exits with a non-zero status when asked to list a non-existent file, which is always the case for you because the true and false variants don't co-exist.