Trying to make a live /proc/ reader using a bash script for live process monitoring - linux

I'm trying to make a little side-project script to sit and monitor all of the /proc/ directories; for the most part I have the concept running and it works (to a degree). What I'm aiming for here is to scan through all the process directories, cat their status files, and pull out the appropriate info, and then run this in an infinite loop to give me live updates of when something is running on and dropping off of the scheduler. Right now, every time you run the script it prints 50+ blank lines, and every time it hits the proper regex it prints correctly, but I'm aiming for it to not roll down the screen the way it does. Any help at all would be appreciated.
regex="[0-9]"
temp=""
for f in /proc/*; do
    if [[ -d $f && $f =~ /proc/$regex ]]; then
        output=$(cat $f/status | grep "^State") #> /dev/null
        process_id=$(cut -b 7- <<< $f)
        state=$(cut -b 10-19 <<< $output)
        tabs 4
        if [[ $state =~ "(running)" ]]; then
            echo -e "$process_id:$state\n" | sort >> temp
        fi
    fi
done
cat temp
rm temp

To get the PID and state of all running processes, try (note that ENDFILE is a GNU awk feature):
awk -F':[[:space:]]*' '/State:/{s=$2} /Pid:/{p=$2} ENDFILE{if (s~/running/) print p,s; p="X"; s="X"}' OFS=: /proc/*/status
To get this output updated every second:
while sleep 1; do awk -F':[[:space:]]*' '/State:/{s=$2} /Pid:/{p=$2} ENDFILE{if (s~/running/) print p,s; p="X"; s="X"}' OFS=: /proc/*/status; done
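Since the original complaint was output rolling down the screen, one small addition (my suggestion, not part of the original answer) is to clear the screen at the top of each pass, or equivalently to hand the one-shot awk command above to watch -n1:
while sleep 1; do
    clear   # redraw in place instead of scrolling down the terminal
    awk -F':[[:space:]]*' '/State:/{s=$2} /Pid:/{p=$2} ENDFILE{if (s~/running/) print p,s; p="X"; s="X"}' OFS=: /proc/*/status
done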

Related

Bash script optimization for waiting for a particular string in log files

I am using a bash script that calls multiple processes which have to start up in a particular order, and certain actions have to be completed (they then print out certain messages to the logs) before the next one can be started. The bash script has the following code which works really well for most cases:
tail -Fn +1 "$log_file" | while read line; do
    if echo "$line" | grep -qEi "$search_text"; then
        echo "[INFO] $process_name process started up successfully"
        pkill -9 -P $$ tail
        return 0
    elif echo "$line" | grep -qEi '^error\b'; then
        echo "[INFO] ERROR or Exception is thrown listed below. $process_name process startup aborted"
        echo " ($line) "
        echo "[INFO] Please check $process_name process log file=$log_file for problems"
        pkill -9 -P $$ tail
        return 1
    fi
done
However, when we set the processes to print logging in DEBUG mode, they print so much logging that this script cannot keep up, and it takes about 15 minutes after the process is complete for the bash script to catch up. Is there a way of optimizing this, like changing 'while read line' to 'while read 100 lines', or something like that?
How about not forking up to two grep processes per log line?
tail -Fn +1 "$log_file" | grep -Ei "$search_text|^error\b" | while read line; do
That way a single long-running grep process does the filtering, as a preprocessing step if you will.
Edit: As noted in the comments, it is safer to add --line-buffered to the grep invocation.
Some tips relevant for this script (a combined sketch follows after the list):
Checking that the service is doing its job is a much better check for daemon startup than looking at the log output
You can use grep ... <<<"$line" to execute fewer echos.
You can use tail -f | grep -q ... to avoid the while loop by stopping as soon as there's a matching line.
If you can avoid -i on grep it might be significantly faster to process the input.
Thou shalt not kill -9.
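Putting several of these tips together, a hedged sketch (my combination of the advice above; it assumes log_file, search_text, and process_name are set as in the original script):
wait_for_startup() {
    local line
    # grep -m1 exits after the first match; tail then dies of SIGPIPE
    # on its next write, so no pkill is needed.
    line=$(tail -Fn +1 "$log_file" | grep --line-buffered -m1 -Ei "$search_text|^error\b")
    if grep -qEi '^error\b' <<<"$line"; then
        echo "[INFO] ERROR thrown. $process_name process startup aborted: ($line)"
        echo "[INFO] Please check $process_name process log file=$log_file for problems"
        return 1
    fi
    echo "[INFO] $process_name process started up successfully"
    return 0
}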

How to get watch to run a bash script with quotes

I'm trying to have a lightweight memory profiler for the matlab jobs that are run on my machine. There is either one or zero matlab job instance, but its process id changes frequently (since it is actually called by another script).
So here is the bash script that I put together to log memory usage:
#!/bin/bash
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
if [[ -n $pid ]]
then
\grep VmSize /proc/$pid/status
else
echo "no pid"
fi
when I run this script in bash like this:
./script.sh
it works fine, giving me the following result:
VmSize: 1289004 kB
which is exactly what I want.
Now, I want to run this periodically. So I run it with watch, like this:
watch ./script.sh
But in this case I only receive:
no pid
Please note that I know the matlab job is still running, because I can see it with the same pid in top, and besides, I know each matlab job takes several hours to finish.
I'm pretty sure that something is wrong with the quotes I have when setting pid. I just can't figure out how to fix it. Anyone knows what I'm doing wrong?
PS.
In the man page of watch, it says that commands are executed by sh -c. I did run my script like sh -c ./script and it works just fine, but watch doesn't.
Why don't you use a loop with sleep command instead?
For example:
#!/bin/bash
while [ "1" ]
do
    # Re-check the PID on every pass, since it changes frequently.
    pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
    if [[ -n $pid ]]
    then
        \grep VmSize /proc/$pid/status
    else
        echo "no pid"
    fi
    sleep 10
done
Here the script sleeps (waits) for 10 seconds between passes. You can set the interval you need by changing the sleep command; for example, to make the script sleep for an hour, use sleep 1h.
To exit the script, press Ctrl-C.
This
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
could be changed to:
pid=$(pidof MATLAB)
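With that, the whole check shrinks to a sketch like this (my rewrite; -s asks pidof for a single PID in case several MATLAB processes match):
#!/bin/bash
# pidof -s returns at most one PID, so $pid is never multi-line.
pid=$(pidof -s MATLAB)
if [[ -n $pid ]]
then
    grep VmSize "/proc/$pid/status"
else
    echo "no pid"
fi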
I have no idea why it's not working in watch but you could use a cron job and make the script log to a file like so:
#!/bin/bash
pid=$(pidof MATLAB) # Just to follow previously given advice :)
if [[ -n $pid ]]
then
    echo "$(date): $(\grep VmSize /proc/$pid/status)" >> logfile
else
    echo "$(date): no pid" >> logfile
fi
You don't actually have to create logfile with touch first; the >> redirection will create it on first write.
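A matching crontab entry might look like this (the every-minute schedule and script path are assumptions, not from the original answer):
# min hour dom mon dow   command (runs every minute)
*     *    *   *   *     /home/user/log-matlab-vmsize.sh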
You might try just running the ps command in watch. I have had issues in the past with watch chopping lines and such when they get too long.
It can be fixed by making the terminal you are running the command from wider, or by changing the columns like this (you may need to adjust the 160 to your liking):
export COLUMNS=160;

Terminating half of a pipe on Linux does not terminate the other half

I have a filewatch program:
#!/bin/sh
# On Linux, uses inotifywait -mre close_write, and on OS X uses fswatch.
set -e
[[ "$#" -ne 1 ]] && echo "args count" && exit 2
if [[ `uname` = "Linux" ]]; then
    inotifywait -mcre close_write "$1" | sed 's/,".*",//'
elif [[ `uname` = "Darwin" ]]; then
    # sed on OSX/BSD wants -l for line-buffering
    fswatch "$1" | sed -l 's/^[a-f0-9]\{1,\} //'
fi
echo "fswatch: $$ exiting"
And a construct I'm trying to use from a script (I am testing it on the command line on CentOS now):
filewatch . | while read line; do echo "file $line has changed\!\!"; done &
What I am hoping is that this will let me process, one line at a time, the output of inotify, which of course sends out one line for each file it has detected a change on.
Now for my script to clean stuff up properly I need to be able to kill this whole backgrounded pipeline when the script exits.
So I run it, and then if I kill either the first part of the pipe or the second part, the other part does not terminate.
I think that if I kill the while read line part (which should be sh, or zsh in the case of running on the command line), then filewatch should receive a SIGPIPE on its next write. Okay, I am not handling that, so I guess it can keep running.
If I kill filewatch, though, it looks like zsh continues with its while read line. Why?
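One way to tear down the whole backgrounded pipeline, as a hedged sketch (setsid and the trap are my additions, not from the question; this assumes it runs from a script without job control, where the backgrounded setsid child is not already a group leader):
# setsid gives the pipeline its own process group whose PGID equals $!,
# so a single negative-PID kill reaches every stage of the pipe.
setsid bash -c 'filewatch . | while read line; do
    echo "file $line has changed"
done' &
pgid=$!
trap 'kill -- "-$pgid" 2>/dev/null' EXIT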

How to tail -f the latest log file with a given pattern

I work with some log system which creates a log file every hour, like follows:
SoftwareLog.2010-08-01-08
SoftwareLog.2010-08-01-09
SoftwareLog.2010-08-01-10
I'm trying to tail the latest log file matching a given pattern (e.g. SoftwareLog*), and I realize there's:
tail -F (tail --follow=name --retry)
but that only follows one specific file name - and these have different names by date and hour. I tried something like:
tail --follow=name --retry SoftwareLog*(.om[1])
but the wildcard is resolved before it gets passed to tail, and it doesn't re-execute every time tail retries.
Any suggestions?
I believe the simplest solution is as follows:
tail -f `ls -tr | tail -n 1`
Now, if your directory contains other log files like "SystemLog" and you only want the latest "SoftwareLog" file, then you would simply include a grep as follows:
tail -f `ls -tr | grep SoftwareLog | tail -n 1`
[Edit: after a quick googling for a tool]
You might want to try out multitail - http://www.vanheusden.com/multitail/
If you want to stick with Dennis Williamson's answer (and I've +1'ed him accordingly) here are the blanks filled in for you.
In your shell, run the following script (or its zsh equivalent; I whipped this up in bash before I saw the zsh tag):
#!/bin/bash
TARGET_DIR="some/logfiles/"
SYMLINK_FILE="SoftwareLog.latest"
SYMLINK_PATH="$TARGET_DIR/$SYMLINK_FILE"

function getLastModifiedFile {
    echo $(ls -t "$TARGET_DIR" | grep -v "$SYMLINK_FILE" | head -1)
}

function getCurrentlySymlinkedFile {
    if [[ -h $SYMLINK_PATH ]]
    then
        echo $(ls -l $SYMLINK_PATH | awk '{print $NF}')
    else
        echo ""
    fi
}

symlinkedFile=$(getCurrentlySymlinkedFile)
while true
do
    sleep 10
    lastModified=$(getLastModifiedFile)
    if [[ $symlinkedFile != $lastModified ]]
    then
        ln -nsf $lastModified $SYMLINK_PATH
        symlinkedFile=$lastModified
    fi
done
Background that process using the normal method (again, I don't know zsh, so it might be different)...
./updateSymlink.sh > /dev/null 2>&1 &
Then tail -F $SYMLINK_PATH so that tail handles the changing of the symbolic link or a rotation of the file.
This is slightly convoluted, but I don't know of another way to do it with tail. If anyone else knows of a utility that handles this, let them step forward, because I'd love to see it myself too - applications like Jetty log this way by default, and I always script up a symlinking cron job to compensate for it.
[Edit: Removed an erroneous 'j' from the end of one of the lines. You also had a bad variable name: "lastModifiedFile" didn't exist; the proper name you set is "lastModified".]
I haven't tested this, but an approach that may work would be to run a background process that creates and updates a symlink to the latest log file and then you would tail -f (or tail -F) the symlink.
#!/bin/bash
PATTERN="$1"

# Try to make sure sub-shells exit when we do.
trap "kill -9 -- -$BASHPID" SIGINT SIGTERM EXIT

PID=0
OLD_FILES=""
while true; do
    FILES="$(echo $PATTERN)"
    if test "$FILES" != "$OLD_FILES"; then
        if test "$PID" != "0"; then
            kill $PID
            PID=0
        fi
        if test "$FILES" != "$PATTERN" || test -f "$PATTERN"; then
            tail --pid=$$ -n 0 -F $PATTERN &
            PID=$!
        fi
    fi
    OLD_FILES="$FILES"
    sleep 1
done
Then run it as: tail.sh 'SoftwareLog*'
The script will lose some log lines if the logs are written to between checks. But at least it's a single script, with no symlinks required.
We have daily rotating log files as: /var/log/grails/customer-2020-01-03.log. To tail the latest one, the following command worked fine for me:
tail -f /var/log/grails/customer-`date +'%Y-%m-%d'`.log
(NOTE: no space after the + sign in the expression)
So, for you, the following should work (if you are in the same directory of the logs):
tail -f SoftwareLog.`date +'%Y-%m-%d-%H'`
I believe the easiest way is to use tail with ls and head; try something like this:
tail -f `ls -t SoftwareLog* | head -1`

Bash script does not continue to read the next line of file

I have a shell script that saves the output of each command it executes to a CSV file. It reads the commands it has to execute from a file in the following format:
ffmpeg -i /home/test/videos/avi/418kb.avi /home/test/videos/done/418kb.flv
ffmpeg -i /home/test/videos/avi/1253kb.avi /home/test/videos/done/1253kb.flv
ffmpeg -i /home/test/videos/avi/2093kb.avi /home/test/videos/done/2093kb.flv
You can see each line is an ffmpeg command. However, the script just executes the first line. Just a minute ago it was doing nearly all of the commands. It was missing half for some reason. I edited the text file that contained the commands and now it will only do the first line. Here is my bash script:
#!/bin/bash
# Shell script utility to read a file line by line.
# Once a line is read it will be run through the processLine() function.

# Function processLine
processLine(){
    line="$@"
    START=$(date +%s.%N)
    eval $line > /dev/null 2>&1
    END=$(date +%s.%N)
    DIFF=$(echo "$END - $START" | bc)
    echo "$line, $START, $END, $DIFF" >> file.csv 2>&1
    echo "It took $DIFF seconds"
    echo $line
}

# Store file name
FILE=""
# Get file name as command line argument,
# else read it from the standard input device.
if [ "$1" == "" ]; then
    FILE="/dev/stdin"
else
    FILE="$1"
    # make sure file exists and is readable
    if [ ! -f $FILE ]; then
        echo "$FILE : does not exist"
        exit 1
    elif [ ! -r $FILE ]; then
        echo "$FILE: cannot read"
        exit 2
    fi
fi

# read $FILE using the file descriptors
# Set loop separator to end of line
BAKIFS=$IFS
IFS=$(echo -en "\n\b")
exec 3<&0
exec 0<$FILE
while read line
do
    # use $line variable to process line in processLine() function
    processLine $line
done
exec 0<&3
# restore $IFS which was used to determine what the field separators are
IFS=$BAKIFS
exit 0
Thank you for any help.
UPDATE 2
It's the ffmpeg commands, rather than the shell script, that aren't working. But I should have been using just "\b", as Paul pointed out. I am also making use of Johannes's shorter script.
I think that should do the same and seems to be correct:
#!/bin/bash
CSVFILE=/tmp/file.csv
cat "$@" | while read line; do
    echo "Executing '$line'"
    START=$(date +%s)
    eval $line &> /dev/null
    END=$(date +%s)
    let DIFF=$END-$START
    echo "$line, $START, $END, $DIFF" >> "$CSVFILE"
    echo "It took ${DIFF}s"
done
no?
ffmpeg reads STDIN and exhausts it. The solution is to call ffmpeg with:
ffmpeg </dev/null ...
See the detailed explanation here: http://mywiki.wooledge.org/BashFAQ/089
Update:
Since ffmpeg version 1.0, there is also the -nostdin option, so this can be used instead:
ffmpeg -nostdin ...
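Applied to the loop from the question, a minimal sketch of the fix (the surrounding loop is simplified from the original script):
# </dev/null stops each ffmpeg from consuming the loop's stdin;
# with ffmpeg >= 1.0 you could append -nostdin to each command instead.
while read -r line; do
    eval "$line" </dev/null > /dev/null 2>&1
done < "$1"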
I just had the same problem.
I believe ffmpeg is responsible for this behaviour.
My solution for this problem:
1) Call ffmpeg with an "&" at the end of your ffmpeg command line.
2) Since the script will now not wait for the ffmpeg process to complete, we have to prevent it from starting several ffmpeg processes at once. We achieve this by delaying the loop pass while there is at least one running ffmpeg process.
#!/bin/bash
cat FileList.txt |
while read VideoFile; do
    <place your ffmpeg command line here> &
    FFMPEGStillRunning="true"
    while [ "$FFMPEGStillRunning" = "true" ]; do
        Process=$(ps -C ffmpeg | grep -o -e "ffmpeg")
        if [ -n "$Process" ]; then
            FFMPEGStillRunning="true"
        else
            FFMPEGStillRunning="false"
        fi
        sleep 2s
    done
done
I would add echos before and after the eval to see what it's about to eval (in case it's treating the whole file as one big long line) and after (in case one of the ffmpeg commands is taking forever).
Unless you are planning to read something from standard input after the loop, you don't need to preserve and restore the original standard input (though it is good to see you know how).
Similarly, I don't see a reason for dinking with IFS at all. There is certainly no need to restore the value of IFS before exit - this is a real shell you are using, not a DOS BAT file.
When you do:
read var1 var2 var3
the shell assigns the first field to $var1, the second to $var2, and the rest of the line to $var3. In the case where there's just one variable - your script, for example - the whole line goes into the variable, just as you want it to.
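For example (a quick illustration of the splitting, my addition):
read var1 var2 var3 <<< "alpha beta gamma delta"
echo "$var3"   # prints "gamma delta" - the rest of the line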
Inside the process line function, you probably don't want to throw away error output from the executed command. You probably do want to think about checking the exit status of the command. The echo with error redirection is ... unusual, and overkill. If you're sufficiently sure that the commands can't fail, then go ahead with ignoring the error. Is the command 'chatty'; if so, throw away the chat by all means. If not, maybe you don't need to throw away standard output, either.
The script as a whole should probably diagnose when it is given multiple files to process since it ignores the extraneous ones.
You could simplify your file handling by using just:
cat "$@" |
while read line
do
    processLine "$line"
done
The cat command automatically reports errors (and continues after them) and processes all the input files, or reads standard input if there are no arguments left. The use of double quotes around the variable means that it is passed as a single unit (and therefore unparsed into separate words).
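A quick illustration of that last point (my example, using the function name from the script above):
line='hello   world'
processLine $line     # the function sees two arguments: "hello" and "world"
processLine "$line"   # the function sees one argument: "hello   world"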
The use of date and bc is interesting - I'd not seen that before.
All in all, I'd be looking at something like:
#!/bin/bash
# Time execution of commands read from a file, line by line.
# Log commands and times to CSV logfile "file.csv".
processLine(){
    START=$(date +%s.%N)
    eval "$@" > /dev/null
    STATUS=$?
    END=$(date +%s.%N)
    DIFF=$(echo "$END - $START" | bc)
    echo "$line, $START, $END, $DIFF, $STATUS" >> file.csv
    echo "${DIFF}s: $STATUS: $line"
}

cat "$@" |
while read line
do
    processLine "$line"
done
