nohup script stopped after some time - linux

I have a monitoring script running on my server which looks like
#!/bin/sh
tail -F /data/logs.txt | grep -E --line-buffered -io 'keyword1|keyword2' | while read -r line ; do
case "$line" in
"keyword1")
echo "hi"
;;
"keyword2")
echo "hi1"
;;
*)
esac
done
i have used -F in tail because if the new log file created it should follow the new file
I have ran that script in nohup to run indefinetly like below
nohup ./script.sh &
But the script getting stopped after a period of time
can someone help why it's happening ?
Thanks,

Related

Shellscript to monitor a log file if keyword triggers then run the snmptrap command

Is there a way to monitor a log file using shell script like
tail -f /var/log/errorlog.txt then if something like down keyword appears, then generate SNMPTRAP to snmp manager and continues the monitoring
I have a SNMP script available to generate SNMPTrap and it looks like
snmptrap -v v2c -c community host "Error message"
Lets the say the script name is snmp.sh
My question is how to perform the below operation
tail the logs
if keyword[down] matches then use snmp.sh script to send alert
else leave
As per the suggestion i tried this
tail -F /data/log/test.log |
egrep -io 'got signal 15 | now exiting' |
while read -r line ;
do
case "$line" in
"got signal 15")
echo "hi"
;;
"now exiting")
echo "hi2"
;;
*)
esac
done
but the problem is tail is not working here with case statement, whenever the new log details added its not going to the case statement and echos the output
I could get the output if i use cat/less/more
Could you someone please tell what mistake i have done here ?
Thanks in advance
It sounds like the pattern you want is this:
tail -f /var/log/errorlog.txt | grep -e down -e unmounted | while read -r line
do
case "$line" in
down)
./snmp.sh …
;;
unmounted)
./snmp.sh …
;;
*)
echo "Unhandled keyword ${line}" >&2
exit 1
esac
done
try
tail -f /var/log/errorlog.txt | grep "down" | while read line; do snmp.sh; done

Getting PID of launched process

For some monitoring purposes, for a given list of files (that change path daily) I want to launch a tail -f in the background searching for specific strings
How do I properly get the exact pid of the tail commands generated by the for loop, so I could kill them later ?
Here's what I'm trying to do which I think gives me the pid of the script and not the tail process.
for f in $(find file...)
do
tail -f $f | while read line
do case $line in
*string_to_search*) echo " $line" | mutt -s "string detected in file : $1" mail#mail.com;
;;
esac
done &
echo $! >> pids.txt
done

Check whether a process is running or not Linux

Here is my code:
#!/bin/bash
ps cax | grep testing > /dev/null
while [ 1 ]
do
if [ $? -eq 0 ]; then
echo "Process is running."
sleep 10
else
nohup ./testing.sh &
sleep 10
fi
done
I run it as nohup ./script.sh &
and it said nohup: failed to run command './script.sh': No such file or directory
What is wrong?
The file script.sh simply does not exist in the directory that you are issuing the command from.
If it did exist and was not executable you would get:
`nohup: failed to run command ‘./script.sh’: Permission denied
For each newly created scripts on Linux, you should first change the permission as you can see the permission details by using
ls -lah
The following content may help you:
#!/bin/bash
while [ 1 ];
do
date=`date`
pid=`ps -ef | grep "your process" | grep -v grep | awk -F' ' '{print $2}'`
if [[ -n $pid ]]; then
echo "$date - processID $pid is running."
else
echo "$date - the process is not running"
# script to restart your process
say: start the process
fi
sleep 5m
done
Make sure your script is saved as script.sh
and your executing nohup ./script.sh & from the same directory in which script.sh.
Also you can give executable permission for script.sh by
chmod 776 script.sh
or
nohup ./script.sh &
Run as
nohup sh ./script.sh &

Linux Script to check if process is running and act on the result

I have a process that fails regularly & sometimes starts duplicate instances..
When I run:
ps x |grep -v grep |grep -c "processname"
I will get:
2
This is normal as the process runs with a recovery process..
If I get
0
I will want to start the process
if I get:
4
I will want to stop & restart the process
What I need is a way of taking the result of ps x |grep -v grep |grep -c "processname"
Then setup a simple 3 option function
ps x |grep -v grep |grep -c "processname"
if answer = 0 (start process & write NOK & Time to log /var/processlog/check)
if answer = 2 (Do nothing & write OK & time to log /var/processlog/check)
if answer = 4 (stot & restart the process & write NOK & Time to log /var/processlog/check)
The process is stopped with
killall -9 process
The process is started with
process -b -c /usr/local/etc
My main problem is finding a way to act on the result of ps x |grep -v grep |grep -c "processname".
Ideally, I would like to make the result of that grep a variable within the script with something like this:
process=$(ps x |grep -v grep |grep -c "processname")
If possible.
Programs to monitor if a process on a system is running.
Script is stored in crontab and runs once every minute.
This works with if process is not running or process is running multiple times:
#! /bin/bash
case "$(pidof amadeus.x86 | wc -w)" in
0) echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt
/etc/amadeus/amadeus.x86 &
;;
1) # all ok
;;
*) echo "Removed double Amadeus: $(date)" >> /var/log/amadeus.txt
kill $(pidof amadeus.x86 | awk '{print $1}')
;;
esac
0 If process is not found, restart it.
1 If process is found, all ok.
* If process running 2 or more, kill the last.
A simpler version. This just test if process is running, and if not restart it.
It just tests the exit flag $? from the pidof program. It will be 0 of process is running and 1 if not.
#!/bin/bash
pidof amadeus.x86 >/dev/null
if [[ $? -ne 0 ]] ; then
echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt
/etc/amadeus/amadeus.x86 &
fi
And at last, a one liner
pidof amadeus.x86 >/dev/null ; [[ $? -ne 0 ]] && echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt && /etc/amadeus/amadeus.x86 &
This can then be used in crontab to run every minute like this:
* * * * * pidof amadeus.x86 >/dev/null ; [[ $? -ne 0 ]] && echo "Restarting Amadeus: $(date)" >> /var/log/amadeus.txt && /etc/amadeus/amadeus.x86 &
cccam oscam
I adopted the #Jotne solution and works perfectly! For example for mongodb server in my NAS
#! /bin/bash
case "$(pidof mongod | wc -w)" in
0) echo "Restarting mongod:"
mongod --config mongodb.conf
;;
1) echo "mongod already running"
;;
esac
I have adopted your script for my situation Jotne.
#! /bin/bash
logfile="/var/oscamlog/oscam1check.log"
case "$(pidof oscam1 | wc -w)" in
0) echo "oscam1 not running, restarting oscam1: $(date)" >> $logfile
/usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 &
;;
2) echo "oscam1 running, all OK: $(date)" >> $logfile
;;
*) echo "multiple instances of oscam1 running. Stopping & restarting oscam1: $(date)" >> $logfile
kill $(pidof oscam1 | awk '{print $1}')
;;
esac
While I was testing, I ran into a problem..
I started 3 extra process's of oscam1 with this line:
/usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1
which left me with 8 process for oscam1. the problem is this..
When I run the script, It only kills 2 process's at a time, so I would have to run it 3 times to get it down to 2 process..
Other than killall -9 oscam1 followed by /usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1, in *)is there any better way to killall apart from the original process? So there would be zero downtime?
If you changed awk '{print $1}' to '{ $1=""; print $0}' you will get all processes except for the first as a result. It will start with the field separator (a space generally) but I don't recall killall caring. So:
#! /bin/bash
logfile="/var/oscamlog/oscam1check.log"
case "$(pidof oscam1 | wc -w)" in
0) echo "oscam1 not running, restarting oscam1: $(date)" >> $logfile
/usr/local/bin/oscam1 -b -c /usr/local/etc/oscam1 -t /usr/local/tmp.oscam1 &
;;
2) echo "oscam1 running, all OK: $(date)" >> $logfile
;;
*) echo "multiple instances of oscam1 running. Stopping & restarting oscam1: $(date)" >> $logfile
kill $(pidof oscam1 | awk '{ $1=""; print $0}')
;;
esac
It is worth noting that the pidof route seems to work fine for commands that have no spaces, but you would probably want to go back to a ps-based string if you were looking for, say, a python script named myscript that showed up under ps like
root 22415 54.0 0.4 89116 79076 pts/1 S 16:40 0:00 /usr/bin/python /usr/bin/myscript
Just an FYI
The 'pidof' command will not display pids of shell/perl/python scripts. So to find the process id’s of my Perl script I had to use the -x option i.e. 'pidof -x perlscriptname'
I cannot get case to work at all.
Heres what I have:
#! /bin/bash
logfile="/home/name/public_html/cgi-bin/check.log"
case "$(pidof -x script.pl | wc -w)" in
0) echo "script not running, Restarting script: $(date)" >> $logfile
# ./restart-script.sh
;;
1) echo "script Running: $(date)" >> $logfile
;;
*) echo "Removed duplicate instances of script: $(date)" >> $logfile
# kill $(pidof -x ./script.pl | awk '{ $1=""; print $0}')
;;
esac
rem the case action commands for now just to test the script. the above pidof -x command is returning '1', the case statement is returning the results for '0'.
Anyone have any idea where I'm going wrong?
Solved it by adding the following to my BIN/BASH Script:
PATH=$PATH:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
In case you're looking for a more modern way to check to see if a service is running (this will not work for just any old process), then systemctl might be what you're looking for.
Here's the basic command:
systemctl show --property=ActiveState your_service_here
Which will yield very simple output (one of the following two lines will appear depending on whether the service is running or not running):
ActiveState=active
ActiveState=inactive
And if you'd like to know all of the properties you can get:
systemctl show --all your_service_here
If you prefer that alphabetized:
systemctl show --all your_service_here | sort
And the full code to act on it:
service=$1
result=`systemctl show --property=ActiveState $service`
if [[ "$result" == 'ActiveState=active' ]]; then
echo "$service is running" # Do something here
else
echo "$service is not running" # Do something else here
fi
If you are using CentOS, no need to write a script and set cron job. Here is one of the smartest ways to ensure systemd services restart on failure.
Make following changes to /usr/lib/systemd/system/mariadb.service
Then under the [Service] section in the file, add the following 2 lines:
Restart=always
RestartSec=3
After saving the file we need to reload the daemon configurations to ensure systemd is aware of the new file
systemctl daemon-reload
Read the following link for the complete steps -
https://jonarcher.info/2015/08/ensure-systemd-services-restart-on-failure/

How to get watch to run a bash script with quotes

I'm trying to have a lightweight memory profiler for the matlab jobs that are run on my machine. There is either one or zero matlab job instance, but its process id changes frequently (since it is actually called by another script).
So here is the bash script that I put together to log memory usage:
#!/bin/bash
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
if [[ -n $pid ]]
then
\grep VmSize /proc/$pid/status
else
echo "no pid"
fi
when I run this script in bash like this:
./script.sh
it works fine, giving me the following result:
VmSize: 1289004 kB
which is exactly what I want.
Now, I want to run this periodically. So I run it with watch, like this:
watch ./script.sh
But in this case I only receive:
no pid
Please note that I know the matlab job is still running, because I can see it with the same pid on top, and besides, I know each matlab job take several hours to finish.
I'm pretty sure that something is wrong with the quotes I have when setting pid. I just can't figure out how to fix it. Anyone knows what I'm doing wrong?
PS.
In the man page of watch, it says that commands are executed by sh -c. I did run my script like sh -c ./script and it works just fine, but watch doesn't.
Why don't you use a loop with sleep command instead?
For example:
#!/bin/bash
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
while [ "1" ]
do
if [[ -n $pid ]]
then
\grep VmSize /proc/$pid/status
else
echo "no pid"
fi
sleep 10
done
Here the script sleeps(waits) for 10 seconds. You can set the interval you need changing the sleep command. For example to make the script sleep for an hour use sleep 1h.
To exit the script press Ctrl - C
This
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
could be changed to:
pid=$(pidof MATLAB)
I have no idea why it's not working in watch but you could use a cron job and make the script log to a file like so:
#!/bin/bash
pid=$(pidof MATLAB) # Just to follow previously given advice :)
if [[ -n $pid ]]
then
echo "$(date): $(\grep VmSize /proc/$pid/status)" >> logfile
else
echo "$(date): no pid" >> logfile
fi
You'd of course have to create logfile with touch.
You might try just running ps command in watch. I have had issues in the past with watch chopping lines and such when they get too long.
It can be fixed by making the terminal you are running the command from wider or changing the column like this (may need to adjust the 160 to your liking):
export COLUMNS=160;

Resources