Group different Linux commands in one script - linux

Every 2 hours I run some checks to monitor the status of the servers, such as
iostat -ch, df -h /DATA, free -mh, ps -aux | grep kafka, plus other commands and some shell scripts.
How can I group them into one or two scripts that run automatically, so that I don't have to repeat the same checks manually every time?

So, if I understand correctly, you want to run a bunch of commands as one script and have it executed automatically every two hours?
Start by writing a shell script:
#!/bin/sh
iostat -ch
df -h /DATA
free -mh
ps -aux | grep kafka
Then add it as a cron job (see cron); a schedule of 0 */2 * * * runs it every two hours.
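One possible way to flesh this out, as a sketch rather than a definitive setup: the script below wraps each check in a small helper (run_check, a name made up here) that prints a header and reports a non-zero exit status instead of stopping. The install path and log file in the crontab comment are assumptions; adjust them to your system.

```shell
#!/bin/sh
# Sketch of a combined check script. run_check is a hypothetical helper,
# not part of any standard tool.

# run_check LABEL COMMAND [ARGS...]
# Prints a header, runs the command, and reports a non-zero exit status
# without aborting the remaining checks.
run_check() {
    label=$1
    shift
    printf '===== %s =====\n' "$label"
    "$@" 2>&1 || printf '(%s exited with status %s)\n' "$label" "$?"
}

printf 'Report generated: %s\n' "$(date)"
run_check "CPU and I/O" iostat -ch
run_check "Disk usage of /DATA" df -h /DATA
run_check "Memory" free -mh
# The [k]afka pattern keeps grep from matching its own process entry.
run_check "Kafka processes" sh -c 'ps aux | grep "[k]afka"'

# Example crontab entry (edit with `crontab -e`) to run every two hours;
# the script path and log file are assumptions:
# 0 */2 * * * /usr/local/bin/server-checks.sh >> /var/log/server-checks.log 2>&1
```

Redirecting the output to a log file in the crontab entry keeps a history of the reports; without a redirect, cron typically tries to mail the output to the owner of the crontab.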

Related

Bash script results in different output when running from a cron job

I'm puzzled by a problem I'm having on Ubuntu 20.04: cron is able to run a bash script, but the overall outcome is different than when I run it from the shell.
I've looked through all the questions I could find here and on Google but couldn't find anyone who had the same problem.
Background:
I'm using Pushgateway to store metrics I'm generating through a bash script, and afterwards it's being imported automatically to Prometheus.
The end goal is to export a list of running processes, their CPU%, Mem% etc, similar to top command.
This is the bash script:
#!/bin/bash
z=$(top -n 1 -bi)
while read -r z
do
var=$var$(awk 'FNR>7{print "cpu_usage{process=\""$12"\", pid=\""$1"\"}", $9z} FNR>7{print "memory_usage{process=\""$12"\", pid=\""$1"\"}", $10z}')
done <<< "$z"
curl -X POST -H "Content-Type: text/plain" --data "$var
" http://localhost:9091/metrics/job/top/instance/machine
I used to have a version that used ps aux but then I found out that it only shows the average CPU% per process.
As you can see, the command I'm running is top -n 1 -bi, which gives me a snapshot of active processes and their metrics.
I'm using awk to format the data, and FNR>7 because I need to ignore the first 7 lines, which are the summary presented by top.
The bash script is registered in /bin, /usr/bin and /usr/local/bin.
When checking http://localhost:9091/metrics, which is supposed to show me the information gathered, I get the following (partial) output when running the script from the shell:
cpu_usage{instance="machine",job="top",pid="114468",process="php-fpm74"} 17.6
cpu_usage{instance="machine",job="top",pid="114483",process="php-fpm74"} 11.8
cpu_usage{instance="machine",job="top",pid="126305",process="ffmpeg"} 64.7
And this is the same information when cron is running the same script:
cpu_usage{instance="machine",job="top",pid="114483",process="php-fpm+"} 5
cpu_usage{instance="machine",job="top",pid="126305",process="ffmpeg"} 60
cpu_usage{instance="machine",job="top",pid="128777",process="php"} 15
So, for some reason, when I run it from cron the process name is cut off after 7 characters.
I initially thought it was related to the FNR>7, but even after changing it to 8 or 9 (and using exec bash to re-register the command) it gives the same results; when I run it manually, it works just fine.
Any help would be appreciated!!
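One way to make the formatting step easier to debug is to isolate it in a function that reads top's batch output on standard input, so that the same text can be fed to it from an interactive shell and from cron. This is only a sketch: format_metrics is a made-up name, the field positions ($1, $9, $10, $12) are taken from the script above, and the idea that the truncation comes from top's output width when no terminal is attached is an assumption to verify, for example with the -w option of procps-ng top.

```shell
#!/bin/bash
# format_metrics: read `top -b` output on stdin, skip the 7-line summary,
# and print one cpu_usage and one memory_usage sample per process line.
format_metrics() {
    awk 'NR > 7 && NF >= 12 {
        printf "cpu_usage{process=\"%s\",pid=\"%s\"} %s\n", $12, $1, $9
        printf "memory_usage{process=\"%s\",pid=\"%s\"} %s\n", $12, $1, $10
    }'
}

# Possible usage (not run here). -w 512 asks top for a wide output so
# that long command names are less likely to be cut; whether this
# resolves the cron difference is an assumption to check:
# top -b -n 1 -i -w 512 | format_metrics |
#     curl -X POST -H "Content-Type: text/plain" --data-binary @- \
#         http://localhost:9091/metrics/job/top/instance/machine
```

Saving the raw top output from both environments to a file and comparing the two files would show whether the difference is already present before awk runs.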

How to kill a script that has been executed using source?

The problem is that I cannot find the process ID of a script that has been executed using source. I am able to do so when they are launched with bash using ps -ef.
If I run a script using bash, I can figure out the process ID using ps -ef | grep "test1.sh" | grep -v "grep". However, if I run the script using source, I cannot search for it and hence cannot find the process ID.
I have read the difference between the bash and source commands from this link.
This is my testing procedure:
I have 2 terminals. In one of them, I am searching for process IDs using ps -ef. In the other one, I run a script which prints 'Hello' every one second (an infinite while loop with sleep of 1 second). With bash, PID is searchable, but with source, grep doesn't get any results.
I am working on an Ubuntu 18.04.2 LTS machine
If you do not want to terminate the sourcing bash and are satisfied with the script being stopped only after a command (such as sleep) finishes, you can kill -INT the bash process.
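The reason ps finds nothing is that a sourced script is not a separate process: its commands run inside the shell that sourced it, so the PID to signal is that shell's own PID. The sketch below illustrates this with a throwaway script (the temporary file is created only for the demonstration).

```shell
#!/bin/bash
# Demonstration: a sourced script shares the PID of the current shell,
# while a script started with `bash` gets a new PID.
demo=$(mktemp)
echo 'echo "$$"' > "$demo"

shell_pid=$$
sourced_pid=$(. "$demo")     # `.` is the portable spelling of `source`
child_pid=$(bash "$demo")    # separate bash process
rm -f "$demo"

echo "current shell:  $shell_pid"
echo "sourced script: $sourced_pid"
echo "bash script:    $child_pid"
```

In practice, you could run echo $$ in the terminal before sourcing the script, then send kill -INT to that PID from the second terminal.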

Sun Grid Engine: submitted jobs by qsub command

I am using Sun Grid Engine queuing system.
Assume I submitted multiple jobs using a script that looks like:
#! /bin/bash
for i in 1 2 3 4 5
do
sh qsub.sh python run.py ${i}
done
qsub.sh looks like:
#! /bin/bash
echo cd `pwd` \; "$@" | qsub
Assuming that 5 jobs are running, I want to find out which command each job is executing.
By using qstat -f, I can see which node is running which jobID, but not what specific command each jobID is related to.
So for example, I want to check which jobID=xxxx is running python run.py 3 and so on.
How can I do this?
I think you'll see it if you use qstat -j '*' (quote the asterisk so that the shell does not expand it). See https://linux.die.net/man/1/qstat-ge .
You could try running array jobs. Array jobs are useful when you have multiple inputs to process in the same way. Qstat will identify each instance of the array of jobs. See the docs for more information.
http://docs.adaptivecomputing.com/torque/4-0-2/Content/topics/commands/qsub.htm#-t
http://wiki.gridengine.info/wiki/index.php/Simple-Job-Array-Howto
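As a rough sketch of the array-job suggestion under Grid Engine: the five submissions become one job with five tasks, and each task prints which command it runs, so the mapping ends up in the job's output file. The file name array_job.sh is hypothetical, and the default value for SGE_TASK_ID exists only so the script can be dry-run outside the scheduler.

```shell
#!/bin/bash
#$ -cwd
#$ -t 1-5
# Sketch of an array job, submitted once with: qsub array_job.sh
# (array_job.sh is a hypothetical file name).
# Grid Engine sets SGE_TASK_ID to 1..5 for the individual tasks; the
# default of 1 only makes a dry run outside the scheduler possible.
task_id=${SGE_TASK_ID:-1}
cmd="python run.py $task_id"

# Record the job/task-to-command mapping in the job's output file.
echo "job ${JOB_ID:-none} task $task_id: $cmd"

# Execute only when running under the scheduler.
if [ -n "${SGE_TASK_ID:-}" ]; then
    python run.py "$task_id"
fi
```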

Linux bash script that kills a process (not started by me) after x amount of time

I'm pretty inexperienced with Linux bash. That being said, I have a CentOS7 machine that runs a COTS application server. This application server runs other processes that sometimes hang. Since I have no control over the start of these processes, I'm looking for a script that runs every 2 minutes that kills processes of the name "spicer" that have been running for longer than 10 minutes. I've looked around and have only been able to find answers for processes that are run and owned by me.
I use the command ps -eo pid,command,etime | grep spicer to get all the spicer processes. The output of this command looks like:
18216 spicer -l/opt/otmm-10.5/Spi 14:20
18415 spicer -l/opt/otmm-10.5/Spi 11:49
etc...
18588 grep --color=auto spicer
I don't know if there's a way to parse this directly in bash. I'm also not well-versed at all in other Linux tools. I know that awk (or gawk) could possibly help.
EDIT
I have no control over the data that the process is working on.
What about wrapping the spicer executable and starting it with the timeout command? Let's say it is installed in /usr/bin/spicer. Then issue:
cp /usr/bin/spicer{,.orig}
echo '#!/bin/bash' > /usr/bin/spicer
echo 'timeout 10m spicer.orig "$@"' >> /usr/bin/spicer
Another approach would be to create a cron job definition in /etc/cron.d/kill_spicer, like this:
* * * * * root kill $(ps --no-headers -C spicer -o pid,etimes | awk '$2>=600{print $1}')
The cron job runs every minute: it uses ps to list the spicer processes that have been running for at least 10 minutes (600 seconds) and passes their PIDs to kill.
You probably even want kill -9 if the process is hanging.
You can use the -C option of ps to select processes by name.
ps --no-headers -C spicer -o pid,etime
Then you can use cut to filter the results, if the spacing is consistent. On my system the pid field takes up 8 characters, so I'd use
kill $(ps --no-headers -C spicer -o pid,etime | cut -c-8)
If the spacing is inconsistent (but if so, what kind of messed up ps are you using? :-P), you can use awk '{ print $1 }' instead of cut.
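The two answers can be combined into a small script that selects by name and filters by age. This is a sketch, not a tested production tool: select_old_pids is a made-up helper name, and it assumes a procps-ng ps that supports the etimes column (elapsed time in seconds).

```shell
#!/bin/bash
# select_old_pids LIMIT: read "PID ELAPSED_SECONDS" lines on stdin and
# print the PIDs whose elapsed time is at least LIMIT seconds.
select_old_pids() {
    awk -v limit="$1" '$2 >= limit { print $1 }'
}

# etimes is the elapsed time in seconds (procps-ng ps).
pids=$(ps --no-headers -C spicer -o pid,etimes 2>/dev/null | select_old_pids 600)

if [ -n "$pids" ]; then
    # Unquoted on purpose: each PID must become a separate argument.
    kill $pids
fi
```

Checking that the list is non-empty avoids calling kill without arguments when no spicer process is old enough. Because the processes belong to another user, the script would have to run as root or as that user.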

Can I delete a shell script after it has been submitted using qsub without affecting the job?

I want to submit a bunch of jobs using qsub; the jobs are all very similar. I have a script with a loop, and in each iteration it overwrites a file tmpjob.sh and then runs qsub tmpjob.sh. Before a job has had a chance to run, tmpjob.sh may already have been overwritten by the next iteration of the loop. Is another copy of tmpjob.sh stored while the job is waiting to run? Or do I need to be careful not to change tmpjob.sh before the job has begun?
Assuming you're talking about torque, then yes; torque reads in the script at submission time. In fact, the submission script need never exist as a file at all: as shown in an example in the torque documentation, you can pipe commands into qsub (from the docs: cat pbs.cmd | qsub).
But several other batch systems (SGE/OGE, PBS PRO) use qsub as a queue submission command, so you'll have to tell us what queuing system you're using to be sure.
Yes. You can even create jobs and sub-jobs with HERE Documents. Below is an example of a test I was doing with a script initiated by a cron job:
#!/bin/env bash
printenv
qsub -N testCron -l nodes=1:vortex:compute -l walltime=1:00:00 <<QSUB
cd \$PBS_O_WORKDIR
printenv
qsub -N testsubCron -l nodes=1:vortex:compute -l walltime=1:00:00 <<QSUBNEST
cd \$PBS_O_WORKDIR
pwd
date -Isec
QSUBNEST
QSUB
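Applied to the loop from the question, the temporary file can be avoided entirely by generating each job's text and piping it to qsub. The following is a sketch under assumptions: make_job and run_case are hypothetical names, the #PBS directive assumes a torque/PBS-style scheduler, and when qsub is not installed the loop only prints the job text as a dry run.

```shell
#!/bin/bash
# make_job N: print the text of a job script for case N.
# run_case is a hypothetical program standing in for the real work.
make_job() {
    cat <<EOF
#!/bin/bash
#PBS -N case_$1
cd "\$PBS_O_WORKDIR"
./run_case "$1"
EOF
}

for i in 1 2 3; do
    if command -v qsub >/dev/null 2>&1; then
        make_job "$i" | qsub      # job text goes straight to qsub's stdin
    else
        make_job "$i"             # dry run: just show the job text
    fi
done
```

The backslash in \$PBS_O_WORKDIR keeps that variable unexpanded until the job runs, while $1 is filled in at submission time.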

Resources