how to efficiently monitor system stats using vmstat? - linux

I am getting real-time memory stats from the vmstat command. I did this using the following steps:
$nohup vmstat 60 > vmstatrecord.app &
The command executes in the background and writes the log to the file vmstatrecord.app. When I use the command
$ps -A | grep stat
I can see vmstat running in the background, and I can also follow the log using the tail command:
$tail -f vmstatrecord.app
The file updates at a 60-second interval.
Now my questions are:
1. The process continues to write to the file, so what will happen if I leave it running for days?
Assumption:
If the process writes to the file forever, I am afraid the file size might grow too large.
2. If my assumption is correct and my steps are inefficient, is there any alternative to achieve what I am trying to achieve with the steps above?

This question would be better asked on superuser.com or maybe serverfault.com, as it's not about programming.
Yes, your file will keep growing. That's what the second parameter of vmstat is for: run vmstat 60 1440 to stop after a day (note 1440 = 60 minutes * 24 hours, i.e. one sample per minute for a day). Once when I had this problem, I made a crontab entry:
0 0 * * * vmstat 60 1440 > /some/where/vmstat.out
to restart the output every day.
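If you also want to keep a few days of history without any single file growing forever, a variation (a sketch, assuming GNU date and that /some/where is writable) is to put the date in the filename and prune old files; note that % must be escaped as \% inside a crontab:
0 0 * * * vmstat 60 1440 > /some/where/vmstat.$(date +\%F).out
5 0 * * * find /some/where -name 'vmstat.*.out' -mtime +7 -delete
This keeps roughly a week of daily logs and deletes anything older.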

Related

how to kill long-running jobs in every one-hour interval?

I want to find all the jobs which have been running for more than one hour and kill them, then sleep for 60 minutes, then search again for any job running more than 60 minutes, and loop the process.
If you want to find the PIDs of the processes that have been running for more than 60 minutes on your Linux box, you can use a very simple and basic shell script like the one below:
#!/bin/sh
MIN=60
SEC=$((MIN*60))
# etimes= prints the elapsed time in seconds, pid= the process ID, both without headers
ps -eo etimes=,pid= | while read sec pid; do
    if [ "${sec}" -gt "${SEC}" ]; then
        echo "${pid}"
        #kill -9 ${pid} # remove the # at the beginning of the line to actually kill those processes
    fi
done
This will display the PIDs of the long-running processes, one per line.
Assuming you name this script 60min.sh, you can run it every 60 minutes using a cron job:
0 * * * * /bin/bash /path_to/60min.sh
This cron job will run your 60min.sh script every 60 minutes (i.e. every hour).
Please keep in mind that you might accidentally kill system processes, and your system might become unstable or unusable, forcing a reboot.
If you run your processes as a specific Linux user, I would recommend searching only the processes belonging to that user, and not those of root.
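For example, a sketch of the same loop restricted to one user (the user name appuser is hypothetical; substitute your own):
#!/bin/sh
MIN=60
SEC=$((MIN*60))
# -u appuser lists only processes owned by that user (hypothetical name)
ps -u appuser -o etimes=,pid= | while read sec pid; do
    if [ "${sec}" -gt "${SEC}" ]; then
        echo "${pid}"
    fi
done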

How do I activate a cron command once within a specific time frame?

Basic information about my system: I have a music system where people can schedule songs to start and end at a specific time.
OS: Arch Linux
It sets two cron jobs at the moment: one at, let's say, 1:50 (the start time, with a command like "play etc") and another at 3:20 (the end time, with a command like "end etc").
My setup works perfectly and I can end and delete schedules etc., but I have now noticed an issue! If I set the above times and turn the system off (my system is a Raspberry Pi), then turn it back on at, let's say, 2:00, I have missed the 1:50 deadline and the music doesn't start (obviously). I want to make it so that no matter what time I turn it on within the range, let's say 1:50 - 3:20, it will run the play command. But it should run the command only once!
I looked around and the commands I found were like:
0 1.50-3.20/2 * * * your_command.sh
But that runs every 2 hours. I want it to run only once between these times?
Thanks!
You could add an additional cron job which starts a script on every reboot. For instance, you could add a line like this to your crontab:
@reboot /home/pi/startplayback.sh
Your startplayback.sh script should check whether the current time is within the desired period and run the desired command if it is. For example, the code below will print PLAY! if the script is run between 1:50 and 3:20. You could replace echo 'PLAY!' with whatever command you need.
#!/bin/bash
# current time as HHMM, e.g. 0207
current=$(date '+%H%M')
# force base-10 interpretation (a leading zero would otherwise make bash read it as octal)
(( current=(10#$current) ))
# 1:50 -> 150, 3:20 -> 320
(( current > 150 && current < 320 )) && echo 'PLAY!'
P.S. Don't forget to make your script executable: sudo chmod +x startplayback.sh
You might want to look at the at command and its utilities.
SYNOPSIS
at [-q queue] [-f file] [-mldbv] time
at [-q queue] [-f file] [-mldbv] -t [[CC]YY]MMDDhhmm[.SS]
at -c job [job ...]
at -l [job ...]
at -l -q queue
at -r job [job ...]
atq [-q queue] [-v]
atrm job [job ...]
batch [-q queue] [-f file] [-mv] [time]
at is good for scheduling one-time jobs to be run at some point in the future. It maintains a queue of these jobs, so you can use it to schedule things with a great variety of different time specifications.
Cron, in my opinion, is a scheduler for jobs that are to be repeated over and over.
So a quick and dirty example for you:
echo 'ls -lathF' | at now + 1 minute
As expected you will see a job to be run in one minute. Try atq to see the list of jobs.
When the job is done, output will be mailed to your user by default.
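For your case, a sketch of scheduling the two one-shot commands with at (here "play etc" and "end etc" stand in for your real commands):
echo 'play etc' | at 01:50
echo 'end etc' | at 03:20
Unlike a crontab entry, each job runs exactly once; note that if the given time has already passed today, at schedules the job for tomorrow, so you would still combine this with a boot-time check like the script above.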
I solved the issue by creating a PHP file that is loaded on reboot, does its work, and then redirects back to such and such.

Bash script: CPU stress test while watching clock speed

I am totally new to this forum and also new to bash, so please bear with me :).
I would like to write a bash script to conduct a CPU stress test while observing the clock speed. Therefore, I have done the following:
1.) For the CPU stress test, I have created a script named "bernoulli" with the following code:
#!/bin/bash
# argument 1: n
function bernoulli()
{
    if (( $1 < 3 ))
    then
        echo 1
    else
        echo $(( $(bernoulli $(( $1 - 1 ))) + $(bernoulli $(( $1 - 2 ))) ))
    fi
}
bernoulli $1
2.) I have figured out that by using the "timeout" command I can kill a task after a specified time. For example,
timeout 30s ./bernoulli 35
starts a task calculating the 35th bernoulli number and the task is killed after 30 seconds.
3.) I also found out that by typing
timeout 30s watch grep \"cpu MHz\" /proc/cpuinfo
I can watch the clock speed of my cores (updated every 2 seconds) for 30 seconds (at which point "timeout 30s" kills this task).
What I want: I would like to do the above stress test and simultaneously observe the clock speed. In other words, I would like to somehow run the two commands
timeout 30s ./bernoulli 35
timeout 30s watch grep \"cpu MHz\" /proc/cpuinfo
"at the same time". I hope I could make it clear what I would like to achieve. Can anyone help with my issue? Thanks a lot for every comment!
How about
timeout 30s ./bernoulli 35 &
timeout 30s watch grep \"cpu MHz\" /proc/cpuinfo
The & at the end makes the first command run in the background, so the second timeout is executed almost instantly after the first one.
PS: this is a rather poor way to test a modern CPU. You will be exercising only a single core and most likely only a limited part of your CPU (no SSE, etc.). It is not trivial to write a CPU benchmark, so you might want to use one that is already available. For example, you can take a look at sysbench with something like sysbench --test=cpu --cpu-max-prime=20000 run.
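If you want to at least keep every core busy with this same script, a minimal variation (a sketch, assuming nproc from GNU coreutils is available) is to start one background instance per core:
# start one 30-second bernoulli run per CPU core, then watch the clocks
for i in $(seq "$(nproc)"); do
    timeout 30s ./bernoulli 35 &
done
timeout 30s watch grep \"cpu MHz\" /proc/cpuinfo
wait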
You can run them in a dedicated shell:
timeout 30s bash -c './bernoulli 35 & watch grep \"cpu MHz\" /proc/cpuinfo'
Note that the single & is not a typo. It is not a logical AND; it runs the bernoulli script in the background.

Pause a shell script every x (milli)seconds to decrease immediate CPU usage

I have a shell script which backs up my MySQL database every hour. It's a basic script:
mysqldump --user=$USER --password=$PASS $DB > /$PATH/$DATE.sql &&
7z a -t7z -mx=9 /$PATH/$DATE.sql.7z /$PATH/$DATE.sql &&
rm /$PATH/$DATE.sql
I'm using a 7z compression, because:
file permissions and owner/group aren't required to be kept
the space saved by 7z compared to gzip is essential to me
What's troubling me is that the 7z part (the second line) takes about 30 seconds and uses quite a lot of CPU during that time. There are 50% peaks on my server load graphs every hour because of this, and I'd like to get rid of those peaks.
I'm already executing this script with nice:
0 * * * * /usr/bin/nice -n 19 /path/to/backup.sh > /dev/null
The only solution I can come up with at this point is to somehow pause the execution of the 7z part, let's say every second for 1 second. This could free up the CPU time for other processes.
Is this possible?
I'm doing something similar in my .php scripts, although the pauses are in loops, for example:
while ($d = mysqli_fetch_assoc($q)) {
    usleep(250000);
    // process $d
}
I'd like the 7z execution to sleep every X (something) for 1 second.
You can use
prlimit --cpu=10 7z a -t7z -mx=9 /$PATH/$DATE.sql.7z /$PATH/$DATE.sql
but note that prlimit's --cpu sets a limit on total CPU time in seconds (the process is signalled once it has consumed 10 CPU-seconds), not a percentage of CPU usage.
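If what you actually want is a percentage cap, the cpulimit utility (a separate package on most distributions) is built for that; a sketch, keeping your variable names:
cpulimit -l 10 7z a -t7z -mx=9 /$PATH/$DATE.sql.7z /$PATH/$DATE.sql
This throttles 7z to roughly 10% of a CPU by repeatedly stopping and resuming it, which stretches the work out over time instead of producing a sharp peak. Alternatively, 7z itself accepts -mmt=1 to restrict compression to a single thread.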
How about configuring a limit for a user/group using the /etc/security/limits.conf file and running the script/compression as a specific user? After setting the limit you can test it with a fork bomb.

Long-running service check in Nagios

I have a service check that I've found on the Nagios Exchange site which works well for small directories, but not well for larger ones that take longer than 30 or 60 seconds to complete.
http://exchange.nagios.org/directory/Plugins/Uncategorized/Operating-Systems/Linux/CheckDirSize/details
The problem I'm having is that I need to configure a service check that Nagios runs once a day but that is allowed to stay open for 1440 minutes (one day). The directory listing is huge and takes many hours to complete (up to 20 hours).
This is my service check (checked every day; when using NRPE, the timeout is 86400 seconds, which is also one day). But for some reason, even though I can see the du -sk running on the command line in ps -ef | grep du, Nagios reports "(Service Check Timed Out)":
define service {
    use                 generic-service,srv-pnp
    host_name           IMAGEServer1
    service_description Images
    check_command       check_nrpe!check_dirsize -t 86400
    check_interval      1440
}
In my nrpe.cfg file on the Linux server I have these two directives as well:
command_timeout=86400
connection_timeout=86400
How can I get Nagios to complete the check and not time out? I was under the impression that my directives above were correct.
What's timing out is the check_nrpe command on the local side (it has a default timeout of 2 minutes). You could edit its command definition to use a long timeout.
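For instance, a sketch of such an edited command definition (the macros and layout follow the stock Nagios sample configuration; adjust paths to your setup), passing check_nrpe's -t option:
define command {
    command_name check_nrpe
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -t 86400 -c $ARG1$
}
The -t here governs how long check_nrpe itself waits, independently of the command_timeout and connection_timeout you set on the remote side in nrpe.cfg.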
Alternatively, you might want to do this as a passive check on IMAGEServer1, running as a cron job.
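A sketch of the passive approach (assuming the Nagios external command file is at the stock location /usr/local/nagios/var/rw/nagios.cmd; if Nagios runs on a different host, you would ship the result with send_nsca instead of writing the file directly):
#!/bin/sh
# cron job: run the long directory check, then feed the result back to Nagios
# as a passive result for host IMAGEServer1, service Images
OUTPUT=$(/path/to/check_dirsize)   # hypothetical plugin invocation; substitute your own
STATUS=$?
echo "[$(date +%s)] PROCESS_SERVICE_CHECK_RESULT;IMAGEServer1;Images;${STATUS};${OUTPUT}" \
    > /usr/local/nagios/var/rw/nagios.cmd
The Images service would then have active_checks_enabled set to 0 and passive_checks_enabled set to 1, so Nagios never times it out.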
