On a Linux/Unix server, when CPU usage goes above a threshold value, an email alert needs to be sent out. Propose a way to do this with a crontab entry and shell scripting.
This can be done with the following shell script and a frequent cron job.
cpu_monitor.sh
#!/bin/bash
# Take 5 one-second sar samples and keep only the integer part of the average
# %idle column, so the numeric comparison below works.
CPU=$(sar 1 5 | grep "Average" | awk '{print int($NF)}')
if [ "$CPU" -lt 20 ]
then
    # Less than 20% idle means more than 80% CPU usage: send the alert.
    cat mail_content.html | /usr/lib/sendmail -t
else
    echo "Normal"
fi
mail_content.html
From: donotreply@sample.com
To: info@sample.com
Subject: Subject of the mail
Mime-Version: 1.0
Content-Type: text/html
<h1>CPU usage is too high</h1>
Here the script samples the CPU idle percentage once per second, five times. The average of those idle percentages is stored in the variable CPU. When the idle percentage drops below 20% (i.e. usage rises above 80%), the mail is sent out.
We can set up the cron job to run every 5 minutes.
*/5 * * * * cd /full/path/to/script/; ./cpu_monitor.sh;
I would use the uptime command, which displays (at the end) the load average for the last 15 minutes. That should be a reasonable time range to avoid alerting on short spikes in CPU usage. The load average reflects the number of processes in a runnable state: if it is less than the number of CPU cores, the machine is not CPU bound, while if it is more than the number of CPU cores, the machine will start to suffer. Let's say we agree that a load of more than 2 times the number of cores is the level at which we want to be alerted.
# get the number of CPU cores using /proc/cpuinfo
#
ncpu=$(egrep -c '^processor' /proc/cpuinfo)
#
# get the last number (only integer part) of uptime output
#
load=$(LC_ALL=C uptime | sed -e 's/.*average: .* \([0-9]*\)\.[0-9][0-9]$/\1/')
#
# divide the $load number by $ncpu and check if it is greater than the alert level
#
if [ $(($load/$ncpu)) -gt 2 ]
then
# send an e-mail to alerts@domain.tld.
echo -e "\nCPU usage is greater than twice the number of CPU cores for more than 15 minutes.\n" \
| fold -s -w75 | mailx -s "CPU alert for $(hostname)" alerts@domain.tld
fi
It may be run from crontab adding this text as /etc/cron.d/simplecpucheck:
MAILTO=alerts@domain.tld
10,40 * * * * root /path/name/of/shell-script
Please note that I prefer to send the e-mail from the shell script rather than relying on the cron daemon, since that lets me use a more informative subject line. I still leave the email address in the crontab file in order to receive any other errors (mainly syntax errors in my script and commands not found).
I have an extremely long and complicated shell pipeline set up to grab 2.2 GB of data and process it. It currently takes 45 minutes to process. The pipeline is a number of cut, grep, sort, uniq, grep, and awk commands tied together. I suspect that the grep portion is causing it to take so much time, but I have no way of confirming it.
Is there any way to "profile" the entire pipeline from end to end, to determine which component is the slowest and whether it is CPU or I/O bound, so it can be optimised?
Unfortunately I cannot post the entire command here, as it would require posting proprietary information, but from checking it out with htop I suspect it is the following bit:
grep -v '^[0-9]'
One way to do this is to gradually build up the pipeline, timing each addition, and taking as much out of the equation as possible (such as outputting to a terminal or file). A very simple example is shown below:
pax:~$ time ( cat bigfile >/dev/null )
real 0m4.364s
user 0m0.004s
sys 0m0.300s
pax:~$ time ( cat bigfile | tr 'a' 'b' >/dev/null )
real 0m0.446s
user 0m0.312s
sys 0m0.428s
pax:~$ time ( cat bigfile | tr 'a' 'b' | tail -1000l >/dev/null )
real 0m0.796s
user 0m0.516s
sys 0m0.688s
pax:~$ time ( cat bigfile | tr 'a' 'b' | tail -1000l | sort -u >/dev/null )
real 0m0.892s
user 0m0.556s
sys 0m0.756s
If you add up the user and system times above, you'll see that the incremental increases are:
0.304 (0.004 + 0.300) seconds for the cat;
0.436 (0.312 + 0.428 - 0.304) seconds for the tr;
0.464 (0.516 + 0.688 - 0.436 - 0.304) seconds for the tail; and
0.108 (0.556 + 0.756 - 0.464 - 0.436 - 0.304) seconds for the sort.
This tells me that the main things to look into are the tail and the tr.
Now obviously, that's for CPU only, and I probably should have done multiple runs at each stage for averaging purposes, but that's the basic first approach I would take.
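For instance, one rough way to average out run-to-run noise is to repeat each timing a few times (a sketch; bigfile stands in for the real input):
for run in 1 2 3; do
    time ( cat bigfile | tr 'a' 'b' >/dev/null )
done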
If it turns out it really is your grep, there are a few other options available to you. There are numerous other commands that can strip lines not starting with a digit, but you may find that a custom-built command for doing this is faster still; in pseudo-code it looks something like this (untested, but you should get the idea):
state = echo
lastchar = newline
while not end of file:
read big chunk from file
for every char in chunk:
if lastchar is newline:
if state is echo and char is non-digit:
state = skip
else if state is skip and char is digit:
state = echo
if state is echo:
output char
lastchar = char
Custom, targeted code like this can sometimes be made more efficient than a general-purpose regex processing engine, simply because it can be optimised to the specific case. Whether that's true in this case, or any case for that matter, is something you should test. My number one optimisation mantra is measure, don't guess!
I found the problem myself after some further experimentation. It appears to be due to the encoding support in grep. Using the following hung the pipeline:
grep -v '^[0-9]'
I replaced it with sed as follows and it finished in under 45 seconds!
sed '/^[0-9]/d'
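If locale handling really was the culprit, forcing the C locale for just that stage would also have been worth timing (an untested sketch; bigfile stands in for the real input):
time LC_ALL=C grep -v '^[0-9]' bigfile > /dev/null
time sed '/^[0-9]/d' bigfile > /dev/null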
This is straightforward with zsh:
zsh-4.3.12[sysadmin]% time sleep 3 | sleep 5 | sleep 2
sleep 3 0.01s user 0.03s system 1% cpu 3.182 total
sleep 5 0.01s user 0.01s system 0% cpu 5.105 total
sleep 2 0.00s user 0.05s system 2% cpu 2.121 total
I want to create a near 100% load on a Linux machine. It's quad core system and I want all cores going full speed. Ideally, the CPU load would last a designated amount of time and then stop. I'm hoping there's some trick in bash. I'm thinking some sort of infinite loop.
I use stress for this kind of thing. You can tell it how many cores to max out, and it allows for stressing memory and disk as well.
Example to stress 2 cores for 60 seconds
stress --cpu 2 --timeout 60
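Since the answer mentions memory and disk as well, a combined invocation might look like this (flags as listed in the stress manual; an untested sketch):
stress --cpu 2 --io 2 --vm 1 --vm-bytes 256M --timeout 60s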
You can also do
dd if=/dev/zero of=/dev/null
To run more of those to put load on more cores, try to fork it:
fulload() { dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null | dd if=/dev/zero of=/dev/null & }; fulload; read; killall dd
Repeat the dd command inside the curly braces as many times as the number of threads you want to produce (here, 4 threads).
Simply pressing Enter will stop it (just make sure no other dd is running as this user, or you will kill it too).
I think this one is simpler. Open a terminal, type the following, and press Enter.
yes > /dev/null &
To fully utilize a modern CPU, one instance is not enough; you may need to repeat the command once per core to exhaust all the CPU power.
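For example, a sketch that starts one yes per core (assuming nproc is available):
for i in $(seq "$(nproc)"); do
    yes > /dev/null &
done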
To end all of this, simply put
killall yes
The idea was originally found here; although it was intended for Mac users, it should work for *nix as well.
Although I'm late to the party, this post is among the top results in the Google search "generate load in linux".
The result marked as the solution could be used to generate a system load, but I prefer to use sha1sum /dev/zero to impose a load on a CPU core.
The idea is to calculate a hash sum from an infinite data stream (e.g. /dev/zero, /dev/urandom, ...); the process will try to max out a CPU core until it is aborted.
To generate a load on more cores, multiple commands can be piped together.
E.g. to generate a 2-core load:
sha1sum /dev/zero | sha1sum /dev/zero
To load 3 cores for 5 seconds:
seq 3 | xargs -P0 -n1 timeout 5 yes > /dev/null
This results in high kernel (sys) load from the many write() system calls.
If you prefer mostly userland cpu load:
seq 3 | xargs -P0 -n1 timeout 5 md5sum /dev/zero
If you just want the load to continue until you press Ctrl-C:
seq 3 | xargs -P0 -n1 md5sum /dev/zero
One core (doesn't invoke external process):
while true; do true; done
Two cores:
while true; do /bin/true; done
The latter only makes both of mine go to ~50% though...
This one will make both go to 100%:
while true; do echo; done
Here is a program that you can download Here.
Install it easily on your Linux system:
./configure
make
make install
and launch it with a simple command line:
stress -c 40
to stress your CPUs (however many you have) with 40 threads, each running a complex sqrt computation on randomly generated numbers.
You can even define the timeout of the program
stress -c 40 --timeout 10s
Unlike the proposed solution with the dd command, which deals essentially with I/O and therefore doesn't really overload your system, the stress program really overloads the system because it is doing computation.
An infinite loop is the idea I also had. A freaky-looking one is:
while :; do :; done
(: is the same as true, does nothing and exits with zero)
You can call that in a subshell and run it in the background. Doing that $num_cores times should be enough. After sleeping for the desired time you can kill them all; you get the PIDs with jobs -p (hint: xargs), as sketched below.
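A minimal sketch of that idea (the core count and 60-second duration are just placeholders):
num_cores=$(nproc)                  # or however you prefer to count cores
for i in $(seq "$num_cores"); do
    ( while :; do :; done ) &      # one busy subshell per core
done
sleep 60                            # the desired duration
jobs -p | xargs kill                # kill all the background loops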
:(){ :|:& };:
This fork bomb will wreak havoc on the CPU and will likely crash your computer.
I would split the thing into two scripts:
infinite_loop.bash:
#!/bin/bash
while [ 1 ] ; do
# Force some computation even if it is useless to actually work the CPU
echo $((13**99)) 1>/dev/null 2>&1
done
cpu_spike.bash:
#!/bin/bash
# Either use environment variables for NUM_CPU and DURATION, or define them here
for i in `seq ${NUM_CPU}`; do
# Put an infinite loop on each CPU
./infinite_loop.bash &   # assumes both scripts are in the current directory
done
# Wait DURATION seconds then stop the loops and quit
sleep ${DURATION}
killall infinite_loop.bash
To increase the load or consume 100% (or X%) of the CPU:
sha1sum /dev/zero &
On some systems this will increase the load in steps of X%; in that case you have to run the same command multiple times.
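For example, to keep four cores busy you could start four instances (a sketch):
for i in 1 2 3 4; do sha1sum /dev/zero & done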
Then you can see the CPU usage by typing the command
top
To release the load:
killall sha1sum
cat /dev/urandom > /dev/null
#!/bin/bash
duration=120 # seconds
instances=4 # cpus
endtime=$(($(date +%s) + $duration))
for ((i=0; i<instances; i++))
do
while (($(date +%s) < $endtime)); do :; done &
done
I've used bc (the arbitrary-precision calculator), asking it for pi with a huge number of decimals.
$ for ((i=0;i<$NUMCPU;i++));do
echo 'scale=100000;pi=4*a(1);0' | bc -l &
done ;\
sleep 4; \
killall bc
with NUMCPU (under Linux):
$ NUMCPU=$(grep $'^processor\t*:' /proc/cpuinfo |wc -l)
This method is strong but seems system-friendly, as I've never crashed a system using it.
#!/bin/bash
while [ 1 ]
do
#Your code goes here
done
I searched the Internet for something like this and found this very handy CPU hammer script.
#!/bin/sh
# unixfoo.blogspot.com
if [ -n "$1" ]; then
NUM_PROC=$1
else
NUM_PROC=10
fi
for i in `seq 0 $((NUM_PROC-1))`; do
awk 'BEGIN {for(i=0;i<10000;i++)for(j=0;j<10000;j++);}' &
done
Using examples mentioned here, but also help from IRC, I developed my own CPU stress testing script. It uses a subshell per thread and the endless loop technique. You can also specify the number of threads and the amount of time interactively.
#!/bin/bash
# Simple CPU stress test script
# Read the user's input
echo -n "Number of CPU threads to test: "
read cpu_threads
echo -n "Duration of the test (in seconds): "
read cpu_time
# Run an endless loop on each thread to generate 100% CPU
echo -e "\E[32mStressing ${cpu_threads} threads for ${cpu_time} seconds...\E[37m"
for i in $(seq ${cpu_threads}); do
let thread=${i}-1
(taskset -cp ${thread} $BASHPID; while true; do true; done) &
done
# Once the time runs out, kill all of the loops
sleep ${cpu_time}
echo -e "\E[32mStressing complete.\E[37m"
kill 0
Utilizing ideas from here, I created code which exits automatically after a set duration, so you don't have to kill the processes:
#!/bin/bash
echo "Usage : ./killproc_ds.sh 6 60 (6 threads for 60 secs)"
# Define variables
NUM_PROCS=${1:-6} #How much scaling you want to do
duration=${2:-20} # seconds
function infinite_loop {
endtime=$(($(date +%s) + $duration))
while (($(date +%s) < $endtime)); do
#echo $(date +%s)
echo $((13**99)) 1>/dev/null 2>&1
dd if=/dev/urandom count=10000 status=none | bzip2 -9 > /dev/null 2>&1
done
echo "Done Stressing the system - for thread $1"
}
echo Running for duration $duration secs, spawning $NUM_PROCS threads in background
for i in `seq ${NUM_PROCS}` ;
do
# Put an infinite loop
infinite_loop $i &
done
You can try to test the performance of cryptographic algorithms.
openssl speed -multi 4
If you do not want to install additional software, you may use a compression utility which utilizes all CPU cores automatically. For example, xz:
cat /dev/zero | xz -T0 > /dev/null
This takes an infinite stream of dummy data from /dev/zero and compresses it using all the cores available in the system.
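If you also want the load to stop after a set time, one option is to wrap it in timeout (a sketch; the 60-second limit is just an example):
timeout 60 xz -T0 < /dev/zero > /dev/null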
This does the trick for me:
bash -c 'for (( I=100000000000000000000 ; I>=0 ; I++ )) ; do echo $(( I+I*I )) & echo $(( I*I-I )) & echo $(( I-I*I*I )) & echo $(( I+I*I*I )) ; done' &>/dev/null
and it uses nothing except bash.
To enhance dimba's answer and provide something more pluggable (because I needed something similar), I have written the following using the dd load-up concept :D
It will check the current number of cores and create that many dd threads.
Start and end the core load with Enter.
#!/bin/bash
load_dd() {
dd if=/dev/zero of=/dev/null
}
fulload() {
unset LOAD_ME_UP_SCOTTY
export cores="$(grep proc /proc/cpuinfo -c)"
for i in $( seq 1 $( expr $cores - 1 ) )
do
export LOAD_ME_UP_SCOTTY="${LOAD_ME_UP_SCOTTY}$(echo 'load_dd | ')"
done
export LOAD_ME_UP_SCOTTY="${LOAD_ME_UP_SCOTTY}$(echo 'load_dd &')"
eval ${LOAD_ME_UP_SCOTTY}
}
echo press return to begin and stop fullload of cores
read
fulload
read
killall -9 dd
Dimba's dd if=/dev/zero of=/dev/null is definitely correct, but it is also worth mentioning how to verify that the CPU is maxed out at 100% usage. You can do this with
ps -axro pcpu | awk '{sum+=$1} END {print sum}'
This asks for ps output of a 1-minute average of the cpu usage by each process, then sums them with awk. While it's a 1 minute average, ps is smart enough to know if a process has only been around a few seconds and adjusts the time-window accordingly. Thus you can use this command to immediately see the result.
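To put that number in context of the machine's total capacity, you could also relate it to the core count (a small sketch assuming nproc is available):
ps -axro pcpu | awk -v n="$(nproc)" '{sum+=$1} END {printf "%.1f%% of %d%%\n", sum, n*100}'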
awk is a good way to write a long-running loop that's CPU bound without generating a lot of memory traffic or system calls, and without using any significant amount of memory or polluting caches, so it slows down other cores by a minimal amount. (stress or stress-ng can also do that, if you have either of them installed and use a simple CPU-stress method.)
awk 'BEGIN{for(i=0;i<100000000;i++){}}' # about 3 seconds on 4GHz Skylake
It's a counted loop so you can make it exit on its own after a finite amount of time. (Awk uses FP numbers, so a limit like 2^54 might not be reachable with i++ due to rounding, but that's way larger than needed for a few seconds to minutes.)
To run it in parallel, use a shell loop to start it in the background n times
for i in {1..6};do awk 'BEGIN{for(i=0;i<100000000;i++){}}' & done
###### 6 threads each running about 3 seconds
$ for i in {1..6};do awk 'BEGIN{for(i=0;i<100000000;i++){}}' & done
[1] 3047561
[2] 3047562
[3] 3047563
[4] 3047564
[5] 3047565
[6] 3047566
$ # this shell is usable.
(wait a while before pressing return)
[1] Done awk 'BEGIN{for(i=0;i<100000000;i++){}}'
[2] Done awk 'BEGIN{for(i=0;i<100000000;i++){}}'
[3] Done awk 'BEGIN{for(i=0;i<100000000;i++){}}'
[4] Done awk 'BEGIN{for(i=0;i<100000000;i++){}}'
[5]- Done awk 'BEGIN{for(i=0;i<100000000;i++){}}'
[6]+ Done awk 'BEGIN{for(i=0;i<100000000;i++){}}'
$
I used perf to see what kind of load it put on the CPU: it runs 2.6 instructions per clock cycle, so it's not the most friendly to a hyperthread sharing the same physical core. But it has a very small cache footprint, getting negligible cache misses even in L1d cache. And strace will show it makes no system calls until exit.
$ perf stat -r5 -d awk 'BEGIN{for(i=0;i<100000000;i++){}}'
Performance counter stats for 'awk BEGIN{for(i=0;i<100000000;i++){}}' (5 runs):
3,277.56 msec task-clock # 0.997 CPUs utilized ( +- 0.24% )
7 context-switches # 2.130 /sec ( +- 12.29% )
1 cpu-migrations # 0.304 /sec ( +- 40.00% )
180 page-faults # 54.765 /sec ( +- 0.18% )
13,708,412,234 cycles # 4.171 GHz ( +- 0.18% ) (62.29%)
35,786,486,833 instructions # 2.61 insn per cycle ( +- 0.03% ) (74.92%)
9,696,339,695 branches # 2.950 G/sec ( +- 0.02% ) (74.99%)
340,155 branch-misses # 0.00% of all branches ( +-122.42% ) (75.08%)
12,108,293,527 L1-dcache-loads # 3.684 G/sec ( +- 0.04% ) (75.10%)
217,064 L1-dcache-load-misses # 0.00% of all L1-dcache accesses ( +- 17.23% ) (75.10%)
48,695 LLC-loads # 14.816 K/sec ( +- 31.69% ) (49.90%)
5,966 LLC-load-misses # 13.45% of all LL-cache accesses ( +- 31.45% ) (49.81%)
3.28711 +- 0.00772 seconds time elapsed ( +- 0.23% )
The most "friendly" to the other hyperthread on an x86 CPU would be a C program like this, which just runs a pause instruction in a loop. (Or portably, a Rust program that runs std::hint::spin_loop.) As far as the OS's process scheduler, it stays in user-space (nothing like a yield() system call), but in hardware it doesn't take up many resources, letting the other logical core have the front-end for multiple cycles.
#include <immintrin.h>
int main(){ // use atoi(argv[1])*10000ULL as a loop count if you want.
while(1) _mm_pause();
}
I combined some of the answers and added a way to scale the stress to all available CPUs:
#!/bin/bash
function infinite_loop {
while [ 1 ] ; do
# Force some computation even if it is useless to actually work the CPU
echo $((13**99)) 1>/dev/null 2>&1
done
}
# Either use environment variables for DURATION, or define them here
NUM_CPU=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || sysctl -n hw.ncpu)
PIDS=()
for i in `seq ${NUM_CPU}` ;
do
# Put an infinite loop on each CPU
infinite_loop &
PIDS+=("$!")
done
# Wait DURATION seconds then stop the loops and quit
sleep ${DURATION}
# Parent kills its children
for pid in "${PIDS[@]}"
do
kill $pid
done
Just paste this bad boy into an SSH session or the console of any server running Linux. You can kill the processes manually, but I just shut down the server when I'm done; it's quicker.
Edit: I have updated this script so it now has a timer feature, so there is no need to kill the processes.
read -p "Please enter the number of minutes for test >" MINTEST && [[ $MINTEST == ?(-)+([0-9]) ]]; NCPU="$(grep -c ^processor /proc/cpuinfo)"; ((endtime=$(date +%s) + ($MINTEST*60))); NCPU=$((NCPU-1)); for ((i=1; i<=$NCPU; i++)); do while (($(date +%s) < $endtime)); do : ; done & done
I want to limit the execution time of a program I am running under Linux. I put in my scons script a line like:
Command("com","","ulimit -t 1; myprogram")
and tested it with an infinite loop program: it did not work and the program ran forever.
Am I missing something?
ulimit -t 1 means that the limit is set to 1 second of CPU time. If your infinite loop program uses any sort of sleep in its inner loop then it will use practically no CPU time. This means it will not get killed within 1 second of real, on-the-clock time. In fact it may take minutes or hours to use up its 1-second allocation.
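A quick way to see the difference from an interactive shell (a sketch; each ulimit only affects its own subshell):
( ulimit -t 1; while :; do :; done )        # busy loop: killed after about 1 second of CPU time
( ulimit -t 1; while :; do sleep 1; done )  # sleeping loop: barely uses CPU, keeps running (Ctrl-C to stop)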
What happens if you run the command outside of SCons? Perhaps you don't have permission to change the limit at all...
ulimit -t 1; ./myprogram
For example, it may say the following if the limit is already set to 0:
bash: ulimit: cpu time: cannot modify limit: Operation not permitted
Edit: it seems that the -t option is broken on Ubuntu 9.04. A fix has been committed 05 June 2009, but it may take a while to trickle into the updates - it may not be fixed until 9.10.
As an historical note, this problem no longer exists in Ubuntu 10.04.
You can also use this script:
(taken from http://newsgroups.derkeiler.com/Archive/Comp/comp.sys.mac.system/2005-12/msg00247.html)
#!/bin/sh
# timeout script
#
usage()
{
echo "usage: timeout seconds command args ..."
exit 1
}
[ $# -lt 2 ] && usage
seconds=$1; shift
timeout()
{
sleep $seconds
kill -9 $pid >/dev/null 2>/dev/null
}
eval "$@" &
pid=$!
timeout &
wait $pid