How to count the number of executed instructions of a process ID, including child processes - Linux

I have a Node.js application (a server) deployed as a Docker container, and I want to count the number of instructions executed when I call a function in it.
Here is how I find the container's PID:
$ pstree -p | grep node | grep npm
| |-containerd-shim(114397)-+-npm(114414)-+-sh(114540)---node(114541)-+-{node}(114542)
Then, I need to know the Docker ID:
$ docker ps | grep workload
root#node3:/home/m# docker ps | grep workload | grep npm
c7457f74536b michelgokan/synthetic-workload-generator "npm start" 55 minutes ago Up 55 minutes k8s_whatever_workload-5697bb48f9-gg8j5_default_896e5938-55f2-4875-bf6c-2bff2acbe0c6_0
Now, I know the parent PID is 114397. So I run the following perf command:
$ perf stat -p 114397 -e instructions,cycles,task-clock docker exec -it c7457f74536b curl 127.0.0.1:30005/workload/cpu
1000 CHKSM AND DIFFIEHELLMAN 60 OK!
Performance counter stats for process id '114397':
170057460 instructions # 1.02 insn per cycle
166389574 cycles # 1.575 GHz
105.67 msec task-clock # 0.570 CPUs utilized
0.185362408 seconds time elapsed
It seems it's not including instructions executed by the child processes. So I tried the following:
$ perf stat -p 1,722,114397,114414,114540,114541,114542 -e instructions,cycles,task-clock docker exec -it c7457f74536b curl 127.0.0.1:30005/workload/cpu
1000 CHKSM AND DIFFIEHELLMAN 60 OK!
Performance counter stats for process id '1,722,114397,114414,114540,114541,114542':
249803992 instructions # 1.05 insn per cycle
236979702 cycles # 1.575 GHz
150.47 msec task-clock # 0.832 CPUs utilized
0.180848729 seconds time elapsed
Here, 1 is systemd and 722 is the parent containerd PID of the containers.
Questions:
Is there any way I can provide just the parent PID and have it count the executed instructions of all its child processes as well?
Does my approach make sense, i.e. providing all the PIDs in a comma-separated list?

You can get the PID of one of your processes (the parent) and deduce the others using pgrep.
pgrep has a neat feature --ns which will get you all the processes running in the same PID namespace as a given PID.
Having that, you can get all the child processes, convert them to comma-separated values, and feed them to perf:
$ perf stat -p $(pgrep --ns <pid> | paste -s -d ",") -e instructions,cycles,task-clock docker exec -it c7457f74536b curl 127.0.0.1:30005/workload/cpu
pgrep --ns will get you the PIDs and paste -s -d "," will join them with commas.
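As a minimal sketch of what that command substitution expands to (using the npm PID 114414 from the pstree output above, purely for illustration):
$ pgrep --ns 114414                     # every process in the same PID namespace as 114414
$ pgrep --ns 114414 | paste -s -d ","   # the same PIDs joined into one comma-separated argument for perf -p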

Related

Get average CPU Usage percentage of single process until I kill it in bash script

I have a bash script where I would like to know what percentage of CPU time a process uses until I kill it in the last line of the script.
I know that I could normally do this via the time -v command; however, my process is Erlang/OTP-based, and the time command only measures the statistics of the startup process.
Therefore I'd like to use the process PID, which I can get easily, and use that to get the CPU time percentage until the end of the script.
Currently I'm using pidstat, but it only gives me statistics for regular time intervals.
I want to measure the exact time interval from when the process started until it gets killed.
Peak RAM statistics would also be nice.
Could you recommend me any command I could use in this case?
This is my bash script:
sudo emqx start
sleep 10
mypid=$(sudo emqx pid)
echo $mypid
sudo pidstat -h -r -u -v -p "$mypid" 5 > $local/server_results/test1/emqxstats_$b.txt &
# process for load testing
# jthreads = amount of publishing users
sleep 5
until sudo ~/Downloads/apache-jmeter-5.2.1/bin/jmeter -n -t $local/testplans/csv.jmx -Jport=$a | grep -m 1 "... end of run"; do : ; done
sudo emqx stop
kill %!
So I want to measure the CPU percentage over the interval from starting mosquitto until the Apache JMeter test finishes, when the script reaches the last 2 lines.
Kind regards
I found the command I was looking for after some more research.
ps -o pcpu -p $pid
This was exactly the command I needed, because it calculates the percentage over the lifetime of the process.
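If you also want the peak-RAM part of the question, a hedged addition using standard ps(1) fields (note that rss is the current resident set size, not a true lifetime peak):
ps -o pcpu,rss,etime -p "$mypid"
Run it just before killing the process to capture the lifetime CPU percentage, current memory usage, and elapsed time in one shot.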

Why does SIGHUP not work on busybox sh in an Alpine Docker container?

Sending SIGHUP with
kill -HUP <pid>
to a busybox sh process on my native system works as expected and the shell hangs up. However, if I use docker kill to send the signal to a container with
docker kill -s HUP <container>
it doesn't do anything. The Alpine container is still running:
$ CONTAINER=$(docker run -dt alpine:latest)
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 1 second
$ docker kill -s HUP $CONTAINER
4fea4f2dabe0f8a717b0e1272528af1a97050bcec51babbe0ed801e75fb15f1b
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 7 seconds
By the way, with an Ubuntu container (which runs bash) it does work as expected:
$ CONTAINER=$(docker run -dt debian:latest)
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 1 second
$ docker kill -s HUP $CONTAINER
9a4aff456716397527cd87492066230e5088fbbb2a1bb6fc80f04f01b3368986
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Exited (129) 1 second ago
Sending SIGKILL does work, but I'd rather find out why SIGHUP does not.
Update: I'll add another example. Here you can see that busybox sh generally does hang up on SIGHUP successfully:
$ busybox sh -c 'while true; do sleep 10; done' &
[1] 28276
$ PID=$!
$ ps -e | grep busybox
28276 pts/5 00:00:00 busybox
$ kill -HUP $PID
$
[1]+ Hangup busybox sh -c 'while true; do sleep 10; done'
$ ps -e | grep busybox
$
However, the same infinite sleep loop running inside the Docker container doesn't quit. As you can see, the container is still running after SIGHUP and only exits after SIGKILL:
$ CONTAINER=$(docker run -dt alpine:latest busybox sh -c 'while true; do sleep 10; done')
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 14 seconds
$ docker kill -s HUP $CONTAINER
31574ba7c0eb0505b776c459b55ffc8137042e1ce0562a3cf9aac80bfe8f65a0
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Up 28 seconds
$ docker kill -s KILL $CONTAINER
31574ba7c0eb0505b776c459b55ffc8137042e1ce0562a3cf9aac80bfe8f65a0
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"
Exited (137) 2 seconds ago
$
(I don't have a Docker environment at hand to try this. Just guessing.)
For your case, docker run must be running busybox/sh or bash as PID 1.
According to Docker doc:
Note: A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.
As for the difference between busybox/sh and bash regarding SIGHUP:
On my system (Debian 9.6, x86_64), the signal masks for busybox/sh and bash are as follows:
busybox/sh:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 82817 0.0 0.0 6952 1904 pts/2 S+ 10:23 0:00 busybox sh
PENDING (0000000000000000):
BLOCKED (0000000000000000):
IGNORED (0000000000284004):
3 QUIT
15 TERM
20 TSTP
22 TTOU
CAUGHT (0000000008000002):
2 INT
28 WINCH
bash:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 4871 0.0 0.1 21752 6176 pts/16 Ss 2019 0:00 /usr/local/bin/bash
PENDING (0000000000000000):
BLOCKED (0000000000000000):
IGNORED (0000000000380004):
3 QUIT
20 TSTP
21 TTIN
22 TTOU
CAUGHT (000000004b817efb):
1 HUP
2 INT
4 ILL
5 TRAP
6 ABRT
7 BUS
8 FPE
10 USR1
11 SEGV
12 USR2
13 PIPE
14 ALRM
15 TERM
17 CHLD
24 XCPU
25 XFSZ
26 VTALRM
28 WINCH
31 SYS
As we can see, busybox/sh does not catch SIGHUP, so the signal is ignored. Bash catches SIGHUP, so docker kill can deliver the signal to Bash, and Bash will then be terminated because, according to its manual, "the shell exits by default upon receipt of a SIGHUP".
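If you want to reproduce this check yourself without extra tooling, the same bitmaps are exposed in /proc (a small sketch; <pid> is the shell's PID):
grep -E '^Sig(Pnd|Blk|Ign|Cgt)' /proc/<pid>/status   # pending/blocked/ignored/caught signal masks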
UPDATE 2020-03-07 #1:
I did a quick test and my previous analysis is basically correct. You can verify it like this:
[STEP 104] # docker run -dt debian busybox sh -c \
'trap exit HUP; while true; do sleep 1; done'
331380090c59018dae4dbc17dd5af9d355260057fdbd2f2ce9fc6548a39df1db
[STEP 105] # docker ps
CONTAINER ID IMAGE COMMAND CREATED
331380090c59 debian "busybox sh -c 'trap…" 11 seconds ago
[STEP 106] # docker kill -s HUP 331380090c59
331380090c59
[STEP 107] # docker ps
CONTAINER ID IMAGE COMMAND CREATED
[STEP 108] #
As I showed earlier, by default busybox/sh does not catch SIGHUP, so the signal will be ignored. But after busybox/sh explicitly traps SIGHUP, the signal will be delivered to it.
I also tried SIGKILL, and yes, it will always terminate the running container. This is reasonable, since SIGKILL cannot be caught by any process, so the signal will always be delivered to the container and kill it.
UPDATE 2020-03-07 #2:
You can also verify it this way (much simpler):
[STEP 110] # docker run -ti alpine
/ # ps
PID USER TIME COMMAND
1 root 0:00 /bin/sh
7 root 0:00 ps
/ # kill -HUP 1 <-- this does not kill it because linux ignored the signal
/ #
/ # trap 'echo received SIGHUP' HUP
/ # kill -HUP 1
received SIGHUP <-- this indicates it can receive SIGHUP now
/ #
/ # trap exit HUP
/ # kill -HUP 1 <-- this terminates it because the action changed to `exit`
[STEP 111] #
Like the other answer already points out, the docs for docker run contain the following note:
Note: A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so.
This is the reason why SIGHUP doesn't work on busybox sh inside the container. However, if I run busybox sh on my native system, it won't have PID 1 and therefore SIGHUP works.
There are various solutions:
Use --init to specify an init process which should be used as PID 1 (a sketch follows after this list).
You can use the --init flag to indicate that an init process should be used as the PID 1 in the container. Specifying an init process ensures the usual responsibilities of an init system, such as reaping zombie processes, are performed inside the created container.
The default init process used is the first docker-init executable found in the system path of the Docker daemon process. This docker-init binary, included in the default installation, is backed by tini.
Trap SIGHUP and call exit yourself.
docker run -dt alpine busybox sh -c 'trap exit HUP ; while true ; do sleep 60 & wait $! ; done'
Use another shell, like bash, which exits on SIGHUP by default, regardless of whether it runs as PID 1 or not.
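As a rough sketch of the first option, using the same loop as above (with --init, tini runs as PID 1 and forwards received signals to the shell, so SIGHUP should terminate it again):
$ CONTAINER=$(docker run -dt --init alpine busybox sh -c 'while true; do sleep 10; done')
$ docker kill -s HUP $CONTAINER
$ docker ps -a --filter "id=$CONTAINER" --format "{{.Status}}"   # should now report the container as exited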

Finding Docker container processes? (from host point of view)

I am doing some tests on docker and containers and I was wondering:
Is there a method I can use to find all processes associated with a Docker container, by its name or ID, from the host's point of view?
After all, at the end of the day a container is a set of virtualized processes.
You can use the docker top command.
This command lists all processes running within your container.
For instance, this command on a single-process container on my box displays:
UID PID PPID C STIME TTY TIME CMD
root 14097 13930 0 23:17 pts/6 00:00:00 /bin/bash
All methods mentioned by others are also possible to use but this one should be easiest.
Update:
To simply get the main process ID within the container, use this command:
docker inspect -f '{{.State.Pid}}' <container id>
Another way to get an overview of all Docker processes running on a host is using generic cgroup based systemd tools.
systemd-cgls will show all our cgroups and the processes running in them in a tree-view, like this:
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 21
├─docker
│ ├─070a034d27ed7a0ac0d336d72cc14671584cc05a4b6802b4c06d4051ce3213bd
│ │ └─14043 bash
│ ├─dd952fc28077af16a2a0a6a3231560f76f363359f061c797b5299ad8e2614245
│ │ └─3050 go-cron -s 0 0 * * * * -- automysqlbackup
As every Docker container has its own cgroup, you can also see Docker Containers and their corresponding host processes this way.
Two interesting properties of this method:
It works even if the Docker Daemon(s) are defunct.
It's a pretty quick overview.
You can also use systemd-cgtop to get an overview of the resource usage of Docker Containers, similar to top.
By the way: Since systemd services also correspond to cgroups these methods are also applicable to non-Dockerized systemd services.
I found a similar solution using a bash script in one line:
for i in $(docker container ls --format "{{.ID}}"); do docker inspect -f '{{.State.Pid}} {{.Name}}' $i; done
The process running in a Docker container is a child of a process named containerd-shim (in Docker v18.09.4).
First figure out the process IDs of the containerd-shim processes.
For each of them, find their child process.
pgrep containerd-shim
7105
7141
7248
To find the child process of parent process 7105:
pgrep -P 7105
7127
In the end you could get the list with:
for i in $(pgrep containerd-shim); do pgrep -P $i; done
7127
7166
7275
When running this on the host, it will give you a list of processes running in a container with <Container ID>, showing host PIDs instead of container PIDs.
DID=$(docker inspect -f '{{.State.Pid}}' <Container ID>);ps --ppid $DID -o pid,ppid,cmd
docker ps will list docker containers that are running.
docker exec <id|name> ps will tell you the processes it's running.
Since the following command shows only the container's own main process ID (not all child processes):
docker inspect -f '{{.State.Pid}}' <container-name_or_ID>
To find which container a process belongs to, that process ID must be found under the /proc directory. So find the "processID" there, read the container hash from the cgroup file under
/proc/parent_process/task/processID
then cut the container ID from the hash (the first 12 digits of the container hash) and find the container itself:
#!/bin/bash
# p2c: map a host PID to the Docker container it runs in
processPath=$(find /proc/ -name "$1" 2>/dev/null)
containerID=$(cat "${processPath}/cgroup" | fgrep 'pids:/docker/' | sed -e 's#.*/docker/##g' | cut -c 1-12)   # first 12 characters of the container hash
docker ps | fgrep "$containerID"
Save above script in a file such as: p2c and run it by:
p2c <PID>
For example:
p2c 85888
Another solution, using docker ps and docker top:
docker ps --format "{{.ID}}" | xargs -I'{}' docker top {} -o pid | awk '!/PID/'
Note: awk '!/PID/' just removes the PID header from the output of docker top
If you want to see the whole process tree of the Docker containers, you can try:
docker ps --format "{{.ID}}" | xargs -I'{}' docker top {} -o pid | awk '!/PID/' | xargs -I'{}' pstree -psa {}
docker stats "container id"
shows the resource consumption (including a PIDS column), or simply use docker ps.
Probably this cheat sheet can be of use.
http://theearlybirdtechnology.com/2017/08/12/docker-cheatsheet/

jobs -l after nohup

How can I monitor a job that is still running (I guess detached?) after I started it with nohup, exited the server and logged back in? Normally, I use jobs -l to see what's running, but this is showing blank.
You need to understand the difference between a process and a job. Jobs are managed by the shell, so when you end your terminal session and start a new one, you are now in a new instance of Bash with its own jobs table. You can't access jobs from the original shell but as the other answers have noted, you can still find and manipulate the processes that were started. For example:
$ nohup sleep 60 &
[1] 27767
# Our job is in the jobs table
$ jobs
[1]+ Running nohup sleep 60 &
# And this is the process we started
$ ps -p 27767
PID TTY TIME CMD
27767 pts/1 00:00:00 sleep
$ exit # and start a new session
# Now jobs returns nothing because the jobs table is empty
$ jobs
# But our process is still alive and kicking...
$ ps -p 27767
PID TTY TIME CMD
27767 pts/1 00:00:00 sleep
# Until we decide to kill it
$ kill 27767
# Now the process is gone
$ ps -p 27767
PID TTY TIME CMD
You can monitor whether the process is still running using
ps -p <pid>, where <pid> is the ID of the process you got after using the nohup command.
If you see a valid entry, your process is probably still alive.
You could list the processes running under the current user with ps -u "$USER" or ps -u "$(whoami)".
Try this :
ps -ef | grep <pid>

Limiting certain processes to CPU % - Linux

I have the following problem: some processes, generated dynamically, have a tendency to eat 100% of CPU. I would like to limit all the processes matching some criterion (e.g. process name) to a certain CPU percentage.
The specific problem I'm trying to solve is harnessing folding@home worker processes. The best solution I could think of is a Perl script that's executed periodically and uses the cpulimit utility to limit the processes (if you're interested in more details, check this blog post). It works, but it's a hack :/
Any ideas? I would like to leave the handling of processes to the OS :)
Thanks again for the suggestions, but we're still missing the point :)
The "slowDown" solution is essentially what the "cpulimit" utility does. I still have to take care about what processes to slow down, kill the "slowDown" process once the worker process is finished and start new ones for new worker processes. It's precisely what I did with the Perl script and a cron job.
The main problem is that I don't know beforehand what processes to limit. They are generated dynamically.
Maybe there's a way to limit all the processes of one user to a certain CPU percentage? I already set up a user for executing the folding@home jobs, hoping that I could limit it with the /etc/security/limits.conf file. But the nearest I could get there is the total CPU time per user...
It would be cool to have something that enables you to say:
"The sum of all CPU % usage of this user's processes cannot exceed 50%". And then let the processes fight for that 50% of CPU according to their priorities...
Guys, thanks for your suggestions, but it's not about priorities - I want to limit the CPU % even when there's plenty of CPU time available. The processes are already low priority, so they don't cause any performance issues.
I would just like to prevent the CPU from running on 100% for extended periods...
I had a slightly similar issue with gzip.
Assuming we want to decrease the CPU usage of a gzip process:
gzip backup.tar & sleep 2 & cpulimit --limit 10 -e gzip -z
Options:
I found sleep useful, as cpulimit sometimes didn't pick up the new gzip process immediately
--limit 10 limits gzip CPU usage to 10%
-z automatically closes cpulimit when gzip process finishes
Another option is to run the cpulimit daemon.
I don't remember and don't think there is something like this in the Unix scheduler. You need a little program which controls the other process and does the following:
loop
    wait for some time tR
    send SIGSTOP to the process you want to be scheduled
    wait for some time tP
    send SIGCONT to the process
loopEnd
The ratio tR/tP controls the CPU load.
Here is a little proof of concept. "busy" is the program which uses up your CPU time and which you want to be slowed down by "slowDown":
> cat > busy.c:
main() { while (1) {}; }
> cc -o busy busy.c
> busy &
> top
Tasks: 192 total, 3 running, 189 sleeping, 0 stopped, 0 zombie
Cpu(s): 76.9% us, 6.6% sy, 0.0% ni, 11.9% id, 4.5% wa, 0.0% hi, 0.0% si
Mem: 6139696k total, 6114488k used, 25208k free, 115760k buffers
Swap: 9765368k total, 1606096k used, 8159272k free, 2620712k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26539 cg 25 0 2416 292 220 R 90.0 0.0 3:25.79 busy
...
> cat > slowDown
while true; do
kill -s SIGSTOP $1
sleep 0.1
kill -s SIGCONT $1
sleep 0.1
done
> chmod +x slowDown
> slowDown 26539 &
> top
Tasks: 200 total, 4 running, 192 sleeping, 4 stopped, 0 zombie
Cpu(s): 48.5% us, 19.4% sy, 0.0% ni, 20.2% id, 9.8% wa, 0.2% hi, 2.0% si
Mem: 6139696k total, 6115376k used, 24320k free, 96676k buffers
Swap: 9765368k total, 1606096k used, 8159272k free, 2639796k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
26539 cg 16 0 2416 292 220 T 49.7 0.0 6:00.98 busy
...
OK, that script needs some more work (for example, to handle being interrupted and let the controlled process continue in case it was stopped at that moment), but you get the point. I would also write that little script in C or similar and compute the CPU ratio from a command-line argument...
regards
I think in Linux there is no built-in way to cap the CPU usage, but there is an acceptable way to limit any process to a certain CPU usage: http://ubuntuforums.org/showthread.php?t=992706
In case they remove the info, here it is again:
INSTALL PACKAGES
Install cpulimit package.
Code:
sudo apt-get install cpulimit
Install gawk package.
Code:
sudo apt-get install gawk
CREATE CPULIMIT DAEMON FILE
Open a text editor with root privileges and save the daemon script below to a new file, /usr/bin/cpulimit_daemon.sh
Code:
#!/bin/bash
# ==============================================================
# CPU limit daemon - set PID's max. percentage CPU consumptions
# ==============================================================
# Variables
CPU_LIMIT=20 # Maximum percentage CPU consumption by each PID
DAEMON_INTERVAL=3 # Daemon check interval in seconds
BLACK_PROCESSES_LIST= # Limit only processes defined in this variable. If variable is empty (default) all violating processes are limited.
WHITE_PROCESSES_LIST= # Limit all processes except processes defined in this variable. If variable is empty (default) all violating processes are limited.
# Check if one of the variables BLACK_PROCESSES_LIST or WHITE_PROCESSES_LIST is defined.
if [[ -n "$BLACK_PROCESSES_LIST" && -n "$WHITE_PROCESSES_LIST" ]] ; then # If both variables are defined then error is produced.
echo "At least one or both of the variables BLACK_PROCESSES_LIST or WHITE_PROCESSES_LIST must be empty."
exit 1
elif [[ -n "$BLACK_PROCESSES_LIST" ]] ; then # If this variable is non-empty then set NEW_PIDS_COMMAND variable to bellow command
NEW_PIDS_COMMAND="top -b -n1 -c | grep -E '$BLACK_PROCESSES_LIST' | gawk '\$9>CPU_LIMIT {print \$1}' CPU_LIMIT=$CPU_LIMIT"
elif [[ -n "$WHITE_PROCESSES_LIST" ]] ; then # If this variable is non-empty then set NEW_PIDS_COMMAND variable to bellow command
NEW_PIDS_COMMAND="top -b -n1 -c | gawk 'NR>6' | grep -E -v '$WHITE_PROCESSES_LIST' | gawk '\$9>CPU_LIMIT {print \$1}' CPU_LIMIT=$CPU_LIMIT"
else
NEW_PIDS_COMMAND="top -b -n1 -c | gawk 'NR>6 && \$9>CPU_LIMIT {print \$1}' CPU_LIMIT=$CPU_LIMIT"
fi
# Search and limit violating PIDs
while sleep $DAEMON_INTERVAL
do
NEW_PIDS=$(eval "$NEW_PIDS_COMMAND") # Violating PIDs
LIMITED_PIDS=$(ps -eo args | gawk '$1=="cpulimit" {print $3}') # Already limited PIDs
QUEUE_PIDS=$(comm -23 <(echo "$NEW_PIDS" | sort -u) <(echo "$LIMITED_PIDS" | sort -u) | grep -v '^$') # PIDs in queue
for i in $QUEUE_PIDS
do
cpulimit -p $i -l $CPU_LIMIT -z & # Limit new violating processes
done
done
CHANGE VARIABLES TO YOUR ENVIRONMENT NEEDS
CPU_LIMIT
Change this variable in the above script if you would like to limit CPU consumption for every process to a percentage other than 20%. Please read the "If using SMP computer" chapter below if you have an SMP computer (more than 1 CPU, or a CPU with more than 1 core).
DAEMON_INTERVAL
Change this variable in the above script if you would like more or less frequent checking. The interval is in seconds; the default is 3 seconds.
BLACK_PROCESSES_LIST and WHITE_PROCESSES_LIST
Variable BLACK_PROCESSES_LIST limits only specified processes. If variable is empty (default) all violating processes are limited.
Variable WHITE_PROCESSES_LIST limits all processes except processes defined in this variable. If variable is empty (default) all violating processes are limited.
One or both of the variables BLACK_PROCESSES_LIST and WHITE_PROCESSES_LIST have to be empty - it is not logical for both variables to be defined.
You can specify multiple processes in one of these two variables using the delimiter character "|" (without double quotes). Example: if you would like to cpulimit all processes except mysql, firefox and gedit, set the variable: WHITE_PROCESSES_LIST="mysql|firefox|gedit"
PROCEDURE TO AUTOMATICALLY START DAEMON AT BOOT TIME
Set file permissions for root user:
Code:
sudo chmod 755 /usr/bin/cpulimit_daemon.sh
Open a text editor with root privileges and save the script below to a new file, /etc/init.d/cpulimit
Code:
#!/bin/sh
#
# Script to start CPU limit daemon
#
set -e
case "$1" in
start)
if [ $(ps -eo pid,args | gawk '$3=="/usr/bin/cpulimit_daemon.sh" {print $1}' | wc -l) -eq 0 ]; then
nohup /usr/bin/cpulimit_daemon.sh >/dev/null 2>&1 &
ps -eo pid,args | gawk '$3=="/usr/bin/cpulimit_daemon.sh" {print}' | wc -l | gawk '{ if ($1 == 1) print " * cpulimit daemon started successfully"; else print " * cpulimit daemon can not be started" }'
else
echo " * cpulimit daemon can't be started, because it is already running"
fi
;;
stop)
CPULIMIT_DAEMON=$(ps -eo pid,args | gawk '$3=="/usr/bin/cpulimit_daemon.sh" {print $1}' | wc -l)
CPULIMIT_INSTANCE=$(ps -eo pid,args | gawk '$2=="cpulimit" {print $1}' | wc -l)
CPULIMIT_ALL=$((CPULIMIT_DAEMON + CPULIMIT_INSTANCE))
if [ $CPULIMIT_ALL -gt 0 ]; then
if [ $CPULIMIT_DAEMON -gt 0 ]; then
ps -eo pid,args | gawk '$3=="/usr/bin/cpulimit_daemon.sh" {print $1}' | xargs kill -9 # kill cpulimit daemon
fi
if [ $CPULIMIT_INSTANCE -gt 0 ]; then
ps -eo pid,args | gawk '$2=="cpulimit" {print $1}' | xargs kill -9 # release cpulimited process to normal priority
fi
ps -eo pid,args | gawk '$3=="/usr/bin/cpulimit_daemon.sh" {print}' | wc -l | gawk '{ if ($1 == 1) print " * cpulimit daemon can not be stopped"; else print " * cpulimit daemon stopped successfully" }'
else
echo " * cpulimit daemon can't be stopped, because it is not running"
fi
;;
restart)
$0 stop
sleep 3
$0 start
;;
status)
ps -eo pid,args | gawk '$3=="/usr/bin/cpulimit_daemon.sh" {print}' | wc -l | gawk '{ if ($1 == 1) print " * cpulimit daemon is running"; else print " * cpulimit daemon is not running" }'
;;
esac
exit 0
Change file's owner to root:
Code:
sudo chown root:root /etc/init.d/cpulimit
Change permissions:
Code:
sudo chmod 755 /etc/init.d/cpulimit
Add script to boot-up procedure directories:
Code:
sudo update-rc.d cpulimit defaults
Reboot to check if script starts cpulimit daemon at boot time:
Code:
sudo reboot
MANUALLY CHECK, STOP, START AND RESTART DAEMON
Note: Daemon and service have the same meaning in this tutorial.
Note: For users on releases prior to Ubuntu 8.10 (like Ubuntu 8.04 LTS), instead of the service command use the "sudo /etc/init.d/cpulimit status/start/stop/restart" syntax, or install the sysvconfig package using the command: sudo apt-get install sysvconfig
Check if cpulimit service is running
Check command returns: "cpulimit daemon is running" if service is running, or "cpulimit daemon is not running" if service is not running.
Code:
sudo service cpulimit status
Start cpulimit service
You can manually start the cpulimit daemon, which will start limiting CPU consumption.
Code:
sudo service cpulimit start
Stop cpulimit service
The stop command stops the cpulimit daemon (so no new processes will be limited) and also restores full CPU access to all currently limited processes, just as before cpulimit was running.
Code:
sudo service cpulimit stop
Restart cpulimit service
If you change some variable settings in /usr/bin/cpulimit_daemon.sh, like CPU_LIMIT, DAEMON_INTERVAL, BLACK_PROCESSES_LIST or WHITE_PROCESSES_LIST, then you must restart the service afterwards.
Code:
sudo service cpulimit restart
CHECK CPU CONSUMPTION WITH OR WITHOUT CPULIMIT DAEMON
Without daemon
1. stop cpulimit daemon (sudo service cpulimit stop)
2. execute CPU intensive tasks in background
3. execute command: top and check for %CPU column
The %CPU result is probably more than 20% for each process.
With daemon turned on
1. start cpulimit daemon (sudo service cpulimit start)
2. execute the same CPU intensive tasks in background
3. execute command: top and check for %CPU column
The %CPU result should be at most 20% for each process.
Note: Don't forget that at the beginning %CPU can be more than 20%, because the daemon has to catch a violating process within its 3-second interval (set in the script by default).
IF USING SMP COMPUTER
I have tested this code on an Intel dual-core CPU computer, which behaves like an SMP computer. Don't forget that the top command, and also cpulimit, behave in Irix mode by default, where 20% means 20% of one CPU. If there are two CPUs (or a dual-core CPU) then the total %CPU can be 200%. In top, Irix mode can be turned off with the I command (pressing Shift+I while top is running), which turns on Solaris mode, where the total amount of CPU is divided by the number of CPUs, so %CPU can be no more than 100% regardless of the number of CPUs. Please read more about the top command in the top man page (search for the I command). Please also read more about how cpulimit operates on SMP computers on the official cpulimit page.
But how does the cpulimit daemon operate on an SMP computer? Always in Irix mode. So if you would like to allow 20% of CPU power on a 2-CPU computer, then 40% should be used for the CPU_LIMIT variable in the cpulimit daemon script.
UNINSTALL CPULIMIT DAEMON AND CPULIMIT PROGRAM
If you would like to get rid of cpulimit daemon you can clean up your system by removing cpulimit daemon and uninstalling cpulimit program.
Stop cpulimit daemon
Code:
sudo service cpulimit stop # Stop cpulimit daemon and all cpulimited processes
Remove daemon from boot-up procedure
Code:
sudo update-rc.d -f cpulimit remove # Remove symbolic links
Delete boot-up procedure
Code:
sudo rm /etc/init.d/cpulimit # Delete cpulimit boot-up script
Delete cpulimit daemon
Code:
sudo rm /usr/bin/cpulimit_daemon.sh # Delete cpulimit daemon script
Uninstall cpulimit program
Code:
sudo apt-get remove cpulimit
Uninstall gawk program
If you don't need this program for any other script, you can remove it.
Code:
sudo apt-get remove gawk
NOTE ABOUT AUTHORS
I have just written the daemon for cpulimit (the bash scripts above). I am not the author of the cpulimit project. If you need more info about the cpulimit program, please read the official cpulimit web page: http://cpulimit.sourceforge.net/.
Regards,
Abcuser
I had a similar problem, and the other solutions presented in the thread don't address it at all. My solution works for me right now, but it is suboptimal, particularly for the cases where the process is owned by root.
My workaround for now is to try very hard to make sure that I don't have any long-running processes owned by root (like having backups be done only as a user).
I just installed the hardware sensors applet for gnome, and set up alarms for high and low temperatures on the CPU, and then set up the following commands for each alarm:
low:
mv /tmp/hogs.txt /tmp/hogs.txt.$$ && cat /tmp/hogs.txt.$$ | xargs -n1 kill -CONT
high:
touch /tmp/hogs.txt && ps -eo pcpu,pid | sort -n -r | head -1 | gawk '{ print $2 }' >> /tmp/hogs.txt && xargs -n1 kill -STOP < /tmp/hogs.txt
The good news is that my computer no longer overheats and crashes. The downside is that terminal processes get disconnected from the terminal when they get stopped, and don't get reconnected when they get the CONT signal. The other thing is that if it was an interactive program that caused the overheating (like a certain web browser plugin!) then it will freeze in the middle of what I'm doing while it waits for the CPU to cool off. It would be nicer to have CPU scaling take care of this at a global level, but the problem is that I only have two selectable settings on my CPU and the slow setting isn't slow enough to prevent overheating.
Just to reiterate here, this has nothing at all to do with process priority or re-nicing, and obviously nothing to do with stopping jobs that run for a long time. This has to do with preventing CPU utilization from staying at 100% for too long, because the hardware is unable to dissipate the heat quickly enough when running at full capacity (an idle CPU generates less heat than a fully loaded one).
Some other obvious possibilities that might help are:
Lower the CPU speed overall in the BIOS
Replace the heatsink or re-apply the thermal gel to see if that helps
Clean the heatsink with some compressed air
Replace the CPU fan
[edit]
Note: no more overheating at 100% CPU when I disable variable fan speed in the BIOS (Asus P5Q Pro Turbo). With the CPU fully loaded, each core tops out at 49 degrees Celsius.
Using cgroups' cpu.shares does nothing that a nice value wouldn't do. It sounds like you want to actually throttle the processes, which can definitely be done.
You will need to use a script or two, and/or edit /etc/cgconfig.conf to define the parameters you want.
Specifically, you want to edit the values cpu.cfs_period_us and cpu.cfs_quota_us. The process will then be allowed to run for cpu.cfs_quota_us microseconds per cpu.cfs_period_us microseconds.
For example:
If cpu.cfs_period_us = 50000 and cpu.cfs_quota_us = 10000 then the process will receive 20% of the CPU time maximum, no matter what else is going on.
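A minimal by-hand sketch of those numbers, assuming a cgroup v1 hierarchy mounted at /sys/fs/cgroup/cpu (the group name "throttled" and the PID 26539 are placeholders):
sudo mkdir /sys/fs/cgroup/cpu/throttled
echo 50000 | sudo tee /sys/fs/cgroup/cpu/throttled/cpu.cfs_period_us   # period: 50 ms
echo 10000 | sudo tee /sys/fs/cgroup/cpu/throttled/cpu.cfs_quota_us    # quota: 10 ms, i.e. 20% of one CPU
echo 26539 | sudo tee /sys/fs/cgroup/cpu/throttled/tasks               # move the process into the group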
In this screenshot I have given the process 2% of CPU time:
As far as the process is concerned it is running at 100%.
Setting cpu.shares, on the other hand, means the process can and will still use 100% of the idle CPU time.
In this similar example I have given the process cpu.shares = 100 (of 1024):
As you can see the process is still consuming all the idle CPU time.
References:
http://manpages.ubuntu.com/manpages/precise/man5/cgconfig.conf.5.html
http://kennystechtalk.blogspot.co.uk/2015/04/throttling-cpu-usage-with-linux-cgroups.html
In a system that uses systemd, you can run a single process with a maximum CPU usage using a command like the following:
sudo systemd-run --scope --uid=1000 -p CPUQuota=20% my_heavy_computation.sh
This will use up to 20% of a single CPU core (or CPU thread if using hyper-threading).
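If the workload already runs as a systemd service, the same property can be applied to the whole unit instead (a sketch; my-worker.service is a placeholder unit name):
sudo systemctl set-property my-worker.service CPUQuota=20%
systemctl set-property applies the limit immediately and, unless --runtime is given, writes a persistent drop-in so it survives reboots.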
I also had the need to limit CPU time for certain processes. Cpulimit is a good tool, but I always had to manually find out the PID and manually start cpulimit, so I wanted something more convenient.
I came up with this bash script:
#!/bin/bash
function lp
{
ps aux | grep $1 | termsql "select COL1 from tbl" > /tmp/tmpfile
while read line
do
TEST=`ps aux | grep "cpulimit -p $line" | wc -l`
[[ $TEST -eq "2" ]] && continue #cpulimit is already running on this process
cpulimit -p $line -l $2
done < /tmp/tmpfile
}
while true
do
lp gnome-terminal 5
lp system-journal 5
sleep 10
done
This example limits the cpu time of each gnome-terminal instance to 5% and the cpu time of each system-journal instance to 5% as well.
I used another script I created, named termsql, in this example to extract the PID. You can get it here:
https://gitorious.org/termsql/termsql/source/master:
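For reference, here is a variant of the same loop that drops the termsql dependency and uses pgrep instead (a sketch under the same assumptions, not a drop-in replacement):
#!/bin/bash
# Limit every PID whose command line matches $1 to $2 percent CPU,
# skipping PIDs that already have a cpulimit attached to them.
lp()
{
    for pid in $(pgrep -f "$1"); do
        pgrep -f "cpulimit -p $pid -l" > /dev/null && continue   # already limited
        cpulimit -p "$pid" -l "$2" &
    done
}
while true
do
    lp gnome-terminal 5
    lp system-journal 5
    sleep 10
done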
You can limit the amount of CPU time with cgroups. The OS will handle resource management, just like you ask for in your question.
Here is a wiki with examples: https://wiki.archlinux.org/index.php/Cgroups
Since the processes already run as a separate user, limiting all processes from that user should be the easiest solution.
I see at least two options:
Use "ulimit -t" in the shell that creates your process
Use "nice" at process creation or "renice" during runtime
PS + GREP + NICE
The nice command will probably help.
This can be done using setrlimit(2) (specifically by setting the RLIMIT_CPU parameter).
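In a shell the same limit is exposed through ulimit; a minimal sketch (the worker command is a placeholder, and note this caps total CPU seconds rather than a percentage):
( ulimit -t 300; exec ./cpu_heavy_worker )   # the kernel sends SIGXCPU after 300 CPU-seconds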
Throwing some sleep calls in there should force the process off the CPU for a certain time. If you sleep 30 seconds once a minute, your process shouldn't average more than 50% CPU usage during that minute.
I don't really see why you want to limit the CPU time... you should limit the total load on that machine, and the load is mostly determined by IO operations.
Ex: if I create a while(1){} loop, it will take the total load to 1.0, but if this loop does some disk writes, the load jumps to 2.0... 4.0. And that's what is killing your machine, not the CPU usage. The CPU usage can be easily handled by nice/renice.
Anyway, you could make a script that does a 'kill -SIGSTOP PID' for a specific PID when the load gets too high, and a 'kill -SIGCONT' when everything gets back to normal... The PIDs can be determined using the 'ps aux' command, because I see that it displays the CPU usage, and you should be able to sort the list by that column. I think the whole thing could be done in bash...
You could scale down the CPU frequency. Then you don't have to worry about the individual processes. When you need more CPU, scale the frequency back up.
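A hedged sketch of doing that through sysfs, assuming the cpufreq interface is present (the 1.2 GHz value and cpu0 are placeholders; repeat for each core as needed):
echo 1200000 | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq   # cap at 1.2 GHz (value in kHz)
cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq                       # hardware maximum, for restoring later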
