I think I may have a memory leak in my LAMP application (memory gets used up, swap starts getting used, etc.). If I could see how much memory the various processes are using, it might help me resolve my problem. Is there a way for me to see this information in *nix?
Getting accurate memory usage figures is trickier than one might think. The best way I could find is:
echo 0 $(awk '/TYPE/ {print "+", $2}' /proc/`pidof PROCESS`/smaps) | bc
Where "PROCESS" is the name of the process you want to inspect and "TYPE" is one of:
Rss: resident memory usage, all memory the process uses, including all memory this process shares with other processes. It does not include swap;
Shared: memory that this process shares with other processes;
Private: private memory used by this process, you can look for memory leaks here;
Swap: swap memory used by the process;
Pss: Proportional Set Size, a good overall memory indicator. It is the Rss adjusted for sharing: if a process has 1MiB private and 20MiB shared between other 10 processes, Pss is 1 + 20/10 = 3MiB
Other valid values are Size (i.e. virtual size, which is almost meaningless) and Referenced (the amount of memory currently marked as referenced or accessed).
You can use watch or some other bash-script-fu to keep an eye on those values for processes that you want to monitor.
For more information about smaps: http://www.kernel.org/doc/Documentation/filesystems/proc.txt.
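The per-field sum can also be done in a single awk call, without bc. This is only a sketch: the helper name smaps_sum is made up for illustration, and it reads smaps-formatted text on stdin so it can be tried on a saved copy of the file. It matches the key by prefix, so "Private_" covers Private_Clean and Private_Dirty, while "Swap:" (with the colon) avoids also picking up SwapPss on newer kernels.

```shell
# smaps_sum PREFIX: add up the kB value of every smaps line whose key
# starts with PREFIX. (Hypothetical helper; reads smaps text from stdin.)
smaps_sum() {
    awk -v prefix="$1" '
        index($1, prefix) == 1 { sum += $2 }   # key begins with the prefix
        END { printf "%d kB\n", sum }          # prints "0 kB" if nothing matched
    '
}

# Possible usage on a live process (PID is a placeholder):
#   smaps_sum Private_ < /proc/PID/smaps
#   smaps_sum Swap:    < /proc/PID/smaps
```

Passing the colon for exact keys is a design choice here: it keeps one function usable both for a single field and for a family of fields.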
I don't know why the answers seem so complicated... It seems pretty simple to do this with ps:
mem()
{
ps -eo rss,pid,euser,args:100 --sort %mem | grep -v grep | grep -i "$@" | awk '{printf $1/1024 "MB"; $1=""; print }'
}
Example usage:
$ mem mysql
0.511719MB 781 root /bin/sh /usr/bin/mysqld_safe
0.511719MB 1124 root logger -t mysqld -p daemon.error
2.53516MB 1123 mysql /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
Use ps to find the process id for the application, then use top -p1010 (substitute 1010 for the real process id).
The RES column is the used physical memory and the VIRT column is the used virtual memory - including libraries and swapped memory.
More info can be found using "man top"
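If you want the same two numbers in a script rather than in top's display, they are also exposed in /proc/PID/status: VmRSS corresponds to RES and VmSize to VIRT. A small sketch, assuming a Linux /proc; the function name rss_and_virt is made up, and it reads the status text from stdin:

```shell
# rss_and_virt: print "RSS_kB VIRT_kB" taken from /proc/PID/status text on stdin.
# (Hypothetical helper name; kernel threads have no Vm* lines and print "0 0".)
rss_and_virt() {
    awk '
        $1 == "VmSize:" { virt = $2 }   # virtual size, top calls it VIRT
        $1 == "VmRSS:"  { rss = $2 }    # resident set size, top calls it RES
        END { printf "%d %d\n", rss, virt }
    '
}

# Possible usage (1010 is the example pid from above):
#   rss_and_virt < /proc/1010/status
```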
First get the pid:
ps ax | grep [process name]
And then:
top -p PID
You can watch several processes at the same time:
top -p PID1 -p PID2
You can use pmap to report memory usage.
Synopsis:
pmap [ -x | -d ] [ -q ] pids...
In case you don't have a current or long running process to track, you can use /usr/bin/time.
This is not the same as Bash time (as you will see).
Eg
# /usr/bin/time -f "%M" echo
2028
This is "Maximum resident set size of the process during its lifetime, in Kilobytes" (quoted from the man page). That is, the same as RES in top et al.
There is a lot more you can get from /usr/bin/time.
# /usr/bin/time -v echo
Command being timed: "echo"
User time (seconds): 0.00
System time (seconds): 0.00
Percent of CPU this job got: 0%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.00
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 1988
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 77
Voluntary context switches: 1
Involuntary context switches: 0
Swaps: 0
File system inputs: 0
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
More elegant approach:
echo "Memory usage for PID <>:"; for mem in {Private,Rss,Shared,Swap,Pss};do grep $mem /proc/<pid>/smaps | awk -v mem_type="$mem" '{i=i+$2} END {print mem_type,"memory usage:"i}' ;done
Use top or htop and pay attention to the "RES" (resident memory size) column.
Thanks. I used this to create this simple bash script that can be used to watch a process and its memory usage:
$ watch watchmypid.sh
#!/bin/bash
#
PROCESSNAME=changethistoyourprocessname
MYPID=`pidof $PROCESSNAME`
echo "=======";
echo PID:$MYPID
echo "--------"
Rss=`echo 0 $(cat /proc/$MYPID/smaps | grep Rss | awk '{print $2}' | sed 's#^#+#') | bc;`
Shared=`echo 0 $(cat /proc/$MYPID/smaps | grep Shared | awk '{print $2}' | sed 's#^#+#') | bc;`
Private=`echo 0 $(cat /proc/$MYPID/smaps | grep Private | awk '{print $2}' | sed 's#^#+#') | bc;`
Swap=`echo 0 $(cat /proc/$MYPID/smaps | grep Swap | awk '{print $2}' | sed 's#^#+#') | bc;`
Pss=`echo 0 $(cat /proc/$MYPID/smaps | grep Pss | awk '{print $2}' | sed 's#^#+#') | bc;`
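# Note: Rss already includes the shared and private pages, and Pss is a reweighted Rss, so this sum counts the same memory more than once.
Mem=`echo "$Rss + $Shared + $Private + $Swap + $Pss"|bc -l`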
echo "Rss " $Rss
echo "Shared " $Shared
echo "Private " $Private
echo "Swap " $Swap
echo "Pss " $Pss
echo "=================";
echo "Mem " $Mem
echo "=================";
The tool you want is ps.
To get information about what java programs are doing:
ps -F -C java
To get information about http:
ps -F -C httpd
If your program is ending before you get a chance to run these, open another terminal and run:
while true; do ps -F -C myCoolCode ; sleep 0.5s ; done
You can use pmap + awk.
Most likely, we're interested in the RSS memory which is the 3rd column in the last line of the example pmap output below (82564).
$ pmap -x <pid>
Address Kbytes RSS Dirty Mode Mapping
....
00007f9caf3e7000 4 4 4 r---- ld-2.17.so
00007f9caf3e8000 8 8 8 rw--- ld-2.17.so
00007fffe8931000 132 12 12 rw--- [ stack ]
00007fffe89fe000 8 8 0 r-x-- [ anon ]
ffffffffff600000 4 0 0 r-x-- [ anon ]
---------------- ------ ------ ------
total kB 688584 82564 9592
Awk is then used to extract that value.
$ pmap -x <pid> | awk '/total/ { print $4 "K" }'
The pmap values are in kilobytes. If we wanted it in megabytes, we could do something like this.
$ pmap -x <pid> | awk '/total/ { print $4 / 1024 "M" }'
Why all these complicated answers with various shell scripts?
Use htop: it scales the size units automatically, lets you select which info is shown, and works in the terminal, so it does not require a desktop.
Example: htop -d8
Use
ps u `pidof $TASKS_LIST` or ps u -C $TASK
ps xu --sort %mem
ps h -o pmem -C $TASK
Example:
ps-of()
{
ps u `pidof "$@"`
}
$ ps-of firefox
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
const 18464 5.9 9.4 1190224 372496 ? Sl 11:28 0:33 /usr/lib/firefox/firefox
$ alias ps-mem="ps xu --sort %mem | sed -e :a -e '1p;\$q;N;6,\$D;ba'"
$ ps-mem
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
const 3656 0.0 0.4 565728 18648 ? Sl Nov21 0:56 /usr/bin/python /usr/lib/ubuntuone-client/ubuntuone-syncdaemon
const 11361 0.3 0.5 1054156 20372 ? Sl Nov25 43:50 /usr/bin/python /usr/bin/ubuntuone-control-panel-qt
const 3402 0.0 0.5 1415848 23328 ? Sl Nov21 1:16 nautilus -n
const 3577 2.3 2.0 1534020 79844 ? Sl Nov21 410:02 konsole
const 18464 6.6 12.7 1317832 501580 ? Sl 11:28 1:34 /usr/lib/firefox/firefox
$ ps h -o pmem -C firefox
12.7
Related
I need to write a Bash script that does the following:
1. In the "top" command, I would like to filter the processes by a given COMMAND. In the following I use Google Chrome as an example, which appears as "chrome" in the COMMAND column.
2. After filtering, there can be zero, one or more processes with COMMAND "chrome" left (this is just to highlight that there is not exactly one process with COMMAND "chrome" in general).
3. Now I would like to write the current time (hh:mm:ss), the PID of the process and the %CPU value displayed for this process to a file "logfile".
4. Repeat steps 1 to 3 once every second.
Example: Assuming that there are three "chrome" processes, the output in "logfile" should look something like below (for the first three seconds):
17:49:12 7954 14.0
17:49:12 7969 9.3
17:49:12 2626 1.3
17:49:13 7954 12.0
17:49:13 7969 6.3
17:49:13 2626 1.2
17:49:14 7954 14.7
17:49:14 7969 8.5
17:49:14 2626 2.1
My ideas so far: Using the command
top -b -n 1 -p 7954 | tail -n 2 | head -n 2 | awk '{print $1, $9}' >> logfile
I filter top by PID (in this case PID == 7954) and the output looks like
PID %CPU
7954 6.6
However, since I actually want to filter by COMMAND, I do not know how to do that. In the line above, the "-p 7954" does the filtering for PID==7954, but what do I need to write here to filter by COMMAND==chrome? Also, how can I remove or avoid the header?
According to the time step: I found that the command
date +"%T"
gives me the time in the correct format (hh:mm:ss).
So I am just struggling with putting these pieces together and fixing the filtering problem mentioned above. Thank you for any help!
Awk can do this; awk '/regex/ { print }' performs the print action only on lines matching regex.
However, you can (and perhaps also should) subsume head and tail as well:
top -b -n 1 | awk 'NR>1 && $10 == "chrome" {print strftime("%T"), $1, $9}'
... assuming the tenth field of top output contains the command name.
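If top is not a hard requirement, ps can select by command name directly and print without headers, which sidesteps the field-counting problem. A rough sketch, assuming procps ps (for -C); the function names stamp_lines and log_cpu are made up. One caveat: ps reports %CPU as CPU time used divided by the time the process has been running, so the numbers will not match top's per-interval values.

```shell
# stamp_lines TS: prefix each "PID %CPU" line read from stdin with TS.
# (Hypothetical helper; blank or short lines are skipped.)
stamp_lines() {
    awk -v ts="$1" 'NF >= 2 { print ts, $1, $2 }'
}

# log_cpu NAME: print one "hh:mm:ss PID %CPU" line per process whose
# command name is NAME. The trailing "=" after each column suppresses its header.
log_cpu() {
    ps -C "$1" -o pid= -o pcpu= | stamp_lines "$(date +%T)"
}

# Possible usage, once per second until interrupted:
#   while true; do log_cpu chrome >> logfile; sleep 1; done
```

Keeping the formatting in its own function makes it easy to check on canned input, and it prints nothing at all when no process matches.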
however what do I need to write here to filter by COMMAND==chrome
Write a small script to accomplish this, say calc_proc_mem, which looks like below:
#!/bin/bash
if [ -z "$1" ] #checking if first param exist
then
echo "Usage : calc_proc_mem process_name"
exit 1 # Exiting with a non-zero value
else
proc_ids=( $(pgrep "$1") )
if [ ${#proc_ids[@]} -eq 0 ] #checking if pgrep returned nothing
then
echo "$1 : Process Not Running/No such process"
exit 1 # Exiting with a non-zero value
else
echo "$1's %CPU-%MEM usage as on $(date +%F)" >> logfile
while true
do
for proc_id in "${proc_ids[@]}"
do
usage="$(ps -p "$proc_id" -o %cpu,%mem | awk -v pid="$proc_id" 'NR==2{printf "PID : %-10d %%CPU : %f %%MEM : %f\n",pid,$1,$2}' 2>/dev/null)"
echo -e "$(date +%H:%M:%S)\t$usage" >> logfile
done
sleep 3
done
fi
fi
Run the script as
./calc_proc_mem process_name
Sample Output
chrome's %CPU-%MEM usage as on 2016-06-27
23:40:33 PID : 3983 %CPU : 1.300000 %MEM : 2.200000
23:40:33 PID : 8448 %CPU : 0.100000 %MEM : 4.300000
23:40:33 PID : 8464 %CPU : 0.000000 %MEM : 0.400000
23:40:33 PID : 8470 %CPU : 0.000000 %MEM : 0.200000
23:40:33 PID : 8526 %CPU : 0.000000 %MEM : 3.000000
23:40:33 PID : 8529 %CPU : 0.000000 %MEM : 0.200000
23:40:33 PID : 8563 %CPU : 0.000000 %MEM : 1.500000
23:40:33 PID : 8655 %CPU : 0.300000 %MEM : 4.900000
23:40:33 PID : 32450 %CPU : 0.300000 %MEM : 2.100000
Note
Since the script runs an infinite while loop, you need to terminate it manually with Ctrl+C.
You can drop the '-p PID' option and grep by COMMAND instead:
top -b -n 1 | grep 'chrome' | tail -n 2 | head -n 2 | awk '{print $1, $9}'
Another command sample to get you going could be:
$ cmd="sleep"; for j in {1..3}; do (${cmd} 123 &); done;
$ ts=$(date +"%T"); top -b -n 1| sed s/^[^\ 0123456789].*$//g |grep "${cmd}"|tr -s '\n'| awk '{print $1, $9, $12}'|sed s/^/"${ts} "/g
19:36:51 35122 0.0 sleep
19:36:51 35124 0.0 sleep
19:36:51 35126 0.0 sleep
It prints the time as given by the date call and, from top, the PID, %CPU and COMMAND fields. The headers and non-matching data lines are filtered out by sed (which blanks lines that do not start with a digit; a leading space is also accepted, because otherwise small, space-padded PIDs would be suppressed =( ) and by grep on the command. The timestamp is prepended by sed, which injects the stored timestamp and a blank space at the start of each line.
It is not elegant but might fit your needs to have a start.
The pgrep solutions or using awk with a regex look better ... but at least I enjoyed trying to solve it with top. The tail and head stages in the pipe look suspicious to me ...
I just ran into a swap problem, so I tried to find which process was using swap with the script (getswap.sh) shown below. It was php-fpm: about 200 subprocesses, each of which took about 1M of swap space. So I killed php-fpm. Then I ran the script again, and the total swap used decreased a lot. However, the result in free -m only decreased by about 3M. What is the problem?
before killing php-fpm:
[root@eng /tmp]# bash getswap.sh | sort -n -k5>out
[root@eng /tmp]# cat out|awk '{a+=$5;}END{print a;}'
202076
[root@eng /tmp]# free -m
total used free shared buffers cached
Mem: 64259 60566 3692 0 192 17098
-/+ buffers/cache: 43275 20983
Swap: 4095 155 3940
after killing php-fpm:
[root@eng /tmp]# bash getswap.sh | sort -n -k5>out
[root@eng /tmp]# cat out|awk '{a+=$5;}END{print a;}'
108456
[root@eng /tmp]# free -m
total used free shared buffers cached
Mem: 64259 60402 3857 0 192 17043
-/+ buffers/cache: 43166 21092
Swap: 4095 152 3943
and the script:
#!/bin/bash
function getswap {
SUM=0
OVERALL=0
for DIR in `find /proc/ -maxdepth 1 -type d | egrep "^/proc/[0-9]"` ; do
PID=`echo $DIR | cut -d / -f 3`
PROGNAME=`ps -p $PID -o comm --no-headers`
for SWAP in `grep Swap $DIR/smaps 2>/dev/null| awk '{ print $2 }'`
do
let SUM=$SUM+$SWAP
done
echo "PID=$PID - Swap used: $SUM - ($PROGNAME )"
let OVERALL=$OVERALL+$SUM
SUM=0
done
echo "Overall swap used: $OVERALL"
}
getswap
thx in advance
I tried to figure this out on my own, but for some reason I cannot, so can you please help me fix this? I am using the proc filesystem to parse information and redirect it to a file. I just cannot get the memory used.
mhz=$(cat /proc/cpuinfo | grep -m 1 "cpu MHz" | cut -d' ' -f 3-)
model=$(cat /proc/cpuinfo | grep -m 1 "model name" | cut -d' ' -f 4-)
memory=$(cat /proc/meminfo | grep MemTotal | cut -d' ' -f 2-)
free=$(cat /proc/meminfo | grep MemFree | cut -d' ' -f 2-)
version=$(cat /proc/version | cut -d' ' -f 3)
echo >> /home/user/data/proc
echo Filename, field name, data >> /home/user/data/proc
echo /proc/cpuinfo, cpu MHz: $mhz >> /home/user/data/proc
echo /proc/cpuinfo, Model Name: $model >> /home/user/data/proc
echo /proc/meminfo, Total Memory: $memory >> /home/user/data/proc
echo /proc/meminfo, Free Memory: $free >> /home/user/data/proc
echo /proc/version, Linux Version: $version >> /home/user/data/proc
This is meant to be a comment, but I don't have multiline in a comment, so here it goes:
mem_total_without_unit=$(</proc/meminfo grep MemTotal | grep -Eo '[0-9]+')
mem_free_without_unit=$(</proc/meminfo grep MemFree | grep -Eo '[0-9]+')
# print the used memory (total minus free)
# customize the unit based on the format of your /proc/meminfo
echo "$((mem_total_without_unit - mem_free_without_unit)) kB"
Example output:
12427860 kB
The answer you get will be meaningless. Ideally, Linux will utilize all your RAM. Caching I/O buffers and text in memory is going to invalidate your calculation.
Now, if the question you are trying to answer is "is my system fully-loaded", a more useful tool is vmstat(8):
$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 1131424 3734228 263588 3188088 1 1 52 91 6 6 41 9 49 1
What you need to watch is the SI (swap-in) column. If you see constant non-zero values, you are having paging issues. The SO (swap-out) value can be ignored.
But, in general, if you are having paging issues, performance will drop like a rock off a cliff.
The web site http://www.linuxatemyram.com/ gives a light-hearted explanation of the kernel's memory management.
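If you still want a single "used" figure that is not distorted by the page cache, one option is to subtract MemAvailable instead of MemFree. MemAvailable is the kernel's own estimate of how much memory could be handed to new workloads without swapping, and it appears in /proc/meminfo on kernels 3.14 and later. A sketch under that assumption; the function name mem_used_kb is made up, and it reads meminfo text from stdin:

```shell
# mem_used_kb: print MemTotal - MemAvailable (in kB) from meminfo text on stdin.
# (Hypothetical helper; on kernels without MemAvailable it just prints MemTotal.)
mem_used_kb() {
    awk '
        $1 == "MemTotal:"     { total = $2 }
        $1 == "MemAvailable:" { avail = $2 }
        END { printf "%d\n", total - avail }
    '
}

# Possible usage:
#   echo "/proc/meminfo, Used Memory: $(mem_used_kb < /proc/meminfo) kB"
```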
Under Linux, how do I find out which process is using the most swap space?
The best script I found is on this page : http://northernmost.org/blog/find-out-what-is-using-your-swap/
Here's one variant of the script and no root needed:
#!/bin/bash
# Get current swap usage for all running processes
# Erik Ljungstrom 27/05/2011
# Modified by Mikko Rantalainen 2012-08-09
# Pipe the output to "sort -nk3" to get sorted output
# Modified by Marc Methot 2014-09-18
# removed the need for sudo
SUM=0
OVERALL=0
for DIR in `find /proc/ -maxdepth 1 -type d -regex "^/proc/[0-9]+"`
do
PID=`echo $DIR | cut -d / -f 3`
PROGNAME=`ps -p $PID -o comm --no-headers`
for SWAP in `grep VmSwap $DIR/status 2>/dev/null | awk '{ print $2 }'`
do
let SUM=$SUM+$SWAP
done
if (( $SUM > 0 )); then
echo "PID=$PID swapped $SUM KB ($PROGNAME)"
fi
let OVERALL=$OVERALL+$SUM
SUM=0
done
echo "Overall swap used: $OVERALL KB"
Run top, then press O, p, Enter. Now processes should be sorted by their swap usage.
Here is an update as my original answer does not provide an exact answer to the problem as pointed out in the comments. From the htop FAQ:
It is not possible to get the exact size of used swap space of a
process. Top fakes this information by making SWAP = VIRT - RES, but
that is not a good metric, because other stuff such as video memory
counts on VIRT as well (for example: top says my X process is using
81M of swap, but it also reports my system as a whole is using only 2M
of swap. Therefore, I will not add a similar Swap column to htop
because I don't know a reliable way to get this information (actually,
I don't think it's possible to get an exact number, because of shared
pages).
Here's another variant of the script, but meant to give more readable output (you need to run this as root to get exact results):
#!/bin/bash
# find-out-what-is-using-your-swap.sh
# -- Get current swap usage for all running processes
# --
# -- rev.0.3, 2012-09-03, Jan Smid - alignment and intendation, sorting
# -- rev.0.2, 2012-08-09, Mikko Rantalainen - pipe the output to "sort -nk3" to get sorted output
# -- rev.0.1, 2011-05-27, Erik Ljungstrom - initial version
SCRIPT_NAME=`basename $0`;
SORT="kb"; # {pid|kB|name} as first parameter, [default: kb]
[ "$1" != "" ] && { SORT="$1"; }
[ ! -x `which mktemp` ] && { echo "ERROR: mktemp is not available!"; exit; }
MKTEMP=`which mktemp`;
TMP=`${MKTEMP} -d`;
[ ! -d "${TMP}" ] && { echo "ERROR: unable to create temp dir!"; exit; }
>${TMP}/${SCRIPT_NAME}.pid;
>${TMP}/${SCRIPT_NAME}.kb;
>${TMP}/${SCRIPT_NAME}.name;
SUM=0;
OVERALL=0;
echo "${OVERALL}" > ${TMP}/${SCRIPT_NAME}.overal;
for DIR in `find /proc/ -maxdepth 1 -type d -regex "^/proc/[0-9]+"`;
do
PID=`echo $DIR | cut -d / -f 3`
PROGNAME=`ps -p $PID -o comm --no-headers`
for SWAP in `grep Swap $DIR/smaps 2>/dev/null| awk '{ print $2 }'`
do
let SUM=$SUM+$SWAP
done
if (( $SUM > 0 ));
then
echo -n ".";
echo -e "${PID}\t${SUM}\t${PROGNAME}" >> ${TMP}/${SCRIPT_NAME}.pid;
echo -e "${SUM}\t${PID}\t${PROGNAME}" >> ${TMP}/${SCRIPT_NAME}.kb;
echo -e "${PROGNAME}\t${SUM}\t${PID}" >> ${TMP}/${SCRIPT_NAME}.name;
fi
let OVERALL=$OVERALL+$SUM
SUM=0
done
echo "${OVERALL}" > ${TMP}/${SCRIPT_NAME}.overal;
echo;
echo "Overall swap used: ${OVERALL} kB";
echo "========================================";
case "${SORT}" in
name )
echo -e "name\tkB\tpid";
echo "========================================";
cat ${TMP}/${SCRIPT_NAME}.name|sort -r;
;;
kb )
echo -e "kB\tpid\tname";
echo "========================================";
cat ${TMP}/${SCRIPT_NAME}.kb|sort -rh;
;;
pid | * )
echo -e "pid\tkB\tname";
echo "========================================";
cat ${TMP}/${SCRIPT_NAME}.pid|sort -rh;
;;
esac
rm -fR "${TMP}/";
Use smem
smem -s swap -r
Here is a link which tells you both how to install it and how to use it: http://www.cyberciti.biz/faq/linux-which-process-is-using-swap/
It's not entirely clear whether you want to find the process that has the most pages swapped out or the process that caused the most pages to be swapped out.
For the first, you can run top and order by swap (press 'Op'); for the latter, you can run vmstat and look for non-zero entries for 'so'.
Another script variant avoiding the loop in shell:
#!/bin/bash
grep VmSwap /proc/[0-9]*/status | awk -F':' -v sort="$1" '
{
split($1,pid,"/") # Split first field on /
split($3,swp," ") # Split third field on space
cmdlinefile = "/proc/"pid[3]"/cmdline" # Build the cmdline filepath
getline pname[pid[3]] < cmdlinefile # Get the command line from pid
swap[pid[3]] = sprintf("%6i %s",swp[1],swp[2]) # Store the swap used (with unit to avoid rebuilding at print)
sum+=swp[1] # Sum the swap
}
END {
OFS="\t" # Change the output separator to tabulation
print "Pid","Swap used","Command line" # Print header
if(sort) {
getline max_pid < "/proc/sys/kernel/pid_max"
for(p=1;p<=max_pid;p++) {
if(p in pname) print p,swap[p],pname[p] # print the values
}
} else {
for(p in pname) { # Loop over all pids found
print p,swap[p],pname[p] # print the values
}
}
print "Total swap used:",sum # print the sum
}'
Standard usage is script.sh to get the usage per program with random order (down to how awk stores its hashes) or script.sh 1 to sort the output by pid.
I hope I've commented the code enough to tell what it does.
Yet two more variants:
Because top or htop may not be installed on small systems, browsing /proc is always possible.
Even on small systems, you will find a shell...
A shell variant! (Not bash only)
This is exactly the same as lolotux's script, but without any fork to grep, awk or ps. This is a lot quicker!
And as bash is one of the poorest shells regarding performance, a little work was done to ensure this script runs well under dash, busybox and some others. Then (thanks to Stéphane Chazelas) it became a lot quicker again!
#!/bin/sh
# Get current swap usage for all running processes
# Felix Hauri 2016-08-05
# Rewritted without fork. Inspired by first stuff from
# Erik Ljungstrom 27/05/2011
# Modified by Mikko Rantalainen 2012-08-09
# Pipe the output to "sort -nk3" to get sorted output
# Modified by Marc Methot 2014-09-18
# removed the need for sudo
OVERALL=0
for FILE in /proc/[0-9]*/status ;do
SUM=0
while read FIELD VALUE;do
case $FIELD in
Pid: ) PID=$VALUE ;;
Name: ) PROGNAME="$VALUE" ;;
VmSwap: ) SUM=${VALUE%% *} ; break ;;
esac
done <$FILE
[ $SUM -gt 0 ] &&
printf "PID: %9d swapped: %11d KB (%s)\n" $PID $SUM "$PROGNAME"
OVERALL=$((OVERALL+SUM))
done
printf "Total swapped memory: %14u KB\n" $OVERALL
Don't forget to double quote "$PROGNAME"! See Stéphane Chazelas's comment:
read FIELD PROGNAME < <(
perl -ne 'BEGIN{$0="/*/*/../../*/*"} print if /^Name/' /proc/self/status
)
echo $FIELD "$PROGNAME"
Don't try echo $PROGNAME without double quotes on a sensitive system, and be ready to kill the current shell first!
And a perl version
As this has become a not-so-simple script, the time has come to write a dedicated tool in a more efficient language.
#!/usr/bin/perl -w
use strict;
use Getopt::Std;
my ($tot,$mtot)=(0,0);
my %procs;
my %opts;
getopt('', \%opts);
sub sortres {
return $a <=> $b if $opts{'p'};
return $procs{$a}->{'cmd'} cmp $procs{$b}->{'cmd'} if $opts{'c'};
return $procs{$a}->{'mswap'} <=> $procs{$b}->{'mswap'} if $opts{'m'};
return $procs{$a}->{'swap'} <=> $procs{$b}->{'swap'};
};
opendir my $dh,"/proc";
for my $pid (grep {/^\d+$/} readdir $dh) {
if (open my $fh,"</proc/$pid/status") {
my ($sum,$nam)=(0,"");
while (<$fh>) {
$sum+=$1 if /^VmSwap:\s+(\d+)\s/;
$nam=$1 if /^Name:\s+(\S+)/;
}
if ($sum) {
$tot+=$sum;
$procs{$pid}->{'swap'}=$sum;
$procs{$pid}->{'cmd'}=$nam;
close $fh;
if (open my $fh,"</proc/$pid/smaps") {
$sum=0;
while (<$fh>) {
$sum+=$1 if /^Swap:\s+(\d+)\s/;
};
};
$mtot+=$sum;
$procs{$pid}->{'mswap'}=$sum;
} else { close $fh; };
};
};
map {
printf "PID: %9d swapped: %11d (%11d) KB (%s)\n",
$_, $procs{$_}->{'swap'}, $procs{$_}->{'mswap'}, $procs{$_}->{'cmd'};
} sort sortres keys %procs;
printf "Total swapped memory: %14u (%11u) KB\n", $tot,$mtot;
It can be run with one of:
-c sort by command name
-p sort by pid
-m sort by swap values
By default, output is sorted by the VmSwap value from status.
The top command also contains a field to display the number of page faults for a process. The process with maximum page faults would be the process which is swapping most.
For long running daemons it might be that they incur large number of page faults at the beginning and the number does not increase later on. So we need to observe whether the page faults is increasing.
I adapted a different script on the web to this long one-liner:
{ date;for f in /proc/[0-9]*/status; do
awk '{k[$1]=$2} END { if (k["VmSwap:"]) print k["Pid:"],k["Name:"],k["VmSwap:"];}' $f 2>/dev/null;
done | sort -n ; }
Which I then throw into a cronjob and redirect output to a logfile. The information here is the same as accumulating the Swap: entries in the smaps file, but if you want to be sure, you can use:
{ date;for m in /proc/*/smaps;do
awk '/^Swap/ {s+=$2} END { if (s) print FILENAME,s }' $m 2>/dev/null;
done | tr -dc ' [0-9]\n' |sort -k 1n; }
The output of this version is in two columns: pid, swap amount. In the above version, the tr strips the non-numeric components. In both cases, the output is sorted numerically by pid.
Gives totals and percentages for process using swap
smem -t -p
Source : https://www.cyberciti.biz/faq/linux-which-process-is-using-swap/
On Mac OS X, you can run the top command as well, but you need to type "o", then "vsize", then ENTER.
Since the 2015 kernel patch that added SwapPss (https://lore.kernel.org/patchwork/patch/570506/), it is finally possible to get a proportional swap count. If a process has swapped a lot and then forks, each of the two processes is reported as using 50% of the swapped pages. If either then forks again, each process is counted at 33%. Adding up these per-process figures therefore gives the real swap usage instead of a value multiplied by the process count.
In short:
(cd /proc; for pid in [0-9]*; do printf "%5s %6s %s\n" "$pid" "$(awk 'BEGIN{sum=0} /SwapPss:/{sum+=$2} END{print sum}' $pid/smaps)" "$(cat $pid/comm)"; done | sort -k2n,2 -k1n,1)
First column is pid, second column is swap usage in KiB and rest of the line is command being executed. Identical swap counts are sorted by pid.
Above may emit lines such as
awk: cmd. line:1: fatal: cannot open file `15407/smaps' for reading (No such file or directory)
which simply means that process with pid 15407 ended between seeing it in the list for /proc/ and reading the process smaps file. If that matters to you, simply add 2>/dev/null to the end. Note that you'll potentially lose any other possible diagnostics as well.
In a real-world case, other tools reported ~40 MB of swap usage for each apache child running on one server, while this approach showed the actual usage to be between 7 and 3630 KB per child.
That is my one liner:
cat /proc/*/status | grep -E 'VmSwap:|Name:' | grep VmSwap -B1 | cut -d':' -f2 | grep -v '\-\-' | grep -o -E '[a-zA-Z0-9]+.*$' | cut -d' ' -f1 | xargs -n2 echo | sort -k2 -n
The steps in this line are:
Get all the data in /proc/process/status for all processes
Select the fields VmSwap and Name for each
Remove the processes that don't have the VmSwap field
Remove the names of the fields (VmSwap: and Name:)
Remove lines with -- that were added by the previous step
Remove the spaces at the start of the lines
Remove the second part of each process name and " kB" after the swap usage number
Take name and number (process name and swap usage) and put them in one line, one after the other
Sort the lines by the swap usage
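The steps above can also be collapsed into one awk pass. This is a sketch rather than a drop-in replacement: the function name swap_by_name is made up, it reads concatenated status files from stdin, and unlike the pipeline it skips processes whose VmSwap is 0. As in the pipeline, only the first word of the process name is kept.

```shell
# swap_by_name: read concatenated /proc/PID/status text on stdin and print
# "NAME KB" for every process with non-zero VmSwap, sorted by swap usage.
# (Hypothetical helper; kernel threads have no VmSwap line and are skipped.)
swap_by_name() {
    awk '
        $1 == "Name:"             { name = $2 }        # remember the latest name
        $1 == "VmSwap:" && $2 > 0 { print name, $2 }   # pair it with its swap value
    ' | sort -k2 -n
}

# Possible usage:
#   cat /proc/[0-9]*/status 2>/dev/null | swap_by_name
```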
I suppose you could get a good guess by running top and looking for active processes using a lot of memory. Doing this programatically is harder---just look at the endless debates about the Linux OOM killer heuristics.
Swapping is a function of having more memory in active use than is installed, so it is usually hard to blame it on a single process. If it is an ongoing problem, the best solution is to install more memory, or make other systemic changes.
Here's a version that outputs the same as the script by @lolotux, but is much faster (while less readable).
That loop takes about 10 secs on my machine, my version takes 0.019 s, which mattered to me because I wanted to make it into a cgi page.
join -t / -1 3 -2 3 \
<(grep VmSwap /proc/*/status |egrep -v '/proc/self|thread-self' | sort -k3,3 --field-separator=/ ) \
<(grep -H '' --binary-files=text /proc/*/cmdline |tr '\0' ' '|cut -c 1-200|egrep -v '/proc/self|/thread-self'|sort -k3,3 --field-separator=/ ) \
| cut -d/ -f1,4,7- \
| sed 's/status//; s/cmdline//' \
| sort -h -k3,3 --field-separator=:\
| tee >(awk -F: '{s+=$3} END {printf "\nTotal Swap Usage = %.0f kB\n",s}') /dev/null
I don't know of any direct answer as how to find exactly what process is using the swap space, however, this link may be helpful. Another good one is over here
Also, use a good tool like htop to see which processes are using a lot of memory and how much swap overall is being used.
iotop is a very useful tool. It gives live stats of I/O and swap usage per process/thread. By default it shows per thread but you can do iotop -P to get per process info. This is not available by default. You may have to install via rpm/apt.
You can use Procpath (author here), to simplify parsing of VmSwap from /proc/$PID/status.
$ procpath record -f stat,cmdline,status -r 1 -d db.sqlite
$ sqlite3 -column db.sqlite \
'SELECT status_name, status_vmswap FROM record ORDER BY status_vmswap DESC LIMIT 5'
Web Content 192136
okular 186872
thunderbird 183692
Web Content 143404
MainThread 86300
You can also plot VmSwap of processes of interest over time like this. Here I'm recording my Firefox process tree while opening a couple tens of tabs along with starting a memory-hungry application to try to cause it to swap (which wasn't convincing for Firefox, but your kilometrage may vary).
$ procpath record -f stat,cmdline,status -i 1 -d db2.sqlite \
'$..children[?(@.stat.pid == 6029)]'
# interrupt by Ctrl+C
$ procpath plot -d db2.sqlite -q cpu --custom-value-expr status_vmswap \
--title "CPU usage, % vs Swap, kB"
The same answer as @lolotux, but with sorted output:
printf 'Computing swap usage...\n';
swap_usages="$(
SUM=0
OVERALL=0
for DIR in `find /proc/ -maxdepth 1 -type d -regex "^/proc/[0-9]+"`
do
PID="$(printf '%s' "$DIR" | cut -d / -f 3)"
PROGNAME=`ps -p $PID -o comm --no-headers`
for SWAP in `grep VmSwap $DIR/status 2>/dev/null | awk '{ print $2 }'`
do
let SUM=$SUM+$SWAP
done
if (( $SUM > 0 )); then
printf "$SUM KB ($PROGNAME) swapped PID=$PID\\n"
fi
let OVERALL=$OVERALL+$SUM
SUM=0
done
printf '9999999999 Overall swap used: %s KB\n' "$OVERALL"
)"
printf '%s' "$swap_usages" | sort -nk1
Example output:
Computing swap usage...
2064 KB (systemd) swapped PID=1
59620 KB (xfdesktop) swapped PID=21405
64484 KB (nemo) swapped PID=763627
66740 KB (teamviewerd) swapped PID=1618
68244 KB (flameshot) swapped PID=84209
763136 KB (plugin_host) swapped PID=1881345
1412480 KB (java) swapped PID=43402
3864548 KB (sublime_text) swapped PID=1881327
9999999999 Overall swap used: 2064 KB
I use this; it is useful if you only have /proc and nothing else. Just set nr to the number of top swappers you want to see and it will tell you the process name, swap footprint (MB) and its full process line from ps -ef:
nr=10;for pid in $(for file in /proc/*/status ; do awk '/VmSwap|Name|^Pid/{printf $2 " " $3}END{ print ""}' $file; done | sort -k 3 -n -r|head -${nr}|awk '{ print $2 }');do awk '/VmSwap|Name|^Pid/{printf $2 " " $3}END{ print ""}' /proc/$pid/status|awk '{print $1" "$2" "$3/1024" MB"}'|sed -e 's/\.[0-9]*//g';ps -ef|awk "\$2==$pid {print}";echo;done
I have a problem with some zombie-like processes on a certain server that need to be killed every now and then. How can I best identify the ones that have run for longer than an hour or so?
Found an answer that works for me:
warning: this will find and kill long running processes
ps -eo uid,pid,etime | egrep '^ *user-id' | egrep ' ([0-9]+-)?([0-9]{2}:?){3}' | awk '{print $2}' | xargs -I{} kill {}
(Where user-id is a specific user's ID with long-running processes.)
The second regular expression matches a time that has an optional days figure, followed by an hour, minute, and second component, and so is at least one hour in length.
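If you would rather compare against an arbitrary threshold than pattern-match the format, the etime string can be converted to seconds first. A small sketch; the function name etime_to_seconds is made up, and it assumes the [[dd-]hh:]mm:ss layout that ps documents for etime. On systems with a recent procps, ps -o etimes prints the elapsed seconds directly, which makes this unnecessary.

```shell
# etime_to_seconds ETIME: convert ps etime format [[dd-]hh:]mm:ss to seconds.
# (Hypothetical helper; leading padding spaces from ps are tolerated.)
etime_to_seconds() {
    echo "$1" | awk -F'[-:]' '
        NF == 4 { printf "%d\n", (($1 * 24 + $2) * 60 + $3) * 60 + $4 }   # dd-hh:mm:ss
        NF == 3 { printf "%d\n", ($1 * 60 + $2) * 60 + $3 }               # hh:mm:ss
        NF == 2 { printf "%d\n", $1 * 60 + $2 }                           # mm:ss
    '
}

# Possible usage: list pids that have been running for more than an hour
#   ps -eo pid= -o etime= | while read -r pid et; do
#       [ "$(etime_to_seconds "$et")" -gt 3600 ] && echo "$pid"
#   done
```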
If they just need to be killed:
if [[ "$(uname)" = "Linux" ]];then killall --older-than 1h someprocessname;fi
If you want to see what it's matching
if [[ "$(uname)" = "Linux" ]];then killall -i --older-than 1h someprocessname;fi
The -i flag will prompt you with yes/no for each process match.
For anything older than one day,
ps aux
will give you the answer, but it drops down to day-precision which might not be as useful.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 7200 308 ? Ss Jun22 0:02 init [5]
root 2 0.0 0.0 0 0 ? S Jun22 0:02 [migration/0]
root 3 0.0 0.0 0 0 ? SN Jun22 0:18 [ksoftirqd/0]
root 4 0.0 0.0 0 0 ? S Jun22 0:00 [watchdog/0]
In this example, you can only see that process 1 has been running since June 22, with no indication of the time it was started.
If you're on Linux or another system with the /proc filesystem,
stat /proc/<pid>
will give you a more precise answer. For example, here's an exact timestamp for process 1, which ps shows only as Jun22:
ohm ~$ stat /proc/1
File: `/proc/1'
Size: 0 Blocks: 0 IO Block: 4096 directory
Device: 3h/3d Inode: 65538 Links: 5
Access: (0555/dr-xr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2008-06-22 15:37:44.347627750 -0700
Modify: 2008-06-22 15:37:44.347627750 -0700
Change: 2008-06-22 15:37:44.347627750 -0700
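The timestamps on the /proc/<pid> directory may not match the real start time on every kernel, so it's worth cross-checking. A sketch of another route, assuming the layout described in proc(5): field 22 of /proc/<pid>/stat is the start time in clock ticks since boot, and the btime line of /proc/stat is the boot time in seconds since the epoch. The function names are made up for this example.

```shell
# Hypothetical helper: compute the start time (seconds since the epoch) from
# one /proc/<pid>/stat line, the boot time, and the clock ticks per second.
start_epoch_from_stat() {
    stat_line=$1 boot_epoch=$2 ticks_per_sec=$3
    # Drop everything up to the last ") " so a command name containing spaces
    # cannot shift the fields; starttime (field 22) is then the 20th field.
    start_ticks=$(printf '%s\n' "$stat_line" | sed 's/^.*) //' | awk '{print $20}')
    echo $(( boot_epoch + start_ticks / ticks_per_sec ))
}

# Thin wrapper that reads the live values for a given PID.
proc_start_epoch() {
    start_epoch_from_stat "$(cat "/proc/$1/stat")" \
        "$(awk '/^btime/ {print $2}' /proc/stat)" \
        "$(getconf CLK_TCK)"
}
```

With GNU date, date -d "@$(proc_start_epoch 1)" should print the result in a human-readable form.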
In this way you can obtain the list of the ten oldest processes:
ps -elf | sort -r -k12 | head -n 10
Jodie C and others have pointed out that killall -i can be used, which is fine if you want to use the process name to kill. But if you want to kill by the same parameters as pgrep -f, you need to use something like the following, using pure bash and the /proc filesystem.
#!/bin/bash
max_age=120 # (seconds)
# pgrep -f may match several processes, so check each one in turn
for naughty in $(pgrep -f offlineimap); do
    age_in_seconds=$(echo "$(date +%s) - $(stat -c %X /proc/$naughty)" | bc)
    if [[ "$age_in_seconds" -ge "$max_age" ]]; then # naughty is too old!
        kill -s 9 "$naughty"
    fi
done
This lets you find and kill processes older than max_age seconds using the full process name; i.e., the process named /usr/bin/python2 offlineimap can be killed by reference to "offlineimap", whereas the killall solutions presented here will only work on the string "python2".
Perl's Proc::ProcessTable will do the trick:
http://search.cpan.org/dist/Proc-ProcessTable/
You can install it in debian or ubuntu with sudo apt-get install libproc-processtable-perl
Here is a one-liner:
perl -MProc::ProcessTable -Mstrict -w -e 'my $anHourAgo = time-60*60; my $t = new Proc::ProcessTable;foreach my $p ( @{$t->table} ) { if ($p->start() < $anHourAgo) { print $p->pid, "\n" } }'
Or, more formatted, put this in a file called process.pl:
#!/usr/bin/perl -w
use strict;
use Proc::ProcessTable;
my $anHourAgo = time-60*60;
my $t = new Proc::ProcessTable;
foreach my $p ( @{$t->table} ) {
    if ($p->start() < $anHourAgo) {
        print $p->pid, "\n";
    }
}
then run perl process.pl
This gives you more versatility and 1-second-resolution on start time.
You can use bc to join the two commands in mob's answer and get the number of seconds elapsed since the process started:
echo `date +%s` - `stat -t /proc/<pid> | awk '{print $14}'` | bc
edit:
Out of boredom while waiting for long processes to run, this is what came out after a few minutes fiddling:
#!/bin/bash
# file: sincetime
init=`stat -t /proc/$1 | awk '{print $14}'`
curr=`date +%s`
seconds=`echo $curr - $init| bc`
# cmdline is NUL-separated, so turn the separators into spaces
name=`tr '\0' ' ' < /proc/$1/cmdline`
echo $name $seconds
If you put this on your path and call it like this:
sincetime <pid>
it will print the process cmdline and the seconds since it started. You can also put this in your path:
#!/bin/bash
# file: greptime
pidlist=`ps ax | grep -i -E "$1" | grep -v grep | awk '{print $1}' | grep -v PID | xargs echo`
for pid in $pidlist; do
    sincetime $pid
done
And then if you run:
greptime <pattern>
where pattern is a string or extended regular expression, it will print out all processes matching this pattern and the seconds since they started. :)
Do a ps -aef. This will show you the time at which the process started. Then use the date command to find the current time, and calculate the difference between the two to find the age of the process.
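The subtraction itself can be left to date. A minimal sketch, assuming GNU date (the -d option) and a full start timestamp that you've already copied from ps or stat; seconds_between is just a name picked for the example:

```shell
# Hypothetical helper: seconds between two "YYYY-MM-DD hh:mm:ss" timestamps.
# Both are interpreted as UTC (-u) so daylight saving changes cannot skew the result.
seconds_between() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( end - start ))
}
```

For instance, seconds_between "2008-06-22 15:37:44" "2008-06-22 16:37:44" prints 3600.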
I did something similar to the accepted answer, but slightly differently, since I want to match on the process name and on the bad process having run for more than 100 seconds:
kill $(ps -o pid=,etimes= -p "$(pgrep -d, bad_process)" | awk '$2 > 100 { print $1 }')
stat -t /proc/<pid> | awk '{print $14}'
to get the start time of the process in seconds since the epoch. Compare with current time (date +%s) to get the current age of the process.
Using ps is the right way. I've already done something similar before but don't have the source handy.
Generally - ps has an option to tell it which fields to show and by which to sort. You can sort the output by running time, grep the process you want and then kill it.
HTH
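Along those lines, a rough sketch of the filtering step. It assumes a ps that supports the etimes field (elapsed seconds, as in procps-ng), and pids_older_than is a hypothetical helper name, not a standard tool:

```shell
# Hypothetical helper: read "PID ELAPSED_SECONDS COMMAND..." lines on stdin and
# print the PIDs that are older than $1 seconds and whose command matches $2.
pids_older_than() {
    awk -v min_age="$1" -v pattern="$2" '
        $2 + 0 > min_age + 0 {
            cmd = $0
            sub(/^ *[0-9]+ +[0-9]+ +/, "", cmd)   # strip PID and age, keep the command
            if (cmd ~ pattern) print $1
        }'
}
```

It would be fed with something like ps -eo pid=,etimes=,args= | pids_older_than 3600 offlineimap, and it's safer to look at the list before piping it on to xargs kill.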
In case anyone needs this in C, you can use readproc.h and libproc:
#include <sys/types.h>
#include <time.h>
#include <proc/readproc.h>
#include <proc/sysinfo.h>

/* Returns the age of the process in days, or 0.0 if it cannot be read. */
float
pid_age(pid_t pid)
{
    proc_t proc_info;
    int seconds_since_boot = uptime(0,0);
    if (!get_proc_stats(pid, &proc_info)) {
        return 0.0;
    }
    // readproc.h comment lies about what proc_t.start_time is. It's
    // actually expressed in Hertz ticks since boot
    int seconds_since_1970 = time(NULL);
    // Not needed for the age, but this is the absolute boot time
    int time_of_boot = seconds_since_1970 - seconds_since_boot;
    (void) time_of_boot;
    long t = seconds_since_boot - (unsigned long)(proc_info.start_time / Hertz);
    int delta = t;
    float days = ((float) delta / (float)(60*60*24));
    return days;
}
I came across this somewhere and thought it was simple and useful.
You can use the command in crontab directly:
* * * * * ps -lf | grep "user" | perl -ane '($h,$m,$s) = split /:/,$F[13]; kill 9, $F[3] if ($h > 1);'
Or we can write it as a shell script:
#!/bin/sh
# longprockill.sh
ps -lf | grep "user" | perl -ane '($h,$m,$s) = split /:/,$F[13]; kill 9, $F[3] if ($h > 1);'
And call it from crontab like so:
* * * * * longprockill.sh
My version of sincetime above by @Rafael S. Calsaverini:
#!/bin/bash
ps --no-headers -o etimes,args "$1"
This reverses the output fields: elapsed time first, full command including arguments second. This is preferred because the full command may contain spaces.