How to store ps command output in csv format?


I am building a process-monitoring application. I need to do two things:
First, get process-related data (PID, process name, CPU usage, memory usage and virtual memory usage) for all running processes.
Second, store the retrieved data in CSV format.
My code so far is:
ps -e -o pid,lstart,%cpu,%mem,cmd >> output.csv
But it stores each row in a single cell, because the values are not separated by commas.
output.csv example:
PID STARTED %CPU %MEM CMD
1 Mon Feb 25 00:00:01 2019 0.0 0.1 examplecommand1
2 Mon Feb 25 00:00:01 2019 0.0 0.0 examplecommand2
(...)
Any help would be appreciated.

You could try something like the following code I wrote:
ps -e -o %p, -o lstart -o ,%C, -o %mem -o ,%c > output.csv
Brief explanation:
The -o option can be used multiple times in a ps command to specify the format.
To control which separator is used, we can use AIX format descriptors, which allow literal text such as a comma inside the format, e.g. %p,. AIX format descriptors exist only for some fields (in our case there are none for %mem and lstart), so we place %mem and lstart between the available AIX format descriptors and attach the commas to those. See the ps man page for further reading.
output.csv example:
PID, STARTED,%CPU,%MEM,COMMAND
1,Mon Feb 25 00:00:01 2019, 0.0, 0.1,examplecommand1
2,Mon Feb 25 00:00:01 2019, 0.0, 0.0,examplecommand2
(...)
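A more general variant of the same idea, as a minimal sketch: let ps print its usual space-separated output and build the CSV with awk. It assumes the field order pid,lstart,%cpu,%mem,cmd from the question and that lstart always prints five words. ps_to_csv is a hypothetical helper name, not part of ps.

```shell
# Convert "pid lstart(5 words) %cpu %mem cmd..." lines on stdin to CSV.
ps_to_csv() {
    awk '
    NR == 1 { print "PID,STARTED,%CPU,%MEM,CMD"; next }   # replace the ps header line
    {
        started = $2 " " $3 " " $4 " " $5 " " $6          # lstart: weekday month day time year
        cmd = $9
        for (i = 10; i <= NF; i++) cmd = cmd " " $i       # rejoin a command that contains spaces
        gsub(/"/, "\"\"", cmd)                            # CSV-escape embedded double quotes
        printf "%s,%s,%s,%s,\"%s\"\n", $1, started, $7, $8, cmd
    }'
}

# Usage: ps -e -o pid,lstart,%cpu,%mem,cmd | ps_to_csv > output.csv
```

The command is quoted because it may itself contain commas. Runs of spaces inside the command are collapsed to a single space by the rejoin step.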

csv-ps -c pid,cmd,%cpu,vm_rss_KiB,vm_size_KiB
csv-ps is from https://github.com/mslusarz/csv-nix-tools.

Related

Script to get user that has process with most memory usage?

How can I write a script that gives an output of the user that has the process with the most memory usage in the system. The script is sh. I tried to use top command as the starting point but it seems it does not work with pipes because it continues running until it is quit.

If you just want the user name of the process using the most memory, try something like:
$ ps axho user --sort -rss | head -1
This checks the resident memory size rss of the processes. If you'd rather check the whole virtual size, use vsz instead of rss. If you want percentage of resident memory used, use pmem (but this could change from moment to moment due to the scheduler, and may not pull out the biggest memory hog). If you'd rather have the user ID instead of user name, use uid instead of user.
The ps options are:
ax for "all processes" (everybody)
h for "no header" in output
o to specify the output format: user (user name)
--sort -rss sort by rss (descending order)
The head -1 strips out all but the first line (which has the largest rss since it's in descending order).
It might be useful to get not just the user name, but some more information about the process, like:
$ ps axho user,pid,rss --sort -rss | head -1
This gives the user name, process ID, and resident memory usage of the top process, all on one line. You could pull out the values individually in whatever script you use it in.
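If your ps lacks --sort, the same selection can be sketched with awk instead. This assumes ps accepts the rss and user output keywords with an empty header (user=, rss=). top_rss_user is a hypothetical helper name.

```shell
# Read "user rss" lines on stdin and print the user owning the largest rss.
top_rss_user() {
    awk '$2 + 0 > max { max = $2 + 0; user = $1 }
         END { if (user != "") print user }'
}

# Usage: ps -e -o user= -o rss= | top_rss_user
```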

This works on CentOS; it lists the process using the most memory:
[root@182 ~] # ps aux | sort -k 4 -r | head -n2
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 7048 0.2 9.6 8060236 1573612 ? Ssl Dec14 8:23 java -Djava.security.e
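sort -k 4 sorts by the fourth column, which is %MEM on my machine.
On other Linux/Unix systems the memory column may have a different number. Note that without -n the comparison is textual rather than numeric, so 9.6 sorts above 10.2.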
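A minimal sketch of a numeric variant that keeps the header on top: read the first line separately, then sort the rest numerically on column 4. sort_by_mem is a hypothetical helper name, and column 4 being %MEM is the same assumption as above.

```shell
# Pass the header line through unchanged, then sort the remaining
# lines by column 4 (numeric, descending).
sort_by_mem() {
    IFS= read -r header || return 0    # empty input: nothing to do
    printf '%s\n' "$header"
    LC_ALL=C sort -k4,4nr              # C locale so "9.6" parses with a dot
}

# Usage: ps aux | sort_by_mem | head -n 2
```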

if you're allowed to use only top, this can be a solution:
top -o VIRT -n 1 | head -8 | tail -1 | cut -d ' ' -f 5
top -o VIRT allows you to override sorting by default column and sort it by column VIRT
top -n 1 allows you to limit iterations that top will make before exiting. We need only one iteration - it's like when you take a photo while recording a video - you saved info at a particular moment
| head -8 outputs only the first 8 lines of the previous output. Why 8? Because top displays a lot of information before the process table, and we don't need all of these lines:
Tasks: 247 total, 1 running, 245 sleeping, 0 stopped, 1 zombie
%Cpu(s): 5,9 us, 1,5 sy, 0,0 ni, 92,6 id, 0,0 wa, 0,0 hi, 0,0 si, 0,0 st
MiB Mem : 15698,4 total, 7256,4 free, 3087,8 used, 5354,2 buff/cache
MiB Swap: 18222,0 total, 18222,0 free, 0,0 used. 11441,1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6249 mneznaev 20 0 46,5g 287484 110232 S 0,0 1,8 5:40.11 code
| tail -1 allows you to get the last line of previous 8 lines that we got by previous step with head and now we have:
6249 mneznaev 20 0 46,5g 290332 110232 S 0,0 1,8 5:40.27 code
cut -d ' ' -f 5 splits the line into columns, where -d ' ' defines a space as the separator and -f 5 selects the 5th column. Why the 5th? Because there are some spaces before the actual PID value (before 6249 in my case), so the first 3 columns after cutting by space are empty, and the 5th column is the username:
mneznaev@mneznaev-desktop:~$ top -o VIRT -n 1 | head -8 | tail -1 | cut -d ' ' -f 5
mneznaev
That's it. Hope it was helpful
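Counting lines and spaces is fragile, because the number of summary lines and the PID padding can differ between systems. A minimal sketch that instead looks for the table header and splits on any whitespace with awk is shown here. It assumes top is run in batch mode (-b) and that USER is the second column, as in the output above. top_first_user is a hypothetical helper name.

```shell
# Print the USER column of the first process row after the "PID ..." header.
top_first_user() {
    awk 'found { print $2; exit }        # first row after the header
         $1 == "PID" { found = 1 }'      # header row of the process table
}

# Usage: top -b -n 1 -o VIRT | top_first_user
```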

background process log shows big gaps in message timestamps

I started a long running job under nohup in the background over a weekend. When looking at the output after it finished, I noticed that there were large gaps between the timestamps of some log messages. Some gaps were as long as 10 hrs. I had no way of finding out what was going on with my job at that time.
I ran it on a standard Red hat linux server machine at work.
Is this behavior caused by the nohup command? If not, what could be the possible causes?
One such long-running job was the script below:
#!/bin/bash
while true
do
echo "`date` `top -n 1 -b | grep progname`"
done
And here one such gap from the log -
Mon May 26 04:29:42 PDT 2014 27685 user 18 0 2883m 2.8g 1732 S 0.0 3.9 29:05.54 progname
Tue May 27 03:20:35 PDT 2014 27685 user 18 0 3371m 3.3g 1732 S 0.0 4.6 34:23.21 progname

Ok.
pid is a variable that is the pid of the process you want to monitor--
Try this for starters (Courtesy of S Chazelas):
export pid=$(ps -ef | grep '[p]rogname' | awk '{print $2}')
while rss=$(ps -o rss= -p "${pid}")
do
printf '%d %s\n' "$rss" "$(date)"
sleep 60;
done > t.lis
#
echo "Done $(date)" >> t.lis
rss is the resident set size (the physical memory the process currently occupies). Linux ps reports it in kilobytes, not in pages. If you need to convert it to pages,
getconf PAGESIZE
will show how many bytes are in a page of memory.

how to kill the tty in unix

This is the result of the finger command (Today(Monday) when I (Vidya) logged in)
sekic1083 [6:14am] [/home/vidya] -> finger
Name Tty Idle Login Time Where
Felix pts/0 - Thu 10:06 sekic2594.rnd.ki.sw.
john pts/1 2d Fri 15:43
john *pts/2 2d Fri 15:43
john *pts/3 4 Fri 15:44
john *pts/7 - Thu 16:25
Vidya pts/0 - Mon 06:14
Vidya *pts/5 - Mon 06:14
Vidya *pts/6 - Tue 10:13
Vidya *pts/9 - Wed 05:39
Vidya *pts/10 - Wed 10:23
Under the Tty column, pts/0 and pts/5 are the currently active terminals.
Apart from those two, pts/6, pts/9 and pts/10 are also present; I logged into these last week. But their idle time is shown as "-" (not idle).
How can I kill terminals pts/6, pts/9 and pts/10?

You can run:
ps -ft pts/6 -t pts/9 -t pts/10
This would produce an output similar to:
UID PID PPID C STIME TTY TIME CMD
Vidya 772 2701 0 15:26 pts/6 00:00:00 bash
Vidya 773 2701 0 16:26 pts/9 00:00:00 bash
Vidya 774 2701 0 17:26 pts/10 00:00:00 bash
Grab the PID from the result.
Use the PIDs to kill the processes:
kill <PID1> <PID2> <PID3> ...
For the above example:
kill 772 773 774
If a process doesn't terminate gracefully, as a last resort you can kill it forcefully by sending SIGKILL:
kill -9 <PID>

I had the same question as you but I wanted to kill the gnome terminal which I was in. I read the manual on "who" and found that you can list all of the sessions logged into your computer with the '-a' option and then the '-l' option prints the system login processes.
who -la
The output lists each session together with its PID. Then all you have to do is kill the process with the 'kill' command.
kill <PID>

For example, to kill pts/0:
pkill -9 -t pts/0

Try this:
skill -KILL -v pts/6
skill -KILL -v pts/9
skill -KILL -v pts/10

I had the same problem today.
I had NO remaining processes, but there was a leftover finger entry for user "xxx",
which prevented me from deleting this user with "userdel xxx".
The error message was: userdel: account `xxx' is currently in use.
It looked like a crashed terminal session. So I rebooted, but the issue remained.
last xxx
xxx pts/5 10.1.2.3 Fri Feb 7 10:25 - crash (01:27)
So I (re)moved the /var/run/utmp file:
mv /var/run/utmp /var/run/utmp.save ; touch /var/run/utmp
This cleared all finger entries. Unfortunately, it also clears the entries of the currently running sessions. If this is an issue for you, reboot after you have (re)moved the utmp file.
However in my case this was the solution. Afterwards I was able to successfully delete the user, using "userdel xxx".

You do not need to know the pts number; just type:
ps all | grep bash
then:
kill pid1 pid2 pid3 ...

The simplest way is with the pkill command.
In your case:
pkill -9 -t pts/6
pkill -9 -t pts/9
pkill -9 -t pts/10
Regarding tty sessions, the commands below are always useful:
w - shows active terminal sessions
tty - shows your current terminal session (so you won't close it by accident)
last | grep logged - shows currently logged users
Sometimes we want to close all sessions of an idle user (ie. when connections are lost abruptly).
pkill -u username - kills all sessions of 'username' user.
And sometimes we want to kill all of our own sessions except the current one, so I made a script for it. It has some cosmetics and some interactivity (to avoid running the script accidentally).
#!/bin/bash
MYUSER=$(whoami)
MYSESSION=$(tty | cut -d"/" -f3-)
# awk copes with the variable-width padding in w's output (cut -d" " does not)
OTHERSESSIONS=$(w "$MYUSER" | grep "^$MYUSER" | grep -v "$MYSESSION" | awk '{print $2}')
printf "\e[33mCurrent session\e[0m: %s[%s]\n" "$MYUSER" "$MYSESSION"
if [[ -n $OTHERSESSIONS ]]; then
    printf "\e[33mOther sessions:\e[0m\n"
    w "$MYUSER" | grep -E "LOGIN@|^$MYUSER" | grep -v "$MYSESSION" | column -t
    echo ----------
    read -r -p "Do you want to force close all your other sessions? [Y]Yes/[N]No: " answer
    answer=$(echo "$answer" | tr 'A-Z' 'a-z')
    # Compare exactly, so that an empty answer does not count as "yes"
    if [[ $answer == "y" || $answer == "yes" ]]; then
        for SESSION in $OTHERSESSIONS
        do
            pkill -9 -t "$SESSION"
            echo "Session $SESSION closed."
        done
    fi
else
    echo "There are no other sessions for the user '$MYUSER'."
fi

You can use the killall command as well. Some useful options from its man page:
-o, --older-than
Match only processes that are older (started before) the time specified. The time is specified as a float then a unit. The units are s,m,h,d,w,M,y for seconds, minutes, hours, days,
-e, --exact
Require an exact match for very long names.
-r, --regexp
Interpret process name pattern as an extended regular expression.
This worked like a charm.

If you want to close the tty sessions of a specific user together with all of that user's processes, killall is the easiest way. You can use:
killall -u user_name

In addition to AIXroot's answer, there is also a logout function that can be used to write a utmp logout record. So if you don't have any processes for user xxxx, but userdel says "userdel: account xxxx is currently in use", you can add a logout record manually. Create a file logout.c like this:
#include <stdio.h>
#include <utmp.h>
int main(int argc, char *argv[])
{
if (argc == 2) {
return logout(argv[1]);
}
else {
fprintf(stderr, "Usage: logout device\n");
return 1;
}
}
Compile it:
gcc -o logout logout.c -lutil
And then run it for whatever it says in the output of finger's "On since" line(s) as a parameter:
# finger xxxx
Login: xxxx Name:
Directory: /home/xxxx Shell: /bin/bash
On since Sun Feb 26 11:06 (GMT) on 127.0.0.1:6 (messages off) from 127.0.0.1
On since Fri Feb 24 16:53 (GMT) on pts/6, idle 3 days 17:16, from 127.0.0.1
Last login Mon Feb 10 14:45 (GMT) on pts/11 from somehost.example.com
Mail last read Sun Feb 27 08:44 2014 (GMT)
No Plan.
# userdel xxxx
userdel: account `xxxx' is currently in use.
# ./logout 127.0.0.1:6
# ./logout pts/6
# userdel xxxx
no crontab for xxxx

How to measure CPU usage

I would like to log CPU usage at a frequency of 1 second.
One possible way to do it is via vmstat 1 command.
The problem is that the time between each output is not always exactly one second, especially on a busy server. I would like to be able to output the timestamp along with the CPU usage every second. What would be a simple way to accomplish this, without installing special tools?

There are many ways to do that. Besides top, another way is to use the "sar" utility. Something like
sar -u 1 10
will report the CPU utilization 10 times at 1-second intervals. At the end it prints the averages for each of sys, user, iowait and idle.
Another utility is "mpstat", which gives you similar information to sar.

Use the well-known UNIX tool top that is normally available on Linux systems:
top -b -d 1 > /tmp/top.log
The first line of each output block from top contains a timestamp.
I see no command line option to limit the number of rows that top displays.
Section 5a. SYSTEM Configuration File and 5b. PERSONAL Configuration File of the top man page describes pressing W when running top in interactive mode to create a $HOME/.toprc configuration file.
I did this, then edited my .toprc file and changed all maxtasks values so that they are maxtasks=4. Then top only displays 4 rows of output.
For completeness, the alternative way to do this using pipes is:
top -b -d 1 | awk '/load average/ {n=10} {if (n-- > 0) {print}}' > /tmp/top.log

You might want to try htop and atop. htop is beautifully interactive while atop gathers information and can report CPU usage even for terminated processes.

I found a neat way to get the timestamp information to be displayed along with the output of vmstat.
Sample command:
vmstat -n 1 3 | while read line; do echo "$(date --iso-8601=seconds) $line"; done
Output:
2013-09-13T14:01:31-0700 procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
2013-09-13T14:01:31-0700 r b swpd free buff cache si so bi bo in cs us sy id wa
2013-09-13T14:01:31-0700 1 1 4197640 29952 124584 12477708 12 5 449 147 2 0 7 4 82 7
2013-09-13T14:01:32-0700 3 0 4197780 28232 124504 12480324 392 180 15984 180 1792 1301 31 15 38 16
2013-09-13T14:01:33-0700 0 1 4197656 30464 124504 12477492 344 0 2008 0 1892 1929 32 14 43 10
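If you would rather not depend on vmstat at all, a timestamped CPU figure can be sketched directly from /proc/stat. This assumes a Linux system whose aggregate "cpu" line has the usual column order (user, nice, system, idle, iowait, irq, softirq, steal). cpu_busy_pct is a hypothetical helper name.

```shell
# Compute the busy CPU percentage between two aggregate "cpu ..." lines
# taken from /proc/stat at different times.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
    {
        total = 0
        for (i = 2; i <= NF && i <= 9; i++) total += $i   # user..steal
        idle = $5 + $6                                     # idle + iowait
        if (NR == 1) { t0 = total; i0 = idle; next }
        dt = total - t0
        if (dt > 0) printf "%.1f\n", 100 * (dt - (idle - i0)) / dt
        else print "0.0"
    }'
}

# Usage sketch: one timestamped sample per second.
# prev=$(head -n 1 /proc/stat)
# while sleep 1; do
#     cur=$(head -n 1 /proc/stat)
#     echo "$(date +%FT%T) $(cpu_busy_pct "$prev" "$cur")"
#     prev=$cur
# done
```

Because each sample is computed from the counter difference since the previous reading, a delayed iteration on a busy server still reports the correct average for the interval it actually covered.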

To monitor disk usage, CPU and load, I created a small bash script that writes the values to a log file every 10 seconds.
This log file is processed by logstash, kibana and riemann.
#!/usr/bin/env bash
LOGPATH="/var/log/systemstatus.log"
# Define a timestamp function
timestamp() {
    date +"%Y-%m-%dT%T.%N"
}
while ( sleep 10 ) ; do
    # server load
    echo -n "$(timestamp) linux::systemstatus::load " >> "$LOGPATH"
    cat /proc/loadavg >> "$LOGPATH"
    # cpu usage
    echo -n "$(timestamp) linux::systemstatus::cpu " >> "$LOGPATH"
    top -bn 1 | sed -n 3p >> "$LOGPATH"
    # disk usage
    echo -n "$(timestamp) linux::systemstatus::storage " >> "$LOGPATH"
    df --total | grep total | sed "s/total//g" | sed 's/^ *//' >> "$LOGPATH"
done

How to see top processes sorted by actual memory usage?

I have a server with 12G of memory. A fragment of top is shown below:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12979 frank 20 0 206m 21m 12m S 11 0.2 26667:24 krfb
13 root 15 -5 0 0 0 S 1 0.0 36:25.04 ksoftirqd/3
59 root 15 -5 0 0 0 S 0 0.0 4:53.00 ata/2
2155 root 20 0 662m 37m 8364 S 0 0.3 338:10.25 Xorg
4560 frank 20 0 8672 1300 852 R 0 0.0 0:00.03 top
12981 frank 20 0 987m 27m 15m S 0 0.2 45:10.82 amarok
24908 frank 20 0 16648 708 548 S 0 0.0 2:08.84 wrapper
1 root 20 0 8072 608 572 S 0 0.0 0:47.36 init
2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
The free -m shows the following:
total used free shared buffers cached
Mem: 12038 11676 362 0 599 9745
-/+ buffers/cache: 1331 10706
Swap: 2204 257 1946
If I understand correctly, the system has only 362 MB of available memory. My question is: How can I find out which process is consuming most of the memory?
Just as background info, the system is running 64bit OpenSuse 12.

A quick tip using the top command on Linux/Unix:
$ top
and then hit Shift+m (i.e. write a capital M).
From man top
SORTING of task window
For compatibility, this top supports most of the former top sort keys.
Since this is primarily a service to former top users, these commands do
not appear on any help screen.
command sorted-field supported
A start time (non-display) No
M %MEM Yes
N PID Yes
P %CPU Yes
T TIME+ Yes
Or alternatively: hit Shift + f, then choose to order the display by memory usage by hitting the n key, then press Enter. You will see the active processes ordered by memory usage.

First, repeat this mantra for a little while: "unused memory is wasted memory". The Linux kernel keeps around huge amounts of file metadata and files that were requested, until something that looks more important pushes that data out. It's why you can run:
find /home -type f -name '*.mp3'
find /home -type f -name '*.aac'
and have the second find instance run at ridiculous speed.
Linux only leaves a little bit of memory 'free' to handle spikes in memory usage without too much effort.
Second, you want to find the processes that are eating all your memory; in top use the M command to sort by memory use. Feel free to ignore the VIRT column, that just tells you how much virtual memory has been allocated, not how much memory the process is using. RES reports how much memory is resident, or currently in ram (as opposed to swapped to disk or never actually allocated in the first place, despite being requested).
But, since RES will count e.g. /lib/libc.so.6 memory once for nearly every process, it isn't exactly an awesome measure of how much memory a process is using. The SHR column reports how much memory is shared with other processes, but there is no guarantee that another process is actually sharing -- it could be sharable, just no one else wants to share.
The smem tool is designed to help users better gauge just how much memory should really be blamed on each individual process. It does some clever work to figure out what is really unique, what is shared, and proportionally tallies the shared memory to the processes sharing it. smem may help you understand where your memory is going better than top will, but top is an excellent first tool.

ps aux | awk '{print $2, $4, $11}' | sort -k2rn | head -n 10
(Adding -n numeric flag to sort command.)

First you should read an explanation on the output of free. Bottom line: you have at least 10.7 GB of memory readily usable by processes.
Then you should define what "memory usage" is for a process (it's not easy or unambiguous, trust me).
Then we might be able to help more :-)
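The 10.7 GB figure is simply free + buffers + cached from the Mem: line (362 + 599 + 9745 = 10706 MB), which is also what the "-/+ buffers/cache" line reports. As a minimal sketch, assuming the older free layout shown in the question (newer versions of free print an "available" column instead), with usable_mb as a hypothetical helper name:

```shell
# Sum the free, buffers and cached columns of the "Mem:" line
# in old-style `free -m` output read from stdin.
usable_mb() {
    awk '$1 == "Mem:" { print $4 + $6 + $7 }'
}

# Usage: free -m | usable_mb
```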

List and Sort Processes by Memory Usage:
ps -e -orss=,args= | sort -b -k1,1n | pr -TW$COLUMNS

ps aux --sort '%mem'
from procps' ps (default on Ubuntu 12.04) generates output like:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
tomcat7 3658 0.1 3.3 1782792 124692 ? Sl 10:12 0:25 /usr/lib/jvm/java-7-oracle/bin/java -Djava.util.logging.config.file=/var/lib/tomcat7/conf/logging.properties -D
root 1284 1.5 3.7 452692 142796 tty7 Ssl+ 10:11 3:19 /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
ciro 2286 0.3 3.8 1316000 143312 ? Sl 10:11 0:49 compiz
ciro 5150 0.0 4.4 660620 168488 pts/0 Sl+ 11:01 0:08 unicorn_rails worker[1] -p 3000 -E development -c config/unicorn.rb
ciro 5147 0.0 4.5 660556 170920 pts/0 Sl+ 11:01 0:08 unicorn_rails worker[0] -p 3000 -E development -c config/unicorn.rb
ciro 5142 0.1 6.3 2581944 239408 pts/0 Sl+ 11:01 0:17 sidekiq 2.17.8 gitlab [0 of 25 busy]
ciro 2386 3.6 16.0 1752740 605372 ? Sl 10:11 7:38 /usr/lib/firefox/firefox
So here Firefox is the top consumer with 16% of my memory.
You may also be interested in:
ps aux --sort '%cpu'

Building on gaoithe's answer, I attempted to make the memory units display in megabytes, and sorted by memory descending limited to 15 entries:
ps -e -orss=,args= |awk '{print $1 " " $2 }'| awk '{tot[$2]+=$1;count[$2]++} END {for (i in tot) {print tot[i],i,count[i]}}' | sort -n | tail -n 15 | sort -nr | awk '{ hr=$1/1024; printf("%13.2fM", hr); print "\t" $2 }'
588.03M /usr/sbin/apache2
275.64M /usr/sbin/mysqld
138.23M vim
97.04M -bash
40.96M ssh
34.28M tmux
17.48M /opt/digitalocean/bin/do-agent
13.42M /lib/systemd/systemd-journald
10.68M /lib/systemd/systemd
10.62M /usr/bin/redis-server
8.75M awk
7.89M sshd:
4.63M /usr/sbin/sshd
4.56M /lib/systemd/systemd-logind
4.01M /usr/sbin/rsyslogd
Here's an example alias to use it in a bash config file:
alias topmem="ps -e -orss=,args= |awk '{print \$1 \" \" \$2 }'| awk '{tot[\$2]+=\$1;count[\$2]++} END {for (i in tot) {print tot[i],i,count[i]}}' | sort -n | tail -n 15 | sort -nr | awk '{ hr=\$1/1024; printf(\"%13.2fM\", hr); print \"\t\" \$2 }'"
Then you can just type topmem on the command line.
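A shell function avoids the backslash escaping that the alias needs. The following is a minimal sketch of the same aggregation (it drops the per-name process count). sum_rss_by_name is a hypothetical helper name, and the optional argument for the number of rows is an addition not present in the alias.

```shell
# Read "rss command [args...]" lines on stdin, total the rss (KiB) per
# command name, and print the totals in MiB, largest first.
sum_rss_by_name() {
    awk '{ tot[$2] += $1 }
         END { for (name in tot) printf "%.2fM\t%s\n", tot[name] / 1024, name }' |
    LC_ALL=C sort -rn
}

# Show the top N entries (default 15).
topmem() {
    ps -e -o rss= -o args= | sum_rss_by_name | head -n "${1:-15}"
}

# Usage: topmem        (top 15)
#        topmem 5      (top 5)
```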

How to total up used memory by process name:
Sometimes even looking at the biggest single processes there is still a lot of used memory unaccounted for. To check if there are a lot of the same smaller processes using the memory you can use a command like the following which uses awk to sum up the total memory used by processes of the same name:
ps -e -orss=,args= |awk '{print $1 " " $2 }'| awk '{tot[$2]+=$1;count[$2]++} END {for (i in tot) {print tot[i],i,count[i]}}' | sort -n
e.g. output
9344 docker 1
9948 nginx: 4
22500 /usr/sbin/NetworkManager 1
24704 sleep 69
26436 /usr/sbin/sshd 15
34828 -bash 19
39268 sshd: 10
58384 /bin/su 28
59876 /bin/ksh 29
73408 /usr/bin/python 2
78176 /usr/bin/dockerd 1
134396 /bin/sh 84
5407132 bin/naughty_small_proc 1432
28061916 /usr/local/jdk/bin/java 7

you can specify which column to sort by, with following steps:
steps:
* top
* shift + F
* select a column from the list
e.g. n means sort by memory,
* press enter
* ok

You can see memory usage by executing this code in your terminal:
$ watch -n2 free -m
$ htop

This very second in time
ps -U $(whoami) -eom pid,pmem,pcpu,comm | head -n4
Continuously updating
watch -n 1 'ps -U $(whoami) -eom pid,pmem,pcpu,comm | head -n4'
I also added a few goodies here that you might appreciate (or ignore):
-n 1 makes watch update every second
-U $(whoami) shows only your processes; $(some command) is evaluated immediately
| head -n4 shows only the header and 3 processes at a time, because often you just need the high-usage line items
${1-4} (used in the function below) means that my first argument $1 defaults to 4 unless I provide it
If you are using a mac you may need to install watch first
brew install watch
Alternatively you might use a function
psm(){
watch -n 1 "ps -eom pid,pmem,pcpu,comm | head -n ${1-4}"
# EXAMPLES:
# psm
# psm 10
}

You have this simple command:
$ free -h
