background process log shows big gaps in message timestamps - linux

I started a long running job under nohup in the background over a weekend. When looking at the output after it finished, I noticed that there were large gaps between the timestamps of some log messages. Some gaps were as long as 10 hrs. I had no way of finding out what was going on with my job at that time.
I ran it on a standard Red hat linux server machine at work.
Is this behavior caused by the nohup command? If not, what could be possible causes?
One such long-running job was the script below:
#!/bin/bash
while true
do
echo "`date` `top -n 1 -b | grep progname`"
done
And here is one such gap from the log:
Mon May 26 04:29:42 PDT 2014 27685 user 18 0 2883m 2.8g 1732 S 0.0 3.9 29:05.54 progname
Tue May 27 03:20:35 PDT 2014 27685 user 18 0 3371m 3.3g 1732 S 0.0 4.6 34:23.21 progname

Ok.
pid is a variable that is the pid of the process you want to monitor--
Try this for starters (Courtesy of S Chazelas):
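# [p]rogname keeps the pattern from matching this awk process itself; exit takes the first match only
export pid=$(ps -ef | awk '/[p]rogname/ {print $2; exit}')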
while rss=$(ps -o rss= -p "${pid}")
do
printf '%d %s\n' "$rss" "$(date)"
sleep 60;
done > t.lis
#
echo "Done $(date)" >> t.lis
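rss is the resident set size (the physical memory currently held by the process). The ps from procps on Linux reports it in kilobytes, not in pages.
getconf PAGESIZE
will show how many bytes are in a page of memory, which matters if you read page counts (for example from /proc/<pid>/statm) instead.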
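If you do end up with a page count (for example the second field of /proc/PID/statm on Linux, which is the resident set in pages), the page size from getconf converts it to a more readable unit. A minimal sketch; the function name pages_to_kib is made up for illustration:

```shell
# pages_to_kib PAGES [PAGESIZE]
# Convert a count of memory pages to KiB. PAGESIZE defaults to the
# value reported by getconf (commonly 4096 bytes on x86 Linux).
pages_to_kib() {
    pages=$1
    pagesize=${2:-$(getconf PAGESIZE)}
    echo $(( pages * pagesize / 1024 ))
}

# Example (Linux only): resident pages are the 2nd field of statm.
# read -r _ resident _ < "/proc/$pid/statm"
# pages_to_kib "$resident"
```

Passing the page size explicitly as the second argument makes the arithmetic easy to check without depending on the machine it runs on.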

Related

How to store ps command output in csv format?

I am making a process monitoring application. Now my need is
To get process related data (pid, process name, cpu usage, memory usage and virtual memory usage) for all the running processes.
After completing the first step I want to store the retrieved data into csv format.
My code part is:
ps -e -o pid,lstart,%cpu,%mem,cmd >> output.csv
But it is storing all values in only one cell. Meaning it is not being separated by a comma.
output.csv example:
PID STARTED %CPU %MEM CMD
1 Mon Feb 25 00:00:01 2019 0.0 0.1 examplecommand1
2 Mon Feb 25 00:00:01 2019 0.0 0.0 examplecommand2
(...)
Any help would be appreciated.
You could try something like the following code I wrote:
ps -e -o %p, -o lstart -o ,%C, -o %mem -o ,%c > output.csv
Brief explanation:
The -o option can be used multiple times in a ps command to specify the format.
In order to control which separator is used we can use AIX format descriptors. We can specify our needed separators like, e.g. %p,. Since AIX format descriptors are not available for every piece of data, but only for some of the data (for example in our case there are no AIX format descriptors for %mem and for lstart), we plant %mem and lstart around the available AIX format descriptors to achieve the comma separation. For example this site provides information about the ps command for further readings.
output.csv example:
PID, STARTED,%CPU,%MEM,COMMAND
1,Mon Feb 25 00:00:01 2019, 0.0, 0.1,examplecommand1
2,Mon Feb 25 00:00:01 2019, 0.0, 0.0,examplecommand2
(...)
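Because lstart itself contains spaces and cmd may contain commas, another option is to post-process the plain ps output with awk. The sketch below assumes the exact column order pid,lstart,%cpu,%mem,cmd from the question and that lstart always prints five whitespace-separated fields; ps_to_csv is a hypothetical helper name, not part of ps.

```shell
# ps_to_csv: read `ps -e -o pid,lstart,%cpu,%mem,cmd` output on stdin
# and write CSV on stdout. The command column is quoted so that commas
# and double quotes inside it do not break the row.
ps_to_csv() {
    awk '
        NR == 1 { print "PID,STARTED,%CPU,%MEM,CMD"; next }
        {
            # lstart occupies fields 2..6 (weekday month day time year)
            started = $2 " " $3 " " $4 " " $5 " " $6
            # everything from field 9 onward is the command line
            cmd = $9
            for (i = 10; i <= NF; i++) cmd = cmd " " $i
            gsub(/"/, "\"\"", cmd)      # CSV-escape embedded quotes
            printf "%s,%s,%s,%s,\"%s\"\n", $1, started, $7, $8, cmd
        }'
}

# Usage:
# ps -e -o pid,lstart,%cpu,%mem,cmd | ps_to_csv > output.csv
```

Note that awk collapses runs of spaces, so a padded day such as "Feb  4" comes out as "Feb 4".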
csv-ps -c pid,cmd,%cpu,vm_rss_KiB,vm_size_KiB
csv-ps is from https://github.com/mslusarz/csv-nix-tools.

how to kill the tty in unix

This is the result of the finger command (Today(Monday) when I (Vidya) logged in)
sekic1083 [6:14am] [/home/vidya] -> finger
Name Tty Idle Login Time Where
Felix pts/0 - Thu 10:06 sekic2594.rnd.ki.sw.
john pts/1 2d Fri 15:43
john *pts/2 2d Fri 15:43
john *pts/3 4 Fri 15:44
john *pts/7 - Thu 16:25
Vidya pts/0 - Mon 06:14
Vidya *pts/5 - Mon 06:14
Vidya *pts/6 - Tue 10:13
Vidya *pts/9 - Wed 05:39
Vidya *pts/10 - Wed 10:23
Under the Tty column, pts/0 and pts/5 are the currently active terminals.
Apart from those two pts/6, pts/9 and pts/10 are also present and I had logged into these last week. But the idle time for them is showing as "-" (not idle).
How can I kill these 6,9 and 10 terminals?
You can run:
ps -ft pts/6 -t pts/9 -t pts/10
This would produce an output similar to:
UID PID PPID C STIME TTY TIME CMD
Vidya 772 2701 0 15:26 pts/6 00:00:00 bash
Vidya 773 2701 0 16:26 pts/9 00:00:00 bash
Vidya 774 2701 0 17:26 pts/10 00:00:00 bash
Grab the PID from the result.
Use the PIDs to kill the processes:
kill <PID1> <PID2> <PID3> ...
For the above example:
kill 772 773 774
If the process doesn't gracefully terminate, just as a last option you can forcefully kill by sending a SIGKILL
kill -9 <PID>
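The "terminate first, SIGKILL only as a last option" sequence can be wrapped in a small helper. This is a sketch, not a standard tool: term_then_kill is a hypothetical name and the 5-second default grace period is an arbitrary choice. Escalation matters here because an interactive bash ignores SIGTERM by default.

```shell
# term_then_kill PID [SECONDS]
# Send SIGTERM, wait up to SECONDS (default 5) for the process to exit,
# and only then fall back to SIGKILL.
term_then_kill() {
    pid=$1
    timeout=${2:-5}
    # If the signal cannot be delivered the process is already gone
    # (or is not ours to signal), so there is nothing more to do.
    kill -TERM "$pid" 2>/dev/null || return 0
    waited=0
    while kill -0 "$pid" 2>/dev/null; do    # kill -0 only tests existence
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done
}

# Usage with the PIDs from the example above:
# for p in 772 773 774; do term_then_kill "$p"; done
```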
I had the same question as you but I wanted to kill the gnome terminal which I was in. I read the manual on "who" and found that you can list all of the sessions logged into your computer with the '-a' option and then the '-l' option prints the system login processes.
who -la
You should get something like this. Then all you have to do is kill the process with the 'kill' command.
kill <PID>
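Note that kill takes a PID, not a terminal name. To kill everything attached to a terminal such as pts/0, use pkill instead: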
pkill -9 -t pts/0
Try this:
skill -KILL -v pts/6
skill -KILL -v pts/9
skill -KILL -v pts/10
I had the same problem today.
I had NO remaining processes, but a leftover finger entry for user "xxx"
prevented me from deleting this user with "userdel xxx".
Error message was: userdel: account `xxx' is currently in use.
It looked like a crashed terminal session. So I rebooted, but the issue remained.
last xxx
xxx pts/5 10.1.2.3 Fri Feb 7 10:25 - crash (01:27)
So I (re)moved the /var/run/utmp file:
mv /var/run/utmp /var/run/utmp.save ; touch /var/run/utmp
This cleared all finger entries. Unfortunately in this way even the current running sessions will be cleared. If this is an issue for you, you have to reboot, after you (re)moved the utmp file.
However in my case this was the solution. Afterwards I was able to successfully delete the user, using "userdel xxx".
you do not need to know pts number, just type:
ps all | grep bash
then:
kill pid1 pid2 pid3 ...
The simplest way is with the pkill command.
In your case:
pkill -9 -t pts/6
pkill -9 -t pts/9
pkill -9 -t pts/10
Regarding tty sessions, the commands below are always useful:
w - shows active terminal sessions
tty - shows your current terminal session (so you won't close it by accident)
last | grep logged - shows currently logged users
Sometimes we want to close all sessions of an idle user (ie. when connections are lost abruptly).
pkill -u username - kills all sessions of 'username' user.
And sometimes we want to kill all of our own sessions except the current one, so I made a script for it. It has some cosmetics and some interactivity (to avoid running the script by accident).
#!/bin/bash
MYUSER=`whoami`
MYSESSION=`tty | cut -d"/" -f3-`
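# awk splits on runs of whitespace (cut -d" " returns empty fields for padded columns)
# and compares the TTY exactly, so pts/1 does not also exclude pts/10
OTHERSESSIONS=`w -h $MYUSER | awk -v me="$MYSESSION" '$2 != me {print $2}'`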
printf "\e[33mCurrent session\e[0m: $MYUSER[$MYSESSION]\n"
if [[ ! -z $OTHERSESSIONS ]]; then
printf "\e[33mOther sessions:\e[0m\n"
w $MYUSER | egrep "LOGIN@|^$MYUSER" | grep -v "$MYSESSION" | column -t
echo ----------
read -p "Do you want to force close all your other sessions? [Y]Yes/[N]No: " answer
answer=`echo $answer | tr A-Z a-z`
# exact comparison: an empty or partial answer must not count as "yes"
if [[ "$answer" == "y" || "$answer" == "yes" ]]; then
for SESSION in $OTHERSESSIONS
do
pkill -9 -t $SESSION
echo Session $SESSION closed.
done
fi
else
echo "There are no other sessions for the user '$MYUSER'".
fi
You can use the killall command as well. Some useful options from its man page:
-o, --older-than
Match only processes that are older (started before) the time specified. The time is specified as a float then a unit. The units are s,m,h,d,w,M,y for seconds, minutes, hours, days, weeks, months and years respectively.
-e, --exact
Require an exact match for very long names.
-r, --regexp
Interpret process name pattern as an extended regular expression.
This worked like a charm.
If you want to close the ttys of a specific user along with all of that user's processes, this is the easiest way. You can use:
killall -u user_name
In addition to AIXroot's answer, there is also a logout function that can be used to write a utmp logout record. So if you don't have any processes for user xxxx, but userdel says "userdel: account xxxx is currently in use", you can add a logout record manually. Create a file logout.c like this:
#include <stdio.h>
#include <utmp.h>
int main(int argc, char *argv[])
{
if (argc == 2) {
return logout(argv[1]);
}
else {
fprintf(stderr, "Usage: logout device\n");
return 1;
}
}
Compile it:
gcc -o logout logout.c -lutil
And then run it for whatever it says in the output of finger's "On since" line(s) as a parameter:
# finger xxxx
Login: xxxx Name:
Directory: /home/xxxx Shell: /bin/bash
On since Sun Feb 26 11:06 (GMT) on 127.0.0.1:6 (messages off) from 127.0.0.1
On since Fri Feb 24 16:53 (GMT) on pts/6, idle 3 days 17:16, from 127.0.0.1
Last login Mon Feb 10 14:45 (GMT) on pts/11 from somehost.example.com
Mail last read Sun Feb 27 08:44 2014 (GMT)
No Plan.
# userdel xxxx
userdel: account `xxxx' is currently in use.
# ./logout 127.0.0.1:6
# ./logout pts/6
# userdel xxxx
no crontab for xxxx

How to measure CPU usage

I would like to log CPU usage at a frequency of 1 second.
One possible way to do it is via vmstat 1 command.
The problem is that the time between each output is not always exactly one second, especially on a busy server. I would like to be able to output the timestamp along with the CPU usage every second. What would be a simple way to accomplish this, without installing special tools?
There are many ways to do that. Besides top, another way is to use the "sar" utility. Something like
sar -u 1 10
will report the CPU utilization once per second, 10 times. At the end it prints averages for each of the columns (user, sys, iowait, idle and so on).
Another utility is "mpstat", which gives output similar to sar.
Use the well-known UNIX tool top that is normally available on Linux systems:
top -b -d 1 > /tmp/top.log
The first line of each output block from top contains a timestamp.
I see no command line option to limit the number of rows that top displays.
Section 5a. SYSTEM Configuration File and 5b. PERSONAL Configuration File of the top man page describes pressing W when running top in interactive mode to create a $HOME/.toprc configuration file.
I did this, then edited my .toprc file and changed all maxtasks values so that they are maxtasks=4. Then top only displays 4 rows of output.
For completeness, the alternative way to do this using pipes is:
top -b -d 1 | awk '/load average/ {n=10} {if (n-- > 0) {print}}' > /tmp/top.log
You might want to try htop and atop. htop is beautifully interactive while atop gathers information and can report CPU usage even for terminated processes.
I found a neat way to get the timestamp information to be displayed along with the output of vmstat.
Sample command:
vmstat -n 1 3 | while read line; do echo "$(date --iso-8601=seconds) $line"; done
Output:
2013-09-13T14:01:31-0700 procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
2013-09-13T14:01:31-0700 r b swpd free buff cache si so bi bo in cs us sy id wa
2013-09-13T14:01:31-0700 1 1 4197640 29952 124584 12477708 12 5 449 147 2 0 7 4 82 7
2013-09-13T14:01:32-0700 3 0 4197780 28232 124504 12480324 392 180 15984 180 1792 1301 31 15 38 16
2013-09-13T14:01:33-0700 0 1 4197656 30464 124504 12477492 344 0 2008 0 1892 1929 32 14 43 10
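The same idea can be packaged as a reusable filter that prefixes every line on stdin with the current time. A minimal sketch: stamp_lines is a made-up name, and the date format string is chosen to avoid the GNU-only --iso-8601 option.

```shell
# stamp_lines: copy stdin to stdout, prefixing each line with an
# ISO-8601-style local timestamp taken when the line is read.
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date '+%Y-%m-%dT%H:%M:%S%z')" "$line"
    done
}

# Usage:
# vmstat -n 1 | stamp_lines >> /tmp/vmstat.log
```

Using IFS= and read -r keeps leading spaces and backslashes in each line intact, so the column alignment of the vmstat output is preserved.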
To monitor disk usage, CPU and load, I created a small bash script that writes the values to a log file every 10 seconds.
This log file is processed by Logstash, Kibana and Riemann.
#!/usr/bin/env bash
# Define a timestamp function
LOGPATH="/var/log/systemstatus.log"
timestamp() {
date +"%Y-%m-%dT%T.%N"
}
#server load
while ( sleep 10 ) ; do
echo -n "$(timestamp) linux::systemstatus::load " >> $LOGPATH
cat /proc/loadavg >> $LOGPATH
#cpu usage
echo -n "$(timestamp) linux::systemstatus::cpu " >> $LOGPATH
top -bn 1 | sed -n 3p >> $LOGPATH
#disk usage
echo -n "$(timestamp) linux::systemstatus::storage " >> $LOGPATH
df --total|grep total|sed "s/total//g"| sed 's/^ *//' >> $LOGPATH
done
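If you would rather not parse top, overall CPU utilization can also be derived from two samples of the first line of /proc/stat on Linux. The sketch below is one way to do that arithmetic; cpu_busy_pct is a hypothetical helper, it counts idle plus iowait as "not busy", and it sums only the first eight counters (user through steal) because guest time is already included in user.

```shell
# cpu_busy_pct PREV CUR
# PREV and CUR are full "cpu ..." lines from /proc/stat taken some time
# apart. Prints the percentage of time the CPUs were busy in between.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= NF && i <= 9; i++) total += $i
            idle = $5 + $6                 # idle plus iowait
        }
        NR == 1 { prev_total = total; prev_idle = idle }
        NR == 2 {
            dt = total - prev_total
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (dt - (idle - prev_idle)) / dt
        }'
}

# Usage (one timestamped value per second):
# prev=$(head -n 1 /proc/stat)
# while sleep 1; do
#     cur=$(head -n 1 /proc/stat)
#     echo "$(date '+%Y-%m-%dT%H:%M:%S') $(cpu_busy_pct "$prev" "$cur")"
#     prev=$cur
# done
```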

Getting CPU utilization information

How could I get the CPU utilization with time info of a process in linux? Basically I want to let my application run overnight. At the same time, I would like to monitor the CPU utilization during the period the application is run.
I tried top | grep appName >& log, it does not seem to return me anything in the log. Could someone help me with this?
Thanks.
vmstat and iostat can both give you periodic information of this nature; I would suggest either setting up the number of times manually, or putting a single poll into a cron job, and then redirecting the output to a file:
vmstat 20 4320 >> cpu_log_file
This would give you a snapshot of usage every 20 seconds for 24 hours.
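The count argument is simply the run time divided by the interval, so it can be computed rather than typed by hand. A tiny sketch (samples_for is a hypothetical helper name):

```shell
# samples_for HOURS INTERVAL_SECONDS
# Number of samples needed to cover HOURS at one sample per INTERVAL_SECONDS.
samples_for() {
    echo $(( $1 * 3600 / $2 ))
}

# Usage: 24 hours at one sample every 20 seconds
# vmstat 20 "$(samples_for 24 20)" >> cpu_log_file
```

For 24 hours at a 20-second interval this works out to 86400 / 20 = 4320 samples.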
install sysstat package and run sar
nohup sar -o output.file 12 8 >/dev/null 2>&1 &
use the top or watch command
PID COMMAND %CPU TIME #TH #WQ #PORT #MREG RPRVT RSHRD RSIZE VPRVT VSIZE PGRP PPID STATE UID FAULTS COW MSGSENT MSGRECV SYSBSD SYSMACH CSW PAGEINS USER
10764 top 8.4 00:01.04 1/1 0 24 33 2000K 244K 2576K 17M 2378M 10764 10719 running 0 9908+ 54 564790+ 282365+ 3381+ 283412+ 838+ 27 root
10763 taskgated 0.0 00:00.00 2 0 25 27 432K 244K 1004K 27M 2387M 10763 1 sleeping 0 376 60 140 60 160 109 11 0 root
Write a program that invokes your process and then calls getrusage(2) and reports statistics for its children.
You can monitor the time used by your program with top while it is running.
Alternatively, you can launch your application with the time command, which will print the total amount of CPU time used by your program at the end of its execution. Just type time ./my_app instead of just ./my_app
For more info, man 1 time

How to see top processes sorted by actual memory usage?

I have a server with 12G of memory. A fragment of top is shown below:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12979 frank 20 0 206m 21m 12m S 11 0.2 26667:24 krfb
13 root 15 -5 0 0 0 S 1 0.0 36:25.04 ksoftirqd/3
59 root 15 -5 0 0 0 S 0 0.0 4:53.00 ata/2
2155 root 20 0 662m 37m 8364 S 0 0.3 338:10.25 Xorg
4560 frank 20 0 8672 1300 852 R 0 0.0 0:00.03 top
12981 frank 20 0 987m 27m 15m S 0 0.2 45:10.82 amarok
24908 frank 20 0 16648 708 548 S 0 0.0 2:08.84 wrapper
1 root 20 0 8072 608 572 S 0 0.0 0:47.36 init
2 root 15 -5 0 0 0 S 0 0.0 0:00.00 kthreadd
The free -m shows the following:
total used free shared buffers cached
Mem: 12038 11676 362 0 599 9745
-/+ buffers/cache: 1331 10706
Swap: 2204 257 1946
If I understand correctly, the system has only 362 MB of available memory. My question is: How can I find out which process is consuming most of the memory?
Just as background info, the system is running 64bit OpenSuse 12.
A quick tip using the top command on Linux/Unix:
$ top
and then hit Shift+m (i.e. write a capital M).
From man top
SORTING of task window
For compatibility, this top supports most of the former top sort keys.
Since this is primarily a service to former top users, these commands do
not appear on any help screen.
command   sorted-field                supported
A         start time (non-display)    No
M         %MEM                        Yes
N         PID                         Yes
P         %CPU                        Yes
T         TIME+                       Yes
Or alternatively: hit Shift+f, choose memory usage as the sort field by pressing n, then press Enter. You will see the active processes ordered by memory usage.
First, repeat this mantra for a little while: "unused memory is wasted memory". The Linux kernel keeps around huge amounts of file metadata and files that were requested, until something that looks more important pushes that data out. It's why you can run:
find /home -type f -name '*.mp3'
find /home -type f -name '*.aac'
and have the second find instance run at ridiculous speed.
Linux only leaves a little bit of memory 'free' to handle spikes in memory usage without too much effort.
Second, you want to find the processes that are eating all your memory; in top use the M command to sort by memory use. Feel free to ignore the VIRT column, that just tells you how much virtual memory has been allocated, not how much memory the process is using. RES reports how much memory is resident, or currently in ram (as opposed to swapped to disk or never actually allocated in the first place, despite being requested).
But, since RES will count e.g. /lib/libc.so.6 memory once for nearly every process, it isn't exactly an awesome measure of how much memory a process is using. The SHR column reports how much memory is shared with other processes, but there is no guarantee that another process is actually sharing -- it could be sharable, just no one else wants to share.
The smem tool is designed to help users better gauge just how much memory should really be blamed on each individual process. It does some clever work to figure out what is really unique, what is shared, and proportionally tallies the shared memory to the processes sharing it. smem may help you understand where your memory is going better than top will, but top is an excellent first tool.
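The Linux kernel exposes this kind of proportional accounting as the Pss fields in /proc/PID/smaps, where each shared page is divided by the number of processes mapping it. As a rough sketch (assuming a kernel that provides smaps, and that summing these fields approximates the per-process figure smem reports; sum_pss_kib is a hypothetical name), the total can be computed with awk:

```shell
# sum_pss_kib: read smaps-format text on stdin and print the total
# proportional set size in KiB. Only lines that start with "Pss:" are
# counted, so related fields such as SwapPss are ignored.
sum_pss_kib() {
    awk '/^Pss:/ { total += $2 } END { print total + 0 }'
}

# Usage (Linux; reading another user's smaps usually needs root):
# sum_pss_kib < "/proc/$$/smaps"
```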
ps aux | awk '{print $2, $4, $11}' | sort -k2rn | head -n 10
(Adding -n numeric flag to sort command.)
First you should read an explanation on the output of free. Bottom line: you have at least 10.7 GB of memory readily usable by processes.
Then you should define what "memory usage" is for a process (it's not easy or unambiguous, trust me).
Then we might be able to help more :-)
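The 10.7 GB figure comes from adding the buffers and cached columns back onto free. A small sketch of that arithmetic, assuming the older free layout shown in the question (columns total, used, free, shared, buffers, cached); newer versions of free print an "available" column instead, and usable_mb is a made-up helper name:

```shell
# usable_mb: read old-style `free -m` output on stdin and print
# free + buffers + cached from the "Mem:" row, in MB.
usable_mb() {
    awk '/^Mem:/ { print $4 + $6 + $7 }'
}

# Usage:
# free -m | usable_mb
```

With the numbers from the question this is 362 + 599 + 9745 = 10706 MB, which matches the free column of the "-/+ buffers/cache" row.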
List and Sort Processes by Memory Usage:
ps -e -orss=,args= | sort -b -k1,1n | pr -TW$COLUMNS
ps aux --sort '%mem'
from procps' ps (default on Ubuntu 12.04) generates output like:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
...
tomcat7 3658 0.1 3.3 1782792 124692 ? Sl 10:12 0:25 /usr/lib/jvm/java-7-oracle/bin/java -Djava.util.logging.config.file=/var/lib/tomcat7/conf/logging.properties -D
root 1284 1.5 3.7 452692 142796 tty7 Ssl+ 10:11 3:19 /usr/bin/X -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten tcp vt7 -novtswitch
ciro 2286 0.3 3.8 1316000 143312 ? Sl 10:11 0:49 compiz
ciro 5150 0.0 4.4 660620 168488 pts/0 Sl+ 11:01 0:08 unicorn_rails worker[1] -p 3000 -E development -c config/unicorn.rb
ciro 5147 0.0 4.5 660556 170920 pts/0 Sl+ 11:01 0:08 unicorn_rails worker[0] -p 3000 -E development -c config/unicorn.rb
ciro 5142 0.1 6.3 2581944 239408 pts/0 Sl+ 11:01 0:17 sidekiq 2.17.8 gitlab [0 of 25 busy]
ciro 2386 3.6 16.0 1752740 605372 ? Sl 10:11 7:38 /usr/lib/firefox/firefox
So here Firefox is the top consumer with 16% of my memory.
You may also be interested in:
ps aux --sort '%cpu'
Building on gaoithe's answer, I attempted to make the memory units display in megabytes, and sorted by memory descending limited to 15 entries:
ps -e -orss=,args= |awk '{print $1 " " $2 }'| awk '{tot[$2]+=$1;count[$2]++} END {for (i in tot) {print tot[i],i,count[i]}}' | sort -n | tail -n 15 | sort -nr | awk '{ hr=$1/1024; printf("%13.2fM", hr); print "\t" $2 }'
588.03M /usr/sbin/apache2
275.64M /usr/sbin/mysqld
138.23M vim
97.04M -bash
40.96M ssh
34.28M tmux
17.48M /opt/digitalocean/bin/do-agent
13.42M /lib/systemd/systemd-journald
10.68M /lib/systemd/systemd
10.62M /usr/bin/redis-server
8.75M awk
7.89M sshd:
4.63M /usr/sbin/sshd
4.56M /lib/systemd/systemd-logind
4.01M /usr/sbin/rsyslogd
Here's an example alias to use it in a bash config file:
alias topmem="ps -e -orss=,args= |awk '{print \$1 \" \" \$2 }'| awk '{tot[\$2]+=\$1;count[\$2]++} END {for (i in tot) {print tot[i],i,count[i]}}' | sort -n | tail -n 15 | sort -nr | awk '{ hr=\$1/1024; printf(\"%13.2fM\", hr); print \"\t\" \$2 }'"
Then you can just type topmem on the command line.
How to total up used memory by process name:
Sometimes even looking at the biggest single processes there is still a lot of used memory unaccounted for. To check if there are a lot of the same smaller processes using the memory you can use a command like the following which uses awk to sum up the total memory used by processes of the same name:
ps -e -orss=,args= |awk '{print $1 " " $2 }'| awk '{tot[$2]+=$1;count[$2]++} END {for (i in tot) {print tot[i],i,count[i]}}' | sort -n
e.g. output
9344 docker 1
9948 nginx: 4
22500 /usr/sbin/NetworkManager 1
24704 sleep 69
26436 /usr/sbin/sshd 15
34828 -bash 19
39268 sshd: 10
58384 /bin/su 28
59876 /bin/ksh 29
73408 /usr/bin/python 2
78176 /usr/bin/dockerd 1
134396 /bin/sh 84
5407132 bin/naughty_small_proc 1432
28061916 /usr/local/jdk/bin/java 7
you can specify which column to sort by, with following steps:
steps:
* top
* shift + F
* select a column from the list
e.g. n means sort by memory,
* press enter
* ok
You can see memory usage by executing this code in your terminal:
$ watch -n2 free -m
$ htop
This very second in time
ps -U $(whoami) -eom pid,pmem,pcpu,comm | head -n4
Continuously updating
watch -n 1 'ps -U $(whoami) -eom pid,pmem,pcpu,comm | head -n4'
I also added a few goodies here you might appreciate (or you might ignore)
-n 1 watch and update every second
-U $(whoami) To show only your processes. $(some command) evaluates now
| head -n4 To only show the header and 3 processes at a time bc often you just need high usage line items
${1-4} says my first argument $1 I want to default to 4, unless I provide it
If you are using a mac you may need to install watch first
brew install watch
Alternatively you might use a function
psm(){
watch -n 1 "ps -eom pid,pmem,pcpu,comm | head -n ${1-4}"
# EXAMPLES:
# psm
# psm 10
}
You have this simple command:
$ free -h

Resources