Getting CPU utilization information - linux

How could I get the CPU utilization with time info of a process in linux? Basically I want to let my application run overnight. At the same time, I would like to monitor the CPU utilization during the period the application is run.
I tried top | grep appName >& log, it does not seem to return me anything in the log. Could someone help me with this?
Thanks.

vmstat and iostat can both give you periodic information of this nature; I would suggest either setting up the number of times manually, or putting a single poll into a cron job, and then redirecting the output to a file:
vmstat 20 4230 >> cpu_log_file
This would give you a snapshot of usage every 20 seconds for 24 hours.

install sysstat package and run sar
nohup sar -o output.file 12 8 >/dev/null 2>&1 &

use the top or watch command
PID COMMAND %CPU TIME #TH #WQ #PORT #MREG RPRVT RSHRD RSIZE VPRVT VSIZE PGRP PPID STATE UID FAULTS COW MSGSENT MSGRECV SYSBSD SYSMACH CSW PAGEINS USER
10764 top 8.4 00:01.04 1/1 0 24 33 2000K 244K 2576K 17M 2378M 10764 10719 running 0 9908+ 54 564790+ 282365+ 3381+ 283412+ 838+ 27 root
10763 taskgated 0.0 00:00.00 2 0 25 27 432K 244K 1004K 27M 2387M 10763 1 sleeping 0 376 60 140 60 160 109 11 0 root

Write a program that invokes your process and then calls getrusage(2) and reports statistics for its children.

You can monitor the time used by your program with top while it is running.
Alternatively, you can launch your application with the time command, which will print the total amount of CPU time used by your program at the end of its execution. Just type time ./my_app instead of just ./my_app
For more info, man 1 time

Related

How do I get 'stress --hdd' to stop at end of '--timeout' period?

I am using stress --hdd 1 --timeout 10s -v to stress the IO/disk on an Ubuntu system. Sometimes when running this command, I get something like:
stress: info: [16420] dispatching hogs: 0 cpu, 0 io, 0 vm, 1 hdd
...
stress: info: [16420] successful run completed in 13s
and other times I got something like:
...
stress: info: [16390] successful run completed in 31s
With iostat -yx 1 open in a separate terminal, I can indeed see that the %util and aqu-sz size stay high for roughly that longer-than-specified amount of time. I also see the mmcqd/0 process at the top of top.
How can I get the IO stress to stop within +2 seconds of the --timeout argument given in the stress command?
I have tried stuff like timeout 10 stress --hdd 1 --timeout 10s -v; pkill stress; pkill mmcqd;
without much success. Is there a way to purge the IO queue?
Thanks!

calculating correct memory utilization from invalid output of pidstat

I used, pidstat -r -p <pid> <interval> >> log_Path/MemStat.CSV & command to to collect the memory stat.
after running this command I found that RSS VSZ %MEM values are increased continuously, which is not expected, as pidstat provides the the values considering the interval.
After searching on net I found that there is bug in pidstat and I need to update the syssat package.
(please refer last few statements by pidstat author on this link : http://sebastien.godard.pagesperso-orange.fr/tutorial.html)
Now, my question is, how do I calculate the correct %MEM utilization from the current output as we cannot run the test again.
sample output :
Time PID minflt/s majflt/s VSZ RSS %MEM
9:55:22 AM 18236 1280 0 26071488 119136 0.36
9:55:23 AM 18236 4273 0 27402768 126276 0.38
9:55:24 AM 18236 9831 0 27402800 162468 0.49
9:55:25 AM 18236 161 0 27402800 169092 0.51
9:55:26 AM 18236 51 0 27402800 175416 0.53
9:55:27 AM 18236 6859 0 27402800 198340 0.6
9:55:28 AM 18236 1440 0 27402800 203608 0.62
On the Sysstat tutorial page you refer to it is said:
I noticed that pidstat had a memory footprint (VSZ and RSS fields)
that was constantly increasing as the time went by. I quickly found
that I had forgotten to close a file descriptor in a function of my
code and that was responsible for the memory leak...!
So, the output of pidstat has never been invalid; on the contrary, the author wrote
pidstat helped me to detect a memory leak in the pidstat command
itself.

Why is the system CPU time (% sy) high?

I am running a script that loads big files. I ran the same script in a single core OpenSuSe server and quad core PC. As expected in my PC it is much more faster than in the server. But, the script slows down the server and makes it impossible to do anything else.
My script is
for 100 iterations
Load saved data (about 10 mb)
time myscript (in PC)
real 0m52.564s
user 0m51.768s
sys 0m0.524s
time myscript (in server)
real 32m32.810s
user 4m37.677s
sys 12m51.524s
I wonder why "sys" is so high when i run the code in server. I used top command to check the memory and cpu usage.
It seems there is still free memory, so swapping is not the reason. % sy is so high, its probably the reason for the speed of server but I dont know what is causing % sy so high. The process that is using highest percent of CPU (99%) is "myscript". %wa is zero in the screenshot but sometimes it gets very high (50 %).
When the script is running, load average is greater than 1 but have never seen to be as high as 2.
I also checked my disc:
strt:~ # hdparm -tT /dev/sda
/dev/sda:
Timing cached reads: 16480 MB in 2.00 seconds = 8247.94 MB/sec
Timing buffered disk reads: 20 MB in 3.44 seconds = 5.81 MB/sec
john#strt:~> df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda2 245G 102G 131G 44% /
udev 4.0G 152K 4.0G 1% /dev
tmpfs 4.0G 76K 4.0G 1% /dev/shm
I have checked these things but I am still not sure what is the real problem in my server and how to fix it. Can anyone identify a probable reason for the slowness? What could be the solution?
Or is there anything else I should check?
Thanks!
You're getting a high sys activity because the load of the data you're doing takes system calls that happen in kernel. To resolve your slowness problems without upgrading the server might be possible. You can modify scheduling priority. See the man pages for nice and renice. See here and especially:
Niceness values range from -20 (the highest priority, lowest niceness) and 19 (the lowest priority, highest niceness).
$ ps -lp 941
F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
4 S 0 941 1 0 70 -10 - 1713 poll_s ? 00:00:00 sshd
$ nice -n 19 ./test.sh
My niceness value is 19
$ renice -n 10 -p 941
941 (process ID) old priority -10, new priority 10

CPU User time and System time on AIX

How can I get CPU user time and system time for each cpu on AIX.
I know I can get this value from cat /proc/stat on a linux machine, and from pstat_getprocessor() on an HP-UX machine. Is there a way to get this same metric on an AIX machine.
$ cat /proc/stat
...
cpu 23697394 7969 2744135 4505191649 2958605 190 17883 0 0
cpu0 12511394 4575 1520243 2251753159 1480624 137 10580 0 0
cpu1 11186000 3394 1223891 2253438490 1477980 53 7302 0 0
...
mpstat is providing these metrics, either parse its output or figure out how/where does it find them.

curl hanging, despite connect-timeout and max-time

I have some scripts retrieving resources (image files etc) using system calls to curl. Occasionally, these will fail to finish, and will show as pipe_w in process listings.
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
0 S root 4378 4086 0 82 2 - 16002 pipe_w Jan10 ? 00:00:00 curl -JO --max-time 60 --connect-timeout 60 https://address/path/to/resource?identifier=tag
If I understand correctly, I can use connect-timeout to set the # of seconds to try and make the connection, and max-time to limit the amount of time to wait for response from the remote machine.
curl -JO --max-time 60 --connect-timeout 60 https://address/path/to/resource?identifier=tag
Any suggestions as to how I can force curl to continue past this? Or pointers on what might cause this?
This is using curl 7.21.0, on a stock ubuntu 10.10.

Resources