Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
The community reviewed whether to reopen this question 1 year ago and left it closed:
Original close reason(s) were not resolved
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I am looking for a Linux command-line tool that would report the disk IO activity. Something similar to htop would be really cool. Has someone heard of something like that?
You could use iotop. It doesn't rely on a kernel patch, and it works with the stock Ubuntu kernel.
There is a package for it in the Ubuntu repos. You can install it using
sudo apt-get install iotop
nmon shows a nice display of disk activity per device. It is available for Linux.
Disk I/O (/proc/diskstats), all data is Kbytes per second
DiskName Busy    Read  WriteKB |0 |25 |50 |75 100|
sda        0%     0.0    127.9 |>
sda1       1%     0.0    127.9 |>
sda2       0%     0.0      0.0 |>
sda5       0%     0.0      0.0 |>
sdb       61%   385.6   9708.7 |WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWR>
sdb1      61%   385.6   9708.7 |WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWR>
sdc       52%   353.6   9686.7 |WWWWWWWWWWWWWWWWWWWWWWWWWWR >
sdc1      53%   353.6   9686.7 |WWWWWWWWWWWWWWWWWWWWWWWWWWR >
sdd       56%   359.6   9800.6 |WWWWWWWWWWWWWWWWWWWWWWWWWWWW>
sdd1      56%   359.6   9800.6 |WWWWWWWWWWWWWWWWWWWWWWWWWWWW>
sde       57%   371.6   9574.9 |WWWWWWWWWWWWWWWWWWWWWWWWWWWWR>
sde1      57%   371.6   9574.9 |WWWWWWWWWWWWWWWWWWWWWWWWWWWWR>
sdf       53%   371.6   9740.7 |WWWWWWWWWWWWWWWWWWWWWWWWWWR >
sdf1      53%   371.6   9740.7 |WWWWWWWWWWWWWWWWWWWWWWWWWWR >
md0        0%  1726.0   2093.6 |>disk busy not available
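If you only need raw numbers, a minimal sketch along these lines reads the same /proc/diskstats counters that the nmon header mentions. It assumes the usual field layout (device name in field 3, sectors read in field 6, sectors written in field 10) and 512-byte sectors. The function name diskstats_kb is made up for illustration, and the values are totals since boot, not per-second rates.

```shell
#!/bin/sh
# Print "device KB_read KB_written" (totals since boot) from
# diskstats-formatted input on stdin.
# Assumed layout: field 3 = device name, field 6 = sectors read,
# field 10 = sectors written; sectors are 512 bytes, so sectors / 2 = KB.
diskstats_kb() {
    awk '{ printf "%s %d %d\n", $3, $6 / 2, $10 / 2 }'
}

# Usage: diskstats_kb < /proc/diskstats
```

To get a rate, you would sample twice and divide the difference by the interval.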
It is not htop-like, but you could use atop. However, to display disk activity per process, it used to need a kernel patch (available from the site). These kernel patches are now obsolete; an optional kernel module is provided only for showing per-process network activity.
Use collectl which has extensive process I/O monitoring including monitoring threads.
Be warned that there are I/O counters for I/O being written to cache and I/O going to disk. collectl reports them separately. If you're not careful you can misinterpret the data. See http://collectl.sourceforge.net/Process.html
Of course, it shows a lot more than just process stats, because you'd want one tool that provides everything rather than a bunch of different ones that each display things in a different format, right?
I am working on a per process memory monitoring (Bash) script but it turns out to be more of a headache than I thought. Especially on forked processes such as PostgreSQL. There are a couple of reasons:
RSS is a candidate for the memory usage value, but it also includes shared libraries etc. that are used by other processes as well.
PSS is another candidate: it counts a process's private memory plus its proportional share of shared memory. The problem is that PSS can only be retrieved from /proc/<pid>/smaps, which requires elevated privileges (or root).
USS (calculated as Private_Dirty + Private_Clean; source: How does smem calculate RSS, USS and PSS?) is a further candidate, but again it needs access to /proc/<pid>/smaps.
For now I am trying to solve the forked process problem by looping through each PID's smaps (as suggested in https://www.depesz.com/2012/06/09/how-much-ram-is-postgresql-using/), for example:
for pid in $(pgrep -a -f "postgres" | awk '{print $1}' | tr "\n" " " ); do grep "^Pss:" /proc/$pid/smaps; done
Maybe some of the postgres processes should be excluded, I am not sure.
Using this method to calculate and sum the PSS and USS values results in:
PSS: 4817 MB - USS: 4547 MB - RES: 6176 MB - VIRT: 26851 MB used
Obviously this only works with elevated privileges, which I would prefer to avoid. Whether these values are accurate is unclear, because other tools and commands show different values yet again.
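For reference, the summing step can be sketched as a small function that reads smaps-formatted text on stdin. This is only an illustration of the approach described above, not the final script; the name sum_smaps is hypothetical, and the usage line assumes that `pgrep -x postgres` matches exactly the processes of interest.

```shell
#!/bin/sh
# Sum PSS and USS (Private_Clean + Private_Dirty), in kB, over all
# smaps-formatted lines read from stdin.
sum_smaps() {
    awk '
        /^Pss:/           { pss += $2 }
        /^Private_Clean:/ { uss += $2 }
        /^Private_Dirty:/ { uss += $2 }
        END { printf "PSS: %d kB - USS: %d kB\n", pss, uss }
    '
}

# Usage (needs permission to read the smaps files):
# for pid in $(pgrep -x postgres); do cat "/proc/$pid/smaps"; done | sum_smaps
```

Matching on the process name with -x rather than on the full command line with -f avoids picking up unrelated processes that merely mention postgres in their arguments.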
Unfortunately top and htop are unable to combine the postgres processes. atop can do this and seems (by feel) to be the most accurate, with the following values:
NPROCS  SYSCPU  USRCPU  VSIZE  RSIZE  PSIZE  SWAPSZ  RDDSK  WRDSK  RNET  SNET  MEM  CMD       1/1
    27  56m50s  16m40s   5.4G   1.1G     0K   2308K     0K     0K     0     0  11%  postgres
Now to the question: What is the suggested and best way to retrieve the most accurate memory usage of an application with forked processes, such as PostgreSQL?
And in case atop already does an accurate calculation, how does atop get to the RSIZE value? Note that this value is shown both as root and as a non-root user, which would probably mean that /proc/<pid>/smaps is not used for the calculation.
Please comment if more information is needed.
EDIT: I actually found a bug in the pgrep pattern in my final script: it wrongly matched a lot more than just the postgres processes.
The new output now shows the same RES value as seen in atop RSIZE:
Script output:
PSS: 205 MB - USS: 60 MB - RES: 1162 MB - VIRT: 5506 MB
atop summarized postgres output:
NPROCS  SYSCPU  USRCPU  VSIZE  RSIZE  PSIZE  SWAPSZ  RDDSK  WRDSK  RNET  SNET  MEM  CMD
    27   0.04s   0.10s   5.4G   1.1G     0K   2308K     0K    32K     0     0  11%  postgres
But the question remains, of course, unless the summed RSS (RES) value I am using now is already the most accurate approach. Let me know your thoughts, thanks :)
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 7 months ago.
I need this for some stress testing of my CPU. The machine runs Debian/Ubuntu Linux and I was wondering if there's any way that I could put the CPU under load until it reaches a certain temperature.
Are there any commands, packages or bash scripts for this?
Any help is appreciated!
Download Prime95 from here or use any other CPU-Stress test that works under Debian/Ubuntu.
Get the following package:
sudo apt-get install lm-sensors
Start sensors in a terminal and update continuously:
watch sensors
Now start Prime95 or your preferred stress test and you can watch the CPU temperature in the terminal. Stop the stress test if the temperature exceeds your desired value. (Modern CPUs lower their clock speed or shut down automatically before overheating causes damage.)
OR (automatically stopping at a user-specified temp)
Get the following packages:
sudo apt-get install lm-sensors
sudo apt-get install stress
Store the following code as a bash file, e.g. stresstest.sh, and run it with bash /path/to/stresstest.sh
#!/bin/bash
sensors=/usr/bin/sensors
read_temp() {
    # get the highest package temperature from lm-sensors (whole degrees C)
    "${sensors}" | awk '/Package/ { t = int($4); if (t > max) max = t } END { print max + 0 }'
}
echo 'Maximum CPU-Temperature:'
# insert tjMax manually
read -r tjMax
echo 'Workers for testing:'
# more workers cause higher load on the cpu
read -r workers
echo 'starting stress-test.'
pckgMax=$( read_temp )
while [ "$tjMax" -gt "$pckgMax" ]
do
    # do 10sec stress-test
    # if the temperature overshoots too far, try lowering the --timeout
    stress --cpu "${workers}" --timeout 10
    # update Packagetemp right after the run, before the next check
    pckgMax=$( read_temp )
done
echo 'reached tjMax.'
echo 'stopping stress-test.'
# each stress run exits after its timeout, so nothing is left running here
I don't know of an existing software package (other than prime95 for max heating), but it's relatively easy to create loops with differing amounts of heat, like awk 'BEGIN{for(i=0;i<100000000;i++){}}' keeps a CPU busy for a while making some heat.
See How to write x86 assembly code to check the effect of temperature on the performance of the processor for some suggestions on creating loops that heat the CPU more vs. less.
To hit a target temperature, you'll need to write some code to implement a control loop that reads the temperature (and the direction it's trending) and adjusts the load by starting/stopping threads, or changing how power-intensive each thread is. Without this feedback, you won't consistently hit a given CPU temperature; the actual temperature for a fixed workload will depend on ambient temp, how dusty your heat-sink is, how the BIOS manages your fan speeds, etc.
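As a rough illustration of such a feedback loop, here is a minimal sketch of the decision step only: given the current worker count, the measured temperature and the target, it steps the number of load-generating workers up or down by one. The function name next_workers and the one-step policy are my own assumptions; a real controller would probably also look at the trend.

```shell
#!/bin/sh
# next_workers CURRENT TEMP TARGET MAX
# Prints the adjusted worker count: one more if we are below the target
# temperature, one fewer if we are above it, clamped to the range 0..MAX.
next_workers() {
    cur=$1 temp=$2 target=$3 max=$4
    if [ "$temp" -lt "$target" ] && [ "$cur" -lt "$max" ]; then
        echo $((cur + 1))
    elif [ "$temp" -gt "$target" ] && [ "$cur" -gt 0 ]; then
        echo $((cur - 1))
    else
        echo "$cur"
    fi
}

# Hypothetical usage (the hwmon path differs between machines, the target
# of 70 is arbitrary, and stress is the lm-sensors-independent load tool):
# workers=0
# while true; do
#     temp=$(( $(cat /sys/class/hwmon/hwmon3/temp1_input) / 1000 ))
#     workers=$(next_workers "$workers" "$temp" 70 "$(nproc)")
#     if [ "$workers" -gt 0 ]; then stress --cpu "$workers" --timeout 5; else sleep 5; fi
# done
```

Adjusting by one worker per interval keeps the loop simple, at the cost of reacting slowly and oscillating around the target.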
Perhaps your CPU-heating threads could branch on a global atomic variable in an outer loop, so you can change what work they do by changing a variable. e.g. 2 FMAs per clock, 1 FMA per clock, FMAs bottlenecked by latency (so 1 per 4 clocks), or just integer work, or a loop just running pause instructions, so it does the minimum. Or 256-bit vs. 128-bit vs. scalar.
Perhaps also changing your EPP setting (on Intel Skylake or newer) with sudo sh -c 'for i in /sys/devices/system/cpu/cpufreq/policy[0-9]*/energy_performance_preference;do echo performance > "$i";done' or balance_performance or balance_power (emphasize power-saving); these may affect what turbo clock speeds your CPU chooses to run at.
Read the temperature with lm-sensors, or by reading from the "coretemp" kernel driver directly on modern x86 hardware, e.g. /sys/class/hwmon/hwmon3/temp1_input reads as 36000 for 36 degrees C.
$ grep . /sys/class/hwmon/hwmon3/*
/sys/class/hwmon/hwmon3/name:coretemp
/sys/class/hwmon/hwmon3/temp1_input:36000
/sys/class/hwmon/hwmon3/temp1_label:Package id 0
/sys/class/hwmon/hwmon3/temp2_input:35000
/sys/class/hwmon/hwmon3/temp2_label:Core 0
/sys/class/hwmon/hwmon3/temp5_input:33000
/sys/class/hwmon/hwmon3/temp5_label:Core 3
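Since the hwmon number (hwmon3 here) can differ between machines and boots, a script may want to look the directory up by its name file instead. A minimal sketch, assuming the layout shown above; the function name read_coretemp and the optional base-directory argument are made up for illustration:

```shell
#!/bin/sh
# Print "label: N C" for every temperature input of the hwmon device
# whose name file contains "coretemp". Values in sysfs are millidegrees C.
# The optional first argument overrides the sysfs base directory.
read_coretemp() {
    base=${1:-/sys/class/hwmon}
    for dir in "$base"/hwmon*; do
        [ "$(cat "$dir/name" 2>/dev/null)" = "coretemp" ] || continue
        for f in "$dir"/temp*_input; do
            [ -r "$f" ] || continue
            label=$(cat "${f%_input}_label" 2>/dev/null)
            # fall back to the file name if there is no label file
            echo "${label:-${f##*/}}: $(( $(cat "$f") / 1000 )) C"
        done
    done
}

# Usage: read_coretemp
```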
I'm not sure if the temperature is available directly to user-space without the kernel's help, e.g. via CPUID, on Intel or AMD.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I have a couple of machines, shown below, which are running Ubuntu 12.04, and I need to find the name and PID of any process whose CPU usage is greater than 70%.
Below are the machines as an example -
machineA
machineB
machineC
machineD
I need a shell script that runs every 15 minutes and checks whether any of the above machines has a process with CPU usage greater than 70%. If any do, it should send out an email with the machine name and the process name along with its PID.
I will be running my shell script from machineX and I have passwordless ssh key setup for user david from machineX to all the above machines.
What is the best way to do all these kind of monitoring?
I have below command which can get me PID, %CPU and COMMAND name of the process whose CPU usage is greater than 70%.
ps aux --sort=-%cpu | awk 'NR==1{print $2,$3,$11}NR>1{if($3>=70) print $2,$3,$11}'
Not sure how to fully automate this process?
You are measuring the wrong thing there. The %CPU column in ps is not a live reading: it is the CPU time the process has used divided by the time it has been running, so it says little about what the process is doing right now. A process that looks busy may not be using any resources a second later. It is better to catch CPU-hungry processes another way; my preferred way is by CPU time. Have a look at this example:
panos@wintermute:~$ ps xafu
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         2  0.0  0.0      0     0 ?        S    18:53   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    18:53   0:00  \_ [ksoftirqd/
root         5  0.0  0.0      0     0 ?        S<   18:53   0:00  \_ [kworker/0:
root         7  0.0  0.0      0     0 ?        S    18:53   0:02  \_ [rcu_sched]
TIME is the CPU time a process has consumed so far. Normal processes do not need much CPU time. So with a simple shell script and a small loop you can gather the information you need. The script could look like this:
#!/bin/sh
date
for i in A B C D ; do
    echo "machine${i}"
    # TIME is printed as MMM:SS, so compare the minutes part numerically
    ssh "machine${i}" ps xau | awk 'NR==1{print $2,$10,$11} NR>1{split($10,t,":"); if (t[1]+0>=5) print $2,$10,$11}'
    echo "-- --"
done
exit
That will match any process that has used at least 5 minutes of CPU time.
AWK does row-oriented editing (and a bunch of other stuff too).
A block of statements enclosed in braces {} is executed for each row of input. This behaviour can be restricted by prefixing the block with a condition (in ordinary C syntax).
NR==1 {} means the block is executed only for the first input row. In the example above, fields 2, 10 and 11 of the first input row are printed on a single row.
NR>1 {} means that block will be executed for each row after the first one.
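To see the two patterns at work without ssh, here is a small sketch that runs on any ps-style text. The function name filter_cpu_time is hypothetical, and splitting field 10 on ":" to get whole minutes is my assumption about the MMM:SS format of the TIME column.

```shell
#!/bin/sh
# Print PID, TIME and COMMAND for the header row and for every process
# whose accumulated CPU time (field 10, MMM:SS) is at least 5 minutes.
filter_cpu_time() {
    awk 'NR==1 { print $2, $10, $11 }
         NR>1  { split($10, t, ":"); if (t[1] + 0 >= 5) print $2, $10, $11 }'
}

# Usage: ps xau | filter_cpu_time
```

The `+ 0` forces a numeric comparison, so 12:30 counts as 12 minutes instead of being compared as a string.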
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 5 years ago.
I am trying to work out where my swap usage comes from, but smem reports a completely different amount of swap usage than free does.
free shows the following:
[root@server1 ~/smem-1.3]# free -k
             total       used       free     shared    buffers     cached
Mem:      24554040   24197360     356680          0     510200   14443128
-/+ buffers/cache:    9244032   15310008
Swap:     20980880    2473120   18507760
And smem shows:
  PID User     Command                         Swap      USS      PSS      RSS
...
18829 oracle   oracle_1 (LOCAL=NO)                0     3.9M    98.3M    10.1G
18813 oracle   oracle_1 (LOCAL=NO)                0     3.9M    98.6M    10.1G
18809 oracle   oracle_1 (LOCAL=NO)                0     4.1M    99.2M    10.0G
28657 oracle   ora_lms0_1                     56.0K    54.1M   100.3M     4.2G
29589 oracle   ora_lms1_1                    964.0K    69.7M   118.9M     4.5G
29886 oracle   ora_dbw1_1                      5.7M    20.8M   130.9M    10.2G
29857 oracle   ora_dbw0_1                      4.2M    22.6M   133.0M    10.3G
11075 ccm_user /usr/java/jre1.6/bin/java -   197.8M   133.9M   135.9M   140.7M
21688 bsuser   /usr/local/java/bin/java -c    30.7M   145.1M   147.2M   152.1M
29930 oracle   ora_lck0_1                      2.3M    58.6M   169.8M     1.0G
29901 oracle   ora_smon_1                         0    78.0M   195.6M     4.3G
15604 oracle   /var/oragrid/jdk/jre//bin/j    65.4M   253.9M   254.3M   262.2M
-------------------------------------------------------------------------------
  359 10                                     678.8M     2.5G    13.5G     1.2T
Why does free show "2.4G" while smem shows only 679M? One of them must be reporting a wrong result.
I need to find out where the remaining 1.7G or so is, or prove that free is showing wrong results.
Last but not least, the kernel is 2.6.18.
Well, the main issue is the difference between RSS (resident set size) and PSS (proportional set size). As http://www.selenic.com/smem/ says, "PSS instead measures each application's "fair share" of each shared area to give a realistic measure". RSS, on the other hand, overestimates by counting a shared memory area in full for every application that uses it. That is why you see the difference. In simple words, smem divides shared memory between the applications that share it rather than treating the shared area as every application's own.
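To make the difference concrete, here is a toy calculation. It is a simplification that assumes a single shared area mapped by a known number of processes; the function name rss_pss_kb and the numbers in the usage line are made up.

```shell
#!/bin/sh
# rss_pss_kb PRIVATE_KB SHARED_KB NPROCS
# RSS counts the shared area in full; PSS counts only this process's
# equal share of it (integer division, in kB).
rss_pss_kb() {
    private=$1 shared=$2 nprocs=$3
    echo "RSS: $((private + shared)) kB, PSS: $((private + shared / nprocs)) kB"
}

# Usage: rss_pss_kb 100 300 3
```

Summing RSS over all three processes in that example would count the 300 kB shared area three times, while the PSS values add up to the memory actually in use.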
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 8 years ago.
I have a read-only partition whose data is changing.
The change occurs on the first mount only. Subsequent mounts do not change the partition data.
Tried with ext3 and ext2 in case journalling was an issue ... no help.
Tried tune2fs with -c -1 -i 0 in order to disable updating timestamps or any other data that may be touched by a check being executed ... no help.
Normally I wouldn't care, but I need to hashsum this partition for data integrity purposes.
Linux can write to a read-only fs in some rare cases, e.g. when it detects a fs in an inconsistent state (after a cold reboot) and is able to do a quick, safe-for-data fix.
I saw such a fix when working with Ubuntu Rescue Remix: the write went to the second hard drive during boot, before anything on it was even mounted. It was reported in dmesg, so check dmesg too.
For example, ext3 can perform orphan cleanup on a read-only fs; it temporarily DISABLES the READONLY flag:
if (s_flags & MS_RDONLY) {
    ext3_msg(sb, KERN_INFO, "orphan cleanup on readonly fs");
    sb->s_flags &= ~MS_RDONLY;
}
... writes...
sb->s_flags = s_flags; /* Restore MS_RDONLY status */
This is done in ext3_mount -> mount_bdev -> (callback) ext3_fill_super -> ext3_orphan_cleanup
If the block device itself is not write-protected (Linux does check this first):
if (bdev_read_only(sb->s_bdev)) {
    ext3_msg(sb, KERN_ERR, "error: write access "
        "unavailable, skipping orphan cleanup.");
    return;
}
then Linux WILL COMMIT A WRITE ON A READONLY FS.
Update: here is a list
http://www.forensicswiki.org/wiki/Forensic_Linux_Live_CD_issues
Ext3: File system requires journal recovery. To disable recovery: use "noload" flag, or use "ro,loop" flags, or use "ext2" file system type.
Ext4: File system requires journal recovery. To disable recovery: use "noload" flag, or use "ro,loop" flags, or use "ext2" file system type.
ReiserFS: File system has unfinished transactions. "nolog" flag does not work (see man mount). To disable journal updates: use "ro,loop" flags.
XFS: Always (when unmounting). "norecovery" flag does not help (fixed in recent 2.6 kernels). To disable data writes: use "ro,loop" flags.
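To check whether a given mount option really leaves the data untouched, you could hash the device (or an image file) before and after the operation. A minimal sketch; the function name check_unchanged is made up, and the device and mount point in the usage lines are placeholders:

```shell
#!/bin/sh
# check_unchanged TARGET COMMAND [ARGS...]
# Hashes TARGET, runs the command, hashes TARGET again and prints
# "unchanged" or "changed".
check_unchanged() {
    target=$1
    shift
    before=$(sha256sum < "$target")
    "$@" > /dev/null
    after=$(sha256sum < "$target")
    if [ "$before" = "$after" ]; then echo unchanged; else echo changed; fi
}

# Hypothetical usage (needs root; /dev/sdX1 and /mnt are placeholders):
# blockdev --setro /dev/sdX1
# check_unchanged /dev/sdX1 mount -o ro,noload /dev/sdX1 /mnt
```

Marking the block device itself read-only with blockdev --setro targets the bdev_read_only() check quoted above, so the orphan cleanup should be skipped rather than written.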