SMP affinity setting in Linux

I want to load-balance an interrupt (IRQ 75) on my virtual machine. It runs 64-bit Red Hat 5.8 with kernel 2.6.18 and has 8 CPUs.
When I run:
cat /proc/interrupts
75: 9189 0 0 0 0 0 0 0 IO-APIC-level eth0
I saw that IRQ 75 was serviced by CPU0 only. Then I changed the smp_affinity for IRQ 75:
echo ff > /proc/irq/75/smp_affinity
cat /proc/irq/75/smp_affinity
00000000,00000000,00000000,00000000,00000000,00000000,00000000,000000ff
But I saw again that the interrupts for IRQ 75 were hitting CPU0 only:
75: 157228 0 0 0 0 0 0 0 IO-APIC-level eth0
There is no IRQ balancing between the CPUs. I want to distribute the IRQ 75 interrupts across all CPUs. Am I doing something wrong?

The value is a bitmask in hexadecimal representation, usually 64 bits wide.
First, stop irqbalance.
Now try the following (bit pattern 10 in binary = 0x2 in hex, i.e. CPU1):
echo 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000002 > /proc/irq/75/smp_affinity
This should work if you have a processor with at least 2 cores.
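As a minimal sketch of how such a mask can be built for a single CPU (assuming IRQ 75 and the 8-CPU guest from the question; run as root, and note that the irqbalance init-script name may differ on your distribution):
service irqbalance stop
cpu=1                                          # target CPU number
printf '%x\n' $((1 << cpu))                    # prints 2, i.e. the mask for CPU1
printf '%x' $((1 << cpu)) > /proc/irq/75/smp_affinity
grep '^ *75:' /proc/interrupts                 # check where new interrupts land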

If you are using VMware, change the virtual Ethernet adapter to VMXNET3; you will then get per-queue interrupts like the following:
cat /proc/interrupts | grep eth3
57: 0 0 0 0 5 101198492 0 0 PCI-MSI-edge eth3-rxtx-0
58: 0 0 0 0 0 2 82962355 0 PCI-MSI-edge eth3-rxtx-1
59: 0 0 0 0 0 0 1 112986304 PCI-MSI-edge eth3-rxtx-2
60: 120252394 0 0 0 0 0 0 1 PCI-MSI-edge eth3-rxtx-3
61: 1 118585532 0 0 0 0 0 0 PCI-MSI-edge eth3-rxtx-4
62: 0 1 151440277 0 0 0 0 0 PCI-MSI-edge eth3-rxtx-5
63: 0 0 1 94639274 0 0 0 0 PCI-MSI-edge eth3-rxtx-6
64: 0 0 0 1 63577471 0 0 0 PCI-MSI-edge eth3-rxtx-7
65: 0 0 0 0 0 0 0 0 PCI-MSI-edge eth3-event-8
You will have different "rxtx" queues, each assigned to a CPU.
In my case the load became balanced among all CPUs.
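A hedged sketch of pinning each queue to its own CPU, assuming the eth3-rxtx-N names and IRQ numbers shown above (irqbalance will usually spread these for you as well):
cpu=0
for irq in $(grep 'eth3-rxtx-' /proc/interrupts | cut -d: -f1 | tr -d ' '); do
    printf '%x' $((1 << cpu)) > /proc/irq/$irq/smp_affinity   # bit N set = CPU N allowed
    cpu=$((cpu + 1))
done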

Related

How to reliably detect dropped netlink requests or responses

I'm interested in using netlink for a straightforward application (reading cgroup stats at high frequency).
The man page cautions that the protocol is not reliable, hinting that the application needs to be prepared to handle dropped packets:
However, reliable transmissions from kernel to user are impossible in
any case. The kernel can't send a netlink message if the socket buffer
is full: the message will be dropped and the kernel and the user-space
process will no longer have the same view of kernel state. It
is up to the application to detect when this happens (via the ENOBUFS
error returned by recvmsg(2)) and resynchronize.
Since my requirements are simple, I'm fine with just destroying the socket and creating a new one whenever anything unexpected happens. But I can't find any documentation on what is expected of my program; for example, the man page for recvmsg(2) doesn't even mention ENOBUFS.
What all do I need to worry about in order to make sure I can tell that a request from my application or a response from the kernel has been dropped, so that I can reset everything and start over? It's clear to me that I could do so whenever I receive an error from any of the syscalls involved, but for example what happens if my request is dropped on the way to the kernel? Will I just never receive a response? Do I need to build a timeout mechanism where I wait only so long for a response?
I found the following in Communicating between the kernel and user-space in Linux using Netlink sockets by Ayuso, Gasca, and Lefevre:
If Netlink fails to deliver a message that goes from kernel to user-space, the recvmsg() function returns the No buffer space available (ENOBUFS) error. Thus, the user-space process knows that it is losing messages [...]
On the other hand, buffer overruns cannot occur in communications from user to kernel-space since sendmsg() synchronously passes the Netlink message to the kernel subsystem. If blocking sockets are used, Netlink is completely reliable in communications from user to kernel-space since memory allocations would wait, so no memory exhaustion is possible.
Regarding acks, it looks like worrying about them is optional:
NLM_F_ACK: the user-space application requested a confirmation message from
kernel-space to make sure that a given request was successfully performed. If this
flag is not set, the kernel-space reports the error synchronously via sendmsg() as errno value.
So it sounds like for my simplistic use case I can just use sendmsg and recvmsg naively, reacting to any error (except EINTR) by starting the whole thing over, perhaps with backoff. My guess is that since I only get one response per request and the responses are tiny, I should never even see ENOBUFS as long as I have only one request in flight at a time.
As a side note, dropped netlink packets show up in the Drops column of /proc/net/netlink (see the filter sketch after the table). For example:
# cat /proc/net/netlink
sk Eth Pid Groups Rmem Wmem Dump Locks Drops Inode
76f0e0ed 0 1966 00080551 0 0 0 2 0 36968
36a83ab1 0 1431 00000001 0 0 0 2 0 30297
d7d5db8e 0 563 00000440 0 0 0 2 0 19572
a10eb5c0 0 795 00000515 704 0 0 2 0 23584
c52bbce9 0 474 00000001 0 0 0 2 0 17511
3c5a89a5 0 989856248 00000001 0 0 0 2 0 31686
051108c1 0 0 00000000 0 0 0 2 0 25
ed401538 0 562 00000440 0 0 0 2 0 19576
38699987 0 469 00000557 0 0 0 2 0 19806
d7bbb203 0 728 00000113 0 0 0 2 0 22988
4d31126f 2 795 40000000 0 0 0 2 0 31092
febb9674 2 2100 00000001 0 0 0 2 0 37904
8c18eb5b 2 728 40000000 0 0 0 2 0 22989
922a7fcf 4 0 00000000 0 0 0 2 0 8681
16cfa740 7 0 00000000 0 0 0 2 0 7680
4e55a095 9 395 00000000 0 0 0 2 0 15142
0b2c5994 9 1 00000000 0 0 0 2 0 10840
94fe571b 9 0 00000000 0 0 0 2 0 7673
a7a1d82c 9 396 00000000 0 0 0 2 0 14484
b6a3f183 10 0 00000000 0 0 0 2 0 7517
1c2dc7e3 11 0 00000000 0 0 0 2 0 640
6dafb596 12 469 00000007 0 0 0 2 0 19810
2fe2c14c 12 3676872482 00000007 0 0 0 2 0 19811
3f245567 12 0 00000000 0 0 0 2 0 8682
0da2ddc4 12 3578344684 00000000 0 0 0 2 0 43683
f9720247 15 489 00000001 0 0 0 2 11 17781
e84b6d30 15 519 00000002 0 0 0 2 0 19071
c7d75154 15 1550 ffffffff 0 0 0 2 0 31970
02c1c3db 15 4070855316 00000001 0 0 0 2 0 10852
e0d7b09a 15 1 00000002 0 0 0 2 0 10821
78649432 15 0 00000000 0 0 0 2 0 30
8182eaf3 15 504 00000002 0 0 0 2 0 22047
40263df1 15 1858 ffffffff 0 0 0 2 0 34001
49283e31 16 0 00000000 0 0 0 2 0 696
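A quick way to spot sockets that are losing messages (a sketch, assuming the column layout shown above, where Drops is the 9th field):
awk 'NR > 1 && $9 > 0 {print "socket", $1, "portid", $3, "drops", $9}' /proc/net/netlink
In the sample above only one socket, portid 489 on protocol 15, has a nonzero Drops count (11).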

100% Kernel CPU kills connections to the server [closed]

My server runs CentOS 6.9 (64 GB RAM) with nginx. The problem is that every 10 minutes there are random 100% kernel CPU spikes in htop, generated by "events/10" and "ksoftirqd/10". I don't know how to find out which exact process is causing this.
This is my /proc/interrupts
$ cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7 CPU8 CPU9 CPU10 CPU11 CPU12 CPU13 CPU14 CPU15
0: 77897 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge timer
1: 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge i8042
8: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge rtc0
9: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-fasteoi acpi
12: 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IO-APIC-edge i8042
56: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge aerdrv
63: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
64: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
65: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge xhci_hcd
66: 1426061273 0 0 0 0 0 0 0 0 0 0 1914508 0 0 0 0 PCI-MSI-edge ahci
67: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 PCI-MSI-edge ahci
68: 3084636512 0 0 0 0 0 0 0 0 0 10149560 0 0 0 0 0 PCI-MSI-edge eth0
NMI: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Non-maskable interrupts
LOC: 1972128636 528409367 3519065090 2616991376 2762882221 3577269786 2407615998 2889069038 1939478243 2270996522 1940319131 2244314760 2033706214 2339089941 2303043400 2629954396 Local timer interrupts
SPU: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Performance monitoring interrupts
IWI: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 IRQ work interrupts
RES: 1349612979 1979915818 1044674069 463586597 1410841781 641984863 3396971132 3062175502 2189512469 2034852778 1686264346 1571882114 1410891335 1330892006 1273321645 1195645068 Rescheduling interrupts
CAL: 1771384 1771300 1775694 1780259 1778017 1782331 1761855 1755630 1758801 1759472 1770034 1773352 1775468 1779579 1778401 1778652 Function call interrupts
TLB: 1295395722 623515438 528231713 457109575 438669843 412327240 413878597 392015004 399091958 373918339 391267007 362582716 383312220 348908971 376811042 337426419 TLB shootdowns
TRM: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Machine check exceptions
MCP: 77730 77730 77730 77730 77730 77730 77730 77730 77730 77730 77730 77730 77730 77730 77730 77730 Machine check polls
ERR: 0
MIS: 0
This is my /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD Ryzen 7 PRO 1700X Eight-Core Processor
stepping : 1
cpu MHz : 2200.000
cache size : 512 KB
physical id : 0
siblings : 16
core id : 0
cpu cores : 8
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nonstop_tsc extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb perfctr_l2 arat xsaveopt npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold fsgsbase bmi1 avx2 smep bmi2 rdseed adx
bogomips : 6786.47
TLB size : 2560 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate eff_freq_ro [13] [14]
I hope you can help me; these spikes make the server really unstable (even with nginx and most other software turned off).
I also tried installing irqbalance, but it just moved the CPU that was going to 100% from the first to the eleventh.
I also had the host move my drives to another machine of the same architecture, but that didn't help either.
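One way to narrow this down (a diagnostic sketch, assuming the spike really is softirq work on CPU10, as the ksoftirqd/10 name suggests) is to watch which softirq class and which interrupt counters grow while a spike is happening:
watch -n1 -d cat /proc/softirqs
watch -n1 -d 'grep -E "eth0|ahci" /proc/interrupts'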

Is it possible to change which core timer interrupts happen on?

On my Debian 8 system, when I run the command watch -n0.1 --no-title cat /proc/interrupts, I get the output below.
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
0: 46 0 0 10215 0 0 0 0 IO-APIC-edge timer
1: 1 0 0 2 0 0 0 0 IO-APIC-edge i8042
8: 0 0 0 1 0 0 0 0 IO-APIC-edge rtc0
9: 0 0 0 0 0 0 0 0 IO-APIC-fasteoi acpi
12: 0 0 0 4 0 0 0 0 IO-APIC-edge i8042
18: 0 0 0 0 8 0 0 0 IO-APIC-fasteoi i801_smbus
19: 7337 0 0 0 0 0 0 0 IO-APIC-fasteoi ata_piix, ata_piix
21: 0 66 0 0 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb1
23: 0 0 35 0 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb2
40: 208677 0 0 0 0 0 0 0 HPET_MSI-edge hpet2
41: 0 4501 0 0 0 0 0 0 HPET_MSI-edge hpet3
42: 0 0 2883 0 0 0 0 0 HPET_MSI-edge hpet4
43: 0 0 0 1224 0 0 0 0 HPET_MSI-edge hpet5
44: 0 0 0 0 1029 0 0 0 HPET_MSI-edge hpet6
45: 0 0 0 0 0 0 0 0 PCI-MSI-edge aerdrv, PCIe PME
46: 0 0 0 0 0 0 0 0 PCI-MSI-edge PCIe PME
47: 0 0 0 0 0 0 0 0 PCI-MSI-edge PCIe PME
48: 0 0 0 0 0 0 0 0 PCI-MSI-edge PCIe PME
49: 0 0 0 0 0 8570 0 0 PCI-MSI-edge eth0-rx-0
50: 0 0 0 0 0 0 1684 0 PCI-MSI-edge eth0-tx-0
51: 0 0 0 0 0 0 0 2 PCI-MSI-edge eth0
NMI: 8 2 2 2 1 2 1 49 Non-maskable interrupts
LOC: 36 31 29 26 21 7611 886 1390 Local timer interrupts
SPU: 0 0 0 0 0 0 0 0 Spurious interrupts
PMI: 8 2 2 2 1 2 1 49 Performance monitoring interrupts
IWI: 0 0 0 1 1 0 1 0 IRQ work interrupts
RTR: 7 0 0 0 0 0 0 0 APIC ICR read retries
RES: 473 1027 1530 739 1532 3567 1529 1811 Rescheduling interrupts
CAL: 846 1012 1122 1047 984 1008 1064 1145 Function call interrupts
TLB: 2 7 5 3 12 15 10 6 TLB shootdowns
TRM: 0 0 0 0 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 0 0 0 0 Machine check exceptions
MCP: 4 4 4 4 4 4 4 4 Machine check polls
HYP: 0 0 0 0 0 0 0 0 Hypervisor callback interrupts
ERR: 0
MIS: 0
Observe that the timer interrupt is firing mostly on CPU3.
Is it possible to move the timer interrupt to CPU0?
The name of the concept is IRQ SMP affinity.
It's possible to set the smp_affinity of an IRQ by setting the affinity mask in /proc/irq/<IRQ_NUMBER>/smp_affinity or the affinity list in /proc/irq/<IRQ_NUMBER>/smp_affinity_list.
The affinity mask is a bit field where each bit represents a core; the IRQ is allowed to be serviced on the cores whose bits are set.
The command
echo 1 > /proc/irq/0/smp_affinity
executed as root should pin the IRQ0 to CPU0.
The word "should" is deliberate: setting the affinity of an IRQ is subject to a set of prerequisites. The list includes an interrupt controller that supports a redirection table (like the IO-APIC), an affinity mask that contains at least one online CPU, an IRQ whose affinity is not managed by the kernel, and the feature being enabled.
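For completeness, a minimal sketch using the list interface mentioned above (assuming IRQ 0, the timer IRQ from the question; run as root):
echo 0 > /proc/irq/0/smp_affinity_list    # allow IRQ 0 on CPU0 only
cat /proc/irq/0/smp_affinity              # the mask should now have only bit 0 set
grep '^ *0:' /proc/interrupts             # watch where new timer interrupts land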
In my virtualised Debian 8 system I was unable to set the affinity of the IRQ0, failing with an EIO error.
I was also unable to track down the exact reason.
If you are willing to dive into the Linux source code, you can start from write_irq_affinity() in kernel/irq/proc.c.
Use isolcpus. It may not reduce your timer interrupts to 0, but on our servers they are greatly reduced.
If you use isolcpus, the kernel will not affine interrupts to the isolated CPUs as it otherwise might. For example, we have systems with dual 12-core CPUs. We noticed NVMe interrupts on our CPU1 (the second socket), even with the CPUs isolated via tuned and its cpu-partitioning profile. The NVMe drives on our Dell systems are connected to the PCIe lanes of CPU1, hence the interrupts on those cores.
As per my ticket with Red Hat (and Margaret Bloom, who wrote an excellent answer here), if you don't want interrupts affined to your CPUs, you need to use isolcpus on the kernel boot line. And lo and behold, I tried it and our interrupts went to 0 for the NVMe drives on all isolated CPU cores.
I have not attempted to isolate ALL cores on CPU1; I don't know if they'll simply be affined to CPU0 or what.
And, in short: any interrupt in /proc/interrupts with "MSI" in the name is managed by the kernel.
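A minimal sketch of what that looks like on a GRUB 2 system, assuming you want to isolate CPUs 1-11 (the core list and the update command depend on your distribution):
# in /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet isolcpus=1-11"
# then regenerate the config and reboot
update-grub        # Debian/Ubuntu; grub2-mkconfig -o /boot/grub2/grub.cfg on RHEL-like systems
reboot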

cstime error in /proc/pid/stat file

The stime or cstime in a /proc/<pid>/stat file is sometimes so huge that it doesn't make any sense, but only some processes show this wrong value, and only occasionally. For example:
# ps -eo pid,ppid,stime,etime,time,%cpu,%mem,command |grep nsc
4815 1 Jan08 1-01:20:02 213503-23:34:33 20226149 0.1 /usr/sbin/nscd
#
# cat /proc/4815/stat
4815 (nscd) S 1 4815 4815 0 -1 4202560 2904 0 0 0 21 1844674407359 0 0 20 0 9 0 4021 241668096 326 18446744073709551615 139782748139520 139782748261700 140737353849984 140737353844496 139782734487251 0 0 3674112 16390 18446744073709551615 0 0 17 1 0 0 0 0 0
You can see that the stime of process 4815 (nscd) is 1844674407359, equivalent to 213503-23:34:33, although the process has only been running for 1-01:20:02.
Another process with a wrong cstime is the following:
A bash forks an sh, which forks a sleep:
8155 (bash) S 3124 8155 8155 0 -1 4202752 1277 6738 0 0 3 0 4 1844674407368 20 0 1 0 1738175 13258752 451 18446744073709551615 4194304 4757932 140736528897536 140736528896544 47722675403157 0 65536 4100 65538 18446744071562341410 0 0 17 5 0 0 0 0 0
8184 (sh) S 8155 8155 8155 0 -1 4202496 475 0 0 0 0 0 0 0 20 0 1 0 1738185 11698176 357 18446744073709551615 4194304 4757932 140733266239472 140733266238480 47964680542613 0 65536 4100 65538 18446744071562341410 0 0 17 6 0 0 0 0 0
8185 (sleep) S 8184 8155 8155 0 -1 4202496 261 0 0 0 0 0 0 0 20 0 1 0 1738186 8577024 177 18446744073709551615 4194304 4212204 140734101195248 140734101194776 48002231427168 0 0 4096 0 0 0 0 17 7 0 0 0 0 0
So you can see that the cstime of the bash process is 1844674407368, which is much larger than the total CPU time of its children.
My server has one Intel Xeon E5620 CPU @ 2.40GHz (4 cores, 8 threads). The operating system is SUSE Linux Enterprise Server 11 SP1 x86_64:
# lsb_release -a
LSB Version: core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-x86_64:core-3.2-x86_64:core-4.0-x86_64:desktop-4.0-amd64:desktop-4.0-noarch:graphics-2.0-amd64:graphics-2.0-noarch:graphics-3.2-amd64:graphics-3.2-noarch:graphics-4.0-amd64:graphics-4.0-noarch
Distributor ID: SUSE LINUX
Description: SUSE Linux Enterprise Server 11 (x86_64)
Release: 11
Codename: n/a
#
# uname -a
Linux node2 2.6.32.12-0.7-xen #1 SMP 2010-05-20 11:14:20 +0200 x86_64 x86_64 x86_64 GNU/Linux
So is this a kernel problem? Can anybody help fix it?
I suspect you are simply seeing a kernel bug. Update to the latest update kernel offered for SLES (something like 2.6.32.42 or so) and see if it still occurs. By the way, it's stime, not cstime, that is unusually high; in fact, looking closely you will notice the value looks like a string truncation of 18446744073709551615 (2^64-1), give or take a few clock ticks.
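For reference, the raw counters can be pulled straight out of the stat line (a sketch, assuming the command name contains no spaces so the fields split cleanly; utime, stime, cutime and cstime are fields 14-17):
awk '{printf "utime=%s stime=%s cutime=%s cstime=%s\n", $14, $15, $16, $17}' /proc/4815/stat
Decoded in full, the stat line from the question reads: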
pid_nr: 4815
tcomm: (nscd)
state: S
ppid: 1
pgid: 4815
sid: 4815
tty_nr: 0
tty_pgrp: -1
task_flags: 4202560 / 0x402040
min_flt: 2904
cmin_flt: 0
max_flt: 0
cmax_flt: 0
utime: 21 clocks (= 21 clocks) (= 0.210000 s)
stime: 1844674407359 clocks (= 1844674407359 clocks) (= 18446744073.590000 s)
cutime: 0 clocks (= 0 clocks) (= 0.000000 s)
cstime: 0 clocks (= 0 clocks) (= 0.000000 s)
priority: 20
nice: 0
num_threads: 9
always-zero: 0
start_time: 4021
vsize: 241668096
get_mm_rss: 326
rsslim: 18446744073709551615 / 0xffffffffffffffff
mm_start_code: 139782748139520 / 0x7f21b50c7000
mm_end_code: 139782748261700 / 0x7f21b50e4d44
mm_start_stack: 140737353849984 / 0x7ffff7fb9c80
esp: 140737353844496 / 0x7ffff7fb8710
eip: 139782734487251 / 0x7f21b43c1ed3
obsolete-pending-signals: 0 / 0x0
obsolete-blocked-signals: 0 / 0x0
obsolete-sigign: 3674112 / 0x381000
obsolete-sigcatch: 16390 / 0x4006
wchan: 18446744073709551615 / 0xffffffffffffffff
always-zero: 0
always-zero: 0
task_exit_signal: 17
task_cpu: 1
task_rt_priority: 0
task_policy: 0
delayacct_blkio_ticks: 0
gtime: 0 clocks (= 0 clocks) (= 0.000000 s)
cgtime: 0 clocks (= 0 clocks) (= 0.000000 s)

How can I determine the current CPU utilization from the shell? [closed]

How can I determine the current CPU utilization from the shell in Linux?
For example, I get the load average like so:
cat /proc/loadavg
Outputs:
0.18 0.48 0.46 4/234 30719
Linux does not have any system variables that give the current CPU utilization. Instead, you have to read /proc/stat several times: each column in the cpu(n) lines gives the total CPU time, and you have to take subsequent readings of it to get percentages. See this document to find out what the various columns mean.
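A minimal sketch of that approach in the shell, assuming the usual column order on the aggregate cpu line (user nice system idle iowait irq softirq steal):
read -r cpu u1 n1 s1 i1 w1 q1 sq1 st1 rest < /proc/stat
sleep 1
read -r cpu u2 n2 s2 i2 w2 q2 sq2 st2 rest < /proc/stat
idle=$(( (i2 + w2) - (i1 + w1) ))
total=$(( (u2+n2+s2+i2+w2+q2+sq2+st2) - (u1+n1+s1+i1+w1+q1+sq1+st1) ))
echo "CPU utilization over 1s: $(( 100 * (total - idle) / total ))%"
The same idea applies to the per-CPU cpu0, cpu1, ... lines if you need per-core figures.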
You can use top or ps commands to check the CPU usage.
Using top: this will show you the CPU stats.
top -b -n 1 |grep ^Cpu
Using ps: this will show you the %CPU usage of each process.
ps -eo pcpu,pid,user,args | sort -r -k1 | less
Also, you can write a small script in bash or perl to read /proc/stat and calculate the CPU usage.
The command uptime gives you load averages for the past 1, 5, and 15 minutes.
Try this command:
$ top
http://www.cyberciti.biz/tips/how-do-i-find-out-linux-cpu-utilization.html
Try this command:
cat /proc/stat
The output will look something like this:
cpu 55366 271 17283 75381807 22953 13468 94542 0
cpu0 3374 0 2187 9462432 1393 2 665 0
cpu1 2074 12 1314 9459589 841 2 43 0
cpu2 1664 0 1109 9447191 666 1 571 0
cpu3 864 0 716 9429250 387 2 118 0
cpu4 27667 110 5553 9358851 13900 2598 21784 0
cpu5 16625 146 2861 9388654 4556 4026 24979 0
cpu6 1790 0 1836 9436782 480 3307 19623 0
cpu7 1306 0 1702 9399053 726 3529 26756 0
intr 4421041070 559 10 0 4 5 0 0 0 26 0 0 0 111 0 129692 0 0 0 0 0 95 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 369 91027 1580921706 1277926101 570026630 991666971 0 277768 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 8097121
btime 1251365089
processes 63692
procs_running 2
procs_blocked 0
More details:
http://www.mail-archive.com/linuxkernelnewbies@googlegroups.com/msg01690.html
http://www.linuxhowtos.org/System/procstat.htm
Maybe something like this
ps -eo pid,pcpu,comm
And if you would like to parse the output and maybe only look at some processes:
#!/bin/sh
ps -eo pid,pcpu,comm | awk '{if ($2 > 4) print }' >> ~/ps_eo_test.txt
You need to take readings a few seconds apart and calculate the CPU utilization from the difference. If unsure what to use, get the sources of "top" and read them.
