Extract Part of String using cut/awk or anything - linux

[DOSLB] : [dmob7h-002.on.bell.ca] : [61421820100992016102414274381420414330] : [[ACTIVE] ExecuteThread: '15' for queue: 'weblogic.kernel.Default (self-tuning)'] : [ca.bell.tv.doslb.infrastructure.logging.LoggingAspect] : [Ending execution of the class: ca.bell.tv.doslb.application.webservice.impl.RetrieveLocalSubscriberDelegateImpl] : [Method: getRetrieveLocalSubscriber[Call ended at: 2016-10-24 14:27:44.150] : [lasted 0 sec, 305 ms] : [WITHINLIMITS]]
I want to extract getRetrieveLocalSubscriber from the string above. I cannot rely on its position or on the name itself: it is a service name, so it changes from one log line to the next, and its position may change too. The format stays the same, though: the [Method: getRetrieveLocalSubscriber[Call ended at: 2016-10-24 14:27:44.150] portion always has this shape.
I also want to extract the lasted 0 sec part, but the number of seconds changes every time.
I want the output as getRetrieveLocalSubscriber in one variable and
lasted 0 sec in another variable.
I have tried this awk command:
cat out_log.txt | awk -F '[:]' '{print $11}'
which outputs getRetrieveLocalSubscriber[Call ended at

Try something like this:
echo "[DOSLB] : [dmob7h-002.on.bell.ca] : [61421820100992016102414274381420414330] : [[ACTIVE] ExecuteThread: '15' for queue: 'weblogic.kernel.Default (self-tuning)'] : [ca.bell.tv.doslb.infrastructure.logging.LoggingAspect] : [Ending execution of the class: ca.bell.tv.doslb.application.webservice.impl.RetrieveLocalSubscriberDelegateImpl] : [Method: getRetrieveLocalSubscriber[Call ended at: 2016-10-24 14:27:44.150] : [lasted 0 sec, 305 ms] : [WITHINLIMITS]]" | perl -ne '/Method: getRetrieveLocalSubscriber\[Call ended at: (\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d+?)\] : \[lasted (\d+?) sec, (\d+?) ms\]/ && print "Call end: $1-$2-$3 $4:$5:$6.$7, lasted for $8s $9ms";'
Call end: 2016-10-24 14:27:44.150, lasted for 0s 305ms
Or, if such strings are in a file:
cat test.log | perl -ne '/Method: getRetrieveLocalSubscriber\[Call ended at: (\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d+?)\] : \[lasted (\d+?) sec, (\d+?) ms\]/ && print "Call end: $1-$2-$3 $4:$5:$6.$7, lasted for $8s $9ms";'
This regular expression accepts any "call end" time. To match only strings that contain the substring
Call ended at: 2016-10-24 14:27:44.150
replace the
(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d+?)
part of the regexp with 2016-10-24 14:27:44.150, and replace $8 and $9 with $1 and $2 respectively.

You can try something like this:
grep -o -P '(?=\[Method: getRetrieveLocalSubscriber).*(?<=ms])' yourFile
or
grep -o -P '(?=\[Call).*(?<=ms])' yourFile
For example:
user@host$ grep -o -P '(?=\[Method: getRetrieveLocalSubscriber).*(?<=ms])' test
[Method: getRetrieveLocalSubscriber[Call ended at: 2016-10-24 14:27:44.150] : [lasted 0 sec, 305 ms]
user@host$ grep -o -P '(?=\[Call).*(?<=ms])' test
[Call ended at: 2016-10-24 14:27:44.150] : [lasted 0 sec, 305 ms]
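The question asks for the method name and the duration in two separate variables, without hard-coding the method name. A minimal sketch of one way to do that with sed capture groups follows. The shortened sample line is an assumption made for illustration, and the patterns rely only on the fixed text [Method: , [Call ended at and lasted ... sec that the question says never changes.

```shell
# Shortened, hypothetical sample of the log line from the question.
line="[DOSLB] : [Method: getRetrieveLocalSubscriber[Call ended at: 2016-10-24 14:27:44.150] : [lasted 0 sec, 305 ms] : [WITHINLIMITS]]"

# Method name: everything between "[Method: " and the following "[Call ended at".
method=$(printf '%s\n' "$line" | sed -n 's/.*\[Method: \([^[]*\)\[Call ended at.*/\1/p')

# Duration: the word "lasted", a run of digits and the word "sec".
lasted=$(printf '%s\n' "$line" | sed -n 's/.*\[\(lasted [0-9][0-9]* sec\),.*/\1/p')

echo "$method"
echo "$lasted"
```

To process a file, the same two sed commands could be run inside a while read -r line loop over the file.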

Related

linux bash cut one row which starts with a certain string

Good day,
I'm using Linux bash commands to extract certain data for each SIP account and print the pieces next to each other.
I have a variable called $peers that holds all 1000 SIP peers, and now I need to loop through them with a for loop and pair every peer with its user agent.
What I have so far is:
#! /bin/bash
peers="$(asterisk -rx "sip show peers" | cut -f1 -d" " | cut -f1 -d"/" "=")" "= " asterisk -rx "sip show peer " $peer | cut -f2 -d"Useragent"
for peer in $peers do
echo $peers
done
#echo $peers
I need to extract the row that starts with "Useragent" from a collection of rows.
I start by running asterisk -rx "sip show peer 101", which gives me the result below:
* Name : 101
Description :
Secret : <Set>
MD5Secret : <Not set>
Remote Secret: <Not set>
Context : outgoing
Record On feature : automon
Record Off feature : automon
Subscr.Cont. : <Not set>
Language :
Tonezone : <Not set>
AMA flags : Unknown
Transfer mode: open
CallingPres : Presentation Allowed, Not Screened
Callgroup :
Pickupgroup :
Named Callgr :
Nam. Pickupgr:
MOH Suggest :
Mailbox :
VM Extension : asterisk
LastMsgsSent : 0/0
Call limit : 0
Max forwards : 0
Dynamic : Yes
Callerid : "" <>
MaxCallBR : 384 kbps
Expire : 23
Insecure : no
Force rport : Yes
Symmetric RTP: Yes
ACL : No
DirectMedACL : No
T.38 support : No
T.38 EC mode : Unknown
T.38 MaxDtgrm: -1
DirectMedia : Yes
PromiscRedir : No
User=Phone : No
Video Support: No
Text Support : No
Ign SDP ver : No
Trust RPID : No
Send RPID : No
Subscriptions: Yes
Overlap dial : Yes
DTMFmode : rfc2833
Timer T1 : 500
Timer B : 32000
ToHost :
Addr->IP : xxx.xxx.xxx.xxx:5060
Defaddr->IP : (null)
Prim.Transp. : UDP
Allowed.Trsp : UDP
Def. Username: 101
SIP Options : (none)
Codecs : (gsm|ulaw|alaw|g729|g722)
Codec Order : (gsm:20,g722:20,g729:20,ulaw:20,alaw:20)
Auto-Framing : No
Status : OK (9 ms)
Useragent : UniFi VoIP Phone 4.6.6.489
Reg. Contact : sip:101@xxx.xxx.xxx.xxx:5060;ob
Qualify Freq : 60000 ms
Keepalive : 0 ms
Sess-Timers : Accept
Sess-Refresh : uas
Sess-Expires : 1800 secs
Min-Sess : 90 secs
RTP Engine : asterisk
Parkinglot :
Use Reason : No
Encryption : No
Now I need to cut out this part: Useragent : UniFi VoIP Phone 4.6.6.489
and display it as 101 : UniFi VoIP Phone 4.6.6.489
Any help would be much appreciated.
Thank you. That top answer worked perfectly. This is my solution now:
peer="$(asterisk -rx "sip show peers" | cut -f1 -d" " | cut -f1 -d"/" )"
for peer in $peers do
output= "$(asterisk -rx "sip show peer $peers" | sed -nE '/Useragent/ s/^[^:]+/101 /p')"
echo $output
done
But it is still giving me issues; my problem is looping over the variables.
With sed:
... | sed -nE '/Useragent/ s/^[^:]+/101 /p'
/Useragent/ matches the line(s) containing Useragent
s/^[^:]+/101 / substitutes the portion from the start of the line up to (but not including) the first : with 101 followed by a space
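To apply this to every peer, the peer name has to go into the sed expression in place of the literal 101. A sketch, assuming the first column of sip show peers yields the peer names as in the question. useragent_for is a hypothetical helper name, and peer names containing / or & would need escaping before being placed in the sed replacement.

```shell
# Read the output of "sip show peer <peer>" on stdin and print
# "<peer> : <useragent>" for the Useragent line.
useragent_for() {
    # Double quotes let the shell expand $1 inside the sed script.
    sed -nE "/Useragent/ s/^[^:]+/$1 /p"
}

# Only runs where the asterisk CLI is installed.
if command -v asterisk >/dev/null 2>&1; then
    peers=$(asterisk -rx "sip show peers" | cut -f1 -d" " | cut -f1 -d"/")
    # $peers is left unquoted on purpose so that it splits into one word per peer.
    for peer in $peers; do
        asterisk -rx "sip show peer $peer" | useragent_for "$peer"
    done
fi
```

Header and summary lines of sip show peers also end up in $peers with this cut pipeline; they produce no Useragent line, so the loop prints nothing for them.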

how to extract numbers in the same location from many log files

I have a file test1.log:
04/15/2016 02:22:46 PM - kneaddata.knead_data - INFO: Running kneaddata v0.5.1
04/15/2016 02:22:46 PM - kneaddata.utilities - INFO: Decompressing gzipped file ...
Input Reads: 69766650 Surviving: 55798391 (79.98%) Dropped: 13968259 (20.02%)
TrimmomaticSE: Completed successfully
04/15/2016 02:32:04 PM - kneaddata.utilities - DEBUG: Checking output file from Trimmomatic : /home/liaoming/kneaddata_v0.5.1/WGC066610D/WGC066610D_kneaddata.trimmed.fastq
04/15/2016 05:32:31 PM - kneaddata.utilities - DEBUG: 55798391 reads; of these:
55798391 (100.00%) were unpaired; of these:
55775635 (99.96%) aligned 0 times
17313 (0.03%) aligned exactly 1 time
5443 (0.01%) aligned >1 times
0.04% overall alignment rate
The other files, test2.log, test3.log, up to test60.log, have the same format but different contents.
I would like to extract two numbers from each of these files. For test1.log, for example, the two numbers would be 55798391 and 55775635.
So the final generated file counts.txt would look something like this:
test1 55798391 55775635
test2 51000000 40000000
.....
test60 5000000 30000000
awk to the rescue!
$ awk 'FNR==9{f=$1} FNR==10{print FILENAME,f,$1}' test{1..60}.log
If the files are not in the same directory, either call awk within a loop or create the file list and pipe it to xargs awk:
$ for i in {1..60}; do awk ... test$i/test$i.log; done
$ for i in {1..60}; do echo test$i/test$i.log; done | xargs awk ...
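The line numbers 9 and 10 depend on the exact layout of each log. A sketch of an alternative that keys on the text of the two lines instead, assuming every log contains one "were unpaired" line followed by one "aligned 0 times" line, as in test1.log. extract_counts is a hypothetical name.

```shell
# Print "<name> <unpaired reads> <reads aligned 0 times>" for each log file given.
extract_counts() {
    awk '
        /were unpaired/   { total = $1 }
        /aligned 0 times/ {
            name = FILENAME
            sub(/.*\//, "", name)     # drop any directory part
            sub(/\.log$/, "", name)   # drop the .log suffix
            print name, total, $1
        }
    ' "$@"
}

# Possible usage: extract_counts test*.log > counts.txt
```

Because the file name is trimmed inside awk, the first column comes out as test1 rather than test1.log, matching the layout of counts.txt in the question.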

Line manipulation & sorting

I am alright at writing Linux scripts but could use some advice. I know the problem is sort of vague, so if you can provide any help whatsoever I will appreciate it!
The following issue is for personal growth, and because I am writing some network tools for fun/learning. No homework involved (I'm a senior in college, none of my classes require this stuff!)
I am using tshark to get information about packet captures. This is what it looks like:
rachel@Ubuntu-1:~/PCAP$ tshark -r LargeTorrent.pcap -q -z io,phs
===================================================================
Protocol Hierarchy Statistics
Filter:
eth frames:4309 bytes:3984321
  ip frames:4119 bytes:3969006
    icmp frames:1316 bytes:1308988
    udp frames:1408 bytes:1350786
      data frames:1368 bytes:1346228
      dns frames:16 bytes:1176
      nbns frames:14 bytes:1300
      http frames:8 bytes:1596
      nbdgm frames:2 bytes:486
        smb frames:2 bytes:486
          mailslot frames:2 bytes:486
            browser frames:2 bytes:486
    tcp frames:1395 bytes:1309232
      data frames:1300 bytes:1294800
      http frames:6 bytes:3763
        data-text-lines frames:2 bytes:324
        xml frames:2 bytes:3205
          tcp.segments frames:1 bytes:787
      nbss frames:34 bytes:5863
        smb frames:17 bytes:3047
          pipe frames:4 bytes:686
            lanman frames:4 bytes:686
        smb2 frames:13 bytes:2444
      bittorrent frames:10 bytes:1709
        tcp.segments frames:2 bytes:433
          bittorrent frames:2 bytes:433
            bittorrent frames:1 bytes:258
        bittorrent frames:2 bytes:221
          bittorrent frames:2 bytes:221
  arp frames:146 bytes:8760
  ipv6 frames:44 bytes:6555
    udp frames:40 bytes:6211
      dns frames:18 bytes:1711
      dhcpv6 frames:14 bytes:2114
      http frames:6 bytes:1014
      data frames:2 bytes:1372
    icmpv6 frames:4 bytes:344
===================================================================
What I would like for it to look like:
rachel@Ubuntu-1:~/PCAP$ tshark -r LargeTorrent.pcap -q -z io,phs
===================================================================
Protocol Hierarchy Statistics
Filter:
Protocol Bytes
=====================================
eth 3984321
ip 3969006
icmp 1308988
udp 1350786
data 1346228
dns 1176
nbns 1300
http 1596
nbdgm 486
smb 486
mailslot 486
browser 486
tcp 1309232
data 1294800
http 3763
data-text-lines 324
xml 3205
tcp.segments 787
nbss 5863
smb 3047
pipe 686
lanman 686
smb2 2444
bittorrent 1709
tcp.segments 433
bittorrent 433
bittorrent 258
bittorrent 221
bittorrent 221
arp 8760
ipv6 6555
udp 6211
dns 1711
dhcpv6 2114
http 1014
data 1372
icmpv6 344
===================================================================
Edit: I am going to add the original question for the purpose of making sense of the (great) answer that was provided.
Originally, I wanted to only print statistics for "leaves" because eth, ip, etc. are all parents and their statistics are not necessary for my purposes. In addition, instead of having a god-awful block of text with only spaces to show hierarchy, I wanted to erase all the statistics for parents, and show them as breadcrumbs behind the child.
Example:
eth frames:4309 bytes:3984321
  ip frames:4119 bytes:3969006
    icmp frames:1316 bytes:1308988
    udp frames:1408 bytes:1350786
      data frames:1368 bytes:1346228
      dns frames:16 bytes:1176
Should become
eth:ip:icmp - 1308988 bytes
eth:ip:udp:data - 1346228 bytes
eth:ip:udp:dns - 1176 bytes
To preserve the hierarchy and avoid printing useless statistics.
Anyway, the approved answer by Etan solved this perfectly! And for those of you who are on my level who are unsure of how to proceed after this answer, this will help you finish up:
Save the given script as a filename.awk file
Save the block of text you want to manipulate as a filename.txt file
Call awk -f filename.awk filename.txt
Optionally pipe the output to a file ( awk -f filename.awk filename.txt >> output.txt )
The output I originally thought you wanted could be achieved with this awk script. (I think this can probably be done cleaner but this seems to work well enough.)
function entry() {
    # Don't want to print empty entries.
    if (ind[0]) {
        printf "%s", ind[0]
        for (i = 1; i <= ls; i++) {
            printf ":%s", ind[i]
        }
        split(b, a, /:/)
        printf " - %s %s\n", a[2], a[1]
    }
}
# Found our data marker. Note that and print the current line.
$1 == "Filter:" {d=1; print; next}
# Print lines until we see our data marker.
!d {print; next}
# Print empty lines.
!NF {print; next}
# Save our trailing line for later.
/===/ {suf=$0; next}
{
    # Save our previous indentation level.
    ls = s
    # Find our new indentation level (by where the first field starts).
    s = (match($0, /[^[:space:]]/)-1) / 2
    # If the current line is at or below the last indent level print the last line.
    if (s <= ls) {
        entry()
    }
    # Save the current line's byte count.
    b=$NF
    # Save the current line's field name.
    ind[s] = $1
}
END {
    # The last data line's own level is still in s, so use it for the final entry.
    ls = s
    # Print a final line if we had one.
    entry()
    # Print the suffix line if we have one.
    if (suf) {
        print suf
    }
}
Which, on the sample input, gets you this output.
===================================================================
Protocol Hierarchy Statistics
Filter:
eth:ip:icmp - 1308988 bytes
eth:ip:udp:data - 1346228 bytes
eth:ip:udp:dns - 1176 bytes
eth:ip:udp:nbns - 1300 bytes
eth:ip:udp:http - 1596 bytes
eth:ip:udp:nbdgm:smb:mailslot:browser - 486 bytes
eth:ip:tcp:data - 1294800 bytes
eth:ip:tcp:http:data-text-lines - 324 bytes
eth:ip:tcp:http:xml:tcp.segments - 787 bytes
eth:ip:tcp:nbss:smb:pipe:lanman - 686 bytes
eth:ip:tcp:nbss:smb2 - 2444 bytes
eth:ip:tcp:bittorrent:tcp.segments:bittorrent:bittorrent - 258 bytes
eth:ip:tcp:bittorrent:bittorrent:bittorrent - 221 bytes
eth:arp - 8760 bytes
eth:ipv6:udp:dns - 1711 bytes
eth:ipv6:udp:dhcpv6 - 2114 bytes
eth:ipv6:udp:http - 1014 bytes
eth:ipv6:udp:data - 1372 bytes
eth:ipv6:icmpv6 - 344 bytes
===================================================================
The output you describe in your edit is probably easier to produce with sed, though.
/Filter:/a \
Protocol Bytes \
=====================================
s/frames:[^ ]*//
s/ b/b/
s/bytes:\([^ ]*\)/\1/
Which ends up with output.
===================================================================
Protocol Hierarchy Statistics
Filter:
Protocol Bytes
=====================================
eth 3984321
ip 3969006
icmp 1308988
udp 1350786
data 1346228
dns 1176
nbns 1300
http 1596
nbdgm 486
smb 486
mailslot 486
browser 486
tcp 1309232
data 1294800
http 3763
data-text-lines 324
xml 3205
tcp.segments 787
nbss 5863
smb 3047
pipe 686
lanman 686
smb2 2444
bittorrent 1709
tcp.segments 433
bittorrent 433
bittorrent 258
bittorrent 221
bittorrent 221
arp 8760
ipv6 6555
udp 6211
dns 1711
dhcpv6 2114
http 1014
data 1372
icmpv6 344
===================================================================
A simple script with sed will work as well.
$ printf "\n==========================================================\n"; printf "Protocol Hierarchy Statistics\nFilter:\n\n";printf "\nProtocol\t\t\t\t Bytes\n================================================\n" && sed -e 's/\(frames[:].*bytes[:]\)\(.*$\)/\2/' dat/tshark.txt | tail -n+4 | head -n-1 && printf "================================================\n"
broken down into script form (where dat/tshark.txt is the filename holding the tshark output):
printf "\n==========================================================\n"
printf "Protocol Hierarchy Statistics\nFilter:\n\n"
printf "\nProtocol\t\t\t\t Bytes\n================================================\n"
sed -e 's/\(frames[:].*bytes[:]\)\(.*$\)/\2/' dat/tshark.txt | tail -n+4 | head -n-1
printf "================================================\n"
Output
==========================================================
Protocol Hierarchy Statistics
Filter:
Protocol Bytes
================================================
eth 3984321
ip 3969006
icmp 1308988
udp 1350786
data 1346228
dns 1176
nbns 1300
http 1596
nbdgm 486
smb 486
mailslot 486
browser 486
tcp 1309232
data 1294800
http 3763
data-text-lines 324
xml 3205
tcp.segments 787
nbss 5863
smb 3047
pipe 686
lanman 686
smb2 2444
bittorrent 1709
tcp.segments 433
bittorrent 433
bittorrent 258
bittorrent 221
bittorrent 221
arp 8760
ipv6 6555
udp 6211
dns 1711
dhcpv6 2114
http 1014
data 1372
icmpv6 344
================================================
Formatting
Following on from your comment on how to align the bytes info given the variable length of the protocol tags, you can make use of printf to format the output as you have indicated. Like Etan, I started working on your original question that had the tags consolidated. My initial approach was to read the different levels into different associative arrays that could be combined into what you initially specified. Doing so, I had to produce the output lined up using printf. Here is the first attempt I made, working with the first four levels of your tshark data:
declare -i ln=0
declare -A l1 l2 l3 l4
## read each line in file and assign to associative arrays for each level
while read -r line; do
    ln=${#line} # base level on length of line read
    [ $ln -gt 66 ] && continue;
    [ $ln -eq 66 ] && { iface="${line%% *}"; l1[${iface}]="${line##* }"; }
    [ $ln -eq 64 ] && { proto="${iface}:${line%% *}"; l2[${proto}]="${line##* }"; }
    [ $ln -eq 62 ] && { ptype="${proto}:${line%% *}"; l3[${ptype}]="${line##* }"; }
    [ $ln -le 60 ] && { data="${ptype}:${line%% *}"; l4[${data}]="${line##* }"; }
done < "$1"
## output a summary of the file
printf "\n4-level deep summary of file '%s':\n\n" "$1"
for i in "${!l1[@]}"; do
    for j in "${!l2[@]}"; do
        printf " %-32s %s\n" "$j" "${l2[$j]}"
        for k in "${!l3[@]}"; do
            printf " %-32s %s\n" "$k" "${l3[$k]}"
            for l in "${!l4[@]}"; do
                [ "${l%:*}" == "$k" ] && printf " %-32s %s\n" "$l" "${l4[$l]}"
            done
        done
    done
done
The output it produced was for example:
eth:ip frames:4119 bytes:3969006
eth:ip:udp frames:1408 bytes:1350786
eth:ip:udp:data frames:1368 bytes:1346228
eth:ip:udp:nbdgm frames:2 bytes:486
eth:ip:udp:nbns frames:14 bytes:1300
You can look at the various printf statements in the code above and see how the alignment is handled. Let me know if you have further questions.
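For the simpler two-column layout, the same printf width specifiers can be sketched on their own. This is only an illustration: format_phs is a hypothetical name and the column widths 20 and 10 are arbitrary choices. Note that read strips the leading spaces here, so the indentation of the hierarchy is not kept.

```shell
# Turn "name frames:N bytes:M" lines into an aligned "name bytes" table.
format_phs() {
    while read -r name frames bytes; do
        # Skip lines without a bytes: field (title, Filter:, ruler lines).
        case $bytes in
            bytes:*) ;;
            *) continue ;;
        esac
        # %-20s left-aligns the protocol, %10s right-aligns the byte count.
        printf '%-20s %10s\n' "$name" "${bytes#bytes:}"
    done
}

# Possible usage: tshark -r LargeTorrent.pcap -q -z io,phs | format_phs
```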
I'm a little surprised that tshark doesn't have a JSON or machine-readable way to get the -z io,phs info, when it has so many ways to extract packet info.
I tried playing with some of the above, but bash seems to have changed over the years (or has different defaults depending on the environment). I am also not sure which shell or version of it was used to produce the above.
The line lengths/output from tshark have also changed: My debugging showed different line lengths, so the trick above using line lengths, e.g. [ $ln -gt 66 ] didn't work for me.
It seems that read strips leading and trailing whitespace even with -r, because it still splits the line on IFS. If you actually want the indentation, you need IFS= in front of read so that it gives you the spaces:
## read each line in file
while IFS= read -r line ; do
...
done
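A small sketch of the difference, using a made-up indented line as input; the variable names are only for illustration:

```shell
# A line with four leading spaces, like an indented tshark entry.
sample='    udp frames:1408 bytes:1350786'

# Default IFS: read trims the leading spaces, so the indentation is lost.
read -r line_default <<EOF
$sample
EOF

# Empty IFS for this one read: the line is stored exactly as written.
IFS= read -r line_raw <<EOF
$sample
EOF

echo "[$line_default]"
echo "[$line_raw]"
```

The brackets in the echo lines only make the leading spaces visible in the second result.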
The "nested" levels of associative arrays are clever but hard to work with, and they show what rabbit holes you can go down with bash. Also, when iterating through an associative array, bash returns the keys in "hash" order and not in the order they were added.
Since I actually needed the data in the rest of my script, the nested arrays made it particularly fiddly to deal with. They are fine for printf purposes where you just print the line, but not if you actually want to get the frames count for each item and then do something with it.
Here was my attempt that simplified it a bit. I implemented it as a bash function which gets a few other bits of info from the sample file:
TSHARK=/usr/bin/tshark
CAPINFOS=/usr/bin/capinfos
declare -A fcount
declare -A bcount
declare -A capinfo
function loadcapinfo
{
local sample=$1
local statstofile=$2
local bytes
local frames
local key
if [ ! -f "$sample" ] ; then
echo "FATAL: loadcapinfo: file does not exist: $sample"
exit 1
fi
capinfo[start_time_epoch]=$($CAPINFOS -Tr -Sa $sample | cut -f2)
capinfo[start_time]=$($CAPINFOS -Tr -a $sample | cut -f2)
capinfo[end_time_epoch]=$($CAPINFOS -Tr -Se $sample | cut -f2)
capinfo[end_time]=$($CAPINFOS -Tr -e $sample | cut -f2)
capinfo[size]=$($CAPINFOS -Tr -s $sample | cut -f2)
declare -i ln=0
while IFS= read -r line ; do
ln=${#line} # base level on length of line read
[ $ln -le 1 ] && continue;
pat=".*frames:([0-9]+)\s+bytes:([0-9]+)"
pat_1="^(\w+)"
pat_2="^\s{2}(\w+)"
pat_3="^\s{4}(\w+)"
pat_4="^\s{6}(\w+)"
ethertype="ethertype"
[[ $line =~ $pat ]] && { frames=${BASH_REMATCH[1]}; bytes=${BASH_REMATCH[2]}; } || continue;
[[ $line =~ $pat_1 ]] && { encap="${BASH_REMATCH[1]}:${ethertype}"; key="${encap}"; }
[[ $line =~ $pat_2 ]] && { proto=${BASH_REMATCH[1]}; key="${encap}:${proto}"; }
[[ $line =~ $pat_3 ]] && { ptype=${BASH_REMATCH[1]}; key="${encap}:${proto}:${ptype}"; }
[[ $line =~ $pat_4 ]] && { data=${BASH_REMATCH[1]}; key="${encap}:${proto}:${ptype}:${data}"; }
[ "$proto" = "llc" ] && { key=${key/eth:ethertype:llc/eth:llc} ; }
fcount[${key}]=${frames:=0}
bcount[${key}]=${bytes:=0}
if [ -n "$statstofile" ] ; then
echo "${capinfo[start_time_epoch]},${key},${frames},${bytes}" >> $statstofile
fi
done < <($TSHARK -qr $sample -z io,phs)
unset fcount[0]
}
Now, after this in the script, we can do:
loadcapinfo /my/sample/file.pcap /tmp/stats.txt
Optionally write the counts to a file, /tmp/stats.txt
This uses one associative array for each count, and puts other info into capinfo so now we can do things like:
echo "IPv4 Packet Count is: ${fcount[eth:ethertype:ip]}"
echo "IPv6 Packet Count is: ${fcount[eth:ethertype:ipv6]}"
echo "ARP Count is: ${fcount[eth:ethertype:arp]}"
echo "STP Count is: ${fcount[eth:llc:stp]}"
echo "Start time: ${capinfo[start_time]}"
echo "End time: ${capinfo[end_time]}"
echo "File size: ${capinfo[size]}"
I made the keys match Wireshark's frame.protocols field, which inserts some "pseudo protocol" for most things called "ethertype". This way, if you want to then iterate through the associative array to find the packet(s) in the pcap file, you can use the information to find packets with a given protocol.
tshark -r /my/sample/file.pcap -Y "frame.protocols == eth:ethertype:ip:udp:snmp" -Tfields -e frame.number -e eth.src_resolved -e eth.dst_resolved -e ip.src -e ip.dst -e frame.protocols
for i in "${!fcount[@]}"; do
    tshark -r /my/sample/file.pcap -Y "frame.protocols == $i" -Tfields -e frame.number -e eth.src_resolved -e eth.dst_resolved -e ip.src -e ip.dst -e frame.protocols > /tmp/$i.txt
done

time command output on an already running process

I have a process that spawns some other processes,
I want to use the time command on a specific process and get the same output as the time command.
Is that possible and how?
I want to use the time command on a specific process and get the same output as the time command.
Probably it is enough just to use pidstat to get user and sys time:
$ pidstat -p 30122 1 4
Linux 2.6.32-431.el6.x86_64 (hostname) 05/15/2014 _x86_64_ (8 CPU)
04:42:28 PM PID %usr %system %guest %CPU CPU Command
04:42:29 PM 30122 706.00 16.00 0.00 722.00 3 has_serverd
04:42:30 PM 30122 714.00 12.00 0.00 726.00 3 has_serverd
04:42:31 PM 30122 714.00 14.00 0.00 728.00 3 has_serverd
04:42:32 PM 30122 708.00 16.00 0.00 724.00 3 has_serverd
Average: 30122 710.50 14.50 0.00 725.00 - has_serverd
If not: according to strace, time uses the wait4 system call (http://linux.die.net/man/2/wait4) to get information about a process from the kernel. getrusage returns the same information, but according to its documentation (http://linux.die.net/man/2/getrusage) you cannot call it for an arbitrary process.
So I do not know of any command that gives the same output. However, it is feasible to write a bash script that takes the PID of the specific process and prints something like the output of time.
The script does these steps:
1) Get the number of clock ticks per second
getconf CLK_TCK
I assume it is 100 and 1 tick is equal to 10 milliseconds.
2) Then, in a loop, repeat the same sequence of commands while the directory /proc/YOUR-PID exists:
while [ -e "/proc/YOUR-PID" ];
do
read USER_TIME SYS_TIME REAL_TIME <<< $(cat /proc/YOUR-PID/stat | awk '{print $14, $15, $22;}')
sleep 0.1
done
Some explanation - according to man proc :
user time: ($14) - utime - Amount of time that this process has been scheduled in user mode, measured in clock ticks
sys time: ($15) - stime - Amount of time that this process has been scheduled in kernel mode, measured in clock ticks
starttime ($22) - The time in jiffies the process started after system boot.
3) When the process is finished get finish time
read FINISH_TIME <<< $(cat '/proc/self/stat' | awk '{print $22;}')
And then output, with one tick equal to 10 milliseconds as assumed in step 1:
real time: ($FINISH_TIME - $REAL_TIME) * 10 - in milliseconds
user time: $USER_TIME * 10 - in milliseconds
sys time: $SYS_TIME * 10 - in milliseconds
For a different tick rate the factor is 1000 / $(getconf CLK_TCK) instead of 10.
I think it should give roughly the same result as time. One possible problem I see is a process that exists only for a very short period of time.
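The two building blocks of these steps can be sketched as small functions. proc_times and ticks_to_time are hypothetical names. The field offsets assume the /proc/PID/stat layout described in man proc, and the sed step is an addition of mine to cope with command names that contain spaces.

```shell
# Print utime, stime and starttime (in clock ticks) for the PID given as $1.
proc_times() {
    # comm (field 2) may contain spaces, so drop "pid (comm) " first. The state
    # then becomes field 1, and fields 14, 15 and 22 become 12, 13 and 20.
    sed 's/^.*) //' "/proc/$1/stat" | awk '{ print $12, $13, $20 }'
}

# Convert ticks ($1) to the "XmY.ZZZs" layout that time prints.
# $2 is the number of ticks per second and defaults to getconf CLK_TCK.
ticks_to_time() {
    ticks=$1
    hz=${2:-$(getconf CLK_TCK)}
    ms=$(( ticks * 1000 / hz ))
    printf '%dm%d.%03ds\n' $(( ms / 60000 )) $(( ms % 60000 / 1000 )) $(( ms % 1000 ))
}

ticks_to_time 988 100   # 988 ticks at 100 ticks per second, prints 0m9.880s
```

Passing the tick rate as a parameter avoids hard-coding the factor 10, so the conversion also holds on systems where CLK_TCK is not 100.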
This is my implementation of time:
#!/bin/bash
# Uses herestrings
print_res_jeffies()
{
    let "TIME_M=$2/60000"
    let "TIME_S=($2-$TIME_M*60000)/1000"
    let "TIME_MS=$2-$TIME_M*60000-$TIME_S*1000"
    # Minutes, seconds and a millisecond fraction, in the same layout as time.
    printf "%s\t%dm%d.%03ds\n" $1 $TIME_M $TIME_S $TIME_MS
}
print_res_ticks()
{
    let "TIME_M=$2/6000"
    let "TIME_S=($2-$TIME_M*6000)/100"
    let "TIME_MS=($2-$TIME_M*6000-$TIME_S*100)*10"
    printf "%s\t%dm%d.%03ds\n" $1 $TIME_M $TIME_S $TIME_MS
}
# The arithmetic below assumes 100 clock ticks per second.
if [ $(getconf CLK_TCK) != 100 ]; then
    exit 1;
fi
# Exactly one argument is expected: the PID to watch.
if [ $# != 1 ]; then
    exit 1;
fi
PROC_DIR="/proc/"$1
if [ ! -e $PROC_DIR ]; then
    exit 1
fi
USER_TIME=0
SYS_TIME=0
START_TIME=0
while [ -e $PROC_DIR ]; do
    read TMP_USER_TIME TMP_SYS_TIME TMP_START_TIME <<< $(cat $PROC_DIR/stat | awk '{print $14, $15, $22;}')
    # Keep the values only if the process still existed after the read.
    if [ -e $PROC_DIR ]; then
        USER_TIME=$TMP_USER_TIME
        SYS_TIME=$TMP_SYS_TIME
        START_TIME=$TMP_START_TIME
        sleep 0.1
    else
        break
    fi
done
# The start time of the freshly spawned cat process serves as "now".
read FINISH_TIME <<< $(cat '/proc/self/stat' | awk '{print $22;}')
let "REAL_TIME=($FINISH_TIME - $START_TIME)*10"
print_res_jeffies 'real' $REAL_TIME
print_res_ticks 'user' $USER_TIME
print_res_ticks 'sys' $SYS_TIME
And this is an example that compares my implementation of time and real time:
>time ./sys_intensive > /dev/null
Alarm clock
real 0m10.004s
user 0m9.883s
sys 0m0.034s
In another terminal window I run my_time.sh and give it PID:
>./my_time.sh `pidof sys_intensive`
real 0m10.010s
user 0m9.780s
sys 0m0.030s

find all users who has over N process and echo them in shell

I'm writing a script in ksh. I need to find all users who have more than N processes and echo them in the shell.
N is read in ksh.
I know that I should use ps -elf, but how do I parse it, find the users with more than N processes, and create an array with them? I have a little trouble with arrays in ksh. Please help. Maybe a simple solution without creating an array would do.
s162103@helios:/home/s162103$ ps -elf
0 S s153308 4804 1 0 40 20 ? 17666 ? 11:03:08 ? 0:00 /usr/lib/gnome-settings-daemon --oa
0 S root 6546 1327 0 40 20 ? 3584 ? 11:14:06 ? 0:00 /usr/dt/bin/dtlogin -daemon -udpPor
0 S webservd 15646 485 0 40 20 ? 2823 ? п╪п╟я─я ? 0:23 /opt/csw/sbin/nginx
0 S s153246 6746 6741 0 40 20 ? 18103 ? 11:14:21 ? 0:00 iiim-panel --disable-crash-dialog
0 S s153246 23512 1 0 40 20 ? 17903 ? 09:34:08 ? 0:00 /usr/bin/metacity --sm-client-id=de
0 S root 933 861 0 40 20 ? 5234 ? 10:26:59 ? 0:00 dtgreet -display :14
...
When I type
ps -elf | awk '{a[$3]++;}END{for(i in a)if (a[i]>N)print i, a[i];}' N=1
I get:
s162103@helios:/home/s162103$ ps -elf | awk '{a[$3]++;}END{for(i in a)if (a[i]>N)print i, a[i];}' N=1
root 118
/usr/sadm/lib/smc/bin/smcboot 3
/usr/lib/autofs/automountd 2
/opt/SUNWut/lib/utsessiond 2
nasty 31
dima 22
/opt/oracle/product/Oracle_WT1/ohs/ 7
/usr/lib/ssh/sshd 5
/usr/bin/bash 11
/usr/sadm/lib/smc/bin/smcboot is not a user;
it is the last field of the ps -elf output, not the user.
Something like this (assuming the 3rd field of your ps output is the user ID):
ps -elf |
awk '{a[$3]++;}
END {
for(i in a)
if (a[i]>N)
print i, a[i];
}' N=3
The minimal ps command you want to use here is ps -eo user=. This will just print the username for each process and nothing more. The rest can be done with awk:
ps -eo user= |
awk -v max=3 '{ n[$1]++ }
END {
for (user in n)
if (n[user]>max)
print n[user], user
}'
I recommend putting the count in the first column for readability.
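The counting step can also be separated from ps, which makes it easy to try on sample data. A sketch under that assumption; users_over is a hypothetical name, and it expects one user name per line on stdin, as ps -eo user= prints.

```shell
# Read one user name per line on stdin and print "count user"
# for every user that occurs more than $1 times.
users_over() {
    sort | uniq -c | awk -v max="$1" '$1 > max { print $1, $2 }'
}

# Possible usage: ps -eo user= | users_over 3
```

With the input lines root, root, root, dima, nasty, nasty and a limit of 1, this prints 2 nasty and 3 root.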
read number
ps -elfo user= | sort | uniq -c | while read count user
do
if (( $count > $number ))
then
echo $user
fi
done
That is the best solution and it works!
