How to print most frequent lines of file in linux terminal?

I have a file with lines of the form:
<host>\t<ip>\n
and I need to print the 5 most frequent IPs. How can I do that?
For example, if I needed to print the 3 most frequent IPs from this file:
host1 192.168.0.26
host2 192.168.0.26
host3 192.168.0.23
host4 192.168.0.24
host5 192.168.0.26
host6 192.168.0.26
host7 192.168.0.25
host8 192.168.0.26
host9 192.168.0.26
host18 192.168.0.22
host22 192.168.0.22
host24 192.168.0.23
I would print:
192.168.0.26
192.168.0.22
192.168.0.23

The following should work. Note that it returns 5 lines, even if there are 10 IPs with the same frequency.
cut -f2 file | sort | uniq -c | sort -rn | head -n5
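Each line of that output is prefixed with the count from uniq -c; if you only want the addresses themselves, a small variation along these lines should work:
cut -f2 file | sort | uniq -c | sort -rn | head -n5 | awk '{print $2}'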

Related

gawk: Using a combination of "/proc/net/route" and "/proc/net/fib_trie" to determine the current local IP address

I would like to create a gawk-one-command-only solution[1] that determines the current local IP address via /proc/net/route and /proc/net/fib_trie.
I would like to use the first octet of the gateway value from /proc/net/route (0xC0, i.e. 192) as the pattern, but only if that first octet is greater than 0:
$ < "/proc/net/route"
Iface Destination Gateway Flags RefCnt Use Metric Mask MTU Window IRTT
enp3s0 00000000 0100A8C0 0003 0 0 0 00000000 0 0 0
enp3s0 00000000 00000000 0001 0 0 1000002 00000000 0 0 0
enp3s0 0000FEA9 00000000 0001 0 0 2 0000FFFF 0 0 0
enp3s0 0000A8C0 00000000 0001 0 0 0 00FFFFFF 0 0 0
The following command returns 0xC0:
$ gawk 'NR >= 2 \
{ \
if($3 > 0) \
print "0x"substr($3, length($3)-1, length($3)) \
}' "/proc/net/route"
0xC0
This can then be converted to a decimal number:
$ gawk 'NR >= 2 \
{ \
if($3 > 0) \
print strtonum("0x"substr($3, length($3)-1, length($3))) \
}' "/proc/net/route"
192
The function strtonum() converts the hex number.
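As a quick standalone check, strtonum() can also be exercised directly (0xC0 being the value extracted above):
$ gawk 'BEGIN { print strtonum("0xC0") }'
192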
The following command for /proc/net/fib_trie returns me these values:
$ gawk '/32 host/ { print f } { f=$2 }' "/proc/net/fib_trie"
127.0.0.1
169.254.88.86
192.168.1.80
127.0.0.1
169.254.88.86
192.168.1.80
Main:
+-- 0.0.0.0/0 3 0 4
|-- 0.0.0.0
/0 universe UNICAST
/0 link UNICAST
+-- 127.0.0.0/8 2 0 2
+-- 127.0.0.0/31 1 0 0
|-- 127.0.0.0
/8 host LOCAL
|-- 127.0.0.1
/32 host LOCAL
|-- 127.255.255.255
/32 link BROADCAST
+-- 169.254.0.0/16 2 0 1
|-- 169.254.0.0
/16 link UNICAST
|-- 169.254.88.86
/32 host LOCAL
|-- 169.254.255.255
/32 link BROADCAST
+-- 192.168.0.0/24 2 1 2
+-- 192.168.0.0/26 2 0 2
|-- 192.168.0.0
/24 link UNICAST
|-- 192.168.1.80
/32 host LOCAL
|-- 192.168.0.255
/32 link BROADCAST
Local:
+-- 0.0.0.0/0 3 0 4
|-- 0.0.0.0
/0 universe UNICAST
/0 link UNICAST
+-- 127.0.0.0/8 2 0 2
+-- 127.0.0.0/31 1 0 0
|-- 127.0.0.0
/8 host LOCAL
|-- 127.0.0.1
/32 host LOCAL
|-- 127.255.255.255
/32 link BROADCAST
+-- 169.254.0.0/16 2 0 1
|-- 169.254.0.0
/16 link UNICAST
|-- 169.254.88.86
/32 host LOCAL
|-- 169.254.255.255
/32 link BROADCAST
+-- 192.168.0.0/24 2 1 2
+-- 192.168.0.0/26 2 0 2
|-- 192.168.0.0
/24 link UNICAST
|-- 192.168.1.80
/32 host LOCAL
|-- 192.168.0.255
/32 link BROADCAST
I thought of something like the following: Combine the above three commands as one gawk command and filter the local IP address from /proc/net/fib_trie:
$ gawk '/32 host/ && /strtonum(0xC0)/ { print f; <do_an_awk_sort_--unique_equivalent_somewhere_in_between_for_192.168.1.80> } { f=$2 }' "/proc/net/fib_trie"
192.168.1.80
I cannot think of any proper solution right now.
Thanks!
[1]
Background story:
I am using tmux and want to output the current local IP address, but the current solution forks three subshells: /bin/sh with ip and gawk. I would like to have a gawk-only solution for this, to put less load on the CPU.
Solving this would also let me combine it with another gawk command that calculates the current uptime.
My ultimate goal is to only depend on procfs and sysfs and have only two gawk commands, which will represent the following status bar of tmux:
(screenshot: tmux status bar)
In this example I have already done that for the left hand side.
I hope I explained everything clearly enough at this very late hour.
Thank you!
-Ramon
One idea for a consolidated awk script:
awk '
# FNR==NR ==> 1st file (route)
FNR==NR { if (FNR>1 && $3>0) {
              len=length($3)
              octett=strtonum("0x" substr($3,len-1))   # convert last hex byte to decimal
              nextfile              # skip to next file; assumes (for now) we are only interested in first octett we find
          }
          next                      # skip to next record
        }
# FNR!=NR ==> 2nd file (fib_trie)
octett  { if ($0 ~ /32 host/) {     # if octett is non-zero (ie, we found an octett) and current line matches "32 host" then ...
              split(ip,a,".")       # split ip into tuples
              for (i=1;i<=4;i++)    # loop through tuples and ...
                  if (a[i] == octett) {   # if a tuple == octett then ...
                      iplist[ip]    # add ip as an iplist[] array index (ie, add ip to our list of ips)
                      next          # break out of loop and skip to next input line
                  }
          }
          else
              ip=$2                 # make note of current (potential) ip
        }
# after all input files/records have been processed ...
END     { for (ip in iplist)        # loop through list of ips and ...
              print ip              # print to stdout
        }
' /proc/net/route /proc/net/fib_trie
This generates:
192.168.1.80
NOTES:
There are a few outstanding issues that need to be addressed by OP ...
What to do if we find more than one non-zero Gateway/octett in /proc/net/route?
What to do if we find more than one ip address (in /proc/net/fib_trie) with a tuple that matches the octett(s) (from /proc/net/route)?
If we find multiple ip addresses is there a sorting requirement when printing to stdout?
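For reference, one way to try this out is to save the script as, say, local_ip.awk (name chosen here just for illustration) and run:
gawk -f local_ip.awk /proc/net/route /proc/net/fib_trie
Since the background goal was a tmux status line, the same invocation should also work via tmux command substitution, e.g. set -g status-right '#(gawk -f /path/to/local_ip.awk /proc/net/route /proc/net/fib_trie)' in ~/.tmux.conf.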

Extract average time using fping

I want to extract the avg time using fping.
fping -q -b 12 -c 3 localhost 192.168.0.20 192.168.0.1 192.168.0.18 192.168.0.22
localhost : xmt/rcv/%loss = 3/3/0%, min/avg/max = 0.06/0.07/0.09
192.168.0.20 : xmt/rcv/%loss = 3/0/100%
192.168.0.1 : xmt/rcv/%loss = 3/3/0%, min/avg/max = 2.00/2.57/3.11
192.168.0.18 : xmt/rcv/%loss = 3/0/100%
192.168.0.22 : xmt/rcv/%loss = 3/3/0%, min/avg/max = 0.12/0.16/0.19
The output should be the average for every device (-1 if the device is unreachable), for example:
0.07
-1
2.57
-1
0.16
Thanks
Using awk:
fping -b 12 -c 3 localhost 192.168.0.20 192.168.0.1 192.168.0.18 192.168.0.22 |
awk -F'/' '{print ($8?$8:"-1")}'
0.07
-1
2.57
-1
0.16
With / as the field delimiter, print the 8th field if it exists; otherwise print the string -1.
$ ... | awk -F/ '{print (/avg/?$(NF-1):-1)}'
search for "avg" keyword, if found print penultimate field, otherwise -1.

Line manipulation & sorting

I am alright at writing Linux scripts but could use some advice. I know the problem is sort of vague, so if you can provide any help whatsoever I will appreciate it!
The following issue is for personal growth, and because I am writing some network tools for fun/learning. No homework involved (I'm a senior in college, none of my classes require this stuff!)
I am using tshark to get information about packet captures. This is what it looks like:
rachel@Ubuntu-1:~/PCAP$ tshark -r LargeTorrent.pcap -q -z io,phs
===================================================================
Protocol Hierarchy Statistics
Filter:
eth frames:4309 bytes:3984321
ip frames:4119 bytes:3969006
icmp frames:1316 bytes:1308988
udp frames:1408 bytes:1350786
data frames:1368 bytes:1346228
dns frames:16 bytes:1176
nbns frames:14 bytes:1300
http frames:8 bytes:1596
nbdgm frames:2 bytes:486
smb frames:2 bytes:486
mailslot frames:2 bytes:486
browser frames:2 bytes:486
tcp frames:1395 bytes:1309232
data frames:1300 bytes:1294800
http frames:6 bytes:3763
data-text-lines frames:2 bytes:324
xml frames:2 bytes:3205
tcp.segments frames:1 bytes:787
nbss frames:34 bytes:5863
smb frames:17 bytes:3047
pipe frames:4 bytes:686
lanman frames:4 bytes:686
smb2 frames:13 bytes:2444
bittorrent frames:10 bytes:1709
tcp.segments frames:2 bytes:433
bittorrent frames:2 bytes:433
bittorrent frames:1 bytes:258
bittorrent frames:2 bytes:221
bittorrent frames:2 bytes:221
arp frames:146 bytes:8760
ipv6 frames:44 bytes:6555
udp frames:40 bytes:6211
dns frames:18 bytes:1711
dhcpv6 frames:14 bytes:2114
http frames:6 bytes:1014
data frames:2 bytes:1372
icmpv6 frames:4 bytes:344
===================================================================
What I would like for it to look like:
rachel@Ubuntu-1:~/PCAP$ tshark -r LargeTorrent.pcap -q -z io,phs
===================================================================
Protocol Hierarchy Statistics
Filter:
Protocol Bytes
=====================================
eth 3984321
ip 3969006
icmp 1308988
udp 1350786
data 1346228
dns 1176
nbns 1300
http 1596
nbdgm 486
smb 486
mailslot 486
browser 486
tcp 1309232
data 1294800
http 3763
data-text-lines 324
xml 3205
tcp.segments 787
nbss 5863
smb 3047
pipe 686
lanman 686
smb2 2444
bittorrent 1709
tcp.segments 433
bittorrent 433
bittorrent 258
bittorrent 221
bittorrent 221
arp 8760
ipv6 6555
udp 6211
dns 1711
dhcpv6 2114
http 1014
data 1372
icmpv6 344
===================================================================
Edit: I am going to add the original question for the purpose of making sense of the (great) answer that was provided.
Originally, I wanted to only print statistics for "leaves" because eth, ip, etc. are all parents and their statistics are not necessary for my purposes. In addition, instead of having a god-awful block of text with only spaces to show hierarchy, I wanted to erase all the statistics for parents, and show them as breadcrumbs behind the child.
Example:
eth frames:4309 bytes:3984321
ip frames:4119 bytes:3969006
icmp frames:1316 bytes:1308988
udp frames:1408 bytes:1350786
data frames:1368 bytes:1346228
dns frames:16 bytes:1176
Should become
eth:ip:icmp - 1308988 bytes
eth:ip:udp:data - 1346228 bytes
eth:ip:udp:dns - 1176 bytes
To preserve the hierarchy and avoid printing useless statistics.
Anyway, the approved answer by Etan solved this perfectly! And for those of you at my level who are unsure how to proceed after this answer, this will help you finish up:
Save the given script as a filename.awk file
Save the block of text you want to manipulate as a filename.txt file
Call awk -f filename.awk filename.txt
Optionally redirect the output to a file ( awk -f filename.awk filename.txt >> output.txt )
The output I originally thought you wanted could be achieved with this awk script. (I think this can probably be done cleaner but this seems to work well enough.)
function entry() {
    # Don't want to print empty entries.
    if (ind[0]) {
        printf "%s", ind[0]
        for (i = 1; i <= ls; i++) {
            printf ":%s", ind[i]
        }
        split(b, a, /:/)
        printf " - %s %s\n", a[2], a[1]
    }
}

# Found our data marker. Note that and print the current line.
$1 == "Filter:" {d=1; print; next}

# Print lines until we see our data marker.
!d {print; next}

# Print empty lines.
!NF {print; next}

# Save our trailing line for later.
/===/ {suf=$0; next}

{
    # Save our previous indentation level.
    ls = s

    # Find our new indentation level (by where the first field starts).
    s = (match($0, /[^[:space:]]/)-1) / 2

    # If the current line is at or below the last indent level, print the last line.
    if (s <= ls) {
        entry()
    }

    # Save the current line's byte count.
    b=$NF

    # Save the current line's field name.
    ind[s] = $1
}

END {
    # Print a final line if we had one.
    entry()

    # Print the suffix line if we have one.
    if (suf) {
        print suf
    }
}
Which, on the sample input, gets you this output.
===================================================================
Protocol Hierarchy Statistics
Filter:
eth:ip:icmp - 1308988 bytes
eth:ip:udp:data - 1346228 bytes
eth:ip:udp:dns - 1176 bytes
eth:ip:udp:nbns - 1300 bytes
eth:ip:udp:http - 1596 bytes
eth:ip:udp:nbdgm:smb:mailslot:browser - 486 bytes
eth:ip:tcp:data - 1294800 bytes
eth:ip:tcp:http:data-text-lines - 324 bytes
eth:ip:tcp:http:xml:tcp.segments - 787 bytes
eth:ip:tcp:nbss:smb:pipe:lanman - 686 bytes
eth:ip:tcp:nbss:smb2 - 2444 bytes
eth:ip:tcp:bittorrent:tcp.segments:bittorrent:bittorrent - 258 bytes
eth:ip:tcp:bittorrent:bittorrent:bittorrent - 221 bytes
eth:arp - 8760 bytes
eth:ipv6:udp:dns - 1711 bytes
eth:ipv6:udp:dhcpv6 - 2114 bytes
eth:ipv6:udp:http - 1014 bytes
eth:ipv6:udp:data - 1372 bytes
eth:ipv6:icmpv6:data - 344 bytes
===================================================================
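If you'd rather not save the tshark output to a text file first, the same script should work fed directly from tshark (assuming it was saved under a name such as phs.awk):
tshark -r LargeTorrent.pcap -q -z io,phs | awk -f phs.awk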
Output like what you edited the question to ask for is probably more easily handled with sed, though.
/Filter:/a \
Protocol Bytes \
=====================================
s/frames:[^ ]*//
s/ b/b/
s/bytes:\([^ ]*\)/\1/
Which ends up with this output:
===================================================================
Protocol Hierarchy Statistics
Filter:
Protocol Bytes
=====================================
eth 3984321
ip 3969006
icmp 1308988
udp 1350786
data 1346228
dns 1176
nbns 1300
http 1596
nbdgm 486
smb 486
mailslot 486
browser 486
tcp 1309232
data 1294800
http 3763
data-text-lines 324
xml 3205
tcp.segments 787
nbss 5863
smb 3047
pipe 686
lanman 686
smb2 2444
bittorrent 1709
tcp.segments 433
bittorrent 433
bittorrent 258
bittorrent 221
bittorrent 221
arp 8760
ipv6 6555
udp 6211
dns 1711
dhcpv6 2114
http 1014
data 1372
icmpv6 344
===================================================================
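For completeness, those sed commands can be saved to a file (say phs.sed, a name chosen just for illustration) and applied either to saved output or straight from tshark:
sed -f phs.sed tshark-output.txt
tshark -r LargeTorrent.pcap -q -z io,phs | sed -f phs.sed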
A simple script with sed will work as well.
$ printf "\n==========================================================\n"; printf "Protocol Hierarchy Statistics\nFilter:\n\n";printf "\nProtocol\t\t\t\t Bytes\n================================================\n" && sed -e 's/\(frames[:].*bytes[:]\)\(.*$\)/\2/' dat/tshark.txt | tail -n+4 | head -n-1 && printf "================================================\n"
broken down into script form (where dat/tshark.txt is the filename holding the tshark output):
printf "\n==========================================================\n"
printf "Protocol Hierarchy Statistics\nFilter:\n\n"
printf "\nProtocol\t\t\t\t Bytes\n================================================\n"
sed -e 's/\(frames[:].*bytes[:]\)\(.*$\)/\2/' dat/tshark.txt | tail -n+4 | head -n-1
printf "================================================\n"
Output
==========================================================
Protocol Hierarchy Statistics
Filter:
Protocol Bytes
================================================
eth 3984321
ip 3969006
icmp 1308988
udp 1350786
data 1346228
dns 1176
nbns 1300
http 1596
nbdgm 486
smb 486
mailslot 486
browser 486
tcp 1309232
data 1294800
http 3763
data-text-lines 324
xml 3205
tcp.segments 787
nbss 5863
smb 3047
pipe 686
lanman 686
smb2 2444
bittorrent 1709
tcp.segments 433
bittorrent 433
bittorrent 258
bittorrent 221
bittorrent 221
arp 8760
ipv6 6555
udp 6211
dns 1711
dhcpv6 2114
http 1014
data 1372
icmpv6 344
================================================
Formatting
Following on from your comment about aligning the bytes info given the variable length of the protocol tags, you can use printf to format the output as you have indicated. Like Etan, I started working on your original question that had the tags consolidated. My initial approach was to read the different levels into separate associative arrays that could be combined into what you initially specified. Doing so, I had to line up the output using printf. Here is my first attempt, working with the first 4 levels of your tshark data:
declare -i ln=0
declare -A l1 l2 l3 l4

## read each line in file and assign to associative arrays for each level
while read -r line; do
    ln=${#line}    # base level on length of line read
    [ $ln -gt 66 ] && continue;
    [ $ln -eq 66 ] && { iface="${line%% *}"; l1[${iface}]="${line##* }"; }
    [ $ln -eq 64 ] && { proto="${iface}:${line%% *}"; l2[${proto}]="${line##* }"; }
    [ $ln -eq 62 ] && { ptype="${proto}:${line%% *}"; l3[${ptype}]="${line##* }"; }
    [ $ln -le 60 ] && { data="${ptype}:${line%% *}"; l4[${data}]="${line##* }"; }
done < "$1"

## output a summary of the file
printf "\n4-level deep summary of file '%s':\n\n" "$1"

for i in "${!l1[@]}"; do
    for j in "${!l2[@]}"; do
        printf " %-32s %s\n" "$j" "${l2[$j]}"
        for k in "${!l3[@]}"; do
            printf " %-32s %s\n" "$k" "${l3[$k]}"
            for l in "${!l4[@]}"; do
                [ "${l%:*}" == "$k" ] && printf " %-32s %s\n" "$l" "${l4[$l]}"
            done
        done
    done
done
The output it produced was for example:
eth:ip frames:4119 bytes:3969006
eth:ip:udp frames:1408 bytes:1350786
eth:ip:udp:data frames:1368 bytes:1346228
eth:ip:udp:nbdgm frames:2 bytes:486
eth:ip:udp:nbns frames:14 bytes:1300
You can look at the various printf statements in the code above and see how the alignment is handled. Let me know if you have further questions.
I'm a little surprised that tshark doesn't have a JSON or machine-readable way to get the -z io,phs info, when it has so many ways to extract packet info.
I tried playing with some of the above, but bash seems to have changed over the years (or has different defaults depending on the environment). I am also not sure which shell or version of it was used to produce the above.
The line lengths/output from tshark have also changed: My debugging showed different line lengths, so the trick above using line lengths, e.g. [ $ln -gt 66 ] didn't work for me.
It seems that read -r strips out leading/trailing whitespace. If you actually want to keep it, you need IFS= to make it give you the spaces:
## read each line in file
while IFS= read -r line ; do
    ...
done
The "nested" levels associative arrays is clever, but hard to work with - it shows what rabbit holes you can go down with bash - although now when iterating through it, bash produces it in "hash" order and not the order they were added.
Since I actually needed the data in the rest of my script, the nested arrays made it particularly fiddly to deal with. Fine for printf purposes where you just print the line, but what if you actually want to get the frames count for each item and do then do something with it.
Here is my attempt, which simplifies it a bit. I implemented it as a bash function, which also gets a few other bits of info from the sample file:
TSHARK=/usr/bin/tshark
CAPINFOS=/usr/bin/capinfos

declare -A fcount
declare -A bcount
declare -A capinfo

function loadcapinfo
{
    local sample=$1
    local statstofile=$2
    local bytes
    local frames
    local key

    if [ ! -f "$sample" ] ; then
        echo "FATAL: loadcapinfo: file does not exist: $sample"
        exit 1
    fi

    capinfo[start_time_epoch]=$($CAPINFOS -Tr -Sa $sample | cut -f2)
    capinfo[start_time]=$($CAPINFOS -Tr -a $sample | cut -f2)
    capinfo[end_time_epoch]=$($CAPINFOS -Tr -Se $sample | cut -f2)
    capinfo[end_time]=$($CAPINFOS -Tr -e $sample | cut -f2)
    capinfo[size]=$($CAPINFOS -Tr -s $sample | cut -f2)

    declare -i ln=0

    while IFS= read -r line ; do
        ln=${#line}    # base level on length of line read
        [ $ln -le 1 ] && continue;
        pat=".*frames:([0-9]+)\s+bytes:([0-9]+)"
        pat_1="^(\w+)"
        pat_2="^\s{2}(\w+)"
        pat_3="^\s{4}(\w+)"
        pat_4="^\s{6}(\w+)"
        ethertype="ethertype"
        [[ $line =~ $pat ]] && { frames=${BASH_REMATCH[1]}; bytes=${BASH_REMATCH[2]}; } || continue;
        [[ $line =~ $pat_1 ]] && { encap="${BASH_REMATCH[1]}:${ethertype}"; key="${encap}"; }
        [[ $line =~ $pat_2 ]] && { proto=${BASH_REMATCH[1]}; key="${encap}:${proto}"; }
        [[ $line =~ $pat_3 ]] && { ptype=${BASH_REMATCH[1]}; key="${encap}:${proto}:${ptype}"; }
        [[ $line =~ $pat_4 ]] && { data=${BASH_REMATCH[1]}; key="${encap}:${proto}:${ptype}:${data}"; }
        [ "$proto" = "llc" ] && { key=${key/eth:ethertype:llc/eth:llc} ; }
        fcount[${key}]=${frames:=0}
        bcount[${key}]=${bytes:=0}
        if [ -n "$statstofile" ] ; then
            echo "${capinfo[start_time_epoch]},${key},${frames},${bytes}" >> $statstofile
        fi
    done < <($TSHARK -qr $sample -z io,phs)

    unset fcount[0]
}
Now, after this in the script, we can do:
loadcapinfo /my/sample/file.pcap /tmp/stats.txt
The second argument is optional and writes the counts to a file, /tmp/stats.txt.
This uses one associative array for each count, and puts other info into capinfo so now we can do things like:
echo "IPv4 Packet Count is: ${fcount[eth:ethertype:ip]}"
echo "IPv6 Packet Count is: ${fcount[eth:ethertype:ipv6]}"
echo "ARP Count is: ${fcount[eth:ethertype:arp]}"
echo "STP Count is: ${fcount[eth:llc:stp]}"
echo "Start time: ${capinfo[start_time]}"
echo "End time: ${capinfo[end_time]}"
echo "File size: ${capinfo[size]}"
I made the keys match Wireshark's frame.protocols field, which inserts a pseudo-protocol called "ethertype" for most things. This way, if you then want to find the packet(s) in the pcap file, you can iterate through the associative array and use the keys to find packets with a given protocol.
tshark -r /my/sample/file.pcap -Y "frame.protocols == eth:ethertype:ip:udp:snmp" -Tfields -e frame.number -e eth.src_resolved -e eth.dst_resolved -e ip.src -e ip.dst -e frame.protocols
for i in "${!fcount[#]}"; do
tshark -r /my/sample/file.pcap -Y "frame.protocols == $i" -Tfields -e frame.number -e eth.src_resolved -e eth.dst_resolved -e ip.src -e ip.dst -e frame.protocols > /tmp/$i.txt
done
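Since fcount and bcount are keyed identically, a quick summary of everything loadcapinfo collected can also be printed with a loop like this (just a sketch over the arrays defined above):
for key in "${!fcount[@]}"; do
    printf '%-55s frames=%-8s bytes=%s\n' "$key" "${fcount[$key]}" "${bcount[$key]}"
done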

How to get ubi volume name by volume ID on linux terminal?

We have 4 volumes on ubi0 and I want to read the volume name at runtime (dynamically).
One option I found is getting ubinfo for the corresponding volume and parsing the result to get the volume name.
example:
ubi0
ubi0_0:
Name: name1
ubi0_1:
Name: name_2
...........
and so on, up to ubi0_4.
Say I want to get the name of volume 2:
ubinfo -d 0 -n 2 |grep "Name:" | sed -e 's|Name:||' -e 's/^ *//'
name_2
Command details: -d <UBI device number> -----> ubi0 (0)
-n <volume ID> -------> 2
Output of ubinfo -d 0 -n 2:
Volume ID: 2 (on ubi0)
Type: dynamic
Alignment: 1
Size: mm LEBs (xxxxx bytes, d MiB)
State: OK
Name: name_2
Character device major/minor: zzz:n
What remains is to extract the Name string value.
Is there any other easier option to get the volume name by volume id?
volid=2
cat /sys/class/ubi/ubi0_$volid/name
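If you need the names of all volumes on the device, a small loop over the same sysfs entries should also work (a sketch; adjust ubi0 to your device):
for f in /sys/class/ubi/ubi0_*/name; do
    vol=${f%/name}
    echo "${vol##*/}: $(cat "$f")"
done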

How to find user memory usage in linux

How can I see memory usage by user on Linux (CentOS 6)?
For example:
USER USAGE
root 40370
admin 247372
user2 30570
user3 967373
This one-liner worked for me on at least four different Linux systems with different distros and versions. It also worked on FreeBSD 10.
ps hax -o rss,user | awk '{a[$2]+=$1;}END{for(i in a)print i" "int(a[i]/1024+0.5);}' | sort -rnk2
About the implementation, there are no shell loop constructs here; this uses an associative array in awk to do the grouping & summation.
Here's sample output from one of my servers that is running a decent sized MySQL, Tomcat, and Apache. Figures are in MB.
mysql 1566
joshua 1186
tomcat 353
root 28
wwwrun 12
vbox 1
messagebus 1
avahi 1
statd 0
nagios 0
Caveat: like most similar solutions, this is only considering the resident set (RSS), so it doesn't count any shared memory segments.
EDIT: A more human-readable version.
echo "USER RSS PROCS" ; echo "-------------------- -------- -----" ; ps hax -o rss,user | awk '{rss[$2]+=$1;procs[$2]+=1;}END{for(user in rss) printf "%-20s %8.0f %5.0f\n", user, rss[user]/1024, procs[user];}' | sort -rnk2
And the output:
USER RSS PROCS
-------------------- -------- -----
mysql 1521 1
joshua 1120 28
tomcat 379 1
root 19 107
wwwrun 10 10
vbox 1 3
statd 1 1
nagios 1 1
messagebus 1 1
avahi 1 1
Per-user memory usage in percent using standard tools:
for _user in $(ps haux | awk '{print $1}' | sort -u)
do
    ps haux | awk -v user=${_user} '$1 ~ user { sum += $4} END { print user, sum; }'
done
or for more precision:
TOTAL=$(free | awk '/Mem:/ { print $2 }')
for _user in $(ps haux | awk '{print $1}' | sort -u)
do
    ps hux -U ${_user} | awk -v user=${_user} -v total=$TOTAL '{ sum += $6 } END { printf "%s %.2f\n", user, sum / total * 100; }'
done
The first version just sums up the memory percentage for each process as reported by ps. The second version sums up the resident set sizes instead and calculates the percentage of total memory afterwards, which gives higher precision.
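If the repeated ps calls are a concern, the two steps can also be collapsed into a single pass by reusing the awk grouping idiom from the first answer (a sketch, printing percentages of total memory):
ps hax -o rss,user | awk -v total="$(free | awk '/Mem:/ {print $2}')" '{sum[$2]+=$1} END {for (u in sum) printf "%s %.2f\n", u, sum[u]/total*100}' | sort -rnk2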
If your system supports it, try installing and using smem:
smem -u
User Count Swap USS PSS RSS
gdm 1 0 308 323 820
nobody 1 0 912 932 2240
root 76 0 969016 1010829 1347768
or
smem -u -t -k
User Count Swap USS PSS RSS
gdm 1 0 308.0K 323.0K 820.0K
nobody 1 0 892.0K 912.0K 2.2M
root 76 0 937.6M 978.5M 1.3G
ameskaas 46 0 1.2G 1.2G 1.5G
124 0 2.1G 2.2G 2.8G
In Ubuntu, smem can be installed by typing
sudo apt install smem
This will return the total RAM usage by user in GB, reverse sorted:
sudo ps --no-headers -eo user,rss | awk '{arr[$1]+=$2}; END {for (i in arr) {print i,arr[i]/1024/1024}}' | sort -nk2 -r
You can use the following Python script to find per-user memory usage using only sys and os module.
import sys
import os

# Get list of all users present in the system
allUsers = os.popen('cut -d: -f1 /etc/passwd').read().split('\n')[:-1]

for users in allUsers:
    # Check if the home directory exists for the user
    if os.path.exists('/home/' + str(users)):
        # Print the current usage of the user
        print(os.system('du -sh /home/' + str(users)))
