How to only display owner of file when using ls command with special edge case - linux

My objective is to find all files in a directory recursively and display only the file owner name so I'm able to use uniq to count the # of files a user owns in a directory. The command I am using is the following:
command = "find " + subdirectory.directoryPath + "/ -type f -exec ls -lh {} + | cut -f 3 -d' ' | sort | uniq -c | sort -n"
This command successfully displays only the owner of the file on each line, and lets me count the # of times each owner name is repeated, hence getting the # of files they own in a subdirectory. cut uses ' ' as a delimiter and keeps only the 3rd column of ls, which is the owner of the file.
However, there is a special edge case where I'm not able to obtain the owner name, namely when the listing contains entries like the following:
-rw-r----- 1 31122918 group 20169510233 Mar 17 06:02
-rw-r----- 1 user1 group 20165884490 Mar 25 11:11
-rw-r----- 1 user1 group 20201669165 Mar 31 04:17
-rwxr-x--- 1 user3 group 20257297418 Jun 2 13:25
-rw-r----- 1 user2 group 20048291543 Mar 4 22:04
-rw-r----- 1 14235912 group 20398346003 Mar 10 04:47
The special edge cases are the numeric IDs shown as owners above. The current command I'm using detects user1, user2, and user3 perfectly, but because the numeric owners are right-aligned in their column (padded with extra spaces), the command above doesn't detect them and simply displays nothing. Example output is shown here:
1
1 user3
1 user2
1
2 user1
Can anyone help me parse the ls output so I'm able to detect these #'s when trying to only print the file owner column?

cut -d' ' won't capture the third field when it is preceded by extra padding spaces -- cut treats every single space as the separator of another field, so a run of spaces produces empty fields that shift the owner into a later field.
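For example, with one extra space before the owner, field 3 is empty and the number lands in field 4 (a quick illustration, not your actual data):
printf '%s\n' '-rw-r----- 1  31122918 group' | cut -d' ' -f3   # prints an empty line
printf '%s\n' '-rw-r----- 1  31122918 group' | cut -d' ' -f4   # prints 31122918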
Alternatives:
cut -c
123456789X123456789X123456789X123456789X123456789X0123
-rw-r----- 1 31122918 group 20169510233 Mar 17 06:02
-rw-r----- 1 user1 group 20165884490 Mar 25 11:11
The data you seek falls within a fixed character range on each line (in the aligned listing, roughly characters 14 through 39), so you can say
cut -c14-39
Check the range against your own ls output first, since it depends on how the columns are padded.
perl/awk: other tools are adept at extracting data out of a line. Try one of
perl -lane 'print $F[2]'
awk '{print $3}'
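For instance, your original pipeline with awk in place of cut would be (with /path/to/dir/ standing in for subdirectory.directoryPath):
find /path/to/dir/ -type f -exec ls -lh {} + | awk '{print $3}' | sort | uniq -c | sort -n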

Don't try to parse the output of ls. Use the stat command.
find dirname ! -user root -type f -exec stat --format=%U {} + | sort | uniq -c | sort -n
%U prints the owner username.
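If you do not want to exclude files owned by root, drop the ! -user root test (dirname is a placeholder for your directory, as above):
find dirname -type f -exec stat --format=%U {} + | sort | uniq -c | sort -n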

Merging multiple spaces:
tr -s ' '
Get file users:
ls -hl | tr -s ' ' | cut -f 3 -d' '
ls -hl | awk '{print $3}'
sudo find ./ ! -user root -type f -exec ls -lh {} + | tr -s ' ' | cut -f 3 -d' ' | sort | uniq -c | sort -n
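Applied to the edge case from the question, squeezing the padding spaces first puts the numeric owner back into field 3 (a minimal illustration):
printf '%s\n' '-rw-r----- 1  31122918 group 20169510233' | tr -s ' ' | cut -f 3 -d' '   # 31122918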

You can use the command below to display only the owner of a directory or a file.
stat -c "%U" /path/of/the/file/or/directory
If you also want to print the group of a file or directory you can use %G as well.
stat -c "%U %G" /path/of/the/file/or/directory

Related

Extracting the user with the most amount of files in a dir

I am currently working on a script that should read a directory name from standard input and output the user with the highest number of files in that directory.
I've written this so far:
#!/bin/bash
while read DIRNAME
do
    ls -l $DIRNAME | awk 'NR>1 {print $4}' | uniq -c
done
and this is the output I get when I enter /etc, for instance:
26 root
1 dip
8 root
1 lp
35 root
2 shadow
81 root
1 dip
27 root
2 shadow
42 root
Now obviously root is winning in this case, but I don't want to output only this; I also want to sum the number of files and output only the user with the highest number of files.
Expected output for entering /etc:
root
Is there a simple way to filter the output I get now, so that the user with the highest sum is stored somehow?
ls -l /etc | awk 'BEGIN{FS=OFS=" "}{a[$4]+=1}END{ for (i in a) print a[i],i}' | sort -g -r | head -n 1 | cut -d' ' -f2
This snippet returns the group with the highest number of files in the /etc directory.
What it does:
ls -l /etc lists all the files in /etc in long form.
awk 'BEGIN{FS=OFS=" "}{a[$4]+=1}END{ for (i in a) print a[i],i}' sums the number of occurrences of unique words in the 4th column and prints the number followed by the word.
sort -g -r sorts the output descending based on numbers.
head -n 1 takes the first line.
cut -d' ' -f2 takes the second column, using a space as the delimiter.
Note: In your question, you are saying that you want the user with the highest number of files, but in your code you are referring to the 4th column which is the group. My code follows your code and groups on the 4th column. If you wish to group by user and not group, change {a[$4]+=1} to {a[$3]+=1}.
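For completeness, here is the same pipeline keyed on the user column instead (a sketch with the same structure as above):
ls -l /etc | awk 'BEGIN{FS=OFS=" "}{a[$3]+=1}END{ for (i in a) print a[i],i}' | sort -g -r | head -n 1 | cut -d' ' -f2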
Without unreliably parsing the output of ls:
read -r dirname
# List user owner of files in dirname
stat -c '%U' "$dirname"/* |
# Sort the list of users by name
sort |
# Count occurrences of user
uniq -c |
# Sort by higher number of occurrences numerically
# (first column numerically reverse order)
sort -k1nr |
# Get first line only
head -n1 |
# Keep only starting at character 9 to get user name and discard counts
cut -c9-
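Wrapped in the same while loop as your script, that could look like this (a sketch; awk '{print $2}' is used instead of cut -c9- so the result doesn't depend on uniq -c's padding width):
while read -r dirname
do
    stat -c '%U' "$dirname"/* | sort | uniq -c | sort -k1nr | head -n1 | awk '{print $2}'
done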
I have an awk script to read standard input (or command line files) and sum up the unique names.
summer:
awk '
    { sum[ $2 ] += $1 }
    END {
        for ( v in sum ) {
            print v, sum[v]
        }
    }
' "$@"
Let's say we are using your example of /etc:
ls -l /etc | summer
yields:
0
dip 2
shadow 4
root 219
lp 1
I like to keep utilities general so I can reuse them for other purposes. Now you can just use sort and head to get the maximum result output by summer:
ls -l /etc | summer | sort -r -k2,2 -n | head -1 | cut -f1 -d' '
Yields:
root

ls option to list a symlink reference only (no dates, permissions, size, etc.)

If I use:
$ ls -l mysymlinkname
I get:
lrwxrwxrwx 1 ownr grp 46 Jan 19 17:15 mysymlinkname -> /home/ownr/path/target
All I want is:
/home/ownr/path/target
to put into a bash variable.
Is there an ls option for that? Or a simple, reliable bash command to extract it?
ls -l `pwd`/mysymlinkname | awk '{print $NF}'
To put it into a variable:
VARIABLE=$(ls -l `pwd`/mysymlinkname | awk '{print $NF}')
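As an aside, if readlink is available (it is part of GNU coreutils), it prints the link target without parsing ls at all:
VARIABLE=$(readlink mysymlinkname)
readlink -f would additionally resolve the target to an absolute, canonical path.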

Print permissions from file arguments in Bash script

I'm having trouble reading the permissions of file arguments. It looks like it has something to do with hidden files, but I'm not sure why.
Current Code:
#!/bin/bash
if [ $# = 0 ]
then
    echo "Usage ./checkPerm filename [filename2 ... filenameN]"
    exit 0
fi

for file in $@
do
    ls -l | grep $file | cut -f1 -d' '
    # Do Something
done
I can get the permissions for each input, but when a hidden file goes through the loop it re-prints the permissions of all files.
-bash-4.1$ ll test*
-rw-r--r-- 1 user joe 0 Nov 11 19:07 test1
-r-xr-xr-x 1 user joe 0 Nov 11 19:07 test2*
-r--r----- 1 user joe 0 Nov 11 19:07 test3
-rwxr-x--- 1 user joe 0 Nov 11 19:07 test4*
-bash-4.1$ ./checkPerm test*
-rw-r--r--
-rw-r--r--
-r-xr-xr-x
-r--r-----
-rwxr-x---
-r--r-----
-rw-r--r--
-r-xr-xr-x
-r--r-----
-rwxr-x---
-bash-4.1$
What is going on in the loop?
It's your grep:
ls -l | grep 'test2*'
This will grep out any line containing test, because in a regular expression test2* means test followed by zero or more 2s (it is not a shell glob).
To get your intended result, simply remove your loop and replace it with this:
ls -l "$#" | cut -d' ' -f1
Or keep your loop, but remove the grep:
ls -l $file | cut -d' ' -f1
Also, technically, none of those files are hidden. Hidden files in bash start with ., like .bashrc.
When you do the ls -l inside the loop and then grep the results, any other file whose name happens to contain the pattern is also selected by the grep, giving you extra results. You could see that by doing:
ls -l | grep test
and seeing that there are many more entries than the 4 you get with ls -l test*.
Inside your loop, you should probably use just:
ls -ld "$file" | cut -d' ' -f1
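If you'd rather avoid ls entirely (as other answers on this page suggest), GNU stat can print the human-readable permission string directly -- assuming GNU coreutils is available:
stat -c '%A' "$file"
That would replace the ls | cut pipeline inside the loop.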

tr "[1-9]" "['01'-'09']" not working properly

I'm trying to cut only the date part from a ls -lrth | grep TRACK output:
-rw-r--r-- 1 ins ins 0 Dec 3 00:00 TRACK_1_20121203_01010014.LOG
-rw-r--r-- 1 ins ins 0 Dec 3 00:00 TRACK_0_20121203_01010014.LOG
-rw-r--r-- 1 ins ins 0 Dec 13 15:10 TRACK_9_20121213_01010014.LOG
-rw-r--r-- 1 ins ins 0 Dec 13 15:10 TRACK_8_20121213_01010014.LOG
But, doing this:
ls -lrth | grep TRACK | tr "\t" " " | cut -d" " -f 9
only gives me the dates which are double digits and spaces for single digits:
13
13
So I tried something with the tr command, to translate all single-digit dates to double digits:
ls -lrth | grep TRACK | tr "\t" " " | tr "[1-9]" "['01'-'09']" | cut -d" " -f 9
But it's giving some weird results and evidently doesn't serve my purpose. Any ideas on how to get the correct output?
Don't parse ls output.
ls is a tool for interactively looking at file information. Its output is formatted for humans and will cause bugs in scripts. Use globs or find instead. Understand why: http://mywiki.wooledge.org/ParsingLs
I recommend this way:
If you want the date and the file path:
find . -name 'TRACK*' -printf '%a %p\n'
If you want only the date:
find . -name 'TRACK*' -printf '%a\n'
You could try another approach with something like
find . -name 'TRACK*' -exec stat -c %y {} \; | sort
You can add something like | cut -f1 -d' ' if you only need the date.
I guess this does suffice:
ls -lhrt | grep TRACK | awk '{print $6, $7, $8}'
That kind of substitution would be better handled through sed:
ls -lrth | grep TRACK | sed 's/ \+/ /g;s/ \([0-9]\) / 0\1 /g' | cut -d" " -f 7
As already said, never parse the output of ls!
Since you only want the modification time, the command date has a cool option for that: option -r (man date for more info).
Hence, you probably want this instead of your line:
for i in TRACK*; do date -r "$i"; done
I don't know how you want the format of the date, so play with the options, e.g.,
for i in TRACK*; do date -r "$i" "+%D"; done
(the formats are in man date).
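If you want output close to the date column that ls -l shows, one possible format string (adjust to taste) is:
for i in TRACK*; do date -r "$i" "+%b %e %H:%M"; done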
Use stat to get information about a file.
Also, tr only does one-to-one character translation. It won't replace one-character sequences with two-character ones.
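For instance, sed can do the one-to-two-character replacement that tr cannot (a minimal illustration):
echo 'Dec 3' | sed 's/ \([0-9]\)$/ 0\1/'   # prints Dec 03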

linux bash command separate by space

So I'm trying to display only one column at a time.
First, ls -l gives me this:
drwxr-xr-x 11 stuff stuff 4096 2009-08-22 06:45 lyx-1.6.4
-rw-r--r-- 1 stuff stuff 14403778 2009-10-26 02:37 lyx.tar.gz
I'm using this:
ls -l |cut -d " " -f 1
to get this
drwxr-xr-x
-rw-r--r--
and it displays my first column just fine. Then I want to see the second column:
ls -l |cut -d " " -f 2
I only get this
11
Shouldn't I get
11
1
?
Why is it doing this?
if I try
ls -l |cut -d " " -f 2-3
I get
11 stuff
There's gotta be an easier way to display columns right?
This should show the second column:
ls -l | awk '{print $2}'
cut considers two sequential delimiters to have an empty field in between. So the second line:
-rw-r--r-- 1 stuff stuff
has fields:
1: -rw-r--r--
2: --empty field--
3: 1
etc.
You can also select fixed character positions with cut:
ls -l | cut -c13-14
Or you can use awk to separate fields (unlike cut, awk will treat sequential delimiters as a single delimiter).
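A quick way to see the difference (a minimal illustration):
printf '%s\n' '-rw-r--r--  1 stuff stuff' | cut -d' ' -f2     # empty: the doubled space creates an empty field
printf '%s\n' '-rw-r--r--  1 stuff stuff' | awk '{print $2}'  # 1: awk collapses runs of blanks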
