Shell script to recursively print full directory tree using ls

Assignment: I have to create a shell script using diff and sort, and a pipeline using ls -l, grep '^d', and awk '{print $9}' to print a full directory tree.
I wrote a C program to display what I am looking for. Here is the output:
ryan@chrx:~/Documents/OS-Projects/Project5_DirectoryTree$ ./a.out
TestRoot/
[Folder1]
[FolderC]
[FolderB]
[FolderA]
[Folder2]
[FolderD]
[FolderF]
[FolderE]
[Folder3]
[FolderI]
[FolderG]
[FolderH]
I wrote this so far:
ls -R -l $1 | grep '^d' | awk '{print $9}'
to print the directory tree but now I need a way to sort it by folder depth and possibly indent but not required. Any suggestions? I can't use find or tree commands.
EDIT: The original assignment and its restrictions were mistaken and were changed at a later date. The current answers are good solutions if you disregard the restrictions, so please leave them for anyone with similar issues. As for the new assignment, in case anybody was wondering: I was to recursively print all subdirectories, sort them, then compare them with my program's output to make sure they have similar results. Here was my solution:
#!/bin/bash
echo Program:
./a.out $1 | sort
echo Shell Script:
ls -R -l $1 | grep '^d' | awk '{print $9}' | sort
diff <(./a.out $1 | sort) <(ls -R -l $1 | grep '^d' | awk '{print $9}' | sort)
DIFF=$?
if [[ $DIFF -eq 0 ]]
then
echo "The outputs are similar!"
fi

You need neither ls nor grep nor awk to get the tree. A simple recursive bash function is enough, like:
#!/bin/bash
walk() {
    local indent="${2:-0}"
    printf "%*s%s\n" $indent '' "$1"
    for entry in "$1"/*; do
        [[ -d "$entry" ]] && walk "$entry" $((indent+4))
    done
}
walk "$1"
If you run it as bash script.sh /etc it will print the dir-tree like:
/etc
    /etc/apache2
        /etc/apache2/extra
        /etc/apache2/original
            /etc/apache2/original/extra
        /etc/apache2/other
        /etc/apache2/users
    /etc/asl
    /etc/cups
        /etc/cups/certs
        /etc/cups/interfaces
        /etc/cups/ppd
    /etc/defaults
    /etc/emond.d
        /etc/emond.d/rules
    /etc/mach_init.d
    /etc/mach_init_per_login_session.d
    /etc/mach_init_per_user.d
    /etc/manpaths.d
    /etc/newsyslog.d
    /etc/openldap
        /etc/openldap/schema
    /etc/pam.d
    /etc/paths.d
    /etc/periodic
        /etc/periodic/daily
        /etc/periodic/monthly
        /etc/periodic/weekly
    /etc/pf.anchors
    /etc/postfix
        /etc/postfix/postfix-files.d
    /etc/ppp
    /etc/racoon
    /etc/security
    /etc/snmp
    /etc/ssh
    /etc/ssl
        /etc/ssl/certs
    /etc/sudoers.d

Borrowing from @jm666's idea of running it on /etc:
$ find /etc -type d -print | awk -F'/' '{printf "%*s[%s]\n", 4*(NF-2), "", $0}'
[/etc]
    [/etc/alternatives]
    [/etc/bash_completion.d]
    [/etc/defaults]
        [/etc/defaults/etc]
            [/etc/defaults/etc/pki]
                [/etc/defaults/etc/pki/ca-trust]
                [/etc/defaults/etc/pki/nssdb]
            [/etc/defaults/etc/profile.d]
            [/etc/defaults/etc/skel]
    [/etc/fonts]
        [/etc/fonts/conf.d]
    [/etc/fstab.d]
    [/etc/ImageMagick]
    [/etc/ImageMagick-6]
    [/etc/pango]
    [/etc/pkcs11]
    [/etc/pki]
        [/etc/pki/ca-trust]
            [/etc/pki/ca-trust/extracted]
                [/etc/pki/ca-trust/extracted/java]
                [/etc/pki/ca-trust/extracted/openssl]
                [/etc/pki/ca-trust/extracted/pem]
            [/etc/pki/ca-trust/source]
                [/etc/pki/ca-trust/source/anchors]
                [/etc/pki/ca-trust/source/blacklist]
        [/etc/pki/nssdb]
        [/etc/pki/tls]
    [/etc/postinstall]
    [/etc/preremove]
    [/etc/profile.d]
    [/etc/sasl2]
    [/etc/setup]
    [/etc/skel]
    [/etc/ssl]
    [/etc/texmf]
        [/etc/texmf/tlmgr]
        [/etc/texmf/web2c]
    [/etc/xml]
Sorry, I couldn't find a sensible way to use the other tools you mentioned so it may not help you but maybe it'll help others with the same question but without the requirement to use specific tools.
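For readers who do have to stay within the original restriction (no find, no tree), one possible sketch is to read the directory headers that ls -R prints (the lines ending in a colon) and indent by the number of path components. The function name ls_tree is made up for illustration, and the approach assumes that no file or directory name ends in a colon or contains a newline:

```bash
#!/bin/bash
# ls_tree ROOT (hypothetical helper): print every directory under ROOT,
# indented 4 spaces per level, using only ls, grep, sed and awk.
# Assumption: no file or directory name ends in ":" or contains a newline.
ls_tree() {
    local root="${1%/}"                 # drop one trailing slash, if any
    ls -R "$root" |
        grep ':$' |                     # keep only the "path:" header lines
        sed 's/:$//' |                  # strip the trailing colon
        awk -F/ -v root="$root" '
            BEGIN { base = split(root, parts, "/") }   # components in ROOT
            {
                indent = ""
                for (i = base; i < NF; i++) indent = indent "    "
                print indent "[" $NF "]"
            }'
}
```

Calling ls_tree TestRoot prints the root in brackets followed by its subdirectories, in the order ls -R visits them.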

Related

Bash: Reading a column from ls -l

For a problem at uni I need to get the file size and file name of the 5 largest files in a series of directories. To do this I'm using two functions, one which loads everything in with ls -l (I realize that parsing info from ls isn't a good method but this particular problem specifies that I can't use find, locate or du). Each line from the ls output is then sent to another function which using awk should withdraw the filesize and file name and store it into an array. Instead I seem to be getting awk trying to open every column from ls to be read.
The code for this is as so:
function addFileSize {
local y=0
local curLine=$1
if [[ -z "${sizeArray[0]}" ]]; then
i=$(awk '{print $5}' $curLine)
nameArray[y]=$(awk '{print $9}' $curLine)
elif [[ -z "${sizeArray[1]}" ]]; then
i=$(awk '{print $5}' $curLine)
nameArray[y]=$(awk '{print $9}' $curLine)
elif [[ -z "${sizeArray[2]}" ]]; then
i=$(awk '{print $5}' $curLine)
nameArray[y]=$(awk '{print $9}' $curLine)
elif [[ -z "${sizeArray[3]}" ]]; then
i=$(awk '{print $5}' $curLine)
nameArray[y]=$(awk '{print $9}' $curLine)
elif [[ -z "${sizeArray[4]}" ]]; then
i=$(awk '{print $5}' $curLine)
nameArray[y]=$(awk '{print $9}' $curLine)
fi
for i in "${sizeArray[@]}"; do
echo "$(awk '{print $5}' $curLine)"
if [[ -z "$i" ]]; then
i=$(awk '{print $5}' $curLine)
nameArray[y]=$(awk '{print $9}' $curLine)
break
elif [[ $i -lt $(awk '{print $5}' $curLine) ]]; then
i=$(awk '{print $5}' $curLine)
nameArray[y]=$(awk '{print $9}' $curLine)
break
fi
let "y++"
done
echo "Name Array:"
echo "${nameArray[@]}"
echo "Size Array:"
echo "${sizeArray[@]}"
}
function searchFiles {
local curdir=$1
for i in $( ls -C -l -A $curdir | grep -v ^d | grep -v ^total ); do # Searches through all files in the current directory
if [[ -z "${sizeArray[4]}" ]]; then
addFileSize $i
elif [[ ${sizeArray[4]} -lt $(awk '{print $5}' $i) ]]; then
addFileSize $i
fi
done
}
Any help would be greatly appreciated, thanks.

If the problem is specifically supposed to be about parsing, then awk might be a good option (although ls output is challenging to parse reliably). Likewise, if the problem is about working with arrays, then your solution should focus on those.
However, if the problem is there to encourage learning about the tools available to you, I would suggest:
the stat tool prints particular pieces of information about a file (including size)
the sort tool re-orders lines of input
the head and tail tools print the first and last lines of input
and your shell can also perform pathname expansion to list files matching a glob wildcard pattern like *.txt
Imagine a directory with some files of various sizes:
10000000 sound/concert.wav
1000000 sound/song.wav
100000 sound/ding.wav
You can use pathname expansion to find their names:
$ echo sound/*
sound/concert.wav sound/ding.wav sound/song.wav
You can use stat to turn a name into a size:
$ stat -f 'This one is %z bytes long.' sound/ding.wav
This one is 100000 bytes long.
Like most Unix tools, stat works the same whether you provide it one argument or several:
$ stat -f 'This one is %z bytes long.' sound/concert.wav sound/ding.wav sound/song.wav
This one is 10000000 bytes long.
This one is 100000 bytes long.
This one is 1000000 bytes long.
(Check man stat for reference about %z and what else you can print. The file's Name is particularly useful.)
Now you have a list of file sizes (and hopefully you've kept their names around too). How do you find which sizes are biggest?
It's much easier to find the biggest item in a sorted list than an unsorted list. To get a feel for it, think about how you might find the highest two items in this unsorted list:
1234 5325 3243 4389 5894 245 2004 45901 3940 3255
Whereas if the list is sorted, you can find the biggest items very quickly indeed:
245 1234 2004 3243 3255 3940 4389 5325 5894 45901
The Unix sort utility takes lines of input and outputs them from lowest to highest (or in reverse order with sort -r).
It defaults to sorting character-by-character, which is great for words ("apple" comes before "balloon") but not so great for numbers ("10" comes before "9"). You can activate numeric sorting with sort -n.
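A quick way to see the difference for yourself (the three numbers are arbitrary sample input):

```bash
# Default sort compares character by character, so "10" and "100" land before "9".
printf '9\n10\n100\n' | sort
# Numeric sort compares the values themselves, so 9 comes first.
printf '9\n10\n100\n' | sort -n
```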
Once you have a sorted list of lines, you can print the first lines with the head tool, or print the last lines using the tail tool.
The first two items of the (already-sorted) list of words for spell-checking:
$ head -n 2 /usr/share/dict/words
A
a
The last two items:
$ tail -n 2 /usr/share/dict/words
Zyzomys
Zyzzogeton
With those pieces, you can assemble a solution to the problem "find the five biggest files across dir1, dir2, dir3":
stat -f '%z %N' dir1/* dir2/* dir3/* |
sort -n |
tail -n 5
Or a solution to "find the biggest file in each of dir1, dir2, dir3, dir4, dir5":
for dir in dir1 dir2 dir3 dir4 dir5; do
stat -f '%z %N' "$dir"/* |
sort -n |
tail -n 1
done
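Note that stat -f with %z and %N is the BSD/macOS syntax. On Linux, GNU stat uses -c with %s (size) and %n (name) instead. A minimal sketch of the same idea for GNU stat, which also skips anything that is not a regular file (the function name largest_files is made up for illustration):

```bash
#!/bin/bash
# largest_files N DIR... (hypothetical helper): print "size name" for the
# N largest regular files directly inside the given directories.
# Assumes GNU stat (stat -c); on BSD/macOS use stat -f '%z %N' instead.
largest_files() {
    local n="$1" dir f
    shift
    for dir in "$@"; do
        for f in "$dir"/*; do
            # skip directories and unmatched globs
            [ -f "$f" ] && stat -c '%s %n' "$f"
        done
    done | sort -n | tail -n "$n"
}
```

For example, largest_files 5 dir1 dir2 dir3 prints the five biggest files, smallest of the five first.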

Without using find, locate, or du, you could do the following for each directory:
ls -Sl|grep ^\-|head -5|awk '{printf("%s %d\n", $9, $5);}'
which lists all files by size, filters out directories, takes the top 5, and prints the file name and size. Wrap with a loop in bash for each directory.

Use ls -S to sort by size, pipe through head to get the top five, pipe through sed to compress multiple spaces into one, then pipe through cut to get the size and file name fields.
robert@habanero:~/scripts$ ls -lS | head -n 5 | sed -e 's/  */ /g' | cut -d " " -f 5,9
32K xtractCode.pl
29K tmd55.pl
24K tagebuch.pl
14K backup
Just specify the directories as arguments to the initial ls.

This would be another choice. Ctrl+V+I is how to insert a tab from the command line.
ls -lS dir1 dir2 dir3.. | awk 'BEGIN{print "Size""Ctrl+V+I""Name"}NR <= 6{print $5"Ctrl+V+I"$9}'

If you can't use find, locate, or du, there's still a straightforward option to get the file size without resorting to ls parsing:
size=$(wc -c < "$file")
wc is smart enough to detect a file on STDIN and call stat to get the size, so this works just as fast.
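Building on that, a rough sketch of the "N largest files" task using only wc, sort, and tail could look like the following. The function name largest_by_wc is hypothetical, and file names containing newlines are not handled:

```bash
#!/bin/bash
# largest_by_wc N DIR... (hypothetical helper): print "size name" for the
# N largest regular files in the given directories, using wc -c for sizes.
largest_by_wc() {
    local n="$1" dir f size
    shift
    for dir in "$@"; do
        for f in "$dir"/*; do
            [ -f "$f" ] || continue     # skip directories and unmatched globs
            size=$(wc -c < "$f")
            size=$((size))              # normalize padding some wc versions add
            printf '%s %s\n' "$size" "$f"
        done
    done | sort -n | tail -n "$n"
}
```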

How to match strings in file names and rename file according to string?

I have many files with matching strings in file names.
foostring.bar
barstring.bar
fuustring.bar
aha_foostring.abc
meh_barstring.abc
lol_fuustring.abc
...
I need to find the bar and abc files with matching strings, and rename the *.bar files so that their basenames look like those of the *.abc files. In other words, add a string prefix.
The result I'm looking for should look like this:
aha_foostring.bar
meh_barstring.bar
lol_fuustring.bar
aha_foostring.abc
meh_barstring.abc
lol_fuustring.abc
...
Clarification Edit: The strings in the *.abc-files are always situated after the last underscore _ and before the dot . The string only contains letters and numbers. The prefix can contain any number of characters, and any type of character, including _ and . This means I also need to take the below example into consideration.
dingdongstring.bar
w_h.a.t_e_v_e.r_dingdongstring.abc
I've been experimenting with find, prefix and basename, but I need help and advice here.
Thanks

I would go with something like this:
(I am sure there are more elegant ways to do it (awk/sed))
#!/bin/bash
for filename in *.abc
do
prefix=${filename%_*}
searchstring=${filename%.abc}
searchstring=${searchstring#*_}
if [[ -f "$searchstring.bar" ]]
then
mv "${searchstring}.bar" "${prefix}_${searchstring}.bar"
fi
done
# show the result
ls -al
Apologies for adding this to your answer, but I've deleted my answer and your answer is closest to what the OP needs. (I don't mind... I care about solutions =)
EDIT: Probably this is what OP wants:
for f in *.abc; do
prefix=${f%_*}
bar=${f%.abc}
bar="${bar##*_}.bar"
[[ -f "$bar" ]] && mv "$bar" "${prefix}_${bar}"
done
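The difference between the two versions is the parameter expansion: #*_ removes the shortest leading match up to an underscore, while ##*_ removes the longest, which is what a prefix containing underscores needs. A small dry-run sketch that only prints the planned rename (rename_plan is a made-up name, and nothing is moved):

```bash
#!/bin/bash
# rename_plan FILE.abc (hypothetical helper): print which .bar file pairs
# with FILE.abc and what it would be renamed to. Nothing is renamed.
rename_plan() {
    local f="$1"
    local prefix="${f%_*}"      # everything before the last underscore
    local key="${f%.abc}"       # drop the extension ...
    key="${key##*_}"            # ... then everything up to the last underscore
    printf '%s.bar -> %s_%s.bar\n' "$key" "$prefix" "$key"
}

rename_plan 'w_h.a.t_e_v_e.r_dingdongstring.abc'
```

With key="${key#*_}" instead, the key for this file would come out as h.a.t_e_v_e.r_dingdongstring, and no matching .bar file would be found.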

I suggest trying the following "magick":
$ join -j 2 <(ls -1 . | sed -n '/\.bar/s/^\(.*\)\(\.[^.]\+\)$/\1\2\t\1/p' | sort -k2) <(ls -1 . | sed -n '/\.abc/s/^\(.\+_\)\?\([a-zA-Z0-9]\+\)\(\.[^.]\+\)$/\1\2\3\t\2\t\1/p' | sort -k2) | awk '{print $2 " " $4}' | while read FILE PREFIX; do echo mv -v "$FILE" "$PREFIX$FILE"; done
mv -v barstring.bar meh_barstring.bar
mv -v dingdongstring.bar w_h.a.t_e_v_e.r_dingdongstring.bar
mv -v foostring.bar aha_foostring.bar
mv -v fuustring.bar lol_fuustring.bar
If it shows the expected commands, remove echo before mv and run it again to apply the changes.
Note also that I use ls -1 . to list the files of the current directory, so you will probably need to change directory or run the command in the directory containing the files.
A little explanation:
The idea behind the code is to pair each filename with its common part, for both the .bar and .abc files:
$ ls -1 . | sed -n '/\.bar/s/^\(.*\)\(\.[^.]\+\)$/\1\2\t\1/p' | sort -k2
barstring.bar barstring
dingdongstring.bar dingdongstring
foostring.bar foostring
fuustring.bar fuustring
$ ls -1 . | sed -n '/\.abc/s/^\(.\+_\)\?\([a-zA-Z0-9]\+\)\(\.[^.]\+\)$/\1\2\3\t\2\t\1/p' | sort -k2
meh_barstring.abc barstring meh_
w_h.a.t_e_v_e.r_dingdongstring.abc dingdongstring w_h.a.t_e_v_e.r_
aha_foostring.abc foostring aha_
lol_fuustring.abc fuustring lol_
As you can see, the 2nd field is the common part. After that we join these lists on the common part and keep only the .bar filename and the prefix:
$ join -j 2 <(ls -1 . | sed -n '/\.bar/s/^\(.*\)\(\.[^.]\+\)$/\1\2\t\1/p' | sort -k2) <(ls -1 . | sed -n '/\.abc/s/^\(.\+_\)\?\([a-zA-Z0-9]\+\)\(\.[^.]\+\)$/\1\2\3\t\2\t\1/p' | sort -k2) | awk '{print $2 " " $4}'
barstring.bar meh_
dingdongstring.bar w_h.a.t_e_v_e.r_
foostring.bar aha_
fuustring.bar lol_
The final step is to rename the files by adding the appropriate prefix to them.

Reformat with awk and sed from STDIN and execute

This is just an example of what I run into a lot:
I would like to copy all .bash_histories to one directory.
grep "/bin/bash" /etc/passwd | awk -F: '{ print "cp " $6"/.bash_history /backup" $6 "/.bash_history" }'
Output:
cp /home/peter/.bash_history /backup/home/peter/.bash_history
cp /home/john/.bash_history /backup/home/john/.bash_history
What I would like is an output like this:
cp /home/peter/.bash_history /backup/_home_peter_.bash_history
cp /home/john/.bash_history /backup/_home_john_.bash_history
And that this output will be executed.
(It's not specifically about this issue, but in general about how to reformat with awk and sed and execute the newly created command line without really creating a script for it.)

The awk script to obtain a similar output will be
grep "/bin/bash" /etc/passwd |head -2 | awk -F: '{ print "cp " $6 "/.bash_history backup/_home_"$1".bash_history" }'
giving an output like
cp /root/.bash_history backup/_home_root.bash_history
cp /home/xxx/.bash_history backup/_home_xxx.bash_history
Now, in order to execute the commands, the system() function within awk is helpful.
system(command) executes the command and returns its exit status.
The above script can be modified as
grep "/bin/bash" /etc/passwd |head -2 | awk -F: '{ system("cp " $6 "/.bash_history backup/_home_"$1".bash_history;") }'
Test run:
$ grep "/bin/bash" /etc/passwd |head -2 | awk -F: '{ system("cp " $6 "/.bash_history backup/_home_"$1".bash_history;") }'
$ ls backup/
_home_xxx.bash_history _home_root.bash_history
PS: It is not recommended to create directories in your root folder, so I intentionally replaced /backup in your script with backup.
Also, for the script to succeed, the backup folder must be created beforehand.

getent passwd | grep \/bin\/bash | cut -d ":" -f 6 | while read a; do eval "cp $a/.bash_history /backup/$(echo $a | sed 's#/#_#g')_.bash_history"; done
This uses getent to fetch the passwd file and cut gets the 6th field like your awk statement did, then it reads each entry line by line and builds the string and executes it with eval.

getent passwd | grep \/bin\/bash | cut -d ":" -f 6 | while read a; do eval "cp $a/.bash_history /backup/$(echo $a | sed 's#/#_#g')_.bash_history"; done
Worked perfectly! Issue solved!
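As a variation that avoids both eval and awk's system(), the fields can be read directly in the shell and the command run as-is. This is only a sketch: the function name backup_histories is made up, it expects passwd-format lines on standard input, and the destination directory must already exist:

```bash
#!/bin/bash
# backup_histories DEST (hypothetical helper): read passwd-format lines on
# stdin and copy each /bin/bash user's history to DEST, with the slashes of
# the home directory replaced by underscores.
backup_histories() {
    local dest="$1" user pw uid gid gecos home shell flat
    while IFS=: read -r user pw uid gid gecos home shell; do
        [ "$shell" = /bin/bash ] || continue
        [ -f "$home/.bash_history" ] || continue
        flat=$(printf '%s' "$home" | tr / _)   # /home/peter -> _home_peter
        cp -- "$home/.bash_history" "$dest/${flat}_.bash_history"
    done
}
```

It could be run as getent passwd | backup_histories /backup. Because the paths are passed to cp as quoted arguments rather than re-parsed as a command string, odd characters in a home directory cannot change the command.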

How do i append some text to pipe without temporary file

I am trying to get the max version number from a directory where I have several versions of one program.
For example, if the output of ls is
something01_1.sh
something02_0.1.2.sh
something02_0.1.sh
something02_1.1.sh
something02_1.2.sh
something02_2.0.sh
something02_2.1.sh
something02_2.3.sh
something02_3.1.2.sh
something.sh
I am getting the max version number with the following -
ls somedir | grep some_prefix | cut -d '_' -f2 | sort -t '.' -k1 -r | head -n 1
Now, if at the same time I want to compare it with the version number I already have on the system, what's the best way to do it?
In bash I got this working (if 2.5 is the current version):
(ls somedir | grep some_prefix | cut -d '_' -f2; echo 2.5) | sort -t '.' -k1 -r | head -n 1
Is there any other correct way to do it?
EDIT: In the above example some_prefix is something02.
EDIT: The actual problem here is whether
(ls smthing; echo more) | sort
is the best way to merge the output of two commands/programs for piping into a third.

I have found the solution. The best way it seems is using process substitution.
cat <(ls smthing) <(echo more) | sort
For my version example:
cat <(ls somedir | grep some_prefix | cut -d '_' -f2) <(echo 2.5) | sort -t '.' -k1 -r | head -n 1
For the benefit of future readers: please resist the lure of the one-liner and use a glob, as chepner suggested.
An almost identical question has been asked on Super User.
More info about process substitution is available in the Bash manual.
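A glob-based sketch of the version comparison, using a brace group to append the current version to the stream without a temporary file. The function name max_version is made up, and sort -V (version sort) is assumed to be available, as it is in GNU coreutils:

```bash
#!/bin/bash
# max_version PREFIX CURRENT (hypothetical helper): print the highest of
# CURRENT and the versions found in file names of the form
# PREFIX_<version>.sh in the current directory.
# Assumes sort -V (version sort) is available.
max_version() {
    local prefix="$1" current="$2" f v
    {
        for f in "$prefix"_*.sh; do
            [ -f "$f" ] || continue     # unmatched glob stays literal; skip it
            v="${f#"$prefix"_}"         # strip "PREFIX_"
            printf '%s\n' "${v%.sh}"    # strip ".sh"
        done
        printf '%s\n' "$current"        # append the version already installed
    } | sort -V | tail -n 1
}
```

With the file list from the question, max_version something02 2.5 prints 3.1.2.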

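Is the following code more suitable to what you're looking for?
#!/bin/bash
# Highest version among the something02_* files (sort -V needs GNU sort).
highest_version=$(ls something02_* | sort -V | tail -n 1 | sed 's/something02_\|\.sh//g')
# Version of the running script, taken from its own file name.
current_version=$(basename "$0" | sed 's/something02_\|\.sh//g')
# An update is needed when the highest version sorts after the current one.
newest=$(printf '%s\n' "$current_version" "$highest_version" | sort -V | tail -n 1)
if [ "$current_version" != "$highest_version" ] && [ "$newest" = "$highest_version" ]; then
    echo "Uh oh! Looks like we need to update!"
fi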

You can try something like this :
#! /bin/bash
lastversion() { # prefix
local prefix="$1" a=0 b=0 c=0 r f vmax=0
for f in "$prefix"* ; do
test -f "$f" || continue
read a b c r <<< $(echo "${f#$prefix} 0 0 0" | tr -C '[0-9]' ' ')
v=$(((a*100+b)*100+c))
if ((v>vmax)); then vmax=$v; fi
done
echo $vmax
}
lastversion "something02"
It will print: 30102

Bash ls (glob-style)

I have an exercise in which I have to print all the file names in the current folder that contain at least one character from each of [a-k], [m-p], and [1-9].
I probably have to use ls (glob-style).

If order is important then you can use globbing:
$ ls *[a-k]*[m-p]*[1-9]*
ajunk404 am1 cn5
Else just grep for each group separately:
ls | grep "[a-k]" | grep "[m-p]" | grep "[1-9]"
1ma
ajunk404
am1
cn5
m1a
Note: ls will also show directories. If you really only want files, use find instead:
find . -maxdepth 1 -type f | grep "[a-k]" | grep "[m-p]" | grep "[1-9]"

A 100% pure bash (and funny!) possibility:
#!/bin/bash
shopt -s nullglob
a=( *[a-k]* )
b=(); for i in "${a[@]}"; do [[ "$i" = *[p-z]* ]] && b+=( "$i" ); done
c=(); for i in "${b[@]}"; do [[ "$i" = *[1-9]* ]] && c+=( "$i" ); done
printf "%s\n" "${c[@]}"
No external processes whatsoever! No pipes! Only pure bash! 100% safe regarding files with funny symbols in their name (e.g., newlines) (and that's not the case with other methods using ls). And if you want to actually see the funny symbols in the file names and have them properly quoted, so as to reuse the output, use
printf "%q\n" "${c[@]}"
in place of the last printf statement.
Note. The patterns [a-k], [p-z] are locale-dependent. You might want to set LC_ALL=C to be sure that [a-k] really means [abcdefghijk] and not something else, e.g., [aAbBcCdDeEfFgGhHiIjJk].
Hope this helps!

If order isn't important, and the letters appear once or more, you can use chained greps.
ls | egrep "[a-k]" | egrep "[m-p]" | egrep "[1-9]"
If order matters, then just use a glob pattern
ls *[a-k]*[m-p]*[1-9]*

To be complete, you need to search all the combinations:
ls *[a-k]*[m-p]*[1-9]* *[a-k]*[1-9]*[m-p]* \
*[m-p]*[a-k]*[1-9]* *[m-p]*[1-9]*[a-k]* \
*[1-9]*[m-p]*[a-k]* *[1-9]*[a-k]*[m-p]*
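Instead of listing all six orderings, each character class can be tested separately with a shell pattern. A minimal sketch using case, which spawns no external processes (the function name matches_all is made up; as with any bracket range, the result depends on the locale unless LC_ALL=C is set):

```bash
#!/bin/bash
# matches_all NAME (hypothetical helper): succeed if NAME contains at least
# one character from each of [a-k], [m-p] and [1-9], in any order.
matches_all() {
    case "$1" in *[a-k]*) ;; *) return 1 ;; esac
    case "$1" in *[m-p]*) ;; *) return 1 ;; esac
    case "$1" in *[1-9]*) ;; *) return 1 ;; esac
    return 0
}
```

To list the matching names in the current folder: for f in *; do matches_all "$f" && printf '%s\n' "$f"; done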
