BASH - Only printing the deepest directory in path - linux

I need some help.....
In my .bashrc file I have a VERY useful function (it may be a bit rough and ready, and a bit hacky, but it works a treat!) that reads an input file and runs the tree command on each of the input lines to create a directory tree. This tree is then printed into an output file (along with the size of the folder).
multitree()
{
    while read -r cheese
    do
        pushd "$cheese" > /dev/null            # save the current dir and enter $cheese
        echo -e "$cheese \n\n" >> ~/Desktop/"$2".txt
        tree -idf . >> ~/Desktop/"$2".txt      # directory tree, full paths, no indentation
        echo -e "\n\n\n" >> ~/Desktop/"$2".txt
        du -sh --si >> ~/Desktop/"$2".txt      # folder size
        echo -e "\n\n\n\n\n\n\n" >> ~/Desktop/"$2".txt
        popd > /dev/null                       # return to where we started
    done < "$1"
    cat ~/done                                 # presumably a pre-made "done" notice
}
This is a time saver like no end, and outputs a snippet like the following:
./foo
./foo/bar
./foo/bar/1
./foo/bar/1/2
etc etc....
However, the first (and most tedious) thing I need to do is remove all entries, leaving only the deepest folder paths (using the above example, it would be reduced to just ./foo/bar/1/2).
Is there a way of processing the file before/after the tree function to only print the deepest levels?
I know something like Python might do a better job, but I've never used Python and I'm not sure the work systems would let me run it... they do let us modify our own .bashrc, so I'm not too worried!
Thanks in advance guys!!!!
Owen.

You could use
find . -type d -links 2
Replace . with a directory if desired.
EDIT: Explanation:
find searches a directory for files that match a given filter. In this case, the directory is ., and the filter is -type d -links 2.
-type d filters for directories
-links 2 filters for those that have two (hard) links to their name. Effectively, this filters for all directories that have no subdirectories, because only those have exactly two: the entry in their parent directory and the . link in themselves. Directories that do have subdirectories have a higher count, because each subdirectory's .. link also points to them.
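If you want to drop this straight into multitree, the tree line could be replaced with something like the following (a sketch; note that -links 2 relies on the classic Unix directory link count, so it can miss directories on filesystems such as btrfs, where every directory reports a single link):
find . -type d -links 2 | sort >> ~/Desktop/"$2".txt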

Here's a hint:
You just need to count the number of "/" characters in each line.
Because tree lists a directory immediately followed by its own subtree, a directory has subdirectories exactly when the next line contains more "/" characters than the current one does. So a line is the "deepest" directory in its part of the hierarchy precisely when the following line has the same number of "/" characters or fewer.
The last line of the output is always a deepest directory, since nothing follows it. That also covers the trivial case: only one line in your tree output means the current directory has no subdirectories, so it wins by default.
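Here is a minimal sketch of that counting approach in awk; it assumes tree's --noreport option to suppress the trailing summary line, and uses / as the field separator so that NF reflects the depth:
tree -idf --noreport . | awk -F/ '
NR > 1 && NF <= prev_nf { print prev_line }
{ prev_nf = NF; prev_line = $0 }
END { print prev_line }'
A line is printed when the line after it is no deeper, and the final line is always printed.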
Another way you can implement this is by considering the following statement:
If a directory's name also exists as an exact prefix of another directory in the list, followed by the "/" character, then it is NOT the deepest directory in its part of the hierarchy.
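And a sketch of that prefix test, again in awk; since tree lists a directory immediately before its own subtree, each line only needs to be compared with the line that follows it:
tree -idf --noreport . | awk '
NR > 1 && index($0, prev "/") != 1 { print prev }
{ prev = $0 }
END { print prev }'
For an arbitrary, unordered list you would instead have to test each name against every other line.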

Related

How can I list the files in a directory that have zero size/length in the Linux terminal?

I am new to using the Linux terminal, so I'm just starting to learn about the commands I can use. I have figured out how to list the files in a directory using the Linux terminal, and how to list them according to file size. I was wondering if there's a way to list only the files of a specific file size. Right now, I'm trying to list files with zero size, like those that you might create using the touch command. I looked through the flags I could use when I use ls, but I couldn't find exactly what I was looking for. Here's what I have right now:
ls -lsh /mydirectory
The "mydirectory" part is just a placeholder. Is there anything I can add that will only list files that have zero size?
There are a few ways you can go about this; if you want to stick with ls -l you could use e.g. awk in a pipeline to do the filtering.
ls -lsh /mydirectory | awk 'NR > 1 && $6 == 0'
Here, $6 is the sixth field in the output, the size; with the -s flag, ls prepends a block count that shifts the size from the usual fifth field to the sixth. NR > 1 skips the "total" line at the top of the listing.
Another approach would be to use a different tool, find.
find /mydirectory -maxdepth 1 -type f -size 0 -ls
This will also list hidden files, analogous to an ls -la.
The -maxdepth 1 is there so it doesn't traverse the directory tree if you have nested directories.
A simple script can do this.
for file_name in *
do
    if [[ ! -s "$file_name" ]]
    then
        echo "$file_name"
    fi
done
Explanation:
for is a loop; * expands to a list of all files in the current directory.
-s file_name is true if the file exists and has a size greater than 0.
! negates that, so the branch runs only for empty files.

How to add sequential numbers say 1,2,3 etc. to each file name and also for each line of the file content in a directory?

I want to add a sequential number to each file and its contents in a directory. The number should be prefixed to the filename, and each line of that file's contents should be prefixed with the same number. The sequential numbers should be generated in this manner for all the files (names and contents) in the sub-folders of the directory.
I have tried using maxdepth, rename, and print as part of it, but it throws an error saying that "-maxdepth" is not a valid option.
I already have part of the code (to print the names and contents of text files in a directory), and this logic should be appended to it.
#!bin/bash
cd home/TESTING
for file in home/TESTING;
do
find home/TESTING/ -type f -name *.txt -exec basename {} ';' -exec cat {} \;
done
P.s - print, rename, maxdepth are not working
If the name of the first file is File1.txt and its contents are "Louis", then the output for the filename should be 1File1.txt and the content should be "1Louis". The same applies with 2 for the second file. In this manner, it has to traverse through all the subfolders in the directory and print accordingly. I already have part of the code, and this logic should be appended to it.
There should be a fail-safe whenever you execute cd in a script; without one, you can end up executing commands in the wrong directory.
In your attempt, the output would be the same even without the for loop, as for file in home/TESTING only passes the single word home/TESTING to for, so the body runs exactly once. With for file in home/TESTING/* it would actually iterate over the directory's contents.
I used find without -maxdepth, so it will look for *.txt files in all subdirectories as well. If you want only the current directory, $(find /home/TESTING/* -type f -name "*.txt") could be replaced with $(ls *.txt); as long as you do not have a directory name ending in .txt there will be no problem.
#!/bin/bash
# try to cd to the directory, do things upon success.
if cd /home/TESTING ; then
    # set sequence number
    let "x = 1"
    # pass every file that find matches to the loop; subdirectories are
    # included as well, since there is no -maxdepth. (Note: this word-splits,
    # so it misbehaves on filenames that contain spaces.)
    for file in $(find /home/TESTING/* -type f -name "*.txt") ; do
        # print sequence number and base file name, via variable substitution.
        # basename could be used as well, but this is a bash built-in.
        echo "${x}${file##*/}"
        # print file content, putting the sequence number before each line
        # with the stream editor.
        sed 's#^#'"${x}"'#g' "${file}"
        # increase sequence number by one.
        let "x++"
    done
    # unset sequence number
    unset 'x'
else
    # print error on stderr
    echo 'cd to /home/TESTING failed' >&2
fi
Variable Substitution:
There are more, but I picked these four for now as they are similar.
${var#pattern} - Use value of var after removing text that match pattern from the left
${var##pattern} - Same as above but remove the longest matching piece instead the shortest
${var%pattern} - Use value of var after removing text that match pattern from the right
${var%%pattern} - Same as above but remove the longest matching piece instead the shortest
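A quick demonstration of the four forms (the path here is just an example):
file=/home/TESTING/sub/File1.txt
echo "${file#*/}"     # home/TESTING/sub/File1.txt (shortest */ removed from the left)
echo "${file##*/}"    # File1.txt (longest */ removed from the left)
echo "${file%/*}"     # /home/TESTING/sub (shortest /* removed from the right)
echo "${file%%/*}"    # empty: the longest /* match is the whole string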
So ${file##*/} takes the value of $file and drops every character (*) up to and including the last slash (/), using the longest match (##). The value of $file itself is not modified by this, so it still contains the path and filename.
sed 's#^#'"${x}"'#g' ${file}: sed is a stream editor; there are whole books about its usage, so just this particular one. The script is usually placed in single quotes, so 's#^#1#g' will add 1 at the beginning of every line in a file: s is substitution, ^ matches the beginning of a line, 1 is the text to insert, and g means global (without the g, only the first match would be affected).
# is the separator; it can be something else as well, like / for example. I break out of the single quotes to let the variable be expanded, then reopen them.
If you would like to replace text, say .txt with .php, you can use sed 's#\.txt#\.php#g' file. The . has a special meaning (it matches any single character), so it needs to be escaped with \ to be used as literal text; otherwise not only file.txt but also file1txt would be matched.
sed can also be used in a pipe, in which case you do not need to specify a filename; otherwise you have to provide at least one, which in our case was the ${file} variable containing the filename. As mentioned, variable substitution does not modify the variable's value, so it still contains the filename with its path.
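For example, with the sample content from the question, both of these print 1Louis:
echo 'Louis' | sed 's#^#1#g'    # input from a pipe, no filename needed
sed 's#^#1#g' file.txt          # input from a file (a hypothetical file.txt containing Louis)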

for each pair of files with the same prefix, execute code

I have a large list of directories, each of which contains a varied number of "paired" files. By paired, I mean the prefix is the same for two files, and the pairs are denoted as "a" and "b". The prefix does not follow a defined pattern either. My broader intentions are to write a bash script that will list all subdirectories in a given directory, cd into each directory, find the pairs of files, and execute a function on the pairs. Here is an example directory:
Dir1
123_a.txt
234_a.txt
123_b.txt
234_b.txt
Dir2
345_a.txt
345_b.txt
Dir3
456_a.txt
567_a.txt
678_a.txt
456_b.txt
567_b.txt
678_b.txt
I can use this code to loop thought each directory:
for d in ./*/ ; do (cd "$d" && script.sh); done
In script.sh, I have been working on writing a script that will find all pairs of files (which is the problem I am struggling to figure out), and then call the function I want to apply to those files. This is the gist of what I have been trying:
for file in ./*_a.txt; do (find the paired file with *_b.txt && run_function.sh); done
I've broken the problem into needing to get the value of "*" for the _a.txt files, then searching the directory for the matching _b.txt suffix using this value, and making a subdirectory that I can put them into so I can then apply run_function.sh. So Dir1 would contain subdirectories 123 and 234.
Let me know if this doesn't make sense. The part of the problem I'm struggling with is matching files without a defined prefix.
Thanks for your help.
Use parameter expansion:
#!/bin/bash
file=123_a.txt
prefix=${file%_a.txt} # remove _a.txt from the right
second=${prefix}_b.txt
if [[ -f $second ]] ; then
    run_function "$file" "$second"
fi
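To cover every pair in a directory, the same expansion can drive the loop you sketched; here is a sketch, where run_function stands in for whatever your run_function.sh does:
#!/bin/bash
shopt -s nullglob             # skip the loop entirely if nothing matches
for file in ./*_a.txt; do
    prefix=${file%_a.txt}     # e.g. ./123
    second=${prefix}_b.txt    # e.g. ./123_b.txt
    if [[ -f $second ]]; then
        run_function "$file" "$second"
    fi
done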

find returning inverted results

In a few words: I wrote this little script to clean up some directories where I had consolidated directories/files from multiple sources. I had used the cp command with the --backup=numbered feature so that files with identical names would have a suffix like .~1~ appended to avoid overwriting. I then ran fdupes to remove duplicate files; in some cases fdupes removed the file which did not have the suffix appended from the cp command (the original file). So I wanted to scan the directories looking for files with the suffix appended by cp and, if the file does not exist with the suffix removed, mv the file to that name; otherwise I would leave it alone to avoid deleting anything that fdupes did not think was a duplicate.
The issue is that the test condition, the if [ -f ... ] part of the code below, returns inverted results from what it should, and I cannot understand why. For example, when the file exists it returns false, and when the file does not exist it returns true. I fixed it by reversing the actions based on the inverted return code, verified it was working as intended, and ran it as such, but I would like to know if anyone knows why it behaves the way it does. I am not a bash script expert by any means, so it's possible that I missed something simple.
#!/bin/bash
logfile=$$.log
exec > $logfile 2>&1
IFS='
'
#set -f
for FILE in $(find . -type f -regextype posix-extended -regex '^.*(\.~[0-9]+~)+$')
do
    FILE2=${FILE%%.~[0-9]*} # remove the suffix
    if [ -f "${FILE2}" ]
    then
        echo ERROR: "${FILE2}" already exists!
    else
        echo "${FILE}" renamed "${FILE2}"
        mv "${FILE}" "${FILE2}"
    fi
done
You might be able to see the problem by modifying your script to show both FILE and FILE2 in the error message. There are a few minor problems with the script which could cause some confusion (but not the "inverted" logic):
find output is not sorted. If you had more than one backup file, a randomly chosen one would replace the original file;
you could sort the output using an expression like |sort -t~ -n -k2 on the end of the find-command.
the regular expression allows multiple matches of the \.~[0-9]+~ pattern. Conceivably you could have some odd file which ends with .~1~.~2~.
the part where the suffix is removed assumes a single .~N~ is on the end of the filename, and %% removes the longest match. An embedded .~0, e.g., foo.~0bar.~1~, would reduce FILE to just foo. The workaround for that is more cumbersome (since the suffix-stripping uses globbing), but could be done with a case statement which matches an explicit number of digits (likely three digits would be enough).
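A sketch of that case-based workaround, matching an explicit number of digits so that an embedded .~0 cannot widen the match:
case $FILE in
    *.~[0-9]~)           FILE2=${FILE%.~[0-9]~} ;;
    *.~[0-9][0-9]~)      FILE2=${FILE%.~[0-9][0-9]~} ;;
    *.~[0-9][0-9][0-9]~) FILE2=${FILE%.~[0-9][0-9][0-9]~} ;;
    *)                   FILE2=$FILE ;;   # no recognizable backup suffix
esac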

How to open all files in a directory in Bourne shell script?

How can I use the relative path or absolute path as a single command line argument in a shell script?
For example, suppose my shell script is on my Desktop and I want to loop through all the text files in a folder that is somewhere in the file system.
I tried sh myshscript.sh /home/user/Desktop, but this doesn't seem feasible. And how would I avoid directory names and file names with whitespace?
myshscript.sh contains:
for i in `ls`
do
cat $i
done
Superficially, you might write:
cd "${1:-.}" || exit 1
for file in *
do
cat "$file"
done
except you don't really need the for loop in this case:
cd "${1:-.}" || exit 1
cat *
would do the job. And you could avoid the cd operation with:
cat "${1:-.}"/*
which lists (cats) all the files in the given directory, even if the directory or the file names contains spaces, newlines or other difficult to manage characters. You can use any appropriate glob pattern in place of * — if you want files ending .txt, then use *.txt as the pattern, for example.
This breaks down if you might have so many files that the argument list is too long. In that case, you probably need to use find:
find "${1:-.}" -type f -maxdepth 1 -exec cat {} +
(Note that -maxdepth is a GNU find extension.)
Avoid using ls to generate lists of file names, especially if the script has to be robust in the face of spaces, newlines etc in the names.
Use a glob instead of ls, and quote the loop variable:
for i in "$1"/*.txt
do
cat "$i"
done
PS: ShellCheck automatically points this out.
