Linux command line: recursively check directories for at least 1 file with the same name as the directory

I have a directory containing a large number of directories. Each directory contains some files and in some cases another directory.
parent_directory
    sub_dir_1
        sub_dir_1.txt
        sub_dir_1_1.txt
    sub_dir_2
        sub_dir_2.txt
        sub_dir_2_1.txt
    sub_dir_3
        sub_dir_3.txt
        sub_dir_3_1.txt
    sub_dir_4
        sub_dir_4.txt
        sub_dir_4_1.txt
    sub_dir_5
        sub_dir_5.txt
        sub_dir_5_1.txt
I need to check that each sub_dir contains at least one file with the exact same name. I don't need to check any further down if there are sub-directories within the sub_dirs.
I was thinking of using for d in ./*/ ; do (command here); done but I don't know how to get access to the sub_dir name inside the for loop:
for d in ./*/ ; do
    (if directory does not contain 1 file that is the same name as the directory then echo directory name);
done
What is the best way to do this or is there a simpler way?

From the parent directory:
find . -mindepth 1 -maxdepth 1 -type d -printf "%f\n" |
    xargs -I {} find {} -maxdepth 1 -type f -name {}.txt
This will give you the name/name.txt pairs (-mindepth 1 keeps . itself out of the list). Compare with all the dir names to find the missing ones.
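For instance, one way to do that comparison is with comm; a sketch assuming GNU find and directory names without newlines:
# List every sub_dir, list every sub_dir that contains name.txt,
# and print the names that appear only in the first list.
comm -23 \
    <(find . -mindepth 1 -maxdepth 1 -type d -printf "%f\n" | sort) \
    <(find . -mindepth 1 -maxdepth 1 -type d -printf "%f\n" \
        | xargs -I {} find {} -maxdepth 1 -type f -name {}.txt -printf "%h\n" \
        | sort)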
UPDATE
This might be simpler; instead of scanning, you can check directly whether the file exists:
for f in $(find . -mindepth 1 -maxdepth 1 -type d -printf "%f\n"); do
    if [ ! -e "$f/$f.txt" ]; then
        echo "$f not found"
    fi
done
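If the directory names can contain spaces, iterating a glob avoids the word-splitting of for over $(find); a minimal sketch of the same check:
# Same existence test, but the glob keeps names with spaces intact.
# Run from the parent directory.
for d in */; do
    d=${d%/}    # strip the trailing slash
    if [ ! -e "$d/$d.txt" ]; then
        echo "$d not found"
    fi
done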

Maybe I don't understand fully, but
find . -print | grep -P '/(.*?)/\1\.txt'
this will print any file which is inside a directory of the same name, e.g.:
./a/b/b.txt
./a/c/d/d.txt
etc...
Similarly, with sed:
find . -print | sed -n '/\(.*\)\/\1\.txt/p'
And this:
find . -print | grep -P '/(.*?)/\1\.'
will list all files in same-named dirs, regardless of the extension.
You can craft other regexes following the backreference logic.
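For instance, inverting the match lists the files that do not sit inside a directory of the same name; a sketch reusing the same backreference:
# -v inverts the match: print files NOT inside a same-named directory.
find . -type f | grep -Pv '/(.*?)/\1\.'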

Related

Moving files with a pattern in their name to a folder with the same pattern as its name

My directory contains mix of hundreds of files and directories similar to this:
508471/
ae_lstm__ts_508471_detected_anomalies.pdf
ae_lstm__508471_prediction_result.pdf
mlp_508471_prediction_result.pdf
mlp__ts_508471_detected_anomalies.pdf
vanilla_lstm_508471_prediction_result.pdf
vanilla_lstm_ts_508471_detected_anomalies.pdf
598690/
ae_lstm__ts_598690_detected_anomalies.pdf
ae_lstm__598690_prediction_result.pdf
mlp_598690_prediction_result.pdf
mlp__ts_598690_detected_anomalies.pdf
vanilla_lstm_598690_prediction_result.pdf
vanilla_lstm_ts_598690_detected_anomalies.pdf
There are folders with an ID number as their names, like 508471 and 598690.
In the same path as these folders, there are pdf files that have this ID number as part of their name. I need to move all the pdf files with the same ID in their name to their related directories.
I tried the following shell script but it doesn't do anything. What am I doing wrong?
I'm trying to loop over all the directories, find the files that have id in their name, and move them to the same dir:
for f in ls -d */; do
id=${f%?} # f value is '598690/', I'm removing the last character, `/`, to get only the id part
find . -maxdepth 1 -type f -iname *.pdf -exec grep $id {} \; -exec mv -i {} $f \;
done
#!/bin/sh
find . -mindepth 1 -maxdepth 1 -type d -exec sh -c '
    for d in "$@"; do
        id=${d#./}
        for file in *"$id"*.pdf; do
            [ -f "$file" ] && mv -- "$file" "$d"
        done
    done
' findshell {} +
This finds every directory inside the current one (finding, for example, ./598690). Then, it removes ./ from the relative path and selects each file that contains the resulting id (598690), moving it to the corresponding directory.
If you are unsure of what this will do, put an echo between && and mv, it will list the mv actions the script would make.
And remember, do not parse ls.
The below code should do the required job.
for dir in */; do
    find . -mindepth 1 -maxdepth 1 -type f -name "*${dir%*/}*.pdf" -exec mv {} ${dir}/ \;
done
Here */ matches only the directories present in the given directory; find searches only for files in the given directory whose names match *${dir%*/}*.pdf, i.e. contain the directory name as a substring; and finally mv moves the matching files into that directory.
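Note the unquoted ${dir} in the mv target can break on names containing spaces; a quoted variant of the same loop might look like this (a sketch, same logic):
# Quoted version: IDs or directory names with spaces survive intact.
for dir in */; do
    find . -mindepth 1 -maxdepth 1 -type f -name "*${dir%/}*.pdf" \
        -exec mv -- {} "$dir" \;
done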
In Unix you can use the command below (it only echoes the mv commands; remove the echo to actually perform them):
find . -name '*508471*' -exec bash -c 'echo mv $0 ${0/508471/598690}' {} \;
You may use this for loop from the parent directory of these pdf files and directories:
for d in */; do
    compgen -G "*${d%/}*.pdf" >/dev/null && mv *"${d%/}"*.pdf "$d"
done
compgen -G is used to check whether there is a match for the given glob.
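A quick illustration of that test in isolation:
# compgen -G prints the glob's matches and returns non-zero when
# nothing matches, so it works as an existence test:
if compgen -G "*508471*.pdf" > /dev/null; then
    echo "at least one matching PDF exists"
fi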

Bash - how to exclude directory with find command and how to get full path with find?

So I have my code down below, and I'm running into a few problems with it.
I'm having trouble excluding the directories output by
find ${1-.}
It is giving me the directories too, instead of only the file names; I've tried different methods such as -prune, etc.
I'm having trouble with deleting the empty files
The data given to me by
EMPTY_FILE=$(find ${1-.} -size 0)
does not give me the correct path.
Here is the output for that
TestFolder/TestFile
In this case I can't just do:
rm TestFolder/TestFile
as it is an invalid path; it needs to be ./TestFolder/TestFile.
How would I add on the ./, or is there a way to get the full path?
#!/bin/bash
echo "Here are all the files in the directory specified\n"
find ${1-.}
EMPTY_FILE=$(find ${1-.} -size 0)
echo "Here are the list of empty files\n"
echo "$EMPTY_FILE \n"
echo "Do you want to delete those empty files?(yes/no)"
read text
if [ "$text" == "yes" ]; then $(rm -- $EMPTY_FILE); fi
Any help is appreciated!
You want this:
#!/bin/bash
echo -e "Here are all the files in the directory specified\n"
# Use -printf "%f\n" to print the filename without leading directories
# Use -type f to restrict find to files
find "${1-.}" -type f -printf " %f\n"
echo -e "Here are the list of empty files\n"
# Again, use -printf "%f\n"
find "${1-.}" -type f -size 0 -printf " %f\n"
echo -e "Do you want to delete those empty files?(yes/no)"
read answer
# Delete files using the `-delete` option
[ "$answer" = "yes" ] && find "${1-.}" -type f -size 0 -delete
Also note that I've quoted "${1-.}" at all occurrences. Since it is user input, you can't rely on it being well-formed; even if it is a valid path, it might still contain problematic characters, like spaces.
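A small demonstration of why the quotes matter (hypothetical path with a space):
# Unquoted, $1 is split on the space into two arguments.
set -- "My Documents"
find $1      # runs: find My Documents   -> two (wrong) paths
find "$1"    # runs: find "My Documents" -> one correct path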
I'm having trouble excluding the directories output by
find ${1-.}
It is giving me the directories too, instead of only the names
You are looking for the -type test. To instruct find to report only regular files, you could say
find ${1-.} -type f
That's probably what you really want, but what you actually asked (to exclude only directories) would be
find ${1-.} -not -type d
Excluding only directories will list symbolic links and special files, too.
In this case I can't just do:
rm TestFolder/TestFile
as it is an invalid path; it needs to be ./TestFolder/TestFile
Nonsense. ./TestFolder/TestFile means exactly the same thing as TestFolder/TestFile.
In any event, find does print paths starting at the specified starting path(s).
I have a feeling that I'm missing something from your question, but if all you need to do is exclude directories, just tell find to only look for files:
find . -type f -size 0 -delete
And then adjust that to suit your script. Hope this helps.
Use -size 0 -type f. rm with no options will not delete directories. Your claim that rm needs ./ is wrong anyway.

Find and delete file but not specific path

I am writing a script to clean up user dirs on "/srv". At present every user keeps some temp files in "/srv/$USER".
Following is my script:
for x in $(cut -d: -f1 /etc/passwd); do
    if [ -d "/srv/${x}" ]; then
        echo "/srv/${x}"
        find /srv/${x} -mindepth 1 -type f -not -amin -10080 -exec rm {} \;
    fi
done
So I tried this script, replacing rm with ls, and got:
/srv/abc
/srv/abc/2015-04-20-11-multi-interval.json
/srv/abc/2015-04-20-10-mimic.json
/srv/xyz
/srv/xyz/magnetic_hadoop/fabfile.py
Here I want to exclude /srv/abc, which is the parent dir, and delete only files, so I added -mindepth 1, but I still didn't get what I want. Then I added -not -path /srv/${x}, but it made no difference.
Does anyone know what I am missing here?
Thanks
The '-type f' means that you will get only files, and your output shows exactly that: after each folder name, which comes from the echo command, only files are listed.
You don't need the '-mindepth 1' option; it changes nothing here, because '-type f' already excludes the starting directory itself.
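Putting that together, a sketch of the original loop with quoting and without the redundant -mindepth 1 (keeping the original one-week -amin window):
# -type f already protects /srv/$x itself, since it is a directory.
for x in $(cut -d: -f1 /etc/passwd); do
    if [ -d "/srv/$x" ]; then
        find "/srv/$x" -type f -not -amin -10080 -exec rm -- {} \;
    fi
done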

LINUX - shell script finding and listing all files with write permission in a directory tree

Here is the code that I have so far:
echo $(pwd > adress)
var=$(head -1 adress)
rm adress
found=0  # Flag
fileshow()
{
    cd $1
    for i in *
    do
        if [ -d $i ]
        then
            continue
        elif [ -w $i ]
        then
            echo $i
            found=1
        fi
    done
    cd ..
}
fileshow $1
if [ $found -eq 0 ]
then
    clear
    echo "$(tput setaf 1)There arent any executable files !!!$(tput sgr0)"
fi
It's working, but it finds files only in the current directory.
I was told that I need to use some kind of recursive method to loop through all sub-directories, but I don't know how to do it.
So if anyone can help me I will be very grateful.
Thanks!
The effect of your script is to find the files below the current working directory that are not directories and are writeable to the current user. This can be achieved with the command:
find ./ -type f -writable
The advantage of using -type f is that it also excludes symbolic links and other special kinds of file, if that's what you want. If you want all files that are not directories (as suggested by your script), then you can use:
find ./ ! -type d -writable
If you want to sort these files (added question, assuming lexicographic ascending order), you can use sort:
find ./ -type f -writable | sort
If you want to use these sorted filenames for something else, the canonical pattern would be (to handle filenames with embedded newlines and other seldom-used characters):
while read -r -d $'\0'; do
    echo "File '$REPLY' is an ordinary file and is writable"
done < <(find ./ -type f -writable -print0 | sort -z)
If you're using a very old version of find that does not support the handy -writable predicate (added to v.4.3 in 2005), then you only have file permissions to go on. You then have to be clear about what you mean by “writable” in the specific context (writable to whom?), and you can replace the -writable predicate with the -perm predicates described in #gregb's answer. If you decide that you mean “writable by anyone” you could use -perm /u=w,g=w,o=w or -perm /222, but there's actually no way of getting all the benefits of -writable just using permissions. Also note that the + form of permission tests to -perm is deprecated and should no longer be used; the / form should be used instead.
You could use find:
find /path/to/directory/ -type f -perm -o=w
Where the -o=w implies that each file has the "other write-permission" set.
or,
find /path/to/directory/ -type f -perm /u+w,g+w,o+w
Where /u+w,g+w,o+w implies that each file either has user, group, or other write-permissions set.

Listing all directories except one

I have the following directory structure:
libs logs src etc .........
    |-- logs
    |-- src
    |-- inc
The "logs" directory is everywhere inside. So I want to list all directories except "logs". What would be the shell command for that?
Something like
#!/bin/bash
for dir in `find * -type d`; do
    if [[ ${dir} != "{logs}*" ]]; then
        echo ${dir}
    fi
done
but this does not seem to be working.
Regards,
Farrukh Arshad.
Rather than trying to process these things one at a time with checks, why don't you get all directories and just filter out the ones you don't want:
find * -type d | egrep -v '^logs/|/logs/'
The grep simply removes lines containing either logs/ at the start or /logs/ anywhere.
That's going to be a lot faster than individually checking every single directory one-by-one.
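Another common approach is to prune the unwanted directory inside find itself instead of filtering afterwards; a sketch:
# -prune stops find from descending into any directory named "logs";
# the -o branch prints every other directory.
find . -type d -name logs -prune -o -type d -print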
As mentioned in the answer above, you can use egrep with | to separate patterns, or, as shown below, express it all in find:
$ find . -type d -print
.
./logs1
./test
./logs
$ find . -type d -not -name logs -not -name logs1 -print
.
./test
