How to perform an action on each directory in a list using a for loop? - linux

I need to check whether a given directory is present inside each of the directories listed below. This table is the only listing of those directories that I have.
user#root> cat u
Directory owner value
-------- ---- -----
0-0-1-0 Aleks 10
0-0-2-0 Ram 23
0-0-3-0 mark 43
0-0-4-0 Sam 22
0-0-5-0 wood 21
0-0-6-0 peter 34
0-0-7-0 ron 45
0-0-8-0 Alic 44
0-0-9-0 amber 56
0-0-10-0 janny 34
user#root> cat u |grep -Ev "owner|--"|awk '{print $1 }'
0-0-1-0
0-0-2-0
0-0-3-0
0-0-4-0
0-0-5-0
0-0-6-0
0-0-7-0
0-0-8-0
0-0-9-0
0-0-10-0
Query:
I want to enter each of the directories from 0-0-1-0 to 0-0-10-0 and perform some action. How can I do that?
For example, I want to check whether an XYZ directory is present inside each of them.
user#root>test -d 0-0-1-0/XYZ; if [ "$?" != "0" ]; then echo "directory is missing"; fi
I think the issue will be resolved if I can store the value of each row, one at a time, in a variable.

You can process your list of files like this:
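#!/bin/bash
# $1 is the file containing the table; skip its two header lines
for dir in $(awk 'NR>2 {print $1}' "$1")
do
    if [[ -d "$dir" ]]
    then
        # Use a subshell so each cd starts from the original directory
        (
            cd "$dir" || exit
            pwd
            # Do random stuff
        )
    fi
done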
Run the script like this:
./script.sh my_list_of_files
If the directory exists it will cd to that directory and run pwd.
One warning though, this script will get a bit confused if any of your directories have a space in them.
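If all you need is the XYZ check from the question, you may not need to cd at all. Here is a minimal sketch, assuming the list file has the two header lines shown in the question and the directory name is the first column. The function name check_listed_dirs and its two arguments are made up for illustration.

```shell
#!/bin/sh
# check_listed_dirs LIST_FILE SUBDIR  (hypothetical helper name)
# Reads the first column of LIST_FILE, skipping the two header lines,
# and reports every listed directory that has no SUBDIR inside it.
check_listed_dirs() {
    list_file=$1
    subdir=$2
    awk 'NR>2 {print $1}' "$list_file" |
    while IFS= read -r dir; do
        [ -n "$dir" ] || continue          # ignore blank lines
        if [ ! -d "$dir/$subdir" ]; then
            echo "$dir: $subdir is missing"
        fi
    done
}
```

With the table from the question you would call it as check_listed_dirs u XYZ from the directory that holds 0-0-1-0 to 0-0-10-0. Reading line by line avoids the word splitting of the for loop, although awk still takes only the first whitespace-separated column.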

If you know the DIR_NAME_TO_BE_SEARCHED, then you can use the following command:
find YOUR_STARTING_DIRECTORY -type d -name DIR_NAME_TO_BE_SEARCHED -print
example:
find . -type d -name test -print
explanation:
This finds all directories (-type d), starting from your current directory, whose name is test (-name test) and prints them (-print).
If you don't know the exact DIR_NAME_TO_BE_SEARCHED, you can use a pattern instead:
find YOUR_STARTING_DIRECTORY -type d -name "DIR_NAME_TO_BE_SEARCHED_PATTERN" -print
example:
find . -type d -name "*test*" -print

Related

Delete files older than 3 days if disk usage is over 85 percent in bash

I'm working on a bash script that runs df -h, loops through the disk usage fields, and prints the numbers using awk. If any disk usage is over 85 percent, it should find the files older than 3 days in the log path given in a variable and remove them.
However, when I try to run the script, it keeps complaining that a command was not found on line 6.
This is the code that I'm working on
files =$(find /files/logs -type f -mtime +3 -name '*.log)
process =$(df-h | awk '{print $5+0}')
for i in process
do
if $i -ge 85
then
for k in $files
do
rm -rf $k
done
fi
done;
It's so irritating, because I feel that I'm so close to the solution and yet I still can't figure out what's wrong with the script that makes it refuse to work.
You are searching for files in /files/logs, so you are probably only interested in the partition that holds it (the root partition?).
Your if statement does not use the proper syntax; it should be if [[ x -gt y ]]; then ...
There is no need to loop through the files collected by find, since you can use -exec directly in find (see man find).
#!/bin/bash
# find the partition that contains the log files (this example with root partition)
PARTITION="$(mount | grep "on / type" | awk '{print $1}')"
# find the percentage used of the partition
PARTITION_USAGE_PERCENT="$(df -h | grep "$PARTITION" | awk '{print $5}' | sed s/%//g)"
# if partition too full...
if [[ "$PARTITION_USAGE_PERCENT" -gt 85 ]]; then
# run command to delete extra files
printf "%s%s%s\n" "Disk usage is at " "$PARTITION_USAGE_PERCENT" "% deleting log files..."
find /files/logs -type f -mtime +3 -name '*.log' -exec rm {} \;
fi
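As a possible simplification, df accepts a path and reports the filesystem that holds it, so the mount lookup can be skipped. This is only a sketch, assuming the POSIX df -P layout where the capacity is the fifth column and the filesystem name contains no spaces. The function names usage_percent and clean_old_logs are made up for illustration.

```shell
#!/bin/sh
# usage_percent PATH  (hypothetical helper name)
# Prints the usage percentage, without the % sign, of the filesystem
# that holds PATH. df -P forces the one-line-per-filesystem POSIX layout,
# in which the capacity is the fifth column.
usage_percent() {
    df -P "$1" | awk 'NR==2 {print $5+0}'
}

# clean_old_logs DIR LIMIT  (hypothetical helper name)
# Deletes *.log files older than 3 days under DIR, but only when the
# filesystem holding DIR is more than LIMIT percent full.
clean_old_logs() {
    if [ "$(usage_percent "$1")" -gt "$2" ]; then
        find "$1" -type f -mtime +3 -name '*.log' -exec rm {} +
    fi
}
```

For the case in the question the call would be clean_old_logs /files/logs 85.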

LINUX Copy the name of the newest folder and paste it in a command [duplicate]

I would like to find the newest sub directory in a directory and save the result to variable in bash.
Something like this:
ls -t /backups | head -1 > $BACKUPDIR
Can anyone help?
BACKUPDIR=$(ls -td /backups/*/ | head -1)
$(...) evaluates the statement in a subshell and returns the output.
There is a simple solution to this using only ls:
BACKUPDIR=$(ls -td /backups/*/ | head -1)
-t orders by time (latest first)
-d lists the directories themselves instead of their contents
*/ only lists directories
head -1 returns the first item
I didn't know about */ until I found Listing only directories using ls in bash: An examination.
This is a pure Bash solution:
topdir=/backups
BACKUPDIR=
# Handle subdirectories beginning with '.', and empty $topdir
shopt -s dotglob nullglob
for file in "$topdir"/* ; do
[[ -L $file || ! -d $file ]] && continue
[[ -z $BACKUPDIR || $file -nt $BACKUPDIR ]] && BACKUPDIR=$file
done
printf 'BACKUPDIR=%q\n' "$BACKUPDIR"
It skips symlinks, including symlinks to directories, which may or may not be the right thing to do. It skips other non-directories. It handles directories whose names contain any characters, including newlines and leading dots.
Well, I think this solution is the most efficient:
path="/my/dir/structure/*"
backupdir=$(find $path -type d -prune | tail -n 1)
Explanation why this is a little better:
We do not need sub-shells (aside from the one for getting the result into the bash variable).
We do not need a useless -exec ls -d at the end of the find command, it already prints the directory listing.
We can easily alter this, e.g. to exclude certain patterns. For example, if you want the second newest directory, because backup files are first written to a tmp dir in the same path:
backupdir=$(find $path -type d -prune -not -name "*temp_dir" | tail -n 1)
The above solution doesn't take into account things like files being written and removed from the directory resulting in the upper directory being returned instead of the newest subdirectory.
The other issue is that this solution assumes that the directory only contains other directories and not files being written.
Let's say I create a file called "test.txt" and then run this command again:
echo "test" > test.txt
ls -t /backups | head -1
test.txt
The result is test.txt showing up instead of the last modified directory.
The proposed solution "works" but only in the best case scenario.
Assuming you have a maximum of 1 directory depth, a better solution is to use:
find /backups/* -type d -prune -exec ls -d {} \; |tail -1
Just swap the "/backups/" portion for your actual path.
If you want to avoid showing an absolute path in a bash script, you could always use something like this:
LOCALPATH=/backups
DIRECTORY=$(cd $LOCALPATH; find * -type d -prune -exec ls -d {} \; |tail -1)
With GNU find you can get list of directories with modification timestamps, sort that list and output the newest:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\0" | sort -z -n | cut -z -f2- | tail -z -n1
or, newline separated:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\n" | sort -n | cut -f2- | tail -n1
With POSIX find (that does not have -printf) you may, if you have it, run stat to get file modification timestamp:
find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n | cut -d' ' -f2- | tail -n1
Without stat a pure shell solution may be used by replacing [[ bash extension with [ as in this answer.
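A rough sketch of what such a pure shell version could look like, using [ instead of [[. It assumes your shell's test supports the -nt comparison (bash and dash do, but treat that as an assumption for other shells), and it only looks at non-hidden subdirectories. The function name newest_subdir is made up for illustration.

```shell
#!/bin/sh
# newest_subdir TOPDIR  (hypothetical helper name)
# Prints the most recently modified non-hidden subdirectory of TOPDIR,
# skipping symlinks; prints nothing if there is none.
newest_subdir() {
    newest=
    for f in "$1"/*/; do
        f=${f%/}                                  # drop the trailing slash
        [ -d "$f" ] && [ ! -L "$f" ] || continue  # real directories only
        if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
            newest=$f
        fi
    done
    if [ -n "$newest" ]; then
        printf '%s\n' "$newest"
    fi
}
```

Usage would be something like BACKUPDIR=$(newest_subdir /backups).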
Your "something like this" was almost a hit:
BACKUPDIR=$(ls -t ./backups | head -1)
Combining what you wrote with what I have learned solved my problem too. Thank you for raising this question.
Note: I run the line above from GitBash within Windows environment in file called ./something.bash.

linux command line recursively check directories for at least 1 file with the same name as the directory

I have a directory containing a large number of directories. Each directory contains some files and in some cases another directory.
parent_directory
    sub_dir_1
        sub_dir_1.txt
        sub_dir_1_1.txt
    sub_dir_2
        sub_dir_2.txt
        sub_dir_2_1.txt
    sub_dir_3
        sub_dir_3.txt
        sub_dir_3_1.txt
    sub_dir_4
        sub_dir_4.txt
        sub_dir_4_1.txt
    sub_dir_5
        sub_dir_5.txt
        sub_dir_5_1.txt
I need to check that each sub_dir contains at least one file with the exact same name. I don't need to check any further down if there are subdirectories within the sub_dirs.
I was thinking of using for d in ./*/ ; do (command here); done but I don't know how to get access to the sub_dir name inside the for loop.
for d in ./*/ ;
do
(if directory does not contain 1 file that is the same name as the directory then echo directory name );
done
What is the best way to do this or is there a simpler way?
From the parent directory,
find . -mindepth 1 -maxdepth 1 -type d -printf "%f\n" |
xargs -I {} find {} -maxdepth 1 -type f -name {}.txt
will give you each name/name.txt pair. Compare the result with the full list of directory names to find the missing ones.
UPDATE
This might be simpler: instead of scanning, you can check directly whether the file exists. The -mindepth 1 keeps . itself out of the list.
for f in $(find . -mindepth 1 -maxdepth 1 -type d -printf "%f\n");
do if [ ! -e "$f/$f.txt" ];
then echo "$f not found";
fi; done
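To answer the part about getting at the sub_dir name inside a for d in ./*/ loop: parameter expansion can strip the slashes. A minimal sketch, assuming the file you want is named after the directory with a .txt extension as in the sample tree. The function name report_unmatched is made up for illustration.

```shell
#!/bin/sh
# report_unmatched PARENT  (hypothetical helper name)
# Prints the name of every direct subdirectory of PARENT that does not
# contain a regular file called <subdirectory name>.txt.
report_unmatched() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue   # the pattern matched nothing
        d=${d%/}                  # drop the trailing slash
        name=${d##*/}             # keep only the last path component
        [ -f "$d/$name.txt" ] || echo "$name"
    done
}
```

Run from the parent directory as report_unmatched . and it also copes with spaces in directory names, since nothing goes through word splitting.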
Maybe I don't fully understand the question, but
find . -print | grep -P '/(.*?)/\1\.txt'
this will print any file which is inside of the same-named directory, e.g:
./a/b/b.txt
./a/c/d/d.txt
etc...
Similarly
find . -print | sed -n '/\(.*\)\/\1\.txt/p'
And this:
find . -print | grep -P '/(.*?)/\1\.'
will list all files regardless of the extension in same-named dirs.
You can craft other regexes following the backreference logic.

A bash script to run a program for directories that do not have a certain file

I need a Bash script that executes a program for all directories that do not have a specific file, and creates the output file in the same directory. This program needs an input file, which exists in every directory with the name *.DNA.fasta. Suppose I have the following directories, which may also contain subdirectories:
dir1/a.protein.fasta
dir2/b.protein.fasta
dir3/anyfile
dir4/x.orf.fasta
I started by finding the directories that don't have that specific file, whose name is *.protein.fasta.
In this case I want dir3 and dir4 to be listed (since they do not contain *.protein.fasta).
I have tried this code:
find . -maxdepth 1 -type d \! -exec test -e '{}/*protein.fasta' \; -print
but it does not work; it seems I missed something.
I also do not know how to proceed with the rest of the task.
This is a tricky one.
I can't think of a good solution. But here's a solution, nevertheless. Note that this is guaranteed not to work if your directory or file names contain newlines, and it's not guaranteed to work if they contain other special characters. (I've only tested with the samples in your question.)
Also, I haven't included a -maxdepth because you said you need to search subdirectories too.
#!/bin/bash
# Create an associative array
declare -A excludes
# Build an associative array of directories containing the file
while IFS= read -r line; do
excludes[$(dirname "$line")]=1
echo "excluded: $(dirname "$line")" >&2
done <<EOT
$(find . -name "*protein.fasta" -print)
EOT
# Walk through all directories, print only those not in array
find . -type d \
| while IFS= read -r line ; do
if [[ ! ${excludes[$line]} ]]; then
echo "$line"
fi
done
For me, this returns:
.
./dir3
./dir4
All of which are directories that do not contain a file matching *.protein.fasta. Of course, you can replace the last echo "$line" with whatever you need to do with these directories.
Alternately:
If what you're really looking for is just the list of top-level directories that do not contain the matching file in any subdirectory, the following bash one-liner may be sufficient:
for i in *; do test -d "$i" && ( find "$i" -name '*protein.fasta' | grep -q . || echo "$i" ); done
#!/bin/bash
for dir in *; do
test -d "$dir" && ( find "$dir" -name '*protein.fasta' | grep -q . || Programfoo "$dir/$dir.DNA.fasta");
done
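On why the find command in the question finds nothing useful: test -e does not expand wildcards, so it receives the literal string dir/*protein.fasta and looks for a file with that exact name. One way around it is to let the shell expand the pattern first. This is a sketch only, checking direct children of each top-level directory and not deeper subdirectories. The function names has_match and dirs_without are made up for illustration.

```shell
#!/bin/sh
# has_match DIR PATTERN  (hypothetical helper name)
# Succeeds if at least one entry directly inside DIR matches PATTERN.
has_match() {
    # Let the shell expand the glob; if nothing matches, $1 is the
    # unexpanded pattern, which does not exist as a file.
    set -- "$1"/$2
    [ -e "$1" ]
}

# dirs_without DIR PATTERN  (hypothetical helper name)
# Prints each direct subdirectory of DIR with no entry matching PATTERN.
dirs_without() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue   # the pattern matched nothing
        d=${d%/}                  # drop the trailing slash
        has_match "$d" "$2" || echo "$d"
    done
}
```

For the example in the question, dirs_without . '*protein.fasta' should list ./dir3 and ./dir4, and the echo can be swapped for the program you need to run.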

How to loop over directories in Linux?

I am writing a script in bash on Linux and need to go through all subdirectory names in a given directory. How can I loop through these directories (and skip regular files)?
For example:
the given directory is /tmp/
it has the following subdirectories: /tmp/A, /tmp/B, /tmp/C
I want to retrieve A, B, C.
All answers so far use find, so here's one with just the shell. No need for external tools in your case:
for dir in /tmp/*/ # list directories in the form "/tmp/dirname/"
do
dir=${dir%*/} # remove the trailing "/"
echo "${dir##*/}" # print everything after the final "/"
done
cd /tmp
find . -maxdepth 1 -mindepth 1 -type d -printf '%f\n'
A short explanation:
find finds files (quite obviously)
. is the current directory, which after the cd is /tmp (IMHO this is more flexible than having /tmp directly in the find command. You have only one place, the cd, to change, if you want more actions to take place in this folder)
-maxdepth 1 and -mindepth 1 make sure that find only looks in the current directory and doesn't include . itself in the result
-type d looks only for directories
-printf '%f\n' prints only the found folder's name (plus a newline) for each hit.
Et voilà!
You can loop through all directories, including hidden directories (beginning with a dot), with:
for file in */ .*/ ; do echo "$file is a directory"; done
Note: using the list */ .*/ works in zsh only if at least one hidden directory exists in the folder. In bash it will also show . and ..
Another possibility for bash to include hidden directories would be to use:
shopt -s dotglob;
for file in */ ; do echo "$file is a directory"; done
If you want to exclude symlinks:
for file in */ ; do
if [[ -d "$file" && ! -L "$file" ]]; then
echo "$file is a directory";
fi;
done
To output only the trailing directory name (A,B,C as questioned) in each solution use this within the loops:
file="${file%/}" # strip trailing slash
file="${file##*/}" # strip path and leading slash
echo "$file is the directoryname without slashes"
Example (this also works with directories whose names contain spaces):
mkdir /tmp/A /tmp/B /tmp/C "/tmp/ dir with spaces"
for file in /tmp/*/ ; do file="${file%/}"; echo "${file##*/}"; done
Works with directories whose names contain spaces
Inspired by Sorpigal
while IFS= read -d $'\0' -r file ; do
echo "$file"; ls "$file" ;
done < <(find /path/to/dir/ -mindepth 1 -maxdepth 1 -type d -print0)
Original post (Does not work with spaces)
Inspired by Boldewyn: Example of loop with find command.
for D in $(find /path/to/dir/ -mindepth 1 -maxdepth 1 -type d) ; do
echo $D ;
done
find . -mindepth 1 -maxdepth 1 -type d -printf "%P\n"
The technique I use most often is find | xargs. For example, if you want to make every file in this directory and all of its subdirectories world-readable, you can do:
find . -type f -print0 | xargs -0 chmod go+r
find . -type d -print0 | xargs -0 chmod go+rx
The -print0 option terminates each file name with a NUL character instead of a newline. The -0 option makes xargs split its input the same way. So this is the combination to use on files with spaces in their names.
You can picture this chain of commands as taking every line output by find and sticking it on the end of a chmod command.
If the command you want to run takes its argument in the middle instead of at the end, you have to be a bit creative. For instance, I needed to change into every subdirectory and run the command latexmk -c. So I used (from Wikipedia):
find . -mindepth 1 -maxdepth 1 -type d -print0 | \
xargs -0 bash -c 'for dir; do pushd "$dir" && latexmk -c && popd; done' fnord
This has the effect of for dir in $(subdirs); do stuff; done, but is safe for directories with spaces in their names. Also, the separate calls to stuff are made in the same shell, which is why in my command we have to return to the current directory with popd.
A minimal bash loop you can build on (based on ghostdog74's answer):
for dir in directory/*
do
echo "${dir}"
done
To zip a whole bunch of files by directory:
for dir in directory/*
do
zip -r "${dir##*/}" "${dir}"
done
If you want to execute multiple commands in a for loop, you can save the result of find with mapfile (bash >= 4) as a variable and go through the array with "${dirlist[@]}". It also works with directories containing spaces.
The find command is based on the answer by Boldewyn. Further information about the find command can be found there.
mapfile -t dirlist < <( find . -maxdepth 1 -mindepth 1 -type d -printf '%f\n' )
# Quoting the expansion keeps names with spaces intact without changing IFS
for dir in "${dirlist[@]}"; do
echo ">${dir}<"
# more commands can go here ...
done
TL;DR:
(cd /tmp; for d in */; do echo "${d%/}"; done)
Explanation.
There's no need to use external programs. What you need is a shell globbing pattern. To avoid the need of removing /tmp afterward, I'm running it in a subshell, which may or may not be suitable for your purposes.
Shell globbing patterns in a nutshell:
* Matches any string, including the empty string.
? Matches exactly one character.
[...] Matches one of the characters between the brackets. You can also specify ranges ([a-z], [A-F0-9], etc.) or classes ([[:digit:]], [[:alpha:]], etc.).
[^...] Matches one character that is not between the brackets.
If no file names match the pattern, the shell returns the pattern unchanged. Any character or string that is not one of the above represents itself.
Consequently, the pattern */ will match any file name that ends with a /. A trailing / in a file name unambiguously identifies a directory.
The last bit is removing the trailing slash, which is achieved with the variable substitution ${var%PATTERN}, which removes the shortest matching pattern from the end of the string contained in var, and where PATTERN is any valid globbing pattern. So we write ${d%/}, meaning we want to remove the trailing slash from the string represented by d.
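One consequence of the shell returning an unmatched pattern unchanged is that the loop body runs once with the literal */ when there are no subdirectories at all. A small guard takes care of that. Here is a sketch under that assumption, with the made-up function name list_subdirs:

```shell
#!/bin/sh
# list_subdirs DIR  (hypothetical helper name)
# Prints the name of each non-hidden subdirectory of DIR, one per line.
list_subdirs() {
    for d in "$1"/*/; do
        # With no subdirectories the loop still runs once, with d set to
        # the literal pattern "DIR/*/", so skip anything that is not a directory.
        [ -d "$d" ] || continue
        d=${d%/}            # remove the trailing slash
        echo "${d##*/}"     # print everything after the final slash
    done
}
```

In bash, shopt -s nullglob is another way to get the same effect, since it makes an unmatched pattern expand to nothing.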
find . -maxdepth 1 -type d
In short, put the results of find into an array and iterate the array and do what you want. Not the quickest but more organized thinking.
#!/bin/bash
cd /tmp
declare -a results=(`find -type d`)
#Iterate the results
for path in "${results[@]}"
do
echo "Your path is $path"
#Do something with the path..
if [[ $path =~ "/A" ]]; then
echo $path | awk -F / '{print $NF}'
#prints A
elif [[ $path =~ "/B" ]]; then
echo $path | awk -F / '{print $NF}'
#Prints B
elif [[ $path =~ "/C" ]]; then
echo $path | awk -F / '{print $NF}'
#Prints C
fi
done
This can be reduced to:
find -type d | grep "/A" | awk -F / '{print $NF}' # prints A
find -type d | grep "/B" | awk -F / '{print $NF}' # prints B
find -type d | grep "/C" | awk -F / '{print $NF}' # prints C
