How can I delete the directory with the highest number name? - linux

I have a directory containing sub-directories, some of whose names are numbers. Without looking, I don't know what the numbers are. How can I delete the sub-directory with the highest number name? I reckon the solution might sort the sub-directories into reverse order and select the first sub-directory that begins with a number but I don't know how to do that. Thank you for your help.

cd $yourdir #go to that dir
ls -q -p | #list all files directly in dir and make directories end with /
grep '^[0-9]*/$' | #select directories (end with /) whose names are made of numbers
sort -n | #sort numerically
tail -n1 | #select the last one (largest)
xargs -r rmdir #or rm -r if nonempty
I recommend running it first without the xargs -r rmdir or xargs -r rm -r part to make sure you're deleting the right thing.
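A minimal dry run of the same pipeline (assuming you are already inside the directory) prints the candidate instead of removing anything:

```shell
# Dry run: echo the rmdir command instead of executing it
ls -q -p | grep '^[0-9]*/$' | sort -n | tail -n1 | xargs -r echo rmdir
```

Once the printed name looks right, drop the echo to actually delete it.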

A pure Bash solution:
#!/bin/bash
shopt -s nullglob extglob
# Make an array of all the dir names that only contain digits
dirs=( +([[:digit:]])/ )
# If none found, exit
if ((${#dirs[@]}==0)); then
echo >&2 "No dirs found"
exit
fi
# Loop through all elements of array dirs, saving the greatest number
max=${dirs[0]%/}
for i in "${dirs[@]%/}"; do
((10#$max<10#$i)) && max=$i
done
# Finally, delete the dir with largest number found
echo rm -r "$max"
Note:
This will have unpredictable behavior when there are dirs with the same number written differently, e.g., 2 and 0002.
Will fail if the numbers overflow Bash's integer range.
Doesn't take negative numbers or non-integer numbers into account.
Remove the echo in the last line if you're happy with it.
To be run from within your directory.
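The 10# prefix in the ((10#$max<10#$i)) comparison forces base-10 interpretation; without it, Bash parses numbers with a leading zero as octal, so a directory named 08 would trigger an arithmetic error:

```shell
i=08
echo $(( 10#$i ))    # forced base-10: prints 8
echo $(( 10#0002 ))  # prints 2
# echo $(( i ))      # would fail: "08" is an invalid octal constant
```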

Let's make some directories to test the script:
mkdir test; cd test; mkdir $(seq 100)
Now
find -mindepth 1 -maxdepth 1 -type d | cut -c 3- | sort -k1n | tail -n 1 | xargs -r echo rm -r
Result:
rm -r 100
Now, remove the word echo from the command and xargs will execute rm -r 100.

LINUX Copy the name of the newest folder and paste it in a command [duplicate]

I would like to find the newest sub directory in a directory and save the result to variable in bash.
Something like this:
ls -t /backups | head -1 > $BACKUPDIR
Can anyone help?
BACKUPDIR=$(ls -td /backups/*/ | head -1)
$(...) evaluates the statement in a subshell and returns the output.
There is a simple solution to this using only ls:
BACKUPDIR=$(ls -td /backups/*/ | head -1)
-t orders by time (latest first)
-d only lists items from this folder
*/ only lists directories
head -1 returns the first item
I didn't know about */ until I found Listing only directories using ls in bash: An examination.
This ia a pure Bash solution:
topdir=/backups
BACKUPDIR=
# Handle subdirectories beginning with '.', and empty $topdir
shopt -s dotglob nullglob
for file in "$topdir"/* ; do
[[ -L $file || ! -d $file ]] && continue
[[ -z $BACKUPDIR || $file -nt $BACKUPDIR ]] && BACKUPDIR=$file
done
printf 'BACKUPDIR=%q\n' "$BACKUPDIR"
It skips symlinks, including symlinks to directories, which may or may not be the right thing to do. It skips other non-directories. It handles directories whose names contain any characters, including newlines and leading dots.
Well, I think this solution is the most efficient:
path="/my/dir/structure/*"
backupdir=$(find $path -type d -prune | tail -n 1)
Explanation why this is a little better:
We do not need sub-shells (aside from the one for getting the result into the bash variable).
We do not need a useless -exec ls -d at the end of the find command, it already prints the directory listing.
We can easily alter this, e.g. to exclude certain patterns. For example, if you want the second newest directory, because backup files are first written to a tmp dir in the same path:
backupdir=$(find $path -type d -prune -not -name "*temp_dir" | tail -n 1)
The above solution doesn't take into account things like files being written and removed from the directory resulting in the upper directory being returned instead of the newest subdirectory.
The other issue is that this solution assumes that the directory only contains other directories and not files being written.
Let's say I create a file called "test.txt" and then run this command again:
echo "test" > test.txt
ls -t /backups | head -1
test.txt
The result is test.txt showing up instead of the last modified directory.
The proposed solution "works" but only in the best case scenario.
Assuming you have a maximum of 1 directory depth, a better solution is to use:
find /backups/* -type d -prune -exec ls -d {} \; |tail -1
Just swap the "/backups/" portion for your actual path.
If you want to avoid showing an absolute path in a bash script, you could always use something like this:
LOCALPATH=/backups
DIRECTORY=$(cd $LOCALPATH; find * -type d -prune -exec ls -d {} \; |tail -1)
With GNU find you can get a list of directories with modification timestamps, sort that list, and output the newest:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T#\t%p\0" | sort -z -n | cut -z -f2- | tail -z -n1
or newline separated
find . -mindepth 1 -maxdepth 1 -type d -printf "%T#\t%p\n" | sort -n | cut -f2- | tail -n1
With POSIX find (which does not have -printf) you may, if you have it, run stat to get the file modification timestamp:
find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n | cut -d' ' -f2- | tail -n1
Without stat, a pure-shell solution may be used by replacing the Bash-only [[ with [, as in the pure Bash answer above.
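As a sketch of that pure-shell approach (an assumption here: [ supports the -nt test, an extension to strict POSIX that dash, bash, and ksh all provide; the function name newest_subdir is mine):

```shell
#!/bin/sh
# Pick the newest subdirectory of $1 without stat or [[.
# Assumes [ supports -nt (a common extension, not strict POSIX).
newest_subdir() {
  newest=
  for d in "$1"/*/; do
    [ -d "$d" ] || continue
    if [ -z "$newest" ] || [ "$d" -nt "$newest" ]; then
      newest=$d
    fi
  done
  printf '%s\n' "${newest%/}"
}
```

Usage: BACKUPDIR=$(newest_subdir /backups)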
Your "something like this" was almost a hit:
BACKUPDIR=$(ls -t ./backups | head -1)
Combining what you wrote with what I have learned solved my problem too. Thank you for raising this question.
Note: I run the line above from Git Bash within a Windows environment, in a file called ./something.bash.

How to copy the contents of a folder to multiple folders based on number of files?

I want to copy the files from a folder (named: 1) to multiple folders based on the number of files (here: 50).
The code given below works. I transferred all the files from the folder to the subfolders based on the number of files, and then copied all the files back to the initial folder.
However, I need something cleaner and more efficient. Apologies for the mess below, I'm a newbie.
bf=1 #breakfolder
cd 1 #the folder from where I wanna copy stuff, contains 179 files
files_exist=$(ls -1q * | wc -l) #assign the number of files in folder 1
#move 50 files from 1 to various subfolders
while [ $files_exist -gt 50 ]
do
mkdir ../CompiledPdfOutput/temp/1-$bf
set --
for f in .* *; do
[ "$#" -lt 50 ] || break
[ -f "$f" ] || continue
[ -L "$f" ] && continue
set -- "$@" "$f"
done
mv -- "$@" ../CompiledPdfOutput/temp/1-$bf/
files_exist=$(ls -1q * | wc -l)
bf=$(($bf + 1))
done
#move the rest of the files into one final subdir
mkdir ../CompiledPdfOutput/temp/1-$bf
set --
for f in .* *; do
[ "$#" -lt 50 ] || break
[ -f "$f" ] || continue
[ -L "$f" ] && continue
set -- "$@" "$f"
done
mv -- "$@" ../CompiledPdfOutput/temp/1-$bf/
#get out of 1
cd ..
# copy back the contents from subdir to 1
find CompiledPdfOutput/temp/ -exec cp {} 1 \;
The required directory structure is:
parent
├── 1                  (179)
└── CompiledPdfOutput
    └── temp
        ├── 1-1        (50)
        ├── 1-2        (50)
        ├── 1-3        (50)
        └── 1-4        (29)
The number inside "()" denotes the number of files.
BTW, the final step of my code gives this warning, would be glad if anyone can explain what's happening and a solution.
cp: -r not specified; omitting directory 'CompiledPdfOutput/temp/'
cp: -r not specified; omitting directory 'CompiledPdfOutput/temp/1-4'
cp: -r not specified; omitting directory 'CompiledPdfOutput/temp/1-3'
cp: -r not specified; omitting directory 'CompiledPdfOutput/temp/1-1'
cp: -r not specified; omitting directory 'CompiledPdfOutput/temp/1-2'
I don't want to copy the directories as well, just the files, so giving -r would be bad.
Assuming that you need something more compact/efficient, you can leverage existing tools (find, xargs) to create a pipeline, eliminating the need to program each step using bash.
The following will move the files into the split folders. It finds the files, groups them 50 to a line, uses awk to prepend the output folder, and moves the files. Solution not as elegant as the original one :-(
find 1 -type f |
xargs -L50 echo |
awk '{ print "CompiledOutput/temp/1-" NR, $0 }' |
xargs -L1 echo mv -t
As a side note, the current script moves the files from the '1' folder to the numbered folders, and then copies the files back to the original folder. Why not just copy the files to the numbered folders? You can use cp -p to preserve timestamps, if that's needed.
Supporting file names with newlines (and spaces)
A clarification to the question indicates a solution should work with file names containing embedded newlines (and white space). This requires a minor change: use the NUL character as separator.
# Count files, then compute the number of output folders (50 files per folder)
FILE_COUNT=$(find 1 -type f -print0 | grep -zc .)
DIR_COUNT=$(( (FILE_COUNT + 49) / 50 ))
# Remove previous tree, and create folders
OUT=CompiledOutput/temp
rm -rf $OUT
eval mkdir -p $OUT/1-{1..$DIR_COUNT}
# Process file, use NUL as separator
find 1 -type f -print0 |
awk -vRS="\0" -v"OUT=$OUT" 'NR%50 == 1 { printf "%s/1-%d%s",OUT,1+int(NR/50),RS } { printf "%s", ($0 RS) }' |
xargs -0 -L51 -t mv -t
I did limited testing with both spaces and newlines in the file names. Looks OK on my machine.
I find a couple of issues with the posted script:
The logic of copying maximum 50 files per folder is overcomplicated, and the code duplication of an entire loop is error-prone.
It reuses the $@ array of positional parameters for internal storage purposes. This variable was not intended for that; it would be better to use a new dedicated array.
Instead of moving files to sub-directories and then copying them back, it would be simpler to just copy them in the first step, without ever moving.
Parsing the output of ls is not recommended.
Consider this alternative, simpler logic:
Initialize an empty array to_copy, to keep files that should be copied
Initialize a folder counter, to use to compute the target folder
Loop over the source files
Apply filters as before (skip if not file)
Add file to to_copy
If to_copy contains the target number of files, then:
Create the target folder
Copy the files contained in to_copy
Reset the content of to_copy to empty
Increment folder_counter
If to_copy is not empty
Create the target folder
Copy the files contained in to_copy
Something like this:
#!/usr/bin/env bash
set -euo pipefail
distribute_to_folders() {
local src=$1
local target=$2
local max_files=$3
local to_copy=()
local folder_counter=1
for file in "$src"/* "$src"/.*; do
[ -f "$file" ] || continue
to_copy+=("$file")
if (( ${#to_copy[@]} == max_files )); then
mkdir -p "$target/$folder_counter"
cp -v "${to_copy[@]}" "$target/$folder_counter/"
to_copy=()
((++folder_counter))
fi
done
if (( ${#to_copy[@]} > 0 )); then
mkdir -p "$target/$folder_counter"
cp -v "${to_copy[@]}" "$target/$folder_counter/"
fi
}
distribute_to_folders "$@"
To distribute files in path/to/1 into directories of maximum 50 files under path/to/compiled-output, you can call this script with:
./distribute.sh path/to/1 path/to/compiled-output 50
BTW, the final step of my code gives this warning, would be glad if anyone can explain what's happening and a solution.
Sure. The command find CompiledPdfOutput/temp/ -exec cp {} 1 \; finds files and directories, and tries to copy them. When cp encounters a directory and the -r parameter is not specified, it issues the warning you saw. You could add a filter for files, with -type f. If there are not excessively many files then a simple shell glob will do the job:
cp -v CompiledPdfOutput/temp/*/* 1
This will copy files to multiple folders of fixed size. Change source, target, and folderSize as per your requirement. This also works with filenames containing special characters (e.g. 'file 131!@#$%^&*()_+-=;?').
source=1
target=CompiledPDFOutput/temp
folderSize=50
find $source -type f -printf "\"%p\"\0" \
| xargs -0 -L$folderSize \
| awk '{system("mkdir -p '$target'/1-" NR); printf "'$target'/1-" NR " %s\n", $0}' \
| xargs -L1 cp -t

Script for deleting multiple directories based on directory name using shell or bash script

I am trying to write a shell script which removed the directories and its contents based on directory name instead of last modified time.
I have following directories in /tmp/ location
2015-05-25
2015-05-26
2015-05-27
2015-05-28
2015-05-29
2015-05-30
2015-05-31
Now I would like to delete all the directories till 2015-05-29. Last modified date is same for all the directories.
Can any one please suggest?
A straightforward but not flexible way (in bash) is:
rm -rf 2015-05-{25..29}
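Brace expansion is performed by the shell before rm ever runs, so you can preview exactly which names the pattern expands to:

```shell
echo 2015-05-{25..29}
# → 2015-05-25 2015-05-26 2015-05-27 2015-05-28 2015-05-29
```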
A more flexible way would involve some coding:
ls -d ./2015-* | sort | sed '/2015-06-02/,$d' | xargs rm -r
Sort lexically all the directories that follow the name pattern 2015-*
Use 'sed' to delete all names from 2015-06-02 (inclusive) onward
Use 'xargs' to delete the remaining ones
A simple solution is:
rm -r /tmp/2015-05-2?
If you want to keep 2 folders, try:
ls -d ./2015-* | sort | head -n -2 | xargs echo
Replace -2 with the negative number of folders to keep. Replace echo with rm -r when the output looks correct.
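The negative count is a GNU coreutils extension: head -n -K prints everything except the last K lines, which is what makes the keep-N-folders trick work:

```shell
# All but the last two lines survive, so the two newest names are kept
printf '%s\n' 2015-05-25 2015-05-26 2015-05-27 2015-05-28 | head -n -2
# → prints 2015-05-25 and 2015-05-26
```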
The intention of this question is to find directories whose names indicate a timestamp. Therefore I'd propose to calculate those timestamps to decide whether to delete or not:
ref=$(date --date 2015-05-29 +%s)
for d in ????-??-??; do [[ $(date --date "$d" +%s) -le $ref ]] && rm -rf "$d"; done
ref is the reference date, the names of the other directories are compared to this one.

remove unused folder and copy needed

I'm trying to remove all Folder with
rm -R !(foo|bar|abc)
exclude the given folder names, that could be two or more.
That works fine!
In the next step I need to copy the needed missing folders from another directory into this folder.
I tried the following, but it doesn't work, and it should also be flexible with folder counts.
rm -R !($neededfolders)
ownedfolders=$(ls ./dest/) #
find ../source/ -maxdepth 1 -type d | grep "$neededfolders" | grep -v "$ownedfolders" | xargs cp -Rt ./dest/
My problem with the code is that grep won't match multiple names. I also tried declaring the owned folders as an array, setting the second grep to
grep -v ${ownedfolder[i]}
and putting the whole thing in a for loop, but this ended in failure.
Many thanks!
You can use a for loop:
needed='@(foo|bar|abc)'
for dir in ../source/*/ ; do
dir=${dir%/}
if [[ $dir == ../source/$needed && ! -d dest/${dir##*/} ]] ; then
cp -R "$dir" dest/
fi
done
This avoids the ugly variable $ownedfolders populated by the output of ls.
You need to use the -E option to grep to enable extended regular expressions, which recognize | alternatives:
find ../source/ -maxdepth 1 -type d | grep -E "$neededfolders" | grep -v -E "$ownedfolders" | xargs cp -Rt ./dest/

How to loop over directories in Linux?

I am writing a script in bash on Linux and need to go through all subdirectory names in a given directory. How can I loop through these directories (and skip regular files)?
For example:
the given directory is /tmp/
it has the following subdirectories: /tmp/A, /tmp/B, /tmp/C
I want to retrieve A, B, C.
All answers so far use find, so here's one with just the shell. No need for external tools in your case:
for dir in /tmp/*/ # list directories in the form "/tmp/dirname/"
do
dir=${dir%*/} # remove the trailing "/"
echo "${dir##*/}" # print everything after the final "/"
done
cd /tmp
find . -maxdepth 1 -mindepth 1 -type d -printf '%f\n'
A short explanation:
find finds files (quite obviously)
. is the current directory, which after the cd is /tmp (IMHO this is more flexible than having /tmp directly in the find command. You have only one place, the cd, to change, if you want more actions to take place in this folder)
-maxdepth 1 and -mindepth 1 make sure that find only looks in the current directory and doesn't include . itself in the result
-type d looks only for directories
-printf '%f\n' prints only the found folder's name (plus a newline) for each hit.
Et voilà!
You can loop through all directories, including hidden directories (beginning with a dot), with:
for file in */ .*/ ; do echo "$file is a directory"; done
note: using the list */ .*/ works in zsh only if there is at least one hidden directory in the folder. In bash it will also show . and ..
Another possibility for bash to include hidden directories would be to use:
shopt -s dotglob;
for file in */ ; do echo "$file is a directory"; done
If you want to exclude symlinks:
for file in */ ; do
if [[ -d "$file" && ! -L "$file" ]]; then
echo "$file is a directory";
fi;
done
To output only the trailing directory name (A,B,C as questioned) in each solution use this within the loops:
file="${file%/}" # strip trailing slash
file="${file##*/}" # strip path and leading slash
echo "$file is the directoryname without slashes"
Example (this also works with directories which contain spaces):
mkdir /tmp/A /tmp/B /tmp/C "/tmp/ dir with spaces"
for file in /tmp/*/ ; do file="${file%/}"; echo "${file##*/}"; done
Works with directories which contains spaces
Inspired by Sorpigal
while IFS= read -d $'\0' -r file ; do
echo "$file"; ls "$file" ;
done < <(find /path/to/dir/ -mindepth 1 -maxdepth 1 -type d -print0)
Original post (Does not work with spaces)
Inspired by Boldewyn: Example of loop with find command.
for D in $(find /path/to/dir/ -mindepth 1 -maxdepth 1 -type d) ; do
echo $D ;
done
find . -mindepth 1 -maxdepth 1 -type d -printf "%P\n"
The technique I use most often is find | xargs. For example, if you want to make every file in this directory and all of its subdirectories world-readable, you can do:
find . -type f -print0 | xargs -0 chmod go+r
find . -type d -print0 | xargs -0 chmod go+rx
The -print0 option terminates each name with a NUL character instead of a newline. The -0 option splits its input the same way. So this is the combination to use on files with spaces.
You can picture this chain of commands as taking every line output by find and sticking it on the end of a chmod command.
If the command you want to run needs its argument in the middle instead of at the end, you have to be a bit creative. For instance, I needed to change into every subdirectory and run the command latexmk -c. So I used (adapted from Wikipedia; bash is needed for pushd/popd):
find . -mindepth 1 -maxdepth 1 -type d -print0 | \
xargs -0 bash -c 'for dir; do pushd "$dir" && latexmk -c && popd; done' fnord
This has the effect of for dir in $(subdirs); do stuff; done, but is safe for directories with spaces in their names. Also, the separate calls to stuff are made in the same shell, which is why in my command we have to return to the current directory with popd.
A minimal bash loop you can build off of (based on ghostdog74's answer):
for dir in directory/*
do
echo "${dir}"
done
to zip a whole bunch of files by directory
for dir in directory/*
do
zip -r "${dir##*/}" "${dir}"
done
If you want to execute multiple commands in a for loop, you can save the result of find with mapfile (bash >= 4) as a variable and go through the array with "${dirlist[@]}". It also works with directories containing spaces.
The find command is based on the answer by Boldewyn. Further information about the find command can be found there.
mapfile -t dirlist < <( find . -maxdepth 1 -mindepth 1 -type d -printf '%f\n' )
for dir in "${dirlist[@]}"; do
echo ">${dir}<"
# more commands can go here ...
done
TL;DR:
(cd /tmp; for d in */; do echo "${d%/}"; done)
Explanation.
There's no need to use external programs. What you need is a shell globbing pattern. To avoid having to cd out of /tmp afterward, I'm running it in a subshell, which may or may not be suitable for your purposes.
Shell globbing patterns in a nutshell:
* Matches any string, including the empty string.
? Matches exactly one character.
[...] Matches one of the characters between the brackets. You can also specify ranges ([a-z], [A-F0-9], etc.) or classes ([[:digit:]], [[:alpha:]], etc.).
[!...] Matches one of the characters not between the brackets (Bash also accepts [^...]).
If no file names match the pattern, the shell will return the pattern unchanged. Any character or string that is not one of the above represents itself.
Consequently, the pattern */ will match any file name that ends with a /. A trailing / in a file name unambiguously identifies a directory.
The last bit is removing the trailing slash, which is achieved with the variable substitution ${var%PATTERN}, which removes the shortest matching pattern from the end of the string contained in var, and where PATTERN is any valid globbing pattern. So we write ${d%/}, meaning we want to remove the trailing slash from the string represented by d.
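Both expansions can be seen side by side on a hypothetical value produced by the */ glob:

```shell
d="/tmp/subdir/"   # hypothetical glob result
echo "${d%/}"      # shortest trailing match of "/" removed → /tmp/subdir
d=${d%/}
echo "${d##*/}"    # longest leading match of "*/" removed → subdir
```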
find . -maxdepth 1 -type d
In short, put the results of find into an array and iterate the array and do what you want. Not the quickest but more organized thinking.
#!/bin/bash
cd /tmp
declare -a results=( $(find -type d) )
#Iterate the results
for path in "${results[@]}"
do
echo "Your path is $path"
#Do something with the path..
if [[ $path =~ "/A" ]]; then
echo $path | awk -F / '{print $NF}'
#prints A
elif [[ $path =~ "/B" ]]; then
echo $path | awk -F / '{print $NF}'
#Prints B
elif [[ $path =~ "/C" ]]; then
echo $path | awk -F / '{print $NF}'
#Prints C
fi
done
This can be reduced to:
find -type d | grep "/A" | awk -F / '{print $NF}' #prints A
find -type d | grep "/B" | awk -F / '{print $NF}' #prints B
find -type d | grep "/C" | awk -F / '{print $NF}' #prints C
