Find pattern of the file, create a folder with that pattern and copy the files to that folder - Bash script - linux

I have a task, to find the pattern of the file, create a folder with the pattern name and copy the file to that folder. I am able to create the folders.
folders=`find /Location -type f -name "*.pdf" -printf "%f\n" | cut -f 1 -d '_' | sort -u`
for i in $folders
do
mkdir -p /LocationToCreateTheFolder/$i
done
I can't work out how to copy the files into those folders.

Maybe try this:
for i in $folders; do mkdir -p "/LocationToCreateTheFolder/$i" && cp /Location/"$i"_*.pdf "/LocationToCreateTheFolder/$i/"; done

This will do the finding and the copying:
find Location -type f -name '*.pdf' -exec bash -c 'f=${1##*/}; d="LocationToCreateTheFolder/${f%%_*}"; mkdir -p "$d" && cp "$1" "$d"' None {} \;
This is safe for difficult file names, even ones that contain spaces, tabs, or newlines.
How it works
find Location -type f -name '*.pdf' -exec bash -c '...' None {} \;
This will find the pdf files under directory Location and, for each one found, the bash commands inside '...' will be executed with $1 set to the name of the file found. ($0 is set to None. We don't use $0.)
f=${1##*/}
This removes the directory names from the name of the file. This is an example of prefix removal: everything in $1 up to and including the last / is removed.
d="LocationToCreateTheFolder/${f%%_*}"
This creates the name of the directory to which we want to send the file.
${f%%_*} is an example of suffix removal. Everything in $f from the first _ onward is removed.
mkdir -p "$d" && cp "$1" "$d"
This makes sure that the directory exists and then copies the file to it.
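The two expansions can be tried on a literal string without touching any files. This is only a minimal sketch; the sample path is made up for illustration and does not come from the question:

```shell
# Hypothetical sample path; note the space in the directory name.
path='Location/sub dir/invoice_2021_01.pdf'

f=${path##*/}      # remove the longest prefix matching */ : file name only
prefix=${f%%_*}    # remove the longest suffix matching _* : text before the first _

printf '%s\n' "$f"        # invoice_2021_01.pdf
printf '%s\n' "$prefix"   # invoice
```

A name with no underscore is left unchanged by ${f%%_*}, so a file called report.pdf would end up in a folder called report.pdf.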

Related

Moving files with a pattern in their name to a folder with the same pattern as its name

My directory contains mix of hundreds of files and directories similar to this:
508471/
ae_lstm__ts_ 508471_detected_anomalies.pdf
ae_lstm__508471_prediction_result.pdf
mlp_508471_prediction_result.pdf
mlp__ts_508471_detected_anomalies.pdf
vanilla_lstm_508471_prediction_result.pdf
vanilla_lstm_ts_508471_detected_anomalies.pdf
598690/
ae_lstm__ts_598690_detected_anomalies.pdf
ae_lstm__598690_prediction_result.pdf
mlp_598690_prediction_result.pdf
mlp__ts_598690_detected_anomalies.pdf
vanilla_lstm_598690_prediction_result.pdf
vanilla_lstm_ts_598690_detected_anomalies.pdf
There are folders with an ID number as their names, like 508471 and 598690.
In the same path as these folders, there are pdf files that have this ID number as part of their name. I need to move all the pdf files with the same ID in their name, to their related directories.
I tried the following shell script but it doesn't do anything. What am I doing wrong?
I'm trying to loop over all the directories, find the files that have id in their name, and move them to the same dir:
for f in ls -d */; do
id=${f%?} # f value is '598690/', I'm removing the last character, `\`, to get only the id part
find . -maxdepth 1 -type f -iname *.pdf -exec grep $id {} \; -exec mv -i {} $f \;
done
#!/bin/sh
find . -mindepth 1 -maxdepth 1 -type d -exec sh -c '
for d in "$@"; do
id=${d#./}
for file in *"$id"*.pdf; do
[ -f "$file" ] && mv -- "$file" "$d"
done
done
' findshell {} +
This finds every directory inside the current one (finding, for example, ./598690). Then, it removes ./ from the relative path and selects each file that contains the resulting id (598690), moving it to the corresponding directory.
If you are unsure of what this will do, put an echo between && and mv; the script will then list the mv commands it would run instead of running them.
And remember, do not parse ls.
The below code should do the required job.
for dir in */; do find . -mindepth 1 -maxdepth 1 -type f -name "*${dir%*/}*.pdf" -exec mv {} ${dir}/ \;; done
Here */ matches only the directories in the current directory, find searches only the files in that directory whose names match *${dir%*/}*.pdf (that is, file names containing the directory name as a substring), and finally mv moves the matching files into the directory.
On Unix, use the command below:
find . -name '*508471*' -exec bash -c 'echo mv $0 ${0/508471/598690}' {} \;
You may use this for loop from the parent directory of these pdf files and directories:
for d in */; do
compgen -G "*${d%/}*.pdf" >/dev/null && mv *"${d%/}"*.pdf "$d"
done
compgen -G is used to check if there is a match for given glob or not.
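compgen is a bash builtin. Where only a POSIX sh is available, a similar check can be sketched by expanding the glob into the positional parameters and testing the first result. The function name has_match and the demo file names are assumptions for illustration:

```shell
# has_match ID: succeed if at least one *ID*.pdf exists in the current directory.
has_match() {
    set -- *"$1"*.pdf    # an unmatched glob stays as the literal pattern
    [ -e "$1" ]          # so check that the first word is a real file
}

demo=$(mktemp -d)        # throwaway directory for the demonstration
cd "$demo" || exit 1
mkdir 508471 598690
touch mlp_508471_prediction_result.pdf

moved=""
for d in */; do
    id=${d%/}
    if has_match "$id"; then
        mv -- *"$id"*.pdf "$d"
        moved="$moved $id"
    fi
done
printf 'moved files for:%s\n' "$moved"
```

Only 508471 has a matching pdf in this demo, so the loop leaves 598690 alone and mv is never called with an unexpanded pattern.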

Getting all files from various folders and copying them with unique names

Currently using this command to get all my "fanart" from my TV folder, and dump it into a single folder.
find /volume1/tv/ -type f \( -name '*fanart.jpg'* -o -path '*/fanart/*.jpg' -o -path '*/extrafanart/*.jpg' \) -exec cp {} /volume1/tv/_FANART \;
Here's the issue: a lot of these files have the same name, and can't be dumped into the same folder. Example:
Folder A
fanart.jpg
Folder B
fanart.jpg
Is there a way to copy these files from their respective folders and give them a unique name in the destination folder? Name needn't be anything descriptive, random is just fine.
Thanks!
find /volume1/tv/ -type f \( -name '*fanart.jpg'* -o -path '*/fanart/*.jpg' -o -path '*/extrafanart/*.jpg' \) -exec cp --backup=numbered {} /volume1/tv/_FANART \;
..
cp --backup=numbered {}
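If a file of the same name already exists in the destination, cp does not overwrite it; the existing copy is first renamed to a numbered backup such as fanart.jpg.~1~.
Some file managers hide backup names ending in ~. Press Ctrl+H to show hidden files.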
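The effect is easy to see in a scratch directory. A small sketch, assuming GNU cp (the --backup option is not available in every cp); the directory names A, B and dest are placeholders:

```shell
# Throwaway directories standing in for "Folder A", "Folder B" and the destination.
demo=$(mktemp -d)
mkdir "$demo/A" "$demo/B" "$demo/dest"
echo a > "$demo/A/fanart.jpg"
echo b > "$demo/B/fanart.jpg"

# The second copy finds fanart.jpg already present and renames it to a numbered backup first.
cp --backup=numbered "$demo/A/fanart.jpg" "$demo/dest"
cp --backup=numbered "$demo/B/fanart.jpg" "$demo/dest"

ls -A "$demo/dest"
```

The most recent copy keeps the plain name and earlier ones get a .~N~ suffix, so the backups no longer end in .jpg.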
You could copy the files while giving them names according to their locations in the original directory tree. For instance (":" is legal but unusual in filenames), your "find" command could call a shell script (rather than "cp" directly), which might look like this:
#!/bin/sh
case "x$1" in
x/volume1/tv/_FANART/*)
;;
*)
target=`echo "$1" | sed -e 's,^/volume1/tv/,,' -e s,/,:,g`
cp "$1" "$2/$target"
;;
esac
and the corresponding "-exec" would be
-exec myscript "{}" /volume1/tv/_FANART \;
By the way, the source/destination on the original example are in the same directory tree "/volume1/tv", which is why the sample script uses a case statement - to exclude files already copied to the _FANART folder.
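To see what name the script would produce, the sed step can be run on its own. A quick sketch; the show and folder names in the sample path are invented:

```shell
# Hypothetical source path under the tree from the question.
src='/volume1/tv/Some Show/extrafanart/fanart.jpg'

# Same two sed expressions as the script: drop the leading root, then turn every / into :
target=$(printf '%s\n' "$src" | sed -e 's,^/volume1/tv/,,' -e 's,/,:,g')

printf '%s\n' "$target"   # Some Show:extrafanart:fanart.jpg
```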
If you want to use the md5sum as the new name:
find /volume1/tv/ -type d -path '/volume1/tv/_FANART' -prune -o -type f \( -name '*fanart.jpg'* -o -path '*/fanart/*.jpg' -o -path '*/extrafanart/*.jpg' \) -exec sh -c 'md5=$(md5sum < "$0") && md5=${md5%% *}.jpg && echo cp "$0" "/volume1/tv/_FANART/$md5"' {} \;
Everything happens in the sh command (the commands are chained with &&, which I omitted below for clarity):
md5=$(md5sum < "$0")
md5=${md5%% *}.jpg
cp "$0" "/volume1/tv/_FANART/$md5"
$0 expands to the name of the file being processed. We first compute the md5sum of the file, then keep only the hash (when reading from standard input, md5sum prints a hyphen after the hash), append .jpg to it, and finally copy the file into the target folder under the computed name.
Notes.
I added
-type d -path '/volume1/tv/_FANART' -prune -o
to your command to omit this folder, since you very likely don't want to process it; it would actually be weird to process it, as its content is changed throughout find's traversal.
I left an echo in the command, so that absolutely nothing is copied (as is, it's 100% safe, you can just copy and paste it in your terminal): it only shows what commands are going to be performed (and you'll also see how fast/slow it is).
The command is 100% safe regarding funny filenames with spaces, newlines, globs, etc.
I used md5sum < file and not md5sum file, because if the filename contains special characters (like backslashes, newlines, etc.), md5sum (at least my version) prepends a backslash to the hash. Weird. By not giving a filename, we're safe: this won't happen.
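The trimming step can be checked on a known input. A minimal sketch, assuming GNU md5sum is installed; the string hello merely stands in for a file's contents:

```shell
# md5sum reading from standard input prints "<hash>  -"; keep only the part before the first space.
md5=$(printf 'hello' | md5sum)
name=${md5%% *}.jpg

printf '%s\n' "$md5"    # 5d41402abc4b2a76b9719d911017c592  -
printf '%s\n' "$name"   # 5d41402abc4b2a76b9719d911017c592.jpg
```

Because the name is derived from the contents, byte-identical images map to the same name and end up as a single file in the destination.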

unix bash find file directories with 2 explicit file extensions

I am trying to create a small bash script that looks through a directory containing hundreds of subdirectories. SOME of these subdirectories contain a textfile.txt and an htmlfile.html, where the names textfile and htmlfile are variable.
I only care about subdirectories that have both the .txt and the .html; all other subdirectories can be ignored.
I then want to list all the .html files and .txt files that are in the same subdirectory.
This seems like a pretty simple issue to solve, but I am at a loss. All I can get working is a line of code that lists every .txt or .html file, without regard to which subdirectory it is in. I am pretty new at bash scripting, so I can't go any further.
#!/bin/bash
files="$(find ~/file/ -type f -name '*.txt' -or -name '*.html')"
for file in $files
do
echo $file
done
The following find command checks every subdirectory and, if it has both html and txt files, lists all of them:
find . -type d -exec env d={} bash -c 'ls "$d"/*.html &>/dev/null && ls "$d"/*.txt &>/dev/null && ls "$d/"*.{html,txt}' \;
Explanation:
find . -type d
This looks for all subdirectories of the current directory.
-exec env d={} bash -c '...' \;
This sets the environment variable d to the value of the found subdirectory and then executes the bash command that is contained within the single quotes (see below).
ls "$d"/*.html &>/dev/null && ls "$d"/*.txt &>/dev/null && ls "$d/"*.{html,txt}
This is the bash command that is executed. It consists of three statements and-ed together. The first checks to see if directory d has any html files. If so, the second statement runs and it checks to see if there are any txt files. If so, the last statement is executed and it lists all html and txt files in the directory d.
This command is safe for all file and directory names containing spaces, tabs, or other difficult characters.
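The same "both extensions present" test can also be sketched in plain sh without calling ls for the check. This version only looks one level deep, unlike the recursive find above; the function name has_ext and the demo directory names are assumptions:

```shell
# has_ext DIR EXT: succeed if DIR contains at least one entry named *.EXT
has_ext() {
    set -- "$1"/*."$2"    # an unmatched glob stays as the literal pattern
    [ -e "$1" ]
}

demo=$(mktemp -d)
mkdir "$demo/both" "$demo/only html" "$demo/only_txt"
touch "$demo/both/page.html" "$demo/both/notes.txt"
touch "$demo/only html/page.html" "$demo/only_txt/notes.txt"

found=""
for d in "$demo"/*/; do
    d=${d%/}
    if has_ext "$d" html && has_ext "$d" txt; then
        found="$found${d##*/}"
        ls "$d"/*.html "$d"/*.txt    # list the files of the qualifying directory
    fi
done
```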
You could do it by searching recursively with the globstar option:
shopt -s globstar
for file in **; do
if [[ -d $file ]]; then
for sub_file in "$file"/*; do
case "$sub_file" in
*.html)
html=1;;
*.txt)
txt=1;;
esac
done
[[ $html && $txt ]] && echo "$file"
html=""
txt=""
fi
done
You can make use of -o
#!/bin/bash
files=$(find ~/file/ -type f \( -name '*.txt' -o -name '*.html' \))
for file in $files
do
echo $file
done
#!/bin/bash
#A quick peek into a dir to see if there's at least one file that matches pattern
dir_has_file() { dir="$1"; pattern="$2";
[ -n "$(find "$dir" -maxdepth 1 -type f -name "$pattern" -print -quit)" ]
}
#Assumes there are no newline characters in the filenames, but will behave correctly with subdirectories that match *.html or *.txt
find "$1" -type d |
while IFS= read -r d
do
dir_has_file "$d" '*.txt' &&
dir_has_file "$d" '*.html' &&
#Now print all the matching files (the parentheses make -type f apply to both names)
find "$d" -maxdepth 1 -type f \( -name '*.txt' -o -name '*.html' \)
done
This script takes the root directory to look into as the first argument ($1).
The test command is what you need to check for the existence of each file in each of the subdirs:
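find . -type d -exec sh -c 'if test -f "$1/$2" && test -f "$1/$3"; then ls "$1"/*.txt "$1"/*.html; fi' sh {} "$file1" "$file2" \;
where $file1 and $file2 are the names of the .txt and .html files you are looking for. The directory is passed to sh as an argument rather than pasted into the command string, so unusual directory names are handled safely.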

Recursively prepend text to file names

I want to prepend text to the name of every file of a certain type - in this case .txt files - located in the current directory or a sub-directory.
I have tried:
find -L . -type f -name "*.txt" -exec mv "{}" "PrependedTextHere{}" \;
The problem with this is dealing with the ./ part of the path that comes with the {} reference.
Any help or alternative approaches appreciated.
You can do something like this
find -L . -type f -name "*.txt" -exec bash -c 'echo "$0" "${0%/*}/PrependedTextHere${0##*/}"' {} \;
Where
bash -c '...' executes the command
$0 is the first argument passed in, in this case {} -- the full filename
${0%/*} removes everything including and after the last / in the filename
${0##*/} removes everything before and including the last / in the filename
Replace the echo with a mv once you're satisfied it's working.
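The two expansions can be checked on a literal path before involving find. A short sketch; the sample path is hypothetical:

```shell
p='./docs/my notes/todo.txt'    # hypothetical path, in the form find would pass it
dir=${p%/*}                     # ./docs/my notes
base=${p##*/}                   # todo.txt
new="$dir/PrependedTextHere$base"

printf '%s\n' "$new"   # ./docs/my notes/PrependedTextHeretodo.txt
```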
Are you just trying to move the files to a new file name that has Prepend before it?
for F in *.txt; do mv "$F" Prepend"$F"; done
Or do you want it to handle subdirectories and prepend between the directory and file name:
dir1/PrependA.txt
dir2/PrependB.txt
Here's a quick shot at it. Let me know if it helps.
for file in $(find -L . -type f -name "*.txt")
do
parent=$(echo $file | sed "s=\(.*/\).*=\1=")
name=$(echo $file | sed "s=.*/\(.*\)=\1=")
mv "$file" "${parent}PrependedTextHere${name}"
done
This ought to work as long as file names do not contain newline characters. If they might, have find use -print0 and read the names with a NUL delimiter instead.
#!/bin/sh
IFS='
'
for I in $(find -L . -name '*.txt' -print); do
echo mv "$I" "${I%/*}/prepend-${I##*/}"
done
p.s. Remove the echo to make the script effective, it's there to avoid accidental breakage for people who randomly copy paste stuff from here to their shell.
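If newlines in names are a concern, one option that avoids parsing find's output altogether is to let find hand the names to a small sh loop via -exec ... {} +. A sketch run against a throwaway directory; the prepend- prefix follows the script above, while the demo file names and the skip guard are additions for illustration:

```shell
demo=$(mktemp -d)
mkdir "$demo/sub"
nl='
'
touch "$demo/plain.txt" "$demo/sub/two${nl}lines.txt"

# find passes each path as a separate argument, so nothing is split on whitespace.
find -L "$demo" -type f -name '*.txt' -exec sh -c '
    for f in "$@"; do
        base=${f##*/}
        case $base in prepend-*) continue ;; esac   # already renamed, skip
        mv -- "$f" "${f%/*}/prepend-$base"
    done
' sh {} +
```

The case guard keeps a file from being prefixed twice if find happens to visit the new name during the same run.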

Include folder name in renaming a file in linux

I've already used that command to rename the files in multiple directories and change JPG to jpg, so I have consistency.
find . -name '*.JPG' -exec sh -c 'mv "$0" "${0%.JPG}.jpg"' {} \;
Do you have any idea how to change that to include the folder name in the name of the file
I am executing that in a folder that contains about 2000 folders (SKU's) or products ... and inside every SKU folder, there are 9 images. 1.jpg 2.jpg .... 9.jpg.
So the bottom-line is I have 2000 images with name 1.jpg, 2.jpg ... 9.jpg. I need those files to be unique, for example:
folder-name-1.jpg ... folder-name-2.jpg ... and so on, in every folder.
Any help will be appreciated.
For example I can do as follows:
$ find . -iname '*.jpg' | while read fn; do name=$(basename "$fn") ; dir=$(dirname "$fn") ; mv "$fn" "$dir/$(basename "$dir")-$name" ;done
./lib/bukovina/version.jpg ./lib/bukovina/bukovina-version.jpg
./lib/bukovina.jpg ./lib/lib-bukovina.jpg
You can use this find one-liner:
find . -name '*.jpg' -execdir \
bash -c 'd="${PWD##*/}"; [[ "$1" != "$d-"* ]] && mv "$1" "./$d-$1"' - '{}' \;
This command first checks whether the image name is already prefixed with the current directory name, and renames the file only if it is not. You can therefore run it multiple times; a file is not renamed again after the first run.
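The guard in that one-liner is a bash [[ ... ]] pattern test. The same idea can be sketched with a POSIX case statement; the function name prefix_once and the sample names are hypothetical:

```shell
# prefix_once DIR NAME: print NAME prefixed with "DIR-", unless it already is.
prefix_once() {
    case $2 in
        "$1"-*) printf '%s\n' "$2" ;;       # already prefixed: leave unchanged
        *)      printf '%s\n' "$1-$2" ;;    # add the directory name once
    esac
}

first=$(prefix_once sku123 1.jpg)
second=$(prefix_once sku123 "$first")

printf '%s\n' "$first"    # sku123-1.jpg
printf '%s\n' "$second"   # sku123-1.jpg
```

Applying the function to its own output changes nothing, which is what makes repeated runs harmless.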
To get the folder name of a file you can use $(basename "$(dirname "$FILE")"), where $FILE is a path that may be relative but must contain at least one folder before the file name. This should not be a problem with find. If it is, just run it from one directory up.
find . -name '*.jpg' -exec sh -c 'd=$(dirname "$0"); mv "$0" "$d/$(basename "$d")-$(basename "$0")"' {} \;
Or, if you have JPEGs in your current directory:
find ../<dirname> -name '*.jpg' -exec sh -c 'd=$(dirname "$0"); mv "$0" "$d/$(basename "$d")-$(basename "$0")"' {} \;

Resources