Bash script to find files in a list, copy them to dest, print files not found - linux

I would like to build on the answer I found here: Bash script to find specific files in a hierarchy of files
find $dir -name $name -exec scp {} $destination \;
I have a file with a list of file names and I need to find those files on a backup disk, then copy those files found to a destination folder, and lastly print the files that could not be found to a new file.
The last step would be helpful so that I wouldn't need to make another list of the copied files and then compare it with the original list.
If the script can make a list of the copied files, compare it with the original, and print the differences, that's exactly what's required, unless find itself can append to a file each time it fails to find a file.

Assuming that your list is separated by newlines, something like this should work:
#!/bin/bash
dir=someWhere
dest=someWhereElse
toCopyList=filesomewhere
notCopied=filesomewhereElse
while IFS= read -r line; do
find "$dir" -name "$line" -exec cp '{}' "$dest" \; -printf "%f\n"
done < "$toCopyList" > cpList
#sed -i 's#'$dir'/##' cpList
# I used # instead of / in sed to not confuse sed with / in $dir
# Also, I assumed the string in $dir does not end with a /
cat cpList "$toCopyList" | sort | uniq -c | sed -nr '/^ +1/s/^ +1 +(.*)/\1/p' > "$notCopied"
# Will not work if you give wild cards in your "toCopyList"
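An alternative for the comparison step, assuming neither list contains duplicate names: comm -23 prints the lines unique to its first input, so feeding it sorted copies of the two lists gives the not-copied set directly.
comm -23 <(sort "$toCopyList") <(sort cpList) > "$notCopied"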
Hope it helps

while IFS= read -r fname ; do
find /FROM/WHERE/TO/COPY/ \
-type f \
-name "$fname" \
-exec cp {} /DESTINATION/DIR/ \; 2>/dev/null
# find exits 0 even when nothing matches, so test its output instead
find /DESTINATION/DIR/ \
-type f \
-name "$fname" | grep -q . || \
echo "$fname"
done < FILESTOCOPY > MISSEDFILES
Will do.
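Since cp drops every match flat into /DESTINATION/DIR/, a plain existence test can stand in for the second find. A minimal sketch of the same loop:
while IFS= read -r fname; do
    find /FROM/WHERE/TO/COPY/ -type f -name "$fname" \
        -exec cp {} /DESTINATION/DIR/ \; 2>/dev/null
    # the copy lands directly under the destination, so test for it there
    [ -e "/DESTINATION/DIR/$fname" ] || echo "$fname"
done < FILESTOCOPY > MISSEDFILES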

Related

Linux Command Line - list all directories containing .js files, and copy the directories and their contents to a new folder

Here is the code I already have that finds and lists all directories containing .js files (excluding the node_modules directory).
find . -name '*.js*' -printf "%h\n" | sort -u | grep -v node_modules
As you can see, listing those directories is no problem. However, rather than list the directories, I would like to copy them (and their contents) to a new folder, preferably all in one line without running any kind of script.
Any help would be much appreciated!
The safest way to do this is to process the list of directories using NULL as the delimiter so that directories with spaces (and other odd characters) are handled correctly.
Remove the echo if the output looks correct.
"1-liner"
find "/path/to/tld" -path "*node_modules*" -prune -o -name "*.js" -printf "%h\0" | \
sort -uz | xargs -0 -I _ echo cp -a _ "/path/to/new/dir"
Bash Script
This requires Bash 4 for the associative array which will filter out duplicates.
#!/bin/bash
tld="/path/to/top/level/dir"
newdir="/path/to/new/dir"
unset dirHash;
declare -A dirHash
while read -r -d $'\0' dir; do
(( ! dirHash["$dir"]++ )) && echo cp -a "$dir" "$newdir"
done < <(find "$tld" -path "*node_modules*" -prune -o -name "*.js" -printf "%h\0")
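One caveat with either version: two source directories that share a basename will collide in the flat destination. GNU cp has a --parents flag that recreates the source path under the target instead. A sketch, assuming GNU coreutils:
find "$tld" -path "*node_modules*" -prune -o -name "*.js" -printf "%h\0" | \
sort -uz | xargs -0 -I _ cp -a --parents _ "$newdir"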

Need guidance with a bash script to check log files in a certain directory for a certain string

I would like to preface this by saying I am a complete noob with scripting. So I have a situation where I need to manually look for a phone number that could live in one of hundreds of files.
The logs live in the following directory.
/actlogs/sbclogger_archive
The log files are in directories numbered 01-31 inside that directory, and all the files are zipped.
Inside of those numbered directories are tons of files but the only ones I want to search are "sipd.logthenthedate.gz" and "sipmsg.logthenthedate.gz".
So I need to look in all the files in the following directory.
"/actlogs/sbclogger_archive"
Which has 31 directories labeled "01-31"
Then in each of 01-31 there are hundreds of files; the only ones I want to look at are "sipd.logthenthedate.gz" and "sipmsg.logthenthedate.gz".
The script I am using is below, please let me know what I could do to make this work.
#!/bin/bash
read -p "Enter a phone number: " text
read -p "Enter directory of log file's, Hint it should be /actlogs/sbclogger_archive: " directory
#arr=( $(find $directory -type f -exec grep -l "$text" {} \; | sort -r) )
#find $directory -type f -exec grep -qe "$text" {} \; -exec bash -c '
file=$(find $directory -type f -name 'sipd.log*' -exec grep -qe "$text" {} \; -exec bash -c 'select f; do echo $f; break; done' find-sh {} +;)
if [ -z "$file" ]; then
echo "No matches found."
else
echo "select tool:"
tools=("nano" "less" "vim" "quit")
select tool in "${tools[@]}"
do
case $tool in
"quit")
break
;;
*)
$tool $file
break
;;
esac
done
fi
This would give you the list of files matching:
find \( -name 'sipd.log[0-9]*.gz' -o -name 'sipmsg.log[0-9]*.gz' \) \
-exec sh -c 'gunzip -c "$1" | grep -m1 -q 888333' sh {} \; -print
./18/sipd.log20200118.gz
./7/sipd.log20200107.gz
Note: -m1 tells grep to stop after the first match; since you only need the file name in this case, that's enough.
If you have zgrep, you can shorten it to:
find \( -name 'sipd.log[0-9]*.gz' -o -name 'sipmsg.log[0-9]*.gz' \) \
-exec zgrep -l '888333' {} \;
./18/sipd.log20200118.gz
./7/sipd.log20200107.gz
Also, some of the tools you are suggesting do not support gzip files (nano and some variants of less for example). In which case you might need to decompress the file and compress it again when done.
And, you might want to consider a loop if you want to "quit". Feeding the file list to the tool doesn't make sense.
Note: AFAIK zgrep doesn't do recursive:
DESCRIPTION
Zgrep invokes grep on compressed or gzipped files. These grep options will cause zgrep to terminate with an error code: (-[drRzZ]|--di*|--exc*|--inc*|--rec*|--nu*). All other options specified are passed directly to grep. If no file is specified, then the standard input is decompressed if necessary and fed to grep. Otherwise the given files are uncompressed if necessary and fed to grep.
so zgrep -rl "$text" "$directory" or zgrep -rl --include 'sipd.log*.gz' "$text" {01..31} won't work unless you have a special zgrep.
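Combining find with zgrep works around the missing recursion. A sketch, assuming the same file name patterns as above:
find "$directory" \( -name 'sipd.log*.gz' -o -name 'sipmsg.log*.gz' \) \
-exec zgrep -l "$text" {} \;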
As you must unzip before using your tool, I would divide the problem into two parts.
First, I would expand the paths you need (looking under <directory> for the phone <text>), and then iterate to apply the tool (because some tools like vim or nano cannot be piped).
Try something like this:
#!/bin/bash
#...
# text/directory input stuff
#...
tmpdir=$(mktemp -d)
trap 'rm -rf ${tmpdir}' EXIT
while IFS= read -r file; do
unzipped=${tmpdir}/$(basename "${file}" .gz)
gunzip -c "${file}" > "${unzipped}"
${tool} "${unzipped}"
done < <(zgrep -lw "${text}" "${directory}"/{01..31}/{sipd.logthenthedate.gz,sipmsg.logthenthedate.gz} 2>/dev/null)
The above uses the process-substitution form proposed by Charles Duffy, following this Bash FAQ.
If you prefer to iterate an array, you could build in this way:
# shellcheck disable=SC2207
files=( $(zgrep -lw "${text}" "${directory}"/{01..31}/{sipd.logthenthedate.gz,sipmsg.logthenthedate.gz} 2>/dev/null) )
for file in "${files[@]}"; do
# etc.
since in our particular case the files to match have no spaces in their names, the shellcheck warning is not so important (suppressed above).
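With bash 4 or later, mapfile sidesteps the word-splitting caveat (and the shellcheck suppression) entirely. A sketch under the same assumptions:
mapfile -t files < <(zgrep -lw "${text}" "${directory}"/{01..31}/{sipd.logthenthedate.gz,sipmsg.logthenthedate.gz} 2>/dev/null)
for file in "${files[@]}"; do
    # etc.
done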
BRs

Create duplicate file and rename it

I want duplicates of the files with different name.
I am currently trying out these commands before putting them into my bash script.
$ dir=/somewhere/states
$ find $dir -name "total.txt" -type f | xargs ls -1
/somewhere/states/florida/fixed.fl_Asite_ttl/somewhere/total.txt
/somewhere/states/hawaii/fixed.hi_Bsite_ttl/somewhere/total.txt
/somewhere/states/kentucky/fixed.ky_Asite_ttl/somewhere/total.txt
/somewhere/states/michigan/fixed.mi_Csite_ttl/somewhere/total.txt
/somewhere/states/texas/fixed.tx_Vsite_ttl/somewhere/total.txt
I know I can rename file using something like this, but it isn't exactly what I want:
$ find $dir -name "total.txt" -exec sh -c 'cp {} `dirname {}`/`basename {} `why.xls' \;
/somewhere/states/florida/fixed.fl_Asite_ttl/somewhere/total.txtwhy.xls
/somewhere/states/hawaii/fixed.hi_Bsite_ttl/somewhere/total.txtwhy.xls
/somewhere/states/kentucky/fixed.ky_Asite_ttl/somewhere/total.txtwhy.xls
/somewhere/states/michigan/fixed.mi_Csite_ttl/somewhere/total.txtwhy.xls
/somewhere/states/texas/fixed.tx_Vsite_ttl/somewhere/total.txtwhy.xls
May I know how to copy the files and have the new files in the same dir?
Below are examples.
I want to name the new files with everything after "fixed." and before "/somewhere", and change the file extension as well:
/somewhere/states/florida/fixed.fl_Asite_ttl/somewhere/fl_Asite_ttl.xls
/somewhere/states/hawaii/fixed.hi_Bsite_ttl/somewhere/hi_Bsite_ttl.xls
/somewhere/states/kentucky/fixed.ky_Asite_ttl/somewhere/ky_Asite_ttl.xls
/somewhere/states/michigan/fixed.mi_Csite_ttl/somewhere/mi_Csite_ttl.xls
/somewhere/states/texas/fixed.tx_Vsite_ttl/somewhere/tx_Vsite_ttl.xls
Update:
/somewhere/states/florida_fixed_ttl/fixed.fl_Asite_ttl/somewhere/total.txt
Probably not the most elegant but this should work:
find . -name total.txt | while read F ; do
    [[ $F =~ fixed.[^/]* ]]
    N=$(echo $BASH_REMATCH | sed s/fixed\.//)
    echo "cp $F $(dirname $F)/$N.xls"
done
If you are happy with the output just remove the last echo, i.e. this:
echo "cp $F $(dirname $F)/$N.xls"
to this:
cp "$F" "$(dirname $F)/$N.xls"
Note, if the .txt and .xls contents will always remain the same you can use ln instead of cp -- one file, two names.
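For what it's worth, the new name can also be carved out with parameter expansion alone, no regex or sed needed. A sketch, assuming exactly one "fixed." component per path (drop the echo once the output looks right):
find . -name total.txt | while IFS= read -r F; do
    N=${F#*fixed.}       # drop everything through "fixed."
    N=${N%%/*}           # keep the component up to the next slash
    echo cp "$F" "${F%/*}/$N.xls"
done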

Recursively prepend text to file names

I want to prepend text to the name of every file of a certain type - in this case .txt files - located in the current directory or a sub-directory.
I have tried:
find -L . -type f -name "*.txt" -exec mv "{}" "PrependedTextHere{}" \;
The problem with this is dealing with the ./ part of the path that comes with the {} reference.
Any help or alternative approaches appreciated.
You can do something like this
find -L . -type f -name "*.txt" -exec bash -c 'echo "$0" "${0%/*}/PrependedTextHere${0##*/}"' {} \;
Where
bash -c '...' executes the command
$0 is the first argument passed in, in this case {} -- the full filename
${0%/*} removes the last / and everything after it, leaving the directory part
${0##*/} removes everything up to and including the last /, leaving the bare file name
Replace the echo with a mv once you're satisfied it's working.
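For example, with a hypothetical path (using a shell variable in place of $0):
$ f=./docs/notes.txt
$ echo "${f%/*}/PrependedTextHere${f##*/}"
./docs/PrependedTextHerenotes.txt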
Are you just trying to move the files to a new file name that has Prepend before it?
for F in *.txt; do mv "$F" Prepend"$F"; done
Or do you want it to handle subdirectories and prepend between the directory and file name:
dir1/PrependA.txt
dir2/PrependB.txt
Here's a quick shot at it. Let me know if it helps.
for file in $(find -L . -type f -name "*.txt")
do
parent=$(echo $file | sed "s=\(.*/\).*=\1=")
name=$(echo $file | sed "s=.*/\(.*\)=\1=")
mv "$file" "${parent}PrependedTextHere${name}"
done
This ought to work, as long as file names do not contain newline characters. In that case, make find use -print0 and read the names NUL-delimited instead.
#!/bin/sh
IFS='
'
for I in $(find -L . -name '*.txt' -print); do
echo mv "$I" "${I%/*}/prepend-${I##*/}"
done
P.S. Remove the echo to make the script take effect; it's there to avoid accidental breakage for people who randomly copy-paste stuff from here into their shell.
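For reference, the newline-safe variant mentioned above could look like this (bash rather than plain sh, since read -d '' is a bashism):
find -L . -name '*.txt' -print0 |
while IFS= read -r -d '' I; do
    echo mv "$I" "${I%/*}/prepend-${I##*/}"
done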

list the file and its base directory

I have some files in my folder /home/sample/**/*.pdf and *.doc and *.xls etc. ('**' means some sub-sub-directory).
I need a shell script or Linux command to list the files in the following manner:
pdf_docs/xx.pdf
documents/xx.doc
excel/xx.xls
pdf_docs, documents and excel are directories, which are located at various depths in /home/sample, like:
/home/sample/12091/pdf_docs/xx.pdf
/home/sample/documents/xx.doc
/home/excel/V2hm/1001/excel/xx.xls
You can try this:
for i in {*.pdf,*.doc,*.xls}; do find /home/sample/ -name "$i"; done | awk -F/ '{print $(NF-1) "/" $NF}'
I've added a line of awk which will print only the last 2 fields (separated by '/') of the result.
Something like this?
for i in {*.pdf,*.doc,*.xls}; do
find /home/sample/ -name "$i";
done | perl -lnwe '/([^\/]+\/[^\/]+)$/&&print $1'
How about this?
find /home/sample -type f -regex '^.*\.\(pdf\|doc\|xls\)$'
Takes into account spaces in file names and potential upper case in the extension:
for a in {*.pdf,*.doc,*.xls}; do find /home/sample/ -type f -iname "$a" -exec basename {} \; ; done
EDIT
Edited to take into account only files
You don't need to call out to an external program to chop the pathname like you're looking for:
$ filename=/home/sample/12091/pdf_docs/xx.pdf
$ echo ${filename%/*/*}
/home/sample/12091
$ echo ${filename#${filename%/*/*}?}
pdf_docs/xx.pdf
So,
# group the -name tests so -print0 applies to all three, not just the last
find /home/sample \( -name \*.doc -o -name \*.pdf -o -name \*.xls \) -print0 |
while IFS= read -r -d '' pathname; do
echo "${pathname#${pathname%/*/*}?}"
done
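Applied to the question's example paths, the expansion yields:
pdf_docs/xx.pdf
documents/xx.doc
excel/xx.xls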
