list the file and its base directory - linux

I have some files in my folder /home/sample/**/*.pdf, and *.doc and *.xls etc. ('**' means some sub-sub-directory).
I need a shell script or Linux command to list the files in the following manner:
pdf_docs/xx.pdf
documents/xx.doc
excel/xx.xls
pdf_docs, documents and excel are directories located at various depths in /home/sample, like:
/home/sample/12091/pdf_docs/xx.pdf
/home/sample/documents/xx.doc
/home/excel/V2hm/1001/excel/xx.xls

You can try this:
for i in '*.pdf' '*.doc' '*.xls'; do find /home/sample/ -name "$i"; done | awk -F/ '{print $(NF-1) "/" $NF}'
The patterns are quoted so the shell doesn't expand them against the current directory before find sees them. I've added a line of awk which prints only the last two fields (separated by '/') of each result.
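Alternatively (a sketch, assuming GNU find/awk; the tree below is invented for the demo), the loop can be collapsed into a single find call with the same awk post-processing:

```shell
tmp=$(mktemp -d)                        # throw-away sandbox
mkdir -p "$tmp/12091/pdf_docs" "$tmp/documents"
touch "$tmp/12091/pdf_docs/xx.pdf" "$tmp/documents/xx.doc"

# One find call for all three extensions; awk keeps the last two
# /-separated fields, i.e. parent_dir/file
find "$tmp" -type f \( -name '*.pdf' -o -name '*.doc' -o -name '*.xls' \) |
awk -F/ '{print $(NF-1) "/" $NF}' | sort
# prints: documents/xx.doc and pdf_docs/xx.pdf
```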

Something like this?
for i in '*.pdf' '*.doc' '*.xls'; do
find /home/sample/ -name "$i"
done | perl -lnwe '/([^\/]+\/[^\/]+)$/ && print $1'

How about this?
find /home/sample -type f -regex '^.*\.\(pdf\|doc\|xls\)$'
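This prints full paths; to reduce them to the parent_dir/file form the question asks for, a sed pass can strip everything before the last two components (a sketch on an invented tree):

```shell
tmp=$(mktemp -d)                         # throw-away sandbox, names invented
mkdir -p "$tmp/a/b/excel"
touch "$tmp/a/b/excel/xx.xls"

find "$tmp" -type f -regex '^.*\.\(pdf\|doc\|xls\)$' |
sed 's|.*/\([^/]*/[^/]*\)$|\1|'          # keep only parent_dir/file
# prints: excel/xx.xls
```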

This takes spaces in file names and upper-/lower-case extensions into account:
for a in '*.pdf' '*.doc' '*.xls'; do find /home/sample/ -type f -iname "$a" -exec basename {} \; ; done
EDIT
Edited to consider regular files only.

You don't need to call out to an external program to chop the pathname like you're looking for:
$ filename=/home/sample/12091/pdf_docs/xx.pdf
$ echo ${filename%/*/*}
/home/sample/12091
$ echo ${filename#${filename%/*/*}?}
pdf_docs/xx.pdf
So,
find /home/sample \( -name \*.doc -o -name \*.pdf -o -name \*.xls \) -print0 |
while IFS= read -r -d '' pathname; do
echo "${pathname#${pathname%/*/*}?}"
done
(The parentheses are needed so that -print0 applies to all three -name tests; without them, only the *.xls matches get printed.)
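A sandbox check of this approach (tree invented; the -name tests are grouped with parentheses so -print0 applies to all of them):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)                          # made-up sample tree
mkdir -p "$tmp/12091/pdf_docs"
touch "$tmp/12091/pdf_docs/xx.pdf"

find "$tmp" \( -name '*.pdf' -o -name '*.doc' -o -name '*.xls' \) -print0 |
while IFS= read -r -d '' pathname; do
    # ${pathname%/*/*} strips the last two components; the trailing ?
    # in the outer expansion then eats the joining slash
    echo "${pathname#"${pathname%/*/*}"?}"
done
# prints: pdf_docs/xx.pdf
```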


How to find a list of files that are of specific extension but do not contain certain characters in their file name?

I have a folder with files that have extensions, such as .txt, .sh and .out.
However, I want a list of files that have only .txt extension, with the file names not containing certain characters.
For example, the .txt files are named L-003_45.txt and so on, up to L-003_70.txt. Some files have the L-003 part changed to, let's say, L-004, creating duplicates of, say, file 45: both L-003_45.txt and L-004_45.txt exist. So I want a list of the text files that don't have 45 in their name.
How would I do that?
I tried with find and ls and succeeded but I would like to know how to do a for loop instead.
I tried:
for FILE in *.txt; do ls -I '*45.txt'; done
but it failed.
Would be grateful for the help!
Or you can use Bash's extended globbing:
#!/usr/bin/env bash
# Enables extended globbing
shopt -s extglob
# Prevents iterating the literal pattern if no file matches
shopt -s nullglob
# Iterates over files not having 45 or 57 just before .txt
for file in !(*@(45|57)).txt; do
printf '%s\n' "$file"
done
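A sandbox run of the extglob pattern (file names invented for the test):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d); cd "$tmp" || exit 1   # scratch dir
touch L-003_45.txt L-003_70.txt L-004_23.txt L-004_57.txt

shopt -s extglob nullglob
for file in !(*@(45|57)).txt; do        # names not ending in 45 or 57
    printf '%s\n' "$file"
done
# prints: L-003_70.txt and L-004_23.txt
```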
I would advise you to use the find command to find all files with the required extensions, and then filter out the ones with the "strange" characters. E.g., to find the files by extension:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out"
... and now, to hide the ones with "45" in the name:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out" | grep -v "45"
... and if you want neither "45" nor "56":
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out" | grep -v "45" | grep -v "56"
Explanation:
-o stands for OR
grep -v stands for "--invert-match" (not showing those results)
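The chained grep -v calls can also be collapsed into one extended regex; a quick sandbox check (file names invented):

```shell
tmp=$(mktemp -d); cd "$tmp" || exit 1   # throw-away sandbox
touch L-003_45.txt L-003_70.txt run_56.sh

# One extended regex replaces the chained grep -v calls
find . \( -name '*.txt' -o -name '*.sh' \) | grep -vE '45|56'
# prints: ./L-003_70.txt
```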
Setup:
$ touch L-004_23.txt L-003_45.txt L-004_45.txt L-003_70.txt
$ ls -1 L*txt
L-003_45.txt
L-003_70.txt
L-004_23.txt
L-004_45.txt
One idea using ! to negate a criteria:
$ find . -name "*.txt" ! -name "*_45.txt"
./L-003_70.txt
./L-004_23.txt
Feeding the find results to a while loop, e.g.:
while read -r file
do
echo "file: ${file}"
done < <(find . -name "*.txt" ! -name "*_45.txt")
This generates:
file: ./L-003_70.txt
file: ./L-004_23.txt
The proposed solution with extglob is a very good one. In case you need to exclude more than one pattern you can also test and continue. Example to exclude all *45.txt and *57.txt:
declare -a excludes=("45" "57")
for f in *.txt; do
for e in "${excludes[@]}"; do
[[ "$f" == *"$e.txt" ]] && continue 2
done
printf '%s\n' "$f"
done
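A quick sandbox run of this loop (file names invented):

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d); cd "$tmp" || exit 1   # scratch dir
touch L-003_45.txt L-003_70.txt L-004_57.txt

declare -a excludes=("45" "57")
for f in *.txt; do
    for e in "${excludes[@]}"; do
        [[ "$f" == *"$e.txt" ]] && continue 2   # skip excluded names
    done
    printf '%s\n' "$f"
done
# prints: L-003_70.txt
```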

Recursively prepend text to file names

I want to prepend text to the name of every file of a certain type - in this case .txt files - located in the current directory or a sub-directory.
I have tried:
find -L . -type f -name "*.txt" -exec mv "{}" "PrependedTextHere{}" \;
The problem with this is dealing with the ./ part of the path that comes with the {} reference.
Any help or alternative approaches appreciated.
You can do something like this
find -L . -type f -name "*.txt" -exec bash -c 'echo "$0" "${0%/*}/PrependedTextHere${0##*/}"' {} \;
Where
bash -c '...' executes the command
$0 is the first argument passed in, in this case {} -- the full filename
${0%/*} removes the last / and everything after it, leaving the directory part
${0##*/} removes everything up to and including the last /, leaving the file name
Replace the echo with a mv once you're satisfied it's working.
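For example, a sandbox run with mv substituted in (tree invented; an extra ! -name guard is added so already-renamed files are not matched again during the same traversal):

```shell
tmp=$(mktemp -d)                 # invented sample tree
mkdir -p "$tmp/dir1"
touch "$tmp/dir1/A.txt"

# The ! -name guard prevents re-matching files that were just renamed
find -L "$tmp" -type f -name '*.txt' ! -name 'PrependedTextHere*' \
    -exec bash -c 'mv "$0" "${0%/*}/PrependedTextHere${0##*/}"' {} \;

find "$tmp" -type f              # now shows .../dir1/PrependedTextHereA.txt
```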
Are you just trying to move the files to a new file name that has Prepend before it?
for F in *.txt; do mv "$F" Prepend"$F"; done
Or do you want it to handle subdirectories and prepend between the directory and file name:
dir1/PrependA.txt
dir2/PrependB.txt
Here's a quick shot at it. Let me know if it helps.
find -L . -type f -name "*.txt" |
while IFS= read -r file
do
parent=$(echo "$file" | sed "s=\(.*/\).*=\1=")
name=$(echo "$file" | sed "s=.*/\(.*\)=\1=")
mv "$file" "${parent}PrependedTextHere${name}"
done
This ought to work, as long as the file names do not contain newline characters. In that case, make find use -print0 and read with a null delimiter.
#!/bin/sh
IFS='
'
for I in $(find -L . -name '*.txt' -print); do
echo mv "$I" "${I%/*}/prepend-${I##*/}"
done
P.S. Remove the echo to make the script effective; it's there to avoid accidental breakage for people who copy-paste stuff from here straight into their shell.

From directories create files changing their ending

I have several directories with a pattern:
$find -name "*.out"
./trnascanse.out
./darn.out
./blast_rnaz.out
./erpin.out
./rnaspace_cli.out
./yass.out
./atypicalgc.out
./blast.out
./combine.out
./infernal.out
./ecoli.out
./athaliana.out
./yass_carnac.out
./rnammer.out
I can get the list into a file with find -name "*.out" > files, because I want to create, for each directory, a file ending with .ref instead of .out: trnascanse.ref, darn.ref, blast_rnaz.ref and so on.
I would say this is possible with some grep and touch, but I don't know how to do it. Any idea? Or is creating each one manually the only way (as I did with these directories)? Thanks
Here's one way:
for d in *.out ; do echo touch "${d%.out}.ref" ; done
The ${d%.out} expands $d and removes the trailing .out. Read about it in the bash man page.
If the output of the above one-liner looks OK, pipe it to sh, or remove the echo and re-run it.
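A quick sandbox check (recreating two of the question's entries, with the echo removed):

```shell
tmp=$(mktemp -d); cd "$tmp" || exit 1
mkdir darn.out erpin.out        # the question's entries are directories

for d in *.out; do touch "${d%.out}.ref"; done

ls -1                            # darn.out darn.ref erpin.out erpin.ref
```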
Use this:
find . -maxdepth 1 -type d -name '*.out' -exec bash -c 'mkdir "${1%.out}.ref"' _ {} \;
(The directory name is passed to bash as a positional argument; putting a command substitution around {} inside double quotes would be expanded by the outer shell before find ever ran.)

Bash script to find files in a list, copy them to dest, print files not found

I would like to build on the answer I found here: Bash script to find specific files in a hierarchy of files
find $dir -name $name -exec scp {} $destination \;
I have a file with a list of file names and I need to find those files on a backup disk, then copy those files found to a destination folder, and lastly print the files that could not be found to a new file.
the last step would be helpful so that I wouldn't need to make another list of files copied and then do a compare with original list.
If the script can then make a list of the copied files, and do a compare, then print the differences, then that's exactly what's required. Unless the shell process find can print to file each time it "Can't find" a file.
Assuming that your list is separated by newlines; something like this should work
#!/bin/bash
dir=someWhere
dest=someWhereElse
toCopyList=filesomewhere
notCopied=filesomewhereElse
while IFS= read -r line; do
find "$dir" -name "$line" -exec cp '{}' "$dest" \; -printf "%f\n"
done < "$toCopyList" > cpList
#sed -i 's#'$dir'/##' cpList
# I used # instead of / in sed so as not to confuse sed with the / in $dir
# Also, I assumed the string in $dir does not end with a /
cat cpList "$toCopyList" | sort | uniq -c | sed -nr '/^ +1/s/^ +1 +(.*)/\1/p' > "$notCopied"
# Will not work if you give wild cards in your "toCopyList"
Hope it helps
while IFS= read -r fname ; do
find /FROM/WHERE/TO/COPY/ \
-type f \
-name "$fname" \
-exec cp \{\} /DESTINATION/DIR/ \; 2>/dev/null
found=$(find /DESTINATION/DIR/ \
-type f \
-name "$fname")
[ -n "$found" ] || \
echo "$fname"
done < FILESTOCOPY > MISSEDFILES
(find exits 0 even when nothing matches, so its output, not its exit status, is tested.)
Will do.
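A compact variation of the same idea (a sketch, assuming file names contain no newlines; the layout and list below are invented): capture the first find's output and test it, instead of re-searching the destination:

```shell
tmp=$(mktemp -d)                         # invented layout and list
mkdir -p "$tmp/src/deep" "$tmp/dest"
touch "$tmp/src/deep/a.txt"
printf 'a.txt\nmissing.txt\n' > "$tmp/list"

while IFS= read -r fname; do
    # Note: cp breaks if a name matches more than one file
    found=$(find "$tmp/src" -type f -name "$fname")
    if [ -n "$found" ]; then
        cp "$found" "$tmp/dest/"
    else
        echo "$fname"
    fi
done < "$tmp/list" > "$tmp/missed"

cat "$tmp/missed"                        # -> missing.txt
```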

Linux - Replacing spaces in the file names

I have a number of files in a folder, and I want to replace every space character in all file names with underscores. How can I achieve this?
This should do it:
for file in *; do mv "$file" "$(echo "$file" | tr ' ' '_')"; done
I prefer to use the command 'rename', which takes Perl-style regexes:
rename "s/ /_/g" *
You can do a dry run with the -n flag:
rename -n "s/ /_/g" *
Use sh...
for i in *' '*; do mv "$i" "$(echo "$i" | sed -e 's/ /_/g')"; done
If you want to try this out before pulling the trigger just change mv to echo mv.
If you use bash:
for file in *; do mv "$file" "${file// /_}"; done
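A quick sandbox check (file names invented); looping over *' '* touches only the names that actually contain a space:

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d); cd "$tmp" || exit 1   # scratch dir
touch "my report.txt" "notes.txt"

# Quote both sides so names with spaces survive word splitting
for file in *\ *; do mv "$file" "${file// /_}"; done

ls -1                                   # my_report.txt  notes.txt
```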
What if you want to apply the replace task recursively? How would you do that?
Well, I just found the answer myself. Not the most elegant solution (it also tries to rename files that do not match the condition), but it works. (BTW, in my case I needed to replace the spaces with '%20', not with an underscore.)
#!/bin/bash
find . -type d | while IFS= read -r N
do
(
cd "$N"
if test "$?" = "0"
then
for file in *; do mv "$file" "${file// /%20}"; done
fi
)
done
Here is another solution:
ls | awk '{printf("\"%s\"\n", $0)}' | sed 'p; s/\ /_/g' | xargs -n2 mv
uses awk to add quotes around the name of the file
uses sed to replace spaces with underscores; prints the original name with quotes (from awk), then the substituted name
xargs takes 2 lines at a time and passes it to mv
Try something like this, assuming all of your files were .txt's:
for files in *.txt; do mv "$files" "$(echo "$files" | tr ' ' '_')"; done
Quote your variables:
for file in *; do echo mv "'$file'" "${file// /_}"; done
Remove the "echo" to do the actual rename.
To rename all the files with a .py extension use,
find . -iname "*.py" -type f | xargs -I% rename "s/ /_/g" "%"
Sample output,
$ find . -iname "*.py" -type f
./Sample File.py
./Sample/Sample File.py
$ find . -iname "*.py" -type f | xargs -I% rename "s/ /_/g" "%"
$ find . -iname "*.py" -type f
./Sample/Sample_File.py
./Sample_File.py
This will replace ' ' with '_' in every folder and file name recursively on Linux, with Python >= 3.5. Change path_to_your_folder to your path.
Only list files and folders:
python -c "import glob;[print(x) for x in glob.glob('path_to_your_folder/**', recursive=True)]"
Replace ' ' with '_' in every folder and file name
python -c "import os;import glob;[os.rename(x,x.replace(' ','_')) for x in glob.glob('path_to_your_folder/**', recursive=True)]"
With Python < 3.5, you can install glob2
pip install glob2
python -c "import os;import glob2;[os.rename(x,x.replace(' ','_')) for x in glob2.glob('path_to_your_folder/**')]"
The easiest way to replace a string (a space character in your case) with another string in file names on Linux is a small loop around mv. Note that sed -i would edit the files' contents, not their names, so sed is only used here to transform the name:
for f in *' '*; do mv "$f" "$(printf '%s' "$f" | sed 's/ /_/g')"; done
Hope this helps.
