Linux Command Line - list all directories containing .js files, and copy the directories and their contents to a new folder

Here is the code I already have that finds and lists all directories containing .js files (excluding the node_modules directory).
find . -name '*.js*' -printf "%h\n" | sort -u | grep -v node_modules
As you can see, listing those directories is no problem. However, rather than list the directories, I would like to copy them (and their contents) to a new folder, preferably all in one line without running any kind of script.
Any help would be much appreciated!

The safest way to do this is to process the list of directories using NULL as the delimiter so that directories with spaces (and other odd characters) are handled correctly.
Remove the echo if the output looks correct.
"1-liner"
find "/path/to/tld" -path "*node_modules*" -prune -o -name "*.js" -printf "%h\0" | \
sort -uz | xargs -0 -I _ echo cp -a _ "/path/to/new/dir"
Bash Script
This requires Bash 4 for the associative array, which filters out duplicates.
#!/bin/bash
tld="/path/to/top/level/dir"
newdir="/path/to/new/dir"
unset dirHash;
declare -A dirHash
while read -r -d $'\0' dir; do
(( ! dirHash["$dir"]++ )) && echo cp -a "$dir" "$newdir"
done < <(find "$tld" -path "*node_modules*" -prune -o -name "*.js" -printf "%h\0")
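Note that cp -a copies each directory by its basename, so two directories with the same name in different parts of the tree would collide in the target. A minimal variant of the one-liner that preserves each directory's relative path instead (a sketch, assuming GNU cp, which provides --parents; the target directory must already exist):
mkdir -p "/path/to/new/dir"
cd "/path/to/tld" && \
find . -path "*node_modules*" -prune -o -name "*.js" -printf "%h\0" | \
sort -uz | xargs -0 -I _ cp -a --parents _ "/path/to/new/dir"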

Related

How to find a list of files that are of specific extension but do not contain certain characters in their file name?

I have a folder with files that have extensions, such as .txt, .sh and .out.
However, I want a list of files that have only .txt extension, with the file names not containing certain characters.
For example, the .txt files are named L-003_45.txt and so on, all the way up to L-003_70.txt. Some files have a change in the L-003 part to, let's say, L-004, creating duplicates of, let's say, file 45, so that both L-003_45.txt and L-004_45.txt exist. So I want to get a list of text files that don't have 45 in their name.
How would I do that?
I tried with find and ls and succeeded, but I would like to know how to do it with a for loop instead.
I tried:
for FILE in *.txt; do ls -I '*45.txt'; done, but it failed.
Would be grateful for the help!
Or you could use Bash's extended globbing:
#!/usr/bin/env bash
# Enables extended globbing
shopt -s extglob
# Prevents iterating the literal pattern if no match is found
shopt -s nullglob
# Iterates over files not having 45 or 57 right before .txt
for file in !(*@(45|57)).txt; do
printf '%s\n' "$file"
done
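For instance, a quick hypothetical run in a scratch directory, using the question's sample names:
$ touch L-003_45.txt L-004_45.txt L-003_70.txt
$ shopt -s extglob nullglob
$ printf '%s\n' !(*@(45|57)).txt
L-003_70.txt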
I would advise you to use the find command to find all files with the required extensions, and later filter out the ones with the "strange" characters, e.g. for finding the file extensions:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out"
... and now, for not showing the ones with "45" in the name, you can do:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out" | grep -v "45"
... and if you don't want "45" nor "56", you can do:
find ./ -name "*.txt" -o -name "*.sh" -o -name "*.out" | grep -v "45" | grep -v "56"
Explanation:
-o stands for OR
grep -v stands for "--invert-match" (not showing those results)
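One caveat: find's implicit -a binds tighter than -o, so if you later combine these tests with other criteria (such as -type f), group the extension tests with \( ... \) so they stay together. A sketch:
find ./ \( -name "*.txt" -o -name "*.sh" -o -name "*.out" \) -type f | grep -v "45"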
Setup:
$ touch L-004_23.txt L-003_45.txt L-004_45.txt L-003_70.txt
$ ls -1 L*txt
L-003_45.txt
L-003_70.txt
L-004_23.txt
L-004_45.txt
One idea using ! to negate a criteria:
$ find . -name "*.txt" ! -name "*_45.txt"
./L-003_70.txt
./L-004_23.txt
Feeding the find results to a while loop, eg:
while read -r file
do
echo "file: ${file}"
done < <(find . -name "*.txt" ! -name "*_45.txt")
This generates:
file: ./L-003_70.txt
file: ./L-004_23.txt
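To exclude more than one pattern directly in find, the negated tests can simply be chained. A sketch, reusing the setup above (no _57 files exist there, so the output is unchanged):
$ find . -name "*.txt" ! -name "*_45.txt" ! -name "*_57.txt"
./L-003_70.txt
./L-004_23.txt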
The proposed solution with extglob is a very good one. In case you need to exclude more than one pattern you can also test and continue. Example to exclude all *45.txt and *57.txt:
declare -a excludes=("45" "57")
for f in *.txt; do
for e in "${excludes[@]}"; do
[[ "$f" == *"$e.txt" ]] && continue 2
done
printf '%s\n' "$f"
done

LINUX Copy the name of the newest folder and paste it in a command [duplicate]

I would like to find the newest subdirectory in a directory and save the result to a variable in bash.
Something like this:
ls -t /backups | head -1 > $BACKUPDIR
Can anyone help?
BACKUPDIR=$(ls -td /backups/*/ | head -1)
$(...) evaluates the statement in a subshell and returns the output.
There is a simple solution to this using only ls:
BACKUPDIR=$(ls -td /backups/*/ | head -1)
-t orders by time (latest first)
-d lists the directory entries themselves rather than their contents
*/ only lists directories
head -1 returns the first item
I didn't know about */ until I found Listing only directories using ls in bash: An examination.
This is a pure Bash solution:
topdir=/backups
BACKUPDIR=
# Handle subdirectories beginning with '.', and empty $topdir
shopt -s dotglob nullglob
for file in "$topdir"/* ; do
[[ -L $file || ! -d $file ]] && continue
[[ -z $BACKUPDIR || $file -nt $BACKUPDIR ]] && BACKUPDIR=$file
done
printf 'BACKUPDIR=%q\n' "$BACKUPDIR"
It skips symlinks, including symlinks to directories, which may or may not be the right thing to do. It skips other non-directories. It handles directories whose names contain any characters, including newlines and leading dots.
Well, I think this solution is the most efficient:
path="/my/dir/structure/*"
backupdir=$(find $path -type d -prune | tail -n 1)
Explanation of why this is a little better:
We do not need sub-shells (aside from the one for getting the result into the bash variable).
We do not need a useless -exec ls -d at the end of the find command; it already prints the directory listing.
We can easily alter this, e.g. to exclude certain patterns. For example, if you want the second newest directory, because backup files are first written to a tmp dir in the same path:
backupdir=$(find $path -type d -prune -not -name "*temp_dir" | tail -n 1)
The above solution doesn't take into account that files being written to and removed from the directory can result in a file, rather than the newest subdirectory, being returned. The other issue is that this solution assumes the directory contains only other directories and no files being written.
Let's say I create a file called "test.txt" and then run this command again:
echo "test" > test.txt
ls -t /backups | head -1
test.txt
The result is test.txt showing up instead of the last modified directory.
The proposed solution "works" but only in the best case scenario.
Assuming you have a maximum of 1 directory depth, a better solution is to use:
find /backups/* -type d -prune -exec ls -d {} \; |tail -1
Just swap the "/backups/" portion for your actual path.
If you want to avoid showing an absolute path in a bash script, you could always use something like this:
LOCALPATH=/backups
DIRECTORY=$(cd $LOCALPATH; find * -type d -prune -exec ls -d {} \; |tail -1)
With GNU find you can get a list of directories with modification timestamps, sort that list and output the newest:
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\0" | sort -z -n | cut -z -f2- | tail -z -n1
or newline separated
find . -mindepth 1 -maxdepth 1 -type d -printf "%T@\t%p\n" | sort -n | cut -f2- | tail -n1
With POSIX find (which does not have -printf) you may, if you have it, run stat to get the file modification timestamp:
find . -mindepth 1 -maxdepth 1 -type d -exec stat -c '%Y %n' {} \; | sort -n | cut -d' ' -f2- | tail -n1
Without stat, a pure shell solution may be used by replacing the [[ bash extension with [, as in this answer.
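For completeness, a rough POSIX sh sketch of that approach (assuming the shell's [ supports the common -nt extension, which POSIX leaves optional):
newest=
for d in ./*/; do
# skip the literal pattern when nothing matches
[ -d "$d" ] || continue
if [ -z "$newest" ] || [ "$d" -nt "$newest" ]; then
newest=$d
fi
done
printf '%s\n' "$newest"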
Your "something like this" was almost a hit:
BACKUPDIR=$(ls -t ./backups | head -1)
Combining what you wrote with what I have learned solved my problem too. Thank you for raising this question.
Note: I run the line above from Git Bash within a Windows environment, in a file called ./something.bash.

Find pattern of the file, create a folder with that pattern and copy the files to that folder - Bash script

I have a task: find the pattern of the file, create a folder with the pattern name, and copy the file to that folder. I am able to create the folders.
folders=`find /Location -type f -name "*.pdf" -printf "%f\n" | cut -f 1 -d '_' | sort -u`
for i in $folders
do
mkdir -p /LocationToCreateTheFolder/$i
done
Not able to go further on how to copy the files.
Maybe try:
for i in $folders; do mkdir -p /LocationToCreateTheFolder/"$i" && cp /Location/"$i"_*.pdf /LocationToCreateTheFolder/"$i"/; done
This will do the finding and the copying:
find Location -type f -name '*.pdf' -exec bash -c 'f=${1##*/}; d="LocationToCreateTheFolder/${f%%_*}"; mkdir -p "$d" && cp "$1" "$d"' None {} \;
This is safe for difficult file names, even ones that contain spaces, tabs, or newlines.
How it works
find Location -type f -name '*.pdf' -exec bash -c '...' None {} \;
This will find the pdf files under directory Location and, for each one found, the bash commands inside '...' will be executed with $1 set to the name of the file found. ($0 is set to None. We don't use $0.)
f=${1##*/}
This removes the directory names from the name of the file. This is an example of prefix removal: everything in $1 up to and including the last / is removed.
d="LocationToCreateTheFolder/${f%%_*}"
This creates the name of the directory to which we want to send the file.
${f%%_*}" is an example of suffix removal. Everything in $f from the first _ and after is removed.
mkdir -p "$d" && cp "$1" "$d"
This makes sure that the directory exists and then copies the file to it.
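A hypothetical walk-through of the two expansions, using an invented path Location/sub/report_2021.pdf:
$ set -- Location/sub/report_2021.pdf
$ f=${1##*/}; echo "$f"
report_2021.pdf
$ d="LocationToCreateTheFolder/${f%%_*}"; echo "$d"
LocationToCreateTheFolder/report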

unix bash find file directories with 2 explicit file extensions

I am trying to create a small bash script that essentially looks through a directory that includes hundreds of subdirectories. SOME of these subdirectories include a textfile.txt and an htmlfile.html, where the names textfile and htmlfile are variable.
I only really care about subdirectories that have both the .txt and the .html; all other subdirectories can be ignored.
I then want to list all the .html files and .txt files that are in the same subdirectory.
This seems like a pretty simple issue to solve, but I am at a loss. All I can really get working is a line of code that outputs subdirectories that have either a .html file or a .txt file, with no association with the actual subdirectory they are in. I am pretty new at bash scripting, so I can't go any further.
#!/bin/bash
files="$(find ~/file/ -type f -name '*.txt' -or -name '*.html')"
for file in $files
do
echo $file
done
The following find command checks every subdirectory and, if it has both html and txt files, lists all of them:
find . -type d -exec env d={} bash -c 'ls "$d"/*.html &>/dev/null && ls "$d"/*.txt &>/dev/null && ls "$d/"*.{html,txt}' \;
Explanation:
find . -type d
This looks for all subdirectories of the current directory.
-exec env d={} bash -c '...' \;
This sets the environment variable d to the value of the found subdirectory and then executes the bash command that is contained within the single quotes (see below).
ls "$d"/*.html &>/dev/null && ls "$d"/*.txt &>/dev/null && ls "$d/"*.{html,txt}
This is the bash command that is executed. It consists of three statements and-ed together. The first checks to see if directory d has any html files. If so, the second statement runs and it checks to see if there are any txt files. If so, the last statement is executed and it lists all html and txt files in the directory d.
This command is safe for all file and directory names containing spaces, tabs, or other difficult characters.
You could do it by searching recursively with the globstar option:
shopt -s globstar
for file in **; do
if [[ -d $file ]]; then
for sub_file in "$file"/*; do
case "$sub_file" in
*.html)
html=1;;
*.txt)
txt=1;;
esac
done
[[ $html && $txt ]] && echo "$file"
html=""
txt=""
fi
done
You can make use of -o
#!/bin/bash
files=$(find ~/file/ -type f -name '*.txt' -o -name '*.html')
for file in $files
do
echo $file
done
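Note that find's implied -a binds tighter than -o, so in the command above -type f applies only to the *.txt test, and a directory whose name ends in .html would also be printed. A sketch with parentheses so -type f applies to both extensions:
files=$(find ~/file/ -type f \( -name '*.txt' -o -name '*.html' \))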
#!/bin/bash
#A quick peek into a dir to see if there's at least one file that matches pattern
dir_has_file() { dir="$1"; pattern="$2";
[ -n "$(find "$dir" -maxdepth 1 -type f -name "$pattern" -print -quit)" ]
}
#Assumes there are no newline characters in the filenames, but will behave correctly with subdirectories that match *.html or *.txt
find "$1" -type d|\
while read d
do
dir_has_file "$d" '*.txt' &&
dir_has_file "$d" '*.html' &&
#Now print all the matching files
find "$d" -maxdepth 1 -type f -name '*.txt' -o -name '*.html'
done
This script takes the root directory to look into as the first argument ($1).
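Hypothetical usage, assuming the script above is saved as pairs.sh (an invented name):
bash pairs.sh ~/file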
The test command is what you need to check for the existence of each file in each of the subdirs:
find . -type d -exec sh -c "if test -f {}/$file1 -a -f {}/$file2 ; then ls {}/*.{txt,html} ; fi" \;
where $file1 and $file2 are the two .txt and .html files you are looking for.
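Embedding {} inside the quoted script is fragile for odd directory names; a variant (a sketch, using the question's example names textfile.txt and htmlfile.html) that passes the directory as a positional argument instead:
find . -type d -exec sh -c 'test -f "$1/textfile.txt" && test -f "$1/htmlfile.html" && ls "$1"/*.txt "$1"/*.html' sh {} \;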

Bash script to find files in a list, copy them to dest, print files not found

I would like to build on the answer I found here: Bash script to find specific files in a hierarchy of files
find $dir -name $name -exec scp {} $destination \;
I have a file with a list of file names and I need to find those files on a backup disk, then copy those files found to a destination folder, and lastly print the files that could not be found to a new file.
The last step would be helpful so that I wouldn't need to make another list of copied files and then do a compare with the original list.
If the script can then make a list of the copied files, do a compare, and print the differences, then that's exactly what's required. Unless the find process can print to a file each time it can't find a file.
Assuming that your list is separated by newlines, something like this should work:
#!/bin/bash
dir=someWhere
dest=someWhereElse
toCopyList=filesomewhere
notCopied=filesomewhereElse
while read -r line; do
find "$dir" -name "$line" -exec cp '{}' "$dest" \; -printf "%f\n"
done < "$toCopyList" > cpList
#sed -i 's#'$dir'/##' cpList
# I used # instead of / in sed to not confuse sed with / in $dir
# Also, I assumed the string in $dir does not end with a /
cat cpList "$toCopyList" | sort | uniq -c | sed -nr '/^ +1/s/^ +1 +(.*)/\1/p' > "$notCopied"
# Will not work if you give wild cards in your "toCopyList"
Hope it helps
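As an alternative to the uniq -c pipeline, the same "not copied" comparison could be done with comm, which expects sorted input. A sketch, using the variables from the script above:
comm -13 <(sort cpList) <(sort "$toCopyList") > "$notCopied"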
while read -r fname ; do
find /FROM/WHERE/TO/COPY/ \
-type f \
-name "$fname" \
-exec cp \{\} /DESTINATION/DIR/ \; 2>/dev/null
# find exits 0 whether or not it matches, so test for non-empty output instead
[ -n "$(find /DESTINATION/DIR/ \
-type f \
-name "$fname")" ] || \
echo "$fname"
done < FILESTOCOPY > MISSEDFILES
Will do.
