I'm trying to cp some files on an OSX desktop that fit a pattern of five digits. It works, but I can't understand why the -n option is being ignored. I don't want to overwrite a file if it already exists at the destination.
find ./prefix* -name '[0-9][0-9][0-9][0-9][0-9]' -maxdepth 5 -exec cp -nr {} ./dest \;
Everything is copied, even though one directory is already in dest. How can I force no overwrite? This solution on Super User indicates that I could simply change the permissions on everything in dest to read-only. But I feel like there must be a reason why my implementation of cp is behaving inconsistently with its man page, and there should therefore be a better way to solve the problem.
Also, the permissions for the file being overwritten are rwxr-xr-x (or 0755 if octal is your thing).
cp won't do what you want. You'll need to iterate over the output from find. Assuming you don't have spaces or other special characters in any of the paths you find (see below if you do):
find ./prefix* -name '[0-9][0-9][0-9][0-9][0-9]' -maxdepth 5 | \
while read dir
do
    target=./dest/$(basename $dir)
    [ -d $target ] || cp -r $dir ./dest/
done
This works because while will keep executing what's between do and done as long as read returns success. The output from find is piped into read, so every time read dir executes, it reads one line of output from find and assigns it to the dir variable.
When there are no more lines to be read from find, read returns failure and the loop terminates.
Inside the loop body, basename prints the last part of the path passed to it. In this case, the 5 digits.
The [ ... ] is shell lingo for running a conditional test. (Yes, [ is a command!) You could say test -d ... instead of [ -d ... ]. See e.g. https://unix.stackexchange.com/questions/99185/what-do-square-brackets-mean-without-the-if-on-the-left for more info on this.
-d ... returns success if the argument exists as a directory. Failure if not.
|| means or - so foo || bar executes bar only if foo fails.
So the loop body basically says:
let target be "dest/" + the basename of $dir
such a directory exists or copy $dir into dest/
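For instance, here's a quick way to watch that [ -d ... ] || short-circuit in action (a throwaway sketch using hypothetical /tmp paths):
mkdir -p /tmp/demo
[ -d /tmp/demo ] || echo "would copy"   # prints nothing: the directory exists
[ -d /tmp/nope ] || echo "would copy"   # prints "would copy": the test failed, so the right side runs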
I hope that clarifies it a bit. It's a lot of shell lingo in very little code. All of this information is basically in the bash manpage, though arguably in a much less accessible format.
If there's any chance that any of the paths found by find contain spaces or other special characters, then you'll need to add quoting and a few other bells & whistles:
find ./prefix* -name '[0-9][0-9][0-9][0-9][0-9]' -maxdepth 5 -print0 | \
while IFS= read -r -d '' dir
do
    target="./dest/$(basename "$dir")"
    [ -d "$target" ] || cp -r "$dir" ./dest/
done
I want to exclude some directories from deletion in my script (only files inside directories with a name > 1000 should be deleted), and here is what my directories look like:
/home/tester/100
/home/tester/1000
/home/tester/1020 # delete all files inside
/home/tester/2000 # delete all files inside
My bash script:
cd /home/tester
for dir in */ ; do
    echo -n $dir": ";
    find "$dir" -type f | wc -l;
    if [ $dir -gt 1000 ]; then
        cd $dir;
        rm *;
        cd ..;
    fi
done
I get an error on the if line and have no idea how to fix it... Is it possible to do this with a bash script?
Thank you for your help
for dir in */ ; do will set dir to things like "1000/" -- and the "/" makes it not a valid number. You can trim off the trailing "/" with ${dir%/}. I'd also recommend double-quoting it to prevent possible weird parsing:
if [ "${dir%/}" -gt 1000 ]; then
Note that if the directory name isn't a number (even after the "/" is removed), you'll get an error from the comparison, and the then clause won't run (which is probably what you want). If you want to handle other (non-numeric) directory names more gracefully, you should add some appropriate is-this-a-number test first.
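For example, one way to write such a check (a sketch using a bash regex match; adapt as needed):
name=${dir%/}
if [[ $name =~ ^[0-9]+$ ]] && [ "$name" -gt 1000 ]; then
    echo "$name is numeric and greater than 1000"
fi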
Also, using cd in scripts tends to be problematic, because if a cd fails for any reason, the rest of the script will continue running, but in the wrong place. This can cause all sorts of chaos. Consider what'd happen if one of the cd $dir commands fails: it'd run rm * in the /home/tester directory, deleting all the non-subdirectory files there, then it'd cd .., leaving it in /home. The next iteration would try to cd down to something like 2000, which doesn't exist under /home, so that cd would fail too, and then it'd delete all files in /home. This repeats indefinitely, potentially all the way up to running rm * in /, the root directory. Not good at all.
I recommend either putting error checks on cd commands, or just avoiding them entirely in favor of using explicit paths to files.
#!/bin/bash
cd /home/tester || {
    echo "Couldn't cd to /home/tester, quitting here..." >&2
    exit 1
}
for dir in */ ; do
    echo -n "$dir: "
    find "$dir" -type f | wc -l
    if [ "${dir%/}" -gt 1000 ]; then
        rm "$dir"/* # Explicit path -- the / is redundant, but won't hurt
    fi
done
I've also added an explicit shebang line, double-quoted all the variable references (good general scripting hygiene), and removed the semicolons from the ends of lines (not needed in shell syntax).
Another recommendation: run your scripts through shellcheck.net -- it'll point out a lot of common mistakes like unquoted variable references and unchecked cds.
The value of $dir is not numeric. Add set -x at the top of your script to debug.
Use "$(basename "$dir")" to get the numeric value.
Before I've had my first cup of coffee, I would do
for dir in */ ; do
    echo -n $dir": ";
    find "$dir" -type f | wc -l;
done
mv /home/tester/1000 /home/tester/some_unique_name
rm /home/tester/[1-9][0-9][0-9][0-9]/*
mv /home/tester/some_unique_name /home/tester/1000
This will not work when you have directories > 9999.
Perhaps rm /home/tester/[1-9][0-9][0-9][0-9]*/* will work, when you don't have directories like 1000backup or 2000my_unique_name.
A better solution is
find /home/tester -regextype sed -regex '/home/tester/[0-9]\{4,\}' ! -name 1000 |
xargs -L1 -I{} echo rm {}/*
Note that the search must start at /home/tester rather than ., because -regex matches against the whole path exactly as find prints it. The echo prints the rm commands for review; pipe the whole pipeline to sh to actually run them - the shell then expands the {}/* glob, which xargs alone would not.
We are working on an AngularJS project where the compiled output contains many file extensions (js, css, woff, etc.), each with an individual dynamic hash as part of the file name.
I am working on a simple bash script to find the files with those extensions and move them to another folder, removing the hash by searching for the first instance of '.'.
Please note the file extensions .woff and .css should be retained.
/src/main.1cc794c25c00388d81bb.js ==> /dst/main.js
/src/polyfills.eda7b2736c9951cdce19.js ==> /dst/polyfills.js
/src/runtime.a2aefc53e5f0bce023ee.js ==> /dst/runtime.js
/src/styles.8f19c7d2fbe05fc53dc4.css ==> /dst/styles.css
/src/1.620807da7415abaeeb47.js ==> /dst/1.js
/src/2.93e8bd3b179a0199a6a3.js ==> /dst/2.js
/src/some-webfont.fee66e712a8a08eef580.woff ==> /dst/some-webfont.woff
/src/Web_Bd.d2138591460eab575216.woff ==> /dst/Web_Bd.woff
Bash code:
#!/bin/bash
echo Process web binary files!
echo Processing the name change for js files!!!!!!!!!!!!!
sfidx=0;
SFILES=./src/*.js #{js,css,voff}
DST=./dst/
for files in $SFILES
do
    echo $(basename $files)
    cp $files ${DST}"${files//.*}".js
    sfidx=$((sfidx+1))
done
echo Number of target files detected in srcdir $sfidx!!!!!!!!!!
The above code has 2 problems:
First, I need to specify the file extensions in the for loop in one place, instead of running the loop once per extension. However, this attempt fails, and I'm not sure what needs to change.
SFILES=./src/*.{js,css,voff}
cp: cannot stat `./src/*.{js,css,voff}': No such file or directory
Second, the cp command fails for the reason below; I need some help figuring out the correct syntax.
cp $files ${DST}"${files//.*}".js
1.620807da7415abaeeb47.js
cp: cannot create regular file `./dst/./src/1.620807da7415abaeeb47.js.js': No such file or directory
Here is a relatively simple command to do it:
find ./src -type f \( -name \*.js -o -name \*.css -o -name \*.woff \) -print0 |
while IFS= read -r -d $'\0' line; do
    dest="./dst/$(basename "$line" | sed -E 's/(\..{20}\.)(js|css|woff)$/.\2/')"
    echo "Copying $line to $dest"
    cp "$line" "$dest"
done
This is based on the original code and is Shellcheck-clean:
#!/bin/bash
shopt -s nullglob # Make globs that match nothing expand to nothing
echo 'Process web binary files!'
echo 'Processing the name change for js, css, and woff files!!!!!!!!!!!!!'
srcfiles=( src/*.{js,css,woff} )
destdir=dst
for srcpath in "${srcfiles[@]}" ; do
    filename=${srcpath##*/}
    printf '%s\n' "$filename"
    nohash_base=${filename%.*.*} # Remove the hash and suffix from the end
    suffix=${filename##*.} # Remove everything up to the final '.'
    newfilename=$nohash_base.$suffix
    cp -- "$srcpath" "$destdir/$newfilename"
done
echo "Number of target files detected in srcdir ${#srcfiles[*]}!!!!!!!!!"
The code uses an array instead of a string to hold the list of files because it is easier (and generally safer, because it can handle files with names that contain spaces and other special characters). See Arrays [Bash Hackers Wiki] for information about using arrays in Bash.
See Removing part of a string (BashFAQ/100 (How do I do string manipulation in bash?)) for information about using ${var##pattern} etc. for extracting parts of strings.
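As a quick illustration of those expansions, using one of the filenames from the question:
filename=main.1cc794c25c00388d81bb.js
echo "${filename%.*.*}"   # -> main (shortest suffix matching '.*.*' removed)
echo "${filename##*.}"    # -> js (longest prefix matching '*.' removed)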
See Correct Bash and shell script variable capitalization for an explanation of why it is best to avoid uppercase variable names (such as SFILES).
shopt -s nullglob prevents strange things happening if the glob pattern(s) fail to match. See Why is nullglob not default? for more information.
See Bash Pitfalls #2 (cp $file $target) for why it's generally better to use cp -- instead of plain cp (though it's not necessary in this case (since neither argument can begin with '-')).
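A tiny illustration of the failure mode that -- guards against (throwaway files):
touch -- '-n'       # create a file literally named -n
cp '-n' backup      # fails: cp parses -n as an option, leaving no source operand
cp -- '-n' backup   # works: -- ends option parsing, so -n is treated as a filename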
It's best to keep Bash code Shellcheck-clean. When run on the code in the question it identifies the key problem, and recommends the use of arrays as a way to fix it. It also identifies several other potential problems.
Your problem is precedence of the expansions. Here is my solution:
#!/bin/bash
echo Process web binary files!
echo Processing the name change for js files!!!!!!!!!!!!!
sfidx=0;
SFILES=$(echo ./src/*.{js,css,voff})
DST=./dst/
for file in $SFILES
do
    new=${file##*/}              # basename
    new="${DST}${new%\.*}.js"    # the \ escapes the .
    echo "copying $file to $new" # sanity check
    cp $file "$new"
    sfidx=$((sfidx+1))
done
echo Number of target files detected in srcdir $sfidx!!!!!!!!!!
With three files in ./src all named "gash" I get:
Process web binary files!
Processing the name change for js files!!!!!!!!!!!!!
copying ./src/gash.js to ./dst/gash.js
copying ./src/gash.css to ./dst/gash.js
copying ./src/gash.voff to ./dst/gash.js
Number of target files detected in srcdir 3!!!!!!!!!!
(You might be able to get around this using eval, but that can be a security issue.)
new=${file##*/} - remove the longest string on the left ending in / (remove leading directory names, like basename). If you wanted to use the external non-shell basename program then it would be new=$(basename "$file").
${new%\.*} - remove the shortest string on the right starting . (remove the old file extension)
A possible approach is to have the find command generate a shell script, then execute it.
src=./src
dst=./dst
find "$src" \( -name \*.js -o -name \*.woff -o -name \*.css \) \
-printf 'p="%p"; f="%f"; cp "$p" "'"${dst}"'/${f%%%%.*}.${f##*.}"\n'
This will print the shell commands you want to execute. If they are what you want,
just pipe the output to a shell:
find "$src" \( -name \*.js -o -name \*.woff -o -name \*.css \) \
-printf 'p="%p"; f="%f"; cp "$p" "'"${dst}"'/${f%%%%.*}.${f##*.}"\n'|bash
(or |bash -x if you want to see what is going on.)
If you have files named, e.g., ./src/dir1/a.xyz.js and ./src/dir2/a.uvw.js they will both end up as ./dst/a.js, the second overwriting the first. To avoid this, you might want to use cp -i instead of cp.
If you are absolutely sure that there will never be spaces or other strange characters in your pathnames, you can use less quotes (to the horror of some shell purists)
find $src \( -name \*.js -o -name \*.woff -o -name \*.css \) \
-printf "p=%p; f=%f; cp \$p ${dst}/\${f%%%%.*}.\${f##*.}\\n"|bash
Some final remarks:
%p and %f are expanded by -printf as the full pathname and the basename of the processed file. They enable us to avoid the basename command. Unfortunately, there is no such directive for the file extension, so we must use parameter expansion in the shell to compose the final name.
In the -printf argument, we must use %% to write a single percent character. Since we need two of them, there have to be four...
${f%%.*} expands in the shell as the value of $f with everything removed from the first dot onwards
${f##*.} expands in the shell as the value of $f with everything removed up to the last dot (i.e., it expands to the file extension)
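For instance, for the file ./src/styles.8f19c7d2fbe05fc53dc4.css from the question, the -printf format above emits this line (with dst=./dst):
p="./src/styles.8f19c7d2fbe05fc53dc4.css"; f="styles.8f19c7d2fbe05fc53dc4.css"; cp "$p" "./dst/${f%%.*}.${f##*.}"
which the shell then executes as cp "./src/styles.8f19c7d2fbe05fc53dc4.css" "./dst/styles.css".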
I have 2 folders: folder_a & folder_b. In each of these folders there are a bunch of files. I am trying to use sed to move all of these files out of these folders and into my current working directory.
My folder structure looks like this:
mytest:
a:
1.txt
2.txt
3.txt
b:
4.txt
5.txt
The command I am trying to use is:
find . -type d ! -iname '*.*' # find all folders other than root
| sed -r 's/.*/&\/*/' # add '/*' to each of the arguments
| sed -r 'p;s/.*/./' # output: a/* . b/* .
| xargs -n 2 mv # should be creating two commands: 'mv a/* .' and 'mv b/* .'
Unfortunately I get an error:
mv: cannot stat './aaa/*': No such file or directory
I also get the same error when I try this other strategy (using ls instead of mv):
for dir in */; do
ls $dir;
done;
Even if I use sed to replace the spaces in each directory name with '\ ', or surround the directory names with quotes I get the same error.
I'm not sure if these 2 examples are related in my misunderstanding of bash but they both seem to demonstrate my ignorance of how bash translates the output from one command into the input of another command.
Can anyone shed some light on this?
Update: Completely rewritten.
As @EtanReisner and @melpomene have noted, mv */* . or, more specifically, mv a/* b/* . is the most straightforward solution, but you state that this is in part a learning exercise, so the remainder of the answer shows an efficient find-based solution and explains the problem with the original command.
An efficient find-based solution
Generally, if feasible, it's best and most efficient to let find itself do the work, without involving additional tools; find's -exec action is like a built-in xargs, with {} representing the path at hand (with terminator \;) / all paths (with +):
find . -type f -exec echo mv -t . {} +
To be safe, this will just print the mv commands that would be executed; remove the echo to actually execute them.
This will execute a single[1] mv command to which all matching files are passed, and -t . moves them all to the current dir.
[1] If the resulting command line is too long (which is unlikely), it is split up into multiple commands, just as with xargs.
Operating on files (-type f) bypasses the need for globbing, as find will then enumerate all files for you (it also bypasses the need to exclude . explicitly).
Note that this solution works on entire subtrees, not just (immediate) subdirectories.
It's tempting to consider turning on Bash 4's globstar option and using mv */** ., but that won't work, because it will attempt to move directories as well, not just the files in them.
A caveat re -exec with +: it only works if {} - the placeholder for all paths - is the token immediately before the +.
Since you're on Linux, we can satisfy this condition by specifying the target folder for mv with option -t before the {}; on BSD-based systems such as OSX, you could not do that, because mv doesn't support -t there, so you'd have to use terminator \;, which means that mv is called once for every path, which is obviously much slower.
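In that case, a hedged equivalent of the command above would be:
find . -type f -exec echo mv {} . \;   # remove the echo to execute; one mv per file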
Why your command didn't work:
As @EtanReisner points out in a comment, xargs invokes the command specified without (implicitly) involving a shell, so globbing won't work; you can verify this with the following command:
echo '*' | xargs echo # -> '*' - NO globbing
If we leave the globbing issue aside, additional work would have been necessary to make your xargs command work correctly with folder names with embedded spaces (or other shell metacharacters):
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -n 2 echo mv # NOTE: still won't work due to lack of globbing
Note how the (combined) sed command now produces a single output line '<input-path>'/* ., with the input path enclosed in embedded single-quotes, which is required for xargs to recognize <input-path> as a single argument, even if it contains embedded spaces.
(If your filenames contain single-quotes, you'd have to do more work; also note that since now all arguments for a given dir. are on a single line, you could use xargs -L 1 ....)
Also note how -mindepth 1 (only process paths at the subdirectory level or below) is used to skip processing of . itself.
The only way to make globbing happen is to get the shell involved:
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -I {} sh -c 'echo mv {}' # works, but is inefficient
Note the use of xargs' -I option to treat each input line as its own argument ({} is a self-chosen placeholder for the input).
sh -c invokes the (default) shell to execute the resulting command, at which globbing does happen.
However, overall, this is quite inefficient:
A pipeline with 3 segments is used.
A shell instance is invoked for every input path, which in turn calls the mv utility.
Compare this to the efficient find-only solution above, which (typically) creates only 2 processes in total.
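As an aside, if you do end up going the xargs + sh -c route, a slightly safer sketch passes each path to sh as a positional parameter instead of splicing it into the command string (this also sidesteps the single-quote embedding done with sed above):
find . -mindepth 1 -type d -print0 |
xargs -0 -I {} sh -c 'echo mv "$1"/* .' _ {}
Here _ fills $0, the directory path arrives as $1, and the unquoted * is still expanded by sh; remove the echo to actually move the files.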
I am trying to move the directories from $DIR1 to $DIR2 if $DIR2 does not already contain a directory with the same name.
What I currently have is:
if [[ ! $(ls -d /$DIR2/* | grep test) ]]
then
    mv $DIR1/test* /$DIR2
fi
First, it gives
ls: cannot access //data/lims/PROCESSING/*: No such file or directory
when $DIR2 is empty; however, it still works.
Secondly, when I run the shell script twice, it doesn't let me move directories with a similar name.
For example, in $DIR1 I have test-1 test-2 test-3. When the script runs for the first time, all three directories move to $DIR2.
After that I do mkdir test-4 in $DIR1 and run the script again. It does not let me move test-4, because my check thinks test-4 is already there - I am grepping for everything matching test.
How can I get around that and move test-4?
Firstly, you can check whether or not a directory exists using bash's built in 'True if directory exists' expression:
test="/some/path/maybe"
if [ -d "$test" ]; then
echo "$test is a directory"
fi
However, you want to test if something is not a directory. You've shown in your code that you already know how to negate the expression:
test="/some/path/maybe"
if [ ! -d "$test" ]; then
echo "$test is NOT a directory"
fi
You also seem to be using ls to get a list of files. Perhaps you want to loop over them and do something if the files are not a directory?
dir="/some/path/maybe"
for test in $(ls $dir);
do
    if [ ! -d $test ]; then
        echo "$test is NOT a directory."
    fi
done
A good place to look for bash stuff like this is Machtelt Garrels' guide. His page on the various expressions you can use in if statements helped me a lot.
Moving directories from a source to a destination if they don't already exist in the destination:
For the sake of readability I'm going to refer to your DIR1 and DIR2 as src and dest. First, let's declare them:
src="/place/dir1/"
dest="/place/dir2/"
Note the trailing slashes. We'll append the names of folders to these paths so the trailing slashes make that simpler. You also seem to be limiting the directories you want to move by whether or not they have the word test in their name:
filter="test"
So, let's first loop through the directories in source that pass the filter; if they don't exist in dest let's move them there:
for dir in $(ls "$src" | grep "$filter"); do
    if [ ! -d "$dest$dir" ]; then
        mv "$src$dir" "$dest"
    fi
done
(Note: ls -d "$src" would list the source directory itself rather than its contents, which is why we list inside it instead.)
I hope that solves your issue. But be warned, @gniourf_gniourf posted a link in the comments that should be heeded!
If you need to mv some directories to another place according to some pattern, then you can use find:
find . -type d -name "test*" -exec mv -t /tmp/target {} +
Details:
-type d - search only for directories
-name "test*" - set the search pattern
-exec - do something with the find results
-t, --target-directory=DIRECTORY - move all SOURCE arguments into DIRECTORY
There are many examples of exec or xargs usage.
And if you do not want to overwrite files, then add the -n option to the mv command:
find . -type d -name "test*" -exec mv -n -t /tmp/target {} +
-n, --no-clobber do not overwrite an existing file
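A quick sanity check of -n (GNU mv), using throwaway paths:
mkdir -p dest/test-1 src/test-1 src/test-4
mv -n src/test-1 dest/   # dest/test-1 already exists: silently skipped
mv -n src/test-4 dest/   # dest/test-4 does not exist: moved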
I am trying to write a bash script which does some processing on music files. Here is the script so far:
#!/bin/bash
SAVEIFS=$IFS
IFS=printf"\n\0"
find `pwd` -iname "*.mp3" -o -iname "*.flac" | while read f
do
    echo "$f"
    $arr=($(f))
    exiftool "${arr[@]}"
done
IFS=$SAVEIFS
This fails with:
[johnd:/tmp/tunes] 2 $ ./test.sh
./test.sh: line 9: syntax error near unexpected token `$(f)'
./test.sh: line 9: ` $arr=($(f))'
[johnd:/tmp/tunes] 2 $
I have tried many different incantations, none of which have worked. The bottom line is I'm trying to call a command exiftool, and one of the parameters of that command is a filename which may contain spaces. Above I'm trying to assign the filename $f to an array and pass that array to exiftool, but I'm having trouble with the construction of the array.
Immediate question is, how do I construct this array? But the deeper question is how, from within a bash script, do I call an external command with parameters which may contain spaces?
You actually did have the call-with-possibly-space-containing-arguments syntax right (program "${args[@]}"). There were several problems, though.
Firstly, $(foo) executes a command. If you want a variable's value, use $foo or ${foo}.
Secondly, if you want to append something onto an array, the syntax is array+=(value) (or, if that doesn't work, array=("${array[#]}" value)).
Thirdly, please separate filenames with \0 whenever possible. Newlines are all well and good, but filenames can contain newlines.
Fourthly, read takes the switch -d, which can be used with an empty string '' to specify \0 as the delimiter. This eliminates the need to mess around with IFS.
Fifthly, be careful when piping into while loops - this causes the loop to be executed in a subshell, preventing variable assignments inside it from taking effect outside. There is a way to get around this, however - instead of piping (command | while ... done), use process substitution (while ... done < <(command)). A small demonstration follows this list.
Sixthly, watch your command substitutions - there's no need to use $(pwd) as an argument to a command when . will do. (Or if you really must have full paths, try quoting the pwd call.)
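Here is the fifth point in miniature (a throwaway sketch):
count=0
printf 'a\nb\n' | while read -r line; do count=$((count+1)); done
echo "$count"   # prints 0: the loop body ran in a subshell

count=0
while read -r line; do count=$((count+1)); done < <(printf 'a\nb\n')
echo "$count"   # prints 2: the loop ran in the current shell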
tl;dr
The script, revised:
while read -r -d '' f; do
    echo "$f" # For debugging?
    arr+=("$f")
done < <(find . \( -iname "*.mp3" -o -iname "*.flac" \) -print0)
exiftool "${arr[@]}"
Another way
Leveraging find's full capabilities:
find . \( -iname "*.mp3" -o -iname "*.flac" \) -exec exiftool {} +
# Much shorter!
Edit 1
So you need to save the output of exiftool, manipulate it, then copy stuff? Try this:
while read -r -d '' f; do
    echo "$f" # For debugging?
    arr+=("$f")
done < <(find . \( -iname "*.mp3" -o -iname "*.flac" \) -print0)
# Warning: somewhat misleading syntax highlighting ahead
newfilename="$(exiftool "${arr[@]}")"
newfilename="$(manipulate "$newfilename")"
cp -- "$some_old_filename" "$newfilename"
You probably will need to change that last bit - I've never used exiftool, so I don't know precisely what you're after (or how to do it), but that should be a start.
You can do this just with bash:
shopt -s globstar nullglob
a=( **/*.{mp3,flac} )
exiftool "${a[#]}"
This probably works too: exiftool **/*.{mp3,flac}