bash: complex test in find command - linux

I would like to do something like:
find . -type f -exec test $(file --brief --mime-type '{}' ) == 'text/html' \; -print
but I can't figure out the correct way to quote or escape the args to test, especially the '$(' ... ')' .

You cannot simply escape the arguments for passing them to find.
Any shell expansion will happen before find is run. find will not pass its arguments through a shell, so even if you escape the shell expansion, everything will simply be treated as literal arguments to the test command, not expanded by the shell as you are expecting.
The best way to achieve what you want would be to write a short shell script, which takes the filename as an argument, and use -exec on that:
find . -type f -exec is_html.sh {} \; -print
with is_html.sh:
#!/bin/sh
test "$(file --brief --mime-type "$1")" = 'text/html'
If you really want it all on one line, without using a separate script, you can invoke sh directly from find:
find . -type f -exec sh -c 'test "$(file --brief --mime-type "$0")" = "text/html"' {} \; -print
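The same pattern works for any per-file test that needs a command substitution. Below is a minimal sketch that passes the file name as $2 and the text to compare against as $1, with a dummy word in $0. The helper name find_with_first_line is made up for illustration, and head -n 1 stands in for the file --mime-type call so the example does not depend on file being installed.

```shell
# Hypothetical helper: print the files under DIR ($1) whose first line equals TEXT ($2).
find_with_first_line() {
    # Inside the single-quoted script, $1 is TEXT and $2 is the path find
    # substituted for {}; the word "sh" only fills $0.
    find "$1" -type f \
        -exec sh -c 'test "$(head -n 1 "$2")" = "$1"' sh "$2" {} \; \
        -print
}
```

Because the script is in single quotes, the $(...) is evaluated by the inner sh once per file rather than by the shell that launches find.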

Although it may be possible to turn it into one wildly quoted statement, it is often easier - and more clear - to be a little more verbose:
$ find . -type f -print0 | xargs -0 file --mime-type |
grep ':[^:]*text/html$' | sed 's,:[^:]*text/html,,'

Use "{}" instead; for example, this simply lists file types:
find * -maxdepth 0 -exec file "{}" \;

Related

Using 'find' to return filenames without extension

I have a directory (with subdirectories), of which I want to find all files that have a ".ipynb" extension. But I want the 'find' command to just return me these filenames without the extension.
I know the first part:
find . -type f -iname "*.ipynb" -print
But how do I then get the names without the "ipynb" extension?
Any replies greatly appreciated...
To return only filenames without the extension, try:
find . -type f -iname "*.ipynb" -execdir sh -c 'printf "%s\n" "${0%.*}"' {} ';'
or (omitting -type f from now on):
find "$PWD" -iname "*.ipynb" -execdir basename {} .ipynb ';'
or:
find . -iname "*.ipynb" -exec basename {} .ipynb ';'
or:
find . -iname "*.ipynb" | sed "s/.*\///; s/\.ipynb//"
However, invoking basename once per file can be inefficient, so @CharlesDuffy's suggestion is:
find . -iname '*.ipynb' -exec bash -c 'printf "%s\n" "${@%.*}"' _ {} +
or:
find . -iname '*.ipynb' -execdir basename -s '.ipynb' {} +
Using + means that we're passing multiple files to each bash instance, so if the whole list fits into a single command line, we call bash only once.
To print full path and filename (without extension) in the same line, try:
find . -iname "*.ipynb" -exec sh -c 'printf "%s\n" "${0%.*}"' {} ';'
or:
find "$PWD" -iname "*.ipynb" -print | sed 's/\.[^./]*$//'
To print full path and filename on separate lines:
find "$PWD" -iname "*.ipynb" -exec dirname "{}" ';' -exec basename "{}" .ipynb ';'
Here's a simple solution:
find . -type f -iname "*.ipynb" | sed 's/\.ipynb$//1'
I found this in a bash oneliner that simplifies the process without using find
for n in *.ipynb; do echo "${n%.ipynb}"; done
If you need to have the name with directory but without the extension :
find . -type f -iname "*.ipynb" -exec sh -c 'f=$(basename "$1" .ipynb); d=$(dirname "$1"); echo "$d/$f"' sh {} \;
find . -type f -iname "*.ipynb" | grep -oP '.*(?=[.])'
The -o flag outputs only the matched part. The -P flag matches according to Perl regular expressions. This is necessary to make the lookahead (?=[.]) work.
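Perl one-liner
What you want:
find . | perl -a -F/ -lne 'print $F[-1] if /\.ipynb$/'
To get what you do not want, negate the condition:
find . | perl -a -F/ -lne 'print $F[-1] if !/\.ipynb$/'
NOTE
In a Perl regular expression the glob * is written .* and a literal dot has to be escaped, so the glob *.ipynb corresponds to the pattern .*\.ipynb$ (the leading .* can be dropped). Both commands print the last path component including its extension.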
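If the string ".ipynb" never occurs in a file name other than as a suffix, a plain substitution is enough. Don't use tr -d ".ipbyn" for this: tr deletes every occurrence of each listed character, including the dot in the leading ./ that find prints.
find . -type f -iname "*.ipynb" -print | sed 's/\.ipynb//'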
If you don't know what the extension is, or there are multiple extensions, you could use this:
find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'
and for a list of files with no duplicates (originally differing in path or extension)
find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'|sort|uniq
Another easy way which uses basename is:
find . -type f -iname '*.ipynb' -exec basename -s '.ipynb' {} +
Using + will reduce the number of invocations of the command (manpage):
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number
of matched files. The command line is built in much the same
way that xargs builds its command lines. Only one instance of
'{}' is allowed within the command, and (when find is being
invoked from a shell) it should be quoted (for example, '{}')
to protect it from interpretation by shells. The command is
executed in the starting directory. If any invocation with
the `+' form returns a non-zero value as exit status, then
find returns a non-zero exit status. If find encounters an
error, this can sometimes cause an immediate exit, so some
pending commands may not be run at all. For this reason -exec
my-command ... {} + -quit may not result in my-command
actually being run. This variant of -exec always returns
true.
With -s, basename accepts multiple filenames and removes the specified suffix from each (manpage):
-a, --multiple
support multiple arguments and treat each as a NAME
-s, --suffix=SUFFIX
remove a trailing SUFFIX; implies -a

How can I change the extension of files of a type using "find" with Bash? [duplicate]

The user inputs a file type they are looking for; it is stored in $arg1. The file type they would like to change them to is stored in $arg2. I'm able to find what I'm looking for, but I'm unsure of how to keep the filename the same and just change the type, i.e., file1.txt into file1.log.
find . -type f -iname "*.$arg1" -exec mv {} \;
To enable the full power of shell parameter expansions, you can call bash -c in your exec action:
find . -type f -iname "*.$arg1" \
-exec bash -c 'echo mv "$0" "${0%.*}.$1"' {} "$arg2" \;
We add {} and "$arg2" as parameters to bash -c, so they become accessible within the command as $0 and $1. ${0%.*} removes the extension, to be replaced by whatever $arg2 expands to.
As it is, the command just prints the mv commands it would execute; to actually rename the files, the echo has to be removed.
The quoting is relevant: the argument to bash -c is in single quotes to prevent $0 and $1 from being expanded prematurely, and the two arguments to mv, and arg2 are also quoted to deal with file names with spaces in them.
Combining the find -exec bash idea with the bash loop idea, you can use the + terminator on the -exec to tell find to pass multiple filenames to a single invocation of the bash command. Pass the new type as the first argument - which shows up in $0 and so is conveniently skipped by a for loop over the rest of the command-line arguments - and you have a pretty efficient solution:
find . -type f -iname "*.$arg1" -exec bash -c \
'for arg; do mv "$arg" "${arg%.*}.$0"; done' "$arg2" {} +
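For reference, here is a sketch of the batched version wrapped in a function and written for plain sh. The name change_ext and its parameter order are assumptions, and the new extension is passed as $1 and shifted off rather than placed in $0. It uses -name instead of -iname, so matching is case-sensitive.

```shell
# Hypothetical helper: change_ext DIR OLD NEW
# Renames every regular file under DIR ending in .OLD so it ends in .NEW.
change_ext() {
    find "$1" -type f -name "*.$2" -exec sh -c '
        new=$1      # first argument after the dummy $0 is the new extension
        shift       # the remaining arguments are the paths from find
        for f do
            mv -- "$f" "${f%.*}.$new"
        done' sh "$3" {} +
}
```

For a dry run, put echo in front of mv, as in the answer above.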
Alternatively, if you have either version of the Linux rename command, you can use that. The Perl one (a.k.a. prename, installed by default on Ubuntu and other Debian-based distributions; also available for OS X from Homebrew via brew install rename) can be used like this:
find . -type f -iname "*.$arg1" -exec rename 's/\Q'"$arg1"'\E$/'"$arg2"'/' {} +
That looks a bit ugly, but it's really just the s/old/new/ substitution command familiar from many UNIX tools. The \Q and \E around $arg1 keep any weird characters inside the suffix from being interpreted as regular expression metacharacters that might match something unexpected; the $ after the \E makes sure the pattern only matches at the end of the filename.
The pattern-based version installed by default on Red Hat-based Linux distros (Fedora, CentOS, etc) is simpler:
find . -type f -iname "*.$arg1" -exec rename ".$arg1" ".$arg2" {} +
but it's also dumber: it replaces the first occurrence of the string wherever it appears, so if you rename .com .exe stackoverflow.com_scanner.com, you'll get a file named stackoverflow.exe_scanner.com.
I would do it like so:
find . -type f -iname "*.$arg1" -print0 |\
while IFS= read -r -d '' file; do
mv -- "$file" "${file%$arg1}$arg2"
done
I took your find command and fed its output to a while loop. Within that loop, I am doing the actual renaming. This way I have the name of the file as a variable that I can manipulate using bash's string manipulation operations.
If you have the Perl-based rename command:
Sample directory:
$ find
.
./a"bn.txt
./t2.abc
./abc
./abc/t1.txt
./abc/list.txt
./a bc.txt
Sample args:
$ arg1='txt'
$ arg2='log'
Dry run:
$ find -type f -iname "*.$arg1" -exec rename -n "s/$arg1$/$arg2/" {} +
rename(./a"bn.txt, ./a"bn.log)
rename(./abc/t1.txt, ./abc/t1.log)
rename(./abc/list.txt, ./abc/list.log)
rename(./a bc.txt, ./a bc.log)
Remove -n option once it is okay:
$ find -type f -iname "*.$arg1" -exec rename "s/$arg1$/$arg2/" {} +
$ find
.
./a bc.log
./t2.abc
./abc
./abc/list.log
./abc/t1.log
./a"bn.log

Missing Syntax of moving file from one folder to another [duplicate]

I was helped out today with a command, but it doesn't seem to be working. This is the command:
find /home/me/download/ -type f -name "*.rm" -exec ffmpeg -i {} -sameq {}.mp3 && rm {}\;
The shell returns
find: missing argument to `-exec'
What I am basically trying to do is go through a directory recursively (if it has other directories) and run the ffmpeg command on the .rm file types and convert them to .mp3 file types. Once this is done, remove the .rm file that has just been converted.
A -exec command must be terminated with a ; (so you usually need to type \; or ';' to avoid interpretation by the shell) or a +. The difference is that with ;, the command is called once per file, while with +, it is called as few times as possible (usually once, but there is a maximum length for a command line, so it might be split up) with all filenames. See this example:
$ cat /tmp/echoargs
#!/bin/sh
echo $1 - $2 - $3
$ find /tmp/foo -exec /tmp/echoargs {} \;
/tmp/foo - -
/tmp/foo/one - -
/tmp/foo/two - -
$ find /tmp/foo -exec /tmp/echoargs {} +
/tmp/foo - /tmp/foo/one - /tmp/foo/two
Your command has two errors:
First, you use {};, but the ; must be a parameter of its own.
Second, the command ends at the &&. You specified "run find, and if that was successful, remove the file named {};". If you want to use shell stuff in the -exec command, you need to explicitly run it in a shell, such as -exec sh -c 'ffmpeg ... && rm'.
However you should not add the {} inside the bash command, it will produce problems when there are special characters. Instead, you can pass additional parameters to the shell after -c command_string (see man sh):
$ ls
$(echo damn.)
$ find * -exec sh -c 'echo "{}"' \;
damn.
$ find * -exec sh -c 'echo "$1"' - {} \;
$(echo damn.)
You see the $ thing is evaluated by the shell in the first example. Imagine there was a file called $(rm -rf /) :-)
(Side note: The - is not needed, but the first variable after the command is assigned to the variable $0, which is a special variable normally containing the name of the program being run and setting that to a parameter is a little unclean, though it won't cause any harm here probably, so we set that to just - and start with $1.)
So your command could be something like
find -exec bash -c 'ffmpeg -i "$1" -sameq "$1".mp3 && rm "$1"' - {} \;
But there is a better way. find supports and and or, so you may do stuff like find -name foo -or -name bar. But that also works with -exec, which evaluates to true if the command exits successfully, and to false if not. See this example:
$ ls
false true
$ find * -exec {} \; -and -print
true
It only runs the print if the command was successful, which it was for true but not for false.
So you can use two exec statements chained with an -and, and it will only execute the latter if the former was run successfully.
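A small sketch of that chaining. The cp call is only a stand-in for the real converter (it is made to fail on empty files so the effect is visible), and the function name convert_and_clean and the .in/.out suffixes are made up for the example. The -and between two actions is implicit, so it is left out here.

```shell
# Hypothetical helper: "convert" each *.in file under DIR ($1), then delete
# the source only when the conversion step succeeded.
convert_and_clean() {
    # First -exec: stand-in converter, fails when the input file is empty.
    # Second -exec: only evaluated if the first one returned true.
    find "$1" -type f -name '*.in' \
        -exec sh -c '[ -s "$1" ] && cp -- "$1" "$1.out"' sh {} \; \
        -exec rm -- {} \;
}
```

With the real command, the first -exec would be something like -exec ffmpeg -i {} ... \; and the second stays as it is.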
Try putting a space before each \;
Works:
find . -name "*.log" -exec echo {} \;
Doesn't Work:
find . -name "*.log" -exec echo {}\;
I figured it out now. When you need to run two commands in exec in a find you need to actually have two separate execs. This finally worked for me.
find . -type f -name "*.rm" -exec ffmpeg -i {} -sameq {}.mp3 \; -exec rm {} \;
You have to put a space between {} and \;
So the command will be like:
find /home/me/download/ -type f -name "*.rm" -exec ffmpeg -i {} -sameq {}.mp3 \; -exec rm {} \;
Just for your information:
I have just tried using "find -exec" command on a Cygwin system (UNIX emulated on Windows), and there it seems that the backslash before the semicolon must be removed:
find ./ -name "blabla" -exec wc -l {} ;
For anyone else having issues when using GNU find binary in a Windows command prompt. The semicolon needs to be escaped with ^
find.exe . -name "*.rm" -exec ffmpeg -i {} -sameq {}.mp3 ^;
You need to do some escaping I think.
find /home/me/download/ -type f -name "*.rm" -exec ffmpeg -i {} \-sameq {}.mp3 \&\& rm {}\;
Just in case anyone sees a similar "missing -exec args" in Amazon Opsworks Chef bash scripts, I needed to add another backslash to escape the \;
bash 'remove_wars' do
user 'ubuntu'
cwd '/'
code <<-EOH
find /home/ubuntu/wars -type f -name "*.war" -exec rm {} \\;
EOH
ignore_failure true
end
Also, if anyone else has the "find: missing argument to -exec" this might help:
In some shells you don't need to do the escaping, i.e. you don't need the "\" in front of the ";".
find <file path> -name "myFile.*" -exec rm -f {} ;
Both {} and && will cause problems due to being expanded by the command line. I would suggest trying:
find /home/me/download/ -type f -name "*.rm" -exec ffmpeg -i \{} -sameq \{}.mp3 \; -exec rm \{} \;
In my case I needed to execute "methods" (shell functions) from my bash script, which does not work when using -exec bash -c, so I am adding another solution I found here as well:
UploadFile() {
curl ... -F "file=$1"
}
find . | while IFS= read -r file;
do
UploadFile "$file"
done
This thread pops up first when searching for solutions to execute commands for each file from find, so I hope it's okay that this solution does not use the -exec argument
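A self-contained sketch of this loop, assuming file names contain no newline characters (a name with a newline would be split into two reads). upload_file here only prints what it would do; it and upload_all are placeholder names for this example, not the curl call from the answer.

```shell
# Hypothetical stand-in for the real per-file action (e.g. a curl upload).
upload_file() {
    printf 'uploading %s\n' "$1"
}

# Call the shell function once for every regular file under DIR ($1).
upload_all() {
    # IFS= keeps leading/trailing spaces, -r keeps backslashes literal.
    find "$1" -type f | while IFS= read -r file; do
        upload_file "$file"
    done
}
```

The loop body runs in the current script (in a subshell of it), which is why the function is visible there, unlike inside -exec bash -c.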
I got the same error when I left a blank space after the ending ; of an -exec command. So, remove the blank space after the ;.
If you are still getting "find: missing argument to -exec" try wrapping the execute argument in quotes.
find <file path> -type f -exec "chmod 664 {} \;"

Find and basename not playing nicely

I want to echo out the filename portion of a find on the linux commandline. I've tried to use the following:
find www/*.html -type f -exec sh -c "echo $(basename {})" \;
and
find www/*.html -type f -exec sh -c "echo `basename {}`" \;
and a whole host of other combinations of escaping and quoting various parts of the text. The result is that the path isn't stripped:
www/channel.html
www/definition.html
www/empty.html
www/index.html
www/privacypolicy.html
Why not?
Update: While I have a working solution below, I'm still interested in why "basename" doesn't do what it should do.
The trouble with your original attempt:
find www/*.html -type f -exec sh -c "echo $(basename {})" \;
is that the $(basename {}) code is executed once, before the find command is executed. The output of the single basename is {} since that is the basename of {} as a filename. So, the command that is executed by find is:
sh -c "echo {}"
for each file found, but find actually substitutes the original (unmodified) file name each time because the {} characters appear in the string to be executed.
If you wanted it to work, you could use single quotes instead of double quotes:
find www/*.html -type f -exec sh -c 'echo $(basename {})' \;
However, making echo repeat to standard output what basename would have written to standard output anyway is a little pointless:
find www/*.html -type f -exec sh -c 'basename {}' \;
and we can reduce that still further, of course, to:
find www/*.html -type f -exec basename {} \;
Could you also explain the difference between single quotes and double quotes here?
This is routine shell behaviour. Let's take a slightly different command (but only slightly — the names of the files could be anywhere under the www directory, not just one level down), and look at the single-quote (SQ) and double-quote (DQ) versions of the command:
find www -name '*.html' -type f -exec sh -c "echo $(basename {})" \; # DQ
find www -name '*.html' -type f -exec sh -c 'echo $(basename {})' \; # SQ
The single quotes pass the material enclosed direct to the command. Thus, in the SQ command line, the shell that launches find removes the enclosing quotes and the find command sees its $9 argument as:
echo $(basename {})
because the shell removes the quotes. By comparison, the material in the double quotes is processed by the shell. Thus, in the DQ command line, the shell (that launches find — not the one launched by find) sees the $(basename {}) part of the string and executes it, getting back {}, so the string it passes to find as its $9 argument is:
echo {}
Now, when find does its -exec action, in both cases it replaces the {} by the filename that it just found (for sake of argument, www/pics/index.html). Thus, you get two different commands being executed:
sh -c 'echo $(basename www/pics/index.html)' # SQ
sh -c "echo www/pics/index.html" # DQ
There's a (slight) notational cheat going on there — those are the equivalent commands that you'd type at the shell. The $2 of the shell that is launched actually has no quotes in it in either case — the launched shell does not see any quotes.
As you can see, the DQ command simply echoes the file name; the SQ command runs the basename command and captures its output, and then echoes the captured output. A little bit of reductionist thinking shows that the DQ command could be written as -print instead of using -exec, and the SQ command could be written as -exec basename {} \;.
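The difference can be reproduced in a scratch directory. This sketch assumes a find that substitutes {} even when it is embedded in a longer argument (GNU find does; POSIX leaves that case to the implementation). The variable names demo, dq and sq are arbitrary.

```shell
# Scratch tree mirroring the example path used above.
demo=$(mktemp -d)
mkdir -p "$demo/www/pics"
: > "$demo/www/pics/index.html"

# DQ: the calling shell runs $(basename {}) first, so find receives "echo {}".
dq=$(cd "$demo" && find www -name '*.html' -type f -exec sh -c "echo $(basename {})" \;)

# SQ: the string reaches find untouched; the inner sh runs basename per file.
sq=$(cd "$demo" && find www -name '*.html' -type f -exec sh -c 'echo $(basename {})' \;)

printf 'DQ: %s\nSQ: %s\n' "$dq" "$sq"
```

With GNU find the DQ line shows www/pics/index.html and the SQ line shows index.html.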
If you're using GNU find, it supports the -printf action which can be followed by Format Directives such that running basename is unnecessary. However, that is only available in GNU find; the rest of the discussion here applies to any version of find you're likely to encounter.
Try this instead :
find www/*.html -type f -printf '%f\n'
If you want to do it with a pipe (more resources needed) :
find www/*.html -type f -print0 | xargs -0 -n1 basename
That's how I batch-resize files with ImageMagick, deriving the output filename from the source:
find . -name header.png -exec sh -c 'convert -geometry 600 {} $(dirname {})/$(basename {} ".png")_mail.png' \;
I had to accomplish something similar, and found that following the practices mentioned for avoiding looping over find's output and using find with sh sidestepped these problems with {} and -printf entirely.
You can try it like this:
find www/*.html -type f -exec sh -c 'echo $(basename $1)' find-sh {} \;
The summary is: don't reference {} directly inside sh -c; instead pass it to sh -c as an argument, and then reference it through a positional parameter inside the sh -c script. The find-sh is just a dummy that takes up $0, which is more useful than letting {} land there, because {} then becomes $1.
I'm assuming the use of echo is really to simplify the concept and test function. There are easier ways to simply echo as others have mentioned, But an ideal use case for this scenario might be using cp, mv, or any more complex commands where you want to reference the found file names more than once in the command and you need to get rid of the path, eg. when you have to specify filename in both source and destination or if you are renaming things.
So for instance, if you wanted to copy only the html documents to your public_html directory (Why? because Example!) then you could:
find www/*.html -type f -exec sh -c 'cp /var/www/$(basename $1) /home/me/public_html/$(basename $1)' find-sh {} \;
Over on unix stackexchange, user wildcard's answer on looping with find goes into some great gems on usage of -exec and sh -c. (You can find it here: https://unix.stackexchange.com/questions/321697/why-is-looping-over-finds-output-bad-practice)

Find throws "paths must precede expression" in script

I am trying to alias find and grep to a line as shown below
alias f='find . -name $1 -type f -exec grep -i $2 '{}' \;'
I intend to run it as
f *.php function
but when I add this to .bash_profile and run it I am hit with
[a@a ~]$ f ss s
find: paths must precede expression
Usage: find [-H] [-L] [-P] [path...] [expression]
How do I resolve this?
Aliases don't accept positional parameters. You'll need to use a function.
f () { find . -name "$1" -type f -exec grep -i "$2" '{}' \; ; }
You'll also need to quote some of your arguments.
f '*.php' function
This defers the expansion of the glob so that find performs it rather than the shell.
Expanding on Dennis Williamson's solution:
f() { find . -name "$1" -type f -print0 | xargs -0 grep -i "$2"; }
Using xargs rather than -exec saves you from spawning a new process for each grep... if you have a lot of files, the overhead can make a difference.
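Another option along the same lines is -exec ... {} +, which batches file names much like xargs does without needing the pipe. A sketch, with findgrep as a made-up function name; the /dev/null argument is a common trick to make grep print the file name even when a batch contains only one file.

```shell
# Hypothetical helper: findgrep PATTERN TEXT
# Case-insensitively search for TEXT in files under . whose names match PATTERN.
findgrep() {
    # {} + appends as many matching paths as fit on one grep command line.
    find . -type f -name "$1" -exec grep -i -- "$2" /dev/null {} +
}
```

As with the function above, the glob has to be quoted at the call site: findgrep '*.php' function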

Resources