Using 'find' to return filenames without extension - linux

I have a directory (with subdirectories), of which I want to find all files that have a ".ipynb" extension. But I want the 'find' command to just return me these filenames without the extension.
I know the first part:
find . -type f -iname "*.ipynb" -print
But how do I then get the names without the "ipynb" extension?
Any replies greatly appreciated...

To return only filenames without the extension, try:
find . -type f -iname "*.ipynb" -execdir sh -c 'printf "%s\n" "${0%.*}"' {} ';'
or (omitting -type f from now on):
find "$PWD" -iname "*.ipynb" -execdir basename {} .ipynb ';'
or:
find . -iname "*.ipynb" -exec basename {} .ipynb ';'
or:
find . -iname "*.ipynb" | sed "s/.*\///; s/\.ipynb//"
however, invoking basename on each file can be inefficient, so @CharlesDuffy's suggestion is:
find . -iname '*.ipynb' -exec bash -c 'printf "%s\n" "${@%.*}"' _ {} +
or:
find . -iname '*.ipynb' -execdir basename -s '.ipynb' {} +
Using + means that we're passing multiple files to each bash instance, so if the whole list fits into a single command line, we call bash only once.
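For illustration (with hypothetical files a.ipynb, b.ipynb and c.ipynb under the current directory), the '+' form batches the names into one call, roughly equivalent to:
bash -c 'printf "%s\n" "${@%.*}"' _ ./a.ipynb ./b.ipynb ./c.ipynb
whereas the ';' form would start a separate bash for each of the three files.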
To print the full path and filename (without extension) on the same line, try:
find . -iname "*.ipynb" -exec sh -c 'printf "%s\n" "${0%.*}"' {} ';'
or:
find "$PWD" -iname "*.ipynb" -print | grep -o "[^\.]\+"
To print full path and filename on separate lines:
find "$PWD" -iname "*.ipynb" -exec dirname "{}" ';' -exec basename "{}" .ipynb ';'

Here's a simple solution:
find . -type f -iname "*.ipynb" | sed 's/\.ipynb$//1'

I found this bash one-liner, which simplifies the process without using find:
for n in *.ipynb; do echo "${n%.ipynb}"; done
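If you also want to descend into subdirectories, a variant of the same loop (a sketch assuming bash 4+ with the globstar option enabled) would be:
shopt -s globstar
for n in **/*.ipynb; do echo "${n%.ipynb}"; done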

If you need the name with the directory but without the extension:
find . -type f -iname "*.ipynb" -exec sh -c 'f=$(basename "$1" .ipynb); d=$(dirname "$1"); echo "$d/$f"' sh {} \;

find . -type f -iname "*.ipynb" | grep -oP '.*(?=[.])'
The -o flag outputs only the matched part. The -P flag matches according to Perl regular expressions. This is necessary to make the lookahead (?=[.]) work.
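For example (hypothetical path), the greedy .* makes the match stop just before the last dot, so the extension is dropped but earlier dots are kept:
printf '%s\n' ./notes/my.notebook.ipynb | grep -oP '.*(?=[.])'
# prints: ./notes/my.notebook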

Perl one-liner
To print what you want:
find . | perl -a -F/ -lne 'print $F[-1] if /.*\.ipynb/'
and, by negating the match, what you do not want:
find . | perl -a -F/ -lne 'print $F[-1] if !/.*\.ipynb/'
NOTE
In Perl you need to escape the dot with a backslash, so the pattern is .*\.ipynb

If none of the characters ".", "i", "p", "y", "n", "b" occur in any file name except in the ".ipynb" suffix, you can try this simpler way using tr (note that tr -d deletes every occurrence of each listed character, not the string as a whole):
find . -type f -iname "*.ipynb" -print | tr -d ".ipbyn"

If you don't know what the extension is, or there are multiple extensions, you could use this:
find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'
and for a list of files with no duplicates (originally differing in path or extension)
find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'|sort|uniq

Another easy way which uses basename is:
find . -type f -iname '*.ipynb' -exec basename -s '.ipynb' {} +
Using + will reduce the number of invocations of the command (manpage):
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number
of matched files. The command line is built in much the same
way that xargs builds its command lines. Only one instance of
'{}' is allowed within the command, and (when find is being
invoked from a shell) it should be quoted (for example, '{}')
to protect it from interpretation by shells. The command is
executed in the starting directory. If any invocation with
the `+' form returns a non-zero value as exit status, then
find returns a non-zero exit status. If find encounters an
error, this can sometimes cause an immediate exit, so some
pending commands may not be run at all. For this reason -exec
my-command ... {} + -quit may not result in my-command
actually being run. This variant of -exec always returns
true.
Using -s with basename accepts multiple filenames and removes the specified suffix (manpage):
-a, --multiple
support multiple arguments and treat each as a NAME
-s, --suffix=SUFFIX
remove a trailing SUFFIX; implies -a
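For example (hypothetical paths), a single call strips both the directories and the suffix from several names at once:
basename -s .ipynb ./a/notebook1.ipynb ./b/notebook2.ipynb
# prints:
# notebook1
# notebook2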

Related

I want to get the output of the find command in a shell script

I am trying to write a script that finds files older than 10 hours in the sub-directories listed in "HS_client_list", and sends the output to the file "find.log".
#!/bin/bash
while IFS= read -r line; do
echo Executing cd /moveit/$line
cd /moveit/$line
#Find files more than 600 minutes (10 hours) old.
find $PWD -type f -iname "*.enc" -mmin +600 -execdir basename '{}' ';' | xargs ls > /home/infa91punv/find.log
done < HS_client_list
However, the script is able to cd into the folders from HS_client_list (this file contains the names of the subdirectories), but the find command (find $PWD -type f -iname "*.enc" -mmin +600 -execdir basename '{}' ';' | xargs ls > /home/infa91punv/find.log) is not working: the output file is empty. When I run the same command directly in the shell it works, but from the script it doesn't.
You are overwriting the file in each iteration.
You can use xargs to perform find on multiple directories; but you have to use an alternate delimiter to avoid having xargs populate the {} in the -execdir command.
sed 's%^%/moveit/%' HS_client_list |
xargs -I '<>' find '<>' -type f -iname "*.enc" -mmin +600 -execdir basename {} \; > /home/infa91punv/find.log
The xargs ls did not seem to perform any useful functionality, so I took it out. Generally, don't use ls in scripts.
With GNU find, you could avoid the call to an external utility, and use the -printf predicate to print just the part of the path name that you care about.
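For example (a sketch assuming GNU find, where %f is the file name with any leading directories removed), the basename call could be replaced entirely:
sed 's%^%/moveit/%' HS_client_list |
xargs -I '<>' find '<>' -type f -iname "*.enc" -mmin +600 -printf '%f\n' > /home/infa91punv/find.log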
For added efficiency, you could invoke a shell to collect the arguments:
sed 's%^%/moveit/%' HS_client_list |
xargs sh -c 'find "$@" -type f -iname "*.enc" -mmin +600 -execdir basename {} \;' _ >/home/infa91punv/find.log
This will run as many directories as possible in a single find invocation.
If you want to keep your loop, the solution is to put the redirection after done. I would still factor out the cd, and take care to quote the variable interpolation.
while IFS= read -r line; do
find /moveit/"$line" -type f -iname "*.enc" -mmin +600 -execdir basename '{}' ';'
done < HS_client_list >/home/infa91punv/find.log

Linux find command get all text in the file and print file path

I need to get all the text in the matching files in the folder, but at the same time I need the matching file's path as well. How can I get the file path too, using the following command?
find . -type f -name release.txt | xargs cat
try
find . -type f -name release.txt -exec grep -il {} \; | xargs cat
Skip xargs, just do:
find . -type f -name release.txt -exec sh -c 'echo "$1"; cat "$1"' _ {} \;
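If there can be many matching files, a variant (a sketch using the '+' form to batch names into fewer shell invocations) is:
find . -type f -name release.txt -exec sh -c 'for f; do echo "$f"; cat "$f"; done' _ {} +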

How to pipe the results of 'find' to mv in Linux

How do I pipe the results of a 'find' (in Linux) to be moved to a different directory? This is what I have so far.
find ./ -name '*article*' | mv ../backup
but it's not right yet (I get a "missing file argument" error, because I didn't specify a file; I was trying to get it from the pipe).
find ./ -name '*article*' -exec mv {} ../backup \;
OR
find ./ -name '*article*' | xargs -I '{}' mv {} ../backup
xargs is commonly used for this, and mv on Linux has a -t option to facilitate that.
find ./ -name '*article*' | xargs mv -t ../backup
If your find supports -exec ... \+ you could equivalently do
find ./ -name '*article*' -exec mv -t ../backup {} \+
The -t option is a GNU extension, so it is not portable to systems which do not have GNU coreutils (though every proper Linux I have seen has that, with the possible exception of Busybox). For complete POSIX portability, it's of course possible to roll your own replacement, maybe something like
find ./ -name '*article*' -exec sh -c 'mv "$@" "$0"' ../backup {} \+
where we shamelessly abuse the convenient fact that the first argument after sh -c 'commands' ends up as the "script name" parameter in $0 so that we don't even need to shift it.
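To illustrate with hypothetical matches ./a and ./b, find ends up running the equivalent of:
sh -c 'mv "$@" "$0"' ../backup ./a ./b
# "$0" is ../backup and "$@" is ./a ./b, so the inner command is: mv ./a ./b ../backup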
See also https://mywiki.wooledge.org/BashFAQ/020
I found this really useful when there are thousands of files in one folder; it moves all PNGs among the first 10000 files to the subfolder png:
ls -U | head -10000 | egrep '\.png$' | xargs -I '{}' mv {} ./png
mv $(find . -name '*article*') ../backup
Here are a few solutions.
find . -type f -newermt "2019-01-01" ! -newermt "2019-05-01" \
-exec mv {} path \;
or
find path -type f -newermt "2019-01-01" ! -newermt "2019-05-01" \
-exec mv {} path \;
or
find /Directory/filebox/ -type f -newermt "2019-01-01" \
! -newermt "2019-05-01" -exec mv {} ../filemove/ \;
The backslash + newline is just for legibility; you can equivalently use a single long line.
xargs is your buddy here (when you have multiple actions to take)!
And using it this way gives you good control as well.
find ./ -name '*article*' | xargs -I '{}' sh -c 'mv "{}" <path/to/target/directory>'
Explanation:
-I '{}'
Read one line of input at a time and substitute it for every {} in the command
sh -c
The shell command to execute, with the line substituted in as described above
'mv "{}" <path/to/target/directory>'
The move command takes two arguments:
1) the line from find, i.e. {}, which xargs substitutes automatically
2) the target path for the move command, as specified
Note: the outer quotes keep the whole mv command as a single argument to sh, and the inner quotes around {} help with file names that contain spaces

bash: complex test in find command

I would like to do something like:
find . -type f -exec test $(file --brief --mime-type '{}' ) == 'text/html' \; -print
but I can't figure out the correct way to quote or escape the args to test, especially the '$(' ... ')' .
You cannot simply escape the arguments for passing them to find.
Any shell expansion will happen before find is run. find will not pass its arguments through a shell, so even if you escape the shell expansion, everything will simply be treated as literal arguments to the test command, not expanded by the shell as you are expecting.
The best way to achieve what you want would be to write a short shell script, which takes the filename as an argument, and use -exec on that:
find . -type f -exec is_html.sh {} \; -print
with is_html.sh:
#!/bin/sh
test "$(file --brief --mime-type "$1")" = 'text/html'
If you really want it all on one line, without using a separate script, you can invoke sh directly from find:
find . -type f -exec sh -c 'test "$(file --brief --mime-type "$0")" = "text/html"' {} \; -print
Although it may be possible to turn it into one wildly quoted statement, it is often easier, and clearer, to be a little more verbose:
$ find . -type f -print0 | xargs -0 file --mime-type |
    grep ':[^:]*text/html$' | sed 's,:[^:]*text/html,,'
Use "{}" instead, for an example this simply lists file types:
find * -maxdepth 0 -exec file "{}" \;

how to find files containing a string using egrep

I would like to find the files containing specific string under linux.
I tried something like but could not succeed:
find . -name *.txt | egrep mystring
Here you are sending the file names (output of the find command) as input to egrep; you actually want to run egrep on the contents of the files.
Here are a couple of alternatives:
find . -name "*.txt" -exec egrep mystring {} \;
or even better
find . -name "*.txt" -print0 | xargs -0 egrep mystring
Check the find documentation to see what the individual arguments do.
The first approach will spawn a new process for every file, while the second will pass more than one file as argument to egrep; the -print0 and -0 flags are needed to deal with potentially nasty file names (allowing to separate file names correctly even if a file name contains a space, for example).
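A quick way to see the difference (with a hypothetical file name containing a space):
touch 'my notes.txt'
find . -name "*.txt" | xargs egrep mystring              # the name is split into ./my and notes.txt
find . -name "*.txt" -print0 | xargs -0 egrep mystring   # the name is passed intact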
try:
find . -name '*.txt' | xargs egrep mystring
There are two problems with your version:
Firstly, *.txt will first be expanded by the shell, giving you a listing of files in the current directory which end in .txt, so for instance, if you have the following:
[dsm@localhost:~]$ ls *.txt
test.txt
[dsm@localhost:~]$
your find command will turn into find . -name test.txt. Just try the following to illustrate:
[dsm@localhost:~]$ echo find . -name *.txt
find . -name test.txt
[dsm@localhost:~]$
Secondly, egrep does not take filenames from STDIN. To convert them to arguments you need to use xargs
find . -name *.txt | egrep mystring
That will not work, as egrep will be searching for mystring within the output generated by find . -name *.txt, which is just the paths of the *.txt files.
Instead, you can use xargs:
find . -name *.txt | xargs egrep mystring
You could use
find . -iname '*.txt' -exec egrep mystring {} \;
Here's an example that will return the file paths of all *.log files that have a line beginning with ERROR:
find . -name "*.log" -exec egrep -l '^ERROR' {} \;
there's a recursive option in egrep you can use:
egrep -R "pattern" *.log
If you only want the filenames:
find . -type f -name '*.txt' -exec egrep -l pattern {} \;
If you want filenames and matches:
find . -type f -name '*.txt' -exec egrep pattern {} /dev/null \;
(The extra /dev/null argument ensures egrep always sees more than one file name, so it prefixes each match with the file name.)
