Find files with a certain extension that exceed a certain file size - linux

I'm having trouble with the find command in bash.
I'm trying to find a file that ends with .c and has a file size bigger than 2000 bytes. I thought it would be:
find $HOME -type f -size +2000c .c$
But obviously that isn't correct.
What am I doing wrong?

find $HOME -type f -name "*.c" -size +2000c
Have a look at the -name switch in the man page:
-name pattern
Base of file name (the path with the leading directories
removed) matches shell pattern pattern. The metacharacters
(`*', `?', and `[]') match a `.' at the start of the base name
(this is a change in findutils-4.2.2; see section STANDARDS CON‐
FORMANCE below). To ignore a directory and the files under it,
use -prune; see an example in the description of -path. Braces
are not recognised as being special, despite the fact that some
shells including Bash imbue braces with a special meaning in
shell patterns. The filename matching is performed with the use
of the fnmatch(3) library function. Don't forget to enclose
the pattern in quotes in order to protect it from expansion by
the shell.
Note the suggestion at the end to always enclose the pattern in quotes. The order of the tests is not relevant here. Have another look at the man page:
EXPRESSIONS
The expression is made up of options (which affect overall operation
rather than the processing of a specific file, and always return true),
tests (which return a true or false value), and actions (which have
side effects and return a true or false value), all separated by opera‐
tors. -and is assumed where the operator is omitted.
If the expression contains no actions other than -prune, -print is per‐
formed on all files for which the expression is true.
So the tests are, by default, connected with an implicit -and operator: all of them have to be true for a file to be reported, and their order does not change which files match. The order becomes relevant only for more complicated expressions that use operators other than -and.
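A minimal sketch of the two tests combined, run against a throwaway directory rather than $HOME. The file names and sizes are made up for illustration; -size +2000c selects files larger than 2000 bytes.

```bash
# Scratch directory with three hypothetical files.
tmp=$(mktemp -d)
head -c 3000 /dev/zero > "$tmp/big.c"     # 3000 bytes, .c: should match
head -c 100  /dev/zero > "$tmp/small.c"   # 100 bytes, .c: too small
head -c 3000 /dev/zero > "$tmp/big.txt"   # 3000 bytes, .txt: wrong extension

# Both tests are joined by the implicit -and; the pattern is quoted
# so that the shell hands *.c to find unchanged.
matches=$(find "$tmp" -type f -name '*.c' -size +2000c)
echo "$matches"

rm -rf "$tmp"
```

Only big.c passes both tests, whichever of the two is written first.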

Try this:
find $HOME -type f -size +2000c -name '*.c'

Related

find -type f also returns non-matching files in matching directory

all files/folders in current directory:
./color_a.txt
./color_b.txt
./color_c.txt
./color/color_d.txt
./color/blue.txt
./color/red.txt
./color/yellow.txt
command used to find all files with the word color in name:
find ./*color* -type f
result:
./color_a.txt
./color_b.txt
./color_c.txt
./color/color_d.txt
./color/blue.txt
./color/red.txt
./color/yellow.txt
expected result:
./color_a.txt
./color_b.txt
./color_c.txt
./color/color_d.txt
The result also includes all the non-matching file names under a matching parent directory.
How could I get ONLY files with names directly matching the color pattern?
Thanks a lot!
What you probably want for filename filtering is a simple -name <glob-pattern> test:
find -name '*color*' -type f
From man find:
-name pattern
Base of file name (the path with the leading directories removed) matches shell
pattern pattern. Because the leading directories are removed, the file names
considered for a match with -name will never include a slash, so `-name a/b' will
never match anything (you probably need to use -path instead).
Just as a side note, when you wrote:
find ./*color* -type f
the shell expanded the (unquoted) glob pattern ./*color*, and what was really executed (what find saw) was this:
find ./color ./color_a.txt ./color_b.txt ./color_c.txt -type f
thus producing a list of files in all of those locations.
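To see the difference for yourself, here is a small sketch that rebuilds the layout from the question in a scratch directory and counts what each form returns. The scratch directory and the variable names are only for illustration.

```bash
# Recreate the layout from the question in a scratch directory.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir color
touch color_a.txt color_b.txt color_c.txt \
      color/color_d.txt color/blue.txt color/red.txt color/yellow.txt

# Unquoted glob: the shell expands it, so find gets ./color as a start
# point and lists every regular file below it.
expanded=$(find ./*color* -type f | wc -l)

# Quoted pattern: find applies it to each base name on its own.
by_name=$(find . -name '*color*' -type f | wc -l)

echo "start-point glob: $expanded files, -name test: $by_name files"

cd / && rm -rf "$tmp"
```

The first form reports all seven files, the second only the four whose own name contains color.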
You can also use the -regex test, which is matched against the whole path rather than just the base name:
find -regex ".*color_.*" -type f

How to find files with specific pattern in directory with specific number? Linux

I've got folders named folder1 all the way up to folder150 and maybe beyond, but I only want to find the complete path to text files in some of the folders (for example folder1 to folder50).
I thought a command like the following might work, but it is incorrect.
find '/path/to/directory/folder{1..50}' -name '*.txt'
The solution doesn't have to use find, as long as it does the correct thing.
find /path/to/directory/folder{1..50} -name '*.txt' 2>/dev/null
Or only basename
find /path/to/directory/folder{1..50} -name '*.txt' -exec basename {} \; 2>/dev/null
Or basename without .txt
find /path/to/directory/folder{1..50} -name '*.txt' -exec basename {} .txt \; 2>/dev/null
V. Michel's answer directly solves your problem; to complement it with an explanation:
Bash's brace expansion is only applied to unquoted strings; your solution attempt uses a single-quoted string, whose contents are by definition interpreted as literals.
Contrast the following two statements:
# WRONG:
# {...} inside a single-quoted (or double-quoted) string: interpreted as *literal*.
echo 'folder{1..3}' # -> 'folder{1..3}'
# OK:
# Unquoted use of {...} -> *brace expansion* is applied.
echo 'folder'{1..3} # -> 'folder1 folder2 folder3'
Note how only the brace expression is left unquoted in the 2nd example above, which demonstrates that you can selectively mix quoted and unquoted substrings in Bash.
It is worth noting that it is - and can only be - Bash that performs brace expansion here, and find only sees the resulting, literal paths.[1]
find only accepts literal paths as filename operands.
(Some of find's primaries (tests), such as -name and -path, do support globs (as demonstrated in the question), but not brace expansion; to ensure that such globs are passed through intact to find, without premature expansion by Bash, they must be quoted; e.g., -name '*.txt')
[1] After Bash performs brace expansion, globbing (pathname expansion) may occur in addition, as demonstrated in ehaymore's answer; folder{?,[1-4]?,50} is brace-expanded to tokens folder?, folder[1-4]?, and folder50, the first two of which are subject to globbing, due to containing pattern metacharacters (?, [...]). Whether globbing is involved or not, the target program ultimately only sees the resulting literal paths.
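A quick way to check what the target program actually receives is to count its arguments. This is only a sketch; count_args is a hypothetical helper, and the snippet assumes it is run by Bash, since brace expansion is not a POSIX sh feature.

```bash
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many arguments it got.
count_args() { echo "$#"; }

quoted=$(count_args 'folder{1..3}')            # braces inside quotes stay literal
unquoted=$(count_args folder{1..3})            # Bash expands before the call
mixed=$(count_args '/some dir/folder'{1..3})   # quoted prefix, unquoted braces

echo "quoted=$quoted unquoted=$unquoted mixed=$mixed"
```

The quoted form arrives as a single literal argument, while the other two arrive as three separate arguments.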
You can give multiple directories to the find command, each matching part of the pattern you're looking for. For example,
find /path/to/directory/folder{?,[1-4]?,50} -name '*.txt'
which expands to three patterns:
folder? (folder plus any one character, so folder0 to folder9)
folder[1-4]? (folder10 to folder49)
folder50
The question mark is a single-character wildcard.
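If you want to convince yourself that the three patterns cover exactly folder1 to folder50, a sketch along these lines can help. The layout (folder1 to folder60, each holding a notes.txt) is invented for the test, and Bash is assumed.

```bash
#!/usr/bin/env bash
tmp=$(mktemp -d)
# Hypothetical layout: folder1 .. folder60, each holding one text file.
for i in $(seq 1 60); do
    mkdir "$tmp/folder$i"
    touch "$tmp/folder$i/notes.txt"
done

# Brace expansion yields three patterns; globbing then resolves the
# first two against the directories that actually exist.
set -- "$tmp"/folder{?,[1-4]?,50}
dir_count=$#

found=$(find "$tmp"/folder{?,[1-4]?,50} -name '*.txt' | wc -l)
echo "directories: $dir_count, text files: $found"

rm -rf "$tmp"
```

Both counts come out as 50, so folder51 and above are left out.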

Linux find command shell expansion

I have just a little question I don't understand with the find command.
I can do this :
ls /proc/*/fd
But this gives me an error:
find /proc/*/fd -ls
find: `/proc/*/fd': No such file or directory
even if I use "/proc/*/fd", /proc/"*"/fd or "/proc/*/fd"
I've searched for what the find documentation says about shell expansion, but I found nothing. Can someone tell me why?
Thanks
If you just RTFM, you'll learn that the syntax for find is:
find [-H] [-L] [-P] [-D debugopts] [-Olevel] [path...] [expression]
The usually used subset of that is:
find whereToSearch (-howToSearch arg)*
To find all files|directories named fd in /proc:
find /proc -name fd
-name is the most common howToSearch expression:
-name pattern
Base of file name (the path with the leading directories
removed) matches shell pattern pattern. The metacharacters
(`*', `?', and `[]') match a `.' at the start of the base name
(this is a change in findutils-4.2.2; see section STANDARDS CON‐
FORMANCE below). To ignore a directory and the files under it,
use -prune; see an example in the description of -path. Braces
are not recognised as being special, despite the fact that some
shells including Bash imbue braces with a special meaning in
shell patterns. The filename matching is performed with the use
of the fnmatch(3) library function. Don't forget to enclose
the pattern in quotes in order to protect it from expansion by
the shell.
(Note the last sentence.)
If your pattern contains slashes, you need -path or -wholename (same thing):
find /proc/ -wholename '/proc/[0-9]*/fd' 2>/dev/null
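Here is a small sketch of the difference between -name and -path, using a scratch tree instead of /proc. The directory names 100 and 200 are stand-ins for process IDs.

```bash
tmp=$(mktemp -d)
mkdir -p "$tmp/100" "$tmp/200"
touch "$tmp/100/fd" "$tmp/200/fd" "$tmp/100/status"

# -name sees only the base name, so a pattern with a slash never matches
# (GNU find prints a warning about it, silenced here).
by_name=$(find "$tmp" -name '100/fd' 2>/dev/null | wc -l)

# -path is matched against the whole path, slashes included.
by_path=$(find "$tmp" -path "$tmp/[0-9]*/fd" | wc -l)

echo "-name: $by_name, -path: $by_path"
rm -rf "$tmp"
```

The -name form finds nothing, while the -path form finds both fd entries.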
Other expressions you might want to use are:
-type
-depth, -mindepth, -maxdepth
-user, -uid
See find(1) to learn more about each search expression. If you want to search the in-terminal manual (man find or man 1 find), you can use the / character to enter search mode (like Ctrl+F in most GUI apps).
Usage of ls with globbing (*) is generally a code smell. Unless you use the -d flag, it'll list the contents of the directories that match the glob pattern in addition to the matches.
I find the echo globpattern form generally more convenient for viewing the results of a glob pattern match.
This works:
find /proc/ -path '/proc/*/fd' -ls
Regards.

Using Perl-based rename command with find in Bash

I just stumbled upon Perl today while playing around with Bash scripting. When I tried to remove blank spaces in multiple file names, I found this post, which helped me a lot.
After a lot of struggling, I finally understand the rename and substitution commands and their syntax. I wanted to try to replace all "_(x)" at the end of file names with "x", due to duplicate files. But when I try to do it myself, it just does not seem to work. I have three questions with the following code:
Why is nothing executed when I run it?
I used redirection to show me the success note as an error, so I know what happened. What did I do wrong about that?
After a lot of research, I still do not entirely understand file descriptors and redirection in Bash, or the syntax for the substitute function in Perl. Can somebody give me a link to a good tutorial?
find -name "*_(*)." -type f | \
rename 's/)././g' && \
find -name "*_(*." -type f | \
rename 's/_(//g' 2>&1
You either need to use xargs or you need to use find's ability to execute commands:
find -name "*_(*)." -type f | xargs rename 's/)././g'
find -name "*_(*." -type f | xargs rename 's/_(//g'
Or:
find -name "*_(*)." -type f -exec rename 's/)././g' {} +
find -name "*_(*." -type f -exec rename 's/_(//g' {} +
In both cases, the file names are added to the command line of rename. As it was, rename would have to read its standard input to discover the file names — and it doesn't.
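The difference is easy to see with a command that, like rename, does not read file names from standard input. In this sketch echo stands in for rename, and the two file names are made up.

```bash
# echo, like rename, ignores standard input; it stands in for rename here.
piped=$(printf '%s\n' a.txt b.txt | echo renaming:)
with_xargs=$(printf '%s\n' a.txt b.txt | xargs echo renaming:)

echo "$piped"        # the names never reach the command
echo "$with_xargs"   # xargs appended them as arguments
```

With the plain pipe the command runs without any file names; with xargs it receives a.txt and b.txt on its command line.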
Does the first find find the files you want? Is the dot at the end of the pattern needed? Do the regexes do what you expect? OK, let's debug some of those too.
You could do it all in one command with a more complex regex:
find . -name "*_(*)" -type f -exec rename 's/_\((\d+)\)$/$1/' {} +
The find pattern is corrected to drop the requirement of a trailing dot. If the _(x) is inserted before the extension, then you'd need "*_(*).*" as the pattern for find (and you'll need to revise the Perl regexes).
The Perl substitute needs dissection:
The \( matches an open parenthesis.
The ( starts a capture group.
The \d+ looks for 'one or more digits'.
The ) stops the capture group. It is the first and only, so it is given the number 1.
The \) matches a close parenthesis.
The $ matches the end of the file name.
The $1 in the replacement puts the value of capture group 1 into the replacement text.
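To try the pattern on a name without renaming anything, you can feed a string through a substitution. The sketch below uses sed -E as a stand-in for Perl, so \d+ becomes [0-9]+ and $1 becomes \1; strip_copy_suffix is a hypothetical helper name.

```bash
# sed -E stands in for Perl's s///; [0-9]+ replaces \d+, \1 replaces $1.
strip_copy_suffix() {
    printf '%s\n' "$1" | sed -E 's/_\(([0-9]+)\)$/\1/'
}

renamed=$(strip_copy_suffix 'report_(2)')
untouched=$(strip_copy_suffix 'report_(draft)')
echo "$renamed $untouched"
```

A name ending in _(2) loses the underscore and parentheses, while a name whose parentheses do not contain digits is left alone.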
In your code, the 2>&1 sent the error messages from the second rename command to standard output instead of standard error. That really doesn't help much here.
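A minimal sketch of what 2>&1 does, assuming a hypothetical emit function that writes one line to each stream:

```bash
# emit is a hypothetical command that writes one line to each stream.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

only_stdout=$(emit 2>/dev/null)   # stderr discarded
merged=$(emit 2>&1)               # stderr duplicated onto stdout

printf 'stdout only: %s\n' "$only_stdout"
printf 'merged:\n%s\n' "$merged"
```

Command substitution captures only standard output, so the stderr line shows up in the captured text only when 2>&1 is present.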
You need two separate tutorials; you are not going to find one tutorial that covers I/O redirection in Bash and regular expressions in Perl.
The 'official' Perl regular expression tutorial is:
perlretut, also available as perldoc perlretut on your machine.
The Bash manual covers I/O redirection, but it is somewhat terse:
I/O Redirections.

Problem using 'find' in BASH

I'm following this guide to get some basic skills in Linux.
In the exercises section of chapter 3, there are two exercises:
*Change to your home directory. Create a new directory and copy all
the files of the /etc directory into it. Make sure that you also copy
the files and directories which are in the subdirectories of /etc!
(recursive copy)
*Change into the new directory and make a directory for files starting
with an upper case character and one for files starting with a lower
case character. Move all the files to the appropriate directories. Use
as few commands as possible.
The first part was simple but I have encountered problems in the second part (although I thought it should be simple as well).
I did the first part successfully - that is, I have a copy of the /etc folder in ~/newetc - with all the files copied recursively into subdirectories.
I've created ~/newetc/upper and ~/newetc/lower directories.
My intention was to do something like mv 'find ... ' ./upper for example.
But first I thought I should make sure that I can find all the files with upper/lower case names separately. At this I failed.
I thought that find ~/newetc [A-Z].* (also tried: find ~/newetc -name [A-Z].*) would find all the upper case files, but it simply returns no results.
What's even stranger: find ~/newetc -name [a-z].* returns only two files, although of course there are a lot more than that...
Any idea what I am doing wrong?
Thank you for your time!
Edit: (I have tried to read the man page for the find command, by the way, but didn't come up with anything)
The -name argument does not take a full regular expression by default. So [A-Z].* will match only if the second character is a dot.
Use the expression [A-Z]*, or use -regex and -regextype to match using a real regex.
You need to use quotes:
find ~/newetc -name "[A-Z]*"
find ~/newetc -name "[a-z]*"
If you want to use regexp, then you must use -regex (or -iregex).
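For finding the files, the other answers tell you how to do it.
For moving the results of find, use the -exec action (while being in newetc). Restrict the match to regular files with -type f and give mv only the target directory: {} expands to the full relative path (for example ./sub/File), so upper/{} would point into a subdirectory of upper that does not exist. The ! -path test keeps find from picking up files it has already moved:
find . -type f -name "[A-Z]*" ! -path "./upper/*" -exec mv {} upper/ \;
find . -type f -name "[a-z]*" ! -path "./lower/*" -exec mv {} lower/ \;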
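Before running a move like this on the real copy, it may be worth trying it on a scratch tree. The sketch below is one way to do that; the file names are invented, LC_ALL=C is an assumption added so that [A-Z] and [a-z] are plain ASCII ranges, and files that share a base name in different subdirectories would overwrite each other in the target.

```bash
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir -p upper lower sub
touch Alpha sub/Beta gamma sub/delta

# Move regular files only, and skip anything already inside the target.
LC_ALL=C find . -type f -name '[A-Z]*' ! -path './upper/*' -exec mv {} upper/ \;
LC_ALL=C find . -type f -name '[a-z]*' ! -path './lower/*' -exec mv {} lower/ \;

upper_files=$(ls upper | sort | tr '\n' ' ')
lower_files=$(ls lower | sort | tr '\n' ' ')
echo "upper: $upper_files"
echo "lower: $lower_files"

cd / && rm -rf "$tmp"
```

Alpha and Beta end up in upper, delta and gamma in lower, regardless of which subdirectory they started in.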
The -name parameter takes a glob, not a regular expression (those are both very useful pages). So the dot does not have a special meaning for this parameter - It is interpreted as a literal dot character. Also, in a regular expression the * means "0 or more of the previous expression" while in a glob it means "any number of any character." So, as others have pointed out, the following should get you any files below the current directory which start with an uppercase character:
find . -name '[A-Z]*'
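The effect of the stray dot can be checked on three made-up files. This is only a sketch; LC_ALL=C is an assumption added so that [A-Z] means the ASCII uppercase letters.

```bash
tmp=$(mktemp -d)
touch "$tmp/Alpha" "$tmp/B.conf" "$tmp/beta"

# Glob: one uppercase letter, then anything.
glob_any=$(LC_ALL=C find "$tmp" -type f -name '[A-Z]*' | wc -l)
# Glob: one uppercase letter, a literal dot, then anything.
glob_dot=$(LC_ALL=C find "$tmp" -type f -name '[A-Z].*' | wc -l)

echo "[A-Z]* matched $glob_any, [A-Z].* matched $glob_dot"
rm -rf "$tmp"
```

The first pattern matches Alpha and B.conf; the second matches only B.conf, because its second character is a dot.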
If you want to find all the names beginning with a capital letter, you have to use
find . -name "[A-Z]*"
NOT
find [A-Z].*
otherwise you will try to locate all the files that begin with a capital letter and have a . right after it

Resources