Remove files not containing a specific string - linux

I want to find the files not containing a specific string (in a directory and its sub-directories) and remove those files. How I can do this?

The following will work:
find . -type f -print0 | xargs --null grep -Z -L 'my string' | xargs --null rm
This first uses find to print the names of all the files in the current directory and any subdirectories. These names are printed with a null terminator rather than the usual newline separator (try piping the output to od -c to see the effect of the -print0 argument).
Then the --null parameter to xargs tells it to accept null-terminated inputs. xargs will then call grep on a list of filenames.
The -Z argument to grep works like the -print0 argument to find, so grep will print out its results null-terminated (which is why the final call to xargs needs a --null option too). The -L argument to grep causes grep to print the filenames of those files on its command line (that xargs has added) which don't match the regular expression:
my string
If you want simple matching without regular expression magic then add the -F option. If you want more powerful regular expressions then give a -E argument. It's a good habit to use single quotes rather than double quotes, as this protects you against any shell magic being applied to the string (such as variable substitution).
Finally you call xargs again to get rid of all the files that you've found with the previous calls.
The problem with calling grep directly from the find command with -exec ... \; is that grep then gets invoked once per file rather than once for a whole batch of files as xargs does. Batching is much faster if you have lots of files. Also don't be tempted to do stuff like:
rm $(some command that produces lots of filenames)
It's always better to pass it to xargs as this knows the maximum command-line limits and will call rm multiple times each time with as many arguments as it can.
Note that this solution would have been simpler without the need to cope with files containing white space and new lines.
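To try the pipeline safely before pointing it at real data, here is a minimal sketch that runs it in a throwaway directory. The directory layout and file names are made up for illustration, and it assumes GNU find, grep and xargs (-0 is the short form of --null).

```shell
# Build a scratch tree, including names with spaces, so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'has my string\n' > "$demo/keep me.txt"
printf 'nothing here\n'  > "$demo/sub dir/drop me.txt"

# Preview: list the files that do NOT contain the string, one per line.
preview=$(find "$demo" -type f -print0 | xargs -0 grep -Z -L -F 'my string' | tr '\0' '\n')
printf 'would remove: %s\n' "$preview"

# Same pipeline, now ending in rm instead of the preview.
find "$demo" -type f -print0 | xargs -0 grep -Z -L -F 'my string' | xargs -0 rm --

# Only the file that contains the string should be left.
remaining=$(find "$demo" -type f)
printf 'left: %s\n' "$remaining"
```

With GNU xargs you may also want -r on the last xargs so that rm is not run at all when nothing is selected.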
Alternatively
grep -r -L -Z 'my string' . | xargs --null rm
will work too (and is shorter). The -r argument to grep causes it to read all files in the directory and recursively descend into any subdirectories. Use the find ... approach if you want to do some other tests on the files as well (such as age or permissions).
Note that any of the single letter arguments, with a single dash introducer, can be grouped together (for instance as -rLZ). But note also that find does not use the same conventions and has multi-letter arguments introduced with a single dash. This is for historical reasons and hasn't ever been fixed because it would have broken too many scripts.

GNU grep and bash.
grep -rLZ "$str" . | while IFS= read -rd '' x; do rm "$x"; done
Use a find solution if portability is needed. This is slightly faster.

EDIT: This is how you SHOULD NOT do this! Reason is given here. Thanks to #ormaaj for pointing it out!
find . -type f | grep -v "exclude string" | xargs rm
Note: grep pattern will match against full file path from current directory (see find . -type f output)

One possibility is
find . -type f '!' -exec grep -q "my string" {} \; -exec echo rm {} \;
You can remove the echo if the output of this preview looks correct.
The equivalent with -delete is
find . -type f '!' -exec grep -q "my string" {} \; -delete
but then you don't get the nice preview option.
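If you want both the preview and -delete, one option is to run the same test twice, first with -print and then with -delete. This is only a sketch on a scratch directory with made-up file names; -delete is available in GNU and BSD find but is not POSIX.

```shell
# Scratch directory with one file that contains the string and one that does not.
demo=$(mktemp -d)
printf 'my string\n' > "$demo/a.txt"
printf 'other\n'     > "$demo/b.txt"

# Preview: -print only fires for files where grep -q found no match.
would_delete=$(find "$demo" -type f ! -exec grep -q -F 'my string' {} \; -print)
printf 'would delete: %s\n' "$would_delete"

# Same expression with -delete once the preview looks right.
find "$demo" -type f ! -exec grep -q -F 'my string' {} \; -delete

left=$(find "$demo" -type f)
printf 'left: %s\n' "$left"
```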

To remove files not containing a specific string:
Bash:
To use them, enable the extglob shell option as follows:
shopt -s extglob
And just remove all files whose names don't contain the string "fix":
rm !(*fix*)
If you want to remove all files whose names contain neither "fix" nor "class":
rm !(*fix*|*class*)
Zsh:
To use them, enable the extended glob zsh shell option as follows:
setopt extended_glob
Remove all files whose names don't contain the string, in this example "fix":
rm -- ^*fix*
If you want to remove all files whose names contain neither "fix" nor "class":
rm -- ^(*fix*|*class*)
You can use the same approach for extensions; you only need to change the glob pattern, for example *.zip or *.doc.
Here are the sources:
https://www.tecmint.com/delete-all-files-in-directory-except-one-few-file-extensions/
https://codeday.me/es/qa/20190819/1296122.html

I can think of a few ways to approach this. Here's one: find and grep to generate a list of files with no match, and then xargs rm them.
find yourdir -type f -exec grep -F -L 'yourstring' '{}' + | xargs -d '\n' rm
This assumes GNU tools (grep -L and xargs -d are non-portable) and of course no filenames with newlines in them. It has the advantage of not running grep and rm once per file, so it'll be reasonably fast. I recommend testing it with "echo" in place of "rm" just to make sure it picks the right files before you unleash the destruction.

This worked for me, you can remove the -f if you're okay with deleting directories.
myString="keepThis"
for x in `find ./`
do if [[ -f $x && ! $x =~ $myString ]]
then rm $x
fi
done

Another solution (although not as fast). The top solution didn't work in my case because the string I needed to use in place of 'my string' has special characters.
find -type f ! -name "*my string*" -exec rm {} \; -print

Related

Find and show information from logs inside a folder in linux

I'm trying to create a little script using bash in linux. That allows me to find if there is any tag 103=16 inside a log
I have multiple folders named for example l51prdsrv-api1.nebex.local, l51prdsrv-oe1.nebex.local, etc... inside those folders are .log files like TRADX_gsoe3.log, TRADX_gseuoe2.log, etc... .
I need to find if inside those logs there is the tag 103=16
I'm trying this command
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_" -type f | grep -e 103=16
But what it does is that is showing just the logs names and not the content to see if there is a tag 103=16
First of all, you are not searching files of the form TRADX_something.log, but only files which are just named TRADX_ (case-insensitively, so TradX_ would also be found).
Then you are feeding to grep the names of the files, but never look into the content of those files. From the grep man page, you see that the file content can be supplied either via stdin, or by specifying the file name on the command line. In your case, the latter is the way to go. Therefore you can either do a
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_*.log" -type f -exec grep -F 103=16 {} \;
if you are only interested in the matching lines, or a
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_*.log" -type f -exec grep -F 103=16 {} /dev/null \;
if you also want to see the file names where the pattern matches. The reason is that grep is printing the filename only if it sees more than 1 filename on the command line and the /dev/null provides a second dummy file. find replaces the {} by the filename.
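The effect of the extra /dev/null argument is easy to check on a scratch directory. In this sketch the log file name follows the question, but its content is invented purely for illustration.

```shell
# One fake log file containing the tag (content is made up).
demo=$(mktemp -d)
printf '8=FIX|103=16|\n' > "$demo/TRADX_gsoe3.log"

# One file name per grep call: only the matching line is printed.
without_name=$(find "$demo" -iname 'TRADX_*.log' -type f -exec grep -F '103=16' {} \;)

# Two file names per grep call: grep prefixes each match with the file name.
with_name=$(find "$demo" -iname 'TRADX_*.log' -type f -exec grep -F '103=16' {} /dev/null \;)

printf '%s\n' "$without_name" "$with_name"
```

With GNU grep, -H gives the same file-name prefix without the dummy argument.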
BTW, I used -F for grep instead of your -e, because you don't seem to use any specific regular expression pattern anyway.
But you don't need find for this task. An alternative would be an explicit loop:
shopt -s nocaseglob # make globbing case-insensitive
shopt -s globstar # turn on ** globbing
for f in {.,/opt/FIXLOGS/l51prdsrv*}/**/tradx_*.log
do
[[ -f $f ]] && grep -F 103=16 "$f" /dev/null
done
While the loop looks more complicated at first glance, it is easier to extend the logic in case you want to do more with the files instead of just grepping the lines, for instance taking specific actions on those files which contain the pattern.
You are doing:
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_" -type f | grep -e 103=16
I propose you do:
find . /opt/FIXLOGS/l51prdsrv* -iname "TRADX_*" -type f -exec grep -e "103=16" {} /dev/null \;
What's the difference?
find ... -type f
=> gives you a list of files.
When you add | grep -e 103=16, then you perform that on the filenames.
When you add -exec grep ..., then you perform that on the files themselves.

Bash script to recursively find and replace in files [duplicate]

How do I find and replace every occurrence of:
subdomainA.example.com
with
subdomainB.example.com
in every text file under the /home/www/ directory tree recursively?
find /home/www \( -type d -name .git -prune \) -o -type f -print0 | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
-print0 tells find to print each of the results separated by a null character, rather than a new line. In the unlikely event that your directory has files with newlines in the names, this still lets xargs work on the correct filenames.
\( -type d -name .git -prune \) is an expression which completely skips over all directories named .git. You could easily expand it, if you use SVN or have other folders you want to preserve -- just match against more names. It's roughly equivalent to -not -path '*/.git/*', but more efficient, because rather than checking every file in the directory, it skips it entirely. The -o after it is required because of how -prune actually works.
For more information, see man find.
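To convince yourself that the -prune expression really leaves .git alone, you can run the command on a scratch tree first. A minimal sketch, assuming GNU sed (for -i without a suffix argument); the directory names and file contents are made up.

```shell
# Scratch tree with a fake .git directory and one ordinary file.
demo=$(mktemp -d)
mkdir -p "$demo/.git" "$demo/site"
printf 'subdomainA.example.com\n'      > "$demo/.git/config"
printf 'host subdomainA.example.com\n' > "$demo/site/index.html"

# Same command as above, pointed at the scratch tree.
find "$demo" \( -type d -name .git -prune \) -o -type f -print0 |
  xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'

# The file under .git must be unchanged; the ordinary file must be rewritten.
git_copy=$(cat "$demo/.git/config")
site_copy=$(cat "$demo/site/index.html")
printf '%s\n' "$git_copy" "$site_copy"
```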
The simplest way for me is
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'
Note: Do not run this command on a folder including a git repo - changes to .git could corrupt your git index.
find /home/www/ -type f -exec \
sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
Compared to other answers here, this is simpler than most and uses sed instead of perl, which is what the original question asked for.
All the tricks are almost the same, but I like this one:
find <mydir> -type f -exec sed -i 's/<string1>/<string2>/g' {} +
find <mydir>: look up in the directory.
-type f:
File is of type: regular file
-exec command {} +:
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending
each selected file name at the end; the total number of invocations of the command will be much less than the number of
matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of
`{}' is allowed within the command. The command is executed in the starting directory.
For me the easiest solution to remember is https://stackoverflow.com/a/2113224/565525, i.e.:
sed -i '' -e 's/subdomainA/subdomainB/g' $(find /home/www/ -type f)
NOTE: -i '' solves OSX problem sed: 1: "...": invalid command code .
NOTE: If there are too many files to process you'll get Argument list too long. The workaround - use find -exec or xargs solution described above.
cd /home/www && find . -type f -print0 |
xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
For anyone using silver searcher (ag)
ag SearchString -l0 | xargs -0 sed -i 's/SearchString/Replacement/g'
Since ag ignores git/hg/svn file/folders by default, this is safe to run inside a repository.
This one is compatible with git repositories, and a bit simpler:
Linux:
git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
Mac:
git grep -l 'original_text' | xargs sed -i '' -e 's/original_text/new_text/g'
(Thanks to http://blog.jasonmeridth.com/posts/use-git-grep-to-replace-strings-in-files-in-your-git-repository/)
To cut down on files to recursively sed through, you could grep for your string instance:
grep -rl <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
If you run man grep you'll notice you can also define an --exclude-dir="*.git" flag if you want to omit searching through .git directories, avoiding git index issues as others have politely pointed out.
Leading you to:
grep -rl --exclude-dir="*.git" <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
A straight forward method if you need to exclude directories (--exclude-dir=..folder) and also might have file names with spaces (solved by using 0Byte for both grep -Z and xargs -0)
grep -rlZ oldtext . --exclude-dir=.folder | xargs -0 sed -i 's/oldtext/newtext/g'
One more nice one-liner as an extra, using git grep.
git grep -lz 'subdomainA.example.com' | xargs -0 perl -i'' -pE "s/subdomainA.example.com/subdomainB.example.com/g"
Simplest way to replace (all files, directory, recursive)
find . -type f -not -path '*/\.*' -exec sed -i 's/foo/bar/g' {} +
Note: Sometimes you might need to ignore some hidden files i.e. .git, you can use above command.
If you want to include hidden files use,
find . -type f -exec sed -i 's/foo/bar/g' {} +
In both case the string foo will be replaced with new string bar
find /home/www/ -type f -exec perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
find /home/www/ -type f will list all files in /home/www/ (and its subdirectories).
The "-exec" flag tells find to run the following command on each file found.
perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
is the command run on the files (many at a time). The {} gets replaced by file names.
The + at the end of the command tells find to build one command for many filenames.
Per the find man page:
"The command line is built in much the same way that
xargs builds its command lines."
Thus it's possible to achieve your goal (and handle filenames containing spaces) without using xargs -0, or -print0.
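The difference between \; and + can be made visible by letting the executed command report how many arguments it received. This is just an illustrative sketch with made-up file names; the inner sh -c script prints its argument count.

```shell
# Three files, one of them with a space in its name.
demo=$(mktemp -d)
touch "$demo/a b.txt" "$demo/c.txt" "$demo/d.txt"

# With \; the command runs once per file, so three lines come out.
per_file_runs=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l | tr -d ' ')

# With + the command runs once and receives all three names as separate arguments.
batched_runs=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} + | wc -l | tr -d ' ')
batched_args=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} +)

printf 'runs with \\;: %s, runs with +: %s, args in the batched run: %s\n' \
  "$per_file_runs" "$batched_runs" "$batched_args"
```

The name with a space still counts as a single argument, which is why no -print0 or xargs -0 is needed here.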
I just needed this and was not happy with the speed of the available examples. So I came up with my own:
cd /var/www && ack-grep -l --print0 subdomainA.example.com | xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
Ack-grep is very efficient at finding relevant files. This command processed ~145,000 files in a breeze, whereas the others took so long I couldn't wait until they finished.
or use the blazing fast GNU Parallel:
grep -rl oldtext . | parallel sed -i 's/oldtext/newtext/g' {}
grep -lr 'subdomainA.example.com' | while read file; do sed -i "s/subdomainA.example.com/subdomainB.example.com/g" "$file"; done
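I guess most people don't know that they can pipe something into a "while read file" loop, which avoids those nasty -print0 args while preserving spaces in filenames (use IFS= read -r to also keep leading or trailing whitespace and backslashes intact; filenames containing newlines still break).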
Further adding an echo before the sed allows you to see what files will change before actually doing it.
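A small sketch of what a plain read does to an awkward name, compared with IFS= read -r. The file name here is invented for the demonstration and no file is actually touched.

```shell
# A made-up name with a leading space and a backslash in it.
name=' lead\ing.txt'

# Plain read strips leading whitespace and treats the backslash as an escape.
plain=$(printf '%s\n' "$name" | { read f; printf '%s' "$f"; })

# IFS= keeps the whitespace, -r keeps the backslash.
safe=$(printf '%s\n' "$name" | { IFS= read -r f; printf '%s' "$f"; })

printf 'plain: [%s]\nsafe:  [%s]\n' "$plain" "$safe"
```

So for the loop above, "while IFS= read -r file" is the safer spelling.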
Try this:
sed -i 's/subdomainA/subdomainB/g' `grep -ril 'subdomainA' *`
According to this blog post:
find . -type f | xargs perl -pi -e 's/oldtext/newtext/g;'
#!/usr/local/bin/bash -x
find * /home/www -type f | while read files
do
sedtest=$(sed -n '/^/,/$/p' "${files}" | sed -n '/subdomainA/p')
if [ "${sedtest}" ]
then
sed s'/subdomainA/subdomainB/'g "${files}" > "${files}".tmp
mv "${files}".tmp "${files}"
fi
done
If you do not mind using vim together with grep or find tools, you could follow up the answer given by user Gert in this link --> How to do a text replacement in a big folder hierarchy?.
Here's the deal:
recursively grep for the string that you want to replace in a certain path, and take only the complete path of the matching file. (that would be the $(grep 'string' 'pathname' -Rl).
(optional) if you want to make a pre-backup of those files on centralized directory maybe you can use this also: cp -iv $(grep 'string' 'pathname' -Rl) 'centralized-directory-pathname'
after that you can edit/replace at will in vim following a scheme similar to the one provided on the link given:
:bufdo %s#string#replacement#gc | update
You can use awk to solve this as below,
for file in `find /home/www -type f`
do
awk '{gsub(/subdomainA.example.com/,"subdomainB.example.com"); print $0;}' $file > ./tempFile && mv ./tempFile $file;
done
hope this will help you !!!
For replace all occurrences in a git repository you can use:
git ls-files -z | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
See List files in local git repo? for other options to list all files in a repository. The -z option tells git to separate the file names with a zero byte, which assures that xargs (with the option -0) can separate filenames, even if they contain spaces or whatnot.
A bit old school but this worked on OS X.
There are a few tricks:
• Will only edit files with extension .sls under the current directory
• . must be escaped to ensure sed does not evaluate them as "any character"
• , is used as the sed delimiter instead of the usual /
Also note this is to edit a Jinja template to pass a variable in the path of an import (but this is off topic).
First, verify your sed command does what you want (this will only print the changes to stdout, it will not change the files):
for file in $(find . -name '*.sls' -type f); do echo -e "\n$file: "; sed 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Edit the sed command as needed, once you are ready to make changes:
for file in $(find . -name '*.sls' -type f); do echo -e "\n$file: "; sed -i '' 's,foo\.bar,foo/bar/\"+baz+\"/,g' $file; done
Note the -i '' in the sed command, I did not want to create a backup of the original files (as explained in In-place edits with sed on OS X or in Robert Lujo's comment in this page).
Happy seding folks!
just to avoid to change also
NearlysubdomainA.example.com
subdomainA.example.comp.other
but still
subdomainA.example.com.IsIt.good
(maybe not good in the idea behind domain root)
find /home/www/ -type f -exec sed -i 's/\bsubdomainA\.example\.com\b/subdomainB.example.com/g' {} \;
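To see which of the example strings a word-boundary match touches, you can feed them to sed on stdin instead of editing files. A minimal sketch, assuming GNU sed (\b is a GNU extension) and using a substitution with no backreferences, since the pattern has no capture groups.

```shell
# Run the four sample strings through the word-boundary substitution.
out=$(printf '%s\n' \
  'NearlysubdomainA.example.com' \
  'subdomainA.example.comp.other' \
  'subdomainA.example.com.IsIt.good' \
  'subdomainA.example.com' |
  sed 's/\bsubdomainA\.example\.com\b/subdomainB.example.com/g')

# The first two lines are left alone; the last two are rewritten.
printf '%s\n' "$out"
```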
Here's a version that should be more general than most; it doesn't require find (using du instead), for instance. It does require xargs, which is only found in some versions of Plan 9 (like 9front).
du -a | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
If you want to add filters like file extensions use grep:
du -a | grep "\.scala$" | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
For Qshell (qsh) on IBMi, not bash as tagged by OP.
Limitations of qsh commands:
find does not have the -print0 option
xargs does not have -0 option
sed does not have -i option
Thus the solution in qsh:
# Named SEARCH_PATH rather than PATH so the shell can still locate find, sed and mv.
SEARCH_PATH='your/path/here'
SEARCH=\'subdomainA.example.com\'
REPLACE=\'subdomainB.example.com\'
for file in $( find ${SEARCH_PATH} -P -type f ); do
TEMP_FILE=${file}.${RANDOM}.temp_file
if [ ! -e ${TEMP_FILE} ]; then
touch -C 819 ${TEMP_FILE}
sed -e 's/'$SEARCH'/'$REPLACE'/g' \
< ${file} > ${TEMP_FILE}
mv ${TEMP_FILE} ${file}
fi
done
Caveats:
Solution excludes error handling
Not Bash as tagged by OP
If you wanted to use this without completely destroying your SVN repository, you can tell 'find' to ignore all hidden files by doing:
find . \( ! -regex '.*/\..*' \) -type f -print0 | xargs -0 sed -i 's/subdomainA.example.com/subdomainB.example.com/g'
Using combination of grep and sed
for pp in $(grep -Rl looking_for_string)
do
sed -i 's/looking_for_string/something_other/g' "${pp}"
done
perl -p -i -e 's/oldthing/new_thingy/g' `grep -ril oldthing *`
to change multiple files (and saving a backup as *.bak):
perl -p -i.bak -e "s/\|/x/g" *
will take all files in directory and replace | with x
called a “Perl pie” (easy as a pie)

Convert all EOL (dos->unix) of all files in a directory and sub-directories recursively without dos2unix

How do I convert all EOL (dos->unix) of all files in a directory and sub-directories recursively without dos2unix? (I do not have it and cannot install it.)
Is there a way to do it using tr -d '\r' and pipes? If so, how?
For all files in current directory you can do it with a Perl one-liner: perl -pi -e 's/\r\n/\n/g' * (stolen from here)
EDIT: And with a small modification you can do subdirectory recursion:
find | xargs perl -pi -e 's/\r\n/\n/g'
You can use sed's -i flag to change the files in-place:
find . -type f -exec sed -i 's/\x0d//g' {} \+
If I were you, I would keep the files around to make sure the operation went okay. Then you can delete the temporary files when you get done. This can be done like so:
find . -type f -exec sed -i'.OLD' 's/\x0d//g' {} \+
find . -type f -name '*.OLD' -delete
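Here is a small sketch of the backup-then-clean-up sequence on a scratch file, so you can check the result before deleting the .OLD copies. It assumes GNU sed and uses 's/\r$//' (strip a carriage return only at the end of a line) rather than the \x0d form above; the file name and content are made up.

```shell
# One scratch file with DOS line endings.
demo=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$demo/dos.txt"

# Convert in place, keeping the original as dos.txt.OLD.
find "$demo" -type f -exec sed -i.OLD 's/\r$//' {} +

# Count carriage returns in the converted file and in the backup.
crs_after=$(tr -cd '\r' < "$demo/dos.txt" | wc -c | tr -d ' ')
crs_backup=$(tr -cd '\r' < "$demo/dos.txt.OLD" | wc -c | tr -d ' ')
printf 'CRs after: %s, CRs in backup: %s\n' "$crs_after" "$crs_backup"

# Once satisfied, remove the backups.
find "$demo" -type f -name '*.OLD' -delete
left=$(find "$demo" -type f)
```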
Do you have sane file names and directory names without spaces, etc in them?
If so, it is not too hard. If you've got to deal with arbitrary names containing newlines and spaces, etc, then you have to work harder than this.
tmp=${TMPDIR:-/tmp}/crlf.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
find . -type f -print |
while read name
do
tr -d '\015' < $name > $tmp.1
mv $tmp.1 $name
done
rm -f $tmp.?
trap 0
exit 0
The trap stuff ensures you don't get temporary files left around. There are other tricks you can pull, with more random names for your temporary file names. You don't normally need them unless you work in a hostile environment.
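If the names may contain spaces, one variation on the tr idea is to let find hand the names to a small sh -c loop, which avoids parsing find's output. This is only a sketch: it assumes a mktemp that accepts a path template (as GNU mktemp does), the file name is made up, and because the temporary file replaces the original, the result ends up with the temporary file's permissions.

```shell
# Scratch file with DOS line endings and a space in its name.
demo=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$demo/dos file.txt"

# For each file: write a CR-free copy next to it, then move it over the original.
find "$demo" -type f -exec sh -c '
  for name do
    tmp=$(mktemp "$name.XXXXXX") || exit 1
    tr -d "\015" < "$name" > "$tmp" && mv "$tmp" "$name"
  done' sh {} +

# Verify: no carriage returns left, text otherwise unchanged.
crs=$(tr -cd '\015' < "$demo/dos file.txt" | wc -c | tr -d ' ')
content=$(cat "$demo/dos file.txt")
printf 'CRs left: %s\n' "$crs"
```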
You can also use the editor in batch mode.
find . -type f -exec bash -c 'echo -ne "%s/\\\r//\nx\n" | ex "{}" ' \;
If \r isn't followed by \n (maybe the case in files of Tim Pote):
deleting \r (using tr -d) may remove newlines
replacing \r with \n may not cause double / triple newlines
Maybe Tim Pote could verify the points above for the files he mentioned.
This removes carriage returns from all files in the current directory and all subdirectories, and should work on most Unix-like OSs:
grep -lIUre '\r' | xargs sed -i 's/\r//'
If it's done on Windows:
try running the command in Git Bash:
$ find | xargs perl -pi -e 's/\r\n/\n/g'
It may print some "Can't do inplace edit" messages; you can ignore them.

How do I recursively grep all directories and subdirectories?

How do I recursively grep all directories and subdirectories?
find . | xargs grep "texthere" *
grep -r "texthere" .
The first parameter represents the regular expression to search for, while the second one represents the directory that should be searched. In this case, . means the current directory.
Note: This works for GNU grep, and on some platforms like Solaris you must specifically use GNU grep as opposed to legacy implementation. For Solaris this is the ggrep command.
If you know the extension or pattern of the file you would like, another method is to use --include option:
grep -r --include "*.txt" texthere .
You can also mention files to exclude with --exclude.
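A quick sketch of both options on a scratch directory (file names are made up), assuming GNU grep. Quoting the pattern keeps the shell from expanding it before grep sees it.

```shell
# Two files with the same content but different extensions.
demo=$(mktemp -d)
printf 'texthere\n' > "$demo/a.txt"
printf 'texthere\n' > "$demo/b.log"

# Only search *.txt files.
only_txt=$(grep -r -l --include='*.txt' texthere "$demo")

# Search everything except *.log files.
no_log=$(grep -r -l --exclude='*.log' texthere "$demo")

printf '%s\n' "$only_txt" "$no_log"
```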
Ag
If you frequently search through code, Ag (The Silver Searcher) is a much faster alternative to grep, that's customized for searching code. For instance, it's recursive by default and automatically ignores files and directories listed in .gitignore, so you don't have to keep passing the same cumbersome exclude options to grep or find.
I now always use (even on Windows with GoW -- Gnu on Windows):
grep --include="*.xxx" -nRHI "my Text to grep" *
(As noted by kronen in the comments, you can add 2>/dev/null to suppress permission denied outputs)
That includes the following options:
--include=PATTERN
Recurse in directories only searching file matching PATTERN.
-n, --line-number
Prefix each line of output with the line number within its input file.
(Note: phuclv adds in the comments that -n decreases performance a lot, so you might want to skip that option)
-R, -r, --recursive
Read all files under each directory, recursively; this is equivalent to the -d recurse option.
-H, --with-filename
Print the filename for each match.
-I
Process a binary file as if it did not contain matching data;
this is equivalent to the --binary-files=without-match option.
And I can add 'i' (-nRHIi), if I want case-insensitive results.
I can get:
/home/vonc/gitpoc/passenger/gitlist/github #grep --include="*.php" -nRHI "hidden" *
src/GitList/Application.php:43: 'git.hidden' => $config->get('git', 'hidden') ? $config->get('git', 'hidden') : array(),
src/GitList/Provider/GitServiceProvider.php:21: $options['hidden'] = $app['git.hidden'];
tests/InterfaceTest.php:32: $options['hidden'] = array(self::$tmpdir . '/hiddenrepo');
vendor/klaussilveira/gitter/lib/Gitter/Client.php:20: protected $hidden;
vendor/klaussilveira/gitter/lib/Gitter/Client.php:170: * Get hidden repository list
vendor/klaussilveira/gitter/lib/Gitter/Client.php:176: return $this->hidden;
...
Also:
find ./ -type f -print0 | xargs -0 grep "foo"
but grep -r is a better answer.
globbing **
Using grep -r works, but it may be overkill, especially in large folders.
For more practical usage, here is the syntax which uses globbing syntax (**):
grep "texthere" **/*.txt
which greps only the files matching the selected pattern. It works in shells that support it, such as Bash 4+ or zsh.
To activate this feature in Bash, run: shopt -s globstar.
See also: How do I find all files containing specific text on Linux?
git grep
For projects under Git version control, use:
git grep "pattern"
which is much quicker.
ripgrep
For larger projects, the quickest grepping tool is ripgrep which greps files recursively by default:
rg "pattern" .
It's built on top of Rust's regex engine which uses finite automata, SIMD and aggressive literal optimizations to make searching very fast. Check the detailed analysis here.
On strictly POSIX systems, grep has no -r option, so grep -rn "stuff" . won't run, but the find command will do the job:
find . -type f -exec grep -n "stuff" {} \; -print
Agreed by Solaris and HP-UX.
If you only want to follow actual directories, and not symbolic links,
grep -r "thingToBeFound" directory
If you want to follow symbolic links as well as actual directories (be careful of infinite recursion),
grep -R "thing to be found" directory
Since you're trying to grep recursively, the following options may also be useful to you:
-H: outputs the filename with the line
-n: outputs the line number in the file
So if you want to find all files containing Darth Vader in the current directory or any subdirectories and capture the filename and line number, but do not want the recursion to follow symbolic links, the command would be
grep -rnH "Darth Vader" .
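The -r versus -R difference can be checked with a symlinked directory in a scratch tree. A minimal sketch, assuming a reasonably recent GNU grep (other grep implementations may treat -r differently); the directory and file names are made up.

```shell
# A tree that reaches the only matching file through a symbolic link.
demo=$(mktemp -d)
mkdir "$demo/tree" "$demo/outside"
printf 'Darth Vader\n' > "$demo/outside/quote.txt"
ln -s "$demo/outside" "$demo/tree/link"

# -r skips the symlinked directory, so nothing is found (grep exits non-zero).
no_follow=$(grep -rl 'Darth Vader' "$demo/tree" || true)

# -R follows the symlink and finds the file behind it.
follow=$(grep -Rl 'Darth Vader' "$demo/tree")

printf 'with -r: [%s]\nwith -R: [%s]\n' "$no_follow" "$follow"
```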
If you want to find all mentions of the word cats in the directory
/home/adam/Desktop/TomAndJerry
and you're currently in the directory
/home/adam/Desktop/WorldDominationPlot
and you want to capture the filename but not the line number of any instance of the string "cats", and you want the recursion to follow symbolic links if it finds them, you could run either of the following
grep -RH "cats" ../TomAndJerry #relative directory
grep -RH "cats" /home/adam/Desktop/TomAndJerry #absolute directory
Source:
running "grep --help"
A short introduction to symbolic links, for anyone reading this answer and confused by my reference to them:
https://www.nixtutor.com/freebsd/understanding-symbolic-links/
To find name of files with path recursively containing the particular string use below command
for UNIX:
find . | xargs grep "searched-string"
for Linux:
grep -r "searched-string" .
find a file on UNIX server
find . -type f -name file_name
find a file on LINUX server
find . -name file_name
just the filenames can be useful too
grep -r -l "foo" .
another syntax to grep a string in all files on a Linux system recursively
grep -irn "string"
-r does a recursive search for the specified string in the given directory and its subdirectories
-i ignores case, so upper- and lowercase variants of the string also match
-n prints the line number of each match
NB: this prints a massive result to the console, so you might need to filter the output by piping it and removing the less interesting bits of info. It also searches binary programs, so you might want to filter some of the results.
ag is my favorite way to do this now github.com/ggreer/the_silver_searcher . It's basically the same thing as ack but with a few more optimizations.
Here's a short benchmark. I clear the cache before each test (cf https://askubuntu.com/questions/155768/how-do-i-clean-or-disable-the-memory-cache )
ryan#3G08$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
3
ryan#3G08$ time grep -r "hey ya" .
real 0m9.458s
user 0m0.368s
sys 0m3.788s
ryan#3G08:$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
3
ryan#3G08$ time ack-grep "hey ya" .
real 0m6.296s
user 0m0.716s
sys 0m1.056s
ryan#3G08$ sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
3
ryan#3G08$ time ag "hey ya" .
real 0m5.641s
user 0m0.356s
sys 0m3.444s
ryan#3G08$ time ag "hey ya" . #test without first clearing cache
real 0m0.154s
user 0m0.224s
sys 0m0.172s
This should work:
grep -R "texthere" *
If you are looking for a specific content in all files from a directory structure, you may use find since it is more clear what you are doing:
find -type f -exec grep -l "texthere" {} +
Note that -l (lowercase L) shows the name of the file that contains the text. Remove it if you instead want to print the match itself. Or use -H to get the file together with the match. All together, other alternatives are:
find -type f -exec grep -Hn "texthere" {} +
Where -n prints the line number.
This is the one that worked for my case on my current machine (git bash on windows 7):
find ./ -type f -iname "*.cs" -print0 | xargs -0 grep "content pattern"
I always forget the -print0 and -0 for paths with spaces.
EDIT: My preferred tool is now instead ripgrep: https://github.com/BurntSushi/ripgrep/releases . It's really fast and has better defaults (like recursive by default). Same example as my original answer but using ripgrep: rg -g "*.cs" "content pattern"
grep -r "texthere" . (notice period at the end)
(^credit: https://stackoverflow.com/a/1987928/1438029)
Clarification:
grep -r "texthere" / (recursively grep all directories and subdirectories)
grep -r "texthere" . (recursively grep these directories and subdirectories)
grep recursive
grep [options] PATTERN [FILE...]
[options]
-R, -r, --recursive
Read all files under each directory, recursively.
This is equivalent to the -d recurse or --directories=recurse option.
http://linuxcommand.org/man_pages/grep1.html
grep help
$ grep --help
$ grep --help |grep recursive
-r, --recursive like --directories=recurse
-R, --dereference-recursive
Alternatives
ack (http://beyondgrep.com/)
ag (http://github.com/ggreer/the_silver_searcher)
Throwing my two cents here. As others already mentioned grep -r doesn't work on every platform. This may sound silly but I always use git.
git grep "texthere"
Even if the directory is not staged, I just stage it and use git grep.
Below are the commands to search for a string recursively in Unix and Linux environments.
for UNIX command is:
find . -name "file name pattern" -exec grep "string to be searched" "{}" \;
for Linux command is:
grep -r "string to be searched" .
In 2018, you want to use ripgrep or the-silver-searcher because they are way faster than the alternatives.
Here is a directory with 336 first-level subdirectories:
% find . -maxdepth 1 -type d | wc -l
336
% time rg -w aggs -g '*.py'
...
rg -w aggs -g '*.py' 1.24s user 2.23s system 283% cpu 1.222 total
% time ag -w aggs -G '.*py$'
...
ag -w aggs -G '.*py$' 2.71s user 1.55s system 116% cpu 3.651 total
% time find ./ -type f -name '*.py' | xargs grep -w aggs
...
find ./ -type f -name '*.py' 1.34s user 5.68s system 32% cpu 21.329 total
xargs grep -w aggs 6.65s user 0.49s system 32% cpu 22.164 total
On OSX, this installs ripgrep: brew install ripgrep. This installs silver-searcher: brew install the_silver_searcher.
In my IBM AIX Server (OS version: AIX 5.2), use:
find ./ -type f -print -exec grep -n -i "stringYouWannaFind" {} \;
this will print out path/file name and relative line number in the file like:
./inc/xxxx_x.h
2865: /** Description : stringYouWannaFind */
anyway,it works for me : )
For a list of available flags:
grep --help
Returns all matches for the regexp texthere in the current directory, with the corresponding line number:
grep -rn "texthere" .
Returns all matches for texthere, starting at the root directory, with the corresponding line number and ignoring case:
grep -rni "texthere" /
flags used here:
-r recursive
-n print line number with output
-i ignore case
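Note that find . -type f | xargs grep whatever sorts of solutions will break on file names containing whitespace or quotes, and command-substitution variants such as grep whatever $(find . -type f) will run into "Argument list too long" errors when there are too many files matched by find.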
The best bet is grep -r but if that isn't available, use find . -type f -exec grep -H whatever {} \; instead.
I guess this is what you're trying to write
grep myText $(find .)
and this may be something else helpful if you want to find the files grep hit
grep myText $(find .) | cut -d : -f 1 | sort | uniq
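If only the names of the matching files are wanted, grep -l prints each one once, so the cut, sort and uniq steps are not needed and file names containing a colon are not truncated. A minimal sketch, assuming a made-up helper name files_containing:

```shell
#!/bin/sh
# files_containing is a hypothetical helper name used only for this sketch.
# Usage: files_containing DIRECTORY PATTERN
files_containing() {
    # grep -l prints the name of each file that has at least one match.
    # The final sort only makes the order predictable.
    find "$1" -type f -exec grep -l -e "$2" {} + | LC_ALL=C sort
}
```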
For .gz files, recursively scan all files and directories with zgrep.
Change the file type in -name as needed, or use * to match every file:
find . -name \*.gz -print0 | xargs -0 zgrep "STRING"
Just for fun, a quick and dirty search of *.txt files if the @christangrant answer is too much to type :-)
grep -r texthere .|grep .txt
Here's a recursive (tested lightly with bash and sh) function that traverses all subfolders of a given folder ($1) and using grep searches for given string ($3) in given files ($2):
$ cat script.sh
#!/bin/sh
cd "$1"
loop () {
for i in *
do
if [ -d "$i" ]
then
# echo entering "$i"
cd "$i"
loop "$1" "$2"
fi
done
if [ -f "$1" ]
then
grep -l "$2" "$PWD/$1"
fi
cd ..
}
loop "$2" "$3"
Running it and an example output:
$ sh script.sh start_folder filename search_string
/home/james/start_folder/dir2/filename
List the files that match the first grep, then feed that list to a second grep. The first pipeline below prints the lines containing the second word in those files; the second one uses -L to list the files from the first result that do not contain the second word.
grep -l -r --include "*.js" "FIRSTWORD" * | xargs grep "SECONDwORD"
grep -l -r --include "*.js" "FIRSTWORD" * | xargs grep -L "SECONDwORD"
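The plain xargs pipelines above split file names on whitespace, so they can misbehave on names containing spaces. Here is a hedged sketch of the same "contains the first word but not the second" filter that avoids xargs. The function name js_with_first_without_second and its argument order are assumptions made for this example.

```shell
#!/bin/sh
# js_with_first_without_second is a hypothetical helper name.
# Usage: js_with_first_without_second DIRECTORY FIRSTWORD SECONDWORD
js_with_first_without_second() {
    # The two words are passed to the inner shell through the environment,
    # and the file names arrive as its positional parameters.
    first=$2 second=$3 find "$1" -type f -name '*.js' -exec sh -c '
        for f do
            # Keep files that contain $first and do not contain $second.
            if grep -q -e "$first" -- "$f" && ! grep -q -e "$second" -- "$f"; then
                printf "%s\n" "$f"
            fi
        done' sh {} +
}
```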
grep -l -r --include "*.js" "SEARCHWORD" * | awk -F'/' '{print $NF}' | xargs -I{} sh -c 'echo {}; grep -l -r --include "*.html" -w --include=*.js -e {} *; echo '''

How do I rename all folders and files to lowercase on Linux?

I have to rename a complete folder tree recursively so that no uppercase letter appears anywhere (it's C++ source code, but that shouldn't matter).
Bonus points for ignoring CVS and Subversion version control files/folders. The preferred way would be a shell script, since a shell should be available on any Linux box.
There were some valid arguments about details of the file renaming.
I think files with the same lowercase names should be overwritten; it's the user's problem. When checked out on a case-ignoring file system, it would overwrite the first one with the latter, too.
I would consider A-Z characters and transform them to a-z, everything else is just calling for problems (at least with source code).
The script would be needed to run a build on a Linux system, so I think changes to CVS or Subversion version control files should be omitted. After all, it's just a scratch checkout. Maybe an "export" is more appropriate.
Smaller still I quite like:
rename 'y/A-Z/a-z/' *
On case insensitive filesystems such as OS X's HFS+, you will want to add the -f flag:
rename -f 'y/A-Z/a-z/' *
A concise version using the "rename" command:
find my_root_dir -depth -exec rename 's/(.*)\/([^\/]*)/$1\/\L$2/' {} \;
This avoids problems with directories being renamed before files and trying to move files into non-existing directories (e.g. "A/A" into "a/a").
Or, a more verbose version without using "rename".
for SRC in `find my_root_dir -depth`
do
DST=`dirname "${SRC}"`/`basename "${SRC}" | tr '[A-Z]' '[a-z]'`
if [ "${SRC}" != "${DST}" ]
then
[ ! -e "${DST}" ] && mv -T "${SRC}" "${DST}" || echo "${SRC} was not renamed"
fi
done
P.S.
The latter allows more flexibility with the move command (for example, "svn mv").
for f in `find`; do mv -v "$f" "`echo $f | tr '[A-Z]' '[a-z]'`"; done
Just simply try the following if you don't need to care about efficiency.
zip -r foo.zip foo/*
unzip -LL foo.zip
One can simply use the following which is less complicated:
rename 'y/A-Z/a-z/' *
This works on CentOS/Red Hat Linux or other distributions without the rename Perl script:
for i in $( ls | grep '[A-Z]' ); do mv -i "$i" "`echo "$i" | tr 'A-Z' 'a-z'`"; done
Source: Rename all file names from uppercase to lowercase characters
(In some distributions the default rename command comes from util-linux, and that is a different, incompatible tool.)
This works if you already have or set up the rename command (e.g. through brew install in Mac):
rename --lower-case --force somedir/*
The simplest approach I found on Mac OS X was to use the rename package from http://plasmasturm.org/code/rename/:
brew install rename
rename --force --lower-case --nows *
--force Rename even when a file with the destination name already exists.
--lower-case Convert file names to all lower case.
--nows Replace all sequences of whitespace in the filename with single underscore characters.
Most of the answers above are dangerous, because they do not deal with names containing odd characters. Your safest bet for this kind of thing is to use find's -print0 option, which will terminate filenames with ASCII NUL instead of \n.
Here is a script that alters only file names, not directory names, so as not to confuse find:
find . -type f -print0 | xargs -0n 1 bash -c \
's=$(dirname "$0")/$(basename "$0");
d=$(dirname "$0")/$(basename "$0"|tr "[A-Z]" "[a-z]"); mv -f "$s" "$d"'
I tested it, and it works with filenames containing spaces, all kinds of quotes, etc. This is important because if you run, as root, one of those other scripts on a tree that includes the file created by
touch \;\ echo\ hacker::0:0:hacker:\$\'\057\'root:\$\'\057\'bin\$\'\057\'bash
... well guess what ...
Here's my suboptimal solution, using a Bash shell script:
#!/bin/bash
# First, rename all folders
for f in `find . -depth ! -name CVS -type d`; do
g=`dirname "$f"`/`basename "$f" | tr '[A-Z]' '[a-z]'`
if [ "xxx$f" != "xxx$g" ]; then
echo "Renaming folder $f"
mv -f "$f" "$g"
fi
done
# Now, rename all files
for f in `find . ! -type d`; do
g=`dirname "$f"`/`basename "$f" | tr '[A-Z]' '[a-z]'`
if [ "xxx$f" != "xxx$g" ]; then
echo "Renaming file $f"
mv -f "$f" "$g"
fi
done
Folders are all renamed correctly, and mv isn't asking questions when permissions don't match, and CVS folders are not renamed (CVS control files inside that folder are still renamed, unfortunately).
Since "find -depth" and "find | sort -r" both return the folder list in a usable order for renaming, I preferred using "-depth" for searching folders.
One-liner:
for F in K*; do NEWNAME=$(echo "$F" | tr '[:upper:]' '[:lower:]'); mv "$F" "$NEWNAME"; done
Or even:
for F in K*; do mv "$F" "${F,,}"; done
Note that this will convert only files/directories starting with letter K, so adjust accordingly.
The original question asked for ignoring SVN and CVS directories, which can be done by adding -prune to the find command. E.g to ignore CVS:
find . -name CVS -prune -o -exec mv '{}' `echo {} | tr '[A-Z]' '[a-z]'` \; -print
[edit] I tried this out, and embedding the lower-case translation inside the find didn't work for reasons I don't actually understand. So, amend this to:
$> cat > tolower
#!/bin/bash
mv $1 `echo $1 | tr '[:upper:]' '[:lower:]'`
^D
$> chmod u+x tolower
$> find . -name CVS -prune -o -exec tolower '{}' \;
Ian
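One caveat when combining this with the -depth approaches from other answers: find ignores -prune once -depth is in effect, so the version-control directories have to be filtered out some other way. The following is only a sketch under that assumption. It uses -path and -name tests instead of -prune, and the function name lowercase_tree is made up. It does not handle case-insensitive file systems.

```shell
#!/bin/sh
# lowercase_tree is a hypothetical helper name used only for this sketch.
# Usage: lowercase_tree ROOT_DIRECTORY
lowercase_tree() {
    # -depth lists a directory's contents before the directory itself,
    # so children are renamed while their parent still has its old name.
    # CVS and .svn directories, and everything inside them, are filtered out.
    root=$1 find "$1" -depth \
        ! -name CVS ! -name .svn ! -path '*/CVS/*' ! -path '*/.svn/*' \
        -exec sh -c '
        for src do
            [ "$src" = "$root" ] && continue      # leave the root itself alone
            dir=${src%/*}
            base=${src##*/}
            lower=$(printf "%s" "$base" | tr "[:upper:]" "[:lower:]")
            [ "$base" = "$lower" ] && continue    # already lowercase
            if [ -e "$dir/$lower" ]; then
                printf "skipped (target exists): %s\n" "$src" >&2
            else
                mv -- "$src" "$dir/$lower"
            fi
        done' sh {} +
}
```

Unlike the question's stated preference, this sketch refuses to overwrite an existing lowercase name and reports it instead.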
Not portable, Zsh only, but pretty concise.
First, make sure zmv is loaded.
autoload -U zmv
Also, make sure extendedglob is on:
setopt extendedglob
Then use:
zmv '(**/)(*)~CVS~**/CVS' '${1}${(L)2}'
To recursively lowercase files and directories where the name is not CVS.
Using Larry Wall's filename fixer:
$op = shift or die $help;
chomp(@ARGV = <STDIN>) unless @ARGV;
for (@ARGV) {
$was = $_;
eval $op;
die $@ if $@;
rename($was,$_) unless $was eq $_;
}
It's as simple as
find | fix 'tr/A-Z/a-z/'
(where fix is of course the script above)
for f in `find -depth`; do mv ${f} ${f,,} ; done
find -depth prints each file and directory, with a directory's contents printed before the directory itself. ${f,,} lowercases the file name.
This works nicely on macOS too:
ruby -e "Dir['*'].each { |p| File.rename(p, p.downcase) }"
This is a small shell script that does what you requested:
root_directory="${1?-please specify parent directory}"
do_it () {
awk '{ lc= tolower($0); if (lc != $0) print "mv \"" $0 "\" \"" lc "\"" }' | sh
}
# first the folders
find "$root_directory" -depth -type d | do_it
find "$root_directory" ! -type d | do_it
Note the -depth action in the first find.
Use typeset:
typeset -l new # Always lowercase
find $topPoint | # Not using xargs to make this more readable
while read old
do new="$old" # $new is a lowercase version of $old
mv "$old" "$new" # Quotes for those annoying embedded spaces
done
On Windows, emulations, like Git Bash, may fail because Windows isn't case-sensitive under the hood. For those, add a step that mv's the file to another name first, like "$old.tmp", and then to $new.
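The two-step move described above could look roughly like this. The function name rename_lower_two_step and the temporary suffix are assumptions made for this sketch, not part of the original answer.

```shell
#!/bin/sh
# rename_lower_two_step is a hypothetical helper name used only for this sketch.
# Usage: rename_lower_two_step PATH
rename_lower_two_step() {
    old=$1
    dir=$(dirname -- "$old")
    base=$(basename -- "$old")
    new=$dir/$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
    # Nothing to do when the name is already lowercase.
    [ "$old" = "$new" ] && return 0
    # Go through an intermediate name so that a case-insensitive file
    # system never sees a rename whose source and target are "the same file".
    two_step_tmp=$new.tmp.$$
    mv -- "$old" "$two_step_tmp" && mv -- "$two_step_tmp" "$new"
}
```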
With MacOS,
Install the rename package,
brew install rename
Use,
find . -iname "*.py" -type f | xargs -I% rename -c -f "%"
This command finds all the files with a .py extension (in any letter case) and converts their names to lower case.
`-f` - forces a rename
For example,
$ find . -iname "*.py" -type f
./sample/Sample_File.py
./sample_file.py
$ find . -iname "*.py" -type f | xargs -I% rename -c -f "%"
$ find . -iname "*.py" -type f
./sample/sample_file.py
./sample_file.py
Lengthy But "Works With No Surprises & No Installations"
This script handles filenames with spaces, quotes, other unusual characters and Unicode, works on case insensitive filesystems and most Unix-y environments that have bash and awk installed (i.e. almost all). It also reports collisions if any (leaving the filename in uppercase) and of course renames both files & directories and works recursively. Finally it's highly adaptable: you can tweak the find command to target the files/dirs you wish and you can tweak awk to do other name manipulations. Note that by "handles Unicode" I mean that it will indeed convert their case (not ignore them like answers that use tr).
# adapt the following command _IF_ you want to deal with specific files/dirs
find . -depth -mindepth 1 -exec bash -c '
for file do
# adapt the awk command if you wish to rename to something other than lowercase
newname=$(dirname "$file")/$(basename "$file" | awk "{print tolower(\$0)}")
if [ "$file" != "$newname" ] ; then
# the extra step with the temp filename is for case-insensitive filesystems
if [ ! -e "$newname" ] && [ ! -e "$newname.lcrnm.tmp" ] ; then
mv -T "$file" "$newname.lcrnm.tmp" && mv -T "$newname.lcrnm.tmp" "$newname"
else
echo "ERROR: Name already exists: $newname"
fi
fi
done
' sh {} +
References
My script is based on these excellent answers:
https://unix.stackexchange.com/questions/9496/looping-through-files-with-spaces-in-the-names
How to convert a string to lower case in Bash?
In OS X, mv -f shows "same file" error, so I rename twice:
for i in `find . -name "*" -type f |grep -e "[A-Z]"`; do j=`echo $i | tr '[A-Z]' '[a-z]' | sed s/\-1$//`; mv $i $i-1; mv $i-1 $j; done
I needed to do this on a Cygwin setup on Windows 7 and found that I got syntax errors with the suggestions from above that I tried (though I may have missed a working option). However, this solution straight from Ubuntu forums worked out of the can :-)
ls | while read upName; do loName=`echo "${upName}" | tr '[:upper:]' '[:lower:]'`; mv "$upName" "$loName"; done
(NB: I had previously replaced whitespace with underscores using:
for f in *\ *; do mv "$f" "${f// /_}"; done
)
Slugify Rename (regex)
It is not exactly what the OP asked for, but what I was hoping to find on this page:
A "slugify" version for renaming files so they are similar to URLs (i.e. only include alphanumeric, dots, and dashes):
rename "s/[^a-zA-Z0-9\.]+/-/g" filename
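Where the Perl rename script is not installed, the same substitution can be approximated with sed. This is only a sketch: the function name slugify is made up, and it only prints the new name rather than renaming anything.

```shell
#!/bin/sh
# slugify is a hypothetical helper name used only for this sketch.
# Usage: slugify NAME   (prints the slugified name, renames nothing)
slugify() {
    # Replace every run of characters other than ASCII letters, digits
    # and dots with a single dash.
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^a-zA-Z0-9.][^a-zA-Z0-9.]*/-/g'
}
```

It could then be used as mv "$f" "$(slugify "$f")" for a file in the current directory.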
I would reach for Python in this situation, to avoid optimistically assuming paths without spaces or slashes. I've also found that python2 tends to be installed in more places than rename.
#!/usr/bin/env python2
import sys, os

def rename_dir(directory):
    print('DEBUG: rename('+directory+')')
    # Rename current directory if needed
    os.rename(directory, directory.lower())
    directory = directory.lower()
    # Rename children
    for fn in os.listdir(directory):
        path = os.path.join(directory, fn)
        os.rename(path, path.lower())
        path = path.lower()
        # Rename children within, if this child is a directory
        if os.path.isdir(path):
            rename_dir(path)

# Run program, using the first argument passed to this Python script as the name of the folder
rename_dir(sys.argv[1])
If you use Arch Linux, you can install the rename package from AUR, which provides the renamexm command as the /usr/bin/renamexm executable, along with a manual page.
It is a really powerful tool to quickly rename files and directories.
Convert to lowercase
rename -l Developers.mp3 # or --lowcase
Convert to UPPER case
rename -u developers.mp3 # or --upcase, long option
Other options
-R --recursive # directory and its children
-t --test # Dry run, output but don't rename
-o --owner # Change file owner as well to user specified
-v --verbose # Output what file is renamed and its new name
-s/str/str2 # Substitute string on pattern
--yes # Confirm all actions
You can fetch the sample Developers.mp3 file from here, if needed ;)
None of the solutions here worked for me because I was on a system that didn't have access to the perl rename script, plus some of the files included spaces. However, I found a variant that works:
find . -depth -exec sh -c '
t=${0%/*}/$(printf %s "${0##*/}" | tr "[:upper:]" "[:lower:]");
[ "$t" = "$0" ] || mv -i "$0" "$t"
' {} \;
Credit goes to "Gilles 'SO- stop being evil'", see this answer on the similar question "change entire directory tree to lower-case names" on the Unix & Linux StackExchange.
I believe the one-liners can be simplified:
for f in **/*; do mv "$f" "${f:l}"; done
( find YOURDIR -type d | sort -r;
find YOURDIR -type f ) |
grep -v /CVS | grep -v /SVN |
while read f; do mv -v $f `echo $f | tr '[A-Z]' '[a-z]'`; done
First rename the directories bottom-up using sort -r (for systems where -depth is not available), then the files.
Then grep -v /CVS instead of find ...-prune because it's simpler.
For large directories, for f in ... can overflow some shell buffers.
Use find ... | while read to avoid that.
And yes, this will clobber files which differ only in case...
find . -depth -name '*[A-Z]*'|sed -n 's/\(.*\/\)\(.*\)/mv -n -v -T \1\2 \1\L\2/p'|sh
I haven't tried the more elaborate scripts mentioned here, but none of the single commandline versions worked for me on my Synology NAS. rename is not available, and many of the variations of find fail because it seems to stick to the older name of the already renamed path (eg, if it finds ./FOO followed by ./FOO/BAR, renaming ./FOO to ./foo will still continue to list ./FOO/BAR even though that path is no longer valid). Above command worked for me without any issues.
What follows is an explanation of each part of the command:
find . -depth -name '*[A-Z]*'
This will find any file from the current directory (change . to whatever directory you want to process), using a depth-first search (eg., it will list ./foo/bar before ./foo), but only for files that contain an uppercase character. The -name filter only applies to the base file name, not the full path. So this will list ./FOO/BAR but not ./FOO/bar. This is ok, as we don't want to rename ./FOO/bar. We want to rename ./FOO though, but that one is listed later on (this is why -depth is important).
This command by itself is particularly useful for finding the files that you want to rename in the first place. Run it again after the complete rename command to find files that still haven't been renamed because of file name collisions or errors.
sed -n 's/\(.*\/\)\(.*\)/mv -n -v -T \1\2 \1\L\2/p'
This part reads the files outputted by find and formats them in a mv command using a regular expression. The -n option stops sed from printing the input, and the p command in the search-and-replace regex outputs the replaced text.
The regex itself consists of two captures: the part up until the last / (which is the directory of the file), and the filename itself. The directory is left intact, but the filename is transformed to lowercase. So, if find outputs ./FOO/BAR, it will become mv -n -v -T ./FOO/BAR ./FOO/bar. The -n option of mv makes sure existing lowercase files are not overwritten. The -v option makes mv output every change that it makes (or doesn't make - if ./FOO/bar already exists, it outputs something like ./FOO/BAR -> ./FOO/BAR, noting that no change has been made). The -T is very important here - it makes mv treat the target as a normal file rather than as a directory to move things into. This will make sure that ./FOO/BAR isn't moved into ./FOO/bar if that directory happens to exist.
Use this together with find to generate a list of commands that will be executed (handy to verify what will be done without actually doing it)
sh
This is pretty self-explanatory. It routes all the generated mv commands to the shell interpreter. You can replace it with bash or any shell of your liking.
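Because the generated mv commands are plain text handed to a shell, names containing spaces or quotes would be misread. As a hedged alternative for the "verify first" step, the sketch below only prints the planned renames and never executes them. The function name preview_lowercase is an assumption made for this example.

```shell
#!/bin/sh
# preview_lowercase is a hypothetical helper name used only for this sketch.
# Usage: preview_lowercase DIRECTORY   (prints "old -> new", renames nothing)
preview_lowercase() {
    # -depth lists ./FOO/BAR before ./FOO, the order a real rename needs.
    find "$1" -depth -name '*[A-Z]*' -exec sh -c '
        for src do
            dir=${src%/*}
            base=${src##*/}
            lower=$(printf "%s" "$base" | tr "[:upper:]" "[:lower:]")
            # Skip names that would not change.
            [ "$base" = "$lower" ] && continue
            printf "%s -> %s\n" "$src" "$dir/$lower"
        done' sh {} +
}
```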
Using bash, without rename:
find . -exec bash -c 'mv $0 ${0,,}' {} \;

Resources