Bash script to recursively find and replace in files [duplicate]

How do I find and replace every occurrence of:
subdomainA.example.com
with
subdomainB.example.com
in every text file under the /home/www/ directory tree recursively?

find /home/www \( -type d -name .git -prune \) -o -type f -print0 | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
-print0 tells find to print each of the results separated by a null character, rather than a new line. In the unlikely event that your directory has files with newlines in the names, this still lets xargs work on the correct filenames.
\( -type d -name .git -prune \) is an expression which completely skips over all directories named .git. You could easily expand it, if you use SVN or have other folders you want to preserve -- just match against more names. It's roughly equivalent to -not -path .git, but more efficient, because rather than checking every file in the directory, it skips it entirely. The -o after it is required because of how -prune actually works.
For more information, see man find.
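To see the prune behaviour for yourself, here is a minimal sketch in a scratch directory. The directory layout and file names are made up for illustration, and it assumes GNU sed (for -i without a suffix):

```shell
# Hypothetical sandbox: one ordinary file plus a fake .git directory.
demo=$(mktemp -d)
mkdir -p "$demo/site" "$demo/.git"
echo 'host=subdomainA.example.com' > "$demo/site/config.txt"
echo 'url=subdomainA.example.com'  > "$demo/.git/config"

# Same shape as the command above: prune .git, hand every other file to sed.
find "$demo" \( -type d -name .git -prune \) -o -type f -print0 |
  xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'

site_line=$(cat "$demo/site/config.txt")   # rewritten by sed
git_line=$(cat "$demo/.git/config")        # never visited by find
echo "$site_line"
echo "$git_line"
rm -rf "$demo"
```

Only the file outside .git should come back with subdomainB in it.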

The simplest way for me is
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'

Note: Do not run this command on a folder including a git repo - changes to .git could corrupt your git index.
find /home/www/ -type f -exec \
sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
Compared to other answers here, this is simpler than most and uses sed instead of perl, which is what the original question asked for.

All the tricks are almost the same, but I like this one:
find <mydir> -type f -exec sed -i 's/<string1>/<string2>/g' {} +
find <mydir>: look up in the directory.
-type f:
File is of type: regular file
-exec command {} +:
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending
each selected file name at the end; the total number of invocations of the command will be much less than the number of
matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of
`{}' is allowed within the command. The command is executed in the starting directory.
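A small sketch of the difference between {} + and {} \; may help. It counts how many arguments each invocation receives; the scratch directory and file names are hypothetical:

```shell
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.txt"

# With '+', find appends many names to one command line:
# sh runs once and reports 3 arguments.
batched=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} +)

# With '\;', the command runs once per file:
# three runs, each printing one line.
per_file=$(find "$demo" -type f -exec sh -c 'echo "$#"' sh {} \; | wc -l)

echo "batched run saw $batched files"
echo "per-file mode ran $per_file times"
rm -rf "$demo"
```

With only three files everything fits in one batch; with very many files find would split them into a few larger batches.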

For me the easiest solution to remember is https://stackoverflow.com/a/2113224/565525, i.e.:
sed -i '' -e 's/subdomainA/subdomainB/g' $(find /home/www/ -type f)
NOTE: -i '' solves OSX problem sed: 1: "...": invalid command code .
NOTE: If there are too many files to process you'll get Argument list too long. The workaround - use find -exec or xargs solution described above.

cd /home/www && find . -type f -print0 |
xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'

For anyone using silver searcher (ag)
ag SearchString -l0 | xargs -0 sed -i 's/SearchString/Replacement/g'
Since ag ignores git/hg/svn file/folders by default, this is safe to run inside a repository.

This one is compatible with git repositories, and a bit simpler:
Linux:
git grep -l 'original_text' | xargs sed -i 's/original_text/new_text/g'
Mac:
git grep -l 'original_text' | xargs sed -i '' -e 's/original_text/new_text/g'
(Thanks to http://blog.jasonmeridth.com/posts/use-git-grep-to-replace-strings-in-files-in-your-git-repository/)

To cut down on files to recursively sed through, you could grep for your string instance:
grep -rl <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g
If you run man grep you'll notice you can also define an --exclude-dir="*.git" flag if you want to omit searching through .git directories, avoiding git index issues as others have politely pointed out.
Leading you to:
grep -rl --exclude-dir="*.git" <oldstring> /path/to/folder | xargs sed -i s^<oldstring>^<newstring>^g

A straightforward method if you need to exclude directories (--exclude-dir=.folder) and might also have file names with spaces (solved by using NUL-separated names with both grep -Z and xargs -0):
grep -rlZ oldtext . --exclude-dir=.folder | xargs -0 sed -i 's/oldtext/newtext/g'
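A quick way to convince yourself, sketched in a throwaway directory. The file names are invented, and it assumes GNU grep and GNU sed:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/docs" "$demo/.folder"
echo 'oldtext here'   > "$demo/docs/my notes.txt"   # name contains a space
echo 'oldtext here'   > "$demo/.folder/keep.txt"    # should stay untouched
echo 'nothing to see' > "$demo/docs/other.txt"

# -Z and -0 pass names NUL-terminated, so the space in "my notes.txt" is harmless.
( cd "$demo" &&
  grep -rlZ --exclude-dir=.folder oldtext . | xargs -0 sed -i 's/oldtext/newtext/g' )

changed=$(cat "$demo/docs/my notes.txt")
skipped=$(cat "$demo/.folder/keep.txt")
echo "$changed"
echo "$skipped"
rm -rf "$demo"
```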

One more nice one-liner as an extra, using git grep:
git grep -lz 'subdomainA.example.com' | xargs -0 perl -i'' -pE "s/subdomainA.example.com/subdomainB.example.com/g"

Simplest way to replace (all files, directory, recursive)
find . -type f -not -path '*/\.*' -exec sed -i 's/foo/bar/g' {} +
Note: Sometimes you might need to ignore some hidden files i.e. .git, you can use above command.
If you want to include hidden files use,
find . -type f -exec sed -i 's/foo/bar/g' {} +
In both cases the string foo will be replaced with the new string bar.

find /home/www/ -type f -exec perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
find /home/www/ -type f will list all files in /home/www/ (and its subdirectories).
The "-exec" flag tells find to run the following command on each file found.
perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g' {} +
is the command run on the files (many at a time). The {} gets replaced by file names.
The + at the end of the command tells find to build one command for many filenames.
Per the find man page:
"The command line is built in much the same way that
xargs builds its command lines."
Thus it's possible to achieve your goal (and handle filenames containing spaces) without using xargs -0, or -print0.

I just needed this and was not happy with the speed of the available examples. So I came up with my own:
cd /var/www && ack-grep -l --print0 subdomainA.example.com | xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'
ack-grep is very efficient at finding relevant files. This command processed ~145 000 files with ease, whereas the others took so long that I couldn't wait for them to finish.

or use the blazing fast GNU Parallel:
grep -rl oldtext . | parallel sed -i 's/oldtext/newtext/g' {}

grep -lr 'subdomainA.example.com' . | while IFS= read -r file; do sed -i "s/subdomainA\.example\.com/subdomainB.example.com/g" "$file"; done
I guess most people don't know that they can pipe something into a "while read file" loop. It avoids those nasty -print0 args while preserving spaces in filenames (though not newlines).
Further adding an echo before the sed allows you to see what files will change before actually doing it.
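The dry-run idea can be sketched like this, in a scratch directory with made-up file names. Nothing is modified; the loop only reports what it would touch:

```shell
demo=$(mktemp -d)
echo 'subdomainA.example.com' > "$demo/has space.txt"
echo 'unrelated'              > "$demo/plain.txt"

# Dry run: print the files that would be edited instead of calling sed on them.
planned=$(grep -lr 'subdomainA\.example\.com' "$demo" |
  while IFS= read -r file; do
    echo "would edit: ${file#"$demo"/}"   # show the path relative to the sandbox
  done)
echo "$planned"
rm -rf "$demo"
```

Once the list looks right, swap the echo for the real sed call.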

Try this:
sed -i 's/subdomainA/subdomainB/g' `grep -ril 'subdomainA' *`

According to this blog post:
find . -type f | xargs perl -pi -e 's/oldtext/newtext/g;'

#!/usr/local/bin/bash -x
find /home/www -type f | while IFS= read -r files
do
sedtest=$(sed -n '/^/,/$/p' "${files}" | sed -n '/subdomainA/p')
if [ "${sedtest}" ]
then
sed s'/subdomainA/subdomainB/'g "${files}" > "${files}".tmp
mv "${files}".tmp "${files}"
fi
done

If you do not mind using vim together with grep or find tools, you could follow up the answer given by user Gert in this link --> How to do a text replacement in a big folder hierarchy?.
Here's the deal:
recursively grep for the string that you want to replace in a certain path, and take only the complete path of the matching file. (that would be the $(grep 'string' 'pathname' -Rl).
(optional) if you want to make a pre-backup of those files on centralized directory maybe you can use this also: cp -iv $(grep 'string' 'pathname' -Rl) 'centralized-directory-pathname'
after that you can edit/replace at will in vim following a scheme similar to the one provided on the link given:
:bufdo %s#string#replacement#gc | update

You can use awk to solve this as below,
for file in `find /home/www -type f`
do
awk '{gsub(/subdomainA.example.com/,"subdomainB.example.com"); print $0;}' $file > ./tempFile && mv ./tempFile $file;
done
hope this will help you !!!

To replace all occurrences in a git repository you can use:
git ls-files -z | xargs -0 sed -i 's/subdomainA\.example\.com/subdomainB.example.com/g'
See List files in local git repo? for other options to list all files in a repository. The -z option tells git to separate the file names with a zero byte, which ensures that xargs (with the option -0) can separate filenames, even if they contain spaces or whatnot.

A bit old school but this worked on OS X.
There are a few tricks:
• Will only edit files with extension .sls under the current directory
• . must be escaped to ensure sed does not evaluate them as "any character"
• , is used as the sed delimiter instead of the usual /
Also note this is to edit a Jinja template to pass a variable in the path of an import (but this is off topic).
First, verify your sed command does what you want (this will only print the changes to stdout, it will not change the files):
for file in $(find . -name '*.sls' -type f); do echo -e "\n$file: "; sed 's,foo\.bar,foo/bar/\"+baz+\"/,g' "$file"; done
Edit the sed command as needed; once you are ready to make changes:
for file in $(find . -name '*.sls' -type f); do echo -e "\n$file: "; sed -i '' 's,foo\.bar,foo/bar/\"+baz+\"/,g' "$file"; done
Note the -i '' in the sed command, I did not want to create a backup of the original files (as explained in In-place edits with sed on OS X or in Robert Lujo's comment in this page).
Happy seding folks!

To avoid also changing
NearlysubdomainA.example.com
subdomainA.example.comp.other
while still changing
subdomainA.example.com.IsIt.good
(which may not be ideal, given how domain roots work), anchor the pattern with word boundaries:
find /home/www/ -type f -exec sed -i 's/\bsubdomainA\.example\.com\b/subdomainB.example.com/g' {} \;
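To check which of the three hostnames a \b-anchored pattern actually touches, you can feed them through sed on stdin. This is a sketch that assumes GNU sed, where \b is a word boundary:

```shell
result=$(printf '%s\n' \
  'NearlysubdomainA.example.com' \
  'subdomainA.example.comp.other' \
  'subdomainA.example.com.IsIt.good' |
  sed 's/\bsubdomainA\.example\.com\b/subdomainB.example.com/g')
echo "$result"
```

Only the third line changes, because "." after "com" counts as a word boundary while the "y" before and the "p" after do not.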

Here's a version that should be more general than most; it doesn't require find (using du instead), for instance. It does require xargs, which is only found in some versions of Plan 9 (like 9front).
du -a | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'
If you want to add filters like file extensions use grep:
du -a | grep "\.scala$" | awk -F' ' '{ print $2 }' | xargs sed -i -e 's/subdomainA\.example\.com/subdomainB.example.com/g'

For Qshell (qsh) on IBMi, not bash as tagged by OP.
Limitations of qsh commands:
find does not have the -print0 option
xargs does not have -0 option
sed does not have -i option
Thus the solution in qsh:
PATH='your/path/here'
SEARCH=\'subdomainA.example.com\'
REPLACE=\'subdomainB.example.com\'
for file in $( find ${PATH} -P -type f ); do
TEMP_FILE=${file}.${RANDOM}.temp_file
if [ ! -e ${TEMP_FILE} ]; then
touch -C 819 ${TEMP_FILE}
sed -e 's/'$SEARCH'/'$REPLACE'/g' \
< ${file} > ${TEMP_FILE}
mv ${TEMP_FILE} ${file}
fi
done
Caveats:
Solution excludes error handling
Not Bash as tagged by OP

If you wanted to use this without completely destroying your SVN repository, you can tell 'find' to ignore all hidden files by doing:
find . \( ! -regex '.*/\..*' \) -type f -print0 | xargs -0 sed -i 's/subdomainA.example.com/subdomainB.example.com/g'

Using combination of grep and sed
for pp in $(grep -Rl looking_for_string)
do
sed -i 's/looking_for_string/something_other/g' "${pp}"
done

perl -p -i -e 's/oldthing/new_thingy/g' `grep -ril oldthing *`

To change multiple files (saving a backup of each as *.bak):
perl -p -i.bak -e "s/\|/x/g" *
will take all files in directory and replace | with x
called a “Perl pie” (easy as a pie)

Related

How to use sed to change file extensions?

I have to do a sed line (also using pipes in Linux) to change a file extension, so I can do some kind of mv *.1stextension *.2ndextension like mv *.txt *.c. The thing is that I can't use batch or a for loop, so I have to do it all with pipes and sed command.

you can use string manipulation
filename="file.ext1"
mv "${filename}" "${filename/%ext1/ext2}"
Or, if your system supports it, you can use rename.
Update
You can also do something like this:
mv "${filename%ext1}"{ext1,ext2}
which uses brace expansion.

sed is for manipulating the contents of files, not the filename itself. My suggestion:
rename 's/\.ext/\.newext/' ./*.ext
Or, there's this existing question which should help.

This may work:
find . -name "*.txt" |
sed -e 's|./||g' |
awk '{print "mv",$1, $1"c"}' |
sed -e "s|\.txtc|\.c|g" > table;
chmod u+x table;
./table
I don't know why you can't use a loop. It makes life much easier :
newex="c"; # Give your new extension
for file in *.*; # You can replace with *.txt instead of *.*
do
ex="${file##*.}"; # This retrieves the file extension
ne="${file%.*}.$newex"; # Swaps only the trailing extension for the new one
echo "$ex";echo "$ne";
mv "$file" "$ne";
done

You can use find to find all of the files and then pipe that into a while read loop:
$ find . -name "*.ext1" -print0 | while read -d $'\0' file
do
mv "$file" "${file%.*}.ext2"
done
The ${file%.*} is the small right pattern filter. The % marks the pattern to remove from the right side (matching the smallest glob pattern possible). The .* is the pattern (the last . followed by the characters after the .).
The -print0 will separate file names with the NUL character instead of \n. The -d $'\0' will read in file names separated by the NUL character. This way, file names with spaces, tabs, \n, or other wacky characters will be processed correctly.
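Here is a self-contained sketch of that loop in a scratch directory, with one file name containing a space. The file names are made up, and read -d is a bash feature, so this assumes bash rather than plain sh:

```shell
demo=$(mktemp -d)
touch "$demo/plain.ext1" "$demo/two words.ext1" "$demo/keep.txt"

find "$demo" -name '*.ext1' -print0 | while IFS= read -r -d '' file
do
  # ${file%.*} strips the shortest trailing ".something"; then .ext2 is appended.
  mv "$file" "${file%.*}.ext2"
done

# Collect the resulting names on one line for inspection.
renamed=$(cd "$demo" && ls | sort | tr '\n' ' ')
echo "$renamed"
rm -rf "$demo"
```

Quoting "$file" matters here: without the quotes the name with a space would be split into two arguments to mv.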

You may try following options
Option 1 find along with rename
find . -type f -name "*.ext1" -exec rename -f 's/\.ext1$/.ext2/' {} \;
Option 2 find along with mv
find . -type f -name "*.ext1" -exec sh -c 'mv -f "$0" "${0%.ext1}.ext2"' {} \;
Note: rename is not the same program on every system. The Perl-based rename accepts the s/// expression above, while the util-linux rename uses a different syntax.

Another solution only with sed and sh
printf "%s\n" *.ext1 |
sed "s/'/'\\\\''/g"';s/\(.*\)'ext1'/mv '\''\1'ext1\'' '\''\1'ext2\''/g' |
sh
for better performance: only one process created
perl -le '($e,$f)=@ARGV;map{$o=$_;s/$e$/$f/;rename$o,$_}<*.$e>' ext2 ext3

well this should work
mv $file $(echo $file | sed -E -e 's/.xml.bak.*/.xml/g' | sed -E -e 's/.\///g')
output
abc.xml.bak.foobar -> abc.xml

Convert all EOL (dos->unix) of all files in a directory and sub-directories recursively without dos2unix

How do I convert all EOL (dos->unix) of all files in a directory and sub-directories recursively without dos2unix? (I do not have it and cannot install it.)
Is there a way to do it using tr -d '\r' and pipes? If so, how?

For all files in current directory you can do it with a Perl one-liner: perl -pi -e 's/\r\n/\n/g' * (stolen from here)
EDIT: And with a small modification you can do subdirectory recursion:
find | xargs perl -pi -e 's/\r\n/\n/g'

You can use sed's -i flag to change the files in-place:
find . -type f -exec sed -i 's/\x0d//g' {} \+
If I were you, I would keep the files around to make sure the operation went okay. Then you can delete the temporary files when you get done. This can be done like so:
find . -type f -exec sed -i'.OLD' 's/\x0d//g' {} \+
find . -type f -name '*.OLD' -delete

Do you have sane file names and directory names without spaces, etc in them?
If so, it is not too hard. If you've got to deal with arbitrary names containing newlines and spaces, etc, then you have to work harder than this.
tmp=${TMPDIR:-/tmp}/crlf.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
find . -type f -print |
while read name
do
tr -d '\015' < $name > $tmp.1
mv $tmp.1 $name
done
rm -f $tmp.?
trap 0
exit 0
The trap stuff ensures you don't get temporary files left around. There are other tricks you can pull, with more random names for your temporary files. You don't normally need them unless you work in a hostile environment.
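If the names are not sane, one possible variant (a sketch, not the script above) lets find hand the names to a small sh loop and uses mktemp for the temporary file. The sample file is hypothetical, and note that this touches every regular file, binaries included:

```shell
demo=$(mktemp -d)
printf 'one\r\ntwo\r\n' > "$demo/dos file.txt"   # CRLF endings, space in the name

find "$demo" -type f -exec sh -c '
  for f in "$@"; do
    tmp=$(mktemp) || exit 1
    # Strip carriage returns into a temp file, then write the result back.
    tr -d "\r" < "$f" > "$tmp" && cat "$tmp" > "$f"
    rm -f "$tmp"
  done
' sh {} +

cr_count=$(tr -cd '\r' < "$demo/dos file.txt" | wc -c)   # carriage returns left
content=$(cat "$demo/dos file.txt")
echo "carriage returns remaining: $cr_count"
rm -rf "$demo"
```

Writing back with cat instead of mv keeps the original file's permissions and inode.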

You can also use the editor in batch mode.
find . -type f -exec bash -c 'echo -ne "%s/\\\r//\nx\n" | ex "{}" ' \;

If \r isn't followed by \n (maybe the case in files of Tim Pote):
deleting \r (using tr -d) may remove newlines
replacing \r with \n may not cause double / triple newlines
Maybe Tim Pote could verify the points above for the files he mentioned.

This removes carriage returns from all files in the current directory and all subdirectories, and should work on most Unix-like OSs:
grep -lIUre '\r' | xargs sed -i 's/\r//'

If this is done on Windows,
try running the command in Git Bash:
$ find | xargs perl -pi -e 's/\r\n/\n/g'
It may print some "Can't do inplace edit" messages; ignore them.

Find and replace with sed in directory and sub directories

I run this command to find and replace all occurrences of 'apple' with 'orange' in all files in root of my site:
find ./ -exec sed -i 's/apple/orange/g' {} \;
But it doesn't go through sub directories.
What is wrong with this command?
Here are some lines of output of find ./:
./index.php
./header.php
./fpd
./fpd/font
./fpd/font/desktop.ini
./fpd/font/courier.php
./fpd/font/symbol.php

Your find should look like this to avoid sending directory names to sed:
find ./ -type f -exec sed -i -e 's/apple/orange/g' {} \;

For larger s&r tasks it's better and faster to use grep and xargs, so, for example;
grep -rl 'apples' /dir_to_search_under | xargs sed -i 's/apples/oranges/g'

Since there are also macOS folks reading this one (as I did), the following code worked for me (on 10.14)
egrep -rl '<pattern>' <dir> | xargs -I# sed -i '' 's/<arg1>/<arg2>/g' #
All other answers using -i and -e do not work on macOS.
Source

This worked for me:
find ./ -type f -name '*.php' -exec sed -i '' 's#NEEDLE#REPLACEMENT#' {} \;

grep -e apple your_site_root/**/*.* -s -l | xargs sed -i "" "s|apple|orange|"

Found a great program for this called ruplacer
https://github.com/dmerejkowsky/ruplacer
Usage
ruplacer before_text after_text # prints out list of things it will replace
ruplacer before_text after_text --go # executes the replacements
It also respects .gitignore so it won't mess up your .git or node_modules directories (find . by default will go into your .git directory and can corrupt it!!!)

I think we can do this with one line simple command
for i in `grep -rl eth0 . 2> /dev/null`; do sed -i 's/eth0/eth1/' "$i"; done
Refer to this page.

On Linux:
sed -i 's/textSearch/textReplace/g' namefile
If sed does not work, try:
perl -i -pe 's/textSearch/textReplace/g' namefile

Linux command: How to 'find' only text files?

After a few searches from Google, what I come up with is:
find my_folder -type f -exec grep -l "needle text" {} \; -exec file {} \; | grep text
which is very unhandy and outputs unneeded texts such as mime type information. Any better solutions? I have lots of images and other binary files in the same folder with a lot of text files that I need to search through.

I know this is an old thread, but I stumbled across it and thought I'd share my method which I have found to be a very fast way to use find to find only non-binary files:
find . -type f -exec grep -Iq . {} \; -print
The -I option to grep tells it to immediately ignore binary files, and the . pattern along with -q makes it match text files on the first non-empty line, so it goes very fast. You can change the -print to a -print0 for piping into an xargs -0 or something if you are concerned about spaces (thanks for the tip, @lucas.werkmeister!)
Also the first dot is only necessary for certain BSD versions of find such as on OS X, but it doesn't hurt anything just having it there all the time if you want to put this in an alias or something.
EDIT: As @ruslan correctly pointed out, the -and can be omitted since it is implied.
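A small sketch of what this prints, using three made-up files in a scratch directory: one text file, one file containing a NUL byte, and one empty file:

```shell
demo=$(mktemp -d)
printf 'hello\n'       > "$demo/notes.txt"
printf 'a\000b\001c\n' > "$demo/blob.bin"   # contains a NUL byte, so grep treats it as binary
: > "$demo/empty.txt"                        # empty: there is no character for "." to match

# -exec acts as a test here: -print only runs for files where grep succeeds.
text_files=$(cd "$demo" && find . -type f -exec grep -Iq . {} \; -print)
echo "$text_files"
rm -rf "$demo"
```

Only ./notes.txt is listed; note that empty files are dropped as well as binary ones.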

Based on this SO question :
grep -rIl "needle text" my_folder

Why is it unhandy? If you need to use it often, and don't want to type it every time just define a bash function for it:
function findTextInAsciiFiles {
# usage: findTextInAsciiFiles DIRECTORY NEEDLE_TEXT
find "$1" -type f -exec grep -l "$2" {} \; -exec file {} \; | grep text
}
put it in your .bashrc and then just run:
findTextInAsciiFiles your_folder "needle text"
whenever you want.
EDIT to reflect OP's edit:
If you want to cut out the MIME information, you can add a further stage to the pipeline that filters it out. This should do the trick, by keeping only what comes before the colon: cut -d':' -f1:
function findTextInAsciiFiles {
# usage: findTextInAsciiFiles DIRECTORY NEEDLE_TEXT
find "$1" -type f -exec grep -l "$2" {} \; -exec file {} \; | grep text | cut -d ':' -f1
}

find . -type f -print0 | xargs -0 file | grep -P text | cut -d: -f1 | xargs grep -Pil "search"
This is unfortunately not space safe. Putting this into a bash script makes it a bit easier.
This is space safe:
#!/bin/bash
if [ ! "$1" ] ; then
echo "Usage: $0 <search>";
exit
fi
find . -type f -print0 \
| xargs -0 file \
| grep -P text \
| cut -d: -f1 \
| xargs -i% grep -Pil "$1" "%"

Another way of doing this:
# find . -type f | xargs file | grep "ASCII text"
If you want empty files too:
# find . -type f | xargs file | egrep "ASCII text|empty"

How about this:
$ grep -rl "needle text" my_folder | tr '\n' '\0' | xargs -r -0 file | grep -e ':[^:]*text[^:]*$' | grep -v -e 'executable'
If you want the filenames without the file types, just add a final sed filter.
$ grep -rl "needle text" my_folder | tr '\n' '\0' | xargs -r -0 file | grep -e ':[^:]*text[^:]*$' | grep -v -e 'executable' | sed 's|:[^:]*$||'
You can filter-out unneeded file types by adding more -e 'type' options to the last grep command.
EDIT:
If your xargs version supports the -d option, the commands above become simpler:
$ grep -rl "needle text" my_folder | xargs -d '\n' -r file | grep -e ':[^:]*text[^:]*$' | grep -v -e 'executable' | sed 's|:[^:]*$||'

Here's how I've done it ...
1 . make a small script to test if a file is plain text
istext:
#!/bin/bash
[[ "$(file -bi "$1")" == text/* ]]
2 . use find as before
find . -type f -exec istext {} \; -exec grep -nHi mystring {} \;

Here's a simplified version with extended explanation for beginners like me who are trying to learn how to put more than one command in one line.
If you were to write out the problem in steps, it would look like this:
// For every file in this directory
// Check the filetype
// If it's an ASCII file, then print out the filename
To achieve this, we can use three UNIX commands: find, file, and grep.
find will check every file in the directory.
file will give us the filetype. In our case, we're looking for a return of 'ASCII text'
grep will look for the keyword 'ASCII' in the output from file
So how can we string these together in a single line? There are multiple ways to do it, but I find that doing it in order of our pseudo-code makes the most sense (especially to a beginner like me).
find ./ -exec file {} ";" | grep 'ASCII'
Looks complicated, but not bad when we break it down:
find ./ = look through every file in this directory. The find command prints out the filename of any file that matches the 'expression', or whatever comes after the path, which in our case is the current directory or ./
The most important thing to understand is that everything after that first bit is going to be evaluated as either True or False. If True, the file name will get printed out. If not, then the command moves on.
-exec = this flag is an option within the find command that allows us to use the result of some other command as the search expression. It's like calling a function within a function.
file {} = the command being called inside of find. The file command returns a string that tells you the filetype of a file. Regularly, it would look like this: file mytextfile.txt. In our case, we want it to use whatever file is being looked at by the find command, so we put in the curly braces {} to act as an empty variable, or parameter. In other words, we're just asking for the system to output a string for every file in the directory.
";" = this is required by find and is the punctuation mark at the end of our -exec command. See the manual for 'find' for more explanation if you need it by running man find.
| grep 'ASCII' = | is a pipe. A pipe takes the output of whatever is on the left and uses it as input to whatever is on the right. Here it takes the output of the find command (one line per file, giving its name and filetype) and keeps only the lines that contain the string 'ASCII'.
Note that grep is not part of the find expression: find runs file on every entry, and grep then filters the lines that come out. Voila.

I have two issues with histumness' answer:
It only lists text files. It does not actually search them as requested. To actually search, use
find . -type f -exec grep -Iq . {} \; -and -print0 | xargs -0 grep "needle text"
It spawns a grep process for every file, which is very slow. A better solution is then
find . -type f -print0 | xargs -0 grep -IZl . | xargs -0 grep "needle text"
or simply
find . -type f -print0 | xargs -0 grep -I "needle text"
This only takes 0.2s compared to 4s for solution above (2.5GB data / 7700 files), i.e. 20x faster.
Also, nobody cited ag (the Silver Searcher) or ack-grep as alternatives. If one of these is available, it is a much better alternative:
ag -t "needle text" # Much faster than ack
ack -t "needle text" # or ack-grep
As a last note, beware of false positives (binary files taken as text files). I have already had false positives with grep, ag and ack, so it is better to list the matched files first before editing them.

Although it is an old question, I think the info below will add to the quality of the answers here.
When ignoring files with the executable bit set, I just use this command:
find . ! -perm -111
To keep it from recursively entering other directories:
find . -maxdepth 1 ! -perm -111
No need for pipes to mix lots of commands, just the powerful plain find command.
Disclaimer: it is not exactly what OP asked, because it doesn't check if the file is binary or not. It will, for example, filter out bash script files, that are text themselves but have the executable bit set.
That said, I hope this is useful to anyone.
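A minimal sketch of the filter in a scratch directory. The file names are hypothetical, and -type f is added here as an assumption so that the directory itself is not listed:

```shell
demo=$(mktemp -d)
printf '#!/bin/sh\n' > "$demo/run.sh"
printf 'data\n'      > "$demo/readme.txt"
chmod 755 "$demo/run.sh"       # executable for user, group and other
chmod 644 "$demo/readme.txt"   # no executable bits

# -perm -111 matches files with all three execute bits set; '!' negates it.
non_exec=$(cd "$demo" && find . -maxdepth 1 -type f ! -perm -111)
echo "$non_exec"
rm -rf "$demo"
```

As the disclaimer says, run.sh is filtered out even though it is a text file.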

I do it this way:
1) since there're too many files (~30k) to search thru, I generate the text file list daily for use via crontab using below command:
find /to/src/folder -type f -exec file {} \; | grep text | cut -d: -f1 > ~/.src_list &
2) create a function in .bashrc:
findex() {
cat ~/.src_list | xargs grep "$*" 2>/dev/null
}
Then I can use below command to do the search:
findex "needle text"
HTH:)

I prefer xargs
find . -type f | xargs grep -I "needle text"
if your filenames are weird look up using the -0 options:
find . -type f -print0 | xargs -0 grep -I "needle text"

bash example to search for the text "eth0" in /etc in all text/ascii files
grep eth0 $(find /etc/ -type f -exec file {} \; | egrep -i "text|ascii" | cut -d ':' -f1)

If you are interested in finding any file type by their magic bytes using the awesome file utility combined with power of find, this can come in handy:
$ # Let's make some test files
$ mkdir ASCII-finder
$ cd ASCII-finder
$ dd if=/dev/urandom of=binary.file bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.009023 s, 116 MB/s
$ file binary.file
binary.file: data
$ echo 123 > text.txt
$ # Let the magic begin
$ find -type f -print0 | \
xargs -0 -I @@ bash -c 'file "$@" | grep ASCII &>/dev/null && echo "file is ASCII: $@"' -- @@
Output:
file is ASCII: ./text.txt
Legend: $ is the interactive shell prompt where we enter our commands
You can modify the part after && to call some other script or do some other stuff inline as well, i.e. if that file contains given string, cat the entire file or look for a secondary string in it.
Explanation:
find items that are files
Make xargs feed each item as a line into a one-liner bash command/script
file checks the type of file by magic bytes, grep checks if ASCII exists; if so, then after && your next command executes.
find prints results null separated; this is good for escaping filenames with spaces and meta-characters in them.
xargs, using the -0 option, reads them null separated; -I @@ takes each record and uses it as a positional parameter/arg to the bash script.
-- for bash ensures whatever comes after it is an argument even if it starts with - like -c, which could otherwise be interpreted as a bash option
If you need to find types other than ASCII, simply replace grep ASCII with other type, like grep "PDF document, version 1.4"

find . -type f | xargs file | grep "ASCII text" | awk -F: '{print $1}'
Use find command to list all files, use file command to verify they are text (not tar,key), finally use awk command to filter and print the result.

How about this
find . -type f|xargs grep "needle text"

Move all files except one

How can I move all files except one? I am looking for something like:
'mv ~/Linux/Old/!Tux.png ~/Linux/New/'
where I move old stuff to new stuff -folder except Tux.png. !-sign represents a negation. Is there some tool for the job?

If you use bash and have the extglob shell option set (which is usually the case):
mv ~/Linux/Old/!(Tux.png) ~/Linux/New/

Put the following to your .bashrc
shopt -s extglob
It enables extended pattern-matching operators such as !(...) in globs.
You can then move all files except one by
mv !(fileOne) ~/path/newFolder
Exceptions in relation to other commands
Note that, when copying directories, a trailing forward slash cannot be used in the name, as noticed in the thread Why extglob except breaking except condition?:
cp -r !(Backups.backupdb) /home/masi/Documents/
so Backups.backupdb/ is wrong here inside the negation. I would not use it when moving directories either, because of the risk of using globs wrongly with other commands, and possibly other exceptions.

I would go with the traditional find & xargs way:
find ~/Linux/Old -maxdepth 1 -mindepth 1 -not -name Tux.png -print0 |
xargs -0 mv -t ~/Linux/New
-maxdepth 1 makes it not search recursively. If you only care about files, you can say -type f. -mindepth 1 makes it not include the ~/Linux/Old path itself into the result. Works with any filenames, including with those that contain embedded newlines.
One comment notes that the mv -t option is probably a GNU extension. For systems that don't have it:
find ~/Linux/Old -maxdepth 1 -mindepth 1 -not -name Tux.png \
-exec mv '{}' ~/Linux/New \;
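A sketch of the portable variant in a throwaway directory, using ! in place of -not. The Old/New layout mirrors the question, and the extra file names are invented:

```shell
demo=$(mktemp -d)
mkdir "$demo/Old" "$demo/New"
touch "$demo/Old/Tux.png" "$demo/Old/a b.png" "$demo/Old/notes.txt"

# Move every direct child of Old except Tux.png, one mv per entry.
find "$demo/Old" -mindepth 1 -maxdepth 1 ! -name Tux.png -exec mv {} "$demo/New" \;

left=$(ls "$demo/Old")                          # what stayed behind
moved=$(ls "$demo/New" | sort | tr '\n' ',')    # what arrived, comma-separated
echo "left: $left"
echo "moved: $moved"
rm -rf "$demo"
```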

A quick way would be to modify the tux filename so that your move command will not match.
For example:
mv Tux.png .Tux.png
mv * ~/somefolder
mv .Tux.png Tux.png

I think the easiest way to do is with backticks
mv `ls -1 ~/Linux/Old/ | grep -v Tux.png` ~/Linux/New/
Edit:
Use backslash with ls instead to prevent using it with alias, i.e. mostly ls is aliased as ls --color.
mv `\ls -1 ~/Linux/Old/ | grep -v Tux.png` ~/Linux/New/
Thanks @Arnold Roa

For bash, sth's answer is correct. Here is the zsh (my shell of choice) syntax:
mv ~/Linux/Old/^Tux.png ~/Linux/New/
Requires EXTENDED_GLOB shell option to be set.

I find this to be a bit safer and easier to rely on for simple moves that exclude certain files or directories.
ls -1 | grep -v ^$EXCLUDE | xargs -I{} mv {} $TARGET

This could be simpler and easy to remember and it works for me.
cd ~/folder && mv $(ls | grep -v '^exclude\.png$') ~/destination

The following is not a 100% guaranteed method, and should not at all be attempted for scripting. But sometimes it is good enough for quick interactive shell usage. A file glob like
[abc]*
(which will match all files with names starting with a, b or c) can be negated by inserting a "^" character first, i.e.
[^abc]*
I sometimes use this for not matching the "lost+found" directory, like for instance:
mv /mnt/usbdisk/[^l]* /home/user/stuff/.
Of course if there are other files starting with l I have to process those afterwards.

How about:
mv $(echo * | sed s:Tux.png::g) ~/Linux/New/
You have to be in the folder though.

This can be done without grep like this:
ls ~/Linux/Old/ -QI Tux.png | xargs -I{} mv ~/Linux/Old/{} ~/Linux/New/
Note: -I is a capital i and makes the ls command ignore the Tux.png file, which is listed afterwards.
The output of ls is then piped into mv via xargs, which allows to use the output of ls as source argument for mv.
ls -Q just quotes the filenames listed by ls.

mv `find Linux/Old '!' -type d | fgrep -v Tux.png` Linux/New
The find command lists all regular files and the fgrep command filters out any Tux.png. The backticks tell mv to move the resulting file list.

ls ~/Linux/Old/ | grep -v Tux.png | xargs -I{} mv ~/Linux/Old/{} ~/Linux/New/

Move all files (not including except_file itself) into except_file:
find -maxdepth 1 -mindepth 1 -not -name except_file -print0 |xargs -0 mv -t ./except_file
For example (cache is the except file here):
find -maxdepth 1 -mindepth 1 -not -name cache -print0 |xargs -0 mv -t ./cache

Bash script to recursively find and replace in files [duplicate] - linux

How do I find and replace every occurrence of: subdomainA.example.com with subdomainB.example.com in every text file under the /home/www/ directory tree recursively?

The simplest way for me is
grep -rl oldtext . | xargs sed -i 's/oldtext/newtext/g'

cd /home/www && find . -type f -print0 | xargs -0 perl -i.bak -pe 's/subdomainA\.example\.com/subdomainB.example.com/g'

For anyone using silver searcher (ag):
ag SearchString -l0 | xargs -0 sed -i 's/SearchString/Replacement/g'
Since ag ignores git/hg/svn files and folders by default, this is safe to run inside a repository.

A straightforward method if you need to exclude directories (--exclude-dir=.folder) and might also have file names with spaces (solved by using a NUL byte separator with both grep -Z and xargs -0):
grep -rlZ oldtext . --exclude-dir=.folder | xargs -0 sed -i 's/oldtext/newtext/g'
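Here is a minimal sketch of that behaviour in a scratch directory, assuming GNU grep and GNU sed. The file names are invented, and the option is placed before the path only to be conservative about argument ordering:

```shell
# Scratch tree: one file with a space in its name, one inside the excluded dir.
work=$(mktemp -d)
mkdir "$work/.folder"
printf 'oldtext here\n' > "$work/my file.txt"
printf 'oldtext here\n' > "$work/.folder/skip.txt"

(
    cd "$work" || exit 1
    grep -rlZ --exclude-dir=.folder oldtext . | xargs -0 sed -i 's/oldtext/newtext/g'
)

cat "$work/my file.txt"         # newtext here
cat "$work/.folder/skip.txt"    # oldtext here (directory was skipped)
```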

A nice one-liner as an extra, using git grep:
git grep -lz 'subdomainA.example.com' | xargs -0 perl -i'' -pE "s/subdomainA.example.com/subdomainB.example.com/g"

Or use the blazing fast GNU Parallel:
grep -rl oldtext . | parallel sed -i 's/oldtext/newtext/g' {}

Try this:
sed -i 's/subdomainA/subdomainB/g' `grep -ril 'subdomainA' *`

According to this blog post:
find . -type f | xargs perl -pi -e 's/oldtext/newtext/g;'

#!/usr/local/bin/bash -x
find * /home/www -type f | while read files
do
    sedtest=$(sed -n '/^/,/$/p' "${files}" | sed -n '/subdomainA/p')
    if [ "${sedtest}" ]
    then
        sed s'/subdomainA/subdomainB/'g "${files}" > "${files}".tmp
        mv "${files}".tmp "${files}"
    fi
done

You can use awk to solve this as below:
for file in `find /home/www -type f`
do
    awk '{gsub(/subdomainA\.example\.com/,"subdomainB.example.com"); print $0;}' "$file" > ./tempFile && mv ./tempFile "$file"
done
Hope this helps!

To avoid also changing NearlysubdomainA.example.com and subdomainA.example.comp.other, while still changing subdomainA.example.com.IsIt.good (which may not be desirable, given how domain roots work), anchor the pattern with word boundaries:
find /home/www/ -type f -exec sed -i 's/\bsubdomainA\.example\.com\b/subdomainB.example.com/g' {} \;
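The effect of the word boundaries can be checked without touching any files by piping the sample names through sed. This sketch assumes GNU sed, where \b is a word-boundary extension, and uses a substitution with no back-references:

```shell
# Feed the four sample host names through the word-boundary substitution.
result=$(printf '%s\n' \
    'subdomainA.example.com' \
    'NearlysubdomainA.example.com' \
    'subdomainA.example.comp.other' \
    'subdomainA.example.com.IsIt.good' |
    sed 's/\bsubdomainA\.example\.com\b/subdomainB.example.com/g')

printf '%s\n' "$result"
# subdomainB.example.com
# NearlysubdomainA.example.com
# subdomainA.example.comp.other
# subdomainB.example.com.IsIt.good
```

Only the first and last lines change: the second has no boundary before "subdomainA" and the third has none after "com".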

If you want to use this without completely destroying your SVN repository, you can tell find to ignore all hidden files and directories:
find . \( ! -regex '.*/\..*' \) -type f -print0 | xargs -0 sed -i 's/subdomainA.example.com/subdomainB.example.com/g'

Using a combination of grep and sed:
for pp in $(grep -Rl looking_for_string .)
do
    sed -i 's/looking_for_string/something_other/g' "${pp}"
done

perl -p -i -e 's/oldthing/new_thingy/g' `grep -ril oldthing *`

To change multiple files (saving a backup of each as *.bak):
perl -p -i.bak -e "s/\|/x/g" *
This takes all files in the directory and replaces | with x. It is called a “Perl pie” (easy as a pie).
