Replace a string with another string in all files below my current dir

Replace a string with another string in all files below my current dir - string

How do I replace every occurrence of a string with another string below my current directory?
Example: I want to replace every occurrence of www.fubar.com with www.fubar.ftw.com in every file under my current directory.
From research so far I have come up with
sed -i 's/www.fubar.com/www.fubar.ftw.com/g' *.php

You're on the right track, use find to locate the files, then sed to edit them, for example:
find . -name '*.php' -exec sed -i -e 's/www.fubar.com/www.fubar.ftw.com/g' {} \;
Notes
The . means current directory - i.e. in this case, search in and below the current directory.
For some versions of sed you need to specify an extension for the -i option, which is used for backup files.
The -exec option is followed by the command to be applied to the files found, and is terminated by a semicolon, which must be escaped, otherwise the shell consumes it before it is passed to find.

Solution using find, args and sed:
find . -name '*.php' -print0 | xargs -0 sed -i 's/www.fubar.com/www.fubar.ftw.com/g'

A pure bash solution
#!/bin/bash
shopt -s nullglob
for file in *.php
do
while read -r line
do
echo "${line/www.fubar.com/www.fubar.ftw.com}"
done < "$file" > tempo && mv tempo "$file"
done

A more efficient * alternative to the currently accepted solution:
`grep "www.fubar.com" . -lr | xargs sed -i 's/www.fubar.com/www.fubar.ftw.com/g'
This avoids the inefficiency of the find . -exec method, which needlessly runs a sed in-place replacement over all files below your current directory regardless of if they contain the string you're looking for or not, by instead using grep -lr. This gets just the files containing the string you want to replace which you can then pipe to xargs sed -i to perform the in-place replacement on just those files.
* : I used time to make a cursory comparison of my method with the accepted solution (adapted for my own use case); The find . -exec-style method took 3.624s to run on my machine and my above proposed solution took 0.156s, so roughly 23x faster for my use case.

If there are no subfolders, a simpler to remember way is
replace "www.fubar.com" "www.fubar.ftw.com" -- *
where * can also be a list of files
from the manual:
Invoke replace in one of the following ways:
shell> replace from to [from to] ... -- file_name [file_name] ...
shell> replace from to [from to] ... < file_name
If you have hidden files with a dot you can add those to * with
shopt -s dotglob
If you only have one depth of subfolders you can use */* instead of *
replace "www.fubar.com" "www.fubar.ftw.com" -- */*

When using ZSH as your shell you can do:
sed -i 's/www.fubar.com/www.fubar.ftw.com/g' **/*.php

Related

How to loop over files of same format in linux shell [duplicate]

This question already has answers here:
Bash sed in loop
(2 answers)
Closed last year.
I want to apply a specific action in various *.dat files. What I want to do is use sed to remove a specific character using
sed 's/"//g' file.dat >file.dat
I've tried to use the above code in the following way
sed 's/"//g' *.dat > *.dat
but it doesn't seem to work for all the files in the directory.
Any idea on how to loop over all those file in linux shell?

I would use the find command and sed -i (the -i is in-place). So, the complete command would be something like -
find . -name "*.dat" -exec sed -i 's/\"//g' {} \;

You can't read from a file and write to the same file in the same pipeline, so
sed … file > file
will fail. In fact, it will truncate the file. Many implementations of sed contain the nonstandard -i flag, which abstracts the work of writing to a temporary file:
sed -i … file
So you could do:
for dat in *.dat; do
sed -i 's/"//g' "$dat"
done
If your sed doesn't have the -i, you can use tr to remove a single character from files very efficiently:
for dat in *.dat; do
tr -d '"' "$dat" > "$dat.tmp"
mv "$dat.tmp" "$dat"
done
If you want to do this recursively (that is, on file nested within directories within your initial target directory), use either bash's globstar setting, or find:
shopt -s globstar
for dat in **/*.dat; do … # the rest is the same as above
or
find . -name '*.dat' -exec sed -i 's/"//g' {} \;

try this code:
find . -type f -name *.dat -exec sed 's/"//g' {} > {} ';'

How to use sed to change file extensions?

I have to do a sed line (also using pipes in Linux) to change a file extension, so I can do some kind of mv *.1stextension *.2ndextension like mv *.txt *.c. The thing is that I can't use batch or a for loop, so I have to do it all with pipes and sed command.

you can use string manipulation
filename="file.ext1"
mv "${filename}" "${filename/%ext1/ext2}"
Or if your system support, you can use rename.
Update
you can also do something like this
mv ${filename}{ext1,ext2}
which is called brace expansion

sed is for manipulating the contents of files, not the filename itself. My suggestion:
rename 's/\.ext/\.newext/' ./*.ext
Or, there's this existing question which should help.

This may work:
find . -name "*.txt" |
sed -e 's|./||g' |
awk '{print "mv",$1, $1"c"}' |
sed -e "s|\.txtc|\.c|g" > table;
chmod u+x table;
./table
I don't know why you can't use a loop. It makes life much easier :
newex="c"; # Give your new extension
for file in *.*; # You can replace with *.txt instead of *.*
do
ex="${file##*.}"; # This retrieves the file extension
ne=$(echo "$file" | sed -e "s|$ex|$newex|g"); # Replaces current with the new one
echo "$ex";echo "$ne";
mv "$file" "$ne";
done

You can use find to find all of the files and then pipe that into a while read loop:
$ find . -name "*.ext1" -print0 | while read -d $'\0' file
do
mv $file "${file%.*}.ext2"
done
The ${file%.*} is the small right pattern filter. The % marks the pattern to remove from the right side (matching the smallest glob pattern possible), The .* is the pattern (the last . followed by the characters after the .).
The -print0 will separate file names with the NUL character instead of \n. The -d $'\0' will read in file names separated by the NUL character. This way, file names with spaces, tabs, \n, or other wacky characters will be processed correctly.

You may try following options
Option 1 find along with rename
find . -type f -name "*.ext1" -exec rename -f 's/\.ext1$/ext2/' {} \;
Option 2 find along with mv
find . -type f -name "*.ext1" -exec sh -c 'mv -f $0 ${0%.ext1}.ext2' {} \;
Note: It is observed that rename doesn't work for many terminals

Another solution only with sed and sh
printf "%s\n" *.ext1 |
sed "s/'/'\\\\''/g"';s/\(.*\)'ext1'/mv '\''\1'ext1\'' '\''\1'ext2\''/g' |
sh
for better performance: only one process created
perl -le '($e,$f)=#ARGV;map{$o=$_;s/$e$/$f/;rename$o,$_}<*.$e>' ext2 ext3

well this should work
mv $file $(echo $file | sed -E -e 's/.xml.bak.*/.xml/g' | sed -E -e 's/.\///g')
output
abc.xml.bak.foobar -> abc.xml

How to search and replace using grep

I need to recursively search for a specified string within all files and subdirectories within a directory and replace this string with another string.
I know that the command to find it might look like this:
grep 'string_to_find' -r ./*
But how can I replace every instance of string_to_find with another string?

Another option is to use find and then pass it through sed.
find /path/to/files -type f -exec sed -i 's/oldstring/new string/g' {} \;

I got the answer.
grep -rl matchstring somedir/ | xargs sed -i 's/string1/string2/g'

You could even do it like this:
Example
grep -rl 'windows' ./ | xargs sed -i 's/windows/linux/g'
This will search for the string 'windows' in all files relative to the current directory and replace 'windows' with 'linux' for each occurrence of the string in each file.

This works best for me on OS X:
grep -r -l 'searchtext' . | sort | uniq | xargs perl -e "s/matchtext/replacetext/" -pi
Source: http://www.praj.com.au/post/23691181208/grep-replace-text-string-in-files

Usually not with grep, but rather with sed -i 's/string_to_find/another_string/g' or perl -i.bak -pe 's/string_to_find/another_string/g'.

Other solutions mix regex syntaxes. To use perl/PCRE patterns for both search and replace, and process only matching files, this works quite well:
grep -rlIZPi 'match1' | xargs -0r perl -pi -e 's/match2/replace/gi;'
match1 and match2 are usually identical but match2 can contain more advanced features that are only relevant to the substitution, e.g. capturing groups.
Translation: grep recursively and list matching filenames, each separated by null to protect any special characters; pipe any filenames to xargs which is expecting a null-separated list; if any filenames are received, pass them to perl to perform the actual substitutions.
For case-sensitive matching, drop the i flag from grep and the i pattern modifier from the s/// expression, but not the i flag from perl itself. To include binary files, remove the I flag from grep.

Be very careful when using find and sed in a git repo! If you don't exclude the binary files you can end up with this error:
error: bad index file sha1 signature
fatal: index file corrupt
To solve this error you need to revert the sed by replacing your new_string with your old_string. This will revert your replaced strings, so you will be back to the beginning of the problem.
The correct way to search for a string and replace it is to skip find and use grep instead in order to ignore the binary files:
sed -ri -e "s/old_string/new_string/g" $(grep -Elr --binary-files=without-match "old_string" "/files_dir")
Credits for #hobs

Here is what I would do:
find /path/to/dir -type f -iname "*filename*" -print0 | xargs -0 sed -i '/searchstring/s/old/new/g'
this will look for all files containing filename in the file's name under the /path/to/dir, than for every file found, search for the line with searchstring and replace old with new.
Though if you want to omit looking for a specific file with a filename string in the file's name, than simply do:
find /path/to/dir -type f -print0 | xargs -0 sed -i '/searchstring/s/old/new/g'
This will do the same thing above, but to all files found under /path/to/dir.

Modern rust tools can be used to do this job.
For example to replace in all (non ignored) files "oldstring" and "oldString" with "newstring" and "newString" respectively you can :
Use fd and sd
fd -tf -x sd 'old([Ss]tring)' 'new$1' {}
Use ned
ned -R -p 'old([Ss]tring)' -r 'new$1' .
Use ruplacer
ruplacer --go 'old([Ss]tring)' 'new$1' .
Ignored files
To include ignored (by .gitignore) and hidden files you have to specify it :
use -IH for fd,
use --ignored --hiddenfor ruplacer.

Another option would be to just use perl with globstar.
Enabling shopt -s globstar in your .bashrc (or wherever) allows the ** glob pattern to match all sub-directories and files recursively.
Thus using perl -pXe 's/SEARCH/REPLACE/g' -i ** will recursively
replace SEARCH with REPLACE.
The -X flag tells perl to "disable all warnings" - which means that
it won't complain about directories.
The globstar also allows you to do things like sed -i 's/SEARCH/REPLACE/g' **/*.ext if you wanted to replace SEARCH with REPLACE in all child files with the extension .ext.

Remove files not containing a specific string

I want to find the files not containing a specific string (in a directory and its sub-directories) and remove those files. How I can do this?

The following will work:
find . -type f -print0 | xargs --null grep -Z -L 'my string' | xargs --null rm
This will firstly use find to print the names of all the files in the current directory and any subdirectories. These names are printed with a null terminator rather than the usual newline separator (try piping the output to od -c to see the effect of the -print0 argument.
Then the --null parameter to xargs tells it to accept null-terminated inputs. xargs will then call grep on a list of filenames.
The -Z argument to grep works like the -print0 argument to find, so grep will print out its results null-terminated (which is why the final call to xargs needs a --null option too). The -L argument to grep causes grep to print the filenames of those files on its command line (that xargs has added) which don't match the regular expression:
my string
If you want simple matching without regular expression magic then add the -F option. If you want more powerful regular expressions then give a -E argument. It's a good habit to use single quotes rather than double quotes as this protects you against any shell magic being applied to the string (such as variable substitution)
Finally you call xargs again to get rid of all the files that you've found with the previous calls.
The problem with calling grep directly from the find command with the -exec argument is that grep then gets invoked once per file rather than once for a whole batch of files as xargs does. This is much faster if you have lots of files. Also don't be tempted to do stuff like:
rm $(some command that produces lots of filenames)
It's always better to pass it to xargs as this knows the maximum command-line limits and will call rm multiple times each time with as many arguments as it can.
Note that this solution would have been simpler without the need to cope with files containing white space and new lines.
Alternatively
grep -r -L -Z 'my string' . | xargs --null rm
will work too (and is shorter). The -r argument to grep causes it to read all files in the directory and recursively descend into any subdirectories). Use the find ... approach if you want to do some other tests on the files as well (such as age or permissions).
Note that any of the single letter arguments, with a single dash introducer, can be grouped together (for instance as -rLZ). But note also that find does not use the same conventions and has multi-letter arguments introduced with a single dash. This is for historical reasons and hasn't ever been fixed because it would have broken too many scripts.

GNU grep and bash.
grep -rLZ "$str" . | while IFS= read -rd '' x; do rm "$x"; done
Use a find solution if portability is needed. This is slightly faster.

EDIT: This is how you SHOULD NOT do this! Reason is given here. Thanks to #ormaaj for pointing it out!
find . -type f | grep -v "exclude string" | xargs rm
Note: grep pattern will match against full file path from current directory (see find . -type f output)

One possibility is
find . -type f '!' -exec grep -q "my string" {} \; -exec echo rm {} \;
You can remove the echo if the output of this preview looks correct.
The equivalent with -delete is
find . -type f '!' -exec grep -q "user_id" {} \; -delete
but then you don't get the nice preview option.

To remove files not containing a specific string:
Bash:
To use them, enable the extglob shell option as follows:
shopt -s extglob
And just remove all files that don't have the string "fix":
rm !(*fix*)
If you want to don't delete all the files that don't have the names "fix" and "class":
rm !(*fix*|*class*)
Zsh:
To use them, enable the extended glob zsh shell option as follows:
setopt extended_glob
Remove all files that don't have the string, in this example "fix":
rm -- ^*fix*
If you want to don't delete all the files that don't have the names "fix" and "class":
rm -- ^(*fix*|*class*)
It's possible to use it for extensions, you only need to change the regex: (.zip) , (.doc), etc.
Here are the sources:
https://www.tecmint.com/delete-all-files-in-directory-except-one-few-file-extensions/
https://codeday.me/es/qa/20190819/1296122.html

I can think of a few ways to approach this. Here's one: find and grep to generate a list of files with no match, and then xargs rm them.
find yourdir -type f -exec grep -F -L 'yourstring' '{}' + | xargs -d '\n' rm
This assumes GNU tools (grep -L and xargs -d are non-portable) and of course no filenames with newlines in them. It has the advantage of not running grep and rm once per file, so it'll be reasonably fast. I recommend testing it with "echo" in place of "rm" just to make sure it picks the right files before you unleash the destruction.

This worked for me, you can remove the -f if you're okay with deleting directories.
myString="keepThis"
for x in `find ./`
do if [[ -f $x && ! $x =~ $myString ]]
then rm $x
fi
done

Another solution (although not as fast). The top solution didn't work in my case because the string I needed to use in place of 'my string' has special characters.
find -type f ! -name "*my string*" -exec rm {} \; -print

Convert all EOL (dos->unix) of all files in a directory and sub-directories recursively without dos2unix

How do I convert all EOL (dos->unix) of all files in a directory and sub-directories recursively without dos2unix? (I do not have it and cannot install it.)
Is there a way to do it using tr -d '\r' and pipes? If so, how?

For all files in current directory you can do it with a Perl one-liner: perl -pi -e 's/\r\n/\n/g' * (stolen from here)
EDIT: And with a small modification you can do subdirectory recursion:
find | xargs perl -pi -e 's/\r\n/\n/g'

You can use sed's -i flag to change the files in-place:
find . -type f -exec sed -i 's/\x0d//g' {} \+
If I were you, I would keep the files around to make sure the operation went okay. Then you can delete the temporary files when you get done. This can be done like so:
find . -type f -exec sed -i'.OLD' 's/\x0d//g' {} \+
find . -type f -name '*.OLD' -delete

Do you have sane file names and directory names without spaces, etc in them?
If so, it is not too hard. If you've got to deal with arbitrary names containing newlines and spaces, etc, then you have to work harder than this.
tmp=${TMPDIR:-/tmp}/crlf.$$
trap "rm -f $tmp.?; exit 1" 0 1 2 3 13 15
find . -type f -print |
while read name
do
tr -d '\015' < $name > $tmp.1
mv $tmp.1 $name
done
rm -f $tmp.?
trap 0
exit 0
The trap stuff ensures you don't get temporary files left around. There other tricks you can pull, with more random names for your temporary file names. You don't normally need them unless you work in a hostile environment.

You can also use the editor in batch mode.
find . -type f -exec bash -c 'echo -ne "%s/\\\r//\nx\n" | ex "{}" ' \;

If \r isn't followed by \n (maybe the case in files of Tim Pote):
deleting \r (using tr -d) may remove newlines
replacing \r with \n may not cause double / triple newlines
Maybe Tim Pote could verify the points above for the files he mentioned.

This removes carriage returns from all files in the current directory and all subdirectories, and should work on most Unix-like OSs:
grep -lIUre '\r' | xargs sed -i 's/\r//'

If its done in widows:
try to run the command in git bash:
$ find | xargs perl -pi -e 's/\r\n/\n/g'
It can show some Can't do inplace edit: type a message so ignore it

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string