How to replace string in files recursively via sed or awk? - linux

I would like to know how to search from the command line for a string in various files of type .rb.
And replace:
.delay([ANY OPTIONAL TEXT FOR DELETION]).
with
.delay.
Besides sed an awk are there any other command line tools included in the OS that are better for the task?
Status
So far I have the following regular expression:
.delay\(*.*\)\.
I would like to know how to match only the expression ending on the first closing parenthesis? And avoid replacing:
.delay([ANY OPTIONAL TEXT FOR DELETION]).sometext(param)
Thanks in advance!

If you need to find and replace text in files - sed seems to be the best command line solution.
Search for a string in the text file and replace:
sed -i 's/PATTERN/REPLACEMENT/' file.name
Or, if you need to process multiple occurencies of PATTERN in file, add g key
sed -i 's/PATTERN/REPLACEMENT/g' file.name
For multiple files processing - redirect list of files to sed:
echo "${filesList}" | xargs sed -i ...

You can use find to generate your list of files, and xargs to run sed over the result:
find . -type f -print | xargs sed -i 's/\.delay.*/.delay./'
find will generate a list of files contained in your current directory (., although you can of course pass a different directory), xargs will read that list and then run sed with the list of files as an argument.
Instead of find, which here generates a list of all files, you could use something like grep to generate a list of files that contain a specific term. E.g.:
grep -rl '\.delay' | xargs sed -i ...

For the part of the question where you want to only match and replace until the first ) and not include a second pair of (), here is how to change your regex:
.delay\(*.*\)\.
->
\.delay\([^\)]*\)
I.e. match "actual dot, delay, brace open, everything but brace close and brace close".
E.g. using sed:
>echo .delay([ANY OPTIONAL TEXT FOR DELETION]).sometext(param) | sed -E "s/\.delay\([^\)]*\)/.delay/"
.delay.sometext(param)
I recommend to use grep for finding the right files:
grep -rl --include "*.rb" '\.delay' .
Then feed the list into xargs, as recommended by other answers.
Credits to the other answers for providing a solution for feeding multiple files into sed.

Related

Replace spaces in all files in a directory with underscores

I have found some similar questions here but not this specific one and I do not want to break all my files. I have a list of files and I simply need to replace all spaces with underscores. I know this is a sed command but I am not sure how to generically apply this to every file.
I do not want to rename the files, just modify them in place.
Edit: To clarify, just in case it's not clear, I only want to replace whitespace within the files, file names should not be changed.
find . -type f -exec sed -i -e 's/ /_/g' {} \;
find grabs all items in the directory (and subdirectories) that are files, and passes those filenames as arguments to the sed command using the {} \; notation. The sed command it appears you already understand.
if you only want to search the current directory, and ignore subdirectories, you can use
find . -maxdepth 1 -type f -exec sed -i -e 's/ /_/g' {} \;
This is a 2 part problem. Step 1 is providing the proper sed command, 2 is providing the proper command to replace all files in a given directory.
Substitution in sed commands follows the form s/ItemToReplace/ItemToReplaceWith/pattern, where s stands for the substitution and pattern stands for how the operation should take place. According to this super user post, in order to match whitespace characters you must use either \s or [[:space:]] in your sed command. The difference being the later is for POSIX compliance. Lastly you need to specify a global operation which is simply /g at the end. This simply replaces all spaces in a file with underscores.
Substitution in sed commands follows the form s/ItemToReplace/ItemToReplaceWith/pattern, where s stands for the substitution and pattern stands for how the operation should take place. According to this super user post, in order to match whitespace characters you must use either just a space in your sed command, \s, or [[:space:]]. The difference being the last 2 are for whitespace catching (tabs and spaces), with the last needed for POSIX compliance. Lastly you need to specify a global operation which is simply /g at the end.
Therefore, your sed command is
sed s/ /_/g FileNameHere
However this only accomplishes half of your task. You also need to be able to do this for every file within a directory. Unfortunately, wildcards won't save us in the sed command, as * > * would be ambiguous. Your only solution is to iterate through each file and overwrite them individually. For loops by default should come equipped with file iteration syntax, and when used with wildcards expands out to all files in a directory. However sed's used in this manner appear to completely lose output when redirecting to a file. To correct this, you must specify sed with the -i flag so it will edit its files. Whatever item you pass after the -i flag will be used to create a backup of the old files. If no extension is passed (-i '' for instance), no backup will be created.
Therefore the final command should simply be
for i in *;do sed -i '' 's/ /_/g' $i;done
Which looks for all files in your current directory and echos the sed output to all files (Directories do get listed but no action occurs with them).
Well... since I was trying to get something running I found a method that worked for me:
for file in `ls`; do sed -i 's/ /_/g' $file; done

remove DOS end of file character from many files in Linux

I transferred millions of generated SVG files from DOS to a Linux box and just realized that there is a ^# as the last character of each file (the DOS end of file character) which is giving an error when I try to display the SVG file in a browser.
In this question:
How can I remove the last character of the last line of a file?
Maroun gives the solution as:
sed '$ s/.$//' your_file
But when I modify it to look like this:
sed '$ s/.$//' *.SVG
or
find . -print | grep .SVG | sed '$ s/.$//'
It does not work.
I would also like to be able to specify that it should only delete the last character if it is the ^#.
Can someone please tell me what I am doing wrong or how to get this to work. The SVG files are in thousands of sub directories so I need to be able to make the change from top to bottom of the tree structure.
I'm guessing your first example isn't working because it's taking only the last file expanded by *.SVG. The second example (the one with the find command) fails because you're passing find's output to sed instead of the file contents.
Also, if you want to change the files' contents with sed, you need to use -i so the changes are performed "in place".
You could try this:
for file in *.SVG
do
sed -i '$ s/.$//' $file
done
Or:
find . -name "*.SVG" -exec sed -i '$ s/.$//' {} \;

Need help editing multiple files using sed in linux terminal

I am trying to do a simple operation here. Which is to cut a few characters from one file (style.css) do a find a replace on another file (client_custom.css) for more then 100 directories with different names
When I use the following command
for d in */; do sed -n 73p ~/assets/*/style.css | cut -c 29-35 | xargs -I :hex: sed -i 's/!BGCOLOR!/:hex:/' ~/assets/*/client_custom.css $d; done
It keeps giving me the following error for all the directories
sed: couldn't edit dirname/: not a regular file
I am confused on why its giving me that error message explicitly gave the full path to the file. It works perfectly fine without a for loop.
Can anyone please help me out with this issue?
sed doesn't support folders as input.
for d in */;
puts folders into $d. If you write sed ... $d, then BASH will put the folder name into the arguments of sed and the poor tool will be confused.
Also ~/assets/*/client_custom.css since this will expand to all the files which match this pattern. So sed will be called once with all file names. You probably want to invoke sed once per file name.
Try
for f in ~/assets/*/client_custom.css; do
... | sed -i 's/!BGCOLOR!/:hex:/' $f
done
or, even better:
for f in ~/assets/*/client_custom.css; do
... | sed 's/!BGCOLOR!/:hex:/' "${f}.in" > "${f}"
done
(which doesn't overwrite the output file). This way, you can keep the "*.in" files, edit them with the patterns and then use sed to "expand" all the variables.

Quickest way to remove 70+ strings from a file?

I have 70+ strings I need to find and delete in a file. I need to remove the entire line in the file that the string appears in.
I know I can use sed -i '/string to remove/d' fileA.txt to remove them one at a time. However, considering I have 70+, it will take some time doing it this way.
Is there a way I can put these 70+ strings in a file and have sed go through them one by one? Or if I create a file containing the strings, is there a way to compare the two files so it removes any line from fileA that contains one of the strings?
You could use grep:
grep -vf file_with_words.txt file.txt
where file_with_words.txt would be the file containing the list of words, each word being on a different line and file.txt is the file that you want to remove the lines from.
If your list of words contains regex metacharacters, then tell grep to consider those as fixed strings (if that is what you want):
grep -F -vf file_with_words.txt file.txt
Using sed, you'd need to say:
sed '/word1\|word2\|word3/d' file.txt
or
sed -E '/word1|word2|word3/d' file.txt
You could use command substitution to construct the pattern too:
sed -E "/$(paste -sd'|' file_with_words.txt)/d" file.txt
but grep is clearly the tool to use in this case.
If you want to do the job in bash, here's how:
search=fileA.txt
queries=queries.txt
while read query
do
sed -i '' "/$query/d" $search
done < "$queries"
where queries.txt looks like
I
want
to
delete
these
lines

shell scripting for token replacement in all files in a folder

HI
I am not very good with linux shell scripting.I am trying following shell script to replace
revision number token $rev -<rev number> in all html files under specified directory
cd /home/myapp/test
set repUpRev = "`svnversion`"
echo $repUpRev
grep -lr -e '\$rev -'.$repUpRev.'\$' *.html | xargs sed -i 's/'\$rev -'.$repUpRev.'\$'/'\$rev -.*$'/g'
This seems not working, what is wrong with the above code ?
rev=$(svnversion)
sed -i.bak "s/$rev/some other string/g" *.html
What is $rev in the regexp string? Is it another variable? Or you're looking for a string '$rev'. If latter - I would suggest adding '\' before $ otherwise it's treated as a special regexp character...
This is how you show the last line:
grep -lr -e '\$rev -'.$repUpRev.'\$' *.html | xargs sed -i 's/'\$rev -'.$repUpRev.'\$'/'\$rev -.*$'/g'
It would help if you showed some input data.
The -r option makes the grep recursive. That means it will operate on files in the directory and its subdirectories. Is that what you intend?
The dots in your grep and sed stand for any character. If you want literal dots, you'll need to escape them.
The final escaped dollar sign in the grep and sed commands will be seen as a literal dollar sign. If you want to anchor to the end of the line you should remove the escape.
The .* works only as a literal string on the right hand side of a sed s command. If you want to include what was matched on the left side, you need to use capture groups. The g modifier on the s command is only needed if the pattern appears more than once in a line.
Using quote, unquote, quote, unquote is hard to read. Use double quotes to permit variable expansion.
Try your grep command by itself without the xargs and sed to see if it's producing a list of files.
This may be closer to what you want:
grep -lr -e "\$rev -.$repUpRev.$" *.html | xargs sed -i "s/\$rev -.$repUpRev.$/\$rev -REPLACEMENT_TEXT/g"
but you'll still need to determine if the g modifier, the dots, the final dollar signs, etc., are what you intend.

Resources