Linux - How can I replace a string with the same string enclosed in parentheses?

I have some files in a directory. When I grep for a string, I get results like the following:
scripts/FileReplace> grep -r "case" *
dir1/file2:case 'a'
dir2/file3:case "ssss"
file1:case 1
After running a replace command, I expect the strings in the files to be updated as follows:
CASE ('a')
CASE ("ssss")
CASE (1)
That is, "case" is replaced with "CASE" and the text after the space is enclosed in parentheses, as above.
Any suggestions on how I can do this with a shell command?

You can use sed and its substitution:
find . -type f -exec sed -i 's/case \(.\+\)/CASE (\1)/' {} +
.\+ matches one or more of any character (\+ is a GNU sed extension).
\(...\) creates a capture group; you can reference the first capture group as \1.
Running with -i~ instead of -i creates a backup of each file (with a ~ suffix); this is recommended, especially if you're just experimenting.
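A safe way to try this out is on a scratch directory first. The sketch below assumes GNU sed (for \+ and -i~); the directory and file names are made up for the demo.

```shell
# Work in a throwaway directory so no real files are touched.
demo_dir=$(mktemp -d)
printf "case 'a'\n" > "$demo_dir/file1"
printf 'case 1\n'   > "$demo_dir/file2"

# Same substitution as in the answer, with -i~ so each file gets a backup.
find "$demo_dir" -type f -exec sed -i~ 's/case \(.\+\)/CASE (\1)/' {} +

result1=$(cat "$demo_dir/file1")
result2=$(cat "$demo_dir/file2")
backup1=$(cat "$demo_dir/file1~")   # untouched original
printf '%s\n' "$result1" "$result2"
```

Note that the pattern is not anchored, so it would also match "case " inside a longer word; adding a word boundary may be worth considering for real files.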

Related

Replace spaces in all files in a directory with underscores

I have found some similar questions here but not this specific one and I do not want to break all my files. I have a list of files and I simply need to replace all spaces with underscores. I know this is a sed command but I am not sure how to generically apply this to every file.
I do not want to rename the files, just modify them in place.
Edit: To clarify, just in case it's not clear, I only want to replace whitespace within the files, file names should not be changed.
find . -type f -exec sed -i -e 's/ /_/g' {} \;
find selects every regular file in the directory (and its subdirectories) and runs sed once per file, substituting the file name for {}; the \; terminates the -exec command. The sed command it appears you already understand.
If you only want to search the current directory and ignore subdirectories, you can use
find . -maxdepth 1 -type f -exec sed -i -e 's/ /_/g' {} \;
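To see the effect of -maxdepth 1, here is a small sketch on a scratch directory (the file names are hypothetical, and -maxdepth and sed -i are assumed to be the GNU versions):

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
printf 'a b c\n' > "$demo_dir/top.txt"
printf 'd e f\n' > "$demo_dir/sub/nested.txt"

# Only files directly inside demo_dir are edited; sub/ is not descended into.
find "$demo_dir" -maxdepth 1 -type f -exec sed -i -e 's/ /_/g' {} \;

top=$(cat "$demo_dir/top.txt")
nested=$(cat "$demo_dir/sub/nested.txt")
printf '%s\n' "$top" "$nested"
```

The top-level file ends up as a_b_c, while the nested file keeps its spaces.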
This is a two-part problem. Step 1 is providing the proper sed command; step 2 is providing the proper command to apply it to all files in a given directory.
Substitution in sed commands follows the form s/ItemToReplace/ItemToReplaceWith/flags, where s stands for substitute and the flags control how the operation takes place. According to this Super User post, to match whitespace you can use a literal space, \s, or [[:space:]] in your sed command. The last two match any whitespace (tabs as well as spaces); \s is a GNU extension, while [[:space:]] is the POSIX-compliant form. Lastly, you need to specify a global operation, which is simply g at the end, so that every occurrence on a line is replaced rather than only the first.
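The difference between a literal space and [[:space:]] is easiest to see on a line that contains a tab. This is just an illustrative sketch; the sample text is made up.

```shell
# Sample line with one space and one tab between the letters.
sample=$(printf 'a b\tc')

space_only=$(printf '%s\n' "$sample" | sed 's/ /_/g')
any_space=$(printf '%s\n' "$sample" | sed 's/[[:space:]]/_/g')

printf '%s\n' "$space_only"   # the tab is left alone
printf '%s\n' "$any_space"    # prints a_b_c
```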
Therefore, your sed command is
sed 's/ /_/g' FileNameHere
However, this only accomplishes half of your task. You also need to do this for every file within a directory. Unfortunately, wildcards won't save us in the sed command, as * > * would be ambiguous. The solution is to iterate over the files and overwrite each one individually. A for loop over a wildcard expands to every entry in the directory. However, redirecting sed's output back to the file it is reading truncates that file before sed can read it, so the content is lost. To correct this, pass sed the -i flag so that it edits the file in place. With BSD/macOS sed, the argument after -i is the suffix used to create a backup of the old file; if an empty suffix is passed (-i ''), no backup is created. GNU sed instead takes the suffix attached directly to the flag (for example -i.bak), and a bare -i means no backup.
Therefore the final command should simply be
for i in *; do sed -i '' 's/ /_/g' "$i"; done
This loops over every entry in your current directory and has sed rewrite each file in place (directories are matched by the wildcard too, but sed does not modify them).
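A slightly more defensive variant of the loop, as a sketch: it quotes the variable so file names containing spaces survive, and skips anything that is not a regular file. It assumes GNU sed (bare -i); on BSD/macOS you would write -i '' instead. The scratch directory and file names are invented for the demo.

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/subdir"
printf 'one two\n' > "$demo_dir/my notes.txt"   # file name contains a space

for f in "$demo_dir"/*; do
    [ -f "$f" ] || continue      # skip directories and other non-regular files
    sed -i 's/ /_/g' "$f"        # GNU sed; BSD/macOS sed needs: sed -i '' ...
done

notes=$(cat "$demo_dir/my notes.txt")
printf '%s\n' "$notes"           # prints one_two
```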
Well... since I was trying to get something running I found a method that worked for me:
for file in *; do sed -i 's/ /_/g' "$file"; done

Replace part of regular expression and keep prefix/postfix

I was trying to replace some strings in a file using sed and came across an issue.
I have the following strings:
TEMPLATE_MODULE
TEMPLATE_SOME_ERR
TEMPLATE_MORE_ERR
I would like to replace TEMPLATE_MODULE with some string and all strings that start with TEMPLATE and end with ERR with a different string, as follows:
TEMPLATE_MODULE ---> NEW_MODULE_NAME
TEMPLATE_SOME_ERR ---> NEW_MODULE_NAME_SOME_ERR
TEMPLATE_MORE_ERR ---> NEW_MODULE_NAME_MORE_ERR
The replacement of TEMPLATE_MODULE is easy:
find . -type f -print -exec sed -i "s/TEMPLATE_MODULE/NEW_MODULE_NAME/g" {} +
Though I don't know how to handle the other part. If I look for strings starting with TEMPLATE_* , I would also catch TEMPLATE_MODULE.
I also want to keep the SOME_ERR or MORE_ERR postfix so this solution would not work:
find . -type f -print -exec sed -i "s/TEMPLATE_.*_ERR$/NEW_MODULE_NAME/g" {} +
Any ideas?
Thanks!
Consider this sample input
$ cat ip.txt
foo
TEMPLATE_MODULE
TEMPLATE_SOME_ERR
TEMPLATE_MORE_ERR
TEMPLATE_SOME_ERR xyz
Use multiple s commands and capture groups
$ sed -E 's/\bTEMPLATE_MODULE\b/NEW_MODULE_NAME/g; s/\bTEMPLATE\w*(_(SOME|MORE)_ERR)\b/NEW_MODULE_NAME\1/g' ip.txt
foo
NEW_MODULE_NAME
NEW_MODULE_NAME_SOME_ERR
NEW_MODULE_NAME_MORE_ERR
NEW_MODULE_NAME_SOME_ERR xyz
\b is for word boundaries. \bcat\b will match only cat and won't match scat or cater
s/\bTEMPLATE_MODULE\b/NEW_MODULE_NAME/g will replace TEMPLATE_MODULE with NEW_MODULE_NAME
s/\bTEMPLATE\w*(_(SOME|MORE)_ERR)\b/NEW_MODULE_NAME\1/g will replace TEMPLATE followed by zero or more word characters ending with _SOME_ERR or _MORE_ERR with NEW_MODULE_NAME and the captured string
Solution is for GNU sed, not sure about portability with other implementations
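If the error names are not limited to SOME and MORE, one possible generalization is to capture any suffix that ends in _ERR. This is an assumption about the intent, not something the question confirms, and it still relies on GNU sed for \b and \w. TEMPLATE_OTHER_ERR is a made-up input for the demo.

```shell
input='foo
TEMPLATE_MODULE
TEMPLATE_SOME_ERR
TEMPLATE_OTHER_ERR xyz'

# First expression handles the module name; second keeps whatever _..._ERR suffix follows TEMPLATE.
output=$(printf '%s\n' "$input" | sed -E \
    -e 's/\bTEMPLATE_MODULE\b/NEW_MODULE_NAME/g' \
    -e 's/\bTEMPLATE(_\w+_ERR)\b/NEW_MODULE_NAME\1/g')
printf '%s\n' "$output"
```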
Assuming that "TEMPLATE_MODULE" and "TEMPLATE_" are literal, but "SOME_ERR" and "MORE_ERR" are placeholders, the following seems possible.
sed "s/TEMPLATE\(_MODULE\)*/NEW_MODULE_NAME/"
I recommend trying this as a bare sed command first and then, if it works, integrating it into your command line.
I think this code does not make more assumptions on occurrences and spellings than your own code.
However, you probably want to use "anchoring" in order to not accidentally replace "SOMEOTHERTEMPLATE_" etc.
To start you off on that here is a modified version:
sed "s/\bTEMPLATE\(_MODULE\)*/NEW_MODULE_NAME/"
I recommend trying that with:
TEMPLATE_MODULE
TEMPLATE_SOME_ERR
TEMPLATE_MORE_ERR
OTHER_TEMPLATE_MODULE
OTHER_TEMPLATE_SOME_ERR
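For reference, here is a sketch of running the anchored version over exactly those five test lines (GNU sed assumed, since \b is an extension):

```shell
output=$(printf '%s\n' \
    TEMPLATE_MODULE TEMPLATE_SOME_ERR TEMPLATE_MORE_ERR \
    OTHER_TEMPLATE_MODULE OTHER_TEMPLATE_SOME_ERR |
    sed 's/\bTEMPLATE\(_MODULE\)*/NEW_MODULE_NAME/')
printf '%s\n' "$output"
```

The first three lines are rewritten; the two OTHER_ lines stay as they are, because the underscore before TEMPLATE is a word character and so \b does not match there.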

How to replace string in files recursively via sed or awk?

I would like to know how to search from the command line for a string in various files of type .rb.
And replace:
.delay([ANY OPTIONAL TEXT FOR DELETION]).
with
.delay.
Besides sed and awk, are there any other command-line tools included in the OS that are better suited to the task?
Status
So far I have the following regular expression:
.delay\(*.*\)\.
I would like to know how to match only up to the first closing parenthesis, and avoid replacing all of:
.delay([ANY OPTIONAL TEXT FOR DELETION]).sometext(param)
Thanks in advance!
If you need to find and replace text in files, sed seems to be the best command-line solution.
Search for a string in the text file and replace:
sed -i 's/PATTERN/REPLACEMENT/' file.name
Or, if you need to replace multiple occurrences of PATTERN on the same line, add the g flag (without it, only the first match on each line is replaced):
sed -i 's/PATTERN/REPLACEMENT/g' file.name
To process multiple files, pipe the list of files to sed via xargs:
echo "${filesList}" | xargs sed -i ...
You can use find to generate your list of files, and xargs to run sed over the result:
find . -type f -print | xargs sed -i 's/\.delay.*/.delay./'
find will generate a list of files contained in your current directory (., although you can of course pass a different directory), xargs will read that list and then run sed with the list of files as an argument.
Instead of find, which here generates a list of all files, you could use something like grep to generate a list of files that contain a specific term. E.g.:
grep -rl '\.delay' | xargs sed -i ...
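One caveat with piping plain file lists into xargs is that names containing spaces get split. A common workaround, sketched here with made-up file names and a placeholder foo/bar substitution, is to pass the names NUL-separated (-print0 and -0 are available in GNU and BSD find/xargs; sed -i as written assumes GNU sed):

```shell
demo_dir=$(mktemp -d)
printf 'foo\n' > "$demo_dir/plain.rb"
printf 'foo\n' > "$demo_dir/with space.rb"

# -print0 and -0 pass names NUL-separated, so the space in the name is harmless.
find "$demo_dir" -type f -name '*.rb' -print0 | xargs -0 sed -i 's/foo/bar/g'

plain=$(cat "$demo_dir/plain.rb")
spaced=$(cat "$demo_dir/with space.rb")
printf '%s\n' "$plain" "$spaced"
```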
For the part of the question where you want to only match and replace until the first ) and not include a second pair of (), here is how to change your regex:
.delay\(*.*\)\.
->
\.delay\([^\)]*\)
I.e. match "a literal dot, delay, an opening parenthesis, anything except a closing parenthesis, and a closing parenthesis".
E.g. using sed:
$ echo '.delay([ANY OPTIONAL TEXT FOR DELETION]).sometext(param)' | sed -E "s/\.delay\([^\)]*\)/.delay/"
.delay.sometext(param)
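To make the difference concrete, here is a small side-by-side sketch of the greedy pattern and the negated character class. The input line is invented, and [^)] is used as a simplified spelling of the bracket expression above.

```shell
line='.delay(opt).sometext(param)'

# .* is greedy and runs to the LAST closing parenthesis on the line.
greedy=$(printf '%s\n' "$line" | sed -E 's/\.delay\(.*\)/.delay/')
# [^)]* cannot cross a closing parenthesis, so it stops at the first one.
bounded=$(printf '%s\n' "$line" | sed -E 's/\.delay\([^)]*\)/.delay/')

printf '%s\n' "$greedy"    # prints .delay
printf '%s\n' "$bounded"   # prints .delay.sometext(param)
```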
I recommend using grep to find the right files:
grep -rl --include "*.rb" '\.delay' .
Then feed the list into xargs, as recommended by other answers.
Credits to the other answers for providing a solution for feeding multiple files into sed.
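Putting the pieces together might look roughly like this. It is a sketch on a scratch directory with hypothetical file contents, assuming GNU grep (--include) and GNU sed (-i), and file names without whitespace:

```shell
demo_dir=$(mktemp -d)
printf 'job.delay(run_at: t).perform(x)\n' > "$demo_dir/worker.rb"
printf 'job.delay(run_at: t).perform(x)\n' > "$demo_dir/notes.txt"

# grep picks only .rb files that mention .delay; sed then edits just those.
grep -rl --include '*.rb' '\.delay' "$demo_dir" |
    xargs sed -i -E 's/\.delay\([^)]*\)/.delay/g'

ruby_file=$(cat "$demo_dir/worker.rb")
text_file=$(cat "$demo_dir/notes.txt")
printf '%s\n' "$ruby_file" "$text_file"
```

The .rb file becomes job.delay.perform(x); the .txt file is left alone because grep never lists it.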

BASH find and replace in all files in directory using FIND and SED

I need to look for and replace certain strings for all files in a directory, including sub-directories. I think I'm nearly there using the following method which illustrates my general approach. I do much more inside the -exec than just this replace, but have removed this for clarity.
#!/bin/bash
#call with params: in_directory out_directory
in_directory=$1
out_directory=$2
export in_directory
export out_directory
#Duplicate the in_directory folder structure in out_directory
cd "$in_directory" &&
find . -type d -exec mkdir -p -- "$out_directory"/{} \;
find $in_directory -type f -name '*' -exec sh -c '
for file do
#Quite a lot of other stuff, including some fiddling with $file to
#get rel_file, the part of the path to a file from
#in_directory. E.g if in_directory is ./ then file ./ABC/123.txt
#will have rel_file ABC/123.txt
cat $file|tr -d '|' |sed -e 's/,/|/g' > $out_directory/$rel_file
done
' sh {} +
One issue is likely how I've tried to write the file to pipe the output to. However, this isn't the main/only issue as when I replace it with an explicit test path I still get the error
|sed -e 's/,/|/g' |No such file or directory
which makes me think the cat $file part is the problem?
Any help is massively appreciated as always - this is only the second BASH script I've ever had to write so I expect I've made a fairly basic mistake!
Your "inner" single quotes are being seen as "outer" single quotes, and that is causing your problems. You think you are quoting the | in the tr command, but what you are actually doing is ending the initial single-quoted string, leaving an unquoted |, and then starting a new single-quoted string. That second single-quoted string then ends at the single quote that you believe starts the sed script but that instead ends the previous single-quoted string, and so on.
Use double quotes for those embedded single quotes if you can. Where you can't do that you have to use the '\'' sequence to get a literal single-quote in the single-quoted string.
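Applied to the script in the question, the double-quote approach could look something like the sketch below. The rel_file computation is a hypothetical stand-in for the logic the question left out, the directories and sample file are invented, and embedding {} inside a larger argument relies on GNU find behaviour.

```shell
in_directory=$(mktemp -d)
out_directory=$(mktemp -d)
export out_directory
mkdir "$in_directory/ABC"
printf 'a|b,c\n' > "$in_directory/ABC/123.txt"

# Mirror the directory tree, then convert each file.
( cd "$in_directory" &&
  find . -type d -exec mkdir -p -- "$out_directory"/{} \; &&
  find . -type f -exec sh -c '
    for file do
        rel_file=${file#./}    # hypothetical stand-in for the elided path logic
        # Double quotes inside the single-quoted script avoid ending it early.
        tr -d "|" < "$file" | sed -e "s/,/|/g" > "$out_directory/$rel_file"
    done
  ' sh {} + )

converted=$(cat "$out_directory/ABC/123.txt")
printf '%s\n' "$converted"     # prints ab|c
```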

Unix 'alias' fails with 'awk' command

I'm creating an alias in Unix and have found that the following command fails..
alias logspace='find /apps/ /opt/ -type f -size +100M -exec ls -lh {} \; | awk '{print $5, $9 }''
I get the following :
awk: cmd. line:1: {print
awk: cmd. line:1: ^ unexpected newline or end of string
Any ideas on why the piped awk command fails...
Thanks,
Shaun.
To complement @Dropout's helpful answer:
tl;dr
The problem is the OP's attempt to use ' inside a '-enclosed (single-quoted) string.
The most robust solution in this case is to replace each interior ' with '\'' (sic):
alias logspace='find /apps/ /opt/ -type f -size +100M -exec ls -lh {} \; |
awk '\''{print $5, $9 }'\'''
Bourne-like (POSIX-compatible) shells do not support using ' chars inside single-quoted ('...'-enclosed) strings AT ALL - not even with escaping.
(By contrast, you CAN escape " inside a double-quoted string as \", and, as in @Dropout's answer, you can directly embed ' chars. there, but see below for pitfalls.)
The solution above effectively builds the string from multiple, single-quoted strings into which literal ' chars. - escaped outside the single-quoted strings as \' - are spliced in.
Another way of putting it, as @Etan Reisinger has done in a comment: '\'' means: "close string", "escape single quote", "start new string".
When defining an alias, you usually want single quotes around its definition so as to delay evaluation of the command until the alias is invoked.
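To convince yourself what '\'' produces, you can apply the same splice to a plain variable and print it. This is a sketch only: the variable names are made up, and eval is used here just to run the resulting string on sample input.

```shell
# Same '\'' splice as in the alias, applied to the awk part alone.
awk_cmd='awk '\''{print $5, $9 }'\'''
printf '%s\n' "$awk_cmd"          # prints: awk '{print $5, $9 }'

# Running the stored command on nine sample fields picks out fields 5 and 9.
fields=$(printf '1 2 3 4 5 6 7 8 9\n' | eval "$awk_cmd")
printf '%s\n' "$fields"           # prints: 5 9
```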
Other solutions and their pitfalls:
The following discusses alternative solutions, based on the following alias:
alias foo='echo A '\''*'\'' is born at $(date)'
Note how the * is effectively enclosed in single quotes - using above technique - so as to prevent pathname expansion when the alias is invoked later.
When invoked, this alias prints the literal text A * is born at, followed by the then-current date and time, e.g.: A * is born at Mon Jun 16 11:33:19 EDT 2014.
Use a feature called ANSI C quoting with shells that support it: bash, ksh, zsh
ANSI C-quoted strings, which are enclosed in $'...', DO allow escaping embedded ' chars. as \':
alias foo=$'echo A \'*\' is born at $(date)'
Pitfalls:
This feature was long absent from POSIX (it was only added in the 2024 edition), so some sh implementations may not support it.
By design, escape sequences such as \n, \t, ... are interpreted, too (in fact, that's the purpose of the feature).
Use of alternating quoting styles, as in @Dropout's answer:
Pitfall:
'...' and "..." have different semantics, so substituting one for the other can have unintended side-effects:
alias foo="echo A '*' is born at $(date)" # DOES NOT WORK AS INTENDED
While syntactically correct, this will NOT work as intended, because the use of double quotes causes the shell to expand the command substitution $(date) right away, and thus hardwires the date and time at the time of the alias definition into the alias.
As stated: When defining an alias, you usually want single quotes around its definition so as to delay evaluation of the command until the alias is invoked.
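The timing difference can be seen without defining any alias at all, by storing the same text in two variables (the names early and late are just for illustration):

```shell
# Double quotes: $(date +%Y) runs right now and its output is baked in.
early="echo born at $(date +%Y)"
# Single quotes: the text $(date +%Y) is stored as-is, to be evaluated later.
late='echo born at $(date +%Y)'

printf '%s\n' "$early"
printf '%s\n' "$late"
```

The first line printed contains an actual year; the second still contains the unexpanded command substitution.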
Finally, a caveat:
The tricky thing in a Bourne-like shell environment is that embedding ' inside a single-quoted string sometimes - falsely - APPEARS to work (instead of generating a syntax error, as in the question), when it instead does something different:
alias foo='echo a '*' is born at $(date)' # DOES NOT WORK AS EXPECTED.
This definition is accepted (no syntax error), but won't work as expected: the right-hand side of the definition is effectively parsed as 3 strings, 'echo a ', *, and ' is born at $(date)', which, due to how the shell parses strings (merging adjacent strings, quote removal), results in the following single, literal string: echo a * is born at $(date). Since the * is unquoted in the resulting alias definition, it will expand to a list of all file/directory names in the current directory (pathname expansion) when the alias is invoked.
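Here is a sketch that reproduces the effect with variables and eval instead of an alias (so it also runs in non-interactive shells, where aliases may not be expanded). The scratch directory and file names are made up.

```shell
demo_dir=$(mktemp -d)
touch "$demo_dir/file1" "$demo_dir/file2"

broken='echo a '*' is born'          # the * ends up unquoted in the stored text
fixed='echo a '\''*'\'' is born'     # the * stays inside single quotes

# Evaluate both commands inside the scratch directory.
broken_out=$(cd "$demo_dir" && eval "$broken")
fixed_out=$(cd "$demo_dir" && eval "$fixed")

printf '%s\n' "$broken_out"   # prints: a file1 file2 is born
printf '%s\n' "$fixed_out"    # prints: a * is born
```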
You should use different quotes for surrounding the whole text and for the inner strings.
Try changing it to
alias logspace="find /apps/ /opt/ -type f -size +100M -exec ls -lh {} \; | awk '{print $5, $9 }'"
In other words, your outer quotes should be different than the inner ones, so they don't mix.
Community wiki update:
The redeeming feature of this answer is recognizing that the OP's problem lies in unescaped use of the string delimiters (') inside a string.
However, this answer contains general string-handling truths, but does NOT apply to (Bourne-like, POSIX-compatible) shell programming specifically, and thus does not address the OP's problem directly - see the comments.
Note: Code snippets are meant to be pseudo code, not shell language.
Basic strings: You canNOT use, inside a string, the same quote character that delimits the string:
foo='hello, 'world!'';
    ^--start string
            ^-- end string
             ^^^^^^^^--- unknown "garbage" causing syntax error.
You have to escape the internal strings:
foo='hello, \'world!\'';
            ^--escape
This is true of pretty much EVERY programming language on the planet. If they don't provide escaping mechanisms, such as \, then you have to use alternate means, e.g.
quotechar=chr(39); // single quote is ascii #39
foo='hello ,' & quotechar & 'world!' & quotechar;
Escape the $ signs, not awk's single quotes, and use double quotes for the alias.
Try this:
alias logspace="find /apps/ /opt/ -type f -size +100M -exec ls -lh {} \; | awk '{print \$5, \$9 }'"
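Why the backslashes matter: inside double quotes the shell expands $5 and $9 itself, using its own positional parameters, before awk ever sees them. A small sketch, with dummy positional parameters set only to make the effect visible:

```shell
# Give the shell some positional parameters so the expansion is easy to spot.
set -- p1 p2 p3 p4 p5 p6 p7 p8 p9

unescaped="awk '{print $5, $9 }'"     # the shell substitutes p5 and p9 here
escaped="awk '{print \$5, \$9 }'"     # \$ keeps a literal $ for awk

printf '%s\n' "$unescaped"   # prints: awk '{print p5, p9 }'
printf '%s\n' "$escaped"     # prints: awk '{print $5, $9 }'
```

In an ordinary interactive shell $5 and $9 are usually unset, so the unescaped form silently turns into awk '{print , }', which awk rejects as a syntax error.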
