Replace line in text containing special characters (mathematical equation) - linux

I want to replace a line that represents part of a mathematical equation:
f(x,z,time,temp)=-(2.0)/(exp(128*((x-2.5*time)*(x-2.5*time)+(z-0.2)*(z-0.2))))+(
with a new, similar one. Both the old and the new line are stored in bash variables.
The main problem is that the equation is full of special characters that prevent a proper search and replace in bash, even when I use a delimiter character that does not appear in the equation.
I used
sed -n "s|$OLD|$NEW|g" restart.k
and
sed -i "s|$OLD|$NEW|g" restart.k
but every time I get wrong results.
Any idea how to solve this?

The only character in your pattern that is special to sed and breaks the match is * (the . is special too, but it still matches a literal dot), so escape it and do the replacement as usual:
sed "s:$(sed 's:[*]:\\&:g' <<<"$old"):$new:" infile
If your real data contains more special characters, add them inside the bracket expression []. A few characters need particular placement:
^ can go anywhere in [] except first, because a leading ^ negates the bracket expression.
] must come first, because anywhere else it ends the bracket expression.
- must come first or last, because anywhere else it defines a range of characters.
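The escaping idea above can be extended into a small sketch that handles the other basic-regex metacharacters as well as the characters that are special in the replacement. The helper names escape_pattern and escape_replacement are made up for illustration, the equation is shortened, and / is assumed as the delimiter of the final s command:

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original answer.

# Escape BRE metacharacters and the "/" delimiter in a search string.
# Inside the bracket expression, "]" comes first and "[" comes last.
escape_pattern() {
  printf '%s\n' "$1" | sed 's|[]\/$*.^[]|\\&|g'
}

# Escape "\", "&" and the "/" delimiter in a replacement string.
escape_replacement() {
  printf '%s\n' "$1" | sed 's|[\/&]|\\&|g'
}

# Shortened stand-ins for the real equation lines.
old='f(x)=-(2.0)/(exp(128*(x-2.5*time)))'
new='f(x)=-(3.0)/(exp(64*(x-1.5*time)))'

# Feed the old line through sed and swap it for the new one.
result=$(printf '%s\n' "$old" |
  sed "s/$(escape_pattern "$old")/$(escape_replacement "$new")/")
printf '%s\n' "$result"
```

With both strings escaped, neither the * and . in the pattern nor a possible & or / in the replacement is interpreted by sed.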

Related

how to remove specific characters in vi or vim editor

I have some text in vi:
|NC_004718|29751nt|SARS
|NC_045512|29903nt|Severe
|NC_004718|29751nt|SARS
|NC_045512|29903nt|Severe
|NC_004718|29751nt|SARS
Now I want to remove everything except the ID (e.g. NC_004718). My expected output is:
NC_004718
NC_045512
NC_004718
NC_045512
NC_004718
How to do it? Thanks.
I would recommend a substitution with a regular expression that matches the entire line and captures the part you want to keep in parentheses. You can then replace the entire line with just the captured group.
:%s/^|\([^|]\+\)|.\+/\1/
To break down what is happening:
% means that you want to apply the command to every line in the file.
s means that you are running the substitute command (on each line). Its syntax is s/<regular expression pattern>/<replacement>/<flags>
The regular expression pattern in the above command is ^|\([^|]\+\)|.\+.
^ anchors the match at the start of the line.
| matches the literal character |.
\([^|]\+\) matches one or more characters other than |. In many other regex flavors this would be written ([^|]+); the extra \ characters are needed because, with Vim's default settings, (, ) and + are literal unless escaped. The parentheses capture the match into a group (see below).
| again matches the literal character |.
.\+ matches the rest of the line (one or more characters). Note that . is special by default, but + still needs a preceding \.
The replacement text is just \1, which tells Vim to replace the matched text with whatever was captured by the first group (i.e. the first set of parentheses).
There are no flags in this command, so nothing follows the last /.
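For comparison outside Vim, the same capture-and-replace idea can be sketched with sed in a shell. This is an illustrative equivalent rather than part of the answer above; the variable names are made up, and \{1,\} is the POSIX spelling of Vim's \+:

```shell
#!/usr/bin/env bash
# Two sample lines taken from the question.
input='|NC_004718|29751nt|SARS
|NC_045512|29903nt|Severe'

# Match the whole line, capture the first |-delimited field, keep only \1.
ids=$(printf '%s\n' "$input" | sed 's/^|\([^|]\{1,\}\)|.\{1,\}/\1/')
printf '%s\n' "$ids"
```

As in the Vim command, | is a literal character here because sed uses basic regular expressions by default.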
Another option uses :g with Normal mode commands, for example:
:g/NC_\d\+/normal! ygnV]p
:g/regex/ to match lines
normal! to execute Normal mode commands
ygn to yank the text previously matched by :g
V to select the whole line
]p or p to replace the line with the match
If you only have lines like those you have shown, try:
:%norm xf|D

Replace each column with different spacing using sed

I am trying to replace a different pattern for each column of my input file.
Input file
this- START
this-          START
Result I want
/this/ -START-
/this/ -START-
My code
sed 's|^\([a-zA-Z]*\)-\s\([a-zA-Z]*\)$|/\1/ -\2-|' inputfile
Output
/this/ -START-
this-          START
The first input works but the 2nd input with a huge amount of spaces does not. How can I deal with both of them using the same line of code?
sed uses POSIX Basic Regular Expressions, which are, like the name suggests, very basic, without a lot of the syntactical sugar or features of other RE packages you might be more used to. But they can still handle this:
$ cat input.txt
this- START
this-          START
$ sed 's!^\([a-zA-Z]*\)-[[:space:]]\{1,\}\([a-zA-Z]*\)$!/\1/ -\2-!' input.txt
/this/ -START-
/this/ -START-
The key here is the [[:space:]]\{1,\} portion: [:space:] inside a [] bracket expression matches any whitespace character, like \s in other RE implementations, and \{1,\} matches 1 or more of the preceding atom, like + in pretty much every other flavor (which also support this notation, though without needing the backslashes). Combined, they match 1 or more whitespace characters. And since regular expressions are greedy, the match takes the longest run of whitespace instead of stopping after just one character.
If you only have spaces between columns, not tabs, it can be simplified to a literal space followed by \{1,\} (the leading space is easy to miss when rendered). You can also use [[:alpha:]] instead of [a-zA-Z] to match all alphabetic characters, which makes a difference when matching non-English text. And you might want to use \{1,\} instead of * in the column patterns to avoid matching zero-length or missing columns if they can show up in your input.
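The variations suggested above can be combined into one sketch: [[:alpha:]] for the columns, \{1,\} instead of * so empty columns do not match, and [[:space:]] so tabs are accepted too. The sample input, including the tab in the second row, is invented for illustration:

```shell
#!/usr/bin/env bash
# Two sample rows: a single space, and a run of spaces containing a tab.
input=$(printf 'this- START\nthis-    \t  START\n')

out=$(printf '%s\n' "$input" |
  sed 's!^\([[:alpha:]]\{1,\}\)-[[:space:]]\{1,\}\([[:alpha:]]\{1,\}\)$!/\1/ -\2-!')
printf '%s\n' "$out"
```

Both rows should come out as /this/ -START-, regardless of how much whitespace separated the columns.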

Vim or sed : Replace character(s) within a pattern

I wanted to replace underscores with hyphens wherever the underscore ('_') is preceded and followed by uppercase letters, e.g. QWQW_IOIO, OP_FD_GF_JK, TRT_JKJ, etc. The replacement is needed throughout one document.
I tried to replace this in vim using:
:%s/[A-Z]_[A-Z]/[A-Z]-[A-Z]/g
But that turned QWQW_IOIO into QWQ[A-Z]-[A-Z]OIO :(
I tried using a sed command:
sed -i '/[A-Z]_[A-Z]/ s/_/-/g' ./file_name
This replaced every underscore on each matching line. E.g. the line
QWQW_IOIO variable may contain '_' or '-'
was replaced by
QWQW-IOIO variable may contain '-' or '-'
You had the right idea with your first vim approach, but you need capturing groups to remember which characters were matched by the [A-Z] parts. Those are explained under :h /\1. As a side note, I would recommend using \u instead of [A-Z], since it is both shorter and faster. That means the solution you want is:
:%s/\(\u\)_\(\u\)/\1-\2/g
Or, if you would like to use the "very magic" mode (\v) to make it more readable:
:%s/\v(\u)_(\u)/\1-\2/g
Another option would be to limit the part of the search that gets replaced with the \zs and \ze atoms:
:%s/\u\zs_\ze\u/-/g
This is the shortest solution I'm aware of.
This should do what you want, assuming GNU sed.
sed -i -r -e 's/([A-Z]+)_([A-Z]+)/\1-\2/g' ./file_name
Explanation:
-r flag enables extended regex
[A-Z]+ is "one or more uppercase letters"
() groups a pattern together and creates a numbered memorized match
\1, \2 put those memorized matches in the replacement.
So basically this finds a chunk of uppercase letters followed by an underscore, followed by another chunk of uppercase letters, and memorizes only the letter chunks as 2 groups:
([A-Z]+)_([A-Z]+)
Then it replays those groups, but with a hyphen in between instead of an underscore.
\1-\2
The g flag at the end says to do this even if the pattern shows up multiple times on one line.
Note that this falls apart a little in this case:
QWQW_IOIO_ABAB
Because it matches the first time, but not the second; the second part won't match because IOIO was consumed by the first match. So that would result in
QWQW-IOIO_ABAB
This version drops the + so it only matches one uppercase letter, and won't break in the same way:
sed -i -r -e 's/([A-Z])_([A-Z])/\1-\2/g'
It still has a small flaw if you have a string like this:
A_B_C
Same issue as before, just with one letter now instead of multiple: the first match consumes B, so the second underscore is left alone and the result is A-B_C.
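One possible way around the consumed-letter problem, sketched here as an assumption rather than as part of the answer above, is to drop the g flag and let sed loop with a label and the t command until no substitution succeeds. The function name fix_underscores is made up, and -E is used in place of -r:

```shell
#!/usr/bin/env bash
# Repeat the single-letter substitution until nothing is left to replace:
# "t again" jumps back to the label whenever the s command changed the line.
fix_underscores() {
  sed -E -e ':again' -e 's/([A-Z])_([A-Z])/\1-\2/' -e 't again'
}

out1=$(printf '%s\n' 'A_B_C' | fix_underscores)
out2=$(printf '%s\n' "QWQW_IOIO_ABAB may contain '_' or '-'" | fix_underscores)
printf '%s\n' "$out1" "$out2"
```

Because every pass rescans the line from the start, a letter that was consumed by one match can serve as the left-hand neighbour of the next underscore on the following pass, while the quoted '_' stays untouched.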

What does this sed command line do?

I saw these lines while studying.
$temp = 'echo $line | sed s/[a-z AZ 0-9 _]//g'
IF($temp != '')
echo "Line contains illegal characters"
I don't understand. Isn't sed a substitution tool? In the code, anything matching [a-z AZ 0-9 _] should be replaced with ''. I don't understand how this determines whether $line has illegal characters.
sed is a stream editor tool that applies regular expressions to transform the input. The command
sed s/regex/replace/g
reads from stdin and every time it finds something matching regex, it replaces it with the contents of replace. In your case, the command
sed s/[a-z A-Z 0-9 _]//g
has [a-z A-Z 0-9 _] as its regular expression and the empty string as its replacement. (Did you forget a dash between the A and the Z?) This means that anything matching the regular expression gets deleted. The bracket expression means "any character that's either between a and z, between A and Z, between 0 and 9, a space, or an underscore," so this command deletes all letters, digits, spaces, and underscores from the input and writes what's left to stdout. Testing whether the output is empty then asks whether the line contained any character that wasn't alphanumeric, a space, or an underscore, which is how the code works.
I'd recommend adding sed to the list of tools you should get a basic familiarity with, since it's a fairly common one to see on the command-line.
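The check described above can be sketched as a runnable bash script. The function name is_clean and the sample lines are invented for illustration, and the bracket expression is written with the A-Z dash that the original snippet was missing:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the line consists entirely of
# letters, digits, spaces, and underscores.
is_clean() {
  local leftover
  # Delete every allowed character; anything left over is "illegal".
  leftover=$(printf '%s\n' "$1" | sed 's/[a-zA-Z0-9 _]//g')
  [ -z "$leftover" ]
}

for line in 'hello_world 42' 'rm -rf /'; do
  if is_clean "$line"; then
    echo "ok: $line"
  else
    echo "Line contains illegal characters: $line"
  fi
done
```

The second sample line fails the check because the - and / characters survive the deletion.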

How to replace in vim

I have a line in a source file: [12 13 15]. In vim, I type:
:%s/\([0-90-9]\) /\0, /g
wanting to add a comma after 12 and 13. It almost works, but it inserts an extra space: [12 , 13 , 15].
How can I achieve the desired effect?
Use \1 in the replacement expression, not \0.
\1 is the text captured by the first \(...\). If there were any more pairs of escaped parens in your pattern, \2 would hold the text captured by the pair starting at the second \(, \3 the one starting at the third \(, and so on.
\0 is the entire text matched by the whole pattern, whether in parentheses or not. In your case this includes the space at the end of your pattern.
Also note that [0-90-9] is the same as [0-9]: each [...] collection matches just one character. It happens to work anyway, because in your data ‘a digit followed by a space’ matches in the same places as ‘2 digits followed by a space’. (If you actually needed to only insert commas after 2 digits, you could write [0-9][0-9].)
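The difference between the whole match and the first group can also be seen outside Vim. The following sed sketch is an illustrative equivalent, not the original Vim command; in sed, & plays the role of Vim's \0:

```shell
#!/usr/bin/env bash
line='[12 13 15]'

# & is the whole match (digit plus space), so the space stays before the comma.
whole=$(printf '%s\n' "$line" | sed 's/\([0-9]\) /&, /g')
# \1 is only the captured digit, so the matched space is dropped.
group=$(printf '%s\n' "$line" | sed 's/\([0-9]\) /\1, /g')

printf '%s\n' "$whole" "$group"
# prints:
# [12 , 13 , 15]
# [12, 13, 15]
```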
"I have a line in a source file:..."
But then you type :%s/..., which performs the substitution on every line where the pattern matches. Or is that the only line in your file?
If it is the only line, you don't need a group or [0-9]; just :%s/ \+/, /g will do the job.
The other answers already point out good solutions, but here's another one,
making use of the \zs, which marks the start of the match. In this pattern:
/[0-9]\zs /
The searched text is /[0-9] /, but only the space counts as the match. Note
that you can use the class \d to simplify the digit character class, so the
following command should work for your needs:
:s/\d\d\zs /, /g ; matches only the space, replace by `, '
You said you have multiple lines and these changes apply only to certain lines.
You can either visually select the lines to be changed or use the :global
command, which searches for lines matching a pattern and applies a command to
them. For that you need a pattern that matches the lines to be changed,
kept as loose as you can get away with. If the lines that begin with optional
spaces, a [ and two digits are the only lines to be matched and no other
ones, then this would work for you:
:g/^\s*\[\d\d/s/\d\d\zs /, /g
Check the help (pattern.txt) for \zs, \ze and similar, and the help for
:global.
Homework: use the help to understand \zs and see how this works:
:s/\d\d\zs\ze /,/g
