How to perform following search and replace in vim?

I have the following string in the code at multiple places,
m_cells->a[ Id ]
and I want to replace it with
c(Id)
where the string Id could be anything including numbers also.

A regular expression replace like below should do:
%s/m_cells->a\[\s\(\w\+\)\s\]/c(\1)/g
If you wish to apply the replacement operation on a number of files you could use the :bufdo command.

Full explanation of @BasBossink's answer (as a separate answer because this won't fit in a comment), because regexes are awesome but non-trivial and definitely worth learning:
In Command mode (ie. type : from Normal mode), s/search_term/replacement/ will replace the first occurrence of 'search_term' with 'replacement' on the current line.
The % before the s tells vim to perform the operation on all lines in the document. Any range specification is valid here, eg. 5,10 for lines 5-10.
The g after the last / performs the operation "globally" - all occurrences of 'search_term' on the line or lines, not just the first occurrence.
The "m_cells->a" part of the search term is a literal match. Then it gets interesting.
Many characters have special meaning in a regex, and if you want to use the character literally, without the special meaning, then you have to "escape" it, by putting a \ in front.
Thus \[ and \] match the literal '[' and ']' characters.
Then we have the opposite case: literal characters that we want to treat as special regex entities.
\s matches white*s*pace (space, tab, etc.).
\w matches "*w*ord" characters (letters, digits, and underscore _).
(. matches any character (except a newline). \d matches digits. There are more...)
If a character is not followed by a quantifier, then exactly one such character matches. Thus, \s will match one space or tab, but not fewer or more.
\+ is a quantifier, and means "one or more". (\? matches 0 or 1; * (with no backslash) matches any number: zero or more. Warning: matching on zero occurrences takes a little getting used to; when you're first learning regexes, you don't always get the results you expected. It's also possible to match on an arbitrary exact number or range of occurrences, but I won't get into that here.)
\( and \) work together to form a "capturing group". This means that we don't just want to match on these characters, we also want to remember them specially so that we can do something with them later. You can have several capturing groups (Vim lets you refer back to up to nine), and they can be nested too. You can refer to them later by number, starting at 1 (not 0). Just start counting (escaped) left parentheses from the left to determine the number.
So here, we are matching a space followed by a group (which we will capture) of at least one "word" character followed by a space, within the square brackets.
The section between the second and third / is the replacement text.
The "c" is literal.
\1 means the first captured group, which in this case will be the "Id".
In summary, we are finding text that matches the given description, capturing part of it, and replacing the entire match with the replacement text that we have constructed.
Perhaps a final suggestion: c after the final / (doesn't matter whether it comes before or after the 'g') enables *c*onfirmation: vim will highlight the characters to be replaced and will show the replacement text and ask whether you want to go ahead. Great for learning.
Yes, regexes are complicated, but super powerful and well worth learning. Once you have them internalized, they're actually fairly easy. I suggest that, as with learning vim itself, you start with the basics, get fluent in them, and then incrementally add new features to your repertoire.
Good luck and have fun.
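If you want to experiment with the same pattern outside Vim, here is a rough Python translation. Treat it as a sketch only: Python's re module writes the group and the quantifier without backslashes, where Vim's default "magic" mode needs \( \) and \+. The sample line and the function name replace_cells are made up for illustration.

```python
import re

# Vim:    m_cells->a\[\s\(\w\+\)\s\]   (default "magic" mode)
# Python: the capturing group and the + quantifier take no backslashes.
PATTERN = re.compile(r"m_cells->a\[\s(\w+)\s\]")


def replace_cells(line: str) -> str:
    """Rewrite every m_cells->a[ Id ] on the line as c(Id), as the g flag does in Vim."""
    return PATTERN.sub(r"c(\1)", line)


# Hypothetical sample line, not taken from the question.
sample = "x = m_cells->a[ i ] + m_cells->a[ 42 ];"
print(replace_cells(sample))  # x = c(i) + c(42);
```

Because \s has no quantifier, text such as m_cells->a[i] (no spaces inside the brackets) is left alone, which matches the behaviour described above.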

Related

example from ch.16 "learn vimscript the hard way"

I'm trying to complete an exercise from https://learnvimscriptthehardway.stevelosh.com/chapters/16.html
The sample text to be worked on is:
Topic One
=========
This is some text about topic one.
It has multiple paragraphs.
Topic Two
=========
This is some text about topic two. It has only one paragraph.
The mapping to delete the heading of Topic One or Topic Two (depending on which body the cursor is placed in) and enter insert mode is:
:onoremap ih :<c-u>execute "normal! ?^==\\+$\r:nohlsearch\rkvg_"<cr>
Enter 'cih' in the body of either text below the headings and respective heading will be erased and the cursor will be placed there ready to go, in insert mode. Great mapping--but, I'm trying to understand what's happening with \+$.
When I omit \+$ and use this mapping:
:onoremap ih :<c-u>execute "normal! ?^==\r:nohlsearch\rkvg_"<cr>
it works fine, seemingly identically to the other mapping. So what is the use of the \+$?
Here is how Mr. Losh explains it:
The first piece,
?^==\+$
performs a search backwards for any line that consists of two
or more equal signs and nothing else. This will leave our cursor on
the first character of the line of equal signs.
But what does \+$ accomplish? I've tried to enter it manually in command mode but I just get an error sound. It works as intended as part of the full mapping, though. But like I said, when I remove it and run the full command without it, it works fine.
There's something I'm missing about the necessity of that '+$'... Maybe it has to do with the "two or more equal signs and nothing else"?

The author's command:
?^==\+$
searches backward for a line consisting exclusively of 2 or more equal signs:
^ anchors the pattern to the beginning of the line,
= matches a literal equal sign,
^= thus matches a literal equal sign at the beginning of the line,
= matches a second equal sign,
\+ matches one or more of the preceding atom, as many as possible,
=\+ thus matches one or more equal sign, as many as possible,
$ anchors the pattern to the end of the line,
so the pattern above is going to match any of the following lines:
==
===
=============
etc.
but not lines like:
==foo
==      <- six spaces
etc.
which is exactly the goal of that exercise.
Your command, on the other hand:
?^==
searches backward for a sequence of two equal signs at the beginning of a line:
^ anchors the pattern to the beginning of the line,
== matches two literal equal signs,
so your pattern is going to match the same lines as above:
==
===
=============
etc.
but also lines like:
==foo
==      <- six spaces
etc.
because it is not strict enough.
Your pattern would definitely be good enough if used manually to jump to one of those underlines because it gets the job done with minimal typing. But the goal, here, is to make a mapping. Those things have to be generalised to be reliable, which pretty much requires a level of explicitness and precision your pattern lacks.
In short, Steve's pattern checks all the boxes while yours doesn't: it is explicit and precise while yours is implicit and imprecise.
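To see the difference between the two patterns side by side, here is a small Python sketch. It is an approximation, not Vim itself: the Vim pattern ^==\+$ becomes ^==+$ in Python's re syntax, and the candidate lines and the helper name matching are invented for the demonstration.

```python
import re

STRICT = re.compile(r"^==+$")  # Vim: ^==\+$  (two or more = and nothing else)
LOOSE = re.compile(r"^==")     # Vim: ^==     (two = at the start, anything after)

# Hypothetical candidate lines; the fourth one is == followed by six spaces.
lines = ["==", "=========", "==foo", "==      ", "=", "Topic One"]


def matching(pattern, candidates):
    """Return the candidate lines the pattern matches."""
    return [c for c in candidates if pattern.search(c)]


print(matching(STRICT, lines))  # ['==', '=========']
print(matching(LOOSE, lines))   # ['==', '=========', '==foo', '==      ']
```

The loose pattern happens to work on the sample text only because that text contains no line that starts with == and then continues with something else.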

The \+$ is part of the regular expression matching a line of only equals signs. Without it, your mapping would recognize, for example,
This is not a heading
==This is not an underline
as a heading.
The \+ means "one or more of the previous character (=)", so ==\+ matches at least two equals signs in total. The $ means end of line, so there cannot be anything after the equals signs.

Dialogflow RE2 Regex

I am new here. I wanted to ask a question on using REGEX for an entity in DialogFlow
I wanted the entity to accept all text and spaces except for the symbol *
I have tried to use [A-Za-z0-9 ][^*], but it is not working. Any advice. thanks!

In your regex, [^*] may not do what you expect. Inside square brackets a leading ^ negates the set, so [^*] matches any single character except a literal asterisk. Outside of square brackets, * is a quantifier, so to match a literal asterisk there you need to write \*
If you want to match a line of letters or numbers as in the [A-Za-z0-9] example you give, but only if that string does not include an asterisk, then this expression should work for you:
^[a-zA-Z0-9]+$
This means "match a whole line of text if it only contains one or more of the characters a-z, A-Z, or 0-9".
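If you want to match any character or group of characters in a line except for the asterisk, and your regex engine supports lookarounds, then you could use something like this (note that RE2, the engine named in the question, does not support lookaheads or lookbehinds):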
(?!\*)([a-zA-Z0-9]+)(?<!\*)
The first part is called a "negative lookahead," and it looks forward to ensure we're not matching the asterisk. The last part is called a "negative lookbehind," and it looks backwards to make sure we're not matching the asterisk. The middle part is your "capture group," and confirms that you're matching any letters or numbers in a given string, but excluding the * character.
If this Regex gets input like *abc, it will capture abc. If it encounters abc*, it will still capture abc. If it encounters abc*def, it will capture abc and def as two separate matches, because it will break around the asterisk.
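A quick way to check those three cases is the Python sketch below. Two caveats: Python's re module accepts lookarounds, whereas RE2 rejects them, and the helper name runs is invented here. The sketch also suggests that, for this particular pattern, the plain character class finds the same runs without any lookarounds, since [a-zA-Z0-9] can never match an asterisk in the first place.

```python
import re

WITH_LOOKAROUNDS = re.compile(r"(?!\*)([a-zA-Z0-9]+)(?<!\*)")
PLAIN = re.compile(r"[a-zA-Z0-9]+")  # no lookarounds, so also valid RE2 syntax


def runs(pattern, text):
    """Return the full text of every match, left to right."""
    return [m.group(0) for m in pattern.finditer(text)]


for text in ["*abc", "abc*", "abc*def"]:
    print(text, runs(WITH_LOOKAROUNDS, text), runs(PLAIN, text))
```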
This link explains the concept of lookarounds in Regex. You can also use this Regex tester to get started practicing your Regular Expressions with explanations of what each block of characters does.
EDITED TO ADD If you're just interested in matching single characters rather than groups of characters, you can use [A-Za-z0-9] and match any upper or lowercase letter and any single digit. You don't need to exclude the * character, because the character group is already exclusive.
This is a slight duplicate of the question below, so responses here may also help you. Hope this helps!
How can I exclude asterisk in a regex expression

[A-Za-z0-9 ][^*]
What your regex will do is match 2 consecutive characters. First, it will look for one character from the set A-Za-z0-9 or a space. Then, it will look at the negated set [^*], which matches ANY single character except *.
You can type your regex into https://regexr.com/ to see a breakdown of how it matches and test some strings.
For example, your regex would match these:
Aa
AA
a&
A1
0_
But would not match these:
A*
a*
1*
And a single match WOULD NOT cover more than 2 characters. If you really want to match any string with any characters except *, this should work:
[^\*]+
What that will do is match any number of consecutive characters that are not *. (The + means match 1 or more characters in the set). Escaping the * is optional here: outside square brackets * is a reserved character in regex, but inside a character class (the square brackets) most regex parsers treat it as the literal char *. Escaping it anyway is harmless and makes the intent explicit. (And by that same token, you may want to use \s instead of the blank space in your original regex, keeping in mind that \s also matches tabs and newlines.)
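The two patterns can be compared with a short Python sketch. Both patterns use only syntax that Python's re and RE2 share, but the check itself is an assumption about how you want to validate input: it requires the whole string to match, and the helper name whole is made up.

```python
import re

TWO_CHARS = re.compile(r"[A-Za-z0-9 ][^*]")  # the pattern from the question
NO_ASTERISK = re.compile(r"[^*]+")           # one or more non-asterisk characters


def whole(pattern, text):
    """True when the pattern matches the entire string, not just part of it."""
    return pattern.fullmatch(text) is not None


print(whole(TWO_CHARS, "Aa"), whole(TWO_CHARS, "A*"), whole(TWO_CHARS, "abc"))
print(whole(NO_ASTERISK, "hello world 42"), whole(NO_ASTERISK, "a*b"))
```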

Searching for an exact match with a singular digit

I'm trying to search for only a singular digit in vim by itself. For example, if there are two sets of digits 1 and 123 and I want to search for 1, I would only want the singular 1 digit to be found.
I have tried using regular expressions like \<1> and \%(a)#

You almost had the right solution. You want:
\<1\>
This is because each angled bracket needs to be escaped. Alternatively, you could use:
\v<1>
The \v flag tells vim to treat more characters as special without needing to be escaped (for example, (){}+<> all become special rather than literal text). Read :h /\v for more on this.
A great reference for learning regex in vim is vimregex.com. The \<\> characters are explained in 4.1 "Anchors".
If you also want to avoid matching the 1 in text like 1.23, this is possible too. Two different approaches:
Modify the iskeyword option so that it includes the period (.). This will also affect how w moves.
Use \v<1(\d|\.)@!, which basically means "a 1 at the beginning of a word, that isn't followed by some other digit or a period."
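For comparison, here is roughly how the two searches behave when translated to Python. This is a sketch under assumptions: \b stands in for Vim's \< and \>, (?!...) stands in for Vim's negative lookahead, Python decides what a word character is on its own rather than through iskeyword, and the sample text and the helper name starts are invented.

```python
import re

WORD_BOUNDED = re.compile(r"\b1\b")           # roughly Vim's \<1\>
NOT_IN_DECIMAL = re.compile(r"\b1(?![\d.])")  # 1 at a word start, not followed by a digit or a period


def starts(pattern, text):
    """Return the start offset of every match."""
    return [m.start() for m in pattern.finditer(text)]


text = "1 and 123 and 1.23"
print(starts(WORD_BOUNDED, text))    # [0, 14]
print(starts(NOT_IN_DECIMAL, text))  # [0]
```

The word-boundary version still finds the 1 in 1.23, because the period is not a word character; the lookahead version skips it.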

Substitute `number` with `(number)` in multiple lines

I am a beginner at Vim and I've been reading about substitution but I haven't found an answer to this question.
Let's say I have some numbers in a file like so:
1
2
3
And I want to get:
(1)
(2)
(3)
I think the command should resemble something like :s:\d\+:........ Also, what's the difference between :s/foo/bar and :s:foo:bar ?
Thanks

Here is an alternative, slightly less verbose, solution:
:%s/^\d\+/(&)
Explanation:
^ anchors the pattern to the beginning of the line
\d is the atom that covers 0123456789
\+ matches one or more of the preceding item
& is a shorthand for \0, the whole match

Let me address those in reverse.
First: there's no difference between :s/foo/bar and :s:foo:bar; whatever delimiter you use after the s, vim will expect you to use from then on. This can be nice if you have a substitution involving lots of slashes, for instance.
As for the first question: to do this to the first number on the current line (assuming no commas, decimal places, etc), you could do
:s:\(\d\+\):(\1)
The \(...\) doesn't change what is matched - rather, it tells vim to remember whatever matched what is inside, and store it. The first \(...\) is stored in \1, the second in \2, etc. So, when you do the replacement, you can reference \1 to get the number back.
If you want to change ALL numbers on the current line, change it to
:s:\(\d\+\):(\1):g
If you want to change ALL numbers on ALL lines, change it to
:%s:\(\d\+\):(\1):g

You can do what you want with:
:%s/\([0-9]\)/(\1)/
%s runs the substitution on every line in the file. The \( \) defines a group, which in turn is referenced by \1. So the above search and replace finds the first digit ([0-9]) on each line and replaces it with that digit surrounded by parentheses. Because there is no g flag and the pattern matches a single digit, a number such as 12 would become (1)2; use [0-9]\+ to wrap whole numbers.
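The substitutions above can be tried out in Python as well. This is only an approximation of the Vim commands: \g<0> plays the role of Vim's & (the whole match), count=1 mimics leaving out the g flag on a single line, and both function names are made up.

```python
import re


def wrap_numbers(text: str) -> str:
    """Wrap every run of digits in parentheses, like :%s:\\(\\d\\+\\):(\\1):g in Vim."""
    # \g<0> is the whole match, the counterpart of & in a Vim replacement.
    return re.sub(r"\d+", r"(\g<0>)", text)


def wrap_first_digit(line: str) -> str:
    """Wrap only the first single digit, like :s/\\([0-9]\\)/(\\1)/ on one line."""
    return re.sub(r"([0-9])", r"(\1)", line, count=1)


print(wrap_numbers("1\n2\n3"))
print(wrap_first_digit("12"))  # (1)2
```

The second function shows why a single-digit pattern is risky on multi-digit numbers.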

replacing part of regex matches

I have several functions that start with get_ in my code:
get_num(...) , get_str(...)
I want to change them to get_*_struct(...).
Can I somehow match the get_* regex and then replace according to the pattern so that:
get_num(...) becomes get_num_struct(...),
get_str(...) becomes get_str_struct(...)
Can you also explain some logic behind it, because the theoretical regex aren't like the ones used in UNIX (or vi, are they different?) and I'm always struggling to figure them out.
This has to be done in the vi editor as this is main work tool.
Thanks!

To transform get_num(...) to get_num_struct(...), you need to capture the correct text in the input. And, you can't put the parentheses in the regular expression because you may need to match pointers to functions too, as in &get_distance, and uses in comments. However, and this depends partially on the fact that you are using vim and partially on how you need to keep the entire input together, I have checked that this works:
%s/get_\w\+/&_struct/g
On every line, find every expression starting with get_ and continuing with at least one letter, number, or underscore, and replace it with the entire matched string followed by _struct.
Darn it; I shouldn't answer these things on spec. Note that other regex engines might use \& instead of &. This depends on having magic set, which is default in vim.

For an alternate way to do it:
%s/get_\(\w*\)(/get_\1_struct(/g
What this does:
\w matches to any "word character"; \w* matches 0 or more word characters.
\(...\) tells vim to remember whatever matches .... So, \(\w*\) means "match any number of word characters, and remember what you matched". You can then access it in the replacement with \1 (or \2 for the second, etc.)
So, the overall pattern get_\(\w*\)( looks for get_, followed by any number of word chars, followed by (.
The replacement then just does exactly what you want.
(Sorry if that was too verbose - not sure how comfortable you are with vim regex.)
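The difference between the two answers shows up when a name is not followed by an opening parenthesis. The Python sketch below illustrates this; it is a translation rather than the Vim commands themselves (\g<0> stands in for &, groups are written without backslashes), and the function names and the sample line are hypothetical.

```python
import re


def add_suffix_whole_match(code: str) -> str:
    """Approximates %s/get_\\w\\+/&_struct/g : append _struct to every get_ name."""
    return re.sub(r"get_\w+", r"\g<0>_struct", code)


def add_suffix_before_paren(code: str) -> str:
    """Approximates %s/get_\\(\\w*\\)(/get_\\1_struct(/g : only names followed by (."""
    return re.sub(r"get_(\w*)\(", r"get_\1_struct(", code)


# Hypothetical sample: one call and one function pointer.
sample = "a = get_num(x); f = &get_str;"
print(add_suffix_whole_match(sample))   # a = get_num_struct(x); f = &get_str_struct;
print(add_suffix_before_paren(sample))  # a = get_num_struct(x); f = &get_str;
```

Which behaviour you want depends on whether function pointers and mentions in comments should be renamed too.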
