vim: replace sub-match with the same number of new strings

My plan is to do a pretty standard search replace, replacing all instances of old_string with new_string. The problem is that I only want to do this for an arbitrary number of old_strings following a specific prefix. So for example:
old_string = "a"
new_string = "b"
prefix = "xxxx"
xxxxaaaaaaaa => xxxxbbbbbbbb
xxxxaaapostfix => xxxxbbbpostfix
xxaaaa => xxaaaa
etc. I'm not sure how to do this. I imagine there's some way to say s/xxxxa*/xxxxb{number of a's}/g or something, but I have no idea what it is.

You can definitely do this! I would use \= (a sub-replace-expression) to evaluate some Vimscript. From :h s/\=:
Substitute with an expression *sub-replace-expression*
*sub-replace-\=* *s/\=*
When the substitute string starts with "\=" the remainder is interpreted as an
expression.
The special meaning for characters as mentioned at |sub-replace-special| does
not apply except for "<CR>". A <NL> character is used as a line break, you
can get one with a double-quote string: "\n". Prepend a backslash to get a
real <NL> character (which will be a NUL in the file).
Then you can use the repeat and submatch functions to build the right string. For example:
:%s/\(xxxx\)\(a\+\)/\=submatch(1).repeat('b', len(submatch(2)))
I chose to use \+ instead of * because then the pattern will no longer be found after the substitute command has finished (this affects hlsearch and n). With *, a bare xxxx with no a's after it would still match.
Of course, if you use the \zs and \ze (start/end of match) atoms, you need fewer capturing groups, which makes this waaay shorter and clearer.
:%s/xxxx\zsa\+/\=repeat('b', len(submatch(0)))
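To try the idea outside of an :s command, here is a minimal sketch using the substitute() function, which also accepts a \= expression as its replacement. The sample strings come from the question; the variable names are only illustrative.

```vim
" Sample lines from the question (variable names are illustrative).
let samples = ['xxxxaaaaaaaa', 'xxxxaaapostfix', 'xxaaaa']
let replaced = []
for text in samples
  " \zs drops the xxxx prefix from the match, so submatch(0) is just the a's.
  call add(replaced, substitute(text, 'xxxx\zsa\+', '\=repeat("b", len(submatch(0)))', 'g'))
endfor
echo replaced
" Expected: ['xxxxbbbbbbbb', 'xxxxbbbpostfix', 'xxaaaa']
```

The last sample is left alone because its prefix is only xx, so the pattern never matches.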

If you have perl support, you can use
:%perldo s/xxxx\Ka+/"b" x length($&)/ge
xxxx\Ka+ matches one or more a's, but only when they are preceded by xxxx
\K behaves like a lookbehind: everything matched before it is kept out of the replaced text
/ge: g replaces all occurrences in the line, and e evaluates the replacement section as Perl code
"b" x length($&) is the string b repeated length($&) times
See :h perl for more info

Related

Find and replace only part of a single line in Vim

Most substitution commands in vim perform an action on a full line or a set of lines, but I would like to only do this on part of a line (either from the cursor to end of the line or between set marks).
example
this_is_a_sentence_that_has_underscores = this_is_a_sentence_that_should_not_have_underscores
into
this_is_a_sentence_that_has_underscores = this is a sentence that should not have underscores
This task is very easy to do for the whole line :s/_/ /g, but seems to be much more difficult to only perform the replacement for anything after the =.
Can :substitution perform an action on half of a line?
Two solutions I can think of.
Option one, use the before/after column match atoms \%>123c and \%<456c.
In your example, the following command substitutes underscores only in the second word, between columns 42 and 94:
:s/\%>42c_\%<94c/ /g
Option two, use the Visual area match atom \%V.
In your example, Visual-select the second long word, leave Visual mode, then execute the following substitution:
:s/\%V_/ /g
These regular expression atoms are documented at :h /\%c and :h /\%V respectively.
Look-around
There is a big clue in your post already:
only perform the replacement for anything after the =.
This often means using a positive look-behind, \@<=.
:%s/\(=.*\)\@<=_/ /g
This means match all _ that are after the following pattern =.*. Since all look-arounds (look-aheads and look-behinds) are zero width they do not take up space in the match and the replacement is simple.
Note: This is equivalent to (?<=...) in perl speak. See :h perl-patterns.
What about \zs?
\zs sets the start of the match at a certain point. On the face of it, this sounds like exactly what is needed. However, \zs will not work correctly here, because the pattern before the \zs is matched first and then the pattern after it. With a greedy =.* in front, this means there will only be one match per line. Look-behinds, on the other hand, match the part after \@<= and then "look behind" to make sure the match is valid, which makes them great for multiple-replacement scenarios.
It should be noted that when you can use \zs, it is not only easier to type but also more efficient.
Note: \zs is like \K in perl speak.
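The difference between the two is easier to see side by side. This is a small sketch with the substitute() function on a made-up sample line (the line and the variable names are assumptions, not from the question):

```vim
" Illustrative sample line (not from the original post).
let sample = 'a_b = c_d_e'
" Look-behind: every _ that has an = somewhere before it is replaced.
let with_lookbehind = substitute(sample, '\(=.*\)\@<=_', ' ', 'g')
" \zs after a greedy =.* : only the last _ ends up being replaced.
let with_zs = substitute(sample, '=.*\zs_', ' ', 'g')
echo with_lookbehind
" Expected: a_b = c d e
echo with_zs
" Expected: a_b = c_d e
```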
More ways?!?
As @glts mentioned, you can use other zero-width atoms to basically "anchor" your pattern. A list of a few common ways:
\%>'a - after the 'a mark
\%V - match inside the visual area
\%>42c - match after column 42
The possible downside of these methods is that they need you to set marks or count columns. There is nothing wrong with this, but it means the substitution depends on that outside state, so repeating the substitution later may not work correctly.
For more help see:
:h /\@<=
:h /zero-width
:h perl-patterns
:h /\zs

How to perform following search and replace in vim?

I have the following string in the code at multiple places,
m_cells->a[ Id ]
and I want to replace it with
c(Id)
where the string Id could be anything including numbers also.
A regular expression replace like below should do:
%s/m_cells->a\[\s\(\w\+\)\s\]/c(\1)/g
If you wish to apply the replacement operation on a number of files you could use the :bufdo command.
Full explanation of @BasBossink's answer (as a separate answer because this won't fit in a comment), because regexes are awesome but non-trivial and definitely worth learning:
In Command mode (ie. type : from Normal mode), s/search_term/replacement/ will replace the first occurrence of 'search_term' with 'replacement' on the current line.
The % before the s tells vim to perform the operation on all lines in the document. Any range specification is valid here, eg. 5,10 for lines 5-10.
The g after the last / performs the operation "globally" - all occurrences of 'search_term' on the line or lines, not just the first occurrence.
The "m_cells->a" part of the search term is a literal match. Then it gets interesting.
Many characters have special meaning in a regex, and if you want to use the character literally, without the special meaning, then you have to "escape" it, by putting a \ in front.
Thus \[ and \] match the literal '[' and ']' characters.
Then we have the opposite case: literal characters that we want to treat as special regex entities.
\s matches white*s*pace (space, tab, etc.).
\w matches "*w*ord" characters (letters, digits, and underscore _).
(. matches any character (except a newline). \d matches digits. There are more...)
If a character is not followed by a quantifier, then exactly one such character matches. Thus, \s will match one space or tab, but not fewer or more.
\+ is a quantifier, and means "one or more". (\? matches 0 or 1; * (with no backslash) matches any number: zero or more. Warning: matching on zero occurrences takes a little getting used to; when you're first learning regexes, you don't always get the results you expected. It's also possible to match on an arbitrary exact number or range of occurrences, but I won't get into that here.)
\( and \) work together to form a "capturing group". This means that we don't just want to match on these characters, we also want to remember them specially so that we can do something with them later. You can have any number of capturing groups, and they can be nested too. You can refer to them later by number, starting at 1 (not 0). Just start counting (escaped) left parentheses from the left to determine the number.
So here, we are matching a space followed by a group (which we will capture) of at least one "word" character followed by a space, within the square brackets.
The section between the second and third / is the replacement text.
The "c" is literal.
\1 means the first captured group, which in this case will be the "Id".
In summary, we are finding text that matches the given description, capturing part of it, and replacing the entire match with the replacement text that we have constructed.
Perhaps a final suggestion: c after the final / (doesn't matter whether it comes before or after the 'g') enables *c*onfirmation: vim will highlight the characters to be replaced and will show the replacement text and ask whether you want to go ahead. Great for learning.
Yes, regexes are complicated, but super powerful and well worth learning. Once you have them internalized, they're actually fairly easy. I suggest that, as with learning vim itself, you start with the basics, get fluent in them, and then incrementally add new features to your repertoire.
Good luck and have fun.
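If you want to experiment with the pattern without touching a buffer, the same search and replacement can be fed to the substitute() function. This is only a sketch; the sample line and variable names are made up for illustration:

```vim
" Made-up sample line containing two occurrences of the target text.
let code_line = 'x = m_cells->a[ Id ] + m_cells->a[ n2 ];'
" Same pattern and replacement as in the :%s command above.
let rewritten = substitute(code_line, 'm_cells->a\[\s\(\w\+\)\s\]', 'c(\1)', 'g')
echo rewritten
" Expected: x = c(Id) + c(n2);
```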

An easy way to center text between first and last non-white word in vim?

Is there an easy way using a macro or ~10 line function (no plugin!) to center some text between the first and last word (=sequence of non-blank characters) on a line? E.g. to turn
>>> No user serviceable parts below. <<<
into
>>> No user serviceable parts below. <<<
by balancing the spaces +/-1? You can assume no tabs and the result should not contain tabs, but note that the first word may not start in column 1. (EDIT: ... in fact, both delimiter words as well as the start and end of the text to center may be on arbitrary columns.)
source this function:
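fun! CenterInSpaces()
  let l = getline('.')
  " whitespace right after the first word
  let lre = '\v^\s*\S+\zs\s*\ze'
  " whitespace right before the last word
  let rre = '\v\zs\s*\ze\S+\s*$'
  " collect all padding from both sides
  let sp = matchstr(l,lre)
  let sp = sp.matchstr(l,rre)
  let ln = len(sp)
  " give half of the padding to each side (strpart also copes with ln < 2)
  let l = substitute(l,lre,strpart(sp,0,ln/2),'')
  let l = substitute(l,rre,strpart(sp,ln/2),'')
  call setline('.',l)
endf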
note
This function might NOT work in all cases. I just wrote it quickly for the usual case; this is not a plugin, after all.
The code could be shortened by combining function calls, but I think it is clearer this way, so I left it like this.
If it works for you, you could create a mapping.
It works like this: (for the last two lines I typed @: to repeat the command call)
You can use the :s command with the \= aka sub-replace-expression.
:s#\v^\s*\S+\zs(\s+)(.{-})(\s+)\ze\S+\s*$#\=substitute(submatch(1).submatch(3),'\v^(\s*)(\1\s=)$','\1'.escape(submatch(2),'~&\').'\2','')#
Overview
Capture the text (including white-space) between the >>> and <<< marks. Divide up the white-space on both sides of the text in half and substitute in the non-white-space text in between. This white-space balancing act is done via the regex engine's backtracking because math is hard. Let's go shopping!
Notes:
using \v or very magic mode to reduce escaping as this command is long enough already
use # as an alternative separator instead of the usual / for :s/pat/sub/ in hopes to make it slightly more readable
Matching Pattern
:s#\v^\s*\S+\zs(\s+)(.{-})(\s+)\ze\S+\s*$#...
:s with no range supplied only does the substitution on the current line.
^\s*\S+ match the starting white-space followed by non-white-space. >>> in this case.
(\s+)(.{-})(\s+) match white-space followed by the "text" followed by white-space
3 capture groups: 1) leading white-space, 2) the "text", and 3) trailing white-space. These will be later referenced by submatch(1), submatch(2), and submatch(3) respectively
.{-} is vim-speak for non-greedy matching or .*? in perl-speak
without the non-greedy matching the second capture group would include too much white-space at its end
\S+\s*$ match the non-white-space (i.e. <<<) and any trailing white-space
Use \zs and \ze to designate the start and end of the match to be replaced
Replacement
\=substitute(submatch(1).submatch(3),'\v^(\s*)(\1\s=)$','\1'.escape(submatch(2),'~&\').'\2','')
\= tells vim that replacement will be a vim expression. Also allows the use of submatch() functions
substitute({str}, {pat}, {sub}, {flags}) Our expression will be a nested substitution
substitute(submatch(1).submatch(3), ...) do a substitute over the concatenation of leading and trailing white-spacing captured in submatch(1) and submatch(3)
The {pat} is ^(\s*)(\1\s=)$. Match some white-space followed by white-space of the same length as the first or 1 character longer. Capture both halves.
escape(submatch(2),'~&\') escape submatch(2) for any special characters. e.g. ~,&,\1, ...
The {sub} is '\1'.escape(submatch(2),'~&\').'\2'. Replace with the escaped submatch(2) (i.e. the "text" we want to center) in between the halves of white-space, \1 and \2 from the {pat}
No {flag}'s are needed so ''
Usage
If you use this often I would suggest creating a command and putting it in ~/.vimrc.
command! -range -bar -nargs=0 CenterBetween <line1>,<line2>s#\v^\s*\S+\zs(\s+)(.{-})(\s+)\ze\S+\s*$#\=substitute(submatch(1).submatch(3),'\v^(\s*)(\1\s=)$','\1'.escape(submatch(2),'~&\').'\2','')#
Otherwise use this once and then repeat the last substitution via & on each needed line.
For more help see
:h :s/
:h :s/\=
:h sub-replace-\=
:h submatch(
:h substitute(
:h escape(
:h /\v
:h /\S
:h /\{-
:h /\zs
:h &
EDIT by Kent
Don't be jealous, your answer has it too. ^_^
I didn't change the command; I just copied and pasted it into my vim and only added |noh at the end to disable highlighting.
If you execute this command, it looks like:
I don't know of any good way. I usually do it in a semi-automatic way, by using :center on a line of text that only contains the parts that are to be centered and then move the result into the line containing the surrounding parts.
If nobody else has a better answer, perhaps boxes can help if you need to do this kind of thing a lot.

How to replace in vim

I have a line in a source file: [12 13 15]. In vim, I type:
:%s/\([0-90-9]\) /\0, /g
wanting to add a comma after 12 and 13. It works, but not quite, as it inserts an extra space: [12 , 13 , 15].
How can I achieve the desired effect?
Use \1 in the replacement expression, not \0.
\1 is the text captured by the first \(...\). If there were any more pairs of escaped parens in your pattern, \2 would match the text capture between the pair starting at the second \(, \3 at the third \(, and so on.
\0 is the entire text matched by the whole pattern, whether in parentheses or not. In your case this includes the space at the end of your pattern.
Also note that [0-90-9] is the same as [0-9]: each [...] collection matches just one character. It happens to work anyway, because in your data ‘a digit followed by a space’ matches in the same places as ‘2 digits followed by a space’. (If you actually needed to only insert commas after 2 digits, you could write [0-9][0-9].)
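To see the difference between \0 and \1 directly, here is a small sketch using the substitute() function on the line from the question (the variable names are illustrative):

```vim
" The line from the question.
let numbers = '[12 13 15]'
" \0 is the whole match, including the trailing space, so a stray space remains.
let with_whole_match = substitute(numbers, '\([0-9]\) ', '\0, ', 'g')
" \1 is only the captured digit, so the comma lands right after it.
let with_group = substitute(numbers, '\([0-9]\) ', '\1, ', 'g')
echo with_whole_match
" Expected: [12 , 13 , 15]
echo with_group
" Expected: [12, 13, 15]
```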
"I have a line in a source file:..."
Then you type :%s/..., which does the substitution on every line that matches. Or is that the only line in your file?
If it is the only line, you don't need the group or the [0-9]; just :%s/ \+/, /g will do the job.
The fine answers already point interesting solutions, but here's another one,
making use of the \zs, which marks the start of the match. In this pattern:
/[0-9]\zs /
The searched text is /[0-9] /, but only the space counts as a match. Note
that you can use the class \d to simplify the digit character class, so the
following command shall work for your needs:
:s/\d\d\zs /, /g ; matches only the space, replace by `, '
You said you have multiple lines and these changes are only to certain lines.
You can either visually select the lines to be changed or use the :global
command, which searches for lines matching a pattern and applies a command to
them. Now you'd need to build an expression to match the lines to be changed
as loosely as possible. If the lines that begin with optional
spaces, a [ and two digits are the only lines to be matched and no other
ones, then this would work for you:
:g/^\s*\[\d\d/s/\d\d\zs /, /g
Check the help for pattern.txt for \ze and similar and
:global.
Homework: use the help to understand \zs and see how this works:
:s/\d\d\zs\ze /,/g

How do I capture the output of a vim command in a register, without the newlines?

This is related to this question:
How to redirect ex command output into current buffer or file?
However, the problem with using :redir is that it causes 3 or 4 extra newlines in front of the output, and they appear to be difficult to remove using the substitute function.
For example, if I do the following:
:redir @a
:pwd
:redir END
The contents of @a consist of three blank lines and then the normal expected output.
I tried to post process with something like this:
:let @b = substitute(@a, '\s*\(.\{-}\)\s*', '\1', '')
But the result is that @b has the same contents as @a.
Does anyone know a more effective (i.e. working) way to postprocess, or a replacement for :redir that doesn't have those extra lines?
The value in the b register is unchanged from the value in the a register because your regexp is failing to match.
You need to write the grouping parentheses and the opening repetition brace with backslashes.
See :help /magic; effectively, the magic option is always on for substitute() regexps.
\s only matches SP and TAB (not LF); but \_s does include LF (alternately, you could use \n to just match LF).
You need to anchor the end of the expression so that the \{-} does not “give up” without matching anything. Without the anchor, everything but the initial newlines is left unmatched, and thus unreplaced from the input string.
Here is a modified version of your substitution:
:let @b = substitute(@a,'\_s*\(.\{-}\)\_s*$','\1','')
It may be simpler to just think about deleting leading and trailing whitespace instead of matching everything in between. This can be done in a single substitution by using the g substitution modifier (repeated substitutions) with a regexp that uses the alternation operator where one alternate is anchored to the start of the string (%^) and the other is anchored to the end of the string (%$).
substitute(@a,'\v%^\_s+|\_s+%$','','g')
This regexp uses \v to avoid having to add backslashes for %^, +, |, and %$.
Change both occurrences of \_s to \n if you just want to trim leading/trailing newlines (instead of SP, TAB, or NL).
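As a quick check of the trimming regexp, here is a minimal sketch that runs it on a hand-written string instead of a real register. The sample value (three leading newlines, a made-up path, one trailing newline) and the variable names are assumptions for illustration:

```vim
" Stand-in for the register contents after :redir (made-up path).
let captured = "\n\n\n/home/user\n"
" Strip whitespace, including newlines, at the very start and very end.
let trimmed = substitute(captured, '\v%^\_s+|\_s+%$', '', 'g')
echo trimmed
" Expected: /home/user
```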
