What does \#<= and \#= mean in Vim command? - vim

Can't understand \#<= and \#= Benoit's answer of this post, anyone can help explain them?

From vim documentation for patterns
\#= Matches the preceding atom with zero width. {not in Vi}
Like "(?=pattern)" in Perl.
Example matches
foo\(bar\)\#= "foo" in "foobar"
foo\(bar\)\#=foo nothing
*/zero-width*
When using "\#=" (or "^", "$", "\<", "\>") no characters are included
in the match. These items are only used to check if a match can be
made. This can be tricky, because a match with following items will
be done in the same position. The last example above will not match
"foobarfoo", because it tries match "foo" in the same position where
"bar" matched.
Note that using "\&" works the same as using "\#=": "foo\&.." is the
same as "\(foo\)\#=..". But using "\&" is easier, you don't need the
braces.
\#<= Matches with zero width if the preceding atom matches just before what
follows. |/zero-width| {not in Vi}
Like '(?<=pattern)" in Perl, but Vim allows non-fixed-width patterns.
Example matches
\(an\_s\+\)\#<=file "file" after "an" and white space or an
end-of-line
For speed it's often much better to avoid this multi. Try using "\zs"
instead |/\zs|. To match the same as the above example:
an\_s\+\zsfile
"\#<=" and "\#<!" check for matches just before what follows.
Theoretically these matches could start anywhere before this position.
But to limit the time needed, only the line where what follows matches
is searched, and one line before that (if there is one). This should
be sufficient to match most things and not be too slow.
The part of the pattern after "\#<=" and "\#<!" are checked for a
match first, thus things like "\1" don't work to reference \(\) inside
the preceding atom. It does work the other way around:
Example matches
\1\#<=,\([a-z]\+\) ",abc" in "abc,abc"

Related

Select first char up to first non camelCase or non upper case char or up to first snake case _ in vim

I used this map:
map ,w v/\([^ a-z0-9]\|[^ A-Z0-9]\)*<cr>h
the idea is to select
in the words
mysuperTest
MYSUPER_TEST
mysuper_test
to always select the part that says mysuper
but it doesnt work, not sure why
I would use something like the below:
nnoremap ,w v/\C\%#.\([a-z]\+\<bar>[A-Z]\+\)\zs<cr>h
One point to notice is that in a mapping you need to use <bar> (or escape | with an extra backslash) since otherwise | is recognized as a command separator (see :help map-bar.)
Another one to notice is that you want the match to start at the first character outside the word (so you'll land at the end of the word with the h). The visual selection will expand to the start of the match in a search. I suggest using \zs to set the start of the match explicitly (see :help /\zs.)
Finally, beware of 'ignorecase' or 'smartcase' settings. Use \C to explicitly request a case-sensitive match (see :help /\C.)
I also like the idea of using a stronger anchor for the start of the match, so I'm using \%# to match the current cursor position (see :help /\%#), so you're always sure to match the current word only and not end up wandering through the buffer.
Putting it all together:
\C Case-sensitive search
\%# From cursor position
. Skip first character
\( Either one of:
[a-z]\+ One or more lowercase letters
\| (\<bar>) Or:
[A-Z]\+ One or more uppercase letters
\) End group
\zs Set match position here
I'm skipping the first character under the cursor, since in a CamelCase word, the first character won't match capitalization of the remainder of the word.
I kept your original idea of finding the first character after the word then using h to go back one to the left. But that might be a problem if, for example, the word is at the end of the line.
You can actually match the last character of the word instead with something like [a-z]\+\zs[a-z], which will set the start of the match on the last lowercase character. You can do this for both sides of the group (you can have more than one \zs in your pattern, last wins.) If you structure your match that way, you won't need the final h to go back.
I didn't handle numbers, I'll leave those as an exercise to the reader.
Finally, consider there are quite a few corner cases that can make such a mapping quite tricky to get right. Rather than coming up with your own, why not look at plug-ins which add support for handling CamelCase words that have been battle-tested and will cover use cases a lot more advanced than the simple expression you're using here?
There's the excellent vim-scripts/camelcasemotion by Ingo Karkat which sets up a ,w mapping to move to the start of the next CamelCase word, but also i,w to select the current one. You can use powerful combinations such as v3i,w to visually select the current and next two CamelCase words.
You might also check Tim Pope's tpope/vim-abolish which, among other features, defines a set of cr mappings to do coercion from camelCase to MixedCase, snake_case, UPPER_CASE, etc. (Not directly about selecting them, but still related and you might find it useful.)

Find and replace only part of a single line in Vim

Most substitution commands in vim perform an action on a full line or a set of lines, but I would like to only do this on part of a line (either from the cursor to end of the line or between set marks).
example
this_is_a_sentence_that_has_underscores = this_is_a_sentence_that_should_not_have_underscores
into
this_is_a_sentence_that_has_underscores = this is a sentence that should not have underscores
This task is very easy to do for the whole line :s/_/ /g, but seems to be much more difficult to only perform the replacement for anything after the =.
Can :substitution perform an action on half of a line?
Two solutions I can think of.
Option one, use the before/after column match atoms \%>123c and \%<456c.
In your example, the following command substitutes underscores only in the second word, between columns 42 and 94:
:s/\%>42c_\%<94c/ /g
Option two, use the Visual area match atom \%V.
In your example, Visual-select the second long word, leave Visual mode, then execute the following substitution:
:s/\%V_/ /g
These regular expression atoms are documented at :h /\%c and :h /\%V respectively.
Look-around
There is a big clue your post already:
only perform the replacement for anything after the =.
This often means using a positive look-behind, \#<=.
:%s/\(=.*\)\#<=_/ /g
This means match all _ that are after the following pattern =.*. Since all look-arounds (look-aheads and look-behinds) are zero width they do not take up space in the match and the replacement is simple.
Note: This is equivalent to (?<=...) in perl speak. See :h perl-patterns.
What about \zs?
\zs will set the start of a match at a certain point. On the face this sounds exactly what is needed. However \zs will not work correctly as it matches the pattern before the \zs first then the following pattern. This means there will only be one match. Look-behinds on the other hand match the part after \#<= then "look behind" to make sure the match is valid which makes it great for multiple replacement scenario.
It should be noted that if you can use \zs not only is it easy to type but it is also more efficient.
Note: \zs is like \K in perl speak.
More ways?!?
As #glts mentioned you can use other zero-width atoms to basically "anchor" your pattern. A list of a few common ways:
\%>a - after the 'a mark
\%V - match inside the visual area
\%>42c - match after column 42
The possible downside of using one of these methods they need you to set marks or count columns. There is nothing wrong with this but it means the substitution will maybe affected by side-effects so repeating the substitution may not work correctly.
For more help see:
:h /\#<=
:h /zero-width
:h perl-patterns
:h /\zs

How to perform following search and replace in vim?

I have the following string in the code at multiple places,
m_cells->a[ Id ]
and I want to replace it with
c(Id)
where the string Id could be anything including numbers also.
A regular expression replace like below should do:
%s/m_cells->a\[\s\(\w\+\)\s\]/c(\1)/g
If you wish to apply the replacement operation on a number of files you could use the :bufdo command.
Full explanation of #BasBossink's answer (as a separate answer because this won't fit in a comment), because regexes are awesome but non-trivial and definitely worth learning:
In Command mode (ie. type : from Normal mode), s/search_term/replacement/ will replace the first occurrence of 'search_term' with 'replacement' on the current line.
The % before the s tells vim to perform the operation on all lines in the document. Any range specification is valid here, eg. 5,10 for lines 5-10.
The g after the last / performs the operation "globally" - all occurrences of 'search_term' on the line or lines, not just the first occurrence.
The "m_cells->a" part of the search term is a literal match. Then it gets interesting.
Many characters have special meaning in a regex, and if you want to use the character literally, without the special meaning, then you have to "escape" it, by putting a \ in front.
Thus \[ and \] match the literal '[' and ']' characters.
Then we have the opposite case: literal characters that we want to treat as special regex entities.
\s matches white*s*pace (space, tab, etc.).
\w matches "*w*ord" characters (letters, digits, and underscore _).
(. matches any character (except a newline). \d matches digits. There are more...)
If a character is not followed by a quantifier, then exactly one such character matches. Thus, \s will match one space or tab, but not fewer or more.
\+ is a quantifier, and means "one or more". (\? matches 0 or 1; * (with no backslash) matches any number: zero or more. Warning: matching on zero occurrences takes a little getting used to; when you're first learning regexes, you don't always get the results you expected. It's also possible to match on an arbitrary exact number or range of occurrences, but I won't get into that here.)
\( and \) work together to form a "capturing group". This means that we don't just want to match on these characters, we also want to remember them specially so that we can do something with them later. You can have any number of capturing groups, and they can be nested too. You can refer to them later by number, starting at 1 (not 0). Just start counting (escaped) left-parantheses from the left to determine the number.
So here, we are matching a space followed by a group (which we will capture) of at least one "word" character followed by a space, within the square brackets.
Then section between the second and third / is the replacement text.
The "c" is literal.
\1 means the first captured group, which in this case will be the "Id".
In summary, we are finding text that matches the given description, capturing part of it, and replacing the entire match with the replacement text that we have constructed.
Perhaps a final suggestion: c after the final / (doesn't matter whether it comes before or after the 'g') enables *c*onfirmation: vim will highlight the characters to be replaced and will show the replacement text and ask whether you want to go ahead. Great for learning.
Yes, regexes are complicated, but super powerful and well worth learning. Once you have them internalized, they're actually fairly easy. I suggest that, as with learning vim itself, you start with the basics, get fluent in them, and then incrementally add new features to your repertoire.
Good luck and have fun.

replacing part of regex matches

I have several functions that start with get_ in my code:
get_num(...) , get_str(...)
I want to change them to get_*_struct(...).
Can I somehow match the get_* regex and then replace according to the pattern so that:
get_num(...) becomes get_num_struct(...),
get_str(...) becomes get_str_struct(...)
Can you also explain some logic behind it, because the theoretical regex aren't like the ones used in UNIX (or vi, are they different?) and I'm always struggling to figure them out.
This has to be done in the vi editor as this is main work tool.
Thanks!
To transform get_num(...) to get_num_struct(...), you need to capture the correct text in the input. And, you can't put the parentheses in the regular expression because you may need to match pointers to functions too, as in &get_distance, and uses in comments. However, and this depends partially on the fact that you are using vim and partially on how you need to keep the entire input together, I have checked that this works:
%s/get_\w\+/&_struct/g
On every line, find every expression starting with get_ and continuing with at least one letter, number, or underscore, and replace it with the entire matched string followed by _struct.
Darn it; I shouldn't answer these things on spec. Note that other regex engines might use \& instead of &. This depends on having magic set, which is default in vim.
For an alternate way to do it:
%s/get_\(\w*\)(/get_\1_struct(/g
What this does:
\w matches to any "word character"; \w* matches 0 or more word characters.
\(...\) tells vim to remember whatever matches .... So, \(w*\) means "match any number of word characters, and remember what you matched. You can then access it in the replacement with \1 (or \2 for the second, etc.)
So, the overall pattern get_\(\w*\)( looks for get_, followed by any number of word chars, followed by (.
The replacement then just does exactly what you want.
(Sorry if that was too verbose - not sure how comfortable you are with vim regex.)

What vim pattern matches a number which ends with dot?

In PDP11/40 assembling language a number ends with dot is interpreted as a decimal number.
I use the following pattern but fail to match that notation, for example, 8.:
syn match asmpdp11DecNumber /\<[0-9]\+\.\>/
When I replace \. with D the pattern can match 8D without any problem. Could anyone tell me what is wrong with my "end-with-dot" pattern? Thanks.
Your regular expression syntax is fine (well, you can use \d instead of [0-9]), but your 'iskeyword' value does not include the period ., so you cannot match the end-of-word (\>) after it.
It looks like you're writing a syntax for a custom filetype. One option is to
:setlocal filetype+=.
in a corresponding ~/.vim/ftplugin/asmpdp11.vim filetype plugin. Do this when the period character is considered a keyword character in your syntax.
Otherwise, drop the \> to make the regular expression match. If you want to ensure that there's no non-whitespace character after the period, you can assert that condition after the match, e.g. like this:
:syn match asmpdp11DecNumber /\<\d\+\.\S\#!/
Note that a word is defined by vim as:
A word consists of a sequence of letters, digits and underscores, or a
sequence of other non-blank characters, separated with white space
(spaces, tabs, ). This can be changed with the 'iskeyword'
option. An empty line is also considered to be a word.
so your pattern works fine if whitespace follows the number. You may want to skip the \>.
I think the problem is your end-of-word boundary marker. Try this:
syn match asmpdp11DecNumber /\<[0-9]\+\./
Note that I have removed the \> end-of-word boundary. I'm not sure what that was in there for, but it appears to work if you remove it. A . is not considered part of a word, which is why your version fails.

Resources