How to repeat a substitution the number of times the search word occurs in a row in a substitution command in Vim?

I would like to use tabs in code that doesn't use them. What I have done so far to introduce tabs has been pretty manual:
:%s/^  /\t/g
:%s/^\t  /\t\t/g
. . .
Question: Is there a way to replace two spaces (  ) by a tab (\t) as many times as they occur at the beginning of a line?

There are (at least) three substitution techniques relevant to this case.
1. The first one takes advantage of the preceding-atom matching
syntax to naturally define a step of indentation. According to the
question statement, an indent step is a pair of adjacent space
characters preceded with nothing but spaces from the beginning
of line. Following this definition, one can construct the actual
substitution pattern, right to left:
:%s/\%(^ *\)\@<=  /\t/g
Indeed, the pattern designates an occurrence of two literal space
characters, but only when they are preceded by a zero-width match
of the atom just before \@<=, which is the pattern ^ * wrapped in
grouping parentheses \%(, \). These non-capturing parentheses are
used instead of the usual capturing ones, \(, \), since there is no
need to refer to the matched string of leading spaces later. Due
to the g flag, the above :substitute command runs through the
leading spaces pair by pair, and replaces each pair with a single tab
character.
2. The second technique takes a different approach. Instead of
matching separate indent levels, one can break each of the lines
starting with space characters down into two lines: one containing
the indenting spaces of the original line, and another holding the
rest of it. After that, it is straightforward to replace all of the pairs
of spaces on the first line, and concatenate the lines back together:
:g/^  /s/^ \+/&\r/|-s/  /\t/g|j!
3. The third idea is to process leading spaces by means of the Vim
scripting language. A convenient way of doing that is to use the
substitute-with-an-expression feature of the :substitute command
(see :help sub-replace-\=). When it starts with \=, the substitute
string of the command replaces each match of the pattern with the
result of evaluating the expression specified after \=:
:%s#^ \+#\=repeat("\t",len(submatch(0))/2)
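As a rough way to check the expression from the third technique outside a buffer, the same pattern and replacement can be passed to the substitute() function. This is only a sketch: the function name LeadingSpacesToTabs is made up for illustration, and it assumes an indent step of exactly two spaces, as in the question.

```vim
" Hypothetical helper (name not from the answers above): apply the
" third technique to a single string instead of a buffer line.
function! LeadingSpacesToTabs(line) abort
  " One tab per pair of leading spaces. Integer division means an odd
  " trailing space is dropped, e.g. three spaces become one tab.
  return substitute(a:line, '^ \+', '\=repeat("\t", len(submatch(0)) / 2)', '')
endfunction

echo LeadingSpacesToTabs('    foo  bar') ==# "\t\tfoo  bar"
```

The echo compares the result against two tabs followed by the untouched remainder. Spaces inside the line are not affected because the pattern is anchored with ^.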

If you specifically want to convert spaces into tabs (or vice-versa) at the start of a line, there's the useful :retab command which takes care of that. For example:
:retab! 2 will convert spaces in groups of two to tabs
:set expandtab and then :retab! 2 will convert tabstops (of width 2) back to spaces
See :h :retab (and :h 'ts') for the details.
This is not a general solution for the original problem, but I think it covers the most common use case.

There is no general way of doing this using :s regex's. You can't make the /g modifier look backwards otherwise it'd be unusable, and you can't reliably check that you're at the beginning of the line without looking backwards.
The only way of doing it generally is to loop, like so:
:for i in range(100)
: %s/^\t*\zs  /\t/e
:endfor
Which is ugly, slow and highly unrecommended. Use :retab instead.

Related

How to convert visual selection from unicode to the corresponding character in vim command?

I'm trying to convert multiple instances of Unicode codes to their corresponding characters.
I have some text with this format:
U+00A9
And I want to generate the following next to it:
©
I have tried to select the code in visual mode and use the selection range '<,'> in command mode as input for i_CTRL_V but I don't know how to use special keys on a command.
I haven't found anything useful in the manual with :help command-mode . I could solve this problem using other tools but I want to improve my vim knowledge. Any hint is appreciated.
Edit:
As @m_mlvx has pointed out, my goal is to visually select the code, then run some command that looks up the Unicode code point and does the substitution. Manually typing a substitution like :s/U+00A9/U+00A9 ©/g is not what I'm interested in, as it would require typing each special character by hand for every substitution.
Any hint is appreciated.
Here are a whole lot of them…
:help i_ctrl-v is about insert mode and ranges matter in command-line mode so :help command-mode is totally irrelevant.
When they work on text, Ex commands only work on lines, not arbitrary text. This makes ranges like '<,'> irrelevant in this case.
After carefully reading :help i_ctrl-v_digit, linked from :help i_ctrl-v, we can conclude that it is supposed to be used:
with a lowercase u,
without the +,
without worrying about the case of the value.
So both of these should be correct:
<C-v>u00a9
<C-v>u00A9
But your input is U+00A9 so, even if you somehow manage to "capture" that U+00A9, you won't be able to use it as-is: it must be sanitized first. I would go with a substitution but, depending on how you want to use that value in the end, there are probably dozens of methods:
substitute('U+00A9', '\(\a\)+\(.*\)', '\L\1\2', '')
Explanation:
\(\a\) captures an alphabetic character.
+ matches a literal +.
\(.*\) captures the rest.
\L lowercases everything that comes after it.
\1\2 reuses the two capture groups above.
From there, we can imagine a substitution-based method. Assuming "And I want to generate the following next to it" means that you want to obtain:
U+00A9©
you could do:
v<motion>
y
:call feedkeys("'>a\<C-v>" . substitute(@", '\(\a\)+\(.*\)', '\L\1\2', '') . "\<Esc>")<CR>
Explanation:
v<motion> visually selects the text covered by <motion>.
y yanks it to the "unnamed register" @".
:help feedkeys() is used as a low-level way to send a complex series of characters to Vim's input queue. It allows us to build the macro programmatically before executing it.
'> moves the cursor to the end of the visual selection.
a starts insert mode after the cursor.
<C-v> + the output of the substitution inserts the appropriate character.
That snippet begs for being turned into a mapping, though.
In case you would just like to convert Unicode codes to the corresponding characters, you could use the nr2char() function:
:%s/U+\(\x\{4\}\)/\=nr2char('0x'.submatch(1))/g
Brief explanation
U+\(\x\{4\}\) - search for a specific pattern (U+ and four hexadecimal characters which are stored in group 1)
\= - substitute with result of expression
'0x'.submatch(1) - prepend 0x to our group (U+00A9 -> 0x00A9)
In case you would like to have the Unicode character next to the text, modify the right-hand side slightly (use submatch(0) to get the full match and . to concatenate).
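The "next to the text" variant described above could look roughly like the following. It is a sketch under a few assumptions: the helper name AppendUnicodeChar is hypothetical, 'encoding' is assumed to be utf-8, and str2nr() with base 16 is swapped in for the '0x' string prefix so the hex conversion is explicit.

```vim
" Hypothetical helper: append the character after each U+XXXX code.
" Assumes 'encoding' is utf-8 and exactly four hex digits per code.
function! AppendUnicodeChar(text) abort
  " submatch(0) is the whole match (e.g. U+00A9), submatch(1) the hex digits.
  return substitute(a:text, 'U+\(\x\{4\}\)', '\=submatch(0) . nr2char(str2nr(submatch(1), 16))', 'g')
endfunction

echo AppendUnicodeChar('U+00A9')
```

The same right-hand side, starting at \=, can be used directly in a :%s command on the buffer.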
In case someone wonders how to compose the substitution command:
'<,'>s/\<[uU]+\(\x\+\)\>/\=submatch(0)..' '..nr2char(str2nr(submatch(1), 16), 1)/g
The regex is:
word start
Letter "U" or "u"
Literal "plus"
One or more hex digits (put into "capture group")
word end
Then substituted by (:h sub-replace-expression) concatenation of:
the whole matched string
single space
the character whose code point is the hex number taken from the "capture group"
This is to be executed from the command line after a Visual selection and works over the selected line range.

Select first char up to first non camelCase or non upper case char or up to first snake case _ in vim

I used this map:
map ,w v/\([^ a-z0-9]\|[^ A-Z0-9]\)*<cr>h
the idea is to select
in the words
mysuperTest
MYSUPER_TEST
mysuper_test
to always select the part that says mysuper
but it doesn't work, and I'm not sure why.
I would use something like the below:
nnoremap ,w v/\C\%#.\([a-z]\+\<bar>[A-Z]\+\)\zs<cr>h
One point to notice is that in a mapping you need to use <bar> (or escape | with an extra backslash) since otherwise | is recognized as a command separator (see :help map-bar.)
Another one to notice is that you want the match to start at the first character outside the word (so you'll land at the end of the word with the h). The visual selection will expand to the start of the match in a search. I suggest using \zs to set the start of the match explicitly (see :help /\zs.)
Finally, beware of 'ignorecase' or 'smartcase' settings. Use \C to explicitly request a case-sensitive match (see :help /\C.)
I also like the idea of using a stronger anchor for the start of the match, so I'm using \%# to match the current cursor position (see :help /\%#), so you're always sure to match the current word only and not end up wandering through the buffer.
Putting it all together:
\C Case-sensitive search
\%# From cursor position
. Skip first character
\( Either one of:
[a-z]\+ One or more lowercase letters
\| (\<bar>) Or:
[A-Z]\+ One or more uppercase letters
\) End group
\zs Set match position here
I'm skipping the first character under the cursor, since in a CamelCase word, the first character won't match capitalization of the remainder of the word.
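To see which part of each sample word the pattern covers, the breakdown above can be tried on plain strings with matchstr(). This is only an illustration: the helper name FirstWordPart is made up, ^ stands in for \%# because a string has no cursor, and \zs is left out so that the matched text itself is returned.

```vim
" Hypothetical helper: return the part of a word the mapping is meant
" to select. ^ replaces \%# since there is no cursor in a string.
function! FirstWordPart(word) abort
  " \C forces case sensitivity; . skips the first character; then
  " either a run of lowercase or a run of uppercase letters follows.
  return matchstr(a:word, '\C^.\%([a-z]\+\|[A-Z]\+\)')
endfunction

echo FirstWordPart('mysuperTest')
echo FirstWordPart('MYSUPER_TEST')
echo FirstWordPart('mysuper_test')
```

Each call should return the mysuper (or MYSUPER) prefix from the question, which is the text the visual selection is expected to cover.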
I kept your original idea of finding the first character after the word then using h to go back one to the left. But that might be a problem if, for example, the word is at the end of the line.
You can actually match the last character of the word instead with something like [a-z]\+\zs[a-z], which will set the start of the match on the last lowercase character. You can do this for both sides of the group (you can have more than one \zs in your pattern, last wins.) If you structure your match that way, you won't need the final h to go back.
I didn't handle numbers, I'll leave those as an exercise to the reader.
Finally, consider there are quite a few corner cases that can make such a mapping quite tricky to get right. Rather than coming up with your own, why not look at plug-ins which add support for handling CamelCase words that have been battle-tested and will cover use cases a lot more advanced than the simple expression you're using here?
There's the excellent vim-scripts/camelcasemotion by Ingo Karkat which sets up a ,w mapping to move to the start of the next CamelCase word, but also i,w to select the current one. You can use powerful combinations such as v3i,w to visually select the current and next two CamelCase words.
You might also check Tim Pope's tpope/vim-abolish which, among other features, defines a set of cr mappings to do coercion from camelCase to MixedCase, snake_case, UPPER_CASE, etc. (Not directly about selecting them, but still related and you might find it useful.)

Searching for an exact match with a singular digit

I'm trying to search for only a singular digit in vim by itself. For example, if there are two sets of digits 1 and 123 and I want to search for 1, I would only want the singular 1 digit to be found.
I have tried using regular expressions like \<1> and \%(a)#
You almost had the right solution. You want:
\<1\>
This is because each angled bracket needs to be escaped. Alternatively, you could use:
\v<1>
The \v flag tells Vim to treat more characters as special without needing to be escaped (for example, (){}+<> all become special rather than literal text). Read :h /\v for more on this.
A great reference for learning regex in vim is vimregex.com. The \<\> characters are explained in 4.1 "Anchors".
If you want to match text like 1.23 this is possible too. Two different approaches:
Modify the iskeyword option so that it includes the period (.). This will also affect how w moves.
Use \v<1(\d|\.)@!, which basically means "a 1 at the beginning of a word, that isn't followed by some other digit or a period."
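A quick way to compare these patterns without searching a buffer is the match() function, which returns the byte index of the first match or -1. The sample strings below are made up for illustration, and the default 'iskeyword' is assumed.

```vim
" In '123 1 12' the standalone 1 sits at byte index 4.
echo match('123 1 12', '\<1\>')
echo match('123 1 12', '\v<1>')
" In '1.23 1' the leading 1 is followed by a period, so the negative
" look-ahead rejects it and only the final 1 (byte index 5) matches.
echo match('1.23 1', '\v<1(\d|\.)@!')
```

The first two calls should agree, since \v<1> is just the very-magic spelling of \<1\>.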

Find and replace only part of a single line in Vim

Most substitution commands in vim perform an action on a full line or a set of lines, but I would like to only do this on part of a line (either from the cursor to end of the line or between set marks).
example
this_is_a_sentence_that_has_underscores = this_is_a_sentence_that_should_not_have_underscores
into
this_is_a_sentence_that_has_underscores = this is a sentence that should not have underscores
This task is very easy to do for the whole line :s/_/ /g, but seems to be much more difficult to only perform the replacement for anything after the =.
Can :substitution perform an action on half of a line?
Two solutions I can think of.
Option one, use the before/after column match atoms \%>123c and \%<456c.
In your example, the following command substitutes underscores only in the second word, between columns 42 and 94:
:s/\%>42c_\%<94c/ /g
Option two, use the Visual area match atom \%V.
In your example, Visual-select the second long word, leave Visual mode, then execute the following substitution:
:s/\%V_/ /g
These regular expression atoms are documented at :h /\%c and :h /\%V respectively.
Look-around
There is a big clue in your post already:
only perform the replacement for anything after the =.
This often means using a positive look-behind, \@<=.
:%s/\(=.*\)\@<=_/ /g
This means match all _ that are after the following pattern =.*. Since all look-arounds (look-aheads and look-behinds) are zero width they do not take up space in the match and the replacement is simple.
Note: This is equivalent to (?<=...) in perl speak. See :h perl-patterns.
What about \zs?
\zs will set the start of a match at a certain point. On the face of it, this sounds like exactly what is needed. However, \zs will not work correctly here, as it matches the pattern before the \zs first, then the following pattern. This means there will only be one match per line. Look-behinds, on the other hand, match the part after \@<= and then "look behind" to make sure the match is valid, which makes them great for multiple-replacement scenarios.
It should be noted that if you can use \zs not only is it easy to type but it is also more efficient.
Note: \zs is like \K in perl speak.
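The difference between the two approaches can be seen on a short string with substitute(). The sample line below is a shortened, made-up version of the one in the question.

```vim
let line = 'this_is = should_not_have'
" Look-behind: every underscore after the = is a separate match.
echo substitute(line, '\%(=.*\)\@<=_', ' ', 'g')
" \zs: the greedy .* runs to the last underscore, so even with the
" g flag only that one underscore is replaced.
echo substitute(line, '=.*\zs_', ' ', 'g')
```

The first call should leave the left-hand side alone and clear both underscores on the right, while the second changes only the underscore before "have".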
More ways?!?
As @glts mentioned, you can use other zero-width atoms to basically "anchor" your pattern. A list of a few common ways:
\%>'a - after the 'a mark
\%V - match inside the visual area
\%>42c - match after column 42
The possible downside of these methods is that they need you to set marks or count columns. There is nothing wrong with this, but it means the substitution depends on that state, so repeating the substitution may not work correctly.
For more help see:
:h /\@<=
:h /zero-width
:h perl-patterns
:h /\zs

How to efficiently switch arguments in vim

I sometimes run into a scenario when editing a file in Vim for which I still haven't found a quick, Vim-like solution. When editing a function call, I often put my arguments in the wrong order.
anyFunction(arg2, arg1)
When this happens, I have to find arg2 / delete it / append it before the ')' / deal with the ', ' / etc.
Isn't there a better way to do this task quickly? I am open to any idea (macro / shortcut / plugin), even if I'd rather have a 'Vim only' way of doing this.
You need two things:
A text object to quickly select an argument (as they aren't always that simple like in your example). argtextobj plugin (my improved fork here) does this.
Though you can use delete + visual mode + paste + go back + paste, a plugin to swap text makes this much easier. My SwapText plugin or the already mentioned exchange plugin both do that job.
Put this mapping in your _vimrc:
" gw : Swap word with next word
nmap <silent> gw :s/\(\%#\w\+\)\(\_W\+\)\(\w\+\)/\3\2\1/<cr><c-o><c-l>
then, in normal mode, with the cursor on the first character of arg1 (the pattern starts matching at the cursor position), type gw to swap the parameters
anyFunction(arg1, arg2)
Explanation:-
arg1, the separator (here a comma and a space) and arg2 are captured in groups 1, 2 and 3
the substitute reverses them to 3 2 1
Control-O returns to the last position
Control-L redraws the screen
Note that the separator is any run of non-word characters, e.g. whitespace
I actually made a plugin called argumentative.vim to deal with this exact situation. (Sorry for the plug.)
Argumentative.vim provides the following mappings:
[, and ], motions which will go to the previous or next argument
<, and >, to shift an argument left or right
i, and a, argument text objects. e.g. da,, ci, or yi,
So with this plugin you move to the argument in question and then do a <, or >, as many times as needed. It can also take a count e.g. 2>,.
If you have Tim Pope's excellent repeat.vim plugin installed <, and >, become repeatable with the . command.
I would recommend a plugin: vim-exchange for that:
https://github.com/tommcdo/vim-exchange
This is a perfect use for a regular expression search and replace.
You want to find "anyFunction(", then swap anything up to the ',' with anything from the ',' to the ')'. This is fairly straightforward, using [^,]* for "anything up to the ','" and [^)]* for "anything up to the ')'". Use \(...\) to capture each thing, and \1, \2 to refer to those things in the replacement:
:s#anyFunction(\s*\([^,]*\),\s*\([^)]*\)#anyFunction(\2, \1#g
Note how I use \s* to allow any whitespace between the "anyFunction(" and the first thing, and also between the ',' and the second thing.
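The single-line command above can be tried on a plain string with substitute(), which makes its limits easier to see. The helper name SwapTwoArgs is hypothetical and exists only for this sketch.

```vim
" Hypothetical helper: swap the two arguments of anyFunction( in a string.
function! SwapTwoArgs(text) abort
  " \1 is everything up to the first comma, \2 everything up to the ')'.
  return substitute(a:text, 'anyFunction(\s*\([^,]*\),\s*\([^)]*\)', 'anyFunction(\2, \1', 'g')
endfunction

echo SwapTwoArgs('anyFunction(arg2, arg1)')
```

Because \2 runs up to the closing parenthesis, a call with three arguments such as anyFunction(a, b, c) would come out as anyFunction(b, c, a), so the pattern is best suited to exactly two simple arguments.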
If you want this to be able to span multiple lines, you can use \_s instead of \s, and capture the whitespace if you want to maintain the multi-line format:
:s#anyFunction(\(\_s*\)\([^,]*\),\(\_s*\)\([^)]*\)#anyFunction(\1\4,\3\2#g
There is also a multi-line variant of [...] collections, for example \_[^,] meaning "anything (even a new line) except for a ',' " which you could use in the pattern if your use case demands it.
For details, consult the help topics for: /\s, /\_s, /\1, /\(, and /[.
If you want a more general-purpose mapping to use at every location, you can use the cursor position in your regular expression, rather than keying off the function name. The cursor position in a regular expression is matched using \%# as demonstrated here: http://vim.wikia.com/wiki/Exchanging_adjacent_words
Similar to what Peter Rincker suggested (Argumentative), sideways also does what you describe.
