vim select paragraph -- how to redefine paragraph boundaries? - vim

In vim, vip selects "inner paragraph" :help v_ip,
however it is of limited use.
vim paragraph boundary is hard coded, a
paragraph is separated by 2 or more blank lines.
:help paragraph
Some archaic marcos like .IP, also seem to be supported as
paragraph separators, but it is all hard coded.
I want to specify my own paragraph separators to easily
select paragraphs of text in vim.
Like perl in paragraph mode using an regexp splitter.
I tried setting paragraphs to be delimited by blank lines or braces:
:set paragraph+={ cpoptions+={
but does NOT work as documented,
braces are ignored by 'vip' selection command.
The solution I want should work for all paragraphs commands
like vip, vap, dip, dap, {,}.

Note how you can map operators, so you won't have to remap vip or vap (you could map aH to your movement operation - and all of the following work magically using your selections:
daH
vaHy
d2aH
etc

I don't know if it's an option for you, but you can change paragraph delimiting in certain way by including the 'w' flag in 'formatoptions'. Check out help for 'fo-table' to read more. Basically it makes it so that lines ending in a space are like 'soft returns' and lines ending in a non-space character mark the end of paragraphs. Empty lines are not markers at all in this case. The 'w' formation flag does work with all vip, vap, etc., if I recall.
If that isn't going to do the trick for you, then I suggest remapping the vip, vap, etc. sequences to a custom function of your own. That way you can set it up to select things exactly as you want.

See :help paragraphs.
'paragraphs' 'para' string (default "IPLPPPQPP TPHPLIPpLpItpplpipbp")
global
Specifies the nroff macros that separate paragraphs. These are pairs
of two letters (see object-motions).

Related

How to convert visual selection from unicode to the corresponding character in vim command?

I'm trying to convert multiple instances of Unicode codes to their corresponding characters.
I have some text with this format:
U+00A9
And I want to generate the following next to it:
©
I have tried to select the code in visual mode and use the selection range '<,'> in command mode as input for i_CTRL_V but I don't know how to use special keys on a command.
I haven't found anything useful in the manual with :help command-mode . I could solve this problem using other tools but I want to improve my vim knowledge. Any hint is appreciated.
Edit:
As #m_mlvx has pointed out my goal is to visually select, then run some command that looks up the Unicode and does the substitution. Manually input a substitution like :s/U+00A9/U+00A9 ©/g is not what I'm interested in as it would require manually typing each of the special characters on every substitution.
Any hint is appreciated.
Here are a whole lot of them…
:help i_ctrl-v is about insert mode and ranges matter in command-line mode so :help command-mode is totally irrelevant.
When they work on text, Ex commands only work on lines, not arbitrary text. This makes ranges like '<,'> irrelevant in this case.
After carefully reading :help i_ctrl-v_digit, linked from :help i_ctrl-v, we can conclude that it is supposed to be used:
with a lowercase u,
without the +,
without worrying about the case of the value.
So both of these should be correct:
<C-v>u00a9
<C-v>u00A9
But your input is U+00A9 so, even if you somehow manage to "capture" that U+00A9, you won't be able to use it as-is: it must be sanitized first. I would go with a substitution but, depending on how you want to use that value in the end, there are probably dozens of methods:
substitute('U+00A9', '\(\a\)+\(.*\)', '\L\1\2', '')
Explanation:
\(\a\) captures an alphabetic character.
+ matches a literal +.
\(.*\) captures the rest.
\L lowercases everything that comes after it.
\1\2 reuses the two capture groups above.
From there, we can imagine a substitution-based method. Assuming "And I want to generate the following next to it" means that you want to obtain:
U+00A9©
you could do:
v<motion>
y
:call feedkeys("'>a\<C-v>" . substitute(#", '\(\a\)+\(.*\)', '\L\1\2', '') . "\<Esc>")<CR>
Explanation:
v<motion> visually selects the text covered by <motion>.
y yanks it to the "unnamed register" #".
:help feedkeys() is used as low-level way to send a complex series of characters to Vim's input queue. It allows us to build the macro programatically before executing it.
'> moves the cursor to the end of the visual selection.
a starts insert mode after the cursor.
<C-v> + the output of the substitution inserts the appropriate character.
That snippet begs for being turned into a mapping, though.
In case you would like to just convert unicodes to corresponding characters, you could use such nr2char function:
:%s/U+\(\x\{4\}\)/\=nr2char('0x'.submatch(1))/g
Brief explanation
U+\(\x\{4\}\) - search for a specific pattern (U+ and four hexadecimal characters which are stored in group 1)
\= - substitute with result of expression
'0x'.submatch(1) - append 0x to our group (U+00A9 -> 0x00A9)
In case you would like to have unicode character next to text you need to modify slightly right side (use submatch(0) to get full match and . to append)
In case someone wonders how to compose the substitution command:
'<,'>s/\<[uU]+\(\x\+\)\>/\=submatch(0)..' '..nr2char(str2nr(submatch(1), 16), 1)/g
The regex is:
word start
Letter "U" or "u"
Literal "plus"
One or more hex digits (put into "capture group")
word end
Then substituted by (:h sub-replace-expression) concatenation of:
the whole matched string
single space
character by UTF-8 hex code taken from "capture group"
This is to be executed in Visual/command mode and works over selected line range.

What are the "paragraph macros" in :help paragraphs

Running :help paragraph in vim gives:
A paragraph begins after each empty line, and also at each of a set of
paragraph macros, specified by the pairs of characters in the 'paragraphs'
option. The default is "IPLPPPQPP TPHPLIPpLpItpplpipbp", which corresponds to
the macros ".IP", ".LP", etc. (These are nroff macros, so the dot must be in
the first column).
Most of the vim help I've seen has been super helpful, and I was beginning to feel I was getting a grip on it. Suddenly though:
IPLPPPQPP TPHPLIPpLpItpplpipbp
Aaand I'm lost.
Could someone explain to me what this sequence of characters is supposed to mean?
nroff(1) is a unix text-formatting utility. It's e.g. used for formatting the man pages.
In nroff, you got macros that do stuff: e.g. .PP means following is a paragraph with the first line indented. These macros are usually(1) 2-letter codes preceded by a dot.
The docs are saying how Vim detects paragraph boundaries: A paragraph boundary is either an empty new line or a dot in the first column followed by one of the 2-letter codes specified in the paragraphs option.
Example:
Hello
LP
World
If I put the cursor on World and enter vip in normal mode. Everything will be selected.
Hello
.LP
World
.LP is contained in the paragraphs option, thus vip will in this case not mark Hello as it's above the paragraph boundary.
(1) For 1-letter macros, you append a space. That's why there is a space in the default paragraphs value, it's for .P.

How to repeat a substitution the number of times the search word occurs in a row in a substitution command in Vim?

I would like to use tabs in a code that doesn’t use them. What I did until now to implement tabs was pretty handcrafty:
:%s/^ /\t/g
:%s/^\t /\t\t/g
. . .
Question: Is there a way to replace two spaces ( ) by tab (\t) the number of times it was found at the beginning of a line?
There are (at least) three substitution techniques relevant to this case.
1. The first one takes advantage of the preceding-atom matching
syntax to naturally define a step of indentation. According to the
question statement, an indent step is a pair of adjacent space
characters preceded with nothing but spaces from the beginning
of line. Following this definition, one can construct the actual
substitution pattern, right to left:
:%s/\%(^ *\)\#<= /\t/g
Indeed, the pattern designates an occurrence of two literal space
characters, but only when they are preceded by a zero-width match
of the atom just before \#<=, which is the pattern ^ * wrapped in
grouping parentheses \%(, \). These non-capturing parentheses are
used instead of the usual capturing ones, \(, \), since there is no
need in further referring to the matched string of leading spaces. Due
to the g flag, the above :substitute command runs through the
leading spaces pair by pair, and replaces each of them by single tab
character.
2. The second technique takes a different approach. Instead of
matching separate indent levels, one can break each of the lines
starting with space characters down into two lines: one containing
the indenting spaces of the original line, and another holding the
rest of it. After that, it is straightforward to replace all of the pairs
of spaces on the first line, and concatenate the lines back together:
:g/^ /s/^ \+/&\r/|-s/ /\t/g|j!
3. The third idea is to process leading spaces by means of Vim
scripting language. A convenient way of doing that is to use the
substitute with an expression feature of the :substitute command
(see :help sub-replace-\=). When started with \=, the substitute
string of the command enables to substitute the matches of a pattern
with results of evaluation of the expression specified after \=:
:%s#^ \+#\=repeat("\t",len(submatch(0))/2)
If you specifically want to convert spaces into tabs (or vice-versa) at the start of a line, there's the useful :retab command which takes care of that. For example:
:retab! 2 will convert spaces in groups of two to tabs
:set expandtab and then :retab! 2 will convert tabstops (of width 2) back to spaces
See :h :retab (and :h 'ts') for the details.
This is not a general solution for the original problem, but I think it covers the most common use case.
There is no general way of doing this using :s regex's. You can't make the /g modifier look backwards otherwise it'd be unusable, and you can't reliably check that you're at the beginning of the line without looking backwards.
The only way of doing it generally is to loop, like so:
:for i in range(100)
: %s/^\t*\zs /\t/e
:endfor
Which is ugly, slow and highly unrecommended. Use :retab

VIM: How to change the Showbreak Highlight color without using the NonText Color-element

I noted that the 'showbreak' symbol is highlighted with the highlight "NonText" color-element. NonText is also used for the EOL Characters.
I would like to keep the highlight-color for the EOL characters but want to change it for the showbreak symbol is that possible?
Another problem is that my showbreak symbol is not displayed.
I would like to use this symbol "↳" and put it in the linenumbers column (using set cpoptions+=n). I can't find out how to display the symbol and how to put a space after the showbreak symbol (between the text and the symbol).
Can anyone help me?
I don't think you're going to get highlighting to be different than the EOL character, at least I am not aware of a way to do that.
For the second part I can help with. I was able to get "↳ " to show up in my line number column with the following settings:
let &showbreak = '↳ '
set wrap
set cpo=n
Note that there is a space after the ↳. This lines up nice until you have > 9 lines in the file. If you wanted it to line up with the last character of the number column regardless of the number of lines I'm not sure what you're going to have to do.
Edit: I've recently written a proof-of-concept function for someone on IRC that highlights the first character on a line that has been wrapped with a different highlight group. It hasn't been tested much but it seems to work. Not exactly what you're looking for but maybe it's worth a look.
:help hl-NonText makes it pretty clear that you cannot have different colors for the 'showbreak' string and other non-text strings, of which eol is a member (see :help 'listchars'):
NonText
'~' and '#' at the end of the window, characters from 'showbreak' and
other characters that do not really exist in the text (e.g., ">"
displayed when a double-wide character doesn't fit at the end of the
line).
If you're willing to accept this limitation (#elliottcable) hi! link NonText LineNr will match the 'showbreak' string to the line number colors.
If you really wanted to get clever, as a compromise you could create a mapping or command to toggle between ':set list' and ':set nolist' that would also adjust the NonText highlight setting simultaneously.
If you use :set relativenumber (added in vim 7.3), :set showbreak=↳\ \ \ will reliably keep your 'showbreak' neatly lined up since the number width will not change as you navigate through the file. (This in addition to the :set cpo+=n and :set wrap #Randy Morris mentioned in his answer.)
You'll definitely need UTF-8 for the ↳ character, since it does not appear in other encodings. I'd strongly recommend you carefully document your encoding problems, with details about how to reproduce them along with your OS, its version, and the :version output of vim, and post them as separate questions. UTF-8 should be helping you wrangle multiple languages rather than being an impediment.

Why does Vim add spaces when joining lines?

I want to unwrap text in Vim. When I join lines I get an additional space between sentences.
Why is that?
I have a feeling this is what you really want: gJ
From :h gJ:
gJ Join [count] lines, with a minimum of two lines.
Don't insert or remove any spaces. {not in Vi}
This is handy if you've copied something from a terminal and it's pasted it as a big rectangular block into vim, rather than a single line.
I usually use it in visual mode. Hilight stuff, gJ.
Formatting destroys information. There are many different blocks of text which will result in the same one once formatted. Therefore, there's no way to reverse the operation without prior knowledge (i.e. undo).
Unformatted:
Unformatted text could start out as either all one line, or several, yet look the same when formatted.
Unformatted text could start out as either all one line, or several,
yet look the same when formatted.
Formatted:
Unformatted text could start out as
either all one line, or several,
yet look the same when formatted.
If you want your paragraph all on one line, or if you're okay with a little manual fiddling, you can use J to join lines back together. You can use visual mode to apply the J command to several lines at once, perhaps combined with ap or ip to select a paragraph, e.g. vipJ. Again, you'll still lose some information - multiple spaces at line breaks before formatting will end up collapsed to single spaces. (You can actually join without modifying spaces by using gJ instead of J, but you'll already have lost them when you formatted)
If you're bothered by the extra spaces after sentences (lines ending in !, ?, or .), turn off joinspaces: set nojoinspaces
I guess the simple solution to join the lines without spaces between is:
:j!
With ! the join does not insert or delete any spaces. For the whole file, use :%j!.
See: :help :join.
This is the answer that ended up working for me, none of the above worked in my use case.
Essentially, use gJ like multiple others have said, but highlight all of file, so in command mode typing ggVGgJ.
I still got the extra one space after join, if the line we work on does not end with space. Usually this is the desired behaviour. Example
first line without space
second line
after joining with J, become
first line without space second line
Although in some case, we do not wish to apply it,
myInstance->methodA()
->methodB()
And we would want the join to become myInstance->methodA()->methodB() without any space in between!
Here the helpers mapping i use
nmap <leader>jj Jx
<leader> key can be checked with :let mapleader, default to key \ i believe.
so in normal mode, just \jj to perform join without any extra space!

Resources