When using { or } to navigate between paragraphs, Vim appears to treat lines with white space characters as if they were part of the paragraph and skips past them. This behaviour has been discussed in a number of threads and the explanation given (based on :h paragraph) is that "a paragraph begins after each empty line", which is fine.
However, this does not appear to be consistent with the way Vim treats the ap and ip commands, which actually do treat lines with whitespace characters as paragraph breaks. For example, given the following text where the first two paragraphs are separated by a non-empty line (containing whitespace) and the second and third paragraphs are separated by an empty line (and assuming the cursor starts at the top of the buffer) the following occurs:
1 abc # next line contains spaces
2
3 def # next line is blank
4
5 jkl
}: moves the cursor to line 4 (i.e., treats lines 1-4 as a paragraph)
dap: deletes lines 1 and 2 (i.e., treats only lines 1-2 as a paragraph)
These two behaviours appear to be inconsistent with one another. Why do these two commands that operate on a paragraph object behave differently?
As mentioned in :help ap and :help ip:
Exception: a blank line (only containing white space) is also a paragraph boundary.
So the behaviour of ap was deliberately made different from that of }, and the difference is clearly documented. The exact reasoning behind that difference is explained nowhere, though, and may be lost in time. You might want to ask on Vim's official mailing list.
Anyway, we can extrapolate a little…
Vim's } is consistent with vi's }. This is expected since Vim's whole purpose is, after all, to be a convincing stand-in for vi.
ap (and the whole concept of text objects) is a Vim thing. It wasn't in vi so there is no existing behaviour to replicate and the person who added that feature decided to make it treat "paragraphs" in a slightly more intuitive fashion than }.
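The difference described in the question can be reproduced in a throwaway buffer. The following is a minimal sketch, assuming a stock Vim with default options; the variable names brace_stop and after_dap are made up for illustration.

```vim
" Scratch buffer: line 2 holds only spaces, line 4 is truly empty.
new
setlocal buftype=nofile
call setline(1, ['abc', '   ', 'def', '', 'jkl'])

" } skips the whitespace-only line and stops at the empty one.
normal! gg}
let brace_stop = line('.')

" dap also treats the whitespace-only line as a paragraph boundary.
normal! ggdap
let after_dap = getline(1, '$')

echo 'after }: line' brace_stop
echo 'after dap:' after_dap
```

With the behaviour reported in the question, brace_stop should be 4 and after_dap should be ['def', '', 'jkl'].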
I'm trying to complete an exercise from https://learnvimscriptthehardway.stevelosh.com/chapters/16.html
The sample text to be worked on is:
Topic One
=========
This is some text about topic one.
It has multiple paragraphs.
Topic Two
=========
This is some text about topic two. It has only one paragraph.
The mapping to delete the heading of Topic One or Topic Two (depending on which body the cursor is placed in) and enter insert mode is:
:onoremap ih :<c-u>execute "normal! ?^==\\+$\r:nohlsearch\rkvg_"<cr>
Enter 'cih' in the body of either text below the headings and the respective heading will be erased, with the cursor placed there in insert mode, ready to go. Great mapping, but I'm trying to understand what's happening with \+$.
When I omit \+$ and use this mapping:
:onoremap ih :<c-u>execute "normal! ?^==\r:nohlsearch\rkvg_"<cr>
it works fine, seemingly identically to the other mapping. So what is the use of the \+$?
Here is how Mr. Losh explains it:
The first piece,
?^==\+$
performs a search backwards for any line that consists of two
or more equal signs and nothing else. This will leave our cursor on
the first character of the line of equal signs.
But what does \+$ accomplish? I've tried to enter it manually on the command line but I just get an error sound. It works as intended as part of the full mapping, though. But like I said, when I remove it and run the full command without it, it works fine.
There's something I'm missing about the necessity of that '+$'... Maybe it has to do with the "two or more equal signs and nothing else"?
The author's command:
?^==\+$
searches backward for a line consisting exclusively of 2 or more equal signs:
^ anchors the pattern to the beginning of the line,
= matches a literal equal sign,
^= thus matches a literal equal sign at the beginning of the line,
= matches a second equal sign,
\+ matches one or more of the preceding atom, as many as possible,
=\+ thus matches one or more equal signs, as many as possible,
$ anchors the pattern to the end of the line,
so the pattern above is going to match any of the following lines:
==
===
=============
etc.
but not lines like:
==foo
==      <- six spaces
etc.
which is exactly the goal of that exercise.
Your command, on the other hand:
?^==
searches backward for a sequence of two equal signs at the beginning of a line:
^ anchors the pattern to the beginning of the line,
== matches two literal equal signs,
so your pattern is going to match the same lines as above:
==
===
=============
etc.
but also lines like:
==foo
==      <- six spaces
etc.
because it is not strict enough.
Your pattern would definitely be good enough if used manually to jump to one of those underlines because it gets the job done with minimal typing. But the goal, here, is to make a mapping. Those things have to be generalised to be reliable, which pretty much requires a level of explicitness and precision your pattern lacks.
In short, Steve's pattern checks all the boxes while yours doesn't: it is explicit and precise while yours is implicit and imprecise.
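The two patterns can also be compared outside of a search, using the =~# operator. This is only a sketch: the variable names strict, loose and samples are made up, and the sample lines mirror the ones listed above.

```vim
" The author's pattern and the shortened one, as plain strings.
let strict = '^==\+$'
let loose = '^=='

" Candidate lines: two real underlines and two false positives.
let samples = ['==', '=============', '==foo', '==      ']

for text in samples
  " =~# is a case-sensitive match that evaluates to 1 or 0.
  echo printf('%-16s strict=%d loose=%d', '[' . text . ']', text =~# strict, text =~# loose)
endfor
```

Both patterns accept the first two samples, but only the loose one accepts '==foo' and the line with trailing spaces.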
The \+$ is part of the regular expression matching a line of only equals signs. Without it, your mapping would recognize, for example,
This is not a heading
==This is not an underline
as a heading.
The \+ means "one or more of the preceding character (=)", so together with the first = the pattern requires at least two equals signs. The $ means end of line, so there cannot be anything after the equals signs.
The Vim Handbook gives them both the exact same description. From the Vim Handbook:
ap: "a paragraph", select [count] paragraphs. Exception: a blank line (only containing white space) is also a paragraph boundary. When used in Visual mode it is made linewise.
ip: "inner paragraph", select [count] paragraphs. Exception: a blank line (only containing white space) is also a paragraph boundary. When used in Visual mode it is made linewise.
Because of this, it is not entirely clear to me what the differences between these are. For example, say you had the commands gqap and gqip. How do they differ in behaviour?
Again, just reading :help ap because someone gave you the link, without proper background, will get you nowhere and only bring more confusion.
Text objects can be of two types: those that include their boundaries and those that exclude their boundaries.
By convention, text objects that include their boundaries start with a, like ap, and those that exclude their boundaries start with i, like ip.
What constitutes a "paragraph" is explained under :help paragraph, which is linked from both :help ip and :help ap. In concrete terms, the boundaries of a paragraph are:
a non-empty line preceded by an empty one, think of it as zero-width match,
the next empty line.
So you have ip, which excludes the empty line, and ap, which includes it:
[...] end of paragraph above.
Beginning of paragraph in the middle  |    |
with some boring filler text so that  | ip | ap
it covers a few lines.                |    |
                                           |
Beginning of paragraph below [...]
When doing gq over a paragraph, using ip or ap doesn't really matter because the extra empty line is unlikely to change the result of the formatting. ip and ap are a bit of an exception here, because the difference between the two types of text objects usually matters quite a lot:
              it
    ----------------------
<h2>One day I will be a H1</h2>
-------------------------------
              at
See :help text-objects for a reference on the subject and :help 04.8 for a more gentle introduction.
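To see the ip/ap difference concretely, one can yank with each text object and count the lines that end up in the unnamed register. A minimal sketch, assuming default options; the names inner and around are made up for illustration.

```vim
" Scratch buffer with a two-line paragraph, an empty line, and another paragraph.
new
setlocal buftype=nofile
call setline(1, ['one', 'two', '', 'three'])

" Yank the inner paragraph and read the register back as a list of lines.
normal! ggyip
let inner = getreg('"', 1, 1)

" Yank "a paragraph", which also takes the empty line that follows.
normal! ggyap
let around = getreg('"', 1, 1)

echo 'ip lines:' len(inner) 'ap lines:' len(around)
```

Here inner holds the two text lines, while around holds those two lines plus the trailing empty one.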
As alluded to in my other answer, proper learning lets you develop an intuition that is hard to build from random, disconnected, links and tweets and so on.
Running :help paragraph in vim gives:
A paragraph begins after each empty line, and also at each of a set of
paragraph macros, specified by the pairs of characters in the 'paragraphs'
option. The default is "IPLPPPQPP TPHPLIPpLpItpplpipbp", which corresponds to
the macros ".IP", ".LP", etc. (These are nroff macros, so the dot must be in
the first column).
Most of the vim help I've seen has been super helpful, and I was beginning to feel I was getting a grip on it. Suddenly though:
IPLPPPQPP TPHPLIPpLpItpplpipbp
Aaand I'm lost.
Could someone explain to me what this sequence of characters is supposed to mean?
nroff(1) is a unix text-formatting utility. It's e.g. used for formatting the man pages.
In nroff, you have macros that do stuff: e.g. .PP means that what follows is a paragraph with the first line indented. These macros are usually(1) 2-letter codes preceded by a dot.
The docs are saying how Vim detects paragraph boundaries: A paragraph boundary is either an empty new line or a dot in the first column followed by one of the 2-letter codes specified in the paragraphs option.
Example:
Hello
LP
World
If I put the cursor on World and enter vip in normal mode, everything will be selected.
Hello
.LP
World
.LP is contained in the paragraphs option, thus vip will in this case not mark Hello as it's above the paragraph boundary.
(1) For 1-letter macros, you append a space. That's why there is a space in the default paragraphs value, it's for .P.
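To make the option value easier to read, it can be cut into its two-character pairs. This is a rough sketch rather than anything Vim does internally; the variable names value and macros are made up.

```vim
" Read the current value of the 'paragraphs' option.
let value = &paragraphs
let macros = []

" Walk the string two characters at a time.
for i in range(0, strlen(value) - 1, 2)
  let pair = strpart(value, i, 2)
  " A one-letter macro is padded with a space, so strip it again.
  call add(macros, '.' . substitute(pair, ' $', '', ''))
endfor

echo macros
```

With the default value quoted above, the list starts with .IP, .LP, .PP, .QP and .P.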
I usually have the tw=80 option set when I edit files, especially LaTeX sources. However, say, I want to compose an email in Vim with the tw=80 option, and then copy and paste it to a web browser. Before I copy and paste, I want to unwrap the text so that there isn't a line break every 80 characters or so. I have tried tw=0 and then gq, but that just wraps the text to the default width of 80 characters. My question is: How do I unwrap text, so that each paragraph of my email appears as a single line? Is there an easy command for that?
Go to the beginning of your paragraph and enter:
vipJ
(The J is a capital letter, in case that's not clear.)
For whole document combine it with norm:
:%norm vipJ
This command will only unwrap paragraphs. I guess this is the behaviour you want.
Since joining paragraph lines using Normal mode commands is already
covered by another answer, let us consider solving the same issue by
means of line-oriented Ex commands.
Suppose that the cursor is located at the first line of a paragraph.
Then, to unwrap it, one can simply join the following lines up until
the last line of that paragraph. A convenient way of doing that is to
run the :join command designed exactly for the purpose. To define
the line range for the command to operate on, besides the obvious
starting line which is the current one, it is necessary to specify
the ending line. It can be found using the pattern matching the very
end of a paragraph, that is, two newline characters in a row or,
equivalently, a newline character followed by an empty line. Thus,
translating the said definition to Ex-command syntax, we obtain:
:,-/\n$/j
For all paragraphs to be unwrapped, run this command on the first line
of every paragraph. A useful tool to jump through them, repeating
a given sequence of actions, is the :global command (or :g for
short). As :global scans lines from top to bottom, the first line
of the next paragraph is just the first non-empty line among those
remaining unprocessed. This observation gives us the command
:g/./,-/\n$/j
which is more efficient than its straightforward Normal-mode
counterparts.
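The command can be tried on a throwaway buffer first. A small sketch, assuming default options; the sample text and the variable name unwrapped are made up, and the command is spelled out in its long form.

```vim
" Scratch buffer with two wrapped paragraphs separated by an empty line.
new
setlocal buftype=nofile
call setline(1, ['first line', 'second line', '', 'another', 'paragraph'])

" For every non-empty line still present, join down to the paragraph's last line.
global/./,-/\n$/join

let unwrapped = getline(1, '$')
echo unwrapped
```

Each paragraph should end up on a single line, with the empty line between them left in place: ['first line second line', '', 'another paragraph'].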
The problem with :%norm vipJ is that if you have consecutive lines shorter than 80 characters it will also join them, even if they're separated by a blank line. For instance the following example:
# Title 1
## Title 2
Will become:
# Title 1 ## Title 2
With ib's answer, the problem is with lists:
- item1
- item2
Becomes:
- item1 - item2
Thanks to this forum post I discovered another method of achieving this which I wrapped in a function that works much better for me since it doesn't do any of that:
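function! SoftWrap()
  " Save the current formatting settings so they can be restored afterwards.
  let l:old_fo = &formatoptions
  let l:old_tw = &textwidth
  set fo=
  set tw=999999 " works for paragraphs up to 12k lines
  " Reformat the whole buffer; normal! ignores user mappings of gg, gq and G.
  normal! gggqG
  let &fo = l:old_fo
  let &tw = l:old_tw
endfunction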
Edit: Updated the method because I realized it wasn't working on a Linux setup. Remove the lines containing fo if this newer version doesn't work with MacVim (I have no way to test).
Is it possible to have a carriage return without bringing about a line break?
For instance, I want to write the following sentences in 2 lines and not 4 (and I do not want to type spaces, of course):
On a ship at sea: a tempestuous noise of thunder and lightning heard.
Enter a Master and a Boatswain
Master : Boatswain!
Boatswain : Here, master: what cheer?
Thanks in advance for your help
Thierry
In a text file, the expected line-end character or character sequence is platform dependent. On Windows, the sequence "carriage return (CR, \r) + line feed (LF, \n)" is used, while Unix systems use a line feed only (LF, \n). Macintoshes traditionally used \r only, but these days on OS X I see them dealing with just about any version. Text editors on any system are often able to support all three versions, and to convert between them.
For VIM, see this article for tips how to convert/set line end character sequences.
However, I'm not exactly sure what advantage the change would have for you: Whichever sequence or character you use, it is just the marker for the end of the line (so there should be one of these, at the end of the first line and you'd have a 2 line text file in any event). However, if your application expects a certain character, you can either change the application -- many programming languages support some form of "universal" newline -- or change the data.
Just in case this is what you're looking for:
:set wrap
:set linebreak
The first tells vim to wrap long lines, and the second tells it to only break lines at word breaks, instead of in the middle of words when it reaches the window size.