What's the meaning of some advanced patterns in vim errorformat? (%s, %+, %\\@=) - vim

I tried reading :help errorformat and googling (mostly stackoverflow), but can't understand some of the patterns mentioned there:
%s - "specifies the text to search for to locate the error line. [...]"
um, first of all, trying to understand the sentence at all, where do I put the "text to search", after the %s? before it? or, I don't know, does it maybe taint the whole pattern? WTF?
secondly, what does this pattern actually do, how does it differ from regular text in a pattern, like some kinda set efm+=,foobar? the "foobar" here is for me also "text to search for"... :/
%+ - e.g. I've seen something like that used in one question: %+C%.%#
does it mean the whole line will be appended to a %m used in an earlier/later multiline pattern? if yes, then what if there was not %.%# (== regexp .*), but, let's say, %+Ccont.: %.%# - would something like that work to capture only stuff after a cont.: string into the %m?
also, what's the difference between %C%.%# and %+C%.%# and %+G?
also, what's the difference between %A and %+A, or %E vs. %+E?
finally, an example for Python in :help errorformat-multi-line ends with the following characters: %\\@=%m -- WTF does the %\\@= mean?
I'd be very grateful for some help understanding this stuff.

Ah, errorformat, the feature everybody loves to hate. :)
Some meta first.
Some Vim commands (such as :make and :cgetexpr) take the output of a compiler and parse it into a quickfix list. errorformat is a string that describes how this parsing is done. It's a list of patterns, each pattern being a sort of hybrid between a regexp and a scanf(3) format. Some of these patterns match single lines in the compiler's output, others try to match multiple lines (%E, %A, %C etc.), others keep various states (%D, %X), others change the way parsing proceeds (%>), while yet others simply produce messages in the qflist (%G), or ignore lines in the input (%-G). Not all combinations make sense, and it's quite likely you won't figure out all details until you look at Vim's sources. (shrug)
You probably want to write errorformats using let &efm='...' rather than set efm=.... The syntax is much more human-friendly.
You can experiment with errorformat using cgetexpr. cgetexpr expects a list, which it interprets as the lines in the compiler's output. The result is a qflist (or a syntax error).
qflists are lists of errors, each error being a Vim "dictionary". See :help getqflist() for the (simplified) format.
Errors can identify a place in a file, they can be simple messages (if essential data that identifies a place is missing), and they can be valid or invalid (the invalid ones are essentially the leftovers from parsing).
You can display the current qflist with something like :echomsg string(getqflist()), or you can see it in a nice window with :copen (some important details are not shown in the window though). :cc will take you to the place of the first error (assuming the first error in qflist actually refers to an error in a file).
Now to answer your questions.
um, first of all, trying to understand the sentence at all, where do I put the "text to search", after the %s? before it?
You don't. %s reads a line from the compiler's output and translates it to a pattern in the qflist. That's all it does. To see it at work, create a file efm.vim with this content:
let &errorformat ='%f:%s:%m'
cgetexpr ['efm.vim:" bar:baz']
echomsg string(getqflist())
copen
cc
" bar baz
" bar
" foo bar
Then run :so%, and try to understand what's going on. %f:%s:%m looks for three fields: a filename, the %s thing, and the message. The input line is efm.vim:" bar:baz, which is parsed into filename efm.vim (that is, current file), pattern ^\V" bar\$, and message baz. When you run :cc Vim tries to find a line matching ^\V" bar\$, and sends you there. That's the next-to-last line in the current file.
secondly, what does this pattern actually do, how does it differ from regular text in a pattern, like some kinda set efm+=,foobar?
set efm+=foobar\ %m will look for a line in the compiler's output starting with foobar followed by a space, then assign the rest of the line to the message field in the corresponding error.
%s reads a line from the compiler's output and translates it to a pattern field in the corresponding error.
%+ - e.g. I've seen something like that used in one question: %+C%.%#
does it mean the whole line will be appended to a %m used in an earlier/later multiline pattern?
Yes, it appends the content of the line matched by %+C to the message produced by an earlier (not later) multiline pattern (%A, %E, %W, or %I).
if yes, then what if there was not %.%# (== regexp .*), but, let's say, %+Ccont.: %.%# - would something like that work to capture only stuff after a cont.: string into the %m?
No. With %+Ccont.: %.%# only the lines matching the regexp ^cont\.: .*$ are considered, the lines not matching it are ignored. Then the entire line is appended to the previous %m, not just the part that follows cont.:.
also, what's the difference between %C%.%# and %+C%.%# and %+G?
%Chead %m trail matches ^head .* trail$, then appends only the middle part to the previous %m (it discards head and trail).
%+Chead %m trail matches ^head .* trail$, then appends the entire line to the previous %m (including head and trail).
%+Gfoo matches a line starting with foo and simply adds the entire line as a message in the qflist (that is, an error that only has a message field).
also, what's the difference between %A and %+A, or %E vs. %+E?
%A and %E start multiline patterns. %+ seems to mean "add the entire line being parsed to message, regardless of the position of %m".
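The difference between %C and %+C can be checked with cgetexpr. The following is only a small sketch: the "Error:" and "detail" prefixes and the two sample lines are made up for illustration. The two stored messages should differ only in whether the "detail " prefix of the continuation line is kept.

```vim
" Hypothetical compiler output, invented for this illustration.
let sample_lines = ['Error: foo', 'detail bar']

" Plain %C: only the part matched by %m is appended to the message.
let &errorformat = '%EError: %m,%Cdetail %m'
cgetexpr sample_lines
let plain_text = getqflist()[0].text

" %+C: the entire continuation line is appended to the message.
let &errorformat = '%EError: %m,%+Cdetail %m'
cgetexpr sample_lines
let plus_text = getqflist()[0].text

" The parts of a multi-line message are joined with a newline.
echomsg string([plain_text, plus_text])
```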
finally, an example for Python in :help errorformat-multi-line ends with the following characters: %\\@=%m -- WTF does the %\\@= mean?
%\\@= translates to the regexp qualifier \@=, "matches preceding atom with zero width".

Related

Vim errorformat string to show message in QuickFix removing part of it

I'm writing an errorformat string, and it works for the most part. My problem is that I have lines like this as the makeprg output:
Some text I want to show in the QuickFix window^M
Yes, the line ends with a spurious ^M character I want to remove. So, what I want in my QuickFix window is this, without the ^M character:
|| Some text I want to show in the QuickFix window
but I have this instead:
|| Some text I want to show in the QuickFix window^M
So far, this is the relevant part of my errorformat:
set errorformat=%+GSome text%m
I've tested, without success, something like this:
set errorformat=%+GSome text%m%-G^M%.%#
but it throws an error (not from the ^M which is a literal control-M char, not a caret followed by an M).
Obviously the solution is not using %G but I am at a loss here.
How can I remove the line ending character from the line here? And also, removing the initial || would be a plus, but I think it's impossible to do in Vim.
Thanks in advance!
Edited to make clearer how the input text looks
Well, turns out I found a solution, probably not very good but it works, using trial and error.
set errorformat=%\\(Some Text%*[^.]).%\\)%\\@=%m
That is, the solution is using the Vim pattern (regex) expressions within errorformat, which has a quite arcane look but works, together with %* to match unknown text on the rest of the line.
The solution uses \@=, a zero-width match, and requires some kind of terminator for the line, which appears before the ^M character I want to ignore, and some kind of text appearing somewhere on the line to match that line and not others.
Probably there is a much better solution but this is the best I could do myself.

Vim errorformat: include part of expression in message string

With vim's errorformat syntax, is there any way to use part of the message in filtering results?
As an example, some linker errors don't have anything explicit to distinguish them as an error on the line, other than the error itself:
/path/to/foo.cpp:42: undefined reference to 'UnimplementedFunction'
or
/path/to/foo.cpp:43: multiple definition of 'MultiplyDefinedFunction'
Using an errorformat of:
set efm=%f:%l:\ %m
would catch and display both of these correctly, but will falsely match many other cases (any line that starts with "[string]:[number]: ").
Or, explicitly specifying them both:
set efm=
set efm+=%f:%l:\ undefined\ reference\ to\ %m
set efm+=%f:%l:\ multiple\ definition\ of\ %m
removes the false positives, but the 'message' becomes far less useful -- the actual error is no longer included (just whatever is after it).
Is there anything in the syntax I'm missing to deal with this situation?
Ideally I'd like to be able to say something along the lines of:
set efm+=%f:%l:\ %{StartMessage}undefined\ reference\ to\ %*\\S%{EndMessage}
set efm+=%f:%l:\ %{StartMessage}multiple\ definition\ of\ %*\\S%{EndMessage}
... where everything matched between StartMessage and EndMessage is used as the error's message.
The errorformat can also use vim's regular expression syntax (albeit in a rather awkward way) which gives us a solution to the problem. We can use a non-capturing group and a zero-width assertion to require the presence of these signaling phrases without consuming them. This then allows the %m to pick them up. As plain regular expression syntax this zero-width assertion looks like:
\%(undefined reference\|multiple definition\)\@=
But in order to use it in efm we need to replace \ by %\ and % by %% and for use in a :set line we need to escape the backslashes, spaces and vertical bar so we finally have:
:set efm=%f:%l:\ %\\%%(undefined\ reference%\\\|multiple\ definition%\\)%\\@=%m
With that the error file
/path/to/foo.cpp:42: undefined reference to 'UnimplementedFunction'
/path/to/foo.cpp:43: multiple definition of 'MultiplyDefinedFunction'
notafile:123: just some other text
comes out as the following in :copen:
/path/to/foo.cpp|42| undefined reference to 'UnimplementedFunction'
/path/to/foo.cpp|43| multiple definition of 'MultiplyDefinedFunction'
|| notafile:123: just some other text
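A let-based variant of this approach avoids the extra :set escaping, so only the errorformat-level % prefixes remain. This is a sketch rather than a drop-in setting; it feeds the three sample lines to cgetexpr and prints what ends up in the quickfix list.

```vim
" Zero-width check for the signalling phrases, written without :set escaping.
let &errorformat = '%f:%l: %\%%(undefined reference%\|multiple definition%\)%\@=%m'

" The three sample lines from the error file.
let linker_lines = []
call add(linker_lines, "/path/to/foo.cpp:42: undefined reference to 'UnimplementedFunction'")
call add(linker_lines, "/path/to/foo.cpp:43: multiple definition of 'MultiplyDefinedFunction'")
call add(linker_lines, 'notafile:123: just some other text')
cgetexpr linker_lines

" Matched lines carry a line number; the unmatched line is kept as an invalid entry.
let entries = getqflist()
for entry in entries
  echomsg printf('valid=%d lnum=%d text=%s', entry.valid, entry.lnum, entry.text)
endfor
```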
I've been using sed to rewrite the output in cases like this where I want to get some arbitrary output that's not necessarily homogeneous into the quickfix window.
You could write a make.sh that fires off make (or whatever you're using to build) and trims off stuff you're not concerned with:
make | sed '/undefined reference\|multiple definition/!d'
(Deletes lines not containing 'undefined reference' or 'multiple definition')
If that's going to get too unwieldy because of the number of error strings you care about, you could do the inverse and just kill stuff you don't care about:
make | sed 's/some garbage\|other useless message//'
then :set makeprg=make.sh in vim

Find first non-matching line in VIM

It happens sometimes that I have to look into various log and trace files on Windows, and I generally use VIM for the purpose.
My problem though is that I still can't find any analog of grep -v inside of VIM: find in the buffer a line not matching a given regular expression. E.g. the log file is filled with lines which somewhere in the middle contain the phrase all is ok, and I need to find the first line which doesn't contain all is ok.
I can write a custom function for that, yet at the moment that seems to be an overkill and likely to be slower than a native solution.
Is there any easy way to do it in VIM?
I believe if you simply want to have your cursor end up at the first non-matching line you can use visual as the command in your global command. So:
:v/pattern/visual
will leave your cursor at the first non-matching line. Or:
:g/pattern/visual
will leave your cursor at the first matching line.
you can use the negative look-behind operator @<!
e.g. to find all lines not containing "a", use /\v^.+(^.*a.*$)@<!$
(\v just means that operators like ( and @<! don't need to be backslash-escaped)
the simpler method is to delete all lines matching or not matching the pattern (:g/PATTERN/d or :g!/PATTERN/d respectively)
I'm often in your case, so to "clean" the logs files I use :
:g/all is ok/d
Your grep -v can be achieved with
:v/error/d
Which will remove all lines which do not contain error.
It's probably already too late, but I think that this should be said somewhere.
Vim (since version about 7.4) comes with a plugin called LogiPat, which makes searching for lines which don't contain some string really easy. So using this plugin finding the lines not containing all is ok is done like this:
:LogiPat !"all is ok"
And then you can jump between the matching (or in this case not matching) lines with n and N.
You can also use logical operations like & and | to join different strings in one pattern:
:LP !("foo"|"bar")&"baz"
LP is shorthand for LogiPat, and this command will search for lines that contain the word baz and contain neither foo nor bar.
I just managed a somewhat klutzy procedure using the "g" command:
:%g!/search/p
This says to print out the non-matching lines... not sure if that worked, but it did end up with the cursor positioned on the first non-matching line.
(substitute some other string for "search", of course)
You can search with following line and press n to jump to the first non-matching line
^\(.*all is ok\)\@!.*$
Breakdown of operators:
^ -> means start of the line
\( and \) -> To match a whole string multiple times, it must be grouped into one item. This is done by putting "\(" before it and "\)" after it.
\@! -> Matches with zero width if the preceding atom does NOT match at the current position.
.* -> Matches any character repeated 0 or more times
$ -> end of the line
Here is sample animation how it works. For simplicity I searched for word apple.
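The same negative zero-width match can also be used from a script through search(). A minimal sketch, using a scratch buffer with made-up log lines (the buffer contents and the variable name first_bad are only for illustration):

```vim
" Scratch buffer with invented log lines.
new
call setline(1, ['1: all is ok', '2: all is ok', '3: disk full', '4: all is ok'])

" Start at the top; 'c' accepts a match at the cursor, 'W' prevents wrapping around.
call cursor(1, 1)
let first_bad = search('^\%(.*all is ok\)\@!', 'cW')

" search() returns the line number of the match, or 0 if every line matches.
echomsg 'first non-matching line: ' . first_bad
```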
You can iterate through the non-matches using g and a null substitution:
:g!/pattern/s/^//c
If you reply "n" each time you won't even mark the file as changed.
You need ctrl-C to escape from the circle (or keep going to bottom of file).

Why doesn't Vims errorformat take regular expressions?

Vim's errorformat (for parsing compile/build errors) uses an arcane format from C for parsing errors.
Trying to set up an errorformat for nant seems almost impossible; I've tried for many hours and can't get it. I also see from my searches that a lot of people seem to be having the same problem. A regex to solve this would take minutes to write.
So why does vim still use this format? It's quite possible that the C parser is faster but that hardly seems relevant for something that happens once every few minutes at most. Is there a good reason or is it just an historical artifact?
It's not that Vim uses an arcane format from C. Rather it uses the ideas from scanf, which is a C function. This means that the string that matches the error message is made up of 3 parts:
whitespace
characters
conversion specifications
Whitespace is your tabs and spaces. Characters are the letters, numbers and other normal stuff. Conversion specifications are sequences that start with a '%' (percent) character. In scanf you would typically match an input string against %d or %f to convert to integers or floats. With Vim's error format, you are searching the input string (error message) for files, lines and other compiler specific information.
If you were using scanf to extract an integer from the string "99 bottles of beer", then you would use:
int i;
scanf("%d bottles of beer", &i); // i would be 99, string read from stdin
Now with Vim's error format it gets a bit trickier but it does try to match more complex patterns easily. Things like multiline error messages, file names, changing directory, etc, etc. One of the examples in the help for errorformat is useful:
Error 275
line 42
column 3
' ' expected after '--'
The appropriate error format string has to look like this:
:set efm=%EError\ %n,%Cline\ %l,%Ccolumn\ %c,%Z%m
Here %E tells Vim that it is the start of a multi-line error message. %n is an error number. %C is the continuation of a multi-line message, with %l being the line number, and %c the column number. %Z marks the end of the multiline message and %m matches the error message that would be shown in the status line. You need to escape spaces with backslashes, which adds a bit of extra weirdness.
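To try this without a real compiler, the four sample lines can be fed to cgetexpr. This is a sketch that assigns the format with let &errorformat, so the spaces need no backslashes; the variable name entry is arbitrary.

```vim
let &errorformat = '%EError %n,%Cline %l,%Ccolumn %c,%Z%m'
cgetexpr ['Error 275', 'line 42', 'column 3', "' ' expected after '--'"]

" The four input lines should collapse into a single quickfix entry.
let qf_entries = getqflist()
let entry = qf_entries[0]
echomsg printf('nr=%d lnum=%d col=%d type=%s', entry.nr, entry.lnum, entry.col, entry.type)

" The stored text may carry a leading newline from the multi-line join, hence trim().
echomsg 'text: ' . trim(entry.text)
```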
While it might initially seem easier with a regex, this mini-language is specifically designed to help with matching compiler errors. It has a lot of shortcuts in there. I mean you don't have to think about things like matching multiple lines, multiple digits, matching path names (just use %f).
Another thought: How would you map numbers to mean line numbers, or strings to mean files or error messages if you were to use just a normal regexp? By group position? That might work, but it wouldn't be very flexible. Another way would be named capture groups, but then this syntax looks a lot like a short hand for that anyway. You can actually use regexp wildcards such as .* - in this language it is written %.%#.
OK, so it is not perfect. But it's not impossible either and makes sense in its own way. Get stuck in, read the help and stop complaining! :-)
I would recommend writing a post-processing filter for your compiler that uses regular expressions or whatever, and outputs messages in a simple format for which it is easy to write an errorformat. Why learn some new, baroque, single-purpose language unless you have to?
According to :help quickfix,
it is also possible to specify (nearly) any Vim supported regular
expression in format strings.
However, the documentation is confusing and I didn't put much time into verifying how well it works and how useful it is. You would still need to use the scanf-like codes to pull out file names, etc.
They are a pain to work with, but to be clear: you can use regular expressions (mostly).
From the docs:
Pattern matching
The scanf()-like "%*[]" notation is supported for backward-compatibility
with previous versions of Vim. However, it is also possible to specify
(nearly) any Vim supported regular expression in format strings.
Since meta characters of the regular expression language can be part of
ordinary matching strings or file names (and therefore internally have to
be escaped), meta symbols have to be written with leading '%':
%\   The single '\' character.  Note that this has to be
     escaped ("%\\") in ":set errorformat=" definitions.
%.   The single '.' character.
%#   The single '*'(!) character.
%^   The single '^' character.  Note that this is not
     useful, the pattern already matches start of line.
%$   The single '$' character.  Note that this is not
     useful, the pattern already matches end of line.
%[   The single '[' character for a [] character range.
%~   The single '~' character.
When using character classes in expressions (see |/\i| for an overview),
terms containing the "\+" quantifier can be written in the scanf() "%*"
notation. Example: "%\\d%\\+" ("\d\+", "any number") is equivalent to "%*\\d".
Important note: The \(...\) grouping of sub-matches can not be used in format
specifications because it is reserved for internal conversions.
lol try looking at the actual vim source code sometime. It's a nest of C code so old and obscure you'll think you're on an archaeological dig.
As for why vim uses the C parser, there are plenty of good reasons starting with that it's pretty universal. But the real reason is that sometime in the past 20 years someone wrote it to use the C parser and it works. No one changes what works.
If it doesn't work for you the vim community will tell you to write your own. Stupid open source bastards.

Search for string and get count in vi editor

I want to search for a string and find the number of occurrences in a file using the vi editor.
THE way is
:%s/pattern//gn
You need the n flag. To count words use:
:%s/\i\+/&/gn
and a particular word:
:%s/the/&/gn
See count-items documentation section.
If you simply type in:
%s/pattern/pattern/g
then the status line will give you the number of matches in vi as well.
:%s/string/string/g
will give the answer.
(similar as Gustavo said, but additionally: )
For any previous search, you can do simply:
:%s///gn
A pattern is not needed, because it is already in the search-register (@/).
"%" - do s/ in the whole file
"g" - search global (with multiple hits in one line)
"n" - prevents any replacement of s/ -- nothing is deleted! nothing must be undone!
(see: :help s_flag for more information)
(This way, it works perfectly with "Search for visually selected text", as described in vim-wikia tip171)
:g/xxxx/d
This will delete all the lines with pattern, and report how many deleted. Undo to get them back after.
Short answer:
:%s/string-to-be-searched//gn
For learning:
There are 3 modes in VI editor as below
: you are entering from Command to Command-line mode. Now, whatever you write after : is on CLI(Command Line Interface)
%s specifies all lines. Specifying the range as % means do substitution in the entire file. Syntax for all occurrences substitution is :%s/old-text/new-text/g
g specifies all occurrences in the line. With the g flag, every occurrence in the line is substituted. If the g flag is not used, only the first occurrence in each line is substituted.
n specifies to output number of occurrences
//double slash represents omission of replacement text. Because we just want to find.
Once got the number of occurrences, you can Press N Key to see occurrences one-by-one.
For finding and counting in particular range of line number 1 to 10:
:1,10s/hello//gn
Please note, % for the whole file is replaced by comma-separated line numbers.
For finding and replacing in a particular range, lines 1 to 10 (no n flag here, since n only counts and suppresses the replacement):
:1,10s/helo/hello/g
use
:%s/pattern/\0/g
when pattern string is too long and you don't like to type it all again.
I suggest doing:
Search either with * to do a "bounded search" for what's under the cursor, or do a standard /pattern search.
Use :%s///gn to get the number of occurrences. Or you can use :%s///n to get the number of lines with occurrences.
** I really wish I could find a plug-in that would give a message like "match N of N1 on N2 lines" with every search, but alas.
Note:
Don't be confused by the tricky wording of the output. The former command might give you something like 4 matches on 3 lines where the latter might give you 3 matches on 3 lines. While technically accurate, the latter is misleading and should say '3 lines match'. So, as you can see, there really is never any need to use the latter ('n' only) form. You get the same info, more clearly, and more by using the 'gn' form.
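If the count is needed inside a script rather than on the status line, the message printed by :s with the n flag can be captured with execute(). A rough sketch, using a scratch buffer with made-up text; the variable names are arbitrary, and the parsing simply takes the first number in the message.

```vim
" Scratch buffer with invented text.
new
call setline(1, ['the cat and the dog', 'another line', 'nothing here'])

" g counts every occurrence, n suppresses the replacement,
" e avoids an error when the pattern is not found at all.
let report = execute('%s/the//gne')

" The message looks like "N matches on M lines"; take the first number.
let total = str2nr(matchstr(report, '\d\+'))
echomsg 'occurrences: ' . total
```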
