What is the syntax called that's used for syntax highlighting? - vim

My apologies if this is such a trivial question for most, but it is not readily obvious to me and I can't find an answer. I am trying to understand how to code syntax highlighting for vi, but the syntax used [for the syntax highlighting] eludes me. For example, I know that
syn match myNumber '\d\+'
hi myNumber ctermfg=blue
will highlight positive integers blue. What confuses me is the '\d\+' part. Playing around it seems that \d means digit and \+ means several? But I have no idea what this syntax is called and thus can't find any documentation that could help me. I have only found links of people using it without explanation. I can probably decipher how
'[-+]\=\d[[:digit:]]*\.\d*[eE][\-+]\=\d\+'
means 'positive and negative numbers with decimals and exponents', but if I am to make more complex highlighting I don't know where to begin. Does anybody know of any documentation that I could use to learn this?

That's called a "regular expression" or regex (or regexp) for short. See :help pattern and :help usr_27.txt. Also see vimregex.com.
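As a rough illustration of the pieces used in the question's patterns, the sketch below tests a few strings with the =~# operator. The sample strings are made up for this example.

```vim
" \d matches one digit; \+ means one or more of the preceding atom.
" Evaluates to 1 (match):
echo '42' =~# '^\d\+$'
" Evaluates to 0 (no match, there are no digits):
echo 'abc' =~# '^\d\+$'
" \= means zero or one of the preceding atom, so the sign is optional.
" [[:digit:]]* is zero or more digits and \. is a literal dot.
" Evaluates to 1:
echo '-1.5e+10' =~# '^[-+]\=\d[[:digit:]]*\.\d*[eE][\-+]\=\d\+$'
```

The ^ and $ anchors are added here only so that the whole string has to match; the syntax rules in the question do not use them.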

Related

Why does this regular expression not work in Vim syntax highlighting?

I am attempting to create my own Python syntax highlighting file for Vim.
I'm trying to highlight the class inheritance object and the regex I've created
works in various regex testers, but doesn't work in Vim. I've read that Vim's regex is close to Perl style
so that is what I've been using.
I'm trying to highlight the word 'Subscribers' in the following text:
class Divisions(Subscribers):
The regex I've composed is:
(?!:class\s\w+)(?<=\()\w+(?=\):)
I'll be honest here, I stumbled into this while I was struggling to make a negative lookbehind work with quantifiers,
which I now understand isn't possible. I was experimenting with the non-capturing group (?:class\s\w+) and accidentally
inserted the exclamation mark which 'magically' solved the problem. At least in the multiple regex testers I was using.
Just for clarity: next comes a lookbehind (?<=\() to require but not include the '(',
and then a lookahead after the word (?=\):) to require but not include the closing '):'
I've added it to my Vim syntax file as:
syn match pythonClassInherit "(?!:class\s\w+)(?<=\()\w+(?=\):)"
Is this a valid regular expression in Vim? If not, can anybody offer a working solution for Vim?
EDIT: I realized that I had overcomplicated the issue; you just need the right Vim regex. Try
syntax match pythonClassInherit "\%(class\s\+\h\w*\s*(\s*\)\@<=\h\w*\%(\s*):\)\@="
highlight link pythonClassInherit pythonImport
With this rule in place, Subscribers is highlighted. You may replace pythonImport with another predefined highlight group.
For highlighting, I found https://learnvimscriptthehardway.stevelosh.com/ (highlighting starts from Chap. 45) helpful. As for regex, romainl's suggestion is awesome.
My first attempt was to use the regex atom \zs. This, unlike \@<=, requires that the syntax group cover the part that comes before \zs, which cannot be satisfied because other groups have already occupied that part. This led me to a complicated solution that overrode the default highlight groups to include a new group (for Subscribers). But it turns out we really don't have to.
If interested, see Vim syntax file not matching with \zs.
Online regular expression playgrounds don't support Vim's syntax, so using them here is pointless.
See vimregex.com for an overview, :help usr_27 for a gentle tutorial, :help pattern for the definitive reference, and :help perl-patterns for the differences between the Vim syntax and the Perl syntax.
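As a sketch of the translation that :help perl-patterns describes, the comments below pair the Perl lookaround forms with their Vim counterparts, and matchstr() applies a simplified Vim pattern to the sample line from the question. The variable name sample is only for this example.

```vim
" Perl              Vim
" (?:atom)          \%(atom\)
" (?=atom)          \(atom\)\@=
" (?!atom)          \(atom\)\@!
" (?<=atom)         \(atom\)\@<=
"
" Perl:  (?<=\()\w+(?=\):)
" Vim:   (\@<=\w\+\():\)\@=
" In Vim's default 'magic' mode a bare ( is literal and \( opens a group.
let sample = 'class Divisions(Subscribers):'
" Returns 'Subscribers'
echo matchstr(sample, '(\@<=\w\+\():\)\@=')
```

This simplified pattern only checks for the surrounding parentheses; it does not require the class keyword the way the pattern in the answer does.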

Is this change to Vim's syntax coloring file for sed good or disrupting?

Summary
I'm not asking your opinion about the proposed change being good, bad, amazing, or what.
I'm just asking your help to answer this question: does the change I propose for the syntax coloring of the comments break the syntax coloring of something else (keywords, strings, syntax errors, ...)?
The question above is not opinion based, as my change either does break something or it doesn't. That's it.
Original question
I have created the issue #5876 on Vim's GitHub page to propose a change to vim/runtime/syntax/sed.vim, but it has not received much attention, so I'm considering creating a PR for the change.
In fact, I created an issue instead of a PR because I'm not totally confident the change is not disruptive, hence this question.
The issue is with line 20:
syn match sedComment "^\s*#.*$"
because of which only "full line" comments are colored as comments. Using a trailing comment after a command (allowed by GNU sed, for instance) triggers red background coloring (I guess because the syntax coloring logic considers it an error).
I think it would be reasonable to relax this definition of comments to permit GNU sed-style comments, for the simple reason that the rule is less restrictive.
In this respect, I have noticed that changing that line to
syn match sedComment "\s*#.*$"
i.e. just removing the anchor ^, seems to be enough. I have also tried testing it by putting some # in search and replace strings in a sed script, and it seems fine.
However I don't feel confident with Vim syntax coloring files, so I would like to be sure that regex, as I edited it, is not causing false positives.
To demonstrate why I'm not confident about it, take this single-line sed script
s/aaa/bbb/#ccc
Here # is not colored as a comment, and the background of ccc is red (like an error?), whereas just adding a space gives the correct coloring:
s/aaa/bbb/ #ccc
Therefore I think that my edit works (or seems to work) because of precedence rules between the several syntax coloring directives (with respect to this specific example with s/aaa/bbb/#ccc, I think that # just after the closing delimiter of the s command has a meaning in the language, but I don't know; the GNU Sed man page doesn't say anything about it).
Edit
Another example suggested in the comments is the following, the syntax coloring of which is not broken by the proposed change
s/#if !defined(\([^)]*)/#ifndef \1/ # with or without this comment is fine
My research
I have eventually found some time to study :h E410 and now I have an answer.
From there I read that
1. Keyword
[...] It will only match with a complete word [...]
2. Match
This is a match with a single regexp pattern.
3. Region
This starts at a match of the "start" regexp pattern and ends with a match with the
"end" regexp pattern. Any other text can appear in between. A "skip" regexp
pattern can be used to avoid matching the "end" pattern.
Since the line I'm going to change uses the syntax 2.,
point 1. is not relevant,
point 3. is relevant, as in general there can be an overlap between what syn match and syn region match.
(By the way, I verified that syn region uses a lazy regex, probably \_.\{-} looking at some examples afterwards, to match stuff between the start and end regexp patterns.)
Then from :h :syn-priority
An item that starts in an earlier position has priority over items that start in later positions.
The answer
Therefore, changing
syn match sedComment "^\s*#.*$"
to
syn match sedComment "\s*#.*$"
should not introduce any syntax error, the reason being that if a text matching the above regex is not actually a comment (as is the case for e.g. the command s/#/hash/), then it must be preceded by some non-comment syntax which will be matched by other syn match/syn region groups. Since this match starts before where the group sedComment starts to match, the former has precedence on the latter.
In conclusion, unless something is broken with sed.vim already, the proposed change will not result in a wrong syntax coloring.
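The priority argument above can be tried out with a minimal sketch. The two groups below are hypothetical and far simpler than the real rules in sed.vim; they only show that an item starting earlier on the line keeps a later # from being treated as a comment.

```vim
" Hypothetical groups for illustration only; these are not the rules in sed.vim.
syntax clear
" A crude stand-in for the substitute command: s/.../.../
syntax match demoSubst   "s/[^/]*/[^/]*/"
" The relaxed comment rule, without the ^ anchor.
syntax match demoComment "\s*#.*$"
highlight default link demoSubst   Statement
highlight default link demoComment Comment
" Try it on a buffer containing:
"   s/#/hash/ # trailing comment
```

With these rules, the # inside s/#/hash/ should stay part of demoSubst because that match starts at the beginning of the line, while the trailing part should be picked up by demoComment.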
Further observations
Actually, there is already something wrong in sed.vim, and my proposed edit does not solve it (I should edit some other line in sed.vim to fix it, but I'm a bit lazy now).
For instance, in the following line, which is illegal because a is not a valid flag, a does not get the Error coloring.
s/x/y/a
My proposed edit does not solve this bug.
For the same reason, since some other syntax rule is eating 1 character after the third delimiter in the substitution command, the GNU-sed-valid comment in the following command is incorrectly colored
s/x/y/#hello
#     │└───┴─── colored as Error
#     └─ not colored
The coloring would be wrong even for an old version of sed which does not allow trailing comments, as the Error coloring should include the # too.
My proposed edit does not solve this bug either.
One more observation
The change I propose causes single-letter commands to be highlighted as Error if they are followed by a comment
p # this is a comment
│ └─────────────────┴─── colored as Comment (ok)
└────── colored as Error (bad)
Again, this is not a regression, as the current sed.vim colors almost the whole line above as an Error.

What does the comment "COMBAK" mean?

I was looking at the vimscript syntax file in the syntax directory, and under the keyword vimTodo were the words COMBAK, FIXME, TODO, and XXX. I can figure out what FIXME and TODO mean pretty easily, and I can guess what some might use XXX for, but I have no idea what COMBAK is for.
It must have a meaning of some sort, else there would be no reason to highlight it. I get that it's a code tag, but what does it mean? My best guess so far is an abbreviation for COMEBACK, though I doubt this.
Here is what I found so far:
Googling it got me nothing of use, and a google code search for COMBAK (with or without the quotes) got 0 results. I eventually Googled codetag "COMBAK" and found a single result, which uses it as a tag in a comment twice (a [ctrl+F] will find it): http://pastebin.com/H6mjbyBh.
The program is written in Vimscript, and contains both a vimscript syntax file and vimscript indent plugin file for lisp, along with some other massive functions.
Yes, it literally means "COME BACK".
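For context, tags like these are usually defined as a contained keyword group that is only recognized inside comments. The sketch below uses hypothetical group names and a hypothetical # comment syntax; it is not copied from the Vim script syntax file.

```vim
" Hypothetical group names; the tag list mirrors the one quoted in the question.
syntax keyword demoTodo contained COMBAK FIXME TODO XXX
" Assumed comment syntax for the example: # to end of line.
syntax match demoComment "#.*$" contains=demoTodo
highlight default link demoTodo    Todo
highlight default link demoComment Comment
```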

Making a Vim theme that disables highlighting except for some special keywords

Inspired by several posts, like Your syntax highlighter is wrong, Coding in color and A case against syntax highlighting and some others, I decided making a Vim theme that applied some of these concepts would be a good idea.
The thing is I'm not exactly sure how.
From what I can tell, to make a Vim theme you basically link a color to a syntax identifier or name, and repeat this dozens or hundreds of times until you have a theme.
For example, you would link the color #ff0000 (red) to the syntax identifier, or key, Error. I'm not sure that's actually the syntax key.
This would work fine, except that, every syntax that I don't consider important I have to define as just a default foreground value.
And let's say I wanted to add a new syntax keyword, I'd have to do it with ftsyntax and stuff (I believe) and that would be filetype specific etc.
So the first question is:
What would be the best way to give everything a default foreground color and only pick the exceptions to have some colors?
And the second, perhaps more important question is:
How do I syntax highlight a specific piece of text without having to add a syntax rule? For example have a regex that finds any = and highlights them green, without having to add a syntax rule specific for that.
Any help is appreciated. Of course if the approach I'm taking to this is not ideal or sucks I am open to suggestions to alternatives. Thank you. :)
See the example syntax file below:
syn keyword myKeywords We Are Important Keywords
syn match myEquals '='
hi link myKeywords Special
hi link myEquals Operator
This will put We, Are, Important and Keywords into the myKeywords syntax group and = into the myEquals syntax group.
Then we specify how we want to highlight them, by linking it to the Special and Operator highlight groups.
See: :help group-name for a list of the highlight groups and what the colors look like with your color-scheme.
In my color-scheme, Special is Red and Operator is green.
By default, everything else is set to the default foreground color.
I saved this to ~/.vim/syntax/greduan.vim and tested with :set syntax=greduan
Your question touches two domains: syntax definition and syntax highlighting.
Syntax definition, as in Caek's answer, is simple for the first 10 minutes but grows very quickly into a major PITA because it is a core aspect of Vim's architecture with ramifications far beyond syntax highlighting.
Syntax highlighting has its pitfalls but it is a lot simpler than syntax definition.
I think that you can tackle the problem described in those blog posts with syntax highlighting first and, if needed, graduate to syntax definition.
Grab a simple colorscheme like Busybee.
Link all the highlight groups you don't need to Normal while leaving the ones you want to keep:
hi link Foo Normal
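A minimal sketch of that step follows. The list of groups is an arbitrary choice for the example, and it assumes the lines run after the :colorscheme command. The ! is used because :highlight link does not replace settings that a group already has unless ! is given.

```vim
" Groups chosen only as an example; adjust the list to taste.
" Assumes this runs after :colorscheme so the links are not overwritten.
for group in ['Identifier', 'Type', 'PreProc', 'Constant']
  " Equivalent to typing, e.g.:  highlight! link Identifier Normal
  execute 'highlight! link' group 'Normal'
endfor
```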
"What would be the best way to give everything a default foreground color and only pick the exceptions to have some colors?"
What is best depends on your needs. For me the best way, because it was the quickest, was clearing the unwanted highlighting in ~/.vimrc:
sy on
hi c Constant|hi c Error|hi c PreProc|hi c Special|hi c Statement|hi c Type
hi c Identifier
"How do I syntax highlight a specific piece of text without having to add a syntax rule?"
If by syntax rule you mean syntax item, I'd say you cannot have syntax highlighting without defining syntax items.
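Outside the syntax mechanism itself, Vim also has window-local match highlighting, which may be close to what the question asks for. The sketch below uses the built-in matchadd() function; the variable name is made up for the example, and the match applies only to the current window rather than to a filetype.

```vim
" Highlight every '=' in the current window using the Operator group.
let w:equals_match = matchadd('Operator', '=')
" To remove it again later:
" call matchdelete(w:equals_match)
```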

vim does not find and replace simple phrase that is clearly present

I have a simple vim problem that Google hasn't managed to help me with. Any thoughts are appreciated.
I do the following search and replace:
:s/numnodes/numnodes1/g
On a file containing the following text:
numprocs=0
numnodes=0
I get
E486: Pattern not found
The position of the green square which indicates where I'd start typing is clearly above the pattern. I tried searching for other short phrases not involving regex, which are also present, which also fail. A simple /numnodes highlights matches as expected. Does anyone have any idea what might be the matter with vim?
Try :%s/searchphrase/replacephrase/g
Without the % symbol Vim only matches and replaces on the current line.
Try using this:
:%s/numnodes/numnodes1/g
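For comparison, here is the same substitution with a few different ranges. These are alternatives to pick from, not a script to run from top to bottom; the 1,10 range is an arbitrary example.

```vim
" Current line only (what the question ran):
:s/numnodes/numnodes1/g
" Every line in the buffer:
:%s/numnodes/numnodes1/g
" Lines 1 through 10 only:
:1,10s/numnodes/numnodes1/g
```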

Resources