How to change word recognition in vim spell?

I like that vim 7.0 supports spell checking via :set spell, and I like that it by default only checks comments and text strings in my C code. But I wanted to find a way to change the behavior so that vim will know that when I write words containing underscores, I don't want that word spell checked.
The problem is that I often refer to variable or function names in my comments, so right now vim treats each piece of text that isn't a complete, correct word as a spelling error. E.g.:
/* The variable proj_abc_ptr is used in function do_func_stuff */
Most of the time, the pieces separated by underscores are complete words, but other times they are abbreviations that I would prefer not to add to a word list. Is there any global way to tell vim to include _'s as part of the word when spell checking?

Here are some more general spell-checking exception rules to put in .vim/after/syntax/{LANG}.vim files:
" Disable spell-checking of bizarre words:
" - Mixed alpha / numeric
" - Mixed case (starting upper) / All upper
" - Mixed case (starting lower)
" - Contains strange character
syn match spellingException "\<\w*\d\w*\>" transparent contained containedin=pythonComment,python.*String contains=@NoSpell
syn match spellingException "\<\(\u\l*\)\{2,}\>" transparent contained containedin=pythonComment,python.*String contains=@NoSpell
syn match spellingException "\<\(\l\+\u\+\)\+\l*\>" transparent contained containedin=pythonComment,python.*String contains=@NoSpell
syn match spellingException "\S*[/\\_`]\S*" transparent contained containedin=pythonComment,python.*String contains=@NoSpell
Change pythonComment,python.*String for your language.
transparent means that the match inherits its highlighting properties from the containing block (i.e. these rules do not change the way text is displayed).
contained prevents these matches from extending past the containing block (the last rule ends with \S*, which would likely match past the end of a block).
containedin holds a list of existing syntax groups to add these new rules to.
contains=@NoSpell overrides any and all inherited groups, thus telling the spellchecker to skip the matched text.
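To find out which group names to put into containedin for your language, it can help to look at the syntax stack under the cursor. A minimal sketch using the built-in synstack() and synIDattr() functions; the function name ShowSynStack and the <leader>sy mapping are made up for illustration:

```vim
" Hypothetical helper: echo the names of all syntax groups under the
" cursor, outermost group first.
function! ShowSynStack() abort
  echo map(synstack(line('.'), col('.')), 'synIDattr(v:val, "name")')
endfunction

" Hypothetical mapping; pick any key you like.
nnoremap <leader>sy :call ShowSynStack()<CR>
```

Put the cursor inside a comment or a string and call it; the names it prints are the candidates for the containedin list.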

You'll need to move it into its own group. Something like this:
hi link cCommentUnderscore cComment
syn match cCommentUnderscore display '\k\+_\w\+'
syn cluster cCommentGroup add=cCommentUnderscore
In some highlighters you may need contains=@NoSpell on the end of the match line, but in C, the default is @NoSpell, so it should be fine like that.


How can I assign a vim syntax group to a 'Foo( ... )'? For arbitrary ... including contained parentheses

I am trying to give my debug macros a different color, not just the macro names, but at least the opening and closing parentheses in the same color.
As a test I'm currently using the following line:
                     ,--- Uncolored paren!?
int foobar; Dout( (( ) ) ( ) ); f((char*)x);
            \__ all of this should be colored.
If I try the following syntax rule:
syn region Debug matchgroup=DebugDelim start=/Dout(/ end=/)/ contains=cParen
then this isn't working: the first closing parenthesis that should be part of cParen isn't part of any highlighting group anymore and remains therefore uncolored.
Is this a bug in vim or am I doing something wrong?
Note: cParen here is defined as 'usual' (in the vim's c.vim syntax file).
EDIT: After figuring out the answer to my previous question (namely that contained groups match inside the start and end pattern unless you use a matchgroup argument), I changed my question to what remains a question to me.
After more than a week of research and trying things; the main conclusion is that vim's syntax system needs a redesign. It is flawed.
Having said that, the best answer would be 'it cannot be done', because the system is just too broken. With a lot of effort, though, a sort of horrible workaround can be forged (for THIS case, not for what I really needed) that looks as follows:
" Highlight my debug macros.
" A group with things that are normally already excluded from cParen.
syn cluster cwNotInParen contains=@cParenGroup,cCppParen,cErrInBracket,cCppBracket,@cStringGroup,@Spell
" Create a new cParen syntax group that will have Debug highlighting.
syn region cwDebugParen transparent matchgroup=cwDebugParenDelim contains=ALLBUT,@cwNotInParen
    \ start='(' end=")\@="
" Define cParen last, so it will overrule the previous one.
syn clear cParen
syn region cParen transparent contains=ALLBUT,@cwNotInParen,cwDebugParen
    \ start='(' end=')'
" Redefine cCppParen to also exclude cwDebugParen
syn clear cCppParen
syn region cCppParen transparent contained contains=ALLBUT,@cParenGroup,cErrInBracket,cParen,cwDebugParen,cBracket,cString,@Spell
    \ start='(' skip='\\$' excludenl end=')' end='$'
" Redefine cCppBracket to also exclude cwDebugParen.
syn clear cCppBracket
syn region cCppBracket transparent contained contains=ALLBUT,@cParenGroup,cErrInParen,cParen,cwDebugParen,cBracket,cString,@Spell
    \ start='\[\|<::\@!' skip='\\$' excludenl end=']\|:>' end='$'
" Add a syntax group for "Debug( ... );".
syn region cwDebugMacros transparent matchgroup=cwDebugMacrosDelim contains=cwDebugParen
    \ start="\v\W(Debug|DoutFatal|DoutEntering|Dout|ASSERT|assert)\(@="hs=s+1
    \ start="\v^(Debug|DoutFatal|DoutEntering|Dout|ASSERT|assert)\(@="
    \ end="\v\);?"
hi link cwDebugMacrosDelim Debug " Highlight 'Debug' and the close paren of that macro.
hi link cwDebugParenDelim Debug " Highlight the open paren of the Debug macro.
This works as follows:
A new syntax group is created called cwDebugParen. It is basically the same as cParen but will have highlighting for the parenthesis. Because the syntax definition is containing ALLBUT in many places, this new group will now be accepted everywhere. It is possible - but not very convenient - to just replace the whole syntax file with your own and then add the new cwDebugParen to all ALLBUT,... lists, but in this case it is easier to remove cParen and redefine it after the definition of cwDebugParen, effectively hiding that because the last rule that matches will be used. Unfortunately the definition of cParen depends on some conditionals so the whole would get pretty complex still; therefore I just picked the one cParen definition that is used based on my .vimrc.
Because cParen doesn't have a matchgroup I found that its start pattern is still also matched by cwDebugParen?! -- this seems a bug to me but at least can be avoided by adding cwDebugParen to the ALLBUT list of cParen.
cwDebugParen itself does have a matchgroup, which is of course necessary for the syntax highlighting of the parens. Note that although it ends on the closing paren ')', I use a zero-width match (:help \@=) because the same paren is needed to end the enclosing cwDebugMacros region. Note that using end=')'me=s-1 would have the same effect here.
Finally a new region is added, cwDebugMacros, that starts at a pattern like Debug( and ends on );. The two start= arguments are necessary because they use a different highlight start (hs): one starts at the beginning of the line, while the other demands a non-word character in front of Debug. The opening paren here is also a zero-width match, because we need it to start the cwDebugParen region that we contain in this region (fortunately we can use a zero-width match here, because me=s-1 isn't supported for start!?). That region then gobbles up all input up to the closing paren, so we don't have to be afraid that our end will match a closing paren that isn't the closing paren of our debug macro. Note that I used end="\v\);?" with a question mark after the semi-colon, so that IF the semi-colon directly follows the closing paren it gets the same color (I like that), while if for whatever reason there is white-space (including comments) in between, things aren't broken; only the semi-colon isn't colored anymore then.
Edit: Added a redefine for cCppParen and cCppBracket which use the construct ALLBUT=...,cParen,... and therefore now also need to exclude cwDebugParen.

How to match specific letters in words

I'm currently learning Russian, and there is one caveat in the encoding of Cyrillic letters: some look exactly like ASCII. For example, the word »облако« (cloud) contains neither an »a« nor an »o«; instead, it contains a »а« and a »о«. If you're not getting it yet, fire up your browser's search dialog, enter an »a« or an »o«, use some highlight-all functionality, and you will see that »а« and »о« both remain dark.
So, now I want to highlight this problem in vim. Since I'm using mixed language text files, I can't just highlight every ASCII letter (that would be easy). Instead, I want all ASCII letters in all words that contain at least one Cyrillic letter to be error-highlighted. My current approach is to use these matches:
" Here, I use бакло as a shortcut for the list of all cyrillic letters,
" this makes this a small self contained example for the word used in the
" problem description, without having the full list in all lines.
" To get the file I actually have, run
" :%s/бакло/ЖжФфЭэЗзЧчБбАаДдВЬвьЪъЫыСсЕеёНнЮюІіКкМмИиЙйПпЛлОоРрЯяГгТтЦцШшЩщХхУу/g
syn match russianWordOk "[бакло]\+"
syn match russianWordError "[бакло][a-zA-Z0-9_]\+"hs=s+1
syn match russianWordError "[a-zA-Z0-9_]\+[бакло]"he=e-1
syn match russianWordError "[бакло][a-zA-Z0-9_]\+[бакло]"hs=s+1,he=e-1
However, like in »облaко« (now a is ASCII), the highlighting would still mark »обл« as valid, »a« as invalid, »к« as not being part of a keyword (it is part of the matching russianWordError keyword) and finally the remaining »о« as valid again. What I want instead is to have the entire word being part of the matching russianWordError keyword but still only the »a« being highlighted as illegal. Is there a way and if yes, how do I accomplish that?
In order to only match whole words, not fragments inside other words, wrap your patterns in \< and \>. These assertions will then be based on Vim's 'iskeyword' setting, and should be fine. (Alternatively, you can do other lookbehind and lookahead assertions via \@<= and \@=.)
syn match russianWordOk "\<[бакло]\+\>"
I would approach the highlighting of the wrong ASCII character not via hs= / he=, but via a contained group. First, identify bad mixed words. There has to be at least one cyrillic letter, either at the beginning, or at the end. The rest is at least one (i.e. repeating the \%(...\) group with \+, or else you would only match single-error words) ASCII, potentially other cyrillics in between:
syn match russianWordBad "\<\%([бакло]*[a-zA-Z0-9_]\)\+[бакло]\+\>" contains=russianWordError
syn match russianWordBad "\<[бакло]\+\%([a-zA-Z0-9_][бакло]*\)\+\>" contains=russianWordError
This contains the ASCII syntax group that does the error highlighting. Because of contained, it only matches inside another group (here: russianWordBad).
syn match russianWordError "[a-zA-Z0-9_]" contained
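Put together, the rules could look like the sketch below. Two things here are my assumptions and not part of the answer above: the range [а-яА-ЯёЁ] stands in for the full letter list (it does not cover letters such as Іі from the question's list), and the hi def link line sends the contained group to the standard Error highlight group.

```vim
" Sketch only: [а-яА-ЯёЁ] is an assumed stand-in for the full letter list.
" Pure Cyrillic words are fine.
syn match russianWordOk "\<[а-яА-ЯёЁ]\+\>"

" Mixed words: at least one Cyrillic letter at the end or at the start.
syn match russianWordBad "\<\%([а-яА-ЯёЁ]*[a-zA-Z0-9_]\)\+[а-яА-ЯёЁ]\+\>" contains=russianWordError
syn match russianWordBad "\<[а-яА-ЯёЁ]\+\%([a-zA-Z0-9_][а-яА-ЯёЁ]*\)\+\>" contains=russianWordError

" Only the ASCII characters inside a bad word get the error color.
syn match russianWordError "[a-zA-Z0-9_]" contained

" Assumed highlight link; russianWordBad itself stays unhighlighted.
hi def link russianWordError Error
```

With this, the whole mixed word belongs to russianWordBad, while only the ASCII characters in it are drawn as errors.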

vim regex & highlight syntax: find a match and ignore sub-match in it

I am trying to write a syntax highlighter in VIM. How do you highlight a match within another match?
To find each match, I created two syn match lines, which work where the matches are separate.
syn match celString "^xpath=.\{-};" -> matches "xpath=.........;"
syn match celComment "\${.\{-}}" -> matches "${LIB_METADATA};"
The first line is pink for the xpath string and blue for the ${..} string.
The second line is pink for the xpath string, but the ${..} contained inside that string is ignored.
I've tried to change the order of the syn match lines, but that doesn't have any effect.
I'd appreciate your ideas.
By default, Vim only applies the syntax groups to text that hasn't yet been assigned a syntax. To specify that one group can contain other groups, use the contains=... attribute:
:syn match celString "^xpath=.\{-};" contains=celComment
The order of definition shouldn't matter here. See :help :syn-contains for more information.
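As a minimal sketch, the two rules from the question plus the contains= fix would look like this. The hi def link targets (String, Comment) are an assumption of mine; the question does not show which highlight groups are used.

```vim
" The xpath string may contain ${...} items.
syn match celString  "^xpath=.\{-};" contains=celComment

" Not marked 'contained', so it still matches on its own as well.
syn match celComment "\${.\{-}}"

" Assumed links to standard highlight groups.
hi def link celString  String
hi def link celComment Comment
```

celComment now gets its own color both on a line of its own and inside an xpath string.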

In vim, how do I highlight TODO: and FIXME:?

In vim, FIXME and TODO are highlighted, but I can't get FIXME: and TODO: (note the colon after the keyword) to highlight? What should I put in my .vimrc to make this happen?
Well, you've already found the problem, but here's the why.
There are three basic types of syntax matching: keywords, matches, and regions. Keywords are fixed strings, generally used for basic language keywords (int, double, ...) and also, in your case, for the FIXME and TODO. I really do mean fixed strings; they have to be exact and whole words, unlike matches and regions, which use regex. For example, from the C syntax:
syn keyword cTodo contained TODO FIXME XXX
It looks like that in pretty much all built-in syntax definitions, just with different group names (cTodo).
iskeyword tells vim whether a given character can be part of a keyword. By default, it does not include colons, so when looking for keywords, vim sees "FIXME:" as "FIXME", and ignores the colon. If you tack on the colon (set iskeyword+=:), you can now define an extra bit of highlighting:
syn keyword myTodo contained TODO: FIXME:
It's up to you how you want to work it into the existing syntax/highlight groups. If it's for just one filetype, you could add it to that syntax's todo group (e.g. cTodo). If you want it everywhere, you can do "myTodo" as I suggested, then link it straight to the Todo highlighting group (hi def link myTodo Todo).
Alternatively, you can leave iskeyword alone (I'd probably recommend this), and simply use a match:
syn match myTodo contained "\<\(TODO\|FIXME\):"
hi def link myTodo Todo
augroup vimrc_todo
au Syntax * syn match MyTodo /\v<(FIXME|NOTE|TODO|OPTIMIZE|XXX):/
\ containedin=.*Comment,vimCommentTitle
augroup END
hi def link MyTodo Todo
The containedin will add it to all groups ending in "Comment", plus
vimCommentTitle, where " TODO: foo would not get highlighted as MyTodo otherwise.
If you are setting up your own environment, put this in a syntax file (not in .vimrc).
The global syntax files are located in the vim directory (ex.
If you create ~/.vim/syntax/c.vim, you can add your own syntax there
(it overrides the global one).
Just add the additional syntax in that file (the way @Jefromi does).

Sub-match syntax highlighting in Vim

First, I'll show the specific problem I'm having, but I think the problem can be generalized.
I'm working with a language that has explicit parenthesis syntax (like Lisp), but has keywords that are only reserved against the left paren. Example:
(key key)
The former is a reserved word, but the latter is a reference to the variable named "key".
Unfortunately, I find highlighting the left paren annoying, so I end up using
syn keyword classification key
instead of
syn keyword classification (key
but the former triggers on the variable uses as well.
I'd take a hack to get around my problem, but I'd be more interested in a general method to highlight just a subset of a given match.
Using syn keyword alone for this situation doesn't work right because you want your highlighting to be more aware of the surrounding syntax. A combination of syn region, syn match, and syn keyword works well.
hi link lispFuncs Function
hi link lispFunc Identifier
hi link sExpr Statement
syn keyword lispFuncs key foo bar contained containedin=lispFunc
syn match lispFunc "(\@<=\w\+" contained containedin=sExpr contains=lispFuncs
syn region sExpr matchgroup=Special start="(" end=")" contains=sExpr,lispFuncs
The above will only highlight key, foo, and bar using the Function highlight group, only if they're also matched by lispFunc.
If there are any words other than key, foo, and bar which come after a (, they will be highlighted using the Identifier highlight group. This allows you to distinguish between standard function names and user-created ones.
The ( and ) will be highlighted using the Special highlight group, and anything inside the () past the first word will be highlighted using the Statement highlight group.
There does appear to be some capability for layered highlighting, as seen here: Highlighting matches in Vim over an inverted pattern
which gives ex commands
:match myBaseHighlight /foo/
:2match myGroup /./
I haven't been able to get anything like that to work in my syntax files, though. I tried something like:
syn match Keyword "(key"
syn match Normal "("
The highlighting goes to Normal or Keyword over the whole bit depending on what gets picked up first (altered by arrangement in the file)
Vim soundly rejected using "2match" as a keyword after "syn".
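For highlighting just a subset of a match inside a syntax file, one documented mechanism is a pattern offset (:help :syn-pattern-offset). A minimal sketch, reusing the group name classification and the word key from the question; the link to the Keyword highlight group is my assumption:

```vim
" The match covers "(key", but hs=s+1 starts the highlighting one
" character after the start of the match, so the paren keeps its color.
" \> stops "(keys" from matching.
syn match classification "(key\>"hs=s+1

" Assumed link to a standard highlight group.
hi def link classification Keyword
```

The paren is still consumed by this match, so I would expect trouble if another syntax item (a region for the s-expression, say) needs to start on the same paren; in that case the lookbehind form "(\@<=key\>" is probably the better fit.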
