How to exclude capitalized words from spell checking in Vim? - vim

There are too many acronyms and proper nouns to add to the dictionary. I would like any word that contains a capital letter to be excluded from spell checking. Words are delimited by either whitespace or special characters (i.e., non-alphabetic characters). Is this possible?
The first part of the answer fails when the lowercase and special characters surround the capitalized word:
,jQuery,
, iPad,
/demoMRdogood/
[CSS](css)
`appendTo()`,
The current answer gives false positives (excludes from the spellcheck) when the lowercase words are delimited by a special character. Here are the examples:
(async)
leetcode, eulerproject,
The bounty is for the person who fixes this problem.

You can try this command:
:syn match myExCapitalWords +\<[A-Z]\w*\>+ contains=@NoSpell
The above command instructs Vim to handle every pattern described by \<[A-Z]\w*\> as part of the @NoSpell cluster. Items of the @NoSpell cluster aren’t spell checked.
If you further want to exclude all words from spell checking that contain at least one non-alphabetic character you can invoke the following command:
:syn match myExNonWords +\<\p*[^A-Za-z \t]\p*\>+ contains=@NoSpell
Type :h spell-syntax for more information.

Here is the solution that worked for me. This passes the cases I mentioned in the question:
syn match myExCapitalWords +\<\w*[A-Z]\K*\>+ contains=@NoSpell
Here is an alternative solution that uses \S instead of \K. Because \S matches any non-whitespace character, this version also excludes characters that are in parentheses and are preceded by a capital letter. Since it is more lenient, it works better for URLs:
syn match myExCapitalWords +\<\w*[A-Z]\S*\>+ contains=@NoSpell
Exclude "'s" from the spellcheck
The s after an apostrophe is still flagged as misspelled regardless of the solution above. A quick fix is to add s to your dictionary, or to add a case for it:
syn match myExCapitalWords +\<\w*[A-Z]\K*\>\|'s+ contains=@NoSpell
This was not part of the question, but it is a common case in spell checking, so I mention it here.
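To keep the rule across sessions, one option is to re-apply it from the vimrc whenever a syntax is loaded. This is only a sketch: the augroup name NoSpellCapitals is made up, and it assumes the block is placed after `syntax on`, so that it runs after the filetype's own syntax file. In buffers with rich syntax highlighting the match only applies at the top level, so it may need further tuning there.

```vim
" Sketch only: NoSpellCapitals is a hypothetical group name.
" Assumes this block comes after `syntax on` in the vimrc.
augroup NoSpellCapitals
  autocmd!
  " Re-add the rule each time a syntax is (re)loaded for a buffer.
  autocmd Syntax * syn match myExCapitalWords +\<\w*[A-Z]\K*\>+ contains=@NoSpell
augroup END
```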

Related

How to match specific letters in words

I'm currently learning Russian, and there is one caveat in the encoding of Cyrillic letters: Some look exactly like ASCII. Example. The word »облако« (cloud) does neither contain an »a« nor an »o« but instead, it contains a »а« and a »о«. If you're not getting it yet, try to fire up your browsers search dialog, enter an »a« or an »o«, use some highlight-all functionality, and you will see, that »а« and »о« both remain dark.
So, now I want to highlight this problem in Vim. Since I'm using mixed-language text files, I can't just highlight every ASCII letter (that would be easy). Instead, I want all ASCII letters in all words that contain at least one Cyrillic letter to be error-highlighted. My current approach is to use these matches:
" Here, I use бакло as a shortcut for the list of all cyrillic letters,
" this makes this a small self contained example for the word used in the
" problem description, without having the full list in all lines.
" To get the file I actually have, run
" :%s/бакло/ЖжФфЭэЗзЧчБбАаДдВЬвьЪъЫыСсЕеёНнЮюІіКкМмИиЙйПпЛлОоРрЯяГгТтЦцШшЩщХхУу/g
syn match russianWordOk "[бакло]\+"
syn match russianWordError "[бакло][a-zA-Z0-9_]\+"hs=s+1
syn match russianWordError "[a-zA-Z0-9_]\+[бакло]"he=e-1
syn match russianWordError "[бакло][a-zA-Z0-9_]\+[бакло]"hs=s+1,he=e-1
However, like in »облaко« (now a is ASCII), the highlighting would still mark »обл« as valid, »a« as invalid, »к« as not being part of a keyword (it is part of the matching russianWordError keyword) and finally the remaining »о« as valid again. What I want instead is to have the entire word being part of the matching russianWordError keyword but still only the »a« being highlighted as illegal. Is there a way and if yes, how do I accomplish that?
In order to only match whole words, not fragments inside other words, wrap your patterns in \< and \>. These assertions will then be based on Vim's 'iskeyword' setting, and should be fine. (Alternatively, you can do other lookbehind and lookahead assertions via \@<= and \@=.)
syn match russianWordOk "\<[бакло]\+\>"
I would approach the highlighting of the wrong ASCII character not via hs= / he=, but via a contained group. First, identify bad mixed words. There has to be at least one Cyrillic letter, either at the beginning or at the end. The rest is at least one ASCII character, potentially with other Cyrillic letters in between. The \%(...\) group is repeated with \+; otherwise you would only match words with a single error:
syn match russianWordBad "\<\%([бакло]*[a-zA-Z0-9_]\)\+[бакло]\+\>" contains=russianWordError
syn match russianWordBad "\<[бакло]\+\%([a-zA-Z0-9_][бакло]*\)\+\>" contains=russianWordError
This contains the ASCII syntax group that does the error highlighting. Because of contained, it only matches inside another group (here: russianWordBad).
syn match russianWordError "[a-zA-Z0-9_]" contained
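The groups above only become visible once they are linked to a highlight group. The answer does not say which one to use, so the following is an assumption: it links the contained group to Vim's standard Error group and leaves russianWordBad and russianWordOk without a colour of their own.

```vim
" Assumption: Error is used as the target group; the answer does not name one.
" `default` lets a colorscheme or the user override the link later.
highlight default link russianWordError Error
```

With this link, only the ASCII characters inside a russianWordBad match are drawn as errors, while the rest of the word keeps its normal appearance.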

Vim: Match character with a word on the left?

I want to match the following ... in Vim:
Um...yeah.
No...
I know in most programming languages you can do this:
/\b\.\.\./g
How about Vim? How can I match the ... in this situation?
You can try the following one:
/\v\w\zs[.]{3}
It matches three dots that follow any word character. The \zs sets the start of the match, so the word character itself is not part of it. You can add a quantifier after the \w to require a minimum number of letters, but I hope you get the idea.
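A quick way to try the pattern outside of a search is substitute(). This is just a sketch: the variable name pat and the replacement character … are chosen for illustration and are not part of the answer.

```vim
" Pattern from the answer: three dots right after a word character.
let pat = '\v\w\zs[.]{3}'
echo substitute('Um...yeah.', pat, '…', 'g')   " Um…yeah.
echo substitute('No...', pat, '…', 'g')        " No…
echo substitute('... alone', pat, '…', 'g')    " ... alone (no word character before the dots)
```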

What vim pattern matches a number which ends with dot?

In PDP11/40 assembly language, a number that ends with a dot is interpreted as a decimal number.
I use the following pattern but fail to match that notation, for example, 8.:
syn match asmpdp11DecNumber /\<[0-9]\+\.\>/
When I replace \. with D the pattern can match 8D without any problem. Could anyone tell me what is wrong with my "end-with-dot" pattern? Thanks.
Your regular expression syntax is fine (well, you can use \d instead of [0-9]), but your 'iskeyword' value does not include the period ., so you cannot match the end-of-word (\>) after it.
It looks like you're writing a syntax for a custom filetype. One option is to
:setlocal iskeyword+=.
in a corresponding ~/.vim/ftplugin/asmpdp11.vim filetype plugin. Do this when the period character is considered a keyword character in your syntax.
Otherwise, drop the \> to make the regular expression match. If you want to ensure that there's no non-whitespace character after the period, you can assert that condition after the match, e.g. like this:
:syn match asmpdp11DecNumber /\<\d\+\.\S\@!/
Note that a word is defined by vim as:
A word consists of a sequence of letters, digits and underscores, or a
sequence of other non-blank characters, separated with white space
(spaces, tabs, <EOL>). This can be changed with the 'iskeyword'
option. An empty line is also considered to be a word.
so your pattern works fine if whitespace follows the number. You may want to skip the \>.
I think the problem is your end-of-word boundary marker. Try this:
syn match asmpdp11DecNumber /\<[0-9]\+\./
Note that I have removed the \> end-of-word boundary. I'm not sure what that was in there for, but it appears to work if you remove it. A . is not considered part of a word, which is why your version fails.
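The behaviour described in these answers can be checked with match(), which returns the index of the first match or -1. This sketch assumes the default 'iskeyword' value, in which the period is not a keyword character; the test strings are made up.

```vim
" Assumes the default 'iskeyword', which does not contain '.'.
echo match('8.', '\<\d\+\.\>')      " -1: \> cannot match right after the period
echo match('8.', '\<\d\+\.')        " 0: matches once \> is dropped
echo match('8. ', '\<\d\+\.\S\@!')  " 0: the period is followed by whitespace
echo match('8.5', '\<\d\+\.\S\@!')  " -1: a non-blank character follows the period
```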

searching whole word in Vim (dash character)

I know that for searching a whole word I should use /\<mypattern\>. But this does not work for the dash (U+002D) character, and /\<-\> always fails. I also tried /\<\%d45\> and it fails too. Does anyone know the reason?
Edit2: As @bobbogo mentioned, the dash is not in 'iskeyword', so I added :set isk+=- and /\<-\> works!
Edit1: I think in Vim /\<word\> is only valid for alphanumeric characters and we shouldn't use it for punctuation characters (see Edit2). I should change my question and ask how to search for a punctuation character as a whole word. For example, I want my search to find the question mark in "a ? b", while patterns like "??" and "abc?" shouldn't match.
\< matches the zero-width boundary between a non-word character and a word character. What is a word character? It's specified by the isk option (:help isk).
Since - is not in your isk option, then - can never start a word, thus \<- will never match.
I don't know what you want, but /\>-\< will match the dash in hello-word.
Could always search for the regex \byourwordhere\b
As the OP said, to treat the dash - as part of a word when searching, just execute:
:set isk+=-
That's all.
Example: When you press * with the cursor on the letter c of color-primary, it will search for the entire variable name, not just for color.
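Since :set isk+=- changes how words are recognised everywhere, it may be preferable to limit it to filetypes where the dash really is part of identifiers. A minimal sketch, assuming a vimrc with filetype detection enabled; the augroup name DashInKeywords and the filetype list are placeholders, not something the answers prescribe.

```vim
" Sketch: apply the change per buffer, only for selected filetypes.
" The filetype list (css, scss) is an assumption for illustration.
augroup DashInKeywords
  autocmd!
  autocmd FileType css,scss setlocal iskeyword+=-
augroup END
```

Using setlocal keeps the setting local to the matching buffers, so \< and \> behave as before in all other files.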

Vim Spell option to ignore source code identifiers containing underscore, numbers, etc

Is there any option in the Vim spell checker to ignore words containing an underscore, multiple uppercase letters, a minus, or numbers in a plain text file? I could not find anything in the manuals (7.2) or with a Google search.
You can use the syntax command with the @NoSpell cluster:
syn match myExCapitalWords +\<\w*[_0-9A-Z-]\w*\>+ contains=@NoSpell
You may want to look at :help iskeyword; this option defines what a 'word' is.
You can try using zg when on the bad word - this will add the word to the allowed word list, which should then stop it being highlighted in the future.
Otherwise, I'm not sure whether this is possible using the standard functionality - syntax files can specify regions that should or should not be spell checked but this will not help you, as you are including 'bad words' in comments, which would be spell checked in most cases.

Resources