Vim syntax highlighting and certain characters - vim

I am attempting to write a syntax file for Vim.
One of the lines of code reads
syn match constant "\**\*"
while one of many other lines reads
syn keyword aiOperators up-build
The code for highlighting is the following:
hi constant gui=bold
hi aiOperators guifg=green
However, the result of the above is that only the following is highlighted:
The asterisks of every constant, but not the characters between them.
Characters up until the first hyphen of aiOperators.
What seems to be the issue?

The regular expression for your constant specifies a literal asterisk, zero or more times, followed by a literal asterisk. If you intend to match characters delimited by asterisks, you need something like \*\w\+\*: a literal asterisk, followed by one or more word characters, followed by a literal asterisk.
The :syn keyword only works for keyword characters; by default, the hyphen is not included, so the match stops there. If, for your filetype, the hyphen belongs to the set of keyword characters, use
:setlocal iskeyword+=-
This should not be placed into the syntax file itself, but into ~/.vim/ftplugin/myfiletype.vim. Otherwise, use :syn match.
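Putting both points together, a minimal sketch of the syntax file might look like the following. It assumes that constants consist only of word characters between the asterisks and that the hyphen is not made a keyword character; the file name is hypothetical and follows the myfiletype placeholder used above.

```vim
" ~/.vim/syntax/myfiletype.vim (hypothetical file name)

" A constant: literal *, one or more word characters, literal *
syn match constant "\*\w\+\*"

" up-build contains a hyphen, which is not a keyword character by default,
" so it is matched as a pattern instead of with :syn keyword
syn match aiOperators "\<up-build\>"

hi constant gui=bold
hi aiOperators guifg=green
```

If the hyphen is added to 'iskeyword' in the ftplugin instead, the original :syn keyword line can stay as it is.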

Related

how to remove specific characters in vi or vim editor

I have some txt in vi:
|NC_004718|29751nt|SARS
|NC_045512|29903nt|Severe
|NC_004718|29751nt|SARS
|NC_045512|29903nt|Severe
|NC_004718|29751nt|SARS
Now I want to remove everything except the identifier (e.g. NC_004718) on each line. My expected output is:
NC_004718
NC_045512
NC_004718
NC_045512
NC_004718
How to do it? Thanks.
I would recommend using a substitution with regular expression to match the entire string and to capture what you would like to keep in parentheses. That way you can then replace the entire string with just the match.
:%s/^|\([^|]\+\)|.\+/\1/
To break down what is happening:
% means that you want to apply the command to each line within the file.
s means that you are running the substitute command (on each line). The s command has the syntax s/<regular expression pattern>/<replacement>/<flags>
The regular expression pattern in the above command is ^|\([^|]\+\)|.\+.
^ means match from the line start.
| matches the character |.
\([^|]\+\) matches one or more characters other than |. Note that in most other regex flavors this would be written ([^|]+); the additional \ characters are there because, by default, Vim treats ( ) and + as literal characters unless they are escaped. The parentheses capture the match into a group (see below).
| again matches the actual character |.
.\+ matches all remaining characters up to the end of the line. Note that . is a special character by default, but + still needs a preceding \.
The replacement text is only \1. This denotes that Vim should replace the text with whatever was captured in the first group (i.e. the first set of parentheses).
There are no flags with this command so there is nothing after the last /.
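As a variation on the command above, the same substitution can be written in "very magic" mode (\v), which removes most of the backslashes. This is only a sketch of an equivalent form, not part of the original answer. Note that in this mode a bare | means alternation, so the literal pipes outside the [] collection have to be escaped instead.

```vim
" Same substitution with \v ("very magic"): ( ) and + are special
" without backslashes, while a literal pipe outside [] is written \|
:%s/\v^\|([^|]+)\|.+/\1/
```

On a line such as |NC_004718|29751nt|SARS this leaves only NC_004718, the same as the original command.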
Another approach combines :g with Normal-mode commands. For example,
:g/NC_\d\+/normal! ygnV]p
:g/regex/ to match lines
normal! to execute Normal mode commands
ygn to yank the text previously matched by :g
V to select the whole line
]p or p to replace the line with the match
If you have only lines like those you have shown try:
:%norm xf|D
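The Normal-mode keys in that last command can be read as follows. The sketch uses normal! rather than norm so that user mappings are not applied; this is a small change from the command as posted.

```vim
" For every line (%), starting with the cursor in the first column:
"   x   delete the character under the cursor (the leading |)
"   f|  move to the next | on the line
"   D   delete from the cursor to the end of the line
:%normal! xf|D
```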

How to match specific letters in words

I'm currently learning Russian, and there is one caveat in the encoding of Cyrillic letters: some look exactly like ASCII letters. For example, the word »облако« (cloud) contains neither an »a« nor an »o«; instead, it contains a »а« and a »о«. If that is not clear yet, open your browser's search dialog, enter an »a« or an »o«, turn on highlight-all, and you will see that »а« and »о« both remain dark.
So, now I want to highlight this problem in vim. Since I'm using mixed language text files, I can't just highlight every ASCII letter (that would be easy). Instead, I want all ASCII letters in all words that contain at least one Cyrillic letter to be error-highlighted. My current approach is to use these matches:
" Here, I use бакло as a shortcut for the list of all cyrillic letters,
" this makes this a small self contained example for the word used in the
" problem description, without having the full list in all lines.
" To get the file I actually have, run
" :%s/бакло/ЖжФфЭэЗзЧчБбАаДдВЬвьЪъЫыСсЕеёНнЮюІіКкМмИиЙйПпЛлОоРрЯяГгТтЦцШшЩщХхУу/g
syn match russianWordOk "[бакло]\+"
syn match russianWordError "[бакло][a-zA-Z0-9_]\+"hs=s+1
syn match russianWordError "[a-zA-Z0-9_]\+[бакло]"he=e-1
syn match russianWordError "[бакло][a-zA-Z0-9_]\+[бакло]"hs=s+1,he=e-1
However, like in »облaко« (now a is ASCII), the highlighting would still mark »обл« as valid, »a« as invalid, »к« as not being part of a keyword (it is part of the matching russianWordError keyword) and finally the remaining »о« as valid again. What I want instead is to have the entire word being part of the matching russianWordError keyword but still only the »a« being highlighted as illegal. Is there a way and if yes, how do I accomplish that?
In order to only match whole words, not fragments inside other words, wrap your patterns in \< and \>. These assertions will then be based on Vim's 'iskeyword' setting, and should be fine. (Alternatively, you can do other lookbehind and lookahead assertions via \@<= and \@=.)
syn match russianWordOk "\<[бакло]\+\>"
I would approach the highlighting of the wrong ASCII characters not via hs= / he=, but via a contained group. First, identify bad mixed words. There has to be at least one Cyrillic letter, either at the beginning or at the end. The rest consists of at least one ASCII character, potentially with other Cyrillic letters in between. The \%(...\) group is repeated with \+; otherwise you would only match words with a single error:
syn match russianWordBad "\<\%([бакло]*[a-zA-Z0-9_]\)\+[бакло]\+\>" contains=russianWordError
syn match russianWordBad "\<[бакло]\+\%([a-zA-Z0-9_][бакло]*\)\+\>" contains=russianWordError
This contains the ASCII syntax group that does the error highlighting. Because of contained, it only matches inside another group (here: russianWordBad).
syn match russianWordError "[a-zA-Z0-9_]" contained
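Assembled into one fragment, the rules above might be arranged as in the sketch below. As in the question, бакло stands in for the full list of Cyrillic letters. The final highlight link to the standard Error group is an assumption, since the answer does not show how the groups are coloured.

```vim
" A word made only of Cyrillic letters
syn match russianWordOk "\<[бакло]\+\>"

" A mixed word: Cyrillic at one end, at least one ASCII character somewhere
syn match russianWordBad "\<\%([бакло]*[a-zA-Z0-9_]\)\+[бакло]\+\>" contains=russianWordError
syn match russianWordBad "\<[бакло]\+\%([a-zA-Z0-9_][бакло]*\)\+\>" contains=russianWordError

" Each ASCII character, but only inside a mixed word
syn match russianWordError "[a-zA-Z0-9_]" contained

" Assumed highlighting: only the offending characters use the Error group
hi def link russianWordError Error
```

Because russianWordBad itself is not linked to any highlight group here, the whole word belongs to that group while only the ASCII characters are visibly marked.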

How can I use syntax match with literal `*` correctly?

Consider the following vim syntax rules, which I am using to change the color of words surrounded by *.
syntax match boldme /\*.\{-1,}\*/
highlight boldme ctermfg=Red
For some reason, this rule only works if the word is at the beginning of a line: *hello* is red in the first line below but not in the second.
*hello* works
Another word and *hello* does not work.
How can I make syn match work in the middle of a line for the scenario above?
Update: This problem appears to be specific to using the literal * character as part of the match. The following match works fine for using _ instead.
syntax match boldme /_.\+_/
Thus the question is really, how do I force vim to treat a literal * character correctly in syn match?
try this:
syntax match boldme /\*.\+\*/
Update
I don't know how you ran your test; this pattern works for me in a clean Vim started with vim -u NONE.
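One thing to keep in mind with the suggested pattern: .\+ is greedy, so on a line with several starred words it can run from the first asterisk to the last one. A sketch that avoids this, assuming the highlighted text never contains an asterisk itself, excludes * from the body of the match:

```vim
" Literal *, one or more non-asterisk characters, literal *
syntax match boldme /\*[^*]\+\*/
highlight boldme ctermfg=Red
```

Inside a [] collection the asterisk is literal, so it needs no backslash there.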

Highlighting keywords beginning with colon in Vim

I wrote a vim syntax file. I notice that all keywords except those beginning with a colon (:) are being highlighted. Is there any way to escape colons in Vim?
Here's a section of the file:
syn keyword actionLabel :action nextgroup=actionName skipwhite
syn keyword problemLabels :goal :init :domain
syn keyword advLabels :types
syn keyword pondLabels :observe
hi def link actionLabel Statement
hi def link problemLabels Statement
hi def link advLabels Statement
hi def link pondLabels Statement
From :h :syn-define about keywords...
It can only contain keyword characters, according to the
'iskeyword' option. It cannot contain other syntax items. It will
only match with a complete word (there are no keyword characters
before or after the match). The keyword "if" would match in
"if(a=b)", but not in "ifdef x", because "(" is not a keyword
character and "d" is.
That means you'll have to modify iskeyword for your file type to include the colon character (ascii 58). Starting from the vi default, we can support any alphabetic character, number, underscore, or colon:
set iskeyword=@,48-58,_
The best solution seems to be not using :syn keyword, but using :syn match instead.
syn match pddlLabel ':[a-zA-Z0-9]\+'
hi def link pddlLabel Statement
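If the separate groups from the question should be kept, for example to preserve nextgroup on :action, the same idea can be applied per group. This is a sketch only; actionName is assumed to be defined elsewhere in the syntax file, as in the question.

```vim
" :action is followed by the action's name, so keep nextgroup and skipwhite
syn match actionLabel ':action\>' nextgroup=actionName skipwhite

" \%(...\) groups the alternatives; \> stops the match at the end of the word
syn match problemLabels ':\%(goal\|init\|domain\)\>'
syn match advLabels ':types\>'
syn match pondLabels ':observe\>'

hi def link actionLabel Statement
hi def link problemLabels Statement
hi def link advLabels Statement
hi def link pondLabels Statement
```

The trailing \> keeps :init from matching the start of a longer label such as :initial.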

What vim pattern matches a number which ends with dot?

In PDP11/40 assembly language, a number that ends with a dot is interpreted as a decimal number.
I use the following pattern but fail to match that notation, for example, 8.:
syn match asmpdp11DecNumber /\<[0-9]\+\.\>/
When I replace \. with D the pattern can match 8D without any problem. Could anyone tell me what is wrong with my "end-with-dot" pattern? Thanks.
Your regular expression syntax is fine (well, you can use \d instead of [0-9]), but your 'iskeyword' value does not include the period ., so you cannot match the end-of-word (\>) after it.
It looks like you're writing a syntax for a custom filetype. One option is to
:setlocal iskeyword+=.
in a corresponding ~/.vim/ftplugin/asmpdp11.vim filetype plugin. Do this when the period character is considered a keyword character in your syntax.
Otherwise, drop the \> to make the regular expression match. If you want to ensure that there's no non-whitespace character after the period, you can assert that condition after the match, e.g. like this:
:syn match asmpdp11DecNumber /\<\d\+\.\S\@!/
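The two options can be sketched side by side as follows. Only one of them is needed. The link to the standard Number group is an assumption, since the question does not show its highlight commands.

```vim
" Option 1: in ~/.vim/ftplugin/asmpdp11.vim
" Treat . as a keyword character, so that \> can match after the trailing dot
setlocal iskeyword+=.

" Option 2: in the syntax file, leave 'iskeyword' alone and drop \>
" \S\@! asserts that the dot is not followed by a non-whitespace character
syn match asmpdp11DecNumber /\<\d\+\.\S\@!/
hi def link asmpdp11DecNumber Number
```

Option 1 also changes how word motions such as w and * behave in that filetype, which is why it is only appropriate when the period really belongs to words in the language.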
Note that a word is defined by vim as:
A word consists of a sequence of letters, digits and underscores, or a
sequence of other non-blank characters, separated with white space
(spaces, tabs, <EOL>). This can be changed with the 'iskeyword'
option. An empty line is also considered to be a word.
so your pattern works fine if whitespace follows the number. You may want to skip the \>.
I think the problem is your end-of-word boundary marker. Try this:
syn match asmpdp11DecNumber /\<[0-9]\+\./
Note that I have removed the \> end-of-word boundary. I'm not sure what that was in there for, but it appears to work if you remove it. A . is not considered part of a word, which is why your version fails.

Resources