Is it possible to display characters in a Vim window (that is: on the screen) that are different from the underlying characters in the buffer?
For example, if filetype is set to html, I'd (sometimes) like to see HTML entities replaced by their human-readable character (for example an ä instead of &auml;). Of course, this would entail that the rest of the line after the entity would have to be "shifted" to the left. If this is possible somehow, I'd appreciate any hint in the right direction.
If you're using Vim 7.3 or newer, you can use the conceal feature to do that. For example:
syntax match Entity "&auml;" conceal cchar=ä
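A conceal rule only takes effect when the window's 'conceallevel' option is non-zero, so a slightly fuller sketch might look like the following. The choice of conceallevel=2, the extra entities, and putting this in a file such as ~/.vim/after/syntax/html.vim are assumptions for illustration, not part of the answer above.

```vim
" Sketch: show a few HTML entities as the characters they stand for.
" Assumes 'encoding' is utf-8 and Vim was built with the +conceal feature.

" Level 2 hides concealed text and shows the cchar replacement instead.
setlocal conceallevel=2

" The group name Entity is taken from the answer; any name would do.
syntax match Entity "&auml;" conceal cchar=ä
syntax match Entity "&ouml;" conceal cchar=ö
syntax match Entity "&uuml;" conceal cchar=ü
```

With the default (empty) 'concealcursor', the line the cursor is on is shown unconcealed, which makes editing the entity itself easier. Inside regions that the HTML syntax file restricts with contains=, these top-level matches may not apply without further adjustment.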
Try a plugin:
html_umlaute: replaces German umlauts with their HTML encoding on saving
html_french: view HTML entities as accented characters for French
I use both Vim and Neovim, and I just found that both editors treat Unicode characters (such as Korean characters) as a separate text object from the English alphabet.
For instance, given the text my_str = 'ABC가나다', Vim only removes ABC if I type dw at A.
Of course I could use di' instead, but it gets annoying when using the surround.vim plugin.
Typing ysw[ only surrounds ABC, because Vim recognizes 가나다 as a separate text object even though there is no space between the two.
Is there any solution for this?
Thank you so much.
Is vimgrep capable of searching unicode strings?
For example:
a.txt contains the wide string "hello", but vimgrep hello *.txt finds nothing, and of course the file is in the right path.
"Unicode" is a bit misleading in this case. What you have is not at all typical of text "encoded in accordance with any of the methods provided by the Unicode standard". It's a bunch of normal characters with normal code points, separated by NULL characters with code point 0000 or 00. Some Java programs do output that kind of garbage.
So, if your search pattern is hello, Vim and :vim are perfectly capable of searching for and finding hello (without NULLs), but they won't ever find hello (with NULLs).
Searching for h^@e^@l^@l^@o (^@ is <C-v><C-@>), on the other hand, will find hello (with NULLs) but not hello (without NULLs).
Anyway, converting that file/buffer, or making sure you don't end up with such garbage in the first place, is a much better long-term solution.
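One way to do that conversion from inside Vim is sketched below. It assumes the file really is little-endian UTF-16 without a BOM and that 'encoding' is already utf-8; both are assumptions to check against your own file.

```vim
" Re-read the current file, forcing Vim to decode it as UTF-16LE
" instead of relying on automatic detection.
:edit ++enc=utf-16le

" Ask Vim to write the buffer as UTF-8 from now on, then save it.
:setlocal fileencoding=utf-8
:write
```

After this, the file on disk no longer contains the interleaved NULL bytes, so a plain hello pattern should match it.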
If Vim can detect the encoding of the file, then yes, Vim can grep the file. :vimgrep works by first reading in the file as normal (even including autocmds) into a hidden buffer, and then searching the buffer.
It looks like your file is little-endian UTF-16, without a byte-order mark (BOM). Vim can detect this, but won't by default.
First, make sure your Vim is running with internal support for Unicode. To do that, put :set encoding=utf-8 at the top of your .vimrc. Next, Vim needs to be able to detect this file's encoding. The 'fileencodings' option controls this.
By default, when you set 'encoding' to utf-8, Vim's 'fileencodings' option contains "ucs-bom", which will detect UTF-16, but ONLY if a BOM is present. To also detect it when no BOM is present, you need to add your desired encoding to 'fileencodings'. It needs to come before any of the 8-bit encodings but after ucs-bom. Try putting this at the top of your .vimrc, then restart Vim:
set encoding=utf-8
set fileencodings=ucs-bom,utf-16le,utf-8,default,latin1
Now loading files with the desired encoding should work just fine for editing, and therefore also for vimgrep.
I can't get "â" to be written. I can write "Â" though (caret + capital A).
Any other accent can be written as in any other text editor.
Any suggestions?
Thank you in advance.
You may want to look at the :digraph command in Vim. It will show you the combinations to use with <C-k> to make accented characters. In your case, you want <C-k> followed by a>.
Note: <C-k> means "Control + k" whereas a> means the letter "a" followed by a ">" (greater than sign).
If you are using a latin keyboard layout and are unable to directly type the accented character, check if there is any mapping using it:
:verbose imap â
If so, just remap the command to another key.
<C-K>a^ works for me in Vim 7.3.
You could use digraphs, as pointed out in other answers. But this kind of diacritical character is very common in some languages. If that is true for you, you could set the keymap option:
:set keymap=accents
The list of characters added by this option can be seen in $VIM\keymap\accents.vim.
That being said, this should work without this option. It is possible that the problem lies in the value of your 'encoding' option, as mentioned here.
First look at digraphs, as mentioned before.
But just to be thorough, and because I haven't seen it mentioned yet, note that any Unicode character at all can be inserted via <C-v>uXXXX<cr> (where XXXX is the hexadecimal code point number of the character). More on this at :help i_^v
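As a small illustration for the character from the question, â is U+00E2. The commands below are a sketch that assumes 'encoding' is set to utf-8.

```vim
" Print the character that belongs to code point 0xE2 (â under utf-8).
:echo nr2char(0xe2)

" Append that character after the cursor, using the same <C-v>u
" sequence you would type by hand in Insert mode.
:execute "normal! a\<C-v>u00e2"
```

To go the other way, placing the cursor on a character and pressing ga in Normal mode shows its code point.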
For a list of code point values for different characters, try:
Or use a handy Perl script called unum, which lets you search characters by name, and other fun stuff.
I have a text file with Polish characters. Unless I run :set encoding=utf-8, the characters are not displayed correctly. As soon as I set it to Unicode, the characters are displayed, but umlauts in Vim's error messages are no longer displayed.
E37: Kein Schreibvorgang seit der letzten <c4>nderung (erzwinge mit !)
Instead of the <c4>, the character Ä should be displayed. Can anybody explain to me why this happens?
I'm experiencing similar issues (you can view some of the questions in my account info, or search for "central european characters" or "croatian characters").
Changing the encoding value changes the way Vim displays characters, so the way some of the characters are displayed changes. That's why you're getting the <c4> characters. You could probably solve your problem with Polish characters by choosing some other encoding value (one of the cpXXXX values, for example, instead of utf8), but then you would lose the ability to display utf8 characters, which can make Vim rather pretty. At least this works in my case (Croatian).
So either use one of the cpXXXX encoding values while writing Polish text, or stick to utf8 completely. I recommend the first option. But do not switch between them.
Still working on that here.
I want to create a lab write-up with LaTeX in Ubuntu. However, my text includes Scandinavian characters, and at present I have to type them in using \"a and \"o etc. Is it possible to get the LaTeX compiler to read these special characters when they are typed in as is? Additionally, I would like Vim to "read" Finnish: right now, when I open a .tex document containing Scandinavian characters, they are not displayed at all in Vim. How can I correct this?
For LaTeX, load the inputenc package with the encoding of your source file as its option:
\usepackage[utf8]{inputenc}
Instead of utf8, you may use whatever else fits you, like latin1, as well.
Now the trick is to make your terminal run the same character encoding. It seems that it runs a character/input encoding that doesn't fit your input right now.
For this, refer to the "Locale" settings of your distribution. You can always check the locale settings in the terminal by issuing locale. These days, UTF-8 locales are preferred, as they work with every character imaginable. If your terminal's environment is set up correctly, Vim should happily work with all your special characters without complaint.
To find out in which encoding Vim thinks the document is, try:
:set enc
To set the encoding to UTF-8, try:
:set enc=utf8
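Note that 'encoding' (enc) is the encoding Vim uses internally, while the encoding detected for a particular file lives in 'fileencoding'. A quick way to inspect both, offered as a sketch and not as part of the answer above:

```vim
" Vim's internal encoding (the option the answer refers to as enc).
:set encoding?

" The encoding Vim detected for the file in the current buffer.
:set fileencoding?

" The ordered list of encodings Vim tries when it opens a file.
:set fileencodings?
```

If 'fileencoding' shows something other than what the file really uses, the characters will look wrong even when 'encoding' is utf-8.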
I can't help with Vim, but for LaTeX I recommend you check out XeTeX, an extension of TeX designed to support Unicode input. XeTeX is now part of TeX Live, so if you have TeX installed, chances are you already have it.
I use the UCS unicode support: