How to completely turn off Vim's ability to recode files?

Every encodings related question I've found is about how to recode files.
However, mine is quite the contrary: is it possible to make Vim not recode files at all (and if so, how)?
Sometimes it writes [converted] in the status line, and it always gets the conversion wrong. I have my terminal set to the same encoding as the edited file, so I don't need any recoding at all.

Use
vim -b "myfile.type"
to edit in binary mode. You can also set
:set binary
or if you're lazy like me
:se bin
before editing a file in vim (applies to the current buffer)
:he 'binary'
*'binary'* *'bin'* *'nobinary'* *'nobin'*
'binary' 'bin' boolean (default off)
local to buffer
{not in Vi}
This option should be set before editing a binary file. You can also
use the |-b| Vim argument. When this option is switched on a few
options will be changed (also when it already was on):
'textwidth' will be set to 0
'wrapmargin' will be set to 0
'modeline' will be off
'expandtab' will be off
Also, 'fileformat' and 'fileformats' options will not be used, the
file is read and written like 'fileformat' was "unix" (a single <NL>
separates lines).
The 'fileencoding' and 'fileencodings' options will not be used, the
file is read without conversion.
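In practice, "read without conversion" means bytes round-trip untouched. A rough Python sketch of the difference between binary and ordinary text handling (the file name is made up):

```python
# Sketch: what "no conversion" means. With 'binary' set, Vim reads and
# writes raw bytes; ordinary text handling may transcode characters and
# normalize line endings.
raw = b"caf\xe9\r\nline2\r\n"           # latin1 text with DOS line endings

with open("sample.bin", "wb") as f:     # hypothetical file name
    f.write(raw)

# "binary" behaviour: bytes in == bytes out
with open("sample.bin", "rb") as f:
    assert f.read() == raw

# text-mode handling rewrites the data: \r\n becomes \n on read
with open("sample.bin", "r", encoding="latin-1") as f:
    text = f.read()
assert "\r" not in text
```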

If you are editing binary files, you want what sehe suggests: 'binary'.
If you're editing text files, then you probably need to change fileencodings, or if the problem is that vim hasn't detected the terminal encoding correctly, encoding.

Related

gVim incorrectly guesses file encoding

I'm running gVim 8.2 with the default configuration on Windows 7 with the Russian language (so all the system text and menu items are in Russian). When I open a UTF-8 file with Russian text in it, it's displayed incorrectly, interpreted as cp1251 for some reason:
:set encoding?
encoding=cp1251
manually setting :set encoding=utf8 fixes it.
Other encoding-related options have the following values:
:set fileencoding?
fileencoding=
:set fileencodings?
fileencodings=ucs-bom
I find the Vim help confusing here, because it doesn't seem to explain how it guesses the encoding. For some reason, other applications I tried (Notepad++, Sublime Text 4, even Windows Notepad) guess the file encoding correctly. As I mentioned at the beginning, I run gVim with the default configuration, so there's no custom vimrc anywhere:
:echo $MYVIMRC
D:\Program Files (x86)\Vim\_vimrc
What would be the correct way to fix this problem?
Create a vimrc with set encoding=utf-8 in it. This should be the default in newer versions of Vim on Windows, as can be seen from :help 'encoding'.
'encoding' 'enc' string (default for MS-Windows: "utf-8",
otherwise: value from $LANG or "latin1")
The default value used to be latin1 on Windows but it was changed to utf-8 recently.
This should be enough to solve your issue.
Again from :help 'encoding':
Sets the character encoding used inside Vim. It applies to text in
the buffers, registers, Strings in expressions, text stored in the
viminfo file, etc. It sets the kind of characters which Vim can work
with.
Vim uses fileencodings (plural) to try and guess the encoding of your file. fileencoding (singular) is the encoding that Vim guessed (or that you've set) for your file. You probably don't need to change either of these.
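The underlying problem is purely one of interpretation: the UTF-8 bytes on disk are fine, but decoding them with cp1251 produces mojibake. A small Python sketch of exactly this mismatch:

```python
# Sketch: why a UTF-8 file "displays in cp1251". The bytes on disk are
# fine; they are just being *interpreted* with the wrong codec.
text = "привет"                          # Russian for "hello"
data = text.encode("utf-8")             # what is actually on disk

mojibake = data.decode("cp1251")        # what you see with encoding=cp1251
assert mojibake != text                 # garbled two-characters-per-letter

assert data.decode("utf-8") == text     # correct with encoding=utf-8
```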

View ^M as newline in vim without persistent file conversion

When editing a DOS-format file from a codebase mainly used with Visual Studio on Windows, on my local Ubuntu machine with Vim I see the ^M character at the end of lines instead of a newline.
According to the VIM documentation, this represents the carriage return character.
Further complicating the problem is that this only occurs in certain places in the file, so the newlines don't seem to have a consistent format.
By default, VIM recognizes the file as dos file-format, which I see by executing :set ff?.
My goal is to edit the file without breaking its platform conformity; I don't want to persistently convert the file just because I'm editing it in Vim. Hence, the existing answer doesn't solve my problem. This answer doesn't, either.
Given this requirement, can I get VIM to just display all ^M's as newlines via some syntax highlighting setting?
Note that ^M isn't composed of plain characters. If you wanted to insert one manually in Vim, you'd have to press Ctrl-V before typing it.
this only occurs in certain places in the file
There's your problem. If CR-LF were used consistently for all line endings, Vim would (with dos in 'fileformats') correctly detect Windows-style line endings, and handle them transparently (i.e. without showing ^M).
As it is not, Vim detects the file as unix, and displays the parts with Windows line endings with a trailing ^M. You could use the conceal feature to hide them:
:syntax match hideControlM "\r$" conceal
:set conceallevel=2 concealcursor=nv
But if you want to maintain the original structure as much as possible, you need to be able to view them, and add them manually (via <C-V><C-M>) if you add lines inside those Windows-style areas.
Many people would argue that such inconsistent line-ending style is a bug and should be converted to an agreed-upon, consistent line-ending style. Most version control systems (I hope you use one) have automatic conversion features that make it easy to achieve interoperability with both Windows and Unix users.
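The detection rule the answer describes can be sketched in a few lines of Python: the file only counts as DOS-format if the endings are consistent, and once it falls back to unix, each stray CR stays in the buffer (where Vim renders it as ^M):

```python
# Sketch: why Vim treats a file with *mixed* endings as fileformat=unix.
# One bare LF is enough to rule out "dos"; the remaining CRs then show
# up in the buffer as ^M.
data = b"windows line\r\nunix line\nanother windows line\r\n"

lines = data.split(b"\n")[:-1]          # split on LF, drop trailing empty
all_crlf = all(l.endswith(b"\r") for l in lines)
assert not all_crlf                     # mixed -> treated as unix

# in unix mode, the CRs stay in the buffer and are rendered as ^M
stray_crs = sum(l.endswith(b"\r") for l in lines)
assert stray_crs == 2
```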

Is it possible to automatically set UTF16 file encoding when opening a file of that type?

I edit all kinds of files with Vim (as I'm sure most Vim users do). One bugbear I have is what Vim does when I come across a file with an odd encoding. Most editors (these days) make a good stab at detecting file encodings. However, Vim generally doesn't, and you have to type, for example:
:e ++enc=utf-16le
to re-read the file as UTF-16 (otherwise you get a mass of # signs).
I've been searching around and have seen scripts like set_utf8.vim which can detect a specific file encoding. However, is there a more general solution? I'm a bit bored of having to manually work out what the file encoding is and consult the help every time I open an unusual file.
Adding the encoding name to 'fileencodings' should do the trick:
:set fencs=ucs-bom,utf-16le,utf-8,default,latin1
Alternatively, there are plugins like AutoFenc and fencview.
Add this code to your .vimrc:
if has("multi_byte")
  if &termencoding == ""
    let &termencoding = &encoding
  endif
  set encoding=utf-8
  setglobal fileencoding=utf-8
  "setglobal bomb
  set fileencodings=ucs-bom,utf-8,latin1
endif
Do you have a byte order mark (BOM)? Vim should detect this and handle it appropriately. From the docs, section 45.4:
When you start editing that 16-bit Unicode file, and it has a BOM, Vim
will detect this and convert the file to utf-8 when reading it. The
'fileencoding' option (without s at the end) is set to the detected
value. In this case it is "utf-16le". That means it's Unicode,
16-bit and little-endian. This file format is common on MS-Windows
(e.g., for registry files).
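The BOM the help text refers to is just a two-byte prefix at the start of the file. A short Python sketch of what it looks like and why it tells the reader the byte order:

```python
# Sketch: the byte-order mark Vim looks for at the start of the file.
# Python's "utf-16" codec prepends a BOM in native byte order.
data = "hi".encode("utf-16")
assert data.startswith(b"\xff\xfe") or data.startswith(b"\xfe\xff")

# A utf-16le file with an explicit BOM, as Windows tools often write it:
bom_file = b"\xff\xfe" + "hi".encode("utf-16-le")
assert bom_file.decode("utf-16") == "hi"   # the BOM fixes the byte order
```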

Vim and ASCII extended characters?

I would like to know how I can set Vim 7.0 to show and work with extended ASCII characters without problems.
Vim (which is what vi resolves to on most systems) readily supports extended character sets. You might need to tell Vim which encoding to use, though.
This is controlled by two options:
:set encoding
:set fileencoding
If you have loaded a file that displays incorrectly, you may use :set encoding=<new encoding> to force the appropriate encoding. This changes the interpretation of the characters on the fly. If you want to save the file in another encoding, preserving the current interpretation of the characters, use :set fileencoding=<new encoding> to let Vim save the file in that encoding.
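The distinction is between fixing the interpretation of existing bytes and transcoding to new bytes. What :set fileencoding + :w does, conceptually, in a short Python sketch:

```python
# Sketch: "save in another encoding" = keep the characters, change the
# bytes used to store them.
text = "café"                           # the interpretation stays fixed

latin1_bytes = text.encode("latin-1")   # one byte for é: 0xE9
utf8_bytes = text.encode("utf-8")       # two bytes for é: 0xC3 0xA9

assert latin1_bytes != utf8_bytes       # different on-disk representation
assert latin1_bytes.decode("latin-1") == utf8_bytes.decode("utf-8") == text
```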
I recommend that you set utf-8 as the default encoding in your .vimrc.
Once the characters are "extended", it's not ASCII any more.
However: Just use vim. ":help unicode" for more details.
The other solutions here didn't work for me. Vim told me that encoding and fileencoding were not supported options. That turned out to be because I was building from source myself, and I did not include the multi-byte feature. My two Macs are similar, but one of them enabled it by default while the other did not.
If you're building Vim from source like I was, include --enable-multibyte in your arguments to ./configure. In my case, Vim defaulted to UTF-8 and supported extended characters after that.
I would suggest you try the following:
set the terminal to utf-8 (how to do that depends on your terminal; in PuTTY it's in the Window/Translation menu)
set your locale to utf-8 (how to do that depends on your OS; on my Debian box it's set LC_ALL=en_GB.UTF-8 for the current session and sudo dpkg-reconfigure locales for permanent system-wide changes) -- you can check your current locale with locale.
That's how it works for me (using VIM 7.1.314 and no .vimrc).

How can I make vim recognize the file's encoding?

I noticed that most of the time, when a file uses some encoding other than standard English ASCII, Vim doesn't recognize it and does not display the characters correctly.
This is most easily seen by opening some ASCII art, or similar files off the net, which use the cp437 code page.
Is there a way to make vim check for encoding when opening a file, and open it with a correct one ?
What encodings do you use as the most "portable" ones (the ones with which the largest number of users will have the fewest problems)?
Vim needs to detect the encoding, and that's going to be problematic, since files don't often explicitly state their encodings (an obvious exception are XML files with an encoding attribute in the header).
You can force Vim to reload a file with a different encoding thus:
:e ++enc=cp437
and you can set the default encoding in your .vimrc if you wish.
This page has more info and links, especially wrt. editing Unicode. UTF-8 is the most widely-used encoding, and the default you should probably go for.
You can use a Vim modeline to set the file's encoding. This is simply a comment, in the first five lines of the file, that starts with vi: set fileencoding=cp437 (note that with the set form, the modeline must end with a colon).
You could also start with 'vim:' instead of 'vi: set', but the latter makes it compatible with more editors. You definitely need the space between either of these prefixes and 'fileencoding', or whatever option you want to set. The fileencoding option should solve your problem, though.
So, in Python or an .rc file, you can put this at the top of your file:
# vi: set fileencoding=cp437 :
In Java, C, C++, JavaScript, etc. put this:
// vi: set fileencoding=cp437 :
For more information, in vim, type :help modeline.
You can set the variable 'fileencodings' in your .vimrc.
This is a list of character encodings considered when starting to edit
an existing file. When a file is read, Vim tries to use the first
mentioned character encoding. If an error is detected, the next one
in the list is tried. When an encoding is found that works,
'fileencoding' is set to it. If all fail, 'fileencoding' is set to
an empty string, which means the value of 'encoding' is used.
See :help 'fileencodings'.
If you often work with e.g. cp437 or cp1252, you can add it there:
set fileencodings=ucs-bom,utf-8,cp1252,default,latin9
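The order of that list matters: Vim takes the first entry that decodes the file without error. A rough Python model of the loop (the function name is made up):

```python
def guess_fileencoding(data: bytes, fencs=("utf-8", "cp1252")) -> str:
    """Rough model of 'fileencodings': first codec that decodes wins.

    Caveat: 8-bit codecs like cp1252 or latin1 accept almost any byte
    sequence, so they must come *after* utf-8 in the list.
    """
    for enc in fencs:
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return ""  # like Vim: empty 'fileencoding', fall back to 'encoding'

assert guess_fileencoding("héllo".encode("utf-8")) == "utf-8"
assert guess_fileencoding(b"h\xe9llo") == "cp1252"   # invalid UTF-8 -> next
```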
You can encode your files using unicode, and set a Byte Order Mark (BOM) in the file. This will make vim treat it appropriately, but some compilers and programs may have trouble with it. Even basic shell commands like cat may misbehave for some use cases.
To do it, type this in vim:
:set fileencoding=utf-8
:set bomb
:w
For more information, type:
:help bomb
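What those three commands put on disk, and why BOM-unaware tools may stumble on it, can be sketched in Python:

```python
# Sketch: what ":set fileencoding=utf-8 | set bomb | w" writes, using
# Python's utf-8-sig codec (UTF-8 with a BOM prefix) as a stand-in.
text = "hello"
with_bom = text.encode("utf-8-sig")     # BOM + UTF-8 payload
assert with_bom == b"\xef\xbb\xbf" + text.encode("utf-8")

# A naive byte comparison no longer matches the plain-text content,
# which is why byte-oriented tools like cat or some compilers choke:
assert with_bom != text.encode("utf-8")
assert with_bom.decode("utf-8-sig") == text   # BOM-aware readers strip it
```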
Add this at the end of your .vimrc (you can replace "utf-8" with "latin1" if needed):
if len(&fenc) == 0
  silent! exe "e! ++enc=utf-8"
endif