vim buffer Trying char-by-char conversion - vim

This may be a problem with a file saved on Windows and opened on Unix/Linux, and I am not sure how to solve it.
When I open a file that was previously saved by another developer using Windows, my Vim buffer sometimes shows
Trying char-by-char conversion...
in the middle of my file, and I am unable to edit the code/text/characters right below this message in my buffer.
Why does it do that and how do I prevent this from happening?

This message comes from the Vim function mac_string_convert() in src/os_mac_conv.c. It is accompanied by the following comment:
conversion failed for the whole string, but maybe it will work for each character
Seems like the file you're editing contains a byte sequence that cannot be converted to Vim's internal encoding. It's hard to offer help without more details, but often, these help:
Ensure that you have :set encoding=utf-8
Check :set fileencodings? and ensure that the encoding of the file you're trying to open is covered, or explicitly specify an encoding with :edit ++enc=... file
The 8g8 command can find an illegal UTF-8 sequence, so that you can remove it, in case the file is corrupted. Binary mode :set binary / :edit ++bin may also help.

Related

How do I change vims character set of stdin?

echo "UTF-16le text"|vim -
:set encoding=utf-16le
:set fileencoding=utf-16le
:e! ++enc=utf-16le
None of these has any effect on the mojibake displayed on the screen, and the last one (:e! ++enc=utf-16le) fails with the error E32: No file name.
If I edit ~/.vimrc to set fileencodings=utf-16le,[...] then it works, but I shouldn't have to edit my configuration file every time I use Vim. Is there a better way? Preferably a key mapping that cycles through my 'fileencodings' values, so that I can choose quickly when needed.
The command-line equivalent to ~/.vimrc is passing commands via --cmd. You can also employ the :help :set^= command to prepend a value to an option:
echo "UTF-16le text"|vim --cmd 'set fencs^=utf-16le' -
"I shouldn't have to edit my configuration file every time I use vim"
First, I would test whether permanently keeping utf-16le in 'fileencodings' has any negative consequences for any files you regularly edit; maybe you can safely keep it in by default.
Second, there are plugins like AutoFenc, which extends the built-in detection, and fencview, which lets you choose the encoding from a menu.
Alternative
The problem with UTF-16 encodings is well known, and the byte order mark is one solution to make it easy to detect those. With such a BOM, Vim will correctly detect the encoding out-of-the-box. If your input is missing the BOM, you can manually prepend it:
{ printf '\xFF\xFE'; echo "UTF-16le text"; } | vim -
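Before prepending a BOM, it can help to check whether the input already carries one. The following is only a sketch: the helper name detect_bom is made up for illustration, it looks at just the first three bytes, and it does not distinguish UTF-32 (whose little-endian BOM also begins with FF FE).

```shell
# detect_bom FILE: print the encoding implied by the leading bytes of FILE,
# or "none" when no known byte order mark is found.
# Hypothetical helper for illustration; UTF-32 BOMs are not handled.
detect_bom() {
    # Dump at most the first three bytes as hex and strip the whitespace.
    lead=$(od -An -tx1 -N3 "$1" | tr -d ' \n')
    case "$lead" in
        efbbbf*) echo "utf-8" ;;     # EF BB BF
        fffe*)   echo "utf-16le" ;;  # FF FE
        feff*)   echo "utf-16be" ;;  # FE FF
        *)       echo "none" ;;
    esac
}
```

If this reports "none" for a file known to be UTF-16LE, Vim's ucs-bom entry cannot detect it, and either the BOM has to be added or the encoding has to be listed explicitly.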

Vim long file paths break/split over multiple lines in quickfix window

A long file path is broken up over multiple lines in the Vim quickfix window, which then, for example, makes it impossible to jump to the error location displayed in the quickfix list.
The file (and the lines around it) are displayed in the quickfix window as follows (the example is the output from neomakes pdflatex):
|| Enter file name:
|| /long/path/to/file/.../loca
tionOfTexFiles/myTexFile.tex|144 error| Emergency stop.
|| read
To be able to jump to the file and line with lnext/cnext, I should have
/long/path/to/file/.../locationOfTexFiles/myTexFile.tex|144 error| Emergency stop.
For quickfix files I have the following relevant (in my view) settings which are set to:
setlocal nolinebreak
setlocal nowrap
setlocal textwidth=9999
So I am wondering how I can display the file path in one line within the quickfix window?
On :make, Vim invokes 'makeprg', captures the output, and then parses it according to 'errorformat'. The latter does support multi-line error messages (cf. :help errorformat-multi-line), but that is mostly for what I would call intentional linebreaks, as specified by the compiler. What you suffer from is unintentional linebreaks because of line wrapping (due to overly long paths).
Now, I don't know about "neomakes pdflatex", but it looks like that tool creates the linebreaks, whereas it shouldn't, as Vim is capturing the output, and there's no receiving terminal (or user). Investigating in that direction (or opening an issue at the project's tracker) might be helpful.
The mentioned Vim options ('linebreak', 'wrap', etc.) have nothing to do with it. They apply to normal buffers; the quickfix buffer as such is not modifiable.
Workarounds
A possible workaround might be to :cd first to a directory that is "closer" to the processed files (or even :set autochdir); this might avoid the long paths in the output.
Alternatively, you may "unmangle" the output by adding a sed stage after the compiler:
let &makeprg .= " | sed -e 's/.../.../'"
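As a rough sketch of such a filter stage: if the breaks come from the tool wrapping its output at a fixed width (TeX-based tools commonly wrap at 79 characters, but that is an assumption about this particular setup), lines of exactly that width can be glued to the line that follows. The function name join_wrapped and the default width are illustrative, and a genuine line that happens to be exactly that wide would be merged by mistake.

```shell
# join_wrapped [WIDTH]: read text on stdin and merge every line that is
# exactly WIDTH characters long with the line that follows it.
# Hypothetical helper; WIDTH defaults to 79, an assumed wrap column.
join_wrapped() {
    awk -v width="${1:-79}" '
        { buf = buf $0 }                                 # collect the pieces of one logical line
        length($0) != width { print buf; buf = "" }      # a shorter or longer piece ends the line
        END { if (buf != "") print buf }                 # flush a trailing wrapped piece
    '
}
```

Such a function would sit between the compiler and Vim, for example in a small wrapper script used as 'makeprg'. Note that a plain pipe reports the exit status of the last stage, not of the compiler.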
If I'm not mistaken, the issue is on pdflatex's side. The || mark is a good indication: you'll have one per output line, and when a filename and/or line number is recognized, they are placed in between the bars.
So this means you'll need a way to fix the path names. It would be better to do it outside Vim. I'm not saying this is trivial. I'm just saying that if you have a program able to fix pdflatex's output, you'll be just one pipe away from the solution (plus a correct forwarding of error codes...).
If you prefer to implement it in Vim script, this is possible, but you'll experience side effects. In my BuildToolsWrapper plugin I'm able to post-process compilation output on the Vim side, but the result is far from perfect. I work on the getqflist() result and parse each line. When I find a line where I want to fix the filename, it's not simply about fixing the filename but also about assigning a valid buffer number to it. See this function where I can replace a filename with another one. The magic happens where lh#buffer#get_nr() is used. Still, you'd need to implement a Vim script able to merge split filenames.
IOW: my understanding is that Vim is not involved. It could be used to fix the issue, but IMO this is not the easiest path to take.

(VIM) Is vimgrep capable of searching unicode string

Is vimgrep capable of searching unicode strings?
For example:
a.txt contains the wide string "hello", but vimgrep hello *.txt finds nothing, even though the file is in the right path.
"Unicode" is a bit misleading in this case. What you have is not at all typical of text "encoded in accordance with any of the methods provided by the Unicode standard". It's a series of normal characters with normal code points, separated by NUL characters (code point 0000, or 00). Some Java programs do output that kind of garbage.
So, if your search pattern is hello, Vim and :vim are perfectly capable of searching for and finding hello (without NULs), but they won't ever find hello (with NULs).
Searching for h^@e^@l^@l^@o (^@ is <C-v><C-@>), on the other hand, will find hello (with NULs) but not hello (without NULs).
Anyway, converting that file/buffer, or making sure you don't end up with such garbage in the first place, is a much better long-term solution.
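To check whether a file really is padded with NUL bytes before deciding how to search or convert it, one can count them from the shell. This is a minimal sketch; the helper name count_nul_bytes is made up for illustration.

```shell
# count_nul_bytes FILE: print how many NUL (0x00) bytes FILE contains.
# Hypothetical helper for illustration.
count_nul_bytes() {
    total=$(wc -c < "$1")                      # size of the file in bytes
    kept=$(tr -d '\000' < "$1" | wc -c)        # size after deleting every NUL
    echo $(( $total - $kept ))
}
```

A count close to half the file size suggests that every ordinary character is followed by a NUL, which is the pattern described above.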
If Vim can detect the encoding of the file, then yes, Vim can grep the file. :vimgrep works by first reading in the file as normal (even including autocmds) into a hidden buffer, and then searching the buffer.
It looks like your file is little-endian UTF-16, without a byte-order mark (BOM). Vim can detect this, but won't by default.
First, make sure your Vim is running with internal support for Unicode. To do that, put set encoding=utf-8 at the top of your .vimrc. Next, Vim needs to be able to detect this file's encoding. The 'fileencodings' option controls this.
By default, when you set 'encoding' to utf-8, Vim's 'fileencodings' option contains "ucs-bom" which will detect UTF-16, but ONLY if a BOM is present. To also detect it when no BOM is present, you need to add your desired encoding to 'fileencodings'. It needs to come before any of the 8-bit encodings but after ucs-bom. Try doing this at the top of your .vimrc and restart Vim to use:
set encoding=utf-8
set fileencodings=ucs-bom,utf-16le,utf-8,default,latin1
Now loading files with the desired encoding should work just fine for editing, and therefore also for vimgrep.

Why would Vim add a new line at the end of a file?

I work with WordPress a lot, and sometimes I change WordPress core files temporarily in order to understand what is going on, especially when debugging. Today I got a little surprise. When I was ready to commit my changes to my git repository, I noticed that git status was marking one of the WordPress files as not staged for commit. I remember I had reverted all the changes I made to that file before closing it, so I decided to use diff to see what had changed. I compared the file in my project with the file in the WordPress copy that I keep in my downloads directory. It turns out the files differ at the very end. diff indicates that there is a newline missing at the end of the original file:
1724c1724
< }
\ No newline at end of file
---
> }
I never even touched that line. The changes I made were somewhere in the middle of a large file. This leads me to think that Vim added a newline character at the end of the file. Why would that happen?
All the answers I've seen here address the question "how could I prevent Vim from adding a newline character at the end of the file?", while the question was "Why would Vim add a new line at the end of a file?". My browser's search engine brought me here, and I didn't find the answer to that question.
It is related to how the POSIX standard defines a line (see Why should files end with a newline?). So, basically, a line is:
3.206 Line
A sequence of zero or more non- <newline> characters plus a terminating <newline> character.
And, therefore, they all need to end with a newline character. That's why Vim always adds a newline by default (because, according to POSIX, it should always be there).
It is not the only editor doing that. Gedit, the default text editor in GNOME, does the same exact thing.
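To see which files in a project lack that terminating newline, and would therefore show up as changed after being saved in Vim with default settings, the last byte can be inspected from the shell. This is a sketch for text files; the helper name ends_with_newline is made up for illustration.

```shell
# ends_with_newline FILE: succeed if FILE is empty or its last byte is a
# newline, fail otherwise. Hypothetical helper, intended for text files.
ends_with_newline() {
    # An empty file has no incomplete last line, so treat it as fine.
    [ -s "$1" ] || return 0
    # Command substitution strips a trailing newline, so the captured
    # value is empty exactly when the last byte is a newline.
    [ -z "$(tail -c 1 "$1")" ]
}
```

For example, ends_with_newline wp-settings.php || echo "no final newline" would flag a file like the one in the question (the file name here is only an example).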
Edit
Many other tools also expect that newline character. See for example:
How wc expects it.
GCC warns about it.
Also, you may be interested in: Vim show newline at the end of file.
Because vim is a text editor, it can sometimes "clean up" files for you. See http://vimhelp.appspot.com/vim_faq.txt.html#faq-5.4 for details on how to write without the ending newline, paraphrased below:
How do I write a file without the line feed (EOL) at the end of the file?
You can turn off the eol option and turn on the binary option to write a file without the EOL at the end of the file:
   :set binary
   :set noeol
   :w
Alternatively, you can use:
   :set noeol
   :w ++bin
Adding a newline is the default behavior for Vim. If you don't need it, then use this solution: VIM Disable Automatic Newline At End Of File
To disable, add this to your .vimrc
set fileformats+=dos
You can put the following line into your .vimrc
autocmd FileType php setlocal noeol binary
That should do the trick, but your approach is somewhat wrong. First of all, PHP won't mind that trailing newline at all. Secondly, if you don't want to save your changes, don't press u or, worse, try to recreate the state of the file manually; just quit without saving with :q!. If you saved before leaving the editor for some reason, try git checkout <file>
3.206 Line
A sequence of zero or more non- <newline> characters plus a terminating <newline> character.
Interestingly, Vim will allow you to open a new file and write it, and the file will be zero bytes. If you open a new file, append a line using o, and then write the file, it will be two bytes long. If you open said file again, delete the second line with dd, and write the file, it will be one byte long. Open the file again, delete the only remaining line, and write the file, and it will be zero bytes. So Vim will let you write a zero-byte file only as long as it is completely empty. Seems to defy the POSIX definition above. I guess...

Opening UCS-2le File With Vim on Windows

I'm using Vim 7.3 on WinXP. I use XML files that are generated by an application at my work which writes them with UCS-2le encoding. After reading several articles on encoding at the vim wiki I found the following advice given, namely to set my file encoding in vimrc:
set fileencodings=ucs-bom,utf-8
The file in question has FF EE as the first characters (confirmed viewing with HxD), but Vim doesn't open it properly. I can open my UCS-2le files properly with this in my vimrc:
set fileencodings=ucs-2le,utf-8
But now my UTF-8 files are a mess!
Any advice how to proceed? I typically run Gvim without behave MSwin (if that matters). I use very few plugins. My actual vimrc setting regarding file encodings are:
set encoding=utf-8
set fileencodings=ucs-bom,utf-8,ucs-2le,latin1
The entry for ucs-2le in the third spot seems to make no difference. As I understand it, the first entry (set encoding) is the encoding Vim uses internally in its buffer, while the second (set fileencodings) deals with the encoding of the file when Vim reads and writes it. So, it seems to me that since the file has a byte order mark, ucs-bom as the first entry in 'fileencodings' should catch it. As far as I can tell, Vim doesn't recognize that this file uses 16 bits per character.
Note: I can/do solve the problem in the meantime by manually setting the file encoding when I open my ucs-2le files:
:edit ++enc=ucs-2le
Cheers.
Solved it. I am not sure what I did but the fixes noted are working perfectly now to read and write my UCS-2 files - though for unknown reason not immediately (did I just restart Vim?). I could try to reverse the fixes to see which one was the critical change but here's what I've done (see also my comments on Jul 27 above):
Put the AutoFenc.vim plugin in my plugins folder (it automatically detects file encoding).
Added iconv.dll and new version of libintl.dll to my vim73 folder (Vim.org)
Edited vimrc as below
vimrc now contains (the last bits just make it easier to see what's happening with file encodings by showing the file encoding in the status line):
"use utf-8 by default
set encoding=utf-8
set fileencodings=ucs-bom,utf-8,ucs-2le,latin1
"always show status line
set laststatus=2
"show encoding in status line http://vim.wikia.com/wiki/Show_fileencoding_and_bomb_in_the_status_line
if has("statusline")
set statusline=%<%f\ %h%m%r%=%{\"[\".(&fenc==\"\"?&enc:&fenc).((exists(\"+bomb\")\ &&\ &bomb)?\",B\":\"\").\"]\ \"}%k\ %-14.(%l,%c%V%)\ %P
endif
And all is well.

Resources