Saving a flat file through Vim adds an invisible byte to the file that creates a new line - linux

The title is not really specific, but I have trouble identifying the correct keywords as I'm not sure what is going on here. For the same reason, it is possible that my question has a duplicate. If that's the case: sorry!
I have a Linux application that receives data via flat files. I don't know exactly how those files are generated, but I can read them without any problem. Those are short files, only a line each.
For testing purposes, I tried to modify one of those files and reinject it into the application. But when I do that, I can see in the log that it added a mysterious line break at the end of the message (resulting in the application not recognising the message)...
For the sake of example, let's say I receive a flat file, named original, that contains the following:
ABCDEF
I make a copy of this file and name it copy.
If I compare those two files using the "diff" command, it says they are identical (as I expect them to be).
If I open copy via Vi and then quit without changing or saving anything and then use the "diff" command, it says they are identical (as I also expect them to be).
If I open copy via Vi and then save it without changing anything and then use the "diff" command, I get the following (I added the dot for layout purposes):
diff original copy
1c1
< ABCDEF
\ No newline at end of file
---
.> ABCDEF
And if I compare the size of my two files, I can see that original is 71 bytes while copy is 72.
It seems that the format of the file changes when I save the file. I first thought of an encoding problem, so I used the ":set list" command in Vim to see the invisible characters. But for both files, I can see the following:
ABCDEF$
I have found other ways to do my test, but this problem still bugs me and I would really like to understand it. So, my two questions are:
What is happening here?
How can I modify a file like that without creating this mysterious line break?
Thank you for your help!

What happens is that Vim is set by default to assume that the files you edit end with a "newline" character. That's normal behavior in UNIX-land. But the "files" your program is reading look more like "streams" to me because they don't end with a newline character.
To ensure that those "files" are written without a newline character, set the following options before writing:
:set binary noeol
See :help 'eol'.
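If you would rather check or fix the file outside Vim, the same idea can be scripted. This is only a minimal Python sketch and not part of the Vim answer above: the helper names ends_with_newline and strip_final_newline are made up for illustration, and it assumes the file is small enough to read into memory and uses a plain line feed (0x0a) as its line ending.

```python
from pathlib import Path


def ends_with_newline(path):
    """Return True if the last byte of the file is a line feed (0x0a)."""
    data = Path(path).read_bytes()
    return data.endswith(b"\n")


def strip_final_newline(path):
    """Remove a single trailing line feed, if present, and rewrite the file."""
    p = Path(path)
    data = p.read_bytes()
    if data.endswith(b"\n"):
        p.write_bytes(data[:-1])
```

Running strip_final_newline on the file saved by Vim should bring it back to the size of the original, since the only difference reported by diff is that one trailing byte.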

Related

Vim doesn't show the last empty line

I have written some C code, and the end of my output is '\n'. When I check the output text file in Vim, I cannot find the last empty line; however, when I open it with another text viewer, I can. How can I configure Vim to show the empty line?
The way Vim shows \n / 0x0a at the end of the file is that it opens the file without complaining about [noeol] when :editing the file (in a kind of "reverse logic" from what you expect). Vim's (and Unix) philosophy is that the trailing newline should be there. This can be confusing when one is used to other editors or predominantly works on MS Windows.
There's a lot of discussion and questions about this (e.g. here); as this is unlikely to change, get used to it.

How to strip binary characters from a file?

I've got a file that contains lines that look like this in vim:
^[[0;32msalt-2016.3.2-1.el6.noarch^[[0;0m^M
which look like this in more:
salt-2016.3.2-1.el6.noarch
I would like to produce a copy of this file that only contains the displayed characters as more shows them. I tried piping it through dos2unix but it refuses to do anything, complaining that "dos2unix: Binary symbol 0x1B found at line 2".
Probably I could achieve what I want with some sed statements, but I'm wondering whether there is a linux/unix utility that will take output from more or cat and produce a file that contains only the whitespace and text as displayed?
There's something called ansifilter which does exactly this. I tested it out on my file and it works.
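If installing ansifilter is not an option, a rough approximation can be written in a few lines. The sketch below is an assumption-laden stand-in, not how ansifilter itself works: the function name strip_ansi is made up, and it only removes CSI sequences (ESC [ ... final byte, such as the colour codes shown in the question) plus carriage returns, not every possible terminal control sequence.

```python
import re

# CSI sequence: ESC [ + parameter bytes + intermediate bytes + one final byte,
# e.g. ESC[0;32m (set colour) or ESC[0;0m (reset).
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Drop CSI escape sequences and carriage returns, keep everything else."""
    return ANSI_CSI.sub("", text).replace("\r", "")
```

Applied to the line from the question (where ^[ is the escape character and ^M is a carriage return), this should leave just salt-2016.3.2-1.el6.noarch.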

Why would Vim add a new line at the end of a file?

I work with WordPress a lot, and sometimes I change WordPress core files temporarily in order to understand what is going on, especially when debugging. Today I got a little surprise. When I was ready to commit my changes to my git repository, I noticed that git status was marking one of the WordPress files as not staged for commit. I remember I had reverted all the changes I made to that file before closing it, so I decided to use diff to see what had changed. I compared the file in my project with the file in the WordPress copy that I keep in my downloads directory. It turns out the files differ at the very end. diff indicates that there is a newline missing at the end of the original file:
1724c1724
< }
\ No newline at end of file
---
> }
I never even touched that line. The changes I made were somewhere in the middle of a large file. This leads me to think that Vim added a newline character at the end of the file. Why would that happen?
All the answers I've seen here address the question "how could I prevent Vim from adding a newline character at the end of the file?", while the question was "Why would Vim add a new line at the end of a file?". My browser's search engine brought me here, and I didn't find the answer to that question.
It is related to how the POSIX standard defines a line (see Why should files end with a newline?). So, basically, a line is:
3.206 Line
A sequence of zero or more non-<newline> characters plus a terminating <newline> character.
And, therefore, they all need to end with a newline character. That's why Vim always adds a newline by default (because, according to POSIX, it should always be there).
It is not the only editor doing that. Gedit, the default text editor in GNOME, does the same exact thing.
Edit
Many other tools also expect that newline character. See for example:
How wc expects it.
GCC warns about it.
Also, you may be interested in: Vim show newline at the end of file.
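To make the wc case concrete: wc -l counts newline characters, so a final line without a terminating newline is not counted. The snippet below is a small Python sketch of that counting rule, not wc's actual source; the function name count_lines_like_wc is made up for illustration.

```python
def count_lines_like_wc(data: bytes) -> int:
    """Count line feed bytes, which is the number `wc -l` reports."""
    return data.count(b"\n")


print(count_lines_like_wc(b"ABCDEF"))       # 0: no terminating newline
print(count_lines_like_wc(b"ABCDEF\n"))     # 1: one complete line
print(count_lines_like_wc(b"one\ntwo"))     # 1: "two" is not terminated
```

Under the POSIX definition quoted above, the unterminated "two" is not a complete line, which is why tools that follow the definition behave this way.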
Because vim is a text editor, it can sometimes "clean up" files for you. See http://vimhelp.appspot.com/vim_faq.txt.html#faq-5.4 for details on how to write without the ending newline, paraphrased below:
How do I write a file without the line feed (EOL) at the end of the file?
You can turn off the eol option and turn on the binary option to write a file without the EOL at the end of the file:
   :set binary
   :set noeol
   :w
Alternatively, you can use:
   :set noeol
   :w ++bin
Adding a newline is the default behavior for Vim. If you don't need it, then use this solution: VIM Disable Automatic Newline At End Of File
To disable, add this to your .vimrc
set fileformats+=dos
You can put the following line into your .vimrc
autocmd FileType php setlocal noeol binary
That should do the trick, but your approach is somewhat wrong. First of all, PHP won't mind that ending at all. Secondly, if you don't want to save your changes, don't press u or, worse, try to recreate the state of the file manually; just quit without saving using :q!. If you left the editor and saved for some reason, try git checkout <file>.
3.206 Line
A sequence of zero or more non-<newline> characters plus a terminating <newline> character.
Interestingly, Vim will allow you to open a new file and write it, and the file will be zero bytes. If you open a new file, append a line using o, and then write the file, it will be two bytes long. If you open that file again, delete the second line with dd, and write the file, it will be one byte long. Open the file once more, delete the only remaining line, and write the file, and it will be zero bytes. So Vim will let you write a zero-byte file only as long as it is completely empty. That seems to defy the POSIX definition above, I guess...
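One way to read those byte counts is that a file with zero lines is simply empty, which does not contradict the definition: every line that exists still gets its terminating newline. The Python sketch below models only the sizes described above. It is my own illustration, not Vim's code, and the function name serialize is made up.

```python
def serialize(lines):
    """Write each line followed by its terminating newline (POSIX 3.206)."""
    return "".join(line + "\n" for line in lines).encode()


print(len(serialize([])))         # 0 bytes: no lines at all
print(len(serialize([""])))       # 1 byte: one empty line
print(len(serialize(["", ""])))   # 2 bytes: two empty lines
```

The only special case is that Vim displays an empty buffer as a single blank line, even though it is written out as zero lines.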

Ignore one "misspelling" in Vim

Is there a way to tell Vim not to highlight a word once? For example, in "the password is abc123", I don't want to add abc123 to the wordlist, but still wouldn't like the big red rectangle around it.
Clarification: I'm looking for a command that makes the spell checker ignore the current word (or last misspelling).
Without having the word stored somewhere, it's hard (not to say impossible) to ignore it always.
But, if you are looking to ignore the word really once, that is only for a moment, you can add it to the internal list with the zG command.
*zG*
zG Like "zg" but add the word to the internal word list
|internal-wordlist|.
*internal-wordlist*
The internal word list is used for all buffers where 'spell' is set. It is
not stored, it is lost when you exit Vim. It is also cleared when 'encoding'
is set.
When your cursor is positioned on a word that is highlighted as misspelled you can add it to your wordlist by pressing zg. Vim allows you to load more than one wordlist at a time, which makes it possible to have (for example) a global wordlist, and a project specific wordlist.
By default, when you run zg it will add the current word to the first spellfile it finds in your runtime path for the current encoding. In my case, that turns out to be ~/.vim/spell/en.utf-8.add when I'm working with UTF-8 encoding. Try running the following commands:
:setlocal spellfile+=~/.vim/spell/en.utf-8.add
:setlocal spellfile+=oneoff.utf-8.add
That will set you up so that zg (or 1zg) adds the current word to your default spellfile. But running 2zg would add the current word to a file called oneoff.utf-8.add, in the same directory as the file that you are working on. If the file doesn't exist, Vim will try to create it for you.
When you open the file again in the future, you will have to run the same two commands to make Vim check the oneoff.utf-8.add spellfile. Unfortunately, Vim does not allow you to set the spellfile option in a modeline, so if you want to run these commands automatically when the file opens, you would have to find some other way. This question includes a few ideas on how you might proceed.

In Vim, what is the "alternate file"?

I just ran :help registers in Vim and noticed that # 'contains the name of the alternate file'.
I have seen an example for renaming files that goes like this:
" Save the current file, foo.txt, as bar.txt
:w bar.txt
" Start editing bar.txt
:e#
So apparently in that case, the file you just saved out is the "alternate file."
Can someone give me a more general definition for the "alternate file" and what else you might use it for?
The alternate file is the file that was last edited in the current window. Actually when you use some command to open a new buffer, if the buffer that was displayed had a filename associated with it, that filename is recorded as alternate file name.
See :help alternate-file.
Very useful for...
Pasting in the name of a file I've just been looking at into the current file.
You can use <C-R># for this in insert mode or "#p in normal mode.
Not that useful for...
Jumping back and forth between two files. It does the job very well, but this is just something I don't generally need to do.
Even in the example given, I'd probably use :saveas bar.txt instead.
An Example:
Say if you're doing a bit of C programming and want to call some function. You can't remember the name of the function, so you place a mark on your current location mA and jump into several different files using tags or grep to find out where the function is declared and what it's actually called.
Ah - found it. You can copy the name and return to the mark yiw'A
Uh-oh - we also need to #include the file! Easy - just use the alternate file name register to paste the file name in... Gi#include"<C-R>#"
Be pleased that you've avoided the distraction of having to go back to the function's declaration and copy out the file name via :let @"=@% or something similar.
What I'd rather do when jumping between files:
When editing two files, it's probably easier to split them, so you can keep both on screen at the same time. If I'm editing 2 files I'll usually be comparing them in some way.
Usually I'm interested in 1-3 files (any more and I get confused). I'll often jump into or directly open many other files. Marking the interesting files, or traversing the jump list is usually the way to get around in this case.
If you're editing C/C++ where you're switching between a file and its header, use a plugin! It will be much more convenient.
I use it in the buffer context to return to the last buffer that I was editing
vim foo bar
:n
:e#
will take you back to foo in that case
I've always interpreted the "alternate file" as being the "previous file", so it is a handy way to jump back to the buffer you were editing.
