How to insert Unicode character U+2611 in gvim - vim

When I try to enter the Unicode character ☑ (U+2611) in Vim with ^Vu2611 (that is, press Ctrl+V and then type u2611 in insert mode), Vim breaks it into two characters: & (0x26) and ^Q (0x11).
There is no problem when I insert other characters, such as □ (U+a1f5).
It seems that Vim stops parsing right after it has read 26 (which represents the character '&').
So how can I insert this kind of Unicode character in Vim? (I have tried pasting it into Vim, but that doesn't work either.)
Please help!

In order to process Unicode characters, Vim must use an 'encoding' that is able to represent those characters. With a value of latin1, the mentioned character cannot be encoded (this 8-bit encoding only includes ASCII and several Western European characters).
So, you need to
:set encoding=utf-8
With that, any newly created file will use that encoding, and you should be able to insert Unicode characters and write them (also with another Unicode file encoding, like :w ++enc=ucs-2le; but if you tried to persist as :w ++enc=latin1, you'd get a CONVERSION ERROR).
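To see why latin1 cannot hold ☑ while the Unicode encodings can, here is a minimal Python sketch. Python's codecs only stand in for Vim's 'encoding' values here (Vim does its own conversion internally), and the variable names are made up for illustration.

```python
# Illustration only: Python codecs stand in for Vim's 'encoding' values.
ch = "\u2611"  # the character from the question, U+2611

# latin-1 is an 8-bit encoding, so a code point above 0xFF does not fit.
try:
    ch.encode("latin-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

utf8_bytes = ch.encode("utf-8")        # what utf-8 would store
ucs2le_bytes = ch.encode("utf-16-le")  # same bytes as ucs-2le for this character

print(fits_latin1)         # False
print(utf8_bytes.hex())    # e29891
print(ucs2le_bytes.hex())  # 1126
```

Note that the two byte values of the 16-bit form, 0x26 and 0x11, are the same values as the & and ^Q that showed up in the question.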

Related

Editing in binary

I've been tinkering with multiple hex editors but nothing really has worked.
What I'm looking for is a way to change a binary in actual binary (not in hex). This is purely for educational purposes and I know it's trivial to convert between both, but I wanted to be able to change the ones and zeroes just like I would do hex.
I've tried using vim with the %!xxd -b but then it won't work with %!xxd -r. I know how to convert the file into binary, but I'm looking for a way to dynamically change it in this format and being able to save it.
Better yet would be if I could find a way to actually create a binary by coding purely in actual binary.
Any help would be appreciated :D
vim or gvim should work for you directly, without the xxd filter.
Open the file in (g)vim. Place your cursor on a character and type ga to see its character code in the status line. To insert character NNN, place your cursor where you want it, go into insert mode, and type Ctrl-v followed by the three-digit decimal code value. Use Ctrl-v x HH to enter the character by its hexadecimal code.
Make sure your terminal is not set to use UTF8, because in UTF8, typing Ctrl-v 128 will in fact insert c280, the utf-8 encoding of character 128, instead of 80.
LC_ALL=C vim binary-file
is the easiest way to make sure you're doing binary character based editing in vim, but that might do weird things if your terminal is utf-8.
LC_ALL=C gvim binary-file
should open a stand-alone window with proper display.
FYI, if you did want to work in utf-8, Ctrl-v u HHHH is how to enter the Unicode character with Hex code point HHHH.
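The UTF-8 caveat mentioned above (character 128 ending up as c280 instead of 80) can be reproduced outside Vim. This is only a small Python sketch of the byte-level difference, not a description of what Vim does internally.

```python
# Character code 128 as a single raw byte vs. as UTF-8 encoded text.
code = 128

raw_byte = bytes([code])                # one byte, as in byte-oriented editing
utf8_bytes = chr(code).encode("utf-8")  # code point U+0080 encoded as UTF-8

print(raw_byte.hex())    # 80
print(utf8_bytes.hex())  # c280
```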
Windows:
Open cmd.exe, Notepad++, or whatever editor you like.
Enable the Num Lock key.
On laptops you need to use the function key or the blue / grey / silver numbers above the alphabet keys (the numbers on the top row will not work because they map to different scan codes).
Pressing Alt + 255 corresponds to 0xff.
Pressing Alt + 254 corresponds to 0xfe.
See below for a demo:
C:\>copy con rawbin.bin
 ■²ⁿ√·∙⌂~}─^Z
^Z
1 file(s) copied.
C:\>xxd rawbin.bin
0000000: fffe fdfc fbfa f97f 7e7d c41a 0d0a ........~}....
C:\>

How to replace bytes \xe3\x80\x80 with byte \x20 in vim?

Let's create a target file to work with.
python3
>>> mfile = open("f:/test.txt","wb")
>>> mfile.write(b'\xe3\x80\x80')
3
>>> mfile.close()
Now open f:/test.txt with xxd and you will see the three bytes \xe3\x80\x80; the target file is UTF-8 encoded and contains exactly these three bytes.
python3
>>> b'\xe3\x80\x80'.decode('utf-8')
'\u3000'
This means that the three UTF-8 bytes in test.txt encode the Unicode code point 3000.
:s/\%u3000/ /g
:s/\%u3000/ /g can replace the bytes \xe3\x80\x80 with the byte \x20 in Vim.
But an issue still remains.
:s/\%u3000/\%u20/g
:s/\%u3000/\%x20/g
:s/\%u3000/\x20/g
None of the three forms above works. Why can \xe3\x80\x80 be expressed as \%u3000 in Vim, while a plain space cannot be expressed as \%u20, \%x20, or \x20?
A literal space can express \x20 because a space is a printable character. What is more, what if I want to replace the three bytes \xe3\x80\x80 with latin-1's nbsp?
The nbsp in latin-1 is the non-breaking space, which is a non-printable character. How do I write that expression in Vim?
:s/\%u3000/\%ua0/g
:s/\%u3000/\%xa0/g
:s/\%u3000/\xa0/g
None of them works in this case.
You can type the \xe3\x80\x80 (u3000) character by pressing Ctrl+V, then u, and then the four hex digits of the code point, in your case 3000 (check :help i_CTRL-V_digit). Since it is a blank character, you will see nothing but a space. You could type :set list to see all the places where you have that character, or in any case add this to your .vimrc:
set listchars=tab:▸\ ,eol:¬,trail:·,extends:#,nbsp:.
In the same way that you enter the character in a buffer, you could try to enter it on the command line when replacing it. If Ctrl+V is not available there, you could try using the command-line window (:help cedit).
Type : to start a command, then press Ctrl+F. This opens the command-line window, where you can go into insert mode and type %s/<Ctrl+V>u3000/ /g, and then press Enter to run the command.
Give it a try on the regular command line first, before opening the command-line window, since Ctrl+V may work there, unlike Ctrl+K (http://vim.wikia.com/wiki/Entering_special_characters).
In the image, the replacement is ---- instead of a white space / /, just to make the changes visible.
1. How do you input non-printable characters when editing a file in Vim?
In insert mode:
1. Press Ctrl+V (Ctrl+Q if Ctrl+V is mapped to paste from the register).
2. Type u.
3. Type the Unicode value of the non-printable character.
4. Press Enter.
2. How do you input non-printable characters in a substitute command in Vim's Ex mode?
For example, to replace all bytes \xe3\x80\x80 with \xa0, where all bytes are UTF-8 encoded:
1. Get the Unicode values of the bytes:
`\xe3\x80\x80`'s Unicode value is `3000`,
`\xa0`'s Unicode value is `a0`.
2. Press `:` to enter Ex mode.
3. Type :s/\%u3000/
4. Press Ctrl+V and type ua0 (do not press Enter at this point).
5. Continue by typing `/g`.
6. Press Enter.
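Step 1 of the second procedure (getting the Unicode values) can be checked with a short Python sketch. It assumes, as the example does, that \xe3\x80\x80 is UTF-8 and that \xa0 refers to the latin-1 nbsp; the variable names are illustrative only.

```python
# Look up the code points used in the :s command above.
src = b"\xe3\x80\x80".decode("utf-8")  # the three UTF-8 bytes from the file
nbsp = b"\xa0".decode("latin-1")       # nbsp as a single latin-1 byte

print(f"{ord(src):x}")   # 3000
print(f"{ord(nbsp):x}")  # a0

# The same replacement, done on a Python string instead of a Vim buffer.
replaced = src.replace("\u3000", "\u00a0")

# What ends up on disk depends on the encoding the file is written with.
print(replaced.encode("utf-8").hex())    # c2a0
print(replaced.encode("latin-1").hex())  # a0
```

In other words, if the file stays UTF-8, the nbsp is written as the two bytes c2 a0; it only becomes the single byte a0 when the file is written as latin-1.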

Removing hex code ffa3 in Vim

I've got a file with a load of weird characters in it that I need to get rid of.
Using ga on the character reveals it has the following encodings:
<ᆪ> 65443, Hex ffa3, Octal 177643
But I can't seem to find it using :%s/\%xffa3//g. What am I doing wrong?
Look at :help \%x:
\%x2a Matches the character specified with up to two hexadecimal characters.
So Vim is actually matching the three characters ÿa3 (the character with hex code ff, followed by a literal a and 3). Since you have a four-digit hex number, you need to use \%u:
:%s/\%uffa3//g
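Python's regular expressions have an analogous pitfall, which makes the difference easy to see. This is only an analogy: Python's \x takes exactly two hex digits whereas Vim's \%x takes up to two, and the sample text is made up.

```python
import re

# Sample text: one real U+FFA3 character, and the three characters "ÿa3".
text = "x\uffa3y \u00ffa3"

two_digit = re.compile(r"\xffa3")   # parsed as \xff followed by literal "a3"
four_digit = re.compile(r"\uffa3")  # parsed as the single character U+FFA3

print(two_digit.search(text).start())   # 4, the position of "ÿa3"
print(four_digit.search(text).start())  # 1, the position of U+FFA3
```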
Alternatives
You can also insert the character directly into the command line via :help i_CTRL-V_digit (i.e. <C-v>uffa3), but if you already have instances of that character in your buffer (and near your cursor!), I'd just yank that char with yl and insert it in the command-line via <C-r>".

(VIM) Is vimgrep capable of searching unicode string

Is vimgrep capable of searching unicode strings?
For example:
a.txt contains the wide string "hello", but vimgrep hello *.txt finds nothing, even though the file is in the right path.
"Unicode" is a bit misleading in this case. What you have is not at all typical of text "encoded in accordance with any of the method provided by the Unicode standard". It's a bunch of normal characters with normal code points separated with NULL characters with code point 0000 or 00. Some Java programs do output that kind of garbage.
So, if your search pattern is hello, Vim and :vim are perfectly capable of searching for and finding hello (without NULLs) but they won't ever find hello (with NULLs).
Searching for h^@e^@l^@l^@o (^@ is <C-v><C-@>), on the other hand, will find hello (with NULLs) but not hello (without NULLs).
Anyway, converting that file/buffer, or making sure you don't end up with such garbage in the first place, is a much better long-term solution.
If Vim can detect the encoding of the file, then yes, Vim can grep the file. :vimgrep works by first reading in the file as normal (even including autocmds) into a hidden buffer, and then searching the buffer.
It looks like your file is little-endian UTF-16, without a byte-order mark (BOM). Vim can detect this, but won't by default.
First, make sure your Vim is running with internal support for unicode. To do that, :set encoding=utf-8 at the top of your .vimrc. Next, Vim needs to be able to detect this file's encoding. The 'fileencodings' option controls this.
By default, when you set 'encoding' to utf-8, Vim's 'fileencodings' option contains "ucs-bom" which will detect UTF-16, but ONLY if a BOM is present. To also detect it when no BOM is present, you need to add your desired encoding to 'fileencodings'. It needs to come before any of the 8-bit encodings but after ucs-bom. Try doing this at the top of your .vimrc and restart Vim to use:
set encoding=utf-8
set fileencodings=ucs-bom,utf-16le,utf-8,default,latin1
Now loading files with the desired encoding should work just fine for editing, and therefore also for vimgrep.
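Why a plain search for hello fails on such a file, and why it succeeds once the encoding is detected, can be illustrated with a few lines of Python. This is a sketch of the byte layout only; it does not model how Vim's 'fileencodings' detection works.

```python
# "hello" as little-endian UTF-16 without a BOM: every letter is followed by a NUL byte.
data = "hello".encode("utf-16-le")
print(data)  # b'h\x00e\x00l\x00l\x00o\x00'

# Searching the raw bytes for the plain ASCII word fails...
print(b"hello" in data)  # False

# ...but after decoding with the right encoding the search succeeds.
print("hello" in data.decode("utf-16-le"))  # True
```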

Enter Unicode characters with 8-digit hex code

How do I enter Unicode characters like 𝓭 without copying it to the clipboard and pasting it?
Things I know:
The command ga on the character 𝓭 gives me hex:0001d4ed.
I can copy it on the clipboard and paste it via "+p.
I know how to enter Unicode values that have a 4 digit hex code:
<C-v>u for example <C-v>u03b1 gives the α character.
You can use <C-v>U, that is, an uppercase U, to input a character by its 8-digit hex code point.
See :help i_CTRL-V_digit for more information.
There is a Vim feature designed to simplify entering characters that
cannot be typed directly. It is called Digraphs (see :help digraphs).
To define a custom digraph for entering ‘𝓭’, use an Ex command similar
to the one below.
:dig dd 120045
where 120045 is the decimal representation of ‘𝓭’, as one can easily
confirm using the ga command.
Inserting a character using a digraph is simple:
Type Ctrl+K followed by the shortcut of that
digraph (dd for the above example).
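The decimal value passed to :dig can be derived from the hex value that ga reports. A minimal Python sketch of that conversion follows; the variable names are illustrative.

```python
# Convert the hex value reported by ga into the decimal value used by :dig.
hex_from_ga = "0001d4ed"
code_point = int(hex_from_ga, 16)

print(code_point)           # 120045
print(f"{code_point:08x}")  # 0001d4ed, the 8 digits to type after <C-v>U
print(chr(code_point))      # the character itself
```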
There exists a Unicode plugin for Vim. According to the plugin description, this plugin has three main features:
Character/digraph completion using either the Unicode name or the codepoint.
Identify the character/digraph under the cursor.
Search for digraphs by name; transform two normal characters into their corresponding digraph.

Resources