How to replace 'LEFT-TO-RIGHT MARK' (U+200E), shown as <200e>, with Vim on Linux

This special character is displayed in Vim as <200e>.
I've tried /\x20(\x0e|\x0f)/ and /\xe2\x80[\x8e\x8f]/ without results.

First, if you want to match byte 0x20 (a space, if I am not mistaken), you need to use \%x20, not \x20, because \x designates a hex digit (except inside a collection, where \x20 means what you expect). But if you want to match a given Unicode character, you should use \%u200E (\u200E inside a collection).
Second, both \%x20 and [\x20] match the character with Unicode code point 0x20, not the byte with value 0x20. It does not matter for the space, but it makes a difference for code points above 0x7F.
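To illustrate the difference between a code point and its bytes, here is a small Python sketch (Python is used only for illustration, and the helper name utf8_bytes is made up here). It shows that U+200E is one character stored as three bytes in UTF-8, which is why a pattern written in terms of code points (\%u200e) is the natural fit, assuming the buffer is UTF-8:

```python
def utf8_bytes(code_point: int) -> str:
    """Return the UTF-8 encoding of a code point as space-separated hex bytes."""
    return chr(code_point).encode("utf-8").hex(" ")


# U+200E (LEFT-TO-RIGHT MARK) and U+200F (RIGHT-TO-LEFT MARK) are single
# characters, each stored as three bytes in UTF-8.
print(utf8_bytes(0x200E))  # e2 80 8e
print(utf8_bytes(0x200F))  # e2 80 8f
# For ASCII, the code point and the byte value coincide.
print(utf8_bytes(0x20))    # 20
```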

Try to replace \u200e :)
You can test this works by inserting that character into your buffer, and seeing that it appears as <200e>, if you type this in while in insert mode: <C-R>="\u200e"<CR> (that's CTRL+R and <CR> means ENTER)

I would put the cursor on the blue <200e>, then type yl to yank (copy) the character.
Then, type :%s/<C-R>"/replacement/g
(where <C-R> is Control+R, of course).

Use your terminal's mechanism for entering characters by Unicode code point. In the case of gnome-terminal, that's Ctrl+Shift+U followed by the hex code (e.g. 200e) and then Enter.


How to replace bytes \xe3\x80\x80 with byte \x20 in vim?

Let's create a target file to operate on.
python3
>>> mfile = open("f:/test.txt","wb")
>>> mfile.write(b'\xe3\x80\x80')
3
>>> mfile.close()
Now open f:/test.txt with xxd and you will see three bytes, \xe3\x80\x80, in it. Our target file, encoded in UTF-8, contains the three bytes \xe3\x80\x80.
python3
>>> b'\xe3\x80\x80'.decode('utf-8')
'\u3000'
This means that the three bytes in test.txt, decoded as UTF-8, are the Unicode code point 3000.
:s/\%u3000/ /g
s/\%u3000/ /g replaces the bytes \xe3\x80\x80 with the byte \x20 in Vim.
One issue still remains, though.
:s/\%u3000/\%u20/g
:s/\%u3000/\%x20/g
:s/\%u3000/\x20/g
None of the three forms above works. Why can \xe3\x80\x80 be expressed by \%u3000 in Vim, while a blank space can't be expressed by \%u20, \%x20 or \x20?
A literal space can express \x20 because the space is a printable character. What is more, what if I want to replace the three bytes \xe3\x80\x80 with Latin-1's NBSP?
NBSP in the Latin-1 encoding is the non-breaking space, which is a non-printable character. How do I write that expression in Vim?
:s/\%u3000/\%ua0/g
:s/\%u3000/\%xa0/g
:s/\%u3000/\xa0/g
None of them works in this case.
You can type the \xe3\x80\x80 (U+3000) character by pressing Ctrl+V, then u, then the four hex digits of the code point, 3000 in your case (see :help i_CTRL-V_digit). Since it is a blank character, you will see nothing but a space. You can type :set list to see all the places where that character occurs, or in any case add this to your .vimrc:
set listchars=tab:▸\ ,eol:¬,trail:·,extends:#,nbsp:.
In the same way that you enter the character in the buffer, you can try to enter it on the command line when replacing. If Ctrl+V is not available there, you can use the command-line window instead (:help cedit).
Go to command-line mode and, after typing the :, press Ctrl+F. This opens the command-line window, where you can go into insert mode and type %s/, then Ctrl+V followed by u3000, then / /g. When done, press Enter to run the command.
Try it directly on the command line first, before opening the command-line window, since Ctrl+V may work there, unlike Ctrl+K (http://vim.wikia.com/wiki/Entering_special_characters).
In the original screenshot the replacement was ---- instead of a white space / /, just to make the changes visible.
1. How do you input non-printable characters when editing a file in Vim?
In insert mode:
1. Press Ctrl+V (or Ctrl+Q if Ctrl+V is mapped to paste from a register).
2. Type u.
3. Type the hexadecimal Unicode value of the non-printable character.
4. Press Enter.
2. How do you input non-printable characters in the substitute command in Vim's Ex mode?
For example, to replace every occurrence of the bytes \xe3\x80\x80 with \xa0, where the file is encoded in UTF-8:
1. Get the Unicode value of each character.
`\xe3\x80\x80`'s Unicode value is `3000`,
`\xa0`'s Unicode value is `a0`.
2. Press `:` to enter Ex mode.
3. Type s/\%u3000/
4. Press Ctrl+V, then type ua0.
Do not press Enter here, unlike in the process above.
5. Go on to type `/g`.
6. Press Enter.
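Step 1 of the second procedure (finding the Unicode value behind a byte sequence) can be checked outside Vim. The following is a minimal Python sketch; the helper name code_point_hex is hypothetical. It also shows a detail worth keeping in mind: the single byte \xa0 is NBSP only in Latin-1, whereas in a UTF-8 file the same character U+00A0 is stored as the two bytes c2 a0.

```python
def code_point_hex(utf8: bytes) -> str:
    """Decode exactly one UTF-8 encoded character and return its code point in hex."""
    text = utf8.decode("utf-8")
    if len(text) != 1:
        raise ValueError("expected the bytes of exactly one character")
    return format(ord(text), "x")


print(code_point_hex(b"\xe3\x80\x80"))       # 3000
print(code_point_hex(b"\xc2\xa0"))           # a0

# The same character U+00A0 under two encodings:
print("\u00a0".encode("latin-1").hex(" "))   # a0
print("\u00a0".encode("utf-8").hex(" "))     # c2 a0
```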

How to fine-tune Macros after having recorded it through recording in Vim?

Specific question
Description
After recording the desired action to register o, I pasted the whole macro into my ~/.vimrc and assigned it as follows (the mappings are not displayed properly when pasted directly).
Expected behavior
I would like to use this macro to get myself a new "comment line" that leads a new section of script, formatted such that the name of the section is centered. After populating the "section title", I would like to enter insert mode in a new line.
In the following screen recording, I tested both #o and #p on the word "time". The second attempt, with #p, worked as desired.
The problem (on Windows machine specifically)
As you see, the #o mapping gets me junk phrases which had been part of my definition for the macro. Does this have to do with the ^M operator? And, how can I fix the #o mapping, which uses * to populate the line?
The two mappings worked just fine on a Linux system. (I don't know why, as I recorded and pasted the macro definition on a Windows machine.) This also does not appear to be a problem on Mac with MacVim.
Generalized question
Is there a way to properly substitute the ^M operator (for <CR>, or "Enter"-key)?
Is there a way to properly substitute the ^[ operator (for <ESC>, or the "Escape"-key)?
Is there a systematic list of mappings from these weird representations of keystrokes, as recorded by the "recording" function through q?
Solution
Substitute the ^M marks in the macro definition with \r, and substitute ^[ with \x1b for the ESC key. Note that assigning to a register in Vim script uses the @ sigil (let @o = ...). The definitions are fixed as follows:
let @o = ":center\ri\r\x1bkV:s/ /\*/g\rJx50A\*\x1b80d|o"
let @p = ":center\ri\r\x1bkV:s/ /\"/g\rJx50A\"\x1b80d|o"
Complete list of key-codes/mappings? Approach 1: through hex code.
Thanks to Zbynek Vyskovsky, the picture is clear. For whatever key one may think of, Vim takes its ASCII value at face value. (The trick is to use an escape sequence starting with \x, where x introduces the hex value.) Thus the correspondence list (still incomplete) goes as follows:
Enter --- \x0d --- \r
ESC --- \x1b --- \e
Solution native to Vim
By chance, :help expr-quote gives the following list of special characters. This can serve as the definitive answer to the original question in its general form.
string *string* *String* *expr-string* *E114*
------
"string" string constant *expr-quote*
Note that double quotes are used.
A string constant accepts these special characters:
\... three-digit octal number (e.g., "\316")
\.. two-digit octal number (must be followed by non-digit)
\. one-digit octal number (must be followed by non-digit)
\x.. byte specified with two hex numbers (e.g., "\x1f")
\x. byte specified with one hex number (must be followed by non-hex char)
\X.. same as \x..
\X. same as \x.
\u.... character specified with up to 4 hex numbers, stored according to the
current value of 'encoding' (e.g., "\u02a4")
\U.... same as \u but allows up to 8 hex numbers.
\b backspace <BS>
\e escape <Esc>
\f formfeed <FF>
\n newline <NL>
\r return <CR>
\t tab <Tab>
\\ backslash
\" double quote
\<xxx> Special key named "xxx". e.g. "\<C-W>" for CTRL-W. This is for use
in mappings, the 0x80 byte is escaped.
To use the double quote character it must be escaped: "<M-\">".
Don't use <Char-xxxx> to get a utf-8 character, use \uxxxx as
mentioned above.
Note that "\xff" is stored as the byte 255, which may be invalid in some
encodings. Use "\u00ff" to store character 255 according to the current value
of 'encoding'.
Note that "\000" and "\x00" force the end of the string.
Since you are assigning to a register using the Vim expression language, this is definitely possible in a platform-independent way. Strings in Vim expressions understand the standard escape sequences, so it's best to replace ^M with \r and Esc with \x1b:
let @o = ":center\riSomeInsertedString\x1b"
As far as I know there is no list of special characters to be translated, but you can simply take all control characters (ASCII below 32) and translate them to the corresponding escape sequence "\xHexValue", where HexValue is the value of the character. Even \r (or ^M) can be translated to \x0d, as its ASCII value is 13 (0x0d hex).
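The translation rule described above (control characters below 32 become "\xHexValue", with \r for Enter) is mechanical, so it can be sketched as a small script. This is only an illustration in Python under stated assumptions: the function name to_vim_string is made up, it additionally escapes backslashes and double quotes so the result is a valid double-quoted Vim string, and it does not handle NUL, which per :help expr-quote would end the string.

```python
def to_vim_string(raw: str) -> str:
    """Render raw macro text as a Vim double-quoted string literal."""
    # Characters with a dedicated escape; everything else below 32 gets \xNN.
    named = {"\r": "\\r", "\\": "\\\\", '"': '\\"'}
    parts = []
    for ch in raw:
        if ch in named:
            parts.append(named[ch])
        elif ord(ch) < 32:
            # Control character: always emit two hex digits, e.g. ESC -> \x1b
            parts.append(f"\\x{ord(ch):02x}")
        else:
            parts.append(ch)
    return '"' + "".join(parts) + '"'


# A recorded macro containing a real carriage return (^M) and a real ESC (^[)
recorded = ":center\riSomeInsertedString\x1b"
print(to_vim_string(recorded))  # ":center\riSomeInsertedString\x1b"
```

The printed text can then be pasted after let @o = in a vimrc.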

Removing hex code ffa3 in Vim

I've got a file with a load of weird characters in it that I need to get rid of.
Using ga on the character reveals it has the following encodings:
<ᆪ> 65443, Hex ffa3, Octal 177643
But I can't seem to find it using :%s/\%xffa3//g. What am I doing wrong?
Look at :help \%x:
\%x2a Matches the character specified with up to two hexadecimal characters.
So Vim is actually matching three characters: <ff> (the character with code 0xff) followed by the literal text a3. Since you have a four-digit hex number, you need to use \%u:
:%s/\%uffa3//g
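The choice between \%x, \%u and \%U depends only on how many hex digits the code point needs (up to two, four and eight respectively, per :help /\%x). A minimal Python sketch of that rule follows; the helper name vim_char_atom is hypothetical, and it always pads to the full width so that following text cannot be read as extra digits.

```python
def vim_char_atom(code_point: int) -> str:
    """Return a Vim pattern atom matching the character with this code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if code_point <= 0xFF:
        return f"\\%x{code_point:02x}"   # up to two hex digits
    if code_point <= 0xFFFF:
        return f"\\%u{code_point:04x}"   # up to four hex digits
    return f"\\%U{code_point:08x}"       # up to eight hex digits


print(vim_char_atom(0x1B))     # \%x1b
print(vim_char_atom(0xFFA3))   # \%uffa3
print(vim_char_atom(0x1D4ED))  # \%U0001d4ed
```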
Alternatives
You can also insert the character directly into the command line via :help i_CTRL-V_digit (i.e. <C-v>uffa3), but if you already have instances of that character in your buffer (and near your cursor!), I'd just yank that char with yl and insert it in the command-line via <C-r>".

How do I remove the last six characters of every line in Vim?

I have the following characters being repeated at the end of every line:
^[[00m
How can I remove them from each line using the Vim editor?
When I give the command :%s/^[[00m//g, it doesn't work.
You could use :%s/.\{6}$// to literally delete 6 characters off the end of each line.
The : starts ex mode which lets you execute a command. % is a range that specifies that this command should operate on the whole file. The s stands for substitute and is followed by a pattern and replace string in the format s/pattern/replacement/. Our pattern in this case is .\{6}$ which means match any character (.) exactly 6 times (\{6}) followed by the end of the line ($) and replace it with our replacement string, which is nothing. Therefore, as I said above, this matches the last 6 characters of every line and replaces them with nothing.
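For readers more familiar with other regex dialects, Vim's .\{6}$ corresponds to .{6}$ in Python's re module. The sketch below (the function name drop_last_six is made up) applies it to one line at a time, and assumes the ^[ in the file is a literal caret followed by a bracket, so that the tail really is six characters long:

```python
import re


def drop_last_six(line: str) -> str:
    """Remove the last six characters of a line, whatever they are."""
    return re.sub(r".{6}$", "", line)


print(drop_last_six("total 12^[[00m"))  # total 12
# Lines shorter than six characters do not match and are left alone.
print(drop_last_six("abc"))             # abc
```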
I would use the global command.
Try this:
:g/$/norm $xxxxxx
or even:
:g/$/norm $5Xx
I think the key to this problem is to keep it generic and not specific to the characters you are trying to delete. That way the technique you learn will be applicable to many other situations.
Assuming this is an ANSI escape sequence, the ^[ stands for a single <Esc> character. You have to enter it by pressing Ctrl + V (or Ctrl + Q on many Windows Vim installations), followed by Esc. Notice how this is then highlighted in a slightly different color, too.
It's easy enough to replace the last six characters of every line being agnostic to what those characters are, but it leaves considerable room for error so I wouldn't recommend it. Also, if ^[ is an escape character, you're really looking for five characters.
Escape code
Using ga on the character ^[ you can determine whether it's an escape code, in which case the status bar would display
<^[> 27, Hex 1b, Octal 033
Assuming it is, you can replace everything using
:%s/\%x1b\[00m$//gc
With \%x1b coming from the hex value above. Note also that you have to escape the bracket ([) because it's a reserved character in Vim regex. $ makes sure it occurs at the end of a line, and the /gc flags will make it global and confirm each replacement (you can press a to replace all).
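The same pattern can be tried outside Vim. Here is a minimal Python sketch of the escape-code case (the names ANSI_RESET_AT_EOL and strip_trailing_reset are made up for illustration); it also confirms that the sequence is five characters long when ^[ is a real escape character:

```python
import re

# ESC (0x1b), a literal "[", then "00m", anchored at the end of the line
ANSI_RESET_AT_EOL = re.compile(r"\x1b\[00m$")


def strip_trailing_reset(line: str) -> str:
    """Remove a trailing ANSI 'ESC[00m' sequence from a line, if present."""
    return ANSI_RESET_AT_EOL.sub("", line)


print(len("\x1b[00m"))                              # 5
print(strip_trailing_reset("ls output\x1b[00m"))    # ls output
# A line without the sequence is returned unchanged.
print(strip_trailing_reset("plain line"))           # plain line
```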
Not escape code
It's a simple matter of escaping then. You can use either of the two below:
:%s/\^\[\[00m$//gc
:%s/\V^[[00m\$//gc
If they are all aligning, you can do a visual-block selection and delete it then.
Otherwise, if you have a sequence that you don't know how to type, you can visually select it by pressing v, then mark it and yank it with y (by default into register "). Then type :%s/<C-R>"//g to delete it.
Note:
<C-R>" puts the content of register " at the cursor position.
If you yanked it into another register, say "ay (yank to register a - the piglatin yank, as I call it) and forgot where you put it, you can look at the contents of your registers with :reg.
<C-R> is Vim speak for Ctrl+R
This seems to work fine when the line is more than 5 characters long:
:perldo $_ = substr $_, 0, -5
but when the line is 5 characters long or shorter, it does nothing.
Maybe there is an easy way in Perl to delete the last 5 characters of a string, but I don't really know it :)
Use this to delete:
:%s/^[[00m//gc

Enter Unicode characters with 8-digit hex code

How do I enter Unicode characters like 𝓭 without copying it to the clipboard and pasting it?
Things I know:
The command ga on the character 𝓭 gives me hex:0001d4ed.
I can copy it on the clipboard and paste it via "+p.
I know how to enter Unicode values that have a 4 digit hex code:
<C-v>u for example <C-v>u03b1 gives the α character.
You can use <C-v>U, that is, an uppercase u, to input an 8 digit hex codepoint character.
There is a Vim feature designed to simplify entering characters that
cannot be typed directly. It is called Digraphs (see :help digraphs).
To define a custom digraph for entering ‘𝓭’, use an Ex command similar
to the one below.
:dig dd 120045
where 120045 is the decimal representation of ‘𝓭’, as one can easily
confirm using the ga command.
Inserting a character using a digraph is simple:
Type Ctrl+K followed by the shortcut of that
digraph (dd for the above example).
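The decimal value that :dig expects can be cross-checked against the hex value reported by ga. A short Python sketch, in which the helper name digraph_command is hypothetical:

```python
def digraph_command(shortcut: str, char: str) -> str:
    """Build a :dig command that maps a two-character shortcut to char."""
    if len(shortcut) != 2 or len(char) != 1:
        raise ValueError("need a two-character shortcut and a single character")
    return f":dig {shortcut} {ord(char)}"


target = "\U0001d4ed"              # the character with hex code 0001d4ed
print(ord(target))                 # 120045
print(f"{ord(target):08x}")        # 0001d4ed
print(digraph_command("dd", target))  # :dig dd 120045
```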
There exists a Unicode plugin for Vim. According to the plugin description, this plugin has three main features:
Character/digraph completion using either the Unicode name or the codepoint.
Identify the character/digraph under the cursor.
Search for digraphs by name; transform two normal characters into their corresponding digraph.
