What is this weird character? - text

Some time ago I found this weird character. I can't paste it here because it doesn't work, but I have images of it. It looks like a space character at first, but it isn't.
When I google it, I get this result and some random character in the search bar. I also can't find it in the Unicode/ASCII character tables.
Does anyone know what this character is and what its purpose is?

Based on your Google search string, you have a long series of the character U+3164, the "Hangul Filler." Hangul is the Korean alphabet.
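If you can paste the mystery character into a script, Python's standard unicodedata module can name it for you. This is only a minimal sketch, assuming the character really is U+3164 as identified above; the helper name describe is made up for illustration.

```python
import unicodedata

def describe(ch):
    """Return the code point, Unicode name, and general category of one character."""
    name = unicodedata.name(ch, "<unnamed>")
    category = unicodedata.category(ch)
    return "U+%04X %s (category %s)" % (ord(ch), name, category)

# The character from the question, written as an escape so it stays visible here.
print(describe("\u3164"))
# An ordinary space, for comparison.
print(describe(" "))
```

The category is worth a look: an ordinary space is in a separator category, so a character that merely looks blank but reports a different category will not be treated as whitespace by most software.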

Related

How to get unicode of characters from 55296 to 56319 in Excel

I generated a list of characters in Excel, from character codes 1 to 66535.
I am trying to get the code point back by using the UNICODE function. However, Excel returns #VALUE! for character codes from 55296 to 56319.
Please advise if there is any other function that can return the proper Unicode values.
Thank you.
The range you are listing is a special range in Unicode: surrogates.
They are Unicode code points, but the problem is that you cannot have them on their own in text. Windows uses UCS-2/UTF-16 as its internal encoding, so to represent code points above 65535 it uses two surrogates: one in the range 0xD800-0xDBFF (high surrogate) followed by one in the range 0xDC00-0xDFFF (low surrogate). By combining these two, you can reach all Unicode code points.
So you should never have a single surrogate or a mismatched surrogate, e.g. a high surrogate not followed by a low surrogate, or a low surrogate not preceded by a high surrogate.
So just skip such codes. Or better, use them correctly to get characters above 65535.
Note: you cannot represent every Unicode character with a single code point. Many characters require combining several code points (there is a whole category of "combining characters" in Unicode). E.g. the zero with an oblique line is rendered with two code points: the normal zero and a variation selector. Precomposed accented characters are also very limited (often with just one accent per character). And that is without going into more complex scripts.
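The surrogate arithmetic described above can be sketched in a few lines of Python. The function names to_surrogates and from_surrogates are hypothetical helpers, not part of Excel or any library, and the example code point U+1F600 is just an arbitrary choice above 65535.

```python
def to_surrogates(code_point):
    """Split a code point above 0xFFFF into a (high, low) UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above 0xFFFF need surrogates")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

def from_surrogates(high, low):
    """Combine a high and a low surrogate back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogates(0x1F600)
print(hex(high), hex(low))              # 0xd83d 0xde00
print(hex(from_surrogates(high, low)))  # 0x1f600
```

A lone value from either surrogate range is rejected by from_surrogates, which mirrors the advice above: a surrogate only means something as half of a pair.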
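To make the point about combining characters concrete, here is a small Python sketch using only the standard unicodedata module. The accented letter is an illustrative choice, not something taken from the question.

```python
import unicodedata

composed = "\u00e9"     # e with acute accent as a single code point
decomposed = "e\u0301"  # plain e followed by COMBINING ACUTE ACCENT

print(len(composed), len(decomposed))   # 1 2
print(composed == decomposed)           # False
# NFC normalization merges the pair into the single precomposed code point.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.name(decomposed[1]))  # COMBINING ACUTE ACCENT
```

Both strings display as the same letter, yet only one of them is a single code point, so a list built from one code per cell cannot cover every character a user can see.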

Vim detecting implicit newline characters instead of the visible newline characters I am trying to strip

Here's an example of some text from which I'm trying to strip the \n sequences that appear explicitly in Vim, replacing them with actual (invisible) newline characters.
But when I search for a newline character using /[\n]/, what I get isn't these visible \n sequences but the implicit line endings, so I can't do a search and replace.
How should I address this? Here is the text:
The Reason that can be reasoned\n is not the eternal Reason.The name that can\n be namedis not the eternal Name. The Unnamable is of heaven and earth the beginning.\n The Namable becomes of the\n ten thousand things the mother.Therefore it is said:\n '\n\n He\n who desireless is found\n The spiritual of the world will sound.\n But he who by desire is bound\n Sees the mere shell of things around.' These two things are the same in sour ce but different in name.\n Their sameness\n is called a mystery.Indeed
it is the mystery\n
You need to search for \\n, not [\n].
Doing:
%s/\\n/\r/g
should solve your problem. (Vim needs \r rather than \n here because, in the replacement part of :s, \r inserts a line break while \n inserts a NUL character.)
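If you would rather do the same replacement outside Vim, a minimal Python sketch follows. It assumes the text contains the two literal characters backslash and n, as in the sample above; the function name expand_literal_newlines is invented for this example.

```python
def expand_literal_newlines(text):
    """Replace each literal backslash + n pair with a real newline character."""
    # "\\n" is the two-character sequence; "\n" is an actual line feed.
    return text.replace("\\n", "\n")

# A raw string keeps the backslash and the n as two separate characters.
sample = r"The Reason that can be reasoned\n is not the eternal Reason."
print(expand_literal_newlines(sample))
```

The doubled backslash plays the same role here as in the Vim pattern: it asks for a literal backslash instead of the newline escape.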

Why does this Excel string comparison fail?

Is it an Excel bug? Has anyone experienced this issue? Please help.
Just a thought, but here's what MS says about TRIM:
The TRIM function was designed to trim the 7-bit ASCII space character
(value 32) from text. In the Unicode character set, there is an
additional space character called the nonbreaking space character that
has a decimal value of 160. This character is commonly used in Web
pages as the HTML entity, &nbsp;. By itself, the TRIM function does
not remove this nonbreaking space character.
You might try this to replace the non-breaking space (if that is your problem here):
=TRIM(SUBSTITUTE(A5,CHAR(160),CHAR(32)))
I would have to agree with @Jeeped. Your formula looks correct in all respects. It must be a non-printing character. If this data is coming from some outside source (i.e. another file), then there very well could be a non-printing character. I just typed in everything you had manually and came up with this.
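The effect is easy to reproduce outside Excel. The Python sketch below uses made-up cell contents and a hypothetical helper, normalize_spaces, to show two strings that look identical but compare unequal until the non-breaking space (code 160) is replaced. Unlike Excel's TRIM, this helper does not collapse repeated spaces inside the text.

```python
NBSP = "\u00a0"  # non-breaking space, decimal 160

def normalize_spaces(text):
    """Replace non-breaking spaces with ordinary spaces, then trim both ends."""
    return text.replace(NBSP, " ").strip(" ")

a = "Total" + NBSP  # looks like "Total" followed by a space
b = "Total"

print(a == b)                    # False
print(normalize_spaces(a) == b)  # True
# Listing the character codes is a quick way to spot the culprit.
print([ord(ch) for ch in a])     # [84, 111, 116, 97, 108, 160]
```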

How can I find the character code of a special character in my text editor?

When pasting text from outside sources into a plain-text editor (e.g. TextMate or Sublime Text 2) a common problem is that special characters are often pasted in as well. Some of these characters render fine, but depending on the source, some might not display correctly (usually showing up as a question mark with a box around it).
So this is actually 2 questions:
1. Given a special character (e.g., ’ or ♥) can I determine the UTF-8 character code used to display that character from inside my text editor, and/or convert those characters to their character codes?
2. For those "extra-special" characters that come in as garbage, is there any way to figure out what encoding was used to display that character in the source text, and can those characters somehow be converted to UTF-8?
My favorite site for looking up characters is fileformat.info. They have a great Unicode character search that includes a lot of useful information about each character and its various encodings.
If you see the question mark with a box, that means you pasted something that can't be interpreted, often because it's not legal UTF-8 (not every byte sequence is legal UTF-8). One possibility is that it's UTF-16 with an endian mode that your editor isn't expecting. If you can get the full original source into a file, the file command is often the best tool for determining the encoding.
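For the first question, a short script can stand in when the editor has no built-in lookup. This is a sketch using only Python's standard library; char_report is a hypothetical helper name, and the two sample characters are the ones from the question.

```python
import unicodedata

def char_report(text):
    """Return (character, code point, UTF-8 bytes in hex, Unicode name) per character."""
    rows = []
    for ch in text:
        utf8_hex = " ".join("%02x" % b for b in ch.encode("utf-8"))
        rows.append((ch, "U+%04X" % ord(ch), utf8_hex, unicodedata.name(ch, "<unnamed>")))
    return rows

for row in char_report("\u2019\u2665"):
    print(row)
```

Running it on the two sample characters shows U+2019 (bytes e2 80 99) and U+2665 (bytes e2 99 a5), which you can then look up on a character reference site.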
At &what I built a tool to focus on searching for characters. It indexes all the Unicode and HTML entity tables, but also supplements with hacker dictionaries and a database of keywords I've collected, so you can search for words like heart, quot, weather, umlaut, hash, cloverleaf and get what you want. By focusing on search, it avoids having to hunt around the Unicode pages, which can be frustrating. Give it a try.

How to search for a character that displays as "<85>" in Vim

I have a file that was converted from EBCDIC to ASCII. Where there used to be new lines there are now characters that show up as <85> (a symbol representing a single character, not the four characters it appears to be) and the whole file is on one line. I want to search for them and replace them all with new lines again, but I don't know how.
I tried putting the cursor over one and using * to search for the next occurrence, hoping that it might show up in my / search history. That didn't work, it just searched for the word that followed the <85> character.
I searched Google, but didn't see anything obvious.
My goal is to build a search and replace string like:
:%s/<85>/\n/g
Which currently just gives me:
E486: Pattern not found: <85>
I found "Find & Replace non-printable characters in vim" searching Google. It seems like you should be able to do:
:%s/\%x85/\r/gc
Omit the c to do the replacement without prompting, try with c first to make sure it is doing what you want it to do.
In Vim, typing :h \%x gives more details. In addition to \%x, you can use \%d and \%o for decimal and octal codes, and \%u and \%U for hexadecimal codes of up to four and up to eight digits, respectively.
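The different pattern forms all describe the same number in different bases. As a quick sanity check, the Python sketch below prints the equivalent Vim patterns for one character code; vim_patterns is a hypothetical helper written for this illustration, not something Vim provides.

```python
def vim_patterns(code):
    """Return the equivalent Vim search atoms for one character code."""
    return {
        "decimal": "\\%%d%d" % code,
        "octal": "\\%%o%o" % code,
        "hex": "\\%%x%02x" % code,
        "hex4": "\\%%u%04x" % code,
    }

for form, pattern in vim_patterns(0x85).items():
    print(form, pattern)
```

For 0x85 this prints \%d133, \%o205, \%x85 and \%u0085, any of which can be used in place of \%x85 in the substitute command above.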
To search for special characters, for example the win1252 bytes shown as <80>, <90>, <9d>, ..., type:
/\%u80, /\%u90, /\%u9d ...
from the editor.
Similarly for octal, decimal, or hex, type: /\%oYourCode, /\%dYourCode, /\%xYourCode.
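Try this: :%s/<85>/^M/g
Note: type ^M by pressing Ctrl-V and then Ctrl-M (Enter). The <85> must likewise be entered as the actual character (Ctrl-V x85), not as the four characters shown.
Or, if you don't mind using another tool,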
awk '{gsub("<85>","\n")}1' file
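Another option is to work on the raw bytes, which sidesteps the question of how to type the character at all. The Python sketch below assumes the converted file is in a single-byte encoding where every 0x85 byte is a stray line terminator (0x85 is the NEL, "next line", control code); in a UTF-8 file the same byte can be part of a longer sequence, so check before running it. The function name and sample data are invented for illustration.

```python
def nel_to_newline(data):
    """Replace every 0x85 byte with a line feed (0x0A)."""
    return data.replace(b"\x85", b"\n")

# Invented sample standing in for the one-line converted file.
raw = b"first record\x85second record\x85"
print(nel_to_newline(raw).decode("ascii"))
```

To apply it to a real file, read the file in binary mode ("rb"), pass the bytes through nel_to_newline, and write the result back in binary mode ("wb").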
