Why is .add_reaction not working with unicode emojis? - python-3.x

I cannot seem to get this single line of code to output anything except BAD REQUEST Unknown Emoji
await dBot.add_reaction(message,"\\U00000031")
I cannot find any reason online why this shouldn't work. What wonderful noob mistake am I making?

The string you're using isn't an escaped Unicode character, but an escaped backslash character followed by eight digit characters. You probably want only one backslash in the literal, which will let Python parse the literal into a single character as it seems you intend. I'm still not sure that will do what you expect though, since "\U00000031" is the character '1', not an emoji.
From your comment below, it sounds like the emoji you actually want is composed of two Unicode codepoints. The first is just the normal '1' character I discussed above, which you don't need any escapes to write. The second character is U+20E3 ('COMBINING ENCLOSING KEYCAP'), which can be written in a Python literal as '\u20E3' or '\U000020E3'. This puts a keyboard key image around whatever the previous character was, so the sequence "1\u20e3" will render as 1⃣ (which my browser doesn't handle too well, but yours might). I don't know for sure, but I'd be fairly confident that Discord would accept that, if it supports the 1 key emoji you're looking for at all (which I expect it does).
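To make the difference between the literals concrete, here is a minimal sketch using only the standard library. The variable names are just for illustration, and whether Discord accepts the final sequence is still the assumption stated above.

```python
import unicodedata

escaped_backslash = "\\U00000031"  # a backslash, 'U', and eight digit characters
digit_one = "\U00000031"           # one character: the digit '1'
keycap_one = "1\u20e3"             # '1' followed by U+20E3

print(len(escaped_backslash))      # length of the string from the question
print(digit_one == "1")            # the single-backslash escape is just '1'
print([unicodedata.name(c) for c in keycap_one])
```

The name lookup confirms that the second codepoint is the combining keycap rather than a standalone emoji.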

Related

Vim detecting implicit newline characters instead of the visible newline characters I am trying to strip

Here's an example of some text from which I'm trying to strip those newline characters, which appear explicitly in my vim, and replace them with actual newline characters that I don't see.
But when I search for a newline character using /[\n]/, what I get isn't these visible newline characters, but instead the implicit ones. So I can't do a search and replace.
How should I address this? Here is the text:
The Reason that can be reasoned\n is not the eternal Reason.The name that can\n be namedis not the eternal Name. The Unnamable is of heaven and earth the beginning.\n The Namable becomes of the\n ten thousand things the mother.Therefore it is said:\n '\n\n He\n who desireless is found\n The spiritual of the world will sound.\n But he who by desire is bound\n Sees the mere shell of things around.' These two things are the same in sour ce but different in name.\n Their sameness\n is called a mystery.Indeed
it is the mystery\n
You need to search for \\n, not [\n].
Doing:
%s/\\n/\r/g
should solve your problem. (Vim needs \r instead of \n here because, in the replacement part of :s, \r inserts a line break while \n inserts a NUL character.)
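The same distinction, a literal backslash followed by 'n' versus a real newline, can be sketched in Python. This is only an illustration of the idea, not a replacement for the Vim command, and the sample text is shortened from the question.

```python
# A raw string keeps the backslash and the 'n' as two separate characters,
# which is what the file in the question contains.
raw = r"The Reason that can be reasoned\n is not the eternal Reason."

# Replace the two-character sequence with an actual newline.
fixed = raw.replace("\\n", "\n")

print("\n" in raw)        # the raw text has no real newline
print(fixed.count("\n"))  # the fixed text has one
print(fixed)
```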

Literal Strings [Lua 5.1]

So I started to learn Lua (5.1) and I saw that thing called literal strings, and I have no idea what these do. The manual says \a is a bell but when I type
print('hello\athere')
The IDE prints a weird square with 'bel' written on it.
So if someone could help me and explain every one of them (literal strings), that would be really helpful.
P.S. I use Sublime Text 3.
Only ASCII between 0x20 and 0x7E are printable characters. How other characters are output, including '\a' and '\b', is up to the implementation.
'\a', the ASCII 7 for BEL, is designed to be used as an alert. A typical terminal would make an audible or visible alert when outputting '\a'. Your IDE chose to show something other than an alert. That's OK since it's up to the implementation.
Such sequences are called "escape sequences", and are found in many different languages. They are used to encode non-printable characters such as newlines in literal (hardcoded) strings.
Lua supports the following escape sequences:
\a: Bell
\b: Backspace
\f: Form feed
\n: Newline
\r: Carriage return
\t: Tab
\v: Vertical tab
\\: Backslash
\": Double quote
\': Single quote
\ddd: Decimal value (ddd is up to three decimal digits; note that Lua uses decimal here, not octal as in C)
\xNN: Hex value (Lua5.2/LuaJIT, NN is two hex digits)
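As a rough illustration, here is a Python sketch (not Lua) that prints the character code behind each of the named escapes. It relies on Python sharing these particular escapes with Lua; the numeric escapes differ between the two languages and are left out.

```python
# Map each escape as written in source code to the single character it produces.
escapes = {
    r"\a": "\a",
    r"\b": "\b",
    r"\f": "\f",
    r"\n": "\n",
    r"\r": "\r",
    r"\t": "\t",
    r"\v": "\v",
    r"\\": "\\",
}

for written, char in escapes.items():
    # isprintable() is False for control characters such as BEL.
    print(f"{written} -> code {ord(char):3d}, printable: {char.isprintable()}")
```

Only the backslash itself is printable; the rest are control characters, which is why an editor has to decide for itself how to show them.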
A literal is nothing more than a value written directly in the code, e.g.: 'some text'.
The '\a' is something different. A special "char", that is used to output a sound (was using the pc-speaker some aeons ago).

Replace character with a safe character and vice-versa

Here's my problem:
I need to store sentences "somewhere" (it doesn't matter where).
The sentences must not contain spaces.
When I extract the sentences from that "somewhere", I need to restore the spaces.
So, before storing the sentence "I am happy" I could replace the spaces with a safe character, such as &. In C#:
theString.Replace(' ', '&');
This would yield 'I&am&happy'.
And when retrieving the sentence, I would do the reverse:
theString.Replace('&', ' ');
But what if the original sentence already contains the '&' character?
Say I would do the same thing with the sentence 'I am happy & healthy'. With the design above, the string would come back with the '&' missing, since the original '&' char has been replaced with a space along with the ones I inserted.
(Of course, I could change the & character to a more unlikely symbol, such as ¤, but I want this to be bullet proof)
I used to know how to solve this, but I forgot how.
Any ideas?
Thanks!
Fredrik
Maybe you can use url encoding (percent encoding) as an inspiration.
Characters that are not valid in a url are escaped by writing %XX where XX is a numeric code that represents the character. The % sign itself can also be escaped in the same way, so that way you never run into problems when translating it back to the original string.
There are probably other similar encodings, and for your own application you can use an & just as well as a %, but by using an existing encoding like this, you can probably also find existing functions to do the encoding and decoding for you.
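For what it's worth, here is a minimal sketch of that idea in Python rather than C#, using the standard library's percent-encoding functions. The helper names store and restore are made up for this example.

```python
from urllib.parse import quote, unquote

def store(sentence: str) -> str:
    """Escape every reserved character, including spaces, '&' and '%'."""
    return quote(sentence, safe="")

def restore(stored: str) -> str:
    """Reverse the escaping done by store()."""
    return unquote(stored)

original = "I am happy & healthy"
stored = store(original)
print(stored)                       # contains no spaces
print(restore(stored) == original)  # the round trip is lossless
```

Because the escape character '%' is itself escaped (as %25), a sentence that already contains '%' or '&' still survives the round trip.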

Why does question mark show up in web browser?

I was (re)reading Joel's great article on Unicode and came across this paragraph, which I didn't quite understand:
For example, you could encode the Unicode string for Hello (U+0048
U+0065 U+006C U+006C U+006F) in ASCII, or the old OEM Greek Encoding,
or the Hebrew ANSI Encoding, or any of several hundred encodings that
have been invented so far, with one catch: some of the letters might
not show up! If there's no equivalent for the Unicode code point
you're trying to represent in the encoding you're trying to represent
it in, you usually get a little question mark: ? or, if you're really
good, a box. Which did you get? -> �
Why is there a question mark, and what does he mean by "or, if you're really good, a box"? And what character is he trying to display?
There is a question mark because the encoding process recognizes that the encoding can't support the character, and substitutes a question mark instead. By "if you're really good," he means, "if you have a newer browser and proper font support," you'll get a fancier substitution character, a box.
In Joel's case, he isn't trying to display a real character, he literally included the Unicode replacement character, U+FFFD REPLACEMENT CHARACTER.
It’s a rather confusing paragraph, and I don’t really know what the author is trying to say. Anyway, different browsers (and other programs) have different ways of handling problems with characters. A question mark “?” may appear in place of a character for which there is no glyph in the font(s) being used, so that it effectively says “I cannot display the character.” Browsers may alternatively use a small rectangle, or some other indicator, for the same purpose.
But the “�” symbol is REPLACEMENT CHARACTER that is normally used to indicate data error, e.g. when character data has been converted from some encoding to Unicode and it has contained some character that cannot be represented in Unicode. Browsers often use “�” in display for a related purpose: to indicate that character data is malformed, containing bytes that do not constitute a character, in the character encoding being applied. This often happens when data in some encoding is being handled as if it were in some other encoding.
So “�” does not really mean “unknown character”, still less “undisplayable character”. Rather, it means “not a character”.
A question mark appears when a byte sequence in the raw data does not match the data's character set, so it cannot be decoded properly. That happens if the data is malformed, if the data's charset is explicitly stated incorrectly in the HTTP headers or the HTML itself, if the charset is guessed incorrectly by the browser when other information is missing, or if the user's browser settings override the data's charset with an incompatible charset.
A box appears when a decoded character does not exist in the font that is being used to display the data.
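Two of the failure modes discussed above can be reproduced in Python: encoding a character the target encoding cannot represent, and decoding bytes that are not valid in the assumed encoding. This is a small sketch and the sample strings are arbitrary. A missing glyph in a font is a display issue and cannot be shown this way.

```python
# 1. Encoding: ASCII has no Greek letters, so each one is replaced by '?'.
greek = "\u0393\u03b5\u03b9\u03ac"  # four Greek letters
as_ascii = greek.encode("ascii", errors="replace")
print(as_ascii)

# 2. Decoding: Latin-1 bytes misread as UTF-8 contain an invalid sequence,
#    which is replaced by U+FFFD REPLACEMENT CHARACTER.
latin1_bytes = "caf\u00e9".encode("latin-1")
misread = latin1_bytes.decode("utf-8", errors="replace")
print(misread)
print("\ufffd" in misread)
```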
Just what it says - some browsers show "a weird character" or a question mark for characters outside of the current known character set. It's their "hey, I don't know what this is" character. Get an old version of Netscape, paste some text from Microsoft Word which is using smart quotes, and you'll get question marks.
http://blog.salientdigital.com/2009/06/06/special-characters-showing-up-as-a-question-mark-inside-of-a-black-diamond/ has a decent explanation.

Why doesn't Vim's errorformat take regular expressions?

Vim's errorformat (for parsing compile/build errors) uses an arcane format from C for parsing errors.
Trying to set up an errorformat for nant seems almost impossible; I've tried for many hours and can't get it. I also see from my searches that a lot of people seem to be having the same problem. A regex to solve this would take minutes to write.
So why does vim still use this format? It's quite possible that the C parser is faster but that hardly seems relevant for something that happens once every few minutes at most. Is there a good reason or is it just an historical artifact?
It's not that Vim uses an arcane format from C. Rather it uses the ideas from scanf, which is a C function. This means that the string that matches the error message is made up of 3 parts:
whitespace
characters
conversion specifications
Whitespace is your tabs and spaces. Characters are the letters, numbers and other normal stuff. Conversion specifications are sequences that start with a '%' (percent) character. In scanf you would typically match an input string against %d or %f to convert to integers or floats. With Vim's error format, you are searching the input string (error message) for files, lines and other compiler specific information.
If you were using scanf to extract an integer from the string "99 bottles of beer", then you would use:
int i;
scanf("%d bottles of beer", &i); // i would be 99, string read from stdin
Now with Vim's error format it gets a bit trickier but it does try to match more complex patterns easily. Things like multiline error messages, file names, changing directory, etc, etc. One of the examples in the help for errorformat is useful:
Error 275
line 42
column 3
' ' expected after '--'
The appropriate error format string has to look like this:
:set efm=%EError\ %n,%Cline\ %l,%Ccolumn\ %c,%Z%m
Here %E tells Vim that it is the start of a multi-line error message. %n is an error number. %C is the continuation of a multi-line message, with %l being the line number, and %c the column number. %Z marks the end of the multiline message and %m matches the error message that would be shown in the status line. You need to escape spaces with backslashes, which adds a bit of extra weirdness.
While it might initially seem easier with a regex, this mini-language is specifically designed to help with matching compiler errors. It has a lot of shortcuts in there. I mean you don't have to think about things like matching multiple lines, multiple digits, matching path names (just use %f).
Another thought: How would you map numbers to mean line numbers, or strings to mean files or error messages if you were to use just a normal regexp? By group position? That might work, but it wouldn't be very flexible. Another way would be named capture groups, but then this syntax looks a lot like a short hand for that anyway. You can actually use regexp wildcards such as .* - in this language it is written %.%#.
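For comparison, this is roughly what the named-capture-group approach would look like for the multi-line example above, sketched in Python. The pattern and group names are my own choice, not anything Vim defines.

```python
import re

# One named group per piece of information that errorformat extracts
# with %n, %l, %c and %m.
ERROR_RE = re.compile(
    r"Error (?P<number>\d+)\n"
    r"line (?P<line>\d+)\n"
    r"column (?P<column>\d+)\n"
    r"(?P<message>.*)"
)

output = "Error 275\nline 42\ncolumn 3\n' ' expected after '--'"
match = ERROR_RE.match(output)
fields = match.groupdict() if match else {}
print(fields)
```

Each named group plays the role of one conversion specification, which is why the errorformat syntax reads like a shorthand for this.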
OK, so it is not perfect. But it's not impossible either and makes sense in its own way. Get stuck in, read the help and stop complaining! :-)
I would recommend writing a post-processing filter for your compiler that uses regular expressions or whatever, and outputs messages in a simple format that is easy to write an errorformat for. Why learn some new, baroque, single-purpose language unless you have to?
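A filter along those lines might look like the sketch below. The input format, path(line,col): message, is purely hypothetical and would have to be adapted to whatever your build tool really prints; the function names are also invented for this example.

```python
import re

# Hypothetical input format: path(line,col): message
PATTERN = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+),(?P<col>\d+)\): (?P<msg>.*)$"
)

def simplify(line):
    """Return 'file:line:col: message', or None if the line is not an error."""
    m = PATTERN.match(line.rstrip("\n"))
    if m is None:
        return None
    return "{file}:{line}:{col}: {msg}".format(**m.groupdict())

def filter_lines(lines):
    """Keep only the lines that look like errors, rewritten in the simple format."""
    return [out for out in map(simplify, lines) if out is not None]

sample = [
    "Buildfile: build.xml\n",
    "src/Foo.cs(12,5): error CS1002: ; expected\n",
]
print(filter_lines(sample))
```

Fed from sys.stdin in a real script, output in this shape should be matchable with a short errorformat such as %f:%l:%c:\ %m.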
According to :help quickfix,
it is also possible to specify (nearly) any Vim supported regular
expression in format strings.
However, the documentation is confusing and I didn't put much time into verifying how well it works and how useful it is. You would still need to use the scanf-like codes to pull out file names, etc.
They are a pain to work with, but to be clear: you can use regular expressions (mostly).
From the docs:
Pattern matching
The scanf()-like "%*[]" notation is supported for backward-compatibility
with previous versions of Vim. However, it is also possible to specify
(nearly) any Vim supported regular expression in format strings.
Since meta characters of the regular expression language can be part of
ordinary matching strings or file names (and therefore internally have to
be escaped), meta symbols have to be written with leading '%':
%\  The single '\' character. Note that this has to be escaped ("%\\") in ":set errorformat=" definitions.
%.  The single '.' character.
%#  The single '*'(!) character.
%^  The single '^' character. Note that this is not useful, the pattern already matches start of line.
%$  The single '$' character. Note that this is not useful, the pattern already matches end of line.
%[  The single '[' character for a [] character range.
%~  The single '~' character.
When using character classes in expressions (see |/\i| for an overview),
terms containing the "\+" quantifier can be written in the scanf() "%*"
notation. Example: "%\\d%\\+" ("\d\+", "any number") is equivalent to "%*\\d".
Important note: The \(...\) grouping of sub-matches can not be used in format
specifications because it is reserved for internal conversions.
lol try looking at the actual vim source code sometime. It's a nest of C code so old and obscure you'll think you're on an archaeological dig.
As for why vim uses the C parser, there are plenty of good reasons starting with that it's pretty universal. But the real reason is that sometime in the past 20 years someone wrote it to use the C parser and it works. No one changes what works.
If it doesn't work for you the vim community will tell you to write your own. Stupid open source bastards.
