Accent problems: charset / utf-8 / The Lounge IRC

I have a problem using the IRC client The Lounge.
Accented characters such as "é à ..." typed from my own client are displayed correctly,
but when a user sends accents from another client, they are no longer displayed in The Lounge and are replaced with "?" (see screenshot).
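A likely cause is a charset mismatch rather than a bug in The Lounge: The Lounge expects UTF-8, so bytes sent by a client that still uses a legacy encoding such as ISO-8859-1 cannot be decoded and come through as replacement characters. A minimal Python sketch of the effect (the latin-1 encoding is only an assumption about what the other client sends):

raw = "é à".encode("latin-1")                   # bytes a legacy (ISO-8859-1) client would put on the wire
print(raw.decode("utf-8", errors="replace"))    # prints "� �", which many clients render as "?"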

Related

XPath concat function: \n appears as text

I'm writing XPath scripts that use the concat function to display the result on separate lines, using XPath 1.0.
XML example
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
XPath
concat("To : ", /note/to , ' \n ', "From : ", /note/from, ' \n ', "Heading : ", /note/heading, ' \n ', "Body : ", /note/body)
Result
To : Tove \n From : Jani \n Heading : Reminder \n Body : Don't forget me this weekend!
However, when I use concat with two elements and "\n" in between, it works fine.
XPath:
concat("To : ", /note/to , ' \n ', "From : ", /note/from)
Result:
To : Tove
From : Jani
XPath doesn't recognise backslash as an escape character. "\n" simply represents the two characters backslash and "n"; it doesn't represent a newline.
If your XPath expression is embedded in a host language such as Java or C#, then the host language will convert \n to a newline before the XPath engine ever sees it. But you don't say anything about the host language in which the expressions appear.
If your XPath expression appears within XSLT or XQuery, then you should write a newline as the character reference &#10; or &#xA;. If it's some other environment, then you can use a literal newline.
So the answer is, it all depends on the host environment in which you are writing these expressions.
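For instance, if the host language were Python (purely an assumption for illustration; lxml stands in here for whatever XPath 1.0 engine is actually in use), the host string literal turns \n into a real newline before the XPath engine ever sees it:

from lxml import etree

xml = b"""<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>"""

doc = etree.fromstring(xml)
# Python converts "\n" to a real newline inside this string,
# so the XPath string literal already contains a line break.
expr = 'concat("To : ", /note/to, "\n", "From : ", /note/from)'
print(doc.xpath(expr))

This prints the two parts on separate lines, because the newline was supplied by the host language, not by XPath.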

How to send emoji as part of HTTP-OUTPUT-PLUGIN for logstash to Telegram bot

I want to send an emoji as part of a message from Logstash, using the http output plugin, to a Telegram bot and have it displayed in a Telegram group. Below is the code being used. Please suggest a solution.
http {
    format => "json"
    http_method => "post"
    url => "https://api.telegram.org:443/bot481032672:AAGhbY0l6cuy_HXA-SyiJHbwRznPdA3KPaQ/sendMessage"
    mapping => {
        "chat_id" => "-191360460"
        "text" => "Severity=%{SNMPv2-SMI::enterprises.111.15.3.1.1.5.1}
App Name=%{kpi_match_name}
RUEI KPI Name=%{SNMPv2-SMI::enterprises.111.15.3.1.1.14.1}
Current Value=%{kpi_cur_value}"
    }
}
You can send emoji in two ways: \uXXXX and \U0000XXXX. For example, for the emoji U+1F601 the short form would be \u1F601, but I recommend using the other format, \U0001F601. If the part after the "+" in the code point (U+1F601) has 5 hex digits, you pad with 3 zeros: \U + 000 + 1F601 = \U0001F601; if it has 4 hex digits, you pad with 4 zeros, for example U+D83D becomes \U + 0000 + D83D = \U0000D83D.
On this site you can find the Unicode emoji you need: https://unicode.org/emoji/charts/full-emoji-list.html
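As a quick sanity check of the padding rule, Python happens to use the same \uXXXX / \UXXXXXXXX escape syntax (this is only an illustration, not the Logstash configuration itself):

print("\U0001F601")              # 😁  U+1F601 -> \U + 000 + 1F601
print("\u00E9", "\U000000E9")    # é é -- short \u and padded \U forms of U+00E9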
Here's what worked for me.
Add
"parse_mode" => "HTML"
to your "mapping" section. This will render the text as HTML.
See https://core.telegram.org/bots/api#sendmessage for details.
Add the emoji as "&#x1F622;". You can find the list of codes here: https://apps.timwhitlock.info/emoji/tables/unicode
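For reference, here is the same sendMessage call made directly from Python with the requests library (the token and chat id below are placeholders, and the sketch is independent of Logstash); it combines parse_mode with an emoji written using the \U escape discussed above.

import requests

BOT_TOKEN = "123456:ABC-DEF-your-token"   # placeholder
CHAT_ID = "-191360460"                    # placeholder chat id

resp = requests.post(
    f"https://api.telegram.org/bot{BOT_TOKEN}/sendMessage",
    json={
        "chat_id": CHAT_ID,
        "text": "Severity=CRITICAL \U0001F622",  # 😢 sent as a real Unicode character
        "parse_mode": "HTML",
    },
)
print(resp.status_code, resp.json())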

vim text replacement : :%s/é/&&eacute/g foreign language

I have a text with multiple French characters; I'm trying to replace all the occurrences with an HTML entity.
Example: été should become &eacute;t&eacute;
Right now I'm getting either an error or weird text:
:%s/é/&&eacute;/g => this gives me ééeacute;tééeacute;
:%s/é/&eacute;/g => this gives me éeacute;téeacute;
:%s/é//&eacute;/g => this gives me an error: Trailing characters
In the replacement part of a substitution, & is special: it represents the whole match.
J'ai mangé du paté de campagne.
:s/é/&eacute;/g
J'ai mangéeacute; du patéeacute; de campagne.
Escape it to obtain the desired literal &:
:s/é/\&eacute;/g
J'ai mang&eacute; du pat&eacute; de campagne.
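If you would rather do the conversion outside Vim, a small Python sketch using html.entities.codepoint2name achieves the same replacement for every accented character at once (the batch approach is an addition here, not part of the answer above):

from html.entities import codepoint2name

def to_named_entities(text):
    """Replace each non-ASCII character that has a named HTML entity."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        out.append(f"&{name};" if ord(ch) > 127 and name else ch)
    return "".join(out)

print(to_named_entities("été"))   # &eacute;t&eacute;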

Inconsistent token handling in ANTLR4

The ANTLR4 book references a multi-mode example
https://github.com/stfairy/learn-antlr4/blob/master/tpantlr2-code/lexmagic/ModeTagsLexer.g4
lexer grammar ModeTagsLexer;
// Default mode rules (the SEA)
OPEN : '<' -> mode(ISLAND) ; // switch to ISLAND mode
TEXT : ~'<'+ ; // clump all text together
mode ISLAND;
CLOSE : '>' -> mode(DEFAULT_MODE) ; // back to SEA mode
SLASH : '/' ;
ID : [a-zA-Z]+ ; // match/send ID in tag to parser
https://github.com/stfairy/learn-antlr4/blob/master/tpantlr2-code/lexmagic/ModeTagsParser.g4
parser grammar ModeTagsParser;
options { tokenVocab=ModeTagsLexer; } // use tokens from ModeTagsLexer.g4
file: (tag | TEXT)* ;
tag : '<' ID '>'
| '<' '/' ID '>'
;
I'm trying to build on this example, but using the « and » characters as delimiters. If I simply substitute them, I get error 126:
cannot create implicit token for string literal in non-combined grammar: '«'
In fact, this seems to occur as soon as I have the « character in the parser tag rule.
tag : '«' ID '>';
with
OPEN : '«' -> pushMode(ISLAND);
TEXT : ~'«'+;
Is there some antlr foo I'm missing? This is using antlr4-maven-plugin 4.2.
The wiki mentions something along these lines, but the way I read it, that contradicts the example on GitHub and anecdotal experience when using <. See "Redundant String Literals" at https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Lexer+Rules
One of the following is happening:
You forgot to update the OPEN rule in ModeTagsLexer.g4 to use the following form:
OPEN : '«' -> mode(ISLAND) ;
You found a bug in ANTLR 4, which should be reported to the issue tracker.
Have you specified the file encoding that ANTLR should use when reading the grammar? It should be okay with European characters less than 255 but...

What encoding is this and how can I decode it?

I've got an old project file with translations to Portuguese where special characters are broken:
error.text.required=\u00C9 necess\u00E1rio o texto.
error.categoryid.required=\u00C9 necess\u00E1ria a categoria.
error.email.required=\u00C9 necess\u00E1rio o e-mail.
error.email.invalid=O e-mail \u00E9 inv\u00E1lido.
error.fuel.invalid=\u00C9 necess\u00E1rio o tipo de combust\u00EDvel.
error.regdate.invalid=\u00C9 necess\u00E1rio ano de fabrica\u00E7\u00E3o.
error.mileage.invalid=\u00C9 necess\u00E1ria escolher a quilometragem.
error.color.invalid=\u00C9 necess\u00E1ria a cor.
Can you tell me how to decode the file to use the common Portuguese letters?
Thanks
The "\u" is the prefix for a Unicode escape. You can use the strings as-is and the diacritics will show up in the output. In Python (2) the code would be something like:
print u"\u00C9 necess\u00E1rio o texto."
which outputs:
É necessário o texto.
Otherwise, you need to convert them to their ASCII equivalents; you can do a simple find/replace. I ended up writing a function like that for converting Romanian diacritics a while ago, but I had dynamic strings coming in...
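If you want to convert the whole file rather than individual strings, here is a Python 3 sketch (the file name is a placeholder; the assumption is that the file contains only ASCII plus \uXXXX escapes, as Java .properties files do):

# Decode Java-style \uXXXX escapes line by line
with open("messages_pt.properties", encoding="ascii") as f:
    for line in f:
        print(line.rstrip("\n").encode("ascii").decode("unicode_escape"))

On JDK 8 and earlier, the bundled native2ascii -reverse tool performs the same conversion.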
Smells to me like this is Unicode?
\u = prefix for a Unicode character
00E1 = hex code of the 16-bit code point.
Not sure what the format is - I would ask the sender, but I would try this approach to decode it.
Found it ;)
http://www.fileformat.info/info/unicode/char/20/index.htm
Look at the tables with source code. This could be a C++ source file; this is the way you write Unicode characters in source code.
