Global case insensitive regex matching for bot framework composer intents? - bot-framework-composer

In JavaScript the regex would be something like /word/gi. How do I do the equivalent in the Bot Framework Composer regex intent recognizer?

You can refer to the documentation here.
Example: book a flight to (?<>.*)
https://learn.microsoft.com/en-us/composer/how-to-define-triggers#intent-recognized

Related

spaCy: how do I make a matcher which is noun-noun without white space within it?

I tried to make a matcher which could detect words like
'all-purpose'
I was trying to make a pattern like
pattern=[{'POS':'NOUN'}, {'ORTH':'-'},{'POS':'NOUN'}]
However, I realized that it only finds matches like
'all - purpose', with white space between the tokens, instead of 'all-purpose'.
How could I make a matcher like this?
It has to be a generalized pattern like noun-noun instead of
specific words like 'Barack Obama' as in the example in the spaCy documentation.
Best,
What exactly are you trying to match? Using en_core_web_sm, "all-purpose" is three tokens and "all" has the ADV POS tag for me. So that might be the issue with your match pattern. If you just want hyphenated words, this might be a better match:
pattern = [{'IS_ALPHA': True}, {'ORTH':'-'}, {'IS_ALPHA': True}]
More generally, you are correct that your pattern will only match three tokens, though that doesn't require white space; it depends on how the tokenizer works. For example, "that's" has no spaces but is two tokens.
If you are finding hyphenated words that occur as one token and want to match them, you can use regular expressions in Matcher rules. Here's an example of how that would work from the docs:
pattern = [{"TEXT": {"REGEX": "deff?in[ia]tely"}}]
In your case it could just look like this:
pattern = [{"TEXT": {"REGEX": "-"}}]
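To see what such a regex would accept, here is a minimal sketch using only Python's standard re module. It assumes spaCy applies the REGEX operator with search semantics (a match anywhere in the token text), so a bare "-" accepts any token that contains a hyphen. The stricter HYPHENATED pattern and the sample tokens are illustrative and do not come from the spaCy docs.

```python
import re

# Loose pattern from the answer: matches any token that contains a hyphen.
LOOSE = re.compile(r"-")
# Hypothetical stricter pattern: letters, one hyphen, letters, nothing else.
HYPHENATED = re.compile(r"^[A-Za-z]+-[A-Za-z]+$")

# Made-up token texts standing in for what a tokenizer might produce.
tokens = ["all-purpose", "-", "well-known", "2019-20", "purpose"]

loose_hits = [t for t in tokens if LOOSE.search(t)]
strict_hits = [t for t in tokens if HYPHENATED.search(t)]

print(loose_hits)
print(strict_hits)
```

If the stricter behaviour is what you want, the same regex string could be used as the REGEX value in the Matcher pattern.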

python telegram telethon how to send emoji

Good day,
I missed something in the Telethon documentation. Everything is clear with files, messages, and documents, but I cannot find how to send an emoji to another user. When I send an emoji code like ;-) it is sent as a raw message. If it is the same as sending a file, please help me find the list of emoji IDs to put into the file variable. The official documentation provides the functions below, but it is not clear.
GetEmojiKeywordsDifferenceRequest
GetEmojiKeywordsLanguagesRequest
GetEmojiKeywordsRequest
GetEmojiURLRequest
Please hint me with it :)
Emoji are just strings, like any other in Python. The ";-)" replacement for "😉" in official clients is done on the client side, not the server.
You should be able to paste the emoji directly into your code, or if your editor does not support it, use a Python unicode escape:
client.send_message(chat, '😉')
client.send_message(chat, '\U0001F609')
If you prefer to use text replacements in your code, install the emoji package:
import emoji
client.send_message(chat, emoji.emojize(':wink:'))
(Please note I have not tried the emoji module myself, see their documentation for available replacements.)
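The point that an emoji is an ordinary string can be checked without Telethon. Here is a small sketch using only the standard library; the variable names are illustrative.

```python
import unicodedata

# Three ways to write the same one-character string in Python source.
literal = "😉"
escaped = "\U0001F609"
named = "\N{WINKING FACE}"

# All three spellings produce an identical str object value.
print(literal == escaped == named)
# It is a single code point, even though it takes four bytes in UTF-8.
print(len(escaped), hex(ord(escaped)))
print(unicodedata.name(escaped))
print(escaped.encode("utf-8"))
```

Any of these values can be passed as the message text in the same way as a plain ASCII string.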

Telegraf parsing “grok” patterns

I have a custom log file and I need to parse it with the Telegraf parser. This is an example:
2018-12-03 13:51:31,682 grafana s.testname EXPERTISE full_access,mentor,employee EXPERTISE_LIST
I created a pattern, but it gives an error:
patterns = ["%{TIMESTAMP_ISO8601:timestamp}" "%{WORD:grafana}" "%{DATA:user}" "%{DATA:project}" "%{DATA:permissions}" "%{DATA:action}" "%{DATA:additional}"]
I wrote this pattern, but it's not working.
I can't understand what I'm doing wrong.
I don't know exactly what you are doing, but your pattern is wrong. You are splitting it into multiple patterns, which will never work.
I tried your example with this pattern:
%{TIMESTAMP_ISO8601:timestamp} %{WORD:grafana} %{DATA:user} %{DATA:project} %{DATA:permissions} %{WORD:action}
And it works.
You can try it here.
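For readers without a grok debugger at hand, the working pattern can be approximated with a plain Python regular expression. This is a rough sketch, not how Telegraf evaluates grok: the timestamp expression is a simplified stand-in for TIMESTAMP_ISO8601, and \S+ replaces DATA on the assumption that the fields never contain spaces.

```python
import re

# The sample log line from the question, written as a single line.
LOG_LINE = ("2018-12-03 13:51:31,682 grafana s.testname EXPERTISE "
            "full_access,mentor,employee EXPERTISE_LIST")

# One single pattern with fields separated by literal spaces,
# mirroring the structure of the corrected grok expression.
LINE_RE = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<grafana>\w+) "
    r"(?P<user>\S+) "
    r"(?P<project>\S+) "
    r"(?P<permissions>\S+) "
    r"(?P<action>\w+)"
)

match = LINE_RE.match(LOG_LINE)
fields = match.groupdict() if match else {}
for name, value in fields.items():
    print(f"{name}: {value}")
```

The key point carries over from the answer: the whole line is described by one pattern string, not by several separate strings.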

Non capturing branch reset regex in NodeJS

https://regex101.com/r/UXnhTy/1
var date = /(?|(Sat)ur|(Sun))day/;
console.log(date.exec("Sunday"));
This fails with:
SyntaxError: Invalid regular expression: /(?|(Sat)ur|(Sun))day/: Invalid group
Is there a version of NodeJS that supports this? Or some library out there that
I tested this with nodejs v8.12.0
Not really. An advanced alternative regex library for JavaScript is XRegExp, but it doesn't have the feature you're after - not even as an addon.
A simpler regex feature that is supported by XRegExp is named capture groups, so you can write:
var days = XRegExp('(?:(?<d>Sat)ur|(?<d>Sun))day', 'gi');
You can't use numbers as group names, but named groups should fit your needs: they allow backreferences (using \k<d>), replacement (${d}), capturing (match.d), and all features of a regular numbered group.
Named capture groups are supported natively in ES2018: ES2018 Regular Expression Updates.
According to node.green, named capture groups are supported by Node.js ≥10.3.0, or by ≥8.6.0 with the --harmony flag.
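Python's built-in re module has no branch reset group either, so the same kind of workaround can be sketched there: give each alternative its own named group and take whichever one participated in the match. The group names sat and sun and the helper day_prefix are made up for this illustration.

```python
import re

# Each alternative gets its own group, since (?|...) is unavailable.
DAY_RE = re.compile(r"(?:(?P<sat>Sat)ur|(?P<sun>Sun))day", re.IGNORECASE)

def day_prefix(text):
    """Return the captured prefix from whichever alternative matched, or None."""
    m = DAY_RE.search(text)
    if m is None:
        return None
    # Exactly one of the two groups is set on a successful match.
    return next(g for g in m.groups() if g is not None)

print(day_prefix("Sunday"))
print(day_prefix("saturday"))
print(day_prefix("Monday"))
```

The caller then sees a single value regardless of which branch matched, which is the practical effect a branch reset group would have given.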

ArangoDB does not match documents with case-insensitivity when using regex operator =~

I am using Enterprise 3.2 and have an issue with the regex matching operator =~. From the documentation, it appears that I can use a string regex and that it should be case insensitive. However, when I tried it, it fails to match characters when the right-hand operand is all lowercase. Attaching screenshots to illustrate the issue. The first screenshot shows that the document is returned by the query when we use the same case as in the collection.
Here is the second screenshot, which shows that the case-insensitive regex fails to pull the record.
Like Tom already said, you have to use REGEX_TEST(text, search, caseInsensitive) for this, with caseInsensitive set to true (see docs).
The operator =~ is just a shorthand for REGEX_TEST(text, search, false).
I stumbled upon this thread when I had this problem, but there is also another solution. I just want to share it for people who want to use the =~ operator instead of the REGEX_TEST() function.
Example:
FOR doc IN contacts
FILTER doc.name =~ '(?i)raM'
RETURN doc
Another solution could also be (not tested):
FOR doc IN contacts
FILTER LOWER(doc.name) == LOWER('raM')
RETURN doc
Indeed, you have to use REGEX_TEST. Here is how to do it, with an example:
FOR doc IN contacts
FILTER REGEX_TEST(doc.name, 'anystring_representing_regex', true)
RETURN doc
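The difference between the three approaches can be illustrated outside ArangoDB. The sketch below uses Python's re module as a stand-in, assuming the inline (?i) flag behaves in AQL as it does in Python; the sample names are invented. Note that the lowercase comparison is an exact match on the whole value, while the regex variants match anywhere in the string.

```python
import re

# Invented sample values standing in for doc.name in the contacts collection.
names = ["Ram", "RAMESH", "Sita"]
needle = "raM"

# Inline flag, comparable to: doc.name =~ '(?i)raM'
inline = [n for n in names if re.search("(?i)" + needle, n)]

# Flag passed separately, comparable to: REGEX_TEST(doc.name, 'raM', true)
flagged = [n for n in names if re.search(needle, n, re.IGNORECASE)]

# Lowercased equality, comparable to: LOWER(doc.name) == LOWER('raM')
exact = [n for n in names if n.lower() == needle.lower()]

print(inline)
print(flagged)
print(exact)
```

If only whole-value matches are wanted from the regex variants, anchoring the pattern with ^ and $ would narrow them to the same result as the lowercase comparison.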
