Dialogflow regex entity similar to #sys.any - dialogflow-es

I have many intents that extract a parameter that could be almost anything. An example would be a company name. Lots of variation there: "VWR", "1-800-Flowers", "#1 Mufflers". This list can include names in many languages.
I'm using the #sys.any entity now, but it doesn't work well if the text includes numbers or punctuation. For example, the parameter comes back as "1 - 800 - Flowers", with spaces inserted around the numbers and punctuation.
I was expecting the Regex entity to solve my problem, but on save it throws an error saying it's too broad. \S+[\s\S]*\S+ will catch anything in any language. Here's the error: "com.google.apps.framework.request.BadRequestException: Validate entity with entityName 'RegexAny' and entityId '149486a3-7a49-4171-b23c-860f7d47b713' failed because of the following reasons: Regular expression match is too broad: \S+[\s\S]*\S+."
How can I get around this unhelpful restriction and capture the user's input just as they typed it?

I've had this problem as well. What I do is use the #sys.any parameter and do the regex check in my fulfillment code, where you can also remove any punctuation and spaces. If you decide to do it this way, I'd recommend removing any output contexts from the intent and setting them programmatically when the regex matches. If there's no match, I set the output contexts to the same contexts as the intent's input contexts.
This works wonderfully.
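
A minimal Python sketch of the fulfillment-side check described in this answer. The function names and the context lists are hypothetical, and reusing the pattern from the question is an assumption; the webhook request/response handling is left out entirely.

```python
import re

# The broad pattern from the question. Dialogflow rejects it as a regex
# entity, but fulfillment code is free to apply it (assumption: this is
# the check you want; substitute a stricter pattern if you have one).
BROAD_PATTERN = re.compile(r"\S+[\s\S]*\S+")


def comparison_key(raw: str) -> str:
    """Remove punctuation and whitespace, then lowercase the rest.

    Useful for comparing the spaced-out #sys.any value against known names.
    """
    return re.sub(r"[\W_]+", "", raw).lower()


def choose_contexts(raw: str, input_contexts: list, success_contexts: list) -> list:
    """Pick the output contexts programmatically.

    On a match, move the conversation forward; otherwise hand back the
    intent's input contexts so the user stays in the same state.
    """
    if BROAD_PATTERN.fullmatch(raw.strip()):
        return success_contexts
    return input_contexts
```

For example, comparison_key("1 - 800 - Flowers") and comparison_key("1-800-Flowers") produce the same key, so the extra spaces no longer matter when looking the name up.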

Related

Azure Search - regex search

I am trying to configure Azure Search to find some strings that have special characters, for example
ABC*DEF
When I look for the full term using "ABC*DEF", it works perfectly.
The problem comes if I want to use a regex term:
When I use a partial term, like /(.*)ABC(.*)/, the result has no problem
When I use a partial term, like /(.*)DEF(.*)/, the result has no problem
But when I try to look for something like /(.*)C\*D(.*)/, the result is empty.
I am using a standard analyzer. I tried also the keyword analyzer but that way the regex search doesn't work at all.
Any suggestions?
You won't be able to create a regex expression that matches ABC*DEF using the standard analyzer.
If you run "ABC\*DEF" through the analyzer API using the "standard" analyzer, you will see that ABC*DEF gets split into two tokens at indexing time: "ABC" and "DEF". Regex expressions are not analyzed; however, they need to match a token that exists in the index.
Since ABC\*DEF does not exist in the index (only "ABC" and "DEF" exist), you won't be able to find it using the expression you are searching for.
Using the "keyword" analyzer will keep the whole field as a single token, so if the field contained only the expression ABC\*DEF, the regex expression would work on it. However, if ABC\*DEF is part of a larger paragraph of text, that's probably not what you want to use.
Your best bet is to create a custom analyzer that tokenizes your text in the way that preserves the special characters that are relevant to your use case.
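
To illustrate the point about tokens, here is a rough Python simulation. The tokenizer below is a crude stand-in for the standard analyzer (it only splits on non-alphanumeric characters and ignores lowercasing and other filters), and both function names are made up for this sketch.

```python
import re


def simple_tokenize(text: str) -> list:
    """Very rough approximation of a standard analyzer: split on anything
    that is not a letter or digit. Real analyzers do more than this."""
    return re.findall(r"[A-Za-z0-9]+", text)


def regex_hits(pattern: str, tokens: list) -> list:
    """A regex query has to match a whole indexed token, not the original
    field text, so each token is tested on its own."""
    return [t for t in tokens if re.fullmatch(pattern, t)]


tokens = simple_tokenize("ABC*DEF")               # the '*' is lost here
abc_hits = regex_hits(r"(.*)ABC(.*)", tokens)     # finds the "ABC" token
star_hits = regex_hits(r"(.*)C\*D(.*)", tokens)   # no token contains "C*D"

# With a keyword-style analyzer the field stays one token, so it matches.
keyword_hits = regex_hits(r"(.*)C\*D(.*)", ["ABC*DEF"])
```

The third query comes back empty only because no single token contains the asterisk, which is why a custom analyzer that preserves that character is needed.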
If you're searching for special chars, why don't you discard normal chars?
[^\w]

Dialogflow Agent regex entities definition

I have created an agent in dialogflow, for which I want to define an entity based on regex values, I know we have regex capability in defining the entities, but I don't know how to use it or how to define regex while defining the entity. There are no examples or blogs available to help me with this. I want to see an example or syntax of how to define regex entities so that I can replicate the same for my case. Any help will be highly appreciated.
Try this. Go to the Entities page. Create a new entity and call it whatever you want. In the entity screen, select regex and enter this value: [A-Za-z]{3}[0-9]{7,10}$. Save the entity. This regex matches any value made of three letters followed by 7 to 10 digits, for example PAP1234567 or DWL123456789.
Now go to an intent, or create one, and in the training phrases add one that says:
My number is PAP12345678. Select PAP12345678 so that it is highlighted, and the entities menu will appear. Select the new regex entity and save.
Test the intent in Dialogflow. Hope this helps.
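
Before saving the entity, it can help to try the pattern locally. The sketch below uses Python's re module, which is only an assumption about how the pattern behaves; Dialogflow's own matching may differ in details. The find_code helper is hypothetical.

```python
import re

# The pattern from the answer: three letters, then 7 to 10 digits,
# anchored at the end of the text by '$'.
ENTITY_PATTERN = re.compile(r"[A-Za-z]{3}[0-9]{7,10}$")


def find_code(text: str):
    """Return the matched value, or None if the text does not end with one."""
    match = ENTITY_PATTERN.search(text)
    return match.group(0) if match else None


in_phrase = find_code("My number is PAP12345678")
too_short = find_code("PAP123456")          # only 6 digits
with_period = find_code("My number is PAP12345678.")
```

Under Python's semantics the trailing '$' means a sentence ending in a period does not match, so dropping the anchor may be worth testing if users type punctuation after the number.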

Is it possible to extract a path as the value of an entity in Dialogflow?

I have an intent that requires the user to give a path:
plot the file in /home/user/path
Is there a way to extract the path with dialogflow and to get its value into an entity? I think that this case cannot be approached with synonyms.
No. Since Dialogflow doesn't support regex in entities, there is no easy way to parse a path value in Dialogflow using an entity.
You have two options to parse Paths into an entity.
One: Use #sys.any entity in place of the path and on fulfilment side check if the value of the entity is actually a valid path or not using Regex.
Two: Create your own entity for paths and use DialogFlow Agent-API to keep updating values in that entity whenever new file/folder is created/updated/deleted in whatever file system you are working on. (Yeah this sounds crazy but I don't think there are any other options to achieve what you want)
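
A minimal Python sketch of option one, checking on the fulfillment side whether the #sys.any value looks like a path. The pattern is an assumption: it accepts only absolute POSIX-style paths without whitespace, and the function name is hypothetical.

```python
import re

# Assumed shape of a valid path: one or more "/segment" parts, where a
# segment contains no slash and no whitespace, plus an optional final "/".
ABSOLUTE_PATH = re.compile(r"(?:/[^/\s]+)+/?")


def looks_like_absolute_path(value: str) -> bool:
    """True if the whole value has the shape of an absolute POSIX path."""
    return ABSOLUTE_PATH.fullmatch(value.strip()) is not None
```

This only checks the shape of the string. Whether the path actually exists would be a separate check against the file system (for example with os.path.exists).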

Solr exact search with a hyphen

I am trying to search for a term in Solr in the Title that contains only the string 1604-04. But the results come back with anything that contains 1604 or 04. What would the syntax be to force solr to search on the exact string of 1604-04?
You can also use the Classic Tokenizer. The Classic Tokenizer preserves the same behavior as the Standard Tokenizer with some exceptions, including this one:
Words are split at hyphens, unless there is a number in the word, in which case the token is not split and the numbers and hyphen(s) are preserved.
This means that if someone searches for 1604-04, this tokenizer won't break the search string into two tokens.
If you want exact matches only, use a string field or a text field with a KeywordTokenizer as the tokenizer. These will keep your tokens intact as one single entry, and won't break it up into multiple tokens.
The difference is that if you use a Textfield with a KeywordTokenizer, you can still apply other filters, such as a LowercaseFilter, while a string field will store anything verbatim without any further processing possible.
Your analyzer is splitting "1604-04" into two terms, "1604" and "04". You've received answers on how to change your analysis to stop doing that.
Changing your analysis may not be the best solution (I can't be entirely sure based on what you've written). Using a phrase query would be the usual way to do this. You can use a phrase query by wrapping it in quotes:
field:"1604-04"
This will still analyze and split it into two terms, but it will look for those terms in sequence. So, that query would match "1604-04" and "1604 04", but not "1604 some other stuff 04".
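
The phrase behaviour described above can be imitated in a few lines of Python. This is only a simulation: the tokenizer is a crude stand-in for the analyzer, and neither function is part of Solr.

```python
import re


def tokenize(text: str) -> list:
    """Crude stand-in for an analyzer that splits on hyphens and spaces."""
    return re.findall(r"\w+", text)


def phrase_matches(phrase: str, document: str) -> bool:
    """True if the phrase's tokens appear consecutively, in order, in the
    document's tokens (which is what a phrase query with no slop requires)."""
    wanted = tokenize(phrase)
    tokens = tokenize(document)
    n = len(wanted)
    if n == 0:
        return False
    return any(tokens[i:i + n] == wanted for i in range(len(tokens) - n + 1))
```

Here phrase_matches("1604-04", "1604 04") is true because both sides tokenize to the same sequence, while a document with other words between "1604" and "04" fails the adjacency test.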

Wildcard searching for MRN numbers in FHIR

Is there a way I can do a wildcard search for MRN numbers in FHIR?
ex. I want to search for all MRN numbers starting with 12345.
thanks,
Suresh
I think this is actually a bit trickier than it may seem within the FHIR standard.
For general text/string searching, your best bet would be the :contains modifier in your query parameters. For example:
[base]/Patient?given:contains=ada
should return a Bundle containing all Patient resources with the string 'ada' (case and accent insensitive) in the given name. However, MRNs are typically stored as Patient.identifier, which is a token parameter. The specification states:
"A token type is a parameter that provides an exact match search, either on a string of characters, potentially scoped by a URI. It is mostly used against a code or identifier data type where the value may have a URI that scopes its meaning, where the search is performed against the pair from a Coding or an Identifier. Tokens are also used against other fields where exact matches are required"
https://www.hl7.org/fhir/search.html#token
However, the specification also provides the :text modifier for token parameters, of which it states:
"For token: :text (the match does a partial searches on the text portion of a CodeableConcept or the display portion of a Coding), instead of the default search which uses codes."
This seems to suggest that you could perform your search with something like:
[base]/Patient?identifier:text=12345
...however the standard ALSO states that "only a few servers are expected to offer this facility." So you may be out of luck unless the server you are querying against has implemented this functionality.
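
For reference, the two query shapes above can be assembled with Python's standard library. The base URL below is a placeholder and the helper name is made up; whether the server honours the :text modifier on identifier still has to be tested against the server itself.

```python
from urllib.parse import urlencode


def build_search_url(base, resource, param, value, modifier=None):
    """Build a FHIR search URL such as [base]/Patient?given:contains=ada.

    The colon is kept unescaped so the result reads like the examples;
    a percent-encoded colon would be equally valid in a URL.
    """
    name = f"{param}:{modifier}" if modifier else param
    query = urlencode({name: value}, safe=":")
    return f"{base.rstrip('/')}/{resource}?{query}"


# Hypothetical server address, used for illustration only.
BASE = "https://fhir.example.org/baseR4"

name_search = build_search_url(BASE, "Patient", "given", "ada", "contains")
mrn_search = build_search_url(BASE, "Patient", "identifier", "12345", "text")
```

Sending mrn_search with any HTTP client and inspecting the returned Bundle (or OperationOutcome) is the quickest way to find out if the modifier is supported.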
