I was trying to make the bot replace multiple words in one sentence with another word.
ex: User will say "Today is a great day"
and the bot shall answer "Today is a bad night"
the words "great" and "day" were replaced by the words "bad" and "night" in this example.
I've been searching for similar code, but unfortunately all I could find were "word-blacklisting" scripts.
I tried writing some code for it, but I'm not an expert in Node.js and what I wrote is really bad, so it's not even worth showing.
The user will say a sentence, and the bot will recognize some predetermined words in it and replace them with other words that I'll define in the script.
We can use String.replace() combined with Regular Expressions to match and replace single words of your choosing.
Consider this example:
function antonyms(string) {
  return string
    .replace(/(?<![A-Z])terrible(?![A-Z])/gi, 'great')
    .replace(/(?<![A-Z])today(?![A-Z])/gi, 'tonight')
    .replace(/(?<![A-Z])day(?![A-Z])/gi, 'night');
}
const original = 'Today is a tErRiBlE day.';
console.log(original);
const altered = antonyms(original);
console.log(altered);
const testStr = 'Daylight is approaching.'; // Note that this contains 'day' *within* a word.
const testRes = antonyms(testStr); // The lookarounds in the regex prevent replacement.
console.log(testRes); // If this isn't the desired behavior, you can remove them.
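If the list of words grows, chaining one replace() call per word gets unwieldy. A minimal sketch of a map-driven variant follows; the names replaceWords and replacements are made up for illustration, and it assumes the keys are plain words with no regex metacharacters. It uses \b word boundaries instead of the lookarounds above and makes a single pass, so one replacement is never fed into the next.

```javascript
// Hypothetical word map: lowercase words to find, and what to put in their place.
const replacements = { great: 'bad', day: 'night' };

function replaceWords(sentence, map) {
  // Build one alternation such as /\b(great|day)\b/gi from the map's keys.
  // Assumption: keys contain no regex metacharacters, so they are not escaped.
  const pattern = new RegExp(`\\b(${Object.keys(map).join('|')})\\b`, 'gi');
  // Look each match up in lowercase so 'GREAT' and 'great' map to the same entry.
  return sentence.replace(pattern, (match) => map[match.toLowerCase()]);
}

console.log(replaceWords('Today is a great day', replacements)); // Today is a bad night
console.log(replaceWords('Daylight is approaching.', replacements)); // Daylight is approaching.
```

In a bot you would pass the message text to replaceWords and send back the result.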
consider I have this string
a='flexray_datain_flexray_sensors'
and I want to process this string to get
a='flexray_datain_sensors'
The catch is that this has to work in MATLAB for any repeated word, not just flexray. If I already knew what the word was, it would be easy.
I tried:
parts = textscan(bypname , '%s', 'delimiter', '_');
parts = parts{:};
and then processing this cell array (parts) using unique or something similar to remove the repeated words. But I need a better answer.
Does this work for you?
strjoin(unique(strsplit(a,'_'),'stable'),'_')
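The one-liner splits on the underscore, keeps the first occurrence of each part in its original order (that is what 'stable' does), and joins the parts back together. For readers who want to see that logic spelled out, here is the same idea as a small JavaScript sketch. It is not MATLAB, and the function name dropRepeatedParts is made up for illustration.

```javascript
// Hypothetical helper mirroring split -> order-preserving unique -> join.
function dropRepeatedParts(name, delimiter = '_') {
  // A Set keeps only the first occurrence of each part and iterates in insertion order.
  return [...new Set(name.split(delimiter))].join(delimiter);
}

console.log(dropRepeatedParts('flexray_datain_flexray_sensors')); // flexray_datain_sensors
```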
How can I check whether a sentence contains a combination of annotations? For example, consider this sentence:
John appointed as new CEO for google.
I need to write a rule to check whether the sentence contains < 'new' + 'Jobtitle' >.
How can I achieve this? I tried the following, but I still need to check whether 'new' comes directly before the job title.
Rule: CustomRules
(
{
Sentence contains {Lookup.majorType == "organization"},
Sentence contains {Lookup.majorType == "jobtitle"},
Sentence contains {Lookup.majorType == "person_first"}
}
)
One way to handle this is to invert it. Focus on the sequence you need and then get the covering Sentence:
(
{Token@string == "new"}
{Lookup.majorType == "jobtitle"}
):newJT
You should also check the edge case where a new Sentence starts right after "new", like this:
new
CEO
You can use something like this:
{Token ... }
{!Sentence, Lookup.majorType ...}
And then get the sentence (if you really need it) in the java RHS:
long end = newJTAnnots.lastNode().getOffset();
long start = newJTAnnots.firstNode().getOffset();
AnnotationSet sentences = inputAS.get("Sentence", start, end);
Be prepared, this is one of those hard questions.
In Farsi (Persian), the letter ی, which sounds like y or i, is written in 4 different shapes depending on its place in the word. I'll call ی YA from now on for simplicity.
take a look at this image
All YA characters are painted in red. In the first word, YA is attached to the previous character (on its right; in Farsi we write from RIGHT to LEFT) and is free at the end, whereas the last YA (3rd word, left-most red char) is free on both the left and the right.
Having said this long story, I want to find out if a part of a string ends with long YA (YA without points) or short YA (YA with two points beneath it).
For example, تحصیلداری (the 3rd word) ends with long YA, but تحصیـ, which is a part of the 3rd word, does not end with short YA.
Question: How can I tell which Unicode code point تحصیلداری ends with? I just have a simple string, "تحصیلداری"; how can I convert its characters to their Unicode code points?
I tried the unicodes
string unicodes = "";
foreach (char c in "تحصیلداری")
{
unicodes += c+" "+((int)c).ToString() + Environment.NewLine;
}
MessageBox.Show(unicodes);
result :
but at the end of the day unfortunately all YAs have the same unicode.
Bad news : YA was an example, a real one though. There are also a dozen of other characters like YA with different appearances too.
Additional info :
using this useful link about unicodes I found unicode of different YAs
We solved a similar problem in the following way:
We had a core banking application, the customer sub-system needed a full text search on customers name, family, father name etc.
Different encoding, legacy migrated data, keyboard layouts and Farsi fonts ... made search process inaccurate.
We overcame the problem by replacing problematic characters with some standard one and saving the standard string for search purpose.
After several iterations, we arrived at the replacement below, which may come in handy:
Formula="UPPER(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(REPLACE(REPLACE(REPLACE
(REPLACE(FirsName || LastName || FatherName,
chr(32),''),
chr(13),''),
chr(9),''),
chr(10),''),
'-',''),
'-',''),
'آ','ا'),
'أ', 'ا'),
'ئ', 'ي'),
'ي', 'ي'),
'ك', 'ک'),
'آإئؤةي','اايوهي'),
'ء',''),
'شأل','شاال'),
'ا.','اله'),
'.',''),
'الله','اله'),
'ؤ','و'),
'إ','ا'),
'ة','ه'),
' ا لله','اله'),
'ا لله','اله'),
' ا لله','اله'))"
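The core of the formula is a character-by-character normalization applied before the text is stored for search. A minimal sketch of that idea in JavaScript follows; the names CHAR_MAP and normalizeForSearch are made up for illustration, and only the single-character substitutions that are unambiguous in the formula are included. The yeh mapping and the multi-character rules are deliberately left out.

```javascript
// Single-character substitutions read off the formula above (assumption: a partial list).
const CHAR_MAP = new Map([
  ['\u0622', '\u0627'], // alef with madda       -> plain alef
  ['\u0623', '\u0627'], // alef with hamza above -> plain alef
  ['\u0625', '\u0627'], // alef with hamza below -> plain alef
  ['\u0643', '\u06A9'], // Arabic kaf            -> Farsi keheh
  ['\u0624', '\u0648'], // waw with hamza        -> plain waw
  ['\u0629', '\u0647'], // teh marbuta           -> heh
]);

function normalizeForSearch(text) {
  // Drop whitespace and hyphens first, as the chr(32)/chr(9)/chr(10)/chr(13)/'-' rules do.
  const stripped = text.replace(/[\s-]/g, '');
  // Map each character through the table, leaving unknown characters unchanged.
  const mapped = Array.from(stripped, (ch) => CHAR_MAP.get(ch) ?? ch).join('');
  return mapped.toUpperCase();
}

console.log(normalizeForSearch('\u0643\u0622 b-c') === '\u06A9\u0627BC'); // true
```

The same function would have to be applied both to the stored value and to the search term, otherwise the two sides will not match.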
Although Unicode contains several different YEH characters, note that all presentation forms of the Farsi YEH are the same Unicode character, U+06CC. You cannot determine the presentation form from the code point.
But you can reach your goal by checking which characters come before and after the YEH.
You can also use Fardis to see Unicode codes of strings.
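Checking the neighbours can be sketched as follows. This is a simplified illustration in JavaScript, not a full implementation of Unicode's joining rules: the helper names are made up, the list of letters that do not connect to the following letter covers only the common Persian and Arabic ones, and the letter test is a rough code point range.

```javascript
// Letters assumed NOT to connect to the letter that follows them (partial list).
const NO_FORWARD_JOIN = new Set([
  '\u0621',                                         // hamza
  '\u0622', '\u0623', '\u0625', '\u0627',           // alef variants
  '\u062F', '\u0630', '\u0631', '\u0632', '\u0698', // dal, thal, reh, zain, jeh
  '\u0648', '\u0624', '\u0629',                     // waw, waw with hamza, teh marbuta
]);
const HAMZA = '\u0621';

// Rough "is an Arabic-script letter" test; ZWNJ, spaces and digits fall outside it.
const isLetter = (ch) => ch !== undefined && /[\u0621-\u064A\u0671-\u06D3]/.test(ch);
const joinsToNext = (ch) => isLetter(ch) && !NO_FORWARD_JOIN.has(ch);
const joinsToPrevious = (ch) => isLetter(ch) && ch !== HAMZA;

// Returns 'isolated', 'initial', 'medial' or 'final' for the letter at text[index].
function contextualForm(text, index) {
  const joinedBefore = joinsToNext(text[index - 1]) && joinsToPrevious(text[index]);
  const joinedAfter = joinsToNext(text[index]) && joinsToPrevious(text[index + 1]);
  if (joinedBefore && joinedAfter) return 'medial';
  if (joinedBefore) return 'final';
  if (joinedAfter) return 'initial';
  return 'isolated';
}

// The word from the question, written with escapes so the code points are explicit.
const word = '\u062A\u062D\u0635\u06CC\u0644\u062F\u0627\u0631\u06CC';
console.log(contextualForm(word, 3));               // medial
console.log(contextualForm(word, word.length - 1)); // isolated
```

The last YEH comes out as isolated because the letter before it, reh, never connects to the following letter. That matches the description in the question of a YA that is free on both sides.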
Input:
Hi. I am John.
My name is John. Who are you ?
Output:
Hi
I am John
My name is John
Who are you
String line = "Hi. My name is John. Who are you ?";
String[] sentences = line.split("(?<=[.!?])\\s+");
for (String sentence : sentences) {
    System.out.println("[" + sentence + "]");
}
This produces:
[Hi.]
[My name is John.]
[Who are you ?]
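The lookbehind (?<=[.!?]) is what keeps the punctuation: only the whitespace after it is consumed as the delimiter. A minimal sketch of the difference, written in JavaScript, whose regex syntax matches Java's for these two patterns (variable names are illustrative):

```javascript
const line = 'Hi. My name is John. Who are you ?';

// Lookbehind: split on whitespace that follows . ! or ?, so the punctuation stays.
const kept = line.split(/(?<=[.!?])\s+/);
console.log(kept); // [ 'Hi.', 'My name is John.', 'Who are you ?' ]

// No lookbehind: the punctuation is part of the delimiter and is dropped.
// JavaScript's split keeps the trailing empty string, so it is filtered out here.
const dropped = line.split(/\s*[.!?]\s*/).filter(Boolean);
console.log(dropped); // [ 'Hi', 'My name is John', 'Who are you' ]
```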
See also
regular-expressions.info tutorials
Lookarounds
Character classes
Java language guide: the for-each loop
If you're not comfortable using split (even though it's the recommended replacement for the "legacy" java.util.StringTokenizer), you can use java.util.Scanner on its own, which is more than adequate for the job.
See also
Scanner vs. StringTokenizer vs. String.Split
Here's a solution that uses Scanner, which by the way implements Iterator<String>. For extra instructional value, I'm also showing an example of using java.lang.Iterable<T> so that you can use the for-each construct.
// Needs java.util.Iterator and java.util.Scanner.
final String text =
    "Hi. I am John.\n" +
    "My name is John. Who are you ?";
Iterable<String> sentences = new Iterable<String>() {
    @Override public Iterator<String> iterator() {
        // Scanner implements Iterator<String>; each token is one sentence.
        return new Scanner(text).useDelimiter("\\s*[.!?]\\s*");
    }
};
for (String sentence : sentences) {
    System.out.println("[" + sentence + "]");
}
This prints:
[Hi]
[I am John]
[My name is John]
[Who are you]
If this regex is still not what you want, then I recommend investing the time to educate yourself so you can take matters into your own hands.
See also
What is the Iterable interface used for?
Why is Java’s Iterator not an Iterable?
Note: the final modifier on the local variable text in the above snippet is required before Java 8; from Java 8 on, the variable only needs to be effectively final. In an illustrative example this makes for concise code, but in your actual code you should refactor the anonymous class into its own named class and have it take text in the constructor.
See also
Anonymous vs named inner classes? - best practices?
Cannot refer to a non-final variable inside an inner class defined in a different method