I'm working with LUIS and want to handle not only the top-scoring intent but all the others as well. This matters when someone asks about two things in the same phrase.
For example: "I want to buy apples" ("Buy" intent) and "I want to sell bananas" ("Sell" intent) versus "I want to buy bananas and sell apples" ("buy" and "sell" intents on the same utterance).
The idea is to define a threshold and accept as "valid" any intent whose score is above that confidence value.
During some tests I found that this can work when the utterance contains very few intents.
However, as the number of intents in the same utterance increases, the results degrade very quickly.
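To make the threshold idea concrete, here is a minimal sketch. It assumes the response shape shown in the examples that follow; the helper name intents_above, the 0.2 cutoff, and the choice to skip the "None" intent are illustrative assumptions, not LUIS settings.

```python
def intents_above(response, threshold, ignore=("None",)):
    """Return (intent, score) pairs scoring at or above threshold, best first."""
    hits = [
        (item["intent"], item["score"])
        for item in response.get("intents", [])
        if item["score"] >= threshold and item["intent"] not in ignore
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Scores copied from the two-intent example in this post.
response = {
    "query": "i want to buy apples and sell bananas",
    "intents": [
        {"intent": "sell", "score": 0.4442593},
        {"intent": "Buy", "score": 0.263670564},
        {"intent": "None", "score": 0.161728472},
        {"intent": "prank", "score": 5.190861e-9},
    ],
}

print(intents_above(response, threshold=0.2))
# [('sell', 0.4442593), ('Buy', 0.263670564)]
```

With the three-intent utterance no intent other than "None" reaches 0.2, which is exactly the degradation described here.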
I have included some examples to clarify what I mean. The outputs below were generated on a LUIS app with 4 intents ("buy", "sell", "none" and "prank") and 1 entity ("fruit"):
I want to buy apples ==>
{
"query": "i want to buy apples",
"topScoringIntent": {
"intent": "Buy",
"score": 0.999846
},
"intents": [
{
"intent": "Buy",
"score": 0.999846
},
{
"intent": "None",
"score": 0.2572831
},
{
"intent": "sell",
"score": 2.32163586e-7
},
{
"intent": "prank",
"score": 2.32163146e-7
}
],
"entities": [
{
"entity": "apples",
"type": "Fruit",
"startIndex": 14,
"endIndex": 19,
"resolution": {
"values": [
"apple"
]
}
}
]
}
I want to sell bananas ==>
{
"query": "i want to sell bananas",
"topScoringIntent": {
"intent": "sell",
"score": 0.999886036
},
"intents": [
{
"intent": "sell",
"score": 0.999886036
},
{
"intent": "None",
"score": 0.253938943
},
{
"intent": "Buy",
"score": 2.71893583e-7
},
{
"intent": "prank",
"score": 1.97906232e-7
}
],
"entities": [
{
"entity": "bananas",
"type": "Fruit",
"startIndex": 15,
"endIndex": 21,
"resolution": {
"values": [
"banana"
]
}
}
]
}
I want to eat a pizza ==>
{
"query": "i want to eat a pizza",
"topScoringIntent": {
"intent": "prank",
"score": 0.997353
},
"intents": [
{
"intent": "prank",
"score": 0.997353
},
{
"intent": "None",
"score": 0.378299
},
{
"intent": "sell",
"score": 2.72957237e-7
},
{
"intent": "Buy",
"score": 1.54754474e-7
}
],
"entities": []
}
Now with two intents, the score of each one drops sharply:
I want to buy apples and sell bananas ==>
{
"query": "i want to buy apples and sell bananas",
"topScoringIntent": {
"intent": "sell",
"score": 0.4442593
},
"intents": [
{
"intent": "sell",
"score": 0.4442593
},
{
"intent": "Buy",
"score": 0.263670564
},
{
"intent": "None",
"score": 0.161728472
},
{
"intent": "prank",
"score": 5.190861e-9
}
],
"entities": [
{
"entity": "apples",
"type": "Fruit",
"startIndex": 14,
"endIndex": 19,
"resolution": {
"values": [
"apple"
]
}
},
{
"entity": "bananas",
"type": "Fruit",
"startIndex": 30,
"endIndex": 36,
"resolution": {
"values": [
"banana"
]
}
}
]
}
And if we include a third intent, LUIS seems to collapse:
I want to buy apples, sell bananas and eat a pizza ==>
{
"query": "i want to buy apples, sell bananas and eat a pizza",
"topScoringIntent": {
"intent": "None",
"score": 0.139652014
},
"intents": [
{
"intent": "None",
"score": 0.139652014
},
{
"intent": "Buy",
"score": 0.008631414
},
{
"intent": "sell",
"score": 0.005520768
},
{
"intent": "prank",
"score": 0.0000210663875
}
],
"entities": [
{
"entity": "apples",
"type": "Fruit",
"startIndex": 14,
"endIndex": 19,
"resolution": {
"values": [
"apple"
]
}
},
{
"entity": "bananas",
"type": "Fruit",
"startIndex": 27,
"endIndex": 33,
"resolution": {
"values": [
"banana"
]
}
}
]
}
Can you recommend an approach for training LUIS that mitigates this issue? Handling multiple intents in the same utterance is key to my use case.
Thanks a lot for any help.
You will likely need to pre-process the input with NLP to chunk the sentences, and then train on and submit the chunks one at a time. I doubt that LUIS is sophisticated enough to handle multiple intents in compound sentences.
Here is some sample preprocessing code using spaCy in Python. I have not tested it on more complicated sentences, but it should work for your example sentence. You can feed the resulting segments to LUIS.
Multiple intents are not an easy problem to address, and there may be other ways to handle them.
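import spacy

# spaCy 3+ loads models by package name; the 'en' shortcut only exists in older releases.
model = 'en_core_web_sm'
nlp = spacy.load(model)
print("Loaded model '%s'" % model)

doc = nlp("i want to buy apples, sell bananas and eat a pizza ")
for word in doc:
    # A direct object marks the thing being acted on; its head is the verb.
    if word.dep_ == 'dobj':
        subtree_span = doc[word.left_edge.i : word.right_edge.i + 1]
        # Verb plus object phrase: a candidate segment to send to LUIS.
        print(subtree_span.root.head.text + ' ' + subtree_span.text)
        print(subtree_span.text, '|', subtree_span.root.head.text)
        print()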
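Once the sentence is split, each segment can be classified on its own. The following is only a sketch of that loop: classify_chunks and fake_predict are hypothetical names, the chunk list is written by hand rather than produced by the parser, and fake_predict is a keyword stand-in for a real call to your LUIS endpoint (its scores are placeholders).

```python
def classify_chunks(chunks, predict):
    """Send each chunk to `predict` and collect (chunk, intent, score) triples."""
    results = []
    for chunk in chunks:
        top = predict(chunk)["topScoringIntent"]
        results.append((chunk, top["intent"], top["score"]))
    return results


def fake_predict(text):
    # Stand-in for a LUIS request; returns the same response shape as the question's output.
    for keyword, intent in (("buy", "Buy"), ("sell", "sell"), ("eat", "prank")):
        if keyword in text:
            return {"topScoringIntent": {"intent": intent, "score": 1.0}}
    return {"topScoringIntent": {"intent": "None", "score": 1.0}}


# Hand-written segments of the kind the dependency parse is meant to produce.
chunks = ["buy apples", "sell bananas", "eat a pizza"]

for chunk, intent, score in classify_chunks(chunks, fake_predict):
    print(chunk, "->", intent, score)
```

Passing the prediction function in as an argument keeps the loop independent of how you call LUIS, so it can be tried out without network access.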
If you know the permutations you are expecting, you might be able to get the information you need.
I defined a single "buy and sell" intent in addition to the individual buy and sell intents. I created two entities, "Buy Fruit" and "Sell Fruit", each of which contained the "Fruit" entity from your example. Then in the "buy and sell" intent I used sample utterances like "I want to buy apples and sell bananas", as well as versions with buy and sell switched around. I marked each fruit as a "Fruit" entity, and the phrases as "Buy Fruit" and "Sell Fruit" respectively.
This is the kind of output I get from "I want to buy a banana and sell an apple":
{
"query": "I want to buy a banana and sell an apple",
"prediction": {
"topIntent": "buy and sell",
"intents": {
"buy and sell": {
"score": 0.899272561
},
"Buy": {
"score": 0.06608531
},
"Sell": {
"score": 0.03477564
},
"None": {
"score": 0.009155964
}
},
"entities": {
"Buy Fruit": [
{}
],
"Sell Fruit": [
{}
],
"Fruit": [
"banana",
"apple"
],
"keyPhrase": [
"banana",
"apple"
],
"$instance": {
"Buy Fruit": [
{
"type": "Buy Fruit",
"text": "buy a banana",
"startIndex": 10,
"length": 12,
"score": 0.95040834,
"modelTypeId": 1,
"modelType": "Entity Extractor",
"recognitionSources": [
"model"
]
}
],
"Sell Fruit": [
{
"type": "Sell Fruit",
"text": "sell an apple",
"startIndex": 27,
"length": 13,
"score": 0.7225706,
"modelTypeId": 1,
"modelType": "Entity Extractor",
"recognitionSources": [
"model"
]
}
],
"Fruit": [
{
"type": "Fruit",
"text": "banana",
"startIndex": 16,
"length": 6,
"score": 0.9982499,
"modelTypeId": 1,
"modelType": "Entity Extractor",
"recognitionSources": [
"model"
]
},
{
"type": "Fruit",
"text": "apple",
"startIndex": 35,
"length": 5,
"score": 0.98748064,
"modelTypeId": 1,
"modelType": "Entity Extractor",
"recognitionSources": [
"model"
]
}
],
"keyPhrase": [
{
"type": "builtin.keyPhrase",
"text": "banana",
"startIndex": 16,
"length": 6,
"modelTypeId": 2,
"modelType": "Prebuilt Entity Extractor",
"recognitionSources": [
"model"
]
},
{
"type": "builtin.keyPhrase",
"text": "apple",
"startIndex": 35,
"length": 5,
"modelTypeId": 2,
"modelType": "Prebuilt Entity Extractor",
"recognitionSources": [
"model"
]
}
]
}
}
}
}
To make this work you would have to cater for all the possible permutations, so this isn't strictly a solution to discerning multiple intents. It's more about defining a composite intent for each permutation of individual intents that you wanted to cater for. In many applications that would not be practical, but in your example it could get you a satisfactory result.
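To work out which fruit belongs to which action, one option is to compare character spans in the "$instance" block of the output above. This is a rough sketch that assumes the response shape shown here; fruits_by_action is a hypothetical helper, and the sample dictionary keeps only the fields it reads.

```python
def fruits_by_action(prediction, actions=("Buy Fruit", "Sell Fruit"), item="Fruit"):
    """Map each action entity to the item entities whose span lies inside it."""
    instances = prediction["entities"]["$instance"]
    result = {}
    for action in actions:
        result[action] = []
        for phrase in instances.get(action, []):
            start = phrase["startIndex"]
            end = start + phrase["length"]
            for fruit in instances.get(item, []):
                f_start = fruit["startIndex"]
                f_end = f_start + fruit["length"]
                # Keep the fruit only if it sits fully inside the action phrase.
                if start <= f_start and f_end <= end:
                    result[action].append(fruit["text"])
    return result


# Offsets copied from the output above.
prediction = {"entities": {"$instance": {
    "Buy Fruit": [{"text": "buy a banana", "startIndex": 10, "length": 12}],
    "Sell Fruit": [{"text": "sell an apple", "startIndex": 27, "length": 13}],
    "Fruit": [
        {"text": "banana", "startIndex": 16, "length": 6},
        {"text": "apple", "startIndex": 35, "length": 5},
    ],
}}}

print(fruits_by_action(prediction))
# {'Buy Fruit': ['banana'], 'Sell Fruit': ['apple']}
```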
I am currently working with Azure Form Recognizer and had a question. I am using
https://<>.cognitiveservices.azure.com/formrecognizer/v2.0-preview/layout/analyzeResults/2e0a2322-65bb-4fd2-a3bf-98f70b36641e
The JSON returned seems to come from basic OCR. I was wondering whether it is possible (easily) to take this:
{
"boundingBox": [
4.4033,
1.5114,
6.5483,
1.5114,
6.5483,
1.6407,
4.4033,
1.6407
],
"text": "Invoice For: First Up Consultants",
"words": [
{
"boundingBox": [
4.4033,
1.5143,
4.8234,
1.5143,
4.8234,
1.6155,
4.4033,
1.6155
],
"text": "Invoice",
"confidence": 1
},
{
"boundingBox": [
4.8793,
1.5143,
5.1013,
1.5143,
5.1013,
1.6154,
4.8793,
1.6154
],
"text": "For:",
"confidence": 1
},
{
"boundingBox": [
5.2048,
1.5130,
5.4927,
1.5130,
5.4927,
1.6151,
5.2048,
1.6151
],
"text": "First",
"confidence": 1
},
{
"boundingBox": [
5.5427,
1.5130,
5.7120,
1.5130,
5.7120,
1.6407,
5.5427,
1.6407
],
"text": "Up",
"confidence": 1
},
{
"boundingBox": [
5.7621,
1.5114,
6.5483,
1.5114,
6.5483,
1.6151,
5.7621,
1.6151
],
"text": "Consultants",
"confidence": 1
}
]
}
but return it as
{
"boundingBox": [
4.4033,
1.5114,
6.5483,
1.5114,
6.5483,
1.6407,
4.4033,
1.6407
],
"text": "Invoice For:",
"value": "First Up Consultants"
}
If this is not something that I can do in Azure Form Recognizer, then no worries. I just wanted to check.
Thank you in advance!
Michael
It sounds like you're looking to extract semantic meaning from your document. In that case, you might want to look at using a custom Form Recognizer model.
You can start by training a custom model to extract key value pairs:
https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/curl-train-extract
Sample key value pair:
{
"key": {
"text": "Address:",
"boundingBox": [ 0.7972, 1.5125, 1.3958, 1.5125, 1.3958, 1.6431, 0.7972, 1.6431 ]
},
"value": {
"text": "1 Redmond way Suite 6000 Redmond, WA 99243",
"boundingBox": [ 0.7972, 1.6764, 2.15, 1.6764, 2.15, 2.2181, 0.7972, 2.2181 ]
},
"confidence": 0.86
}
Or you can train a custom model using labels that you provide:
https://learn.microsoft.com/en-us/azure/cognitive-services/form-recognizer/quickstarts/label-tool
Sample field output:
{
"total": {
"type": "string",
"valueString": "$22,123.24",
"text": "$22,123.24",
"boundingBox": [ 5.29, 3.41, 5.975, 3.41, 5.975, 3.54, 5.29, 3.54 ],
"page": 1,
"confidence": 1
}
}
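If training a custom model is more than you need, the line in the question could also be reshaped after the fact. The sketch below is a plain post-processing heuristic, not a Form Recognizer feature: split_key_value is a hypothetical helper that treats the first word ending in ":" as the end of the key, and the sample omits the per-word bounding boxes because the helper does not read them.

```python
def split_key_value(line):
    """Split an OCR line into key text and value text at the first word ending in ':'."""
    words = line["words"]
    for i, word in enumerate(words):
        if word["text"].endswith(":"):
            return {
                "boundingBox": line["boundingBox"],
                "text": " ".join(w["text"] for w in words[: i + 1]),
                "value": " ".join(w["text"] for w in words[i + 1:]),
            }
    return None  # no key marker found on this line


# Line from the question, reduced to the fields used here.
line = {
    "boundingBox": [4.4033, 1.5114, 6.5483, 1.5114, 6.5483, 1.6407, 4.4033, 1.6407],
    "text": "Invoice For: First Up Consultants",
    "words": [
        {"text": "Invoice"}, {"text": "For:"}, {"text": "First"},
        {"text": "Up"}, {"text": "Consultants"},
    ],
}

result = split_key_value(line)
print(result["text"], "|", result["value"])
# Invoice For: | First Up Consultants
```

A heuristic like this breaks as soon as keys and values sit on different lines, which is the case the trained key-value extraction is designed for.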
I have a database of exercises in a workout tracker, and when I do a find(), the result is this:
[
{
"_id": "5e9dacbb6512969974bd5b2d",
"day": "2020-04-10T14:07:55.905Z",
"exercises": [
{
"type": "resistance",
"name": "Bicep Curl",
"duration": 20,
"weight": 100,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b2e",
"day": "2020-04-11T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Lateral Pull",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b2f",
"day": "2020-04-12T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Push Press",
"duration": 25,
"weight": 185,
"reps": 8,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b30",
"day": "2020-04-13T14:07:55.916Z",
"exercises": [
{
"type": "cardio",
"name": "Running",
"duration": 25,
"distance": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b31",
"day": "2020-04-14T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench Press",
"duration": 20,
"weight": 285,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b32",
"day": "2020-04-15T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench Press",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b33",
"day": "2020-04-16T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Quad Press",
"duration": 30,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b34",
"day": "2020-04-17T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench Press",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b35",
"day": "2020-04-18T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Military Press",
"duration": 20,
"weight": 300,
"reps": 10,
"sets": 4
}
]
},
{
"_id": "5e9dacbb6512969974bd5b36",
"day": "2020-04-19T14:07:55.916Z",
"exercises": [
{
"type": "resistance",
"name": "Bench",
"duration": 30,
"distance": 2
}
]
}
]
Then I need to get total sums of statistics from each exercise, so I used mongoose aggregate to give me this data:
[
{
"_id": null,
"totalDuration": 230,
"totalWeight": 2070,
"totalSets": 32,
"totalReps": 78,
"totalDistance": 6
}
]
I want to combine these two results in one GET request, ideally doing something similar to a push where I just push the totals at the end of the first JSON object. How do I achieve this?
Something like this:
function mergeResults(resultFromFindQuery, totalSums) {
  // Bundle both query results into one object so a single response can carry them.
  var allData = {};
  allData['mongoFindresult'] = resultFromFindQuery;
  allData['totalSums'] = totalSums;
  return allData;
}
Then use the returned value however you need. Both results are now in the same variable.
Is it possible to use the Discount option from Facebook Messenger's Receipt Template with Bot Framework V4?
I tried searching the samples, but they require string values, while the Facebook template requires arrays.
Example:
"adjustments":[
{
"name":"New Customer Discount",
"amount":20
},
{
"name":"$10 Off Coupon",
"amount":10
}
],
Facebook Templates can be sent using ChannelData. Here is an example of a receipt template with adjustments:
await context.sendActivity({
text: 'Receipt',
channelData: {
"attachment":{
"type":"template",
"payload": {
"template_type": "receipt",
"recipient_name": "Stephane Crozatier",
"order_number": "12345678902",
"currency": "USD",
"payment_method": "Visa 2345",
"order_url": "http://petersapparel.parseapp.com/order?order_id=123456",
"timestamp": "1428444852",
"address": {
"street_1": "1 Hacker Way",
"street_2": "",
"city": "Menlo Park",
"postal_code": "94025",
"state": "CA",
"country": "US"
},
"summary": {
"subtotal": 75.00,
"shipping_cost": 4.95,
"total_tax": 6.19,
"total_cost": 56.14
},
"adjustments": [{
"name": "New Customer Discount",
"amount": 20
},
{
"name": "$10 Off Coupon",
"amount": 10
}
],
"elements": [{
"title": "Classic White T-Shirt",
"subtitle": "100% Soft and Luxurious Cotton",
"quantity": 2,
"price": 50,
"currency": "USD",
"image_url": "http://petersapparel.parseapp.com/img/whiteshirt.png"
},
{
"title": "Classic Gray T-Shirt",
"subtitle": "100% Soft and Luxurious Cotton",
"quantity": 1,
"price": 25,
"currency": "USD",
"image_url": "http://petersapparel.parseapp.com/img/grayshirt.png"
}
]
}
}
}
});
Say I have a product collection like this:
{
"_id": "5a74784a8145fa1368905373",
"name": "This is my first product",
"description": "This is the description of my first product",
"category": "34/73/80",
"condition": "New",
"images": [
{
"length": 1000,
"width": 1000,
"src": "products/images/firstproduct_image1.jpg"
},
...
],
"attributes": [
{
"name": "Material",
"value": "Synthetic"
},
...
],
"variation": {
"attributes": [
{
"name": "Color",
"values": ["Black", "White"]
},
{
"name": "Size",
"values": ["S", "M", "L"]
}
]
}
}
and a variation collection like this:
{
"_id": "5a748766f5eef50e10bc98a8",
"name": "color:black,size:s",
"productID": "5a74784a8145fa1368905373",
"condition": "New",
"price": 1000,
"sale": null,
"image": [
{
"length": 1000,
"width": 1000,
"src": "products/images/firstvariation_image1.jpg"
}
],
"attributes": [
{
"name": "Color",
"value": "Black"
},
{
"name": "Size",
"value": "S"
}
]
}
I want to keep the documents separate and for the purpose of easy browsing, searching and faceted search implementation, I want to fetch all the data in a single query but I don't want to do join in my application code.
I know it's achievable using a third collection called summary that might look like this:
{
"_id": "5a74875fa1368905373",
"name": "This is my first product",
"category": "34/73/80",
"condition": "New",
"price": 1000,
"sale": null,
"description": "This is the description of my first product",
"images": [
{
"length": 1000,
"width": 1000,
"src": "products/images/firstproduct_image1.jpg"
},
...
],
"attributes": [
{
"name": "Material",
"value": "Synthetic"
},
...
],
"variations": [
{
"condition": "New",
"price": 1000,
"sale": null,
"image": [
{
"length": 1000,
"width": 1000,
"src": "products/images/firstvariation_image.jpg"
}
],
"attributes": [
"color=black",
"size=s"
]
},
...
]
}
The problem is that I don't know how to keep the summary collection in sync with the product and variation collections. I know it can be done using mongo-connector, but I'm not sure how to implement it.
Please help me, I'm still a beginner programmer.
You don't actually need to maintain a summary collection; it is redundant to store the product and variation summary in another collection.
Instead, you can use the $lookup stage of an aggregation pipeline to left outer join products and variations on productID.
Aggregation pipeline:
db.products.aggregate(
  [
    {
      $lookup : {
        from : "variation",
        localField : "_id",
        foreignField : "productID",
        as : "variations"
      }
    }
  ]
).pretty()
I want to modify scoring in Elasticsearch (v2+) based on the weight of a field in a nested object within an array.
For instance, using this data:
PUT index/test/0
{
"name": "red bell pepper",
"words": [
{"text": "pepper", "weight": 20},
{"text": "bell","weight": 10},
{"text": "red","weight": 5}
]
}
PUT index/test/1
{
"name": "hot red pepper",
"words": [
{"text": "pepper", "weight": 15},
{"text": "hot","weight": 11},
{"text": "red","weight": 5}
]
}
I want a query like {"words.text": "red pepper"} which would rank "red bell pepper" above "hot red pepper".
The way I am thinking about this problem is "first match the 'text' field, then modify scoring based on the 'weight' field". Unfortunately I don't know how to achieve this, whether it is even possible, or whether I have the right approach for something like this.
If you propose an alternative approach, please keep it general enough to cover many similar cases (e.g. simply boosting the score of the "red bell pepper" document isn't really a suitable alternative).
The approach you have in mind is feasible. It can be achieved with a function score query inside a nested query.
An example implementation is shown below:
PUT test
PUT test/test/_mapping
{
"properties": {
"name": {
"type": "string"
},
"words": {
"type": "nested",
"properties": {
"text": {
"type": "string"
},
"weight": {
"type": "long"
}
}
}
}
}
PUT test/test/0
{
"name": "red bell pepper",
"words": [
{"text": "pepper", "weight": 20},
{"text": "bell","weight": 10},
{"text": "red","weight": 5}
]
}
PUT test/test/1
{
"name": "hot red pepper",
"words": [
{"text": "pepper", "weight": 15},
{"text": "hot","weight": 11},
{"text": "red","weight": 5}
]
}
POST test/_search
{
"query": {
"bool": {
"disable_coord": true,
"must": [
{
"match": {
"name": "red pepper"
}
}
],
"should": [
{
"nested": {
"path": "words",
"query": {
"function_score": {
"functions": [
{
"field_value_factor": {
"field" : "words.weight",
"missing": 0
}
}
],
"query": {
"match": {
"words.text": "red pepper"
}
},
"score_mode": "sum",
"boost_mode": "replace"
}
},
"score_mode": "total"
}
}
]
}
}
}
Result:
"hits": [
{
"_index": "test",
"_type": "test",
"_id": "0",
"_score": 26.030865,
"_source": {
"name": "red bell pepper",
"words": [
{
"text": "pepper",
"weight": 20
},
{
"text": "bell",
"weight": 10
},
{
"text": "red",
"weight": 5
}
]
}
},
{
"_index": "test",
"_type": "test",
"_id": "1",
"_score": 21.030865,
"_source": {
"name": "hot red pepper",
"words": [
{
"text": "pepper",
"weight": 15
},
{
"text": "hot",
"weight": 11
},
{
"text": "red",
"weight": 5
}
]
}
}
]
}
In a nutshell, the query scores a document that satisfies the must clause as follows: it sums the weights of the matched nested documents and adds that sum to the score of the must clause.
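The nested part of the score can be checked by hand. The sketch below only mirrors the arithmetic and is not how Elasticsearch evaluates the query: nested_weight_sum is a hypothetical helper that matches terms by exact lowercase comparison, which is enough for these two sample documents.

```python
def nested_weight_sum(doc, query):
    """Sum the weights of nested words whose text matches a query term."""
    terms = set(query.lower().split())
    return sum(w["weight"] for w in doc["words"] if w["text"].lower() in terms)


docs = [
    {"name": "red bell pepper", "words": [
        {"text": "pepper", "weight": 20},
        {"text": "bell", "weight": 10},
        {"text": "red", "weight": 5},
    ]},
    {"name": "hot red pepper", "words": [
        {"text": "pepper", "weight": 15},
        {"text": "hot", "weight": 11},
        {"text": "red", "weight": 5},
    ]},
]

for doc in docs:
    print(doc["name"], nested_weight_sum(doc, "red pepper"))
# red bell pepper 25
# hot red pepper 20
```

These sums account for the gap between the two hits: the remainder of each _score in the result (26.030865 and 21.030865) is the same for both documents and comes from the must clause.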