Algolia filter by sub key value - search

I have data where one record looks like this:
{
"creatorUsername": "mbalex99",
"description": "For Hikers and All the Lovers Alike!",
"imageUrl": "https://s3.amazonaws.com/edenmessenger/uploads/28C03B77-E3E9-4D33-A433-6522C0480C16.jpg",
"isPrivate": true,
"name": "Nature Lovers ",
"roomId": "-KILq0nBN8wHQuEjMYRF",
"usernames": {
"bannon": true,
"loveless": true,
"mbalex99": true,
"terra": true
},
"objectID": "-KILq0nBN8wHQuEjMYRF"
}
I can't seem to find records where usernames contains the key mbalex99. How can I filter on that?

This is indeed not possible with Algolia. You can only filter by value.
However, you can definitely add an array containing the keys of your object and filter by this attribute:
"usernames": {
"bannon": true,
"loveless": true,
"mbalex99": true,
"terra": true
},
"usernameList": ["bannon", "loveless", "mbalex99", "terra"]
// ...
and
// At query time:
{ "facetFilters": "usernameList:mbalex99" }
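A minimal sketch of deriving that array before indexing, assuming records are plain dictionaries shaped like the one in the question. The helper name `with_username_list` is hypothetical and no Algolia client calls are shown. Note that filtering with facetFilters also expects usernameList to be declared in the index's attributesForFaceting setting.

```python
def with_username_list(record):
    """Return a copy of record with a usernameList array built from the usernames keys."""
    usernames = record.get("usernames", {})
    # Keep only keys whose value is truthy, so {"bob": False} is not listed.
    names = sorted(name for name, member in usernames.items() if member)
    return {**record, "usernameList": names}


room = {
    "objectID": "-KILq0nBN8wHQuEjMYRF",
    "usernames": {"bannon": True, "loveless": True, "mbalex99": True, "terra": True},
}
indexed = with_username_list(room)
print(indexed["usernameList"])
```

The original `usernames` object is left in place, so existing consumers of the record keep working while the new array is used only for filtering.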

How to get catalog search results by keywords via the Amazon SP API that are like the Amazon website search results?

The first method, listCatalogItems, produces correct results but returns at most 10 ASINs, and it is now deprecated.
The other method, searchCatalogItems, produces incorrect, seemingly random results.

FYI, listCatalogItems is marked as deprecated, but it still works.
I am getting correct results when I use searchCatalogItems. Here is my Postman call: https://sellingpartnerapi-na.amazon.com/catalog/2022-04-01/items?marketplaceIds=ATVPDKIKX0DER&keywords=samsung
and part of my results:
{
"numberOfResults": 54592886,
"pagination": {
"nextToken": "9HkIVcuuPmX_bm51o3-igBfN45pxW4Ru7ElIM6GCECYCuXJKzT26f-3Tfs1Ro3IhelNA74VxDMJwt_JvE7qiRh0loZTzTpEBWUbZ8HB0T4ttV8cFw4xYQ4RMUzdY_udbnvAHOHCcZcycn0nW8RotZh1l1vj7KQoFIa7pWiOPHyaYWP7sBE9Fg7cGN2wE0an5ePw96h6ZL7m6olRxFOcqTWNanEVRjipq"
},
...
"items": [
{
"asin": "B09YN4W5C1",
"summaries": [
{
"marketplaceId": "ATVPDKIKX0DER",
"adultProduct": false,
"autographed": false,
"brand": "SAMSUNG",
"itemClassification": "VARIATION_PARENT",
"itemName": "SAMSUNG Jet Bot Robot Vacuum Cleaner",
"manufacturer": "SAMSUNG",
"memorabilia": false,
"packageQuantity": 1,
"tradeInEligible": false,
"websiteDisplayGroup": "home_display_on_website",
"websiteDisplayGroupName": "Home"
}
]
},
{
"asin": "B01AQ6OWAG",
"summaries": [
{
"marketplaceId": "ATVPDKIKX0DER",
"adultProduct": false,
"autographed": false,
"brand": "SAMSUNG",
"browseClassification": {
"displayName": "Remote Controls",
"classificationId": "10967581"
},
"itemClassification": "BASE_PRODUCT",
"itemName": "SAMSUNG TV Remote Control BN59-01199F by Samsung",
"manufacturer": "Samsung",
"memorabilia": false,
"modelNumber": "BN59-01199F",
"packageQuantity": 1,
"partNumber": "BN59-01199F",
"tradeInEligible": false,
"websiteDisplayGroup": "ce_display_on_website",
"websiteDisplayGroupName": "CE"
}
]
},

Google Speech API transcribe email

I'm having trouble transcribing email addresses using the Google Speech REST API. The best I can get is most of the email address; however, Google Speech ignores "dot" and "dot com". For example, first.last@gmail.com returns "First Last at gmail". If I say "period" instead of "dot", I at least get "First. Last at gmail." I'm using the following:
{
"config": {
"encoding": "MULAW",
"sampleRateHertz": 8000,
"languageCode": "en-US",
"maxAlternatives": 0,
"profanityFilter": true,
"enableWordTimeOffsets": false,
"model": "phone_call",
"useEnhanced": true
},
"audio": {
"content":"&&NameBase64&&"
}
}
I've tried adding "dot" as a speech context with no changes. ".", ".com", "com", and "kom" also didn't change the results.
{
"config": {
"encoding": "MULAW",
"sampleRateHertz": 8000,
"languageCode": "en-US",
"maxAlternatives": 1,
"profanityFilter": true,
"enableWordTimeOffsets": false,
"model": "phone_call",
"useEnhanced": true,
"speechContexts": [{
"phrases": ["dot"]
}]
},
"audio": {
"content":"Base64Recording"
}
}
I've tried adding alphanumeric speech contexts and spelling it out, but the results were pretty bad.
Any thoughts on how I can get "." or "dot" and "com" to show up in the transcription would be greatly appreciated.

Have you tried providing a boost value for the phrase? I'm facing the same issue and I noticed that increasing the boost value helped in identifying the word "dot".
Boost values are usually between 0 and 20, but applying anything above 10 helped in recognizing the "dot".
Here's an example:
{
"config": {
"encoding": "MULAW",
"sampleRateHertz": 8000,
"languageCode": "en-US",
"maxAlternatives": 1,
"profanityFilter": true,
"enableWordTimeOffsets": false,
"model": "phone_call",
"useEnhanced": true,
"speechContexts": [{
"phrases": ["dot"],
"boost": 15.0
}]
},
"audio": {
"content":"Base64Recording"
}
}
You can also supply multiple speech contexts, each with its own boost value. For example, this is what I use to detect email addresses:
[{
"phrases": ["$OOV_CLASS_ALPHANUMERIC_SEQUENCE"],
"boost": 14.0
},
{
"phrases": ["gmail.com","yahoo.com","aol.com","outlook.com"],
"boost": 5.0
},
{
"phrases": ["com",".","c o m",".com","dotcom","dot com","dot","at","at the rate","@"],
"boost": 10.0
},
{
"phrases": ["org","io","dot org","dot io","gov","dot gov","net","dot net","co","dot co"],
"boost": 8.0
},
{
"phrases": ["$OOV_CLASS_DIGIT_SEQUENCE","8","naught","z","zed","zee","zz","d","aa","ae","ee","oo","ii","ay","eh","ahh","ah","ze","dee",
"1","2","3","4","5","6","7","8","9","0","zero"],
"boost": -20.0
}
]
Note that the phrases with negative boost values are there to help weed out words that are often misrecognized.
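A rough sketch of assembling such a request body programmatically, so that each group of phrases carries its own boost and the serialized JSON has no stray trailing commas. The helper name `build_recognize_request` is hypothetical, the config values are copied from the examples above, and nothing here calls the Speech API itself.

```python
import json


def build_recognize_request(audio_b64, boosted_phrases):
    """Build a recognize request body with one speech context per (phrases, boost) pair."""
    contexts = [
        {"phrases": list(phrases), "boost": float(boost)}
        for phrases, boost in boosted_phrases
    ]
    return {
        "config": {
            "encoding": "MULAW",
            "sampleRateHertz": 8000,
            "languageCode": "en-US",
            "model": "phone_call",
            "useEnhanced": True,
            "speechContexts": contexts,
        },
        "audio": {"content": audio_b64},
    }


body = build_recognize_request(
    "Base64Recording",
    [(["dot", "dot com", "at"], 15), (["gmail.com", "yahoo.com"], 5)],
)
# json.dumps never emits trailing commas, so the payload is valid JSON.
print(json.dumps(body, indent=2))
```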

MongoDB query that contain at least one of filters specified in an object

I have a collection in MongoDB that is structured like this:
"5d33488672886334cd21904xx": {
"ownerId": "5d333551k99951924fb3208",
"createdAt": 1582456098,
....phone
....email
....etc
"mainActivities": {
"dogWalking": true
},
"secActivities": {
"washing": true,
"houseSitting": true,
"dogSitting": true,
"training": true,
"trainingEquipment": true,
"selectionAdvice": true,
"institutionsBusinesses": true
}
}
"5d33488672886334cd21904xx": {
"ownerId": "5d333d344d46ed924fb3208",
"createdAt": 99999995,
....phone
....email
....etc
"mainActivities": {
"dogWalking": true
},
"secActivities": {
"washing": false,
"houseSitting": false,
"dogSitting": false,
"training": true,
"trainingEquipment": false,
"selectionAdvice": true,
"institutionsBusinesses": true
}
}
I am sending filters from the front end like so:
{
"filters": {
"mainActivities": {
"dogWalking": true
},
"secActivities": {
"washing": false,
"houseSitting": true,
"dogSitting": true,
"training": true,
"trainingEquipment": true,
"selectionAdvice": false,
"institutionsBusinesses": true
}
}
}
I want to get all the documents that satisfy the filters, not only EXACT matches.
Example:
Say I run a dog walking service named "Xdog" that offers all 6/6 activities (secActivities).
When a user selects 2/6 filters in the front end, the "Xdog" service is missed.
My function returns only the documents that match EXACTLY, which is not good because
I am missing the services that actually satisfy those 2/6 filters.
Right now I am spreading the "filterBy" variable into the find() function.
Any suggestion would be great; I am weak in database commands.
Sorry for my poor English!

The way to do partial match is with $or, but I don't think that will do what you actually want.
The example query you showed above is looking for dog walking services that do not provide washing or selectionAdvice services. Your description suggests that you do not want to consider these 2 services because the user did not specify them. In that case, leave these 2 out of the query entirely, using the user selections to add only the relevant fields, like:
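.find({
// Dot notation matches each field individually; a nested object would
// require the whole subdocument to match exactly.
"mainActivities.dogWalking": true,
"secActivities.houseSitting": true,
"secActivities.dogSitting": true,
"secActivities.training": true,
"secActivities.trainingEquipment": true,
"secActivities.institutionsBusinesses": true
})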
If the filters object is already built, you would need to remove the properties that are false. See How do I remove a property from a JavaScript object?
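A minimal sketch of building the query from the incoming filters object, assuming it always has exactly two levels (group, then activity flags) as in the question. The helper names `build_query` and `any_of` are hypothetical. The sketch emits dot-notation keys such as `secActivities.training`, because MongoDB compares a nested object in a query against the whole embedded document, while dot notation tests a single field.

```python
def build_query(filters):
    """Flatten a nested filters object into a dot-notation query, keeping only true flags."""
    query = {}
    for group, flags in filters.items():
        for activity, wanted in flags.items():
            if wanted:  # skip activities the user did not select
                query[f"{group}.{activity}"] = True
    return query


def any_of(query):
    """Wrap the same conditions in $or to match documents satisfying at least one."""
    return {"$or": [{field: value} for field, value in query.items()]}


filters = {
    "mainActivities": {"dogWalking": True},
    "secActivities": {"washing": False, "houseSitting": True, "training": True},
}
query = build_query(filters)
print(query)
print(any_of(query))
```

`build_query` requires every selected activity to be present, while `any_of` gives the "at least one" behaviour from the title. `$or` needs a non-empty list, so `any_of` should only be used when at least one filter is selected. Either result can be passed as the filter document to the driver's find call.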

Azure Search - Partial Phrase match

I'm trying to improve the ranking of results that come back from an Azure Search index. The search index basically contains a list of band names and members.
Exact matches are important to us, but so are partial phrase matches and partial-word matches within the query.
Take the example of trying to find a band called Black Flag: in the user input area, I have got as far as typing black fl.
I currently structure the query as: "black fl"|black fl* (exact match on whole phrase and partial match on fl).
This brings back the following results in following order:
Flourescent Black
Florence Black
Black Flag
At the moment, there is the single text field being searched against, using the Standard - Lucene Analyzer.
I've looked at Scoring Profiles but these don't appear to be relevant to such a small dataset in terms of fields available.
I have also explored the full Lucene search, by adding things like ^10 on the word black to make it more important, and have changed my query string in many ways, none of which seem to give the effect I'm after.
I would expect that Black Flag would match better as the word order is more correct than that of the results that come above it.
Is there a way to change the scoring method to handle this? I now imagine that I'm looking at dealing with a custom analyzer (https://learn.microsoft.com/en-gb/azure/search/index-add-custom-analyzers) but not really sure where to start with this or how I would want the analyzer to behave.
Any thoughts or examples on how best to handle this scenario would be greatly appreciated.
EDIT - More Info
The current solution consists of the following, but it involves having to manipulate the results that come back from the search index.
The index is created as follows:
{
"fields": [
{"name": "id", "type": "Edm.String", "key": true, "filterable": false, "searchable": false, "sortable": false, "facetable": false},
{"name": "entityId", "type": "Edm.Int64", "filterable": false, "searchable": false, "sortable": false, "facetable": false},
{"name": "entityType", "type": "Edm.Int32", "sortable": false, "facetable": false},
{"name": "sortableName", "type": "Edm.String", "filterable": false, "facetable": false, "searchable": false},
{"name": "name", "type": "Edm.String", "filterable": false, "retrievable": false, "sortable": false, "facetable": false, "analyzer":"keyword_analyzer"},
{"name": "town", "type": "Edm.String", "filterable": false, "retrievable": false, "sortable": false, "facetable": false, "analyzer":"keyword_analyzer"},
{"name": "tags", "type": "Collection(Edm.String)", "filterable": false, "retrievable": false, "sortable": false, "facetable": false, "analyzer":"keyword_analyzer"}
],
"defaultScoringProfile": "default_score",
"scoringProfiles": [
{
"name": "default_score",
"text":{
"weights": {
"name": 3.5,
"tags": 2,
"town": 1
}
}
}
],
"analyzers":[
{
"name": "keyword_analyzer",
"@odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
"charFilters":[
"map_dash",
"map_space"
],
"tokenizer":"keyword_tokenizer",
"tokenFilters":[
"asciifolding",
"lowercase",
"trim",
"delimiter_filter"
]
}
],
"charFilters":[
{
"name":"map_dash",
"@odata.type":"#Microsoft.Azure.Search.MappingCharFilter",
"mappings":["-=>_"]
},
{
"name":"map_space",
"@odata.type":"#Microsoft.Azure.Search.MappingCharFilter",
"mappings":["\\u0020=>_"]
}
],
"tokenizers":[
{
"name": "keyword_tokenizer",
"@odata.type":"#Microsoft.Azure.Search.KeywordTokenizerV2"
}
],
"tokenFilters":[
{
"name": "stopwords_filter",
"@odata.type":"#Microsoft.Azure.Search.StopwordsTokenFilter",
"removeTrailing": false
},
{
"name": "delimiter_filter",
"@odata.type":"#Microsoft.Azure.Search.WordDelimiterTokenFilter",
"generateWordParts": true,
"generateNumberParts": true,
"splitOnCaseChange": false,
"preserveOriginal": true,
"splitOnNumerics": false
}
]
}
Before uploading data to the index we need to normalize it: Black Flag becomes black flag. We also have to remove any leading "the", so The Killers becomes killers. Any non-standard characters are also replaced to remove accents etc.
When performing a search, our code now has to remove any leading "the" if it exists and perform the same normalization. I can accept doing this.
We then build up the query which changes depending upon how many words there are in the initial query.
List<string> splitQ = queryPhrase.SplitToList(" ");
if (splitQ.Count > 0)
{
    if (splitQ.Count == 1)
    {
        // Single word: exact term or prefix match.
        search.Append($"(\"{splitQ[0]}\" || {this.EscapeSpecialCharacters(splitQ[0])}*)");
    }
    else
    {
        for (int i = 0; i < splitQ.Count; i++)
        {
            if (i == splitQ.Count - 1)
            {
                // Last word: required prefix match.
                search.Append($"+{this.EscapeSpecialCharacters(splitQ[i])}*");
            }
            else
            {
                // Earlier words: required exact terms.
                search.Append($"+\"{splitQ[i]}\"");
            }
        }
        // Wrap so the whole phrase is an alternative to the per-word clauses.
        search.Insert(0, $"(\"{queryPhrase}\"||(");
        search.Append("))");
    }
}
A single word black would mean the main query is: ("black" || black*)
However, as soon as additional words come in it has to change. black fl becomes: ("black fl"||(+"black"+fl*))
A three word search would be: ("one two three"||(+"one"+"two"+three*))
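The same construction can be sketched in Python to check the three example queries above. This is only an illustration of the described logic: `escape_special_characters` is a hypothetical stand-in for the EscapeSpecialCharacters helper (its real implementation is not shown), assumed here to backslash-escape Lucene's special characters, and whitespace is split with `str.split()`.

```python
import re

# Assumption: backslash-escape the characters Lucene query syntax treats as special.
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_special_characters(term):
    return _SPECIAL.sub(r"\\\1", term)


def build_search(query_phrase):
    """Build the full-Lucene query string for a normalized query phrase."""
    words = query_phrase.split()
    if not words:
        return ""
    if len(words) == 1:
        # Single word: exact term or prefix match.
        return f'("{words[0]}" || {escape_special_characters(words[0])}*)'
    # Every word but the last is a required exact term; the last is a required prefix.
    required = "".join(f'+"{word}"' for word in words[:-1])
    prefix = f"+{escape_special_characters(words[-1])}*"
    return f'("{query_phrase}"||({required}{prefix}))'


for phrase in ("black", "black fl", "one two three"):
    print(build_search(phrase))
```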
On top of this we then add any filter options.
The search is sent to the index with the query type set to full.
The above has got us as close as we can to having decent and accurate results. However, the scoring is all messed up.
Processing the results...
Firstly, we normalize the score given by the Azure Search index. Depending upon the search query, the scores range massively, so we normalize each score as a percentage of the maximum-scoring item.
We then apply our own enhancer to the score based on the tag or name field. An exact match to the query gives an enhancer of 5, and a starts-with match gets an enhancer of 3.
We then provide a score that uses the enhancer to raise the result's position in the rankings.
This final section of processing the results seems like something that should be done automatically within the search index system.
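For illustration, the post-processing described above might look roughly like the sketch below. Several details are assumptions, since the question does not spell them out: the enhancer is multiplied into the normalized score, results matching neither rule get an enhancer of 1, only the name field is checked (not tags), and the raw scores in the example are made-up values rather than real Azure Search output.

```python
def rerank(results, query):
    """Normalize raw scores to a percentage of the top score, then apply the enhancer."""
    if not results:
        return []
    top = max(result["score"] for result in results)
    ranked = []
    for result in results:
        percent = 100.0 * result["score"] / top if top else 0.0
        name = result["name"].lower()
        if name == query:
            enhancer = 5.0  # exact match
        elif name.startswith(query):
            enhancer = 3.0  # starts-with match
        else:
            enhancer = 1.0  # assumption: no change otherwise
        ranked.append({**result, "final": percent * enhancer})
    return sorted(ranked, key=lambda result: result["final"], reverse=True)


# Made-up raw scores, for illustration only.
raw = [
    {"name": "Flourescent Black", "score": 2.0},
    {"name": "Florence Black", "score": 1.8},
    {"name": "Black Flag", "score": 1.5},
]
for row in rerank(raw, "black fl"):
    print(row["name"], row["final"])
```

With these illustrative inputs, Black Flag moves to the top because it is the only name that starts with the query.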

Azure Search - OrderBy filterable field then by distance

I'm trying to cater for the following example with Azure Search.
Given the following index schema:
{
"name": "mySchema",
"fields": [
{
"name": "Id",
"type": "Edm.String",
"key": true,
"searchable": false,
"filterable": false,
"sortable": false,
"facetable": false,
"retrievable": true,
"suggestions": false
},
{
"name": "StateId",
"type": "Edm.Int32",
"key": false,
"searchable": false,
"filterable": true,
"sortable": false,
"facetable": false,
"retrievable": true,
"suggestions": false
},
{
"name": "Location",
"type": "Edm.GeographyPoint",
"key": false,
"searchable": false,
"filterable": true,
"sortable": true,
"facetable": false,
"retrievable": true,
"suggestions": false
}
]
}
I want to be able to order my results firstly on the StateId field, and then by the distance from a given lat/long location.
I realise that I am able to achieve the first part by using a $filter= StateId eq x component when querying. However, I do want to still receive results (with a lower score) that are not in the provided StateId, but are of a given distance away from a provided location.
I recognise also, that this looks like it should be able to be achieved by a custom Scoring Profile. I would expect by using a Scoring Profile, I'd be able to return something like this:
[
{
"@search.score":100.0,
"Id":"111",
"StateId":"123",
"Location": {"type": "Point details...."}
},
{
"@search.score":100.0,
"Id":"222",
"StateId":"123",
"Location": {"type": "Point details...."}
},
{
"@search.score":50.0,
"Id":"333",
"StateId":"789",
"Location": {"type": "Point details...."}
}
]
However, I am not able to search on the StateId field, as this is an Edm.Int32 value, so I do not believe using a Scoring Profile would be a viable solution.
Anyone come across a similar scenario?
EDIT:
Trying to explain a bit further: if I were to express this in pseudo-SQL, this is basically the case I'm trying to handle:
ORDER BY (CASE WHEN StateId = @StateId THEN 1 ELSE 0 END) DESC, Location

We don't currently support modeling this scenario with scoring profiles. This has come up multiple times though, so it's something we'd like to add.
In the meantime, one thing you can do as a workaround is to add the StateId value to one of the searchable fields (e.g. just append it at the end of the text). Then during search include the state id as part of the search string, which should skew results towards those documents that match the state id (or that are a very good match without it, which might be good relevance anyway depending on the case).
During display, if you show this text field you'd have to strip out the state id from the end of the string (or use a different field).
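A small sketch of the three steps in this workaround: tag the searchable text at indexing time, add the same tag to the search string, and strip it again for display. The `state123` token format and all function names are assumptions made for illustration; the answer only says to append the state id, and a prefixed token is used here so that it is less likely to collide with ordinary numbers in the text.

```python
import re

# Hypothetical token format: the word "state" followed by the numeric StateId.
_STATE_TOKEN = re.compile(r"\s*state\d+$")


def append_state_token(text, state_id):
    """Indexing time: append the state token to the searchable text."""
    return f"{text} state{state_id}"


def build_search_text(user_query, state_id):
    """Query time: include the same token so matching documents score higher."""
    return f"{user_query} state{state_id}"


def strip_state_token(text):
    """Display time: remove the trailing state token again."""
    return _STATE_TOKEN.sub("", text)


stored = append_state_token("Joe's Pizza", 123)
print(stored)
print(build_search_text("pizza", 123))
print(strip_state_token(stored))
```

Documents outside the state can still match on the remaining search terms, which is what allows them to appear lower in the results instead of being filtered out.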
