I'm using Contentful on a current project and I'm trying to understand how to filter my query results based on a field in a linked entry.
My top-level content type contains a Link field defined like this:
"name": "Service_Description",
"fields": [
{
"name": "Header",
"id": "header",
"type": "Link",
"linkType": "Entry",
"required": true,
"validations": [
{
"linkContentType": [
"offerGeneral"
]
}
],
"localized": false,
"disabled": false,
"omitted": false
},
This "header" field links to another content type that has this definition:
"fields": [
{
"name": "General",
"id": "general",
"type": "Link",
"linkType": "Entry",
"required": true,
"validations": [
{
"linkContentType": [
"genericGeneral"
]
}
],
"localized": false,
"disabled": false,
"omitted": false
},
which then links to the lowest level:
"fields": [{
"name": "TagList",
"id": "tagList",
"type": "Array",
"items": {
"type": "Link",
"linkType": "Entry",
"validations": [
{
"linkContentType": [
"tag"
]
}
]
},
"validations": []
}
where tagList is an array of tags this piece of content may have.
I want to be able to run a query from the top-level object that says: get me X of these "Service_Description" entries that contain a tag from a supplied list of tags.
In Postman, I've been running this:
https://cdn.contentful.com/spaces/{SPACE_ID}/entries?access_token={ACCESS_TOKEN}&content_type=serviceDescription&include=3
I'm trying to add a filter something like so:
fields.header.fields.general.fields.tagList.sys.id%5Bin%5D={TAG_SYS_ID}
This is clearly incorrect, but I've been struggling with how to walk this relationship to achieve my goal. Perusing the documentation, this seems to have something to do with includes, but I'm unsure how to fix the query.
Any direction on how to achieve this, or whether it's even possible?
This is now possible; I believe it was added to the API based on requests for this functionality. You can see the thread here.
The gist of it is that you have to query on the entries that have linked entries and then also specify the content type of those linked entries in the query, like so:
contentfulClient.getEntries({
'content_type': 'location',
'fields.market.fields.marketName': 'New York',
'fields.market.sys.contentType.sys.id': 'marketRegion'
})
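For the REST Content Delivery API calls used in the question, the same pattern translates to query parameters along these lines (space ID and token are placeholders; the content type and field names come from the SDK example above, with "New York" URL-encoded):
https://cdn.contentful.com/spaces/{SPACE_ID}/entries?access_token={ACCESS_TOKEN}&content_type=location&fields.market.sys.contentType.sys.id=marketRegion&fields.market.fields.marketName=New%20York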
Unfortunately what you are requesting is not currently possible in Contentful.
We were facing a very similar issue with nested/referenced content types and support said it wasn't possible.
We ended up writing a very complicated system that allowed us to do what you want. Essentially, we did a full-text search for the referenced content and then queried all of the parent entries. We then matched the relationships by iterating over the parents.
Sorry it couldn't be easier. Hopefully the devs work on something that improves this; we have brought it to their attention.
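A rough sketch of that workaround with the contentful.js client, using the content types from the question above; the function name is made up, and it assumes the SDK's default link resolution so the linked header/general/tagList entries come back resolved:
// Fetch serviceDescription entries with links resolved three levels deep,
// then keep only those whose tagList references one of the supplied tag IDs.
async function findServiceDescriptionsByTags(client, tagIds) {
  const result = await client.getEntries({
    content_type: 'serviceDescription',
    include: 3
  });

  return result.items.filter(entry => {
    const tagList = entry.fields.header?.fields?.general?.fields?.tagList || [];
    return tagList.some(tag => tagIds.includes(tag.sys.id));
  });
}
Note that this filters client-side after fetching, so it has to page through results rather than pushing the tag filter down to the API, which is exactly the complication described above.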
Related
Trying to add a custom skill to the skillset and map it in the index.
Here it is in detail.
I'm using Azure Named Entity Recognition in my skillset, as:
{
"#odata.type": "#Microsoft.Skills.Text.MergeSkill",
"description": "Merge text content with image tags",
"insertPreTag": " ",
"context": "/document",
"inputs": [
{
"name": "text",
"source": "/document/fullTextAndCaptions"
},
{
"name": "itemsToInsert",
"source": "/document/normalized_images/*/Tags/*/name"
}
],
"outputs": [
{
"name": "mergedText",
"targetName": "finalText"
}
]
}
and in the indexer as
{
"sourceFieldName": "/document/finalText/pages/*/entities/*/value",
"targetFieldName": "entities"
},
{
"sourceFieldName": "/document/finalText/pages/*/locations/*",
"targetFieldName": "locations"
},
and it works 100%. Now I want to add the Distinct custom skill from https://github.com/Azure-Samples/azure-search-power-skills/tree/master/Text/Distinct
I did publish the function and when I go to test it manually it works as expected.
However, overall it's not working in the skillset. I want it to take the locations, filter them, and output only the distinct values in their own field in the search index.
I'm having a really hard time configuring the skillset and indexer to get it to work.
Any help, please?
You'll need to add the Distinct custom skill like this, assuming you want to dedupe over the whole document:
{
"skills": [
...
{
"#odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
"description": "Distinct skill",
"uri": "<https://distinct-skill>",
"context": "/document",
"inputs": [
{
"name": "locations",
"source": /document/finalText/pages/*/locations/*"
}
],
"outputs": [
{
"name": "distinct",
"targetName": "distinctLocations"
}
]
}
...
]
}
and an output field mapping to put it into the index.
{
"sourceFieldName": "/document/distinctLocations",
"targetFieldName": "distinctLocations"
}
See https://learn.microsoft.com/en-us/azure/search/cognitive-search-custom-skill-interface#consuming-custom-skills-from-skillset for adding a custom skill.
The skill inputs for the custom skill must be configured to point to the data you want to deduplicate. In this case, you didn't really need to modify the code; all you had to do was have an input with the name 'words' and source '/document/finalText/pages/*/locations/*'.
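Concretely, the inputs/outputs block of the WebApiSkill above would then look something like this (assuming the unmodified Distinct power skill sample, which reads its input from 'words'):
"inputs": [
  {
    "name": "words",
    "source": "/document/finalText/pages/*/locations/*"
  }
],
"outputs": [
  {
    "name": "distinct",
    "targetName": "distinctLocations"
  }
]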
I would like to do a query that matches against two properties of the same item in a sub-collection.
Example:
[
{
"name": "Person 1",
"contacts": [
{ "type": "email", "value": "person.1#xpto.org" },
{ "type": "phone", "value": "555-12345" },
]
}
]
I would like to be able to search for emails that contain xpto.org, but
doing something like the following doesn't work:
search.ismatchscoring('email','contacts/type','full','all') and search.ismatchscoring('/.*xpto.org/','contacts/value','full','all')
Instead, it considers each condition in the context of the main object, so objects like the following will also match:
[
{
"name": "Person 1",
"contacts": [
{ "type": "email", "value": "555-12345" },
{ "type": "phone", "value": "person.1#xpto.org" },
]
}
]
Is there any way around this without having an additional field that concatenates type and value?
Just saw the official doc. At this moment, there's no support for correlated search:
"This happens because each clause applies to all values of its field in the entire document, so there's no concept of a 'current sub-document'."
https://learn.microsoft.com/en-us/azure/search/search-howto-complex-data-types
and https://learn.microsoft.com/en-us/azure/search/search-query-understand-collection-filters
The solution I implemented was creating a different collection per contact type.
This way I'm able to search directly in, let's say, the email collection without the need for correlated search. It might not be the solution for all cases, but it works well in this one.
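A minimal sketch of that index shape, assuming one plain string collection per contact type (the field names are illustrative):
"fields": [
  { "name": "name", "type": "Edm.String", "searchable": true },
  { "name": "emails", "type": "Collection(Edm.String)", "searchable": true },
  { "name": "phones", "type": "Collection(Edm.String)", "searchable": true }
]
A query such as search.ismatchscoring('/.*xpto.org/', 'emails', 'full', 'all') then only ever sees email values, so the cross-matching problem from the original question disappears.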
I was trying out phonetic search using Azure Search without much luck. My objective is to work out an index configuration that can handle typos and accommodate phonetic search for end users.
With the below configuration and sample data, I was trying to search for intentionally misspelled words like 'softvare' or 'alek'. I got results for 'alek', thanks to the phonetic analyzer, but didn't get any results for 'softvare'.
Looks like for this requirement phonetic search will not do the trick.
The only option I found was to use a synonym map. The major pitfall is that I'm unable to use the phonetic/custom analyzer along with synonyms :(
What are the various strategies that you would recommend for taking care of typos?
Search queries used:
?api-version=2017-11-11&search=alec
?api-version=2017-11-11&search=softvare
Here is the index configuration
"name": "phonetichotels",
"fields": [
{"name": "hotelId", "type": "Edm.String", "key":true, "searchable": false},
{"name": "baseRate", "type": "Edm.Double"},
{"name": "description", "type": "Edm.String", "filterable": false, "sortable": false, "facetable": false, "analyzer":"my_standard"},
{"name": "hotelName", "type": "Edm.String", "analyzer":"my_standard"},
{"name": "category", "type": "Edm.String", "analyzer":"my_standard"},
{"name": "tags", "type": "Collection(Edm.String)", "analyzer":"my_standard"},
{"name": "parkingIncluded", "type": "Edm.Boolean"},
{"name": "smokingAllowed", "type": "Edm.Boolean"},
{"name": "lastRenovationDate", "type": "Edm.DateTimeOffset"},
{"name": "rating", "type": "Edm.Int32"},
{"name": "location", "type": "Edm.GeographyPoint"}
],
Analyzer (part of the index creation)
"analyzers":[
{
"name":"my_standard",
"#odata.type":"#Microsoft.Azure.Search.CustomAnalyzer",
"tokenizer":"standard_v2",
"tokenFilters":[ "lowercase", "asciifolding", "phonetic" ]
}
]
Analyze API Input and Output for 'software'
{
"analyzer":"my_standard",
"text": "software"
}
{
"#odata.context": "https://ctsazuresearchpoc.search.windows.net/$metadata#Microsoft.Azure.Search.V2017_11_11.AnalyzeResult",
"tokens": [
{
"token": "SFTW",
"startOffset": 0,
"endOffset": 8,
"position": 0
}
]
}
Analyze API Input and Output for 'softvare'
{
"analyzer":"my_standard",
"text": "softvare"
}
{
"#odata.context": "https://ctsazuresearchpoc.search.windows.net/$metadata#Microsoft.Azure.Search.V2017_11_11.AnalyzeResult",
"tokens": [
{
"token": "SFTF",
"startOffset": 0,
"endOffset": 8,
"position": 0
}
]
}
Sample data that I loaded
{
"#search.action": "upload",
"hotelId": "5",
"baseRate": 199.0,
"description": "Best hotel in town for software people",
"hotelName": "Fancy Stay",
"category": "Luxury",
"tags": ["pool", "view", "wifi", "concierge"],
"parkingIncluded": false,
"smokingAllowed": false,
"lastRenovationDate": "2010-06-27T00:00:00Z",
"rating": 5,
"location": { "type": "Point", "coordinates": [-122.131577, 47.678581] }
},
{
"#search.action": "upload",
"hotelId": "6",
"baseRate": 79.99,
"description": "Cheapest hotel in town ",
"hotelName": " Alec Baldwin Motel",
"category": "Budget",
"tags": ["motel", "budget"],
"parkingIncluded": true,
"smokingAllowed": true,
"lastRenovationDate": "1982-04-28T00:00:00Z",
"rating": 1,
"location": { "type": "Point", "coordinates": [-122.131577, 49.678581] }
},
With the right configuration, I should have got results even with the misspelled words.
I work on Azure Search. Before I suggest approaches to handle misspelled words, it would be helpful to look at your custom analyzer (my_standard) configuration. It might tell us why it's not able to handle the case of 'softvare'. As a DIY, you can use the Analyze API to see the tokens your custom analyzer creates; to actually match the docs, they would need to contain the same token as 'software'.
Now then, here are a few ways that can be used independently or in conjunction to handle misspelled words. The best approach varies depending on the use-case and I strongly suggest you experiment with these to figure out the best one in your case.
You are already familiar with phonetic filters, which are a common approach to handling similarly pronounced terms. If you haven't already, try different encoders for the filter to evaluate which configuration gives you the best results. Check out the list of encoders here.
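For example, the phonetic filter's encoder can be switched by defining a custom token filter next to the analyzer; this is a sketch only, with a made-up filter name and doubleMetaphone as just one of the available encoder values:
"tokenFilters": [
  {
    "name": "my_phonetic",
    "@odata.type": "#Microsoft.Azure.Search.PhoneticTokenFilter",
    "encoder": "doubleMetaphone"
  }
],
"analyzers": [
  {
    "name": "my_standard",
    "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
    "tokenizer": "standard_v2",
    "tokenFilters": [ "lowercase", "asciifolding", "my_phonetic" ]
  }
]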
Use fuzzy queries, supported as part of the Lucene query syntax in Azure Search, which return terms that are near the original query term based on an edit-distance metric. The limitation here is that it works on a single term. Check the docs for more details. A sample query would look like search=softvare~1. You can also use term boosting to give the original term more weight in cases where the original term is also a valid term.
You also alluded to synonyms, which can likewise be used to query with misspelled terms. This approach gives you the most control over the process of handling typos, but it also requires you to have prior knowledge of the different typos for your terms. You can use these docs if you want to experiment with synonyms.
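For reference, a synonym map is a separate resource that maps known misspellings to the canonical term; a minimal sketch (the map name and the synonym rules are illustrative):
PUT https://[service-name].search.windows.net/synonymmaps/typo-synonyms?api-version=2017-11-11
{
  "name": "typo-synonyms",
  "format": "solr",
  "synonyms": "softvare, sofware => software"
}
The field that should use it then lists the map in its definition, e.g. "synonymMaps": ["typo-synonyms"].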
As you can read in my post, my objective was to handle the typos.
The only easy option is to use the built-in Lucene functionality, fuzzy search. I have yet to check the response times, as the queryType has to be set to 'full' to use fuzzy search. Otherwise, the results were satisfactory.
Example:
search=softvare~&fuzzy=true&querytype=full
will return all documents with 'software' in them.
For further reading, please go through the documentation.
I have a content type entry in Contentful that has fields like this:
"fields": {
"title": "How It Works",
"slug": "how-it-works",
"countries": [
{
"sys": {
"type": "Link",
"linkType": "Entry",
"id": "3S5dbLRGjS2k8QSWqsKK86"
}
},
{
"sys": {
"type": "Link",
"linkType": "Entry",
"id": "wHfipcJS6WUSaKae0uOw8"
}
}
],
"content": [
{
"sys": {
"type": "Link",
"linkType": "Entry",
"id": "72R0oUMi3uUGMEa80kkSSA"
}
}
]
}
I'd like to run a query that would only return entries if they contain a particular country.
I played around with this query:
https://cdn.contentful.com/spaces/aoeuaoeuao/entries?content_type=contentPage&fields.countries=3S5dbLRGjS2k8QSWqsKK86
However, I get this error:
The equals operator cannot be used on fields.countries.en-AU because it has type Object.
I'm playing around with Postman, but will be using the .NET API.
Is it possible to search for entries, and filter on arrays that contain objects?
Still learning the API, so I'm guessing it should be pretty straightforward.
Update:
I looked at the request the Contentful Web CMS makes, as this functionality is possible there. They use query params like this:
filters.0.key=fields.countries.sys.id&filters.0.val=3S5dbLRGjS2k8QSWqsKK86
However, this did not work in the delivery API, and might only be an internal query format.
Figured this out. I used the following URL:
https://cdn.contentful.com/spaces/aoeuaoeua/entries?content_type=contentPage&fields.countries.sys.id=wHfipcJS6WUSaKae0uOw8
Note the query parameter fields.countries.sys.id
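If you need to match any one of several countries rather than a single one, the same field path should also accept Contentful's [in] operator (the IDs below are just the ones from the question):
https://cdn.contentful.com/spaces/aoeuaoeua/entries?content_type=contentPage&fields.countries.sys.id[in]=3S5dbLRGjS2k8QSWqsKK86,wHfipcJS6WUSaKae0uOw8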
I have 3 models: 2 resource models, account(id, name) and widget(id, name), and 1 mapping model between the two, widget_to_account(id, account_id, widget_id), to tell what widgets an account has access to, so to speak.
When stating the relationship between the models in their JSONs, using the guide in http://loopback.io/doc/en/lb3/HasManyThrough-relations.html, RESTful requests like "get widgets of account id=1", for example, work perfectly.
GET /accounts/1/widgets returns the widgets that account 1 has, yielding a widgets array:
[
{
"id": 1,
"name": "wg_user_mgr"
},
{
"id": 2,
"name": "wg_desc"
}
]
That's all good.
However, say I wanted to append this widgets array to the account object returned by a GET on the account model. The LoopBack documentation suggests that this is done using the include keyword with the request, like so:
GET /accounts/1?filter[include]=widgets, hopefully returning an account model with its allowed widgets:
{
"id": 1,
"name": "Account1Name",
"widgets": [
{
"id": 1,
"name": "wg_user_mgr",
"display_name": "User Manager"
},
{
"id": 2,
"name": "wg_desc",
"display_name": "Description"
}
]
}
However, what is actually returned by LoopBack with that request is:
{
"id": 1,
"name": "Account1Name",
"widgets": []
}
Empty widgets array! When I look at the LoopBack SQL debug output, I see that it does go to the widget_to_account table and select the entries with account_id=1, but interestingly it stops there and just returns an empty widgets array.
Any clues? The hasManyThrough LoopBack docs don't actually show any examples of using include like this to bridge two models connected via a mapping model.
My guess is they just forgot to code it in ¯\_(ツ)_/¯
UPDATE:
Doing some more digging around, I found the answer at https://groups.google.com/forum/#!topic/loopbackjs/sH7bKoqzU5c.
Where you define the relationships in the 2 resource models, you have to specifically define the "keyThrough" value.
NOT THIS:
"relations": {
"widgets": {
"type": "hasMany",
"model": "widget",
"foreignKey": "account_id",
"through": "widget_to_account"
}
}
BUT THIS:
"relations": {
"widgets": {
"type": "hasMany",
"model": "widget",
"foreignKey": "account_id",
"through": "widget_to_account",
"keyThrough": "widget_id"
}
}
This is not made super clear, and is even stated incorrectly in the LoopBack API docs. I wish they'd stop this "auto-naming" paradigm they've been pushing. Looking at LoopBack SO and the wider community, it has generally caused so much pain, with models being named incorrectly and keys like this being set to totally arbitrary names -.-
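For completeness, here is a sketch of how the rest of the wiring could look with the same model names, following the HasManyThrough guide linked above. In widget.json, the reverse relation would be:
"relations": {
  "accounts": {
    "type": "hasMany",
    "model": "account",
    "foreignKey": "widget_id",
    "through": "widget_to_account",
    "keyThrough": "account_id"
  }
}
and widget_to_account.json would carry the two belongsTo relations that the through model needs:
"relations": {
  "account": {
    "type": "belongsTo",
    "model": "account",
    "foreignKey": "account_id"
  },
  "widget": {
    "type": "belongsTo",
    "model": "widget",
    "foreignKey": "widget_id"
  }
}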