How to correctly query parent & child documents in SOLR? - search

Index Structure:
{
"documents": [
{
"doc_type": [
"dataset"
],
"title": [
"document_title"
],
"id": [
1234
],
"_childDocuments_": [
{
"parent_id": [
1234
],
"doc_type": [
"column"
],
"id": [
789
],
"attr_nm": [
"child_field_value"
],
},
]
}
]
}
Whenever a search/query is performed I always want to return the parent document. E.g. If I search for 'child_field_value' I want to get the parent document ad the child documents are never seen by the user.
I can get this working for parents. E.g. if I search for 'document_title' I can get the structure above using the following setup:
q=document_title
fl=*,[child parentFilter=doc_type:dataset]
qf=title^200 attr_nm
If I try this using the query keyword "child_field_value" I get a strange structure like:
"response": {
"numFound":2,
"start":0,
"docs":[
{
"parent_id":"1234",
"doc_type":"column",
"id":"123",
"attr_nm":"child_field_value",
},
{
"parent_id":"1234",
"doc_type":"column",
"id":"456",
"attr_nm":"child_field_value",
"_childDocuments_":[{
"parent_id":"1234",
"doc_type":"column",
"id":"124",
"attr_nm":"child_field_value",
}]
}
]
}
How can I ensure always get the desired results even if searching a child field for any value?

Related

Available Fields for Sharepoint Search Query on Microsoft Graph API

I'm using the Graph API to make a global search in my sharepoint website, and I need to retrieve some specific fields. I didn't find any documentation that specifies the available fields that I can use for the fields property on my payload, only a documentation for specific document library search.
I have to use the global search because my search needs to access all the document libraries on my sharepoint web site.
The field that I wanted to get from the request is the version of the document in the list. I could add this field in sharepoint, and my view is displaying the version values, but the request does not take this value. I'm using this request below:
Endpoint: https://graph.microsoft.com/v1.0/search/query
Payload:
"requests": [
{
"entityTypes": [
"listItem"
],
"query": {
"queryString": ""
},
"fields": [
"title",
"_UIVersionString"
]
}
]
}
Response:
{
"value": [
{
"searchTerms": [],
"hitsContainers": [
{
"hits": [
{
"hitId": "83C63693-C621-4CFE-B4F7-A36B68AEB421",
"rank": 1,
"summary": "...",
"resource": {
"#odata.type": "#microsoft.graph.listItem",
"fields": {
"title": "Calc.22090615231879"
}
}
},
],
"total": 1,
"moreResultsAvailable": false
}
]
}
],
"#odata.context": "https://graph.microsoft.com/v1.0/$metadata#Collection(microsoft.graph.searchResponse)"
}
The name of the field I'm using in the payload that corresponds to the version is _UIVersionString, where I got it from the specific list search request, using the https://graph.microsoft.com/v1.0/sites/{site-id}/lists/{list-id}/items?$expand=fields endpoint. But sadly, the version is not appearing on my search result.
Is there some documentation I could use to see a list of available fields for this request? I'm trying to find it in MS GraphAPI documentation, but it looks be a big real encyclopedia.
Do you know the name of the field that corresponds to the version?
Thanks a lot!
Other information:
Sharepoint Version: Sharepoint Web (Online)
Type of the lists: Document Library
Lists version configuration:
- Require content approval for submitted items?: No
- Create a version each time you edit a file in this document library?: Create Major Versions
- Require documents to be checked out before they can be edited?: No
Since Graph is using there managed properties from sharepoint you can try "UIVersionStringOWSTEXT".
{
"requests": [
{
"entityTypes": [
"listItem"
],
"query": {
"queryString": "test"
},
"fields": [
"title",
"UIVersionStringOWSTEXT"
]
}
]
}
The results look like
{
"value": [
{
"searchTerms": [
"test"
],
"hitsContainers": [
{
"hits": [
{
"hitId": "GUID",
"rank": 1,
"summary": "test",
"resource": {
"#odata.type": "#microsoft.graph.listItem",
"fields": {
"title": "test",
"uiVersionStringOWSTEXT": "1.0"
}
}
}
],
"total": 1,
"moreResultsAvailable": false
}
]
}
],
"#odata.context": "https://graph.microsoft.com/v1.0/$metadata#Collection(microsoft.graph.searchResponse)"
}
In my tenant this property was already created. I guess it is by default. If your search schema does not have this managed property you can create it and name it like you wish. Map it to the crawled property "ows_q_TEXT__UIVersionString" and make sure your managed property is set to "retrievable".

Arango DB Filter query for print array of value

Given the following document structure:
{
"name": [
{
"use": "official",
"family": "Chalmers",
"given": [
"Peter",
"James"
]
},
{
"use": "usual",
"given": [
"Jim"
]
},
{
"use": "maiden",
"family": "Windsor",
"given": [
"Peter",
"James"
]
}
]
}
Query:
FOR client IN Patient FILTER client.name[*].use=='official' RETURN client.name[*].given
I have telecom and name array.
I want to query to compare if name[*].use=='official' then print corresponding give array.
Expected result:
"given": [
"Peter",
"James"
]
client.name[*].use is an array, so you need to use an array operator. It can be either of the following:
'string' in doc.attribute
doc.attribute ANY == 'string'
doc.attribute ANY IN ['string']
To return just the given names from the 'official' array, you can use a subquery:
RETURN { given:
FIRST(FOR name IN client.name FILTER name.use == 'official' LIMIT 1 RETURN name.given)
}
Alternatively, you can use an inline expression:
FOR client IN Patient
FILTER 'official' IN client.name[*].use
RETURN { given:
FIRST(client.name[* FILTER CURRENT.use == 'official' LIMIT 1 RETURN CURRENT.given])
}
Result:
[
{
"given": [
"Peter",
"James"
]
}
]
In your original post, the example document and query didn't match, but assuming the following structure:
{
"telecom": [
{
"use": "official",
"value": "+1 (03) 5555 6473 82"
},
{
"use": "mobile",
"value": "+1 (252) 5555 910 920 3"
}
],
"name": [
{
"use": "official",
"family": "Chalmers",
"given": [
"Peter",
"James"
]
},
{
"use": "usual",
"given": [
"Jim"
]
},
{
"use": "maiden",
"family": "Windsor",
"given": [
"Peter",
"James"
]
}
]
}
… here is a possible query:
FOR client IN Patient
FILTER LENGTH(client.telecom[* FILTER
CONTAINS(CURRENT.value, "(03) 5555 6473") AND
CURRENT.use == 'official']
)
RETURN {
given: client.name[* FILTER CURRENT.use == 'official' RETURN CURRENT.given]
}
Note that client.telecom[*].value LIKE "..." causes the array of phone numbers to be cast to a string "[\"+1 (03) 5555 6473 82\",\"+1 (252) 5555 910 920 3\"]" against which the LIKE operation is run - this kind of works, but it's not ideal.
CONTAINS() is also faster than LIKE with % wildcards on both sides.
It would be possible that there are multiple 'official' elements, which might require an extra level of array nesting. Above query produces:
[
{
"given": [
[
"Peter",
"James"
]
]
}
]
If you know that there is only one element or restrict it to one element explicitly then you can get rid of one of the wrapping square brackets with FIRST() or FLATTEN().

Can I index an array in a composite index in Azure Cosmos DB?

I have a problem indexing an array in Azure Cosmos DB
I am trying to save this indexing policy via the portal
{
"indexingMode": "consistent",
"automatic": true,
"includedPaths": [
{
"path": "/*"
}
],
"excludedPaths": [
{
"path": "/\"_etag\"/?"
}
],
"compositeIndexes": [
[
{
"path": "/DeviceId",
"order": "ascending"
},
{
"path": "/TimeStamp",
"order": "ascending"
},
{
"path": "/Items/[]/Name/?",
"order": "ascending"
},
{
"path": "/Items/[]/DoubleValue/?",
"order": "ascending"
}
]
]
}
I get the error "Failed to update container DeviceEvents:
Message: {"code":"BadRequest","message":"Message: {"Errors":["The indexing path '\/Items\/[]\/Name\/?' could not be accepted, failed near position '8'."
This seems to be the array [] syntax that is giving an error.
On a side note I am not sure what I am doing makes sense at all but I have a query that looks like this
SELECT SUM(de0["DoubleValue"])
FROM root JOIN de0 IN root["Items"]
WHERE root["ApplicationId"] = 57 AND root["DeviceId"] = 126 AND root["TimeStamp"] >= "2021-02-21T17:55:29.7389397Z" AND de0["Name"] = "Use Case"
Where ApplicationId is the partition key and the item saved looks like this
{
"id": "59ab9323-26ca-436f-8d29-e1ddd826f025",
"DeviceId": 3,
"ApplicationId": 3,
"RawData": "640F7A000A00E30142000000",
"TimeStamp": "2021-02-20T18:36:52.833174Z",
"Items": [
{
"Name": "Battery Status",
"StringValue": "Full",
"DoubleValue": null
},
{
"Name": "Use Case",
"StringValue": null,
"DoubleValue": 12
},
{
"Name": "Battery Voltage",
"StringValue": null,
"DoubleValue": 3.962
},
{
"Name": "Rain Gauge Count",
"StringValue": null,
"DoubleValue": 10
}
],
"_rid": "CgdVAO7B0DNkAAAAAAAAAA==",
"_self": "dbs/CgdVAA==/colls/CgdVAO7B0DM=/docs/CgdVAO7B0DNkAAAAAAAAAA==/",
"_etag": "\"61008771-0000-0d00-0000-603156c50000\"",
"_attachments": "attachments/",
"_ts": 1613846213
}
I need to aggregate on some of these items in the array like say get MAX on temperature or something like this (using Use Case for test although it doesn't make sense). I reasoned that if all the data in the query is in a single composite index the database would be able to do the aggregation without reading the documents themselves. However I can't seem to add a composite index containing an array at all.
Yes, composite index can't contain an array path. It should be a scalar value.
Unlike with included or excluded paths, you can't create a path with
the /* wildcard. Every composite path has an implicit /? at the end of
the path that you don't need to specify. Composite paths lead to a
scalar value and this is the only value that is included in the
composite index.
Reference:https://learn.microsoft.com/en-us/azure/cosmos-db/index-policy#composite-indexes

how to query nested array of objects in mongodb?

i am trying to query nested array of objects in mongodb from node js, tried all the solutions but no luck. can anyone please help this on priority?
I have tried following :
{
"name": "Science",
"chapters": [
{
"name": "ScienceChap1",
"tests": [
{
"name": "ScienceChap1Test1",
"id": 1,
"marks": 10,
"duration": 30,
"questions": [
{
"question": "What is the capital city of New Mexico?",
"type": "mcq",
"choice": [
"Guadalajara",
"Albuquerque",
"Santa Fe",
"Taos"
],
"answer": [
"Santa Fe",
"Taos"
]
},
{
"question": "Who is the author of beowulf?",
"type": "notmcq",
"choice": [
"Mark Twain",
"Shakespeare",
"Abraham Lincoln",
"Newton"
],
"answer": [
"Shakespeare"
]
}
]
},
{
"name": "ScienceChap1test2",
"id": 2,
"marks": 20,
"duration": 30,
"questions": [
{
"question": "What is the capital city of New Mexico?",
"type": "mcq",
"choice": [
"Guadalajara",
"Albuquerque",
"Santa Fe",
"Taos"
],
"answer": [
"Santa Fe",
"Taos"
]
},
{
"question": "Who is the author of beowulf?",
"type": "notmcq",
"choice": [
"Mark Twain",
"Shakespeare",
"Abraham Lincoln",
"Newton"
],
"answer": [
"Shakespeare"
]
}
]
}
]
}
]
}
Here is what I've tried so far but still can't get it to work
db.quiz.find({name:"Science"},{"tests":0,chapters:{$elemMatch:{name:"ScienceCh‌​ap1"}}})
db.quiz.find({ chapters: { $elemMatch: {$elemMatch: { name:"ScienceChap1Test1" } } }})
db.quiz.find({name:"Science"},{chapters:{$elemMatch:{$elemMatch:{name:"Scienc‌​eChap1Test1"}}}}) ({ name:"Science"},{ chapters: { $elemMatch: {$elemMatch: { name:"ScienceChap1Test1" } } }})
Aggregation Framework
You can use the aggregation framework to transform and combine documents in a collection to display to the client. You build a pipeline that processes a stream of documents through several building blocks: filtering, projecting, grouping, sorting, etc.
If you want get the mcq type questions from the test named "ScienceChap1Test1", you would do the following:
db.quiz.aggregate(
//Match the documents by query. Search for science course
{"$match":{"name":"Science"}},
//De-normalize the nested array of chapters.
{"$unwind":"$chapters"},
{"$unwind":"$chapters.tests"},
//Match the document with test name Science Chapter
{"$match":{"chapters.tests.name":"ScienceChap1test2"}},
//Unwind nested questions array
{"$unwind":"$chapters.tests.questions"},
//Match questions of type mcq
{"$match":{"chapters.tests.questions.type":"mcq"}}
).pretty()
The result will be:
{
"_id" : ObjectId("5629eb252e95c020d4a0c5a5"),
"name" : "Science",
"chapters" : {
"name" : "ScienceChap1",
"tests" : {
"name" : "ScienceChap1test2",
"id" : 2,
"marks" : 20,
"duration" : 30,
"questions" : {
"question" : "What is the capital city of New Mexico?",
"type" : "mcq",
"choice" : [
"Guadalajara",
"Albuquerque",
"Santa Fe",
"Taos"
],
"answer" : [
"Santa Fe",
"Taos"
]
}
}
}
}
$elemMatch doesn't work for sub documents. You can use the aggregation framework for "array filtering" by using $unwind.
You can delete each line from the bottom of each command in the aggregation pipeline in the above code to observe the pipelines behavior.
You should try the following queries in the mongodb simple javascript shell.
There could be Two Scenarios.
Scenario One
If you simply want to return the documents that contain certain chapter names or test names for example just one argument in find will do.
For the find method the document you want to be returned is specified by the first argument. You could return documents with the name Science by doing this:
db.quiz.find({name:"Science"})
You could specify criteria to match a single embedded document in an array by using $elemMatch. To find a document that has a chapter with the name ScienceChap1. You could do this:
db.quiz.find({"chapters":{"$elemMatch":{"name":"ScienceChap1"}}})
If you wanted your criteria to be a test name then you could use the dot operator like this:
db.quiz.find({"chapters.tests":{"$elemMatch":{"name":"ScienceChap1Test1"}}})
Scenario Two - Specifying Which Keys to Return
If you want to specify which keys to Return you can pass a second argument to find (or findOne) specifying the keys you want. In your case you can search for the document name and then provide which keys to return like so.
db.quiz.find({name:"Science"},{"chapters":1})
//Would return
{
"_id": ObjectId(...),
"chapters": [
"name": "ScienceChap2",
"tests: [..all object content here..]
}
If you only want to return the marks from each test object you can use the dot operator to do so:
db.quiz.find({name:"Science"},{"chapters.tests.marks":1})
//Would return
{
"_id": ObjectId(...),
"chapters": [
"tests: [
{"marks":10},
{"marks":20}
]
}
If you only want to return the questions from each test:
db.quiz.find({name:"Science"},{"chapters.tests.questions":1})
Test these out. I hope these help.

Marklogic 8 Node.js API - How can I scope a search on a property child of root?

[updated 17:15 on 28/09]
I'm manipulating json data of type:
[
{
"id": 1,
"title": "Sun",
"seeAlso": [
{
"id": 2,
"title": "Rain"
},
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 2,
"title": "Rain",
"seeAlso": [
{
"id": 3,
"title": "Cloud"
}
]
},
{
"id": 3,
"title": "Cloud",
"seeAlso": [
{
"id": 1,
"title": "Sun"
}
]
},
];
After inclusion in the database, a node.js search using
db.documents.query(
q.where(
q.collection('test films'),
q.value('title','Sun')
).withOptions({categories: 'none'})
)
.result( function(results) {
console.log(JSON.stringify(results, null,2));
});
will return both the film titled 'Sun' and the films which have a seeAlso/title property (forgive the xpath syntax) = 'Sun'.
I need to find 1/ films with title = 'Sun' 2/ films with seeAlso/title = 'Sun'.
I tried a container query using q.scope() with no success; I don't find how to scope the root object node (first case) and for the second case,
q.where(q.scope(q.property('seeAlso'), q.value('title','Sun')))
returns as first result an item which matches all text inside the root object node
{
"index": 1,
"uri": "/1.json",
"path": "fn:doc(\"/1.json\")",
"score": 137216,
"confidence": 0.6202662,
"fitness": 0.6701325,
"href": "/v1/documents?uri=%2F1.json&database=Documents",
"mimetype": "application/json",
"format": "json",
"matches": [
{
"path": "fn:doc(\"/1.json\")/object-node()",
"match-text": [
"Sun Rain Cloud"
]
}
]
},
which seems crazy.
Any idea about how doing such searches on denormalized json data?
Laurent:
XPaths on JSON are supported by MarkLogic.
In particular, you might consider setting up a path range index to match /title at the root:
http://docs.marklogic.com/guide/admin/range_index#id_54948
Scoped property matching required either filtering or indexed positions to be accurate. An alternative is to set up another path range index on /seeAlso/title
For the match issue it would be useful to know the MarkLogic version and to see the entire query.
Hoping that helps,

Resources