Create and query mango index on unknown object key in CouchDB - couchdb

I have the following object:
{
"roleAttribution": {
"15497490976600-51042": {
"teams": [
"e5abb1e962e11a84ff0e41e99103cd90"
],
"persons": [
"15124323582330-17269"
]
}
},
"type": "link",
}
And need to index/query the teams array. The problem is the roleAttribution keys are unpredictable.
Is there a way to index and query all possible keys of the object up to the teams array?

At this point, CouchDB does not support a good way to just index arrays. (https://issues.apache.org/jira/browse/COUCHDB-2867). You would need to create a view for that. If you would like to query documents based on the values of team array, you would need to iterate over the array in view map function and emit all the values there. More info about the views here http://guide.couchdb.org/draft/views.html

Related

Query CosmosDB when document contains Dictionary

I have a problem with querying CosmosDB document which contains a dictionary. This is an example document:
{
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": [
"f32d80d9-e93a-687e-97f5-676516649420",
"6a5eb9fa-c961-93a5-38cc-ecd74ada13ac",
"c90e9986-5aea-b552-e532-cd64a250ad10",
"7d4bfdca-547a-949b-ccb3-bbf0d6e5d727",
"fba51bfe-6a5e-7f25-e58a-7b0ced59b5d8",
"f2caac36-3590-020f-ebb7-5ccd04b4412c",
"1b446af7-ba74-3564-7237-05024c816a02",
"7ef3d931-131e-a639-10d4-f4dd5db834ca"
]
},
"id": "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624"
}
I need to get all documents where provided guid is in the list, so in the dictionary value (I don't know dictionary key). I found an information somewhere here that it is not possible to iterate through keys in dictionary in CosmosDB (maybe it has changed since that time but I din't find any information in documentation), but maybe someone will have some idea. I cannot change form of the document.
I tried to do it in Linq, but I didn't get any results.
var query = _documentClient
.CreateDocumentQuery<Dto>(DocumentCollectionUri())
.Where(d => d.SiteAndDevices.Any(x => x.Value.Contains("f32d80d9-e93a-687e-97f5-676516649420")))
.AsDocumentQuery();
Not sure of the Linq query, but with SQL, you'd need something like this:
SELECT * FROM c
where array_contains(c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a'],"f32d80d9-e93a-687e-97f5-676516649420")
This is a strange document format though, as you've named your key with an id:
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": ["..."]
}
Your key is "4cf0af44-6233-402a-b33a-e7e35dbbee6a", which forces you to use a different syntax to reference it:
c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a']
You'd save yourself a lot of trouble refactoring this to something like:
{
"id": "dictionary1",
"siteAndDevices": {
"deviceId": "4cf0af44-6233-402a-b33a-e7e35dbbee6a",
"deviceValues": ["..."]
}
}
You can refactor further, such as using an array to contain multiple device id + value combos.

Is it possible to index all attributes without knowing exact structure beforehand in arangodb?

I have a collection with simple documents like this
{
key1:value,
key2:value2,
....
}
I would like to index all keys separately .
But the current arangodb UI only provide a list of attributes seperated by comma, eg. [key1,key2] as input. So I have to define those attributes beforehand
Is there something like * to tell arango to index all attributes.
The standard indexes do not support wildcards to index all attributes (and multiple paths in an index definition will create a combined index, not a union of all keys). But you can create an ArangoSearch View and let it index all attributes:
{
"type": "arangosearch",
"links": {
"coll": {
"analyzers": [
"identity"
],
"includeAllFields": true
}
}
}
Then add some documents into the collection coll:
{"foo": 1}
{"bar": 2}
{"baz": {"nested": 3} }
And finally query the View (here called someView), using the default identity Analyzer:
FOR doc IN someView
SEARCH doc.baz.nested == 3
RETURN doc
As you can see, all attributes including nested ones are indexed by the using the includeAllFields option at the top-level.
More info: https://www.arangodb.com/docs/stable/arangosearch.html

How to retrieve a subset of fields from array of objects using the Couchbase sub-document API?

I am trying to write a lookup which returns an array from a document and skipping some fields:
{
"id": 10000,
"schedule": [
{
"day": 0,
"flight": "AF198",
"utc": "10:13:00"
},
{
"day": 0,
"flight": "AF547",
"utc": "19:14:00"
},
...
]
}
I would like to get all schedule items but only the flight properties. I want to get something like this:
[
{
"flight: "AF198"
},
{
"flight: "AF547"
},
...
]
bucket.lookupIn(key).get("schedule.flight") doesn’t work. I tried "schedule[].flight", "schedule.$.flight" It seems I always need to know the index.
I saw that this is possible with N1QL.
Couchbase - SELECT a subset of fields from array of objects
Do you guys know how to do this with the Subdocument API? Sorry if it is a trivial question. I just cannot find an example on
https://developer.couchbase.com/documentation/server/current/sdk/subdocument-operations.html
Couchbase Subdocument requires the full path, it does not support expansion. In this case it needs to know the index of the array. There are a few other options:
If every path is known, then you could chain all of the subdocument gets. A total of 16 paths can be got at once:
bucket.lookupIn(key).get("schedule[0].flight").get(schedule[1].flight")
Get the parent object and filter on the application side:
bucket.lookupIn(key).get("schedule")
As mentioned in the question, use N1QL.

SELECT single field from embedded array in azure documentDB

I have a documentDB collection that looks like this sample:
{
"data1": "hello",
"data2": [
{
"key": "key1",
"value": "value1"
},
{
"key": "key2",
"value": "value2"
}
}
In reality the data has a lot of other fields and the embedded array has some fields where the data is quite large. I need to query the data and I care about the small "key" field in the data2 array but I do not need the large "value". I am finding returning all the value data is causing performance problems, but if I exclude the array data from the SELECT all together it is fast (so the data size is the issue).
I cannot figure out a way to return only the "key" but exclude the "value" in the embedded array.
I basically want SELECT r.data1, r.data2.key and to have it return as:
{
"data1": "hello",
"data2": [
{
"key": "key1"
},
{
"key": "key2"
}
}
but it doesn't seem possible to SELECT r.data2.key because it is in an array
A JOIN will cause it to return a copy of each document for each "data2" array element, which does not work for me. My only other option would be to migrate the data and put the data I want into its own array so I can select the whole object.
Is this possible some how that I have not been able to figure out?
Mike,
As you have surmised, this is not possible without a custom UDF until DocumentDB supports sub-queries. If you would like to go down that route, see the following answer for an example of how the UDF may have to look:
DocumentDB Sub Query
Good luck!

Mongoose/Mongodb previous and next in embedded document

I'm learning Mongodb/Mongoose/Express and have come across a fairly complex query (relative to my current level of understanding anyway) that I'm not sure how best to approach. I have a collection - to keep it simple let's call it entities - with an embedded actions array:
name: String
actions: [{
name: String
date: Date
}]
What I'd like to do is to return an array of documents with each containing the most recent action (or most recent to a specified date), and the next action (based on the same date).
Would this be possible with one find() query, or would I need to break this down into multiple queries and merge the results to generate one result array? I'm looking for the most efficient route possible.
Provided that your "actions" are inserted with the "most recent" being the last entry in the list, and usually this will be the case unless you are specifically updating items and changing dates, then all you really want to do is "project" the last item of the array. This is what the $slice projection operation is for:
Model.find({},{ "actions": { "$slice": -1 } },function(err,docs) {
// contains an array with the last item
});
If indeed you are "updating" array items and changing dates, but you want to query for the most recent on a regular basis, then you are probably best off keeping the array ordered. You can do this with a few modifiers such as:
Model.update(
{
"_id": ObjectId("541f7bbb699e6dd5a7caf2d6"),
},
{
"$push": { "actions": { "$each": [], "$sort": { "date": 1 } } }
},
function(err,numAffected) {
}
);
Which is actually more of a trick that you can do with the $sort modifier to simply sort the existing array elements without adding or removing. In versions prior to 2.6 you need the $slice "update" modifier in here as well, but this could be set to a value larger than the expected array elements if you did not actually want to restrict the possible size, but that is probably a good idea.
Unfortunately, if you were "updating" via a $set statement, then you cannot do this "sorting" in a single update statement, as MongoDB will not allow both types of operations on the array at once. But if you can live with that, then this is a way to keep the array ordered so the first query form works.
If it just seems too hard to keep an array ordered by date, then you can in fact retrieve the largest value my means of the .aggregate() method. This allows greater manipulation of the documents than is available to basic queries, at a little more cost:
Model.aggregate([
// Unwind the array to de-normalize as documents
{ "$unwind": "$actions" },
// Sort the contents per document _id and inner date
{ "$sort": { "_id": 1, "actions.date": 1 } },
// Group back with the "last" element only
{ "$group": {
"_id": "$_id",
"name": { "$last": "$name" },
"actions": { "$last": "$actions" }
}}
],
function(err,docs) {
})
And that will "pull apart" the array using the $unwind operator, then process with a next stage to $sort the contents by "date". In the $group pipeline stage the "_id" means to use the original document key to "collect" on, and the $last operator picks the field values from the "last" document ( de-normalized ) on that grouping boundary.
So there are various things that you can do, but of course the best way is to keep your array ordered and use the basic projection operators to simply get the last item in the list.

Resources