I have a question about using $lookup and filtering an array of objects in MongoDB.
I have a Person document with this structure:
{
"_id": "5cc3366c22c3767a2b114c6b",
"flags": [
"5cc30210fada5d7820d03aaf",
"5cc2c3924a94a575adbdc56a"
],
"key": "Animal",
"name": "name1",
"description": "description1",
"endpoints": [
{
"isEnabled": true,
"publishUrl": "megaUrl",
"env": "5cc1a8911b19026fd193506b"
},
{
"isEnabled": true,
"publishUrl": "megaUrl",
"env": "5ccaeef3312acb103730d4c5"
}
]
}
And these documents in the envs collection:
{
"_id" : "5cc1a8911b19026fd193506b",
"name" : "name2",
"key" : "PROD",
"publishUrl" : "url1",
"__v" : 0
}
{
"_id" : "5ccaeef3312acb103730d4c5",
"name" : "name2",
"key" : "PROD",
"publishUrl" : "url1",
"__v" : 0
}
I need to filter the document's endpoints by endpoints.$.env.
So, given accessKeys = ["PROD", "UAT"], the result should contain only the endpoints where env.key === "PROD" || env.key === "UAT".
Expected result:
{
"_id": "5cc3366c22c3767a2b114c6b",
"flags": [
"5cc30210fada5d7820d03aaf",
"5cc2c3924a94a575adbdc56a"
],
"key": "Animal",
"name": "name1",
"description": "description1",
"endpoints": [
{
"isEnabled": true,
"publishUrl": "megaUrl",
"env": {
"_id" : "5cc1a8911b19026fd193506b",
"name" : "name2",
"key" : "PROD",
"publishUrl" : "url1",
"__v" : 0
}
}
]
}
Please help: how can I do that? I know about aggregate, but I can't get it to work.
Try this (accessKeys is the array from your question):
db.persons.aggregate([{
$unwind : "$endpoints"
},{
$lookup :{
from : "envs",
localField : "endpoints.env",
foreignField : "_id",
as : "endpoints.env"
}
},{
$unwind : "$endpoints.env"
},{
$match : {
"endpoints.env.key" : {$in : accessKeys}
}
},{
$group : {
_id : "$_id",
flags : {$first : "$flags"},
key : {$first : "$key"},
name : {$first : "$name"},
description : {$first : "$description"},
endpoints : {$push : "$endpoints"},
}
}])
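To make the effect of these stages easier to follow, here is a rough sketch of the same join-and-filter logic in plain JavaScript, with no database involved. The helper name filterEndpoints and the second env's "DEV" key are assumptions made up for illustration; they are not part of the question's data.

```javascript
// Plain JavaScript sketch of the join-and-filter logic (not MongoDB code).
// `filterEndpoints` is a hypothetical helper name used only for illustration.
function filterEndpoints(person, envs, accessKeys) {
  // Index envs by _id; this plays the role of the $lookup stage.
  const envById = new Map(envs.map((env) => [env._id, env]));
  const endpoints = person.endpoints
    // Replace each env id with the env document (like $lookup + $unwind).
    .map((endpoint) => ({ ...endpoint, env: envById.get(endpoint.env) }))
    // Keep only endpoints whose env key is allowed (like $match with $in).
    .filter((endpoint) => endpoint.env && accessKeys.includes(endpoint.env.key));
  // Reassemble the person with the filtered endpoints (like $group + $push).
  return { ...person, endpoints };
}

// Sample data; the "DEV" key on the second env is an assumption.
const person = {
  _id: "5cc3366c22c3767a2b114c6b",
  key: "Animal",
  endpoints: [
    { isEnabled: true, publishUrl: "megaUrl", env: "5cc1a8911b19026fd193506b" },
    { isEnabled: true, publishUrl: "megaUrl", env: "5ccaeef3312acb103730d4c5" },
  ],
};
const envs = [
  { _id: "5cc1a8911b19026fd193506b", name: "name2", key: "PROD" },
  { _id: "5ccaeef3312acb103730d4c5", name: "name2", key: "DEV" },
];

const result = filterEndpoints(person, envs, ["PROD", "UAT"]);
console.log(JSON.stringify(result, null, 2));
```

One difference worth noting: the aggregation above drops a person entirely when none of its endpoints match, because $match removes every unwound document, while this sketch returns the person with an empty endpoints array.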
I have one collection called "location". All parent and child locations are stored in this collection. Now I want to write a query that returns the locations from parent to child, with each name prefixed by spaces according to its depth.
Collection
businessId: { type: mongoose.Schema.Types.ObjectId, ref: 'admin' },
parentId: { type: mongoose.Schema.Types.ObjectId, ref: 'location' },
name: { type: String },
image: { type: String },
imageManipulation: { type: String },
locationColor: [{ range: { type: String }, color: { type: String } }],
area: {},
settings: {},
status: { type: String, enum: [0, 1], default: 1 },
isChild: { type: String, enum: [0, 1] },
parentPosition: { type: String }
In the above collection, you can see the parentId field. If the location is a child, it has a parentId. If the location is a parent, parentId is null. A parent location can have child locations nested N levels deep.
Collection Data
[{
"_id" : ObjectId("5cee1002a01ad50f5c222982"),
"status" : "1",
"name" : "Ground Floor",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"image" : "1559105538977.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:52:18.999Z"),
"createdAt" : ISODate("2019-05-29T04:52:18.999Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee103ca01ad50f5c222983"),
"status" : "1",
"name" : "Kitchen",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":20,\"y\":178},{\"x\":19,\"y\":75},{\"x\":56,\"y\":71},{\"x\":57,\"y\":52},{\"x\":80,\"y\":18},{\"x\":138,\"y\":17},{\"x\":165,\"y\":52},{\"x\":165,\"y\":94},{\"x\":174,\"y\":96},{\"x\":173,\"y\":179}],\"fill\":\"rgba(178,40,40,0.58)\"}",
"parentId" : ObjectId("5cee1002a01ad50f5c222982"),
"image" : "1559105596975.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:53:16.990Z"),
"createdAt" : ISODate("2019-05-29T04:53:16.990Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee1078a01ad50f5c222984"),
"status" : "1",
"name" : "Cbot",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":20,\"y\":311},{\"x\":17,\"y\":59},{\"x\":84,\"y\":58},{\"x\":88,\"y\":312}],\"fill\":\"rgba(20,205,123,0.67)\"}",
"parentId" : ObjectId("5cee103ca01ad50f5c222983"),
"image" : "1559105656049.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:54:16.070Z"),
"createdAt" : ISODate("2019-05-29T04:54:16.070Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee10c1a01ad50f5c222985"),
"status" : "1",
"name" : "Drower 1",
"settings" : {
"zoom" : "5",
"positionX" : "470",
"positionY" : "70",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":21,\"y\":102},{\"x\":81,\"y\":104},{\"x\":79,\"y\":43},{\"x\":21,\"y\":43}],\"fill\":\"rgba(16,77,193,0.5)\"}",
"parentId" : ObjectId("5cee1078a01ad50f5c222984"),
"image" : "1559105729881.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:55:29.901Z"),
"createdAt" : ISODate("2019-05-29T04:55:29.901Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee110ea01ad50f5c222986"),
"status" : "1",
"name" : "Drawer 2",
"settings" : {
"zoom" : "5",
"positionX" : "484",
"positionY" : "103",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":1,\"coordinates\":{\"x\":23,\"y\":125,\"width\":58,\"height\":56},\"points\":[{\"x\":23,\"y\":125},{\"x\":81,\"y\":181}],\"fill\":\"rgba(117,37,109,0.74)\"}",
"parentId" : ObjectId("5cee1078a01ad50f5c222984"),
"image" : "1559105806551.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:56:46.574Z"),
"createdAt" : ISODate("2019-05-29T04:56:46.574Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee1148a01ad50f5c222987"),
"status" : "1",
"name" : "Drawer 3",
"settings" : {
"zoom" : "5",
"positionX" : "477",
"positionY" : "94",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":22,\"y\":205},{\"x\":20,\"y\":290},{\"x\":84,\"y\":288},{\"x\":85,\"y\":205}],\"fill\":\"rgba(164,108,54,0.57)\"}",
"parentId" : ObjectId("5cee1078a01ad50f5c222984"),
"image" : "1559105864947.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:57:44.972Z"),
"createdAt" : ISODate("2019-05-29T04:57:44.972Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee5e683b9f67a9f501f818"),
"status" : "1",
"name" : "Washroom",
"settings" : {
"zoom" : "5",
"positionX" : "477",
"positionY" : "94",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":22,\"y\":205},{\"x\":20,\"y\":290},{\"x\":84,\"y\":288},{\"x\":85,\"y\":205}],\"fill\":\"rgba(164,108,54,0.57)\"}",
"parentId" : ObjectId("5cee1002a01ad50f5c222982"),
"image" : "1559105864947.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:57:44.972Z"),
"createdAt" : ISODate("2019-05-29T04:57:44.972Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee5f593b9f67a9f501fa01"),
"status" : "1",
"name" : "Third Floor",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"image" : "1559105538977123.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:52:18.999Z"),
"createdAt" : ISODate("2019-05-29T04:52:18.999Z"),
"__v" : 0
}]
Expected result in JSON
[
{
"_id": "5cee1002a01ad50f5c222982",
"name": "Ground Floor"
},
{
"_id": "5cee103ca01ad50f5c222983",
"name": " Kitchen"
},
{
"_id": "5cee1078a01ad50f5c222984",
"name": " Cbot"
},
{
"_id": "5cee110ea01ad50f5c222986",
"name": " Drawer 2"
},
{
"_id": "5cee1148a01ad50f5c222987",
"name": " Drawer 3"
},
{
"_id": "5cee10c1a01ad50f5c222985",
"name": " Drower 1"
},
{
"_id": "5cee5e683b9f67a9f501f818",
"name": " Washroom"
},
{
"_id": "5cee5f593b9f67a9f501fa01",
"name": "Third Floor"
}
]
I do not think you should let MongoDB take care of name formatting. So my solution only finds out how many spaces each name needs in front of it, so that JS can deal with the formatting. This is the query:
db.collection.aggregate([
{
$graphLookup: {
from: "collection",
startWith: "$parentId",
connectFromField: "parentId",
connectToField: "_id",
as: "hierarchy"
}
},
{
$project: {
"_id": 1,
"name": 1,
"hierarchy_size": { $size: "$hierarchy" }
}
}
]);
With $graphLookup, the database starts from each document's parentId and recursively follows parentId (connectFromField) to _id (connectToField), collecting every ancestor it finds into hierarchy. From that you only need the depth of each location, which is the number of ancestors, so I computed hierarchy_size. This is the output:
/* 1 */
{
"_id" : ObjectId("5cee1002a01ad50f5c222982"),
"name" : "Ground Floor",
"hierarchy_size" : 0
}
/* 2 */
{
"_id" : ObjectId("5cee103ca01ad50f5c222983"),
"name" : "Kitchen",
"hierarchy_size" : 1
}
/* 3 */
{
"_id" : ObjectId("5cee1078a01ad50f5c222984"),
"name" : "Cbot",
"hierarchy_size" : 2
}
/* 4 */
{
"_id" : ObjectId("5cee10c1a01ad50f5c222985"),
"name" : "Drower 1",
"hierarchy_size" : 3
}
/* 5 */
{
"_id" : ObjectId("5cee110ea01ad50f5c222986"),
"name" : "Drawer 2",
"hierarchy_size" : 3
}
/* 6 */
{
"_id" : ObjectId("5cee1148a01ad50f5c222987"),
"name" : "Drawer 3",
"hierarchy_size" : 3
}
/* 7 */
{
"_id" : ObjectId("5cee5e683b9f67a9f501f818"),
"name" : "Washroom",
"hierarchy_size" : 1
}
/* 8 */
{
"_id" : ObjectId("5cee5f593b9f67a9f501fa01"),
"name" : "Third Floor",
"hierarchy_size" : 0
}
The only problem here might be query performance, but that depends on how much data you need to process. Also consider the memory limit: the $graphLookup stage has to stay within 100 MB.
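The formatting step on the JS side could then look something like this. It is only a sketch: indentNames is a hypothetical helper name, the rows array copies the first three documents of the output above, and one space per level is an assumption you can change.

```javascript
// `indentNames` is a hypothetical helper; `rows` mimics the aggregation output.
// Each name is prefixed with one indent unit per ancestor (hierarchy_size).
function indentNames(rows, indent = " ") {
  return rows.map(({ _id, name, hierarchy_size }) => ({
    _id,
    name: indent.repeat(hierarchy_size) + name,
  }));
}

const rows = [
  { _id: "5cee1002a01ad50f5c222982", name: "Ground Floor", hierarchy_size: 0 },
  { _id: "5cee103ca01ad50f5c222983", name: "Kitchen", hierarchy_size: 1 },
  { _id: "5cee1078a01ad50f5c222984", name: "Cbot", hierarchy_size: 2 },
];

const formatted = indentNames(rows);
console.log(formatted);
```

Note that this only handles the indentation. The query does not guarantee that children come right after their parent, so if you need the tree order from your expected result you would still have to sort the documents yourself.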
The following query is taking around 20 seconds to execute:
FOR p IN PATHS(locations, connections, "outbound", { maxLength: 1 }) FILTER p.source._key == "26094" RETURN p.vertices[*].name
I believe this is a simple query (and the database is not that big), so it should execute fairly quickly. I must be doing something wrong. Here is the query result:
==> [object ArangoQueryCursor - count: 286, hasMore: false]
The locations (vertices) collection has 23753 documents, and the connections (edges) collection has 123414 documents.
I tried to filter by _id as well but the performance is somewhat the same.
Is there anything I could do to get a better performance?
Here is the query's .explain() report:
{
"plan" : {
"nodes" : [
{
"type" : "SingletonNode",
"dependencies" : [ ],
"id" : 1,
"estimatedCost" : 1,
"estimatedNrItems" : 1
},
{
"type" : "CalculationNode",
"dependencies" : [
1
],
"id" : 2,
"estimatedCost" : 2,
"estimatedNrItems" : 1,
"expression" : {
"type" : "function call",
"name" : "PATHS",
"subNodes" : [
{
"type" : "array",
"subNodes" : [
{
"type" : "collection",
"name" : "locations"
},
{
"type" : "collection",
"name" : "connections"
},
{
"type" : "value",
"value" : "outbound"
},
{
"type" : "object",
"subNodes" : [
{
"type" : "object element",
"name" : "maxLength",
"subNodes" : [
{
"type" : "value",
"value" : 1
}
]
}
]
}
]
}
]
},
"outVariable" : {
"id" : 2,
"name" : "2"
},
"canThrow" : true
},
{
"type" : "EnumerateListNode",
"dependencies" : [
2
],
"id" : 3,
"estimatedCost" : 102,
"estimatedNrItems" : 100,
"inVariable" : {
"id" : 2,
"name" : "2"
},
"outVariable" : {
"id" : 0,
"name" : "p"
}
},
{
"type" : "CalculationNode",
"dependencies" : [
3
],
"id" : 4,
"estimatedCost" : 202,
"estimatedNrItems" : 100,
"expression" : {
"type" : "compare ==",
"subNodes" : [
{
"type" : "attribute access",
"name" : "_key",
"subNodes" : [
{
"type" : "attribute access",
"name" : "source",
"subNodes" : [
{
"type" : "reference",
"name" : "p",
"id" : 0
}
]
}
]
},
{
"type" : "value",
"value" : "26094"
}
]
},
"outVariable" : {
"id" : 3,
"name" : "3"
},
"canThrow" : false
},
{
"type" : "FilterNode",
"dependencies" : [
4
],
"id" : 5,
"estimatedCost" : 302,
"estimatedNrItems" : 100,
"inVariable" : {
"id" : 3,
"name" : "3"
}
},
{
"type" : "CalculationNode",
"dependencies" : [
5
],
"id" : 6,
"estimatedCost" : 402,
"estimatedNrItems" : 100,
"expression" : {
"type" : "expand",
"subNodes" : [
{
"type" : "iterator",
"subNodes" : [
{
"type" : "variable",
"name" : "1_",
"id" : 1
},
{
"type" : "attribute access",
"name" : "vertices",
"subNodes" : [
{
"type" : "reference",
"name" : "p",
"id" : 0
}
]
}
]
},
{
"type" : "attribute access",
"name" : "name",
"subNodes" : [
{
"type" : "reference",
"name" : "1_",
"id" : 1
}
]
}
]
},
"outVariable" : {
"id" : 4,
"name" : "4"
},
"canThrow" : false
},
{
"type" : "ReturnNode",
"dependencies" : [
6
],
"id" : 7,
"estimatedCost" : 502,
"estimatedNrItems" : 100,
"inVariable" : {
"id" : 4,
"name" : "4"
}
}
],
"rules" : [
"move-calculations-up",
"move-filters-up",
"move-calculations-up-2",
"move-filters-up-2"
],
"collections" : [
{
"name" : "connections",
"type" : "read"
},
{
"name" : "locations",
"type" : "read"
}
],
"variables" : [
{
"id" : 0,
"name" : "p"
},
{
"id" : 1,
"name" : "1_"
},
{
"id" : 2,
"name" : "2"
},
{
"id" : 3,
"name" : "3"
},
{
"id" : 4,
"name" : "4"
}
],
"estimatedCost" : 502,
"estimatedNrItems" : 100
},
"warnings" : [ ],
"stats" : {
"rulesExecuted" : 21,
"rulesSkipped" : 0,
"plansCreated" : 1
}
}
PATHS() will build all paths of the graph and then post-filter the results using the FILTER on the _key attribute. This may create a huge result set first (for all paths) before filtering out all non-matches.
If all that's required is to find the connected vertices at depth 1, I think it will be more efficient to use one of the following:
Querying using TRAVERSAL:
This is more efficient because it will not build all paths in the graph, only those starting at the specified start vertex:
FOR p IN TRAVERSAL(locations, connections, "locations/26094", "outbound", { minDepth: 1, maxDepth: 1, paths: true })
RETURN p.path.vertices[*].name
Querying direct neighbors using NEIGHBORS:
This may be even slightly more efficient because it will construct a smaller intermediate result.
Additionally, it won't return the start vertex (26094), only the vertices directly connected to it:
FOR p IN NEIGHBORS(locations, connections, "26094", "outbound")
RETURN p.vertex.name
Querying the edges directly (not using graph functions):
Finally, you can query the edge collection directly.
Again, this won't return the start vertex (26094), only the vertices directly connected to it:
FOR edge IN connections
FILTER edge._from == "locations/26094"
FOR vertex IN locations
FILTER vertex._id == edge._to
RETURN vertex.name
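To illustrate why the direct edge query does so little work, here is a small plain JavaScript sketch of the same idea: filter the edges by _from first, then look up only the matching target vertices. It is not ArangoDB code; the helper name directNeighborNames and the sample vertices and edges are made up for the example.

```javascript
// Plain JavaScript sketch (not AQL); `directNeighborNames` is a hypothetical helper.
function directNeighborNames(vertices, edges, startId) {
  // Index vertices by _id so each target lookup is cheap.
  const nameById = new Map(vertices.map((v) => [v._id, v.name]));
  return edges
    // Only edges leaving the start vertex are considered (FILTER edge._from == ...).
    .filter((edge) => edge._from === startId)
    // Resolve each target vertex to its name (FILTER vertex._id == edge._to).
    .map((edge) => nameById.get(edge._to));
}

// Made-up sample graph: 26094 -> 1, 26094 -> 2, 1 -> 3.
const vertices = [
  { _id: "locations/26094", name: "Start" },
  { _id: "locations/1", name: "A" },
  { _id: "locations/2", name: "B" },
  { _id: "locations/3", name: "C" },
];
const edges = [
  { _from: "locations/26094", _to: "locations/1" },
  { _from: "locations/26094", _to: "locations/2" },
  { _from: "locations/1", _to: "locations/3" },
];

const neighbors = directNeighborNames(vertices, edges, "locations/26094");
console.log(neighbors);
```

The edge from locations/1 to locations/3 is never followed, which is the point: nothing beyond the start vertex's own outgoing edges is touched, whereas building every path first and filtering afterwards has to visit the whole graph.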