I have one collection called "location". in this collection all child and parent collection are stores. now I want to create a query who returns me parent to child spaces separated string.
Collection
businessId: { type: mongoose.Schema.Types.ObjectId, ref: 'admin' },
parentId: { type: mongoose.Schema.Types.ObjectId, ref: 'location' },
name: { type: String },
image: { type: String },
imageManipulation: { type: String },
locationColor: [{ range: { type: String }, color: { type: String } }],
area: {},
settings: {},
status: { type: String, enum: [0, 1], default: 1 },
isChild: { type: String, enum: [0, 1] },
parentPosition: { type: String }
In the above collection, you can see parentId field. if the location is a child then it have parentId. if the location is a parent then parentId will null. parent location can N level child location.
Collection Data
[{
"_id" : ObjectId("5cee1002a01ad50f5c222982"),
"status" : "1",
"name" : "Ground Floor",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"image" : "1559105538977.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:52:18.999Z"),
"createdAt" : ISODate("2019-05-29T04:52:18.999Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee103ca01ad50f5c222983"),
"status" : "1",
"name" : "Kitchen",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":20,\"y\":178},{\"x\":19,\"y\":75},{\"x\":56,\"y\":71},{\"x\":57,\"y\":52},{\"x\":80,\"y\":18},{\"x\":138,\"y\":17},{\"x\":165,\"y\":52},{\"x\":165,\"y\":94},{\"x\":174,\"y\":96},{\"x\":173,\"y\":179}],\"fill\":\"rgba(178,40,40,0.58)\"}",
"parentId" : ObjectId("5cee1002a01ad50f5c222982"),
"image" : "1559105596975.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:53:16.990Z"),
"createdAt" : ISODate("2019-05-29T04:53:16.990Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee1078a01ad50f5c222984"),
"status" : "1",
"name" : "Cbot",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":20,\"y\":311},{\"x\":17,\"y\":59},{\"x\":84,\"y\":58},{\"x\":88,\"y\":312}],\"fill\":\"rgba(20,205,123,0.67)\"}",
"parentId" : ObjectId("5cee103ca01ad50f5c222983"),
"image" : "1559105656049.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:54:16.070Z"),
"createdAt" : ISODate("2019-05-29T04:54:16.070Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee10c1a01ad50f5c222985"),
"status" : "1",
"name" : "Drower 1",
"settings" : {
"zoom" : "5",
"positionX" : "470",
"positionY" : "70",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":21,\"y\":102},{\"x\":81,\"y\":104},{\"x\":79,\"y\":43},{\"x\":21,\"y\":43}],\"fill\":\"rgba(16,77,193,0.5)\"}",
"parentId" : ObjectId("5cee1078a01ad50f5c222984"),
"image" : "1559105729881.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:55:29.901Z"),
"createdAt" : ISODate("2019-05-29T04:55:29.901Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee110ea01ad50f5c222986"),
"status" : "1",
"name" : "Drawer 2",
"settings" : {
"zoom" : "5",
"positionX" : "484",
"positionY" : "103",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":1,\"coordinates\":{\"x\":23,\"y\":125,\"width\":58,\"height\":56},\"points\":[{\"x\":23,\"y\":125},{\"x\":81,\"y\":181}],\"fill\":\"rgba(117,37,109,0.74)\"}",
"parentId" : ObjectId("5cee1078a01ad50f5c222984"),
"image" : "1559105806551.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:56:46.574Z"),
"createdAt" : ISODate("2019-05-29T04:56:46.574Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee1148a01ad50f5c222987"),
"status" : "1",
"name" : "Drawer 3",
"settings" : {
"zoom" : "5",
"positionX" : "477",
"positionY" : "94",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":22,\"y\":205},{\"x\":20,\"y\":290},{\"x\":84,\"y\":288},{\"x\":85,\"y\":205}],\"fill\":\"rgba(164,108,54,0.57)\"}",
"parentId" : ObjectId("5cee1078a01ad50f5c222984"),
"image" : "1559105864947.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:57:44.972Z"),
"createdAt" : ISODate("2019-05-29T04:57:44.972Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee5e683b9f67a9f501f818"),
"status" : "1",
"name" : "Washroom",
"settings" : {
"zoom" : "5",
"positionX" : "477",
"positionY" : "94",
"width" : "498",
"height" : "498"
},
"area" : "{\"type\":3,\"points\":[{\"x\":22,\"y\":205},{\"x\":20,\"y\":290},{\"x\":84,\"y\":288},{\"x\":85,\"y\":205}],\"fill\":\"rgba(164,108,54,0.57)\"}",
"parentId" : ObjectId("5cee1002a01ad50f5c222982"),
"image" : "1559105864947.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:57:44.972Z"),
"createdAt" : ISODate("2019-05-29T04:57:44.972Z"),
"__v" : 0
},
{
"_id" : ObjectId("5cee5f593b9f67a9f501fa01"),
"status" : "1",
"name" : "Third Floor",
"settings" : {
"zoom" : "0",
"positionX" : "0",
"positionY" : "0",
"width" : "498",
"height" : "498"
},
"image" : "1559105538977123.jpg",
"businessId" : ObjectId("5cbd61dc3b56b902284ea388"),
"locationColor" : [],
"updatedAt" : ISODate("2019-05-29T04:52:18.999Z"),
"createdAt" : ISODate("2019-05-29T04:52:18.999Z"),
"__v" : 0
}]
Expected result in JSON
[
{
"_id": "5cee1002a01ad50f5c222982",
"name": "Ground Floor"
},
{
"_id": "5cee103ca01ad50f5c222983",
"name": " Kitchen"
},
{
"_id": "5cee1078a01ad50f5c222984",
"name": " Cbot"
},
{
"_id": "5cee110ea01ad50f5c222986",
"name": " Drawer 2"
},
{
"_id": "5cee1148a01ad50f5c222987",
"name": " Drawer 3"
},
{
"_id": "5cee10c1a01ad50f5c222985",
"name": " Drower 1"
},
{
"_id": "5cee5e683b9f67a9f501f818",
"name": " Washroom"
},
{
"_id": "5cee5f593b9f67a9f501fa01",
"name": "Third Floor"
}
]
I do not think you should let mongodb take care of name formatting. So my solution is about finding how many spaces a certain name needs before, so that js can deal with formatting. This is the query:
db.collection.aggregate([
{
$graphLookup: {
from: "collection",
startWith: "$parentId",
connectFromField: "parentId",
connectToField: "_id",
as: "hierarchy"
}
},
{
$project: {
"_id": 1,
"name": 1,
"hierarchy_size": { $size: "$hierarchy" }
}
}
]);
With the $graphLookup, the db is building an in memory graph of edges between connectFromField and connectToField. From the graph you only need the depth of your hierarchy, so I computed hierarchy_size. This is the output:
/* 1 */
{
"_id" : ObjectId("5cee1002a01ad50f5c222982"),
"name" : "Ground Floor",
"hierarchy_size" : 0
}
/* 2 */
{
"_id" : ObjectId("5cee103ca01ad50f5c222983"),
"name" : "Kitchen",
"hierarchy_size" : 1
}
/* 3 */
{
"_id" : ObjectId("5cee1078a01ad50f5c222984"),
"name" : "Cbot",
"hierarchy_size" : 2
}
/* 4 */
{
"_id" : ObjectId("5cee10c1a01ad50f5c222985"),
"name" : "Drower 1",
"hierarchy_size" : 3
}
/* 5 */
{
"_id" : ObjectId("5cee110ea01ad50f5c222986"),
"name" : "Drawer 2",
"hierarchy_size" : 3
}
/* 6 */
{
"_id" : ObjectId("5cee1148a01ad50f5c222987"),
"name" : "Drawer 3",
"hierarchy_size" : 3
}
/* 7 */
{
"_id" : ObjectId("5cee5e683b9f67a9f501f818"),
"name" : "Washroom",
"hierarchy_size" : 1
}
/* 8 */
{
"_id" : ObjectId("5cee5f593b9f67a9f501fa01"),
"name" : "Third Floor",
"hierarchy_size" : 0
}
The only problem here might be query performances, but that depends on how much data you need to process. Also consider the memory limit.
Related
I am having certain issue with querying mongodb using mongoose, but unable to find my required output. Below is the data I am looking for the solution.
I am trying to find with two params here one is location and second is type
{
"_id" : ObjectId("5f04dcf8e123292518a863ae"),
"location" : "Australia",
"name" : "June Bill",
"filters" : [
{
"_id" : ObjectId("5f047efe9fc7ad000f44f990"),
"name" : "Q1",
"type" : "Platinum",
"conditions" : [
{
"And" : [
{
"_id" : "objectid",
"field" : "location",
"value" : "Australia",
"operator" : "equal"
},
{
"_id" : "objectid",
"field" : "name",
"value" : "admin",
"operator" : "equal"
}
]
}
]
},
{
"_id" : ObjectId("5f04d51c0ce40120bbb7dc6e"),
"name" : "Q2",
"type" : "Gold",
"conditions" : [
{
"_id" : ObjectId("5f04d51c0ce40120bbb7dc6f"),
"field" : "ocation",
"value" : "Australia",
"operator" : "equal"
},
{
"_id" : ObjectId("5f04d51c0ce40120bbb7dc70"),
"field" : "start_date",
"value" : "2020-06-01T00 3A00",
"operator" : "equal"
},
]
}
],
"__v" : 0
},
{
"_id" : ObjectId("5f04dcf8e123292518a863ae"),
"location" : "Brazil",
"name" : "June Bill",
"filters" : [
{
"_id" : ObjectId("5f047efe9fc7ad000f44f990"),
"name" : "Q1",
"type" : "Silver",
"conditions" : [
{
"And" : [
{
"_id" : "objectid",
"field" : "location",
"value" : "Australia",
"operator" : "equal"
},
{
"_id" : "objectid",
"field" : "name",
"value" : "admin",
"operator" : "equal"
}
]
}
]
},
{
"_id" : ObjectId("5f04d51c0ce40120bbb7dc6e"),
"name" : "Q2",
"type" : "Gold",
"conditions" : [
{
"_id" : ObjectId("5f04d51c0ce40120bbb7dc6f"),
"field" : "location",
"value" : "Australia",
"operator" : "equal"
},
{
"_id" : ObjectId("5f04d51c0ce40120bbb7dc70"),
"field" : "start_date",
"value" : "2020-06-01T00 3A00",
"operator" : "equal"
},
]
}
],
"__v" : 0
}
here I am trying to find with location and in filters with its type
How Can i do this in mongoose?
I tried
Model.find({'location': 'Brazil', 'filters':{ "$in" : ["Silver"]} });
But it didn't work, Can any one help me achieving actual result?
You can use the . to query the embedded filed or if you want to match multiple fields you can use $elemMatch
db.collection.find({
"location": "Brazil",
"filters.type": {
"$in": [
"Silver"
]
}
},
{
"filters.$": 1
})
MongoDB Playground
If you want to filter out the matched result you can use $ operator in projection
Use this way:
db.collection.find({
"location": "Brazil",
"filters.type": "Silver"
})
MongoDb Playground
Hello This query is running fine for me I want to add other arrays as well like contact and I want separate count for them. If I add contact it add the result in count P.S : I used array name with $sum it start showing me zero count
return Company.aggregate(
{"$unwind":"$agreement"},
{"$unwind":"$contact"},
{"$unwind":"$companySite"},
{ $group: { _id: {
"Organization": "$orgId",
"Company": "$name"
}, "count": { $sum: 1 } } },{
"$project": {
"_id": 0,
"Company": "$_id.Company",
"Organization":"$_id.Organization",
"Count": "$count"
}
})
expected output
"Company": "Multi-Metal Manufacturing",
"Organization": "1",
"AgreementCount": 1,
"ContactCount" : 4
"ABC Count": 5
Sample Document is :
{
"_id" : ObjectId("57c6a97a90c933a2d54117dc"),
"ITBCompanyId" : 1034,
"updatedAt" : ISODate("2016-08-31T12:47:03.679Z"),
"createdAt" : ISODate("2016-08-31T09:55:06.217Z"),
"identifier" : "INL10",
"name" : "Inline Data Systems",
"addressLine1" : "7 Park Place",
"addressLine2" : "Ste D",
"city" : "Swansea",
"orgId" : "1",
"deletedAt" : null,
"_info" : {
"lastUpdated" : ISODate("2016-02-10T21:22:15.000Z"),
"updatedBy" : "Clarissa"
},
"whois" : [],
"configuration" : [],
"contact" : [
{
"email" : "justin#inlinedatasystems.com",
"mobileGuid" : "d8852942-8f5a-406f-a646-a5f8697f7885",
"presence" : null,
"gender" : null,
"title" : null,
"country" : null,
"zip" : null,
"state" : null,
"city" : null,
"addressLine2" : null,
"addressLine1" : null,
"lastName" : "Wilkerson",
"firstName" : "Justin",
"id" : 191,
"_id" : ObjectId("57c6abdfb966968130b1671c"),
"communicationItems" : [],
"customFields" : null,
"company" : {
"name" : "Inline Data Systems",
"id" : "19331"
},
"relationship" : {
"name" : null,
"id" : 0
}
}
],
"agreement" : [
{
"periodType" : null,
"billAmount" : "0",
"billTermsId" : 12,
"billOneTimeFlag" : false,
"billCycleId" : "2",
"expiredDays" : "0"
},
{
"id" : "40",
"name" : "Managed Hosted Server Agreement",
"billAmount" : "0",
"periodType" : null,
"_id" : ObjectId("57c6ab96b966968130b16616"),
"_info" : {
"lastUpdated" : ISODate("2014-10-03T14:25:42.000Z"),
"updatedBy" : "ali "
},
"contact" : {
"id" : "191",
"name" : "Justin Wilkerson"
},
"agreementType" : {
"id" : "33",
"name" : "HDCloud - Hosted Server"
}
},
{
"id" : "41",
"name" : "Managed Backup Agreement",
"parentAgreementId" : "42",
"customerPO" : ""
}
],
"companySite" : [
{
"id" : 1037,
"name" : "Main",
"city" : "Swansea",
"state" : "IL",
"zip" : "62226",
}
]
The following query is taking around 20 seconds to execute:
FOR p IN PATHS(locations, connections, "outbound", { maxLength: 1 }) FILTER p.source._key == "26094" RETURN p.vertices[*].name
I believe this is a simple query (and the database is not that big) and it should execute fairly quick... I must be doing something wrong... Here is the query result:
==> [object ArangoQueryCursor - count: 286, hasMore: false]
The locations (vertices) collection has 23753 documents, and the connections (edges) collection has 123414 documents.
I tried to filter by _id as well but the performance is somewhat the same.
Is there anything I could do to get a better performance?
Here is the query's .explain() report:
{
"plan" : {
"nodes" : [
{
"type" : "SingletonNode",
"dependencies" : [ ],
"id" : 1,
"estimatedCost" : 1,
"estimatedNrItems" : 1
},
{
"type" : "CalculationNode",
"dependencies" : [
1
],
"id" : 2,
"estimatedCost" : 2,
"estimatedNrItems" : 1,
"expression" : {
"type" : "function call",
"name" : "PATHS",
"subNodes" : [
{
"type" : "array",
"subNodes" : [
{
"type" : "collection",
"name" : "locations"
},
{
"type" : "collection",
"name" : "connections"
},
{
"type" : "value",
"value" : "outbound"
},
{
"type" : "object",
"subNodes" : [
{
"type" : "object element",
"name" : "maxLength",
"subNodes" : [
{
"type" : "value",
"value" : 1
}
]
}
]
}
]
}
]
},
"outVariable" : {
"id" : 2,
"name" : "2"
},
"canThrow" : true
},
{
"type" : "EnumerateListNode",
"dependencies" : [
2
],
"id" : 3,
"estimatedCost" : 102,
"estimatedNrItems" : 100,
"inVariable" : {
"id" : 2,
"name" : "2"
},
"outVariable" : {
"id" : 0,
"name" : "p"
}
},
{
"type" : "CalculationNode",
"dependencies" : [
3
],
"id" : 4,
"estimatedCost" : 202,
"estimatedNrItems" : 100,
"expression" : {
"type" : "compare ==",
"subNodes" : [
{
"type" : "attribute access",
"name" : "_key",
"subNodes" : [
{
"type" : "attribute access",
"name" : "source",
"subNodes" : [
{
"type" : "reference",
"name" : "p",
"id" : 0
}
]
}
]
},
{
"type" : "value",
"value" : "26094"
}
]
},
"outVariable" : {
"id" : 3,
"name" : "3"
},
"canThrow" : false
},
{
"type" : "FilterNode",
"dependencies" : [
4
],
"id" : 5,
"estimatedCost" : 302,
"estimatedNrItems" : 100,
"inVariable" : {
"id" : 3,
"name" : "3"
}
},
{
"type" : "CalculationNode",
"dependencies" : [
5
],
"id" : 6,
"estimatedCost" : 402,
"estimatedNrItems" : 100,
"expression" : {
"type" : "expand",
"subNodes" : [
{
"type" : "iterator",
"subNodes" : [
{
"type" : "variable",
"name" : "1_",
"id" : 1
},
{
"type" : "attribute access",
"name" : "vertices",
"subNodes" : [
{
"type" : "reference",
"name" : "p",
"id" : 0
}
]
}
]
},
{
"type" : "attribute access",
"name" : "name",
"subNodes" : [
{
"type" : "reference",
"name" : "1_",
"id" : 1
}
]
}
]
},
"outVariable" : {
"id" : 4,
"name" : "4"
},
"canThrow" : false
},
{
"type" : "ReturnNode",
"dependencies" : [
6
],
"id" : 7,
"estimatedCost" : 502,
"estimatedNrItems" : 100,
"inVariable" : {
"id" : 4,
"name" : "4"
}
}
],
"rules" : [
"move-calculations-up",
"move-filters-up",
"move-calculations-up-2",
"move-filters-up-2"
],
"collections" : [
{
"name" : "connections",
"type" : "read"
},
{
"name" : "locations",
"type" : "read"
}
],
"variables" : [
{
"id" : 0,
"name" : "p"
},
{
"id" : 1,
"name" : "1_"
},
{
"id" : 2,
"name" : "2"
},
{
"id" : 3,
"name" : "3"
},
{
"id" : 4,
"name" : "4"
}
],
"estimatedCost" : 502,
"estimatedNrItems" : 100
},
"warnings" : [ ],
"stats" : {
"rulesExecuted" : 21,
"rulesSkipped" : 0,
"plansCreated" : 1
}
}
PATHS() will build all paths of the graph and then post-filter the results using the FILTER on the _key attribute. This may create a huge result set first (for all paths) before filtering out all non-matches.
If all that's required is to find connected vertices on depth 1, I think it will be more efficient to do something like this:
querying using TRAVERSAL:
This is more efficient because it will build all paths in the graph but only those starting at the specified start vertex:
FOR p IN TRAVERSAL(locations, connections, "1", "outbound", { minDepth: 1, maxDepth: 1, paths: true })
RETURN p.path.vertices[*].name
querying direct neighbors using NEIGHBORS:
This may be slightly more efficient even because it will construct a smaller intermediate result.
Additionally, it won't return the start vertex (26094) but all vertices directly connected to it:
FOR p IN NEIGHBORS(locations, connections, "26094", "outbound")
RETURN p.vertex.name
querying the edges directly (not using graph functions)
Finally you can query the edge collection directly.
Again, this won't return the start vertex (26094) but all vertices directly connected to it:
FOR edge IN connections
FILTER edge._from == "locations/26094"
FOR vertex IN locations
FILTER vertex._id == edge._to
RETURN vertex.name
I observere an enormous runtime difference between those two AQL statements an a DB set with about 20 Mio records:
FOR e IN EAll
FILTER e.lastname == "Kmp" // <-- skip-index
FILTER e.lastpaff != "" // <-- no index
RETURN e
// runs in less than a second
AND
FOR e IN EAll
FILTER e.lastpaff != "" // <-- no index
FILTER e.lastname == "Kmp" // <-- skip-index
RETURN e
// needs about a minute to execute.
In addition to be (or not) indexed, the selectivity of those statements is highly different: the indexedAttribute is highly selective where-as the nonIndexedAttribute only filters 50%.
Is it possible that there is not yet an optimization rule for that? I currently am using ArangoDB 2.4.0.
DETAILS:
There is a SKIP-Index on the indexed Attribute (which seems to be used in the execuation plan 1).
Here are the execuation plan, in which only the order of the filters are changed:
FAST QUERY:
arangosh [Uni]> stmt.explain()
{
"plan" : {
"nodes" : [
{
"type" : "SingletonNode",
"dependencies" : [ ],
"id" : 1,
"estimatedCost" : 1,
"estimatedNrItems" : 1
},
{
"type" : "IndexRangeNode",
"dependencies" : [
1
],
"id" : 8,
"estimatedCost" : 170463.32,
"estimatedNrItems" : 170462,
"database" : "Uni",
"collection" : "EAll",
"outVariable" : {
"id" : 0,
"name" : "i"
},
"ranges" : [
[
{
"variable" : "i",
"attr" : "lastname",
"lowConst" : {
"bound" : "Kmp",
"include" : true,
"isConstant" : true
},
"highConst" : {
"bound" : "Kmp",
"include" : true,
"isConstant" : true
},
"lows" : [ ],
"highs" : [ ],
"valid" : true,
"equality" : true
}
]
],
"index" : {
"type" : "skiplist",
"id" : "13295598550318",
"unique" : false,
"fields" : [
"lastname"
]
},
"reverse" : false
},
{
"type" : "CalculationNode",
"dependencies" : [
8
],
"id" : 5,
"estimatedCost" : 340925.32,
"estimatedNrItems" : 170462,
"expression" : {
"type" : "compare !=",
"subNodes" : [
{
"type" : "attribute access",
"name" : "lastpaff",
"subNodes" : [
{
"type" : "reference",
"name" : "i",
"id" : 0
}
]
},
{
"type" : "value",
"value" : ""
}
]
},
"outVariable" : {
"id" : 2,
"name" : "2"
},
"canThrow" : false
},
{
"type" : "FilterNode",
"dependencies" : [
5
],
"id" : 6,
"estimatedCost" : 511387.32,
"estimatedNrItems" : 170462,
"inVariable" : {
"id" : 2,
"name" : "2"
}
},
{
"type" : "ReturnNode",
"dependencies" : [
6
],
"id" : 7,
"estimatedCost" : 681849.3200000001,
"estimatedNrItems" : 170462,
"inVariable" : {
"id" : 0,
"name" : "i"
}
}
],
"rules" : [
"move-calculations-up",
"move-filters-up",
"move-calculations-up-2",
"move-filters-up-2",
"use-index-range",
"remove-filter-covered-by-index"
],
"collections" : [
{
"name" : "EAll",
"type" : "read"
}
],
"variables" : [
{
"id" : 0,
"name" : "i"
},
{
"id" : 1,
"name" : "1"
},
{
"id" : 2,
"name" : "2"
}
],
"estimatedCost" : 681849.3200000001,
"estimatedNrItems" : 170462
},
"warnings" : [ ],
"stats" : {
"rulesExecuted" : 19,
"rulesSkipped" : 0,
"plansCreated" : 1
}
}
SLOW Query:
arangosh [Uni]> stmt.explain()
{
"plan" : {
"nodes" : [
{
"type" : "SingletonNode",
"dependencies" : [ ],
"id" : 1,
"estimatedCost" : 1,
"estimatedNrItems" : 1
},
{
"type" : "EnumerateCollectionNode",
"dependencies" : [
1
],
"id" : 2,
"estimatedCost" : 17046233,
"estimatedNrItems" : 17046232,
"database" : "Uni",
"collection" : "EAll",
"outVariable" : {
"id" : 0,
"name" : "i"
},
"random" : false
},
{
"type" : "CalculationNode",
"dependencies" : [
2
],
"id" : 3,
"estimatedCost" : 34092465,
"estimatedNrItems" : 17046232,
"expression" : {
"type" : "compare !=",
"subNodes" : [
{
"type" : "attribute access",
"name" : "lastpaff",
"subNodes" : [
{
"type" : "reference",
"name" : "i",
"id" : 0
}
]
},
{
"type" : "value",
"value" : ""
}
]
},
"outVariable" : {
"id" : 1,
"name" : "1"
},
"canThrow" : false
},
{
"type" : "FilterNode",
"dependencies" : [
3
],
"id" : 4,
"estimatedCost" : 51138697,
"estimatedNrItems" : 17046232,
"inVariable" : {
"id" : 1,
"name" : "1"
}
},
{
"type" : "CalculationNode",
"dependencies" : [
4
],
"id" : 5,
"estimatedCost" : 68184929,
"estimatedNrItems" : 17046232,
"expression" : {
"type" : "compare ==",
"subNodes" : [
{
"type" : "attribute access",
"name" : "lastname",
"subNodes" : [
{
"type" : "reference",
"name" : "i",
"id" : 0
}
]
},
{
"type" : "value",
"value" : "Kmp"
}
]
},
"outVariable" : {
"id" : 2,
"name" : "2"
},
"canThrow" : false
},
{
"type" : "FilterNode",
"dependencies" : [
5
],
"id" : 6,
"estimatedCost" : 85231161,
"estimatedNrItems" : 17046232,
"inVariable" : {
"id" : 2,
"name" : "2"
}
},
{
"type" : "ReturnNode",
"dependencies" : [
6
],
"id" : 7,
"estimatedCost" : 102277393,
"estimatedNrItems" : 17046232,
"inVariable" : {
"id" : 0,
"name" : "i"
}
}
],
"rules" : [
"move-calculations-up",
"move-filters-up",
"move-calculations-up-2",
"move-filters-up-2"
],
"collections" : [
{
"name" : "EAll",
"type" : "read"
}
],
"variables" : [
{
"id" : 0,
"name" : "i"
},
{
"id" : 1,
"name" : "1"
},
{
"id" : 2,
"name" : "2"
}
],
"estimatedCost" : 102277393,
"estimatedNrItems" : 17046232
},
"warnings" : [ ],
"stats" : {
"rulesExecuted" : 19,
"rulesSkipped" : 0,
"plansCreated" : 1
}
}
Indeed, conditions like the following disabled the usage of indexes even though an index could be used:
FILTER doc.indexedAttribute != ... FILTER doc.indexedAttribute == ...
Interestingly an index is used when the two conditions are put into the same FILTER condition and combined with &&:
FILTER doc.indexedAttribute != ... && doc.indexedAttribute == ...
Though these two statements are equivalent, they trigger a slightly different code path. The former will be AND-combining two existing FILTER ranges, the latter one will produce a range from a single FILTER. The case of AND-combination for the FILTER ranges was overly defensive and rejected both sides even if only a single side (in this case the one with the non-equality operator) could not be used for an index scan.
This has been fixed in 2.4, and the fix will be contained in 2.4.2. A workaround for now is to combine the two FILTER statements in a single one.
I have set of baskets stored in mongodb like:
/* 47 */
{
"_id" : ObjectId("535ff14c2e441acf44708ec7"),
"tip" : "0",
"basketTransactionCash" : "2204",
"basketTransactionCard" : "0",
"completed" : ISODate("2014-04-07T14:35:17.000Z"),
"consumerId" : null,
"storeId" : 1,
"basketId" : 210048,
"basketProduct" : [
{
"_id" : ObjectId("535feffa2e441acf446facb7"),
"name" : "Vanilla Spice Hot Chocolate",
"productId" : 23,
"basketProductInstanceId" : 838392,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "633",
"vatPercentage" : "0.2"
}
],
"created" : ISODate("2014-04-07T14:35:42.000Z"),
"__v" : 0
}
/* 48 */
{
"_id" : ObjectId("535ff14c2e441acf44708ede"),
"tip" : "0",
"basketTransactionCash" : "230",
"basketTransactionCard" : "0",
"completed" : ISODate("2014-04-07T14:41:51.000Z"),
"consumerId" : 1,
"storeId" : 1,
"basketId" : 210072,
"basketProduct" : [
{
"_id" : ObjectId("535feffa2e441acf446facf3"),
"name" : "Melon",
"productId" : 4544,
"basketProductInstanceId" : 838430,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "200",
"vatPercentage" : "0"
},
{
"_id" : ObjectId("535feffa2e441acf446facf4"),
"name" : "30p",
"productId" : 8496,
"basketProductInstanceId" : 838431,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "30",
"vatPercentage" : "0"
}
],
"created" : ISODate("2014-04-07T14:42:16.000Z"),
"__v" : 0
}
/* 49 */
{
"_id" : ObjectId("535ff14c2e441acf44708ee2"),
"tip" : "0",
"basketTransactionCash" : "2204",
"basketTransactionCard" : "0",
"completed" : ISODate("2014-04-07T14:41:54.000Z"),
"consumerId" : 2,
"storeId" : 1,
"basketId" : 210076,
"basketProduct" : [
{
"_id" : ObjectId("535feffb2e441acf446fad02"),
"name" : "Creamy Natural Yoghurt",
"productId" : 69,
"basketProductInstanceId" : 839800,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "911.7",
"vatPercentage" : "0.2"
},
{
"_id" : ObjectId("535feffb2e441acf446fad03"),
"name" : "Melon",
"productId" : 4544,
"basketProductInstanceId" : 839801,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "200",
"vatPercentage" : "0"
},
{
"_id" : ObjectId("535feffb2e441acf446fad04"),
"name" : "30p",
"productId" : 8496,
"basketProductInstanceId" : 839802,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "30",
"vatPercentage" : "0"
}
],
"created" : ISODate("2014-04-07T14:43:00.000Z"),
"__v" : 0
}
Could I have this grouped by completed by hourly? So the end result should be something similar to the below
{
"totalTip" : "0",
"basketTransactionCashTotal" : "4638",
"basketTransactionCardTotal" : "0",
"completed" : "2014-04-07 14",
"consumerIds" : [null, 1, 2],
"storeId" : 1,
"basketIds" : [210048, 210072, 210076]
"basketProduct" : [
{
"_id" : ObjectId("535feffa2e441acf446facf3"),
"name" : "Melon",
"productId" : 4544,
"basketProductInstanceId" : 838430,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "200",
"vatPercentage" : "0"
},
{
"_id" : ObjectId("535feffa2e441acf446facf4"),
"name" : "30p",
"productId" : 8496,
"basketProductInstanceId" : 838431,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "30",
"vatPercentage" : "0"
},
{
"_id" : ObjectId("535feffb2e441acf446fad02"),
"name" : "Creamy Natural Yoghurt",
"productId" : 69,
"basketProductInstanceId" : 839800,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "911.7",
"vatPercentage" : "0.2"
},
{
"_id" : ObjectId("535feffb2e441acf446fad03"),
"name" : "Melon",
"productId" : 4544,
"basketProductInstanceId" : 839801,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "200",
"vatPercentage" : "0"
},
{
"_id" : ObjectId("535feffb2e441acf446fad04"),
"name" : "30p",
"productId" : 8496,
"basketProductInstanceId" : 839802,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "30",
"vatPercentage" : "0"
},
{
"_id" : ObjectId("535feffa2e441acf446facb7"),
"name" : "Vanilla Spice Hot Chocolate",
"productId" : 23,
"basketProductInstanceId" : 838392,
"quantity" : 1,
"productModifiedPrice" : null,
"productInstancePrice" : "633",
"vatPercentage" : "0.2"
}
]
}
Could anyone help me understand this for mongoose in nodejs please?
First of all, the fields tip, basketTransactionCash and basketTransactionCard need to be converted to numeric type for you to perform any arithmetic operations. You can look at this question for an approach to update all documents with the right data type.
Once the data type is taken care of, you can use the aggregation framework to get close to what you want. The date aggregation operators provide a way to group by the hour. Sample query is below:
db.collection.aggregate([
{
"$group": {
"_id": {
"year": {"$year": "$completed"},
"month": {"$month": "$completed"},
"day": {"$dayOfMonth": "$completed"},
"hour": {"$hour": "$completed"}
},
"totalTip": {
"$sum": "$tip"
},
"basketTransactionCashTotal": {
"$sum": "$basketTransactionCash"
},
"basketTransactionCardTotal": {
"$sum": "$basketTransactionCard"
},
"consumerIds": {
"$push": "$consumerId"
},
"storeId": {
"$addToSet": "$storeId"
},
"basketIds": {
"$push": "$basketId"
},
"basketProducts": {
"$push": "$basketProduct"
}
}
}
])
I would suggest you to read up on aggregation and play around with it a little bit as it's a quite powerful tool.
NOTE: Unless the data types are updated, you'll get 0 for totalTip, basketTransactionCashTotal and basketTransactionCardTotal