I was wondering what the best way to aggregate this data would be in Groovy?
Let's say I have the following data:
[
[id: 1, name: bob, age:20, numberOfPackages: 10, numberOfPurchases:20 ],
[id: 1, name: bob, age:20, numberOfPackages: 5, numberOfPurchases:6 ],
[id: 2, name: Rob, age:22, numberOfPackages: 3, numberOfPurchases:5 ],
]
and I want to transform it to the following (merge id/name/age, but sum numberOfPackages and numberOfPurchases per id):
[
[id: 1, name: bob, age:20, numberOfPackages: 15, numberOfPurchases:26 ],
[id: 2, name: Rob, age:22, numberOfPackages: 3, numberOfPurchases:5 ],
]
Summing the prices and the number of purchases separately makes little sense. Do you mean something like this, with the total spend per person?
def data = [
[id: 1, name: 'bob', age:20, price: 10, numberOfPurchases:20],
[id: 1, name: 'bob', age:20, price: 5, numberOfPurchases:6],
[id: 2, name: 'rob', age:22, price: 3, numberOfPurchases:5]
]
data.groupBy { [id:it.id, name:it.name, age:it.age] }.collect { k, v ->
[id:k.id,
name:k.name,
age:k.age,
spend:v.collect { it.price * it.numberOfPurchases }.sum()]
}
Which gives:
[
[id:1, name:'bob', age:20, spend:230],
[id:2, name:'rob', age:22, spend:15]
]
It could be done like this, for example:
def data = [
[id: 1, name: 'bob', age:20, price: 10, numberOfPurchases:20 ],
[id: 1, name: 'bob', age:20, price: 5, numberOfPurchases:6 ],
[id: 2, name: 'rob', age:22, price: 3, numberOfPurchases:5 ],
]
data.groupBy { it.id }.collectEntries {
[
(it.key): [
name: it.value.name.first(),
age: it.value.age.first(),
price: it.value.sum { it.price },
numberOfPurchases: it.value.sum { it.numberOfPurchases },
]
]
}
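For comparison, the group-then-sum logic that both answers rely on can be sketched outside Groovy. This is a minimal Python sketch, assuming the field names from the question (numberOfPackages, numberOfPurchases); the helper name aggregate_by_id is made up for illustration and is not part of either answer.

```python
def aggregate_by_id(rows):
    """Merge rows that share an id, summing the two numeric fields."""
    merged = {}
    for row in rows:
        if row["id"] not in merged:
            # First row for this id: copy it so the input is not mutated.
            merged[row["id"]] = dict(row)
        else:
            target = merged[row["id"]]
            target["numberOfPackages"] += row["numberOfPackages"]
            target["numberOfPurchases"] += row["numberOfPurchases"]
    # dicts keep insertion order, so ids come out in first-seen order.
    return list(merged.values())


data = [
    {"id": 1, "name": "bob", "age": 20, "numberOfPackages": 10, "numberOfPurchases": 20},
    {"id": 1, "name": "bob", "age": 20, "numberOfPackages": 5, "numberOfPurchases": 6},
    {"id": 2, "name": "Rob", "age": 22, "numberOfPackages": 3, "numberOfPurchases": 5},
]
result = aggregate_by_id(data)
```

Here name and age are taken from the first row seen for each id, which matches the `first()` calls in the Groovy version.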
First of all, I'm not sure I've set this up by the book. I come from the SQL world and have just jumped into NoSQL land.
OK, so: I have a collection of Projects, and each project holds its files as child references. I can populate them and all that, which works really well. But I want to filter by tags. I have a tags field in the File collection, an array of strings, pretty straightforward.
What I would like to do is send a projectId and a tag string as a filter, and get back the files that belong to the project and also contain the tag. Oh, and they should be populated too.
Is this even the right approach with NoSQL/MongoDB? I know how I would do it in SQL, with parent_ids and some joins. I've looked into aggregation, but it seems I'm too much of a novice to work it out.
edit, just to show how my collections are built:
Project Collection
[{
id: 1,
name: 'Project01',
files: [
id: 1,
id: 2,
id: 3,
id: 4,
id: 5,
...
]
},
...
]
Files Collection
[{
id: 1,
name: 'filename',
tags: ['a','b']
},{
id: 2,
name: 'filename2',
tags: ['b', 'c']
},{
id: 3,
name: 'filename3',
tags: ['a', 'd', 'e', 'f']
},
...]
The result I'm going for (get all files in project 1 where tags includes 'b'):
{
id: 1,
name: 'Project01',
files: [
{
id: 1,
name: 'filename',
tags: ['a','b']
},{
id: 2,
name: 'filename2',
tags: ['b', 'c']
}
]
}
Try the $unwind stage in a MongoDB aggregation.
Here is a sample collection along the lines of your requirement, with the files embedded in each project:
[
{
_id: 1,
name: "Project01",
files: [
{
id: 1,
name: "filename11",
tags: [
"a",
"b"
]
},
{
id: 2,
name: "filename12",
tags: [
"b",
"c"
]
},
{
id: 3,
name: "filename13",
tags: [
"a",
"c"
]
}
]
},
{
_id: 2,
name: "Project02",
files: [
{
id: 1,
name: "filename21",
tags: [
"a",
"b"
]
},
{
id: 2,
name: "filename22",
tags: [
"a",
"c"
]
},
{
id: 3,
name: "filename23",
tags: [
"b",
"c"
]
}
]
}
]
Method 1: for your project collection
db.collection.aggregate([
{
$match: {
_id: 1
}
},
{
$unwind: "$files"
},
{
$match: {
_id: 1,
"files.tags": {
$in: [
"b"
]
}
}
}
])
Method 2: for the files collection
db.collection.aggregate([
{
$unwind: "$tags"
},
{
$match: {
tags: "xyz"
}
}
])
Try it here Mongoplayground
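To see what Method 1 does to the data, here is a rough in-memory sketch in Python. It imitates $unwind and the following $match on plain dictionaries; it is not MongoDB driver code, and the unwind helper is a hypothetical name used only for illustration.

```python
def unwind(docs, field):
    """Yield one copy of each document per element of its array field."""
    for doc in docs:
        for item in doc[field]:
            # Replace the array with a single element, keep everything else.
            yield {**doc, field: item}


projects = [
    {"_id": 1, "name": "Project01", "files": [
        {"id": 1, "name": "filename11", "tags": ["a", "b"]},
        {"id": 2, "name": "filename12", "tags": ["b", "c"]},
        {"id": 3, "name": "filename13", "tags": ["a", "c"]},
    ]},
    {"_id": 2, "name": "Project02", "files": [
        {"id": 1, "name": "filename21", "tags": ["a", "b"]},
    ]},
]

# Equivalent of: $match {_id: 1}, $unwind "$files", $match {"files.tags": "b"}
matched = [
    doc for doc in unwind(projects, "files")
    if doc["_id"] == 1 and "b" in doc["files"]["tags"]
]
matched_names = [doc["files"]["name"] for doc in matched]
```

Note that the result is one document per matching file, each carrying the project fields, rather than a single project document with a filtered files array.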
In ArangoDB I want to group and sort notification data.
I have the following notification data sets
[
{id: 1, groupId: 1, text: 'Aoo', time: 23},
{id: 2, groupId: 2, text: 'Boo', time: 32},
{id: 3, groupId: 1, text: 'Coo', time: 45},
{id: 4, groupId: 3, text: 'Doo', time: 56},
{id: 5, groupId: 1, text: 'Eoo', time: 22},
{id: 6, groupId: 2, text: 'Foo', time: 23}
]
I want to group the notifications by groupId, and the group with the most recent notification should appear on top.
The final result should look like this:
[
{ groupId: 3, notifications: [{id: 4, groupId: 3, text: 'Doo', time: 56}] },
{ groupId: 1, notifications: [{id: 3, groupId: 1, text: 'Coo', time: 45}, {id: 1, groupId: 1, text: 'Aoo', time: 23}, {id: 5, groupId: 1, text: 'Eoo', time: 22}]},
{ groupId: 2, notifications: [{id: 2, groupId: 2, text: 'Boo', time: 32}, {id: 6, groupId: 2, text: 'Foo', time: 23}] }
]
I tried the following AQL:
FOR doc IN notification
SORT doc.time DESC
COLLECT groupId = doc.groupId INTO g
RETURN { groupId, notifications: g[*].doc }
The above query sorts the elements inside each group, but the outer groups are not sorted.
I'm struggling to construct an AQL query for this. Any pointers would be helpful.
Thanks
Sort twice: first the documents before they are collected, as you already do, then the resulting groups after the COLLECT:
FOR doc IN notification
SORT doc.time DESC
COLLECT groupId = doc.groupId INTO g
SORT g[*].doc.time DESC
RETURN { groupId, notifications: g[*].doc }
In my tests this yields the desired sequence:
[
{
"groupId": 3,
"notifications": [
{
"id": 4,
"groupId": 3,
"text": "Doo",
"time": 56
}
]
},
{
"groupId": 1,
"notifications": [
{
"id": 3,
"groupId": 1,
"text": "Coo",
"time": 45
},
{
"id": 1,
"groupId": 1,
"text": "Aoo",
"time": 23
},
{
"id": 5,
"groupId": 1,
"text": "Eoo",
"time": 22
}
]
},
{
"groupId": 2,
"notifications": [
{
"id": 2,
"groupId": 2,
"text": "Boo",
"time": 32
},
{
"id": 6,
"groupId": 2,
"text": "Foo",
"time": 23
}
]
}
]
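The same sort, group, sort sequence can be traced on the sample data with a small Python sketch. It only mirrors the logic of the AQL query on in-memory dictionaries and does not talk to ArangoDB; the function name group_by_recency is hypothetical.

```python
def group_by_recency(notifications):
    """Sort by time, group by groupId, then sort groups by their times."""
    # First sort: newest notification first (like SORT doc.time DESC).
    ordered = sorted(notifications, key=lambda n: n["time"], reverse=True)

    # Group while keeping the sorted order inside each group (like COLLECT ... INTO).
    groups = {}
    for n in ordered:
        groups.setdefault(n["groupId"], []).append(n)

    # Second sort: compare the lists of times, so the group holding the
    # newest notification comes first (like SORT g[*].doc.time DESC).
    ranked = sorted(
        groups.items(),
        key=lambda item: [n["time"] for n in item[1]],
        reverse=True,
    )
    return [{"groupId": gid, "notifications": items} for gid, items in ranked]


notifications = [
    {"id": 1, "groupId": 1, "text": "Aoo", "time": 23},
    {"id": 2, "groupId": 2, "text": "Boo", "time": 32},
    {"id": 3, "groupId": 1, "text": "Coo", "time": 45},
    {"id": 4, "groupId": 3, "text": "Doo", "time": 56},
    {"id": 5, "groupId": 1, "text": "Eoo", "time": 22},
    {"id": 6, "groupId": 2, "text": "Foo", "time": 23},
]
result = group_by_recency(notifications)
```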
I have 2 sphinx indexes: index1 and index2.
When I search in index1, I get two matches:
{ error: '',
warning: '',
status: [ 0 ],
fields: [ 'name' ],
attrs: [],
matches:
[ { id: 5731, weight: 2, attrs: {} },
{ id: 17236, weight: 2, attrs: {} } ],
total: 2,
total_found: 2,
time: 0,
words: [ { word: '*foo*', docs: 2, hits: 4 } ] }
Now I can fetch those 2 records from the database and return them to the client.
When I search for the same term in index2, I get three matches:
{ error: '',
warning: '',
status: [ 0 ],
fields: [ 'name' ],
attrs: [],
matches:
[ { id: 28, weight: 1, attrs: {} },
{ id: 41, weight: 1, attrs: {} },
{ id: 42, weight: 1, attrs: {} } ],
total: 3,
total_found: 3,
time: 0,
words: [ { word: '*foo*', docs: 3, hits: 3 } ] }
Now I can fetch those 3 records from the database and return them to the client.
When I search in all indexes, I get five matches:
{ error: '',
warning: '',
status: [ 0 ],
fields: [ 'name' ],
attrs: [],
matches:
[ { id: 5731, weight: 2, attrs: {} },
{ id: 17236, weight: 2, attrs: {} },
{ id: 28, weight: 1, attrs: {} },
{ id: 41, weight: 1, attrs: {} },
{ id: 42, weight: 1, attrs: {} } ],
total: 5,
total_found: 5,
time: 0,
words: [ { word: '*foo*', docs: 5, hits: 7 } ] }
The problem is that the indexes are built on different database tables, so I don't know what to do with the matches, because the ids refer to different tables.
How can I get the index names, or the sources, along with the search results, so that I know exactly what has been found?
I'm using the sphinxapi node.js client, if it matters.
Add an explicit attribute to the indexes :)
source index1 {
sql_query = SELECT id, title, 1 as idx FROM ...
sql_attr_uint = idx:2
}
source index2 {
sql_query = SELECT id, title, 2 as idx FROM ...
sql_attr_uint = idx:2
}
(the number in sql_attr_uint is the number of bits for the attribute)
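Once such an attribute exists, the client can route each match to the right table. The following Python sketch shows the idea under two assumptions that are not confirmed by the thread: that each match then carries the attribute as attrs.idx, and that the two tables are called table_for_index1 and table_for_index2 (placeholder names).

```python
# Hypothetical mapping from the idx attribute to the table behind each index.
TABLE_BY_IDX = {1: "table_for_index1", 2: "table_for_index2"}


def split_matches_by_table(matches):
    """Group matched ids by the table that their idx attribute points to."""
    ids_by_table = {}
    for match in matches:
        table = TABLE_BY_IDX[match["attrs"]["idx"]]
        ids_by_table.setdefault(table, []).append(match["id"])
    return ids_by_table


# Assumed shape of the matches after the idx attribute has been added.
matches = [
    {"id": 5731, "weight": 2, "attrs": {"idx": 1}},
    {"id": 17236, "weight": 2, "attrs": {"idx": 1}},
    {"id": 28, "weight": 1, "attrs": {"idx": 2}},
    {"id": 41, "weight": 1, "attrs": {"idx": 2}},
    {"id": 42, "weight": 1, "attrs": {"idx": 2}},
]
ids_by_table = split_matches_by_table(matches)
```

Each list of ids can then be fetched from its own table with a single query.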
In my database I have :User nodes, and they are related by :Friendship relationships. I want to get a structure like this:
[
{
id: 1,
username: "Whatever",
email: "whatever#test.com"
...
},
[ 6, 7, 8, ... ]
],
[
{
id: 2,
username: "Another user",
email: "anotheruser#test.com"
...
},
[ 15, 16, 17, 18, ... ]
],
...
...where the numbers are the IDs of the nodes that the node is directly related to with a :Friendship relationship.
This answer has some queries that almost do the work:
Can I find all the relations between two nodes in neo4j?
But the closest one I came up with was:
match p=(a:User)-[:Friendship]->(d:User)
return d, reduce(nodes = [],n in nodes(p) | nodes + [id(n)]) as node_id_col
...which returns this structure:
[
{
id: 1,
username: "Whatever",
email: "whatever#test.com"
...
},
[ 1, 6 ]
],
[
{
id: 1,
username: "Whatever",
email: "whatever#test.com"
...
},
[ 1, 7 ]
],
[
{
id: 1,
username: "Whatever",
email: "whatever#test.com"
...
},
[ 1, 8 ]
],
[
{
id: 2,
username: "Another user",
email: "anotheruser#test.com"
...
},
[ 2, 15 ]
],
[
{
id: 2,
username: "Another user",
email: "anotheruser#test.com"
...
},
[ 2, 16 ]
],
...
That is not good, because it is returning a lot of redundant data.
So what would be the proper Cypher query for this?
Thanks!
I think you may be overcomplicating things, or I am not properly understanding the problem. Does something like this work for you?
match (a:User)-[:Friendship]->(d:User)
return a, collect(id(d))
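What collect does here is fold the one-row-per-relationship result into one row per user. A minimal Python sketch of that folding, using the ids from the question and a hypothetical helper name, might look like this:

```python
def collect_friend_ids(pairs):
    """Fold (user_id, friend_id) rows into one list of friend ids per user."""
    friends = {}
    for user_id, friend_id in pairs:
        friends.setdefault(user_id, []).append(friend_id)
    return friends


# One row per :Friendship relationship, as in the redundant result above.
pairs = [(1, 6), (1, 7), (1, 8), (2, 15), (2, 16)]
friends_by_user = collect_friend_ids(pairs)
```

In Cypher the grouping key is implicit: every non-aggregated item in the return clause (here a) acts as the key for collect.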
I'm pretty new to MongoDB, and I am trying to construct a query using Mongoose to get the desired result below, if possible.
Test data:
{ id: 1, display_time: '01:00', name: 'test1' },
{ id: 2, display_time: '03:00', name: 'test2' },
{ id: 3, display_time: '01:00', name: 'test3' },
{ id: 4, display_time: '04:00', name: 'test4' },
{ id: 5, display_time: '01:00', name: 'test5' }
Desired result:
{
"01:00": [
{ id: 1, display_time: '01:00', name: 'test1' },
{ id: 3, display_time: '01:00', name: 'test3' },
{ id: 5, display_time: '01:00', name: 'test5' }
],
"03:00": [
{ id: 2, display_time: '03:00', name: 'test2' }
],
"04:00": [
{ id: 4, display_time: '04:00', name: 'test4' }
],
}
Basically it groups the documents based on the display_time field and returns it in that format. Is this possible with Mongo?
You can try something like this:
db.collection.aggregate(
[
{ $group : { _id : "$display_time", test: { $push: "$$ROOT" } } }
]
)
This returns one document per display_time value, for example:
{
"_id" : "01:00",
"test" :
[
{ id: 1, display_time: '01:00', name: 'test1' },
{ id: 3, display_time: '01:00', name: 'test3' }
]
}
{
"_id" : "04:00",
"test" :
[
{ id: 4, display_time: '04:00', name: 'test4' }
]
}
For more information, see http://docs.mongodb.org/manual/reference/operator/aggregation/group/
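The $group stage returns a list of documents rather than the single object keyed by display_time that the question asks for, so a small reshaping step on the client may still be needed. Here is a minimal Python sketch of that step, assuming the aggregation output has the _id and test fields shown above; the grouped list is typed out by hand for the five test documents rather than fetched from a database.

```python
def key_by_display_time(groups):
    """Turn [{_id, test}, ...] into {display_time: [documents]}."""
    return {group["_id"]: group["test"] for group in groups}


# Hand-written stand-in for the aggregation output on the test data.
grouped = [
    {"_id": "01:00", "test": [
        {"id": 1, "display_time": "01:00", "name": "test1"},
        {"id": 3, "display_time": "01:00", "name": "test3"},
        {"id": 5, "display_time": "01:00", "name": "test5"},
    ]},
    {"_id": "03:00", "test": [
        {"id": 2, "display_time": "03:00", "name": "test2"},
    ]},
    {"_id": "04:00", "test": [
        {"id": 4, "display_time": "04:00", "name": "test4"},
    ]},
]
by_time = key_by_display_time(grouped)
```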