Find distinct values group by another field mongodb - python-3.x

I have a collection with documents like this:
{
"_id" : ObjectId("5c0685fd6afbd73b80f45338"),
"page_id" : "1234",
"category_list" : [
"football",
"sport"
],
"time_broadcast" : "09:13"
}
{
"_id" : ObjectId("5c0685fd6afbd7355f45338"),
"page_id" : "1234",
"category_list" : [
"sport",
"handball"
],
"time_broadcast" : "09:13"
}
{
"_id" : ObjectId("5c0694ec6afbd74af41ea4af"),
"page_id" : "123456",
"category_list" : [
"news",
"updates"
],
"time_broadcast" : "09:13"
}
....
now = datetime.datetime.now().time().strftime("%H:%M")
What I want: when "time_broadcast" equals now, get the list of distinct "category_list" values for each "page_id".
Here is how the output should look:
[
{
"page_id" : "1234",
"category_list" : ["football", "sport", "handball"]
},
{
"page_id" : "123456",
"category_list" : ["news", "updates"]
}
]
I have tried this:
category_list = db.users.find({'time_broadcast': now}).distinct("category_list")
but it returns the distinct values across all "page_id" values:
["football", "sport", "handball", "news", "updates"]
rather than the category_list for each page_id.
Any help, please?
Thanks

You need to write an aggregation pipeline:
$match - filter the documents by the criteria
$group - group the documents by the key field
$addToSet - collect the distinct category_list arrays of each group
$project - project into the required format
$reduce - flatten the array of arrays into a single array with $concatArrays
Aggregate query:
db.tt.aggregate([
{$match : {"time_broadcast" : "09:13"}},
{$group : {"_id" : "$page_id", "category_list" : {$addToSet : "$category_list"}}},
{$project : {"_id" : 0, "page_id" : "$_id", "category_list" : {$reduce : {input : "$category_list", initialValue : [], in: { $concatArrays : ["$$value", "$$this"] }}}}}
]).pretty()
result
{ "page_id" : "123456", "category_list" : [ "news", "updates" ] }
{
"page_id" : "1234",
"category_list" : [
"sport",
"handball",
"football",
"sport"
]
}
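You can add a $sort stage on page_id if required. Note that "sport" still appears twice for page_id 1234 in the result above: $addToSet deduplicates the category_list arrays as whole values, not their elements, so $concatArrays keeps the overlap. Using $setUnion in place of $concatArrays inside $reduce returns distinct values.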
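As a rough illustration of what the question asks for, the sketch below emulates the match, group, and de-duplicate steps in plain Python on the sample documents. It also spells out one possible pipeline as a Python list. The names `distinct_categories_by_page` and `build_pipeline`, and the `$unwind`-based pipeline variant, are assumptions made for illustration and are not part of the original answer.

```python
# Sample documents from the question (the _id fields are omitted).
docs = [
    {"page_id": "1234", "category_list": ["football", "sport"], "time_broadcast": "09:13"},
    {"page_id": "1234", "category_list": ["sport", "handball"], "time_broadcast": "09:13"},
    {"page_id": "123456", "category_list": ["news", "updates"], "time_broadcast": "09:13"},
]


def distinct_categories_by_page(documents, now):
    """Hypothetical helper: emulate $match, $unwind and $group/$addToSet in memory."""
    grouped = {}  # page_id -> categories, in first-seen order
    for doc in documents:
        if doc["time_broadcast"] != now:  # $match
            continue
        categories = grouped.setdefault(doc["page_id"], [])  # $group by page_id
        for category in doc["category_list"]:  # $unwind
            if category not in categories:  # $addToSet on single values
                categories.append(category)
    return [{"page_id": p, "category_list": c} for p, c in grouped.items()]


def build_pipeline(now):
    """Hypothetical helper: an assumed pipeline variant that unwinds before grouping."""
    return [
        {"$match": {"time_broadcast": now}},
        {"$unwind": "$category_list"},
        {"$group": {"_id": "$page_id", "category_list": {"$addToSet": "$category_list"}}},
        {"$project": {"_id": 0, "page_id": "$_id", "category_list": 1}},
    ]


result = distinct_categories_by_page(docs, "09:13")
print(result)
```

With PyMongo, and assuming `db` is a database handle, the pipeline would be run as `db.users.aggregate(build_pipeline(now))`. The order of values produced by `$addToSet` is not guaranteed, unlike the first-seen order of the in-memory version.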

Related

Mongo pull object from array inside array

I have this document in my MongoDB collection:
{
"_id" : ObjectId("5b633025579fac22e74bf3be"),
"FLAGS" : [
{
"toSent" : [
{
"_id" : ObjectId("5b633025579fac22e74bf3c2"),
"phone" : "+84404040404"
},
{
"_id" : ObjectId("5b633025579fac22e74bf3c1"),
"phone" : "+212652253403"
},
{
"_id" : ObjectId("5b633025579fac22e74bf3c0"),
"phone" : "+212123456788"
}
],
"_id" : ObjectId("5b633025579fac22e74bf3bf"),
"action" : "group_p_a"
},
{
"toSent" : [
{
"_id" : ObjectId("5b633031579fac22e74bf3c9"),
"phone" : "+212651077199"
},
{
"_id" : ObjectId("5b633031579fac22e74bf3c8"),
"phone" : "+84404040404"
},
{
"_id" : ObjectId("5b633031579fac22e74bf3c7"),
"phone" : "+212652253403"
},
{
"_id" : ObjectId("5b633031579fac22e74bf3c6"),
"phone" : "+212123456788"
}
],
"_id" : ObjectId("5b633031579fac22e74bf3c5"),
"action" : "group_p_a"
}
],
"time" : ISODate("2018-08-02T16:24:05.747+0000"),
"action_user_phone" : "+212123456788",
"idGroup" : "e534379a-1580-4568-b5ec-6eaf981538d2",
"nomGroup" : "MOH FOR EVER",
"__v" : NumberInt(0)
}
TODO
I need to remove, for example, this element: { "_id" : ObjectId("5b633025579fac22e74bf3c2"), "phone" : "+84404040404"}
WHAT I DID
GroupEvents.update({}, {$pull:{FLAGS:{$elemMatch:{toSent:{phone: "+84404040404"} }}}},function(err,ret){
if(err)
console.log("error"+err);
if(ret)
console.log(ret);
});
It removes everything inside toSent, even the elements that don't match.
Any help, please?
You need to use the $ positional operator here instead of $elemMatch. Field names are case-sensitive, so the filter must use FLAGS, matching the document:
GroupEvents.update(
{ "FLAGS.toSent.phone": "+84404040404" },
{ "$pull": { "FLAGS.$.toSent": { "phone": "+84404040404" }}}
)
If you want to remove it from every element of the FLAGS array, use $[], the all-positional operator (available from MongoDB 3.6):
GroupEvents.update(
{ "FLAGS.toSent.phone": "+84404040404" },
{ "$pull": { "FLAGS.$[].toSent": { "phone": "+84404040404" }}}
)
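To make the difference between the two updates concrete, here is a minimal in-memory sketch in Python. It only mimics the effect of $pull on a trimmed copy of the document. The helper `pull_phone` and its `every_element` flag are hypothetical names introduced for illustration, not MongoDB or Mongoose APIs.

```python
import copy

# Trimmed version of the document from the question (only the fields that matter here).
doc = {
    "FLAGS": [
        {"action": "group_p_a",
         "toSent": [{"phone": "+84404040404"}, {"phone": "+212652253403"}]},
        {"action": "group_p_a",
         "toSent": [{"phone": "+212651077199"}, {"phone": "+84404040404"}]},
    ]
}


def pull_phone(document, phone, every_element=False):
    """Hypothetical helper: return a copy with matching toSent entries removed.

    every_element=False mimics "FLAGS.$.toSent" (first matching FLAGS element only);
    every_element=True mimics "FLAGS.$[].toSent" (every FLAGS element).
    """
    updated = copy.deepcopy(document)  # leave the input untouched
    for flag in updated["FLAGS"]:
        if any(entry["phone"] == phone for entry in flag["toSent"]):
            flag["toSent"] = [e for e in flag["toSent"] if e["phone"] != phone]
            if not every_element:
                break  # the positional $ only targets the first match
    return updated


first_only = pull_phone(doc, "+84404040404")
all_flags = pull_phone(doc, "+84404040404", every_element=True)
print(first_only["FLAGS"][1]["toSent"])
print(all_flags["FLAGS"][1]["toSent"])
```

In both cases only the matching phone entries are removed, and the rest of each toSent array is kept.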

Can't find subdocument counts based on a condition

I have a schema with some fields.
I could not work out a query for this; I tried $group but did not get the results I expected.
collection: tasks
{
"_id" : ObjectId("5a475ee4b342fa03e71192bd"),
"title" : "Some Title",
"assignedUsers" : [
{
"_id" : ObjectId("5a47386ee4788102e530f60d"),
"name" : "Sam",
"status" : "Unconfirmed"
},
{
"_id" : ObjectId("5a473878e4788102e530f60f"),
"name" : "Ricky",
"status" : "Rejected"
},
{
"_id" : ObjectId("5a47388be4788102e530f611"),
"name" : "Niel",
"status" : "Unconfirmed"
},
{
"_id" : ObjectId("5a47388be4788102e530f611"),
"name" : "ABC",
"status" : "Unconfirmed"
},
{
"_id" : ObjectId("5a473892e4788102e530f612"),
"name" : "Rocky",
"status" : "Rejected"
}
]
}
Result should contain
Unconfirmed=3
Rejected=2
Thanks
Use the query below:
db.coll3.aggregate([{
$unwind: '$assignedUsers'
}, {
$group: {
_id: '$assignedUsers.status',
'count': {
$sum: 1
}
}
}
])
If you want to query a particular document, add a $match as the first stage, followed by the $unwind and $group stages.
You would get this result:
{ "_id" : "Rejected", "count" : 2 }
{ "_id" : "Unconfirmed", "count" : 3 }
Hope this helps.
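For readers who want to check the counting logic without a database, the sketch below reproduces the $unwind plus $group/$sum steps in plain Python. The function name `count_statuses` is hypothetical, and the sample data is a trimmed copy of the document in the question.

```python
from collections import Counter

# Trimmed copy of the task document from the question.
task = {
    "title": "Some Title",
    "assignedUsers": [
        {"name": "Sam", "status": "Unconfirmed"},
        {"name": "Ricky", "status": "Rejected"},
        {"name": "Niel", "status": "Unconfirmed"},
        {"name": "ABC", "status": "Unconfirmed"},
        {"name": "Rocky", "status": "Rejected"},
    ],
}


def count_statuses(tasks):
    """Hypothetical helper: count assigned users per status across the given tasks."""
    counts = Counter()
    for current in tasks:
        for user in current.get("assignedUsers", []):  # $unwind
            counts[user["status"]] += 1  # $group with $sum: 1
    return dict(counts)


status_counts = count_statuses([task])
print(status_counts)
```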

Searching value in 2 different fields mongodb + node.js

I am a newbie, but I am trying to learn the most logical way to write queries.
Assume I have a collection like this:
{
"id" : NumberInt(1),
"school" : [
{
"name" : "george",
"code" : "01"
},
{
"name" : "michelangelo",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "elisabeth",
"code" : NumberInt(21)
}
]
}
{
"id" : NumberInt(2),
"school" : [
{
"name" : "leonarda da vinci",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "michelangelo",
"code" : NumberInt(25)
}
]
}
I want to list the occurrences of a key with their corresponding code values.
For example, key: michelangelo
To find the occurrences of the key, I wrote two different aggregation queries:
db.test.aggregate([
{$unwind: "$school"},
{$match : {"school.name" : "michelangelo"}},
{$project: {_id: "$id", "key" : "$school.name", "code" : "$school.code"}}
])
and
db.test.aggregate([
{$unwind: "$enrolledStudents"},
{$match : {"enrolledStudents.userName" : "michelangelo"}},
{$project: {_id: "$id", "key" : "$enrolledStudents.userName", "code" : "$enrolledStudents.code"}}
])
Together, these 2 queries return what I want:
{ "_id" : 1, "key" : "michelangelo", "code" : "01" }
{ "_id" : 2, "key" : "michelangelo", "code" : 25 }
One of them searches in enrolledStudents, the other in the school field.
Can these 2 queries be reduced to a single, more logical query? Or is this the only way to do it?
PS: I am aware that the database structure is not logical, but I tried to simulate.
Edit
I tried to write a query with find.
db.test.find({$or: [{"enrolledStudents.userName" : "michelangelo"} , {"school.name" : "michelangelo"}]}).pretty()
but this returns the whole documents:
{
"id" : 1,
"school" : [
{
"name" : "george",
"code" : "01"
},
{
"name" : "michelangelo",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "elisabeth",
"code" : 21
}
]
}
{
"id" : 2,
"school" : [
{
"name" : "leonarda da vinci",
"code" : "01"
}
],
"enrolledStudents" : [
{
"userName" : "michelangelo",
"code" : 25
}
]
}
Mongo 3.4
$match - This stage keeps the documents whose school or enrolledStudents array contains at least one embedded document matching either query condition.
$group - This stage combines the school and enrolledStudents arrays into a 2D array for each id.
$project - This stage uses $filter to keep the elements of the merged array that match the query condition, and $map to relabel them into the values array.
$unwind - This stage flattens the values array.
$addFields & $replaceRoot - These stages add the id field and promote the values documents to the top level.
db.collection.aggregate([
{$match : {$or: [{"enrolledStudents.userName" : "michelangelo"} , {"school.name" : "michelangelo"}]}},
{$group: {_id: "$id", merge : {$push:{$setUnion:["$school", "$enrolledStudents"]}}}},
{$project: {
values: {
$map:
{
input: {
$filter: {
input: {"$arrayElemAt":["$merge",0]},
as: "onef",
cond: {
$or: [{
$eq: ["$$onef.userName", "michelangelo"]
}, {
$eq: ["$$onef.name", "michelangelo"]
}]
}
}
},
as: "onem",
in: {
key : { $ifNull: [ "$$onem.userName", "$$onem.name" ] },
code : "$$onem.code"}
}
}
}
},
{$unwind: "$values"},
{$addFields:{"values.id":"$_id"}},
{$replaceRoot: { newRoot:"$values"}}
])
Sample Response
{ "_id" : 2, "key" : "michelangelo", "code" : 25 }
{ "_id" : 1, "key" : "michelangelo", "code" : "01" }
Mongo <= 3.2
Replace last two stages of above aggregation with $project to format the response.
{$project: {"_id": 0 , id:"$_id", key:"$values.key", code:"$values.code"}}
Sample Response
{ "_id" : 2, "key" : "michelangelo", "code" : 25 }
{ "_id" : 1, "key" : "michelangelo", "code" : "01" }
Alternatively, you can use $redact instead of $match and $group, and add a $project with $map to format the response.
$redact walks the document one level at a time and applies $$DESCEND or $$PRUNE according to the matching criteria.
The only thing to note is the use of $ifNull on id at the top document level, so that $redact can $$DESCEND into the embedded documents for further processing.
db.collection.aggregate([
{
$redact: {
$cond: [{
$or: [{
$eq: ["$userName", "michelangelo"]
}, {
$eq: ["$name", "michelangelo"]
}, {
$ifNull: ["$id", false]
}]
}, "$$DESCEND", "$$PRUNE"]
}
},
{
$project: {
id:1,
values: {
$map:
{
input: {$setUnion:["$school", "$enrolledStudents"]},
as: "onem",
in: {
key : { $ifNull: [ "$$onem.userName", "$$onem.name" ] },
code : "$$onem.code"}
}
}
}
},
{$unwind: "$values"},
{$project: {_id:0,id:"$id", key:"$values.key", code:"$values.code"}}
])
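The intent of both pipelines can be sketched in a few lines of plain Python: merge the two arrays, keep the entries whose name or userName equals the key, and relabel them. This is an in-memory illustration on the sample documents only. The function name `find_key` is hypothetical.

```python
# Sample documents from the question.
docs = [
    {
        "id": 1,
        "school": [{"name": "george", "code": "01"}, {"name": "michelangelo", "code": "01"}],
        "enrolledStudents": [{"userName": "elisabeth", "code": 21}],
    },
    {
        "id": 2,
        "school": [{"name": "leonarda da vinci", "code": "01"}],
        "enrolledStudents": [{"userName": "michelangelo", "code": 25}],
    },
]


def find_key(documents, key):
    """Hypothetical helper: list every occurrence of key in school or enrolledStudents."""
    matches = []
    for doc in documents:
        merged = doc.get("school", []) + doc.get("enrolledStudents", [])
        for entry in merged:
            # Same idea as $ifNull: prefer userName, fall back to name.
            label = entry.get("userName", entry.get("name"))
            if label == key:
                matches.append({"id": doc["id"], "key": label, "code": entry["code"]})
    return matches


occurrences = find_key(docs, "michelangelo")
print(occurrences)
```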

Mongoose Query Help: Paginate, Sort, Limit on Nested Array

I have a chat Mongoose model; sample data is below. If anything is unclear, please come back to me with your questions. Any help is greatly appreciated.
{
"_id" : ObjectId("5745910831a1sd58d070a8faa"),
"messages" : [
{
"user" : "user1",
"message" : "How are you user1?",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:36:00.468+0000"),
"_id" : ObjectId("5745912c31a1c58d070a904d")
},
{
"user" : "user1",
"message" : "Hello user1",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:38:53.893+0000"),
"_id" : ObjectId("5745912531a1c58d070a902e")
}
],
"createDate" : ISODate("2016-05-25T11:35:20.534+0000"),
"users" : [
"57450b4506561ff5052f0a66",
"57450d8108d8d22c06cf138f"
],
"__v" : NumberInt(0)
},
{
"_id" : ObjectId("57458e9331a1c58d070a8e30"),
"messages" : [
{
"user" : "user2",
"message" : "How are you user2",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:46:03.240+0000"),
"_id" : ObjectId("574590f331a1c58d070a8ede")
},
{
"user" : "user2",
"message" : "Hello user2",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:48:53.925+0000"),
"_id" : ObjectId("574590e931a1c58d070a8eab")
}
],
"createDate" : ISODate("2016-05-25T11:35:20.534+0000"),
"users" : [
"5745149e3aaab38706c00b64",
"57450d8108d8d22c06cf138f"
],
"__v" : NumberInt(0)
}
{
"_id" : ObjectId("5745910831a1c58d070a8faa"),
"messages" : [
{
"user" : "user3",
"message" : "How are you user3?",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:56:00.468+0000"),
"_id" : ObjectId("5745912c31a1c58d070a904d")
},
{
"user" : "user3",
"message" : "Hello user3",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:58:53.893+0000"),
"_id" : ObjectId("5745912531a1c58d070a902e")
}
],
"createDate" : ISODate("2016-05-25T11:35:20.534+0000"),
"users" : [
"57450b4506561ff5052f0a66",
"57450d8108d8d22c06cf138f"
],
"__v" : NumberInt(0)
},
{
"_id" : ObjectId("5745910831a1c58d070a8faa"),
"messages" : [
{
"user" : "user4",
"message" : "How are you user4?",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:66:00.468+0000"),
"_id" : ObjectId("5745912c31a1c58d070a904d")
},
{
"user" : "user4",
"message" : "Hello user4",
"readInd" : "N",
"createDate" : ISODate("2016-05-25T11:68:53.893+0000"),
"_id" : ObjectId("5745912531a1c58d070a902e")
}
],
"createDate" : ISODate("2016-05-25T11:35:20.534+0000"),
"users" : [
"57450b4506561ff5052f0a66",
"57450d8108d8d22c06cf138f"
],
"__v" : NumberInt(0)
}
below is the explanation:
user1 sent 2 messages at 11:36 and 11:38 respectively
user2 sent 2 messages at 11:46 and 11:48 respectively
user3 sent 2 messages at 11:56 and 11:58 respectively
user4 sent 2 messages at 11:66 and 11:68 respectively
My Expected Result is:
Pagination/limit Criteria:
show 2 records per page.
show only the Most recent message based on user.
Sample output:
Page1:
"user" : "57450d8108d8d22c06cf138f",
"message" : "How are you user4?"
"user" : "57450d8108d8d22c06cf138f",
"message" : "How are you user3?"
Page2:
"user" : "57450d8108d8d22c06cf138f",
"message" : "How are you user2"
"user" : "57450d8108d8d22c06cf138f",
"message" : "How are you user1?"
Try the queries below. Note that the array field in the documents is users, and that $skip must come before $limit, otherwise the second page is empty.
Query for page one:
db.getCollection('message').aggregate( [ { $match : { users : "57450d8108d8d22c06cf138f" } },
{ $unwind : "$messages" } ,
{ $sort : { 'messages.createDate' : -1} },
{ $limit : 2 },
{ $project : { _id: 0,'message':'$messages.message','user':'$messages.user'} } ])
Query for the next page:
db.getCollection('message').aggregate( [ { $match : { users : "57450d8108d8d22c06cf138f" } },
{ $unwind : "$messages" } ,
{ $sort : { 'messages.createDate' : -1} },
{ $skip : 2 },{ $limit : 2 },
{ $project : { _id: 0,'message':'$messages.message','user':'$messages.user'} } ])
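The paging arithmetic is easier to see outside the pipeline. The sketch below follows the same approach as the queries (flatten, sort newest first, skip, then limit) on a reduced in-memory data set. The function `page_of_messages` and its parameters are hypothetical, and only two of the chats are reproduced. Like the queries, it pages over all messages and does not pick one message per chat.

```python
from datetime import datetime

# Two of the chats from the question, reduced to the fields used here.
chats = [
    {
        "users": ["57450b4506561ff5052f0a66", "57450d8108d8d22c06cf138f"],
        "messages": [
            {"user": "user1", "message": "How are you user1?",
             "createDate": datetime(2016, 5, 25, 11, 36)},
            {"user": "user1", "message": "Hello user1",
             "createDate": datetime(2016, 5, 25, 11, 38, 53)},
        ],
    },
    {
        "users": ["5745149e3aaab38706c00b64", "57450d8108d8d22c06cf138f"],
        "messages": [
            {"user": "user2", "message": "How are you user2",
             "createDate": datetime(2016, 5, 25, 11, 46, 3)},
            {"user": "user2", "message": "Hello user2",
             "createDate": datetime(2016, 5, 25, 11, 48, 53)},
        ],
    },
]


def page_of_messages(all_chats, user_id, page, page_size=2):
    """Hypothetical helper: newest-first messages of a user's chats, one page at a time."""
    messages = [m for chat in all_chats if user_id in chat["users"]  # $match
                for m in chat["messages"]]                           # $unwind
    messages.sort(key=lambda m: m["createDate"], reverse=True)       # $sort: -1
    start = (page - 1) * page_size                                   # $skip
    selected = messages[start:start + page_size]                     # $limit
    return [{"message": m["message"], "user": m["user"]} for m in selected]


page_one = page_of_messages(chats, "57450d8108d8d22c06cf138f", page=1)
page_two = page_of_messages(chats, "57450d8108d8d22c06cf138f", page=2)
print(page_one)
print(page_two)
```

The skip value is `(page - 1) * page_size`, so page one skips nothing and page two skips the first two messages.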

How to find subdocuments within an array where the array in each subdocument contains a certain keyword?

I am creating a Twitter clone using MongoDB and want to search for tweets that have a certain hashtag in them. I came up with the following document structure for my database:
{
"_id" : ObjectId("56f88c038c297eb4048e5dff"),
"twitterHandle" : "abhayp",
"firstName" : "Abhay",
"lastName" : "Pendawale",
"emailID" : "abhayp#gmail.com",
"password" : "278a2c36eeebde495853b14e6e5525fd12074229",
"phoneNumber" : "12394872398",
"location" : "San Jose",
"birthYear" : 1992,
"birthMonth" : 0,
"birthDay" : 1,
"followers" : [
"abhayp",
"deepakp",
"john",
"madhuras",
"nupurs",
"vgalgali"
],
"following" : [
"abhayp",
"abhinavk",
"ankitac",
"anupd",
"arpits"
],
"tweets" : [
{
"tweet_id" : "3f0fe01f8231356f784d07111efdf9d8ead28133",
"tweet_text" : "This new twitter sounds good!",
"created_on" : ISODate("2016-03-13T01:47:37Z"),
"firstName" : "Abhay",
"lastName" : "Pendawale",
"twitterHandle" : "abhayp",
"tags" : [ ]
},
{
"tweet_id" : "4e57b6d7d6b47d69054f0be55c238e8038751d84",
"tweet_text" : "#CMPE273 Node.js is #awesome",
"created_on" : ISODate("2016-03-07T23:16:39Z"),
"firstName" : "Abhay",
"lastName" : "Pendawale",
"twitterHandle" : "abhayp",
"tags" : [
"awesome",
"CMPE273"
]
},
{
"tweet_id" : "e5facd5f37c44313d5be02ffe0a3ca7190affd6b",
"tweet_text" : "Getting an incredible welcome in #Pune #travel #adventure #film #india ",
"created_on" : ISODate("2016-03-07T23:37:27Z"),
"firstName" : "Abhay",
"lastName" : "Pendawale",
"twitterHandle" : "abhayp",
"tags" : [
"adventure",
"film",
"india",
"Pune",
"travel"
]
},
{
"tweet_id" : "f5a735c1f747732f3e04f6cb2c092ff44750c0fd",
"tweet_text" : "The D Day today!\n#TheDDay",
"created_on" : ISODate("2016-03-18T22:24:57Z"),
"firstName" : "Abhay",
"lastName" : "Pendawale",
"twitterHandle" : "abhayp",
"tags" : [ ]
}
]}
I want to find the tweets which have "Pune" in their tags array. I tried the following command:
db.users.find({ "tweets.tags" : {$all: ["Pune"] }}, { tweets : 1 }).pretty();
but this command returns ALL tweets of users who have a 'Pune' entry in the tags of one of their tweets. How can I search only for those tweets which have a 'Pune' entry in their tags array?
Note: I don't want users who have tweeted #Pune, rather I want all tweets which contain #Pune. The duplicate marked question does not solve this problem.
Running the following query:
db.users.aggregate([{
$match: {
'tweets.tags': 'Pune'
}
}, {
$project: {
tweets: {
$filter: {
input: '$tweets',
as: 'tweet',
cond: {
$eq: ['$$tweet.tags', 'Pune']
}
}
}
}
}]);
returns the following results:
{ "_id" : ObjectId("56f88c038c297eb4048e5df1"), "tweets" : [ ] }
{ "_id" : ObjectId("56f88c038c297eb4048e5dff"), "tweets" : [ ] }
{ "_id" : ObjectId("56f88c038c297eb4048e5e07"), "tweets" : [ ] }
Which is certainly not what I want! :(
I don't think that's possible, apart from using some convoluted aggregation stages.
A positional projection comes close, but it only returns the first match.
My advice would be to put the tweets in a separate collection; that would make them much more convenient to browse, and probably more scalable. You're already duplicating the relevant user info in them, so you don't have to change their content.
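For comparison, the filtering the question is after can be sketched in plain Python as a membership test on each tweet's tags. The `$filter` attempt in the question returns empty arrays because `$eq` compares the whole tags array with the string 'Pune'. A membership test is needed instead, for instance the `$in` aggregation operator, assuming MongoDB 3.4 or later. The function name `tweets_with_tag` is hypothetical, and the data is a trimmed copy of the sample user.

```python
# Trimmed copy of the sample user from the question.
users = [
    {
        "twitterHandle": "abhayp",
        "tweets": [
            {"tweet_id": "3f0fe01f8231356f784d07111efdf9d8ead28133", "tags": []},
            {"tweet_id": "4e57b6d7d6b47d69054f0be55c238e8038751d84",
             "tags": ["awesome", "CMPE273"]},
            {"tweet_id": "e5facd5f37c44313d5be02ffe0a3ca7190affd6b",
             "tags": ["adventure", "film", "india", "Pune", "travel"]},
            {"tweet_id": "f5a735c1f747732f3e04f6cb2c092ff44750c0fd", "tags": []},
        ],
    }
]


def tweets_with_tag(all_users, tag):
    """Hypothetical helper: return only the tweets whose tags array contains tag."""
    return [
        tweet
        for user in all_users
        for tweet in user.get("tweets", [])
        if tag in tweet.get("tags", [])  # membership test, not equality
    ]


pune_tweets = tweets_with_tag(users, "Pune")
print([tweet["tweet_id"] for tweet in pune_tweets])
```

With tweets in their own collection, as suggested above, the same result would come from a plain find on the tags field.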

Resources