How do I get count in MongoDB based on specific fields


I have documents like this in my MongoDB Listings collection:
{
listingID: 'abcd',
listingData: {
category: 'residential'
}
},
{
listingID: 'xyz',
listingData: {
category: 'residential'
}
},
{
listingID: 'efgh',
listingData: {
category: 'office'
}
}
I am trying to get the total count of all listings and the count per category.
I can get the total count of listings with an aggregation query, but I am not sure how to get output like this: residentialCount: 2, officeCount: 1, ListingsCount: 3
This is my aggregation query:
{
$match: {
listingID,
},
},
{
$group: {
_id: 1,
ListingsCount: { $sum: 1 },
},
}

Try this:
let listingAggregationCursor = db.collection.aggregate([
// Group by category and count the documents in each group
{ $group: { _id: "$listingData.category", count: { $sum: 1 } } }
]);
let listingAggregation = await listingAggregationCursor.toArray();
(I got this query from https://www.statology.org/mongodb-group-by-count)
This gives you an array of objects, one per listing category, each with the number of listings in that category.
To get the total ListingsCount, sum the count fields of those objects. You can do that like this:
let listingsCount = 0;
for (const listingCategory of listingAggregation) {
listingsCount += listingCategory.count;
}
You should have the data you need at this point. Now it's just a matter of extracting and formatting it as you see fit.
Hope this helps!
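The extracting and formatting step can be done in application code. Here is a minimal sketch, assuming each aggregation row has the shape { _id: <category>, count: <number> }; the helper name summarizeCounts is made up for illustration and is not part of any driver.

```javascript
// Hypothetical helper: folds the per-category rows into one summary object.
// Assumes each row looks like { _id: 'office', count: 1 }.
function summarizeCounts(rows) {
  const summary = { ListingsCount: 0 };
  for (const row of rows) {
    summary[`${row._id}Count`] = row.count; // e.g. officeCount
    summary.ListingsCount += row.count;     // running total over all categories
  }
  return summary;
}

// Example rows, shaped like the output of the $group stage
const rows = [
  { _id: 'residential', count: 2 },
  { _id: 'office', count: 1 },
];
console.log(summarizeCounts(rows));
```

The key names are derived from the category values stored in the documents, so a misspelled category in the data produces a misspelled key.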

Related

Mongo db - how to join and sort two collection with pagination

I have 2 collections:
Office -
{
_id: ObjectId(someOfficeId),
name: "some name",
..other fields
}
Documents -
{
_id: ObjectId(SomeId),
name: "Some document name",
officeId: ObjectId(someOfficeId),
...etc
}
I need to get a list of offices sorted by the number of documents that refer to each office. Pagination should also be implemented.
I tried to do this with aggregation, using $lookup:
const aggregation = [
{
$lookup: {
from: 'documents',
let: {
id: '$id'
},
pipeline: [
{
$match: {
$expr: {
$eq: ['$officeId', '$id']
},
// sent_at: {
// $gte: start,
// $lt: end,
// },
}
}
],
as: 'documents'
},
},
{ $sortByCount: "$documents" },
{ $skip: (page - 1) * limit },
{ $limit: limit },
];
But this doesn't work for me.
Any ideas how to achieve this?
P.S. I need to show offices with 0 documents too, so starting from the documents collection and grouping by office doesn't work for me.

Query
You can use $lookup to join on that field, with a pipeline that groups so that you count the documents of each office (instead of putting the documents into an array, because you only care about the count).
$set moves that count to a top-level field.
Sort using the noffices field.
You can use skip/limit for pagination, but on a very big collection it will be slow. Alternatively, you can paginate using the natural order of _id, or retrieve more documents in each query and keep them in memory (instead of retrieving just one page's documents).
Note that combining localField/foreignField with pipeline in a single $lookup requires MongoDB 5.0 or later.
offices.aggregate(
[{"$lookup":
{"from":"documents",
"localField":"_id",
"foreignField":"officeId",
"pipeline":[{"$group":{"_id":null, "count":{"$sum":1}}}],
"as":"noffices"}},
{"$set":
{"noffices":
{"$cond":
[{"$eq":["$noffices", []]}, 0,
{"$arrayElemAt":["$noffices.count", 0]}]}}},
{"$sort":{"noffices":-1}}])
As the other answer pointed out, you forgot the _ in _id, but with the lookup above you don't need let or a $match with $expr inside the pipeline. Also, $sortByCount doesn't count the members of an array; it just groups by a value and counts the groups, so for an array you would need $size. You don't need $size here either, since you can count inside the pipeline as shown above.
Edit
Query
You can add whatever you need to the pipeline, or just remove it.
This version keeps all the joined documents, counts the array size,
and then sorts.
offices.aggregate(
[{"$lookup":
{"from":"documents",
"localField":"_id",
"foreignField":"officeId",
"pipeline":[],
"as":"alldocuments"}},
{"$set":{"ndocuments":{"$size":"$alldocuments"}}},
{"$sort":{"ndocuments":-1}}])
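Neither query above includes the pagination the question asks for. The following is a rough sketch of how the skip/limit stages could be appended, assuming the same collection and field names as above; buildOfficePage, page and limit are illustrative names, and the _id tie-breaker in the sort is an addition so that offices with equal counts keep a stable order between pages.

```javascript
// Sketch: builds the lookup/count/sort pipeline with pagination stages.
// buildOfficePage is a hypothetical helper; page is 1-based.
function buildOfficePage(page, limit) {
  return [
    { $lookup: {
        from: 'documents',
        localField: '_id',
        foreignField: 'officeId',
        as: 'alldocuments',
    } },
    // Offices without documents get an empty array, so the size is 0
    { $set: { ndocuments: { $size: '$alldocuments' } } },
    // Drop the joined documents once they have been counted
    { $unset: 'alldocuments' },
    // Most documents first; _id breaks ties so pages do not overlap
    { $sort: { ndocuments: -1, _id: 1 } },
    { $skip: (page - 1) * limit },
    { $limit: limit },
  ];
}

// Usage (offices is assumed to be a collection handle):
// const secondPage = await offices.aggregate(buildOfficePage(2, 10)).toArray();
```

As noted above, skip/limit gets slower as the skipped range grows, so this fits moderate collection sizes best.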

There are two errors in your lookup.
While passing the variable in with let, you forgot the _ of the _id local field (it should be id: '$_id'):
let: {
id: '$id'
},
In the $expr, since you are using the variable id and not a field of the Documents collection, you should use $$ to reference the variable:
$expr: {
$eq: ['$officeId', '$$id']
},

How to count number of false values in an array mongo db query

_id:5e4d18bd10e5482eb623c6e4
notification_obj:
0 notification_text:"Welcome to the app and your account is created hello."
open:false
type:"just_click"
1 notification_text:"Sebal started following you."
open:true
type:"open_profile"
2 notification_text:"Hella started following you."
open:false
type:"open_profile"
Here I have an array, notification_obj, in a document in a MongoDB collection. I want to find the record by _id and count how many elements of that array have open: false. Please help with a MongoDB query.

I think this code will help you.
db.getCollection('your_collection').aggregate([
{
$match: { _id: ObjectId("5a544.............") }
},
{
$unwind: '$notification_obj'
},
{
$match: { 'notification_obj.open': false }
},
{
$count: 'total'
}
]);
Output:
{
"total" : 1
}
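An alternative that avoids $unwind is to filter the array in place and take its size. This is only a sketch, assuming notification_obj is always present and is an array; buildUnopenedCountPipeline and countUnopened are illustrative names. The second function repeats the same counting logic in plain JavaScript so the expected number can be checked locally.

```javascript
// Sketch: counts array elements with open === false without $unwind.
// buildUnopenedCountPipeline is a hypothetical helper; id is the document's _id value.
function buildUnopenedCountPipeline(id) {
  return [
    { $match: { _id: id } },
    { $project: {
        _id: 0,
        total: {
          $size: {
            // Keep only the notifications whose open flag is false
            $filter: {
              input: '$notification_obj',
              as: 'n',
              cond: { $eq: ['$$n.open', false] },
            },
          },
        },
    } },
  ];
}

// Same counting logic in plain JavaScript, for checking expectations locally.
function countUnopened(notifications) {
  return notifications.filter((n) => n.open === false).length;
}
```

Unlike the $unwind and $count version, this returns a document with total: 0 when nothing matches the filter, instead of returning no document at all.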

mongodb/mongoose aggregate memory usage very big+

I have a MongoDB database into which multiple sensors dump their data once a day. Each document is in essence { sid, date, data }: a sensor ID, a date (I only use the date component), and a data array of a couple hundred values.
Now I want an overview statistic of how many sensors I have data for on each day. This aggregation works nicely while I have a few dozen documents, but once I have a couple of hundred documents the query never finishes.
function dailyStatistic(callback) {
return air
.aggregate( [
{ $match: {} },
{ $group: { _id: { date: '$date' }, myCount: { $sum: 1 } } }
])
.allowDiskUse(true);
}
air is the name of my mongoose collection.
The aggregation should really just return:
[ { date: 2017-08-07, myCount: 10 }, { date: 2017-08-08, myCount: 26 } ]
Now when I watch the machine (via glances) I get CPU_IOWAIT and MEMSWAP errors, which ultimately kill the node.js process before it gets the data.
When I check the collection in Robomongo, I can easily browse the different data points. But in Robomongo, too, this script never gets me a result:
db.getCollection('air').find({}).length()
Any ideas?
Thanks Andreas

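You probably do not have an index on date. Create one with db.getCollection('air').createIndex({date:1})
db.getCollection('air').find({}).length() loads every result just to count them.
Instead, use db.getCollection('air').count({}) (in current shells and drivers count is deprecated in favor of countDocuments({})).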

The best way to do this without crashing MongoDB is to fetch data for a date range; in your case, one day at a time.
function dailyStatistic(dateMin, dateMax, callback) {
return air
.aggregate([
// Restrict the scan to the requested date range
{ $match: { date: { $gte: dateMin, $lte: dateMax } } },
{
$project: {
sid: 1,
date: 1,
data: 1,
// $dayOfMonth is the aggregation operator for the day number ($day does not exist)
day: { $dayOfMonth: "$date" },
month: { $month: "$date" },
year: { $year: "$date" }
}
},
{ $group: { _id: { day: "$day", month: "$month", year: "$year" }, myCount: { $sum: 1 } } }
])
.allowDiskUse(true);
}
You can take this further by adding pagination when the number of records per hour or minute is also too large.
And as pagetronic suggested, create the indexes if you haven't.
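The dateMin and dateMax arguments have to come from somewhere. One possible way to compute them for a single day is sketched below, assuming the dates are stored as UTC Date values; utcDayRange is a made-up helper name.

```javascript
// Hypothetical helper: returns the first and last millisecond (UTC)
// of the day that contains `date`.
function utcDayRange(date) {
  const dateMin = new Date(Date.UTC(
    date.getUTCFullYear(),
    date.getUTCMonth(),
    date.getUTCDate()
  ));
  // One day later, minus one millisecond, so an inclusive $lte stays inside the day
  const dateMax = new Date(dateMin.getTime() + 24 * 60 * 60 * 1000 - 1);
  return { dateMin, dateMax };
}

// Usage with the function from the answer above:
// const { dateMin, dateMax } = utcDayRange(new Date());
// dailyStatistic(dateMin, dateMax);
```

If the sensors write local-time dates instead, the range would need to be computed in that time zone.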

How to find a document by its position/index in the array?

I need to retrieve let's say the documents at position 1,5 and 8 in a MongoDB database using Mongoose.
Is it possible at all to get a document by its position in a collection? If so, could you show how to do that?
I need something like this:
var someDocs = MyModel.find({<collectionIndex>: [1, 5, 8]}, function(err, docs) {
//do something with the three documents
})
I tried to do the following to see what indexes are used in collection but I get the 'getIndexes is not a function' error:
var indexes = MyModel.getIndexes();
Appreciate any help.

If by position 5, for example, you mean the literal 5th element in your collection, that isn't the best way to go. A Mongo collection is usually in the order in which you inserted the elements, but that is not guaranteed. See the answer here and check the docs on natural order: https://stackoverflow.com/a/33018164/7531267.
In your case, you might have a unique id on each record that you can query by.
Assuming the [1, 5, 8] you mentioned are the ids, something like this should do it:
var someDocs = MyModel.find({ $or: [{ id: 1 }, { id: 5 }, { id: 8 }] }, function(err, docs) {
//do something with the three documents
});
You can also read about $in to replace $or and clean up the query a bit.
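For illustration, the $in form of the same filter could look like this. It is a sketch that keeps the answer's assumption that each record has a numeric id field; buildIdFilter is a made-up helper name.

```javascript
// Sketch: the same filter written with $in instead of $or.
// buildIdFilter is a hypothetical helper; `id` is the field assumed above.
function buildIdFilter(ids) {
  return { id: { $in: ids } };
}

// Usage with the model from the question:
// const docs = await MyModel.find(buildIdFilter([1, 5, 8]));
console.log(JSON.stringify(buildIdFilter([1, 5, 8])));
```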

Assume you have this document in users collections:
{
_id: ObjectId('...'),
name: 'wrq',
friends: ['A', 'B', 'C']
}
The code below returns the first and third friend of user 'wrq':
db.users.aggregate(
[
{
$match: {
name: 'wrq'
}
},
{
$project:{
friend1: {
$arrayElemAt: ["$friends", 0]
},
friend3: {
$arrayElemAt: ["$friends", 2]
}
}
}
]
)

Combine multiple query with one single $in query and specify limit for each array field?

I am using Mongoose with node.js for MongoDB. I need to make 20 parallel find queries against my database, each with a limit of 4 documents, as shown below; only brand_id changes between brands.
areamodel.find({ brand_id: brand_id }, { '_id': 1 }, { limit: 4 }, function(err, docs) {
if (err) {
console.log(err);
} else {
console.log('fetched');
}
});
To run all these queries in parallel, I thought about putting all 20 brand_id values in an array of strings and using an $in query to get the results, but I don't know how to specify the limit of 4 for each array element that is matched.
I wrote the code below with aggregation, but I don't know where to specify the limit for each element of my array.
var brand_ids = ["brandid1", "brandid2", "brandid3", "brandid4", "brandid5", "brandid6", "brandid7", "brandid8", "brandid9", "brandid10", "brandid11", "brandid12", "brandid13", "brandid14", "brandid15", "brandid16", "brandid17", "brandid18", "brandid19", "brandid20"];
areamodel.aggregate(
{ $project: { _id: 1 } },
{ $match : { 'brand_id': { $in: brand_ids } } },
function(err, docs) {
if (err) {
console.error(err);
} else {
}
}
);
Can anyone please tell me how I can solve my problem using only one query?
UPDATE - Why I don't think $group will be helpful for me.
Suppose my brand_ids array contains these strings
brand_ids = ["id1", "id2", "id3", "id4", "id5"]
and my database have below documents
{
"brand_id": "id1",
"name": "Levis",
"loc": "india"
},
{
"brand_id": "id1",
"name": "Levis",
"loc": "america"
},
{
"brand_id": "id2",
"name": "Lee",
"loc": "india"
},
{
"brand_id": "id2",
"name": "Lee",
"loc": "america"
}
Desired JSON output
{
"name": "Levis"
},
{
"name": "Lee"
}
In the example above, suppose I have 25,000 documents with "name" equal to "Levis" and 25,000 documents with "name" equal to "Lee". If I use $group, all 50,000 documents will be queried and grouped by "name".
But with the solution I want, once the first document for "Levis" and for "Lee" is found, I don't have to look at the remaining thousands of documents.
Update - I think if any of you can tell me this, I can probably get to my solution.
Consider a case where I have 1000 documents in total in my MongoDB, and suppose 100 of those 1000 pass my match query.
If I apply a limit of 4 to this query, will it take the same time to execute as the query without any limit, or not?
Why I am thinking about this case:
If the query takes the same time, then I don't think $group will increase my time, as all documents will be queried anyway.
But if the limited query takes a different amount of time than the query without the limit, then:
If I can apply a limit of 4 to each array element, my question will be solved.
If I cannot apply a limit to each array element, then I don't think $group will be useful, as in that case I have to scan all the documents to get the results.
FINAL UPDATE - As I read in the answer below and in the MongoDB docs, using $limit does not affect the time taken by the query; it is the network bandwidth that is affected. So if any of you can tell me how to apply a limit to array fields (using $group or anything else), my problem will be solved.
mongodb: will limit() increase query speed?
Solution
Actually, my thinking about MongoDB was wrong. I thought that adding a limit to a query decreased the time it takes, but that is not the case. That is why I struggled for so many days before trying the answer that Gregory NEUT and JohnnyHK gave me. Thanks a lot, both of you; I would have found the solution on day one if I had known this. I really appreciate the help.

I suggest using the $group aggregation stage to group all the data you get from the $match by brand_id, and then limiting each group using $slice.
Look at this Stack Overflow post:
db.collection.aggregate([
{
$sort: {
created: -1
}
}, {
$group: {
_id: '$city',
title: {
$push: '$title'
}
}
}, {
$project: {
_id: 0,
city: '$_id',
mostRecentTitle: {
$slice: ['$title', 0, 2]
}
}
}
])
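Adapted to the question's fields, the same group-then-slice idea might look like the sketch below. It assumes the brand_id field and the limit of 4 from the question; buildPerBrandLimit is a made-up helper name, and the stages are an adaptation rather than the code from the linked post.

```javascript
// Sketch: at most `perBrand` document ids for each brand in `brandIds`.
// buildPerBrandLimit is a hypothetical helper, not a Mongoose API.
function buildPerBrandLimit(brandIds, perBrand) {
  return [
    // Filter first so only the requested brands are grouped
    { $match: { brand_id: { $in: brandIds } } },
    // Collect the ids of every matching document per brand
    { $group: { _id: '$brand_id', ids: { $push: '$_id' } } },
    // Keep only the first `perBrand` ids of each group
    { $project: { _id: 0, brand_id: '$_id', ids: { $slice: ['$ids', perBrand] } } },
  ];
}

// Usage with the model from the question:
// const result = await areamodel.aggregate(buildPerBrandLimit(brand_ids, 4));
```

This still reads every document that matches the $in filter before slicing, which is the cost the question is worried about; an index on brand_id keeps the $match stage itself cheap.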

I suggest using distinct, since that returns all the different brand names in your collection. (I assume this is what you are trying to achieve?)
db.runCommand({ distinct: "areamodel", key: "name" })
MongoDB docs
In Mongoose I think it is: areamodel.db.db.command({ distinct: "areamodel", key: "name" }) (untested). Mongoose models also provide a distinct helper, e.g. areamodel.distinct("name").
