Given a large collection (hundreds of thousands) of event documents (example below), what is the most performant way to retrieve the first event with an _id greater than n?
Example Document
{
_id: NumberLong(352757), // Uniqueness guaranteed
type: "BallDropped",
createdAt: "2014-01-01T00:00:00Z",
// ... followed by dynamic properties of unknown size
}
Current Implementation
Given a collection of many events, retrieve the first event with an _id greater than 35.
First, retrieve the id of the event using aggregate.
I do this assuming that the projection phase (return just the id) will be more performant than cycling over full documents of unknown size.
db.events.aggregate(
{ $project: { _id: 1 } },
{ $match: { _id: { $gt: NumberLong(35) } } },
{ $sort: { _id: 1 } },
{ $limit: 1 }
)
Then, I call findOne with the returned _id to retrieve that document.
What are your thoughts?
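One alternative worth measuring, sketched here under assumptions: `_id` always has an index, so a single range query sorted on `_id` with a limit of 1 can walk that index and fetch only the one matching document. That may make the separate projection step and the second round trip unnecessary. The helper name `firstEventAfter` and the `events` argument (assumed to be a Node.js driver collection) are illustrative and not part of the original post.

```javascript
// Sketch: fetch the first event whose _id is greater than n in one query.
// `events` is assumed to be a MongoDB Node.js driver Collection.
function firstEventAfter(events, n) {
  return events
    .find({ _id: { $gt: n } }) // range condition on the default _id index
    .sort({ _id: 1 })          // ascending, so the smallest match comes first
    .limit(1)
    .next();                   // resolves to the document, or null if none
}
```

In the mongo shell the equivalent would be db.events.find({ _id: { $gt: NumberLong(35) } }).sort({ _id: 1 }).limit(1). Running it with .explain("executionStats") should show whether the index is used as expected on your data.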
Related
I have documents like this in my MongoDB Listings collection.
listingID: 'abcd',
listingData: {
category: 'resedetial'
},
listingID: 'xyz',
listingData: {
category: 'resedetial'
},
listingID: 'efgh',
listingData: {
category: 'office'
}
I am trying to get the total count of all listings and the count per category.
I can get the total count of listings with an aggregation query, but I am not sure how to get output like this: resedentialCount: 2, officeCount: 1, ListingsCount: 3
This is my aggregation query
{
$match: {
listingID,
},
},
{
$group: {
_id: 1,
ListingsCount: { $sum: 1 },
},
}
Try this:
let listingAggregationCursor = db.collection.aggregate([
{$group: {_id:"$listingData.category",ListingsCount:{$sum:1} }}
])
let listingAggregation=await listingAggregationCursor.toArray();
(I got this query from https://www.statology.org/mongodb-group-by-count)
This will give you an array of objects with each listing category as well as how many times they occur.
To get the total listingsCount, sum the ListingsCount field across the array of objects. You can do that like this:
let listingsCount = 0;
for (const listingCategory of listingAggregation) {
listingsCount += listingCategory.ListingsCount;
}
You should have the data you need at this point. Now it's just a matter of extracting and formatting it as you see fit.
Hope this helps!
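If you want the flat shape from the question, the grouped rows can be folded into one object in application code. This is only a sketch: `summarizeListingCounts` is a made-up helper name, and it assumes each row looks like { _id: <category>, ListingsCount: <number> } as produced by the $group stage above. The per-category key is built from whatever spelling is stored in listingData.category.

```javascript
// Sketch: turn the $group output into a single summary object.
// Each row is assumed to look like { _id: <category>, ListingsCount: <number> }.
function summarizeListingCounts(rows) {
  const summary = { ListingsCount: 0 };
  for (const row of rows) {
    summary[`${row._id}Count`] = row.ListingsCount; // e.g. officeCount
    summary.ListingsCount += row.ListingsCount;     // running total
  }
  return summary;
}

const example = summarizeListingCounts([
  { _id: "resedetial", ListingsCount: 2 },
  { _id: "office", ListingsCount: 1 },
]);
console.log(example);
// { ListingsCount: 3, resedetialCount: 2, officeCount: 1 }
```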
I have 2 collections:
Office -
{
_id: ObjectId(someOfficeId),
name: "some name",
..other fields
}
Documents -
{
_id: ObjectId(SomeId),
name: "Some document name",
officeId: ObjectId(someOfficeId),
...etc
}
I need to get a list of offices sorted by the number of documents that refer to each office. Pagination should also be supported.
I tried to do this with an aggregation using $lookup:
const aggregation = [
{
$lookup: {
from: 'documents',
let: {
id: '$id'
},
pipeline: [
{
$match: {
$expr: {
$eq: ['$officeId', '$id']
},
// sent_at: {
// $gte: start,
// $lt: end,
// },
}
}
],
as: 'documents'
},
},
{ $sortByCount: "$documents" },
{ $skip: (page - 1) * limit },
{ $limit: limit },
];
But this doesn't work for me.
Any ideas how to implement this?
P.S. I need to show offices with 0 documents, so starting from the documents collection and grouping by office doesn't work for me.
Query
You can use $lookup to join on that field, with a pipeline that groups the matching documents so that you get a count for each office (instead of collecting the documents into an array, since you only care about the count).
$set moves that count to a top-level field.
Sort on the noffices field.
You can paginate with skip/limit, but on a very large collection that will be slow. Alternatively, you can paginate on the natural _id order, or retrieve more documents per query and keep them in memory (instead of retrieving just one page's documents).
offices.aggregate(
[{"$lookup":
{"from":"documents",
"localField":"_id",
"foreignField":"officeId",
"pipeline":[{"$group":{"_id":null, "count":{"$sum":1}}}],
"as":"noffices"}},
{"$set":
{"noffices":
{"$cond":
[{"$eq":["$noffices", []]}, 0,
{"$arrayElemAt":["$noffices.count", 0]}]}}},
{"$sort":{"noffices":-1}}])
As the other answer pointed out, you forgot the _ in _id, but with the lookup above you don't need let or a $match with $expr inside the pipeline at all. Also, $sortByCount does not count the members of an array (it is just a group-and-count over documents); for arrays you would need $size. You don't need $size here either, because the count is computed inside the pipeline, as above.
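For completeness, here is a sketch of the same pipeline with the skip/limit pagination from the question appended. The function name `buildOfficePagePipeline` is made up, `page` is assumed to be 1-based, and the extra _id tie-breaker in $sort is my addition so that offices with equal counts come back in the same order on every page. Note that combining localField/foreignField with pipeline in $lookup needs MongoDB 5.0 or newer.

```javascript
// Sketch: count documents per office, sort by that count, then paginate.
// `page` is assumed to be 1-based; `limit` is the page size.
function buildOfficePagePipeline(page, limit) {
  return [
    {
      $lookup: {
        from: "documents",
        localField: "_id",
        foreignField: "officeId",
        // Reduce the joined documents to a single { count } element.
        pipeline: [{ $group: { _id: null, count: { $sum: 1 } } }],
        as: "noffices",
      },
    },
    {
      // Offices without documents get an empty array, which becomes 0.
      $set: {
        noffices: {
          $cond: [
            { $eq: ["$noffices", []] },
            0,
            { $arrayElemAt: ["$noffices.count", 0] },
          ],
        },
      },
    },
    { $sort: { noffices: -1, _id: 1 } }, // _id breaks ties deterministically
    { $skip: (page - 1) * limit },
    { $limit: limit },
  ];
}
```

You would pass the result straight to offices.aggregate(buildOfficePagePipeline(page, limit)).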
Edit
Query
you can add in the pipeline what you need or just remove it
this keeps all documents, and counts the array size
and then sorts
offices.aggregate(
[{"$lookup":
{"from":"documents",
"localField":"_id",
"foreignField":"officeId",
"pipeline":[],
"as":"alldocuments"}},
{"$set":{"ndocuments":{"$size":"$alldocuments"}}},
{"$sort":{"ndocuments":-1}}])
There are two errors in your lookup.
When passing the variable in with let, you forgot the _ in the $_id local field:
let: {
id: '$id'
},
In the $expr, since you are using the variable id and not a field of the Documents collection, you should use $$ to reference the variable:
$expr: {
$eq: ['$officeId', '$$id']
},
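Putting both fixes together, the $lookup stage would look roughly like the sketch below. The wrapper function `buildOfficeLookupStage` is just an illustrative name; the stage itself uses the field names from the question. Since $lookup behaves like a left outer join, offices without any matching documents are kept with an empty documents array.

```javascript
// Sketch: the $lookup stage from the question with both corrections applied.
function buildOfficeLookupStage() {
  return {
    $lookup: {
      from: "documents",
      let: { id: "$_id" }, // fix 1: reference the office's _id field
      pipeline: [
        {
          $match: {
            // fix 2: $$id refers to the variable declared in `let`
            $expr: { $eq: ["$officeId", "$$id"] },
          },
        },
      ],
      as: "documents",
    },
  };
}
```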
I have 'n' number of documents present inside a collection in MongoDB.
Structure of those documents is as follows:
{
"_id": "...",
"submissions": [{...}, ...]
}
I want to find the document which has the highest number of submissions out of all the documents present.
Is there any Mongo find/aggregation query which can do the same?
I don't think there is a straightforward way to achieve this, but you can try the aggregation query below:
$addFields to add new field totalSubmissions to get total elements in submissions array
$sort by totalSubmissions in descending order
$limit to select single document
collection.aggregate([
{ $addFields: { totalSubmissions: { $size: "$submissions" } } },
{ $sort: { totalSubmissions: -1 } },
{ $limit: 1 }
])
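One caveat: $size raises an error when the field is missing or null, so if some documents might not have a submissions array, a defensive variant could default it to an empty array first. This is a sketch, and `buildMostSubmissionsPipeline` is a hypothetical helper name.

```javascript
// Sketch: same pipeline, but tolerant of documents without `submissions`.
function buildMostSubmissionsPipeline() {
  return [
    {
      $addFields: {
        // Treat a missing or null submissions field as an empty array.
        totalSubmissions: { $size: { $ifNull: ["$submissions", []] } },
      },
    },
    { $sort: { totalSubmissions: -1 } }, // largest array first
    { $limit: 1 },                       // keep only the top document
  ];
}
```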
How do we combine distinct with a filter that selects only documents where a field is not equal to a specified value, in a MongoDB query from Node.js (Keystone framework), or just in plain MongoDB? I tried the syntax {field: {$ne: value}} and received the error "field selection and slice cannot be used with distinct". Also, how can we apply a limit, given the error "limit cannot be used with distinct"? Any ideas or solutions?
query
keystone.list('Customer').model.find({ customer_id: { $in: locals.data.customers } }, { vin: { $ne: vin } }).distinct('vin').limit(4) ....
You can add a query to distinct but not skip and limit
https://docs.mongodb.com/manual/reference/method/db.collection.distinct/#specify-query-with-distinct
Instead, you can use an aggregation pipeline:
db.customer.aggregate([
{ $match: { customer_id: { $in: locals.data.customers } } },
{ $group: { _id: "$vin" } },
{ $skip: skip },
{ $limit: limit },
{ $group: { _id: null, vin: { $push: "$_id" } } }
]);
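The pipeline above does not yet exclude the vin from the question. A sketch that also applies the $ne condition could look like this. The function name and parameters are made up for illustration, and the $sort stage is my addition so that skip/limit page through the distinct values in a stable order.

```javascript
// Sketch: distinct vin values for the given customers, excluding one vin,
// with skip/limit applied to the distinct values.
function buildDistinctVinPipeline(customerIds, excludedVin, skip, limit) {
  return [
    {
      $match: {
        customer_id: { $in: customerIds },
        vin: { $ne: excludedVin }, // the "not equal" filter goes in the query
      },
    },
    { $group: { _id: "$vin" } },   // one row per distinct vin
    { $sort: { _id: 1 } },         // stable order before paging
    { $skip: skip },
    { $limit: limit },
    { $group: { _id: null, vin: { $push: "$_id" } } }, // collect into one array
  ];
}
```

With Mongoose (which Keystone models are built on) this might be called as keystone.list('Customer').model.aggregate(buildDistinctVinPipeline(locals.data.customers, vin, 0, 4)).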
I am trying to create a historical record for updates to a document in Mongo DB via NodeJS. The document updates are only in one object within the document, so it seems like creating an array of historical values makes sense.
However, when I use the $push function with db.collection.update(), it only updates the array at the 0 index rather than add to the array.
Here is what I have:
{
_id: ID,
odds: {
spread: CURRENTSPREAD,
total: CURRENTTOTAL,
history: [
0: {
spread: PREVIOUSSPREAD1,
total: PREVIOUSTOTAL1,
date: DATEENTERED
}
]
}
}
Here is what I would like:
{
_id: ID,
odds: {
spread: CURRENTSPREAD,
total: CURRENTTOTAL,
history: [
0: {
spread: PREVIOUSSPREAD1,
total: PREVIOUSTOTAL1,
date: DATEENTERED1
},
1: {
spread: PREVIOUSSPREAD2,
total: PREVIOUSTOTAL2,
date: DATEENTERED2
},
...,
n: {
spread: PREVIOUSSPREAD-N,
total: PREVIOUSTOTAL-N,
date: DATEENTERED-N
}
]
}
}
There is no need to check whether the previous value exists before adding.
Here is my code:
var oddsHistoryUpdate = {
$push: {
'odds.history': {
spread: game.odds.spread,
total: game.odds.total,
date: Date.now()
}
}
}
db.collection('games').update({"_id": ID}, oddsHistoryUpdate)
.then(finish executing)
Why is it only pushing to the 0 index instead of adding to the array? How do I fix?
Bigga_HD's answer is the correct one regarding the $push operator. However, there may be an alternative solution that is more aligned to how MongoDB works under the hood.
A single document in MongoDB has a hard limit of 16MB, and if a document is frequently updated, it is possible that the array grows so large that it hits this limit.
Alternatively, you can just insert a new document into the collection instead of pushing the old document inside an array. The new & old documents can be differentiated by their insertion date. For example:
{
_id: ID,
name: <some identification>,
insert_date: ISODate(...),
odds: {
spread: CURRENTSPREAD,
total: CURRENTTOTAL
}
}
You can then query the collection using a combination of e.g. its name and insert_date, sorted by its date descending, and limit by 1 to get the latest version:
db.collection.find({name: ...}).sort({insert_date: -1}).limit(1)
or remove the limit to find all versions:
db.collection.find({name: ...}).sort({insert_date: -1})
To support this query, you can create an index based on name and insert_date in descending order (see Create Indexes to Support Your Queries)
db.collection.createIndex({name: 1, insert_date: -1})
As a bonus, you can use a TTL index on the insert_date field to automatically delete old document versions.
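As a rough sketch of the two indexes, assuming the Node.js driver: TTL indexes have to be single-field, so the TTL goes on its own insert_date index next to the compound one. `versionIndexSpecs` and `ttlSeconds` are illustrative names. Keep in mind that insert_date must be stored as a BSON date for TTL to work, and that TTL removes every document older than the threshold, including the latest version of an entry that has not changed recently.

```javascript
// Sketch: index definitions for the "one document per version" layout.
// `ttlSeconds` is an assumed retention period chosen by the application.
function versionIndexSpecs(ttlSeconds) {
  return [
    // Supports find({ name }).sort({ insert_date: -1 })
    { keys: { name: 1, insert_date: -1 }, options: {} },
    // TTL index: must be single-field and on a date field
    { keys: { insert_date: 1 }, options: { expireAfterSeconds: ttlSeconds } },
  ];
}
```

Each entry can then be applied with collection.createIndex(spec.keys, spec.options).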
$push
The $push operator appends a specified value to an array.
The $push operator has the form:
{ $push: { <field1>: <value1>, ... } }
If the field is absent in the document to update, $push adds the array field with the value as its element.
If the field is not an array, the operation will fail.
If the value is an array, $push appends the whole array as a single element. To add each element of the value separately, use the $each modifier with $push.
$each: appends multiple values to the array field.
This should do the trick for you. Obviously, it's a very simplified example.
{ $push: { <field1>: { <modifier1>: <value1>, ... }, ... } }
let oddsHistoryUpdate = {
spread: game.odds.spread,
total: game.odds.total,
date: Date.now()
}
db.games.update(
{ _id: ID },
{ $push: { "odds.history": oddsHistoryUpdate } }
)
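To illustrate the modifier form, here is a sketch that appends several entries at once with $each and caps the array with $slice, which may help keep the document well under the size limit. `buildHistoryPush` is a hypothetical helper, and `maxEntries` is assumed to be a positive integer.

```javascript
// Sketch: append several history entries and keep only the newest ones.
// `maxEntries` is assumed to be a positive integer.
function buildHistoryPush(entries, maxEntries) {
  return {
    $push: {
      "odds.history": {
        $each: entries,      // append every entry as its own element
        $slice: -maxEntries, // negative value keeps the last N elements
      },
    },
  };
}
```

It could be used as db.games.update({ _id: ID }, buildHistoryPush([oddsHistoryUpdate], 100)).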
I suggest trying Mongoose for your Node.js and MongoDB interactions.
The answer was uncovered by dnickless.
In a previous call, I updated the whole odds object, which I didn't realize was wiping out the history array.
Updating the previous call from
update($set: {odds: { spread: SPREAD, total: TOTAL }})
to
update($set: {"odds.spread": SPREAD, "odds.total": TOTAL})
and then making my $push call as written, all works fine.
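Since $set on odds.spread and odds.total and $push on odds.history touch different paths, the two calls could probably be merged into one update. A sketch, where `buildOddsUpdate` is a made-up helper and the choice to archive the previous odds (rather than the new ones) is an assumption based on the document layout in the question:

```javascript
// Sketch: set the current odds and archive the previous ones in one update.
// Assumes `previousOdds` holds the values being replaced.
function buildOddsUpdate(previousOdds, newOdds, now = new Date()) {
  return {
    // Dotted paths change only these fields and leave odds.history intact.
    $set: { "odds.spread": newOdds.spread, "odds.total": newOdds.total },
    $push: {
      "odds.history": {
        spread: previousOdds.spread,
        total: previousOdds.total,
        date: now,
      },
    },
  };
}
```

For example: db.collection('games').updateOne({ _id: ID }, buildOddsUpdate(game.odds, { spread: SPREAD, total: TOTAL })).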