I am trying to query an embedded subdocument and then return only one array from that subdocument via projection. After a query you can select the fields you want returned via projection. I want to use the native functionality because it is possible and is the cleanest way. The problem is that the result contains arrays from two subdocuments.
I tried different query and projection options, without success.
User model
// Define station schema
const stationSchema = new mongoose.Schema({
mac: String,
stationName: String,
syncReadings: Boolean,
temperature: Array,
humidity: Array,
measures: [{
date: Date,
temperature: Number,
humidity: Number
}],
lastUpdated: Date
});
// Define user schema
var userSchema = mongoose.Schema({
apiKey: String,
stations : [stationSchema]
}, {
usePushEach: true
}
);
api call
app.get('/api/stations/:stationName/measures',function(req, res, next) {
var user = {
apiKey: req.user.apiKey
}
const query = {
apiKey: user.apiKey,
'stations.stationName': req.params.stationName
}
const options = {
'stations.$.measures': 1,
}
User.findOne(query, options)
.exec()
.then(stations => {
res.status(200).send(stations)
})
.catch(err => {
console.log(err);
res.status(400).send(err);
})
});
Expected result
{
"_id": "5c39c99356bbf002fb092ce9",
"stations": [
{
"stationName": "livingroom",
"measures": [
{
"humidity": 60,
"temperature": 20,
"date": "2019-01-12T22:49:45.468Z",
"_id": "5c3a6f09fd357611f8d078a0"
},
{
"humidity": 60,
"temperature": 20,
"date": "2019-01-12T22:49:46.500Z",
"_id": "5c3a6f0afd357611f8d078a1"
},
{
"humidity": 60,
"temperature": 20,
"date": "2019-01-12T22:49:47.041Z",
"_id": "5c3a6f0bfd357611f8d078a2"
}
]
}
]
}
Actual result
{
"_id": "5c39c99356bbf002fb092ce9",
"stations": [
{
"stationName": "livingroom",
"measures": [
{
"humidity": 60,
"temperature": 20,
"date": "2019-01-12T22:49:45.468Z",
"_id": "5c3a6f09fd357611f8d078a0"
},
{
"humidity": 60,
"temperature": 20,
"date": "2019-01-12T22:49:46.500Z",
"_id": "5c3a6f0afd357611f8d078a1"
},
{
"humidity": 60,
"temperature": 20,
"date": "2019-01-12T22:49:47.041Z",
"_id": "5c3a6f0bfd357611f8d078a2"
}
]
},
******************************************************
// this whole object should not be returned
{
"stationName": "office",
"measures": []
}
******************************************************
]
}
edit
The answer below with aggregation works, but I still find it odd that I would need so much code. After my normal query I get the same result with ".stations[0].measures" instead of the whole aggregation pipeline:
.then(stations => {
res.status(200).send(stations.stations[0].measures)
})
The way I read the code, the above does exactly the same as:
const options = {'stations.$.measures': 1}
Where the dollar sign puts in the index 0 as that was the index of the station that matches the query part: stationName: "livingroom"
Can someone explain?
This is not written in terms of mongoose, but it will find a particular station name in an array of stations in one or more docs and return only the measures array:
db.foo.aggregate([
// First, find the docs we are looking for:
{$match: {"stations.stationName": "livingroom"}}
// Got the doc; now need to fish out ONLY the desired station. The filter
// will return an array so use arrayElemAt 0 to extract the object at offset 0.
// Call this intermediate qqq:
,{$project: { qqq:
{$arrayElemAt: [
{ $filter: {
input: "$stations",
as: "z",
cond: { $eq: [ "$$z.stationName", "livingroom" ] }
}}, 0]
}
}}
// Lastly, just project measures and not _id from this object:
,{$project: { _id:0, measures: "$qqq.measures" }}
]);
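For readers more comfortable with plain JavaScript, the three pipeline stages above can be modelled on in-memory data roughly as follows. This is only a sketch of the logic, not how the server executes it; `measuresForStation` and the sample documents are made up for illustration.

```javascript
// Plain-JS model of the pipeline: $match, then $filter + $arrayElemAt, then $project.
// measuresForStation and sampleDocs are hypothetical, for illustration only.
function measuresForStation(docs, stationName) {
  return docs
    // $match: keep docs that have at least one station with this name
    .filter(doc => doc.stations.some(s => s.stationName === stationName))
    // $filter + $arrayElemAt 0: pick the first matching station ("qqq")
    .map(doc => doc.stations.filter(s => s.stationName === stationName)[0])
    // final $project: keep only the measures array, drop _id
    .map(qqq => ({ measures: qqq.measures }));
}

const sampleDocs = [
  {
    _id: 1,
    stations: [
      { stationName: 'livingroom', measures: [{ temperature: 20, humidity: 60 }] },
      { stationName: 'office', measures: [] }
    ]
  },
  { _id: 2, stations: [{ stationName: 'garage', measures: [] }] }
];

console.log(JSON.stringify(measuresForStation(sampleDocs, 'livingroom')));
```

Only the first document survives the match step, and only its "livingroom" measures are returned.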
$elemMatch operator limits the contents of an array field from the query results to contain only the first element matching the $elemMatch condition.
Try $elemMatch in the projection, as below:
const query = {
apiKey: user.apiKey,
'stations.stationName': req.params.stationName
}
const options = {
'stations' : {$elemMatch: { 'stationName' : req.params.stationName }}
}
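As a rough mental model, the effect of that `$elemMatch` projection on a single document can be written in plain JavaScript. This is a sketch only; `projectFirstStation` is a hypothetical helper, not part of Mongoose or MongoDB, and it assumes the default behaviour where `_id` is kept.

```javascript
// Plain-JS model of an $elemMatch projection on `stations`.
// projectFirstStation is a hypothetical helper, not a Mongoose or MongoDB API.
function projectFirstStation(doc, stationName) {
  // Only the FIRST element that satisfies the condition is kept
  const first = doc.stations.find(s => s.stationName === stationName);
  // _id is included by default; the array field is omitted when nothing matches
  return first === undefined
    ? { _id: doc._id }
    : { _id: doc._id, stations: [first] };
}

const userDoc = {
  _id: '5c39c99356bbf002fb092ce9',
  apiKey: 'secret',
  stations: [
    { stationName: 'livingroom', measures: [{ temperature: 20, humidity: 60 }] },
    { stationName: 'office', measures: [] }
  ]
};

console.log(JSON.stringify(projectFirstStation(userDoc, 'livingroom')));
```

Note that the whole matching station is returned, so the measures array still has to be read from `stations[0].measures` in application code.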
I have documents that look something like this, with a unique index on bars.name:
{ name: 'foo', bars: [ { name: 'qux', somefield: 1 } ] }
I want to either update the sub-document where { name: 'foo', 'bars.name': 'qux' } with $set: { 'bars.$.somefield': 2 }, or create a new sub-document { name: 'qux', somefield: 2 } under { name: 'foo' }.
Is it possible to do this using a single query with upsert, or will I have to issue two separate ones?
Related: 'upsert' in an embedded document (suggests to change the schema to have the sub-document identifier as the key, but this is from two years ago and I'm wondering if there are better solutions now.)
No, there isn't really a better solution to this, so perhaps an explanation will help.
Suppose you have a document in place that has the structure as you show:
{
"name": "foo",
"bars": [{
"name": "qux",
"somefield": 1
}]
}
If you do an update like this
db.foo.update(
{ "name": "foo", "bars.name": "qux" },
{ "$set": { "bars.$.somefield": 2 } },
{ "upsert": true }
)
Then all is fine because matching document was found. But if you change the value of "bars.name":
db.foo.update(
{ "name": "foo", "bars.name": "xyz" },
{ "$set": { "bars.$.somefield": 2 } },
{ "upsert": true }
)
Then you will get a failure. The only thing that has really changed here is that in MongoDB 2.6 and above the error is a little more succinct:
WriteResult({
"nMatched" : 0,
"nUpserted" : 0,
"nModified" : 0,
"writeError" : {
"code" : 16836,
"errmsg" : "The positional operator did not find the match needed from the query. Unexpanded update: bars.$.somefield"
}
})
That is better in some ways, but you really do not want to "upsert" anyway. What you want to do is add the element to the array where the "name" does not currently exist.
So what you really want is the "result" from the update attempt without the "upsert" flag to see if any documents were affected:
db.foo.update(
{ "name": "foo", "bars.name": "xyz" },
{ "$set": { "bars.$.somefield": 2 } }
)
Yielding in response:
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })
So when the modified documents are 0 then you know you want to issue the following update:
db.foo.update(
  { "name": "foo" },
  { "$push": { "bars": {
    "name": "xyz",
    "somefield": 2
  }}}
)
There really is no other way to do exactly what you want. As the additions to the array are not strictly a "set" type of operation, you cannot use $addToSet combined with the "bulk update" functionality there, so that you can "cascade" your update requests.
In this case it seems like you need to check the result, or otherwise accept reading the whole document and checking whether to update or insert a new array element in code.
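The check-then-push flow described above can be modelled on an in-memory document like this. It is a sketch of the decision logic only; `setOrPushBar` is a hypothetical name, and against a real collection the two steps would be two separate update calls with the write result inspected in between.

```javascript
// In-memory model of the two-step approach: positional $set first, $push as fallback.
// setOrPushBar is hypothetical; with a real collection these are two update calls.
function setOrPushBar(doc, barName, somefield) {
  // Step 1: models the query { "bars.name": barName }
  // with the update { $set: { "bars.$.somefield": somefield } }
  const index = doc.bars.findIndex(bar => bar.name === barName);
  const nMatched = index === -1 ? 0 : 1;
  if (nMatched === 1) {
    doc.bars[index].somefield = somefield;
    return 'updated';
  }
  // Step 2: nothing matched, so model { $push: { bars: { ... } } }
  doc.bars.push({ name: barName, somefield: somefield });
  return 'pushed';
}

const fooDoc = { name: 'foo', bars: [{ name: 'qux', somefield: 1 }] };
const firstOutcome = setOrPushBar(fooDoc, 'qux', 2);  // existing element is modified
const secondOutcome = setOrPushBar(fooDoc, 'xyz', 2); // new element is appended
console.log(firstOutcome, secondOutcome, JSON.stringify(fooDoc));
```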
If you don't mind changing the schema a bit and having a structure like this:
{
  "name": "foo",
  "bars": {
    "qux": { "somefield": 1 },
    "xyz": { "somefield": 2 }
  }
}
You can perform your operations in one go.
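With the keyed structure, the single operation is a `$set` on a dotted path. A minimal sketch of building that update document follows; `keyedBarUpdate` is a hypothetical helper name, and it assumes bar names contain no dots.

```javascript
// Builds the update document for the keyed-object schema.
// keyedBarUpdate is a hypothetical helper name; bar names must not contain dots.
function keyedBarUpdate(barName, somefield) {
  // A dotted $set path creates bars.<barName> if it is missing
  // and overwrites somefield if it already exists
  return { $set: { ['bars.' + barName + '.somefield']: somefield } };
}

console.log(JSON.stringify(keyedBarUpdate('xyz', 2)));
// Possible use in the shell:
//   db.foo.update({ name: 'foo' }, { $set: { 'bars.xyz.somefield': 2 } }, { upsert: true })
```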
Reiterating 'upsert' in an embedded document for completeness
I was digging for the same feature, and found that in version 4.2 or above, MongoDB provides a new feature called Update with aggregation pipeline.
This feature, if used with some other techniques, makes it possible to achieve an upsert subdocument operation with a single query.
It's a very verbose query, but I believe if you know that you won't have too many records on the subCollection, it's viable. Here's an example on how to achieve this:
const documentQuery = { _id: '123' }
const subDocumentToUpsert = { name: 'xyz', id: '1' }
collection.update(documentQuery, [
{
$set: {
sub_documents: {
$cond: {
if: { $not: ['$sub_documents'] },
then: [subDocumentToUpsert],
else: {
$cond: {
if: { $in: [subDocumentToUpsert.id, '$sub_documents.id'] },
then: {
$map: {
input: '$sub_documents',
as: 'sub_document',
in: {
$cond: {
if: { $eq: ['$$sub_document.id', subDocumentToUpsert.id] },
then: subDocumentToUpsert,
else: '$$sub_document',
},
},
},
},
else: { $concatArrays: ['$sub_documents', [subDocumentToUpsert]] },
},
},
},
},
},
},
])
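The nested `$cond` expressions are easier to follow when written as ordinary branching. The following plain-JavaScript sketch mirrors the same three cases on an in-memory array; `upsertSubDocument` is a hypothetical name, and the function returns a new array rather than touching a database.

```javascript
// Plain-JS model of the $cond / $map / $concatArrays logic in the pipeline.
// upsertSubDocument is a hypothetical name; it returns a new array.
function upsertSubDocument(subDocuments, toUpsert) {
  // Outer $cond: field missing or null, so start a new array
  if (subDocuments === undefined || subDocuments === null) {
    return [toUpsert];
  }
  // Inner $cond with $in: is there already an element with this id?
  const exists = subDocuments.some(sub => sub.id === toUpsert.id);
  if (exists) {
    // $map: replace the matching element, keep the others
    return subDocuments.map(sub => (sub.id === toUpsert.id ? toUpsert : sub));
  }
  // $concatArrays: append the new element
  return subDocuments.concat([toUpsert]);
}

const existingSubs = [{ id: '1', name: 'abc' }, { id: '2', name: 'def' }];
console.log(JSON.stringify(upsertSubDocument(existingSubs, { id: '1', name: 'xyz' })));
console.log(JSON.stringify(upsertSubDocument(existingSubs, { id: '3', name: 'new' })));
console.log(JSON.stringify(upsertSubDocument(undefined, { id: '1', name: 'xyz' })));
```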
There is a way to do it in two queries that still works inside a bulkWrite.
This is relevant because in my case not being able to batch it is the biggest hangup. With this solution, you don't need to collect the result of the first query, which allows you to do bulk operations if you need to.
Here are the two successive queries to run for your example:
// Update subdocument if existing
collection.updateMany({
name: 'foo', 'bars.name': 'qux'
}, {
$set: {
'bars.$.somefield': 2
}
})
// Insert subdocument otherwise
collection.updateMany({
name: 'foo', 'bars.name': { $ne: 'qux' }
}, {
$push: {
bars: {
somefield: 2, name: 'qux'
}
}
})
This also has the added benefit of not having corrupted data / race conditions if multiple applications are writing to the database concurrently. You won't risk ending up with two bars: {somefield: 2, name: 'qux'} subdocuments in your document if two applications run the same queries at the same time.
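To batch the pair, the two updates can be expressed as `bulkWrite` operations. The sketch below only builds the operations array; `buildBarUpsertOps` is a hypothetical helper, and the "not present" condition is written with `$ne` on `bars.name`.

```javascript
// Builds the two operations in the shape bulkWrite expects.
// buildBarUpsertOps is a hypothetical helper; field names mirror the example above.
function buildBarUpsertOps(docName, bar) {
  return [
    // Update the subdocument if it exists
    { updateMany: {
      filter: { name: docName, 'bars.name': bar.name },
      update: { $set: { 'bars.$.somefield': bar.somefield } }
    } },
    // Otherwise push it; $ne on an array field matches only when no element has that name
    { updateMany: {
      filter: { name: docName, 'bars.name': { $ne: bar.name } },
      update: { $push: { bars: bar } }
    } }
  ];
}

const barOps = buildBarUpsertOps('foo', { name: 'qux', somefield: 2 });
console.log(JSON.stringify(barOps, null, 2));
// Possible use: collection.bulkWrite(barOps)
```

Operations in a `bulkWrite` run in order by default, which is what the second filter relies on.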
I have the attached document structure. I need to retrieve the document with only some of its fields.
For example, I need the data to look like this:
{
"_id": "57f36d71fb1ef61bd84f866b",
"testMaxScore": 235,
"testMaxTime": 60,
"inviteId": "57f0a97d11594560c02a8f43",
"testName": "Sr. Interactive Developer l1",
"sectionList": [
{
"sectionName": "Java MCQ",
"sectionInfo": "Some info",
"questionList": [
{
"_id": "57ea3d003f2ec2cbbe98bbb9",
"question": ""
},
{
"_id": "57ea3d003f2ec2cbbe98bbb9",
"question": ""
}
]
}
]
}
How can I achieve this?
I am using mongoose.
Can anyone help me with this?
Thanks,
Kiran
This is possible through the aggregation framework. Consider running an aggregation operation with a single-stage pipeline that uses the $project operator to project just the fields you want.
In the above example, you would run it as
Model.aggregate([
{
"$project": {
"testMaxScore": 1,
"testMaxTime": 1,
"inviteId": 1,
"testName": 1,
"sectionList.sectionName" : 1,
"sectionList.sectionInfo" : 1,
"sectionList.questionList._id": 1,
"sectionList.questionList.question": 1
}
}
]).exec(function(err, result){
console.log(result);
})
or using the find() method:
Model.find(
{ },
{
"testMaxScore": 1,
"testMaxTime": 1,
"inviteId": 1,
"testName": 1,
"sectionList.sectionName" : 1,
"sectionList.sectionInfo" : 1,
"sectionList.questionList._id": 1,
"sectionList.questionList.question": 1
}
).exec(function(err, result){
console.log(result);
})
Try to use the following find expression:
yourSchema.find({}).select('testName inviteId sectionList.sectionName'); // and so on
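The object form and the string form of the projection carry the same information, so both can be generated from one list of paths. A small sketch, where `toProjection` and `wantedPaths` are hypothetical names:

```javascript
// Turns a list of dotted paths into an inclusion projection object.
// toProjection and wantedPaths are hypothetical names for illustration.
function toProjection(paths) {
  const projection = {};
  for (const path of paths) {
    projection[path] = 1; // 1 means "include this field"
  }
  return projection;
}

const wantedPaths = [
  'testMaxScore',
  'testMaxTime',
  'inviteId',
  'testName',
  'sectionList.sectionName',
  'sectionList.sectionInfo',
  'sectionList.questionList._id',
  'sectionList.questionList.question'
];

const projectionObject = toProjection(wantedPaths); // for find() or $project
const selectString = wantedPaths.join(' ');         // space-delimited form for select()
console.log(JSON.stringify(projectionObject));
console.log(selectString);
```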
I have a Comments collection in Mongoose, and a query that returns the most recent five (an arbitrary number) Comments.
Every Comment is associated with another document. What I would like to do is make a query that returns the most recent 5 comments, with comments associated with the same other document combined.
So instead of a list like this:
results = [
{ _id: 123, associated: 12 },
{ _id: 122, associated: 8 },
{ _id: 121, associated: 12 },
{ _id: 120, associated: 12 },
{ _id: 119, associated: 17 }
]
I'd like to return a list like this:
results = [
{ _id: 124, associated: 3 },
{ _id: 125, associated: 19 },
[
{ _id: 123, associated: 12 },
{ _id: 121, associated: 12 },
{ _id: 120, associated: 12 },
],
{ _id: 122, associated: 8 },
{ _id: 119, associated: 17 }
]
Please don't worry too much about the data format: it's just a sketch to try to show the sort of thing I want. I want a result set of a specified size, but with some results grouped according to some criterion.
Obviously one way to do this would be to just make the query, crawl and modify the results, then recursively make the query again until the result set is as long as desired. That way seems awkward. Is there a better way to go about this? I'm having trouble phrasing it in a Google search in a way that gets me anywhere near anyone who might have insight.
Here's an aggregation pipeline query that will do what you are asking for:
db.comments.aggregate([
{ $group: { _id: "$associated", maxID: { $max: "$_id"}, cohorts: { $push: "$$ROOT"}}},
{ $sort: { "maxID": -1 } },
{ $limit: 5 }
])
Lacking any other fields from the sample data to sort by, I used $_id.
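To see what the three stages produce, here is a plain-JavaScript model run against the sample list from the question. It is a sketch only; `latestGroups` is a hypothetical name, and it assumes numeric `_id` values as in the sample.

```javascript
// Plain-JS model of $group (by associated) + $sort (maxID desc) + $limit.
// latestGroups is a hypothetical name; assumes numeric _id values as in the sample.
function latestGroups(comments, limit) {
  const groups = new Map();
  for (const comment of comments) {
    if (!groups.has(comment.associated)) {
      groups.set(comment.associated, { _id: comment.associated, maxID: comment._id, cohorts: [] });
    }
    const group = groups.get(comment.associated);
    group.maxID = Math.max(group.maxID, comment._id); // $max: "$_id"
    group.cohorts.push(comment);                      // $push: "$$ROOT"
  }
  return Array.from(groups.values())
    .sort((a, b) => b.maxID - a.maxID) // $sort: { maxID: -1 }
    .slice(0, limit);                  // $limit
}

const sampleComments = [
  { _id: 123, associated: 12 },
  { _id: 122, associated: 8 },
  { _id: 121, associated: 12 },
  { _id: 120, associated: 12 },
  { _id: 119, associated: 17 }
];

console.log(JSON.stringify(latestGroups(sampleComments, 5), null, 2));
```

With only five sample comments there are three groups, so the limit of 5 is not reached here.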
If you'd like results that are a little closer in structure to the sample result set you provided you could add a $project to the end:
db.comments.aggregate([
{ $group: { _id: "$associated", maxID: { $max: "$_id"}, cohorts: { $push: "$$ROOT"}}},
{ $sort: { "maxID": -1 } },
{ $limit: 5 },
{ $project: { _id: 0, cohorts: 1 }}
])
That will print only the result set. Note that even comments that do not share an association object will be in an array; it will be an array of length 1.
If you are concerned about limiting the results in the grouping as Neil Lunn is suggesting, perhaps a $match in the beginning is a smart idea.
db.comments.aggregate([
{ $match: { createDate: { $gte: new Date(new Date() - 5 * 60000) } } },
{ $group: { _id: "$associated", maxID: { $max: "$_id"}, cohorts: { $push: "$$ROOT"}}},
{ $sort: { "maxID": -1 } },
{ $limit: 5 },
{ $project: { _id: 0, cohorts: 1 }}
])
That will only include comments made in the last 5 minutes assuming you have a createDate type field. If you do, you might also consider using that as the field to sort by instead of "_id". If you do not have a createDate type field, I'm not sure how best to limit the comments that are grouped as I do not know of a "current _id" in the way that there is a "current time".
I honestly think you are asking a lot here and cannot really see the utility myself, but I'm always happy to have that explained to me if there is something useful I have missed.
Bottom line is you want comments from the last five distinct users by date, and then some sort of grouping of additional comments by those users. The last part is where I see difficulty in rules no matter how you want to attack this, but I'll try to keep this to the most brief form.
No way this happens in a single query of any sort. But there are things that can be done to make it an efficient server response:
var async = require('async'),
    DataStore = require('nedb'),
    store = new DataStore();
// async.waterfall takes an array of tasks followed by a final callback
async.waterfall(
  [
    // Find the five "associated" values with the most recent comments
    function(callback) {
      Comment.aggregate(
        [
          { "$match": { "postId": thisPostId } },
          { "$sort": { "associated": 1, "createdDate": -1 } },
          { "$group": {
            "_id": "$associated",
            "date": { "$first": "$createdDate" }
          }},
          { "$sort": { "date": -1 } },
          { "$limit": 5 }
        ],
        callback);
    },
    // For each of those, fetch up to five recent comments in parallel
    function(docs,callback) {
      async.each(docs,function(doc,callback) {
        Comment.aggregate(
          [
            { "$match": { "postId": thisPostId, "associated": doc._id } },
            { "$sort": { "createdDate": -1 } },
            { "$limit": 5 },
            { "$group": {
              "_id": "$associated",
              "docs": {
                "$push": {
                  "_id": "$_id", "createdDate": "$createdDate"
                }
              },
              "firstDate": { "$first": "$createdDate" }
            }}
          ],
          function(err,results) {
            // return here so the callback is not invoked a second time
            if (err) return callback(err);
            async.each(results,function(result,callback) {
              store.insert( result, function(err, result) {
                callback(err);
              });
            },function(err) {
              callback(err);
            });
          }
        );
      },
      callback);
    }
  ],
  // Final callback: read the combined results back, newest first
  function(err) {
    if (err) throw err;
    store.find({}).sort({ "firstDate": -1 }).exec(function(err,docs) {
      if (err) throw err;
      console.log( JSON.stringify( docs, undefined, 4 ) );
    });
  }
);
Now I stuck more document properties in both the document and the array, but the simplified form based on your sample would then come out like this:
results = [
{ "_id": 3, "docs": [124] },
{ "_id": 19, "docs": [125] },
{ "_id": 12, "docs": [123,121,120] },
{ "_id": 8, "docs": [122] },
{ "_id": 17, "docs": [119] }
]
So the essential idea is to first find your distinct "users" who were the last to comment by basically chopping off the last 5. Without filtering on some kind of range here, that would go over the entire collection to get those results, so it would be best to restrict this in some way, as in the last hour or last few hours or something sensible as required. Just add those conditions to the $match along with the current post that is associated with the comments.
Once you have those 5, then you want to get any possible "grouped" details for multiple comments by those users. Again, some sort of limit is generally advised for a timeframe, but as a general case this is just looking for the most recent comments by the user on the current post and restricting that to 5.
The execution here is done in parallel, which will use more resources but is fairly effective considering there are only 5 queries to run anyway. In contrast to your example output, the array here is inside the document result, and it contains the original document id values for each comment for reference. Any other content related to the document would be pushed into the array as well as required (i.e. the content of the comment).
The other little trick here is using nedb as a means for storing the output of each query in an "in memory" collection. This need only really be a standard hash data structure, but nedb gives you a way of doing that while maintaining the MongoDB statement form that you may be used to.
Once all results are obtained you just return them as your output, and sorted as shown to retain the order of who commented last. The actual comments are grouped in the array for each item and you can traverse this to output how you like.
Bottom line here is that you are asking for a compounded version of the "top N results problem", which is something often asked of MongoDB. I've written about ways to tackle this before to show how it's possible in a single aggregation pipeline stage, but it really is not practical for anything more than a relatively small result set.
If you really want to join in the insanity, then you can look at Mongodb aggregation $group, restrict length of array for one of the more detailed examples. But for my money, I would run on parallel queries any day. Node.js has the right sort of environment to support them, so you would be crazy to do it otherwise.
I'm very new to MongoDB, so please have mercy on me! I have a schema that looks like this:
{
  "hour": 0,
  "minutes": [
    {
      "minute": 0,
      "minuteVolume": 0,
      "seconds": [
        {
          "second": 0,
          "secondVolume": 0
        },
        {
          "second": 1,
          "secondVolume": 0
        }
      ]
    },
    {
      "minute": 22,
      "minuteVolume": 0,
      "seconds": [
        {
          "second": 0,
          "secondVolume": 0
        },
        {
          "second": 1,
          "secondVolume": 0
        }
      ]
    }
  ],
  "hourVolume": 0
}
I'm trying to update a specific "secondVolume" and "minuteVolume". I've tried the following:
collection.update({"hour": hour,
"minutes": {$elemMatch: {"minute": minute}},
"minutes.seconds": {$elemMatch: {"second": second}}},
{ $inc: {hourVolume: 1, "minutes.$.minuteVolume": 1, "minutes.$.minuteVolume.seconds.$.second": 1}
},
{upsert:false,safe:true},
function(err,data){
if (err){
console.log(err);
}
else
{
console.log(data);
}
}
);
but I'm clearly doing something wrong. If I remove the $elemMatch for "second" and only try to update the "minuteVolume", it works just groovy. This leads me to believe that I'm doing something wrong with the positional operators or that my query isn't unwinding the document properly.
Is this even possible with a single query in MongoDB? I'm using mongodb driver version 1.4.19.
Thanks a lot in advance!
It looks like you can join me and the 240 people who have voted for this feature.
https://jira.mongodb.org/browse/SERVER-831
If you know the positions of the elements (possibly by querying the document first), you can update by using explicit array indexes in the field paths instead of the positional operator:
{ $inc: { hourVolume: 1, "minutes.0.minuteVolume": 1, "minutes.0.seconds.2.secondVolume": 1 } }
I've had to redo several schemas to prevent multi-nesting and therefore allow for a one-shot update.
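The "query first, then update by index" idea can be sketched as a helper that looks up both positions in a fetched document and builds the `$inc` paths. `buildVolumeInc` and `hourDoc` are hypothetical names; the sketch assumes array positions do not shift between the read and the update.

```javascript
// Builds an $inc update with explicit array indexes from a fetched document.
// buildVolumeInc is a hypothetical helper; it assumes the array positions
// do not change between reading the document and issuing the update.
function buildVolumeInc(doc, minute, second) {
  const m = doc.minutes.findIndex(entry => entry.minute === minute);
  if (m === -1) return null; // no such minute in this document
  const s = doc.minutes[m].seconds.findIndex(entry => entry.second === second);
  if (s === -1) return null; // no such second under that minute
  return {
    $inc: {
      hourVolume: 1,
      ['minutes.' + m + '.minuteVolume']: 1,
      ['minutes.' + m + '.seconds.' + s + '.secondVolume']: 1
    }
  };
}

const hourDoc = {
  hour: 0,
  hourVolume: 0,
  minutes: [
    { minute: 0, minuteVolume: 0, seconds: [{ second: 0, secondVolume: 0 }, { second: 1, secondVolume: 0 }] },
    { minute: 22, minuteVolume: 0, seconds: [{ second: 0, secondVolume: 0 }, { second: 1, secondVolume: 0 }] }
  ]
};

console.log(JSON.stringify(buildVolumeInc(hourDoc, 22, 1)));
```

The returned object could then be passed as the update document, with `{ hour: 0 }` as the query.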