I have some documents in the following format:
{
"count_used" : Int64,
"last_used" : Date,
"enabled" : Boolean,
"data" : String
}
I sort on the last_used field in an aggregate query so that the least recently used document is served first. After the aggregate, I call updateOne to bump the last_used field using the query below, so every read request changes the last_used timestamp.
updateOne(
{ _id: result['_id']},
{
$currentDate: { 'last_used': true },
$inc: { 'count_used': 1 }
}
)
The problem I have is that clients make 5 concurrent requests and, instead of receiving documents 1, 2, 3, 4, 5, they get 1, 1, 1, 2, 2, for example.
I'm guessing this is because there is no lock between the read and the update, so several reads get the same result before the update is performed.
Is there any way around this?
Update: here is the Node.js aggregate code:
db.collection('testing').aggregate([{
$project: {
last_used : 1,
count_used : 1,
doc_type : 1,
_id : 1,
data : 1,
is_type : { $setIsSubset: [ [type], '$doc_type' ] }
}
},
{ $match: {
$or: query}
},
{ $sort: {
is_type: -1,
last_used: 1
}
},
{
$limit: 1
} ]).next(function (error, result) {
db.collection('adverts').updateOne({_id: result['_id']}, {$currentDate: {'last_used': true}, $inc: {'count_used': 1}})
});
The problem is that a single client issues multiple concurrent read requests, which hit the database at the same time and read the same data before updateOne fires.
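One possible way around the race described above is to make the read and the update a single atomic operation. The sketch below assumes the official Node.js MongoDB driver, whose findOneAndUpdate accepts a sort option. The helper name claimNextDocument and the filter passed to it are hypothetical, and the computed is_type ordering from the $project stage is left out because it cannot be expressed in a plain sort document, so this may not be a drop-in replacement.

```javascript
// Sketch only: claimNextDocument is a hypothetical helper, not part of the driver.
// It asks the server to pick the least recently used matching document and
// bump last_used / count_used in the same operation, instead of reading first
// and updating afterwards.
function claimNextDocument(collection, filter) {
  return collection.findOneAndUpdate(
    filter,
    {
      $currentDate: { last_used: true }, // same update as the original updateOne
      $inc: { count_used: 1 }
    },
    {
      sort: { last_used: 1 } // oldest last_used first
    }
  );
}
```

By default the document is returned as it was before the update, which should be fine here because only its data is served. Depending on the driver version, the document is returned either directly or under a value property, so it is worth checking the documentation for the version in use.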
Related
I don't really know how to frame the question, but I have the following schema in Mongoose:
new Schema({
gatewayId: { type: String, index: true },
timestamp: { type: Date, index: true },
curr_property:Number,
curr_property_cost:Number,
day_property:Number,
day_property_cost: Number,
curr_solar_generating: Number,
curr_solar_export:Number,
day_solar_generated:Number,
day_solar_export:Number,
curr_chan1:Number,
curr_chan2:Number,
curr_chan3:Number,
day_chan1:Number,
day_chan2:Number,
day_chan3:Number
},{
collection: 'owlelecmonitor'
});
I want to query all the documents in the collection, but with the results arranged in nested arrays, one per gatewayId, in the following format:
[ [{
gatewayId: 1,
timestamp: time
....
},
{
gatewayId: 1,
timestamp: time2
....
}],
[{
gatewayId: 2,
timestamp: time
....
},
{
gatewayId: 2,
timestamp: time2
....
}],
[{
gatewayId: 3,
timestamp: time
....
},
{
gatewayId: 3,
timestamp: time2
....
}]
];
Is there a way to do this in Mongoose instead of retrieving the documents and processing them again?
Yes, it's possible. Consider the following aggregation pipeline in the mongo shell. It is a single-stage pipeline consisting of just the $group operator, which groups all the documents by gatewayId and creates an array field that holds the grouped documents. This extra field uses the accumulator operator $push on the system variable $$ROOT, which refers to the root document, i.e. the top-level document currently being processed in the aggregation pipeline stage.
With the cursor returned from the aggregate() method, you can then use its map() method to create the desired final array. The following mongo shell demonstration describes the above concept:
var result = db.owlelecmonitor.aggregate([
{
"$group": {
"_id": "$gatewayId",
"doc": {
"$push": "$$ROOT"
}
}
}
]).map(function (res){ return res.doc; });
printjson(result);
This prints the desired result to the shell.
To implement this in Mongoose, use the following aggregation pipeline builder:
OwlelecMonitorModel
.aggregate()
.group({
"_id": "$gatewayId",
"doc": {
"$push": "$$ROOT"
}
})
.exec(function (err, result) {
var res = result.map(function (r){return r.doc;});
console.log(res);
});
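To make the shape of that result easier to see, here is a plain-JavaScript illustration of what the $group stage with $push: "$$ROOT", followed by map(), produces. It is only a sketch that runs without a database: the function name groupByGateway and the sample documents are hypothetical, and in practice the grouping is done by the server, which does not guarantee the order of the groups.

```javascript
// Hypothetical helper mirroring the pipeline above in plain JavaScript:
// collect documents into one array per gatewayId, then return the arrays.
function groupByGateway(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.gatewayId)) {
      groups.set(doc.gatewayId, []);
    }
    // Equivalent of { $push: "$$ROOT" }: keep the whole document.
    groups.get(doc.gatewayId).push(doc);
  }
  // Equivalent of .map(function (r) { return r.doc; })
  return Array.from(groups.values());
}

// Made-up sample data, for illustration only.
const sampleDocs = [
  { gatewayId: '1', timestamp: new Date(0) },
  { gatewayId: '2', timestamp: new Date(0) },
  { gatewayId: '1', timestamp: new Date(1000) }
];
const grouped = groupByGateway(sampleDocs);
```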
I'm looking for a way to use Mongoose's aggregation pipeline to sum data. I have a schema that looks like this:
var Object = new Schema ({
Object2 : { type : ObjectId, ref : 'Object2' },
Object3 : { type : ObjectId, ref : 'Object3' },
value : {},
unit : String
})
The 'value' field will typically be a number, but I am allowing the user to include text in the field that transforms the number. Thus, value is stored as a string, and then I convert it into a number. I can't just store it as a number because there will be no way to convert it back for the user's display.
I've been trying to perform a Mongoose $sum pipeline to sum these values, while still allowing me to convert the number. Here is what I've tried:
data.aggregate([
{ $match:
{
$and: [{Object2: mongoose.Types.ObjectId(id)}, {Object3: {$in: sessions.map(function(session){ return new mongoose.Types.ObjectId(session._id); })}}]
}
},
{ $project:
{
'value': calculateNumber('value')
}
},
{ $group:
{
_id: null,
value: { $sum: "$value"},
unit: { $first: "$unit"}
}
}
], function(err, result) {
sendCallback(err, result, callback);
return;
})
However, I get a value of 0. Not using the project stage gives a value of 0, and I can't figure out if there is any way to use the project stage to apply my function and convert the fields to Numbers, so that I can take advantage of $sum.
Any help, without just declaring value as a Number in my schema, would be greatly appreciated!
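Two things are worth noting about the attempt above. A JavaScript function such as calculateNumber runs in Node.js, not inside the MongoDB server, so calculateNumber('value') is evaluated once while the pipeline is being built rather than once per document. And $sum ignores non-numeric values, which is consistent with a result of 0 when value is stored as a string. One fallback, sketched below, is to fetch the matching documents and sum them in application code. Here parseValue is a hypothetical stand-in for calculateNumber, whose real behavior is not shown in the question.

```javascript
// Hypothetical stand-in for the question's calculateNumber: takes the stored
// string and extracts a leading number from it.
function parseValue(raw) {
  return parseFloat(String(raw));
}

// Sum the converted values of already-fetched documents, skipping any value
// that does not convert to a finite number.
function sumValues(docs, toNumber) {
  return docs.reduce(function (total, doc) {
    const n = toNumber(doc.value);
    return Number.isFinite(n) ? total + n : total;
  }, 0);
}
```

With this approach the $match stage can stay as it is (for example via a find() with the same filter), and only the conversion and the sum move to the client.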
I want to fetch user_totaldocs and user_totalthings for all users and sum those values.
How can I do this? Here is the user schema:
var user_schema = mongoose.Schema({
local : {
...
...
user_id : String,
user_totaldocs : Number,
user_totalthings : Number
....
}
});
You can use the Aggregation Pipeline to add calculated fields to a result. There are some examples below using the mongo shell, but the syntax in Mongoose's Aggregate() helper is similar.
For example, to calculate sums (per user document) you can use the $add expression in a $project stage:
db.user.aggregate(
// Limit to relevant documents and potentially take advantage of an index
{ $match: {
user_id: "foo"
}},
{ $project: {
user_id: 1,
total: { $add: ["$user_totaldocs", "$user_totalthings"] }
}}
)
To calculate totals across multiple documents you need to use a $group stage with a $sum accumulator, for example:
db.user.aggregate(
{ $group: {
_id: null,
total: { $sum: { $add: ["$user_totaldocs", "$user_totalthings"] } },
totaldocs: { $sum: "$user_totaldocs" },
totalthings: { $sum: "$user_totalthings" }
}}
)
You may want only the one total field; I've added in totaldocs and totalthings as examples of calculating multiple fields.
A group _id of null will combine values from all documents passed to the $group stage, but you can also use other criteria here (such as grouping by user_id).
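Because the schema in the question nests these fields under local, the field paths would need the local. prefix. The sketch below builds such a pipeline as a plain array, so the same function covers both the "all users" case (a group key of null) and the per-user case. The function name buildTotalsPipeline is hypothetical, and the field paths are an assumption based on the schema shown in the question.

```javascript
// Hypothetical helper: returns an aggregation pipeline that sums the nested
// counters, grouped by the given key (null combines all documents).
function buildTotalsPipeline(groupKey) {
  return [
    {
      $group: {
        _id: groupKey,
        totaldocs: { $sum: '$local.user_totaldocs' },
        totalthings: { $sum: '$local.user_totalthings' }
      }
    },
    {
      // Add the combined total of both sums.
      $project: {
        totaldocs: 1,
        totalthings: 1,
        total: { $add: ['$totaldocs', '$totalthings'] }
      }
    }
  ];
}

const allUsersPipeline = buildTotalsPipeline(null);
const perUserPipeline = buildTotalsPipeline('$local.user_id');
```

Either array could then be passed to a Mongoose model's aggregate() method or to db.collection.aggregate() in the shell.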
You can use the aggregation framework provided by MongoDB. For your case:
If you want the sum of user_totaldocs and the sum of user_totalthings across the collection (that is, for all users), do:
db.user_schemas.aggregate(
[
{
$group : {
_id : null, // $group requires the grouping key to be named _id
user_totaldocs: { $sum: "$user_totaldocs"}, // for your case use local.user_totaldocs
user_totalthings: { $sum: "$user_totalthings" }, // for your case use local.user_totalthings
count: { $sum: 1 } // for no. of documents count
}
}
])
To sum user_totaldocs and user_totalthings per user (assuming there are multiple documents for a user), group by user_id instead; this returns one sum for each user:
db.user_schemas.aggregate(
[
{
$group : {
_id : "$user_id", // $group requires the grouping key to be named _id
user_totaldocs: { $sum: "$user_totaldocs"}, // for your case use local.user_totaldocs
user_totalthings: { $sum: "$user_totalthings" }, // for your case use local.user_totalthings
count: { $sum: 1 } // for no. of documents count
}
}
])
No need to provide individual user id.
For more info read:
1. http://docs.mongodb.org/manual/reference/operator/aggregation/group/#pipe._S_group
2. http://docs.mongodb.org/manual/core/aggregation/
I am trying to aggregate some records in a mongo database using the node driver. I am first matching on the org, frd, and sl fields (these are indexed). If I only include a few companies in the array that I am matching the org field against, the query runs fine and works as expected. However, when including all of the clients in the array, I always get:
MongoError: getMore: cursor didn't exist on server, possible restart or timeout?
I have tried playing with the allowDiskUse and batchSize settings, but nothing seems to work. With all the client strings in the array, the aggregation runs for ~5 hours before throwing the cursor error. Any ideas? Below is the pipeline along with the actual aggregate command.
Setting up the aggregation pipeline:
var aggQuery = [
{
$match: { //all clients, from last three days, and scored
org:
{ $in : array } //this is the array I am talking about
,
frd: {
$gte: _.last(util.lastXDates(3))
},
sl : true
}
}
, {
$group: { //group by isp and make fields for calculation
_id: "$gog",
count: {
$sum: 1
},
countRisky: {
$sum: {
$cond: {
if :{
$gte: ["$scr", 65]
},
then: 1,
else :0
}
}
},
countTimeZoneRisky: {
$sum: {
$cond: {
if :{
$eq: ["$gmt", "$gtz"]
},
then: 0,
else :1
}
}
}
}
}
, {
$match: { //show records with count >= 500
count: {
$gte: 500
}
}
}
, {
$project: { //rename _id to isp, only show relevant fields
_id: 0,
ISP: "$_id",
percentRisky: {
$multiply: [{
$divide: ["$countRisky", "$count"]
},
100
]
},
percentTimeZoneDiscrancy: {
$multiply: [{
$divide: ["$countTimeZoneRisky", "$count"]
},
100
]
},
count: 1
}
}
, {
$sort: { //sort by percent risky and then by count
percentRisky: 1,
count: 1
}
}
];
Running the aggregation:
var cursor = reportingCollections.hitColl.aggregate(aggQuery, {
allowDiskUse: true,
cursor: {
batchSize: 40000
}
});
console.log('Writing data to csv ' + currentFileNamePrefix + '!');
//iterate through cursor and write documents to CSV
cursor.each(function (err, document) {
//write each document to csv file
//maybe start a nuclear war
});
You're calling the aggregate method, which, unlike find(), doesn't return a cursor by default. To get the results as a cursor, you must pass the cursor option. However, setting a timeout for the aggregation cursor is (currently) not supported; the native Node.js driver only supports the batchSize setting.
You would set the batchSize option like this:
var cursor = coll.aggregate(query, {cursor: {batchSize:100}}, writeResultsToCsv);
To circumvent such problems, I'd recommend running the aggregation or map-reduce directly through the mongo client. There you can add the notimeout option.
The default timeout is 10 minutes (obviously useless for long-running queries), and as far as I know there is currently no way to set a different one; you can only disable it with the aforementioned option. The timeout hits you especially with large batch sizes: processing the incoming docs takes more than 10 minutes, and by the time you ask the mongo server for more, the cursor has been deleted.
I don't know your use case, but if it's a web view, it should only run fast queries/aggregations.
BTW I think this didn't change with 3.0.*
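The batch-size reasoning above can be made concrete with a little arithmetic. The sketch below assumes the 10-minute timeout mentioned in the answer; the function name maxSafeBatchSize and the per-document processing times are hypothetical and chosen only for illustration.

```javascript
// Cursor timeout as stated in the answer above (10 minutes).
const CURSOR_IDLE_TIMEOUT_MS = 10 * 60 * 1000;

// Largest batch that can be fully processed before the next getMore has to
// be sent, given an assumed processing time per document.
function maxSafeBatchSize(msPerDocument, timeoutMs) {
  return Math.floor(timeoutMs / msPerDocument);
}

// With the batchSize of 40000 used in the question, the client has at most
// 600000 / 40000 = 15 ms per document before the cursor may expire.
const budgetPerDocMs = CURSOR_IDLE_TIMEOUT_MS / 40000;
```

If writing one row to the CSV took longer than that budget, a smaller batchSize would make the client return to the server more often.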
I have the following mongodb query in node.js which gives me a list of unique zip codes with a count of how many times the zip code appears in the database.
collection.aggregate( [
{
$group: {
_id: "$Location.Zip",
count: { $sum: 1 }
}
},
{ $sort: { _id: 1 } },
{ $match: { count: { $gt: 1 } } }
], function ( lookupErr, lookupData ) {
if (lookupErr) {
res.send(lookupErr);
return;
}
res.send(lookupData.sort());
});
});
How can this query be modified to return one specific zip code? I've tried the condition clause but have not been able to get it to work.
Aggregations that require filtered results can be done with the $match operator. Without tweaking what you already have, I would suggest just sticking in a $match for the zip code you want returned at the top of the aggregation list.
collection.aggregate( [
{
$match: {
"Location.Zip": 47421 // field path used in the question; match the stored type (number or string)
}
},
{
$group: {
...
With this in place, every stage after the $match operates only on the documents whose zip code equals 47421.
Alternatively, in the existing $match pipeline stage, add the zip code as a condition on _id:
{ $match: { count: { $gt: 1 },
_id : "10002" //replace 10002 with the zip code you want
}}
As a side note, you should put the $match operator first and in general as high in the aggregation chain as you can.
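Putting that advice together with the original query, the whole pipeline might look like the sketch below. The function name buildZipCountPipeline is hypothetical, the field path Location.Zip is taken from the $group stage in the question, and the value passed in must have the same type (string or number) as the stored zip codes.

```javascript
// Hypothetical helper: builds a pipeline that counts documents for one zip code.
function buildZipCountPipeline(zip) {
  return [
    // Filter first so later stages only see documents for this zip code.
    { $match: { 'Location.Zip': zip } },
    // Same grouping as in the question: one result per zip code with its count.
    { $group: { _id: '$Location.Zip', count: { $sum: 1 } } },
    // Keep the original condition that the zip code appears more than once.
    { $match: { count: { $gt: 1 } } }
  ];
}

const pipeline = buildZipCountPipeline('10002');
```

The resulting array can be passed to collection.aggregate() in place of the inline pipeline. The $sort stage is dropped because at most one group is returned.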