Let's say I have a collection of Person{email: 'actual email', ..other data} and want to check whether a Person exists with a given email, retrieving its data if so or getting null if not.
If I want to do that once, there is no problem: just run a query through mongoose using Person.findOne() or similar.
But what if I have to check 25-100 given emails? Of course I can send a ton of requests to MongoDB and retrieve the data, but that seems like a waste of network traffic.
Is there a good, performant way to query MongoDB with multiple clauses in a single batch, like findBatch([{email: 'email1'}, {email: 'email2'}...{email: 'emailN'} ]), and get [document1,null,document3,null, documentN] as the result, where null stands for criteria that matched nothing?
Currently I see only one option:
One big find with a single {email: {$in: []}} query, then matching the results back to the requested emails in application logic on the server side. Cons: quite cumbersome and error-prone if you have more than one search criterion.
Is there a better way to implement this?
Try this:
Replace arrayOfEmails with your array of emails to look up.
Replace emailField with the actual field name in your db documents.
db.collName.aggregate([
{
"$match" : {
"emailField" : {
"$in" : arrayOfEmails
}
}
},
{
"$group" : {
"_id" : null,
"docs" : {
"$push" : {
"$cond" : [
{
// the aggregation $in operator takes [ <value>, <array> ]
"$in" : [
"$emailField",
arrayOfEmails
]
},
"$$ROOT",
null
]
}
}
}
}
])
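To get the [document1, null, document3, ...] shape from the question, the results of a single $in query still have to be lined up with the requested emails on the application side. A minimal sketch of that step, assuming the field is called email; alignByEmail is a hypothetical helper name, not part of mongoose or the MongoDB driver:

```javascript
// Hypothetical helper: align documents returned by one $in query with the
// order of the requested emails, using null where nothing matched.
function alignByEmail(emails, docs, field = 'email') {
  const byEmail = new Map();
  for (const doc of docs) {
    byEmail.set(doc[field], doc);
  }
  return emails.map((email) => (byEmail.has(email) ? byEmail.get(email) : null));
}

// Possible usage with the Person model from the question (assumed):
//   const docs = await Person.find({ email: { $in: emails } }).lean();
//   const aligned = alignByEmail(emails, docs);
```

The lookup is a single pass over the results plus a single pass over the input, so it stays cheap for 25-100 emails.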
I want to search for multiple elements in the database with sequelize. I am using the in operator for this.
It works when I write $in : [1,2]
But when I write $in: [req.body.regions] it does not work. How can I parse the body value into an array?
Phones.findAll({
attributes: ['id', 'enabled', 'color_id', 'sold', 'region_id'
],
where: {
region_id: (req.body.region_id) ? { $in : [req.body.region_id]}: { $ne: null },
color_id: (req.body.color_id) ? { $in : [req.body.color_id]} : { $ne: null },
phone_model_id: (req.body.phone_model_id) ? { $in : [req.body.phone_model_id]} : { $ne: null },
enabled: 1,
},
});
Solution: as far as I can see, the request value is comma-separated, so you can split it:
{ $in : req.body.region_id.split(",") }
A better way: send the values as regions[0], regions[1], or simply repeat the same name regions[] in your Postman request, then use it in your code like this:
{ $in : req.body.regions }
The first method is easy to test in Postman, but the second is more elegant, and an array is easier to create and maintain than a comma-separated string.
If req.body.regions is an array, use this code:
$in: req.body.regions
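Both answers can be combined by normalizing the body value before building the where clause. A rough sketch, assuming the same $in / $ne operator aliases the question uses; toList and inOrAny are hypothetical helper names, not Sequelize APIs:

```javascript
// Hypothetical helper: accept an array, a comma-separated string, or a
// single value, and always return an array suitable for $in.
function toList(value) {
  if (value === undefined || value === null || value === '') return [];
  if (Array.isArray(value)) return value;
  if (typeof value === 'string') {
    return value.split(',').map((s) => s.trim()).filter((s) => s !== '');
  }
  return [value];
}

// Hypothetical helper: build the per-column condition used in the question,
// falling back to "any non-null value" when nothing was sent.
function inOrAny(value) {
  const list = toList(value);
  return list.length > 0 ? { $in: list } : { $ne: null };
}

// Possible usage inside the where clause:
//   region_id: inOrAny(req.body.region_id),
```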
{
"_id" : ObjectId("5852725660632d916c8b9a38"),
"response_log" : [
{
"campaignId" : "AA",
"created_at" : ISODate("2016-12-20T11:53:55.727Z")
},
{
"campaignId" : "AB",
"created_at" : ISODate("2016-12-20T11:55:55.727Z")
}]
}
I have a document which contains an array. I want to select all documents that have no response_log.created_at within the last 2 hours from the current time and whose count of response_log.created_at within the last 24 hours is less than 3.
I am unable to figure out how to go about it. Please help.
You can use the aggregation framework to filter the documents. A pipeline with $match and $redact steps will do the filtering.
Consider running the following aggregate operation, where $redact lets you process the logical condition with the $cond operator and uses the system variable $$KEEP to "keep" the document when the condition is true or $$PRUNE to "remove" it when the condition is false.
This operation is similar to having a $project stage that selects the fields in the collection and creates a new field holding the result of the logical condition, followed by a $match, except that $redact does this in a single pipeline stage, which is more efficient:
var moment = require('moment'),
MongoClient = require('mongodb').MongoClient,
last2hours = moment().subtract(2, 'hours').toDate(),
last24hours = moment().subtract(24, 'hours').toDate();
MongoClient.connect(config.database)
.then(function(db) {
return db.collection('MyCollection')
})
.then(function (collection) {
return collection.aggregate([
{ '$match': { 'response_log.created_at': { '$not': { '$gt': last2hours } } } },
{
'$redact': {
'$cond': [
{
'$lt': [
{
'$size': {
'$filter': {
'input': '$response_log',
'as': 'res',
// keep only the entries created within the last 24 hours
'cond': {
'$gt': [
'$$res.created_at',
last24hours
]
}
}
}
},
3
]
},
'$$KEEP',
'$$PRUNE'
]
}
}
]).toArray();
})
.then(function(docs) {
console.log(docs)
})
.catch(function(err) {
throw err;
});
Explanations
In the above aggregate operation, if you execute only the first $match pipeline step:
collection.aggregate([
{ '$match': { 'response_log.created_at': { '$not': { '$gt': last2hours } } } }
])
The documents returned will be the ones that do not have a "response_log.created_at" within the last 2 hours from the current time, where the variable last2hours is created with the momentjs library using its subtract API.
The subsequent $redact stage then further filters those documents. Its $cond ternary operator evaluates the logical expression below, which uses $filter to return the array elements that match a condition and $size to count them:
{
'$lt': [
{
'$size': {
'$filter': {
'input': '$response_log',
'as': 'res',
'cond': { '$gt': ['$$res.created_at', last24hours] }
}
}
},
3
]
}
The document is kept ($$KEEP) if this condition is true and removed ($$PRUNE) if it is false.
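As a sanity check, the same rule can be written as a plain in-memory predicate and run against a few sample documents before trusting the pipeline. This is only a sketch that follows the requirement as stated in the question (no entry in the last 2 hours, fewer than 3 entries in the last 24 hours); qualifies is a hypothetical name:

```javascript
// Hypothetical in-memory version of the filter, for checking expectations.
// Returns true when the document has no log entry in the last 2 hours and
// fewer than 3 entries in the last 24 hours.
function qualifies(doc, now = new Date()) {
  const twoHoursAgo = new Date(now.getTime() - 2 * 60 * 60 * 1000);
  const dayAgo = new Date(now.getTime() - 24 * 60 * 60 * 1000);
  const log = doc.response_log || [];
  const hasRecent = log.some((entry) => entry.created_at > twoHoursAgo);
  const lastDayCount = log.filter((entry) => entry.created_at > dayAgo).length;
  return !hasRecent && lastDayCount < 3;
}
```

Passing now explicitly keeps the check reproducible with fixed sample dates.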
I know this is probably not the answer you're looking for, but this may not be the best use case for Mongo. It's easy to do in a relational database, and it's easy to do in a database that supports map/reduce, but it will not be straightforward in Mongo.
If your data looked different and you kept each log entry as a separate document that references the object (with id 5852725660632d916c8b9a38 in this case) instead of being part of it, then you could make a simple query for the latest log entry that has that id. This is what I would do in your case if I were to use Mongo for that (which I wouldn't).
What you can also do is keep a separate collection in Mongo, or add a new property to the object you have here that stores the date of the latest campaign added. Then it would be very easy to search for what you need.
When you are working with a database like Mongo, the shape of your data must reflect what you need to do with it, as in this case. Adding a last campaign date and updating it every time a campaign is added would let you search for the documents you need very easily.
If you want to be able to run arbitrary searches and aggregates, then you may be better off using a relational database.
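A rough sketch of what that could look like, assuming a new field named last_campaign_at (a hypothetical name, not in the original document) that is written together with every log entry:

```javascript
// Hypothetical: build the update that appends a log entry and records its
// date in a top-level field (last_campaign_at is an assumed field name).
function campaignUpdate(campaignId, now = new Date()) {
  return {
    $push: { response_log: { campaignId: campaignId, created_at: now } },
    $set: { last_campaign_at: now },
  };
}

// Hypothetical: filter for documents whose latest campaign is at or before
// the cutoff. Documents that never received the field will not match.
function idleSinceFilter(cutoff) {
  return { last_campaign_at: { $lte: cutoff } };
}

// Possible usage with a driver collection (assumed):
//   collection.updateOne({ _id: id }, campaignUpdate('AC'));
//   collection.find(idleSinceFilter(twoHoursAgo));
```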
I am using mongoose with node.js for MongoDB. I need to run 20 parallel find queries against my database, each limited to 4 documents, as shown below; only brand_id changes from brand to brand.
areamodel.find({ brand_id: brand_id }, { '_id': 1 }, { limit: 4 }, function(err, docs) {
if (err) {
console.log(err);
} else {
console.log('fetched');
}
});
To avoid running all these queries in parallel, I thought about putting all 20 brand_id values in an array of strings and using an $in query to get the results, but I don't know how to specify the limit of 4 for each array element that is matched.
I wrote the code below with aggregation, but I don't know where to specify the limit for each element of my array.
var brand_ids = ["brandid1", "brandid2", "brandid3", "brandid4", "brandid5", "brandid6", "brandid7", "brandid8", "brandid9", "brandid10", "brandid11", "brandid12", "brandid13", "brandid14", "brandid15", "brandid16", "brandid17", "brandid18", "brandid19", "brandid20"];
areamodel.aggregate(
[
// $match must run before $project, otherwise brand_id is already dropped
{ $match : { 'brand_id': { $in: brand_ids } } },
{ $project: { _id: 1 } }
],
function(err, docs) {
if (err) {
console.error(err);
} else {
}
}
);
Can anyone please tell me how I can solve my problem using only one query?
UPDATE: Why I don't think $group will be helpful for me.
Suppose my brand_ids array contains these strings
brand_ids = ["id1", "id2", "id3", "id4", "id5"]
and my database has the documents below
{
"brand_id": "id1",
"name": "Levis",
"loc": "india"
},
{
"brand_id": "id1",
"name": "Levis",
"loc": "america"
},
{
"brand_id": "id2",
"name": "Lee",
"loc": "india"
},
{
"brand_id": "id2",
"name": "Lee",
"loc": "america"
}
Desired JSON output
{
"name": "Levis"
},
{
"name": "Lee"
}
In the example above, suppose I have 25000 documents with "name" as "Levis" and 25000 documents where "name" is "Lee". If I use $group, all 50000 documents will be queried and grouped by "name".
But with the solution I want, once the first documents with "Levis" and "Lee" have been found, I don't have to look at the remaining thousands of documents.
Update: I think that if any of you can answer this, I can probably get to my solution.
Consider a case where I have 1000 documents in total in my MongoDB, and suppose 100 of those 1000 pass my match query.
If I apply a limit of 4 to this query, will it take the same time to execute as the query without any limit, or not?
Why I am thinking about this case:
If the query takes the same time either way, then I don't think $group will increase my time, as all documents will be queried anyway.
But if the time taken by the query with a limit is more than the time taken by the query without a limit, then:
If I can apply limit 4 to each array element, my question will be solved.
If I cannot apply a limit to each array element, then I don't think $group will be useful, as in that case I have to scan all the documents to get the results.
FINAL UPDATE: As I read in the answer linked below and in the MongoDB docs, using $limit does not affect the time taken by the query; it is the network bandwidth that is affected. So I think that if any of you can tell me how to apply a limit per array element (using $group or anything else), my problem will be solved.
mongodb: will limit() increase query speed?
Solution
Actually, my thinking about MongoDB was very wrong. I thought adding a limit to a query decreased the time it takes, but that is not the case, which is why I stumbled for so many days before trying the answer that Gregory NEUT and JohnnyHK gave me. Thanks a lot, both of you; I would have found the solution on day one if I had known this. I really appreciate your help.
I suggest using the $group aggregation stage to group all the data you get from the $match by brand_id, and then limiting each group using $slice.
Look at this Stack Overflow post:
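db.collection.aggregate([
{
// newest documents first
$sort: {
created: -1
}
}, {
// collect every title per city
$group: {
_id: '$city',
title: {
$push: '$title'
}
}
}, {
// keep only the first two titles of each group
$project: {
_id: 0,
city: '$_id',
mostRecentTitle: {
$slice: ['$title', 0, 2]
}
}
}
])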
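Adapted to the question's fields, the same idea might look like the sketch below: match the brand ids, group by brand_id, then keep at most 4 ids per group. perBrandLimitPipeline and the ids output field are hypothetical names, and note that $group still reads every matched document, as the question points out:

```javascript
// Hypothetical builder for a "limit N per brand" pipeline.
function perBrandLimitPipeline(brandIds, perBrand) {
  return [
    // only documents belonging to the requested brands
    { $match: { brand_id: { $in: brandIds } } },
    // one group per brand, collecting the document ids
    { $group: { _id: '$brand_id', ids: { $push: '$_id' } } },
    // keep at most perBrand ids in each group
    { $project: { _id: 0, brand_id: '$_id', ids: { $slice: ['$ids', perBrand] } } },
  ];
}

// Possible usage with the model from the question (assumed):
//   areamodel.aggregate(perBrandLimitPipeline(brand_ids, 4), callback);
```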
I suggest using distinct, since that will return all the different brand names in your collection. (I assume this is what you are trying to achieve?)
db.runCommand ( { distinct: "areamodel", key: "name" } )
MongoDB docs
In mongoose I think it is: areamodel.db.db.command({ distinct: "areamodel", key: "name" }) (Untested)
I am trying to count the number of models in a collection based on a property.
I have an upvote model that has a post (ObjectId) and a few other properties.
First, is this good design? Posts could get many upvotes, so I didn’t want to store them in the Post model.
Regardless, I want to count the number of upvotes on posts with a specific property with the following and it’s not working. Any suggestions?
upvote.count({ 'post.specialProperty': mongoose.Types.ObjectId('id') }, function (err, count) {
console.log(count);
});
Post Schema Design
Regarding design, I would structure the documents in the posts collection like this:
{
"_id" : ObjectId(),
"property1" : "some value",
"property2" : "some value",
"voteCount" : 1,
"votes": [
{
"voter": ObjectId(), // voter id
// other properties...
}
]
}
You will have an array of objects that can contain info such as the voter id and other properties.
Updating
When a post is updated, you can simply increment or decrement the voteCount accordingly. You can increment by 1 like this:
db.posts.update(
{"_id" : postId},
{
$inc: { voteCount: 1},
$push : {
"votes" : {"voter":ObjectId, "otherproperty": "some value"}
}
}
)
The $inc modifier can be used to change the value of an existing key or to create a new key if it does not already exist. It's very useful for updating votes.
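Decrementing works the same way with a negative $inc. A small sketch of both directions as plain filter/update objects; the helper names are hypothetical, and the extra 'votes.voter' conditions are an assumption added here so a voter is counted at most once:

```javascript
// Hypothetical: add a vote, only if this voter has not voted on the post yet.
function upvoteOperation(postId, voterId) {
  return {
    filter: { _id: postId, 'votes.voter': { $ne: voterId } },
    update: { $inc: { voteCount: 1 }, $push: { votes: { voter: voterId } } },
  };
}

// Hypothetical: remove a vote, only if this voter has one on the post.
function removeVoteOperation(postId, voterId) {
  return {
    filter: { _id: postId, 'votes.voter': voterId },
    update: { $inc: { voteCount: -1 }, $pull: { votes: { voter: voterId } } },
  };
}

// Possible usage (assumed):
//   const op = upvoteOperation(postId, voterId);
//   db.posts.updateOne(op.filter, op.update);
```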
Totaling votes of particular Post Criteria
If you want to total the votes for posts matching certain criteria, you can use the Aggregation Framework.
You can get the total like this:
db.posts.aggregate(
[
{
$match : {property1: "some value"}
},
{
$group : {
_id : null,
totalNumberOfVotes : {$sum : "$voteCount" }
}
}
]
)
First, a comment: the collection described here is simplified for this question. I'm interested in understanding how to manipulate a MongoDB database and get statistics on my data.
Let's say I have a collection with test results. The schema is:
Results {
_id: ObjectId
TestNumber: int
result: String // this contains "pass" or "fail"
// additional data
}
Each test can have many reports, so most likely each TestNumber appears in more than one document.
How can I perform a query which returns this info on the entire collection:
TestNumber | count of result == "pass" | count of result == "fail"
You can use the following aggregation operations pipelined together:
Group all the documents by their testNumber and result, so that every testNumber has up to two groups, one for fail and another for pass, each with the count of documents in the group.
Project a field "pass" for the group whose result is pass, and "fail" for the other group.
Group the documents again by testNumber, and push the pass and fail documents into an array.
Project the fields as required.
The Code:
Results.aggregate([
{$group:{"_id":{"testNumber":"$testNumber","result":"$result"},
"count":{$sum:1}}},
{$project:{"_id":0,
"testNumber":"$_id.testNumber",
"result":{$cond:[{$eq:["$_id.result","pass"]},
{"pass":"$count"},
{"fail":"$count"}]}}},
{$group:{"_id":"$testNumber",
"result":{$push:"$result"}}},
{$project:{"testNumber":"$_id","result":1,"_id":0}}
],function(a,b){
// post process
})
Sample Data:
db.collection.insert([
{
"_id":1,
"testNumber":1,
"result":"pass"
},
{
"_id":2,
"testNumber":1,
"result":"pass"
},
{
"_id":3,
"testNumber":1,
"result":"fail"
},
{
"_id":4,
"testNumber":2,
"result":"pass"
}])
Sample output:
{ "result" : [ { "pass" : 1 } ], "testNumber" : 2 }
{ "result" : [ { "fail" : 1 }, { "pass" : 2 } ], "testNumber" : 1 }
Iterating over doc.result will give you the number of passed and failed tests for each testNumber.
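That iteration could be sketched as follows, turning each result document into one row of the table asked for in the question (testNumber, pass count, fail count). toCounts is a hypothetical helper name; a missing pass or fail entry is treated as 0:

```javascript
// Hypothetical post-processing step: flatten one aggregated document into
// { testNumber, pass, fail }, defaulting missing counts to 0.
function toCounts(doc) {
  const row = { testNumber: doc.testNumber, pass: 0, fail: 0 };
  for (const entry of doc.result) {
    if (entry.pass !== undefined) row.pass = entry.pass;
    if (entry.fail !== undefined) row.fail = entry.fail;
  }
  return row;
}

// Possible usage inside the aggregate callback (assumed):
//   const rows = docs.map(toCounts);
```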