MongoDB update multiple items with multiple changes - node.js

Is there any recommended way to update multiple items in MongoDB with one query ? I know that this is possible:
db.collection('mycollection').update({active: 1}, {$set: {active:0}}, {multi: true});
But in my case I want to update several documents with "unique" changes.
e.g. I want to combine these two queries into one:
db.collection('mycollection').update({
id: 'my id'
}, {
$set: {
name: "new name"
}
});
db.collection('mycolleciont').update({
id: 'my second id'
}, {
$set: {
name: "new name two"
}
});
Why ? I have a system which gets daily updates imported. The updates are mostly large so its around 200,000 Updates a day so currently I am executing 200,000 times the update query which takes a long time.
If its necessary to know: I am using Mongo 3 and nodeJS.

Related

MongoDB update query in subarray

An update in the array of objects inside another array of objects.
mongodb field that I'm working on:
otherFields: values,
tasks: [
{
_id: mongodb.objectID(),
title: string,
items:[{
_id: mongodb.objectID(),
title: string,
completed: boolean //field need to be update.
}]
},
{}...
],
otherFields: value
sample mongodb document
I need to find the document using the task_id and the item_id and update a completed field in item of a task. Using the mongoose findOneAndUpdate method
const path = "tasks.$.items." + item_id + "completed";
collectionName.findOneAndUpdate(
{ _id: req.user._id, "tasks._id": taskID },
{ $set: { [path]: true }});
The above query doesn't work!!!
There is no need to use multiple query conditions, since you'd like to update a specific item that has an unique ID. Therefore you could use something along the lines:
collectionName.findOneAndUpdate(
{ 'tasks.items._id': itemID },
...
);
Keep in mind this structure is far away from optimized as it would basically look through the entire database...
Also now that I think of it, you'd also have issue with the update, as there are two nested arrays within the document. Read more here: How to Update Multiple Array Elements in mongodb

mongoose update millions of records while extracting information

We have a production database with over 5 million customer customer records, each customer document has an embedded array of licenses they have applied for. And example customer document is as follows:
{
_id: ObjectId('...'),
phoneNumber: 'xxxx',
// Other customer fields
licenses: [
{
_id: ObjectId('...'),
state: 'PENDING',
expired: false,
createdAt: ISODate(''),
// Other license fields
},
// More Licenses for this customer
]
}
I have been tasked with changing the state of every PENDING license applied for during the month of September to REJECTED and sending an SMS to every customer whose pending permit just got rejected.
Using the model.where(condition).countDocuments() I have found that there is over 3 million customers (not licenses) matching the aforementioned criteria. Each customer has an average of 9 licenses.
I need assistance coming up with a strategy that won't slow down the system when performing this action. Furthermore, this is around 17GB of data.
Sending SMS is fine, I can queue details for SMS service. My challenge is processing the licenses while extracting relevant information for SMS.
First of all you have to create an index on the collection:
db.collection.createIndex( { "licenses.state": 1 } )
Then you shoud do something like that:
model.updateMany({}, {
'$set': {
'licenses.$[elem].state': 'REJECTED'
}
}, { arrayFilters: [{
'elem.createdAt': { $gte: ISODate(....) }
}],
multi: true
} ).then(function (doc)){}
If you have a replica set and your updates are on the primary instance you should not affect the secondary instances when reading on those once.
If you want to split the update on many batches you can use the _id (already indexed). Of course it depends on your _id format.

How to improve the performance of query in mongodb?

I have a collection in MongoDB with more than 5 million documents. Whenever I create a document inside the same collection I have to check if there exists any document with same title and if it exists then I don't have to add this to the database.
Example: here is my MongoDB document:
{
"_id":ObjectId("3a434sa3242424sdsdw"),
"title":"Lost in space",
"desc":"this is description"
}
Now whenever a new document is being created in the collection, I want to check if the same title already exists in any of the documents and if it does not exists, then only I want to add it to the database.
Currently, I am using findOne query and checking for the title, if it not available only then it is added to the database. I am facing the performance issue in this. It is taking too much time to do this process. Please suggest a better approach.
async function addToDB(data){
let result= await db.collection('testCol').findOne({title:data.title});
if(result==null){
await db.collection('testCol').insertOne(data);
}else{
console.log("already exists in db");
}
}
You can reduce the network round trip time which is currently 2X. Because you execute two queries. One for find then one for update. You can combine them into one query as below.
db.collection.update(
<query>,
{ $setOnInsert: { <field1>: <value1>, ... } },
{ upsert: true }
)
It will not update if already exists.
db.test.update(
{"key1":"1"},
{ $setOnInsert: { "key":"2"} },
{ upsert: true }
)
It looks for document with key1 is 1. If it finds, it skips. If not, it inserts using the data provided in the object of setOnInsert.

Get last created object for each user?

I have a collection, say, "Things":
{ id: 1
creator: 1
created: Today }
{ id: 2
creator: 2
created: Today }
{ id: 3
creator: 2
created: Yesterday }
I'd like to create a query that'll return each Thing created by a set of users, but only their most recently created thing.
What would this look like? I can get search my collection with an array of creators and it works just fine - how can I also only get the most recently created object per user?
Thing.find({ _creator : { "$in" : creatorArray })...
You cannot find, sort and pick the most recent in just a single find() query. But you can do it using aggregation:
Match all the records where the creator is amongst the one who we are looking
for.
Sort the records in descending order based on the created field.
Group the documents based on the creator.
Pick each creator's first document from the group, which will also be
his latest.
Project the required fields.
snippet:
Thing.aggregate([
{$match:{"creator":{$in:[1,2]}}},
{$sort:{"created":-1}},
{$group:{"_id":"$creator","record":{$first:"$$ROOT"}}},
{$project:{"_id":0,
"id":"$record.id",
"creator":"$record.creator",
"created":"$record.created"}}
], function(err,data){
})

How should I model my MongoDB collection for nested documents?

I'm managing a MongoDB database for a building products store. The most immediate collection is products, right?
There are quite several products, however they all belong to one among a set of 5-8 categories and then to one subcatefory among a small set of subcategories.
For example:
-Electrical
*Wires
p1
p2
..
*Tools
p5
pn
..
*Sockets
p11
p23
..
-Plumber
*Pipes
..
*Tools
..
PVC
..
I will use Angular at web site client side to show whole products catalog, I think about AJAX for querying the right subset of products I want.
Then, I wonder whether I should manage one only collection like:
{
MainCategory1: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
},
MainCategory2: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
},
MainCategoryn: {
SubCategory1: {
{},{},{},{},{},{},{}
}
SubCategory2: {
{},{},{},{},{},{},{}
}
SubCategoryn: {
{},{},{},{},{},{},{}
}
}
}
Or a single collection per each category. The number of documents might not be higher than 500. However I care about a balance for:
quick DB answer,
easy server side DB querying, and
client-side Angular code for rendering results to html.
I'm using mongodb node.js module, not Mongoose now.
What CRUD operations will I do?
Inserts of products, I'd also like to have a way to obtain autogenerated ids (maybe sequential) per each new register. However, as it might seem natural I wouldn't offer the _id to the user.
Querying the whole documents set of a subcategory. Maybe just obtaining a few attributes at first.
Querying whole or a specific subset of attributes of a document (product) in particular.
Modifying a product's attributes values.
I agree client side should get the easiest result to render. However, to nest categories into products is still a bad idea. The trade off is once you want to change, for example, the name of a category, it will be a disaster. And if you think about the possible usecases, for example:
list all categories
find all subcategories of a certain category
find all products in a certain category
You'll find it hard to do these stuff with your data structure.
I had same situation in my current project. So here's what I do for your reference.
First, categories should be in a separate collection. DON'T nest categories into each other, as it will complicate the procedure to find all subcategories. The traditional way for finding all subcategories is to maintain an idPath property. For example, your categories are divided into 3 levels:
{
_id: 100,
name: "level1 category"
parentId: 0, // means it's the top category
idPath: "0-100"
}
{
_id: 101,
name: "level2 category"
parentId: 100,
idPath: "0-100-101"
}
{
_id: 102,
name: "level3 category"
parentId: 101,
idPath: "0-100-101-102"
}
Note with idPath, parentId is not necessary anymore. It's for you to understand the structure easier.
Once you need to find all subcategories of category 100, simply do the query:
db.collection("category").find({_id: /^0-100-/}, function(err, doc) {
// whatever you want to do
})
With category stored in a separate collection, in your product you'll need to reference them by _id, just like when we use RDBMS. For example:
{
... // other fields of product
categories: [100, 101, 102, ...]
}
Now if you want to find all products in a certain category:
db.collection("category").find({_id: new RegExp("/^" + idPath + "-/"}, function(err, categories) {
var cateIds = _.pluck(categories, "_id"); // I'm using underscore to pluck category ids
db.collection("product").find({categories: { $in: cateIds }}, function(err, products) {
// products are here
}
})
Fortunately, category collection is usually very small, with only hundreds of records inside (or thousands). And it doesn't varies a lot. So you can always store a live copy of categories inside memory, and it can be constructed as nested objects like:
[{
id: 100,
name: "level 1 category",
... // other fields
subcategories: [{
id: 101,
... // other fields
subcategories: [...]
}, {
id: 103,
... // other fields
subcategories: [...]
},
...]
}, {
// another top1 category
}, ...]
You may want to refresh this copy every several hours, so:
setTimeout(3600000, function() {
// refresh your memory copy of categories.
});
That's all I get in mind right now. Hope it helps.
EDIT:
to provide int ID for each user, $inc and findAndModify is very useful. you may have a idSeed collection:
{
_id: ...,
seedValue: 1,
forCollection: "user"
}
When you want to get an unique ID:
db.collection("idSeed").findAndModify({forCollection: "user"}, {}, {$inc: {seedValue: 1}}, {}, function(err, doc) {
var newId = doc.seedValue;
});
The findAndModify is an atomic operator provided by mongodb. It will guarantee thread safety. and the find and modify actually happens in a "transaction".
2nd question is in my answer already.
query subsets of properties is described with mongodb Manual. NodeJS API is almost the same. Read the document of projection parameter.
update subsets is also supported by $set of mongodb operator.

Resources