How can I optimize my MongoDB Upsert statement?


A decision was made to switch our database from SQL to noSQL and I have a few questions on best practices and if my current implementation could be improved.
My current SQL implementation for upserting player data after a game.
let template = Players.map(
(player) =>
`(
${player.Rank},"${player.Player_ID}","${player.Player}",${player.Score},${tpp},1
)`,
).join(',');
let stmt = `INSERT INTO playerStats (Rank, Player_ID, Player, Score, TPP, Games_Played)
VALUES ${template}
ON CONFLICT(Player_ID) DO UPDATE
SET Score = Score+excluded.Score,
Games_Played=Games_Played+1,
TPP=TPP+excluded.TPP`;
db.run(stmt, function (upsert_error) { ...
The code is expected to update existing players by checking whether the Player_ID already exists. If it does, update their score among other things; otherwise insert a new player.
Mongo Implementation
const players = [
  { name: 'George', score: 10, id: 'g65873' },
  { name: 'Wayne', score: 100, id: 'g63853' },
  { name: 'Jhonny', score: 500, id: 'b1234' },
  { name: 'David', score: 3, id: 'a5678' },
  { name: 'Dallas', score: 333333, id: 'a98234' },
];
const db = client.db(dbName);
const results = players.map((player) => {
  // updateOne(query, update, options)
  db.collection('Players')
    .updateOne(
      { Player_Name: player.name },
      {
        $setOnInsert: { Player_Name: player.name, id: player.id },
        $inc: { Score: player.score },
      },
      { upsert: true, multi: true },
    );
});
Is there a better way in mongo to implement this? I tried using updateMany and bulkUpdate and I didn't get the results I expected.
Are there any tips, tricks, or resources aside from the mongo.db that you would recommend for those moving from SQL to noSQL?
Thanks again!

Your approach is fine. However, there are a few flaws:
The updateOne command updates exactly one document, as the name implies, so multi: true has no effect and should be dropped.
Field names are case-sensitive (unlike in most SQL databases). It should be $inc: { score: player.score }, not "Score".
The field Player_Name does not exist, so the filter will never find an existing document to update.
So, your command should be like this:
db.collection('Players').updateOne(
  { name: player.name }, // or { id: player.id } ?
  {
    $setOnInsert: { name: player.name, id: player.id },
    $inc: { score: player.score },
  },
  { upsert: true }
)
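For a whole list of players, one round trip per player can be avoided by batching the upserts. The following is a minimal sketch, assuming the Node.js driver's collection.bulkWrite() and assuming that id is the unique key for a player; buildUpsertOps is a hypothetical helper name, not part of the driver.

```javascript
// Hypothetical helper: turns a list of players into bulkWrite operations.
// Assumes each player looks like { name, score, id } and that `id`
// uniquely identifies a player.
function buildUpsertOps(players) {
  return players.map((player) => ({
    updateOne: {
      filter: { id: player.id },
      update: {
        // Written only when the upsert inserts a new document.
        // `id` is copied from the filter automatically on insert.
        $setOnInsert: { name: player.name },
        // Applied on both insert and update.
        $inc: { score: player.score },
      },
      upsert: true,
    },
  }));
}

const ops = buildUpsertOps([
  { name: 'George', score: 10, id: 'g65873' },
  { name: 'Wayne', score: 100, id: 'g63853' },
]);
console.log(JSON.stringify(ops[0]));

// With a connected client (an assumption here), the batch is sent in one call:
// await db.collection('Players').bulkWrite(ops, { ordered: false });
```

Building the operations in a pure function keeps the batch easy to inspect before it is sent; ordered: false lets the server continue with the remaining operations if one of them fails.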
In my experience, moving from SQL to NoSQL is harder if you try to translate the SQL statement you have in mind into a NoSQL command one-to-one. It worked better for me when I set the SQL idea aside and tried to understand and develop the NoSQL command from scratch.
Of course, with your first find, delete, insert and update you will see many analogies to SQL, but by the time you reach the aggregation framework at the latest, you will be lost if you try to translate it into SQL or vice versa.

Related

MongoDB: Updating the path 'Y' would create a conflict at 'Y'

I have the following query being executed:
const myData = await this.findOneAndUpdate({
myId,
color,
}, {
$setOnInsert: {
myId,
color,
date: now.format('YYYY-MM-DD'),
counter: myObj.counter - amount,
},
$inc: {
counter: -amount
},
}, {
new: true,
upsert: true,
});
I get the error:
"Updating the path 'counter' would create a conflict at 'counter'"
First I thought the error was happening because of the version of mongoose, but I don't think that is the case.
Now I understand this is happening because I have counter in both $setOnInsert and $inc, but I don't understand why.
Also: This code works on MongoDB 3.4.24 Community but does NOT WORK on MongoDB 5.0.11 Community
So my questions are:
Why exactly is this error happening? Could this be a bug?
Why does this work in the older version of MongoDB?
What would be the best approach to refactor this?

MongoDB raises this error because of how $inc behaves in an upsert: when no document matches and upsert: true is set, $inc is applied to the newly inserted document as well. Check this playground.
In your case, when no matching document is found, both $setOnInsert and $inc try to set the counter field on the inserted document. Two operators writing the same path is a conflict, hence the error. To fix it, you can use the pipeline form of updates, something like this:
const myData = await this.findOneAndUpdate({
  myId,
  color,
}, [
  {
    $set: {
      // On an upsert, the new document is seeded with myId and color from the
      // filter, so the fields below are missing only on a fresh insert.
      // Keep the stored date if there is one, otherwise use today's date.
      date: { $ifNull: ['$date', now.format('YYYY-MM-DD')] },
      // Start from myObj.counter on insert, otherwise from the stored counter.
      counter: { $subtract: [{ $ifNull: ['$counter', myObj.counter] }, amount] },
    },
  },
], {
  new: true,
  upsert: true,
});
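The conflict itself can be illustrated without a server: an update document is rejected when two operators target the same path. The sketch below is a simplified illustration and not the server's actual check (the server also treats overlapping parent and child paths, such as a and a.b, as conflicting); findConflictingPaths is a hypothetical name and the sample values are made up.

```javascript
// Simplified illustration: report paths that appear under more than one
// update operator. The real server check is stricter, so treat this as a
// sketch only.
function findConflictingPaths(update) {
  const seen = new Map(); // path -> first operator that used it
  const conflicts = [];
  for (const [operator, fields] of Object.entries(update)) {
    for (const path of Object.keys(fields)) {
      if (seen.has(path)) {
        conflicts.push({ path, operators: [seen.get(path), operator] });
      } else {
        seen.set(path, operator);
      }
    }
  }
  return conflicts;
}

// Same shape as the update in the question, with made-up values.
const update = {
  $setOnInsert: { myId: 1, color: 'red', date: '2022-01-01', counter: 5 },
  $inc: { counter: -2 },
};
// Reports `counter`, which is used by both $setOnInsert and $inc.
console.log(findConflictingPaths(update));
```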

MongoDB upsert an array of objects from a list

I am working on moving my database from sqlite3 to mongo. I went through mongo university, yet I'm not sure I have found a really good example of upserting in bulk.
Use case : user uploads a data file with a list of players and their stats. The app needs to either update a player or add a new player if they do not already exist.
Current Implementation : Function takes a list of Players and creates SQL statement.
let template = '(Player.Rank, Player.Player_ID, Player.Player, Player.Score, TTP, 1),(Player.Rank, Player_ID, ...) ... (... TTP, 1)';
const stmt = `INSERT INTO playerStats (Rank, Player_ID, Player, Score, TPP, Games_Played)
VALUES ${template}
ON CONFLICT(Player_ID) DO UPDATE
SET Score = Score+excluded.Score, TPP=TPP+excluded.TPP, Games_Played=Games_Played+1`;
db.run(stmt, callback);
I'm hoping to have each document be a league that contains players, games, and managers.
Mongo DB document template
{
"LEAGUE_ID": "GEO_SAM",
"Players": [
{
"id": "PlayerID",
"name": "Player",
"score": "Score",
"rank": "Rank",
"xPlayed": "Games_Played",
"ttp": "TTP"
}
],
"Managers": [
{...}
],
"Games": [
{...}
]
}
I am totally lost and not sure how to get this done. Do I need to create a loop and ask mongo to upsert on each iteration? I have searched through countless examples but all of them use static hard coded values.
Here is my testing example.
const query = { League_id : "GEO_SAM", Players.$.id: $in{ $Players }};
const update = { $inc: { score: $Player.Score}};
const options = { upsert: true };
collection.updateMany(query, update, options);
I also don't understand how to pass the entire player object to push to the array if the player_id isn't found.

My solution was to create a metaData field containing the league ID with a single player. If anyone else has a better solution I would love to hear from you.
{
MetaData: { "LEAGUE_ID": "GEO_SAM"},
Player: {
"id": "PlayerID",
"name": "Player",
"score": "Score",
"rank": "Rank",
"xPlayed": "Games_Played",
"ttp": "TTP"
}
}
Then I mapped over the values and upserted each one.
client.connect().then((client) => {
  const db = client.db(dbName);
  // One upsert per player; return each promise so they can be awaited together.
  const results = Players.map((player) =>
    db.collection('Players').updateOne(
      // Match on the same key that $setOnInsert writes.
      { Player_ID: player.Player_ID },
      {
        $setOnInsert: {
          Player_ID: player.Player_ID,
          Player: player.Player,
          Rank: player.Rank,
        },
        $inc: { Score: player.Score, Games_Played: 1, TPP: player.TPP },
      },
      // updateOne has no multi option, so only upsert is passed.
      { upsert: true },
    ),
  );
  return Promise.all(results);
});
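If the players stay embedded in the league document, as in the question's template, a single upsert cannot both increment an existing array element and push a missing one. One common pattern is to issue two updates. The sketch below assumes the field names LEAGUE_ID, Players.id and Players.score from the template; buildLeaguePlayerOps and the Leagues collection name are hypothetical.

```javascript
// Hypothetical helper: builds the two updates needed to "upsert" one player
// inside the Players array of a league document.
function buildLeaguePlayerOps(leagueId, player) {
  return {
    // Step 1: increment the score of the matching array element, if any.
    // The positional `$` refers to the element matched by 'Players.id'.
    incrementExisting: {
      filter: { LEAGUE_ID: leagueId, 'Players.id': player.id },
      update: { $inc: { 'Players.$.score': player.score } },
    },
    // Step 2: run only when step 1 matched nothing. Appends the player and
    // creates the league document if it does not exist yet.
    pushMissing: {
      filter: { LEAGUE_ID: leagueId, 'Players.id': { $ne: player.id } },
      update: { $push: { Players: player } },
      options: { upsert: true },
    },
  };
}

const leagueOps = buildLeaguePlayerOps('GEO_SAM', { id: 'g65873', name: 'George', score: 10 });
console.log(JSON.stringify(leagueOps.incrementExisting));

// With a connected client (an assumption here):
// const col = db.collection('Leagues');
// const { incrementExisting, pushMissing } = leagueOps;
// const res = await col.updateOne(incrementExisting.filter, incrementExisting.update);
// if (res.matchedCount === 0) {
//   await col.updateOne(pushMissing.filter, pushMissing.update, pushMissing.options);
// }
```

The two steps are not atomic together, so a unique index on LEAGUE_ID is advisable to keep a concurrent upsert from creating a second league document.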

Mongoose full text search not filtering correctly

So basically I have a model with a bunch of string fields, like so:
const Schema: Schema = new Schema(
{
title: {
type: String,
trim: true
},
description: {
type: String,
trim: true
},
...
}
);
Schema.index({ '$**': 'text' });
export default mongoose.model('Watch', Schema);
where I index all of them.
Because this schema is used as a ref for another model, I search through populate like this, where user is an instance of the other model:
const { search, limit = 5 } = req.query;
const query = search && { match: { $text: { $search: new RegExp(search, 'i') } } };
const { schemaRes } = await user
.populate({
path: 'schema',
...query,
options: {
limit
}
})
.execPopulate();
The search itself seems to work, but when the search terms become more specific, the results do not narrow down the way I would expect.
Example
db
{ title: 'Rolex', name: 'Submariner', description: 'Nice' }
{ title: 'Rolex', name: 'Air-King', description: 'Nice' }
When the search param is Rolex I get both items, which is fine, but when the search param becomes Rolex Air-King I keep getting both items, which is not what I want: I would rather get only one.
Is there something I could do to achieve this?

Returning both items is correct, since both items match your search params, but with different similarity scores.
You can output the similarity score to help sorting the result.
user.aggregate([
  { $match: { $text: { $search: "Rolex Air-King" } } },
  { $set: { score: { $meta: "textScore" } } }
])
// new RegExp("Rolex Air-King", 'i') is not necessary and even invalid,
// as $search accepts string and is already case-insensitive by default
The query will return
[{
"_id": "...",
"title": "Rolex",
"name": "Air-King",
"description": "Nice",
"score": 2.6
},
{
"_id": "....",
"title": "Rolex",
"name": "Submariner",
"description": "Nice",
"score": 1.1
}]
Since the second result item matches your search query (even partially), MongoDB returns it.
You could use the score to help sort the items. But determining the right threshold to filter the result is complex, as the score depends on the word count as well.
On a side note: You can assign different weights to the fields if they are not equally important
https://docs.mongodb.com/manual/tutorial/control-results-of-text-search/
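If the goal is to return only documents that contain every search term, a commonly used workaround is to wrap each term in double quotes, since $text treats quoted phrases as required rather than optional. This is a minimal sketch under that assumption; toRequiredTerms is a hypothetical helper, and phrase matching is not stemmed, so it can behave differently from plain term search.

```javascript
// Hypothetical helper: wrap each whitespace-separated term in double quotes
// so that $text treats it as a required phrase instead of an optional term.
function toRequiredTerms(search) {
  return search
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    // Drop embedded quotes so a term cannot break out of its phrase.
    .map((term) => `"${term.replace(/"/g, '')}"`)
    .join(' ');
}

const searchString = toRequiredTerms('Rolex Air-King');
console.log(searchString); // "Rolex" "Air-King"

// The result would then be passed as the $search value, for example:
// { $match: { $text: { $search: searchString } } }
```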

Conditional update, depending on field matched

Say I have a collection of documents, each one managing a discussion between a teacher and a student:
{
_id,
teacherId,
studentId,
teacherLastMessage,
studentLastMessage
}
I will get queries with 3 parameters: an _id, a userId and a message.
I'm looking for a way to update the teacherLastMessage field or studentLastMessage field depending on which one the user is.
At the moment, I have this:
return Promise.all([
// if user is teacher, set teacherLastMessage
db.collection('discussions').findOneAndUpdate({
teacherId: userId,
_id
}, {
$set: {
teacherLastMessage: message
}
}, {
returnOriginal: false
}),
// if user is student, set studentLastMessage
db.collection('discussions').findOneAndUpdate({
studentId: userId,
_id
}, {
$set: {
studentLastMessage: message
}
}, {
returnOriginal: false
})
]).then((results) => {
results = results.filter((result) => result.value);
if (!results.length) {
throw new Error('No matching document');
}
return results[0].value;
});
Is there a way to tell mongo to make a conditional update, based on the field matched? Something like this:
db.collection('discussions').findOneAndUpdate({
$or: [{
teacherId: userId
}, {
studentId: userId
}],
_id
}, {
$set: {
// if field matched was studentId, set studentLastMessage
// if field matched was teacherId, set teacherLastMessage
}
});
Surely it must be possible with mongo 3.2?

What you want would require referencing other fields inside of $set. This is currently impossible. Refer to this ticket as an example.
First of all, your current approach with two update queries looks just fine to me. You can continue using that, just make sure that you have the right indexes in place. Namely, to get the best performance for these updates, you should have two compound indexes:
{ _id: 1, teacherId: 1 }
{ _id: 1, studentId: 1 }.
To look at this from another perspective, you should probably restructure your data. For example:
{
  _id: '...',
  users: [
    {
      userId: '...',
      userType: 'student',
      lastMessage: 'lorem ipsum'
    },
    {
      userId: '...',
      userType: 'teacher',
      lastMessage: 'dolor sit amet'
    }
  ]
}
This would allow you to perform your update with a single query.
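With that structure, the single query could look roughly like the sketch below. It relies on the positional $ operator, which targets the array element matched by the filter; buildLastMessageUpdate is a hypothetical helper name and the sample values are made up.

```javascript
// Builds the filter and update for the restructured document, where each
// participant is an element of `users`. The positional `$` refers to the
// array element matched by 'users.userId' in the filter.
function buildLastMessageUpdate(_id, userId, message) {
  return {
    filter: { _id, 'users.userId': userId },
    update: { $set: { 'users.$.lastMessage': message } },
  };
}

const { filter, update } = buildLastMessageUpdate('d1', 'u42', 'hello');
console.log(JSON.stringify({ filter, update }));

// With a connected client (an assumption here):
// await db.collection('discussions').findOneAndUpdate(filter, update);
```

The same call works whether the user is the teacher or the student, because the role no longer decides which field is written.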

Your data structure is a bit unusual. Unless you have a specific business case that requires the data to be shaped that way, I would suggest adding a user type. If a user can be both a teacher and a student, keep your structure.
$set takes an object, so my suggestion is to do your business logic beforehand. You should already know, before the update, whether it is for a teacher or a student: some variable or authentication level should distinguish teachers from students. Perhaps you could set a cookie or local storage entry in the callback of a successful login. Either way, if you know the current type of user, you can build the object earlier, as an object literal with the properties you need for that user type.
So:
let updateObj;
if (student) {
  updateObj = { studentLastMessage: msg };
} else {
  updateObj = { teacherLastMessage: msg };
}
Then pass updateObj as the value of $set in your update, i.e. { $set: updateObj }.
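On newer servers (MongoDB 4.2 and later), an update can also be written as an aggregation pipeline, and pipeline stages can reference existing fields. That allows the field to be chosen inside one update, which was not available on the 3.2 version mentioned in the question. A minimal sketch, assuming the field names from the question; buildConditionalUpdate is a hypothetical helper name.

```javascript
// Builds a pipeline-style update that sets whichever last-message field
// belongs to the given user and leaves the other one unchanged.
function buildConditionalUpdate(userId, message) {
  // $literal keeps a value that starts with "$" from being read as a field path.
  const user = { $literal: userId };
  const value = { $literal: message };
  const setIfUser = (idField, messageField) => ({
    $cond: [{ $eq: [`$${idField}`, user] }, value, `$${messageField}`],
  });
  return [
    {
      $set: {
        teacherLastMessage: setIfUser('teacherId', 'teacherLastMessage'),
        studentLastMessage: setIfUser('studentId', 'studentLastMessage'),
      },
    },
  ];
}

const pipeline = buildConditionalUpdate('u1', 'hi');
console.log(JSON.stringify(pipeline));

// With a connected client (an assumption here), using the $or filter
// from the question:
// await db.collection('discussions').findOneAndUpdate(
//   { _id, $or: [{ teacherId: userId }, { studentId: userId }] },
//   pipeline,
// );
```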

Sorting by virtual field in mongoDB (mongoose)

Let's say I have some Schema which has a virtual field like this
var schema = new mongoose.Schema(
{
name: { type: String }
},
{
toObject: { virtuals: true },
toJSON: { virtuals: true }
});
schema.virtual("name_length").get(function(){
return this.name.length;
});
In a query is it possible to sort the results by the virtual field? Something like
schema.find().sort("name_length").limit(5).exec(function(docs){ ... });
When I try this, the results are simply not sorted...

You won't be able to sort by a virtual field in a query, because virtuals are not stored in the database.
Virtual attributes are attributes that are convenient to have around but that do not get persisted to mongodb.
http://mongoosejs.com/docs/2.7.x/docs/virtuals.html
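Since the virtual only exists once the documents have been loaded, one remaining option is to sort in application code. The sketch below uses plain objects in place of Mongoose documents (an assumption, to keep the example runnable on its own). It only makes sense for result sets small enough to hold in memory, and the limit has to be applied after the sort, not in the query.

```javascript
// Plain objects stand in for documents returned by a query.
const docs = [{ name: 'Charlotte' }, { name: 'Al' }, { name: 'Bobby' }];

// Same computation as the name_length virtual.
const nameLength = (doc) => doc.name.length;

// Copy before sorting so the original array is left untouched,
// then keep the first five results.
const shortestFirst = [...docs]
  .sort((a, b) => nameLength(a) - nameLength(b))
  .slice(0, 5);

console.log(shortestFirst.map((d) => d.name)); // [ 'Al', 'Bobby', 'Charlotte' ]
```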

Virtuals defined in the Schema are not injected into the generated MongoDB queries. The functions defined are simply run for each document at the appropriate moments, once they have already been retrieved from the database.
In order to reach what you're trying to achieve, you'll also need to define the virtual field within the MongoDB query. For example, in the $project stage of an aggregation.
There are, however, a few things to keep in mind when sorting by virtual fields:
projected fields are computed at query time and exist only in memory, so adding a field and sorting on it means holding every matched document in memory before the sort, which can carry a large performance cost
because of the above, the sort on the computed field cannot use an index
Here's a general example on how to sort by virtual fields while keeping a relatively good performance:
Imagine you have a collection of teams and each team contains an array of players directly stored into the document. Now, the requirement asks for us to sort those teams by the ranking of the favoredPlayer where the favoredPlayer is basically a virtual property containing the most relevant player of the team under certain criteria (in this example we only want to consider offense and defense players). Also, the aforementioned criteria depend on the users' choices and can, therefore, not be persisted into the document.
To top it off, our "team" document is pretty large, so in order to mitigate the performance hit of sorting in-memory, we project only the fields we need for sorting and then restore the original document after limiting the results.
The query:
[
  // find all teams from germany
  { '$match': { country: 'de' } },
  // project only the sort-relevant fields
  // and add the virtual favoredPlayer field to each team
  { '$project': {
    rank: 1,
    'favoredPlayer': {
      '$arrayElemAt': [
        {
          // keep only players that match our criteria
          $filter: {
            input: '$players',
            as: 'p',
            cond: { $in: ['$$p.position', ['offense', 'defense']] },
          },
        },
        // take first of the filtered players since players are already sorted by relevance in our db
        0,
      ],
    },
  }},
  // sort teams by the ranking of the favoredPlayer
  { '$sort': { 'favoredPlayer.ranking': -1, rank: -1 } },
  { '$limit': 10 },
  // $lookup, $unwind, and $replaceRoot are in order to restore the original database document
  { '$lookup': { from: 'teams', localField: '_id', foreignField: '_id', as: 'subdoc' } },
  { '$unwind': { path: '$subdoc' } },
  { '$replaceRoot': { newRoot: '$subdoc' } },
];
For the example you gave above, the code could look something like the following:
var schema = new mongoose.Schema(
  { name: { type: String } },
  {
    toObject: { virtuals: true },
    toJSON: { virtuals: true },
  });
schema.virtual('name_length').get(function () {
  return this.name.length;
});
const MyModel = mongoose.model('Thing', schema);
MyModel
  .aggregate()
  .project({
    'name_length': {
      '$strLenCP': '$name',
    },
  })
  .sort({ 'name_length': -1 })
  .exec(function(err, docs) {
    console.log(docs);
  });
