MongoDB/Mongoose: Query for valid document property - node.js

Using Mongoose for MongoDB, I store several collections of data, each defined by a Mongoose schema.
1) Is there an easy way (without explicitly querying the database) to find out whether a specific property is part of a particular collection schema model?
Let's say I have a collection of users, including information about name and address. At runtime I receive, by mistake, data which is supposed to be stored in the user's document but does not (fully) comply with the schema (e.g. a shoe size is included).
2) I know that Mongoose refuses to save the data set in that case, but how (if at all) do I get some sort of feedback about that, so I can report back appropriately to the client?

I think the fastest way to check whether a certain collection contains documents that have the field that you mention is to run a count query with the $exists operator on each collection:
db.collection1.count({ field: { $exists: true }});
db.collection2.count({ field: { $exists: true }});
db.collection3.count({ field: { $exists: true }});
Afterwards, you can save the return value of each count operation in a variable and pass it to the client, thus making it possible to convey a message to the end-user.
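For illustration, a minimal sketch of wiring those counts into a response (the model names, the shoeSize field, and the Express-style handler are assumptions, not from the question):
// Count the documents in each collection that carry the unexpected field,
// then pass the totals back so the client can show a message.
const [users, orders] = await Promise.all([
    User.countDocuments({ shoeSize: { $exists: true } }),
    Order.countDocuments({ shoeSize: { $exists: true } }),
]);
res.json({ unexpectedField: 'shoeSize', occurrences: users + orders });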

Related

ravendb NodeJS, load related document and create a nested result in a query

I have an index that returns something like this
Company_All {
    name: string;
    id: string;
    agentDocumentId: string;
}
Is it possible to load the related agent document and then generate a nested result with selectFields and QueryData, like this?
ICompanyView {
    companyName: 'Warner',
    user: {
        documentId: 'A/1',
        firstName: 'john',
        lastName: 'paul'
    }
}
I need something like the below query that obviously doesn't work as I expect:
const queryData = new QueryData(
    ["name", "agentDocumentId", "agent.firstName", "agent.lastName"],
    ["companyName", "user.documentId", "user.lastName", "user.firstName"]);
return await session.query<Company_AllResult>({ index: Company_All })
    .whereEquals("companyId", request.companyId)
    .include(`agents/${agentDocumentId}`) // ????
    .selectFields(queryData, ICompanyView)
    .single();
Yes, you can do that using:
https://ravendb.net/docs/article-page/5.4/nodejs/indexes/indexing-related-documents
This is called indexing related documents, and is accessible at indexing time, not query time.
Alternatively, you have the filter clause, which has access to the loaded document, but I wouldn't generally recommend doing this.
Generally:
When you query an index, the results of querying the index are the documents from the collection the index was defined on.
Index-fields defined in the index are used to filter the index-query, but the results are still documents from the original collection.
If you define an index that indexes content from a related-document then when making an index-query you can filter the documents by the indexed-fields from the related documents, but the results are still documents from the original collection.
When making an index-query (or any other query) you can project the query results, so that instead of the full documents of the original collection, some other object is returned.
Now:
To project/get data from the indexed related-document you have 2 options:
Store the index-fields from the related-document in the index.
(Store all -or- specific fields).
This way you have access to that content when making a projection in your query.
See this code sample.
Don't store the index-fields from the related-document. You will still be able to filter by the index-fields in your query, but to get the content you will need to use the 'include' feature in your query and then use session.load, which will not make another trip to the server.
i.e. https://demo.ravendb.net/demos/nodejs/related-documents/query-related-documents
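A rough sketch of that second option (assuming the index result exposes agentDocumentId, per the question; the exact query shape is an assumption):
// Filter via the index, ship the related agent documents along with
// the results, then load them from the session without a second round trip.
const companies = await session.query({ index: Company_All })
    .whereEquals("companyId", request.companyId)
    .include("agentDocumentId")
    .all();
// Already tracked by the session thanks to the include:
const agent = await session.load(companies[0].agentDocumentId);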

Dealing with race conditions and starvation when generating unique IDs using MongoDB + NodeJS

I am using MongoDB to generate unique IDs of this format:
{ID TYPE}{ZONE}{ALPHABET}{YY}{XXXXX}
Here ID TYPE will be a letter from {U, E, V} depending on the input, ZONE will be from the set {N, S, E, W}, YY will be the last two digits of the current year, and XXXXX will be a 5-digit number beginning from 0 (padded with 0s to make it 5 digits long). When XXXXX reaches 99999, the ALPHABET part will be incremented to the next letter (starting from A).
I will receive ID TYPE and ZONE as input and will have to give the generated unique ID as output. Every time I have to generate a new ID, I will read the last generated ID for the given ID TYPE and ZONE, increment the number part by 1 (XXXXX + 1), and then save the new generated ID in MongoDB and return the output to the user.
This code will run on a single NodeJS server, and multiple clients can call this method.
Is there a possibility of a race condition like the one described below if I am only running a single server instance?
First client reads last generated ID as USA2100000
Second client reads last generated ID as USA2100000
First client generates the new ID and saves it as USA2100001
Second client generates the new ID and saves it as USA2100001
Since two clients have generated IDs, the DB should finally have USA2100002.
To overcome this, I am using MongoDB transactions. My code in Typescript using Mongoose as ODM is something like this:
const session = await startSession();
session.startTransaction();
// Note: .value must be read from the awaited document, and each query
// needs the session passed to it to actually run inside the transaction.
const lastDoc = await GeneratedId.findOne({ key: idKeyStr }, "value").session(session);
let lastId = createNextId(lastDoc?.value);
const newIdObj: any = {
    key: `Type:${idPrefix}_Zone:${zone_letter}`,
    value: lastId,
};
await GeneratedId.findOneAndUpdate({ key: idKeyStr }, newIdObj, {
    upsert: true,
    new: true,
}).session(session);
await session.commitTransaction();
session.endSession();
I want to know what exactly will happen when the situation described above occurs with this code.
Will the second client's transaction throw an exception that I have to handle by aborting or retrying the transaction in my code, or will it handle the retry on its own?
How does MongoDB (or other DBs) handle transactions? Does MongoDB lock the documents involved in the transaction? Are the locks exclusive (won't even allow other clients to read)?
If the same client keeps failing to commit its transaction, this client would be starved. How to deal with this starvation?
You are using MongoDB to store the ID; that's state. Generation of the ID is a function. You would be using MongoDB to generate the ID only if the mongodb process took the function's arguments and returned the generated ID. That's not what you are doing: you are using nodejs to generate the ID.
The number of threads, or rather event loops, is critical as it defines the architecture, but either way you don't need transactions. Transactions in mongodb are called "multi-document transactions" exactly to highlight that they are intended for consistent updates of several documents at once. The very first paragraph of https://docs.mongodb.com/manual/core/transactions/ warns you that if you update a single document there is no room for transactions.
A single-threaded application does not require any synchronisation. You can reliably read the latest generated ID on start and guarantee the ID is unique within the nodejs process. If you exclude mongodb and other I/O from the generation function, you make it synchronous, so you can maintain the state of the ID within the nodejs process and guarantee its uniqueness. Once generated, you can persist it in the db asynchronously. In the worst-case scenario you may have a gap in the sequential numbers, but no duplicates.
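A rough sketch of that single-process approach (GeneratedId, idKeyStr and createNextId follow the question; the wiring itself is an assumption):
// In-process state: loaded once at startup. The single event loop
// guarantees no two calls interleave inside the synchronous section.
let lastId;
async function init() {
    const doc = await GeneratedId.findOne({ key: idKeyStr }, "value");
    lastId = doc ? doc.value : undefined;
}
function nextId() {
    lastId = createNextId(lastId); // synchronous: unique within this process
    // Persist asynchronously; a crash may leave a gap in the sequence,
    // but never a duplicate.
    GeneratedId.updateOne({ key: idKeyStr }, { $set: { value: lastId } }, { upsert: true })
        .catch(console.error);
    return lastId;
}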
If there is the slightest chance that you may need to scale up to more than 1 nodejs process to handle more simultaneous requests, or add another host for redundancy in the future, you will need to sync generation of the ID, and you can employ Mongodb unique indexes for that. The function itself doesn't change much: you still generate the ID as in the single-threaded architecture, but add an extra step to save the ID to mongo. The document should have a unique index on the ID field, so in case of concurrent updates one of the queries will successfully add the document and the other will fail with "E11000 duplicate key error". You catch such errors on the nodejs side and run the function again, picking the next number:
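A minimal sketch of that retry loop (assuming a synchronous createNextId helper and a unique index on the value field; the loop itself is an illustration, not code from the thread):
async function generateUniqueId(lastId) {
    for (;;) {
        lastId = createNextId(lastId);
        try {
            // The unique index rejects a duplicate value with E11000.
            await GeneratedId.create({ value: lastId });
            return lastId; // insert succeeded, so this ID is ours
        } catch (err) {
            if (err.code !== 11000) throw err; // unrelated failure: rethrow
            // E11000: another process claimed this ID; pick the next number.
        }
    }
}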
This is what you can try. You need to store only one document in the GeneratedId collection. This document will have the last generated ID's value. The document must have a known _id field; for example, let's say it will be an integer with value 1. So, the document can be like this:
{ _id: 1, lastGeneratedId: "<some value>" }
In your application, you can use the findOneAndUpdate() method with a filter { _id: 1 }, which means you are targeting a single-document update. This update will be an atomic operation; as per the MongoDB documentation, "All write operations in MongoDB are atomic on the level of a single document." Do you need a transaction in this case? No. The update operation is atomic and performs better than using a transaction. See Update Documents - Atomicity.
Then, how do I generate the new generated id and retrieve it?
I will receive ID TYPE and ZONE...
Using the above input values and the existing lastGeneratedId value you can arrive at the new value and update the document (with the new value). The new value can be calculated / formatted within the Aggregation Pipeline of the update operation - you can use the feature Updates with Aggregation Pipeline (this is available with MongoDB v4.2 or higher).
Note the findOneAndUpdate() method returns the updated (or modified) document when you use the update option new: true. This returned document will have the newly generated lastGeneratedId value.
The update method can look like this (using NodeJS driver or even Mongoose):
const filter = { _id: 1 }
const update = [
    { $set: { lastGeneratedId: { /* your calculation of the new value goes here... */ } } }
]
const options = { new: true, projection: { _id: 0, lastGeneratedId: 1 } }
const newId = (await GeneratedId.findOneAndUpdate(filter, update, options))['lastGeneratedId']
Note about the JavaScript function:
With MongoDB v4.4 you can use JavaScript functions within an Aggregation Pipeline; and this is applicable for the Updates with Aggregation Pipeline. For details see $function aggregation pipeline operator.
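For illustration, a hedged sketch of such a pipeline update (the counter and prefix fields are assumptions; the question's full ID format would need more $concat parts):
const padded = { $concat: ["00000", { $toString: "$counter" }] };
const update = [
    // Stage 1: atomically increment a numeric counter stored in the document.
    { $set: { counter: { $add: ["$counter", 1] } } },
    // Stage 2 sees the incremented counter and formats the zero-padded part.
    { $set: { lastGeneratedId: { $concat: [
        "$prefix",
        { $substrCP: [padded, { $subtract: [{ $strLenCP: padded }, 5] }, 5] }
    ] } } }
];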

MongoDB ignore <query> on upsert

I'm trying to upsert a document into MongoDB using the main Node driver. I want to query by _id, and if that _id doesn't exist, then create a new document with a normal ObjectId. However, from the MongoDB docs:
The update creates a base document from the equality clauses in the query parameter, and then applies the update expressions from the update parameter.
(https://docs.mongodb.com/manual/reference/method/db.collection.update/#upsert-behavior)
Meaning that it will try to use whatever I compared the _id to in the query section as the document's new id. The problem is that, if the document doesn't exist yet, I'm comparing _id to null, so when it creates the new document, it sets the _id as null. Is there a way to avoid this behavior? I want to query by _id, but not use whatever I compare it to in the upsert. Here's my code so far:
dbm.collection('orders').findOneAndUpdate(
    {
        _id: order._id
    },
    {
        $set: order,
        $setOnInsert: { _id: ObjectId() } // tried with and without this
    },
    {
        upsert: true,
        returnOriginal: false
    }).then().catch()
Here, order is an object with a few fields. And for my purposes, I need to query by _id, not some other indexed field.
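To make the quoted upsert behavior concrete, a minimal sketch (the collection and field names are illustrative):
// If the filter is { _id: null }, the equality clause seeds the new
// document, so the upsert inserts { _id: null, item: 'demo' } instead
// of generating a fresh ObjectId.
await dbm.collection('orders').updateOne(
    { _id: null },
    { $set: { item: 'demo' } },
    { upsert: true }
);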

MongoError: E11000 duplicate key error collection cms_demo1.posts index: username_1 dup key: { : null } [duplicate]

Following is my user schema in user.js model -
var userSchema = new mongoose.Schema({
    local: {
        name: { type: String },
        email: { type: String, require: true, unique: true },
        password: { type: String, require: true },
    },
    facebook: {
        id: { type: String },
        token: { type: String },
        email: { type: String },
        name: { type: String }
    }
});
var User = mongoose.model('User', userSchema);
module.exports = User;
This is how I am using it in my controller -
var user = require('./../models/user.js');
This is how I am saving it in the db -
user({ 'local.email': req.body.email, 'local.password': req.body.password }).save(function (err, result) {
    if (err)
        res.send(err);
    else {
        console.log(result);
        req.session.user = result;
        res.send({ "code": 200, "message": "Record inserted successfully" });
    }
});
Error -
{"name":"MongoError","code":11000,"err":"insertDocument :: caused by :: 11000 E11000 duplicate key error index: mydb.users.$email_1 dup key: { : null }"}
I checked the db collection and no such duplicate entry exists; let me know what I am doing wrong.
FYI - req.body.email and req.body.password are fetching values.
I also checked this post but no help STACK LINK
If I remove it completely, then it inserts the document; otherwise it throws the "Duplicate" error even though I have an entry in local.email.
The error message is saying that there's already a record with null as the email. In other words, you already have a user without an email address.
The relevant documentation for this:
If a document does not have a value for the indexed field in a unique index, the index will store a null value for this document. Because of the unique constraint, MongoDB will only permit one document that lacks the indexed field. If there is more than one document without a value for the indexed field or is missing the indexed field, the index build will fail with a duplicate key error.
You can combine the unique constraint with the sparse index to filter these null values from the unique index and avoid the error.
unique indexes
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value.
In other words, a sparse index is ok with multiple documents all having null values.
sparse indexes
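For example, a minimal sketch in the shell (the field path follows the question; combining the two options described above):
// unique + sparse: documents without local.email stay out of the index,
// so multiple null/missing emails no longer collide.
db.users.createIndex({ "local.email": 1 }, { unique: true, sparse: true });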
From comments:
Your error says that the key is named mydb.users.$email_1, which makes me suspect that you have an index on both users.email and users.local.email (the former being old and unused at the moment). Removing a field from a Mongoose model doesn't affect the database. Check with db.users.getIndexes() if this is the case, and manually remove the unwanted index with db.users.dropIndex(<name>).
If you are still in your development environment, I would drop the entire db and start over with your new schema.
From the command line
➜ mongo
use dbName;
db.dropDatabase();
exit
I want to explain the answer/solution to this like I am explaining to a 5-year-old, so everyone can understand.
I have an app. I want people to register with their email, password and phone number.
In my MongoDB database, I want to identify people uniquely based on both their phone numbers and email - so this means that both the phone number and the email must be unique for every person.
However, there is a problem: I have realized that everyone has a phone number but not everyone has an email address.
Those that don't have an email address have promised me that they will have one by next week. But I want them registered anyway - so I tell them to proceed registering their phone numbers as they leave the email input field empty.
They do so.
My database NEEDS a unique email address field - but I have a lot of people with 'null' as their email address. So I go to my code and tell my database schema to allow empty/null email address fields, which I will later fill in with unique email addresses when the people who promised to add their emails do so next week.
So it's now a win-win for everyone (but you ;-] ): the people register, I am happy to have their data ...and my database is happy because it is being used nicely ...but what about you? I am yet to give you the code that made the schema.
Here is the code :
NOTE: The sparse property on email is what tells my database to allow null values that will later be filled with unique values.
var userSchema = new mongoose.Schema({
    local: {
        name: { type: String },
        email: { type: String, require: true, index: true, unique: true, sparse: true },
        password: { type: String, require: true },
    },
    facebook: {
        id: { type: String },
        token: { type: String },
        email: { type: String },
        name: { type: String }
    }
});
var User = mongoose.model('User', userSchema);
module.exports = User;
I hope I have explained it nicely .
Happy NodeJS coding / hacking!
In this situation, log in to Mongo, find the index that you are not using anymore (in OP's case 'email'), and then select Drop Index.
Check the collection indexes.
I had this issue due to outdated indexes on the collection, for fields that had since moved to a different path.
Mongoose adds an index when you specify a field as unique.
Well, basically this error is saying that you had a unique index on a particular field, for example "email_address", so mongodb expects a unique email address value for each document in the collection.
So let's say that earlier in your schema the unique index was not defined, and then you signed up 2 users with the same email address or with no email address (null value).
Later, you saw that there was a mistake, so you try to correct it by adding a unique index to the schema. But your collection already has duplicates, so the error message says that you can't insert a duplicate value again.
You essentially have three options:
Drop the collection
db.users.drop();
Find the document which has that value and delete it. Let's say the value was null; you can delete it using:
db.users.remove({ email_address: null });
Drop the Unique index:
db.users.dropIndex(indexName)
I hope this helped :)
Edit: This solution still works in 2023 and you don't need to drop your collection or lose any data.
Here's how I solved the same issue in September 2020. There is a super-fast and easy way from MongoDB Atlas (cloud and desktop). Probably it was not that easy before? That is why I feel like I should write this answer in 2020.
First of all, I read above some suggestions of changing the "unique" field on the mongoose schema. If you came up with this error I assume you already changed your schema, but despite that you got a 500 as your response, and notice this: the error is specifying a duplicated KEY! If the problem were caused by schema configuration, and assuming you have configured decent middleware to log mongo errors, the response would be a 400.
Why this happens (at least the main reason)
Why is that? In my case it was simple: that field on the schema used to accept only unique values, but I had just changed it to accept repeated values. Mongodb creates indexes for fields with unique values in order to retrieve the data faster, so in the past mongo had created that index for that field, and even after setting the "unique" property to "false" on the schema, mongodb was still using that index and treating the field as if it had to be unique.
How to solve it
Drop that index. You can do it in 2 seconds from Mongo Atlas, or by executing it as a command in the mongo shell. For the sake of simplicity I will show the first way, for users who are not using the mongo shell.
Go to your collection. By default you are on the "Find" tab. Just select the next one on the right: "Indexes". You will see that there is still an index on the very field that is causing you trouble. Just click the button "Drop Index". Done.
So don't drop your database every time this happens.
I believe this is a better option than dropping your entire database or even collection. Basically, the stale index is why it works after dropping the entire collection: mongo is not going to set an index for that field if your first entry uses your new schema with "unique: false".
I faced similar issues.
I just cleared the indexes of the particular fields, and then it worked for me.
https://docs.mongodb.com/v3.2/reference/method/db.collection.dropIndexes/
This is my relevant experience:
In the 'User' schema, I set 'name' as a unique key and then ran some code, which I think set up the database structure.
Then I changed the unique key to 'username' and no longer passed a 'name' value when I saved data to the database. So mongodb automatically set the 'name' value of each new record to null, which is a duplicate key. I tried setting the 'name' key as a non-unique key {name: {unique: false, type: String}} in the 'User' schema in order to override the original setting. However, it did not work.
At last, I made my own solution:
Just set a random value that is unlikely to be duplicated for the 'name' key when you save your data record. Simply '' + Math.random() + Math.random() makes a random string.
I had the same issue. I tried debugging it in different ways but couldn't figure it out. I tried dropping the collection and it worked fine after that. Although this is not a good solution if your collection has many documents, if you are in the early stages of development try dropping the collection.
db.users.drop();
I solved my problem this way:
Just go to your MongoDB account -> Atlas collection, then drop the offending index. Or go to MongoDB Compass, then drop your database.
This sometimes happens when you have saved something null in the database.
This is because there is already a collection with the same name with an old configuration. Just remove the collection from your mongodb through the mongo shell and try again.
db.collectionName.drop()
Now run your application; it should work.
I had a similar problem, and I realized that with Mongoose a collection is effectively tied to one schema by its existing indexes. Either store your new schema in a different collection, or delete the existing documents with the incompatible schema from your current collection. Or find a way to have more than one schema per collection.
I got this same issue when I had the following configuration in my config/models.js
module.exports.models = {
    connection: 'mongodb',
    migrate: 'alter'
}
Changing migrate from 'alter' to 'safe' fixed it for me.
module.exports.models = {
    connection: 'mongodb',
    migrate: 'safe'
}
Same issue here, after removing properties from a schema that had already built some indexes on save. Removing a property from the schema leads to a null value for the now non-existent property, which still had an index. Dropping the index, or starting with a new collection from scratch, helps here.
Note: the error message will guide you in that case. It contains a path that does not exist anymore. In my case the old path was ...$uuid_1 (this is an index!), but the new one is ....*priv.uuid_1
I have also faced this issue and solved it.
This error shows that an email is already present here. So you just need to remove the following line from your model for the email attribute:
unique: true
It is possible that even this won't work. In that case, just delete the collection from your MongoDB and restart your server.
It's not a big issue, but beginner-level developers like me wonder what kind of error this is and waste a huge amount of time solving it.
Actually, if you delete the db, create the db once again, and then try to create the collection, it will work properly.
➜ mongo
use dbName;
db.dropDatabase();
exit
Drop your database; then it will work.
You can perform the following steps to drop your database
step 1 : Go to mongodb installation directory, default dir is "C:\Program Files\MongoDB\Server\4.2\bin"
step 2 : Start mongod.exe directly or using command prompt and minimize it.
step 3 : Start mongo.exe directly or using command prompt and run the following command
i) use yourDatabaseName (use show databases if you don't remember database name)
ii) db.dropDatabase()
This will remove your database.
Now you can insert your data, it won't show error, it will automatically add database and collection.
I had the same issue when I tried to modify the schema defined using mongoose. I think the issue is due to underlying work done when creating a collection, like describing the indices, which is hidden from the user (at least in my case). So the best solution I found was to drop the entire collection and start again.
If you are in the early stages of development: eliminate the collection. Otherwise: add the following to each attribute that gives you the error (note: my English is not good, but I will try to explain it):
index: true,
unique: true,
sparse: true
In my case, I just forgot to return res.status(400) after finding that a user with req.email already exists.
Go to your database, click on that particular collection, and delete all the indexes except _id.

Mongoose: How to populate 2 levels deep without populating fields of the first level? (MongoDB)

Here is my Mongoose Schema:
var SchemaA = new Schema({
    field1: String,
    .......
    fieldB: { type: Schema.Types.ObjectId, ref: 'SchemaB' }
});
var SchemaB = new Schema({
    field1: String,
    .......
    fieldC: { type: Schema.Types.ObjectId, ref: 'SchemaC' }
});
var SchemaC = new Schema({
    field1: String,
    .......
    .......
    .......
});
When I access SchemaA using a find query, I want to have the fields/properties of SchemaA along with SchemaB and SchemaC, in the same way as a join operation in a SQL database.
This is my approach:
SchemaA.find({})
    .populate('fieldB')
    .exec(function (err, result) {
        SchemaB.populate(result.fieldC, { path: 'fieldB' }, function (err, result) {
            .............................
        });
    });
The above code is working perfectly, but the problem is:
I want to have the information/properties/fields of SchemaC through SchemaA, and I don't want to populate the fields/properties of SchemaB.
The reason for not wanting the properties of SchemaB is that the extra population slows the query unnecessarily.
Long story short:
I want to populate SchemaC through SchemaA without populating SchemaB.
Can you please suggest any way/approach?
As an avid mongodb fan, I suggest you use a relational database for highly relational data - that's what it's built for. You are losing all the benefits of mongodb when you have to perform 3+ queries to get a single object.
Buuuuuut, I know that comment will fall on deaf ears. Your best bet is to be as conscious as you can about performance. Your first step is to limit the fields to the minimum required. This is just good practice even with basic queries and any database engine - only get the fields you need (e.g. SELECT * FROM === bad... just stop doing it!). You can also try doing lean queries to help save a lot of the post-processing work mongoose does with the data. I didn't test this, but it should work...
SchemaA.find({}, 'field1 fieldB', { lean: true })
    .populate({
        path: 'fieldB', // note: populate's option is `path` (Mongoose ignores `name`)
        select: 'fieldC',
        options: { lean: true }
    }).exec(function (err, result) {
        // not sure how you are populating "result" in your example, as it should be an array,
        // but you said your code works... so I'll let you figure out what goes here.
    });
Also, a very "mongo" way of doing what you want is to save a reference in SchemaC back to SchemaA. When I say "mongo" way of doing it, you have to break away from your years of thinking about relational data queries. Do whatever it takes to perform fewer queries on the database, even if it requires two-way references and/or data duplication.
For example, if I had a Book schema and Author schema, I would likely save the authors first and last name in the Books collection, along with an _id reference to the full profile in the Authors collection. That way I can load my Books in a single query, still display the author's name, and then generate a hyperlink to the author's profile: /author/{_id}. This is known as "data denormalization", and it has been known to give people heartburn. I try and use it on data that doesn't change very often - like people's names. In the occasion that a name does change, it's trivial to write a function to update all the names in multiple places.
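A hedged sketch of that Book/Author denormalization (the schema and field names are illustrative):
// The author's name is duplicated into Book so a list of books renders
// with a single query; the _id still links to the full Author profile.
var BookSchema = new Schema({
    title: String,
    author: {
        _id: { type: Schema.Types.ObjectId, ref: 'Author' },
        firstName: String, // duplicated from Author, updated on rename
        lastName: String
    }
});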
SchemaA.find({})
    .populate({
        path: "fieldB",
        populate: { path: "fieldC" }
    }).exec(function (err, result) {
        // this is how you can get all key-value pairs of SchemaA, SchemaB and SchemaC
        // example: result.fieldB.fieldC._id (key of SchemaC)
    });
Why not add a ref to SchemaC on SchemaA? There will be no way to bridge to SchemaC from SchemaA if there is no SchemaB the way you currently have it, unless you populate SchemaB with no other data than a ref to SchemaC.
As explained in the docs under Field Selection, you can restrict what fields are returned.
.populate('fieldB') becomes .populate('fieldB', 'fieldC -_id'). The -_id is required to omit the _id field, just like when using select().
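Putting that together with the question's schemas, a minimal sketch:
SchemaA.find({})
    .populate('fieldB', 'fieldC -_id') // only fieldC comes back for SchemaB
    .exec(function (err, result) {
        // each result's fieldB document now contains only its fieldC reference
    });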
I think this is not possible, because when a document in A refers to a document in B, and that document refers to another document in C, how can the document in A know which document in C to refer to without any help from B?
