I have a problem querying a CosmosDB document that contains a dictionary. This is an example document:
{
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": [
"f32d80d9-e93a-687e-97f5-676516649420",
"6a5eb9fa-c961-93a5-38cc-ecd74ada13ac",
"c90e9986-5aea-b552-e532-cd64a250ad10",
"7d4bfdca-547a-949b-ccb3-bbf0d6e5d727",
"fba51bfe-6a5e-7f25-e58a-7b0ced59b5d8",
"f2caac36-3590-020f-ebb7-5ccd04b4412c",
"1b446af7-ba74-3564-7237-05024c816a02",
"7ef3d931-131e-a639-10d4-f4dd5db834ca"
]
},
"id": "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624"
}
I need to get all documents where a provided GUID is in the list, i.e. in the dictionary value (I don't know the dictionary key). I found information somewhere here that it is not possible to iterate through the keys of a dictionary in CosmosDB (maybe that has changed since then, but I didn't find anything in the documentation), but maybe someone will have an idea. I cannot change the shape of the document.
I tried to do it in LINQ, but I didn't get any results.
var query = _documentClient
.CreateDocumentQuery<Dto>(DocumentCollectionUri())
.Where(d => d.SiteAndDevices.Any(x => x.Value.Contains("f32d80d9-e93a-687e-97f5-676516649420")))
.AsDocumentQuery();
Not sure about the LINQ query, but with SQL you'd need something like this:
SELECT * FROM c
where array_contains(c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a'],"f32d80d9-e93a-687e-97f5-676516649420")
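If the dictionary key genuinely isn't known up front, one possible workaround (an assumption on my part that registering a user-defined function is acceptable here; note that UDFs can't use the index, so this forces a full scan) is a small JavaScript UDF that walks every key:
function containsDevice(dict, deviceId) {
    // Walk every key of the dictionary; true if any value array holds the id.
    if (!dict) return false;
    for (var key in dict) {
        var list = dict[key];
        if (Array.isArray(list) && list.indexOf(deviceId) !== -1) {
            return true;
        }
    }
    return false;
}
After registering it (here as containsDevice), the query no longer needs the key:
SELECT * FROM c
WHERE udf.containsDevice(c.siteAndDevices, "f32d80d9-e93a-687e-97f5-676516649420")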
This is a strange document format though, as you've named your key with an id:
"siteAndDevices": {
"4cf0af44-6233-402a-b33a-e7e35dbbee6a": ["..."]
}
Your key is "4cf0af44-6233-402a-b33a-e7e35dbbee6a", which forces you to use a different syntax to reference it:
c.siteAndDevices['4cf0af44-6233-402a-b33a-e7e35dbbee6a']
You'd save yourself a lot of trouble by refactoring this to something like:
{
"id": "dictionary1",
"siteAndDevices": {
"deviceId": "4cf0af44-6233-402a-b33a-e7e35dbbee6a",
"deviceValues": ["..."]
}
}
You can refactor further, such as using an array to contain multiple device id + value combos.
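A sketch of that array form, reusing the deviceId / deviceValues names from above; the payoff is that a JOIN plus ARRAY_CONTAINS no longer needs to know any key:
{
    "id": "f9ef9fb6-4b70-7d3f-2bc8-c3d335018624",
    "siteAndDevices": [
        {
            "deviceId": "4cf0af44-6233-402a-b33a-e7e35dbbee6a",
            "deviceValues": [ "f32d80d9-e93a-687e-97f5-676516649420", "..." ]
        }
    ]
}

SELECT VALUE c FROM c
JOIN s IN c.siteAndDevices
WHERE ARRAY_CONTAINS(s.deviceValues, "f32d80d9-e93a-687e-97f5-676516649420")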
Following is my user schema in the user.js model -
var userSchema = new mongoose.Schema({
local: {
name: { type: String },
email : { type: String, require: true, unique: true },
password: { type: String, require:true },
},
facebook: {
id : { type: String },
token : { type: String },
email : { type: String },
name : { type: String }
}
});
var User = mongoose.model('User',userSchema);
module.exports = User;
This is how I am using it in my controller -
var user = require('./../models/user.js');
This is how I am saving it in the db -
user({'local.email' : req.body.email, 'local.password' : req.body.password}).save(function(err, result){
if(err)
res.send(err);
else {
console.log(result);
req.session.user = result;
res.send({"code":200,"message":"Record inserted successfully"});
}
});
Error -
{"name":"MongoError","code":11000,"err":"insertDocument :: caused by :: 11000 E11000 duplicate key error index: mydb.users.$email_1 dup key: { : null }"}
I checked the db collection and no such duplicate entry exists. Let me know what I am doing wrong.
FYI - req.body.email and req.body.password are fetching values.
I also checked this post but it didn't help: STACK LINK
If I remove it completely, then it inserts the document; otherwise it throws the "Duplicate" error even though I have an entry in local.email.
The error message is saying that there's already a record with null as the email. In other words, you already have a user without an email address.
The relevant documentation for this:
If a document does not have a value for the indexed field in a unique index, the index will store a null value for this document. Because of the unique constraint, MongoDB will only permit one document that lacks the indexed field. If there is more than one document without a value for the indexed field or is missing the indexed field, the index build will fail with a duplicate key error.
You can combine the unique constraint with the sparse index to filter these null values from the unique index and avoid the error.
unique indexes
Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value.
In other words, a sparse index is ok with multiple documents all having null values.
sparse indexes
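For the schema in this question, that combination would look something like this in the mongo shell (a sketch; adjust the collection name to yours):
db.users.createIndex(
    { "local.email": 1 },
    { unique: true, sparse: true }  // documents without local.email are left out of the index
)
One caveat that follows from the quote above: a document that stores an explicit local.email of null still has the field and is still indexed, so omit the field entirely rather than saving null.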
From comments:
Your error says that the key is named mydb.users.$email_1, which makes me suspect that you have an index on both users.email and users.local.email (the former being old and unused at the moment). Removing a field from a Mongoose model doesn't affect the database. Check with db.users.getIndexes() if this is the case, and manually remove the unwanted index with db.users.dropIndex(<name>).
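In the shell, that check looks roughly like this (the index name email_1 is taken from the error message):
use mydb
db.users.getIndexes()          // look for a stale index on { email: 1 }
db.users.dropIndex("email_1")  // drop it; any "local.email" index can stay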
If you are still in your development environment, I would drop the entire db and start over with your new schema.
From the command line
➜ mongo
use dbName;
db.dropDatabase();
exit
I want to explain the answer/solution to this like I am explaining to a 5-year-old, so everyone can understand.
I have an app. I want people to register with their email, password and phone number.
In my MongoDB database, I want to identify people uniquely based on both their phone numbers and email - so this means that both the phone number and the email must be unique for every person.
However, there is a problem: I have realized that everyone has a phone number, but not everyone has an email address.
Those that don't have an email address have promised me that they will have one by next week. But I want them registered anyway - so I tell them to proceed registering their phone numbers and leave the email input field empty.
They do so.
My database NEEDS a unique email address field - but I have a lot of people with 'null' as their email address. So I go to my code and tell my database schema to allow empty/null email address fields, which I will later fill in with unique email addresses when the people who promised to add their emails do so next week.
So it's now a win-win for everyone (but you ;-] ): the people register, I am happy to have their data... and my database is happy because it is being used nicely... but what about you? I have yet to give you the code that made the schema.
Here is the code:
NOTE: The sparse property on email is what tells my database to allow null values that will later be filled with unique values.
var userSchema = new mongoose.Schema({
    local: {
        name: { type: String },
        // sparse keeps documents that omit local.email out of the unique index,
        // so many users may register without one; email is deliberately not required
        email: { type: String, index: true, unique: true, sparse: true },
        password: { type: String, required: true }
    },
    facebook: {
        id: { type: String },
        token: { type: String },
        email: { type: String },
        name: { type: String }
    }
});
var User = mongoose.model('User',userSchema);
module.exports = User;
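With that schema, the one thing to remember at save time is to leave local.email out entirely for people who don't have one yet (a sketch with made-up values, reusing the User model above):
// user with an email: local.email takes part in the unique index
new User({ local: { name: 'Ann', email: 'ann@example.com', password: 'secret' } }).save();

// user without an email yet: omit local.email (don't set it to null),
// so the sparse index skips this document entirely
new User({ local: { name: 'Bob', password: 'secret' } }).save();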
I hope I have explained it nicely .
Happy NodeJS coding / hacking!
In this situation, log in to Mongo, find the index that you are not using anymore (in the OP's case 'email'), then select Drop Index.
Check the collection's indexes.
I had this issue due to outdated indexes in the collection, for fields that should now be stored under a different, new path.
Mongoose adds an index when you specify a field as unique.
Well, basically this error is saying that you have a unique index on a particular field, for example "email_address", so MongoDB expects a unique email address value for each document in the collection.
So let's say that earlier in your schema the unique index was not defined, and you signed up 2 users with the same email address or with no email address (null value).
Later, you saw that this was a mistake, so you tried to correct it by adding a unique index to the schema. But your collection already has duplicates, so the error message says that you can't insert a duplicate value again.
You essentially have three options:
Drop the collection
db.users.drop();
Find the document which has that value and delete it. Let's say the value was null; you can delete it using:
db.users.remove({ email_address: null });
Drop the Unique index:
db.users.dropIndex(indexName)
I hope this helped :)
Edit: This solution still works in 2023 and you don't need to drop your collection or lose any data.
Here's how I solved the same issue in September 2020. There is a super-fast and easy way from MongoDB Atlas (cloud and desktop). Probably it was not that easy before? That is why I feel like I should write this answer in 2020.
First of all, I read some suggestions above about changing the "unique" field on the Mongoose schema. If you hit this error I assume you already changed your schema, but despite that you got a 500 as your response, and notice this: it specifies a duplicated KEY! If the problem were caused by schema configuration (and assuming you have configured decent middleware to log Mongo errors), the response would be a 400.
Why this happens (at least the main reason)
Why is that? In my case it was simple: that field on the schema used to accept only unique values, but I had just changed it to accept repeated values. MongoDB creates indexes for fields with unique values in order to retrieve the data faster, so in the past Mongo had created that index for the field, and even after setting the "unique" property to "false" on the schema, MongoDB was still using that index and treating the field as if it had to be unique.
How to solve it
Drop that index. You can do it in 2 seconds from Mongo Atlas, or by executing it as a command in the mongo shell. For the sake of simplicity I will show the first one, for users that are not using the mongo shell.
Go to your collection. By default you are on the "Find" tab. Just select the next one to the right: "Indexes". You will see that there is still an index for the field that is causing you trouble. Just click the "Drop Index" button. Done.
So don't drop your database every time this happens
I believe this is a better option than just dropping your entire database or even your collection. Basically, this is why dropping the entire collection works: Mongo will not set an index for that field if your first entry uses your new schema with "unique: false".
I faced similar issues.
I just cleared the indexes of the particular fields, and then it worked for me.
https://docs.mongodb.com/v3.2/reference/method/db.collection.dropIndexes/
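For the collection in this question, that would be the following (a sketch; it drops every index except _id, and Mongoose with the default autoIndex setting will recreate the ones your current schema declares when the app restarts):
db.users.dropIndexes()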
This is my relevant experience:
In the 'User' schema, I set 'name' as a unique key and then ran some code, which I think set up the database structure.
Then I changed the unique key to 'username', and no longer passed a 'name' value when I saved data to the database. So MongoDB automatically set the 'name' value of each new record to null, which is a duplicate key. I tried setting 'name' as a non-unique key with {name: {unique: false, type: String}} in the 'User' schema to override the original setting. However, it did not work.
In the end, I made my own solution:
Just set a random value, one that is unlikely to be duplicated, for the 'name' key when you save your data record. The simple expression '' + Math.random() + Math.random() makes a random string.
I had the same issue. I tried debugging in different ways but couldn't figure it out. I tried dropping the collection, and it worked fine after that. This is not a good solution if your collection has many documents, but if you are in the early stages of development, try dropping the collection.
db.users.drop();
I have solved my problem this way.
Just go to your MongoDB account -> Atlas collections, then drop the collection. Or go to MongoDB Compass and drop the collection there.
This sometimes happens when you have saved something null in the database.
This is because there is already a collection with the same name and its old configuration. Just remove the collection from your MongoDB through the mongo shell and try again:
db.collectionName.drop()
Now run your application; it should work.
I had a similar problem, and I realized that by default Mongoose only supports one schema per collection. Either store your new schema in a different collection, or delete the existing documents with the incompatible schema from your current collection, or find a way to have more than one schema per collection.
I got this same issue when I had the following configuration in my config/models.js
module.exports.models = {
connection: 'mongodb',
migrate: 'alter'
}
Changing migrate from 'alter' to 'safe' fixed it for me.
module.exports.models = {
connection: 'mongodb',
migrate: 'safe'
}
Same issue here after removing properties from a schema after first building some indexes on save. Removing a property from the schema leads to a null value for the now non-existent property, which still has an index. Dropping the index, or starting with a new collection from scratch, helps here.
Note: the error message will lead you there in this case. It contains a path that does not exist anymore. In my case the old path was ...$uuid_1 (this is an index!), but the new one is ....*priv.uuid_1
I have also faced this issue, and I solved it.
This error shows that an email is already present there. So you just need to remove this line from your model for the email attribute:
unique: true
It is possible that even this won't work. In that case, just delete the collection from your MongoDB and restart your server.
It's not a big issue, but beginner-level developers like me wonder what kind of error this is and end up wasting a huge amount of time solving it.
Actually, if you delete the db, create the db once again, and then try to create the collection, it will work properly.
➜ mongo
use dbName;
db.dropDatabase();
exit
Drop your database, then it will work.
You can perform the following steps to drop your database
step 1 : Go to mongodb installation directory, default dir is "C:\Program Files\MongoDB\Server\4.2\bin"
step 2 : Start mongod.exe directly or using command prompt and minimize it.
step 3 : Start mongo.exe directly or using command prompt and run the following command
i) use yourDatabaseName (use show databases if you don't remember database name)
ii) db.dropDatabase()
This will remove your database.
Now you can insert your data; it won't show the error. It will automatically add the database and collection.
I had the same issue when I tried to modify a schema defined using Mongoose. I think the issue is due to the fact that some underlying work is done when creating a collection, like building the indexes, which is hidden from the user (at least in my case). So the best solution I found was to drop the entire collection and start again.
If you are in the early stages of development: drop the collection. Otherwise: add the following to each attribute that gives you the error (note: my English is not good, but I will try to explain it):
index:true,
unique:true,
sparse:true
In my case, I had just forgotten to return res.status(400) after finding that a user with req.email already exists.
Go to your database, click on that particular collection, and delete all the indexes except _id.
I have one collection, called "games", whose documents store the ids of the owners of games.
{
"owner" : "88878b6c25c167d"
}
{
"owner" : "88878b6c25c167d"
}
{
"owner" : "af565f77f73469b"
}
Then I have another collection called "users".
{
"id" : "af565f77f73469b"
}
{
"id" : "a881335e1d4cf17"
}
{
"id" : "aa3ce3f7767c46b"
}
{
"id" : "d19e52c0bd78bcb"
}
{
"id" : "88878b6c25c167d"
}
So the first query I do retrieves the owners of all the games and stores those values in an array:
['88878b6c25c167d', '88878b6c25c167d', 'af565f77f73469b']
The second query I want to perform should retrieve the documents of the users with the corresponding IDs. Previously I had used this query:
db.users.find({'id':
{'$in': [
'88878b6c25c167d',
'88878b6c25c167d',
'af565f77f73469b'
]}})
Here's the problem with this: it does not return duplicate documents if a user is listed twice, as above. Instead I just get one. This breaks my application. How can I make sure that each owner returns a document?
MongoDB works perfectly fine --- it finds all users whose ids are contained in the array.
I don't know the broader context of your needs (maybe tell us what you want to achieve -- not what is wrong?), but if you want an association between games and users, something like the following may be suitable:
1. After retrieving the collection of games, create an auxiliary hash map (a normal JS object) that, for a given owner id, returns the array of that owner's games (a sketch follows below).
2. Retrieve the info about the users who are owners.
3. If you want to know which games belong to a particular user, just pass her id to the data structure from 1. and get the array of games.
Is this what you were looking for? Do you need help with 1.?
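In case it helps with 1., here is a sketch in plain JavaScript (games and gamesByOwner are made-up names; games holds the result of your first query):
// group the game documents by their owner id
var gamesByOwner = {};
games.forEach(function (game) {
    if (!gamesByOwner[game.owner]) gamesByOwner[game.owner] = [];
    gamesByOwner[game.owner].push(game);
});

// deduplicated owner ids for the second query's $in
var ownerIds = Object.keys(gamesByOwner);

// after fetching users with {'id': {'$in': ownerIds}}, each user's games
// are simply gamesByOwner[user.id]; no duplicate user documents are needed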
Say I have a doc to save in CouchDB, and the doc looks like this:
{
"email": "lorem#gmail.com",
"name": "lorem",
"id": "lorem",
"password": "sha1$bc5c595c$1$d0e9fa434048a5ae1dfd23ea470ef2bb83628ed6"
}
and I want to be able to query the doc either by 'id' or by 'email'. So when I save this as a view, I write:
db.save('_design/users', {
byId: {
map: function(doc) {
if (doc.id && doc.email) {
emit(doc.id, doc);
emit(doc.email, doc);
}
}
}
});
And then I could query like this:
db.view('users/byId', {
key: key
}, function(err, data) {
if (err || data.length === 0) return def.reject(new Error('not found'));
data = data[0] || {};
data = data.value || {};
self.attrs = _.clone(data);
delete self.attrs._rev;
delete self.attrs._id;
def.resolve(data);
});
And it works just fine. I can load the data either by id or by email. But I'm not sure if I should do it this way.
I have another solution, which is saving two different views, byId and byEmail, but that way I save the same doc twice and it will obviously cost database space.
Not sure which solution is better.
The canonical solution would be to have two views, one by email and one by id. To not waste space for the document, you can just emit null as the value and then use the include_docs=true query parameter when you query the view.
Also, you might want to use _id instead of id. That way, CouchDB ensures that the ID will be unique and you don't have to use a view to look up documents.
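In the style of the code in the question, the two views with null values might look like this (a sketch; it assumes your client library passes the include_docs option straight through to CouchDB):
db.save('_design/users', {
    byId: {
        map: function(doc) {
            if (doc._id) emit(doc._id, null); // null value: no copy of the doc in the index
        }
    },
    byEmail: {
        map: function(doc) {
            if (doc.email) emit(doc.email, null);
        }
    }
});

db.view('users/byEmail', { key: email, include_docs: true }, function(err, data) {
    // each row now carries the full document under row.doc
});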
I'd change to the two separate views. That's explicit and clear. When you emit the same doc twice in a single view - by an id and by an e-mail - you're effectively combining the 2 views into one. You may think of it as a search tree with 2 root branches. I don't see any reason for doing that, and would suggest leaving the data access and storage optimization job to the database.
Combining the views may also yield tricky bugs when, for some reason, you confuse an id and an e-mail.
There is absolutely nothing wrong with emitting the same document multiple times with a different key. It's about what makes most sense for your application.
If id and email are always valid and interchangeable ways to identify a user then a single view is perfect. For example, when id is some sort of unique account reference and users are allowed to use that or their (more memorable) email address to login.
However, if you need to differentiate between the two values, e.g. id is only meant for application administrators, then separate views are probably better. (You could probably use a complex key instead ... but that's another answer.)
I'm new to key-value stores and I need your recommendation. We're working on a system that manages documents and their revisions, a bit like a wiki does. We're thinking about saving this data in a key-value store.
Please don't just recommend the database you prefer; we want to wrap it so we can use many different key-value databases. We're using node.js so we can easily work with JSON.
My question is: what should the structure of the database look like? We have metadata for each document (timestamp, lasttext, id, latestrevision) and we have data for each revision (the change, the author, timestamp, etc...). So which key/value structure do you recommend?
thx
Cribbed from the MongoDB groups. It is somewhat specific to MongoDB; however, it is pretty generic.
Most of these history implementations break down to two common strategies.
Strategy 1: embed history
In theory, you can embed the history of a document inside of the document itself. This can even be done atomically.
> db.docs.save( { _id : 1, text : "Original Text" } )
> var doc = db.docs.findOne()
> db.docs.update( {_id: doc._id}, { $set : { text : 'New Text' }, $push : { hist : doc.text } } )
> db.docs.find()
{ "_id" : 1, "hist" : [ "Original Text" ], "text" : "New Text" }
Strategy 2: write history to separate collection
> db.docs.save( { _id : 1, text : "Original Text" } )
> var doc = db.docs.findOne()
> db.docs_hist.insert ( { orig_id : doc._id, ts : Math.round((new Date()).getTime() / 1000), data : doc } )
> db.docs.update( {_id:doc._id}, { $set : { text : 'New Text' } } )
Here you'll see that I do two writes: one to the master collection and one to the history collection.
To get fast history lookup, just grab the original ID:
> db.docs_hist.ensureIndex( { orig_id : 1, ts : 1 })
> db.docs_hist.find( { orig_id : 1 } ).sort( { ts : -1 } )
Both strategies can be enhanced by only displaying diffs
You could hybridize by adding a link from history collection to original collection
What's the best way of saving a document with revisions in a key-value store?
It's hard to say there is a "best way". There are obviously some trade-offs being made here.
Embedding:
atomic changes on a single doc
can result in large documents, may break the reasonable size limits
probably have to enhance code to avoid returning full hist when not necessary
Separate collection:
easier to write queries
not atomic, needs two operations (do you have transactions?)
more storage space (extra indexes on original docs)
I'd keep a hierarchy of the real data under each document with the revision data attached, for instance:
{
    "revisions": [
        {
            "timestamp": "2011040711350621",
            "data": { ... the real data here .... }
        },
        {
            "timestamp": "2011040711350716",
            "data": { ... the real data here .... }
        }
    ]
}
Then use the push operation to add new versions to the array (named "revisions" here so the JSON above is well-formed), and periodically remove the old versions. You can use the last (or first) element to get the latest copy at any given time.
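In MongoDB, the push-and-trim can even be a single operation: $push with $each and $slice caps the array. A sketch, assuming the "revisions" array from the example above and keeping only the 10 most recent versions:
db.docs.update(
    { _id: 1 },
    { $push: { revisions: {
        $each:  [ { timestamp: "2011040711350816", data: { /* the real data */ } } ],
        $slice: -10   // keep only the last 10 entries
    } } }
)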
I think there are multiple approaches, and this question is old, but I'll give my two cents as I was working on this earlier this year. I have been using MongoDB.
In my case, I had a User account that had Profiles on different social networks. We wanted to track changes to social network profiles, and we wanted revisions of them, so we created two structures to test out. Both methods had a User object that pointed to foreign objects. We did not want to embed objects from the get-go.
A User looked something like:
User {
"tags" : [Tags]
"notes" : "Notes"
"facebook_profile" : <combo_foreign_key>
"linkedin_profile" : <same as above>
}
and then, for the combo_foreign_key, we used this pattern (using Ruby interpolation syntax for simplicity):
combo_foreign_key = "#{User.key}__#{new_profile.last_updated_at}"
facebook_profiles {
combo_foreign_key: facebook_profile
... and you keep adding your foreign objects in this pattern
}
This gave us O(1) lookup of the latest FacebookProfile of a User, but required us to keep the latest FK stored in the User object. If we wanted all of the FacebookProfiles, we would then ask for all keys in the facebook_profiles collection with the prefix "#{User.key}__", and this was O(N)...
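A JavaScript sketch of that first pattern (all names are made up for illustration):
// build the combo key: "<user key>__<profile timestamp>"
var comboKey = user.key + '__' + newProfile.lastUpdatedAt;
facebookProfiles[comboKey] = newProfile;  // store the new revision
user.facebook_profile = comboKey;         // remember the latest key on the User

// O(1): the latest profile comes straight from the stored key
var latest = facebookProfiles[user.facebook_profile];

// O(N): every revision needs a prefix scan over the keys
var allRevisions = Object.keys(facebookProfiles)
    .filter(function(k) { return k.indexOf(user.key + '__') === 0; })
    .map(function(k) { return facebookProfiles[k]; });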
The second strategy we tried was storing an array of those FacebookProfile keys on the User object so the structure of the User object changed from
"facebook_profile" : <combo_foreign_key>
to
"facebook_profile" : [<combo_foreign_key>]
Here we'd just append the new combo key when we added a new profile variation. Then we'd do a quick sort of the "facebook_profile" attribute and index on the largest one to get our latest profile copy. This method had to sort M strings and then index the FacebookProfile based on the largest item in that sorted list. It's a little slower for grabbing the latest copy, but it gave us the advantage of knowing every version of a User's FacebookProfile in one swoop, and we did not have to worry about ensuring that the foreign key was really the latest profile object.
At first our revision counts were pretty small, and both worked pretty well. I think I prefer the first one over the second now.
I would love input from others on ways they went about solving this issue. The GIT idea suggested in another answer actually sounds really neat to me, and for our use case it would work quite well... Cool.