CouchDB check if a document exists in a validation function - couchdb

I would like to see if a document exists in the database that has the name field "name" set to "a name" before allowing a new document to be added to the database.
Is this possible in CouchDB using update handlers (inside design documents)?

It seems you are looking for a unique constraint in CouchDB. The only unique constraint supported by CouchDB is based on the document ID.
You should include the value of your "name" attribute in the document ID if you want document uniqueness to be based on it.
Validate document update functions defined in design documents can only use the data of the document being created, updated or deleted; they cannot use data from other documents in the database.
You can find a similar question here.
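As a rough sketch of the ID-based approach, the unique value can be folded into the document ID before the document is sent to CouchDB. The helper names docIdForName and buildNamedDoc, the "name:" prefix and the normalisation rules (trim, lower-case, URL-encode) are assumptions made for this illustration, not something CouchDB prescribes.

```javascript
// Hypothetical helper: derive a document _id from the "name" field so that
// CouchDB's built-in uniqueness of _id also enforces uniqueness of the name.
function docIdForName(name) {
  if (typeof name !== "string" || name.trim() === "") {
    throw new Error("name must be a non-empty string");
  }
  // Normalisation is an assumption: with it, "A Name" and " a name " collide.
  return "name:" + encodeURIComponent(name.trim().toLowerCase());
}

// Build the document to PUT. A second PUT with the same _id and no _rev
// would be rejected by CouchDB with 409 Conflict.
function buildNamedDoc(name, fields) {
  return Object.assign({}, fields, { _id: docIdForName(name), name: name });
}
```

With this in place, the existence check becomes a side effect of the write itself: the insert either succeeds or conflicts.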

This is not widely known, but an _update endpoint is allowed to return a doc whose _id property differs from the one requested. In your case, this means you need a unique document, say _id:"doc-name", which will serve as the constraint.
Then you call something like POST _design/whatever/_update/saveDependentDoc/doc-name, providing the new doc, with a different _id, as the request body.
Your _update function will effectively receive two docs as input (or null and newDoc if the constraint doc is missing). The function then decides what to do: return the received doc to persist it, or return nothing.
This solution isn't a full answer to your question, but it might be helpful in some cases.
Of course, for updating existing docs this trick only works if you know the revision.
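A minimal sketch of such an update function follows. In a design document it would be stored as an anonymous function under "updates"; it is named here only so it can be called directly. The policy (persist the new doc only when the constraint doc exists) and the response bodies are assumptions for illustration.

```javascript
// CouchDB calls an update function with the document addressed by the URL
// (or null if it does not exist) and the request object.
function saveDependentDoc(constraintDoc, req) {
  // The new document travels in the request body and carries its own _id.
  var newDoc = JSON.parse(req.body);

  // Assumed policy: only persist newDoc if the constraint doc exists.
  if (constraintDoc === null) {
    // Returning null as the first element tells CouchDB to save nothing.
    return [null, { code: 409, body: "constraint document not found" }];
  }
  // Returning newDoc persists it under newDoc._id, not under "doc-name".
  return [newDoc, { code: 201, body: "saved " + newDoc._id }];
}
```

Inverting the condition gives the opposite policy, where the new doc is only saved while no constraint doc exists.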

Related

How can I retrieve the id of a document I added to a Cosmosdb collection?

I have a single collection into which I am inserting documents of different types. I use the type parameter to distinguish between different datatypes in the collection. When I am inserting a document, I have created an Id field for every document, but Cosmosdb has a built-in id field.
How can I insert a new document and retrieve the id of the created Document all in one query?
The CreateDocumentAsync method returns the created document so you should be able to get the document id.
Document created = await client.CreateDocumentAsync(collectionLink, order);
I think you just need to call the .getResource() method to get the created document object.
Please refer to the java code:
DocumentClient documentClient = new DocumentClient(END_POINT,
        MASTER_KEY, ConnectionPolicy.GetDefault(),
        ConsistencyLevel.Session);
Document document = new Document();
document.set("name", "aaa");
document = documentClient.createDocument("dbs/db/colls/coll", document, null, false).getResource();
System.out.println(document.toString());
// then do your business logic with the document...
C# code:
Parent p = new Parent
{
    FamilyName = "Andersen.1",
    FirstName = "Andersen",
};
Document doc = client.CreateDocumentAsync("dbs/db/colls/coll", p, null).Result.Resource;
Console.WriteLine(doc);
Hope it helps you.
Sure, you can always fetch the id from the creation method's response in your favorite API, as already shown in the other answers. You may have reasons to delegate key assignment to DocumentDB, but to be frank, I don't see any good ones.
If the inserted document has no id set, DocumentDB generates a GUID for you. There is no notable difference compared to simply generating a new GUID yourself and assigning it to the id field before saving. Assigning the identity yourself lets you simplify your code a bit and also lets you use the identity not only after persisting but also BEFORE, which could simplify a lot of scenarios you have now or may run into in the future.
Also, note that you don't have to use GUIDs as the id and could use any unique value you already have. Since you mentioned you have an Id field (which, by its name, I assume to be a primary key), you should consider reusing it instead of introducing another set of keys.
A self-assigned non-GUID key is usually a better choice, since it can be designed to match your data and application needs better than a GUID. For example, in addition to being unique, it may also be a natural key, narrower, human-readable, ordered, etc.
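To make the self-assignment idea concrete, here is a small sketch that fills in the id before the document is handed to any client library. The function name withSelfAssignedId and the "type:Id" key layout are assumptions for illustration; only the "type" and "Id" fields come from the question.

```javascript
const { randomUUID } = require("crypto");

// Assign the id before saving, so the caller knows it up front.
// Assumption: documents carry "type" and, optionally, an existing "Id" field.
function withSelfAssignedId(doc) {
  const hasOwnKey = doc.Id !== undefined && doc.Id !== null;
  // Prefer the existing key, prefixed with the type because several types
  // share one collection; otherwise fall back to a locally generated GUID.
  const id = hasOwnKey ? doc.type + ":" + String(doc.Id) : randomUUID();
  return Object.assign({}, doc, { id: id });
}
```

Because the id is known before the write, it can already be used in logs, in related documents or in a response to the user.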

How to check for duplication before creating a new document in CouchDB/Cloudant?

We want to check if a document already exists in the database with the same fields and values of a new object we are trying to save to prevent duplicated item.
Note: This question is not about updating documents or about duplicated document IDs, we only check the data to prevent saving a new document with the same data of an existing one.
Preferably we'd like to accomplish this with Mango/Cloudant queries and not rely on views.
The idea so far is:
1) Scan the data that we are trying to save and dynamically create a selector that matches that document's structure. (We can't hardcode the selectors because we have many types of documents.)
2) Query the DB for any documents matching that selector to see whether a document that matches those criteria already exists.
However I wonder about the performance of this approach since many of the selector fields will not be indexed.
I would also much rather follow best practices than create something out of the blue, but I haven't been able to find any known solutions for this specific scenario.
If you happen to know of any, please share.
Option 1 - Define a meaningful ID for your documents
The ID could be a logical composition of, or a hash computed from, the values that should be unique.
If you want to check whether a document ID already exists, you can use the HEAD method
HEAD /db/docId
which returns 200 OK if the docId exists in the database.
If you would like to check whether the new document has the same content as the previous one, you can use a Validate Document Update Function, which lets you compare both documents.
function(newDoc, oldDoc, userCtx, secObj) {
...
}
Option 2 - Use content hash computed outside CouchDB
Before creating or updating a document, compute a hash from the values of the attributes that should be unique.
Include the hash in the document as a new attribute, e.g. "key_hash".
Create a Mango index on the "key_hash" attribute.
When a new doc is to be inserted, compute its hash and use a Mango query to look for documents with the same hash value before inserting the doc.
Option 3 - Compute hash in a View
Define a view which emits the computed hash of each document as the key.
CouchDB's JavaScript support does not include hashing functions, so this could be difficult to include in a design document.
Use Erlang to define the map function, where you have access to Erlang's support for hashing.
Before creating a new document, query the view using the hash, which you need to compute beforehand.
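For Option 2, the request bodies might look roughly like the sketch below: one for POST /db/_index and one for POST /db/_find. The index name "key-hash-index" and the helper names are hypothetical; "key_hash" is the attribute suggested above, and the hash value is assumed to be computed elsewhere.

```javascript
// Body for POST /db/_index: a JSON (Mango) index on the hash attribute.
function keyHashIndexBody() {
  return { index: { fields: ["key_hash"] }, name: "key-hash-index", type: "json" };
}

// Body for POST /db/_find: ask for at most one document with the same hash.
function duplicateQueryBody(keyHash) {
  return { selector: { key_hash: keyHash }, fields: ["_id"], limit: 1 };
}

// A _find response carries the matches in its "docs" array.
function isDuplicate(findResponse) {
  return Array.isArray(findResponse.docs) && findResponse.docs.length > 0;
}
```

Note that a check-then-insert sequence like this is not atomic: two clients can both see no match and both insert.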
One solution would be to take Juanjo's and Alexis's comment one step further.
Select the keys you wish to keep unique
Put the values in a string and generate a hash
Set the document's _id to that hash
PUT the document to the database.
Check the response for failure.
If another document with the same _id value already exists in the database, the PUT request will fail.
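The hashing steps above can be sketched as follows, using Node's built-in crypto module outside CouchDB. The function names, the choice of SHA-256 and the way values are serialised are assumptions; any stable serialisation and hash would serve.

```javascript
const crypto = require("crypto");

// Compute a hex digest over the values of the keys that must be unique.
function hashedId(doc, uniqueKeys) {
  // Sort the key names so the result does not depend on the order in which
  // they are listed. Nested objects are serialised as-is, so their own
  // property order still matters.
  const parts = uniqueKeys.slice().sort().map(function (key) {
    return [key, doc[key]];
  });
  return crypto.createHash("sha256").update(JSON.stringify(parts)).digest("hex");
}

// Return a copy of the document whose _id is the content hash.
function withHashedId(doc, uniqueKeys) {
  return Object.assign({}, doc, { _id: hashedId(doc, uniqueKeys) });
}
```

Two documents with equal values for the chosen keys end up with the same _id, so the second PUT is rejected by the database itself.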

CouchDB and Couchbase Document Keys

In reference material for CouchDB and Couchbase it's common guidance to store the type of a document as a parameter within the actual document.
I've got a database, where I have different documents that record certain behaviour by URL. So naturally, I use the URL as the id of the document.
The problem I find is that by using just the key as the document id, I now get clashes between documents of different types. So I have started using the type as the first part of the key like this:
{ doc._id: "rss_entry|http://www.spiegel.de/1234", [...] }
{ doc._id: "page_text|http://www.spiegel.de/1234", [...] }
Now I start to wonder why I've never seen this approach to model type in any of the documentation.
Prefixes are commonly used. In addition to supporting scenarios such as yours, prefixing allows one to perform logical range queries against views. This technique is used in the modeling examples, but perhaps the concept is not described in as much detail as you are expecting. In the section http://docs.couchbase.com/couchbase-devguide-2.5/#modeling-documents, the documents are keyed as beer_NNNN and brewery_NNNN. Also, the section http://docs.couchbase.com/couchbase-devguide-2.5/#using-reference-documents-for-lookups goes a bit deeper into this technique. There is a counter document named user::count and then each user is keyed as user::NNNN. Additionally, there are documents in the example that are keyed as fb::NNNN for a Facebook ID, email::XXX@YYYY.com for a user's email address, etc.
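As an illustration of the range-query point, the sketch below builds prefixed IDs in the asker's "type|url" layout and the matching startkey/endkey pair for one type. The helper names are hypothetical, and "\ufff0" is used as a conventional high sentinel character, which is an assumption about the keys never containing it.

```javascript
// Build an ID such as "rss_entry|http://www.spiegel.de/1234".
function typedId(type, url) {
  return type + "|" + url;
}

// Key range covering every ID of one type, for use as startkey/endkey
// query parameters on a view or on _all_docs.
function typeRange(type) {
  return { startkey: type + "|", endkey: type + "|\ufff0" };
}
```

For example, typeRange("rss_entry") describes a range that includes the rss_entry document above and excludes the page_text one.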

How does `mongoose` handle adding documents that have FIELDS that are __NOT__ part of the schema?

I'm playing around with quick start guide for mongoose.
http://mongoosejs.com/docs/index.html
I assumed that it would throw an error when I saved a document with a field NOT defined in the schema. Instead, it created a new document in the collection but without the field. (Note: I realize mongodb itself is "schema-less" so each document in a collection can be completely different from each other.)
Two questions:
How does mongoose handle adding documents that have fields that are NOT part of the schema? It seems like it just ignores them and, if none of the fields map, creates an empty document with just an ObjectId.
And how do you get mongoose to warn you if a specific field of a document hasn't been added even though the document successfully saved?
(The question is - I believe - simple enough, so I didn't add code, but I definitely will if someone requests.)
Thanks.
Q: How does mongoose handle adding documents that have fields that are NOT part of the schema?
The strict option, (enabled by default), ensures that values passed to our model constructor that were not specified in our schema do not get saved to the db.
- mongoose docs
Q: How do you get mongoose to warn you if a specific field of a document hasn't been added even though the document successfully saved?
The strict option may also be set to "throw" which will cause errors to be produced instead of dropping the bad data.
- mongoose docs
...but if you absolutely require saving keys that aren't in the schema, then you have to handle this yourself. Two approaches I can think of are:
1. To save keys that aren't in the schema, you could set strict to false on a specific model instance or on a specific update. Then, you'd need to write some validation that (a) the values in the document conformed to your standards and (b) the document saved in the database matched the document you sent over.
2. You could see if the Mixed schema type could serve your needs instead of disabling the validations that come with strict. (Scroll down to 'usage notes' on that link, as the link to the 'Mixed' documentation seems broken for the moment.)
Mongoose lets you add "validator" and "pre" middleware that perform useful functions. For instance, you could specify the required attribute in your schema to indicate that a specific property must be set. You could also specify a validator that you can craft to throw an error if the associated property doesn't meet your specifications. You can also set up a Mongoose "pre" validator that examines the document and throws an Error if it finds fields that are outside of your schema. By having your middleware call next() (or not), you can control whether you proceed to the document save (or not).
This question/response on stackoverflow can help with figuring out whether or not an object has a property.
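The "examine the document and throw if it has fields outside the schema" idea can be sketched in plain JavaScript. This is a stand-in and not Mongoose's API: unknownFields and assertNoUnknownFields are hypothetical helpers, and the list of allowed paths is assumed to be supplied by the caller.

```javascript
// List the top-level keys of an input object that are missing from a list
// of allowed schema paths.
function unknownFields(input, allowedPaths) {
  const allowed = new Set(allowedPaths);
  return Object.keys(input).filter(function (key) {
    return !allowed.has(key);
  });
}

// Throw instead of silently dropping data, similar in spirit to strict: "throw".
function assertNoUnknownFields(input, allowedPaths) {
  const extra = unknownFields(input, allowedPaths);
  if (extra.length > 0) {
    throw new Error("fields not in schema: " + extra.join(", "));
  }
}
```

A check like this would have to run on the raw input object, before the model constructor has had a chance to drop the extra keys.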

I want absolute atomicity on a single couchdb instance (insert, fail if already existing)

I've come to really love the couchdb style of organizing and updating data, but there are a few situations where I really need to be able to create an entry and determine if an equivalent entry is already in existence before returning to the user. The only situation that this is absolutely necessary for my application is user registration. I'm fine with having all user registration writes go to a particular, designated couchdb instance known as the "registration-instance".
I want to hash the user_id into some _id to use. Then execute a put with this _id, but fail if the _id is already inserted. I need to return to the user that the user name is already reserved, and I cannot detect the conflict later and resolve it at that point, because the user would be under the impression that they had reserved the user name.
I don't see why couchdb couldn't provide some way to do this, under the assumption that you designate that inserts for a particular "type" of document always are routed to a particular instance.
If you send a single CouchDB server a PUT request for a new user document you should get the behavior you want already.
If the document does not exist then it will create the new document.
If the document does exist then it is guaranteed to return a 409 conflict error. This is due to the fact that you did not supply a _rev property because you aren't trying to update the pre-existing document.
Only when the _id and _rev properties match will CouchDB update the existing document.
You might also want to read up on document update handlers:
http://wiki.apache.org/couchdb/Document_Update_Handlers
You might use an update handler to hash the user_id and dynamically assign the appropriate _id. You can also customize what kind of error response couch sends with an update handler.
Good luck!
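A hedged sketch of the registration flow follows, assuming Node 18+ (for the global fetch) and a reachable CouchDB at baseUrl. The function names, the "user:" ID prefix and the document fields are made up for illustration; authentication is omitted.

```javascript
// Map the HTTP status of the PUT to an outcome for the user.
function reservationOutcome(status) {
  if (status === 201 || status === 202) return "reserved";
  if (status === 409) return "taken"; // a document with this _id already exists
  throw new Error("unexpected status " + status);
}

// PUT a new user document without a _rev. CouchDB either creates it or
// answers 409 Conflict, so no separate existence check is needed.
async function reserveUserName(baseUrl, db, userId) {
  const docId = "user:" + userId; // assumed ID scheme
  const url = baseUrl + "/" + encodeURIComponent(db) + "/" + encodeURIComponent(docId);
  const res = await fetch(url, {
    method: "PUT",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ type: "user", user_id: userId }),
  });
  return reservationOutcome(res.status);
}
```

The caller can then tell the user immediately whether the name was reserved or is already taken.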
