Mongoose - flexible field - node.js

I am new to NoSQL and MongoDB. I am building an app with Node.js and Mongoose, and I am designing a Mongoose schema for a new collection.
The documents of this collection will have some standard fields (id, creation date, user, etc.), but I also need a "data" field whose content differs from document to document. Sometimes the value will be simple text, and other times it will hold many key/value pairs.
I am wondering what the best solution would be for this kind of storage need:
-Create only one "data" field with a String type and then put different types of data into it (text for simple values, stringified objects for more complex data)
-Create in the model all the possible fields that my "more complex data" could have and use only the ones I need in each document
-Something else
What is the best practice for this kind of thing?

Try the Mixed SchemaType; I think that's what you are looking for:
http://mongoosejs.com/docs/schematypes.html

This one:
-Create in the model all the possible fields that my "more complex data" could have and use only the ones I need in each document
Mark the standard fields of the schema as required: true and leave the rest optional. That way you get the flexibility you want without losing the Mongoose benefits of validation, casting, and change detection.

Related

Can we post an array of objects with knex?

I'm using knex with node and I was thinking of making a field in the migration that would take an array of objects. I know simple stuff like
.string("bla")
.notNullable()
and so on, but looking through the documentation I can't seem to find it.
I want something like
.array_of_objects //field name
knex.schema.createTable('users', function (table) {
  table.increments();
  table.string('name');
  table.timestamps();
  // example
  table.array
})
With table.specificType you can create whichever type of database column you like. Check which column types your database supports.
It would be easier to answer the question if you could tell us what kind of SQL queries you are after. To be able to use knex, you need to know SQL. Knex will never abstract SQL away.
If you are looking for a database library for Node.js that abstracts SQL away, you might like to check out Sequelize.
Thanks for the pointer to specificType. From what I saw, PostgreSQL doesn't have an array-of-objects column type, but it does have an array-of-strings type. I ended up splitting my array of objects into two string arrays, one named keys and the other values, and stored both in the database. Then, on a GET request, I joined the two arrays back into an array of objects with a bit of JavaScript. Seems to do the trick.
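The split-and-join workaround described above can be sketched in plain JavaScript. This is only an illustration under assumptions: each object is taken to be a single { key, value } pair with string values, and the helper names splitPairs and joinPairs are hypothetical, not part of knex.

```javascript
// Hypothetical helpers; the { key, value } shape is an assumption about the data.

// Turn an array of { key, value } objects into two parallel string arrays,
// which could then be stored in two string-array columns.
function splitPairs(pairs) {
  return {
    keys: pairs.map((pair) => String(pair.key)),
    values: pairs.map((pair) => String(pair.value)),
  };
}

// Rebuild the array of objects from the two parallel arrays read back from the database.
function joinPairs(keys, values) {
  if (keys.length !== values.length) {
    throw new Error('keys and values must have the same length');
  }
  return keys.map((key, i) => ({ key, value: values[i] }));
}

const stored = splitPairs([
  { key: 'color', value: 'red' },
  { key: 'size', value: 'M' },
]);
const restored = joinPairs(stored.keys, stored.values);
```

Because both arrays are written and read in the same order, the round trip preserves the pairs. Non-string values would come back as strings in this sketch.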

How to check for duplication before creating a new document in CouchDB/Cloudant?

Before saving a new object, we want to check whether the database already contains a document with the same fields and values, to prevent duplicate items.
Note: This question is not about updating documents or about duplicated document IDs, we only check the data to prevent saving a new document with the same data of an existing one.
Preferably we'd like to accomplish this with Mango/Cloudant queries and not rely on views.
The idea so far is:
1) Scan the data that we are trying to save and dynamically create a selector that matches that document's structure. (We can't hardcode the selectors because we have many types of documents.)
2) Query the DB for any documents matching that selector to see whether a document that meets those criteria already exists.
However, I wonder about the performance of this approach, since many of the selector fields will not be indexed.
I would also much rather follow best practices than create something out of the blue, but I haven't been able to find any known solutions for this specific scenario.
If you happen to know of any, please share.
Option 1 - Define a meaningful ID for your documents
The ID could be a logical composition of, or a hash computed from, the values that should be unique.
If you want to check whether a document ID already exists, you can use the HEAD method
HEAD /db/docId
which returns 200 OK if the docId exists in the database.
If you would like to check whether the new document has the same content as the previous one, you may use the Validate Document Update Function, which lets you compare both documents.
function(newDoc, oldDoc, userCtx, secObj) {
...
}
Option 2 - Use content hash computed outside CouchDB
Before creating or updating a document, compute a hash from the values of the attributes that should be unique.
Include the hash in the document as a new attribute, for example "key_hash".
Create a Mango index on the "key_hash" attribute.
When a new document is about to be inserted, compute its hash and use a Mango query to look for documents with the same hash value before inserting it.
Option 3 - Compute hash in a View
Define a view that emits the computed hash of each document as the key.
CouchDB's JavaScript support does not include hashing functions, so this could be difficult to include in a design document.
Use Erlang to define the map function, where you have access to Erlang's hashing support.
Before creating a new document, compute its hash and query the view with that hash.
One solution would be to take Juanjo's and Alexis's comment one step further.
Select the keys you wish to keep unique
Put the values in a string and generate a hash
Set the document's _id to that hash
PUT the document on the database.
Check the response for failure
If another document already exists in the database with the same _id value, the PUT request will fail with a 409 Conflict.
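The hash-as-_id steps above can be sketched with Node's built-in crypto module. The helper name contentId, the choice of SHA-256, and the JSON encoding of the selected values are all assumptions made for illustration; any stable encoding and hash would serve.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derives a deterministic ID from the values that must be unique.
function contentId(doc, uniqueKeys) {
  // Sort the key names so that the same data always produces the same string,
  // regardless of property order. Nested objects are NOT canonicalized here.
  const parts = [...uniqueKeys].sort().map((key) => [key, doc[key]]);
  return crypto.createHash('sha256').update(JSON.stringify(parts)).digest('hex');
}

const a = { type: 'person', name: 'Ada', email: 'ada@example.com', note: 'first' };
const b = { email: 'ada@example.com', name: 'Ada', type: 'person', note: 'second' };

// "note" is deliberately left out of the unique keys.
const docA = { _id: contentId(a, ['type', 'name', 'email']), ...a };
const docB = { _id: contentId(b, ['email', 'type', 'name']), ...b };
// docA._id and docB._id are identical, so saving docB after docA
// would be rejected by the database instead of creating a duplicate.
```

The uniqueness check then costs nothing extra: the database's own ID constraint does the work, and no selector over unindexed fields is needed.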

CouchDB and Couchbase Document Keys

In reference material for CouchDB and Couchbase it's common guidance to store the type of a document as a parameter within the actual document.
I've got a database, where I have different documents that record certain behaviour by URL. So naturally, I use the URL as the id of the document.
The problem I find is that by using just the key as the document id, I now get clashes between documents of different types. So I have started using the type as the first part of the key like this:
{ "_id": "rss_entry|http://www.spiegel.de/1234", [...] }
{ "_id": "page_text|http://www.spiegel.de/1234", [...] }
Now I start to wonder why I've never seen this approach to model type in any of the documentation.
Prefixes are commonly used. In addition to supporting scenarios such as yours, prefixing lets you perform logical range queries against views. The modeling examples use this technique, though perhaps the concept is not described in as much detail as you were expecting. In the section http://docs.couchbase.com/couchbase-devguide-2.5/#modeling-documents, the documents are keyed as beer_NNNN and brewery_NNNN. The section http://docs.couchbase.com/couchbase-devguide-2.5/#using-reference-documents-for-lookups goes a bit deeper into this technique: there is a counter document named user::count, and each user is keyed as user::NNNN. Additionally, there are documents in the example that are keyed as fb::NNNN for a Facebook ID, email::XXX#YYYY.com for a user's email address, etc.
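As a rough sketch of how type prefixes enable range queries, the helpers below build a prefixed ID and a start/end key pair covering every ID of one type. The function names are hypothetical, the "|" separator follows the question's examples, and "\ufff0" is the high code point conventionally used as an upper bound in CouchDB key ranges.

```javascript
// Hypothetical helpers, not part of any CouchDB or Couchbase client library.

// Build a document ID of the form "<type>|<natural key>".
function typedId(type, naturalKey) {
  return `${type}|${naturalKey}`;
}

// Key range that covers every ID starting with "<type>|".
function typeRange(type) {
  return { startkey: `${type}|`, endkey: `${type}|\ufff0` };
}

const rssId = typedId('rss_entry', 'http://www.spiegel.de/1234');
const pageId = typedId('page_text', 'http://www.spiegel.de/1234');
const range = typeRange('rss_entry');

// Simple string comparison, used here only to illustrate which IDs fall inside the range.
const inRange = (id) => id >= range.startkey && id <= range.endkey;
```

The startkey and endkey values would typically be passed, JSON-encoded, as query parameters when reading a view or the list of all documents.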

How does `mongoose` handle adding documents that have FIELDS that are __NOT__ part of the schema?

I'm playing around with the quick start guide for Mongoose.
http://mongoosejs.com/docs/index.html
I assumed that it would throw an error when I saved a document with a field NOT defined in the schema. Instead, it created a new document in the collection but without the field. (Note: I realize mongodb itself is "schema-less" so each document in a collection can be completely different from each other.)
Two questions:
How does mongoose handle adding documents that have fields that are NOT part of the schema? It seems like it just ignores them and, if none of the fields map, creates an empty document with just an ObjectId.
And how do you get mongoose to warn you if a specific field of a document hasn't been added even though the document successfully saved?
(The question is - I believe - simple enough, so I didn't add code, but I definitely will if someone requests.)
Thanks.
Q: How does mongoose handle adding documents that have fields that are NOT part of the schema?
The strict option, (enabled by default), ensures that values passed to our model constructor that were not specified in our schema do not get saved to the db.
- mongoose docs
Q: How do you get mongoose to warn you if a specific field of a document hasn't been added even though the document successfully saved?
The strict option may also be set to "throw" which will cause errors to be produced instead of dropping the bad data.
- mongoose docs
...but if you absolutely require saving keys that aren't in the schema, then you have to handle this yourself. Two approaches I can think of are:
1. To save keys that aren't in the schema, you could set strict to false on a specific model instance or on a specific update. Then, you'd need to write some validation that (a) the values in the document conformed to your standards and (b) the document saved in the database matched the document you sent over.
2. You could see if the Mixed schema type could serve your needs instead of disabling the validations that come with strict. (Scroll down to 'usage notes' on that link, as the link to the 'Mixed' documentation seems broken for the moment.)
Mongoose lets you add "validator" and "pre" middleware that perform useful functions. For instance, you could specify the required attribute in your schema to indicate that a specific property must be set. You could also specify a validator that you can craft to throw an error if the associated property doesn't meet your specifications. You can also set up a Mongoose "pre" validator that examines the document and throws an Error if it finds fields that are outside of your schema. By having your middleware call next() (or not), you can control whether you proceed to the document save (or not).
This question/response on stackoverflow can help with figuring out whether or not an object has a property.
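A check for fields outside the schema can be sketched in plain JavaScript, independent of Mongoose. The helper name unknownFields is hypothetical, and the list of allowed names is hard-coded here; with Mongoose, one assumption would be to derive it from Object.keys(schema.paths). Since strict mode drops unknown fields when the document is constructed, a check like this would need to run on the raw input.

```javascript
// Hypothetical helper: returns the top-level keys of `input` that are not in `allowedPaths`.
// Nested paths (such as "address.city") are not handled in this sketch.
function unknownFields(input, allowedPaths) {
  const allowed = new Set(allowedPaths);
  return Object.keys(input).filter((key) => !allowed.has(key));
}

// Assumed schema field names, for illustration only.
const allowedPaths = ['_id', 'name', 'email'];
const input = { name: 'Ada', email: 'ada@example.com', nickname: 'countess' };

const extras = unknownFields(input, allowedPaths);
if (extras.length > 0) {
  // Warn (or throw) before the data reaches the model constructor.
  console.warn(`Fields not in schema: ${extras.join(', ')}`);
}
```

Whether to warn or throw is a design choice: warning keeps the save going while still surfacing the dropped fields, whereas throwing mirrors the behavior of setting strict to "throw".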

Mongoose: Only return one embedded document from array of embedded documents

I've got a model which contains an array of embedded documents. These embedded documents keep track of the points the user has earned in a given activity. Since a user can be part of one activity or several, it makes sense to keep these activities in an array. Now I want to extract the hall of fame: the top ten users for a given activity. Currently I'm doing it like this:
userModel.find({ "stats.activity": "soccer" }, ["stats", "email"])
.desc("stats.points")
.limit(10)
.run (err, users) ->
(if you are wondering about the syntax, it's coffeescript)
where "stats" is the array of embedded documents/activities.
Now this actually works, but currently I'm only testing with accounts that have only one activity. I assume that something will go wrong (sorting-wise) once a user has more activities. Is there any way I can tell mongoose to only return the embedded document where "activity" == "soccer" alongside the top-level document?
By the way, I realize I can do this another way, by having stats in its own collection with a db-ref to the relevant user, but I'm wondering if it's possible to do it like this before I consider any rewrites.
Thanks!
You are correct that this won't work once you have multiple activities in your array.
Specifically, since you can't return just the matching element of an array, you'll get back the whole array, and the sort will apply across all points, not just the ones "paired" with "activity":"soccer".
There is a pretty simple tweak that you could make to your schema to get around this, though. Don't store the activity name as a value; use it as the key.
{ _id: userId,
  email: email,
  stats: [
    {soccer: points},
    {rugby: points},
    {dance: points}
  ]
}
Now you will be able to query and sort like so:
users.find({"stats.soccer":{$gt:0}}).sort({"stats.soccer":-1})
Note that when you move to version 2.2 (currently only available as the unstable development version 2.1), you will be able to use the aggregation framework to get the exact results you want (only the particular subset of an array or subdocument that matches your query) without changing your schema.
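For the aggregation route, a pipeline along the following lines might work against the question's original schema, where each stats entry has "activity" and "points" fields. The builder function hallOfFamePipeline is hypothetical; the stages themselves ($match, $unwind, $sort, $limit, $project) are standard aggregation operators.

```javascript
// Hypothetical builder; field names follow the question's original schema:
// { email, stats: [{ activity, points }, ...] }
function hallOfFamePipeline(activity, limit) {
  return [
    // Keep only users who have at least one entry for the activity.
    { $match: { 'stats.activity': activity } },
    // Produce one document per stats entry.
    { $unwind: '$stats' },
    // Drop the entries that belong to other activities.
    { $match: { 'stats.activity': activity } },
    // Highest points first, then take the top N.
    { $sort: { 'stats.points': -1 } },
    { $limit: limit },
    // Return only the fields the hall of fame needs.
    { $project: { email: 1, stats: 1 } },
  ];
}

const pipeline = hallOfFamePipeline('soccer', 10);
```

The resulting array would be handed to the collection's or model's aggregate method. Each result then carries a single stats subdocument, so the sort applies only to the points paired with the requested activity.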
