I have a mongoose User schema built like this:
var UserSchema = new Schema({
username: { type: String, required: true, index: { unique: true } },
password: { type: String, required: true },
salt: { type: String, required: true}
});
I want to be able to send this user object to the client side of my application, but I don't want to send the password or salt fields.
So I added the following code to my user model module:
UserSchema.methods.forClientSide = function() {
console.log('in UserSchema.methods.forClientSide');
console.log(this);
//var userForClientSide=_.omit(this,'password','salt');
var userForClientSide={_id:this._id, username:this.username };
console.log(userForClientSide);
return userForClientSide;
}
I have required the underscore module (it's installed locally via a dependency in my package.json).
Note the commented-out line: I was expecting it to omit the password and salt fields of the user object, but it did not do anything :( The logged object had the full set of properties.
When replaced with the currently used line, var userForClientSide={_id:this._id, username:this.username };, it gets the results I want, but:
1) I want to know why does the _.omit not work.
2) I don't like my current workaround very much because it selects some properties instead of omitting the ones I don't want, so if I add any new properties to the schema I will have to add them here as well.
This is my first attempt at writing something using node.js/express/mongodb/mongoose etc., so it is very possible that I am missing some other, better solution to this issue (possibly some feature of mongoose). Feel free to educate me on the right way to do things like this.
So basically I want to know both what the right way to do this is and why my way did not work.
thanks
1) I want to know why does the _.omit not work.
Mongoose uses defineProperty and some heavy metaprogramming. If you want to use underscore, first call user.toJSON() to get a plain old javascript object that will work better with underscore without all the metaprogramming fanciness, functions, etc.
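To make the first point concrete, here is a minimal sketch. It assumes a hypothetical FakeDoc constructor as a simplified stand-in for a Mongoose document (it is not Mongoose's actual internals) and a hand-written omit helper in place of underscore's. The idea it illustrates: when the field values live in an internal object and are only exposed through getters, omitting keys from the wrapper leaves the secrets in place, while omitting from the plain toJSON() result works as expected.

```javascript
// Hypothetical stand-in for a Mongoose document: the real data lives in an
// internal object and fields are exposed through getters on the prototype.
function FakeDoc(data) {
  this._doc = data;
}
['username', 'password', 'salt'].forEach(function (key) {
  Object.defineProperty(FakeDoc.prototype, key, {
    get: function () { return this._doc[key]; }
  });
});
FakeDoc.prototype.toJSON = function () {
  return Object.assign({}, this._doc);
};

// Minimal omit, similar in spirit to _.omit: copy the object's own
// enumerable keys except the ones listed.
function omit(obj, keys) {
  var out = {};
  Object.keys(obj).forEach(function (k) {
    if (keys.indexOf(k) === -1) out[k] = obj[k];
  });
  return out;
}

var user = new FakeDoc({ username: 'bob', password: 'hash', salt: 'xyz' });

// Omitting from the wrapper: the only own key is _doc, so it is copied
// whole, password and salt included.
var wrong = omit(user, ['password', 'salt']);

// Omitting from the plain object: only username survives.
var right = omit(user.toJSON(), ['password', 'salt']);
```

With a real model the equivalent would presumably be _.omit(this.toJSON(), 'password', 'salt') inside forClientSide.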
A better solution is to use mongo/mongoose's fields object and pass the string "-password -salt" and therefore just omit getting these back from mongo at all.
Another approach is to use the mongoose Transform (search for "transform" on that page). Your use case is the EXACT use case the documentation uses as an example.
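As a rough sketch of the transform approach: the function below uses the (doc, ret, options) signature that Mongoose passes to a toJSON/toObject transform. The name hideSecrets and the wiring shown in the comment are assumptions for illustration; the check at the end runs the function against a plain object rather than a real document.

```javascript
// Transform function: `ret` is the plain object that is about to be
// returned, so fields can be deleted from it without touching the document.
function hideSecrets(doc, ret, options) {
  delete ret.password;
  delete ret.salt;
  return ret;
}

// Assumed wiring in the model module (not executed here):
//   UserSchema.set('toJSON', { transform: hideSecrets });
// After that, JSON.stringify(user) or res.json(user) would go through it.

// Stand-alone check using a plain object in place of a real document:
var plain = hideSecrets(
  null,
  { _id: 1, username: 'bob', password: 'hash', salt: 'xyz' },
  {}
);
```

The advantage over a hand-picked whitelist is that new schema fields are passed through automatically; only the secret fields are listed.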
You can also make your mongoose queries "lean" by calling .lean() on your query, in which case you will get back plain javascript objects instead of mongoose model instances.
However, after trying each of these things, I'm personally coming to the opinion that there should be a separate collection for Account that has the login details and a User collection, which will make leaking the hashes extremely unlikely even by accident, but any of the above will work.
Related
I have a very specific question. I have a web project that is using Express (Node.JS) and MLab (MongoDB/Mongoose). I've manually edited several records in a collection (yeah, I know, bad idea) and am using one of those fields in a Mongoose search. The schema is defined as follows: (relevant part only)
user: {
id: {
type: mongoose.Schema.Types.ObjectId,
ref: "Registration"
},
username: String,
type: String
}
My search is as follows:
Master.find({$or: [{'user.type': 'committee'}, {'user.type': 'admin'}]}, function(err, foundUsers) {
// do stuff
});
The search works just fine (using 'user.type'), but the user object in each record is undefined in foundUsers.
What am I missing?
Thanks!
Found it. I tried to pull a fast one and add something to the user record that wasn't in the Registration Schema. Mongo was smarter than I was in this case.
What is the best way to propagate updates when you have a denormalized Schema? Should it be all done in the same function?
I have a schema like so:
var Authors = new Schema({
...
name: {type: String, required:true},
period: {type: Schema.Types.ObjectId, ref:'Periods'},
quotes: [{type: Schema.Types.ObjectId, ref: 'Quotes'}],
active: Boolean,
...
})
Then:
var Periods = new Schema({
...
name: {type: String, required:true},
authors: [{type: Schema.Types.ObjectId, ref:'Authors'}],
active: Boolean,
...
})
Now say I want to denormalize Authors, since the period field will always just use the name of the period (which is unique, there can't be two periods with the same name). Say then that I turn my schema into this:
var Authors = new Schema({
...
name: {type: String, required:true},
period: String, //no longer a ref
active: Boolean,
...
})
Now Mongoose doesn't know anymore that the period field is connected to the Period schema. So it's up to me to update the field when the name of a period changes. I created a service module that offers an interface like this:
exports.updatePeriod = function(id, changes) {...}
Within this function I go through the changes to update the period document that needs to be updated. So here's my question. Should I, then, update all authors within this method? Because then the method would have to know about the Author schema and any other schema that uses period, creating a lot of coupling between these entities. Is there a better way?
Perhaps I can emit an event that a period has been updated and all the schemas that have denormalized period references can observe it, is that a better solution? I'm not quite sure how to approach this issue.
Ok, while I wait for a better answer than my own, I will try to post what I have been doing so far.
Pre/Post Middleware
The first thing I tried was to use the pre/post middlewares to synchronize documents that referenced each other. (For instance, if you have Author and Quote, and an Author has an array of the type: quotes: [{type: Schema.Types.ObjectId, ref:'Quotes'}], then whenever a Quote is deleted, you'd have to remove its _id from the array. Or if the Author is removed, you may want all his quotes removed).
This approach has an important advantage: if you define each Schema in its own file, you can define the middleware there and have it all neatly organized. Whenever you look at the schema, right below you can see what it does, how its changes affect other entities, etc:
var Quote = new Schema({
//fields in schema
})
//it's quite clear what happens when you remove an entity
Quote.pre('remove', function(next) {
Author.update(
//remove quote from Author quotes array.
)
})
The main disadvantage, however, is that these hooks are not executed when you call update or any of the Model's static updating/removing functions. Rather, you need to retrieve the document and then call save() or remove() on it. The reason for this is that these functions send atomic queries to the database directly. While this is nice, I hate the inconsistency between using save() and Model.update(...). Maybe somebody else, or you in the future, accidentally uses the static update functions and the middleware isn't triggered, giving you headaches that you struggle to get rid of.
Another, smaller disadvantage is that Quote now needs to be aware of anyone that references it, so that it can update them whenever a Quote is updated or removed. So let's say that a Period has a list of quotes, and Author has a list of quotes as well: Quote will need to know about these two to update them.
NodeJS Event Mechanisms
What I am currently doing is not really optimal, but it offers me enough benefits to outweigh the cons (or so I believe; if anyone cares to give me some feedback, that'd be great). I created a service that wraps around a model, say AuthorService, that extends events.EventEmitter and is a constructor function that looks roughly like this:
function AuthorService() {
var self = this
this.create = function() {...}
this.update = function() {
...
self.emit('AuthorUpdated', before, after)
...
}
}
util.inherits(AuthorService, events.EventEmitter)
module.exports = new AuthorService()
The advantages:
Any interested function can register to the Service events and be notified. That way, for instance, when a Quote is updated, the AuthorService can listen to it and update the Authors accordingly. (Note 1)
Quote doesn't need to be aware of all the documents that reference it, the Service simply triggers the QuoteUpdated event and all the documents that need to perform operations when this happens will do so.
Note 1: As long as this service is used whenever anyone needs to interact with mongoose.
The disadvantages:
Added boilerplate code, using a service instead of mongoose directly.
Now it isn't exactly obvious what functions get called when you trigger the event.
You decouple producer and consumer at the cost of legibility (since you just emit('EventName', args), it's not immediately obvious which Services are listening to this event)
Another disadvantage is that someone can retrieve a Model from the Service and call save(), in which case the events won't be triggered, though I'm sure this could be addressed with some kind of hybrid between these two solutions.
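To make the event-based approach concrete for the period/author case from the question, here is a minimal in-memory sketch. The periods and authors variables are hypothetical stand-ins for the collections, and PeriodService and the 'PeriodUpdated' event name are illustrative choices, not an existing API. The period side knows nothing about authors; the author side subscribes and keeps its denormalized copy of the name in sync.

```javascript
var events = require('events');
var util = require('util');

// In-memory stand-ins for the collections (hypothetical data).
var periods = { p1: { name: 'Baroque' } };
var authors = [
  { name: 'Bach', period: 'Baroque' },
  { name: 'Mozart', period: 'Classical' }
];

function PeriodService() {
  events.EventEmitter.call(this);
}
util.inherits(PeriodService, events.EventEmitter);

// Update the period, then announce the change with before/after snapshots.
PeriodService.prototype.update = function (id, changes) {
  var before = Object.assign({}, periods[id]);
  Object.assign(periods[id], changes);
  this.emit('PeriodUpdated', before, periods[id]);
};

var periodService = new PeriodService();

// The author side subscribes and rewrites its denormalized period names.
periodService.on('PeriodUpdated', function (before, after) {
  if (before.name === after.name) return;
  authors.forEach(function (author) {
    if (author.period === before.name) author.period = after.name;
  });
});

periodService.update('p1', { name: 'High Baroque' });
```

With real models the listener would issue an update against the Authors collection instead of looping over an array, and it would run asynchronously, so error handling in the listener becomes part of the design.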
I am very open to suggestions in this field (which is why I posted this question in the first place).
I'm gonna speak more from an architectural point of view than a coding point of view, since when it comes right down to it, you can pretty much achieve anything with enough lines of code.
As far as I've been able to understand, your main concern has been keeping consistency across your database, mainly removing documents when their references are removed and vice-versa.
So in this case, rather than wrapping the whole functionality in extra code I'd suggest going for atomic Actions, where an Action is a method you define yourself that performs a complete removal of an entity from the DB (both document and reference).
So for example when you wanna remove an author's quote, you do something like removing the Quote document from the DB and then removing the reference from the Author document.
This sort of architecture ensures that each of these Actions performs a single task and performs it well, without having to tap into events (emitting, consuming) or any other stuff. It's a self-contained method for performing its own unique task.
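A minimal sketch of such an Action, using in-memory objects as hypothetical stand-ins for the Quote and Author collections (quoteStore, authorStore and removeQuote are names made up for this example). Note that "atomic" here means a single entry point that does the whole job; with a real database the two steps are separate operations unless you wrap them in a transaction.

```javascript
// In-memory stand-ins for the collections (hypothetical data).
var quoteStore = { q1: { text: 'First' }, q2: { text: 'Second' } };
var authorStore = { a1: { name: 'Someone', quotes: ['q1', 'q2'] } };

// One Action: remove the quote document and every reference to it.
function removeQuote(quoteId) {
  if (!quoteStore[quoteId]) return false;
  delete quoteStore[quoteId];
  Object.keys(authorStore).forEach(function (authorId) {
    var author = authorStore[authorId];
    author.quotes = author.quotes.filter(function (id) {
      return id !== quoteId;
    });
  });
  return true;
}

var removed = removeQuote('q1');
```

Callers only ever use removeQuote, so there is one place to look when asking what happens on removal.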
I am playing around with node.js, express, and mongoose.
For the sake of getting something up and running right now I am passing the Express query string object directly to a mongoose find function. What I am curious about is how dangerous would this practice be in a live app. I know that a RDBMS would be extremely vulnerable to SQL injection. Aside from the good advice of "sanitize your inputs" how evil is this code:
app.get('/query', function (req, res) {
models.findDocs(req.query, function (err, docs) {
res.send(docs);
});
});
Meaning that a GET request to http://localhost:8080/query?name=ahsteele&status=a would just shove the following into the findDocs function:
{
name: 'ahsteele',
status: 'a'
}
This feels icky for a lot of reasons, but how unsafe is it? What's the best practice for passing query parameters to mongodb? Does express provide any out of the box sanitization?
As far as injection being a problem, as with SQL, the risk is significantly lower... albeit theoretically possible via an unknown attack vector.
The data structures and protocol are binary and API driven rather than leveraging escaped values within a domain-specific-language. Basically, you can't just trick the parser into adding a ";db.dropCollection()" at the end.
If it's only used for queries, it's probably fine... but I'd still caution you to use a tiny bit of validation:
Ensure only alphanumeric characters (filter or invalidate nulls and anything else you wouldn't normally accept)
Enforce a max length (like 255 characters) per term
Enforce a max length of the entire query
Strip special parameter names starting with "$", like "$where" & such
Don't allow nested arrays/documents/hashes... only strings & ints
Also, keep in mind, an empty query returns everything. You might want a limit on that return value. :)
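The checklist above could be sketched roughly like this. The function name sanitizeQuery and the two limits are assumptions chosen for illustration, not part of Express or Mongoose; adjust the allowed character set to whatever your fields legitimately contain.

```javascript
var MAX_TERM_LENGTH = 255;   // per-term limit from the list above
var MAX_TOTAL_LENGTH = 2048; // hypothetical limit for the whole query

// Returns a cleaned copy of a parsed query-string object, or throws if a
// value is not acceptable.
function sanitizeQuery(query) {
  var clean = {};
  var total = 0;
  Object.keys(query).forEach(function (key) {
    var value = query[key];
    // Drop operator-style names such as $where.
    if (key.charAt(0) === '$') return;
    // Only flat strings and finite numbers: no arrays, objects, or null.
    var isString = typeof value === 'string';
    var isNumber = typeof value === 'number' && isFinite(value);
    if (!isString && !isNumber) {
      throw new Error('Unsupported value for ' + key);
    }
    var text = String(value);
    if (text.length > MAX_TERM_LENGTH) {
      throw new Error('Term too long: ' + key);
    }
    // Strings must be alphanumeric.
    if (isString && !/^[a-z0-9]*$/i.test(text)) {
      throw new Error('Invalid characters in ' + key);
    }
    total += key.length + text.length;
    if (total > MAX_TOTAL_LENGTH) {
      throw new Error('Query too long');
    }
    clean[key] = value;
  });
  return clean;
}

var ok = sanitizeQuery({ name: 'ahsteele', status: 'a' });
var stripped = sanitizeQuery({ name: 'ahsteele', $where: '1' });
```

In the route you would call models.findDocs(sanitizeQuery(req.query), ...) inside a try/catch and answer with a 400 when it throws.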
Operator injection is a serious problem here and I would recommend you at least encode/escape certain characters, more specifically the $ symbol: http://docs.mongodb.org/manual/faq/developers/#dollar-sign-operator-escaping
If the user is allowed to prepend a $ symbol to strings or keys within your $_GET or $_POST or whatever, they will quickly use that to http://xkcd.com/327/ and you will be a goner, to say the least.
As far as I know, Express doesn't provide any out-of-the-box sanitization. You can either write your own middleware or do some basic checks in your own logic. And as you said, the case you mention is a bit risky.
But for ease of use, the types built into Mongoose models at least give you some default sanitization and some control over what gets in and what doesn't.
E.g. something like this:
var Person = new Schema({
title : { type: String, required: true }
, age : { type: Number, min: 5, max: 20 }
, meta : {
likes : [String]
, birth : { type: Date, default: Date.now }
}
});
Check this for more info also.
http://mongoosejs.com/docs/2.7.x/docs/model-definition.html
I am trying to sanitize user input in mongoose. I thought that using mongoose middleware would help, but it seems that I am either wrong or doing something wrong.
The reason I am trying to use Mongoose middleware (and not Express middleware) is that I have a document that can have a nested document - however, that nested document can be a standalone document as well. I am trying to create a "single point of truth" for my documents so that I can sanitize only in one place.
The following code does not seem to work:
Organization.pre("validate", function (next) {
this.subdomain = this.trim().toLowerCase();
next();
});
PS. I am also using mongoose-validator, which in turn uses node-validator to validate the user input - node validator also has some sanitize methods, maybe I should use them somehow?
In this case I think it would be better to add trim: true to the Organization schema definition for subdomain:
subdomain: { type: String, trim: true }
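As for why the hook in the question fails: inside document middleware, this is the document, so this.trim() is not a string call; this.subdomain.trim().toLowerCase() is presumably what was intended. If you would rather keep the normalization in the schema, here is a small sketch. The helper name normalizeSubdomain is made up for this example; trim, lowercase and set are String schema options in Mongoose, shown only in comments here.

```javascript
// Normalizes a subdomain value; non-strings are passed through untouched
// so that validation can deal with them.
function normalizeSubdomain(value) {
  if (typeof value !== 'string') return value;
  return value.trim().toLowerCase();
}

// Assumed schema wiring (not executed here); either form should cover
// what the pre('validate') hook was trying to do:
//   subdomain: { type: String, trim: true, lowercase: true }
//   subdomain: { type: String, set: normalizeSubdomain }

var normalized = normalizeSubdomain('  MyOrg ');
```

Because the rule lives on the schema path, it applies whether the document is saved on its own or nested inside another one, which is the "single point of truth" the question asks for.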
I'd like the unique _id field in one of my models to be relatively short: 8 letters/numbers, instead of the usual Mongo _id which is much longer. Having a short unique-index like this helps elsewhere in my code, for reasons I'll skip over here. I've successfully created a schema that does the trick (randomString is a function that generates a string of the given length):
new Schema('Activities', {
'_id': { type: String, unique: true, 'default': function(){ return randomString(8); } },
// ... other definitions
});
This works well so far, but I am concerned about duplicate IDs generated from the randomString function. There are 36^8 possible IDs, so right now it is not a problem... but as the set of possible IDs fills up, I am worried about insert commands failing due to a duplicate ID.
Obviously, I could do an extra query to check if the ID was taken before doing an insert... but that makes me cry inside.
I'm sure there's a better way to be doing this, but I'm not seeing it in the documentation.
This shortid lib https://github.com/dylang/shortid is being used by Doodle or Die and seems to be battle-tested.
By creating a unique index on _id you'll get an error if you try to insert a document with a duplicate key. So wrap error handling around any inserts you do that looks for the error and then generates another ID and retries the insert in that case. You could add a method to your schema that implements this enhanced save to keep things clean and DRY.
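A rough sketch of that retry loop follows. It is written synchronously against a hypothetical in-memory fakeInsert so it can be run as-is; a real Mongoose save is asynchronous, so the same logic would live in a callback or promise chain. The names insertWithRetry, fakeInsert and nextId are made up for this example; 11000 is MongoDB's duplicate-key error code.

```javascript
var DUPLICATE_KEY = 11000; // MongoDB's duplicate-key error code

// Hypothetical in-memory insert standing in for a collection with a
// unique index on _id.
var stored = {};
function fakeInsert(doc) {
  if (stored[doc._id]) {
    var err = new Error('duplicate key');
    err.code = DUPLICATE_KEY;
    throw err;
  }
  stored[doc._id] = doc;
  return doc;
}

// Try the insert with a fresh id each time; retry only on duplicate-key
// errors and give up after maxAttempts.
function insertWithRetry(insert, makeId, fields, maxAttempts) {
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    var doc = Object.assign({}, fields, { _id: makeId() });
    try {
      return insert(doc);
    } catch (err) {
      if (err.code !== DUPLICATE_KEY || attempt === maxAttempts) throw err;
    }
  }
}

// Deterministic id source for the demo: the second insert collides once.
var ids = ['aaaaaaaa', 'aaaaaaaa', 'bbbbbbbb'];
function nextId() { return ids.shift(); }

var first = insertWithRetry(fakeInsert, nextId, { name: 'run' }, 3);
var second = insertWithRetry(fakeInsert, nextId, { name: 'swim' }, 3);
```

This costs nothing in the common case, since the extra work only happens when a collision actually occurs, and the attempt cap keeps a nearly full id space from looping forever.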