How should I validate my inputs in the backend? - node.js

My web app will receive a lot of user-generated content. I want to improve basic security by validating inputs that users generate. My backend is running Node+Express.
How should I implement input validation?
With assert?
Coming from Python, I started writing assert statements:
assert(title.length > 0)
With express-validator?
Then I discovered that there are form validation libraries, such as express-validator. This seemed like the 'right' way of doing it. But compared to simple assertion statements, it also seemed actually like I would double or triple the number of lines of code I need to write.
With TypeScript?
Then I realised that perhaps I could just write some of my Node code in TypeScript, and it would do all the validation for me.
?
Which approach should I use?

The first thing you should do is sanitize the user input to prevent XSS.
There are many NPM packages for that, DOMPurify does a great job.
The most basic way to validate the input is to use if statements, for example:
app.post("/user", (req, res) => {
const { email, password } = req.body;
if (!email || !validator.isEmail(email)) {
return res.status(400).send("Invalid email");
} else if (!password || password.length < 8) {
return res.status(400).send("Invalid password");
}
// Do stuff
return res.sendStatus(200);
});
Validation libraries are pretty useful, in the example above I used the isEmail function from validator.
Note that TypeScript does not do any runtime validation, it performs type-checking during compile time and then compiles the whole TypeScript code into JavaScript, stripping away all the types.
Which means that, for example, even if a field you specified as a string in TypeScript turns out to be a number during runtime, JavaScript won't throw any error, so it's really important that you make sure that every input field is validated.
Validating each input in each request could become quite repetitive and ugly to look at, but thankfully you can implement Express middlewares that you can easily add to the requests that need them.

Related

NodeJS - What's the deal with throwing errors as a means of control flow?

This question is similar, but answers are specific to sails: NodeJS best practices: Errors for flow control?
I come from a background working on REST APIs in C#. In C#, when there was an operational "error" (I'm intentionally using error in quotes), we just returned a constant or key to a lookup table of messages. Depending on the message that was returned, the routing layer would return an appropriate status code to the user (200, 404, etc.).
Take logging in. Consider that someone logs into a server, and the account does not exist. This is not an unusual thing to happen, so it doesn't make much sense to throw an error.
Now, I'm writing Node applications and I'm frustrated. It seems like common practice is to just throw errors everywhere when anything unexpected happens. This makes sense for things like the database failing to retrieve an account which is known to exist, or for catching syntax errors. But for something like a user entering the wrong username/password, throwing an error seems like overkill.
Then consider a case where you have error-reporting middleware. For syntax errors, you might want to log a stack trace, when you would never want to do that for something like a wrong password. By using errors for control flow, a user entering a bad password could be using up the same server resources as logging a critical database exception. You could extend the Error class and switch on the type of error, but that seems sloppy.
Finally, some solutions just directly return ExpressJS reponses with status codes, but I don't always want the same status code for every case where a user isn't found in the database. Maybe I want to return a 404 for a user profile page, and a 400 for a failed login.
Has anyone else faced this problem, and if so, how have you handled it (pun intended)?
UPDATE
Here's some code to demonstrate what my solution would look like.
I've invented a new class called Goof, which is like an error but it is used in situations where it is a likely outcome. For example, bad login credentials would be a type of Goof:
// constants.js.
const WRONG_PASSWORD = 1;
const USERNAME_DOES_NOT_EXIST = 1;
// goof.js.
class Goof {
constructor(message) {
this.message = message;
}
}
// Inside express route.
const username = req.body.username;
const password = req.body.password;
const result = await signInUser(username, password);
if(typeof(result) === Goof) {
if(result.status === constants.WRONG_PASSWORD) {
res.sendStatus(400);
return;
}
if(result.status === constants.USERNAME_DOES_NOT_EXIST) {
res.sendStatus(400)
return;
}
else {
throw new Error('A Goof was returned but not handled properly');
}
} else {
res.json(result);
}
What are the advantages/disadvantages of this over throwing and catching errors?

Best practices form processing with Express

I'm writing a website which implements a usermanagement system and I wonder what best practices regarding form processing I have to consider.
Especially performance, security, SEO and user experience are important to me. When I was working on it I came across a couple questions and I didn't find an complete node/express code snippet where I could figure out all of my below questions.
Use case: Someone is going to update the birthday of his profile. Right now I am doing a POST request to the same URL to process the form on that page and the POST request will respond with a 302 redirect to the same URL.
General questions about form processing:
Should I do a POST request + 302 redirect for form processing or rather something else like an AJAX request?
How should I handle invalid FORM requests (for example invalid login, or email address is already in use during signup)?
Express specific questions about form processing:
I assume before inserting anything into my DB I need to sanitize and validate all form fields on the server side. How would you do that?
I read some things about CSRF but I have never implemented a CSRF protection. I'd be happy to see that in the code snippet too
Do I need to take care of any other possible vulnerabilities when processing forms with Express?
Example HTML/Pug:
form#profile(method='POST', action='/settings/profile')
input#profile-real-name.validate(type='text', name='profileRealName', value=profile.name)
label(for='profile-real-name') Name
textarea#profile-bio.materialize-textarea(placeholder='Tell a little about yourself', name='profileBio')
| #{profile.bio}
label(for='profile-bio') About
input#profile-url.validate(type='url', name='profileUrl', value=profile.bio)
label(for='profile-url') URL
input#profile-location.validate(type='text', name='profileLocation', value=profile.location)
label(for='profile-location') Location
.form-action-buttons.right-align
a.btn.grey(href='' onclick='resetForm()') Reset
button.btn.waves-effect.waves-light(type='submit')
Example Route Handlers:
router.get('/settings/profile', isLoggedIn, profile)
router.post('/settings/profile', isLoggedIn, updateProfile)
function profile(req, res) {
res.render('user/profile', { title: 'Profile', profile: req.user.profile })
}
function updateProfile(req, res) {
var userId = req.user._id
var form = req.body
var profile = {
name: form.profileRealName,
bio: form.profileBio,
url: form.profileUrl,
location: form.profileLocation
}
// Insert into DB
}
Note: A complete code snippet which takes care of all form processing best practices adapted to the given example is highly appreciated. I'm fine with using any publicly available express middleware.
Should I do a POST request + 302 redirect for form processing or rather something else like an AJAX request?
No, best practice for a good user experience since 2004 or so (basically since gmail launched) has been form submission via AJAX and not web 1.0 full-page load form POSTs. In particular, error handling via AJAX is less likely to leave your user at a dead end browser error page and then hit issues with the back button. The AJAX in this case should send an HTTP PATCH request to be most semantically correct but POST or PUT will also get the job done.
How should I handle invalid FORM requests (for example invalid login, or email address is already in use during signup)?
Invalid user input should result in an HTTP 400 Bad Request status code response, with details about the specific error(s) in a JSON response body (the format varies per application but either a general message or field-by-field errors are common themes)
For email already in use I use the HTTP 409 Conflict status code as a more particular flavor of general bad request payload.
I assume before inserting anything into my DB I need to sanitize and validate all form fields on the server side. How would you do that?
Absolutely. There are many tools. I generally define a schema for a valid request in JSON Schema and use a library from npm to validate that such as is-my-json-valid or ajv. In particular, I recommend being as strict as possible: reject incorrect types, or coerce types if you must, remove unexpected properties, use small but reasonable string length limits and strict regular expression patterns for strings when you can, and of course make sure your DB library property prevents injection attacks.
I read some things about CSRF but I have never implemented a CSRF protection.
The OWSAP Node Goat Project CSRF Exercise is a good place to start with a vulnerable app, understand and exploit the vulnerability, then implement the fix (in this case with a straightforward integration of the express.csrf() middleware.
Do I need to take care of any other possible vulnerabilities when processing forms with Express?
Yes generally application developers must understand and actively code securely. There's a lot of material out there on this but particular care must be taken when user input gets involved in database queries, subprocess spawning, or being written back out to HTML. Solid query libraries and template engines will handle most of the work here, you just need to be aware of the mechanics and potential places malicious user input could sneak in (like image filenames, etc).
I am certainly no Express expert but I think I can answer at least #1:
You should follow the Post/Redirect/Get web development pattern in order to prevent duplicate form submissions. I've heard a 303-redirect is the proper http statuscode for redirecting form submissions.
I do process forms using the POST route and once I'm done I trigger a 302-redirect.
As of #3 I recommend looking into express-validator, which is well introduce here: https://developer.mozilla.org/en-US/docs/Learn/Server-side/Express_Nodejs/forms . It's a middleware which allows you to validate and sanitize like this:
req.checkBody('name', 'Invalid name').isAlpha();
req.checkBody('age', 'Invalid age').notEmpty().isInt();
req.sanitizeBody('name').escape();
I wasn't able to comment hence the answer even though it's not a complete answer. Just thought it might help you.
If user experience is something you're thinking about, a page redirection is a strong no. Providing a smooth flow for the people visiting your website is important to prevent drops, and since forms are already not such a pleasure to fill, easing their usage is primary. You don't want to reload their page that might have already took some time to load just to display an error message. Once the form is valid and you created the user cookie, a redirection is fine though, even if you could do things on the client app to prevent it, but that's out-of-scope.
As stated by Levent, you should checkout express-validator, which is the more established solution for this kind of purpose.
req.check('profileRealName', 'Bad name provided').notEmpty().isAlpha()
req.check('profileLocation', 'Invalid location').optional().isAlpha();
req.getValidationResult().then(function (result) {
if (result.isEmpty()) { return null }
var errors = result.array()
// [
// { param: "profileRealName", msg: "Bad name provided", value: ".." },
// { param: "profileLocation", msg: "Invalid location", value: ".." }
// ]
res.status(400).send(errors)
})
.then(function () {
// everything is fine! insert into the DB and respond..
})
From what it looks like, I can assume you are using MongoDB. Given that, I would recommend using an ODM, like Mongoose. It will allow you to define models for your schemas and put restrictions directly on it, letting the model handles these kind of redundant validations for you.
For example, a model for your user could be
var User = new Schema({
name: { type: String, required: [true, 'Name required'] },
bio: { type: String, match: /[a-z]/ },
age: { type: Number, min: 18 }, // I don't know the kind of site you run ;)
})
Using this schema on your route would be looking like
var user = new User({
name: form.profileRealName,
bio: form.profileBio,
url: form.profileUrl,
location: form.profileLocation
})
user.save(function (err) {
// and you could grab the error here if it exists, and react accordingly
});
As you can see it provides a pretty cool api, which you should read about in their docs if you want to know more.
About CRSF, you should install csurf, which has pretty good instructions and example usages on their readme.
After that you're pretty much good to go, there is not much more I can think about apart making sure you stay up to date with your critical dependencies, in case a 0-day occurs, for example the one that happened in 2015 with JWTs, but that's still kinda rare.

Frisby Functional Test standards

I'm new to this and I have been searching for ways (or standards) to write proper functional tests but I still I have many unanswered questions. I'm using FrisbyJS to write functional tests for my NodeJS API application and jasmine-node to run them.
I have gone through Frisby's documentation, but it wasn't fruitful for me.
Here is a scenario:
A guest can create a User. (No username duplication allowed, obviously)
After creating a User, he can login. On successful login, he gets an Access-Token.
A User can create a Post. Then a Post can have Comment, and so on...
A User cannot be deleted once created. (Not from my NodeJS Application)
What Frisby documentation says is, I should write a test within a test.
For example (full-test.spec.js):
// Create User Test
frisby.create('Create a `User`')
.post('http://localhost/users', { ... }, {json: true})
.expectStatus(200)
.afterJSON(function (json) {
// User Login Test
frisby.create('Login `User`')
.post('http://localhost/users/login', { ... }, {json: true})
.expectStatus(200)
.afterJSON(function (json) {
// Another Test (For example, Create a post, and then comment)
})
.toss();
})
.toss();
Is this the right way to write a functional test? I don't think so... It looks dirty.
I want my tests to be modular. Separate files for each test.
If I create separate files for each test, then while writing a test for Create Post, I'll need a User's Access-Token.
To summarize, the question is: How should I write tests if things are dependent on each other?
Comment is dependent on Post. Post is dependent on User.
This is the by product to using NodeJS. This is a large reason I regret deciding on frisby. That and the fact I can't find a good way to load expected results out of a database in time to use them in the tests.
From what I understand - You basically want to execute your test cases in a sequence. One after the other.
But since this is javascript, the frisby test cases are asynchronous. Hence, to make them synchronous, the documentation suggested you to nest the test cases. Now that is probably OK for a couple of test cases. But nesting would go chaotic if there are hundreds of test cases.
Hence, we use sequenty - another nodejs module, which uses call back to execute functions(and test cases wrapped in these functions) in sequence. In the afterJSON block, after all the assertions, you have to do a call back - cb()[which is passed to your function by sequenty].
https://github.com/AndyShin/sequenty
var sequenty = require('sequenty');
function f1(cb){
frisby.create('Create a `User`')
.post('http://localhost/users', { ... }, {json: true})
.expectStatus(200)
.afterJSON(function (json) {
//Do some assertions
cb(); //call back at the end to signify you are OK to execute next test case
})
.toss();
}
function f2(cb){
// User Login Test
frisby.create('Login `User`')
.post('http://localhost/users/login', { ... }, {json: true})
.expectStatus(200)
.afterJSON(function (json) {
// Some assertions
cb();
})
.toss();
}
sequenty.run(f1,f2);

Validation In Express-Validator

I am using express-validator for validation. I am using mongoose for database, it also has validation built in. I want to know which one should I use?
I also want to know if the validation in express-validator is parallel. Take this code for example:
req.checkBody('email', 'Invalid email').notEmpty().isEmail().isUnique();
req.checkBody('password', 'Invalid possword').notEmpty().len(8, 30);
req.checkBody('first_name', 'Invalid first_name').notEmpty().isAlpha();
req.checkBody('last_name', 'Invalid last_name').notEmpty().isAlpha();
req.checkBody('dateofbirth', 'Invalid dateofbirth').notEmpty.isDate();
isUnique() is a custom validation method that checks if the email has not already been registered or not, it queries to database to validate so. Though not mentioned in the code above but I also have few other post request where I need to validate multiple fields where database queries will be made in each of them.
So I wanted to know if its possible to run each of the above check method in parallel as that would make it faster and would also me more node like. I obviously will like to use a module for running these in parallel like async. I would also like to know if already these check methods are running in parallel already?
Please help me figure this out? Thanks in advance.
express-validator is meant to validate input passed by the browser/client; Mongoose's validation is meant to validate newly created documents. Both serve a different purpose, so there isn't a clean-cut answer to which one you should use; you could use both, even.
As for the order of validation: the checks will be performed in series. You could use async.parallel() to make it appear as if the checks are performed in parallel, but in reality they won't be since the checks are synchronous.
EDIT: node-validator (and therefore express-validator) is a string validator. Testing for uniqueness isn't a string operation but operates on your data model, so you shouldn't try to use node-validator for it (in fact, I don't even think you can).
Instead, I would suggest using Mongoose's unique feature to ensure that an e-mail address only occurs once in your database.
Alternatively, use a validator module that supports async operations, like async-validate.
Using both mongodb unique constraint and express-validator will cause a little bit complicated errors processing: you need to analyse errors in different places and translate it to common format for rest response.
So another approach will be creating custom express validator which will find record with such value in mongodb:
router.use(expressValidator({
customValidators: {
isUsernameAvailable(username) {
return new Promise((resolve, reject) => {
User.findOne({ username: username }, (err, user) => {
if (err) throw err;
if(user == null) {
resolve();
} else {
reject();
}
});
});
}
}
})
);
...
req.checkBody('username', 'Username is required').notEmpty();
req.checkBody('username', 'Username already in use').isUsernameAvailable();
req.asyncValidationErrors().then(() => {
...
If you need, see total example with this code on my website: https://maketips.net/tip/161/validate-username-available-using-express-validator
Conclusion: express-validator supports asynchronous validation as well so you can perform any data and string validation you need without mixing user validation and low-level db backend validation
There is no way that Express-Validator can check a email address is unique or not, on the other hand MongoDB is not responsible for checking your entered email address is in correct format.
So you must have to use both of them when they are in need because they serve completly diffrent purpose.

node.js middleware and js encapsulation

I'm new to javascript, and jumped right into node.js. I've read a lot of theory, and began well with the practical side (I'm writing an API for a mobile app), but I have one basic problem, which has lead me to middleware. I've successfully implemented a middleware function, but I would like to know if the use I'm giving the idea of middleware is OK, and also resolve the original problem which brought me to middleware. My question is two-fold, it's as follows:
1) From what I could gather, the idea of using middleware is repeating a process before actually processing the request. I've used it for token verification, as follows:
Only one of my urls doesn't receive a token parameter, so
app.js
app.get('/settings', auth.validateToken, auth.settings);
auth.js
function validateToken(req, res, next){ //code };
In validateToken, my code checks the token, then calls next() if everything is OK, or modifies res as json to return a specific error code.
My questions regarding this are: a) Is this a correct use of middleware? b) is there a [correct] way of passing a value onto the next function? Instead of calling next only if everything is OK, is there a [correct] way of calling next either way, and knowing from inside the next function (whichever it is), if the middleware was succesful or not? If there is, would this be a proper use of middleware? This precise point brings me to my original problem, and part two of this question, which is encapsulating functions:
THIS PART WAS FIXED, SEE MY SECOND COMMENT.
2) I discovered middleware trying to simply encapsulate validateToken, and be able to call it from inside the functions that the get handlers point to, for example auth.settings.
I'm used to common, sequential programming, and not in javascript, and haven't for the life of me been able to understand how to do this, taking into account the event-based nature of node.js.
What I want to do right now is write a function which simply verifies the user and password. I have it perfectly written inside a particular handler, but was about to copy-paste it to another one, so I stopped. I want to do things the right way from scratch, and understand node.js. One of the specific problems I've been having, is that the error code I have to return when user and password don't match are different depending on the parent function, so I would need this function to be able to tell the callback function "hey, the password and user don't match", so from the parent function I can respond with the correct message.
I think what I actually want is to write an asynchronous function I can call from inside another one.
I hope I've been clear, I've been trying to solve this on my own, but I can't quite finish wrapping my head around what my actual problem is, I'm guessing it's due to my recent introduction to node.js and JS.
Thanks in advance! Jennifer.
1) There is res.locals object (http://expressjs.com/api.html#res.locals) designed to store data local to the request and to pass them from one middleware to another. After request is processed this object is disposed of. If you want to store data within the session you can use req.session.
2) If I understand your question, you want a function asynchronously passing the response to the caller. You can do it in the same way most node's functions are designed.
You define a function in this way:
function doSomething(parameters, callback) {
// ... do something
// if (errorConddition()) err = errorCode();
if (callback) callback(err, result)
}
And the caller instead of using the return value of the function passes callback to this function:
function caller(req, res, next) {
//...
doSomething(params, function(err, result) {
if (! err && result) {
// do something with the result
next();
} else {
// do something else
next();
// or even res.redirect('/error');
}
});
}
If you find yourself writing similar callback functions you should define them as function and just pass the function as parameter:
//...
doSomething(param, processIt);
function processIt(err, result) {
// ...
}
What keeps you confused, probably, is that you don't treat functions as values yet, which is a very specific to JavaScript (not counting for languages that are little used).
In validateToken, my code checks the token, then calls next() if everything is OK, or modifies res as json to return a specific error code.
a) Is this a correct use of middleware?
b) is there a [correct] way of passing a value onto the next function?
Yes that is the correct way of using middleware, although depending on the response message type and specifications you could use the built in error handling of connect. That is in this example generate a 401 status code by calling next({status:401,stack:'Unauthorized'});
The middleware system is designed to handle the request by going through a series of functions until one function replies to the request. This is why the next function only takes one argument which is error
-> if an error object is passed to the next function then it will be used to create a response and no further middleware will be processed. The manner in which error response is created is as follows
// default to 500
if (res.statusCode < 400) res.statusCode = 500;
debug('default %s', res.statusCode);
// respect err.status
if (err.status) res.statusCode = err.status;
// production gets a basic error message
var msg = 'production' == env
? http.STATUS_CODES[res.statusCode]
: err.stack || err.toString();
-> to pass values down the middleware stack modifying the request object is the best method. This ensures that all processing is bound to that specific request and since the request object goes through every middleware function it is a good way to pass information down the stack.

Resources