Best practices form processing with Express - node.js

I'm writing a website which implements a usermanagement system and I wonder what best practices regarding form processing I have to consider.
Especially performance, security, SEO and user experience are important to me. When I was working on it I came across a couple questions and I didn't find an complete node/express code snippet where I could figure out all of my below questions.
Use case: Someone is going to update the birthday of his profile. Right now I am doing a POST request to the same URL to process the form on that page and the POST request will respond with a 302 redirect to the same URL.
General questions about form processing:
Should I do a POST request + 302 redirect for form processing or rather something else like an AJAX request?
How should I handle invalid FORM requests (for example invalid login, or email address is already in use during signup)?
Express specific questions about form processing:
I assume before inserting anything into my DB I need to sanitize and validate all form fields on the server side. How would you do that?
I read some things about CSRF but I have never implemented a CSRF protection. I'd be happy to see that in the code snippet too
Do I need to take care of any other possible vulnerabilities when processing forms with Express?
Example HTML/Pug:
form#profile(method='POST', action='/settings/profile')
input#profile-real-name.validate(type='text', name='profileRealName', value=profile.name)
label(for='profile-real-name') Name
textarea#profile-bio.materialize-textarea(placeholder='Tell a little about yourself', name='profileBio')
| #{profile.bio}
label(for='profile-bio') About
input#profile-url.validate(type='url', name='profileUrl', value=profile.bio)
label(for='profile-url') URL
input#profile-location.validate(type='text', name='profileLocation', value=profile.location)
label(for='profile-location') Location
.form-action-buttons.right-align
a.btn.grey(href='' onclick='resetForm()') Reset
button.btn.waves-effect.waves-light(type='submit')
Example Route Handlers:
router.get('/settings/profile', isLoggedIn, profile)
router.post('/settings/profile', isLoggedIn, updateProfile)
function profile(req, res) {
res.render('user/profile', { title: 'Profile', profile: req.user.profile })
}
function updateProfile(req, res) {
var userId = req.user._id
var form = req.body
var profile = {
name: form.profileRealName,
bio: form.profileBio,
url: form.profileUrl,
location: form.profileLocation
}
// Insert into DB
}
Note: A complete code snippet which takes care of all form processing best practices adapted to the given example is highly appreciated. I'm fine with using any publicly available express middleware.

Should I do a POST request + 302 redirect for form processing or rather something else like an AJAX request?
No, best practice for a good user experience since 2004 or so (basically since gmail launched) has been form submission via AJAX and not web 1.0 full-page load form POSTs. In particular, error handling via AJAX is less likely to leave your user at a dead end browser error page and then hit issues with the back button. The AJAX in this case should send an HTTP PATCH request to be most semantically correct but POST or PUT will also get the job done.
How should I handle invalid FORM requests (for example invalid login, or email address is already in use during signup)?
Invalid user input should result in an HTTP 400 Bad Request status code response, with details about the specific error(s) in a JSON response body (the format varies per application but either a general message or field-by-field errors are common themes)
For email already in use I use the HTTP 409 Conflict status code as a more particular flavor of general bad request payload.
I assume before inserting anything into my DB I need to sanitize and validate all form fields on the server side. How would you do that?
Absolutely. There are many tools. I generally define a schema for a valid request in JSON Schema and use a library from npm to validate that such as is-my-json-valid or ajv. In particular, I recommend being as strict as possible: reject incorrect types, or coerce types if you must, remove unexpected properties, use small but reasonable string length limits and strict regular expression patterns for strings when you can, and of course make sure your DB library property prevents injection attacks.
I read some things about CSRF but I have never implemented a CSRF protection.
The OWSAP Node Goat Project CSRF Exercise is a good place to start with a vulnerable app, understand and exploit the vulnerability, then implement the fix (in this case with a straightforward integration of the express.csrf() middleware.
Do I need to take care of any other possible vulnerabilities when processing forms with Express?
Yes generally application developers must understand and actively code securely. There's a lot of material out there on this but particular care must be taken when user input gets involved in database queries, subprocess spawning, or being written back out to HTML. Solid query libraries and template engines will handle most of the work here, you just need to be aware of the mechanics and potential places malicious user input could sneak in (like image filenames, etc).

I am certainly no Express expert but I think I can answer at least #1:
You should follow the Post/Redirect/Get web development pattern in order to prevent duplicate form submissions. I've heard a 303-redirect is the proper http statuscode for redirecting form submissions.
I do process forms using the POST route and once I'm done I trigger a 302-redirect.
As of #3 I recommend looking into express-validator, which is well introduce here: https://developer.mozilla.org/en-US/docs/Learn/Server-side/Express_Nodejs/forms . It's a middleware which allows you to validate and sanitize like this:
req.checkBody('name', 'Invalid name').isAlpha();
req.checkBody('age', 'Invalid age').notEmpty().isInt();
req.sanitizeBody('name').escape();
I wasn't able to comment hence the answer even though it's not a complete answer. Just thought it might help you.

If user experience is something you're thinking about, a page redirection is a strong no. Providing a smooth flow for the people visiting your website is important to prevent drops, and since forms are already not such a pleasure to fill, easing their usage is primary. You don't want to reload their page that might have already took some time to load just to display an error message. Once the form is valid and you created the user cookie, a redirection is fine though, even if you could do things on the client app to prevent it, but that's out-of-scope.
As stated by Levent, you should checkout express-validator, which is the more established solution for this kind of purpose.
req.check('profileRealName', 'Bad name provided').notEmpty().isAlpha()
req.check('profileLocation', 'Invalid location').optional().isAlpha();
req.getValidationResult().then(function (result) {
if (result.isEmpty()) { return null }
var errors = result.array()
// [
// { param: "profileRealName", msg: "Bad name provided", value: ".." },
// { param: "profileLocation", msg: "Invalid location", value: ".." }
// ]
res.status(400).send(errors)
})
.then(function () {
// everything is fine! insert into the DB and respond..
})
From what it looks like, I can assume you are using MongoDB. Given that, I would recommend using an ODM, like Mongoose. It will allow you to define models for your schemas and put restrictions directly on it, letting the model handles these kind of redundant validations for you.
For example, a model for your user could be
var User = new Schema({
name: { type: String, required: [true, 'Name required'] },
bio: { type: String, match: /[a-z]/ },
age: { type: Number, min: 18 }, // I don't know the kind of site you run ;)
})
Using this schema on your route would be looking like
var user = new User({
name: form.profileRealName,
bio: form.profileBio,
url: form.profileUrl,
location: form.profileLocation
})
user.save(function (err) {
// and you could grab the error here if it exists, and react accordingly
});
As you can see it provides a pretty cool api, which you should read about in their docs if you want to know more.
About CRSF, you should install csurf, which has pretty good instructions and example usages on their readme.
After that you're pretty much good to go, there is not much more I can think about apart making sure you stay up to date with your critical dependencies, in case a 0-day occurs, for example the one that happened in 2015 with JWTs, but that's still kinda rare.

Related

i can't see difference between put and patch method

I just want to like a quote or dislike if already i liked the quote. So first i find the quote and then i check if i already liked the quote, if not then i like, otherwise i dislike.
I have a router like below
router.put('/:quoteId', isAuth, quotesController.likeQuote);
And likeQuote method is like below
module.exports.likeQuote = (req, res, next) => {
const quoteId = req.params.quoteId;
const userId = req.userId;
Quote.findById(quoteId)
.then((quote) => {
if (quote.likes.indexOf(userId) == -1) {
quote.likes.push(userId);
} else {
quote.likes.pull(userId);
}
return quote.save();
})
.then((updatedQuote) => {
res.status(201).json({ message: 'You liked the post!' });
})
.catch((err) => {
err.statusCode = 500;
next(err);
});
But my question is, i just want to know how PUT and PATCH works? I think we should send all the fields in PUT but not in PATCH methods, but in my case i don't even send any fields and both work just fine.How this happens?
The actual REST API methods (PUT, PATCH, ... ) do not have any limitations. the logic you choose to write is what defines this. Now you're asking about "best practices" and whenever you ask about that you will get many different answers. I'll explain my view.
PUT, so the essence of PUT is to replace the existing object completely, that's why people are telling you to send the entire object because when you use PUT what's expected is a complete swap.
PATCH, the essence of PATCH is to update the existing resource. which is in your case what you're looking for, in this case you just send the required fields you need for the update.
Now is it wrong if you write PUT to be an update and not a complete swap? I would argue it is not. As long as you keep consistent logic throughout your app you can build your own "best practices" that will suit your needs.
Now you did tag this question as Mongo related so I would like to introduce to you the concept of piplined updates (for Mongo v4.2+) where you can execute your logic in 1 single update.
Mongo Playground
i just want to know how PUT and PATCH works?
An important distinction to understand is that we don't have a standard for how PUT and PATCH work; that's a implementation detail, and is deliberately hidden behind the "uniform interface".
What we do have is standardized semantics, an agreement about what PUT and PATCH mean.
(This is further complicated by people not being familiar with the standard, and therefore misinterpretations of the meaning are common.)
If the implementation of the request handler doesn't match the semantics of the request, that's OK... but if something goes expensively wrong as a result, it's the fault of the implementation.
PUT and PATCH are both method-tokens that indicate that we are trying to modify the resource identified by the target-uri. In particular, we use those method-tokens when we are trying to make the server's representation of the resource match the representation on the client.
For example, imagine editing a web page. We GET /home.html, change the TITLE element in our copy, and we want to save our changes to the server. How do we do that in HTTP?
One answer is that we send a copy of home.html (with our changes) back to the server, so that the server can save it. That's PUT.
Another answer is that we diff our copy and the server's copy, and send to the server a patch-document that describes the changes that the server should make to it's copy. That's PATCH.
router.put('/:quoteId', isAuth, quotesController.likeQuote);
What this invocation is doing is configuring the framework, so that requests with the PUT method token and a target-uri that matches "/:quoteId" are delegated to the likeQuote method.
And at this level, the framework assumes that you know what you are doing - there's no attempt to verify that "likeQuote" implements PUT semantics. To ensure that the implementation and the request semantics match, you are going to need to do some work (inspect the code, test, etc).
in my case i don't even send any fields and both work just fine.
Right - because the framework assumes that you know what you are doing, and your current implementation doesn't try to access or interpret the body of the HTTP request.
Note: that's a big hint that the request handler not actually implementing PUT/PATCH semantics (how could the server possibly make its copy of the quote look like the client's if it doesn't look at the information the client provided)?
It is okay to use POST; assuming that your implementation is correct, you should not be using methods with remote authoring semantics, because that's not what you are doing. This same implementation attached to a POST route would be fine.
As is, things are broken - you have a mismatch between the request semantics and the handler implementation. Under controlled conditions, you are likely to get away with it. It's entirely possible that you are only going to be invoking this code under controlled conditions.

Good practice to handle useless response in express js/nodejs?

Am using express js on node for my api service ! In which am using sequelize for query handling purposes !
So in some usecase like creating record, or updating record its simply returning "1" or sometimes nothing !
In this case , am just using
res.sendStatus(200);
or sometimes
res.send("success");
Is there any better way or this is the correct way to handle ? Or should in need .end() in order to end the process ??
which is a good way to handle these kind of useless responses which we dont need to send back ?
This is where Status 204 comes in to play: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#2xx_success
It states: everything is OK (like in 200), but there is simple no need to send a body.
Using express, it's just as simple as: res.sendStatus(204)
Usually a json is send back to the client with different information in it that lets the client-side know that the operation that they request through the api is successfull or failed. for e.g. here, you can define a standard json response like this
//in case of success of the request
{
requestStatus: 1,
...other data if you want to like new Id of the created data/updated data or info about it if you want or depending upon use case
}
or
// in case of failure of the request
{
requestStatus: 0,
...other data if you want to like reason for failure, error message or depending upon your use case
}
Just add this line of code, should be fine:
//For all bad endpoints
app.use('*', (req, res) => {
res.status(404).json({ msg: 'Not a good endpoint, why are you here? Go play FIFA.'})
})
If you want you can generate an error HTML file, but since it's back-end, JSON format is strongly suggested. You can also add success: false for more clarity.

Strapi & react-admin : I'd like to set 'Content-Range' header dynamically when any fetchAll query fires

I'm still a novice web developer, so please bear with me if I miss something fundamental !
I'm creating a backoffice for a Strapi backend, using react-admin.
React-admin library uses a 'data provider' to link itself with an API. Luckily someone already wrote a data provider for Strapi. I had no problem with step 1 and 2 of this README, and I can authenticate to Strapi within my React app.
I now want to fetch and display my Strapi data, starting with Users. In order to do that, quoting Step 3 of this readme : 'In controllers I need to set the Content-Range header with the total number of results to build the pagination'.
So far I tried to do this in my User controller, with no success.
What I try to achieve:
First, I'd like it to simply work with the ctx.set('Content-Range', ...) hard-coded in the controller like aforementioned Step 3.
Second, I've thought it would be very dirty to c/p this logic in every controller (not to mention in any future controllers), instead of having some callback function dynamically appending the Content-Range header to any fetchAll request. Ultimately that's what I aim for, because with ~40 Strapi objects to administrate already and plenty more to come, it has to scale.
Technical infos
node -v: 11.13.0
npm -v: 6.7.0
strapi version: 3.0.0-alpha.25.2
uname -r output: Linux 4.14.106-97.85.amzn2.x86_64
DB: mySQL v2.16
So far I've tried accessing the count() method of User model like aforementioned step3, but my controller doesn't look like the example as I'm working with users-permissions plugin.
This is the action I've tried to edit (located in project/plugins/users-permissions/controllers/User.js)
find: async (ctx) => {
let data = await strapi.plugins['users-permissions'].services.user.fetchAll(ctx.query);
data.reduce((acc, user) => {
acc.push(_.omit(user.toJSON ? user.toJSON() : user, ['password', 'resetPasswordToken']));
return acc;
}, []);
// Send 200 `ok`
ctx.send(data);
},
From what I've gathered on Strapi documentation (here and also here), context is a sort of wrapper object. I only worked with Express-generated APIs before, so I understood this snippet as 'use fetchAll method of the User model object, with ctx.query as an argument', but I had no luck logging this ctx.query. And as I can't log stuff, I'm kinda blocked.
In my exploration, I naively tried to log the full ctx object and work from there:
// Send 200 `ok`
ctx.send(data);
strapi.log.info(ctx.query, ' were query');
strapi.log.info(ctx.request, 'were request');
strapi.log.info(ctx.response, 'were response');
strapi.log.info(ctx.res, 'were res');
strapi.log.info(ctx.req, 'were req');
strapi.log.info(ctx, 'is full context')
},
Unfortunately, I fear I miss something obvious, as it gives me no input at all. Making a fetchAll request from my React app with these console.logs print this in my terminal:
[2019-09-19T12:43:03.409Z] info were query
[2019-09-19T12:43:03.410Z] info were request
[2019-09-19T12:43:03.418Z] info were response
[2019-09-19T12:43:03.419Z] info were res
[2019-09-19T12:43:03.419Z] info were req
[2019-09-19T12:43:03.419Z] info is full context
[2019-09-19T12:43:03.435Z] debug GET /users?_sort=id:DESC&_start=0&_limit=10& (74 ms)
While in my frontend I get the good ol' The Content-Range header is missing in the HTTP Response message I'm trying to solve.
After writing this wall of text I realize the logging issue is separated from my original problem, but if I was able to at least log ctx properly, maybe I'd be able to find the solution myself.
Trying to summarize:
Actual problem is, how do I set my Content-Range properly in my strapi controller ? (partially answered cf. edit 3)
Collateral problem n°1: Can't even log ctx object (cf. edit 2)
Collateral problem n°2: Once I figure out the actual problem, is it feasible to address it dynamically (basically some callback function for index/fetchAll routes, in which the model is a variable, on which I'd call the appropriate count() method, and finally append the result to my response header)? I'm not asking for the code here, just if you think it's feasible and/or know a more elegant way.
Thank you for reading through and excuse me if it was confuse; I wasn't sure which infos would be relevant, so I thought the more the better.
/edit1: forgot to mention, in my controller I also tried to log strapi.plugins['users-permissions'].services.user object to see if it actually has a count() method but got no luck with that either. Also tried the original snippet (Step 3 of aforementioned README), but failed as expected as afaik I don't see the User model being imported anywhere (the only import in User.js being lodash)
/edit2: About the logs, my bad, I just misunderstood the documentation. I now do:
ctx.send(data);
strapi.log.info('ctx should be : ', {ctx});
strapi.log.info('ctx.req = ', {...ctx.req});
strapi.log.info('ctx.res = ', {...ctx.res});
strapi.log.info('ctx.request = ', {...ctx.request});
ctrapi.log.info('ctx.response = ', {...ctx.response});
Ctx logs this way; also it seems that it needs the spread operator to display nested objects ({ctx.req} crash the server, {...ctx.req} is okay). Cool, because it narrows the question to what's interesting.
/edit3: As expected, having logs helps big time. I've managed to display my users (although in the dirty way). Couldn't find any count() method, but watching the data object that is passed to ctx.send(), it's equivalent to your typical 'res.data' i.e a pure JSON with my user list. So a simple .length did the trick:
let data = await strapi.plugins['users-permissions'].services.user.fetchAll(ctx.query);
data.reduce((acc, user) => {
acc.push(_.omit(user.toJSON ? user.toJSON() : user, ['password', 'resetPasswordToken']));
return acc;
}, []);
ctx.set('Content-Range', data.length) // <-- it did the trick
// Send 200 `ok`
ctx.send(data);
Now starting to work on the hard part: the dynamic callback function that will do that for any index/fetchAll call. Will update once I figure it out
I'm using React Admin and Strapi together and installed ra-strapi-provider.
A little boring to paste Content-Range header into all of my controllers, so I searched for a better solution. Then I've found middleware concept and created one that fits my needs. It's probably not the best solution, but do its job well:
const _ = require("lodash");
module.exports = strapi => {
return {
// can also be async
initialize() {
strapi.app.use(async (ctx, next) => {
await next();
if (_.isArray(ctx.response.body))
ctx.set("Content-Range", ctx.response.body.length);
});
}
};
};
I hope it helps
For people still landing on this page:
Strapi has been updated from #alpha to #beta. Care, as some of the code in my OP is no longer valid; also some of their documentation is not up to date.
I failed to find a "clever" way to solve this problem; in the end I copy/pasted the ctx.set('Content-Range', data.length) bit in all relevant controllers and it just worked.
If somebody comes with a clever solution for that problem I'll happily accept his answer. With the current Strapi version I don't think it's doable with policies or lifecycle callbacks.
The "quick & easy fix" is still to customize each relevant Strapi controller.
With strapi#beta you don't have direct access to controller's code: you'll first need to "rewrite" one with the help of this doc. Then add the ctx.set('Content-Range', data.length) bit. Test it properly with RA, so for the other controllers, you'll just have to create the folder, name the file, copy/paste your code + "Search & Replace" on model name.
The "longer & cleaner fix" would be to dive into the react-admin source code and refactorize so the lack of "Content-Range" header doesn't break pagination.
You'll now have to maintain your own react-admin fork, so make sure you're already committed into this library and have A LOT of tables to manage through it (so much that customizing every Strapi controller will be too tedious).
Before forking RA, please remember all the stuff you can do with the Strapi backoffice alone (including embedding your custom React app into it) and ensure it will be worth the trouble.

Login via POST does not yield valid session

I am currently trying to convert a smallish app from nodejs to golang (hence the two tags) but I'm running into a bit of trouble in doing so.
Essentially it is a very simple http POST login which I can't seem to realise. The background is that my university provides a calendar export function and I would like to provide this calendar as a feed that could be added to Google Cal.
Now the thing is that I have the whole thing working in node, but I would really like to be able realise it in go aswell.
The important bit of node code would be
var query = url.parse(req.url, true).query;
var data = {
u: query.user, // Username
p: query.password, // Password
};
needle.post(LOGIN_URL, data, {}, function (error, response) {
//extract cookies etc.
});
which is working like a charm but if I try to do the same in go
import "github.com/parnurzeal/gorequest"
//...
resp, body, err := gorequest.New().Post(LOGIN_URL).Send("u=user&p=pass").End()
//extract cookies etc.
I end up an invalid (timed out) session. I already tried using just net/http in go, which doesn't seem to change anything.
The result the POST request yields is a 302 redirect to an overview page (Btw: it is ASP based). Could it be that this is what's causing the problem, since gorequest then fetches that overview page without the cookies returned in resp, effectively creating a new session that isn't authorized, or am I overlooking something terribly simple?
So it seems that I found the answer myself by following your advice and using "net/http" and digging a little deeper into what the http.Client actually does. To anyone who might encounter similar problems, here is my solution:
http.Client automatically redirects if it receives a 30x response by the server see documentation. Although one can override the redirect policy, I was unable to prevent redirection entirely.
Additionally it seems as if the client has a bug (what I would call it at least), as it drops all header upon redirect see the issue or in the source where new headers are created.
While searching around in net/http I found http.DefaultTransport which is used by http.Client and does not redirect. It seems somewhat lower level and exactly what I was after. The following piece of code demonstrates how I replaced the line realised with gorequest from above:
data := url.Values{"u": {USER}, "p": {PASS}}
req, err := http.NewRequest("POST", LOGIN_URL, bytes.NewBufferString(data.Encode()))
//I needed quite some time to figure out that I needed to set the content type accordingly
req.Header.Add("Content-Type", "application/x-www-form-urlencoded")
//...
resp, err := http.DefaultTransport.RoundTrip(req)
//...
//resp.Header["Set-Cookie"] now contains the login/session cookies
Although I need to extract cookies myself and set a few header values, the solution works perfectly and I am quite happy with it. If anybody has some improvements to my solution or any other advice I am glad to hear it. And thanks to JimB and Volker :).

Resolve MongoDB reference

I am currently building a chatting app with nodejs and mongoDB.
Basically I have two collections to maintain in the db.
user = {
_id: ObjectId("1234"),
account: "stan123"
}
thread = {
_user: ObjectId("1234"),
messages: [
{
body:"hi"
_user:ObjectId("1234")
},
{
body:"second msg"
_user:ObjectId("1234")
}
]
}
I am planning to pass the thread model with all resolved info (user) to the client side, so that I can construct my widget with it.
I searched for solutions for this.Some suggests to make extra calls from client side to get the data.
However, I am worried that when the amount of message grows, there will be considerable http calls that might hurt site speed.
I know some drivers can resolve DBRefs automatically and make the code clean.
However, according to
http://docs.mongodb.org/manual/applications/database-references/
I decided to just use id to maintain reference that make it's as simple as possible.
My plan is resolving all references on server side. Current approach is getting the length of message array first.
Then loop through the message array and make a second query to resolve user info separately.
In each query callback, do a messageToResolve++ and if(messageToResolve >= thread.messages.length)
If the condition meets, send the resolved model to client and end the response.
This is not a case I would consider embedded because it would be painful when you need to update user data.
(message is embedded because it exists only when thread exists)
I am not sure if it's a good way to do it.
Does anyone has a better solution?
Sorry if I didn't explain my problem and solution clear enough.
And thanks in advance.

Resources