Should I separate input validation and DB query? - node.js

I am now working on an API back-end (yes, Node.js). I've written a series of functions to query a database, like:
function addGuys(guys, cb) {
    var sql = 'INSERT INTO guys ...';
    guys.forEach(function (guy) {
        // Construct sql according to keys given in `guy`
        // Validate here?
        // add col names, values to sql statement, etc
    });
    // Execute query
}
The problem now is: should I validate input inside these DB functions, or should I validate it before sending it to them? The latter seems cleaner, but I am a novice and still not sure. Any advice is welcome. Thanks.

You should not construct SQL directly yourself at all. Instead, use a library that does its own validation and escaping. A popular example is Sequelize: http://docs.sequelizejs.com/en/v3/
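For instance, here is a rough sketch of addGuys on top of Sequelize, assuming a Guy model has been defined elsewhere with sequelize.define() (the model name is an invention for the example). bulkCreate() builds and escapes the INSERT itself, and with validate: true it runs the model's validators before touching the database:

function addGuys(guys, cb) {
    // Sequelize escapes all values and runs model validators;
    // no hand-built SQL string anywhere
    Guy.bulkCreate(guys, { validate: true })
        .then(function () { cb(null); })
        .catch(cb);
}

This also answers the original question as a side effect: the validation rules live on the model, not inside each query function.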

Related

Getting database names from server

I want to do a simple thing: get the database names on a RavenDB server. Looks straightforward according to the docs (https://ravendb.net/docs/article-page/4.1/csharp/client-api/operations/server-wide/get-database-names), however I'm facing a chicken-and-egg problem.
The problem comes because I want to get the database names without knowing them in advance. The code in the docs works great, but it requires an active connection to a DocumentStore. And to get an active connection to a DocumentStore, it is mandatory to select a valid database. Otherwise I can't execute the GetDatabaseNamesOperation.
That makes me think that I'm missing something. Is there any way to get the database names without having to know at least one of them?
The database isn't mandatory to open a store. The following code works with no problems:
using (var store = new DocumentStore
{
    Urls = new[] { "http://live-test.ravendb.net" }
})
{
    store.Initialize();
    var dbs = store.Maintenance.Server.Send(new GetDatabaseNamesOperation(0, 25));
}
We send the GetDatabaseNamesOperation to the ServerStore, which is common to all databases and holds server-wide data (like the database names).

Knex + SQL Server whereIn query 8-12s -- raw version returns NO results but if I input the .toQuery() result directly I get results

The database is in the Azure cloud and not currently used in production. There are 80,000 rows and a uprn is a VARCHAR(100).
I'm already using Joi to validate each UPRN as well.
I'm using Knex with a SQL Server database and the following whereIn query:
knex(LOCATIONS.table).whereIn(LOCATIONS.uprn, req.body.uprns)
but this takes 8-12 s to complete and sometimes times out. If I run the output of .toQuery() on the same query, SSMS returns the result within 1-2 s.
If I do a raw query, the output of .toQuery() or .toString() works in SSMS and returns results. But if I try to run the raw query directly, it returns 0 results.
I'm looking to either fix what's making whereIn so slow or get the raw query working.
EDIT 1:
After much debugging and trying, it seems the problem is in how Knex deals with arrays, so I wrote a for-of loop that adds a ? placeholder for each array element and then passed the array as the parameters.
This led me to realizing the performance issue is due to the way SQL Server parameterises queries.
I ended up building a raw query string with all of the parameters and validating the input with a Joi string/regex config:
Joi.string()
    .min(1)
    .max(35)
    .regex(/^[a-z\d\-_\s]+$/i)
allowing only alphanumeric characters, dashes, underscores and spaces, which should prevent SQL injection.
I'm going to look deeper into the security implications and might create a separate database login that can only SELECT from that table and nothing more, to run these queries with.
In the end I needed to handle the query raw and validate separately.
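A minimal sketch of that placeholder-building approach, reusing the LOCATIONS constants and req.body.uprns from the question (this is the shape of the workaround, not the exact code):

// One '?' per array element, then bind the whole validated array
const uprns = req.body.uprns;
const placeholders = uprns.map(() => '?').join(', ');
knex.raw(
    `SELECT * FROM ${LOCATIONS.table} WHERE ${LOCATIONS.uprn} IN (${placeholders})`,
    uprns
).then((rows) => {
    // handle the result here
});

The table and column names come from constants, not user input, so only the values go through the parameter bindings.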

Express with pug, Postgres and proper MVC

I recently started using Node.js + Express.js (generated with pug) + pg-promise for handling db.
My first target is to obtain data from Postgres (already set up) and display it pretty using render and pug. Let's say it is user list from Users table.
From this RESTful tutorial I have learned how to get data and return it as JSON, and it worked.
Based on Mozilla's tutorial I separated my code:
routes/users.js: where for '/' I call user_controller.user_list method (using router.get)
controllers/userController.js I have exported user_list where I would like to ask model for data and call render if I have results
queries.js, which is kind of my model? But I'm not sure. It holds the connection to the db (with promises) and one function for every query I am going to use in the controllers. I believe I should have one model file per table (or any logical entity), but where do I store the pgp connection?
This file is based on the first tutorial I mentioned:
// queries.js (connectionString is set properly to my postgres)
var pgp = require('pg-promise')(options);
var db = pgp(connectionString);

function getUsers(req, res, next) {
    db.any('SELECT (user_id, username) FROM public.users ORDER BY user_id ASC LIMIT 1000')
        .then(function (data) {
            res.json({ data: data });
        })
        .catch(function (err) {
            return next(err);
        });
}

module.exports = {
    getUsers: getUsers
};
Here starts my problem: most tutorials use Mongoose, which is very model-db-schema-friendly, whereas all I have is a plain 'SELECT ...' string I pass to pg-promise's any() function.
Therefore I have no model class like User.
In userController.js I don't know how to call getUsers() to handle its data. Returning a JS object from getUsers() would be nice.
Also: where should I call render? In the controller, or only in
db.any(...).then(function (data) { <--here--> })
Before, I also tried to embed the whole Postgres handling into the controller, but from db.any() I got this array to handle:
[{ row: '(1,John)' }, { row: '(2,Amy)' }, { row: '(50,Peter)' }]
I didn't know how to go from there, and I probably lost my API functionality as well ;-)
I am browsing through multiple tutorials on how to handle MVC, but they usually cover MongoDB and satisfy readers with res.send(), not render().
I am not sure that I understand exactly what your question is about, but since I do not have enough reputation to comment, I'll do my best to help you with your questions. :)
First, regarding the queries.js file: it is IMO not exactly a model, but rather a DAO (Data Access Object) file. The DAO sits between your Model (which is actually your database) and your Controller layers. There is usually one DAO file per object (User, Pet, whatever you want) in your data model.
When the data model is rather complex, it can be useful to use an Object Relational Mapping (ORM) tool such as Mongoose to map your database and execute complex processes on your objects. In such a case, you might need a specific file per object to describe your model and store your queries. But since you don't need an ORM, your DAO can interact directly with your database. That is why you do not have a User.js file.
Regarding the way the db object should be used, I think you should refer directly to pg-promise documentation on the matter.
IMPORTANT: For any given connection, you should only create a single Database object in a separate module, to be shared in your application (see the code example below). If instead you keep creating the Database object dynamically, your application will suffer from loss in performance, and will be getting a warning in a development environment (when NODE_ENV = development).
As a matter of fact, a db object in pg-promise sort of represents the database itself and is actually designed for the simultaneous use of several databases, which does not seem to be your case for the moment.
Finally, when it comes to the render function, I believe it should be in the controller, as your DAO is not supposed to know how the data it has gathered is going to be used.
Modularity is always a time-saving choice on the long-term.
Furthermore, note that you might later need a Business layer between your DAO and your controller, in order to preprocess and postprocess the data you are going to persist or display. In such a case, if you need to ask for data from your database, you will render it only after it has been processed by the Business layer; if the render were made in the DAO layer, that would not be possible.
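To make that split concrete, here is a hedged sketch based on the question's own file names (the 'users' template name is an assumption). Note the parentheses around (user_id, username) in the original SELECT are dropped: SELECT (a, b) returns a single composite column, which is exactly why db.any() produced those { row: '(1,John)' } objects.

// queries.js (DAO): no req/res here, just return the data promise
function getUsers() {
    return db.any('SELECT user_id, username FROM public.users ORDER BY user_id ASC LIMIT 1000');
}

// controllers/userController.js: ask the DAO, then render in the controller
exports.user_list = function (req, res, next) {
    getUsers()
        .then(function (users) {
            res.render('users', { users: users });
        })
        .catch(next);
};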
In the link I provided earlier to pg-promise's db object connection, you will also find documentation on the any() method. You might already have looked it up.
It specifically states that it returns
A promise object that represents the query result:
When no rows are returned, it resolves with an empty array.
When 1 or more rows are returned, it resolves with the array of rows.
so your returned data is a JS array, which is already a JS object. If you need a JSON string instead, use
JSON.stringify(yourArray) to process your data before rendering it in your controller.
But I wonder if Pug is not able to use your data directly.
Also, if you cannot get any data out of your DAO, maybe you should check that the resulting array is not empty, as such a case is tolerated by the any() method. If you expect your query to always return something, you might want to consider using the many() or one() methods instead.
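For example, a quick sketch assuming the same db object as in queries.js: one() rejects unless exactly one row comes back, so an empty result surfaces as an error instead of an empty array.

// Hypothetical: fetch a single user by id
db.one('SELECT user_id, username FROM public.users WHERE user_id = $1', [userId])
    .then(function (user) {
        // exactly one row is guaranteed here
    })
    .catch(function (err) {
        // also triggered when the query returns 0 rows or more than 1
    });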
I hope this helps you.

Does pg (node-postgres) automatically sanitize data

I am using node-postgres for a production application and I am wondering if there is anything I should be concerned about? Is the data sanitized automatically by node-postgres?
I couldn't find anything about it on the github page: https://github.com/brianc/node-postgres
Absolutely! The parameterized query support in node-postgres is first class. All escaping is done by the PostgreSQL server, ensuring proper behavior across dialects, encodings, etc. For example, this will not inject SQL:
client.query("INSERT INTO user(name) VALUES($1)", ["'; DROP TABLE user;"], function (err, result) {
// ...
});
This is from their documentation.
It basically depends on how you execute your queries, as @vitaly-t described.
Suppose you define the query in a string and execute it as follows:
var query = `SELECT * FROM table WHERE username='${username}' AND password='${password}'`;
pool.query(query, (error, results) => {
});
In this case, if I pass ' or 1=1; -- as the username and ' or 1=1; -- as the password, it will return all records from the table (meaning the SQL injection works).
But if I execute the following query:
pool.query('SELECT * FROM table where username=$1 and password=$2', [username, password], (error, results) => {
});
then the SQL injection will never work, because pg will sanitize the data.
So it depends on how you execute your queries.
It depends on how you execute your queries:
Formatting via prepared statements is executed by the server, which in turn protects your query from any SQL injection. But it has other restrictions: you cannot execute more than one query at a time, and you cannot provide sanitized entity names when needed.
Client-side query formatting, like the one implemented by pg-promise, sanitizes values, plus offers flexibility in formatting entity names and multiple queries.
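To illustrate the second approach, a minimal sketch using pg-promise's client-side formatting, assuming the db object from the earlier queries.js example; the :name filter escapes an SQL entity (table/column) name, which server-side prepared statements cannot parameterize:

// $1:name is formatted as an escaped entity name on the client side;
// $2 is formatted as an escaped value
db.any('SELECT * FROM $1:name WHERE username = $2', ['users', username])
    .then(function (rows) {
        // ...
    });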

Proper way to create indexes during deployment

I am creating an Express.js API and using MongoDB. I have a decent understanding of indexes, and I understand that they are expensive to create when there is already data in the database.
In MS SQL Server you would create indexes when creating your database tables. My question is: do I handle this creation of indexes in a POST call in my Express app, or do I achieve this using scripts when deploying my application?
For example I need Geospatial indexing.
Would index creation be handled in the express app like this?
// express post call
let col = db.collection('collection');
col.createIndex( /* someIndex */ );
col.insertOne( /* some document */ );
I am looking for the best method to creating the 'initial' state of my mongodb and specifically creating indexes I will need for certain collections before these collections contain any documents.
It may happen that you already have a lot of data in your database at deployment time, and you do not want the index build to get in the way. Here is where MongoDB can help: you can build the index in the background, which will not block read and write operations on the database while the index builds. A simple command:
db.collection.createIndex( { a: 1 }, { background: true } )
Check the manual for details: https://docs.mongodb.org/manual/tutorial/build-indexes-in-the-background/
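To answer the deployment side of the question, here is a hypothetical one-off script run during deployment, before the app starts serving traffic. The connection string, database, collection and field names are all assumptions, and the 2dsphere index stands in for the geospatial index mentioned:

// scripts/create-indexes.js -- run once during deployment
const { MongoClient } = require('mongodb');

async function main() {
    const client = await MongoClient.connect('mongodb://localhost:27017');
    const col = client.db('mydb').collection('locations');
    // createIndex is effectively idempotent: it is a no-op if an
    // identical index already exists, so re-running the script is safe
    await col.createIndex({ location: '2dsphere' }, { background: true });
    await client.close();
}

main().catch(console.error);

Keeping this out of the request handlers means the API never pays the index-build cost at runtime, which matches how you would script it for SQL Server.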
