cakePHP and authorization for CRUD operations - security

I have a cakephp 1.3 application and I have run into a 'data leak' security hole. I am looking for the best solution using cake and not just something that will work. The application is a grade tracking system that lets teachers enter grades and students can retrieve their grades. Everything is working as expected but when I started to audit security I found that the basic CRUD operations have leaks. Meaning that student X can see student Y's grades. Students should only see their own grades. I will limit this questions to the read operation.
Using cake, I have a grade_controller.php file with this view function:
function view($id = null) {
// Extra, not related code removed
$this->set('grade', $this->grade->read(null, $id));
}
And
http://localhost/grade/view/5
Shows the grade for student $id=5. That's great. But if student #5 manipulates the URL and changes it to a 6, person #6's grades are shown. The classic data leak security hole.
I had two thoughts on the best way to resolve this. 1) I can add checks to every CRUD operations called in the controller. Or 2) add code to the model (for example using beforeFind()) to check if person X has access to that data element.
Option #1 seems like it is time consuming and error prone.
Option #2 seem like the best way to go. But, it required calling find() before some operations. The read() example above never executes beforeFind() and there is no beforeRead() callback.
Suggestions?

Instead of having a generic read() in your controller, you should move ALL finds, queries..etc into the respective model.
Then, go through each model and add any type of security checks you need on any finds that need to be restricted. 1) it will be much more DRY coding, and 2) you'll better be able to manage security risks like this since you know where all your queries are held.
For your example, I would create a getGrade($id) method in my Grade model and check the student_id field (or whatever) against your Auth user id CakeSession::read("Auth.User.id");
You could also build some generic method(s) similar to is_owner() to re-use the same logic throughout multiple methods.

If CakePHP supports isAuthorized, here's something you could do:
Create a column, that has the types of users (eg. 'student', 'teacher', ...)
Now, it the type of User is 'student', you can limit their access, to view only their data. An example of isAuthorized is as follows. I am allowing the student to edit only their profile information. You can extend the concept.
if ((($role['User']['role'] & $this->user_type['student']) == $this->user_type['student']) {
if (in_array($this->action, array('view')) == true) {
$id = $this->params->pass[0];
if ($id == $user_id) {
return (true);
}
}
}
}

Related

Where to put outside-aggregate validation?

I've got a question regarding outside-aggregate validation.
In our domain partner can place orders that contain certain products (1).
Once order is placed (2) he can mark it as paid (3) in our system.
Once order is marked as paid (4) we assign licences to products in external library service (5).
Once we know licences are assigned (6) we close entire saga.
Here's a small drawing illustrating the process:
At this moment besides commands, command handlers and events there are two domain classes that are involved in entire process:
Order aggregate containing business logic
Order saga coordinating entire process and assigning licences
Now, there is one invariant that is not modelled in this process yet - before we mark order as paid we have to check if user does not already have particular licence assigned. We get this from library service as well.
Where would you put this validation? Command handler? Wrap Order in some domain service? Pass some validator to Order constructor?
class Order
{
public function __construct(OrderValidator $validator)
{
if (!$validator->isValid($this)) {
throw new \DomainException();
}
// else proceed
}
}
class OrderValidator
{
private $libraryServiceClient;
public function isValid(Order $order)
{
// check licence using $libraryServiceClient
}
}
As far as I understood the problem is in step 3 (Mark order as payed). In this step we need a user (let's call it payer) that marks the order as payed. So when creating this payer object (using factory maybe) we need to know if he is allowed to mark an order as payed. In order to get this information a call should be made to the external library.
What I suggest is to have an application service that have ->markOrderAsPayed($orderId, $payerUserId)
This method will make a call to 2 domain services. One for getting the payer and one for marking the order as payed.
$payer = $this->payerService->getPayer($payerUserId);
$this->orderService->payOrder($orderId, $payer);
In the getPayer() function you should make a call to the external library to know how many licences the payer have.
I hope this will be helpful, it is just based on what I understood from the questions and comments.

Referencing external doc in CouchDB view

I am scraping an 90K record database using JSON-RPC and I am trying to put in some basic error checking. I want to start by scraping the database twice using two different settings and adding a prefix to the second scrape. This way I can check to ensure that the two settings are not producing different records (due to dropped updates, etc). I wanted to implement the comparison using a view which compares each document from the first scrape with it's twin produced by the second scrape and then emit the names of records with a difference between them.
However, I cannot quite figure out how to pull in another doc in the view, everything I have read only discusses external docs using the emit() function, which is too late to permit me to compare it. In the example below, the lookup() function would grab the referenced document.
Is this just not possible?
function(doc) {
if(doc._id.slice(0,1)!=='$' && doc._id.slice(0,1)!== "_"){
var otherDoc = lookup('$test" + doc._id);
if(otherDoc){
var keys = doc.value.keys();
var same = true;
keys.forEach(function(key) {
if ((key.slice(0,1) !== '_') && (key.slice(0,1) !=='$') && (key!=='expires')) {
if (!Object.equal(otherDoc[key], doc[key])) {
same = false;
}
}
});
if(!same){
emit(doc._id, 1);
}
}
}
}
Context
You are correct that this is not possible in CouchDB. The whole point of the map function is that it must be idempotent, otherwise you lose all the other nice benefits of a pre-calculated index.
This is why you cannot access external resources in the map function, whether they be other records or the clock. Any time you run a map you must always get the same result if you put the same record into it. Since there are no relationships between records in CouchDB, you cannot promise that this is possible.
Solution
However, you can still achieve your end goal, just be different means. Some possibilities...
Assuming there is some meaningful numeric value in each doc, you could use a view to take the sum of all those values and group them by which import you did ({key: <batch id>, value: <meaningful number>}). Then compare the two numbers in your client or the browser to see if they match.
A brute force approach would be to use a view to pair the docs that should match. Each doc is on a different row, but they're grouped by a common field. Then iterate through the entire index comparing the pairs. This would certainly be the quickest to code and doesn't depend on your application or data.
Implement a validation function to enforce a schema on your data. Just be warned that this will reduce your write throughput since each written record will be piped out of Erlang and into the JS engine. Also, this is only applicable if you're worried about properly formed records instead of their precise content, which might not be the case.
Instead of your different batch jobs creating different docs, have them place them into the same doc. The structure might look like this: { "_id": "something meaningful", "batch_one": { ..data.. }, "batch_two": { ..data.. } } Then your validation function could compare them or you could create a view that indexes all the docs that don't match. All depends on where in your pipeline you want to do the error checking and correction.
Personally I like the last option better, but only if you don't plan to use the database as is in production. Ie., you wouldn't want to carry around all that extra data in each record.
Hope that helps.
Cheers.

CouchDB: Single document vs "joining" documents together

I'm tryting to decide the best approach for a CouchApp (no middleware). Since there are similarities to my idea, lets assume we have a stackoverflow page stored in a CouchDB. In essence it consists of the actual question on top, answers and commets. Those are basically three layers.
There are two ways of storing it. Either within a single document containing a suitable JSON representation of the data, or store each part of the entry within a separate document combining them later through a view (similar to this: http://www.cmlenz.net/archives/2007/10/couchdb-joins)
Now, both approaches may be fine, yet both have massive downsides from my current point of view. Storing a busy document (many changes through multiple users are expected) as a signle entity would cause conflicts to happen. If user A stores his/her changes to the document, user B would receive a conflict error once he/she is finished typing his/her update. I can imagine its possible to fix this without the users knowledge through re-downloading the document before retrying.
But what if the document is rather big? I'll except them to become rather blown up over time which would put quite some noticeable delay on a save process, especially if the retry process has to happen multiple times due to many users updating a document at the same time.
Another problem I'd see is editing. Every user should be allowed to edit his/her contributions. Now, if they're stored within one document it might be hard to write a solid auth handler.
Ok, now lets look at the multiple documents approach. Question, Answers and Comments would be stored within their own documents. Advantage: only the actual owner of the document can cause conflicts, something that won't happen too often. Being rather small elements of the whole, redownloading wouldn't take much time. Furthermore the auth routine should be quite easy to realize.
Now here's the downside. The single document is real easy to query and display. Having a lot of unsorted snippets laying around seems like a messy thing since I didn't really get the actual view to present me with a 100% ready to use JSON object containing the entire item in an ordered and structured format.
I hope I've been able to communicate the actual problem. I try to decide which solution would be more suitable for me, which problems easier to overcome. I imagine the first solution to be the prettier one in terms of storage and querying, yet the second one the more practical one solvable through better key management within the view (I'm not entirely into the principle of keys yet).
Thank you very much for your help in advance :)
Go with your second option. It's much easier than having to deal with the conflicts. Here are some example docs how I might structure the data:
{
_id: 12345,
type: 'question',
slug: 'couchdb-single-document-vs-joining-documents-together',
markdown: 'Im tryting to decide the best approach for a CouchApp (no middleware). Since there are similarities to...' ,
user: 'roman-geber',
date: 1322150148041,
'jquery.couch.attachPrevRev' : true
}
{
_id: 23456,
type: 'answer'
question: 12345,
markdown: 'Go with your second option...',
user : 'ryan-ramage',
votes: 100,
date: 1322151148041,
'jquery.couch.attachPrevRev' : true
}
{
_id: 45678,
type: 'comment'
question: 12345,
answer: 23456,
markdown : 'I really like what you have said, but...' ,
user: 'somedude',
date: 1322151158041,
'jquery.couch.attachPrevRev' : true
}
To store revisions of each one, I would store the old versions as attachments on the doc being edited. If you use the jquery client for couchdb, you get it for free by adding the jquery.couch.attachPrevRev = true. See Versioning docs in CouchDB by jchris
Create a view like this
fullQuestion : {
map : function(doc) {
if (doc.type == 'question') emit([doc._id, null, null], null);
if (doc.type == 'answer') emit([doc.question, doc._id, null], null);
if (doc.type == 'comment') emit([doc.question, doc.answer, doc._id], null) ;
}
}
And query the view like this
http://localhost:5984/so/_design/app/_view/fullQuestion?startkey=['12345']&endkey=['12345',{},{}]&include_docs=true
(Note: I have not url encoded this query, but it is more readable)
This will get you all of the related documents for the question that you will need to build the page. The only thing is that they will not be sorted by date. You can sort them on the client side (in javascript).
EDIT: Here is an alternative option for the view and query
Based on your domain, you know some facts. You know an answer cant exist before a question existed, and a comment on an answer cant exist before an answer existed. So lets make a view that might make it faster to create the display page, respecting the order of things:
fullQuestion : {
map : function(doc) {
if (doc.type == 'question') emit([doc._id, doc.date], null);
if (doc.type == 'answer') emit([doc.question, doc.date], null);
if (doc.type == 'comment') emit([doc.question, doc.date], null);
}
}
This will keep all the related docs together, and keep them ordered by date. Here is a sample query
http://localhost:5984/so/_design/app/_view/fullQuestion?startkey=['12345']&endkey=['12345',{}]&include_docs=true
This will get back all the docs you will need, ordered from oldest to newest. You can now zip through the results, knowing that the parent objects will be before the child ones, like this:
function addAnswer(doc) {
$('.answers').append(answerTemplate(doc));
}
function addCommentToAnswer(doc) {
$('#' + doc.answer).append(commentTemplate(doc));
}
$.each(results.rows, function(i, row) {
if (row.doc.type == 'question') displyQuestionInfo(row.doc);
if (row.doc.type == 'answer') addAnswer(row.doc);
if (row.doc.type == 'comment') addCommentToAnswer(row.doc)
})
So then you dont have to perform any client side sorting.
Hope this helps.

How do I design a couchdb view for following case ?

I am migrating an application from mySQL to couchDB. (Okay, Please dont pass judgements on this).
There is a function with signature
getUserBy($column, $value)
Now you can see that in case of SQL it is a trivial job to construct a query and fire it.
However as far as couchDB is concerned I am supposed to write views with map functions
Currently I have many views such as
get_user_by_name
get_user_by_email
and so on. Can anyone suggest a better and yet scalable way of doing this ?
Sure! One of my favorite views, for its power, is by_field. It's a pretty simple map function.
function(doc) {
// by_field: map function
// A single view for every field in every document!
var field, key;
for (field in doc) {
key = [field, doc[field]];
emit(key, 1);
}
}
Suppose your documents have a .name field for their name, and .email for their email address.
To get users by name (ex. "Alice" and "Bob"):
GET /db/_design/example/_view/by_field?include_docs=true&key=["name","Alice"]
GET /db/_design/example/_view/by_field?include_docs=true&key=["name","Bob"]
To get users by email, from the same view:
GET /db/_design/example/_view/by_field?include_docs=true&key=["email","alice#gmail.com"]
GET /db/_design/example/_view/by_field?include_docs=true&key=["name","bob#gmail.com"]
The reason I like to emit 1 is so you can write reduce functions later to use sum() to easily add up the documents that match your query.

How do I setup my POCO's with Subsonic 3 and the SimpleRepostitory? or where is the convention?

Is there someplace that details how to setup your POCO's when using the SimpleRepository with SubSonic 3? It sounds like it's convention over configuration, but I can't find where that convention is explained.
http://www.subsonicproject.com/docs/Conventions looks like it was meant for 2.0, and is also marked incomplete. (BTW: I'd love to help reorganize the docs into more a 2.0 and 3.0 as the current docs are a bit confusing on which version they are referring to.)
For instance I'd like to know how I'd go about setting up a
one-to-one relationship
User <=> Profile
class User {
Id
ProfileId instead of Profile? or is Profile profile possible?
}
class Profile {
Id
UserId instead of User? or is User user possible?
}
One-to-many relationship
class User {
Id
IList<Post> Posts (?) or IList<int> PostIds (?) or is this implied somehow? or is this just wrong?
}
class Post {
Id
UserId instead of User? or is User user possible?
}
Many-to-many
I'm guessing I'd need to setup a many to many table?
class User {
IList<Blog> Blogs (?) or IList<int> BlogIds (?) or is this implied somehow?
}
class BlogsUsers { // Do I have to create this class?
UserId
BlogId
}
class User {
IList<User> Users (?) or IList<int> UserIds (?) or is this implied somehow?
}
In the example solution it doesn't seem like these are set so I'm wondering how you'd go about doing a (my guess proceeds example):
one-to-one
User.Profile
r.Single<Profile>(p=>p.User == userId);
parent on one-to-many
Post.User
id = r.Single<Post>(postId).UserId;
r.Single<User>(id); // which kind of stinks with two queries, JOIN?
children on one-to-many
User.Posts
r.Find<Post>(p=>p.UserId == userId)
or many-to-many
User.Blogs
ids = r.Find<BlogsUsers>(bu=>bu.UserId == userId);
r.Find<Blog>(b=>b.BlogId == ids); // again with the two queries? :)
Blog.Users
ids = r.Find<BlogsUsers>(bu=>bu.BlogId == blogId);
r.Find<User>(u=>u.UserId == ids); // again with the two queries? :)
I would assume that there's got to be a way to not have the two queries and for these properties to already be autogenerated in some way. Truth be told though I did only have an hour to play with everything last night so I am a little afraid of Rob yelling at me. I'm SORRY! :P
If these are not autogen'd then where are views and store procedures for 3.0? Please give me a link for those as well while you're at it fellow SO'er.
This is probably your best place to start:
http://subsonicproject.com/docs/Using_SimpleRepository
Relationships are setup in code, by you, and we don't carry those forward to the DB (yet - hopefully soon). Ideally you setup your model as you need to and when you're ready you go and manually "solidify" your relationships in the DB. This is to reduce friction during development - data integrity isn't really something to worry about when building a site.
That said, I know people want this feature - I just need to build it.

Resources