Access Control in a Web Application - security

I'm currently reading a lot about access control possibilites/mechanisms that can be used to protect resources in an application or web application. There's ACL, RBAC, ABAC and many other concepts out there.
Assuming I have developed a simple webservice that returns knowledgebase articles on a route like '/api/article'. The 'controller' connects to the database and fetches all articles and returns them as XML or JSON.
Now I would like to have control over which article in the database is accessible for which user or group. So for instance if user 'peter' accesses the route '/api/article' with his credentials, the webservice shall return only articles that are visible for 'peter'.
I would want to use ACL to control what each user/group can read/write/delete. But what I don't quite understand:
Where does one enforce the access control? Do I just fetch all records in the controller if a user accesses the route '/api/articles' and check each single record against an access control list (that doesn't sound very good performance wise)? Or is there a way that the 'SELECT' statement to the database only return the records that can actually be seen by that specific user?
I really tried hard to find more information on that topic, and there is a lot about different access control mechanisms, but not about where and how the actual enforcement happens...and it even get's more complex if it comes to other actions like modification, deletion and so on...

This is really a matter of implementation and everyone does it its own way. It also depends on the nature of the data, particularly on the size of your authorization (do your have 5 roles and users are attached to them or does each user have a specific set of articles he can access, different for each user - for instance)
If you do not want to make the processing on the controller, you could store the authorization information in your database, in a table which links a user to a set of KB articles, or a role (which would then be reflected in the article). In that case your SELECT query would just pass the authenticated user ID. This requires that the maintenance of the relationship is done of the database, which may not be obvious.
Alternatively you can store the ACL on the controller and build a query from there - for specific articles or groups of articles.
Getting all the articles and checking them on the controller is not a good idea (as you mention), DBs have been designed also to avoid such issues.

Related

How to ease CouchDB read/write restrictions on _users database

In my couchapp two databases are being used
1 Is for application data
2 Is "_users" database.
In my application In one form I'm trying to implement autocomplete where data source is a "view" created in "_users" database.
Now when I login with normal user id other than admin. While trying to access the view inside "_users" database I'm getting the error 403 which is :
{"error":"forbidden","reason":"Only admins can access design document actions for system databases."}
Is it possible for me to allow and limit the access for non admin users to that view only ? So I can get the list of users from _users database into my application.
I've never been able to do many tasks that require much custom with CouchDB by itself. I've always needed a script somewhere else that gives me the info that I need.
What works for me is this setup:
A gatekeeper Sinatra app that has admin access to my CouchDB
Using CouchDB's config to proxy to my Sinatra app. httpd_global_handlers _my_service {couch_httpd_proxy, handle_proxy_req, <<"http://127.0.0.1:9999">>}
The reason for the proxy is because any request that comes through to your gatekeeper will have the AuthSession token set. Inside your gatekeeper, you can GET localhost:5984/_session passing the AuthSession cookie along, it will tell you who is making the request, allowing you to look them up and see if they have access, or just give everyone access to whatever you like. Another reason for the proxy is to avoid any CORS nonsense since you're making the request to yourserver:5984/_my_service.
Update
A purely client-side/javascript solution means that it will be fundamentally insecure at some point, since well, everything is on the client-side. But perhaps your application, doesn't need to be that secure. That's up to you.
One workaround could be to make your application authenticate as a predefined admin, and then create more admin users that way. You could authenticate once when your application boots or on an as needed basis.
The "problem" is that CouchDB sees the _users database as fundamentally special, and doesn't give you the opportunity to change the credential requirements like other databases. Normally you would be able to use the _security document to give role based or user based access. But that's not possible with _users.
An alternative implementation might be to keep track of your own users and forgo the _users database altogether. In that case you could set your own cookies and have your own login and logout methods that don't depend on CouchDB's authentication scheme. You could query your own _view/users because it would be in your main database. Things wouldn't be locked down tight but they would work fine as long as no one was interested in hacking your system. :)

CouchDB Security in a Lightweight Stack?

I'm working on a hobbyist project in order to become more familiar with CouchDB. This is my first time working with CouchDB. For this project, my goal is to investigate whether or not it's possible to build a web application with nothing but HTML, CSS, JavaScript, CouchDB, and nginx (i.e., I'm not hosting any of my code in Couch, just data).
It could be that this is highly impractical, but I'd first like to explore all the options in this stack.
At the moment, my biggest questions are about security. Let's say that I have a few databases in CouchDB, each corresponding to a hosted site. For this example, we'll focus on a single database-- i.e., a single site. Some of the content from this site should be available to everyone, even anonymous users; and other stuff should only be available to users with a certain role. What are my options, and how secure are each of them?
I've come up with a few ideas so far, and this is the one I was planning to work on over the weekend:
Add users and roles to /{site_db}/_security.
According to the Couch documentation, doing so will require any request for data in {site_db} to be from an authenticated user.
Add a user called anon, which will only have one role, which is anon.
When the user first visits the site, my JS model will check the status of the current session (GET /_session).
If no session exists, the JS model will authenticate using the anon account.
Define views in my design document.
Any views that should only be available to non-anonymous users should check the roles on the userCtx object.
Validation of any newly-created documents should check userCtx to see if the user's role is on the whitelist.
This seems like it should work, although I can't help thinking it's overly complex, and that there must be a better way. Also, I'm not sure how to prevent the anon user from updating his own user document to add more roles.
If you don't like CouchDB security model, you can implement yours easily in your reverse proxy instead.
Here is an example with Apache but it seems to be very similar in nginx.

How do I limit a CouchApp's access to CouchDB's `_user` table?

I'd like to get some advice on how to structure the user data for my CouchDB application.
Here's what we're building: We are creating a suite of applications (mostly leveraging the video api of html5) that train people on different skills. We're going to start with a few simple video lectures coupled with interactive activities. We'd like to save individual user's progress, and in the near future we'd like to create some mini-courses that users take in small and large groups. Multiple users would participate in the same activity either live (like google docs), or over some long duration (like a wiki).
My concern is with private user data. Ultimately it would be the simplest, for us and the user, for them to sign up with their email address and a password. But, in CouchDB's _user database read access to the data is essentially public, and I'd rather not make all my user's email addresses public. How to make those email addresses private is my biggest concern. In addition it would be nice if we could have all user data be private unless the user chooses to make it public.
I have thought of a few options, and read through this article and many others on the wiki: http://wiki.apache.org/couchdb/PerDocumentAuthorization
I'm really not leaning toward anything yet, and would love some advice.
In the soon-to-be-released CouchDB-1.2.0, documents in the _users database can only be read by the respective authenticated user and administrators.

CQRS when reading with permissions for large data set

I am trying to understand how the read side of CQRS can work with a large document management application (videos/pdf files/ etc) that we are writing.
We want to show a list of all documents which the user has edit permission on (i.e. show all the documents the user can edit).There could be 10,000s of documents that a particular user could edit.
In general I have read that the a single "table" (flat structure) should suffice for most screens and with permissions you could have a table per role.
How would I design my read model to allow me to quickly get the documents that I can edit for a specific user?
Currently I can see a table holding holding my documents, another holding the users and another table that links the "editing" role between the user and the documents. So I am doing joins to get the data for this screen.
Also, there could be roles for deleting, viewing etc.
Is this the correct way in this case?
JD
You can provide a flat table that has a user id along with the respective denormalized document information.
SELECT * FROM documents_editable_by_user WHERE UserId = #UserId
SELECT * FROM documents_deletable_by_user WHERE UserId = #UserId
SELECT * FROM documents_visible_for_user WHERE UserId = #UserId
But you could even dynamically create a table/list per user in your read model store. This becomes quite easy once you switch from a SQL-based read store to NoSQL (if you haven't already.)
Especially when there are tens of thousands of documents visible for or editable by a user, flattened tables can give a real performance boost compared to joins.
When I had a read model that took the form of a filtering-search-form (pun not intended), I used rhino-security as the foundation of an authorization service.
I configured the system so that the authorization service's tables got pushed through SQL Server's pub-sub system and SQL Server Agent, to the clients that were partially displaying the denormalized data - I then let Rhino.Security join the authorization model together into the read model, on a per-user basis.
Because I essentially never wrote to the read model's authorization tables from the read model, we got a nice encapsulation on the authorization service's database and logic, because authorization was only changed through that service, and it was globally unique and specific (consistent) to that service. This meant that our custom GUIs for handling advanced (hierarchial entities, user groups, users, permissions, per-entity-permissions) authorization requirements could still do CRUD against this authorization model and that would be pushed in soft real time to any read model.

Authorization System Design Question

I'm trying to come up with a good way to do authentication and authorization. Here is what I have. Comments are welcome and what I am hoping for.
I have php on a mac server.
I have Microsoft AD for user accounts.
I am using LDAP to query the AD when the user logs in to the Intranet.
My design question concerns what to do with that AD information. It was suggested by a co-worker to use a naming convention in the AD to avoid an intermediary database. For example, I have a webpage, personnel_payroll.php. I get the url and with the URL and the AD user query the AD for the group personnel_payroll. If the logged in user is in that group they are authorized to view the page. I would have to have a group for every page or at least user domain users for the generic authentication.
It gets more tricky with controls on a page. For example, say there is a button on a page or a grid, only managers can see this. I would need personnel_payroll_myButton as a group in my AD. If the user is in that group, they get the button. I could have many groups if a page had several different levels of authorizations.
Yes, my AD would be huge, but if I don't do this something else will, whether it is MySQL (or some other db), a text file, the httpd.conf, etc.
I would have a generic php funciton IsAuthorized for the various items that passes the url or control name and the authenticated user.
Is there something inherently wrong with using a naming convention for security like this and using the AD as that repository? I have to keep is somewhere. Why not AD?
Thank you for comments.
EDIT: Do you think this scheme would result in super slow pages because of the LDAP calls?
EDIT: I cannot be the first person to ever think of this. Any thoughts on this are appreciated.
EDIT: Thank you to everyone. I am sorry that I could not give you all more points for answering. I had to choose one.
I wonder if there might be a different way of expressing and storing the permissions that would work more cleanly and efficiently.
Most applications are divided into functional areas or roles, and permissions are assigned based on those [broad] areas, as opposed to per-page permissions. So for example, you might have permissions like:
UseApplication
CreateUser
ResetOtherUserPassword
ViewPayrollData
ModifyPayrollData
Or with roles, you could have:
ApplicationUser
ApplicationAdmin
PayrollAdmin
It is likely that the roles (and possibly the per-functionality permissions) may already map to data stored in Active Directory, such as existing AD Groups/Roles. And if it doesn't, it will still be a lot easier to maintain than per-page permissions. The permissions can be maintained as user groups (a user is either in a group, so has the permission, or isn't), or alternately as a custom attribute:
dn: cn=John Doe,dc=example,dc=com
objectClass: top
objectClass: person
objectClass: webAppUser
cn: John Doe
givenName: John
...
myApplicationPermission: UseApplication
myApplicationPermission: ViewPayrollData
This has the advantage that the schema changes are minimal. If you use groups, AD (and every other LDAP server on the planet) already has that functionality, and if you use a custom attribute like this, only a single attribute (and presumably an objectClass, webAppUser in the above example) would need to be added.
Next, you need to decide how to use the data. One possibility would be to check the user's permissions (find out what groups they are in, or what permissions they have been granted) when they log in and store them on the webserver-side in their session. This has the problem that permissions changes only take effect at user-login time and not immediately. If you don't expect permissions to change very often (or while a user is concurrently using the system) this is probably a reasonable way to go. There are variations of this, such as reloading the user's permissions after a certain amount of time has elapsed.
Another possibility, but with more serious (negative) performance implications is to check permissions as needed. In this case you end up hitting the AD server more frequently, causing increased load (both on the web server and AD server), increased network traffic, and higher latency/request times. But you can be sure that the permissions are always up-to-date.
If you still think that it would be useful to have individual pages and buttons names as part of the permissions check, you could have a global "map" of page/button => permission, and do all of your permissions lookups through that. Something (completely un-tested, and mostly pseudocode):
$permMap = array(
"personnel_payroll" => "ViewPayroll",
"personnel_payroll_myButton" => "EditPayroll",
...
);
function check_permission($elementName) {
$permissionName = $permMap[$elementName];
return isUserInLdapGroup($user,$permissionName);
}
The idea of using AD for permissions isn't flawed unless your AD can't scale. If using a local database would be faster/more reliable/flexible, then use that.
Using the naming convention for finding the correct security roles is pretty fragile, though. You will inevitably run into situations where your natural mapping doesn't correspond to the real mapping. Silly things like you want the URL to be "finbiz", but its already in AD as "business-finance" - do you duplicate the group and keep them synchronized, or do you do the remapping within your application...? Sometimes its as simple as "FinBiz" vs "finbiz".
IMO, its best to avoid that sort of problem to begin with, e.g, use group "185" instead of "finbiz" or "business-finance", or some other key that you have more control over.
Regardless of how your getting your permissions, if end up having to cache it, you'll have to deal with stale cache data.
If you have to use ldap, it would make more sense to create a permissions ou (or whatever the AD equivalent of "schema" is) so that you can map arbitrary entities to those permissions. Cache that data, and you should ok.
Edit:
Part of the question seems to be to avoid an intermediary database - why not make the intermediary the primary? Sync the local permissions database to AD regularly (via a hook or polling), and you can avoid two important issues 1) fragile naming convention, 2) external datasource going down.
You will have very slow pages this way (it sounds to me like you'll be re-querying AD LDAP every time a user navigates to figure out what he can do), unless you implement caching of some kind, but then you may run into volatile permission issues (revoked/added permissions on AD while you didn't know about it).
I'd keep permissions and such separate and not use AD as the repository to manage your application specific authorization. Instead use a separate storage provider which will be much easier to maintain and extend as necessary.
Is there something inherently wrong with using a naming convention for security like
this and using the AD as that repository? I have to keep is somewhere. Why not AD?
Logically, using groups for authorization in LDAP/AD is just what it was designed for. An LDAP query of a specific user should be reasonably fast.
In practice, AD can be very unpredictable about how long data changes take to replicate between servers. If someday your app ends up in a big forest with domain controllers distributed all over the continent, you will really regret putting fine-grained data into there. Seriously, it can take an hour for that stuff to replicate for some customers I've worked with. Mysterious situations arise where things magically start working after servers are rebooted and the like.
It's fine to use a directory for 'myapp-users', 'managers', 'payroll' type groups. Try to avoid repeated and wasteful LDAP lookups.
If you were on Windows, one possibility is to create a little file on the local disk for each authorized item. This gives you 'securable objects'. Your app can then impersonate the user and try to open the file. This leverages MS's big investment over the years on optimizing this stuff. Maybe you can do this on the Mac somehow.
Also, check out Apache's mod_auth_ldap. It is said to support "Complex authorization policies can be implemented by representing the policy with LDAP filters."
I don't know what your app does that it doesn't use some kind of database for stuff. Good for you for not taking the easy way out! Here's where a flat text file with JSON can go a long way.
It seems what you're describing is an Access Control List (ACL) to accompany authentication, since you're going beyond 'groups' to specific actions within that group. To create an ACL without a database separate from your authentication means, I'd suggest taking a look at the Zend Framework for PHP, specifically the ACL module.
In your AD settings, assign users to groups (you mention "managers", you'd likely have "users", "administrators", possibly some department-specific groups, and a generic "public" if a user is not part of a group). The Zend ACL module allows you to define "resources" (correlating to page names in your example), and 'actions' within those resources. These are then saved in the session as an object that can be referred to to determine if a user has access. For example:
<?php
$acl = new Zend_Acl();
$acl->addRole(new Zend_Acl_Role('public'));
$acl->addRole(new Zend_Acl_Role('users'));
$acl->addRole(new Zend_Acl_Role('manager'), 'users');
$acl->add(new Zend_Acl_Resource('about')); // Public 'about us' page
$acl->add(new Zend_Acl_Resource('personnel_payroll')); // Payroll page from original post
$acl->allow('users', null, 'view'); // 'users' can view all pages
$acl->allow('public', 'about', 'view'); // 'public' can only view about page
$acl->allow('managers', 'personnel_payroll', 'edit'); // 'managers' can edit the personnel_payroll page, and can view it since they inherit permissions of the 'users' group
// Query the ACL:
echo $acl->isAllowed('public', 'personnel_payroll', 'view') ? "allowed" : "denied"; // denied
echo $acl->isAllowed('users', 'personnel_payroll', 'view') ? "allowed" : "denied"; // allowed
echo $acl->isAllowed('users', 'personnel_payroll', 'edit') ? "allowed" : "denied"; // denied
echo $acl->isAllowed('managers', 'personnel_payroll', 'edit') ? "allowed" : "denied"; // allowed
?>
The benefit to separating the ACL from the AD would be that as the code changes (and what "actions" are possible within the various areas), granting access to them is in the same location, rather than having to administrate the AD server to make the change. And you're using an existing (stable) framework so you don't have to reinvent the wheel with your own isAuthorized function.

Resources