Modeling a hierarchical security with a graph - arangodb

I'm trying to create an access control list for my documents. Each document needs to store which groups have CRUD rights, well not create rights on a created document. The rights may be hierarchical in that a document might not have explicit rights, but rather inherit them from the account that owns the document.
For example a user can have lots of recipes. By default the user only allows his group to read the recipes. Think Linux style groups where each user gets their own group. This default is part of the user's profile. Now for 10 recipes the user allows the group "public" to read the recipes. These are explicit designations within the recipe document itself.
Is a graph a good fit here? Should I have a vertex for user and a vertex for recipe with an edge between them called, I don't know, OWNS? In the query I could say, in pseudocode, WHERE recipe.CRUD.readers contains "public" OR recipe's OWNS inbound profile.recipes.CRUD.readers contains "public". Is this doable? Would Arango perform well on this?
I know I could duplicate the default CRUD from the profile down into the recipes. This seems like a) a massive waste of memory, and b) error prone if the system gets out of sync (do transactions work across collections in a cluster?), and c) operationally intensive since ever document has to get scanned and possibly updated.

Related

DDD considering Repositories & Services for entities

I've been getting acquainted with DDD and trying to understand the way Entities and Aggregate Roots interact.
Below is the example of the situation:
Let's say there is a user and he/she has multiple email addresses (can have up to 200 for the sake of example). Each email address has it's own identity and so does the user. And there is one to many relationship between users and their email.
From the above example I consider Users and Emails as two entities while Users is the aggregate root
DDD Rules that I came across:
Rule: Only aggregate root has access to the repository.
Question 1: Does it mean that I cannot have a separate database table/collection to store the emails separately? Meaning that the emails have to be embedded inside the user document.
Rule: Entities outside the aggregate can only access other entities in the aggregate via the aggregate root.
Question 2: Now considering I do split them up into two different tables/collection and link the emails by having a field in email called associatedUserId that holds the reference to the user that email belongs to. I can't directly have an API endpoint like /users/{userId}/emails and handle it directly in the EmailService.getEmailsByUserId(String userId)? If not how do I model this?
I am sorry if the question seems a bit too naive but I can't seem to figure it out.
Only aggregate root has access to the repository
Does it mean that I cannot have a separate database table/collection to store the emails separately? Meaning that the emails have to be embedded inside the user document.
It means that there should be a single lock to acquire if you are going to make any changes to any of the member entities of the aggregate. That certainly means that the data representation of the aggregate is stored in a single database; but you could of course distribute the information across multiple tables in that database.
Back in 2003, using relational databases as the book of record was common; one to many relationships would normally involve multiple tables all within the same database.
Entities outside the aggregate can only access other entities in the aggregate via the aggregate root.
I can't directly have an API endpoint like /users/{userId}/emails and handle it directly in the EmailService.getEmailsByUserId(String userId)?
Of course you can; you'll do that by first loading the root entity of the User aggregate, then invoking methods on that entity to get at the information that you need.
A perspective: Evans was taking a position against the idea that the application should be able to manipulate arbitrary entities in the domain model directly. Instead, the application should only be allowed to the "root" entities in the domain model. The restriction, in effect, means that the application doesn't really need to understand the constraints that are shared by multiple entities.
Four or five years later cqrs appeared, further refining this idea -- it turns out that in read-only use cases, the domain model doesn't necessarily contribute very much; you don't need to worry about the invariants if they have already been satisfied and you aren't changing anything.
In effect, this suggests that GET /users/{userId}/emails can just pull the data out of a read-only view, without necessarily involving the domain model at all. But POST /users/{userId}/emails needs to demonstrate the original care (meaning, we need to modify the data via the domain model)
does this mean that I need to first go to the UserRepo and pull out the user and then pull out the emails, can't I just make a EmailService talking to an Email Repo directly
In the original text by Evans, repositories give access to root entities, rather than arbitrary entities. So if "email" is a an entity within the "user aggregate", then it normally wouldn't have a repository of its own.
Furthermore, if you find yourself fighting against that idea, it may be a "code smell" trying to bring you to recognize that your aggregate boundaries are in the wrong place. If email and user are in different aggregates, then of course you would use different repositories to get at them.
The trick is to recognize that aggregate design is a reflection of how we lock our data for modification, not how we link our data for reporting.

Class diagram for a mini JEE project

I want to start a project in JEE and I need to confirm about my class diagram. I need to know if the methods used are correct and if the composition I used is correct or not.
This is my class diagram:
The project is about an online sales store, that wants to set up a management tool to sell products, and to manage its products. This tool must include the following features:
Identification module: identification of clients, managers, supervisors
Sales module: make purchases for users
Product Management Module: Adding / Deleting Products
Statistical module: visualization of sales statistics
Functional Specifications
It is necessary to act on the application, to connect to the application with a user ID and password. To facilitate its use and in order to avoid any mishandling thereafter, here is the solution:
User Profile:
The user will be able to visualize the products sold by My Online Races. The user can place an order, provided that he has registered with the site My Online Races.
Manager profile:
The manager will be able to manage the products:
Add / Edit / Delete Products
Add / Edit / Delete category
These data insertions can be made using CSV or XML files, but also through various forms on the website.
The manager will be able to view the sales statistics.
Supervisor Profile:
The supervisor can add managers whose roles are specified above.
The supervisor will be able to view the sales statistics.
The supervisor will be able to view all the actions performed by the managers, a sort of audit trail.
Well I wish to know already if you have remarks about my design. As well as I have a confusion for several methods, for example adding, modifying and deleting a product. Should I put them in the manager or product class? Is the composition I put correct or should I remove it?
Quick review of the diagram and advices
First some minor remarks about class naming: Ordered should be called Order.
The composition between Article and Order is just wrong (not from a formal view, but from the meaning it conveys). Use a normal one-to-many association: it would reflect much better the real nature of the relation between the two classes. Please take into account that a new article may exist without having been ordered, so it shoud be 0..* instead of 1..*
+belongs and +do in the middle of an association are syntactically incorect. You should use a plain triangle instead (or nothing at all). The triangle should be oriented in the reading direction Person do |> Order and Article belongs to |> Category
The methods seem ok. You do not need to add a suffix.
How shall objects be managed (created/updated/deleted) ?
A more advanced concern is not about the diagram but about how you want to organise persistence (i.e. database storage):
do you really want the object to be an active record, that is an object that adds, updates and deletes itself (to the database) ? It's simple to set up, works well, but makes the class dependent on the underlying database implementation and thus makes maintenance more difficult;
or wouldn't it be better to use a repository for each object ? In this case the repository acts as a collection that manages all the database operations. The domain object (Article, order, User, ...) then have nothing to know about the database, wich leads to more maintainable code.
But this is a broader architectural question. If it's just for a first experimental project with JEE, you can very well use the active records. It's simpler to set up. Be sure however in this case to disambiguate the Add/Update/Delete on Person, since it currently may give the impression that any person can add anyone.
Improvement of the model
A final remark, again not about the diagram itself, is about the domain. Your model considers that an Order is about a single Article.
In reality however, orders are in general about one or several articles: if this would also be the case here, your Order would become an OrderItem and the real Order would be inserted between Person and OrderItem. You could then make the relation between Order and OrderItem a composition (i.e: OrderItem is owned by Order, which has responsibility for creating its items, and the items have no sense without the related order).

Designing Role/User based application security

I am sure this has been done by many of you and was looking for some guidance into designing a robust, scalable, application that can handle data (row level) security for multiple users.
We are looking at a system whereby users, some acting as individuals, and some as part of larger groups will need access to both their own data as well as to data shared with them by other individuals and organizations. In some cases the data is shared just to view and in other cases full edit permissions will be available.
And even within the same organization there will need to be restrictions on what users can see in terms of data created by other users in the organization.
So a very rough example may be...
User A with Role X, (individual user)
User B with Role Y (individual user)
User C with Role Z (individual user)
Users D and E with Roles X, in Organization AA
Users F and G with Roles Y in Organization AA
Each individual User might create for eg a contract on our system. That contract shoudl not be viewable by anyone else until they choose to share that contract with other individual users eg A will share his contract with User B or F.
But User A may also want to share his contract with all users of Role Y in Organization AA.
Likewise sharing the contract may even mean permission to edit the contract.
We initially had wanted to have a separate schema for each individual user to ensure security at a data level but this makes sharing more complicated and also may result in 1000's of schemas (which just doent seem like a good idea).
So it seems like our only resort is to leave all the data in one schema in one db and simply design a user and role driven application level security model which can accommodate all the required CRUD permissions on each contract. It just sounds like this is going to get very complicated and "not so pretty". For every contract we would have to define a list of users and roles by organization which with the individual permissions for each one of these users/roles.
Has anyone done anything like this? Any suggestions with regards to a good and secure application design?
Thanks
Here are some rough thoughts for this requirement
Since you are talking about data security control rather than system function/feature access control, so I will focus on data access control here.
Structure of Access Control Classes
You might want to treat all level of security atom as the same thing, e. g, it's generally defined in a way that individual / permission / role / group are inherited from the same basic principle class, and then conceptually, all access control definition will be on this basic principle, hence you don't need to know whether the control is over a role, a group or a user, this will give you more flexibility and simples the question.
You might also want to define the principles as kind of combination / hierarchy structure, for example
User A is assigned RoleA, RoleB, RoleC and PERMISSION2
Role A is a combination of RoleC and RoleD
RoleC is a combination of permission PERMISSION1, PERMISSION2 etc.
BTW: Since the design here is very flexible so you will need a system level conflict check mechanism to check
The conflict between all the principles upon any assignment to make sure there's no system level conflict(for example, roleA defines that the user can not update document of Type A, while PERMISSION2 defines it can), another possible method is to define a priority for different type of principle, for example, defined level USER > ROLE > PERMISSION.
And also a function level(SOX law etc) conflict check to make sure there's no function level conflict, for example, someone capable to do payment can not have access to operate on cash etc)
Access for Documents
Then a schema such as document / principle / restrict type / restrict operation / restrict start and thru date could be defined to store data access control for each document,
Document could be a contract, a purchase order, purchase request etc, it's actual type doesn't matter at all here.
Principle could be a user / permission / role / group etc, also you don't want to focus on the actual principle type here, by this design, only one row of data is needed for those normal documents(since we could define it on role level), and also if the user choose to share this document with a particular organization or individual, it will also be able to handle.
Restrict type might be "ALLOW" / "NOT_ALLOW" etc,
Restrict operation could be "VIEW" / "UPDATE" / "ADMIN" etc,
Restrict start and thru date will be the date range this access control take effectives, by combine this thru date, operation and restrict type we defined, you can combine exceptional access control with normal access control, hence to implement requirements like share / delegation will be easier, and it also didn't bring too much complicate for normal access control.
Organizations
If your organization has a hierarchy structure, you can also define how the access control to be inherited for parent and to child organizations, a sample schema might be,
Organization Id / Acccess inherent Type / Start and thru date.
Organization id: primary key of this organization
Access inherent type: Could be something like: ALLOW_ALL_CHILD / ALLOW_ALL_PARENT / ALLOW_ALL_SAME_LEVEL /DENY_ALL etc.
Start and thru date could be the date this role takes effective and ends.
Something you might be able to implement could be:
For organization A, all it's subsidiaries is not allowed to view / update any documents under it.
For organization B, all it's subsidiaries is allowed to have view / update permissions to all documents under it, but not to the documents on the same hierarchy level.
Then you can combine this organization level access control and the document level access control, when evaluating the rules, the document level access control always have a higher priority than organization level ones.

Best practice for managing POSIX group membership with additional attributes in LDAP

We are currently designing a new member administration for our study association in LDAP, using OpenLDAP 2.3.31 distributed with Debian. One of the requirements is that we need to record which members are in committees. To this end, we have created a subclass of organizationalUnit, called x-FMFCommittee and a subclass of organizationalRole caled x-FMFCommitteeRole. Now, all roles (such as president, treasurer, etc.) that are associated with a committee are located in the subtree of the committee entry. The DN of the members that take on this role are then set as a roleOccupant attribute on the x-FMFCommitteeRole. This works perfectly fine.
However, we also give our users access to a Linux shell and Windows desktops (using Samba) and for this we would also like to administer POSIX group membership. To this end, every committee we want to do this for, also has the objectclass posixGroup (as per RFC2307). In order to get the group members and the groups associated to a user, we need to set the memberUid attribute on the committee entry whenever a member occupies a role in the committee.
We have tried doing this using the dynlist overlay, but this fails when doing reverse group lookups. The only option we see now is to do this all manually, but we really would like to automate this, to provide easier maintenance in the future as administrators tend to change often at our association.
Has anybody come across a similar scenario before, or does anyone know a solution to this problem?
I would suggest that you simply get rid of the first implementation and just use posixGroup. Database denormalization is always a bad idea, whatever form it takes.
And you don't need to extend schemas for this problem. If you want to distinguish these committees just put them them in their own subtree.
But I'd like more detail on why using a dynamic list doesn't work. You could use the memberOf overlay instead of having to do reverse lookups.

Modelling Access Control in MongoDB

Does anyone have an example of modelling access control in MongoDB? The situation I'm thinking of is:
There are a set of resources, each being their own document (e.g. cars, people, trees etc.).
A user can gain access to a resource through an explicit grant, or implicitly by being the owner of a resource, existing in another collection (e.g. a role) or some other implicit ways.
In one collection.find() method, that could have skip and limit options applied (for pagination), is there a way to check all these explicit and implicit paths and produce a result of resources a user has access to?
In MySQL we have modelled this using a grants table with resource id, granting user id, authorized user id and operation (read, write etc.). We then, in one query, select all resources where at least one subquery is true, and the subqueries then check all the different paths to access e.g. one checks for a grant, one checks for ownership etc.
I just can't wrap my head around doing this in MongoDB, I'm not sure if it's even possible...
Thanks
You can't query more than one document at a time. Ideally, shouldn't access control be a part of the business logic. Your backend php/c#/language ought to ensure that the current request is authorized. If so, then simply query the requested document.
If you feel, you need to implement the exact same structure in mongodb, which I suggest you don't, then you will need to embed all those fields (the ones from the other mysql tables that help you identify whether the request is authorized) in each and every document of every collection. You will be duplicating data (denormalizing it). Which brings the headache of ensuring that all the copies are updated and have the same value.
Edit 1:
Lets talk about Car document. To track its owner, you will have owner property (this will contain the _id of the owner document). To track all users who can 'use' (explicit grant) the car, you will have an array allowerdDrivers (this will contain the _id of each user document). Lets assume the current user making the request belong to the 'admin' role. The user document will have an array applicableRoles that store the _id of each role document applicable.
To retrieve all cars that the user has access to, you only need to make two queries. One to fetch his roles. If he is an admin, return ALL cars. If he is not, then make another query where owner equals his id or allowedDrivers contains his id.
I understand your actual use case may be more complicated, but chances are there is a document-oriented way of solving that. You have to realize that the way data is modelled in documents is vastly different from how you would model it in a RDbMS.
Doing it in business logic would be painfully slow and inefficient.
How so? This is business logic, if user a owns post b then let them do the action (MVC style), otherwise don't.
That sounds like business logic to me and most frameworks consider this business logic to be placed within the controller action (of the MVC paradigm); i.e. in PHP Yii:
Yii::app()->roles->hasAccess('some_view_action_for_a_post', $post)
I think that by doing it in the database end you have confused your storage layer with your business layer.
Also with how complex some role based permission actions can get the queries you commit must be pretty big with many sub selects. Considering how MySQL creates and handles result sets (sub selects ARE NOT JOINS) I have a feeling these queries do not scale particularly well.
Also you have to consider when you want to change the roles, or a function that defines a role, that can access a certain object you will have to change your SQL queries directly instead of just adding the role to a roles table and assigning the object properties for that role and assigning users that role (AKA code changes).
So I would seriously look into how other frameworks in other languages (and your own) do their RBAC because I think you have blurred the line and made your life quite hard with what you have done, in fact here might be a good place to start: Group/rule-based authorization approach in node.js and express.js

Resources