Ensuring query restrictions are honored during SaveChanges - Breeze security

Consider a typical Breeze controller that limits the results of a query to entities that the logged-in user has access to. When the browser calls SaveChanges, does Breeze verify on the server that the entities reported as modified are from the original set?
To put it another way, does the EFContextProvider (in the case of Entity Framework) keep track of entities that have been handed out, so it can check against malicious data passed to SaveChanges? Or does BeforeSaveEntity need to validate that the user has access to the changed entities?

You must guard against malicious data in your BeforeSaveEntity or BeforeSaveEntities methods.
The idea that the EFContextProvider would keep track of entities that have already been handed out is probably something that we would NOT want to do, because:
1) The EFContextProvider would no longer be stateless, which was a design goal to facilitate scaling.
2) You would still need to guard against malicious data for "Added" entities in the BeforeXXX methods.
3) It is actually a valid use case for some of our users to "modify" entities without having first queried them.
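To make the point concrete, here is a rough sketch of the kind of stateless check that would live in BeforeSaveEntity or BeforeSaveEntities. It is written in TypeScript only to show the logic; the real hooks are C# methods on the EFContextProvider, and every name below (EntityChange, OwnerLookup, canSave, validateSaveBundle, the ownerId field) is hypothetical and not part of the Breeze API. It assumes each entity has an owner that can be looked up on the server.

```typescript
// Hypothetical shapes for illustration; not the Breeze API.
type EntityState = "Added" | "Modified" | "Deleted";

interface EntityChange {
  entityType: string;
  state: EntityState;
  id: number;
  ownerId: number; // owner claimed by the client payload: untrusted
}

// Returns the owner currently stored on the server, or undefined if there is no such row.
type OwnerLookup = (entityType: string, id: number) => number | undefined;

// Returns true only if the current user may save this change.
// Nothing is remembered between requests, so the check stays stateless.
function canSave(change: EntityChange, currentUserId: number, lookupOwner: OwnerLookup): boolean {
  if (change.state === "Added") {
    // New rows have no stored owner; the client may only create rows for itself.
    return change.ownerId === currentUserId;
  }
  // Modified/Deleted: trust the stored owner, never the one in the payload.
  const storedOwner = lookupOwner(change.entityType, change.id);
  if (storedOwner === undefined || storedOwner !== currentUserId) {
    return false;
  }
  // Also reject an attempt to hand the row over to someone else.
  return change.ownerId === currentUserId;
}

// Rejects the whole bundle if any single change fails the check.
function validateSaveBundle(changes: EntityChange[], currentUserId: number, lookupOwner: OwnerLookup): void {
  for (const change of changes) {
    if (!canSave(change, currentUserId, lookupOwner)) {
      throw new Error(`Not authorized to save ${change.entityType} ${change.id}`);
    }
  }
}
```

Note that the "Added" branch and the "modified without querying first" case are both covered without the server remembering which entities it handed out.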

Related

Validating domain object properties in the Application layer. Is it okay?

In DDD, the Application layer is supposed to just perform coordination tasks, whereas the Domain layer is responsible for validating the business rules.
My question is about validating the domain object properties. For example, I need to validate that a required property has some value in it before persisting it to the database through repositories.
In terms of DDD, is it acceptable to perform this sort of property validation in the Application layer?

Kinds of validation
In the situation you describe, there are two different validation steps that you need to consider separately:
1) Input validation. This is the responsibility of an app service. The goal is to ensure that no garbage or harmful data enters the system.
2) Protecting model invariants. This is your domain logic. Whenever something in the domain changes, you need to make sure that the changes are valid within your domain, i.e. all invariants still hold.
Validating domain invariants as part of an app service
Note that sometimes you also want to validate domain invariants in an app service. This could be necessary if you need to communicate invariant violations back to the client. Doing this in the domain would make your domain logic client-specific, which is not what you want.
In this situation, you need to take care that the domain logic does not leak into the app service. One way to overcome this problem and at the same time make a business rule accessible to both the domain and the app service is the Specification Pattern.
Here is an answer of mine to another question that shows an example implementation for the specification pattern.
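For readers who have not seen it, a minimal sketch of the Specification pattern might look like the following. It is not the implementation from the linked answer; the Customer class, the IsAdult rule and placeOrderService are all made-up names, and the "18 or older" rule is just a stand-in for a real business rule.

```typescript
// A specification wraps one business rule behind a yes/no question.
interface Specification<T> {
  isSatisfiedBy(candidate: T): boolean;
}

// Hypothetical domain object used only for this illustration.
class Customer {
  constructor(public readonly name: string, public readonly age: number) {}

  // Domain side: the entity enforces the rule and refuses to proceed if it fails.
  placeOrder(rule: Specification<Customer>): string {
    if (!rule.isSatisfiedBy(this)) {
      throw new Error("Invariant violated: customer may not place orders");
    }
    return `order placed for ${this.name}`;
  }
}

// Hypothetical rule; it lives in the domain layer.
class IsAdult implements Specification<Customer> {
  isSatisfiedBy(candidate: Customer): boolean {
    return candidate.age >= 18;
  }
}

// App-service side: asks the same rule up front to build a client-friendly
// response, without duplicating the rule's logic.
function placeOrderService(customer: Customer): { ok: boolean; message: string } {
  const rule = new IsAdult();
  if (!rule.isSatisfiedBy(customer)) {
    return { ok: false, message: "Customers must be 18 or older to order." };
  }
  return { ok: true, message: customer.placeOrder(rule) };
}
```

The rule is defined once, in the domain; the app service only decides how a violation is reported to the client.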

You can validate incoming data in your UI layer.
For example, you can use Symfony form validation, or just check for the necessary data inside your REST layer.
As for the domain layer, it depends.
You didn't specify what kind of domain object it is.
Mostly you do this kind of validation by creating a value object with the creation logic inside. For example, with an Email value object you can't create an invalid one; it will throw an exception instead.
Aggregates can perform validation before executing a method; these rules are called invariants. For example, a user has a method becomeVIP, and inside the method there is a constraint that only a user with the name 'Andrew' can become a VIP.
So you don't do validation after the action, but before the action. You don't let your aggregate go into a wrong state.
If you have logic which is not tied to a single aggregate, you put it in a domain service, for example an email uniqueness check.
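A small sketch of the two ideas above, a self-validating value object and an aggregate that checks its invariant before acting. It is in TypeScript rather than PHP, the class and method names are illustrative, and the regular expression is a deliberately simple assumption, not a complete email validator.

```typescript
// Value object: an instance can only exist if the address passed validation.
class Email {
  private constructor(public readonly value: string) {}

  static create(raw: string): Email {
    const trimmed = raw.trim();
    // Deliberately simple check for the sketch, not a full RFC 5322 validator.
    if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(trimmed)) {
      throw new Error(`Invalid email address: ${raw}`);
    }
    return new Email(trimmed.toLowerCase());
  }
}

// Aggregate: checks its invariant before changing state, never after.
class User {
  private vip = false;

  constructor(public readonly name: string, public readonly email: Email) {}

  get isVip(): boolean {
    return this.vip;
  }

  becomeVip(): void {
    // The toy rule from the text: only a user named 'Andrew' can become a VIP.
    if (this.name !== "Andrew") {
      throw new Error("Only Andrew can become a VIP");
    }
    this.vip = true;
  }
}
```

Because the Email constructor is private, the only way to obtain an Email is through create, so any User holding one is guaranteed to hold a validated address.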

Rather than "validating that a required property has some value in it" at the periphery of the Domain, I prefer to make sure that it can never become null in the Domain in the first place.
You can do that by forcing consumers of the constructors, factories and methods of that entity to always pass a value for the property.
That being said, you can also enforce it at the Application level and in the Presentation layer (most web application frameworks provide convenient ways of checking it these days). Better 2 or 3 verifications than one. But the domain should be the primary source of consistency.

OWIN with Identity 2 - avoid regularly hitting the database for common objects

I have just taken the plunge and started to learn the OWIN style of authorizing users into MVC applications. One issue I'm having is storing objects since the move away from session objects and into claims.
Traditionally what I would do is authenticate the user, and then store the User object in the session. This is useful when you are regularly using the data from that object all over the application.
Now that I have moved to OWIN with Identity, I instead store the UserId as a claim. I understand that the use of complex objects is best avoided with claims.
So I find that I'm regularly having to hit the database to read User information based on the UserId.
Here is how I am reading the UserId claim:
List<Claim> claims = HttpContext.Current.GetOwinContext().Authentication.User.Claims.ToList();
var ret = claims.FirstOrDefault(x => x.Type == StaffClaims.OrganisationId);
Is there a way that I can avoid taking this ID and reading the corresponding record from the DB each time? I want to achieve something like having the User object stored in memory somewhere.
Alternatively, does Entity Framework 6 allow caching so that I don't hit the database when repeating the same query (unless I know it has changed and should be re-read)?

First, storing the user object in the session is a hugely bad idea. Don't do that ever.
Second, you don't need to store the user id in a claim; you can get it anytime with User.Identity.GetUserId().
Third, Entity Framework does utilize caching, but not in a way I'd consider it as something you could rely on. If you want to cache something, then do it explicitly with System.Runtime.Caching. You can also utilize the OutputCache attribute on actions to cache the rendered view, which has the side effect of not requiring database calls to render it again.
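If you do decide to cache explicitly, the shape of the idea is roughly the following. This is a language-neutral sketch in TypeScript, not System.Runtime.Caching itself; UserRecord, UserCache and the loader callback are hypothetical names, and the expiry time is whatever you choose.

```typescript
// Hypothetical user record; stands in for whatever the database returns.
interface UserRecord {
  id: string;
  displayName: string;
}

interface CacheEntry {
  value: UserRecord;
  expiresAt: number; // epoch milliseconds
}

// Explicit, application-owned cache keyed by user id.
class UserCache {
  private readonly entries = new Map<string, CacheEntry>();

  constructor(
    private readonly load: (id: string) => UserRecord, // the real database read
    private readonly ttlMs: number,
    private readonly now: () => number = () => Date.now() // injectable clock for tests
  ) {}

  get(id: string): UserRecord {
    const hit = this.entries.get(id);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value; // still fresh: no database call
    }
    const value = this.load(id);
    this.entries.set(id, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }

  // Call this when you know the user changed and must be re-read.
  invalidate(id: string): void {
    this.entries.delete(id);
  }
}
```

The invalidate method covers the "unless I know it has changed" part of the question; everything else simply expires after the chosen time.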
Finally, this is not a big deal in the first place. Just fetch the user when you need it. Before you worry about this one simple query, there are probably 10,000 other areas of your application that could and should be optimized first.

Breeze.js - Securing IQueryable calls

I'm rather new at this, but I've come to understand the security risks of using Breeze to expose an IQueryable<>. Would someone please suggest to me some best practices (or merely some recommendations) for securing an IQueryable collection that's exposed in the JavaScript? Thanks.

I would not expose any data via IQueryable that should not be sent to the client via a random query. So a projection could be exposed or a DTO.
I'm not sure if this answers your question tho ... What "security risks" are you worried about?

I second this question, too. But to add some specifics along the questions that Ward asked:
In securing queryable services, two traditional issues come to mind:
1) Vertical security: Which items is the currently logged-in user (based on user identity or roles) NOT allowed to see in the UI? Those need to be removed from the queryable list. IMO, this can be done as part of the queryable ActionFilter magic by chaining some exclude logic on the returned IQueryable.
2) Horizontal security: Some models contain fields that are not appropriate for the logged-in user to see (and/or edit). This is more difficult to handle, as it's not a matter of just removing instances from the returned IQueryable. The returned class has a different shape, so it can be handled either by the JSON formatter omitting the fields based on security (which AFAIK screws up the Breeze metadata) or by returning a DTO, in which case, since the DTO doesn't exist in the metadata, it's not a full life cycle (updatable) class? (I am asking this, not stating it.)
I would like to see either built-in support or easy-to-implement recipes for number 2). Perhaps some sample code to amend the client-side metadata to make DTOs work perfectly fine commingled with model objects. The newest VS 2012 SPA templates (in the TodoList app) seem to push DTO variants of the model object both on the queryable and insert/update side. This is similar to the traditional MVC view models...
Finally - I'd add a request for auto-handling of the overposting security issue for inserts and updates. This is the reciprocal aspect of 2). Some users should not be able to edit certain fields.
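Until something like that is built in, the three concerns above can be handled by hand on the server. The sketch below is only an illustration of the logic in TypeScript; it does not use the Breeze or Web API types, and Employee, EmployeeDto, CurrentUser and the "hr" role are invented for the example.

```typescript
// Hypothetical model with one field (salary) that most users must not see or edit.
interface Employee {
  id: number;
  departmentId: number;
  fullName: string;
  salary: number;
}

// DTO exposed to ordinary users: same rows, narrower shape.
interface EmployeeDto {
  id: number;
  fullName: string;
}

interface CurrentUser {
  departmentId: number;
  roles: string[];
}

// Item 1): drop the rows this user may not see before anything is returned.
function visibleTo(user: CurrentUser, all: Employee[]): Employee[] {
  return all.filter((e) => e.departmentId === user.departmentId);
}

// Item 2): project to a DTO so restricted fields never leave the server.
function toDto(e: Employee): EmployeeDto {
  return { id: e.id, fullName: e.fullName };
}

// Overposting guard: copy only the fields this user's roles may edit.
function applyUpdate(user: CurrentUser, target: Employee, posted: Partial<Employee>): Employee {
  const updated: Employee = { ...target };
  if (posted.fullName !== undefined) {
    updated.fullName = posted.fullName;
  }
  if (posted.salary !== undefined && user.roles.indexOf("hr") !== -1) {
    updated.salary = posted.salary;
  }
  // id and departmentId are never taken from the client.
  return updated;
}
```

The whitelist in applyUpdate is the important part for overposting: fields are copied one by one, so anything extra in the posted payload is ignored rather than bound.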

DDD - can a repository fetch an aggregate by something other than its identifier?

I model a User as an aggregate root and a User is composed of an Identifier value object as well as an Email value object. Both value objects can uniquely identify a User, however the email is allowed to change and the identifier cannot.
In most examples of DDD I have seen, a repository for an aggregate root only fetches by identifier. Would it be correct to add another method that fetches by email to the repository? Am I modeling this poorly?

I would say yes, it is appropriate for a repository to have methods for retrieving aggregates by something other than the identity. However, there are some subtleties to be aware of.
The reason that many repository examples only retrieve by ID is based on the observation that repositories coupled with the structure of aggregates cannot fulfill all query requirements. For instance, if you have a query which calls for some fields from an aggregate as well as some fields for a referenced aggregate and some summary data, the corresponding aggregate classes cannot be used to represent this data. Instead, a dedicated read model is needed. Therefore, querying responsibilities are decoupled from the repository. This has several advantages (queries can be served by a dedicated de-normalized store) and it is the principal paradigm of CQRS. In this type of architecture, domain classes are only retrieved by the repository when some behavior needs to execute. All read-only use cases are served by read models.
The reason that I think it appropriate for a repository to have a GetByEmail method is based on YAGNI and battling complexity. You can allow your application to evolve as requirements change and grow. You don't need to jump to CQRS and separate read/write stores right away. You can start with a repository that also happens to have a query method. The only thing to keep in mind is that you should try to retrieve entities by ID when you need to invoke some behavior on those entities.
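As a sketch of what "a repository that also happens to have a query method" could look like, here is a minimal in-memory version in TypeScript. The names (UserAccount, UserRepository, InMemoryUserRepository) are illustrative, and a real implementation would query a database rather than scan a map.

```typescript
// Hypothetical aggregate: the identifier never changes, the email may.
class UserAccount {
  constructor(public readonly id: string, public email: string) {}

  changeEmail(newEmail: string): void {
    this.email = newEmail;
  }
}

// Repository contract: lookup by identity plus one extra query method.
interface UserRepository {
  getById(id: string): UserAccount | undefined;
  getByEmail(email: string): UserAccount | undefined;
  save(user: UserAccount): void;
}

// In-memory implementation, only to make the sketch runnable.
class InMemoryUserRepository implements UserRepository {
  private readonly byId = new Map<string, UserAccount>();

  getById(id: string): UserAccount | undefined {
    return this.byId.get(id);
  }

  getByEmail(email: string): UserAccount | undefined {
    // Scan by current email, so a changed address is found under its new value.
    for (const user of this.byId.values()) {
      if (user.email === email) {
        return user;
      }
    }
    return undefined;
  }

  save(user: UserAccount): void {
    this.byId.set(user.id, user);
  }
}
```

If the query needs grow beyond this, the GetByEmail method is a natural seam for moving reads to a separate read model later.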

I would put this functionality into a service / business layer that is specific to your User object. Not every object is going to have an Email identifier. This seems more like business logic than the responsibility of the repository. I am sure you already know this, but here is a good explanation of what I am talking about.
I would not recommend this, but you could have a specific implementation of your repository for a User that exposes a GetByEmail(string emailAddress) method, but I still like the service idea.

I agree with what eulerfx has answered:
"You need to ask yourself why you need to get the AR using something other than the ID."
I think it would be rather obvious that you do not have the ID but you do have some other unique identifier such as the e-mail address.
If you go with CQRS you need to first determine whether the data is important to the domain or only to the query store. If you require the data to be 100% consistent then it changes things slightly. You would, for instance, need 100% consistency if you are checking whether an e-mail address exists in order to satisfy the unique constraint. If the queried data is at any time stale you will probably run into problems.
Remember that a repository represents a collection of sorts. So if you do not need to actually operate on the AR (command side), but you have decided that using your domain here is appropriate, you could always add a ContainsEMailAddress method to the repository. Otherwise, you could also have a query side over your domain data store, since your domain data store (OLTP-type store) is 100% consistent, whereas your query store (OLAP-type store) may only be eventually consistent, as is typical of CQRS with a separate query store.

"In most examples of DDD I have seen, a repository for an aggregate root only fetches by identifier."
I'd be curious to know what examples you've looked at. According to the DDD definition, a Repository is
"A mechanism for encapsulating storage, retrieval, and search behavior which emulates a collection of objects."
Search obviously includes getting a root or a collection of roots by all sorts of criteria, not only their IDs.
Repository is a perfect place for GetCustomerByEmail(), GetCustomersOver18(), GetCustomersByCountry(...) and so on.

Would it be correct to add another method that fetches by email to the repository? - I would not do that. In my opinion a repository should have only methods for getting by id, save and delete.
I'd rather ask why you don't have the user id in the command handler in which you want to fetch the user and call a domain method on it. I don't know what exactly you are doing, but for the login/register scenario, I would do the following. When a user logs in, he passes an email address and a password, and you do a query to authenticate the user. This would not use the domain or the repository (that is just for commands), but some query implementation which would return a UserDto containing the user id; from this point on you have the user id. The next scenario is registration. The command handler to create a new user would create a new user entity, and then the user needs to log in.

CQRS/EventStore - Passing GUIDS to UI?

I am currently using J Oliver's EventStore which as I understand uses Guids for the Stream ID and this is what is used to build my aggregate root.
From a CQRS point of view and a DDD perspective, I should be thinking about the domain and not GUIDs.
So, if I do a GET (MVC client), am I right to say that my URL should have the identity of my domain object (aggregate root), and from that I get the GUID from the read store, which is then used to build my aggregate root from the event store?
Or should I pass the GUID around to my forms and pass them back as hidden form variables? At least this way I know the aggregate root id and do not have to query the read store?
I suppose the first way is the correct way (not using GUIDs in forms), as then all my GETs and POSTs deal with identities of domain objects rather than GUIDs which the client will not know.
I suppose this also allows me to build a REST-based API which focuses on resources and their identities rather than system-generated GUIDs. Please correct me if I am wrong.

In my opinion you are on the right track here. The UI should rely solely on the read model and not really care about the aggregates. Then when you need to send a command you can use the read model to get the id of the aggregate you are interested in. The read model should be very fast to read from anyway (that's the whole reason behind using different models for reads and writes) and very easy to cache if you need to. This will also give you much nicer URLs.
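Roughly, the flow could look like the sketch below. It is TypeScript rather than C#, it does not use the EventStore API, and every name in it (ProductView, RenameProduct, ProductReadModel, the slug field and the route in the comment) is made up for illustration.

```typescript
// Hypothetical read-model row: what the UI queries and renders.
interface ProductView {
  slug: string; // human-friendly identity used in URLs
  aggregateId: string; // GUID of the event stream, never shown in the URL
  title: string;
}

// Hypothetical command addressed to the aggregate by its GUID.
interface RenameProduct {
  kind: "RenameProduct";
  aggregateId: string;
  newTitle: string;
}

// Read side: a simple lookup keyed by slug.
class ProductReadModel {
  private readonly bySlug = new Map<string, ProductView>();

  add(view: ProductView): void {
    this.bySlug.set(view.slug, view);
  }

  findBySlug(slug: string): ProductView | undefined {
    return this.bySlug.get(slug);
  }
}

// For something like POST /products/{slug}/rename: resolve the slug through
// the read model, then build the command that targets the aggregate.
function buildRenameCommand(readModel: ProductReadModel, slug: string, newTitle: string): RenameProduct {
  const view = readModel.findBySlug(slug);
  if (view === undefined) {
    throw new Error(`Unknown product: ${slug}`);
  }
  return { kind: "RenameProduct", aggregateId: view.aggregateId, newTitle };
}
```

The GUID only appears once the command is built on the server, so neither the URL nor the form needs to carry it.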
