Protecting sensitive entity data - domain-driven-design

I'm looking for some advice on architecture for a client/server solution with some peculiarities.
The client is a fairly thick one, leaving the server mostly to persistence, concurrency and infrastructure concerns.
The server contains a number of entities which contain both sensitive and public information. Suppose, for example, that the entities are persons: assume that the social security number and name are sensitive, while the age is publicly viewable.
When starting the client, the user is presented with a number of entities, not disclosing any sensitive information. At any time the user can choose to log in and authenticate against the server; if authentication is successful, the user is granted access to the sensitive information.
The client is hosting a domain model and I was thinking of implementing this as some kind of "lazy loading", where the first request instantiates the entities and a later request refreshes them with sensitive data. The entity getters would throw exceptions on sensitive information that has not been disclosed, e.g.:
class PersonImpl : PersonEntity
{
    private bool undisclosed;

    public override string SocialSecurityNumber
    {
        get
        {
            if (undisclosed)
                throw new UndisclosedDataException();
            return base.SocialSecurityNumber;
        }
    }
}
Another, friendlier approach could be to have a value object indicating that the value is undisclosed:

get
{
    if (undisclosed)
        return undisclosedValue;
    return base.SocialSecurityNumber;
}
Some concerns:
What if the user logs in and then out? The sensitive data has already been loaded, but must be concealed once again.
One could argue that this type of functionality belongs within the domain and not in some infrastructural implementation (i.e. the repository implementations).
As always when dealing with a large number of properties, there's a risk that this type of functionality clutters the code.
Any insights or discussion is appreciated!

I think that this is actually a great example of using View Models. Your concern seems directly related to the consumption of the entities, because of the data that they contain. Instead of passing your entities all the way up to the UI, you could restrict them to live within the domain only - i.e. no entities are passed into or out of the domain at all, with most/all activities done with a command/query approach on the repositories. Repositories would then return a view model instead of the entity.
So how/why does this apply? You could actually have two different view models. One for authenticated and one for non-authenticated users. You expose the actual values for the sensitive data in the authenticated view model and not for the non-authenticated one. You could have them derived from a common interface, and then code against the interface instead of the object type. For your concrete implementation of the non-authenticated user, you can just populate the non-sensitive data, leaving the sensitive getters to do what you want them to do.
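A minimal sketch of that two-view-model idea, with hypothetical type and property names, might look like this:

public interface IPersonViewModel
{
    int Age { get; }
    string SocialSecurityNumber { get; }
}

// For anonymous users: only the non-sensitive data is populated.
public class AnonymousPersonViewModel : IPersonViewModel
{
    public int Age { get; set; }

    // The sensitive getter returns a placeholder instead of throwing.
    public string SocialSecurityNumber
    {
        get { return "<undisclosed>"; }
    }
}

// For authenticated users: the real values are exposed.
public class AuthenticatedPersonViewModel : IPersonViewModel
{
    public int Age { get; set; }
    public string SocialSecurityNumber { get; set; }
}

The presenter then codes against IPersonViewModel and never needs to know which of the two it received.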
My opinion on a couple of points:
I am not a fan of lazy loading in entities. Lazy loading is a data access responsibility and not really part of the model. For me, it is a first-class member of the things I vehemently avoid in my domain, along with paging and sorting. As for how to relate these items together, I would rather loosely couple the objects via ID pointers to other entities. If I want/need the data contained by one of these entities, then I can load it. It is kind of like lazy loading in a way, but I enforce that it never happens in the domain model itself by doing this.
I am not a fan of throwing exceptions on getters. Setters, on the other hand, are fine. I look at it this way: the entity should always be in a valid state. Getters will not impact the state of the entity - setters will. Throwing on a setter is enforcing the integrity of the model. Using the two-view-model approach would allow me to move the logic to the presenter. So, I could basically do something like "if the user is of type non-authorized, do this; otherwise do something else". Since what you are referring to would ultimately be a case of how the data is presented to the user, and not important to the model, I think it fits well. In general, I use nullable types for properties that can be null and do not enforce anything on the getters, as it is not usually part of their responsibility. Instead, I use roles to determine which view model to use.
The obvious drawback is that more coding is required to use the view models, but it comes with the obvious benefit of decoupling presentation and views from the domain. It will also help in unit/integration testing, where you can verify that a certain view model cannot return a certain type of data.
However, you can use something akin to AutoMapper (depending on what your platform is) to help in populating your view model from your entities.
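For instance, a rough AutoMapper sketch; MapperConfiguration and CreateMapper are AutoMapper's standard entry points, but the surrounding types (the view models above, and a Person entity exposing Age and SocialSecurityNumber) are the hypothetical ones from this answer:

using AutoMapper;

public static class PersonViewModelMapper
{
    public static IPersonViewModel ToViewModel(Person person, bool authenticated)
    {
        // In real code the configuration would be created once and reused.
        var config = new MapperConfiguration(cfg =>
        {
            cfg.CreateMap<Person, AuthenticatedPersonViewModel>();
            cfg.CreateMap<Person, AnonymousPersonViewModel>();
        });
        IMapper mapper = config.CreateMapper();

        // Pick the view model based on the caller's role.
        return authenticated
            ? (IPersonViewModel)mapper.Map<AuthenticatedPersonViewModel>(person)
            : mapper.Map<AnonymousPersonViewModel>(person);
    }
}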

I made the mistake of posting the question without creating an OpenId so it looks like I'll have to comment here(?).
First of all, thanks for taking your time to answer - it certainly has more to do with how the data is presented than with how the model works. However, I feel the need to clarify a few things.
The domain model / entities are never referenced directly from the UI. I'm using a variant of the DM-V-VM pattern for UI/business model separation. For lazy loading and repository implementation in general, I have entity implementations in an infrastructure layer where things like serialization, dirty tracking and lazy loading are handled.
So the domain layer has entities like:
class Entity
{
    public virtual string SocialSecurityNumber { get; }
}
And the infrastructure layer adds some other functionality to be able to update and restore entities from a server:
class EntityImpl : Entity
{
    bool isDirty;
    bool isLoaded;

    // Provide the means to set the value on deserialization
    override string SocialSecurityNumber;
}
So the lazy loading behavior would be implemented in the infrastructure layer and never seen by the domain layer.
I agree that throwing on getters wouldn't be nice, but my concern is how an anonymous view model would retrieve the data. As of now, to retrieve a list of entities, the view model would hold a reference to a domain repository. Should I have two repositories, one for authenticated (and therefore disclosed) entities and another one for unauthenticated users - maybe even two different entities?


Repository Add and Create methods

Why are repositories' .Add methods usually implemented as accepting the instance of the entity to add, with the .Id already "set" (although it can be set again via reflection), when that should be the repository's responsibility?
Wouldn't it be better to implement it as .CreateAndAdd?
For example, given a Person entity:
public class Person
{
    public Person(uint id, string name)
    {
        this.Id = id;
        this.Name = name;
    }

    public uint Id { get; }
    public string Name { get; }
}
why are repositories usually implemented as:
public interface IRepository<T>
{
    Task<T> AddAsync(T entity);
}
and not as:
public interface IPersonsRepository
{
    Task<Person> CreateAndAddAsync(string name);
}
why are repositories usually implemented as...?
A few reasons.
Historically, domain-driven design is heavily influenced by the Eric Evans book that introduced the term. There, Evans proposed that repositories provide collection semantics, providing "the illusion of an in-memory collection".
Adding a String, or even a Name, to a collection of Person doesn't make very much sense.
More broadly, figuring out how to reconstitute an entity from a set of parameters is a separate responsibility from storage, so perhaps it doesn't make sense to go there (note: a repository often ends up with the responsibility of reconstituting an entity from some stored memento, so it isn't completely foreign, but there's usually an extra abstraction, the "Factory", that really does the work).
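A tiny sketch of that separation, reusing the Person class from the question (the factory names are illustrative):

// Reconstitution is the factory's responsibility; storage is the repository's.
public interface IPersonFactory
{
    Person FromMemento(uint id, string name);
}

public class PersonFactory : IPersonFactory
{
    public Person FromMemento(uint id, string name)
    {
        return new Person(id, name);
    }
}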
Using a generic repository interface often makes sense, as interacting with individual elements of the collection via retrieve/store operations shouldn't require a lot of custom crafting. Repositories can support custom queries for different kinds of entities, though, so it can be useful to call that out specifically:

public interface IPersonRepository : IRepository<Person>
{
    // Person-specific queries go here
}
Finally, the id... and the truth of it is that identity, as a concept, has a whole lot of "it depends" baked into it. In some cases, it may make sense for the repository to assign an id to an entity -- for instance, using a unique key generated by the database. Often, you'll instead want to have control of the identifier outside of the repository. Horses for courses.
There already is a great answer on the question, I just want to add some of my thoughts. (It will contain some duplication from the previous answer, so if this is a bad thing just let me know and I'll remove it :) ).
The responsibility of ID generation can belong to different parts of an organization or a system.
Sometimes the ID will be generated according to some special rules, like a Social Security Number. This number can be used as the ID of a Person in a system, so before creating a Person entity this number will have to be generated by a specific SSNGenerator service.
We can use a randomly generated ID like a UUID. UUIDs can be generated outside of the Repository and assigned to the entity during creation; the Repository will only store (add, save) it to the DB.
IDs generated by databases are very interesting. You can have sequential IDs as in an RDBMS, UUID-ish IDs as in MongoDB, or some hash. In this case the responsibility of ID generation is assigned to the DB, so it can happen only after the Entity is stored, not when it's created. (I'm allowing myself some freedom here, as you can generate it before saving a transaction or read the last one, etc., but I like to generalize and avoid discussing cases with race conditions and collisions.) This means that your Entity doesn't have an identity before the save completes. Is this a good thing? Of course, it depends. :)
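To make the UUID option concrete, a small sketch (assuming a Person variant whose constructor takes a Guid, plus the generic repository interface from the question):

// The caller generates the identity up front; the repository only stores.
public async Task<Person> RegisterPersonAsync(IRepository<Person> repository, string name)
{
    var person = new Person(Guid.NewGuid(), name); // identity exists before persistence
    return await repository.AddAsync(person);      // AddAsync merely stores; it assigns nothing
}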
This problem is a great example of leaky abstractions.
When you implement a solution, sometimes the technology used will affect it. You will have to deal with the fact that, for example, the ID is generated by your database, which is part of your infrastructure (if you have defined such a layer in your code). You can also avoid this by using a UUID even if you use an RDBMS, but then you have to join (again, technology-specific stuff :) ) on these IDs, so sometimes people like to use the default.
Instead of having Add or CreateAndAdd you can have a Save method that does the same thing; it's just a different term that some people prefer. The repository is indeed often defined as an "in-memory collection", but that doesn't mean we have to stick to that strictly (it can be a good thing to do most of the time, but still...).
As mentioned, if your database generates IDs, the Repository seems like a good candidate to assign them (before or after storing), because it is the one talking to the DB.
If you are using events, the way you generate IDs can affect things. For example, let's say you want to have a UserRegisteredEvent with the UserID as a property. If you are using the DB to generate IDs, you will have to store the User first and then create and store/dispatch the event, or do something of the sort. On the other hand, if you generate the ID beforehand, you can save the event and the entity together (in a transaction or in the same document, it doesn't matter). Sometimes this can get tricky.
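A sketch of the second ordering (every name here is hypothetical): because the ID is generated up front, the entity and its event can be stored together.

public class RegistrationService
{
    private readonly IUnitOfWork _unitOfWork;
    private readonly IUserStore _users;
    private readonly IEventStore _events;

    public RegistrationService(IUnitOfWork unitOfWork, IUserStore users, IEventStore events)
    {
        _unitOfWork = unitOfWork;
        _users = users;
        _events = events;
    }

    public async Task RegisterUserAsync(string name)
    {
        var userId = Guid.NewGuid();                // the ID is known before any storage happens
        var user = new User(userId, name);
        var evnt = new UserRegisteredEvent(userId); // the event can already carry the ID

        // Both writes commit or fail together.
        using (var tx = await _unitOfWork.BeginTransactionAsync())
        {
            await _users.AddAsync(user);
            await _events.AppendAsync(evnt);
            await tx.CommitAsync();
        }
    }
}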
Background, experience with technologies and frameworks, and exposure to terminology in literature, school and work all affect how we think about things and what terminology sounds better to us. Also, we (most of the time) work in teams, and this can affect how we name things and how we implement them.
Using Martin Fowler's definition:
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes. Conceptually, a Repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer.
A Repository gives an Object Oriented view of the underlying Data (which may be otherwise stored in a relational DB). It's responsible for mapping your Table to your Entity.
Generating an ID for an object is a whole different responsibility, which is not trivial and can get quite complex. You may decide to generate the ID in the DB or in a separate service. Regardless of where the ID is generated, a Repository should seamlessly map it between your Entity and your Table.
ID generation is a responsibility of its own, and if you add it to the Repository, then you are moving away from Single Responsibility Principle.
A side note here: using a GUID for an ID is a terrible idea, because GUIDs are not sequential. They only meet the uniqueness requirement of an ID, but they are not helpful for searching through the database index.

What is the purpose of child entity in Aggregate root?

[ Follow up from this question & comments: Should entity have methods and if so how to prevent them from being called outside aggregate ]
As the title says: I am not clear about the actual/precise purpose of an entity as a child in an aggregate.
According to what I've read in many places, these are the properties of an entity that is a child of an aggregate:
It has identity local to aggregate
It cannot be accessed directly but through aggregate root only
It should have methods
It should not be exposed from aggregate
In my mind, that translates to several problems:
Entity should be private to aggregate
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity are duplicated on entity)
So, why do we have an entity at all instead of Value Objects only? It seems much more convenient to have only value objects, all methods on the aggregate, and to expose value objects (which we already do when copying entity info).
PS.
I would like to focus on a child entity of an aggregate, not on collections of entities.
[UPDATE in response to Constantin Galbenu's answer & comments]
So, effectively, you would have something like this?
public class Aggregate
{
    ...
    private SomeNestedEntity _someNestedEntity;

    public SomeNestedEntityImmutableState EntityState
    {
        get
        {
            return this._someNestedEntity.getState();
        }
    }

    public void ChangeSomethingOnNestedEntity(params)
    {
        this._someNestedEntity.someCommandMethod(params);
    }
}
You are thinking about data. Stop that. :) Entities and value objects are not data. They are objects that you can use to model your problem domain. Entities and Value Objects are just a classification of things that naturally arise if you just model a problem.
Entity should be private to aggregate
Yes. Furthermore all state in an object should be private and inaccessible from the outside.
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
No. We don't expose information that is already available. If the information is already available, that means somebody is already responsible for it. So contact that object to do things for you, you don't need the data! This is essentially what the Law of Demeter tells us.
"Repositories" as often implemented do need access to the data, you're right. They are a bad pattern. They are often coupled with ORM, which is even worse in this context, because you lose all control over your data.
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity are duplicated on entity)
The trick is, you don't have to. Every object (class) you create is there for a reason: as described previously, to create an additional abstraction and model a part of the domain. If you do that, an "aggregate" object that exists on a higher level of abstraction will never want to offer the same methods as the objects below it. That would mean that there is no abstraction whatsoever.
This use-case only arises when creating data-oriented objects that do little else than holding data. Obviously you would wonder how you could do anything with these if you can't get the data out. It is however a good indicator that your design is not yet complete.
Entity should be private to aggregate
Yes. And I do not think it is a problem. Continue reading to understand why.
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
No. Make your aggregates return the data that needs to be persisted and/or raised in an event from every method of the aggregate.
A raw example; the real world would need a more fine-grained response, and maybe the performMove function would need to use the output of game.performMove to build proper structures for persistence and the eventPublisher:
public void performMove(String gameId, String playerId, Move move) {
    Game game = this.gameRepository.load(gameId); // Game is the AR
    List<Event> events = game.performMove(playerId, move); // do something
    // events contain the IDs of the entities, so the persistence layer can
    // apply each event and save the changes using those IDs and the changed
    // data carried in the event
    persistence.apply(events);
    this.eventPublisher.publish(events); // notify the rest of the system that something happened
}
Do the same with inner entities. Let the entity return the data that changed because of its method call, including its ID; capture this data in the AR and build the proper output for persistence and the eventPublisher. This way you do not even need to expose a public read-only ID property on the entity to the AR, nor does the AR expose its internal data to the application service. This is the way to get rid of getter/setter bag objects.
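A rough C# sketch of that idea (all names hypothetical): the inner entity describes its own change, and the AR just collects and returns it, so neither needs to expose getters.

using System;
using System.Collections.Generic;

public abstract record DomainEvent;
public record PieceMoved(Guid BoardId, string From, string To) : DomainEvent;
public record Move(string From, string To);

public class Board // inner entity
{
    private readonly Guid _id = Guid.NewGuid();

    public IReadOnlyList<DomainEvent> Apply(Move move)
    {
        // ...mutate internal state here, then describe the change, ID included.
        return new DomainEvent[] { new PieceMoved(_id, move.From, move.To) };
    }
}

public class Game // aggregate root
{
    private readonly Board _board = new Board();

    public IReadOnlyList<DomainEvent> PerformMove(string playerId, Move move)
    {
        // Game-level rules would be checked here before delegating.
        return _board.Apply(move);
    }
}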
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity are duplicated on entity)
Sometimes the business rules to check and apply belong exclusively to one entity and its internal state, and the AR just acts as a gateway. That is OK, but if you find this pattern occurring too much, it is a sign of wrong AR design. Maybe the inner entity should be the AR instead, maybe you need to split the AR into several ARs (one of them being the old inner entity), etc. Do not be afraid of having classes that just have one or two methods.
In response to dee zg's comments:
What does persistence.apply(events) precisely do? Does it save the whole aggregate or the entities only?
Neither. Aggregates and entities are domain concepts, not persistence concepts; you can have a document store, a column store, a relational DB, etc. that does not need to match your domain concepts one to one. You do not read aggregates and entities from persistence; you build aggregates and entities in memory with data read from persistence. The aggregate itself does not need to be persisted; that is just a possible implementation detail. Remember that the aggregate is just a construct to organize business rules; it is not meant to be a representation of state.
Your events have context (user intents) and the data that has changed (along with the IDs needed to identify things in persistence), so it is incredibly easy to write an apply function in the persistence layer that knows what to execute (e.g. which SQL instructions, in the case of a relational DB) in order to apply the event and persist the changes.
Could you please provide an example of when and why it is better (or even inevitable?) to use a child entity instead of a separate AR referenced by its ID as a value object?
Why do you design and model a class with state and behaviour?
To abstract, encapsulate, reuse, etc.; basic SOLID design. If the entity has everything needed to ensure the domain rules and invariants for an operation, then the entity is the AR for that operation. If you need extra domain rule checks that cannot be done by the entity (i.e. the entity does not have enough inner state to accomplish the check, or the check does not naturally fit into the entity and what it represents), then you have to redesign; sometimes that could mean modelling an aggregate that does the extra domain rule checks and delegates the other checks to the inner entity, and sometimes it could mean changing the entity to include the new things. It is too dependent on the domain context, so I cannot say that there is a fixed redesign strategy.
Keep in mind that you do not model aggregates and entities in your code. You just model classes with behaviour to check domain rules, plus the state needed to do those checks and respond with the changes. These classes can act as aggregates or entities for different operations. These terms are used just to help communicate and understand the role of the class in each operation context. Of course, you can be in a situation where the operation does not fit into an entity, and you could model an aggregate with a V.O. persistence ID, and that is OK (sadly, in DDD, without knowing the domain context almost everything is OK by default).
Do you want some more enlightenment from someone who explains things much better than me? (Not being a native English speaker is a handicap for these complex issues.) Take a look here:
https://blog.sapiensworks.com/post/2016/07/14/DDD-Aggregate-Decoded-1
http://blog.sapiensworks.com/post/2016/07/14/DDD-Aggregate-Decoded-2
http://blog.sapiensworks.com/post/2016/07/14/DDD-Aggregate-Decoded-3
It has identity local to aggregate
In a logical sense, probably, but concretely implementing this with the persistence means we have is often unnecessarily complex.
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
Not necessarily, you could have read-only entities for instance.
The repository part of the problem was already addressed in another question. Reads aren't an issue, and there are multiple techniques to prevent write access from the outside world but still allow the persistence layer to populate an entity directly or indirectly.
So, why do we have an entity at all instead of Value Objects only?
You might be somewhat hastily putting concerns in the same basket which really are slightly different:
Encapsulation of operations
Aggregate level invariant enforcement
Read access
Write access
Entity or VO data integrity
Just because Value Objects are best made immutable and don't enforce aggregate-level invariants (they do enforce their own data integrity though) doesn't mean Entities can't have a fine-tuned combination of some of the same characteristics.
These questions that you have do not exist in a CQRS architecture, where the Write model (the Aggregate) is different from a Read model. In a flat architecture, the Aggregate must expose read/query methods, otherwise it would be pointless.
Entity should be private to aggregate
Yes, in this way you are clearly expressing the fact that they are not for external use.
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
The Repositories are a special case and should not be seen in the same way as Application/Presentation code. They could be part of the same package/module; in other words, they should be able to access the nested entities.
The entities can be viewed/implemented as objects with an immutable ID and a Value Object representing their state, something like this (in pseudocode):
class SomeNestedEntity
{
    private readonly ID;
    private SomeNestedEntityImmutableState state;

    public getState() { return state; }
    public someCommandMethod() { state = state.mutateSomehow(); }
}
So you see? You could safely return the state of the nested entity, as it is immutable. There would be some problem with the Law of Demeter, but this is a decision that you would have to make: if you break it by returning the state, the code is simpler to write at first, but the coupling increases.
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity are duplicated on entity)
Yes, and this protects the Aggregate's encapsulation and also permits the Aggregate to protect its invariants.
I won't write too much. Just an example: a car and a gear. The car is the aggregate root, and the gear is a child entity.
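A minimal sketch of that example (hypothetical, of course):

public class Gear // child entity: its identity is local to the car
{
    public int Number { get; private set; }

    public void ShiftTo(int number)
    {
        Number = number;
    }
}

public class Car // aggregate root
{
    private readonly Gear _gear = new Gear();

    public void ShiftUp()
    {
        // The invariant (say, a maximum of 6 gears) lives in the root,
        // and nobody can reach the gear except through the car.
        if (_gear.Number < 6)
            _gear.ShiftTo(_gear.Number + 1);
    }
}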

DDD: where should logic go that tests the existence of an entity?

I am in the process of refactoring an application and am trying to figure out where certain logic should fit. For example, during the registration process I have to check if a user exists based upon their email address. As this requires testing if the user exists in the database it seems as if this logic should not be tied to the model as its existence is dictated by it being in the database.
However, I will have a method on the repository responsible for fetching the user by email, etc. This handles the part about retrieval of the user if they exist. From a use case perspective, registration seems to be a use case scenario, and accordingly it seems there should be a UserService (application service) with a register method that would call the repository method and perform if/then logic to determine whether the user entity returned was null or not.
Am I on the right track with this approach, in terms of DDD? Am I viewing this scenario the wrong way and if so, how should I revise my thinking about this?
This link was provided as a possible solution: Where to check user email does not already exist?. It does help, but it does not seem to close the loop on the issue. The thing I seem to be missing from this article is who would be responsible for calling the CreateUserService: an application service, or a method on the aggregate root where the CreateUserService object would be injected into the method along with any other relevant parameters?
If the answer is the application service, that seems like you are losing some encapsulation by taking the domain service out of the domain layer. On the other hand, going the other way would mean having to inject the repository into the domain service. Which of those two options would be preferable and more in line with DDD?
I think the best fit for that behaviour is a Domain Service. A Domain Service can access persistence, so you can check for existence or uniqueness.
Check this blog entry for more info.
E.g.:
public class TransferManager
{
    private readonly IEventStore _store;
    private readonly IDomainServices _svc;
    private readonly IDomainQueries _query;
    private readonly ICommandResultMediator _result;

    public TransferManager(IEventStore store, IDomainServices svc, IDomainQueries query, ICommandResultMediator result)
    {
        _store = store;
        _svc = svc;
        _query = query;
        _result = result;
    }

    public void Execute(TransferMoney cmd)
    {
        // interacting with the Infrastructure
        var accFrom = _query.GetAccountNumber(cmd.AccountFrom);

        // set up value objects
        var debit = new Debit(cmd.Amount, accFrom);

        // invoking Domain Services
        var balance = _svc.CalculateAccountBalance(accFrom);
        if (!_svc.CanAccountBeDebitted(balance, debit))
        {
            // return some error message using a mediator;
            // this approach works well inside monoliths where everything happens in the same process
            _result.AddResult(cmd.Id, new CommandResult());
            return;
        }

        // using the Aggregate and getting the business state change expressed as an event
        var evnt = Transfer.Create(/* args */);

        // storing the event
        _store.Append(evnt);

        // publish the event if you want
    }
}
from http://blog.sapiensworks.com/post/2016/08/19/DDD-Application-Services-Explained
The problem that you are facing is called Set based validation. There are a lot of articles describing the possible solutions. I will give here an extract from one of them (the context is CQRS but it can be applied to some degree to any DDD architecture):
1. Locking, Transactions and Database Constraints
Locking, transactions and database constraints are tried and tested tools for maintaining data integrity, but they come at a cost. Often the code/system is difficult to scale and can be complex to write and maintain. But they have the advantage of being well understood with plenty of examples to learn from. By implication, this approach is generally done using CRUD based operations. If you want to maintain the use of event sourcing then you can try a hybrid approach.
2. Hybrid Locking Field
You can adopt a locking field approach. Create a registry or lookup table in a standard database with a unique constraint, and reserve the address before issuing the command; if you are unable to insert the row, then you should abandon the command. For these sorts of operations, it is best to use a data store that isn't eventually consistent and can guarantee the constraint (uniqueness in this case). Additional complexity is a clear downside of this approach, but less obvious is the problem of knowing when the operation is complete. Read-side updates are often carried out in a different thread, process or even machine from the command, and there could be many different operations happening.
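A sketch of such a locking field in C#; the table name and connection handling are hypothetical, and the point is that the database's unique constraint does the work:

using System.Data.SqlClient;
using System.Threading.Tasks;

public class EmailReservationService
{
    private readonly string _connectionString;

    public EmailReservationService(string connectionString)
    {
        _connectionString = connectionString;
    }

    public async Task<bool> TryReserveAsync(string email)
    {
        using (var conn = new SqlConnection(_connectionString))
        {
            await conn.OpenAsync();
            using (var cmd = new SqlCommand(
                "INSERT INTO EmailReservations (Email) VALUES (@email)", conn))
            {
                cmd.Parameters.AddWithValue("@email", email);
                try
                {
                    await cmd.ExecuteNonQueryAsync();
                    return true;  // reserved; safe to issue the command
                }
                catch (SqlException) // unique-constraint violation: already taken
                {
                    return false; // abandon the command
                }
            }
        }
    }
}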
3. Rely on the Eventually Consistent Read Model
To some this sounds like an oxymoron; however, it is a rather neat idea. Inconsistent things happen in systems all the time. Event sourcing allows you to handle these inconsistencies: rather than throwing an exception and losing someone's work all in the name of data consistency, simply record the event and fix it later.
As an aside, how do you know a consistent database is consistent? It keeps no record of the failed operations users have tried to carry out. If I try to update a row in a table that has been updated since I read from it, then the chances are I’m going to lose that data. This gives the DBA an illusion of data consistency, but try to explain that to the exasperated user!
Accepting these things happen, and allowing the business to recover, can bring real competitive advantage. First, you can make the deliberate assumption these issues won’t occur, allowing you to deliver the system quicker/cheaper. Only if they do occur and only if it is of business value do you add features to compensate for the problem.
4. Re-examine the Domain Model
Let’s take a simplistic example to illustrate how a change in perspective may be all you need to resolve the issue. Essentially we have a problem checking for uniqueness or cardinality across aggregate roots because consistency is only enforced with the aggregate. An example could be a goalkeeper in a football team. A goalkeeper is a player. You can only have 1 goalkeeper per team on the pitch at any one time. A data-driven approach may have an ‘IsGoalKeeper’ flag on the player. If the goalkeeper is sent off and an outfield player goes in the goal, then you would need to remove the goalkeeper flag from the goalkeeper and add it to one of the outfield players. You would need constraints in place to ensure that assistant managers didn’t accidentally assign a different player resulting in 2 goalkeepers. In this scenario, we could model the IsGoalKeeper property on the Team, OutFieldPlayers or Game aggregate. This way, maintaining the cardinality becomes trivial.
You seem to be on the right track; the only thing I didn't get is what your UserService.register does.
It should take all the values needed to register a user as input, validate them (using the repository to check the existence of the email) and, if the input is valid, store the new User.
Problems can arise when the validation involves complex queries. In that case maybe you need to create a secondary store with special indexes suited for queries that you can't do with your domain model, so you will have to manage two different stores that can be out of sync (a user exists in one but isn't replicated in the other one yet).
This kind of problem happens when you store your aggregates in something like a key-value store, where you can search only by the ID of the aggregate. If you are using something like a SQL database that permits searching by your entities' fields, you can do a lot with simple queries.
The only thing you need to take care of is to avoid mixing query logic and command logic. In your example the lookup you need is easy: it is just one field and the result is a boolean. Sometimes it can be harder, like time-based operations or queries spanning multiple tables and aggregating results. In these cases it is better to make your (command) service use a (query) service that offers a simple API to do the calculation, like:
interface UserReportingService {
    ComplexResult aComplexQuery(AComplexInput input);
}
You can implement it with a class that uses your repositories, or with an implementation that executes the query directly on your database (SQL, or whatever).
The difference is that if you use the repositories you "think" in terms of your domain objects, while if you write the query directly you think in terms of your DB abstractions (tables/sets in the case of SQL, documents in the case of Mongo, etc.). Which to choose depends on the query you need to do.
It is fine to inject a repository into the domain.
A repository should have a simple interface, so that domain objects can use it as a simple collection or storage. The main idea of repositories is to hide data access behind a simple and clear interface.
I don't see any problem in calling domain services from a use case. A use case is supposed to be an orchestrator, and domain services are actions. It is fine (and even unavoidable) to trigger domain actions from a use case.
To decide, you should analyze where this restriction comes from.
Is it a business rule? Or maybe the user shouldn't be a part of the model at all?
Usually "User" means authorization and authentication, i.e. behaviour that, to my mind, should be placed in the use case. I prefer to create a separate entity for the domain (e.g. a buyer) and relate it to the use case's user. So when a new user is registered, it is possible to trigger the creation of a new buyer.

Can a "rich domain model" violate the Single Responsibility Principle?

An interesting thread came up when I typed in this question just now. I don't think it answers my question though.
I've been working a lot with .NET MVC3, where it's desirable to have an anemic model. View models and edit models are best off as dumb data containers that you can just pass off from a controller to a view. Any kind of application flow should come from the controllers, and the views handle UI concerns. In MVC, we don't want any behavior in the model.
However, we don't want any business logic in the controllers either. For larger applications it's best to keep the domain code separate from and independent of the models, views, and controllers (and HTTP in general, for that matter). So there is a separate project which provides, before anything else, a domain model (with entities and value objects, composed into aggregates according to DDD).
I've made a few attempts to move away from an anemic model towards a richer one in domain code, and I'm thinking of giving up. It seems to me like having entity classes which both contain data and behavior violates the SRP.
Take for example one very common scenario on the web, composing emails. Given some event, it is the domain's responsibility to compose an EmailMessage object given an EmailTemplate, EmailAddress, and custom values. The template exists as an entity with properties, and custom values are provided as input by the user. Let's also say for the sake of argument that the EmailMessage's FROM address can be provided by an external service (IConfigurationManager.DefaultFromMailAddress). Given those requirements, it seems like a rich domain model could give the EmailTemplate the responsibility of composing the EmailMessage:
public class EmailTemplate
{
    public EmailMessage ComposeMessageTo(EmailAddress to,
        IDictionary<string, string> customValues, IConfigurationManager config)
    {
        var emailMessage = new EmailMessage(); // internal constructor

        // extension method
        emailMessage.Body = this.BodyFormat.ApplyCustomValues(customValues);
        emailMessage.From = this.From ?? config.DefaultFromMailAddress;
        // bla bla bla
        return emailMessage;
    }
}
This was one of my attempts at rich domain model. After adding this method though, it was the EmailTemplate's responsibility to both contain the entity data properties and compose messages. It was about 15 lines long, and seemed to distract the class from what it really means to be an EmailTemplate -- which, IMO, is to just store data (subject format, body format, attachments, and optional from/reply-to addresses).
I ended up refactoring this method into a dedicated class whose sole responsibility is composing an EmailMessage given the previous arguments, and I am much happier with it. In fact, I'm starting to prefer anemic domains because it helps me keep responsibilities separate, making classes and unit tests shorter, more concise, and more focused. It seems that making entities and other data objects "devoid of behavior" can be good for separating responsibility. Or am I off track here?
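For illustration, that dedicated class might look roughly like this (a sketch reusing the names from the example above; the class name is hypothetical):

public class EmailMessageComposer
{
    private readonly IConfigurationManager _config;

    public EmailMessageComposer(IConfigurationManager config)
    {
        _config = config;
    }

    public EmailMessage Compose(EmailTemplate template, EmailAddress to,
        IDictionary<string, string> customValues)
    {
        var emailMessage = new EmailMessage();
        emailMessage.Body = template.BodyFormat.ApplyCustomValues(customValues);
        emailMessage.From = template.From ?? _config.DefaultFromMailAddress;
        // The template is now a plain data holder; composition lives here.
        return emailMessage;
    }
}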
The argument in favor of a rich domain model instead of an anemic model hinges on one of the value propositions of OOP, which is keeping behavior and data next to each other. The core benefit is that of encapsulation and cohesion which assists in reasoning about the code. A rich domain model can also be viewed as an instance of the information expert pattern. The value of all of these patterns however is to a great extent subjective. If it is more helpful for you to keep data and behavior separate, then so be it, though you might also consider other people that will be looking at the code. I prefer to encapsulate as much as I can. The other benefit of a richer domain model in this case would be the possibility to make certain properties private. If a property is only used by one method on the class, why make it public?
Whether a rich domain model violates SRP depends on your definition of responsibility. According to SRP, a responsibility is a reason to change, which itself calls for a definition. This definition will generally depend on the use-case at hand. You can declare that the responsibility of the template class is to be a template, with all of the implications that arise, one of which is generating a message from the template. A change in one of the template properties can affect the ComposeMessageTo method, which indicates that perhaps these are a single responsibility. Moreover, the ComposeMessageTo method is the most interesting part of the template. Clients of the template don't care how the method is implemented or what properties are present of the template class. They only want to generate a message based on the template. This also votes in favor of keeping data next to the method.
Well, it depends on how you want to look at it.
Another way is: "Can the Single Responsibility Principle violate a rich domain model?"
Both are guidelines. There is no "principle" anywhere in software design. There are, however, good designs and bad designs. Both these concepts can be used in different ways, to achieve a good design.

How do you deal with DDD and EF4

I'm facing several problems trying to apply DDD with EF4 (in an ASP.NET MVC 2 context). Your advice would be greatly appreciated.
First of all, I started to use POCOs because the dependency on ObjectContext was not very comfortable in many situations.
Going to POCO solved some problems but the experience is not what I was used to with NHibernate.
I would like to know if it's possible to use the designer to generate not only entities but also Value Objects (ComplexTypes?). By Value Object I mean a class with one ctor and without any settable properties (is a T4 modification needed?).
The only way I found to add behavior to the anemic entities is to create partial classes that extend those generated from the edmx. I'm not satisfied with this approach.
I don't know how to create several repositories with one edmx. For now I'm using partial classes to group methods for each aggregate. Each group is a repository, in fact.
The last question is about IQueryable. Should it be exposed outside the repository? If I refer to the blue book, the repository should be a unit of execution and shouldn't expose something like IQueryable. What do you think?
Thanks for your help.
Thomas
It's fine to use POCOs, but note that EntityObject doesn't require an ObjectContext.
Yes, Complex Types are value objects and yes, you can generate them in the designer. Select several properties of an entity, right click, and choose refactor into complex type.
I strongly recommend putting business methods in their own types, not on entities. "Anemic" types can be a problem if you must maintain them, but when they're codegened they're hardly a maintenance problem. Making business logic separate from entity types allows your business rules and your data model to evolve independently. Yes, you must use partial classes if you must mix these concerns, but I don't believe that separating your model and your rules is a bad thing.
I think that repositories should expose IQueryable, but you can make a good case that domain services should not. People often try to build their repositories into domain services, but remember that the repository exists only to abstract away persistence. Concerns like security should be in domain services, and you can make the case that having IQueryable there gives too much power to the consumer.
I think it's OK to expose IQueryable outside of the repository, only because not doing so could be unnecessarily restrictive. If you only expose data via methods like GetPeopleByBirthday and GetPeopleByLastName, what happens when somebody goes to search for a person by last name and birthday? Do you pull in all the people with the last name "Smith" and do a linear search for the birthday you want, or do you create a new method GetPeopleByBirthdayAndLastName? What about the poor hapless fellow who has to implement a QBE form?
Back when the only way to make ad hoc queries against the domain was to generate SQL, the only way to keep yourself safe was to offer just specific methods to retrieve and change data. Now that we have LINQ, though, there's no reason to keep the handcuffs on. Anybody can submit a query and you can execute it safely without concern.
Of course, you could be concerned that a user might be able to view another's data, but that's easy to mitigate because you can restrict what data you give out. For example:
public IQueryable<Content> Content
{
    // _allContent stands in for the unrestricted underlying set here
    // (the property can't query itself without recursing)
    get { return _allContent.Where(c => c.UserId == this.UserId); }
}
This will make sure that the only Content rows that the user can get are those that have his UserId.
If your concern is the load on the database, you could do things like examine query expressions for table scans (accessing tables without Where clauses or with no indexed columns in the Where clause). Granted, that's non-trivial, and I wouldn't recommend it.
It's been some time since I asked that question and had a chance to do it on my own.
I don't think it's good practice to expose IQueryable at all outside the DAL layer; it brings more problems than it solves. I'm talking about large MVC applications. First of all, refactoring is harder: many developers use IQueryable instances from the views and then struggle with the fact that, by the time the IQueryable is resolved, the connection has already been disposed. There are also performance problems, because the whole database is often queried for a given set of results, and so on.
I'd rather expose IEnumerable from my repositories, and believe me, it saves me a lot of trouble.
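For instance, a repository method along these lines (the context and entity names are hypothetical) materializes the query before the context goes away:

using System.Collections.Generic;
using System.Linq;

public class PersonRepository
{
    public IEnumerable<Person> GetPeopleByLastName(string lastName)
    {
        using (var context = new MyDbContext())
        {
            // ToList() executes the query now, while the connection is open,
            // so callers never trip over a disposed context.
            return context.People
                .Where(p => p.LastName == lastName)
                .ToList();
        }
    }
}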
