DDD and defining aggregates - domain-driven-design

DDD and defining aggregates - domain-driven-design

We are building a system which can sell our api services to multiple companies.
We have
companies (companies which purchased our api)
accounts (each company can have multiple accounts, and each account has it's user types)
users (users within account)
Infrastructurally, it looks something like this:
"company1" : {
"accounts" : [
account1 :{"users" : [{user1,user2}], accountType},
account2 :{"users" : [{user1,user2}], accountType},
]}
One of the business rules states that users can't change accounts after registration.
Other rule states that user can change his type, but only within that account type.
From my understanding, my domain model should be called UserAccount, and it should consist of Account, User and UserType entities, where Account would be aggregate root.
class UserAccount{
int AccountId;
string AccountName;
int AccountTypeId;
List<UserTypes> AvailableUserTypesForThisAccount
User User
void SetUserType(userTypeId){
if(AvailableUserTypesForThisAccount.Contains(userTypeId) == false)
throw new NotSupportedException();
}
}
With this aggregate, we can change type of the user, but it can only be type which is available for that account (one of invariants).
When I fetch UserAccount from repository, I would fetch all necessary tables (or entity data objects) and mapped them to aggregate, and returned it as a whole.
Is my understanding and modeling going in to the right direction?

It's important to understand the design trade-off of aggregates; because aggregates partition your domain model into independent spaces, you gain the ability to modify unrelated parts of the model concurrently. But you lose the ability to enforce business rules that span multiple aggregates at the point of change.
What this means is that you need to have a clear understanding of the business value of those two things. For entities that aren't going to change very often, your business may prefer strict enforcement over concurrent changes; where the data is subject to frequent change, you will probably end up preferring more isolation.
In practice, isolation means evaluating whether or not the business can afford to mitigate the cases where "conflicting" edits leave the model in an unsatisfactory state.
With this aggregate, we can change type of the user, but it can only be type which is available for that account (one of invariants).
With an invariant like this, an important question to ask is "what is the business cost of a failure here"?
If User and Account are separate aggregates, then you face the problem that a user is being assigned to a "type" at the same time that an account is dropping support for that type. So what would it cost you to detect (after the change) that a violation of the "invariant" had occurred, and what would it cost to apply a correction?
If Account is relatively stable (as seems likely), then most of those errors can be mitigated by comparing the user type to a cached list of those allowed in the account. This cache can be evaluated when the user is being changed, or in the UI that supports the edit. That will reduce (but not eliminate) the error rate without compromising concurrent edits.
From my understanding, my domain model should be called UserAccount, and it should consist of Account, User and UserType entities, where Account would be aggregate root.
I think you've lost the plot here. The "domain model" isn't really a named thing, it's just a collection of aggregates.
If you wanted an Account aggregates that contain Users and UserTypes, then you would probably model it something like
Account : Aggregate {
accountId : Id<Account>,
name : AccountName,
users : List<User>,
usertypes : List<UserType>
}
This design implies that all changes to a User need to be accessed via the Account aggregate, and that no User belongs to more than one account, and that no other aggregate can directly reference a user (you need to negotiate directly with the Account aggregate).
Account::SetUserType(UserHint hint, UserType userTypeId){
if(! usertypes.Contains(userTypeId)) {
throw new AccountInvariantViolationException();
}
User u = findUser(users, hint);
...
}
When I fetch UserAccount from repository, I would fetch all necessary tables (or entity data objects) and mapped them to aggregate, and returned it as a whole.
Yes, that's exactly right -- it's another reason that we generally prefer small aggregates loosely coupled, rather than one large aggregate.
what about having only the relationship between Account and User live in the Account aggregate as well as the type of user (as an AccountUser entity) and have the rest of the user information live in a separate User aggregate?
That model could work for some kinds of problems -- in that case, the Account aggregate would probably looks something like
Account : Aggregate {
accountId : Id<Account>,
name : AccountName,
users : Map<Id<User>,UserType>
usertypes : List<UserType>
}
This design allows you to throw exceptions if somebody tries to remove a UserType from an Account when some User is currently of that type. But it cannot, for example, ensure that the user type described here is actually consistent with the state of the independent User aggregate -- or event be certain that the identified User exists (you'll be relying on detection and mitigation for those cases).
Is that better? worse? It's not really possible to say without a more thorough understanding of the actual problem being addressed (trying to understand ddd from toy problems is really hard).
The principle is to understand which the business invariant that must be maintained at all times (as opposed to those where later reconciliation is acceptable), and then group together all of the state which must be kept consistent to satisfy the invariant.
But what if account can have hundreds or thousands of users? What would be your vision of aggregate?
Assuming the same constraints: that we have some aggregate that is responsible for the allowed range of user types.... if the aggregate got to be too large to manage in a reasonable way, and the constraints imposed by the business cannot be relaxed, then I would probably compromise the "repository" abstraction, and allow the enforcement of the set validation rules to leak into the database itself.
The conceit of DDD, taken from its original OO best practices roots, is that the model is real, and the persistence store is just an environmental detail. But looked at with a practical eye, in a world where the processes have life cycles and there are competing consumers and... it's the persistence store that represents the truth of the business.

Related

DDD Modify one aggregate per transaction with invariants in both aggregates

Suppose I have an aggregate root Tenant and an aggregate root Organization. Multiples Organizations can be linked to a single Tenant. Tenant only has the Id of the Organizations in it's aggregate.
Suppose I have the following invariant in the Organization aggregate: Organization can only have one subscription for a specific product type.
Suppose I have the following invariant in the Tenant aggregate: only one subscription for a product type must exists across all Organizations related to a Tenant.
How can we enforce those invariants using the one aggregate per transaction rule?
When adding a subscription to an Organization, we can easily validate the first invariant, and fire a domain event to update (eventual consistency) the Tenant, but what happens if the invariant is violated in the Tenant aggregate?
Does it imply to fire another domain event to rollback what happens in the Organization aggregate? Seems tricky in the case a response had been sent to a UI after the first aggregate had been modified successfully.
Or is the real approach here is to use a domain service to validate the invariants of both aggregates before initiating the update? If so, do we place the invariants/rules inside the domain service directly or do we place kind of boolean validation methods on aggregates to keep the logic there?
UPDATE
What if the UI must prevent the user from saving in the UI if one invariants is violated? In this case we are not even trying to update an aggregate.

One thing you might want to consider is the possibility of a missing concept in your domain. You might want to explore the possibility of your scenario having something as a Subscription Plan concept, which by itself is an aggregate and enforces all of these rules you're currently trying to put inside the Tenant/Organization aggregates.
When facing such scenarios I tend to think to myself "what would an organization do if there was no system at all facilitating this operation". In your case, if there were multiple people from the same tenant, each responsible for an organization... how would they synchronize their subscriptions to comply with the invariants?
In such an exercise, you will probably reach some of the scenarios already explored:
Have a gathering event (such as a conference call) to make sure no redundant subscriptions are being made: that's the Domain Service path.
Each make their own subscriptions and they notify each other, eventually charging back redundant ones: that's the Event + Rollback path.
They might compromise and keep a shared ledger where they can check how subscriptions are going corporation wide and the ledger is the authority in such decisions: that's the missing aggregate path.
You will probably reach other options if you stress the issue enough.

How can we enforce those invariants using the one aggregate per transaction rule?
There are a few different answers.
One is to abandon the "rule" - limiting yourself to one aggregate per transaction isn't important. What really matters is that all of the objects in the unit of work are stored together, so that the transaction is an all or nothing event.
BEGIN TRANSACTION
UPDATE ORGANIZATION
UPDATE TENANT
COMMIT
A challenge in this design is that the aggregates no longer describe atomic units of storage - the fact that this organization and this tenant need to be stored in the same shard is implicit, rather than explicit.
Another is to redesign your aggregates - boundaries are hard, and its often the case that our first choice of boundaries are wrong. Udi Dahan, in his talk Finding Service Boundaries, observed that (as an example) the domain behaviors associated with a book title usually have little or nothing to do with the book price; they are two separate things that have a relation to a common thing, but they have no rules in common. So they could be treated as part of separate aggregates.
So you can redesign your Organization/Tenant boundaries to more correctly capture the relations between them. Thus, all of the relations that we need to correctly evaluate this rule are in a single aggregate, and therefore necessarily stored together.
The third possibility is to accept that these two aggregates are independent of each other, and the "invariant" is more like a guideline than an actual rule. The two aggregates act like participants in a protocol, and we design into the protocol not only the happy path, but also the failure modes.
The simple forms of these protocols, where we have reversible actions to unwind from a problem, are called sagas. Caitie McCaffrey gave a well received talk on this in 2015, or you could read Clemens Vasters or Bernd Rücker; Garcia-Molina and Salem introduced the term in their study of long lived transactions.
Process Managers are another common term for this idea of a coordinated protocol, where you might have a more complicated graph of states than commit/rollback.

The first idea that came to my mind is to have a property of the organization called "tenantHasSubscription" that property can be updated with domain events. Once you have this property you can enforce the invariant in the organization aggregate.

If you want to be 100% sure that the invariant is never violated, all the commands SubscribeToProduct(TenantId, OrganizationId) have to be managed by the same aggregate (maybe the Tenant), that has internally all the values to check the invariant.
Otherwise to do your operation you will always have to query for an "external" value (from the aggregate point of view), this will introduce "latency" in the operation that open a window for inconsistency.
If you query a db to have values, can it happen that when the result is on the wire, somebody else is updating it, because the db doesn't wait you consumed your read to allow others to modify it, so your aggregate will use stale data to check invariants.
Obviously this is an extremism, this doesn't mean that it is for sure dangerous, but you have to calculate the probability of a failure to happen, how can you be warned when it happen, and how to solve it (automatically by the program, or maybe a manual intervention, depending on the situation).

DDD Factory Responsibility

If have the following Code.
public class CountryFactory : IEntityFactory
{
private readonly IRepository<Country> countryRepository;
public CountryFactory(IRepository<Country> countryRepository)
{
this.countryRepository = countryRepository;
}
public Country CreateCountry(string name)
{
if (countryRepository.FindAll().Any(c => c.Name == name))
{
throw new ArgumentException("There is already a country with that name!");
}
return new Country(name);
}
}
From a DDD approach, is the the correct way to create a Country. Or is it better to have a CountryService which checks whether or not a country exists, then if it does not, just call the factory to return a new entity. This will then mean that the service will be responsible of persisting the Entity rather than the Factory.
I'm a bit confused as to where the responsibility should lay. Especially if more complex entities needs to be created which is not as simple as creating a country.

In DDD factories are used to encapsulate complex objects and aggregates creation. Usually, factories are not implemented as separate classes but rather static methods on the aggregate root class that returns the new aggregate.
Factory methods are better suited than constructors since you might need to have technical constructors for serialization purposes and var x = new Country(name) has very little meaning inside your Ubiquitous Language. What does it mean? Why do you need a name when you create a country? Do you really create countries, how often new countries appear, do you even need to model this process? All these questions arise if you start thinking about your model and ubiquitous language besides tactical pattern.
Factories must return valid objects (i.e. aggregates), checking all invariants inside it, but not outside. Factory might receive services and repositories as parameters but this is also not very common. Normally, you have an application service or command handler that does some validations and then creates a new aggregate using the factory method and adds it to the repository.
There is also a good answer by Lev Gorodinski here Factory Pattern where should this live in DDD?
Besides, implementation of Factories is extensively described in Chapter 11 of the Red Book.

Injecting a Repository into a Factory is OK, but it shouldn't be your first concern. The starting point should be : what kind of consistency does your business domain require ?
By checking Country name uniqueness in CountryFactory which is part of your Domain layer, you give yourself the impression that the countries will always be consistent. But the only aggregate is Country and since there is no AllCountries aggregate to act as a consistency boundary, respect of this invariant will not be guaranteed. Somebody could always sneak in a new Country that has exactly the same name as the one being added, just after you checked it. What you could do is wrap the CreateCountry operation into a transaction that would lock the entire set of Countries (and thus the entire table if you use an RDBMS) but this would hurt concurrency.
There are other options to consider.
Why not leverage a database unique constraint to enforce the Country name invariant ? As a complement, you could also have another checkpoint at the UI level to warn the user that the country name they typed in is already taken. This would necessitate another "query" service that just calls CountryRepository.GetByName() but where the returned Countries are not expected to be modified.
Soon you'll be realizing that there are really two kinds of models - ones that can give you some domain data at a given moment in time so that you can display it on a user interface, and ones that expose operations (AddCountry) and will guarantee that domain invariants always hold. This is a first step towards CQRS.
What is the frequency of Countries being added or modified ? If it is that high, do we really need a Country name to be unique at all times ? Wouldn't it solve a lot of problems if we loosened up the constraints and allowed a user to temporarily create a duplicate Country name ? A mechanism could detect the duplicates later on and take a compensating action, putting the newly added Country on hold and reaching out to the user to ask them to change the name. A.k.a eventual consistency instead of immediate consistency.
Does Country need to be an Aggregate ? What would be the cost if it was a Value Object and duplicated in each entity where it is used ?

Domain security involving domain logic

Together with my application's domain logic I am trying to outline the security model. I am stuck with a requirement that prevents me from considering security just a cross-cutting concern over my domain logic. Here follows my situation.
A user in my system can potentially be allowed to create a certain kind of objects, say, 'filters'. I introduce a permission called 'CREATE_FILTER', and a user is either allowed to create filters or not, depending on whether the admin assigned such a permission to this user, or not. Ok.
Now consider a more complex requirement when the number of filters a user can create is limited. So, e.g. the admin should be able to set max number of filters any user is allowed to create, or even more complex, to assign max numbers individually to users, say value of 3 to User1, 5 to User2 and so on. So, the security system, in order to authorize filter creation to a user, is not sufficient to check whether a user has such a permission assigned, but has to analyze the domain model in a complex way in order to look how many filters there are already created by the user to make the decision. To make things more complex, we can imagine that the max limit will depend on the amount of money user has on their account, or something.
I want to conceptually separate (at least in my mind), whether such a complicated security logic purely pertains to security (which will of course depend on the domain model) or is this already a full-fledged part of the domain logic itself? Does it make sense to keep a 'permission' concept, when assigning/removing permissions does not help much (since it's domain state on which depends authorization decision rather than assigned permissions)? Would it be a way to go, say, to have a complicated permission concept which not simply allows an action by a mere fact of its existence but would rather involve some complex decision making logic?

Here's one way you could handle this ...
On one side you have a security model (might be a bounded context in ddd speak) that is solving the problem of assigning permissions to subjects (users), maybe indirectly through the use of roles. I would envision upper boundaries (max numbers) to be an attribute associated with the assigned permission.
There's also a query part to this model. Yet, it can only answer "simple" questions:
Has this user permission to create filters?
How many filters can this user create?
Some would even say this query part is a separate model altogether.
On the other end you have your application's model which is largely "security" free apart from these pesky requirements along the lines of "user John Doe can only create 3 filter". As an aside, it's doubtful we're still speaking of a "user" at this point, but rather of a person acting in a certain role in this particular use case. Anyway, back to how we could keep this somewhat separate. Suppose we have a somewhat layered approach and we have an application service with an authorization service in front. The responsibility of the authorization service is to answer the question "is this user allowed to perform this operation? yes or no?" and stop processing if the answer is no. Here's a very naive version of that (C#).
public class FilterAuthorizationServices :
Handles<CreateFilter>
{
public FilterAuthorizationServices(FilterRepository filterRepository) { ... }
public void Authorize(Subject subject, CreateFilter message)
{
if(!subject.HasPermissionTo("CreateFilter"))
{
throw new NotAuthorizedException("...");
}
if(filterRepository.CountFiltersCreatedBy(subject.Id) >
subject.GetTheMaximumNumberOfFiltersAllowedToCreate())
{
throw new NotAuthorizedException("...");
}
}
}
Notice how the application service is not even mentioned here. It can concentrate on invoking the actual domain logic. Yet, the authorization service is using both the query part of the above model (embodied by the Subject) and the model of the application (embodied by the FilterRepository) to fulfill the authorization request. Nothing wrong with doing it this way.
You could even go a step further and ditch the need for the application's model if that model could somehow provide the "current number of created filters" to the security model. But that might be a bridge too far for you, since that would lead down the path of evaluating dynamic expressions (which wouldn't necessarily be a bad place to be in). If you want to go there, may I suggest you create a mini DSL to define the required expressions and associated code to parse and evaluate them.
If you'd like brokered authorization you could look at something like XACML (https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xacml) though you'll have to overcome your fear of XML first ;-)

Bounded Contexts and Aggregate Roots

We are trying to model an RBAC-based user maintenance system using DDD principles. We have identified the following entities:
Authorization is an Aggregate Root with the following:
User (an entity object)
List<Authority> (list of value objects)
Authority contains the following value objects:
AuthorityType (base class of classes Role and Permission)
effectiveDate
Role contains a List<Permission>
Permission has code and description attributes
In a typical scenario, Authorization is definitely the Aggregate Root since everything in user maintenance revolves around that (e.g. I can grant a user one or more Authority-ies which is either a Role or Permission)
My question is : what about Role and Permission? Are they also Aggregate Roots in their own separate contexts? (i.e. I have three contexts, authorization, role, permission). While can combine all in just one context, wouldn't the Role be too heavy enough since it will be loaded as part of the Authorization "object graph"?

Firstly I can't help but feel you've misunderstood the concept of a bounded context. What you've described as BC's I would describe as entities. In my mind, bounded contexts serve to give entities defined in the ubiquitous language a different purpose for a given context.
For example, in a hospital domain, a Patient being treated in the outpatients department might have a list of Referrals, and methods such as BookAppointment(). A Patient being treated as an Inpatient however, will have a Ward property and methods such as TransferToTheatre(). Given this, there are two bounded contexts that patients exist in: Outpatients & Inpatients. In the insurance domain, the sales team put together a Policy that has a degree of risk associated to it and therefore cost. But if it reaches the claims department, that information is meaningless to them. They only need to verify whether the policy is valid for the claim. So there is two contexts here: Sales & Claims
Secondly, are you just using RBAC as an example while you experiment with implementing DDD? The reason I ask is because DDD is designed to help solve complex business problems - i.e. where calculations are required (such as the risk of a policy). In my mind RBAC is a fairly simple infrastructure service that doesn't concern itself with actual domain logic and therefore doesn't warrant a strict DDD implementation. DDD is expensive to invest in, and you shouldn't adopt it just for the sake of it; this is why bounded contexts are important - only model the context with DDD if it's justifiable.
Anyway, at the risk of this answer sounding to 'academic' I'll now try to answer your question assuming you still want to model this as DDD:
To me, this would all fit under one context (called 'Security' or something)
As a general rule of thumb, make everything an aggregate that requires an independent transaction, so:
On the assumption that the system allows for Authorities to be added to the Authorization object, make Authorization an aggregate. (Although there might be an argument for ditching Authorization and simply making User the aggregate root with a list of Authorities)
Authorities serve no meaning outside of the Authorization aggregate and are only created when adding one, so these remain as entities.
On the assumption that system allows for Permissions to be added to a Role, Role becomes an aggregate root.
On the assumption that Permissions cannot be created/delete - i.e. they are defined by the system itself, so these are simple value objects.
Whilst on the subject of aggregate design, I cannot recommend these articles enough.

What Belongs to the Aggregate Root

This is a practical Domain Driven Design question:
Conceptually, I think I get Aggregate roots until I go to define one.
I have an Employee entity, which has surfaced as an Aggregate root. In the Business, some employees can have work-related Violations logged against them:
Employee-----*Violations
Since not all Employees are subject to this, I would think that Violations would not be a part of the Employee Aggregate, correct?
So when I want to work with Employees and their related violations, is this two separate Repository interactions by some Service?
Lastly, when I add a Violation, is that method on the Employee Entity?
Thanks for the help!

After doing even MORE research, I think I have the answer to my question.
Paul Stovell had this slightly edited response to a similar question on the DDD messageboard. Substitute "Customer" for "Employee", and "Order" for "Violation" and you get the idea.
Just because Customer references Order
doesn't necessarily mean Order falls
within the Customer aggregate root.
The customer's addresses might, but
the orders can be independent (for
example, you might have a service that
processes all new orders no matter who
the customer is. Having to go
Customer->Orders makes no sense in
this scenario).
From a domain point of view, you can
even question the validity of those
references (Customer has reference to
a list of Orders). How often will you
actually need all orders for a
customer? In some systems it makes
sense, but in others, one customer
might make many orders. Chances are
you want orders for a customer between
a date range, or orders for a customer
that aren't processed yet, or orders
which have not been paid, and so on.
The scenario in which you'll need all
of them might be relatively uncommon.
However, it's much more likely that
when dealing with an Order, you will
want the customer information. So in
code, Order.Customer.Name is useful,
but Customer.Orders[0].LineItem.SKU -
probably not so useful. Of course,
that totally depends on your business
domain.
In other words, Updating Customer has nothing to do with updating Orders. And orders, or violations in my case, could conceivable be dealt with independently of Customers/Employees.
If Violations had detail lines, then Violation and Violation line would then be a part of the same aggregate because changing a violation line would likely affect a Violation.
EDIT**
The wrinkle here in my Domain is that Violations have no behavior. They are basically records of an event that happened. Not sure yet about the implications that has.

Eric Evan states in his book, Domain-Driven Design: Tackling the Complexity in the Heart of Software,
An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes.
There are 2 important points here:
These objects should be treated as a "unit".
For the purpose of "data change".
I believe in your scenario, Employee and Violation are not necessarily a unit together, whereas in the example of Order and OrderItem, they are part of a single unit.
Another thing that is important when modeling the agggregate boundaries is whether you have any invariants in your aggregate. Invariants are business rules that should be valid within the "whole" aggregate. For example, as for the Order and OrderItem example, you might have an invariant that states the total cost of the order should be less than a predefined amount. In this case, anytime you want to add an OrderItem to the Order, this invariant should be enforced to make sure that your Order is valid. However, in your problem, I don't see any invariants between your entities: Employee and Violation.
So short answer:
I believe Employee and Violation each belong to 2 separate aggregates. Each of these entities are also their own aggregate roots. So you need 2 repositories: EmployeeRepository and ViolationRepository.
I also believe you should have an unidirectional association from Violation to Employee. This way, each Violation object knows who it belongs to. But if you want to get the list of all Violations for a particular Employee, then you can ask the ViolationRepository:
var list = repository.FindAllViolationsByEmployee(someEmployee);

You say that you have employee entity and violations and each violation does not have any behavior itself. From what I can read above, it seems to me that you may have two aggregate roots:
Employee
EmployeeViolations (call it EmployeeViolationCard or EmployeeViolationRecords)
EmployeeViolations is identified by the same employee ID and it holds a collection of violation objects. You get behavior for employee and violations separated this way and you don't get Violation entity without behavior.
Whether violation is entity or value object you should decide based on its properties.

I generally agree with Mosh on this one. However, keep in mind the notion of transactions in the business point of view. So I actually take "for the purpose of data changes" to mean "for the purpose of transaction(s)".
Repositories are views of the domain model. In a domain environment, these "views" really support or represent a business function or capability - a transaction. Case in point, the Employee may have one or more violations, and if so, are aspects of a transaction(s) in a point in time. Consider your use cases.
Scenario: "An employee commits an act that is a violation of the workplace." This is a type of business event (i.e. transaction, or part of a larger, perhaps distributed transaction) that occurred. The root affected domain object actually can be seen from more than one perspective, which is why it is confusing. But the thing to remember is behavior as it pertains to a business transaction, since you want your business processes to model the real-world as accurate as possible. In terms of relationships, just like in a relational database, your conceptual domain model should actually indicate this already (i.e. the associativity), which often can be read in either direction:
Employee <----commits a -------committed by ----> Violation
So for this use case, it would be fair that to say that it is a transaction dealing with violations, and that the root - or "primary" entity - is a Violation. That, then would be your aggregate root you would reference for that particular business activity or business process. But that is not to say that, for a different activity or process, that you cannot have an Employee aggregate root, such as the "new employee process". If you take care, there should be no negative impact of cyclic references, or being able to traverse your domain model multiple ways. I will warn, however, that governing of this should be thought about and handled by your controller piece of your business domain, or whatever equivalent you have.
Aside: Thinking in terms of patterns (i.e. MVC), the repository is a view, the domain objects are the model, and thus one should also employ some form of controller pattern. Typically, the controller declares the concrete implementation of and access to the repositories (collections of aggregate roots).
In the data access world...
Using LINQ-To-SQL as an example, the DataContext would be the controller exposing a view of Customer and Order entities. The view is a non-declarative, framework-oriented Table type (rough equivalent to Repository). Note that the view keeps a reference to its parent controller, and often goes through the controller to control how/when the view gets materialized. Thus, the controller is your provider, taking care of mapping, translation, object hydration, etc. The model is then your data POCOs. Pretty much a typical MVC pattern.
Using N/Hibernate as an example, the ISession would be the controller exposing a view of Customer and Order entities by way of the session.Enumerable(string query) or session.Get(object id) or session.CreateCriteria(typeof(Customer)).List()
In the business logic world...
Customer { /*...*/ }
Employee { /*...*/ }
Repository<T> : IRepository<T>
, IEnumerable<T>
//, IQueryable<T>, IQueryProvider //optional
{ /**/ }
BusinessController {
Repository<Customer> Customers { get{ /*...*/ }} //aggregate root
Repository<Order> Orders { get{ /*...*/ }} // aggregate root
}
In a nutshell, let your business processes and transactions be the guide, and let your business infrastructure naturally evolve as processes/activities are implemented or refactored. Moreover, prefer composability over traditional black box design. When you get to service-oriented or cloud computing, you will be glad you did. :)

I was wondering what the conclusion would be?
'Violations' become a root entity. And 'violations' would be referenced by 'employee' root entity. ie violations repository <-> employee repository
But you are consfused about making violations a root entity becuase it has no behavior.
But is 'behaviour' a criteria to qualify as a root entity? I dont think so.

a slightly orthogonal question to test understanding here, going back to Order...OrderItem example, there might be an analytics module in the system that wants to look into OrderItems directly i.e get all orderItems for a particular product, or all order items greater than some given value etc, does having a lot of usecases like that and driving "aggregate root" to extreme could we argue that OrderItem is a different aggregate root in itself ??

It depends. Does any change/add/delete of a vioation change any part of employee - e.g. are you storing violation count, or violation count within past 3 years against employee?

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string