Are collections of inner aggregates valid? - domain-driven-design

Let's say I have an AggregateRoot called Folder with a collection of sub-folders like this:
class Folder : AggregateRoot
{
    string Name { get; }
    ICollection<Folder> Folders { get; }
}
The inner collection here is really just a list of aggregate-ids to other Folder-aggregates which is resolved lazily when enumerated.
Is this construct valid in domain modeling, where an aggregate not only references other aggregates but also defines its Folders property as a collection of other Folder aggregates?
Why? The example above may not be particularly good but my goal is mainly to have a natural way of working with aggregate collections and hide the fact that agg-references are resolved through a repository under the surface. I want to work with aggregates as easily as with entity-collections.
My thinking here is also that the sub-folders in some way are owned by the parent-aggregate, that the collection really is a way of representing a place where the aggregates are stored even if it's not really true as they are stored more generally through a repository.
The recursiveness in the example is not really important. The focus is on the fact that an aggregate "seems" to own other aggregates. When making a change in two folders it would of course only be possible to save them one by one, but that should be OK. I would also have to include a rule that folders can only be created in place and not added manually, so that they cannot turn up in more than one aggregate collection.

Child structures are valid use cases in domain modeling, and are often encountered in recursive concepts like Groups, Tags, etc., just like in your example of Folders. I like dealing with them as pure collection objects in the domain layer, with no hint of persistence logic whatsoever. When writing such domain logic, I like to imagine that I am dealing with objects as if they will be perpetually preserved in RAM.
I will consider your recursive example for my explanation, but the same concept applies to any "collection" of child objects, not just recursive relationships.
Below is a sample implementation in pseudocode, annotated with comments. I apologize in advance that the code is closer to Python in structure. I wanted to convey the idea accurately, and not worry about how to represent it in C#, in which I am not well versed. Please ask questions about the code if something is not clear.
Notes on the pseudocode:
In the domain layer, you deal with the collection as if it were just another list/collection, without having to worry about underlying persistence complexities.
FolderService is an ApplicationService that is typically invoked by the API. This service is responsible for assembling the infrastructure services, interacting with the domain layer, and eventual persistence.
FolderTable is an imaginary database representation of the Folder object. FolderRepository alone knows about this class and its implementation details.
The complexities of saving and retrieving the Folder object from the DB would be present only in the FolderRepository class.
The load_by_name repository method eagerly loads and populates all subfolders into the parent folder. We can convert this to be lazily loaded, only on access, and never load it unless we are traversing (it can even be paginated, based on requirements, especially if there is no specific limit on the number of subfolders).
class Folder(AggregateRoot):
    name: str
    folders: List[Folder]
    @classmethod
    def construct_from_args(cls, params):
        # This is a factory method
        return cls(**params)
    def add_sub_folder(self, new_folder: Folder) -> None:
        self.folders.append(new_folder)
    def remove_sub_folder(self, existing_folder: Folder) -> None:
        # Dummy implementation. Actual implementation will be more involved.
        # Rebuild the list rather than removing items while iterating over it.
        self.folders = [
            folder for folder in self.folders
            if folder.name != existing_folder.name
        ]
class FolderService(ApplicationService):
    def add_sub_folder(self, parent_folder_name: str, new_folder_params: dict) -> None:
        folder_repo = _system.get_folder_repository()
        parent_folder = folder_repo.load_by_name(parent_folder_name)
        new_sub_folder = Folder.construct_from_args(new_folder_params)
        parent_folder.add_sub_folder(new_sub_folder)
        folder_repo.save(parent_folder)
class FolderTable(DBObject):
    # A DB representation of the folder domain object
    # `parent_id` will be empty for the root folder
    name: str
    parent_id: int
class FolderRepository(Repository):
    # Constructor for Repository
    # that has `connection`s to the database, for later use
    # FolderRepository works exclusively with `FolderTable` class
    def load_by_name(self, folder_name: str) -> Folder:
        # Load a folder, including its subfolders, from database
        persisted_folder = self.connection.find(name=folder_name)
        parent_identifier = persisted_folder.id
        sub_folders = self.connection.find(parent_id=parent_identifier)
        # Map the database rows to domain objects before returning them
        folder = Folder(name=persisted_folder.name, folders=[])
        for sub_folder in sub_folders:
            folder.folders.append(Folder(name=sub_folder.name, folders=[]))
        return folder
    def save(self, folder: Folder) -> None:
        persisted_folder = self.connection.find(name=folder.name)
        parent_identifier = persisted_folder.id
        # Gather the persisted list of folders from database
        persisted_sub_folders = self.connection.find(parent_id=parent_identifier)
        # The heart of the persistence logic: compare `folder.folders`
        # with `persisted_sub_folders`, with three conditions:
        #   If the subfolder is already persisted,
        #       Do Nothing
        #   If there is a persisted subfolder that is no longer a part of folders,
        #       Remove it
        #   If the subfolder is not among those already persisted,
        #       Add it
        ...
If you see holes in this implementation or my thought-process, please do point them out.
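The three conditions listed in the comments of save amount to a set difference between what is persisted and what the aggregate currently holds. Below is a minimal runnable sketch of that idea. It is not the answer's actual implementation: InMemoryFolderRepository and its rows dictionary are hypothetical stand-ins for FolderRepository and FolderTable, folder names are assumed to be unique (the pseudocode above also looks folders up by name), and only one level of sub-folders is handled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Folder:
    name: str
    folders: List["Folder"] = field(default_factory=list)


class InMemoryFolderRepository:
    """Hypothetical stand-in for FolderRepository; a dict plays the FolderTable role."""

    def __init__(self) -> None:
        # Maps a folder name to its parent's name (None for a root folder).
        self.rows: Dict[str, Optional[str]] = {}

    def save(self, folder: Folder) -> None:
        self.rows.setdefault(folder.name, None)
        persisted = {n for n, parent in self.rows.items() if parent == folder.name}
        current = {sub.name for sub in folder.folders}
        for name in current - persisted:  # not persisted yet: add it
            self.rows[name] = folder.name
        for name in persisted - current:  # no longer part of folders: remove it
            del self.rows[name]
        # Names present in both sets are already persisted: do nothing.

    def load_by_name(self, name: str) -> Folder:
        if name not in self.rows:
            raise KeyError(name)
        subs = [Folder(n) for n, parent in self.rows.items() if parent == name]
        return Folder(name, subs)


repo = InMemoryFolderRepository()
root = Folder("root", [Folder("docs"), Folder("music")])
repo.save(root)

# Change the collection in memory only, then let save() work out the difference.
root.folders = [f for f in root.folders if f.name != "docs"]
root.folders.append(Folder("photos"))
repo.save(root)
```

The domain object never learns how the difference is computed; it only mutates its list, which is the separation the notes above argue for.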

When one aggregate contains another, you should avoid direct references, as there is usually some unwanted pain that follows.
Any lazy-loading would indicate that you could redesign things a bit, as you should avoid that also.
The most common pattern is to have either only a list of ids or a list of value objects. The latter appears to be more appropriate in your case. You could then always have a fully loaded AR with all the relevant folders. In order to navigate you would need to retrieve the relevant folder.
This particular example has some peculiarities since it represents a hierarchy, but you'd have to deal with those on a case-by-case basis.
In short: having any aggregate reference another, whether it be through a collection or otherwise, is ill-advised.
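As a rough illustration of the "list of ids" pattern, here is a small Python sketch. All names (sub_folder_ids, create_sub_folder, the dictionary-backed FolderRepository) are assumptions made for the example and do not come from the question. Creating sub-folders only through the parent mirrors the "created in place" rule mentioned in the question.

```python
from dataclasses import dataclass, field
from typing import Dict, List
from uuid import UUID, uuid4


@dataclass
class Folder:
    name: str
    id: UUID = field(default_factory=uuid4)
    # The aggregate holds identifiers only, never other Folder instances.
    sub_folder_ids: List[UUID] = field(default_factory=list)

    def create_sub_folder(self, name: str) -> "Folder":
        # Sub-folders are created in place, so an id ends up in one parent only.
        child = Folder(name)
        self.sub_folder_ids.append(child.id)
        return child


class FolderRepository:
    """Hypothetical in-memory repository keyed by aggregate id."""

    def __init__(self) -> None:
        self._store: Dict[UUID, Folder] = {}

    def save(self, folder: Folder) -> None:
        self._store[folder.id] = folder

    def get(self, folder_id: UUID) -> Folder:
        return self._store[folder_id]


repo = FolderRepository()
root = Folder("root")
docs = root.create_sub_folder("docs")
repo.save(root)  # each aggregate is saved on its own
repo.save(docs)

# Navigation is an explicit repository call, not a lazy property.
first_child = repo.get(root.sub_folder_ids[0])
```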

Related

Always valid domain model entails prefixing a bunch of Value Objects. Doesn't that break the ubiquitous language?

The principle of always valid domain model dictates that value object and entities should be self validating to never be in an invalid state.
This sometimes requires creating some kind of wrapper for primitive values. However, it seems to me that this might break the ubiquitous language.
For instance say I have 2 entities: Hotel and House. Each of those entities has images associated with it which respect the following rules:
Hotels must have at least 5 images and no more than 20
Houses must have at least 1 image and no more than 10
This to me entails the following classes
class House {
  HouseImages images;
  // ...
}
class Hotel {
  HotelImages images;
}
class HouseImages {
  final List<Image> images;
  HouseImages(this.images) : assert(images.length >= 1),
        assert(images.length <= 10);
}
class HotelImages {
  final List<Image> images;
  HotelImages(this.images) : assert(images.length >= 5),
        assert(images.length <= 20);
}
Doesn't that break the ubiquitous language a bit? It just feels a bit off to have all those classes that are essentially prefixed (HotelName vs HouseName, HotelImages vs HouseImages, and so on). In other words, my value object folder that once consisted of x, y, z, where x, y and z were also documented in a lexicon document, now has house_x, hotel_x, house_y, hotel_y, house_z, hotel_z, and it doesn't look quite as English as it did when it was x, y, z.
Is this common, or is there something I misunderstood here? I do like the assurance it gives though, and it actually caught some bugs too.
There is some reasoning you can apply that usually helps me when deciding to introduce a value object or not. There are two very good blog articles concerning this topic I would like to recommend:
https://enterprisecraftsmanship.com/posts/value-objects-when-to-create-one/
https://enterprisecraftsmanship.com/posts/collections-primitive-obsession/
I would like to address your concrete example based on the heuristics taken from the mentioned articles:
Is there more than one primitive value that together encapsulate a concept, i.e. things that always belong together?
For instance, a Coordinate value object would contain Latitude and Longitude, it would not make sense to have different places of your application knowing that these need to be instantiated and validated together as a whole. A Money value object with an amount and a currency identifier would be another example. On the other hand I would usually not have a separate value object for the amount field as the Money object would already take care of making sure it is a reasonable value (e.g. positive value).
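A minimal Python sketch of such a Coordinate value object follows. The field names and the exact range checks are assumptions for illustration; the point is only that both values are constructed and validated together in one place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a value object is immutable and compared by value
class Coordinate:
    latitude: float
    longitude: float

    def __post_init__(self) -> None:
        # Validation happens once, at construction, so an invalid
        # Coordinate can never exist elsewhere in the application.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


vienna = Coordinate(48.2, 16.4)
```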
Is there complexity and logic (like validation) that is worth being hidden behind a value object?
For instance, your HotelImages value object, which defines a specific collection type, caught my attention. If HotelImages is not used in different places and the logic is as simple as in your sample, I would not bother adding such a collection type but would rather do the validation inside the Hotel entity. Otherwise you would blow up your application with custom value objects for basically everything.
On the other hand, if there is a concept like an image collection which has its own meaning in the business domain and a set of business rules, and if that type is used in different places, for instance an ImageCollection value object used by both Hotel and House, it could make sense to have such a value object.
I would apply the same thinking to your question about HouseName and HotelName. If these have no special meaning or complexity outside of the Hotel and House entities but are just seen as simple properties of those entities, then in my opinion having value objects for them would be overkill. If there is something like a BuildingName with a set of rules the name has to follow, or if it even consists of several primitive values, then it would make sense again to use a value object.
This relates to the third point:
Is there actual behaviour duplication that could be avoided with a value object?
Following on from the last point, thinking about actual duplication (not code duplication but behaviour duplication) that can be avoided by extracting things into a custom value object can also make sense. But in this case you always have to be careful not to fall into the trap of incidental duplication; see also [here].
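To make the shared-value-object idea concrete, here is one possible sketch in Python rather than the question's language. ImageCollection, its min_count/max_count parameters, and the use of plain strings for images are all assumptions; the limits are the ones stated in the question (5 to 20 for hotels, 1 to 10 for houses).

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class ImageCollection:
    images: Tuple[str, ...]  # image references, e.g. file names (assumption)
    min_count: int
    max_count: int

    def __post_init__(self) -> None:
        # The size rule lives in one place; each entity supplies its own limits.
        if not self.min_count <= len(self.images) <= self.max_count:
            raise ValueError(
                f"expected {self.min_count} to {self.max_count} images, "
                f"got {len(self.images)}"
            )


class Hotel:
    def __init__(self, images: Iterable[str]) -> None:
        self.images = ImageCollection(tuple(images), min_count=5, max_count=20)


class House:
    def __init__(self, images: Iterable[str]) -> None:
        self.images = ImageCollection(tuple(images), min_count=1, max_count=10)


house = House(["front.jpg"])
```

With this shape the vocabulary stays at "images" for both entities, and no Hotel- or House-prefixed wrapper types are needed.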
Does your overall project complexity justify the additional work?
This needs to be answered from your side of course but I think it's good to always consider if the benefits outweigh the costs. If you have a simple CRUD like application that is not expected to change a lot and will not be long lived all the mentioned heuristics also have to be used with the project complexity in mind.

Binary Tree/Search Tree

I'm starting to learn data structures and algorithms in Python and I am on binary trees right now. However, I'm really confused about how to implement them. I see some implementations where there are two classes, one for Node and one for the tree itself, like so:
class Node:
    def __init__(self, data):
        self.right = None
        self.left = None
        self.data = data
class Tree:
    def __init__(self):
        self.root = None
    def insert(self, data): ...
    def preorder(self): ...
    .
    .
    .
However, I also see implementations where there is no Node class and instead all the code goes inside the Tree class, without the following:
    def __init__(self):
        self.root = None
Can someone please help me understand why there are two different ways, the pros and cons of each method, the differences between them and which method I should follow to implement a binary tree.
Thank you!
Yes, there are these two ways.
First of all, in the 1-class approach, that class is really the Node class (possibly named differently) of the 2-class approach. True, all the methods are then on the Node class, but that could also have been the case in the 2-class approach, where the Tree class would defer much of the work by calling methods on the Node class. So what the 1-class approach is really missing, is the Tree class. The name of the class can obscure this observation, but it is better to see it that way, even when the methods to work with the tree are on the Node class.
The difference becomes apparent when you need to represent an empty tree. With the 1-class approach you would have to set your main variable to None. But that None is a different concept than a tree without nodes. Because on an object that represents an empty tree one can still call methods to add nodes again. You cannot call methods on None. It also means that the caller needs to be aware when to change its own variable to None or to a Node instance, depending on whether the tree happened to be (or become) empty.
With the 2-class system this management is taken out of the hands of the caller and handled inside the Tree instance. The caller will always use the single reference to that instance, regardless of whether the tree is empty or not.
Examples
1. Creating an empty tree
1-class approach:
root = None
2-class approach:
tree = Tree()
2. Adding a first value to an empty tree
1-class approach:
# Cannot use an insert method, as there is no instance yet
root = Node(1) # Must get the value for the root
2-class approach:
tree.insert(1) # No need to be aware of a changing root
As the 1-class approach needs to get the value of the root in this case, you often see that with this approach, the root is always captured like that, even when the tree was not empty. In that case the value of the caller's root variable will not really change.
3. Deleting the last value from a tree
1-class approach:
root = root.delete(1) # Must get the new value for the root; could be None!
2-class approach:
tree.delete(1) # No need to be aware of a changing root
Even though the deletion might in general not lead to an empty tree, in the 1-class approach the caller must take that possibility into account and always assign to its own root reference, as it might become None.
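For reference, a minimal runnable version of the 2-class approach might look like the sketch below. It assumes a binary search tree with smaller values to the left; the inorder method is an addition for the demonstration, and deletion is left out to keep it short.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


class Tree:
    def __init__(self):
        self.root = None  # an empty tree is still a usable object

    def insert(self, data):
        if self.root is None:
            # The changing root is handled here, not by the caller.
            self.root = Node(data)
            return
        node = self.root
        while True:
            if data < node.data:
                if node.left is None:
                    node.left = Node(data)
                    return
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(data)
                    return
                node = node.right

    def inorder(self):
        result = []

        def visit(node):
            if node is not None:
                visit(node.left)
                result.append(node.data)
                visit(node.right)

        visit(self.root)
        return result


tree = Tree()
empty_view = tree.inorder()  # works even though there are no nodes yet
for value in (2, 1, 3):
    tree.insert(value)       # the caller never reassigns `tree`
```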
Variants for 1-class systems
1. A None value
Sometimes you see approaches where an empty tree is represented by a Node instance that has a None value, to indicate that this is not really data. Then, when the first value is added to the tree, that node's value is updated to that value. In this way, the caller no longer has to manage the root: it will always be a reference to the same Node instance, which can sometimes have a None value to indicate that the tree is actually empty.
The code for the caller to work with this 1-class variant, would become the same as the 2-class approach.
This is bad practice. In some cases you may want a tree to be able to hold None as data, and then you cannot make a tree with such a value, as it will be mistaken for a dummy node that represents an empty tree.
2. Sentinel node
This is a variant to the above. Here the dummy node is always there. The empty tree is represented by a single node instance. When the first value is inserted into the tree, this dummy node's value is never updated, but the insertion happens to its left member. So the dummy node never plays a role for holding data, but is just the container that maintains the real root of the tree in its left member. This really means that the dummy node is the parallel of what the Tree instance is in the 2-class approach. Instead of having a proper root member, it has a left member for that role, and it has no use for its right member.
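A short sketch of the sentinel variant, under the same binary-search-tree assumption as before. The helper names new_tree, insert and inorder are hypothetical; the only point is that the real root always hangs off the sentinel's left member and the sentinel's own data is never used.

```python
class Node:
    def __init__(self, data=None):
        self.data = data
        self.left = None
        self.right = None


def new_tree():
    # The sentinel: its data and right member are never used.
    return Node()


def insert(sentinel, data):
    if sentinel.left is None:
        sentinel.left = Node(data)  # the first value becomes the real root
        return
    node = sentinel.left
    while True:
        side = "left" if data < node.data else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(data))
            return
        node = child


def inorder(sentinel):
    result, stack, node = [], [], sentinel.left
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        result.append(node.data)
        node = node.right
    return result


tree = new_tree()
for value in (2, 1, 3):
    insert(tree, value)  # the caller keeps the same `tree` reference throughout
```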
How native data structures are implemented
The 2-class approach is more like how native data types work. For instance, take the standard list in Python. An empty list is represented by [], not by None. There is a clear distinction between [] and None. Once you have the list reference (like lst = []), that reference never changes. You can append and delete, but you never have to assign to lst itself.

Repositories and Roots of aggregates

I'm reading Eric Evans's book on DDD,
and I found a contradiction.
From the chapter on aggregates:
Choose one ENTITY to be the root of each AGGREGATE, and control all
access to the objects inside the boundary through the root.
From the chapter on repositories:
A subset of persistent objects must be globally accessible through a
search based on object attributes. Such access is needed for the roots
of AGGREGATES that are not convenient to reach by traversal. They are
usually ENTITIES, sometimes VALUE OBJECTS with complex internal
structure, and sometimes enumerated VALUES. Providing access to other
objects muddies important distinctions.
Provide REPOSITORIES only for AGGREGATE roots that actually need
direct access.
It can be concluded that the root of the aggregate can be:
entity
value object
enumerated values
Did I understand everything correctly?
Or is the correct reading:
Provide REPOSITORIES only for
aggregate roots
value objects
enumerated values
?
And what are enumerated values (which need their own repository!)?
Per @Marco's comment above, the root of an aggregate can only be an entity (i.e. something with an ID property). An example of this would be an Order object. No matter how many attributes you change on an Order, its identity is determined by its Id property and nothing else.
A value object (often implemented as a struct in many languages) does not have an ID. A common example of this would be a Money value object with a Dollars property and Cents property. Because it has no ID, the concept of querying it by ID does not apply, and thus the concept of a repository does not apply. An aggregate could have a value object as a property, though (e.g. the Total property on an Order aggregate).
An enumerated type is just a list of name/value pairs. It uses the enum keyword in several languages. Again, there's no ID for the enum nor any of its members, so the concept of a repository does not apply. The concept of an enum is useful in DDD because it helps express the domain model better than, say, magic numbers e.g. order.Status = OrderStatus.Submitted vs order.Status = 1.
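The three kinds of object described in this answer can be contrasted in a few lines of Python. This is an illustrative sketch only: the Draft status, the field names, and the use of dataclasses are assumptions, not something taken from the book or the answer.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):  # enumerated values: named constants, no ID
    DRAFT = 0
    SUBMITTED = 1


@dataclass(frozen=True)
class Money:  # value object: equality is based on its values
    dollars: int
    cents: int


@dataclass(eq=False)
class Order:  # entity and aggregate root: equality is based on its id
    id: int
    total: Money
    status: OrderStatus = OrderStatus.DRAFT

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Order) and self.id == other.id

    def __hash__(self) -> int:
        return hash(self.id)


order = Order(1, Money(10, 0))
order.status = OrderStatus.SUBMITTED  # reads better than order.status = 1
same_order_later = Order(1, Money(99, 99), OrderStatus.SUBMITTED)
```

Only Order has something a repository could look up; Money and OrderStatus are reached through the Order that holds them.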

Usage of a correct collection Type

I am looking for a native, or a custom-type that covers the following requirements:
A Generic collection that contains only unique objects like a HashSet<T>
It implements INotifyCollectionChanged
It implements IEnumerable<T> (duh) and must be wrappable by a ReadOnlyCollection<T> (duh, duh)
It should work with both small and large numbers of items (perhaps changing inner behaviour?)
The signature of the type must be like UniqueList<T> (like a list, not a key/value pair)
It does not have to be sortable.
Searchability is not a "must-have".
The main purpose of this is to set up a small mesh/network between related objects.
So this network can only contain unique objects, and there has to be a mechanism that notifies the application when changes in the collection happen. Since it is for a proof-of-concept, the scope is purely within the assembly (no DBs or file systems are of any importance).
What is a proper native type for this or what are the best ingredients to create a composite?
Sounds like you could just wrap HashSet<T> in your own type extremely easily, just to implement INotifyCollectionChanged. You can easily proxy everything you need - e.g. GetEnumerator can just call set.GetEnumerator() etc. Implementing INotifyCollectionChanged should just be a matter of raising the event when an element is added or removed. You probably want to make sure you don't raise the event if either you add an element which is already present or remove an element which isn't already present. HashSet<T>.Add/Remove both return bool to help you with this though.
I wouldn't call it UniqueList<T> though, as that suggests list-like behaviour such as maintaining ordering. I'd call it ObservableSet<T> or something like that.
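The wrapping idea is language-neutral, so here is a rough sketch of it in Python rather than C#. The listener callbacks stand in for the CollectionChanged event, and the subscribe method and the "add"/"remove" action strings are assumptions of this sketch, not part of any .NET API. As in the answer, a notification is raised only when the underlying set actually changed.

```python
from typing import Callable, Generic, Hashable, Iterator, List, Set, TypeVar

T = TypeVar("T", bound=Hashable)


class ObservableSet(Generic[T]):
    def __init__(self) -> None:
        self._items: Set[T] = set()
        self._listeners: List[Callable[[str, T], None]] = []

    def subscribe(self, listener: Callable[[str, T], None]) -> None:
        self._listeners.append(listener)

    def add(self, item: T) -> bool:
        if item in self._items:
            return False  # already present: no change, no notification
        self._items.add(item)
        self._notify("add", item)
        return True

    def remove(self, item: T) -> bool:
        if item not in self._items:
            return False  # not present: no change, no notification
        self._items.remove(item)
        self._notify("remove", item)
        return True

    def _notify(self, action: str, item: T) -> None:
        for listener in self._listeners:
            listener(action, item)

    # Read-only access is simply proxied to the wrapped set.
    def __iter__(self) -> Iterator[T]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)

    def __contains__(self, item: object) -> bool:
        return item in self._items


events = []
nodes: ObservableSet[str] = ObservableSet()
nodes.subscribe(lambda action, item: events.append((action, item)))
nodes.add("a")
nodes.add("a")     # duplicate: ignored silently
nodes.remove("b")  # missing: ignored silently
nodes.remove("a")
```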

Shared Domain Logic?

Take for example:
CreateOrderTicket(ByVal items As List(Of OrderItems)) As String
Where would you put this sort of logic given:
CreateOrderTicket should generate a simple list (i.e. Item Name - Item Price)
PizzaOrderItem
SaladBarOrderItem
BarOrderItem
Would you recommend:
Refactoring the common parts into an abstract class/interface with shared properties and a method called CreateOrderTicket
Or,
Creating a common service that exposes a CreateOrderTicket
We obviously would not want three CreateOrderTicket methods, but adding methods, inheriting, overloading and using generics seems like a high cost just to abstract one behaviour.
Assume for the sake of a simple example that (currently) there is no OrderItem base class or interface.
Help!! :)
p.s. Is there a way to overload without forcing all inheriting objects to use the same name?
Abstract base class sounds like the best option in this situation. Of course it all depends on what kind of shared behaviour these items have. Without knowing more, I'd guess all of these order items have Name and Price for example - and in future you might add more common stuff.
Without a shared base class which contains the Name and Price properties, you'll probably have trouble implementing a CreateOrderTicket method which takes a list containing more than one kind of order item.
Also, I don't think inheriting from an abstract base class would be that high a cost, as technically the objects already derive from the Object base class. (Though I don't think this is completely equal to a custom base class.)
VB.Net can implement methods from an interface using a different name than the one specified in the interface, but I don't think the same goes for overriding abstract functionality.
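To show the shape of the abstract base class option, here is a sketch in Python rather than VB.Net. The OrderItem base class, the constructor parameters of the two item types, and the "Name - Price" line format are assumptions based on the question's description.

```python
from abc import ABC, abstractmethod
from decimal import Decimal
from typing import Iterable


class OrderItem(ABC):
    """Shared contract: every order item exposes a name and a price."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def price(self) -> Decimal: ...


class PizzaOrderItem(OrderItem):
    def __init__(self, topping: str, price: Decimal) -> None:
        self._topping = topping
        self._price = price

    @property
    def name(self) -> str:
        return f"Pizza ({self._topping})"

    @property
    def price(self) -> Decimal:
        return self._price


class BarOrderItem(OrderItem):
    def __init__(self, drink: str, price: Decimal) -> None:
        self._drink = drink
        self._price = price

    @property
    def name(self) -> str:
        return self._drink

    @property
    def price(self) -> Decimal:
        return self._price


def create_order_ticket(items: Iterable[OrderItem]) -> str:
    # One implementation works for any mix of item types.
    return "\n".join(f"{item.name} - {item.price}" for item in items)


ticket = create_order_ticket(
    [PizzaOrderItem("margherita", Decimal("8.50")), BarOrderItem("Cola", Decimal("2.00"))]
)
```

Whether create_order_ticket lives on a service or next to the base class is a separate choice; the shared Name and Price contract is what makes a single implementation possible.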

Resources