I have the need to store arbitrary references to entities within my Raven Database. Sometimes the entity is an aggregate root (see Events below) and other times it is a value entity (see Sessions below). I'm currently planning to store the references as Lucene queries (or a Lucene-like syntax.) Has anyone done anything like this? Am I heading down a difficult path?
Some of my concerns are:
Value entities are unlikely to provide identifiers. Can I expect to reference them reliably?
Individual entities should be unaware of (decoupled from) the Arbitrary Relationship infrastructure. What is the best way to infer the queries from complex object graphs?
Limiting the relationships to aggregate roots only (and preventing references to value entities) would simplify the problem, but it would require me to restructure my Event/Session documents. I'd like these two systems to remain decoupled (the concerns of one should not impact the other).
I've included example documents below to illustrate my scenario. Any thoughts, ideas, guidance, or examples would be very appreciated.
Events Collection
{
  Id: "30f6...54a7",
  Title: "Annual Meeting",
  Sessions: [
    {
      Code: "COM001",
      Title: "Opening Ceremony"
    },
    {
      Code: "TEC201",
      Title: "Intermediate Tech"
    }
  ]
}
People Collection
{
Id: "45a8...f209",
Name: "Chad"
}
Arbitrary Relationships Collection
{
  Id: "b613...8ebb",
  SubjectEntityQuery: "People.Id:45a8...f209",
  TargetEntityQuery: "Events.Id:30f6...54a7.Sessions,Code:COM001",
  Action: "Attended Session",
  Story: "Chad attended the Opening Ceremony session"
}
Edit
I'd like to give more detail on the arbitrary relationships. We will have the ability to extend the system to respond to system events and record the interaction between two entities. We have many more entities than Events, Sessions, and People. The relationship may be a person sharing a link or a tweet about a hashtag. Effectively, the Arbitrary Relationships collection becomes a graph-like structure that allows us to see all interactions for a given entity.
This is screaming relational design.
The easiest way to do this is to make the Relationship an object whose Subject and Target fields are arrays of strings holding the IDs of the actual documents it references. This way you can take advantage of Includes to load them along with the relationship document. Either way, I don't see how storing Lucene query syntax helps here.
There may be a better way to model this, but it really depends on your business model and on the thing you are trying to achieve.
Also, you might want to get rid of GUID IDs, just use Raven's conventions.
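To make the suggestion above concrete, here is a minimal, language-neutral sketch in Python. It is not RavenDB's API: `InMemoryStore` and `load_with_includes` are hypothetical stand-ins for a document store, and the convention-style IDs such as `people/1` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    id: str
    action: str
    story: str
    # Plain string IDs of the referenced documents, not queries.
    subject_ids: list[str] = field(default_factory=list)
    target_ids: list[str] = field(default_factory=list)


class InMemoryStore:
    """Hypothetical stand-in for a document store (not RavenDB's API)."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        self._docs[doc_id] = doc

    def load_with_includes(self, relationship_id):
        """Return the relationship plus every document it references."""
        rel = self._docs[relationship_id]
        referenced_ids = rel.subject_ids + rel.target_ids
        included = {i: self._docs[i] for i in referenced_ids if i in self._docs}
        return rel, included


store = InMemoryStore()
store.put("people/1", {"Name": "Chad"})
store.put("events/1", {"Title": "Annual Meeting"})
store.put("relationships/1", Relationship(
    id="relationships/1",
    action="Attended Session",
    story="Chad attended the Opening Ceremony session",
    subject_ids=["people/1"],
    target_ids=["events/1"],
))

rel, included = store.load_with_includes("relationships/1")
print(sorted(included))  # ['events/1', 'people/1']
```

Note that this sketch only references documents that have their own ID. Pointing at a session inside an event would need some extra piece of data (for example the session code), which is a design decision the answer above does not cover.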
I am developing an online job portal using DDD patterns.
There are many "objects" that I have identified, like Users, Jobs, Roles, Expertise, ExperienceRange, Country, State, City, Address, Subscriptions, etc.
My question is: how do I figure out which of these is an entity, a value object, or an aggregate? Please advise me if you have ever faced the same dilemma.
I have made the following decision:
Entities - User, Job, SubscriptionPackage
ValueObjects - Role, Expertise, ExperienceRange, City, State, Country
I know that we should not think about persistence while doing DDD modelling, but a doubt has surfaced: should the value objects I store in the database have an id or not?
If they have an id, do they not violate the fundamental principle of value objects? And if we do not save them with ids, how do we reference them in foreign key fields?
Please help me answer these queries.
If you can suggest which of the above mentioned objects are entities, which are value objects and which are aggregates that would be great.
Thanking in advance
When thinking of DDD, leave the DB mapping to a later stage. I know I'm repeating what you said, but only because it's true. A value object might have a DB id for other reasons (normalisation, reporting, etc.).
First come up with your object model and then figure out how to map it. In some (rare) cases you might need to change your object model slightly if something is too expensive to map properly (I cannot think of an example, but I don't want to be dogmatic).
So once more, forget about the DB and think about objects. Why does an entity have an id? I would say so that it can later be retrieved and modified while keeping the same identity.
An object is a VO when its identity is implicit in its values. Does it make sense for a User to have an id? What about an Address? Or a City? It depends.
Take a City value object as an example: if you need to map it as an FK to a 'cities' table, then your City object will probably have an id, but the id is not exposed; it is an implementation detail. The user id, by contrast, would be exposed. For example, a city might be linked to a province/state, and that to a country.
But in another application, where users can add cities and information about them, the city might be an Entity or even an Aggregate. It really depends on your requirements.
Having said that, the list of Entities and VOs you provided looks ok in a general way, but I don't know your requirements.
To answer the first question: you can read Entities, Value Objects, Aggregates and Roots as there are some rules about what is a VO, Entity or Aggregate. The difficulty comes from how to apply them, and experience is the solution to that.
As a summary:
Entities
Many objects are not fundamentally defined by their attributes, but rather by a thread of continuity and identity.
Value Objects
Many objects have no conceptual identity. These objects describe characteristics of a thing.
Aggregates
Aggregates draw a boundary around one or more Entities. An Aggregate enforces invariants for all its Entities for any operation it supports.
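As a rough illustration of the three definitions above, here is a small Python sketch using names from the question (`ExperienceRange`, `Job`, `User`). The field names and the `MAX_OPEN_APPLICATIONS` invariant are assumptions invented for the example, not requirements from the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperienceRange:
    """Value object: immutable and compared by its values."""
    min_years: int
    max_years: int


@dataclass(eq=False)
class Job:
    """Entity: compared by identity; its attributes may change over time."""
    job_id: str
    title: str
    experience: ExperienceRange

    def __eq__(self, other):
        return isinstance(other, Job) and self.job_id == other.job_id

    def __hash__(self):
        return hash(self.job_id)


class User:
    """Aggregate root: changes to its applications go through it."""
    MAX_OPEN_APPLICATIONS = 2  # hypothetical invariant, for illustration only

    def __init__(self, user_id: str):
        self.user_id = user_id
        self._applied_job_ids: list[str] = []

    def apply_to(self, job: Job) -> None:
        # The aggregate enforces its invariant on every operation.
        if len(self._applied_job_ids) >= self.MAX_OPEN_APPLICATIONS:
            raise ValueError("too many open applications")
        self._applied_job_ids.append(job.job_id)


# Two value objects with the same values are interchangeable.
print(ExperienceRange(2, 5) == ExperienceRange(2, 5))  # True

# Two entity instances with the same id are the same job, even if attributes differ.
old = Job("job-1", "Developer", ExperienceRange(2, 5))
new = Job("job-1", "Senior Developer", ExperienceRange(5, 8))
print(old == new)  # True
```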
How do you solve a situation when you have multiple representations of same object, depending on a view?
For example, let's say you have a book store. Within a book store, you have 2 main representations of Books:
1. In Lists (search results, browse by category, author, etc.): This is a compact representation that might have some aggregates, for example NumberOfAuthors and NumberOfReviews. Each Author and Review is an entity itself, saved in the DB.
2. DetailsView: here you wouldn't have aggregates but the real values for each Author, as Book has a property AuthorsList.
Case 2 is clear: you get everything from the DB and show it. But how do you solve case 1 if you want to reduce the number of connections and the payload to/from the DB, that is, if you don't want to get all the actual Authors and Reviews from the DB but just 2 ints, one count for each?
A fully normalized solution would be case 2, but case 1 seems to require either some denormalization or creating 2 different entities, BookDetails and BookCompact, within the Business Layer.
Important: I am not talking about View DTOs, but actually getting data from DB which doesn't fit into Business Layer Book class.
For me it sounds like multiple Query Models (QM).
I use DDD in a CQRS/ES style, so aggregate roots produce events in response to the commands passed in. Multiple QMs subscribe to those events, so I create multiple "views" based on requirements.
ES (event sourcing) is very powerful: I can introduce additional QMs later by replaying the stored events.
It sounds like managing a lot of similar, or even duplicate, data, but it makes sense to me.
QMs can be, and are, optimized to contain just enough data, structure, and indexes for a given purpose. This is the way out of the "shared data model". I see a huge evil in the one-RDBMS-model-for-everything approach: you will always get lost in the complexity of managing a shared model, as you are now.
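A minimal Python sketch of the idea, applied to the book store from the question. The event names (`AuthorAdded`, `ReviewAdded`), the two view classes, and the `replay` helper are all hypothetical; a real CQRS/ES setup would use a persistent event store and subscriptions rather than an in-memory list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorAdded:
    book_id: str
    author: str


@dataclass(frozen=True)
class ReviewAdded:
    book_id: str
    text: str


class BookListView:
    """Query model for lists: keeps only the counts per book."""

    def __init__(self):
        self.rows = defaultdict(lambda: {"NumberOfAuthors": 0, "NumberOfReviews": 0})

    def handle(self, event):
        if isinstance(event, AuthorAdded):
            self.rows[event.book_id]["NumberOfAuthors"] += 1
        elif isinstance(event, ReviewAdded):
            self.rows[event.book_id]["NumberOfReviews"] += 1


class BookDetailsView:
    """Query model for the details page: keeps the full author list."""

    def __init__(self):
        self.authors = defaultdict(list)

    def handle(self, event):
        if isinstance(event, AuthorAdded):
            self.authors[event.book_id].append(event.author)


def replay(events, *query_models):
    """Feed every stored event to every query model, in order."""
    for event in events:
        for qm in query_models:
            qm.handle(event)


event_store = [
    AuthorAdded("book-1", "Ann"),
    AuthorAdded("book-1", "Bob"),
    ReviewAdded("book-1", "Great read"),
]
list_view, details_view = BookListView(), BookDetailsView()
replay(event_store, list_view, details_view)

print(list_view.rows["book-1"])        # {'NumberOfAuthors': 2, 'NumberOfReviews': 1}
print(details_view.authors["book-1"])  # ['Ann', 'Bob']
```

A new query model added later only needs its own `handle` method and one more `replay` over the same events.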
I had a very good result with the following design:
domain package, which contains the @Entity classes holding all the data stored in the database
dto package, which contains the view(s) of an entity that will be returned from the service
The DTO should have a constructor that takes the entity as a parameter. To copy data more easily you can use BeanUtils.copyProperties(domainClass, dtoClass); (source first, as in Spring's BeanUtils; Apache Commons BeanUtils expects the destination first).
By doing this you are sharing only minimal amount of information and it is returned in object which does not have any functionality.
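The answer above is about Java; the same shape can be sketched in Python for illustration. The `Book` fields, the `apply_discount` method, and the `from_entity` factory are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Book:
    """Domain entity: holds everything that is stored, plus behaviour."""
    id: int
    title: str
    isbn: str
    internal_cost: float

    def apply_discount(self, pct: float) -> None:
        self.internal_cost *= 1 - pct / 100


@dataclass(frozen=True)
class BookDto:
    """View of the entity returned from the service: data only, no behaviour."""
    id: int
    title: str

    @classmethod
    def from_entity(cls, book: Book) -> "BookDto":
        # Copy only the fields this view is meant to expose.
        return cls(id=book.id, title=book.title)


dto = BookDto.from_entity(Book(1, "Domain-Driven Design", "hypothetical-isbn", 10.0))
print(dto)  # BookDto(id=1, title='Domain-Driven Design')
```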
I'm currently designing a backend for a social networking-related application in REST. I'm very intrigued by the DDD principle. Now let's assume I have a User object who has a Collection of Friends. These can be thousands if the app and the user would become very successful. Every Friend would have some properties as well, it is basically a User.
Looking at the DDD Cargo sample application, the fully expanded Cargo object is stored in and retrieved from the CargoRepository from time to time. If there is a list in the aggregate root, it will grow over time and eventually trigger an OOM. This is why pagination and lazy loading exist if you approach the problem from a data-centric point of view. But how can you cope with these large collections in a persistence-unaware DDD model?
As @JefClaes mentioned in the comments: You need to determine whether your User AR indeed requires a collection of Friends.
Ownership does not necessarily imply that a collection is necessary.
Take an Order / OrderLine example. An OrderLine has no meaning without being part of an Order. However, the Customer that an Order belongs to does not have a collection of Orders. It may, possibly, have a collection of ActiveOrders if a customer is limited to a maximum number (or amount) of active orders. Keeping a collection of historical orders would be unnecessary.
I suspect the large collection problem is not limited to DDD. If one were to receive an Order with many thousands of lines there may be design trade-offs but the order may much more likely be simply split into smaller orders.
In your case I would assert that the inclusion / exclusion of a Friend has very little to do with the consistency of the User AR.
Something to keep in mind is that as soon as you start using your domain model for querying, you start running into weird sorts of problems. So always try to think in terms of some read/query model with a simple query interface that can access your data directly without using your domain model. This may simplify things.
So perhaps a Relationship AR may assist in this regard.
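One way to read the Relationship AR suggestion is sketched below in Python. The `Friendship` and `FriendshipRepository` names, and the rule that a user cannot befriend themselves, are assumptions for illustration; the point is only that the User aggregate no longer carries a Friends collection.

```python
from dataclasses import dataclass


@dataclass
class User:
    """User aggregate root: note there is no friends collection here."""
    user_id: int
    name: str


@dataclass(frozen=True)
class Friendship:
    """Separate aggregate: references users by ID only."""
    user_id: int
    friend_id: int

    def __post_init__(self):
        # Hypothetical invariant owned by the relationship itself.
        if self.user_id == self.friend_id:
            raise ValueError("a user cannot befriend themselves")


class FriendshipRepository:
    """In-memory stand-in; a real one would query the database."""

    def __init__(self):
        self._items: set[Friendship] = set()

    def add(self, friendship: Friendship) -> None:
        self._items.add(friendship)

    def count_for(self, user_id: int) -> int:
        return sum(1 for f in self._items if f.user_id == user_id)


friendships = FriendshipRepository()
friendships.add(Friendship(user_id=1, friend_id=2))
friendships.add(Friendship(user_id=1, friend_id=3))
print(friendships.count_for(1))  # 2
```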
If paging or other optimization techniques are part of your domain, there is nothing wrong with designing domain classes with this ability.
Some solutions I've thought about
If User is the aggregate root, you can add a method such as GetUserWithFriends(int userId, int firstFriendNo, int lastFriendNo) to your UserRepository, encapsulating the construction of that specific user object. In the same way, you can also populate the user model with counters and so on.
On the other side, it is possible to implement lazy loading for User instance's _friends field. Thus, User instance can itself decide which "part" of friends list to load.
Finally, you can use UserRepository to get all friends of certain user with respect to paging or other filtering conditions. It doesn't violate any DDD principles.
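The last option, asking the repository for one page of friends, could look roughly like this in Python. The in-memory dictionaries stand in for a database, and treating `first_friend_no` and `last_friend_no` as 0-based and inclusive is an assumption, since the answer does not define them.

```python
from dataclasses import dataclass


@dataclass
class User:
    user_id: int
    name: str


class UserRepository:
    """In-memory stand-in for a repository backed by a database."""

    def __init__(self, users, friend_ids_by_user):
        self._users = {u.user_id: u for u in users}
        # user_id -> ordered list of friend IDs
        self._friend_ids = friend_ids_by_user

    def get_friends(self, user_id: int, first_friend_no: int, last_friend_no: int):
        """Return friends numbered first..last (0-based, inclusive; an assumption)."""
        ids = self._friend_ids.get(user_id, [])[first_friend_no:last_friend_no + 1]
        return [self._users[i] for i in ids]


users = [User(i, f"user{i}") for i in range(1, 6)]
repo = UserRepository(users, {1: [2, 3, 4, 5]})

page = repo.get_friends(1, 0, 1)
print([u.name for u in page])  # ['user2', 'user3']
```

Only the requested slice of friends is materialized, so the size of the full friends list never has to fit in the User object.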
DDD is too broad a topic to claim that it is not suited for CRUD. When programming in a DDD way, you should always take technical limitations into account and adapt your domain to satisfy them.
Do not optimize prematurely. If you are worried about heavy load, benchmark your application and run stress tests.
You need to have a table like so:
friends
id, user_id1, user_id2
to handle the n-m relation. Index your fields there.
Also, you need to be aware of whether friendship is symmetrical. If so, then you need a single row for two people if they are friends. If not, then you might have one row showing that a user is friends with the other user. If the other person considers the first a friend as well, you need another row.
Lazy loading can be achieved with hidden (AJAX) requests, so users will have the impression that it is faster than it really is. However, I would not worry about such problems for now, as later you can migrate the content of the tables to a new structure which is unknown now due to the infinite possible evolutions of your project.
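For the symmetrical case, a runnable sketch with Python's built-in sqlite3 module is shown below. Storing each pair with the smaller ID first, the CHECK and UNIQUE constraints, and the helper names are design assumptions added for the example; the answer only specifies the three columns and that they should be indexed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (
        id INTEGER PRIMARY KEY,
        user_id1 INTEGER NOT NULL,
        user_id2 INTEGER NOT NULL,
        CHECK (user_id1 < user_id2),      -- one canonical row per pair
        UNIQUE (user_id1, user_id2)       -- also serves as an index on user_id1
    );
    CREATE INDEX idx_friends_user2 ON friends (user_id2);
""")


def add_friendship(a: int, b: int) -> None:
    """Store a symmetrical friendship as a single row, smaller ID first."""
    low, high = sorted((a, b))
    conn.execute(
        "INSERT OR IGNORE INTO friends (user_id1, user_id2) VALUES (?, ?)",
        (low, high),
    )


def friends_of(user_id: int) -> list[int]:
    """Look the user up on either side of the pair."""
    rows = conn.execute(
        "SELECT user_id2 FROM friends WHERE user_id1 = ? "
        "UNION SELECT user_id1 FROM friends WHERE user_id2 = ?",
        (user_id, user_id),
    )
    return sorted(r[0] for r in rows)


add_friendship(2, 1)
add_friendship(1, 2)  # same pair, ignored
add_friendship(1, 3)
print(friends_of(1))  # [2, 3]
print(friends_of(2))  # [1]
```

For a non-symmetrical relation you would drop the CHECK constraint and the sorting, and insert one row per direction.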
Your aggregate root can have a collection of different objects that will only contain a small subset of the information, as reference to the actual business objects. Then when needed, items can be used to fetch the entire information from the underlying repository.
I'm a little confused about inheritance and relationships in Core Data, and I was hoping someone could point me to the right path. In my app I have created 3 entities, and none of them have (or are supposed to have) common properties, but there is going to be a save and a load button for all the work that the user does. From my understanding I need to "wrap" all the entities' "work" into an object which will be used to save and load. My question is: do I need to create relationships between the entities? I have to relate them somehow, and this is what makes sense to me. Is my logic correct?
I'm implementing a budget calculator. So that everyone understands what my issue is, I'm going to give a practical example; please correct me if my logic is incorrect:
Let's just say you are a fruit seller, so it's normal to have a database of clients and also a fruit database with the kinds of fruit you sell. From my understanding I find three entities here:
Client with properties named: name, address, phone, email, etc.
Stock with properties named: name, weight, stock, cost, supplier, etc.
TheBudget with properties named: name, amount, type, cost, delivery, etc.
I didn't list all the properties because I think you get the point. As you can see, there are only two properties I could inherit; the rest are different. So, if I were doing a budget for a client, I could have as many clients as I want and also any amount of stock, but what about the actual budget?
I'm sorry if my explanation was not very clear, but if it was, what kind of relationships should I be creating? I think Client and TheBudget have a connection. What do you advise?
That's not entirely correct, but some parts are on the right track. I've broken your question down into three parts: relationships, inheritance and the Managed Object Context to hopefully help you understand each part separately:
Relationships
Relationships are usually used to indicate that one entity can 'belong' to another (e.g. an employee can belong to a company). You can set up multiple one-to-many relationships (e.g. an employee belongs to a company and a boss) and you can set up the inverse relationships (better described with the word 'owns' or 'has', such as 'one company has many employees').
There are many even more complicated relationships depending on your needs and a whole set of delete rules that you can tell the system to follow when an entity in a relationship is deleted. When first starting out I found it easiest to stick with one-to-one and one-to-many relationships like I've described above.
Inheritance
Inheritance is best described as a sort of base template that is used for other, more specific entities. You are correct in stating that you could use inheritance as a sort of protocol to define some basic attributes that are common across a number of entities. A good example of this would be having a base class 'Employee' with attributes 'name', 'address' and 'start date'. You could then create other entities that inherit from this Employee entity, such as 'Marketing Rep', 'HR', 'Sales Rep', etc. which all have the common attributes 'name', 'address' and 'start date' without creating those attributes on each individual entity. Then, if you wanted to update your model and add, delete or modify a common attribute, you could do so on the parent entity and all of its children will inherit those changes automatically.
Managed Object Context (i.e. saving)
Now, onto the other part of your question/statement: wrapping all of your entities into an object which will be used to save and load. You do not need to create this object; Core Data uses the NSManagedObjectContext (MOC for short) specifically for this purpose. The MOC is tasked with keeping track of the objects you create, delete and modify. In order to save your changes, you simply call the save: method on your MOC.
If you post your entities and what they do, I might be able to suggest ways to set them up in Core Data. You want to do your best to set up as robust a Core Data model as you can during the initial development process. The OS needs to be able to 'upgrade' the backing store to incorporate any changes you've made between your Core Data model revisions. If you do a poor job of setting up your Core Data model initially and release your code that way, it can be very difficult to make a complicated model update when the app is in the wild (as you've probably guessed, this is advice born out of painful experience :).
In Domain Driven Design are collection properties of entities allowed to have partial values?
For example, should properties such as Customer.Orders, Post.Comments, Graph.Vertices always contain all orders, comments, vertices or it is allowed to have today's orders, recent comments, orphaned vertices?
Correspondingly, should Repositories provide methods like
GetCustomerWithOrdersBySpecification
GetPostWithCommentsBefore
etc.?
I don't think that DDD tells you to do or not to do this. It strongly depends on the system you are building and the specific problems you need to solve.
I haven't even heard of patterns for this.
From a subjective point of view, I would say that entities should be complete by definition (allowing for lazy loading), and could be completely or partially loaded into DTOs to optimize the amount of data sent to clients. But I wouldn't mind loading partial entities from the database if it solved some problem.
Remember that Domain-Driven Design also has a concept of services. For performing certain database queries, it's better to model the problem as a service than as a collection of child objects attached to a parent object.
A good example of this might be creating a report by accepting several user-entered parameters. It would be easier to model this as:
CustomerReportService.GetOrdersByOrderDate(Customer theCustomer, Date cutoff);
than like this:
myCustomer.OrdersCollection.SelectMatching(Date cutoff);
Or to put it another way, the DDD model you use for data entry does not have to be the same as the DDD model you use for reporting.
In highly scalable systems, it's common to separate these two concerns.
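The service approach can be sketched in Python as follows. The `Order` fields, the in-memory list standing in for a database query, and the choice that the cutoff means "on or after" are assumptions; the answer only gives the method name and its two parameters.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    customer_id: int
    order_date: date
    total: float


class CustomerReportService:
    """Query-side service: answers reporting questions without loading
    a Customer entity together with its full orders collection."""

    def __init__(self, orders):
        self._orders = list(orders)  # stand-in for a database query

    def get_orders_by_order_date(self, customer_id: int, cutoff: date):
        """Orders for the customer placed on or after the cutoff (an assumption)."""
        return [
            o for o in self._orders
            if o.customer_id == customer_id and o.order_date >= cutoff
        ]


service = CustomerReportService([
    Order(1, date(2023, 1, 10), 50.0),
    Order(1, date(2023, 6, 1), 75.0),
    Order(2, date(2023, 6, 2), 20.0),
])
recent = service.get_orders_by_order_date(1, date(2023, 3, 1))
print([o.total for o in recent])  # [75.0]
```

The Customer entity used for data entry stays small, while the reporting side is free to query whatever shape it needs.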