I am new to DDD, reading literature on it, but having trouble applying some concepts.
I am presenting a simplified view of the app I am building. It is a home loan application system. The UI has wizard-like steps to collect information: step 1 collects applicant info, step 2 collects property info, step 3 captures the decision to approve or decline. Each application gets assigned a unique ID at step 1. My challenge is how to model incremental saves from each step.
Loan application is my aggregate root. From what I read, each root has only one repository, and the entire root has to be saved together so that it is valid. However, the UI collects info incrementally, and at each step the application entity is valid: when I save data from step 1, my loan application object is valid, and when data from step 2 is saved, it is still valid.
I am looking for some advice on how to design the API and repository here. If the aggregate root is valid at each step and can be saved in small steps, then what's the point of exposing one save API? Should there be three separate APIs exposed to the UI that call three separate repository classes, or one API calling three separate methods on one repository? I am using Entity Framework to save to the DB.
Thank you.
The requirements of the application and the way you model this will impact how you do it.
(Note: I will use AR for aggregate root)
In your case, if you have a LoanApplication AR, it needs to contain information about the Applicant and the Property.
Let's say that each Applicant will have something like a unique account, so you can keep track of how many loans an Applicant has.
In this scenario the Applicant will be an entity and probably an AR. It will have its own repository: ApplicantRepository. In this case the Applicant AR needs to be referenced from your LoanApplication AR.
This means that in the first step of your wizard you can either search for an Applicant using the ApplicantRepository or create a new one. In case of creating a new one, you can create and save it in the first step. Later you can reference the Applicant (by reference or by ID) from the LoanApplication.
If you don't want that, then the Applicant can be a value object and its information will be stored in the LoanApplication AR.
The same thing applies to the Property: you can either have a Property entity with a PropertyRepository, or just store a PropertyInfo value object in the LoanApplication AR.
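A minimal TypeScript sketch of the two Applicant options (all names are illustrative, not from the question):

    // Option A: Applicant is its own aggregate root; the LoanApplication
    // only holds a reference to it by ID.
    class LoanApplicationByRef {
        constructor(public applicantId: string) {}
    }

    // Option B: Applicant is a value object stored inside the LoanApplication AR.
    interface ApplicantInfo {
        name: string;
        yearlyIncome: number;
    }

    class LoanApplicationEmbedded {
        constructor(public applicant: ApplicantInfo) {}
    }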
Another important thing is that your LoanApplication has a life-cycle. Depending on its current state, its invariants may change. It's OK to have an AR that goes through different phases. Take online stores, for example. When you order something, say from Amazon, your order can be Approved or Pending (or in other states), and this is part of its life-cycle. When you want to complete the order, the system may ask you for payment details before submitting it.
In your case you can create a life-cycle for the LoanApplication by giving it a Status: Pending, SubmittedForApproval, Approved, Rejected etc. It may also need additional information, such as why it was Rejected.
If you want to save information about the process of creating a LoanApplication, you can assign a Status that represents the fact that your LoanApplication is still being created: Pending or InProgress. This way, if your application crashes in the middle, it can resume by loading the LoanApplication and checking its Status.
You can add behavior to the LoanApplication AR related to its status. For example, it won't allow a transition to the Approved status if it's not in the SubmittedForApproval status, and it won't allow changing the Property if it's not in the InProgress status (you don't want to change the Property or Applicant when the status is Approved or SubmittedForApproval).
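For example, a minimal sketch of those guards in TypeScript (the statuses and the PropertyInfo shape are illustrative; your invariants will differ):

    // Hypothetical value object for the property collected in step 2.
    interface PropertyInfo {
        address: string;
    }

    enum LoanApplicationStatus {
        InProgress = "InProgress",
        SubmittedForApproval = "SubmittedForApproval",
        Approved = "Approved",
        Rejected = "Rejected",
    }

    class LoanApplication {
        private status = LoanApplicationStatus.InProgress;
        private property?: PropertyInfo;

        // Step 2 of the wizard: the Property may only change while the
        // application is still being filled in.
        assignProperty(property: PropertyInfo): void {
            if (this.status !== LoanApplicationStatus.InProgress) {
                throw new Error("Property can only change while InProgress");
            }
            this.property = property;
        }

        // Step 3: approval is only a valid transition after submission.
        approve(): void {
            if (this.status !== LoanApplicationStatus.SubmittedForApproval) {
                throw new Error("Only a SubmittedForApproval application can be Approved");
            }
            this.status = LoanApplicationStatus.Approved;
        }
    }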
Implementing Save:
If you have decided that the LoanApplication aggregate will contain three entities: LoanApplication, Applicant and Property, then all these entities are saved and loaded together, because the aggregate is a Transactional Boundary. This guideline can help in implementing Save, but it can be tricky.
It will depend on several factors:
What is your database (SQL, MongoDB)?
Are you using a framework (NHibernate, Mongoose) or a native API (raw SQL)?
Since only the LoanApplicationRepository will save the LoanApplication together with its Applicant and Property, there is no problem even if the Applicant is saved again to the DB, as there will be no changes to it. You will only overwrite existing data with the same data; not great for performance, but not an issue for your logic.
On the other hand, if you are using an ORM, it can detect changes in objects and generate only the queries required to write the new changes to the database. In this case, if you, say, add a Property to a LoanApplication, it will pick that up and update only the Property in the database.
For example, if you are using an SQL database with (N)Hibernate or Entity Framework, your ORM will track that a Property was added and generate SQL to insert it into the Properties table, but it won't generate an insert or update for the already existing Applicant, because it didn't change.
If you are writing your own persistence logic, using for example raw SQL, then you will have to write the change-tracking logic yourself.
One way is to add a changes collection to the LoanApplication AR that contains changes/events (ApplicantAssigned, PropertyAssigned, SomethingChanged etc.) so that the Save method can generate SQL based on those changes. When you save the aggregate, you can clear the changes.
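A rough TypeScript sketch of that idea (the change names are the hypothetical ones from above; the SQL is indicated only in comments):

    // Hypothetical change/event types recorded by the aggregate.
    type DomainChange =
        | { kind: "ApplicantAssigned"; applicantId: string }
        | { kind: "PropertyAssigned"; propertyId: string };

    class LoanApplication {
        private changes: DomainChange[] = [];

        assignApplicant(applicantId: string): void {
            // ...mutate the aggregate's state here...
            this.changes.push({ kind: "ApplicantAssigned", applicantId });
        }

        pendingChanges(): readonly DomainChange[] {
            return this.changes;
        }

        clearChanges(): void {
            this.changes = [];
        }
    }

    class LoanApplicationRepository {
        save(app: LoanApplication): void {
            for (const change of app.pendingChanges()) {
                switch (change.kind) {
                    case "ApplicantAssigned":
                        // generate and execute "UPDATE LoanApplications SET ApplicantId = ..." here
                        break;
                    case "PropertyAssigned":
                        // generate and execute "INSERT INTO Properties ..." here
                        break;
                }
            }
            app.clearChanges(); // the aggregate is clean after a successful save
        }
    }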
Here is a great essay on modeling aggregates:
http://dddcommunity.org/library/vernon_2011/.
Here are a couple on domain events:
https://www.martinfowler.com/eaaDev/DomainEvent.html
https://lostechies.com/jimmybogard/2014/05/13/a-better-domain-events-pattern/
Related
Suppose I have database tables Customer, Order, Item. I have an OrderRepository that accesses, directly with SQL/my ORM, both the Order and Item tables. E.g. I could have a method getItems on the OrderRepository that returns all items of an order.
Suppose I now also create an ItemRepository. Given I now have 2 repositories accessing the same database table, is that generally considered poor design? My thinking is, sometimes a user wants to update the details of an Item (e.g. its name), but when using the OrderRepository, it doesn't really make sense to not be able to access the items directly (you want to know about all the items in an order).
Of course, the OrderRepository could internally create* an ItemRepository and call methods like getItemsById(ids: string[]). However, consider the case where I want to get all orders and items ever purchased by a Customer. Assuming you had the orderIds for a customer, you could have a getOrders(ids: string[]) on the OrderRepository to fetch all the orders, and then do a second query to fetch all the Items. I feel you make your life harder (and less efficient) in the sense that you have to do the join to match items with orders in app code rather than in SQL.
If it's not considered bad practice, is there some kind of limit to how much overlap repositories should have with each other? I've spent a while trying to search for this on the web, but it seems all the tutorials/blogs/videos really don't go further than 1 table per entity (which may be an anti-pattern).
Or am I missing a trick?
Thanks
FYI: using express with TypeScript (not C#)
* Is a repository creating another repository considered acceptable? Shouldn't only the service layer do that?
It's difficult to separate the database model from the DDD design, but you have to.
In your example:
GetItems should have a signature like OrderRepository.GetItems(ids: int[]) : ItemEntity[]. Note that this method returns entities (not DAOs from your ORM). To hydrate the ItemEntity objects, the method might pull information from several DAOs (tables, through your ORM), but it should only pull what it needs for the entities' hydration.
Say you want to update an item's name using the ItemRepository; the signature for that could look like ItemRepository.rename(id: int, name: string) : void. When this method does its work, it could change the same table as GetItems above, but note that it could also change other tables as well (for example, it could add an audit record of the change to an AuditTable).
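Since you mentioned TypeScript, the two repositories might be sketched like this (all names are illustrative):

    interface ItemEntity {
        id: number;
        name: string;
    }

    interface OrderRepository {
        // Returns hydrated entities, not raw ORM rows; internally it may
        // join several tables, but only pulls what the entities need.
        getItems(ids: number[]): Promise<ItemEntity[]>;
    }

    interface ItemRepository {
        // May write to the same Items table that OrderRepository.getItems
        // reads from, and possibly to other tables (e.g. an audit table).
        rename(id: number, name: string): Promise<void>;
    }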
DDD gives you the ability to use different tables for different Contexts if you want. It gives you enough flexibility to make really bold choices when it comes to the infrastructure that surrounds your domain. So ultimately, it's a matter of what makes sense for your specific situation and team. Some teams would apply CQRS, and then the GetItems and rename methods would look completely different under the covers.
Situation:
We have a classic Order with OrderLines. Each OrderLine has a reference to a ProductId.
Each Product has its RelatedProduct. For example, a product looks like:
    class Product {
        string Id;
        string Name;
        string RelatedProductId;
        decimal RelatedProductQuantity;
        ...
    }
There is a business rule that whenever a Product is added to an Order with a new OrderLine, the Product with id = RelatedProductId should also be added, in a quantity = RelatedProductQuantity.
Questions:
How do we keep this rule within the domain so it doesn't spill over into an application service, while at the same time keeping the Order aggregate clean, in the sense of not poisoning it by injecting a repository or any other data-fetching mechanism?
Should we use a domain service? And if so, can the domain service have a repository injected, prepare all the data, create the OrderLines (for both the base and the related product), fill in the aggregate and save it to the repository?
If none of the above, what's the best way to model it?
There are two common patterns that you will see here:
Fetch a copy of the information in your application code, then pass that information to the domain model as an argument
Pass the capability to fetch the information as an argument to the domain model
The second option is your classic "domain service" approach, where you use a "stateless" instance to fetch a copy of "global" state.
But with the right perspective, you might recognize that the first approach is the same mechanism - only it's the application code, rather than the domain code, that fetches the copy of the information.
In both cases, it's still the domain model deciding what to do with the copy of the information, so that's all right.
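To make the two options concrete, here is a minimal TypeScript sketch (all names are illustrative; ProductSnapshot stands in for whatever copy of product data gets passed around):

    interface ProductSnapshot {
        id: string;
        relatedProductId: string;
        relatedProductQuantity: number;
    }

    class Order {
        private lines: { productId: string; quantity: number }[] = [];

        // Option 1: the application code already fetched the product,
        // so the domain only receives plain data.
        addProduct(product: ProductSnapshot): void {
            this.lines.push({ productId: product.id, quantity: 1 });
            // The business rule lives inside the aggregate:
            this.lines.push({
                productId: product.relatedProductId,
                quantity: product.relatedProductQuantity,
            });
        }

        // Option 2: the domain is handed the capability to fetch
        // (a "domain service" reduced to a function).
        async addProductById(
            productId: string,
            findProduct: (id: string) => Promise<ProductSnapshot>
        ): Promise<void> {
            this.addProduct(await findProduct(productId));
        }
    }

In both variants, the decision that the related product must be added stays inside the Order.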
Possible tie breakers:
If the information you need to copy isn't local (i.e. you are dealing with a distributed system, and the information isn't available in a local cache), then fetching that information will have failure modes, and you probably don't want to pollute the domain model with a bunch of code to handle that (in much the same way that you don't pollute your domain code with a bunch of database-related concerns).
When it's hard to guess in advance which arguments are going to be passed to fetch the data, it may make sense to let the domain code invoke that function directly. Otherwise, you end up with the application code asking the domain model for the arguments and then passing the information back into the model, and this could ping-pong back and forth several times.
(Not that it can't be done: you can make it work - what's less clear is how happy you are going to be maintaining the code).
If you aren't sure... use the approach that feels more familiar.
I'm currently getting started with NodeJS, MongoDB, Mongoose etc. and have a few questions about Mongoose schema/model best practices:
Is it acceptable to have multiple schemas/models for a single view? Let's take a calendar app for example. I have a single view for the calendar and several models, like the CalendarModel (stores calendarID, color, owner etc.) and the CalendarEventModel (contains info on the events to be shown in the calendar). Is this a suitable approach?
In the above example, should there be a controller for each model/schema, or is it considered acceptable to have one controller that controls both models/schemas and puts them together into the single view that I have?
Would it be a good alternative to have one CalendarModel and store all CalendarEvents within that model, so that the model would consist of info like this: calendarID, owner, color, eventsList?
Thank you! :)
There is no one simple answer to this problem. It really depends on the requirements. Here are a few pointers.
Both approaches have pros and cons, but I would say it is a suitable approach.
Single View with multiple schemas/models
Pro:
Allows grouping data that changes together in a schema
Simple, because there is only one view to use and everything is present
Con:
A single view is great, but changing data requires the whole view to be reloaded
A single view may not be reusable, because it will tend to contain a lot of useless info if reused
Multiple Views with multiple schemas/models
Pro:
Very flexible and reusable.
More control over the size of the data.
A good grouping of data that changes together (granularity)
Con:
More complex to manage
The reusability may never be needed
Maybe overkill
It depends on the job the controller does for the app. For this question, I would really ask myself what the goal of the controller is. A big controller is simpler, but it does multiple things and may quickly become a mess to maintain.
If you change a model and you have a single controller, you have to change that controller, and you may break it for some other functionality.
If you change a model and you have multiple controllers, you still change a single point and it is more controlled, but you are more exposed to creating side effects on other views and controllers.
This is a question of data.
Single Model
Pro:
Simple
Load once get all
Less roundtrip
All centralized
Con:
Forced to store events and calendar in the same database
No caching possible
Large transfer size each time something changes
A new event requires an update of the calendar document
Multiple Models
Pro:
More control over the database storage
Possible to fetch the data in pieces, keeping its size down
Possible to cache stable data
Con:
More complex for migration and data coherence
More roundtrips to the database, or aggregation required
Maybe overkill, with more queries and assembly
My approach to this example, without knowing the exact requirements:
I would separate the events from the main model.
This allows updating and reloading events without reloading the calendar.
Since the calendar does not change a lot, I would cache it (in Redis or in-process state), avoiding the database load.
I could load events per month or year, which is great for keeping the payload small (see the sketch below).
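For illustration, a minimal Mongoose sketch of that split and the per-month loading (the schema fields are assumptions, not from the question):

    import mongoose, { Schema } from "mongoose";

    const Calendar = mongoose.model("Calendar", new Schema({
        owner: String,
        color: String,
    }));

    const CalendarEvent = mongoose.model("CalendarEvent", new Schema({
        calendarId: Schema.Types.ObjectId,
        name: String,
        start: Date,
    }));

    // Load a single month of events, keeping the payload small.
    // (JavaScript Date months are 0-based: January is 0.)
    async function eventsForMonth(calendarId: string, year: number, month: number) {
        return CalendarEvent.find({
            calendarId,
            start: { $gte: new Date(year, month, 1), $lt: new Date(year, month + 1, 1) },
        }).exec();
    }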
If the requirements were fixed, one controller and one view would be good, but requirements rarely stay fixed. Without over-engineering, I would separate events and calendar into two controllers. When I add, say, a custom font feature per calendar, this changes the calendar model and controller only.
The view can be a single instance for the global overview, but if I also have a detailed event view, I would add another view. Each view has a single purpose; reuse is possible but not forced.
Lastly, business rules should stay in one layer and not leak between layers. For instance, the rule that two events cannot be on the same day may be enforced by the controller. Another rule, saying that an event may only be moved to an existing calendar, should also live in the controller layer if that is what you decide.
Notes: this is my opinion, and there exist multiple opinions on the subject. Also, it is really dependent on the requirements. A good solution for one app/API may be the worst solution for another.
EDIT:
Single or Multiple Controllers
I would tend to group code that serves a single purpose. Here, if I have 2 models (event/calendar) and 2 views (calendar overview and event detail), I would tend to have 2 controllers if they have different roles. If creating and editing can be done directly in the calendar overview, then use a single controller and let the event detail view use a subset of that same controller. But calendar preference/overview and event management can be two different things. Note that the number of models could be 5 or 7 and it would not matter: I could have 6 different schemas to help me with storage and still have only 1 controller.
Deciding the number of things
Models:
An abstraction of the data and the storage solution (files, database, in memory, ...). Choosing the correct representation therefore depends on the desired data structure. Think about what changes and what can be grouped together.
In this example: 1 model for Calendar ({id, color, owner, ...}), 1 model for Events, 1 model for Owner, ... If you need to use SQL for events ({id, calendarId, detailId}) while the event details live in Mongo ({id, name, time, color, date, description}), then use 2 models for the events.
Controllers:
They represent a function and a way to interact with the user: a logical grouping of functionality and a central place for business rules.
In this example, there are 2 logical groupings: calendar overview management with preferences, and event creation/update plus detail loading. Note that both controllers will use the Event model. The calendar controller will load the events of the current month but may load just the id, name and time. The Event controller will load only a specific event and allow creating and updating it.
Views:
These are representations of the data. They allow us to simplify the output and keep the model structure and the business rules from leaking. Nobody has to know that you may use 3 kinds of databases to store your data, or how your data is structured.
In this example, there could be 3 or more views. There could be a per-month overview and a per-year overview. Each uses the calendar overview controller, but they are separate views because the data may not be structured exactly the same. Also, there could be an event list overview that uses the calendar overview controller, and an event detail view.
Note that visually all views will be different, but at the core each is just a way to package the calendar and the selected events (1 month, 1 day, 1 year, all events) for display. If all the views turn out identical, just create a single one, but independent views allow you to follow changing requirements: the 1-day view may require more detail on events than the 1-year view.
I'm currently trying to learn Node.js and MongoDB by building the server side of a web application which should manage insurance documents for an insurance agent.
So let's say I'm the user: I sign in, then I start to add my customers and their insurances.
So I have 2 related collections, Customers and Insurances.
I have one more collection to store the users' login data; let's call it Users.
I don't want the new users to see and modify the customers and the insurances of other users.
How can I "divide" every user related record, so that each user can work only with his data?
I figured out I can actually add to every record, the _id of the one user who created the record.
For example I login as myself, I got my Id "001", I could add one field with this value in every customer and insurance.
In that way I could filter every query with this code.
Would it be a good idea? In my opinion this filtering is a waste of processing power for mongoDB.
If someone has any idea of a solution, or even a link to an article about it, it would be helpful.
Thank you.
This is more a general permissions problem than just a MongoDB question. Also, without knowing more about your schemas it's hard to give specific advice.
However, here are some approaches:
1) Embed sub-documents
Since MongoDB is a document store allowing you to store arbitrary JSON-like objects, you could simply store the customers and insurances wholly inside each user object. That way, querying for a user would return their customers and insurances as well.
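For instance, a hypothetical document shape (your real fields will differ):

    // Approach 1: customers (and their insurances) embedded wholly in the user document.
    const user = {
        _id: "001",
        username: "mario",
        customers: [
            {
                name: "Alice",
                insurances: [{ type: "home", expires: "2026-01-01" }],
            },
        ],
    };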
2) Denormalise
Common practice for NoSQL databases is to denormalise related data (i.e. duplicate it). This might include embedding a sub-document that is a partial representation of your customers/insurances/whatever inside your user document. This has a similar benefit to the above solution, in that it eliminates additional queries for sub-documents. It also has the same drawback of requiring more care to preserve data integrity.
3) Reference with foreign key
This is a more traditional relational approach, and is basically what you're suggesting in your question. Depending on whether you want the reference to be bi-directional (both documents reference each other) or uni-directional (one document references the other), you can either store the user's ID as a foreign user_id field, or store an array of customer_ids and insurance_ids in the user document. In relational parlance this is sometimes described as "has many" or "belongs to" (the user has many customers, the customer belongs to a user).
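A minimal Mongoose sketch of this third approach (the schema fields are assumptions; the index addresses the performance worry from the question):

    import mongoose, { Schema } from "mongoose";

    // Each customer carries a foreign key to the owning user.
    const Customer = mongoose.model("Customer", new Schema({
        name: String,
        user_id: { type: Schema.Types.ObjectId, ref: "User", index: true },
    }));

    // Every query filters on the owner; with the index above, this lookup
    // stays cheap even as the collection grows.
    async function customersOf(userId: string) {
        return Customer.find({ user_id: userId }).exec();
    }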
We're looking into using CouchDB/CouchCocoa to replicate data to our mobile app.
Our system has a large number of users. Part of the database is private to each user -- for example, their tasks. These I've been able to replicate without problems using filtered replication.
Here's the catch... The database also includes shared information, only some of which pertains to a given user. How do I selectively replicate that shared information? For example, a user's task might reference specific shared documents. Is there a way to make sure those documents are included in the replication without including all the shared documents?
From the documentation it seems that adding doc_ids to the replication (or adding another replication with those doc IDs) might be one solution. Has anyone tried this? Are there other solutions?
EDIT: Given the number of users, it seems impractical to tag each shared document with all the users sharing it, but perhaps that's the only way to do this?
The final solution mostly depends on your document structure, but currently I see two use-cases:
1. As you keep everything within a single database, you probably have some fields that let you recognize whether a document is shared or private, right? Example:

    owner: "Mike"
    participants: [] // if nobody is mentioned, the document looks private(?)
So you just need a filter that can distinguish private documents from shared ones: by tags, the number of participants, references, or some other marker.
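For example, a replication filter along these lines (plain JavaScript stored under "filters" in a design document; the owner/participants fields are the assumed markers from above):

    // Lets through the user's private documents plus the shared ones
    // that list the user as a participant.
    function byUser(doc, req) {
        if (doc.owner === req.query.user) {
            return true; // private document owned by this user
        }
        return (doc.participants || []).indexOf(req.query.user) !== -1; // shared with this user
    }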
Also, if you need to replicate some documents only for a specific user (e.g. only for Mike), then you need a special view to collect all these documents and, yes, replication by document IDs, but this won't be an atomic request: you need a service script to handle these steps. If shared documents are defined by references to them, then the only solution is the same: a service script, a view that generates the document reference tree, and replication by doc._ids.
2. Review your architecture. Having a per-user database is a normal use-case for CouchDB and follows its approach to data partitioning and isolation. So you may create a per-user database that is private to that user only. For shared documents you may create additional databases, playing with the members in each database's security options. Each "shared" database will handle only a certain set of participants, by names or by groups, so there shouldn't be any data leaks, barring a CouchDB bug :)
This approach may look weird at first sight, but all you need is a management script that handles database creation and publication; replication becomes as easy as possible, and users' data is kept safe.
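A rough sketch of such a management script against CouchDB's HTTP API (the URL, credentials and naming scheme are placeholders; the _security policy shown is only an assumed example):

    const couch = "http://admin:password@localhost:5984";

    // Create a private per-user database and restrict its readers via the
    // _security document.
    async function createUserDb(username: string): Promise<void> {
        await fetch(`${couch}/userdb-${username}`, { method: "PUT" });
        await fetch(`${couch}/userdb-${username}/_security`, {
            method: "PUT",
            body: JSON.stringify({
                admins: { names: [], roles: ["_admin"] },
                members: { names: [username], roles: [] },
            }),
        });
    }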
P.S. I've assumed that the "sharing" operation makes a document visible not to everyone, but to some set of users. If I'm wrong and the "shared" state means "public", then option 2 becomes even simpler: N user databases + 1 public one.