Reuse same database tables in different repositories (repositories overlap on the data they access) - domain-driven-design

Suppose I have database tables Customer, Order, Item. I have an OrderRepository that accesses, directly with SQL/my ORM, both the Order and Item tables. E.g. I could have a method, getItems, on the OrderRepository that returns all items of that order.
Suppose I now also create an ItemRepository. Given I now have 2 repositories accessing the same database table, is that generally considered poor design? My thinking is, sometimes a user wants to update the details of an Item (e.g. name), but when using the OrderRepository, it doesn't really make sense to not be able to access the items directly (you want to know about all the items in an order).
Of course, the OrderRepository could internally create* an ItemRepository and call methods like getItemsById(ids: string[]). However, consider the case that I want to get all orders and items ever purchased by a Customer. Assuming you had the orderIds for a customer, you could have a getOrders(ids: string[]) on the OrderRepository to fetch all the orders and then do a second query to fetch all the Items. I feel you make your life harder (and less efficient) in the sense you have to do the join to match items with orders in the app code rather than doing a join in SQL.
If it's not considered bad practice, is there some kind of limit to how much overlap repositories should have with each other? I've spent a while trying to search for this on the web, but it seems all the tutorials/blogs/videos really don't go further than 1 table per entity (which may be an anti-pattern).
Or am I missing a trick?
Thanks
FYI: using express with TypeScript (not C#)
Is a repository creating another repository considered acceptable? Shouldn't only the service layer do that?

It's difficult to separate the Database Model from the DDD design but you have to.
In your example:
GetItems should have this signature: OrderRepository.GetItems(Ids: int[]) : ItemEntity[]. Note that this method returns entities (not DAOs from your ORM). To build an ItemEntity, the method might pull information from several DAOs (tables, through your ORM), but it should only pull what it needs for the entity's hydration.
Say you want to update an item's name using the ItemRepository. Your signature for that could look like ItemRepository.rename(Id: int, name: string) : void. When this method does its work, it could change the same table as GetItems above, but note that it could also change other tables (for example, it could add an audit of the change to an AuditTable).
DDD gives you the ability to use different tables for different contexts if you want. It gives you enough flexibility to make really bold choices when it comes to the infrastructure that surrounds your domain. So ultimately, it's a matter of what makes sense for your specific situation and team. Some teams would apply CQRS, and the GETOrder and Rename methods would look completely different under the covers.
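As a rough illustration of the two repositories described above, here is a minimal TypeScript sketch. It assumes an in-memory stand-in for the database; the row shapes (ItemRow, AuditRow), the Db type, and the entity fields are hypothetical, and a real implementation would issue SQL or ORM calls instead.

```typescript
// Hypothetical row shapes standing in for the Item and Audit tables.
interface ItemRow { id: number; orderId: number; name: string; sku: string }
interface AuditRow { itemId: number; change: string }

// In-memory stand-in for the database (an assumption for illustration).
interface Db { items: ItemRow[]; audit: AuditRow[] }

// Domain entity: only what the order context needs, not the raw row.
class ItemEntity {
  readonly id: number;
  readonly name: string;
  constructor(id: number, name: string) {
    this.id = id;
    this.name = name;
  }
}

class OrderRepository {
  private readonly db: Db;
  constructor(db: Db) {
    this.db = db;
  }

  // Reads the Item table and hydrates entities for the given ids.
  getItems(ids: number[]): ItemEntity[] {
    return this.db.items
      .filter((row) => ids.includes(row.id))
      .map((row) => new ItemEntity(row.id, row.name));
  }
}

class ItemRepository {
  private readonly db: Db;
  constructor(db: Db) {
    this.db = db;
  }

  // Writes to the same Item table and also records an audit entry.
  rename(id: number, name: string): void {
    const row = this.db.items.find((r) => r.id === id);
    if (!row) throw new Error(`Item ${id} not found`);
    this.db.audit.push({ itemId: id, change: `${row.name} -> ${name}` });
    row.name = name;
  }
}

const db: Db = {
  items: [
    { id: 1, orderId: 10, name: "Mug", sku: "M-1" },
    { id: 2, orderId: 10, name: "Pen", sku: "P-1" },
  ],
  audit: [],
};
const orderRepository = new OrderRepository(db);
const itemRepository = new ItemRepository(db);

itemRepository.rename(1, "Large mug");
const hydrated = orderRepository.getItems([1, 2]);
```

Both repositories touch the same table, but each exposes only the operations and shapes its own use case needs.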

Related

Create Mongoose Schema Dynamically for e-commerce website in Node

I would like to ask a question about a possible solution for an e-commerce database design in terms of scalability and flexibility.
We are going to use MongoDB and Node on the backend.
I included an image for you to see what we have so far. We currently have a Products table that can be used to add a product into the system. The interesting part is that we would like to be able to add different types of products to the system with varying attributes.
For example, in the admin management page, we could select a Clothes item where we should fill out a form with fields such as Height, Length, Size ... etc. The question is how could we model this way of structure in the database design?
What we were thinking of was creating tables such as ClothesProduct and many more and respectively connect the Products table to one of these. But we could have 100 different tables for the varying product types. We would like to add a product type dynamically from the admin management. Is this possible in Mongoose? Because creating all possible fields in the Products table is not efficient and it would hit us hard for the long-term.
[Image: database design snippet]
Maybe we should just create separate tables for each unique product type and from the front-end, we would select one of them to display the correct form?
Could you please share your thoughts?
Thank you!
We've got a Mongoose backend that I've been working on since its inception about 3 years ago. Here are some of my lessons:
MongoDB is NoSQL: By linking all these objects by ID, it becomes very painful to find all products of "Shop A": you would have to make many queries before getting the list of products for a particular shop (shop -> brand category -> subCategory -> product). Consider nesting certain objects in other objects (e.g. subcategories inside categories, as they are semantically the same). This will save immense loading times.
Dynamically created product fields: We built a (now) big module that allows users to create their own database keys & values, and assign them to different objects. In essence, it looks something like this:
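import { Schema } from "mongoose";

// One document per user-defined attribute, stored as a key/value pair.
const SpecialFieldSchema = new Schema({
  // ... (other fields elided in the original)
  key: String,
  value: String,
  // ...
});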
This way, your users can "make their own products".
Number of products: MongoDB queries can handle huge data loads, so I wouldn't worry too much about some collections being thousands of objects large. However, if you want large reports on all the data, you will need to make sure your IDs are in the right place. Then you can use the Aggregation framework to construct big queries that might have to tie together multiple collections in the db, and fetch the data in an efficient manner.
Don't reference IDs in both directions, unless you know what you're doing: Saving a reference to the category ID in subcategories and vice versa is incredibly confusing. Which field do you have to update if you want to switch subcategories? One or the other? Or both? Even with strong tests, it can be very confusing for new developers to understand "which direction the queries are running in" (if you are building a product that might have to be extended in the future). We've done both, which has led to a few problems. However, those modules that saved references to upper objects (rather than lower ones), I found to be consistently more pleasant and simple to work with.
created/updatedAt: Consider adding these fields to every single model & Schema. This will help with debugging, extensibility, and general features that you will be able to build in the future, which might otherwise be impossible. (ProductSchema.set('timestamps', true);)
Take my advice with a grain of salt, as I haven't designed most of our modules. But these are the sorts of things I consider as I continue working on our applications.
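To make the "dynamically created product fields" lesson more concrete, here is a small TypeScript sketch that checks a product's key/value fields against a product type defined at runtime. It does not use Mongoose; the ProductType, FieldDefinition, and validateProduct names are hypothetical and only illustrate one way the idea could be modelled.

```typescript
// Hypothetical shapes: a product type defined at runtime by an admin.
interface FieldDefinition { key: string; required: boolean }
interface ProductType { name: string; fields: FieldDefinition[] }

// A user-defined attribute stored as a key/value pair, as in the answer.
interface SpecialField { key: string; value: string }
interface Product { name: string; type: string; specialFields: SpecialField[] }

// Returns a list of problems; an empty list means the product fits its type.
function validateProduct(product: Product, type: ProductType): string[] {
  const errors: string[] = [];
  const allowed = new Set(type.fields.map((f) => f.key));
  const present = new Set(product.specialFields.map((f) => f.key));

  // Every required field of the type must be present on the product.
  for (const definition of type.fields) {
    if (definition.required && !present.has(definition.key)) {
      errors.push(`missing field: ${definition.key}`);
    }
  }
  // The product must not carry fields the type does not define.
  for (const field of product.specialFields) {
    if (!allowed.has(field.key)) {
      errors.push(`unknown field: ${field.key}`);
    }
  }
  return errors;
}

const clothes: ProductType = {
  name: "Clothes",
  fields: [
    { key: "size", required: true },
    { key: "length", required: false },
  ],
};

const shirt: Product = {
  name: "Shirt",
  type: "Clothes",
  specialFields: [{ key: "size", value: "M" }],
};
const broken: Product = {
  name: "Mystery item",
  type: "Clothes",
  specialFields: [{ key: "voltage", value: "230" }],
};

const shirtErrors = validateProduct(shirt, clothes);
const brokenErrors = validateProduct(broken, clothes);
```

With this shape, adding a new product type from an admin page means inserting a new ProductType document rather than creating a new table or schema.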

Dependent entities within same aggregate

Situation:
We have a classic Order with OrderLines. Each OrderLine has reference to the ProductId.
Each Product has its RelatedProduct. For example:
class Product {
string Id;
string Name;
string RelatedProductId;
decimal RelatedProductQuantity;
.
.
.
}
There is a business rule that whenever Product is added to Order with new OrderLine then Product with id=RelatedProductId should also be added in a quantity=RelatedProductQuantity.
Questions:
How to keep this rule within the domain so it doesn't spill over to application service but at the same time keep Order aggregate clean in a sense not to poison it by injecting repository or any data-fetching thing?
Should we use domain service? And if so, can domain service have repository injected, prepare all the data, create OrderLine (for both, base and related products), fill in the aggregate and save it to repository?
If none of the above, what's the best way to model it?
There are two common patterns that you will see here:
Fetch a copy of the information in your application code, then pass that information to the domain model as an argument
Pass the capability to fetch the information as an argument to the domain model
The second option is your classic "domain service" approach, where you use a "stateless" instance to fetch a copy of "global" state.
But, with the right perspective you might recognize that the first approach is the same mechanism - only it's the application code, rather than the domain code, that fetches the copy of the information.
In both cases, it's still the domain model deciding what to do with the copy of the information, so that's all right.
Possible tie breakers:
If the information you need to copy isn't local (ie: you are dealing with a distributed system, and the information isn't available in a local cache), then fetching that information will have failure modes, and you probably don't want to pollute the domain model with a bunch of code to handle that (in much the same way that you don't pollute your domain code with a bunch of database related concerns).
When it's hard to guess in advance which arguments are going to be passed to fetch the data, then it may make sense to let the domain code invoke that function directly. Otherwise, you end up with the application code asking the domain model for the arguments, and then passing the information back into the model, and this could even ping-pong back and forth several times.
(Not that it can't be done: you can make it work - what's less clear is how happy you are going to be maintaining the code).
If you aren't sure... use the approach that feels more familiar.
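The two patterns above can be sketched side by side in TypeScript. This is only an illustration under assumptions: the Order methods, the ProductLookup type, and the default quantity of 1 are hypothetical and not taken from the question.

```typescript
// Hypothetical shapes based on the Product class in the question.
interface Product {
  id: string;
  name: string;
  relatedProductId?: string;
  relatedProductQuantity?: number;
}
interface OrderLine { productId: string; quantity: number }

// The capability handed to the domain model in the second pattern.
type ProductLookup = (id: string) => Product | undefined;

class Order {
  readonly lines: OrderLine[] = [];

  // Pattern 1: the application fetched the related product and passes a copy in.
  addProduct(product: Product, quantity: number, related?: Product): void {
    const newLines: OrderLine[] = [{ productId: product.id, quantity }];
    if (product.relatedProductId !== undefined) {
      if (!related || related.id !== product.relatedProductId) {
        throw new Error(`Related product ${product.relatedProductId} is required`);
      }
      // Defaulting the quantity to 1 is an assumption made for this sketch.
      newLines.push({
        productId: related.id,
        quantity: product.relatedProductQuantity ?? 1,
      });
    }
    this.lines.push(...newLines);
  }

  // Pattern 2: the domain model receives the capability to fetch the copy itself.
  addProductUsing(product: Product, quantity: number, lookup: ProductLookup): void {
    const related =
      product.relatedProductId !== undefined
        ? lookup(product.relatedProductId)
        : undefined;
    this.addProduct(product, quantity, related);
  }
}

const printer: Product = {
  id: "p1",
  name: "Printer",
  relatedProductId: "p2",
  relatedProductQuantity: 2,
};
const cable: Product = { id: "p2", name: "Cable" };
const catalog = new Map<string, Product>([
  [printer.id, printer],
  [cable.id, cable],
]);

// Pattern 1: application code does the lookup.
const orderA = new Order();
orderA.addProduct(printer, 1, catalog.get("p2"));

// Pattern 2: the order is given a lookup function (the "domain service").
const orderB = new Order();
orderB.addProductUsing(printer, 1, (id) => catalog.get(id));
```

In both cases the rule itself (add the related product in the related quantity) lives in Order; only the place where the lookup happens differs.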

homogeneous vs heterogeneous in documentdb

I am using Azure DocumentDB and all my experience in NoSql has been in MongoDb. I looked at the pricing model and the cost is per collection. In MongoDb I would have created 3 collections for what I was using: Users, Firms, and Emails. I noted that this approach would cost $24 per collection per month.
I was told by the people I work with that I'm doing it wrong. I should have all three of those things stored in a single collection with a field to describe what the data type is. That each collection should be related by date or geographic area so one part of the world has a smaller portion to search.
and to:
"Combine different types of documents into a single collection and add
a field across all to separate them in searching like a type field or
something"
I would never have dreamed of doing that in Mongo, as it would make indexing, shard keys, and other things hard to get right.
There might not be many fields that overlap between the objects (example: Email and Firm objects).
I can do it this way, but I can't seem to find a single example of anyone else doing it that way - which indicates to me that maybe it isn't right. Now, I don't need an example, but can someone point me to some location that describes which is the 'right' way to do it? Or, if you do create a single collection for all data - other than Azure's pricing model, what are the advantages / disadvantages in doing that?
Any good articles on DocumentDb schema design?
Yes. In order to leverage CosmosDb to its full potential, you need to think of a collection as an entire database system and not as a "table" designed to hold only one type of object.
Sharding in Cosmos is exceedingly simple. You just specify a field that all of your documents will populate and select that as your partition key. If you just select a generic field name such as key or partitionKey, you can easily separate the storage of your inbound emails from users, and from anything else, by picking appropriate values.
class InboundEmail
{
public string Key {get; set;} = "EmailsPartition";
// other properties
}
class User
{
public string Key {get; set;} = "UsersPartition";
// other properties
}
What I'm showing is still only an example though. In reality your partition key values should be even more dynamic. It's important to understand that queries against a known partition are extremely quick. As soon as you need to scan across multiple partitions you'll see much slower and more costly results.
So, in an app that ingests a lot of user data, keeping a single user's activity together in one partition might make sense for that particular entity.
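The idea of a heterogeneous collection partitioned per user can be sketched in TypeScript with an in-memory stand-in. This is not the Cosmos DB SDK; the PartitionedCollection class, the document shapes, and the `type` and `partitionKey` field names are assumptions made for illustration.

```typescript
// Hypothetical document shapes sharing one collection. `type` tells them
// apart and `partitionKey` decides where each document is stored.
interface UserDoc { type: "user"; id: string; partitionKey: string; name: string }
interface EmailDoc { type: "email"; id: string; partitionKey: string; subject: string }
type Doc = UserDoc | EmailDoc;

// In-memory stand-in for a partitioned collection (not the Cosmos DB SDK).
class PartitionedCollection {
  private readonly partitions = new Map<string, Doc[]>();

  insert(doc: Doc): void {
    const bucket = this.partitions.get(doc.partitionKey) ?? [];
    bucket.push(doc);
    this.partitions.set(doc.partitionKey, bucket);
  }

  // Single-partition query: touches only one bucket.
  queryPartition(partitionKey: string): Doc[] {
    return this.partitions.get(partitionKey) ?? [];
  }

  // Cross-partition query: has to scan every bucket.
  queryAll(predicate: (doc: Doc) => boolean): Doc[] {
    const result: Doc[] = [];
    this.partitions.forEach((bucket) => {
      for (const doc of bucket) {
        if (predicate(doc)) result.push(doc);
      }
    });
    return result;
  }
}

const collection = new PartitionedCollection();
// A user and that user's emails share a partition key.
collection.insert({ type: "user", id: "u1", partitionKey: "user-1", name: "Ada" });
collection.insert({ type: "email", id: "e1", partitionKey: "user-1", subject: "Welcome" });
collection.insert({ type: "email", id: "e2", partitionKey: "user-2", subject: "Invoice" });

const adaActivity = collection.queryPartition("user-1");
const allEmails = collection.queryAll((doc) => doc.type === "email");
```

Reading one user's activity touches a single partition, while listing every email has to visit all of them, which mirrors the cost difference described above.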
If you want evidence that this is the appropriate way to use CosmosDb, consider the addition of the new Gremlin Graph APIs. Graphs are inherently heterogeneous, as they contain many different entities and entity types as well as the relationships between them. The query boundary of Cosmos is at the collection level, so if you tried putting your entities all in different collections, none of the Graph API or queries would work.
EDIT:
I noticed in the comments you made this statement: "And you would have an index on every field in both objects." CosmosDb does automatically index every field of every document. They use a special proprietary path-based indexing mechanism that ensures every path of your JSON tree has indices on it. You have to specifically opt out of this auto-indexing feature.

CQRS Read models in a NoSql (Mongo DB)

Hi, it's my first time with DDD/CQRS. I've read multiple sources of knowledge and I'm still a bit confused, maybe someone could help :)
Let's assume a simple case where we have products and clients (possibly different bounded contexts).
A client can buy a product and he wants to see all products that he purchased.
In this case I realize I need a UserPurchasesView view model with:
purchaseId (which is a mongo primary key)
userId
product: {id, name, image, shortDescription, [maybe some others]}
price
timestamp
Now ... the problem is that my domain is producing an event like UserPurchasedProduct(userId, productId). I could enrich the event with a price, product name, or maybe something else, but not all fields. I'm getting to a point where enriching seems to be wrong.
At this point I realize I need something like ProductDetailsView:
productId (primary key)
price
name
shortDescription
logo
This view is maintained by events like: ProductCreated, ProductRenamed, ProductImageChanged
And now we have 2 options ...
Look into the ProductDetailsView when the UserPurchasedProduct event comes in, take all needed product details, and save them in UserPurchasesView for faster reads. This solution doesn't look that bad, but it introduces some extra coupling, and it seems to me these views cannot be scaled well when needed. Also, both views must be rebuilt together when replaying all events from the event store (rebuilding is also more tricky in that case).
Keep only the productId in the UserPurchasesView and read multiple views when the user queries his purchases. This is some extra processing that would have to be done somewhere: in the frontend, in the backend controller, or in some high-level read model API. UPDATE: I also realized that I would need to keep at least the price and maybe the name of the product in the UserPurchasesView (in case it changes), but sometimes you need the value from the time of the purchase and sometimes you need the recent value. The scenario depends on the business, but we could imagine both.
None of these solutions looks perfect to me. Am I wrong, am I missing something or is it just the way to do it? Thanks!
You understand well.
So you have to choose between coupling between the read models and coupling between UI and individual read models.
One of the main advantages of CQRS/ES is the possibility to create blazing fast read models (views if you like), without any joins: the perfect cache, as I saw it called. I personally have chosen the first approach every time, with full data denormalisation. The views are very fast and the models very clean and clear. This is the perfect solution if you want to optimize the read side of your application (and I think you should).
By listening to the right events you can keep these read models in sync with the rest of the application.
There is a 3rd option:
The projection responsible for the UserPurchasesView view not only listens to UserPurchasedProduct events, but also to ProductCreated, ProductRenamed, ProductImageChanged - any product related events that affect the UserPurchasesView. Now, as well as the UserPurchasesView collection for the read model that it is responsible for, it also needs a private collection to maintain the bits of products it is interested in: ({id, name, image, shortDescription, [maybe some others]}), so that when a new purchase event comes in, you have somewhere to get the initial state of those product fields from. Since your UserPurchasesView needs to listen to some of those product events anyway in order to keep up to date when a product changes, this isn't really much extra work, and avoids any dependency on another projection (ProductDetailsView). The cross-projection dependency also has a potential problem due to eventual consistency - what if the product isn't even in the product details view yet when the UserPurchasedProduct event comes through?
To avoid any concurrency issues, it's simplest to have each projection managed only by a single process and a single thread. That way, as long as the projection can receive events in-order across streams (so that it is guaranteed to see the product creation before the product purchase), you won't have issues with seeing a purchase before the product exists. If you introduce sharding or any other multi-threading to your projection, it gets more complicated.
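The third option can be sketched in TypeScript as a single projection that keeps its own private copy of product data. The event shapes below are assumptions based on the event names in the question (purchaseId in the event and the reduced field set are hypothetical), and the in-memory maps stand in for Mongo collections.

```typescript
// Hypothetical event shapes based on the event names in the question.
type DomainEvent =
  | { kind: "ProductCreated"; productId: string; name: string }
  | { kind: "ProductRenamed"; productId: string; name: string }
  | { kind: "UserPurchasedProduct"; purchaseId: string; userId: string; productId: string };

interface ProductSnapshot { id: string; name: string }
interface PurchaseView { purchaseId: string; userId: string; product: ProductSnapshot }

class UserPurchasesProjection {
  // Private copy of the product fields this view cares about.
  private readonly products = new Map<string, ProductSnapshot>();
  // The read model itself, keyed by purchaseId.
  private readonly purchases = new Map<string, PurchaseView>();

  // Events are expected in order, one at a time (single process, single thread).
  handle(event: DomainEvent): void {
    switch (event.kind) {
      case "ProductCreated":
        this.products.set(event.productId, { id: event.productId, name: event.name });
        break;
      case "ProductRenamed": {
        const { productId, name } = event;
        this.products.set(productId, { id: productId, name });
        // Assumption: this view shows the current name, so existing
        // purchases are updated too. Skip this to keep purchase-time values.
        this.purchases.forEach((purchase) => {
          if (purchase.product.id === productId) {
            purchase.product = { id: productId, name };
          }
        });
        break;
      }
      case "UserPurchasedProduct": {
        const product = this.products.get(event.productId);
        if (!product) throw new Error(`Unknown product ${event.productId}`);
        this.purchases.set(event.purchaseId, {
          purchaseId: event.purchaseId,
          userId: event.userId,
          product: { ...product },
        });
        break;
      }
    }
  }

  purchasesOf(userId: string): PurchaseView[] {
    return Array.from(this.purchases.values()).filter((p) => p.userId === userId);
  }
}

const projection = new UserPurchasesProjection();
const history: DomainEvent[] = [
  { kind: "ProductCreated", productId: "p1", name: "Mug" },
  { kind: "UserPurchasedProduct", purchaseId: "o1", userId: "u1", productId: "p1" },
  { kind: "ProductRenamed", productId: "p1", name: "Mug XL" },
];
for (const event of history) {
  projection.handle(event);
}
const purchases = projection.purchasesOf("u1");
```

Because the projection owns both maps, it can be rebuilt on its own by replaying the event stream, without depending on ProductDetailsView.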

Complex Finds in Domain Driven Design

I'm looking into converting part of a large existing VB6 system into .NET. I'm trying to use domain-driven design, but I'm having a hard time getting my head around some things.
One thing that I'm completely stumped on is how I should handle complex find statements. For example, we currently have a screen that displays a list of saved documents, that the user can select and print off, email, edit or delete. I have a SavedDocument object that does the trick for all the actions, but it only has the properties relevant to it, and I need to display the client name that the document is for and their email address if they have one. I also need to show the policy reference that this document may have come from. The Client and Policy are linked to the SavedDocument but are their own aggregate roots, so are not loaded at the same time the SavedDocuments are.
The user is also allowed to specify several filters to reduce the list down. These too can be from properties that are stored on the SavedDocument or the Client and Policy.
I'm not sure how to handle this from a Domain driven design point of view.
Do I have a function on a repository that takes the filters and returns me a list of SavedDocuments, that I then have to turn into a different object or DTO, and fill with the additional client and policy information? That seems a little slow, as I have to load all the details using multiple calls.
Do I have a function on a repository that takes the filters and returns me a list of SavedDocumentsForList objects that contain just the information I want? This seems the quickest but doesn't feel like I'm using DDD.
Do I load everything from their objects and do all the filtering and column selection in a service? This seems the slowest, but also appears to be very domain orientated.
I'm just really confused about how to handle these situations, and I'm not really seeing any other people asking questions about it, which makes me feel that I'm missing something.
Queries can be handled in a few ways in DDD. Sometimes you can use the domain entities themselves to serve queries. This approach can become cumbersome in scenarios such as yours when queries require projections of multiple aggregates. In this case, it is easier to use objects explicitly designed for the respective queries - effectively DTOs. These DTOs will be read-only and won't have any behavior. This can be referred to as the read-model pattern.
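A read-model DTO for the saved documents list could look roughly like the TypeScript sketch below. The row shapes, the SavedDocumentListItem fields, and the findSavedDocumentsForList function are hypothetical, and the in-memory arrays stand in for what would normally be one joined database query.

```typescript
// Hypothetical stored shapes, reduced to what the list screen needs.
interface SavedDocumentRow { id: number; title: string; clientId: number; policyId?: number }
interface ClientRow { id: number; name: string; email?: string }
interface PolicyRow { id: number; reference: string }

// Read-only DTO for the list screen: no behaviour, only display data.
interface SavedDocumentListItem {
  readonly documentId: number;
  readonly title: string;
  readonly clientName: string;
  readonly clientEmail: string | null;
  readonly policyReference: string | null;
}

interface ListFilter { clientName?: string; policyReference?: string }

// Stand-in for a single joined query that returns the DTOs directly.
function findSavedDocumentsForList(
  documents: SavedDocumentRow[],
  clients: ClientRow[],
  policies: PolicyRow[],
  filter: ListFilter,
): SavedDocumentListItem[] {
  const clientById = new Map<number, ClientRow>();
  for (const client of clients) clientById.set(client.id, client);
  const policyById = new Map<number, PolicyRow>();
  for (const policy of policies) policyById.set(policy.id, policy);

  const rows: SavedDocumentListItem[] = [];
  for (const doc of documents) {
    const client = clientById.get(doc.clientId);
    if (!client) continue; // skipping orphaned documents is an assumption
    const policy = doc.policyId !== undefined ? policyById.get(doc.policyId) : undefined;
    const row: SavedDocumentListItem = {
      documentId: doc.id,
      title: doc.title,
      clientName: client.name,
      clientEmail: client.email ?? null,
      policyReference: policy?.reference ?? null,
    };
    // Filters may target fields from any of the three sources.
    if (filter.clientName !== undefined && row.clientName !== filter.clientName) continue;
    if (filter.policyReference !== undefined && row.policyReference !== filter.policyReference) continue;
    rows.push(row);
  }
  return rows;
}

const listItems = findSavedDocumentsForList(
  [
    { id: 1, title: "Quote", clientId: 10, policyId: 100 },
    { id: 2, title: "Renewal letter", clientId: 11 },
  ],
  [
    { id: 10, name: "Acme Ltd", email: "info@acme.example" },
    { id: 11, name: "Bolt LLC" },
  ],
  [{ id: 100, reference: "POL-100" }],
  { clientName: "Acme Ltd" },
);
```

The aggregates (SavedDocument, Client, Policy) stay untouched for commands, while the list screen reads through this separate, flat shape.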

Resources