I am developing a Library Management System which has two sorts of books (Ebook and PrintedBook).
I intend to provide a search feature that covers both ebooks and printed books on the same page.
The only problem is that both Ebook and PrintedBook are books. Should I create a Book entity and have PrintedBook and Ebook inherit from it? If I do, searching is easier because I can use a single IBookRepository. If not, I have to join two tables (Ebooks and PrintedBooks).
Please help me.
Dealing with inheritance at the persistence level, especially with relational databases, can be a headache. First of all, you should ask yourself why this is a problem for you.
If the problem is performance due to using JOIN in your database query, you might look at a technique called single table inheritance. Basically, you have one table containing all the columns of all your book types (i.e. PrintedBook and Ebook). This way you don't have to use JOIN, but you sacrifice some storage, because columns that apply to only one type stay empty for the others.
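As a rough illustration of single table inheritance, here is a minimal sketch using Python's built-in sqlite3 module. The table name `books`, the `book_type` discriminator column, and the type-specific columns `file_format` and `shelf_location` are all assumptions made up for this example, not part of the original question.

```python
import sqlite3

# Hypothetical schema: one "books" table holds both book types.
# Columns that apply to only one type are NULL for the other.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        book_type TEXT NOT NULL CHECK (book_type IN ('ebook', 'printed')),
        title TEXT NOT NULL,
        file_format TEXT,      -- ebooks only
        shelf_location TEXT    -- printed books only
    )
""")
conn.executemany(
    "INSERT INTO books (book_type, title, file_format, shelf_location) "
    "VALUES (?, ?, ?, ?)",
    [
        ("ebook", "Domain Modeling Basics", "epub", None),
        ("printed", "Domain Modeling in Practice", None, "A-12"),
        ("printed", "Cooking for Two", None, "C-03"),
    ],
)


def search_books(connection, term):
    """Search both book types with a single query and no JOIN."""
    rows = connection.execute(
        "SELECT book_type, title FROM books WHERE title LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()


results = search_books(conn, "Domain")
```

The search page only ever queries one table; the discriminator column tells the application which concrete type each row should become.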
Other than that and the concrete table inheritance technique (one table per concrete type, as you describe), the main remaining option in relational databases is class table inheritance: a shared Books table joined to one table per subtype.
If your application becomes too complex or the domain model isn't compatible with your read use cases, you might look at a read model. A read model gives you easy access to the data while letting the domain model stay focused on your problem domain, unmodified. This is a very complex topic, so if you want to read about read models (or about DDD implementation problems and techniques) I recommend Implementing Domain-Driven Design by Vaughn Vernon.
On SQLAlchemy's documentation page the author starts with a philosophy,
SQL databases behave less like object collections the more size and
performance start to matter; object collections behave less like
tables and rows the more abstraction starts to matter.
I've been scratching my head trying to understand the idea behind these two sentences, but without success. Could someone give an example to illustrate the idea here? Thanks.
When you are creating an application using an Object Oriented language and a SQL database, you are simultaneously working with two very different conceptual models for storing information:
The relational model says how to store data in tables and rows and how to link elements through keys and joins.
The object model establishes a way to store entities with attributes in memory (usually) and how to set links between them using pointers or references.
So, let's say that you have a User entity that is linked to addresses and other users in your application. Those entities will need to be stored in the form of several tables in the database (a users table, an addresses table and a many-to-many table for associating users to users, for instance). At the same time, if your code uses object oriented constructs, users and addresses will exist in memory in the form of objects with references between them, pointing to objects of the same or different kind.
The thing is, moving information between those two different worlds is much much more difficult than it looks at first:
You might associate one object with one row in a table, but that is not always possible and sometimes a single object must be associated to multiple rows in different tables.
Inheritance and polymorphic behavior are particularly difficult to map to a relational model.
Traversing objects and querying the database are vastly different actions.
Performance factors to take into account in an object model and a relational model are completely different.
And those are just a few examples. ORMs such as SQLAlchemy are essentially translators that convert information from one world into the other and back.
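To make the gap between the two models concrete, here is a minimal sketch (not SQLAlchemy itself, just the standard sqlite3 module) of the translation an ORM performs. The `User` and `Address` classes, the table layout, and the `load_user` helper are hypothetical names chosen for illustration.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Address:
    city: str


@dataclass
class User:
    id: int
    name: str
    addresses: list = field(default_factory=list)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        city TEXT NOT NULL
    );
    INSERT INTO users (id, name) VALUES (1, 'Ann');
    INSERT INTO addresses (user_id, city) VALUES (1, 'Oslo'), (1, 'Lima');
""")


def load_user(connection, user_id):
    """Rebuild one User object from rows that live in two tables."""
    rows = connection.execute(
        """
        SELECT u.id, u.name, a.city
        FROM users u LEFT JOIN addresses a ON a.user_id = u.id
        WHERE u.id = ?
        ORDER BY a.id
        """,
        (user_id,),
    ).fetchall()
    if not rows:
        return None
    user = User(id=rows[0][0], name=rows[0][1])
    for _, _, city in rows:
        if city is not None:  # a user without addresses yields one NULL row
            user.addresses.append(Address(city))
    return user


user = load_user(conn, 1)
```

In memory, `user.addresses` is a simple list traversal; in the database, the same information is a join over two tables that returns the user's columns once per address row.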
What I think Mike Bayer was trying to convey is: the more you adapt your entity information to the object model (lots of inheritance, polymorphism, traversal of objects, ...), the less it will resemble the natural structure of a relational model and the more performance concessions you will be making. And the other way around: the more you design your tables to perform well and be optimized for your queries, the less they will fit a natural structure of objects.
Martin Fowler has a nice write-up about the need of this translation in this article: ORM Hate (from which I took the above image).
Edit: further clarification on the abstraction vs performance issue
In the end, I think the bottom line of that SQLAlchemy presentation text is: many ORMs hide the relational side of the relational-object oriented translation to make things easier. With them you only have to worry about the object oriented side, and the library is in charge of taking away the burden of dealing with the database. You get persistence for your objects without having to deal with SQL. However, they incur a performance penalty in doing so, because the details of working with the database are abstracted away and you have no control over them. And those details are essential when you have to optimize performance. SQLAlchemy takes the opposite approach. It hides nothing of the relational side: you are in control of how SQL is generated and of when to use, or not use, joins, subqueries and other SQL constructs. That makes it a much more complex library to learn, but at the same time you are in control of the whole relational-object oriented translation process.
I have the following entities (example):
Book
Author
The Book entity is also an aggregate, since each Book has one or many related Authors. Now I have a problem with how to fetch this aggregate from the repo. I have the following cases, and I also need to take care of performance:
1. List all the Books. No need to fetch Authors.
2. List all the Books with Authors' names. Obviously, we need to fetch the aggregate of Books and related Authors.
3. List all the Books with authors count. Similar to (2), except I do not want to fetch the Authors from the repo, just the count.
So what should my repository look like? Specific questions:
Should we have methods like findBooks, findBooksWithAuthors and findBooksWithAuthorsCount? This would lead to a crazy number of methods, since our entities have many relationships between each other.
Should we just have findBooks and then loadAuthors in AuthorsRepo, i.e. not do the join until we hit some performance issue, and then refactor?
Should I create some aggregate-value-objects, like: BookAndAuthors that describes relationships?
Please note that this example is trivial; our real models are richer and have more relationships.
Do you need to fetch this kind of information to display on the UI?
I would encourage you to separate your read and write concerns, and keep your repository interface simple (similar to that of a collections interface).
Have a look at CQRS; it works very well with DDD and will help simplify your design a great deal.
Once you get into CQRS, just keep in mind that CQRS does not necessarily involve Event Sourcing.
In your case I would recommend the simplest approach shown in this article: basically, have a read service (you can call it a Finder) which fires SQL and gets you a DTO/Map of whatever you need for the UI.
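A Finder along these lines might look like the following minimal sketch. It is not taken from the linked article; the `BookFinder` class, the `books_with_author_count` method, and the three-table schema are assumptions made up for illustration, using Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be converted to dicts by column name
conn.executescript("""
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE book_authors (book_id INTEGER, author_id INTEGER);
    INSERT INTO books VALUES (1, 'Book A'), (2, 'Book B');
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book_authors VALUES (1, 1), (1, 2);
""")


class BookFinder:
    """Read-side service: returns plain dicts shaped for one screen."""

    def __init__(self, connection):
        self._conn = connection

    def books_with_author_count(self):
        # The count is computed in SQL, so no Author objects are loaded.
        rows = self._conn.execute("""
            SELECT b.id, b.title, COUNT(ba.author_id) AS author_count
            FROM books b
            LEFT JOIN book_authors ba ON ba.book_id = b.id
            GROUP BY b.id, b.title
            ORDER BY b.id
        """)
        return [dict(row) for row in rows]


listing = BookFinder(conn).books_with_author_count()
```

The repository stays small (load and save whole aggregates for commands), while each screen gets its own query method on the read side.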
I've got a question on my mind that has been stirring for months as I've read about DDD, patterns and many other topics of application architecture. I'm going to frame this in terms of an MVC web application but the question is, I'm sure, much broader, and it is this: Does the adherence to domain entities create rigidity and inefficiency in an application?
The DDD approach makes complete sense for managing the business logic of an application and as a way of working with stakeholders. But to me it falls apart in the context of a multi-tiered application. Namely there are very few scenarios when a view needs all the data of an entity or when even two repositories have it all. In and of itself that's not bad but it means I make multiple queries returning a bunch of properties I don't need to get a few that I do. And once that is done the extraneous information either gets passed to the view or there is the overhead of discarding, merging and mapping data to a DTO or view model. I need to generate a lot of reports and the problem seems magnified there. Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities. It seems wasteful, honestly, and I don't want to pound a database and generate unneeded network traffic on a matter of principle. From questions like "Should the repository layer return data-transfer-objects (DTO)?" it seems I'm not the only one to struggle with this question. So what's the answer to the limitations it seems to impose?
Thanks from a new and confounded DDD-er.
What's the real problem here? Processing business rules and querying for data are 2 very different concerns. That realization leads us to CQRS - Command-Query Responsibility Segregation. What's that? You just don't use the same model for both tasks: the Domain Model is about behavior, performing business processes, handling commands. And there is a separate Reporting Model used for display. In general, it can contain a table per view. These tables contain only relevant information, so you can get rid of DTOs, AutoMapper, etc.
How do these two models synchronize? It can be done in many ways:
Reporting model can be built just on top of database views
Database replication
Domain model can issue events containing information about each change and they can be handled by denormalizers updating proper tables in Reporting Model
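The last option, events handled by denormalizers, could be sketched roughly as follows. Everything here is hypothetical and kept in memory for brevity: the `OrderPlaced` event, the `Customer` aggregate, and the `CustomerTotalsDenormalizer` whose `totals` dict stands in for a reporting table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    customer_id: int
    amount: float


class Customer:
    """Write side: domain behavior that records what happened as events."""

    def __init__(self, customer_id):
        self.customer_id = customer_id
        self.events = []

    def place_order(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.events.append(OrderPlaced(self.customer_id, amount))


class CustomerTotalsDenormalizer:
    """Read side: keeps a flat per-customer summary row up to date."""

    def __init__(self):
        self.totals = {}  # customer_id -> {"orders": int, "spent": float}

    def handle(self, event):
        row = self.totals.setdefault(
            event.customer_id, {"orders": 0, "spent": 0.0}
        )
        row["orders"] += 1
        row["spent"] += event.amount


denormalizer = CustomerTotalsDenormalizer()
customer = Customer(customer_id=7)
customer.place_order(20.0)
customer.place_order(5.0)
for event in customer.events:
    denormalizer.handle(event)
```

A view that shows order counts and totals per customer can then read `totals` directly, with no joins or mapping at query time.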
as I've read about DDD, patterns and many other topics of application architecture
Domain driven design is not about patterns and architecture but about designing your code according to the business domain. Instead of thinking about repositories and layers, think about the problem you are trying to solve. The simplest way to "start rehabilitation" would be to rename ProductRepository to just Products.
Does the adherence to domain entities create rigidity and inefficiency in an application?
Inefficiency comes from bad modeling.
The DDD approach makes complete sense for managing the business logic of an application and as a way of working with stakeholders. But to me it falls apart in the context of a multi-tiered application.
Tiers aren't layers
Namely there are very few scenarios when a view needs all the data of an entity or when even two repositories have it all. In and of itself that's not bad but it means I make multiple queries returning a bunch of properties I don't need to get a few that I do.
Query that data as you wish. Do not try to box your problems into some "ready-made solutions". Instead - learn from them and apply only what's necessary to solve them.
Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities.
http://ayende.com/blog/3955/repository-is-the-new-singleton
So what's the answer to the limitations it seems to impose?
"seems"
Btw, the internet is full of things like this (I mean that sample app).
To understand what DDD is, read the blue book slowly and carefully. Twice.
If you think that fully fledged DDD is too much effort for your scenario then maybe you need to take a step down and look at something closer to Active Record.
I use DDD but in my scenario I have to support multiple front-ends; a couple web sites and a WinForms app, as well as a set of services that allow interaction with other automated processes. In this case, the extra complexity is worth it. I use DTO's to transfer a representation of my data to the various presentation layers. The CPU overhead in mapping domain entities to DTO's is small - a rounding error when compared to network calls and database calls. There is also the overhead in managing this complexity. I have mitigated this to some extent by using AutoMapper. My Repositories return fully populated domain objects. My service layer will map to/from DTO's. Here we can flatten out the domain objects, combine domain objects, etc. to produce a more tabulated representation of the data.
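To show what flattening a domain object into a DTO means in practice, here is a minimal hand-written sketch. It does not use AutoMapper (which is a .NET library); the `Order`, `OrderLine`, and `OrderSummaryDto` classes and the `to_summary_dto` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    product: str
    quantity: int
    unit_price: float


@dataclass
class Order:
    """Domain object: fully populated, with behavior."""
    order_id: int
    customer_name: str
    lines: list = field(default_factory=list)

    def total(self):
        return sum(line.quantity * line.unit_price for line in self.lines)


@dataclass(frozen=True)
class OrderSummaryDto:
    """Flat, read-only shape sent to a presentation layer."""
    order_id: int
    customer_name: str
    line_count: int
    total: float


def to_summary_dto(order):
    """Service-layer mapping: flatten a domain object for a list view."""
    return OrderSummaryDto(
        order_id=order.order_id,
        customer_name=order.customer_name,
        line_count=len(order.lines),
        total=order.total(),
    )


order = Order(1, "Ann", [OrderLine("pen", 2, 1.5), OrderLine("pad", 1, 4.0)])
dto = to_summary_dto(order)
```

The mapping is a handful of attribute copies, which is why its cost is negligible next to the database round trip that loaded the order.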
Dino Esposito wrote an MSDN Magazine article on this subject here - you may find this interesting.
So, I guess to answer your "Why" question - as usual, it depends on your context. DDD may be too much effort, in which case do something simpler.
Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities.
Add methods to your repository to return ONLY what you want e.g. IOrderRepository.GetByCustomer
It's completely OK in DDD.
You may also use the Query Object pattern or the Specification pattern to make your repositories more generic; just remember not to use anything ORM-specific in the interfaces of the repositories (e.g. ICriteria of NHibernate).
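As a rough idea of how a Specification keeps the repository interface generic and free of ORM types, consider this minimal in-memory sketch. All class names (`Specification`, `ByCustomer`, `TotalAtLeast`, `InMemoryOrderRepository`) are hypothetical; a database-backed repository would translate the specification into a query instead of filtering a list.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    customer_id: int
    total: float


class Specification:
    """Base class: a reusable, ORM-independent query predicate."""

    def is_satisfied_by(self, candidate):
        raise NotImplementedError

    def __and__(self, other):
        return AndSpecification(self, other)


class AndSpecification(Specification):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def is_satisfied_by(self, candidate):
        return (self.left.is_satisfied_by(candidate)
                and self.right.is_satisfied_by(candidate))


class ByCustomer(Specification):
    def __init__(self, customer_id):
        self.customer_id = customer_id

    def is_satisfied_by(self, candidate):
        return candidate.customer_id == self.customer_id


class TotalAtLeast(Specification):
    def __init__(self, minimum):
        self.minimum = minimum

    def is_satisfied_by(self, candidate):
        return candidate.total >= self.minimum


class InMemoryOrderRepository:
    """One generic find method instead of one method per query."""

    def __init__(self, orders):
        self._orders = list(orders)

    def find(self, specification):
        return [o for o in self._orders if specification.is_satisfied_by(o)]


repo = InMemoryOrderRepository(
    [Order(1, 10, 50.0), Order(2, 10, 5.0), Order(3, 11, 80.0)]
)
matches = repo.find(ByCustomer(10) & TotalAtLeast(20.0))
```

Callers combine small specifications with `&`, so the repository does not grow a new method for every combination of criteria.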
I am on a tight schedule with my project, so I don't have time to read books to understand it.
Like anything else, it can be put in a few lines after reading the books a few times. So here I need a short description of each term in the DDD practice guidelines so I can apply them to my project bit by bit.
I already know the terms in general, but I can't relate them to a C# project.
Below are the terms I have come across so far from reading brief descriptions. For each one, what is its purpose in a C# project?
Services
Factories
Repository
Aggregates
DomainObjects
Infrastructure
I am really confused about Infrastructure, Repository and Services.
When to use Services and when to use Repository?
Please let me know if there is any way I can make this question clearer.
I recommend that you read through the Domain-Driven Design Quickly book from InfoQ. It is short and free as a PDF that you can download right away, and it does its best to summarize the concepts presented in Eric Evans's Blue Bible.
You didn't specify which language/framework the project you are currently working on is in, if it is a .NET project then take a look at the source code for CodeCampServer for a good example.
There is also a fairly more complicated example named Fohjin.DDD that you can look at (it has a focus on CQRS concepts that may be more than you are looking for)
Steve Bohlen has also given a presentation to an alt.net crowd on DDD, you can find the videos from links off of his blog post
I've just posted a blog post which lists these and some other resources as well.
Hopefully some of these resources will help you get started quickly.
This is my understanding and I did NOT read any DDD book, even the holy bible of it.
Services - stateless classes that usually operate on different layer objects, thus helping to decouple them; also to avoid code duplication
Factories - classes that know how to create objects, thus decoupling invoking code from implementation details and making it easier to switch implementations; many factories also help to auto-resolve object dependencies (IoC containers); factories are infrastructure
Repository - interfaces (and corresponding implementations) that narrow data access to the bare minimum that clients should know about
Aggregates - classes that unify access to several related entities via a single interface (e.g. order and line items)
Domain Objects - classes that operate purely on domain/business logic, and do not care about persistence, presentation, or other concerns
Infrastructure - classes/layers that glue different objects or layers together; contains the actual implementation details that are not important to real application/user at all (e.g. how data is written to database, how HTTP form is mapped to view models).
Repository provides access to a very specific, usually single, kind of domain object. They emulate collection of objects, to some extent. Services usually operate on very different types of objects, usually accessed via static methods (do not have state), and can perform any operation (e.g. send email, prepare report), while repositories concentrate on CRUD methods.
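The difference can be sketched in a few lines. This is only an illustration under my own assumptions: `CustomerRepository`, `WelcomeService`, and the `send_mail` callable are hypothetical, and the service takes its collaborators through the constructor rather than using static methods, which is a design choice made here so that it is easy to test.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: int
    email: str


class CustomerRepository:
    """Collection-like access to one kind of domain object."""

    def __init__(self):
        self._items = {}

    def add(self, customer):
        self._items[customer.customer_id] = customer

    def get(self, customer_id):
        return self._items.get(customer_id)

    def remove(self, customer_id):
        self._items.pop(customer_id, None)


class WelcomeService:
    """Holds no data of its own; coordinates a repository and a mail sender."""

    def __init__(self, customers, send_mail):
        self._customers = customers
        self._send_mail = send_mail

    def welcome(self, customer_id):
        customer = self._customers.get(customer_id)
        if customer is None:
            return False
        self._send_mail(customer.email, "Welcome!")
        return True


sent = []  # fake mail sender: records messages instead of sending them
repo = CustomerRepository()
repo.add(Customer(1, "ann@example.com"))
service = WelcomeService(repo, lambda to, body: sent.append((to, body)))
ok = service.welcome(1)
```

The repository only stores and retrieves customers; anything that is an operation across objects or layers (here, sending mail) lives in the service.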
DDD what all terms mean for Joe the plumber who can’t afford to read books few times?
I would say - not much. Not enough for sure.
I think you're being quite ambitious in trying to apply a new technique to a project that's under such tight deadlines that you can't take the time to study the technique in detail.
At a high level DDD is about decomposing your solution into layers and allocating responsibilities cleanly. If you attempt just to do that in your application you're likely to get some benefit. Later, when you have more time to study, you may discover that you didn't quite follow all the details of the DDD approach - I don't see that as a problem, you probably already got some benefit from a thoughtful structure even if you deviated from some of the DDD guidance.
To specifically answer your question in detail would just mean reiterating material that's already out there: Seems to me that this document nicely summarises the terms you're asking about.
They say about Services:
Some concepts from the domain aren’t
natural to model as objects. Forcing
the required domain functionality to
be the responsibility of an ENTITY or
VALUE either distorts the definition
of a model-based object or adds
meaningless artificial objects.
Therefore: When a significant process
or transformation in the domain is not
a natural responsibility of an ENTITY
or VALUE OBJECT, add an operation to
the model as a standalone interface
declared as a SERVICE.
Now the thing about this kind of wisdom is that to apply it you need to be able to spot when you are "distorting the definition". And I suspect that only with experience (or guidance from someone who is experienced) do you gain the insight to spot such things.
You must expect to experiment with ideas, get it a bit wrong sometimes, then reflect on why your decisions hurt or work. Your goal should not be to do DDD for its own sake, but to produce good software. When you find it cumbersome to implement something, or difficult to maintain something, think about why this is, then examine what you did in the light of DDD advice. At that point you may say "Oh, if I had made that a Service, the Model would be so much cleaner", or whatever.
You may find it helpful to read an example:
In Domain Driven Design are collection properties of entities allowed to have partial values?
For example, should properties such as Customer.Orders, Post.Comments, Graph.Vertices always contain all orders, comments, vertices or it is allowed to have today's orders, recent comments, orphaned vertices?
Correspondingly, should Repositories provide methods like
GetCustomerWithOrdersBySpecification
GetPostWithCommentsBefore
etc.?
I don't think that DDD tells you to do or not to do this. It strongly depends on the system you are building and the specific problems you need to solve.
I haven't even heard of any patterns for this.
From a subjective point of view, I would say that entities should be complete by definition (allowing for lazy loading), and could be completely or partially loaded into DTO's to optimize the amount of data sent to clients. But I wouldn't mind loading partial entities from the database if it solved some problem.
Remember that Domain-Driven Design also has a concept of services. For performing certain database queries, it's better to model the problem as a service than as a collection of child objects attached to a parent object.
A good example of this might be creating a report by accepting several user-entered parameters. It may be easier to model this as:
CustomerReportService.GetOrdersByOrderDate(Customer theCustomer, Date cutoff);
Than like this:
myCustomer.OrdersCollection.SelectMatching(Date cutoff);
Or to put it another way, the DDD model you use for data entry does not have to be the same as the DDD model you use for reporting.
In highly scalable systems, it's common to separate these two concerns.
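A report service of the kind suggested above could be sketched as follows. The names mirror the `CustomerReportService.GetOrdersByOrderDate` call from the answer, but the body is my own assumption: the in-memory list stands in for a database query, and treating `cutoff` as "on or after this date" is a guess at the intended meaning.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    order_id: int
    customer_id: int
    order_date: date


class CustomerReportService:
    """Query-side service: answers a report question directly,
    without loading a partial Customer.Orders collection."""

    def __init__(self, orders):
        self._orders = list(orders)  # stand-in for a database query

    def get_orders_by_order_date(self, customer_id, cutoff):
        # Assumption: "cutoff" means orders placed on or after that date.
        return [
            o for o in self._orders
            if o.customer_id == customer_id and o.order_date >= cutoff
        ]


orders = [
    Order(1, 7, date(2024, 1, 5)),
    Order(2, 7, date(2024, 3, 9)),
    Order(3, 8, date(2024, 3, 10)),
]
report = CustomerReportService(orders)
recent = report.get_orders_by_order_date(7, date(2024, 2, 1))
```

The Customer entity keeps a complete Orders collection (or none at all), while filtered views of orders come from the service.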