We are weighing the pros and cons between Model First and DB first development. I found the following
Model First
Pros
1. C# dev need not understand SQL stmts.
2. C# dev can create a more specific entity for his business req.
Cons
1. Cannot use finer features available in vendor DB (i.e. Constraints for column in SQL server)
For DB first the pros and cons reverses. Is there anything big am missing here. I am slowly tilting towards DB first pls. advice.
According to your project, currently you can use three available approaches: DB First, Model First and Code First.
Obviously there are trade-offs which should be considered before using each of them.
While developing business objects and their relationship, you have already produced much of your data model and prohibiting this redundancy is the idea behind new approaches and technologies. With this simple point in mind, if data model is going to change frequently, DB First is not a good choice.
Although the whole idea is brilliant, new approaches are not mature enough yet (witness Entity Framework frequent releases and major changes). They cannot completely eliminate mentioned redundancy and they cannot cover all of DB First capabilities in their frameworks. Also, the work is done through middle layers which diminishes performance compared to DB First approach. Unfortunately you will not notified about uncovered capabilities unless you are in the middle of your project. So, if you need sophisticated capabilities or have a complex data model and the business model is not subject to frequent changes, and/or performance is a high priority, DB First is your choice.
What is happening at this struggle, to my opinion is clarifying the right question which should be answered. As database technologies priority is stability, they are not subject to frequent or major changes (witness relational databases). From development point of view, we have our objects and their relations defined and storing them in collections and easily retrieving them when needed so why should bother where and how this volatile objects (they are in RAM at run-time) are going to be stored. While mentioned technologies are endeavors from developers side, there are endeavors from database technology side i.g. In Memory Databases, Object Oriented databases...
Related
I've read a few posts relating to this, but i still can't quite grasp how it all works.
Let's say for example i was building a site like Stack Overflow, with two pages => one listing all the questions, another where you ask/edit a question. A simple, CRUD-based web application.
If i used CQRS, i would have a seperate system for the read/writes, seperate DB's, etc..great.
Now, my issue comes to how to update the read state (which is, after all in a DB of it's own).
Flow i assume is something like this:
WebApp => User submits question
WebApp => System raises 'Write' event
WriteSystem => 'Write' event is picked up and saves to 'WriteDb'
WriteSystem => 'UpdateState' event raised
ReadSystem => 'UpdateState' event is picked up
ReadSystem => System updates it's own state ('ReadDb')
WebApp => Index page reads data from 'Read' system
Assuming this is correct, how is this significantly different to a CRUD system read/writing from same DB? Putting aside CQRS advantages like seperate read/write system scaling, rebuilding state, seperation of domain boundaries etc, what problems are solved from a persistence standpoint? Lock contention avoided?
I could achieve a similar advantage by either using queues to achieve single-threaded saves in a multi-threaded web app, or simply replicate data between a read/write DB, could i not?
Basically, I'm just trying to understand if i was building a CRUD-based web application, why i would care about CQRS, from a pragmatic standpoint.
Thanks!
Assuming this is correct, how is this significantly different to a CRUD system read/writing from same DB? Putting aside CQRS advantages like seperate read/write system scaling, rebuilding state, seperation of domain boundaries etc, what problems are solved from a persistence standpoint? Lock contention avoided?
The problem here is:
"Putting aside CQRS advantages …"
If you take away its advantages, it's a little bit difficult to argue what problems it solves ;-)
The key in understanding CQRS is that you separate reading data from writing data. This way you can optimize the databases as needed: Your write database is highly normalized, and hence you can easily ensure consistency. Your read database in contrast is denormalized, which makes your reads extremely simple and fast: They all become SELECT * FROM … effectively.
Under the assumption that a website as StackOverflow is way more read from than written to, this makes a lot of sense, as it allows you to optimize the system for fast responses and a great user experience, without sacrificing consistency at the same time.
Additionally, if combined with event-sourcing, this approach has other benefits, but for CQRS alone, that's it.
Shameless plug: My team and I have created a comprehensive introduction to CQRS, DDD and event-sourcing, maybe this helps to improve understanding as well. See this website for details.
A good starting point would be to review Greg Young's 2010 essay, where he tries to clarify the limited scope of the CQRS pattern.
CQRS is simply the creation of two objects where there was previously only one.... This separation however enables us to do many interesting things architecturally, the largest is that it forces a break of the mental retardation that because the two use the same data they should also use the same data model.
The idea of multiple data models is key, because you can now begin to consider using data models that are fit for purpose, rather than trying to tune a single data model to every case that you need to support.
Once we have the idea that these two objects are logically separate, we can start to consider whether they are physically separate. And that opens up a world of interesting trade offs.
what problems are solved from a persistence standpoint?
The opportunity to choose fit for purpose storage. Instead of supporting all of your use cases in your single read/write persistence store, you pull documents out of the key value store, and run graph queries out of the graph database, and full text search out of the document store, events out of the event stream....
Or not! if the cost benefit analysis tells you the work won't pay off, you have the option of serving all of your cases from a single store.
It depends on your applications needs.
A good overview and links to more resources here: https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs
When to use this pattern:
Use this pattern in the following situations:
Collaborative domains where multiple operations are performed in parallel on the same data. CQRS allows you to define commands with
enough granularity to minimize merge conflicts at the domain level
(any conflicts that do arise can be merged by the command), even when
updating what appears to be the same type of data.
Task-based user interfaces where users are guided through a complex process as a series of steps or with complex domain models.
Also, useful for teams already familiar with domain-driven design
(DDD) techniques. The write model has a full command-processing stack
with business logic, input validation, and business validation to
ensure that everything is always consistent for each of the aggregates
(each cluster of associated objects treated as a unit for data
changes) in the write model. The read model has no business logic or
validation stack and just returns a DTO for use in a view model. The
read model is eventually consistent with the write model.
Scenarios where performance of data reads must be fine tuned separately from performance of data writes, especially when the
read/write ratio is very high, and when horizontal scaling is
required. For example, in many systems the number of read operations
is many times greater that the number of write operations. To
accommodate this, consider scaling out the read model, but running the
write model on only one or a few instances. A small number of write
model instances also helps to minimize the occurrence of merge
conflicts.
Scenarios where one team of developers can focus on the complex domain model that is part of the write model, and another team can
focus on the read model and the user interfaces.
Scenarios where the system is expected to evolve over time and might contain multiple versions of the model, or where business rules
change regularly.
Integration with other systems, especially in combination with event sourcing, where the temporal failure of one subsystem shouldn't
affect the availability of the others.
This pattern isn't recommended in the following situations:
Where the domain or the business rules are simple.
Where a simple CRUD-style user interface and the related data access operations are sufficient.
For implementation across the whole system. There are specific components of an overall data management scenario where CQRS can be
useful, but it can add considerable and unnecessary complexity when it
isn't required.
On SQLAlchemy's documentation page the author starts with a philosophy,
SQL databases behave less like object collections the more size and
performance start to matter; object collections behave less like
tables and rows the more abstraction starts to matter.
I'm scratching my head trying to understand the idea behind these two sentences, but failed. Could someone give an example illustrate the idea here? Thanks.
When you are creating an application using an Object Oriented language and a SQL database, you are simultaneously working with two very different conceptual models for storing information:
The relational model says how to store data in tables and rows and how to link elements through keys and joins.
The object model establishes a way to store entities with attributes in memory (usually) and how to set links between them using pointers or references.
So, let's say that you have an User entity that is linked to addresses and other users in your application. Those entities will need to be stored in the form several tables in the database (users table, addresses table and a many to many table for associating users to users, for instance). At the same time, if your code uses object oriented constructs, users and addresses will exist in memory in the form of objects with references between them, pointing to objects of the same or different kind.
The thing is, moving information between those two different worlds is much much more difficult than it looks at first:
You might associate one object with one row in a table, but that is not always possible and sometimes a single object must be associated to multiple rows in different tables.
Inheritance and polymorphic behavior are particularly difficult to map to a relational model.
Traversing objects and querying the database are vastly different actions.
Performance factors to take into account in an object model and a relational model are completely different.
And those are just a few examples. ORMs such as SQLAlchemy are essentially translators that convert information from one world into the other and back.
What I think that Mike Bayer was trying to convey is: the more you adapt your entity information to the object model (lots of inheritance, polymorphism, traversal of objects, ...), the farther it will resemble the natural structure in a relational model and the more performance concessions you will be making. And the other way around: the more you design your tables to perform well and be optimized for your queries, the less they will adapt to a natural structure of objects.
Martin Fowler has a nice write-up about the need of this translation in this article: ORM Hate (from which I took the above image).
Edit: further clarification on the abstraction vs performance issue
At the end, I think that the bottom line of that SQLAlchemy presentation text is: many ORMs hide the relational side of the relational-object oriented translation to make things easier. With them you only have to worry about the object oriented side, and the library is in charge of taking away the burden of dealing with the database. You get persistence for your objects without having to deal with SQL. However, they incur in a performance penalty in doing so, because the details of working with the database are abstracted away and you have no control over them. And those details are essential when you have to optimize performance. SQLAlchemy takes the opposite approach. It hides nothign of the relational side, you are in control of how SQL is generated and when and when not use joins, subqueries and other SQL constructs. That makes it a much more complex library to learn, but at the same time you are in control of the whole relational-object oriented translation process.
I've got a question on my mind that has been stirring for months as I've read about DDD, patterns and many other topics of application architecture. I'm going to frame this in terms of an MVC web application but the question is, I'm sure, much broader. and it is this: Does the adherence to domain entities create rigidity and inefficiency in an application?
The DDD approach makes complete sense for managing the business logic of an application and as a way of working with stakeholders. But to me it falls apart in the context of a multi-tiered application. Namely there are very few scenarios when a view needs all the data of an entity or when even two repositories have it all. In and of itself that's not bad but it means I make multiple queries returning a bunch of properties I don't need to get a few that I do. And once that is done the extraneous information either gets passed to the view or there is the overhead of discarding, merging and mapping data to a DTO or view model. I have need to generate a lot of reports and the problem seems magnified there. Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities. It seems wasteful, honestly, and I don't want to pound a database and generate unneeded network traffic on a matter of principle. From questions like this Should the repository layer return data-transfer-objects (DTO)? it seems I'm not the only one to struggle with this question. So what's the answer to the limitations it seems to impose?
Thanks from a new and confounded DDD-er.
What's the real problem here? Processing business rules and querying for data are 2 very different concerns. That realization leads us to CQRS - Command-Query Responsibility Segregation. What's that? You just don't use the same model for both tasks: Domain Model is about behavior, performing business processes, handling command. And there is a separate Reporting Model used for display. In general, it can contain a table per view. These tables contains only relevant information so you can get rid of DTO, AutoMapper, etc.
How these two models synchronize? It can be done in many ways:
Reporting model can be built just on top of database views
Database replication
Domain model can issue events containing information about each change and they can be handled by denormalizers updating proper tables in Reporting Model
as I've read about DDD, patterns and many other topics of application architecture
Domain driven design is not about patterns and architecture but about designing your code according to business domain. Instead of thinking about repositories and layers, think about problem you are trying to solve. Simplest way to "start rehabilitation" would be to rename ProductRepository to just Products.
Does the adherence to domain entities create rigidity and inefficiency in an application?
Inefficiency comes from bad modeling. [citation needed]
The DDD approach makes complete sense for managing the business logic of an application and as a way of working with stakeholders. But to me it falls apart in the context of a multi-tiered application.
Tiers aren't layers
Namely there are very few scenarios when a view needs all the data of an entity or when even two repositories have it all. In and of itself that's not bad but it means I make multiple queries returning a bunch of properties I don't need to get a few that I do.
Query that data as you wish. Do not try to box your problems into some "ready-made solutions". Instead - learn from them and apply only what's necessary to solve them.
Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities.
http://ayende.com/blog/3955/repository-is-the-new-singleton
So what's the answer to the limitations it seems to impose?
"seems"
Btw, internet is full of things like this (I mean that sample app).
To understand what DDD is, read blue book slowly and carefully. Twice.
If you think that fully fledged DDD is too much effort for your scenario then maybe you need to take a step down and look at something closer to Active Record.
I use DDD but in my scenario I have to support multiple front-ends; a couple web sites and a WinForms app, as well as a set of services that allow interaction with other automated processes. In this case, the extra complexity is worth it. I use DTO's to transfer a representation of my data to the various presentation layers. The CPU overhead in mapping domain entities to DTO's is small - a rounding error when compared to net work calls and database calls. There is also the overhead in managing this complexity. I have mitigated this to some extent by using AutoMapper. My Repositories return fully populated domain objects. My service layer will map to/from DTO's. Here we can flatten out the domain objects, combine domain objects, etc. to produce a more tabulated representation of the data.
Dino Esposito wrote an MSDN Magazine article on this subject here - you may find this interesting.
So, I guess to answer your "Why" question - as usual, it depends on your context. DDD maybe too much effort. In which case do something simpler.
Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities.
Add methods to your repository to return ONLY what you want e.g. IOrderRepository.GetByCustomer
It's completely OK in DDD.
You may also use Query object pattern or Specification to make your repositories more generic; only remember not to use anything which is ORM-specific in interfaces of the repositories(e.g. ICriteria of NHibernate)
Im new to subsonic and generally this was of programming, i usually develop from a rad perspective so using the visual studio dataset designer, but i wanted to start looking at developing n teir approach.
Ive never used a business logic layer, (naughy) normally my code behind takes care of validation so to speak aswell as general page level validation.
How can i generate my business logic, do i create a partial class of one of my classes and then add the business logic into this? and how would this look? just so i have an idea.
Any exmaples or advice would be greatly appreciated.
Thanks
Dan
The big gotchya with SubSonic is that it generates classes from database tables, there is a 1-to-1 correspondence between the two. That makes the classes SubSonic generates quite unsuitable for use as business objects, because it would tie your business layer very directly to your database structure. This is a bad thing (in nearly all scenarios that come to my mind, anyway).
SubSonic is a query tool and little more. It most certainly is not an ORM.
With that in mind, I believe the correct way to create a Business Logic Layer is to write your own business classes, and write Repository classes to manage loading and storing the data. But use SubSonic only internally to the Repository classes to handle the actual persisting of your data to the database.
If you use the SubSonic generated classes throughout your project you will find you are most likely doing it wrong, and the first significant change to your DB schema will illustrate that nicely (or .. not nicely).
In fact, I would recommend quickly moving into learning a real ORM like NHibernate or Entity Framework. They bring you much farther down the Happy Path, whereas SubSonic still requires one to do much of the Data Layer implementation themselves.
I am on a tight schedule with my project so don't have time to read books to understand it.
Just like anything else we can put it in few lines after reading books for few times. So here i need some description about each terms in DDD practices guideline so I can apply them bit at a piece to my project.
I already know terms in general but can't put it in terms with C# Project.
Below are the terms i have so far known out of reading some brief description in relation with C# project. Like What is the purpose of it in C# project.
Services
Factories
Repository
Aggregates
DomainObjects
Infrastructure
I am really confused about Infrastructure, Repository and Services
When to use Services and when to use Repository?
Please let me know if anyway i can make this question more clear
I recommend that you read through the Domain-Driven Design Quickly book from infoq, it is short, free in pdf form that you can download right away and does its' best to summarize the concepts presented in Eric Evan's Blue Bible
You didn't specify which language/framework the project you are currently working on is in, if it is a .NET project then take a look at the source code for CodeCampServer for a good example.
There is also a fairly more complicated example named Fohjin.DDD that you can look at (it has a focus on CQRS concepts that may be more than you are looking for)
Steve Bohlen has also given a presentation to an alt.net crowd on DDD, you can find the videos from links off of his blog post
I've just posted a blog post which lists these and some other resources as well.
Hopefully some of these resources will help you get started quickly.
This is my understanding and I did NOT read any DDD book, even the holy bible of it.
Services - stateless classes that usually operate on different layer objects, thus helping to decouple them; also to avoid code duplication
Factories - classes that knows how to create objects, thus decouple invoking code from knowing implementation details, making it easier to switch implementations; many factories also help to auto-resolve object dependencies (IoC containers); factories are infrastructure
Repository - interfaces (and corresponding implementations) that narrows data access to the bare minimum that clients should know about
Aggregates - classes that unifies access to several related entities via single interfaces (e.g. order and line items)
Domain Objects - classes that operate purely on domain/business logic, and do not care about persistence, presentation, or other concerns
Infrastructure - classes/layers that glue different objects or layers together; contains the actual implementation details that are not important to real application/user at all (e.g. how data is written to database, how HTTP form is mapped to view models).
Repository provides access to a very specific, usually single, kind of domain object. They emulate collection of objects, to some extent. Services usually operate on very different types of objects, usually accessed via static methods (do not have state), and can perform any operation (e.g. send email, prepare report), while repositories concentrate on CRUD methods.
DDD what all terms mean for Joe the plumber who can’t afford to read books few times?
I would say - not much. Not enough for sure.
I think you're being quite ambitious in trying to apply a new technique to a project that's under such tight deadlines that you can't take the time to study the technique in detail.
At a high level DDD is about decomposing your solution into layers and allocating responsibilities cleanly. If you attempt just to do that in your application you're likely to get some benefit. Later, when you have more time to study, you may discover that you didn't quite follow all the details of the DDD approach - I don't see that as a problem, you proabably already got some benefit of thoughtful structure even if you deviated from some of the DDD guidance.
To specifically answer your question in detail would just mean reiterating material that's already out there: Seems to me that this document nicely summarises the terms you're asking about.
They say about Services:
Some concepts from the domain aren’t
natural to model as objects. Forcing
the required domain functionality to
be the responsibility of an ENTITY or
VALUE either distorts the definition
of a model-based object or adds
meaningless artificial objects.
Therefore: When a significant process
or transformation in the domain is not
a natural responsibility of an ENTITY
or VALUE OBJECT, add an operation to
the model as a standalone interface
declared as a SERVICE.
Now the thing about this kind of wisdom is that to apply it you need to be able to spot when you are "distorting the definition". And I suspect that only with experience (or guidance from someone who is experienced) do you gain the insight to spot such things.
You must expect to experiment with ideas, get it a bit wrong sometimes, then reflect on why your decisions hurt or work. Your goal should not be to do DDD for its own sake, but to produce good software. When you find it cumbersome to implement something, or difficult to maintain something think about why this is, then examine what you did in the light of DDD advice. At that point you may say "Oh, if I had made that a Service, the Model would be so nmuch cleaner", or whatever.
You may find it helpful to read an example.: