I am currently working on a monolithic system which I would like to bring into the modern day by incorporating DDD and CQRS. I have been asked to re-write the importing mechanism for the solution, and I feel this could be a good opportunity to start this re-architecting process.
Currently the process is:
User uploads CSV
System parses the CSV and shows each row on screen. Validation runs for each row, and errors/warnings are associated with each row
User can modify each line and re-validate all rows
User then selects rows that don't have errors and submits the import
Rows import, and any non-selected rows, or rows with errors, go into a holding area so the user can deal with them at a later date
An additional detail is that multiple rows could belong to the same entity (e.g. two rows could be line items in an order, so they would have the same Order Ref).
I was thinking of having an import saga that would generate a bunch of import aggregates (e.g. OrderImportAggregate); when the import is submitted, those would get converted into the class used across the system currently, which would hopefully become an aggregate in its own right when re-architected further down the line! So the saga process would go something along these lines (a rough sketch in code follows the list):
[EntityType]FileImportUploaded - Stores the CSV
[EntityType]FileImportParsed - Generates n [EntityType]Import aggregates. [EntityType]ImportItemCreated events raised/handled
The process would call the validation routine that the current entities go through to generate a list of errors, if any, and store them against each item. [EntityType]ImportItemValidated events raised/handled
Each time a row is changed on screen, it calls a web api method with the saga and item id to update the details and re-validate the row as per point 3.
User submits the import; the service groups entities together (based on ref, for example), they get converted into the current system entity, and their import/save routine is called. [EntityType]ImportItemCompleted event raised.
Saga completes when all aggregates are in the ImportItemCompleted state
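To make that concrete, here is a rough sketch of the events and saga state those steps imply. All names here (Order standing in for [EntityType], the fields on each event) are my own illustration, not a worked-out design:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event shapes mirroring the steps above.
public record OrderFileImportUploaded(Guid ImportId, string CsvContent);
public record OrderImportItemCreated(Guid ImportId, Guid ItemId, string OrderRef);
public record OrderImportItemValidated(Guid ImportId, Guid ItemId, IReadOnlyList<string> Errors);
public record OrderImportItemCompleted(Guid ImportId, Guid ItemId);

// The saga tracks outstanding items and completes when none remain.
public class OrderImportSaga
{
    private readonly HashSet<Guid> _openItems = new();

    public bool IsComplete => _openItems.Count == 0;

    public void Handle(OrderImportItemCreated e) => _openItems.Add(e.ItemId);
    public void Handle(OrderImportItemCompleted e) => _openItems.Remove(e.ItemId);
}
```

The saga would subscribe to these events and trigger the conversion into the current system entity once IsComplete turns true.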
As this was my first implementation of CQRS/Event Sourcing/DDD, I wanted to start off on the right foundation, so I was wondering if this is a desirable approach for this functionality?
I suggest that you break your domain into two separate sub-domains implemented as two separate bounded contexts: one being the Import bounded context (ImportBC) and the other being the receiving bounded context (ReceivingBC; the actual name is not known to me, please replace it accordingly).
Then, in the ImportBC you should implement using the CRUD style, having an entity for each import file, and use persistence to remember the progress of the validation and import process (this entity holds a list of not-yet-imported items). After each item is validated by a human, a command could be sent to the aggregates in the ReceivingBC to test whether the aggregate is valid according to the business rules, but without committing the changes to the repository! You do this so that the human user would know if the item is indeed valid, and to enable/disable an import button. In this way you don't duplicate the validation logic across the two bounded contexts. When the user actually presses the import button, send the import command to the aggregate in the ReceivingBC and actually commit the changes to the repository. Also, remove the import item from the import file CRUD entity.
This technique of sending commands without actually persisting to the repository is useful in improving the user experience in the UI (without duplicating logic inside the UI), and it is doable if you follow DDD best practices and design your aggregates to be pure, side-effect-free objects (to be repository agnostic: to not know of the repositories' existence, to not use them at all!).
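As a minimal sketch of that dry-run idea (all type and method names here are invented for the example), the same aggregate method backs both the validation call and the real import; only the handler decides whether anything is persisted:

```csharp
using System;
using System.Collections.Generic;

public class ImportItemCommand
{
    public Guid OrderId { get; init; }
    public decimal Amount { get; init; }
}

public class OrderAggregate
{
    // Pure domain logic: returns violations without touching any repository.
    public IReadOnlyList<string> CheckImport(ImportItemCommand cmd)
    {
        var errors = new List<string>();
        if (cmd.Amount <= 0) errors.Add("Amount must be positive.");
        return errors;
    }

    public void Import(ImportItemCommand cmd)
    {
        if (CheckImport(cmd).Count > 0)
            throw new InvalidOperationException("Invalid import.");
        // ...apply the state changes here...
    }
}

// The dry-run handler runs the same checks as the real import,
// but never saves anything.
public class ImportDryRunHandler
{
    private readonly Func<Guid, OrderAggregate> _loadAggregate; // repository lookup, injected

    public ImportDryRunHandler(Func<Guid, OrderAggregate> loadAggregate) =>
        _loadAggregate = loadAggregate;

    public IReadOnlyList<string> Validate(ImportItemCommand cmd) =>
        _loadAggregate(cmd.OrderId).CheckImport(cmd);
}
```

The enable/disable state of the import button can then simply key off whether Validate returns an empty list.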
Well, first of all you have to ask yourself why you are using CQRS. CQRS is the heavy 18-wheeler amongst architectures. I know of two good reasons that scream CQRS:
1) You need to support undo functionality.
2) In the future, when new requirements are implemented, you want to apply them to past data too.
The part of the requirements that you are describing, however, feels very much like CRUD. (You import a set of rows, you list a set of rows, you edit those rows, and the ones marked as completed are then deleted from their input state and converted into some other kind of entity.)
If you feel there is a lot of complexity in describing the specific entities and the validation rules that apply, then DDD would be a good fit. But I would still consider scaling it down and building a simple MVC-style app to implement this (depending what else is required of this project).
And even if this were part of a larger domain, I would suggest a microservices approach where this would be a completely standalone import application (and in that case you could still raise an ImportCompleted event and put it on a service bus, with multiple other applications listening to that event).
NOTE: CQRS is not event sourcing; CQRS is separating a command (update) stack from a query stack. It's often applied in combination with event sourcing. But having events that pop up everywhere can be a pain to maintain, especially since it's often less obvious who is raising an event and whether events interact with each other (what happens to an order if both an OrderCompleted and an OrderCanceled event are raised, possibly with timing issues over which one is handled first?).
I'm not a DDD expert, but these are my thoughts on approaching this. I wouldn't use a separate bounded context, because it feels to me the import of domain objects can ideally live in the same bounded context as the one they are a part of. Keen to hear from experts why it would be wrong!
Parse the CSV into an aggregate representing the data import and persist this (to the staging area/tables etc.). We can load this aggregate from here in future. The parsing of the CSV file to create this aggregate could be modelled as a command, "CreateDataImportFromCsvFile" etc.
Build a UI that loads this aggregate and displays it. The aggregate can contain a list of domain objects, "customer import items", and each "customer import item" can contain an "IsSelected" property as well as the domain object being imported, i.e. the "customer" domain object itself. This means you don't duplicate validation rules, as you are using the actual domain objects you intend to import. You hydrate those objects and display them in the UI.
When the user clicks the import button, you issue a command. You handle that command by looping through each selected and valid "import item" on the aggregate, calling Save() on its domain model, and then marking the import item as processed. Ideally do this all within an outer transaction scope (depending on whether you want atomicity vs eventual consistency etc.). Your UI can then optionally not display processed import items, or display them in a disabled state, depending on whether it is useful for the user to also see what has been processed so far vs what's remaining.
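A rough sketch of that shape, with invented names; the point is that each import item wraps the real domain object, so its validation stays the single source of truth:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The real domain object being imported; validation lives here.
public class Customer
{
    public string Name { get; set; } = "";
    public bool IsValid() => !string.IsNullOrWhiteSpace(Name);
    public void Save() { /* persist via the usual domain path */ }
}

public class CustomerImportItem
{
    public bool IsSelected { get; set; }
    public bool IsProcessed { get; private set; }
    public Customer Customer { get; init; } = new();
    public void MarkProcessed() => IsProcessed = true;
}

// The aggregate representing the data import as a whole.
public class DataImport
{
    public List<CustomerImportItem> Items { get; } = new();

    // Handles the "import selected" command described above.
    public void ImportSelected()
    {
        foreach (var item in Items.Where(i =>
            i.IsSelected && !i.IsProcessed && i.Customer.IsValid()))
        {
            item.Customer.Save();
            item.MarkProcessed();
        }
    }
}
```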
I am trying to model a real-time collaboration application with DDD. One feature, with some hotspot events, is CAD visualization.
Problem #1
Multiple participants join a 3D virtual environment and one of them is designated as a facilitator. Although all participants can change various preferences for themselves, the facilitator can change preferences for all users. The users can change them back on an individual level.
The problem I am facing is single vs bulk operations. Do I submit granular events for a bulk operation, or a single event? If an existing process listens to the granular event, it will miss the bulk event unless it is communicated explicitly, which doesn't result in a clean boundary.
Problem #2
Interestingly enough this is a variation of problem #1 but a bit more severe. A CAD model comes with some meta-structure which is a DAG. Each leaf level structure is a group of triangles that are manipulated together. These groups of triangles are called Volume. A group of volumes forms another concept known as a Branch. A branch can contain other branches as a child. The branch+volume structure always forms a tree. Some disjoint tree branches form another concept called Group.
Now a participant can make a branch/group/volume visible or hidden. Do I publish a single branch-level event, or create an event for every branch/volume in the forward path?
I have thought about publishing bulk events for bulk operations and single events for single operations under the same topic. This doesn't feel good, as I may introduce new bulk events and force another downstream context to break.
Alternatively, I thought about publishing both bulk and granular events with a correlation_id. If the bulk event is understood, the downstream can ignore the following events with the same correlation id. Although this seems promising, it still doesn't feel good, as the downstream may process events concurrently and later events could be processed earlier than the bulk event.
Can bulk operations be properly modeled using DDD? Is there a way to rethink the composite pattern which is more DDD friendly?
1.) Bulk event: the id can be a query for all the matching ids at that moment, or the explicit list of matching ids. You need it because if you want to revert the event somehow, you will have a problem if you lose the connection between the individual events. It is information which must be stored too.
2.) This looks like some sort of weird graph. It reminds me of the knowledge graph of the sciences: math, physics, chemistry, biology, etc., where everything builds on math and they are interrelated, yet people still want to force them into a hierarchy. The problem is that there are terms which are halfway between two sciences, so when you select a term you cannot decide which science it belongs to. The same solution, selecting things with queries, works for this too. I have thought a lot about this problem as well. Having a shitload of individual events will require massive storage space after a certain size; better to use bulk events with queries and compute them, or save the id list as a query cache, but don't duplicate anything else. As for the semi-hierarchical structure, I have no idea how to model it properly. I would use a simple graph, tag everything, and query based on the tags, but there is still a sort of hierarchy, which is hard to grasp from a pure graph perspective without some kind of weighting.
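For what it's worth, a sketch of what such a bulk event could carry (names invented for the example); storing both the query and the resolved ids keeps the revert path available even if the underlying data changes later:

```csharp
using System;
using System.Collections.Generic;

// A bulk event that records both how the targets were selected (the query)
// and the snapshot of ids it actually matched at that moment.
public record PreferencesChangedForAll(
    Guid CorrelationId,
    string Query,                    // e.g. "participants in session X"
    IReadOnlyList<Guid> MatchedIds,  // resolved at publish time, stored for reverts
    string PreferenceKey,
    string NewValue);
```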
I was asked to implement CQRS/Event sourcing patterns into a legacy web application, in order to prepare to migrate it from a monolithic/state oriented model to a distributed, service oriented app.
I have some questions on how I can design a Domain-oriented code bundle that would connect the legacy entities, strongly coupled to the database, with a new event-sourced model.
The first things I did were:
writing a small "framework" for CQRS/ES, with classes like AggregateRoot, DomainEvent, Command, Handlers, Messaging, Eventstore, AggregateIds, etc. (a sketch of the core idea follows this list)
trying to group and "migrate" the legacy Entities into some Aggregates, to reconstruct all the history and states of the app as EventSourced Aggregates
plugging some Command dispatching into the old controllers, in order to let the app work as-is but also feed the new CQRS/ES system on the side.
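To give an idea, the AggregateRoot piece looks something like this (a simplified illustration, not the exact code):

```csharp
using System;
using System.Collections.Generic;

public abstract record DomainEvent(Guid AggregateId);

// Minimal event-sourced aggregate base: new events are recorded and applied,
// and history can be replayed to rebuild state.
public abstract class AggregateRoot
{
    private readonly List<DomainEvent> _uncommitted = new();

    public Guid Id { get; protected set; }
    public int Playhead { get; private set; } = -1;

    protected void Raise(DomainEvent e)
    {
        Apply(e);
        _uncommitted.Add(e);
    }

    public void ReplayHistory(IEnumerable<DomainEvent> history)
    {
        foreach (var e in history) Apply(e);
    }

    private void Apply(DomainEvent e)
    {
        Playhead++;
        When(e); // each concrete aggregate mutates its own state here
    }

    protected abstract void When(DomainEvent e);

    public IReadOnlyList<DomainEvent> TakeUncommittedEvents()
    {
        var events = _uncommitted.ToArray();
        _uncommitted.Clear();
        return events;
    }
}
```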
The context:
The legacy app contains several entities, mapped to the database, that hold the model layer. Our domain is human resources (manpower).
Let's say we have those existing entities:
Worker, with various fields and related entities (OneToOne, OneToMany), like
name
address 1-1
competences 1-N
Society, in which worker works, with various fields and related entities (OneToOne, OneToMany), like
name
address 1-1
hours
Contract, with various fields and related entities (OneToOne, OneToMany), like
address 1-1
Worker 1-1
Society 1-1
documents 1-N
days 1-N
hours
etc.
From this legacy model, I designed a MissionAggregate that holds:
A db independent ID, like UUID
some Value objects: address, days (they were an entity in the legacy model, they became VOs here)
I also designed a WorkerAggregate and a SocietyAggregate, with fields and UUIDs, and in the MissionAggregate I added:
a reference to WorkerAggregate's UUID
a reference to SocietyAggregate's UUID
As I said earlier, my aim is to leave the legacy app as is, but just introduce in the CRUD controller's methods some calls to dispatch Commands to the new CQRS system.
For example:
After flushing a newly created Contract in the database, I want to dispatch a "CreateMissionCommand" to the new command bus.
It targets the appropriate Command Handler, which takes all the command's data, passes it to a newly created Aggregate with a new UUID, and stores a "MissionCreatedDomainEvent" in the EventStore.
The DomainEvent is indexed by an AggregateId and a playhead, and has a payload which contains the fields necessary to build the MissionAggregate.
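Illustratively, each stored event is shaped something like this (simplified, field names assumed from the description above):

```csharp
using System;

// Shape of a stored event: indexed by aggregate id and playhead,
// with a serialized payload.
public record StoredEvent(
    Guid AggregateId,
    long Playhead,            // position of this event in the aggregate's stream
    string EventType,         // e.g. "MissionCreatedDomainEvent"
    string PayloadJson,       // the fields needed to (re)build the MissionAggregate
    DateTimeOffset RecordedAt);
```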
The newly created Contract in the app then has its former lifecycle, as usual, with all the updates the legacy app makes to it. But I also need to reflect all those changes in the corresponding EventSourced Aggregate, so every time there is a flush to the database in the app, I dispatch a Command that translates the "CRUD-like operations" of the legacy app into a Domain-oriented/Command-oriented pattern.
To sum up, the workflow is (a sketch in code follows the list):
A CRUD legacy operation occurs and flushes some changes on the Contract Entity
With just one row of code in the controller, I dispatch a command built with the necessary fields (the AggregateId of the MissionAggregate... which I need to have stored somewhere... see the problems below) to the Domain command bus, so that the impact on the existing code base is very low.
The bus passes the command to the corresponding command handler
The handler loads the aggregate and applies the changes by calling the appropriate Aggregate method
Then, after some validation, the aggregate raises and stores the appropriate event
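A sketch of the last two steps (simplified, with invented command and repository names):

```csharp
using System;

public record UpdateMissionHoursCommand(Guid MissionId, decimal Hours);

public interface IMissionRepository
{
    MissionAggregate Load(Guid id);   // rebuild from the event stream
    void Save(MissionAggregate m);    // append newly raised events
}

public class UpdateMissionHoursHandler
{
    private readonly IMissionRepository _repository;

    public UpdateMissionHoursHandler(IMissionRepository repository) =>
        _repository = repository;

    public void Handle(UpdateMissionHoursCommand cmd)
    {
        var mission = _repository.Load(cmd.MissionId);
        mission.UpdateHours(cmd.Hours); // validates, then raises the event
        _repository.Save(mission);
    }
}

public class MissionAggregate
{
    public Guid Id { get; private set; }
    public decimal Hours { get; private set; }

    public void UpdateHours(decimal hours)
    {
        if (hours < 0) throw new ArgumentOutOfRangeException(nameof(hours));
        Hours = hours; // in the real event-sourced model this raises a DomainEvent
    }
}
```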
My problems and questions (some of them at least) are:
I feel like I am rewriting big portions of the legacy app, with the same kind of relations between the Aggregates that I have between the Entities, and with the same types of validations, checks, etc.
Having references to both the WorkerAggregate and SocietyAggregate UUIDs in the MissionAggregate implies that I have to build those aggregates too (hence dispatch commands from the legacy app when the Worker and Society entities are flushed). Can't I have only references to the Worker's entity id and the Society's entity id?
How can I avoid an eternally growing MissionAggregate? The Contract Entity is quite huge; it has a lot of fields that are constantly updated (hours, days, documents, etc.). If I want to store all those events, I need a large MissionAggregate to reflect all those changes, and so I need tons of CommandHandlers that react to all the add, update, etc. Commands that I am going to dispatch from the legacy app.
How "free" is an Aggregate from the Root entity it is supposed to refer to ? For example, a Contract Entity needs to relate somewhere to it's related Mission Aggregate, like for example when I want to dispatch a Command from the app, just after the legacy code having flushed something on the Entity. Where to store this relation? In the Entity itself, in a AggregateId field? in the Aggregate, should I have a ContractId field? Or should I have some kind of Mapping Table somewhere that holds the relationship between Contract ID and MissionAggregate ID?
What to do with the past? Should I migrate all the existing data through a script that generates Aggregates and events from all the historical data?
Thanks in advance for your time.
You have a huge task ahead of you, let's try to break it down.
It's best to build this new part of the system in isolation from the legacy codebase, otherwise you're going to have your hands tied in every turn of the way.
Create a separate layer in your project for these new requirements. We're going to call it "bubble" from now on. This bubble will be like a greenfield project, with its own structure, dependencies, etc. There will be no direct communication between the bubble and the legacy; communication will happen through another dedicated translation layer, which we'll call "Anti-Corruption Layer" (ACL).
ACL
It is like an API between two systems.
It translates calls from the bubble to the legacy and vice-versa. Its purpose is to prevent one system from corrupting or influencing the other. This way you can keep building/maintaining each system independently from each other.
At the same time, the ACL allows one system to consume the other, and reuse logic, validations, rules, etc.
To answer your questions directly:
I feel like I am rewriting big portions of the legacy app, with the same kind of relations between the Aggregates that I have between the Entities, and with the same types of validations, checks, etc.
With the ACL, you can resort to calling validations and reuse implementations from the legacy code. This will allow you time to rewrite things as needed or as possible.
You may not need to rewrite the entire system, though. If your goal is to implement CQRS and Event Sourcing, and you can achieve this goal while keeping most or part of the legacy system, I would say do it. Unless, of course, one of the goals is to completely replace the old system. Otherwise, keep it; write as little code as possible.
Suggested workflow (sketched in code after the list):
Keep the CQRS and Event Sourcing system in the bubble
Do not bring these new frameworks into legacy
Make the legacy Controller issue method calls to the ACL
The ACL will convert these calls into Commands and dispatch them
Any events will be caught by your Event Sourcing framework
Results will be persisted to the bubble's database
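A sketch of how that ACL boundary could look in code; every interface here is an assumption, the point is only that the legacy side calls a plain method while the bubble side sees a command:

```csharp
using System;

public record UpdateMissionHoursCommand(Guid MissionId, decimal Hours);

public interface ICommandBus { void Dispatch(object command); }
public interface IAggregateIdMap { Guid ForContract(int legacyContractId); }

// The ACL translates legacy calls into bubble commands; neither side
// references the other's types directly.
public class ContractAntiCorruptionLayer
{
    private readonly ICommandBus _commandBus;
    private readonly IAggregateIdMap _idMap; // maps legacy ids to aggregate ids

    public ContractAntiCorruptionLayer(ICommandBus commandBus, IAggregateIdMap idMap)
    {
        _commandBus = commandBus;
        _idMap = idMap;
    }

    // Called from the legacy controller right after it flushes the Contract.
    public void ContractFlushed(int legacyContractId, decimal hours)
    {
        var missionId = _idMap.ForContract(legacyContractId);
        _commandBus.Dispatch(new UpdateMissionHoursCommand(missionId, hours));
    }
}
```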
The bubble's database can be a different schema in the same database or can be a different database altogether. But you'll have to think about synchronization, and that's a topic of its own. To reduce complexity, I recommend a different schema in the same database.
Having references to both the WorkerAggregate and SocietyAggregate UUIDs in the MissionAggregate implies that I have to build those aggregates too (hence dispatch commands from the legacy app when the Worker and Society entities are flushed). Can't I have only references to the Worker's entity id and the Society's entity id?
How can I avoid an eternally growing MissionAggregate? The Contract Entity is quite huge; it has a lot of fields that are constantly updated (hours, days, documents, etc.). If I want to store all those events, I need a large MissionAggregate to reflect all those changes, and so I need tons of CommandHandlers that react to all the add, update, etc. Commands that I am going to dispatch from the legacy app.
You should aim for small aggregates. Huge aggregates are likely to degrade performance and cause concurrency problems.
If you anticipate having a huge aggregate, it is best to rethink it and try to break it down. Ask what fields/properties change together - these are possibly a different aggregate.
Also, when you speak about CQRS, you generally lean towards a task-based way of doing things in your system.
Think of a traditional web application, where you have a huge page with lots of fields that are all sent to the server in one batch when the user saves.
Now, contrast it with a modern web app where the user changes small portions of data at each step. If you think about your system this way you'll find those smaller aggregates.
PS: You don't need to rebuild your interfaces for this. If your legacy system has those huge pages, you could have logic in the controllers to detect which fields were changed and issue the appropriate commands.
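For instance, a rough sketch of that controller-side diffing (all names invented):

```csharp
using System;
using System.Collections.Generic;

public record ContractForm(Guid Id, decimal Hours, string Address);
public record ChangeContractHours(Guid ContractId, decimal Hours);
public record ChangeContractAddress(Guid ContractId, string Address);
public interface ICommandBus { void Dispatch(object command); }

// The legacy controller receives the whole big form, diffs it against the
// current state, and issues only the task-based commands that apply.
public class ContractEditController
{
    private readonly ICommandBus _bus;

    public ContractEditController(ICommandBus bus) => _bus = bus;

    public void Save(ContractForm before, ContractForm after)
    {
        var commands = new List<object>();

        if (before.Hours != after.Hours)
            commands.Add(new ChangeContractHours(after.Id, after.Hours));
        if (before.Address != after.Address)
            commands.Add(new ChangeContractAddress(after.Id, after.Address));

        foreach (var cmd in commands) _bus.Dispatch(cmd);
    }
}
```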
How "free" is an Aggregate from the Root entity it is supposed to refer to ? For example, a Contract Entity needs to relate somewhere to it's related Mission Aggregate, like for example when i want to dispatch a Command from the app, just after the legacy code having flushed something on the Entity. Where to store this relation ? In the Entity itself, in a AggregateId field ? in the Aggregate, should i have a ContratId field ? Or should i have some kind of Mapping Table somewhere that holds the relationship between Contract ID and MissionAggregate ID?
Aggregates represent a conceptual whole. They are like atoms, indivisible things. You should always refer to an aggregate by its Root Entity Id, and never to a Child Entity Id: looking from the outside, there are no children.
An aggregate should be loaded as a whole and persisted as a whole. One more reason to have small aggregates.
An aggregate can be comprised of a single entity. Or it can have more entities and value objects, forming a graph, but one entity will be elected as the Root and will hold references to its children. Child entities and value objects should not hold references to their parents. The dependency is not bi-directional.
If Contract is an entity inside the Mission aggregate, the Contract should not have a reference to its parent.
But, if your Contract and Mission are different aggregates, then they can reference each other by their Ids.
What to do with the past? Should I migrate all the existing data through a script that generates Aggregates and events from all the historical data?
That's a question for the business experts. Do they need it? If they don't, then don't implement it just for the sake of doing so. Every decision you make should be geared towards satisfying a business need and generating real value for it, considering the costs and tradeoffs.
Some people say that code is a liability, not an asset, and I agree to some extent: every line of code you write needs to be tested and supported. Don't write any code that is not really necessary.
Also, have a look at this article about the Strangler Pattern, which shows how to migrate a legacy system by gradually replacing specific pieces of functionality with new applications and services.
If you have a chance, watch this course at Pluralsight (paid): Domain-Driven Design: Working with Legacy Projects. The author presents practical approaches for dealing with this kind of task.
I hope this has given you some insight.
I don't want to spoil your game. Everybody knows how cool it is to rewrite something from scratch. It's a challenge, it's fun, it's exciting. However...
migrate it from a monolithic/state oriented model to a distributed, service oriented app
CQRS/Event Sourcing won't solve any of your problems, and it won't help you distribute the app in any reasonable way. If you just generate events on the CRUD operations, you'll have a large tangled mess of dependencies between the parts. Every part that needs data will have to call a couple of "services" (i.e. tables) to get it, then push data elsewhere and generate events that some other parts will react to. It will be a mess. Usually this is called a distributed monolith.
This is also the reason you already see problems with it. These problems won't go away, because you are essentially building the same system in the same way, but this time it'll be more complex.
Where to go from here
The very first thing is always: have a clear goal. You said you want a service-oriented architecture. Why? Are there parts that need different scaling, different resources? Are they managed by different teams with different life-cycles? Etc. Maybe you already have all this, I don't know, but if not, that's your first task.
Then: the parts you do want to pull out can't be just CRUD things. Those will not be independent, so whether your goal (see the point above!) is scaling or team independence, you won't reach it! To be independent, you'll have to pull out the behavior with the data, and in a way that the service can operate on its own.
You can't just throw buzzwords at it and hope for the best. I'd suggest to just ignore all the hype and buzzwords and think about the goal you want to reach.
For example: I need a million workers to log their time in under 10 minutes total. That means I need a "service" to enable workers to log their time with a web interface. So let's create that as a completely independent piece with its own database, so it can be scaled to 100 nodes when it needs to be. Export data to billing automatically every hour or so.
I have Meeting objects that form the basis of a scheduling system, with gridviews used to display the important information. This is for the purpose of scheduling employees to meetings, and for employees to view what has been scheduled.
I have been trying to follow DDD principles, but I'm having difficulty knowing what to pass from my service layer down to the presentation area of the system. This is because the schedule can be LARGE, and actually consists of many different elements of the system, e.g. Client Name, Address, Case Info, Group, etc., all of which are needed for the meeting scheduler to make a decision.
In addition to this, the scheduler needs to change values within this schedule and pass it back up to the service layer (e.g. assign employees from dropdowns, maybe change group, etc.). So the information isn't really "readonly" - it needs to be interacted with, i.e. it's not just a report.
Our current approach is to populate a flattened "Schedule Object" from SQL, which is constructed from small parts of different domain objects. It's quite a complex query. When changes have been made, this is then passed back up to the service layer, and the service will retrieve the domain objects in question, and fire business methods on the domain objects using information from the DTOs.
My question is, is this the correct approach? ie. Continue to generate large custom objects from SQL, and then pass down from Service Layer to Presentation Layer objects that feel a lot like View Models?
UPDATE due to an answer
To give an idea of the entity/aggregate relationships involved (this is an obfuscated example, so the relationships are the important things here):
Client is in one default group
Client has one open case but many closed
Cases have many Meetings
Meetings have many assigned Employees
Meetings have many reasons
Meetings can get scheduled to different groups
Employees can be associated with many groups.
The schedule needs to load all meetings in open cases that belong to clients who are in the same groups as the employee.
Scheduler can see Client Name, Client Address, Case Info, MeetingTime, MeetingType, MeetingReasons, scheduledGroup(s) (showstrail), Assigned Employees (also has hidden employee ids).
Editable fields are assign employee dropdowns and scheduled group.
Schedule may be up to two hundred rows.
The DTO is coming down from WCF, so the domain model is accessed above this service layer, and not below.
Domain model business calls are leveraged by the service based on the DTO values passed back, and repositories deal with inserts/updates.
So, to update the question: is using a query to populate an object which contains all of the above acceptable to pass down as one merged DTO? And if not, how would you approach it? (Give some example calls to the service layer, and explain a little about how you conceive the ORM fetching the data, keeping performance in mind.)
In the service layer and below, I would treat each entity (see aggregate roots in DDD) separately with respect to its transactional boundary. I.e. even if you could update a client and a case in the same UI view, it would be best to transactionally modify the client and then modify the case. The more you try to modify in one transaction, the more you can conflict with other users.
Although your schedule is large and can contain lots of objects, the service layer should again deal with each entity (aggregate root) separately and then bundle them together into a new view model. Sadly, on brown-field projects, a lot of logic might be in the SQL, and the massive multi-table joins might make this harder to refactor into more atomic queries that do exactly what is needed. The old-school data-centric view of 'do everything you can in the database' goes against everything DDD stands for.
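As an illustration of that "query each root separately, then bundle into a view model" idea (all DTOs and query interfaces here are assumptions, not your actual types):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record MeetingDto(Guid Id, Guid ClientId, string CaseInfo, DateTime Time,
                         IReadOnlyList<Guid> EmployeeIds);
public record ClientDto(Guid Id, string Name, string Address);
public record ScheduleRow(Guid MeetingId, string ClientName, string ClientAddress,
                          string CaseInfo, DateTime MeetingTime,
                          IReadOnlyList<Guid> EmployeeIds);

public interface IMeetingQueries
{
    IReadOnlyList<MeetingDto> OpenMeetingsForEmployeeGroups(Guid employeeId);
}
public interface IClientQueries
{
    IReadOnlyDictionary<Guid, ClientDto> ByIds(IEnumerable<Guid> ids);
}

// Queries each source separately and flattens the results into grid rows,
// instead of one massive multi-table SQL join.
public class ScheduleQueryService
{
    private readonly IMeetingQueries _meetings;
    private readonly IClientQueries _clients;

    public ScheduleQueryService(IMeetingQueries meetings, IClientQueries clients)
    {
        _meetings = meetings;
        _clients = clients;
    }

    public IReadOnlyList<ScheduleRow> LoadScheduleFor(Guid employeeId)
    {
        var meetings = _meetings.OpenMeetingsForEmployeeGroups(employeeId);
        var clients = _clients.ByIds(meetings.Select(m => m.ClientId).Distinct());

        // One row per meeting, flattened for the grid.
        return meetings.Select(m => new ScheduleRow(
            m.Id, clients[m.ClientId].Name, clients[m.ClientId].Address,
            m.CaseInfo, m.Time, m.EmployeeIds)).ToList();
    }
}
```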
Because DDD is a collection of design ideas and patterns, and not particularly a methodology or an architecture, it sounds like it might be too late to try to shoe-horn your current application into a DDD application-centric design. It sounds as though your current app is very entrenched in the data-centric view.
If everything is currently being passed up through the layers in one monolithic chunk, it might be best to stick with this style and just expose these monolithic chunks to the people in the other team who wish to consume them, for use in their new app. You might be able to put some sort of view model caching in place (a bit like the caching view model element in CQRS).
In my personal opinion, data-centric, normalised data apps have had their day (they made sense in the 1970s when hard disk space was expensive) and all apps should be moving toward more modern practices. In reality, only when legacy systems are crawling on their knees, will stakeholders usually put up the cash to look for alternatives (usually after stuffing every last server with RAM). It might be possible or best to convince them to refactor small sections at a time.
The central tenet to the software I am building is the "workorder"
WorkOrder as I see it would be an "aggregate root" that contains basic information about the work order such as creation date, model/manufacturer, serial number, purchase order.
In addition to these "value" objects, there are also sub "entities" or "aggregates" such as:
Sequences
Reworks
Dimensions
QuoteItems
Consumables
None of the above can/should exist without an associated work order. In the existing system they actually occasionally do but that is because of lack of transactions or checks in code to ensure integrity. They are orphaned records and deleted via scheduled clean up - one of the many reasons I am learning more about DDD and ORM to bring our development practices up to speed.
NOTE: This is probably off topic and can likely be skipped in your reply.
Because ours is primarily a web-based interface using extJS, with list controls displaying each of the above, I have been reluctant to switch to ORM and DDD. Each list is populated via a controller:action that queries the DB (i.e. the sequences list is populated when the JS control calls a sequence REST URI with a GET command). This GET command invokes a controller that instantiates a Sequence object and calls the selectAllForWorkorderID method.
My understanding of ORM is that I would use a repository to query these items. Fine; however, if this Sequence object (in DDD parlance) is considered an aggregate of the WorkOrder root, then I must find the work order first and traverse the sequences through the WorkOrder.
In an AJAX web-based context this feels funny to me, but in a desktop environment, or even a standard web-based context, it is acceptable, as I would only query the WorkOrder object once each time a WorkOrder item is selected in the master list - not 6 or 8 times for each individual list to be populated.
I can see now that our system actually has several aggregate root objects; work order is just the most complicated of the few:
WorkOrder
Warranties
Repair Orders
These are the primary roots. Warranties are dependent on work order IDs, and Repair Orders can be, but not always.
Ignoring the latter roots, allow me to focus solely on WorkOrder.
When I begin to examine the existing models and try to determine what is business logic and/or application logic, I am slightly confused about what goes into a "service" versus an "aggregate root".
Consider one such method in the current model:
createWorkOrderFromRpi.
RPIs are approved documents that act as templates for WorkOrders - they dictate which sequences "can" be performed and their order of execution, the dimensions, the list of consumables, etc. This is a separate system altogether, and I believe it would best be described as a "module" in DDD nomenclature.
This method has to query the RPI system and obtain the work order header details, sequence list, consumables, etc.
Once it has this data it calls the associated objects and methods:
WorkOrder.Create(Header Details)
Sequence.Create(Sequence Details) - Done in loop (1:m)
Consumable.Create(Consumable Details) - Done in loop (1:m)
Following DDD, I am tempted to have the WorkOrder "aggregate root" provide a method with an identical signature; however, I am reluctant to do so.
I believe each of the "entities" that are aggregates of WorkOrder fits the description and should never be exposed to anything outside of the "root" unless traversed through the root itself. There may be exceptions, but on second thought, the interface only ever exposes consumables and sequences and such when a work order is selected, which would imply a work order must be loaded anyway?!?
There are some essential business rules which this method must perform:
A work order with an identical serial number must not already be active in the system (unless archived) - unless it's on sub-contract, in which case do not create a new work order but receive a repair order for this work order instead.
There are a few more "rules" but I will exclude them for the sake of brevity.
The individual entities perform micro business validations, for example some fields, such as serial number, have a specific format, as do part numbers and purchase order numbers.
My primary question or concern, is given the above description, would this method best be implemented in an "aggregate root" or "service"?
UPDATE | One final question: if aggregate root is the proper concept, and I need access to the sequences so that I may update a field, I would access it conceptually (ignore syntax) like:
WorkOrder.Sequences(0).moveToNext()
This method would be implemented on the Sequence "entity", which makes sense. But where does the division between technical details and business logic lie? For example, to move a work order from one sequence to the next, we update three timestamps per sequence:
date_entered
date_started
date_finished
When the last timestamp is set, the next sequence's date_entered is set to the same time as the previous sequence's date_finished, and the system knows this is now the active sequence. That's a technical matter.
But a business rule or constraint would be:
Don't move work order if moved into history
Don't move work order if in rework
Don't move work order if in subcon
These are rules which I would love to keep separated and distinct, so as to make it easy for me to translate them into English in the form of a specs document which I could present to management as a living document and proof of functionality. I was kind of hoping that is what DDD would enforce/promote in a clean manner. Is this a requirement handled independently of DDD? Is this where CQS comes in - separating business rules from technical matters which are of zero relevance to stakeholders?
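To make the division I'm hoping for concrete, here is roughly how I picture it (names illustrative, not our real code): the named guards at the top read like the spec document, and the timestamp bookkeeping below them is the technical part.

```csharp
using System;
using System.Collections.Generic;

public class WorkOrder
{
    public bool IsInHistory { get; private set; }
    public bool IsInRework { get; private set; }
    public bool IsInSubcon { get; private set; }
    public List<Sequence> Sequences { get; } = new();
    private int _active; // index of the active sequence (assumes at least one)

    public void MoveToNextSequence(DateTime now)
    {
        // Business rules (the part management cares about):
        if (IsInHistory) throw new InvalidOperationException("Work order is in history.");
        if (IsInRework) throw new InvalidOperationException("Work order is in rework.");
        if (IsInSubcon) throw new InvalidOperationException("Work order is on sub-contract.");

        // Technical mechanics (timestamp bookkeeping):
        var current = Sequences[_active];
        current.Finish(now);
        if (_active + 1 < Sequences.Count)
        {
            _active++;
            Sequences[_active].Enter(now); // next sequence becomes active
        }
    }
}

public class Sequence
{
    public DateTime? DateEntered { get; private set; }
    public DateTime? DateStarted { get; private set; }
    public DateTime? DateFinished { get; private set; }

    public void Enter(DateTime now) => DateEntered = now;
    public void Start(DateTime now) => DateStarted = now;
    public void Finish(DateTime now) => DateFinished = now;
}
```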
Alex
I think your createWorkOrderFromRpi() method should be on a "Service" rather than the WorkOrder aggregate root (a sketch follows the list below). This service method would then call methods on your Repositories or DAOs to create the work order. An Aggregate Root typically combines entities, but in your model I think RPI is a template or specification outside of the work order aggregate root. If RPI is part of the aggregate, then you should put the method on the repository directly and call it wherever needed, as a repository is a business object in DDD also.
On the second question, I believe a WorkOrder Aggregate Root is totally correct for the other "dependent" entities you listed, namely:
Sequences
Reworks
Dimensions
QuoteItems
Consumables
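To sketch the first suggestion (all interfaces and names invented): the service coordinates the RPI module and the aggregate, and the duplicate-serial rule lives in the service because it spans more than one work order:

```csharp
using System;
using System.Collections.Generic;

public interface IRpiModule { RpiTemplate GetTemplate(string rpiNumber); }
public record RpiTemplate(string Header,
                          IReadOnlyList<string> Sequences,
                          IReadOnlyList<string> Consumables);

public interface IWorkOrderRepository
{
    bool ActiveExistsWithSerial(string serialNumber);
    void Add(WorkOrder workOrder);
}

public class WorkOrder
{
    // Stubbed aggregate: the real one holds header details, sequences, etc.
    public static WorkOrder Create(string header, string serialNumber) => new();
    public void AddSequence(string sequence) { /* ... */ }
    public void AddConsumable(string consumable) { /* ... */ }
}

public class WorkOrderService
{
    private readonly IRpiModule _rpi;
    private readonly IWorkOrderRepository _workOrders;

    public WorkOrderService(IRpiModule rpi, IWorkOrderRepository workOrders)
    {
        _rpi = rpi;
        _workOrders = workOrders;
    }

    public WorkOrder CreateWorkOrderFromRpi(string rpiNumber, string serialNumber)
    {
        // A rule spanning all work orders, so it sits in the service.
        if (_workOrders.ActiveExistsWithSerial(serialNumber))
            throw new InvalidOperationException(
                "An active work order with this serial number already exists.");

        var template = _rpi.GetTemplate(rpiNumber);
        var workOrder = WorkOrder.Create(template.Header, serialNumber);
        foreach (var seq in template.Sequences) workOrder.AddSequence(seq);
        foreach (var con in template.Consumables) workOrder.AddConsumable(con);

        _workOrders.Add(workOrder);
        return workOrder;
    }
}
```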
I'm interested to know how you implemented this.
Background
Udi Dahan suggests a fetching strategy as a useful pattern to use for data access. I agree.
The concept is to make roles explicit. For example, I have an Aggregate Root - Customer. I want Customer in several parts of my application - a list of customers to select from, a view of the customer's details, and a button to deactivate a customer.
It seems Udi would suggest an interface for each of these roles. So I have ICustomerInList with very basic details, ICustomerDetail which includes the latest 10 products purchased, and IDeactivateCustomer which has a method to deactivate the customer. Each interface exposes just enough of my Customer Aggregate Root to get the job done in each situation. My Customer Aggregate Root implements all these interfaces.
Now I want to implement a fetching strategy for each of these roles. Each strategy can load a different amount of data into my Aggregate Root because it will be behind an interface exposing only the bits of information needed.
The general method to implement this part is to ask a Service Locator or some other style of dependency injection. This code takes the interface you want, for example ICustomerInList, and finds a fetching strategy to load it (IStrategyForFetching&lt;ICustomerInList&gt;). The strategy is implemented by a class that knows to load a Customer with only the bits of information needed for the ICustomerInList interface.
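In code, the roles and the strategy abstraction look roughly like this (my own sketch of the pattern described above):

```csharp
using System;
using System.Collections.Generic;

// Each role is a narrow interface over the same aggregate.
public interface ICustomerInList { Guid Id { get; } string Name { get; } }
public interface ICustomerDetail : ICustomerInList
{
    IReadOnlyList<string> LastTenProducts { get; }
}
public interface IDeactivateCustomer { void Deactivate(); }

// One strategy per role knows how much data to load.
public interface IStrategyForFetching<TRole>
{
    TRole Fetch(Guid id); // loads only what TRole needs
}

// Resolved from a container/service locator by role, e.g.:
// var strategy = locator.Get<IStrategyForFetching<ICustomerInList>>();
// var customer = strategy.Fetch(customerId);
```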
So far so good.
Question
What do you pass to the Service Locator, or to the IStrategyForFetching&lt;ICustomerInList&gt;? All of the examples I see select only one object by a known id. That case is easy: the calling code passes the id through and gets back the specific interface.
What if I want to search? Or I want page 2 of the list of customers? Now I want to pass in more terms that the Fetching Strategy needs.
Possible solutions
Some of the examples I've seen use a predicate - an expression that returns true or false if a particular Aggregate Root should be part of the result set. This works fine for conditions, but what about getting back the first n customers and no more? Or getting page 2 of the search results? Or specifying how the results are sorted?
My first reaction is to start adding generic parameters to my IStrategyForFetching&lt;ICustomerInList&gt;. It then becomes IStrategyForFetching&lt;TAggregateRoot, TStrategyForSelecting, TStrategyForOrdering&gt;. This quickly becomes complex and ugly. It's further complicated by different repositories: some repositories only supply data when using a particular strategy for selecting, some only support certain types of ordering. I would like the flexibility to implement general repositories that can take sorting functions, along with specialised repositories that only return Aggregate Roots sorted in a particular fashion.
It sounds like I should apply the same pattern used at the start: how do I make roles explicit? Should I implement a strategy for fetching X (Aggregate Root) using the payload Y (search/ordering parameters)?
Edit (2012-03-05)
This is all still valid if I'm not returning the Aggregate Root each time. If each interface is implemented by a different DTO I can still use IStrategyForFetching. This is why this pattern is powerful - what does the fetching and what is returned doesn't have to map in any way to the aggregate root.
I've ended up using IStrategyForFetching&lt;TEntity, TSpecification&gt;. TEntity is the thing I want to get; TSpecification is how I want to get it.
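Sketched out (the specification's fields are just an example):

```csharp
using System;
using System.Collections.Generic;

// The specification carries the search, paging and sorting terms, so the
// strategy is no longer tied to a single known id.
public interface IStrategyForFetching<TEntity, TSpecification>
{
    IReadOnlyList<TEntity> Fetch(TSpecification spec);
}

public record CustomerListSpecification(
    string? NameContains, // null = no name filter
    int Page,
    int PageSize,
    string SortBy);
```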
Have you come across CQRS? Udi is a big proponent of it, and its purpose is to solve this exact issue.
The concept in its most basic form is to separate the domain model from querying. This means that the domain model only comes into play when you want to execute a command / commit a transaction. You don't use data from your aggregates & entities to display information on the screen. Instead, you create a separate data access service (or bunch of them) that contain methods that provide the exact data required for each screen. These methods can accept criteria objects as parameters and therefore do searching with whatever criteria you desire.
A quick sequence of how this works (sketched in code after the list):
A screen shows a list of customers that have made orders in the last week.
The UI calls the CustomerQueryService passing a date as criteria.
The CustomerQueryService executes a query that returns only the fields required for this screen, including the aggregate id of each customer.
The user chooses a customer in the list, and chooses to perform the 'Make Important Customer' action/command.
The UI sends a MakeImportantCommand to the Command Service (or Application Service in DDD terms) containing the ID of the customer.
The command service fetches the Customer aggregate from the repository using the ID passed in the command, calls the necessary methods and updates the database.
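A rough sketch of both sides of that sequence, with invented names:

```csharp
using System;
using System.Collections.Generic;

// Read side: a query service returns exactly the fields the screen needs,
// including each customer's aggregate id.
public record CustomerListItem(Guid CustomerId, string Name, DateTime LastOrder);

public interface ICustomerQueryService
{
    IReadOnlyList<CustomerListItem> CustomersWithOrdersSince(DateTime cutoff);
}

// Write side: the command only carries the aggregate id.
public record MakeImportantCommand(Guid CustomerId);

public interface ICustomerRepository { Customer Get(Guid id); void Save(Customer c); }
public class Customer { public void MarkImportant() { /* domain logic */ } }

public class CustomerCommandService
{
    private readonly ICustomerRepository _repository;

    public CustomerCommandService(ICustomerRepository repository) =>
        _repository = repository;

    public void Handle(MakeImportantCommand cmd)
    {
        var customer = _repository.Get(cmd.CustomerId);
        customer.MarkImportant();   // domain behaviour on the aggregate
        _repository.Save(customer);
    }
}
```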
Building your app using the CQRS architecture opens you up to a lot of possibilities regarding performance and scalability. You can take this simple example further by creating separate query databases that contain denormalised tables for every view, eventual consistency & event sourcing. There are lots of videos/examples/blogs about CQRS that I think would really interest you.
I know your question was regarding 'fetching strategy', but I notice that he wrote that article in 2007, and it's likely that he considers CQRS its successor.
To summarise my answer:
Don't try to project cut-down DTOs from your domain aggregates. Instead, just create separate query services that give you a tailored query for your needs.
Read up on CQRS (if you haven't already).
To add to the response by David Masters, I think all the fetching strategy interfaces add needless complexity. Having the Customer AR implement various interfaces which are modeled after a UI is a needless constraint on the AR class, and you will spend far too much effort trying to enforce it. Moreover, it is a brittle solution. What if a view requires data that, while related to Customer, does not belong on the Customer class? Does one then coerce the Customer class and the corresponding ORM mappings to contain that data? Why not just have a separate set of classes for query purposes and be done with it? This allows you to deal with fetching strategies at the place where they belong - in the repository. Furthermore, what value does the fetching strategy interface abstraction really add? It may be an appropriate model of what is happening in the application, but it doesn't help in implementing it.