Creating and refactoring identity fields of Entities

We are learning DDD and evaluating its use for a system backed by a legacy system & datastore.
We have been using Anti-Corruption Layer, Bubble Context and Bounded Context without knowing about them, and this approach seems very practical to us.
But we are not confident about identification methods and identity correlation.
The legacy data store has no primary keys; instead it uses composite unique indexes to identify records.
We are currently modeling as Value Objects things that should be Entities (we wish we could add a "Long id" field to every one of them), and only rarely do we use the combination of attributes from a unique index as the id field. It seems clear to us that Entity models should have id fields.
Here are my concrete questions:
We would like our shiny new Entities to have "Long id" fields, at least in theory. Is it OK not to add that field now, given that it brings us no value while the backend data store can neither fill nor understand it?
Is it acceptable, the DDD way, to store the identification information as-is and refactor it later, hopefully once the datastore changes to meet our needs?
If so, is abstracting identification away from Entities a good approach (I mean an Identifiable interface, or a key class per Entity)? Any good advice is welcome here.
This question may be out of the scope of DDD, but I wonder whether an identification refactoring can have impacts across several layers of the system (for example, a REST API changing from /entity/id_field/id_field/id_field to /entity/new_long_id).
And other questions:
How can we use the legacy information in our growing, polished domain model with less pain around identification?
Or is it misguided to wish for a Long id in our domain at any point in the project's life?
How can we refactor identity management?
Update:
Is identity management an important aspect of DDD, or is it an infrastructural concern that can be refactored later, meaning we should not spend more time on it?

Use whatever identifier fits your needs, but be realistic and upfront about the cost and implications of choosing the wrong one. There is merit in assigning identifiers from the outside and simply storing them alongside the other bits of information, regardless of format (GUID, long, UUID). Whether or not to refactor identity management is mostly a cost/benefit analysis: is it even an option with the legacy system, and for what kind of timeframe would the two keys live side by side? Also, why are you reusing the same datastore at all? Why not partition it so you can have parallel data stores (worst case, even syncing data between them, preferably in one direction)? Try some vertical slicing instead of horizontal. HTH.
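One practical middle ground for the Identifiable-interface idea raised in the question: hide the key behind a small per-Entity key class so a later swap stays cheap. A minimal Java sketch, where every name (Identifiable, CustomerKey, Customer) and the composite fields are invented for illustration:

interface Identifiable<ID> {
    ID id();
}

// Today the key wraps the legacy composite unique index; if the datastore
// later grows a Long surrogate key, only this class needs to change.
record CustomerKey(String branchCode, String accountNo) { }

class Customer implements Identifiable<CustomerKey> {
    private final CustomerKey id;
    private final String name;

    Customer(CustomerKey id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public CustomerKey id() { return id; }
}

Callers depend only on the key's equality, so replacing the composite internals with a Long later is a local change; the REST routes built from the composite fields would still need the migration the question anticipates.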

Related

How to reduce the higher time complexity brought by DDD

After reading the book Domain-Driven Design and some chapters of the book Implementing Domain-Driven Design, I am finally trying to use DDD in a small service of a microservice system, and I have some questions.
Here we have the entities Namespace, Group, and Resource. They are also aggregate roots:
As the diagram shows, users have many Namespaces; in every Namespace we have Groups, and in every Group we have Resources.
But I have a business logic:
The Group should have a unique name in its Namespace. (It is useful for the user to be able to find the Group by its name.)
To make that happen, I need to do these steps in the application layer to add a group, with time complexity O(n) (see the sketch after these steps):
Get the Namespace by its ID from the Repository of Namespace. It has a field Groups, of type []GroupID.
Get the []Group value for that []GroupID value from the Repository of Group.
Check whether the name of the new group is unique among the existing Groups we fetched.
If it is unique, use the Repository of Group to save it.
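A sketch of those four steps as an application-service method, just to make the O(n) cost concrete; every type and method name here is invented:

import java.util.List;

interface NamespaceRepository { Namespace findById(String namespaceId); }

interface GroupRepository {
    List<Group> findByIds(List<String> groupIds);
    void save(Group group);
}

record Namespace(String id, List<String> groupIds) { }
record Group(String id, String namespaceId, String name) { }

class GroupService {
    private final NamespaceRepository namespaces;
    private final GroupRepository groups;

    GroupService(NamespaceRepository namespaces, GroupRepository groups) {
        this.namespaces = namespaces;
        this.groups = groups;
    }

    void addGroup(String namespaceId, String newGroupId, String name) {
        Namespace ns = namespaces.findById(namespaceId);          // step 1
        List<Group> existing = groups.findByIds(ns.groupIds());   // step 2: loads all n groups
        boolean taken = existing.stream()
                .anyMatch(g -> g.name().equals(name));            // step 3: O(n) scan
        if (taken) throw new IllegalStateException("group name already used in this namespace");
        groups.save(new Group(newGroupId, namespaceId, name));    // step 4
    }
}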
But I think that if I just used a simple transaction script, I could finish this in O(lg n), because I know I can make the Group name field unique in the database. How can I do that in DDD?
My thinking is:
I could add a comment on the save method of the Group Repository interface, letting callers know that save will check whether the name is unique within the same Namespace.
Or should we use CQRS to check whether the name of a Group is unique? Another question: a Namespace may have a lot of Groups. Even though we only keep the IDs of the Groups in the Namespace entity, that still costs a lot of space. How do we paginate the data? And if we only want to get the name of a Namespace by its ID, why do we need to load all those Group IDs?
I do not want DDD to limit me, but I still want to know what the best practices are. Until I understand what is going on, I try to avoid breaking the rules.
My solution:
Thanks for the answer by @voiceofunreason. I still find it hard to write code for set validation in the domain layer.
@voiceofunreason tells me that I need to consider the real world. I do consider it, and I am still confused about how to implement the rule without breaking DDD rules. (Sorry, but my question is not whether we need the condition; my question is HOW to make the condition (the domain logic) hold without higher time complexity.)
To be honest, I only have a MongoDB instance storing all the data. If I use a Transaction Script, everything is easy (see the sketch after these two steps):
Create a unique index on the name of Group to make sure the names are unique.
Just insert the new Group; if the database raises an error, refuse the user's request.
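For concreteness, a minimal sketch of that transaction script with the MongoDB Java driver. The field names are invented, and the unique index is made compound over namespaceId and name, since the actual rule is uniqueness per Namespace:

import com.mongodb.ErrorCategory;
import com.mongodb.MongoWriteException;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

class GroupStore {
    private final MongoCollection<Document> groups;

    GroupStore(MongoCollection<Document> groups) {
        this.groups = groups;
        // One-time setup: a group name must be unique within its namespace.
        groups.createIndex(
                Indexes.compoundIndex(Indexes.ascending("namespaceId"), Indexes.ascending("name")),
                new IndexOptions().unique(true));
    }

    // Returns false when the name is already taken in that namespace.
    boolean insert(String namespaceId, String name) {
        try {
            groups.insertOne(new Document("namespaceId", namespaceId).append("name", name));
            return true;
        } catch (MongoWriteException e) {
            if (e.getError().getCategory() == ErrorCategory.DUPLICATE_KEY) {
                return false; // refuse the request
            }
            throw e;
        }
    }
}

The database does the set validation in O(lg n) via the index; the application merely translates the duplicate-key error into a refusal.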
But if I want to follow DDD and put the logic into the domain layer, I do not even know where to put it (it is easy in Transaction Script, right?). It really makes me feel blue. So my solution is:
Use DDD to split the whole project into many bounded contexts.
And within a bounded context we do not care whether we use DDD or something else. So tired I am.
In this bounded context, we just use Transaction Script.
Is DDD simply not good at holding a condition over a set of entities? DDD always wants to get all the data from the database rather than doing the work inside the database, and sometimes that makes the time complexity higher; I still do not know how to avoid it. Maybe I am wrong. If I am, please comment or post a new answer, thanks a lot.
The Group should have a unique name in its Namespace.
The general term for this problem is set validation. We have some collection of items, and we want to ensure that some condition holds over the entire set....
What is the business impact of having a failure? This is the key question we need to ask, and it will drive our solution in how to handle this issue, as we have many choices of varying degrees of difficulty. -- Greg Young, 2010
Some questions to consider include: is this a real constraint of the domain, or just an attempt at proofreading? Are we the authority for this data, or are we just storing a local copy of data that belongs to someone else? When we have conflicting information, can the computer determine whether the older or the newer entry is in error? Does the business currently have a remediation process to use when the set condition doesn't hold? Can the business tolerate a conflict for some period of time (until end of day? minutes? nanoseconds?)?
(In thinking about this last question, you may want to review Race Conditions Don't Exist, by Udi Dahan).
If the business requirement really is "we must never write conflicting entries into the collection", then any change you make must lock the collection against any potential conflicts. And this in turn has implications about, for example, how you can store the collection (trying to enforce a condition on a distributed collection is an expensive problem to have).
For the case where you can say "it makes sense to throw all of this data into a single relational database", you might consider letting the domain model make only a best effort to avoid conflicts, and then reinforcing that with a real constraint in the data model.
You don't get bonus points for doing it the hard way.

How to handle hard aggregate-wide constraints in DDD/CQRS?

I'm new to DDD and I'm trying to model and implement a simple CRM system based on DDD, CQRS, and event sourcing to get a feel for the paradigm. I have, however, run into some difficulties that I'm not sure how to handle. I don't know whether they stem from me not having modeled the domain properly or from me missing something else.
For a basic illustration of my problems, consider that my CRM system has the aggregate CustomerAggregate (which seems reasonable to me). The purpose of this aggregate is to make sure each customer is consistent and that its invariants hold (the name is required, the social security number must be in the correct format, etc.). So far, all is well.
When the system receives a command to create a new customer, however, it needs to make sure that the social security number of the new customer doesn't already exist (i.e. it must be unique across the system). This is, of course, not an invariant that can be enforced by the CustomerAggregate, since customers don't have any information about other customers.
One suggestion I've seen is to handle this kind of constraint in its own aggregate, e.g. SocialSecurityNumberUniqueAggregate. If the social security number is not already registered in the system, the SocialSecurityNumberUniqueAggregate publishes an event (e.g. SocialSecurityNumberOfNewCustomerWasUniqueEvent) which the CustomerAggregate subscribes to, publishing its own event in response (e.g. CustomerCreatedEvent). Does this make sense? How would the CustomerAggregate respond to, for example, a missing name or another hard-constraint violation when responding to the SocialSecurityNumberOfNewCustomerWasUniqueEvent?
The search term you are looking for is set-validation.
Relational databases are really good at domain agnostic set validation, if you can fit the entire set into a single database.
But, that comes with a cost; designing your model that way restricts your options on what sorts of data storage you can use as your book of record, and it splits your "domain logic" into two different pieces.
Another common choice is to ignore the conflicts when you are running your domain logic (after all, what is the business value of this constraint?) but to instead monitor the persisted data looking for potential conflicts and escalate to a human being if there seems to be a problem.
You can combine the two (ex: check for possible duplicates via query when running the domain logic, and monitor the results later to mitigate against data races).
But if you need to maintain an invariant over a set, and you need that to be part of your write model (rather than separated out into your persistence layer), then you need to lock the entire set when making changes.
That could mean having a "registry of SSN assignments" that is an aggregate unto itself, and you have to start thinking about how much other customer data needs to be part of this aggregate, vs how much lives in a different aggregate accessible via a common identifier, with all of the possible complications that arise when your data set is controlled via different locks.
There's no rule that says all of the customer data needs to belong to a single "aggregate"; see Mauro Servienti's talk All Our Aggregates are Wrong. Trade offs abound.
One thing you want to be very cautious about in your modeling, is the risk of confusing data entry validation with domain logic. Unless you are writing domain models for the Social Security Administration, SSN assignments are not under your control. What your model has is a cached copy, and in this case potentially a corrupted copy.
Consider, for example, a data set that claims:
000-00-0000 is assigned to Alice
000-00-0000 is assigned to Bob
Clearly there's a conflict: both of those claims can't be true if the social security administration is maintaining unique assignments. But all else being equal, you can't tell which of these claims is correct. In particular, the suggestion that "the claim you happened to write down first must be the correct one" doesn't have a lot of logical support.
In cases like these, it often makes sense to hold off on an automated judgment, and instead kick the problem to a human being to deal with.
Although they are mechanically similar in a lot of ways, there are important differences between "the set of our identifier assignments should have no conflicts" and "the set of known third party identifier assignments should have no conflicts".
Do you also need to verify that the social security number (SSN) is really valid? Or are you just interested in verifying that no other customer aggregate with the same SSN can be created in your CRM system?
If the latter is the case, I would suggest having a CustomerService domain service which performs the whole SSN check by looking it up in the database (e.g. via a repository) and then creates the new customer aggregate (which again checks its own invariants, as you already mentioned). This whole process, the lookup of the existing SSN plus the customer creation, needs to happen within one transaction to ensure consistency. As I consider this domain logic, a domain service is the perfect place for it: it does not hold data by itself, but it orchestrates the workflow that implements the business requirement that no two customers with the same SSN may be created in our CRM.
If you also need to verify that the social security number is real, you would need to call another service, I guess, or keep some cached SSN data in your CRM. In that case you could additionally have a SocialSecurityNumberService domain service which is injected into the CustomerService. This would just be an interface in the domain layer; the implementation of the SocialSecurityNumberService interface would reside in the infrastructure layer, where the access to whatever resource is required is implemented (be it a local cache you build in the background or an API call to another service).
Either way, all your logic for creating the new customer would be in one place, the CustomerService domain service. Additional checks that go beyond the Customer aggregate's boundaries would also be placed in this CustomerService.
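A minimal sketch of such a CustomerService, assuming a repository with an SSN-existence query; all names are hypothetical and transaction demarcation is left to the infrastructure:

interface CustomerRepository {
    boolean existsBySocialSecurityNumber(String ssn);
    void add(Customer customer);
}

class Customer {
    private final String name;
    private final String ssn;

    private Customer(String name, String ssn) {
        if (name == null || name.isBlank()) throw new IllegalArgumentException("name is required");
        this.name = name;
        this.ssn = ssn;
    }

    // The aggregate still enforces its own invariants on creation.
    static Customer create(String name, String ssn) { return new Customer(name, ssn); }
}

class CustomerService {
    private final CustomerRepository customers;

    CustomerService(CustomerRepository customers) { this.customers = customers; }

    // Lookup and insert must run in one transaction to be consistent.
    Customer createCustomer(String name, String ssn) {
        if (customers.existsBySocialSecurityNumber(ssn)) {
            throw new IllegalStateException("a customer with this SSN already exists");
        }
        Customer customer = Customer.create(name, ssn);
        customers.add(customer);
        return customer;
    }
}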
Update
To also adhere to the nature of eventual consistency:
I guess that by going with event sourcing, you and your business have already accepted eventual consistency. This also means entries with the same SSN can happen. You could have a background job which continually checks for duplicate entries; depending on the complexity of your business logic, you might be able to correct the duplicates automatically, or you might need human intervention. It really depends on how often this can actually happen.
If the hard constraint is that this must NEVER happen, maybe event sourcing is not the right way, at least for this part of your system...
Note: I also assume that command de-duplication is not the issue here, but that you really have to deal with potentially different commands using the same SSN.

What is an Aggregate Root?

No, this is not a duplicate question.
I have read many sources on the subject, but I still feel like I don't fully understand it.
This is the information I have so far (from multiple sources: articles, videos, etc.) about what an Aggregate and an Aggregate Root are:
An Aggregate is a collection of Value Object/Entity references and rules.
An Aggregate is always a command model (meant to change business state).
An Aggregate represents a single unit of (database, since essentially the changes will be persisted) work, meaning it has to be consistent.
The Aggregate Root is the interface to the external world.
An Aggregate Root must have a globally unique identifier within the system.
DDD suggests having a Repository per Aggregate Root.
A simple object in an Aggregate can't be changed without its AR (Aggregate Root) knowing it.
So with all that in mind, lets get to the part where I get confused:
On this site it says:
The Aggregate Root is the interface to the external world. All interaction with an Aggregate is via the Aggregate Root. As such, an Aggregate Root MUST have a globally unique identifier within the system. Other Entites that are present in the Aggregate but are not Aggregate Roots require only a locally unique identifier, that is, an Id that is unique within the Aggregate.
But then, in this example I can see an Aggregate Root implemented as a static class called Transfer that acts as the Aggregate, with a static function inside called TransferedRegistered that acts as the AR.
So the questions are:
How can the function be an AR if an AR must have a globally unique identifier and the function has none, it being a function? What does have a globally unique identifier is the Domain Event that this function produces.
Following question: what does an Aggregate Root look like in code? Is it the event? Is it the entity that is returned? Is it the function of the Aggregate class itself?
In the case that the Domain Event the function returns is the AR (as stated, it has to have that globally unique identifier), how can we interact with this Aggregate? The first article clearly stated that all interaction with an Aggregate is via the AR; if the AR is an event, then we can do nothing but react to it.
Is it right to say that the aggregate has two main jobs:
Apply the needed changes based on the input it received and the rules it knows.
Return the data that needs to be persisted from the AR and/or raised in a Domain Event from the AR.
Please correct me on any of the bullet points at the beginning if some or all of them are wrong in some way or another, and feel free to add more if I have missed any!
Thanks for clarifying things!
I feel like I don't fully understand it.
That's not your fault. The literature sucks.
As best I can tell, the core ideas of implementing solutions using domain-driven design came out of the world of Java circa 2003. So the patterns described by Evans in chapters five and six of the Blue Book were understood to be object-oriented (in the Java sense) domain modeling done right.
Chapter six, which discusses the aggregate pattern, is specifically about life-cycle management: how do you create new entities in the domain model, how does the application find the right entity to interact with, and so on.
And so we have Factories, that allow you to create instances of domain entities, and Repositories, that provide an abstraction for retrieving a reference to a domain entity.
But there's a third riddle, which is this: what happens when you have some rule in your domain that requires synchronization between two entities in the domain? If you allow applications to talk to the entities in an uncoordinated fashion, then you may end up with inconsistencies in the data.
So the aggregate pattern is an answer to that; we organize the coordinated entities into graphs. With respect to change (and storage), the graph of entities becomes a single unit that the application is allowed to interact with.
The notion of the aggregate root is that the interface between the application and the graph should be one of the members of the graph. So the application shares information with the root entity, and then the root entity shares that information with the other members of the aggregate.
The aggregate root, being the entry point into the aggregate, plays the role of a coarse grained lock, ensuring that all of the changes to the aggregate members happen together.
It's not entirely wrong to think of this as a form of encapsulation -- to the application, the aggregate looks like a single entity (the root), with the rest of the complexity of the aggregate being hidden from view.
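A bare-bones sketch of that idea with invented Order/OrderLine names: the root carries the global identifier, members carry only local ones, and every change goes through the root:

import java.util.ArrayList;
import java.util.List;

class OrderLine {
    final int lineNo;       // locally unique, within this Order only
    final String product;
    int quantity;

    OrderLine(int lineNo, String product, int quantity) {
        this.lineNo = lineNo;
        this.product = product;
        this.quantity = quantity;
    }
}

class Order {
    private final String orderId; // globally unique identifier of the root
    private final List<OrderLine> lines = new ArrayList<>();

    Order(String orderId) { this.orderId = orderId; }

    String id() { return orderId; }

    void addLine(String product, int quantity) {
        lines.add(new OrderLine(lines.size() + 1, product, quantity));
    }

    // The root mediates every change, so it can enforce graph-wide rules
    // and act as the coarse-grained lock described above.
    void changeQuantity(int lineNo, int quantity) {
        if (quantity < 0) throw new IllegalArgumentException("quantity must be >= 0");
        lines.stream()
                .filter(l -> l.lineNo == lineNo)
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("no such line"))
                .quantity = quantity;
    }
}

The application never touches an OrderLine directly; to the outside, the aggregate looks like the single entity Order.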
Now, over the past 15 years, there's been some semantic drift; people trying to adapt the pattern in ways that it better fits their problems, or better fits their preferred designs. So you have to exercise some care in designing how to translate the labels that they are using.
In simple terms, an aggregate root (AR) is an entity that has a life cycle of its own. To me this is the most important point. One AR cannot contain another AR, but it can reference one by ID or by some value object (VO) containing at least the ID of the referenced AR. I tend to prefer to have an AR contain only other VOs instead of entities (YMMV). To this end, the AR is responsible for consistency and invariants w.r.t. itself. Each VO can have its own invariants, such as an EMailAddress requiring a valid e-mail format. Even if one were to call the contained classes entities, I would call that semantics, since one could get the same thing done with a VO. A repository is responsible for AR persistence.
The example implementation you linked to is not something I would do or recommend. I followed some of the comments, and I too, as one commenter alluded to, would rather use a domain service to perform something like a Transfer between two accounts. The registration of the transfer is not something that may necessarily be permitted and, as such, the domain service would be required to ensure the validity of the transfer. In fact, in my experience the registration of a transfer request would probably be a Journal in the accounting sense. Once the journal is approved, it may attempt the actual transfer.
At some point in my DDD journey I thought that there had to be something wrong, since it shouldn't be so difficult to understand aggregates. There are many opinions and interpretations w.r.t. DDD and aggregates, which is why it can get confusing. The other aspect, IMHO, is that there is a fair amount of design involved that requires some creativity and that is based on an understanding of the domain itself. Creativity cannot be taught, and design falls into the realm of tacit knowledge. The popular example of tacit knowledge is learning to ride a bike. We can read all we want about how to ride a bike, and it may or may not help much. Once we are on the bike and teach ourselves to balance, we can make progress. Then there are people who end up doing absolutely crazy things on a bike, and even if I read how to, I don't think I'll try :)
Keep practicing and modelling until it starts to make sense or until you feel comfortable with the model. If I recall correctly, Eric Evans mentions in the Blue Book that it may take a couple of designs to get the model closer to what we need.
Keep in mind that Mike Mogosanu is using an event-sourcing approach, but in any case (even without ES) his approach is very good for avoiding unwanted artifacts in mainstream OOP languages.
How can the function be an AR if an AR must have a globally unique identifier and the function has none, it being a function? What does have a globally unique identifier is the Domain Event that this function produces.
TransferNumber acts as a natural unique ID; there is also a GUID to avoid the need for a full Value Object in some cases.
There is no unique-ID state in the computer's memory because it is an argument, but think about it: why do you want a globally unique ID at all? It is just to locate the root element and its (non-uniquely identified) children for persistence purposes (to find, modify or delete it).
Order A has 2 order lines (1 and 2) while Order B has 4 order lines (1, 2, 3, 4); the unique identifier of an order line is the composition of its own ID and the Order ID: A1, B3, etc. It is just like relational schemas in relational databases.
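That composite scheme can be made explicit in code; a tiny sketch with invented names:

record OrderId(String value) { }

record OrderLineId(OrderId orderId, int lineNo) {
    // Globally unique only in combination, e.g. "A1", "B3".
    @Override
    public String toString() { return orderId.value() + lineNo; }
}

Here new OrderLineId(new OrderId("A"), 1) renders as A1, mirroring a composite key in a relational schema.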
So you need that ID just for persistence, and the element that goes to persistence is a domain event expressing the changes: all the changes needed to keep consistency. If you persist the domain event, using the globally unique ID to find in persistence what you have to modify, the system will be in a consistent state.
You could do
var newTransfer = new Transfer(transferNumber); // newTransfer is now an AR with a globally unique ID
var changes = newTransfer.RegisterTransfer(debit, credit);
persistence.ApplyChanges(changes);
but what is the point of instantiating an object to create state in the computer's memory if you are not going to do more than one thing with it? It is pointless, and most OOP detractors use this kind of bad OOP design to criticize OOP and lean towards functional programming.
Following question: what does an Aggregate Root look like in code? Is it the event? Is it the entity that is returned? Is it the function of the Aggregate class itself?
It is the function itself. You can read in the post:
AR is a role, and the function is the implementation.
An Aggregate represents a single unit of work, meaning it has to be consistent. You can see how the function honors this. It is a single unit of work that keeps the system in a consistent state.
In the case that the Domain Event the function returns is the AR (as stated, it has to have that globally unique identifier), how can we interact with this Aggregate? The first article clearly stated that all interaction with an Aggregate is via the AR; if the AR is an event, then we can do nothing but react to it.
Answered above because the domain event is not the AR.
4. Is it right to say that the aggregate has two main jobs: apply the needed changes based on the input it received and the rules it knows, and return the data that needs to be persisted from the AR and/or raised in a Domain Event from the AR?
Yes; again, you can see how the static function honors this.
You could try to contact Mike Mogosanu. I am sure he could explain his approach better than I can.

Correct design of aggregate roots

Somewhere far, far away in a domain galaxy there is mention of 'Measurement values' and 'Places'.
Each 'Measurement value' comes from/belongs to a specific 'Place'
Each 'Measurement value' is registered on a given date & time and of a given, specific type (eg. waterflow, wind, etc)
Each 'Place' has a name and a collection of 'Measurement values' that get registered.
Given my current model where 'Places' are the aggregate root that holds 'Measurement values' I have a dilemma:
Users wish to view one type of measurement value at a time, and there are quite a lot of measurement values.
Loading all measurement values when only some of them are needed seems unnecessary.
E.g. I'm stuck on how to organize/model the need "Show me waterflows (measurement values) in River X (Place) between time A and B".
Is it allowed to instantiate the River X aggregate root only partially loaded, with just the type of measurement values a given use case concerns?
Are there other ways of modelling measurement values and their origin?
Please let me know your thoughts...
I think that your aggregate is consistent as it is. Your dilemma has nothing to do with the domain model, but rather with the presentation model.
I would consider the possibility of storing each measurement in a NoSQL instance; this way your presentation layer could filter and run any query without affecting the consistency of the domain layer.
Correct me if I'm wrong, but it sounds very much like the data model and storage are driving the design of your system. If so, that may be the cause of your dilemma. A key benefit of modeling with aggregates is that the model is free of dependencies, such as databases and data models. There is no direct 'view' of an aggregate, so it's not shaped by the view. This makes aggregates much easier to design: they are much more focused on solving the problem, and are therefore great candidates for doing complex stuff.
If it turns out you don't need aggregates to model your domain, you can then just focus on an efficient storage and retrieval mechanism.
In other words...
Don't tie yourself up in knots doing DDD if you don't need to.
If it helps I created an infographic on common DDD mistakes. You may find it helpful. You can find it here.
By the way, I think DDD is a great way to go, but only if your domain warrants it. Apologies if I have misunderstood you.
I fail to see the real problem. You said that each Measurement is tied to a specific Place, so you don't have to load all Measurements.
Using the correct data-layer configuration, you can load the required Measurement by selecting/loading/instantiating only its parent (Place).
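One way to read these answers: answer the use case with a read-side query instead of hydrating the Place aggregate. A sketch with invented names:

import java.time.Instant;
import java.util.List;

// A flat read model shaped for "show me waterflows in River X between A and B".
record MeasurementView(String placeName, String type, Instant registeredAt, double value) { }

interface MeasurementQueries {
    List<MeasurementView> findByPlaceAndType(String placeName, String type, Instant from, Instant to);
}

The Place aggregate stays consistent for writes, while reads bypass it entirely.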

Can't help but see Domain entities as wasteful. Why?

I've got a question on my mind that has been stirring for months as I've read about DDD, patterns, and many other topics of application architecture. I'm going to frame this in terms of an MVC web application, but the question is, I'm sure, much broader. It is this: does adherence to domain entities create rigidity and inefficiency in an application?
The DDD approach makes complete sense for managing the business logic of an application and as a way of working with stakeholders. But to me it falls apart in the context of a multi-tiered application. Namely, there are very few scenarios where a view needs all the data of an entity, or where even two repositories hold all of it. In and of itself that's not bad, but it means I make multiple queries returning a bunch of properties I don't need in order to get the few that I do. Once that is done, the extraneous information either gets passed to the view, or there is the overhead of discarding, merging, and mapping data into a DTO or view model. I need to generate a lot of reports, and the problem seems magnified there: each requires a unique slicing or aggregating of information that SQL does well but repositories can't, as they're expected to return full entities. It seems wasteful, honestly, and I don't want to pound the database and generate unneeded network traffic on a matter of principle. From questions like Should the repository layer return data-transfer-objects (DTO)? it seems I'm not the only one to struggle with this question. So what's the answer to the limitations it seems to impose?
Thanks from a new and confounded DDD-er.  
What's the real problem here? Processing business rules and querying for data are two very different concerns. That realization leads us to CQRS: Command-Query Responsibility Segregation. What's that? You just don't use the same model for both tasks. The Domain Model is about behavior: performing business processes and handling commands. And there is a separate Reporting Model used for display. In general, it can contain a table per view. These tables contain only the relevant information, so you can get rid of DTOs, AutoMapper, etc.
How these two models synchronize? It can be done in many ways:
Reporting model can be built just on top of database views
Database replication
The domain model can issue events containing information about each change, and they can be handled by denormalizers updating the proper tables in the Reporting Model (a sketch follows).
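A bare-bones sketch of that third option, with an invented event and DAO, just to show the shape of a denormalizer:

record ProductRenamed(String productId, String newName) { }

interface ProductReportDao {
    void updateName(String productId, String newName);
}

class ProductReportDenormalizer {
    private final ProductReportDao reportTable;

    ProductReportDenormalizer(ProductReportDao reportTable) {
        this.reportTable = reportTable;
    }

    // Invoked by whatever event bus or subscription mechanism the system uses.
    void on(ProductRenamed event) {
        reportTable.updateName(event.productId(), event.newName());
    }
}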
as I've read about DDD, patterns and many other topics of application architecture
Domain-driven design is not about patterns and architecture but about designing your code according to the business domain. Instead of thinking about repositories and layers, think about the problem you are trying to solve. The simplest way to "start rehabilitation" would be to rename ProductRepository to just Products.
Does the adherence to domain entities create rigidity and inefficiency in an application?
Inefficiency comes from bad modeling. [citation needed]
The DDD approach makes complete sense for managing the business logic of an application and as a way of working with stakeholders. But to me it falls apart in the context of a multi-tiered application.
Tiers aren't layers
Namely there are very few scenarios when a view needs all the data of an entity or when even two repositories have it all. In and of itself that's not bad but it means I make multiple queries returning a bunch of properties I don't need to get a few that I do.
Query that data however you wish. Do not try to box your problems into ready-made solutions; instead, learn from them and apply only what's necessary to solve your problems.
Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities.
http://ayende.com/blog/3955/repository-is-the-new-singleton
So what's the answer to the limitations it seems to impose?
"seems"
By the way, the internet is full of things like this (I mean that sample app).
To understand what DDD is, read the Blue Book slowly and carefully. Twice.
If you think that fully fledged DDD is too much effort for your scenario then maybe you need to take a step down and look at something closer to Active Record.
I use DDD, but in my scenario I have to support multiple front ends: a couple of web sites and a WinForms app, as well as a set of services that allow interaction with other automated processes. In this case, the extra complexity is worth it. I use DTOs to transfer a representation of my data to the various presentation layers. The CPU overhead of mapping domain entities to DTOs is small, a rounding error compared to network calls and database calls. There is also the overhead of managing this complexity, which I have mitigated to some extent by using AutoMapper. My repositories return fully populated domain objects. My service layer maps to/from DTOs; here we can flatten out the domain objects, combine domain objects, etc. to produce a more tabulated representation of the data.
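A hand-rolled Java equivalent of that service-layer mapping step (the answer itself relies on AutoMapper on .NET); all names are invented:

interface Order {
    String customerName();
    int lineCount();
}

interface OrderRepository { Order findById(String orderId); }

// Only what the view needs, flattened from the domain object.
record OrderSummaryDto(String orderId, String customerName, int lineCount) { }

class OrderQueryService {
    private final OrderRepository orders; // returns fully populated domain objects

    OrderQueryService(OrderRepository orders) { this.orders = orders; }

    OrderSummaryDto summarize(String orderId) {
        Order order = orders.findById(orderId);
        return new OrderSummaryDto(orderId, order.customerName(), order.lineCount());
    }
}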
Dino Esposito wrote an MSDN Magazine article on this subject here - you may find this interesting.
So, I guess, to answer your "why" question: as usual, it depends on your context. DDD may be too much effort, in which case do something simpler.
Each requires a unique slicing or aggregating of information that SQL can do well but repositories can't as they're expected to return full entities.
Add methods to your repository that return ONLY what you want, e.g. IOrderRepository.GetByCustomer.
It's completely OK in DDD.
You may also use the Query Object pattern or Specifications to make your repositories more generic; just remember not to use anything ORM-specific in the repository interfaces (e.g. NHibernate's ICriteria).
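A small sketch combining both suggestions, with invented names and nothing ORM-specific in the interface:

import java.util.List;

record Order(String orderId, String customerId) { }

// Query object/specification: callers describe what they want...
interface OrderSpecification {
    boolean isSatisfiedBy(Order order);
}

// ...and the repository exposes either targeted methods or a generic query.
interface IOrderRepository {
    List<Order> getByCustomer(String customerId);
    List<Order> matching(OrderSpecification spec);
}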
