Domain security involving domain logic - security

Together with my application's domain logic I am trying to outline the security model. I am stuck with a requirement that prevents me from considering security just a cross-cutting concern over my domain logic. Here follows my situation.
A user in my system can potentially be allowed to create a certain kind of object, say, 'filters'. I introduce a permission called 'CREATE_FILTER', and a user is either allowed to create filters or not, depending on whether the admin has assigned that permission to the user. OK.
Now consider a more complex requirement where the number of filters a user can create is limited. For example, the admin should be able to set the maximum number of filters any user is allowed to create or, more complex still, assign maximums individually: 3 for User1, 5 for User2, and so on. To authorize filter creation, it is then no longer sufficient for the security system to check whether the user has the permission assigned; it has to inspect the domain model to see how many filters the user has already created. To make things more complex, we can imagine that the maximum depends on the amount of money the user has in their account, or something similar.
I want to settle conceptually (at least in my mind) whether such complicated security logic pertains purely to security (which will of course depend on the domain model), or whether it is already a full-fledged part of the domain logic itself. Does it make sense to keep a 'permission' concept when assigning or removing permissions does not help much (since the authorization decision depends on domain state rather than on assigned permissions)? Would it be a way to go to have a richer permission concept, one that does not allow an action merely by existing but involves some more complex decision-making logic?

Here's one way you could handle this ...
On one side you have a security model (might be a bounded context in DDD speak) that solves the problem of assigning permissions to subjects (users), maybe indirectly through the use of roles. I would envision upper boundaries (max numbers) as an attribute associated with the assigned permission.
There's also a query part to this model. Yet, it can only answer "simple" questions:
Has this user permission to create filters?
How many filters can this user create?
Some would even say this query part is a separate model altogether.
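To make that concrete, here is a minimal sketch of such a security model, written in Python for brevity. The names `Permission`, `Subject`, `grant` and `max_allowed` are invented for illustration and do not come from any library. The upper bound is stored on the assigned permission, and the query side answers only the two questions above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Permission:
    """An assigned permission; `limit` is an optional upper bound (None = unlimited)."""
    name: str
    limit: Optional[int] = None


@dataclass
class Subject:
    """Query side of the security model: it only answers simple questions."""
    id: str
    permissions: Dict[str, Permission] = field(default_factory=dict)

    def grant(self, permission: Permission) -> None:
        # Assigning a permission also records its upper bound, if any.
        self.permissions[permission.name] = permission

    def has_permission_to(self, name: str) -> bool:
        return name in self.permissions

    def max_allowed(self, name: str) -> Optional[int]:
        """Return the limit attached to a granted permission (KeyError if not granted)."""
        return self.permissions[name].limit


user1 = Subject("User1")
user1.grant(Permission("CreateFilter", limit=3))
```

Note that nothing here knows how many filters already exist; that count belongs to the application's model.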
On the other end you have your application's model, which is largely "security"-free apart from these pesky requirements along the lines of "user John Doe can only create 3 filters". As an aside, it's doubtful we're still speaking of a "user" at this point, but rather of a person acting in a certain role in this particular use case. Anyway, back to how we could keep this somewhat separate. Suppose we have a somewhat layered approach, with an authorization service in front of an application service. The responsibility of the authorization service is to answer the question "is this user allowed to perform this operation? yes or no?" and to stop processing if the answer is no. Here's a very naive version of that (C#).
public class FilterAuthorizationServices :
    Handles<CreateFilter>
{
    public FilterAuthorizationServices(FilterRepository filterRepository) { ... }

    public void Authorize(Subject subject, CreateFilter message)
    {
        // Coarse-grained check: does the subject hold the permission at all?
        if (!subject.HasPermissionTo("CreateFilter"))
        {
            throw new NotAuthorizedException("...");
        }
        // Quota check: reject once the maximum has been reached (>=, not >),
        // otherwise the subject could create one filter more than allowed.
        if (filterRepository.CountFiltersCreatedBy(subject.Id) >=
            subject.GetTheMaximumNumberOfFiltersAllowedToCreate())
        {
            throw new NotAuthorizedException("...");
        }
    }
}
Notice how the application service is not even mentioned here. It can concentrate on invoking the actual domain logic. Yet, the authorization service is using both the query part of the above model (embodied by the Subject) and the model of the application (embodied by the FilterRepository) to fulfill the authorization request. Nothing wrong with doing it this way.
You could even go a step further and ditch the need for the application's model if that model could somehow provide the "current number of created filters" to the security model. But that might be a bridge too far for you, since that would lead down the path of evaluating dynamic expressions (which wouldn't necessarily be a bad place to be in). If you want to go there, may I suggest you create a mini DSL to define the required expressions and associated code to parse and evaluate them.
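As a rough idea of what such a mini DSL could look like, the Python sketch below evaluates rule strings against a dictionary of facts. It borrows Python's own expression grammar via the standard `ast` module rather than defining a new syntax, which is a shortcut and an assumption of this example. The fact names (`has_permission`, `created_filters`, `max_filters`) are hypothetical. Only names, constants, `and`/`or` and single comparisons are accepted; anything else is rejected.

```python
import ast
import operator

# Comparison operators the mini language supports.
_COMPARE = {
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
}


def evaluate(expression: str, facts: dict) -> bool:
    """Evaluate a rule such as 'created_filters < max_filters' against facts."""
    tree = ast.parse(expression, mode="eval")
    return bool(_eval(tree.body, facts))


def _eval(node, facts):
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return facts[node.id]  # unknown fact names raise KeyError
    if isinstance(node, ast.BoolOp):
        values = [_eval(v, facts) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        compare = _COMPARE.get(type(node.ops[0]))
        if compare is not None:
            return compare(_eval(node.left, facts),
                           _eval(node.comparators[0], facts))
    # Function calls, attribute access, arithmetic etc. are deliberately refused.
    raise ValueError(f"unsupported expression element: {type(node).__name__}")
```

The application would supply the facts (for example the current filter count), and the security model would own the rule text, so neither needs to know the other's internals.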
If you'd like brokered authorization you could look at something like XACML (https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xacml) though you'll have to overcome your fear of XML first ;-)

Related

How to handle hard aggregate-wide constraints in DDD/CQRS?

I'm new to DDD and I'm trying to model and implement a simple CRM system based on DDD, CQRS and event sourcing to get a feel for the paradigm. I have, however, run in to some difficulties that I'm not sure how to handle. I'm not sure if my difficulties stem from me not having modeled the domain properly or that I'm missing something else.
For a basic illustration of my problems, consider that my CRM system has the aggregate CustomerAggregate (which seems reasonable to me). The purpose of this aggregate is to make sure each customer is consistent and that its invariants hold (name is required, social security number must be in the correct format, etc.). So far, all is well.
When the system receives a command to create a new customer, however, it needs to make sure that the social security number of the new customer doesn't already exist (i.e. it must be unique across the system). This is, of course, not an invariant that can be enforced by the CustomerAggregate aggregate, since customers don't have any information regarding other customers.
One suggestion I've seen is to handle this kind of constraint in its own aggregate, e.g. SocialSecurityNumberUniqueAggregate. If the social security number is not already registered in the system, the SocialSecurityNumberUniqueAggregate publishes an event (e.g. SocialSecurityNumberOfNewCustomerWasUniqueEvent) which the CustomerAggregate subscribes to and publishes its own event in response to this (e.g. CustomerCreatedEvent). Does this make sense? How would the CustomerAggregate respond to, for example, a missing name or another hard constraint when responding to the SocialSecurityNumberOfNewCustomerWasUniqueEvent?
The search term you are looking for is set-validation.
Relational databases are really good at domain agnostic set validation, if you can fit the entire set into a single database.
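As a small illustration of database-enforced set validation, the Python sketch below uses the standard `sqlite3` module with an in-memory database. The table layout and the `register_customer` helper are assumptions made for the example. The uniqueness rule lives entirely in the `UNIQUE` constraint, not in the domain code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " ssn TEXT NOT NULL UNIQUE)"  # the database enforces uniqueness over the set
)


def register_customer(conn, name, ssn):
    """Insert a customer; return False if the SSN is already taken."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "INSERT INTO customers (name, ssn) VALUES (?, ?)", (name, ssn)
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when the UNIQUE constraint on ssn is violated.
        return False
```

Because the check and the write are a single statement, two concurrent inserts of the same SSN cannot both succeed.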
But, that comes with a cost; designing your model that way restricts your options on what sorts of data storage you can use as your book of record, and it splits your "domain logic" into two different pieces.
Another common choice is to ignore the conflicts when you are running your domain logic (after all, what is the business value of this constraint?) but to instead monitor the persisted data looking for potential conflicts and escalate to a human being if there seems to be a problem.
You can combine the two (ex: check for possible duplicates via query when running the domain logic, and monitor the results later to mitigate against data races).
But if you need to maintain an invariant over a set, and you need that to be part of your write model (rather than separated out into your persistence layer), then you need to lock the entire set when making changes.
That could mean having a "registry of SSN assignments" that is an aggregate unto itself, and you have to start thinking about how much other customer data needs to be part of this aggregate, vs how much lives in a different aggregate accessible via a common identifier, with all of the possible complications that arise when your data set is controlled via different locks.
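A minimal in-memory sketch of such a registry aggregate might look like the following Python. The class and method names are hypothetical, and a `threading.Lock` stands in for whatever locking your persistence layer would actually provide. The point is that one lock guards the whole set, so the check and the write cannot interleave.

```python
import threading


class DuplicateSsn(Exception):
    pass


class SsnRegistry:
    """One aggregate that owns the whole set of SSN assignments."""

    def __init__(self):
        self._lock = threading.Lock()  # the single lock guarding the set
        self._assignments = {}         # ssn -> customer id

    def assign(self, ssn: str, customer_id: str) -> None:
        with self._lock:  # check and write happen atomically
            owner = self._assignments.get(ssn)
            if owner is not None and owner != customer_id:
                raise DuplicateSsn(f"{ssn} is already assigned to {owner}")
            self._assignments[ssn] = customer_id
```

Every registration now contends for the same lock, which is exactly the scaling trade-off described above.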
There's no rule that says all of the customer data needs to belong to a single "aggregate"; see Mauro Servienti's talk All Our Aggregates are Wrong. Trade offs abound.
One thing you want to be very cautious about in your modeling, is the risk of confusing data entry validation with domain logic. Unless you are writing domain models for the Social Security Administration, SSN assignments are not under your control. What your model has is a cached copy, and in this case potentially a corrupted copy.
Consider, for example, a data set that claims:
000-00-0000 is assigned to Alice
000-00-0000 is assigned to Bob
Clearly there's a conflict: both of those claims can't be true if the social security administration is maintaining unique assignments. But all else being equal, you can't tell which of these claims is correct. In particular, the suggestion that "the claim you happened to write down first must be the correct one" doesn't have a lot of logical support.
In cases like these, it often makes sense to hold off on an automated judgment, and instead kick the problem to a human being to deal with.
Although they are mechanically similar in a lot of ways, there are important differences between "the set of our identifier assignments should have no conflicts" and "the set of known third party identifier assignments should have no conflicts".
Do you also need to verify that the social security number (SSN) is really valid? Or are you just interested in verifying that no other customer aggregate with the same SSN can be created in your CRM system?
If the latter is the case I would suggest having a CustomerService domain service which performs the whole SSN check by looking up the database (e.g. via a repository) and then creates the new customer aggregate (which again checks its own invariants, as you already mentioned). This whole process - the lookup of the existing SSN and the customer creation - needs to happen within one transaction to ensure consistency. As I consider this domain logic, a domain service is the perfect place for it. It does not hold data by itself but orchestrates the workflow that relates to the business requirement: that no two customers with the same SSN may be created in our CRM.
If you also need to verify that the social security number is real, you would also need to call another service, I guess, or keep some cached data of SSNs in your CRM. In this case you could additionally have a SocialSecurityNumberService domain service which is injected into the CustomerService. This would just be an interface in the domain layer; the implementation of this SocialSecurityNumberService interface would reside in the infrastructure layer, where the access to whatever resource is required is implemented (be it a local cache you build in the background or an API call to another service).
Either way all your logic of creating the new customer would be in one place, the CustomerService domain service. Additional checks that go beyond the Customer aggregate boundaries would also be placed in this CustomerService.
Update
To also adhere to the nature of eventual consistency:
I guess as you go with event sourcing you and your business already accepted the eventual consistency nature. This also means entries with the same SSN could happen. I think you could have some background job which continually checks for duplicate entries and depending on the complexity of your business logic you might either be able to automatically correct the duplicates or you need human intervention to do it. It really depends how often this could really happen.
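The core of such a background check can be very small. The Python sketch below assumes the job can read `(customer_id, ssn)` pairs from some read model; the function name and input shape are invented for illustration. It only detects conflicts. Deciding what to do with them (auto-correct or escalate to a human) is left to the caller.

```python
from collections import defaultdict


def find_ssn_conflicts(customers):
    """Group (customer_id, ssn) pairs by SSN and return SSNs claimed more than once."""
    by_ssn = defaultdict(list)
    for customer_id, ssn in customers:
        by_ssn[ssn].append(customer_id)
    # Keep only the SSNs that more than one customer claims.
    return {ssn: ids for ssn, ids in by_ssn.items() if len(ids) > 1}
```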
If a hard constraint is that this must NEVER happen maybe event sourcing is not the right way, at least for this part of your system...
Note: I also assume that command de-duplication is not the issue here but that you really have to deal with potentially different commands using the same SSN.

DDD - bounded contexts sharing/communication

I am trying to utilize some DDD approaches in the app that allows guest purchase. While it looks easy, I got a bit confused and asking for your DDD advice.
The app has several bounded contexts and we are looking at 3 of them:
Customers (customers manage their user settings here, authentication, also admin can potentially create users)
Sales (orders)
Billing (charging customers for one-off payments and subscriptions)
The user story:
As a guest I want to order the product to do something.
This is a single form, and on checkout the guest will be asked for an email and password. We need to create an account at this step, but based on the business logic this fits the Sales context: a guest places an order. We actually need to do these simple steps:
Create user with ACTIVE status
Store billing details
Create order
Charge user later (event handled by Billing)
The confusion here is that it requires creating a user first. It looks more natural to create it in Customers, but is that perhaps wrong? A user cannot sign up in any other way, only by placing an order, although an admin can create a user manually. Based on different system events (from different contexts), the Customer context may change the user status according to special logic that lives in the Customer domain. Is there any safe way of sharing this User status logic between different contexts (when creating a User in Sales we need that status enum class)? Does this logic for placing an order look OK? Could you please recommend another approach if you think that this one is wrong?
Status is DDD at its worst. Including a status field is 1) lazy, and yet 2) very convenient. Yes, one of those design trade offs.
When you assign a status or read a status you are ignoring or sublimating significant business logic and structure for your domain. When “status” changes some very significant changes could occur in your domain... way beyond changing a status property.
Throw status out and instead consider some concepts: a CasualShopper or Guest (no purchases, browsing for products), a PotentialNewShopper (someone adding things in their basket who you’ve never seen before), and your usual Customer (which should probably be subdivided based on their current activity).
With this modeled, you can attach behaviors directly to each of these objects and “status” itself is sublimated into a richer DDD model. A common DDD mistake is not creating a transactionally-significant object (e.g. a Potential Shopper role) for some static/non-temporal object (e.g. a person).
From here you may decide you need a few bounded contexts; perhaps PotentialCustomers and EstablishedCustomers. In each the set of domain transitions are different and can be encapsulated rather than externalized.
So...
With that out of the way it looks like you have a Customer BC and a PossibleCustomer BC. You can safely do what you need to do in the latter now that it is self-contained.
Oh, but that may affect billing! and ordering!
True. That may mean new BCs or new objects in those BCs such as ProvisionalPayment and UnauthenticatedOrder. I’m spitballing a bit now...
My point is you have events that can transition between states rather than encoding those states, and as such you can attach the behaviors you need and persist them as needed in some physical store, that is likely partitioned in a way suitable to your DDD.
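One way to picture this, sketched in Python with the concept names from above (the fields and method names are assumptions for illustration): each state is its own type, and a transition is a method that returns the next type. There is no status field to read or assign, and an operation that makes no sense in a given state simply does not exist on that type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guest:
    session_id: str

    def add_to_basket(self, product_id: str) -> "PotentialNewShopper":
        # Adding a first item is the event that turns a guest into a potential shopper.
        return PotentialNewShopper(self.session_id, (product_id,))


@dataclass(frozen=True)
class PotentialNewShopper:
    session_id: str
    basket: tuple

    def add_to_basket(self, product_id: str) -> "PotentialNewShopper":
        return PotentialNewShopper(self.session_id, self.basket + (product_id,))

    def check_out(self, email: str) -> "Customer":
        # Checking out is only possible from this state; Guest has no such method.
        return Customer(email, self.basket)


@dataclass(frozen=True)
class Customer:
    email: str
    first_order: tuple
```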
Doing all this work means not sharing unsafe status but sharing safe projections of relevant objects only.
Jumping into implementation briefly and anecdotally, developers are loath to store “temporary” state. “Temporary” state is OK to store and necessary when you’re modeling a domain without that cruddy status field.
You probably should ask yourself first whether you got the Bounded Contexts right.
In my opinion you have the following BCs
Identity and Users
Sales and Billing
Consider this: the same person is a User in the first context but a Customer in the latter. So you have two views of the same real-world entity, which suggests that you have two bounded contexts where this entity means different things.
Your BCs sound more like modules in the Sales and Billing context.
If you agree, then the control flow for your problem may be simplified: an entity is created in one context and the creation is propagated via an event to the other. So the initial request could be handled by the Sales BC and the guest user handling would be propagated to Identity.
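A minimal Python sketch of that flow follows. The in-process `EventBus`, the `GuestOrderPlaced` event and the two context classes are all hypothetical; a real system would more likely publish through a message broker. Sales handles the request and publishes an event, and Identity reacts by creating the user, so neither context calls the other directly.

```python
from collections import defaultdict
from dataclasses import dataclass


class EventBus:
    """Minimal synchronous publish/subscribe, standing in for real messaging."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        for handler in self._handlers[type(event)]:
            handler(event)


@dataclass(frozen=True)
class GuestOrderPlaced:
    order_id: str
    email: str


class IdentityContext:
    """Owns users; reacts to events from Sales instead of being called by it."""

    def __init__(self, bus):
        self.users = set()
        bus.subscribe(GuestOrderPlaced, self.on_guest_order_placed)

    def on_guest_order_placed(self, event):
        self.users.add(event.email)  # idempotent: a repeated event changes nothing


class SalesContext:
    def __init__(self, bus):
        self._bus = bus
        self.orders = {}

    def place_guest_order(self, order_id, email):
        self.orders[order_id] = email
        self._bus.publish(GuestOrderPlaced(order_id, email))
```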

DDD: where should logic go that tests the existence of an entity?

I am in the process of refactoring an application and am trying to figure out where certain logic should fit. For example, during the registration process I have to check if a user exists based upon their email address. As this requires testing if the user exists in the database it seems as if this logic should not be tied to the model as its existence is dictated by it being in the database.
However, I will have a method on the repository responsible for fetching the user by email, etc. This handles the part about retrieval of the user if they exist. From a use case perspective, registration seems to be a use case scenario and accordingly it seems there should be a UserService (application service) with a register method that would call the repository method and perform if then logic to determine if the user entity returned was null or not.
Am I on the right track with this approach, in terms of DDD? Am I viewing this scenario the wrong way and if so, how should I revise my thinking about this?
This link was provided as a possible solution, Where to check user email does not already exits?. It does help but it does not seem to close the loop on the issue. The thing I seem to be missing from this article would be who would be responsible for calling the CreateUserService, an application service or a method on the aggregate root where the CreateUserService object would be injected into the method along with any other relevant parameters?
If the answer is the application service, that seems like you are losing some encapsulation by taking the domain service out of the domain layer. On the other hand, going the other way would mean having to inject the repository into the domain service. Which of those two options would be preferable and more in line with DDD?
I think the best fit for that behaviour is a Domain Service. A domain service can access persistence, so it can check for existence or uniqueness.
Check this blog entry for more info.
For example:
public class TransferManager
{
    private readonly IEventStore _store;
    private readonly IDomainServices _svc;
    private readonly IDomainQueries _query;
    private readonly ICommandResultMediator _result;

    public TransferManager(IEventStore store, IDomainServices svc, IDomainQueries query, ICommandResultMediator result)
    {
        _store = store;
        _svc = svc;
        _query = query;
        _result = result;
    }

    public void Execute(TransferMoney cmd)
    {
        //interacting with the Infrastructure
        var accFrom = _query.GetAccountNumber(cmd.AccountFrom);
        //Setup value objects
        var debit = new Debit(cmd.Amount, accFrom);
        //invoking Domain Services
        var balance = _svc.CalculateAccountBalance(accFrom);
        if (!_svc.CanAccountBeDebitted(balance, debit))
        {
            //return some error message using a mediator
            //this approach works well inside monoliths where everything happens in the same process
            _result.AddResult(cmd.Id, new CommandResult());
            return;
        }
        //using the Aggregate and getting the business state change expressed as an event
        var evnt = Transfer.Create(/* args */);
        //storing the event
        _store.Append(evnt);
        //publish event if you want
    }
}
from http://blog.sapiensworks.com/post/2016/08/19/DDD-Application-Services-Explained
The problem that you are facing is called Set based validation. There are a lot of articles describing the possible solutions. I will give here an extract from one of them (the context is CQRS but it can be applied to some degree to any DDD architecture):
1. Locking, Transactions and Database Constraints
Locking, transactions and database constraints are tried and tested tools for maintaining data integrity, but they come at a cost. Often the code/system is difficult to scale and can be complex to write and maintain. But they have the advantage of being well understood with plenty of examples to learn from. By implication, this approach is generally done using CRUD based operations. If you want to maintain the use of event sourcing then you can try a hybrid approach.
2. Hybrid Locking Field
You can adopt a locking field approach. Create a registry or lookup table in a standard database with a unique constraint. If you are unable to insert the row then you should abandon the command. Reserve the address before issuing the command. For these sort of operations, it is best to use a data store that isn’t eventually consistent and can guarantee the constraint (uniqueness in this case). Additional complexity is a clear downside of this approach, but less obvious is the problem of knowing when the operation is complete. Read side updates are often carried out in a different thread or process or even machine to the command and there could be many different operations happening.
3. Rely on the Eventually Consistent Read Model
To some this sounds like an oxymoron, however, it is a rather neat idea. Inconsistent things happen in systems all the time. Event sourcing allows you to handle these inconsistencies. Rather than throwing an exception and losing someone’s work all in the name of data consistency. Simply record the event and fix it later.
As an aside, how do you know a consistent database is consistent? It keeps no record of the failed operations users have tried to carry out. If I try to update a row in a table that has been updated since I read from it, then the chances are I’m going to lose that data. This gives the DBA an illusion of data consistency, but try to explain that to the exasperated user!
Accepting these things happen, and allowing the business to recover, can bring real competitive advantage. First, you can make the deliberate assumption these issues won’t occur, allowing you to deliver the system quicker/cheaper. Only if they do occur and only if it is of business value do you add features to compensate for the problem.
4. Re-examine the Domain Model
Let’s take a simplistic example to illustrate how a change in perspective may be all you need to resolve the issue. Essentially we have a problem checking for uniqueness or cardinality across aggregate roots because consistency is only enforced with the aggregate. An example could be a goalkeeper in a football team. A goalkeeper is a player. You can only have 1 goalkeeper per team on the pitch at any one time. A data-driven approach may have an ‘IsGoalKeeper’ flag on the player. If the goalkeeper is sent off and an outfield player goes in the goal, then you would need to remove the goalkeeper flag from the goalkeeper and add it to one of the outfield players. You would need constraints in place to ensure that assistant managers didn’t accidentally assign a different player resulting in 2 goalkeepers. In this scenario, we could model the IsGoalKeeper property on the Team, OutFieldPlayers or Game aggregate. This way, maintaining the cardinality becomes trivial.
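The goalkeeper example from that extract can be sketched in a few lines of Python (the class shape is an assumption, not taken from the quoted article). Because the team holds a single goalkeeper reference instead of each player carrying a flag, "exactly one goalkeeper" holds by construction.

```python
class Team:
    """The goalkeeper is a property of the team, not a flag on each player."""

    def __init__(self, player_ids, goalkeeper_id):
        self._players = set(player_ids)
        if goalkeeper_id not in self._players:
            raise ValueError("goalkeeper must be one of the team's players")
        self._goalkeeper_id = goalkeeper_id

    @property
    def goalkeeper_id(self):
        return self._goalkeeper_id

    def change_goalkeeper(self, player_id):
        # Reassigning replaces the previous keeper; two keepers cannot be represented.
        if player_id not in self._players:
            raise ValueError("only a player on this team can go in goal")
        self._goalkeeper_id = player_id

    def is_goalkeeper(self, player_id):
        return player_id == self._goalkeeper_id
```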
You seem to be on the right track; the only thing I didn't get is what your UserService.register does.
It should take all the values needed to register a user as input, validate them (using the repository to check whether the email already exists) and, if the input is valid, store the new User.
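A minimal Python sketch of such a register method could look like this. The in-memory repository, the dictionary used as a user, and the deliberately crude email check are all stand-ins chosen for the example.

```python
class EmailAlreadyRegistered(Exception):
    pass


class InMemoryUserRepository:
    """Stand-in repository; a real one would query the database."""

    def __init__(self):
        self._by_email = {}

    def find_by_email(self, email):
        return self._by_email.get(email)

    def add(self, user):
        self._by_email[user["email"]] = user


class UserService:
    def __init__(self, repository):
        self._repository = repository

    def register(self, email, name):
        # 1. Validate the input values.
        email = email.strip().lower()
        if "@" not in email:
            raise ValueError("invalid email address")
        # 2. Use the repository to check for an existing user.
        if self._repository.find_by_email(email) is not None:
            raise EmailAlreadyRegistered(email)
        # 3. Store the new user.
        user = {"email": email, "name": name}
        self._repository.add(user)
        return user
```

Keep in mind that check-then-insert is not safe under concurrent requests on its own; with a real database you would still want a unique index on the email column as a backstop.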
Problems can arise when the validation involves complex queries. In that case you may need to create a secondary store with special indexes suited to queries that you can't run against your domain model, so you will have to manage two different stores that can be out of sync (a user exists in one but hasn't been replicated to the other yet).
This kind of problem arises when you store your aggregates in something like a key-value store, where you can search only by the id of the aggregate. If you are using something like a SQL database that lets you search on your entities' fields, you can do a lot with simple queries.
The only thing you need to take care of is to avoid mixing query logic and command logic. In your example the lookup is easy: it's just one field and the result is a boolean. Sometimes it can be harder, such as time-based operations or queries spanning multiple tables and aggregating results. In those cases it is better to make your (command) service use a (query) service that offers a simple API to do the calculation, like:
interface UserReportingService {
    ComplexResult aComplexQuery(AComplexInput input);
}
You can implement that with a class that uses your repositories, or with an implementation that executes the query directly on your database (SQL, or whatever).
The difference is that if you use the repositories you "think" in terms of your domain objects, whereas if you write the query directly you think in terms of your DB abstractions (tables/sets in the case of SQL, documents in the case of Mongo, etc.). Which one to choose depends on the query you need to run.
It is fine to inject a repository into the domain.
A repository should have a simple interface, so that domain objects can use it as a simple collection or storage. The main idea of a repository is to hide data access behind a simple and clear interface.
I don't see any problem in calling domain services from a use case. The use case is supposed to be the orchestrator, and domain services are actions. It is fine (and even unavoidable) for a use case to trigger domain actions.
To decide, you should analyze where this restriction comes from.
Is it a business rule? Or maybe the user shouldn't be part of the model at all?
Usually "User" means authorization and authentication, i.e. behaviour that to my mind should be placed in the use case. I prefer to create a separate entity for the domain (e.g. a buyer) and relate it to the use case's user. So when a new user is registered, it is possible to trigger the creation of a new buyer.

Dealing with a user dependent application

An application that I'm currently writing is heavily dependent on the currently logged-in user. To give a concrete example, let's say we have a list of Products.
Now every user has the 'rights' to see certain Products, particular details of this product, and edit / remove fewer of those.
E.g.:
The user can see 3/5 products
The user can see extra details from 2 out of those 3 products
...
As this is the case with most of the application's domain, I have a tendency to pass the user around in most methods, which becomes cumbersome from time to time, as I have to pass the user into some methods just to pass it down to another one that needs it.
My gut tells me I'm missing something, but I'm not sure how I could tackle this problem.
I have given some thought to using a class that holds this user and injecting that class everywhere I need it, or to using a static property.
Now, from time to time it is handy to pass the user into the method; I guess I could override it then:
public void doSomething(User user = null)
{
    // Fall back to the ambient authenticated user when none is passed in
    var u = user ?? this.authService.User;
    ...
}
Are there other ways you could tackle this kind of problem ?
This is going to depend on where you are in the project in terms of progress. In some instances you may not have the leeway to change this but if you have more control or are starting out then you may have options.
Typically Identity & Access Control is a bounded context on its own. Authentication and authorization should not be in your core domain. Your core domain (or even sub-domains) are interested in doing what they do if you have access but it is not the domain's responsibility to determine that access.
The authorization should take place outside the domain. If you find that you are querying your domain then things probably need to change, since you need a dedicated query layer that will probably apply the authorization. Any commands that are restricted will have authorization applied at the integration/application layer. Whether we want to restrict a user from registering a new order, or even new orders of a certain type, should not really matter in terms of the domain, since it is only the granularity that changes.
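For the product-list example from the question, a dedicated query layer might be sketched like this in Python. The class name and the shape of the permission data are assumptions. The query object filters by what the user may see, so the user never has to be threaded through domain methods.

```python
class ProductQueries:
    """Query layer: applies authorization so the domain never sees the user."""

    def __init__(self, products, visible_ids_by_user):
        self._products = products            # product id -> product data
        self._visible = visible_ids_by_user  # user id -> set of visible product ids

    def list_products_for(self, user_id):
        # Unknown users get an empty permission set and therefore see nothing.
        allowed = self._visible.get(user_id, set())
        return [p for pid, p in sorted(self._products.items()) if pid in allowed]
```

Where the visibility data comes from (for instance the Identity & Access Control context) is left open here.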
You may have a sub-domain that deals with the authorization specific to your domain and an Identity & Access Control generic sub-domain that is more orthogonal.
But you may be in a scenario where there is an uncomfortably high level of coupling between the data element authorization (a level of classification) and the structure. I am of the opinion that fluid classification should be kept away from one's structure, as the repercussions of classification changes are too great.
Just some thoughts :)
Your gut is correct; keep listening to it.
Authorization checks should not be mixed with core domain checks. For example, the if that checks that the user may update the product details and the if that checks that the product details are long enough should not be contained in the same class or even the same bounded context. If you have a monolith then the two checks should be contained in separate namespaces/modules.
Now I will tell you how I do it. In my latest monolithic project I use CQRS a lot, I like the separation between Commands and Queries. I will give an example of command validation but this can be extended to query validation and even to non-CQRS architectures.
For every command I register zero or more command validators that check whether the command may be sent to the aggregate. These validators are eventually consistent. If a command passes all the validators, it is sent to the aggregate, where it is checked further, but in a strongly consistent manner. So we are talking about two kinds of validation: validation outside the aggregate and validation inside the aggregate. The checks that belong to other bounded contexts can be implemented using command validators outside the aggregate; that's how I do it. And now some example source code, in PHP:
<?php
namespace CoreDomain {
    class ProductAggregate
    {
        //a method containing yield is a generator, so it cannot be declared void
        public function handle(ChangeProductDetails $command): \Generator
        {
            //this check is strongly consistent
            //the method yields zero or more events, or throws an exception in case of failure
            if (strlen($command->getProductDetails()) < 10) {
                throw new \Exception("Product details must be at least 10 characters long");
            }
            yield new ProductDetailsWereChanged($command->getProductId(), $command->getProductDetails());
        }
    }
}
namespace Authorization {
    class UserCanChangeProductDetailsValidator
    {
        private $authenticationReaderService;
        private $productsPermissionsService;

        public function validate(ChangeProductDetails $command): void //no return value; if all is good, no exception is thrown
        {
            //this check is eventually consistent
            if (!$this->productsPermissionsService->canUserChangeProductDetails($this->authenticationReaderService->getAuthenticatedUserId(), $command->getProductId())) {
                throw new \Exception("User may not change product details");
            }
        }
    }
}
This example uses a style where commands are sent directly to the aggregates, but you can apply this pattern to other styles too. For brevity, the details of registering the command validators are not included.
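As a rough idea of what the omitted registration could look like, here is a Python sketch (Python rather than PHP, to keep the example short). The `CommandDispatcher` class and its methods are hypothetical and are not the wiring the answer's author actually uses. Validators are registered per command type and all of them run before the command reaches its handler.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeProductDetails:
    user_id: str
    product_id: str
    details: str


class CommandDispatcher:
    """Runs the registered validators before handing a command to its handler."""

    def __init__(self):
        self._validators = defaultdict(list)  # command type -> list of validators
        self._handlers = {}                   # command type -> handler

    def register_validator(self, command_type, validator):
        self._validators[command_type].append(validator)

    def register_handler(self, command_type, handler):
        self._handlers[command_type] = handler

    def dispatch(self, command):
        # Validation outside the aggregate: any validator may raise to reject.
        for validate in self._validators[type(command)]:
            validate(command)
        # Only now does the command reach the handler (and the aggregate behind it).
        return self._handlers[type(command)](command)
```

An authorization validator and a core-domain handler can then live in separate modules and meet only in this registration step.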

Enforce invariants spanning multiple aggregates (set validation) in Domain-driven Design

To illustrate the problem we use a simple case: there are two aggregates - Lamp and Socket. The following business rule always must be enforced: Neither a Lamp nor a Socket can be connected more than once at the same time. To provide an appropriate command we conceive a Connector-service with the Connect(Lamp, Socket)-method to plug them.
Because we want to comply with the rule that one transaction should involve only one aggregate, it's not advisable to set the association on both aggregates in the Connect-transaction. So we need an intermediate aggregate which represents the Connection itself. The Connect-transaction would then just create a new Connection with the given components. Unfortunately, this is where the trouble begins: how can we ensure the consistency of the connection state? It may happen that many simultaneous users want to plug in the same components at the exact same time, so our "consistency check" wouldn't reject the request. New Connection-aggregates would be stored, because we only lock at aggregate level. The system would be inconsistent without even knowing it.
But how should we set the boundary of our aggregates to ensure our business rule? We could conceive a Connections-aggregate which gathers all active connections (as Connection-entity), thereby enabling our locking-algorithm which would properly reject duplicate Connect-requests. On the other hand this approach is inefficient and does not scale, further it is counter-intuitive in terms of domain language.
Do you know what I'm missing?
Edit: To sum up the problem, imagine an aggregate User. Since the definition of an aggregate is to be a transaction-based unit we are able to enforce invariants by locking this unit per transaction. All is fine. But now a business rule arises: the username must be unique. Therefore we must somehow reconcile our aggregate boundaries with this new requirement. Assuming millions of users registering at the same time, it becomes a problem. We try to ensure this invariant in a non-locked state since multiple users means multiple aggregates.
According to the book "Domain-driven Design" by Eric Evans one should apply eventual consistency as soon as multiple aggregates are involved in a single transaction. But is this really the case here and does it make sense?
Applying eventual consistency here would entail registering the User and afterwards checking the invariant with the username. If two Users actually set the same username the system would undo the second registration and notify the User. Thinking about this scenario disconcerts me because it disrupts the whole registration process. Sending the confirmation e-mail, for example, would have to be delayed and so forth.
I think I'm just forgetting about something in general but I don't know what. It seems to me that I need something like invariants on Repository-level.
We could conceive a Connections-aggregate which gathers all active connections (as Connection-entity), thereby enabling our locking-algorithm which would properly reject duplicate Connect-requests. On the other hand this approach is inefficient and does not scale, further it is counter-intuitive in terms of domain language
On the contrary, I think you're on the right track with this approach. It seems convoluted because you're using an example that doesn't make any sense - there is no real-life system that checks if a lamp is connected to more than one socket or a socket to more than one lamp.
But applying that approach to the second example would lead you to ask yourself what the "connection" aggregate is in that case, i.e. inside which scope a user name is unique. In a Company? For a given Tenant or Customer? For the whole <whatever-subdomain-youre-in>System? Find the name of the scope and there you have it - an Aggregate to enforce the unique name invariant. Choose the name carefully and if it doesn't exist in the ubiquitous language yet, invent a new concept with the help of a domain expert. DDD is not only about respecting existing domain terms, you're also allowed to introduce new ones when Breakthroughs are achieved.
Sometimes though, you will find that concurrent access to this aggregate is too intensive and generates problematic contention. With domain expert assent, you can introduce eventual consistency with a compensating action in case of conflict - appending a suffix to the nickname and notifying the user, for instance. Or you can split the "hot" aggregate into smaller, smarter, more efficient ones.
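As a rough illustration of the compensating action mentioned above, here is a minimal sketch. The function name `resolveNicknameConflict` and the numeric-suffix scheme are assumptions made for the example; the real rule should come from the domain expert.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical compensating action: if the requested nickname turned out to be
 * taken, pick the first free variant with a numeric suffix.
 *
 * @param string[] $taken nicknames that are already assigned
 */
function resolveNicknameConflict(string $requested, array $taken): string
{
    if (!in_array($requested, $taken, true)) {
        return $requested; // no conflict, nothing to compensate
    }
    $suffix = 2;
    while (in_array($requested . $suffix, $taken, true)) {
        $suffix++;
    }
    return $requested . $suffix;
}
```

The caller would still need to notify the user that the nickname was changed.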
The problem you are describing is called set validation. Greg Young makes a very good point that a key question is whether or not the cost/benefit analysis justifies enforcing this constraint in code.
But let's suppose it does....
I find it's most useful to think about set validation from the perspective of an RDBMS. How would we handle this problem if we were doing things with tables? A likely candidate is that we would have some sort of connection table, with foreign keys for the Lamp and the Socket. Then we would define constraints that would say that each of those foreign keys must be unique in the table.
Those uniqueness constraints on the foreign keys span the entire table, which is the database's way of telling us that the entire table represents a single aggregate.
So if you were going to lift those constraints into your domain model, you would do so by making an aggregate of all connections, so that the domain model can immediately rule on whether or not a given Lamp-Socket connection should be allowed.
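A minimal sketch of what such an aggregate of all connections might look like, assuming lamps and sockets are referenced by string ids. The class name `Connections` and its methods are hypothetical, and persistence and locking are left out.

```php
<?php
declare(strict_types=1);

// Hypothetical aggregate holding every active connection, so that one
// transaction can check both uniqueness rules at once.
final class Connections
{
    /** @var array<string, string> lampId => socketId */
    private $socketByLamp = [];
    /** @var array<string, string> socketId => lampId */
    private $lampBySocket = [];

    public function connect(string $lampId, string $socketId): void
    {
        if (isset($this->socketByLamp[$lampId])) {
            throw new DomainException("Lamp $lampId is already connected");
        }
        if (isset($this->lampBySocket[$socketId])) {
            throw new DomainException("Socket $socketId is already connected");
        }
        $this->socketByLamp[$lampId] = $socketId;
        $this->lampBySocket[$socketId] = $lampId;
    }

    public function disconnect(string $lampId): void
    {
        if (!isset($this->socketByLamp[$lampId])) {
            return; // lamp was not connected, nothing to do
        }
        unset($this->lampBySocket[$this->socketByLamp[$lampId]]);
        unset($this->socketByLamp[$lampId]);
    }
}
```

The two maps play the role of the two unique constraints on the connection table.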
Now, there's an important caveat here -- we're assuming that the domain model is the authority for connections between lamps and sockets. If we are modeling lamps in the real world connected to sockets in the real world, then it's important to recognize that the real world is the authority, not the model.
Put another way, if the domain model gets conflicting information about the real world (two lamps are reportedly connected to the same socket), the model only knows that its information about the world is incorrect -- maybe the first lamp was plugged in, maybe the second, maybe there's a message missing about a lamp being unplugged. So in this sort of case, it's common that you'll want to allow the conflict, with an escalation to a human being for resolution.
the username must be unique
This is the single most commonly asked variation of the set validation problem.
The basic remedy is the same: you now have a User Profile aggregate, with an identifier, and a separate user name directory aggregate, which ensures that each name is uniquely associated with a profile.
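One possible shape for the directory aggregate, as a sketch only. `UsernameDirectory`, `claim` and `release` are hypothetical names, and treating names as case-insensitive is an assumption of the example, not something the rule requires.

```php
<?php
declare(strict_types=1);

// Hypothetical directory aggregate: the single place that decides whether a
// user name is still free.
final class UsernameDirectory
{
    /** @var array<string, string> normalized user name => profileId */
    private $profileByName = [];

    public function claim(string $username, string $profileId): void
    {
        $key = self::normalize($username);
        $owner = $this->profileByName[$key] ?? null;
        if ($owner !== null && $owner !== $profileId) {
            throw new DomainException("User name '$username' is already taken");
        }
        $this->profileByName[$key] = $profileId; // claiming twice is a no-op
    }

    public function release(string $username): void
    {
        unset($this->profileByName[self::normalize($username)]);
    }

    // Assumption: names differing only in case or surrounding spaces are equal.
    private static function normalize(string $username): string
    {
        return strtolower(trim($username));
    }
}
```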
If you don't need to enforce that a profile has at most one user name linked to it, then there is another approach you can take, which is to introduce an aggregate for each user name, which includes the profileId as a member. Thus, each aggregate can enforce the constraint that the name can only be assigned if the previous assignment was terminated.
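The per-name variant could be sketched like this, with the user name itself acting as the aggregate's identity. `UsernameAssignment` and its methods are hypothetical names; how the repository serializes concurrent writes to the same name is not shown.

```php
<?php
declare(strict_types=1);

// Hypothetical aggregate, one instance per user name.
final class UsernameAssignment
{
    /** @var string */
    private $username;
    /** @var string|null profile currently holding the name, if any */
    private $profileId = null;

    public function __construct(string $username)
    {
        $this->username = $username;
    }

    public function assignTo(string $profileId): void
    {
        if ($this->profileId !== null && $this->profileId !== $profileId) {
            throw new DomainException("'{$this->username}' is still assigned");
        }
        $this->profileId = $profileId;
    }

    public function terminate(): void
    {
        $this->profileId = null; // the name becomes available again
    }
}
```

Because every name is its own small aggregate, contention is limited to requests for the same name.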
I think I'm just forgetting about something in general but I don't know what.
Only that constraints don't come from nowhere -- there should be a business motivation for them; and somebody (the domain expert) should be able to document the cost to the business of failing to maintain the proposed set constraint.
For instance, if you are already collecting an email address, do you really need a unique username? How much additional value are you creating by including username in the model? How much more by making it unique...?
If we plan an online game, for example, with millions of users which request games constantly, it's a real problem.
Yes, it is; but that may indicate that the game design is wrong. Review Udi Dahan's discussion of high contention domains, and his essay Race Conditions Don't Exist.
A thing to notice, however, is that if you really have an aggregate, you can scale it independently from the rest of your system. One monster box is dedicated to managing the set aggregate and nothing else (analog: an RDBMS dedicated to managing a single table).
A more likely choice is going to be sharding by realm/instance/whatzit; in which case you'd have a smaller set aggregate for each realm instance.
In addition to the suggestions already made, consider that some of these problems are very similar to database concurrency problems. Say that you have a contact, and one user changes the name, and another user changes the phone number for this contact. If you write a command that updates the whole contact with the state as it was after modification, then one of the two will overwrite the change of the other with the old value, unless measures are taken.
If, however, you write a 'ChangeEmailForContact' command, you change only that one field and do not conflict with the name change, which would similarly be its own command, such as 'RenameContact'.
Now what if two people change the email address shortly after each other? An efficient approach is to pass the original value (the original email address) along with the new value in your command. When updating, you can check whether the original email address matches the current one (so the command had a valid starting point), or whether the new email address already matches the current one (so there is nothing to do). Only if neither holds are you in a conflict situation.
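The check described above could look roughly like this. The `Contact` class and the choice of `RuntimeException` to signal the conflict are assumptions made for the sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical contact aggregate that accepts an email change only when the
// caller started from the value that is still current.
final class Contact
{
    /** @var string */
    private $email;

    public function __construct(string $email)
    {
        $this->email = $email;
    }

    public function changeEmail(string $originalEmail, string $newEmail): void
    {
        if ($this->email === $newEmail) {
            return; // someone already applied the same change
        }
        if ($this->email !== $originalEmail) {
            throw new RuntimeException('Email was changed in the meantime');
        }
        $this->email = $newEmail;
    }

    public function email(): string
    {
        return $this->email;
    }
}
```

Two users submitting the same new address do not conflict; only a change based on a stale original value is rejected.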
Now, apply this to your 'set operation'. The first time a lightbulb is moved into a 'connection' (perhaps I would call it fixture), it is moving from unassigned to connection1. Then, when a lightbulb is moved, it must be moved from connection1 to connection2, say. Now you can validate if that lightbulb is already assigned, if it was assigned to connection1 or if something has changed in the meantime.
It doesn't solve everything of course. For the tiny case that remains, that brief moment where two initial assignments happen close enough together, you either have to validate against something like a Redis cache of assigned usernames, or give an admin an easy tool to resolve this very rare situation. You could for instance make a projection that occasionally reports on such situations, and make sure renaming isn't too painful.
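Such a reporting projection can be very small. The sketch below assumes the read side can supply a list of `[profileId, username]` pairs; the function name `findDuplicateUsernames` is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical report: groups profiles by user name and keeps only the names
 * that ended up assigned to more than one profile.
 *
 * @param array<int, array{0: string, 1: string}> $assignments [profileId, username] pairs
 * @return array<string, string[]> username => profile ids sharing it
 */
function findDuplicateUsernames(array $assignments): array
{
    $profilesByName = [];
    foreach ($assignments as [$profileId, $username]) {
        $profilesByName[$username][] = $profileId;
    }
    return array_filter($profilesByName, function (array $ids): bool {
        return count($ids) > 1;
    });
}
```

An admin tool could run this periodically and offer a rename for each reported name.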

Resources