Why move from CMS/PKCS#7 to CAdES? - digital-signature

I know that the lack of certain information in CMS/PKCS#7 signatures, needed to prove their validity after a long period, has led to the CAdES formats. Are there other reasons why users are migrating from the CMS/PKCS#7 formats to CAdES?
What are the advantages and disadvantages of each format?
Thank you in advance.

CAdES essentially is a specially profiled CMS (the "C" in CAdES, after all, stands for "CMS"). Thus, you don't migrate away from CMS but merely follow some stricter or more concrete rules.
CMS signatures (RFC 5652) may be extremely primitive: they need not contain any signed attributes at all, and if they do, the only ones required are the content type and the hash of the signed data.
Such minimalist signature containers are not useful for general use. There is too much opportunity for forgery (there is no assured, signed information on the signer) and too little information for proper validation.
Thus, many extra specifications have been published defining ways to add such missing information in a secured, signed way, e.g. the ESS certificate identifiers (RFC 2634 / RFC 5035) for the secured identification of the signer certificate.
Collections of such extra attributes have been declared mandatory for signatures to have a certain legal value, e.g. as part of ISIS-MTT / Common PKI. Signature applications used in contexts where such a legal value is required can therefore count on those additional attributes being present in a signature, allowing for proper validation of the signatures.
While at first such collections were defined in smaller contexts only, e.g. on a national basis, meanwhile such collections are defined internationally, too.
CAdES specifies such collections (aka profiles) for all of Europe (and they have been adopted beyond Europe as well).
In essence, creating CMS signatures according to such a profile makes sure that your signatures can be properly processed by a great many applications and, therefore, that their legal value is immediately recognized by them.
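As a rough illustration of what such a profile adds in practice, here is a minimal sketch (TypeScript) of checking whether a signer's signed attributes include the ones a CAdES-style baseline profile expects. The `SignedAttribute` shape is hypothetical (it stands in for whatever your ASN.1/CMS parser produces), and the exact required attribute set depends on the profile and level; only the OIDs themselves are standard.

```typescript
// Hypothetical shape of a parsed SignerInfo's signed attributes; the ASN.1/CMS
// parsing itself (e.g. with a library such as pkijs) is out of scope here.
interface SignedAttribute {
  oid: string;          // dotted OID of the attribute type
  values: Uint8Array[]; // DER-encoded attribute values
}

// Signed attributes a CAdES-style baseline profile expects to be present
// (exact set depends on the concrete profile and level).
const REQUIRED_SIGNED_ATTRIBUTE_OIDS: string[] = [
  "1.2.840.113549.1.9.3",       // content-type (required by CMS once signed attrs exist)
  "1.2.840.113549.1.9.4",       // message-digest (likewise required by CMS)
  "1.2.840.113549.1.9.5",       // signing-time (typically required by baseline profiles)
  "1.2.840.113549.1.9.16.2.47", // signing-certificate-v2 (ESS, RFC 5035)
];

// Returns the OIDs that are missing from the signer's signed attributes.
function missingCadesAttributes(signedAttrs: SignedAttribute[]): string[] {
  const present = new Set(signedAttrs.map((attr) => attr.oid));
  return REQUIRED_SIGNED_ATTRIBUTE_OIDS.filter((oid) => !present.has(oid));
}
```

A real validator would of course also verify the signature value and the certificate path; this sketch only checks that the profile's signed attributes are present at all.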


How to handle hard aggregate-wide constraints in DDD/CQRS?

I'm new to DDD and I'm trying to model and implement a simple CRM system based on DDD, CQRS and event sourcing to get a feel for the paradigm. I have, however, run into some difficulties that I'm not sure how to handle. I'm not sure if my difficulties stem from me not having modeled the domain properly or from missing something else.
For a basic illustration of my problems, consider that my CRM system has the aggregate CustomerAggregate (which seems reasonable to me). The purpose of this aggregate is to make sure each customer is consistent and that its invariants hold up (name is required, the social security number must be in the correct format, etc.). So far, all is well.
When the system receives a command to create a new customer, however, it needs to make sure that the social security number of the new customer doesn't already exist (i.e. it must be unique across the system). This is, of course, not an invariant that can be enforced by the CustomerAggregate aggregate, since customers don't have any information regarding other customers.
One suggestion I've seen is to handle this kind of constraint in its own aggregate, e.g. SocialSecurityNumberUniqueAggregate. If the social security number is not already registered in the system, the SocialSecurityNumberUniqueAggregate publishes an event (e.g. SocialSecurityNumberOfNewCustomerWasUniqueEvent) which the CustomerAggregate subscribes to and publishes its own event in response to this (e.g. CustomerCreatedEvent). Does this make sense? How would the CustomerAggregate respond to, for example, a missing name or another hard constraint when responding to the SocialSecurityNumberOfNewCustomerWasUniqueEvent?
The search term you are looking for is set-validation.
Relational databases are really good at domain agnostic set validation, if you can fit the entire set into a single database.
But, that comes with a cost; designing your model that way restricts your options on what sorts of data storage you can use as your book of record, and it splits your "domain logic" into two different pieces.
Another common choice is to ignore the conflicts when you are running your domain logic (after all, what is the business value of this constraint?) but to instead monitor the persisted data looking for potential conflicts and escalate to a human being if there seems to be a problem.
You can combine the two (ex: check for possible duplicates via query when running the domain logic, and monitor the results later to mitigate against data races).
But if you need to maintain an invariant over a set, and you need that to be part of your write model (rather than separated out into your persistence layer), then you need to lock the entire set when making changes.
That could mean having a "registry of SSN assignments" that is an aggregate unto itself, and then you have to start thinking about how much other customer data needs to be part of this aggregate versus how much lives in a different aggregate accessible via a common identifier, with all of the possible complications that arise when your data set is controlled via different locks.
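A minimal sketch of that trade-off (TypeScript, all names hypothetical): the registry aggregate owns the whole set of assignments, so uniqueness is checked and changed under a single lock, while the rest of the customer data lives in separate aggregates referenced by id.

```typescript
// Sketch only, not a prescription: the registry owns the *set* of SSN
// assignments, so uniqueness is enforced under this one aggregate's lock.
// Customer details live in separate Customer aggregates, referenced by id.
class SsnRegistry {
  // ssn -> customerId
  private readonly assignments = new Map<string, string>();

  register(ssn: string, customerId: string): void {
    const existing = this.assignments.get(ssn);
    if (existing !== undefined && existing !== customerId) {
      throw new Error(`SSN ${ssn} is already assigned to customer ${existing}`);
    }
    this.assignments.set(ssn, customerId);
  }

  release(ssn: string): void {
    this.assignments.delete(ssn);
  }
}
```

The cost is exactly the one described above: every registration now contends on this single aggregate, and you still have to decide how it stays in step with the Customer aggregates.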
There's no rule that says all of the customer data needs to belong to a single "aggregate"; see Mauro Servienti's talk All Our Aggregates are Wrong. Trade-offs abound.
One thing you want to be very cautious about in your modeling is the risk of confusing data entry validation with domain logic. Unless you are writing domain models for the Social Security Administration, SSN assignments are not under your control. What your model has is a cached copy, and in this case potentially a corrupted copy.
Consider, for example, a data set that claims:
000-00-0000 is assigned to Alice
000-00-0000 is assigned to Bob
Clearly there's a conflict: both of those claims can't be true if the social security administration is maintaining unique assignments. But all else being equal, you can't tell which of these claims is correct. In particular, the suggestion that "the claim you happened to write down first must be the correct one" doesn't have a lot of logical support.
In cases like these, it often makes sense to hold off on an automated judgment, and instead kick the problem to a human being to deal with.
Although they are mechanically similar in a lot of ways, there are important differences between "the set of our identifier assignments should have no conflicts" and "the set of known third party identifier assignments should have no conflicts".
Do you also need to verify that the social security number (SSN) is really valid? Or are you just interested in verifying that no other customer aggregate with the same SSN can be created in your CRM system?
If the latter is the case, I would suggest having a CustomerService domain service which performs the whole SSN check by looking up the database (e.g. via a repository) and then creates the new customer aggregate (which again checks its own invariants, as you already mentioned). This whole process - the lookup of the existing SSN and the customer creation - needs to happen within one transaction to ensure consistency. As I consider this domain logic, a domain service is the perfect place for it. It does not hold data itself but orchestrates the workflow which relates to the business requirement - that no two customers with the same SSN may be created in our CRM.
If you also need to verify that the social security number is real, you would also need to perform a call to another service, I guess, or keep some cached SSN data in your CRM. In this case you could additionally have some SocialSecurityNumberService domain service which is injected into the CustomerService. This would just be an interface in the domain layer, but the implementation of this SocialSecurityNumberService interface would then reside in the infrastructure layer, where the access to whatever resource is required is implemented (be it a local cache you build in the background or some API call to another service).
Either way, all your logic for creating the new customer would be in one place, the CustomerService domain service. Additional checks that go beyond the Customer aggregate's boundaries would also be placed in this CustomerService.
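A minimal sketch of that arrangement (TypeScript; all names are hypothetical, the SSN format check is purely illustrative, and the transaction/unit-of-work mechanics are elided):

```typescript
// Hypothetical ports; the implementations live in the infrastructure layer.
interface CustomerRepository {
  existsWithSsn(ssn: string): Promise<boolean>;
  add(customer: Customer): Promise<void>;
}

interface SocialSecurityNumberService {
  // Only needed if the SSN must also be verified against an external source.
  isKnownSsn(ssn: string): Promise<boolean>;
}

class Customer {
  constructor(readonly id: string, readonly name: string, readonly ssn: string) {
    // The aggregate still guards its own invariants.
    if (!name.trim()) throw new Error("Name is required");
    // Illustrative format check only; real rules depend on your domain.
    if (!/^\d{3}-\d{2}-\d{4}$/.test(ssn)) throw new Error("SSN has the wrong format");
  }
}

class CustomerService {
  constructor(
    private readonly customers: CustomerRepository,
    private readonly ssnService: SocialSecurityNumberService,
  ) {}

  // Must run inside one transaction (unit of work) to ensure consistency.
  async createCustomer(id: string, name: string, ssn: string): Promise<Customer> {
    if (!(await this.ssnService.isKnownSsn(ssn))) {
      throw new Error("SSN is not recognized");
    }
    if (await this.customers.existsWithSsn(ssn)) {
      throw new Error("A customer with this SSN already exists");
    }
    const customer = new Customer(id, name, ssn);
    await this.customers.add(customer);
    return customer;
  }
}
```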
Update
To also address the eventually consistent nature of the system:
I guess, as you went with event sourcing, you and your business have already accepted eventual consistency. This also means that entries with the same SSN could happen. I think you could have some background job which continually checks for duplicate entries, and depending on the complexity of your business logic you might either be able to correct the duplicates automatically or need human intervention to do it. It really depends on how often this can actually happen.
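Such a background check could look roughly like this (TypeScript, with hypothetical read-model and issue-tracker ports); note that it only flags duplicates for a human rather than deciding automatically which entry is correct:

```typescript
// Hypothetical read-model port: returns SSNs that appear on more than one customer.
interface CustomerReadModel {
  findDuplicateSsns(): Promise<Array<{ ssn: string; customerIds: string[] }>>;
}

interface IssueTracker {
  raise(summary: string): Promise<void>;
}

// Run periodically (cron, scheduled job, etc.). It does not try to decide
// which customer "owns" the SSN; it just escalates the conflict to a human.
async function checkForDuplicateSsns(
  readModel: CustomerReadModel,
  issues: IssueTracker,
): Promise<void> {
  for (const duplicate of await readModel.findDuplicateSsns()) {
    await issues.raise(
      `SSN ${duplicate.ssn} is used by customers ${duplicate.customerIds.join(", ")}`,
    );
  }
}
```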
If a hard constraint is that this must NEVER happen, maybe event sourcing is not the right way, at least for this part of your system...
Note: I also assume that command de-duplication is not the issue here but that you really have to deal with potentially different commands using the same SSN.

Where to put *serialization* in SOLID programming

I have business objects, that I would like to (de)serialize from and into a .yaml file.
Because I want the .yaml to be human readable, I need a certain degree of control over the serialize and deserialize methods.
Where should the serialization logic go?
A) I could teach every object how to de/serialize itself, but that probably violates the single responsibility principle.
B) I could put it inside a common serialization module, but that might violate the open/closed principle, since more business objects will be added in the future. Also, changes to objects would now need to be performed in two places.
What is the SOLID approach to solve this conundrum for tiny-scale applications?
Usually in this kind of situation, you'll want the business objects to handle their own serialization. This doesn't necessarily violate the single responsibility principle, which asserts that each object should have one job and one boss. It just means that the one job includes serializability. The user owns the business object, and wants to be able to serialize it, so the requirement for serializability comes from the same place as those other requirements -- the user.
There are a couple of danger areas, though. Firstly, do you really need to insist that the business object is serializable, or can you leave it up to the user to decide whether it is serializable or not? If you are imposing a serializability requirement, then there's a good chance that you are violating the SRP that way, because as you evolve the serialization system, you will be imposing your own requirements on the objects.
Second, you probably want to think long and hard about the interface these objects use to serialize themselves. Does it have to be YAML? Why? Does it have to be to a file? Try not to impose requirements that are subject to change because they depend on particular implementation decisions that you're making in the rest of the system. That ends up being a violation of the SRP as well, because those objects then have to evolve according to requirements from two different sources. It's better if the objects themselves initiate their own serialization and can choose the implementation to the greatest extent possible.
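One hedged way to read that advice in code (TypeScript; the names and the use of the "yaml" npm package are illustrative assumptions): the business object initiates its own serialization and controls the shape of the output, while the format and destination stay behind a small abstraction.

```typescript
import { stringify } from "yaml"; // assuming the "yaml" npm package

// The object decides *what* gets serialized; the writer decides *how* and *where*.
interface DocumentWriter {
  write(data: Record<string, unknown>): void;
}

class Invoice {
  constructor(private readonly number: string, private readonly amount: number) {}

  // The business object initiates its own serialization and controls the
  // shape of the output (field names, ordering, omissions)...
  serializeTo(writer: DocumentWriter): void {
    writer.write({ number: this.number, amount: this.amount });
  }
}

// ...while the concrete format and destination remain an implementation detail.
class YamlWriter implements DocumentWriter {
  lastDocument = "";
  write(data: Record<string, unknown>): void {
    this.lastDocument = stringify(data); // persisting to an actual file is elided here
  }
}
```

Swapping YamlWriter for a JSON writer or a test double then requires no change to Invoice.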

Should we trust the repository when it comes to invariants?

In the application I'm building there are a lot of scenarios where I need to select a group of aggregates on which to perform a specific operation. For instance, I may have to mark a bunch of Reminder aggregates as expired if they meet the expiration policy (there is only one).
I have a ReminderExpirationPolicy domain service that is always applied before delivering reminders. This policy does something like:
reminderRepository.findRemindersToExpire().forEach(function (reminder) {
    reminder.expire(clockService.currentDateTime());
});
The expiration policy is currently duplicated as it exists as a SQL predicate within the SqlReminderRepository.findRemindersToExpire method and also within the Reminder.expire aggregate's method.
The answer to the question may be strongly opinionated (although there should definitely be pros and cons - and perhaps a widely adopted practice), but should I simply trust that the Reminder.expire method will only get called as part of the ReminderExpirationPolicy process and trust that the repository implementation will return the correct set of reminders to expire, or should I also protect the invariant within the Reminder aggregate itself?
NOTE: I am aware that modifying multiple aggregates in a single transaction is sub-optimal and hinders scalability, but it's the most pragmatic solution in my case.
should I simply trust that the Reminder.expire method will only get called as part of the ReminderExpirationPolicy process and trust that the repository implementation will return the correct set of reminders to expire or should I also protect the invariant within the Reminder aggregate itself?
Short answer: you have it backwards. You must protect the invariant within the Reminder aggregate; using the policy as a query specification is optional.
The key thing to realize is that, in your scenario, using the policy as a query specification really is optional. Eliding persistence concerns, you should be able to do this
repository.getAll().forEach(function (reminder) { reminder.expire(policy); });
with the aggregate declining to change state when doing so would violate the business invariant.
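For instance, a sketch of such an aggregate (TypeScript, hypothetical names; the policy object is the same one the repository query is derived from):

```typescript
interface ExpirationPolicy {
  isSatisfiedBy(reminder: Reminder, now: Date): boolean;
}

class Reminder {
  private expired = false;

  constructor(readonly dueDate: Date) {}

  isExpired(): boolean {
    return this.expired;
  }

  // The aggregate checks its *current* state against the policy; it does not
  // trust that the caller (or the repository query) has already done so.
  expire(policy: ExpirationPolicy, now: Date): void {
    if (this.expired) return; // idempotent
    if (!policy.isSatisfiedBy(this, now)) {
      return; // decline to change state rather than violate the invariant
    }
    this.expired = true;
  }
}
```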
In general, the reason that this distinction is important is that any data that you could get by querying the repository is stale -- there could be another thread running coincident with yours that updates the aggregate after your query has run but before your expire command runs, and if that coincident work were to change the aggregate in a way that the policy would no longer be satisfied, then your expire command would come along later and threaten to violate the invariant.
Since the aggregate has to protect itself against this sort of race condition anyway, checking the policy in the query is optional.
It's still a good idea, of course -- in the course of normal operations, you shouldn't be sending commands that you expect to fail.
What's really happening, if you squint a little bit, is that the expire command and the query are using the same policy, but where the command execution path is evaluating whether the writeModel state satisfies the policy, the query is evaluating whether the readModel state satisfies the policy. So it isn't really duplicated logic - we're using different arguments in each case.
However, where my assumptions differ from yours is that, as far as I can see (assuming optimistic locking), even if the data become stale after the aggregates are loaded and I do not enforce the expiration rule within the aggregate, the operation would still fail because of a concurrency conflict.
Yes, if you have assurance that the version of the aggregate that processes the command is the same as the version that was used to test the policy, then the concurrent write will protect you.
An additional consideration is that you are losing one of the benefits of encapsulation. If the policy check happens in the aggregate, then you are guaranteed that every code path which can expire the aggregate must also evaluate the policy. You don't get that guarantee if the aggregate is relying on the caller to check the policy (a variant on the "anemic domain" model).

How to use external value object library in Domain Layer

I would like to have one or more libraries of reusable classes that are basically value objects, such as Address, PhoneNumber, EmailAddress, containing mostly properties and a few supporting methods. How can my Domain Layer use these without breaking the rule that the Domain Layer should not contain external references, and without defining them as interfaces/abstract classes in the Domain Layer?
... without breaking the rule that the Domain Layer should not contain external references
I think your definition of 'external references' requires some reevaluation. It is hard to imagine a domain layer that does not reference anything. In C# and Java you will reference at least basic numeric types, dates and strings. I also don't see any harm in referencing external libraries like Noda/Joda time. On the other hand, you of course would not want to reference any heavy technical libraries like persistence, communication, UI etc.
So I would say that you can build your own reusable library referenced from the domain, but it requires very careful consideration and is often not worth the coupling it will create. I would use the following criteria for every type (a sketch of such a type follows the list):
Should be context-independent. EmailAddress for example is relatively independent of the context it is used from. Address on the other hand may have a different meaning depending on a Bounded context.
Should be stable (does not change often).
Should not hide any out-of-process communication (db, network etc)
Should not have any dependencies of its own (other than standard Java/C#)
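For illustration, a type that passes all four criteria might look like this (a TypeScript sketch with a deliberately simple validation rule):

```typescript
// Immutable value object: context-independent, stable, no I/O,
// no dependencies beyond the standard library.
class EmailAddress {
  private constructor(readonly value: string) {}

  static of(raw: string): EmailAddress {
    const normalized = raw.trim().toLowerCase();
    // Deliberately loose check; full RFC 5322 validation is out of scope here.
    if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(normalized)) {
      throw new Error(`Not a valid email address: ${raw}`);
    }
    return new EmailAddress(normalized);
  }

  equals(other: EmailAddress): boolean {
    return this.value === other.value;
  }
}
```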
I think that what you're referring to is a shared kernel.
Shared Kernel – This is where two teams share some subset of the domain model. This shouldn’t be changed without the other team being consulted.
While this looks great at first, since we are drilled not to repeat ourselves, be aware of the pitfalls:
The concepts should have the same meaning in every context. Some of these concepts hold subtle nuances depending on the context. Ask your domain expert.
Changes are more expensive; it might be cheaper to duplicate these few classes so that you can change them on your own than to have to consult multiple teams when something changes.
Stability cuts both ways. If you pull an entity out into each domain, then any changes have to be executed across multiple projects. If you don't, then changes have to be coordinated across multiple domains. The logistics of the former are easier than the latter, but the work involved in the latter can be greater. Either way, you have to test the changes on each platform.
And unless the entity is mature with a relatively well-defined semantics, my experience is that almost everything changes. So stability is nice, but might be a bit of a red herring.
That being said, I like (and +1) @Dmitry's answer.

ddd - Creating and refactoring identity fields of Entities

We are learning DDD and evaluating its use for a system backed by a legacy system & datastore.
We have been using Anti-Corruption Layer, Bubble Context and Bounded Context without knowing about them, and this approach seems very practical to us.
But we are not certain and confident about identification methods and identity correlation.
The legacy data store does not have primary keys; instead it uses composite unique indexes to identify records.
We are currently modeling as Value Objects things that should really be Entities (we wish we could add a "Long id" field to every one of them), or we occasionally use the combination of attributes from the unique indexes as the id field. It seems clear to us that Entity models should have id fields.
Here are my concrete questions:
We want our shiny new Entities to have "Long id" fields, theoretically. Is it OK not to add that field now, since it gives us no value as long as the backend data store won't fill or understand it?
Is it OK, the DDD way, to store the identification information as it is and refactor it sometime later, hoping the data store changes towards our needs?
If so, is abstracting identification from Entities a good approach (I mean an Identifiable interface, or a key class per Entity)? Any good advice is welcome here.
This question may be out of the scope of DDD, but I wonder whether refactoring identification can have an impact on several layers of the system (for example, a REST API may change from /entity/id_field/id_field/id_field to /entity/new_long_id).
And other questions:
How can we use the legacy information for our growing, polished domain model with less pain around identification?
Or is it misguided to wish for a Long id in our domain at any point in the project's life?
How can we refactor identity management?
Update:
Is identity management an important aspect of DDD, or is it an infrastructural aspect that can be refactored later, so that we should not spend more time on it?
Use whatever identifier fits your needs, but be realistic and upfront about the cost and implications of choosing the wrong one. There is merit in assigning identifiers from the outside and just storing them alongside the other bits of information (regardless of format: GUID, long, UUID). Whether or not to refactor identity management is more a matter of doing a cost/benefit analysis. Is it even an option with the legacy system, and for what kind of timeframe would there be two keys side by side? Why are you even reusing the same data store? Why not partition it so you can have parallel data stores (worst case even syncing data between both of them, preferably in one direction)? Try some vertical slicing instead of horizontal. HTH.
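To make the "two keys side by side" idea concrete, here is a sketch (TypeScript; the field names are hypothetical): the entity keeps the legacy composite key as a value object and, optionally, a surrogate id assigned from the outside, so routes and references can migrate from one to the other without reshaping the entity.

```typescript
// Legacy identification: the composite unique index, wrapped as a value object.
class LegacyCustomerKey {
  constructor(
    readonly branchCode: string,
    readonly accountNumber: string,
    readonly sequence: number,
  ) {}

  equals(other: LegacyCustomerKey): boolean {
    return (
      this.branchCode === other.branchCode &&
      this.accountNumber === other.accountNumber &&
      this.sequence === other.sequence
    );
  }
}

// The entity carries both: the legacy key it is actually stored under today,
// and a surrogate id assigned from the outside (long/GUID/UUID), which the
// legacy data store can simply ignore until it learns to persist it.
class Customer {
  constructor(
    readonly legacyKey: LegacyCustomerKey,
    readonly surrogateId?: string,
  ) {}
}
```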
