Explanation and use cases of JCR workspaces for human beings

Could anybody please interpret the JCR 2.0 specification with regard to JCR workspaces?
I understand that a session is always bound to exactly one persistent workspace, though a single persistent workspace may be bound to multiple sessions.
This probably relates to versioning and transactions, though I don't know why.
Some observations:
references are only possible between nodes of the same workspace
executing a query is always targeted at a single workspace
Workspaces seem to be about nodes that represent the same content (same UUID), in:
different versions of "something", a project maybe?
different phases of a workflow
And they shouldn't be used for ACLs.
Also, in Jackrabbit each workspace has its own persistence manager, whereas ModeShape has one connector per repository source, independent of workspaces.

David's model ( http://wiki.apache.org/jackrabbit/DavidsModel ) Rule #3 recommends using workspaces only if you need clone(), merge() or update(). For the vast majority of JCR applications, this means not using workspaces. Putting things under different paths, setting specific property values or mixin node types on them and using JCR's versioning covers the versioning and workflow use cases that you mention.
To manage print jobs, for example, you could simply move them between JCR folders named "new", "in-progress", "rejected" and "done". That's how it's done on some Unix systems, using filesystem folders. JCR allows you to do the same while benefiting from its "filesystem on steroids" features, keeping things very simple, transparent and efficient.
Note also David's Rule #5: references are harmful - we (Apache Sling and Day/Adobe CQ/CRX developers) tend to just use paths instead, as looser and more flexible references.
And as you mention queries: we also use very few of those - navigation in a JCR tree is much more efficient if your content model's path structure makes sense for the most common use cases.

Related

DDD - How to get another aggregate root without talking to the repository?

From this question https://softwareengineering.stackexchange.com/questions/396151/which-layer-do-ddd-repositories-belong-to?newreg=f257b90b65e94f9ead5df5096267ef9a, I understand that we should avoid talking to the repository.
But now we have one aggregate root, and we need to get another aggregate root to check whether the one we have is correct.
For example:
Every resource has a scheme, so the resource holds the name of its scheme.
When the resource is updated with new resource data, it needs to get the scheme entity and ask the scheme to check whether the new resource data passes the scheme's validator.
To get the scheme entity, we need to ask the scheme repository. But they tell me we should avoid talking to the repository. If we really need to avoid that, how can we get the scheme entity by its name?
Two ideas that you should keep in mind:
these are patterns; the patterns aren't expressed exactly the same way everywhere, but are adapted to each context where we use them.
we choose designs that make our lives easier -- if the pattern is getting in the way, we don't use it.
To get the scheme entity, we need to ask the scheme repository. But they tell me we should avoid talking to the repository. If we really need to avoid that, how can we get the scheme entity by its name?
Basic idea: entities can be found via traversal (you start from some aggregate root, then keep asking for things until you get where you want); root entities can be reached via the repository.
In the "blue book", Evans assumed that you could reach one aggregate root from another; people tend not to do that as much (when you can no longer assume that all aggregates will be stored in the same database, you suddenly need to be careful about which aggregates can change at the same time, and this in turn suggests additional constraints on your design).
In the present day, we would notice that if we aren't intending to change the scheme, we don't need its aggregate root at all - we just need a validator, which we can create from a copy of the information.
So we treat the scheme information as reference data (see Helland, 2005).
Great - so how do we get "reference data" to the resource aggregate?
Typically in one of two ways - the simplest by far is to just pass it as an argument. The application code (usually the same place where we pull the resource aggregate out of its repository) is also responsible for looking up the reference data.
resource = resourceRepository.get(id)
validator = validatorRepository.get(resource.scheme)
resource.update(validator, ....)
Alternatively, we can pass the capability to lookup the validator to the resource. Naively, that might look like:
resource = resourceRepository.get(id)
resource.update(validatorRepository::get, ....)
But that violates our "rule" about where we use the repository. So now what? Two possible answers: we decide the rule doesn't apply here, OR we use a similar pattern to get what we need: a "domain service".
resource = resourceRepository.get(id)
resource.update(domainService, ....)
Domain service is a pattern that can be used for all sorts of things; here, we are using it as a convenient mechanism to access the reference data that we need.
Superficially, it looks like a repository - the significant difference here is that this domain service doesn't have affordances to change the scheme entities; it can only read them. The information is the same, but the expression of that information as an object is different (because this object only has read methods).
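A minimal sketch of that shape in C# (every type and member name below is hypothetical, invented purely for illustration; this is not the one true way to implement it):

// Hypothetical illustration of a read-only "domain service" for reference data.
public class ResourceData { /* the proposed new state of the resource */ }

public class SchemeValidator
{
    public bool IsSatisfiedBy(ResourceData data) { /* scheme-specific rules */ return true; }
}

// Looks a bit like a repository, but offers no way to change scheme entities.
public interface ISchemeValidators
{
    SchemeValidator GetValidator(string schemeName);
}

public class Resource
{
    public string Scheme { get; private set; }

    public void Update(ISchemeValidators validators, ResourceData newData)
    {
        // Ask the domain service for the validator of this resource's scheme;
        // the interface only reads reference data, it cannot modify it.
        var validator = validators.GetValidator(Scheme);
        if (!validator.IsSatisfiedBy(newData))
            throw new System.InvalidOperationException("New resource data does not match its scheme.");

        // ... apply newData to this aggregate ...
    }
}

Note that the interface deliberately has no save or update methods; that absence is what distinguishes it from the scheme repository.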
It's "just" design; the machine really doesn't care how we tell it what work to do. We're merely trying to arrange the code so that it clearly communicates intent to the next programmer who comes along.

Convention-based object-graph synchronization

I'm planning my first architecture that uses DTOs. I'm now exploring how to map the modified client-side domain objects back to the DTOs that were originally retrieved from the data service. I must map back to the original object graph, instead of instantiating a new one, in order to use WCF Data Services Client Library's change tracking feature.
To put it in general terms, I need a tool that maps instances and (recursively) their sub-instances (collectively called the "source graph") to existing instances and (recursively) sub-instances (collectively called the "target graph") in a manner that is (nearly) 100% convention, rather than configuration, based.
The specific required functionality that I can think of is (a rough sketch follows the list):
Replace single-valued properties within the target graph with their corresponding values from the source graph.
Synchronize collection pairs: elements that were added to a collection within the source graph should then be added to the corresponding collection within the target graph; elements removed from a collection within the source graph should then be removed from the corresponding collection within the target graph.
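To make those two requirements concrete, here is a rough hand-written sketch (the Order/OrderLine DTOs are hypothetical, just for illustration) of what I would expect a convention-based tool to do generically for every type and collection pair in the graph:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical DTOs, for illustration only.
public class OrderLine { public int Id { get; set; } public decimal Amount { get; set; } }

public class Order
{
    public string CustomerName { get; set; }
    public DateTime OrderDate { get; set; }
    public List<OrderLine> Lines { get; set; }
}

public static class GraphSync
{
    // Hand-written equivalent of what the tool should infer by convention.
    public static void Synchronize(Order source, Order target)
    {
        // Requirement 1: overwrite single-valued properties on the *existing* target instance.
        target.CustomerName = source.CustomerName;
        target.OrderDate = source.OrderDate;

        // Requirement 2: reconcile the collection pair element by element,
        // so only adds and removes are applied to the original tracked objects.
        foreach (var line in source.Lines)
            if (!target.Lines.Any(l => l.Id == line.Id))
                target.Lines.Add(line);                  // added on the client side

        foreach (var line in target.Lines.ToList())
            if (!source.Lines.Any(l => l.Id == line.Id))
                target.Lines.Remove(line);               // removed on the client side
    }
}

The point is that the target graph is mutated in place, so the change tracking attached to the original objects sees every property write, add and remove.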
When it comes to mapping DTOs, it seems many people use AutoMapper, so I had assumed this task would be easy with that tool. Upon looking at the details, though, I have doubts it will fit my requirements: from what I have read, AutoMapper won't handle #1 so well, and likewise won't help much with #2 either.
I don't want to try bending AutoMapper to my purposes if it will lead to a lot of configuration code. That would defeat the purpose of using a convention-based tool in the first place. So I'm wondering: what's a better tool for the job?

How to generate POCO proxies from an existing database

I recently switched to Entity Framework 5. Now I want to generate the POCO classes from an existing database, and I also need both lazy loading and change tracking, so all the scalar properties should be virtual as well as the navigation properties.
Adding a new ADO.NET Entity Data Model results in an .edmx file and some other .cs and .tt files.
Firstly, I wonder why the generated POCO classes by default do not meet the requirements of a change-tracking proxy, i.e. scalar properties are not virtual.
Secondly, how can I generate proxy-enabled POCO classes?
PS: I accepted Slauma's answer as the best and the only answer so far, but I don't agree with the first part of it. Here is my argument.
Slauma talks about two problems with proxies: restrictions and performance.
About the restrictions on the proxy-enabled entities:
When the classes are generated by Entity Framework in the DB-First approach, the rules that the classes must follow to enable change-tracking proxies are not that important, because they are not restrictive at all. Who really cares whether the navigation collections are IList or HashSet? Talking about the restrictions only makes sense when there are previously designed classes in the application and the tables are to be generated from them.
Complex properties are not supported in DB-First, so we can exclude them from our discussion.
About the performance:
In the referenced article, and in some other experiments I have studied so far, the results are not convincing enough to reject proxies in favor of snapshots. First, the experiments were done on a large number of entities, namely 10,000. It is not improbable that a batch process in your application (not in the database) works on a large number of entities, but for such cases better approaches are assumed, such as stored procedures.
Second, depending on the type of application and its needs, we usually deal with a small number of entities at a time, for example when the Repository pattern is implemented and used; then there is no difference between the performance of proxies and snapshots.
Interestingly, in the referenced experiment, re-assigning the same value to the properties was the only case where the performance of proxies dramatically failed. But who really does this? It is very easy to be careful and avoid repeatedly notifying the change tracker. Again, a significant problem arises only when a large number of entities is involved.
Firstly, I wonder why the generated POCO classes by default do not meet the requirements of a change-tracking proxy, i.e. scalar properties are not virtual.
Using change-tracking proxies is not recommended as the default change tracking strategy; this is explained in more detail in this blog post. In essence, the main reason to use change-tracking proxies - better performance compared to snapshot-based change tracking - is not always guaranteed (sometimes it is even worse), and the list of disadvantages is longer than for snapshot-based change tracking.
In the past, the T4 templates that generated POCO entities indeed marked all properties - including scalar properties - as virtual and prepared the entities for proxy-based change tracking. For the reasons described in the blog post this has been changed for the newer templates, including the DbContext Generator for EF 5, as mentioned in a comment below the blog post linked above. Now only navigation properties are marked as virtual, but not scalar properties, which allows lazy loading but is not sufficient for change-tracking proxies.
Secondly, how can I generate proxy-enabled POCO classes?
I am not aware of any available T4 template that would do this, but it is quite easy to modify the default template to also mark the scalar properties as virtual:
In your project you should have two files with a .tt extension: YourModelContainer.tt and YourModelContainer.Context.tt. Open the YourModelContainer.tt file.
In this file you'll find a method called Property:
public string Property(EdmProperty edmProperty)
{
    return string.Format(
        CultureInfo.InvariantCulture,
        "{0} {1} {2} {{ {3}get; {4}set; }}",
        Accessibility.ForProperty(edmProperty),
        _typeMapper.GetTypeName(edmProperty.TypeUsage),
        _code.Escape(edmProperty),
        _code.SpaceAfter(Accessibility.ForGetter(edmProperty)),
        _code.SpaceAfter(Accessibility.ForSetter(edmProperty)));
}
Change the line with...
Accessibility.ForProperty(edmProperty),
...to...
AccessibilityAndVirtual(Accessibility.ForProperty(edmProperty)),
That's it.
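To illustrate the result (the Customer and Order classes below are made up, not taken from your model), an entity generated by the modified template would look roughly like this; with every property virtual it satisfies the change-tracking proxy requirements:

using System.Collections.Generic;

// Hypothetical generated entity after the template change.
public partial class Customer
{
    public virtual int Id { get; set; }                     // scalar property, now virtual
    public virtual string Name { get; set; }                // scalar property, now virtual
    public virtual ICollection<Order> Orders { get; set; }  // navigation property (was already virtual)
}

public partial class Order
{
    public virtual int Id { get; set; }
    public virtual Customer Customer { get; set; }
}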
Just to mention it, in case you are not familiar with it: there is a second kind of Database-First approach available, namely reverse engineering an existing database to a Code-First model. This approach doesn't use a T4 template at all but creates a Code-First model and a context with Fluent API mapping. It is useful if you want to customize and extend the model classes (you could then also add the virtual modifiers manually) and proceed with the Code-First workflow (and Code-First Migrations) in the future to update and evolve your database schema.
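As a rough idea of what such a reverse-engineered Code-First model looks like (the Blog/Post classes and table names here are invented for illustration, not produced by any specific tool run), with the virtual modifiers added by hand and the mapping expressed through the Fluent API:

using System.Collections.Generic;
using System.Data.Entity;

// Hypothetical reverse-engineered entities; 'virtual' added manually to enable
// lazy loading and change-tracking proxies.
public class Blog
{
    public virtual int Id { get; set; }
    public virtual string Title { get; set; }
    public virtual ICollection<Post> Posts { get; set; }
}

public class Post
{
    public virtual int Id { get; set; }
    public virtual int BlogId { get; set; }
    public virtual string Content { get; set; }
    public virtual Blog Blog { get; set; }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }
    public DbSet<Post> Posts { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Fluent API mapping to the existing tables, instead of an .edmx file.
        modelBuilder.Entity<Blog>().ToTable("Blogs");
        modelBuilder.Entity<Post>()
            .HasRequired(p => p.Blog)
            .WithMany(b => b.Posts)
            .HasForeignKey(p => p.BlogId);
    }
}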

How to use external value object library in Domain Layer

I would like to have one or more libraries of reusable classes that are basically value objects, such as Address, PhoneNumber, EmailAddress, containing mostly properties and a few supporting methods. How can my Domain Layer use these without breaking the rule that the Domain Layer should not contain external references, and without defining them as interfaces/abstract classes in the Domain Layer?
... without breaking the rule that the Domain Layer should not contain external references
I think your definition of 'external references' requires some reevaluation. It is hard to imagine a domain layer that does not reference anything. In C# and Java you will reference at least basic numeric types, dates and strings. I also don't see any harm in referencing external libraries like Noda/Joda Time. On the other hand, you of course would not want to reference any heavy technical libraries for persistence, communication, UI, etc.
So I would say that you can build your own reusable library referenced from the domain, but it requires very careful consideration and is often not worth the coupling that it will create. I would use the following criteria for every type (a small sketch follows the list):
Should be context-independent. EmailAddress, for example, is relatively independent of the context it is used from. Address, on the other hand, may have a different meaning depending on the Bounded Context.
Should be stable (does not change often).
Should not hide any out-of-process communication (DB, network, etc.).
Should not have any dependencies of its own (other than the standard Java/C# libraries).
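As a small sketch of what I mean (illustrative only, not taken from any particular library), an EmailAddress value object that meets these criteria is immutable, does no out-of-process work and depends only on the standard library:

using System;

// Illustrative value object: context-independent, stable, no I/O, no external dependencies.
public sealed class EmailAddress : IEquatable<EmailAddress>
{
    private readonly string _value;

    public EmailAddress(string value)
    {
        // Deliberately simple validation; real rules would live here too.
        if (string.IsNullOrWhiteSpace(value) || !value.Contains("@"))
            throw new ArgumentException("Not a valid e-mail address.", "value");
        _value = value;
    }

    public string Value { get { return _value; } }

    public bool Equals(EmailAddress other)
    {
        return other != null && StringComparer.OrdinalIgnoreCase.Equals(_value, other._value);
    }

    public override bool Equals(object obj) { return Equals(obj as EmailAddress); }
    public override int GetHashCode() { return StringComparer.OrdinalIgnoreCase.GetHashCode(_value); }
    public override string ToString() { return _value; }
}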
I think that what you're referring to is a shared kernel.
Shared Kernel - This is where two teams share some subset of the domain model. This shouldn't be changed without the other team being consulted.
While this looks great at first, since we are drilled not to repeat ourselves, be aware of the pitfalls:
The concepts should have the same meaning in every context. Some of these concepts hold subtle nuances depending on the context. Ask your domain expert.
Changes are more expensive; it might be cheaper to duplicate these few classes so that you can change them on your own than to have to consult multiple teams when something changes.
Stability cuts both ways. If you pull an entity out into each domain, then any changes have to be executed across multiple projects. If you don't, then changes have to be coordinated across multiple domains. The logistics of the former are easier than the latter, but the work involved in the latter can be greater. Either way, you have to test the changes on each platform.
And unless the entity is mature, with relatively well-defined semantics, my experience is that almost everything changes. So stability is nice, but might be a bit of a red herring.
That being said, I like (and +1) @Dmitry's answer.

UI concerns affecting domain design

I have a fairly simple domain, with around 7-8 major entities identified, and these could be their own aggregate roots. But there is going to be a UI screen that lists the union of all objects in the system, which would mean a union across all aggregates.
One way I have in mind is to use composition, i.e. a Metadata aggregate that all other aggregate roots refer to; this would be an independent entity. For this screen I can then query this aggregate; the fields that I move to this new aggregate are the common fields that need to be displayed in my "All Objects" grid.
The other approach could be an application service method that builds the necessary list for the "All Objects" screen by querying the other repositories, merging the lists at the application layer, and also handling paging etc.
I am uneasy with the first solution, as I can see a UI use case influencing my domain design; on the other hand the DB does the grunt work of handling paging and merging lists, there are no joins, and all of this information is gleaned from a single, simple query.
The second solution, although it looks neater, loses out on ease and performance.
Please advise.
In this case I would propose the use of read-models, which are essentially value objects or DTOs used specifically for read scenarios. Using read-models is a pattern for keeping your entities and ARs clean. As far as how the read-models are created, you basically have the two options you described. One is to have a single repository return a read-model that fulfills the requirements of a given view; this allows you to leverage the database for performance. The other option is to compose read-models from multiple repositories or services at the application service level, or even at the presentation layer. This approach is more extensible in that the data doesn't have to come from the same data source.
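As a minimal sketch (all names here are illustrative, not prescribed): a flat read-model carrying just the common fields the "All Objects" grid needs, behind a read-only query interface. An implementation of that interface can let the database do the union and paging (your first option) or merge the results of the individual repositories in the application layer (your second option); the domain model stays untouched either way.

using System;
using System.Collections.Generic;

// Illustrative read-model: a flat DTO with only the fields the grid needs.
public class ObjectSummary
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Kind { get; set; }           // which aggregate type the row came from
    public DateTime LastModified { get; set; }
}

// Read-only query interface; it exposes no way to modify the aggregates.
public interface IAllObjectsQuery
{
    IList<ObjectSummary> GetPage(int pageIndex, int pageSize);
}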
