What does "priority" mean in the context of Puppetlabs' Hiera? - puppet

I was wondering if someone could point me toward what "priority" stands for in Puppetlabs' Hiera. Is there an explanation in Hiera's documentation? Is it about the ordering of backends and data sources in the configuration file? Somehow I couldn't google that out.
UPDATE: To be more specific, what I do not understand is the way Hiera v1.3 assigns a priority level to a data source. This priority is referred to by the Hiera v1 manual (https://docs.puppet.com/hiera/1/lookup_types.html): "Note that in cases where two or more source hashes share some keys, higher priority data sources in the hierarchy will override lower ones."

Hiera imposes implicit priority through lookup order: data sources are visited in the order they are defined in the hierarchy. That's why the most specific sources should be defined first; they implicitly get the highest priority.
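As a concrete illustration, here is a minimal, hypothetical Hiera v1 configuration (the paths and file names are made up). Sources are consulted from top to bottom, so the node-specific source implicitly outranks the environment source, which outranks common:

```yaml
# hiera.yaml (Hiera v1) -- hypothetical paths and hierarchy
:backends:
  - yaml
:yaml:
  :datadir: /etc/puppet/hieradata
:hierarchy:
  - "node/%{::fqdn}"         # most specific, highest priority
  - "env/%{::environment}"   # environment-wide defaults
  - common                   # least specific, lowest priority
```

With a hash-merge lookup, keys present in the node-level file override the same keys coming from the environment or common files.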

Related

What is the semantic of having an item flow between Blocks rather than Parts in SysML 1.4?

From my understanding, SysML 1.4 allows itemFlows between Blocks as well as between Parts.
Here is an excerpt from page 75 of the SysML 1.4 specs
which shows that it is possible to have itemFlow(s) between Blocks.
I am not sure about the semantics of this.
For example, referring to the excerpt from the SysML 1.4 specs, does it mean that every instance of the Engine block requires an "itemFlow" connection to an instance of the Transmission block, and that a Torque will flow from every instance of the Engine block to the associated instance of the Transmission block?
Yes, of course. At least if the Engine/Transmission are blocks instantiated from this model.
You are free to define other Engines/Transmissions where no Torque is transported (e.g. if you see a copper cable as a transmission where current is transported rather than torque).
An item flow in general says that "something physical" is moved from source to target. The above transports torque. You can also transport current, gas, fluid, etc. Even abstract information can be transported, though SysML is designed to model physical objects rather than abstract things (where UML will be sufficient).
There is an association between Engine and Transmission. Since we don't see any multiplicity, we may assume that it is 1. That means every Engine instance must be linked to a Transmission instance and vice versa. This is not realistic, but hey, who wants models of reality ;-). In the real world the multiplicity is 0..1.
The item flow just says, that Torque can potentially flow across a link between the two instances.
By the way: This is also not realistic, since torque is the potential to flow, not the item flowing. The item is rather angular momentum. For reasons I don't understand, the potential (e.g. Torque) or the rate (e.g. Current) is often used in place of the item that is flowing in reality.

Do we need another repo for each entity?

For example, take an order entity. It's obvious that order lines don't exist without an order, so we have to get them through the OrderRepository (via an order entity). OK. But what about other things that are loosely coupled with the order? Should the customer info be available only from a CustomerRepo, and the bank requisites of the seller only from a BankRequisitesRepo, etc.? If that is correct, I think we should pass all these repositories to our Create factory method.
Yes. In general, each major entity (aggregate root in domain-driven design terminology) should have its own repository. Child entities (like order lines) will generally not need one.
And yes, define each repository as a service, then inject them where needed.
You can even design things such that there is no direct coupling between Order and Customer in terms of an actual database link. This in turn allows customers and orders to live in completely independent databases. Which may or may not be useful for your applications.
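A minimal sketch of that idea, with hypothetical type names: one repository per aggregate root, and none for order lines.

```java
// Hypothetical sketch: one repository per aggregate root.
// Order lines are loaded and saved through the Order aggregate,
// so they get no repository of their own.
final class Order { /* aggregate root; owns its order lines */ }
final class Customer { /* a separate aggregate root */ }

interface OrderRepository {
    Order findById(String orderId); // rehydrates the whole aggregate, lines included
    void save(Order order);
}

interface CustomerRepository {
    Customer findById(String customerId);
    void save(Customer customer);
}
```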
You correctly understood that an aggregate root's (AR) child entities shall not have their own repository, unless they are themselves ARs. Having repositories for non-ARs would leave your invariants unprotected.
However, you must also understand that entities should usually not be clustered together for convenience, or just because the business states that some entity has one or many of some other entity.
I strongly recommend that you read Effective Aggregate Design by Vaughn Vernon and this other blog post that Vaughn kindly wrote for a question I asked.
One of the aggregate design rules of thumb stated in Effective Aggregate Design is that you should usually reference other aggregates by identity only.
Therefore, you greatly reduce the number of AR instances needed in another AR's creational process, since new Order(customer, ...) could become new Order(customerId, ...).
If you still find the need to query other ARs in one AR's creational process, then there's nothing wrong with injecting repositories as dependencies, but you should not depend on more than you need (e.g. let the client resolve the real dependencies and pass them in directly, rather than passing in a strategy that allows resolving a dependency).
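To make the identity-only reference concrete, here is a minimal sketch using the same hypothetical Order/Customer names: the Order holds a CustomerId value object rather than a Customer instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the Order AR references the Customer AR by
// identity only and owns its OrderLine child entities.
final class CustomerId {
    private final String value;
    CustomerId(String value) { this.value = value; }
    String value() { return value; }
}

final class OrderLine {
    private final String sku;
    private final int quantity;
    OrderLine(String sku, int quantity) { this.sku = sku; this.quantity = quantity; }
}

final class Order {
    private final CustomerId customerId;                   // no Customer object graph
    private final List<OrderLine> lines = new ArrayList<>();

    Order(CustomerId customerId) { this.customerId = customerId; }

    void addLine(OrderLine line) {
        // aggregate-wide invariants (e.g. a maximum line count) would be checked here
        lines.add(line);
    }
}
```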

Can applications coexist within the same DHT?

If you create a new application which uses a distributed hash table (DHT), you need to bootstrap the p2p network. I had the idea that you could join an existing DHT (e.g. the Bittorrent DHT).
Is this feasible? Of course, we assume the same technology: combining Chord with Kademlia is obviously not feasible.
If yes, would this be considered parasitic or symbiotic? Parasitic meaning that it somehow conflicts with the original use; symbiotic if it is good for both applications, as they support each other.
In general: Kademlia and Chord are just abstract designs, while implementations provide varying functionality.
If an implementation's feature set is too narrow, you won't be able to map your application logic onto it. If it's overly broad for your needs, it might be a pain to re-implement if no open-source library is available.
For bittorrent: The bittorrent DHT provides 20-byte key -> List[IP, Port] lookups as its primary feature, where the IP is determined by the sender's IP and thus cannot be used to store arbitrary data. There are some secondary features, like bloom-filter statistics over those lists, but they're probably even less useful for other applications.
It does not provide general key-value storage, at least not as part of the core specification. There is an extension proposal for that.
Although implementations provide some basic forward compatibility for unknown message types by treating them like node-lookup requests instead of just ignoring them, that is only of limited usefulness if your application supplies a small fraction of the nodes, since you're unlikely to encounter other nodes implementing that functionality during a lookup.
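To make the narrow feature set concrete, here is a rough sketch (a hypothetical interface, not a real library) of the only primitive the bittorrent DHT effectively offers applications:

```java
import java.net.InetSocketAddress;
import java.util.List;

// Rough sketch of the bittorrent DHT's core primitive: announce yourself
// under a 20-byte key, then look up who else announced under it. The stored
// IP always comes from the announcer's packet source address, so arbitrary
// data cannot be stored under a key.
interface BittorrentStyleDht {
    void announce(byte[] infohash, int port);            // key must be 20 bytes
    List<InetSocketAddress> getPeers(byte[] infohash);   // the announcers
}
```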
If yes, would this be considered parasitic or symbiotic?
That largely depends on whether you are a "good citizen" in the network.
Does your implementation follow the spec, including commonly used extensions?
Does your general use-case stay within an order of magnitude compared to other nodes when it comes to the traffic it causes?
Is the application lifecycle long enough to not lie outside the expected churn rates of the target DHT?

Explanation and use cases of JCR workspaces, for human beings

Could anybody please interpret the JCR 2.0 specification with regard to JCR workspaces?
I understand that a session is always bound to exactly one persistent workspace, though a single persistent workspace may be bound to multiple sessions.
This probably relates to versioning and transactions, though I don't know why.
Some observations:
- references are only possible between nodes of the same workspace
- executing a query will always be targeted at a single workspace
- workspaces seem to be about nodes that represent the same content (same UUID), in:
  - different versions of "something" (a project maybe?)
  - different phases of a workflow
- and they shouldn't be used for ACLs
Also, in Jackrabbit each workspace has its own persistence manager, whereas ModeShape has a connector per source, independent of workspaces.
David's Model (http://wiki.apache.org/jackrabbit/DavidsModel) Rule #3 recommends using workspaces only if you need clone(), merge() or update(). For the vast majority of JCR applications, this means not using workspaces. Putting things under different paths, setting specific property values or mixin node types on them, and using JCR's versioning covers the versioning and workflow use cases that you mention.
To manage print jobs, for example, you could simply move them between JCR folders named "new", "in-progress", "rejected" and "done". That's how it is done in some Unix versions, using filesystem folders. JCR allows you to do the same while benefiting from its "filesystem on steroids" features, so as to keep things very simple, transparent and efficient.
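A minimal sketch of that pattern using the plain JCR API (the /jobs/* paths and class name are hypothetical):

```java
import javax.jcr.RepositoryException;
import javax.jcr.Session;

// Hypothetical "folders as workflow states" sketch: a print job node is
// moved between /jobs/new, /jobs/in-progress and /jobs/done.
final class PrintJobWorkflow {
    private final Session session;

    PrintJobWorkflow(Session session) { this.session = session; }

    void start(String jobName) throws RepositoryException {
        session.move("/jobs/new/" + jobName, "/jobs/in-progress/" + jobName);
        session.save();
    }

    void finish(String jobName) throws RepositoryException {
        session.move("/jobs/in-progress/" + jobName, "/jobs/done/" + jobName);
        session.save();
    }
}
```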
Note also David's Rule #5: references are harmful - we (Apache Sling and Day/Adobe CQ/CRX developers) tend to just use paths instead, as looser and more flexible references.
And as you mention queries: we also use very few of those - navigation in a JCR tree is much more efficient if your content model's path structure makes sense for the most common use cases.

Modelling a permissions system

How would you model a system that handles permissions for carrying out certain actions inside an application?
Security models are a large (and open) field of research. There's a huge array of models available to choose from, ranging from the simple:
Lampson's Access control matrix lists every domain object and every principal in the system with the actions that principal is allowed to perform on that object. It is very verbose and, if actually implemented in this fashion, very memory intensive.
Access control lists are a simplification of Lampson's matrix: consider it to be something akin to a sparse-matrix implementation that lists objects and principals and allowed actions, and doesn't encode all the "null" entries from Lampson's matrix. Access control lists can include 'groups' as a convenience, and the lists can be stored via object or via principal (sometimes via program, as in AppArmor or TOMOYO or LIDS). (A minimal sketch appears after this list.)
Capability systems are based on the idea of having a reference or pointer to objects; a process has access to an initial set of capabilities, and can get more capabilities only by receiving them from other objects on the system. This sounds pretty far-out, but think of Unix file descriptors: they are an unforgeable reference to a specific open file, and the file descriptor can be handed to other processes or not. If you give the descriptor to another process, it will have access to that file. Entire operating systems were written around this idea. (The most famous are probably KeyKOS and EROS, but I'm sure this is a debatable point. :)
... to the more complex, which have security labels assigned to objects and principals:
Security rings, as implemented in Multics and x86 CPUs among others, provide security traps or gates to allow processes to transition between the rings; each ring has a different set of privileges and objects.
Denning's Lattice is a model of which principals are allowed to interact with which security labels in a very hierarchical fashion.
Bell-LaPadula is similar to Denning's Lattice, and provides rules to prevent leaking top-secret data to unclassified levels; common extensions provide further compartmentalization and categorization to better support military-style 'need to know'.
The Biba Model is similar to Bell-LaPadula, but 'turned on its head' -- Bell-LaPadula is focused on confidentiality, but does nothing for integrity, and Biba is focused on integrity, but does nothing for confidentiality. (Bell-LaPadula prevents someone from reading The List Of All Spies, but would happily allow anyone to write anything into it. Biba would happily allow anyone to read The List Of All Spies, but forbid nearly everyone to write into it.)
Type Enforcement (and its sibling, Domain Type Enforcement) provides labels on principals and objects, and specifies the allowed object-verb-subject (class) tables. This is the familiar SELinux and SMACK.
... and then there are some that incorporate the passage of time:
Chinese Wall was developed in business settings to separate employees within an organization that provides services to competitors in a given market: e.g., once Johnson has started working on the Exxon-Mobil account, he is not allowed access to the BP account. If Johnson had started working on BP first, he would be denied access to Exxon-Mobil's data.
LOMAC and high-watermark are two dynamic approaches: LOMAC modifies the privileges of processes as they access progressively-higher levels of data, and forbids writing to lower levels (processes migrate towards "top security"), and high-watermark modifies the labels on data as higher-levels of processes access it (data migrates towards "top security").
Clark-Wilson models are very open-ended; they include invariants and rules to ensure that every state transition does not violate the invariants. (This can be as simple as double-entry accounting or as complex as HIPAA.) Think database transactions and constraints.
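As promised above, a minimal ACL sketch (hypothetical API): each object keeps only the non-null entries of Lampson's matrix, as a sparse principal-to-actions map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sparse-matrix view of an ACL: only the granted (principal, action)
// pairs are stored; everything else is implicitly denied.
final class AccessControlList {
    private final Map<String, Set<String>> allowed = new HashMap<>();

    void grant(String principal, String action) {
        allowed.computeIfAbsent(principal, p -> new HashSet<>()).add(action);
    }

    boolean isAllowed(String principal, String action) {
        return allowed.getOrDefault(principal, Set.of()).contains(action);
    }
}
```

For example, acl.grant("johnson", "read") followed by acl.isAllowed("johnson", "write") returns false, since nothing was granted for "write".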
Matt Bishop's "Computer security: art and science" is definitely worth reading if you'd like more depth on the published models.
I prefer RBAC. Although you may find it very similar to ACLs, they differ semantically.
Go through the following links:
http://developer.android.com/guide/topics/security/security.html
http://technet.microsoft.com/en-us/library/cc182298.aspx
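For contrast with the ACL sketch above, here is a minimal, hypothetical RBAC sketch. The semantic difference in a nutshell: permissions attach to roles, and users acquire them only through role membership.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical RBAC sketch: no permission is ever attached to a user
// directly; users only gain permissions by being assigned roles.
final class Rbac {
    private final Map<String, Set<String>> rolePermissions = new HashMap<>();
    private final Map<String, Set<String>> userRoles = new HashMap<>();

    void allow(String role, String permission) {
        rolePermissions.computeIfAbsent(role, r -> new HashSet<>()).add(permission);
    }

    void assign(String user, String role) {
        userRoles.computeIfAbsent(user, u -> new HashSet<>()).add(role);
    }

    boolean check(String user, String permission) {
        for (String role : userRoles.getOrDefault(user, Set.of())) {
            if (rolePermissions.getOrDefault(role, Set.of()).contains(permission)) {
                return true;
            }
        }
        return false;
    }
}
```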
