Why would a Puppet module's main class be included by a sub-class?

In many areas of the puppetlabs/apache module, such as vhost.pp, you can see error handling that requires the base class to be included first, because the class in question uses the base class in its parameter defaults.
Here in dev.pp there are no parameters, though you can see the reference to $::apache::dev_packages, which is declared by the ::apache::params class when ::apache is initialized.
However, in vhosts.pp you can see that the base class is included explicitly without an expectation that it was previously included.
My understanding of this is that apache::vhosts is designed to be used as a standalone class, and its inclusion of ::apache initializes Apache's default configuration as determined by the module. However, if Apache is declared elsewhere, such as:
class { '::apache':
  *params*
}
Then the inclusion of the base class utilizes whatever values were passed as arguments to the base class. Is that correct? Why would two public classes apache::vhosts and apache::dev have two different requirements for usage?

Why would a Puppet module's main class be included by a sub-class?
First of all, these are not base and subclasses. Puppet does have class inheritance, but apache::dev does not use it, and apache::vhost isn't even a class (it's a defined type). The apache class is the module's "main" class, and apache::dev is simply another class in the same module.
Pretty much the only good use for class inheritance is to support obtaining class parameter defaults from another class's variables, but evidently, the people in control of Puppet's online docs no longer think that's a good idea either (though you can still see an example in class apache). Hiera support for data in modules is a decent alternative, but I sometimes think that Puppet, Inc. is too fascinated with their shiny new goodies, and too dismissive of older features that work fine when used as documented, but break unfortunately when misused.
Here in dev.pp there are no parameters
... and no inclusion of class apache. But there is code that will cause catalog building to fail in the event that apache has not already been declared, separately.
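The guard in question looks something like this (a simplified sketch of the pattern, not the module's verbatim code; the message text is illustrative):
class apache::dev {
  if ! defined(Class['apache']) {
    fail('You must include the apache base class before using apache::dev')
  }
  # $::apache::dev_packages is only set if class apache has already been evaluated
  package { $::apache::dev_packages:
    ensure => present,
  }
}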
However, in vhosts.pp you can see that the base class is included
explicitly without an expectation that it was previously included.
Yes, that's fairly normal. More normal, indeed, than apache::dev's behavior. apache::vhost is intended for public use, so if you declare an instance then it ensures that everything it needs is included in the catalog, too.
My understanding of this is that apache::vhosts is designed to be used as a standalone class and it's inclusion of ::apache initializes Apache's default configuration as determined by the module.
Not exactly. apache::vhost is intended to be a public type, and it does declare ::apache to ensure that everything needed to support it is indeed managed. You can characterize that as "standalone" if you like. But the inclusion of ::apache there is no different from the same anywhere else. If that class has already been added to the catalog then it has no additional effect. Otherwise, it is added, with parameters drawn from Hiera data where such parameter data are defined, and hard-coded defaults where not. Hiera is how one should, generally, customize class parameters, and where that is done, the resulting Apache configuration is not accurately characterized as "default" or defined by the module.
However, if Apache is declared elsewhere such as:
class { '::apache':
  *params*
}
Then the inclusion of the base class utilizes whatever values were
passed as arguments to the base class.
If such a resource-like class declaration has already been evaluated then, as I already said, apache::vhost's include-like declaration has no additional effect. But if such a resource-like class declaration is evaluated later, then catalog building will fail. This is one of the major reasons to avoid resource-like class declarations and to rely on data binding via Hiera for class parameter customization.
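For comparison, the data-binding equivalent of that customization lives in Hiera (a sketch; default_vhost is one of the apache class's documented parameters, and the file path depends on your hierarchy):
# hieradata/common.yaml (hypothetical hierarchy level)
apache::default_vhost: false

# any number of these can then be evaluated safely, in any order:
include apache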
Why would two public classes apache::vhosts and apache::dev have two different requirements for usage?
Because the module was developed over multiple years by hundreds of contributors. It is not surprising that this produced some inconsistency, especially since even Puppet developers who contribute to modules are at different points on the road to enlightenment.
The only plausible justification for preferring the approach of apache::dev is to avoid interfering with a resource-like declaration of class apache that is evaluated later, but avoiding such a failure by forcing a different failure is not a major gain. It does afford the opportunity to provide a clearer diagnostic in cases that would fail anyway, but at the expense of failing arbitrarily in other cases where it could just work instead.

Related

Puppet Include vs Class and Best Practices

When should I be using an include vs a class declaration? I am exploring creating a profile module right now, but am struggling with methodology and how I should lay things out.
A little background, I'm using the puppet-labs java module which can be found here.
My ./modules/profile/manifests/init.pp looks like this:
class profile {
  ## Hiera Lookups
  $java_version = hiera('profile::jdk::package')

  class { 'java':
    package => $java_version,
  }
}
This works fine, but I know that I can also remove the class {'java': } block of the code and instead use include java. My question relates to two things. One, if I wanted to use an include statement for whatever reason, how could I still pass the package version from hiera to it? Second, is there a preferred method of doing this? Is the include something I really shouldn't be using, or are there advantages and disadvantages to each method?
My long term goal will be building out profile like modules for my environment. Likely I would have a default profile that applies to all of my servers, and then profiles for different application load outs. I could include the profiles into a role and apply things to my individual nodes at that level. Does this make sense?
Thanks!
When should I be using an include vs a class declaration?
Where a class declares another, internal-only class that belongs to the same module, you can consider using a resource-like class declaration. That leverages your knowledge of the implementation details of the module, as you need to be able to prove that no other declaration of the class in question will be evaluated before the resource-like one. If ever that constraint is violated, catalog building will fail.
Under all other circumstances, you should use include or one of its siblings, require and contain.
One, if I wanted to use an include statement for whatever reason, how
could I still pass the package version from hiera to it?
Exactly the same way you would specify any other class parameter via Hiera. I already answered that for you.
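Concretely: your resource-like declaration sets the java class's package parameter, so the automatic data binding key is java::package (a sketch; the version string is just an example, and the file path depends on your hierarchy):
# hieradata/common.yaml (hypothetical hierarchy level)
java::package: "openjdk-8-jdk"

# manifests/init.pp
class profile {
  include java
}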
Second, is
there a preferred method of doing this?
Yes, see above.
Is the include something I
really shouldn't be using, or are there advantages and disadvantages
to each method?
The include is what you should be using. This is your default, with require and contain as alternatives for certain situations. The resource-like declaration syntax seemed good to the Puppet team when they first introduced it, in Puppet 2.6, along with parameterized classes themselves. But it turns out that that syntax introduced deep design problems into the language, and it has been a source of numerous bugs and headaches. Automatic data binding was introduced in Puppet 3 in part to address many of those, allowing you to assign values to class parameters without using resource-like declarations.
The resource-like syntax has the single advantage -- if you want to consider it one -- that the parameter values are expressed directly in the manifest. Conventional Puppet wisdom holds that it is better to separate data from code, however, so as to avoid needing to modify manifests as configuration requirements change. Thus, expressing parameter values directly in the manifest is a good idea only if you are confident that they will never change. The most significant category of such cases is when a class has read data from an external source (i.e. looked it up via Hiera), and wants to pass those values on to another class.
The resource-like syntax has the great disadvantage that if a resource-like declaration of a given class is evaluated anywhere during the construction of a catalog for a given target node, then it must be the first declaration of that class that is evaluated. In contrast, any number of include-like declarations of the same class can be evaluated, whether instead of or in addition to a resource-like declaration.
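For example (a sketch; the class and parameter names are illustrative):
include myapp            # evaluated first: myapp is declared with default/Hiera parameters
class { 'myapp':         # evaluated later: catalog building fails with a
  port => 8080,          # duplicate declaration error
}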
Classes are singletons, so multiple declarations have no more effect on the target node than a single declaration. Allowing them is extremely convenient. Evaluation order of Puppet manifests is notoriously hard to predict, however, so if there is a resource-like declaration of a given class somewhere in the manifest set, it is very difficult in the general case to ensure that it is the first declaration of that class that is evaluated. That difficulty can be managed in the special case I described above. This falls into the more general category of evaluation-order dependencies, and you should take care to ensure that your manifest set is free of those.
There are other issues with the resource-like syntax, but none as significant as the evaluation-order dependency.
Clarification with respect to automated data binding
Automated data binding, mentioned above, associates keys identifying class parameters with corresponding values for those parameters. Compound values are supported if the back end supports them, which the default YAML back end in fact does. Your comments on this answer suggest that you do not yet fully appreciate these details, and in particular that you do not recognize the significance of keys identifying (whole) class parameters.
I take your example of a class that could on one hand be declared via this resource-like declaration:
class { 'elasticsearch':
  config => { 'cluster.name' => 'clustername', 'node.name' => 'nodename' }
}
To use an include-like declaration instead, we must provide a value for the class's "config" parameter in the Hiera data. The key for this value will be elasticsearch::config (<fully-qualified classname> :: <parameter name>). The associated value is wanted puppet-side as a hash (a.k.a. "associative array", a.k.a. "map"), so that's how it is specified in the YAML-format Hiera data:
elasticsearch::config:
  "cluster.name": "clustername"
  "node.name": "nodename"
The hash nature of the value would be clearer if there were more than one entry. If you're unfamiliar with YAML, then it would probably be worth your while to at least skim a primer, such as the one at yaml.org.
With that data in place, we can now declare the class in our Puppet manifests simply via
include 'elasticsearch'

What's the appropriate class design for exporting testable functions in Typescript?

I have a couple of stateless classes that do some business logic and return some computations. Each one of them has naturally a set of dependencies on other classes.
Now there are two designs that I've juggling between:
Have a class where each method is a static method. I can use jest import mocking, to overwrite the dependencies for testing. Advantage is you only have one class instance.
Have a class with regular non-static methods. This would require instantiating the class in each place it is used. I can pass in class dependencies in the constructor. Testing this is pretty straightforward. Drawback is you create multiple class instances and potentially dependencies in the code.
Which of these is the preferred idiomatic TS approach?
There is also the classic solution of using an IoC container, but I'd like to avoid this since this application is fairly small and don't want to add extra bloat.
Also, don't want to export pure functions and forego classes all together since that means I'll lose auto-importing of classes (in VSCode).
Also, don't want to export pure functions and forego classes all together since that means I'll lose auto-importing of classes (in VSCode).
Can you explain what is this "auto-importing of classes (in VSCode)"? In VSCode, I'm using extensions like TypeScript Importer that adds the appropriate imports for any type (class, function, literal...), not only for classes.
Anyway, IMO, it's the "pure functions" approach that is the more idiomatic JavaScript, at least for those (including me) who like JavaScript's functional programming features. That's true for TypeScript too, even with its sympathy for classes and C#/Java-like object-oriented programming.
If you really prefer classes, avoid static, i.e. option #1. You can instantiate and share instances while bootstrapping your application (as a DI container would do) to avoid multiple instances.
Coupling between classes is less of an issue in TypeScript due to its structural typing: a class can act as an interface, so if class A depends on class B, you can always provide a class-B-like object to the class A instance.
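A minimal sketch of option #2 with constructor injection and a single shared instance created at bootstrap (all names here are made up for illustration):
// prices.ts -- hypothetical business-logic classes
interface TaxPolicy {
  rateFor(region: string): number;
}

class FlatTaxPolicy implements TaxPolicy {
  rateFor(_region: string): number {
    return 0.2;
  }
}

class PriceCalculator {
  // the dependency is passed in, so tests can supply a fake TaxPolicy
  constructor(private readonly taxes: TaxPolicy) {}

  totalPrice(net: number, region: string): number {
    return net * (1 + this.taxes.rateFor(region));
  }
}

// bootstrap: instantiate once and share, as a DI container would
const calculator = new PriceCalculator(new FlatTaxPolicy());
export { PriceCalculator, TaxPolicy, calculator };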

puppet inheritance VS puppet composition

I just came across puppet inheritance lately. A few questions around it:
Is it a good practice to use puppet inheritance? I've been told by some experienced puppet colleagues that inheritance in puppet is not very good; I was not quite convinced.
Coming from the OO world, I really want to understand how puppet inheritance works under the cover, and how overriding works as well.
That depends, as there are two types of inheritance and you don't mention which you mean.
Node inheritance: inheriting from one node fqdn { } definition to another. This in particular is strongly recommended against, because it tends to fail the principle of least surprise. The classic example that catches people out is this:
node base {
  $mta_config = "main.cf.normal"
  include mta::postfix  # uses $mta_config internally
}

node mailserver inherits base {
  $mta_config = "main.cf.mailserver"
}
The $mta_config variable is evaluated in the base scope, so the "override" that is being attempted in the mailserver doesn't work.
There's no way to directly influence what's in the parent node, so there's little benefit over composition. This example would be fixed by removing the inheritance and including mta::postfix (or another "common"/"base" class) from both. You could then use parameterised classes too.
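A sketch of that fix, with the variable turned into a class parameter (the parameter name is illustrative):
class mta::postfix ($config_file = "main.cf.normal") {
  # configuration uses $config_file instead of reaching into node scope
}

node base {
  include mta::postfix
}

node mailserver {
  class { 'mta::postfix':
    config_file => "main.cf.mailserver",
  }
}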
Class inheritance: the use for class inheritance is that you can override parameters on resources defined in a parent class. Reimplementing the above example this way, we get:
class mta::postfix {
  file { "/etc/postfix/main.cf":
    source => "puppet:///modules/mta/main.cf.normal",
  }
  service { ... }
}

class mta::postfix::server inherits mta::postfix {
  File["/etc/postfix/main.cf"] {
    source => "puppet:///modules/mta/main.cf.server",
  }
  # other config...
}
This does work, but I'd avoid going more than one level of inheritance deep as it becomes a headache to maintain.
In both of these examples though, they're easily improved by specifying the data ahead of time (via an ENC) or querying data inline via extlookup or hiera.
Hopefully the above examples help. Class inheritance allows for overriding of parameters only - you can't remove previously defined resources (a common question). Always refer to the resource with a capitalised type name (file { ..: } would become File[..]).
Also useful is that you can also define parameters to be undef, effectively unsetting them.
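For example (a sketch building on the classes above):
class mta::postfix::bare inherits mta::postfix {
  File["/etc/postfix/main.cf"] {
    source => undef,  # effectively unsets the source defined in mta::postfix
  }
}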
First, just to specify the difference between the two: inheritance is an "is-a" relationship and composition is a "has-a" relationship.
1) In puppet, inheritance is single inheritance: we cannot derive from more than one class. Inheritance has its place in puppet, but we should be aware of where it applies. For example, the Puppet docs section "Aside: When to Inherit" (https://docs.puppetlabs.com/puppet/latest/reference/lang_classes.html#aside-when-to-inherit) names exactly two situations where inheritance should happen:
when you want to overwrite a parameter of a resource defined in the parent class
when you want to inherit from a parameters class for standard parameter values
But please note some important things here:
In puppet there is a difference between node and class inheritance.
Recent versions of puppet no longer allow node inheritance; please check https://docs.puppetlabs.com/puppet/latest/reference/lang_node_definitions.html#inheritance-is-not-allowed.
2) Composition, on the other hand, is the design technique that implements a has-a relationship, which we can do using the include keyword, or with class { 'baseclass': } if you want to pass parameters.
(Please note: in puppet we can use "include" multiple times, but not the "class" syntax, as puppet will complain about a duplicate class declaration; see the sketch below.)
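To illustrate that note (sketch):
include apache
include apache        # fine: classes are singletons, repeated includes are no-ops

class { 'apache': }   # error: apache is already declared by the includes above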
So which (either inheritance or composition) is better to use in Puppet? It depends on the context: what puppet code you are writing at the moment, and on understanding the limitations of puppet inheritance and when to use composition.
So, I will try to sum all this up in a few points:
1) First, puppet uses a single-inheritance model.
2) In puppet, the general consensus around inheritance is to use it only when you need to inherit defaults from a base/parent class.
3) But look at this problem where you want to inherit defaults from the parent:
class apache {
}
class tomcat inherits apache {
}
class mysql inherits tomcat {
}
class commerce_server inherits mysql {
}
At first glance this looks logical but note that the MySQL module is now inheriting defaults and resources from the tomcat class. Not only does this make NO sense as these services are unrelated, it also offers an opportunity for mistakes to end up in your puppet manifests.
4) So the better approach (composition) is to simply perform an include on each class you wish to use, as the sketch below shows; this eliminates all scope problems of this nature.
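For instance (a sketch; the class name is illustrative):
class commerce_server {
  include apache
  include tomcat
  include mysql
}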
Conclusion: we can try to simplify our puppet manifests by using inheritance, and this might be sufficient, but it's only workable up to a point. If your environment grows to hundreds or even thousands of servers, made up of 20 or 30 different types of server, some with shared attributes and subtle differences, spread out over multiple environments, you will likely end up with an unmanageable tangled web of inherited modules. At that point, the obvious choice is composition.
Go through these links; they help in understanding puppet composition and inheritance (personally they helped me):
Designing Puppet (really good): http://www.craigdunn.org/2012/05/239/
Composition over inheritance (Wikipedia): http://en.wikipedia.org/wiki/Composition_over_inheritance
Modeling Class Composition with Parameterized Classes: https://puppetlabs.com/blog/modeling-class-composition-with-parameterized-classes
I am basically a programmer, and personally a strong supporter of Inversion of Control/Dependency Injection, which is a concept/pattern made possible through composition.

How, why and when to use the ".Internal" modules pattern?

I've seen a couple of packages on Hackage which contain module names with .Internal as their last name component (e.g. Data.ByteString.Internal).
Those modules are usually not properly browsable (but they may show up nevertheless) in Haddock and should not be used by client code, but contain definitions which are either re-exported from exposed modules or just used internally.
Now my question(s) to this library organization pattern are:
What problem(s) do those .Internal modules solve?
Are there other preferable ways to workaround those problems?
Which definitions should be moved to those .Internal modules?
What's the current recommended practice with respect to organizing libraries with the help of such .Internal modules?
Internal modules are generally modules that expose the internals of a package, that break package encapsulation.
To take ByteString as an example: When you normally use ByteStrings, they are used as opaque data types; a ByteString value is atomic, and its representation is uninteresting. All of the functions in Data.ByteString take values of ByteString, and never raw Ptr CChars or something.
This is a good thing; it means that the ByteString authors managed to make the representation abstract enough that all the details about the ByteString can be hidden completely from the user. Such a design leads to encapsulation of functionality.
The Internal modules are for people that wish to work with the internals of an encapsulated concept, to widen the encapsulation.
For example, you might want to make a new BitString data type, and you want users to be able to convert a ByteString into a BitString without copying any memory. In order to do this, you can't use opaque ByteStrings, because that doesn't give you access to the memory that represents the ByteString. You need access to the raw memory pointer to the byte data. This is what the Internal module for ByteStrings provides.
You should then make your BitString data type encapsulated as well, thus widening the encapsulation without breaking it. You are then free to provide your own BitString.Internal module, exposing the innards of your data type, for users that might want to inspect its representation in turn.
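A sketch of that zero-copy conversion, assuming the PS constructor that older bytestring versions export from Data.ByteString.Internal (the BitString type itself is made up for illustration):
module BitString (BitString, fromByteString) where

import qualified Data.ByteString.Internal as BI
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr)

-- pointer to the byte buffer, byte offset, and length in bits
data BitString = BitString !(ForeignPtr Word8) !Int !Int

-- no copying: we reuse the buffer that the Internal module exposes
fromByteString :: BI.ByteString -> BitString
fromByteString (BI.PS fptr off len) = BitString fptr off (len * 8)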
If someone does not provide an Internal module (or similar), you can't gain access to the module's internal representation, and the user writing e.g. BitString is forced to (ab)use things like unsafeCoerce to cast memory pointers, and things get ugly.
The definitions that should be put in an Internal module are the actual data declarations for your data types:
-- Bla/Internal.hs (a separate file; shown together here for brevity)
module Bla.Internal where

data Bla = Blu Int | Bli String
-- ...

-- Bla.hs -- ONLY export the Bla type, not the constructors
module Bla (Bla, makeBla) where

import Bla.Internal

makeBla :: String -> Bla -- Some function only dealing with the opaque type
makeBla = undefined
#dflemstr is right, but not explicit about the following point. Some authors put the internals of a package in a .Internal module and then don't expose that module via cabal, thereby making it inaccessible to client code. This is a bad thing [1].
Exposed .Internal modules help to communicate different levels of abstraction implemented by a module. The alternatives are:
Expose implementation details in the same module as the abstraction.
Hide implementation details by not exposing them in module exports or via cabal.
(1) makes the documentation confusing, and makes it hard for the user to tell the transition between his code respecting a module's abstraction and breaking it. This transition is important: it is analogous to removing a parameter to a function and replacing its occurrences with a constant, a loss of generality.
(2) makes the above transition impossible and hinders the reuse of code. We would like to make our code as abstract as possible, but (cf. Einstein) no more so, and the module author does not have as much information as the module user, so is not in a position to decide what code should be inaccessible. See the link for more on this argument, as it is somewhat peculiar and controversial.
Exposing .Internal modules provides a happy medium which communicates the abstraction barrier without enforcing it, allowing users to easily restrict themselves to abstract code, but allowing them to "beta expand" the module's use if the abstraction breaks down or is incomplete.
[1] There are, of course, complications to this puristic judgement. An internal change can now break client code, and authors now have a larger obligation to stabilize their implementation as well as their interface. Even if it is properly disclaimed, users is users and gotsta be supported, so there is some appeal to hiding the internals. It begs for a custom version policy which differentiates between .Internal and interface changes, but fortunately this is consistent with (but not explicit in) the versioning policy. "Real code" is also notoriously lazy, so exposing an .Internal module can provide an easy out when there was an abstract way to define code that was just "harder" (but ultimately supports the community's reuse). It can also discourage reporting an omission in the abstract interface that really should be pushed to the author to fix.
The idea is that you can have the "proper", stable API which you export from MyModule, and this is the preferred and documented way to use the library.
In addition to the public API, your module probably has private data constructors and internal helper functions etc. The MyModule.Internal submodule can be used to export those internal functions instead of keeping them completely locked inside the module.
It lets the users of your library access the internals if they have needs that you didn't foresee, but with the understanding that they are accessing an internal API that doesn't have the same implicit guarantees as the public one.
It lets you access the internal functions and constructors for e.g. unit-testing purposes.
One extension (or possibly clarification) to what shang and dflemstr said: if you have internal definitions (data types whose constructors aren't exported, etc.) that you want to access from multiple modules which are exported, then you typically create such an .Internal module which isn't exposed at all (i.e. listed in Other-Modules in the .cabal file).
However, this sometimes does leak out when inspecting types in ghci (e.g. when using a function where some of the types it refers to aren't in scope; I can't think of an instance where this happens off the top of my head, but it does).
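In .cabal terms, the non-exposed variant looks something like this (a sketch; module names are illustrative):
library
  exposed-modules:  Bla
  other-modules:    Bla.Internal
  build-depends:    base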

how to use parametrized classes to reduce code base

I wrote puppet manifests and I use puppet to deploy my system.
I am now refactoring the manifests in order to make them maintainable.
One of sub systems is tomcat with webapplications.
I have ~10 webapps. Each of those has almost the same procedure to deploy.
For now I use classes. 10 files - almost identical.
When I tried to use a parametrized class, puppet let me instantiate it just once.
Then I tried to create 'empty' classes which inherit from the webapp class.
That does not work either, because puppet complains that parameters are not passed to the parent class.
I do not see any way I could abstract the code. How can I do it?
I would like to achieve:
node default {
  class { "webapp::first":  param1 => "one" }
  class { "webapp::second": param1 => "two" }
}
where first and second are applications using the same recipes.
I know there is define, but the recipe is pretty big, and even if it were possible I find a class more readable.
You can use parameters in your classes, but defines are more what you want. Quoting the official documentation:
Classes and defined types are created similarly, but they are used very differently.
Defined types are used to define reusable objects which will have multiple instances on a given host, so they cannot include any resources that will only have one instance. For instance, multiple uses of the same define cannot create the same file.
see http://docs.puppetlabs.com/guides/language_guide.html#resource-collections
Try to use a user-defined type; classes are singletons by nature.
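A sketch of the defined-type version of what you want (the body stands in for your shared recipe):
define webapp ($param1) {
  # the shared deployment recipe goes here, parameterized by
  # $title ("first", "second", ...) and $param1
}

node default {
  webapp { "first":  param1 => "one" }
  webapp { "second": param1 => "two" }
}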
