How are Chef attributes stored internally

I know where Chef attributes can be defined, the attribute types, and their precedence levels. I just want to understand how they are stored internally.
Suppose I declare an attribute
default[:app][:install] = "/etc/app"
1) How is it stored internally? Is it a tree structure (hierarchy) in the node object, or hashmaps, or a list of variables in the node object?
2) Also, in most of the cookbooks I see attributes declared at 2 or 3 levels, as above. I don't understand whether this is a standard or a best practice. Are there any guidelines for how attributes should be declared? Does it have something to do with their internal storage? Can't I declare the attribute as
default[:appinstall] = "/etc/app"
and access it as below in my recipe?
node[:appinstall]

Just four Mashes (a Mash is a subclass of Hash which does the string vs. symbol key fixups), one per precedence group: default, normal, override, and automatic. When you access the merged view via node['foo'], a Chef::Node::Attribute object traverses all four in parallel until it finds a leaf value.
What you have shown is correct for setting and using attributes, though string keys are preferred over symbols. You should also in general scope your attributes with the name of the cookbook like:
default['mycookbook']['appinstall'] = '/etc/app'
This will reduce the chances of collisions with other cookbooks.
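To make this concrete, here is a rough Ruby sketch of the merged-lookup idea (greatly simplified and purely illustrative; the real logic lives in Chef::Node::Attribute and also handles deep merging):
# Four precedence groups, lowest to highest (simplified)
defaults  = { "mycookbook" => { "appinstall" => "/etc/app" } }
normal    = {}
override  = {}
automatic = { "hostname" => "web01" }
# node['foo'] conceptually consults all four, letting the highest
# precedence level that has the key win
def merged_read(key, defaults, normal, override, automatic)
  [automatic, override, normal, defaults].each do |level|
    return level[key] if level.key?(key)
  end
  nil
end
puts merged_read("mycookbook", defaults, normal, override, automatic)
# => {"appinstall"=>"/etc/app"}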

How to access variables from another class using getter

I am a beginner programmer and I am having a hard time grasping getters and setters. I just do not see the point.
I am trying to access a variable in class A and use that value in class B to do some function. I thought I could use a getter to access that value, but it returns null, since (as I understand it) I am creating a new object with new values. What is the point of a getter then?
I passed the variables over using method parameters, but that seems counterintuitive to my beginner's mind. I just don't understand the entire concept. Or am I wrong? Can I use getters to access the value of another class's variable without making it static?
If I'm understanding you correctly, what you're asking is "Why do I need instance variables, with getter and setter methods to read/modify those variables, when I can just pass data into an object using method arguments?" Does that sound about right?
The answer gets to the heart of what OOP (object-oriented programming) is all about. The central concept of OOP is that you create distinct objects to represent discrete pieces of data. For example, you might want to track names and ages for some group of people; in that case, you would use different objects to represent (and by extension, store and manage data about) each individual person.
Person 1 ("Bill", 52)
Person 2 ("Mary", 13)
Person 3 ("Lana", 29)
The purpose of the class in this model is simply to define the specifications of these objects (e.g. a "Person" consists of a name and an age).
Why is this useful? First, this is a pretty intuitive system, since you can think of the objects you're creating as being actual real-life objects. Second, it makes it easy to work with data that are related. If, for example, we wanted to concatenate (join together in a string) a person's name with their age, having an object representing each person makes that easy! Just look at each object, one by one, and use getters to access the values for each instance.
To do this in a non-OOP way, we would need some other way to store the information -- perhaps as a list of names and a separate list of ages.
List of names: ["Bill", "Mary", "Lana"]
List of ages: [52, 13, 29]
In that kind of setup, it's not as easy to see which name relates to which age -- the only thing they have connecting them is their position within the list. And if the lists were sorted, those positions could change!
So, in short: object instances are a great way to handle many similar discrete collections of data.
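To illustrate, here is a minimal Java sketch of such a Person class (a hypothetical example written for this answer):
class Person {
    private final String name;
    private final int age;
    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
    // Getters expose each instance's data without exposing the fields directly
    String getName() { return name; }
    int getAge() { return age; }
    public static void main(String[] args) {
        Person[] people = {
            new Person("Bill", 52), new Person("Mary", 13), new Person("Lana", 29)
        };
        for (Person p : people) {
            // Each object keeps its name and age together, so joining them is easy
            System.out.println(p.getName() + " is " + p.getAge());
        }
    }
}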
As far as why we generally use getter methods and setter methods when working with those instances, instead of just exposing properties directly, a great explanation can be found here. But it bears mentioning that different languages handle this differently. In JavaScript, for example, all properties are accessible directly. In Ruby, none of them are, and you must use setters and getters to see/modify object instance variables.
I hope this provides some clarity!

Puppet Include vs Class and Best Practices

When should I be using an include vs a class declaration? I am exploring creating a profile module right now, but am struggling with methodology and how I should lay things out.
A little background, I'm using the puppet-labs java module which can be found here.
My ./modules/profile/manifests/init.pp looks like this:
class profile {
  ## Hiera Lookups
  $java_version = hiera('profile::jdk::package')

  class { 'java':
    package => $java_version,
  }
}
This works fine, but I know that I can also remove the class { 'java': ... } block and instead use include java. My question relates to two things. One, if I wanted to use an include statement for whatever reason, how could I still pass the package version from Hiera to it? Second, is there a preferred method of doing this? Is include something I really shouldn't be using, or are there advantages and disadvantages to each method?
My long-term goal is to build out profile-like modules for my environment. Likely I would have a default profile that applies to all of my servers, and then profiles for different application loadouts. I could include the profiles into a role and apply things to my individual nodes at that level. Does this make sense?
Thanks!
When should I be using an include vs a class declaration?
Where a class declares another, internal-only class that belongs to the same module, you can consider using a resource-like class declaration. That leverages your knowledge of the implementation details of the module, as you need to be able to prove that no other declaration of the class in question will be evaluated before the resource-like one. If ever that constraint is violated, catalog building will fail.
Under all other circumstances, you should use include or one of its siblings, require and contain.
One, if I wanted to use an include statement for whatever reason, how could I still pass the package version from hiera to it?
Exactly the same way you would specify any other class parameter via Hiera. I already answered that for you.
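For example, with automatic data binding you put the value in your Hiera data under the key <classname>::<parameter>. A hypothetical YAML entry for the java class (the package name is only an illustration):
# in your Hiera data, e.g. common.yaml
java::package: 'openjdk-8-jdk'
With that in place, the manifest needs nothing more than include java.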
Second, is there a preferred method of doing this?
Yes, see above.
Is the include something I really shouldn't be using, or are there advantages and disadvantages to each method?
The include is what you should be using. This is your default, with require and contain as alternatives for certain situations. The resource-like declaration syntax seemed good to the Puppet team when they first introduced it, in Puppet 2.6, along with parameterized classes themselves. But it turns out that that syntax introduced deep design problems into the language, and it has been a source of numerous bugs and headaches. Automatic data binding was introduced in Puppet 3 in part to address many of those, allowing you to assign values to class parameters without using resource-like declarations.
The resource-like syntax has the single advantage -- if you want to consider it one -- that the parameter values are expressed directly in the manifest. Conventional Puppet wisdom holds that it is better to separate data from code, however, so as to avoid needing to modify manifests as configuration requirements change. Thus, expressing parameter values directly in the manifest is a good idea only if you are confident that they will never change. The most significant category of such cases is when a class has read data from an external source (e.g. looked it up via Hiera) and wants to pass those values on to another class.
The resource-like syntax has the great disadvantage that if a resource-like declaration of a given class is evaluated anywhere during the construction of a catalog for a given target node, then it must be the first declaration of that class that is evaluated. In contrast, any number of include-like declarations of the same class can be evaluated, whether instead of or in addition to a resource-like declaration.
Classes are singletons, so multiple declarations have no more effect on the target node than a single declaration. Allowing them is extremely convenient. Evaluation order of Puppet manifests is notoriously hard to predict, however, so if there is a resource-like declaration of a given class somewhere in the manifest set, it is very difficult in the general case to ensure that it is the first declaration of that class that is evaluated. That difficulty can be managed in the special case I described above. This falls into the more general category of evaluation-order dependencies, and you should take care to ensure that your manifest set is free of those.
There are other issues with the resource-like syntax, but none as significant as the evaluation-order dependency.
Clarification with respect to automated data binding
Automated data binding, mentioned above, associates keys identifying class parameters with corresponding values for those parameters. Compound values are supported if the back end supports them, which the default YAML back end in fact does. Your comments on this answer suggest that you do not yet fully appreciate these details, and in particular that you do not recognize the significance of keys identifying (whole) class parameters.
I take your example of a class that could on one hand be declared via this resource-like declaration:
class { 'elasticsearch':
  config => { 'cluster.name' => 'clustername', 'node.name' => 'nodename' }
}
To use an include-like declaration instead, we must provide a value for the class's "config" parameter in the Hiera data. The key for this value will be elasticsearch::config (<fully-qualified classname> :: <parameter name>). The associated value is wanted puppet-side as a hash (a.k.a. "associative array", a.k.a. "map"), so that's how it is specified in the YAML-format Hiera data:
elasticsearch::config:
  "cluster.name": "clustername"
  "node.name": "nodename"
The hash nature of the value would be clearer if there were more than one entry. If you're unfamiliar with YAML, then it would probably be worth your while to at least skim a primer, such as the one at yaml.org.
With that data in place, we can now declare the class in our Puppet manifests simply via
include 'elasticsearch'

V8 JavaScript Object vs Binary Tree

Is there a faster way to search data in JavaScript (specifically on V8 via node.js, but without c/c++ modules) than using the JavaScript Object?
This may be outdated, but it suggests that a new class is dynamically generated for every single property. That made me wonder whether a binary tree implementation might be faster; however, this does not appear to be the case.
The binary tree implementation isn't well balanced, so it might get better with balancing (only the first 26 values are roughly balanced, by hand).
Does anyone have an idea on why or how it might be improved? On another note: does the dynamic class notion mean there are actually ~260,000 properties (in the jsperf benchmark test of the second link) and subsequently chains of dynamic class definitions held in memory?
V8 uses the concept of 'maps', which describe the layout of the data in an object.
These maps can be "fast maps", which specify a fixed offset from the start of the object at which a particular property can be found, or they can be "dictionary maps", which use a hashtable to provide a lookup mechanism.
Each object has a pointer to the map that describes it.
Generally, objects start off with a fast map. When a property is added to an object with a fast map, the map is transitioned to a new one which describes the location of the new property within the object. The object is re-allocated with enough space for the new data item if necessary, and the object's map pointer is set to the new map.
The old map keeps a record of the transitions from it, including a pointer to the new map and a description of the property whose addition caused the map transition.
If another object which has the old map gets the same property added (which is very common, since objects of the same type tend to get used in the same way), that object will just use the new map - V8 doesn't create a new map in this case.
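The maps themselves are not visible from JavaScript, but the transition chain can be sketched like this (the comments describe what V8 does behind the scenes):
function makePoint(x, y) {
  const p = {};   // starts with an empty fast map
  p.x = x;        // transition: {} -> {x}
  p.y = y;        // transition: {x} -> {x, y}
  return p;
}
const a = makePoint(1, 2);
const b = makePoint(3, 4);
// a and b took the same transitions, so they share the same final map,
// and a read like a.x can be compiled down to a fixed-offset load.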
However, once the number of properties goes over a certain threshold (in fact, the current metric has to do with the storage space used, not the actual number of properties), the object is changed to use a dictionary map. At this point the object is re-written using a hashtable. In general, it won't undergo any more map transitions - any more properties that are added will just go in the hashtable.
Fast maps allow V8 to generate optimized code (using Crankshaft) where the offset of a property within an object is hard-coded into the machine code. This makes it very fast for cases where it can do this - it avoids the need for doing any lookup.
Obviously, the generated machine code is then dependent on the map - if the object's data layout changes, the code has to be discarded and re-optimized when necessary. V8 has a type profiling mechanism which collects information about what the types of various objects are during execution of unoptimized code. It doesn't trigger optimization of the code until certain stability constraints are met - one of these is that the maps of objects used in the function aren't changing frequently.
Here's a more detailed description of how this stuff works.
Here's a video where one of the lead developers of V8 describes stuff like map transitions and lots more.
For your particular test case, I would think that it goes through a few hundred map transitions while properties are being added in the preparation loop, then eventually transitions to a dictionary-based object. It certainly won't go through 260,000 of them.
Regarding your question about binary trees: a properly sized hashtable (with a sensible hash function and a significant number of objects in it) will always outperform a binary tree for a use-case where you're just searching, as your test code seems to do (all of the insertion is done in the setup phase).
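A rough sketch of the work per lookup in each case (illustrative only, not a benchmark):
// Hashtable lookup: hash the key once, probe a bucket - O(1) on average
const table = { alpha: 1, beta: 2, gamma: 3 };
console.log(table["beta"]); // 2
// Binary tree lookup: one comparison per level - O(log n) even when balanced
function find(node, key) {
  while (node) {
    if (key === node.key) return node.value;
    node = key < node.key ? node.left : node.right;
  }
  return undefined;
}
const root = {
  key: "beta", value: 2,
  left:  { key: "alpha", value: 1, left: null, right: null },
  right: { key: "gamma", value: 3, left: null, right: null },
};
console.log(find(root, "beta")); // 2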

Using interfaces directly in C#

I recently read in "Professional C# 4 and .NET 4" that:
You can never instantiate an interface.
But periodically I see things like this:
IQuadrilateral myQuad;
What are the limitations in using interfaces directly (without having a class inherit from the interface)? How could I use such objects (if they can even be called objects)?
For example instead of using a Square class that derives from IQuadrilateral, to what extent could I get away with creating an interface like IQuadrilateral myQuad?
Since interfaces don't implement methods, I don't think I could use any methods with them. I thought interfaces didn't have fields (only properties), so I'm not sure how I could store data with them.
The answer is simple: you can't instantiate an interface.
The example you provided is not an example of instantiating an interface; you are just defining a local variable of the type IQuadrilateral.
To instantiate the interface, you would have to do this:
IQuadrilateral myQuad = new IQuadrilateral();
And that isn't possible since IQuadrilateral does not have a constructor.
This is perfectly valid:
IQuadrilateral myQuad = new Square();
But you aren't instantiating IQuadrilateral, you are instantiating Square and assigning it to a variable of the type IQuadrilateral.
The methods available in myQuad would be the methods defined in the interface, but the implementation would be based on the implementation in Square. And any additional methods in Square that are not part of the IQuadrilateral interface would not be available unless you cast myQuad to a Square variable.
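A short sketch of that (IQuadrilateral and Square here are hypothetical types matching the question):
using System;

interface IQuadrilateral
{
    double Area();
}

class Square : IQuadrilateral
{
    public double Side { get; set; }
    public double Area() => Side * Side;    // required by the interface
    public double Perimeter() => 4 * Side;  // extra member, not on the interface
}

class Program
{
    static void Main()
    {
        IQuadrilateral myQuad = new Square { Side = 2 };
        Console.WriteLine(myQuad.Area());                // OK: declared on the interface
        // myQuad.Perimeter();                           // compile error: not on the interface
        Console.WriteLine(((Square)myQuad).Perimeter()); // OK after a cast to Square
    }
}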
You can't create an instance of an interface.
The code you showed defines a variable of type IQuadrilateral. The actual instance this variable points to will always be of a concrete class implementing this interface.
Background Knowledge
Think of an interface as a contract. In a contract between two people, it defines what is expected from the parties involved. In programming, it works the same way: the interface defines what to expect, what must exist for you to conform to that interface. Since it only defines what to expect, it doesn't itself provide the implementation; the "code under the covers", so to speak, does.
A property behaves like a field, but allows you to intercept when someone assigns a value to it or reads the value. You can also deny reading or writing to it; that is your choice when you define the property. Interfaces work with properties instead of fields because of this: the "contract" just defines what property should be there (name and type) and whether it should allow read or write access, and it leaves it up to the implementer to provide this.
Take for example the IEnumerator interface from the .NET framework. This interface was designed to allow iteration over a collection of objects. The purpose is not to change items, or randomly access them, it's just for getting object A and moving to the next, and the next, and the next, as many times as needed. Many collection type classes implement this: Queue, ArrayList, SortedList, Stack, etc. All these types of objects store many objects and now they all share the common "contract": the ability to iterate one-by-one over them.
However, you can see that the IEnumerator interface has a MoveNext() method declared. Why? Because the items may not be served in the same manner. For example, people will generally access an ArrayList from the first item to the last, but a Stack was designed the opposite way: you access the most recently added object first.
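For example, you can drive an IEnumerator by hand (normally foreach does this for you); this sketch uses the generic Stack<T> for brevity:
using System;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        var stack = new Stack<int>();
        stack.Push(1);
        stack.Push(2);
        stack.Push(3);

        // The "contract" is just MoveNext() and Current, no matter how
        // the collection chooses to serve its items.
        IEnumerator<int> e = stack.GetEnumerator();
        while (e.MoveNext())
            Console.WriteLine(e.Current); // prints 3, 2, 1: last in, first out
    }
}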
Questions Answered
With all this knowledge, the limitation of declaring a variable as the interface type as opposed to the class type that implemented the interface is that you only get access to what the interface (the contract) says should be there. The benefit though is that you can assign to this variable any class type that implements the interface.

Usage of a correct collection Type

I am looking for a native, or a custom-type that covers the following requirements:
A Generic collection that contains only unique objects like a HashSet<T>
It implements INotifyCollectionChanged
It implements IEnumerable<T> (duh) and must be wrappable by a ReadOnlyCollection<T> (duh, duh)
It should work with both small and large numbers of items (perhaps changing inner behaviour?)
The signature of the type must be like UniqueList<T> (like a list, not a key/value pair)
It does not have to be sortable.
Searchability is not a "must-have".
The main purpose of this is to set up a small mesh/network between related objects.
So this network can only contain unique objects, and there has to be a mechanism that notifies the application when changes to the collection happen. Since it is for a proof of concept, the scope is purely within the assembly (no DBs or filesystems are of any importance).
What is a proper native type for this or what are the best ingredients to create a composite?
Sounds like you could just wrap HashSet<T> in your own type extremely easily, just to implement INotifyCollectionChanged. You can easily proxy everything you need - e.g. GetEnumerator can just call set.GetEnumerator() etc. Implementing INotifyCollectionChanged should just be a matter of raising the event when an element is added or removed. You probably want to make sure you don't raise the event if either you add an element which is already present or remove an element which isn't already present. HashSet<T>.Add/Remove both return bool to help you with this though.
I wouldn't call it UniqueList<T> though, as that suggests list-like behaviour such as maintaining ordering. I'd call it ObservableSet<T> or something like that.
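A minimal sketch of that wrapper (the type name and event details are illustrative, not a complete implementation):
using System.Collections;
using System.Collections.Generic;
using System.Collections.Specialized;

public class ObservableSet<T> : IEnumerable<T>, INotifyCollectionChanged
{
    private readonly HashSet<T> set = new HashSet<T>();

    public event NotifyCollectionChangedEventHandler CollectionChanged;

    public int Count => set.Count;

    public bool Add(T item)
    {
        // HashSet<T>.Add returns false for duplicates, so we only
        // raise the event when the set actually changed.
        if (!set.Add(item)) return false;
        CollectionChanged?.Invoke(this,
            new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Add, item));
        return true;
    }

    public bool Remove(T item)
    {
        if (!set.Remove(item)) return false;
        CollectionChanged?.Invoke(this,
            new NotifyCollectionChangedEventArgs(NotifyCollectionChangedAction.Remove, item));
        return true;
    }

    public IEnumerator<T> GetEnumerator() => set.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}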
