Setting up Puppet in the first place

I am trying to understand the best practice for setting up Puppet in the first place. Let's say I have 1,000 existing servers that need to be managed by Puppet.
Do I manually install the Puppet agent on each one, or is there a better way?
Sorry if this question is too generic; I just want to get some idea.

1,000 servers could be a lot for a single master instance; of course, it will depend on the master's specs and other factors related to the Puppet runs.
There are a few questions you need to answer first to determine how you are going to go about it, such as:
Puppet Enterprise or open source?
What is the current configuration nightmare you are trying to solve?
What is the current configuration data related to the challenge or problem you have?
What are the current business roles (e.g. web server, load balancer, database, etc.) related to the problem you have? What makes a role in terms of configuration?
I would suggest that you first start small to learn more about the Puppet DSL and its ecosystem (master, agent, PuppetDB, console/dashboard). I also recommend you start with the free 10-node Puppet Enterprise, as it will let you focus more on the problem at hand rather than on how to configure the Puppet masters and agents, how to scale them, etc.
One more thing: install the Puppet agent everywhere if you can, in noop/disabled mode, so you at least get facts, and run it in a masterless fashion using puppet apply when you need to. I find noop mode very useful because it tells you what would need to be changed, and you can enforce changes with --no-noop.
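As a rough sketch of that workflow (the manifest path below is just an example):
$ sudo puppet agent -t --noop                           # dry run: reports what would change without changing it
$ sudo puppet apply --noop /root/manifests/site.pp      # masterless dry run of a local manifest
$ sudo puppet apply --no-noop /root/manifests/site.pp   # enforce changes even if noop = true is set in puppet.conf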
Hope that will get you started.

To answer your question: yes, the Puppet agent would need to be installed on every node. If you are managing 1,000 nodes, I would assume you have your own OS image; in that case, it's best to add the agent to the image and use that image on all 1,000 nodes.
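If you can't bake it into an image, a hedged sketch of scripting the install on an RPM-based node looks like this (the repository release package, Puppet version and master hostname are examples; adjust for your OS):
$ sudo rpm -Uvh https://yum.puppet.com/puppet7-release-el-8.noarch.rpm                   # enable the Puppet package repo
$ sudo yum install -y puppet-agent                                                       # install the agent
$ sudo /opt/puppetlabs/bin/puppet config set server puppet.example.com --section agent   # point it at your master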

Related

How to call a puppet provider method from puppet manifest?

I'm using the ibm_installation_manager module from the Puppet Forge, and it is a bit basic because IBM wrote Installation Manager back when idempotency wasn't much of a concern.
ref: https://forge.puppet.com/puppetlabs/ibm_installation_manager
As such it does not cater nicely for upgrades: the module will not detect that an upgrade is needed, stop the existing processes, do the upgrade and then start the processes again. It will just try to install the desired version; if that constitutes an upgrade, great, but it will probably fail due to the running instances.
So I need to implement some "stop processes" pre-upgrade functionality.
I need to mention at this point I'm new to ruby and fairly new to puppet.
The provider that the module uses (imcl.rb) has an exists method.
The ideal way for me to detect if an upgrade is going to happen (and stop the instances if it is) would be for my puppet manifest to be able to somehow call the exists method. Is this possible?
Or how would you approach this problem?
Something like imcl.exists(ibm_pkg["my_imcl_pkg_resource"])
The ideal way for me to detect if an upgrade is going to happen (and stop the instances if it is) would be for my puppet manifest to be able to somehow call the exists method. Is this possible?
No, it is not possible, at least not in any useful way. Your manifests describe how to build a catalog of resources describing the target state of the machine. In a master / agent setup, this happens on the master. The catalog is then used as input to a separate step, in which it is transferred to the target machine and applied there. It is in this second step that providers are engaged.
To the extent that you want the contents of your catalogs to be influenced by the current state of the target machine, the Puppet mechanism for that is to convey the needed state details to the catalog builder in the form of facts. It is relatively straightforward to add your own facts. Indeed, there are at least two distinct, non-exclusive mechanisms, going under the names "external facts" and "custom facts".
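For example, a hedged sketch of an external fact (the fact name, the imcl path and the version subcommand are assumptions, not something this module provides): an executable script dropped into Facter's facts.d directory that reports the installed version, which your manifests can then compare against the desired version to decide whether to declare the "stop processes" resources.
#!/bin/sh
# /etc/puppetlabs/facter/facts.d/iim_version.sh -- must be executable
# External fact: prints key=value pairs that Facter picks up on the next run.
IMCL=/opt/IBM/InstallationManager/eclipse/tools/imcl   # assumed install location
if [ -x "$IMCL" ]; then
  echo "iim_version=$("$IMCL" version 2>/dev/null | head -n1)"
else
  echo "iim_version=absent"
fi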

How to use multiple different puppet masters from one puppet agent?

There is a need for one puppet agent to contact several different puppet masters.
Reason: there are different groups that create different and independent sets of manifests.
Possible groups and their tasks
Application Vendor: configuration of application
Security: hardening
Operations: routing tables, monitoring tools
Each of these groups should run its own puppet master - the data (manifests and appropriate data) should be strictly separated. If possible, one group should not even see / have access to the manifests of the others (we are using MAC on the puppet agent OSes).
Thoughts and ideas that all failed:
using (only) hiera is not as flexible as needed - there is the need to have different manifests.
r10k: supports more than one environment, but each environment can only access one set of manifests.
multiple identical puppet servers behind e.g. DNS round robin: this is the other way round; we need different puppet masters.
Some ways that might be possible but...
running multiple instances of the puppet agent. That 'feels' strange. Advantage: the access rights can be limited as needed (e.g. the application puppet agent can run under the application user).
patching puppet so that it can handle more than one puppet master. Disadvantage: might be some work.
using other mechanisms to split responsibility. Example: use different git-repositories. Create one puppet master. The puppet master pulls all the different repositories and serves the manifests.
My questions:
Is there a straightforward way to implement this requirement with puppet?
If not, is there some best practice how to do this?
While I think what you are trying to do here is better tackled by incorporating all of your modules and data onto a single master (and utilizing environments would effectively be the exact same situation: different masters providing different sets of modules/data), it can be achieved by implementing a standard multi-master infrastructure (one CA master for cert signing, plus multiple compile masters whose certs are signed by that CA master and which forward cert traffic to it) and configuring each master to serve whatever you need. You then end up having to specify which master you want to check in to on each run (via a cronjob or some other approach), and each check-in has the potential to change settings set by another (which rather defeats the hardening/security concept).
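On the agent side, directing certificate traffic at the single CA master is done with standard settings (hostnames here are made up):
$ sudo puppet config set server compile01.example.com --section main    # whichever compile master this node checks in to
$ sudo puppet config set ca_server ca.example.com --section main        # all certificate traffic goes to the single CA master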
I would urge you to think more deeply about how to bring your various groups' work together (e.g. separate git repos for each division's hiera data and modules, with access control) so that a central master can serve your needs (and access to that master would be the only way to get data/modules from everywhere).
This type of setup will be complex to implement, but the end result will be more reliable and maintainable. Puppet Inc. may even be able to offer consulting to help you get it right.
There are likely other approaches too, just fyi.
I've often found it convenient to multi-home a puppet agent for development purposes, because with a local puppet server you can test manifest changes instantly - there's no requirement to commit, push and r10k deploy the environment like there is if you're just using directory environments and a single (remote) puppet server.
I've found the best way to do that is to just vary the path configuration (otherwise you run into problems with e.g. the CA certs failing to verify against the other server) - a form of your "running multiple instances of puppet agents" suggestion. (I still run them all privileged, so they can all use apt package {} etc.)
For Puppet 3, I'd do this by varying the libdir with --libdir (because the ssldir was under the libdir), but now (Puppet 4+) it looks more sensible to vary the --confdir. So, for example:
$ sudo puppet agent -t # Runs against main puppet server
$ sudo puppet agent -t \
--server=puppet.dev.example.com \
--confdir=/etc/puppetlabs/puppet-dev # Runs against dev puppet server
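To avoid repeating --server on every run, you can also give the dev confdir its own puppet.conf (a hedged example; paths and hostname are illustrative):
$ sudo mkdir -p /etc/puppetlabs/puppet-dev
$ sudo puppet config set server puppet.dev.example.com --section agent --confdir /etc/puppetlabs/puppet-dev
$ sudo puppet agent -t --confdir=/etc/puppetlabs/puppet-dev   # now picks up the dev server (and its own ssldir) automatically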

Trigger puppet run on update of manifest / facts

I'm working on a tool which manages WordPress instances using puppet. The flow is the following: the user adds the data of the new WordPress installation in the web interface and then that web interface is supposed to send a message to the puppet master to tell it to deploy it to the selected machine.
Currently the setup is done via a manifest file which contains the declaration of all WordPress instances, and that is applied manually via puppet apply on the puppet agent. This brings me to my 2 questions:
Are manifests the correct way of doing this? If so, is it possible to apply them from the puppet master to a specific node instead of going to the agent?
Is it possible to automatically have a puppet run triggered once the list of instances is altered?
To answer your first question: yes, there's absolutely a way of doing this via a puppetmaster. What you have at the moment is a masterless setup, which assumes you're distributing your configuration with some kind of version control (like git) or a manual process. This is a totally legitimate way of doing things if you don't want a centralized master.
If you want to use a master, you'll need to drop your manifest in the $modulepath of your master (it varies depending on your version, you can find it using puppet config print modulepath on your master) and then point the puppet agent at the master.
If you want to go down the master route, I'd suggest following the puppet documentation which will help you get started.
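As a rough sketch of the master route (paths, hostnames and the module name are examples; the code directory shown is the Puppet 4+ default):
$ puppet config print modulepath                                                 # on the master: where modules are looked up
$ sudo cp -r wordpress/ /etc/puppetlabs/code/environments/production/modules/    # put your module/manifests where the master can find them
$ sudo puppet agent -t --server=puppetmaster.example.com                         # on the node: check in against the master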
The second question brings me to a philosophical argument of 'is this really what you want to do?'
Puppet traditionally (in my opinion) is a declarative config management tool that is designed to make your systems look a certain way. You write code that says 'this is how I want it to look' and Puppet converges the system to make it look that way. What you're looking to do is more of an orchestration task (i.e. when X, do Y). There are ways of doing this with Puppet, like using mcollective (to trigger a puppet run) driven by a webhook, but I think there are better tools for the job.
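If you do go the mcollective route, the trigger could look roughly like this (a hedged sketch assuming the mcollective-puppet-agent plugin is installed; the node name is made up):
$ mco puppet runonce -I wordpress-web01.example.com   # webhook handler kicks an immediate puppet run on one node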
I'd suggest looking at ansible, saltstack or Chef's knife tool to do deploys like this.

Using Vagrant, why is puppet provisioning better than a custom packaged box?

I'm creating a virtual machine to mimic our production web server so that I can share it with new developers to get them up to speed as quickly as possible. I've been through the Vagrant docs, but I do not understand the advantage of using a generic base box and provisioning everything with Puppet versus packaging a custom box with everything already installed and configured. All I can think of is:
Advantages of using Puppet vs custom packaged box
Easy to keep everyone up to date - ability to put manifests under version control and share the repo so that other developers can simply pull new updates and re-run puppet, i.e. 'vagrant provision'.
Environment is documented in the manifests.
Ability to use puppet modules defined in the production environment to ensure identical environments.
Disadvantages of using Puppet vs custom packaged box
Takes longer to write the manifests than to simply install and configure a custom packaged box.
Building the virtual machine the first time would take longer using puppet than simply downloading a custom packaged box.
I feel like I must be missing some important details, can you think of any more?
Advantages:
As dependencies may change over time, building a new box from scratch will involve either manually removing packages, or throwing the box away and repeating the installation process by hand all over again. You could obviously automate the installation with a bash or some other type of script, but you'd be making calls to the native OS package manager, meaning it will only run on the operating system of your choice. In other words, you're boxed in ;)
As far as I know, Puppet (like Chef) contains a generic and operating system agnostic way to install packages, meaning manifests can be run on different operating systems without modification.
Additionally, those same scripts can be used to provision the production machine, meaning that the development machine and production will be practically identical.
Disadvantages:
Having to learn another DSL, when you may not be planning on ever switching your OS or production environment. You'll have to decide if the advantages are worth the time you'll spend setting it up. Personally, I think that having an abstract and repeatable package management/configuration strategy will save me lots of time in the future, but YMMV.
One great advantage not explicitly mentioned above is the fact that you'd be documenting your setup (properly), and your documentation will be the actual setup - not a (one-time) description of how things were / may have been intended to be.

Is there really no easy way to test puppet scripts on a remote machine?

I'm experimenting with Puppet scripts for deployment.
I find the hardest part about the process of writing those scripts is iteratively testing them.
I don't want to puppet apply on my local development machine; that's liable to screw stuff up. I have a clean-slate remote box where I want to apply the scripts. I also don't see how a puppetmaster can help me; I might be using a puppetmaster at a later point for production deployments, but for now I just want to get my code working.
So I put together a quick shell script that would rsync the different directories from my local puppet module path to /tmp on the remote machine, and then run puppet apply. This is terribly inconvenient. It's slow, especially if we're talking about a syntax error.
I think what I really want is something like a puppetd <-> puppetmaster connection, where puppetd on the remote machine receives an already compiled manifest. Just an ad-hoc one over an SSH connection, without having to actually set up a puppetmaster, deal with certificates, etc. Something like puppet apply user@host.
There seems to be nothing of the sort, but how do other people deal with this? My experience of working on a Puppet script is incredibly frustrating as it is.
I'd recommend using Vagrant. If you're not testing the puppet master setup, you can use the built-in provisioner integration.
Once you have everything set up you can run vagrant provision, or just run puppet apply on the Vagrant VM.
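Concretely, the inner loop tends to look like this (assuming a Vagrantfile with the puppet provisioner configured; the manifest path uses Vagrant's default /vagrant synced folder):
$ vagrant up            # first boot, runs the provisioner
$ vagrant provision     # re-run puppet after editing manifests
$ vagrant ssh -c 'sudo puppet apply /vagrant/manifests/site.pp'   # or apply by hand inside the VM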
Here's a related article you may find helpful as well.
I would also take a look at Puppet rspec tests, using rspec-puppet and puppetlabs-spec-helper. The rspec-puppet-init will break puppet doc and Geppetto and maybe some other things due to the symlinks, and there are some issues with hiera, but the tests are otherwise easy to set up and work well, and can also be tied into Jenkins/Hudson.
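Getting started is roughly this (a hedged sketch; it assumes a module directory and a Rakefile that loads puppetlabs_spec_helper's rake tasks):
$ gem install rspec-puppet puppetlabs_spec_helper
$ cd mymodule/
$ rspec-puppet-init     # scaffolds spec/ (and creates the symlinks mentioned above)
$ rake spec             # runs the unit tests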
I usually have two levels of testing for my Puppet scripts.
Unit tests for quick feedback: Written using rspec-puppet, these compile a Puppet catalog for the class/define/etc being tested, and make assertions about it. Run locally each time I make a minor change, and on the build server each time I check in. The tests run quickly (<10 seconds), and pick up syntax and dependency issues.
Functional tests to make sure it really works: Written using Cucumber with the Aruba library. When I'm finished implementing a feature and the unit tests for it pass, these tests provision a VM (using Vagrant) with the appropriate Puppet manifest(s), log in, and make assertions about the VM's state. The tests themselves look something like:
Given I am SSHed into Vagrant box "webserver"
When I type "php --version"
Then the output should include "PHP 5.4.11"
Vagrant is the most useful environment for rapid infrastructure development that I've found. It will most closely (99%) mirror your production setup, and you can account for those tiny differences in puppet so everything works as expected. It takes about 30 minutes to get going with it and will pay you back many times over in time saved messing around with file copy scripts :)
If it's helpful to visualize, on my desktop I have 3 terminals side by side:
Terminal 1) Editing puppet manifests, classes, ruby code, etc
Terminal 2) Running 'vagrant provision' which simply does a puppet apply along with any facts you want to pass, etc.
Terminal 3) 'vagrant ssh' into the box so I can poke around as puppet is doing its work
Hope this helps!
Why don't you want to run a puppetmaster? It's created for exactly this situation.
If you absolutely cannot run a puppetmaster, then you would have to wrap your puppet calls in another script that first downloads the manifests (with curl or wget) and applies them after a successful download. Given that the puppetmaster is a fairly simple application to run, I don't see how not using it would be any better.
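A rough sketch of such a wrapper (the URL and paths are made up):
#!/bin/sh
set -e
curl -fsSL http://builds.example.com/puppet/manifests.tar.gz -o /tmp/manifests.tar.gz   # fetch the latest manifests
mkdir -p /tmp/puppet && tar -xzf /tmp/manifests.tar.gz -C /tmp/puppet
puppet apply --modulepath=/tmp/puppet/modules /tmp/puppet/manifests/site.pp             # apply only after a successful download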
I stumbled across rump while looking at another question. If you're using git, it might be useful. There's a slide deck available.
From the README.md: "Rump helps you run Puppet locally against a Git checkout."
You may be interested in citac, a toolkit for automated testing of Puppet scripts. It is available on Github: https://github.com/citac/citac
Citac systematically executes your Puppet manifest in various configurations, imitating transient system faults, different resource execution orders, and more. The generated test reports inform you about issues with non-idempotent resources, convergence-related issues, etc.
The tool uses Docker containers for execution, hence your system remains untouched while testing. State changes are tracked during execution of the Puppet script, and detailed test reports are generated.
To get an idea of which bugs the tool is able to detect, a large-scale evaluation with more than 150 public Puppet scripts has been performed. The results are available here: http://citac.github.io/eval/
Please feel free to provide feedback, pull requests, etc. Happy testing!
