Using puppet to build from source

Using puppet to build from source - puppet

How can I use puppet to build from source without using multiple Exec commands?. Do we have modules for it on forge that I could use?

It's possible to use Puppet to build applications from source without using execs, possibly with a custom written type and provider. Otherwise, yes, it'd have to be a few different exec resources with onlyif, creates etc. statements to stop them running every time the agent ran.
Puppet's model of configuration management is known as a desired state model: you define the end state of the system and let the system. This is why exec's are generally avoided in Puppet: they don't fit a desired state model. It also makes things like updating the application, or dealing with unknowns like a partial failure of the compilation that creates a required file.
In my opinion, I would not recommend using configuration management to build applications from source at all. There are a few issue inherent with doing so (this is not just for Puppet, but most config management languages):
Slower runs, as running the compilation can be longer and detecting that it's complete is normally a slightly trickier tasks
Issues with half complete state or failure: if the compilation breaks halfway through it's both harder to detect and resolve
Making the compilation idempotent: You have to wrap the command in logic that detects if the installation has already been done. However, this is difficult, as things like the detection of a flag file or particular binary could occur even when the compilation ends in failure
Upgrading or changing: There's no easy way to upgrade or change the application. A package would be easier to do this with.
This sounds like something that would be better served by packaging, using tools such as FPM or just native package building tools such as rpmbuild.

Related

chef get converged-attributes without deploying

We are using chef to deploy all of our stacks.
I need to build a runbook for each environment we deploy.
I have been parsing the environment, node and recipe files but the more information I need to extract, the more complex it becomes because I am converging the attributes in my application.
I would like to use the converged-attributes.json file produced by our chef deployment without deploying any code because we can't deploy production to the build runbooks.
We also plan to build the runbook before the environment exists to provide configuration information to the DevOps team (e.g. memory requirements, ports, etc.).
Is there a way to use any of the chef/knife components or libraries to do the following?
Converge the attributes for each node
Write the converged attributes to a location my application can access on Mac OSX.
Quit before attempting to access any servers

This is not possible in the generic case. Chef is executable code at heart and the only way to fully compute the side effects is to actually execute it. This is what chef-client does, you can't "converge" the node externally so step 3 doesn't really make any sense. You could try to use Why Run mode but we really don't recommend it and are probably going to remove the feature as it does more harm than good most of the time. Roles and environments are static data so you can parse and manipulate those, but cookbooks are code and have to be run in-place to know exactly what they will do.

How to call a puppet provider method from puppet manifest?

I'm using the ibm_installation_manager module from the puppet forge and it is a bit basic because IBM wrote Installation Manager in a time where idempotency was done much.
ref: https://forge.puppet.com/puppetlabs/ibm_installation_manager
As such it does not cater nicely for upgrades - so the module will not detect if an upgrade is needed, stop existing processes, do the upgrade and then start the processes again. It will just detect if an upgrade is needed and try to install the desired version and if that constitutes an upgrade that's great, but it will probably fail due to running instances.
So I need to implement some "stop processes" pre-upgrade functionality.
I need to mention at this point I'm new to ruby and fairly new to puppet.
The provider that the module uses (imcl.rb) has an exists method.
The ideal way for me to detect if an upgrade is going to happen (and stop the instances if it is) would be for my puppet manifest to be able to somehow call the exists method. Is this possible?
Or how would you approach this problem?
Something like imcl.exists(ibm_pkg["my_imcl_pkg_resource"])

The ideal way for me to detect if an upgrade is going to happen (and stop the instances if it is) would be for my puppet manifest to be able to somehow call the exists method. Is this possible?
No, it is not possible, at least not in any useful way. Your manifests describe how to build a catalog of resources describing the target state of the machine. In a master / agent setup, this happens on the master. The catalog is then used as input to a separate step, in which it is transferred to the target machine and applied there. It is in this second step that providers are engaged.
To the extent that you want the contents of your catalogs to be influenced by the current state of the target machine, the Puppet mechanism for that is to convey the needed state details to the catalog builder in the form of facts. It is relatively straightforward to add your own facts. Indeed, there are at least two distinct, non-exclusive mechanisms, going under the names "external facts" and "custom facts".

How to be able to "move" all necessary libraries that a script requires when moving to a new machine

We work on scientific computing and regularly submit calculations to different computing clusters. For that we connect using linux shell and submitting jobs through SGE, Slurm, etc (it depends on the cluster). Our codes are composed of python and bash scripts and several binaries. Some of them depend on external libraries such as matplotlib. When we start to use a new cluster, it is a nightmare since we need to tell the admins all the libraries we need, and sometimes they can not install all of them, or they only have old versions that can not be upgraded. So we wonder what could we do here. I was wondering if we could somehow "pack" all libraries we need along with our codes. Do you think it is possible? Otherwise, how could we move to new clusters without the need for admins to install anything?

The key is to compile all the code you need by yourself, using the compiler/library/MPI toolchains installed by the admins of the clusters, so that
your software is compiled properly for the cluster hardware, and
you do not depend on the admin to install the software.
The following are very useful in this case:
Ansible, to upload/manage configuration files, rc files, set permissions, compile your binaries, etc. and deploy a new environment easily on new clusters
Easybuild to install your version of Python with all the needed dependencies, and install other scientific software thanks to the community supported build procedures
CDE to build a package with all dependencies for your binaries on your laptop and use it as-is on the clusters.
More specifically for Python, you can use
virtual envs to setup a consistent set of Python modules across all clusters, independently from the modules already installed; or
Anaconda or Canopy to use a Python scientific distribution
to have a consistent Python install across all clusters.

Don't get me wrong, but I think what you have to do so: stop behaving like amateurs.
Meaning: the integrity of your "system configuration" is one of the core assets of your "business". And you just told us that you are basically unable of easily re-producing your system configuration.
So, the real answer here can't be a recommendation to use this or that technology. The real answer is: you, and the other teams involved in running your operations need to come together and define a serious strategy how to fix this.
Maybe you then decide that the way to go is that your development team provides Docker buildfiles, so that your operations team can easily create images on new machines. Or you decide that you need to use something like ansible to enable centralized control over your complete environment.

That's what venv is for, it allows you to create a portable customized environment easily, with exactly what you need and nothing more.

I completely agree with https://stackoverflow.com/users/1531124/ghostcat
but here is the really bad answer that will cause you a lot of problems in near future!!!:
if you need some dynamic library and you are not planning to upgrade them in future, you can try copying all needed libs to a folder in your app and use an script to launch the app:
#!/bin/sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/your/lib/folder
./myAPP
but keep in mind that this is bad practice.

Create a chroot image, like here - click. Install everything you need and then you can just chroot into it on any machine.

I work on scientific clusters as well, and you are going to find that wherever you go.
I would only rely on the admins on installing the most basic stuff. That is:
- Software necessary to build your software or run the most basic stuff: compilers and most basic utilities (python, perl, binutils, autotools, cmake, etc.).
Software libraries that make use of I/O devices: MPI, file I/O libraries...
A queue system (they already have it most of the time).
Environment modules. This is not a must, but it really helps you get the job done, specially if you mess with different library versions or implementations (that's my case, for example).
From that point on, you can build and install on your own directories all the software you use most of the time.
This does not mean that you cannot ask an admin to install some libraries. If you feel that many people is going to benefit from that, then you should request its installation. In addition, you may need some specific version or some special features which are not used most of the time, but you really need them. A very good example is with BLAS libraries (basic lineal algebra subroutines):
You have lots of BLAS implementations available: the original BLAS, Intel MKL, OpenBLAS, ATLAS, cuBLAS
If that is not enough, the open source versions usually offer multiple configuration options: serial version, parallel version with PThreads, parallel version with OpenMP, parallel version with MPI...
In my particular case, most of the software that I felt was necessary for many users in the cluster ended up being installed by the admins without any problem (either me or other users requested it), but you also have to keep in mind that in a cluster there can be many users and a single person/team is not able to attend the specific requirements you need, specially if you are able to do so.

I think you want to containerize your application in some way. Two main options (because docker/rkt and similar things are way too heavyweight for your task if I understand it correctly) in my opinion are runc and snappy.
Runc relies on OCI runtime specification, you need to create an environment (that is very similar to chroot environment in that you need to copy everything you software uses in one directory) and then you'll be able to run your application with runc tool. Runc itself is just one binary, at the moment it requires root privileges to run (hello, cluster admins), but there are patches at least partly solving that, so if you build your own runc and there are no blocking things wrt root privilege requirements you may be able to run your application with no administration overhead at all.
Snappy is similar in that you need to prepare a snap package for your application, this time using snapcraft as an assistant tool. Snappy is probably a bit easier in creating an application image and IMO is certainly better for long-term support because it clearly separates your application from the data (kinda W^X, application image is a read-only squashfs file and application can only write to a limited set of directories). But at the moment it will require your cluster admins to install snapd and to perform some operations like snap installation that require root privileges. Still, it should be better than your current situation, because that's just one non-intrusive package to install.
If these tools don't fit for some reason, there is always an option to make something of your own. That won't be easy and there are many subtle details that can bite you when doing that, but it can be done, compile all of your dependencies and applications into some path, create wrapper scripts to set up PATH and LD_LIBRARY_PATH environment for your components and then bring that directory into the new cluster, run wrapper scripts instead of target binaries and that's it. It's similar to what XAMPP does, they have quite a number of integrated things packaged into one directory that works across many distributions.
update
Let's also add AppImage into the mix, theoretically it can be a savior for your case, as it specifically does not require root privileges. It's kinda inbetween Snappy and rolling your own, as you need to prepare your application directory yourself (snappy can manage some of dependencies with snapcraft when you just specify "I need this Ubuntu package"), add appropriate metadata and then it can be packaged into single executable.

Reinstalling package if specific file is missing

I need to reinstall a package with Puppet if a specific file is missing. How can I achive this? In never versions of Puppet is a the onlyif parameter available but we still use Puppet 3.1.

Do not mistake Puppet for some kind of baroque script engine. Reinstalling a package is an action, whereas Puppet classes, resources, and DSL in general focus on describing state. Puppet's general paradigm is that you tell it what state you want, and it figures out for itself what actions, if any, to take to achieve that state. Even Exec resources are best conceptualized and used as representations of state to be managed.
Puppet Package resources do not recognize a state of "installed but broken", or any similar thing, and therefore they have no sense of a need to reinstall (as opposed to updating) a package, nor any mechanism for doing so.
If your concern is with only one specific file that you expect the package to provide, then you should consider putting that file under direct management (via a File resource) instead of relying on a package reinstallation to recover it if it should go missing.
However, you should consider what flaw in your system's configuration or security policy affords any plausible likelihood that random system files will unexpectedly go missing. You should especially do this if you are using the file in question as a canary to detect broader damage.
Nevertheless, if you remain firm about doing what you ask, then an Exec resource can help. The details of what you would need are unclear, but you can take this as a pattern:
exec { 'Ensure package mypackage good':
command => '/usr/bin/yum -y reinstall mypackage',
creates => '/path/to/some_file',
require => Package['mypackage']
}
The Exec type also has unless and onlyif parameters (including in Puppet 3.1), but the creates parameter serves the specific case of using the presence or absence of a file to determine whether the command needs to be run, which is exactly what you want.
Note also the require parameter. This presumes that package 'mypackage' is under Puppet management (not shown), and guarantees that the Exec will not be synced before the package. That way, if the package is altogether absent, you can be sure that Puppet will install it (supposing that's what you have specified) before testing the presence of any file it is expected to provide.

Using Vagrant, why is puppet provisioning better than a custom packaged box?

I'm creating a virtual machine to mimic our production web server so that I can share it with new developers to get them up to speed as quickly as possible. I've been through the Vagrant docs however I do not understand the advantage of using a generic base box and provisioning everything with Puppet versus packaging a custom box with everything already installed and configured. All I can think of is;
Advantages of using Puppet vs custom packaged box
Easy to keep everyone up to date - Ability to put manifests under
version control and share the repo so that other developers can
simply pull new updates and re-run puppet i.e. 'vagrant provision'.
Environment is documented in the manifests.
Ability to use puppet modules defined in production environment to
ensure identical environments.
Disadvantages of using Puppet vs custom packaged box
Takes longer to write the manifests than to simply install and
configure a custom packaged box.
Building the virtual machine the first time would take longer using
puppet than simply downloading a custom packaged box.
I feel like I must be missing some important details, can you think of any more?

Advantages:
As dependencies may change over time, building a new box from scratch will involve either manually removing packages, or throwing the box away and repeating the installation process by hand all over again. You could obviously automate the installation with a bash or some other type of script, but you'd be making calls to the native OS package manager, meaning it will only run on the operating system of your choice. In other words, you're boxed in ;)
As far as I know, Puppet (like Chef) contains a generic and operating system agnostic way to install packages, meaning manifests can be run on different operating systems without modification.
Additionally, those same scripts can be used to provision the production machine, meaning that the development machine and production will be practically identical.
Disadvantages:
Having to learn another DSL, when you may not be planning on ever switching your OS or production environment. You'll have to decide if the advantages are worth the time you'll spend setting it up. Personally, I think that having an abstract and repeatable package management/configuration strategy will save me lots of time in the future, but YMMV.

One great advantages not explicitly mentioned above is the fact that you'd be documenting your setup (properly), and your documentation will be the actual setup - not a (one-time) description of how things were/may have been intended to be.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string