Maintaining software as a user (on a cluster) - Linux

Every cluster of computers I've encountered suffers from the same problem: its software is outdated. Naturally, as a user one has the ability to install everything from source in the home directory. I was wondering if there are any tools that would allow one to install and update software within the home directory the same way package managers do in Linux distributions, i.e. with minimal pain and effort.
I have found toast, which is good, but not always reliable and up-to-date. Are there alternatives?
My particular needs at the moment are recent versions of GCC, Boost, Python, and CMake.

I recommend using a sensible distribution for your cluster nodes. Then keeping the nodes up to date can be as simple as running the package manager, which you can even do via a distributed shell on all nodes at once. And for what it is worth, my choice would be Debian or Ubuntu.

You could try nix (http://nixos.org/). I haven't used it, so I don't know if it's more up-to-date than toast.

Either use a package manager that installs/updates on all cluster nodes transparently, or install into a directory that is shared (i.e. via a network file system) across all nodes.

Related

What is the safest way to deliver an Application to novice Linux users?

My customers are novice Linux users, and so am I.
When I gave them my app packaged with Ansible, they ran into Ansible problems; when I gave them manual steps, they screwed those up too. I now have three remaining options: a Perl/Bash script, a snappy/deb/rpm package, or Linux containers. Can anyone share their experience on the safest way to see fewer problems when installing my app (written in C)?
This depends on the nature of your application. Debs, rpms etc. are all fine but depend on which distro you're using.
If it's a C application, it might make sense to build it as a static binary. That way, your users only have to download a single file and run it. It will be big, but it should work regardless of what else is installed. Otherwise, you'll have to worry about dependencies etc.
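A rough sketch of what that looks like, assuming a GNU toolchain and static versions of your libraries on the build machine (the file names are placeholders):

# Build a fully static binary so it does not rely on the target's system libraries
gcc -o myapp main.c utils.c -static
file myapp    # should report "statically linked"

Note that -static only works if static (.a) versions of every library you link against are available when you build.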
As commented before, it depends on how you deploy the product.
In general, if you have dependencies (packages you assume are already installed) or your installation is complex, use rpm or deb.
However, if you target multiple platforms, bear in mind you will have at least two releases (one rpm and one deb...)
If configuration and installation are simple, you can just give them an install script.
If your application requires a specific environment with specific configuration/packages, I'd consider containers, although I've never done that personally.

How to be able to "move" all necessary libraries that a script requires when moving to a new machine

We work on scientific computing and regularly submit calculations to different computing clusters. For that we connect using a Linux shell and submit jobs through SGE, Slurm, etc. (it depends on the cluster). Our codes are composed of Python and Bash scripts and several binaries. Some of them depend on external libraries such as matplotlib. When we start to use a new cluster, it is a nightmare, since we need to tell the admins all the libraries we need, and sometimes they cannot install all of them, or they only have old versions that cannot be upgraded. So we wonder what we could do here. I was wondering if we could somehow "pack" all the libraries we need along with our codes. Do you think that is possible? Otherwise, how could we move to new clusters without needing the admins to install anything?
The key is to compile all the code you need by yourself, using the compiler/library/MPI toolchains installed by the admins of the clusters, so that
your software is compiled properly for the cluster hardware, and
you do not depend on the admin to install the software.
The following are very useful in this case:
Ansible, to upload/manage configuration files, rc files, set permissions, compile your binaries, etc. and deploy a new environment easily on new clusters
EasyBuild to install your version of Python with all the needed dependencies, and install other scientific software thanks to the community-supported build procedures
CDE to build a package with all dependencies for your binaries on your laptop and use it as-is on the clusters.
More specifically for Python, you can use
virtual envs to set up a consistent set of Python modules across all clusters, independently of the modules already installed; or
Anaconda or Canopy to use a Python scientific distribution
to have a consistent Python install across all clusters.
Don't get me wrong, but I think this is what you have to do: stop behaving like amateurs.
Meaning: the integrity of your "system configuration" is one of the core assets of your "business". And you just told us that you are basically unable to easily reproduce your system configuration.
So, the real answer here can't be a recommendation to use this or that technology. The real answer is: you, and the other teams involved in running your operations, need to come together and define a serious strategy for how to fix this.
Maybe you then decide that the way to go is for your development team to provide Docker build files, so that your operations team can easily create images on new machines. Or you decide that you need to use something like Ansible to enable centralized control over your complete environment.
That's what venv is for: it allows you to create a portable, customized environment easily, with exactly what you need and nothing more.
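As a minimal sketch, assuming Python 3 is available on the cluster (the environment path and package names below are just examples):

# Create and activate a self-contained environment in your home directory
python3 -m venv "$HOME/envs/myproject"
source "$HOME/envs/myproject/bin/activate"
pip install matplotlib numpy    # install only what your scripts need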
I completely agree with https://stackoverflow.com/users/1531124/ghostcat
but here is the really bad answer that will cause you a lot of problems in the near future:
If you need some dynamic libraries and you are not planning to upgrade them in the future, you can try copying all the needed libs into a folder inside your app and use a script to launch the app:
#!/bin/sh
# Add the bundled libraries to the dynamic linker's search path before launching
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/your/lib/folder
./myAPP
but keep in mind that this is bad practice.
Create a chroot image. Install everything you need into it, and then you can just chroot into it on any machine.
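A rough sketch of that flow on a Debian-based system, assuming debootstrap is available (the paths and suite name are placeholders); note that both debootstrap and chroot itself normally require root on the target machine:

# Build a minimal root filesystem, then enter it
sudo debootstrap stable /srv/mychroot http://deb.debian.org/debian
sudo chroot /srv/mychroot /bin/bash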
I work on scientific clusters as well, and you are going to find that wherever you go.
I would only rely on the admins for installing the most basic stuff. That is:
Software necessary to build your software or run the most basic stuff: compilers and the most basic utilities (Python, Perl, binutils, autotools, CMake, etc.).
Software libraries that make use of I/O devices: MPI, file I/O libraries...
A queue system (they already have it most of the time).
Environment modules. This is not a must, but it really helps you get the job done, especially if you mess with different library versions or implementations (that's my case, for example).
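For reference, typical Environment Modules usage looks roughly like this (the module names and versions are site-specific examples, not something every cluster will have):

module avail                       # list what the admins provide
module load gcc/12.2 openmpi/4.1   # pick the toolchain you want to build against
module list                        # show what is currently loaded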
From that point on, you can build and install on your own directories all the software you use most of the time.
This does not mean that you cannot ask an admin to install some libraries. If you feel that many people are going to benefit from that, then you should request its installation. In addition, you may need some specific version or some special features which are not used most of the time, but which you really need. A very good example is BLAS libraries (Basic Linear Algebra Subprograms):
You have lots of BLAS implementations available: the original BLAS, Intel MKL, OpenBLAS, ATLAS, cuBLAS
If that is not enough, the open source versions usually offer multiple configuration options: serial version, parallel version with PThreads, parallel version with OpenMP, parallel version with MPI...
In my particular case, most of the software that I felt was necessary for many users in the cluster ended up being installed by the admins without any problem (either I or other users requested it), but you also have to keep in mind that in a cluster there can be many users, and a single person/team is not able to attend to every specific requirement you have, especially if you are able to handle it yourself.
I think you want to containerize your application in some way. Two main options (because docker/rkt and similar things are way too heavyweight for your task if I understand it correctly) in my opinion are runc and snappy.
Runc relies on the OCI runtime specification; you need to create an environment (very similar to a chroot environment in that you need to copy everything your software uses into one directory) and then you'll be able to run your application with the runc tool. Runc itself is just one binary. At the moment it requires root privileges to run (hello, cluster admins), but there are patches at least partly solving that, so if you build your own runc and there are no blocking issues with the root privilege requirements, you may be able to run your application with no administration overhead at all.
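A rough sketch of the runc flow, assuming you have a runc binary available; the bundle directory and container name below are just placeholders:

# Assemble an OCI bundle: a rootfs with everything the app needs, plus a config
mkdir -p bundle/rootfs
cp -a /path/to/your/app-tree/. bundle/rootfs/
cd bundle
runc spec            # generates a default config.json you can then edit
sudo runc run myapp  # stock runc still wants root; rootless builds may not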
Snappy is similar in that you need to prepare a snap package for your application, this time using snapcraft as an assistant tool. Snappy is probably a bit easier in creating an application image and IMO is certainly better for long-term support because it clearly separates your application from the data (kinda W^X, application image is a read-only squashfs file and application can only write to a limited set of directories). But at the moment it will require your cluster admins to install snapd and to perform some operations like snap installation that require root privileges. Still, it should be better than your current situation, because that's just one non-intrusive package to install.
If these tools don't fit for some reason, there is always the option of making something of your own. That won't be easy, and there are many subtle details that can bite you, but it can be done: compile all of your dependencies and applications into some path, create wrapper scripts to set up the PATH and LD_LIBRARY_PATH environment for your components, then bring that directory to the new cluster and run the wrapper scripts instead of the target binaries, and that's it. It's similar to what XAMPP does: they have quite a number of integrated things packaged into one directory that works across many distributions.
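A minimal sketch of such a wrapper, assuming your whole tree is unpacked somewhere under your home directory (the directory layout and binary name are hypothetical):

#!/bin/sh
# Resolve the install root relative to this script, so the tree can live anywhere
BASE="$(cd "$(dirname "$0")/.." && pwd)"
export PATH="$BASE/bin:$PATH"
export LD_LIBRARY_PATH="$BASE/lib:$LD_LIBRARY_PATH"
exec "$BASE/bin/myrealbinary" "$@"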
Update
Let's also add AppImage into the mix; theoretically it can be a savior for your case, as it specifically does not require root privileges. It's kind of in between Snappy and rolling your own, as you need to prepare your application directory yourself (snappy can manage some of the dependencies with snapcraft when you just specify "I need this Ubuntu package"), add the appropriate metadata, and then it can be packaged into a single executable.
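If you go that route, the final packaging step looks roughly like this, assuming appimagetool is available and MyApp.AppDir already follows the AppDir layout (an AppRun entry point, a .desktop file and an icon); the names are placeholders:

appimagetool MyApp.AppDir MyApp-x86_64.AppImage
./MyApp-x86_64.AppImage    # the result is a single self-contained executable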

Distributing and Updating Software Applications to a Linux Environment

Currently I'm manually distributing and updating two applications over 50 computers running CentOS 6.5 and Ubuntu 14.04. Each time a new version is available for either of my applications, I have to copy all the files and update them on all the computers manually. It's very time-consuming and frustrating.
To avoid this manual process across 50 computers, I'd like to maintain a central server that contains the latest versions of the applications, so that whenever I need to install or update I can just type a command on the client PC, the way we install software in CentOS and Ubuntu:
in Ubuntu
sudo apt-get install vlc
and in CentOS
sudo yum install vlc
One of the programs is written in Java and the other is written in Python.
I googled it and can't find any good, useful source on how to do this.
If someone has already done this or knows how to achieve it, please help.
You need to create packages to make this happen.
Ubuntu uses the Debian package format, so you can use Debian's New Maintainer's Guide, which is the canonical tutorial on how to create a Debian package. It makes the assumption that you're going to upload the package to Debian, which in your case isn't true, but that just means you need to skip some sections of the document.
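For a package you only distribute internally, the bare-bones version of that process can be as small as this sketch (the package name, paths and control fields are placeholders):

# Lay out the files exactly as they should appear on the target system
mkdir -p myapp_1.0-1/DEBIAN myapp_1.0-1/usr/bin
cp myapp myapp_1.0-1/usr/bin/
cat > myapp_1.0-1/DEBIAN/control <<'EOF'
Package: myapp
Version: 1.0-1
Section: utils
Priority: optional
Architecture: amd64
Maintainer: You <you@example.com>
Description: My application
EOF
dpkg-deb --build myapp_1.0-1    # produces myapp_1.0-1.deb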
For RPM, there isn't such a document AFAIK, but there is the book 'Maximum RPM' (which unfortunately is somewhat outdated), and Fedora has augmented that with some guidelines and best practices which they've put on their wiki. Since RHEL is created by forking Fedora and stabilizing that, and since CentOS is based on RHEL, what goes for Fedora goes for CentOS, too.
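For a first RPM, a minimal sketch of the workflow looks something like this, assuming the rpmdevtools and rpm-build packages are installed and your source tarball is in ~/rpmbuild/SOURCES (all names and spec contents below are illustrative):

rpmdev-setuptree                      # creates the ~/rpmbuild directory layout
cat > ~/rpmbuild/SPECS/myapp.spec <<'EOF'
Name:     myapp
Version:  1.0
Release:  1%{?dist}
Summary:  My application
License:  Proprietary
Source0:  myapp-1.0.tar.gz

%description
My application.

%prep
%setup -q

%build
make %{?_smp_mflags}

%install
make install DESTDIR=%{buildroot}

%files
/usr/bin/myapp
EOF
rpmbuild -ba ~/rpmbuild/SPECS/myapp.spec   # builds the source and binary RPMs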
These methods will create packages manually, which is always the best way and will result in the least problems afterwards. However, they take time. If you don't want to spend that time, there are also a few options to generate packages which will automate part or all of the job for you. Personally, however, I'm not a fan of these methods and therefore wouldn't recommend them.
Finally, another option is to not create packages at all, but to use a configuration management system like Puppet to automate the deployment. It's even available in Ubuntu and EPEL.
Edit: I notice you may actually be asking about creating a repository instead. That's a different thing. There are several tools to help you do that; at their core, all they do is run createrepo for RPM packages, or dpkg-scanpackages for Debian packages. You can do that yourself, or invest time in a tool like reprepro or aptly or some such.
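The do-it-yourself version is roughly the following (the server paths and URLs are placeholders, and the apt line assumes an unsigned, flat repository, which is why it needs trusted=yes):

# On the server: index the directories holding your packages
createrepo /srv/repo/centos
cd /srv/repo/ubuntu && dpkg-scanpackages . /dev/null | gzip -9c > Packages.gz

# On CentOS clients: drop a .repo file with baseurl=http://server/repo/centos,
#   then: sudo yum install myapp
# On Ubuntu clients: add "deb [trusted=yes] http://server/repo/ubuntu ./" to sources.list,
#   then: sudo apt-get update && sudo apt-get install myapp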

yum/zypper for non-root installation in an independent rpm database

My company is developing a Linux based software product which is shipped to different customers.
The product itself consists of small software components which interact with each other.
What we usually ship as an update/new release to the customer are the current versions of the different software components, e.g. compA-2.0.1, compB-3.2.3 and compC-4.1.2.
Currently we employ a rather simple shell script for the installation/upgrading process. However, we'd like to move forward to state-of-the-art packaging, mainly to have an easy way of swapping different versions of components, keeping track of files and the packages they belong to, and also to provide the customers with an easier interface for the update/installation.
The software components are installed in different directories, depending on the customers demands. So it could be in /opt, /usr/local or something completely different.
Since the vast majority of our customers runs on rpm-based Linux distributions we decided for rpm-packages instead of dpkg.
In rpm terms our problem is a non-root installation. This is relatively straightforward using the following features:
own rpm database using the --dbpath option
installing in different locations using the Prefix mechanism
optional: disabling automatic library dependencies using AutoReqProv: no in the rpm spec file
Using those features/ options allows us to create rpm packages which can be installed using the rpm command line tool as non-root user.
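For illustration, a non-root install with a private database might look like this (the paths and package names are placeholders, and --prefix only works if the package was built relocatable with a Prefix: tag, as described above):

rpm --dbpath "$HOME/.rpmdb" --prefix "$HOME/opt/compA" -ivh compA-2.0.1.rpm
rpm --dbpath "$HOME/.rpmdb" -qa                                              # query only the private database
rpm --dbpath "$HOME/.rpmdb" --prefix "$HOME/opt/compA" -Uvh compA-2.0.2.rpm  # later upgrade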
However, what we really would like to see is to install those packages via a http repository with either yum or zypper. The latter one is the tool of choice in SUSE based distributions.
The problem we see is that none of these tools provides the required alternative rpm database option (--dbpath in rpm) and the prefix support required for a non-root installation.
Does anybody have a suggestion/idea how to deal with this issue? Is there maybe a third package tool that we're not aware of?
Or should we maybe go a totally different route? I had a play with GNU Stow and wrote some very simplistic yum-like logic around it - but then I would basically be writing my own package installation tool, which is what I was trying to avoid.

How to script a standard Linux build?

I'm going to rebuild my Linux box [yet] again. I have to create a few user groups, user accounts and install my standard packages. Until now I've just used the GUI tools. I was wondering if anyone has any recommendations on writing a script to create users, groups and install standard packages after I do a minimal install of my latest Fedora build? Sometimes I run Ubuntu so I'd like the script to be somewhat generic.
For .deb distros, use FAI. For .rpm distros, use Kickstart. For system management after installation, use cfengine.
Fedora and Ubuntu use totally different package managers, so you won't be able to easily do it in any sort of generic way.
In CentOS (which is RedHat Enterprise Edition with the serial numbers filed off, and therefore pretty close to Fedora), we did this using Kickstart files. These files have a simple syntax that enables you to specify users, groups and packages to install, and even to script some custom stuff.
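For a feel of what that looks like, here is an illustrative fragment of a Kickstart file (the group, user and package names are made up):

# ks.cfg fragment
group --name=devs
user --name=alice --groups=devs,wheel --password=changeme --plaintext
%packages
@core
gcc
cmake
vim-enhanced
%end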
While I haven't done this yet, I have a similar problem. I'm considering a virtualization host and multiple client OS (Ubuntu and CentOS being the top 2 candidates) - that way once I get the client configured as I want it, I can save it off for reloading as needed.
Doesn't get around the original setup issue, but does limit the "rebuild my Linux box [yet] again" problem.
You may want to consider it.
It may be overkill but you can check out Puppet.
From their website:
Puppet is a system for automating system administration tasks.
I'm just starting looking for ways to automate system administration, so I don't have much experience with it yet.
If all you need to do is create users and groups and install packages then I would suggest that you just write two separate scripts.
It might be that you could share the users and groups part but only if all the distributions you use have the same policy for creating them (for example Ubuntu creates a group for each user while I am sure some distributions have a "users" group as well).
You could take a look at the useradd and groupadd commands which should be available everywhere. For Ubuntu there is also the friendlier adduser and addgroup and I would not be surprised if Fedora has a set of similar commands.
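A minimal sketch with the low-level tools, which behave the same on Fedora and Ubuntu (the group and user names are examples):

#!/bin/sh
groupadd developers
useradd -m -G developers alice    # -m creates the home directory
useradd -m -G developers bob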
After the groups are set up, you just need to feed the package manager a big list of packages you need to have installed. Trying to install packages which are already installed should be safe, so you could install the packages you need on a "clean" new install and then dump a package list.
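Dumping and replaying that list can be as simple as this, assuming a Debian-based box on one side and an RPM-based one on the other (the file name is a placeholder):

# Debian/Ubuntu: save the selection list, then restore it on a fresh install
dpkg --get-selections > packages.list
sudo dpkg --set-selections < packages.list && sudo apt-get dselect-upgrade

# Fedora/CentOS: dump installed package names and feed them back to yum
rpm -qa --qf '%{NAME}\n' > packages.list
sudo yum install $(cat packages.list)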
So to summarize: If you don't plan to support more than two distributions then I suggest just writing the two scripts separately.
Another option to help with constantly rebuilding a box is Norton Ghost; with Ghost you can make an image and then just re-image the drive as needed. You install it and configure it to your liking, then take an image.
It's going to be difficult to make the script generic, but you could use any sort of scripting tool (Bash, or Ruby, or whatever), check which distro is running, and then run the appropriate commands to install software. There are various ways to check which distro is running.
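One common approach, as a sketch (this assumes the distros involved ship /etc/os-release; older releases may need a fallback such as checking /etc/redhat-release or /etc/lsb-release):

#!/bin/sh
# Install the packages given on the command line with the native package manager
. /etc/os-release
case "$ID" in
  ubuntu|debian)       sudo apt-get install -y "$@" ;;
  fedora|centos|rhel)  sudo yum install -y "$@" ;;
  *) echo "Unsupported distro: $ID" >&2; exit 1 ;;
esac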
Creating groups should be the same on all distros, and you may even be able to drop in an already configured /etc/passwd and /etc/group (though I haven't tried that, and it may not work).
The response above, about the different distros using different methods is dead on. It's like trying to use the same part for a Chevy and a Ford (there's the car analogy, for you).
The easiest method I've found is to learn about setting up partitions for the different mount points i.e. / ; /home ; /var ; /opt are the big ones.
This lets you keep your users, groups, and many of your apps during your rebuilds. Changing distros will break a lot of things, but your user accounts should still be there.

Resources