Automating the process of setting up a new server - linux

I maintain the servers of a web game. Whenever we add a new server to the game, I have to configure many environment details and install software on the new machine: for example, testing whether certain ports on the new machine can be reached from elsewhere, installing mysql-client, pv and so on, copying the game server files from another machine, and changing the MySQL server connection URL.
So my question is: how can I automate the whole process of setting up a new server? Most of the work I do is repetitive, and I don't want to repeat it every time a new machine comes in.
Is there a tool that lets me save the state of a Linux machine so that the next time we buy a new server, I can copy the state of an old machine to the new one? I think this is one way to automate the process of setting up a new game server.
I've also tried using some *.sh scripts to automate the process, but it's not always possible to get the return value of every command I execute. This is why I'm asking for help here.

Have you looked at Docker, Ansible, Chef or Puppet?
In Docker you can build a new container by describing the required operations in a Dockerfile, and you can easily move containers between machines.
Ansible, Chef and Puppet are tools for automating systems management.

I doubt you'll find a tool that automates an entire customisation process, because it's rather difficult to define or capture a one-size-fits-all Linux machine state, especially if the customisation includes logical/functional sequences.
But with good scripting you can get a possibly more reliable customisation from scratch (rather than copying it from another machine). I'd recommend a higher-level scripting language, though; IMHO regular bash/zsh/csh scripting is not convenient enough. I prefer Python, which gives easy access to every command's return code, stdout and stderr, and with the pexpect module it can drive interactive commands.
There are tools that handle specific types of customisation (software package installation, config files), but they didn't cover everything I needed, so I didn't bother and went straight for custom scripts (more work, but total control). That's a personal preference, though; others will advise against it.
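To make the point about return codes concrete, here is a minimal sketch of a Python setup runner, assuming Python 3.7 or later. The names run_step and run_all are made up for illustration, and the two steps are placeholders that only call the Python interpreter; real steps would be the actual install and copy commands.

```python
import subprocess
import sys


def run_step(argv):
    """Run one setup command and return (return code, stdout, stderr)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


def run_all(steps):
    """Run steps in order; stop at the first failure and report it."""
    for argv in steps:
        code, out, err = run_step(argv)
        if code != 0:
            # Hand back the failing command so the caller can log or retry it.
            return False, argv, code, err
    return True, None, 0, ""


if __name__ == "__main__":
    # Placeholder steps (hypothetical): replace with the real setup commands.
    steps = [
        [sys.executable, "-c", "print('step 1 ok')"],
        [sys.executable, "-c", "import sys; sys.exit(3)"],
    ]
    print(run_all(steps))
```

Stopping at the first non-zero return code is a design choice; a real script might prefer to collect every failure and print a summary at the end.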

Related

Should I configure my EC2 using user_data or Ansible?

When launching EC2 instances using Terraform (or CloudFormation), we can configure them by putting scripts in user_data/remote-exec. Alternatively, we can configure them using Ansible, Chef, etc. What is the difference between configuring EC2 with user_data/remote-exec and doing it with Ansible/Chef? When should I use the former, and when the latter? (I know Ansible/Chef is idempotent.)
In my case, the EC2 instances were originally launched manually and then configured manually using a lot of Linux commands, and those commands were not written by me. Now I am the person who has to automate the whole setup using Terraform and configure the instances. Using user_data/remote-exec is straightforward: I just need to put all the existing Linux commands into some scripts with a few changes. And if the configuration produced by my script is wrong, I can at least quickly find out whether I missed some commands by comparing my script with the original commands. But if I use Ansible/Chef, I have to rewrite all the steps in a different language. And if the configuration is not what I expected, it is hard for me to figure out which steps are incorrect, because the syntax of Ansible/Chef and that of Linux commands are totally different.
My question is: in my case, should I use Ansible/Chef or user_data/remote-exec for configuration?
User data is good for the initial configuration of a system. If you need longer-term maintenance, configuration management software like Ansible/Chef/Salt/Puppet is a great option.
Packer can be used for immutable infrastructure, i.e. servers that don't change after creation. You can run all the scripts and installs while building the image so that the system is ready to just boot; this is also faster because you don't have to wait for user data to run.
A few questions you have to ask as well: how often are you going to patch these? Are you going to just update the existing servers or replace them with new ones? Ansible is great for configuration since it's just YAML files an
Blue/Green deployments generally replace servers with all new ones and gradually move traffic over to the new servers.
Some more things to consider with your Infrastructure as code

Linux: What should I use to run terminal programs based on a calendar system?

Sorry about the really ambiguous question; I have no idea how to word it, though hopefully I can give you more detail here.
I am developing a project where a user can log into a website and book a server to run a game for a specific amount of time. When the time is up the server stops running and the players on the server are kicked off. The website part is not a problem, I am doing this in PHP and everything works. It has a calendar system to book a server and can generate config files based on what the user wants.
My question is: what should I use to run the specific game server on the Linux box with those config files at the correct time? I have got this working with bash scripts and cron, but it seems very inelegant. It literally uses FTP to connect to the website so it can download all the necessary config files and put them in a folder for that game and time. I was wondering if there is a better way of doing this. Perhaps writing a program in C, but I am not sure how to go about that.
(I am not asking for someone to hold my hand and tell me "write this code here", just some ideas of a better way of approaching this problem)
Thanks so much guys!
Edit: The web server is a totally different machine. Ideally, I would like to have more than one game server, where each of them "connects" (at the moment via FTP) to the web server, gets a file saying what it has to do at a specific time, downloads any associated files, and then disconnects.
I think the at command is better suited than cron for running one-time jobs.
For a better approach to downloading the files etc., you should give more details on your setup (for example, are the website and the game server on the same machine, or on the same network?).
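As a rough illustration of the at suggestion, a booking could be turned into a one-time job along these lines. This is only a sketch: it assumes the at utility is installed and its daemon is running on the game server, it uses the POSIX -t time format ([[CC]YY]MMDDhhmm), and the function names and the example script path are hypothetical.

```python
import subprocess
from datetime import datetime


def at_argv(start):
    """Build an at command line using the -t [[CC]YY]MMDDhhmm time format."""
    return ["at", "-t", start.strftime("%Y%m%d%H%M")]


def schedule(command, start):
    """Queue `command` (a shell command line) to run once at `start`."""
    # at reads the job to execute from standard input.
    return subprocess.run(at_argv(start), input=command + "\n", text=True, check=True)
```

A booking starting on 5 January 2030 at 18:30 could then be queued with schedule("/opt/game/start.sh booking-42", datetime(2030, 1, 5, 18, 30)), and a second job at the end time could stop the server.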
You need a distributed task scheduler. With that, you can:
Schedule command "X" to be run at a certain time.
Specify the machine (or ask it to pick a machine from a pool of available machines)
Webserver would send request to this scheduler via command line or via web service when user selects a game server and a time.
You can have a look at: http://www.acelet.com/super/SuperWatchdog/index.html
EDIT:
One more option: http://jobscheduler.sourceforge.net/

Use "apt" or compile from scratch for a web service?

For the first time, I am writing a web service that will call upon external programs to process requests in batch. The front-end will accept file uploads and then place them in a queue. The workers on the backend will take that file, run it through ffmpeg and the rest of my pipeline, and send an email when the process is complete.
I have my backend process working on my computer (Ubuntu 10.04). The question is: should I try to re-create that pipeline using binaries that I've compiled from scratch? Or is it okay to use apt when configuring in The Real World?
Not all hosting services use Ubuntu, and not all give me root access. (I haven't chosen a host yet.) However, they will let me upload binaries to execute, and many give me shell access with gcc.
Usually this would be a no-brainer and I'd compile it all from scratch. But doing so - not to mention trying to figure out how to create a platform-independent .tar.gz binary - will be quite a task which ultimately doesn't really help me ship my product.
Do you have any thoughts on the best way to set up my stack so that I'm not tied to a specific hosting provider? Should I try creating my own .deb, which contains Ubuntu's version of ffmpeg (and other tools) with the configurations I need?
Short of a setup where I manage my own servers/VMs (which may very well be what I have to do), how might I accomplish this?
"The question is: should I try to re-create that pipeline using binaries that I've compiled from scratch? Or is it okay to use apt when configuring in The Real World?"
It's the other way around: IMHO it is not okay to deploy unpackaged software in The Real World.
"and not all give me root access"
How would you deploy a .deb without root access? Chroot jails?
"But doing so - not to mention trying to figure out how to create a platform-independent .tar.gz binary - will be quite a task which ultimately doesn't really help me ship my product."
+1. You answered your own question. Don't meddle unless you have to.
"Do you have any thoughts on the best way to set up my stack so that I'm not tied to a specific hosting provider?"
Only depend on well-packaged standard libraries (such as ffmpeg); otherwise include them in your own deployment. This problem has been solved for tens of thousands of Linux applications over decades now, so it should be feasible for you too.
Out of the box:
Look at RightScale and other cloud providers/agents that have specialized images and tool chains specifically for video encoding.
A 'regular' VPS provider (with Xen or Virtuozzo) will not normally be happy with this kind of workload, but EC2, Rackspace and their lot will be absolutely fine with it.
In general, I wouldn't believe that a cloud infrastructure provider that doesn't grant root access will allow for computationally intensive workloads. $0.02

How to fetch network card configs remotely from multiple Linux machines?

I need a tool/script to fetch network card configurations from multiple Linux machines, mostly Red Hat Enterprise 5. I only know some basic bash, and I need something that can be run remotely, pulling server names from a CSV. It also needs to be run quickly and easily by non-technical types from a Windows machine. I've found WBEM/CIM/SBLIM, but I'd rather not write a whole C++ application. Can anyone point me to a tool or script that could accomplish this?
For Red Hat Enterprise Linux servers, you likely just need to take a copy of the files in /etc/sysconfig/networking/devices/ from each server. You can use an sftp client to accomplish that over ssh.
(The files are just easy-to-read text config files containing the network device configuration)
Can you give more details as to what information you need to pull? The various parameters to ifconfig give quite a lot of information about a Linux machine's network card configuration, so if you can do it that way it will be very easy. Simply write a script that converts the CSV into something white-space delimited, and then you can do something like:
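#!/bin/bash
# HOSTS must hold the whitespace-delimited host names produced from the CSV
for host in $HOSTS; do
    # Run ifconfig remotely; the full path is used because /sbin is often
    # not on a non-root user's PATH
    CARDINFO=$(ssh "$host" '/sbin/ifconfig')
    # Do whatever processing you need on CARDINFO here
done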
That's a very rough sketch of the pseudocode. You'll also need to set up passwordless SSH on the hosts you want to access, but that's easy to do on Red Hat.
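If a higher-level language is an option, the same loop can read the host names straight from the CSV. The sketch below assumes the host name sits in the first column, that key-based SSH login is already set up, and that ifconfig lives at /sbin/ifconfig on the remote machines; read_hosts and fetch_ifconfig are made-up names.

```python
import csv
import io
import subprocess


def read_hosts(csv_text, column=0):
    """Return the non-empty host names found in one column of CSV text."""
    rows = csv.reader(io.StringIO(csv_text))
    return [
        row[column].strip()
        for row in rows
        if len(row) > column and row[column].strip()
    ]


def fetch_ifconfig(host):
    """Run ifconfig on `host` over ssh and return (return code, output)."""
    # BatchMode makes ssh fail instead of prompting when key login is not set up.
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, "/sbin/ifconfig"],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout
```

Looping over read_hosts(open("servers.csv").read()) and calling fetch_ifconfig on each entry would collect the raw output per host; servers.csv is a placeholder file name.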
If you want to use WBEM/CIM for that (as mentioned in your original question), and you prefer a scripting environment over a programming language such as C/C++/Java, then there are PyWBEM and PowerCIM as two ways to do that in Python. If it needs to be bash etc, then there are command line clients (such as cimcli from the OpenPegasus project or wbemcli from the SBLIM project) and you could parse their output. Personally, I would prefer a Python based approach using PyWBEM. It is very easy to use, connecting to a CIM server is one line and enumerating CIM instances of a class is one more line.
On the side of the Linux system you want to query, the CIM server would need to run (tog-pegasus or sfcb) along with the right CIM provider packages (sblim). This approach has the advantage that your interface will be the same regardless of which Linux distribution you are using. Parsing config files is often dependent on the type of Linux distribution and I have seen them change across versions.
One main purpose of CIM is to provide reliable interfaces that are consistent across different types of environments and that change only compatibly over time.
Last but not least, using CIM allows you to get away without having to install any agent software on the system you want to inspect (as long as you can ensure that the CIM server is running).
Andy

Automated deployment of files to multiple Macs

We have a set of Mac machines (mostly PPC) that are used for running Java applications for experiments. The applications consist of folders with a bunch of jar files, some documentation, and some shell scripts.
I'd like to be able to push out new versions of our experiments to a directory on one Linux server, and then instruct the Macs to update their versions, or retrieve an entire new experiment if they don't yet have it.
../deployment/
../deployment/experiment1/
../deployment/experiment2/
and so on
I'd like to come up with a way to automate the update process. The Macs are not always on, and they have their IP addresses assigned by DHCP, so the server (which has a domain name) can't contact them directly. I imagine that I would need some sort of daemon running full-time on the Macs, pinging the server every minute or so, to find out whether some "experiments have been updated" announcement has been set.
Can anyone think of an efficient way to manage this? Solutions can involve either existing Mac applications, or shell scripts that I can write.
You might have some success with a simple Subversion setup; if you have the dev tools on your farm of Macs, then they'll already have Subversion installed.
Your script is as simple as running svn up on the deployment directory as often as you want and checking your changes in to the Subversion server from your machine. You can do this without any special setup on the server.
If you don't care about history and a version control system seems too "heavy", the traditional Unix tool for this is called rsync, and there's lots of information on its website.
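For the rsync route, each Mac could periodically run something like the sketch below (from cron or launchd, for instance). It assumes rsync over SSH with key-based login to the server; the function names, host name and paths are placeholders, not part of the original setup.

```python
import subprocess


def rsync_argv(server, remote_dir, local_dir):
    """Build an rsync command that mirrors remote_dir on server into local_dir."""
    # -a keeps permissions and timestamps, -z compresses data in transit,
    # --delete removes local files that no longer exist on the server.
    return ["rsync", "-az", "--delete", f"{server}:{remote_dir}/", f"{local_dir}/"]


def pull_updates(server, remote_dir, local_dir):
    """Run one sync pass and return True when rsync reports success."""
    return subprocess.run(rsync_argv(server, remote_dir, local_dir)).returncode == 0
```

Because the Macs pull from the server, it does not matter that they are sometimes off or get their addresses from DHCP; they simply catch up on the next pass. Leave out --delete if experiments that were removed from the server should stay on the Macs.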
Perhaps you're looking for a solution that doesn't involve any polling; in that case, maybe you could have a process that runs on each Mac and registers a local network Bonjour service; DNS-SD libraries are probably available for your language of choice, and it's a pretty simple matter to get a list of active machines in this case. I wrote this script in Ruby to find local machines running SSH:
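#!/usr/bin/env ruby
require 'rubygems'
require 'dnssd'

# Browse for SSH services advertised over Bonjour and print each one found
handle = DNSSD.browse('_ssh._tcp') do |reply|
  puts "#{reply.name}.#{reply.domain}"
end

# Give the browse a second to collect replies, then stop it
sleep 1
handle.stop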
You can use AppleScript remotely if you turn on Remote Events on the client machines. As an example, you can control programs like iTunes remotely.
I'd suggest that you put an update script on your remote machines (AppleScript or otherwise) and then use remote AppleScript to trigger running your update script as needed.
If you update often, then Jim Puls's idea is a great one. If you'd rather have direct control over when the machines start looking for an update, then remote AppleScript is the simplest solution I can think of.

Resources