How to cluster CoreOS instances without cloud-init - coreos

So I created three CoreOS (Container Linux) instances on AWS without user-data/cloud-config.
I did this because my understanding was that cloud-config is just for setting up services, SSH keys, and the like, and I wanted to do that through other means.
Now I want to know how these three separate instances can be converted to act as a cluster so that updates happen one at a time.
In essence, my question is: what is the required piece that turns separate instances into a cluster in CoreOS?

The easiest way is to just destroy and recreate the machines. Using cloud-init makes this easy, once you get it working just right.
The next best way is going to be writing out a systemd "drop-in" per machine with the etcd clustering config. You will need to translate the flags from this example into env vars (they follow the same format) and include those env vars in a drop-in.
An example looks like:
/etc/systemd/system/etcd2.service.d/10-clustering.conf
[Service]
Environment=ETCD_LISTEN_PEER_URLS=http://10.0.0.1:2380
Environment=ETCD_<insert others>
...
After applying these, run sudo systemctl daemon-reload to pick up the changes, and then sudo systemctl restart etcd2. You can check if your changes were picked up correctly in the etcd log (it logs all detected env vars) and with systemctl cat etcd2, which should list out your drop in.
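For a static three-node cluster, the full drop-in on each machine ends up looking something like the following sketch; the node name, IPs, and cluster token are placeholders rather than values from the question, and each machine gets its own name and addresses:
/etc/systemd/system/etcd2.service.d/10-clustering.conf
[Service]
Environment=ETCD_NAME=node1
Environment=ETCD_LISTEN_PEER_URLS=http://10.0.0.1:2380
Environment=ETCD_LISTEN_CLIENT_URLS=http://10.0.0.1:2379,http://127.0.0.1:2379
Environment=ETCD_INITIAL_ADVERTISE_PEER_URLS=http://10.0.0.1:2380
Environment=ETCD_ADVERTISE_CLIENT_URLS=http://10.0.0.1:2379
Environment=ETCD_INITIAL_CLUSTER=node1=http://10.0.0.1:2380,node2=http://10.0.0.2:2380,node3=http://10.0.0.3:2380
Environment=ETCD_INITIAL_CLUSTER_STATE=new
Environment=ETCD_INITIAL_CLUSTER_TOKEN=my-etcd-cluster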

Related

Init.d script to systemd problem with parameters

I am trying to upgrade my init.d script called "myService" to systemd. The init.d script takes one parameter which decides what to do, with the following switch case:
case "$choice" in
"start")
# starts service logic here
"stop")
# stops service logic here
"filter")
# runs some .sh file from our PC
esac
In order to upgrade to systemd, I created a myService.service unit file and set ExecStart and ExecStop to execute the init.d script with the start or stop parameter. Now I can do systemctl start myService.service. However, if I want to invoke the filter option, I cannot do systemctl filter myService.service, since "filter" is not a valid option for systemctl. Any suggestions on how I can overcome this?
This scheme does not fit within systemd's responsibilities as a service manager, which include (but are not limited to):
running services (e.g. starting, stopping, etc.)
the configuration of the above (e.g. which system level to run in)
providing information on the status of a service
declaring the dependencies and the handling between the various services
Although you did not provide information on the implementation of the service, it seems that the filter mode is an application-specific action. Moreover, it is not clearly described what should happen when the service is stopped and filter is issued.
So, keeping in mind the separation of concerns, I'd suggest using systemd to control the start and stop of your service, but use whatever IPC (D-Bus, sockets, signals, etc.) that service is using to trigger the filter operation.
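A minimal sketch of that split, assuming the existing init script lives at /etc/init.d/myService: the unit below only wraps start and stop, while the filter action stays outside systemd and is triggered directly through whatever interface the service exposes.
/etc/systemd/system/myService.service
[Unit]
Description=myService (legacy init script wrapper)
After=network.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/etc/init.d/myService start
ExecStop=/etc/init.d/myService stop

[Install]
WantedBy=multi-user.target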

Can I install two chef clients on a Linux server so that both two clients can manage the linux server?

I want to install a Chef client on a Linux server to manage the server by executing shell commands. However, one of the recipes in the run list contains a reboot command, and the remaining recipes do not continue to execute after the server reboots. As I haven't found a way to solve this, I wonder if I can install two Chef clients on the Linux server and have them execute different recipes, so that the remaining recipes continue to execute after the reboot. Can anyone help? Thanks.
Putting two clients on a single device, or two configuration management tools on the same box in general, is a Bad Idea. Even if you could do it, the added cognitive load of working out when to update what, and where, is going to open you up to mistakes.
The proper approach is to put restart flags in your recipes: before you call the restart resource, you set a flag (which can be a file's contents or even its existence, an environment variable, or any number of other persistent data objects) to indicate that a restart was performed. If it's to be periodic, you can instead look at something like the last time a file was accessed via its atime. Then you put logic around the steps that require a reboot, either guarding against the reboot if the no-restart flag is set or triggering it if the restart flag is set, your choice. That way you'll have one Chef converge with a restart that skips part of your run list, then later another run that skips the unnecessary restart.
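A rough sketch of that flag approach, using Chef's execute and reboot resources; the flag name, script path, and resource names below are made up for illustration:
restart_flag = ::File.join(Chef::Config[:file_cache_path], 'restarted-once')

# First converge only: record the flag, then reboot right away.
# The 'creates' guard makes this a no-op on every later run.
execute 'flag-and-restart' do
  command "touch #{restart_flag}"
  creates restart_flag
  notifies :reboot_now, 'reboot[flagged]', :immediately
end

# Work that should only happen on the converge after the reboot.
execute 'post-reboot-step' do
  command '/usr/local/bin/finish-setup.sh'
  only_if { ::File.exist?(restart_flag) }
end

reboot 'flagged' do
  reason 'one-time restart flagged by recipe'
  action :nothing
end

On the first converge the reboot ends the run early; on the next converge the flag already exists, so the reboot is skipped and the remaining guarded steps run.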
Another good option is to pay more attention to how your resources are ordered. If the restart is in your last run-list item and is notified with the :delayed timing, then it will be the last resource to run, meaning the rest of your recipes will already have converged. If you need a complete converge every time, that is the option you should embrace.
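A minimal sketch of that :delayed variant; the package is just a stand-in for whatever resource actually needs the restart:
# :delayed pushes the reboot to the very end of the converge,
# after the rest of the run list has been applied.
package 'kernel-tools' do
  action :upgrade
  notifies :reboot_now, 'reboot[post-converge]', :delayed
end

reboot 'post-converge' do
  reason 'restart after the full converge'
  action :nothing   # fires only when notified
end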
Option 1 is a Ruby-centric solution and will require you to embrace dev work. Option 2 is more pure Chef with some Ruby sprinkled in and you can read up on notifying resources in the docs here: https://docs.chef.io/resource_common.html#notifications
There is an option 3 where you change the runlist during the chef run, which you could use to remove the recipes that require the reboot, but I think you'd benefit more from option 1 or 2.

should I configure my EC2 using user_data or Ansible

When launching EC2 instances with Terraform (or CloudFormation), we can configure them by putting scripts in user_data/remote-exec. Alternatively, we can configure them with Ansible/Chef, etc. What is the difference between configuring EC2 via user_data/remote-exec and doing it with Ansible/Chef? When should I use the former, and when the latter (I know Ansible/Chef is idempotent)?
In my case, the EC2 instances were originally launched manually and then configured manually with a lot of Linux commands, and those commands were not written by me. Now I am the person automating the whole setup with Terraform and configuring the EC2 instances. Using user_data/remote-exec to configure them is straightforward: I just need to put all the existing Linux commands into scripts with a few changes. And if the configuration produced by my script is not successful, I can at least quickly figure out whether I missed some commands by comparing my script with the original command list. But if I use Ansible/Chef, I have to rewrite all the steps in a different language, and if the configuration is not what I expected, it is hard to figure out which steps are incorrect, because the syntax of Ansible/Chef and plain Linux commands is totally different.
My question is: in my case, should I use Ansible/Chef or user_data/remote-exec for configuration?
User data is good for the initial configuration of the system. If you need longer-term maintenance, configuration management software like Ansible/Chef/Salt/Puppet is a great option.
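For example, in that model a user_data bootstrap can stay a plain shell script that only does the one-time setup; the packages and marker file below are placeholders, not commands from the question:
#!/bin/bash
# One-time bootstrap executed by cloud-init on first boot (illustrative only).
set -euxo pipefail

# Install the minimum needed before any configuration management can take over.
yum install -y python3 git

# Leave a marker so later tooling can tell the bootstrap already completed.
touch /var/lib/bootstrap-complete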
Packer can be used for immutable infrastructure, i.e. infrastructure that doesn't change after creation. You run all the scripts and installs while building the image so the system is ready as soon as it boots; this is also faster because you don't have to wait for user data to run.
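A minimal Packer template for that approach looks roughly like this (classic JSON template format; the AMI ID, region, and provisioning script are placeholders):
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "us-east-1",
      "source_ami": "ami-xxxxxxxx",
      "instance_type": "t3.micro",
      "ssh_username": "ec2-user",
      "ami_name": "my-app-{{timestamp}}"
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "script": "setup.sh"
    }
  ]
}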
A few questions to ask as well: how often are you going to patch these instances? Are you going to update the existing ones or replace them with new ones? Ansible is a good fit for ongoing configuration since it's just YAML files.
Blue/green deployments generally replace servers with all-new ones and gradually move traffic over to the new servers.
These are some more things to consider when structuring your infrastructure as code.

Docker daemon.json logging config not effective

I have a MongoDB Docker container (the stock one downloaded from the Docker repo). Its log size is unconstrained (/var/lib/docker/containers/'container_id'/'container_id'-json.log).
This recently caused a server to fill up so I discovered I can instruct the docker daemon to limit the max size of a container's log file as well as the number of log files it will keep after splitting. (Please forgive the naiveté. This is a tools environment so things get set up to serve immediate needs with an often painful lack of planning)
Stopping the container is not desirable (though it wouldn't bring about the end of the world) thus doing so is probably a suitable plan G.
Through experimentation I discovered that running a different instance of the same docker image and including --log-opt max-size=1m --log-opt max-file=3 in the docker run command accomplishes what I want nicely.
I'm given to understand that I can include this in the docker daemon.json file so that it will work globally for all containers. I tried adding the following to the file "/etc/docker/daemon.json"
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
Then I sent a -SIGHUP to the daemon. I did observe that the daemon's log spit out something about reloading the config and it mentioned the exact filepath at which I made the edit. (Note: This file did not exist previously. I created it and added the content.) This had no effect on the log output of the running Mongo container.
After reloading the daemon I also tried instantiating the different instance of the Mongo container again and it too didn't observe the logging directive that the daemon should have. I saw its log pass the 10m mark and keep going.
My questions are:
Should there be a way for updates to logging via the daemon to affect running containers?
If not, is there a way to tell the container to reload this information while still running? (I see docker update, but this doesn't appear to be one of the config options that can be updated.)
Is there something wrong with my config? I tested including a nonsensical directive to see if mistakes would fail silently, and they did not: a directive not in the schema raised an error in the daemon's log. This indicates that the content I added (displayed above) is at least expected, though possibly incomplete. The options work on the docker run command line but not in the config file. Also, I initially tried including the "3" as a number, and that raised an error too, which disappeared when I stringified it.
I did see in the file "/var/lib/docker/containers/'container_id'/hostconfig.json" for the other instance of the Mongo container (the one where I included the directives in its run command) that these settings were visible. Would it be effective/safe to manually edit this file for the production instance of the Mongo container to match the proof-of-concept container's config?
Please see below some system details:
Docker version 1.10.3, build 20f81dd
Ubuntu 14.04.1 LTS
My main goal is to understand why the global config didn't seem to work and if there is a way to make this change to a running container without interrupting it.
Thank you, in advance, for your help!
This setting will be the new default for newly created containers, not existing containers even if they are restarted. A newly created container will have a new container id. I stress this because many people (myself included) try to change the log settings on an existing container without first deleting that container (they've likely created a pet), and there is no supported way to do that in docker.
It is not necessary to completely stop the docker engine, you can simply run a reload command for this change to apply. However, some methods for running docker, like the desktop environments and Docker in Docker based installs, may require a restart of the engine when there is no easy reload option.
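On a typical systemd-based Linux install, assuming the stock docker.service unit (which wires reload to SIGHUP), that reload is just:
# ask the engine to re-read /etc/docker/daemon.json
sudo systemctl reload docker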
This setting will limit the JSON file to three separate 10 MB files, which is between 20 and 30 MB of logs depending on how full the third file is. Once you fill the third file, the oldest log is deleted, taking you back to 20 MB, the other logs are rotated, and a new log file is started. However, JSON has a lot of overhead, approximately 50% in my testing, which means you'll get roughly 10-15 MB of application output.
Note that this setting is just the default, and any container can override it. So if you see no effect, double check how the container is started to verify there are no log options being passed there.
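In practice that means recreating the container rather than restarting it, or passing the limits explicitly for just that container; the container name and image below are examples only:
# remove and recreate so the new daemon.json defaults (or explicit flags) apply
docker rm -f my-mongo
docker run -d --name my-mongo \
  --log-opt max-size=10m --log-opt max-file=3 \
  mongo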
Changing daemon.json did not affect my running containers. Reloading the daemon and restarting Docker after editing /etc/docker/daemon.json worked, but only for new containers:
docker-compose down
sudo systemctl daemon-reload
sudo systemctl restart docker
docker-compose up -d

How can I tell Puppet to stop a service on shutdown without keeping it running?

Context:
On a linux (RedHat family) system, I have an init-script-based service that is started/stopped manually most of the time, and in general is only run in response to specific, uncommon situations. The init scripts are thin wrappers around code that I do not have control over.
If the service is killed without running the stop action on its init script, it is aborted uncleanly and leaves the system in a broken state that requires manual intervention to fix.
When the systems running the service shut down, they kill it uncleanly. I can register the service with chkconfig such that it gets shut down via the init script when the host shuts down; this solves the problem on a per-host basis.
Question:
I would like to automate the configuration of this service to stop-at-shutdown via Puppet.
How can I tell Puppet to register a service with chkconfig such that the service will be stopped via the init script when the system shuts down, but will not otherwise be managed by Puppet?
What I've Tried:
I made a hokey exec statement in Puppet that calls chkconfig directly, but that feels inelegant (and will probably break in some way I haven't thought of).
I played around with the noop flag to the service type in Puppet, but it didn't seem to have the desired effect.
Puppet does not have any built-in support for configuring which runlevels a service runs in, nor any built-in, generalized support for chkconfig. Ordinarily it is a service-installation responsibility to register the service with chkconfig; services that are installed from the system RPMs are registered that way.
Furthermore, chkconfig recognizes structured comments at the top of initscripts to determine which runlevels the service will run in by default, according to LSB convention. A proper initscript need only be registered with chkconfig to have the default runlevels set -- in particular, for it to be set to be stopped in runlevels 0 and 6, which is what you're after.
If you're rolling your own initscripts and deploying them manually or directly via Puppet (as opposed to packaging them up and installing them via Yum) then your best bet is probably to build a defined type that manages the initscript and its registration. You do not need and probably do not want a Service resource for it, but a File resource to put the proper file in place and an Exec resource to handle registration sounds about right.
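As a rough sketch of such a defined type, the type name, file source, and usage below are assumptions rather than anything from the answer:
# Deploys a legacy init script and registers it with chkconfig so its
# LSB-default runlevels (including the stop action in 0 and 6) take effect.
define mymodule::initscript ($source) {
  file { "/etc/init.d/${name}":
    ensure => file,
    owner  => 'root',
    group  => 'root',
    mode   => '0755',
    source => $source,
  }

  exec { "chkconfig --add ${name}":
    path    => ['/sbin', '/usr/sbin', '/bin', '/usr/bin'],
    unless  => "chkconfig --list ${name}",
    require => File["/etc/init.d/${name}"],
  }
}

# Usage (illustrative):
# mymodule::initscript { 'myservice':
#   source => 'puppet:///modules/mymodule/myservice.init',
# }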
