Drone slaves provided by CoreOS

I have a drone host and a CoreOS cluster with fleet.
Drone currently has only unix:///var/run/docker.sock in the nodes menu.
As I understand it, I could add other Docker nodes defined by Docker URLs and certificates. However, since I already have a CoreOS cluster, it seems logical to use that as the provider of the slaves. I am looking for a solution where
(1) I don't have to reconfigure the nodes whenever the CoreOS cluster configuration changes, and
(2) resource management is handled correctly.
I could think of the following solutions:
Expose Docker URIs on the CoreOS cluster nodes and configure all of them directly in drone. In this case I would have to follow CoreOS cluster changes manually, and resource management would probably conflict with that of fleet.
Expose Docker URIs on the CoreOS cluster nodes and provide DNS round-robin based access. This seems to be a terrible way to manage resources, and would most probably conflict with fleet.
Install Swarm on the CoreOS nodes. Resource management would probably conflict with that of fleet.
Have fleet or rkt expose a Docker URI, and let fleet/rkt decide which node the container runs on. The problem is that I could not find any way to do this.
Have drone.io use fleet or rkt directly. Same problem. Is that possible?
Is there any way to satisfy all of these requirements with drone.io and CoreOS?

As I understand it, I could add other Docker nodes defined by Docker URLs
and certificates. However, since I already have a CoreOS cluster, it seems
logical to use that as the provider of the slaves.
The newest version of drone supports build agents. Build agents are installed per server and communicate with the central drone server to pull builds from the queue, execute them, and send back the results.
# Run a drone 0.5 build agent and point it at the central drone server;
# the host Docker socket is mounted into the agent container.
docker run \
  -e DRONE_SERVER=http://my.drone.server \
  -e DRONE_SECRET=passcode \
  -v /var/run/docker.sock:/var/run/docker.sock \
  drone/drone:0.5 agent
This allows you to add and remove agents on the fly without having to register or manage them at the server level.
I believe this should solve the basic problem you've outlined, although I'm not sure it will provide the level of integration you desire with fleet and CoreOS. Perhaps a CoreOS expert can augment my answer.
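For the fleet side, one possible way to wire this up (a sketch, not something from the drone docs; the unit name, server URL and secret are placeholders) is a global fleet unit that runs the agent container on every CoreOS node, so the agents follow the cluster membership automatically:
# drone-agent.service (submit with: fleetctl start drone-agent.service)
[Unit]
Description=Drone build agent
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker rm -f drone-agent
ExecStartPre=/usr/bin/docker pull drone/drone:0.5
ExecStart=/usr/bin/docker run --name drone-agent \
  -e DRONE_SERVER=http://my.drone.server \
  -e DRONE_SECRET=passcode \
  -v /var/run/docker.sock:/var/run/docker.sock \
  drone/drone:0.5 agent
ExecStop=/usr/bin/docker stop drone-agent

[X-Fleet]
Global=true
With Global=true, fleet keeps one agent running on each machine in the cluster, so nodes joining or leaving the cluster don't require any reconfiguration in drone.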

Related

How to patch GKE Managed Instance Groups (Node Pools) for package security updates?

I have a GKE cluster running multiple nodes across two zones. My goal is to schedule a job to run once a week that runs sudo apt-get upgrade to update the system packages. Doing some research, I found that GCP provides a tool called "OS patch management" that does exactly that. I tried to use it, but the Patch Job execution raised an error:
Failure reason: Instance is part of a Managed Instance Group.
I also noticed that during the creation of the GKE node pool there is an option to enable "Auto upgrade", but according to its description it only upgrades the version of Kubernetes.
According to the Blog Exploring container security: the shared responsibility model in GKE:
For GKE, at a high level, we are responsible for protecting:
The nodes’ operating system, such as Container-Optimized OS (COS) or Ubuntu. GKE promptly makes any patches to these images available. If you have auto-upgrade enabled, these are automatically deployed. This is the base layer of your container—it’s not the same as the operating system running in your containers.
Conversely, you are responsible for protecting:
The nodes that run your workloads. You are responsible for any extra software installed on the nodes, or configuration changes made to the default. You are also responsible for keeping your nodes updated. We provide hardened VM images and configurations by default, manage the containers that are necessary to run GKE, and provide patches for your OS—you’re just responsible for upgrading. If you use node auto-upgrade, it moves the responsibility of upgrading these nodes back to us.
The node auto-upgrade feature DOES patch the OS of your nodes; it does not just upgrade the Kubernetes version.
OS Patch Management only works for GCE VMs, not for GKE.
You should refrain from doing OS-level upgrades in GKE yourself; that could cause unexpected behavior (for example, a package gets upgraded and changes something that breaks the GKE configuration).
You should let GKE auto-upgrade the OS and Kubernetes. Auto-upgrade will upgrade the OS, since GKE releases are intertwined with the OS release.
One easy option is to enroll your clusters in release channels; that way they get upgraded as often as you want (depending on the channel) and the OS is patched regularly.
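As a rough sketch, assuming gcloud and placeholder cluster/zone names, enrolling a cluster in a release channel looks something like this:
# Create a cluster enrolled in the "regular" release channel (name and zone are placeholders).
gcloud container clusters create my-cluster \
  --zone us-central1-a \
  --release-channel regular

# Recent gcloud versions can also enroll an existing cluster (check the flags available in your version):
gcloud container clusters update my-cluster \
  --zone us-central1-a \
  --release-channel regular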
You can also follow the GKE hardening guide, which provides steps to make sure your GKE clusters are as secure as possible.

Does Kubernetes restart a failed container or create a new container when the running container fails for any reason?

I have run a Docker container locally, and it stores data in a file (currently no volume is mounted). I stored some data using the API. After that I made the container fail using process.exit(1) and started it again. The previously stored data in the container survives (as expected). But when I do the same thing in Kubernetes (minikube), the data is lost.
Posting this as a community wiki for better visibility; feel free to edit and expand it.
As described in the comments, Kubernetes replaces failed containers with new (identical) ones, which explains why the container's filesystem is clean.
Also, as mentioned, containers should be stateless. There are different options for running different kinds of applications and taking care of their data:
Run a stateless application using a Deployment
Run a stateful application either as a single instance or as a replicated set
Run automated tasks with a CronJob
Useful links:
Kubernetes workloads
Pod lifecycle
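If the data needs to survive container restarts, a minimal sketch (all names, the image, and the size below are placeholders) is to mount a PersistentVolumeClaim into the Pod:
# Hypothetical example: a PVC plus a Deployment that mounts it at /data,
# so data written there survives container restarts.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:latest
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data
EOF
On minikube the default StorageClass provisions the volume automatically; a StatefulSet is the better fit when each replica needs its own persistent identity and storage.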

How to Scale out Gitlab EE

Currently I am running the whole GitLab EE stack as a single container. I need to scale out the service so that it can support more users and more operations (pull/push/merge requests etc.) simultaneously.
I need to run a Redis cluster of its own
I need to run a separate PG cluster
I need to integrate Elasticsearch for search
But how can I scale out the remaining core GitLab services? Do they support a scale-out architecture?
gitlab workhorse
unicorn ( gitlab rails )
sidekiq ( gitlab rails )
gitaly
gitlab shell
Do they support a scale-out architecture?
Not exactly, considering the GitLab Omnibus image is one package with bundled dependencies.
But I never experienced so much traffic that it needed to be split up and scaled out.
There is though a proposal for splitting up the Omnibus image: gitlab-org/omnibus-gitlab issue 1800.
It points to gitlab-org/build/CNG, which does just what you are looking for:
Each directory contains the Dockerfile for a specific component of the infrastructure needed to run GitLab.
rails - The Rails code needed for both API and web.
unicorn - The Unicorn container that exposes Rails.
sidekiq - The Sidekiq container that runs async Rails jobs.
shell - Running GitLab Shell and OpenSSH to provide git over ssh, and authorized keys support from the database.
gitaly - The Gitaly container that provides distributed git repos.
The other option, using Kubernetes, is the charts/gitlab:
The gitlab chart is the best way to operate GitLab on Kubernetes. This chart contains all the required components to get started, and can scale to large deployments.
Some of the key benefits of this chart and corresponding containers are:
Improved scalability and reliability
No requirement for root privileges
Utilization of object storage instead of NFS for storage
The default deployment includes:
Core GitLab components: Unicorn, Shell, Workhorse, Registry, Sidekiq, and Gitaly
Optional dependencies: Postgres, Redis, Minio
An auto-scaling, unprivileged GitLab Runner using the Kubernetes executor
Automatically provisioned SSL via Let's Encrypt.
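As a rough sketch (the domain and e-mail below are placeholders for your own values), installing that chart with Helm looks something like this:
# Add the GitLab chart repository and install the chart with the two required values.
helm repo add gitlab https://charts.gitlab.io/
helm repo update
helm install gitlab gitlab/gitlab \
  --set global.hosts.domain=example.com \
  --set certmanager-issuer.email=admin@example.com
Replica counts for the stateless components can then be tuned through the chart's values.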
Update Sept. 2020:
GitLab 13.4 offers one feature which can help scale out GitLab on-premises:
Gitaly Cluster majority-wins reference transactions (beta)
Gitaly Cluster allows Git repositories to be replicated on multiple warm Gitaly nodes. This improves fault tolerance by removing single points of failure.
Reference transactions, introduced in GitLab 13.3, causes changes to be broadcast to all the Gitaly nodes in the cluster, but only the Gitaly nodes that vote in agreement with the primary node persist the changes to disk.
If all the replica nodes dissented, only one copy of the change would be persisted to disk, creating a single point of failure until asynchronous replication completed.
Majority-wins voting improves fault tolerance by requiring a majority of nodes to agree before persisting changes to disk. When the feature flag is enabled, writes must succeed on multiple nodes. Dissenting nodes are automatically brought in sync by asynchronous replication from the nodes that formed the quorum.
See Documentation and Issue.

How do I implement a simple scheduler to start docker container on the least busy host

I am not using Docker Swarm or Kubernetes. I need to implement a simple scheduler to start Docker containers on demand on the least busy host. The containers run Node.js code, by the way.
Is there any open source project out there that already implements this?
Thanks!
Take a look at Kubernetes Jobs if you need to run a container once and on demand.
As for the least busy node, you can use nodeAffinity to target nodes that don't run other apps, or, better, specify resource requests for your app and let Kubernetes decide where to run it.
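A minimal sketch of that second approach (the image name and resource numbers are placeholders): a Job with resource requests, so the Kubernetes scheduler places it on a node with enough free capacity:
# Hypothetical one-off Job; the scheduler only places it on a node with at least
# the requested CPU/memory available (image and values are placeholders).
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: node-task
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: node-task
          image: my-node-app:latest
          resources:
            requests:
              cpu: 500m
              memory: 256Mi
EOF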

Node Cluster and/or Docker Cluster?

Trying to get the best performance from my application with as little setup as possible.
I'm struggling to find a consensus online on whether it would be better to use the Node cluster module inside a Docker container, or a cluster of Docker instances instead.
OPINION: Node cluster first, then Docker cluster
OPINION: Don't use Node cluster in a Docker instance
Depends what "best performance" means? What is the bottleneck in your case? CPU? RAM? Network? Disk-I/O?
Advantages of a node cluster:
All communication is in memory.
Disadvantage
The solution doesn't scale beyond one host. If the host is overloaded, then so is your service
Advantages of a docker cluster:
high availability.
more network bandwidth, more resources as you have more hosts
Assuming you run your software as a service in docker anyway, I can't see the issue of "little setup as possible". Use both if it makes sense.
