Tazjin's blog

Provisioning CoreOS with Ansible: Basic Setup

This is the first in a series of blog posts about how to manage a cluster of CoreOS machines using Ansible.

CoreOS is a minimal Linux distribution focused on providing a highly-available Linux base for servers, with all applications running in Docker and service discovery and scheduling handled by etcd and fleet.

I'll assume some familiarity with CoreOS concepts such as cluster discovery, as well as a general knowledge of how Docker and Ansible work. If you're not familiar with those, go read up on them first - it's lots of fun!

All code for these blog posts is collected in my example repository. Every blog post has an associated tag, for this post it's basic-setup.

So let's get started. In this post we will cover how to run Ansible on CoreOS machines and what we can do with it. In the end we'll provision a cloud-config using Ansible.

Running Ansible - where's Python?

CoreOS is, as mentioned before, intended to be a minimal Linux distribution, and it ships a very minimal set of default packages - Python not among them. Because Python is required for most of Ansible's functionality, we need a way to run a Python interpreter on the machine that can access Ansible's provisioning scripts.

Of course CoreOS is all about containers, so that's what we'll use: we'll create our own Python drop-in based on the CoreOS toolbox. Drop this gist in /opt/bin/python:

In short, we start the toolbox container (by default a Fedora image that contains Python) and map the host's /home folder to the container's /home folder. This is necessary because Ansible drops its provisioning scripts into ~/.ansible by default and then calls the Python interpreter on them. The toolbox then calls python with the arguments passed to the script.
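The gist isn't reproduced here, but the essential idea can be sketched in a few lines. This is illustrative only - the real gist also arranges the /home mapping described above, which this sketch merely notes:

```shell
#!/bin/bash
# Sketch of /opt/bin/python -- see the linked gist for the real version.
# Forwards all arguments to the Python interpreter inside the CoreOS
# toolbox container (by default a Fedora image that ships Python).
# The actual script must also ensure the host's /home is visible at
# /home inside the container, so that the module paths Ansible generates
# under ~/.ansible resolve identically on both sides.
exec /usr/bin/toolbox /usr/bin/python "$@"
```

With this in place, `/opt/bin/python -c 'print "hello"'` on the host transparently runs inside the toolbox container.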

Setting up Ansible

Ansible will usually try to find Python by calling env python. With our solution, however, this won't return the path of the Python drop-in.

There is an inventory variable called ansible_python_interpreter that you can set to /opt/bin/python to solve this problem.

Let's look at an example inventory file that contains some more options as well.

Here we set up three demo machines to form a CoreOS cluster and add them to two groups: coreos and demo. The first group is intended as a top-level group for all CoreOS machines, the second one is what I call a "pod" - a way to differentiate between different clusters (for example for setting different discovery URLs for different machines). How you structure this is of course up to you.

The other variables here are domain - the search domain for host FQDNs - and the two CoreOS variables coreos_discovery (the etcd discovery URL) and coreos_channel (the CoreOS update channel). Then there is machine_metadata, which should be set at the individual machine level for fleet metadata templating.
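Putting that together, a hypothetical inventory might look like the following. The hostnames, domain and discovery token are placeholders, and the `core` SSH user is simply CoreOS's default:

```ini
# Illustrative inventory -- hostnames, domain and token are placeholders.
[demo]
core1
core2
core3

[coreos:children]
demo

[coreos:vars]
# Route Ansible's module execution through our container-backed interpreter
ansible_python_interpreter=/opt/bin/python
ansible_ssh_user=core

[demo:vars]
domain=demo.example.com
coreos_discovery=https://discovery.etcd.io/<token>
coreos_channel=stable
```

Per-machine values such as machine_metadata would then live in host_vars files rather than here.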

Provisioning CoreOS

The core of provisioning a CoreOS machine is really the cloud-config deployed to it. For this we take a simple, template-based approach.

The Ansible role hostbase is applied to all CoreOS machines and contains this template:

Most of this is pretty self-explanatory given what we've already covered. We use the variables set up at the per-host and per-pod level to create the final configuration.

The fleet metadata is templated together from the pod_metadata (metadata for a whole cluster, e.g. type=vm if you have a separate physical and virtual cluster) and the machine_metadata which is set in host_vars. We will take a closer look at this in a later post.

Users are set up from a vars file included in the task, called users.yml. They are templated into the cloud-config with GitHub keys as the SSH key provisioning method. (Note: users should be in the systemd-journal group to be able to retrieve journal entries for units. This isn't currently supported in CoreOS as the group file is read-only. It could be done by moving the entry to /etc/group, but I personally don't do that.)
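The actual template lives in the hostbase role; a stripped-down sketch of the idea looks something like this, assuming the metadata variables are lists of `key=value` strings and users.yml defines a `users` list with `name` and `github` entries:

```yaml
#cloud-config
# Simplified sketch of the hostbase cloud-config template (Jinja2).
hostname: {{ inventory_hostname }}

coreos:
  etcd:
    # Per-pod cluster discovery URL from the inventory
    discovery: {{ coreos_discovery }}
  fleet:
    # Pod-wide metadata merged with per-machine metadata from host_vars
    metadata: {{ (pod_metadata + machine_metadata) | join(',') }}
  update:
    group: {{ coreos_channel }}

users:
{% for user in users %}
  - name: {{ user.name }}
    groups: [ sudo, docker ]
    coreos-ssh-import-github: {{ user.github }}
{% endfor %}
```

The coreos-ssh-import-github field makes cloud-config fetch the user's public SSH keys from GitHub, which is what makes GitHub the key provisioning method here.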

This template is filled with the variables and placed at /var/lib/coreos-install/user_data, which is read and applied on every system boot. We also call coreos-cloudinit manually to apply the new configuration right away. This happens in cloudinit.yml.

Note: As you can see in that file, we prefix the file destination with /media/root/. This is important for all file operations you do through Ansible, as the container running our Python has the actual root partition mounted at that path. If you forget it, you will provision the container instead, which is most likely not what you want.
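In outline, the tasks in cloudinit.yml do something like the following. This is a sketch, not the real file - in particular, how the host's coreos-cloudinit binary is reached from inside the toolbox container (here via chroot into /media/root) may differ in the repository:

```yaml
# Sketch of cloudinit.yml -- illustrative; see the repository for the real file.
- name: Install cloud-config
  # The /media/root/ prefix is essential: the toolbox container mounts the
  # host's root partition there, so file operations must target it to land
  # on the actual host filesystem rather than inside the container.
  template:
    src: cloud-config.yml.j2
    dest: /media/root/var/lib/coreos-install/user_data

- name: Apply cloud-config immediately
  # One way to run the host binary from within the container
  command: chroot /media/root coreos-cloudinit --from-file=/var/lib/coreos-install/user_data
```

Without the second task the new configuration would only take effect on the next boot.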

That's it! (for now)

This is all you need - pretty easy! With this we already have a solid base for provisioning CoreOS hosts in a flexible way: we can set cluster-wide and machine-specific metadata, configure other important CoreOS settings, and use Ansible in general to configure our CoreOS hosts.

What's next?

I have several more posts planned covering related topics. This post will be updated with links once the others are out.

All of this is still a pretty young and developing topic and one reason I'm putting this out there is that I really want to get some feedback from others facing these questions, so feel free to tweet at me or comment on Reddit!