I spent the weekend playing with CoreOS. Here are some things that I found interesting.
systemd.unit
“Universal” Unit Configuration
I think the first thing to really understand are the concepts behind systemd units. They aren’t particularly tricky, and going through the entire official document is very enlightening. The main things I took away from it are the following:
[Unit] section
This section is primarily for a simple description of the unit (for use in log
and status output) and dependency definitions. The main attributes for defining
dependencies are Requires
, Before
, and After
. It is important to note
that none of these implies the other. As an example, if serviceA
depends on
serviceB
to be started successfully as a prerequisite, the unit file for
serviceA
would need to indicate that with both Requires
and After
attributes set to serviceB
.
[Install] section
I think it’s important to understand this section, but nothing about it is
worth pointing out. Well, except that fleet
managed service units on a CoreOS
cluster will explicitly and intentionally exclude this section. Instead, a
service unit configuration will likely include some attributes in an X-Fleet
section instead. More on that later.
systemd.service
On a CoreOS cluster, most user services will be configured through service
configuration files and scheduled by fleet
.
[Service] section
The Type
attribute is important to understand. I found that the oneshot
value can be used for something like initializing a dataset. For instance,
I might have a database.service
of Type simple
. Then a db-init.service
of Type oneshot
which Requires
and runs After
the database.service
unit to initialize the dataset Before
a webapp.service
unit can be
scheduled.
This can also be useful for Docker volume containers, too. In this case, you
would probably want to also toggle the RemainAfterExit
boolean. This is
mostly because convention for volume containers is that they don’t have a long
running service, but rather usually just an echo
and quick exit.
There are a few Exec*
attributes available to service units and they are
generally pretty self explanatory. The main ones being ExecStart
/ExecStop
,
ExecStartPre
/ExecStartPost
, and ExecStopPost
. The value of these options
needs to be an absolute path to an executable followed by arguments to that
command. If this command filename is prefixed with -
, the exit code of that
command is ignored.
It’s also important to note that while the command syntax intended to be similar to shell syntax, there are many aspects of shell syntax which are intentionally unsupported.
cloud-config
CoreOS recognizes a subset of configuration items of the cloud-init project’s
cloud-config file. This is generally used to initialize individual nodes of a
cluster through a cloud provider’s user-data option. This usually starts by
generating a new etcd
discovery token.
The trick here is that your cluster won’t be considered healthy until you have
the minimum number of nodes in your cluster. You configure that when you
generate your discovery token. If you don’t explicitly set a cluster size, the
default is 3 nodes. This was tripping me up, as I couldn’t figure out why the
fleet service was failing when I had just one node in the cluster. Looking
back, this seems a little more obvious.
One notable component is the coreos.fleet.metadata
attribute. You might use this to distinguish between cluster nodes in different
regions, for instance.
fleet
CoreOS expects that you will be managing service units on your cluster through
an abstraction provided as fleet
. These service units with be Docker
containers, since that’s the only real service container supported natively at
the time of this writing. You will be defining service units as you would for
systemd, except without an [Install]
section and you might optionally include
some fleet-specific properties
in an [X-Fleet]
section. You might use these options to manage where your
service is scheduled on your cluster.