Monitoring CoreOS Clusters

UPDATE: Related to monitoring CoreOS clusters, we have recently optimized the SPM setup on CoreOS and integrated a logging gateway to Logsene into the SPM Agent for Docker. You can read about it in "Centralized Log Management and Monitoring for CoreOS Clusters".



In this post you’ll learn how to get operational insights (e.g. performance metrics and container events) from CoreOS, and how etcd, fleet, and SPM make that super simple.

We’ll use:

  • SPM for Docker to run the monitoring agent as a Docker container and collect Docker metrics and events for all other containers on the same host, plus metrics for the host itself
  • fleet to seamlessly distribute this container to all hosts in the CoreOS cluster by simply providing it with the fleet unit file shown below
  • etcd to set a property to hold the SPM App token for the whole cluster

The Big Picture

Before we get started, let’s take a step back and look at our end goal.  What do we want?  We want charts with Performance Metrics, we want Event Collection, we’d love integrated Anomaly Detection and Alerting, and we want that not only for containers, but also for hosts running containers.  CoreOS has no package manager and deploys services in containers, so we want to run the SPM agent in a Docker container, as shown in the following figure:

[Figure: SPM for Docker]

By the end of this post each of your Docker hosts could look like the above figure, with one or more of your own containers running your own apps, and a single SPM Docker Agent container that monitors all your containers and the underlying hosts.

3 Simple Steps

1)  Create a new SPM App of type “Docker” and copy the SPM App Token

2) Set the SPM App Token via etcd. This makes the token instantly available to all SPM agent instances in the cluster:

etcdctl set /sematext.com/myapp/spm/token YOUR_SPM_APP_TOKEN

Of course, you can change the “myapp” part to whatever you want.  This simply acts as a namespace in etcd in case you have multiple SPM Apps (and thus multiple SPM App Tokens).
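
If you do run several SPM Apps, the namespace scheme can be wrapped in a small helper so the key is always built the same way. This is only a sketch: the function names `spm_token_key` and `spm_token_store` are made up for illustration and are not part of etcdctl or SPM. It uses nothing beyond the `etcdctl set` and `etcdctl get` commands already shown in this post.

```shell
#!/bin/sh
# Hypothetical helper: build the etcd key for a given app namespace.
spm_token_key() {
  printf '/sematext.com/%s/spm/token\n' "$1"
}

# Hypothetical helper: store a token under the app's key, then read it
# back to confirm that the value in etcd matches what was written.
spm_token_store() {
  app="$1"
  token="$2"
  key=$(spm_token_key "$app")
  etcdctl set "$key" "$token" >/dev/null || return 1
  [ "$(etcdctl get "$key")" = "$token" ]
}

# Example (run on a cluster node where etcdctl is available):
#   spm_token_store myapp YOUR_SPM_APP_TOKEN && echo "token stored"
```

Remember that the key in the fleet unit file has to use the same namespace as the one you set here.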

3) Grab the spm-agent.service fleet unit file and submit it to fleet:

# download service file for sematext-agent-docker
wget https://raw.githubusercontent.com/sematext/sematext-agent-docker/master/coreos/spm-agent.service
# Load and start the service with
fleetctl load spm-agent.service
fleetctl start spm-agent.service
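
Since the unit file sets Global=true in its [X-Fleet] section, the service should end up on every machine in the cluster. A quick way to check is to filter the output of `fleetctl list-units`. The sketch below assumes the default column layout (UNIT, MACHINE, ACTIVE, SUB); the function name `spm_units_not_active` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `fleetctl list-units` output on stdin and print
# the machines where spm-agent.service is not in the "active" state.
# Assumes the default columns: UNIT MACHINE ACTIVE SUB.
spm_units_not_active() {
  awk '$1 == "spm-agent.service" && $3 != "active" { print $2 }'
}

# Example (run on a cluster node):
#   fleetctl list-units | spm_units_not_active
# No output means every listed instance of the agent is active.
```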

Fleet unit file

What’s this fleet unit file about?  It’s simple.  It reads the SPM App Token from etcd and then starts the Docker container with sematext-agent-docker inside. This is what it looks like:

[Unit]
Description=SPM Docker Agent
After=docker.service
Requires=docker.service

[Service]
TimeoutStartSec=0
EnvironmentFile=/etc/environment
Restart=always
RestartSec=30s
ExecStartPre=-/usr/bin/docker kill spm-agent
ExecStartPre=-/usr/bin/docker rm spm-agent
ExecStartPre=/usr/bin/docker pull sematext/sematext-agent-docker:latest
ExecStart=/bin/sh -c 'set -ex; /usr/bin/docker run --name spm-agent -e SPM_TOKEN=$(etcdctl get /sematext.com/myapp/spm/token) -e HOSTNAME=$HOSTNAME -v /var/run/docker.sock:/var/run/docker.sock sematext/sematext-agent-docker'
ExecStop=/usr/bin/docker stop spm-agent

[Install]
WantedBy=multi-user.target

[X-Fleet]
Global=true

After about a minute, you should see Docker metrics and events in SPM.
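
If nothing shows up, an empty token is worth ruling out first. The ExecStart line substitutes the result of `etcdctl get` straight into the `docker run` arguments, so a missing key leads to an empty SPM_TOKEN rather than a failed start. The sketch below, with the hypothetical function name `spm_agent_run_cmd`, refuses to build the command when the token is empty and prints the command instead of running it, so you can inspect it on a host.

```shell
#!/bin/sh
# Hypothetical wrapper: build the same docker command as the unit file,
# but fail early when the token is empty. Prints the command only.
spm_agent_run_cmd() {
  token="$1"
  host="$2"
  if [ -z "$token" ]; then
    echo "SPM token is empty; check the etcd key" >&2
    return 1
  fi
  printf '%s\n' "/usr/bin/docker run --name spm-agent -e SPM_TOKEN=$token -e HOSTNAME=$host -v /var/run/docker.sock:/var/run/docker.sock sematext/sematext-agent-docker"
}

# Example (run on a cluster node):
#   spm_agent_run_cmd "$(etcdctl get /sematext.com/myapp/spm/token)" "$HOSTNAME"
```

The container name spm-agent is chosen here to match the kill, rm, and stop lines of the unit file.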


Open Sourced Everything

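Everything described here is open source. The fleet unit file used above lives in the sematext-agent-docker repository on GitHub (https://github.com/sematext/sematext-agent-docker).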

Summary – What this gets you

What we get after this setup is the following:

Having this little setup lets you take full advantage of SPM and Logsene, e.g. by defining intelligent alerts for metrics and logs, delivered to channels like e-mail, PagerDuty, Slack, HipChat, or any WebHook, and by correlating performance metrics, events, logs, and alerts.

Running CoreOS? Need any help getting CoreOS metrics and/or logs into SPM & Logsene?  Let us know!  Oh, and if you’re a small startup — ping @sematext — you can get a good discount on both SPM and Logsene!
