AWS CoreOS cluster provisioning with Terraform
- Overview
- Setup AWS credentials
- Install tools
- Quick start
- Customization
- Build multi-node cluster
- Manage individual platform resources
- Technical notes
This is a practical implementation of CoreOS cluster architectures built on AWS.
The cluster follows the CoreOS production cluster model, which contains an autoscaling etcd cluster and an autoscaling worker cluster for hosted containers. You can optionally add an admiral cluster for shared services such as CI, a private Docker registry, logging, and monitoring.
The entire infrastructure is managed by Terraform.
For other types of Unix clusters, see the similar repo aws-linux-cluster.
Go to AWS Console.
- Sign up for an AWS account if you don't already have one. The default EC2 instances created by this tool are covered by the AWS Free Tier.
- Create a group with a policy.
- Create a user and download the user credentials.
- Add the user to the group coreos-cluster.
If you use Vagrant, you can skip this section and go to the Quick start section.
Instructions for installing the tools on macOS:
Install Terraform
$ brew update
$ brew install terraform
or
$ mkdir -p ~/bin/terraform
$ cd ~/bin/terraform
$ curl -L -O
$ unzip
Install jq
$ brew install jq
Install AWS CLI
$ brew install awscli
or
$ sudo easy_install pip
$ sudo pip install --upgrade awscli
For other platforms, follow the tool links and the instructions on the tool sites.
$ git clone
$ cd aws-terraform
If you use Vagrant, instead of installing the tools on your host machine, you can use the Vagrantfile, which provides an Ubuntu box with all the necessary tools installed:
$ vagrant up
$ vagrant ssh
$ cd aws-terraform
$ aws configure --profile coreos-cluster
Use the downloaded AWS user credentials when prompted.
The above command will create a coreos-cluster profile section in the ~/.aws/config and ~/.aws/credentials files. The build process below will automatically configure the Terraform AWS provider credentials using this profile.
This default build will create a cluster with one etcd node and one worker node in a VPC, with application buckets for data and the necessary IAM roles, policies, key pairs, and keys. The instance type for the nodes is t2.micro. You can review the configuration and make changes if needed. See Customization for details.
$ make
... build steps info ...
... at last, shows the worker's ip:
worker public ips:
$ make show
id = etcd
availability_zones.# = 3
availability_zones.2050015877 = us-west-2c
availability_zones.221770259 = us-west-2b
availability_zones.2487133097 = us-west-2a
default_cooldown = 300
desired_capacity = 1
force_delete = true
health_check_grace_period = 0
health_check_type = EC2
launch_configuration = terraform-4wjntqyn7rbfld5qa4qj6s3tie
load_balancers.# = 0
max_size = 9
min_size = 1
name = etcd
tag.# = 1
$ ssh -A core@<worker-public-ip>
CoreOS beta (723.3.0)
core@<worker-hostname> ~ $ fleetctl list-machines
289a6ba7... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
320bd4ac... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker
$ make destroy_all
This will destroy ALL resources created by this project.
The default values for the VPC, EC2 instance profile, policies, keys, autoscaling group, launch configurations, etc. can be overridden in the resources/terraform/ files.
AWS profile and cluster name are defined at the top of Makefile:
AWS_PROFILE := coreos-cluster
CLUSTER_NAME := coreos-cluster
These can also be customized to match your AWS profile and cluster name.
The number of etcd nodes and the number of worker nodes are defined in resources/terraform/ and resources/terraform/
Change the cluster_desired_capacity in the file to build a multi-node etcd/worker cluster, for example, change it to 3:
cluster_desired_capacity = 3
Note: the etcd minimum, maximum, and cluster_desired_capacity should all be the same odd number, e.g. 3, 5, or 9.
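The odd-number rule follows from etcd's majority quorum: a cluster of n members needs floor(n/2) + 1 of them alive, so an even-sized cluster tolerates no more failures than the odd size just below it. A minimal sketch of a pre-flight check is shown here; the helper names `etcd_fault_tolerance` and `check_etcd_sizes` are hypothetical and not part of this repo.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the repo's Makefile or scripts.

# Number of member failures an etcd cluster of size $1 can survive.
# Quorum is floor(n/2) + 1, so the tolerance is n - quorum.
etcd_fault_tolerance() {
  local n=$1
  echo $(( n - (n / 2 + 1) ))
}

# Succeeds only if min, max and desired capacity are all equal and odd.
check_etcd_sizes() {
  local min=$1 max=$2 desired=$3
  [ "$min" -eq "$max" ] && [ "$max" -eq "$desired" ] && [ $(( desired % 2 )) -eq 1 ]
}

for n in 3 4 5; do
  echo "size=$n tolerates $(etcd_fault_tolerance "$n") failure(s)"
done
```

Sizes 3 and 4 both tolerate a single failure, which is why the extra even member buys nothing.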
You should also change the aws_instance_type from micro to medium or large if heavy Docker containers are to be hosted on the nodes:
image_type = "t2.medium"
root_volume_size = 12
docker_volume_size = 120
To build:
$ make all
... build steps info ...
... at last, shows the worker's ip:
worker public ips:
Login to a worker node:
$ ssh -A core@<worker-public-ip>
CoreOS beta (723.3.0)
core@ip-10-0-1-92 ~ $ etcdctl cluster-health
cluster is healthy
member 34d5239c565aa4f6 is healthy
member 5d6f4a5f10a44465 is healthy
member ab930e93b1d5946c is healthy
core@ip-10-0-1-92 ~ $ etcdctl member list
34d5239c565aa4f6: name=i-65e333ac peerURLs= clientURLs=
5d6f4a5f10a44465: name=i-cd40d405 peerURLs= clientURLs=
ab930e93b1d5946c: name=i-ecfa0d1a peerURLs= clientURLs=
core@ip-10-0-1-92 ~ $ fleetctl list-machines
0d16eb52... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
d320718e... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
f0bea88e... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=etcd2
0cb636ac... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker
4acc8d6e... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker
fa9f4ea7... env=coreos-cluster,platform=ec2,provider=aws,region=us-west-2,role=worker
You can create individual resources; the automated scripts will create any resources they depend on.
$ make help
Usage: make (<resource> | destroy_<resource> | plan_<resource> | refresh_<resource> | show | graph )
Available resources: vpc s3 route53 iam etcd worker
For example: make plan_worker # to show what resources are planned for worker
Currently defined resources:
| Resource | Description |
|---|---|
| vpc | VPC, gateway, and subnets |
| s3 | S3 buckets |
| iam | Setup a deployment user and deployment keys |
| route53 | Setup public and private hosted zones on Route53 DNS service |
| elb | Setup application ELBs |
| efs | EFS cluster |
| etcd | Setup ETCD2 cluster |
| worker | Setup application docker hosting cluster |
| admiral | Central service cluster (fleet-ui, monitoring, logging, etc) |
| docker_registry | Private docker registry cluster |
| rds | RDS servers |
| cloudtrail | Setup AWS CloudTrail |
To build the cluster step by step:
$ make init
$ make vpc
$ make etcd
$ make worker
Make commands can be re-run. If a resource already exists, the command just refreshes the Terraform state.
Each command creates a build/ directory, copies all Terraform files to it, and executes the corresponding terraform command to build the resource on AWS.
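The copy-then-run flow can be sketched roughly as follows. This is an illustration only: the function names `stage_resource` and `run_terraform`, the `SRC_DIR` and `BUILD_DIR` variables, and the flat file layout are assumptions, and the repo's actual Makefile targets may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the "copy, then run terraform in build/" flow.
SRC_DIR=${SRC_DIR:-resources/terraform}
BUILD_DIR=${BUILD_DIR:-build}

# Copy the *.tf files from SRC_DIR into BUILD_DIR, leaving the originals untouched.
stage_resource() {
  mkdir -p "$BUILD_DIR" || return 1
  cp "$SRC_DIR"/*.tf "$BUILD_DIR"/ || return 1
}

# Stage the files, then run a terraform subcommand (plan, apply, ...) inside BUILD_DIR.
run_terraform() {
  stage_resource || return 1
  ( cd "$BUILD_DIR" && terraform "$@" )
}

# Example (requires terraform and AWS credentials):
#   run_terraform plan
```

Running Terraform only inside the build directory is what keeps generated state and temporary files out of the tracked sources.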
To destroy a resource:
$ make destroy_<resource>
- The etcd cluster runs in an autoscaling group. Its size should be a fixed, odd number (1, 3, 5, ...), with cluster_desired_capacity = min_size = max_size.
- Cluster discovery is managed with the stakater/etcd-aws-cluster image. The etcd cluster forms by self-discovery through its autoscaling group, and the resulting etcd initial-cluster file is then uploaded automatically to s3://AWS-ACCOUNT-CLUSTER-NAME-cloudinit/CLUSTER-NAME_etcd/initial-cluster. Worker nodes join the cluster by downloading the initial-cluster file from that S3 bucket during their bootstrap.
- AWS resources are defined in the resources and modules directories. The build process copies all resource files from resources to a build directory. The Terraform actions are performed under build, which is ignored in .gitignore, so the original Terraform files in the repo are kept intact.
- Makefiles and shell scripts give us more flexibility for the tasks Terraform leaves uncovered, which provides streamlined build automation.
- All nodes use a common bootstrap shell script as user-data, which downloads the initial-cluster file and the node-specific cloud-config.yaml to configure the node. If the cloud-config changes, there is no need to rebuild an instance; just reboot it to pick up the change.
- The CoreOS AMI is generated on the fly to keep it up to date. The default channel can be changed in the Makefile.
- Terraform's auto-generated launch configuration names and its create_before_destroy (CBD) feature are used to allow launch configuration updates on a live autoscaling group. However, running EC2 instances in the autoscaling group have to be recycled outside of Terraform management to pick up the new launch configuration.
- For a production system, the security groups defined in the etcd, worker, and admiral modules should be carefully reviewed and tightened.
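As a small illustration of the discovery note above, the S3 location of the initial-cluster file can be derived from the account and cluster name. The helper name `initial_cluster_url` is hypothetical, and treating AWS-ACCOUNT as a plain identifier string (here a made-up 12-digit ID) is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the S3 URL where the etcd initial-cluster file is published.
# $1 = AWS account identifier (assumed format), $2 = cluster name
initial_cluster_url() {
  local account=$1 cluster=$2
  echo "s3://${account}-${cluster}-cloudinit/${cluster}_etcd/initial-cluster"
}

# Example with a made-up account ID:
initial_cluster_url 123456789012 coreos-cluster

# A node with suitable IAM permissions could then fetch the file with the AWS CLI:
#   aws s3 cp "$(initial_cluster_url "$ACCOUNT" "$CLUSTER")" /tmp/initial-cluster
```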
To control your cluster with fleet, you use the fleetctl command. Fleet has no built-in security mechanism. If you want to use fleetctl from your workstation, you need to configure fleet to use an SSH tunnel. I found that an easy way to do this is to configure the SSH user and private key in ~/.ssh/config and then export the FLEETCTL_TUNNEL variable on the command line. Like so:
Host coreos
  User core
  HostName <EC2 public IP>
  IdentityFile ~/.ssh/your_aws_private_key.pem
It doesn’t matter which instance you use as the other end of your SSH tunnel, as long as you use the EC2 instance’s public IP address. Of course the IP address in your SSH config must be the same as what you export in the environment variable.
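The export step can be sketched as below. The wrapper function `use_fleet_tunnel` is hypothetical, and 203.0.113.10 is a documentation-only placeholder; substitute the public IP of one of your own instances.

```shell
#!/usr/bin/env bash
# Hypothetical helper: point fleetctl at a cluster node through an SSH tunnel.
# $1 is the EC2 instance's public IP (the same one used as HostName in ~/.ssh/config).
use_fleet_tunnel() {
  if [ -z "$1" ]; then
    echo "usage: use_fleet_tunnel <public-ip>" >&2
    return 1
  fi
  export FLEETCTL_TUNNEL="$1"
}

# Placeholder address; replace with a real public IP from your cluster.
use_fleet_tunnel 203.0.113.10
echo "FLEETCTL_TUNNEL=$FLEETCTL_TUNNEL"
```

The variable only lasts for the current shell session, so it needs to be exported again in each new terminal.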
Also, add your private key to ssh-agent so that the ssh commands work:
ssh-add ~/.ssh/your_aws_private_key.pem
Once you’ve done this, the following command:
fleetctl list-machines
should show you the servers in your cluster:
MACHINE       IP    METADATA
015a6f3a...         -
3588db25...         -
Host coreos
  User core
  HostName <EC2 public IP>
  IdentityFile /Users/rasheed/Documents/projects/stakater/aws-terraform-xuwang/aws-terraform/build/keypairs/gocd.pem
ssh-add /Users/rasheed/Documents/projects/stakater/aws-terraform-xuwang/aws-terraform/build/keypairs/gocd.pem
fleetctl submit hello.service
fleetctl start hello.service
fleetctl status hello.service
fleetctl destroy hello.service
To see the output of the service, call:
fleetctl journal hello.service
Fleet is effectively a clustered layer on top of systemd. Fleet uses systemd unit files with an (optional) added section to tell fleet which machines it should run on. There is very little magic.
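To make the "systemd unit plus an optional section" idea concrete, here is a minimal sketch that writes a hello.service file. The unit's contents are an illustrative assumption (the original hello.service is not shown in this document); the role=worker metadata matches what fleetctl list-machines reports for worker nodes.

```shell
#!/usr/bin/env bash
# Write a minimal, hypothetical hello.service into the current directory.
# The [X-Fleet] section is the fleet-specific addition; MachineMetadata restricts
# scheduling to machines whose metadata contains role=worker.
cat > hello.service <<'EOF'
[Unit]
Description=Hello World example

[Service]
ExecStart=/bin/sh -c 'while true; do echo "Hello, world"; sleep 5; done'

[X-Fleet]
MachineMetadata=role=worker
EOF

# With fleetctl configured, the unit could then be scheduled with:
#   fleetctl submit hello.service
#   fleetctl start hello.service
```

Everything outside [X-Fleet] is plain systemd, so the same file can also be tested locally with systemctl.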
list systemd units
systemctl list-units | grep fleet
systemctl restart fleet.service
introduction to systemd:
introduction to fleet:
- Don't modify the cluster name. If you do, please update the "" as well, specifically this path:
# Bucket path for the cloud-config.yaml
Two types of units can be run in your cluster — standard and global units. Standard units are long-running processes that are scheduled onto a single machine. If that machine goes offline, the unit will be migrated onto a new machine and started.
Global units will be run on all machines in the cluster.
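A global unit differs from a standard one only in its fleet section. The sketch below writes a hypothetical hello-global.service (the file name and service body are assumptions for illustration) that fleet would run on every machine.

```shell
#!/usr/bin/env bash
# Write a hypothetical global unit; Global=true asks fleet to run it on all machines.
cat > hello-global.service <<'EOF'
[Unit]
Description=Hello World on every machine

[Service]
ExecStart=/bin/sh -c 'while true; do echo "Hello, world"; sleep 5; done'

[X-Fleet]
Global=true
EOF

# With fleetctl configured:
#   fleetctl start hello-global.service
#   fleetctl list-unit-files   # shows whether the unit is global
```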
The fleet logs (sudo journalctl -u fleet) will provide more clarity on what’s going on under the hood.
There are two fleetctl commands to view units in the cluster: list-unit-files, which shows the units that fleet knows about and whether or not they are global, and list-units, which shows the current state of units actively loaded into machines in the cluster.
$ fleetctl list-unit-files
You can view all of the machines in the cluster by running list-machines:
$ fleetctl list-machines
$ fleetctl list-units
Check the fleet service to see what errors it gives us:
$ systemctl status -l fleet
For each of our essential services, we should check the status and logs. The general way of doing this is:
systemctl status -l <service>
journalctl -b -u <service>
If we check the etcd logs, we will see something like this:
journalctl -b -u etcd
When your CoreOS machine processes the cloud-config file, it generates stub systemd unit files that it uses to start up fleet and etcd. To see the systemd configuration files that were created and are being used to start your services, change to the directory where they were dropped:
cd /run/systemd/system
ls -F
to list all units
Services usually fail because of a missing dependency (e.g. a file or mount point), missing configuration, or incorrect permissions. In this example we see that the dev-mqueue unit with type mount fails. As the type is a mount, the reason is most likely because mounting a particular partition failed.
By using the systemctl status command we can see the details of the dev-mqueue.mount unit:
[root@localhost ~]# systemctl status dev-mqueue.mount
online tool to validate cloud-config
Can you check to see if the service is enabled (systemctl is-enabled etcd2)? If it's not enabled, it may be a dependency of something that is enabled. You can test with systemctl list-dependencies etcd2 --reverse
check status of a service
systemctl status -l gocd
There are a few things worth pointing out:
- The container is clearly dependent on having Docker running, hence the Requires line. The After line is also needed to avoid race conditions.
- Before we start the container, we first stop and remove any existing container with the same name and then pull the latest version of the image. The “-” at the start means systemd won’t abort if the command fails.
- This means that our container will be started from scratch each time. If you want to persist data then you’ll need to do something with volumes or volume containers, or change the code to restart the old container if it exists.
- We’ve used TimeoutStartSec=0 to turn off timeouts, as the docker pull may take a while.
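The unit file these points describe is not reproduced in this document, so the following is a reconstruction under stated assumptions: the file name example-container.service, the container name example, and the busybox image are all placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Write a hypothetical unit that runs a Docker container under systemd.
# - Requires/After tie the unit to docker.service and avoid start-up races.
# - The "-" prefix on ExecStartPre lets stop/rm fail (e.g. no old container) without aborting.
# - TimeoutStartSec=0 disables the start timeout because docker pull may be slow.
cat > example-container.service <<'EOF'
[Unit]
Description=Example Docker container unit
Requires=docker.service
After=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker stop example
ExecStartPre=-/usr/bin/docker rm example
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name example busybox /bin/sh -c "while true; do echo hello; sleep 5; done"
EOF
```

Because the old container is removed on every start, any data written inside it is lost unless it lives on a volume.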
You can check a unit's status with:
$ sudo systemctl status gocd-agent-1
Or the unit logs by:
$ sudo journalctl -exu gocd-agent-1
Usually, the log output will tell you what's going on. To see a container's own output:
docker logs <CONTAINER_NAME>
I don't know what is SIGKILL'ing the process. Perhaps there is something in the full system journal around that time that might indicate the cause (journalctl --since "2015-03-20 08:49")? Try running dmesg too; maybe the kernel is killing it.
Step 1: get into the coreos machine:
ssh -i /home/vagrant/aws-terraform/build/keypairs/gocd.pem core@
Step 2: get list of running docker containers
docker ps
Step 3: to check logs of particular container/service
journalctl -exu gocd-agent-1
journalctl -exu gocd-agent-cd-prod.service
Step 4: