authors | state |
---|---|
Alex McGrath ([email protected]) | implemented |
Proposes a way by which an SSH service might automatically discover and register AWS EC2 instances.
Currently, when adding a new AWS server, Teleport must be installed after the server has been provisioned, and the server must then be added to the Teleport cluster. This can be a slow process for organizations with large numbers of servers.
With the changes described in this document, Teleport will be able to discover AWS servers and add them to Teleport clusters automatically.
A new service will be introduced for general purpose cloud resource discovery: `discovery_service`. Initially, it will only support EC2 discovery.
Discovery will use a matcher similar to the `db_service/aws` matcher; however, EC2 discovery will have an optional install command, set of join parameters, and script to use when joining:
discovery_service:
  enabled: "yes"
  aws:
    - types: ["ec2"]
      regions: ["eu-central-1"]
      tags:
        "teleport": "yes"
      install:
        join_params:
          token_name: "aws-discovery-iam-token" # default value
        script_name: "default-installer" # default value
      ssm:
        document: "TeleportDiscoveryInstaller" # default value
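As a rough illustration of how such a matcher might select instances, the sketch below checks an instance's region and tags against one matcher entry. The `AWSMatcher` class and `matches` function are hypothetical names introduced here, and exact tag equality is an assumption; the real matcher may also support wildcards or lists of values.

```python
from dataclasses import dataclass, field


@dataclass
class AWSMatcher:
    """Hypothetical in-memory form of one `discovery_service.aws` entry."""
    types: list
    regions: list
    tags: dict = field(default_factory=dict)


def matches(matcher, region, instance_tags):
    """Return True when an instance in `region` with `instance_tags` is selected."""
    if "ec2" not in matcher.types:
        return False
    if region not in matcher.regions:
        return False
    # Assumption: every matcher tag must be present on the instance with an equal value.
    return all(instance_tags.get(key) == value for key, value in matcher.tags.items())


matcher = AWSMatcher(types=["ec2"], regions=["eu-central-1"], tags={"teleport": "yes"})
print(matches(matcher, "eu-central-1", {"teleport": "yes", "env": "example"}))  # True
print(matches(matcher, "us-west-1", {"teleport": "yes"}))  # False
```

Extra tags on the instance do not prevent a match in this sketch; only the tags named in the matcher are compared.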
The agent will use EC2's `DescribeInstances` API in order to list instances [1]. This will require the Teleport SSH agent to include `ec2:DescribeInstances` as part of its IAM permissions.
As with AWS database discovery, new EC2 nodes will be discovered periodically on a 60-second timer. As new nodes are found, they will be added to the Teleport cluster.
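The periodic discovery described above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `poll`, `describe`, and `register` are hypothetical names, `describe` stands in for a call to `DescribeInstances`, and the sleep function is injectable so the example runs instantly.

```python
import time

POLL_INTERVAL_SECONDS = 60  # interval taken from the text above


def poll(describe, register, cycles, sleep=time.sleep):
    """Call `describe` once per cycle and `register` each instance ID seen for the first time."""
    seen = set()
    for cycle in range(cycles):
        for instance_id in describe():
            if instance_id not in seen:
                seen.add(instance_id)
                register(instance_id)
        if cycle < cycles - 1:
            sleep(POLL_INTERVAL_SECONDS)
    return seen


# Fake data: the second poll returns one instance that was not seen before.
batches = iter([["i-aaa"], ["i-aaa", "i-bbb"]])
registered = []
poll(lambda: next(batches), registered.append, cycles=2, sleep=lambda _seconds: None)
print(registered)  # ['i-aaa', 'i-bbb']
```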
To avoid attempting to reinstall Teleport on an instance where it is already present, the generated Teleport config will name the node using the AWS account ID and instance ID, so that discovered instances can be matched against existing node names.
Example:
{
  "kind": "node",
  "version": "v2",
  "metadata": {
    "name": "${AWS_ACCOUNT_ID}-${AWS_INSTANCE_ID}",
    "labels": {
      "env": "example",
      "teleport.dev/discovered-node": "yes"
    }
  },
  "spec": {
    "public_addr": "ec2-54-194-252-215.us-west-1.compute.amazonaws.com",
    "hostname": "awsxyz"
  }
}
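The name-based check could look something like the sketch below. The helper names `node_name` and `needs_install` are hypothetical, and the set of existing node names is assumed to come from the cluster's node list.

```python
def node_name(account_id, instance_id):
    """Build the node name in the `<account id>-<instance id>` form shown above."""
    return f"{account_id}-{instance_id}"


def needs_install(account_id, instance_id, existing_node_names):
    """Return True when no node with the derived name is already in the cluster."""
    return node_name(account_id, instance_id) not in existing_node_names


existing = {"123456789012-i-0aaa"}
print(needs_install("123456789012", "i-0aaa", existing))  # False
print(needs_install("123456789012", "i-0bbb", existing))  # True
```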
In order to install the Teleport agent on EC2 instances, Teleport will serve an install script at `/webapi/scripts/{installer-resource-name}`. Installer scripts will be editable as a resource.
Example resource script:
kind: installer
metadata:
  name: "installer" # default value
spec:
  # shell script that will be downloaded and run by the EC2 node
  script: |
    #!/bin/sh
    curl https://.../teleport-pubkey.asc ...
    echo "deb [signed-by=... stable main" | tee ... > /dev/null
    apt-get update
    apt-get install -y teleport
    teleport node configure --auth-server=... --join-method=iam --token-name=iam-
  # Any resource in Teleport can automatically expire.
  expires: 0001-01-01T00:00:00Z
Unless overridden by a user, a default Teleport installer command will be generated that is appropriate for the current running version and operating system. Initially, this will support the DEB- and RPM-based distros that Teleport already provides packages for.
The user must create a custom SSM Command document that will be used to execute the served command. The instance of Teleport doing discovery will attempt to automatically create the SSM document.
Example SSM aws:runCommand document:
# name: installTeleport
---
schemaVersion: '2.2'
description: aws:runShellScript
parameters:
  token:
    type: String
    description: "(Required) The Teleport invite token to use when joining the cluster."
mainSteps:
  - action: aws:downloadContent
    name: downloadContent
    inputs:
      sourceType: "HTTP"
      destinationPath: "/tmp/installTeleport.sh"
      sourceInfo:
        url: "https://teleportcluster.xyz/webapi/scripts/installer"
  - action: aws:runShellScript
    name: runShellScript
    inputs:
      timeoutSeconds: '300'
      runCommand:
        - /bin/sh /tmp/installTeleport.sh "{{ token }}"
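To show how the document's `token` parameter might be supplied, the sketch below builds the request fields for an SSM `SendCommand` call. The function name `send_command_request` and the sample values are hypothetical. The field names follow the `SendCommand` API, so with boto3 installed the dictionary could be passed as `boto3.client("ssm").send_command(**request)`; that call is left out here to keep the example self-contained.

```python
def send_command_request(document_name, instance_ids, token):
    """Build the SendCommand fields that run `document_name` on the given instances."""
    return {
        "DocumentName": document_name,
        "InstanceIds": list(instance_ids),
        # SSM expects each parameter value as a list of strings.
        "Parameters": {"token": [token]},
    }


request = send_command_request("installTeleport", ["i-0aaa", "i-0bbb"], "example-token")
print(request["Parameters"])  # {'token': ['example-token']}
```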
In order to run the new SSM document, the AWS user will need IAM permissions to run SSM commands [3]. The example below allows running commands on all us-west-2 instances and allows running the `installTeleport` document on those instances. The `ssm:CreateDocument` and `ssm:GetDocument` permissions are required to automatically create the document:
{
  "Statement": [
    {
      "Action": "ssm:SendCommand",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:ec2:us-west-2:*:instance/*",
        "arn:aws:ssm:us-west-2:aws-account-ID:document/installTeleport"
      ]
    },
    {
      "Action": "ssm:CreateDocument",
      "Effect": "Allow",
      "Resource": [ "*" ]
    },
    {
      "Action": "ssm:GetDocument",
      "Effect": "Allow",
      "Resource": [ "*" ]
    }
  ]
}
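Because JSON does not allow comments, one way to keep a policy like this both explained and valid is to generate it. The sketch below is only an illustration: `discovery_policy` is a hypothetical helper, the `Version` field is added here, and the instance resource uses the EC2 ARN form (`arn:aws:ec2:...:instance/*`), which is an assumption about how instances are addressed for `ssm:SendCommand`.

```python
import json


def discovery_policy(region, account_id, document_name):
    """Build an IAM policy for sending one SSM document to instances in one region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "ssm:SendCommand",
                "Resource": [
                    # Instances that may receive commands.
                    f"arn:aws:ec2:{region}:{account_id}:instance/*",
                    # The document that may be run on them.
                    f"arn:aws:ssm:{region}:{account_id}:document/{document_name}",
                ],
            },
            {
                # Needed only if the document is created automatically.
                "Effect": "Allow",
                "Action": ["ssm:CreateDocument", "ssm:GetDocument"],
                "Resource": ["*"],
            },
        ],
    }


policy = discovery_policy("us-west-2", "123456789012", "installTeleport")
print(json.dumps(policy, indent=2))
```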
The machines being discovered will need to allow receiving `ec2messages` in order to receive the SSM commands:
{
  "Statement": [
    {
      "Action": "ec2messages:GetMessages",
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
On AWS, Amazon Linux and Ubuntu LTS (16.04, 18.04, 20.04) come with the SSM agent preinstalled [4].
In order to allow nodes to create tokens for the purposes of sending invites to EC2 instances, a new system role, `RoleNodeDiscovery`, will be added with permissions to create tokens.
Each EC2 instance that is to be discovered will also require an attached IAM role in order to send and receive messages for the SSM agent.
Example:
{
  "Statement": [
    {
      "Action": "ec2messages:*",
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ]
}
The `teleport node configure` subcommand will be used to generate a new /etc/teleport.yaml file:
# --auth-server is the auth server that is being connected to
# --token is passed via parameter from the SSM document
teleport node configure \
  --auth-server=auth-server.example.com \
  --token="$1" \
  --labels="teleport.dev/instance-id=${INSTANCE_ID},teleport.dev/account-id=${ACCOUNT_ID}"
This will generate a file with the following contents:
teleport:
  nodename: "$accountID-$instanceID"
  auth_servers:
    - "auth-server.example.com:3025"
  join_params:
    token_name: token
discovery_service:
  enabled: yes
  labels:
    teleport.dev/origin: "cloud"
In addition to supporting automatic Teleport agent installation, an agentless option will also be supported. This mode will update the OpenSSH CA to use the Teleport CA without installing the full Teleport Agent.
A new `teleport join` command will be added. It will identify itself to the cluster using an EC2 join token in order to fetch the Teleport CA and generate host keys. This command will also modify the sshd config to make use of the fetched keys.
This mode can be enabled by setting `agentless: true` in the matcher. When the matcher includes this, a predefined script for agentless installation will be used for the endpoint.
Example agentless config:
discovery_service:
  enabled: "yes"
  aws:
    - types: ["ec2"]
      regions: ["us-west-1"]
      tags:
        "teleport": "yes" # aws tags to match
      install:
        install_teleport: true # default value
        # default to this as a result of agentless: true
        script_name: "default-agentless-installer"
        sshd_config: "/etc/ssh/sshd_config" # default path
      ssm:
        # default to this as a result of agentless: true
        document_name: "TeleportAgentlessDiscoveryInstaller"
An agentless-specific SSM document will be required. The `teleport discovery bootstrap` command will need to be updated to create SSM documents appropriate for agentless discovery.
Example SSM document:
# name: TeleportAgentlessDiscoveryInstaller
---
schemaVersion: '2.2'
description: aws:runShellScript
parameters:
  sshdConfigPath:
    type: String
    description: "(Required) The path to the sshd config file."
  token:
    type: String
    description: "(Required) The Teleport invite token to use when joining the cluster."
  certificateRotation:
    type: String
    description: "Indicates whether this discovery execution is being run as a result of a cert rotation"
mainSteps:
  - action: aws:downloadContent
    name: downloadContent
    inputs:
      sourceType: "HTTP"
      destinationPath: "/tmp/installTeleport.sh"
      sourceInfo:
        url: "https://teleportcluster.xyz/webapi/scripts/default-agentless-installer"
  - action: aws:runShellScript
    name: runShellScript
    inputs:
      timeoutSeconds: '300'
      runCommand:
        - export CERTIFICATE_ROTATION='{{ certificateRotation }}'
        - export SSHD_CONFIG='{{ sshdConfigPath }}'
        - /bin/sh /tmp/installTeleport.sh '{{ token }}'
Agentless mode will serve a different install script resource named `default-agentless-installer`, which will be used to update the sshd configuration and restart sshd.
Possible agentless installer script:
(
  flock -n 9 || exit 1
  if grep -q 'TrustedUserCAKeys /etc/ssh/teleport_user_ca.pub' "$SSHD_CONFIG"; then
    # sshd is already configured; only rejoin when a cert rotation was requested
    if [ -n "$CERTIFICATE_ROTATION" ]; then
      IMDS_TOKEN=$(curl -m5 -sS -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
      PUBLIC_IP=$(curl -m5 -sS -H "X-aws-ec2-metadata-token: ${IMDS_TOKEN}" "http://169.254.169.254/latest/meta-data/public-ipv4")
      sudo teleport join \
        --openssh-config="$SSHD_CONFIG" \
        --join-method=iam \
        --token="$1" \
        --proxy-server="{{ .PublicProxyAddr }}" \
        --additional-principals="$PUBLIC_IP" \
        --restart-sshd
    fi
    exit 0
  fi
  # /etc/os-release defines ID (for example "ubuntu" or "amzn")
  . /etc/os-release
  distro_id="$ID"
  if [ "$distro_id" = "debian" ] || [ "$distro_id" = "ubuntu" ]; then
    # ... add teleport repo as in other script
    sudo apt-get install -y teleport
  elif [ "$distro_id" = "amzn" ] || [ "$distro_id" = "rhel" ]; then
    sudo yum install -y teleport
  fi
  IMDS_TOKEN=$(curl -m5 -sS -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
  PUBLIC_IP=$(curl -m5 -sS -H "X-aws-ec2-metadata-token: ${IMDS_TOKEN}" "http://169.254.169.254/latest/meta-data/public-ipv4")
  # new command to create the host certs, teleport ca, and update sshd_config
  sudo teleport join \
    --openssh-config="$SSHD_CONFIG" \
    --join-method=iam \
    --token="$1" \
    --proxy-server="{{ .PublicProxyAddr }}" \
    --additional-principals="$PUBLIC_IP" \
    --restart-sshd \
    --rotate-certs
  systemctl restart sshd
) 9>/var/lock/teleport_install.lock
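The branching in the script above can be summarized with a small model. The sketch below is only a reading aid: `plan` is a hypothetical function that returns the steps the script would take for a given sshd config and rotation flag, without running anything.

```python
TRUSTED_CA_LINE = "TrustedUserCAKeys /etc/ssh/teleport_user_ca.pub"


def plan(sshd_config_text, certificate_rotation):
    """Return the ordered steps the installer script would perform."""
    already_configured = TRUSTED_CA_LINE in sshd_config_text
    if already_configured and certificate_rotation:
        # Already set up, rotation requested: only run `teleport join` again.
        return ["join"]
    if already_configured:
        # Already set up and no rotation: nothing to do.
        return []
    # Fresh instance: install the package, join, then restart sshd.
    return ["install", "join", "restart-sshd"]


configured = "Port 22\n" + TRUSTED_CA_LINE + "\n"
print(plan(configured, ""))      # []
print(plan(configured, "true"))  # ['join']
print(plan("Port 22\n", ""))     # ['install', 'join', 'restart-sshd']
```

The value `"true"` for the rotation flag is an assumption; the script only checks that the variable is non-empty.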
A parameter will be added to the SSM document to indicate that a cert rotation is being done.
The discovery agent will listen for certificate rotations and run `teleport join --rotate-certs --...` on the agentless nodes already present in the cluster.
When rotating certs, the `teleport join` command will fetch the OpenSSH CA and overwrite the existing file.
Discovery server:
teleport:
  ...
auth_service:
  enabled: "yes"
discovery_service:
  enabled: "yes"
  aws:
    - types: ["ec2"]
      regions: ["eu-central-1"]
      tags:
        "teleport": "yes"
      install:
        join_params:
          token_name: aws-discovery-iam-token # default value
      ssm:
        document: "TeleportDiscoveryInstaller" # default value
An SSM document must be created to download and run the teleport install script. The script will be generated using a configuration appropriate for the system running Teleport.
# name: installTeleport
---
schemaVersion: '2.2'
description: aws:runShellScript
parameters:
  token:
    type: String
    description: "(Required) The Teleport invite token to use when joining the cluster."
mainSteps:
  - action: aws:downloadContent
    name: downloadContent
    inputs:
      sourceType: "HTTP"
      destinationPath: "/tmp/installTeleport.sh"
      sourceInfo:
        url: "https://teleportcluster.xyz/webapi/scripts/installer"
  - action: aws:runShellScript
    name: runShellScript
    inputs:
      timeoutSeconds: '300'
      runCommand:
        - /bin/sh /tmp/installTeleport.sh "{{ token }}"
The discovery node should have IAM permissions to call `ssm:SendCommand`, limited to the `installTeleport` document. The example below allows running commands on all instances and allows running the `installTeleport` document:
{
  "Statement": [
    {
      "Action": "ssm:SendCommand",
      "Effect": "Allow",
      "Resource": [
        "*",
        "arn:aws:ssm:*:aws-account-ID:document/installTeleport"
      ]
    }
  ]
}
The SSH discovery node should have permission to call `ec2:DescribeInstances`, for example on all EC2 instances with SSM available:
{
  "Statement": [
    {
      "Action": [
        "ec2:DescribeInstances"
      ],
      "Effect": "Allow",
      "Resource": [
        "*"
      ]
    }
  ]
}
Nodes being discovered will need permission to call `ec2messages:GetMessages`:
{
  "Statement": [
    {
      "Action": "ec2messages:GetMessages",
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}
In the future the option to include a list of IAM roles to assume for different accounts may be included:
discovery_service:
  enabled: "yes"
  aws:
    - types: ["ec2"]
      regions: ["us-west-1"]
      tags:
        "teleport": "yes"
      ssm_command_document: ssm_command_document_name
      roles: # list of ARNs for IAM roles to assume
        - "arn:aws:iam::222222222222:role/teleport-DescribeInstancesInstall-role"
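If such a `roles` list were added, discovery would need to know which account each role belongs to. The sketch below shows one way to read the account ID out of an IAM role ARN; `account_id_from_role_arn` is a hypothetical helper, and this is not a claim about how Teleport would implement it.

```python
def account_id_from_role_arn(arn):
    """Extract the 12-digit account ID from an ARN of the form arn:aws:iam::<account>:role/<name>."""
    parts = arn.split(":")
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "iam" or not parts[5].startswith("role/"):
        raise ValueError(f"not an IAM role ARN: {arn!r}")
    return parts[4]


roles = ["arn:aws:iam::222222222222:role/teleport-DescribeInstancesInstall-role"]
# Map each account to the role that discovery would assume in it.
roles_by_account = {account_id_from_role_arn(arn): arn for arn in roles}
print(list(roles_by_account))  # ['222222222222']
```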