| authors | state |
|---|---|
| Hugo Hervieux ([email protected]) | implemented (v12.0) |
- Engineering: @r0mant && @tigrato && ( @gus || @programmerq )
- Product: @klizhentas || @xinding33
- Security: @reedloden
This proposal describes structural changes made to the Teleport Helm charts to achieve the following goals:
- deploy auth and proxies separately
- reduce the time to deploy in most common setups (`aws` and `standalone`)
- always support the latest Teleport features by default (reduce time-to-market)
- reduce the cost of chart maintenance
- ensure seamless updates between Teleport versions
- ensure out of the box configuration supports large scale deployments
Most self-hosted Teleport setups rely either on Helm charts or Terraform to deploy and operate Teleport. We want those two methods to become the reference ways of deploying Teleport, providing the most secure and available setup out of the box.
Helm charts should allow users to easily benefit from the best Teleport deployment they can have. This includes, but is not limited to:
- security
- maintainability
- availability
- scalability
In its current state, the Helm chart deploys an all-in-one set of pods assuming the proxy, auth, and kubernetes-access roles. Splitting responsibilities across multiple sets of pods would increase availability and scalability, and reduce the attack surface.
Helm charts are also lagging behind upstream Teleport in terms of features. The `teleport-cluster` chart configuration exposes a subset of the supported `teleport.yaml` values, but under different names. This causes unnecessary friction for the user and increases the cost of maintaining the chart configuration template.
This proposal starts by discussing the chart structure and deployed resources. The second part is dedicated to the chart values, configuration format, and backward compatibility. The third part addresses new update strategy constraints between major Teleport versions.
The resources in the chart would be split into two subdirectories, `templates/auth/` and `templates/proxy/`, to clearly identify which resource is used by which Teleport node. Common resources should be put in `templates/`.
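For illustration, the resulting layout could look like the following sketch (file names are hypothetical, not the actual chart contents):

```
teleport-cluster/
  templates/
    _helpers.tpl          # common helpers and labels
    auth/
      deployment.yaml     # teleport-auth Deployment
      config.yaml         # generated auth teleport.yaml ConfigMap
      service.yaml        # internal auth Service
    proxy/
      deployment.yaml     # teleport-proxy Deployment
      config.yaml         # generated proxy teleport.yaml ConfigMap
      service.yaml        # public load-balancer Service
```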
The chart would deploy two Deployments: one for the proxies and one for the auth nodes.

- the `teleport-proxy` Deployment: those pods are stateless by default and can be scaled up even in `standalone` mode. Deploying these nodes with a Deployment means we cannot mount persistent storage on them. As Teleport does not support graceful shutdown with record shipping, users might lose active session recordings during a rollout when using the `proxy` recording mode. Teleport nodes rely on `kube` ProvisionTokens to join the auth nodes on startup (see RFD-0094).
- the `teleport-auth` Deployment: those pods cannot be replicated without a remote backend for state and audit logs. When persistence is enabled, a single volume will be mounted on those pods and the update strategy will be "Recreate" (see the sketch after this list). For setups in which auth pods are stateless, the Deployment can be scaled up.
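As a minimal sketch (not the full generated manifest), the relevant fields of the auth Deployment when persistence is enabled could look like this; the claim name is illustrative:

```yaml
spec:
  replicas: 1
  strategy:
    type: Recreate   # the old pod must release the single volume before the new pod starts
  template:
    spec:
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: teleport-cluster   # illustrative PVC name
      containers:
        - name: teleport
          volumeMounts:
            - name: data
              mountPath: /var/lib/teleport
```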
The main LB service should send traffic to the proxies. Two additional services for in-cluster communication should be created: one for the proxies and one for the auth.
The trust between auth and proxy should be bootstrapped by creating a provisionToken on start.
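A hedged sketch of what this bootstrap could look like, assuming the `kube` join method from RFD-0094; the token name, namespace, service address, and port are illustrative:

```yaml
# ProvisionToken created by the auth pods on start, allowing the proxy service account to join.
kind: token
version: v2
metadata:
  name: my-release-proxy
spec:
  roles: ["Proxy"]
  join_method: kubernetes
  kubernetes:
    allow:
      - service_account: "my-namespace:my-release-proxy"
---
# Matching join parameters in the proxy pods' generated teleport.yaml (v3 configuration).
version: v3
teleport:
  join_params:
    method: kubernetes
    token_name: my-release-proxy
  auth_server: my-release-auth.my-namespace.svc.cluster.local:3025
```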
Deploying different pod sets requires a way to discriminate between them. The only label currently set is `app: {{ .Release.Name }}`. We should follow the Helm label recommendations:
| Label | Value | Purpose |
|---|---|---|
| `app.kubernetes.io/name` | `{{- default .Chart.Name .Values.nameOverride \| trunc 63 \| trimSuffix "-" }}` | Identify the application. |
| `helm.sh/chart` | `{{ .Chart.Name }}-{{ .Chart.Version \| replace "+" "_" }}` | This should be the chart name and version. |
| `app.kubernetes.io/managed-by` | `{{ .Release.Service }}` | It is for finding all things managed by Helm. |
| `app.kubernetes.io/instance` | `{{ .Release.Name }}` | It aids in differentiating between different instances of the same application. |
| `app.kubernetes.io/version` | `{{ .Chart.AppVersion }}` | The version of the app. |
| `app.kubernetes.io/component` | Name of the main Teleport service: `auth`, `proxy`, `kube` | This describes which Teleport component is deployed. |
Those labels should be applied to all deployed resources when applicable. This includes, but is not limited to, Pods, Deployments, ConfigMaps, Secrets, and Services.
Note: if multiple components are deployed in the same pod (e.g. auth and kube), only the main component should appear in `app.kubernetes.io/component`. This avoids label selectors having to change when services are added or removed.
The `app: {{ .Release.Name }}` label should stay on the auth pods for compatibility reasons.
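For illustration, the labels block rendered on an auth pod could then look like the following sketch:

```yaml
metadata:
  labels:
    app.kubernetes.io/name: {{ default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }}
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
    app.kubernetes.io/managed-by: {{ .Release.Service }}
    app.kubernetes.io/instance: {{ .Release.Name }}
    app.kubernetes.io/version: {{ .Chart.AppVersion }}
    app.kubernetes.io/component: auth
    # kept for backward compatibility with the previous chart's selectors
    app: {{ .Release.Name }}
```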
A single optional `PodMonitor` should be deployed per Helm release, selecting all pods based on `app.kubernetes.io/name`.
It was initially planned to allow deployment of custom resources through the chart. Unfortunately, Helm does not support deploying both a CRD and its CRs in the same release (it checks if the API is supported before deploying). This section has been removed from the RFD during implementation.
The Helm chart would still expose modes (`aws`, `gcp`, `standalone`, `custom`), but allow users to pass arbitrary additional configuration or perform specific overrides. This way, users would not have to leave the happy path if they need to set one specific value. Manually implementing all configuration knobs in Helm adds no value and brings confusion, as some values are not supported or not named the same way as the `teleport.yaml` field they set.
By leveraging Helm's templating functions `toYaml` and `fromYaml`, and Sprig's `mustMergeOverwrite`, the charts would merge their automatically-generated `teleport.yaml` with the user-provided `teleport.yaml`.
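A minimal sketch of how this merge could be written in a template, assuming a hypothetical named template `teleport-cluster.auth.config` that renders the chart-generated configuration:

```yaml
{{- /* Parse the chart-generated config, then overlay the user-provided overrides. */ -}}
{{- $generated := include "teleport-cluster.auth.config" . | fromYaml -}}
{{- $overrides := .Values.auth.teleportConfig | default dict -}}
{{- /* mustMergeOverwrite mutates $generated; user-provided values win on conflicts. */ -}}
{{- mustMergeOverwrite $generated $overrides | toYaml -}}
```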
A user deploying the chart in `standalone` mode and wanting to set key exchange algorithms, remove extra log fields, and override the kube cluster name would use the following values:
```yaml
clusterName: my-cluster
chartMode: standalone
auth:
  teleportConfig:
    teleport:
      kex_algos:
        - ecdh-sha2-nistp256
        - ecdh-sha2-nistp384
        - ecdh-sha2-nistp521
      log:
        format:
          extra_fields: ~
    kubernetes_service:
      kube_cluster_name: my-override
```
The generated chart configuration for the `standalone` mode is:
```yaml
teleport:
  log:
    severity: INFO
    output: stderr
    format:
      output: text
      extra_fields: ["timestamp","level","component","caller"]
auth_service:
  enabled: true
  cluster_name: my-cluster
  authentication:
    type: "local"
    local_auth: true
    second_factor: "otp"
kubernetes_service:
  enabled: true
  listen_addr: 0.0.0.0:3027
  kube_cluster_name: my-cluster
proxy_service:
  enabled: false
ssh_service:
  enabled: false
```
Once merged with the custom user configuration, the resulting configuration is
```yaml
auth_service:
  authentication:
    local_auth: true
    second_factor: otp
    type: local
  cluster_name: my-cluster
  enabled: true
kubernetes_service:
  enabled: true
  kube_cluster_name: my-override
  listen_addr: 0.0.0.0:3027
proxy_service:
  enabled: false
ssh_service:
  enabled: false
teleport:
  kex_algos:
    - ecdh-sha2-nistp256
    - ecdh-sha2-nistp384
    - ecdh-sha2-nistp521
  log:
    format:
      extra_fields: null
      output: text
    output: stderr
    severity: INFO
```
The proof of concept code can be found here.
The main drawback of this approach is that comments and value ordering are lost during the round-trip. This approach could be extended to support multiple configuration syntaxes, for example following a breaking change.
The `custom` mode should be removed in favor of a new `scratch` mode. Compared to the previous Helm chart, users would not provide an external ConfigMap but would pass the custom configuration through the values. This is a breaking change for them, but due to the nature of the auth/proxy split it is not possible to remain backward compatible with the `custom` mode.
In order to mitigate the risk of building an invalid configuration, the chart should run pre-install and pre-upgrade hooks validating the configuration.
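A hedged sketch of such a hook as a pre-install/pre-upgrade Job; the image, command, and ConfigMap name are placeholders rather than the actual chart implementation:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-config-check
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded,before-hook-creation
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: config-check
          image: my-teleport-image:tag   # placeholder image
          # Placeholder command: run whatever validates the mounted teleport.yaml.
          command: ["/usr/local/bin/validate-teleport-config", "/etc/teleport/teleport.yaml"]
          volumeMounts:
            - name: config
              mountPath: /etc/teleport
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: {{ .Release.Name }}-auth   # illustrative ConfigMap name
```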
Splitting auth and proxies implies breaking some logic; we will try to provide backward compatibility as much as possible. This includes being compatible with the previous installation guides and seamlessly upgrading setups created from those guides.
The revamp of the `teleport-cluster` chart should ensure the IP of the service stays the same; this requires the load-balancing Service to remain the same.
This proposal introduces two new values for users to edit the `teleport.yaml` config: `auth.teleportConfig` and `proxy.teleportConfig`. The content of those values should be merged with the generated configuration, as described in the previous section.
For example:
```yaml
auth:
  teleportConfig:
    auth_service:
      authentication:
        connector_name: "my-connector"
proxy:
  teleportConfig:
    proxy_service:
      acme:
        enabled: true
        email: [email protected]
```
The following values are core values: users must set them for the chart to work properly. They support the happy path. Those values should not be changed by the proposal as it would harm backward compatibility and user experience.
- `clusterName`
- `publicAddr`
- `chartMode`
- `aws`
- `gcp`
- `enterprise`
- `operator`
The following values are used to generate Teleport's configuration. We must continue to support them for backward compatibility, but using `*.teleportConfig` should be preferred.
- `kubeClusterName`
- `authentication`
- `authenticationType`
- `authenticationSecondFactor`
- `proxyListenerMode`
- `sessionRecording`
- `separatePostgresListener`
- `separateMongoListener`
- `kubePublicAddr`
- `mongoPublicAddr`
- `mysqlPublicAddr`
- `postgresPublicAddr`
- `sshPublicAddr`
- `tunnelPublicAddr`
- `acme`
- `acmeEmail`
- `acmeURI`
- `log`
Some values are used to configure the Kubernetes resources deploying Teleport. When specified they should apply to both auth and proxy deployments. Those values are:
- `podSecurityPolicy`
- `labels`
- `highAvailability`
- `tls`
- `image`
- `enterpriseImage`
- `affinity`
- `annotations`
- `extraArgs`
- `extraEnv`
- `extraVolumes`
- `extraVolumeMounts`
- `imagePullPolicy`
- `initContainers`
- `postStart`
- `securityContext`
- `priorityClassName`
- `tolerations`
- `probeTimeoutSeconds`
- `teleportVersionOverride`
- `resources`
A few values will have to be treated differently:

- `persistence` will only apply to the `auth` Deployment
- `service` will only apply to the `proxy` Service
- `serviceAccount.name` will apply to the auth; the proxy service account name should be the auth one suffixed with `-proxy`
Some users will need to set different values for auth and proxy pods, so the following values should also be available under `auth` and `proxy`. Those specific values should take precedence over the ones at the root (see the sketch after this list):
- `labels`
- `highAvailability` (except the `certManager` section)
- `affinity`
- `annotations`
- `extraArgs`
- `extraEnv`
- `extraVolumes`
- `extraVolumeMounts`
- `initContainers`
- `postStart`
- `tolerations`
- `teleportVersionOverride`
- `resources`
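As an illustration of this precedence, a hypothetical values file where a root-level value applies to both Deployments while `auth` and `proxy` carry their own overrides:

```yaml
# Applied to both the auth and proxy Deployments.
labels:
  team: platform

auth:
  # Auth pods get dedicated resource requests.
  resources:
    requests:
      cpu: "2"
      memory: 4Gi

proxy:
  # Proxy-specific tolerations take precedence over any root-level tolerations.
  tolerations:
    - key: dedicated
      operator: Equal
      value: proxy
      effect: NoSchedule
```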
As this RFD brings numerous value changes and adds several ways of doing the same thing, users should be provided full working examples covering various common setups.
Such examples would complement the documentation by demonstrating best practices and the capabilities of the chart.
Those examples should also be used to lint the chart.
Auth pods have to be updated before proxies. Helm does not support applying resources in a specific order.
Both auth and proxy rollouts will be triggered at the same time, but the proxy one should be held until all auth pods are rolled out. Not waiting for the full rollout will cause the load to spread unevenly across auth pods, which will be harmful at scale.
Proxies will have an initContainer checking whether all auth pods from the previous version were removed. A version check via the Teleport gRPC API (`PingResponse`) requires valid credentials to connect to Teleport. To work around this issue, we can rely on Kubernetes' DNS-based service discovery to discover how many pods are running which version:
- the chart labels auth pods with their major Teleport version
- the chart creates two headless services:
  - one selecting pods with the current major version (`teleport-auth-v11`)
  - one selecting pods with the previous major version (`teleport-auth-v10`)
- proxy pods have an initContainer
- the `v11` initContainer resolves `teleport-auth-v10` every 5 seconds until no IP is returned
- the initContainer exits, the proxy starts
- this unlocks the proxy Deployment rollout
Headless services selecting auth pods with a specific version should also contain not-ready endpoints, to ensure the rollout happens only when all pods are completely terminated. This means setting `spec.publishNotReadyAddresses: true`.
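A hedged sketch of the version-tracking headless Service and the proxy initContainer; the names, the version label key, and the exact check command are illustrative:

```yaml
# Headless Service selecting auth pods from the previous major version.
apiVersion: v1
kind: Service
metadata:
  name: teleport-auth-v10
spec:
  clusterIP: None                  # headless: DNS resolves directly to pod IPs
  publishNotReadyAddresses: true   # keep terminating pods visible until they are fully gone
  selector:
    app.kubernetes.io/name: teleport-cluster
    teleport.dev/majorVersion: "10"   # hypothetical label set on auth pods
---
# initContainer fragment on the v11 proxy pods: wait until no v10 auth pod remains.
initContainers:
  - name: wait-auth-update
    image: my-teleport-image:tag   # placeholder image providing nslookup
    command:
      - sh
      - -c
      - while nslookup teleport-auth-v10; do echo "waiting for v10 auth pods to terminate"; sleep 5; done
```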
This rollout approach might take some time on the largest Teleport deployments. This is not an issue per se, but it has to be documented, as users running with `--atomic` or `--wait` might have to increase their Helm timeouts.
Note: Teleport does not officially support multiple auth nodes running under different major versions. The recommended update approach is to scale down to a single node, update, and scale back up. In reality, most Teleport versions are backward compatible with the previous major version, so running multiple auth nodes is rarely an issue. This potential issue seems more related to Teleport than to the deployment method; it will be considered out of scope of this RFD for the sake of simplicity.