This is a mono repository for my home infrastructure and Kubernetes cluster. I try to adhere to Infrastructure as Code (IaC) and GitOps practices using tools like Ansible, Terraform, Kubernetes, Flux, Renovate, and GitHub Actions.
There is a template over at onedr0p/flux-cluster-template if you want to try and follow along with some of the practices I use here.
My cluster is k3s provisioned overtop bare-metal Debian using the Ansible galaxy role ansible-role-k3s. This is a semi-hyper-converged cluster: workloads and block storage share the same available resources on my nodes, while I have a separate server for (NFS) file storage.
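For illustration, a minimal play using that galaxy role might look like the sketch below; the inventory group, the role name on Ansible Galaxy (`xanmanning.k3s`), and the version pin are placeholders for the example rather than values from this repository.

```yaml
# Illustrative playbook: host group and version are placeholders, not this repo's real values
- name: Provision k3s on the Debian nodes
  hosts: k3s_cluster                     # assumed inventory group
  become: true
  vars:
    k3s_release_version: v1.29.4+k3s1    # example pin
  roles:
    - xanmanning.k3s                     # ansible-role-k3s as published on Ansible Galaxy
```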
Click here to see my Ansible playbooks and roles.
- actions-runner-controller: self-hosted GitHub Actions runners
- cilium: internal Kubernetes networking plugin
- cert-manager: creates SSL certificates for services in my cluster
- external-dns: automatically syncs DNS records from my cluster ingresses to a DNS provider
- external-secrets: manages Kubernetes secrets using 1Password Connect (see the sketch after this list).
- ingress-nginx: ingress controller for Kubernetes using NGINX as a reverse proxy and load balancer
- rook: distributed block storage for persistent storage
- sops: manages secrets for Kubernetes, Ansible, and Terraform which are committed to Git
- tf-controller: additional Flux component used to run Terraform from within a Kubernetes cluster.
- volsync and snapscheduler: backup and recovery of persistent volume claims
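To show what the external-secrets piece looks like in practice, here is a hedged sketch of an `ExternalSecret` that pulls a 1Password item into a regular Kubernetes Secret; the store name and item key are made up for the example.

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: example-app
  namespace: default
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: onepassword-connect        # assumed name of the 1Password Connect backed store
  target:
    name: example-app-secret         # Kubernetes Secret created from the 1Password item
  dataFrom:
    - extract:
        key: example-app             # 1Password item to copy fields from
```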
Flux watches my `kubernetes` folder (see Directories below) and makes the changes to my cluster based on the YAML manifests. The way Flux works for me here is that it recursively searches the `kubernetes/apps` folder until it finds the top-most `kustomization.yaml` per directory, and then applies all of the resources listed in it. That aforementioned `kustomization.yaml` will generally only have a namespace resource and one or many Flux kustomizations. Those Flux kustomizations will generally have a `HelmRelease` or other resources related to the application underneath it which will be applied.
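A hedged sketch of that layout is below; the app name and paths are placeholders, while the `GitRepository` name matches the one shown in the dependency example further down.

```yaml
# kubernetes/apps/default/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./namespace.yaml          # the namespace resource
  - ./echo-server/ks.yaml     # a Flux Kustomization for one app (placeholder name)
---
# kubernetes/apps/default/echo-server/ks.yaml (illustrative)
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-apps-echo-server
  namespace: flux-system
spec:
  interval: 30m
  path: ./kubernetes/apps/default/echo-server/app   # directory holding the HelmRelease
  prune: true
  sourceRef:
    kind: GitRepository
    name: home-kubernetes
```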
Renovate watches my entire repository looking for dependency updates; when they are found, a PR is automatically created. When some PRs are merged, Flux applies the changes to my cluster.
This Git repository contains the following directories under `kubernetes`.
```
📁 kubernetes      # Kubernetes cluster defined as code
├─📁 bootstrap     # Flux installation
├─📁 flux          # Main Flux configuration of the repository
└─📁 apps          # Apps deployed into my cluster grouped by namespace (see below)
```
Below is a high-level look at the layout of how my directory structure with Flux works. In this brief example, you are able to see that `authelia` will not be able to run until `lldap` and `cloudnative-pg` are running. It also shows that the `Cluster` custom resource depends on the `cloudnative-pg` Helm chart. This is needed because `cloudnative-pg` installs the `Cluster` custom resource definition in the Helm chart.
```
# Key: <kind> :: <metadata.name>
GitRepository :: home-kubernetes
    Kustomization :: cluster
        Kustomization :: cluster-apps
            Kustomization :: cluster-apps-cloudnative-pg
                HelmRelease :: cloudnative-pg
            Kustomization :: cluster-apps-cloudnative-pg-cluster
                DependsOn:
                    Kustomization :: cluster-apps-cloudnative-pg
                Cluster :: postgres
            Kustomization :: cluster-apps-lldap
                HelmRelease :: lldap
                DependsOn:
                    Kustomization :: cluster-apps-cloudnative-pg-cluster
            Kustomization :: cluster-apps-authelia
                DependsOn:
                    Kustomization :: cluster-apps-lldap
                    Kustomization :: cluster-apps-cloudnative-pg-cluster
                HelmRelease :: authelia
```
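The ordering in that graph is expressed with `spec.dependsOn` on the Flux Kustomizations. A hedged sketch of the `authelia` one is below; the path is an assumption for the example.

```yaml
# Illustrative: how the authelia Kustomization could declare its dependencies
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster-apps-authelia
  namespace: flux-system
spec:
  dependsOn:
    - name: cluster-apps-lldap
    - name: cluster-apps-cloudnative-pg-cluster
  interval: 30m
  path: ./kubernetes/apps/security/authelia/app   # assumed path
  prune: true
  sourceRef:
    kind: GitRepository
    name: home-kubernetes
```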
Name | CIDR |
---|---|
Server VLAN | 192.168.42.0/24 |
Kubernetes pods | 10.32.0.0/16 |
Kubernetes services | 10.33.0.0/16 |
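For reference, those pod and service ranges correspond to the k3s `cluster-cidr` and `service-cidr` settings. A minimal sketch of a k3s server config under that assumption (disabling the bundled flannel CNI because Cilium is the networking plugin) might be:

```yaml
# /etc/rancher/k3s/config.yaml (illustrative sketch)
cluster-cidr: 10.32.0.0/16      # Kubernetes pods
service-cidr: 10.33.0.0/16      # Kubernetes services
flannel-backend: none           # Cilium replaces the default CNI
disable-network-policy: true    # Cilium enforces network policy instead
```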
While most of my infrastructure and workloads are self-hosted, I do rely upon the cloud for certain key parts of my setup. This saves me from having to worry about two things: (1) dealing with chicken-and-egg scenarios, and (2) services I critically need whether my cluster is online or not.
The alternative solution to these two problems would be to host a Kubernetes cluster in the cloud and deploy applications like HCVault, Vaultwarden, ntfy, and Gatus. However, maintaining another cluster and monitoring another group of workloads is a lot more time and effort than I am willing to put in.
Service | Use | Cost |
---|---|---|
1Password | Secrets with External Secrets | ~$65/yr |
Cloudflare | Domain and R2 | ~$30/yr |
Frugal | Usenet access | ~$35/yr |
GCP | Voice interactions with Home Assistant over Google Assistant | Free |
GitHub | Hosting this repository and continuous integration/deployments | Free |
Migadu | Email hosting | ~$20/yr |
NextDNS | My router DNS server which includes AdBlocking | ~$20/yr |
Pushover | Kubernetes Alerts and application notifications | Free |
Terraform Cloud | Storing Terraform state | Free |
UptimeRobot | Monitoring internet connectivity and external facing applications | ~$60/yr |
| | | Total: ~$20/mo |
On my Vyos router I have Bind9 and dnsdist deployed as containers. In my cluster `external-dns` is deployed with the `RFC2136` provider, which syncs DNS records to `bind9`.
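A hedged sketch of the chart values for that instance is below; the exact values layout depends on the external-dns chart version, and the host address, zone, and TSIG key name are placeholders (the TSIG secret itself would come from a Kubernetes Secret and is omitted here).

```yaml
# Illustrative values for an external-dns instance using the RFC2136 provider
provider: rfc2136
sources: ["ingress"]
txtOwnerId: home-kubernetes            # TXT registry owner id (assumed)
extraArgs:
  - --rfc2136-host=192.168.42.1        # bind9 on the router (placeholder address)
  - --rfc2136-zone=example.com         # placeholder zone
  - --rfc2136-tsig-keyname=external-dns
  - --rfc2136-tsig-secret-alg=hmac-sha256
  - --rfc2136-tsig-axfr
```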
Downstream DNS servers are configured in `dnsdist`, such as `bind9` (above) and NextDNS. All my clients use `dnsdist` as their upstream DNS server; this allows for more granularity in configuring DNS across my networks, such as giving each of my VLANs a specific `nextdns` profile, forwarding all requests for my domain to `bind9` on certain networks, or using only `1.1.1.1` instead of `nextdns` on networks where ad-blocking isn't needed.
Outside of the `external-dns` instance mentioned above, another instance is deployed in my cluster and configured to sync DNS records to Cloudflare. The only ingresses this `external-dns` instance looks at when gathering DNS records to put in Cloudflare are ones that have an ingress class name of `external` and an ingress annotation of `external-dns.alpha.kubernetes.io/target`.
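For example, an ingress that this second instance would publish to Cloudflare looks something like the sketch below; the host names and target value are placeholders.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  annotations:
    # points the DNS record at an external CNAME target (placeholder value)
    external-dns.alpha.kubernetes.io/target: external.example.com
spec:
  ingressClassName: external           # only this class is watched by the Cloudflare instance
  rules:
    - host: app.example.com            # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-app
                port:
                  number: 80
```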
Device | Count | OS Disk Size | Data Disk Size | RAM | Operating System | Purpose |
---|---|---|---|---|---|---|
Intel NUC8i5BEH | 3 | 1TB SSD | 1TB NVMe (rook-ceph) | 64GB | Debian | Kubernetes Masters |
Intel NUC8i7BEH | 3 | 1TB SSD | 1TB NVMe (rook-ceph) | 64GB | Debian | Kubernetes Workers |
PowerEdge T340 | 1 | 2TB SSD | 8x12TB ZFS (mirrored vdevs) | 64GB | Ubuntu | NFS + Backup Server |
Lenovo SA120 | 1 | - | 6x12TB (+2 hot spares) | - | - | DAS |
Raspberry Pi 4 | 1 | 32GB (SD) | - | 4GB | PiKVM (Arch) | Network KVM |
TESmart 8 Port KVM Switch | 1 | - | - | - | - | Network KVM (PiKVM) |
HP EliteDesk 800 G3 SFF | 1 | 256GB NVMe | - | 8GB | Vyos (Debian) | Router |
Unifi US-16-XG | 1 | - | - | - | - | 10Gb Core Switch |
Unifi USW-Enterprise-24-PoE | 1 | - | - | - | - | 2.5Gb PoE Switch |
Unifi USP PDU Pro | 1 | - | - | - | - | PDU |
APC SMT1500RM2U w/ NIC | 1 | - | - | - | - | UPS |
Thanks to all the people who donate their time to the Kubernetes @Home Discord community. A lot of inspiration for my cluster comes from the people who have shared their clusters using the k8s-at-home GitHub topic. Be sure to check out the Kubernetes @Home search for ideas on how to deploy applications or get ideas on what you can deploy.
See my awful commit history
See LICENSE