To check if your API token works, you can use the following `curl` invocation:

```
. envfile
curl -v -H "Authorization: PVEAPIToken=$PROXMOX_TOKEN=$PROXMOX_SECRET" ${PROXMOX_URL%/}/api2/json/
```
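A working token typically yields an HTTP 200 response with a JSON body describing the API index; a 401 response indicates a wrong token name or secret.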
Kind requires hybrid cgroups when run in Docker on Linux. A cgroupv2-only setup will break on systemd-247 as shipped in kind's container. Docker/runc also requires that no anonymous cgroupv1 filesystems are mounted, otherwise cgroup namespace isolation will fail. If you require a cgroupv2-only setup, use Podman, but be aware that this breaks various assumptions in `./cluster-api/Tiltfile`.
If you mount cgroups with no controller attached, this will break Docker/runc when it creates a new cgroup namespace, because runc expects that every cgroup's name is also its type (with the exception of the systemd cgroup).
As an example:

```
/ # mkdir /sys/fs/cgroup/broken
/ # mount -t cgroup -o none,name=broken broken /sys/fs/cgroup/broken/
/ # docker --log-level debug container start be42
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "cgroup" to rootfs at "/sys/fs/cgroup": mount cgroup:/sys/fs/cgroup/foobar (via /proc/self/fd/7), flags: 0xe, data: foobar: invalid argument: unknown
Error: failed to start containers: be42
```
This breaks because Docker requires a `systemd` directory in cgroups, as it remounts /sys/fs/cgroup read-only on entering the remapped namespace. If the directory does not exist, mounting systemd will fail. This leads to systemd in the container breaking on startup (the example below is from a Linux host without systemd):
```
/ # docker --log-level debug start be42
INFO: ensuring we can execute mount/umount even with userns-remap
INFO: remounting /sys read-only
INFO: making mounts shared
INFO: detected cgroup v1
INFO: detected cgroupns
[...]
INFO: starting init
Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...
```
A fix is to create this directory, then start Docker.
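A minimal sketch of that workaround, reusing the container from the example above:

```
# On the Docker host: provide the named systemd cgroup directory, then retry.
mkdir -p /sys/fs/cgroup/systemd
docker start be42
```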
TODO
If you encounter errors like `missing MachineDeployment strategy` on your MachineDeployment, or `failed to call webhook: the server could not find the requested resource` in your capmox-controller's logs (or those of other controllers), please check the image tag of the `capi-controller-manager` Deployment and compare it against our compatibility matrix.
```
kubectl get deployment/capi-controller-manager -n capi-system -o yaml | yq '.spec.template.spec.containers[].image'
```
If your capi-controller is too new, you can pass `--core cluster-api:v1.6.1` during `clusterctl init` to force an older version. By default, clusterctl installs the latest version from the kubernetes-sigs/cluster-api project.
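For example (a sketch; `--infrastructure proxmox` stands in for whichever providers you normally initialize):

```
clusterctl init --core cluster-api:v1.6.1 --infrastructure proxmox
```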
Calico unfortunately does not test connectivity when it chooses a node IP to use for IPVS communication. This can be altered manually. More on this topic in Calico's documentation.
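As a hedged sketch, the detection method can be pinned with Calico's documented `IP_AUTODETECTION_METHOD` setting (the interface name below is a placeholder; use the one that actually carries cluster traffic):

```
kubectl -n kube-system set env daemonset/calico-node IP_AUTODETECTION_METHOD=interface=ens18
```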
Sometimes machines do not delete because some resource needs to be reconciled before deletion can happen, but these resources cannot reconcile (for example, nodes may not drain). To fix deletion deadlocks in such cases:

- Remove `ipaddresses` and `ipaddressclaims` for the relevant machines
- Remove the `proxmoxmachine` finalizer by editing `proxmoxmachines <machine>`
- Delete the `proxmoxmachine`
- Remove the `machine` finalizer by editing `machines <machine>`
- Delete the `machine`

After these steps, VMs may linger in Proxmox. Carefully remove those. The equivalent kubectl commands are sketched below.
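A hedged sketch of those steps (`<machine>`, `<cluster>`, and the label selector are illustrative; adjust to your resource names and namespaces):

```
kubectl delete ipaddressclaims,ipaddresses -l cluster.x-k8s.io/cluster-name=<cluster>
kubectl patch proxmoxmachine <machine> --type merge -p '{"metadata":{"finalizers":null}}'
kubectl delete proxmoxmachine <machine>
kubectl patch machine <machine> --type merge -p '{"metadata":{"finalizers":null}}'
kubectl delete machine <machine>
```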
Proxmox VE Image Builder and CAPMOX differ in their use of environment variables. Trying to use CAPMOX's variables will lead to image building failure. The image builder uses `PROXMOX_USERNAME` as the token name and `PROXMOX_TOKEN` as the token's secret, whereas CAPMOX uses `PROXMOX_TOKEN` as the token name and `PROXMOX_SECRET` as the token's secret UUID. The CAPMOX way of implementing authentication is closer to the Proxmox API Token Documentation, therefore this pitfall will likely continue to exist.
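To illustrate the two conventions side by side (all values below are placeholders):

```
# Proxmox VE Image Builder
export PROXMOX_USERNAME='capmox@pve!capi'                    # token name
export PROXMOX_TOKEN='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee'  # token secret

# CAPMOX
export PROXMOX_TOKEN='capmox@pve!capi'                       # token name
export PROXMOX_SECRET='aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee' # token secret UUID
```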
Older versions of `kube-vip` do not consider the IPv6 routing table, and therefore IPv6 interface detection fails. Update `kube-vip` to version 0.7.2.
Example log:
time="2024-03-14T11:48:58Z" level=info msg="Starting kube-vip.io [v0.5.10]"
time="2024-03-14T11:48:58Z" level=info msg="namespace [kube-system], Mode: [ARP], Features(s): Control Plane:[true], Services:[false]"
time="2024-03-14T11:48:58Z" level=info msg="No interface is specified for VIP in config, auto-detecting default Interface"
....
time="2024-03-14T11:52:30Z" level=fatal msg="unable to detect default interface -> [Unable to find default route]"
Kubelet chooses the first interface to acquire a node IP for kubeadm. The first interface is defined by the in-kernel order, which in turn is defined by the order in which the PCI bus is scanned and drivers are loaded.
As an example:
```
kubectl get nodes -o wide
NAME                               STATUS     ROLES                AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
test-cluster-control-plane-gcgc6   Ready      control-plane        11h   v1.26.7   10.0.1.69     <none>        Ubuntu 22.04.3 LTS   5.15.0-89-generic   containerd://1.7.6
test-cluster-load-balancer-c8rd2   Ready      load-balancer,node   11h   v1.26.7   10.0.2.155    <none>        Ubuntu 22.04.3 LTS   5.15.0-89-generic   containerd://1.7.6
test-cluster-load-balancer-wqbcg   Ready      load-balancer,node   11h   v1.26.7   10.0.2.152    <none>        Ubuntu 22.04.3 LTS   5.15.0-89-generic   containerd://1.7.6
test-cluster-worker-hbm8s          Ready      node                 11h   v1.26.7   10.0.1.71     <none>        Ubuntu 22.04.3 LTS   5.15.0-89-generic   containerd://1.7.6
test-cluster-worker-n2vbc          NotReady   node                 17m   v1.26.7   10.0.1.73     <none>        Ubuntu 22.04.3 LTS   5.15.0-89-generic   containerd://1.7.6
```
The load-balancers have an `e1000` interface as their default network, whereas `ens19` and `ens20` are `virtio`:
```
root@test-cluster-load-balancer-zrjx8:~# ip -o l sh
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000\ link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: ens19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc prio state UP mode DEFAULT group default qlen 1000\ link/ether 0a:97:89:e5:7f:1d brd ff:ff:ff:ff:ff:ff\ altname enp0s19
3: ens20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc prio master vrf-ext state UP mode DEFAULT group default qlen 1000\ link/ether 9a:58:08:40:a2:70 brd ff:ff:ff:ff:ff:ff\ altname enp0s20
4: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc prio state UP mode DEFAULT group default qlen 1000\ link/ether 16:7a:ee:74:23:0d brd ff:ff:ff:ff:ff:ff\ altname enp0s18
```
This is the order the interfaces are created in:
```
root@test-cluster-load-balancer-zrjx8:~# dmesg -t | grep eth
virtio_net virtio2 ens19: renamed from eth0
virtio_net virtio3 ens20: renamed from eth1
e1000 0000:00:12.0 eth0: (PCI:33MHz:32-bit) 16:7a:ee:74:23:0d
e1000 0000:00:12.0 eth0: Intel(R) PRO/1000 Network Connection
e1000 0000:00:12.0 ens18: renamed from eth0
```
If you absolutely must mix interface types, make sure that the default network interface is the one that comes up first.
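To check whether the node IP the kubelet registered matches the interface carrying the default route, a quick sketch (run on the affected node; assumes the node name equals the hostname and kubectl access from there):

```
# Interface that holds the default route:
ip route show default
# InternalIP the kubelet registered for this node:
kubectl get node "$(hostname)" -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}'
```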