Adds custom timeout for instance creation #458

Open
wants to merge 1 commit into main

Conversation

fjaeger-seibert

If I try to create a cluster in a private network only, cloud-init takes a very long time, up to 6 minutes, so it would be helpful to be able to adjust the timeout.

I hope my changes are OK; I've never worked with Crystal before :D
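For context, the new option is read from the cluster configuration file that hetzner-k3s consumes at creation time. A minimal sketch of how it is used follows; the key names are taken from the config pasted later in this thread, and the file name is just a placeholder:

hetzner-k3s create --config cluster_config.yaml
# where the config contains, among other settings:
# timeouts:
#   instance_creation_timeout: 600   # seconds to wait for an instance to come up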


@vitobotta
Owner

Hi! Thanks for the PR. It looks OK and doesn't seem risky to merge, but I will test a bit as soon as I have a chance before making a release. Would you mind describing your setup and configuration in as much detail as possible, so I can reproduce the context easily? Thanks!

@vitobotta
Owner

Also, did you figure out why cloud-init takes so long in this case?

@fjaeger-seibert
Author

Hi, I'm happy to describe my current setup and configuration.
I have created a private network. Within this network there is an s2s (site-to-site) router that routes all addresses within the network into a VPN.
This means that some static routes have to be created when each instance starts up.
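In plain iproute2 terms, that per-instance route setup amounts to something like the commands below. The subnets and the router address here are purely illustrative, since the real ones are redacted as x.x.x.x in the config further down, which does the same thing via netplan:

# Illustrative only: send the VPN-side ranges via the s2s router on the private network.
ip route add 10.0.0.0/8     via 10.0.1.2 dev enp7s0 onlink
ip route add 172.16.0.0/12  via 10.0.1.2 dev enp7s0 onlink
ip route add 192.168.0.0/16 via 10.0.1.2 dev enp7s0 onlink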

Unfortunately, I haven't found out why cloud-init takes so long. It doesn't matter what I do in the cloud-init configuration; it always takes a long time, even if I only create one file.
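One way to narrow this down, using cloud-init's own tooling rather than anything specific to hetzner-k3s, is to ask it where the boot time went on an affected instance:

cloud-init status --long                             # overall status of the cloud-init run
cloud-init analyze blame                             # per-module timings, slowest first
cloud-init analyze show                              # timings grouped by boot stage
journalctl -u cloud-init --no-pager | tail -n 50     # recent cloud-init log lines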

Here is my config:

#hetzner_token: <your token> Will be inserted by HCLOUD_TOKEN env variable
timeouts:
  instance_creation_timeout: 600
cluster_name: agilehive
kubeconfig_path: "~/.kube/config"
k3s_version: v1.26.4+k3s1
embedded_registry_mirror:
  enabled: false
networking:
  ssh:
    port: 22
    use_agent: false # set to true if your key has a passphrase
    public_key_path: "/home/app/hetzner-k3s/.ssh/id_rsa.pub"
    private_key_path: "/home/app/hetzner-k3s/.ssh/id_rsa"
  allowed_networks:
    ssh:
      - 0.0.0.0/0
    api:
      - 0.0.0.0/0
  public_network:
    ipv4: false
    ipv6: false
  private_network:
    enabled: true
    subnet: PRIVATE_NETWORK_SUBNET
    existing_network_name: "PRIVATE_NETWORK_NAME"
disable_flannel: false # set to true if you want to install a different CNI
schedule_workloads_on_masters: false
cloud_controller_manager_manifest_url: "https://github.com/hetznercloud/hcloud-cloud-controller-manager/releases/download/v1.19.0/ccm-networks.yaml"
csi_driver_manifest_url: "https://raw.githubusercontent.com/hetznercloud/csi-driver/v2.6.0/deploy/kubernetes/hcloud-csi.yml"
system_upgrade_controller_deployment_manifest_url: "https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/system-upgrade-controller.yaml"
system_upgrade_controller_crd_manifest_url: "https://github.com/rancher/system-upgrade-controller/releases/download/v0.13.4/crd.yaml"
cluster_autoscaler_manifest_url: "https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/hetzner/examples/cluster-autoscaler-run-on-master.yaml"
datastore:
  mode: etcd # etcd (default) or external
  external_datastore_endpoint: postgres://....
masters_pool:
  instance_type: cpx31
  instance_count: 1
  location: nbg1
worker_node_pools:
  - name: jira-node
    instance_type: ccx33
    instance_count: 1
    location: nbg1
  - name: database-node
    instance_type: cpx41
    instance_count: 1
    location: nbg1
    taints:
      - key: database
        value: database:NoSchedule
post_create_commands:
  - export IP=$(ip addr show enp7s0 | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)
  - echo "nameserver 8.8.8.8" > /etc/resolv.conf
  - >
    printf "network##\n  version## 2\n  renderer## networkd\n  ethernets##\n    enp7s0##\n      addresses##\n        - $IP/32\n      nameservers##\n        addresses##\n          - 8.8.8.8\n          - 8.8.4.4\n      routes##\n        - to## default\n          via## x.x.x.x\n        - to## x.x.x.x/32\n          scope## link\n        - to## x.x.x.x/32\n          scope## link\n        - to## x.x.x.x/8\n          via## x.x.x.x\n          on-link## true\n        - to## x.x.x.x/12\n          via## x.x.x.x\n          on-link## true\n        - to## x.x.x.x/16\n          via## x.x.x.x\n          on-link## true\n" |
    sed 's/##/:/g' > /etc/netplan/50-cloud-init.yaml
  - netplan generate
  - netplan apply
  # Script to ensure that the network settings have been applied, as this does not work reliably
  - >
    echo '#!/bin/bash\n\n# Define variables\ndomain="google.de"\nmax_attempts=20\nattempt=0\n\n# Function to check whether the domain is accessible\ncheck_domain() {\n    nslookup $domain > /dev/null 2>&1\n    return $?\n}\n\n# Loop to check the domain reachability\nwhile [ $attempt -lt $max_attempts ]; do\n    attempt=$((attempt + 1))\n    echo "Attempt $attempt of $max_attempts: Check accessibility of $domain..."\n    \n    check_domain\n    if [ $? -eq 0 ]; then\n        echo "$domain is reachable."\n        exit 0\n    else\n        echo "$domain is not reachable. Execute netplan generate and netplan apply..."\n        netplan generate\n        netplan apply\n        echo "Waiting 30 seconds..."\n        sleep 30\n    fi\ndone\n\necho "Maximum number of attempts reached. $domain remains unreachable."\nexit 1\n' > ensure_network_settings.sh
  - chmod +x ensure_network_settings.sh
  - ./ensure_network_settings.sh
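
For readability, the \n-escaped echo above expands to roughly the following script (reconstructed verbatim from the one-liner, with the one German comment translated). Whether echo actually interprets those escapes depends on the shell that runs the post-create commands, so that is worth double-checking on the target image:

#!/bin/bash

# Define variables
domain="google.de"
max_attempts=20
attempt=0

# Function to check whether the domain is accessible
check_domain() {
    nslookup $domain > /dev/null 2>&1
    return $?
}

# Loop to check the domain reachability
while [ $attempt -lt $max_attempts ]; do
    attempt=$((attempt + 1))
    echo "Attempt $attempt of $max_attempts: Check accessibility of $domain..."

    check_domain
    if [ $? -eq 0 ]; then
        echo "$domain is reachable."
        exit 0
    else
        echo "$domain is not reachable. Execute netplan generate and netplan apply..."
        netplan generate
        netplan apply
        echo "Waiting 30 seconds..."
        sleep 30
    fi
done

echo "Maximum number of attempts reached. $domain remains unreachable."
exit 1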

@vitobotta
Owner

Can you also describe the s2s router setup in more detail? Just trying to think about what the problem might be.

@fjaeger-seibert
Author

The s2s router is the connection to the rest of the VPN network. I can use it to reach VMs without public IPs from the user VPN, and monitoring and trending of the systems also runs through it.
It also acts as the router to the outside for internal IPs. It runs a WireGuard client and otherwise does only plain Linux routing.
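
For reproduction purposes, a box like that usually boils down to little more than the following; the interface name, sysctl, and WireGuard config path are assumptions for illustration, not details taken from this thread:

sysctl -w net.ipv4.ip_forward=1                         # forward packets between the private network and the VPN
wg-quick up wg0                                         # bring up the WireGuard tunnel defined in /etc/wireguard/wg0.conf
iptables -t nat -A POSTROUTING -o wg0 -j MASQUERADE     # NAT internal source IPs on traffic leaving via the tunnel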
