Skip to content
This repository has been archived by the owner on Jul 16, 2024. It is now read-only.

Downgrade kpng alpine image to align nftables with nodes OS. #525

Merged
merged 1 commit into from
Aug 25, 2023

Conversation

mneverov
Copy link
Contributor

@mneverov mneverov commented Aug 25, 2023

What kind of PR is this?

Hopefully a bugfix.

Why this PR is needed / What this PR do?

This PR aligns nftables version in kpng image and os node image.
Currently, when e2e tests run locally nft rules set up incorrectly due to the versions mismatch that results in [invalid type] in chains.
To reproduce

  1. start kpng without running tests via ./hack/test_e2e.sh -i ipv4 -b nft -d
  2. join a worker node, install nftables and check for invalid type:
kubectl node-shell kpng-e2e-ipv4-nft-worker
apt update && apt install -y nftables
nft list ruleset | grep invalid

kpng has nothing to do with that. One can reproduce it by ssh-ing to a kpng pod and applying nft rules:

vi nft.conf
table ip6 test {
        chain svc_1 {
                tcp dport 80 dnat to fd6d:706e:6701:2::2:8080
                fib daddr type local tcp dport 30202 dnat to fd6d:706e:6701:2::2:8080
        }

        chain svc_default_nodeport-test_eps {
                numgen random mod 2 vmap { 0 : jump svc_2, 1 : jump svc_1 }
        }

        chain svc_2 {
                tcp dport 80 dnat to fd6d:706e:6701:1::2:8080
                fib daddr type local tcp dport 30202 dnat to fd6d:706e:6701:1::2:8080
        }
}
nft -f nft.conf

On the node the new test table will look like:

table ip6 test {
        chain svc_1 {
                tcp dport 80 dnat to fd6d:706e:6701:2::2:8080
                fib daddr type local tcp dport 30202 dnat to fd6d:706e:6701:2::2:8080
        }

        chain svc_default_nodeport-test_eps {
                numgen random mod 2 map { 0 : 0x0 [invalid type], 1 : 0x1000000 [invalid type] }
        }

        chain svc_2 {
                tcp dport 80 dnat to fd6d:706e:6701:1::2:8080
                fib daddr type local tcp dport 30202 dnat to fd6d:706e:6701:1::2:8080
        }
}

There is no invalid type message when the same steps are executed from alpine:3.16 image.

NOTE: I checked GH e2e action and it uses the same images (alpine 3.17 and ubuntu 22.04). When tests run locally I see invalid types during tests execution, but 335 tests run successfully so I suspect the chains with invalid types are not hit 🤷.

Probably related to the issue.

Which issue(s) this PR fixes?

Fixes #

Additional information about this PR

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 25, 2023
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Aug 25, 2023
@jayunit100
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 25, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jayunit100, mneverov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 25, 2023
@jayunit100
Copy link
Contributor

ok so this was the reason why it stopped working recently in the hack/local-up-kpng recipe i guesS?

@k8s-ci-robot k8s-ci-robot merged commit a456b72 into kubernetes-retired:master Aug 25, 2023
@mneverov
Copy link
Contributor Author

ok so this was the reason why it stopped working recently in the hack/local-up-kpng recipe i guesS?

it might be related, but also the error you had is different...

@mcluseau
Copy link
Contributor

alpine is 3.18 now, I'm not sure we should downgrade for a specific node version. nftables is an in-kernel VM and nft compiles the given rules for this VM, hence it works in the kernel, even if the "decompilation" with an older version of nft is not guaranteed to work 100%. Downgrading nft means to going back in time, so less bugs fixed. To get the expected result from commands like nft list ruleset, it's better to use the nft instance in the kpng pod, not the one on the node.

@mneverov
Copy link
Contributor Author

mneverov commented Aug 27, 2023

alpine is 3.18 now, I'm not sure we should downgrade for a specific node version. nftables is an in-kernel VM and nft compiles the given rules for this VM, hence it works in the kernel, even if the "decompilation" with an older version of nft is not guaranteed to work 100%. Downgrading nft means to going back in time, so less bugs fixed. To get the expected result from commands like nft list ruleset, it's better to use the nft instance in the kpng pod, not the one on the node.

@mcluseau yes, you're right. It explains why tests ran successfully on alpine 3.17. Sorry for the noise, ptal #527. Please let me know if I can bump golang build image to the latest alpine too and golang version to 1.21.

@mcluseau
Copy link
Contributor

mcluseau commented Aug 27, 2023

@mneverov hey, no worries, thanks for contributing, there's no building without some noise :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants