
feat: add status for removing / removalfailed #1334

Merged
mjnagel merged 19 commits into main from finalizer-rework on Mar 7, 2025

mjnagel commented Mar 4, 2025

Description

This PR uses the newer ability of a Pepr finalizer to leave the finalizer in place instead of always removing it. That lets us update the status while finalizing and catch errors when cleanup does not work as expected. Changes:

I also updated the diagram to reflect these changes and added test cases for the finalizer function. The diagram update can be previewed in the docs under docs/reference/configuration/UDS operator/package.md. Specific diagram changes:

  • Moved finalizer section to the right of reconciler
  • Simplified flow of validator (to make more space in the diagram)
  • Added new pieces of finalizer flow (failure, status patching, etc)
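
The finalizer behavior described above (report status while finalizing, and keep the finalizer when cleanup fails) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `PackageLike`, `CleanupStep`, and `finalizePackage` are hypothetical names, the status is mutated in memory rather than patched through the Kubernetes API, and the boolean return value stands in for the signal that tells the finalizer whether it may be removed.

```typescript
// Hypothetical, simplified types; the real Package CR carries far more fields.
type Phase = "Pending" | "Ready" | "Removing" | "RemovalFailed";

interface PackageLike {
  name: string;
  status: { phase: Phase };
}

// One cleanup step. It throws when the resource could not be removed.
type CleanupStep = (pkg: PackageLike) => void;

/**
 * Marks the package as Removing, then runs each cleanup step in order.
 * Returns true when the finalizer may be removed, or false when it should
 * stay in place so the CR remains visible in the RemovalFailed phase.
 */
function finalizePackage(pkg: PackageLike, steps: CleanupStep[]): boolean {
  // Assumption: real code would patch the status subresource here.
  pkg.status.phase = "Removing";

  for (const step of steps) {
    try {
      step(pkg);
    } catch {
      // Assumption: stop at the first failure and record it in the status.
      pkg.status.phase = "RemovalFailed";
      return false;
    }
  }
  return true;
}
```

Stopping at the first failed step is a simplification made for this sketch; the PR itself may order or combine its cleanup steps differently.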

Related Issue

Fixes #963

Fixes #1159

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Other (security config, docs update, etc)

Steps to Validate

Testing Steps

Test setup:

# Install slim-dev (unicorn flavor to avoid pull rate limiting)
uds run slim-dev --set flavor=unicorn
# Create the test packages
zarf p create src/test --skip-sbom
# Deploy the test packages
zarf p deploy build/zarf-package-uds-core-test-apps-*.tar.zst --confirm
# Validate all package CRs go to Ready status
kubectl get pkg -A # should all show ready

Test that normal deletion works and makes events:

# Delete a package CR
kubectl delete pkg -n test-tenant-app test-tenant-app
# Validate success and events
kubectl get pkg -n test-tenant-app # should show no resources
kubectl get events -n test-tenant-app | grep package # should show 3 removal events

Test that finalizer doesn't run until CR is ready:

# This forces a re-reconcile of the package and then deletes immediately
# If you watch while this happens (k9s, etc) you should see it go to Pending before Removing
kubectl patch pkg httpbin-other -n authservice-test-app --subresource=status --type=json -p='[{"op": "remove", "path": "/status"}]' && kubectl delete pkg httpbin-other -n authservice-test-app
# Validate that the watcher waited to finalize
kubectl logs -n pepr-system -l app=pepr-uds-core-watcher --tail=-1 | grep "Waiting"
kubectl get events -n authservice-test-app | grep package # should show 3 removal events
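
The waiting behavior exercised by this test can be pictured as a bounded check on the package phase before finalizing. The sketch below is only an illustration under assumptions: `ReconcilePhase` and `waitUntilSettled` are hypothetical names, the log text is invented apart from the word "Waiting" that the command above greps for, and real code would be asynchronous and pause between checks.

```typescript
// Hypothetical phase names, limited to those mentioned in this PR.
type ReconcilePhase = "Pending" | "Ready" | "Removing" | "RemovalFailed";

/**
 * Polls the current phase up to maxChecks times.
 * Returns true as soon as the package is no longer Pending, or false if it
 * stayed Pending for every check. Each unsuccessful check emits one log line.
 */
function waitUntilSettled(
  readPhase: () => ReconcilePhase,
  maxChecks: number,
  log: (message: string) => void,
): boolean {
  for (let check = 1; check <= maxChecks; check++) {
    if (readPhase() !== "Pending") {
      return true;
    }
    // Assumption: the exact wording of the real log message is unknown.
    log(`Waiting for package to leave Pending (check ${check}/${maxChecks})`);
  }
  return false;
}
```

Injecting `readPhase` and `log` keeps the sketch testable without a cluster; a caller would only start cleanup once the function returns true.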

Test that finalizer places CR in RemovalFailed state on failed cleanup:

# Deploy the test apps again (we need the sso client)
zarf p deploy build/zarf-package-uds-core-test-apps-*.tar.zst --confirm
# Edit the peprstore
kubectl edit peprstore -n pepr-system pepr-uds-core-store
# Delete the line containing `uds-core-operator-v2-sso-client-uds-core-httpbin`. This is the client token; without it, Pepr cannot clean up the client
# Save the peprstore
# Delete the package CR
kubectl delete pkg httpbin-other -n authservice-test-app
# Make sure that status is marked as RemovalFailed (after ~15 seconds)
kubectl get pkg httpbin-other -n authservice-test-app
# Make sure events show up that client failed to be removed
kubectl describe pkg httpbin-other -n authservice-test-app
# Make sure that the SSO client removal was retried 4 times before final failure
kubectl logs -n pepr-system -l app=pepr-uds-core-watcher --tail=-1 | grep "cleanupSSOClients"
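
The retry-then-fail behavior checked in the last step can be sketched as a small bounded retry helper. This is an illustration under assumptions, not the PR's implementation: `RetryResult` and `retryCleanup` are hypothetical names, the helper is synchronous with no delay between attempts, and the validation step does not say whether "retried 4 times" means four attempts in total or four retries after the first try, so the attempt budget is left as a parameter.

```typescript
// Hypothetical result shape for a bounded retry.
interface RetryResult {
  ok: boolean;
  attempts: number;
  lastError: string | undefined;
}

/**
 * Calls action until it succeeds or maxAttempts calls have been made.
 * Reports how many attempts were used and the last error message, if any.
 */
function retryCleanup(action: () => void, maxAttempts: number): RetryResult {
  let lastError: string | undefined = undefined;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      action();
      return { ok: true, attempts: attempt, lastError: undefined };
    } catch (err) {
      // Keep the most recent failure so the caller can surface it.
      lastError = err instanceof Error ? err.message : String(err);
    }
  }
  return { ok: false, attempts: maxAttempts, lastError };
}
```

A caller could treat `ok: false` as the trigger for marking the package RemovalFailed and emitting an event that includes `lastError`.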

Also review the automated Jest unit tests and confirm that they pass.

Checklist before merging

mjnagel self-assigned this Mar 4, 2025
mjnagel changed the title from "feat: add status for removal / removalfailed" to "feat: add status for removing / removalfailed" Mar 4, 2025
mjnagel marked this pull request as ready for review March 4, 2025 18:19
mjnagel requested a review from a team as a code owner March 4, 2025 18:19
UnicornChance previously approved these changes Mar 5, 2025

UnicornChance left a comment

Looks good to me (LGTM). I ran through the tests and everything is working as expected.

mjnagel merged commit a99b408 into main Mar 7, 2025
25 checks passed
mjnagel deleted the finalizer-rework branch March 7, 2025 17:42