feat(helm): support topology spread constraints #2560
base: main
Conversation
Walkthrough

Adds a new optional environment variable KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS (z.string().optional()) to the supervisor Env schema. Introduces parseTopologySpreadConstraints() in apps/supervisor/src/workloadManager/kubernetes.ts to parse that env var and, when present, inject topologySpreadConstraints into the default pod spec. Bumps the Helm chart version to 4.0.2. Helm templates updated: the supervisor deployment now sets the env var from .Values.supervisor.config.kubernetes.topologySpreadConstraints and conditionally renders pod-level topologySpreadConstraints; the webapp deployment gains an optional topologySpreadConstraints section. hosting/k8s/helm/values.yaml adds empty topologySpreadConstraints placeholders for webapp and supervisor.config.kubernetes.

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms (23). You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms).
Actionable comments posted: 1
🧹 Nitpick comments (3)
apps/supervisor/src/workloadManager/kubernetes.ts (2)
59-74: Guard against non-array JSON for topology spread constraints

JSON.parse may return a non-array; add a minimal type guard to avoid emitting invalid pod specs.

Apply this diff:

```diff
-  private parseTopologySpreadConstraints(): k8s.V1TopologySpreadConstraint[] | null {
-    if (!env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS) {
-      return null;
-    }
-
-    try {
-      return JSON.parse(env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS);
-    } catch (error) {
-      this.logger.error("[KubernetesWorkloadManager] Failed to parse topology spread constraints", {
-        error: error instanceof Error ? error.message : String(error),
-        raw: env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS,
-      });
-      return null;
-    }
-  }
+  private parseTopologySpreadConstraints(): k8s.V1TopologySpreadConstraint[] | null {
+    const raw = env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS;
+    if (!raw) return null;
+    try {
+      const parsed = JSON.parse(raw);
+      if (!Array.isArray(parsed)) {
+        this.logger.warn("[KubernetesWorkloadManager] Expected an array for topology spread constraints", {
+          type: typeof parsed,
+        });
+        return null;
+      }
+      return parsed as k8s.V1TopologySpreadConstraint[];
+    } catch (error) {
+      this.logger.error("[KubernetesWorkloadManager] Failed to parse topology spread constraints", {
+        error: error instanceof Error ? error.message : String(error),
+        raw,
+      });
+      return null;
+    }
+  }
```
289-304: Nit: avoid emitting empty array property

Only include topologySpreadConstraints when non-empty for cleaner specs.

Apply this diff:

```diff
-      ...(topologySpreadConstraints ? { topologySpreadConstraints } : {}),
+      ...(topologySpreadConstraints && topologySpreadConstraints.length > 0
+        ? { topologySpreadConstraints }
+        : {}),
```

hosting/k8s/helm/values.yaml (1)
72-74: Add commented examples to guide users on labelSelector

Providing concrete examples here reduces misconfiguration. Suggested snippets keep with your preference for examples in values files.

Apply these additions under each section:

```diff
@@
-  topologySpreadConstraints: []
+  topologySpreadConstraints: []
+  # Example:
+  # topologySpreadConstraints:
+  #   - maxSkew: 1
+  #     topologyKey: kubernetes.io/hostname
+  #     whenUnsatisfiable: ScheduleAnyway
+  #     labelSelector:
+  #       matchLabels:
+  #         app.kubernetes.io/component: webapp
@@
-  topologySpreadConstraints: []
+  topologySpreadConstraints: []
+  # Example (for worker pods created by the supervisor):
+  # topologySpreadConstraints:
+  #   - maxSkew: 1
+  #     topologyKey: kubernetes.io/hostname
+  #     whenUnsatisfiable: ScheduleAnyway
+  #     labelSelector:
+  #       matchLabels:
+  #         app.kubernetes.io/part-of: trigger-worker
@@
-  topologySpreadConstraints: []
+  topologySpreadConstraints: []
+  # Example (for the supervisor Deployment itself):
+  # topologySpreadConstraints:
+  #   - maxSkew: 1
+  #     topologyKey: kubernetes.io/hostname
+  #     whenUnsatisfiable: DoNotSchedule
+  #     labelSelector:
+  #       matchLabels:
+  #         app.kubernetes.io/component: supervisor
```

Based on learnings
Also applies to: 268-270, 360-362
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- apps/supervisor/src/env.ts (1 hunks)
- apps/supervisor/src/workloadManager/kubernetes.ts (3 hunks)
- hosting/k8s/helm/Chart.yaml (1 hunks)
- hosting/k8s/helm/templates/supervisor.yaml (2 hunks)
- hosting/k8s/helm/templates/webapp.yaml (1 hunks)
- hosting/k8s/helm/values.yaml (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
- Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
- For TypeScript, we usually use types over interfaces
- Avoid enums
- No default exports, use function declarations
Files:
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
🧠 Learnings (4)
📚 Learning: 2025-06-25T13:18:44.103Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/Chart.yaml:1-18
Timestamp: 2025-06-25T13:18:44.103Z
Learning: For the Trigger.dev Helm chart, the chart name should be "trigger" (not "trigger-v4") since this is the first official chart release. Helper templates should use the actual chart name from .Chart.Name rather than hardcoded prefixes.
Applied to files:
hosting/k8s/helm/Chart.yaml
📚 Learning: 2025-06-25T13:20:17.174Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/values.yaml:22-51
Timestamp: 2025-06-25T13:20:17.174Z
Learning: In the Trigger.dev Helm chart values.yaml, the maintainer prefers to use explicit comprehensive warnings for security-sensitive default values rather than implementing secure-by-default behavior that would fail installation. The project uses deterministic default secrets with clear "TESTING ONLY" warnings and instructions for production deployment.
Applied to files:
hosting/k8s/helm/Chart.yaml
📚 Learning: 2025-06-25T13:18:04.827Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/templates/extra-manifests.yaml:1-4
Timestamp: 2025-06-25T13:18:04.827Z
Learning: In the Trigger.dev v4 Helm chart, the user prefers to rely on documentation and examples in values files rather than implementing defensive coding in templates, particularly for features like extraManifests where proper usage is documented.
Applied to files:
hosting/k8s/helm/Chart.yaml
📚 Learning: 2025-06-25T14:14:11.965Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/values-production-example.yaml:95-102
Timestamp: 2025-06-25T14:14:11.965Z
Learning: In the Trigger.dev Helm chart production examples, the maintainer prefers to include initial/bootstrap credentials with clear warnings that they should be changed after first login, rather than requiring external secret references that could complicate initial setup. This follows their pattern of providing working examples with explicit security guidance.
Applied to files:
hosting/k8s/helm/Chart.yaml
🧬 Code graph analysis (1)
apps/supervisor/src/workloadManager/kubernetes.ts (1)
apps/supervisor/src/env.ts (1)
env (119-119)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (4)
hosting/k8s/helm/Chart.yaml (1)
5-5: Chart version bump looks good

Patch bump to 4.0.2 is consistent with the scope of changes.
hosting/k8s/helm/templates/webapp.yaml (1)
422-425: Good addition: optional topologySpreadConstraints for webapp

Use of tpl + toYaml with proper nindent is correct and safe here.
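For reference, the reviewed webapp block presumably follows the same pattern as the supervisor template quoted later in this thread; the value path is taken from the values.yaml placeholder, and the indentation is illustrative.

```yaml
# Sketch only; the actual lines live in hosting/k8s/helm/templates/webapp.yaml
{{- with .Values.webapp.topologySpreadConstraints }}
topologySpreadConstraints:
  {{- tpl (toYaml .) $ | nindent 8 }}
{{- end }}
```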
apps/supervisor/src/env.ts (1)
93-93: Env schema addition looks good

Optional KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS string is appropriate.
hosting/k8s/helm/templates/supervisor.yaml (1)
279-282: LGTM: optional topologySpreadConstraints for supervisor Deployment

Templating and indentation are correct.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
```yaml
{{- with .Values.supervisor.config.kubernetes.topologySpreadConstraints }}
- name: KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS
  value: {{ tpl (toYaml .) $ | toJson | quote }}
{{- end }}
```
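For context, this block only renders when the corresponding values key is set. A minimal values.yaml excerpt that would feed it might look like this (the key path comes from the template above; the constraint fields are standard Kubernetes topologySpreadConstraint fields chosen purely for illustration):

```yaml
# Illustrative excerpt of hosting/k8s/helm/values.yaml; constraint values are examples
supervisor:
  config:
    kubernetes:
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: kubernetes.io/hostname
          whenUnsatisfiable: ScheduleAnyway
```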
I realize I requested adding support for topology spread constraints for the jobs too but I am not sure if, in practice, it would be effective, since only a single replica of a job is running at any given time, correct? If so, I suggest removing this for now.
```yaml
{{- with .Values.supervisor.topologySpreadConstraints }}
topologySpreadConstraints:
  {{- tpl (toYaml .) $ | nindent 8 }}
{{- end }}
```
It seems that the supervisor deployment is locked to a replica count of 1. Curious what the reason for that is?
The reason I am asking is, although it's good to have the topology spread constraints config surfaced to users, it's likely going to be ineffectual because of the replica count being locked to 1.
fixes #2548
All new settings, including env vars, are optional. A new Helm chart has been released, but it will need another bump once the new supervisor images are out to fully support spread constraints for the worker pods.
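As a rough usage sketch, the webapp and supervisor Deployment-level keys could be set like this (values are illustrative; key paths are taken from the templates and walkthrough above, and the Deployment-level supervisor constraints are the ones noted as likely ineffectual while replicas are locked to 1):

```yaml
# Illustrative values.yaml excerpt; label selectors and constraint values are examples only
webapp:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
      labelSelector:
        matchLabels:
          app.kubernetes.io/component: webapp

supervisor:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: ScheduleAnyway
```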