nicktrn
Collaborator

@nicktrn nicktrn commented Sep 25, 2025

fixes #2548

All new settings, including env vars, are optional. A new Helm chart has been released, but it will need another bump once the new supervisor images are out to fully support spread constraints for the worker pods.


changeset-bot bot commented Sep 25, 2025

⚠️ No Changeset found

Latest commit: 0dcde3c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Contributor

coderabbitai bot commented Sep 25, 2025

Walkthrough

Adds a new optional environment variable KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS (z.string().optional()) to the supervisor Env schema. Introduces parseTopologySpreadConstraints() in apps/supervisor/src/workloadManager/kubernetes.ts to parse that env var and, when present, inject topologySpreadConstraints into the default pod spec. Bumps Helm chart version to 4.0.2. Helm templates updated: supervisor deployment now sets the env var from .Values.supervisor.config.kubernetes.topologySpreadConstraints and conditionally renders pod-level topologySpreadConstraints; webapp deployment gains an optional topologySpreadConstraints section. hosting/k8s/helm/values.yaml adds empty topologySpreadConstraints placeholders for webapp and supervisor.config.kubernetes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Description Check | ⚠️ Warning | The description does not follow the repository’s template and is missing several required sections including the checklist, testing steps, changelog, and screenshots, providing only a brief summary of the issue resolution. | Please update the description to use the provided template by adding the “Closes #” header, the checklist items, detailed testing steps, a concise changelog entry, and any relevant screenshots if applicable.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Title Check | ✅ Passed | The title clearly and concisely summarizes the main change by indicating that Helm support for topology spread constraints is being added, and it follows conventional formatting without extraneous details.
Linked Issues Check | ✅ Passed | The changes expose topologySpreadConstraints for webapp and supervisor pods via Helm values, inject the configuration into the supervisor’s workload manager for jobs, and mirror existing configurables as requested in issue #2548.
Out of Scope Changes Check | ✅ Passed | All modifications directly relate to implementing topology spread constraints support and no unrelated features or refactors have been introduced outside the scope of issue #2548.
Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b01c92c and 0dcde3c.

📒 Files selected for processing (1)
  • hosting/k8s/helm/templates/supervisor.yaml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • hosting/k8s/helm/templates/supervisor.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: typecheck / typecheck
  • GitHub Check: Analyze (javascript-typescript)

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
apps/supervisor/src/workloadManager/kubernetes.ts (2)

59-74: Guard against non-array JSON for topology spread constraints

JSON.parse may return a non-array; add a minimal type guard to avoid emitting invalid pod specs.

Apply this diff:

-  private parseTopologySpreadConstraints(): k8s.V1TopologySpreadConstraint[] | null {
-    if (!env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS) {
-      return null;
-    }
-
-    try {
-      return JSON.parse(env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS);
-    } catch (error) {
-      this.logger.error("[KubernetesWorkloadManager] Failed to parse topology spread constraints", {
-        error: error instanceof Error ? error.message : String(error),
-        raw: env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS,
-      });
-      return null;
-    }
-  }
+  private parseTopologySpreadConstraints(): k8s.V1TopologySpreadConstraint[] | null {
+    const raw = env.KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS;
+    if (!raw) return null;
+    try {
+      const parsed = JSON.parse(raw);
+      if (!Array.isArray(parsed)) {
+        this.logger.warn("[KubernetesWorkloadManager] Expected an array for topology spread constraints", {
+          type: typeof parsed,
+        });
+        return null;
+      }
+      return parsed as k8s.V1TopologySpreadConstraint[];
+    } catch (error) {
+      this.logger.error("[KubernetesWorkloadManager] Failed to parse topology spread constraints", {
+        error: error instanceof Error ? error.message : String(error),
+        raw,
+      });
+      return null;
+    }
+  }
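
The guard above only checks that the parsed value is an array. As a rough sketch of a slightly stricter variant, each element could also be checked for the three required fields before it reaches the pod spec. The names `SpreadConstraint`, `isSpreadConstraint`, and `parseSpreadConstraints` are hypothetical and not part of this PR; the local type covers only the fields used in the values.yaml examples, not the full `k8s.V1TopologySpreadConstraint` type.

```typescript
// Hypothetical local type for illustration only. It mirrors the fields used in
// the values.yaml examples, not the full k8s.V1TopologySpreadConstraint type.
type SpreadConstraint = {
  maxSkew: number;
  topologyKey: string;
  whenUnsatisfiable: "DoNotSchedule" | "ScheduleAnyway";
  labelSelector?: unknown;
};

// Type guard: checks the required fields of a single constraint.
function isSpreadConstraint(value: unknown): value is SpreadConstraint {
  if (typeof value !== "object" || value === null) return false;
  const c = value as Record<string, unknown>;
  return (
    typeof c.maxSkew === "number" &&
    Number.isInteger(c.maxSkew) &&
    c.maxSkew > 0 && // Kubernetes requires maxSkew to be greater than zero
    typeof c.topologyKey === "string" &&
    c.topologyKey.length > 0 &&
    (c.whenUnsatisfiable === "DoNotSchedule" || c.whenUnsatisfiable === "ScheduleAnyway")
  );
}

// Returns the parsed constraints, or null when the input is missing,
// is not valid JSON, is not an array, or contains a malformed element.
function parseSpreadConstraints(raw: string | undefined): SpreadConstraint[] | null {
  if (!raw) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!Array.isArray(parsed) || !parsed.every(isSpreadConstraint)) return null;
  return parsed;
}
```

Whether to reject the whole array or drop only the malformed entries is a design choice; this sketch rejects everything so that a partially applied configuration cannot go unnoticed.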

289-304: Nit: avoid emitting empty array property

Only include topologySpreadConstraints when non-empty for cleaner specs.

Apply this diff:

-      ...(topologySpreadConstraints ? { topologySpreadConstraints } : {}),
+      ...(topologySpreadConstraints && topologySpreadConstraints.length > 0
+        ? { topologySpreadConstraints }
+        : {}),
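
To illustrate the conditional-spread pattern in isolation, here is a minimal sketch. `PodSpecFragment` and `buildPodSpec` are hypothetical names, and the `restartPolicy` field is only a stand-in for the rest of the real pod spec.

```typescript
// Hypothetical, trimmed-down pod spec shape for illustration only.
type PodSpecFragment = {
  restartPolicy: string;
  topologySpreadConstraints?: object[];
};

// Adds the property only when there is at least one constraint, so that
// null and [] both produce a spec without the key.
function buildPodSpec(constraints: object[] | null): PodSpecFragment {
  return {
    restartPolicy: "Never",
    ...(constraints && constraints.length > 0
      ? { topologySpreadConstraints: constraints }
      : {}),
  };
}
```

With this shape, `buildPodSpec(null)` and `buildPodSpec([])` serialize identically, which keeps generated specs stable whether the setting is unset or set to an empty list.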
hosting/k8s/helm/values.yaml (1)

72-74: Add commented examples to guide users on labelSelector

Providing concrete examples here reduces misconfiguration. Suggested snippets keep with your preference for examples in values files.

Apply these additions under each section:

@@
-  topologySpreadConstraints: []
+  topologySpreadConstraints: []
+  # Example:
+  # topologySpreadConstraints:
+  #   - maxSkew: 1
+  #     topologyKey: kubernetes.io/hostname
+  #     whenUnsatisfiable: ScheduleAnyway
+  #     labelSelector:
+  #       matchLabels:
+  #         app.kubernetes.io/component: webapp
@@
-      topologySpreadConstraints: []
+      topologySpreadConstraints: []
+      # Example (for worker pods created by the supervisor):
+      # topologySpreadConstraints:
+      #   - maxSkew: 1
+      #     topologyKey: kubernetes.io/hostname
+      #     whenUnsatisfiable: ScheduleAnyway
+      #     labelSelector:
+      #       matchLabels:
+      #         app.kubernetes.io/part-of: trigger-worker
@@
-  topologySpreadConstraints: []
+  topologySpreadConstraints: []
+  # Example (for the supervisor Deployment itself):
+  # topologySpreadConstraints:
+  #   - maxSkew: 1
+  #     topologyKey: kubernetes.io/hostname
+  #     whenUnsatisfiable: DoNotSchedule
+  #     labelSelector:
+  #       matchLabels:
+  #         app.kubernetes.io/component: supervisor

Based on learnings

Also applies to: 268-270, 360-362

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between eb0263e and b01c92c.

📒 Files selected for processing (6)
  • apps/supervisor/src/env.ts (1 hunks)
  • apps/supervisor/src/workloadManager/kubernetes.ts (3 hunks)
  • hosting/k8s/helm/Chart.yaml (1 hunks)
  • hosting/k8s/helm/templates/supervisor.yaml (2 hunks)
  • hosting/k8s/helm/templates/webapp.yaml (1 hunks)
  • hosting/k8s/helm/values.yaml (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations

Files:

  • apps/supervisor/src/env.ts
  • apps/supervisor/src/workloadManager/kubernetes.ts
🧠 Learnings (4)
📚 Learning: 2025-06-25T13:18:44.103Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/Chart.yaml:1-18
Timestamp: 2025-06-25T13:18:44.103Z
Learning: For the Trigger.dev Helm chart, the chart name should be "trigger" (not "trigger-v4") since this is the first official chart release. Helper templates should use the actual chart name from .Chart.Name rather than hardcoded prefixes.

Applied to files:

  • hosting/k8s/helm/Chart.yaml
📚 Learning: 2025-06-25T13:20:17.174Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/values.yaml:22-51
Timestamp: 2025-06-25T13:20:17.174Z
Learning: In the Trigger.dev Helm chart values.yaml, the maintainer prefers to use explicit comprehensive warnings for security-sensitive default values rather than implementing secure-by-default behavior that would fail installation. The project uses deterministic default secrets with clear "TESTING ONLY" warnings and instructions for production deployment.

Applied to files:

  • hosting/k8s/helm/Chart.yaml
📚 Learning: 2025-06-25T13:18:04.827Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/templates/extra-manifests.yaml:1-4
Timestamp: 2025-06-25T13:18:04.827Z
Learning: In the Trigger.dev v4 Helm chart, the user prefers to rely on documentation and examples in values files rather than implementing defensive coding in templates, particularly for features like extraManifests where proper usage is documented.

Applied to files:

  • hosting/k8s/helm/Chart.yaml
📚 Learning: 2025-06-25T14:14:11.965Z
Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#2195
File: hosting/k8s/helm/values-production-example.yaml:95-102
Timestamp: 2025-06-25T14:14:11.965Z
Learning: In the Trigger.dev Helm chart production examples, the maintainer prefers to include initial/bootstrap credentials with clear warnings that they should be changed after first login, rather than requiring external secret references that could complicate initial setup. This follows their pattern of providing working examples with explicit security guidance.

Applied to files:

  • hosting/k8s/helm/Chart.yaml
🧬 Code graph analysis (1)
apps/supervisor/src/workloadManager/kubernetes.ts (1)
apps/supervisor/src/env.ts (1)
  • env (119-119)
🔇 Additional comments (4)
hosting/k8s/helm/Chart.yaml (1)

5-5: Chart version bump looks good

Patch bump to 4.0.2 is consistent with the scope of changes.

hosting/k8s/helm/templates/webapp.yaml (1)

422-425: Good addition: optional topologySpreadConstraints for webapp

Use of tpl + toYaml with proper nindent is correct and safe here.

apps/supervisor/src/env.ts (1)

93-93: Env schema addition looks good

Optional KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS string is appropriate.

hosting/k8s/helm/templates/supervisor.yaml (1)

279-282: LGTM: optional topologySpreadConstraints for supervisor Deployment

Templating and indentation are correct.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Comment on lines +187 to +190
{{- with .Values.supervisor.config.kubernetes.topologySpreadConstraints }}
- name: KUBERNETES_TOPOLOGY_SPREAD_CONSTRAINTS
value: {{ tpl (toYaml .) $ | toJson | quote }}
{{- end }}

I realize I requested adding support for topology spread constraints for the jobs too, but I am not sure it would be effective in practice, since only a single replica of a job is running at any given time, correct? If so, I suggest removing this for now.

Comment on lines +279 to +282
{{- with .Values.supervisor.topologySpreadConstraints }}
topologySpreadConstraints:
{{- tpl (toYaml .) $ | nindent 8 }}
{{- end }}

It seems that the supervisor deployment is locked to a replica count of 1. Curious what the reason for that is?

The reason I am asking is, although it's good to have the topology spread constraints config surfaced to users, it's likely going to be ineffectual because of the replica count being locked to 1.
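
The concern about a single replica can be made concrete with a simplified model of the skew check: a pod may be placed in a domain if the resulting count of matching pods there, minus the minimum count across domains, does not exceed maxSkew. The sketch below is an approximation for illustration, not the scheduler's implementation; `DomainCounts` and `canPlace` are hypothetical names, and it ignores minDomains, node affinity filtering, and other scheduler details.

```typescript
// Number of pods matching the labelSelector in each topology domain
// (for example, one entry per node when topologyKey is kubernetes.io/hostname).
type DomainCounts = Record<string, number>;

// Simplified check: would placing one more matching pod in `domain`
// keep the skew within maxSkew? Assumes `domain` is a key of `counts`.
function canPlace(counts: DomainCounts, domain: string, maxSkew: number): boolean {
  const globalMin = Math.min(...Object.values(counts));
  return counts[domain] + 1 - globalMin <= maxSkew;
}

// No matching pods yet: every domain passes, so the constraint
// does not restrict where a single replica can go.
const empty: DomainCounts = { nodeA: 0, nodeB: 0, nodeC: 0 };
const singleReplicaUnrestricted = Object.keys(empty).every((d) => canPlace(empty, d, 1));

// One matching pod already on nodeA: a second pod is steered away from nodeA.
const oneRunning: DomainCounts = { nodeA: 1, nodeB: 0, nodeC: 0 };
const secondOnSameNode = canPlace(oneRunning, "nodeA", 1);
const secondOnOtherNode = canPlace(oneRunning, "nodeB", 1);
```

Under this model the constraint only starts to matter once the labelSelector matches more than one pod, so its effect depends on how many pods the selector covers rather than on the replica count of a single workload.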

Development

Successfully merging this pull request may close these issues.

feat: Allow configuring the topology spread constraints in the Helm chart
2 participants