
distributed provisioning: unset "selected-node" for nodes which have no driver running #544

Open
pohly opened this issue Dec 20, 2020 · 12 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@pohly
Contributor

pohly commented Dec 20, 2020

When deploying external-provisioner alongside the CSI driver on each node, there is one problem: if the scheduler picks a node which has no driver instance, then the volume is stuck because the usual "no capacity -> reschedule" recovery is never triggered.

A custom scheduler extension and capacity tracking can minimize the risk, but cannot prevent this entirely.

Possible solutions:

  • deploy the driver on all nodes, let it report "no capacity" on those where it has no resources -> works today, but creates overhead
  • deploy a central provisioner together with a driver component that knows about nodes where the driver runs -> can be done today, but implies that CSI drivers must be made aware of Kubernetes
  • do something similar inside external-provisioner, probably based on node labels (specify node selector for "driver not running" and handle those)
@pohly
Contributor Author

pohly commented Dec 20, 2020

> deploy a central provisioner together with a driver component that knows about nodes where the driver runs -> can be done today, but implies that CSI drivers must be made aware of Kubernetes

This isn't ideal because provisioning will be started by the central provisioner for all nodes and then must be made to fail for those nodes which do have a driver running, which emits additional events.

@pohly
Contributor Author

pohly commented Dec 21, 2020

> do something similar inside external-provisioner, probably based on node labels (specify node selector for "driver not running" and handle those)

This is conceptually very similar to setting AllowedTopologies in a storage class that uses late binding. The volume scheduler will check that before selecting a node, right?

If so, then this is probably the right solution for this issue because it avoids the problem entirely.

There's a slight race (node has the right labels, is selected for a PVC, labels get removed, driver no longer runs -> PVC stuck), but that should be rare and can be documented as a caveat for admins.
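As a rough sketch of that idea (the provisioner name and label key below are hypothetical placeholders, not anything external-provisioner defines), a storage class with late binding could restrict provisioning to labeled nodes via AllowedTopologies:

```yaml
# Hypothetical example: provisioner name and label key are placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: node-local-storage
provisioner: example.com/local-csi
volumeBindingMode: WaitForFirstConsumer   # late binding
allowedTopologies:
- matchLabelExpressions:
  - key: example.com/csi-driver-running
    values: ["true"]
```

Nodes would then carry the label (e.g. via `kubectl label node worker-1 example.com/csi-driver-running=true`) only while the driver runs on them, so the scheduler never selects an unlabeled node for a pending volume.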

@pohly
Contributor Author

pohly commented Jan 11, 2021

> This is conceptually very similar to setting AllowedTopologies in the storage class which uses late binding. The volume scheduler will check that before selecting a node, right?

Yes, it does: https://github.com/kubernetes/kubernetes/blob/ba5f5bea64a7b42265e581e8c7fe633336bec79a/pkg/controller/volume/scheduling/scheduler_binder.go#L873-L877
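The linked check boils down to matching the node's labels against the topology selector terms of the storage class; roughly (a simplified Python sketch for illustration, not the actual scheduler code):

```python
def node_matches_allowed_topologies(allowed_topologies, node_labels):
    """Return True if the node satisfies at least one topology selector term.

    Each term is a dict mapping a label key to the list of acceptable values;
    the node must satisfy every key within a single term.
    """
    if not allowed_topologies:
        return True  # no restriction configured
    return any(
        all(node_labels.get(key) in values for key, values in term.items())
        for term in allowed_topologies
    )

# A node labeled as running the driver matches; an unlabeled node does not.
terms = [{"example.com/csi-driver-running": ["true"]}]
print(node_matches_allowed_topologies(terms, {"example.com/csi-driver-running": "true"}))  # True
print(node_matches_allowed_topologies(terms, {}))  # False
```

With late binding, a node failing this check is simply never selected, which is why this approach avoids the stuck-volume problem entirely.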

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 11, 2021
@pohly
Contributor Author

pohly commented Apr 11, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 11, 2021
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 10, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 9, 2021
@pohly
Contributor Author

pohly commented Aug 9, 2021

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 9, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 7, 2021
@pohly
Contributor Author

pohly commented Nov 8, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 8, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 6, 2022
@pohly
Contributor Author

pohly commented Feb 7, 2022

/remove-lifecycle stale
/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 7, 2022
lukasmetzner added a commit to hetznercloud/csi-driver that referenced this issue Oct 29, 2024
Due to a bug in the scheduler, a node with no driver instance might be
picked and the volume is stuck in pending because the "no capacity ->
reschedule" recovery is never triggered
[[0]](kubernetes/kubernetes#122109),
[[1]](kubernetes-csi/external-provisioner#544).

- See #400

---------

Co-authored-by: lukasmetzner <[email protected]>
Co-authored-by: Julian Tölle <[email protected]>
lukasmetzner added a commit to hetznercloud/csi-driver that referenced this issue Nov 11, 2024
Due to a bug in the scheduler, a node with no driver instance might be
picked and the volume is stuck in pending because the "no capacity ->
reschedule" recovery is never triggered
[[0]](kubernetes/kubernetes#122109),
[[1]](kubernetes-csi/external-provisioner#544).

- See #400

---------

Co-authored-by: lukasmetzner <[email protected]>
Co-authored-by: Julian Tölle <[email protected]>
lukasmetzner added a commit to hetznercloud/csi-driver that referenced this issue Nov 12, 2024
We modified the response for `NodeGetInfo` to return an additional
Topology Segment. We assumed that this only “adds” new info, but in
practice it breaks the spec.

When trying to schedule a volume to nodes, the container orchestration
system should verify that the Node fulfills at least one Accessible
Topology of the Volume, where "fulfills" means that all supplied segments
match.

This is not implemented in the same way between Kubernetes and Nomad.

- **Kubernetes**: requirements are fulfilled if the volume specifies a
subset of the Nodes topology
- **Nomad**: requirements are fulfilled if the volume specifies all of
the Nodes topology

We made these changes to work around a bug in the Kubernetes scheduler
([here](kubernetes-csi/external-provisioner#544))
where nodes without the CSI Plugin would still be considered for
scheduling, but then creating and attaching the volume fails with no
automatic reconciliation of this error.
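The Kubernetes/Nomad difference described in that commit message can be sketched like this (a simplified illustration of the two matching semantics, not code from either project):

```python
def kubernetes_fulfills(volume_segments, node_segments):
    # Kubernetes: every segment the volume specifies must match the node;
    # the node may carry additional segments the volume does not mention.
    return all(node_segments.get(k) == v for k, v in volume_segments.items())

def nomad_fulfills(volume_segments, node_segments):
    # Nomad: the volume must specify (and match) all of the node's segments.
    return all(volume_segments.get(k) == v for k, v in node_segments.items())

node = {"region": "eu-central", "example.com/driver-ready": "true"}
volume = {"region": "eu-central"}  # does not mention the extra node segment

print(kubernetes_fulfills(volume, node))  # True: a subset is enough
print(nomad_fulfills(volume, node))       # False: node segment missing from volume
```

So adding an extra topology segment in `NodeGetInfo` stays invisible to Kubernetes' subset matching but breaks scheduling under Nomad's all-segments matching.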