Add documentation around desired balance #119902
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)
LGTM with some minor comments (some are probably just preference so up to you how you address)
 * @param lastConvergedIndex Identifies what input data the balancer computation round used to produce this {@link DesiredBalance}. See
 * {@link DesiredBalanceInput#index()} for details. Each reroute request gets assigned a monotonically increasing
 * sequence number, and the balancer, which runs async to reroute, uses the latest request's data to compute the
 * desired balance.
Nit: I think this would be "strictly increasing", "monotonically increasing" means values can be repeated? Perhaps "sequence number" is enough as (I think) it implies the same?
Sounds good, applied. Simpler.
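To illustrate the behaviour the Javadoc above describes, here is a minimal sketch, assuming a simplified model: every reroute request takes the next sequence number, and an asynchronous balancer picks up only the most recent input. `LatestInputSketch`, `Input`, `Balance`, `submit` and `computeOnce` are hypothetical names for this example, not the Elasticsearch classes.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; not the actual DesiredBalanceShardsAllocator. */
class LatestInputSketch {
    /** Stand-in for a reroute request's data, tagged with its sequence number. */
    record Input(long index, String data) {}

    /** Stand-in for a computed balance that remembers which input produced it. */
    record Balance(long lastConvergedIndex, String assignments) {}

    private final AtomicLong indexGenerator = new AtomicLong(-1);
    private final AtomicReference<Input> latestInput = new AtomicReference<>();

    /** Called per reroute: assigns the next sequence number and supersedes any older pending input. */
    long submit(String data) {
        long index = indexGenerator.incrementAndGet();
        Input candidate = new Input(index, data);
        // Keep whichever input has the higher index, so concurrent submits cannot go backwards.
        latestInput.accumulateAndGet(candidate, (cur, next) -> cur == null || next.index() > cur.index() ? next : cur);
        return index;
    }

    /** Called by the asynchronous balancer: consumes only the newest input, skipping older ones. */
    Balance computeOnce() {
        Input input = latestInput.getAndSet(null);
        if (input == null) {
            return null; // nothing new to compute
        }
        return new Balance(input.index(), "computed from " + input.data());
    }
}
```

In this sketch, two calls to `submit` followed by one `computeOnce` yield a `Balance` whose `lastConvergedIndex` is the second request's index; the first request's data is never used.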
 * produces a new ClusterState with the changes made by {@link DesiredBalanceReconciler#reconcile}. The {@link RerouteStrategy} provided
 * to the callback calls into {@link #desiredBalanceReconciler} for the changes. The {@link #masterServiceTaskQueue} will apply the
 * cluster state update.
 */
This comment seems overly specific to me? Given it's an interface, I feel like I'd rather know what it does rather than how it does it.
I think it's the "to run ...." bit that I find jarring. If it's a good abstraction, only the what should matter, not the how. We can use our IDEs to find the implementation(s). It would also be less likely to go stale if we were less specific.
I agree that a good abstraction would explain what and not how. The problem with this area of the code is that it's like spaghetti and difficult to follow. Right now the Allocator has a callback to the AllocationService, which has a callback to the Allocator, which produces a result for the AllocationService to feed back into the Allocator's MasterServiceTaskQueue... The first step to improve the code, in my mind, is to document what's happening; later we can hopefully refactor the code.
 * Reconciliation ({@link DesiredBalanceReconciler#reconcile(DesiredBalance, RoutingAllocation)}) takes the {@link DesiredBalance}
 * output of {@link DesiredBalanceComputer#compute} and identifies how shards need to be added, moved or removed to go from the current
 * cluster shard allocation to the new desired allocation.
 */
private final DesiredBalanceReconciler desiredBalanceReconciler;
Should this doc be on DesiredBalanceReconciler#reconcile?
I've updated the PR with a new comment on DesiredBalanceReconciler#reconcile.
I'd like to explain here how desiredBalanceReconciler differs from reconciler. They have practically the same name right now, so I think it makes the code more understandable to very clearly explain how they are used / what they do in this file's context.
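As a rough illustration of what "added, moved or removed" means in the Javadoc above, here is a minimal sketch of the diffing idea. It is a deliberately simplified assumption: one copy per shard, allocations modelled as a plain shard-to-node map, and no throttling or allocation deciders. `ReconcileSketch` and `Plan` are hypothetical names, and this is not how `DesiredBalanceReconciler` is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of diffing a current allocation against a desired one. */
class ReconcileSketch {
    /** The three kinds of change needed to reach the desired allocation. */
    record Plan(List<String> add, List<String> move, List<String> remove) {}

    /** Both maps go from shard id to node id (assumption: a single copy per shard). */
    static Plan reconcile(Map<String, String> current, Map<String, String> desired) {
        List<String> add = new ArrayList<>();
        List<String> move = new ArrayList<>();
        List<String> remove = new ArrayList<>();
        // TreeMap copies give a deterministic iteration order for the example.
        for (Map.Entry<String, String> e : new TreeMap<>(desired).entrySet()) {
            String currentNode = current.get(e.getKey());
            if (currentNode == null) {
                add.add(e.getKey() + " -> " + e.getValue());          // desired but not yet allocated
            } else if (!currentNode.equals(e.getValue())) {
                move.add(e.getKey() + ": " + currentNode + " -> " + e.getValue()); // on the wrong node
            }
        }
        for (String shard : new TreeMap<>(current).keySet()) {
            if (!desired.containsKey(shard)) {
                remove.add(shard);                                     // allocated but no longer desired
            }
        }
        return new Plan(add, move, remove);
    }
}
```

For example, with current `{s1=n1, s2=n1, s3=n2}` and desired `{s1=n1, s2=n2, s4=n3}`, the sketch reports `s4` as an add, `s2` as a move from `n1` to `n2`, and `s3` as a remove.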
 * Accepts listeners with an index value (see {@link #indexGenerator}) and runs them whenever a DesiredBalance computation completes with
 * an equal or greater index value.
 */
private final PendingListenersQueue pendingListenersQueue;
I think the javadoc on the pending listeners queue is enough? or we're duplicating it a bit (i.e. more to maintain)
Oops, you're right. Rewrote it to just say that it tracks listeners and runs them after a computation completes.
Updated
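For readers following the thread, the listener behaviour discussed above (run a listener once a computation completes with an equal or greater index) can be sketched roughly as follows. This is a simplified assumption of the idea only; `PendingListenersSketch`, `add` and `complete` are hypothetical names and do not mirror the real `PendingListenersQueue` API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of an index-gated listener queue. */
class PendingListenersSketch {
    private record Pending(long index, Runnable listener) {}

    private final List<Pending> pending = new ArrayList<>();
    private long completedIndex = -1;

    /** Registers a listener; runs it immediately if a computation at or past its index already completed. */
    void add(long index, Runnable listener) {
        synchronized (this) {
            if (index > completedIndex) {
                pending.add(new Pending(index, listener));
                return;
            }
        }
        listener.run(); // run outside the lock
    }

    /** Marks a computation as complete and runs every listener whose index is now covered. */
    void complete(long index) {
        List<Runnable> ready = new ArrayList<>();
        synchronized (this) {
            completedIndex = Math.max(completedIndex, index);
            for (Iterator<Pending> it = pending.iterator(); it.hasNext(); ) {
                Pending p = it.next();
                if (p.index() <= completedIndex) {
                    ready.add(p.listener());
                    it.remove();
                }
            }
        }
        ready.forEach(Runnable::run); // run outside the lock, in registration order
    }
}
```

With listeners registered at indices 1 and 3, `complete(2)` runs only the first; `complete(3)` then runs the second, and a listener added afterwards with index 2 runs straight away.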
More documentation again. I'm trying to figure out how everything plugs together so I can hook in my metric collection.
Relates ES-10341