-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Add documentation around desired balance #119902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add documentation around desired balance #119902
Conversation
Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with some minor comments (some are probably just preference so up to you how you address)
* @param lastConvergedIndex Identifies what input data the balancer computation round used to produce this {@link DesiredBalance}. See | ||
* {@link DesiredBalanceInput#index()} for details. Each reroute request gets assigned a monotonically increasing | ||
* sequence number, and the balancer, which runs async to reroute, uses the latest request's data to compute the | ||
* desired balance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think this would be "strictly increasing", "monotonically increasing" means values can be repeated? Perhaps "sequence number" is enough as (I think) it implies the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, applied. Simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On master failover, the index gets set back to -1
Lines 378 to 379 in 80729f9
private void onNoLongerMaster() { | |
if (indexGenerator.getAndSet(-1) != -1) { |
If the node later becomes master again, the index will again begin from zero. So I think we should qualified it as "strictly increasing in the same master term".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated 👍 Thanks!
* produces a new ClusterState with the changes made by {@link DesiredBalanceReconciler#reconcile}. The {@link RerouteStrategy} provided | ||
* to the callback calls into {@link #desiredBalanceReconciler} for the changes. The {@link #masterServiceTaskQueue} will apply the | ||
* cluster state update. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems overly specific to me? Given it's an interface, I feel like I'd rather know what it does rather than how it does it.
I think it's the the "to run ...." bit that I find jarring. If it's a good abstraction, only the what should matter, not the how. We can use our IDEs to find the implementation(s). Also would be less likely to go stale if we were less specific.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that a good abstraction would explain what and not how. The problem with this area of the code is that it's like spaghetti and difficult to follow. Right now the Allocator has a callback to the AllocationService, which has a callback to the Allocator, which produces a result for the AllocationService to feed back into the Allocator's MasterServiceTaskQueue..... The first step to improve the code, in my mind, is to document what's happening, later we can hopefully refactor the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
produces a result for the AllocationService to feed back into the Allocator's MasterServiceTaskQueue
I don't quite get this part. Seems also related to the last sentence in the comment
The {@link #masterServiceTaskQueue} will apply the cluster state update.
IIUC, submitting the ReconcileDesiredBalanceTask
should be the first step to trigger the callback to AllocationService, i.e. reconciler
. It also feels a bit odd to say a master task queue "apply" the cluster state update. In general, master service computes the new cluster state. The ClusterApplierService then applies the state update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed the text from apply to publish, per the MasterService terminology. And mentioned that the ReconcileDesiredBalanceExecutor constructs the cluster state. Hopefully that makes things a bit clearer, let me know.
In general, master service computes the new cluster state. The ClusterApplierService then applies the state update.
The MasterServiceTaskQueue is created here with the *Executor, and the queue appears to know how to execute and publish a cluster state update. I didn't dig into the details, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Yeah, MasterService computes (executes) and publishes the new cluster state. So Publish sounds good. There relevant applier here is IndicesClusterStateService
.
...va/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceShardsAllocator.java
Show resolved
Hide resolved
* Accepts listeners with an index value (see {#link #indexGenerator}) and run them whenever a DesiredBalance computation completes with | ||
* an equal or greater index value. | ||
*/ | ||
private final PendingListenersQueue pendingListenersQueue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the javadoc on the pending listeners queue is enough? or we're duplicating it a bit (i.e. more to maintain)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, you're right. Rewrote to just say that it tracks and runs listeners for after computation completes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
* @param lastConvergedIndex Identifies what input data the balancer computation round used to produce this {@link DesiredBalance}. See | ||
* {@link DesiredBalanceInput#index()} for details. Each reroute request gets assigned a monotonically increasing | ||
* sequence number, and the balancer, which runs async to reroute, uses the latest request's data to compute the | ||
* desired balance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, applied. Simpler.
* produces a new ClusterState with the changes made by {@link DesiredBalanceReconciler#reconcile}. The {@link RerouteStrategy} provided | ||
* to the callback calls into {@link #desiredBalanceReconciler} for the changes. The {@link #masterServiceTaskQueue} will apply the | ||
* cluster state update. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that a good abstraction would explain what and not how. The problem with this area of the code is that it's like spaghetti and difficult to follow. Right now the Allocator has a callback to the AllocationService, which has a callback to the Allocator, which produces a result for the AllocationService to feed back into the Allocator's MasterServiceTaskQueue..... The first step to improve the code, in my mind, is to document what's happening, later we can hopefully refactor the code.
...va/org/elasticsearch/cluster/routing/allocation/allocator/DesiredBalanceShardsAllocator.java
Show resolved
Hide resolved
* Accepts listeners with an index value (see {#link #indexGenerator}) and run them whenever a DesiredBalance computation completes with | ||
* an equal or greater index value. | ||
*/ | ||
private final PendingListenersQueue pendingListenersQueue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, you're right. Rewrote to just say that it tracks and runs listeners for after computation completes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made some updates from Yang's review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Relates ES-10341
More documentation again. I'm trying to figure out how everything plugs together so I can hook in my metric collection.
Relates ES-10341