[no-release-notes] go/store/nbs: During a GC process, take dependencies on chunks that are read through the ChunkStore. #8760
Conversation
…equests so that a GC does not end while they are in progress.
…pendencies on chunks during reads.
…chunks which were in the memtable but get filtered out because they are already in the store.
…h for chunks that are written but which are already present in the store.
…g, make it so that reading chunks as part of the GC process does not take further dependencies on them and never blocks on waitForGC.
LGTM, just the one question about locking store access on block
```go
	defer gcs.newGen.mu.RUnlock()
	return gcs.hasMany(toHasRecords(hashes))

func (gcs *GenerationalNBS) HasMany(ctx context.Context, hashes hash.HashSet) (hash.HashSet, error) {
	absent, err := gcs.newGen.HasMany(ctx, hashes)
```
is the diff just from collapsing `gcs.getMany`? or is there more to this?
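For context, a minimal self-contained sketch of what the collapsed `HasMany` could look like; the `oldGen` fallback and all types here are simplified assumptions based on the diff excerpt above, not the actual implementation:

```go
package nbssketch

import "context"

// HashSet stands in for hash.HashSet; store stands in for the
// per-generation stores. Both are simplified assumptions.
type HashSet map[string]struct{}

type store interface {
	// HasMany returns the subset of hashes absent from the store.
	HasMany(ctx context.Context, hashes HashSet) (HashSet, error)
}

type GenerationalNBS struct {
	newGen, oldGen store
}

// HasMany consults the new generation first, and only checks the old
// generation for the hashes the new generation reported absent.
func (gcs *GenerationalNBS) HasMany(ctx context.Context, hashes HashSet) (HashSet, error) {
	absent, err := gcs.newGen.HasMany(ctx, hashes)
	if err != nil || len(absent) == 0 {
		return absent, err
	}
	return gcs.oldGen.HasMany(ctx, absent)
}
```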
```go
	err := nbs.waitForGC(ctx)
	nbs.mu.Unlock()
```
i'm sure this is intentional, but is it better to block here than let other sessions try to do partial work before blocking on the same wait? ex: marked chunks cache hits under this miss
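One plausible reading, sketched with simplified types: if `waitForGC` is built on a condition variable (an assumption about the implementation, not something shown in this diff), the mutex is released while the waiter is blocked, so other sessions are not shut out for the duration of the GC; the waiter retries its operation once the GC completes.

```go
package nbssketch

import (
	"context"
	"sync"
)

type NomsBlockStore struct {
	mu           sync.Mutex
	gcCond       *sync.Cond // assumed to be initialized with &mu
	gcInProgress bool
}

// waitForGC blocks until the in-progress GC finishes. It must be called
// with mu held; Cond.Wait atomically releases mu while waiting and
// reacquires it before returning, so other sessions can make progress
// in the meantime. Note: a real implementation would also need to
// arrange a Broadcast on context cancellation, since Wait itself does
// not observe ctx; this check only runs at each wakeup.
func (nbs *NomsBlockStore) waitForGC(ctx context.Context) error {
	for nbs.gcInProgress {
		if err := ctx.Err(); err != nil {
			return err
		}
		nbs.gcCond.Wait()
	}
	return nil
}
```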
Previously GC was constructed to walk the transitive closure of reachable chunks from when it started, and to take additional dependencies on any chunks that were written to the ChunkStore during the GC process. This change makes it so that we can additionally take dependencies on any chunks that are read from the ChunkStore during the GC process.
For the `nbs` layer to signal a new dependency during a GC, it makes use of a `keeperFunc`, which is passed into the `BeginGC` call. The contract is that, if `keeperFunc` is non-`nil` and it returns `false`, the chunk address which was given to the function will make its way back to a call to `SaveHashes` on some `MarkAndSweeper` whose contents will end up in the final store once the GC is completed. `keeperFunc`, however, is allowed to return `true`, which signals to the block store that the in-progress GC has passed a certain point in time and can no longer make that guarantee. As a result, any time `keeperFunc` returns `true`, the block store implementation needs to block until the GC process is over, and then it should resume the operation from the beginning.
That machinery already exists for the `Put()` and `Commit()` use case. This PR adds machinery to do all of that on the various read paths that explicitly touch chunks: `Has`, `HasMany`, `Get`, `GetMany` and `GetManyCompressed`. In order to add dependency tracking on the read path, it's important for the GC process itself not to cause the chunks it reads to form additional dependencies. This PR makes a slight change to `markAndSweeper` so that it can fetch chunks from the source store without recording unnecessary duplicate dependencies and potentially becoming mired in deadlocks.
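To make that last point concrete, here is a hedged sketch of the shape such a change could take. `getManyNoDeps` and everything else in this snippet is hypothetical; only the idea that GC-internal reads must bypass the keeper, and so never block on `waitForGC`, comes from the PR description:

```go
package nbssketch

import "context"

type Hash [20]byte

// markAndSweeper walks the reachable set. Its own reads must go through
// a path that skips dependency tracking: if they consulted keeperFunc,
// every chunk the GC reads would become a new dependency, and a read
// that returned true would make the GC wait on itself (a deadlock).
type markAndSweeper struct {
	src interface {
		// getManyNoDeps is a hypothetical read path that never calls
		// keeperFunc and never blocks on waitForGC.
		getManyNoDeps(ctx context.Context, addrs []Hash) ([][]byte, error)
	}
}

func (ms *markAndSweeper) mark(ctx context.Context, toVisit []Hash) error {
	chunks, err := ms.src.getManyNoDeps(ctx, toVisit)
	if err != nil {
		return err
	}
	_ = chunks // walk refs, accumulate addresses, call SaveHashes, etc.
	return nil
}
```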