-
Notifications
You must be signed in to change notification settings - Fork 13.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-37379][state/forst] Double list LRU in ForSt state and cache reload #26237
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, overall design LGTM, left two comments regarding the implementation.
void movedToSecond(FileCacheEntry value) { | ||
// trigger the evicting | ||
LOG.trace("Cache entry {} moved to second link.", value.cachePath); | ||
cacheLimitPolicy.release(value.entrySize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the FileCacheEntry#cachePath
deleted from cacheFs
?
Do we need to wait for delete before releasing cacheLimitPolicy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache will be deleted in value.invalidate()
, if the reference count reached 0.
We'd better not wait for that, since we are on the task thread and it's better not block here.
...nk-statebackend-forst/src/main/java/org/apache/flink/state/forst/fs/cache/DoubleListLru.java
Show resolved
Hide resolved
Also, additional two commits to enhance the user experience are attached. |
What is the purpose of the change
Currently, the file-based LRU cache lacks of loading files from the disaggregated DFS. Meaning that after the job recovery, there will be no cache available, hence the performance is low over a period of time. It is critical for production availability of disaggregated state.
This PR introduces a new implementation that allows ForSt to manage file-based cache in a double list (hot/cold) and support cache reload after eviction.
Brief change log
DoubleListLru
and rewrite theFileBasedCache
based on that.Verifying this change
This change is already covered by existing IT tests. The new introduced classes are covered by new introduced UTs.
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation