PG-1604: Improve last key LSN calculation logic #519
Conversation
Force-pushed from fed0689 to a9b1adf.
Force-pushed from 295c016 to 82b1ce9.
Codecov Report
❌ Patch coverage check failed. ❌ Your project status has failed because the head coverage (82.51%) is below the target coverage (90.00%). You can increase the head coverage or adjust the target coverage.

@@ Coverage Diff @@
## TDE_REL_17_STABLE #519 +/- ##
=====================================================
+ Coverage 82.19% 82.51% +0.31%
=====================================================
Files 25 25
Lines 3174 3209 +35
Branches 515 510 -5
=====================================================
+ Hits 2609 2648 +39
+ Misses 456 452 -4
Partials 109 109
Force-pushed from 637f46e to 0049350.
Review of the bugfix commit.
I am still not a huge fan of rewinding and using the same key to write what we think should be new data, but which technically does not have to be.
The min/max comparisons of LSNs assumed that everything is in the same timeline. In practice, with replication and recovery combined, keys can span at least three timelines, which means the timeline has to be included in both comparisons, as in other timelines the restrictions are less strict.
last_key_loc.tli = TDEXLogGetEncKeyTli();
lastKeyUsable = (TDEXLogGetEncKeyLsn() != 0);
afterLastKey = wal_location_cmp(last_key_loc, loc) <= 0;
Is it even necessary to check for this? Don't we write the WAL in a linear fashion, so that once recovery is done we can't really write anything to old locations?
Yes, it is, because replicas might go back to earlier locations when rewriting an entire segment.
Even after recovery is done? Wow. I think this needs to be explained with a comment in the code.
Even after recovery is done?
Replicas are always in recovery, so we can't rely on just checking the recovery status. That was one of my earlier solutions, but with it replicas basically never generated new keys (only in some corner cases).
So the current solution restricts the check to crash recovery, and generates a new key even on replicas once we have progressed past the last key.
Previously we simply set the LSN for the new key to the first write location. This is however not correct, as there are many corner cases around this:

* recovery / replication might write old LSNs
* we can't handle multiple keys with the same TLI/LSN, which can happen with quick restarts without writes

To support this, in this commit we modify the following:

* We only activate new keys outside crash recovery, or immediately if encryption is turned off
* We also take the already existing last key into account (if it exists), and only activate a new key if we progressed past its start location

The remaining changes are just support infrastructure for this:

* Since we might rewrite old records, we use the already existing keys for those writes, not the active last key
* We prefetch existing keys during initialization, so lookup doesn't accidentally happen in the critical section during a write

There is a remaining bug with stopping WAL encryption, also mentioned in a TODO message in the code. This will be addressed in a later PR, as this fix already took too long.
And two additional bugfixes, see the separate commits.