`pg_tde` test PR #3

jeltz · 2025-02-24T18:13:24Z

This is a testy PR!

1. TruncateMultiXact() performs the SLRU truncations in a critical section. Deleting the SLRU segments calls ForwardSyncRequest(), which will try to compact the request queue if it's full (CompactCheckpointerRequestQueue()). That in turn allocates memory, which is not allowed in a critical section. Backtrace: TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "../src/backend/utils/mmgr/mcxt.c", Line: 1353, PID: 920981 postgres: autovacuum worker template0(ExceptionalCondition+0x6e)[0x560a501e866e] postgres: autovacuum worker template0(+0x5dce3d)[0x560a50217e3d] postgres: autovacuum worker template0(ForwardSyncRequest+0x8e)[0x560a4ffec95e] postgres: autovacuum worker template0(RegisterSyncRequest+0x2b)[0x560a50091eeb] postgres: autovacuum worker template0(+0x187b0a)[0x560a4fdc2b0a] postgres: autovacuum worker template0(SlruDeleteSegment+0x101)[0x560a4fdc2ab1] postgres: autovacuum worker template0(TruncateMultiXact+0x2fb)[0x560a4fdbde1b] postgres: autovacuum worker template0(vac_update_datfrozenxid+0x4b3)[0x560a4febd2f3] postgres: autovacuum worker template0(+0x3adf66)[0x560a4ffe8f66] postgres: autovacuum worker template0(AutoVacWorkerMain+0x3ed)[0x560a4ffe7c2d] postgres: autovacuum worker template0(+0x3b1ead)[0x560a4ffecead] postgres: autovacuum worker template0(+0x3b620e)[0x560a4fff120e] postgres: autovacuum worker template0(+0x3b3fbb)[0x560a4ffeefbb] postgres: autovacuum worker template0(+0x2f724e)[0x560a4ff3224e] /lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f62cc642c8a] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f62cc642d45] postgres: autovacuum worker template0(_start+0x21)[0x560a4fd16f31] To fix, bail out in CompactCheckpointerRequestQueue() without doing anything, if it's called in a critical section. That covers the above call path, as well as any other similar cases where RegisterSyncRequest might be called in a critical section. 2. After fixing that, another problem became apparent: Autovacuum process doing that truncation can deadlock with the checkpointer process. TruncateMultiXact() sets "MyProc->delayChkptFlags |= DELAY_CHKPT_START". If the sync request queue is full and cannot be compacted, the process will repeatedly sleep and retry, until there is room in the queue. However, if the checkpointer is trying to start a checkpoint at the same time, and is waiting for the DELAY_CHKPT_START processes to finish, the queue will never shrink. More concretely, the autovacuum process is stuck here: #0 0x00007fc934926dc3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x000056220b24348b in WaitEventSetWaitBlock (set=0x56220c2e4b50, occurred_events=0x7ffe7856d040, nevents=1, cur_timeout=<optimized out>) at ../src/backend/storage/ipc/latch.c:1570 #2 WaitEventSetWait (set=0x56220c2e4b50, timeout=timeout@entry=10, occurred_events=<optimized out>, occurred_events@entry=0x7ffe7856d040, nevents=nevents@entry=1, wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:1516 #3 0x000056220b243224 in WaitLatch (latch=<optimized out>, latch@entry=0x0, wakeEvents=wakeEvents@entry=40, timeout=timeout@entry=10, wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:538 postgres#4 0x000056220b26cf46 in RegisterSyncRequest (ftag=ftag@entry=0x7ffe7856d0a0, type=type@entry=SYNC_FORGET_REQUEST, retryOnError=true) at ../src/backend/storage/sync/sync.c:614 postgres#5 0x000056220af9db0a in SlruInternalDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1495 postgres#6 0x000056220af9dab1 in SlruDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1566 postgres#7 0x000056220af98e1b in PerformMembersTruncation (oldestOffset=<optimized out>, newOldestOffset=<optimized out>) at ../src/backend/access/transam/multixact.c:3006 postgres#8 TruncateMultiXact (newOldestMulti=newOldestMulti@entry=3221225472, newOldestMultiDB=newOldestMultiDB@entry=4) at ../src/backend/access/transam/multixact.c:3201 postgres#9 0x000056220b098303 in vac_truncate_clog (frozenXID=749, minMulti=<optimized out>, lastSaneFrozenXid=749, lastSaneMinMulti=3221225472) at ../src/backend/commands/vacuum.c:1917 postgres#10 vac_update_datfrozenxid () at ../src/backend/commands/vacuum.c:1760 postgres#11 0x000056220b1c3f76 in do_autovacuum () at ../src/backend/postmaster/autovacuum.c:2550 postgres#12 0x000056220b1c2c3d in AutoVacWorkerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/autovacuum.c:1569 and the checkpointer is stuck here: #0 0x00007fc9348ebf93 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6 #1 0x00007fc9348fe353 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6 #2 0x000056220b40ecb4 in pg_usleep (microsec=microsec@entry=10000) at ../src/port/pgsleep.c:50 #3 0x000056220afb43c3 in CreateCheckPoint (flags=flags@entry=108) at ../src/backend/access/transam/xlog.c:7098 postgres#4 0x000056220b1c6e86 in CheckpointerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/checkpointer.c:464 To fix, add AbsorbSyncRequests() to the loops where the checkpointer waits for DELAY_CHKPT_START or DELAY_CHKPT_COMPLETE operations to finish. Backpatch to v14. Before that, SLRU deletion didn't call RegisterSyncRequest, which avoided this failure. I'm not sure if there are other similar scenarios on older versions, but we haven't had any such reports. Discussion: https://www.postgresql.org/message-id/[email protected]

If we look up something about a relation in the command start event trigger we need to hold onto that lock until the end of the command and not release it so no other commd is able to change the data we looked at between now and when the DDL command itself actually locks the relation.

For now we do not use this field but we plan to use it when encrypting Wrelation data and WAL.

We might as well increase the entropy by adding a random base value to the IVs used used when encrypting relations. But since for now the pair of key + IV is generated and used together it adds little extra security over what we already have. We add the IV with XOR since that is a cheap and easy operation which uases no extra collissions.

Same as last commit but for the WAL encryption. Rewrote the calculation in a way gcc's vectorizer likes, as verified with Godbolt. The code generated by clang is ok and branch free but it fails to properly vectorize both before and after.

This way they can eventually be made more similar to using OpenSSL for encrypting and decrypting the CTR stream.

While this does not merge the two similar functions it removes a lot of cruft from pg_tde_read_one_map_entry2() making the functions more similar and the code easier to read.

…l key We already had protection against decrypting relation keys with the wrong principal key but to properly protect us against new relation keys being encrypted with the wrong principal key we need to also verify that the principal key was correct when we fetch the principal key from the key provider. We do so by signing the principal key info header of the key map file using AES-128-GCM. This way we cannot get a jumbled mess of relation keys encrypted with multiple different principal keys.

This follows the pattern in the rest of the PostgreSQL source code.

The end LSN of the current buffer to decript was exclusive while the end LSN of the key was inclusive which had led to confusion and an off-by-one bug. Also add a simple test case for the WAL encryption using the logical decoding's test plugin.

This makes the code easier to read plus gives us things like letting the compiler check for unused variables.

Several variables were at a too long scope or were initialized when they should not be. Plus convert a while loop to a for loop.

The comment about all-zero pages created by smgrzeroextend() sounded much scarier than the reality. In fact trying to encrypt these all-zero pages might not only be a waste of CPU cycles but also could decrease security by making us re-use the same IV first with all zeros and then with the actual data. And the extra amount of protection we gain from encrypting them is minuscule since they are only added at the end of the table, soon overwritten and only gives the attacker a very slightly more accurate table size.

There is no point in looking up the length of the file when we already know it.

* PG-1457 Rename some key management funcions * PG-1457 Fix some tests * PG-1457 Hit CI * PG-1457 Rename key in CI setup * PG-1457 Rename pg_tde_verify_global_principal_key to pg_tde_verify_server_principal_key * PG-1457 Rename keys in tests * PG-1457 Renaming * PG-1457 Renaming * PG-1457 Fix tests * PG-1457 Fix tests * PG-1457 Fix tabs * PG-1457 Fix tests * PG-1457 Fix tests * PG-1457 Fix * PG-1457 Fix test * PG-1457 Fix test * PG-1457 Hit CI * PG-1457 Fix after rebase * PG-1457 Fix * PG-1457 Fix * PG-1457 Fix * PG-1457 Fix test * PG-1457 Fix tests * PG-1457 Fix tests * PG-1457 Fix

It's not clear to someone new to the code that "key provider" in these files refers to what's called "key provider type" elsewhere. Rename these to make it easier for the next person.

By using SPI and a SQL language function to force the rewrite of sequences we made life unnecessarily hard for ourselves and also intoduced a bug where ALTER TABLE could break if the pg_tde extension's functions were not in the search path. Instead we re-use the code for updating sequence persistence whe table persistance is changed. And we only need to look at the table's presistance since it is always the same as the one of the sequences.

This file is no longer used by our scripts and is easy to generate when running pgindent.

The R was pulling double duty in the old name.

This is just a conversion of the google doc into markdown, with actualizing some of the outdated details in the document. The last section (researc/investigation topics) is left out, as that doesn't make much sense in a public documentation.

When we did a recovery, even if tde_heap was not used, 98% of the time was spent in pg_tde_get_key_from_file() due to our SMGR missing the shortcicruit from mdcreate() which skips running if the fork already is open. The issue was made worse by us not caching negative SMGR key lookups but that is the subject of another commit.

…s#154) PG-1457 Replace `principal key` with just a `key` on user API level, as it's the only key that user can directly interact with.

Co-authored-by: Andrew Pogrebnoi <[email protected]>

Commit e3a87b4 added a way to call ereport() without any extra parentheses which was backported all the way back to PostgreSQL 12. The new modern way improves code readability slightly.

HeapTupleIsValid() is actually just a null check but PostgreSQL's codebase almost always uses this macro and we had a confusion where we both had a null check and called this macro so we at least should pick just one of the two ways to write it. And here I picked the most commonly used way in the PostgreSQL codebase.

We apparently had the same alternate file twice in our repo.

For some reason these functions were declared twice, once using the macro and once without it.

To run pg_tde tests with `make check` we have to add pg_buffercache and test_decoding extensions to temporary pg installation.

There is no user of this return value and furthermore the function can only ever return true.

These pgindent excludes are no longer relevant.

Since upstream does not ignore it neither should we.

The .gitignore entry was left in the old location when the source for the executable was moved.

One of them even always had meson in the name even when make was ran.

Most of the result files and the logs were missing from the artifact created when running make check-world.

Instead of giving them numbers we call them pg_tde_ddl_start and pg_tde_ddl_end. Since the triggers are not on the same event the names do not matter for the order they are executed in.

In tde_keyring.c we do a lot of switching on the keyring type, some of it which could be done in a slightly nicer way.

The meson and the make scripts had diverged a lot, so this commit fixes that plus some other inconsistencies.

The TDE_MODE environment variable disables tests we actually want to run in our Github Actions. This change is also necessary to in a future commit disable the pg_tde tests in the global TDE mode.

The purpose of the global TDE mode is to run PostgreSQL's normal test suite but with our extension so running the pg_tde test suite when in that mode makes no sense. Meson supports disabling test suites with --no-suite so we only need to do this for the Makefile.

Now that we no longer run the pg_tde suite in the global TDE mode we can remove all the code which was there to support it.

This removes some unnecessary queries and formats the queries to be easier to read.

Also add a couple of tests for the DEFAULT case to avoid regressions.

Improve readability for the provider vs key type logic.

This enum was only used in one place and oscured the two dimensions of provider types (database vs global) and principal keys (server vs default vs database).

Make the intent a lot cleaner by doing the check when a global provider us used instead of deciding to do it or not per user facing function.

While these tests test our changes to pg_waldump they are quite easy to overlook right now and where exactly should we draw the line? These tests are not something we ever want to upstream and in the future when we figure out how we want to make sure pg_waldump works with encrypted WAL we likely will want to have the tests for that solution in the same folder as our other tests anyway.

jeltz changed the base branch from master to TDE_REL_17_STABLE February 24, 2025 18:14

jeltz force-pushed the tde/test branch from 6c74368 to 4d5fc5b Compare February 24, 2025 18:16

jeltz and others added 26 commits April 4, 2025 19:48

PG-1455 Add field with random base IV to relation and WAL keys

7ea53b4

For now we do not use this field but we plan to use it when encrypting Wrelation data and WAL.

PG-1455 Change xlog base IVs to use addition

36bc82d

This way they can eventually be made more similar to using OpenSSL for encrypting and decrypting the CTR stream.

PG-1437 Simplify pg_tde_read_one_map_entry2()

d788185

While this does not merge the two similar functions it removes a lot of cruft from pg_tde_read_one_map_entry2() making the functions more similar and the code easier to read.

Use correct type for fd

e9d4927

Use %m for error message in our modification of pg_regress

d83fc90

This follows the pattern in the rest of the PostgreSQL source code.

Use separate variables instead of an array when rotating files

b11d51c

This makes the code easier to read plus gives us things like letting the compiler check for unused variables.

Clean up tdeheap_xlog_seg_read()

2616e27

Several variables were at a too long scope or were initialized when they should not be. Plus convert a while loop to a for loop.

Remove unnecessary seek when we already know the size

fef106d

There is no point in looking up the length of the file when we already know it.

Clarify some namings in keyring code

3381f84

It's not clear to someone new to the code that "key provider" in these files refers to what's called "key provider type" elsewhere. Rename these to make it easier for the next person.

Remove typedefs.list

0a54910

This file is no longer used by our scripts and is easy to generate when running pgindent.

Fix typo in KeyringProvideRecord typedef name

ab2e9b1

The R was pulling double duty in the old name.

Added high level overview to documentation

e0a3448

This is just a conversion of the google doc into markdown, with actualizing some of the outdated details in the document. The last section (researc/investigation topics) is left out, as that doesn't make much sense in a public documentation.

Improved readability

afb7151

Updated review comments

9356a40

PG-1457 Rename principal key on user API level to just a key (postgre…

959a6b6

…s#154) PG-1457 Replace `principal key` with just a `key` on user API level, as it's the only key that user can directly interact with.

Update doc link readme (postgres#175)

d158ba5

Co-authored-by: Andrew Pogrebnoi <[email protected]>

Modernize our way to call ereport()

725c34d

Commit e3a87b4 added a way to call ereport() without any extra parentheses which was backported all the way back to PostgreSQL 12. The new modern way improves code readability slightly.

jeltz force-pushed the tde/test branch from 4d5fc5b to aae28f5 Compare April 11, 2025 02:51

jeltz and others added 12 commits April 25, 2025 15:50

Remove duplicate pg_regress alternate expected file

b639210

We apparently had the same alternate file twice in our repo.

Remove double declarations

aeaae4e

For some reason these functions were declared twice, once using the macro and once without it.

Specifiy extensions required for tde test in EXTRA_INSTALL

c1da756

To run pg_tde tests with `make check` we have to add pg_buffercache and test_decoding extensions to temporary pg installation.

Extract code coverage jobs into separate CI workflow

198af9e

Run simple tests for code coverage needs

082cc11

Highlight that code coverage badge in main readme file related to pg_tde

b7d52ab

Remove unused return value from check_percona_api_version()

d602c2e

There is no user of this return value and furthermore the function can only ever return true.

Remove leftovers from when we remove tde_heap_basic

d51c498

These pgindent excludes are no longer relevant.

Stop ignoring nodtags.h for pgindent

f1d12f1

Since upstream does not ignore it neither should we.

Move pg_tde_change_key_provider .gitignore entry

435ef90

The .gitignore entry was left in the old location when the source for the executable was moved.

Clean up .gitignore

e355f38

jeltz force-pushed the tde/test branch from 28cf5f3 to d3147df Compare April 25, 2025 21:23

jeltz added 16 commits April 26, 2025 14:58

Make failure artifact names consistent in Github Actions

5f64c75

One of them even always had meson in the name even when make was ran.

Add more test result and logs to the Github failure artifact

5a1fe76

Most of the result files and the logs were missing from the artifact created when running make check-world.

Rename event triggers to something more expressive

6fef8bd

Instead of giving them numbers we call them pg_tde_ddl_start and pg_tde_ddl_end. Since the triggers are not on the same event the names do not matter for the order they are executed in.

Clean up some switch statements and if chains

712dcf5

In tde_keyring.c we do a lot of switching on the keyring type, some of it which could be done in a slightly nicer way.

Harmonize the different Github Actions entry points

e979ab7

The meson and the make scripts had diverged a lot, so this commit fixes that plus some other inconsistencies.

Only enable TDE_MODE for the tests with global TDE

b7dfa68

The TDE_MODE environment variable disables tests we actually want to run in our Github Actions. This change is also necessary to in a future commit disable the pg_tde tests in the global TDE mode.

Remove support for running pg_tde tests in global TDE mode

6a1e1b6

Now that we no longer run the pg_tde suite in the global TDE mode we can remove all the code which was there to support it.

Clean up test SQL file

ae04e97

This removes some unnecessary queries and formats the queries to be easier to read.

Make ALTER TABLE ... SET ACCESS METHOD logic easier to read

dffece7

Also add a couple of tests for the DEFAULT case to avoid regressions.

Simplify logic for global, local, server, default

7177c2d

Improve readability for the provider vs key type logic.

Get rid of global_status enum

7ed3115

This enum was only used in one place and oscured the two dimensions of provider types (database vs global) and principal keys (server vs default vs database).

Simplify superuser check for pg_tde_set_*_key_using_*_key_provider()

bb000ef

Make the intent a lot cleaner by doing the check when a global provider us used instead of deciding to do it or not per user facing function.

WIP: PG-1442 Enable encryption of catalog tables

3f51a90

WIP: PG-1442 Ugly fix

ba8d862

jeltz force-pushed the tde/test branch from d3147df to ba8d862 Compare April 28, 2025 08:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`pg_tde` test PR #3

`pg_tde` test PR #3

Uh oh!

jeltz commented Feb 24, 2025

Uh oh!

Uh oh!

pg_tde test PR #3

Are you sure you want to change the base?

pg_tde test PR #3

Uh oh!

Conversation

jeltz commented Feb 24, 2025

Uh oh!

Uh oh!

`pg_tde` test PR #3

`pg_tde` test PR #3