Skip to content

[pull] master from postgres:master #970

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 16, 2025
Merged

[pull] master from postgres:master #970

merged 1 commit into from
Jun 16, 2025

Conversation

pull[bot]
Copy link

@pull pull bot commented Jun 16, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.1)

Can you help keep this open source service alive? 💖 Please sponsor : )

Previously there was no memory barrier enforcing correct memory ordering when
waiting for a free IO handle. However, in the much more common case of waiting
for IO to complete, memory barriers already were present.

On strongly ordered architectures like x86 this had no negative consequences,
but on some armv8 hardware (observed on Apple hardware), it was possible for
the update, in the IO worker, to PgAioHandle->state to become visible before
->distilled_result becoming visible, leading to rather confusing assertion
failures. The failures were rare enough that the bug sometimes took days to
reproduce when running 027_stream_regress in a loop.

Once finally debugged, it was easy enough to come up with a much quicker
repro: Trigger a lot of very fast IO by limiting io_combine_limit to 1 and
ensure that we always have to wait for a free handle by setting
io_max_concurrency to 1. Triggering lots of concurrent seqscans in that setup
triggers the issue within seconds.

One reason this was hard to debug was that the assertion failure most commonly
happened in WaitReadBuffers(), rather than in the AIO subsystem itself. The
assertions added in this commit make problems like this easier to understand.

Also add a comment to the IO worker explaining that we rely on the lwlock
acquisition for correct memory ordering.

I think it'd be good to add a tap test that stress tests buffer IO, but that's
material for a separate patch.

Thanks a lot to Alexander and Konstantin for all the debugging help.

Reported-by: Tom Lane <[email protected]>
Reported-by: Alexander Lakhin <[email protected]>
Investigated-by: Andres Freund <[email protected]>
Investigated-by: Alexander Lakhin <[email protected]>
Investigated-by: Konstantin Knizhnik <[email protected]>
Discussion: https://postgr.es/m/2dkz7azclpeiqcmouamdixyn5xhlzy4rvikxrbovyzvi6rnv5c@pz7o7osv2ahf
@pull pull bot added the ⤵️ pull label Jun 16, 2025
@pull pull bot merged commit e9a3615 into tuhaihe:master Jun 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant