Skip to content

Use stack for buffer/WAL usage instrumentation #14

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

lfittl
Copy link
Owner

@lfittl lfittl commented Jan 26, 2025

TODO

  • Performance testing

    • Can we show that the reduced InstrStopNode effort e.g. speeds up an EXPLAIN ANALYZE .. COUNT(*) ..?)
    • Can we show that parallel query runs faster in regular queries? (since it can now conditionally skip buffer usage measurements)
  • Prototype HLL usage for shared buffers hit distinct (mainly to test whether the API is sound for that usage)

  • Double check whether the emitted_fpi logic is actually needed (since we do keep the WAL usage counters after all)

  • Double check we have no more bugs with Nested Loops counter on the loop node itself

  • Add EXPLAIN option for distinct (this may be easiest after Robert's patch is pushed)

  • Consider changing "hit distinct" => "distinct", and including reads

  • Investigate alternatives to modifying utility commands that use internal transactions

    • Idea 1: Can we assign to the portal resource owner, instead of the transaction? (but AFAIK I tried that and it didn't work since the portal resource owner is the initial transaction?)
    • Idea 2: Can we create our own resource owner in places where InstrUsage/InstrStop works with utility statements (i.e. pg_stat_statements utility tracking)
    • Idea 3: Maybe we should keep using top-level buffer counters after all? (and only use the stack to fix the InstrStartNode/InstrStopNode problem)
  • Double check whether the fact that parallel query no longer updates the caller's pgWalUsage is a problem (did it actually do that before, or just update the instrumentation struct?)

  • Debug test failure in triggers.sql regression test (merge into parent p using (values (1)) as v(id) on p.aid = v.id caused an abort, see https://api.cirrus-ci.com/v1/artifact/task/6191633160470528/log/src/test/recovery/tmp_check/log/027_stream_regress_primary.log)

    • ASAN failure? ==53931==ERROR: AddressSanitizer: heap-use-after-free on address 0x625000ab99c0 at pc 0x556c737b6388 bp 0x7ffedc6f6240 sp 0x7ffedc6f6238
      #0 0x556c737b6387 in XLogInsertRecord /tmp/cirrus-ci-build/src/backend/access/transam/xlog.c:1081
      #1 0x556c737c5615 in XLogInsert /tmp/cirrus-ci-build/src/backend/access/transam/xloginsert.c:523
      #2 0x556c737a1811 in XactLogAbortRecord /tmp/cirrus-ci-build/src/backend/access/transam/xact.c:6110
      #3 0x556c737a1cb0 in RecordTransactionAbort /tmp/cirrus-ci-build/src/backend/access/transam/xact.c:1818
      #4 0x556c737a2263 in AbortTransaction /tmp/cirrus-ci-build/src/backend/access/transam/xact.c:2926
      #5 0x556c737a3b35 in AbortCurrentTransactionInternal /tmp/cirrus-ci-build/src/backend/access/transam/xact.c:3495
      #6 0x556c737a3d1d in AbortCurrentTransaction /tmp/cirrus-ci-build/src/backend/access/transam/xact.c:3449
      #7 0x556c74005963 in PostgresMain /tmp/cirrus-ci-build/src/backend/tcop/postgres.c:4408
      #8 0x556c73ff830f in BackendMain /tmp/cirrus-ci-build/src/backend/tcop/backend_startup.c:107
      #9 0x556c73e51e28 in postmaster_child_launch /tmp/cirrus-ci-build/src/backend/postmaster/launch_backend.c:274
      #10 0x556c73e57c89 in BackendStartup /tmp/cirrus-ci-build/src/backend/postmaster/postmaster.c:3519
      #11 0x556c73e5a915 in ServerLoop /tmp/cirrus-ci-build/src/backend/postmaster/postmaster.c:1688
      #12 0x556c73e5ca5a in PostmasterMain /tmp/cirrus-ci-build/src/backend/postmaster/postmaster.c:1386
      #13 0x556c73bf88ea in main /tmp/cirrus-ci-build/src/backend/main/main.c:230
      #14 0x7f8b30248249  (/lib/x86_64-linux-gnu/libc.so.6+0x27249)
      #15 0x7f8b30248304 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27304)
      #16 0x556c735d53a0 in _start (/tmp/cirrus-ci-build/tmp_install/usr/local/pgsql/bin/postgres+0x4233a0)
    

@lfittl lfittl force-pushed the instrument-usage-stack branch from 3551c3a to 0d18fd9 Compare January 26, 2025 10:17
@lfittl lfittl force-pushed the instrument-usage-stack branch from 0d18fd9 to c50f815 Compare February 8, 2025 21:04
@lfittl lfittl force-pushed the instrument-usage-stack branch 4 times, most recently from 1298979 to a22521d Compare March 9, 2025 17:12
@lfittl lfittl force-pushed the instrument-usage-stack branch from a22521d to 0e48a26 Compare March 16, 2025 06:15
@lfittl lfittl force-pushed the instrument-usage-stack branch from 50c8c28 to 91b4067 Compare May 15, 2025 15:51
@lfittl lfittl force-pushed the instrument-usage-stack branch from 91b4067 to 900c403 Compare May 15, 2025 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant