feat: if upgrade 17 -> 17 modify upgrade process #1583

Merged
merged 22 commits into from
May 15, 2025
167cc84
feat: if upgrade 17 -> 17 or 17-orioledb -> 17-orioledb do not run th…
samrose May 2, 2025
bd71971
feat: handle all cases of SERVER_LC_COLLATE and SERVER_LC_CTYPE
samrose May 5, 2025
35116f1
fix: explixit set on 17/oriole
samrose May 6, 2025
096a6b5
feat: handling max_slot_wal_keep_size for pg 17 was needed as well
samrose May 6, 2025
3b57d25
feat: binary upgrades require max_slot_wal_keep_size to be -1 during …
samrose May 6, 2025
f442e52
fix: Better to override that during the upgrade process by specifying…
samrose May 6, 2025
d54f361
fix: cover only pg 17
samrose May 7, 2025
e2028b2
fix: rm oriole handling
samrose May 7, 2025
68e4a85
fix: do not need max_slot_wal_keep_size on old version
samrose May 7, 2025
556baec
fix: temp config on new-options too
samrose May 7, 2025
2f9044b
fix: remove unbound var
samrose May 7, 2025
e734bfb
chore: remove complete.sh change that should not have been committed
samrose May 7, 2025
e07ec08
chore: bump for testing
samrose May 7, 2025
f067318
chore: stash code
samrose May 9, 2025
8ea5351
feat: working pg 17 upgrade
samrose May 9, 2025
dce6cfb
feat: pg 15 handling
samrose May 9, 2025
36329b9
feat: rm oriole handling, refine 15 -> 17 config
samrose May 10, 2025
921e3ca
feat: make sure old pg stops if not force stop
samrose May 11, 2025
0fc1623
chore: bump version
samrose May 11, 2025
8ca4ab0
chore: cleanup + bump version for test
samrose May 12, 2025
28cb728
fix: rollback to working version with fix from divit
samrose May 15, 2025
f474c99
chore: version bump
samrose May 15, 2025
56 changes: 48 additions & 8 deletions ansible/files/admin_api_scripts/pg_upgrade_scripts/initiate.sh
@@ -41,10 +41,14 @@ LOG_FILE="/var/log/pg-upgrade-initiate.log"

 POST_UPGRADE_EXTENSION_SCRIPT="/tmp/pg_upgrade/pg_upgrade_extensions.sql"
 POST_UPGRADE_POSTGRES_PERMS_SCRIPT="/tmp/pg_upgrade/pg_upgrade_postgres_perms.sql"
-OLD_PGVERSION=$(run_sql -A -t -c "SHOW server_version;")
+OLD_PGVERSION=$(pg_config --version | sed 's/PostgreSQL \([0-9]*\.[0-9]*\).*/\1/')

+# Skip locale settings if both versions are PostgreSQL 17+
+if ! [[ "$OLD_PGVERSION" =~ ^17.* && "$PGVERSION" =~ ^17.* ]]; then
+SERVER_LC_COLLATE=$(run_sql -A -t -c "SHOW lc_collate;")
+SERVER_LC_CTYPE=$(run_sql -A -t -c "SHOW lc_ctype;")
+fi
+
-SERVER_LC_COLLATE=$(run_sql -A -t -c "SHOW lc_collate;")
-SERVER_LC_CTYPE=$(run_sql -A -t -c "SHOW lc_ctype;")
 SERVER_ENCODING=$(run_sql -A -t -c "SHOW server_encoding;")

 POSTGRES_CONFIG_PATH="/etc/postgresql/postgresql.conf"
@@ -251,7 +255,12 @@ function initiate_upgrade {
 if [ -n "$IS_LOCAL_UPGRADE" ]; then
 mkdir -p "$PG_UPGRADE_BIN_DIR"
 mkdir -p /tmp/persistent/
-echo "a7189a68ed4ea78c1e73991b5f271043636cf074" > "$PG_UPGRADE_BIN_DIR/nix_flake_version"
+if [ -n "$NIX_FLAKE_VERSION" ]; then
+echo "$NIX_FLAKE_VERSION" > "$PG_UPGRADE_BIN_DIR/nix_flake_version"
+else
+echo "a7189a68ed4ea78c1e73991b5f271043636cf074" > "$PG_UPGRADE_BIN_DIR/nix_flake_version"
+fi
+
 tar -czf "/tmp/persistent/pg_upgrade_bin.tar.gz" -C "/tmp/pg_upgrade_bin" .
 rm -rf /tmp/pg_upgrade_bin/
 fi
@@ -394,9 +403,14 @@ function initiate_upgrade {
 rm -rf "${PGDATANEW:?}/"

 if [ "$IS_NIX_UPGRADE" = "true" ]; then
-LC_ALL=en_US.UTF-8 LC_CTYPE=$SERVER_LC_CTYPE LC_COLLATE=$SERVER_LC_COLLATE LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -c ". /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh && $PGBINNEW/initdb --encoding=$SERVER_ENCODING --lc-collate=$SERVER_LC_COLLATE --lc-ctype=$SERVER_LC_CTYPE -L $PGSHARENEW -D $PGDATANEW/ --username=supabase_admin" -s "$SHELL" postgres
+if [[ "$PGVERSION" =~ ^17.* ]]; then
+LC_ALL=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -c ". /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh && $PGBINNEW/initdb --encoding=$SERVER_ENCODING --locale-provider=icu --icu-locale=en_US.UTF-8 -L $PGSHARENEW -D $PGDATANEW/ --username=supabase_admin" -s "$SHELL" postgres
+else
+LC_ALL=en_US.UTF-8 LC_CTYPE=$SERVER_LC_CTYPE LC_COLLATE=$SERVER_LC_COLLATE LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -c ". /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh && $PGBINNEW/initdb --encoding=$SERVER_ENCODING --lc-collate=$SERVER_LC_COLLATE --lc-ctype=$SERVER_LC_CTYPE -L $PGSHARENEW -D $PGDATANEW/ --username=supabase_admin" -s "$SHELL" postgres
+fi
 else
 su -c "$PGBINNEW/initdb -L $PGSHARENEW -D $PGDATANEW/ --username=supabase_admin" -s "$SHELL" postgres
+
 fi

 # This line avoids the need to supply the supabase_admin password on the old
@@ -409,6 +423,20 @@ $(cat /etc/postgresql/pg_hba.conf)" > /etc/postgresql/pg_hba.conf
 run_sql -c "select pg_reload_conf();"
 fi

+TMP_CONFIG="/tmp/pg_upgrade/postgresql.conf"
+cp "$POSTGRES_CONFIG_PATH" "$TMP_CONFIG"
+
+# Check if max_slot_wal_keep_size exists in the config
+# Add the setting if not found
+echo "max_slot_wal_keep_size = -1" >> "$TMP_CONFIG"
+
+# Remove db_user_namespace if upgrading from PG15
+if [[ "$OLD_PGVERSION" =~ ^15.* && "$PGVERSION" =~ ^17.* ]]; then
+sed -i '/^db_user_namespace/d' "$TMP_CONFIG"
+fi
+
+chown postgres:postgres "$TMP_CONFIG"
+
 UPGRADE_COMMAND=$(cat <<EOF
 time ${PGBINNEW}/pg_upgrade \
 --old-bindir="${PGBINOLD}" \
@@ -417,17 +445,23 @@ $(cat /etc/postgresql/pg_hba.conf)" > /etc/postgresql/pg_hba.conf
 --new-datadir=${PGDATANEW} \
 --username=supabase_admin \
 --jobs="${WORKERS}" -r \
---old-options='-c config_file=${POSTGRES_CONFIG_PATH}' \
+--old-options="-c config_file=$TMP_CONFIG" \
 --old-options="-c shared_preload_libraries='${SHARED_PRELOAD_LIBRARIES}'" \
 --new-options="-c data_directory=${PGDATANEW}" \
+--new-options="-c config_file=$TMP_CONFIG" \
 --new-options="-c shared_preload_libraries='${SHARED_PRELOAD_LIBRARIES}'"
 EOF
 )

 if [ "$IS_NIX_BASED_SYSTEM" = "true" ]; then
 UPGRADE_COMMAND=". /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh && $UPGRADE_COMMAND"
 fi
-GRN_PLUGINS_DIR=/var/lib/postgresql/.nix-profile/lib/groonga/plugins LC_ALL=en_US.UTF-8 LC_CTYPE=$SERVER_LC_CTYPE LC_COLLATE=$SERVER_LC_COLLATE LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -pc "$UPGRADE_COMMAND --check" -s "$SHELL" postgres
+
+if [[ "$PGVERSION" =~ ^17.* ]]; then
+GRN_PLUGINS_DIR=/var/lib/postgresql/.nix-profile/lib/groonga/plugins LC_ALL=en_US.UTF-8 LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -pc "$UPGRADE_COMMAND --check" -s "$SHELL" postgres
+else
+GRN_PLUGINS_DIR=/var/lib/postgresql/.nix-profile/lib/groonga/plugins LC_ALL=en_US.UTF-8 LC_CTYPE=$SERVER_LC_CTYPE LC_COLLATE=$SERVER_LC_COLLATE LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -pc "$UPGRADE_COMMAND --check" -s "$SHELL" postgres
+fi

 echo "10. Stopping postgres; running pg_upgrade"
 # Extra work to ensure postgres is actually stopped
@@ -439,11 +473,17 @@ EOF

 sleep 3
 systemctl stop postgresql
+
 else
 CI_stop_postgres
 fi

-GRN_PLUGINS_DIR=/var/lib/postgresql/.nix-profile/lib/groonga/plugins LC_ALL=en_US.UTF-8 LC_CTYPE=$SERVER_LC_CTYPE LC_COLLATE=$SERVER_LC_COLLATE LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -pc "$UPGRADE_COMMAND" -s "$SHELL" postgres
+# Start the old PostgreSQL instance with version-specific options
+if [[ "$PGVERSION" =~ ^17.* ]]; then
+GRN_PLUGINS_DIR=/var/lib/postgresql/.nix-profile/lib/groonga/plugins LC_ALL=en_US.UTF-8 LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -pc "$UPGRADE_COMMAND" -s "$SHELL" postgres
+else
+GRN_PLUGINS_DIR=/var/lib/postgresql/.nix-profile/lib/groonga/plugins LC_ALL=en_US.UTF-8 LC_CTYPE=$SERVER_LC_CTYPE LC_COLLATE=$SERVER_LC_COLLATE LANGUAGE=en_US.UTF-8 LANG=en_US.UTF-8 LOCALE_ARCHIVE=/usr/lib/locale/locale-archive su -pc "$UPGRADE_COMMAND" -s "$SHELL" postgres
+fi

 # copying custom configurations
 echo "11. Copying custom configurations"
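
The temporary-config step in this file's diff (copy `postgresql.conf`, append `max_slot_wal_keep_size = -1`, drop `db_user_namespace` on the 15 -> 17 path) could also be written as a small reusable function. The sketch below is not part of the PR: the names `pg_major` and `prepare_upgrade_config` are hypothetical, and it makes two changes to the logic for illustration. It removes any existing `max_slot_wal_keep_size` line before appending, and it compares exact major versions instead of matching `^17.*`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers, not taken from initiate.sh.

pg_major() {
    # "17.4" -> "17", "15.8" -> "15", "17" -> "17"
    printf '%s\n' "${1%%.*}"
}

prepare_upgrade_config() {
    # Usage: prepare_upgrade_config SRC_CONF DST_CONF OLD_VERSION NEW_VERSION
    local src="$1" dst="$2" old_version="$3" new_version="$4"

    cp "$src" "$dst"

    # Drop any existing setting so repeated runs do not stack duplicates,
    # then append the value the PR uses during the upgrade.
    sed -i.bak '/^[[:space:]]*max_slot_wal_keep_size[[:space:]=]/d' "$dst"
    echo "max_slot_wal_keep_size = -1" >> "$dst"

    # As in the PR, db_user_namespace is removed only when going from 15 to 17.
    if [ "$(pg_major "$old_version")" = "15" ] && [ "$(pg_major "$new_version")" = "17" ]; then
        sed -i.bak '/^db_user_namespace/d' "$dst"
    fi

    # sed -i.bak is used for GNU/BSD portability; remove the backup it leaves.
    rm -f "$dst.bak"
}
```

Called as `prepare_upgrade_config "$POSTGRES_CONFIG_PATH" "$TMP_CONFIG" "$OLD_PGVERSION" "$PGVERSION"`, this would leave the `chown` from the PR as a separate step.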
6 changes: 3 additions & 3 deletions ansible/vars.yml
@@ -9,9 +9,9 @@ postgres_major:

 # Full version strings for each major version
 postgres_release:
-  postgresorioledb-17: "17.0.1.079-orioledb"
-  postgres17: "17.4.1.029"
-  postgres15: "15.8.1.086"
+  postgresorioledb-17: "17.0.1.080-orioledb"
+  postgres17: "17.4.1.030"
+  postgres15: "15.8.1.087"

 # Non Postgres Extensions
 pgbouncer_release: "1.19.0"
115 changes: 115 additions & 0 deletions nix/docs/testing-pg-upgrade-scripts.md
@@ -0,0 +1,115 @@
# Testing PostgreSQL Upgrade Scripts

This document describes how to test changes to the PostgreSQL upgrade scripts on a running machine.

## Prerequisites

- A running PostgreSQL instance
- Access to the Supabase Postgres repository
- Permissions to run GitHub Actions workflows
- SSH access to the EC2 instance

## Development Workflow

1. **Make Changes to Upgrade Scripts**
- Make your changes to the scripts in `ansible/files/admin_api_scripts/pg_upgrade_scripts/`
- Commit and push your changes to your feature branch
- For quick testing, you can also edit the script directly on the server at `/etc/adminapi/pg_upgrade_scripts/initiate.sh`

2. **Publish Script Changes** (Only needed for deploying to new instances)
- Go to [publish-nix-pgupgrade-scripts.yml](https://github.com/supabase/postgres/actions/workflows/publish-nix-pgupgrade-scripts.yml)
- Click "Run workflow"
- Select your branch
- Run the workflow

3. **Publish Binary Flake Version** (Only needed for deploying to new instances)
- Go to [publish-nix-pgupgrade-bin-flake-version.yml](https://github.com/supabase/postgres/actions/workflows/publish-nix-pgupgrade-bin-flake-version.yml)
- Click "Run workflow"
- Select your branch
- Run the workflow
- Note: Make sure the flake version includes the PostgreSQL version you're testing (e.g., 17)

4. **Test on Running Machine**
SSH into the machine, then run:
```bash
# Stop PostgreSQL
sudo systemctl stop postgresql

# Run the upgrade script in local mode with your desired flake version
sudo NIX_FLAKE_VERSION="your-flake-version-here" IS_LOCAL_UPGRADE=true /etc/adminapi/pg_upgrade_scripts/initiate.sh 17
```
Note: This will use the version of the script that exists at `/etc/adminapi/pg_upgrade_scripts/initiate.sh` on the server.
The script should be run as the ubuntu user with sudo privileges. The script will handle switching to the postgres user when needed.

In local mode:
- The script at `/etc/adminapi/pg_upgrade_scripts/initiate.sh` will be used (your edited version)
- Only the PostgreSQL binaries will be downloaded from the specified flake version
- No new upgrade scripts will be downloaded
- You can override the flake version by setting the `NIX_FLAKE_VERSION` environment variable
- If `NIX_FLAKE_VERSION` is not set, the script falls back to its built-in default flake version
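
The fallback described in the last two bullets can be sketched in a few lines of shell. This is an illustration rather than the script's own code: the function name `resolve_flake_version` is hypothetical, and the default value is the hash hard-coded in `initiate.sh` in this PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose the flake version the way local mode is described above.
# DEFAULT_FLAKE_VERSION is the hash hard-coded in initiate.sh in this PR.
DEFAULT_FLAKE_VERSION="a7189a68ed4ea78c1e73991b5f271043636cf074"

resolve_flake_version() {
    # ${VAR:-default} yields the default when VAR is unset or empty,
    # and does not trip `set -u` the way a bare "$NIX_FLAKE_VERSION" would.
    printf '%s\n' "${NIX_FLAKE_VERSION:-$DEFAULT_FLAKE_VERSION}"
}

# Show which version a run in the current environment would use.
resolve_flake_version
```

Printing the resolved value before starting a test run is a cheap way to confirm that `sudo` actually passed the variable through to the script.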

5. **Monitor Progress**
```bash
# Watch the upgrade log
tail -f /var/log/pg-upgrade-initiate.log
```

6. **Check Results**
In local mode, the script will:
- Create a new data directory at `/data_migration/pgdata`
- Run pg_upgrade to test the upgrade process
- Generate SQL files in `/data_migration/sql/` for any needed post-upgrade steps
- Log the results in `/var/log/pg-upgrade-initiate.log`

To verify success:
```bash
# Check the upgrade log for completion
grep "Upgrade complete" /var/log/pg-upgrade-initiate.log

# Check for any generated SQL files
ls -l /data_migration/sql/

# Check the new data directory
ls -l /data_migration/pgdata/
```
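
The three manual checks above can be bundled into a single pass/fail function. This is a minimal sketch under stated assumptions: `check_upgrade_result` is a hypothetical name, the default paths are the ones named in this document, and the `PG_VERSION` check relies on the fact that `initdb` writes that file into every data directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper: returns 0 only if the log and the new data directory look healthy.
# Usage: check_upgrade_result [LOG_FILE] [DATA_DIR] [SQL_DIR]
check_upgrade_result() {
    local log_file="${1:-/var/log/pg-upgrade-initiate.log}"
    local data_dir="${2:-/data_migration/pgdata}"
    local sql_dir="${3:-/data_migration/sql}"
    local status=0

    if grep -q "Upgrade complete" "$log_file" 2>/dev/null; then
        echo "ok: completion marker found in $log_file"
    else
        echo "FAIL: no 'Upgrade complete' line in $log_file"
        status=1
    fi

    # A data directory created by initdb contains a PG_VERSION file.
    if [ -f "$data_dir/PG_VERSION" ]; then
        echo "ok: new data directory reports version $(cat "$data_dir/PG_VERSION")"
    else
        echo "FAIL: $data_dir/PG_VERSION is missing"
        status=1
    fi

    # Generated SQL files are optional, so they are reported but never fail the check.
    if [ -d "$sql_dir" ]; then
        echo "info: $(find "$sql_dir" -maxdepth 1 -type f | wc -l) file(s) in $sql_dir"
    fi

    return "$status"
}
```

Because the data directory is normally readable only by the `postgres` user, the function would need to run under `sudo` (or as `postgres`) on a real instance.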

Note: The instance will not be upgraded to the new version in local mode. This is just a test run to verify the upgrade process works correctly.

## Important Notes

- The `IS_LOCAL_UPGRADE=true` flag makes the script run in the foreground and skip disk mounting steps
- The script will use the existing data directory
- All output is logged to `/var/log/pg-upgrade-initiate.log`
- The script will automatically restart PostgreSQL after completion or failure
- For testing, you can edit the script directly on the server - the GitHub Actions workflows are only needed for deploying to new instances
- Run the script as the ubuntu user with sudo privileges - the script will handle user switching internally
- Local mode is for testing only - it will not actually upgrade the instance
- The Nix flake version must include the PostgreSQL version you're testing (e.g., 17)
- In local mode, only the PostgreSQL binaries are downloaded from the flake - the upgrade scripts are used from the local filesystem
- You can override the flake version by setting the `NIX_FLAKE_VERSION` environment variable when running the script

## Troubleshooting

If the upgrade fails:
1. Check the logs at `/var/log/pg-upgrade-initiate.log`
2. Look for any error messages in the PostgreSQL logs
3. The script will attempt to clean up and restore the original state
4. If you see an error about missing Nix flake attributes, make sure the flake version includes the PostgreSQL version you're testing

Common Errors:
- `error: flake 'github:supabase/postgres/...' does not provide attribute 'packages.aarch64-linux.psql_17/bin'`
  - This means the Nix flake version doesn't include PostgreSQL 17 binaries
  - You need to specify a flake version that includes your target version
  - You can find valid flake versions by looking at the commit history of the publish-nix-pgupgrade-bin-flake-version.yml workflow
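
To spot this error quickly, the log can be scanned for the attribute Nix failed to find. The function below is a sketch, not part of the upgrade scripts: `find_missing_flake_attrs` is a hypothetical name, and the pattern is based only on the message shape quoted above, so other Nix error formats may not match.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list each distinct flake attribute reported as missing in the log.
# Usage: find_missing_flake_attrs [LOG_FILE]
find_missing_flake_attrs() {
    local log_file="${1:-/var/log/pg-upgrade-initiate.log}"

    # grep -o keeps only the matching part of each line; sed then strips the
    # fixed prefix and the closing quote, leaving the bare attribute path.
    grep -o "does not provide attribute '[^']*'" "$log_file" 2>/dev/null \
        | sed "s/^does not provide attribute '//; s/'\$//" \
        | sort -u
}
```

An attribute such as `packages.aarch64-linux.psql_17/bin` in the output indicates that the chosen flake version predates PostgreSQL 17 support, so a newer `NIX_FLAKE_VERSION` is needed.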

## Cleanup

After testing:
1. The script will automatically clean up temporary files
2. PostgreSQL will be restarted
3. The original configuration will be restored

## References

- [publish-nix-pgupgrade-scripts.yml](https://github.com/supabase/postgres/actions/workflows/publish-nix-pgupgrade-scripts.yml)
- [publish-nix-pgupgrade-bin-flake-version.yml](https://github.com/supabase/postgres/actions/workflows/publish-nix-pgupgrade-bin-flake-version.yml)