Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERROR: switched WAL could not be archived in 10 seconds #227

Closed
MasonXon opened this issue May 19, 2022 · 6 comments
Closed

ERROR: switched WAL could not be archived in 10 seconds #227

MasonXon opened this issue May 19, 2022 · 6 comments

Comments

@MasonXon
Copy link

By viewing historical issues, I cannot found the answer for this problem, I put my configuration:

restore_command = 'cp ./pg_archivelog/%f %p'
archive_mode = 'on'
archive_timeout = '120'
archive_command = 'cp %p /var/lib/pgsql/12/data/pg_archivelog/%f'

I'm sure the mode archive is all right, because I tested pg_basebackup for PITR before.

ARCLOG_PATH='/var/lib/pgsql/12/data/pg_archivelog'
SRVLOG_PATH='/var/lib/pgsql/12/data/log'

operations init worked normal, but backup occurs following:

[postgres@pgsql ~]$ /usr/pgsql-12/bin/pg_rman -B /var/lib/pgsql/backups -b full --hard-copy backup -P
INFO: copying database files
Processed 5012 of 5012 files, skipped 0
ERROR: switched WAL could not be archived in 10 seconds

I have tried restarting, modifying database timeout parameters, etc., but none of them seem to work.

@mikecaat
Copy link
Contributor

Hi, could you share the bellow results?

  1. execute with --debug option
# /usr/pgsql-12/bin/pg_rman -B /var/lib/pgsql/backups -b full --hard-copy backup -P --debug
  1. check the archive status after the error is happened
# ls -lR $PGDATA/pg_wal
# ls -lR /var/lib/pgsql/12/data/pg_archivelog/

@MasonXon
Copy link
Author

before 1:

[postgres@pgsql data]$ ls -lR $PGDATA/pg_wal
/var/lib/pgsql/12/data/pg_wal:
total 393232
-rw-------. 1 postgres postgres       35 May 19 12:54 00000002.history
-rw-------. 1 postgres postgres       80 May 19 14:02 00000003.history
-rw-------. 1 postgres postgres      347 May 23 15:25 00000004000000010000003B.00000028.backup
-rw-------. 1 postgres postgres 16777216 May 23 19:15 00000004000000010000003F
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000040
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000041
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000042
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000043
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000044
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000045
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000046
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000047
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000048
-rw-------. 1 postgres postgres 16777216 May 19 15:56 000000040000000100000049
-rw-------. 1 postgres postgres 16777216 May 19 15:55 00000004000000010000004A
-rw-------. 1 postgres postgres 16777216 May 19 15:54 00000004000000010000004B
-rw-------. 1 postgres postgres 16777216 May 19 15:54 00000004000000010000004C
-rw-------. 1 postgres postgres 16777216 May 19 15:54 00000004000000010000004D
-rw-------. 1 postgres postgres 16777216 May 19 15:54 00000004000000010000004E
-rw-------. 1 postgres postgres 16777216 May 19 15:54 00000004000000010000004F
-rw-------. 1 postgres postgres 16777216 May 19 15:54 000000040000000100000050
-rw-------. 1 postgres postgres 16777216 May 19 15:54 000000040000000100000051
-rw-------. 1 postgres postgres 16777216 May 19 15:54 000000040000000100000052
-rw-------. 1 postgres postgres 16777216 May 19 15:54 000000040000000100000053
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000054
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000055
-rw-------. 1 postgres postgres 16777216 May 19 15:55 000000040000000100000056
-rw-------. 1 postgres postgres      125 May 19 14:02 00000004.history
drwx------. 2 postgres postgres      146 May 23 19:14 archive_status

/var/lib/pgsql/12/data/pg_wal/archive_status:
total 0
-rw-------. 1 postgres postgres 0 May 19 10:07 00000002.history.done
-rw-------. 1 postgres postgres 0 May 19 12:54 00000003.history.done
-rw-------. 1 postgres postgres 0 May 23 15:25 00000004000000010000003B.00000028.backup.done
-rw-------. 1 postgres postgres 0 May 19 14:02 00000004.history.done

ls -lR $PGDATA/pg_wal|wc -l
39

ls -lR /var/lib/pgsql/12/data/pg_archivelog/|wc -l
72

backup log follow:

[postgres@pgsql data]$ /usr/pgsql-12/bin/pg_rman -B /var/lib/pgsql/backups -b full --hard-copy backup -P --debug
DEBUG: the initially configured target database : SYSTEM_IDENTIFIER = 7044337785934363742
DEBUG: the system identifier of current target database : 7044337785934363742
DEBUG: the backup target database is the same as initial configured one.
DEBUG: destination directories of backup are initialized
DEBUG: checking PostgreSQL server version
DEBUG: server version is 12.0.11
DEBUG: checking block size setting
DEBUG: (query) SELECT current_setting($1)
DEBUG: 	(param:0) = block_size
DEBUG: block size is 8192
DEBUG: checking block size setting
DEBUG: (query) SELECT current_setting($1)
DEBUG: 	(param:0) = wal_block_size
DEBUG: wal block size is 8192
DEBUG: data checksum disabled on the initially configured database
INFO: copying database files
DEBUG: executing pg_start_backup()
DEBUG: (query) SELECT * from pg_walfile_name_offset(pg_start_backup($1, $2, $3))
DEBUG: 	(param:0) = 2022-05-23 19:20:12 with pg_rman
DEBUG: 	(param:1) = true
DEBUG: 	(param:2) = false
DEBUG: backup start point is (WAL file: 000000040000000100000040, xrecoff: 40)
DEBUG: taking full backup of database files
Processed 1348 of 1348 files, skipped 00
DEBUG: executing pg_stop_backup()
DEBUG: (query) SET client_min_messages = warning;
DEBUG: (query) SELECT * FROM pg_stop_backup($1)
DEBUG: 	(param:0) = false
DEBUG: (query) SELECT * FROM pg_walfile_name_offset($1)
DEBUG: 	(param:0) = 1/41000050
DEBUG: backup end point is (WAL file: 000000040000000100000041, xrecoff: 80)
DEBUG: waiting for 000000040000000100000041 is archived
DEBUG: (query) SELECT txid_current();
DEBUG: current XID is 1106
DEBUG: WAL file containing backup end point is archived after waiting for 0 seconds
INFO: copying archived WAL files
DEBUG: turn to take a full backup of archived WAL files
DEBUG: taking backup of archived WAL files
Processed 73 of 73 files, skipped 0
INFO: backup complete
INFO: Please execute 'pg_rman validate' to verify the files are correctly copied.
DEBUG: do not delete old archived WAL files
DEBUG: do not delete old backup files

It seems it's succeed, But why? before I execute the command it still failed

@MasonXon
Copy link
Author

Now I'm so confused. but pg_rman is normal now.

@mikecaat
Copy link
Contributor

Thanks for sharing the result.

Now I'm so confused. but pg_rman is normal now.

Though I can't say what happened exactly, #104 said that timing could cause the problem. If the archived status which pg_rman will check is deleted by CHECKPOINT, it can't know archiving is done or not.

Anyway, please let me know if the problem happen.

Thanks,

@MasonXon
Copy link
Author

Okay,thanks

@mikecaat
Copy link
Contributor

I'll close this issue. But please feel free to reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants