This repository was archived by the owner on Jul 6, 2021. It is now read-only.

Commit aab0b90

Rename "tx wraparound" -> "tx ID wraparound"
See merge request postgres-ai/postgres-checkup!314
2 parents cb8637b + cd8b5b4 commit aab0b90

15 files changed, with 128 additions and 132 deletions

HELP.md

Lines changed: 35 additions & 35 deletions
Original file line numberDiff line numberDiff line change
@@ -7,20 +7,20 @@ This group determines the available resources such as hardware characteristics o
77
General information about operational systems where the observed Postgres master and its replicas operate.
88

99
> Insights:
10-
>
10+
>
1111
> - Hardware and software differences (OS versions, Linux kernel versions, CPU, Memory). If the observed master and its replicas run on different platforms, it might cause issues with binary replication.
12-
>
12+
>
1313
> - Memory settings tuning. (Examples: is swap enabled? Are huge pages used?) Observing the database's memory consumption may lead to recommendations for changes that improve system performance.
14-
>
14+
>
1515
> - Information about virtualization type.
1616
1717

1818
### A002 Postgres Version Information
1919

2020
This report answers the following questions:
21-
- Do all nodes have the same Postgres version?
21+
- Do all nodes have the same Postgres version?
2222
- Is the minor version being used up-to-date? Keeping the minor version of the database up-to-date is recommended to reduce the chance of encountering bugs, performance issues, and security issues.
23-
- Is the major version currently supported by the community?
23+
- Is the major version currently supported by the community?
2424
- Will the major version be supported by the community during the next 12 months?
2525
- If the minor version is not the most recent, are any critical bugfixes released that need to be applied ASAP?
2626

@@ -40,26 +40,26 @@ The following is included:
4040
- The uptime. Sometimes low uptime may indicate an unplanned, accidental restart of the database.
4141
- General information: how many databases are on one instance, what is their size, replication mode, age of statistics.
4242
- Information about replicas, replication modes, replication delays.
43-
- Ratio of forced checkpoints among all checkpoints registered since statistics reset time.
44-
> Insights: Frequent checkpoints in most cases create an excessive load on the disk subsystem. Identifying this fact will allow the more optimal disk utilization.
45-
- How big is the observed database (the cluster may have multiple databases)?
46-
> Insight: if the database is smaller than RAM, there are good chances to avoid intensive disk IO in most operations
47-
- Cache Effectiveness: percentage of buffer pool hits.
48-
> Insight: if it is not more than 95% on all nodes, it might be a good sign that the buffer pool size needs to be increased.
49-
- Successful Commits: percentage of successfully committed transactions.
50-
> Insight: if the value is not more than 99%, it might be a sign of logic issues with application code leading to high rates of ROLLBACK events.
51-
- Temp Files per day: how many temporary files were generated per day in average, since last statistics reset time.
52-
> Insight: if this value is high (thousands), it is a signal that work_mem should be increased.
53-
- Deadlocks per day.
54-
> Insight: significant (dozens) daily number of deadlocks is a sign of issues with application logic that needs redesign.
43+
- Ratio of forced checkpoints among all checkpoints registered since statistics reset time.
44+
> Insights: Frequent checkpoints in most cases create an excessive load on the disk subsystem. Identifying this fact will allow the more optimal disk utilization.
45+
- How big is the observed database (the cluster may have multiple databases)?
46+
> Insight: if the database is smaller than RAM, there are good chances to avoid intensive disk IO in most operations
47+
- Cache Effectiveness: percentage of buffer pool hits.
48+
> Insight: if it is not more than 95% on all nodes, it might be a good sign that the buffer pool size needs to be increased.
49+
- Successful Commits: percentage of successfully committed transactions.
50+
> Insight: if the value is not more than 99%, it might be a sign of logic issues with application code leading to high rates of ROLLBACK events.
51+
- Temp Files per day: how many temporary files were generated per day in average, since last statistics reset time.
52+
> Insight: if this value is high (thousands), it is a signal that work_mem should be increased.
53+
- Deadlocks per day.
54+
> Insight: significant (dozens) daily number of deadlocks is a sign of issues with application logic that needs redesign.
5555
5656
### A005 Extensions
5757

5858
Provides a list of all available and installed (in the current observed database) extensions, with versions. Insight: if there is a newer version of an installed extension, the report will highlight it, meaning that update is needed.
5959

6060
### A006 Postgres Setting Deviations
6161

62-
Helps to check that there are no differences in Postgres configuration on the observed nodes (except `transaction_read_only` and pg_stat_kcache’s `linux_hz`).
62+
Helps to check that there are no differences in Postgres configuration on the observed nodes (except `transaction_read_only` and pg_stat_kcache’s `linux_hz`).
6363

6464
> Insights:
6565
> - In general, any differences in configuration on master and its replicas might lead to issues in case of failover. An example: the master is tuned, while the replicas are not tuned at all or are tuned poorly; in the event of failover, the new master cannot operate properly due to poor tuning.
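
The A004 cache-effectiveness and successful-commits insights earlier in this hunk can be sketched as follows. This is a minimal illustration, assuming the inputs are the `blks_hit`/`blks_read` and `xact_commit`/`xact_rollback` counters from `pg_stat_database`. The helper names `ratio_pct` and `not_more_than` are hypothetical and are not taken from postgres-checkup; how the tool itself computes these figures is not shown in this diff.

```bash
#!/usr/bin/env bash

# Hypothetical helper: percentage of "$1" within "$1 + $2", two decimals.
# Example inputs (assumed): blks_hit and blks_read, or xact_commit and
# xact_rollback, taken from pg_stat_database.
ratio_pct() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    if (a + b == 0) print "0.00"          # no activity recorded yet
    else printf "%.2f\n", 100 * a / (a + b)
  }'
}

# Hypothetical helper: succeeds when value "$1" is not more than threshold "$2",
# matching the "not more than 95%" / "not more than 99%" wording above.
not_more_than() {
  awk -v v="$1" -v t="$2" 'BEGIN { exit !(v <= t) }'
}

# Usage sketch:
#   hit_pct="$(ratio_pct "$blks_hit" "$blks_read")"
#   if not_more_than "$hit_pct" 95; then echo "consider a larger buffer pool"; fi
```

The two thresholds (95% and 99%) are the ones quoted in the insights; everything else here is an assumption made for illustration.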
@@ -72,13 +72,13 @@ There are multiple ways to change database settings globally:
7272
- explicitly, in the configuration file postgresql.conf, and
7373
- implicitly, using 'ALTER SYSTEM' commands.
7474

75-
This report checks if there are settings which were set by implicit (ALTER SYSTEM) way.
75+
This report checks if there are settings which were set by implicit (ALTER SYSTEM) way.
7676

7777
Possible sources of configuration settings (presented in the first column of the report’s table):
7878

7979
* `postgresql.auto.conf`: changed via 'ALTER SYSTEM' command.
8080
* `%any other file pattern%`: changed in additional config included to the main one.
81-
* `postgresql.conf`: non-default values are set in postgresql.conf.
81+
* `postgresql.conf`: non-default values are set in postgresql.conf.
8282

8383
### A008 Disk Usage and File System Type
8484

@@ -132,11 +132,11 @@ Shows global and per-table (if any) autovacuum-related Postgres settings.
132132

133133
> Insights:
134134
> - Is any tuning applied (values are not default)?
135-
> - Are there any custom table autovacuum settings? There are cases when the tables have a custom autovacuum configuration. Tracking such tables will allow you to understand the nature of the functioning of autovacuum workers. Such tables are marked with asterisk (*) in the following reports.
135+
> - Are there any custom table autovacuum settings? There are cases when the tables have a custom autovacuum configuration. Tracking such tables will allow you to understand the nature of the functioning of autovacuum workers. Such tables are marked with asterisk (\*) in the following reports.
136136
137-
### F002 Autovacuum: Transaction Wraparound Check
137+
### F002 Autovacuum: Transaction ID Wraparound Check
138138

139-
Shows a distance in % to transaction wraparound disaster for every database.
139+
Shows a distance in % to transaction ID wraparound disaster for every database.
140140

141141
> Insights:
142142
> If % is higher than 50%, autovacuum tuning should be considered as soon as possible.
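
As a rough illustration of the "distance in %" that F002 reports, the sketch below converts a transaction ID age into a percentage. It assumes the age comes from `age(datfrozenxid)` in `pg_database` and that 100% corresponds to 2^31 transaction IDs; the exact formula used by the report is not shown in this diff, and the function names are hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of postgres-checkup): express a transaction ID
# age as a percentage of an assumed 2^31 limit, rounded down.
xid_wraparound_pct() {
  local age="$1"
  local limit=2147483648   # 2^31; assumed denominator
  echo $(( age * 100 / limit ))
}

# Hypothetical helper: apply the 50% threshold quoted in the insight above.
xid_wraparound_status() {
  local pct
  pct="$(xid_wraparound_pct "$1")"
  if (( pct > 50 )); then
    echo "tune autovacuum"
  else
    echo "ok"
  fi
}

# Usage sketch (the age would come from a query such as
#   select datname, age(datfrozenxid) from pg_database;):
#   xid_wraparound_status 1500000000
```

Integer arithmetic keeps the sketch dependency-free; a real report would likely show fractional percentages.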
@@ -160,14 +160,14 @@ Estimated table and index bloat is presented in this report.
160160
> - Objects with a high percentage of bloat lead to wasted disk space, degradation in query performance, additional CPU costs, and excessive read load on the disk.
161161
> This report is based on estimations. The errors in bloat estimates may be significant (in some cases, up to 15% and even more). Use it only as an indicator of potential issues.
162162
> - Checks the following things:
163-
> - Extreme (>90%) level of heap or index bloat estimated.
164-
> - Significant (>40%) level of heap or index bloat estimated.
163+
> - Extreme (>90%) level of heap or index bloat estimated.
164+
> - Significant (>40%) level of heap or index bloat estimated.
165165
166166
### F008 Autovacuum: Resource Usage
167167

168-
Shows a table with Postgres settings related to autovacuum resource usage.
168+
Shows a table with Postgres settings related to autovacuum resource usage.
169169

170-
> Insights:
170+
> Insights:
171171
> - Is `autovacuum_max_workers` not default? (When CPU cores or vCPUs >= 10).
172172
> - Is `autovacuum_vacuum_cost_limit` / `vacuum_cost_limit` not default?
173173
> - Isn't `maintenance_work_mem` / `autovacuum_work_mem` too low compared to table sizes and RAM?
@@ -201,7 +201,7 @@ A detailed snapshot report of all connections, grouped by users, databases and s
201201
Provides information about how "timeout" and locking-related settings are tuned, shows deadlocks counter for every database since statistics reset.
202202

203203
> Insights:
204-
> - Questions worth answering:
204+
> - Questions worth answering:
205205
> - Is `statement_timeout` > 0 and <= 30 seconds (good choice for an OLTP system)?
206206
> - Is `idle_in_transaction_session_timeout` >0 and < 20 minutes (preventing autovacuum and locking issues)?
207207
> - Is `max_locks_per_transaction` not default (for example, low value may interrupt pg_dump)?
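
The two timeout questions above can be expressed as simple range checks. The sketch below assumes both settings are given in milliseconds (the unit Postgres reports for them in `pg_settings`); the function names are hypothetical and do not come from postgres-checkup.

```bash
#!/usr/bin/env bash

# Hypothetical check: statement_timeout > 0 and <= 30 seconds.
statement_timeout_ok() {
  local ms="$1"
  (( ms > 0 && ms <= 30000 ))
}

# Hypothetical check: idle_in_transaction_session_timeout > 0 and < 20 minutes.
idle_in_tx_timeout_ok() {
  local ms="$1"
  (( ms > 0 && ms < 1200000 ))
}

# Usage sketch:
#   if ! statement_timeout_ok 0; then echo "statement_timeout is disabled"; fi
```

A value of 0 disables either timeout in Postgres, which is why both checks require a strictly positive value.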
@@ -218,14 +218,14 @@ Shows the list of never used, rarely used and redundant indexes.
218218
Helps to understand how much space they occupy.
219219

220220
> Insights:
221-
> - Questions worth answering:
221+
> - Questions worth answering:
222222
> - Is the total size of unused indexes less than 10% of the DB size (only if statistics is older than 1 week)?
223223
> - Is statistics saved across restarts?
224224
> - If statistics age is low, the report should be used with caution.
225225
226226
### H003 Non-indexed Foreign Keys
227227

228-
Checks if all foreign keys have indexes in referencing tables.
228+
Checks if all foreign keys have indexes in referencing tables.
229229

230230
# K. SQL Query Analysis
231231

@@ -247,7 +247,7 @@ The grouping is based on the first word of every query.
247247

248248
One of the most comprehensive and deep reports. Shows Top query groups
249249
ordered by total execution time during the observation period (`total_time` in
250-
pg_stat_statements). Good start for query optimization.
250+
pg_stat_statements). Good start for query optimization.
251251

252252
> Insights:
253253
> - The first question to answer: Are there any query groups with `total_time` ratio >50% of overall `total_time`? If we have this type of query, it is definitely worth optimizing it.
@@ -262,11 +262,11 @@ face of a growing amount of data.
262262

263263
### L001 Table Sizes
264264

265-
Displays the size of tables and their components (indexes, TOAST, the table itself).
265+
Displays the size of tables and their components (indexes, TOAST, the table itself).
266266

267-
> - Questions worth answering:
268-
> - Does the size of indexes for each table not exceed heap (with toast) size?
269-
> - Are there any non-indexes tables which size is > 10 MiB?
267+
> - Questions worth answering:
268+
> - Does the size of indexes for each table not exceed heap (with toast) size?
269+
> - Are there any non-indexes tables which size is > 10 MiB?
270270
> - Are there any non-partitioned tables of size > 100 GiB?
271271
272272
### L003 Integer (int2, int4) Out-of-range Risks in PKs

LICENSE

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ modification, are permitted provided that the following conditions are met:
1515
* Neither the name of the copyright holder nor the names of its
1616
contributors may be used to endorse or promote products derived from
1717
this software without specific prior written permission.
18-
18+
1919
* Redistributions of any form whatsoever and integration into third-party
2020
products (including but not limited to cloud services) must retain the
2121
following acknowledgment in the documentation and copyright notices:

README.md

Lines changed: 64 additions & 64 deletions
@@ -162,7 +162,8 @@ Which literally means: connect to the server with given credentials, save data i
162162
project directory, as epoch of check `1`. Epoch is a numerical (**integer**) sign of current iteration.
163163
For example: in half a year we can switch to "epoch number `2`".
164164

165-
`-h db2.vpn.local` means: try to connect to host via SSH and then use remote `psql` command to perform checks.
165+
`-h db2.vpn.local` means: try to connect to host via SSH and then use remote `psql` command to perform checks.
166+
166167
If SSH is not available the local 'psql' will be used (non-psql reports will be skipped).
167168

168169
For comprehensive analysis, it is recommended to run the tool on the master and
@@ -187,7 +188,7 @@ for host in db2.vpn.local db3.vpn.local db4.vpn.local; do
187188
-e 1 \
188189
--file resources/checks/K000_query_analysis.sh # the first snapshot is needed only for reports K***
189190
done
190-
191+
191192
sleep "$DISTANCE"
192193

193194
for host in db2.vpn.local db3.vpn.local db4.vpn.local; do
@@ -244,15 +245,14 @@ We need to know a hostname or an ip address of target database to be used with `
244245
PG_HOST=$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' postgres)
245246
```
246247

247-
You can use official images or build an image yourself.
248-
Run this command to build an image:
248+
You can use official images or build an image yourself. Run this command to build an image:
249249

250250
```bash
251251
docker build -t postgres-checkup .
252252
```
253253

254-
Then run a container with `postgres-checkup`.
255-
This command run the tool using Postgres connection only (without SSH):
254+
Then run a container with `postgres-checkup`. This command run the tool using
255+
Postgres connection only (without SSH):
256256

257257
```bash
258258
docker run --rm \
@@ -323,89 +323,89 @@ Docker support implemented by [Ivan Muratov](https://gitlab.com/binakot).
323323

324324
## A. General / Infrastructural
325325

326-
- [x] A001 System, CPU, RAM, disks, virtualization #6 , #56 , #57 , #86
326+
- [x] A001 System, CPU, RAM, disks, virtualization #6 , #56 , #57 , #86
327327
- [x] A002 PostgreSQL versions (Simple) #68, #21, #86
328-
- [x] A003 Collect pg_settings #15, #167, #86
329-
- [x] A004 General cluster info #7, #58, #59, #86, #162
330-
- [x] A005 Extensions #8, #60, #61, #86, #167
331-
- [x] A006 Config diff #9, #62, #63, #86
332-
- [x] A007 ALTER SYSTEM vs postgresql.conf #18, #86
333-
- [x] A008 Disk usage and file system type #19, #20
334-
- [ ] A010 Data checksums, wal_log_hints #22
335-
- [ ] A011 Connection pooling. pgbouncer #23
336-
- [ ] A012 Anti-crash checks #177
337-
338-
## B. Backups and DR
339-
340-
- [ ] B001 SLO/SLA, RPO, RTO #24
341-
- [ ] B002 File system, mount flags #25
342-
- [ ] B003 Full backups / incremental #26
343-
- [ ] B004 WAL archiving (GB/day?) - #27
344-
- [ ] B005 Restore checks, monitoring, alerting #28
328+
- [x] A003 Collect pg_settings #15, #167, #86
329+
- [x] A004 General cluster info #7, #58, #59, #86, #162
330+
- [x] A005 Extensions #8, #60, #61, #86, #167
331+
- [x] A006 Config diff #9, #62, #63, #86
332+
- [x] A007 ALTER SYSTEM vs postgresql.conf #18, #86
333+
- [x] A008 Disk usage and file system type #19, #20
334+
- [ ] A010 Data checksums, wal_log_hints #22
335+
- [ ] A011 Connection pooling. pgbouncer #23
336+
- [ ] A012 Anti-crash checks #177
337+
338+
## B. Backups and DR
339+
340+
- [ ] B001 SLO/SLA, RPO, RTO #24
341+
- [ ] B002 File system, mount flags #25
342+
- [ ] B003 Full backups / incremental #26
343+
- [ ] B004 WAL archiving (GB/day?) - #27
344+
- [ ] B005 Restore checks, monitoring, alerting #28
345345

346346
## C. Replication and HA
347347

348-
- [ ] C001 SLO/SLA #29
349-
- [ ] C002 Sync/async, Streaming / wal transfer; logical decoding #30
350-
- [ ] C003 SPOFs; “-1 datacenter”, standby with traffic #31
351-
- [ ] C004 Failover #32
352-
- [ ] C005 Switchover #33
353-
- [ ] C006 Delayed replica (replay of 1 day of WALs) - #34
348+
- [ ] C001 SLO/SLA #29
349+
- [ ] C002 Sync/async, Streaming / wal transfer; logical decoding #30
350+
- [ ] C003 SPOFs; “-1 datacenter”, standby with traffic #31
351+
- [ ] C004 Failover #32
352+
- [ ] C005 Switchover #33
353+
- [ ] C006 Delayed replica (replay of 1 day of WALs) - #34
354354

355-
## D. Monitoring / Troubleshooting
355+
## D. Monitoring / Troubleshooting
356356

357-
- [ ] D001 Logging (syslog?), log_*** #35
358-
- [x] D002 Useful Linux tools #36
359-
- [ ] D003 List of monitoring metrics #37
360-
- [x] D004 pg_stat_statements, tuning opts, pg_stat_kcache #38
361-
- [ ] D005 track_io_timing, …, auto_explain #39
362-
- [ ] D006 Recommended DBA toolsets: postgres_dba, pgCenter, pgHeroother #40
363-
- [ ] D007 Postgres-specific tools for troubleshooting #137
357+
- [ ] D001 Logging (syslog?), log_*** #35
358+
- [x] D002 Useful Linux tools #36
359+
- [ ] D003 List of monitoring metrics #37
360+
- [x] D004 pg_stat_statements, tuning opts, pg_stat_kcache #38
361+
- [ ] D005 track_io_timing, …, auto_explain #39
362+
- [ ] D006 Recommended DBA toolsets: postgres_dba, pgCenter, pgHeroother #40
363+
- [ ] D007 Postgres-specific tools for troubleshooting #137
364364

365365
## E. WAL, Checkpoints
366366

367-
- [ ] E001 WAL/checkpoint settings, IO #41
368-
- [ ] E002 Checkpoints, bgwriter, IO #42
367+
- [ ] E001 WAL/checkpoint settings, IO #41
368+
- [ ] E002 Checkpoints, bgwriter, IO #42
369369

370370
## F. Autovacuum, Bloat
371371

372-
- [x] F001 < F003 Current autovacuum-related settings #108, #164
373-
- [x] F002 < F007 Transaction wraparound check #16, #171
374-
- [x] F003 < F006 Dead tuples #164
375-
- [x] F004 < F001 Heap bloat estimation #87, #122
376-
- [x] F005 < F002 Index bloat estimation #88
377-
- [ ] F006 < F004 Precise heap bloat analysis
378-
- [ ] F007 < F005 Precise index bloat analysis
379-
- [x] F008 < F008 Resource usage (CPU, Memory, disk IO) #44
372+
- [x] F001 < F003 Current autovacuum-related settings #108, #164
373+
- [x] F002 < F007 Transaction ID wraparound check #16, #171
374+
- [x] F003 < F006 Dead tuples #164
375+
- [x] F004 < F001 Heap bloat estimation #87, #122
376+
- [x] F005 < F002 Index bloat estimation #88
377+
- [ ] F006 < F004 Precise heap bloat analysis
378+
- [ ] F007 < F005 Precise index bloat analysis
379+
- [x] F008 < F008 Resource usage (CPU, Memory, disk IO) #44
380380

381-
## G. Performance / Connections / Memory-related Settings
381+
## G. Performance / Connections / Memory-related Settings
382382

383-
- [x] G001 Memory-related settings #45, #190
384-
- [x] G002 Connections #46
385-
- [x] G003 Timeouts, locks, deadlocks (amount) #47
386-
- [ ] G004 Query planner (diff) #48
387-
- [ ] G005 I/O settings #49
388-
- [ ] G006 Default_statistics_target (plus per table?) #50
383+
- [x] G001 Memory-related settings #45, #190
384+
- [x] G002 Connections #46
385+
- [x] G003 Timeouts, locks, deadlocks (amount) #47
386+
- [ ] G004 Query planner (diff) #48
387+
- [ ] G005 I/O settings #49
388+
- [ ] G006 Default_statistics_target (plus per table?) #50
389389

390390
## H. Index Analysis
391391

392-
- [x] H001 Indexes: invalid #192, #51
393-
- [x] H002 Unused and redundant indexes #51, #180, #170, #168, #322
394-
- [x] H003 Missing FK indexes #52, #142, #173
392+
- [x] H001 Indexes: invalid #192, #51
393+
- [x] H002 Unused and redundant indexes #51, #180, #170, #168, #322
394+
- [x] H003 Missing FK indexes #52, #142, #173
395395

396396
## J. Capacity Planning
397397

398-
- [ ] J001 Capacity planning - #54
398+
- [ ] J001 Capacity planning - #54
399399

400400
## K. SQL query Analysis
401401

402-
- [x] K001 Globally aggregated query metrics #158, #178, #182, #184
403-
- [x] K002 Workload type ("first word" analysis) #159, #178, #179, #182, #184
402+
- [x] K001 Globally aggregated query metrics #158, #178, #182, #184
403+
- [x] K002 Workload type ("first word" analysis) #159, #178, #179, #182, #184
404404
- [x] K003 Top queries by total_time #160, #172, #174, #178, #179, #182, #184, #193
405405

406406
## L. DB Schema Analysis
407-
- [x] L001 (was: H003) Current sizes of DB objects (tables, indexes, mat. views) #163
408-
- [ ] L002 (was: H004) Data types being used #53
407+
- [x] L001 (was: H003) Current sizes of DB objects (tables, indexes, mat. views) #163
408+
- [ ] L002 (was: H004) Data types being used #53
409409
- [x] L003 Integer (int2, int4) out-of-range risks in PKs // calculate capacity remained; optional: predict when capacity will be fully used) https://gitlab.com/postgres-ai-team/postgres-checkup/issues/237
410410

411411
## TODO:
@@ -414,7 +414,7 @@ Docker support implemented by [Ivan Muratov](https://gitlab.com/binakot).
414414

415415
---
416416

417-
# Ideas :bulb: :bulb: :bulb: :thinking\_face:
417+
# Ideas :bulb: :bulb: :bulb: :thinking\_face:
418418

419419
- analyze all FKs and check if data types of referencing column and referenced one match (same thing for multi-column FKs)
420420
- tables w/o PKs? tables not having even unique index?

checkup

Lines changed: 1 addition & 1 deletion
@@ -1110,7 +1110,7 @@ run_checks() {
11101110
done
11111111

11121112
msg
1113-
msg "All checks has been finished for host '$HOST'!"
1113+
msg "All checks have been finished for host '$HOST'!"
11141114

11151115
# print stacks with failed reports
11161116
if ! [[ -z "${check_failed_json_stack}" ]]; then

pghrep/LICENSE

Whitespace-only changes.
