Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaning nodestore_node table #1808

Open
aminvakil opened this issue Nov 14, 2022 · 15 comments
Open

Cleaning nodestore_node table #1808

aminvakil opened this issue Nov 14, 2022 · 15 comments
Assignees

Comments

@aminvakil
Copy link
Collaborator

aminvakil commented Nov 14, 2022

Problem Statement

Over time nodestore_node table gets bigger and bigger and currently there is no procedure to clean it up.

A forum comment explaining what nodestore_node is by @untitaker :

To answer the question about purpose of nodestore:

nodestore stores the raw, unindexed parts of the data of an event. For example all event breadcrumbs and contexts are stored there. Clickhouse only stores the data you can search and aggregate by. It only has indices, so to speak.

nodestore has multiple backends. On sentry.io it is not postgres, but google bigtable. We use the postgres backend for local development and for onpremise, but for large-scale installations it is suboptimal.

We don’t have a solution right now for the storage problem in any case, sorry. One thing you should ensure actually runs is sentry cleanup, but it’s definitely possible that nodestore takes this amount of disk space.

https://forum.sentry.io/t/postgres-nodestore-node-table-124gb/12753/3

Solution Brainstorm

There was an idea suggested on forum which worked for me, but I lost all event details.
Something like this would work:

DELETE FROM public.nodestore_node WHERE “timestamp” < NOW() - INTERVAL '1 day';
VACUUM FULL public.nodestore_node;

Change 1 day according to your needs.

Maybe put this in a cron container which gets run every night, we should think about its performance issues though, this took a long time to get executed on our instance, maybe because it wasn't run before, but I'm not sure.

@aminvakil
Copy link
Collaborator Author

cc @BYK @chadwhitacre @hubertdeng123

@BYK
Copy link
Member

BYK commented Nov 14, 2022

There are a few interesting and important things:

  1. We can never add the VACUUM FULL public.nodestore_node; part in any kind of production setup as it will lock the whole table for an indefinite time. It also requires additional space on disk so it has a high potential to put the system in an inoperable and hard-to-recover state.
  2. The sentry cleanup command that we trigger via the cron does run nodestore.cleanup():
    https://github.com/getsentry/sentry/blob/1d24b160a74e10759d20fa323f5251bffee3f7b5/src/sentry/runner/commands/cleanup.py#L260-L268 which is defined at https://github.com/getsentry/sentry/blob/1d24b160a74e10759d20fa323f5251bffee3f7b5/src/sentry/nodestore/django/backend.py#L54-L62

So technically, this should work, if you set SENTRY_RETENTION_DAYS to 90 or less which is the default. The only missing thing I see here is an explicit VACUUM call but autovacuum should handle that already: https://www.postgresql.org/docs/12/routine-vacuuming.html#AUTOVACUUM

Now we need to figure out which part of the puzzle is broken:

  1. Do we have the nodestore.cleanup() call successfully?
  2. If so, does it actually run the DELETE commands?
  3. If so, does the autovacuum thingy work in the Postgres container or do we need to enable that
  4. Even then, should we issue a manual VACUUM command after a cleanup operation?

@untitaker
Copy link
Member

I have nothing to add to what BYK said. This should work out of the box assuming you run sentry cleanup

@chadwhitacre
Copy link
Member

I lost all event details.

Did you lose all event details because you ran the command without substituting a real value for SENTRY_RETENTION_DAYS, is that what you mean? Does Postgres silently accept INTERVAL ‘SENTRY_RETENTION_DAYS’ as equivalent to zero, something like that?

@github-actions
Copy link

github-actions bot commented Dec 7, 2022

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@aminvakil
Copy link
Collaborator Author

I lost all event details.

Did you lose all event details because you ran the command without substituting a real value for SENTRY_RETENTION_DAYS, is that what you mean? Does Postgres silently accept INTERVAL ‘SENTRY_RETENTION_DAYS’ as equivalent to zero, something like that?

Sorry for late response, No, I ran this exact same command: DELETE FROM public.nodestore_node WHERE "timestamp" < NOW() - INTERVAL '1 day';

@github-actions
Copy link

github-actions bot commented Jan 8, 2023

This issue has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you label it Status: Backlog or Status: In Progress, I will leave it alone ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@mrmachine
Copy link

We hit this problem as well. nodestore_node was 42GB and the table had never been vacuumed.

We ran the sentry-cleanup service which did it's thing, but did not reduce disk space. We ran vacuum analyze which did not reduce disk space. We ran vacuum full nodestore_node and that did reduce disk space to 3GB. Postgres also then reported that vacuum had been executed one time.

postgres=# select relname, vacuum_count, n_tup_del, n_live_tup, n_dead_tup 
from pg_stat_sys_tables 
where relname='pg_toast_20247';
    relname     | vacuum_count | n_tup_del | n_live_tup | n_dead_tup 
----------------+--------------+-----------+------------+------------
 pg_toast_20247 |            1 |  62889277 |    1840323 |          0
(1 row)

So it seems that (3) is true. The autovacuum thingy inside the postgres container is not working.

However, this seems to indicate that it is in fact enabled:

postgres=# SHOW autovacuum;
 autovacuum 
------------
 on
(1 row)

Someone hinted on a forum that if Postgres is under load it may not get a chance to autovacuum. I don't think our installation is especially high load, but if it is the case that Postgres never runs autovacuum under nominal load then maybe sentry self-hosted does need a way to explicitly trigger vacuum on some kind of schedule.

While it is true that vacuum locks the table and uses some disk space, and temporarily doubling our 42GB table during a vacuum might have causes some issues, it only took a couple of minutes to complete the vacuum and it significantly dropped the disk usage.

If vacuum was run regularly, the size should also hover at a much lower level (e.g. 3GB instead of 42GB) and should not get large enough to cause significant issues with disk usage during the vacuum.

We would be happy to accept a periodic 2 minute complete sentry outage to avoid unbounded disk usage, but it seems that sentry did not suffer a complete outage during the vacuum anyway. We're also willing to accept a partial outage or some amount of events that are missing or missing data.

We could also schedule vacuum to occur during a specified maintenance window to further reduce the impact to production systems.

I also tried the pg_repack option mentioned in the docks, but the current command did not run at all (failed to install pg_repack inside the container) and an older version of the command I found in a GitHub issue that matched our version of postgres also failed to install inside the container.

So I think a setting to schedule a vacuum full nodestore_node in postgres is the best way forward. It could be disabled by default, but should be easy to enable via setting that allows a specific schedule to be set.

@mrmachine
Copy link

Reading more about vacuum and vacuum full. The latter rewrites the whole table and returns disk space back to the operating system. The former frees up deleted rows for re-use but does not return disk space back to the operating system. So if the sentry cleanup could run vacuum immediately after deleting rows from nodestore_node, and the cleanup frequency was high enough (or configurable) for the volume of new writes vs deletions of old data, then there should be no more unbounded growth without requiring a full table lock or any downtime. This should be easier to implement and more reliable than trying to configure the autovacuum daemon to do it for us?

https://www.postgresql.org/docs/12/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY

@chadwhitacre
Copy link
Member

Thanks for the deep dive, @mrmachine.

@aminvakil
Copy link
Collaborator Author

In our environment, we have not yet upgraded to 23.4.0 as disk space is almost 80% full and cannot upgrade our postgres.

Therefore we have not yet tested if upgrading to 14.5 would have an effect on this matter.

@lcd1232
Copy link

lcd1232 commented Aug 13, 2024

It's possible to free up space without downtime using pg_repack

@klboke
Copy link
Contributor

klboke commented Aug 16, 2024

  • We also encountered this issue, but it was more severe for us. Our nodestore_node table expanded to 3.6TB, so we had to focus on and resolve this problem. We tried to pull a new instance from backup data to verify how long a VACUUM FULL nodestore_node would take and found that it would take at least 8 hours. This means that if we perform this operation in production, Sentry would be down for 8 hours, which is unacceptable.

  • However, we clearly understand that nodestore_node is just a KV (key-value) abstract storage, using id as the key, to store or query event JSONs, each JSON being approximately 6~70 KB. Of course, the data stored in nodestore_node is already compressed. So I naturally thought of using an object storage product like S3, which can very well solve the storage problem of nodestore_node, using id as the file path, with each event JSON being a file.

  • Now, I have managed to use S3 to store the messages in nodestore_node and have submitted a PR :feat(nodestore): A file system-based node storage backend, with added support for S3 and GCS storage sentry#76250. I hope others facing similar issues can provide some suggestions.

postgresql size
Image

oss files

Image

@aldy505
Copy link
Collaborator

aldy505 commented Jan 5, 2025

UPDATE (good news, this time):

We are planning to move away from PostgreSQL as the nodestore backend to something else, probably between S3-compatible (like Garage, on this PR: #3498), and filesystem based. An internal discussion will be held sometime next week.

@BYK
Copy link
Member

BYK commented Jan 8, 2025

The consensus is towards an filesystem-based backend right now. Since the current FS-backed node store implementation is a bit naive and geared towards debuggability, it will take some more time once we build a more prod-ready node store on top of that.

But fret not as it shouldn't take too long. Also that S3/Garage patch can serve as a blueprint for people willing to experiment.

@BYK BYK self-assigned this Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Status: No status
Development

No branches or pull requests

9 participants