Cleaning nodestore_node table #1808
Comments
There are a few interesting and important things:
So technically, this should work, if you set Now we need to figure out which part of the puzzle is broken:
I have nothing to add to what BYK said. This should work out of the box assuming you run |
Did you lose all event details because you ran the command without substituting a real value for |
This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you label it "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀 |
Sorry for the late response. No, I ran this exact same command: |
We hit this problem as well. We ran the
So it seems that (3) is true. The autovacuum thingy inside the postgres container is not working. However, this seems to indicate that it is in fact enabled:
Someone hinted on a forum that if Postgres is under load it may not get a chance to autovacuum. I don't think our installation is especially high load, but if Postgres never runs autovacuum under nominal load, then maybe Sentry self-hosted does need a way to explicitly trigger vacuum on some kind of schedule.

It is true that vacuum locks the table and uses some disk space, and temporarily doubling our 42GB table during a vacuum might have caused some issues. However, it only took a couple of minutes to complete the vacuum, and it significantly dropped the disk usage. If vacuum were run regularly, the size should also hover at a much lower level (e.g. 3GB instead of 42GB) and should not get large enough to cause significant issues with disk usage during the vacuum.

We would be happy to accept a periodic 2 minute complete Sentry outage to avoid unbounded disk usage, but it seems that Sentry did not suffer a complete outage during the vacuum anyway. We're also willing to accept a partial outage, or some events that are missing or have missing data. We could also schedule vacuum to occur during a specified maintenance window to further reduce the impact on production systems.

I also tried the pg_repack option mentioned in the docs, but the current command did not run at all (it failed to install pg_repack inside the container). An older version of the command, which I found in a GitHub issue and which matched our version of Postgres, also failed to install inside the container. So I think a setting to schedule a |
I read more about VACUUM and VACUUM FULL. The latter rewrites the whole table and returns disk space to the operating system. The former frees the space held by deleted rows for re-use within the table, but does not return that space to the operating system. So if the Sentry cleanup ran VACUUM immediately after deleting rows from nodestore_node, and the cleanup frequency was high enough (or configurable) for the volume of new writes versus deletions of old data, there should be no more unbounded growth, without requiring a full table lock or any downtime. This should be easier to implement and more reliable than trying to configure the autovacuum daemon to do it for us. https://www.postgresql.org/docs/12/routine-vacuuming.html#VACUUM-FOR-SPACE-RECOVERY |
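To make the reasoning above concrete, here is a toy model (not a measurement, and not how PostgreSQL accounts for pages internally) of a table that receives a fixed amount of new data per day and has rows older than a retention window deleted daily. The function name, its parameters and the numbers are all made up for illustration. It only captures the one property discussed above: space freed by a plain VACUUM can be reused by later writes, while space held by deleted but unvacuumed rows cannot.

```python
def simulate_table_size(days, daily_gb, retention_days, vacuum_after_cleanup):
    """Return the on-disk size in GB after each day for a toy table.

    Hypothetical model: every day `daily_gb` is written and anything older
    than `retention_days` is deleted. If `vacuum_after_cleanup` is true, the
    space of deleted rows becomes reusable (plain VACUUM); otherwise it stays
    occupied by dead rows and the file can only grow.
    """
    live = []          # GB written on each day that is still retained
    free = 0.0         # reusable space inside the file (reclaimed by VACUUM)
    dead = 0.0         # space held by deleted rows that were never vacuumed
    file_size = 0.0    # what the operating system sees
    sizes = []
    for _ in range(days):
        # New writes fill reusable space first and extend the file otherwise.
        reused = min(free, daily_gb)
        free -= reused
        file_size += daily_gb - reused
        live.append(daily_gb)
        # Cleanup: delete everything older than the retention window.
        while len(live) > retention_days:
            dead += live.pop(0)
        # Plain VACUUM: dead space becomes reusable, the file does not shrink.
        if vacuum_after_cleanup:
            free += dead
            dead = 0.0
        sizes.append(file_size)
    return sizes


if __name__ == "__main__":
    with_vacuum = simulate_table_size(30, 1.0, 3, vacuum_after_cleanup=True)
    without_vacuum = simulate_table_size(30, 1.0, 3, vacuum_after_cleanup=False)
    print("with vacuum:", with_vacuum[-1], "GB")
    print("without vacuum:", without_vacuum[-1], "GB")
```

In this model the file plateaus at roughly one day of writes more than the retained data when vacuum runs after every cleanup, and grows without bound otherwise. It also shows why a table that has already grown large stays large under plain VACUUM: the file size never decreases here, which is where VACUUM FULL or pg_repack come in.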
Thanks for the deep dive, @mrmachine. |
In our environment, we have not yet upgraded to 23.4.0 because disk space is almost 80% full and we cannot upgrade our Postgres. Therefore we have not yet tested whether upgrading to 14.5 would have an effect on this matter. |
It's possible to free up space without downtime using pg_repack |
oss files |
UPDATE (good news, this time): We are planning to move away from PostgreSQL as the nodestore backend to something else, probably either an S3-compatible store (like Garage, see this PR: #3498) or a filesystem-based one. An internal discussion will be held sometime next week. |
The consensus is towards a filesystem-based backend right now. Since the current FS-backed node store implementation is a bit naive and geared towards debuggability, it will take some more time while we build a more prod-ready node store on top of that. But fret not, as it shouldn't take too long. Also, that S3/Garage patch can serve as a blueprint for people willing to experiment. |
Problem Statement
Over time the nodestore_node table gets bigger and bigger, and currently there is no procedure to clean it up.
A forum comment by @untitaker explaining what nodestore_node is: https://forum.sentry.io/t/postgres-nodestore-node-table-124gb/12753/3
Solution Brainstorm
There was an idea suggested on the forum which worked for me, but I lost all event details.
Something like this would work:
Change 1 day according to your needs.
Maybe put this in a cron container that runs every night. We should think about its performance, though: it took a long time to execute on our instance, maybe because it had never been run before, but I'm not sure.
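The command from the forum is not preserved above, so the following is only a sketch of what a nightly job along these lines could look like, not the original suggestion. It assumes that nodestore_node has a column named timestamp recording when each row was written. The helper names build_cleanup_statements and run_cleanup are hypothetical, and the connection is any Python DB-API connection that is already in autocommit mode (VACUUM cannot run inside a transaction block; with psycopg2 that would mean setting connection.autocommit = True).

```python
def build_cleanup_statements(retention_days, table="nodestore_node",
                             ts_column="timestamp"):
    """Build the SQL for one cleanup pass.

    Assumption: `ts_column` exists on `table` and holds the write time.
    Rows older than `retention_days` are deleted, then a plain VACUUM
    marks their space as reusable without taking an exclusive lock.
    """
    if not isinstance(retention_days, int) or retention_days < 1:
        raise ValueError("retention_days must be a positive integer")
    return [
        f'DELETE FROM {table} WHERE "{ts_column}" < NOW() - '
        f"INTERVAL '{retention_days} days'",
        f"VACUUM {table}",
    ]


def run_cleanup(connection, retention_days):
    """Execute the cleanup on a DB-API connection in autocommit mode."""
    cursor = connection.cursor()
    try:
        for statement in build_cleanup_statements(retention_days):
            cursor.execute(statement)
    finally:
        cursor.close()


if __name__ == "__main__":
    # Print the statements instead of running them, as a dry run.
    for statement in build_cleanup_statements(90):
        print(statement + ";")
```

Deleting rows from this table removes the stored event payloads for that period, which matches the "lost all event details" experience described above, so the retention value should not be shorter than the event retention you actually want.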