From f317028f42c658b2fafea80a332196e86ca87c84 Mon Sep 17 00:00:00 2001
From: Sage Weil
Date: Mon, 27 Feb 2012 15:41:57 -0800
Subject: [PATCH] doc: beginnings of documentation of stuck pgs and pg states

Signed-off-by: Josh Durgin
Reviewed-by: Sage Weil
---
 doc/control.rst             | 38 +++++++++++++++++++---
 doc/dev/placement-group.rst | 64 ++++++++++++++++++++++++++++++++++---
 2 files changed, 92 insertions(+), 10 deletions(-)

diff --git a/doc/control.rst b/doc/control.rst
index 8020333a806d0..a0863fd11f151 100644
--- a/doc/control.rst
+++ b/doc/control.rst
@@ -54,6 +54,39 @@ Add auth keyring for an osd. ::
 
 Show auth key OSD subsystem.
 
+PG subsystem
+------------
+::
+
+	$ ceph -- pg dump [--format <format>]
+
+Output the stats of all PGs. Valid formats are "plain" and "json",
+and plain is the default. ::
+
+	$ ceph -- pg dump_stuck inactive|unclean|stale [--format <format>] [-t|--threshold <seconds>]
+
+Output the stats of all PGs stuck in the specified state.
+
+``--format`` may be ``plain`` (default) or ``json``.
+
+``--threshold`` defines how many seconds "stuck" is (default: 300).
+
+**Inactive** PGs cannot process reads or writes because they are waiting
+for an OSD with the most up-to-date data to come back.
+
+**Unclean** PGs contain objects that are not replicated the desired number
+of times. They should be recovering.
+
+**Stale** PGs are in an unknown state - the OSDs that host them have not
+reported to the monitor cluster in a while (configured by
+``mon_osd_report_timeout``). ::
+
+	$ ceph pg <pgid> mark_unfound_lost revert
+
+Revert "lost" objects to their prior state: either roll back to a
+previous version or delete them if they were just created.
+
+
 OSD subsystem
 -------------
 ::
@@ -108,11 +141,6 @@ Create a cluster snapshot. ::
 Mark an OSD as lost. This may result in permanent data loss. Use with
 caution. ::
 
-	$ ceph pg <pgid> mark_unfound_lost revert
-
-Revert "lost" objects to their prior state, either a previous version
-or delete them if they were just created. ::
-
 	$ ceph osd create [<id>]
 
 Create a new OSD. If no ID is given, a new ID is automatically selected
diff --git a/doc/dev/placement-group.rst b/doc/dev/placement-group.rst
index 5755277bcc7a3..a5abb2b5755c9 100644
--- a/doc/dev/placement-group.rst
+++ b/doc/dev/placement-group.rst
@@ -81,10 +81,64 @@ consistent hashing; you can think of it as::
       result.append(chosen)
   return result
 
+User-visible PG States
+======================
 
-PG status refreshes only when pg mapping changes
-================================================
+.. todo:: diagram of states and how they can overlap
+
+*creating*
+  the PG is still being created
+
+*active*
+  requests to the PG will be processed
+
+*clean*
+  all objects in the PG are replicated the correct number of times
+
+*down*
+  a replica with necessary data is down, so the PG is offline
+
+*replay*
+  the PG is waiting for clients to replay operations after an OSD crashed
+
+*splitting*
+  the PG is being split into multiple PGs (not functional as of 2012-02)
+
+*scrubbing*
+  the PG is being checked for inconsistencies
+
+*degraded*
+  some objects in the PG are not replicated enough times yet
+
+*inconsistent*
+  replicas of the PG are not consistent (e.g. objects are
+  the wrong size, objects are missing from one replica *after* recovery
+  finished, etc.)
+
+*peering*
+  the PG is undergoing the :doc:`/dev/peering` process
+
+*repair*
+  the PG is being checked and any inconsistencies found will be repaired (if possible)
+
+*recovering*
+  objects are being migrated/synchronized with replicas
+
+*backfill*
+  a special case of recovery, in which the entire contents of
+  the PG are scanned and synchronized, instead of inferring what
+  needs to be transferred from the PG logs of recent operations
+
+*incomplete*
+  a PG is missing a necessary period of history from its
+  log. If you see this state, report a bug, and try to start any
+  failed OSDs that may contain the needed information.
+
+*stale*
+  the PG is in an unknown state - the monitors have not received
+  an update for it since the PG mapping changed.
+
+*remapped*
+  the PG is temporarily mapped to a different set of OSDs from what
+  CRUSH specified
 
-The pg status currently doesn't get refreshed when the actual pg
-mapping doesn't change, and e.g. a pool size change of 2->1 won't do
-that. It will refresh if you restart the OSDs, though.
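
The three ``dump_stuck`` categories documented above follow directly from
the state names: a PG counts as *inactive* when its reported state lacks
``active``, *unclean* when it lacks ``clean``, and *stale* when ``stale``
is present. Ceph reports a PG's state as a single ``+``-joined string such
as ``active+clean``. A minimal Python sketch of that classification rule
(illustrative only, not part of the patch; the helper name is made up, and
the real command also applies the ``--threshold`` timing described above)::

    # Classify a PG state string into the dump_stuck categories.
    def stuck_categories(state_string):
        states = set(state_string.split('+'))   # e.g. "active+clean"
        matches = []
        if 'active' not in states:   # cannot serve reads or writes
            matches.append('inactive')
        if 'clean' not in states:    # objects not replicated enough times
            matches.append('unclean')
        if 'stale' in states:        # its OSDs have stopped reporting
            matches.append('stale')
        return matches

    assert stuck_categories('active+clean') == []
    assert stuck_categories('peering') == ['inactive', 'unclean']
    assert stuck_categories('stale+active+clean') == ['stale']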
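
The ``--threshold`` check can be pictured the same way: a PG is only
reported once it has been continuously stuck for longer than the threshold
(300 seconds by default), judged from per-PG timestamps in the stats (the
real stats carry fields like ``last_active`` and ``last_clean``). A sketch,
with a hypothetical ``last_ok`` argument standing in for whichever
timestamp matches the category::

    import time

    THRESHOLD = 300  # seconds; the documented default

    def is_stuck(last_ok, now=None):
        """True once a PG left its healthy condition more than
        THRESHOLD seconds ago (last_ok is epoch seconds)."""
        now = time.time() if now is None else now
        return now - last_ok > THRESHOLD

    assert is_stuck(0, now=301)          # stuck for 301s > 300s
    assert not is_stuck(100, now=350)    # stuck for only 250s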
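
On the developer-doc side, the todo about how states can overlap is worth
spelling out: a PG is usually in several of these states at once (e.g.
``active+clean+scrubbing``), because the states are tracked internally as
bit flags (the ``PG_STATE_*`` constants in ``src/osd/osd_types.h``) and
rendered as a ``+``-joined string. A sketch of that representation, with
made-up flag values rather than Ceph's actual constants::

    # One bit per state; a PG's state is an OR of several flags.
    ACTIVE    = 1 << 1
    CLEAN     = 1 << 2
    SCRUBBING = 1 << 4
    DEGRADED  = 1 << 5
    STALE     = 1 << 7

    NAMES = [(ACTIVE, 'active'), (CLEAN, 'clean'),
             (SCRUBBING, 'scrubbing'), (DEGRADED, 'degraded'),
             (STALE, 'stale')]

    def state_string(state):
        # Render a bitmask the way the CLI shows it.
        return '+'.join(name for bit, name in NAMES if state & bit)

    assert state_string(ACTIVE | CLEAN | SCRUBBING) == 'active+clean+scrubbing'
    assert state_string(ACTIVE | DEGRADED) == 'active+degraded'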