
Commit e2bf629

Fix two ancient bugs in GiST code to re-find a parent after page split:

First, when following a right-link, we incorrectly marked the current page as the parent of the right sibling. In reality, the parent of the right page is the same as the parent of the current page (or some page to the right of it, gistFindCorrectParent() will sort that out).

Secondly, when we follow a right-link, we must prepend, not append, the right page to our list of pages to visit. That's because we assume that once we hit a leaf page in the list, all the rest are leaf pages too, and give up.

To hit these bugs, you need concurrent actions and several unlucky accidents. Another backend must split the root page while you're in the process of splitting a lower-level page. Furthermore, while you scan the internal nodes to re-find the parent, another backend needs to again split some more internal pages. Even then, the bugs don't necessarily manifest as user-visible errors or index corruption.

While we're at it, make the error reporting a bit better if gistFindPath() fails to re-find the parent. It used to be an assertion, but an elog() seems more appropriate.

Backpatch to all supported branches.
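
The two fixes can be illustrated outside PostgreSQL with a toy queue. This is only a sketch under stated assumptions: QueueNode is a hypothetical stand-in for GISTInsertStack, the is_leaf flag replaces the real page inspection, and the helper names (enqueue_right_old, enqueue_right_fixed, visited_before_first_leaf, demo) are invented for illustration. It contrasts appending the right sibling at the tail, with the current page as its parent, against linking it directly after the current page with the current page's parent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for GISTInsertStack: one queued page to visit. */
typedef struct QueueNode
{
    int               blkno;    /* block number of the page */
    bool              is_leaf;  /* toy flag; the real code inspects the page */
    struct QueueNode *parent;   /* page recorded as this page's parent */
    struct QueueNode *next;     /* next page to visit */
} QueueNode;

/* Old behaviour: append at the tail and record 'top' as the parent. */
static void
enqueue_right_old(QueueNode *top, QueueNode **tail, QueueNode *right)
{
    assert(top != NULL && right != NULL);
    right->parent = top;
    right->next = NULL;
    (*tail)->next = right;
    *tail = right;
}

/* Fixed behaviour: link directly after 'top' and share its parent. */
static void
enqueue_right_fixed(QueueNode *top, QueueNode **tail, QueueNode *right)
{
    assert(top != NULL && right != NULL);
    right->parent = top->parent;
    right->next = top->next;
    top->next = right;
    if (*tail == top)
        *tail = right;
}

/*
 * Walk the queue and stop at the first leaf, mirroring the assumption that
 * only leaf pages follow a leaf. Returns true if 'blkno' was visited.
 */
static bool
visited_before_first_leaf(const QueueNode *head, int blkno)
{
    for (const QueueNode *n = head; n != NULL && !n->is_leaf; n = n->next)
    {
        if (n->blkno == blkno)
            return true;
    }
    return false;
}

/*
 * Build the queue root(1) -> internal(2) -> leaf(10), then queue page 2's
 * right sibling, internal page 3. Reports which block was recorded as the
 * parent of page 3 and whether page 3 is reached before the scan gives up.
 */
static bool
demo(bool use_fix, int *parent_blkno_of_right)
{
    QueueNode root  = {1, false, NULL, NULL};
    QueueNode inner = {2, false, &root, NULL};
    QueueNode leaf  = {10, true, &inner, NULL};
    QueueNode right = {3, false, NULL, NULL};
    QueueNode *tail = &leaf;

    root.next = &inner;
    inner.next = &leaf;

    if (use_fix)
        enqueue_right_fixed(&inner, &tail, &right);
    else
        enqueue_right_old(&inner, &tail, &right);

    *parent_blkno_of_right = right.parent->blkno;
    return visited_before_first_leaf(&root, 3);
}
```

With the old insertion, page 3 ends up behind leaf page 10, so the toy scan stops before reaching it, and its recorded parent is page 2 rather than the root. With the fixed insertion, page 3 is visited right after page 2 and inherits page 2's parent.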
1 parent 335a00a commit e2bf629


1 file changed, +24 -9 lines changed


src/backend/access/gist/gist.c

@@ -674,24 +674,38 @@ gistFindPath(Relation r, BlockNumber child)
 
         if (GistPageIsLeaf(page))
         {
-            /* we can safety go away, follows only leaf pages */
+            /*
+             * Because we scan the index top-down, all the rest of the pages
+             * in the queue must be leaf pages as well.
+             */
             UnlockReleaseBuffer(buffer);
-            return NULL;
+            break;
         }
 
         top->lsn = PageGetLSN(page);
 
         if (top->parent && XLByteLT(top->parent->lsn, GistPageGetOpaque(page)->nsn) &&
             GistPageGetOpaque(page)->rightlink != InvalidBlockNumber /* sanity check */ )
         {
-            /* page splited while we thinking of... */
+            /*
+             * Page was split while we looked elsewhere. We didn't see the
+             * downlink to the right page when we scanned the parent, so
+             * add it to the queue now.
+             *
+             * Put the right page ahead of the queue, so that we visit it
+             * next. That's important, because if this is the lowest internal
+             * level, just above leaves, we might already have queued up some
+             * leaf pages, and we assume that there can't be any non-leaf
+             * pages behind leaf pages.
+             */
             ptr = (GISTInsertStack *) palloc0(sizeof(GISTInsertStack));
             ptr->blkno = GistPageGetOpaque(page)->rightlink;
             ptr->childoffnum = InvalidOffsetNumber;
-            ptr->parent = top;
-            ptr->next = NULL;
-            tail->next = ptr;
-            tail = ptr;
+            ptr->parent = top->parent;
+            ptr->next = top->next;
+            top->next = ptr;
+            if (tail == top)
+                tail = ptr;
         }
 
         maxoff = PageGetMaxOffsetNumber(page);
@@ -749,7 +763,9 @@ gistFindPath(Relation r, BlockNumber child)
         top = top->next;
     }
 
-    return NULL;
+    elog(ERROR, "failed to re-find parent of a page in index \"%s\", block %u",
+         RelationGetRelationName(r), child);
+    return NULL; /* keep compiler quiet */
 }
 
 
@@ -821,7 +837,6 @@ gistFindCorrectParent(Relation r, GISTInsertStack *child)
 
     /* ok, find new path */
     ptr = parent = gistFindPath(r, child->blkno);
-    Assert(ptr != NULL);
 
     /* read all buffers as expected by caller */
     /* note we don't lock them or gistcheckpage them here! */
