mm/gup: /proc/vmstat: pin_user_pages (FOLL_PIN) reporting
Now that pages are "DMA-pinned" via pin_user_page*(), and unpinned via
unpin_user_pages*(), we need some visibility into whether all of this is
working correctly.

Add two new fields to /proc/vmstat:

    nr_foll_pin_acquired
    nr_foll_pin_released

These are documented in Documentation/core-api/pin_user_pages.rst.  They
represent the number of pages (since boot time) that have been pinned
("nr_foll_pin_acquired") and unpinned ("nr_foll_pin_released"), via
pin_user_pages*() and unpin_user_pages*().

In the absence of long-running DMA or RDMA operations that hold pages
pinned, the above two fields will normally be equal to each other.
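
As a rough illustration of how these counters might be consumed from userspace (this is not part of the patch), the sketch below scans a vmstat-style stream for the two new fields. The helper name foll_pin_read() is hypothetical, and the code assumes the usual "name value" one-pair-per-line layout of /proc/vmstat.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: read both FOLL_PIN counters
 * from a vmstat-style stream in a single pass.
 *
 * Returns 0 if both fields were found, -1 otherwise.
 */
static int foll_pin_read(FILE *fp, unsigned long long *acquired,
			 unsigned long long *released)
{
	char key[128];
	unsigned long long val;
	int found = 0;	/* bit 0: acquired seen, bit 1: released seen */

	while (found != 3 && fscanf(fp, "%127s %llu", key, &val) == 2) {
		if (strcmp(key, "nr_foll_pin_acquired") == 0) {
			*acquired = val;
			found |= 1;
		} else if (strcmp(key, "nr_foll_pin_released") == 0) {
			*released = val;
			found |= 2;
		}
	}

	return found == 3 ? 0 : -1;
}
```

A caller could fopen("/proc/vmstat", "r"), pass the stream to this helper, and treat (acquired - released) as the number of pins currently outstanding. A difference that keeps growing while no long-term pins are in place would be worth investigating.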

Also: update Documentation/core-api/pin_user_pages.rst, to remove an
earlier (now confirmed untrue) claim about a performance problem with
/proc/vmstat.

Also: update Documentation/core-api/pin_user_pages.rst to rename the new
/proc/vmstat entries, to the names listed here.

Signed-off-by: John Hubbard <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jérôme Glisse <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
johnhubbard authored and torvalds committed Apr 2, 2020
1 parent 47e29d3 commit 1970dc6
Showing 4 changed files with 45 additions and 5 deletions.
33 changes: 28 additions & 5 deletions Documentation/core-api/pin_user_pages.rst
@@ -208,12 +208,35 @@ has the following new calls to exercise the new pin*() wrapper functions:
 You can monitor how many total dma-pinned pages have been acquired and released
 since the system was booted, via two new /proc/vmstat entries: ::

-    /proc/vmstat/nr_foll_pin_requested
-    /proc/vmstat/nr_foll_pin_requested
+    /proc/vmstat/nr_foll_pin_acquired
+    /proc/vmstat/nr_foll_pin_released

-Those are both going to show zero, unless CONFIG_DEBUG_VM is set. This is
-because there is a noticeable performance drop in unpin_user_page(), when they
-are activated.
+Under normal conditions, these two values will be equal unless there are any
+long-term [R]DMA pins in place, or during pin/unpin transitions.
+
+* nr_foll_pin_acquired: This is the number of logical pins that have been
+  acquired since the system was powered on. For huge pages, the head page is
+  pinned once for each page (head page and each tail page) within the huge page.
+  This follows the same sort of behavior that get_user_pages() uses for huge
+  pages: the head page is refcounted once for each tail or head page in the huge
+  page, when get_user_pages() is applied to a huge page.
+
+* nr_foll_pin_released: The number of logical pins that have been released since
+  the system was powered on. Note that pages are released (unpinned) on a
+  PAGE_SIZE granularity, even if the original pin was applied to a huge page.
+  Because of the pin count behavior described above in "nr_foll_pin_acquired",
+  the accounting balances out, so that after doing this::
+
+    pin_user_pages(huge_page);
+    for (each page in huge_page)
+        unpin_user_page(page);
+
+  ...the following is expected::
+
+    nr_foll_pin_released == nr_foll_pin_acquired
+
+(...unless it was already out of balance due to a long-term RDMA pin being in
+place.)

 References
 ==========
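
The huge page accounting described in the documentation text above can be modeled in a few lines of plain C. This is only a toy model under stated assumptions: struct pin_stats and the model_*() helpers are hypothetical names, not kernel APIs, and the two fields merely stand in for nr_foll_pin_acquired and nr_foll_pin_released.

```c
/* Toy stand-ins for the two /proc/vmstat counters (hypothetical). */
struct pin_stats {
	unsigned long acquired;
	unsigned long released;
};

/* Pinning a huge page counts one logical pin per subpage (head and tails). */
static void model_pin_huge_page(struct pin_stats *s, unsigned long nr_subpages)
{
	s->acquired += nr_subpages;
}

/* Unpinning always happens at PAGE_SIZE granularity: one page, one release. */
static void model_unpin_page(struct pin_stats *s)
{
	s->released += 1;
}

/* Returns 1 if the counters balance after pinning and fully unpinning. */
static int model_balanced_after_huge_pin(unsigned long nr_subpages)
{
	struct pin_stats s = { 0, 0 };
	unsigned long i;

	model_pin_huge_page(&s, nr_subpages);
	for (i = 0; i < nr_subpages; i++)
		model_unpin_page(&s);

	return s.acquired == s.released;
}
```

Because the acquire side adds the subpage count in one step and the release side adds 1 per page, the totals agree once every subpage has been unpinned, which is the balance the documentation describes.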
2 changes: 2 additions & 0 deletions include/linux/mmzone.h
@@ -243,6 +243,8 @@ enum node_stat_item {
 	NR_DIRTIED, /* page dirtyings since bootup */
 	NR_WRITTEN, /* page writings since bootup */
 	NR_KERNEL_MISC_RECLAIMABLE, /* reclaimable non-slab kernel pages */
+	NR_FOLL_PIN_ACQUIRED, /* via: pin_user_page(), gup flag: FOLL_PIN */
+	NR_FOLL_PIN_RELEASED, /* pages returned via unpin_user_page() */
 	NR_VM_NODE_STAT_ITEMS
 };

13 changes: 13 additions & 0 deletions mm/gup.c
@@ -86,6 +86,8 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page,
 	if (flags & FOLL_GET)
 		return try_get_compound_head(page, refs);
 	else if (flags & FOLL_PIN) {
+		int orig_refs = refs;
+
 		/*
 		 * When pinning a compound page of order > 1 (which is what
 		 * hpage_pincount_available() checks for), use an exact count to
@@ -104,6 +106,9 @@ static __maybe_unused struct page *try_grab_compound_head(struct page *page,
 		if (hpage_pincount_available(page))
 			hpage_pincount_add(page, refs);

+		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED,
+				    orig_refs);
+
 		return page;
 	}

@@ -158,6 +163,8 @@ bool __must_check try_grab_page(struct page *page, unsigned int flags)
 		 * once, so that the page really is pinned.
 		 */
 		page_ref_add(page, refs);
+
+		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_ACQUIRED, 1);
 	}

 	return true;
@@ -178,6 +185,7 @@ static bool __unpin_devmap_managed_user_page(struct page *page)

 	count = page_ref_sub_return(page, refs);

+	mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED, 1);
 	/*
 	 * devmap page refcounts are 1-based, rather than 0-based: if
 	 * refcount is 1, then the page is free and the refcount is
@@ -228,6 +236,8 @@ void unpin_user_page(struct page *page)

 	if (page_ref_sub_and_test(page, refs))
 		__put_page(page);
+
+	mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED, 1);
 }
 EXPORT_SYMBOL(unpin_user_page);

@@ -2014,6 +2024,9 @@ EXPORT_SYMBOL(get_user_pages_unlocked);
 static void put_compound_head(struct page *page, int refs, unsigned int flags)
 {
 	if (flags & FOLL_PIN) {
+		mod_node_page_state(page_pgdat(page), NR_FOLL_PIN_RELEASED,
+				    refs);
+
 		if (hpage_pincount_available(page))
 			hpage_pincount_sub(page, refs);
 		else
2 changes: 2 additions & 0 deletions mm/vmstat.c
@@ -1168,6 +1168,8 @@ const char * const vmstat_text[] = {
 	"nr_dirtied",
 	"nr_written",
 	"nr_kernel_misc_reclaimable",
+	"nr_foll_pin_acquired",
+	"nr_foll_pin_released",

 	/* enum writeback_stat_item counters */
 	"nr_dirty_threshold",