Diffstat (limited to 'Documentation/cgroups')
1 files changed, 6 insertions, 122 deletions
diff --git a/Documentation/cgroups/memcg_test.txt b/Documentation/cgroups/memcg_test.txt
index bcf750d3cecd..8870b0212150 100644
@@ -29,28 +29,13 @@ Please note that implementation details can be changed.
a page/swp_entry may be uncharged (usage -= PAGE_SIZE) by
- Called when an anonymous page is fully unmapped. I.e., mapcount goes
- to 0. If the page is SwapCache, uncharge is delayed until
- Called when a page-cache is deleted from radix-tree. If the page is
- SwapCache, uncharge is delayed until mem_cgroup_uncharge_swapcache().
- Called when SwapCache is removed from radix-tree. The charge itself
- is moved to swap_cgroup. (If mem+swap controller is disabled, no
- charge to swap occurs.)
+ Called when a page's refcount goes down to 0.
Called when swp_entry's refcnt goes down to 0. A charge against swap
- mem_cgroup_end_migration(old, new)
- At success of migration old is uncharged (if necessary), a charge
- to new page is committed. At failure, charge to old page is committed.
Memcg pages are charged in two steps:
@@ -69,18 +54,6 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
Anonymous page is newly allocated at
- page fault into MAP_ANONYMOUS mapping.
- It is charged right after it's allocated before doing any page table
- related operations. Of course, it's uncharged when another page is used
- for the fault address.
- At freeing anonymous page (by exit() or munmap()), zap_pte() is called
- and pages for ptes are freed one by one.(see mm/memory.c). Uncharges
- are done at page_remove_rmap() when page_mapcount() goes down to 0.
- Another page freeing is by page-reclaim (vmscan.c) and anonymous
- pages are swapped out. In this case, the page is marked as
- PageSwapCache(). uncharge() routine doesn't uncharge the page marked
- as SwapCache(). It's delayed until __delete_from_swap_cache().
At swap-in, the page is taken from swap-cache. There are 2 cases.
@@ -89,41 +62,6 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
(b) If the SwapCache has been mapped by processes, it has been
- This swap-in is one of the most complicated work. In do_swap_page(),
- following events occur when pte is unchanged.
- (1) the page (SwapCache) is looked up.
- (2) lock_page()
- (3) try_charge_swapin()
- (4) reuse_swap_page() (may call delete_swap_cache())
- (5) commit_charge_swapin()
- (6) swap_free().
- Considering following situation for example.
- (A) The page has not been charged before (2) and reuse_swap_page()
- doesn't call delete_from_swap_cache().
- (B) The page has not been charged before (2) and reuse_swap_page()
- calls delete_from_swap_cache().
- (C) The page has been charged before (2) and reuse_swap_page() doesn't
- call delete_from_swap_cache().
- (D) The page has been charged before (2) and reuse_swap_page() calls
- memory.usage/memsw.usage changes to this page/swp_entry will be
- Case (A) (B) (C) (D)
- Before (2) 0/ 1 0/ 1 1/ 1 1/ 1
- (3) +1/+1 +1/+1 +1/+1 +1/+1
- (4) - 0/ 0 - -1/ 0
- (5) 0/-1 0/ 0 -1/-1 0/ 0
- (6) - 0/-1 - 0/-1
- Result 1/ 1 1/ 1 1/ 1 1/ 1
- In any cases, charges to this page should be 1/ 1.
At swap-out, typical state transition is below.
@@ -136,28 +74,20 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
swp_entry's refcnt -= 1.
- At (b), the page is marked as SwapCache and not uncharged.
- At (d), the page is removed from SwapCache and a charge in page_cgroup
- is moved to swap_cgroup.
Finally, at task exit,
(e) zap_pte() is called and swp_entry's refcnt -=1 -> 0.
- Here, a charge in swap_cgroup disappears.
5. Page Cache
Page Cache is charged at
- uncharged at
- - __remove_from_page_cache().
The logic is very clear. (About migration, see below)
Note: __remove_from_page_cache() is called by remove_from_page_cache()
6. Shmem(tmpfs) Page Cache
- Memcg's charge/uncharge have special handlers of shmem. The best way
- to understand shmem's page state transition is to read mm/shmem.c.
+ The best way to understand shmem's page state transition is to read
But brief explanation of the behavior of memcg around shmem will be
helpful to understand the logic.
@@ -170,56 +100,10 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
It's charged when...
- A new page is added to shmem's radix-tree.
- A swp page is read. (move a charge from swap_cgroup to page_cgroup)
- It's uncharged when
- - A page is removed from radix-tree and not SwapCache.
- - When SwapCache is removed, a charge is moved to swap_cgroup.
- - When swp_entry's refcnt goes down to 0, a charge in swap_cgroup
7. Page Migration
- One of the most complicated functions is page-migration-handler.
- Memcg has 2 routines. Assume that we are migrating a page's contents
- from OLDPAGE to NEWPAGE.
- Usual migration logic is..
- (a) remove the page from LRU.
- (b) allocate NEWPAGE (migration target)
- (c) lock by lock_page().
- (d) unmap all mappings.
- (e-1) If necessary, replace entry in radix-tree.
- (e-2) move contents of a page.
- (f) map all mappings again.
- (g) pushback the page to LRU.
- (-) OLDPAGE will be freed.
- Before (g), memcg should complete all necessary charge/uncharge to
- The point is....
- - If OLDPAGE is anonymous, all charges will be dropped at (d) because
- try_to_unmap() drops all mapcount and the page will not be
- - If OLDPAGE is SwapCache, charges will be kept at (g) because
- __delete_from_swap_cache() isn't called at (e-1)
- - If OLDPAGE is page-cache, charges will be kept at (g) because
- remove_from_swap_cache() isn't called at (e-1)
- memcg provides following hooks.
- - mem_cgroup_prepare_migration(OLDPAGE)
- Called after (b) to account a charge (usage += PAGE_SIZE) against
- memcg which OLDPAGE belongs to.
- - mem_cgroup_end_migration(OLDPAGE, NEWPAGE)
- Called after (f) before (g).
- If OLDPAGE is used, commit OLDPAGE again. If OLDPAGE is already
- charged, a charge by prepare_migration() is automatically canceled.
- If NEWPAGE is used, commit NEWPAGE and uncharge OLDPAGE.
- But zap_pte() (by exit or munmap) can be called while migration,
- we have to check if OLDPAGE/NEWPAGE is a valid page after commit().
Each memcg has its own private LRU. Now, its handling is under global