| author | Kairui Song <kasong@tencent.com> | 2026-04-28 02:06:59 +0800 |
|---|---|---|
| committer | Andrew Morton <akpm@linux-foundation.org> | 2026-05-28 21:31:29 -0700 |
| commit | a9e2a27890722806804ab949da0c06622dcbc3e4 (patch) | |
| tree | 886ae3e0d07e824c4b70c6df1201aa4f945149c4 /mm | |
| parent | 9c4c746bc40ee4b64eb682278ca5f0b37a683903 (diff) | |
| download | linux-next-history-a9e2a27890722806804ab949da0c06622dcbc3e4.tar.gz | |
mm/mglru: don't abort scan immediately right after aging
Right now, if eviction triggers aging, the reclaimer will abort. This is
not the optimal strategy for several reasons.
Aborting the reclaim early wastes a reclaim cycle when under pressure, and
for concurrent reclaim, if the LRU is under aging, all concurrent
reclaimers might fail. And if aging has just finished, new cold folios
exposed by the aging are not reclaimed until the next reclaim iteration.
What's more, the current aging trigger is quite lenient: having 3 gens
with a reclaim priority lower than default will trigger aging and block
reclaim from that memcg. This easily wastes reclaim retry cycles. And
in the worst case, if reclaim is making slow progress and all following
attempts fail because they are blocked by aging, it triggers an
unexpected early OOM.
And if a lruvec requires aging, it doesn't mean it's hot. Instead, the
lruvec could be idle for quite a while, and hence it might contain lots of
cold folios to be reclaimed.
While it's helpful to rotate memcg LRU after aging for global reclaim, as
global reclaim fairness is coupled with the rotation in shrink_many, memcg
fairness is instead handled by cgroup iteration in shrink_node_memcgs.
So, for memcg level pressure, this abort is not the key part for keeping
the fairness. And in most cases, there is no need to age, and fairness
must be achieved by upper-level reclaim control.
So instead, just keep the scanning going unless one whole batch of folios
failed to be isolated or enough folios have been scanned, which is
triggered by evict_folios returning 0. And only abort for global reclaim
after one batch, so when there are fewer memcgs, progress is still made,
and the fairness mechanism described above still works fine.
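The control flow described above can be sketched as a small user-space
model. This is only an illustration, not the kernel code: toy_shrink() and
its parameters are hypothetical names, the "batch" argument stands in for
what evict_folios would return per call, and aging is assumed to be needed
at most once per invocation.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the scan loop after this change (hypothetical helper,
 * not part of mm/vmscan.c). Returns how many batches were evicted
 * before the loop ended.
 *
 * root_reclaim: true for global reclaim, false for cgroup reclaim
 * needs_aging:  what the aging check would report on the first pass
 * nr_to_scan:   remaining scan budget
 * batch:        folios handled per eviction call; 0 models a batch
 *               where nothing could be isolated
 */
int toy_shrink(bool root_reclaim, bool needs_aging, long nr_to_scan, long batch)
{
	bool should_age = false;
	int batches = 0;

	assert(batch >= 0);

	while (nr_to_scan > 0) {
		long delta;

		if (needs_aging) {
			/* The old behavior was to break right here. */
			should_age = true;
			needs_aging = false;	/* assumption: aging runs once */
		}

		delta = batch < nr_to_scan ? batch : nr_to_scan;
		if (!delta)
			break;		/* a whole batch failed to make progress */
		batches++;

		/* Global reclaim stops after one batch so rotation can happen. */
		if (root_reclaim && should_age)
			break;

		nr_to_scan -= delta;
	}

	return batches;
}
```

In this model, global reclaim that hits aging still evicts one batch before
stopping, while cgroup reclaim keeps going until the budget is used up or a
batch makes no progress.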
And in most cases, the one more batch attempt for global reclaim might
just be enough to satisfy what the reclaimer needs, hence improving global
reclaim performance by reducing reclaim retry cycles.
Rotation is still there after the reclaim is done, which still follows the
comment in mmzone.h. And fairness still looks good.
Link: https://lore.kernel.org/20260428-mglru-reclaim-v7-8-02fabb92dc43@tencent.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Chen Ridong <chenridong@huaweicloud.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Stevens <stevensd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kalesh Singh <kaleshsingh@google.com>
Cc: Leno Hou <lenohou@gmail.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vernon Yang <vernon2gm@gmail.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Yafang <laoar.shao@gmail.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'mm')
| -rw-r--r-- | mm/vmscan.c | 12 |
1 files changed, 9 insertions, 3 deletions
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6128b191b81d7..daad01a07e338 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4989,7 +4989,7 @@ static bool should_abort_scan(struct lruvec *lruvec, struct scan_control *sc)
  */
 static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 {
-	bool need_rotate = false;
+	bool need_rotate = false, should_age = false;
 	long nr_batch, nr_to_scan;
 	int swappiness = get_swappiness(lruvec, sc);
 	struct mem_cgroup *memcg = lruvec_memcg(lruvec);
@@ -5007,8 +5007,7 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		if (should_run_aging(lruvec, max_seq, sc, swappiness)) {
 			if (try_to_inc_max_seq(lruvec, max_seq, swappiness, false))
 				need_rotate = true;
-			/* stop scanning as it's low on cold folios */
-			break;
+			should_age = true;
 		}
 
 		nr_batch = min(nr_to_scan, MIN_LRU_BATCH);
@@ -5019,6 +5018,13 @@ static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 		if (should_abort_scan(lruvec, sc))
 			break;
 
+		/*
+		 * Root reclaim needs rotation when low on cold folio for better
+		 * fairness. Cgroup reclaim gets fairness from the iterator.
+		 */
+		if (root_reclaim(sc) && should_age)
+			break;
+
 		nr_to_scan -= delta;
 		cond_resched();
 	}
