From: Davidlohr Bueso <dave@stgolabs.net>
To: jack@suse.cz, tytso@mit.edu, adilger.kernel@dilger.ca,
brauner@kernel.org
Cc: mcgrof@kernel.org, willy@infradead.org, hare@suse.de,
djwong@kernel.org, linux-ext4@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
Davidlohr Bueso <dave@stgolabs.net>,
kernel test robot <oliver.sang@intel.com>,
syzbot+f3c6fda1297c748a7076@syzkaller.appspotmail.com
Subject: [PATCH 7/7] mm/migrate: fix sleep in atomic for large folios and buffer heads
Date: Thu, 17 Apr 2025 18:59:21 -0700
Message-ID: <20250418015921.132400-8-dave@stgolabs.net>
In-Reply-To: <20250418015921.132400-1-dave@stgolabs.net>
The large folio + buffer head noref migration scenarios are
being naughty and blocking while holding a spinlock.
Because the pagecache lookup path takes the folio lock, it
serializes against the migration paths, so the two can wait
for each other. For the atomic (private_lock) case, a new
BH_Migrate flag is introduced which enables the lookup to
bail.
This allows the critical region of the private_lock on
the migration path to be reduced to what it was before
ebdf4de5642fb6 ("mm: migrate: fix reference check race
between __find_get_block() and migration"), that is, covering
only the count checks.
The scope is always noref migration.
Reported-by: kernel test robot <oliver.sang@intel.com>
Reported-by: syzbot+f3c6fda1297c748a7076@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/oe-lkp/202503101536.27099c77-lkp@intel.com
Fixes: 3c20917120ce61 ("block/bdev: enable large folio support for large logical block sizes")
Reviewed-by: Jan Kara <jack@suse.cz>
Co-developed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
---
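Not part of the changelog: a minimal userspace sketch of the BH_Migrate
handshake described above, for readers who want to poke at the idea
outside the kernel. Everything named sketch_* or SKETCH_* is hypothetical
and exists only for illustration. C11 atomics stand in for the kernel's
test_and_set_bit_lock(), clear_bit_unlock() and test_bit_acquire(); they
approximate the ordering but are not the kernel implementation, and the
folio lock and private_lock are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the BH_Migrate bit in head->b_state. */
#define SKETCH_MIGRATE (1UL << 0)

struct sketch_head {
	atomic_ulong state;	/* models head->b_state */
	int payload;		/* models the buffer a lookup would return */
};

/*
 * Migration side, models test_and_set_bit_lock(): set the bit with
 * acquire ordering and report whether it was already set.
 */
static bool sketch_migrate_begin(struct sketch_head *h)
{
	unsigned long old = atomic_fetch_or_explicit(&h->state, SKETCH_MIGRATE,
						     memory_order_acquire);
	return (old & SKETCH_MIGRATE) != 0;
}

/*
 * Migration side, models clear_bit_unlock(): clear the bit with
 * release ordering. Clearing a bit that was never set is a bug.
 */
static void sketch_migrate_end(struct sketch_head *h)
{
	unsigned long old = atomic_fetch_and_explicit(&h->state, ~SKETCH_MIGRATE,
						      memory_order_release);
	assert(old & SKETCH_MIGRATE);
}

/*
 * Lookup side, models the test_bit_acquire() check: while a migration
 * is in flight the lookup bails with NULL instead of waiting.
 */
static int *sketch_lookup(struct sketch_head *h)
{
	if (atomic_load_explicit(&h->state, memory_order_acquire) & SKETCH_MIGRATE)
		return NULL;
	return &h->payload;
}
```

In the sketch, a lookup succeeds before sketch_migrate_begin() and after
sketch_migrate_end(), and returns NULL in between. That NULL is the
"false failure" the ext4 comment in this patch refers to.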
fs/buffer.c | 12 +++++++++++-
fs/ext4/ialloc.c | 3 ++-
include/linux/buffer_head.h | 1 +
mm/migrate.c | 8 +++++---
4 files changed, 19 insertions(+), 5 deletions(-)
diff --git a/fs/buffer.c b/fs/buffer.c
index f8c9e5eb4685..7be23ff20b27 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -207,6 +207,15 @@ __find_get_block_slow(struct block_device *bdev, sector_t block, bool atomic)
head = folio_buffers(folio);
if (!head)
goto out_unlock;
+ /*
+ * Upon a noref migration, the folio lock serializes here;
+ * otherwise bail.
+ */
+ if (test_bit_acquire(BH_Migrate, &head->b_state)) {
+ WARN_ON(!atomic);
+ goto out_unlock;
+ }
+
bh = head;
do {
if (!buffer_mapped(bh))
@@ -1390,7 +1399,8 @@ lookup_bh_lru(struct block_device *bdev, sector_t block, unsigned size)
/*
* Perform a pagecache lookup for the matching buffer. If it's there, refresh
* it in the LRU and mark it as accessed. If it is not present then return
- * NULL
+ * NULL. Atomic context callers may also return NULL if the buffer is being
+ * migrated; similarly the page is not marked accessed either.
*/
static struct buffer_head *
find_get_block_common(struct block_device *bdev, sector_t block,
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 38bc8d74f4cc..e7ecc7c8a729 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -691,7 +691,8 @@ static int recently_deleted(struct super_block *sb, ext4_group_t group, int ino)
if (!bh || !buffer_uptodate(bh))
/*
* If the block is not in the buffer cache, then it
- * must have been written out.
+ * must have been written out, or, most unlikely, is
+ * being migrated - false failure should be OK here.
*/
goto out;
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index c791aa9a08da..0029ff880e27 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -34,6 +34,7 @@ enum bh_state_bits {
BH_Meta, /* Buffer contains metadata */
BH_Prio, /* Buffer should be submitted with REQ_PRIO */
BH_Defer_Completion, /* Defer AIO completion to workqueue */
+ BH_Migrate, /* Buffer is being migrated (norefs) */
BH_PrivateStart,/* not a state bit, but the first bit available
* for private allocation by other entities
diff --git a/mm/migrate.c b/mm/migrate.c
index f3ee6d8d5e2e..676d9cfc7059 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -845,9 +845,11 @@ static int __buffer_migrate_folio(struct address_space *mapping,
return -EAGAIN;
if (check_refs) {
- bool busy;
+ bool busy, migrating;
bool invalidated = false;
+ migrating = test_and_set_bit_lock(BH_Migrate, &head->b_state);
+ VM_WARN_ON_ONCE(migrating);
recheck_buffers:
busy = false;
spin_lock(&mapping->i_private_lock);
@@ -859,12 +861,12 @@ static int __buffer_migrate_folio(struct address_space *mapping,
}
bh = bh->b_this_page;
} while (bh != head);
+ spin_unlock(&mapping->i_private_lock);
if (busy) {
if (invalidated) {
rc = -EAGAIN;
goto unlock_buffers;
}
- spin_unlock(&mapping->i_private_lock);
invalidate_bh_lrus();
invalidated = true;
goto recheck_buffers;
@@ -883,7 +885,7 @@ static int __buffer_migrate_folio(struct address_space *mapping,
unlock_buffers:
if (check_refs)
- spin_unlock(&mapping->i_private_lock);
+ clear_bit_unlock(BH_Migrate, &head->b_state);
bh = head;
do {
unlock_buffer(bh);
--
2.39.5
2025-04-18 1:59 [PATCH v2 0/7] fs/buffer: split pagecache lookups into atomic or blocking Davidlohr Bueso
2025-04-18 1:59 ` [PATCH 1/7] fs/buffer: split locking for pagecache lookups Davidlohr Bueso
2025-04-18 1:59 ` [PATCH 2/7] fs/buffer: introduce sleeping flavors " Davidlohr Bueso
2025-04-18 1:59 ` [PATCH 3/7] fs/buffer: use sleeping version of __find_get_block() Davidlohr Bueso
2025-04-18 1:59 ` [PATCH 4/7] fs/ocfs2: " Davidlohr Bueso
2025-04-18 1:59 ` [PATCH 5/7] fs/jbd2: " Davidlohr Bueso
2025-04-18 1:59 ` [PATCH 6/7] fs/ext4: use sleeping version of sb_find_get_block() Davidlohr Bueso
2025-04-18 1:59 ` Davidlohr Bueso [this message]
2025-04-21 21:06 ` [PATCH v2 0/7] fs/buffer: split pagecache lookups into atomic or blocking Luis Chamberlain
2025-04-22 7:51 ` Christian Brauner
-- strict thread matches above, loose matches on Subject: below --
2025-04-15 23:16 [PATCH -next " Davidlohr Bueso
2025-04-15 23:16 ` [PATCH 7/7] mm/migrate: fix sleep in atomic for large folios and buffer heads Davidlohr Bueso
2025-04-16 9:43 ` Jan Kara