author | Kent Overstreet <kent.overstreet@linux.dev> | 2025-05-27 22:20:27 -0400 |
---|---|---|
committer | Kent Overstreet <kent.overstreet@linux.dev> | 2025-05-31 22:03:17 -0400 |
commit | 0224d17d762ce036fde5ad18dd33236db6fca88b (patch) | |
tree | fedc47f64b3cbb9a96fcf1d4976be6cc9fdba5aa /fs | |
parent | f02d15327455822ed80e0b7d70b2ab3568a0389e (diff) | |
download | linux-0224d17d762ce036fde5ad18dd33236db6fca88b.tar.gz |
bcachefs: Runtime self healing for keys for deleted snapshots
If snapshot deletion incorrectly misses some keys and leaves keys for
deleted snapshots, that causes a bit of a problem for data move - we
can't move an extent for a nonexistent snapshot, because the extent
might have to be fragmented, and maintaining correct visibility in child
snapshots doesn't work if it doesn't have a snapshot.
Previously we'd just skip these keys, but it turns out that causes
copygc to spin.
So we need runtime self healing, i.e. calling check_key_has_snapshot()
from the data move path.
Snapshot deletion v2 included sentinel values for deleted snapshot
nodes, so this is quite safe.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Diffstat (limited to 'fs')
-rw-r--r-- | fs/bcachefs/data_update.c | 25 |
1 files changed, 18 insertions, 7 deletions
diff --git a/fs/bcachefs/data_update.c b/fs/bcachefs/data_update.c
index 63a10ea83c9743..fafe7a57ea4161 100644
--- a/fs/bcachefs/data_update.c
+++ b/fs/bcachefs/data_update.c
@@ -821,13 +821,24 @@ int bch2_data_update_init(struct btree_trans *trans,
 	struct bch_fs *c = trans->c;
 	int ret = 0;
 
-	/*
-	 * fs is corrupt we have a key for a snapshot node that doesn't exist,
-	 * and we have to check for this because we go rw before repairing the
-	 * snapshots table - just skip it, we can move it later.
-	 */
-	if (unlikely(k.k->p.snapshot && !bch2_snapshot_exists(c, k.k->p.snapshot)))
-		return -BCH_ERR_data_update_done_no_snapshot;
+	if (k.k->p.snapshot) {
+		/*
+		 * We'll go ERO if we see a key for a missing snapshot, and if
+		 * we're still in recovery we want to give that a chance to
+		 * repair:
+		 */
+		if (unlikely(test_bit(BCH_FS_in_recovery, &c->flags) &&
+			     bch2_snapshot_id_state(c, k.k->p.snapshot) == SNAPSHOT_ID_empty))
+			return -BCH_ERR_data_update_done_no_snapshot;
+
+		ret = bch2_check_key_has_snapshot(trans, iter, k);
+		if (ret < 0)
+			return ret;
+		if (ret) /* key was deleted */
+			return bch2_trans_commit(trans, NULL, NULL, BCH_TRANS_COMMIT_no_enospc) ?:
+				-BCH_ERR_data_update_done_no_snapshot;
+		ret = 0;
+	}
 
 	bch2_bkey_buf_init(&m->k);
 	bch2_bkey_buf_reassemble(&m->k, c, k);
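
The control flow this patch adds to bch2_data_update_init() can be modelled outside the kernel. The following is only a userspace sketch under stated assumptions: every `toy_*` name, the `snap_state` enum and the `ERR_*` values are hypothetical stand-ins, not bcachefs definitions, and the stubbed check simply treats any non-live snapshot as "key deleted", which is a simplification of whatever bch2_check_key_has_snapshot() really does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum snap_state { SNAP_EMPTY, SNAP_LIVE, SNAP_DELETED };

enum {
	ERR_NO_SNAPSHOT = 1,	/* models BCH_ERR_data_update_done_no_snapshot */
	ERR_CHECK_FAILED = 2,	/* models a hard error from the check */
};

#define TOY_NR_SNAPSHOTS 8

struct toy_fs {
	bool		in_recovery;	/* models BCH_FS_in_recovery */
	enum snap_state	snaps[TOY_NR_SNAPSHOTS];
	bool		check_fails;	/* force the check to error out */
	unsigned	keys_deleted;
	unsigned	commits;
};

/*
 * Models the return convention the diff relies on:
 * <0 on error, 1 if the key was deleted, 0 if the key is fine.
 */
static int toy_check_key_has_snapshot(struct toy_fs *fs, uint32_t snapshot)
{
	if (fs->check_fails)
		return -ERR_CHECK_FAILED;
	if (fs->snaps[snapshot] != SNAP_LIVE) {
		fs->keys_deleted++;
		return 1;
	}
	return 0;
}

/* Models committing the transaction that holds the deletion. */
static int toy_commit(struct toy_fs *fs)
{
	fs->commits++;
	return 0;
}

/* Returns 0 if the move may proceed, negative if the key must be skipped. */
static int toy_data_update_init(struct toy_fs *fs, uint32_t snapshot)
{
	int ret;

	assert(snapshot < TOY_NR_SNAPSHOTS);

	if (!snapshot)		/* key is not in a snapshotted btree */
		return 0;

	/* Still in recovery: skip, and leave the repair to recovery. */
	if (fs->in_recovery && fs->snaps[snapshot] == SNAP_EMPTY)
		return -ERR_NO_SNAPSHOT;

	ret = toy_check_key_has_snapshot(fs, snapshot);
	if (ret < 0)
		return ret;
	if (ret) {		/* key was deleted: commit, then skip it */
		ret = toy_commit(fs);
		return ret ? ret : -ERR_NO_SNAPSHOT;
	}
	return 0;
}

/* Usage helper: run one key through the path with a given snapshot state. */
static int toy_move_one(bool in_recovery, enum snap_state st, bool check_fails)
{
	struct toy_fs fs = { .in_recovery = in_recovery, .check_fails = check_fails };

	fs.snaps[1] = st;
	return toy_data_update_init(&fs, 1);
}
```

In this model the caller sees the same "no snapshot" result whether the key was skipped during recovery or deleted at runtime. The difference is that in the second case the stale key is gone, so a later pass of the mover would not encounter it again.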