Merge tag 'vfs-7.2-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull misc vfs updates from Christian Brauner:
 "Features:

   - Reduce pipe->mutex contention by pre-allocating pages outside the
     lock in anon_pipe_write().

     anon_pipe_write() called alloc_page() once per page while holding
     pipe->mutex. The allocation can sleep doing direct reclaim and
     runs memcg charging, which extends the critical section and stalls
     any concurrent reader on the same mutex. Now up to 8 pages are
     pre-allocated before the mutex is taken, leftovers are recycled
     into the per-pipe tmp_page[] cache before unlock, and any
     remainder is released after unlock, keeping the allocator out of
     the critical section on both sides.

     On a writers x readers sweep with 64KB writes against a 1 MB pipe
     throughput improves 6-28% and average write latency drops 5-22%;
     under memory pressure - when the cost of holding the mutex across
     reclaim is highest - throughput improves 21-48% and latency drops
     17-33%. The microbenchmark is added to selftests.

   - uaccess/sockptr: fix the ignored_trailing logic in
     copy_struct_to_user() to behave as documented and the usize check
     in copy_struct_from_sockptr() for user pointers, and add
     copy_struct_{from,to}_bounce_buffer() and copy_struct_to_sockptr()
     helpers for upcoming users (IPPROTO_SMBDIRECT, IPPROTO_QUIC).

   - bpf: add a sleepable bpf_real_inode() kfunc that resolves the real
     inode backing a dentry via d_real_inode(). On overlayfs the inode
     attached to the dentry doesn't carry the underlying device
     information; this is used by the filesystem restriction BPF
     program that was merged into systemd.

   - docs: add guidelines for submitting new filesystems, motivated by
     the maintenance burden abandoned and untestable filesystems impose
     on VFS developers, blocking infrastructure work like folio
     conversions and iomap migration.

  Fixes:

   - libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo()
     and drop the now-redundant assignments in callers.

     This began as a one-line dma-buf fix for a path_noexec() warning;
     a pseudo filesystem has no reason not to set SB_I_NOEXEC. All
     init_pseudo() callers were audited: the only visible effect is on
     dma-buf where SB_I_NOEXEC silences the warning.

   - Handle set_blocksize() failures in legacy filesystems (bfs, hpfs,
     qnx4, jfs, befs, affs, isofs, minix, ntfs3, omfs). Mounting a
     device with a sector size > PAGE_SIZE crashed roughly half of
     them; the rest had the same missing error handling pattern. Plus a
     follow-up releasing the superblock buffer_head when setting the
     minix v3 block size fails.

   - mount: honour SB_NOUSER in the new mount API.

   - fs/fcntl: fix a SOFTIRQ-unsafe lock order in fasync signaling by
     switching the process-group paths of send_sigio() and
     send_sigurg() from read_lock(&tasklist_lock) to RCU, matching the
     single-PID path.

   - vfs: add an FS_USERNS_DELEGATABLE flag and set it for NFS, fixing
     delegated NFS mounts (fsopen() in a container with the mount
     performed by a privileged daemon) that broke when non-init
     s_user_ns was tied to FS_USERNS_MOUNT.

   - selftests/namespaces: fix a hang in nsid_test where an unreaped
     grandchild kept the TAP pipe write-end open, a waitpid(-1) race in
     listns_efault_test, and a false FAIL on kernels without listns()
     where the tests should SKIP.

   - filelock: fix the break_lease() stub signature for
     CONFIG_FILE_LOCKING=n.

   - init/initramfs_test: wait for the async initramfs unpacking before
     running; the test and do_populate_rootfs() share the parser state.

   - fs/coredump: reduce redundant log noise in
     validate_coredump_safety().

   - iomap: pass the correct length to fserror_report_io() in
     __iomap_write_begin().

   - backing-file: fix the backing_file_open() kerneldoc.

  Cleanups:

   - initramfs: refactor the cpio hex header parsing to use hex2bin()
     instead of the hand-rolled simple_strntoul() which is reverted,
     and extend the initramfs KUnit tests to cover header fields with
     0x prefixes.

   - Replace __get_free_pages() and friends with kmalloc()/kzalloc()
     across quota, proc, ocfs2/dlm, nilfs2, nfs, nfsd, libfs, jfs,
     jbd2, isofs, fuse, select, namespace, configfs, binfmt_misc, bfs,
     and the do_mounts init code - part of the larger work of replacing
     page allocator calls with kmalloc().

   - Use clear_and_wake_up_bit() in unlock_buffer() and
     journal_end_buffer_io_sync() instead of open-coding the sequence.

   - Drop unused VFS exports: unexport drop_super_exclusive(), remove
     start_removing_user_path_at(), and fold __start_removing_path()
     into start_removing_path().

   - fs/read_write: narrow the __kernel_write() export with
     EXPORT_SYMBOL_FOR_MODULES().

   - vfs: uapi: retire octal and hex constants in favor of (1 << n) for
     the O_ flags. Finding a free bit for a new flag across the
     architectures was needlessly hard with the mixed bases.

   - dcache: add extra sanity checks of dead dentries in dentry_free()
     via a new DENTRY_WARN_ONCE() that also prints d_flags.

   - iov_iter: use kmemdup_array() in dup_iter() to harden the
     allocation against multiplication overflow.

   - fs/pipe: write to ->poll_usage only once.

   - vfs: remove an always-taken if-branch in find_next_fd().

   - dcache: use kmalloc_flex() for struct external_name in
     __d_alloc().

   - namei: use QSTR() instead of QSTR_INIT() in path_pts().

   - sync_file_range: delete dead S_ISLNK code.

   - Comment fixes: retire a stale comment in fget_task_next() and fix
     assorted spelling mistakes"

* tag 'vfs-7.2-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (73 commits)
  backing-file: fix backing_file_open() kerneldoc parameter
  iomap: pass the correct len to fserror_report_io in __iomap_write_begin
  vfs: add FS_USERNS_DELEGATABLE flag and set it for NFS
  filelock: fix break_lease() stub signature for CONFIG_FILE_LOCKING=n
  vfs: uapi: retire octal and hex numbers in favor of (1 << n) for O_ flags
  bpf: add bpf_real_inode() kfunc
  fs/read_write: Do not export __kernel_write() to the entire world
  libfs: drop redundant SB_I_NOEXEC/SB_I_NODEV in init_pseudo() callers
  libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo()
  mount: honour SB_NOUSER in the new mount API
  fs/fcntl: fix SOFTIRQ-unsafe lock order in fasync signaling
  selftests/pipe: add pipe_bench microbenchmark
  fs/pipe: pre-allocate pages outside pipe->mutex in anon_pipe_write
  fs: retire stale comment in fget_task_next()
  fs: fix spelling mistakes in comment
  bfs: replace get_zeroed_page() with kzalloc()
  binfmt_misc: replace __get_free_page() with kmalloc()
  configfs: replace __get_free_pages() with kzalloc()
  fs/namespace: use __getname() to allocate mntpath buffer
  fs/select: replace __get_free_page() with kmalloc()
  ...
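
The pre-allocation scheme described for anon_pipe_write() can be illustrated with a small userspace analogue. This is only a sketch of the locking pattern, not the kernel code: `struct chan`, `chan_init()`, `chan_write()`, `chan_destroy()`, `CACHE_SLOTS`, and `CHAN_CAP` are hypothetical names and sizes chosen for illustration, malloc() stands in for alloc_page(), and a pthread mutex stands in for pipe->mutex. Only the limit of 8 pre-allocated buffers is taken from the message above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define PREALLOC_MAX 8    /* "up to 8 pages" from the commit message */
#define CACHE_SLOTS  2    /* assumed size of the spare-buffer cache */
#define CHAN_CAP     4    /* assumed queue capacity, small on purpose */
#define BUF_SIZE     4096

struct chan {
	pthread_mutex_t lock;
	void *slots[CHAN_CAP];      /* queued buffers, owned by the channel */
	size_t queued;
	void *cache[CACHE_SLOTS];   /* stand-in for the per-pipe tmp_page[] */
};

void chan_init(struct chan *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->queued = 0;
	for (size_t i = 0; i < CHAN_CAP; i++)
		c->slots[i] = NULL;
	for (size_t i = 0; i < CACHE_SLOTS; i++)
		c->cache[i] = NULL;
}

/* Queue up to `want` buffers; returns how many were actually queued. */
size_t chan_write(struct chan *c, size_t want)
{
	void *pre[PREALLOC_MAX];
	size_t got = 0, used = 0, queued_now;

	assert(c != NULL);
	if (want > PREALLOC_MAX)
		want = PREALLOC_MAX;

	/* Step 1: allocate before taking the lock; the allocator may be slow. */
	while (got < want) {
		void *buf = malloc(BUF_SIZE);
		if (!buf)
			break;
		pre[got++] = buf;
	}

	pthread_mutex_lock(&c->lock);

	/* Step 2: hand over as many buffers as the queue has room for. */
	while (used < got && c->queued < CHAN_CAP)
		c->slots[c->queued++] = pre[used++];
	queued_now = used;

	/* Step 3: park leftovers in empty cache slots while still locked. */
	for (size_t i = 0; i < CACHE_SLOTS && used < got; i++) {
		if (!c->cache[i])
			c->cache[i] = pre[used++];
	}

	pthread_mutex_unlock(&c->lock);

	/* Step 4: free whatever did not fit, after dropping the lock. */
	while (used < got)
		free(pre[used++]);

	return queued_now;
}

void chan_destroy(struct chan *c)
{
	for (size_t i = 0; i < c->queued; i++)
		free(c->slots[i]);
	for (size_t i = 0; i < CACHE_SLOTS; i++)
		free(c->cache[i]);
	pthread_mutex_destroy(&c->lock);
}
```

Neither malloc() nor free() runs between the lock and unlock calls, which is the property the commit message describes as keeping the allocator out of the critical section on both sides. The sketch does not model a writer drawing from the cache on a later call.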
author: Linus Torvalds <torvalds@linux-foundation.org> 2026-06-15 03:59:45 +0530
committer: Linus Torvalds <torvalds@linux-foundation.org> 2026-06-15 03:59:45 +0530
commit: 7e0e7bd60d4a812b694c477716597fcb038b00cb (patch)
tree: 4ff61d47485803e7dacab1c8ddef0a4c11b512da /Documentation
parent: ff8747aacaff8266dd751b8a8648fb728dcc3b21 (diff)
parent: aa5c4fe3ba0cb2af90bbcfa7a8ef4fefcd5c2370 (diff)
download: ath-7e0e7bd60d4a812b694c477716597fcb038b00cb.tar.gz
3 files changed, 196 insertions, 1 deletions
diff --git a/Documentation/filesystems/adding-new-filesystems.rst b/Documentation/filesystems/adding-new-filesystems.rst
new file mode 100644
index 0000000000000..a3d0bf16f73a0
--- /dev/null
+++ b/Documentation/filesystems/adding-new-filesystems.rst
@@ -0,0 +1,195 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+.. _adding_new_filesystems:
+
+Adding New Filesystems
+======================
+
+This document describes what is involved in adding a new filesystem to the
+Linux kernel.
+
+Every filesystem merged into the kernel becomes the collective responsibility
+of the VFS maintainers and the wider filesystem development community.
+Experience has shown that filesystems which become unmaintained impose a
+significant and ongoing burden: they are hard or impossible to test, they
+block infrastructure changes because someone must update or preserve old APIs
+for code that nobody is actively looking after, and they accumulate unfixed
+bugs.  The requirements and expectations described here are informed by this
+experience and are intended to ensure that new filesystems enter the kernel
+on a sustainable footing.
+
+
+Do You Need a New In-Kernel Filesystem?
+---------------------------------------
+
+Before proposing a new in-kernel filesystem, consider whether one of the
+alternatives might be more appropriate.
+
+ - If an existing in-kernel filesystem covers the same use case, improving it
+   is generally preferred over adding a new implementation.  The kernel
+   community favors incremental improvement over parallel implementations.
+
+ - If the filesystem serves a niche audience or has a small user base, a FUSE
+   (Filesystem in Userspace) implementation may be a better fit.  FUSE
+   filesystems avoid the long-term kernel maintenance commitment and can be
+   developed and released on their own schedule.
+
+ - If kernel-level performance, reliability, or integration is genuinely
+   required, make the case explicitly.  Explain who the users are, what the
+   use case is, and why a FUSE implementation would not be sufficient.
+
+
+Technical Requirements
+----------------------
+
+New filesystems must use current kernel interfaces and practices.
+Submitting a filesystem built on outdated APIs creates an unacceptable
+maintenance debt and is likely to face pushback during review.
+
+Use modern VFS interfaces
+  Do not use interfaces listed in
+  :ref:`Documentation/process/deprecated.rst <deprecated>`.
+
+  Use folios rather than raw page operations for page cache management and
+  iomap rather than buffer heads for block mapping and I/O.  See
+  ``Documentation/filesystems/iomap/index.rst`` for iomap documentation.
+
+  Block-based filesystems that need functionality not currently provided by
+  iomap should be prepared to explain why adding that functionality to iomap
+  is infeasible, rather than reimplementing their own block mapping layer.
+
+  Network filesystems should consider using the netfs library
+  (``Documentation/filesystems/netfs_library.rst``), or be prepared to explain
+  why it is not a good fit.
+
+Provide userspace utilities
+  A ``mkfs`` tool is expected so that the filesystem can be created and used
+  by testers and users.  A ``fsck`` tool is strongly recommended; while not
+  strictly required for every filesystem type, the ability to verify
+  consistency and repair corruption is an important part of a mature
+  filesystem.
+
+Be testable
+  The filesystem must be testable in a meaningful way.  The
+  `fstests <https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git>`_
+  framework (also known as xfstests) is the standard testing infrastructure
+  for Linux filesystems and its use is highly recommended.  At a minimum,
+  there must be a credible and documented way to test the filesystem and
+  detect regressions.  When submitting, include a summary of test results
+  indicating which tests pass, fail, or are not applicable.
+
+Provide documentation
+  A documentation file under ``Documentation/filesystems/`` describing the
+  filesystem, its on-disk format, mount options, and any notable design
+  decisions is recommended.
+
+
+Community and Maintainership Expectations
+-----------------------------------------
+
+Merging a filesystem is a long-term commitment.  The kernel community
+needs confidence that the filesystem will be actively maintained after it
+is merged.
+
+Identified maintainers
+  The submission must include a ``MAINTAINERS`` entry with at least one
+  maintainer (``M:``), a mailing list (``L:``), and a git tree (``T:``).
+  Having two or more maintainers is strongly preferred so that coverage
+  does not depend on a single person.  The maintainers are expected to be
+  the primary points of contact for the filesystem going forward.
+
+Demonstrated commitment
+  A track record of maintaining kernel code -- for example, in other
+  subsystems -- significantly strengthens the case for a new filesystem.
+  Maintainers who are already known and trusted within the community face
+  less friction during review.
+
+Sustained backing
+  Major filesystems in Linux have organizational or corporate support behind
+  their development.  Filesystems that depend entirely on volunteer effort
+  face higher scrutiny about their long-term viability.
+
+Responsiveness
+  The maintainer is expected to respond to bug reports, address review
+  feedback, and adapt the filesystem to VFS infrastructure changes such as
+  folio conversions, iomap migration, and mount API updates.  Unresponsive
+  maintainership is one of the primary reasons filesystems end up on the
+  path to deprecation.
+
+User base
+  Clearly describe who the users of this filesystem are and the scale of the
+  user base.  Filesystems with a very small or unclear user base face a
+  harder path to acceptance and a higher risk of future deprecation.
+
+Building your track record
+  A practical way to demonstrate many of the qualities above is to maintain
+  the filesystem out-of-tree for a period before requesting a merge.  This
+  shows sustained commitment, builds a visible user base, and gives reviewers
+  confidence that the code and its maintainer will persist after merging.
+  That said, it is recognized that for some filesystems the user base grows
+  significantly only after upstreaming, so a compelling case for expected
+  adoption can substitute for a large existing user base.
+
+
+Submission Process
+------------------
+
+This section covers what is specific to filesystem submissions, over and
+above the normal submission advice in
+:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` and
+:ref:`Documentation/process/submit-checklist.rst <submitchecklist>`.
+
+ - Send patches to the linux-fsdevel mailing list
+   (``linux-fsdevel@vger.kernel.org``).  CC the relevant VFS maintainers as
+   listed in the ``MAINTAINERS`` file under
+   ``FILESYSTEMS (VFS and infrastructure)``.
+
+ - Structure the submission logically.  It is neither acceptable to send one
+   large patch containing the entire filesystem, nor is a replay of the full
+   development history helpful to reviewers.  Instead, split the series by
+   topic -- for example: superblock and mount handling, inode operations,
+   directory operations, address space operations, and so on -- so that each
+   patch is reviewable in isolation.
+
+ - Separate any filesystem-specific ioctls into their own patches with
+   dedicated justification.  Interfaces beyond those already common across
+   other filesystems will receive additional scrutiny because they are hard
+   to maintain and may conflict with future generic interfaces.
+
+ - Expect thorough review.  Filesystem code interacts deeply with the VFS,
+   memory management, and block layers, so reviewers will examine the code
+   carefully.  Address all review feedback and be prepared for multiple
+   revision cycles.
+
+ - It may be appropriate to mark the filesystem as experimental in its Kconfig
+   help text for the first few releases to set expectations while the code
+   stabilizes in-tree.
+
+
+Ongoing Obligations
+-------------------
+
+Merging is not the finish line.  Maintaining a filesystem in the kernel is an
+ongoing commitment.
+
+ - Adapt to VFS infrastructure changes.  The VFS layer evolves continuously;
+   maintainers are expected to keep up with conversions such as folio
+   migration, iomap adoption, and mount API updates.
+
+ - Maintain test coverage.  As test suites evolve, the filesystem's test
+   results should be kept current.
+
+ - Handle security issues and regressions promptly, whether they are reported
+   by ordinary users or by test bots and fuzzing tools.
+   The filesystem must handle corrupted input gracefully without corrupting
+   memory, hanging, or crashing the kernel.
+
+ - Engage with the wider filesystem community.  Participate on linux-fsdevel,
+   share approaches to common problems, and look for opportunities to reuse
+   shared infrastructure.  It is inappropriate to develop in isolation on a
+   private list and surface patches only at merge time.
+
+ - Filesystems that become unmaintained -- where the maintainer stops
+   responding, infrastructure changes go unadapted, and testing becomes
+   impossible -- are candidates for deprecation and eventual removal from
+   the kernel.
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index fc7254d01a2b2..1f71cf1595476 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -43,6 +43,7 @@ algorithms work.
    caching/index
 
    porting
+   adding-new-filesystems
 
 Filesystem support layers
 =========================
diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst
index fdf074429cd3a..f546b1d3897fa 100644
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@ -1297,7 +1297,6 @@ Several functions are renamed:
 -  kern_path_locked -> start_removing_path
 -  kern_path_create -> start_creating_path
 -  user_path_create -> start_creating_user_path
--  user_path_locked_at -> start_removing_user_path_at
 -  done_path_create -> end_creating_path
 
 ---