| author | Mark Brown <broonie@kernel.org> | 2026-05-29 18:09:32 +0100 |
|---|---|---|
| committer | Mark Brown <broonie@kernel.org> | 2026-05-29 18:09:32 +0100 |
| commit | 5f74287f42d47e9acdc9a987518387125f046527 (patch) | |
| tree | a8c7e4a5ad67952a170f269f99418a4ab50a5318 /Documentation | |
| parent | 96c3d0c2555e1b97c57348e83702f3b56b8df9d3 (diff) | |
| parent | 982071afc4e24a052d84132ffbf4340856924c28 (diff) | |
| download | linux-next-history-5f74287f42d47e9acdc9a987518387125f046527.tar.gz | |
Merge branch 'fs-next' of linux-next
# Conflicts:
# fs/btrfs/defrag.c
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/ABI/testing/sysfs-fs-f2fs | 5 |
| -rw-r--r-- | Documentation/filesystems/9p.rst | 10 |
| -rw-r--r-- | Documentation/filesystems/adding-new-filesystems.rst | 195 |
| -rw-r--r-- | Documentation/filesystems/f2fs.rst | 9 |
| -rw-r--r-- | Documentation/filesystems/index.rst | 1 |
| -rw-r--r-- | Documentation/filesystems/porting.rst | 1 |
| -rw-r--r-- | Documentation/filesystems/proc.rst | 19 |
| -rw-r--r-- | Documentation/netlink/specs/nfsd.yaml | 290 |
| -rw-r--r-- | Documentation/netlink/specs/sunrpc_cache.yaml | 149 |
9 files changed, 671 insertions, 8 deletions
diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs index 423ec40e2e4e2..1b58c029abd0d 100644 --- a/Documentation/ABI/testing/sysfs-fs-f2fs +++ b/Documentation/ABI/testing/sysfs-fs-f2fs @@ -270,7 +270,8 @@ Description: Shows all enabled kernel features. inode_checksum, flexible_inline_xattr, quota_ino, inode_crtime, lost_found, verity, sb_checksum, casefold, readonly, compression, test_dummy_encryption_v2, - atomic_write, pin_file, encrypted_casefold, linear_lookup. + atomic_write, pin_file, encrypted_casefold, linear_lookup, + fserror. What: /sys/fs/f2fs/<disk>/inject_rate Date: May 2016 @@ -1000,4 +1001,4 @@ Contact: "Chao Yu" <chao@kernel.org> Description: It can be used to tune priority of f2fs critical task, e.g. f2fs_ckpt, f2fs_gc threads, limitation as below: - it requires user has CAP_SYS_NICE capability. - - the range is [100, 139], by default the value is 100. + - the range is [100, 139], by default the value is 120. diff --git a/Documentation/filesystems/9p.rst b/Documentation/filesystems/9p.rst index be3504ca034a8..3f65db648db06 100644 --- a/Documentation/filesystems/9p.rst +++ b/Documentation/filesystems/9p.rst @@ -23,13 +23,10 @@ the 9p client is available in the form of a USENIX paper: Other applications are described in the following papers: * XCPU & Clustering - http://xcpu.org/papers/xcpu-talk.pdf * KVMFS: control file system for KVM - http://xcpu.org/papers/kvmfs.pdf * CellFS: A New Programming Model for the Cell BE - http://xcpu.org/papers/cellfs-talk.pdf * PROSE I/O: Using 9p to enable Application Partitions - http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf + http://web.archive.org/web/20110101152020/http://plan9.escet.urjc.es/iwp9/cready/PROSE_iwp9_2006.pdf * VirtFS: A Virtualization Aware File System pass-through https://kernel.org/doc/ols/2010/ols2010-pages-109-120.pdf @@ -238,6 +235,11 @@ Options cachetag cache tag to use the specified persistent cache. 
cache tags for existing cache sessions can be listed at /sys/fs/9p/caches. (applies only to cache=fscache) + + negtimeout the duration (in milliseconds) that negative dentries (paths + that do not actually exist) are retained in the cache. If + set to a negative value, those entries are kept indefinitely + until evicted by the buffer cache management system ============= =============================================================== Behavior diff --git a/Documentation/filesystems/adding-new-filesystems.rst b/Documentation/filesystems/adding-new-filesystems.rst new file mode 100644 index 0000000000000..a3d0bf16f73a0 --- /dev/null +++ b/Documentation/filesystems/adding-new-filesystems.rst @@ -0,0 +1,195 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. _adding_new_filesystems: + +Adding New Filesystems +====================== + +This document describes what is involved in adding a new filesystem to the +Linux kernel. + +Every filesystem merged into the kernel becomes the collective responsibility +of the VFS maintainers and the wider filesystem development community. +Experience has shown that filesystems which become unmaintained impose a +significant and ongoing burden: they are hard or impossible to test, they +block infrastructure changes because someone must update or preserve old APIs +for code that nobody is actively looking after, and they accumulate unfixed +bugs. The requirements and expectations described here are informed by this +experience and are intended to ensure that new filesystems enter the kernel +on a sustainable footing. + + +Do You Need a New In-Kernel Filesystem? +--------------------------------------- + +Before proposing a new in-kernel filesystem, consider whether one of the +alternatives might be more appropriate. + + - If an existing in-kernel filesystem covers the same use case, improving it + is generally preferred over adding a new implementation. The kernel + community favors incremental improvement over parallel implementations. 
+ + - If the filesystem serves a niche audience or has a small user base, a FUSE + (Filesystem in Userspace) implementation may be a better fit. FUSE + filesystems avoid the long-term kernel maintenance commitment and can be + developed and released on their own schedule. + + - If kernel-level performance, reliability, or integration is genuinely + required, make the case explicitly. Explain who the users are, what the + use case is, and why a FUSE implementation would not be sufficient. + + +Technical Requirements +---------------------- + +New filesystems must use current kernel interfaces and practices. +Submitting a filesystem built on outdated APIs creates an unacceptable +maintenance debt and is likely to face pushback during review. + +Use modern VFS interfaces + Do not use interfaces listed in + :ref:`Documentation/process/deprecated.rst <deprecated>`. + + Use folios rather than raw page operations for page cache management and + iomap rather than buffer heads for block mapping and I/O. See + ``Documentation/filesystems/iomap/index.rst`` for iomap documentation. + + Block-based filesystems that need functionality not currently provided by + iomap should be prepared to explain why adding that functionality to iomap + is infeasible, rather than reimplementing their own block mapping layer. + + Network filesystems should consider using the netfs library + (``Documentation/filesystems/netfs_library.rst``), or be prepared to explain + why it is not a good fit. + +Provide userspace utilities + A ``mkfs`` tool is expected so that the filesystem can be created and used + by testers and users. A ``fsck`` tool is strongly recommended; while not + strictly required for every filesystem type, the ability to verify + consistency and repair corruption is an important part of a mature + filesystem. + +Be testable + The filesystem must be testable in a meaningful way. 
The + `fstests <https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git>`_ + framework (also known as xfstests) is the standard testing infrastructure + for Linux filesystems and its use is highly recommended. At a minimum, + there must be a credible and documented way to test the filesystem and + detect regressions. When submitting, include a summary of test results + indicating which tests pass, fail, or are not applicable. + +Provide documentation + A documentation file under ``Documentation/filesystems/`` describing the + filesystem, its on-disk format, mount options, and any notable design + decisions is recommended. + + +Community and Maintainership Expectations +----------------------------------------- + +Merging a filesystem is a long-term commitment. The kernel community +needs confidence that the filesystem will be actively maintained after it +is merged. + +Identified maintainers + The submission must include a ``MAINTAINERS`` entry with at least one + maintainer (``M:``), a mailing list (``L:``), and a git tree (``T:``). + Having two or more maintainers is strongly preferred so that coverage + does not depend on a single person. The maintainers are expected to be + the primary points of contact for the filesystem going forward. + +Demonstrated commitment + A track record of maintaining kernel code -- for example, in other + subsystems -- significantly strengthens the case for a new filesystem. + Maintainers who are already known and trusted within the community face + less friction during review. + +Sustained backing + Major filesystems in Linux have organizational or corporate support behind + their development. Filesystems that depend entirely on volunteer effort + face higher scrutiny about their long-term viability. + +Responsiveness + The maintainer is expected to respond to bug reports, address review + feedback, and adapt the filesystem to VFS infrastructure changes such as + folio conversions, iomap migration, and mount API updates. 
Unresponsive + maintainership is one of the primary reasons filesystems end up on the + path to deprecation. + +User base + Clearly describe who the users of this filesystem are and the scale of the + user base. Filesystems with a very small or unclear user base face a + harder path to acceptance and a higher risk of future deprecation. + +Building your track record + A practical way to demonstrate many of the qualities above is to maintain + the filesystem out-of-tree for a period before requesting a merge. This + shows sustained commitment, builds a visible user base, and gives reviewers + confidence that the code and its maintainer will persist after merging. + That said, it is recognized that for some filesystems the user base grows + significantly only after upstreaming, so a compelling case for expected + adoption can substitute for a large existing user base. + + +Submission Process +------------------ + +This section covers what is specific to filesystem submissions, over and +above the normal submission advice in +:ref:`Documentation/process/submitting-patches.rst <submittingpatches>` and +:ref:`Documentation/process/submit-checklist.rst <submitchecklist>`. + + - Send patches to the linux-fsdevel mailing list + (``linux-fsdevel@vger.kernel.org``). CC the relevant VFS maintainers as + listed in the ``MAINTAINERS`` file under + ``FILESYSTEMS (VFS and infrastructure)``. + + - Structure the submission logically. It is neither acceptable to send one + large patch containing the entire filesystem, nor is a replay of the full + development history helpful to reviewers. Instead, split the series by + topic -- for example: superblock and mount handling, inode operations, + directory operations, address space operations, and so on -- so that each + patch is reviewable in isolation. + + - Separate any filesystem-specific ioctls into their own patches with + dedicated justification. 
Interfaces beyond those already common across + other filesystems will receive additional scrutiny because they are hard + to maintain and may conflict with future generic interfaces. + + - Expect thorough review. Filesystem code interacts deeply with the VFS, + memory management, and block layers, so reviewers will examine the code + carefully. Address all review feedback and be prepared for multiple + revision cycles. + + - It may be appropriate to mark the filesystem as experimental in its Kconfig + help text for the first few releases to set expectations while the code + stabilizes in-tree. + + +Ongoing Obligations +------------------- + +Merging is not the finish line. Maintaining a filesystem in the kernel is an +ongoing commitment. + + - Adapt to VFS infrastructure changes. The VFS layer evolves continuously; + maintainers are expected to keep up with conversions such as folio + migration, iomap adoption, and mount API updates. + + - Maintain test coverage. As test suites evolve, the filesystem's test + results should be kept current. + + - Handle security issues and regression promptly. Both those reported + by ordinary users and those reported by test bots and fuzzing tools. + The filesystem must handle corrupted input gracefully without corrupting + memory, hanging, or crashing the kernel. + + - Engage with the wider filesystem community. Participate on linux-fsdevel, + share approaches to common problems, and look for opportunities to reuse + shared infrastructure. It is inappropriate to develop in isolation on a + private list and surface patches only at merge time. + + - Filesystems that become unmaintained -- where the maintainer stops + responding, infrastructure changes go unadapted, and testing becomes + impossible -- are candidates for deprecation and eventual removal from + the kernel. 
diff --git a/Documentation/filesystems/f2fs.rst b/Documentation/filesystems/f2fs.rst index 7e40316312867..8c4a14ae444f4 100644 --- a/Documentation/filesystems/f2fs.rst +++ b/Documentation/filesystems/f2fs.rst @@ -137,6 +137,15 @@ noacl Disable POSIX Access Control List. Note: acl is enabled active_logs=%u Support configuring the number of active logs. In the current design, f2fs supports only 2, 4, and 6 logs. Default number is 6. + When the underlying block device exposes write + streams, the default active_logs=6 configuration + maps hot, warm, and cold DATA writes to streams 1, + 2, and 3, respectively. If only one or two write + streams are available, f2fs falls back to mapping + all DATA writes to stream 1 or mapping hot/warm + to stream 1 and cold to stream 2. If no write + streams are exposed, f2fs leaves the stream + unset. disable_ext_identify Disable the extension list configured by mkfs, so f2fs is not aware of cold files such as media files. inline_xattr Enable the inline xattrs feature. diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst index fc7254d01a2b2..1f71cf1595476 100644 --- a/Documentation/filesystems/index.rst +++ b/Documentation/filesystems/index.rst @@ -43,6 +43,7 @@ algorithms work. 
caching/index porting + adding-new-filesystems Filesystem support layers ========================= diff --git a/Documentation/filesystems/porting.rst b/Documentation/filesystems/porting.rst index fdf074429cd3a..f546b1d3897fa 100644 --- a/Documentation/filesystems/porting.rst +++ b/Documentation/filesystems/porting.rst @@ -1297,7 +1297,6 @@ Several functions are renamed: - kern_path_locked -> start_removing_path - kern_path_create -> start_creating_path - user_path_create -> start_creating_user_path -- user_path_locked_at -> start_removing_user_path_at - done_path_create -> end_creating_path --- diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst index db6167befb7b2..5006644c1d198 100644 --- a/Documentation/filesystems/proc.rst +++ b/Documentation/filesystems/proc.rst @@ -52,6 +52,7 @@ fixes/update part 1.1 Stefani Seibold <stefani@seibold.net> June 9 2009 4 Configuring procfs 4.1 Mount options + 4.2 Mount restrictions 5 Filesystem behavior @@ -2425,7 +2426,9 @@ prohibited by hidepid=. If you use some daemon like identd which needs to learn information about processes information, just add identd to this group. subset=pid hides all top level files and directories in the procfs that -are not related to tasks. +are not related to tasks. This option cannot be changed on an existing +procfs instance because overmounts that existed before the change could +otherwise remain reachable after the top level procfs entries are hidden. pidns= specifies a pid namespace (either as a string path to something like `/proc/$pid/ns/pid`, or a file descriptor when using `FSCONFIG_SET_FD`) that @@ -2434,6 +2437,20 @@ will use the calling process's active pid namespace. Note that the pid namespace of an existing procfs instance cannot be modified (attempting to do so will give an `-EBUSY` error). 
+4.2 Mount restrictions +-------------------------- + +If user namespaces are in use, the kernel additionally checks the instances of +procfs available to the mounter and will not allow procfs to be mounted if: + + 1. This mount is not fully visible unless the new procfs is going to be + mounted with subset=pid option. + + a. Its root directory is not the root directory of the filesystem. + b. If any file or non-empty procfs directory is hidden by another mount. + + 2. A new mount overrides the readonly option or any option from atime family. + Chapter 5: Filesystem behavior ============================== diff --git a/Documentation/netlink/specs/nfsd.yaml b/Documentation/netlink/specs/nfsd.yaml index 8ab43c8253b2e..8f36fadd68f75 100644 --- a/Documentation/netlink/specs/nfsd.yaml +++ b/Documentation/netlink/specs/nfsd.yaml @@ -6,8 +6,52 @@ uapi-header: linux/nfsd_netlink.h doc: NFSD configuration over generic netlink. +definitions: + - + type: flags + name: cache-type + entries: [svc_export, expkey] + - + type: flags + name: export-flags + doc: These flags are ordered to match the NFSEXP_* flags in include/linux/nfsd/export.h + entries: + - readonly + - insecure-port + - rootsquash + - allsquash + - async + - gathered-writes + - noreaddirplus + - security-label + - sign-fh + - nohide + - nosubtreecheck + - noauthnlm + - msnfs + - fsid + - crossmount + - noacl + - v4root + - pnfs + - + type: flags + name: xprtsec-mode + doc: These flags are ordered to match the NFSEXP_XPRTSEC_* flags in include/linux/nfsd/export.h + entries: + - none + - tls + - mtls + attribute-sets: - + name: cache-notify + attributes: + - + name: cache-type + type: u32 + enum: cache-type + - name: rpc-status attributes: - @@ -132,6 +176,160 @@ attribute-sets: - name: npools type: u32 + - + name: fslocation + attributes: + - + name: host + type: string + - + name: path + type: string + - + name: fslocations + attributes: + - + name: location + type: nest + nested-attributes: fslocation + 
multi-attr: true + - + name: auth-flavor + attributes: + - + name: pseudoflavor + type: u32 + - + name: flags + type: u32 + enum: export-flags + enum-as-flags: true + - + name: svc-export + attributes: + - + name: seqno + type: u64 + - + name: client + type: string + - + name: path + type: string + - + name: negative + type: flag + - + name: expiry + type: u64 + - + name: anon-uid + type: u32 + - + name: anon-gid + type: u32 + - + name: fslocations + type: nest + nested-attributes: fslocations + - + name: uuid + type: binary + - + name: secinfo + type: nest + nested-attributes: auth-flavor + multi-attr: true + - + name: xprtsec + type: u32 + enum: xprtsec-mode + multi-attr: true + - + name: flags + type: u32 + enum: export-flags + enum-as-flags: true + - + name: fsid + type: s32 + - + name: svc-export-reqs + attributes: + - + name: requests + type: nest + nested-attributes: svc-export + multi-attr: true + - + name: expkey + attributes: + - + name: seqno + type: u64 + - + name: client + type: string + - + name: fsidtype + type: u8 + - + name: fsid + type: binary + - + name: negative + type: flag + - + name: expiry + type: u64 + - + name: path + type: string + - + name: expkey-reqs + attributes: + - + name: requests + type: nest + nested-attributes: expkey + multi-attr: true + - + name: cache-flush + attributes: + - + name: mask + type: u32 + enum: cache-type + enum-as-flags: true + - + name: unlock-ip + attributes: + - + name: address + type: binary + doc: struct sockaddr_in or struct sockaddr_in6. + checks: + min-len: 16 + - + name: unlock-filesystem + attributes: + - + name: path + type: string + doc: Filesystem path whose state should be released. + - + name: unlock-export + attributes: + - + name: path + type: string + doc: >- + Export path whose NFSv4 state should be revoked. + All state (opens, locks, delegations, layouts) acquired + through any export of this path is revoked, regardless + of which client holds the state. 
Intended for use after + all clients have been unexported from a given path, + enabling the underlying filesystem to be unmounted. operations: list: @@ -233,3 +431,95 @@ operations: attributes: - mode - npools + - + name: cache-notify + doc: Notification that there are cache requests that need servicing + attribute-set: cache-notify + mcgrp: exportd + event: + attributes: + - cache-type + - + name: svc-export-get-reqs + doc: Dump all pending svc_export requests + attribute-set: svc-export-reqs + flags: [admin-perm] + dump: + reply: + attributes: + - requests + - + name: svc-export-set-reqs + doc: Respond to one or more svc_export requests + attribute-set: svc-export-reqs + flags: [admin-perm] + do: + request: + attributes: + - requests + - + name: expkey-get-reqs + doc: Dump all pending expkey requests + attribute-set: expkey-reqs + flags: [admin-perm] + dump: + reply: + attributes: + - requests + - + name: expkey-set-reqs + doc: Respond to one or more expkey requests + attribute-set: expkey-reqs + flags: [admin-perm] + do: + request: + attributes: + - requests + - + name: cache-flush + doc: Flush nfsd caches (svc_export and/or expkey) + attribute-set: cache-flush + flags: [admin-perm] + do: + request: + attributes: + - mask + - + name: unlock-ip + doc: release NLM locks held by an IP address + attribute-set: unlock-ip + flags: [admin-perm] + do: + request: + attributes: + - address + - + name: unlock-filesystem + doc: revoke NFS state under a filesystem path + attribute-set: unlock-filesystem + flags: [admin-perm] + do: + request: + attributes: + - path + - + name: unlock-export + doc: >- + Revoke NFSv4 state acquired through exports of a given path. + Unlike unlock-filesystem, which operates at superblock granularity, + this command targets only state associated with a specific export + path. Userspace (exportfs -u) sends this after removing the last + client for a path so the underlying filesystem can be unmounted. 
+ attribute-set: unlock-export + flags: [admin-perm] + do: + request: + attributes: + - path + +mcast-groups: + list: + - + name: none + - + name: exportd diff --git a/Documentation/netlink/specs/sunrpc_cache.yaml b/Documentation/netlink/specs/sunrpc_cache.yaml new file mode 100644 index 0000000000000..f22ff22b9418f --- /dev/null +++ b/Documentation/netlink/specs/sunrpc_cache.yaml @@ -0,0 +1,149 @@ +# SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) +--- +name: sunrpc +protocol: genetlink +uapi-header: linux/sunrpc_netlink.h + +doc: SUNRPC cache upcall support over generic netlink. + +definitions: + - + type: flags + name: cache-type + entries: [ip_map, unix_gid] + +attribute-sets: + - + name: cache-notify + attributes: + - + name: cache-type + type: u32 + enum: cache-type + - + name: ip-map + attributes: + - + name: seqno + type: u64 + - + name: class + type: string + - + name: addr + type: string + - + name: domain + type: string + - + name: negative + type: flag + - + name: expiry + type: u64 + - + name: ip-map-reqs + attributes: + - + name: requests + type: nest + nested-attributes: ip-map + multi-attr: true + - + name: unix-gid + attributes: + - + name: seqno + type: u64 + - + name: uid + type: u32 + - + name: gids + type: u32 + multi-attr: true + - + name: negative + type: flag + - + name: expiry + type: u64 + - + name: unix-gid-reqs + attributes: + - + name: requests + type: nest + nested-attributes: unix-gid + multi-attr: true + - + name: cache-flush + attributes: + - + name: mask + type: u32 + enum: cache-type + enum-as-flags: true + +operations: + list: + - + name: cache-notify + doc: Notification that there are cache requests that need servicing + attribute-set: cache-notify + mcgrp: exportd + event: + attributes: + - cache-type + - + name: ip-map-get-reqs + doc: Dump all pending ip_map requests + attribute-set: ip-map-reqs + flags: [admin-perm] + dump: + reply: + attributes: + - requests + - + name: ip-map-set-reqs + doc: 
Respond to one or more ip_map requests + attribute-set: ip-map-reqs + flags: [admin-perm] + do: + request: + attributes: + - requests + - + name: unix-gid-get-reqs + doc: Dump all pending unix_gid requests + attribute-set: unix-gid-reqs + flags: [admin-perm] + dump: + reply: + attributes: + - requests + - + name: unix-gid-set-reqs + doc: Respond to one or more unix_gid requests + attribute-set: unix-gid-reqs + flags: [admin-perm] + do: + request: + attributes: + - requests + - + name: cache-flush + doc: Flush sunrpc caches (ip_map and/or unix_gid) + attribute-set: cache-flush + flags: [admin-perm] + do: + request: + attributes: + - mask + +mcast-groups: + list: + - + name: none + - + name: exportd |
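
The write-stream fallback described in the f2fs.rst hunk above (default active_logs=6) can be sketched as a small lookup. This is only an illustration of the documented mapping, not the kernel implementation: the function name `data_write_stream` and its parameters are hypothetical, and treating three or more exposed streams as the full hot/warm/cold case is an assumption.

```python
from typing import Optional

# Temperatures of f2fs DATA logs named in the documentation hunk.
TEMPERATURES = ("hot", "warm", "cold")


def data_write_stream(temperature: str, available_streams: int) -> Optional[int]:
    """Return the write stream for a DATA write, or None to leave it unset.

    Hypothetical helper: models only the documented mapping for the
    default active_logs=6 configuration.
    """
    if temperature not in TEMPERATURES:
        raise ValueError(f"unknown temperature: {temperature!r}")
    if available_streams <= 0:
        return None  # no write streams exposed: the stream is left unset
    if available_streams == 1:
        return 1  # all DATA writes share stream 1
    if available_streams == 2:
        # hot and warm share stream 1, cold goes to stream 2
        return 2 if temperature == "cold" else 1
    # assumed: three or more streams give hot, warm, cold -> 1, 2, 3
    return TEMPERATURES.index(temperature) + 1
```

Calling `data_write_stream("warm", 2)` under these assumptions gives stream 1, matching the "hot/warm to stream 1 and cold to stream 2" fallback in the text.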
