| Age | Commit message (Collapse) | Author | Files | Lines |
|
Pull ARM updates from Russell King:
- fix a race condition handling PG_dcache_clean
- further cleanups for the fault handling, allowing RT to be enabled
- fixing nzones validation in adfs filesystem driver
- fix for module unwinding
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9463/1: Allow to enable RT
ARM: 9472/1: fix race condition on PG_dcache_clean in __sync_icache_dcache()
ARM: 9471/1: module: fix unwind section relocation out of range error
fs/adfs: validate nzones in adfs_validate_bblk()
ARM: provide individual is_translation_fault() and is_permission_fault()
ARM: move FSR fault status definitions before fsr_fs()
ARM: use BIT() and GENMASK() for fault status register fields
ARM: move is_permission_fault() and is_translation_fault() to fault.h
ARM: move vmalloc() lazy-page table population
ARM: ensure interrupts are enabled in __do_user_fault()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"Features:
- coredump: add tracepoint for coredump events
- fs: hide file and bfile caches behind runtime const machinery
Fixes:
- fix architecture-specific compat_ftruncate64 implementations
- dcache: Limit the minimal number of bucket to two
- fs/omfs: reject s_sys_blocksize smaller than OMFS_DIR_START
- fs/mbcache: cancel shrink work before destroying the cache
- dcache: permit dynamic_dname()s up to NAME_MAX
Cleanups:
- remove or unexport unused fs_context infrastructure
- trivial ->setattr cleanups
- selftests/filesystems: Assume that TIOCGPTPEER is defined
- writeback: fix kernel-doc function name mismatch for wb_put_many()
- autofs: replace manual symlink buffer allocation in autofs_dir_symlink
- init/initramfs.c: trivial fix: FSM -> Finite-state machine
- fs: remove stale and duplicate forward declarations
- readdir: Introduce dirent_size()
- fs: Replace user_access_{begin/end} by scoped user access
- kernel: acct: fix duplicate word in comment
- fs: write a better comment in step_into() concerning .mnt assignment
- fs: attr: fix comment formatting and spelling issues"
* tag 'vfs-7.1-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits)
dcache: permit dynamic_dname()s up to NAME_MAX
fs: attr: fix comment formatting and spelling issues
fs: hide file and bfile caches behind runtime const machinery
fs: write a better comment in step_into() concerning .mnt assignment
proc: rename proc_notify_change to proc_setattr
proc: rename proc_setattr to proc_nochmod_setattr
affs: rename affs_notify_change to affs_setattr
adfs: rename adfs_notify_change to adfs_setattr
hfs: update comments on hfs_inode_setattr
kernel: acct: fix duplicate word in comment
fs: Replace user_access_{begin/end} by scoped user access
readdir: Introduce dirent_size()
coredump: add tracepoint for coredump events
fs: remove do_sys_truncate
fs: pass on FTRUNCATE_* flags to do_truncate
fs: fix archiecture-specific compat_ftruncate64
fs: remove stale and duplicate forward declarations
init/initramfs.c: trivial fix: FSM -> Finite-state machine
autofs: replace manual symlink buffer allocation in autofs_dir_symlink
fs/mbcache: cancel shrink work before destroying the cache
...
|
|
Reject ADFS disc records with a zero zone count during boot block
validation, before the disc record is used.
When nzones is 0, adfs_read_map() passes it to kmalloc_array(0, ...)
which returns ZERO_SIZE_PTR, and adfs_map_layout() then writes to
dm[-1], causing an out-of-bounds write before the allocated buffer.
adfs_validate_dr0() already rejects nzones != 1 for old-format
images. Add the equivalent check to adfs_validate_bblk() for
new-format images so that a crafted image with nzones == 0 is
rejected at probe time.
Found by syzkaller.
Fixes: f6f14a0d71b0 ("fs/adfs: map: move map-specific sb initialisation to map.c")
Signed-off-by: Bae Yeonju <iwasbaeyz@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
Make the function name match the method that it implements.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://patch.msgid.link/20260325063711.3298685-3-hch@lst.de
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The implementation is now really basic so rename generic_file_fsync()
simple_fsync() and __generic_file_fsync() to simple_fsync_noflush().
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260326095354.16340-56-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Remove <linux/hex.h> from <linux/kernel.h> and update all users/callers of
hex.h interfaces to directly #include <linux/hex.h> as part of the process
of putting kernel.h on a diet.
Removing hex.h from kernel.h means that 36K C source files don't have to
pay the price of parsing hex.h for the roughly 120 C source files that
need it.
This change has been build-tested with allmodconfig on most ARCHes. Also,
all users/callers of <linux/hex.h> in the entire source tree have been
updated if needed (if not already #included).
Link: https://lkml.kernel.org/r/20251215005206.2362276-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull mmap_prepare updates from Christian Brauner:
"Last cycle we introduce f_op->mmap_prepare() in c84bf6dd2b83 ("mm:
introduce new .mmap_prepare() file callback").
This is preferred to the existing f_op->mmap() hook as it does require
a VMA to be established yet, thus allowing the mmap logic to invoke
this hook far, far earlier, prior to inserting a VMA into the virtual
address space, or performing any other heavy handed operations.
This allows for much simpler unwinding on error, and for there to be a
single attempt at merging a VMA rather than having to possibly
reattempt a merge based on potentially altered VMA state.
Far more importantly, it prevents inappropriate manipulation of
incompletely initialised VMA state, which is something that has been
the cause of bugs and complexity in the past.
The intent is to gradually deprecate f_op->mmap, and in that vein this
series coverts the majority of file systems to using f_op->mmap_prepare.
Prerequisite steps are taken - firstly ensuring all checks for mmap
capabilities use the file_has_valid_mmap_hooks() helper rather than
directly checking for f_op->mmap (which is now not a valid check) and
secondly updating daxdev_mapping_supported() to not require a VMA
parameter to allow ext4 and xfs to be converted.
Commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for
nested file systems") handles the nasty edge-case of nested file
systems like overlayfs, which introduces a compatibility shim to allow
f_op->mmap_prepare() to be invoked from an f_op->mmap() callback.
This allows for nested filesystems to continue to function correctly
with all file systems regardless of which callback is used. Once we
finally convert all file systems, this shim can be removed.
As a result, ecryptfs, fuse, and overlayfs remain unaltered so they
can nest all other file systems.
We additionally do not update resctl - as this requires an update to
remap_pfn_range() (or an alternative to it) which we defer to a later
series, equally we do not update cramfs which needs a mixed mapping
insertion with the same issue, nor do we update procfs, hugetlbfs,
syfs or kernfs all of which require VMAs for internal state and hooks.
We shall return to all of these later"
* tag 'vfs-6.17-rc1.mmap_prepare' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
doc: update porting, vfs documentation to describe mmap_prepare()
fs: replace mmap hook with .mmap_prepare for simple mappings
fs: convert most other generic_file_*mmap() users to .mmap_prepare()
fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
mm/filemap: introduce generic_file_*_mmap_prepare() helpers
fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
fs/dax: make it possible to check dev dax support without a VMA
fs: consistently use can_mmap_file() helper
mm/nommu: use file_has_valid_mmap_hooks() helper
mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc VFS updates from Christian Brauner:
"This contains the usual selections of misc updates for this cycle.
Features:
- Add ext4 IOCB_DONTCACHE support
This refactors the address_space_operations write_begin() and
write_end() callbacks to take const struct kiocb * as their first
argument, allowing IOCB flags such as IOCB_DONTCACHE to propagate
to the filesystem's buffered I/O path.
Ext4 is updated to implement handling of the IOCB_DONTCACHE flag
and advertises support via the FOP_DONTCACHE file operation flag.
Additionally, the i915 driver's shmem write paths are updated to
bypass the legacy write_begin/write_end interface in favor of
directly calling write_iter() with a constructed synchronous kiocb.
Another i915 change replaces a manual write loop with
kernel_write() during GEM shmem object creation.
Cleanups:
- don't duplicate vfs_open() in kernel_file_open()
- proc_fd_getattr(): don't bother with S_ISDIR() check
- fs/ecryptfs: replace snprintf with sysfs_emit in show function
- vfs: Remove unnecessary list_for_each_entry_safe() from
evict_inodes()
- filelock: add new locks_wake_up_waiter() helper
- fs: Remove three arguments from block_write_end()
- VFS: change old_dir and new_dir in struct renamedata to dentrys
- netfs: Remove unused declaration netfs_queue_write_request()
Fixes:
- eventpoll: Fix semi-unbounded recursion
- eventpoll: fix sphinx documentation build warning
- fs/read_write: Fix spelling typo
- fs: annotate data race between poll_schedule_timeout() and
pollwake()
- fs/pipe: set FMODE_NOWAIT in create_pipe_files()
- docs/vfs: update references to i_mutex to i_rwsem
- fs/buffer: remove comment about hard sectorsize
- fs/buffer: remove the min and max limit checks in __getblk_slow()
- fs/libfs: don't assume blocksize <= PAGE_SIZE in
generic_check_addressable
- fs_context: fix parameter name in infofc() macro
- fs: Prevent file descriptor table allocations exceeding INT_MAX"
* tag 'vfs-6.17-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (24 commits)
netfs: Remove unused declaration netfs_queue_write_request()
eventpoll: fix sphinx documentation build warning
ext4: support uncached buffered I/O
mm/pagemap: add write_begin_get_folio() helper function
fs: change write_begin/write_end interface to take struct kiocb *
drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter
drm/i915: Use kernel_write() in shmem object create
eventpoll: Fix semi-unbounded recursion
vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes()
fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable
fs/buffer: remove the min and max limit checks in __getblk_slow()
fs: Prevent file descriptor table allocations exceeding INT_MAX
fs: Remove three arguments from block_write_end()
fs/ecryptfs: replace snprintf with sysfs_emit in show function
fs: annotate suspected data race between poll_schedule_timeout() and pollwake()
docs/vfs: update references to i_mutex to i_rwsem
fs/buffer: remove comment about hard sectorsize
fs_context: fix parameter name in infofc() macro
VFS: change old_dir and new_dir in struct renamedata to dentrys
proc_fd_getattr(): don't bother with S_ISDIR() check
...
|
|
Change the address_space_operations callbacks write_begin() and
write_end() to take struct kiocb * as the first argument instead of
struct file *.
Update all affected function prototypes, implementations, call sites,
and related documentation across VFS, filesystems, and block layer.
Part of a series refactoring address_space_operations write_begin and
write_end callbacks to use struct kiocb for passing write context and
flags.
Signed-off-by: Taotao Chen <chentaotao@didiglobal.com>
Link: https://lore.kernel.org/20250716093559.217344-4-chentaotao@didiglobal.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file
callback"), the f_op->mmap() hook has been deprecated in favour of
f_op->mmap_prepare().
We have provided generic .mmap_prepare() equivalents, so update all file
systems that specify these directly in their file_operations structures.
This updates 9p, adfs, affs, bfs, fat, hfs, hfsplus, hostfs, hpfs, jffs2,
jfs, minix, omfs, ramfs and ufs file systems directly.
It updates generic_ro_fops which impacts qnx4, cramfs, befs, squashfs,
frebxfs, qnx6, efs, romfs, erofs and isofs file systems.
There are remaining file systems which use generic hooks in a less direct
way which we address in a subsequent commit.
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Link: https://lore.kernel.org/c7dc90e44a9e75e750939ea369290d6e441a18e6.1750099179.git.lorenzo.stoakes@oracle.com
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
... to be used instead of manually assigning to ->s_d_op.
All in-tree filesystem converted (and field itself is renamed,
so any out-of-tree ones in need of conversion will be caught
by compiler).
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Eric Sandeen <sandeen@redhat.com> says:
These were all tested against images I created or obtained, using a
script to test random combinations of valid and invalid mount and
remount options, and comparing the results before and after the changes.
AFAICT, all parsing works as expected and behavior is unchanged.
* patches from https://lore.kernel.org/r/20240916172735.866916-1-sandeen@redhat.com:
hfsplus: convert hfsplus to use the new mount api
hfs: convert hfs to use the new mount api
befs: convert befs to use the new mount api
affs: convert affs to use the new mount api
adfs: convert adfs to use the new mount api
Link: https://lore.kernel.org/r/20240916172735.866916-1-sandeen@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.
auto-generated by the following:
for i in `git grep -l -w asm/unaligned.h`; do
sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
|
|
Convert the adfs filesystem to use the new mount API.
Tested by comparing random mount & remount options before and after
the change.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Link: https://lore.kernel.org/r/20240916172735.866916-2-sandeen@redhat.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Convert all callers from working on a page to working on one page
of a folio (support for working on an entire folio can come later).
Removes a lot of folio->page->folio conversions.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Fix the 'make W=1' issue:
WARNING: modpost: missing MODULE_DESCRIPTION() in fs/adfs/adfs.o
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240523-md-adfs-v1-1-364268e38370@quicinc.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Yes, yes, I know the slab people were planning on going slow and letting
every subsystem fight this thing on their own. But let's just rip off
the band-aid and get it over and done with. I don't want to see a
number of unnecessary pull requests just to get rid of a flag that no
longer has any meaning.
This was mainly done with a couple of 'sed' scripts and then some manual
cleanup of the end result.
Link: https://lore.kernel.org/all/CAHk-=wji0u+OOtmAOD-5JV3SXcRJF___k_+8XNKmak0yd5vW1Q@mail.gmail.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
If the filesystem implements migrate_folio and writepages, there is no
need for a writepage implementation.
Link: https://lkml.kernel.org/r/20231215200245.748418-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert to using the new inode timestamp accessor functions.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20231004185347.80880-14-jlayton@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Pull block updates from Jens Axboe:
"Pretty quiet round for this release. This contains:
- Add support for zoned storage to ublk (Andreas, Ming)
- Series improving performance for drivers that mark themselves as
needing a blocking context for issue (Bart)
- Cleanup the flush logic (Chengming)
- sed opal keyring support (Greg)
- Fixes and improvements to the integrity support (Jinyoung)
- Add some exports for bcachefs that we can hopefully delete again in
the future (Kent)
- deadline throttling fix (Zhiguo)
- Series allowing building the kernel without buffer_head support
(Christoph)
- Sanitize the bio page adding flow (Christoph)
- Write back cache fixes (Christoph)
- MD updates via Song:
- Fix perf regression for raid0 large sequential writes (Jan)
- Fix split bio iostat for raid0 (David)
- Various raid1 fixes (Heinz, Xueshi)
- raid6test build fixes (WANG)
- Deprecate bitmap file support (Christoph)
- Fix deadlock with md sync thread (Yu)
- Refactor md io accounting (Yu)
- Various non-urgent fixes (Li, Yu, Jack)
- Various fixes and cleanups (Arnd, Azeem, Chengming, Damien, Li,
Ming, Nitesh, Ruan, Tejun, Thomas, Xu)"
* tag 'for-6.6/block-2023-08-28' of git://git.kernel.dk/linux: (113 commits)
block: use strscpy() to instead of strncpy()
block: sed-opal: keyring support for SED keys
block: sed-opal: Implement IOC_OPAL_REVERT_LSP
block: sed-opal: Implement IOC_OPAL_DISCOVERY
blk-mq: prealloc tags when increase tagset nr_hw_queues
blk-mq: delete redundant tagset map update when fallback
blk-mq: fix tags leak when shrink nr_hw_queues
ublk: zoned: support REQ_OP_ZONE_RESET_ALL
md: raid0: account for split bio in iostat accounting
md/raid0: Fix performance regression for large sequential writes
md/raid0: Factor out helper for mapping and submitting a bio
md raid1: allow writebehind to work on any leg device set WriteMostly
md/raid1: hold the barrier until handle_read_error() finishes
md/raid1: free the r1bio before waiting for blocked rdev
md/raid1: call free_r1bio() before allow_barrier() in raid_end_bio_io()
blk-cgroup: Fix NULL deref caused by blkg_policy_data being installed before init
drivers/rnbd: restore sysfs interface to rnbd-client
md/raid5-cache: fix null-ptr-deref for r5l_flush_stripe_to_raid()
raid6: test: only check for Altivec if building on powerpc hosts
raid6: test: make sure all intermediate and artifact files are .gitignored
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
|
|
union adfs_dirtail::new stands in the way if Linux++ project:
"new" can't be used as member's name because it is a keyword in C++.
Link: https://lkml.kernel.org/r/43b0a4c8-a7cf-4ab1-98f7-0f65c096f9e8@p183
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a new config option that controls building the buffer_head code, and
select it from all file systems and stacking drivers that need it.
For the block device nodes and alternative iomap based buffered I/O path
is provided when buffer_head support is not enabled, and iomap needs a
a small tweak to define the IOMAP_F_BUFFER_HEAD flag to 0 to not call
into the buffer_head code when it doesn't exist.
Otherwise this is just Kconfig and ifdef changes.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230801172201.1923299-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In later patches, we're going to change how the inode's ctime field is
used. Switch to using accessor functions instead of raw accesses of
inode->i_ctime.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Message-Id: <20230705190309.579783-20-jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Replace pointers to generic_file_splice_read() with calls to
filemap_splice_read().
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christian Brauner <brauner@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: David Hildenbrand <david@redhat.com>
cc: John Hubbard <jhubbard@nvidia.com>
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20230522135018.2742245-29-dhowells@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two
eliminating the possibility of any bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
This function is NOT converted to handle large folios, so include
an assert that the filesystem isn't passing one in. Otherwise, use
the folio functions instead of the page functions, where they exist.
Convert all filesystems which use block_read_full_page().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
|
|
There are no more aop flags left, so remove the parameter.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
There are no more aop flags left, so remove the parameter.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
|
|
Pull filesystem folio updates from Matthew Wilcox:
"Primarily this series converts some of the address_space operations to
take a folio instead of a page.
Notably:
- a_ops->is_partially_uptodate() takes a folio instead of a page and
changes the type of the 'from' and 'count' arguments to make it
obvious they're bytes.
- a_ops->invalidatepage() becomes ->invalidate_folio() and has a
similar type change.
- a_ops->launder_page() becomes ->launder_folio()
- a_ops->set_page_dirty() becomes ->dirty_folio() and adds the
address_space as an argument.
There are a couple of other misc changes up front that weren't worth
separating into their own pull request"
* tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache: (53 commits)
fs: Remove aops ->set_page_dirty
fb_defio: Use noop_dirty_folio()
fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio
fs: Convert __set_page_dirty_buffers to block_dirty_folio
nilfs: Convert nilfs_set_page_dirty() to nilfs_dirty_folio()
mm: Convert swap_set_page_dirty() to swap_dirty_folio()
ubifs: Convert ubifs_set_page_dirty to ubifs_dirty_folio
f2fs: Convert f2fs_set_node_page_dirty to f2fs_dirty_node_folio
f2fs: Convert f2fs_set_data_page_dirty to f2fs_dirty_data_folio
f2fs: Convert f2fs_set_meta_page_dirty to f2fs_dirty_meta_folio
afs: Convert afs_dir_set_page_dirty() to afs_dir_dirty_folio()
btrfs: Convert extent_range_redirty_for_io() to use folios
fs: Convert trivial uses of __set_page_dirty_nobuffers to filemap_dirty_folio
btrfs: Convert from set_page_dirty to dirty_folio
fscache: Convert fscache_set_page_dirty() to fscache_dirty_folio()
fs: Add aops->dirty_folio
fs: Remove aops->launder_page
orangefs: Convert launder_page to launder_folio
nfs: Convert from launder_page to launder_folio
fuse: Convert from launder_page to launder_folio
...
|
|
The inode allocation is supposed to use alloc_inode_sb(), so convert
kmem_cache_alloc() of all filesystems to alloc_inode_sb().
Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4]
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Alex Shi <alexs@kernel.org>
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kari Argillander <kari.argillander@gmail.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Convert all callers; mostly this is just changing the aops to point
at it, but a few implementations need a little more work.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
Tested-by: David Howells <dhowells@redhat.com> # afs
|
|
Remove special-casing of a NULL invalidatepage, since there is no
more block_invalidatepage.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Tested-by: Mike Marshall <hubcap@omnibond.com> # orangefs
Tested-by: David Howells <dhowells@redhat.com> # afs
|
|
Return value directly instead of taking this in a variable.
Link: https://lkml.kernel.org/r/20211210023211.424609-1-chi.minghao@zte.com.cn
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cm>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove the CONFIG_BLOCK default to __set_page_dirty_buffers and just wire
that method up for the missing instances.
[hch@lst.de: ecryptfs: add a ->set_page_dirty cludge]
Link: https://lkml.kernel.org/r/20210624125250.536369-1-hch@lst.de
Link: https://lkml.kernel.org/r/20210614061512.3966143-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Tyler Hicks <code@tyhicks.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Extend some inode methods with an additional user namespace argument. A
filesystem that is aware of idmapped mounts will receive the user
namespace the mount has been marked with. This can be used for
additional permission checking and also to enable filesystems to
translate between uids and gids if they need to. We have implemented all
relevant helpers in earlier patches.
As requested we simply extend the exisiting inode method instead of
introducing new ones. This is a little more code churn but it's mostly
mechanical and doesnt't leave us with additional inode methods.
Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
When file attributes are changed most filesystems rely on the
setattr_prepare(), setattr_copy(), and notify_change() helpers for
initialization and permission checking. Let them handle idmapped mounts.
If the inode is accessed through an idmapped mount map it into the
mount's user namespace. Afterwards the checks are identical to
non-idmapped mounts. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.
Helpers that perform checks on the ia_uid and ia_gid fields in struct
iattr assume that ia_uid and ia_gid are intended values and have already
been mapped correctly at the userspace-kernelspace boundary as we
already do today. If the initial user namespace is passed nothing
changes so non-idmapped mounts will see identical behavior as before.
Link: https://lore.kernel.org/r/20210121131959.646623-8-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"Assorted stuff all over the place (the largest group here is
Christoph's stat cleanups)"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: remove KSTAT_QUERY_FLAGS
fs: remove vfs_stat_set_lookup_flags
fs: move vfs_fstatat out of line
fs: implement vfs_stat and vfs_lstat in terms of vfs_fstatat
fs: remove vfs_statx_fd
fs: omfs: use kmemdup() rather than kmalloc+memcpy
[PATCH] reduce boilerplate in fsid handling
fs: Remove duplicated flag O_NDELAY occurring twice in VALID_OPEN_FLAGS
selftests: mount: add nosymfollow tests
Add a "nosymfollow" mount option.
|
|
Get rid of boilerplate in most of ->statfs()
instances...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.
[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
|
Move the struct block_device definition together with most of the
block layer definitions, as it has nothing to do with the rest of fs.h.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Some filesystem references got broken by a previous patch
series I submitted. Address those.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> # fs/affs/Kconfig
Link: https://lore.kernel.org/r/57318c53008dbda7f6f4a5a9e5787f4d37e8565a.1586881715.git.mchehab+huawei@kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
|
|
This code accidentally returns success, but it should return the
-EIO error code from adfs_fplus_validate_header().
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Fixes: d79288b4f61b ("fs/adfs: bigdir: calculate and validate directory checkbyte")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Avoid using the inode number as the indirect disc address, even though
these currently have the same value.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add support for ADFS E and E+ floppy image formats, which, unlike their
hard disk variants, do not have a filesystem boot block - they have a
single map zone, with the map fragment stored at sector 0.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Separate the filesystem block probing from the superblock filling so
we can support other ADFS filesystem formats, such as the single-zone
E and E+ floppy image formats which do not have a boot block.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Remove the noisy debug in adfs_dir_update().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
When we have write support enabled, we must not drop inodes before they
have been written back, otherwise we lose updates to the filesystem on
umount. Keep the inodes around unless we are built in read-only mode,
or we are mounted read-only.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Implement big directory entry update support in the same way that we
do for new directories.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
When reading a big directory, calculate the validate the directory
checkbyte to ensure that the directory contents are valid.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Strengthen the directory validation by ensuring that the header fields
contain sensible values that fit inside the directory, and limit the
directory size to 4MB as per RISC OS requirements.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Extract the directory validation from the directory reading function as
we will want to re-use this code.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Factor out the directory entry byte offset calculation.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
After changing a directory, we need to update the sequence numbers and
calculate the new check byte before the directory is scheduled to be
written back to the media. Since this needs to happen for any change
to the directory, move this into a separate method.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
__adfs_dir_put() and adfs_dir_find_entry() are only called from
adfs_f_update(), so move them into this function, removing some
unnecessary entry copying by doing so.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
adfs_dir_read() is only called from adfs_f_read(), so merge it into
that function. As new directories are always 2048 bytes in size,
(which we rely on elsewhere) we can consolidate some of the code.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Check that the lastmask and reserved fields are all zero, as per the
documentation.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We have two locations where we validate the new directory format, so
factor this out to a helper.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add and use pointers in the adfs_dir structure to access the directory
head and tail structures, which will always be contiguous in a buffer.
This allows us to avoid memcpy()ing the data in the new directory code,
making it slightly more efficient.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Rather than using setpos + getnext to iterate through the directory
entries, pass iterate() down to the dir format code to populate the
dirents.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
There is nothing in our readdir (aka iterate) method that relies on
the directory inode being exclusively locked, so switch to using the
iterate_shared() hook rather than iterate().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Get rid of the ifdef, using IS_ENABLED() instead to detect whether the
code should be callable. This allows the compiler to always parse the
following code, reducing the chances of errors being missed.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
When we update a directory, a number of errors may happen. If we failed
to find the entry to update, we can just release the directory buffers
as normal.
However, if we have some other error, we may have partially updated the
buffers, resulting in an invalid directory. In this case, we need to
discard the buffers to avoid writing the contents back to the media, and
later re-read the directory from the media.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use __u8 and pack the structures for on-disk directories.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Update directory locking such that it covers the validation of the
directory, which could fail if another thread is concurrently writing
to the same directory. Since we may sleep, we need to use a rwsem
rather than a rw spinlock.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Provide a helper for marking directory buffers dirty so they get
written back to disk.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add a helper to read a directory using the inode, which we do in two
places.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Both directory formats code the mechanics of fetching the directory
buffers using their own implementations. Consolidate these into one
implementation.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Directories can span multiple buffers, and we currently open-code
memcpy access to these buffers, including dealing with entries that
are split across multiple buffers. Such code exists in both
directory format implementations.
Provide common functions to allow data to be copied from/to the
directory buffers as if they were a contiguous set of buffers, and
use them when accessing directories.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
adfs_fplus_sync() can be used for both directory formats since we now
have a common way to access the buffer heads, so move it into dir.c
and appropriately rename it. Remove the directory-format specific
implementations.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
With the bhs pointer in place, we have no need for separate per-format
free() methods, since a generic version will do. Provide a generic
implementation, remove the format specific implementations and the
method function pointer.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Initialise the dir object before we pass it down to the directory format
specific read handler. This allows us to get rid of the initialisation
inside those handlers.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Rename bh_fplus to bhs in preparation to make some of the directory
handling code sharable between implementations.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
When scanning the map for a fragment id, we need to keep track of the
free space links, so we don't inadvertently believe that the freespace
link is a valid fragment id.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Move map specific superblock initialisation to map.c, rather than
having it spread into super.c.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use find_next_bit_le() to find the end of a fragment in the map rather
than open-coding this functionality.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
lookup_zone() and scan_free_map() cope in different ways with the
location of the map data within a zone:
1. lookup_zone() adds a four byte offset to the map data pointer to
skip over the check and free link bytes.
2. scan_free_map() needs to use the free link pointer, which is an
offset from itself, so we end up adding a 32-bit offset to the
end pointer (aka mapsize) which is really confusing.
Rename mapsize to endbit as this is really what it is, and incorporate
the 32-bit offset into the map layout. This means that both dm_startbit
and dm_endbit are now bit offsets from the start of the buffer, rather
than four bytes in to the buffer.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We have several places which deal with releasing the map buffers and
freeing the map array. Provide a helper for this.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Split up adfs_read_map() into separate helpers to layout the map,
read the map, and release the map buffers.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
adfs_map_free() is not obvious whether it is freeing the map or
returning the number of free blocks on the filesystem. Rename it to
the more generic statfs() to make it clear that it's a statistic
function.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Keep all the map code together in map.c, rather than having some in
super.c
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Fix adfs_mode2atts() to actually update the file permissions on the
media rather than using the current inode mode. Note also that
directories do not have read/write permissions stored on the media.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Despite ADFS timestamps having centi-second granularity, and Linux
gaining fine-grained timestamp support in v2.5.48, fs/adfs was never
updated.
Update fs/adfs to centi-second support, and ensure that the inode ctime
always reflects what is written in underlying media.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull adfs updates from Al Viro:
"More ADFS patches from Russell King"
* 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/adfs: add time stamp and file type helpers
fs/adfs: super: limit idlen according to directory type
fs/adfs: super: fix use-after-free bug
fs/adfs: super: safely update options on remount
fs/adfs: super: correct superblock flags
fs/adfs: clean up indirect disc addresses and fragment IDs
fs/adfs: clean up error message printing
fs/adfs: use %pV for error messages
fs/adfs: use format_version from disc_record
fs/adfs: add helper to get filesystem size
fs/adfs: add helper to get discrecord from map
fs/adfs: correct disc record structure
|
|
Add some helpers to check whether the inode has a time stamp and file
type, and to parse the file type from the load address.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Limit idlen according to the directory type, as idlen (the size of a
fragment ID) can not be more than 16 with the "new directory" layout.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Fix a use-after-free bug during filesystem initialisation, where we
access the disc record (which is stored in a buffer) after we have
released the buffer.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Only update the options on remount if we successfully parse all options,
rather than updating those we've managed to parse.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We don't support atime updates of any kind, and we ought to set the
read-only bit if we are compiled without write support.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We use a variety of different names for the indirect disc address of
the current object, use a variety of different types, and print it in
a variety of different ways. Bring some consistency to this by naming
it "indaddr", use u32 or __u32 as the type since it fits in 32-bits,
and always print it with %06x (with no leading hex prefix.)
When printing it was a directory identifer, use "dir %06x" otherwise
use "object %06x".
Do the same for fragment IDs and the parent indirect disc addresses.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Overhaul our message printing:
- provide a consistent way to print messages:
- filesystem corruption should be reported via adfs_error()
- everything else should use adfs_msg()
- clean up the error message printing when mounting a filesystem
- fix the messages printed by the big directory format code to only
use adfs_error() when there is filesystem corruption, otherwise
use adfs_msg().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Rather than using vsnprintf() with a temporary buffer on the stack, use
%pV to print error messages.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We only use the format version in one place during filesystem mount, so
it is pointless storing it in the superblock structure. Also, we should
be using the version from the disc record in the map rather than the
boot block.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add a helper to get the filesystem size from the disc record and
eliminate the "s_size" member of the adfs superblock structure.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Add a helper to get the disc record from the map, rather than open
coding this in adfs_fill_super().
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull ADFS cleanups/fixes from Russell King:
"As a result of some of Al Viro's great work, here are a few cleanups
with fixes for adfs:
- factor out filename comparison, so we can be sure that
adfs_compare() (used for namei compare) and adfs_match() (used for
lookup) have the same behaviour.
- factor out filename lowering (which is not the same as tolower()
which will lower top-bit-set characters) to ensure that we have the
same behaviour when comparing filenames as when we hash them.
- factor out the object fixups, so we are applying all fixups to
directory objects in the same way, independent of the disk format.
- factor out the object name fixup (into the previously factored out
function) to ensure that filenames are appropriately translated -
for example, adfs allows '/' in filenames, which being the Unix
path separator, need to be translated to a different character,
which is normally '.' (DOS 8.3 filenames represent the . as a / on
adfs, so this is the expected reverse translation.)
- remove filename truncation; Al asked about this and apparently the
decision is to remove it. In any case, adfs's truncation was buggy,
so this rids us of that bug by removing the truncation feature.
- we now have only one location which adds the "filetype" suffix to
the filename, so there's no point that code being out of line.
- since we translate '/' into '.', an adfs filename of "/" or "//"
would end up being translated to "." and ".." which have special
meanings. In this case, change the first character to "^" to avoid
these special directory names being abused"
* tag 'for-rc-adfs' of git://git.armlinux.org.uk/~rmk/linux-arm:
fs/adfs: fix filename fixup handling for "/" and "//" names
fs/adfs: move append_filetype_suffix() into adfs_object_fixup()
fs/adfs: remove truncated filename hashing
fs/adfs: factor out filename fixup
fs/adfs: factor out object fixups
fs/adfs: factor out filename case lowering
fs/adfs: factor out filename comparison
|
|
Avoid translating "/" and "//" directory entry names to the special
"." and ".." names by instead converting the first character to "^".
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
append_filetype_suffix() is now only used in adfs_object_fixup(), so
move it there.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
fs/adfs support for truncated filenames is broken, and there is a desire
not to support this into the future. Let's remove the fs/adfs support
for this.
Viro says:
"FWIW, the word from Linus had been basically "kill it off" on
truncation."
That being:
"Make it so. Make the rule be that d_hash() can only change the hash
itself, rather than the subtle special case for len that we had
because of legacy reasons.."
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Move the filename fixup to adfs_object_fixup() so we only have one
implementation of this.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Factor out the directory object fixups, which parse the filetype and
optionally apply the filetype suffix to the filename.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Factor out the filename case lowering of directory names when comparing
or hashing filenames.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
We have essentially the same code in adfs_compare() as adfs_match(), so
arrange to use a common implementation.
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
|
|
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull Wimplicit-fallthrough updates from Gustavo A. R. Silva:
"Mark switch cases where we are expecting to fall through.
This is part of the ongoing efforts to enable -Wimplicit-fallthrough.
Most of them have been baking in linux-next for a whole development
cycle. And with Stephen Rothwell's help, we've had linux-next
nag-emails going out for newly introduced code that triggers
-Wimplicit-fallthrough to avoid gaining more of these cases while we
work to remove the ones that are already present.
We are getting close to completing this work. Currently, there are
only 32 of 2311 of these cases left to be addressed in linux-next. I'm
auditing every case; I take a look into the code and analyze it in
order to determine if I'm dealing with an actual bug or a false
positive, as explained here:
https://lore.kernel.org/lkml/c2fad584-1705-a5f2-d63c-824e9b96cf50@embeddedor.com/
While working on this, I've found and fixed the several missing
break/return bugs, some of them introduced more than 5 years ago.
Once this work is finished, we'll be able to universally enable
"-Wimplicit-fallthrough" to avoid any of these kinds of bugs from
entering the kernel again"
* tag 'Wimplicit-fallthrough-5.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (27 commits)
memstick: mark expected switch fall-throughs
drm/nouveau/nvkm: mark expected switch fall-throughs
NFC: st21nfca: Fix fall-through warnings
NFC: pn533: mark expected switch fall-throughs
block: Mark expected switch fall-throughs
ASN.1: mark expected switch fall-through
lib/cmdline.c: mark expected switch fall-throughs
lib: zstd: Mark expected switch fall-throughs
scsi: sym53c8xx_2: sym_nvram: Mark expected switch fall-through
scsi: sym53c8xx_2: sym_hipd: mark expected switch fall-throughs
scsi: ppa: mark expected switch fall-through
scsi: osst: mark expected switch fall-throughs
scsi: lpfc: lpfc_scsi: Mark expected switch fall-throughs
scsi: lpfc: lpfc_nvme: Mark expected switch fall-through
scsi: lpfc: lpfc_nportdisc: Mark expected switch fall-through
scsi: lpfc: lpfc_hbadisc: Mark expected switch fall-throughs
scsi: lpfc: lpfc_els: Mark expected switch fall-throughs
scsi: lpfc: lpfc_ct: Mark expected switch fall-throughs
scsi: imm: mark expected switch fall-throughs
scsi: csiostor: csio_wr: mark expected switch fall-through
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Warning level 3 was used: -Wimplicit-fallthrough=3
This patch is part of the ongoing efforts to enable
-Wimplicit-fallthrough
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
|
|
We just truncate the seconds to 32-bit in one place now, so this can
trivially be converted over to using timespec64 consistently.
Link: http://lkml.kernel.org/r/20180620100133.4035614-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We never look them up in there; inode_fake_hash() will make them appear
hashed for mark_inode_dirty() purposes. And don't leave them around
until memory pressure kicks them out - we never look them up again.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground
Pull inode timestamps conversion to timespec64 from Arnd Bergmann:
"This is a late set of changes from Deepa Dinamani doing an automated
treewide conversion of the inode and iattr structures from 'timespec'
to 'timespec64', to push the conversion from the VFS layer into the
individual file systems.
As Deepa writes:
'The series aims to switch vfs timestamps to use struct timespec64.
Currently vfs uses struct timespec, which is not y2038 safe.
The series involves the following:
1. Add vfs helper functions for supporting struct timepec64
timestamps.
2. Cast prints of vfs timestamps to avoid warnings after the switch.
3. Simplify code using vfs timestamps so that the actual replacement
becomes easy.
4. Convert vfs timestamps to use struct timespec64 using a script.
This is a flag day patch.
Next steps:
1. Convert APIs that can handle timespec64, instead of converting
timestamps at the boundaries.
2. Update internal data structures to avoid timestamp conversions'
Thomas Gleixner adds:
'I think there is no point to drag that out for the next merge
window. The whole thing needs to be done in one go for the core
changes which means that you're going to play that catchup game
forever. Let's get over with it towards the end of the merge window'"
* tag 'vfs-timespec64' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
pstore: Remove bogus format string definition
vfs: change inode times to use struct timespec64
pstore: Convert internal records to timespec64
udf: Simplify calls to udf_disk_stamp_to_time
fs: nfs: get rid of memcpys for inode times
ceph: make inode time prints to be long long
lustre: Use long long type to print inode time
fs: add timespec64_truncate()
|
|
The kmalloc() function has a 2-factor argument form, kmalloc_array(). This
patch replaces cases of:
kmalloc(a * b, gfp)
with:
kmalloc_array(a * b, gfp)
as well as handling cases of:
kmalloc(a * b * c, gfp)
with:
kmalloc(array3_size(a, b, c), gfp)
as it's slightly less ugly than:
kmalloc_array(array_size(a, b), c, gfp)
This does, however, attempt to ignore constant size factors like:
kmalloc(4 * 1024, gfp)
though any constants defined via macros get caught up in the conversion.
Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.
The tools/ directory was manually excluded, since it has its own
implementation of kmalloc().
The Coccinelle script used for this was:
// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@
(
kmalloc(
- (sizeof(TYPE)) * E
+ sizeof(TYPE) * E
, ...)
|
kmalloc(
- (sizeof(THING)) * E
+ sizeof(THING) * E
, ...)
)
// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@
(
kmalloc(
- sizeof(u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * (COUNT)
+ COUNT
, ...)
|
kmalloc(
- sizeof(u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(__u8) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(char) * COUNT
+ COUNT
, ...)
|
kmalloc(
- sizeof(unsigned char) * COUNT
+ COUNT
, ...)
)
// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@
(
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_ID)
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_ID
+ COUNT_ID, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (COUNT_CONST)
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * COUNT_CONST
+ COUNT_CONST, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_ID)
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_ID
+ COUNT_ID, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (COUNT_CONST)
+ COUNT_CONST, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * COUNT_CONST
+ COUNT_CONST, sizeof(THING)
, ...)
)
// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@
- kmalloc
+ kmalloc_array
(
- SIZE * COUNT
+ COUNT, SIZE
, ...)
// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@
(
kmalloc(
- sizeof(TYPE) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(TYPE) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(TYPE))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * (COUNT) * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * (STRIDE)
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
|
kmalloc(
- sizeof(THING) * COUNT * STRIDE
+ array3_size(COUNT, STRIDE, sizeof(THING))
, ...)
)
// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@
(
kmalloc(
- sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(THING1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(THING1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * COUNT
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
|
kmalloc(
- sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+ array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
, ...)
)
// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@
(
kmalloc(
- (COUNT) * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * STRIDE * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- (COUNT) * (STRIDE) * (SIZE)
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
|
kmalloc(
- COUNT * STRIDE * SIZE
+ array3_size(COUNT, STRIDE, SIZE)
, ...)
)
// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@
(
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(
- (E1) * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * E3
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- (E1) * (E2) * (E3)
+ array3_size(E1, E2, E3)
, ...)
|
kmalloc(
- E1 * E2 * E3
+ array3_size(E1, E2, E3)
, ...)
)
// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@
(
kmalloc(sizeof(THING) * C2, ...)
|
kmalloc(sizeof(TYPE) * C2, ...)
|
kmalloc(C1 * C2 * C3, ...)
|
kmalloc(C1 * C2, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * (E2)
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(TYPE) * E2
+ E2, sizeof(TYPE)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * (E2)
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- sizeof(THING) * E2
+ E2, sizeof(THING)
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * E2
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- (E1) * (E2)
+ E1, E2
, ...)
|
- kmalloc
+ kmalloc_array
(
- E1 * E2
+ E1, E2
, ...)
)
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
struct timespec is not y2038 safe. Transition vfs to use
y2038 safe struct timespec64 instead.
The change was made with the help of the following cocinelle
script. This catches about 80% of the changes.
All the header file and logic changes are included in the
first 5 rules. The rest are trivial substitutions.
I avoid changing any of the function signatures or any other
filesystem specific data structures to keep the patch simple
for review.
The script can be a little shorter by combining different cases.
But, this version was sufficient for my usecase.
virtual patch
@ depends on patch @
identifier now;
@@
- struct timespec
+ struct timespec64
current_time ( ... )
{
- struct timespec now = current_kernel_time();
+ struct timespec64 now = current_kernel_time64();
...
- return timespec_trunc(
+ return timespec64_trunc(
... );
}
@ depends on patch @
identifier xtime;
@@
struct \( iattr \| inode \| kstat \) {
...
- struct timespec xtime;
+ struct timespec64 xtime;
...
}
@ depends on patch @
identifier t;
@@
struct inode_operations {
...
int (*update_time) (...,
- struct timespec t,
+ struct timespec64 t,
...);
...
}
@ depends on patch @
identifier t;
identifier fn_update_time =~ "update_time$";
@@
fn_update_time (...,
- struct timespec *t,
+ struct timespec64 *t,
...) { ... }
@ depends on patch @
identifier t;
@@
lease_get_mtime( ... ,
- struct timespec *t
+ struct timespec64 *t
) { ... }
@te depends on patch forall@
identifier ts;
local idexpression struct inode *inode_node;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
identifier fn_update_time =~ "update_time$";
identifier fn;
expression e, E3;
local idexpression struct inode *node1;
local idexpression struct inode *node2;
local idexpression struct iattr *attr1;
local idexpression struct iattr *attr2;
local idexpression struct iattr attr;
identifier i_xtime1 =~ "^i_[acm]time$";
identifier i_xtime2 =~ "^i_[acm]time$";
identifier ia_xtime1 =~ "^ia_[acm]time$";
identifier ia_xtime2 =~ "^ia_[acm]time$";
@@
(
(
- struct timespec ts;
+ struct timespec64 ts;
|
- struct timespec ts = current_time(inode_node);
+ struct timespec64 ts = current_time(inode_node);
)
<+... when != ts
(
- timespec_equal(&inode_node->i_xtime, &ts)
+ timespec64_equal(&inode_node->i_xtime, &ts)
|
- timespec_equal(&ts, &inode_node->i_xtime)
+ timespec64_equal(&ts, &inode_node->i_xtime)
|
- timespec_compare(&inode_node->i_xtime, &ts)
+ timespec64_compare(&inode_node->i_xtime, &ts)
|
- timespec_compare(&ts, &inode_node->i_xtime)
+ timespec64_compare(&ts, &inode_node->i_xtime)
|
ts = current_time(e)
|
fn_update_time(..., &ts,...)
|
inode_node->i_xtime = ts
|
node1->i_xtime = ts
|
ts = inode_node->i_xtime
|
<+... attr1->ia_xtime ...+> = ts
|
ts = attr1->ia_xtime
|
ts.tv_sec
|
ts.tv_nsec
|
btrfs_set_stack_timespec_sec(..., ts.tv_sec)
|
btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
|
- ts = timespec64_to_timespec(
+ ts =
...
-)
|
- ts = ktime_to_timespec(
+ ts = ktime_to_timespec64(
...)
|
- ts = E3
+ ts = timespec_to_timespec64(E3)
|
- ktime_get_real_ts(&ts)
+ ktime_get_real_ts64(&ts)
|
fn(...,
- ts
+ timespec64_to_timespec(ts)
,...)
)
...+>
(
<... when != ts
- return ts;
+ return timespec64_to_timespec(ts);
...>
)
|
- timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
+ timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
|
- timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
+ timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
|
- timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
+ timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
|
node1->i_xtime1 =
- timespec_trunc(attr1->ia_xtime1,
+ timespec64_trunc(attr1->ia_xtime1,
...)
|
- attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
+ attr1->ia_xtime1 = timespec64_trunc(attr2->ia_xtime2,
...)
|
- ktime_get_real_ts(&attr1->ia_xtime1)
+ ktime_get_real_ts64(&attr1->ia_xtime1)
|
- ktime_get_real_ts(&attr.ia_xtime1)
+ ktime_get_real_ts64(&attr.ia_xtime1)
)
@ depends on patch @
struct inode *node;
struct iattr *attr;
identifier fn;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
expression e;
@@
(
- fn(node->i_xtime);
+ fn(timespec64_to_timespec(node->i_xtime));
|
fn(...,
- node->i_xtime);
+ timespec64_to_timespec(node->i_xtime));
|
- e = fn(attr->ia_xtime);
+ e = fn(timespec64_to_timespec(attr->ia_xtime));
)
@ depends on patch forall @
struct inode *node;
struct iattr *attr;
identifier i_xtime =~ "^i_[acm]time$";
identifier ia_xtime =~ "^ia_[acm]time$";
identifier fn;
@@
{
+ struct timespec ts;
<+...
(
+ ts = timespec64_to_timespec(node->i_xtime);
fn (...,
- &node->i_xtime,
+ &ts,
...);
|
+ ts = timespec64_to_timespec(attr->ia_xtime);
fn (...,
- &attr->ia_xtime,
+ &ts,
...);
)
...+>
}
@ depends on patch forall @
struct inode *node;
struct iattr *attr;
struct kstat *stat;
identifier ia_xtime =~ "^ia_[acm]time$";
identifier i_xtime =~ "^i_[acm]time$";
identifier xtime =~ "^[acm]time$";
identifier fn, ret;
@@
{
+ struct timespec ts;
<+...
(
+ ts = timespec64_to_timespec(node->i_xtime);
ret = fn (...,
- &node->i_xtime,
+ &ts,
...);
|
+ ts = timespec64_to_timespec(node->i_xtime);
ret = fn (...,
- &node->i_xtime);
+ &ts);
|
+ ts = timespec64_to_timespec(attr->ia_xtime);
ret = fn (...,
- &attr->ia_xtime,
+ &ts,
...);
|
+ ts = timespec64_to_timespec(attr->ia_xtime);
ret = fn (...,
- &attr->ia_xtime);
+ &ts);
|
+ ts = timespec64_to_timespec(stat->xtime);
ret = fn (...,
- &stat->xtime);
+ &ts);
)
...+>
}
@ depends on patch @
struct inode *node;
struct inode *node2;
identifier i_xtime1 =~ "^i_[acm]time$";
identifier i_xtime2 =~ "^i_[acm]time$";
identifier i_xtime3 =~ "^i_[acm]time$";
struct iattr *attrp;
struct iattr *attrp2;
struct iattr attr ;
identifier ia_xtime1 =~ "^ia_[acm]time$";
identifier ia_xtime2 =~ "^ia_[acm]time$";
struct kstat *stat;
struct kstat stat1;
struct timespec64 ts;
identifier xtime =~ "^[acmb]time$";
expression e;
@@
(
( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1 ;
|
node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
|
node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
|
node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
|
stat->xtime = node2->i_xtime1;
|
stat1.xtime = node2->i_xtime1;
|
( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1 ;
|
( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
|
- e = node->i_xtime1;
+ e = timespec64_to_timespec( node->i_xtime1 );
|
- e = attrp->ia_xtime1;
+ e = timespec64_to_timespec( attrp->ia_xtime1 );
|
node->i_xtime1 = current_time(...);
|
node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
- e;
+ timespec_to_timespec64(e);
|
node->i_xtime1 = node->i_xtime3 =
- e;
+ timespec_to_timespec64(e);
|
- node->i_xtime1 = e;
+ node->i_xtime1 = timespec_to_timespec64(e);
)
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: <anton@tuxera.com>
Cc: <balbi@kernel.org>
Cc: <bfields@fieldses.org>
Cc: <darrick.wong@oracle.com>
Cc: <dhowells@redhat.com>
Cc: <dsterba@suse.com>
Cc: <dwmw2@infradead.org>
Cc: <hch@lst.de>
Cc: <hirofumi@mail.parknet.co.jp>
Cc: <hubcap@omnibond.com>
Cc: <jack@suse.com>
Cc: <jaegeuk@kernel.org>
Cc: <jaharkes@cs.cmu.edu>
Cc: <jslaby@suse.com>
Cc: <keescook@chromium.org>
Cc: <mark@fasheh.com>
Cc: <miklos@szeredi.hu>
Cc: <nico@linaro.org>
Cc: <reiserfs-devel@vger.kernel.org>
Cc: <richard@nod.at>
Cc: <sage@redhat.com>
Cc: <sfrench@samba.org>
Cc: <swhiteho@redhat.com>
Cc: <tj@kernel.org>
Cc: <trond.myklebust@primarydata.com>
Cc: <tytso@mit.edu>
Cc: <viro@zeniv.linux.org.uk>
|
|
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
This is a pure automated search-and-replace of the internal kernel
superblock flags.
The s_flags are now called SB_*, with the names and the values for the
moment mirroring the MS_* flags that they're equivalent to.
Note how the MS_xyz flags are the ones passed to the mount system call,
while the SB_xyz flags are what we then use in sb->s_flags.
The script to do this was:
# places to look in; re security/*: it generally should *not* be
# touched (that stuff parses mount(2) arguments directly), but
# there are two places where we really deal with superblock flags.
FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
include/linux/fs.h include/uapi/linux/bfs_fs.h \
security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
# the list of MS_... constants
SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
ACTIVE NOUSER"
SED_PROG=
for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
# we want files that contain at least one of MS_...,
# with fs/namespace.c and fs/pnode.c excluded.
L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
for f in $L; do sed -i $f $SED_PROG; done
Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
">rename2() work from Miklos + current_time() from Deepa"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Replace current_fs_time() with current_time()
fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps
fs: Replace CURRENT_TIME with current_time() for inode timestamps
fs: proc: Delete inode time initializations in proc_alloc_inode()
vfs: Add current_time() api
vfs: add note about i_op->rename changes to porting
fs: rename "rename2" i_op to "rename"
vfs: remove unused i_op->rename
fs: make remaining filesystems use .rename2
libfs: support RENAME_NOREPLACE in simple_rename()
fs: support RENAME_NOREPLACE for local filesystems
ncpfs: fix unused variable warning
|
|
CURRENT_TIME macro is not appropriate for filesystems as it
doesn't use the right granularity for filesystem timestamps.
Use current_time() instead.
CURRENT_TIME is also not y2038 safe.
This is also in preparation for the patch that transitions
vfs timestamps to use 64 bit time and hence make them
y2038 safe. As part of the effort current_time() will be
extended to do range checks. Hence, it is necessary for all
file system timestamps to use current_time(). Also,
current_time() will be transitioned along with vfs to be
y2038 safe.
Note that whenever a single call to current_time() is used
to change timestamps in different inodes, it is because they
share the same time granularity.
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Felipe Balbi <balbi@kernel.org>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
inode_change_ok() will be resposible for clearing capabilities and IMA
extended attributes and as such will need dentry. Give it as an argument
to inode_change_ok() instead of an inode. Also rename inode_change_ok()
to setattr_prepare() to better relect that it does also some
modifications in addition to checks.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jan Kara <jack@suse.cz>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more vfs updates from Al Viro:
"Assorted cleanups and fixes.
In the "trivial API change" department - ->d_compare() losing 'parent'
argument"
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
cachefiles: Fix race between inactivating and culling a cache object
9p: use clone_fid()
9p: fix braino introduced in "9p: new helper - v9fs_parent_fid()"
vfs: make dentry_needs_remove_privs() internal
vfs: remove file_needs_remove_privs()
vfs: fix deadlock in file_remove_privs() on overlayfs
get rid of 'parent' argument of ->d_compare()
cifs, msdos, vfat, hfs+: don't bother with parent in ->d_compare()
affs ->d_compare(): don't bother with ->d_inode
fold _d_rehash() and __d_rehash() together
fold dentry_rcuwalk_invalidate() into its only remaining caller
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull qstr constification updates from Al Viro:
"Fairly self-contained bunch - surprising lot of places passes struct
qstr * as an argument when const struct qstr * would suffice; it
complicates analysis for no good reason.
I'd prefer to feed that separately from the assorted fixes (those are
in #for-linus and with somewhat trickier topology)"
* 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
qstr: constify instances in adfs
qstr: constify instances in lustre
qstr: constify instances in f2fs
qstr: constify instances in ext2
qstr: constify instances in vfat
qstr: constify instances in procfs
qstr: constify instances in fuse
qstr constify instances in fs/dcache.c
qstr: constify instances in nfs
qstr: constify instances in ocfs2
qstr: constify instances in autofs4
qstr: constify instances in hfs
qstr: constify instances in hfsplus
qstr: constify instances in logfs
qstr: constify dentry_init_security
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We always mixed in the parent pointer into the dentry name hash, but we
did it late at lookup time. It turns out that we can simplify that
lookup-time action by salting the hash with the parent pointer early
instead of late.
A few other users of our string hashes also wanted to mix in their own
pointers into the hash, and those are updated to use the same mechanism.
Hash users that don't have any particular initial salt can just use the
NULL pointer as a no-salt.
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: George Spelvin <linux@sciencehorizons.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Lots of needless 80-col overflows.
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg. For the list, see below:
- threadinfo
- task_struct
- task_delay_info
- pid
- cred
- mm_struct
- vm_area_struct and vm_region (nommu)
- anon_vma and anon_vma_chain
- signal_struct
- sighand_struct
- fs_struct
- files_struct
- fdtable and fdtable->full_fds_bits
- dentry and external_name
- inode for all filesystems. This is the most tedious part, because
most filesystems overwrite the alloc_inode method.
The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds. Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Greg Thelen <gthelen@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The adfs_dir_ops structures are never modified, so declare them as const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
kmem_cache_alloc() returns void*.
Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
"d_inode() annotations from David Howells (sat in for-next since before
the beginning of merge window) + four assorted fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RCU pathwalk breakage when running into a symlink overmounting something
fix I_DIO_WAKEUP definition
direct-io: only inc/dec inode->i_dio_count for file systems
fs/9p: fix readdir()
VFS: assorted d_backing_inode() annotations
VFS: fs/inode.c helpers: d_inode() annotations
VFS: fs/cachefiles: d_backing_inode() annotations
VFS: fs library helpers: d_inode() annotations
VFS: assorted weird filesystems: d_inode() annotations
VFS: normal filesystems (and lustre): d_inode() annotations
VFS: security/: d_inode() annotations
VFS: security/: d_backing_inode() annotations
VFS: net/: d_inode() annotations
VFS: net/unix: d_backing_inode() annotations
VFS: kernel/: d_inode() annotations
VFS: audit: d_backing_inode() annotations
VFS: Fix up some ->d_inode accesses in the chelsio driver
VFS: Cachefiles should perform fs modifications on the top layer only
VFS: AF_UNIX sockets should call mknod on the top layer only
|
|
Fix the wrong values returned by various functions such as EIO and ENOMEM.
Signed-off-by: Sanidhya Kashyap <sanidhya.gatech@gmail.com>
Cc: Fabian Frederick <fabf@skynet.be>
Cc: Joe Perches <joe@perches.com>
Cc: Taesoo kim <taesoo@gatech.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
that's the bulk of filesystem drivers dealing with inodes of their own
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
All places outside of core VFS that checked ->read and ->write for being NULL or
called the methods directly are gone now, so NULL {read,write} with non-NULL
{read,write}_iter will do the right thing in all cases.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Might as well do the right thing.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
kcalloc manages count*sizeof overflow.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Use kernel.h definition.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
init_inodecache is only called by __init init_adfs_fs.
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Previously, the no-op "mount -o mount /dev/xxx" operation when the
file system is already mounted read-write causes an implied,
unconditional syncfs(). This seems pretty stupid, and it's certainly
documented or guaraunteed to do this, nor is it particularly useful,
except in the case where the file system was mounted rw and is getting
remounted read-only.
However, it's possible that there might be some file systems that are
actually depending on this behavior. In most file systems, it's
probably fine to only call sync_filesystem() when transitioning from
read-write to read-only, and there are some file systems where this is
not needed at all (for example, for a pseudo-filesystem or something
like romfs).
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: linux-fsdevel@vger.kernel.org
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Artem Bityutskiy <dedekind1@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Jan Kara <jack@suse.cz>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Anders Larsen <al@alarsen.net>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: Petr Vandrovec <petr@vandrovec.name>
Cc: xfs@oss.sgi.com
Cc: linux-btrfs@vger.kernel.org
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: codalist@coda.cs.cmu.edu
Cc: linux-ext4@vger.kernel.org
Cc: linux-f2fs-devel@lists.sourceforge.net
Cc: fuse-devel@lists.sourceforge.net
Cc: cluster-devel@redhat.com
Cc: linux-mtd@lists.infradead.org
Cc: jfs-discussion@lists.sourceforge.net
Cc: linux-nfs@vger.kernel.org
Cc: linux-nilfs@vger.kernel.org
Cc: linux-ntfs-dev@lists.sourceforge.net
Cc: ocfs2-devel@oss.oracle.com
Cc: reiserfs-devel@vger.kernel.org
|
|
makes ->d_hash() and ->d_compare() safety in RCU mode independent
from vfsmount_lock.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
truncate_pagecache() doesn't care about old size since commit
cedabed49b39 ("vfs: Fix vmtruncate() regression"). Let's drop it.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Instances either don't look at it at all (the majority of cases) or
only want it to find the superblock (which can be had as dentry->d_sb).
A few cases that want more are actually safe with dentry->d_inode -
the only precaution needed is the check that it hadn't been replaced with
NULL by rmdir() or by overwriting rename(), which case should be simply
treated as cache miss.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Modify the request_module to prefix the file system type with "fs-"
and add aliases to all of the filesystems that can be built as modules
to match.
A common practice is to build all of the kernel code and leave code
that is not commonly needed as modules, with the result that many
users are exposed to any bug anywhere in the kernel.
Looking for filesystems with a fs- prefix limits the pool of possible
modules that can be loaded by mount to just filesystems trivially
making things safer with no real cost.
Using aliases means user space can control the policy of which
filesystem modules are auto-loaded by editing /etc/modprobe.d/*.conf
with blacklist and alias directives. Allowing simple, safe,
well understood work-arounds to known problematic software.
This also addresses a rare but unfortunate problem where the filesystem
name is not the same as it's module name and module auto-loading
would not work. While writing this patch I saw a handful of such
cases. The most significant being autofs that lives in the module
autofs4.
This is relevant to user namespaces because we can reach the request
module in get_fs_type() without having any special permissions, and
people get uncomfortable when a user specified string (in this case
the filesystem type) goes all of the way to request_module.
After having looked at this issue I don't think there is any
particular reason to perform any filtering or permission checks beyond
making it clear in the module request that we want a filesystem
module. The common pattern in the kernel is to call request_module()
without regards to the users permissions. In general all a filesystem
module does once loaded is call register_filesystem() and go to sleep.
Which means there is not much attack surface exposed by loading a
filesytem module unless the filesystem is mounted. In a user
namespace filesystems are not mounted unless .fs_flags = FS_USERNS_MOUNT,
which most filesystems do not set today.
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reported-by: Kees Cook <keescook@google.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The CONFIG_EXPERIMENTAL config item has not carried much meaning for a
while now and is almost always enabled by default. As agreed during the
Linux kernel summit, remove it from any "depends on" lines in Kconfigs.
Acked-by: Stuart Swales <stuart.swales.croftnuisk@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Removed vmtruncate
Signed-off-by: Marco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs update from Al Viro:
- big one - consolidation of descriptor-related logics; almost all of
that is moved to fs/file.c
(BTW, I'm seriously tempted to rename the result to fd.c. As it is,
we have a situation when file_table.c is about handling of struct
file and file.c is about handling of descriptor tables; the reasons
are historical - file_table.c used to be about a static array of
struct file we used to have way back).
A lot of stray ends got cleaned up and converted to saner primitives,
disgusting mess in android/binder.c is still disgusting, but at least
doesn't poke so much in descriptor table guts anymore. A bunch of
relatively minor races got fixed in process, plus an ext4 struct file
leak.
- related thing - fget_light() partially unuglified; see fdget() in
there (and yes, it generates the code as good as we used to have).
- also related - bits of Cyrill's procfs stuff that got entangled into
that work; _not_ all of it, just the initial move to fs/proc/fd.c and
switch of fdinfo to seq_file.
- Alex's fs/coredump.c spiltoff - the same story, had been easier to
take that commit than mess with conflicts. The rest is a separate
pile, this was just a mechanical code movement.
- a few misc patches all over the place. Not all for this cycle,
there'll be more (and quite a few currently sit in akpm's tree)."
Fix up trivial conflicts in the android binder driver, and some fairly
simple conflicts due to two different changes to the sock_alloc_file()
interface ("take descriptor handling from sock_alloc_file() to callers"
vs "net: Providing protocol type via system.sockprotoname xattr of
/proc/PID/fd entries" adding a dentry name to the socket)
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits)
MAX_LFS_FILESIZE should be a loff_t
compat: fs: Generic compat_sys_sendfile implementation
fs: push rcu_barrier() from deactivate_locked_super() to filesystems
btrfs: reada_extent doesn't need kref for refcount
coredump: move core dump functionality into its own file
coredump: prevent double-free on an error path in core dumper
usb/gadget: fix misannotations
fcntl: fix misannotations
ceph: don't abuse d_delete() on failure exits
hypfs: ->d_parent is never NULL or negative
vfs: delete surplus inode NULL check
switch simple cases of fget_light to fdget
new helpers: fdget()/fdput()
switch o2hb_region_dev_write() to fget_light()
proc_map_files_readdir(): don't bother with grabbing files
make get_file() return its argument
vhost_set_vring(): turn pollstart/pollstop into bool
switch prctl_set_mm_exe_file() to fget_light()
switch xfs_find_handle() to fget_light()
switch xfs_swapext() to fget_light()
...
|
|
There's no reason to call rcu_barrier() on every
deactivate_locked_super(). We only need to make sure that all delayed rcu
free inodes are flushed before we destroy related cache.
Removing rcu_barrier() from deactivate_locked_super() affects some fast
paths. E.g. on my machine exit_group() of a last process in IPC
namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
Just the flags; only NFS cares even about that, but there are
legitimate uses for such argument. And getting rid of that
completely would require splitting ->lookup() into a couple
of methods (at least), so let's leave that alone for now...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Replace remaining direct i_nlink updates with a new set_nlink()
updater function.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Tested-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
|
|
Fixes generated by 'codespell' and manually reviewed.
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
|
|
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...
Fix up conflicts in fs/{aio.c,super.c}
|
|
fs/adfs/adfs.h: In function 'append_filetype_suffix':
fs/adfs/adfs.h:115: warning: comparison is always false due to limited range of data type
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Stuart Swales <stuart.swales.croftnuisk@gmail.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ADFS (FileCore) storage complies with the RISC OS filetype specification
(12 bits of file type information is stored in the file load address,
rather than using a file extension). The existing driver largely ignores
this information and does not present it to the end user.
It is desirable that stored filetypes be made visible to the end user to
facilitate a precise copy of data and metadata from a hard disc (or image
thereof) into a RISC OS emulator (such as RPCEmu) or to a network share
which can be accessed by real Acorn systems.
This patch implements a per-mount filetype suffix option (use -o
ftsuffix=1) to present any filetype as a ,xyz hexadecimal suffix on each
file. This type suffix is compatible with that used by RISC OS systems
that access network servers using NFS client software and by RPCemu's host
filing system.
Signed-off-by: Stuart Swales <stuart.swales.croftnuisk@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
ADFS (FileCore) storage complies with the RISC OS timestamp specification
(40-bit centiseconds since 01 Jan 1900 00:00:00). It is desirable that
stored timestamp precision be maintained to facilitate a precise copy of
data and metadata from a hard disc (or image thereof) into a RISC OS
emulator (such as RPCEmu).
This patch implements a full-precision conversion from ADFS to Unix
timestamp as the existing driver, for ease of calculation with old 32-bit
compilers, uses the common trick of shifting the 40-bits representing
centiseconds around into 32-bits representing seconds thereby losing
precision.
Signed-off-by: Stuart Swales<stuart.swales.croftnuisk@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Kernel crashes in fs/adfs module when accessing directories with a large
number of objects on mounted Acorn ADFS E+/F+ format discs (or images) as
the existing code writes off the end of the fixed array of struct
buffer_head pointers.
Additionally, each directory access that didn't crash would leak a buffer
as nr_buffers was not adjusted correctly for E+/F+ discs (was always left
as one less than required).
The patch fixes this by allocating a dynamically-sized set of struct
buffer_head pointers if necessary for the E+/F+ case (many directories
still do in fact fit in 2048 bytes) and sets the correct nr_buffers so
that all buffers are released.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=26072
Tested by tar'ing the contents of my RISC PC's E+ format 20Gb HDD which
contains a number of large directories that previously crashed the kernel.
Signed-off-by: Stuart Swales <stuart.swales.croftnuisk@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Code has been converted over to the new explicit on-stack plugging,
and delay users have been converted to use the new API for that.
So lets kill off the old plugging along with aops->sync_page().
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
According to Russell King, adfs was written to not require the big
kernel lock, and all inode updates are done under adfs_dir_lock.
All other metadata in adfs is read-only and does not require locking.
The use of the BKL is the result of various pushdowns from the VFS
operations.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Stuart Swales <stuart.swales.croftnuisk@gmail.com>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Reduce some branches and memory accesses in dcache lookup by adding dentry
flags to indicate common d_ops are set, rather than having to check them.
This saves a pointer memory access (dentry->d_op) in common path lookup
situations, and saves another pointer load and branch in cases where we
have d_op but not the particular operation.
Patched with:
git grep -E '[.>]([[:space:]])*d_op([[:space:]])*=' | xargs sed -e 's/\([^\t ]*\)->d_op = \(.*\);/d_set_d_op(\1, \2);/' -e 's/\([^\t ]*\)\.d_op = \(.*\);/d_set_d_op(\&\1, \2);/' -i
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
|
|
RCU free the struct inode. This will allow:
- Subsequent store-free path walking patch. The inode must be consulted for
permissions when walking, so an RCU inode reference is a must.
- sb_inode_list_lock to be moved inside i_lock because sb list walkers who want
to take i_lock no longer need to take sb_inode_list_lock to walk the list in
the first place. This will simplify and optimize locking.
- Could remove some nested trylock loops in dcache code
- Could potentially simplify things a bit in VM land. Do not need to take the
page lock to follow page->mapping.
The downsides of this is the performance cost of using RCU. In a simple
creat/unlink microbenchmark, performance drops by about 10% due to inability to
reuse cache-hot slab objects. As iterations increase and RCU freeing starts
kicking over, this increases to about 20%.
In cases where inode lifetimes are longer (ie. many inodes may be allocated
during the average life span of a single inode), a lot of this cache reuse is
not applicable, so the regression caused by this patch is smaller.
The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU,
however this adds some complexity to list walking and store-free path walking,
so I prefer to implement this at a later date, if it is shown to be a win in
real situations. I haven't found a regression in any non-micro benchmark so I
doubt it will be a problem.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
|
|
Change d_hash so it may be called from lock-free RCU lookups. See similar
patch for d_compare for details.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
|
|
Change d_compare so it may be called from lock-free RCU lookups. This
does put significant restrictions on what may be done from the callback,
however there don't seem to have been any problems with in-tree fses.
If some strange use case pops up that _really_ cannot cope with the
rcu-walk rules, we can just add new rcu-unaware callbacks, which would
cause name lookup to drop out of rcu-walk mode.
For in-tree filesystems, this is just a mechanical change.
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
|
|
... and switch of the obvious get_sb_bdev() users to ->mount()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
* 'vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: (30 commits)
BKL: remove BKL from freevxfs
BKL: remove BKL from qnx4
autofs4: Only declare function when CONFIG_COMPAT is defined
autofs: Only declare function when CONFIG_COMPAT is defined
ncpfs: Lock socket in ncpfs while setting its callbacks
fs/locks.c: prepare for BKL removal
BKL: Remove BKL from ncpfs
BKL: Remove BKL from OCFS2
BKL: Remove BKL from squashfs
BKL: Remove BKL from jffs2
BKL: Remove BKL from ecryptfs
BKL: Remove BKL from afs
BKL: Remove BKL from USB gadgetfs
BKL: Remove BKL from autofs4
BKL: Remove BKL from isofs
BKL: Remove BKL from fat
BKL: Remove BKL from ext2 filesystem
BKL: Remove BKL from do_new_mount()
BKL: Remove BKL from cgroup
BKL: Remove BKL from NTFS
...
|
|
With all the patches we have queued in the BKL removal tree, only a
few dozen modules are left that actually rely on the BKL, and even
there are lots of low-hanging fruit. We need to decide what to do
about them, this patch illustrates one of the options:
Every user of the BKL is marked as 'depends on BKL' in Kconfig,
and the CONFIG_BKL becomes a user-visible option. If it gets
disabled, no BKL using module can be built any more and the BKL
code itself is compiled out.
The one exception is file locking, which is practically always
enabled and does a 'select BKL' instead. This effectively forces
CONFIG_BKL to be enabled until we have solved the fs/lockd
mess and can apply the patch that removes the BKL from fs/locks.c.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
This patch is a preparation necessary to remove the BKL from do_new_mount().
It explicitly adds calls to lock_kernel()/unlock_kernel() around
get_sb/fill_super operations for filesystems that still uses the BKL.
I've read through all the code formerly covered by the BKL inside
do_kern_mount() and have satisfied myself that it doesn't need the BKL
any more.
do_kern_mount() is already called without the BKL when mounting the rootfs
and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called
from various places without BKL: simple_pin_fs(), nfs_do_clone_mount()
through nfs_follow_mountpoint(), afs_mntpt_do_automount() through
afs_mntpt_follow_link(). Both later functions are actually the filesystems
follow_link inode operation. vfs_kern_mount() is calling the specified
get_sb function and lets the filesystem do its job by calling the given
fill_super function.
Therefore I think it is safe to push down the BKL from the VFS to the
low-level filesystems get_sb/fill_super operation.
[arnd: do not add the BKL to those file systems that already
don't use it elsewhere]
Signed-off-by: Jan Blunck <jblunck@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: Christoph Hellwig <hch@infradead.org>
|
|
Make sure we check the truncate constraints early on in ->setattr by adding
those checks to inode_change_ok. Also clean up and document inode_change_ok
to make this obvious.
As a fallout we don't have to call inode_newsize_ok from simple_setsize and
simplify it down to a truncate_setsize which doesn't return an error. This
simplifies a lot of setattr implementations and means we use truncate_setsize
almost everywhere. Get rid of fat_setsize now that it's trivial and mark
ext2_setsize static to make the calling convention obvious.
Keep the inode_newsize_ok in vmtruncate for now as all callers need an
audit for its removal anyway.
Note: setattr code in ecryptfs doesn't call inode_change_ok at all and
needs a deeper audit, but that is left for later.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Move the call to vmtruncate to get rid of accessive blocks to the callers
in preparation of the new truncate sequence and rename the non-truncating
version to cont_write_begin.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Lots of filesystems calls vmtruncate despite not implementing the old
->truncate method. Switch them to use simple_setsize and add some
comments about the truncate code where it seems fitting.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
We don't name our generic fsync implementations very well currently.
The no-op implementation for in-memory filesystems currently is called
simple_sync_file which doesn't make too much sense to start with,
the the generic one for simple filesystems is called simple_fsync
which can lead to some confusion.
This patch renames the generic file fsync method to generic_file_fsync
to match the other generic_file_* routines it is supposed to be used
with, and the no-op implementation to noop_fsync to make it obvious
what to expect. In addition add some documentation for both methods.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
|
|
This gives the filesystem more information about the writeback that
is happening. Trond requested this for the NFS unstable write handling,
and other filesystems might benefit from this too by beeing able to
distinguish between the different callers in more detail.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
unsigned block cannot be less than 0.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
* Remove smp_lock.h from files which don't need it (including some headers!)
* Add smp_lock.h to files which do need it
* Make smp_lock.h include conditional in hardirq.h
It's needed only for one kernel_locked() usage which is under CONFIG_PREEMPT
This will make hardirq.h inclusion cheaper for every PREEMPT=n config
(which includes allmodconfig/allyesconfig, BTW)
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Missing conversion to host-endian before doing shifts
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Move BKL into ->put_super from the only caller. A couple of
filesystems had trivial enough ->put_super (only kfree and NULLing of
s_fs_info + stuff in there) to not get any locking: coda, cramfs, efs,
hugetlbfs, omfs, qnx4, shmem, all others got the full treatment. Most
of them probably don't need it, but I'd rather sort that out individually.
Preferably after all the other BKL pushdowns in that area.
[AV: original used to move lock_super() down as well; these changes are
removed since we don't do lock_super() at all in generic_shutdown_super()
now]
[AV: fuse, btrfs and xfs are known to need no damn BKL, exempt]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Currently many file systems in Linux kernel do not return f_fsid in statfs
info, the value is set as 0 in vfs layer. Anyway, in some conditions,
f_fsid from statfs(2) is useful, especially being used as (f_fsid, ino)
pair to uniquely identify a file.
Basic idea of the patches is generating a unique fs ID by
huge_encode_dev(sb->s_bdev->bd_dev) during file system mounting life time
(no endian consistent issue). sb is a point of struct super_block of
current mounted file system being accessed by statfs(2).
This patch:
Make adfs return f_fsid info for statfs(2), and do a little variable
renaming in adfs_statfs().
Signed-off-by: Coly Li <coly.li@suse.de>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: "Sergey S. Kostyliov" <rathamahata@php4.ru>
Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: Dave Kleikamp <shaggy@austin.ibm.com>
Cc: Bob Copeland <me@bobcopeland.com>
Cc: Anders Larsen <al@alarsen.net>
Cc: Phillip Lougher <phillip@lougher.demon.co.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Jan Kara <jack@suse.cz>
Cc: Andreas Dilger <adilger@sun.com>
Cc: Jamie Lokier <jamie@shareable.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
|
|
This is a much better version of a previous patch to make the parser
tables constant. Rather than changing the typedef, we put the "const" in
all the various places where its required, allowing the __initconst
exception for nfsroot which was the cause of the previous trouble.
This was posted for review some time ago and I believe its been in -mm
since then.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Cc: Alexander Viro <aviro@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Kmem cache passed to constructor is only needed for constructors that are
themselves multiplexeres. Nobody uses this "feature", nor does anybody uses
passed kmem cache in non-trivial way, so pass only pointer to object.
Non-trivial places are:
arch/powerpc/mm/init_64.c
arch/powerpc/mm/hugetlbpage.c
This is flag day, yes.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
Acked-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Jon Tollefson <kniht@linux.vnet.ibm.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Matt Mackall <mpm@selenic.com>
[akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c]
[akpm@linux-foundation.org: fix mm/slab.c]
[akpm@linux-foundation.org: fix ubifs]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
__FUNCTION__ is gcc-specific, use __func__
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
fs/adfs/dir_f.c:126:4: warning: do-while statement is not a compound statement
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add a .show_options super operation to adfs.
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Slab constructors currently have a flags parameter that is never used. And
the order of the arguments is opposite to other slab functions. The object
pointer is placed before the kmem_cache pointer.
Convert
ctor(void *object, struct kmem_cache *s, unsigned long flags)
to
ctor(struct kmem_cache *s, void *object)
throughout the kernel
[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Slab destructors were no longer supported after Christoph's
c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been
BUGs for both slab and slub, and slob never supported them
either.
This rips out support for the dtor pointer from kmem_cache_create()
completely and fixes up every single callsite in the kernel (there were
about 224, not including the slab allocator definitions themselves,
or the documentation references).
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
|