aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2025-06-15Linux 6.16-rc2HEADv6.16-rc2mainLinus Torvalds1-1/+1
2025-06-15Merge tag 'kbuild-fixes-v6.16' of ↵Linus Torvalds4-64/+33
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Move warnings about linux/export.h from W=1 to W=2 - Fix structure type overrides in gendwarfksyms * tag 'kbuild-fixes-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: gendwarfksyms: Fix structure type overrides kbuild: move warnings about linux/export.h from W=1 to W=2
2025-06-16gendwarfksyms: Fix structure type overridesSami Tolvanen2-58/+21
As we always iterate through the entire die_map when expanding type strings, recursively processing referenced types in type_expand_child() is not actually necessary. Furthermore, the type_string kABI rule added in commit c9083467f7b9 ("gendwarfksyms: Add a kABI rule to override type strings") can fail to override type strings for structures due to a missing kabi_get_type_string() check in this function. Fix the issue by dropping the unnecessary recursion and moving the override check to type_expand(). Note that symbol versions are otherwise unchanged with this patch. Fixes: c9083467f7b9 ("gendwarfksyms: Add a kABI rule to override type strings") Reported-by: Giuliano Procida <gprocida@google.com> Signed-off-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-06-16kbuild: move warnings about linux/export.h from W=1 to W=2Masahiro Yamada2-6/+12
This hides excessive warnings, as nobody builds with W=2. Fixes: a934a57a42f6 ("scripts/misc-check: check missing #include <linux/export.h> when W=1") Fixes: 7d95680d64ac ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com>
2025-06-14Merge tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds5-22/+35
Pull smb client fixes from Steve French: - SMB3.1.1 POSIX extensions fix for char remapping - Fix for repeated directory listings when directory leases enabled - deferred close handle reuse fix * tag 'v6.16-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: improve directory cache reuse for readdir operations smb: client: fix perf regression with deferred closes smb: client: disable path remapping with POSIX extensions
2025-06-14Merge tag 'iommu-fixes-v6.16-rc1' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fix from Joerg Roedel: - Fix PTE size calculation for NVidia Tegra * tag 'iommu-fixes-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/tegra: Fix incorrect size calculation
2025-06-14Merge tag 'block-6.16-20250614' of git://git.kernel.dk/linuxLinus Torvalds7-38/+114
Pull block fixes from Jens Axboe: - Fix for a deadlock on queue freeze with zoned writes - Fix for zoned append emulation - Two bio folio fixes, for sparsemem and for very large folios - Fix for a performance regression introduced in 6.13 when plug insertion was changed - Fix for NVMe passthrough handling for polled IO - Document the ublk auto registration feature - loop lockdep warning fix * tag 'block-6.16-20250614' of git://git.kernel.dk/linux: nvme: always punt polled uring_cmd end_io work to task_work Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublists block: Fix bvec_set_folio() for very large folios bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAP block: use plug request list tail for one-shot backmerge attempt block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_work block: Clear BIO_EMULATES_ZONE_APPEND flag on BIO completion ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG) loop: move lo_set_size() out of queue freeze
2025-06-14Merge tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linuxLinus Torvalds6-23/+59
Pull io_uring fixes from Jens Axboe: - Fix for a race between SQPOLL exit and fdinfo reading. It's slim and I was only able to reproduce this with an artificial delay in the kernel. Followup sparse fix as well to unify the access to ->thread. - Fix for multiple buffer peeking, avoiding truncation if possible. - Run local task_work for IOPOLL reaping when the ring is exiting. This currently isn't done due to an assumption that polled IO will never need task_work, but a fix on the block side is going to change that. * tag 'io_uring-6.16-20250614' of git://git.kernel.dk/linux: io_uring: run local task_work from ring exit IOPOLL reaping io_uring/kbuf: don't truncate end buffer for multiple buffer peeks io_uring: consistently use rcu semantics with sqpoll thread io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()
2025-06-14Merge tag 'rust-fixes-6.16' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull Rust fix from Miguel Ojeda: - 'hrtimer': fix future compile error when the 'impl_has_hr_timer!' macro starts to get called * tag 'rust-fixes-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: rust: time: Fix compile error in impl_has_hr_timer macro
2025-06-14Merge tag 'mm-hotfixes-stable-2025-06-13-21-56' of ↵Linus Torvalds14-29/+134
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "9 hotfixes. 3 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. Only 4 are for MM" * tag 'mm-hotfixes-stable-2025-06-13-21-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: add mmap_prepare() compatibility layer for nested file systems init: fix build warnings about export.h MAINTAINERS: add Barry as a THP reviewer drivers/rapidio/rio_cm.c: prevent possible heap overwrite mm: close theoretical race where stale TLB entries could linger mm/vma: reset VMA iterator on commit_merge() OOM failure docs: proc: update VmFlags documentation in smaps scatterlist: fix extraneous '@'-sign kernel-doc notation selftests/mm: skip failed memfd setups in gup_longterm
2025-06-13Merge tag 'scsi-fixes' of ↵Linus Torvalds6-14/+23
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "All fixes for drivers. The core change in the error handler is simply to translate an ALUA specific sense code into a retry the ALUA components can handle and won't impact any other devices" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: error: alua: I/O errors for ALUA state transitions scsi: storvsc: Increase the timeouts to storvsc_timeout scsi: s390: zfcp: Ensure synchronous unit_add scsi: iscsi: Fix incorrect error path labels for flashnode operations scsi: mvsas: Fix typos in per-phy comments and SAS cmd port registers scsi: core: ufs: Fix a hang in the error handler
2025-06-13Merge tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds9-39/+68
Pull drm fixes from Dave Airlie: "Quiet week, only two pull requests came my way, xe has a couple of fixes and then a bunch of fixes across the board, vc4 probably fixes the biggest problem: vc4: - Fix infinite EPROBE_DEFER loop in vc4 probing amdxdna: - Fix amdxdna firmware size meson: - modesetting fixes sitronix: - Kconfig fix for st7171-i2c dma-buf: - Fix -EBUSY WARN_ON_ONCE in dma-buf udmabuf: - Use dma_sync_sgtable_for_cpu in udmabuf xe: - Fix regression disallowing 64K SVM migration - Use a bounce buffer for WA BB" * tag 'drm-fixes-2025-06-14' of https://gitlab.freedesktop.org/drm/kernel: drm/xe/lrc: Use a temporary buffer for WA BB udmabuf: use sgtable-based scatterlist wrappers dma-buf: fix compare in WARN_ON_ONCE drm/sitronix: st7571-i2c: Select VIDEOMODE_HELPERS drm/meson: fix more rounding issues with 59.94Hz modes drm/meson: use vclk_freq instead of pixel_freq in debug print drm/meson: fix debug log statement when setting the HDMI clocks drm/vc4: fix infinite EPROBE_DEFER loop drm/xe/svm: Fix regression disallowing 64K SVM migration accel/amdxdna: Fix incorrect PSP firmware size
2025-06-13io_uring: run local task_work from ring exit IOPOLL reapingJens Axboe1-0/+3
In preparation for needing to shift NVMe passthrough to always use task_work for polled IO completions, ensure that those are suitably run at exit time. See commit: 9ce6c9875f3e ("nvme: always punt polled uring_cmd end_io work to task_work") for details on why that is necessary. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13nvme: always punt polled uring_cmd end_io work to task_workJens Axboe1-14/+7
Currently NVMe uring_cmd completions will complete locally, if they are polled. This is done because those completions are always invoked from task context. And while that is true, there's no guarantee that it's invoked under the right ring context, or even task. If someone does NVMe passthrough via multiple threads and with a limited number of poll queues, then ringA may find completions from ringB. For that case, completing the request may not be sound. Always just punt the passthrough completions via task_work, which will redirect the completion, if needed. Cc: stable@vger.kernel.org Fixes: 585079b6e425 ("nvme: wire up async polling for io passthrough commands") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13Merge tag 'acpi-6.16-rc2' of ↵Linus Torvalds6-9/+31
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix an ACPI APEI error injection driver failure that started to occur after switching it over to using a faux device, address an EC driver issue related to invalid ECDT tables, clean up the usage of mwait_idle_with_hints() in the ACPI PAD driver, add a new IRQ override quirk, and fix a NULL pointer dereference related to nosmp: - Update the faux device handling code in the driver core and address an ACPI APEI error injection driver failure that started to occur after switching it over to using a faux device on top of that (Dan Williams) - Update data types of variables passed as arguments to mwait_idle_with_hints() in the ACPI PAD (processor aggregator device) driver to match the function definition after recent changes (Uros Bizjak) - Fix a NULL pointer dereference in the ACPI CPPC library that occurs when nosmp is passed to the kernel in the command line (Yunhui Cui) - Ignore ECDT tables with an invalid ID string to prevent using an incorrect GPE for signaling events on some systems (Armin Wolf) - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan)" * tag 'acpi-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: resource: Use IRQ override on MACHENIKE 16P ACPI: EC: Ignore ECDT tables with an invalid ID string ACPI: CPPC: Fix NULL pointer dereference when nosmp is used ACPI: PAD: Update arguments of mwait_idle_with_hints() ACPI: APEI: EINJ: Do not fail einj_init() on faux_device_create() failure driver core: faux: Quiet probe failures driver core: faux: Suppress bind attributes
2025-06-13Merge tag 'pm-6.16-rc2' of ↵Linus Torvalds16-133/+368
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the cpupower utility installation, fix up the recently added Rust abstractions for cpufreq and OPP, restore the x86 update eliminating mwait_play_dead_cpuid_hint() that has been reverted during the 6.16 merge window along with preventing the failure caused by it from happening, and clean up mwait_idle_with_hints() usage in intel_idle: - Implement CpuId Rust abstraction and use it to fix doctest failure related to the recently introduced cpumask abstraction (Viresh Kumar) - Do minor cleanups in the `# Safety` sections for cpufreq abstractions added recently (Viresh Kumar) - Unbreak cpupower systemd service units installation on some systems by adding a unitdir variable for specifying the location to install them (Francesco Poli) - Eliminate mwait_play_dead_cpuid_hint() again after reverting its elimination during the 6.16 merge window due to a problem with handling "dead" SMT siblings, but this time prevent leaving them in C1 after initialization by taking them online and back offline when a proper cpuidle driver for the platform has been registered (Rafael Wysocki) - Update data types of variables passed as arguments to mwait_idle_with_hints() to match the function definition after recent changes (Uros Bizjak)" * tag 'pm-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: rust: cpu: Add CpuId::current() to retrieve current CPU ID rust: Use CpuId in place of raw CPU numbers rust: cpu: Introduce CpuId abstraction intel_idle: Update arguments of mwait_idle_with_hints() cpufreq: Convert `/// SAFETY` lines to `# Safety` sections cpupower: split unitdir from libdir in Makefile Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" ACPI: processor: Rescan "dead" SMT siblings during initialization intel_idle: Rescan "dead" SMT siblings during initialization x86/smp: PM/hibernate: Split arch_resume_nosmt() intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13Merge branches 'acpi-pad', 'acpi-cppc', 'acpi-ec' and 'acpi-resource'Rafael J. Wysocki4-2/+26
Merge assorted ACPI updates for 6.16-rc2: - Update data types of variables passed as arguments to mwait_idle_with_hints() in the ACPI PAD (processor aggregator device) driver to match the function definition after recent changes (Uros Bizjak). - Fix a NULL pointer dereference in the ACPI CPPC library that occurs when nosmp is passed to the kernel in the command line (Yunhui Cui). - Ignore ECDT tables with an invalid ID string to prevent using an incorrect GPE for signaling events on some systems (Armin Wolf). - Add a new IRQ override quirk for MACHENIKE 16P (Wentao Guan). * acpi-pad: ACPI: PAD: Update arguments of mwait_idle_with_hints() * acpi-cppc: ACPI: CPPC: Fix NULL pointer dereference when nosmp is used * acpi-ec: ACPI: EC: Ignore ECDT tables with an invalid ID string * acpi-resource: ACPI: resource: Use IRQ override on MACHENIKE 16P
2025-06-13Merge branch 'pm-cpuidle'Rafael J. Wysocki8-65/+64
Merge cpuidle updates for 6.16-rc2: - Update data types of variables passed as arguments to mwait_idle_with_hints() to match the function definition after recent changes (Uros Bizjak). - Eliminate mwait_play_dead_cpuid_hint() again after reverting its elimination during the merge window due to a problem with handling "dead" SMT siblings, but this time prevent leaving them in C1 after initialization by taking them online and back offline when a proper cpuidle driver for the platform has been registered (Rafael Wysocki). * pm-cpuidle: intel_idle: Update arguments of mwait_idle_with_hints() Reapply "x86/smp: Eliminate mwait_play_dead_cpuid_hint()" ACPI: processor: Rescan "dead" SMT siblings during initialization intel_idle: Rescan "dead" SMT siblings during initialization x86/smp: PM/hibernate: Split arch_resume_nosmt() intel_idle: Use subsys_initcall_sync() for initialization
2025-06-13Merge branch 'pm-tools'Rafael J. Wysocki1-4/+5
Merge a cpupower utility fix for 6.16-rc2 that unbreaks systemd service units installation on some sysems (Francesco Poli). * pm-tools: cpupower: split unitdir from libdir in Makefile
2025-06-13Merge tag 'spi-fix-v6.16-rc1' of ↵Linus Torvalds5-20/+41
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A collection of driver specific fixes, most minor apart from the OMAP ones which disable some recent performance optimisations in some non-standard cases where we could start driving the bus incorrectly. The change to the stm32-ospi driver to use the newer reset APIs is a fix for interactions with other IP sharing the same reset line in some SoCs" * tag 'spi-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engine spi: stm32-ospi: clean up on error in probe() spi: stm32-ospi: Make usage of reset_control_acquire/release() API spi: offload: check offload ops existence before disabling the trigger spi: spi-pci1xxxx: Fix error code in probe spi: loongson: Fix build warnings about export.h spi: omap2-mcspi: Disable multi-mode when the previous message kept CS asserted spi: omap2-mcspi: Disable multi mode when CS should be kept asserted after message
2025-06-13Merge tag 'regulator-fix-v6.16-rc1' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "One minor fix for a leak in the DT parsing code in the max20086 driver" * tag 'regulator-fix-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: max20086: Fix refcount leak in max20086_parse_regulators_dt()
2025-06-13posix-cpu-timers: fix race between handle_posix_cpu_timers() and ↵Oleg Nesterov1-0/+9
posix_cpu_timer_del() If an exiting non-autoreaping task has already passed exit_notify() and calls handle_posix_cpu_timers() from IRQ, it can be reaped by its parent or debugger right after unlock_task_sighand(). If a concurrent posix_cpu_timer_del() runs at that moment, it won't be able to detect timer->it.cpu.firing != 0: cpu_timer_task_rcu() and/or lock_task_sighand() will fail. Add the tsk->exit_state check into run_posix_cpu_timers() to fix this. This fix is not needed if CONFIG_POSIX_CPU_TIMERS_TASK_WORK=y, because exit_task_work() is called before exit_notify(). But the check still makes sense, task_work_add(&tsk->posix_cputimers_work.work) will fail anyway in this case. Cc: stable@vger.kernel.org Reported-by: Benoît Sevens <bsevens@google.com> Fixes: 0bdd2ed4138e ("sched: run_posix_cpu_timers: Don't check ->exit_state, use lock_task_sighand()") Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-13Merge tag 'trace-v6.16-rc1' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fix from Steven Rostedt: - Do not free "head" variable in filter_free_subsystem_filters() The first error path jumps to "free_now" label but first frees the newly allocated "head" variable. But the "free_now" code checks this variable, and if it is not NULL, it will iterate the list. As this list variable was already initialized, the "free_now" code will not do anything as it is empty. But freeing it will cause a UAF bug. The error path should simply jump to the "free_now" label and leave the "head" variable alone. * tag 'trace-v6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Do not free "head" on error path of filter_free_subsystem_filters()
2025-06-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds18-126/+194
Pull kvm fixes from Paolo Bonzini: "ARM: - Rework of system register accessors for system registers that are directly writen to memory, so that sanitisation of the in-memory value happens at the correct time (after the read, or before the write). For convenience, RMW-style accessors are also provided. - Multiple fixes for the so-called "arch-timer-edge-cases' selftest, which was always broken. x86: - Make KVM_PRE_FAULT_MEMORY stricter for TDX, allowing userspace to pass only the "untouched" addresses and flipping the shared/private bit in the implementation. - Disable SEV-SNP support on initialization failure * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: Reject direct bits in gpa passed to KVM_PRE_FAULT_MEMORY KVM: x86/mmu: Embed direct bits into gpa for KVM_PRE_FAULT_MEMORY KVM: SEV: Disable SEV-SNP support on initialization failure KVM: arm64: selftests: Determine effective counter width in arch_timer_edge_cases KVM: arm64: selftests: Fix xVAL init in arch_timer_edge_cases KVM: arm64: selftests: Fix thread migration in arch_timer_edge_cases KVM: arm64: selftests: Fix help text for arch_timer_edge_cases KVM: arm64: Make __vcpu_sys_reg() a pure rvalue operand KVM: arm64: Don't use __vcpu_sys_reg() to get the address of a sysreg KVM: arm64: Add RMW specific sysreg accessor KVM: arm64: Add assignment-specific sysreg accessor
2025-06-13io_uring/kbuf: don't truncate end buffer for multiple buffer peeksJens Axboe1-1/+4
If peeking a bunch of buffers, normally io_ring_buffers_peek() will truncate the end buffer. This isn't optimal as presumably more data will be arriving later, and hence it's better to stop with the last full buffer rather than truncate the end buffer. Cc: stable@vger.kernel.org Fixes: 35c8711c8fc4 ("io_uring/kbuf: add helpers for getting/peeking multiple buffers") Reported-by: Christian Mazakas <christian.mazakas@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13Merge tag 'v6.16-p4' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "Fix a broken self-test in hkdf (new regression)" * tag 'v6.16-p4' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: hkdf - move to late_initcall
2025-06-13Merge tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefsLinus Torvalds25-116/+319
Pull bcachefs fixes from Kent Overstreet: "As usual, highlighting the ones users have been noticing: - Fix a small issue with has_case_insensitive not being propagated on snapshot creation; this led to fsck errors, which we're harmless because we're not using this flag yet (it's for overlayfs + casefolding). - Log the error being corrected in the journal when we're doing fsck repair: this was one of the "lessons learned" from the i_nlink 0 -> subvolume deletion bug, where reconstructing what had happened by analyzing the journal was a bit more difficult than it needed to be. - Don't schedule btree node scan to run in the superblock: this fixes a regression from the 6.16 recovery passes rework, and let to it running unnecessarily. The real issue here is that we don't have online, "self healing" style topology repair yet: topology repair currently has to run before we go RW, which means that we may schedule it unnecessarily after a transient error. This will be fixed in the future. - We now track, in btree node flags, the reason it was scheduled to be rewritten. We discovered a deadlock in recovery when many btree nodes need to be rewritten because they're degraded: fully fixing this will take some work but it's now easier to see what's going on. For the bug report where this came up, a device had been kicked RO due to transient errors: manually setting it back to RW was sufficient to allow recovery to succeed. - Mark a few more fsck errors as autofix: as a reminder to users, please do keep reporting cases where something needs to be repaired and is not repaired automatically (i.e. cases where -o fix_errors or fsck -y is required). - rcu_pending.c now works with PREEMPT_RT - 'bcachefs device add', then umount, then remount wasn't working - we now emit a uevent so that the new device's new superblock is correctly picked up - Assorted repair fixes: btree node scan will no longer incorrectly update sb->version_min, - Assorted syzbot fixes" * tag 'bcachefs-2025-06-12' of git://evilpiepirate.org/bcachefs: (23 commits) bcachefs: Don't trace should_be_locked unless changing bcachefs: Ensure that snapshot creation propagates has_case_insensitive bcachefs: Print devices we're mounting on multi device filesystems bcachefs: Don't trust sb->nr_devices in members_to_text() bcachefs: Fix version checks in validate_bset() bcachefs: ioctl: avoid stack overflow warning bcachefs: Don't pass trans to fsck_err() in gc_accounting_done bcachefs: Fix leak in bch2_fs_recovery() error path bcachefs: Fix rcu_pending for PREEMPT_RT bcachefs: Fix downgrade_table_extra() bcachefs: Don't put rhashtable on stack bcachefs: Make sure opts.read_only gets propagated back to VFS bcachefs: Fix possible console lock involved deadlock bcachefs: mark more errors autofix bcachefs: Don't persistently run scan_for_btree_nodes bcachefs: Read error message now prints if self healing bcachefs: Only run 'increase_depth' for keys from btree node csan bcachefs: Mark need_discard_freespace_key_bad autofix bcachefs: Update /dev/disk/by-uuid on device add bcachefs: Add more flags to btree nodes for rewrite reason ...
2025-06-13Documentation: ublk: Separate UBLK_F_AUTO_BUF_REG fallback behavior sublistsBagas Sanjaya1-0/+2
Stephen Rothwell reports htmldocs warning on ublk docs: Documentation/block/ublk.rst:414: ERROR: Unexpected indentation. [docutils] Fix the warning by separating sublists of auto buffer registration fallback behavior from their appropriate parent list item. Fixes: ff20c516485e ("ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20250612132638.193de386@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20250613023857.15971-1-bagasdotme@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13iommu/tegra: Fix incorrect size calculationJason Gunthorpe1-2/+2
This driver uses a mixture of ways to get the size of a PTE, tegra_smmu_set_pde() did it as sizeof(*pd) which became wrong when pd switched to a struct tegra_pd. Switch pd back to a u32* in tegra_smmu_set_pde() so the sizeof(*pd) returns 4. Fixes: 50568f87d1e2 ("iommu/terga: Do not use struct page as the handle for as->pd memory") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/62e7f7fe-6200-4e4f-ad42-d58ad272baa6@tecnico.ulisboa.pt/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Link: https://lore.kernel.org/r/0-v1-da7b8b3d57eb+ce-iommu_terga_sizeof_jgg@nvidia.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2025-06-13block: Fix bvec_set_folio() for very large foliosMatthew Wilcox (Oracle)1-2/+5
Similarly to 26064d3e2b4d ("block: fix adding folio to bio"), if we attempt to add a folio that is larger than 4GB, we'll silently truncate the offset and len. Widen the parameters to size_t, assert that the length is less than 4GB and set the first page that contains the interesting data rather than the first page of the folio. Fixes: 26db5ee15851 (block: add a bvec_set_folio helper) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144255.2850278-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13bio: Fix bio_first_folio() for SPARSEMEM without VMEMMAPMatthew Wilcox (Oracle)1-1/+1
It is possible for physically contiguous folios to have discontiguous struct pages if SPARSEMEM is enabled and SPARSEMEM_VMEMMAP is not. This is correctly handled by folio_page_idx(), so remove this open-coded implementation. Fixes: 640d1930bef4 (block: Add bio_for_each_folio_all()) Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Link: https://lore.kernel.org/r/20250612144126.2849931-1-willy@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-13spi: spi-pci1xxxx: Drop MSI-X usage as unsupported by DMA engineThangaraj Samynathan1-1/+1
Removes MSI-X from the interrupt request path, as the DMA engine used by the SPI controller does not support MSI-X interrupts. Signed-off-by: Thangaraj Samynathan <thangaraj.s@microchip.com> Link: https://patch.msgid.link/20250612023059.71726-1-thangaraj.s@microchip.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-13Merge tag 'drm-misc-fixes-2025-06-12' of ↵Dave Airlie7-34/+47
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes drm-misc-fixes for v6.16-rc2: - Fix infinite EPROBE_DEFER loop in vc4 probing. - Fix amdxdna firmware size. - mode fixes for meson. - Kconfig fix for st7171-i2c. - Fix -EBUSY WARN_ON_ONCE in dma-buf - Use dma_sync_sgtable_for_cpu in udmabuf. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/62c06195-8bc1-4dae-8777-e86d94e4d9d9@linux.intel.com
2025-06-12mm: add mmap_prepare() compatibility layer for nested file systemsLorenzo Stoakes5-3/+107
Nested file systems, that is those which invoke call_mmap() within their own f_op->mmap() handlers, may encounter underlying file systems which provide the f_op->mmap_prepare() hook introduced by commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"). We have a chicken-and-egg scenario here - until all file systems are converted to using .mmap_prepare(), we cannot convert these nested handlers, as we can't call f_op->mmap from an .mmap_prepare() hook. So we have to do it the other way round - invoke the .mmap_prepare() hook from an .mmap() one. in order to do so, we need to convert VMA state into a struct vm_area_desc descriptor, invoking the underlying file system's f_op->mmap_prepare() callback passing a pointer to this, and then setting VMA state accordingly and safely. This patch achieves this via the compat_vma_mmap_prepare() function, which we invoke from call_mmap() if f_op->mmap_prepare() is specified in the passed in file pointer. We place the fundamental logic into mm/vma.h where VMA manipulation belongs. We also update the VMA userland tests to accommodate the changes. The compat_vma_mmap_prepare() function and its associated machinery is temporary, and will be removed once the conversion of file systems is complete. We carefully place this code so it can be used with CONFIG_MMU and also with cutting edge nommu silicon. [akpm@linux-foundation.org: export compat_vma_mmap_prepare tp fix build] [lorenzo.stoakes@oracle.com: remove unused declarations] Link: https://lkml.kernel.org/r/ac3ae324-4c65-432a-8c6d-2af988b18ac8@lucifer.local Link: https://lkml.kernel.org/r/20250609165749.344976-1-lorenzo.stoakes@oracle.com Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback"). Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Jann Horn <jannh@google.com> Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/ Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-13Merge tag 'drm-xe-fixes-2025-06-12' of ↵Dave Airlie2-5/+21
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix regression disallowing 64K SVM migration (Maarten) - Use a bounce buffer for WA BB (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/aEsBQoh5Si3ouPgE@fedora
2025-06-12Merge tag 'bitmap-for-6.16-rc2' of https://github.com/norov/linuxLinus Torvalds1-2/+2
Pull bitmap fix from Yury Norov: "Fix for __GENMASK() and __GENMASK_ULL() in UAPI" * tag 'bitmap-for-6.16-rc2' of https://github.com/norov/linux: uapi: bitops: use UAPI-safe variant of BITS_PER_LONG again
2025-06-12smb: improve directory cache reuse for readdir operationsBharath SM2-17/+19
Currently, cached directory contents were not reused across subsequent 'ls' operations because the cache validity check relied on comparing the ctx pointer, which changes with each readdir invocation. As a result, the cached dir entries was not marked as valid and the cache was not utilized for subsequent 'ls' operations. This change uses the file pointer, which remains consistent across all readdir calls for a given directory instance, to associate and validate the cache. As a result, cached directory contents can now be correctly reused, improving performance for repeated directory listings. Performance gains with local windows SMB server: Without the patch and default actimeo=1: 1000 directory enumeration operations on dir with 10k files took 135.0s With this patch and actimeo=0: 1000 directory enumeration operations on dir with 10k files took just 5.1s Signed-off-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-12smb: client: fix perf regression with deferred closesPaulo Alcantara1-3/+6
Customer reported that one of their applications started failing to open files with STATUS_INSUFFICIENT_RESOURCES due to NetApp server hitting the maximum number of opens to same file that it would allow for a single client connection. It turned out the client was failing to reuse open handles with deferred closes because matching ->f_flags directly without masking off O_CREAT|O_EXCL|O_TRUNC bits first broke the comparision and then client ended up with thousands of deferred closes to same file. Those bits are already satisfied on the original open, so no need to check them against existing open handles. Reproducer: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <fcntl.h> #include <pthread.h> #define NR_THREADS 4 #define NR_ITERATIONS 2500 #define TEST_FILE "/mnt/1/test/dir/foo" static char buf[64]; static void *worker(void *arg) { int i, j; int fd; for (i = 0; i < NR_ITERATIONS; i++) { fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_APPEND, 0666); for (j = 0; j < 16; j++) write(fd, buf, sizeof(buf)); close(fd); } } int main(int argc, char *argv[]) { pthread_t t[NR_THREADS]; int fd; int i; fd = open(TEST_FILE, O_WRONLY|O_CREAT|O_TRUNC, 0666); close(fd); memset(buf, 'a', sizeof(buf)); for (i = 0; i < NR_THREADS; i++) pthread_create(&t[i], NULL, worker, NULL); for (i = 0; i < NR_THREADS; i++) pthread_join(t[i], NULL); return 0; } Before patch: $ mount.cifs //srv/share /mnt/1 -o ... $ mkdir -p /mnt/1/test/dir $ gcc repro.c && ./a.out ... number of opens: 1391 After patch: $ mount.cifs //srv/share /mnt/1 -o ... $ mkdir -p /mnt/1/test/dir $ gcc repro.c && ./a.out ... number of opens: 1 Cc: linux-cifs@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: Jay Shin <jaeshin@redhat.com> Cc: Pierguido Lambri <plambri@redhat.com> Fixes: b8ea3b1ff544 ("smb: enable reuse of deferred file handles for write operations") Acked-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-12Merge tag 'net-6.16-rc2' of ↵Linus Torvalds74-555/+818
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth and wireless. Current release - regressions: - af_unix: allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD Current release - new code bugs: - eth: airoha: correct enable mask for RX queues 16-31 - veth: prevent NULL pointer dereference in veth_xdp_rcv when peer disappears under traffic - ipv6: move fib6_config_validate() to ip6_route_add(), prevent invalid routes Previous releases - regressions: - phy: phy_caps: don't skip better duplex match on non-exact match - dsa: b53: fix untagged traffic sent via cpu tagged with VID 0 - Revert "wifi: mwifiex: Fix HT40 bandwidth issue.", it caused transient packet loss, exact reason not fully understood, yet Previous releases - always broken: - net: clear the dst when BPF is changing skb protocol (IPv4 <> IPv6) - sched: sfq: fix a potential crash on gso_skb handling - Bluetooth: intel: improve rx buffer posting to avoid causing issues in the firmware - eth: intel: i40e: make reset handling robust against multiple requests - eth: mlx5: ensure FW pages are always allocated on the local NUMA node, even when device is configure to 'serve' another node - wifi: ath12k: fix GCC_GCC_PCIE_HOT_RST definition for WCN7850, prevent kernel crashes - wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request() for 3 sec if fw_stats_done is not set" * tag 'net-6.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (70 commits) selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS context net: ethtool: Don't check if RSS context exists in case of context 0 af_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD. ipv6: Move fib6_config_validate() to ip6_route_add(). net: drv: netdevsim: don't napi_complete() from netpoll net/mlx5: HWS, Add error checking to hws_bwc_rule_complex_hash_node_get() veth: prevent NULL pointer dereference in veth_xdp_rcv net_sched: remove qdisc_tree_flush_backlog() net_sched: ets: fix a race in ets_qdisc_change() net_sched: tbf: fix a race in tbf_change() net_sched: red: fix a race in __red_change() net_sched: prio: fix a race in prio_tune() net_sched: sch_sfq: reject invalid perturb period net: phy: phy_caps: Don't skip better duplex macth on non-exact match MAINTAINERS: Update Kuniyuki Iwashima's email address. selftests: net: add test case for NAT46 looping back dst net: clear the dst when changing skb protocol net/mlx5e: Fix number of lanes to UNKNOWN when using data_rate_oper net/mlx5e: Fix leak of Geneve TLV option object net/mlx5: HWS, make sure the uplink is the last destination ...
2025-06-12drm/xe/lrc: Use a temporary buffer for WA BBLucas De Marchi1-4/+20
In case the BO is in iomem, we can't simply take the vaddr and write to it. Instead, prepare a separate buffer that is later copied into io memory. Right now it's just a few words that could be using xe_map_write32(), but the intention is to grow the WA BB for other uses. Fixes: 617d824c5323 ("drm/xe: Add WA BB to capture active context utilization") Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://lore.kernel.org/r/20250604-wa-bb-fix-v1-1-0dfc5dafcef0@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit ef48715b2d3df17c060e23b9aa636af3d95652f8) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-12Merge tag 'pinctrl-v6.16-2' of ↵Linus Torvalds64-80/+17
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Add some missing pins on the Qualcomm QCM2290, along with a managed resources patch that make it clean and nice - Drop an unused function in the ST Micro driver - Drop bouncing MAINTAINER entry - Drop of_match_ptr() macro to rid compile warnings in the TB10x driver - Fix up calculation of pin numbers from base in the Sunxi driver * tag 'pinctrl-v6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: sunxi: dt: Consider pin base when calculating bank number from pin pinctrl: tb10x: Drop of_match_ptr for ID table pinctrl: MAINTAINERS: Drop bouncing Jianlong Huang pinctrl: st: Drop unused st_gpio_bank() function pinctrl: qcom: pinctrl-qcm2290: Add missing pins pinctrl: qcom: switch to devm_gpiochip_add_data()
2025-06-12Merge tag 'arc-6.16-rc1' of ↵Linus Torvalds27-67/+53
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - arch_atomic64_cmpxchg relaxed variant [Jason] - use of inbuilt swap in stack unwinder [Yu-Chun Lin] - use of __ASSEMBLER__ in kernel headers [Thomas Huth] * tag 'arc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in the non-uapi headers ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi headers ARC: unwind: Use built-in sort swap to reduce code size and improve performance ARC: atomics: Implement arch_atomic64_cmpxchg using _relaxed
2025-06-12Merge tag 'wireless-2025-06-12' of ↵Jakub Kicinski19-246/+251
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Another quick round of updates: - revert mwifiex HT40 that was causing issues - many ath10k/ath11k/ath12k fixes - re-add some iwlwifi code I lost in a merge - use kfree_sensitive() on an error path in cfg80211 * tag 'wireless-2025-06-12' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: use kfree_sensitive() for connkeys cleanup wifi: iwlwifi: fix merge damage related to iwl_pci_resume Revert "wifi: mwifiex: Fix HT40 bandwidth issue." wifi: ath12k: fix uaf in ath12k_core_init() wifi: ath12k: Fix hal_reo_cmd_status kernel-doc wifi: ath12k: fix GCC_GCC_PCIE_HOT_RST definition for WCN7850 wifi: ath11k: validate ath11k_crypto_mode on top of ath11k_core_qmi_firmware_ready wifi: ath11k: consistently use ath11k_mac_get_fw_stats() wifi: ath11k: move locking outside of ath11k_mac_get_fw_stats() wifi: ath11k: adjust unlock sequence in ath11k_update_stats_event() wifi: ath11k: move some firmware stats related functions outside of debugfs wifi: ath11k: don't wait when there is no vdev started wifi: ath11k: don't use static variables in ath11k_debugfs_fw_stats_process() wifi: ath11k: avoid burning CPU in ath11k_debugfs_fw_stats_request() wil6210: fix support for sparrow chipsets wifi: ath10k: Avoid vdev delete timeout when firmware is already down ath10k: snoc: fix unbalanced IRQ enable in crash recovery ==================== Link: https://patch.msgid.link/20250612082519.11447-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12Merge branch 'fix-ntuple-rules-targeting-default-rss'Jakub Kicinski2-2/+60
Gal Pressman says: ==================== Fix ntuple rules targeting default RSS This series addresses a regression in ethtool flow steering where rules targeting the default RSS context (context 0) were incorrectly rejected. The default RSS context always exists but is not stored in the rss_ctx xarray like additional contexts. The current validation logic was checking for the existence of context 0 in this array, causing valid flow steering rules to be rejected. This prevented configurations such as: - High priority rules directing specific traffic to the default context - Low priority catch-all rules directing remaining traffic to additional contexts Patch 1 fixes the validation logic to skip the existence check for context 0. Patch 2 adds a selftest that verifies this behavior. v3: https://lore.kernel.org/20250609120250.1630125-1-gal@nvidia.com v2: https://lore.kernel.org/20250225071348.509432-1-gal@nvidia.com ==================== Link: https://patch.msgid.link/20250612071958.1696361-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12selftests: drv-net: rss_ctx: Add test for ntuple rules targeting default RSS ↵Gal Pressman1-1/+58
context Add test_rss_default_context_rule() to verify that ntuple rules can correctly direct traffic to the default RSS context (context 0). The test creates two ntuple rules with explicit location priorities: - A high-priority rule (loc 0) directing specific port traffic to context 0. - A low-priority rule (loc 1) directing all other TCP traffic to context 1. This validates that: 1. Rules targeting the default context function properly. 2. Traffic steering works as expected when mixing default and additional RSS contexts. The test was written by AI, and reviewed by humans. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250612071958.1696361-3-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net: ethtool: Don't check if RSS context exists in case of context 0Gal Pressman1-1/+2
Context 0 (default context) always exists, there is no need to check whether it exists or not when adding a flow steering rule. The existing check fails when creating a flow steering rule for context 0 as it is not stored in the rss_ctx xarray. For example: $ ethtool --config-ntuple eth2 flow-type tcp4 dst-ip 194.237.147.23 dst-port 19983 context 0 loc 618 rmgr: Cannot insert RX class rule: Invalid argument Cannot insert classification rule An example usecase for this could be: - A high-priority rule (loc 0) directing specific port traffic to context 0. - A low-priority rule (loc 1) directing all other TCP traffic to context 1. This is a user-visible regression that was caught in our testing environment, it was not reported by a user yet. Fixes: de7f7582dff2 ("net: ethtool: prevent flow steering to RSS contexts which don't exist") Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250612071958.1696361-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12Merge tag 'for-net-2025-06-11' of ↵Jakub Kicinski9-38/+107
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - eir: Fix NULL pointer deference on eir_get_service_data - eir: Fix possible crashes on eir_create_adv_data - hci_sync: Fix broadcast/PA when using an existing instance - ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets - ISO: Fix not using bc_sid as advertisement SID - MGMT: Fix sparse errors * tag 'for-net-2025-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: MGMT: Fix sparse errors Bluetooth: ISO: Fix not using bc_sid as advertisement SID Bluetooth: ISO: Fix using BT_SK_PA_SYNC to detect BIS sockets Bluetooth: eir: Fix possible crashes on eir_create_adv_data Bluetooth: hci_sync: Fix broadcast/PA when using an existing instance Bluetooth: Fix NULL pointer deference on eir_get_service_data ==================== Link: https://patch.msgid.link/20250611204944.1559356-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12af_unix: Allow passing cred for embryo without SO_PASSCRED/SO_PASSPIDFD.Kuniyuki Iwashima1-1/+2
Before the cited commit, the kernel unconditionally embedded SCM credentials to skb for embryo sockets even when both the sender and listener disabled SO_PASSCRED and SO_PASSPIDFD. Now, the credentials are added to skb only when configured by the sender or the listener. However, as reported in the link below, it caused a regression for some programs that assume credentials are included in every skb, but sometimes not now. The only problematic scenario would be that a socket starts listening before setting the option. Then, there will be 2 types of non-small race window, where a client can send skb without credentials, which the peer receives as an "invalid" message (and aborts the connection it seems ?): Client Server ------ ------ s1.listen() <-- No SO_PASS{CRED,PIDFD} s2.connect() s2.send() <-- w/o cred s1.setsockopt(SO_PASS{CRED,PIDFD}) s2.send() <-- w/ cred or Client Server ------ ------ s1.listen() <-- No SO_PASS{CRED,PIDFD} s2.connect() s2.send() <-- w/o cred s3, _ = s1.accept() <-- Inherit cred options s2.send() <-- w/o cred but not set yet s3.setsockopt(SO_PASS{CRED,PIDFD}) s2.send() <-- w/ cred It's unfortunate that buggy programs depend on the behaviour, but let's restore the previous behaviour. Fixes: 3f84d577b79d ("af_unix: Inherit sk_flags at connect().") Reported-by: Jacek Łuczak <difrost.kernel@gmail.com> Closes: https://lore.kernel.org/all/68d38b0b-1666-4974-85d4-15575789c8d4@gmail.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Tested-by: Christian Heusel <christian@heusel.eu> Tested-by: André Almeida <andrealmeid@igalia.com> Tested-by: Jacek Łuczak <difrost.kernel@gmail.com> Link: https://patch.msgid.link/20250611202758.3075858-1-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12ipv6: Move fib6_config_validate() to ip6_route_add().Kuniyuki Iwashima1-55/+55
syzkaller created an IPv6 route from a malformed packet, which has a prefix len > 128, triggering the splat below. [0] This is a similar issue fixed by commit 586ceac9acb7 ("ipv6: Restore fib6_config validation for SIOCADDRT."). The cited commit removed fib6_config validation from some callers of ip6_add_route(). Let's move the validation back to ip6_route_add() and ip6_route_multipath_add(). [0]: UBSAN: array-index-out-of-bounds in ./include/net/ipv6.h:616:34 index 20 is out of range for type '__u8 [16]' CPU: 1 UID: 0 PID: 7444 Comm: syz.0.708 Not tainted 6.16.0-rc1-syzkaller-g19272b37aa4f #0 PREEMPT Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff80078a80>] dump_backtrace+0x2e/0x3c arch/riscv/kernel/stacktrace.c:132 [<ffffffff8000327a>] show_stack+0x30/0x3c arch/riscv/kernel/stacktrace.c:138 [<ffffffff80061012>] __dump_stack lib/dump_stack.c:94 [inline] [<ffffffff80061012>] dump_stack_lvl+0x12e/0x1a6 lib/dump_stack.c:120 [<ffffffff800610a6>] dump_stack+0x1c/0x24 lib/dump_stack.c:129 [<ffffffff8001c0ea>] ubsan_epilogue+0x14/0x46 lib/ubsan.c:233 [<ffffffff819ba290>] __ubsan_handle_out_of_bounds+0xf6/0xf8 lib/ubsan.c:455 [<ffffffff85b363a4>] ipv6_addr_prefix include/net/ipv6.h:616 [inline] [<ffffffff85b363a4>] ip6_route_info_create+0x8f8/0x96e net/ipv6/route.c:3793 [<ffffffff85b635da>] ip6_route_add+0x2a/0x1aa net/ipv6/route.c:3889 [<ffffffff85b02e08>] addrconf_prefix_route+0x2c4/0x4e8 net/ipv6/addrconf.c:2487 [<ffffffff85b23bb2>] addrconf_prefix_rcv+0x1720/0x1e62 net/ipv6/addrconf.c:2878 [<ffffffff85b92664>] ndisc_router_discovery+0x1a06/0x3504 net/ipv6/ndisc.c:1570 [<ffffffff85b99038>] ndisc_rcv+0x500/0x600 net/ipv6/ndisc.c:1874 [<ffffffff85bc2c18>] icmpv6_rcv+0x145e/0x1e0a net/ipv6/icmp.c:988 [<ffffffff85af6798>] ip6_protocol_deliver_rcu+0x18a/0x1976 net/ipv6/ip6_input.c:436 [<ffffffff85af8078>] ip6_input_finish+0xf4/0x174 net/ipv6/ip6_input.c:480 [<ffffffff85af8262>] NF_HOOK include/linux/netfilter.h:317 [inline] [<ffffffff85af8262>] NF_HOOK include/linux/netfilter.h:311 [inline] [<ffffffff85af8262>] ip6_input+0x16a/0x70c net/ipv6/ip6_input.c:491 [<ffffffff85af8dcc>] ip6_mc_input+0x5c8/0x1268 net/ipv6/ip6_input.c:588 [<ffffffff85af6112>] dst_input include/net/dst.h:469 [inline] [<ffffffff85af6112>] ip6_rcv_finish net/ipv6/ip6_input.c:79 [inline] [<ffffffff85af6112>] NF_HOOK include/linux/netfilter.h:317 [inline] [<ffffffff85af6112>] NF_HOOK include/linux/netfilter.h:311 [inline] [<ffffffff85af6112>] ipv6_rcv+0x5ae/0x6e0 net/ipv6/ip6_input.c:309 [<ffffffff85087e84>] __netif_receive_skb_one_core+0x106/0x16e net/core/dev.c:5977 [<ffffffff85088104>] __netif_receive_skb+0x2c/0x144 net/core/dev.c:6090 [<ffffffff850883c6>] netif_receive_skb_internal net/core/dev.c:6176 [inline] [<ffffffff850883c6>] netif_receive_skb+0x1aa/0xbf2 net/core/dev.c:6235 [<ffffffff8328656e>] tun_rx_batched.isra.0+0x430/0x686 drivers/net/tun.c:1485 [<ffffffff8329ed3a>] tun_get_user+0x2952/0x3d6c drivers/net/tun.c:1938 [<ffffffff832a21e0>] tun_chr_write_iter+0xc4/0x21c drivers/net/tun.c:1984 [<ffffffff80b9b9ae>] new_sync_write fs/read_write.c:593 [inline] [<ffffffff80b9b9ae>] vfs_write+0x56c/0xa9a fs/read_write.c:686 [<ffffffff80b9c2be>] ksys_write+0x126/0x228 fs/read_write.c:738 [<ffffffff80b9c42e>] __do_sys_write fs/read_write.c:749 [inline] [<ffffffff80b9c42e>] __se_sys_write fs/read_write.c:746 [inline] [<ffffffff80b9c42e>] __riscv_sys_write+0x6e/0x94 fs/read_write.c:746 [<ffffffff80076912>] syscall_handler+0x94/0x118 arch/riscv/include/asm/syscall.h:112 [<ffffffff8637e31e>] do_trap_ecall_u+0x396/0x530 arch/riscv/kernel/traps.c:341 [<ffffffff863a69e2>] handle_exception+0x146/0x152 arch/riscv/kernel/entry.S:197 Fixes: fa76c1674f2e ("ipv6: Move some validation from ip6_route_info_create() to rtm_to_fib6_config().") Reported-by: syzbot+4c2358694722d304c44e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6849b8c3.a00a0220.1eb5f5.00f0.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611193551.2999991-1-kuni1840@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net: drv: netdevsim: don't napi_complete() from netpollJakub Kicinski1-1/+2
netdevsim supports netpoll. Make sure we don't call napi_complete() from it, since it may not be scheduled. Breno reports hitting a warning in napi_complete_done(): WARNING: CPU: 14 PID: 104 at net/core/dev.c:6592 napi_complete_done+0x2cc/0x560 __napi_poll+0x2d8/0x3a0 handle_softirqs+0x1fe/0x710 This is presumably after netpoll stole the SCHED bit prematurely. Reported-by: Breno Leitao <leitao@debian.org> Fixes: 3762ec05a9fb ("netdevsim: add NAPI support") Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250611174643.2769263-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net/mlx5: HWS, Add error checking to hws_bwc_rule_complex_hash_node_get()Dan Carpenter1-2/+17
Check for if ida_alloc() or rhashtable_lookup_get_insert_fast() fails. Fixes: 17e0accac577 ("net/mlx5: HWS, support complex matchers") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Link: https://patch.msgid.link/aEmBONjyiF6z5yCV@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12veth: prevent NULL pointer dereference in veth_xdp_rcvJesper Dangaard Brouer1-2/+2
The veth peer device is RCU protected, but when the peer device gets deleted (veth_dellink) then the pointer is assigned NULL (via RCU_INIT_POINTER). This patch adds a necessary NULL check in veth_xdp_rcv when accessing the veth peer net_device. This fixes a bug introduced in commit dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops"). The bug is a race and only triggers when having inflight packets on a veth that is being deleted. Reported-by: Ihor Solodrai <ihor.solodrai@linux.dev> Closes: https://lore.kernel.org/all/fecfcad0-7a16-42b8-bff2-66ee83a6e5c4@linux.dev/ Reported-by: syzbot+c4c7bf27f6b0c4bd97fe@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/683da55e.a00a0220.d8eae.0052.GAE@google.com/ Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops") Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Acked-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://patch.msgid.link/174964557873.519608.10855046105237280978.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12Merge branch 'net_sched-no-longer-use-qdisc_tree_flush_backlog'Jakub Kicinski5-12/+4
Eric Dumazet says: ==================== net_sched: no longer use qdisc_tree_flush_backlog() This series is based on a report from Gerrard Tai. Essentially, all users of qdisc_tree_flush_backlog() are racy. We must instead use qdisc_purge_queue(). ==================== Link: https://patch.msgid.link/20250611111515.1983366-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net_sched: remove qdisc_tree_flush_backlog()Eric Dumazet1-8/+0
This function is no longer used after the four prior fixes. Given all prior uses were wrong, it seems better to remove it. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-6-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net_sched: ets: fix a race in ets_qdisc_change()Eric Dumazet1-1/+1
Gerrard Tai reported a race condition in ETS, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net_sched: tbf: fix a race in tbf_change()Eric Dumazet1-1/+1
Gerrard Tai reported a race condition in TBF, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: b05972f01e7d ("net: sched: tbf: don't call qdisc_put() while holding tree lock") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://patch.msgid.link/20250611111515.1983366-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net_sched: red: fix a race in __red_change()Eric Dumazet1-1/+1
Gerrard Tai reported a race condition in RED, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: 0c8d13ac9607 ("net: sched: red: delay destroying child qdisc on replace") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net_sched: prio: fix a race in prio_tune()Eric Dumazet1-1/+1
Gerrard Tai reported a race condition in PRIO, whenever SFQ perturb timer fires at the wrong time. The race is as follows: CPU 0 CPU 1 [1]: lock root [2]: qdisc_tree_flush_backlog() [3]: unlock root | | [5]: lock root | [6]: rehash | [7]: qdisc_tree_reduce_backlog() | [4]: qdisc_put() This can be abused to underflow a parent's qlen. Calling qdisc_purge_queue() instead of qdisc_tree_flush_backlog() should fix the race, because all packets will be purged from the qdisc before releasing the lock. Fixes: 7b8e0b6e6599 ("net: sched: prio: delay destroying child qdiscs on change") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Suggested-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250611111515.1983366-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net_sched: sch_sfq: reject invalid perturb periodEric Dumazet1-2/+8
Gerrard Tai reported that SFQ perturb_period has no range check yet, and this can be used to trigger a race condition fixed in a separate patch. We want to make sure ctl->perturb_period * HZ will not overflow and is positive. Tested: tc qd add dev lo root sfq perturb -10 # negative value : error Error: sch_sfq: invalid perturb period. tc qd add dev lo root sfq perturb 1000000000 # too big : error Error: sch_sfq: invalid perturb period. tc qd add dev lo root sfq perturb 2000000 # acceptable value tc -s -d qd sh dev lo qdisc sfq 8005: root refcnt 2 limit 127p quantum 64Kb depth 127 flows 128 divisor 1024 perturb 2000000sec Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) backlog 0b 0p requeues 0 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Gerrard Tai <gerrard.tai@starlabs.sg> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250611083501.1810459-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12net: phy: phy_caps: Don't skip better duplex macth on non-exact matchMaxime Chevallier1-6/+12
When performing a non-exact phy_caps lookup, we are looking for a supported mode that matches as closely as possible the passed speed/duplex. Blamed patch broke that logic by returning a match too early in case the caller asks for half-duplex, as a full-duplex linkmode may match first, and returned as a non-exact match without even trying to mach on half-duplex modes. Reported-by: Jijie Shao <shaojijie@huawei.com> Closes: https://lore.kernel.org/netdev/20250603102500.4ec743cf@fedora/T/#m22ed60ca635c67dc7d9cbb47e8995b2beb5c1576 Tested-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Fixes: fc81e257d19f ("net: phy: phy_caps: Allow looking-up link caps based on speed and duplex") Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250606094321.483602-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12io_uring: consistently use rcu semantics with sqpoll threadKeith Busch4-15/+38
The sqpoll thread is dereferenced with rcu read protection in one place, so it needs to be annotated as an __rcu type, and should consistently use rcu helpers for access and assignment to make sparse happy. Since most of the accesses occur under the sqd->lock, we can use rcu_dereference_protected() without declaring an rcu read section. Provide a simple helper to get the thread from a locked context. Fixes: ac0b8b327a5677d ("io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()") Signed-off-by: Keith Busch <kbusch@kernel.org> Link: https://lore.kernel.org/r/20250611205343.1821117-1-kbusch@meta.com [axboe: fold in fix for register.c] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-12Merge tag 'cpufreq-arm-fixes-6.16-rc' of ↵Rafael J. Wysocki7-64/+299
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Merge CPUFreq fixes for 6.16-rc from Viresh Kumar: "- Implement CpuId rust abstraction and use it to fix doctest failure (Viresh Kumar). - Minor cleanups in the `# Safety` sections for cpufreq abstractions (Viresh Kumar)." * tag 'cpufreq-arm-fixes-6.16-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: rust: cpu: Add CpuId::current() to retrieve current CPU ID rust: Use CpuId in place of raw CPU numbers rust: cpu: Introduce CpuId abstraction cpufreq: Convert `/// SAFETY` lines to `# Safety` sections
2025-06-11init: fix build warnings about export.hHuacai Chen2-0/+2
After commit a934a57a42f64a4 ("scripts/misc-check: check missing #include <linux/export.h> when W=1") and 7d95680d64ac8e836c ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1"), we get some build warnings with W=1: init/main.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing init/initramfs.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing So fix these build warnings for the init code. Link: https://lkml.kernel.org/r/20250608141235.155206-1-chenhuacai@loongson.cn Fixes: a934a57a42f6 ("scripts/misc-check: check missing #include <linux/export.h> when W=1") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-11MAINTAINERS: add Barry as a THP reviewerBarry Song1-0/+1
I have been actively contributing to mTHP and reviewing related patches for an extended period, and I would like to continue supporting patch reviews. Link: https://lkml.kernel.org/r/20250609002442.1856-1-21cnbao@gmail.com Signed-off-by: Barry Song <baohua@kernel.org> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Baolin Wang <baolin.wang@linux.alibaba.com> Acked-by: Dev Jain <dev.jain@arm.com> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Nico Pache <npache@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-11drivers/rapidio/rio_cm.c: prevent possible heap overwriteAndrew Morton1-0/+3
In riocm_cdev_ioctl(RIO_CM_CHAN_SEND) -> cm_chan_msg_send() -> riocm_ch_send() cm_chan_msg_send() checks that userspace didn't send too much data but riocm_ch_send() failed to check that userspace sent sufficient data. The result is that riocm_ch_send() can write to fields in the rio_ch_chan_hdr which were outside the bounds of the space which cm_chan_msg_send() allocated. Address this by teaching riocm_ch_send() to check that the entire rio_ch_chan_hdr was copied in from userspace. Reported-by: maher azz <maherazz04@gmail.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Linus Torvalds <torvalds@linuxfoundation.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-11mm: close theoretical race where stale TLB entries could lingerRyan Roberts1-0/+2
Commit 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries") described a theoretical race as such: """ Nadav Amit identified a theoretical race between page reclaim and mprotect due to TLB flushes being batched outside of the PTL being held. He described the race as follows: CPU0 CPU1 ---- ---- user accesses memory using RW PTE [PTE now cached in TLB] try_to_unmap_one() ==> ptep_get_and_clear() ==> set_tlb_ubc_flush_pending() mprotect(addr, PROT_READ) ==> change_pte_range() ==> [ PTE non-present - no flush ] user writes using cached RW PTE ... try_to_unmap_flush() The same type of race exists for reads when protecting for PROT_NONE and also exists for operations that can leave an old TLB entry behind such as munmap, mremap and madvise. """ The solution was to introduce flush_tlb_batched_pending() and call it under the PTL from mprotect/madvise/munmap/mremap to complete any pending tlb flushes. However, while madvise_free_pte_range() and madvise_cold_or_pageout_pte_range() were both retro-fitted to call flush_tlb_batched_pending() immediately after initially acquiring the PTL, they both temporarily release the PTL to split a large folio if they stumble upon one. In this case, where re-acquiring the PTL flush_tlb_batched_pending() must be called again, but it previously was not. Let's fix that. There are 2 Fixes: tags here: the first is the commit that fixed madvise_free_pte_range(). The second is the commit that added madvise_cold_or_pageout_pte_range(), which looks like it copy/pasted the faulty pattern from madvise_free_pte_range(). This is a theoretical bug discovered during code review. Link: https://lkml.kernel.org/r/20250606092809.4194056-1-ryan.roberts@arm.com Fixes: 3ea277194daa ("mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries") Fixes: 9c276cc65a58 ("mm: introduce MADV_COLD") Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Jann Horn <jannh@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-11mm/vma: reset VMA iterator on commit_merge() OOM failureLorenzo Stoakes1-18/+4
While an OOM failure in commit_merge() isn't really feasible due to the allocation which might fail (a maple tree pre-allocation) being 'too small to fail', we do need to handle this case correctly regardless. In vma_merge_existing_range(), we can theoretically encounter failures which result in an OOM error in two ways - firstly dup_anon_vma() might fail with an OOM error, and secondly commit_merge() failing, ultimately, to pre-allocate a maple tree node. The abort logic for dup_anon_vma() resets the VMA iterator to the initial range, ensuring that any logic looping on this iterator will correctly proceed to the next VMA. However the commit_merge() abort logic does not do the same thing. This resulted in a syzbot report occurring because mlockall() iterates through VMAs, is tolerant of errors, but ended up with an incorrect previous VMA being specified due to incorrect iterator state. While making this change, it became apparent we are duplicating logic - the logic introduced in commit 41e6ddcaa0f1 ("mm/vma: add give_up_on_oom option on modify/merge, use in uffd release") duplicates the vmg->give_up_on_oom check in both abort branches. Additionally, we observe that we can perform the anon_dup check safely on dup_anon_vma() failure, as this will not be modified should this call fail. Finally, we need to reset the iterator in both cases, so now we can simply use the exact same code to abort for both. We remove the VM_WARN_ON(err != -ENOMEM) as it would be silly for this to be otherwise and it allows us to implement the abort check more neatly. Link: https://lkml.kernel.org/r/20250606125032.164249-1-lorenzo.stoakes@oracle.com Fixes: 47b16d0462a4 ("mm: abort vma_modify() on merge out of memory failure") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: syzbot+d16409ea9ecc16ed261a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-mm/6842cc67.a00a0220.29ac89.003b.GAE@google.com/ Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-11docs: proc: update VmFlags documentation in smapswangfushuai1-1/+3
Remove outdated VM_DENYWRITE("dw") reference and add missing VM_LOCKONFAULT("lf") and VM_UFFD_MINOR("ui") flags. [akpm@linux-foundation.org: add "dp" (VM_DROPPABLE), per Tal] Link: https://lkml.kernel.org/r/20250607153614.81914-1-wangfushuai@baidu.com Signed-off-by: wangfushuai <wangfushuai@baidu.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mariano Pache <npache@redhat.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Tal Zussman <tz2294@columbia.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-11scatterlist: fix extraneous '@'-sign kernel-doc notationRandy Dunlap2-6/+6
Using "@argname@" in kernel-doc produces "argname****" (with "argname" in bold) in the generated html output, so use the expected kernel-doc notation of just "@argname" instead. "Fixes:" lines are added in case Matthew's patch [1] is backported. Link: https://lkml.kernel.org/r/20250605002337.2842659-1-rdunlap@infradead.org Link: https://lore.kernel.org/linux-doc/3bc4e779-7a79-42c1-8867-024f643a22fc@infradead.org/T/#m5d2bd9d21fb34f297aa4e7db069f09bc27b89007 [1] Fixes: 0db9299f48eb ("SG: Move functions to lib/scatterlist.c and add sg chaining allocator helpers") Fixes: 8d1d4b538bb1 ("scatterlist: inline sg_next()") Fixes: 18dabf473e15 ("Change table chaining layout") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-11selftests/mm: skip failed memfd setups in gup_longtermMark Brown1-1/+6
Unlike the other cases gup_longterm's memfd tests previously skipped the test when failing to set up the file descriptor to test. Restore this behavior to avoid hitting failures when hugetlb isn't configured. Link: https://lkml.kernel.org/r/20250605-selftest-mm-gup-longterm-tweaks-v1-1-2fae34b05958@kernel.org Fixes: 66bce7afbaca ("selftests/mm: fix test result reporting in gup_longterm") Signed-off-by: Mark Brown <broonie@kernel.org> Reported-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Closes: https://lkml.kernel.org/r/a76fc252-0fe3-4d4b-a9a1-4a2895c2680d@lucifer.local Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Tested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-12rust: cpu: Add CpuId::current() to retrieve current CPU IDViresh Kumar4-0/+21
Introduce `CpuId::current()`, a constructor that wraps the C function `raw_smp_processor_id()` to retrieve the current CPU identifier without guaranteeing stability. This function should be used only when the caller can ensure that the CPU ID won't change unexpectedly due to preemption or migration. Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
2025-06-12rust: Use CpuId in place of raw CPU numbersViresh Kumar4-27/+59
Use the newly defined `CpuId` abstraction instead of raw CPU numbers. This also fixes a doctest failure for configurations where `nr_cpu_ids < 4`. The C `cpumask_{set|clear}_cpu()` APIs emit a warning when given an invalid CPU number — but only if `CONFIG_DEBUG_PER_CPU_MAPS=y` is set. Meanwhile, `cpumask_weight()` only considers CPUs up to `nr_cpu_ids`, which can cause inconsistencies: a CPU number greater than `nr_cpu_ids` may be set in the mask, yet the weight calculation won't reflect it. This leads to doctest failures when `nr_cpu_ids < 4`, as the test tries to set CPUs 2 and 3: rust_doctest_kernel_cpumask_rs_0.location: rust/kernel/cpumask.rs:180 rust_doctest_kernel_cpumask_rs_0: ASSERTION FAILED at rust/kernel/cpumask.rs:190 Fixes: 8961b8cb3099 ("rust: cpumask: Add initial abstractions") Reported-by: Miguel Ojeda <ojeda@kernel.org> Closes: https://lore.kernel.org/rust-for-linux/CANiq72k3ozKkLMinTLQwvkyg9K=BeRxs1oYZSKhJHY-veEyZdg@mail.gmail.com/ Reported-by: Andreas Hindborg <a.hindborg@kernel.org> Closes: https://lore.kernel.org/all/87qzzy3ric.fsf@kernel.org/ Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
2025-06-12KVM: x86/mmu: Reject direct bits in gpa passed to KVM_PRE_FAULT_MEMORYPaolo Bonzini1-0/+3
Only let userspace pass the same addresses that were used in KVM_SET_USER_MEMORY_REGION (or KVM_SET_USER_MEMORY_REGION2); gpas in the the upper half of the address space are an implementation detail of TDX and KVM. Extracted from a patch by Sean Christopherson <seanjc@google.com>. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-12KVM: x86/mmu: Embed direct bits into gpa for KVM_PRE_FAULT_MEMORYPaolo Bonzini1-1/+5
Bug[*] reported for TDX case when enabling KVM_PRE_FAULT_MEMORY in QEMU. It turns out that @gpa passed to kvm_mmu_do_page_fault() doesn't have shared bit set when the memory attribute of it is shared, and it leads to wrong root in tdp_mmu_get_root_for_fault(). Fix it by embedding the direct bits in the gpa that is passed to kvm_tdp_map_page(), when the memory of the gpa is not private. [*] https://lore.kernel.org/qemu-devel/4a757796-11c2-47f1-ae0d-335626e818fd@intel.com/ Reported-by: Xiaoyao Li <xiaoyao.li@intel.com> Closes: https://lore.kernel.org/qemu-devel/4a757796-11c2-47f1-ae0d-335626e818fd@intel.com/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com> Message-ID: <20250611001018.2179964-1-xiaoyao.li@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-11bcachefs: Don't trace should_be_locked unless changingKent Overstreet1-2/+4
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Ensure that snapshot creation propagates has_case_insensitiveKent Overstreet1-0/+10
We normally can't create a new directory with the case-insensitive option already set - except when we're creating a snapshot. And if casefolding is enabled filesystem wide, we should still set it even though not strictly required, for consistency. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Print devices we're mounting on multi device filesystemsKent Overstreet1-6/+14
Previously, we only ever logged the filesystem UUID. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Don't trust sb->nr_devices in members_to_text()Kent Overstreet1-4/+30
We have to be able to print superblock sections even if they fail to validate (for debugging), so we have to calculate the number of entries from the field size. Reported-by: syzbot+5138f00559ffb3cb3610@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Fix version checks in validate_bset()Kent Overstreet1-5/+11
It seems btree node scan picked up a partially overwritten btree node, and corrected the "bset version older than sb version_min" error - resulting in an invalid superblock with a bad version_min field. Don't run this check at all when we're in btree node scan, and when we do run it, do something saner if the bset version is totally crazy. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: ioctl: avoid stack overflow warningArnd Bergmann1-2/+2
Multiple ioctl handlers individually use a lot of stack space, and clang chooses to inline them into the bch2_fs_ioctl() function, blowing through the warning limit: fs/bcachefs/chardev.c:655:6: error: stack frame size (1032) exceeds limit (1024) in 'bch2_fs_ioctl' [-Werror,-Wframe-larger-than] 655 | long bch2_fs_ioctl(struct bch_fs *c, unsigned cmd, void __user *arg) By marking the largest two of them as noinline_for_stack, no indidual code path ends up using this much, which avoids the warning and reduces the possible total stack usage in the ioctl handler. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Don't pass trans to fsck_err() in gc_accounting_doneKent Overstreet1-1/+3
fsck_err() can return a transaction restart if passed a transaction object - this has always been true when it has to drop locks to prompt for user input, but we're seeing this more now that we're logging the error being corrected in the journal. gc_accounting_done() doesn't call fsck_err() from an actual commit loop, and it doesn't need to be holding btree locks when it calls fsck_err(), so the easy fix here for the unhandled transaction restart is to just not pass it the transaction object. We'll miss out on the fancy new logging, but that's ok. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Fix leak in bch2_fs_recovery() error pathKent Overstreet1-4/+4
Fix a small leak of the superblock 'clean' section. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Fix rcu_pending for PREEMPT_RTKent Overstreet1-11/+11
PREEMPT_RT redefines how standard spinlocks work, so local_irq_save() + spin_lock() is no longer equivalent to spin_lock_irqsave(). Fortunately, we don't strictly need to do it that way. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Fix downgrade_table_extra()Kent Overstreet1-1/+4
Fix a UAF: we were calling darray_make_room() and retaining a pointer to the old buffer. And fix an UBSAN warning: struct bch_sb_field_downgrade_entry uses __counted_by, so set dst->nr_errors before assigning to the array entry. Reported-by: syzbot+14c52d86ddbd89bea13e@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Don't put rhashtable on stackKent Overstreet1-9/+13
Object debugging generally needs special provisions for putting said objects on the stack, which rhashtable does not have. Reported-by: syzbot+bcc38a9556d0324c2ec2@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Make sure opts.read_only gets propagated back to VFSKent Overstreet2-1/+11
If we think we're read-only but the VFS doesn't, fun will ensue. And now that we know we have to be able to do this safely, just make nochanges imply ro. Reported-by: syzbot+a7d6ceaba099cc21dee4@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Fix possible console lock involved deadlockAlan Huang6-20/+8
Link: https://lore.kernel.org/all/6822ab02.050a0220.f2294.00cb.GAE@google.com/T/ Reported-by: syzbot+2c3ef91c9523c3d1a25c@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: mark more errors autofixKent Overstreet1-4/+4
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Don't persistently run scan_for_btree_nodesKent Overstreet2-3/+13
bch2_btree_lost_data() gets called on btree node read error, but the error might be transient. btree_node_scan is expensive, and there's no need to run it persistently (marking it in the superblock as required to run) - check_topology will run it if required, via bch2_get_scanned_nodes(). Running it non-persistently is fine, to avoid check_topology having to rewind recovery to run it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Read error message now prints if self healingKent Overstreet2-2/+10
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Only run 'increase_depth' for keys from btree node csanKent Overstreet1-1/+12
bch2_btree_increase_depth() was originally for disaster recovery, to get some data back from the journal when a btree root was bad. We don't need it for that purpose anymore; on bad btree root we'll launch btree node scan and reconstruct all the interior nodes. If there's a key in the journal for a depth that doesn't exists, and it's not from check_topology/btree node scan, we should just ignore it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Mark need_discard_freespace_key_bad autofixKent Overstreet1-1/+1
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Update /dev/disk/by-uuid on device addKent Overstreet1-0/+16
Invalidate pagecache after we write the new superblock and send a uevent. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Add more flags to btree nodes for rewrite reasonKent Overstreet4-6/+48
It seems excessive forced btree node rewrites can cause interior btree updates to become wedged during recovery, before we're using the write buffer for backpointer updates. Add more flags so we can determine where these are coming from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Add range being updated to btree_update_to_text()Kent Overstreet2-2/+31
We had a deadlock during recovery where interior btree updates became wedged and all open_buckets were consumed; start adding more introspection. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Log fsck errors in the journalKent Overstreet1-0/+3
Log the specific error being corrected in the journal when we're repairing, this helps greatly with 'bcachefs list_journal' analysis. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-11bcachefs: Add missing restart handling to check_topology()Kent Overstreet1-35/+60
The next patch will add logging of the specific error being corrected in repair paths to the journal; this means __bch2_fsck_err() can return transaction restarts in places that previously weren't expecting them. check_topology() is old code that doesn't use btree iterators for btree node locking - it'll have to be rewritten in the future to work online. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-12rust: cpu: Introduce CpuId abstractionViresh Kumar1-0/+110
This adds abstraction for representing a CPU identifier. Suggested-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
2025-06-11Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds4-4/+8
Pull BPF fixes from Alexei Starovoitov - Fix libbpf backward compatibility (Andrii Nakryiko) - Add Stanislav Fomichev as bpf/net reviewer - Fix resolve_btfid build when cross compiling (Suleiman Souhlal) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: MAINTAINERS: Add myself as bpf networking reviewer tools/resolve_btfids: Fix build when cross compiling kernel with clang. libbpf: Handle unsupported mmap-based /sys/kernel/btf/vmlinux correctly
2025-06-11MAINTAINERS: Update Kuniyuki Iwashima's email address.Kuniyuki Iwashima2-3/+6
I left Amazon and joined Google, so let's map the email addresses accordingly. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250610235734.88540-1-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11selftests: net: add test case for NAT46 looping back dstJakub Kicinski2-0/+16
Simple test for crash involving multicast loopback and stale dst. Reuse exising NAT46 program. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250610001245.1981782-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net: clear the dst when changing skb protocolJakub Kicinski1-6/+13
A not-so-careful NAT46 BPF program can crash the kernel if it indiscriminately flips ingress packets from v4 to v6: BUG: kernel NULL pointer dereference, address: 0000000000000000 ip6_rcv_core (net/ipv6/ip6_input.c:190:20) ipv6_rcv (net/ipv6/ip6_input.c:306:8) process_backlog (net/core/dev.c:6186:4) napi_poll (net/core/dev.c:6906:9) net_rx_action (net/core/dev.c:7028:13) do_softirq (kernel/softirq.c:462:3) netif_rx (net/core/dev.c:5326:3) dev_loopback_xmit (net/core/dev.c:4015:2) ip_mc_finish_output (net/ipv4/ip_output.c:363:8) NF_HOOK (./include/linux/netfilter.h:314:9) ip_mc_output (net/ipv4/ip_output.c:400:5) dst_output (./include/net/dst.h:459:9) ip_local_out (net/ipv4/ip_output.c:130:9) ip_send_skb (net/ipv4/ip_output.c:1496:8) udp_send_skb (net/ipv4/udp.c:1040:8) udp_sendmsg (net/ipv4/udp.c:1328:10) The output interface has a 4->6 program attached at ingress. We try to loop the multicast skb back to the sending socket. Ingress BPF runs as part of netif_rx(), pushes a valid v6 hdr and changes skb->protocol to v6. We enter ip6_rcv_core which tries to use skb_dst(). But the dst is still an IPv4 one left after IPv4 mcast output. Clear the dst in all BPF helpers which change the protocol. Try to preserve metadata dsts, those may carry non-routing metadata. Cc: stable@vger.kernel.org Reviewed-by: Maciej Żenczykowski <maze@google.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Fixes: d219df60a70e ("bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()") Fixes: 1b00e0dfe7d0 ("bpf: update skb->protocol in bpf_skb_net_grow") Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250610001245.1981782-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11mm: pgtable: fix pte_swp_exclusiveMagnus Lindholm26-27/+27
Make pte_swp_exclusive return bool instead of int. This will better reflect how pte_swp_exclusive is actually used in the code. This fixes swap/swapoff problems on Alpha due pte_swp_exclusive not returning correct values when _PAGE_SWP_EXCLUSIVE bit resides in upper 32-bits of PTE (like on alpha). Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Magnus Lindholm <linmag7@gmail.com> Cc: Sam James <sam@gentoo.org> Link: https://lore.kernel.org/lkml/20250218175735.19882-2-linmag7@gmail.com/ Link: https://lore.kernel.org/lkml/20250602041118.GA2675383@ZenIV/ [ Applied as the 'sed' script Al suggested - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-11Merge branch 'mlx5-misc-fixes-2025-06-10'Jakub Kicinski9-28/+40
Mark Bloch says: ==================== mlx5 misc fixes 2025-06-10 This patchset includes misc fixes from the team for the mlx5 core and Ethernet drivers. ==================== Link: https://patch.msgid.link/20250610151514.1094735-1-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5e: Fix number of lanes to UNKNOWN when using data_rate_operShahar Shitrit1-4/+1
When the link is up, either eth_proto_oper or ext_eth_proto_oper typically reports the active link protocol, from which both speed and number of lanes can be retrieved. However, in certain cases, such as when a NIC is connected via a non-standard cable, the firmware may not report the protocol. In such scenarios, the speed can still be obtained from the data_rate_oper field in PTYS register. Since data_rate_oper provides only speed information and lacks lane details, it is incorrect to derive the number of lanes from it. This patch corrects the behavior by setting the number of lanes to UNKNOWN instead of incorrectly using MAX_LANES when relying on data_rate_oper. Fixes: 7e959797f021 ("net/mlx5e: Enable lanes configuration when auto-negotiation is off") Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-10-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5e: Fix leak of Geneve TLV option objectJianbo Liu1-6/+6
Previously, a unique tunnel id was added for the matching on TC non-zero chains, to support inner header rewrite with goto action. Later, it was used to support VF tunnel offload for vxlan, then for Geneve and GRE. To support VF tunnel, a temporary mlx5_flow_spec is used to parse tunnel options. For Geneve, if there is TLV option, a object is created, or refcnt is added if already exists. But the temporary mlx5_flow_spec is directly freed after parsing, which causes the leak because no information regarding the object is saved in flow's mlx5_flow_spec, which is used to free the object when deleting the flow. To fix the leak, call mlx5_geneve_tlv_option_del() before free the temporary spec if it has TLV object. Fixes: 521933cdc4aa ("net/mlx5e: Support Geneve and GRE with VF tunnel offload") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Alex Lazar <alazar@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-9-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5: HWS, make sure the uplink is the last destinationVlad Dogaru3-7/+11
When there are more than one destinations, we create a FW flow table and provide it with all the destinations. FW requires to have wire as the last destination in the list (if it exists), otherwise the operation fails with FW syndrome. This patch fixes the destination array action creation: if it contains a wire destination, it is moved to the end. Fixes: 504e536d9010 ("net/mlx5: HWS, added actions handling") Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-7-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5: HWS, fix missing ip_version handling in definerYevgeny Kliteynik1-0/+3
Fix missing field handling in definer - outer IP version. Fixes: 74a778b4a63f ("net/mlx5: HWS, added definers handling") Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5: HWS, Init mutex on the correct pathVlad Dogaru1-1/+1
The newly introduced mutex is only used for reformat actions, but it was initialized for modify header instead. The struct that contains the mutex is zero-initialized and an all-zero mutex is valid, so the issue only shows up with CONFIG_DEBUG_MUTEXES. Fixes: b206d9ec19df ("net/mlx5: HWS, register reformat actions with fw") Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5: Fix return value when searching for existing flow groupPatrisious Haddad1-1/+4
When attempting to add a rule to an existing flow group, if a matching flow group exists but is not active, the error code returned should be EAGAIN, so that the rule can be added to the matching flow group once it is active, rather than ENOENT, which indicates that no matching flow group was found. Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel") Signed-off-by: Gavi Teitz <gavi@nvidia.com> Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5: Fix ECVF vports unload on shutdown flowAmir Tzin1-8/+13
Fix shutdown flow UAF when a virtual function is created on the embedded chip (ECVF) of a BlueField device. In such case the vport acl ingress table is not properly destroyed. ECVF functionality is independent of ecpf_vport_exists capability and thus functions mlx5_eswitch_(enable|disable)_pf_vf_vports() should not test it when enabling/disabling ECVF vports. kernel log: [] refcount_t: underflow; use-after-free. [] WARNING: CPU: 3 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0x124/0x220 ---------------- [] Call trace: [] refcount_warn_saturate+0x124/0x220 [] tree_put_node+0x164/0x1e0 [mlx5_core] [] mlx5_destroy_flow_table+0x98/0x2c0 [mlx5_core] [] esw_acl_ingress_table_destroy+0x28/0x40 [mlx5_core] [] esw_acl_ingress_lgcy_cleanup+0x80/0xf4 [mlx5_core] [] esw_legacy_vport_acl_cleanup+0x44/0x60 [mlx5_core] [] esw_vport_cleanup+0x64/0x90 [mlx5_core] [] mlx5_esw_vport_disable+0xc0/0x1d0 [mlx5_core] [] mlx5_eswitch_unload_ec_vf_vports+0xcc/0x150 [mlx5_core] [] mlx5_eswitch_disable_sriov+0x198/0x2a0 [mlx5_core] [] mlx5_device_disable_sriov+0xb8/0x1e0 [mlx5_core] [] mlx5_sriov_detach+0x40/0x50 [mlx5_core] [] mlx5_unload+0x40/0xc4 [mlx5_core] [] mlx5_unload_one_devl_locked+0x6c/0xe4 [mlx5_core] [] mlx5_unload_one+0x3c/0x60 [mlx5_core] [] shutdown+0x7c/0xa4 [mlx5_core] [] pci_device_shutdown+0x3c/0xa0 [] device_shutdown+0x170/0x340 [] __do_sys_reboot+0x1f4/0x2a0 [] __arm64_sys_reboot+0x2c/0x40 [] invoke_syscall+0x78/0x100 [] el0_svc_common.constprop.0+0x54/0x184 [] do_el0_svc+0x30/0xac [] el0_svc+0x48/0x160 [] el0t_64_sync_handler+0xa4/0x12c [] el0t_64_sync+0x1a4/0x1a8 [] --[ end trace 9c4601d68c70030e ]--- Fixes: a7719b29a821 ("net/mlx5: Add management of EC VF vports") Reviewed-by: Daniel Jurgens <danielj@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Amir Tzin <amirtz@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11net/mlx5: Ensure fw pages are always allocated on same NUMAMoshe Shemesh1-1/+1
When firmware asks the driver to allocate more pages, using event of give_pages, the driver should always allocate it from same NUMA, the original device NUMA. Current code uses dev_to_node() which can result in different NUMA as it is changed by other driver flows, such as mlx5_dma_zalloc_coherent_node(). Instead, use saved numa node for allocating firmware pages. Fixes: 311c7c71c9bb ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250610151514.1094735-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11Merge branch '40GbE' of ↵Jakub Kicinski5-8/+40
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-06-10 (i40e, iavf, ice, e1000) For i40e: Robert Malz improves reset handling for situations where multiple reset requests could cause some to be missed. For iavf: Ahmed adds detection, and handling, of reset that could occur early in the initialization process to stop long wait/hangs. For ice: Anton, properly, sets missed use_nsecs value. For e1000: Joe Damato moves cancel_work_sync() call to avoid deadlock. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000: Move cancel_work_sync to avoid deadlock ice/ptp: fix crosstimestamp reporting iavf: fix reset_task for early reset event i40e: retry VFLR handling if there is ongoing VF reset i40e: return false from i40e_reset_vf if reset is in progress ==================== Link: https://patch.msgid.link/20250610171348.1476574-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-11Bluetooth: MGMT: Fix sparse errorsLuiz Augusto von Dentz1-2/+2
This fixes the following errors: net/bluetooth/mgmt.c:5400:59: sparse: sparse: incorrect type in argument 3 (different base types) @@ expected unsigned short [usertype] handle @@ got restricted __le16 [usertype] monitor_handle @@ net/bluetooth/mgmt.c:5400:59: sparse: expected unsigned short [usertype] handle net/bluetooth/mgmt.c:5400:59: sparse: got restricted __le16 [usertype] monitor_handle Fixes: e6ed54e86aae ("Bluetooth: MGMT: Fix UAF on mgmt_remove_adv_monitor_complete") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506060347.ux2O1p7L-lkp@intel.com/ Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-06-11Bluetooth: ISO: Fix not using bc_sid as advertisement SIDLuiz Augusto von Dentz6-20/+72
Currently bc_sid is being ignore when acting as Broadcast Source role, so this fix it by passing the bc_sid and then use it when programming the PA: < HCI Command: LE Set Exte.. (0x08|0x0036) plen 25 Handle: 0x01 Properties: 0x0000 Min advertising interval: 140.000 msec (0x00e0) Max advertising interval: 140.000 msec (0x00e0) Channel map: 37, 38, 39 (0x07) Own address type: Random (0x01) Peer address type: Public (0x00) Peer address: 00:00:00:00:00:00 (OUI 00-00-00) Filter policy: Allow Scan Request from Any, Allow Connect Request from Any (0x00) TX power: Host has no preference (0x7f) Primary PHY: LE 1M (0x01) Secondary max skip: 0x00 Secondary PHY: LE 2M (0x02) SID: 0x01 Scan request notifications: Disabled (0x00) Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-06-11Bluetooth: ISO: Fix using BT_SK_PA_SYNC to detect BIS socketsLuiz Augusto von Dentz1-1/+4
BT_SK_PA_SYNC is only valid for Broadcast Sinks which means socket used for Broadcast Sources wouldn't be able to use the likes of getpeername to read out the sockaddr_iso_bc fields which may have been update (e.g. bc_sid). Fixes: 0a766a0affb5 ("Bluetooth: ISO: Fix getpeername not returning sockaddr_iso_bc fields") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-06-11Bluetooth: eir: Fix possible crashes on eir_create_adv_dataLuiz Augusto von Dentz3-6/+8
eir_create_adv_data may attempt to add EIR_FLAGS and EIR_TX_POWER without checking if that would fit. Link: https://github.com/bluez/bluez/issues/1117#issuecomment-2958244066 Fixes: 01ce70b0a274 ("Bluetooth: eir: Move EIR/Adv Data functions to its own file") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-06-11Bluetooth: hci_sync: Fix broadcast/PA when using an existing instanceLuiz Augusto von Dentz1-5/+15
When using and existing adv_info instance for broadcast source it needs to be updated to periodic first before it can be reused, also in case the existing instance already have data hci_set_adv_instance_data cannot be used directly since it would overwrite the existing data so this reappend the original data after the Broadcast ID, if one was generated. Example: bluetoothctl># Add PBP to EA so it can be later referenced as the BIS ID bluetoothctl> advertise.service 0x1856 0x00 0x00 bluetoothctl> advertise on ... < HCI Command: LE Set Extended Advertising Data (0x08|0x0037) plen 13 Handle: 0x01 Operation: Complete extended advertising data (0x03) Fragment preference: Minimize fragmentation (0x01) Data length: 0x09 Service Data: Public Broadcast Announcement (0x1856) Data[2]: 0000 Flags: 0x06 LE General Discoverable Mode BR/EDR Not Supported ... bluetoothctl># Attempt to acquire Broadcast Source transport bluetoothctl>transport.acquire /org/bluez/hci0/pac_bcast0/fd0 ... < HCI Command: LE Set Extended Advertising Data (0x08|0x0037) plen 255 Handle: 0x01 Operation: Complete extended advertising data (0x03) Fragment preference: Minimize fragmentation (0x01) Data length: 0x0e Service Data: Broadcast Audio Announcement (0x1852) Broadcast ID: 11371620 (0xad8464) Service Data: Public Broadcast Announcement (0x1856) Data[2]: 0000 Flags: 0x06 LE General Discoverable Mode BR/EDR Not Supported Link: https://github.com/bluez/bluez/issues/1117 Fixes: eca0ae4aea66 ("Bluetooth: Add initial implementation of BIS connections") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-06-11Bluetooth: Fix NULL pointer deference on eir_get_service_dataLuiz Augusto von Dentz1-4/+6
The len parameter is considered optional so it can be NULL so it cannot be used for skipping to next entry of EIR_SERVICE_DATA. Fixes: 8f9ae5b3ae80 ("Bluetooth: eir: Add helpers for managing service data") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-06-11KEYS: Invert FINAL_PUT bitHerbert Xu3-5/+6
Invert the FINAL_PUT bit so that test_bit_acquire and clear_bit_unlock can be used instead of smp_mb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-11KVM: SEV: Disable SEV-SNP support on initialization failureAshish Kalra1-9/+35
During platform init, SNP initialization may fail for several reasons, such as firmware command failures and incompatible versions. However, the KVM capability may continue to advertise support for it. The platform may have SNP enabled but if SNP_INIT fails then SNP is not supported by KVM. During KVM module initialization query the SNP platform status to obtain the SNP initialization state and use it as an additional condition to determine support for SEV-SNP. Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Co-developed-by: Pratik R. Sampat <prsampat@amd.com> Signed-off-by: Pratik R. Sampat <prsampat@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Reviewed-by: Pavan Kumar Paluri <papaluri@amd.com> Message-ID: <20250512221634.12045-1-Ashish.Kalra@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-06-11Merge tag 'kvmarm-fixes-6.16-2' of ↵Paolo Bonzini16-116/+151
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.16, take #2 - Rework of system register accessors for system registers that are directly writen to memory, so that sanitisation of the in-memory value happens at the correct time (after the read, or before the write). For convenience, RMW-style accessors are also provided. - Multiple fixes for the so-called "arch-timer-edge-cases' selftest, which was always broken.
2025-06-11spi: stm32-ospi: clean up on error in probe()Dan Carpenter1-2/+4
If reset_control_acquire() fails, then we can't return directly. We need to do a little clean up first. Fixes: cf2c3eceb757 ("spi: stm32-ospi: Make usage of reset_control_acquire/release() API") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/aEmAGTUzzKZlLe3K@stanley.mountain Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-11block: use plug request list tail for one-shot backmerge attemptJens Axboe1-13/+13
Previously, the block layer stored the requests in the plug list in LIFO order. For this reason, blk_attempt_plug_merge() would check just the head entry for a back merge attempt, and abort after that unless requests for multiple queues existed in the plug list. If more than one request is present in the plug list, this makes the one-shot back merging less useful than before, as it'll always fail to find a quick merge candidate. Use the tail entry for the one-shot merge attempt, which is the last added request in the list. If that fails, abort immediately unless there are multiple queues available. If multiple queues are available, then scan the list. Ideally the latter scan would be a backwards scan of the list, but as it currently stands, the plug list is singly linked and hence this isn't easily feasible. Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-block/20250611121626.7252-1-abuehaze@amazon.com/ Reported-by: Hazem Mohamed Abuelfotoh <abuehaze@amazon.com> Fixes: e70c301faece ("block: don't reorder requests in blk_add_rq_to_plug") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-11block: don't use submit_bio_noacct_nocheck in blk_zone_wplug_bio_workChristoph Hellwig1-2/+5
Bios queued up in the zone write plug have already gone through all all preparation in the submit_bio path, including the freeze protection. Submitting them through submit_bio_noacct_nocheck duplicates the work and can can cause deadlocks when freezing a queue with pending bio write plugs. Go straight to ->submit_bio or blk_mq_submit_bio to bypass the superfluous extra freeze protection and checks. Fixes: 9b1ce7f0c6f8 ("block: Implement zone append emulation") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20250611044416.2351850-1-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-11block: Clear BIO_EMULATES_ZONE_APPEND flag on BIO completionDamien Le Moal1-0/+1
When blk_zone_write_plug_bio_endio() is called for a regular write BIO used to emulate a zone append operation, that is, a BIO flagged with BIO_EMULATES_ZONE_APPEND, the BIO operation code is restored to the original REQ_OP_ZONE_APPEND but the BIO_EMULATES_ZONE_APPEND flag is not cleared. Clear it to fully return the BIO to its orginal definition. Fixes: 9b1ce7f0c6f8 ("block: Implement zone append emulation") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20250611005915.89843-1-dlemoal@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-11ublk: document auto buffer registration(UBLK_F_AUTO_BUF_REG)Ming Lei1-0/+75
Document recently merged feature auto buffer registration(UBLK_F_AUTO_BUF_REG). Cc: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250611085632.109719-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-11loop: move lo_set_size() out of queue freezeMing Lei1-6/+5
It isn't necessary to freeze queue for updating disk size given submit_bio() doesn't grab queue usage counter for checking eod. Also many driver won't freeze queue for calling set_capacity_and_notify(). Move lo_set_size() out of queue freeze for fixing many lockdep warning report. Link: https://lore.kernel.org/linux-block/67ea99e0.050a0220.3c3d88.0042.GAE@google.com/ Reported-by: syzbot+9dd7dbb1a4b915dee638@syzkaller.appspotmail.com Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250611084938.108829-1-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-11net/mdiobus: Fix potential out-of-bounds clause 45 read/write accessJakub Raczynski1-0/+6
When using publicly available tools like 'mdio-tools' to read/write data from/to network interface and its PHY via C45 (clause 45) mdiobus, there is no verification of parameters passed to the ioctl and it accepts any mdio address. Currently there is support for 32 addresses in kernel via PHY_MAX_ADDR define, but it is possible to pass higher value than that via ioctl. While read/write operation should generally fail in this case, mdiobus provides stats array, where wrong address may allow out-of-bounds read/write. Fix that by adding address verification before C45 read/write operation. While this excludes this access from any statistics, it improves security of read/write operation. Fixes: 4e4aafcddbbf ("net: mdio: Add dedicated C45 API to MDIO bus drivers") Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com> Reported-by: Wenjing Shan <wenjing.shan@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-06-11net/mdiobus: Fix potential out-of-bounds read/write accessJakub Raczynski1-0/+6
When using publicly available tools like 'mdio-tools' to read/write data from/to network interface and its PHY via mdiobus, there is no verification of parameters passed to the ioctl and it accepts any mdio address. Currently there is support for 32 addresses in kernel via PHY_MAX_ADDR define, but it is possible to pass higher value than that via ioctl. While read/write operation should generally fail in this case, mdiobus provides stats array, where wrong address may allow out-of-bounds read/write. Fix that by adding address verification before read/write operation. While this excludes this access from any statistics, it improves security of read/write operation. Fixes: 080bb352fad00 ("net: phy: Maintain MDIO device and bus statistics") Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com> Reported-by: Wenjing Shan <wenjing.shan@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-06-11udmabuf: use sgtable-based scatterlist wrappersMarek Szyprowski1-3/+2
Use common wrappers operating directly on the struct sg_table objects to fix incorrect use of scatterlists sync calls. dma_sync_sg_for_*() functions have to be called with the number of elements originally passed to dma_map_sg_*() function, not the one returned in sgtable's nents. Fixes: 1ffe09590121 ("udmabuf: fix dma-buf cpu access") CC: stable@vger.kernel.org Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250507160913.2084079-3-m.szyprowski@samsung.com
2025-06-11dma-buf: fix compare in WARN_ON_ONCEChristian König1-1/+1
Smatch pointed out this trivial typo: drivers/dma-buf/dma-buf.c:1123 dma_buf_map_attachment() warn: passing positive error code '16' to 'ERR_PTR' drivers/dma-buf/dma-buf.c 1113 dma_resv_assert_held(attach->dmabuf->resv); 1114 1115 if (dma_buf_pin_on_map(attach)) { 1116 ret = attach->dmabuf->ops->pin(attach); 1117 /* 1118 * Catch exporters making buffers inaccessible even when 1119 * attachments preventing that exist. 1120 */ 1121 WARN_ON_ONCE(ret == EBUSY); ^^^^^ This was probably intended to be -EBUSY? 1122 if (ret) --> 1123 return ERR_PTR(ret); ^^^ Otherwise we will eventually crash. 1124 } 1125 1126 sg_table = attach->dmabuf->ops->map_dma_buf(attach, direction); 1127 if (!sg_table) 1128 sg_table = ERR_PTR(-ENOMEM); 1129 if (IS_ERR(sg_table)) 1130 goto error_unpin; 1131 Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Link: https://lore.kernel.org/r/20250605085336.62156-1-christian.koenig@amd.com
2025-06-11wifi: cfg80211: use kfree_sensitive() for connkeys cleanupZilin Guan1-1/+1
The nl80211_parse_connkeys() function currently uses kfree() to release the 'result' structure in error handling paths. However, if an error occurs due to result->def being less than 0, the 'result' structure may contain sensitive information. To prevent potential leakage of sensitive data, replace kfree() with kfree_sensitive() when freeing 'result'. This change aligns with the approach used in its caller, nl80211_join_ibss(), enhancing the overall security of the wireless subsystem. Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Link: https://patch.msgid.link/20250523110156.4017111-1-zilin@seu.edu.cn Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-11Merge tag 'ath-current-20250608' of ↵Johannes Berg16-237/+228
git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath Jeff Johnson says: ================== ath.git updates for v6.16-rc2 Fix a handful of both build and stability issues across multiple drivers. ================== Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-11wifi: iwlwifi: fix merge damage related to iwl_pci_resumeEmmanuel Grumbach1-4/+20
The changes I made in wifi: iwlwifi: don't warn if the NIC is gone in resume conflicted with a major rework done in this area. The merge de-facto removed my changes. Re-add them. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Link: https://patch.msgid.link/20250603091754.70182-1-emmanuel.grumbach@intel.com Fixes: 06c4b2036818 ("Merge tag 'iwlwifi-next-2025-05-15' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-11Revert "wifi: mwifiex: Fix HT40 bandwidth issue."Francesco Dolcini1-4/+2
This reverts commit 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.") That commit introduces a regression, when HT40 mode is enabled, received packets are lost, this was experience with W8997 with both SDIO-UART and SDIO-SDIO variants. From an initial investigation the issue solves on its own after some time, but it's not clear what is the reason. Given that this was just a performance optimization, let's revert it till we have a better understanding of the issue and a proper fix. Cc: Jeff Chen <jeff.chen_1@nxp.com> Cc: stable@vger.kernel.org Fixes: 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.") Closes: https://lore.kernel.org/all/20250603203337.GA109929@francesco-nb/ Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20250605130302.55555-1-francesco@dolcini.it [fix commit reference format] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-06-11macsec: MACsec SCI assignment for ES = 0Carlos Fernandez1-6/+34
According to 802.1AE standard, when ES and SC flags in TCI are zero, used SCI should be the current active SC_RX. Current code uses the header MAC address. Without this patch, when ES flag is 0 (using a bridge or switch), header MAC will not fit the SCI and MACSec frames will be discarted. In order to test this issue, MACsec link should be stablished between two interfaces, setting SC and ES flags to zero and a port identifier different than one. For example, using ip macsec tools: ip link add link $ETH0 macsec0 type macsec port 11 send_sci off end_station off ip macsec add macsec0 tx sa 0 pn 2 on key 01 $ETH1_KEY ip macsec add macsec0 rx port 11 address $ETH1_MAC ip macsec add macsec0 rx port 11 address $ETH1_MAC sa 0 pn 2 on key 02 ip link set dev macsec0 up ip link add link $ETH1 macsec1 type macsec port 11 send_sci off end_station off ip macsec add macsec1 tx sa 0 pn 2 on key 01 $ETH0_KEY ip macsec add macsec1 rx port 11 address $ETH0_MAC ip macsec add macsec1 rx port 11 address $ETH0_MAC sa 0 pn 2 on key 02 ip link set dev macsec1 up Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Co-developed-by: Andreu Montiel <Andreu.Montiel@technica-engineering.de> Signed-off-by: Andreu Montiel <Andreu.Montiel@technica-engineering.de> Signed-off-by: Carlos Fernandez <carlos.fernandez@technica-engineering.de> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-06-11drm/sitronix: st7571-i2c: Select VIDEOMODE_HELPERSNathan Chancellor1-0/+1
This driver requires of_get_display_timing() from CONFIG_VIDEOMODE_HELPERS but does not select it. If no other driver selects it, there will be a failure from the linker if the driver is built in or modpost if it is a module. ERROR: modpost: "of_get_display_timing" [drivers/gpu/drm/sitronix/st7571-i2c.ko] undefined! Select CONFIG_VIDEOMODE_HELPERS to resolve the build failure. Fixes: 4b35f0f41ee2 ("drm/st7571-i2c: add support for Sitronix ST7571 LCD controller") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20250610-drm-st7571-i2c-select-videomode-helpers-v1-1-d30b50ff6e64@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-11crypto: hkdf - move to late_initcallEric Biggers1-1/+1
The HKDF self-tests depend on the HMAC algorithms being registered. HMAC is now registered at module_init, which put it at the same level as HKDF. Move HKDF to late_initcall so that it runs afterwards. Fixes: ef93f1562803 ("Revert "crypto: run initcalls for generic implementations earlier"") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2025-06-10net: airoha: Enable RX queues 16-31Lorenzo Bianconi1-1/+2
Fix RX_DONE_INT_MASK definition in order to enable RX queues 16-31. Fixes: f252493e18353 ("net: airoha: Enable multiple IRQ lines support in airoha_eth driver.") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250609-aioha-fix-rx-queue-mask-v1-1-f33706a06fa2@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-10netconsole: fix appending sysdata when sysdata_fields == SYSDATA_RELEASEGustavo Luiz Duarte1-2/+1
Before appending sysdata, prepare_extradata() checks if any feature is enabled in sysdata_fields (and exits early if none is enabled). When SYSDATA_RELEASE was introduced, we missed adding it to the list of features being checked against sysdata_fields in prepare_extradata(). The result was that, if only SYSDATA_RELEASE is enabled in sysdata_fields, we incorreclty exit early and fail to append the release. Instead of checking specific bits in sysdata_fields, check if sysdata_fields has ALL bit zeroed and exit early if true. This fixes case when only SYSDATA_RELEASE enabled and makes the code more general / less error prone in future feature implementation. Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Fixes: cfcc9239e78a ("netconsole: append release to sysdata") Link: https://patch.msgid.link/20250609-netconsole-fix-v1-1-17543611ae31@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-10net: Fix TOCTOU issue in sk_is_readable()Michal Luczaj1-2/+5
sk->sk_prot->sock_is_readable is a valid function pointer when sk resides in a sockmap. After the last sk_psock_put() (which usually happens when socket is removed from sockmap), sk->sk_prot gets restored and sk->sk_prot->sock_is_readable becomes NULL. This makes sk_is_readable() racy, if the value of sk->sk_prot is reloaded after the initial check. Which in turn may lead to a null pointer dereference. Ensure the function pointer does not turn NULL after the check. Fixes: 8934ce2fd081 ("bpf: sockmap redirect ingress support") Suggested-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250609-skisreadable-toctou-v1-1-d0dfb2d62c37@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-10net: usb: r8152: Add device ID for TP-Link UE200Lucas Sanchez Sagrado1-0/+1
The TP-Link UE200 is a RTL8152B based USB 2.0 Fast Ethernet adapter. This patch adds its device ID. It has been tested on Ubuntu 22.04.5. Signed-off-by: Lucas Sanchez Sagrado <lucsansag@gmail.com> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://patch.msgid.link/20250609145536.26648-1-lucsansag@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-10MAINTAINERS: Add myself as bpf networking reviewerStanislav Fomichev2-0/+4
I've been focusing on networking BPF bits lately, add myself as a reviewer. Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com> Acked-by: KP Singh <kpsingh@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20250610175442.2138504-1-stfomichev@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-10intel_idle: Update arguments of mwait_idle_with_hints()Uros Bizjak1-4/+4
Commit a17b37a3f416 ("x86/idle: Change arguments of mwait_idle_with_hints() to u32") changed the type of arguments of mwait_idle_with_hints() from unsigned long to u32. Change the type of variables in the call to mwait_idle_with_hints() to unsigned int to follow the change. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com> Link: https://patch.msgid.link/20250609063528.48715-1-ubizjak@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10ACPI: resource: Use IRQ override on MACHENIKE 16PWentao Guan1-0/+7
Use ACPI IRQ override on MACHENIKE laptop to make the internal keyboard work. Add a new entry to the irq1_edge_low_force_override structure, similar to the existing ones. Link: https://bbs.deepin.org.cn/zh/post/287628 Signed-off-by: Wentao Guan <guanwentao@uniontech.com> Link: https://patch.msgid.link/20250603122059.1072790-1-guanwentao@uniontech.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10ACPI: EC: Ignore ECDT tables with an invalid ID stringArmin Wolf1-0/+17
On the MSI Modern 14 C5M the ECDT table contains invalid data: UID : 00000000 GPE Number : 00 /* Invalid, 03 would be correct */ Namepath : "" /* Invalid, "\_SB.PCI0.SBRG.EC" would * be correct */ This slows down the EC access as the wrong GPE event is used for communication. Additionally the ID string is invalid. Ignore such faulty ECDT tables by verifying that the ID string has a valid format. Tested-by: glpnk@proton.me Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://patch.msgid.link/20250529235310.540530-1-W_Armin@gmx.de Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10ACPI: CPPC: Fix NULL pointer dereference when nosmp is usedYunhui Cui1-1/+1
With nosmp in cmdline, other CPUs are not brought up, leaving their cpc_desc_ptr NULL. CPU0's iteration via for_each_possible_cpu() dereferences these NULL pointers, causing panic. Panic backtrace: [ 0.401123] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000b8 ... [ 0.403255] [<ffffffff809a5818>] cppc_allow_fast_switch+0x6a/0xd4 ... Kernel panic - not syncing: Attempted to kill init! Fixes: 3cc30dd00a58 ("cpufreq: CPPC: Enable fast_switch") Reported-by: Xu Lu <luxu.kernel@bytedance.com> Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com> Link: https://patch.msgid.link/20250604023036.99553-1-cuiyunhui@bytedance.com [ rjw: New subject ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10ACPI: PAD: Update arguments of mwait_idle_with_hints()Uros Bizjak1-1/+1
Commit a17b37a3f416 ("x86/idle: Change arguments of mwait_idle_with_hints() to u32") changed the type of arguments of mwait_idle_with_hints() from unsigned long to u32. Change the type of variables in the call to mwait_idle_with_hints() to unsigned int to follow the change. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Link: https://patch.msgid.link/20250609064235.49146-1-ubizjak@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10rust: time: Fix compile error in impl_has_hr_timer macroFUJITA Tomonori1-1/+1
Fix a compile error in the `impl_has_hr_timer!` macro as follows: error[E0599]: no method named cast_mut found for raw pointer *mut Foo in the current scope The `container_of!` macro already returns a mutable pointer when used in a `*mut T` context so the `.cast_mut()` method is not available. [ We missed this one because there is no caller yet and it is a macro. - Miguel ] Fixes: 74d6a606c2b3 ("rust: retain pointer mut-ness in `container_of!`") Signed-off-by: FUJITA Tomonori <fujita.tomonori@gmail.com> Reviewed-by: Benno Lossin <lossin@kernel.org> Acked-by: Andreas Hindborg <a.hindborg@kernel.org> Link: https://lore.kernel.org/r/20250606020505.3186533-1-fujita.tomonori@gmail.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-06-10ACPI: APEI: EINJ: Do not fail einj_init() on faux_device_create() failureDan Williams1-6/+3
CXL has a symbol dependency on einj_core.ko, so if einj_init() fails then cxl_core.ko fails to load. Prior to the faux_device_create() conversion, einj_probe() failures were tracked by the einj_initialized flag without failing einj_init(). Revert to that behavior and always succeed einj_init() given there is no way, and no pressing need, to discern faux device-create vs device-probe failures. This situation arose because CXL knows proper kernel named objects to trigger errors against, but acpi-einj knows how to perform the error injection. The injection mechanism is shared with non-CXL use cases. The result is CXL now has a module dependency on einj-core.ko, and init/probe failures are handled at runtime. Fixes: 6cb9441bfe8d ("ACPI: APEI: EINJ: Transition to the faux device interface") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20250607033228.1475625-4-dan.j.williams@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10driver core: faux: Quiet probe failuresDan Williams1-1/+1
The acpi-einj conversion to faux_device_create() leads to a noisy error message when the error injection facility is disabled. Quiet the error as CXL error injection via ACPI expects the module to stay loaded even if the error injection facility is disabled. This situation arose because CXL knows proper kernel named objects to trigger errors against, but acpi-einj knows how to perform the error injection. The injection mechanism is shared with non-CXL use cases. The result is CXL now has a module dependency on einj-core.ko, and init/probe failures are handled at runtime. Fixes: 6cb9441bfe8d ("ACPI: APEI: EINJ: Transition to the faux device interface") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20250607033228.1475625-3-dan.j.williams@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10driver core: faux: Suppress bind attributesDan Williams1-0/+1
faux_device_create() is almost a suitable candidate to replace platform_driver_probe() if not for the fact that faux_device_create() supports dynamic attach/detach of the driver. Drop the bind attributes with the expectation that simple faux devices can always assume that the device is permanently bound at create, and only unbound at 'destroy'. The acpi-einj driver depends on static bind. Fixes: 6cb9441bfe8d ("ACPI: APEI: EINJ: Transition to the faux device interface") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://patch.msgid.link/20250607033228.1475625-2-dan.j.williams@intel.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-06-10Revert "mm/damon/Kconfig: enable CONFIG_DAMON by default"Linus Torvalds1-1/+0
This reverts commit 28615e6eed152f2fda5486680090b74aeed7b554. No, we don't make random features default to being on. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: SeongJae Park <sj@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-06-10io_uring: fix use-after-free of sq->thread in __io_uring_show_fdinfo()Penglei Jiang2-7/+14
syzbot reports: BUG: KASAN: slab-use-after-free in getrusage+0x1109/0x1a60 Read of size 8 at addr ffff88810de2d2c8 by task a.out/304 CPU: 0 UID: 0 PID: 304 Comm: a.out Not tainted 6.16.0-rc1 #1 PREEMPT(voluntary) Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x53/0x70 print_report+0xd0/0x670 ? __pfx__raw_spin_lock_irqsave+0x10/0x10 ? getrusage+0x1109/0x1a60 kasan_report+0xce/0x100 ? getrusage+0x1109/0x1a60 getrusage+0x1109/0x1a60 ? __pfx_getrusage+0x10/0x10 __io_uring_show_fdinfo+0x9fe/0x1790 ? ksys_read+0xf7/0x1c0 ? do_syscall_64+0xa4/0x260 ? vsnprintf+0x591/0x1100 ? __pfx___io_uring_show_fdinfo+0x10/0x10 ? __pfx_vsnprintf+0x10/0x10 ? mutex_trylock+0xcf/0x130 ? __pfx_mutex_trylock+0x10/0x10 ? __pfx_show_fd_locks+0x10/0x10 ? io_uring_show_fdinfo+0x57/0x80 io_uring_show_fdinfo+0x57/0x80 seq_show+0x38c/0x690 seq_read_iter+0x3f7/0x1180 ? inode_set_ctime_current+0x160/0x4b0 seq_read+0x271/0x3e0 ? __pfx_seq_read+0x10/0x10 ? __pfx__raw_spin_lock+0x10/0x10 ? __mark_inode_dirty+0x402/0x810 ? selinux_file_permission+0x368/0x500 ? file_update_time+0x10f/0x160 vfs_read+0x177/0xa40 ? __pfx___handle_mm_fault+0x10/0x10 ? __pfx_vfs_read+0x10/0x10 ? mutex_lock+0x81/0xe0 ? __pfx_mutex_lock+0x10/0x10 ? fdget_pos+0x24d/0x4b0 ksys_read+0xf7/0x1c0 ? __pfx_ksys_read+0x10/0x10 ? do_user_addr_fault+0x43b/0x9c0 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f0f74170fc9 Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 8 RSP: 002b:00007fffece049e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000000 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f0f74170fc9 RDX: 0000000000001000 RSI: 00007fffece049f0 RDI: 0000000000000004 RBP: 00007fffece05ad0 R08: 0000000000000000 R09: 00007fffece04d90 R10: 0000000000000000 R11: 0000000000000206 R12: 00005651720a1100 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Allocated by task 298: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 __kasan_slab_alloc+0x6e/0x70 kmem_cache_alloc_node_noprof+0xe8/0x330 copy_process+0x376/0x5e00 create_io_thread+0xab/0xf0 io_sq_offload_create+0x9ed/0xf20 io_uring_setup+0x12b0/0x1cc0 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x77/0x7f Freed by task 22: kasan_save_stack+0x33/0x60 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3b/0x60 __kasan_slab_free+0x37/0x50 kmem_cache_free+0xc4/0x360 rcu_core+0x5ff/0x19f0 handle_softirqs+0x18c/0x530 run_ksoftirqd+0x20/0x30 smpboot_thread_fn+0x287/0x6c0 kthread+0x30d/0x630 ret_from_fork+0xef/0x1a0 ret_from_fork_asm+0x1a/0x30 Last potentially related work creation: kasan_save_stack+0x33/0x60 kasan_record_aux_stack+0x8c/0xa0 __call_rcu_common.constprop.0+0x68/0x940 __schedule+0xff2/0x2930 __cond_resched+0x4c/0x80 mutex_lock+0x5c/0xe0 io_uring_del_tctx_node+0xe1/0x2b0 io_uring_clean_tctx+0xb7/0x160 io_uring_cancel_generic+0x34e/0x760 do_exit+0x240/0x2350 do_group_exit+0xab/0x220 __x64_sys_exit_group+0x39/0x40 x64_sys_call+0x1243/0x1840 do_syscall_64+0xa4/0x260 entry_SYSCALL_64_after_hwframe+0x77/0x7f The buggy address belongs to the object at ffff88810de2cb00 which belongs to the cache task_struct of size 3712 The buggy address is located 1992 bytes inside of freed 3712-byte region [ffff88810de2cb00, ffff88810de2d980) which is caused by the task_struct pointed to by sq->thread being released while it is being used in the function __io_uring_show_fdinfo(). Holding ctx->uring_lock does not prevent ehre relase or exit of sq->thread. Fix this by assigning and looking up ->thread under RCU, and grabbing a reference to the task_struct. This ensures that it cannot get released while fdinfo is using it. Reported-by: syzbot+531502bbbe51d2f769f4@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/682b06a5.a70a0220.3849cf.00b3.GAE@google.com Fixes: 3fcb9d17206e ("io_uring/sqpoll: statistics of the true utilization of sq threads") Signed-off-by: Penglei Jiang <superman.xpt@gmail.com> Link: https://lore.kernel.org/r/20250610171801.70960-1-superman.xpt@gmail.com [axboe: massage commit message] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-06-10Merge tag 'linux-cpupower-6.16-rc2-fixes' of ↵Rafael J. Wysocki1-4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Merge an urgent cpupower utility fix for 6.16-rc1 from Shuah Khan: "Add unitdir variable for specifying the location to install systemd service units instead of installing under ${libdir}/systemd/system which doesn't work on some distributions." * tag 'linux-cpupower-6.16-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: cpupower: split unitdir from libdir in Makefile
2025-06-10e1000: Move cancel_work_sync to avoid deadlockJoe Damato1-4/+4
Previously, e1000_down called cancel_work_sync for the e1000 reset task (via e1000_down_and_stop), which takes RTNL. As reported by users and syzbot, a deadlock is possible in the following scenario: CPU 0: - RTNL is held - e1000_close - e1000_down - cancel_work_sync (cancel / wait for e1000_reset_task()) CPU 1: - process_one_work - e1000_reset_task - take RTNL To remedy this, avoid calling cancel_work_sync from e1000_down (e1000_reset_task does nothing if the device is down anyway). Instead, call cancel_work_sync for e1000_reset_task when the device is being removed. Fixes: e400c7444d84 ("e1000: Hold RTNL when e1000_down can be called") Reported-by: syzbot+846bb38dc67fe62cc733@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/683837bf.a00a0220.52848.0003.GAE@google.com/ Reported-by: John <john.cs.hey@gmail.com> Closes: https://lore.kernel.org/netdev/CAP=Rh=OEsn4y_2LvkO3UtDWurKcGPnZ_NPSXK=FbgygNXL37Sw@mail.gmail.com/ Signed-off-by: Joe Damato <jdamato@fastly.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-10ice/ptp: fix crosstimestamp reportingAnton Nadezhdin1-0/+1
Set use_nsecs=true as timestamp is reported in ns. Lack of this result in smaller timestamp error window which cause error during phc2sys execution on E825 NICs: phc2sys[1768.256]: ioctl PTP_SYS_OFFSET_PRECISE: Invalid argument This problem was introduced in the cited commit which omitted setting use_nsecs to true when converting the ice driver to use convert_base_to_cs(). Testing hints (ethX is PF netdev): phc2sys -s ethX -c CLOCK_REALTIME -O 37 -m phc2sys[1769.256]: CLOCK_REALTIME phc offset -5 s0 freq -0 delay 0 Fixes: d4bea547ebb57 ("ice/ptp: Remove convert_art_to_tsc()") Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-10iavf: fix reset_task for early reset eventAhmed Zaki2-0/+28
If a reset event is received from the PF early in the init cycle, the state machine hangs for about 25 seconds. Reproducer: echo 1 > /sys/class/net/$PF0/device/sriov_numvfs ip link set dev $PF0 vf 0 mac $NEW_MAC The log shows: [792.620416] ice 0000:5e:00.0: Enabling 1 VFs [792.738812] iavf 0000:5e:01.0: enabling device (0000 -> 0002) [792.744182] ice 0000:5e:00.0: Enabling 1 VFs with 17 vectors and 16 queues per VF [792.839964] ice 0000:5e:00.0: Setting MAC 52:54:00:00:00:11 on VF 0. VF driver will be reinitialized [813.389684] iavf 0000:5e:01.0: Failed to communicate with PF; waiting before retry [818.635918] iavf 0000:5e:01.0: Hardware came out of reset. Attempting reinit. [818.766273] iavf 0000:5e:01.0: Multiqueue Enabled: Queue pair count = 16 Fix it by scheduling the reset task and making the reset task capable of resetting early in the init cycle. Fixes: ef8693eb90ae3 ("i40evf: refactor reset handling") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-10i40e: retry VFLR handling if there is ongoing VF resetRobert Malz1-1/+4
When a VFLR interrupt is received during a VF reset initiated from a different source, the VFLR may be not fully handled. This can leave the VF in an undefined state. To address this, set the I40E_VFLR_EVENT_PENDING bit again during VFLR handling if the reset is not yet complete. This ensures the driver will properly complete the VF reset in such scenarios. Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered on another VF") Signed-off-by: Robert Malz <robert.malz@canonical.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-10i40e: return false from i40e_reset_vf if reset is in progressRobert Malz1-3/+3
The function i40e_vc_reset_vf attempts, up to 20 times, to handle a VF reset request, using the return value of i40e_reset_vf as an indicator of whether the reset was successfully triggered. Currently, i40e_reset_vf always returns true, which causes new reset requests to be ignored if a different VF reset is already in progress. This patch updates the return value of i40e_reset_vf to reflect when another VF reset is in progress, allowing the caller to properly use the retry mechanism. Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered on another VF") Signed-off-by: Robert Malz <robert.malz@canonical.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-06-10tools/resolve_btfids: Fix build when cross compiling kernel with clang.Suleiman Souhlal1-1/+1
When cross compiling the kernel with clang, we need to override CLANG_CROSS_FLAGS when preparing the step libraries. Prior to commit d1d096312176 ("tools: fix annoying "mkdir -p ..." logs when building tools in parallel"), MAKEFLAGS would have been set to a value that wouldn't set a value for CLANG_CROSS_FLAGS, hiding the fact that we weren't properly overriding it. Fixes: 56a2df7615fa ("tools/resolve_btfids: Compile resolve_btfids as host program") Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20250606074538.1608546-1-suleiman@google.com
2025-06-10tracing: Do not free "head" on error path of filter_free_subsystem_filters()Steven Rostedt1-3/+1
The variable "head" is allocated and initialized as a list before allocating the first "item" for the list. If the allocation of "item" fails, it frees "head" and then jumps to the label "free_now" which will process head and free it. This will cause a UAF of "head", and it doesn't need to free it before jumping to the "free_now" label as that code will free it. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250610093348.33c5643a@gandalf.local.home Fixes: a9d0aab5eb33 ("tracing: Fix regression of filter waiting a long time on RCU synchronization") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202506070424.lCiNreTI-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-06-10pinctrl: sunxi: dt: Consider pin base when calculating bank number from pinChen-Yu Tsai1-4/+4
In prepare_function_table() when the pinctrl function table IRQ entries are generated, the pin bank is calculated from the absolute pin number; however the IRQ bank mux array is indexed from the first pin bank of the controller. For R_PIO controllers, this means the absolute pin bank is way off from the relative pin bank used for array indexing. Correct this by taking into account the pin base of the controller. Fixes: f5e2cd34b12f ("pinctrl: sunxi: allow reading mux values from DT") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Link: https://lore.kernel.org/20250607135203.2085226-1-wens@kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-06-10drm/meson: fix more rounding issues with 59.94Hz modesMartin Blumenstingl1-21/+34
Commit 1017560164b6 ("drm/meson: use unsigned long long / Hz for frequency types") attempts to resolve video playback using 59.94Hz. using YUV420 by changing the clock calculation to use Hz instead of kHz (thus yielding more precision). The basic calculation itself is correct, however the comparisions in meson_vclk_vic_supported_freq() and meson_vclk_setup() don't work anymore for 59.94Hz modes (using the freq * 1000 / 1001 logic). For example, drm/edid specifies a 593407kHz clock for 3840x2160@59.94Hz. With the mentioend commit we convert this to Hz. Then meson_vclk tries to find a matchig "params" entry (as the clock setup code currently only supports specific frequencies) by taking the venc_freq from the params and calculating the "alt frequency" (used for the 59.94Hz modes) from it, which is: (594000000Hz * 1000) / 1001 = 593406593Hz Similar calculation is applied to the phy_freq (TMDS clock), which is 10 times the pixel clock. Implement a new meson_vclk_freqs_are_matching_param() function whose purpose is to compare if the requested and calculated frequencies. They may not match exactly (for the reasons mentioned above). Allow the clocks to deviate slightly to make the 59.94Hz modes again. Fixes: 1017560164b6 ("drm/meson: use unsigned long long / Hz for frequency types") Reported-by: Christian Hewitt <christianshewitt@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250609202751.962208-1-martin.blumenstingl@googlemail.com
2025-06-10drm/meson: use vclk_freq instead of pixel_freq in debug printMartin Blumenstingl1-3/+3
meson_vclk_vic_supported_freq() has a debug print which includes the pixel freq. However, within the whole function the pixel freq is irrelevant, other than checking the end of the params array. Switch to printing the vclk_freq which is being compared / matched against the inputs to the function to avoid confusion when analyzing error reports from users. Fixes: e5fab2ec9ca4 ("drm/meson: vclk: add support for YUV420 setup") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250606221031.3419353-1-martin.blumenstingl@googlemail.com
2025-06-10drm/meson: fix debug log statement when setting the HDMI clocksMartin Blumenstingl1-1/+1
The "phy" and "vclk" frequency labels were swapped, making it more difficult to debug driver errors. Swap the label order to make them match with the actual frequencies printed to correct this. Fixes: e5fab2ec9ca4 ("drm/meson: vclk: add support for YUV420 setup") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250606203729.3311592-1-martin.blumenstingl@googlemail.com
2025-06-10cpufreq: Convert `/// SAFETY` lines to `# Safety` sectionsViresh Kumar1-37/+109
Replace `/// SAFETY` comments in doc comments with proper `# Safety` sections, as per rustdoc conventions. Also mark the C FFI callbacks as `unsafe` to correctly reflect their safety requirements. Reported-by: Miguel Ojeda <ojeda@kernel.org> Closes: https://github.com/Rust-for-Linux/linux/issues/1169 Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
2025-06-10drm/vc4: fix infinite EPROBE_DEFER loopGabriel Dalimonte1-6/+6
`vc4_hdmi_audio_init` calls `devm_snd_dmaengine_pcm_register` which may return EPROBE_DEFER. Calling `drm_connector_hdmi_audio_init` adds a child device. The driver model docs[1] state that adding a child device prior to returning EPROBE_DEFER may result in an infinite loop. [1] https://www.kernel.org/doc/html/v6.14/driver-api/driver-model/driver.html Fixes: 9640f1437a88 ("drm/vc4: hdmi: switch to using generic HDMI Codec infrastructure") Signed-off-by: Gabriel Dalimonte <gabriel.dalimonte@gmail.com> Link: https://lore.kernel.org/r/20250601-vc4-audio-inf-probe-v2-1-9ad43c7b6147@gmail.com Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-06-09Merge tag 'powerpc-6.16-2' of ↵Linus Torvalds2-2/+15
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Madhavan Srinivasan: - a couple of fixes for out of bounds issues in memtrace and vas Thanks to Ritesh Harjani (IBM), Haren Myneni, and Jonathan Greental * tag 'powerpc-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/vas: Return -EINVAL if the offset is non-zero in mmap() powerpc/powernv/memtrace: Fix out of bounds issue in memtrace mmap
2025-06-10powerpc/vas: Return -EINVAL if the offset is non-zero in mmap()Haren Myneni1-0/+9
The user space calls mmap() to map VAS window paste address and the kernel returns the complete mapped page for each window. So return -EINVAL if non-zero is passed for offset parameter to mmap(). See Documentation/arch/powerpc/vas-api.rst for mmap() restrictions. Co-developed-by: Jonathan Greental <yonatan02greental@gmail.com> Signed-off-by: Jonathan Greental <yonatan02greental@gmail.com> Reported-by: Jonathan Greental <yonatan02greental@gmail.com> Fixes: dda44eb29c23 ("powerpc/vas: Add VAS user space API") Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250610021227.361980-2-maddy@linux.ibm.com
2025-06-10powerpc/powernv/memtrace: Fix out of bounds issue in memtrace mmapRitesh Harjani (IBM)1-2/+6
memtrace mmap issue has an out of bounds issue. This patch fixes the by checking that the requested mapping region size should stay within the allocated region size. Reported-by: Jonathan Greental <yonatan02greental@gmail.com> Fixes: 08a022ad3dfa ("powerpc/powernv/memtrace: Allow mmaping trace buffers") Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250610021227.361980-1-maddy@linux.ibm.com
2025-06-09scsi: error: alua: I/O errors for ALUA state transitionsRajashekhar M A1-1/+2
When a host is configured with a few LUNs and I/O is running, injecting FC faults repeatedly leads to path recovery problems. The LUNs have 4 paths each and 3 of them come back active after say an FC fault which makes 2 of the paths go down, instead of all 4. This happens after several iterations of continuous FC faults. Reason here is that we're returning an I/O error whenever we're encountering sense code 06/04/0a (LOGICAL UNIT NOT ACCESSIBLE, ASYMMETRIC ACCESS STATE TRANSITION) instead of retrying. Signed-off-by: Rajashekhar M A <rajs@netapp.com> Signed-off-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20250606135924.27397-1-hare@kernel.org Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-09scsi: storvsc: Increase the timeouts to storvsc_timeoutDexuan Cui1-4/+6
Currently storvsc_timeout is only used in storvsc_sdev_configure(), and 5s and 10s are used elsewhere. It turns out that rarely the 5s is not enough on Azure, so let's use storvsc_timeout everywhere. In case a timeout happens and storvsc_channel_init() returns an error, close the VMBus channel so that any host-to-guest messages in the channel's ringbuffer, which might come late, can be safely ignored. Add a "const" to storvsc_timeout. Cc: stable@kernel.org Signed-off-by: Dexuan Cui <decui@microsoft.com> Link: https://lore.kernel.org/r/1749243459-10419-1-git-send-email-decui@microsoft.com Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-09Merge tag 'for-net-2025-06-05' of ↵Jakub Kicinski7-115/+118
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - MGMT: Fix UAF on mgmt_remove_adv_monitor_complete - MGMT: Protect mgmt_pending list with its own lock - hci_core: fix list_for_each_entry_rcu usage - btintel_pcie: Increase the tx and rx descriptor count - btintel_pcie: Reduce driver buffer posting to prevent race condition - btintel_pcie: Fix driver not posting maximum rx buffers * tag 'for-net-2025-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: MGMT: Protect mgmt_pending list with its own lock Bluetooth: MGMT: Fix UAF on mgmt_remove_adv_monitor_complete Bluetooth: btintel_pcie: Reduce driver buffer posting to prevent race condition Bluetooth: btintel_pcie: Increase the tx and rx descriptor count Bluetooth: btintel_pcie: Fix driver not posting maximum rx buffers Bluetooth: hci_core: fix list_for_each_entry_rcu usage ==================== Link: https://patch.msgid.link/20250605191136.904411-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-09net_sched: sch_sfq: fix a potential crash on gso_skb handlingEric Dumazet1-1/+4
SFQ has an assumption of always being able to queue at least one packet. However, after the blamed commit, sch->q.len can be inflated by packets in sch->gso_skb, and an enqueue() on an empty SFQ qdisc can be followed by an immediate drop. Fix sfq_drop() to properly clear q->tail in this situation. Tested: ip netns add lb ip link add dev to-lb type veth peer name in-lb netns lb ethtool -K to-lb tso off # force qdisc to requeue gso_skb ip netns exec lb ethtool -K in-lb gro on # enable NAPI ip link set dev to-lb up ip -netns lb link set dev in-lb up ip addr add dev to-lb 192.168.20.1/24 ip -netns lb addr add dev in-lb 192.168.20.2/24 tc qdisc replace dev to-lb root sfq limit 100 ip netns exec lb netserver netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & netperf -H 192.168.20.2 -l 100 & Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback") Reported-by: Marcus Wichelmann <marcus.wichelmann@hetzner-cloud.de> Closes: https://lore.kernel.org/netdev/9da42688-bfaa-4364-8797-e9271f3bdaef@hetzner-cloud.de/ Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250606165127.3629486-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-09smb: client: disable path remapping with POSIX extensionsPhilipp Kerling2-2/+10
If SMB 3.1.1 POSIX Extensions are available and negotiated, the client should be able to use all characters and not remap anything. Currently, the user has to explicitly request this behavior by specifying the "nomapposix" mount option. Link: https://lore.kernel.org/4195bb677b33d680e77549890a4f4dd3b474ceaf.camel@rx2.rx-server.de Signed-off-by: Philipp Kerling <pkerling@casix.org> Reviewed-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-09Merge branch '6.16/scsi-queue' into 6.16/scsi-fixesMartin K. Petersen4-9/+15
Pull in remaining fixes from queue branch. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-09scsi: s390: zfcp: Ensure synchronous unit_addPeter Oberparleiter1-0/+2
Improve the usability of the unit_add sysfs attribute by ensuring that the associated FCP LUN scan processing is completed synchronously. This enables configuration tooling to consistently determine the end of the scan process to allow for serialization of follow-on actions. While the scan process associated with unit_add typically completes synchronously, it is deferred to an asynchronous background process if unit_add is used before initial remote port scanning has completed. This occurs when unit_add is used immediately after setting the associated FCP device online. To ensure synchronous unit_add processing, wait for remote port scanning to complete before initiating the FCP LUN scan. Cc: stable@vger.kernel.org Reviewed-by: M Nikhil <nikh1092@linux.ibm.com> Reviewed-by: Nihar Panda <niharp@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Nihar Panda <niharp@linux.ibm.com> Link: https://lore.kernel.org/r/20250603182252.2287285-2-niharp@linux.ibm.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-09scsi: iscsi: Fix incorrect error path labels for flashnode operationsAlok Tiwari1-6/+5
Correct the error handling goto labels used when host lookup fails in various flashnode-related event handlers: - iscsi_new_flashnode() - iscsi_del_flashnode() - iscsi_login_flashnode() - iscsi_logout_flashnode() - iscsi_logout_flashnode_sid() scsi_host_put() is not required when shost is NULL, so jumping to the correct label avoids unnecessary operations. These functions previously jumped to the wrong goto label (put_host), which did not match the intended cleanup logic. Use the correct exit labels (exit_new_fnode, exit_del_fnode, etc.) to ensure proper error handling. Also remove the unused put_host label under iscsi_new_flashnode() as it is no longer needed. No functional changes beyond accurate error path correction. Fixes: c6a4bb2ef596 ("[SCSI] scsi_transport_iscsi: Add flash node mgmt support") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://lore.kernel.org/r/20250530193012.3312911-1-alok.a.tiwari@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-09scsi: mvsas: Fix typos in per-phy comments and SAS cmd port registersAnkit Chauhan1-2/+2
Spelling fixes: Deocder --> Decoder Memroy --> Memory This is a non-functional change aimed at improving code clarity. Signed-off-by: Ankit Chauhan <ankitchauhan2065@gmail.com> Link: https://lore.kernel.org/r/20250528110604.59528-1-ankitchauhan2065@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-06-09SPI: omap2-mcspi: Fix SPI CS behaviour aroundMark Brown1-12/+18
Merge series from Félix Piédallu <felix.piedallu@non.se.com>: These patches fix the behaviour of the SPI Chip Select of the OMAP2 MCSPI driver used on TI SoCs. The omap2-mcspi driver supports the use of multi mode (multichannel in TI documentation). In this mode, the CS is asserted and deasserted by the hardware. The multi mode is disabled for messages when cs_change=0 for all transfers (e.g when CS is kept asserted between transfers of a same message). The multi mode also needs to be disabled for messages when cs_change=1 on the last transfer (e.g when CS is kept asserted after the WHOLE message), and the message right after. Currently, that is not the case and it CS is deasserted by hardware when it shouldn't. This breaks peripheral drivers that send multiple messages with the CS asserted in between. Patch 1 ensures that multi mode is disabled when cs_change=1 on the last transfer of the message. Patch 2 ensures that multi mode is disable on a message following one with cs_change=1 on the last transfer. This is the case for the TPM TIS SPI driver that uses this logic for flow control purposes. Tested on an AM6442 platform with a TPM ST33HTPH2X32AHE4.
2025-06-09ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in the non-uapi headersThomas Huth24-45/+45
While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembly code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This can be very confusing when switching between userspace and kernelspace coding, or when dealing with uapi headers that rather should use __ASSEMBLER__ instead. So let's standardize on the __ASSEMBLER__ macro that is provided by the compilers now. This is a completely mechanical patch (done with a simple "sed -i" statement). Cc: linux-snps-arc@lists.infradead.org Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2025-06-09ARC: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi headersThomas Huth1-2/+2
__ASSEMBLY__ is only defined by the Makefile of the kernel, so this is not really useful for uapi headers (unless the userspace Makefile defines it, too). Let's switch to __ASSEMBLER__ which gets set automatically by the compiler when compiling assembly code. Cc: linux-snps-arc@lists.infradead.org Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2025-06-09ARC: unwind: Use built-in sort swap to reduce code size and improve performanceYu-Chun Lin1-10/+1
The custom swap function used in sort() was identical to the default built-in sort swap. Remove the custom swap function and passes NULL to sort(), allowing it to use the default swap function. This change reduces code size and improves performance, particularly when CONFIG_MITIGATION_RETPOLINE is enabled. With RETPOLINE mitigation, indirect function calls incur significant overhead, and using the default swap function avoids this cost. $ ./scripts/bloat-o-meter ./unwind.o.old ./unwind.o.new add/remove: 0/1 grow/shrink: 0/1 up/down: 0/-22 (-22) Function old new delta init_unwind_hdr.constprop 544 540 -4 swap_eh_frame_hdr_table_entries 18 - -18 Total: Before=4410, After=4388, chg -0.50% Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2025-06-09ARC: atomics: Implement arch_atomic64_cmpxchg using _relaxedJason Gunthorpe1-10/+5
The core atomic code has a number of macros where it elaborates architecture primitives into more functions. ARC uses arch_atomic64_cmpxchg() as it's architecture primitive which disable alot of the additional functions. Instead provide arch_cmpxchg64_relaxed() as the primitive and rely on the core macros to create arch_cmpxchg64(). The macros will also provide other functions, for instance, try_cmpxchg64_release(), giving a more complete implementation. Suggested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/Z0747n5bSep4_1VX@J2N7QTR9R3 Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org>
2025-06-09cpupower: split unitdir from libdir in MakefileFrancesco Poli (wintermute)1-4/+5
Improve the installation procedure for the systemd service unit 'cpupower.service', to be more flexible. Some distros install libraries to /usr/lib64/, but systemd service units have to be installed to /usr/lib/systemd/system: as a consequence, the installation procedure should not assume that systemd service units can be installed to ${libdir}/systemd/system ... Define a dedicated variable ("unitdir") in the Makefile. Link: https://lore.kernel.org/linux-pm/260b6d79-ab61-43b7-a0eb-813e257bc028@leemhuis.info/T/#m0601940ab439d5cbd288819d2af190ce59e810e6 Fixes: 9c70b779ad91 ("cpupower: add a systemd service to run cpupower") Link: https://lore.kernel.org/r/20250521211656.65646-1-invernomuto@paranoici.org Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-06-09spi: stm32-ospi: Make usage of reset_control_acquire/release() APIPatrice Chotard1-6/+18
As ospi reset is consumed by both OMM and OSPI drivers, use the reset acquire/release mechanism which ensure exclusive reset usage. This avoid to call reset_control_get/put() in OMM driver each time we need to reset OSPI children and guarantee the reset line stays deasserted. During resume, OMM driver takes temporarily control of reset. Fixes: 79b8a705e26c ("spi: stm32: Add OSPI driver") Signed-off-by: Patrice Chotard <patrice.chotard@foss.st.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Link: https://patch.msgid.link/20250609-b4-upstream_ospi_reset_update-v6-1-5b602b567e8a@foss.st.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-09drm/xe/svm: Fix regression disallowing 64K SVM migrationMaarten Lankhorst1-1/+1
When changing the condition from >= SZ_64K, it was changed to <= SZ_64K. This disallows migration of 64K, which is the exact minimum allowed. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5057 Fixes: 794f5493f518 ("drm/xe: Strict migration policy for atomic SVM faults") Cc: stable@vger.kernel.org Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Maarten Lankhorst <dev@lankhorst.se> Link: https://lore.kernel.org/r/20250521090102.2965100-1-dev@lankhorst.se (cherry picked from commit 531bef26d189b28bf0d694878c0e064b30990b6c) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-06-09accel/amdxdna: Fix incorrect PSP firmware sizeLizhi Hou1-2/+2
The incorrect PSP firmware size is used for initializing. It may cause error for newer version firmware. Fixes: 8c9ff1b181ba ("accel/amdxdna: Add a new driver for AMD AI Engine") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250604143217.1386272-1-lizhi.hou@amd.com
2025-06-09spi: offload: check offload ops existence before disabling the triggerAndres Urian Florez1-1/+1
Add a safe guard in spi_offload_trigger to check the existence of offload->ops before invoking the trigger_disable callback Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com> Link: https://patch.msgid.link/20250608230422.325360-1-andres.emb.sys@gmail.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-09pinctrl: tb10x: Drop of_match_ptr for ID tableKrzysztof Kozlowski1-1/+1
The driver can match only via the DT table so the table should be always used and the of_match_ptr does not have any sense (this also allows ACPI matching via PRP0001, even though it might not be relevant here). This also fixes !CONFIG_OF warning: pinctrl-tb10x.c:815:34: warning: unused variable 'tb10x_pinctrl_dt_ids' [-Wunused-const-variable] Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505301317.EI1caRC0-lkp@intel.com/ Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/20250601105100.27927-2-krzysztof.kozlowski@linaro.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-06-09pinctrl: MAINTAINERS: Drop bouncing Jianlong HuangKrzysztof Kozlowski3-3/+2
Emails to Jianlong Huang bounce since 9 months, so drop the person from maintainers: 550 5.4.1 Recipient address rejected: Access denied. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/20250528104514.184122-2-krzysztof.kozlowski@linaro.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-06-09pinctrl: st: Drop unused st_gpio_bank() functionKrzysztof Kozlowski1-5/+0
Static inline st_gpio_bank() function is not referenced: pinctrl-st.c:377:19: error: unused function 'st_gpio_bank' [-Werror,-Wunused-function] Fixes: 701016c0cba5 ("pinctrl: st: Add pinctrl and pinconf support.") Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/20250528092201.52132-2-krzysztof.kozlowski@linaro.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-06-09pinctrl: qcom: pinctrl-qcm2290: Add missing pinsWojciech Slenska1-0/+9
Added the missing pins to the qcm2290_pins table. Signed-off-by: Wojciech Slenska <wojciech.slenska@gmail.com> Fixes: 48e049ef1238 ("pinctrl: qcom: Add QCM2290 pinctrl driver") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/20250523101437.59092-1-wojciech.slenska@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-06-09pinctrl: qcom: switch to devm_gpiochip_add_data()Dmitry Baryshkov58-67/+1
In order to simplify cleanup actions, use devres-enabled version of gpiochip_add_data(). As the msm_pinctrl_remove() function is now empty, drop it and all its calls from the corresponding pinctrl drivers. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/20250513-pinctrl-msm-fix-v2-3-249999af0fc1@oss.qualcomm.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2025-06-08spi: spi-pci1xxxx: Fix error code in probeDan Carpenter1-1/+1
Return the error code if pci_alloc_irq_vectors() fails. Don't return success. Fixes: b4608e944177 ("spi: spi-pci1xxxx: Fix Probe failure with Dual SPI instance with INTx interrupts") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Thangaraj Samynathan <thangaraj.s@microchip.com> Link: https://patch.msgid.link/aEKvDrUxD19GWi0u@stanley.mountain Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-08spi: loongson: Fix build warnings about export.hHuacai Chen1-0/+1
After commit a934a57a42f64a4 ("scripts/misc-check: check missing #include <linux/export.h> when W=1") and 7d95680d64ac8e836c ("scripts/misc-check: check unnecessary #include <linux/export.h> when W=1"), we get some build warnings with W=1: drivers/spi/spi-loongson-core.c: warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing So fix these build warnings for SPI/Loongson. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20250608142939.172108-1-chenhuacai@loongson.cn Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-08spi: omap2-mcspi: Disable multi-mode when the previous message kept CS assertedFélix Piédallu1-10/+10
When the last transfer of a SPI message has the cs_change flag, the CS is kept asserted after the message. The next message can't use multi-mode because the CS will be briefly deasserted before the first transfer. Remove the early exit of the list_for_each_entry because the last transfer actually needs to be always checked. Fixes: d153ff4056cb ("spi: omap2-mcspi: Add support for MULTI-mode") Signed-off-by: Félix Piédallu <felix.piedallu@non.se.com> Link: https://patch.msgid.link/20250606-cs_change_fix-v1-2-27191a98a2e5@non.se.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-06-08spi: omap2-mcspi: Disable multi mode when CS should be kept asserted after ↵Félix Piédallu1-3/+9
message When the last transfer of a SPI message has the cs_change flag, the CS is kept asserted after the message. Multi-mode can't respect this as CS is deasserted by the hardware at the end of the message. Disable multi-mode when not applicable to the current message. Fixes: d153ff4056cb ("spi: omap2-mcspi: Add support for MULTI-mode") Signed-off-by: Félix Piédallu <felix.piedallu@non.se.com> Link: https://patch.msgid.link/20250606-cs_change_fix-v1-1-27191a98a2e5@non.se.com Signed-off-by: Mark Brown <broonie@kernel.org>