kernel/git/torvalds/linux.git - Linux kernel source tree

Age	Commit message (Collapse)	Author	Files	Lines
4 hours	Merge tag 'powerpc-6.16-2' of ↵HEAD master	Linus Torvalds	2	-2/+15
	git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Madhavan Srinivasan: - a couple of fixes for out of bounds issues in memtrace and vas Thanks to Ritesh Harjani (IBM), Haren Myneni, and Jonathan Greental * tag 'powerpc-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/vas: Return -EINVAL if the offset is non-zero in mmap() powerpc/powernv/memtrace: Fix out of bounds issue in memtrace mmap
5 hours	powerpc/vas: Return -EINVAL if the offset is non-zero in mmap()	Haren Myneni	1	-0/+9
	The user space calls mmap() to map VAS window paste address and the kernel returns the complete mapped page for each window. So return -EINVAL if non-zero is passed for offset parameter to mmap(). See Documentation/arch/powerpc/vas-api.rst for mmap() restrictions. Co-developed-by: Jonathan Greental <yonatan02greental@gmail.com> Signed-off-by: Jonathan Greental <yonatan02greental@gmail.com> Reported-by: Jonathan Greental <yonatan02greental@gmail.com> Fixes: dda44eb29c23 ("powerpc/vas: Add VAS user space API") Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250610021227.361980-2-maddy@linux.ibm.com
5 hours	powerpc/powernv/memtrace: Fix out of bounds issue in memtrace mmap	Ritesh Harjani (IBM)	1	-2/+6
	memtrace mmap issue has an out of bounds issue. This patch fixes the by checking that the requested mapping region size should stay within the allocated region size. Reported-by: Jonathan Greental <yonatan02greental@gmail.com> Fixes: 08a022ad3dfa ("powerpc/powernv/memtrace: Allow mmaping trace buffers") Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250610021227.361980-1-maddy@linux.ibm.com
34 hours	Linux 6.16-rc1v6.16-rc1	Linus Torvalds	1	-2/+2

36 hours	Merge tag 'turbostat-2025.06.08' of ↵	Linus Torvalds	2	-110/+364
	git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: - Add initial DMR support, which required smarter RAPL probe - Fix AMD MSR RAPL energy reporting - Add RAPL power limit configuration output - Minor fixes * tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: version 2025.06.08 tools/power turbostat: Add initial support for BartlettLake tools/power turbostat: Add initial support for DMR tools/power turbostat: Dump RAPL sysfs info tools/power turbostat: Avoid probing the same perf counters tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared tools/power turbostat: Clean up add perf/msr counter logic tools/power turbostat: Introduce add_msr_counter() tools/power turbostat: Remove add_msr_perf_counter_() tools/power turbostat: Remove add_cstate_perf_counter_() tools/power turbostat: Remove add_rapl_perf_counter_() tools/power turbostat: Quit early for unsupported RAPL counters tools/power turbostat: Always check rapl_joules flag tools/power turbostat: Fix AMD package-energy reporting tools/power turbostat: Fix RAPL_GFX_ALL typo tools/power turbostat: Add Android support for MSR device handling tools/power turbostat.8: pm_domain wording fix tools/power turbostat.8: fix typo: idle_pct should be pct_idle
37 hours	Merge tag 'timers-cleanups-2025-06-08' of ↵	Linus Torvalds	689	-955/+1151
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanup from Thomas Gleixner: "The delayed from_timer() API cleanup: The renaming to the timer_() namespace was delayed due massive conflicts against Linux-next. Now that everything is upstream finish the conversion" tag 'timers-cleanups-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide, timers: Rename from_timer() to timer_container_of()
37 hours	Merge tag 'x86-urgent-2025-06-08' of ↵	Linus Torvalds	4	-19/+19
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of x86 fixes: - Cure IO bitmap inconsistencies A failed fork cleans up all resources of the newly created thread via exit_thread(). exit_thread() invokes io_bitmap_exit() which does the IO bitmap cleanups, which unfortunately assume that the cleanup is related to the current task, which is obviously bogus. Make it work correctly - A lockdep fix in the resctrl code removed the clearing of the command buffer in two places, which keeps stale error messages around. Bring them back. - Remove unused trace events" * tag 'x86-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: fs/resctrl: Restore the rdt_last_cmd_clear() calls after acquiring rdtgroup_mutex x86/iopl: Cure TIF_IO_BITMAP inconsistencies x86/fpu: Remove unused trace events
37 hours	Merge tag 'timers-urgent-2025-06-08' of ↵	Linus Torvalds	1	-0/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "Add the missing seq_file forward declaration in the timer namespace header" * tag 'timers-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timens: Add struct seq_file forward declaration
37 hours	tools/power turbostat: version 2025.06.08	Len Brown	1	-37/+36
	Add initial DMR support, which required smarter RAPL probe Fix AMD MSR RAPL energy reporting Add RAPL power limit configuration output Minor fixes Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Add initial support for BartlettLake	Zhang Rui	1	-0/+1
	Add initial support for BartlettLake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Add initial support for DMR	Zhang Rui	1	-0/+18
	Add initial support for DMR. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Dump RAPL sysfs info	Zhang Rui	1	-0/+156
	for example: intel-rapl:1: psys 28.0s:100W 976.0us:100W intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W intel-rapl:0/intel-rapl:0:0: core disabled intel-rapl:0/intel-rapl:0:1: uncore disabled intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W [lenb: simplified format] Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> squish me Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Avoid probing the same perf counters	Zhang Rui	1	-0/+15
	For the RAPL package energy status counter, Intel and AMD share the same perf_subsys and perf_name, but with different MSR addresses. Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are introduced to describe this counter for different Vendors. As a result, the perf counter is probed twice, and causes a failure in in get_rapl_counters() because expected_read_size and actual_read_size don't match. Fix the problem by skipping the already probed counter. Note, this is not a perfect fix. For example, if different vendors/platforms use the same MSR value for different purpose, the code can be fooled when it probes a rapl_counter_arch_infos[] entry that does not belong to the running Vendor/Platform. In a long run, better to put rapl_counter_arch_infos[] into the platform_features so that this becomes Vendor/Platform specific. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs ↵	Zhang Rui	1	-25/+24
	cleared platform_features->rapl_msrs describes the RAPL MSRs supported. While RAPL Perf counters can be exposed from different kernel backend drivers, e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver. Thus, turbostat should first blindly probe all the available RAPL Perf counters, and falls back to the RAPL MSR counters if they are listed in platform_features->rapl_msrs. With this, platforms that don't have RAPL MSRs can clear the platform_features->rapl_msrs bits and use RAPL Perf counters only. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Clean up add perf/msr counter logic	Zhang Rui	1	-7/+18
	Increase the code readability by moving the no_perf/no_msr flag and the cai->perf_name/cai->msr sanity checks into the counter probe functions. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Introduce add_msr_counter()	Zhang Rui	1	-9/+23
	probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR counters and MPERF/APERF/SMI MSR counters, thus its name is misleading. Similar to add_perf_counter(), introduce add_msr_counter() to probe a counter via MSR. Introduce wrapper function add_rapl_msr_counter() at the same time to add extra check for Zero return value for specified RAPL counters. No functional change intended. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Remove add_msr_perf_counter_()	Zhang Rui	1	-12/+8
	As the only caller of add_msr_perf_counter_(), add_msr_perf_counter() just gives extra debug output on top. There is no need to keep both functions. Remove add_msr_perf_counter_() and move all the logic to add_msr_perf_counter(). No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Remove add_cstate_perf_counter_()	Zhang Rui	1	-13/+9
	As the only caller of add_cstate_perf_counter_(), add_cstate_perf_counter() just gives extra debug output on top. There is no need to keep both functions. Remove add_cstate_perf_counter_() and move all the logic to add_cstate_perf_counter(). No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Remove add_rapl_perf_counter_()	Zhang Rui	1	-15/+10
	As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter() just gives extra debug output on top. There is no need to keep both functions. Remove add_rapl_perf_counter_() and move all the logic to add_rapl_perf_counter(). No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Quit early for unsupported RAPL counters	Zhang Rui	1	-1/+4
	Quit early for unsupported RAPL counters. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Always check rapl_joules flag	Zhang Rui	1	-3/+9
	rapl_joules bit should always be checked even if platform_features->rapl_msrs is not set or no_msr flag is used. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Fix AMD package-energy reporting	Gautham R. Shenoy	1	-5/+36
	commit 05a2f07db888 ("tools/power turbostat: read RAPL counters via perf") that adds support to read RAPL counters via perf defines the notion of a RAPL domain_id which is set to physical_core_id on platforms which support per_core_rapl counters (Eg: AMD processors Family 17h onwards) and is set to the physical_package_id on all the other platforms. However, the physical_core_id is only unique within a package and on platforms with multiple packages more than one core can have the same physical_core_id and thus the same domain_id. (For eg, the first cores of each package have the physical_core_id = 0). This results in all these cores with the same physical_core_id using the same entry in the rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the perf-initialization for cores whose domain_ids have already been visited, cores that have the same physical_core_id always read the perf file corresponding to the physical_core_id of the first package and thus the package-energy is incorrectly reported to be the same value for different packages. Note: This issue only arises when RAPL counters are read via perf and not when they are read via MSRs since in the latter case the MSRs are read separately on each core. Fix this issue by associating each CPU with rapl_core_id which is unique across all the packages in the system. Fixes: 05a2f07db888 ("tools/power turbostat: read RAPL counters via perf") Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Fix RAPL_GFX_ALL typo	Kaushlendra Kumar	1	-1/+1
	Fix typo in the currently unused RAPL_GFX_ALL macro definition. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat: Add Android support for MSR device handling	Kaushlendra Kumar	1	-3/+17
	It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr, updates error messages and permission checks to reflect the Android device path, and wraps platform-specific code with #if defined(ANDROID) to ensure correct behavior on both Android and non-Android systems. These changes improve compatibility and usability of turbostat on Android devices. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat.8: pm_domain wording fix	Len Brown	1	-2/+2
	turbostat.8: clarify that uncore "domains" are Power Management domains, aka pm_domains. Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	tools/power turbostat.8: fix typo: idle_pct should be pct_idle	Len Brown	1	-1/+1
	idle_pct should be pct_idle Signed-off-by: Len Brown <len.brown@intel.com>
37 hours	Merge tag 'perf-urgent-2025-06-08' of ↵	Linus Torvalds	1	-3/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fix from Thomas Gleixner: "A single fix for the x86 performance counters on Intel CPUs: The MSR offset calculations for fixed performance counters are stored at the wrong index in the configuration array causing the general purpose counter MSR offset to be overwritten, so both the general purpose and the fixed counters offsets are incorrect. Correct the array index calculation to fix that" * tag 'perf-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()
37 hours	Merge tag 'irq-urgent-2025-06-08' of ↵	Linus Torvalds	3	-7/+8
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "A single fix for the PCI/MSI code: The conversion to per device MSI domains created a MSI domain with size 1 instead of sizing it to the maximum possible number of MSI interrupts for the device. This "worked" as the subsequent allocations resized the domain, but the recent change to move the prepare() call into the domain creation path broke this works by chance mechanism. Size the domain properly at creation time" * tag 'irq-urgent-2025-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: PCI/MSI: Size device MSI domain with the maximum number of vectors
38 hours	Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	Linus Torvalds	3	-59/+74
	Pull mount fixes from Al Viro: "Various mount-related bugfixes: - split the do_move_mount() checks in subtree-of-our-ns and entire-anon cases and adapt detached mount propagation selftest for mount_setattr - allow clone_private_mount() for a path on real rootfs - fix a race in call of has_locked_children() - fix move_mount propagation graph breakage by MOVE_MOUNT_SET_GROUP - make sure clone_private_mnt() caller has CAP_SYS_ADMIN in the right userns - avoid false negatives in path_overmount() - don't leak MNT_LOCKED from parent to child in finish_automount() - do_change_type(): refuse to operate on unmounted/not ours mounts" * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: do_change_type(): refuse to operate on unmounted/not ours mounts clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns selftests/mount_setattr: adapt detached mount propagation test do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases fs: allow clone_private_mount() for a path on real rootfs fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2) finish_automount(): don't leak MNT_LOCKED from parent to child path_overmount(): avoid false negatives fs/fhandle.c: fix a race in call of has_locked_children()
38 hours	Merge tag '6.16-rc-part2-smb3-client-fixes' of ↵	Linus Torvalds	16	-288/+533
	git://git.samba.org/sfrench/cifs-2.6 Pull more smb client updates from Steve French: - multichannel/reconnect fixes - move smbdirect (smb over RDMA) defines to fs/smb/common so they will be able to be used in the future more broadly, and a documentation update explaining setting up smbdirect mounts - update email address for Paulo * tag '6.16-rc-part2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal version number MAINTAINERS, mailmap: Update Paulo Alcantara's email address cifs: add documentation for smbdirect setup cifs: do not disable interface polling on failure cifs: serialize other channels when query server interfaces is pending cifs: deal with the channel loading lag while picking channels smb: client: make use of common smbdirect_socket_parameters smb: smbdirect: introduce smbdirect_socket_parameters smb: client: make use of common smbdirect_socket smb: smbdirect: add smbdirect_socket.h smb: client: make use of common smbdirect.h smb: smbdirect: add smbdirect.h with public structures smb: client: make use of common smbdirect_pdu.h smb: smbdirect: add smbdirect_pdu.h with protocol definitions
40 hours	Merge tag 'trace-v6.16-3' of ↵	Linus Torvalds	3	-100/+143
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull more tracing fixes from Steven Rostedt: - Fix regression of waiting a long time on updating trace event filters When the faultable trace points were added, it needed task trace RCU synchronization. This was added to the tracepoint_synchronize_unregister() function. The filter logic always called this function whenever it updated the trace event filters before freeing the old filters. This increased the time of "trace-cmd record" from taking 13 seconds to running over 2 minutes to complete. Move the freeing of the filters to call_rcu() logic, which brings the time back down to 13 seconds. - Fix ring_buffer_subbuf_order_set() error path lock protection The error path of the ring_buffer_subbuf_order_set() released the mutex too early and allowed subsequent accesses to setting the subbuffer size to corrupt the data and cause a bug. By moving the mutex locking to the end of the error path, it prevents the reentrant access to the critical data and also allows the function to convert the taking of the mutex over to the guard() logic. - Remove unused power management clock events The clock events were added in 2010 for power management. In 2011 arm used them. In 2013 the code they were used in was removed. These events have been wasting memory since then. - Fix sparse warnings There was a few places that sparse warned about trace_events_filter.c where file->filter was referenced directly, but it is annotated with an __rcu tag. Use the helper functions and fix them up to use rcu_dereference() properly. tag 'trace-v6.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Add rcu annotation around file->filter accesses tracing: PM: Remove unused clock events ring-buffer: Fix buffer locking in ring_buffer_subbuf_order_set() tracing: Fix regression of filter waiting a long time on RCU synchronization
2 days	treewide, timers: Rename from_timer() to timer_container_of()	Ingo Molnar	689	-955/+1151
	Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
3 days	Merge tag 'kbuild-v6.16' of ↵	Linus Torvalds	49	-326/+910
	git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which exports a symbol only to specified modules - Improve ABI handling in gendwarfksyms - Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n - Add checkers for redundant or missing <linux/export.h> inclusion - Deprecate the extra-y syntax - Fix a genksyms bug when including enum constants from .symref files tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits) genksyms: Fix enum consts from a reference affecting new values arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES} efi/libstub: use 'targets' instead of extra-y in Makefile module: make __mod_device_table__* symbols static scripts/misc-check: check unnecessary #include <linux/export.h> when W=1 scripts/misc-check: check missing #include <linux/export.h> when W=1 scripts/misc-check: add double-quotes to satisfy shellcheck kbuild: move W=1 check for scripts/misc-check to top-level Makefile scripts/tags.sh: allow to use alternative ctags implementation kconfig: introduce menu type enum docs: symbol-namespaces: fix reST warning with literal block kbuild: link lib-y objects to vmlinux forcibly even when CONFIG_MODULES=n tinyconfig: enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION docs/core-api/symbol-namespaces: drop table of contents and section numbering modpost: check forbidden MODULE_IMPORT_NS("module:") at compile time kbuild: move kbuild syntax processing to scripts/Makefile.build Makefile: remove dependency on archscripts for header installation Documentation/kbuild: Add new gendwarfksyms kABI rules Documentation/kbuild: Drop section numbers ...
3 days	Merge tag 'sh-for-v6.16-tag1' of ↵	Linus Torvalds	19	-50/+47
	git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux Pull sh updates from John Paul Adrian Glaubitz: - replace the __ASSEMBLY__ with __ASSEMBLER__ macro in all headers since the latter is now defined automatically by both GCC and Clang when compiling assembly code (Thomas Huth) - set the default SPI mode for the ecovec24 board which became necessary after a new mode member as added to the sh_msiof_spi_info struct in cf9e4784f3bd ("spi: sh-msiof: Add slave mode support") (Geert Uytterhoeven) - remove unused variables in the kprobes code in kprobe_exceptions_notify() (Mike Rapoport) * tag 'sh-for-v6.16-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux: sh: kprobes: Remove unused variables in kprobe_exceptions_notify() sh: ecovec24: Make SPI mode explicit sh: Replace __ASSEMBLY__ with __ASSEMBLER__ in all headers
3 days	Merge tag 'loongarch-6.16' of ↵	Linus Torvalds	32	-218/+561
	git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch updates from Huacai Chen: - Adjust the 'make install' operation - Support SCHED_MC (Multi-core scheduler) - Enable ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS - Enable HAVE_ARCH_STACKLEAK - Increase max supported CPUs up to 2048 - Introduce the numa_memblks conversion - Add PWM controller nodes in dts - Some bug fixes and other small changes * tag 'loongarch-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: platform/loongarch: laptop: Unregister generic_sub_drivers on exit platform/loongarch: laptop: Add backlight power control support platform/loongarch: laptop: Get brightness setting from EC on probe LoongArch: dts: Add PWM support to Loongson-2K2000 LoongArch: dts: Add PWM support to Loongson-2K1000 LoongArch: dts: Add PWM support to Loongson-2K0500 LoongArch: vDSO: Correctly use asm parameters in syscall wrappers LoongArch: Fix panic caused by NULL-PMD in huge_pte_offset() LoongArch: Preserve firmware configuration when desired LoongArch: Avoid using $r0/$r1 as "mask" for csrxchg LoongArch: Introduce the numa_memblks conversion LoongArch: Increase max supported CPUs up to 2048 LoongArch: Enable HAVE_ARCH_STACKLEAK LoongArch: Enable ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS LoongArch: Add SCHED_MC (Multi-core scheduler) support LoongArch: Add some annotations in archhelp LoongArch: Using generic scripts/install.sh in `make install` LoongArch: Add a default install.sh
3 days	Merge tag 'sound-fix-6.16-rc1' of ↵	Linus Torvalds	25	-70/+192
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of fix patches for the 6.16-rc1 merge window. Most of changes are about ASoC, especially lots of AVS driver fixes. Larger LOCs are seen in TAS571x codec drivers, but the changes are trivial and safe. The rest are all device-specific small fixes" * tag 'sound-fix-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (27 commits) ASoC: Intel: avs: boards: Fix rt5663 front end name ASoC: Intel: avs: Simplify verification of parse_int_array() result ALSA: usb-audio: Add implicit feedback quirk for RODE AI-1 ALSA: hda: Ignore unsol events for cards being shut down ALSA: hda: Add new pci id for AMD GPU display HD audio controller ALSA: hda: cs35l41: Constify regmap_irq_chip ALSA: usb-audio: Add a quirk for Lenovo Thinkpad Thunderbolt 3 dock ASoC: ti: omap-hdmi: Re-add dai_link->platform to fix card init ASoC: pcm: Do not open FEs with no BEs connected ASoC: rt1320: fix speaker noise when volume bar is 100% ASoC: Intel: avs: Include missing string.h ASoC: Intel: avs: Verify content returned by parse_int_array() ASoC: Intel: avs: Verify kcalloc() status when setting constraints ASoC: Intel: avs: Fix paths in MODULE_FIRMWARE hints ASoC: Intel: avs: Fix possible null-ptr-deref when initing hw ASoC: Intel: avs: Fix PPLCxFMT calculation ASoC: Intel: avs: Fix deadlock when the failing IPC is SET_D0IX ASoC: codecs: hda: Fix RPM usage count underflow ASoC: amd: yc: Add support for Lenovo Yoga 7 16ARP8 ASoC: tas571x: fix tas5733 num_controls ...
3 days	tracing: Add rcu annotation around file->filter accesses	Steven Rostedt	1	-4/+6
	Running sparse on trace_events_filter.c triggered several warnings about file->filter being accessed directly even though it's annotated with __rcu. Add rcu_dereference() around it and shuffle the logic slightly so that it's always referenced via accessor functions. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250607102821.6c7effbf@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 days	Merge tag 'ubifs-for-linus-6.16-rc1' of ↵	Linus Torvalds	4	-4/+13
	git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs Pull JFFS2 and UBIFS fixes from Richard Weinberger: "JFFS2: - Correctly check return code of jffs2_prealloc_raw_node_refs() UBIFS: - Spelling fixes" * tag 'ubifs-for-linus-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs: jffs2: check jffs2_prealloc_raw_node_refs() result in few other places jffs2: check that raw node were preallocated before writing summary ubifs: Fix grammar in error message
3 days	sh: kprobes: Remove unused variables in kprobe_exceptions_notify()	Mike Rapoport	1	-4/+0
	kbuild reports the following warning: arch/sh/kernel/kprobes.c: In function 'kprobe_exceptions_notify': >> arch/sh/kernel/kprobes.c:412:24: warning: variable 'p' set but not used [-Wunused-but-set-variable] 412 \| struct kprobe p = NULL; \| ^ The variable 'p' is indeed unused since the commit fa5a24b16f94 ("sh/kprobes: Don't call the ->break_handler() in SH kprobes code") Remove that variable along with 'kprobe_opcode_t addr' which also becomes unused after 'p' is removed. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505151341.EuRFR22l-lkp@intel.com/ Fixes: fa5a24b16f94 ("sh/kprobes: Don't call the ->break_handler() in SH kprobes code") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
3 days	sh: ecovec24: Make SPI mode explicit	Geert Uytterhoeven	1	-0/+1
	Commit cf9e4784f3bde3e4 ("spi: sh-msiof: Add slave mode support") added a new mode member to the sh_msiof_spi_info structure, but did not update any board files. Hence all users in board files rely on the default being host mode. Make this unambiguous by configuring host mode explicitly. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
3 days	sh: Replace __ASSEMBLY__ with __ASSEMBLER__ in all headers	Thomas Huth	17	-46/+46
	While the GCC and Clang compilers already define __ASSEMBLER__ automatically when compiling assembly code, __ASSEMBLY__ is a macro that only gets defined by the Makefiles in the kernel. This can be very confusing when switching between userspace and kernelspace coding, or when dealing with uapi headers that rather should use __ASSEMBLER__ instead. So let's standardize on the __ASSEMBLER__ macro that is provided by the compilers now. This is a completely mechanical patch (done with a simple "sed -i" statement). Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Rich Felker <dalias@libc.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: linux-sh@vger.kernel.org Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
3 days	genksyms: Fix enum consts from a reference affecting new values	Petr Pavlu	1	-7/+20
	Enumeration constants read from a symbol reference file can incorrectly affect new enumeration constants parsed from an actual input file. Example: $ cat test.c enum { E_A, E_B, E_MAX }; struct bar { int mem[E_MAX]; }; int foo(struct bar a) {} __GENKSYMS_EXPORT_SYMBOL(foo); $ cat test.c \| ./scripts/genksyms/genksyms -T test.0.symtypes #SYMVER foo 0x070d854d $ cat test.0.symtypes E#E_MAX 2 s#bar struct bar { int mem [ E#E_MAX ] ; } foo int foo ( s#bar ) $ cat test.c \| ./scripts/genksyms/genksyms -T test.1.symtypes -r test.0.symtypes <stdin>:4: warning: foo: modversion changed because of changes in enum constant E_MAX #SYMVER foo 0x9c9dfd81 $ cat test.1.symtypes E#E_MAX ( 2 ) + 3 s#bar struct bar { int mem [ E#E_MAX ] ; } foo int foo ( s#bar * ) The __add_symbol() function includes logic to handle the incrementation of enumeration values, but this code is also invoked when reading a reference file. As a result, the variables last_enum_expr and enum_counter might be incorrectly set after reading the reference file, which later affects parsing of the actual input. Fix the problem by splitting the logic for the incrementation of enumeration values into a separate function process_enum() and call it from __add_symbol() only when processing non-reference data. Fixes: e37ddb825003 ("genksyms: Track changes to enum constants") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
3 days	arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds	Masahiro Yamada	21	-21/+21
	The extra-y syntax is deprecated. Instead, use always-$(KBUILD_BUILTIN), which behaves equivalently. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Johannes Berg <johannes@sipsolutions.net> Reviewed-by: Nicolas Schier <n.schier@avm.de>
3 days	do_change_type(): refuse to operate on unmounted/not ours mounts	Al Viro	1	-0/+4
	Ensure that propagation settings can only be changed for mounts located in the caller's mount namespace. This change aligns permission checking with the rest of mount(2). Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: 07b20889e305 ("beginning of the shared-subtree proper") Reported-by: "Orlando, Noah" <Noah.Orlando@deshaw.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES}	Masahiro Yamada	2	-8/+12
	KBUILD_BUILTIN is set to 1 unless you are building only modules. KBUILD_MODULES is set to 1 when you are building only modules (a typical use case is "make modules"). It is more useful to set them to 'y' instead, so we can do something like: always-$(KBUILD_BUILTIN) += vmlinux.lds This works equivalently to: extra-y += vmlinux.lds This allows us to deprecate extra-y. extra-y and always-y are quite similar, and we do not need both. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Nicolas Schier <n.schier@avm.de>
3 days	clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns	Al Viro	1	-0/+3
	What we want is to verify there is that clone won't expose something hidden by a mount we wouldn't be able to undo. "Wouldn't be able to undo" may be a result of MNT_LOCKED on a child, but it may also come from lacking admin rights in the userns of the namespace mount belongs to. clone_private_mnt() checks the former, but not the latter. There's a number of rather confusing CAP_SYS_ADMIN checks in various userns during the mount, especially with the new mount API; they serve different purposes and in case of clone_private_mnt() they usually, but not always end up covering the missing check mentioned above. Reviewed-by: Christian Brauner <brauner@kernel.org> Reported-by: "Orlando, Noah" <Noah.Orlando@deshaw.com> Fixes: 427215d85e8d ("ovl: prevent private clone if bind mount is not allowed") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	Merge tag 'mm-stable-2025-06-06-16-09' of ↵	Linus Torvalds	16	-44/+122
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: "The series 'Fix uprobe pte be overwritten when expanding vma' fixes a longstanding and quite obscure bug related to the vma merging of the uprobe mmap page" * tag 'mm-stable-2025-06-06-16-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: selftests/mm: add test about uprobe pte be orphan during vma merge selftests/mm: extract read_sysfs and write_sysfs into vm_util mm: expose abnormal new_pte during move_ptes mm: fix uprobe pte be overwritten when expanding vma mm/damon: s/primitives/code/ on comments
3 days	Merge tag 'mm-hotfixes-stable-2025-06-06-16-02' of ↵	Linus Torvalds	16	-46/+160
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "13 hotfixes. 6 are cc:stable and the remainder address post-6.15 issues or aren't considered necessary for -stable kernels. 11 are for MM" * tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count MAINTAINERS: add mm swap section kmsan: test: add module description MAINTAINERS: add tlb trace events to MMU GATHER AND TLB INVALIDATION mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race mm/hugetlb: unshare page tables during VMA split, not before MAINTAINERS: add Alistair as reviewer of mm memory policy iov_iter: use iov_offset for length calculation in iov_iter_aligned_bvec mm/mempolicy: fix incorrect freeing of wi_kobj alloc_tag: handle module codetag load errors as module load failures mm/madvise: handle madvise_lock() failure during race unwinding mm: fix vmstat after removing NR_BOUNCE KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY
3 days	selftests/mount_setattr: adapt detached mount propagation test	Christian Brauner	1	-16/+1
	Make sure that detached trees don't receive mount propagation. Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases	Al Viro	1	-21/+25
	... and fix the breakage in anon-to-anon case. There are two cases acceptable for do_move_mount() and mixing checks for those is making things hard to follow. One case is move of a subtree in caller's namespace. * source and destination must be in caller's namespace * source must be detachable from parent Another is moving the entire anon namespace elsewhere * source must be the root of anon namespace * target must either in caller's namespace or in a suitable anon namespace (see may_use_mount() for details). * target must not be in the same namespace as source. It's really easier to follow if tests are not mixed together... Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: 3b5260d12b1f ("Don't propagate mounts into detached trees") Reported-by: Allison Karlitskaya <lis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	fs: allow clone_private_mount() for a path on real rootfs	KONDO KAZUMA(近藤　和真)	1	-10/+11
	Mounting overlayfs with a directory on real rootfs (initramfs) as upperdir has failed with following message since commit db04662e2f4f ("fs: allow detached mounts in clone_private_mount()"). [ 4.080134] overlayfs: failed to clone upperpath Overlayfs mount uses clone_private_mount() to create internal mount for the underlying layers. The commit made clone_private_mount() reject real rootfs because it does not have a parent mount and is in the initial mount namespace, that is not an anonymous mount namespace. This issue can be fixed by modifying the permission check of clone_private_mount() following [1]. Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: db04662e2f4f ("fs: allow detached mounts in clone_private_mount()") Link: https://lore.kernel.org/all/20250514190252.GQ2023217@ZenIV/ [1] Link: https://lore.kernel.org/all/20250506194849.GT2023217@ZenIV/ Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Kazuma Kondo <kazuma-kondo@nec.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)	Al Viro	1	-1/+1
	9ffb14ef61ba "move_mount: allow to add a mount into an existing group" breaks assertions on ->mnt_share/->mnt_slave. For once, the data structures in question are actually documented. Documentation/filesystem/sharedsubtree.rst: All vfsmounts in a peer group have the same ->mnt_master. If it is non-NULL, they form a contiguous (ordered) segment of slave list. do_set_group() puts a mount into the same place in propagation graph as the old one. As the result, if old mount gets events from somewhere and is not a pure event sink, new one needs to be placed next to the old one in the slave list the old one's on. If it is a pure event sink, we only need to make sure the new one doesn't end up in the middle of some peer group. "move_mount: allow to add a mount into an existing group" ends up putting the new one in the beginning of list; that's definitely not going to be in the middle of anything, so that's fine for case when old is not marked shared. In case when old one _is_ marked shared (i.e. is not a pure event sink), that breaks the assumptions of propagation graph iterators. Put the new mount next to the old one on the list - that does the right thing in "old is marked shared" case and is just as correct as the current behaviour if old is not marked shared (kudos to Pavel for pointing that out - my original suggested fix changed behaviour in the "nor marked" case, which complicated things for no good reason). Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: 9ffb14ef61ba ("move_mount: allow to add a mount into an existing group") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	finish_automount(): don't leak MNT_LOCKED from parent to child	Al Viro	1	-1/+2
	Intention for MNT_LOCKED had always been to protect the internal mountpoints within a subtree that got copied across the userns boundary, not the mountpoint that tree got attached to - after all, it _was_ exposed before the copying. For roots of secondary copies that is enforced in attach_recursive_mnt() - MNT_LOCKED is explicitly stripped for those. For the root of primary copy we are almost always guaranteed that MNT_LOCKED won't be there, so attach_recursive_mnt() doesn't bother. Unfortunately, one call chain got overlooked - triggering e.g. NFS referral will have the submount inherit the public flags from parent; that's fine for such things as read-only, nosuid, etc., but not for MNT_LOCKED. This is particularly pointless since the mount attached by finish_automount() is usually expirable, which makes any protection granted by MNT_LOCKED null and void; just wait for a while and that mount will go away on its own. Include MNT_LOCKED into the set of flags to be ignored by do_add_mount() - it really is an internal flag. Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	path_overmount(): avoid false negatives	Al Viro	1	-6/+13
	Holding namespace_sem is enough to make sure that result remains valid. It is not enough to avoid false negatives from __lookup_mnt(). Mounts can be unhashed outside of namespace_sem (stuck children getting detached on final mntput() of lazy-umounted mount) and having an unrelated mount removed from the hash chain while we traverse it may end up with false negative from __lookup_mnt(). We need to sample and recheck the seqlock component of mount_lock... Bug predates the introduction of path_overmount() - it had come from the code in finish_automount() that got abstracted into that helper. Reviewed-by: Christian Brauner <brauner@kernel.org> Fixes: 26df6034fdb2 ("fix automount/automount race properly") Fixes: 6ac392815628 ("fs: allow to mount beneath top mount") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	fs/fhandle.c: fix a race in call of has_locked_children()	Al Viro	1	-4/+14
	may_decode_fh() is calling has_locked_children() while holding no locks. That's an oopsable race... The rest of the callers are safe since they are holding namespace_sem and are guaranteed a positive refcount on the mount in question. Rename the current has_locked_children() to __has_locked_children(), make it static and switch the fs/namespace.c users to it. Make has_locked_children() a wrapper for __has_locked_children(), calling the latter under read_seqlock_excl(&mount_lock). Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: 620c266f3949 ("fhandle: relax open_by_handle_at() permission checks") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
3 days	platform/loongarch: laptop: Unregister generic_sub_drivers on exit	Yao Zi	1	-3/+9
	Without correct unregisteration, ACPI notify handlers and the platform drivers installed by generic_subdriver_init() will become dangling references after removing the loongson_laptop module, triggering various kernel faults when a hotkey is sent or at kernel shutdown. Cc: stable@vger.kernel.org Fixes: 6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver") Signed-off-by: Yao Zi <ziyao@disroot.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 days	platform/loongarch: laptop: Add backlight power control support	Yao Zi	1	-36/+37
	loongson_laptop_turn_{on,off}_backlight() are designed for controlling the power of the backlight, but they aren't really used in the driver previously. Unify these two functions since they only differ in arguments passed to ACPI method, and wire up loongson_laptop_backlight_update() to update the power state of the backlight as well. Tested on the TongFang L860-T2 Loongson-3A5000 laptop. Cc: stable@vger.kernel.org Fixes: 6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver") Signed-off-by: Yao Zi <ziyao@disroot.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
3 days	Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi	Linus Torvalds	5	-47/+110
	Pull SCSI fixes from James Bottomley: "Mostly trivial updates and bug fixes (core update is a comment spelling fix). The bigger UFS update is the clock scaling and frequency fixes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: ufs: qcom: Prevent calling phy_exit() before phy_init() scsi: ufs: qcom: Call ufs_qcom_cfg_timers() in clock scaling path scsi: ufs: qcom: Map devfreq OPP freq to UniPro Core Clock freq scsi: ufs: qcom: Check gear against max gear in vop freq_to_gear() scsi: aacraid: Remove useless code scsi: core: devinfo: Fix typo in comment scsi: ufs: core: Don't perform UFS clkscaling during host async scan
3 days	Merge tag 'riscv-for-linus-6.16-mw1' of ↵	Linus Torvalds	74	-855/+3787
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - Support for the FWFT SBI extension, which is part of SBI 3.0 and a dependency for many new SBI and ISA extensions - Support for getrandom() in the VDSO - Support for mseal - Optimized routines for raid6 syndrome and recovery calculations - kexec_file() supports loading Image-formatted kernel binaries - Improvements to the instruction patching framework to allow for atomic instruction patching, along with rules as to how systems need to behave in order to function correctly - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha, some SiFive vendor extensions - Various fixes and cleanups, including: misaligned access handling, perf symbol mangling, module loading, PUD THPs, and improved uaccess routines * tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits) riscv: uaccess: Only restore the CSR_STATUS SUM bit RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension RISC-V: Documentation: Add enough title underlines to CMODX riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE MAINTAINERS: Update Atish's email address riscv: uaccess: do not do misaligned accesses in get/put_user() riscv: process: use unsigned int instead of unsigned long for put_user() riscv: make unsafe user copy routines use existing assembly routines riscv: hwprobe: export Zabha extension riscv: Make regs_irqs_disabled() more clear perf symbols: Ignore mapping symbols on riscv RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND riscv: module: Optimize PLT/GOT entry counting riscv: Add support for PUD THP riscv: xchg: Prefetch the destination word for sc.w riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop riscv: Add support for Zicbop ...
3 days	Merge tag 's390-6.16-2' of ↵	Linus Torvalds	2	-0/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - Add missing select CRYPTO_ENGINE to CRYPTO_PAES_S390 - Fix secure storage access exception handling when fault handling is disabled * tag 's390-6.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: Fix in_atomic() handling in do_secure_storage_access() s390/crypto: Select crypto engine in Kconfig when PAES is chosen
3 days	Merge tag 'tomoyo-pr-20250606' of git://git.code.sf.net/p/tomoyo/tomoyo	Linus Torvalds	1	-4/+2
	Pull tomoyo update from Tetsuo Handa: "Update mailing list address" * tag 'tomoyo-pr-20250606' of git://git.code.sf.net/p/tomoyo/tomoyo: tomoyo: update mailing lists
3 days	Merge tag 'ceph-for-6.16-rc1' of https://github.com/ceph/ceph-client	Linus Torvalds	4	-11/+25
	Pull ceph updates from Ilya Dryomov: - a one-liner that leads to a startling (but also very much rational) performance improvement in cases where an IMA policy with rules that are based on fsmagic matching is enforced - an encryption-related fixup that addresses generic/397 and other fstest failures - a couple of cleanups in CephFS * tag 'ceph-for-6.16-rc1' of https://github.com/ceph/ceph-client: ceph: fix variable dereferenced before check in ceph_umount_begin() ceph: set superblock s_magic for IMA fsmagic matching ceph: cleanup hardcoded constants of file handle size ceph: fix possible integer overflow in ceph_zero_objects() ceph: avoid kernel BUG for encrypted inode with unaligned file size
3 days	Merge tag 'ovl-update-v2-6.16' of ↵	Linus Torvalds	7	-80/+84
	git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs Pull overlayfs update from Miklos Szeredi: - Fix a regression in getting the path of an open file (e.g. in /proc/PID/maps) for a nested overlayfs setup (André Almeida) - Support data-only layers and verity in a user namespace (unprivileged composefs use case) - Fix a gcc warning (Kees) - Cleanups * tag 'ovl-update-v2-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs: ovl: Annotate struct ovl_entry with __counted_by() ovl: Replace offsetof() with struct_size() in ovl_stack_free() ovl: Replace offsetof() with struct_size() in ovl_cache_entry_new() ovl: Check for NULL d_inode() in ovl_dentry_upper() ovl: Use str_on_off() helper in ovl_show_options() ovl: don't require "metacopy=on" for "verity" ovl: relax redirect/metacopy requirements for lower -> data redirect ovl: make redirect/metacopy rejection consistent ovl: Fix nested backing file paths
3 days	tracing: PM: Remove unused clock events	Steven Rostedt	1	-47/+0
	The events clock_enable, clock_disable, and clock_set_rate were added back in 2010. In 2011 they were used by the arm architecture but removed in 2013. These events add around 7K of memory which was wasted for the last 12 years. Remove them. Link: https://lore.kernel.org/all/20250529130138.544ffec4@gandalf.local.home/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kajetan Puchalski <kajetan.puchalski@arm.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/20250605162106.1a459dad@gandalf.local.home Fixes: 74704ac6ea402 ("tracing, perf: Add more power related events") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 days	ring-buffer: Fix buffer locking in ring_buffer_subbuf_order_set()	Dmitry Antipov	1	-3/+1
	Enlarge the critical section in ring_buffer_subbuf_order_set() to ensure that error handling takes place with per-buffer mutex held, thus preventing list corruption and other concurrency-related issues. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tzvetomir Stoyanov <tz.stoyanov@gmail.com> Link: https://lore.kernel.org/20250606112242.1510605-1-dmantipov@yandex.ru Reported-by: syzbot+05d673e83ec640f0ced9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=05d673e83ec640f0ced9 Fixes: f9b94daa542a8 ("ring-buffer: Set new size of the ring buffer sub page") Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 days	tracing: Fix regression of filter waiting a long time on RCU synchronization	Steven Rostedt	1	-48/+138
	When faultable trace events were added, a trace event may no longer use normal RCU to synchronize but instead used synchronize_rcu_tasks_trace(). This synchronization takes a much longer time to synchronize. The filter logic would free the filters by calling tracepoint_synchronize_unregister() after it unhooked the filter strings and before freeing them. With this function now calling synchronize_rcu_tasks_trace() this increased the time to free a filter tremendously. On a PREEMPT_RT system, it was even more noticeable. # time trace-cmd record -p function sleep 1 [..] real 2m29.052s user 0m0.244s sys 0m20.136s As trace-cmd would clear out all the filters before recording, it could take up to 2 minutes to do a recording of "sleep 1". To find out where the issues was: ~# trace-cmd sqlhist -e -n sched_stack select start.prev_state as state, end.next_comm as comm, TIMESTAMP_DELTA_USECS as delta, start.STACKTRACE as stack from sched_switch as start join sched_switch as end on start.prev_pid = end.next_pid Which will produce the following commands (and -e will also execute them): echo 's:sched_stack s64 state; char comm[16]; u64 delta; unsigned long stack[];' >> /sys/kernel/tracing/dynamic_events echo 'hist:keys=prev_pid:__arg_18057_2=prev_state,__arg_18057_4=common_timestamp.usecs,__arg_18057_7=common_stacktrace' >> /sys/kernel/tracing/events/sched/sched_switch/trigger echo 'hist:keys=next_pid:__state_18057_1=$__arg_18057_2,__comm_18057_3=next_comm,__delta_18057_5=common_timestamp.usecs-$__arg_18057_4,__stack_18057_6=$__arg_18057_7:onmatch(sched.sched_switch).trace(sched_stack,$__state_18057_1,$__comm_18057_3,$__delta_18057_5,$__stack_18057_6)' >> /sys/kernel/tracing/events/sched/sched_switch/trigger The above creates a synthetic event that creates a stack trace when a task schedules out and records it with the time it scheduled back in. Basically the time a task is off the CPU. It also records the state of the task when it left the CPU (running, blocked, sleeping, etc). It also saves the comm of the task as "comm" (needed for the next command). ~# echo 'hist:keys=state,stack.stacktrace:vals=delta:sort=state,delta if comm == "trace-cmd" && state & 3' > /sys/kernel/tracing/events/synthetic/sched_stack/trigger The above creates a histogram with buckets per state, per stack, and the value of the total time it was off the CPU for that stack trace. It filters on tasks with "comm == trace-cmd" and only the sleeping and blocked states (1 - sleeping, 2 - blocked). ~# trace-cmd record -p function sleep 1 ~# cat /sys/kernel/tracing/events/synthetic/sched_stack/hist \| tail -18 { state: 2, stack.stacktrace __schedule+0x1545/0x3700 schedule+0xe2/0x390 schedule_timeout+0x175/0x200 wait_for_completion_state+0x294/0x440 __wait_rcu_gp+0x247/0x4f0 synchronize_rcu_tasks_generic+0x151/0x230 apply_subsystem_event_filter+0xa2b/0x1300 subsystem_filter_write+0x67/0xc0 vfs_write+0x1e2/0xeb0 ksys_write+0xff/0x1d0 do_syscall_64+0x7b/0x420 entry_SYSCALL_64_after_hwframe+0x76/0x7e } hitcount: 237 delta: 99756288 <<--------------- Delta is 99 seconds! Totals: Hits: 525 Entries: 21 Dropped: 0 This shows that this particular trace waited for 99 seconds on synchronize_rcu_tasks() in apply_subsystem_event_filter(). In fact, there's a lot of places in the filter code that spends a lot of time waiting for synchronize_rcu_tasks_trace() in order to free the filters. Add helper functions that will use call_rcu*() variants to asynchronously free the filters. This brings the timings back to normal: # time trace-cmd record -p function sleep 1 [..] real 0m14.681s user 0m0.335s sys 0m28.616s And the histogram also shows this: ~# cat /sys/kernel/tracing/events/synthetic/sched_stack/hist \| tail -21 { state: 2, stack.stacktrace __schedule+0x1545/0x3700 schedule+0xe2/0x390 schedule_timeout+0x175/0x200 wait_for_completion_state+0x294/0x440 __wait_rcu_gp+0x247/0x4f0 synchronize_rcu_normal+0x3db/0x5c0 tracing_reset_online_cpus+0x8f/0x1e0 tracing_open+0x335/0x440 do_dentry_open+0x4c6/0x17a0 vfs_open+0x82/0x360 path_openat+0x1a36/0x2990 do_filp_open+0x1c5/0x420 do_sys_openat2+0xed/0x180 __x64_sys_openat+0x108/0x1d0 do_syscall_64+0x7b/0x420 } hitcount: 2 delta: 77044 Totals: Hits: 55 Entries: 28 Dropped: 0 Where the total waiting time of synchronize_rcu_tasks_trace() is 77 milliseconds. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Andreas Ziegler <ziegler.andreas@siemens.com> Cc: Felix MOESSBAUER <felix.moessbauer@siemens.com> Link: https://lore.kernel.org/20250606201936.1e3d09a9@batman.local.home Reported-by: "Flot, Julien" <julien.flot@siemens.com> Tested-by: Julien Flot <julien.flot@siemens.com> Fixes: a363d27cdbc2 ("tracing: Allow system call tracepoints to handle page faults") Closes: https://lore.kernel.org/all/240017f656631c7dd4017aa93d91f41f653788ea.camel@siemens.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
3 days	Merge tag 'spi-v6.16-merge-window' of ↵	Linus Torvalds	6	-19/+58
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull more spi updates from Mark Brown: "A small set of updates that came in during the merge window, we've got: - Some small fixes for the Broadcom and spi-pci1xxxx drivers - A change to the QPIC SNAND driver to flag that the error correction features are less useful than people might be expecting - A new device ID for the SOPHGO SG2042 - The addition of Yang Shen as a Huawei maintainer" * tag 'spi-v6.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-qpic-snand: document the limited bit error reporting capability spi: bcm63xx-hsspi: fix shared reset spi: bcm63xx-spi: fix shared reset MAINTAINERS: Update HiSilicon SFC driver maintainer MAINTAINERS: Update HiSilicon SPI Controller driver maintainer spi: dt-bindings: spi-sg2044-nor: Add SOPHGO SG2042 spi: spi-pci1xxxx: Fix Probe failure with Dual SPI instance with INTx interrupts
3 days	Merge tag 'regulator-fix-v6.16-merge-window' of ↵	Linus Torvalds	1	-1/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "A very minor fix that came in during the merge window, checking for I/O errors in the MAX14577 driver" * tag 'regulator-fix-v6.16-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: max14577: Add error check for max14577_read_reg()
3 days	Merge tag 'pwm/for-6.16-rc1-fixes' of ↵	Linus Torvalds	2	-5/+31
	git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux Pull pwm fixes from Uwe Kleine-König: "axi-pwmgen: Fix handling of external clock The pwm-axi-pwmgen device is backed by an FPGA and can be synthesized in different ways. Relevant here is that it can use one or two external clock signals. These fix clock handling for the two clocks case" * tag 'pwm/for-6.16-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: pwm: axi-pwmgen: fix missing separate external clock dt-bindings: pwm: adi,axi-pwmgen: Fix clocks
3 days	Merge tag 'block-6.16-20250606' of git://git.kernel.dk/linux	Linus Torvalds	48	-378/+693
	Pull more block updates from Jens Axboe: - NVMe pull request via Christoph: - TCP error handling fix (Shin'ichiro Kawasaki) - TCP I/O stall handling fixes (Hannes Reinecke) - fix command limits status code (Keith Busch) - support vectored buffers also for passthrough (Pavel Begunkov) - spelling fixes (Yi Zhang) - MD pull request via Yu: - fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10 - fix max_write_behind setting for dm-raid - some minor cleanups - Integrity data direction fix and cleanup - bcache NULL pointer fix - Fix for loop missing write start/end handling - Decouple hardware queues and IO threads in ublk - Slew of ublk selftests additions and updates * tag 'block-6.16-20250606' of git://git.kernel.dk/linux: (29 commits) nvme: spelling fixes nvme-tcp: fix I/O stalls on congested sockets nvme-tcp: sanitize request list handling nvme-tcp: remove tag set when second admin queue config fails nvme: enable vectored registered bufs for passthrough cmds nvme: fix implicit bool to flags conversion nvme: fix command limits status code selftests: ublk: kublk: improve behavior on init failure block: flip iter directions in blk_rq_integrity_map_user() block: drop direction param from bio_integrity_copy_user() selftests: ublk: cover PER_IO_DAEMON in more stress tests Documentation: ublk: document UBLK_F_PER_IO_DAEMON selftests: ublk: add stress test for per io daemons selftests: ublk: add functional test for per io daemons selftests: ublk: kublk: decouple ublk_queues from ublk server threads selftests: ublk: kublk: move per-thread data out of ublk_queue selftests: ublk: kublk: lift queue initialization out of thread selftests: ublk: kublk: tie sqe allocation to io instead of queue selftests: ublk: kublk: plumb q_id in io_uring user_data ublk: have a per-io daemon instead of a per-queue daemon ...
3 days	Merge tag 'io_uring-6.16-20250606' of git://git.kernel.dk/linux	Linus Torvalds	8	-13/+37
	Pull io_uring fixes from Jens Axboe: - Fix for a regression introduced in this merge window, where the 'id' passed to xa_find() for ifq lookup is uninitialized - Fix for zcrx release on registration failure. From 6.15, going to stable - Tweak for recv bundles, where msg_inq should be > 1 before being used to gate a retry event - Pavel doesnt want to be a maintainer anymore, remove him from the MAINTAINERS entry - Limit legacy kbuf registrations to 64k, which is the size of the buffer ID field anyway. Hence it's nonsensical to support more than that, and the only purpose that serves is to have syzbot trigger long exit delays for heavily configured debug kernels - Fix for the io_uring futex handling, which got broken for FUTEX2_PRIVATE by a generic futex commit adding private hashes * tag 'io_uring-6.16-20250606' of git://git.kernel.dk/linux: io_uring/futex: mark wait requests as inflight io_uring/futex: get rid of struct io_futex addr union io_uring/kbuf: limit legacy provided buffer lists to USHRT_MAX MAINTAINERS: remove myself from io_uring io_uring/net: only consider msg_inq if larger than 1 io_uring/zcrx: fix area release on registration failure io_uring/zcrx: init id for xa_find
3 days	Merge tag 'usb-6.16-rc1' of ↵	Linus Torvalds	160	-12392/+10369
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB / Thunderbolt updates from Greg KH: "Here is the big set of USB and Thunderbolt changes for 6.16-rc1. Included in here are the following: - USB offload support for audio devices. I think this takes the record for the most number of patch series (30+) over the longest period of time (2+ years) to get merged properly. Many props go to Wesley Cheng for seeing this effort through, they took a major out-of-tree hacked-up-monstrosity that was created by multiple vendors for their specific devices, got it all merged into a semi-coherent set of changes, and got all of the different major subsystems to agree on how this should be implemented both with changes to their code as well as userspace apis, AND wrangled the hardware companies into agreeing to go forward with this, despite making them all redo work they had already done in their private device trees. This feature offers major power savings on embedded devices where a USB audio stream can continue to flow while the rest of the system is sleeping, something that devices running on battery power really care about. There are still some more small tweaks left to be done here, and those patches are still out for review and arguing among the different hardware companies, but this is a major step forward and a great example of how to do upstream development well. - small number of thunderbolt fixes and updates, things seem to be slowing down here (famous last words...) - xhci refactors and reworking to try to handle some rough corner cases in some hardware implementations where things don't always work properly - typec driver updates - USB3 power management reworking and updates - Removal of some old and orphaned UDC gadget drivers that had not been used in a very long time, dropping over 11 thousand lines from the tree, always a nice thing, making up for the 12k lines added for the USB offload feature. - lots of little updates and fixes in different drivers All of these have been in linux-next for over 2 weeks, the USB offload logic has been in there for 8 weeks now, with no reported issues" * tag 'usb-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (172 commits) ALSA: usb-audio: qcom: fix USB_XHCI dependency ASoC: qdsp6: fix compile-testing without CONFIG_OF usb: misc: onboard_usb_dev: fix build warning for CONFIG_USB_ONBOARD_DEV_USB5744=n usb: typec: tipd: fix typo in TPS_STATUS_HIGH_VOLAGE_WARNING macro USB: typec: fix const issue in typec_match() USB: gadget: udc: fix const issue in gadget_match_driver() USB: gadget: fix up const issue with struct usb_function_instance USB: serial: pl2303: add new chip PL2303GC-Q20 and PL2303GT-2AB USB: serial: bus: fix const issue in usb_serial_device_match() usb: usbtmc: Fix timeout value in get_stb usb: usbtmc: Fix read_stb function and get_stb ioctl ALSA: qc_audio_offload: try to reduce address space confusion ALSA: qc_audio_offload: avoid leaking xfer_buf allocation ALSA: qc_audio_offload: rename dma/iova/va/cpu/phys variables ALSA: usb-audio: qcom: Fix an error handling path in qc_usb_audio_probe() usb: misc: onboard_usb_dev: Fix usb5744 initialization sequence dt-bindings: usb: ti,usb8041: Add binding for TI USB8044 hub controller usb: misc: onboard_usb_dev: Add support for TI TUSB8044 hub usb: gadget: lpc32xx_udc: Use USB API functions rather than constants usb: gadget: epautoconf: Use USB API functions rather than constants ...
3 days	Merge tag 'tty-6.16-rc1' of ↵	Linus Torvalds	80	-958/+6912
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big set of tty and serial driver changes for 6.16-rc1. A little more churn than normal in this portion of the kernel for this development cycle, Jiri and Nicholas were busy with cleanups and reviews and fixes for the vt unicode handling logic which composed most of the overall work in here. Major changes are: - vt unicode changes/reverts/changes from Nicholas. This should help out a lot with screen readers and others that rely on vt console support - lock guard additions to the core tty/serial code to clean up lots of error handling logic - 8250 driver updates and fixes - device tree conversions to yaml - sh-sci driver updates - other small cleanups and updates for serial drivers and tty core portions All of these have been in linux-next for 2 weeks with no reported issues" * tag 'tty-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (105 commits) tty: serial: 8250_omap: fix TX with DMA for am33xx vt: add VT_GETCONSIZECSRPOS to retrieve console size and cursor position vt: bracketed paste support vt: remove VT_RESIZE and VT_RESIZEX from vt_compat_ioctl() vt: process the full-width ASCII fallback range programmatically vt: make use of ucs_get_fallback() when glyph is unavailable vt: add ucs_get_fallback() vt: create ucs_fallback_table.h_shipped with gen_ucs_fallback_table.py vt: introduce gen_ucs_fallback_table.py to create ucs_fallback_table.h vt: move glyph determination to a separate function vt: make sure displayed double-width characters are remembered as such vt: ucs.c: fix misappropriate in_range() usage serial: max3100: Replace open-coded parity calculation with parity8() dt-bindings: serial: 8250_omap: Drop redundant properties dt-bindings: serial: Convert socionext,milbeaut-usio-uart to DT schema dt-bindings: serial: Convert microchip,pic32mzda-uart to DT schema dt-bindings: serial: Convert arm,sbsa-uart to DT schema dt-bindings: serial: Convert snps,arc-uart to DT schema dt-bindings: serial: Convert marvell,armada-3700-uart to DT schema dt-bindings: serial: Convert lantiq,asc to DT schema ...
4 days	Merge tag 'char-misc-6.16-rc1' of ↵	Linus Torvalds	601	-6125/+12349
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char / misc / iio driver updates from Greg KH: "Here is the big char/misc/iio and other small driver subsystem pull request for 6.16-rc1. Overall, a lot of individual changes, but nothing major, just the normal constant forward progress of new device support and cleanups to existing subsystems. Highlights in here are: - Large IIO driver updates and additions and device tree changes - Android binder bugfixes and logfile fixes - mhi driver updates - comedi driver updates - counter driver updates and additions - coresight driver updates and additions - echo driver removal as there are no in-kernel users of it - nvmem driver updates - spmi driver updates - new amd-sbi driver "subsystem" and drivers added - rust miscdriver binding documentation fix - other small driver fixes and updates (uio, w1, acrn, hpet, xillybus, cardreader drivers, fastrpc and others) All of these have been in linux-next for quite a while with no reported problems" * tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (390 commits) binder: fix yet another UAF in binder_devices counter: microchip-tcb-capture: Add watch validation support dt-bindings: iio: adc: Add ROHM BD79100G iio: adc: add support for Nuvoton NCT7201 dt-bindings: iio: adc: add NCT7201 ADCs iio: chemical: Add driver for SEN0322 dt-bindings: trivial-devices: Document SEN0322 iio: adc: ad7768-1: reorganize driver headers iio: bmp280: zero-init buffer iio: ssp_sensors: optimalize -> optimize HID: sensor-hub: Fix typo and improve documentation iio: admv1013: replace redundant ternary operator with just len iio: chemical: mhz19b: Fix error code in probe() iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros iio: make IIO_DMA_MINALIGN minimum of 8 bytes ...
4 days	Merge tag 'staging-6.16-rc1' of ↵	Linus Torvalds	76	-2612/+1592
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver updates from Greg KH: "Here is the "big" set of staging driver changes for 6.16-rc1. Included in here are: - gpib driver cleanups and updates. This subsystem is _almost_ ready to be merged into the main portion of the kernel tree. Hopefully should happen in the next kernel merge cycle if all goes well. - sm750fb driver cleanups - rtl8723bs driver cleanups - other small driver cleanups for coding style issues All of these have been in for over 2 weeks with no reported issues" * tag 'staging-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (203 commits) staging: rtl8723bs: remove unnecessary braces for single statement blocks staging: rtl8723bs: Removed multiple blank lines of rtw_pwrctrl.c staging: sm750fb: rename `hw_sm750le_setBLANK` staging: sm750fb: rename `hw_sm750_setBLANK` staging: sm750fb: rename `hw_sm750_setColReg` staging: sm750fb: rename `hw_sm750_crtc_setMode` staging: sm750fb: rename `hw_sm750_crtc_checkMode` staging: sm750fb: rename `hw_sm750_output_setMode` staging: sm750fb: rename `hw_sm750le_deWait` staging: sm750fb: rename `hw_sm750_deWait` staging: sm750fb: rename `hw_sm750_initAccel` staging: gpib: switch to kmalloc(sizeof(*status)) staging: gpib: Fix secondary address restriction staging: gpib: Fix uapi include header guard name staging: gpib: Avoid unused variable warning staging: gpib: Declare driver entry points static staging: gpib: Fix PCMCIA config identifier staging: gpib: Avoid unused variable warnings staging: gpib: Fix lpvo request_system_control staging: sm750fb: rename sm750_hw_cursor_setData2 ...
4 days	Merge tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds	10	-96/+159
	Pull more drm fixes from Simona Vetter: "Another small batch of drm fixes, this time with a different baseline and hence separate. Drivers: - ivpu: - dma_resv locking - warning fixes - reset failure handling - improve logging - update fw file names - fix cmdqueue unregister - panel-simple: add Evervision VGG644804 Core Changes: - sysfb: screen_info type check - video: screen_info for relocated pci fb - drm/sched: signal fence of killed job - dummycon: deferred takeover fix" * tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel: sysfb: Fix screen_info type check for VGA video: screen_info: Relocate framebuffers behind PCI bridges accel/ivpu: Fix warning in ivpu_gem_bo_free() accel/ivpu: Trigger device recovery on engine reset/resume failure accel/ivpu: Use dma_resv_lock() instead of a custom mutex drm/panel-simple: fix the warnings for the Evervision VGG644804 accel/ivpu: Reorder Doorbell Unregister and Command Queue Destruction accel/ivpu: Use firmware names from upstream repo accel/ivpu: Improve buffer object logging dummycon: Trigger redraw when switching consoles with deferred takeover drm/scheduler: signal scheduled fence when kill job
4 days	Merge tag 'drm-next-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel	Linus Torvalds	140	-633/+2314
	Pull drm fixes from Dave Airlie: "This is pretty much two weeks worth of fixes, plus one thing that might be considered next: amdkfd is now able to be enabled on risc-v platforms. Otherwise, amdgpu and xe with the majority of fixes, and then a smattering all over. panel: - nt37801: fix IS_ERR - nt37801: fix KConfig connector: - Fix null deref in HDMI audio helper. bridge: - analogix_dp: fixup clk-disable removal nouveau: - minor typo fix (',' vs ';') msm: - mailmap updates i915: - Fix the enabling/disabling of DP audio SDP splitting - Fix PSR register definitions for ALPM - Fix u32 overflow in SNPS PHY HDMI PLL setup - Fix GuC pending message underflow when submit fails - Fix GuC wakeref underflow race during reset xe: - Two documentation fixes - A couple of vm init fixes - Hwmon fixes - Drop reduntant conversion to bool - Fix CONFIG_INTEL_VSEC dependency - Rework eviction rejection of bound external bos - Stop re-submitting signalled jobs - A couple of pxp fixes - Add back a fix that got lost in a merge - Create LRC bo without VM - Fix for the above fix amdgpu: - UserQ fixes - SMU 13.x fixes - VCN fixes - JPEG fixes - Misc cleanups - runtime pm fix - DCN 4.0.1 fixes - Misc display fixes - ISP fix - VRAM manager fix - RAS fixes - IP discovery fix - Cleaner shader fix for GC 10.1.x - OD fix - Non-OLED panel fix - Misc display fixes - Brightness fixes amdkfd: - Enable CONFIG_HSA_AMD on RISCV - SVM fix - Misc cleanups - Ref leak fix - WPTR BO fix radeon: - Misc cleanups" * tag 'drm-next-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel: (105 commits) drm/nouveau/vfn/r535: Convert comma to semicolon drm/xe: remove unmatched xe_vm_unlock() from __xe_exec_queue_init() drm/xe: Create LRC BO without VM drm/xe/guc_submit: add back fix drm/xe/pxp: Clarify PXP queue creation behavior if PXP is not ready drm/xe/pxp: Use the correct define in the set_property_funcs array drm/xe/sched: stop re-submitting signalled jobs drm/xe: Rework eviction rejection of bound external bos drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency drm/xe: drop redundant conversion to bool drm/xe/hwmon: Move card reactive critical power under channel card drm/xe/hwmon: Add support to manage power limits though mailbox drm/xe/vm: move xe_svm_init() earlier drm/xe/vm: move rebind_work init earlier MAINTAINERS: .mailmap: update Rob Clark's email address mailmap: Update entry for Akhil P Oommen MAINTAINERS: update my email address MAINTAINERS: drop myself as maintainer drm/i915/display: Fix u32 overflow in SNPS PHY HDMI PLL setup drm/amd/display: Fix default DC and AC levels ...
4 days	Merge tag 'mips_6.16' of ↵	Linus Torvalds	32	-43/+434
	git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS updates from Thomas Bogendoerfer: - Added support for EcoNet platform - Added support for parallel CPU bring up on EyeQ - Other cleanups and fixes * tag 'mips_6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (23 commits) MIPS: loongson2ef: lemote-2f: add missing function prototypes MIPS: loongson2ef: cs5536: add missing function prototypes MIPS: SMP: Move the AP sync point before the calibration delay mips: econet: Fix incorrect Kconfig dependencies MAINTAINERS: Add entry for newly added EcoNet platform. mips: dts: Add EcoNet DTS with EN751221 and SmartFiber XP8421-B board dt-bindings: vendor-prefixes: Add SmartFiber mips: Add EcoNet MIPS platform support dt-bindings: mips: Add EcoNet platform binding MIPS: bcm63xx: nvram: avoid inefficient use of crc32_le_combine() mips: dts: pic32: pic32mzda: Rename the sdhci nodename to match with common mmc-controller binding MIPS: SMP: Move the AP sync point before the non-parallel aware functions MIPS: Replace strcpy() with strscpy() in vpe_elfload() MIPS: BCM63XX: Replace strcpy() with strscpy() in board_prom_init() mips: ptrace: Improve code formatting and indentation MIPS: SMP: Implement parallel CPU bring up for EyeQ mips: Add -std= flag specified in KBUILD_CFLAGS to vdso CFLAGS MIPS: Loongson64: Add missing '#interrupt-cells' for loongson64c_ls7a mips: dts: realtek: Add MDIO controller MIPS: txx9: gpio: use new line value setter callbacks ...
4 days	Merge tag 'drm-misc-fixes-2025-06-06' of ↵	Simona Vetter	7	-74/+118
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: ivpu: - gem: Use dma-resv lock - gem. Fix a warning - Trigger recovery on device engine reset/resume failure panel: - panel-simple: Fix settings for Evervision VGG644804 sysfb: - Fix screen_info type check video: - Update screen_info for relocated PCI framebuffers Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250606072853.GA13099@linux.fritz.box
4 days	platform/loongarch: laptop: Get brightness setting from EC on probe	Yao Zi	1	-1/+1
	Previously during driver probe, 1 is unconditionally taken as current brightness value and set to props.brightness, which will be considered as the brightness before suspend and restored to EC on resume. Since a brightness value of 1 almost never matches EC's state on coldboot (my laptop's EC defaults to 80), this causes surprising changes of screen brightness on the first time of resume after coldboot. Let's get brightness from EC and take it as the current brightness on probe of the laptop driver to avoid the surprising behavior. Tested on TongFang L860-T2 Loongson-3A5000 laptop. Cc: stable@vger.kernel.org Fixes: 6246ed09111f ("LoongArch: Add ACPI-based generic laptop driver") Signed-off-by: Yao Zi <ziyao@disroot.org> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
4 days	LoongArch: dts: Add PWM support to Loongson-2K2000	Binbin Zhou	1	-0/+60
	The module is supported, enable it. Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
4 days	LoongArch: dts: Add PWM support to Loongson-2K1000	Binbin Zhou	2	-1/+65
	The module is supported, enable it. Also, add the pwm-fan and cooling-maps associated with it. Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
4 days	LoongArch: dts: Add PWM support to Loongson-2K0500	Binbin Zhou	1	-0/+160
	The module is supported, enable it. Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
4 days	LoongArch: vDSO: Correctly use asm parameters in syscall wrappers	Thomas Weißschuh	2	-4/+4
	The syscall wrappers use the "a0" register for two different register variables, both the first argument and the return value. Here the "ret" variable is used as both input and output while the argument register is only used as input. Clang treats the conflicting input parameters as an undefined behaviour and optimizes away the argument assignment. The code seems to work by chance for the most part today but that may change in the future. Specifically clock_gettime_fallback() fails with clockids from 16 to 23, as implemented by the upcoming auxiliary clocks. Switch the "ret" register variable to a pure output, similar to the other architectures' vDSO code. This works in both clang and GCC. Link: https://lore.kernel.org/lkml/20250602102825-42aa84f0-23f1-4d10-89fc-e8bbaffd291a@linutronix.de/ Link: https://lore.kernel.org/lkml/20250519082042.742926976@linutronix.de/ Fixes: c6b99bed6b8f ("LoongArch: Add VDSO and VSYSCALL support") Fixes: 18efd0b10e0f ("LoongArch: vDSO: Wire up getrandom() vDSO implementation") Cc: stable@vger.kernel.org Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Xi Ruoyao <xry111@xry111.site> Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
4 days	ceph: fix variable dereferenced before check in ceph_umount_begin()	Viacheslav Dubeyko	1	-2/+1
	smatch warnings: fs/ceph/super.c:1042 ceph_umount_begin() warn: variable dereferenced before check 'fsc' (see line 1041) vim +/fsc +1042 fs/ceph/super.c void ceph_umount_begin(struct super_block sb) { struct ceph_fs_client fsc = ceph_sb_to_fs_client(sb); doutc(fsc->client, "starting forced umount\n"); ^^^^^^^^^^^ Dereferenced if (!fsc) ^^^^ Checked too late. return; fsc->mount_state = CEPH_MOUNT_SHUTDOWN; __ceph_umount_begin(fsc); } The VFS guarantees that the superblock is still alive when it calls into ceph via ->umount_begin(). Finally, we don't need to check the fsc and it should be valid. This patch simply removes the fsc check. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/r/202503280852.YDB3pxUY-lkp@intel.com/ Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
4 days	Merge tag 'drm-misc-fixes-2025-05-28' of ↵	Simona Vetter	6	-23/+42
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: drm-scheduler: - signal scheduled fence when killing job dummycon: - trigger deferred takeover when switching consoles ivpu: - improve logging - update firmware filenames - reorder steps in command-queue unregistering Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250528153550.GA21050@linux.fritz.box
4 days	kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count	Max Kellermann	1	-0/+26
	Expose a simple counter to userspace for monitoring tools. (akpm: 2536c5c7d6ae added the documentation but the code changes were lost) Link: https://lkml.kernel.org/r/20250504180831.4190860-3-max.kellermann@ionos.com Fixes: 2536c5c7d6ae ("kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count") Signed-off-by: Max Kellermann <max.kellermann@ionos.com> Cc: Core Minyard <cminyard@mvista.com> Cc: Doug Anderson <dianders@chromium.org> Cc: Joel Granados <joel.granados@kernel.org> Cc: Max Kellermann <max.kellermann@ionos.com> Cc: Song Liu <song@kernel.org> Cc: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	MAINTAINERS: add mm swap section	Lorenzo Stoakes	1	-0/+19
	In furtherance of ongoing efforts to ensure people are aware of who de-facto maintains/has an interest in specific parts of mm, as well trying to avoid get_maintainers.pl listing only Andrew and the mailing list for mm files - establish a swap memory management section and add relevant maintainers/reviewers. Link: https://lkml.kernel.org/r/20250604163139.126630-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Chris Li <chrisl@kernel.org> Acked-by: Kairui Song <kasong@tencent.com> Acked-by: Kemeng Shi <shikemeng@huaweicloud.com> Acked-by: Baoquan He <bhe@redhat.com> Acked-by: Barry Song <baohua@kernel.org> Acked-by: Nhat Pham <nphamcs@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	kmsan: test: add module description	Arnd Bergmann	1	-0/+1
	Every module should have a description, and kbuild now warns for those that don't. WARNING: modpost: missing MODULE_DESCRIPTION() in mm/kmsan/kmsan_test.o Link: https://lkml.kernel.org/r/20250603075323.1839608-1-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Macro Elver <elver@google.com> Cc: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	MAINTAINERS: add tlb trace events to MMU GATHER AND TLB INVALIDATION	Tal Zussman	1	-0/+1
	The MMU GATHER AND TLB INVALIDATION entry lists other TLB-related files. Add the tlb.h tracepoint file there as well. Link: https://lore.kernel.org/linux-mm/ce048e11-f79d-44a6-bacc-46e1ebc34b24@redhat.com/ Link: https://lkml.kernel.org/r/20250603-tlb-maintainers-v1-1-726d193c6693@columbia.edu Signed-off-by: Tal Zussman <tz2294@columbia.edu> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race	Jann Horn	1	-0/+7
	huge_pmd_unshare() drops a reference on a page table that may have previously been shared across processes, potentially turning it into a normal page table used in another process in which unrelated VMAs can afterwards be installed. If this happens in the middle of a concurrent gup_fast(), gup_fast() could end up walking the page tables of another process. While I don't see any way in which that immediately leads to kernel memory corruption, it is really weird and unexpected. Fix it with an explicit broadcast IPI through tlb_remove_table_sync_one(), just like we do in khugepaged when removing page tables for a THP collapse. Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-2-1329349bad1a@google.com Link: https://lkml.kernel.org/r/20250527-hugetlb-fixes-splitrace-v1-2-f4136f5ec58a@google.com Fixes: 39dde65c9940 ("[PATCH] shared page table for hugetlb page") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm/hugetlb: unshare page tables during VMA split, not before	Jann Horn	4	-16/+56
	Currently, __split_vma() triggers hugetlb page table unsharing through vm_ops->may_split(). This happens before the VMA lock and rmap locks are taken - which is too early, it allows racing VMA-locked page faults in our process and racing rmap walks from other processes to cause page tables to be shared again before we actually perform the split. Fix it by explicitly calling into the hugetlb unshare logic from __split_vma() in the same place where THP splitting also happens. At that point, both the VMA and the rmap(s) are write-locked. An annoying detail is that we can now call into the helper hugetlb_unshare_pmds() from two different locking contexts: 1. from hugetlb_split(), holding: - mmap lock (exclusively) - VMA lock - file rmap lock (exclusively) 2. hugetlb_unshare_all_pmds(), which I think is designed to be able to call us with only the mmap lock held (in shared mode), but currently only runs while holding mmap lock (exclusively) and VMA lock Backporting note: This commit fixes a racy protection that was introduced in commit b30c14cd6102 ("hugetlb: unshare some PMDs when splitting VMAs"); that commit claimed to fix an issue introduced in 5.13, but it should actually also go all the way back. [jannh@google.com: v2] Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-1-1329349bad1a@google.com Link: https://lkml.kernel.org/r/20250528-hugetlb-fixes-splitrace-v2-0-1329349bad1a@google.com Link: https://lkml.kernel.org/r/20250527-hugetlb-fixes-splitrace-v1-1-f4136f5ec58a@google.com Fixes: 39dde65c9940 ("[PATCH] shared page table for hugetlb page") Signed-off-by: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [b30c14cd6102: hugetlb: unshare some PMDs when splitting VMAs] Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	MAINTAINERS: add Alistair as reviewer of mm memory policy	Alistair Popple	1	-0/+1
	I'm particularly familiar with mm/migrate.c and especially mm/migrate_device.c so add myself to MAINTAINERS. Link: https://lkml.kernel.org/r/20250530014917.2946940-1-apopple@nvidia.com Signed-off-by: Alistair Popple <apopple@nvidia.com> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	iov_iter: use iov_offset for length calculation in iov_iter_aligned_bvec	Nitesh Shetty	1	-1/+1
	If iov_offset is non-zero, then we need to consider iov_offset in length calculation, otherwise we might pass smaller IOs such as 512 bytes, in below scenario [1]. This issue is reproducible using lib-uring test/fixed-seg.c application with fixed buffer on a 512 LBA formatted device. [1] At present we pass the alignment check, for 512 LBA formatted devices, len_mask = 511 when IO is smaller, i->count = 512 has an offset, i->io_offset = 3584 with bvec values, bvec->bv_offset = 256, bvec->bv_len = 3840. In short, the first 256 bytes are in the current page, next 256 bytes are in the another page. Ideally we expect to fail the IO. I can think of 2 userspace scenarios where we experience this. a: From userspace, we observe a different behaviour when device LBA size is 512 vs 4096 bytes. For 4096 LBA formatted device, I see the same liburing test [2] failing, whereas 512 the test passes without this. This is reproducible everytime. [2] https://github.com/axboe/liburing/ b: Although I was not able to reproduce the below condition, but I suspect below case should be possible from user space for devices with 512 LBA formatted device. Lets say from userspace while allocating a virtually single chunk of memory, if we get 2 physical chunk of memory, and IO happens to be at the boundary of first physical chunk with length crossing first chunk, then we allow IOs to proceed and hence we might map wrong physical address length and proceed with IO rather than failing. : --- a/test/fixed-seg.c : +++ b/test/fixed-seg.c : @@ -64,7 +64,7 @@ static int test(struct io_uring ring, int fd, int : vec_off) : return T_EXIT_FAIL; : } : : - ret = read_it(ring, fd, 4096, vec_off); : + ret = read_it(ring, fd, 4096, 7512 + 256); : if (ret) { : fprintf(stderr, "4096 0 failed\n"); : return T_EXIT_FAIL; Effectively this is a write crossing the page boundary. Link: https://lkml.kernel.org/r/20250428095849.11709-1-nj.shetty@samsung.com Fixes: 2263639f96f2 ("iov_iter: streamline iovec/bvec alignment iteration") Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Nitesh Shetty <nj.shetty@samsung.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: Keith Busch <kbusch@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm/mempolicy: fix incorrect freeing of wi_kobj	Joshua Hahn	1	-3/+1
	We should not free wi_group->wi_kobj here. In the error path of add_weighted_interleave_group() where this snippet is called from, kobj_{del, put} is immediately called right after this section. Thus, it is not only unnecessary but also incorrect to free it here. Link: https://lkml.kernel.org/r/20250602162345.2595696-1-joshua.hahnjy@gmail.com Fixes: e341f9c3c841 ("mm/mempolicy: Weighted Interleave Auto-tuning") Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506011545.Fduxqxqj-lkp@intel.com/ Cc: Alistair Popple <apopple@nvidia.com> Cc: Byungchul Park <byungchul@sk.com> Cc: David Hildenbrand <david@redhat.com> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	alloc_tag: handle module codetag load errors as module load failures	Suren Baghdasaryan	4	-20/+39
	Failures inside codetag_load_module() are currently ignored. As a result an error there would not cause a module load failure and freeing of the associated resources. Correct this behavior by propagating the error code to the caller and handling possible errors. With this change, error to allocate percpu counters, which happens at this stage, will not be ignored and will cause a module load failure and freeing of resources. With this change we also do not need to disable memory allocation profiling when this error happens, instead we fail to load the module. Link: https://lkml.kernel.org/r/20250521160602.1940771-1-surenb@google.com Fixes: 10075262888b ("alloc_tag: allocate percpu counters for module tags dynamically") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: Casey Chen <cachen@purestorage.com> Closes: https://lore.kernel.org/all/20250520231620.15259-1-cachen@purestorage.com/ Cc: Daniel Gomez <da.gomez@samsung.com> Cc: David Wang <00107082@163.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Luis Chamberalin <mcgrof@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm/madvise: handle madvise_lock() failure during race unwinding	SeongJae Park	1	-1/+4
	When unwinding race on -ERESTARTNOINTR handling of process_madvise(), madvise_lock() failure is ignored. Check the failure and abort remaining works in the case. Link: https://lkml.kernel.org/r/20250602174926.1074-1-sj@kernel.org Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Barry Song <21cnbao@gmail.com> Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com Reviewed-by: Jann Horn <jannh@google.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Reviewed-by: Barry Song <baohua@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm: fix vmstat after removing NR_BOUNCE	Kirill A. Shutemov	1	-1/+0
	Hongyu noticed that the nr_unaccepted counter kept growing even in the absence of unaccepted memory on the machine. This happens due to a commit that removed NR_BOUNCE: it removed the counter from the enum zone_stat_item, but left it in the vmstat_text array. As a result, all counters below nr_bounce in /proc/vmstat are shifted by one line, causing the numa_hit counter to be labeled as nr_unaccepted. To fix this issue, remove nr_bounce from the vmstat_text array. Link: https://lkml.kernel.org/r/20250529103832.2937460-1-kirill.shutemov@linux.intel.com Fixes: 194df9f66db8 ("mm: remove NR_BOUNCE zone stat") Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Hongyu Ning <hongyu.ning@linux.intel.com> Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY	Lorenzo Stoakes	1	-4/+4
	The enum type prot_type declared in arch/s390/kvm/gaccess.c declares an unfortunate identifier within it - PROT_NONE. This clashes with the protection bit define from the uapi for mmap() declared in include/uapi/asm-generic/mman-common.h, which is indeed what those casually reading this code would assume this to refer to. This means that any changes which subsequently alter headers in any way which results in the uapi header being imported here will cause build errors. Resolve the issue by renaming PROT_NONE to PROT_TYPE_DUMMY. Link: https://lkml.kernel.org/r/20250519145657.178365-1-lorenzo.stoakes@oracle.com Fixes: b3cefd6bf16e ("KVM: s390: Pass initialized arg even if unused") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Suggested-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505140943.IgHDa9s7-lkp@intel.com/ Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Acked-by: Ignacio Moreno Gonzalez <Ignacio.MorenoGonzalez@kuka.com> Acked-by: Yang Shi <yang@os.amperecomputing.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: James Houghton <jthoughton@google.com> Cc: Janosch Frank <frankja@linux.ibm.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	selftests/mm: add test about uprobe pte be orphan during vma merge	Pu Lehui	1	-0/+43
	Add test about uprobe pte be orphan during vma merge. [akpm@linux-foundation.org: include sys/syscall.h, per Lorenzo] Link: https://lkml.kernel.org/r/20250529155650.4017699-5-pulehui@huaweicloud.com Signed-off-by: Pu Lehui <pulehui@huawei.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	selftests/mm: extract read_sysfs and write_sysfs into vm_util	Pu Lehui	4	-33/+45
	Extract read_sysfs and write_sysfs into vm_util. Meanwhile, rename the function in thuge-gen that has the same name as read_sysfs. Link: https://lkml.kernel.org/r/20250529155650.4017699-4-pulehui@huaweicloud.com Signed-off-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm: expose abnormal new_pte during move_ptes	Pu Lehui	1	-0/+2
	When executing move_ptes, the new_pte must be NULL, otherwise it will be overwritten by the old_pte, and cause the abnormal new_pte to be leaked. In order to make this problem to be more explicit, let's add WARN_ON_ONCE when new_pte is not NULL. [akpm@linux-foundation.org: s/WARN_ON_ONCE/VM_WARN_ON_ONCE/] Link: https://lkml.kernel.org/r/20250529155650.4017699-3-pulehui@huaweicloud.com Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm: fix uprobe pte be overwritten when expanding vma	Pu Lehui	2	-3/+24
	Patch series "Fix uprobe pte be overwritten when expanding vma". This patch (of 4): We encountered a BUG alert triggered by Syzkaller as follows: BUG: Bad rss-counter state mm:00000000b4a60fca type:MM_ANONPAGES val:1 And we can reproduce it with the following steps: 1. register uprobe on file at zero offset 2. mmap the file at zero offset: addr1 = mmap(NULL, 2 * 4096, PROT_NONE, MAP_PRIVATE, fd, 0); 3. mremap part of vma1 to new vma2: addr2 = mremap(addr1, 4096, 2 * 4096, MREMAP_MAYMOVE); 4. mremap back to orig addr1: mremap(addr2, 4096, 4096, MREMAP_MAYMOVE \| MREMAP_FIXED, addr1); In step 3, the vma1 range [addr1, addr1 + 4096] will be remap to new vma2 with range [addr2, addr2 + 8192], and remap uprobe anon page from the vma1 to vma2, then unmap the vma1 range [addr1, addr1 + 4096]. In step 4, the vma2 range [addr2, addr2 + 4096] will be remap back to the addr range [addr1, addr1 + 4096]. Since the addr range [addr1 + 4096, addr1 + 8192] still maps the file, it will take vma_merge_new_range to expand the range, and then do uprobe_mmap in vma_complete. Since the merged vma pgoff is also zero offset, it will install uprobe anon page to the merged vma. However, the upcomming move_page_tables step, which use set_pte_at to remap the vma2 uprobe pte to the merged vma, will overwrite the newly uprobe pte in the merged vma, and lead that pte to be orphan. Since the uprobe pte will be remapped to the merged vma, we can remove the unnecessary uprobe_mmap upon merged vma. This problem was first found in linux-6.6.y and also exists in the community syzkaller: https://lore.kernel.org/all/000000000000ada39605a5e71711@google.com/T/ Link: https://lkml.kernel.org/r/20250529155650.4017699-1-pulehui@huaweicloud.com Link: https://lkml.kernel.org/r/20250529155650.4017699-2-pulehui@huaweicloud.com Fixes: 2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints") Signed-off-by: Pu Lehui <pulehui@huawei.com> Suggested-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	mm/damon: s/primitives/code/ on comments	Enze Li	8	-8/+8
	The word 'primitive' is not explicit. To make the code more easily understood, this commit renames 'primitives' to 'code' in header comments of some source files. Link: https://lkml.kernel.org/r/20250530053115.153238-1-lienze@kylinos.cn Signed-off-by: Enze Li <lienze@kylinos.cn> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
4 days	drm/nouveau/vfn/r535: Convert comma to semicolon	Chen Ni	1	-1/+1
	Replace comma between expressions with semicolons. Using a ',' in place of a ';' can have unintended side effects. Although that is not the case here, it is seems best to use ';' unless ',' is intended. Found by inspection. No functional change intended. Compile tested only. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Fixes: cd3c62282b61 ("drm/nouveau/gsp: add usermode class id to gpu hal") Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250603061027.1310267-1-nichen@iscas.ac.cn
4 days	Merge tag 'amd-drm-fixes-6.16-2025-06-05' of ↵	Dave Airlie	17	-51/+205
	https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-6.16-2025-06-05: amdgpu: - IP discovery fix - Cleaner shader fix for GC 10.1.x - OD fix - UserQ fixes - Non-OLED panel fix - Misc display fixes - Brightness fixes amdkfd: - Enable CONFIG_HSA_AMD on RISCV Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250606015932.835829-1-alexander.deucher@amd.com
4 days	Merge tag 'drm-xe-next-fixes-2025-06-05' of ↵	Dave Airlie	21	-186/+491
	https://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver Changes: - A couple of vm init fixes (Matt Auld) - Hwmon fixes (Karthik) - Drop reduntant conversion to bool (Raag) - Fix CONFIG_INTEL_VSEC dependency (Arnd) - Rework eviction rejection of bound external bos (Thomas) - Stop re-submitting signalled jobs (Matt Auld) - A couple of pxp fixes (Daniele) - Add back a fix that got lost in a merge (Matt Auld) - Create LRC bo without VM (Niranjana) - Fix for the above fix (Maciej) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/aEHq44uIAZwfK-mG@fedora
4 days	Merge tag 'drm-misc-next-fixes-2025-06-05' of ↵	Dave Airlie	4	-17/+12
	https://gitlab.freedesktop.org/drm/misc/kernel into drm-next drm-misc-fixes for v6.16-rc1: - Fixes for nt37801 panel - Fix null deref in HDMI audio helper. - Fixes for analogix_dp. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://lore.kernel.org/r/14c2eff8-701d-4699-b187-08862715e1ac@linux.intel.com
4 days	Merge tag 'drm-intel-next-fixes-2025-06-05' of ↵	Dave Airlie	3	-14/+25
	https://gitlab.freedesktop.org/drm/i915/kernel into drm-next - Fix PSR register definitions for ALPM - Fix u32 overflow in SNPS PHY HDMI PLL setup - Fix GuC pending message underflow when submit fails - Fix GuC wakeref underflow race during reset Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://lore.kernel.org/r/aEFW1wGnt1kTVNGF@jlahtine-mobl
4 days	Merge patch series "riscv: add SBI FWFT misaligned exception delegation support"	Palmer Dabbelt	5	-18/+257
	Clément Léger <cleger@rivosinc.com> says: The SBI Firmware Feature extension allows the S-mode to request some specific features (either hardware or software) to be enabled. This series uses this extension to request misaligned access exception delegation to S-mode in order to let the kernel handle it. It also adds support for the KVM FWFT SBI extension based on the misaligned access handling infrastructure. FWFT SBI extension is part of the SBI V3.0 specifications [1]. It can be tested using the qemu provided at [2] which contains the series from [3]. Upstream kvm-unit-tests can be used inside kvm to tests the correct delegation of misaligned exceptions. Upstream OpenSBI can be used. The tests can be run using the kselftest from series [4]. $ qemu-system-riscv64 \ -cpu rv64,trap-misaligned-access=true,v=true \ -M virt \ -m 1024M \ -bios fw_dynamic.bin \ -kernel Image ... # ./misaligned TAP version 13 1..23 # Starting 23 tests from 1 test cases. # RUN global.gp_load_lh ... # OK global.gp_load_lh ok 1 global.gp_load_lh # RUN global.gp_load_lhu ... # OK global.gp_load_lhu ok 2 global.gp_load_lhu # RUN global.gp_load_lw ... # OK global.gp_load_lw ok 3 global.gp_load_lw # RUN global.gp_load_lwu ... # OK global.gp_load_lwu ok 4 global.gp_load_lwu # RUN global.gp_load_ld ... # OK global.gp_load_ld ok 5 global.gp_load_ld # RUN global.gp_load_c_lw ... # OK global.gp_load_c_lw ok 6 global.gp_load_c_lw # RUN global.gp_load_c_ld ... # OK global.gp_load_c_ld ok 7 global.gp_load_c_ld # RUN global.gp_load_c_ldsp ... # OK global.gp_load_c_ldsp ok 8 global.gp_load_c_ldsp # RUN global.gp_load_sh ... # OK global.gp_load_sh ok 9 global.gp_load_sh # RUN global.gp_load_sw ... # OK global.gp_load_sw ok 10 global.gp_load_sw # RUN global.gp_load_sd ... # OK global.gp_load_sd ok 11 global.gp_load_sd # RUN global.gp_load_c_sw ... # OK global.gp_load_c_sw ok 12 global.gp_load_c_sw # RUN global.gp_load_c_sd ... # OK global.gp_load_c_sd ok 13 global.gp_load_c_sd # RUN global.gp_load_c_sdsp ... # OK global.gp_load_c_sdsp ok 14 global.gp_load_c_sdsp # RUN global.fpu_load_flw ... # OK global.fpu_load_flw ok 15 global.fpu_load_flw # RUN global.fpu_load_fld ... # OK global.fpu_load_fld ok 16 global.fpu_load_fld # RUN global.fpu_load_c_fld ... # OK global.fpu_load_c_fld ok 17 global.fpu_load_c_fld # RUN global.fpu_load_c_fldsp ... # OK global.fpu_load_c_fldsp ok 18 global.fpu_load_c_fldsp # RUN global.fpu_store_fsw ... # OK global.fpu_store_fsw ok 19 global.fpu_store_fsw # RUN global.fpu_store_fsd ... # OK global.fpu_store_fsd ok 20 global.fpu_store_fsd # RUN global.fpu_store_c_fsd ... # OK global.fpu_store_c_fsd ok 21 global.fpu_store_c_fsd # RUN global.fpu_store_c_fsdsp ... # OK global.fpu_store_c_fsdsp ok 22 global.fpu_store_c_fsdsp # RUN global.gen_sigbus ... [12797.988647] misaligned[618]: unhandled signal 7 code 0x1 at 0x0000000000014dc0 in misaligned[4dc0,10000+76000] [12797.988990] CPU: 0 UID: 0 PID: 618 Comm: misaligned Not tainted 6.13.0-rc6-00008-g4ec4468967c9-dirty #51 [12797.989169] Hardware name: riscv-virtio,qemu (DT) [12797.989264] epc : 0000000000014dc0 ra : 0000000000014d00 sp : 00007fffe165d100 [12797.989407] gp : 000000000008f6e8 tp : 0000000000095760 t0 : 0000000000000008 [12797.989544] t1 : 00000000000965d8 t2 : 000000000008e830 s0 : 00007fffe165d160 [12797.989692] s1 : 000000000000001a a0 : 0000000000000000 a1 : 0000000000000002 [12797.989831] a2 : 0000000000000000 a3 : 0000000000000000 a4 : ffffffffdeadbeef [12797.989964] a5 : 000000000008ef61 a6 : 626769735f6e0000 a7 : fffffffffffff000 [12797.990094] s2 : 0000000000000001 s3 : 00007fffe165d838 s4 : 00007fffe165d848 [12797.990238] s5 : 000000000000001a s6 : 0000000000010442 s7 : 0000000000010200 [12797.990391] s8 : 000000000000003a s9 : 0000000000094508 s10: 0000000000000000 [12797.990526] s11: 0000555567460668 t3 : 00007fffe165d070 t4 : 00000000000965d0 [12797.990656] t5 : fefefefefefefeff t6 : 0000000000000073 [12797.990756] status: 0000000200004020 badaddr: 000000000008ef61 cause: 0000000000000006 [12797.990911] Code: 8793 8791 3423 fcf4 3783 fc84 c737 dead 0713 eef7 (c398) 0001 # OK global.gen_sigbus ok 23 global.gen_sigbus # PASSED: 23 / 23 tests passed. # Totals: pass:23 fail:0 xfail:0 xpass:0 skip:0 error:0 With kvm-tools: # lkvm run -k sbi.flat -m 128 Info: # lkvm run -k sbi.flat -m 128 -c 1 --name guest-97 Info: Removed ghost socket file "/root/.lkvm//guest-97.sock". ########################################################################## # kvm-unit-tests ########################################################################## ... [test messages elided] PASS: sbi: fwft: FWFT extension probing no error PASS: sbi: fwft: get/set reserved feature 0x6 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x3fffffff error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0x80000000 error == SBI_ERR_DENIED PASS: sbi: fwft: get/set reserved feature 0xbfffffff error == SBI_ERR_DENIED PASS: sbi: fwft: misaligned_deleg: Get misaligned deleg feature no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature invalid value error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 0 PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value no error PASS: sbi: fwft: misaligned_deleg: Set misaligned deleg feature value 1 PASS: sbi: fwft: misaligned_deleg: Verify misaligned load exception trap in supervisor SUMMARY: 50 tests, 2 unexpected failures, 12 skipped This series is available at [5]. [Palmer: slighyt commit text modification, as SBI-3.0 is merged now. Also drop the KVM patches, as they're too late.] * b4-shazam-merge: riscv: misaligned: add a function to check misalign trap delegability riscv: misaligned: move emulated access uniformity check in a function riscv: misaligned: declare misaligned_access_speed under CONFIG_RISCV_MISALIGNED riscv: misaligned: use on_each_cpu() for scalar misaligned access probing riscv: misaligned: request misaligned exception from SBI riscv: sbi: add SBI FWFT extension calls riscv: sbi: add FWFT extension interface riscv: sbi: add new SBI error mappings riscv: sbi: remove useless parenthesis riscv: sbi: add Firmware Feature (FWFT) SBI extensions definitions Link: https://lore.kernel.org/r/20250523101932.1594077-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	Merge patch series "riscv: misaligned: fix misaligned accesses handling in ↵	Palmer Dabbelt	6	-50/+78
	put/get_user()" Clément Léger <cleger@rivosinc.com> says: While debugging a few problems with the misaligned access kselftest, Alexandre discovered some crash with the current code. Indeed, some misaligned access was done by the kernel using put_user(). This was resulting in trap and a kernel crash since. The path was the following: user -> kernel -> access to user memory -> misaligned trap -> trap -> kernel -> misaligned handling -> memcpy -> crash due to failed page fault while in interrupt disabled section. Last discussion about kernel misaligned handling and interrupt reenabling were actually not to reenable interrupt when handling misaligned access being done by kernel. The best solution being not to do any misaligned accesses to userspace memory, we considered a few options: - Remove any call to put/get_user() potentially doing misaligned accesses - Do not do any misaligned accesses in put/get_user() itself The second solution was the one chosen as there are too many callsites to put/get_user() that could potentially do misaligned accesses. We tried two approaches for that, either split access in two aligned accesses (and do RMW for put_user()) or call copy_from/to_user() which does not do any misaligned accesses. The later one was the simpler to implement (although the performances are probably lower than split aligned accesses but still way better than doing misaligned access emulation) and allows to support what we wanted. These commits are based on top of Alex dev/alex/get_user_misaligned_v1 branch. [Palmer: No idea what that branch is, so I'm basing it on the uaccess optimizations patch series which is the last thing to touch these.] * b4-shazam-merge riscv: uaccess: do not do misaligned accesses in get/put_user() riscv: process: use unsigned int instead of unsigned long for put_user() riscv: make unsafe user copy routines use existing assembly routines Link: https://lore.kernel.org/r/20250602193918.868962-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	riscv: uaccess: Only restore the CSR_STATUS SUM bit	Cyril Bur	3	-8/+9
	During switch to csrs will OR the value of the register into the corresponding csr. In this case we're only interested in restoring the SUM bit not the entire register. Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com> Link: https://lore.kernel.org/r/20250522160954.429333-1-cyrilbur@tenstorrent.com Co-developed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: 788aa64c01f1 ("riscv: save the SR_SUM status over switches") Link: https://lore.kernel.org/r/20250602121543.1544278-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	Merge tag 'riscv-mw2-6.16-rc1' of ↵	Palmer Dabbelt	831	-5027/+11908
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next riscv patches for 6.16-rc1, part 2 * Performance improvements - Add support for vdso getrandom - Implement raid6 calculations using vectors - Introduce svinval tlb invalidation * Cleanup - A bunch of deduplication of the macros we use for manipulating instructions * Misc - Introduce a kunit test for kprobes - Add support for mseal as riscv fits the requirements (thanks to Lorenzo for making sure of that :)) [Palmer: There was a rebase between part 1 and part 2, so I've had to do some more git surgery here... at least two rounds of surgery...] * alex-pr-2: (866 commits) RISC-V: vDSO: Wire up getrandom() vDSO implementation riscv: enable mseal sysmap for RV64 raid6: Add RISC-V SIMD syndrome and recovery calculations riscv: mm: Add support for Svinval extension riscv: Add kprobes KUnit test riscv: kprobes: Remove duplication of RV_EXTRACT_ITYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_UTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RD_REG riscv: kprobes: Remove duplication of RVC_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RVC_EXTRACT_C2_RS1_REG riscv: kproves: Remove duplication of RVC_EXTRACT_JTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_BTYPE_IMM riscv: kprobes: Remove duplication of RV_EXTRACT_RS1_REG riscv: kprobes: Remove duplication of RV_EXTRACT_JTYPE_IMM riscv: kprobes: Move branch_funct3 to insn.h riscv: kprobes: Move branch_rs2_idx to insn.h Linux 6.15-rc6 Input: xpad - fix xpad_device sorting Input: xpad - add support for several more controllers Input: xpad - fix Share button on Xbox One controllers ...
4 days	RISC-V: vDSO: Wire up getrandom() vDSO implementation	Xi Ruoyao	7	-0/+308
	Hook up the generic vDSO implementation to the generic vDSO getrandom implementation by providing the required __arch_chacha20_blocks_nostack and getrandom_syscall implementations. Also wire up the selftests. The benchmark result: vdso: 25000000 times in 2.466341333 seconds libc: 25000000 times in 41.447720005 seconds syscall: 25000000 times in 41.043926672 seconds vdso: 25000000 x 256 times in 162.286219353 seconds libc: 25000000 x 256 times in 2953.855018685 seconds syscall: 25000000 x 256 times in 2796.268546000 seconds [ alex: - Fix dynamic relocation - Squash Nathan's fix https://lore.kernel.org/all/20250423-riscv-fix-compat_vdso-lld-v2-1-b7bbbc244501@kernel.org/ - Add comment from Loongarch ] Signed-off-by: Xi Ruoyao <xry111@xry111.site> Link: https://lore.kernel.org/r/20250411024600.16045-1-xry111@xry111.site Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	riscv: enable mseal sysmap for RV64	Jisheng Zhang	2	-1/+2
	Provide support for CONFIG_MSEAL_SYSTEM_MAPPINGS for RV64, covering the vdso, vvar. Passed sysmap_is_sealed and mseal_test self tests. Passed booting a buildroot rootfs image and a cli debian rootfs image. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Cc: Jeff Xu <jeffxu@chromium.org> Link: https://lore.kernel.org/r/20250426135954.5614-1-jszhang@kernel.org Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	raid6: Add RISC-V SIMD syndrome and recovery calculations	Chunyan Zhang	6	-0/+1495
	The assembly is originally based on the ARM NEON and int.uc, but uses RISC-V vector instructions to implement the RAID6 syndrome and recovery calculations. The functions are tested on QEMU running with the option "-icount shift=0": raid6: rvvx1 gen() 1008 MB/s raid6: rvvx2 gen() 1395 MB/s raid6: rvvx4 gen() 1584 MB/s raid6: rvvx8 gen() 1694 MB/s raid6: int64x8 gen() 113 MB/s raid6: int64x4 gen() 116 MB/s raid6: int64x2 gen() 272 MB/s raid6: int64x1 gen() 229 MB/s raid6: using algorithm rvvx8 gen() 1694 MB/s raid6: .... xor() 1000 MB/s, rmw enabled raid6: using rvv recovery algorithm [Charlie: - Fixup vector options] Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20250305083707.74218-1-zhangchunyan@iscas.ac.cn Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	riscv: mm: Add support for Svinval extension	Mayuresh Chitale	1	-0/+31
	The Svinval extension splits SFENCE.VMA instruction into finer-grained invalidation and ordering operations and is mandatory for RVA23S64 profile. When Svinval is enabled the local_flush_tlb_range_threshold_asid function should use the following sequence to optimize the tlb flushes instead of a simple sfence.vma: sfence.w.inval svinval.vma . . svinval.vma sfence.inval.ir The maximum number of consecutive svinval.vma instructions that can be executed in local_flush_tlb_range_threshold_asid function is limited to 64. This is required to avoid soft lockups and the approach is similar to that used in arm64. Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20240702102637.9074-1-mchitale@ventanamicro.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	efi/libstub: use 'targets' instead of extra-y in Makefile	Masahiro Yamada	1	-1/+1
	These objects are built as prerequisites of %.stub.o files. There is no need to use extra-y, which is planned for deprecation. Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	module: make __mod_device_table__* symbols static	Masahiro Yamada	1	-2/+2
	The __mod_device_table__* symbols are only parsed by modpost to generate MODULE_ALIAS() entries from MODULE_DEVICE_TABLE(). Therefore, these symbols do not need to be globally visible, or globally unique. If they are in the global scope, we would worry about the symbol uniqueness, but modpost is fine with parsing multiple symbols with the same name. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com>
4 days	scripts/misc-check: check unnecessary #include <linux/export.h> when W=1	Masahiro Yamada	1	-0/+12
	Another issue with <linux/export.h> is that it is sometimes included even when EXPORT_SYMBOL() is not used at all. Some headers (e.g. include/linux/linkage.h>) cannot be fixed for now for the reason described in the previous commit. This commit adds a warning for *.c files that include <linux/export.h> but do not use EXPORT_SYMBOL() when the kernel is built with W=1. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	scripts/misc-check: check missing #include <linux/export.h> when W=1	Masahiro Yamada	1	-0/+43
	The problem was described in commit 5b20755b7780 ("init: move THIS_MODULE from <linux/export.h> to <linux/init.h>"). To summarize it again here: <linux/export.h> is included by most C files, even though only some of them actually export symbols. This is because some headers, such as include/linux/{module.h,linkage}, needlessly include <linux/export.h>. I have added a more detailed explanation in the comments of scripts/misc-check. This problem will be fixed in two steps: 1. Add #include <linux/export.h> directly to C files that use EXPORT_SYMBOL() 2. Remove #include <linux/export.h> from header files that do not use EXPORT_SYMBOL() This commit addresses step 1; scripts/misc-check will warn about *.[ch] files that use EXPORT_SYMBOL() but do not include <linux/export.h>. This check is only triggered when the kernel is built with W=1. We need to fix 4000+ files. I hope others will help with this effort. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	scripts/misc-check: add double-quotes to satisfy shellcheck	Masahiro Yamada	1	-1/+1
	In scripts/misc-check line 8: git -C ${srctree:-.} ls-files -i -c --exclude-per-directory=.gitignore 2>/dev/null \| ^-----------^ SC2086 (info): Double quote to prevent globbing and word splitting. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	kbuild: move W=1 check for scripts/misc-check to top-level Makefile	Masahiro Yamada	2	-8/+4
	This script is executed only when ${KBUILD_EXTRA_WARN} contains 1. Move this check to the top-level Makefile to allow more checks to be easily added to this script. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
4 days	scripts/tags.sh: allow to use alternative ctags implementation	Masatake YAMATO	1	-1/+1
	Some ctags implementations are available. With this change, You can specify your favorite one with CTAGS environment variable. Signed-off-by: Masatake YAMATO <yamato@redhat.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	kconfig: introduce menu type enum	Masahiro Yamada	4	-9/+21
	Currently, menu->prompt->type is checked to distinguish "comment" (P_COMMENT) and "menu" (P_MENU) entries from regular "config" entries. This is odd because P_COMMENT and P_MENU are not properties. This commit introduces menu type enum to distinguish menu types more naturally. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	docs: symbol-namespaces: fix reST warning with literal block	Khaled Elnaggar	1	-1/+1
	Use a literal block for the EXPORT_SYMBOL_GPL_FOR_MODULES() example to avoid a Docutils warning about unmatched '*'. This ensures correct rendering and keeps the source readable. Warning: Documentation/core-api/symbol-namespaces.rst:90: WARNING: Inline emphasis start-string without end-string. [docutils] Signed-off-by: Khaled Elnaggar <khaledelnaggarlinux@gmail.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	kbuild: link lib-y objects to vmlinux forcibly even when CONFIG_MODULES=n	Masahiro Yamada	1	-6/+1
	Since commit 7273ad2b08f8 ("kbuild: link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y"), all objects from lib-y have been forcibly linked to vmlinux when CONFIG_MODULES=y. To simplify future changes, this commit makes all objects from lib-y be linked regardless of the CONFIG_MODULES setting. Most use cases (CONFIG_MODULES=y) are not affected by this change. The vmlinux size with ARCH=arm allnoconfig, where CONFIG_MODULES=n, increases as follows: text data bss dec hex filename 1368644 835104 206288 2410036 24c634 vmlinux.before 1379440 837064 206288 2422792 24f808 vmlinux.after We no longer benefit from using static libraries, but the impact is mitigated by supporting CONFIG_LD_DEAD_CODE_DATA_ELIMINATION. For example, the size of vmlinux remains almost the same with ARCH=arm tinyconfig, where CONFIG_MODULES=n and CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y. text data bss dec hex filename 455316 93404 15472 564192 89be0 vmlinux.before 455312 93404 15472 564188 89bdc vmlinux.after Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	tinyconfig: enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION	Masahiro Yamada	1	-0/+1
	This CONFIG option, if supported by the architecture, helps reduce the size of vmlinux. For example, the size of vmlinux with ARCH=arm tinyconfig decreases as follows: text data bss dec hex filename 631684 104500 18176 754360 b82b8 vmlinux.before 455316 93404 15472 564192 89be0 vmlinux.after Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
4 days	Merge tag 'pm-6.16-rc1-3' of ↵	Linus Torvalds	1	-4/+12
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "Fix three issues introduced into device suspend/resume error paths in the PM core by some of the recent updates. First off, replace list_splice() with list_splice_init() in three places in device suspend error paths to avoid attempting to use an uninitialized list head going forward. Second, rearrange device_resume() to avoid leaking the power.is_suspended device PM flag to the next system suspend/resume cycle where it can confuse rolling back after an error or early wakeup. Finally, add synchronization to dpm_async_resume_children() to avoid resetting the async state mistakenly for devices whose resume callbacks have already been queued up for asynchronous execution in the given device resume phase, which fortunately can happen only if the preceding system suspend transition has been aborted" * tag 'pm-6.16-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: sleep: Add locking to dpm_async_resume_children() PM: sleep: Fix power.is_suspended cleanup for direct-complete devices PM: sleep: Fix list splicing in device suspend error paths
4 days	Merge tag 'net-6.16-rc1' of ↵	Linus Torvalds	89	-618/+1060
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN, wireless, Bluetooth, and Netfilter. Current release - regressions: - Revert "kunit: configs: Enable CONFIG_INIT_STACK_ALL_PATTERN in all_tests", makes kunit error out if compiler is old - wifi: iwlwifi: mvm: fix assert on suspend - rxrpc: fix return from none_validate_challenge() Current release - new code bugs: - ovpn: couple of fixes for socket cleanup and UDP-tunnel teardown - can: kvaser_pciefd: refine error prone echo_skb_max handling logic - fix net_devmem_bind_dmabuf() stub when DEVMEM not compiled - eth: airoha: fixes for config / accel in bridge mode Previous releases - regressions: - Bluetooth: hci_qca: move the SoC type check to the right place, fix GPIO integration - prevent a NULL deref in rtnl_create_link() after locking changes - fix udp gso skb_segment after pull from frag_list - hv_netvsc: fix potential deadlock in netvsc_vf_setxdp() Previous releases - always broken: - netfilter: - nf_nat: also check reverse tuple to obtain clashing entry - nf_set_pipapo_avx2: fix initial map fill (zeroing) - fix the helper for incremental update of packet checksums after modifying the IP address, used by ILA and BPF - eth: - stmmac: prevent div by 0 when clock rate is misconfigured - ice: fix Tx scheduler handling of XDP and changing queue count - eth: fix support for the RGMII interface when delays configured" * tag 'net-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits) calipso: unlock rcu before returning -EAFNOSUPPORT seg6: Fix validation of nexthop addresses net: prevent a NULL deref in rtnl_create_link() net: annotate data-races around cleanup_net_task selftests: drv-net: tso: make bkg() wait for socat to quit selftests: drv-net: tso: fix the GRE device name selftests: drv-net: add configs for the TSO test wireguard: device: enable threaded NAPI netlink: specs: rt-link: decode ip6gre netlink: specs: rt-link: add missing byte-order properties net: wwan: mhi_wwan_mbim: use correct mux_id for multiplexing wifi: cfg80211/mac80211: correctly parse S1G beacon optional elements net: dsa: b53: do not touch DLL_IQQD on bcm53115 net: dsa: b53: allow RGMII for bcm63xx RGMII ports net: dsa: b53: do not configure bcm63xx's IMP port interface net: dsa: b53: do not enable RGMII delay on bcm63xx net: dsa: b53: do not enable EEE on bcm63xx net: ti: icssg-prueth: Fix swapped TX stats for MII interfaces. selftests: netfilter: nft_nat.sh: add test for reverse clash with nat netfilter: nf_nat: also check reverse tuple to obtain clashing entry ...
4 days	RISC-V: Documentation: Add enough title underlines to CMODX	Palmer Dabbelt	1	-2/+2
	This reports as a warning in linux-next along the lines of Documentation/arch/riscv/cmodx.rst:14: WARNING: Title underline too short. CMODX in the Kernel Space --------------------- [docutils] Documentation/arch/riscv/cmodx.rst:43: WARNING: Title underline too short. CMODX in the User Space --------------------- [docutils] Documentation/arch/riscv/cmodx.rst:43: WARNING: Title underline too short. CMODX in the User Space --------------------- [docutils] Link: https://lore.kernel.org/all/20250603154544.1602a8b5@canb.auug.org.au/ Fixes: 0e07200b2af6 ("riscv: Documentation: add a description about dynamic ftrace") Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20250603172856.49925-1-palmer@dabbelt.com Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	Merge tag 'riscv-mw1-6.16-rc1' of ↵	Palmer Dabbelt	903	-6229/+11793
	ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/alexghiti/linux into for-next riscv patches for 6.16-rc1 * Implement atomic patching support for ftrace which finally allows to get rid of stop_machine(). * Support for kexec_file_load() syscall * Improve module loading time by changing the algorithm that counts the number of plt/got entries in a module. * Zicbop is now used in the kernel to prefetch instructions [Palmer: There's been two rounds of surgery on this one, so as a result it's a bit different than the PR.] * alex-pr: (734 commits) riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE MAINTAINERS: Update Atish's email address riscv: hwprobe: export Zabha extension riscv: Make regs_irqs_disabled() more clear perf symbols: Ignore mapping symbols on riscv RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND riscv: module: Optimize PLT/GOT entry counting riscv: Add support for PUD THP riscv: xchg: Prefetch the destination word for sc.w riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop riscv: Add support for Zicbop riscv: Introduce Zicbop instructions riscv/kexec_file: Fix comment in purgatory relocator riscv: kexec_file: Support loading Image binary file riscv: kexec_file: Split the loading of kernel and others riscv: Documentation: add a description about dynamic ftrace riscv: ftrace: support direct call using call_ops riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS riscv: ftrace: support PREEMPT riscv: add a data fence for CMODX in the kernel mode ... Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE	Miquel Sabaté Solà	1	-4/+4
	Fix a couple of spelling issues plus some minor details on the grammar. Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com> Link: https://lore.kernel.org/r/20250501130309.14803-1-mikisabate@gmail.com Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	MAINTAINERS: Update Atish's email address	Atish Patra	2	-3/+4
	My personal upstream email account was previously based on gmail which has become difficult to manage upstream activities lately. Update it to the more reliable linux.dev account. Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250505-update_email_address-v1-1-1c24db506fdb@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	Merge patch series "riscv: Add Zicbop & prefetchw support"	Alexandre Ghiti	9	-10/+142
	Alexandre Ghiti <alexghiti@rivosinc.com> says: I found this lost series developed by Guo so here is a respin with the comments on v2 applied. This patch series adds Zicbop support and then enables the Linux prefetch features. * patches from https://lore.kernel.org/r/20250421142441.395849-1-alexghiti@rivosinc.com: riscv: xchg: Prefetch the destination word for sc.w riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop riscv: Add support for Zicbop riscv: Introduce Zicbop instructions Link: https://lore.kernel.org/r/20250421142441.395849-1-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
4 days	MAINTAINERS: add entry for crypto library	Eric Biggers	1	-1/+10
	I am volunteering to maintain the kernel's crypto library code. [ And Jason and Ard piped up too - Linus ] Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 days	Merge tag 'uml-for-linux-6.16-rc1' of ↵	Linus Torvalds	80	-4357/+2679
	git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: "The only really new thing is the long-standing seccomp work (originally from 2021!). Wven if it still isn't enabled by default due to security concerns it can still be used e.g. for tests. - remove obsolete network transports - remove PCI IO port support - start adding seccomp-based process handling instead of ptrace" * tag 'uml-for-linux-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (29 commits) um: remove "extern" from implementation of sigchld_handler um: fix unused variable warning um: fix SECCOMP 32bit xstate register restore um: pass FD for memory operations when needed um: Add SECCOMP support detection and initialization um: Implement kernel side of SECCOMP based process handling um: Track userspace children dying in SECCOMP mode um: Add helper functions to get/set state for SECCOMP um: Add stub side of SECCOMP/futex based process handling um: Move faultinfo extraction into userspace routine um: vector: Use mac_pton() for MAC address parsing um: vector: Clean up and modernize log messages um: chan_kern: use raw spinlock for irqs_to_free_lock MAINTAINERS: remove obsolete file entry in TUN/TAP DRIVER um: Fix tgkill compile error on old host OSes um: stop using PCI port I/O um: Remove legacy network transport infrastructure um: vector: Eliminate the dependency on uml_net um: Remove obsolete legacy network transports um/asm: Replace "REP; NOP" with PAUSE mnemonic ...
5 days	riscv: uaccess: do not do misaligned accesses in get/put_user()	Clément Léger	1	-5/+18
	Doing misaligned access to userspace memory would make a trap on platform where it is emulated. Latest fixes removed the kernel capability to do unaligned accesses to userspace memory safely since interrupts are kept disabled at all time during that. Thus doing so would crash the kernel. Such behavior was detected with GET_UNALIGN_CTL() that was doing a put_user() with an unsigned long* address that should have been an unsigned int*. Reenabling kernel misaligned access emulation is a bit risky and it would also degrade performances. Rather than doing that, we will try to avoid any misaligned accessed by using copy_from/to_user() which does not do any misaligned accesses. This can be done only for !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS and thus allows to only generate a bit more code for this config. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250602193918.868962-4-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	Merge tag 'arm64-fixes' of ↵	Linus Torvalds	8	-23/+49
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "We've got a couple of build fixes when using LLD, a missing TLB invalidation and a workaround for broken firmware on SoCs with CPUs that implement MPAM: - Disable problematic linker assertions for broken versions of LLD - Work around sporadic link failure with LLD and various randconfig builds - Fix missing invalidation in the TLB batching code when reclaim races with mprotect() and friends - Add a command-line override for MPAM to allow booting on systems with broken firmware" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Add override for MPAM arm64/mm: Close theoretical race where stale TLB entry remains valid arm64: Work around convergence issue with LLD linker arm64: Disable LLD linker ASSERT()s for the time being
5 days	riscv: process: use unsigned int instead of unsigned long for put_user()	Clément Léger	1	-1/+1
	The specification of prctl() for GET_UNALIGN_CTL states that the value is returned in an unsigned int * address passed as an unsigned long. Change the type to match that and avoid an unaligned access as well. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250602193918.868962-3-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: make unsafe user copy routines use existing assembly routines	Alexandre Ghiti	5	-48/+63
	The current implementation is underperforming and in addition, it triggers misaligned access traps on platforms which do not handle misaligned accesses in hardware. Use the existing assembly routines to solve both problems at once. Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250602193918.868962-2-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux	Linus Torvalds	3	-4/+4
	Pull ARM fixes from Russell King: - Fix arch_memremap_can_ram_remap() which incorrectly passed a PFN to memblock_is_map_memory rather than the actual address. - Disallow kernel mode NEON when IRQs are disabled Explanation: "To avoid having to preserve/restore kernel mode NEON state when such a softirq is taken softirqs are now disabled when using the NEON from task context." should explain that it's nested kernel mode. In other words, softirqs from user mode are fine, because the context will be preserved. softirqs from kernel mode may be from a context that has already saved the user NEON state, and thus we would need to preserve the NEON state for the parent kernel mode context, and this we don't allow. The problem occurs when the kernel context disables hard IRQs, and then uses NEON. When it's finished, and restores the userspace NEON state, we call local_bh_enable() with hard IRQs disabled, which causes a warning. This commit addresses that by disallowing the use of NEON with hard IRQs disabled. https://lore.kernel.org/all/20250516231858.27899-4-ebiggers@kernel.org/T/#m104841b6e9346b1814c8b0fb9f2340551b0cd3e8 has some further context * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9446/1: Disallow kernel mode NEON when IRQs are disabled ARM: 9447/1: arm/memremap: fix arch_memremap_can_ram_remap()
5 days	riscv: hwprobe: export Zabha extension	Alexandre Ghiti	3	-0/+6
	Export Zabha through the hwprobe syscall. Reviewed-by: Clément Léger <cleger@rivosinc.com> Link: https://lore.kernel.org/r/20250421141413.394444-1-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: Make regs_irqs_disabled() more clear	Tiezhu Yang	1	-1/+1
	The return value of regs_irqs_disabled() is true or false, so change its type to reflect that and also make it always inline. Suggested-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422113156.25742-1-yangtiezhu@loongson.cn Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	perf symbols: Ignore mapping symbols on riscv	Haibo Xu	1	-0/+6
	RISCV ELF use mapping symbols with special names $x, $d to identify regions of RISCV code or code with different ISAs[1]. These symbols don't identify functions, so will confuse the perf output. The patch filters out these symbols at load time, similar to "4886f2ca perf symbols: Ignore mapping symbols on aarch64". [1] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/ master/riscv-elf.adoc#mapping-symbol Signed-off-by: Haibo Xu <haibo1.xu@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250409025202.201046-1-haibo1.xu@intel.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	Merge patch series "riscv: kexec_file: Support loading Image binary file"	Palmer Dabbelt	7	-486/+610
	Björn Töpel <bjorn@kernel.org> says: From: Björn Töpel <bjorn@rivosinc.com> Hi! For over a year ago, Daniel and I was testing the V2 of Song's series. I also promised to take the V2, that had been sitting on the lists for too long, to rebase it on a new kernel, and re-test it. One year later, here's the V3! ;-) There are no changes from V2 other, than some simple checkpatch cleanups. Song's original cover: \| This series makes the kexec_file_load() syscall support to load \| Image binary file. At the same time, corresponding support for \| kexec-tools had been pushed to my repo[2]. \| \| Now, we can leverage that kexec-tools and this series to use the \| kexec_load() or kexec_file_load() syscall to boot both vmlinux and \| Image file, as seen in these combo tests: \| \| ``` \| 1. kexec -l vmlinux \| 2. kexec -l Image \| 3. kexec -s -l vmlinux \| 4. kexec -s -l Image \| ``` Notably, kexec-tools has still not made it upstream. I've prepared a branch on my GH [3], that I indend to post ASAP. That branch is a collection of fixes/features, including Song's userland Image loading. The V2 is here [2], and V1 [1]. I've tested the kexec-file/Image on qemu-rv64, with following combinations: * ACPI/UEFI * DT/UEFI * DT both "regular" kexec (-s + -e), and crashkernels (-p). Note that there are two purgatory patches that has to be present (part of -rc1, so all good): commit 28093cfef5dd ("riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator") commit 3f7023171df4 ("riscv/purgatory: 4B align purgatory_start") [1] https://lore.kernel.org/linux-riscv/20230914020044.1397356-1-songshuaishuai@tinylab.org/ [2] https://lore.kernel.org/linux-riscv/20231016092006.3347632-1-songshuaishuai@tinylab.org/ [3] https://github.com/bjoto/kexec-tools/tree/rv-on-master * patches from https://lore.kernel.org/r/20250409193004.643839-1-bjorn@kernel.org: riscv: kexec_file: Support loading Image binary file riscv: kexec_file: Split the loading of kernel and others Link: https://lore.kernel.org/r/20250409193004.643839-1-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND	谢致邦 (XIE Zhibang)	1	-2/+2
	It is the built-in command line appended to the bootloader command line, not the bootloader command line appended to the built-in command line. Fixes: 3aed8c43267e ("RISC-V: Update Kconfig to better handle CMDLINE") Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com> Link: https://lore.kernel.org/r/tencent_A93C7FB46BFD20054AD2FEF4645913FF550A@qq.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: module: Optimize PLT/GOT entry counting	Samuel Holland	1	-16/+65
	perf reports that 99.63% of the cycles from `modprobe amdgpu` are spent inside module_frob_arch_sections(). This is because amdgpu.ko contains about 300000 relocations in its .rela.text section, and the algorithm in count_max_entries() takes quadratic time. Apply two optimizations from the arm64 code, which together reduce the total execution time by 99.58%. First, sort the relocations so duplicate entries are adjacent. Second, reduce the number of relocations that must be sorted by filtering to only relocations that need PLT/GOT entries, as done in commit d4e0340919fb ("arm64/module: Optimize module load time by optimizing PLT counting"). Unlike the arm64 code, here the filtering and sorting is done in a scratch buffer, because the HI20 relocation search optimization in apply_relocate_add() depends on the original order of the relocations. This allows accumulating PLT/GOT relocations across sections so sorting and counting is only done once per module. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250409171526.862481-3-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	Merge patch series "riscv: ftrace: atmoic patching and preempt improvements"	Alexandre Ghiti	12	-207/+331
	Andy Chiu <andybnac@gmail.com> says: This series makes atomic code patching in ftrace possible and eliminates the need of the stop_machine dance. The major difference of this version is that we merge the CALL_OPS support from Puranjay [1] and make direct calls available for practical uses such as BPF. Thanks for the time reviewing the series and suggestions, we hope this version gets a step closer to happening in the upstream. Please reference the link to v3 below for more introductory view of the implementation [2] Added patch: 2, 4, 10, 11, 12 Modified patch: 5, 6 Unchanged patch: 1, 3, 7, 8, 9 (1, 8 has commit msg modified) Special thanks to Björn for his efforts on testing and guiding the series! [1]: https://lore.kernel.org/lkml/20240306165904.108141-1-puranjay12@gmail.com/ [2]: https://lore.kernel.org/linux-riscv/20241127172908.17149-1-andybnac@gmail.com/ * patches from https://lore.kernel.org/r/20250407180838.42877-1-andybnac@gmail.com: riscv: Documentation: add a description about dynamic ftrace riscv: ftrace: support direct call using call_ops riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS riscv: ftrace: support PREEMPT riscv: add a data fence for CMODX in the kernel mode riscv: vector: Support calling schedule() for preemptible Vector riscv: ftrace: do not use stop_machine to update code riscv: ftrace: prepare ftrace for atomic code patching kernel: ftrace: export ftrace_sync_ipi riscv: ftrace: align patchable functions to 4 Byte boundary riscv: ftrace factor out code defined by !WITH_ARG riscv: ftrace: support fastcc in Clang for WITH_ARGS Link: https://lore.kernel.org/r/20250407180838.42877-1-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
5 days	riscv: Add support for PUD THP	Alexandre Ghiti	6	-2/+120
	Add the necessary page table functions to deal with PUD THP, this enables the use of PUD pfnmap. Link: https://lore.kernel.org/r/20250321123954.225097-1-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: xchg: Prefetch the destination word for sc.w	Guo Ren	1	-1/+3
	The cost of changing a cacheline from shared to exclusive state can be significant, especially when this is triggered by an exclusive store, since it may result in having to retry the transaction. This patch makes use of prefetch.w to prefetch cachelines for write prior to lr/sc loops when using the xchg_small atomic routine. This patch is inspired by commit 0ea366f5e1b6 ("arm64: atomics: prefetch the destination word for write prior to stxr"). Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20231231082955.16516-4-guoren@kernel.org Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-5-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop	Guo Ren	1	-0/+24
	Enable Linux prefetch and prefetchw primitives using Zicbop. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Link: https://lore.kernel.org/r/20231231082955.16516-3-guoren@kernel.org Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-4-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: Add support for Zicbop	Alexandre Ghiti	8	-9/+55
	Zicbop introduces cache blocks prefetching instructions, add the necessary support for the kernel to use it in the coming commits. Co-developed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@kernel.org> Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-3-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: Introduce Zicbop instructions	Alexandre Ghiti	1	-0/+60
	The S-type instructions are first introduced and then used to define the encoding of the Zicbop prefetching instructions. Co-developed-by: Guo Ren <guoren@kernel.org> Signed-off-by: Guo Ren <guoren@kernel.org> Tested-by: Andrea Parri <parri.andrea@gmail.com> Link: https://lore.kernel.org/r/20250421142441.395849-2-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv/kexec_file: Fix comment in purgatory relocator	Yao Zi	1	-1/+1
	Apparently sec_base doesn't mean relocated symbol value, which seems a copy-pasting error in the comment. Assigned with the address of section indexed by sym->st_shndx, it should represent base address of the relevant section. Let's fix the comment to avoid possible confusion. Fixes: 838b3e28488f ("RISC-V: Load purgatory in kexec_file") Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250326073450.57648-2-ziyao@disroot.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: kexec_file: Support loading Image binary file	Song Shuai	5	-1/+101
	This patch creates image_kexec_ops to load Image binary file for kexec_file_load() syscall. Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250409193004.643839-3-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: kexec_file: Split the loading of kernel and others	Song Shuai	5	-486/+510
	This is the preparative patch for kexec_file_load Image support. It separates the elf_kexec_load() as two parts: - the first part loads the vmlinux (or Image) - the second part loads other segments (e.g. initrd,fdt,purgatory) And the second part is exported as the load_extra_segments() function which would be used in both kexec-elf.c and kexec-image.c. No functional change intended. Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250409193004.643839-2-bjorn@kernel.org Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: Documentation: add a description about dynamic ftrace	Andy Chiu	1	-7/+39
	Add a section in cmodx to describe how dynamic ftrace works on riscv, limitations, and assumptions. Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-12-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: ftrace: support direct call using call_ops	Andy Chiu	5	-27/+48
	jump to FTRACE_ADDR if distance is out of reach Co-developed-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-11-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS	Puranjay Mohan	5	-6/+105
	This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V. This allows each ftrace callsite to provide an ftrace_ops to the common ftrace trampoline, allowing each callsite to invoke distinct tracer functions without the need to fall back to list processing or to allocate custom trampolines for each callsite. This significantly speeds up cases where multiple distinct trace functions are used and callsites are mostly traced by a single tracer. The idea and most of the implementation is taken from the ARM64's implementation of the same feature. The idea is to place a pointer to the ftrace_ops as a literal at a fixed offset from the function entry point, which can be recovered by the common ftrace trampoline. We use -fpatchable-function-entry to reserve 8 bytes above the function entry by emitting 2 4 byte or 4 2 byte nops depending on the presence of CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer to the associated ftrace_ops for that callsite. Functions are aligned to 8 bytes to make sure that the accesses to this literal are atomic. This approach allows for directly invoking ftrace_ops::func even for ftrace_ops which are dynamically-allocated (or part of a module), without going via ftrace_ops_list_func. We've benchamrked this with the ftrace_ops sample module on Spacemit K1 Jupiter: Without this patch: baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29 +-----------------------+-----------------+----------------------------+ \| Number of tracers \| Total time (ns) \| Per-call average time \| \|-----------------------+-----------------+----------------------------\| \| Relevant \| Irrelevant \| 100000 calls \| Total (ns) \| Overhead (ns) \| \|----------+------------+-----------------+------------+---------------\| \| 0 \| 0 \| 1357958 \| 13 \| - \| \| 0 \| 1 \| 1302375 \| 13 \| - \| \| 0 \| 2 \| 1302375 \| 13 \| - \| \| 0 \| 10 \| 1379084 \| 13 \| - \| \| 0 \| 100 \| 1302458 \| 13 \| - \| \| 0 \| 200 \| 1302333 \| 13 \| - \| \|----------+------------+-----------------+------------+---------------\| \| 1 \| 0 \| 13677833 \| 136 \| 123 \| \| 1 \| 1 \| 18500916 \| 185 \| 172 \| \| 1 \| 2 \| 22856459 \| 228 \| 215 \| \| 1 \| 10 \| 58824709 \| 588 \| 575 \| \| 1 \| 100 \| 505141584 \| 5051 \| 5038 \| \| 1 \| 200 \| 1580473126 \| 15804 \| 15791 \| \|----------+------------+-----------------+------------+---------------\| \| 1 \| 0 \| 13561000 \| 135 \| 122 \| \| 2 \| 0 \| 19707292 \| 197 \| 184 \| \| 10 \| 0 \| 67774750 \| 677 \| 664 \| \| 100 \| 0 \| 714123125 \| 7141 \| 7128 \| \| 200 \| 0 \| 1918065668 \| 19180 \| 19167 \| +----------+------------+-----------------+------------+---------------+ Note: per-call overhead is estimated relative to the baseline case with 0 relevant tracers and 0 irrelevant tracers. With this patch: v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29 +-----------------------+-----------------+----------------------------+ \| Number of tracers \| Total time (ns) \| Per-call average time \| \|-----------------------+-----------------+----------------------------\| \| Relevant \| Irrelevant \| 100000 calls \| Total (ns) \| Overhead (ns) \| \|----------+------------+-----------------+------------+---------------\| \| 0 \| 0 \| 1459917 \| 14 \| - \| \| 0 \| 1 \| 1408000 \| 14 \| - \| \| 0 \| 2 \| 1383792 \| 13 \| - \| \| 0 \| 10 \| 1430709 \| 14 \| - \| \| 0 \| 100 \| 1383791 \| 13 \| - \| \| 0 \| 200 \| 1383750 \| 13 \| - \| \|----------+------------+-----------------+------------+---------------\| \| 1 \| 0 \| 5238041 \| 52 \| 38 \| \| 1 \| 1 \| 5228542 \| 52 \| 38 \| \| 1 \| 2 \| 5325917 \| 53 \| 40 \| \| 1 \| 10 \| 5299667 \| 52 \| 38 \| \| 1 \| 100 \| 5245250 \| 52 \| 39 \| \| 1 \| 200 \| 5238459 \| 52 \| 39 \| \|----------+------------+-----------------+------------+---------------\| \| 1 \| 0 \| 5239083 \| 52 \| 38 \| \| 2 \| 0 \| 19449417 \| 194 \| 181 \| \| 10 \| 0 \| 67718584 \| 677 \| 663 \| \| 100 \| 0 \| 709840708 \| 7098 \| 7085 \| \| 200 \| 0 \| 2203580626 \| 22035 \| 22022 \| +----------+------------+-----------------+------------+---------------+ Note: per-call overhead is estimated relative to the baseline case with 0 relevant tracers and 0 irrelevant tracers. As can be seen from the above: a) Whenever there is a single relevant tracer function associated with a tracee, the overhead of invoking the tracer is constant, and does not scale with the number of tracers which are not associated with that tracee. b) The overhead for a single relevant tracer has dropped to ~1/3 of the overhead prior to this series (from 122ns to 38ns). This is largely due to permitting calls to dynamically-allocated ftrace_ops without going through ftrace_ops_list_func. Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> [update kconfig, asm, refactor] Signed-off-by: Andy Chiu <andybnac@gmail.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-10-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: ftrace: support PREEMPT	Andy Chiu	1	-1/+1
	Now, we can safely enable dynamic ftrace with kernel preemption. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-9-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: add a data fence for CMODX in the kernel mode	Andy Chiu	1	-1/+14
	RISC-V spec explicitly calls out that a local fence.i is not enough for the code modification to be visble from a remote hart. In fact, it states: To make a store to instruction memory visible to all RISC-V harts, the writing hart also has to execute a data FENCE before requesting that all remote RISC-V harts execute a FENCE.I. Although current riscv drivers for IPI use ordered MMIO when sending IPIs in order to synchronize the action between previous csd writes, riscv does not restrict itself to any particular flavor of IPI. Any driver or firmware implementation that does not order data writes before the IPI may pose a risk for code-modifying race. Thus, add a fence here to order data writes before making the IPI. Signed-off-by: Andy Chiu <andybnac@gmail.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-8-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: vector: Support calling schedule() for preemptible Vector	Andy Chiu	2	-3/+24
	Each function entry implies a call to ftrace infrastructure. And it may call into schedule in some cases. So, it is possible for preemptible kernel-mode Vector to implicitly call into schedule. Since all V-regs are caller-saved, it is possible to drop all V context when a thread voluntarily call schedule(). Besides, we currently don't pass argument through vector register, so we don't have to save/restore V-regs in ftrace trampoline. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20250407180838.42877-7-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: ftrace: do not use stop_machine to update code	Andy Chiu	1	-54/+10
	Now it is safe to remove dependency from stop_machine() for us to patch code in ftrace. Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20250407180838.42877-6-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: ftrace: prepare ftrace for atomic code patching	Andy Chiu	3	-97/+98
	We use an AUIPC+JALR pair to jump into a ftrace trampoline. Since instruction fetch can break down to 4 byte at a time, it is impossible to update two instructions without a race. In order to mitigate it, we initialize the patchable entry to AUIPC + NOP4. Then, the run-time code patching can change NOP4 to JALR to eable/disable ftrcae from a function. This limits the reach of each ftrace entry to +-2KB displacing from ftrace_caller. Starting from the trampoline, we add a level of indirection for it to reach ftrace caller target. Now, it loads the target address from a memory location, then perform the jump. This enable the kernel to update the target atomically. The new don't-stop-the-world text patching on change only one RISC-V instruction: \| -8: &ftrace_ops of the associated tracer function. \| <ftrace enable>: \| 0: auipc t0, hi(ftrace_caller) \| 4: jalr t0, lo(ftrace_caller) \| \| -8: &ftrace_nop_ops \| <ftrace disable>: \| 0: auipc t0, hi(ftrace_caller) \| 4: nop This means that f+0x0 is fixed, and should not be claimed by ftrace, e.g. kprobe should be able to put a probe in f+0x0. Thus, we adjust the offset and MCOUNT_INSN_SIZE accordingly. [ alex: Fix build errors with !CONFIG_DYNAMIC_FTRACE ] Co-developed-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Link: https://lore.kernel.org/r/20250407180838.42877-5-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	kernel: ftrace: export ftrace_sync_ipi	Andy Chiu	2	-1/+3
	The following ftrace patch for riscv uses a data store to update ftrace function. Therefore, a romote fence is required to order it against function_trace_op updates. The mechanism is similar to the fence between function_trace_op and update_ftrace_func in the generic ftrace, so we leverage the same ftrace_sync_ipi function. [ alex: Fix build warning when !CONFIG_DYNAMIC_FTRACE ] Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-4-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: ftrace: align patchable functions to 4 Byte boundary	Andy Chiu	1	-0/+2
	We are changing ftrace code patching in order to remove dependency from stop_machine() and enable kernel preemption. This requires us to align functions entry at a 4-B align address. However, -falign-functions on older versions of GCC alone was not strong enoungh to align all functions. In fact, cold functions are not aligned after turning on optimizations. We consider this is a bug in GCC and turn off guess-branch-probility as a workaround to align all functions. GCC bug id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345 The option -fmin-function-alignment is able to align all functions properly on newer versions of gcc. So, we add a cc-option to test if the toolchain supports it. Suggested-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-3-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: ftrace factor out code defined by !WITH_ARG	Andy Chiu	2	-49/+0
	DYNAMIC_FTRACE selects DYNAMIC_FTRACE_WITH_ARGS and mcount-dyn.S in riscv, so we can remove ifdef jargons of WITH_ARG when it is known that DYNAMIC_FTRACE is true. Signed-off-by: Andy Chiu <andybnac@gmail.com> Link: https://lore.kernel.org/r/20250407180838.42877-2-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	riscv: ftrace: support fastcc in Clang for WITH_ARGS	Andy Chiu	3	-2/+28
	Some caller-saved registers which are not defined as function arguments in the ABI can still be passed as arguments when the kernel is compiled with Clang. As a result, we must save and restore those registers to prevent ftrace from clobbering them. - [1]: https://reviews.llvm.org/D68559 Reported-by: Evgenii Shatokhin <e.shatokhin@yadro.com> Closes: https://lore.kernel.org/linux-riscv/7e7c7914-445d-426d-89a0-59a9199c45b1@yadro.com/ Fixes: 7caa9765465f ("ftrace: riscv: move from REGS to ARGS") Acked-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Andy Chiu <andy.chiu@sifive.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20250407180838.42877-1-andybnac@gmail.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
5 days	drm/xe: remove unmatched xe_vm_unlock() from __xe_exec_queue_init()	Maciej Patelczyk	1	-5/+1
	There is unmatched xe_vm_unlock() in the __xe_exec_queue_init(). Leftover from commit fbeaad071a98 ("drm/xe: Create LRC BO without VM") Fixes: 2b0a0ce0c20b ("drm/xe: Create LRC BO without VM") Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Link: https://lore.kernel.org/r/20250530135627.2821612-1-maciej.patelczyk@intel.com (cherry picked from commit 28b996ce73982a44fa86736ca0e3684cb1ae8b24) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe: Create LRC BO without VM	Niranjana Vishwanathapura	2	-28/+4
	Specifying VM during lrc->bo creation requires VM's reference to be held for the lifetime of lrc->bo as it will use VM's dma reservation object. Using VM's dma reservation object for lrc->bo doesn't provide any advantage. Hence do not pass VM while creating lrc->bo. v2: Use xe_bo_unpin_map_no_vm (Matthew Brost) Fixes: 264eecdba211 ("drm/xe: Decouple xe_exec_queue and xe_lrc") Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250529052031.2429120-2-niranjana.vishwanathapura@intel.com (cherry picked from commit fbeaad071a98fef87deccee81d564de1c8e8e16d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/guc_submit: add back fix	Matthew Auld	1	-0/+11
	Daniele noticed that the fix in commit 2d2be279f1ca ("drm/xe: fix UAF around queue destruction") looks to have been unintentionally removed as part of handling a conflict in some past merge commit. Add it back. Fixes: ac44ff7cec33 ("Merge tag 'drm-xe-fixes-2024-10-10' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes") Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.12+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250603174213.1543579-2-matthew.auld@intel.com (cherry picked from commit 9d9fca62dc49d96f97045b6d8e7402a95f8cf92a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/pxp: Clarify PXP queue creation behavior if PXP is not ready	Daniele Ceraolo Spurio	2	-2/+11
	The expected flow of operations when using PXP is to query the PXP status and wait for it to transition to "ready" before attempting to create an exec_queue. This flow is followed by the Mesa driver, but there is no guarantee that an incorrectly coded (or malicious) app will not attempt to create the queue first without querying the status. Therefore, we need to clarify what the expected behavior of the queue creation ioctl is in this scenario. Currently, the ioctl always fails with an -EBUSY code no matter the error, but for consistency it is better to distinguish between "failed to init" (-EIO) and "not ready" (-EBUSY), the same way the query ioctl does. Note that, while this is a change in the return code of an ioctl, the behavior of the ioctl in this particular corner case was not clearly spec'd, so no one should have been relying on it (and we know that Mesa, which is the only known userspace for this, didn't). v2: Minor rework of the doc (Rodrigo) Fixes: 72d479601d67 ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-7-daniele.ceraolospurio@intel.com (cherry picked from commit 21784ca96025b62d95b670b7639ad70ddafa69b8) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/pxp: Use the correct define in the set_property_funcs array	Daniele Ceraolo Spurio	1	-1/+1
	The define of the extension type was accidentally used instead of the one of the property itself. They're both zero, so no functional issue, but we should use the correct define for code correctness. Fixes: 41a97c4a1294 ("drm/xe/pxp/uapi: Add API to mark a BO as using PXP") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250522225401.3953243-6-daniele.ceraolospurio@intel.com (cherry picked from commit 1d891ee820fd0fbb4101eacb0d922b5050a24933) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/sched: stop re-submitting signalled jobs	Matthew Auld	1	-1/+9
	Customer is reporting a really subtle issue where we get random DMAR faults, hangs and other nasties for kernel migration jobs when stressing stuff like s2idle/s3/s4. The explosions seems to happen somewhere after resuming the system with splats looking something like: PM: suspend exit rfkill: input handler disabled xe 0000:00:02.0: [drm] GT0: Engine reset: engine_class=bcs, logical_mask: 0x2, guc_id=0 xe 0000:00:02.0: [drm] GT0: Timedout job: seqno=24496, lrc_seqno=24496, guc_id=0, flags=0x13 in no process [-1] xe 0000:00:02.0: [drm] GT0: Kernel-submitted job timed out The likely cause appears to be a race between suspend cancelling the worker that processes the free_job()'s, such that we still have pending jobs to be freed after the cancel. Following from this, on resume the pending_list will now contain at least one already complete job, but it looks like we call drm_sched_resubmit_jobs(), which will then call run_job() on everything still on the pending_list. But if the job was already complete, then all the resources tied to the job, like the bb itself, any memory that is being accessed, the iommu mappings etc. might be long gone since those are usually tied to the fence signalling. This scenario can be seen in ftrace when running a slightly modified xe_pm IGT (kernel was only modified to inject artificial latency into free_job to make the race easier to hit): xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ... xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13 xe_exec_queue_stop: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=1, guc_state=0x0, flags=0x4 xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=0, guc_state=0x0, flags=0x3 xe_exec_queue_stop: dev=0000:00:02.0, 1:0x1, gt=1, width=1, guc_id=1, guc_state=0x0, flags=0x3 xe_exec_queue_stop: dev=0000:00:02.0, 4:0x1, gt=1, width=1, guc_id=2, guc_state=0x0, flags=0x3 xe_exec_queue_resubmit: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x0, flags=0x13 xe_sched_job_run: dev=0000:00:02.0, fence=0xffff888276cc8540, seqno=0, lrc_seqno=0, gt=0, guc_id=0, batch_addr=0x000000146910 ... ..... xe_exec_queue_memory_cat_error: dev=0000:00:02.0, 3:0x2, gt=0, width=1, guc_id=0, guc_state=0x3, flags=0x13 So the job_run() is clearly triggered twice for the same job, even though the first must have already signalled to completion during suspend. We can also see a CAT error after the re-submit. To prevent this only resubmit jobs on the pending_list that have not yet signalled. v2: - Make sure to re-arm the fence callbacks with sched_start(). v3 (Matt B): - Stop using drm_sched_resubmit_jobs(), which appears to be deprecated and just open-code a simple loop such that we skip calling run_job() on anything already signalled. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4856 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: William Tseng <william.tseng@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20250528113328.289392-2-matthew.auld@intel.com (cherry picked from commit 38fafa9f392f3110d2de431432d43f4eef99cd1b) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe: Rework eviction rejection of bound external bos	Thomas Hellström	3	-18/+105
	For preempt_fence mode VM's we're rejecting eviction of shared bos during VM_BIND. However, since we do this in the move() callback, we're getting an eviction failure warning from TTM. The TTM callback intended for these things is eviction_valuable(). However, the latter doesn't pass in the struct ttm_operation_ctx needed to determine whether the caller needs this. Instead, attach the needed information to the vm under the vm->resv, until we've been able to update TTM to provide the needed information. And add sufficient lockdep checks to prevent misuse and races. v2: - Fix a copy-paste error in xe_vm_clear_validating() v3: - Fix kerneldoc errors. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: 0af944f0e308 ("drm/xe: Reject BO eviction if BO is bound to current VM") Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250528164105.234718-1-thomas.hellstrom@linux.intel.com (cherry picked from commit 9d5558649f68e2e84a87a909631b30e15ca0f8ec) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/vsec: fix CONFIG_INTEL_VSEC dependency	Arnd Bergmann	1	-1/+2
	The XE driver can be built with or without VSEC support, but fails to link as built-in if vsec is in a loadable module: x86_64-linux-ld: vmlinux.o: in function `xe_vsec_init': (.text+0x1e83e16): undefined reference to `intel_vsec_register' The normal fix for this is to add a 'depends on INTEL_VSEC \|\| !INTEL_VSEC', forcing XE to be a loadable module as well, but that causes a circular dependency: symbol DRM_XE depends on INTEL_VSEC symbol INTEL_VSEC depends on X86_PLATFORM_DEVICES symbol X86_PLATFORM_DEVICES is selected by DRM_XE The problem here is selecting a symbol from another subsystem, so change that as well and rephrase the 'select' into the corresponding dependency. Since X86_PLATFORM_DEVICES is 'default y', there is no change to defconfig builds here. Fixes: 0c45e76fcc62 ("drm/xe/vsec: Support BMG devices") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250529172355.2395634-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit e4931f8be347ec5f19df4d6d33aea37145378c42) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe: drop redundant conversion to bool	Raag Jadav	1	-1/+1
	The result of integer comparison already evaluates to bool. No need for explicit conversion. No functional impact. Fixes: 0e414bf7ad01 ("drm/xe: Expose PCIe link downgrade attributes") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505292205.MoljmkjQ-lkp@intel.com/ Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250529160937.490147-1-raag.jadav@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 61761a6b57f2818983466d24aab60baab471ba21) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/hwmon: Move card reactive critical power under channel card	Karthik Poosa	2	-13/+13
	Move power2/curr2_crit to channel 1 i.e power1/curr1_crit as this represents the entire card critical power/current. v2: Update the date of curr1_crit also in hwmon documentation. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: 345dadc4f68b ("drm/xe/hwmon: Add infra to support card power and energy attributes") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-3-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 25e963a09e059ffdb15c09cc79cfded855b43668) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/hwmon: Add support to manage power limits though mailbox	Karthik Poosa	8	-106/+318
	Add support to manage power limits using pcode mailbox commands for supported platforms. v2: - Address review comments. (Badal) - Use mailbox commands instead of registers to manage power limits for BMG. - Clamp the maximum power limit to GPU firmware default value. v3: - Clamp power limit in write also for platforms with mailbox support. v4: - Remove unnecessary debug prints. (Badal) v5: - Update description of variable pl1_on_boot to fix kernel-doc error. v6: - Improve commit message, refer to BIOS as GPU firmware. - Change macro READ_PL_FROM_BIOS to READ_PL_FROM_FW. - Rectify drm_warn to drm_info. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: e90f7a58e659 ("drm/xe/hwmon: Add HWMON support for BMG") Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://lore.kernel.org/r/20250529163458.2354509-2-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 7596d839f6228757fe17a810da2d1c5f3305078c) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/vm: move xe_svm_init() earlier	Matthew Auld	1	-7/+12
	In xe_vm_close_and_put() we need to be able to call xe_svm_fini(), however during vm creation we can call this on the error path, before having actually initialised the svm state, leading to various splats followed by a fatal NPD. Fixes: 6fd979c2f331 ("drm/xe: Add SVM init / close / fini to faulting VMs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4967 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-4-matthew.auld@intel.com (cherry picked from commit 4f296d77cf49fcb5f90b4674123ad7f3a0676165) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	drm/xe/vm: move rebind_work init earlier	Matthew Auld	1	-4/+4
	In xe_vm_close_and_put() we need to be able to call flush_work(rebind_work), however during vm creation we can call this on the error path, before having actually set up the worker, leading to a splat from flush_work(). It looks like we can simply move the worker init step earlier to fix this. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250514152424.149591-3-matthew.auld@intel.com (cherry picked from commit 96af397aa1a2d1032a6e28ff3f4bc0ab4be40e1d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
5 days	Merge tag 'rtc-6.16' of ↵	Linus Torvalds	26	-329/+946
	git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux Pull RTC updates from Alexandre Belloni: "There are two new drivers this cycle. There is also support for a negative offset for RTCs that have been shipped with a date set using an epoch that is before 1970. This unfortunately happens with some products that ship with a vendor kernel and an out of tree driver. Core: - support negative offsets for RTCs that have shipped with an epoch earlier than 1970 New drivers: - NXP S32G2/S32G3 - Sophgo CV1800 Drivers: - loongson: fix missing alarm notifications for ACPI - m41t80: kickstart ocillator upon failure - mt6359: mt6357 support - pcf8563: fix wrong alarm register - sh: cleanups" * tag 'rtc-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (39 commits) rtc: mt6359: Add mt6357 support rtc: test: Test date conversion for dates starting in 1900 rtc: test: Also test time and wday outcome of rtc_time64_to_tm() rtc: test: Emit the seconds-since-1970 value instead of days-since-1970 rtc: Fix offset calculation for .start_secs < 0 rtc: Make rtc_time64_to_tm() support dates before 1970 rtc: pcf8563: fix wrong alarm register rtc: rzn1: support input frequencies other than 32768Hz rtc: rzn1: Disable controller before initialization dt-bindings: rtc: rzn1: add optional second clock rtc: m41t80: reduce verbosity rtc: m41t80: kickstart ocillator upon failure rtc: s32g: add NXP S32G2/S32G3 SoC support dt-bindings: rtc: add schema for NXP S32G2/S32G3 SoCs dt-bindings: at91rm9260-rtt: add microchip,sama7d65-rtt dt-bindings: rtc: at91rm9200: add microchip,sama7d65-rtc rtc: loongson: Add missing alarm notifications for ACPI RTC events rtc: sophgo: add rtc support for Sophgo CV1800 SoC rtc: stm32: drop unused module alias rtc: s3c: drop unused module alias ...
5 days	sysfb: Fix screen_info type check for VGA	Thomas Zimmermann	1	-8/+18
	Use the helper screen_info_video_type() to get the framebuffer type from struct screen_info. Handle supported values in sorted switch statement. Reading orig_video_isVGA is unreliable. On most systems it is a VIDEO_TYPE_ constant. On some systems with VGA it is simply set to 1 to signal the presence of a VGA output. See vga_probe() for an example. Retrieving the screen_info type with the helper screen_info_video_type() detects these cases and returns the appropriate VIDEO_TYPE_ constant. For VGA, sysfb creates a device named "vga-framebuffer". The sysfb code has been taken from vga16fb, where it likely didn't work correctly either. With this bugfix applied, vga16fb loads for compatible vga-framebuffer devices. Fixes: 0db5b61e0dc0 ("fbdev/vga16fb: Create EGA/VGA devices in sysfb code") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Tzung-Bi Shih <tzungbi@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: "Uwe Kleine-König" <u.kleine-koenig@baylibre.com> Cc: Zsolt Kajtar <soci@c64.rulez.org> Cc: <stable@vger.kernel.org> # v6.1+ Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://lore.kernel.org/r/20250603154838.401882-1-tzimmermann@suse.de
5 days	video: screen_info: Relocate framebuffers behind PCI bridges	Thomas Zimmermann	1	-29/+50
	Apply PCI host-bridge window offsets to screen_info framebuffers. Fixes invalid access to I/O memory. Resources behind a PCI host bridge can be relocated by a certain offset in the kernel's CPU address range used for I/O. The framebuffer memory range stored in screen_info refers to the CPU addresses as seen during boot (where the offset is 0). During boot up, firmware may assign a different memory offset to the PCI host bridge and thereby relocating the framebuffer address of the PCI graphics device as seen by the kernel. The information in screen_info must be updated as well. The helper pcibios_bus_to_resource() performs the relocation of the screen_info's framebuffer resource (given in PCI bus addresses). The result matches the I/O-memory resource of the PCI graphics device (given in CPU addresses). As before, we store away the information necessary to later update the information in screen_info itself. Commit 78aa89d1dfba ("firmware/sysfb: Update screen_info for relocated EFI framebuffers") added the code for updating screen_info. It is based on similar functionality that pre-existed in efifb. Efifb uses a pointer to the PCI resource, while the newer code does a memcpy of the region. Hence efifb sees any updates to the PCI resource and avoids the issue. v3: - Only use struct pci_bus_region for PCI bus addresses (Bjorn) - Clarify address semantics in commit messages and comments (Bjorn) v2: - Fixed tags (Takashi, Ivan) - Updated information on efifb Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Reported-by: "Ivan T. Ivanov" <iivanov@suse.de> Closes: https://bugzilla.suse.com/show_bug.cgi?id=1240696 Tested-by: "Ivan T. Ivanov" <iivanov@suse.de> Fixes: 78aa89d1dfba ("firmware/sysfb: Update screen_info for relocated EFI framebuffers") Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.9+ Link: https://lore.kernel.org/r/20250528080234.7380-1-tzimmermann@suse.de
5 days	Merge tag 'dmaengine-6.16-rc1' of ↵	Linus Torvalds	29	-106/+1463
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "A fairly small update for the dmaengine subsystem. This has a new ARM dmaengine driver and couple of new device support and few driver changes: New support: - Renesas RZ/V2H(P) dma support for r9a09g057 - Arm DMA-350 driver - Tegra Tegra264 ADMA support Updates: - AMD ptdma driver code removal and optimizations - Freescale edma error interrupt handler support" * tag 'dmaengine-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (27 commits) dmaengine: idxd: Remove unused pointer and macro arm64: dts: renesas: r9a09g057: Add DMAC nodes dmaengine: sh: rz-dmac: Add RZ/V2H(P) support dmaengine: sh: rz-dmac: Allow for multiple DMACs irqchip/renesas-rzv2h: Add rzv2h_icu_register_dma_req() dt-bindings: dma: rz-dmac: Document RZ/V2H(P) family of SoCs dt-bindings: dma: rz-dmac: Restrict properties for RZ/A1H dmaengine: idxd: Narrow the restriction on BATCH to ver. 1 only dmaengine: ti: Add NULL check in udma_probe() fsldma: Set correct dma_mask based on hw capability dmaengine: idxd: Check availability of workqueue allocated by idxd wq driver before using dmaengine: xilinx_dma: Set dma_device directions dmaengine: tegra210-adma: Add Tegra264 support dt-bindings: Document Tegra264 ADMA support dmaengine: dw-edma: Add HDMA NATIVE map check dmaegnine: fsl-edma: add edma error interrupt handler dt-bindings: dma: fsl-edma: increase maxItems of interrupts and interrupt-names dmaengine: ARM_DMA350 should depend on ARM/ARM64 dt-bindings: dma: qcom,bam: Document dma-coherent property dmaengine: Add Arm DMA-350 driver ...
5 days	cifs: update internal version number	Steve French	1	-2/+2
	to 2.55 Signed-off-by: Steve French <stfrench@microsoft.com>
5 days	MAINTAINERS, mailmap: Update Paulo Alcantara's email address	Paulo Alcantara	2	-2/+8
	Update my email address in MAINTAINERS and .mailmap files. Signed-off-by: Paulo Alcantara <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
5 days	cifs: add documentation for smbdirect setup	Meetakshi Setiya	2	-0/+104
	Document steps to use SMB over RDMA using the linux SMB client and KSMBD server Signed-off-by: Meetakshi Setiya <msetiya@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
5 days	Merge tag 'phy-for-6.16' of ↵	Linus Torvalds	54	-1033/+2386
	git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "As usual featuring couple of new driver and bunch of new device support and some driver changes to Freescale, rockchip driver along with couple of yaml binding conversions. New Support: - Qualcomm IPQ5424 qusb2 support, IPQ5018 uniphy-pcie driver - Rockchip usb2 support for RK3562, RK3036 usb2 phy support - Samsung exynos2200 eusb2 phy support and driver refactoring for this support, exynos7870 USBDRD support - Mediatek MT7988 xs-phy support - Broadcom BCM74110 usb phy support - Renesas RZ/V2H(P) usb2 phy support Updates: - Freescale phy rate claculation updates, i.MX95 tuning support - Better error handling for amlogic pcie phy - Rockchip color depth configuration and management support - Yaml binding conversion for RK3399 Type-C and PCIe Phy" * tag 'phy-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (77 commits) phy: tegra: p2u: Broaden architecture dependency phy: rockchip: inno-usb2: Add usb2 phy support for rk3562 dt-bindings: phy: rockchip,inno-usb2phy: add rk3562 phy: rockchip: inno-usb2: add phy definition for rk3036 dt-bindings: phy: rockchip,inno-usb2phy: add rk3036 compatible phy: freescale: fsl-samsung-hdmi: Improve LUT search for best clock phy: freescale: fsl-samsung-hdmi: Refactor finding PHY settings phy: freescale: fsl-samsung-hdmi: Rename phy_clk_round_rate phy: renesas: phy-rcar-gen3-usb2: Add USB2.0 PHY support for RZ/V2H(P) phy: renesas: phy-rcar-gen3-usb2: Sort compatible entries by SoC part number dt-bindings: phy: renesas,usb2-phy: Document RZ/V2H(P) SoC dt-bindings: phy: renesas,usb2-phy: Add clock constraint for RZ/G2L family phy: exynos5-usbdrd: support Exynos USBDRD 3.2 4nm controller phy: phy-snps-eusb2: add support for exynos2200 phy: phy-snps-eusb2: refactor reference clock init phy: phy-snps-eusb2: make reset control optional phy: phy-snps-eusb2: make repeater optional phy: phy-snps-eusb2: split phy init code phy: phy-snps-eusb2: refactor constructs names phy: move phy-qcom-snps-eusb2 out of its vendor sub-directory ...
5 days	Merge tag 'soundwire-6.16-rc1' of ↵	Linus Torvalds	10	-20/+57
	git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire updates from Vinod Koul: "A couple of small core changes and an Intel driver change: - sdw_assign_device_num() logic simplification, using internal slave id for irqs and optimizing computing of port params in specific stream states - Intel driver updates for ACE3+ microphone privacy status reporting and enabling the status in HDA Intel driver" * tag 'soundwire-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: only compute port params in specific stream states ASoC: SOF: Intel: hda: Set the mic_privacy flag for soundwire with ACE3+ soundwire: intel: Add awareness of ACE3+ microphone privacy soundwire: bus: Add internal slave ID and use for IRQs soundwire: bus: Simplify sdw_assign_device_num()
5 days	calipso: unlock rcu before returning -EAFNOSUPPORT	Eric Dumazet	1	-2/+4
	syzbot reported that a recent patch forgot to unlock rcu in the error path. Adopt the convention that netlbl_conn_setattr() is already using. Fixes: 6e9f2df1c550 ("calipso: Don't call calipso functions for AF_INET sk.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paul Moore <paul@paul-moore.com> Link: https://patch.msgid.link/20250604133826.1667664-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	seg6: Fix validation of nexthop addresses	Ido Schimmel	1	-4/+2
	The kernel currently validates that the length of the provided nexthop address does not exceed the specified length. This can lead to the kernel reading uninitialized memory if user space provided a shorter length than the specified one. Fix by validating that the provided length exactly matches the specified one. Fixes: d1df6fd8a1d2 ("ipv6: sr: define core operations for seg6local lightweight tunnel") Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250604113252.371528-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	net: prevent a NULL deref in rtnl_create_link()	Eric Dumazet	1	-1/+1
	At the time rtnl_create_link() is running, dev->netdev_ops is NULL, we must not use netdev_lock_ops() or risk a NULL deref if CONFIG_NET_SHAPER is defined. Use netif_set_group() instead of dev_set_group(). RIP: 0010:netdev_need_ops_lock include/net/netdev_lock.h:33 [inline] RIP: 0010:netdev_lock_ops include/net/netdev_lock.h:41 [inline] RIP: 0010:dev_set_group+0xc0/0x230 net/core/dev_api.c:82 Call Trace: <TASK> rtnl_create_link+0x748/0xd10 net/core/rtnetlink.c:3674 rtnl_newlink_create+0x25c/0xb00 net/core/rtnetlink.c:3813 __rtnl_newlink net/core/rtnetlink.c:3940 [inline] rtnl_newlink+0x16d6/0x1c70 net/core/rtnetlink.c:4055 rtnetlink_rcv_msg+0x7cf/0xb70 net/core/rtnetlink.c:6944 netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2534 netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline] netlink_unicast+0x75b/0x8d0 net/netlink/af_netlink.c:1339 netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1883 sock_sendmsg_nosec net/socket.c:712 [inline] Reported-by: syzbot+9fc858ba0312b42b577e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6840265f.a00a0220.d4325.0009.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250604105815.1516973-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	net: annotate data-races around cleanup_net_task	Eric Dumazet	2	-3/+3
	from_cleanup_net() reads cleanup_net_task locklessly. Add READ_ONCE()/WRITE_ONCE() annotations to avoid a potential KCSAN warning, even if the race is harmless. Fixes: 0734d7c3d93c ("net: expedite synchronize_net() for cleanup_net()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250604093928.1323333-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	selftests: drv-net: tso: make bkg() wait for socat to quit	Jakub Kicinski	1	-1/+1
	Commit 846742f7e32f ("selftests: drv-net: add a warning for bkg + shell + terminate") added a warning for bkg() used with terminate=True. The tso test was missed as we didn't have it running anywhere in NIPA. Add exit_wait=True, to avoid: # Warning: combining shell and terminate is risky! # SIGTERM may not reach the child on zsh/ksh! getting printed twice for every variant. Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250604012055.891431-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	selftests: drv-net: tso: fix the GRE device name	Jakub Kicinski	1	-1/+1
	The device type for IPv4 GRE is "gre" not "ipgre", unlike for IPv6 which uses "ip6gre". Not sure how I missed this when writing the test, perhaps because all HW I have access to is on an IPv6-only network. Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test") Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250604012031.891242-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	selftests: drv-net: add configs for the TSO test	Jakub Kicinski	1	-0/+5
	Add missing config options for the tso.py test, specifically to make sure the kernel is built with vxlan and gre tunnels. I noticed this while adding a TSO-capable device QEMU to the CI. Previously we only run virtio tests and it doesn't report LSO stats on the QEMU we have. Fixes: 0d0f4174f6c8 ("selftests: drv-net: add a simple TSO test") Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250604001653.853008-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	Merge branch '40GbE' of ↵	Jakub Kicinski	3	-223/+96
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== iavf: get rid of the crit lock Przemek Kitszel says: Fix some deadlocks in iavf, and make it less error prone for the future. Patch 1 is simple and independent from the rest. Patches 2, 3, 4 are strictly a refactor, but it enables the last patch to be much smaller. (Technically Jake given his RB tags not knowing I will send it to -net). Patch 5 just adds annotations, this also helps prove last patch to be correct. Patch 6 removes the crit lock, with its unusual try_lock()s. I have more refactoring for scheduling done for -next, to be sent soon. There is a simple test: add VF; decrease number of queueus; remove VF that was way too hard to pass without this series :) * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: iavf: get rid of the crit lock iavf: sprinkle netdev_assert_locked() annotations iavf: extract iavf_watchdog_step() out of iavf_watchdog_task() iavf: simplify watchdog_task in terms of adminq task scheduling iavf: centralize watchdog requeueing itself iavf: iavf_suspend(): take RTNL before netdev_lock() ==================== Link: https://patch.msgid.link/20250603171710.2336151-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
5 days	wireguard: device: enable threaded NAPI	Mirco Barone	1	-0/+1
	Enable threaded NAPI by default for WireGuard devices in response to low performance behavior that we observed when multiple tunnels (and thus multiple wg devices) are deployed on a single host. This affects any kind of multi-tunnel deployment, regardless of whether the tunnels share the same endpoints or not (i.e., a VPN concentrator type of gateway would also be affected). The problem is caused by the fact that, in case of a traffic surge that involves multiple tunnels at the same time, the polling of the NAPI instance of all these wg devices tends to converge onto the same core, causing underutilization of the CPU and bottlenecking performance. This happens because NAPI polling is hosted by default in softirq context, but the WireGuard driver only raises this softirq after the rx peer queue has been drained, which doesn't happen during high traffic. In this case, the softirq already active on a core is reused instead of raising a new one. As a result, once two or more tunnel softirqs have been scheduled on the same core, they remain pinned there until the surge ends. In our experiments, this almost always leads to all tunnel NAPIs being handled on a single core shortly after a surge begins, limiting scalability to less than 3× the performance of a single tunnel, despite plenty of unused CPU cores being available. The proposed mitigation is to enable threaded NAPI for all WireGuard devices. This moves the NAPI polling context to a dedicated per-device kernel thread, allowing the scheduler to balance the load across all available cores. On our 32-core gateways, enabling threaded NAPI yields a ~4× performance improvement with 16 tunnels, increasing throughput from ~13 Gbps to ~48 Gbps. Meanwhile, CPU usage on the receiver (which is the bottleneck) jumps from 20% to 100%. We have found no performance regressions in any scenario we tested. Single-tunnel throughput remains unchanged. More details are available in our Netdev paper. Link: https://netdevconf.info/0x18/docs/netdev-0x18-paper23-talk-paper.pdf Signed-off-by: Mirco Barone <mirco.barone@polito.it> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Link: https://patch.msgid.link/20250605120616.2808744-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>