| Age | Commit message | Author | Files | Lines |
|
https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux.git
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
# Conflicts:
# drivers/gpib/cb7210/cb7210.c
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git
|
|
# Conflicts:
# arch/x86/include/asm/tdx.h
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
# Conflicts:
# drivers/cpufreq/Kconfig.x86
# drivers/cpufreq/Makefile
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/modules/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev.git
|
|
https://gitlab.freedesktop.org/drm/misc/kernel.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git
|
|
# Conflicts:
# fs/btrfs/defrag.c
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/ti/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/tenstorrent/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/atorgue/stm32.git
|
|
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mediatek/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/frank.li/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/cix.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bmc/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux.git
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/amlogic/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/nathan/linux.git
|
|
for-next
|
|
# Conflicts:
# fs/fuse/dev.c
|
|
Enable audio on the RZ/G3L SMARC EVK by linking SSI0 with the DA7212
audio CODEC. The SSI0 signals are multiplexed with SD2 and are selected
by switch SW_SD2_EN#. Add regulator nodes regulator-{1p8v,3p3v} to the
SoM DTSI for reuse by eMMC.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260528074615.91110-3-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
The RZ/G3L SMARC SoM has a Versa 5P35023B clock generator to generate
the following clocks:
- ref: Not connected,
- se1: AUDIO_MCK (11.2896 or 12.2880 MHz),
- se2: RZ_AUDIO_CLK_B (11.2896 MHz),
- se3: RZ_AUDIO_CLK_C (12.2880 MHz),
- diff{1,1B}: ET{0,1}_PHY_CLK (25 MHz),
- diff{2,2B}: Not connected.
Enable the Versa 5P35023B clock generator on the RZ/G3L SoM DTSI.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260528074615.91110-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Enable I2C{2,3} on the RZ/G3L SMARC EVK board. I2C3 is enabled by
setting SW SYS.2 to the OFF position.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://patch.msgid.link/20260528070239.33352-3-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
The RZ/G3L SMARC EVK has three user buttons called USER_SW1, USER_SW2 and
USER_SW3. Instantiate the gpio-keys driver for these buttons by
removing the placeholders and assigning the proper pins to the buttons.
USER_SW{1,2,3} are configured as wakeup-sources, so they can wake up the
system during s2idle.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260528070239.33352-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Enable the xSPI0 and xSPI1 controllers on the RZ/T2H N2H EVK board.
Configure the xSPI0 controller interface to 1-bit (x1) mode, even though
the connected MX25LW51245 octal flash device supports octal mode. Add a
corresponding inline hardware comment detailing this restriction;
operating in octal mode causes the BootROM to fail loading the
first-stage bootloader following a Watchdog Timer (WDT) reset.
Configure the xSPI1 controller interface connected to the AT25SF128A
flash device for 4-bit (x4) mode to utilize all available data lines.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260527202430.606341-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add DMA properties to the serial nodes on the RZ/G2L SoC.
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260520132315.944117-1-claudiu.beznea@kernel.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add rsci{0..3} device nodes to the RZ/G3L ("R9A08G046") SoC DTSI.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260519100022.116318-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Fully describe all available DRAM in the DT, and describe regions which
are not accessible because they are used by firmware in reserved-memory
nodes.
Replace the first memory bank memory@60600000 with memory@40000000 and a
518 MiB long reserved-memory no-map subnode. This memory region is used
by other cores in the system.
Reserve 32 KiB of memory at 0x8c100000 for parameters shared by IPL,
SCP, TFA BL31 and TEE.
Reserve 512 KiB of memory at 0x8c200000 for TFA BL31. The upcoming
upstream TFA 2.15 BL31 uses memory from 0x8c200000..0x8c242fff; rounding
up to 512 KiB is slight future-proofing.
Reserve 32 MiB of memory at 0x8c400000 for OPTEE-OS, which is the entire
OPTEE-OS TZ protected DRAM area.
Neither TFA BL31 nor OPTEE-OS adds reserved-memory {} nodes to the DT
passed to Linux to keep the next stage from using the memory areas they
occupy. Linux is therefore free to use all of the available DRAM as it
is described in the DT that was passed in by U-Boot, including the areas
that are newly utilized by TFA BL31 or OPTEE-OS.
In case of high DRAM utilization, unless the memory used by TFA BL31 or
OPTEE-OS is properly reserved, Linux may use and corrupt the memory used
by TFA BL31 or OPTEE-OS, which would lead to the system becoming
unresponsive.
Fixes: ad142a4ef710 ("arm64: dts: renesas: r8a78000: Add initial Ironhide board support")
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260517163212.18016-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
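The sizes and addresses quoted in this commit message can be cross-checked with a little arithmetic. The following Python sketch is only an illustration: the region names, the `RESERVED` table and the helper functions are hypothetical and not taken from the DT, and rounding up to the next power of two is an assumption about how 512 KiB was chosen.

```python
# Hypothetical model of the reserved regions named in the commit message.
# Region names and helpers are illustrative only; they are not DT node names.
KIB = 1024
MIB = 1024 * KIB

# (name, base address, size) as stated in the commit message.
RESERVED = [
    ("other-cores", 0x40000000, 518 * MIB),
    ("shared-params", 0x8C100000, 32 * KIB),
    ("tfa-bl31", 0x8C200000, 512 * KIB),
    ("optee-os", 0x8C400000, 32 * MIB),
]


def region_end(base, size):
    """Return the first address past the end of a region."""
    return base + size


def overlapping_pairs(regions):
    """Return names of neighbouring regions (sorted by base) that overlap."""
    ordered = sorted(regions, key=lambda r: r[1])
    return [
        (a[0], b[0])
        for a, b in zip(ordered, ordered[1:])
        if region_end(a[1], a[2]) > b[1]
    ]


def round_up_pow2(n):
    """Round n up to the next power of two (assumed rounding policy)."""
    return 1 << (n - 1).bit_length()


# Bytes used by TFA BL31 according to the quoted range 0x8c200000..0x8c242fff.
bl31_used = 0x8C242FFF - 0x8C200000 + 1
```

Under these assumptions, the 518 MiB no-map region ends exactly at the old memory@60600000 base, the BL31 range covers 268 KiB, and none of the four regions overlap.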
|
|
Move all differences into panel-aa104xd12.dtsi, rename OF_GRAPH links to
generic lvds_panel_out and lvds_panel_in names, and parametrize the LVDS
output in use with the RENESAS_LVDS_OUTPUT macro. No functional change.
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260504143751.42753-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add vspd{0,1} nodes to the RZ/G3E SoC DTSI.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://patch.msgid.link/46547aaff3cdb8ea6e17cf1fdec699d83a1cd71b.1775636898.git.tommaso.merciai.xr@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add fcpvd{0,1} nodes to the RZ/G3E SoC DTSI.
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Tommaso Merciai <tommaso.merciai.xr@bp.renesas.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Link: https://patch.msgid.link/1ba6a98ace4ad9525d054cbaa308d3aeeecfa22a.1775636898.git.tommaso.merciai.xr@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
|
|
Add device tree overlay for the Radxa Camera 4K (featuring the Sony IMX415
image sensor) to be applied on the Radxa ROCK 5B+ CAM1 port.
Signed-off-by: Michael Riesch <michael.riesch@collabora.com>
Link: https://patch.msgid.link/20260522-rk3588-vicap-v5-7-d1d1f5265c56@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|
|
Add device tree overlay for the Radxa Camera 4K (featuring the Sony IMX415
image sensor) to be applied on the Radxa ROCK 5B+ CAM0 port.
Signed-off-by: Michael Riesch <michael.riesch@collabora.com>
Link: https://patch.msgid.link/20260522-rk3588-vicap-v5-6-d1d1f5265c56@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|
|
Add the device tree node for the RK3588 Video Capture (VICAP) unit.
Signed-off-by: Michael Riesch <michael.riesch@collabora.com>
[converted reg values in vicap ports to hexadecimal, to have them align
with the port@X values, and be less confusing]
Link: https://patch.msgid.link/20260522-rk3588-vicap-v5-5-d1d1f5265c56@collabora.com
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|
|
ZCR_EL2 can be updated by a VHE guest hypervisor either using ZCR_EL2
(which traps) or ZCR_EL1 (which does not trap). KVM handles the two in
different ways:
- on ZCR_EL2 trap, ZCR_EL2.LEN is immediately capped at the VM's own
VL limit. This has the potential to break existing SW that relies
on the full LEN field being stateful.
- on ZCR_EL1 access, we do absolutely nothing.
On restoring the SVE context for an L2 guest, we directly restore the
guest hypervisor's view of ZCR_EL2 into the physical ZCR_EL2. If the
guest's view of the register was updated using the ZCR_EL2 accessor,
the value has already been sanitised (with the caveat mentioned above).
But if the guest used ZCR_EL1, the raw value is written into the HW,
and the L2 guest can now access VLs that it shouldn't.
Fix all the above by moving the VL capping to the restore points,
ensuring that:
- the HW is always programmed with a capped value, irrespective of
the accessor being used,
- the ZCR_EL2.LEN field is always completely stateful, irrespective
of the accessor being used.
Additionally, move ZCR_EL2 to be a sanitised register, ensuring that
only the LEN field is actually stateful. This requires some creative
construction of the RES0 mask, as the sysreg generation script does
not yet generate RAZ/WI fields.
Fixes: b3d29a823099 ("KVM: arm64: nv: Handle ZCR_EL2 traps")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260529-kvm-arm64-fix-zcr-len-nv-v2-1-86cad51992bd@kernel.org
[maz: rewrote commit message, tidy up access_zcr_el2()]
Signed-off-by: Marc Zyngier <maz@kernel.org>
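The idea of capping at the restore point rather than at write time can be modelled in a few lines. This Python sketch is not KVM code: the `GuestZcrEl2` class and its methods are hypothetical names for illustration. It relies only on two architectural facts, that ZCR_ELx.LEN occupies bits [3:0] and that the requested vector length is (LEN + 1) * 128 bits.

```python
# Toy model of "store the full LEN, cap only when programming the HW".
# GuestZcrEl2 and its methods are hypothetical; this is not the KVM code.
ZCR_LEN_MASK = 0xF  # ZCR_ELx.LEN occupies bits [3:0]


def vl_bits(len_field):
    """Vector length in bits requested by a given LEN value."""
    return (len_field + 1) * 128


class GuestZcrEl2:
    def __init__(self, vm_max_len):
        # Largest LEN value the VM is allowed to use.
        self.vm_max_len = vm_max_len
        self.value = 0

    def write(self, raw):
        # Same path regardless of accessor: only LEN is stateful,
        # every other bit is dropped (treated as RES0 in this model).
        self.value = raw & ZCR_LEN_MASK

    def read(self):
        # The guest hypervisor reads back exactly the LEN it wrote.
        return self.value

    def hw_value(self):
        # Capping happens here, at the restore point, so the value
        # programmed into the hardware never exceeds the VM's limit.
        return min(self.value, self.vm_max_len)
```

In this model a guest limited to LEN = 1 can write 0xF and read 0xF back, while the value programmed into the hardware stays at 1, i.e. a 256-bit vector length.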
|
|
# New commits in x86/tdx:
6712564c884d ("x86/virt/tdx: Enable TDX module runtime updates")
73be1bb72f4c ("x86/virt/tdx: Refresh TDX module version after update")
bd0ba697612a ("coco/tdx-host: Lock out module updates when reading version")
eb71a4c94061 ("x86/virt/seamldr: Add module update locking")
069be08012cf ("x86/virt/tdx: Restore TDX module state")
f74245e39c21 ("x86/virt/seamldr: Initialize the newly-installed TDX module")
d909333bf655 ("x86/virt/seamldr: Install a new TDX module")
522bacc2fbac ("x86/virt/tdx: Reset software states during TDX module shutdown")
146ac22b2b96 ("x86/virt/seamldr: Shut down the current TDX module")
c507e80de947 ("x86/virt/seamldr: Abort updates after a failed step")
e16ce07a9053 ("x86/virt/seamldr: Introduce skeleton for TDX module updates")
35621312a061 ("x86/virt/seamldr: Allocate and populate a module update request")
000c293c24bc ("coco/tdx-host: Implement firmware upload sysfs ABI for TDX module updates")
56b46fe202f8 ("coco/tdx-host: Don't expose P-SEAMLDR information on CPUs with erratum")
b094b1684fef ("coco/tdx-host: Expose P-SEAMLDR information via sysfs")
fcbc30f0d66f ("x86/virt/seamldr: Add a helper to retrieve P-SEAMLDR information")
b434b916fed3 ("x86/virt/seamldr: Introduce a wrapper for P-SEAMLDR SEAMCALLs")
e4afd39aefd8 ("coco/tdx-host: Expose TDX module version")
c6a2ea2cfa6a ("coco/tdx-host: Introduce a "tdx_host" device")
0a7808c1b5ff ("x86/virt/tdx: Move low level SEAMCALL helpers out of <asm/tdx.h>")
2818e8c8a46d ("x86/virt/tdx: Move TDX_FEATURES0 bits to asm/tdx.h")
332d5758bbad ("x86/virt/tdx: Consolidate TDX global initialization states")
2f410fa074fb ("x86/virt/tdx: Move TDX global initialization states to file scope")
394d7f52d844 ("x86/virt/tdx: Clarify try_init_module_global() result caching")
5209e5bfe5ca ("x86/virt/tdx: Remove kexec docs")
5b25f249be32 ("x86/tdx: Disable the TDX module during kexec and kdump")
b7d2173946ef ("x86/virt/tdx: Add SEAMCALL wrapper for TDH.SYS.DISABLE")
597bdf6e068e ("x86/virt/tdx: Pull kexec cache flush logic into arch/x86")
53642715861e ("x86/tdx: Move TDX architectural error codes into <asm/shared/tdx_errno.h>")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in x86/sev:
9d8460a1c7a6 ("x86/sev: Remove redundant ghcbs_initialized checks around __sev_{get,put}_ghcb()")
e9d29f4a9183 ("crypto/ccp: Skip SNP_INIT if preparation fails")
39f1de2fffb3 ("x86/sev: Do not initialize SNP if missing CPUs")
52705e72e265 ("x86/entry: Zap the #VC entry user and kernel macros")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in x86/mm:
39406c05f8f1 ("x86/mm: Fix freeing of PMD-sized vmemmap pages")
952ac097ce98 ("x86: Update comment about pgd_list")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in x86/misc:
1d2cf6d5b599 ("MAINTAINERS: Move Rick Edgecombe to TDX maintainer")
c256d2a8adf2 ("x86: Remove unnecessary architecture-specific <asm/device.h>")
23aea3c539a6 ("x86/bug: Put HAVE_ARCH_BUG_FORMAT_ARGS WARN definitions inside __ASSEMBLER__")
40c4b47f41b9 ("x86/bug: Add printf() validation to HAVE_ARCH_BUG_FORMAT_ARGS WARNs")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in x86/cpu:
87a451161f36 ("x86/cpu: Fix a F00F bug warning and clean up surrounding code")
dedcf8e10441 ("x86/cpu: Add Intel CPU model number for rugged Panther Lake")
fa6dcbc69ad4 ("x86/cpuid: Introduce a centralized CPUID parser")
3aa8f9fce860 ("x86/cpu: Introduce a centralized CPUID data model")
202311a754d4 ("x86/cpuid: Introduce <asm/cpuid/leaf_types.h>")
5fbe09ebb4dc ("x86/cpuid: Rename cpuid_leaf()/cpuid_subleaf() APIs")
55cbcb6731bb ("x86/cpu: Do not include the CPUID API header in asm/processor.h")
21ff606db9c5 ("Documentation: core-api/cpu_hotplug: Remove stale cpu0_hotplug docs")
435ef16e69b9 ("x86/cpu, cpufreq: Remove AMD ELAN support")
823caa173884 ("x86/fpu: Remove the math-emu/ FPU emulation library")
d8b55ce0c995 ("x86/fpu: Remove the 'no387' boot option")
ab05214025ee ("x86/fpu: Remove MATH_EMULATION and related glue code")
7b49a3fb69e7 ("treewide: Explicitly include the x86 CPUID headers")
2ed46bccac39 ("x86/cpu: Remove the CONFIG_X86_INVD_BUG quirk")
db1931e39ba1 ("x86/cpu, x86/platform, watchdog: Remove CONFIG_X86_RDC321X support")
dbafa16ec2b6 ("x86/cpu: Remove TSC-less CONFIG_M586 support")
7d328c5de43a ("x86/cpu: Remove CPU_SUP_UMC_32 support")
aaa3c14d1134 ("x86/cpu: Remove CONFIG_MWINCHIP3D/MWINCHIPC6")
4af2468b82bd ("x86: Mark AMD Geode support as orphaned")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in x86/cleanups:
648fb97ee908 ("x86/tlb: Convert copy_from_user() + kstrtouint() to kstrtouint_from_user()")
f9b55e47ac60 ("x86/purgatory: Fix #endif comment")
0c37d7aca413 ("x86/boot: Get rid of kstrtoull()")
7b894dac26e5 ("x86/boot/compressed: Use boot_kstrtoul() for hugepages= parsing")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in x86/cache:
1cfa74c683ea ("fs/resctrl: Document tasks file behaviour for task id 0 and idle tasks")
9a1646211f8c ("fs/resctrl: Document that automatic counter assignment is best effort")
3aec86e4ea01 ("fs/resctrl: Continue counter allocation after failure")
ee3d4c81d89c ("fs/resctrl: Add monitor property 'mbm_cntr_assign_fixed'")
f52abe650241 ("fs/resctrl: Disallow the software controller when MBM counters are assignable")
94a1206522d1 ("x86,fs/resctrl: Create 'event_filter' files read only if they're not configurable")
7625632fed43 ("fs/resctrl: Tidy up the error path in resctrl_mkdir_event_configs()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in sched/core:
5ad278dd20bd ("sched: Remove sched_class::pick_next_task()")
b3a2dfa8b42e ("sched/fair: Add newidle balance to pick_task_fair()")
e05777c44e53 ("sched/debug: Collapse subsequent CONFIG_SCHED_CLASS_EXT sections")
775570022345 ("sched: Use {READ,WRITE}_ONCE() for preempt_dynamic_mode")
333f6f0e11ac ("sched/debug: Use char * instead of char (*)[]")
25139c11693a ("sched/fair: Fix RCU usage in NOHZ exit path on CPU offline")
9e005ed21152 ("sched/topology: Allow multiple domains to claim sched_domain_shared")
dd29c017aed6 ("sched/rt: Have RT_PUSH_IPI be default off for non PREEMPT_RT")
04f80f8b12a0 ("sched: Switch rq->next_class on proxy_resched_idle()")
61ea17a63719 ("sched/fair: Add SIS_UTIL support to select_idle_capacity()")
bf6aa722198d ("sched/fair: Reject misfit pulls onto busy SMT siblings on asym-capacity")
25a32e400a14 ("sched/fair: Prefer fully-idle SMT cores in asym-capacity idle selection")
fdfe5a8cd873 ("sched/fair: Attach sched_domain_shared to sd_asym_cpucapacity")
c9d93a73ce87 ("sched/fair: Drop redundant RCU read lock in NOHZ kick path")
acbdbab75ff4 ("sched: Unify SMT active check via sched_smt_active()")
3dbb362f90f3 ("sched/fair: Add sched_smt_active check for fastpaths")
5bc6ab2d42e5 ("sched: Simplify ifdeffery around cpu_smt_mask")
815c5cb76a3e ("topology: Introduce cpu_smt_mask for CONFIG_SCHED_SMT=n")
6d2051403d6c ("sched/fair: Update util_est after updating util_avg during dequeue")
ea19506013ad ("sched/clock: Provide !HAVE_UNSTABLE_SCHED_CLOCK stub for sched_clock_stable()")
95f44886afec ("sched/cputime: Drop now-stale mul_u64_u64_div_u64() over-approximation guard")
eecd5e117cfa ("sched/deadline: Fix replenishment logic for non-deferred servers")
c2e390197ad1 ("sched/rt: Update default bandwidth for real-time tasks to ONE")
c99b8593b060 ("sched/cache: Fix stale preferred_llc for a new task")
a7660ce1590f ("sched/cache: Fix has_multi_llcs iff at least one partition has multiple LLCs")
5beff4f08727 ("sched/cache: Fix cache aware scheduling enabling for multi LLCs system")
9f7c745850b4 ("sched/cache: Fix race condition during sched domain rebuild")
d6b9afab44e2 ("sched/cache: Fix checking active load balance by only considering the CFS task")
03755348b8e7 ("sched/cache: Fix unpaired account_llc_enqueue/dequeue")
91d07324c930 ("sched/cache: Annotate lockless accesses to mm->sc_stat.cpu")
9f23469401b0 ("sched/cache: Fix potential NULL mm pointer access")
d943b86dfbf4 ("sched/cache: Fix rcu warning when accessing sd_llc domain")
c1e7fe5e75ed ("sched/cache: Add user control to adjust the aggressiveness of cache-aware scheduling")
808915f982c2 ("sched/cache: Avoid cache-aware scheduling for memory-heavy processes")
7030513a0877 ("sched/cache: Calculate the LLC size and store it in sched_domain")
7b34bb1ca324 ("sched/cache: Skip cache-aware scheduling for single-threaded processes")
deee5e27d5b6 ("sched/cache: Disable cache aware scheduling for processes with high thread counts")
a2b4cf39d9d3 ("sched/cache: Allow only 1 thread of the process to calculate the LLC occupancy")
4ac4d6549a65 ("sched: Use trace_call__<tp>() to save a static branch")
067a31358143 ("sched/cache: Allow the user space to turn on and off cache aware scheduling")
d59f4fd1d303 ("sched/cache: Enable cache aware scheduling for multi LLCs NUMA node")
5b1d5e6db20a ("sched/cache: Respect LLC preference in task migration and detach")
714059f79ff0 ("sched/cache: Handle moving single tasks to/from their preferred LLC")
e4c9a4cb244a ("sched/cache: Add migrate_llc_task migration type for cache-aware balancing")
f38cc2f0d8a3 ("sched/cache: Prioritize tasks preferring destination LLC during balancing")
9a5e22fbb0c8 ("sched/cache: Check local_group only once in update_sg_lb_stats()")
15ad45fb80ca ("sched/cache: Count tasks prefering destination LLC in a sched group")
82c960aee304 ("sched/cache: Calculate the percpu sd task LLC preference")
a8d0ca0b7f2f ("sched/cache: Introduce per CPU's tasks LLC preference counter")
46afe3af7ead ("sched/cache: Track LLC-preferred tasks per runqueue")
47d8696b95f7 ("sched/cache: Assign preferred LLC ID to processes")
b5ea300a17e3 ("sched/cache: Make LLC id continuous")
23b2b5ccc45c ("sched/cache: Introduce helper functions to enforce LLC migration policy")
f025ef275388 ("sched/cache: Record per LLC utilization to guide cache aware scheduling decisions")
b4606faab318 ("sched/cache: Limit the scan number of CPUs when calculating task occupancy")
df0d98475954 ("sched/cache: Introduce infrastructure for cache-aware load balancing")
abb12b9b52cf ("x86/topology: Add paramter to split LLC")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in perf/core:
66cc29745f2f ("perf/x86/intel: Update event constraints and cache_extra_regsfor CWF")
7ae5f58517a6 ("perf/x86/intel: Update event constraints and cache_extra_regsfor SRF")
42f47511d979 ("perf/x86/intel: Update event constraints and cache_extra_regsfor NVL")
65fd435095bb ("perf/x86/intel: Update event constraints for PTL")
0073ed169226 ("perf/x86/intel: Update event constraints and cache_extra_regsfor ARL")
331c3e4fa39a ("perf/x86/intel: Update event constraints and cache_extra_regsfor LNL")
e99fb45436ea ("perf/x86/intel: Update event constraints and cache_extra_regsfor MTL")
4ef863352bcd ("perf/x86/intel: Update event constraints and cache_extra_regsfor ADL")
070bd45e1dba ("perf/x86/intel: Update event constraints for DMR")
30d82ddee085 ("perf/x86/intel: Update event constraints and cache_extra_regsfor SPR")
acc41cdcb091 ("perf/x86/intel: Update event constraints and cache_extra_regsfor ICX")
5c3cdc74af25 ("perf/x86/intel: Consolidate MSR_IA32_PERF_CFG_C tracking")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in objtool/core:
2d3bb398861a ("objtool/klp: Cache dont_correlate() result")
fe6a87e0abac ("objtool: Improve and simplify prefix symbol detection")
f7ceffd21a8a ("objtool/klp: Fix kCFI prefix finding/cloning")
fc0bb9915bce ("objtool: Grow __cfi_* prefix symbols for all CFI+CALL_PADDING")
cca84cb12908 ("objtool/klp: Fix position-dependent checksums for non-relocated jumps/calls")
3ee67629b2b7 ("objtool: Add insn_sym() helper")
5d6a03eeb717 ("objtool/klp: Add correlation debugging output")
6016dd33a10a ("objtool/klp: Rewrite symbol correlation algorithm")
873a2208ea31 ("objtool/klp: Calculate object checksums")
225d16dd510d ("klp-build: Validate short-circuit prerequisites")
3b8e56b86faa ("objtool/klp: Remove "objtool --checksum"")
d4888d58041d ("klp-build: Use "objtool klp checksum" subcommand")
e10764614ad6 ("objtool/klp: Add "objtool klp checksum" subcommand")
a5b661233262 ("objtool: Consolidate file decoding into decode_file()")
30cae58cdc13 ("objtool/klp: Extricate checksum calculation from validate_branch()")
6282e9f46b4f ("objtool: Add is_cold_func() helper")
8eebd5731133 ("objtool: Add is_alias_sym() helper")
ff0cf5efef40 ("objtool/klp: Handle Clang .data..Lanon anonymous data sections")
9e4512d7de5a ("objtool/klp: Create empty checksum sections for function-less object files")
ac999926774a ("objtool: Include libsubcmd headers directly from source tree")
8d4cbb6d0caf ("objtool/klp: Don't set sym->file for section symbols")
b6480aaedf3c ("klp-build: Remove redundant SRC and OBJ variables")
e950d2a10a30 ("klp-build: Print "objtool klp diff" command in verbose mode")
df0d7bb04a27 ("klp-build: Reject patches to realmode")
d8c3e262361b ("klp-build: Reject patches to vDSO")
f3048888ea62 ("klp-build: Fix patch cleanup on interrupt")
96524543740e ("klp-build: Suppress excessive fuzz output by default")
b3ece3019e8e ("klp-build: Validate patch file existence")
946d3510fe19 ("klp-build: Don't use errexit")
ba77fe55781a ("klp-build: Fix checksum comparison for changed offsets")
cc39ccce7d5b ("klp-build: Fix hang on out-of-date .config")
a375e327b63e ("objtool: Fix reloc hash collision in find_reloc_by_dest_range()")
5f49ec82b9f6 ("objtool/klp: Fix reloc corruption in convert_reloc_sym_to_secsym()")
51e1dfce24c8 ("objtool/klp: Don't correlate .rodata.cst* constant pool objects")
d5b0f025281f ("objtool/klp: Fix pointer comparisons for rodata objects")
8fdc3585b3b0 ("objtool/klp: Simplify reloc symbol conversion")
3e01ab44af20 ("objtool: Move mark_rodata() to elf.c")
3787e82a4e3a ("objtool/klp: Fix relocation conversion failures for R_X86_64_NONE")
da4326573ae8 ("objtool/klp: Fix kCFI trap handling")
62a7a01fde87 ("objtool/klp: Fix extraction of text annotations for alternatives")
479ac5260e7e ("objtool/klp: Fix XXH3 state memory leak")
98377f3ba7c0 ("objtool/klp: Fix cloning of zero-length section symbols")
c4c02d4450b5 ("objtool/klp: Fix handling of zero-length .altinstr_replacement sections")
def5b60dcd22 ("objtool/klp: Fix --debug-checksum for duplicate symbol names")
0333b7399587 ("objtool: Replace iterator callback with for_each_sym_by_mangled_name()")
3de711fba73a ("objtool/klp: Fix create_fake_symbols() skipping entsize-based sections")
e872b3f13922 ("objtool/klp: Improve local label check")
76eb0f8639fb ("objtool/klp: Don't report uncorrelated functions as new")
0a7823d1d70d ("objtool/klp: Don't correlate __initstub__ symbols")
710c4c254688 ("objtool/klp: Don't correlate absolute symbols")
8edec016255d ("objtool/klp: Don't correlate __ADDRESSABLE() symbols")
ff529864e738 ("objtool/klp: Fix .data..once static local non-correlation")
84c304a534b8 ("objtool/klp: Fix is_uncorrelated_static_local() for Clang")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in irq/drivers:
e61654fbc3bc ("irqchip/gic-v4: Don't advertise VLPIs if no ITS is probed")
5fd6f2154734 ("irqchip/gic-v3-its: Use FIELD_MODIFY()")
2ee2a685ee83 ("irqchip/econet-en751221: Support MIPS 34Kc VEIC mode")
02bea6ff684b ("dt-bindings: interrupt-controller: econet: Add CPU interrupt mapping")
5b9cb104594f ("irqchip/meson-gpio: Add support for Amlogic A9 SoCs")
f51c99a0e502 ("dt-bindings: interrupt-controller: Add support for Amlogic A9 SoCs")
e8d3dcdf9f57 ("irqchip/meson-gpio: Use the correct register in meson_s4_gpio_irq_set_type()")
8b9db6739610 ("irqchip/starfive: Fix error check for devm_platform_ioremap_resource()")
76841b0ea8be ("irqchip/qcom: Unify user-visible "Qualcomm" name")
5a59e82f95d3 ("irqchip/gic: Replace __ASSEMBLY__ with __ASSEMBLER__")
96c0c9b48850 ("irqchip/starfive: Implement irq_set_type() and irq_ack() callbacks")
5d1b12880fd8 ("irqchip/starfive: Increase the interrupt source number up to 64")
2f59ca185497 ("irqchip/starfive: Use devm_ interfaces to simplify resource release")
ac2005bba8d9 ("irqchip/starfive: Rename jh8100 to jhb100")
a540d544db1c ("dt-bindings: interrupt-controller: Repurpose binding for unreleased jh8100 for jhb100")
d3587cc4a5e6 ("irqchip/aspeed-intc: Remove AST2700-A0 support")
46e39ee92d14 ("irqchip/ast2700-intc: Add KUnit tests for route resolution")
07825e41519a ("irqchip/ast2700-intc: Add AST2700-A2 support")
51561ad8c89c ("dt-bindings: interrupt-controller: Describe AST2700-A2 hardware instead of A0")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in irq/core:
171cc0d9eed1 ("genirq/proc: Speed up /proc/interrupts iteration")
61b51a167c52 ("genirq/proc: Runtime size the chip name")
7603e0575d8a ("genirq: Expose irq_find_desc_at_or_after() in core code")
1d9c4745bfb6 ("genirq: Add rcuref count to struct irq_desc")
34594da7650d ("genirq/proc: Increase default interrupt number precision to four")
2d62735f1d4a ("genirq: Calculate precision only when required")
4892e5e71ec9 ("genirq: Cache the condition for /proc/interrupts exposure")
3ba92f6a2820 ("genirq/manage: Make NMI cleanup RT safe")
b99dc723b12e ("genirq: Expose nr_irqs in core code")
cca5e6fa791b ("scripts/gdb: Update x86 interrupts to the array based storage")
d6b70b16b4e7 ("x86/irq: Move IOAPIC misrouted and PIC/APIC error counts into irq_stats")
8713f2e596a1 ("x86/irq: Suppress unlikely interrupt stats by default")
2b57c69917ee ("x86/irq: Make irqstats array based")
0179464391af ("genirq/proc: Utilize irq_desc::tot_count to avoid evaluation")
95c33a64f203 ("genirq/proc: Avoid formatting zero counts in /proc/interrupts")
115bbf0c1b60 ("x86/irq: Optimize interrupts decimals printing")
c2c7983c93f5 ("genirq/proc: Size interrupt directory names for 10-digit interrupt numbers")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in x86/merge:
1458ade7469d ("x86/microcode: Fix comment in microcode_loader_disabled()")
00e05495c572 ("scripts/x86/intel: Add a script to update the old microcode list")
515c6b216021 ("x86/microcode/intel: Refresh old_microcode defines with Nov 2025 release")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The aes driver registers both skcipher and aead algorithms,
but when aead is not enabled this causes a link failure:
s390-linux-ld: arch/s390/crypto/aes_s390.o: in function `aes_s390_fini':
arch/s390/crypto/aes_s390.c:969:(.text+0x115e): undefined reference to `crypto_unregister_aead'
s390-linux-ld: arch/s390/crypto/aes_s390.o: in function `aes_s390_init':
arch/s390/crypto/aes_s390.c:1028:(.init.text+0x294): undefined reference to `crypto_register_aead'
Add the missing 'select' statement.
Fixes: bf7fa038707c ("s390/crypto: add s390 platform specific aes gcm support.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
sparc64 defconfig told me
WARNING: modpost: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned.
Is "_mcount" prototyped in <asm/asm-prototypes.h>?
so I added it.
BTW, altering arch/sparc/include/asm/asm-prototypes.h then running `make'
doesn't compile anything, so there's a missing dependency somewhere?
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andreas Larsson <andreas@gaisler.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
|
|
Rename the memory block lookup helper to make the acquired reference
explicit, add memory_block_put() to wrap put_device(), remove
find_memory_block(), and use memory_block_get() as the single block-id
based lookup interface.
This makes it clearer to callers that a successful lookup holds a
reference that must be dropped, reducing the chance of forgetting the
matching put and leaking the memory block device reference.
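The get/put pairing described above can be sketched in userspace as follows. This is only a model: the function names come from the commit message, but the bodies, the plain integer refcount, the fixed-size table and the block_is_present() caller are assumptions made for illustration, not the kernel implementation.

```c
#include <stddef.h>

/* Userspace model only; not the kernel's struct memory_block. */
struct memory_block {
	unsigned long id;
	int refcount;	/* stands in for the embedded device reference count */
};

#define NR_BLOCKS 4
static struct memory_block blocks[NR_BLOCKS] = {
	{ .id = 0 }, { .id = 1 }, { .id = 2 }, { .id = 3 },
};

/* Look up a block by id; a non-NULL result holds a reference. */
static struct memory_block *memory_block_get(unsigned long block_id)
{
	if (block_id >= NR_BLOCKS)
		return NULL;
	blocks[block_id].refcount++;
	return &blocks[block_id];
}

/* Drop the reference taken by memory_block_get(). */
static void memory_block_put(struct memory_block *mem)
{
	if (mem)
		mem->refcount--;
}

/* Hypothetical caller showing that every successful get has a matching put. */
static int block_is_present(unsigned long block_id)
{
	struct memory_block *mem = memory_block_get(block_id);

	if (!mem)
		return 0;
	/* ... use mem while the reference is held ... */
	memory_block_put(mem);
	return 1;
}
```

Naming the lookup "get" puts the obligation in the call site itself: a reader who sees memory_block_get() without a memory_block_put() on every exit path can spot the leak without consulting the helper's documentation.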
Link: https://lore.kernel.org/linux-mm/7887915D-E598-42B3-9AFE-BFFBACE8DE2D@linux.dev/#t
Link: https://lore.kernel.org/20260512072635.3969576-1-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> #s390
Cc: Richard Cheng <icheng@nvidia.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
register_page_bootmem_info_node() essentially only calls
register_page_bootmem_memmap(). However, on powerpc that function is a
nop. So there is no benefit in using CONFIG_HAVE_BOOTMEM_INFO_NODE
anymore, let's just drop it.
We can stop including bootmem_info.h.
Link: https://lore.kernel.org/20260511-bootmem_info_prep-v1-8-3fb0be6fc688@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
We never select CONFIG_HAVE_BOOTMEM_INFO_NODE on s390. Therefore,
free_bootmem_page() nowadays always translates to free_reserved_page().
Let's use free_reserved_page() to replace the free_bootmem_page() loop.
We can stop including bootmem_info.h.
Likely, vmemmap freeing code could be factored out into the core in the
future.
Link: https://lore.kernel.org/20260511-bootmem_info_prep-v1-7-3fb0be6fc688@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: remove CONFIG_HAVE_BOOTMEM_INFO_NODE (Part 1)".
We want to remove CONFIG_HAVE_BOOTMEM_INFO_NODE. As a first step, let's
limit the remaining harm to x86 and core code, removing sparc, ppc and
s390 leftovers, starting the stepwise removal by removing and simplifying
some code.
Once a related x86 vmemmap fix [1] is in, we can merge part 2 that will
remove CONFIG_HAVE_BOOTMEM_INFO_NODE entirely.
Tested on x86-64 with hugetlb vmemmap optimization in combination with
KMEMLEAK, making sure that the problem reported in dd0ff4d12dd2 ("bootmem:
remove the vmemmap pages from kmemleak in put_page_bootmem") does not
reappear -- hoping I managed to trigger the original problem.
This patch (of 8):
sparc does not select CONFIG_HAVE_BOOTMEM_INFO_NODE, therefore,
register_page_bootmem_info_node() is a nop.
Let's just get rid of register_page_bootmem_info().
Link: https://lore.kernel.org/20260511-bootmem_info_prep-v1-0-3fb0be6fc688@kernel.org
Link: https://lore.kernel.org/20260511-bootmem_info_prep-v1-1-3fb0be6fc688@kernel.org
Link: https://lore.kernel.org/r/20260429-vmemmap-v2-1-8dfcacffd877@kernel.org [1]
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Lance Yang <lance.yang@linux.dev>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
|
|
Since 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in
__create_pgd_mapping()") page-table allocation on ARM64 always calls
pagetable_{pte,pmd,pud,p4d}_ctor(). This sets the page_type to
PGTY_table, increments NR_PAGETABLE and possibly allocates a PTL. However
the matching pagetable_dtor() calls were never added.
With DEBUG_VM enabled on kernel versions prior to v6.17 without
2dfcd1608f3a9 ("mm/page_alloc: let page freeing clear any set page type")
this leads to the following warning when freeing these pages due to
page->page_type sharing page->_mapcount:
BUG: Bad page state in process ... pfn:284fbb
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x284fbb
flags: 0x17fffc000000000(node=0|zone=2|lastcpupid=0x1ffff)
page_type: f2(table)
page dumped because: nonzero mapcount
Call trace:
bad_page+0x13c/0x160
__free_frozen_pages+0x6cc/0x860
___free_pages+0xf4/0x180
free_pages+0x54/0x80
free_hotplug_page_range.part.0+0x58/0x90
free_empty_tables+0x438/0x500
__remove_pgd_mapping.constprop.0+0x60/0xa8
arch_remove_memory+0x48/0x80
try_remove_memory+0x158/0x1d8
offline_and_remove_memory+0x138/0x180
It can also lead to leaking the PTL allocation if ALLOC_SPLIT_PTLOCKS is
defined, and to incorrect NR_PAGETABLE stats. Fix this by calling
pagetable_dtor() in free_hotplug_pgtable_page() prior to freeing the page
to undo the effects of calling pagetable_*_ctor().
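The constructor/destructor pairing can be modelled in userspace roughly as below. All names prefixed with model_ are invented for this sketch, the heap allocation merely stands in for a split page-table lock, and the "free" check is a simplification of the bad-page-state check quoted above.

```c
#include <stdlib.h>

/* Userspace model; field and function names are illustrative only. */
enum { PGTY_NONE = 0, PGTY_TABLE = 1 };

struct page_model {
	int page_type;
	void *ptl;		/* models a separately allocated page-table lock */
};

static long nr_pagetable;	/* models the NR_PAGETABLE counter */

/* Models pagetable_*_ctor(): set the type, count the page, allocate the lock. */
static int model_pagetable_ctor(struct page_model *pg)
{
	pg->ptl = malloc(sizeof(int));
	if (!pg->ptl)
		return -1;
	pg->page_type = PGTY_TABLE;
	nr_pagetable++;
	return 0;
}

/* Models pagetable_dtor(): undo everything the constructor did. */
static void model_pagetable_dtor(struct page_model *pg)
{
	free(pg->ptl);
	pg->ptl = NULL;
	pg->page_type = PGTY_NONE;
	nr_pagetable--;
}

/* Freeing is rejected while the page still carries a page type. */
static int model_free_page(struct page_model *pg)
{
	return pg->page_type == PGTY_NONE ? 0 : -1;
}

/* Mirrors the shape of the fix: run the destructor, then free the page. */
static int model_free_hotplug_pgtable_page(struct page_model *pg)
{
	model_pagetable_dtor(pg);
	return model_free_page(pg);
}
```

In this model, skipping the destructor leaves the type set, the counter too high and the lock allocation unreferenced, which corresponds to the three symptoms listed in the commit message.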
Link: https://lore.kernel.org/20260521032730.2104017-1-apopple@nvidia.com
Fixes: 5e8eb9aeeda3 ("arm64: mm: always call PTE/PMD ctor in __create_pgd_mapping()")
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add riscv32-specific '__ashldi3()', '__ashrdi3()', and '__lshrdi3()'.
Initially it was intended to fix the following link error observed when
building EFI-enabled kernel with CONFIG_EFI_STUB=y and
CONFIG_EFI_GENERIC_STUB=y:
riscv32-linux-gnu-ld: ./drivers/firmware/efi/libstub/lib-cmdline.stub.o: in function `__efistub_.L49':
__efistub_cmdline.c:(.init.text+0x1f2): undefined reference to `__efistub___ashldi3'
riscv32-linux-gnu-ld: __efistub_cmdline.c:(.init.text+0x202): undefined reference to `__efistub___lshrdi3'
Reported at [1] trying to build
https://patchew.org/linux/20260212164413.889625-1-dmantipov@yandex.ru,
tested with 'qemu-system-riscv32 -M virt' only.
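For readers unfamiliar with these helpers: on a 32-bit target the compiler may emit calls to them for 64-bit shifts by a variable amount. The following is a portable sketch of the usual technique (operate on the two 32-bit halves), not the code added by this patch. The sketch_ names are invented, shift counts are assumed to be in the range 0..63, and the signed variant assumes that right-shifting a negative int32_t is arithmetic, as it is with GCC and Clang.

```c
#include <stdint.h>

static uint64_t join_halves(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* 64-bit shift left using 32-bit shifts only; assumes 0 <= n < 64. */
static uint64_t sketch_ashldi3(uint64_t v, int n)
{
	uint32_t lo = (uint32_t)v;
	uint32_t hi = (uint32_t)(v >> 32);

	if (n == 0)
		return v;
	if (n >= 32) {
		hi = lo << (n - 32);
		lo = 0;
	} else {
		hi = (hi << n) | (lo >> (32 - n));
		lo <<= n;
	}
	return join_halves(hi, lo);
}

/* 64-bit logical shift right; assumes 0 <= n < 64. */
static uint64_t sketch_lshrdi3(uint64_t v, int n)
{
	uint32_t lo = (uint32_t)v;
	uint32_t hi = (uint32_t)(v >> 32);

	if (n == 0)
		return v;
	if (n >= 32) {
		lo = hi >> (n - 32);
		hi = 0;
	} else {
		lo = (lo >> n) | (hi << (32 - n));
		hi >>= n;
	}
	return join_halves(hi, lo);
}

/* 64-bit arithmetic shift right; the sign lives in the high half. */
static int64_t sketch_ashrdi3(int64_t v, int n)
{
	uint32_t lo = (uint32_t)v;
	int32_t hi = (int32_t)(uint32_t)((uint64_t)v >> 32);

	if (n == 0)
		return v;
	if (n >= 32) {
		lo = (uint32_t)(hi >> (n - 32));
		hi >>= 31;	/* all zeroes or all ones, depending on the sign */
	} else {
		lo = (lo >> n) | ((uint32_t)hi << (32 - n));
		hi >>= n;
	}
	return (int64_t)join_halves((uint32_t)hi, lo);
}
```

The n == 0 case is handled separately because the cross-half term would otherwise shift a 32-bit value by 32, which is undefined in C.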
Link: https://lore.kernel.org/20260519172259.908980-7-dmantipov@yandex.ru
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603041925.KLKqpK6N-lkp@intel.com [1]
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Charlie Jenkins <thecharlesjenkins@gmail.com>
Assisted-by: Gemini:gemini-3.1-pro-preview sashiko
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andriy Shevchenko <andriy.shevchenko@intel.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The kernel allows arches to select between inline and outline
implementations of the copy_{from,to}_user() by defining individual
INLINE_COPY_FROM_USER and INLINE_COPY_TO_USER, correspondingly. However,
all arches always enable or disable them together.
Without a real use case for one helper being inlined while the other is
outlined, having independent controls is excessive and error prone.
Switch the codebase to the single unified INLINE_COPY_USER control.
Link: https://lore.kernel.org/20260425020857.356850-3-ynorov@nvidia.com
Signed-off-by: Yury Norov <ynorov@nvidia.com>
Tested-by: Alice Ryhl <aliceryhl@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Viktor Malik <vmalik@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: remove page_mapped()".
While preparing my slides for an LSF/MM talk, I realized that I had not
yet removed page_mapped().
So let's do that. In the BPF arena code it's unclear which memdesc we
would want to allocate in the future: certainly something with a refcount,
but likely none with a mapcount. So let's just rely on the page refcount
instead to decide whether we want to try zapping the page from user page
tables.
This patch (of 3):
We already have the folio in our hands, so let's just use folio_mapped().
Link: https://lore.kernel.org/20260427-page_mapped-v1-0-e89c3592c74c@kernel.org
Link: https://lore.kernel.org/20260427-page_mapped-v1-1-e89c3592c74c@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Harry Yoo <harry@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Song Liu <song@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently, the memory hot-remove call chain -- arch_remove_memory(),
__remove_pages(), sparse_remove_section() and section_deactivate() -- does
not carry the struct dev_pagemap pointer. This prevents the lower levels
from knowing whether the section was originally populated with vmemmap
optimizations (e.g., DAX with vmemmap optimization enabled).
Without this information, we cannot call vmemmap_can_optimize() to
determine if the vmemmap pages were optimized. As a result, the vmemmap
page accounting during teardown will mistakenly assume a non-optimized
allocation, leading to incorrect memmap statistics.
To lay the groundwork for fixing the vmemmap page accounting, we need to
pass the @pgmap pointer down to the deactivation location. Plumb the
@pgmap argument through the APIs of arch_remove_memory(), __remove_pages()
and sparse_remove_section(), mirroring the corresponding *_activate()
paths.
Link: https://lore.kernel.org/20260428081855.1249045-4-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Liam R. Howlett <liam@infradead.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
On x86 SMP systems with the F00F bug present, do_clear_cpu_cap()
rightfully warns that the code clears the X86_BUG_F00F flag after
alternatives have been patched.
X86_BUG_F00F is first cleared in intel_workarounds() and then set for
the affected models. This sequence works fine on the BSP, but on AP
bringup alternatives have already been patched, so clearing the flag
there triggers the warning.
There is no technical reason for clearing the flag before setting it. It
is mainly an artifact of introducing the X86_BUG_F00F flag in
e2604b49e8a8 ("x86, cpu: Convert F00F bug detection").
Remove the unnecessary clearing of the flag.
While at it, remove the kernel notification and the surrounding logic to
inform the user about the workaround exactly once. If needed, the
presence of the F00F bug can be determined through /proc/cpuinfo.
Additionally, the F00F bug was the last remaining user of clear_cpu_bug().
With no users left, get rid of this helper as well.
[ bp: Massage commit message. ]
Co-developed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ahmed S. Darwish <darwi@linutronix.de>
Link: https://patch.msgid.link/20260528184826.3642051-1-sohil.mehta@intel.com
|
|
|
|
The ARMv8.2 based CPUs used in a number of Rockchip SoCs are missing
the EL2 virtual timer interrupt. Add it.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://patch.msgid.link/20260523140242.586031-16-maz@kernel.org
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
|
|
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Chromium/Depthcharge bootloaders may dynamically add a few device nodes
to a system's DTB under a /firmware node. A typical DT looks something
like the following:
/ {
firmware {
ranges;
coreboot {
compatible = "coreboot";
reg = <...>;
...;
};
};
};
Notably, the /firmware node has an empty 'ranges', but does not have
address/size-cells.
Commit 6e5773d52f4a ("of/address: Fix WARN when attempting translating
non-translatable addresses") started requiring #address-cells for a
device's parent if we want to use the reg resource in a device node.
This leads to errors like the following:
[ 7.763870] coreboot_table firmware:coreboot: probe with driver coreboot_table failed with error -22
Add appropriate #{address,size}-cells to work around the problem.
Note that Google has also patched the Depthcharge bootloader source to
add {address,size}-cells [1], but bootloader updates are typically
delivered only via Google OS updates. Not all users install Google
software updates, and even if they do, Google may not produce updated
binaries for all/older devices.
[1] https://lore.kernel.org/all/20241209092809.GA3246424@google.com/
https://crrev.com/c/6051580 ("coreboot: Insert #address-cells and
#size-cells for firmware node")
Closes: https://lore.kernel.org/all/aeKlYzTiL0OB1y3g@google.com/
Fixes: 6e5773d52f4a ("of/address: Fix WARN when attempting translating non-translatable addresses")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add power domains found in Tegra114 and configure operating-points-v2 for
supported devices accordingly.
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add DC interconnections to Tegra114 device tree to reflect connections
between MC, EMC and DC.
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
device-tree nodes
Add EMC OPP tables and interconnect paths that will be used for dynamic
memory bandwidth scaling based on memory utilization statistics.
Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com>
Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Chromium/Depthcharge bootloaders may dynamically add a few device nodes
to a system's DTB under a /firmware node. A typical DT looks something
like the following:
/ {
firmware {
ranges;
coreboot {
compatible = "coreboot";
reg = <...>;
...;
};
};
};
Notably, the /firmware node has an empty 'ranges', but does not have
address/size-cells.
Commit 6e5773d52f4a ("of/address: Fix WARN when attempting translating
non-translatable addresses") started requiring #address-cells for a
device's parent if we want to use the reg resource in a device node.
This leads to errors like the following:
[ 7.763870] coreboot_table firmware:coreboot: probe with driver coreboot_table failed with error -22
Add appropriate #{address,size}-cells to work around the problem.
Note that Google has also patched the Depthcharge bootloader source to
add {address,size}-cells [1], but bootloader updates are typically
delivered only via Google OS updates. Not all users install Google
software updates, and even if they do, Google may not produce updated
binaries for all/older devices.
[1] https://lore.kernel.org/all/20241209092809.GA3246424@google.com/
https://crrev.com/c/6051580 ("coreboot: Insert #address-cells and
#size-cells for firmware node")
Closes: https://lore.kernel.org/all/aeKlYzTiL0OB1y3g@google.com/
Fixes: 6e5773d52f4a ("of/address: Fix WARN when attempting translating non-translatable addresses")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
* arm/fixes:
ARM: dts: gemini: Fix partition offsets
soc: qcom: ice: Fix the error code when 'qcom,ice' property is not found
arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node
arm64: dts: qcom: milos: Add power-domain and iface clk for ice node
tee: qcomtee: add missing va_end in early return qcomtee_object_user_init()
tee: fix params_from_user() error path in tee_ioctl_supp_recv
tee: shm: fix shm leak in register_shm_helper()
tee: fix tee_ioctl_object_invoke_arg padding
arm64: defconfig: Enable PCI M.2 power sequencing driver
scsi: ufs: ufs-qcom: Remove NULL check from devm_of_qcom_ice_get()
mmc: sdhci-msm: Remove NULL check from devm_of_qcom_ice_get()
soc: qcom: ice: Return proper error codes from devm_of_qcom_ice_get() instead of NULL
soc: qcom: ice: Return -ENODEV if the ICE platform device is not found
soc: qcom: ice: Fix race between qcom_ice_probe() and of_qcom_ice_get()
arm64: dts: qcom: x1-dell-thena: remove i2c20 (battery SMBus) and reserve its pins
arm64: dts: qcom: glymur: Drop RPMh CXO clocks from QMP PHYs
soc: qcom: ice: Allow explicit votes on 'iface' clock for ICE
dt-bindings: crypto: qcom,ice: Fix missing power-domain and iface clk
soc: imx8m: Fix match data lookup for soc device
tee: optee: prevent use-after-free when the client exits before the supplicant
|
|
These FIS partition offsets were never right: the comment clearly
states the FIS index is at 0xfe0000 and 0x7f * 0x20000 is
0xfe0000.
Tested on the iTian SQ201.
Fixes: d88b11ef91b1 ("ARM: dts: Fix up SQ201 flash access")
Fixes: b5a923f8c739 ("ARM: dts: gemini: Switch to redboot partition parsing")
Signed-off-by: Linus Walleij <linusw@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm Arm64 defconfig fixes for v7.1
A number of targets now depend on the M.2 PCIe power sequencing driver,
enable this to keep these devices functional with a defconfig build.
* tag 'qcom-arm64-defconfig-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: defconfig: Enable PCI M.2 power sequencing driver
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
Qualcomm Arm64 DeviceTree fixes for v7.1
Add missing power-domain and iface clocks for the ICE node of Eliza and
Milos to avoid the validation errors that resulted from late binding
changes. Also drop the reference clock for the USB QMP PHYs, for the
same reason.
Avoid touching the 20th I2C bus on the Hamoa-based (X Elite) Dell
laptops, as this conflicts with the battery management firmware.
* tag 'qcom-arm64-fixes-for-7.1' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: eliza: Add power-domain and iface clk for ice node
arm64: dts: qcom: milos: Add power-domain and iface clk for ice node
arm64: dts: qcom: x1-dell-thena: remove i2c20 (battery SMBus) and reserve its pins
arm64: dts: qcom: glymur: Drop RPMh CXO clocks from QMP PHYs
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The Tegra194 PCIe driver converts aspm-l1-entry-delay-ns to whole us
with ceiling division, then derives the Synopsys DesignWare PORT_AFR L1
entrance latency encoding as min(order_base_2(us), 7).
The nanosecond values from the Fixes tag below round up to 4, 8, and 16 us,
selecting PORT_AFR L1 entrance latency codes 2, 3, and 4 respectively.
Raise the programmed latency so the PORT_AFR codes are 3 / 4 / 5
(8 / 16 / 32 us buckets) instead of 2 / 3 / 4 (4 / 8 / 16 us).
- tegra194.dtsi: 4000 -> 8000 ns (all listed controllers)
- tegra234.dtsi: 8000 -> 16000 ns (Root Port), 16000 -> 32000 ns (Endpoint)
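The mapping from the devicetree value to the PORT_AFR code can be checked with a small standalone sketch. The function names are invented here; order_base_2_sketch() is a userspace re-implementation that is assumed to match the kernel's order_base_2() for the inputs of interest (smallest n with 2^n >= v).

```c
#include <stdint.h>

/* Round nanoseconds up to whole microseconds (ceiling division). */
static uint32_t ns_to_us_ceil(uint32_t ns)
{
	return ns / 1000 + (ns % 1000 != 0);
}

/* Smallest n such that (1 << n) >= v; returns 0 for v <= 1. */
static uint32_t order_base_2_sketch(uint32_t v)
{
	uint32_t order = 0;

	while (order < 31 && (UINT32_C(1) << order) < v)
		order++;
	return order;
}

/* L1 entrance latency encoding: min(order_base_2(us), 7). */
static uint32_t l1_entrance_latency_code(uint32_t delay_ns)
{
	uint32_t code = order_base_2_sketch(ns_to_us_ceil(delay_ns));

	return code < 7 ? code : 7;
}
```

With these helpers, 4000, 8000 and 16000 ns yield codes 2, 3 and 4, while the new values 8000, 16000 and 32000 ns yield 3, 4 and 5, matching the figures quoted above.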
Fixes: d60ed99f1c9e ("arm64: tegra: Add aspm-l1-entry-delay-ns to PCIe nodes")
Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Cross-merge networking fixes after downstream PR (net-7.1-rc6).
Conflicts:
drivers/net/phy/air_en8811h.c
d895767c33781 ("net: phy: air_en8811h: add AN8811HB MCU assert/deassert support")
dddfadd75197e ("net: phy: Add Airoha phy library for shared code")
5226bb6634cdf ("net: phy: air_phy_lib: Factorize BuckPBus register accessors")
e08f0ea6daf2e ("net: phy: Rename Airoha common BuckPBus register accessors")
net/sched/sch_netem.c
a2f6ed7b4873 ("net/sched: netem: add per-impairment extended statistics")
9552b11e3eda ("net/sched: fix packet loop on netem when duplicate is on")
Adjacent changes:
drivers/dpll/zl3073x/core.c
c1224569cef0 ("dpll: zl3073x: make frequency monitor a per-device attribute")
54e65df8cf18 ("dpll: zl3073x: report FFO as DPLL vs input reference offset")
net/iucv/af_iucv.c
347fdd4df85f ("af_iucv: convert to getsockopt_iter")
3589d20a666c ("net/iucv: fix locking in .getsockopt")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Register the MAX77620 PMIC as the system power controller on Pixel C so
the driver can install its sys-off handler.
This allows the PMIC poweroff sequence to override the non-working PSCI
SYSTEM_OFF implementation on this platform.
Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt>
Signed-off-by: Thierry Reding <treding@nvidia.com>
|
|
Add a header asm/neon-intrinsics.h similar to the one that arm64 has.
This makes it possible for NEON intrinsics code to be shared seamlessly
between ARM and arm64.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://patch.msgid.link/20260422171655.3437334-11-ardb+git@google.com
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
The hypervisor is an untrusted entity for TDX guests. It cannot be used
to boot secondary CPUs. The function hv_vtl_wakeup_secondary_cpu() cannot
be used.
Instead, the virtual firmware boots the secondary CPUs and places them in
a state to transfer control to the kernel using the wakeup mailbox. The
firmware enumerates the mailbox via either an ACPI table or a DeviceTree
node.
If the wakeup mailbox is present, the kernel updates the APIC callback
wakeup_secondary_cpu_64() to use it.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
The current code maps MMIO devices as shared (decrypted) by default in a
confidential computing VM.
In a TDX environment, secondary CPUs are booted using the Multiprocessor
Wakeup Structure defined in the ACPI specification. The virtual firmware
and the operating system function in the guest context, without
intervention from the VMM. Map the physical memory of the mailbox as
private. Use the is_private_mmio() callback.
Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
A Hyper-V VTL level 2 guest in a TDX environment needs to map the physical
page of the ACPI Multiprocessor Wakeup Structure as private (encrypted). It
needs to know the physical address of this structure. Add a helper function
to retrieve the address.
Suggested-by: Michael Kelley <mhklinux@outlook.com>
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
The hypervisor is an untrusted entity for TDX guests. It cannot be used
to boot secondary CPUs - neither via hypercalls nor the INIT assert,
de-assert, plus Start-Up IPI messages.
Instead, the platform virtual firmware boots the secondary CPUs and
puts them in a state to transfer control to the kernel. This mechanism uses
the wakeup mailbox described in the Multiprocessor Wakeup Structure of the
ACPI specification. The entry point to the kernel is trampoline_start64.
Allocate and set up the trampoline using the default x86_platform callbacks.
The platform firmware configures the secondary CPUs in long mode. It is no
longer necessary to locate the trampoline under 1MB memory. After handoff
from firmware, the trampoline code switches briefly to 32-bit addressing
mode, which has an addressing limit of 4GB. Set the upper bound of the
trampoline memory accordingly.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
x86 CPUs boot in real mode. This mode uses a 1MB address space. The
trampoline must reside below this 1MB memory boundary.
There are platforms in which the firmware boots the secondary CPUs,
switches them to long mode and transfers control to the kernel. An example
of such a mechanism is the ACPI Multiprocessor Wakeup Structure.
In this scenario there is no restriction on locating the trampoline under
1MB memory. Moreover, certain platforms (for example, Hyper-V VTL guests)
may not have memory available for allocation below 1MB.
Add a new member to struct x86_init_resources to specify the upper bound
for the location of the trampoline memory. Preserve the default upper bound
of 1MB to retain the current behavior.
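The effect of a configurable upper bound can be illustrated with a simplified placement routine. Everything here is hypothetical: the struct, the trampoline_limit field name and place_trampoline() are stand-ins invented for this sketch, and the real allocation goes through the kernel's early memory allocator rather than a single free range.

```c
#include <stdint.h>

#define SZ_1M		(UINT64_C(1) << 20)
#define SZ_4G		(UINT64_C(1) << 32)
#define PAGE_MASK_4K	(~UINT64_C(0xfff))

/* Hypothetical stand-in for the new member; the real field name is not given above. */
struct init_resources_model {
	uint64_t trampoline_limit;
};

/* The default keeps the historical 1MB bound. */
static struct init_resources_model default_resources = {
	.trampoline_limit = SZ_1M,
};

/*
 * Return the highest page-aligned base inside [free_start, free_end) such
 * that the trampoline ends at or below the limit, or 0 if nothing fits.
 */
static uint64_t place_trampoline(const struct init_resources_model *res,
				 uint64_t free_start, uint64_t free_end,
				 uint64_t size)
{
	uint64_t end = free_end < res->trampoline_limit ?
		       free_end : res->trampoline_limit;
	uint64_t base;

	if (end < free_start || end - free_start < size)
		return 0;
	base = (end - size) & PAGE_MASK_4K;
	return base >= free_start ? base : 0;
}
```

A platform with no usable memory below 1MB fails to place the trampoline under the default bound but succeeds once it raises the limit, which is the situation this change prepares for.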
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
Hyper-V VTL clears x86_platform.realmode_{init(), reserve()} in
hv_vtl_init_platform() whereas it sets real_mode_header later in
hv_vtl_early_init(). There is no need to deal with the settings of real
mode memory in two places. Also, both functions are called much earlier
than x86_platform.realmode_init() (via an early_initcall), where the
real_mode_header is needed.
Set real_mode_header in hv_vtl_init_platform() to keep all code dealing
with memory for the real mode trampoline in one place. Besides making the
code more readable, it prepares it for a subsequent changeset in which the
behavior needs to change to support Hyper-V VTL guests in a TDX
environment.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mhklinux@outlook.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
The Wakeup Mailbox is a mechanism to boot secondary CPUs on systems that do
not want to or cannot use the INIT + StartUp IPI messages.
The platform firmware is expected to implement the mailbox as described in
the Multiprocessor Wakeup Structure of the ACPI specification. It is also
expected to publish the mailbox to the operating system as described in the
corresponding DeviceTree schema that accompanies the documentation of the
Linux kernel.
Reuse the existing functionality to set the memory location of the mailbox
and update the wakeup_secondary_cpu_64() APIC callback. Make this
functionality available to DeviceTree-based systems by making
CONFIG_X86_MAILBOX_WAKEUP depend on either CONFIG_OF or CONFIG_ACPI_MADT_WAKEUP.
do_boot_cpu() uses wakeup_secondary_cpu_64() when set. It will be set if a
wakeup mailbox is enumerated via an ACPI table or a DeviceTree node. For
cases in which this behavior is not desired, this APIC callback can be
updated later during boot using platform-specific hooks.
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Co-developed-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Signed-off-by: Yunhong Jiang <yunhong.jiang@linux.intel.com>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
Systems that describe hardware using DeviceTree graphs may enumerate and
implement the wakeup mailbox as defined in the ACPI specification but do
not otherwise depend on ACPI. Expose functions to set up and access the
location of the wakeup mailbox from outside ACPI code.
The function acpi_setup_mp_wakeup_mailbox() stores the physical address of
the mailbox and updates the wakeup_secondary_cpu_64() APIC callback.
The function acpi_madt_multiproc_wakeup_mailbox() returns a pointer to the
mailbox.
Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
The prototypes for get_topology_cpu_type_name() and
get_topology_cpu_type() take a pointer to struct cpuinfo_x86, but
asm/topology.h neither includes nor forward-declares the structure.
Including asm/topology.h, directly or indirectly, without including
asm/processor.h triggers a warning:
./arch/x86/include/asm/topology.h:159:47: error: ‘struct cpuinfo_x86’
declared inside parameter list will not be visible outside of this
definition or declaration [-Werror]
159 | const char *get_topology_cpu_type_name(struct cpuinfo_x86 *c);
| ^~~~~~~~~~~
Since only a pointer is needed, add a forward declaration of struct
cpuinfo_x86.
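
As a minimal sketch of why a forward declaration is enough here: a prototype
that only takes a pointer to a structure does not need the structure's layout.
The names struct cpu_info and cpu_type_name() below are hypothetical stand-ins
for illustration, not the kernel's definitions.

```c
/* Header side: the incomplete type is sufficient for a pointer parameter. */
struct cpu_info;	/* hypothetical stand-in for struct cpuinfo_x86 */

const char *cpu_type_name(const struct cpu_info *c);

/* Implementation side: the full definition is only needed where the
 * members are actually accessed. */
struct cpu_info {
	int is_performance_core;
};

const char *cpu_type_name(const struct cpu_info *c)
{
	return c->is_performance_core ? "performance" : "efficient";
}
```

Without the forward declaration, the compiler would treat the struct named in
the parameter list as a new type scoped to that prototype, which is what the
diagnostic above complains about.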
Additionally, sysctl_sched_itmt_enabled is declared in asm/topology.h with
the __read_mostly attribute, but the header does not include linux/cache.h.
This causes a build failure when including asm/topology.h but not
linux/cache.h:
./arch/x86/include/asm/topology.h:264:27: error: expected ‘=’, ‘,’,
‘;’, ‘asm’ or ‘__attribute__’ before ‘sysctl_sched_itmt_enabled’
264 | extern bool __read_mostly sysctl_sched_itmt_enabled;
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Include the required header.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511181954.UMxCeTV1-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202511190008.AA0NTn3G-lkp@intel.com/
Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Signed-off-by: Dexuan Cui <dexuan@kernel.org>
|
|
* kvm-arm64/misc-7.2:
: .
: - Check for a valid vcpu pointer upon deactivating traps when handling
: a HYP panic in VHE mode
:
: - Make the __deactivate_fgt() macro use its arguments instead of the
: surrounding context
:
: - Don't bother with initialising TPIDR_EL2 in the hyp stubs, as this
: is already taken care of in more obvious places
:
: - Drop the unused kvm_arch pointer passed to __load_stage2()
: .
KVM: arm64: Remove @arch from __load_stage2()
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Since commit fe49fd940e22 ("KVM: arm64: Move VTCR_EL2 into struct s2_mmu"),
@arch is no longer required to obtain the per-kvm_s2_mmu vtcr and can be
removed from __load_stage2().
Signed-off-by: Zenghui Yu (Huawei) <zenghui.yu@linux.dev>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://patch.msgid.link/20260318144305.56831-1-zenghui.yu@linux.dev
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The RISC-V Bit-manipulation Extension for Cryptography (Zbkb) provides
the 'brev8' instruction, which reverses the bits within each byte.
Combined with the 'rev8' instruction (from Zbb or Zbkb), which reverses
the byte order of a register, we can efficiently implement 16-bit,
32-bit, and (on RV64) 64-bit bit reversal.
This is significantly faster than the default software table-lookup
implementation in lib/bitrev.c, as it replaces memory accesses and
multiple arithmetic operations with just two or three hardware
instructions.
Select HAVE_ARCH_BITREVERSE as well as GENERIC_BITREVERSE,
and provide <asm/bitrev.h> to utilize these instructions when
the Zbkb extension is available at runtime via the alternatives
mechanism.
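
The composition described above can be sketched in portable C. This is a
software model only, not the code added by this change: model_brev8() and
model_rev8() are hypothetical helpers that emulate what the 'brev8' and 'rev8'
instructions compute on a 32-bit value.

```c
#include <stdint.h>

/* Model of 'brev8': reverse the bits inside each byte, bytes stay in place. */
uint32_t model_brev8(uint32_t x)
{
	x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);
	x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);
	x = ((x & 0x0f0f0f0fu) << 4) | ((x >> 4) & 0x0f0f0f0fu);
	return x;
}

/* Model of 'rev8' on a 32-bit value: reverse the byte order. */
uint32_t model_rev8(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
	       ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Reversing bits within bytes, then reversing the bytes, reverses all bits. */
uint32_t model_bitrev32(uint32_t x)
{
	return model_rev8(model_brev8(x));
}

/* The 16-bit result is the upper half of the 32-bit reversal. */
uint16_t model_bitrev16(uint16_t x)
{
	return (uint16_t)(model_bitrev32(x) >> 16);
}
```

For example, model_bitrev32(0x12345678) yields 0x1e6a2c48. The 16-bit case
needs the extra shift, which is one way to end up with three operations
instead of two.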
[Yury: select the options conditionally on BITREVERSE]
Link: https://docs.riscv.org/reference/isa/unpriv/b-st-ext.html
Suggested-by: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: Yury Norov <ynorov@nvidia.com>
|
|
Architectures may have bit reversal instructions, but if the API is not
needed, the corresponding option should not be selected because it may
lead to generating unneeded code.
Signed-off-by: Yury Norov <ynorov@nvidia.com>
|
|
Add device tree nodes for the two xSPI (Expanded SPI) controllers
integrated into the RZ/N2H (R9A09G087) SoC.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260526204045.3481604-3-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add device tree nodes for the two xSPI (Expanded SPI) controllers
integrated into the RZ/T2H (R9A09G077) SoC.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260526204045.3481604-2-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Sort the pinmux entries for both GMAC ctrl nodes in port order (A/B/C and
D/E/F respectively) and remove the extra blank line before the second
pinmux assignment.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260524092016.46346-1-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Enable SCMI via MFIS-SCP and S-TCM transport area.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260519074702.3308-6-wsa+renesas@sang-engineering.com
[geert: Drop scmi_clk node]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Describe the MFIS and MFIS SCP instances which are used for various
tasks including inter-processor communication. Remove the PRR node
because it is part of MFIS on R-Car X5H and should be handled using the
MFIS compatible. Also, describe the S-TCM transport area used for shared
memory mailboxing.
Signed-off-by: Vinh Nguyen <vinh.nguyen.xz@renesas.com>
Signed-off-by: Hai Pham <hai.pham.ud@renesas.com>
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260519074702.3308-5-wsa+renesas@sang-engineering.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
The HW user manual for the Renesas RZ/T2H and the RZ/N2H states that for
SDR104, SDR50, and HS200 to work properly the eMMC/SDHI interface pins
have to be configured as specified below:
- SDn_CLK pin - drive strength: Ultra High, slew rate: Fast,
- Other SDn_* pins: drive strength: High, slew rate: Fast,
Schmitt trigger: disabled (not applicable to SDn_RST pins).
HS DDR and DDR50 are currently not supported, and for every other bus
mode the eMMC/SDHI interface pins should be configured as specified
below:
- SDn_CLK pin - drive strength: High, slew rate: Fast,
- Other SDn_* pins: drive strength: Middle, slew rate: Fast,
Schmitt trigger: disabled (not applicable to SDn_RST pins).
Adjust the pin definitions accordingly.
Signed-off-by: Fabrizio Castro <fabrizio.castro.jz@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Link: https://patch.msgid.link/20260514210220.7616-1-fabrizio.castro.jz@renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
The Renesas R-Car X5H (R8A78000) SoC contains an Arm CoreLink GIC-720AE
Generic Interrupt Controller with Multi View capability. Firmware has
access to configuration View 0; the Linux kernel has access to View 1.
The Arm CoreLink GIC-720AE Generic Interrupt Controller Technical
Reference Manual, currently latest r2p1 [1], chapter "5. Programmers
model for GIC-720AE", subchapter "5.4 Redistributor registers
for control and physical LPIs summary", part "5.4.3 GICR_TYPER,
Redistributor Type Register", "Table 5-50: GICR_TYPER bit descriptions"
on page 200, clarifies register "GICR_TYPER" bit 4 "Last" behavior
in Multi View setup as follows:
Last
Last Redistributor:
0 ... This Redistributor is not the last Redistributor on the chip.
1 ... This Redistributor is the last Redistributor on the chip.
When GICD_CFGID.VIEW == 1, for views 1, 2, or 3 this bit
always returns 1.
On this SoC, GICD_CFGID.VIEW is 1 and the Linux kernel has access to
View 1, therefore Linux kernel GICv3 driver will interpret register
"GICR_TYPER" bit 4 "Last" = 1 in the first Redistributor in continuous
Redistributor page as that first Redistributor being the one and only
Redistributor and will stop processing the continuous Redistributor
page further. This will prevent the other Redistributors from being
recognized by the system and used for other PEs.
Because the hardware indicates that the continuous Redistributor page
is not continuous for View 1, 2, or 3, describe every Redistributor
separately in the DT. This makes all Redistributors for all cores
accessible in Linux.
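
The effect of the "Last" bit can be illustrated with a small model. This is a
sketch under stated assumptions, not the GICv3 driver code:
count_redistributors() is a hypothetical helper that walks an array of
GICR_TYPER values for one region and stops at the first frame reporting
Last = 1.

```c
#include <stddef.h>
#include <stdint.h>

#define GICR_TYPER_LAST (1u << 4)	/* bit 4, "Last" */

/* Count the Redistributors a walker finds in one region when it stops at
 * the first frame whose GICR_TYPER has the Last bit set. */
size_t count_redistributors(const uint64_t *typer, size_t nr_frames)
{
	size_t found = 0;

	for (size_t i = 0; i < nr_frames; i++) {
		found++;
		if (typer[i] & GICR_TYPER_LAST)
			break;
	}
	return found;
}
```

If every frame reports Last = 1, a single region of four frames yields one
Redistributor, whereas four regions of one frame each yield four.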
[1] https://documentation-service.arm.com/static/69ef3c1cd35efd294e335c43
Arm® CoreLink™ GIC-720AE Generic Interrupt Controller
Revision: r2p1 / Issue 12 / 102666_0201_12_en
Fixes: 63500d12cf76 ("arm64: dts: renesas: Add R8A78000 SoC support")
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260514125328.20954-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Describe SMC based PSCI access in SoC DT. The system can interact with
TFA BL31 PSCI provider running on the Cortex-A cores via SMC calls.
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260513225037.49803-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add audio_clk1 and audio_clk2 fixed-clock nodes to the RZ/G3L (r9a08g046)
SoC DTSI. These clocks are external to the SoC and their frequencies are
board-dependent, so they are defined with clock-frequency = <0> as
placeholders that must be overridden in board-level DTS files.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260505123708.134069-4-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add SSI{0,1,2,3} nodes to RZ/G3L SoC DTSI.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260505123708.134069-3-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add the DMA controller device tree node for the RZ/G3L (r9a08g046) SoC.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260505123708.134069-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add i2c{0..3} device nodes to RZ/G3L ("R9A08G046") SoC DTSI.
As the placeholder for i2c0 is removed, add the pin control
device nodes to make it functional in the board DTS.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260505070206.7932-3-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Add scif{1..5} device nodes to RZ/G3L ("R9A08G046") SoC DTSI.
Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20260505070206.7932-2-biju.das.jz@bp.renesas.com
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
|
|
Clang recently added support for -Wattribute-alias [1], which results in
the same warnings that necessitated commit bee20031772a ("disable
-Wattribute-alias warning for SYSCALL_DEFINEx()") for GCC.
kernel/time/itimer.c:325:1: error: alias and aliasee have different types 'long (unsigned int)' and 'long (typeof (__builtin_choose_expr((__builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0LL)) || __builtin_types_compatible_p(typeof ((unsigned int)0), typeof (0ULL))), 0LL, 0L)))' (aka 'long (long)') [-Werror,-Wattribute-alias]
325 | SYSCALL_DEFINE1(alarm, unsigned int, seconds)
| ^
include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
| ^
include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
236 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
| ^
include/linux/syscalls.h:251:18: note: expanded from macro '__SYSCALL_DEFINEx'
251 | __attribute__((alias(__stringify(__se_sys##name)))); \
| ^
kernel/time/itimer.c:325:1: note: aliasee is declared here
include/linux/syscalls.h:225:36: note: expanded from macro 'SYSCALL_DEFINE1'
225 | #define SYSCALL_DEFINE1(name, ...) SYSCALL_DEFINEx(1, _##name, __VA_ARGS__)
| ^
include/linux/syscalls.h:236:2: note: expanded from macro 'SYSCALL_DEFINEx'
236 | __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
| ^
include/linux/syscalls.h:255:18: note: expanded from macro '__SYSCALL_DEFINEx'
255 | asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
| ^
<scratch space>:16:1: note: expanded from here
16 | __se_sys_alarm
| ^
Disable the warnings in the same way for clang-23 and newer. Disable the
warning about unknown warning options to avoid breaking the build for
versions of clang-23 that do not have -Wattribute-alias, such as ones
deployed by vendors like Android or CI systems or when bisecting LLVM
between llvmorg-23-init and release/23.x.
Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issues/2163
Link: https://github.com/llvm/llvm-project/commit/40da6920a0d71d49dfa2392b09153600b0759f5e [1]
Link: https://patch.msgid.link/20260515-syscall-disable-attribute-alias-for-clang-v1-1-9a9d95d41df6@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
|
|
'verisilicon', 'riscv', 'amd/amd-vi' and 'core' into next
|
|
Implement and enable the KVM_PRE_FAULT_MEMORY ioctl for s390.
Faulted-in pages will be marked as accessed, unlike x86, otherwise they
will trigger a minor fault when accessed. Avoiding such faults is one of
the points of KVM_PRE_FAULT_MEMORY.
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260527144358.186359-3-imbrenda@linux.ibm.com>
|
|
Until now, the members of struct guest_fault are always accessed while
holding the required locks, and thus the ptep and crstep pointers can
be dereferenced safely.
There will be some new cases where callers of kvm_s390_faultin_gfn()
need to know the size of the page used to solve the fault, at which
point no locks are held anymore, and dereferencing the crstep field
is not possible.
Introduce a new crste_region3 flag for struct guest_fault to indicate
whether the crstep used to solve the fault was a region 3 entry with FC=1
(large pud).
This makes it possible to disambiguate all three possible scenarios:
* If ptep is not NULL, the fault was solved with a pte.
* If ptep is NULL and crste_region3 is 0, a segment entry with FC=1
(large pmd) was used.
* If ptep is NULL and crste_region3 is 1, a region 3 entry with FC=1
(large pud) was used.
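
The three cases above can be decoded without dereferencing any pointer, which
is the point of the new flag. The following is a minimal sketch; struct
fault_model and fault_mapping_kind() are hypothetical and only mirror the two
fields discussed here, not the full struct guest_fault.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct guest_fault. */
struct fault_model {
	void *ptep;		/* non-NULL if a pte solved the fault */
	bool crste_region3;	/* set if a region 3 entry with FC=1 was used */
};

enum fault_mapping {
	MAPPING_PTE,
	MAPPING_LARGE_PMD,
	MAPPING_LARGE_PUD,
};

/* Only compares ptep against NULL; nothing is dereferenced, so this is
 * safe to call after the locks have been dropped. */
enum fault_mapping fault_mapping_kind(const struct fault_model *f)
{
	if (f->ptep)
		return MAPPING_PTE;
	return f->crste_region3 ? MAPPING_LARGE_PUD : MAPPING_LARGE_PMD;
}
```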
Reviewed-by: Steffen Eiden <seiden@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-ID: <20260527144358.186359-2-imbrenda@linux.ibm.com>
|
|
It was missed that idt_do_interrupt_irqoff() gets compiled on x86_64;
this is a problem for CFI builds because it includes an unadorned
indirect call. It is however completely dead code.
Rework things to not emit this function at all.
Fixes: 0701c9e17bd9 ("x86/kvm/vmx: Move IRQ/NMI dispatch from KVM into x86 core")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reported-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260526090631.GA4149641@noisy.programming.kicks-ass.net
|
|
Backmerging to get GEM LRU fixes from commit 379e8f1c ("drm/gem: Make
the GEM LRU lock part of drm_device") and other updates from v7.1-rc5.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
|
|
* kvm-arm64/nv-granule-sizes:
: .
: Tidying up of the behaviour when the selected page size is not
: implemented, courtesy of Wei-Lin Chang. From the initial cover
: letter:
:
: "This small series fixes the granule size selection for software stage-1
: and stage-2 walks. Previously we treat the guest's TCR/VTCR.TGx as-is
: and use the encoded granule size for the walks. However this is
: incorrect if the granule sizes are not advertised in the guest's
: ID_AA64MMFR0_EL1.TGRAN*. The architecture specifies that when an
: unsupported size is programmed in TGx, it must be treated as an
: implemented size. Fix this by choosing an available one while
: prioritizing PAGE_SIZE."
: .
KVM: arm64: Fallback to a supported value for unsupported guest TGx
KVM: arm64: nv: Use literal granule size in TLBI range calculation
KVM: arm64: Factor out TG0/1 decoding of VTCR and TCR
KVM: arm64: nv: Rename vtcr_to_walk_info() to setup_s2_walk()
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When KVM derives the translation granule for emulated stage-1 and
stage-2 walks, it decodes TCR/VTCR.TGx and treats the granule as-is.
This is wrong when the guest programs a granule size that is not
advertised in the guest's ID_AA64MMFR0_EL1.TGRAN* fields.
Architecturally, such a value must be treated as an implemented granule
size. Choose an available one while prioritizing PAGE_SIZE.
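
One possible shape of such a fallback is sketched below. The enum, the
supported[] array and effective_granule() are hypothetical, and the order used
when neither the requested size nor PAGE_SIZE is advertised is an assumption
made for illustration.

```c
#include <stdbool.h>

enum granule { GRANULE_4K, GRANULE_16K, GRANULE_64K, GRANULE_COUNT };

/* supported[g] is true if granule g is advertised to the guest.
 * host_page is the granule matching the host's PAGE_SIZE. */
enum granule effective_granule(enum granule requested,
			       const bool supported[GRANULE_COUNT],
			       enum granule host_page)
{
	/* An advertised size is used as programmed. */
	if (supported[requested])
		return requested;

	/* Otherwise prefer the host page size when the guest can use it. */
	if (supported[host_page])
		return host_page;

	/* Assumption: otherwise take the first advertised size. */
	for (int g = 0; g < GRANULE_COUNT; g++)
		if (supported[g])
			return (enum granule)g;

	/* Assumption: nothing advertised, fall back to the host page size. */
	return host_page;
}
```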
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
Link: https://patch.msgid.link/20260414000334.3947257-5-weilin.chang@arm.com
[maz: minor tidying up]
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
TLBI handling derives the invalidation range from guest VTCR_EL2.TG0 in
get_guest_mapping_ttl() and compute_tlb_inval_range(). Switch these to
use a helper that returns the decoded VTCR_EL2.TG0 granule size instead
of decoding it inline.
This keeps the granule size derivation in one place and prepares for
following changes that adjust the effective granule size.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
Link: https://patch.msgid.link/20260414000334.3947257-4-weilin.chang@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The current code decodes TCR.TG0/TG1 and VTCR.TG0 inline at several
places. Extract this logic into helpers so the granule size can be
derived in one place. This enables us to alter the effective granule
size in the same place, which we will do in a later patch.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
Link: https://patch.msgid.link/20260414000334.3947257-3-weilin.chang@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
This rename aligns the stage-2 walker better with the stage-1 walker.
Also set up other non-VTCR walk info in the function.
Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
Link: https://patch.msgid.link/20260414000334.3947257-2-weilin.chang@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
* vmx:
KVM: VMX: Handle bad values on proxied writes to LBR MSRs
KVM: TDX: Fix x2APIC MSR handling in tdx_has_emulated_msr()
|
|
* svm: (24 commits)
KVM: x86/pmu: Allow Host-Only/Guest-Only bits with nSVM and mediated PMU
KVM: x86/pmu: Reprogram Host/Guest-Only counters on nested transitions
KVM: x86/pmu: Track mediated PMU counters with mode-specific enables
KVM: x86/pmu: Disable counters based on Host-Only/Guest-Only bits in SVM
KVM: x86/pmu: Add support for KVM_X86_PMU_OP_OPTIONAL_RET0
KVM: x86/pmu: Check mediated PMU counter enablement before event filters
KVM: x86/pmu: Do a single atomic OR when reprogramming counters
KVM: x86/pmu: Rename reprogram_counters() to clarify usage
KVM: x86: Move enable_pmu/enable_mediated_pmu to pmu.h and pmu.c
KVM: nSVM: Move VMRUN instruction retirement after entering guest mode
KVM: nSVM: Unify RIP and PMU handling calls when emulating VMRUN
KVM: nSVM: Bail early out of VMRUN emulation if advancing RIP fails
KVM: nSVM: Stop leaking single-stepping on VMRUN into L2
KVM: SVM: Only disable x2AVIC WRMSR interception for MSRs that are accelerated
KVM: SVM: Disable x2AVIC RDMSR interception for MSRs KVM actually supports
KVM: x86: Add dedicated API for getting mask of accelerated x2APIC MSRs
KVM: x86: nSVM: Save/restore gPAT with KVM_{GET,SET}_NESTED_STATE
KVM: Documentation: document KVM_{GET,SET}_NESTED_STATE for SVM
KVM: x86: nSVM: Save gPAT to vmcb12.g_pat on VMEXIT
KVM: x86: nSVM: Redirect IA32_PAT accesses to either hPAT or gPAT
...
|
|
* sev:
KVM: SEV: Mark source page dirty when writing back CPUID data on failure
KVM: SEV: Unmap local kmaps in LIFO order, per highmem requirements
KVM: SEV: Pin source page for write when adding CPUID data for SNP guest
KVM: SEV: Allocate only as many bytes as needed for temp crypt buffers
KVM: SEV: Rewrite logic to {de,en}crypt memory for debug
KVM: SEV: Add helper function to pin/unpin a single page
KVM: SEV: Explicitly validate the dst buffer for debug operations
KVM: selftests: Add a test to verify SEV {en,de}crypt debug ioctls
KVM: SVM: Fix page overflow in sev_dbg_crypt() for ENCRYPT path
KVM: selftests: Teach sev_*_test about revoking VM types
KVM: SEV: Don't advertise VM types that are disabled by firmware
KVM: SEV: Don't advertise support for unusable VM types
KVM: SEV: Consolidate logic for printing state of SEV{,-ES,-SNP} enabling
KVM: SEV: Set supported SEV+ VM types during sev_hardware_setup()
crypto/ccp: export firmware supported vm types
crypto/ccp: hoist kernel part of SNP_PLATFORM_STATUS
|
|
* mmu: (23 commits)
KVM: TDX: Move external page table freeing to TDX code
KVM: x86: Move error handling inside free_external_spt()
KVM: TDX: Rename tdx_sept_remove_private_spte() to show it's for leaf SPTEs
KVM: TDX: Drop kvm_x86_ops.remove_external_spte()
KVM: TDX: Hoist tdx_sept_remove_private_spte() above set_private_spte()
KVM: x86/mmu: Drop KVM_BUG_ON() on shared lock to zap child external PTEs
KVM: x86/tdp_mmu: Centrally propagate to-present/atomic zap updates to external PTEs
KVM: x86/mmu: Plumb "sp" _pointer_ into the TDP MMU's handle_changed_spte()
KVM: x86/tdp_mmu: Morph !is_frozen_spte() check into a KVM_MMU_WARN_ON()
KVM: TDX: Move lockdep assert in __tdp_mmu_set_spte_atomic() to TDX code
KVM: TDX: Move KVM_BUG_ON()s in __tdp_mmu_set_spte_atomic() to TDX code
KVM: x86/mmu: Plumb param "old_spte" into kvm_x86_ops.set_external_spte()
KVM: x86/mmu: Fold set_external_spte_present() into its sole caller
KVM: TDX: Wrap mapping of leaf and non-leaf S-EPT entries into helpers
KVM: TDX: Drop kvm_x86_ops.link_external_spt()
x86/virt/tdx: Move mk_keyed_paddr() to tdx.c due to no external users
x86/tdx: Drop exported function tdx_quirk_reset_page()
x86/tdx: Use PFN directly for unmapping guest private memory
x86/tdx: Use PFN directly for mapping guest private memory
KVM: x86: Make "external SPTE" ops that can fail RET0 static calls
...
|
|
* misc: (30 commits)
KVM: SEV: Restrict userspace return codes for KVM_HC_MAP_GPA_RANGE
KVM: TDX: Allow userspace to return errors to guest for MAPGPA
KVM: selftests: Update hwcr_msr_test for CPUID faulting bit
KVM: x86: Virtualize AMD CPUID faulting
KVM: x86: Remove supports_cpuid_fault() helper
KVM: x86: Prioritize CPUID faulting over CPUID VM-exits in nested VMX
KVM: x86: Consolidate CPUID fault handling for emulator and interception logic
KVM: x86: Treat KVM's virtual PMU as disabled for TDX VMs
KVM: selftests: Add nested page fault injection test
KVM: VMX: Synthesize nested EPT violation GVA_IS_VALID/GVA_TRANSLATED bits
KVM: SVM: Fix nested NPF injection of PFERR_GUEST_{PAGE,FINAL}_MASK bits
KVM: x86: Tell ->inject_page_fault() whether or a fault came from hardware
KVM: x86: Widen x86_exception's error_code to 64 bits
MAINTAINERS: KVM: Include maintainer profile
KVM: x86: Remove unused X86EMUL_MODE_HOST define
KVM: selftests: Verify VMX's GUEST_PENDING_DBG_EXCEPTIONS.BS Consistency Check
KVM: selftests: Verify guest debug DR7.GD checking during instruction emulation
KVM: selftests: Add all (known) EFLAGS bit definitions
KVM: x86: Drop kvm_vcpu_do_singlestep() now that it's been gutted
KVM: x86: Move KVM_GUESTDBG_SINGLESTEP handling into kvm_inject_emulated_db()
...
|
|
* generic:
call_once:: Fix typo in comment for call_once()
KVM: Fix kvm_vcpu_map[_readonly]() function prototypes
KVM: Rename invalidate_begin to invalidate_start for consistency
|
|
* fixes: (28 commits)
KVM: SVM: Flush the current TLB when transitioning from xAVIC => x2AVIC
KVM: x86: Fix ERAPS RAP clear on INVPCID single-context invalidation
KVM: selftests: Guard execinfo.h inclusion for non-glibc builds
KVM: x86: Rate-limit global clock updates on vCPU load
x86/virt: Silence RCU lockdep splat in emergency virt callback path
KVM: selftests: Include sys/mman.h *and* linux/mman.h, via kvm_syscalls.h
KVM: VMX: introduce module parameter to disable CET
KVM: x86: Swap the dst and src operand for MOVNTDQA
KVM: x86: use again the flush argument of __link_shadow_page()
KVM: selftests: Ensure gmem file sizes are multiple of host page size
Documentation: kvm: update links in the references section of AMD Memory Encryption
KVM: nSVM: Never use L0's PAUSE loop exiting while L2 is running
KVM: x86: Fix Xen hypercall tracepoint argument assignment
KVM: Reject wrapped offset in kvm_reset_dirty_gfn()
KVM: arm64: Pre-check vcpu memcache for host->guest donate
KVM: arm64: Pre-check vcpu memcache for host->guest share
KVM: arm64: Seed pkvm_ownership_selftest vcpu memcache
KVM: arm64: Fix __deactivate_fgt macro parameter typo
KVM: arm64: Guard against NULL vcpu on VHE hyp panic path
KVM: arm64: Make EL2 exception entry and exit context-synchronization events
...
|
|
Now that KVM correctly handles Host-Only and Guest-Only bits in the
event selector MSRs, allow the guest to set them if the vCPU advertises
SVM and uses the mediated PMU.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-14-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Reprogram PMU counters on nested transitions for the mediated PMU, to
re-evaluate Host-Only and Guest-Only bits and enable/disable the PMU
counters accordingly. For example, if Host-Only is set and Guest-Only is
cleared, a counter should be disabled when entering guest mode and
enabled when exiting guest mode.
According to the APM, when EFER.SVME is cleared, setting Host-Only or
Guest-Only disables the counter, so also trigger counter reprogramming
when EFER.SVME is toggled.
Counters that set either the Host-Only or the Guest-Only bit are already
tracked in pmc_has_mode_specific_enables; use the bitmap to reprogram
these counters.
Reprogram the counters synchronously on nested VMRUN/#VMEXIT and
EFER.SVME toggling. This is necessary as these instructions are counted
based on the new CPU state (after the instruction is retired in
hardware). Hence, the PMU needs to be updated before instruction
emulation is completed and kvm_pmu_instruction_retired() is called.
Defer reprogramming the counters when force leaving guest mode through
svm_leave_nested() to avoid potentially reading stale state (e.g.
incorrect EFER). All flows force leaving nested are non-architectural,
so accuracy is irrelevant.
Refactor a helper out of kvm_pmu_request_reprogram_counters() that
accepts a boolean allowing synchronous vs deferred reprogramming, and
use that from SVM code to support both scenarios.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-13-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Instead of always checking if a counter needs to be disabled for
mode-specific reasons (e.g. Host-Only/Guest-Only bits in SVM), add a
bitmap to track such counters. Set the bit for counters using either
Host-Only or Guest-Only bits in EVENTSEL on SVM.
This bitmap will also be reused in following changes to selectively
apply changes to such counters.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-12-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Introduce an optional per-vendor PMU callback for checking if a counter
is disabled in the current mode, and register a callback on AMD to
disable a counter based on the vCPU's setting of Host-Only or Guest-Only
EVENT_SELECT bits with the mediated PMU.
If EFER.SVME is set, all events are counted if both bits are set or
cleared. If only one bit is set, the counter is disabled if the vCPU
context does not match the set bit.
If EFER.SVME is cleared, the counter is disabled if any of the bits is
set, otherwise all events are counted. Note that a Linux guest correctly
handles this and clears Host-Only when EFER.SVME is cleared, see commit
1018faa6cf23 ("perf/x86/kvm: Fix Host-Only/Guest-Only counting with SVM
disabled").
The callback is made from pmc_is_locally_enabled(), which is used for
the mediated PMU when updating eventsel_hw in
kvm_mediated_pmu_refresh_eventsel_hw(), as well as when checking what
PMCs count instructions/branches for emulation in
kvm_pmu_recalc_pmc_emulation().
Host-Only and Guest-Only bits are currently reserved, so this change is
a noop, but the bits will be allowed with mediated PMU in a following
change when fully supported.
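
The enablement rules described above reduce to a small truth table. The
function below is a sketch of that table only; counter_disabled_by_mode() and
its parameters are hypothetical names, not the callback added by this change.

```c
#include <stdbool.h>

/* svme:       EFER.SVME as seen by the vCPU
 * host_only:  Host-Only bit in the event selector
 * guest_only: Guest-Only bit in the event selector
 * in_guest:   true while the vCPU runs in guest mode */
bool counter_disabled_by_mode(bool svme, bool host_only, bool guest_only,
			      bool in_guest)
{
	/* With SVME clear, setting either bit disables the counter. */
	if (!svme)
		return host_only || guest_only;

	/* Both bits set or both clear: all events are counted. */
	if (host_only == guest_only)
		return false;

	/* Exactly one bit set: disabled when the context does not match. */
	return host_only ? in_guest : !in_guest;
}
```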
Originally-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-11-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
SG2042's PCIe root complexes are cache-coherent with the CPU. Mark all
four PCIe controller nodes (pcie_rc0 through pcie_rc3) as dma-coherent
so the kernel uses coherent DMA mappings instead of non-coherent bounce
buffering.
Cc: stable@vger.kernel.org
Signed-off-by: Han Gao <gaohan@iscas.ac.cn>
Link: https://patch.msgid.link/20260331171248.973014-3-gaohan@iscas.ac.cn
Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
|
|
Add definitions for KVM_X86_PMU_OP_OPTIONAL_RET0() to resolve to
__static_call_return0, similar to KVM_X86_OP_OPTIONAL_RET0(). Move the
definition of kvm_pmu_call() to pmu.h, and add declarations for the
static PMU calls in the header to allow making callbacks from the header
in following changes.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-10-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
If the guest disables the counter (by clearing
ARCH_PERFMON_EVENTSEL_ENABLE), KVM still performs the PMU filter lookup,
even though it doesn't end up changing eventsel_hw. Check if the
counter is enabled by the guest before doing the potentially expensive
PMU filter lookup.
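The reordering can be sketched as a short-circuit: test the cheap enable bit first, and only then consult the filter. The struct, the filter stand-in, and the function names below are hypothetical; only the ARCH_PERFMON_EVENTSEL_ENABLE bit (bit 22) mirrors the real definition.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVENTSEL_ENABLE      (1ULL << 22) /* ARCH_PERFMON_EVENTSEL_ENABLE */
#define DENIED_EVENT_SKETCH  0xc0ULL      /* hypothetical filtered event */

struct pmc_sketch {
    uint64_t eventsel;        /* value written by the guest */
    uint64_t eventsel_hw;     /* value programmed into hardware */
    unsigned filter_lookups;  /* counts how often the filter ran */
};

/* Stand-in for the potentially expensive PMU filter lookup. */
static bool filter_allows(struct pmc_sketch *pmc)
{
    pmc->filter_lookups++;
    return (pmc->eventsel & 0xff) != DENIED_EVENT_SKETCH;
}

/* Skip the lookup entirely when the guest has disabled the counter. */
static void refresh_eventsel_hw(struct pmc_sketch *pmc)
{
    bool enable = (pmc->eventsel & EVENTSEL_ENABLE) && filter_allows(pmc);

    pmc->eventsel_hw = pmc->eventsel & ~EVENTSEL_ENABLE;
    if (enable)
        pmc->eventsel_hw |= EVENTSEL_ENABLE;
}
```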
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-9-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Do a single atomic OR using the atomic overlay of reprogram_pmi bitmask,
instead of one atomic set_bit() call per counter.
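The difference can be illustrated in userspace with C11 atomics. This is a sketch under the assumption that the bitmask fits in one 64-bit word; the function names are hypothetical, and atomic_fetch_or() stands in for the kernel's atomic64_or() and set_bit().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Before: one atomic read-modify-write per requested counter. */
static void request_reprogram_per_bit(_Atomic uint64_t *bitmap,
                                      uint64_t counters)
{
    for (int i = 0; i < 64; i++) {
        if (counters & (1ULL << i))
            atomic_fetch_or(bitmap, 1ULL << i);
    }
}

/* After: a single atomic OR covers every requested counter. */
static void request_reprogram_single_or(_Atomic uint64_t *bitmap,
                                        uint64_t counters)
{
    atomic_fetch_or(bitmap, counters);
}
```

Both variants leave the bitmap in the same final state; the second issues one atomic operation regardless of how many counters are requested.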
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-8-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Rename reprogram_counters() to kvm_pmu_request_counters_reprogram(),
clarifying that it is more similar to
kvm_pmu_request_counter_reprogram(), and less similar to
reprogram_counter(). The kvm_pmu_* prefix is also appropriate as the
function is exposed in the header.
Opportunistically rename the argument from 'diff' to 'counters'.
No functional change intended.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-7-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The declaration and definition of enable_pmu/enable_mediated_pmu
semantically belong in pmu.h and pmu.c, and more importantly, pmu.h
uses enable_mediated_pmu and relies on the caller including x86.h.
There is already precedent for other module params defined outside of
x86.c, so move enable_pmu/enable_mediated_pmu to pmu.c.
No functional change intended.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-6-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
A successful VMRUN retires in guest mode and should be counted by the
PMU as a guest instruction. Move the call to
kvm_pmu_instruction_retired() after potentially entering guest mode,
such that VMRUN is counted correctly.
The PMU event will be matched against L2's CPL, but otherwise this does
not change the behavior in terms of guest vs. host, because KVM does
not virtualize Host-Only/Guest-Only PMC controls yet, so all
instructions are counted regardless of the vCPU's host/guest state. But
this change is needed for the incoming support for Host-Only/Guest-Only
controls to count VMRUN correctly.
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-5-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
The code paths for advancing RIP and retiring the instruction for RIP
are very similar whether or not caching vmcb12 succeeds. The only
difference is handling mapping failures (i.e. EFAULT).
Pull the mapping failure handling out and unify the calls to
svm_skip_emulated_instruction() and kvm_pmu_instruction_retired(), but
return immediately after if copying and caching vmcb12 failed. A nice
side effect of this is that the FIXME comment is now above the only code
path calling svm_skip_emulated_instruction().
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-4-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
If svm_skip_emulated_instruction() fails, then RIP could not be
advanced correctly (e.g. decode failure when NextRIP is not available).
KVM will exit to userspace to handle the emulation failure, but only
after stuffing the wrong RIP into vmcb01 and entering guest mode.
Bail early and exit to userspace before committing any side-effects of
emulating the VMRUN (e.g. entering guest mode).
Fixes: c8e16b78c614 ("x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()")
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-3-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
According to the APM, TF on VMRUN causes a #DB after VMRUN completes on
the _host_ side. However, KVM injects a #DB in L2 context instead (or
exits to userspace if KVM_GUESTDBG_SINGLESTEP is set) in
kvm_skip_emulated_instruction().
Avoid single-step handling on VMRUN by open-coding the rest of
kvm_skip_emulated_instruction() in nested_svm_vmrun(). This doesn't look
pretty, but following changes will need to open-code
kvm_pmu_instruction_retired() anyway, and will clean up the code. This
ignores TF on VMRUN instead of injecting a spurious exception into
L2. Document this virtualization hole with a FIXME.
Note that a failed VMRUN would have been correctly single-stepped, but
now TF is always ignored for consistency and simplicity purposes. VMX
does not support TF on a successful VMLAUNCH/VMRESUME, so it's unlikely
that single-stepping VMRUN properly is important, especially if it's
only for failed VMRUNs.
Fixes: c8e16b78c614 ("x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()")
Signed-off-by: Yosry Ahmed <yosry@kernel.org>
Link: https://patch.msgid.link/20260527234711.4175166-2-yosry@kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move the freeing of external page tables into the reclaim operation that
lives in TDX code.
The TDP MMU supports traversing the TDP without holding locks. Page tables
need to be freed via RCU to prevent walking one that gets freed.
While none of these lockless walk operations actually happen for the mirror
page table, the TDP MMU nonetheless frees the mirror page table in the same
way, and (because it's a handy place to plug it in) the external page table
as well.
However, the external page table definitely can't be walked once the page
table pages are reclaimed from the TDX module. The TDX module releases the
page for the host VMM to use, so this RCU-time free is unnecessary for the
external page table.
So move the free_page() call to TDX code. Create a
tdp_mmu_free_unused_sp() to allow for freeing external page tables that
have never left the TDP MMU code (i.e. don't need to be freed in a special
way).
Link: https://lore.kernel.org/kvm/aYpjNrtGmogNzqwT@google.com
[Based on a diff by Sean, added log]
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://patch.msgid.link/20260509075740.4371-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move the logic for TDX's specific need to leak pages when reclaim
fails inside the free_external_spt() op, so this can be done in TDX
specific code and not the generic MMU.
Do this by passing in "sp" instead of the external page table pointer.
This way, TDX code can set sp->external_spt to NULL. Since the error is now
handled internally in TDX code (by triggering KVM_BUG_ON() or
TDX_BUG_ON_3(), which warn and stop the VM on any error), change the op to
return void. This way it also operates like a normal free in that success
is guaranteed from the caller's perspective.
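The "leak on failure, void return" contract can be sketched as follows. All names are hypothetical, a boolean flag stands in for KVM_BUG_ON(), and the caller-side free() reflects this commit's division of work only as an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shadow page with an external page table. */
struct sp_sketch {
    void *external_spt;
};

static bool vm_bugged; /* stands in for the effect of KVM_BUG_ON() */

/*
 * Op sketch: returns void. On reclaim failure the error is handled
 * internally and the page is deliberately leaked by clearing the pointer.
 */
static void free_external_spt_op(struct sp_sketch *sp, bool reclaim_ok)
{
    if (!reclaim_ok) {
        vm_bugged = true;
        sp->external_spt = NULL; /* leak: the page must not be reused */
    }
}

/* Caller sketch: no error handling needed; free(NULL) is a no-op. */
static void release_sp(struct sp_sketch *sp, bool reclaim_ok)
{
    free_external_spt_op(sp, reclaim_ok);
    free(sp->external_spt);
    sp->external_spt = NULL;
}
```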
Opportunistically, drop the unused level and gfn args while adjusting the
sp arg.
[ Rick: Re-wrote log and massaged op name ]
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
[ Yan: Updated patch log/function comment, dropped unused param in op ]
Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://patch.msgid.link/20260509075730.4354-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Rename tdx_sept_remove_private_spte() to tdx_sept_remove_leaf_spte() to
clearly show that this function is for removal of leaf SPTEs.
No functional change intended.
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://patch.msgid.link/20260509075719.4338-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Drop kvm_x86_ops.remove_external_spte(), and instead handle the removal of
leaf SPTEs in the S-EPT (a.k.a. external page table) in
kvm_x86_ops.set_external_spte(). This will also allow extending
tdx_sept_set_private_spte() to support splitting a huge S-EPT entry without
needing yet another kvm_x86_ops hook.
Now all changes for removing leaf mirror SPTEs are propagated through
kvm_x86_ops.set_external_spte().
- When removing leaf mirror SPTEs under shared mmu_lock (though currently
no path can trigger this scenario and TDX does not support this
scenario), tdx_sept_remove_private_spte() may produce a warning due to
lockdep_assert_held_write() or may return -EIO and trigger TDX_BUG_ON()
due to concurrent BLOCK, TRACK, REMOVE.
- When removing leaf mirror SPTEs under exclusive mmu_lock, all errors are
unexpected. If any error occurs in this scenario,
tdx_sept_remove_private_spte() will return -EIO and trigger KVM_BUG_ON().
A redundant KVM_BUG_ON() call will also be triggered in TDP MMU core in
handle_changed_spte(), which is benign (the WARN will fire if and only if
the VM isn't already bugged).
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://patch.msgid.link/20260509075709.4322-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Arrange tdx_sept_remove_private_spte() (and its tdx_track() helper) to be
above tdx_sept_set_private_spte() in anticipation of routing all S-EPT
writes (with the exception of reclaiming non-leaf pages) through the "set"
API.
No functional change intended.
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://patch.msgid.link/20260509075658.4306-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Drop the KVM_BUG_ON() in the KVM MMU core before zapping child external
PTEs, since requiring zapping PTEs to be protected by exclusive mmu_lock is
TDX's specific requirement.
No need to plumb the shared/exclusive info into the remove_external_spte()
op or move the KVM_BUG_ON() to TDX, because
- There's already an assertion of exclusive mmu_lock protection in TDX.
- The KVM_BUG_ON() is a bit redundant given that if there's any bug causing
zapping of leaf PTEs in S-EPT under shared mmu_lock, SEAMCALL failures
due to contention would result in TDX_BUG_ON() in TDX.
Link: https://lore.kernel.org/kvm/aYUarHf3KEwHGuJe@google.com/
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://patch.msgid.link/20260509075647.4290-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
external PTEs
Move propagation of to-present changes and atomic zap changes to external
PTEs from function __tdp_mmu_set_spte_atomic() to function
__handle_changed_spte(), which centrally handles changes of SPTEs.
When setting a PTE to present in the mirror page tables, the update needs
to be propagated to the external page tables (in TDX parlance, the S-EPT).
Today this is handled by special mirror page tables logic/branching in
__tdp_mmu_set_spte_atomic(), which is the only place where present PTEs are
set for TDX.
The current approach obviously works, but is a bit hacked on. The hook for
setting present leaf PTEs is added only where TDX happens to need it. For
example, TDX does not support any of the operations that use the non-atomic
variant, tdp_mmu_set_spte(), to set present PTEs. Since the hook is missing
there, it is very hard to understand the code from a non-TDX lens. If the
reader doesn't know the TDX specifics it could look like the external SPTE
update is missing.
In addition to being confusing, it also litters the TDP MMU with "external"
update callbacks. This is especially unfortunate because there is already a
central place to react to TDP updates, handle_changed_spte().
Begin the process of moving towards a model where all mirror page table
updates are forwarded to TDX code where the TDX-specific logic can live
with a more proper separation of concerns. Do this by adding a helper
__handle_changed_spte() and teaching it how to return error codes, such
that it can propagate the failures that may come from TDX external page
table updates. Make the original handle_changed_spte() a no-fail version of
__handle_changed_spte(), so it handles no-fail changes which are under
exclusive mmu_lock or under the no-fail path handle_removed_pt(),
triggering KVM_BUG_ON() on error returns.
Instead of having __tdp_mmu_set_spte_atomic() do the frozen mirror SPTE
dance and trigger propagation to external PTEs, make
__tdp_mmu_set_spte_atomic() a simple helper of try_cmpxchg64() and hoist
the frozen mirror SPTE dance up a level to tdp_mmu_set_spte_atomic(). Then,
the propagation of changes to present to the external PTEs can be
centralized to __handle_changed_spte(). Aging external SPTEs is not yet
supported for the mirror page table, so just warn on mirror usage in
kvm_tdp_mmu_age_spte() and invoke __tdp_mmu_set_spte_atomic() directly
without the frozen dance. There is no need to warn on installing FROZEN_SPTE
as a long-term value in kvm_tdp_mmu_age_spte(), since clearing the accessed
bit is mutually exclusive with installing FROZEN_SPTE (FROZEN_SPTE has the
accessed bit set on all x86 platforms).
Since tdp_mmu_set_spte_atomic() can also be invoked to atomically zap SPTEs
(though there's no path to trigger atomic zap on the mirror page table up
to now), also leverage set_external_spte() op to propagate the atomic zaps
when tdp_mmu_set_spte_atomic() zaps leaf SPTEs directly. (When
tdp_mmu_set_spte_atomic() zaps a non-leaf SPTE, zaps of the child leaf
SPTEs are propagated via the remove_external_spte() op).
Note: tdp_mmu_set_spte_atomic() invokes __handle_changed_spte() to handle
changes to new_spte while the mirror SPTE is frozen, so
(1) the update of the external PTEs and statistics, or
(2) the update of child mirror SPTEs, child external PTEs and corresponding
statistics,
now occur before the mirror SPTE is actually set to new_spte.
(1) is ok since if it fails, the mirror SPTE will be restored to its
original value. (2) is also ok since handle_removed_pt() is no-fail.
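The frozen mirror SPTE dance and the ordering in the note can be sketched with C11 atomics. This is a simplified userspace illustration: the function names, the callback type, and the frozen marker value are all hypothetical, and atomic_compare_exchange_strong() stands in for try_cmpxchg64().

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

#define FROZEN_SPTE_SKETCH 0x5a0ULL /* placeholder, not the kernel's value */

/* Hypothetical hook that propagates a change to the external page table. */
typedef int (*propagate_fn)(uint64_t old_spte, uint64_t new_spte);

static int set_mirror_spte_atomic(_Atomic uint64_t *sptep, uint64_t old_spte,
                                  uint64_t new_spte, propagate_fn propagate)
{
    uint64_t expected = old_spte;
    int ret;

    /* Step 1: freeze the mirror SPTE so concurrent walkers back off. */
    if (!atomic_compare_exchange_strong(sptep, &expected, FROZEN_SPTE_SKETCH))
        return -EBUSY;

    /* Step 2: propagate to the external PTE while the SPTE is frozen. */
    ret = propagate(old_spte, new_spte);

    /* Step 3: commit new_spte on success, restore the old value on failure. */
    atomic_store(sptep, ret ? old_spte : new_spte);
    return ret;
}

/* Example propagation callbacks. */
static int propagate_ok(uint64_t old_spte, uint64_t new_spte)
{
    (void)old_spte;
    (void)new_spte;
    return 0;
}

static int propagate_fail(uint64_t old_spte, uint64_t new_spte)
{
    (void)old_spte;
    (void)new_spte;
    return -EIO;
}
```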
Link: https://lore.kernel.org/lkml/aYYn0nf2cayYu8e7@google.com
[Rick: Based on a diff by Sean Christopherson]
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
[Yan: added atomic zap case ]
Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://patch.msgid.link/20260509075634.4274-1-yan.y.zhao@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|