| Age | Commit message (Collapse) | Author | Files | Lines |
|
https://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/mic/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
# Conflicts:
# tools/testing/selftests/cgroup/test_memcontrol.c
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git
# Conflicts:
# drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git
|
|
# Conflicts:
# arch/x86/include/asm/tdx.h
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/liveupdate/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
# Conflicts:
# drivers/cpufreq/Kconfig.x86
# drivers/cpufreq/Makefile
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/hid/hid.git
|
|
# Conflicts:
# fs/btrfs/defrag.c
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock.git
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git
|
|
Add a shell test that verifies perf report handles truncated perf.data
files gracefully — exiting with an error code rather than crashing with
SIGSEGV or SIGABRT.
The test records a simple workload, then truncates the resulting
perf.data at four offsets that exercise different parsing stages:
8 bytes — file header magic only
64 bytes — partial file header (attr section incomplete)
256 bytes — into the first events (partial event headers)
75% size — mid-stream truncation (partial event data)
For each truncation, perf report is run and the exit code is checked:
- Exit code 0 (success) fails the test — a truncated file should
never parse without error.
- Crash signals are detected portably via kill -l, which maps the
signal number to a name on the running system. This handles
architectures where signal numbers differ (e.g. SIGBUS is 7 on
x86/ARM but 10 on MIPS/SPARC). Core-dump and fatal signals
(KILL, ILL, ABRT, BUS, FPE, SEGV, TRAP, SYS) fail the test.
- Higher exit codes (200+) are perf's own negative-errno returns
(e.g. -EINVAL = 234) and are expected.
This exercises the bounds checking, minimum-size validation, and error
propagation added by the preceding patches in this series.
Testing it:
root@number:~# perf test truncat
84: Test that perf report handles truncated perf.data gracefully (no crash, no segfault — clean error exit).: Ok
root@number:~# perf test -vv truncat
84: Test that perf report handles truncated perf.data gracefully (no crash, no segfault — clean error exit).:
--- start ---
test child forked, pid 62890
---- end(0) ----
84: Test that perf report handles truncated perf.data gracefully (no crash, no segfault — clean error exit).: Ok
root@number:~#
Changes in v2:
- Add SIGKILL to the list of fatal signals so OOM kills from
resource exhaustion bugs are detected (Reported-by: sashiko-bot@kernel.org)
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
[ Fixed the SPDX on the line where 'perf test' expects the test description, reviewed by Ian Rogers ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On native-endian files, events are read from MAP_SHARED memory.
Multiple reads of event->header.size can return different values
if the file is concurrently modified, allowing an attacker to
bypass bounds checks performed on an earlier read.
Snapshot header.size into a local variable at function entry using
READ_ONCE() to prevent compiler rematerialization, and use it for
all size-dependent arithmetic within the function. This ensures
every bounds calculation uses the same value that was validated
by the reader.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
work->cpu comes from sample->cpu which is (u32)-1 when
PERF_SAMPLE_CPU is absent. Stored as int, this becomes -1
which passes the signed BUG_ON(work->cpu >= MAX_NR_CPUS) but
causes an out-of-bounds access on cpus_runtime[-1].
Replace the BUG_ON in top_calc_total_runtime() with an unsigned
bounds check that skips entries with invalid CPU values, counting
them for a summary warning.
Guard the same index in profile_event_match() (bitmap OOB),
top_calc_idle_time(), top_calc_irq_runtime(), top_calc_cpu_usage(),
and top_calc_load_runtime(). Also guard against division by zero
in top_calc_cpu_usage() when no runtime was accumulated.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Several downstream consumers (timechart, kwork, sched) use fixed-size
arrays indexed by CPU. A crafted perf.data can supply arbitrary CPU
values that index past these arrays, causing out-of-bounds access.
Validate sample.cpu against min(nr_cpus_avail, MAX_NR_CPUS) in
perf_session__deliver_event() before any tool callback runs. The
cap at MAX_NR_CPUS protects fixed-size downstream arrays; the true
nr_cpus_avail is preserved in env for header parsing (e.g.
process_cpu_topology) which needs the real count.
Fall back to MAX_NR_CPUS when HEADER_NRCPUS is missing (truncated
files, pipe mode, pre-2017 perf).
Only validate when PERF_SAMPLE_CPU is set in sample_type — when
absent, evsel__parse_sample() leaves sample.cpu as (u32)-1, a
sentinel that downstream tools (script, inject) check to identify
events without CPU info. Clamping it to 0 would break those checks.
Inline evlist__parse_sample() into perf_session__deliver_event()
so the evsel lookup needed for sample_type checking reuses the same
evsel that parsed the sample, avoiding a second evlist__event2evsel()
call on every event.
For pipe-mode streams where HEADER_NRCPUS may arrive late or not at
all, the MAX_NR_CPUS fallback ensures the bounds check is still
effective against the fixed-size downstream arrays.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On 32-bit systems, sizeof(struct decomp) + decomp_len can wrap
size_t when comp_mmap_len is large. The preceding patch validates
comp_mmap_len alignment but does not cap the upper bound, so two
additions can still overflow:
1. decomp_len += decomp_last_rem: on 32-bit, adding a u64 to
size_t silently truncates, producing a corrupted decomp_len
that would bypass the subsequent overflow check and result
in an undersized buffer allocation.
2. sizeof(struct decomp) + decomp_len: the final addition could
overflow on systems with small size_t.
Add explicit overflow checks before each addition as
defense-in-depth.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add several hardening checks to the compressed event decompression
pipeline:
1. Guard against decomp_last_rem underflow: check that
decomp_last->head does not exceed decomp_last->size before
subtracting. A u64 underflow here would produce a huge
decomp_len, causing an oversized mmap allocation.
2. Validate comp_mmap_len from the HEADER_COMPRESSED feature
section: reject values that are not 4K-aligned or smaller than
4096. The downstream decompression path checks allocation
sizes against SIZE_MAX, which handles 32-bit safety.
3. Validate COMPRESSED event header size: reject events where
header.size is too small to contain the fixed struct fields,
preventing underflow in the payload size calculation.
4. Validate COMPRESSED2 event data_size: check that data_size
does not exceed the available payload (header.size minus the
fixed struct fields) for the newer compressed format.
5. Reject compressed events when the HEADER_COMPRESSED feature
is missing from the file header, which means no decompression
context was initialized.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
PERF_RECORD_COMPRESSED2 events carry a data_size field that must be
byte-swapped when reading cross-endian perf.data files. Without a
swap handler, reading COMPRESSED2 events on a different-endian machine
would misinterpret data_size as a garbage value, causing the
decompression path to read the wrong number of bytes.
The compressed payload itself is a raw byte stream and needs no
swapping.
Fixes: 208c0e16834472bb ("perf record: Add 8-byte aligned event type PERF_RECORD_COMPRESSED2")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
do_read_bitmap() reads a u64 bit count from the file and passes it
to bitmap_zalloc() without checking it against the remaining section
size. A crafted perf.data could trigger a large allocation that would
only fail later when the per-element reads exceed section bounds.
Additionally, bitmap_zalloc() takes an int parameter, so a crafted
size with bits set above bit 31 (e.g. 0x100000040) would pass the
section bounds check but truncate when passed to bitmap_zalloc(),
allocating a much smaller buffer than the subsequent read loop
expects.
Reject size values that exceed INT_MAX, and check that the data
needed (BITS_TO_U64(size) u64 values) fits in the remaining section
before allocating. Switch from bitmap_zalloc() to calloc() of u64
units so the allocation size matches the u64 read/write granularity
and avoids unsigned long vs u64 mismatch on 32-bit architectures.
Fix do_write_bitmap() to use memcpy to read u64-sized chunks from
the unsigned long bitmap, preventing out-of-bounds reads on 32-bit
systems where sizeof(unsigned long) is 4 but the bitmap is stored
in u64 units.
Fix process_mem_topology() minimum section size: the check used
nr * 2 * sizeof(u64) per node, but do_read_bitmap() reads an
additional u64 for the bitmap size, so the minimum is 3 * sizeof(u64).
Fix memory leak in process_mem_topology() error paths: replace
free(nodes) with memory_node__delete_nodes() to free per-node
bitmaps allocated by do_read_bitmap().
Currently used by process_mem_topology() for HEADER_MEM_TOPOLOGY.
Fixes: a881fc56038a ("perf header: Sanity check HEADER_MEM_TOPOLOGY")
Closes: https://lore.kernel.org/linux-perf-users/20260414224622.2AE69C19425@smtp.kernel.org/
Closes: https://lore.kernel.org/linux-perf-users/20260410223242.DD76FC19421@smtp.kernel.org/
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
read_event_desc() reads nre (event count), sz (attr size), and nr
(IDs per event) from the file and uses them to control allocations
and loops without validating them against the section size.
A crafted perf.data could trigger large allocations or many loop
iterations before __do_read() eventually rejects the reads.
Add bounds checks in read_event_desc():
- Reject sz smaller than PERF_ATTR_SIZE_VER0.
- Require at least one event (nre > 0).
- Check that nre events fit in the remaining section, using the
minimum per-event footprint of sz + sizeof(u32).
- Pre-swap attr->size to native byte order, then reject values
below PERF_ATTR_SIZE_VER0 or above sz before calling
perf_event__attr_swap() to prevent heap out-of-bounds access.
- Handle ABI0 (attr.size == 0): substitute PERF_ATTR_SIZE_VER0,
and on native-endian files write the value back so
free_event_desc() does not treat the zero as its end-of-array
sentinel (it iterates while attr.size != 0). The swap path
skips the write-back — perf_event__attr_swap() has its own
ABI0 fallback that sets VER0 after swapping.
- Check that nr IDs fit in the remaining section before allocating.
Fixes: b30b61729246 ("perf tools: Fix a problem when opening old perf.data with different byte order")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Harden feature section parsing against crafted perf.data files:
1. perf_header__process_sections() reads the feature section table
and passes each section's offset and size directly to the
processing callbacks without validating them against the actual
file size. A crafted section size would make all downstream
bounds checks against ff->size ineffective since they compare
against the untrusted, inflated bound. Add an fstat() check
with S_ISREG() guard and verify that each section's offset +
size does not extend past EOF.
2. __do_read_buf() validates reads against ff->size (section size),
but __do_read_fd() had no such check, so a malformed perf.data
with an understated section size could cause reads past the end
of the current section into the next section's data. Add the
bounds check in __do_read(), the common caller of both helpers,
so it is enforced uniformly for both the fd and buf paths.
Track the section-relative offset in __do_read_fd() so the
check works for the fd path. Reject negative sizes which on
32-bit can occur when a u32 >= 0x80000000 is passed as ssize_t.
3. do_read_string() relied on file data being null-padded. Add
explicit null-termination (buf[len-1] = '\0') after reading
and validate length (>= 1, fits within section) before
allocating, so callers like process_cpu_topology() never
receive an unterminated string.
4. Initialize feat_fd.offset to 0 (section-relative) instead of
section->offset (file-absolute) so the bounds tracking is
consistent with __do_read()'s section-relative comparison.
Adjust process_build_id() to use lseek() for its file-absolute
offset needs since it cannot rely on ff->offset for that.
5. Propagate ff->size to perf_file_section__fprintf_info() so its
reads are also bounded.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_session__read_header()
perf_session__read_header() reads f_attr.ids.size from the perf.data
file and divides it by sizeof(u64) to compute nr_ids, which is
declared as int. No validation is performed on the value before it
is used to allocate arrays and drive a read loop.
On 32-bit architectures, a crafted f_attr.ids.size of 0x100000000
(4 GB) produces nr_ids = 0x20000000, but the allocation size
1 * 0x20000000 * 8 overflows size_t to 0, so zalloc(0) returns a
valid pointer. The subsequent loop writes 0x20000000 IDs into that
zero-length buffer, corrupting the heap.
On 64-bit, the u64-to-int truncation silently drops high bits,
processing fewer IDs than the file claims. While not exploitable,
this is a data integrity issue.
Add validation before using f_attr.ids:
- Cap nr_attrs (attrs.size / attr_size) to MAX_NR_ATTRS (1 << 16)
with overflow-safe u64 comparison before assigning to int
- Reject ids.size not aligned to sizeof(u64)
- Cap ids.size / sizeof(u64) to MAX_IDS_PER_ATTR (1 << 24) to
prevent int truncation and size_t overflow on 32-bit
- Reject ids sections that extend past the end of the file,
guarded by S_ISREG() so non-regular files (block devices,
pipes) are not falsely rejected
Also fix perf_header__getbuffer64() to set errno = EIO when
readn() returns 0 (EOF). Without this, the out_errno path in
perf_session__read_header() returns -errno which is 0 (success)
on truncated files, causing downstream NULL dereferences.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_session__read_header() discards the return value from
perf_header__process_sections(), so any error from a feature
section processor (process_nrcpus, process_compressed, etc.)
is silently ignored and the session opens as if nothing went
wrong.
This defeats the validation added by subsequent commits in this
series: a crafted perf.data that fails a feature section check
would still be processed with partially-initialized state.
Check the return value and fail the session if any feature
section processor returns an error.
For truncated files (data.size == 0, i.e. recording was
interrupted before the header was finalized), skip feature
section processing entirely and clear the feature bitmap so
tools use their "feature not present" fallbacks instead of
accessing uninitialized env fields.
Change the feature processor stubs for optional libraries
(libtraceevent, libbpf) from returning -1 to returning 0,
so that perf.data files containing these features can still be
opened on builds without the optional library — the feature is
simply skipped rather than causing a fatal error.
Also propagate evlist__prepare_tracepoint_events() failure as
-ENOMEM, since the function can fail due to strdup() allocation
failure inside evsel__prepare_tracepoint_event().
Fixes: 1c0b04d12ae9 ("perf tools: Add perf_session__read_header function")
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
printing
perf_event_attr__fprintf() accessed all struct fields unconditionally,
but attrs from older perf.data files or BPF-captured syscall payloads
may have a smaller size than the current struct. Fields beyond the
recorded size contain uninitialized or zero-filled data.
Add size-guarded macros (PRINT_ATTRn, PRINT_ATTRn_bf) that compare
each field's offset against attr->size before accessing it.
Guard the bitfield block (disabled, inherit, ... defer_output) with
attr_size >= 48. These bitfields share a single __u64 at offset 40,
which is within PERF_ATTR_SIZE_VER0 for validated perf.data attrs,
but BPF-captured attrs from perf trace can have a smaller size when
the tracee passes a minimal struct to sys_perf_event_open.
Also fix the BPF trace path: when perf trace intercepts
sys_perf_event_open via BPF, the program copies PERF_ATTR_SIZE_VER0
bytes when the tracee passes size=0, but leaves the size field as 0.
Set attr->size to PERF_ATTR_SIZE_VER0 in the augmented syscall
handler so the bounds checks match the actual copied size.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
strdup(ev->unit) and strdup(ev->name) read until '\0' with no
guarantee the string is null-terminated within event->header.size.
The dump_trace fprintf path has the same problem with %s.
Validate before either path runs — same class of bug fixed for
MMAP/MMAP2/COMM/CGROUP by perf_event__check_nul().
Also harden the event_update swap handler to:
- Validate SCALE event size before swapping the double at
offset 24, which exceeds the 24-byte min_size.
- Validate CPUS event size before accessing the cpu_map
type/nr/long_size fields, which also start at the min_size
boundary.
- Swap CPUS variant fields (type, nr, long_size) so the
processing path sees native byte order.
Add validation in perf_event__process_event_update() for all
event update variants (UNIT, NAME, SCALE, CPUS) before
dump_trace or processing.
Validate CPUS nr against payload size for both PERF_CPU_MAP__CPUS
and PERF_CPU_MAP__MASK types on the fprintf (dump_trace) path:
- CPUS: check nr does not exceed available cpu entries
- MASK: check nr does not exceed available mask entries for
both mask32 (long_size == 4) and mask64 (long_size == 8)
layouts, with underflow guards on the offsetof subtraction
Fix a missing break before the default case in the CPUS
switch path.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
PERF_RECORD_BPF_METADATA has no entry in perf_event__swap_ops[], so its
nr_entries field is never byte-swapped when reading a cross-endian
perf.data file. Downstream processing in
perf_event__fprintf_bpf_metadata() loops over nr_entries, so a
foreign-endian value causes out-of-bounds reads.
Add a swap handler that byte-swaps nr_entries after validating that
header.size is large enough. The entries[] array contains only char
arrays (key/value strings), so no per-entry swap is needed — but ensure
NUL-termination on the writable cross-endian path.
Validate header.size, nr_entries, and string NUL-termination in the
common event delivery path so that native-endian files with malicious
values are also rejected. Snapshot nr_entries via READ_ONCE() before
validation — the event is on a MAP_SHARED mmap that could theoretically
change between the bounds check and the loop.
Changes in v2:
- Snapshot event->header.size via READ_ONCE() into a local variable
to prevent a double-fetch underflow in the max_entries calculation
(Reported-by: sashiko-bot@kernel.org)
- Write back clamped nr_entries to the event on the swap path,
consistent with NAMESPACES and STAT_CONFIG handlers — without
writeback the native path sees the inflated nr and skips the
event entirely (Reported-by: sashiko-bot@kernel.org)
Fixes: ab38e84ba9a8 ("perf record: collect BPF metadata from existing BPF programs")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix four issues in PERF_RECORD_AUXTRACE_ERROR handling:
1. auxtrace_error_name() takes a signed int parameter, but e->type
is __u32. A crafted value like 0xFFFFFFFF converts to -1, passes
the bounds check, and causes a negative array index. Fix by
changing the parameter to unsigned int.
2. The msg field is printed via %s without a length bound. The
min_size table only guarantees fields up to msg (offset 48), so
a truncated event has zero msg bytes within the event boundary.
Compute the available msg length from header.size, cap at
sizeof(e->msg), and use %.*s.
3. fmt >= 2 adds machine_pid and vcpu fields after msg[64]. Older
files may have fmt >= 2 but an event size that doesn't include
these fields. Add a size check in the swap handler to downgrade
fmt before the conditional field access, and a matching size
guard in the fprintf path for native-endian events (which are
mmap'd read-only and can't be modified in place).
4. python_process_auxtrace_error() had the same issues: msg was
passed to tuple_set_string() unbounded, and machine_pid/vcpu
were accessed unconditionally without checking fmt or event
size. Apply the same bounds checks.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
cpu_map__from_range() computes nr_cpus as end_cpu - start_cpu + 1.
When a crafted perf.data has start_cpu > end_cpu, this wraps to a
huge value, causing perf_cpu_map__empty_new() to attempt a massive
allocation.
Return NULL when the range is inverted.
Also clamp any_cpu to boolean (0 or 1) since it is added to the
allocation count — a crafted value > 1 would inflate the map size.
Harden cpu_map__from_mask() to reject unsupported long_size values
(anything other than 4 or 8), preventing misinterpretation of the
mask data layout.
Snapshot mmap'd fields via READ_ONCE() into locals to prevent
TOCTOU re-reads — the data pointer references MAP_SHARED mmap'd
memory that could theoretically change between reads on a
FUSE-backed file:
- cpu_map__from_range(): snapshot start_cpu, end_cpu, any_cpu
- cpu_map__from_entries(): snapshot nr and each cpu[i] element
- cpu_map__from_mask(): snapshot long_size (before validation,
closing the check-then-read gap), mask_nr
- perf_record_cpu_map_data__read_one_mask(): add u16 long_size
parameter so callers pass the validated copy instead of
re-reading data->mask32_data.long_size from mmap'd memory
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_header__read_build_ids() swaps the event header fields for cross-endian
perf.data files but not bev.pid. This causes perf_session__findnew_machine()
to look up the wrong machine for guest VM build IDs, misattributing them.
Swap bev.pid alongside the header fields.
Also add a build_id_swap callback for stream-mode build ID events,
and validate NUL-termination of build_id.filename on the native-endian
delivery path (perf_session__process_user_event) — events with
unterminated filenames are skipped.
Harden perf_header__read_build_ids() against crafted perf.data files:
- Add overflow check on offset + size to prevent wrap past ULLONG_MAX.
- Reject bev.header.size == 0 which would loop forever.
- Reject bev.header.size > remaining section to prevent reading past
the section boundary.
- Guard memcmp(filename, "nel.kallsyms]", 13) with len >= 13 to avoid
reading uninitialized stack memory on short filenames.
- Force NUL-termination of filename before passing it to functions
like machine__findnew_dso() that use strlen/strcmp.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
paths
Several event types use an nr field to control iteration over
variable-length arrays. The swap handlers byte-swap and loop using
these fields without bounds checks, and the native processing path
trusts them as well.
Add bounds checks on both paths for:
- PERF_RECORD_THREAD_MAP: validate nr against payload, return -1
on the swap path. On the native path, reject with -EINVAL.
- PERF_RECORD_NAMESPACES: clamp nr on the swap path (safe because
each entry is indexed by type; missing entries just won't be
resolved). Skip the event on the native path.
- PERF_RECORD_CPU_MAP: clamp nr for CPUS and MASK sub-types on
the swap path. Add bounds checks for mask64 which previously
had no nr validation. Skip the event on the native path.
- PERF_RECORD_STAT_CONFIG: clamp nr on the swap path (safe because
each config entry is self-describing via its tag). Skip the
event on the native path.
The swap path (cross-endian, writable MAP_PRIVATE mapping) can
safely clamp by writing back to the event. The native path
(read-only MAP_SHARED mapping) must skip instead of clamping
because writing to the mmap'd event would segfault.
Also fix stat_config swap range: change size += 1 to
size += sizeof(event->stat_config.nr) for clarity. The old +1
happened to work because mem_bswap_64 processes 8-byte chunks,
but the intent is to include the 8-byte nr field in the swap
range.
Changes in v2:
- Document that PERF_RECORD_NAMESPACES max_nr includes trailing
sample_id space when sample_id_all is present — harmless on the
swap path because both per-element bswap_64 and swap_sample_id_all()
perform the same u64 byte swap (Reported-by: sashiko-bot@kernel.org)
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Harden PERF_RECORD_HEADER_ATTR handling against crafted perf.data:
- Validate attr.size: must be >= PERF_ATTR_SIZE_VER0, a multiple
of sizeof(u64), and fit within the event payload.
- Copy only min(attr.size, sizeof(struct perf_event_attr)) bytes
into a local attr, zeroing the rest so legacy files don't leak
adjacent event data into new fields.
- Keep the original attr.size so perf_event__synthesize_attr()
uses it for both allocation and ID-array placement.
Fix perf_event__synthesize_attr() to use attr->size (not the
compiled sizeof) for event allocation and layout, so perf inject
correctly re-synthesizes attrs from files recorded by a different
perf version. Without this, the ID array destination pointer
(computed via perf_record_header_attr_id()) would be inconsistent
with the allocation when attr->size differs from sizeof.
Also fix the parse-no-sample-id-all test to set attr.size, which
is now validated, and improve error handling in read_attr() for
short reads and invalid attr sizes.
Handle ABI0 pipe/inject events where attr.size is 0: use a local
attr_size variable set to PERF_ATTR_SIZE_VER0 for both the bounded
copy and ID array position, instead of writing back to the event.
Native-endian files may be MAP_SHARED (read-only mmap), so writing
to the event buffer would SIGSEGV. The swap path handles ABI0 in
perf_event__attr_swap() which writes to the MAP_PRIVATE copy.
header.size alignment is now validated centrally in
perf_session__process_event() (see "Add minimum event size and
alignment validation").
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
session->time_conv = event->time_conv copies sizeof(struct
perf_record_time_conv) bytes unconditionally, but older kernels
emit shorter TIME_CONV events without the time_cycles, time_mask,
cap_user_time_zero, and cap_user_time_short fields.
For a 32-byte event (the original format), this reads 24 bytes
past the event boundary into adjacent mmap'd data. The garbage
values end up in session->time_conv and can cause incorrect TSC
conversion if cap_user_time_zero happens to be non-zero.
Replace the struct assignment with a bounded memcpy capped at
event->header.size, zeroing the remainder so extended fields
default to off when absent.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Change swap callbacks from void to int return so handlers can
propagate errors. All 28 existing handlers are converted to
return 0 on success, -1 on error. Three new handlers (KSYMBOL,
BPF_EVENT, HEADER_FEATURE) are added returning int from the
start, with sample_id_all handling for the kernel event types.
event_swap() propagates the return to its callers (process_event
and peek_event), which skip events that fail to swap.
Add perf_event__check_nul() for null-termination enforcement
on the common event delivery path for MMAP, MMAP2, COMM,
CGROUP, and KSYMBOL events. Events with
unterminated strings are skipped — native-endian files are
mapped read-only, so writing a NUL byte in place would segfault.
Swap handler hardening:
- Use strnlen bounded by event size (instead of strlen) in
COMM/MMAP/MMAP2/CGROUP swap handlers, returning -1 on
unterminated strings.
- Bounds check text_poke old_len+new_len before computing the
sample_id offset, returning -1 on overflow. Use offsetof()
for the native-path check in machines__deliver_event() since
sizeof() includes struct padding past the flexible array.
- Fix PERF_RECORD_SWITCH sample_id_all: non-CPU_WIDE SWITCH
events have sample_id immediately after the 8-byte header,
not at sizeof(struct perf_record_switch) which is the
CPU_WIDE variant size.
- Fix perf_event__time_conv_swap(): decouple time_cycles and
time_mask into independent per-field event_contains() checks,
so each field is only swapped when the event is large enough
to contain it. The original code guarded both fields under
a single time_cycles check, which would swap time_mask on a
short event that contains time_cycles but not time_mask.
- Handle ABI0 (attr.size == 0) in perf_event__attr_swap()
by substituting PERF_ATTR_SIZE_VER0, so bswap_safe()
correctly swaps VER0 fields instead of skipping everything.
- peek_events: on swap failure, advance past the malformed
entry instead of aborting the loop.
Note: the nr-field bounds checks for namespaces, thread_map,
cpu_map, and stat_config arrays are added by a subsequent
patch ("perf session: Validate nr fields against event size
on both swap and common paths"). The HEADER_ATTR attr.size
validation is added by ("perf session: Validate HEADER_ATTR
attr.size before swapping").
By establishing the int-returning swap infrastructure first,
all subsequent hardening patches can use direct error returns
from day one — no poison values, no workarounds for void return.
Changes in v2:
- peek_events: abort instead of skip for AUXTRACE events on
validation failure — skipping only header.size would land
inside the raw trace payload, causing subsequent iterations
to misparse data as events (Reported-by: sashiko-bot@kernel.org)
Fixes: 9aa0bfa370b2 ("perf tools: Handle PERF_RECORD_KSYMBOL")
Fixes: 45178a928a4b ("perf tools: Handle PERF_RECORD_BPF_EVENT")
Fixes: e9def1b2e74e ("perf tools: Add feature header record to pipe-mode")
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
swap_sample_id_all() calls BUG_ON(size % sizeof(u64)) which kills
perf on any event where the sample_id_all tail is not 8-byte aligned.
A crafted perf.data can trigger this trivially.
Replace BUG_ON with a bounds check: skip the swap if the data pointer
is past the end of the event, and only swap when there are bytes
remaining.
Note: the strlen calls in string-field swap handlers (comm,
mmap, mmap2, cgroup) are replaced with bounded strnlen by the
next patch in this series ("perf session: Add validated swap
infrastructure with null-termination checks").
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The kernel dynamically sizes PERF_RECORD_READ based on
attr.read_format: only the fields enabled by PERF_FORMAT_TOTAL_TIME_ENABLED,
PERF_FORMAT_TOTAL_TIME_RUNNING, PERF_FORMAT_ID, and PERF_FORMAT_LOST
are emitted, packed with no gaps.
perf_event__read_swap() unconditionally byte-swapped time_enabled,
time_running, and id at their fixed struct offsets, causing
out-of-bounds access on smaller events and swapping the wrong
bytes when not all format fields are present. It also swapped
sample_id_all at a fixed offset past the full struct, which is
wrong for shorter events.
Replace the individual field swaps with a single mem_bswap_64()
over the entire tail from value onward. Since every field after
pid/tid is u64 regardless of which combination is present, this
correctly handles any read_format combination and any trailing
sample_id_all fields.
Similarly, dump_read() accessed optional fields via fixed struct
offsets, displaying values from wrong positions when not all
format bits are set. Walk the packed u64 array sequentially
instead, with bounds checks against event->header.size.
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
zstd_decompress_stream() has two bugs in its multi-iteration loop:
1. After each ZSTD_decompressStream() call, the code advances
output.dst by output.pos but doesn't reset output.pos to 0.
ZSTD interprets output.pos relative to output.dst, so the
next iteration writes at (dst + pos) + pos = dst + 2*pos,
skipping a gap and potentially writing out of bounds.
2. On ZSTD_decompressStream() error, the loop executes break
and returns output.pos (which is > 0 if some bytes were
decompressed before the error). The caller checks
!decomp_size and skips the error, silently accepting
truncated or corrupted data.
Fix both by removing the output buffer adjustment — ZSTD
correctly accumulates output.pos across calls without it.
Return 0 on decompression error so the caller detects it.
Add a no-progress guard to prevent infinite loops if the
output buffer fills before all input is consumed.
Note: the compressed event data_size is validated against
header.size by a subsequent patch in this series
("perf tools: Harden compressed event processing").
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The error fallback does memcpy(dst, src, src_size) intending to store
uncompressed data when compression fails, but this has three bugs:
1. dst has been advanced past the record header (and potentially
past earlier compressed records), so the copy writes to the
wrong offset in the output buffer.
2. src still points to the start of the input, not to the
remaining uncompressed data at src + input.pos. On a second
or later iteration, previously compressed data would be
duplicated.
3. No check that dst_size >= src_size — if the remaining output
space is smaller, this is an out-of-bounds write.
Replace with return -1 after resetting the ZSTD compression
context via ZSTD_initCStream(). The -1 propagates through
zstd_compress() -> record__pushfn() -> perf_mmap__push() to the
recording loop, which breaks out and terminates recording.
Add an out_child_no_flush label in __cmd_record() so the
mmap-read failure path skips the final record__mmap_read_all()
flush — retrying the same read that just failed would just fail
again, and the flush is only useful when the mmap data is intact
but the control path (auxtrace, switch_output) had an error.
Consolidate all error paths through a single 'reset' label to
ensure the compression context is always reset on failure —
including the output-buffer-full path, where a bare return
without resetting would leave stale stream state that corrupts
output if the caller retries.
Also guard against process_header() writing the event header
before the buffer-full check: add a sizeof(perf_event_header)
pre-check so the callback never writes past the output buffer.
Guard against ZSTD making no progress: if output.pos is zero
after ZSTD_compressStream(), calling process_header(record, 0)
would re-trigger header initialization, double-subtracting the
header size from dst_size and underflowing the unsigned counter.
Also fix two pre-existing issues in the same function:
- Add a dst_size guard before subtracting the record header
size: if the output buffer is nearly full, the unsigned
dst_size -= size underflows to a huge value, causing
ZSTD_compressStream to write past the buffer boundary.
- Check the ZSTD_initCStream() return value and log an error
if the context reset itself fails.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
event_contains() checked whether a field's start offset was within
the event (header.size > offsetof), but not whether the full field
fit. A crafted event with header.size = offsetof(field) + 1 would
pass the check, but an 8-byte access (bswap_64, direct read) would
overrun the event boundary by up to 7 bytes.
Fix the macro to verify the complete field:
header.size >= offsetof(field) + sizeof(field)
Also update all callers that check event_contains(time_cycles) but
access later fields (time_mask, cap_user_time_zero,
cap_user_time_short) to check for cap_user_time_short — the last
field accessed — so the entire extended block is verified:
tsc.c, arm-spe.c, cs-etm.c, jitdump.c.
Note: session.c's perf_event__time_conv_swap() also guards on
time_cycles but accesses time_mask — a pre-existing issue not
introduced by this macro change. It is fixed by a later patch
in this series ("perf session: Add validated swap
infrastructure with null-termination checks"), which decouples
time_cycles and time_mask into independent per-field
event_contains() checks. The struct assignment overread
(session->time_conv = event->time_conv copies sizeof on a
potentially shorter event) is separately fixed by "perf
session: Use bounded copy for PERF_RECORD_TIME_CONV".
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_session__peek_event() computes an event pointer directly from
file_offset when one_mmap is active, without verifying that file_offset
and the subsequent event->header.size fall within the mapped region.
A corrupted perf.data file could cause out-of-bounds memory reads.
Add one_mmap_size to the session struct and validate both the header
and full event fit within the mmap before dereferencing.
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a per-type minimum size table (perf_event__min_size[]) and
enforce it before swap and processing, so that both cross-endian
and native-endian paths are protected from accessing fields past
the event boundary.
The table uses offsetof() for types with trailing variable-length
fields (filenames, strings, msg arrays) and sizeof() for
fixed-size types. Zero entries mean no minimum beyond the 8-byte
header already enforced by the reader.
Undersized events are skipped with a warning in process_event
and rejected in peek_event — both checked before the swap
handler runs, preventing OOB access on crafted event fields.
Also reject events whose header.size is not 8-byte aligned. The
kernel aligns all event sizes to sizeof(u64) — see
perf_event_comm_event() (ALIGN), perf_event_mmap_event(),
perf_event_cgroup(), perf_event_ksymbol() (IS_ALIGNED loops),
and perf_event_text_poke() (ALIGN) in kernel/events/core.c.
An unaligned size means the file is corrupted or crafted; reject
early so downstream code that divides by sizeof(u64) to compute
array element counts gets exact results.
Three legacy user events are exempted from the alignment check:
TRACING_DATA (66) had a 12-byte struct before commit b39c915a4f36
("libperf event: Ensure tracing data is multiple of 8 sized")
added padding, COMPRESSED (81) carries raw ZSTD output (already
superseded by COMPRESSED2 with PERF_ALIGN), and HEADER_FEATURE
(80) uses do_write_string() with a 4-byte length prefix.
Also guard event_swap() against crafted event types >=
PERF_RECORD_HEADER_MAX to prevent OOB reads on the
perf_event__swap_ops[] array.
Changes in v2:
- Fix double-skip for unsupported event types: return 0 instead
of event->header.size in perf_session__process_event() for
HEADER_MAX, since reader__read_event() already advances by
event->header.size (Reported-by: sashiko-bot@kernel.org)
- Exempt TRACING_DATA, COMPRESSED, and HEADER_FEATURE from the
alignment check — these legacy user events predate the 8-byte
alignment rule (Reported-by: sashiko-bot@kernel.org)
- peek_event: return 0 (skip) for unknown event types instead of
-1 (error), consistent with process_event which already skips
unsupported types gracefully (Reported-by: sashiko-bot@kernel.org)
Reported-by: sashiko-bot@kernel.org # Running on a local machine
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
|
|
# Conflicts:
# fs/fuse/dev.c
|
|
|
|
* acpica: (27 commits)
ACPICA: add boundary checks in two places
ACPICA: Add package limit checks in parser functions
ACPICA: Update version to 20260408
ACPICA: Update the copyright year to 2026
ACPICA: Remove spurious precision from format used to dump parse trees
ACPICA: Enhance OEM ID and Table ID validation in acpi_ex_load_table_op()
ACPICA: Fix NULL pointer dereference in acpi_ns_custom_package()
ACPICA: Enhance buffer validation in acpi_ut_walk_aml_resources()
ACPICA: Add validation for node in acpi_ns_build_normalized_path()
ACPICA: validate handler object type in two places
ACPICA: Improve argument parsing in acpi_ps_get_next_simple_arg()
ACPICA: Fix integer overflow in acpi_ex_opcode_3A_1T_1R() (mid_op)
ACPICA: Prevent adding invalid references
ACPICA: add boundary checks in acpi_ps_get_next_field()
ACPICA: validate byte_count in acpi_ps_get_next_package_length()
ACPICA: Fix use-after-free in acpi_ds_terminate_control_method()
ACPICA: fix I2C LVR item count in the conversion table
ACPICA: Mention the LVR bits
ACPICA: Change LVR to 8 bit value
ACPICA: Fetch LVR I2C resource descriptor
...
|
|
* pm-sleep:
PM: hibernate: Use flexible array for CRC uncompressed buffers
PM: hibernate: make LZ4 available for hibernation compression
PM: sleep: Use complete() in device_pm_sleep_init()
PM: hibernate: call preallocate_image() after freeze prepare
* pm-powercap:
powercap: intel_rapl: Fix memory leak in rapl_add_package_cpuslocked()
* pm-tools:
PM: tools: pm-graph: fix ValueError when parsing incomplete device properties
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
# Conflicts:
# fs/exfat/file.c
|
|
|
|
The kmem_cache_alloc_bulk return value is weird. It returns the number
of allocated objects, but that must always be 0 or the requested number
based on the implementations and the handling in the callers, but that
assumption is not actually documented anywhere, which confuses automated
review tools.
Fix this by returning a bool if the allocation succeeded and adding a
kerneldoc comment explaining the API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> # skbuff
Link: https://patch.msgid.link/20260528093437.2519248-2-hch@lst.de
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
|
|
slab->partial is assigned by get_obj("partial") and then immediately
overwritten by get_obj_and_str("partial", &t). Remove the first
redundant assignment.
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
Link: https://patch.msgid.link/20260518062159.80664-4-wangxuewen@kylinos.cn
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
|
|
The assignment `x = NULL` sets the local parameter variable instead of
`*x`, which is a no-op since `*x` was already set to NULL on the line
above. Remove the dead assignment.
Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
Reviewed-by: SeongJae Park <sj@kernel.org>
Link: https://patch.msgid.link/20260518062159.80664-3-wangxuewen@kylinos.cn
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
|
|
The disable trace path in slab_debug() had a logic error where it would
set trace=1 instead of trace=0. This made trace functionality permanently
enabled once turned on for any slab cache.
Fixes: a87615b8f9e2 ("SLUB: slabinfo upgrade")
Cc: stable@vger.kernel.org
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
WARNING: From:/Signed-off-by: email address mismatch: 'From: wangxuewen <18810879172@163.com>' != 'Signed-off-by: wangxuewen <wangxuewen@kylinos.cn>'
Link: https://patch.msgid.link/20260518062159.80664-2-wangxuewen@kylinos.cn
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
|
|
# New commits in objtool/core:
2d3bb398861a ("objtool/klp: Cache dont_correlate() result")
fe6a87e0abac ("objtool: Improve and simplify prefix symbol detection")
f7ceffd21a8a ("objtool/klp: Fix kCFI prefix finding/cloning")
fc0bb9915bce ("objtool: Grow __cfi_* prefix symbols for all CFI+CALL_PADDING")
cca84cb12908 ("objtool/klp: Fix position-dependent checksums for non-relocated jumps/calls")
3ee67629b2b7 ("objtool: Add insn_sym() helper")
5d6a03eeb717 ("objtool/klp: Add correlation debugging output")
6016dd33a10a ("objtool/klp: Rewrite symbol correlation algorithm")
873a2208ea31 ("objtool/klp: Calculate object checksums")
225d16dd510d ("klp-build: Validate short-circuit prerequisites")
3b8e56b86faa ("objtool/klp: Remove "objtool --checksum"")
d4888d58041d ("klp-build: Use "objtool klp checksum" subcommand")
e10764614ad6 ("objtool/klp: Add "objtool klp checksum" subcommand")
a5b661233262 ("objtool: Consolidate file decoding into decode_file()")
30cae58cdc13 ("objtool/klp: Extricate checksum calculation from validate_branch()")
6282e9f46b4f ("objtool: Add is_cold_func() helper")
8eebd5731133 ("objtool: Add is_alias_sym() helper")
ff0cf5efef40 ("objtool/klp: Handle Clang .data..Lanon anonymous data sections")
9e4512d7de5a ("objtool/klp: Create empty checksum sections for function-less object files")
ac999926774a ("objtool: Include libsubcmd headers directly from source tree")
8d4cbb6d0caf ("objtool/klp: Don't set sym->file for section symbols")
b6480aaedf3c ("klp-build: Remove redundant SRC and OBJ variables")
e950d2a10a30 ("klp-build: Print "objtool klp diff" command in verbose mode")
df0d7bb04a27 ("klp-build: Reject patches to realmode")
d8c3e262361b ("klp-build: Reject patches to vDSO")
f3048888ea62 ("klp-build: Fix patch cleanup on interrupt")
96524543740e ("klp-build: Suppress excessive fuzz output by default")
b3ece3019e8e ("klp-build: Validate patch file existence")
946d3510fe19 ("klp-build: Don't use errexit")
ba77fe55781a ("klp-build: Fix checksum comparison for changed offsets")
cc39ccce7d5b ("klp-build: Fix hang on out-of-date .config")
a375e327b63e ("objtool: Fix reloc hash collision in find_reloc_by_dest_range()")
5f49ec82b9f6 ("objtool/klp: Fix reloc corruption in convert_reloc_sym_to_secsym()")
51e1dfce24c8 ("objtool/klp: Don't correlate .rodata.cst* constant pool objects")
d5b0f025281f ("objtool/klp: Fix pointer comparisons for rodata objects")
8fdc3585b3b0 ("objtool/klp: Simplify reloc symbol conversion")
3e01ab44af20 ("objtool: Move mark_rodata() to elf.c")
3787e82a4e3a ("objtool/klp: Fix relocation conversion failures for R_X86_64_NONE")
da4326573ae8 ("objtool/klp: Fix kCFI trap handling")
62a7a01fde87 ("objtool/klp: Fix extraction of text annotations for alternatives")
479ac5260e7e ("objtool/klp: Fix XXH3 state memory leak")
98377f3ba7c0 ("objtool/klp: Fix cloning of zero-length section symbols")
c4c02d4450b5 ("objtool/klp: Fix handling of zero-length .altinstr_replacement sections")
def5b60dcd22 ("objtool/klp: Fix --debug-checksum for duplicate symbol names")
0333b7399587 ("objtool: Replace iterator callback with for_each_sym_by_mangled_name()")
3de711fba73a ("objtool/klp: Fix create_fake_symbols() skipping entsize-based sections")
e872b3f13922 ("objtool/klp: Improve local label check")
76eb0f8639fb ("objtool/klp: Don't report uncorrelated functions as new")
0a7823d1d70d ("objtool/klp: Don't correlate __initstub__ symbols")
710c4c254688 ("objtool/klp: Don't correlate absolute symbols")
8edec016255d ("objtool/klp: Don't correlate __ADDRESSABLE() symbols")
ff529864e738 ("objtool/klp: Fix .data..once static local non-correlation")
84c304a534b8 ("objtool/klp: Fix is_uncorrelated_static_local() for Clang")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in locking/core:
88331c4ec23a ("seqlock: Allow UBSAN_ALIGNMENT to fail optimizing")
a9e4e50519e9 ("locking/rtmutex: Annotate API and implementation")
03240f5de2dd ("selftests/membarrier: Add rseq stress test for CFS throttle interactions")
a5959728548c ("sched/membarrier: Modernize membarrier_global_expedited with cleanup guards")
89976cd73739 ("sched/membarrier: Use per-CPU mutexes for targeted commands")
b00192d78bb4 ("locking/barrier: Use correct parameter names")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
# New commits in timers/merge:
3eb4923e6851 ("clocksource: Add devm_clocksource_register_*() helpers")
c8d32a0389fb ("timers: Fix flseep() typo in kernel-doc comment")
5d330d652d7a ("hrtimer: Fix the bogus return type of __hrtimer_start_range_ns()")
3af1f49f415d ("hrtimer: Return ktime_t from hrtimer_get_next_event()/hrtimer_next_event_without()")
33d4bfc49613 ("clocksource: Clean up clocksource_update_freq() functions")
ed3b3c497668 ("alarmtimer: Remove stale return description from alarm_handle_timer()")
b00385b8d081 ("selftests/posix_timers: Use CLOCK_THREAD_CPUTIME_ID for ITIMER_PROF measurements")
cab0cd0130eb ("scripts/timers: Add timer_migration_tree.py")
5a7dfbcbbdb6 ("timers/migration: Handle capacity in connect tracepoints")
098cbaad8e57 ("timers/migration: Split per-capacity hierarchies")
3ba25488380f ("timers/migration: Track CPUs in a hierarchy")
ff65875f80d1 ("timers/migration: Abstract out hierarchy to prepare for CPU capacity awareness")
ed78a7019419 ("alarmtimer: Remove unused interfaces")
12e4311aa5b2 ("netfilter: xt_IDLETIMER: Switch to alarm_start_timer()")
9fa2e38ab749 ("power: supply: charger-manager: Switch to alarm_start_timer()")
7dda99952ced ("fs/timerfd: Use the new alarm/hrtimer functions")
f4b58f61da79 ("alarmtimer: Convert posix timer functions to alarm_start_timer()")
183d00b72713 ("alarmtimer: Provide alarm_start_timer()")
acc071343d29 ("posix-timers: Switch to hrtimer_start_expires_user()")
cfb7fe3fdd4c ("posix-timers: Handle the timer_[re]arm() return value")
6fdb2677a594 ("posix-timers: Expand timer_[re]arm() callbacks with a boolean return value")
b40c927345a9 ("hrtimer: Use hrtimer_start_expires_user() for hrtimer sleepers")
bd5956166d20 ("hrtimer: Provide hrtimer_start_range_ns_user()")
68ed094971b0 ("clocksource/drivers/timer-of: Make the code compatible with modules")
2423405880c2 ("clocksource/drivers/mmio: Make the code compatible with modules")
fed9f727cc3f ("clocksource/drivers/sun5i: Handle error returns from devm_reset_control_get_optional_exclusive()")
045a9dac7eb7 ("clocksource/drivers/timer-rtl-otto: Make rttm_cs variable static")
b385caf91868 ("dt-bindings: timer: fsl,imxgpt: add compatible string fsl,imx25-epit")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The only user of msg->msg_iocb was AF_ALG, but that's deprecated.
It can be removed entirely at the cost of only supporting synchronous
operations. This doesn't break userspace, which will silently block
(for a bounded amount of time) in io_submit instead of operating
asynchronously.
This also makes struct msghdr smaller, helping every other caller of
sendmsg().
Signed-off-by: Demi Marie Obenour <demiobenour@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
fuse_test.c: In function 'sealing_thread_fn':
fuse_test.c:165:13: warning: unused variable 'sig' [-Wunused-variable]
165 | int sig, r;
| ^~~
Remove unused 'sig' to fix -Wunused-variable warning.
Link: https://lore.kernel.org/20260524193732.48853-3-eva.kurchatova@virtuozzo.com
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "selftests/memfd: fix compilation warnings".
This patchset fixes warnings about unused but initialized variables, and
unused dummy buffer passed to pwrite() syscall in the tests.
This patch (of 2):
memfd_test.c: In function 'mfd_fail_grow_write.part.0':
memfd_test.c:685:13: warning: '<unknown>' may be used uninitialized
[-Wmaybe-uninitialized]
685 | l = pwrite(fd, buf, mfd_def_size * 8, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pwrite() is declared with attribute 'access (read_only, 2, 3)', so GCC
knows it reads from the buffer. malloc() returns uninitialized memory,
hence the warning. Use calloc() to zero-initialize the buffer. The
actual contents don't matter here since the test verifies that pwrite()
fails on a sealed memfd.
Link: https://lore.kernel.org/20260524193732.48853-1-eva.kurchatova@virtuozzo.com
Link: https://lore.kernel.org/20260524193732.48853-2-eva.kurchatova@virtuozzo.com
Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com>
Signed-off-by: Eva Kurchatova <eva.kurchatova@virtuozzo.com>
Cc: Aristeu Rozanski <aris@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a comment explaining that every other entry in the list is unmapped to
intentionally create fragmentation with locked pages before invoking
check_compaction().
Link: https://lore.kernel.org/da5e0a8d5152e54152c0d2f456aac2fac35af291.1779296493.git.sayalip@linux.ibm.com
Fixes: bd67d5c15cc1 ("Test compaction of mlocked memory")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
output for memory-failure category
run_vmtests.sh contains special handling to ensure the hwpoison_inject
module is available for the memory-failure tests. This logic was
implemented outside of run_test(), making the setup category-specific but
managed globally.
Move the hwpoison_inject handling into run_test() and restrict it
to the memory-failure category so that:
1. the module is checked and loaded only when memory-failure tests run,
2. the test is skipped if the module or the debugfs interface
(/sys/kernel/debug/hwpoison/) is not available.
3. the module is unloaded after the test if it was loaded by the script.
This localizes category-specific setup and makes the test flow
consistent with other per-category preparations.
While updating this logic, fix the module availability check.
The script previously used:
modprobe -R hwpoison_inject
The -R option prints the resolved module name to stdout, causing every
run to print:
hwpoison_inject
in the test output, even when no action is required, introducing
unnecessary noise.
Replace this with:
modprobe -n hwpoison_inject
which verifies that the module is loadable without producing output,
keeping the selftest logs clean and consistent.
Also, ensure that skipped tests do not override a previously recorded
failure. A skipped test currently sets exitcode to ksft_skip even if a
prior test has failed, which can mask failures in the final exit status.
Update the logic to only set exitcode to ksft_skip when no failure has
been recorded.
Link: https://lore.kernel.org/93441f34f7ef5add47d1a130d03daa79e21b5050.1779296493.git.sayalip@linux.ibm.com
Fixes: ff4ef2fbd101 ("selftests/mm: add memory failure anonymous page test")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When nr_pages_per_cpu evaluates to zero, the test is skipped by printing a
message and returning KSFT_SKIP manually.
Replace this with ksft_exit_skip(), which prints the skip message and
exits with the correct skip status in a single helper, making the code
consistent with other selftests.
Link: https://lore.kernel.org/88202b56-1dc5-43e2-9d1f-a0823a9531f0@linux.ibm.com
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
uffd-stress currently fails when the computed nr_pages_per_cpu evaluates
to zero:
nr_pages_per_cpu = bytes / page_size / nr_parallel
This can occur on systems with large hugepage sizes (e.g. 1GB) and a high
number of CPUs, where the total allocated memory is sufficient overall but
not enough to provide at least one page per cpu.
In such cases, the failure is due to insufficient test resources rather
than incorrect kernel behaviour. Update the test to treat this condition
as a test skip instead of reporting an error.
Link: https://lore.kernel.org/0707e9a0f1b3dd904c4a069b91db317f9c160faa.1779296493.git.sayalip@linux.ibm.com
Fixes: db0f1c138f18 ("selftests/mm: print some details when uffd-stress gets bad params")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The uffd-wp-mremap test requires the UFFD_FEATURE_PAGEFAULT_FLAG_WP
capability. On systems where userfaultfd write-protect is not supported,
uffd_register() fails and the test reports failures.
Check for the required feature at startup and skip the test when the
UFFD_FEATURE_PAGEFAULT_FLAG_WP capability is not present, preventing false
failures on unsupported configurations.
Before patch:
running ./uffd-wp-mremap
------------------------
[INFO] detected THP size: 256 KiB
[INFO] detected THP size: 512 KiB
[INFO] detected THP size: 1024 KiB
[INFO] detected THP size: 2048 KiB
[INFO] detected hugetlb page size: 2048 KiB
[INFO] detected hugetlb page size: 1048576 KiB
1..24
[RUN] test_one_folio(size=65536, private=false, swapout=false,
hugetlb=false)
not ok 1 uffd_register() failed
[RUN] test_one_folio(size=65536, private=true, swapout=false,
hugetlb=false)
not ok 2 uffd_register() failed
[RUN] test_one_folio(size=65536, private=false, swapout=true,
hugetlb=false)
not ok 3 uffd_register() failed
[RUN] test_one_folio(size=65536, private=true, swapout=true,
hugetlb=false)
not ok 4 uffd_register() failed
[RUN] test_one_folio(size=262144, private=false, swapout=false,
hugetlb=false)
not ok 5 uffd_register() failed
[RUN] test_one_folio(size=524288, private=false, swapout=false,
hugetlb=false)
not ok 6 uffd_register() failed
.
.
.
Bail out! 24 out of 24 tests failed
Totals: pass:0 fail:24 xfail:0 xpass:0 skip:0 error:0
[FAIL]
not ok 1 uffd-wp-mremap # exit=1
After patch:
running ./uffd-wp-mremap
------------------------
1..0 # SKIP uffd-wp feature not supported
[SKIP]
ok 1 uffd-wp-mremap # SKIP
Link: https://lore.kernel.org/c3c5af76d71d5f4446f773f4de94882efc33ebe4.1779296493.git.sayalip@linux.ibm.com
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The hugetlb-mremap selftest reserves the destination address using a
anonymous base-page mapping before calling mremap() with MREMAP_FIXED,
while the source region is hugetlb-backed.
When remapping a hugetlb mapping into a base-page VMA may fail with:
mremap: Device or resource busy
This is observed on powerpc hash MMU systems where slice constraints and
page size incompatibilities prevent the remap.
Ensure the destination region is created using MAP_HUGETLB so that both
source and destination VMAs are hugetlb-backed and compatible.
Update the FLAGS macro to include MAP_HUGETLB | MAP_SHARED so that both
mappings are hugetlb-backed and compatible. Also use the macro for the
mmap() calls to avoid repeating the flag combination.
This ensures the test reliably exercises hugetlb mremap instead of failing
due to VMA type mismatch.
Link: https://lore.kernel.org/367644df45c65098f23e3945c6a80f4b8a8964a6.1779296493.git.sayalip@linux.ibm.com
Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Previously, register_region_with_uffd() created a new anonymous mapping
and overwrote the address supplied by the caller before registering the
range with userfaultfd.
As a result, userfaultfd was applied to an unrelated anonymous mapping
instead of the hugetlb region used by the test.
Remove the extra mmap() and register the caller-provided address range
directly using UFFDIO_REGISTER_MODE_MISSING, so that faults are generated
for the hugetlb mapping used by the test.
This ensures userfaultfd operates on the actual hugetlb test region and
validates the expected fault handling.
Before patch:
running ./hugetlb-mremap
-------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Address returned by mmap() = 0x7fff9d000000
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
ok 1 Read same data
Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
[PASS]
ok 1 hugetlb-mremap
After patch:
running ./hugetb-mremap
-------------------------
TAP version 13
1..1
Map haddr: Returned address is 0x7eaa40000000
Map daddr: Returned address is 0x7daa40000000
Map vaddr: Returned address is 0x7faa40000000
Registered memory at address 0x7eaa40000000 with userfaultfd
Mremap: Returned address is 0x7faa40000000
First hex is 0
First hex is 3020100
ok 1 Read same data
Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
[PASS]
ok 1 hugetlb-mremap
Link: https://lore.kernel.org/13845da872ed174316173e8996dbb5f181994017.1779296493.git.sayalip@linux.ibm.com
Fixes: 12b613206474 ("mm, hugepages: add hugetlb vma mremap() test")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
split_huge_page_test
Dynamically allocated buffers of PMD size for file-backed THP operations
(file_buf1 and file_buf2) were not freed on the success path and some
failure paths. Since the function is called repeatedly in a loop for each
split order, this can cause significant memory leaks.
On architectures with large PMD sizes, repeated leaks could exhaust system
memory and trigger the OOM killer during test execution.
Ensure all allocated buffers are freed to maintain stable repeated test
runs.
Link: https://lore.kernel.org/060c673b376bbeeed2b1fb1d48a825e846654191.1779296493.git.sayalip@linux.ibm.com
Fixes: 035a112e5fd5 ("selftests/mm: make file-backed THP split work by writing PMD size data")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The split_file_backed_thp() test mounts a tmpfs with a fixed size of "4m".
This works on systems with smaller PMD page sizes, but fails on
configurations where the PMD huge page size is larger (e.g. 16MB).
On such systems, the fixed 4MB tmpfs is insufficient to allocate even a
single PMD-sized THP, causing the test to fail.
Fix this by sizing the tmpfs dynamically based on the runtime
pmd_pagesize, allocating space for two PMD-sized pages.
Before patch:
running ./split_huge_page_test /tmp/xfs_dir_YTrI5E
--------------------------------------------------
TAP version 13
1..55
ok 1 Split zero filled huge pages successful
ok 2 Split huge pages to order 0 successful
ok 3 Split huge pages to order 2 successful
ok 4 Split huge pages to order 3 successful
ok 5 Split huge pages to order 4 successful
ok 6 Split huge pages to order 5 successful
ok 7 Split huge pages to order 6 successful
ok 8 Split huge pages to order 7 successful
ok 9 Split PTE-mapped huge pages successful
Please enable pr_debug in split_huge_pages_in_file() for more info.
Failed to write data to testing file: Success (0)
Bail out! Error occurred
Planned tests != run tests (55 != 9)
Totals: pass:9 fail:0 xfail:0 xpass:0 skip:0 error:0
[FAIL]
After patch:
running ./split_huge_page_test /tmp/xfs_dir_bMvj6o
--------------------------------------------------
TAP version 13
1..55
ok 1 Split zero filled huge pages successful
ok 2 Split huge pages to order 0 successful
ok 3 Split huge pages to order 2 successful
ok 4 Split huge pages to order 3 successful
ok 5 Split huge pages to order 4 successful
ok 6 Split huge pages to order 5 successful
ok 7 Split huge pages to order 6 successful
ok 8 Split huge pages to order 7 successful
ok 9 Split PTE-mapped huge pages successful
Please enable pr_debug in split_huge_pages_in_file() for more info.
Please check dmesg for more information
ok 10 File-backed THP split to order 0 test done
Please enable pr_debug in split_huge_pages_in_file() for more info.
Please check dmesg for more information
ok 11 File-backed THP split to order 1 test done
Please enable pr_debug in split_huge_pages_in_file() for more info.
Please check dmesg for more information
ok 12 File-backed THP split to order 2 test done
...
ok 55 Split PMD-mapped pagecache folio to order 7 at
in-folio offset 128 passed
Totals: pass:55 fail:0 xfail:0 xpass:0 skip:0 error:0
[PASS]
ok 1 split_huge_page_test /tmp/xfs_dir_bMvj6o
Link: https://lore.kernel.org/33e1bc10753fe82d1217613d8cd496020778cf2b.1779296493.git.sayalip@linux.ibm.com
Fixes: fbe37501b252 ("mm: huge_memory: debugfs for file-backed THP split")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb_reparenting_test.sh
The test currently moves the calling shell ($$) into the target cgroup
before executing write_to_hugetlbfs. This results in the shell and any
intermediate allocations being charged to the cgroup, introducing noise
and nondeterminism in accounting. It also requires moving the shell back
to the root cgroup after execution.
Spawn a helper process that joins the target cgroup and exec()'s
write_to_hugetlbfs. This ensures that only the workload is accounted to
the cgroup and avoids unintended charging from the shell.
The test currently validates both hugetlb usage and memory.current.
However, memory.current includes internal memcg allocations and per-CPU
batched accounting (MEMCG_CHARGE_BATCH), which are not synchronized and
can vary across systems, leading to non-deterministic results.
Since hugetlb memory is accounted via hugetlb.<size>.current,
memory.current is not a reliable indicator here. Drop memory.current
checks and rely only on hugetlb controller statistics for stable and
accurate validation.
Link: https://lore.kernel.org/fb57491ba83cb0a499c72922e1579b61bee514db.1779296493.git.sayalip@linux.ibm.com
Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The hugetlb_reparenting_test.sh script constructs hugetlb cgroup memory
interface file names based on the configured huge page size. The script
formats the size only in MB units, which causes mismatches on systems
using larger huge pages where the kernel exposes normalized units (e.g.
"1GB" instead of "1024MB").
As a result, the test fails to locate the corresponding cgroup files when
1GB huge pages are configured.
Update the script to detect the huge page size and select the appropriate
unit (MB or GB) so that the constructed paths match the kernel's hugetlb
controller naming.
Also print an explicit "Fail" message when a test failure occurs to
improve result visibility.
Link: https://lore.kernel.org/837ce751965c93f74c95d89587debf1e93281364.1779296493.git.sayalip@linux.ibm.com
Fixes: e487a5d513cb ("selftest/mm: make hugetlb_reparenting_test tolerant to async reparenting")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb_reparenting_test.sh
The test modifies nr_hugepages during execution and restores it from
cleanup() and again reconfigure it setup, which is invoked multiple times
across test flow. This can lead to repeated allocation/freeing of
hugepages.
With set -e, failures in cleanup (e.g., rmdir/umount) can also cause early
exit before restoring the original value at the end.
Move restoration of the original nr_hugepages value to a trap handler
registered for EXIT, INT, and TERM signals so it is always restored on all
exit paths. This also avoids unnecessary allocation churn across repeated
cleanup/setup cycles.
Link: https://lore.kernel.org/29db637c3c6ba6c168f6b33f59f059a0b39c35c8.1779296493.git.sayalip@linux.ibm.com
Fixes: 585a9145886a ("selftests/mm: restore default nr_hugepages value during cleanup in hugetlb_reparenting_test.sh")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The charge_reserved_hugetlb.sh script assumes hugetlb cgroup memory
interface file names use the "<size>MB" format
(e.g. hugetlb.1024MB.current).
This assumption breaks on systems with larger huge pages such as 1GB,
where the kernel exposes normalized units:
hugetlb.1GB.current
hugetlb.1GB.max
hugetlb.1GB.rsvd.max
...
As a result, the script attempts to access files like
hugetlb.1024MB.current, which do not exist when the kernel reports the
size in GB.
Normalize the huge page size and construct the pathname using the
appropriate unit (MB or GB), matching the hugetlb controller naming.
Link: https://lore.kernel.org/04b6b49e4a2acf46319f627caf82b09e6dc1ad7f.1779296493.git.sayalip@linux.ibm.com
Fixes: 209376ed2a84 ("selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting")
Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
charge_reserved_hugetlb.sh
Patch series "selftests/mm: fix failures and robustness improvements", v7.
Powerpc systems with a 64K base page size exposed several issues while
running mm selftests. Some tests assume specific hugetlb configurations,
use incorrect interfaces, or fail instead of skipping when the required
kernel features are not available.
This series fixes these issues and improves test robustness.
This patch (of 13):
cleanup() resets nr_hugepages to 0 on every invocation, while the test
reconfigures it again in the next iteration. This leads to repeated
allocation and freeing of large numbers of hugepages, especially when the
original value is high.
Additionally, with set -e, failures in earlier cleanup steps (e.g., rmdir
or umount returning EBUSY while background activity is still ongoing) can
cause the script to exit before restoring the original value, leaving the
system in a modified state.
Introduce a trap on EXIT, INT, and TERM to restore the original
nr_hugepages value once at script termination. This avoids unnecessary
allocation churn and ensures the original value is reliably restored on
all exit paths.
Link: https://lore.kernel.org/cover.1779296493.git.sayalip@linux.ibm.com
Link: https://lore.kernel.org/5b8fbb29cd6ceffe6752e0af104f60cec072aa10.1779296493.git.sayalip@linux.ibm.com
Fixes: 7d695b1c3695 ("selftests/mm: save and restore nr_hugepages value")
Signed-off-by: Sayali Patil <sayalip@linux.ibm.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
MAP_FAILED
mmap() returns MAP_FAILED, which is defined as (void *)-1, on error, not
NULL. Several selftests incorrectly check the return value of mmap()
using !ptr or ptr == NULL, which would erroneously treat MAP_FAILED as a
valid pointer since MAP_FAILED is non-zero and non-NULL. This can lead to
segfaults when mmap() actually fails under memory pressure.
Link: https://lore.kernel.org/20260513025223.592766-1-lihongfu@kylinos.cn
Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
All the tests that use HugeTLB can detect and setup HugeTLB pages on their
own.
Drop detection and setup of HugeTLB from run_vmtests.sh.
Link: https://lore.kernel.org/20260511162840.375890-56-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently when running THP and HugeTLB tests, if HAVE_HUGEPAGES is set
run_test() drops caches, compacts memory and runs the test. But if
HAVE_HUGEPAGES is not set it skips the tests entirely, even if THP tests
have nothing to do with HAVE_HUGEPAGES.
Replace the check if HAVE_HUGEPAGES is set with a check of how much memory
is available. If there is less than 256 MB of available memory, drop
caches and run compaction and then continue to run a test regardless of
HAVE_HUGEPAGES value.
Link: https://lore.kernel.org/20260511162840.375890-55-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Since va_high_addr_switch takes care of setting up huge pages, there is no
need to set them up in the va_high_addr_switch.sh wrapper script.
Link: https://lore.kernel.org/20260511162840.375890-54-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
va_high_addr_switch skips HugeTLB tests if there are no free huge pages
prepared by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-53-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
uffd-wp-remap skips HugeTLB tests if there are no free huge pages prepared
by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-52-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
uffd-unit-tests skips HugeTLB tests if there are no free huge pages
prepared by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Replace exit() calls with _exit() to avoid restoring HugeTLB settings in
the middle of test.
Link: https://lore.kernel.org/20260511162840.375890-51-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
uffd-stress skips HugeTLB tests if there are no free huge pages prepared
by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-50-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
thuge-gen skips tests if there are no free huge pages prepared by a
wrapper script and shm liimts in proc are too low.
Replace custom detection of huge pages with the library functions and add
setup of HugeTLB pages and shm limits to the test and make sure that the
original settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-49-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
protection_keys open codes setup of HugeTLB pages.
Replace it with the library functions from hugepage_setup.
Replace exit() calls with _exit() to avoid restoring HugeTLB settings in
the middle of test.
Link: https://lore.kernel.org/20260511162840.375890-48-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
pagemap-ioctl skips HugeTLB tests if there are no free huge pages prepared
by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-47-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
migration skips HugeTLB tests if there are no free huge pages prepared by
a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Since kselftest_harness runs fixture setup and the tests in child
processes, use HUGETLB_SETUP_DEFAULT_PAGES() that defines a constructor
that runs in the main process and add verification that there are enough
free huge pages to the tests that use them.
Reset signal handlers to defaults in FIXTURE_SETUP() so that sending
SIGTERM and SIGHUP during the tests won't cause restoration of HugeTLB
settings.
Link: https://lore.kernel.org/20260511162840.375890-46-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb-vmemmap test fails if there are no free huge pages prepared by a
wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-45-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb-soft-offline test uses open coded access to /proc to determine
availability of huge pages and fails if there are no enough free huget
pages..
Replace open coded access to /proc with hugepage helpers and add setup of
HugeTLB pages to the test and make sure that the original settings are
restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-44-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb-shm test fails if there are no free huge pages prepared by a
wrapper script and shm liimts in proc are too low.
Add setup of HugeTLB pages and shm limits to the test and make sure that
the original settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-43-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb-mremap test fails if there are no free huge pages prepared by a
wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-42-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb-mmap test fails if there are no free huge pages prepared by a
wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-41-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb_madv_vs_map test skips testing if there are no free huge pages
prepared by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-40-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb-madvise test skips testing if there are no free huge pages
prepared by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-39-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb_fault_after_madv test skips testing if there are no free huge
pages prepared by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-38-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugepage_dio test fails if there are no free huge pages prepared by a
wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-37-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hmm-tests skips HugeTLB tests if there are no free huge pages prepared by
a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Since kselftest_harness runs fixture setup and the tests in child
processes, use HUGETLB_SETUP_DEFAULT_PAGES() that defines a constructor
that runs in the main process and add verification that there are enough
free huge pages to the tests that use them.
Replace exit() calls with _exit() to avoid restoring HugeTLB settings in
the middle of test and use SIGKILL to kill a child process in
hmm_cow_in_device test to avoid interference with signal handlers in
hugepage_restore_settings().
Link: https://lore.kernel.org/20260511162840.375890-36-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
gup_test fails to run HugeTLB tests if there are no free huge pages
prepared by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-35-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
gup_longterm tests skips HugeTLB tests if there are no free huge pages
prepared by a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-34-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
cow tests skips HugeTLB tests if there are no free huge pages prepared by
a wrapper script.
Add setup of HugeTLB pages to the test and make sure that the original
settings are restored on the test exit.
Link: https://lore.kernel.org/20260511162840.375890-33-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
... instead of open coded access of HugeTLB parameters via /proc.
Link: https://lore.kernel.org/20260511162840.375890-32-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugetlb-shm and thuge-gen tests require that limits defined by
/proc/sys/kernel/{shmmax,shmall} should be higher than certain values.
Add helpers that allow setting these limits and restoring their settings
on a test exit.
They will be used later in hugetlb-shm and thuge-gen.
Link: https://lore.kernel.org/20260511162840.375890-31-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
These are useful helpers for writing and reading sysfs and proc files.
Make them available to the tests that don't use thp_settings.
While on it make write_num() use "%lu" instead of "%ld" to match 'unsigned
long num' argument type.
Link: https://lore.kernel.org/20260511162840.375890-30-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
A lot of tests require free HugeTLB pages. Some need just a few default
huge pages, some need a certain amount of memory available as HugeTLB, and
some just skip lots of tests if huge pages of all supported sizes are not
available.
This all resulted in a huge mess in run_vmtests.sh that sets up some huge
pages, adjusts them later and restores some of the settings if the stars
align.
Add APIs that allow saving the state of HugeTLB and setting up the desired
amount of HugeTLB pages.
Saving the state also registers atexit() callback and signal handler that
will ensure restoration of HugeTLB state.
Since many tests use both HugeTLB and THP, the atexit() callbacks and
signal handler are restoring both.
For kselftest_harness tests that run fixture setups and test in child
processes add a constructor that will save and restore settings in the
main process.
Link: https://lore.kernel.org/20260511162840.375890-29-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
... to hugetlb_free_default_pages() for consistency with
hugetlb_nr_default_pages().
Make hugetlb_free_default_pages() use hugetlb_sysfs_path() helper instead
of parsing /proc/meminfo.
Link: https://lore.kernel.org/20260511162840.375890-28-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add APIs that allow reading and writing of
/sys/kernel/mm/hugepages/hugepages-NkB/nr_hugepages
to detect and change the amount of HugeTLB pages of different sizes.
Link: https://lore.kernel.org/20260511162840.375890-27-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
... instead of size_t to avoid type mismatch in 32 and 64 bit builds.
Link: https://lore.kernel.org/20260511162840.375890-26-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Move library functions that abstract HugeTLB /proc and /sysfs access from
vm_util to hugepage_settings.
This will help creating common helpers that save and restore HugeTLB and
THP settings.
Link: https://lore.kernel.org/20260511162840.375890-25-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
... for upcoming addition of HugeTLB helpers.
Link: https://lore.kernel.org/20260511162840.375890-24-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
khugepaged registers atexit() and signal handlers that ensure that THP
settings are restored regardless of how the test exited.
Make these handlers available for all users of thp_settings.
The call to thp_save_settings() installs thp_restore_settings as the
atexit() callback and makes sure that signals that kill a process would
still call exit() and atexit() callback.
Update child process in khugepaged tests using thp_settings to use _exit()
instead of exit() to avoid altering THP settings in the middle of a test.
Remove redundant THP cleanup from folio_split_race_test.c.
Link: https://lore.kernel.org/20260511162840.375890-23-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert va_high_addr_switch test to use kselftest framework for reporting
and tracking successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-22-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert uffd-unit-tests to use kselftest framework for reporting and
tracking successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-21-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert uffd-stress test to use kselftest framework for reporting and
tracking successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-20-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Update err() and errexit() to use ksft_print_msg() and ksft_exit_fail().
This is preparatory change required to update userfaulfd tests to use
kselftest framework.
Link: https://lore.kernel.org/20260511162840.375890-19-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Luiz Capitulino <luizcap@redhat.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert protection_keys test to use kselftest framework for reporting and
tracking successful and failing runs.
Adjust dprintf0() printouts to use "#" in the beginning of the line for
TAP compatibility.
Link: https://lore.kernel.org/20260511162840.375890-18-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Replace the numeric test index in TAP output with the actual test function
name.
Use a structure containing function pointer and its name rather than only
the function pointer in the pkey_tests array.
Link: https://lore.kernel.org/20260511162840.375890-17-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert ksm_tests to use kselftest framework for reporting and tracking
successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-16-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
make the output TAP-compatible
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert khugepaged tests to use kselftest framework for reporting and
tracking successful and failing runs.
The conversion is mostly about replacing printf()/perror() + exit() pairs
with their ksft_ counterparts.
The nice colored success and failure indications are left intact.
Replace the progress report in collapse_compound_extreme() with a single
ksft_print_msg() to avoid headache with formatting and make the test
output more concise.
[ziy@nvidia.com: update for "Remove CONFIG_READ_ONLY_THP_FOR_FS and enable file THP for writable files", v6]
https://lore.kernel.org/20260517135416.1434539-1-ziy@nvidia.com
Link: https://lore.kernel.org/20260511162840.375890-15-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Li Wang <li.wang@linux.dev>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Currently khugepaged decides if a test can run using TEST() macro that
checks what mem_ops and collapse_context are set by the command line
arguments.
For better compatibility with ksefltest framework, add an array of 'struct
test_case's and redefine TEST() macro to conditionally add enabled tests
to that array.
Then execute the enabled test by looping the test_case's array.
Link: https://lore.kernel.org/20260511162840.375890-14-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert hugetlb-read-hwpoison test to use kselftest framework for
reporting and tracking successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-13-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Li Wang <li.wang@linux.dev>
Reviewed-by: Li Wang <li.wang@linux.dev>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert hugetlb_madv_vs_map test to use kselftest framework for reporting
and tracking successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-12-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert hugetlb-madvise test to use kselftest framework for reporting and
tracking successful and failing runs.
While on it fix the check for base page size detection to actually use
base_page_size.
Link: https://lore.kernel.org/20260511162840.375890-11-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert hugetlb-vmemmap test to use kselftest framework for reporting and
tracking successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-10-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Convert hugetlb-shm test to use kselftest framework for reporting and
tracking successful and failing runs.
Link: https://lore.kernel.org/20260511162840.375890-9-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
hugepage could mean both THP and HugeTLB these days.
Rename hugepage-* tests for HugeTLB to hugetlb-* to avoid confusion.
Make sure that Makefile update keeps alphabetical ordering of the
TEST_GEN_FILES entries.
Keep old binary names in .gitignore because Linus prefers it this way.
Link: https://lore.kernel.org/20260511162840.375890-8-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Li Wang <li.wang@linux.dev>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Both tests create a hugettlb mapping, fill it with data and verify the
data, the only difference is that one uses file-backed memory and another
one uses anonymous memory.
Merge both tests into a single file.
Link: https://lore.kernel.org/20260511162840.375890-7-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
HAVE_HUGEPAGES indicates availability of free HugeTLB pages. It should
not be used to gate KSM test that merges transparent huge pages or
split_huge_page_test.
Remove check for HAVE_HUGEPAGES when running these tests.
Link: https://lore.kernel.org/20260511162840.375890-6-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Li Wang <li.wang@linux.dev>
Reviewed-by: Li Wang <li.wang@linux.dev>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Several migration test use fork() to create worker processes. These
processes are later killed, but nothing collects their exit status and
they remain as zombies in the system.
Add a helper function that kills the worker processes, waitpid()s for them
and verifies the exit status.
Replace the loops that call kill() for each process with a call to that
helper.
Link: https://lore.kernel.org/20260511162840.375890-5-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reported-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Fixture setup sets self->nthreads to number of available CPUs minus 1 and
then each test creates 'self->nthreads - 1' threads or processes, so
essentially nthreads counts the worker tasks and the main task.
Make nthreads represent the number of spawned tasks to simplify
thread/process creation and teardown.
While on it, make the fixture setup skip the tests if there are not enough
CPUs or NUMA nodes instead of checking this in each test.
Link: https://lore.kernel.org/20260511162840.375890-4-rppt@kernel.org
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
migration tests presume that both THP and HugeTLB huge pages are 2MB.
Add dynamic detection of huge page size with read_pmd_pagesize() for THP
and with default_huge_page_size() for HugeTLB.
Link: https://lore.kernel.org/20260511162840.375890-3-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Luiz Capitulino <luizcap@redhat.com>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Li Wang <li.wang@linux.dev>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Sarthak Sharma <sarthak.sharma@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "make MM selftests more CI friendly", v4.
There's a lot of dancing around HugeTLB settings in run_vmtests.sh. Some
test need just a few default huge pages, some require at least 256 MB, and
some just skip lots of tests if huge pages of all supported sizes are not
available.
The goal of this set is to make tests deal with HugeTLB setup and
teardown.
There are already convenient helpers that allow easy reading and writing
of /proc and /sysfs, so adding a few APIs that will detect and update
HugeTLB settings shouldn't be a big deal. But these nice helpers use
kselftest framework, and many of HugeTLB (and even THP) test don't, so as
a result this patchset also includes a lot of churn for conversion of
those tests to kselftest framework (patches 7-19).
The series break out:
patches 1-5: small fixes
patch 6: merge of hugetlb mmap tests
patch 7: renaming of hugepage-* to hugetlb-*
patches 8-21: mechanical conversion to kselftest framework
patches 22-28: extension of thp_settings to hugepage_settings to also include
HugeTLB helpers
patches 29-30: add helpers for setting up SHM limits in hugetlb-shm and
thuge-gen tests
patches 31-53: integrate the new APIs in all the tests that use HugeTLB
patches 54-55: drop HugeTLB setup from run_vmtests.sh
This patch (of 55):
Injection of a memory error with madvise() causes SIGBUS, which terminates
the hugetlb-read-hwpoison test prematurely.
Add a dummy SIGBUS handler to allow the test to continue regardless of
SIGBUS.
Link: https://lore.kernel.org/20260511162840.375890-1-rppt@kernel.org
Link: https://lore.kernel.org/20260511162840.375890-2-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Li Wang <li.wang@linux.dev>
Reviewed-by: Li Wang <li.wang@linux.dev>
Tested-by: Luiz Capitulino <luizcap@redhat.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Donet Tom <donettom@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
collapse_file() now supports collapsing clean pagecache folios from
writable files, so add corresponding tests.
Note that madvise(MADV_COLLAPSE) works for dirty pagecache folios from
writable files, because collapse_single_pmd() triggers a synchronous
writeback when first attempt of collapse_file() fails. That writeback
makes dirty folios clean and the retry of collapse_file() succeeds.
Link: https://lore.kernel.org/20260517135416.1434539-15-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: David Sterba <dsterba@suse.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Any file system with large folio support and the supported orders include
PMD_ORDER can be used. There is no need to open a file with read-only.
Link: https://lore.kernel.org/20260517135416.1434539-13-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Sterba <dsterba@suse.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Change the requirement to a file system with large folio support and the
supported order needs to include PMD_ORDER.
Also add tests of opening a file with read write permission and populating
folios with writes. Reuse the XFS image from split_huge_page_test.
Link: https://lore.kernel.org/20260517135416.1434539-12-ziy@nvidia.com
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Mason <clm@fb.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: David Sterba <dsterba@suse.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add tearing tests for /proc/pid/smaps file. New tests reuse the same
logic as with maps file but skipping all the data except for the VMA
addresses, which are the only part relevant for the tearing tests. Skip
PROCMAP_QUERY parts of the tests because smaps does not implement that
ioctl.
Link: https://lore.kernel.org/20260426062718.1238437-4-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Liam R. Howlett <liam@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When running tearing tests we need to ensure the pages we use include VMAs
that were mapped by the child process for this test. Currently we always
use the first two pages, checking VMAs at their boundaries and this works,
however once we add tests for /proc/pid/smaps, the first two pages might
not contain the VMAs that child modifies. Locate the page that contains
the first VMA mapped by the child and use that and the next page for the
test.
Link: https://lore.kernel.org/20260426062718.1238437-3-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Liam R. Howlett <liam@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
sysfs.sh DAMON selftest is not testing the existence of the 'pause' sysfs
file. Add the test.
Link: https://lore.kernel.org/20260522154026.80546-15-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
sysfs.sh DAMON selftest is not testing the existence of addr_unit sysfs
file. Add the test.
Link: https://lore.kernel.org/20260522154026.80546-14-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
sysfs.sh DAMON selftest is not testing monitoring intervals goal
directory. Add the test.
Link: https://lore.kernel.org/20260522154026.80546-13-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When an assertion is failed, sysfs.py DAMON selftest immediately exits the
test program leaving the DAMON running behind. Many of the following
tests need to start DAMON on their own. But because DAMON that was
started by sysfs.py is still running, those start attempts fail, and the
tests are failed or skipped. Update sysfs.py to stop DAMON before exiting
the test program due to the assertion failure.
Link: https://lore.kernel.org/20260522154026.80546-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Rather than providing a hook, simplify things by providing the ability to
filter errors. This allows us to more carefully validate the value
provided and thus ensure only a valid error code is specified, and
simplifies the interface.
This way, we eliminate all hooks but mmap_prepare and allow only mmap
actions to be specified (which core mm controls).
This significantly improves robustness and eliminates any unnecessary code
duplication in driver mmap hooks.
We also update the /dev/mem logic (the only user) to use
mmap_action->error_filter instead.
Link: https://lore.kernel.org/e770b28427937057fa953ac380a134b24acd8bb4.1779462249.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This hook was introduced to work around code that seemed to absolutely
require access to a VMA pointer upon mmap().
However, providing this hook leaves a backdoor to drivers getting access
to the very thing mmap_prepare eliminates - a pointer to the VMA.
Let's solve this contradiction by removing it. The key intended user was
hugetlb, however it seems that the best course now is to avoid allowing
all drivers the ability to work around mmap_prepare, and find a different
solution there.
Link: https://lore.kernel.org/2521c19866f3f10f9085d094cc4f06769042be71.1779462249.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "remove mmap_action success, error hooks", v2.
The mmap_action->success_hook was a strange beast added to enable code
which appeared to absolutely require access to a VMA pointer to work
correctly.
Primarily this was for hugetlb, however a different approach will be taken
there, as clearly more work is required to figure out a sensible way of
converting hugetlb to use mmap_prepare.
The other user was the memory char driver, specifically /dev/zero which
has the unusual property of explicitly setting file-backed VMAs anonymous.
Providing the success hook was always foolish, as it allowed drivers a way
to workaround the restriction that they should not access a pointer to a
not-yet-correctly-initialised VMA - which defeats the purpose of the
mmap_prepare work.
We can achieve the same thing in memory char driver without needing the
success hook, so this series removes that, then removes the success hook
altogether.
The error hook is also unnecessary - the motivation for this was for
functions which need to filter the error code when performing an mmap
action in order to avoid breaking userspace.
We can achieve this by just providing a field for the error code. Doing
this means we don't have to worry about the hook doing anything odd.
We also add a check to ensure the error code is in fact valid.
Again the memory char driver is the only current user of this, so this
series updates it to use that.
After this change mmap_action has no custom hooks at all, which seems
rather more cromulent than before.
This patch (of 3):
/dev/zero, uniquely, marks memory mapped there as anonymous. This is
currently achieved using the mmap_action->success_hook.
However this hook circumvents the abstraction of VMA initialisation so
it's preferable to do things a different way.
To achieve this, this patch firstly defaults the VMA descriptor's vm_ops
field to the dummy VMA operations, which is what file-backed VMAs default
this field to.
That way, we can detect whether a driver sets this field to NULL in order
to mark it anonymous.
We then introduce vma_desc_set_anonymous() to do this explicitly, and
invoke it in mmap_zero_prepare().
This way, any driver which does not explicitly set desc->vm_ops, retains
the dummy vm_ops as they would previously.
We also update set_vma_user_defined_fields() to make clear that we are
either setting vma->vm_ops to what is provided by the driver (or
defaulting to dummy_vm_ops if not set), or setting the VMA anonymous.
This lays the groundwork for removing the success hook.
Link: https://lore.kernel.org/cover.1779462249.git.ljs@kernel.org
Link: https://lore.kernel.org/5d1e8bd29d6e070218ba7a03461df562e372b91e.1779462249.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
When create_pagecache_thp_and_fd() write returns error on
/proc/sys/vm/dropcache, it just "goto err_out_unlink", which left fd still
open.
Use "goto err_out_close" to close the fd.
Link: https://lore.kernel.org/20260520020336.28914-1-richard.weiyang@gmail.com
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: "Liam R. Howlett" <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add simple existence tests for data probes sysfs directories and files.
Link: https://lore.kernel.org/20260518234119.97569-20-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The --kpageflags option requires an argument to specify the kpageflags
file path, but has_arg was set to 0 (no_argument) in the long options
table. Change it to 1 (required_argument) so getopt_long correctly parses
the argument.
Link: https://lore.kernel.org/20260513022120.58033-4-ye.liu@linux.dev
Signed-off-by: Ye Liu <liuye@kylinos.cn>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The ternary operator (?:) has lower precedence than addition (+), so the
expression `off + sigbus_addr ? sigbus_addr - ptr : 0` was parsed as
`(off + sigbus_addr) ? (sigbus_addr - ptr) : 0` rather than the intended
`off + (sigbus_addr ? sigbus_addr - ptr : 0)`. Add explicit parentheses
to ensure the correct evaluation order.
Link: https://lore.kernel.org/20260513022120.58033-3-ye.liu@linux.dev
Signed-off-by: Ye Liu <liuye@kylinos.cn>
Acked-by: SeongJae Park <sj@kernel.org>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "tools/mm/page-types: Fix misc bugs".
This series fixes three issues in tools/mm/page-types.c:
1. Fix two typos in madvise() error messages ("madvice" -> "madvise")
2. Fix operator precedence bug in the sigbus handler where the ternary
operator binds looser than addition, producing incorrect offset
calculation when sigbus_addr is non-NULL
3. Fix --kpageflags option declaration in getopt_long: has_arg should
be 1 (required_argument) since the option requires a file path
This patch (of 3):
Two error messages incorrectly spelled the madvise() function name as
"madvice". Fix the typo in both occurrences.
Link: https://lore.kernel.org/20260513022120.58033-1-ye.liu@linux.dev
Link: https://lore.kernel.org/20260513022120.58033-2-ye.liu@linux.dev
Signed-off-by: Ye Liu <liuye@kylinos.cn>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Update write() checks to properly detect and handle partial writes.
Previously, the write() calls used <= 0 to detect failure. This condition
is never true for partial writes (ret > 0 but ret < len), so partial
writes were silently treated as success.
Fix this by verifying that write() returns the full expected length and
treating any mismatch as failure.
Link: https://lore.kernel.org/20260504081638.683223-1-agarwal.vineet2006@gmail.com
Signed-off-by: Vineet Agarwal <agarwal.vineet2006@gmail.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
create_pagecache_thp_and_fd() fills the backing file for the pagecache THP
tests using repeated write() calls, but the return value is never checked.
If a write fails or completes only partially, the test may continue with
an incompletely initialized file and produce misleading results.
Check the result of write() and fail the test if the expected number of
bytes was not written.
[akpm@linux-foundation.org: remove unneeded local, per David]
Link: https://lore.kernel.org/da82de92-29d8-457c-9f65-40fc4900b922@kernel.org
Link: https://lore.kernel.org/20260512074924.27721-1-agarwal.vineet2006@gmail.com
Signed-off-by: Vineet Agarwal <agarwal.vineet2006@gmail.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Vineet Agarwal <agarwal.vineet2006@gmail.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
mmap() returns MAP_FAILED on error, not NULL. The current check uses
!buffer->ptr, which evaluates to false when mmap() fails (since MAP_FAILED
is (void *)-1, not 0), so the error path is never taken.
Link: https://lore.kernel.org/20260512101305.139509-1-lihongfu@kylinos.cn
Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Donet Tom <donettom@linux.ibm.com>
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
|
|
After commit 9b93f7e32774 ("tools/getdelays: use the static UAPI headers
from tools/include/uapi"), the Makefile was changed to use
-I../include/uapi/ instead of -I../../usr/include to ensure tools always
use the up-to-date UAPI headers.
However, only linux/taskstats.h was added to tools/include/uapi/ in commit
e5bbb35a07b3 ("tools headers UAPI: sync linux/taskstats.h"), but
linux/acct.h was missing.
This causes procacct.c to fail to compile with:
procacct.c:234:37: error: 'AGROUP' undeclared (first use in this function)
gcc -I../include/uapi/ getdelays.c -o getdelays
gcc -I../include/uapi/ procacct.c -o procacct
procacct.c: In function `print_procacct':
procacct.c:234:37: error: `AGROUP' undeclared (first use in this function)
did you mean `NOGROUP'?
234 | , t->version >= 12 ? (t->ac_flag & AGROUP ? 'P' : 'T') : '?'
| ^~~~~~
| NOGROUP
procacct.c:234:37: note: each undeclared ident
because procacct.c uses the AGROUP macro defined in linux/acct.h.
Add the missing linux/acct.h to complete the static UAPI header set.
Link: https://lore.kernel.org/20260527213558929EhiHHy9EDTMjmg3uuDOMi@zte.com.cn
Fixes: 9b93f7e32774 ("tools/getdelays: use the static UAPI headers from tools/include/uapi")
Signed-off-by: Wang Yaxin <wang.yaxin@zte.com.cn>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Fan Yu <fan.yu9@zte.com.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In sigtrap_threads(), the return value of mmap() is checked against NULL.
mmap() returns MAP_FAILED, which is (void *)-1, not NULL, when it fails.
Since MAP_FAILED is non-zero and non-NULL, the condition "p == NULL" will
never be true on failure, causing the program to proceed with an invalid
pointer and segfault if mmap() actually fails under memory pressure.
Link: https://lore.kernel.org/20260513025838.594945-1-lihongfu@kylinos.cn
Signed-off-by: Hongfu Li <lihongfu@kylinos.cn>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mickael Salaun <mic@digikod.net>
Cc: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Kyle Huey <khuey@kylehuey.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Tell git to ignore the generated binary for the test.
Link: https://lore.kernel.org/20260226-selftest-filelock-ktap-v4-3-db8ae192ff42@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The filelock test checks four different things but only reports an overall
status, convert to use ksft_test_result() for these individual tests.
Each test depends on the previous ones so we still bail out if any of them
fail but we get a bit more information from UIs parsing the results.
Link: https://lore.kernel.org/20260226-selftest-filelock-ktap-v4-2-db8ae192ff42@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "selftests/filelock: Make output more kselftestish", v4.
This series makes the output from the ofdlocks test a bit easier for
tooling to work with, and also ignores the generated file while we're
here.
This patch (of 3):
The ofdlocks test reports some errors via perror() which does not produce
KTAP output, convert to ksft_perror() which does.
Link: https://lore.kernel.org/20260226-selftest-filelock-ktap-v4-0-db8ae192ff42@kernel.org
Link: https://lore.kernel.org/20260226-selftest-filelock-ktap-v4-1-db8ae192ff42@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a kselftest for the taskstats TGID aggregation fix.
The test creates a worker thread, snapshots TGID taskstats while the
worker is still alive, lets the worker exit, and then verifies that the
TGID CPU total does not regress after the thread has been reaped.
The pass/fail check intentionally keys off ac_utime + ac_stime only, which
is the primary user-visible regression fixed by the taskstats change and
is less sensitive to scheduling noise than context-switch counters.
Link: https://lore.kernel.org/0d55354911c54cd1b9f10a09f6fd378af85c8d43.1776094300.git.cyyzero16@gmail.com
Signed-off-by: Yiyang Chen <cyyzero16@gmail.com>
Acked-by: Balbir Singh <balbirs@nvidia.com>
Cc: Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Wang Yaxin <wang.yaxin@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Reproduce with GCC 13.3.0:
$ cd tools/accounting
$ make
This emits:
getdelays.c: In function `format_timespec':
getdelays.c:218:67: warning: `:' directive output may be truncated writing 1 byte into a region of size between 0 and 16 [-Wformat-truncation=]
218 | snprintf(buffer, sizeof(buffer), "%04d-%02d-%02dT%02d:%02d:%02d",
|
getdelays.c:218:9: note: `snprintf' output between 20 and 72 bytes into a destination of size 32
The problem is that %04d and %02d specify minimum field widths only. GCC
cannot prove that formatting tm_year + 1900 and the other struct tm
fields will always fit in the fixed 32-byte buffer, so it warns about
possible truncation.
Fix this by replacing the manual snprintf() formatting with
strftime("%Y-%m-%dT%H:%M:%S", ...). That matches the data we already have
in struct tm, keeps the intended timestamp format, and avoids the warning
when building tools/accounting with GCC.
Link: https://lore.kernel.org/87d9723e0b59d816ee2e4bd7cddd58a54c6c9f91.1776956545.git.cyyzero16@gmail.com
Signed-off-by: Yiyang Chen <cyyzero16@gmail.com>
Cc: Fan Yu <fan.yu9@zte.com.cn>
Cc: Wang Yaxin <wang.yaxin@zte.com.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Add a regression test for the per-scan verbose dedup added in the
preceding commit. The test loads samples/kmemleak's helper module
(CONFIG_SAMPLE_KMEMLEAK=m) to generate orphan allocations, several of
which share an allocation backtrace, runs four kmemleak scans with verbose
printing enabled, then walks dmesg looking for two "unreferenced object"
reports within a single scan that share an identical backtrace - which
would mean dedup failed to collapse them.
The test is intentionally permissive on detection but strict on
regressions:
- PASS when no duplicates are observed, regardless of whether the
dedup summary line ("... and N more object(s) with the same
backtrace") was actually emitted. Per-CPU chunk reuse, slab
freelist pointers, kernel stack residue and CONFIG_DEBUG_KMEMLEAK_
AUTO_SCAN can all keep most of the orphans "still referenced" or
reported across many separate scans, so the dedup path may have
nothing to fold within one scan. That is not a regression.
- PASS reports whether dedup actually fired, so a passing run on a
well-behaved environment is still informative.
- FAIL when two same-backtrace reports land in a single scan (clear
dedup regression).
- FAIL when kmemleak's own per-scan tally counts leaks but the
verbose path emits zero "unreferenced object" lines - that catches
a regression in the verbose printer itself, which would otherwise
pass the duplicate check trivially.
- SKIP when kmemleak is absent, disabled at runtime, or the helper
module is not built.
The dmesg parser anchors stack-frame matching to the indentation kmemleak
uses for them (4+ spaces under "kmemleak: ") so unrelated kmemleak
warnings landing between reports do not get lumped into the backtrace key
and mask a duplicate.
Link: https://lore.kernel.org/20260506-kmemleak_dedup-v3-2-2d36aafc34da@debian.org
Signed-off-by: Breno Leitao <leitao@debian.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
test_percpu_basic() currently compares memory.current against only
memory.stat:percpu after creating 1000 child cgroups.
Observed failure:
#./test_kmem
ok 1 test_kmem_basic
ok 2 test_kmem_memcg_deletion
ok 3 test_kmem_proc_kpagecgroup
ok 4 test_kmem_kernel_stacks
ok 5 test_kmem_dead_cgroups
memory.current 11530240
percpu 8440000
not ok 6 test_percpu_basic
That assumption is too strict: child cgroup creation also allocates
slab-backed metadata, so memory.current is expected to be larger than
percpu alone. One visible path is:
cgroup_mkdir()
cgroup_create()
cgroup_addrm_file()
cgroup_add_file()
__kernfs_create_file()
__kernfs_new_node()
kmem_cache_zalloc()
These kernfs allocations are charged as slab and show up in
memory.stat:slab.
Update the check to compare memory.current against (percpu + slab)
within MAX_VMSTAT_ERROR, and print slab/delta in the failure message to
improve diagnostics.
Link: https://lore.kernel.org/20260501022058.18024-3-li.wang@linux.dev
Signed-off-by: Li Wang <li.wang@linux.dev>
Reviewed-by: Waiman Long <longman@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Sayali Patil <sayalip@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "selftests/cgroup: Fix false positive failures in
test_percpu_basic", v2.
This patch series addresses two separate issues that cause false
positive failures in the test_percpu_basic test within the cgroup
kmem selftests.
The first issue stems from a hardcoded assumption about the system
page size, which breaks the test on architectures with larger page
sizes.
The second issue is an overly strict memory check that fails to
account for the slab metadata allocated during cgroup creation.
This patch (of 2):
MAX_VMSTAT_ERROR uses a hardcoded page size of 4096, which assumes 4K
pages. This causes test_percpu_basic to fail on systems where the kernel
is configured with a larger page size, such as aarch64 systems using 16K
or 64K pages, where the maximum permissible discrepancy between
memory.current and percpu charges is proportionally larger.
Replace the hardcoded 4096 with sysconf(_SC_PAGESIZE) to correctly derive
the page size at runtime regardless of the underlying architecture or
kernel configuration.
Link: https://lore.kernel.org/20260501022058.18024-1-li.wang@linux.dev
Link: https://lore.kernel.org/20260501022058.18024-2-li.wang@linux.dev
Signed-off-by: Li Wang <li.wang@linux.dev>
Acked-by: Waiman Long <longman@redhat.com>
Reviewed-by: Sayali Patil <sayalip@linux.ibm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
file_setup_area() currently allocates anonymous memory, fills it, and
writes it into the backing file used for collapse testing.
Instead of copying data through write(), resize the file with ftruncate(),
map it directly with MAP_SHARED, and initialize the mapped area in place.
This simplifies the setup path and avoids the need for explicit partial
write handling.
Link: https://lore.kernel.org/20260429115816.98824-1-agarwal.vineet2006@gmail.com
Signed-off-by: Vineet Agarwal <agarwal.vineet2006@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Tested-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The sysfs.py test commits DAMON parameters, dump the internal DAMON state,
and show if the parameters are committed as expected using the dumped
state. While the dumping is ongoing, DAMON is alive. It can make
internal changes including addition and removal of regions. It can
therefore make a race that can result in false test results. Pause DAMON
execution during the state dumping to avoid such races.
Link: https://lore.kernel.org/20260427151231.113429-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Extend sysfs.py tests to confirm damon_ctx->pause can be set using the
pause sysfs file.
Link: https://lore.kernel.org/20260427151231.113429-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
drgn_dump_damon_status is not dumping the damon_ctx->pause parameter
value, so it cannot be tested. Dump it for future tests.
Link: https://lore.kernel.org/20260427151231.113429-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
DAMON test-purpose sysfs interface control Python module, _damon_sysfs, is
not supporting the newly added pause file. Add the support of the file,
for future test and use of the feature.
Link: https://lore.kernel.org/20260427151231.113429-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
process_madvise() used to validate the advice while walking each imported
iovec. If the vector has zero total length, vector_madvise() does not
enter the loop and can return success without checking whether the advice
value is valid.
For a local mm, such as process_madvise(PIDFD_SELF, ...), the remote-only
process_madvise_remote_valid() check is skipped. As a result, an invalid
advice can be reported as success when the vector has zero total length.
This differs from madvise(), which rejects an invalid advice before
returning success for a zero-length range.
Validate the generic madvise behavior at the syscall-facing entry points
before any vector walk. In process_madvise(), do this before the
remote-only advice restriction so unsupported advice is rejected with the
same priority for local and remote mm.
Use an errno-returning helper for address/length validation, and handle
zero-length ranges explicitly at the call sites. Requests with valid
advice and zero total length remain a noop and continue to return 0. Add
a selftest that covers invalid advice with a zero-length iovec and an
empty vector, while also checking that a request with valid advice and
zero length still succeeds.
Link: https://lore.kernel.org/tencent_C3AEB0E769C5F4F9370F9411B69B7F8B2907@qq.com
Fixes: 021781b01275 ("mm/madvise: unrestrict process_madvise() for current process")
Signed-off-by: fujunjie <fujunjie1@qq.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This patch set introces a new action: DAMOS_COLLAPSE.
For DAMOS_HUGEPAGE and DAMOS_NOHUGEPAGE to work, khugepaged should be
working, since it relies on hugepage_madvise to add a new slot. This slot
should be picked up by khugepaged and eventually collapse (or not, if we
are using DAMOS_NOHUGEPAGE) the pages. If THP is not enabled, khugepaged
will not be working, and therefore no collapse will happen.
DAMOS_COLLAPSE eventually calls madvise_collapse, which will collapse the
address range synchronously. In cases where there is a large VMA
(databases, for example), DAMOS_COLLAPSE allows us to collapse only the
hot region, and not the entire VMA.
This new action may be required to support autotuning with hugepage
as a goal[1].
=========
Benchmarks:
=========
MySQL
=====
Tests were performed in an ARM physical server with MariaDB 10.5 and
sysbench. Read only benchmark was perform with gaussian row hitting,
which follows a normal distribution.
T n, D h: THP set to never, DAMON action set to hugepage
T m, D h: THP set to madvise, DAMON action set to hugepage
T n, D c: THP set to never, DAMON action set to collapse
Memory consumption. Lower is better.
+------------------+----------+----------+----------+
| | T n, D h | T m, D h | T n, D c |
+------------------+----------+----------+----------+
| Total memory use | 2.13 | 2.20 | 2.20 |
| Huge pages | 0 | 1.3 | 1.27 |
+------------------+----------+----------+----------+
Performance in TPS (Transactions Per Second). Higher is better.
T n, D h: 18225.58
T m, D h 18252.93
T n, D c: 18270.21
Performance counter
I got the number of L1 D/I TLB accesses and the number a D/I TLB
accesses that triggered a page walk. I divided the second by the
first to get the percentage of page walkes per TLB access. The
lower the better.
+---------------+--------------+--------------+--------------+
| | T n, D h | T m, D h | T n, D c |
+---------------+--------------+--------------+--------------+
| L1 DTLB | 127248242753 | 125431020479 | 125327001821 |
| L1 ITLB | 80332558619 | 79346759071 | 79298139590 |
| DTLB walk | 75011087 | 52800418 | 55895794 |
| ITLB walk | 71577076 | 71505137 | 67262140 |
| DTLB % misses | 0.058948623 | 0.042095183 | 0.044599961 |
| ITLB % misses | 0.089100954 | 0.090117275 | 0.084821839 |
+---------------+--------------+--------------+--------------+
Masim
=====
I used masim with the "demo" configuration, but changing the times
to 100 seconds for the initial phase and 50 seconds for the rest of
the phases.
Memory consumption:
+------------------+----------+----------+----------+
| | T n, D h | T m, D h | T n, D c |
+------------------+----------+----------+----------+
| Total memory use | 2.38 GB | 2.36 GB | 2.37 GB |
| Huge pages | 0 | 190 MB | 188 MB |
+------------------+----------+----------+----------+
Performance:
THP never, DAMOS_HUGEPAGE
initial phase: 40,491 accesses/msec, 100001 msecs run
low phase 0: 39,658 accesses/msec, 50002 msecs run
high phase 0: 41,678 accesses/msec, 50000 msecs run
low phase 1: 39,625 accesses/msec, 50003 msecs run
high phase 1: 41,658 accesses/msec, 50002 msecs run
low phase 2: 39,642 accesses/msec, 50002 msecs run
high phase 2: 41,640 accesses/msec, 50001 msecs run
THP madvise, DAMOS_HUGEPAGE
initial phase: 51,977 accesses/msec, 100000 msecs run
low phase 0: 86,953 accesses/msec, 50000 msecs run
high phase 0: 94,812 accesses/msec, 50000 msecs run
low phase 1: 101,017 accesses/msec, 50000 msecs run
high phase 1: 94,841 accesses/msec, 50000 msecs run
low phase 2: 100,993 accesses/msec, 50000 msecs run
high phase 2: 94,791 accesses/msec, 50001 msecs run
THP never, DAMOS_COLLAPSE
initial phase: 93,678 accesses/msec, 100001 msecs run
low phase 0: 101,475 accesses/msec, 50000 msecs run
high phase 0: 98,589 accesses/msec, 50000 msecs run
low phase 1: 101,531 accesses/msec, 50001 msecs run
high phase 1: 98,506 accesses/msec, 50001 msecs run
low phase 2: 101,458 accesses/msec, 50001 msecs run
high phase 2: 98,555 accesses/msec, 50000 msecs run
Memory consumption dynamic (how quickly collapses occur):
It shows in seconds how many huge pages are allocated.
+----+----------+----------+
| | T m, D h | T n, D c |
+----+----------+----------+
| 5 | 32 | 188 |
| 10 | 48 | 188 |
| 15 | 64 | 188 |
| 20 | 96 | 188 |
| 30 | 112 | 188 |
| 35 | 144 | 188 |
| 40 | 160 | 188 |
| 45 | 190 | 188 |
| 50 | 190 | 188 |
| 55 | 190 | 188 |
| 60 | 190 | 188 |
+----+----------+----------+
=========
- We can see that DAMOS "hugepage" action works only when THP is set
to madvise. "collapse" action works even when THP is set to never.
- Performance for "collapse" action is slightly lower than "hugepage"
action and THP madvise. This is due to the fact that collapases
occur synchronously. With "hugepage" they may occur during page
faults.
- Memory consumption is slighly lower for "collapse" than "hugepage"
with THP madvise. This is due to the khugepage collapses all VMAs,
while "collapse" action only collapses the VMAs in the hot region.
- There is an improvement in TLB utilization when collapse through
"hugepage" or "collapse" actions are triggered. The amount of
TLB misses is lower.
- "collapse" action is performance synchronously, which means that
page collapses happen earlier and more rapidly. This can be
useful or not, depending on the scenario.
- "hugepage" action may trigger a VMA split in some scenarios, since
it needs to change the flag of the VMA to THP enabled. This may
lead to additional overhead.
Collapse action just adds a new option to chose the correct system
balance.
Link: https://lore.kernel.org/20260426231619.107231-5-sj@kernel.org
Link: https://lore.kernel.org/damon/20260313000816.79933-1-sj@kernel.org/ [1]
Signed-off-by: Asier Gutierrez <gutierrez.asier@huawei-partners.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: Cheng-Han Wu <hank20010209@gmail.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam R. Howlett <liam@infradead.org>
Cc: Liew Rui Yan <aethernet65535@gmail.com>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The original version of mremap_test (7df666253f26: "kselftests: vm: add
mremap tests") validated remapped contents byte-by-byte and printed a
mismatch index in case the bytes streams didn't match. That was rather
inefficient, especially also if the test passed.
Later, commit 7033c6cc9620 ("selftests/mm: mremap_test: optimize execution
time from minutes to seconds using chunkwise memcmp") used memcmp() on
bigger chunks, to fallback to byte-wise scanning to detect the problematic
index only if it discovered a problem.
However, the implementation is overly complicated (e.g., get_sqrt() is
currently not optimal) and we don't really have to report the exact index:
whoever debugs the failing test can figure that out.
Let's simplify by just comparing both byte streams with memcmp() and not
detecting the exact failed index.
Link: https://lore.kernel.org/20260415044509.579428-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reported-by: Sarthak Sharma <sarthak.sharma@arm.com>
Tested-by: Sarthak Sharma <sarthak.sharma@arm.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: David Laight <david.laight.linux@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The test was not being run by the selftest framework so it was never
noticed that it would fail with an assertion failure on configs without
support for MAP_DROPPABLE. Update the test so that it is skipped instead
when MAP_DROPPABLE is not supported, and add it to the mmap category so
that the test is run by the framework.
Link: https://lore.kernel.org/20260416033939.49981-4-anthony.yznaga@oracle.com
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
For configs that support MAP_DROPPABLE verify that a mapping created with
MAP_DROPPABLE cannot be locked via mlock(), and that it will not be locked
if it's created after mlockall(MCL_FUTURE).
Link: https://lore.kernel.org/20260416033939.49981-3-anthony.yznaga@oracle.com
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Liam Howlett <liam@infradead.org>
Cc: Lorenzo Stoakes <ljs@kernel.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
zswap writeback is asynchronous, but test_zswap.c checks writeback
counters immediately after reclaim/trigger paths. On some platforms (e.g.
ppc64le), this can race with background writeback and cause spurious
failures even when behavior is correct.
Add wait_for_writeback() to poll get_cg_wb_count() with a bounded
timeout, and use it in:
test_zswap_writeback_one() when writeback is expected
test_no_invasive_cgroup_shrink() for the wb_group check
This keeps the original before/after assertion style while making the
tests robust against writeback completion latency.
No test behavior change, selftest stability improvement only.
Link: https://lore.kernel.org/20260424040059.12940-9-li.wang@linux.dev
Signed-off-by: Li Wang <li.wang@linux.dev>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: Waiman Long <longman@redhat.com>
Cc: Yosry Ahmed <yosry@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
In attempt_writeback(), a memsize of 4M only covers 64 pages on 64K page
size systems. When memory.reclaim is called, the kernel prefers
reclaiming clean file pages (binary, libc, linker, etc.) over swapping
anonymous pages. With only 64 pages of anonymous memory, the reclaim
target can be largely or entirely satisfied by dropping file pages,
resulting in very few or zero anonymous pages being pushed into zswap.
This causes zswap_usage to be extremely small or zero, making
zswap_usage/4 insufficient to create meaningful writeback pressure. The
test then fails because no writeback is triggered.
On 4K page size systems this is not an issue because 4M covers 1024
pages, and file pages are a small fraction of the reclaim target.
Fix this by:
- Always allocating 1024 pages regardless of page size. This ensures
enough anonymous pages to reliably populate zswap and trigger
writeback, while keeping the original 4M allocation on 4K systems.
- Setting zswap.max to zswap_usage/4 instead of zswap_usage/2 to
create stronger writeback pressure, ensuring reclaim reliably
triggers writeback even on large page size systems.
=== Error Log ===
# uname -rm
6.12.0-211.el10.ppc64le ppc64le
# getconf PAGESIZE
65536
# ./test_zswap
TAP version 13
1..7
ok 1 test_zswap_usage
ok 2 test_swapin_nozswap
ok 3 test_zswapin
not ok 4 test_zswap_writeback_enabled
...
Link: https://lore.kernel.org/20260424040059.12940-8-li.wang@linux.dev
Signed-off-by: Li Wang <li.wang@linux.dev>
Acked-by: Yosry Ahmed <yosry@kernel.org>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: Waiman Long <longman@redhat.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
system
test_no_invasive_cgroup_shrink sets up two cgroups: wb_group, which is
expected to trigger zswap writeback, and a control group (renamed to
zw_group), which should only have pages sitting in zswap without any
writeback.
There are two problems with the current test:
1) The data patterns are reversed. wb_group uses allocate_bytes(), which
writes only a single byte per page — trivially compressible,
especially by zstd — so compressed pages fit within zswap.max and
writeback is never triggered. Meanwhile, the control group uses
getrandom() to produce hard-to-compress data, but it is the group
that does *not* need writeback.
2) The test uses fixed sizes (10K zswap.max, 10MB allocation) that are
too small on systems with large PAGE_SIZE (e.g. 64K), failing to
build enough memory pressure to trigger writeback reliably.
Fix both issues by:
- Swapping the data patterns: fill wb_group pages with partially
random data (getrandom for page_size/4 bytes) to resist compression
and trigger writeback, and fill zw_group pages with simple repeated
data to stay compressed in zswap.
- Making all size parameters PAGE_SIZE-aware: set allocation size to
PAGE_SIZE * 1024, memory.zswap.max to PAGE_SIZE, and memory.max to
allocation_size / 2 for both cgroups.
- Allocating memory inline instead of via cg_run() so the pages
remain resident throughout the test.
=== Error Log ===
# getconf PAGESIZE
65536
# ./test_zswap
TAP version 13
...
ok 5 test_zswap_writeback_disabled
ok 6 # SKIP test_no_kmem_bypass
not ok 7 test_no_invasive_cgroup_shrink
Link: https://lore.kernel.org/20260424040059.12940-7-li.wang@linux.dev
Signed-off-by: Li Wang <li.wang@linux.dev>
Acked-by: Nhat Pham <nphamcs@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: Waiman Long <longman@redhat.com>
Cc: Yosry Ahmed <yosry@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
test_zswap uses hardcoded values of 4095 and 4096 throughout as page
stride and page size, which are only correct on systems with a 4K page
size. On architectures with larger pages (e.g., 64K on arm64 or ppc64),
these constants cause memory to be touched at sub-page granularity,
leading to inefficient access patterns and incorrect page count
calculations, which can cause test failures.
Replace all hardcoded 4095 and 4096 values with a global pagesize variable
initialized from sysconf(_SC_PAGESIZE) at startup, and remove the
redundant local sysconf() calls scattered across individual functions. No
functional change on 4K page size systems.
Link: https://lore.kernel.org/20260424040059.12940-6-li.wang@linux.dev
Signed-off-by: Li Wang <li.wang@linux.dev>
Acked-by: Yosry Ahmed <yosry@kernel.org>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|