Sandra Loosemore [Fri, 16 May 2025 21:27:28 +0000 (21:27 +0000)]
ChangeLog.omp bump
Tobias Burnus [Wed, 14 May 2025 18:06:49 +0000 (20:06 +0200)]
OpenMP: Fix mapping of zero-sized arrays with non-literal size: map(var[:n]), n = 0
For map(ptr[:0]), the used map kind is GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION
and it is permitted that 'ptr' does not exist. 'ptr' is set to the device
pointee if it exists or to the host value otherwise.
For map(ptr[:3]), the variable is first mapped and then ptr is updated to point
to the just-mapped device data; the attachment uses GOMP_MAP_ATTACH.
For map(ptr[:n]), GCC always generates a GOMP_MAP_ATTACH, but when n == 0, it
was failing with:
"pointer target not mapped for attach"
The solution is not to fail but first to check whether it was mapped before.
It turned out that for the mapping part, GCC adds a run-time check whether
n == 0 - and uses GOMP_MAP_ZERO_LEN_ARRAY_SECTION for the mapping.
Thus, we just have to check whether there is such a mapping for the address
for which the GOMP_MAP_ATTACH was requested. And, if there is, the
error diagnostic can be skipped.
Unsurprisingly, this issue occurs in real-world code; it was detected in
code that distributes work via MPI, where some bounds ended up being zero
for some processes.
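A minimal sketch of the affected pattern, assuming a hypothetical helper
'sum_section' (it is not taken from the new test cases). The section length
'n' is only known at run time, so the zero-length map kind cannot be chosen
at compile time. Built without -fopenmp, the pragma is ignored and the loop
simply runs on the host.

```c
/* Hypothetical helper, for illustration only: sums the first n elements
   of ptr inside a target region.  Because n is not a literal, the
   attachment of ptr uses GOMP_MAP_ATTACH; with n == 0 this used to end in
   "pointer target not mapped for attach".  */
int
sum_section (const int *ptr, int n)
{
  int sum = 0;
#pragma omp target map(to: ptr[:n]) map(tofrom: sum)
  for (int i = 0; i < n; i++)
    sum += ptr[i];
  return sum;
}
```

Calling 'sum_section (data, 0)' is the case that mirrors the MPI scenario
above: a process whose share of the work happens to be empty.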
libgomp/ChangeLog:
* target.c (gomp_attach_pointer): Return bool; accept additional
bool to optionally silence the fatal pointee-not-found error.
(gomp_map_vars_internal): If the pointee could not be found,
check whether it was mapped as GOMP_MAP_ZERO_LEN_ARRAY_SECTION.
* libgomp.h (gomp_attach_pointer): Update prototype.
* oacc-mem.c (acc_attach_async, goacc_enter_data_internal): Update
calls.
* testsuite/libgomp.c/target-map-zero-sized.c: New test.
* testsuite/libgomp.c/target-map-zero-sized-2.c: New test.
* testsuite/libgomp.c/target-map-zero-sized-3.c: New test.
(cherry picked from commit
814e29e390b1e9253f9a38e0d84f5ebe5de0c13e)
Sandra Loosemore [Thu, 15 May 2025 20:26:30 +0000 (20:26 +0000)]
ChangeLog.omp bump
Tobias Burnus [Fri, 9 May 2025 08:57:44 +0000 (10:57 +0200)]
libgomp.{c,fortran}/interop-{hip,cuda}: Fix dg-run target selection
While the tests checked whether the CUDA/HIP runtime is available
before processing them, the execution was then done unconditionally,
leading to FAIL when the default device was the host (or the wrong
offload device).
Now the test is only executed ('run') when the default device is an
Nvidia or AMD GPU (depending on the test case, cf. the test file name).
Otherwise, only a 'link' test is done. (Except when the effective-target
check cannot find the runtime lib - then the test is skipped [as before].)
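As a sketch, the run/link split for an Nvidia-only test could look like the
following DejaGnu header. The selector name 'offload_device_nvptx' is taken
from the ChangeLog below; the exact directive text in the committed tests is
an assumption, and the effective-target check for the runtime library is
omitted here.

```c
/* Execute only when the default device is an Nvidia GPU ...  */
/* { dg-do run { target { offload_device_nvptx } } } */
/* ... otherwise only compile and link.  */
/* { dg-do link { target { ! offload_device_nvptx } } } */
```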
Note: The cublas/hipblas tests use variant functions and iterate over
all devices, such that the cublas or hipblas, respectively, is only
called when the active device is an AMD or Nvidia device, respectively,
while for the host and other device types the fallback is called.
libgomp/ChangeLog:
* testsuite/libgomp.c/interop-cuda-full.c: Use 'link' instead
of 'run' when the default device is "! offload_device_nvptx".
* testsuite/libgomp.c/interop-cuda-libonly.c: Likewise.
* testsuite/libgomp.c/interop-hip-nvidia-full.c: Likewise.
* testsuite/libgomp.c/interop-hip-nvidia-no-headers.c: Likewise.
* testsuite/libgomp.c/interop-hip-nvidia-no-hip-header.c: Likewise.
* testsuite/libgomp.fortran/interop-hip-nvidia-full.F90: Likewise.
* testsuite/libgomp.fortran/interop-hip-nvidia-no-module.F90: Likewise.
* testsuite/libgomp.c/interop-hip-amd-full.c: Use 'link' instead
of 'run' when the default device is "! offload_device_gcn".
* testsuite/libgomp.c/interop-hip-amd-no-hip-header.c: Likewise.
* testsuite/libgomp.fortran/interop-hip-amd-full.F90: Likewise.
* testsuite/libgomp.fortran/interop-hip-amd-no-module.F90: Likewise.
(cherry picked from commit
94e63410474a36655e1800387eabd73a6f930048)
Sandra Loosemore [Sat, 26 Apr 2025 02:22:39 +0000 (02:22 +0000)]
OpenMP: need_device_ptr and need_device_addr support for adjust_args
This patch adds support for the "need_device_addr" modifier to the
"adjust args" clause for the "declare variant" directive, and
extends/re-works the support for "need_device_ptr" as well.
This patch builds on waffl3x's recently posted patch, "OpenMP: C/C++
adjust-args numeric ranges", available here:
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681806.html
In C++, "need_device_addr" supports mapping reference arguments to
device pointers. In Fortran, it similarly supports arguments passed
by reference, the default for the language, in contrast to
"need_device_ptr" which is used to map arguments of c_ptr type. The
C++ support is straightforward, but Fortran has some additional
wrinkles involving arrays passed by descriptor (a new descriptor must
be constructed with a pointer to the array data which is the only part
mapped to the device), plus special cases for passing optional
arguments and a whole array instead of a reference to its first element.
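A minimal C sketch of the "need_device_ptr" side, under stated assumptions:
the function names 'scale', 'scale_variant' and 'scale_on_device' are
hypothetical and not part of this patch. "need_device_addr" itself targets
C++ references and Fortran by-reference arguments and is therefore not shown.
Built without -fopenmp, the pragmas are ignored and the base function runs on
the host.

```c
/* Hypothetical variant: expects p to already be a device pointer.  */
void
scale_variant (int *p, int n)
{
#pragma omp target is_device_ptr(p)
  for (int i = 0; i < n; i++)
    p[i] *= 2;
}

/* Base function; inside a 'dispatch' construct the call is redirected to
   scale_variant, with p converted to the corresponding device pointer.  */
#pragma omp declare variant(scale_variant) match(construct={dispatch}) \
		    adjust_args(need_device_ptr: p)
void
scale (int *p, int n)
{
  for (int i = 0; i < n; i++)
    p[i] *= 2;
}

/* Maps the data, then calls the base function through 'dispatch'.  */
void
scale_on_device (int *data, int n)
{
#pragma omp target data map(tofrom: data[:n])
  {
#pragma omp dispatch
    scale (data, n);
  }
}
```

The caller only ever passes a host pointer; the translation to a device
pointer is requested by the adjust_args clause rather than done by hand.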
gcc/cp/ChangeLog
* parser.cc (cp_finish_omp_declare_variant): Adjust error messages.
gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_trans_omp_declare_variant): Disallow
polymorphic and optional arguments with need_device_addr for now, but
don't reject need_device_addr entirely.
gcc/ChangeLog
* gimplify.cc (modify_call_for_omp_dispatch): Rework logic for
need_device_ptr and need_device_addr adjustments.
gcc/testsuite/ChangeLog
* c-c++-common/gomp/adjust-args-10.c: Ignore the new sorry since the
lack of proper diagnostic is already xfail'ed.
* g++.dg/gomp/adjust-args-1.C: Adjust output patterns.
* g++.dg/gomp/adjust-args-17.C: New.
* gcc.dg/gomp/adjust-args-3.c: New.
* gfortran.dg/gomp/adjust-args-14.f90: Don't expect this to fail now.
libgomp/ChangeLog
* libgomp.texi: Mark need_device_addr as supported.
* testsuite/libgomp.c++/need-device-ptr.C: New.
* testsuite/libgomp.c-c++-common/dispatch-3.c: New.
* testsuite/libgomp.fortran/adjust-args-array-descriptor.f90: New.
* testsuite/libgomp.fortran/need-device-ptr.f90: New.
Co-Authored-By: Tobias Burnus <tburnus@baylibre.com>
waffl3x [Mon, 5 May 2025 20:20:00 +0000 (20:20 +0000)]
OpenMP: C/C++ adjust-args numeric ranges
Add support for OpenMP parameter-lists in an adjust_args clause, more
specifically, numeric ranges and parameter indices. Many bugs are also
fixed along the way. Most of the fixes rely on the changes to handling of
clause arguments and can't reasonably be split out, while most of the
changes that fix PR119602 came as a side effect of properly handling numeric
ranges with dependent bounds so it makes sense to include it here.
Variadic arguments with an incorrect type are not currently diagnosed, but
handling for them is otherwise functional. It is unclear how references to
pointers are supposed to be handled, so for now we sorry for that case.
PR c++/119659
PR c++/118859
PR c++/119601
PR c++/119602
PR c++/119775
gcc/c/ChangeLog:
* c-parser.cc (c_omp_numeric_ranges_always_overlap): New function.
(c_parser_omp_parm_list): New function.
(c_finish_omp_declare_variant): Use c_parser_omp_parm_list instead
of c_parser_omp_variable_list. Refactor, change format of
"omp declare variant variant args" attribute.
gcc/cp/ChangeLog:
PR c++/119659
PR c++/118859
PR c++/119601
PR c++/119602
PR c++/119775
* cp-tree.h (finish_omp_parm_list): New declaration.
(finish_omp_adjust_args): New declaration.
* decl.cc (omp_declare_variant_finalize_one): Refactor and change
attribute unpacking, use finish_omp_parm_list and
finish_omp_adjust_args, refactor append_args diagnostics, add
nbase_parms to append_args attribute, remove special handling for
member functions.
* parser.cc (cp_parser_direct_declarator): Don't pass parms.
(cp_parser_late_return_type_opt): Remove parms parameter.
(cp_parser_omp_parm_list): New function.
(cp_finish_omp_declare_variant): Remove parms parameter.
Add NULL_TREE instead of nbase_args to append_args_tree. Refactor,
use cp_parser_omp_parm_list not cp_parser_omp_var_list_no_open,
handle "need_device_addr" and remove handling and diagnostics of
parm list arguments that are done too early. Change format of
unnamed variant attribute.
(cp_parser_late_parsing_omp_declare_simd): Remove parms parameter.
* pt.cc (tsubst_attribute): Copy "omp declare variant base" nodes,
substitute parm list numeric range bounds.
* semantics.cc (finish_omp_parm_list): New function.
(finish_omp_adjust_args): New function.
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_declare_variant): Change format of
"omp declare variant variant args" attribute.
gcc/ChangeLog:
* gimplify.cc (modify_call_for_omp_dispatch): Refactor and change
attribute unpacking. For adjust_args variadic functions, expand
numeric ranges with relative bounds. Refactor argument adjustment.
libgomp/ChangeLog:
* libgomp.texi: Set 'adjust args' variadic arguments support to Y.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/pr118579.c: Change error text.
* g++.dg/gomp/adjust-args-1.C: Fix error text, add dg-* directives.
* g++.dg/gomp/adjust-args-2.C: Add dg-* directives.
* g++.dg/gomp/append-args-1.C: Add dg-* directives.
* gcc.dg/gomp/adjust-args-1.c: Fix error text, add dg-* directives.
* gcc.dg/gomp/append-args-1.c: Fix error text, add dg-* directives.
* c-c++-common/gomp/adjust-args-7.c: New test.
* c-c++-common/gomp/adjust-args-8.c: New test.
* c-c++-common/gomp/adjust-args-9.c: New test.
* c-c++-common/gomp/adjust-args-10.c: New test.
* c-c++-common/gomp/adjust-args-11.c: New test.
* c-c++-common/gomp/adjust-args-12.c: New test.
* c-c++-common/gomp/adjust-args-13.c: New test.
* c-c++-common/gomp/adjust-args-14.c: New test.
* c-c++-common/gomp/adjust-args-15.c: New test.
* g++.dg/gomp/adjust-args-5.C: New test.
* g++.dg/gomp/adjust-args-6.C: New test.
* g++.dg/gomp/adjust-args-7.C: New test.
* g++.dg/gomp/adjust-args-8.C: New test.
* g++.dg/gomp/adjust-args-9.C: New test.
* g++.dg/gomp/adjust-args-10.C: New test.
* g++.dg/gomp/adjust-args-11.C: New test.
* g++.dg/gomp/adjust-args-12.C: New test.
* g++.dg/gomp/adjust-args-13.C: New test.
* g++.dg/gomp/adjust-args-14.C: New test.
* g++.dg/gomp/adjust-args-15.C: New test.
* g++.dg/gomp/adjust-args-16.C: New test.
* g++.dg/gomp/append-args-9.C: New test.
* g++.dg/gomp/append-args-10.C: New test.
* g++.dg/gomp/append-args-11.C: New test.
* g++.dg/gomp/append-args-omp-interop-t.h: New header.
Signed-off-by: Waffl3x <waffl3x@baylibre.com>
Andrew Stubbs [Thu, 24 Apr 2025 16:50:08 +0000 (16:50 +0000)]
OpenMP, GCN: Add interop-hsa testcase
This testcase ensures that the interop HSA support is sufficient to run
a kernel manually on the same device.
libgomp/ChangeLog:
* testsuite/libgomp.c/interop-hsa.c: New test.
(cherry picked from commit
8d84ea28510054fbbb8a2b7441916bd75e29163f)
Thomas Schwinge [Wed, 23 Apr 2025 08:51:48 +0000 (10:51 +0200)]
GCN, nvptx offloading: Host/device compatibility: Itanium C++ ABI, DSO Object Destruction API [PR119853, PR119854]
'__dso_handle' for '__cxa_atexit', '__cxa_finalize'. See
<https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor>.
PR target/119853
PR target/119854
libgcc/
* config/gcn/crt0.c (_fini_array): Call
'__GCC_offload___cxa_finalize'.
* config/nvptx/gbl-ctors.c (__static_do_global_dtors): Likewise.
libgomp/
* target-cxa-dso-dtor.c: New.
* config/accel/target-cxa-dso-dtor.c: Likewise.
* Makefile.am (libgomp_la_SOURCES): Add it.
* Makefile.in: Regenerate.
* testsuite/libgomp.c++/target-cdtor-1.C: New.
* testsuite/libgomp.c++/target-cdtor-2.C: Likewise.
(cherry picked from commit
aafe942227baf8c2bcd4cac2cb150e49a4b895a9)
Thomas Schwinge [Wed, 23 Apr 2025 15:35:29 +0000 (17:35 +0200)]
Add 'libgomp.c-c++-common/target-cdtor-1.c'
libgomp/
* testsuite/libgomp.c-c++-common/target-cdtor-1.c: New.
(cherry picked from commit
40ce48e87c1e7344c622c8eb6bed53f1311f5a0a)
Andrew Pinski [Mon, 21 Apr 2025 22:32:26 +0000 (22:32 +0000)]
GCN: Properly switch sections in 'gcn_hsa_declare_function_name' [PR119737]
There are GCN/C++ target codes as well as offloading codes where the
hard-coded section names in 'gcn_hsa_declare_function_name' do not fit, and
assembly thus fails:
LLVM ERROR: Size expression must be absolute.
This commit progresses GCN target:
[-FAIL: g++.dg/init/call1.C -std=gnu++17 (internal compiler error: Aborted signal terminated program as)-]
[-FAIL:-]{+PASS:+} g++.dg/init/call1.C -std=gnu++17 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C -std=gnu++17 [-compilation failed to produce executable-]{+execution test+}
[-FAIL: g++.dg/init/call1.C -std=gnu++26 (internal compiler error: Aborted signal terminated program as)-]
[-FAIL:-]{+PASS:+} g++.dg/init/call1.C -std=gnu++26 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} g++.dg/init/call1.C -std=gnu++26 [-compilation failed to produce executable-]{+execution test+}
UNSUPPORTED: g++.dg/init/call1.C -std=gnu++98: exception handling not supported
..., and GCN offloading:
[-XFAIL: libgomp.c++/target-exceptions-throw-1.C (internal compiler error: Aborted signal terminated program as)-]
[-XFAIL: libgomp.c++/target-exceptions-throw-1.C PR119737 at line 7 (test for bogus messages, line )-]
[-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-1.C [-compilation failed to produce executable-]{+execution test+}
{+PASS: libgomp.c++/target-exceptions-throw-1.C output pattern test+}
[-XFAIL: libgomp.c++/target-exceptions-throw-2.C (internal compiler error: Aborted signal terminated program as)-]
[-XFAIL: libgomp.c++/target-exceptions-throw-2.C PR119737 at line 7 (test for bogus messages, line )-]
[-XFAIL:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} libgomp.c++/target-exceptions-throw-2.C [-compilation failed to produce executable-]{+execution test+}
{+PASS: libgomp.c++/target-exceptions-throw-2.C output pattern test+}
[-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 (internal compiler error: Aborted signal terminated program as)-]
[-XFAIL: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 PR119737 at line 7 (test for bogus messages, line )-]
[-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 [-compilation failed to produce executable-]{+execution test+}
{+PASS: libgomp.oacc-c++/exceptions-throw-1.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 output pattern test+}
[-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 (internal compiler error: Aborted signal terminated program as)-]
[-XFAIL: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 PR119737 at line 9 (test for bogus messages, line )-]
[-XFAIL:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 (test for excess errors)
[-UNRESOLVED:-]{+PASS:+} libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 [-compilation failed to produce executable-]{+execution test+}
{+PASS: libgomp.oacc-c++/exceptions-throw-2.C -DACC_DEVICE_TYPE_radeon=1 -DACC_MEM_SHARED=0 -foffload=amdgcn-amdhsa -O2 output pattern test+}
PR target/119737
gcc/
* config/gcn/gcn.cc (gcn_hsa_declare_function_name): Properly
switch sections.
libgomp/
* testsuite/libgomp.c++/target-exceptions-throw-1.C: Remove
PR119737 XFAILing.
* testsuite/libgomp.c++/target-exceptions-throw-2.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-1.C: Likewise.
* testsuite/libgomp.oacc-c++/exceptions-throw-2.C: Likewise.
Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
(cherry picked from commit
dfc43afe719898c3eafbed37fac7e6809d8b97ab)
Thomas Schwinge [Tue, 22 Apr 2025 11:41:22 +0000 (13:41 +0200)]
Adjust 'libgomp.c++/target-exceptions-pr118794-1.C' for 'targetm.arm_eabi_unwinder' [PR118794]
Fix-up for commit
aa3e72f943032e5f074b2bd2fd06d130dda8760b
"Add test cases for exception handling constructs in dead code for GCN, nvptx target and OpenMP 'target' offloading [PR118794]":
we need to adjust for configurations with 'targetm.arm_eabi_unwinder', as per:
gcc/config/arm/arm.cc:#define TARGET_ARM_EABI_UNWINDER true
gcc/config/c6x/c6x.cc:#define TARGET_ARM_EABI_UNWINDER true
..., which for ARM is conditional to '#if ARM_UNWIND_INFO' (defined in
'gcc/config/arm/bpabi.h', used for various GCC configurations), and for
C6x unconditional.
This gets us:
--- target-exceptions-pr118794-1.C.269t.optimized
+++ target-exceptions-pr118794-1.C.270t.optimized
[...]
__attribute__((omp declare target))
void f ()
[...]
gimple_call <__dt_comp , NULL, &c>
- gimple_call <__builtin_eh_pointer, _7, 2>
- gimple_call <__builtin_unwind_resume, NULL, _7>
+ gimple_call <__builtin_cxa_end_cleanup, NULL>
}
[...]
PR target/118794
libgomp/
* testsuite/libgomp.c++/target-exceptions-pr118794-1.C: Adjust for
'targetm.arm_eabi_unwinder'.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-GCN.C:
Likewise.
* testsuite/libgomp.c++/target-exceptions-pr118794-1-offload-sorry-nvptx.C:
Likewise.
(cherry picked from commit
8a1f5424b04130f88e9dcd5cbecd58300bc5166e)
Tobias Burnus [Thu, 24 Apr 2025 16:26:30 +0000 (18:26 +0200)]
libgomp/testsuite: Fix hip_header_nvidia check, add workaround to test
This is all about using AMD's HIP header files with
__HIP_PLATFORM_NVIDIA__ defined, i.e. HIP with Nvidia/CUDA; in that case,
HIP is a thin layer on top of CUDA.
First, the check_effective_target_gomp_hip_header_nvidia check failed;
to fix it, -Wno-deprecated-declarations was added - and likewise to the
two affected testcases that actually used the HIP headers on Nvidia.
Doing so, the HIP test was successful but the HIP-BLAS one showed two
issues:
* One seems to be related to include search paths as the HIP header uses
#include "library_types.h" to include that CUDA header. Seemingly, it
tried to include (again) the HIP header hip/library_types.h, not the
CUDA one. I guess, some tweaking of -isystem vs. -I could have
prevented this, but the simpler workaround was to just explicitly
include the CUDA one before the HIP header files.
* Once done, everything compiled but linking failed as the association
between three HIP-BLAS functions and their CUDA-BLAS ones did not
work. Solution: Just add three #define for mapping them.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp
(check_effective_target_gomp_hip_header_nvidia): Compile with
"-Wno-deprecated-declarations".
* testsuite/libgomp.c/interop-hip-nvidia-full.c: Likewise.
* testsuite/libgomp.c/interop-hipblas-nvidia-full.c: Likewise.
* testsuite/libgomp.c/interop-hipblas.h: Add workarounds
when using the HIP headers with __HIP_PLATFORM_NVIDIA__.
(cherry picked from commit
8ef0518bce489c4c0c252a0e0c44193c5f7cf777)
Tobias Burnus [Thu, 24 Apr 2025 12:36:37 +0000 (14:36 +0200)]
libgomp: Add additional OpenMP interop runtime tests
Add checks for nowait/depend and checks that the returned
CUDA, CUDA_DRIVER and HIP interop objects actually work.
While the CUDA/CUDA_DRIVER ones are only for Nvidia GPUs, HIP
works on both AMD and Nvidia GPUs; on Nvidia GPUs, it is a
very thin wrapper around CUDA.
For Fortran, only a HIP test has been added - using hipfort.
While libgomp.c-c++-common/interop-2.c always works - even without
GPU - and checks for depend / nowait, all others require that
runtime libraries are found at link (and execution) time:
For Nvidia GPUs, libcuda + libcudart or libcublas,
For AMD GPUs, libamdhip64 or libhipblas.
The header files and hipfort modules do not need to be present as a
fallback has been implemented, but if they are, they get used.
Due to the combinations, the basic 1x C/C++, 4x C and 1x Fortran tests
yield 1x C/C++, 14x C and 4x Fortran run-test files.
libgomp/ChangeLog:
* testsuite/lib/libgomp.exp (check_effective_target_openacc_cublas,
check_effective_target_openacc_cudart): Update description as
the check requires more.
(check_effective_target_openacc_libcuda,
check_effective_target_openacc_libcublas,
check_effective_target_openacc_libcudart,
check_effective_target_gomp_hip_header_amd,
check_effective_target_gomp_hip_header_nvidia,
check_effective_target_gomp_hipfort_module,
check_effective_target_gomp_libamdhip64,
check_effective_target_gomp_libhipblas): New.
* testsuite/libgomp.c-c++-common/interop-2.c: New test.
* testsuite/libgomp.c/interop-cublas-full.c: New test.
* testsuite/libgomp.c/interop-cublas-libonly.c: New test.
* testsuite/libgomp.c/interop-cuda-full.c: New test.
* testsuite/libgomp.c/interop-cuda-libonly.c: New test.
* testsuite/libgomp.c/interop-hip-amd-full.c: New test.
* testsuite/libgomp.c/interop-hip-amd-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-full.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-no-headers.c: New test.
* testsuite/libgomp.c/interop-hip-nvidia-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hip.h: New test.
* testsuite/libgomp.c/interop-hipblas-amd-full.c: New test.
* testsuite/libgomp.c/interop-hipblas-amd-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-full.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-no-headers.c: New test.
* testsuite/libgomp.c/interop-hipblas-nvidia-no-hip-header.c: New test.
* testsuite/libgomp.c/interop-hipblas.h: New test.
* testsuite/libgomp.fortran/interop-hip-amd-full.F90: New test.
* testsuite/libgomp.fortran/interop-hip-amd-no-module.F90: New test.
* testsuite/libgomp.fortran/interop-hip-nvidia-full.F90: New test.
* testsuite/libgomp.fortran/interop-hip-nvidia-no-module.F90: New test.
* testsuite/libgomp.fortran/interop-hip.h: New test.
(cherry picked from commit
515d9be7944e89f5ec4363f9816ad4031ab6394b)
Kwok Cheung Yeung [Sat, 3 May 2025 21:10:47 +0000 (21:10 +0000)]
openmp, fortran: Add support for non-constant iterator bounds in Fortran deep-mapping iterator support
gcc/fortran/
* trans-openmp.cc (gfc_omp_deep_mapping_map): Add new argument for
vector of newly created iterators. Push new iterators onto the
vector.
(gfc_omp_deep_mapping_comps): Add new argument for vector of new
iterators. Pass argument in calls to gfc_omp_deep_mapping_item and
gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_item): Add new argument for vector of new
iterators. Pass argument in calls to gfc_omp_deep_mapping_map and
gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_do): Add new argument for vector of new
iterators. Pass argument in calls to gfc_omp_deep_mapping_item.
(gfc_omp_deep_mapping_cnt): Pass NULL to new argument for
gfc_omp_deep_mapping_do.
(gfc_omp_deep_mapping): Add new argument for vector of new
iterators. Pass argument in calls to gfc_omp_deep_mapping_do.
* trans.h (gfc_omp_deep_mapping): Add new argument.
gcc/
* langhooks-def.h (lhd_omp_deep_mapping): Add new argument.
* langhooks.cc (lhd_omp_deep_mapping): Likewise.
* langhooks.h (omp_deep_mapping): Likewise.
* omp-low.cc (allocate_omp_iterator_elems): Work on the supplied
iterator set instead of the iterators in a supplied set of clauses.
(free_omp_iterator_elems): Likewise.
(lower_omp_target): Maintain vector of new iterators generated by
deep-mapping. Allocate and free iterator element arrays using
iterators found in clauses and in the new iterator vector.
libgomp/
* testsuite/libgomp.fortran/allocatable-comp-iterators.f90: Add test
for non-const iterator boundaries.
Kwok Cheung Yeung [Sat, 3 May 2025 21:03:33 +0000 (21:03 +0000)]
openmp, fortran: Add iterator support for Fortran deep-mapping of allocatables
gcc/fortran/
* trans-openmp.cc (gfc_omp_deep_mapping_map): Remove const from ctx
argument. Add arguments for iterators and the statement sequence to
go into the iterator loop. Add statement sequence to iterator loop
body. Generate iterator loop entries for generated maps, insert
the map decls and sizes into the iterator element arrays, replace
original decls with the address of the element arrays, and
sizes/biases with SIZE_INT.
(gfc_omp_deep_mapping_comps): Remove const from ctx. Add argument for
iterators. Pass iterators to calls to gfc_omp_deep_mapping_item and
gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_item): Remove const from ctx. Add argument for
iterators. Collect generated side-effect statements and pass to
gfc_omp_deep_mapping_map along with the iterators. Pass iterators
to gfc_omp_deep_mapping_comps.
(gfc_omp_deep_mapping_do): Remove const from ctx. Pass iterators to
gfc_omp_deep_mapping_item.
(gfc_omp_deep_mapping_cnt): Remove const from ctx.
(gfc_omp_deep_mapping): Likewise.
* trans.h (gfc_omp_deep_mapping_cnt): Likewise.
(gfc_omp_deep_mapping): Likewise.
gcc/
* gimplify.cc (enter_omp_iterator_loop_context): New function variant.
(enter_omp_iterator_loop_context): Delegate to new variant.
(exit_omp_iterator_loop_context): New function variant.
(exit_omp_iterator_loop_context): Delegate to new variant.
(assign_to_iterator_elems_array): New.
(add_new_omp_iterators_entry): New.
(add_new_omp_iterators_clause): Delegate to
add_new_omp_iterators_entry.
* gimplify.h (enter_omp_iterator_loop_context): New prototype.
(enter_omp_iterator_loop_context): Remove default argument.
(exit_omp_iterator_loop_context): Remove argument.
(assign_to_iterator_elems_array): New prototype.
(add_new_omp_iterators_entry): New prototype.
(add_new_omp_iterators_clause): New prototype.
* langhooks-def.h (lhd_omp_deep_mapping_cnt): Remove const from
argument.
(lhd_omp_deep_mapping): Likewise.
* langhooks.cc (lhd_omp_deep_mapping_cnt): Likewise.
(lhd_omp_deep_mapping): Likewise.
* langhooks.h (omp_deep_mapping_cnt): Likewise.
(omp_deep_mapping): Likewise.
* omp-low.cc (lower_omp_map_iterator_expr): Delegate to
assign_to_iterator_elems_array.
(lower_omp_map_iterator_size): Likewise.
(lower_omp_target): Remove sorry for deep mapping.
libgomp/
* testsuite/libgomp.fortran/allocatable-comp-iterators.f90: New.
Kwok Cheung Yeung [Sat, 3 May 2025 20:42:46 +0000 (20:42 +0000)]
openmp, Fortran: Add support for using iterators with custom mappers in Fortran
gcc/fortran/
* openmp.cc (gfc_omp_instantiate_mapper): Add argument for namespace.
Apply namespace to new clauses. Propagate namespace to nested
mappers.
(gfc_omp_instantiate_mappers): Pass namespace of clause to clauses
generated by mappers.
libgomp/
* testsuite/libgomp.fortran/mapper-iterators-1.f90: New test.
* testsuite/libgomp.fortran/mapper-iterators-2.f90: New test.
* testsuite/libgomp.fortran/mapper-iterators-3.f90: New test.
* testsuite/libgomp.fortran/mapper-iterators-4.f90: New test.
Co-authored-by: Andrew Stubbs <ams@baylibre.com>
Kwok Cheung Yeung [Mon, 13 Jan 2025 13:08:07 +0000 (13:08 +0000)]
openmp: Add support for using custom mappers with iterators (C, C++)
gcc/c-family/
* c-omp.cc (omp_instantiate_mapper): Apply iterator to new clauses
generated from mapper.
gcc/c/
* c-parser.cc (c_parser_omp_clause_map): Apply iterator to push and
pop mapper clauses.
gcc/cp/
* parser.cc (cp_parser_omp_clause_map): Apply iterator to push and
pop mapper clauses.
* semantics.cc (cxx_omp_map_array_section): Allow array types for
base type of array sections.
libgomp/
* testsuite/libgomp.c-c++-common/mapper-iterators-1.c: New test.
* testsuite/libgomp.c-c++-common/mapper-iterators-2.c: New test.
* testsuite/libgomp.c-c++-common/mapper-iterators-3.c: New test.
Co-authored-by: Andrew Stubbs <ams@baylibre.com>
Kwok Cheung Yeung [Fri, 11 Apr 2025 17:27:00 +0000 (18:27 +0100)]
openmp: Fix struct handling for OpenMP iterators
New clauses can be created for structs, and these will also need to have
iterators applied to them if the base clause is using iterators. As this
occurs after the initial iterator expansion, a new mechanism for allocating
new entries in the iterator loop is required.
gcc/
* gimplify.cc (add_new_omp_iterators_clause): New.
(build_omp_struct_comp_nodes): Add extra argument for loops sequence.
Call add_new_omp_iterators_clause on newly generated clauses.
(omp_accumulate_sibling_list): Add extra argument for loops sequence.
Pass to calls to build_omp_struct_comp_nodes. Add iterators to newly
generated clauses for struct accesses.
(omp_build_struct_sibling_lists): Add extra argument for loops
sequence. Pass to call to omp_accumulate_sibling_list.
(gimplify_adjust_omp_clauses): Pass loops sequence to
omp_build_struct_sibling_lists.
Kwok Cheung Yeung [Sat, 3 May 2025 20:38:10 +0000 (20:38 +0000)]
openmp: Add macros for iterator element access
gcc/c/
* c-parser.cc (c_parser_omp_iterators): Use macros for accessing
iterator elements.
(c_parser_omp_clause_affinity): Likewise.
(c_parser_omp_clause_depend): Likewise.
(c_parser_omp_clause_map): Likewise.
(c_parser_omp_clause_from_to): Likewise.
* c-typeck.cc (c_omp_finish_iterators): Likewise.
gcc/cp/
* parser.cc (cp_parser_omp_iterators): Use macros for accessing
iterator elements.
(cp_parser_omp_clause_affinity): Likewise.
(cp_parser_omp_clause_depend): Likewise.
(cp_parser_omp_clause_from_to): Likewise.
(cp_parser_omp_clause_map): Likewise.
* semantics.cc (cp_omp_finish_iterators): Likewise.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_array_section): Use macros for
accessing iterator elements.
(handle_iterator): Likewise.
(gfc_trans_omp_clauses): Likewise.
gcc/
* gimplify.cc (gimplify_omp_affinity): Use macros for accessing
iterator elements.
(compute_omp_iterator_count): Likewise.
(build_omp_iterator_loop): Likewise.
(remove_unused_omp_iterator_vars): Likewise.
(build_omp_iterators_loops): Likewise.
(enter_omp_iterator_loop_context_1): Likewise.
(extract_base_bit_offset): Likewise.
* omp-low.cc (lower_omp_map_iterator_expr): Likewise.
(lower_omp_map_iterator_size): Likewise.
(allocate_omp_iterator_elems): Likewise.
(free_omp_iterator_elems): Likewise.
* tree-inline.cc (copy_tree_body_r): Likewise.
* tree-pretty-print.cc (dump_omp_iterators): Likewise.
* tree.h (OMP_ITERATORS_VAR, OMP_ITERATORS_BEGIN, OMP_ITERATORS_END,
OMP_ITERATORS_STEP, OMP_ITERATORS_ORIG_STEP, OMP_ITERATORS_BLOCK,
OMP_ITERATORS_LABEL, OMP_ITERATORS_INDEX, OMP_ITERATORS_ELEMS,
OMP_ITERATORS_COUNT, OMP_ITERATORS_EXPANDED_P): New macros.
Kwok Cheung Yeung [Thu, 12 Dec 2024 21:22:20 +0000 (21:22 +0000)]
openmp: Add support for non-constant iterator parameters in map, to and from clauses
This patch enables support for using non-constant expressions when specifying
iterators in the map clause of target constructs and to/from clauses of
target update constructs.
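A minimal sketch of what this enables, assuming a hypothetical helper
'sum_rows' (not taken from the new tests): the iterator bound 'n' and the
section length 'len' are ordinary run-time values rather than constants.
Built without -fopenmp, the pragma is ignored and the loops run on the host;
the behaviour on actual offload hardware has not been verified for this
sketch.

```c
/* Hypothetical helper, for illustration only: sums an n-by-len array of
   separately allocated rows inside a target region.  The iterator
   expands to one map entry per row, with the row count known only at
   run time.  */
int
sum_rows (int **rows, int n, int len)
{
  int sum = 0;
#pragma omp target map(to: rows[:n]) \
		   map(iterator(i = 0:n), to: rows[i][:len]) \
		   map(tofrom: sum)
  for (int i = 0; i < n; i++)
    for (int j = 0; j < len; j++)
      sum += rows[i][j];
  return sum;
}
```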
gcc/
* gimplify.cc (omp_iterator_elems_length): New.
(build_omp_iterators_loops): Change type of elements
array to pointer of pointers if array length is non-constant, and
assign size with indirect reference. Reorder elements added to
iterator vector and add element containing the iteration count. Use
omp_iterator_elems_length to compute element array size required.
* gimplify.h (omp_iterator_elems_length): New prototype.
* omp-low.cc (lower_omp_map_iterator_expr): Reorder elements read
from iterator vector. If elements field is a pointer type, assign
using pointer arithmetic followed by indirect reference, and return
the field directly.
(lower_omp_map_iterator_size): Reorder elements read from iterator
vector. If elements field is a pointer type, assign using pointer
arithmetic followed by indirect reference.
(allocate_omp_iterator_elems): New.
(free_omp_iterator_elems): New.
(lower_omp_target): Call allocate_omp_iterator_elems before inserting
loops sequence, and call free_omp_iterator_elems afterwards.
* tree-pretty-print.cc (dump_omp_iterators): Print extra elements in
iterator vector.
gcc/testsuite/
* c-c++-common/gomp/target-map-iterators-3.c: Update expected Gimple
output.
* c-c++-common/gomp/target-map-iterators-5.c: New.
* c-c++-common/gomp/target-update-iterators-3.c: Update expected
Gimple output.
* gfortran.dg/gomp/target-map-iterators-3.f90: Likewise.
* gfortran.dg/gomp/target-map-iterators-5.f90: New.
* gfortran.dg/gomp/target-update-iterators-3.f90: Update expected
Gimple output.
libgomp/
* testsuite/libgomp.c-c++-common/target-map-iterators-4.c: New.
* testsuite/libgomp.c-c++-common/target-map-iterators-5.c: New.
* testsuite/libgomp.c-c++-common/target-update-iterators-4.c: New.
* testsuite/libgomp.fortran/target-map-iterators-4.f90: New.
* testsuite/libgomp.fortran/target-map-iterators-5.f90: New.
* testsuite/libgomp.fortran/target-update-iterators-4.f90: New.
Kwok Cheung Yeung [Thu, 12 Sep 2024 20:33:58 +0000 (21:33 +0100)]
openmp: Disable strided target updates when iterators are used
Non-contiguous target updates result in the new strided target updates code
being used, resulting in new clauses such as GOMP_MAP_GRID_DIM,
GOMP_MAP_GRID_STRIDE etc. These are not currently supported in conjunction
with iterators, so this code-path is disabled when used together with
iterators.
The older target update code supports non-contiguous updates as long as a
stride is not applied.
gcc/c/
* c-typeck.cc (handle_omp_array_sections): Add extra argument. Set
argument to true if array section has a stride that is not one.
(c_finish_omp_clauses): Disable strided updates when iterators are
used in the clause. Emit sorry if strided.
gcc/cp/
* semantics.cc (handle_omp_array_sections): Add extra argument. Set
argument to true if array section has a stride that is not one.
(finish_omp_clauses): Disable strided updates when iterators are
used in the clause. Emit sorry if strided.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_clauses): Disable strided updates
when iterators are used in the clause.
Kwok Cheung Yeung [Wed, 27 Nov 2024 21:56:08 +0000 (21:56 +0000)]
openmp, fortran: Add support for iterators in OpenMP 'target update' constructs (Fortran)
This adds Fortran support for iterators in 'to' and 'from' clauses in the
'target update' OpenMP directive.
gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Add iterator support for
OMP_LIST_TO and OMP_LIST_FROM.
* match.cc (gfc_free_namelist): Free namespace for OMP_LIST_TO and
OMP_LIST_FROM.
* openmp.cc (gfc_free_omp_clauses): Free namespace for OMP_LIST_TO
and OMP_LIST_FROM.
(gfc_match_motion_var_list): Parse 'iterator' modifier.
(resolve_omp_clauses): Resolve iterators for OMP_LIST_TO and
OMP_LIST_FROM.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle iterators in
OMP_LIST_TO and OMP_LIST_FROM clauses. Add expressions to
iter_block rather than block.
gcc/testsuite/
* gfortran.dg/gomp/target-update-iterators-1.f90: New.
* gfortran.dg/gomp/target-update-iterators-2.f90: New.
* gfortran.dg/gomp/target-update-iterators-3.f90: New.
libgomp/
* testsuite/libgomp.fortran/target-update-iterators-1.f90: New.
* testsuite/libgomp.fortran/target-update-iterators-2.f90: New.
* testsuite/libgomp.fortran/target-update-iterators-3.f90: New.
Co-authored-by: Andrew Stubbs <ams@baylibre.com>
Kwok Cheung Yeung [Sat, 3 May 2025 20:36:21 +0000 (20:36 +0000)]
openmp, fortran: Add support for map iterators in OpenMP target construct (Fortran)
This adds support for iterators in map clauses within OpenMP
'target' constructs in Fortran.
Some special handling for struct field maps has been added to libgomp in
order to handle arrays of derived types.
gcc/fortran/
* dump-parse-tree.cc (show_omp_namelist): Add iterator support for
OMP_LIST_MAP.
* match.cc (gfc_free_namelist): Free namespace for OMP_LIST_MAP.
* openmp.cc (gfc_free_omp_clauses): Free namespace in namelist for
OMP_LIST_MAP.
(gfc_match_omp_clauses): Parse 'iterator' modifier for 'map' clause.
(resolve_omp_clauses): Resolve iterators for OMP_LIST_MAP.
* trans-openmp.cc: Include tree-ssa-loop-niter.h.
(gfc_trans_omp_array_section): Add iterator argument. Replace
instances of iterator variables with the initial value when
computing biases.
(gfc_trans_omp_clauses): Handle iterators in OMP_LIST_MAP clauses.
Add expressions to iter_block rather than block. Do not apply
iterators to firstprivate maps. Pass iterator to
gfc_trans_omp_array_section.
gcc/
* gimplify.cc (compute_omp_iterator_count): Account for difference
in loop boundaries in Fortran.
(build_omp_iterator_loop): Change upper boundary condition for
Fortran. Insert block statements into innermost loop.
(remove_unused_omp_iterator_vars): Copy block subblocks of old
iterator to new iterator and remove original.
(contains_vars_1): New.
(contains_vars): New.
(extract_base_bit_offset): Add iterator argument. Remove iterator
variables from base. Do not set variable_offset if the offset
does not contain any remaining variables.
(omp_accumulate_sibling_list): Add iterator argument to
extract_base_bit_offset.
* tree-pretty-print.cc (dump_block_node): Ignore BLOCK_SUBBLOCKS
containing iterator block statements.
gcc/testsuite/
* gfortran.dg/gomp/target-map-iterators-1.f90: New.
* gfortran.dg/gomp/target-map-iterators-2.f90: New.
* gfortran.dg/gomp/target-map-iterators-3.f90: New.
* gfortran.dg/gomp/target-map-iterators-4.f90: New.
libgomp/
* target.c (kind_to_name): Handle GOMP_MAP_STRUCT and
GOMP_MAP_STRUCT_UNORD.
(gomp_add_map): New.
(gomp_merge_iterator_maps): Expand fields of a struct mapping
breadth-first.
* testsuite/libgomp.fortran/target-map-iterators-1.f90: New.
* testsuite/libgomp.fortran/target-map-iterators-2.f90: New.
* testsuite/libgomp.fortran/target-map-iterators-3.f90: New.
Co-authored-by: Andrew Stubbs <ams@baylibre.com>
Kwok Cheung Yeung [Wed, 16 Apr 2025 10:43:00 +0000 (11:43 +0100)]
openmp, fortran: Revert to using tree expressions when translating Fortran OpenMP array sections
In the patch 'OpenACC 2.7: Implement reductions for arrays and records',
temporaries are used to hold the decl and bias of clauses resulting from array
sections, which breaks some assumptions made for map iterator support.
This patch reverts the change for OpenMP only.
gcc/fortran/
* trans-openmp.cc (gfc_trans_omp_array_section): Use temporaries only
when translating OpenACC.
gcc/testsuite/
* gfortran.dg/gomp/target-enter-exit-data.f90: Revert expected tree
dumps.
Kwok Cheung Yeung [Sat, 3 May 2025 20:30:16 +0000 (20:30 +0000)]
openmp, fortran: Move udm field of gfc_omp_namelist into a new union
This patch moves u2.udm into u3.udm.
This is necessary to avoid clashes when mappers are used together with
iterators, which use u2.ns.
gcc/fortran/
* gfortran.h (struct gfc_omp_namelist): Move udm field into a new
union u3.
* match.cc (gfc_free_omp_namelist): Change references to u2.udm to
u3.udm.
* module.cc (load_omp_udms): Likewise.
(write_omp_udm): Likewise.
* openmp.cc (gfc_match_motion_var_list): Likewise.
(gfc_match_omp_clauses): Likewise.
(resolve_omp_clauses): Likewise.
(gfc_omp_instantiate_mapper): Likewise.
* trans-openmp.cc (gfc_trans_omp_clauses): Likewise.
(gfc_find_nested_mappers): Likewise.
Kwok Cheung Yeung [Wed, 27 Nov 2024 21:51:34 +0000 (21:51 +0000)]
openmp: Add support for iterators in 'target update' clauses (C/C++)
This adds support for iterators in 'to' and 'from' clauses in the
'target update' OpenMP directive.
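A minimal sketch of what this enables, assuming constant iterator bounds; the names ROWS, COLS and double_rows are made up for illustration and do not appear in the patch. Without OpenMP support the pragmas are ignored and everything runs on the host.

```c
#include <assert.h>

#define ROWS 4   /* hypothetical sizes, chosen only for this example */
#define COLS 8

/* Double every element on the device, then copy all rows back to the
   host with one 'target update' whose 'from' clause uses an iterator.  */
void
double_rows (int **rows)
{
  assert (rows != 0);

  #pragma omp target enter data map(to: rows[:ROWS])
  #pragma omp target enter data map(iterator(i = 0:ROWS), to: rows[i][:COLS])

  #pragma omp target
  for (int r = 0; r < ROWS; r++)
    for (int c = 0; c < COLS; c++)
      rows[r][c] *= 2;

  /* One update per iterator value: rows[0][:COLS] ... rows[ROWS-1][:COLS].  */
  #pragma omp target update from(iterator(i = 0:ROWS): rows[i][:COLS])

  #pragma omp target exit data map(iterator(i = 0:ROWS), release: rows[i][:COLS])
  #pragma omp target exit data map(release: rows[:ROWS])
}
```

The iterator saves writing one 'from' clause per row, which matters when the rows are separately allocated and therefore not contiguous.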
gcc/c/
* c-parser.cc (c_parser_omp_clause_from_to): Parse 'iterator' modifier.
* c-typeck.cc (c_finish_omp_clauses): Finish iterators for to/from
clauses.
gcc/cp/
* parser.cc (cp_parser_omp_clause_from_to): Parse 'iterator' modifier.
* semantics.cc (finish_omp_clauses): Finish iterators for to/from
clauses.
gcc/
* gimplify.cc (gimplify_scan_omp_clauses): Add argument for iterator
loop sequence. Gimplify the clause decl and size into the iterator
loop if iterators are used.
(gimplify_omp_workshare): Add argument for iterator loops sequence
in call to gimplify_scan_omp_clauses.
(gimplify_omp_target_update): Call remove_unused_omp_iterator_vars and
build_omp_iterators_loops. Add loop sequence as argument when calling
gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses and building
the Gimple statement.
* tree-pretty-print.cc (dump_omp_clause): Call dump_omp_iterators
for to/from clauses with iterators.
* tree.cc (omp_clause_num_ops): Add extra operand for OMP_CLAUSE_FROM
and OMP_CLAUSE_TO.
* tree.h (OMP_CLAUSE_HAS_ITERATORS): Add check for OMP_CLAUSE_TO and
OMP_CLAUSE_FROM.
(OMP_CLAUSE_ITERATORS): Likewise.
gcc/testsuite/
* c-c++-common/gomp/target-update-iterators-1.c: New.
* c-c++-common/gomp/target-update-iterators-2.c: New.
* c-c++-common/gomp/target-update-iterators-3.c: New.
libgomp/
* target.c (gomp_update): Call gomp_merge_iterator_maps. Free
allocated variables.
* testsuite/libgomp.c-c++-common/target-update-iterators-1.c: New.
* testsuite/libgomp.c-c++-common/target-update-iterators-2.c: New.
* testsuite/libgomp.c-c++-common/target-update-iterators-3.c: New.
Kwok Cheung Yeung [Sat, 3 May 2025 20:24:26 +0000 (20:24 +0000)]
openmp: Add support for iterators in map clauses (C/C++)
This adds preliminary support for iterators in map clauses within OpenMP
'target' constructs (which includes constructs such as 'target enter data').
Iterators with non-constant loop bounds are not currently supported.
gcc/c/
* c-parser.cc (c_parser_omp_clause_map): Parse 'iterator' modifier.
* c-typeck.cc (c_finish_omp_clauses): Finish iterators. Apply
iterators to generated clauses.
gcc/cp/
* parser.cc (cp_parser_omp_clause_map): Parse 'iterator' modifier.
* semantics.cc (finish_omp_clauses): Finish iterators. Apply
iterators to generated clauses.
gcc/
* gimple-pretty-print.cc (dump_gimple_omp_target): Print expanded
iterator loops.
* gimple.cc (gimple_build_omp_target): Add argument for iterator
loops sequence. Initialize iterator loops field.
* gimple.def (GIMPLE_OMP_TARGET): Set GSS symbol to GSS_OMP_TARGET.
* gimple.h (gomp_target): Set GSS symbol to GSS_OMP_TARGET. Add extra
field for iterator loops.
(gimple_build_omp_target): Add argument for iterator loops sequence.
(gimple_omp_target_iterator_loops): New.
(gimple_omp_target_iterator_loops_ptr): New.
(gimple_omp_target_set_iterator_loops): New.
* gimplify.cc (find_var_decl): New.
(copy_omp_iterator): New.
(remap_omp_iterator_var_1): New.
(remap_omp_iterator_var): New.
(remove_unused_omp_iterator_vars): New.
(struct iterator_loop_info_t): New type.
(iterator_loop_info_map_t): New type.
(build_omp_iterators_loops): New.
(enter_omp_iterator_loop_context_1): New.
(enter_omp_iterator_loop_context): New.
(exit_omp_iterator_loop_context): New.
(gimplify_adjust_omp_clauses): Add argument for iterator loop
sequence. Gimplify the clause decl and size into the iterator
loop if iterators are used.
(gimplify_omp_workshare): Call remove_unused_omp_iterator_vars and
build_omp_iterators_loops for OpenMP target expressions. Add
loop sequence as argument when calling gimplify_adjust_omp_clauses
and building the Gimple statement.
* gimplify.h (enter_omp_iterator_loop_context): New prototype.
(exit_omp_iterator_loop_context): New prototype.
* gsstruct.def (GSS_OMP_TARGET): New.
* omp-low.cc (lower_omp_map_iterator_expr): New.
(lower_omp_map_iterator_size): New.
(finish_omp_map_iterators): New.
(lower_omp_target): Add sorry if iterators used with deep mapping.
Call lower_omp_map_iterator_expr before assigning to sender ref.
Call lower_omp_map_iterator_size before setting the size. Insert
iterator loop sequence before the statements for the target clause.
* tree-nested.cc (convert_nonlocal_reference_stmt): Walk the iterator
loop sequence of OpenMP target statements.
(convert_local_reference_stmt): Likewise.
(convert_tramp_reference_stmt): Likewise.
* tree-pretty-print.cc (dump_omp_iterators): Dump extra iterator
information if present.
(dump_omp_clause): Call dump_omp_iterators for iterators in map
clauses.
* tree.cc (omp_clause_num_ops): Add operand for OMP_CLAUSE_MAP.
(walk_tree_1): Do not walk last operand of OMP_CLAUSE_MAP.
* tree.h (OMP_CLAUSE_HAS_ITERATORS): New.
(OMP_CLAUSE_ITERATORS): New.
gcc/testsuite/
* c-c++-common/gomp/map-6.c (foo): Amend expected error message.
* c-c++-common/gomp/target-map-iterators-1.c: New.
* c-c++-common/gomp/target-map-iterators-2.c: New.
* c-c++-common/gomp/target-map-iterators-3.c: New.
* c-c++-common/gomp/target-map-iterators-4.c: New.
libgomp/
* target.c (kind_to_name): New.
(gomp_merge_iterator_maps): New.
(gomp_map_vars_internal): Call gomp_merge_iterator_maps. Copy
address of only the first iteration to target vars. Free allocated
variables.
* testsuite/libgomp.c-c++-common/target-map-iterators-1.c: New.
* testsuite/libgomp.c-c++-common/target-map-iterators-2.c: New.
* testsuite/libgomp.c-c++-common/target-map-iterators-3.c: New.
Co-authored-by: Andrew Stubbs <ams@baylibre.com>
Kwok Cheung Yeung [Wed, 27 Nov 2024 21:49:12 +0000 (21:49 +0000)]
openmp: Refactor handling of iterators
Move code to calculate the iteration size and to generate the iterator
expansion loop into separate functions.
Use OMP_ITERATOR_DECL_P to check for iterators in clause declarations.
gcc/c-family/
* c-omp.cc (c_finish_omp_depobj): Use OMP_ITERATOR_DECL_P.
gcc/c/
* c-typeck.cc (handle_omp_array_sections): Use OMP_ITERATOR_DECL_P.
(c_finish_omp_clauses): Likewise.
gcc/cp/
* pt.cc (tsubst_omp_clause_decl): Use OMP_ITERATOR_DECL_P.
* semantics.cc (handle_omp_array_sections): Likewise.
(finish_omp_clauses): Likewise.
gcc/
* gimplify.cc (gimplify_omp_affinity): Use OMP_ITERATOR_DECL_P.
(compute_omp_iterator_count): New.
(build_omp_iterator_loop): New.
(gimplify_omp_depend): Use OMP_ITERATOR_DECL_P,
compute_omp_iterator_count and build_omp_iterator_loop.
* tree-inline.cc (copy_tree_body_r): Use OMP_ITERATOR_DECL_P.
* tree-pretty-print.cc (dump_omp_clause): Likewise.
* tree.h (OMP_ITERATOR_DECL_P): New macro.
Thomas Schwinge [Wed, 16 Apr 2025 19:52:53 +0000 (21:52 +0200)]
OpenACC: Improve implicit mapping for non-lexically nested offload regions: Adjust cherry-picked test cases
Adjust cherry-picked test cases per OG14 commit
b918a7e4b4bdf070bfa9ede48ef9d22f89ff7795
"OpenACC: Improve implicit mapping for non-lexically nested offload regions"
(in combination with OG14 commit 5fb2987d33c7296543fa7b8dbeab597fc552b110
"Clarify 'OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P' in 'gcc/tree-pretty-print.cc:dump_omp_clause'"
(cherry picked from trunk commit d6e66e7b3a40315ad303344e19bccb4006c51cac)).
libgomp/
* testsuite/libgomp.oacc-c++/exceptions-bad_cast-3.C: Adjust.
* testsuite/libgomp.oacc-c++/exceptions-throw-3.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-1.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-2.C: Likewise.
* testsuite/libgomp.oacc-c++/pr119692-1-3.C: Likewise.
Chung-Lin Tang [Fri, 11 Apr 2025 08:46:12 +0000 (08:46 +0000)]
OpenACC 2.7: update references to supported version to 2.7/201811.
2025-04-11 Chung-Lin Tang <cltang@baylibre.com>
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Updated _OPENACC to "201811"
for OpenACC 2.7.
gcc/ChangeLog
* doc/extend.texi: Adjust version references to 2.7 from 2.6.
gcc/fortran/ChangeLog:
* cpp.cc (cpp_define_builtins): Updated _OPENACC to "201811"
for OpenACC 2.7.
* intrinsic.texi (OpenACC Module OPENACC): Adjust version
references to 2.7 from 2.6.
gcc/testsuite/ChangeLog:
* c-c++-common/cpp/openacc-define-3.c: Adjust test.
* gfortran.dg/openacc-define-3.f90: Adjust test.
libgomp/ChangeLog:
* acc_prof.h (_ACC_PROF_INFO_VERSION): Adjust to 201811.
* libgomp.texi (Enabling OpenACC): Adjust version
references to 2.7 from 2.6.
* openacc.f90 (module openacc): Adjust openacc_version to 201811.
* openacc_lib.h (openacc_version): Adjust openacc_version to 201811.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-version-1.c:
Adjust test value to 201811.
* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Adjust
test value to 201811.
* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
Co-Authored-By: Sandra Loosemore <sloosemore@baylibre.com>
Chung-Lin Tang [Fri, 2 May 2025 00:33:07 +0000 (00:33 +0000)]
OpenACC: array reductions bug fixes
This is a merge of the v4 to v5 diff patch from:
https://gcc.gnu.org/pipermail/gcc-patches/2025-March/679682.html
This patch fixes issues found for NVPTX sm_70 testing, and another issue
related to copying to reduction buffer for worker/vector mode.
gcc/ChangeLog:
* config/gcn/gcn-tree.cc (gcn_goacc_reduction_setup): Fix array case
copy source into reduction buffer.
* config/nvptx/nvptx.cc (nvptx_expand_shared_addr): Move the place where
the default size is initialized.
(enum nvptx_builtins): Add NVPTX_BUILTIN_BAR_WARPSYNC.
(nvptx_init_builtins): Add DEF() of nvptx_builtin_bar_warpsync.
(nvptx_expand_builtin): Expand NVPTX_BUILTIN_BAR_WARPSYNC.
(nvptx_goacc_reduction_setup): Fix array case copy source into reduction
buffer.
(nvptx_goacc_reduction_fini): Add bar.warpsync at end of vector-mode
reductions for sm_70 and above.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c: Adjust test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c: Likewise.
Chung-Lin Tang [Fri, 2 May 2025 00:27:53 +0000 (00:27 +0000)]
OpenACC 2.7: Implement reductions for arrays and records
This patch is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675222.html
This patch implements reductions for arrays, array sections, and
struct/record types according to the OpenACC 2.7 specification.
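For illustration, an array-section reduction of the kind this patch targets could look like the sketch below. The function histogram and the constant NBINS are hypothetical and not part of the patch. Without OpenACC support the pragma is ignored and the loop runs sequentially.

```c
#include <assert.h>

#define NBINS 4   /* hypothetical bin count for this example */

/* Count how many of the N non-negative keys fall into each bin.  */
void
histogram (const int *keys, int n, int *hist)
{
  assert (n >= 0);

  for (int b = 0; b < NBINS; b++)
    hist[b] = 0;

  /* Each gang/worker/vector lane accumulates into a private copy of
     hist[0:NBINS]; the copies are combined element-wise with '+'.  */
  #pragma acc parallel loop copyin(keys[0:n]) reduction(+:hist[0:NBINS])
  for (int i = 0; i < n; i++)
    hist[keys[i] % NBINS] += 1;
}
```

Before array reductions were available, this pattern typically needed atomics or one scalar reduction variable per bin.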
2025-02-23 Chung-Lin Tang <cltang@baylibre.com>
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_variable_list): Adjust parsing of opening
square bracket.
(c_parser_omp_clause_reduction): Adjustments for
OpenACC-specific cases.
* c-typeck.cc (c_oacc_reduction_defined_type_p): New function.
(c_oacc_reduction_code_name): Likewise.
(c_finish_omp_clauses): Handle OpenACC cases using new functions.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_var_list_no_open): Adjust parsing of opening
square bracket.
(cp_parser_omp_clause_reduction): Adjustments for
OpenACC-specific cases.
* semantics.cc (cp_oacc_reduction_defined_type_p): New function.
(cp_oacc_reduction_code_name): Likewise.
(finish_omp_reduction_clause): Handle OpenACC cases using new
functions.
gcc/fortran/ChangeLog:
* openmp.cc (oacc_reduction_defined_type_p): New function.
(resolve_omp_clauses): Adjust OpenACC array reduction error case.
Adjust OMP_LIST_REDUCTION case. Use oacc_reduction_defined_type_p for
OpenACC.
* trans-openmp.cc (gfc_trans_omp_array_reduction_or_udr):
Add 'stmtblock_t *block', and 'bool openacc' parameters. Add array and
array section handling for openacc case. Adjust part of function to be
!openacc only.
(gfc_trans_omp_reduction_list):
Add 'stmtblock_t *block', and 'bool openacc' parameters, pass to calls
to gfc_trans_omp_array_reduction_or_udr.
(gfc_trans_omp_array_section): Adjust setting of OMP_CLAUSE_SIZE to only
OMP_CLAUSE_MAP clauses. Adjust calculations of decls and bias to use
temporary variables instead of tree expression inside clauses.
(gfc_trans_omp_clauses): Add 'block' and 'openacc' arguments to calls to
gfc_trans_omp_reduction_list.
(gfc_trans_omp_do): Pass 'op == EXEC_OACC_LOOP' as 'bool openacc'
parameter in call to gfc_trans_omp_clauses.
gcc/ChangeLog:
* config/gcn/gcn-tree.cc (#include "omp-offload.h"): Add include.
(#include "memmodel.h"): Add include.
(gcn_array_reduction_buffers): New vec<tree>
for holding DECLs for reduction buffer pointer variables.
(gcn_lockfull_update): Add pointer type fold_converts.
(gcn_reduction_update): Additions for handling ARRAY_TYPE, pointer to
ARRAY_TYPE, and RECORD_TYPE reductions.
(gcn_goacc_get_worker_red_decl): Adjust parameters to handle
non-constant offset case.
(gcn_goacc_get_worker_array_reduction_buffer): New function.
(gcn_create_if_else_seq): New function.
(gcn_create_do_while_loop_seq): New function.
(gcn_goacc_reduction_setup): Adjustments to handle arrays and records.
(gcn_goacc_reduction_init): Likewise.
(gcn_goacc_reduction_fini): Likewise.
(gcn_goacc_reduction_teardown): Likewise.
* config/nvptx/nvptx.cc (nvptx_gen_shuffle): Properly generate
V2SI shuffle using vec_extract op.
(nvptx_expand_shared_addr): Adjustments to handle non-constant size.
(nvptx_get_shared_red_addr): Adjust type/alignment calculations to
use TYPE_SIZE/ALIGN_UNIT instead of machine mode based.
(nvptx_get_shared_red_addr): New function with array_max_idx parameter.
(nvptx_reduction_update): Additions for handling ARRAY_TYPE, pointer to
ARRAY_TYPE, and RECORD_TYPE reductions.
(nvptx_goacc_reduction_setup): Likewise.
(nvptx_goacc_reduction_init): Likewise.
(nvptx_goacc_reduction_fini): Likewise.
(nvptx_goacc_reduction_teardown): Likewise.
* gimplify.cc (gimplify_scan_omp_clauses): Gimplify inside COMPONENT_REF
and convert codes for OMP_CLAUSE_REDUCTION cases. Add DECL_P check for
do_add/do_add_decl goto case.
(gimplify_adjust_omp_clauses): Avoid GOMP_MAP_POINTER OMP_CLAUSE_SIZE
handling for OpenACC kernels. Call omp_add_variable for ARRAY_REF index.
Peel away array MEM_REF for decl lookup.
* omp-low.cc (struct omp_context):
Add 'hash_map<tree, tree> *block_vars_map' field.
(omp_copy_decl_2): Create/lookup using ctx->block_vars_map first. Add
new copy into ctx->block_vars_map.
(install_var_field): Add 'bool field_may_exist = false' parameter.
Adjust lookup assertions.
(delete_omp_context): Add delete of ctx->block_vars_map.
(scan_sharing_clauses): Adjust calls to install_var_field. Adjust
ARRAY_REF pointer type building to use decl type, rather than generic
ptr_type_node. For ARRAY_REFs on offloaded constructs, also add base
expression as field lookup key.
(omp_reduction_init_op): Add ARRAY_TYPE and RECORD_TYPE init op
construction.
(oacc_array_reduction_bias): New function.
(lower_oacc_reductions): Add array reduction handling code. Arrays use
a different mode of IFN parameters, using additional 'array_addr' and
'array_max_idx' arguments. The LHS var is a simple integer for
dependency ordering.
(lower_omp_target): Adjust 'offload' condition for GOMP_MAP_POINTER
case. Generate BUILT_IN_ALLOCA_WITH_ALIGN to create private copy
for reductions of non-constant size types.
* omp-oacc-neuter-broadcast.cc (worker_single_copy):
Add 'hash_set<tree> *array_reduction_base_vars' parameter. Avoid
propagation for SSA_NAMEs used for array reduction accesses.
(neuter_worker_single): Add 'hash_set<tree> *array_reduction_base_vars'
parameter. Adjust recursive calls to self and worker_single_copy.
(oacc_do_neutering): Add 'hash_set<tree> *array_reduction_base_vars'
parameter. Adjust call to neuter_worker_single.
(execute_omp_oacc_neuter_broadcast): Add local
'hash_set<tree> array_reduction_base_vars' declaration. Collect MEM_REF
base-pointer SSA_NAMEs of arrays into array_reduction_base_vars. Add
'&array_reduction_base_vars' argument to call of oacc_do_neutering.
* omp-offload.cc (#include "cfghooks.h"): Add include.
(oacc_build_array_copy): New function.
(oacc_build_array_copy_loop): New function.
(oacc_build_indexed_ssa_loop): New function.
(default_goacc_reduction): Adjustments to handle arrays.
* omp-offload.h (oacc_build_array_copy): New declaration.
(oacc_build_array_copy_loop): New declaration.
(oacc_build_indexed_ssa_loop): New declaration.
* tree-loop-distribution.cc (generate_memset_builtin): Under OpenACC,
when last stmt of pre-header block is a UNIQUE(OACC_FORK) internal-fn,
split a new basic block to serve as place of insertion, otherwise
may fail later checking because UNIQUE(OACC_FORK) counts as control
flow stmt.
(generate_memcpy_builtin): Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-2.c: Adjust test.
* c-c++-common/goacc/reduction-9.c: Adjust test.
* c-c++-common/goacc/reduction-11.c: New test.
* c-c++-common/goacc/reduction-12.c: New test.
* c-c++-common/goacc/reduction-13.c: New test.
* c-c++-common/goacc/reduction-14.c: New test.
* c-c++-common/goacc/reduction-15.c: New test.
* c-c++-common/goacc/reduction-16.c: New test.
* g++.dg/goacc/reductions-1.C: Adjust test.
* gfortran.dg/goacc/array-reduction.f90: Adjust test.
* gfortran.dg/goacc/enter-exit-data-2.f90: Adjust test.
* gfortran.dg/goacc/finalize-1.f: Adjust test.
* gfortran.dg/goacc/kernels-decompose-1.f95: Adjust test.
* gfortran.dg/goacc/pr70828.f90: Adjust test.
* gfortran.dg/goacc/reduction.f95: Adjust test.
* gfortran.dg/gomp/target-enter-exit-data.f90: Adjust test.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/reduction.h
(check_reduction_array_xx): New macro.
(operator_apply): Likewise.
(check_reduction_array_op): Likewise.
(check_reduction_arraysec_op): Likewise.
(function_apply): Likewise.
(check_reduction_array_macro): Likewise.
(check_reduction_arraysec_macro): Likewise.
(check_reduction_xxx_xx_all): Likewise.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-arrays-5.c: New test.
* testsuite/libgomp.oacc-c-c++-common/reduction-structs-1.c: New test.
* testsuite/libgomp.oacc-fortran/reduction-10.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-11.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-12.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-13.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-14.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-15.f90: New test.
* testsuite/libgomp.oacc-fortran/reduction-16.f90: New test.
Sandra Loosemore [Sun, 9 Feb 2025 21:34:36 +0000 (21:34 +0000)]
OpenMP: C/C++ common testcases for "omp begin declare variant"
gcc/testsuite/ChangeLog
* c-c++-common/gomp/delim-declare-variant-1.c: New.
* c-c++-common/gomp/delim-declare-variant-2.c: New.
* c-c++-common/gomp/delim-declare-variant-3.c: New.
* c-c++-common/gomp/delim-declare-variant-4.c: New.
* c-c++-common/gomp/delim-declare-variant-5.c: New.
* c-c++-common/gomp/delim-declare-variant-6.c: New.
* c-c++-common/gomp/delim-declare-variant-7.c: New.
libgomp/ChangeLog
* testsuite/libgomp.c-c++-common/delim-declare-variant-1.c: New.
Sandra Loosemore [Sun, 9 Feb 2025 21:34:36 +0000 (21:34 +0000)]
OpenMP: C front end support for "begin declare variant"
gcc/c/ChangeLog
* c-decl.cc (current_omp_declare_variant_attribute): Define.
* c-lang.h (struct c_omp_declare_variant_attr): Declare.
(current_omp_declare_variant_attribute): Declare.
* c-parser.cc (c_parser_skip_to_pragma_omp_end_declare_variant): New.
(c_parser_translation_unit): Check for "omp begin declare variant"
with no matching "end".
(c_parser_declaration_or_fndef): Handle functions in "omp begin
declare variant" block.
(c_finish_omp_declare_variant): Merge context selectors with
surrounding "omp begin declare variant".
(JOIN_STR): Define.
(omp_start_variant_function): New.
(omp_finish_variant_function): New.
(c_parser_omp_begin): Handle "omp begin declare variant".
(c_parser_omp_end): Likewise.
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Sandra Loosemore [Sun, 9 Feb 2025 21:34:35 +0000 (21:34 +0000)]
OpenMP: C++ front end support for "begin declare variant"
This patch implements C++ support for the "begin declare variant"
construct. The OpenMP specification is hazy on interaction of this
feature with C++ language features. Variant functions in classes are
supported but must be defined as members in the class definition,
using an unqualified name for the base function which also must be
present in that class. Similarly variant functions in a namespace can
only be defined in that namespace using an unqualified name for a base
function already declared in that namespace. Variants for template
functions or inside template classes seem to (mostly) work.
gcc/cp/ChangeLog
* cp-tree.h (struct cp_omp_declare_variant_attr): New.
(struct saved_scope): Add omp_declare_variant_attribute field.
* decl.cc (omp_declare_variant_finalize_one): Add logic to inject
"this" parameter for method calls.
* parser.cc (cp_parser_skip_to_pragma_omp_end_declare_variant): New.
(omp_start_variant_function): New.
(omp_finish_variant_function): New.
(cp_parser_init_declarator): Handle variant functions.
(cp_parser_class_specifier): Handle deferred lookup of base functions
when the entire class has been seen.
(cp_parser_member_declaration): Handle variant functions.
(cp_finish_omp_declare_variant): Merge context selectors if in
a "begin declare variant" block.
(cp_parser_omp_begin): Match "omp begin declare variant". Adjust
error messages.
(cp_parser_omp_end): Match "omp end declare variant".
* parser.h (struct cp_parser): Add omp_unregistered_variants field.
* semantics.cc (finish_translation_unit): Detect unmatched
"omp begin declare variant".
gcc/testsuite/ChangeLog
* g++.dg/gomp/delim-declare-variant-1.C: New.
* g++.dg/gomp/delim-declare-variant-2.C: New.
* g++.dg/gomp/delim-declare-variant-3.C: New.
* g++.dg/gomp/delim-declare-variant-4.C: New.
* g++.dg/gomp/delim-declare-variant-5.C: New.
* g++.dg/gomp/delim-declare-variant-6.C: New.
* g++.dg/gomp/delim-declare-variant-7.C: New.
* g++.dg/gomp/delim-declare-variant-40.C: New.
* g++.dg/gomp/delim-declare-variant-41.C: New.
* g++.dg/gomp/delim-declare-variant-50.C: New.
* g++.dg/gomp/delim-declare-variant-51.C: New.
* g++.dg/gomp/delim-declare-variant-52.C: New.
* g++.dg/gomp/delim-declare-variant-70.C: New.
* g++.dg/gomp/delim-declare-variant-71.C: New.
libgomp/
* testsuite/libgomp.c++/delim-declare-variant-1.C: New.
* testsuite/libgomp.c++/delim-declare-variant-2.C: New.
* testsuite/libgomp.c++/delim-declare-variant-7.C: New.
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: waffl3x <waffl3x@baylibre.com>
Sandra Loosemore [Sun, 9 Feb 2025 21:34:35 +0000 (21:34 +0000)]
OpenMP: Add flag for code elision to omp_context_selector_matches.
The "begin declare variant" construct has different rules for determining
whether a context selector cannot match for purposes of code elision
than we normally use; it excludes the case of a constant false
"condition" selector for the "user" set.
gcc/ChangeLog
* omp-general.cc (omp_context_selector_matches): Add an optional
bool argument for the code elision case.
* omp-general.h (omp_context_selector_matches): Likewise.
Sandra Loosemore [Sun, 9 Feb 2025 21:34:35 +0000 (21:34 +0000)]
OpenMP: Support functions for nested "begin declare variant"
This patch adds functions for variant name mangling and context selector
merging that are shared by the C and C++ front ends.
The OpenMP specification says that name mangling is supposed to encode
the context selector for the variant, but it provides no way to
reference these functions directly by name or from a different
compilation unit. It also gives no guidance on how dynamic selectors
might be encoded across compilation units.
The GCC implementation of this feature instead treats variant
functions as if they have no linkage and uses a simple counter to
generate names.
gcc/ChangeLog
* omp-general.cc (omp_mangle_variant_name): New.
(omp_check_for_duplicate_variant): New.
(omp_copy_trait_set): New.
(omp_trait_selectors_equivalent): New.
(omp_combine_trait_sets): New.
(omp_merge_context_selectors): New.
* omp-general.h (omp_mangle_variant_name): Declare.
(omp_check_for_duplicate_variant): Declare.
(omp_merge_context_selectors): Declare.
Chung-Lin Tang [Mon, 11 Nov 2024 17:16:26 +0000 (17:16 +0000)]
OpenACC 2.7: Connect readonly modifier to points-to analysis
This patch links the readonly modifier to points-to analysis.
In front-ends, firstprivate pointer clauses are marked with
OMP_CLAUSE_MAP_POINTS_TO_READONLY set to true. Later, during lowering, the
receiver-side read of the pointer gets VAR_POINTS_TO_READONLY set to true, which
in turn causes SSA_NAME_POINTS_TO_READONLY_MEMORY to be set during SSA conversion.
SSA_NAME_POINTS_TO_READONLY_MEMORY is an existing flag used by the
alias oracle routines in tree-ssa-alias.cc, so the readonly modifier
becomes effective as a hint to points-to analysis.
Currently there is one testcase, c-c++-common/goacc/readonly-2.c, which
demonstrates that 'readonly' can avoid a clobber by a function call.
This patch is ported from upstream submission:
https://gcc.gnu.org/pipermail/gcc-patches/2024-April/648728.html
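As a rough sketch of where the hint can matter (this is not the readonly-2.c testcase, and scale_by_first is a made-up name): with 'readonly' on the copyin clause, the compiler may assume that stores through 'out' do not change a[0], so the load of a[0] could be hoisted out of the loop. Whether that optimization actually happens depends on the target and optimization level. Without OpenACC support the pragma is ignored.

```c
#include <assert.h>

/* Hypothetical example: scale every element of A by its first element.  */
void
scale_by_first (const float *a, float *out, int n)
{
  assert (n > 0);

  /* 'readonly' promises that a[0:n] is not written inside the region.  */
  #pragma acc parallel loop copyin(readonly: a[0:n]) copyout(out[0:n])
  for (int i = 0; i < n; i++)
    out[i] = a[0] * a[i];
}
```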
gcc/c-family/ChangeLog:
* c-omp.cc (c_omp_address_inspector::expand_array_base):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
(c_omp_address_inspector::expand_component_selector): Likewise.
gcc/fortran/ChangeLog:
* trans-openmp.cc (gfc_trans_omp_array_section):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
gcc/ChangeLog:
* gimple-expr.cc (copy_var_decl): Copy VAR_POINTS_TO_READONLY
for VAR_DECLs.
* omp-low.cc (lower_omp_target): Set VAR_POINTS_TO_READONLY for
variables of receiver refs.
* tree-pretty-print.cc (dump_omp_clause):
Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
(dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
* tree-ssanames.cc (make_ssa_name_fn): Set
SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set.
* tree.h (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.
(VAR_POINTS_TO_READONLY): New macro.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/readonly-1.c: Adjust testcase.
* c-c++-common/goacc/readonly-2.c: New testcase.
* gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
* gfortran.dg/pr67170.f90: Likewise.
Paul-Antoine Arras [Fri, 7 Jun 2024 18:29:40 +0000 (20:29 +0200)]
Fix strided `target update`
OG14 fixup for mainline commit 25072a477a5
"OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect"
libgomp/ChangeLog
* target.c (omp_target_memcpy_rect_worker): Require unit strides
and matching element size.
Tobias Burnus [Thu, 1 May 2025 15:39:42 +0000 (15:39 +0000)]
OpenMP: Restore lost Fortran testcase for 'omp allocate'
This testcase, which is present on the OG13 and OG14 branches, was
overlooked when the Fortran support for 'omp allocate' was added to
mainline (commit d4b6d147920b93297e621124a99ed01e7e310d92 from
December 2023).
libgomp/ChangeLog
* testsuite/libgomp.fortran/allocate-8a.f90: New test.
(cherry picked from commit 08ce1b9f6707e00089c4d77d2bb82963d531bb1d)
Sandra Loosemore [Sat, 10 May 2025 21:45:56 +0000 (21:45 +0000)]
OpenMP: testsuite fixups for C++ allocators
The patch "OpenMP: Add C++ support for 'omp allocate'" is a backport
of a mainline patch. These additional testsuite fixes are needed to
silence bogus failures on OG15 but may not be required or suitable for
upstream.
gcc/testsuite/
* c-c++-common/gomp/uses_allocators-1.c: Adjust for this testcase
no longer failing with "sorry" in C++.
* g++.dg/gomp/allocate-15.C: Disable scan-assembler tests since
compilation fails with "sorry" before getting that far.
* g++.dg/gomp/allocate-16.C: Likewise.
waffl3x [Fri, 10 Jan 2025 08:13:37 +0000 (01:13 -0700)]
OpenMP: Add C++ support for 'omp allocate'
This patch handles local variables, global variables, as well as static
local variables where it is currently practical to do so. For now we sorry
on static local variables inside implicit constexpr functions; this includes
lambdas in c++17 and up, and inline functions with -fimplicit-constexpr in
c++14 and up. I have another patch that fixes most cases; unfortunately,
there are a few cases that are not fixable without additional redesigns.
For function templates, instead of storing the directive directly in a
variable's 'omp allocate' attribute and substituting into it, this patch
adds the OMP_ALLOCATE tree code that gets substituted. This makes it much
easier to prevent duplicate diagnostics with an invalid allocator/align
clause. This is added to a function template's stmt list and is only used
for substitution. It is not added to instantiations of a function template,
nor to regular functions.
Location information is included in the 'omp allocate' attribute to enhance
diagnostics of variables used in multiple allocate directives. While it is
possible that this could be added to the c front end, it would require some
reworking so it is not included in this patch. It was easy to support this
in the c++ front end because it is not practical to wait for a finalized
directive to add the 'omp allocate' attribute to vars, nor is it practical
to remove it in error cases. Consequently, some extra handling for this
needed to be added to gimplify.cc to avoid problems in error cases and
prevent conflicts with Fortran specific implementation details.
There is a leftover band-aid fix in make_rtl_for_nonlocal_decl that only
worked for template cases; it probably has no effect at the moment.
The problem is that make_rtl_for_nonlocal_decl never defers static locals
in any kind of constexpr function, including lambdas in c++17, regardless of
whether they can be used in a constant expression. I have a lengthy write-up
of the history of why this is the case and the implications of it all,
but it is not directly relevant to this patch. In short, the original
reason static locals are not deferred was to fix PR70353, added in
r6-7642-ge0bffbbb5936be; however, in r9-3788-gddd0d18c9c0702 the handling
of that case was changed and no longer goes through
make_rtl_for_nonlocal_decl.
Unfortunately, we can't merely undo what was added, as c++23 static
constexpr local variables rely on it, as well as cases with c++17 lambdas
that should be disallowed, but aren't.
This should never be relevant, as OpenMP directives are not currently
allowed in constexpr functions, but as stated above the early processing of
static locals happens regardless of whether the function is actually usable
in a constant expression. In non-templates, this early processing occurs
before we have even parsed the allocate directive, causing alignment
specified in an align clause to be skipped over entirely. In templates we
at least get to add the attribute to mark a var before this happens, so we
can use the presence of it to make sure they get deferred. This is the
band-aid that is currently present in make_rtl_for_nonlocal_decl; however,
we currently reject these cases as it is fairly difficult to differentiate
whether we are in a regular function or not. We can't just rely on
processing_template_decl as it would just error upon instantiation. In
hindsight it would probably have worked fine in cp_parser_omp_allocate, but
this is supposed to be a temporary measure anyway as I have a follow-up
patch.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_allocate): Fix typo in diagnostic.
gcc/ChangeLog:
* cgraphunit.cc (varpool_node::finalize_decl): Add assert.
* gimplify.cc (gimplify_bind_expr): Handle C++ specific
implementation details.
gcc/cp/ChangeLog:
* constexpr.cc (potential_constant_expression_1): Handle
OMP_ALLOCATE.
* cp-tree.def (OMP_ALLOCATE): New tree code.
* cp-tree.h (OMP_ALLOCATE_LOCATION): Define.
(OMP_ALLOCATE_VARS): Define.
(OMP_ALLOCATE_ALLOCATOR): Define.
(OMP_ALLOCATE_ALIGN): Define.
(finish_omp_allocate): New function declaration.
* decl.cc (make_rtl_for_nonlocal_decl): Work around ICE with
implicit constexpr functions.
* parser.cc (cp_parser_omp_allocate): Use OMP_CLAUSE_ERROR,
add diagnostics for args, call finish_omp_allocate.
(cp_parser_omp_construct): Don't handle PRAGMA_OMP_ALLOCATE.
(cp_parser_pragma): Comment.
* pt.cc (tsubst_stmt): Handle OMP_ALLOCATE, call
finish_omp_allocate.
* semantics.cc (finish_omp_allocate): New function.
* typeck.cc (can_do_nrvo_p): Don't do NRVO for omp allocate vars.
libgomp/ChangeLog:
* libgomp.texi: Document C++ support.
* testsuite/libgomp.c/allocate-4.c: Move to...
* testsuite/libgomp.c-c++-common/allocate-4.c: ...here.
* testsuite/libgomp.c/allocate-5.c: Move to...
* testsuite/libgomp.c-c++-common/allocate-5.c: ...here.
* testsuite/libgomp.c/allocate-6.c: Move to...
* testsuite/libgomp.c-c++-common/allocate-6.c: ...here.
* testsuite/libgomp.c++/allocate-2.C: New test.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/allocate-allocator-handle.h: New header.
* c-c++-common/gomp/allocate-5.c: Remove dg-messages for 'sorry',
add dg-error for c++.
* c-c++-common/gomp/allocate-9.c: Include header, remove dg-messages
for 'sorry', add dg-notes for c++, minor refactoring.
* c-c++-common/gomp/allocate-10.c: Enable for c++.
* c-c++-common/gomp/allocate-11.c: Enable for c++, disable warning.
* c-c++-common/gomp/allocate-12.c: Enable for c++, add cases.
* c-c++-common/gomp/allocate-14.c: Enable for c++.
* c-c++-common/gomp/allocate-15.c: Enable for c++.
* c-c++-common/gomp/allocate-16.c: Enable for c++.
* c-c++-common/gomp/allocate-17.c: Remove dg-message for 'sorry'.
* c-c++-common/gomp/allocate-18.c: Include header, remove dg-message
for 'sorry'.
* c-c++-common/gomp/allocate-19.c: Remove xfails for c++, remove
dg-messages for 'sorry'.
* c-c++-common/gomp/allocate-20.c: New test.
* c-c++-common/gomp/directive-1.c: Remove dg-message for 'sorry'.
* g++.dg/gomp/allocate-allocator-handle.h: New header.
* g++.dg/gomp/allocate-5.C: New test.
* g++.dg/gomp/allocate-6.C: New test.
* g++.dg/gomp/allocate-7.C: New test.
* g++.dg/gomp/allocate-8.C: New test.
* g++.dg/gomp/allocate-9.C: New test.
* g++.dg/gomp/allocate-10.C: New test.
* g++.dg/gomp/allocate-11.C: New test.
* g++.dg/gomp/allocate-12.C: New test.
* g++.dg/gomp/allocate-13.C: New test.
* g++.dg/gomp/allocate-14.C: New test.
* g++.dg/gomp/allocate-15.C: New test.
* g++.dg/gomp/allocate-16.C: New test.
* g++.dg/gomp/allocate-17.C: New test.
* g++.dg/gomp/allocate-18.C: New test.
* g++.dg/gomp/allocate-19.C: New test.
* g++.dg/gomp/allocate-20.C: New test.
* g++.dg/gomp/allocate-21.C: New test.
Signed-off-by: waffl3x <waffl3x@baylibre.com>
Co-authored-by: Tobias Burnus <tobias@codesourcery.com>
Julian Brown [Wed, 13 Sep 2023 13:31:48 +0000 (13:31 +0000)]
OpenMP: Support accelerated 2D/3D memory copies for AMD GCN [OG14-only part]
This patch only adds the bits missing from mainline:
Support is also added for 1-dimensional strided accesses: these are
treated as a special case of 2-dimensional transfers, where the innermost
dimension is formed from the stride length (in bytes).
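The reduction of a 1D strided copy to a 2D copy can be sketched on the host as follows. copy_rect_2d is a hypothetical stand-in for whatever 2D copy primitive the device plugin offers, and both function names are made up for the example; the real omp_target_memcpy_rect_worker has a different interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy HEIGHT rows of WIDTH bytes each.  Consecutive rows start
   DST_PITCH (resp. SRC_PITCH) bytes apart.  Host-side stand-in for a
   2D copy primitive.  */
static void
copy_rect_2d (void *dst, size_t dst_pitch, const void *src, size_t src_pitch,
              size_t width, size_t height)
{
  assert (width <= dst_pitch && width <= src_pitch);
  for (size_t row = 0; row < height; row++)
    memcpy ((char *) dst + row * dst_pitch,
            (const char *) src + row * src_pitch, width);
}

/* Copy COUNT elements of ELEM_SIZE bytes, taking every STRIDE-th
   element, by describing the data as a rectangle of COUNT rows whose
   row pitch is the stride length in bytes and whose copied width is a
   single element.  */
static void
copy_strided_1d (void *dst, const void *src, size_t elem_size,
                 size_t stride, size_t count)
{
  size_t pitch = stride * elem_size;
  copy_rect_2d (dst, pitch, src, pitch, elem_size, count);
}
```

The point of the reformulation is that one 2D transfer request replaces COUNT separate element-sized copies.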
2023-09-19 Julian Brown <julian@codesourcery.com>
libgomp/
* target.c (omp_target_memcpy_rect_worker): Add 1D strided transfer
support.
Andrew Stubbs [Fri, 1 Sep 2023 15:49:58 +0000 (16:49 +0100)]
libgomp: parallel reverse offload
Extend OpenMP reverse offload support to allow running the host kernels
on multiple threads. The device plugin API for reverse offload is now made
non-blocking, meaning that running the host kernel in the wrong device
context is no longer a problem. The NVPTX message passing interface now
uses a ring buffer approximately matching GCN.
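For readers unfamiliar with the data structure, a generic single-producer, single-consumer ring buffer looks roughly like this. The struct layout, QUEUE_SIZE, and the request fields are hypothetical and do not reproduce struct rev_offload; the sketch also omits the atomics and memory fences that a real device-to-host queue needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 4  /* Hypothetical; keep it a power of two so the
                         free-running counters wrap around safely.  */

struct request
{
  unsigned long fn;    /* Illustrative payload fields.  */
  unsigned long args;
};

struct ring
{
  unsigned head;  /* Count of requests written so far (producer).  */
  unsigned tail;  /* Count of requests read so far (consumer).  */
  struct request slots[QUEUE_SIZE];
};

/* Enqueue REQ; return false without blocking if the ring is full.  */
static bool
ring_push (struct ring *r, struct request req)
{
  assert (r != NULL);
  if (r->head - r->tail == QUEUE_SIZE)
    return false;
  r->slots[r->head % QUEUE_SIZE] = req;
  r->head++;
  return true;
}

/* Dequeue the oldest request into *OUT; return false if empty.  */
static bool
ring_pop (struct ring *r, struct request *out)
{
  assert (r != NULL && out != NULL);
  if (r->head == r->tail)
    return false;
  *out = r->slots[r->tail % QUEUE_SIZE];
  r->tail++;
  return true;
}
```

Since neither operation blocks, several requests can be in flight at once, which is what allows host kernels to be picked up by multiple worker threads.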
libgomp/ChangeLog:
* config/gcn/target.c (GOMP_target_ext): Add "signal" field.
Fix atomics race condition.
* config/nvptx/libgomp-nvptx.h (REV_OFFLOAD_QUEUE_SIZE): New define.
(struct rev_offload): Implement ring buffer.
* config/nvptx/target.c (GOMP_target_ext): Likewise.
* env.c (initialize_env): Read GOMP_REVERSE_OFFLOAD_THREADS.
* libgomp-plugin.c (GOMP_PLUGIN_target_rev): Replace "aq" parameter
with "signal" and "use_aq".
* libgomp-plugin.h (GOMP_PLUGIN_target_rev): Likewise.
* libgomp.h (gomp_target_rev): Likewise.
* plugin/plugin-gcn.c (process_reverse_offload): Add "signal".
(console_output): Pass signal value through.
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_construct):
Attach new threads to the numbered device.
Change the flag to CU_STREAM_NON_BLOCKING.
(GOMP_OFFLOAD_run): Implement ring-buffer and remove signalling.
* target.c (gomp_target_rev): Rename to ...
(gomp_target_rev_internal): ... this, and change "dev_num" to
"devicep".
(gomp_target_rev_worker_thread): New function.
(gomp_target_rev): New function (old name).
* libgomp.texi: Document GOMP_REVERSE_OFFLOAD_THREADS.
* testsuite/libgomp.c/reverse-offload-threads-1.c: New test.
* testsuite/libgomp.c/reverse-offload-threads-2.c: New test.
Julian Brown [Sat, 26 Apr 2025 18:28:36 +0000 (18:28 +0000)]
OpenMP: Enable 'declare mapper' mappers for 'target update' directives
This patch enables use of 'declare mapper' for 'target update' directives,
for each of C, C++ and Fortran.
There are some implementation choices here and some
"read-between-the-lines" consequences regarding this functionality,
as follows:
* It is possible to invoke a mapper which contains clauses that
don't make sense for a given 'target update' operation. E.g. if a
mapper definition specifies a "from:" mapping and the user does "target
update to(...)" which triggers that mapper, the resulting map kind
(OpenMP 5.2, "Table 5.3: Map-Type Decay of Map Type Combinations")
is "alloc" (and for the inverse case "release"). For such cases,
an unconditional warning is issued and the map clause in question is
dropped from the mapper expansion. (Other choices might be to make
this an error, or to do the same thing but silently, or warn only
given some special option.)
* The array-shaping operator is *permitted* for map clauses within
'declare mapper' definitions. That is because such mappers may be used
for 'target update' directives, where the array-shaping operator is
permitted. I think that makes sense, depending on the semantic model
of how and when substitution is supposed to take place, but I couldn't
find such behaviour explicitly mentioned in the spec (as of 5.2).
If the mapper is triggered by a different directive ("omp target",
"omp target data", etc.), an error will be raised.
Support is also added for the "mapper" modifier on to/from clauses for
all three base languages.
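The map-type decay from the first bullet can be sketched as a small decision function. The enums and function names are hypothetical, not GCC's internal map kinds. Only the "from" under "update to" and "to" under "update from" cases are stated in this message; the handling of the other map types is my reading of the decay table by analogy and should be treated as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enums for illustration only.  */
enum map_type { MAP_ALLOC, MAP_TO, MAP_FROM, MAP_TOFROM, MAP_RELEASE };
enum update_dir { UPDATE_TO, UPDATE_FROM };

/* Decay the map type written in a mapper for a 'target update' in
   direction DIR.  A clause with no component in the requested direction
   decays to "alloc" (update to) or "release" (update from).  */
static enum map_type
decay_for_update (enum map_type mapper_type, enum update_dir dir)
{
  bool has_to = mapper_type == MAP_TO || mapper_type == MAP_TOFROM;
  bool has_from = mapper_type == MAP_FROM || mapper_type == MAP_TOFROM;
  assert (dir == UPDATE_TO || dir == UPDATE_FROM);
  if (dir == UPDATE_TO)
    return has_to ? MAP_TO : MAP_ALLOC;
  return has_from ? MAP_FROM : MAP_RELEASE;
}

/* "alloc" and "release" move no data, so such a clause is meaningless
   on 'target update'; the patch warns and drops it.  */
static bool
clause_is_dropped (enum map_type mapper_type, enum update_dir dir)
{
  enum map_type t = decay_for_update (mapper_type, dir);
  return t == MAP_ALLOC || t == MAP_RELEASE;
}
```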
2023-08-10 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (c_omp_region_type): Add C_ORT_UPDATE and C_ORT_OMP_UPDATE
codes.
* c-omp.cc (omp_basic_map_kind_name): New function.
(omp_instantiate_mapper): Add LOC parameter. Add 'target update'
support.
(c_omp_instantiate_mappers): Add 'target update' support.
gcc/c/
* c-parser.cc (c_parser_omp_variable_list): Support array-shaping
operator in 'declare mapper' definitions.
(c_parser_omp_clause_map): Pass C_ORT_OMP_DECLARE_MAPPER to
c_parser_omp_variable_list in mapper definitions.
(c_parser_omp_clause_from_to): Add parsing for mapper modifier.
(c_parser_omp_target_update): Instantiate mappers.
gcc/cp/
* parser.cc (cp_parser_omp_var_list_no_open): Support array-shaping
operator in 'declare mapper' definitions.
(cp_parser_omp_clause_from_to): Add parsing for mapper modifier.
(cp_parser_omp_clause_map): Pass C_ORT_OMP_DECLARE_MAPPER to
cp_parser_omp_var_list_no_open in mapper definitions.
(cp_parser_omp_target_update): Instantiate mappers.
gcc/fortran/
* openmp.cc (gfc_match_motion_var_list): Add parsing for mapper
modifier.
(gfc_match_omp_clauses): Adjust error handling for changes to
gfc_match_motion_var_list.
(gfc_omp_instantiate_mapper): Add code argument to get proper
location for diagnostic.
(gfc_omp_instantiate_mappers): Adjust for above change.
* trans-openmp.cc (gfc_trans_omp_clauses): Use correct ref for update
operations.
(gfc_trans_omp_target_update): Instantiate mappers.
gcc/testsuite/
* c-c++-common/gomp/declare-mapper-17.c: New test.
* c-c++-common/gomp/declare-mapper-19.c: New test.
* gfortran.dg/gomp/declare-mapper-24.f90: New test.
* gfortran.dg/gomp/declare-mapper-26.f90: Uncomment 'target update'
part of test.
* gfortran.dg/gomp/declare-mapper-27.f90: New test.
libgomp/
* testsuite/libgomp.c-c++-common/declare-mapper-18.c: New test.
* testsuite/libgomp.fortran/declare-mapper-25.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-28.f90: New test.
Co-Authored-By: Andrew Stubbs <ams@baylibre.com>
Co-Authored-By: Kwok Cheung Yeung <kcyeung@baylibre.com>
Co-Authored-By: Sandra Loosemore <sloosemore@baylibre.com>
Julian Brown [Thu, 24 Apr 2025 19:05:56 +0000 (19:05 +0000)]
OpenMP: Noncontiguous "target update" for Fortran
(Note: On OG14 branch, this was tacked on to "OpenMP: Array shaping
operator and strided "target update" for C", losing its original
commit message from OG13; I've restored it as a separate patch for
OG15, and merged with "Strided/rectangular 'target update'
out-of-bounds array lookup fix" and the Fortran part of "Dimension
ordering for array-shaping operator for C and C++".)
This patch implements noncontiguous "target update" for Fortran.
The existing middle end/runtime bits relating to C++ support are reused,
with some small adjustments, e.g.:
1. The node used to map the OMP "array descriptor" (from omp-low.cc
onwards) now uses the OMP_CLAUSE_SIZE field as a bias (the difference
between the "virtual origin" element with zero indices in each
dimension and the first element actually stored in memory).
2. The OMP_CLAUSE_SIZE field of a GOMP_MAP_DIM_STRIDE node may now be
used to store a "span", which is the distance in bytes between
two adjacent elements in an array (with unit stride) when that is
different from the element size, as it can be in Fortran.
The implementation goes to some effort to massage Fortran array metadata
(array descriptors) into a form that can ultimately be consumed by
omp_target_memcpy_rect_worker. The method for doing this is described
in comments in the patch body.
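To make "span" and "bias" concrete, here is a small host-side sketch in C. The function names are made up, and the sign convention for the bias (virtual origin lying before the first stored element) is an assumption for the example. The array of structs stands in for a Fortran section such as a component of a derived-type array, where adjacent elements are further apart than their own size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte offset, measured from the first element actually stored, of the
   element with index IDX in a 1-D array whose elements are SPAN bytes
   apart and whose virtual origin (index 0) lies BIAS bytes before the
   first stored element.  */
static size_t
offset_from_first (size_t idx, size_t span, size_t bias)
{
  assert (idx * span >= bias);
  return idx * span - bias;
}

/* Pack COUNT elements of ELEM_SIZE bytes, with indices starting at
   LBOUND, into the contiguous buffer DST.  FIRST points at the first
   stored element.  */
static void
gather_with_span (void *dst, const void *first, size_t lbound, size_t count,
                  size_t elem_size, size_t span)
{
  assert (span >= elem_size);
  size_t bias = lbound * span;
  for (size_t i = 0; i < count; i++)
    {
      size_t off = offset_from_first (lbound + i, span, bias);
      memcpy ((char *) dst + i * elem_size,
              (const char *) first + off, elem_size);
    }
}
```

With span equal to elem_size and lbound equal to zero, this degenerates to an ordinary contiguous copy.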
2023-07-03 Julian Brown <julian@codesourcery.com>
gcc/ChangeLog
* gimplify.cc (gimplify_adjust_omp_clauses): Don't gimplify
VIEW_CONVERT_EXPR away in GOMP_MAP_TO_GRID/GOMP_MAP_FROM_GRID clauses.
* omp-low.cc (omp_noncontig_descriptor_type): Add SPAN field.
(scan_sharing_clauses): Don't store descriptor size in its
OMP_CLAUSE_SIZE field.
(lower_omp_target): Add missing OMP_CLAUSE_MAP check. Add special-case
string handling. Handle span and bias. Use low bound instead of zero
as index for trailing full dimensions.
gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_omp_deep_map_kind_p): Handle
GOMP_MAP_{TO,FROM}_GRID, GOMP_MAP_GRID_{DIM,STRIDE}.
(gfc_trans_omp_arrayshape_type, gfc_omp_calculate_gcd,
gfc_desc_to_omp_noncontig_array, gfc_omp_contiguous_update_p): New
functions.
(gfc_trans_omp_clauses): Handle noncontiguous to/from clauses for OMP
"target update" directives.
gcc/testsuite/ChangeLog
* gfortran.dg/gomp/noncontig-updates-1.f90: New test.
* gfortran.dg/gomp/noncontig-updates-2.f90: New test.
* gfortran.dg/gomp/noncontig-updates-3.f90: New test.
* gfortran.dg/gomp/noncontig-updates-4.f90: New test.
libgomp/ChangeLog
* libgomp.h (omp_noncontig_array_desc): Add span field.
* target.c (omp_target_memcpy_rect_worker): Add span parameter. Update
forward declaration. Handle span != element_size.
(gomp_update): Handle bias in descriptor's size slot. Update calls to
omp_target_memcpy_rect_worker.
* testsuite/libgomp.fortran/noncontig-updates-1.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-2.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-3.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-4.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-5.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-6.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-7.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-8.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-9.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-10.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-11.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-12.f90: New test.
* testsuite/libgomp.fortran/noncontig-updates-13.f90: New test.
Co-Authored-By: Sandra Loosemore <sloosemore@baylibre.com>
Julian Brown [Thu, 24 Apr 2025 19:03:16 +0000 (19:03 +0000)]
OpenMP: Array shaping operator and strided "target update" for C
Following the similar support for C++, here is the C implementation for
the OpenMP 5.0 array-shaping operator, and for strided and rectangular
updates for "target update".
Much of the implementation is shared with the C++ support added by the
previous patch. Some details of parsing necessarily differ for C,
but the general ideas are the same.
This version of the patch has been rebased and contains a couple of
minor fixes relative to versions posted previously.
2023-09-05 Julian Brown <julian@codesourcery.com>
gcc/c/
* c-parser.cc (c_parser_braced_init): Disallow array-shaping operator
in braced init.
(c_parser_conditional_expression): Disallow array-shaping operator in
conditional expression.
(c_parser_cast_expression): Add array-shaping operator support.
(c_parser_postfix_expression): Disallow array-shaping operator in
statement expressions.
(c_parser_postfix_expression_after_primary): Add OpenMP array section
stride support.
(c_parser_expr_list): Disallow array-shaping operator in expression
lists.
(c_array_type_nelts_total): New function.
(c_parser_omp_variable_list): Support array-shaping operator.
(c_parser_omp_target_update): Recognize GOMP_MAP_TO_GRID and
GOMP_MAP_FROM_GRID map kinds as well as OMP_CLAUSE_TO/OMP_CLAUSE_FROM.
* c-tree.h (c_omp_array_shaping_op_p, c_omp_has_array_shape_p): New
extern declarations.
(create_omp_arrayshape_type): Add prototype.
* c-typeck.cc (c_omp_array_shaping_op_p, c_omp_has_array_shape_p): New
globals.
(build_omp_array_section): Permit integral types, not just integer
constants, when creating array types for array sections.
(create_omp_arrayshape_type): New function.
(handle_omp_array_sections_1): Add DISCONTIGUOUS parameter. Add
strided/rectangular array section support.
(omp_array_section_low_bound): New function.
(handle_omp_array_sections): Add DISCONTIGUOUS parameter. Add
strided/rectangular array section support.
(c_finish_omp_clauses): Update calls to handle_omp_array_sections.
Handle discontiguous updates.
gcc/testsuite/
* gcc.dg/gomp/bad-array-shaping-c-1.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-2.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-3.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-4.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-5.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-6.c: New test.
* gcc.dg/gomp/bad-array-shaping-c-7.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/array-shaping-14.c: New test.
* testsuite/libgomp.c/array-shaping-1.c: New test.
* testsuite/libgomp.c/array-shaping-2.c: New test.
* testsuite/libgomp.c/array-shaping-3.c: New test.
* testsuite/libgomp.c/array-shaping-4.c: New test.
* testsuite/libgomp.c/array-shaping-5.c: New test.
* testsuite/libgomp.c/array-shaping-6.c: New test.
Julian Brown [Thu, 24 Apr 2025 15:32:14 +0000 (15:32 +0000)]
OpenMP: Support strided and shaped-array updates for C++
This patch adds support for OpenMP 5.0 strided updates and the
array-shaping operator ("([x][y][z]) foo[0:n]..."). This is mostly for
C++ only so far, though necessary changes have been made to the C FE to
adjust for changes to shared data structures.
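The effect of the array-shaping operator can be imitated in plain C with a cast to a pointer to a variable-length array. The sketch below is only an analogue for readers unfamiliar with the operator, not OpenMP syntax; the function name and parameters are made up. Sections use the OpenMP C/C++ order [lower-bound : length : stride].

```c
#include <assert.h>
#include <stddef.h>

/* Sum the elements selected by the section [r0:nr:rs][c0:nc:cs] of the
   flat buffer PTR viewed as a ROWS x COLS array of int.  */
static long
sum_shaped_section (int *ptr, size_t rows, size_t cols,
                    size_t r0, size_t nr, size_t rs,
                    size_t c0, size_t nc, size_t cs)
{
  assert (cols > 0);
  /* Plain-C counterpart of "([rows][cols]) ptr": reinterpret the flat
     pointer as a pointer to rows of COLS ints.  */
  int (*shaped)[cols] = (int (*)[cols]) ptr;
  long sum = 0;
  for (size_t i = 0; i < nr; i++)
    for (size_t j = 0; j < nc; j++)
      {
        size_t r = r0 + i * rs;
        size_t c = c0 + j * cs;
        assert (r < rows && c < cols);  /* Bounds check on the shape.  */
        sum += shaped[r][c];
      }
  return sum;
}
```

For a 4 x 6 buffer holding 0..23, the section [0:2:2][1:3:2] selects rows 0 and 2 and columns 1, 3 and 5.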
In terms of the implementation of various bits:
- The OMP_ARRAY_SECTION tree code has been extended to take a 'stride'
argument, and changes have been made throughout semantics.cc, etc. to
take the new field into account -- including bounds checking.
- A new type of cast operator has been added to represent the OpenMP
array-shaping operator: OMP_ARRAYSHAPE_CAST_EXPR (1).
- The address tokenization mechanism from previous patches has been
extended with two new access kinds to represent noncontiguous array
updates.
- New mapping kinds have been added to represent noncontiguous updates:
those which may be subject to array shaping, or have non-unit strides.
These are processed by omp-low.cc into a kind of descriptor that is
passed to the libgomp runtime (2).
The current patch reuses an extended version of the helper code for
omp_target_memcpy_rect, which may generate very many small host-device or
device-host copies. (The "descriptor" has also been designed so reusing
that functionality is relatively straightforward.) Optimising those
multiple copies, e.g. by packing them into a single transfer when it
would be beneficial, is left as the subject of a future patch.
This patch has some adjustments to the omp-low.cc code after Chung-Lin's
patch "OpenMP 5.0: Allow multiple clauses mapping same variable"
(325f085897efca59879a64704ab15f1763ecb807), relative to the version last
posted for mainline.
Notes:
(1) In a bit more detail: the array-shaping operator has the same
precedence as a C-style cast, but applies to the whole expression,
including array-section specifiers. We parse it initially as if it
applies to the "value" of the whole expression:
([x][y]) ptr[0:10:2][1:5:2]
i.e., something like:
([x][y]) (ptr[0:10:2][1:5:2])
or as if the cast applies to the innermost/right-hand side array
section. Then, a little later in parsing (cp_parser_omp_var_list_no_open),
we rewrite it to apply to the inner pointer instead:
(([x][y]) ptr)[0:10:2][1:5:2]
and that means a genuine multi-dimensional array or an array-shaped
pointer can be handled pretty much the same for the rest of
compilation. We use VIEW_CONVERT_EXPR for the "cast", unless we're
processing a template definition, where we use a new tree code instead.
(2) The new map kinds work like this. An update directive starts
out with OMP_CLAUSE_TO or OMP_CLAUSE_FROM clauses representing the
block in question and the direction of the needed transfer. If we
detect a noncontiguous update, we emit a list of mapping nodes (type
OMP_CLAUSE_MAP, with new kinds, so the "mapping group" machinery in
gimplify.cc can be reused):
OMP_CLAUSE_TO -->
GOMP_MAP_TO_GRID (VIEW_CONVERT_EXPR<int[x][y]>(ptr) [len: <element-size>])
GOMP_MAP_GRID_DIM 0 [len: 10] (i.e. [0:10:2])
GOMP_MAP_GRID_STRIDE 2
GOMP_MAP_GRID_DIM 1 [len: 5] (i.e. [1:5:2])
GOMP_MAP_GRID_STRIDE 2
During omp-low.cc, this sequence is reformulated into:
GOMP_MAP_TO_GRID (ptr) [len: <whole array size>]
GOMP_MAP_TO_PSET (&ptr_desc [len: <desc size>])
"ptr_desc" is a struct, stored statically or constructed on the (host)
stack, containing arrays representing the size of the whole array, the
rectangular subregion to transfer, and the stride with which to walk
over elements in each dimension.
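A descriptor of this general shape, and a walker over it, might look like the following sketch. The struct and its field names are hypothetical and should not be read as the layout of omp_noncontig_array_desc; the copy is host-to-host with memcpy standing in for a device transfer. It also shows why the simple approach can issue very many small copies: one per selected element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DIMS 3  /* Arbitrary limit for the sketch.  */

/* Hypothetical descriptor; field names are illustrative only.  */
struct grid_desc
{
  size_t ndims;
  size_t elem_size;
  size_t dim[MAX_DIMS];     /* Extent of the whole array, per dimension.  */
  size_t index[MAX_DIMS];   /* First index of the subregion.  */
  size_t length[MAX_DIMS];  /* Number of elements to transfer.  */
  size_t stride[MAX_DIMS];  /* Step between transferred elements.  */
};

/* Copy the selected elements between two row-major buffers of identical
   shape.  Call with DIM == 0 and BASE == 0; BASE accumulates the
   linearized element index.  Returns the number of copies issued.  */
static size_t
copy_grid (char *dst, const char *src, const struct grid_desc *d,
           size_t dim, size_t base)
{
  if (dim == d->ndims)
    {
      memcpy (dst + base * d->elem_size, src + base * d->elem_size,
              d->elem_size);
      return 1;
    }
  size_t copies = 0;
  for (size_t i = 0; i < d->length[dim]; i++)
    {
      size_t idx = d->index[dim] + i * d->stride[dim];
      assert (idx < d->dim[dim]);
      copies += copy_grid (dst, src, d, dim + 1, base * d->dim[dim] + idx);
    }
  return copies;
}
```

For the example "([x][y]) ptr[0:10:2][1:5:2]" with x = 20 and y = 10, the walker issues 10 * 5 = 50 element-sized copies.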
2023-07-03 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-omp.cc (c_omp_address_inspector::map_supported_p): Support
VIEW_CONVERT_EXPR and ADDR_EXPR codes.
(omp_expand_grid_dim): New function.
(omp_handle_noncontig_array): New function.
(c_omp_address_inspector::expand_array_base): Support noncontiguous
array updates.
(c_omp_address_inspector::expand_component_selector): Support
noncontiguous array updates.
* c-pretty-print.cc (c_pretty_printer::postfix_expression): Add
OMP_ARRAY_SECTION stride support.
gcc/c/
* c-parser.cc (c_parser_postfix_expression_after_primary): Dummy stride
support (for now).
(struct omp_dim): Add stride support.
(c_parser_omp_variable_list): Likewise.
* c-tree.h (build_omp_array_section): Update prototype.
* c-typeck.cc (mark_exp_read): Add stride support for
OMP_ARRAY_SECTION.
(build_omp_array_section): Add stride support.
(handle_omp_array_sections_1): Add minimal stride support.
gcc/cp/
* cp-objcp-common.cc (cp_common_init_ts): Add array-shape cast
support.
* cp-tree.def (OMP_ARRAYSHAPE_CAST_EXPR): Add tree code.
* cp-tree.h (DECLTYPE_FOR_OMP_ARRAYSHAPE_CAST): Add flag.
(cp_omp_create_arrayshape_type, cp_build_omp_arrayshape_cast): Add
prototypes.
(grok_omp_array_section, build_omp_array_section): Add stride
parameters.
* decl.cc (create_anon_array_type): New function.
(cp_omp_create_arrayshape_type): New function.
* decl2.cc (grok_omp_array_section): Add stride parameter.
(min_vis_expr_r): Add OMP_ARRAYSHAPE_CAST_EXPR support.
* error.cc (dump_expr): Add stride support for OMP_ARRAY_SECTION.
* mangle.cc (write_expression): Add OMP_ARRAYSHAPE_CAST_EXPR support.
* operators.def (OMP_ARRAYSHAPE_CAST_EXPR): Add.
* parser.cc (cp_parser_new): Initialise omp_array_shaping_op_p and
omp_has_array_shape_p fields.
(cp_parser_statement_expr): Don't allow array shaping op in statement
exprs.
(cp_parser_postfix_open_square_expression): Add stride parsing for
array sections. Use array section code to represent array refs if we
have an array-shaping operator.
(cp_parser_parenthesized_expression_list): Don't allow array-shaping
op here.
(cp_parser_cast_expression): Add array-shaping operator parsing.
(cp_parser_lambda_expression): Don't allow array-shaping op in lambda
body.
(cp_parser_braced_list): Don't allow array-shaping op in braced list.
(struct omp_dim): Add stride field.
(cp_parser_omp_var_list_no_open): Add stride/array shape support.
(cp_parser_omp_target_update): Handle noncontiguous updates.
* parser.h (cp_parser): Add omp_array_shaping_op_p and
omp_has_array_shape_p fields.
* pt.cc (tsubst): Add array-shape cast support.
(tsubst_copy, tsubst_copy_and_build): Likewise. Add stride support for
OMP_ARRAY_SECTION.
(tsubst_omp_clause_decl): Add stride support for OMP_ARRAY_SECTION.
* semantics.cc (handle_omp_array_sections_1): Add DISCONTIGUOUS
parameter and stride support.
(omp_array_section_low_bound): New function.
(handle_omp_array_sections): Add DISCONTIGUOUS parameter and stride
support.
(finish_omp_clauses): Update calls to handle_omp_array_sections, and
add noncontiguous array update support.
(cp_build_omp_arrayshape_cast): New function.
* typeck.cc (structural_comptypes): Add array-shape cast support.
(build_omp_array_section): Add stride parameter.
(check_for_casting_away_constness): Add OMP_ARRAYSHAPE_CAST_EXPR
support.
gcc/
* gimplify.cc (omp_group_last, omp_group_base): Add GOMP_MAP_TO_GRID,
GOMP_MAP_FROM_GRID support.
(gimplify_adjust_omp_clauses): Support new GOMP_MAP_GRID_DIM,
GOMP_MAP_GRID_STRIDE mapping nodes. Don't crash on e.g. misuse of
ADDR_EXPR in mapping clauses.
* omp-general.cc (omp_parse_noncontiguous_array): New function.
(omp_parse_access_method): Add noncontiguous array support.
(omp_parse_structure_base): Add array-shaping support.
(debug_omp_tokenized_addr): Add ACCESS_NONCONTIG_ARRAY,
ACCESS_NONCONTIG_REF_TO_ARRAY token support.
* omp-general.h (access_method_kinds): Add ACCESS_NONCONTIG_ARRAY and
ACCESS_NONCONTIG_REF_TO_ARRAY access kinds.
* omp-low.cc (omp_noncontig_descriptor_type): New function.
(scan_sharing_clauses): Support noncontiguous array updates.
(lower_omp_target): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Add GOMP_MAP_TO_GRID,
GOMP_MAP_FROM_GRID, GOMP_MAP_GRID_DIM, GOMP_MAP_GRID_STRIDE map kinds.
(dump_generic_node): Add stride support for OMP_ARRAY_SECTION.
* tree.def (OMP_ARRAY_SECTION): Add stride argument.
include/
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_TO_GRID,
GOMP_MAP_FROM_GRID, GOMP_MAP_GRID_DIM, GOMP_MAP_GRID_STRIDE map kinds.
gcc/testsuite/
* g++.dg/gomp/array-shaping-1.C: New test.
* g++.dg/gomp/array-shaping-2.C: New test.
* g++.dg/gomp/bad-array-shaping-1.C: New test.
* g++.dg/gomp/bad-array-shaping-2.C: New test.
* g++.dg/gomp/bad-array-shaping-3.C: New test.
* g++.dg/gomp/bad-array-shaping-4.C: New test.
* g++.dg/gomp/bad-array-shaping-5.C: New test.
* g++.dg/gomp/bad-array-shaping-6.C: New test.
* g++.dg/gomp/bad-array-shaping-7.C: New test.
* g++.dg/gomp/bad-array-shaping-8.C: New test.
libgomp/
* libgomp.h (omp_noncontig_array_desc): New struct.
* target.c (omp_target_memcpy_rect_worker): Add stride array
parameter. Forward declare. Add STRIDES parameter and strided
update support.
(gomp_update): Add noncontiguous (strided/shaped) update support.
* testsuite/libgomp.c++/array-shaping-1.C: New test.
* testsuite/libgomp.c++/array-shaping-2.C: New test.
* testsuite/libgomp.c++/array-shaping-3.C: New test.
* testsuite/libgomp.c++/array-shaping-4.C: New test.
* testsuite/libgomp.c++/array-shaping-5.C: New test.
* testsuite/libgomp.c++/array-shaping-6.C: New test.
* testsuite/libgomp.c++/array-shaping-7.C: New test.
* testsuite/libgomp.c++/array-shaping-8.C: New test.
* testsuite/libgomp.c++/array-shaping-9.C: New test.
* testsuite/libgomp.c++/array-shaping-10.C: New test.
* testsuite/libgomp.c++/array-shaping-11.C: New test.
* testsuite/libgomp.c++/array-shaping-12.C: New test.
* testsuite/libgomp.c++/array-shaping-13.C: New test.
Julian Brown [Tue, 30 Apr 2024 17:21:22 +0000 (19:21 +0200)]
OpenMP: Allow complete replacement of clause during map/to/from expansion
At present, map/to/from clauses on OpenMP "target" directives may be
expanded into several mapping nodes if they describe array sections with
pointer or reference bases, or similar. This patch allows the original
clause to be replaced during that expansion, mostly by passing the list
pointer to the node to various functions rather than the node itself.
This is needed by the following patch. There shouldn't be any functional
changes introduced by this patch itself.
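The reason for passing the list pointer rather than the node is the usual pointer-to-pointer idiom for singly linked lists, sketched here with a made-up clause struct (GCC's clauses are trees chained with OMP_CLAUSE_CHAIN, not this type). With only the node in hand, a callee can append after it but cannot substitute something else in its place.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative clause node; not GCC's tree representation.  */
struct clause
{
  int kind;
  struct clause *next;
};

/* Replace the clause that *LINK points to with the chain FIRST..LAST.
   LINK is the address of whichever pointer currently refers to the
   clause: the list head or the previous clause's 'next' field.
   Returns the link at which processing should continue.  */
static struct clause **
replace_clause (struct clause **link, struct clause *first,
                struct clause *last)
{
  assert (link && *link && first && last);
  last->next = (*link)->next;  /* Splice the tail of the old list on.  */
  *link = first;               /* Unhook the original clause.  */
  return &last->next;
}
```

The returned link plays a role similar to the PNEXT return parameter mentioned in the ChangeLog: the caller resumes after the newly inserted nodes.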
2023-09-05 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (expand_array_base, expand_component_selector,
expand_map_clause): Adjust member declarations.
* c-omp.cc (omp_expand_access_chain): Pass and return pointer to
clause.
(c_omp_address_inspector::expand_array_base): Likewise.
(c_omp_address_inspector::expand_component_selector): Likewise.
(c_omp_address_inspector::expand_map_clause): Likewise.
gcc/c/
* c-typeck.cc (handle_omp_array_sections): Pass pointer to clause to
process instead of clause.
(c_finish_omp_clauses): Update calls to handle_omp_array_sections.
Handle cases where initial clause might be replaced.
gcc/cp/
* semantics.cc (handle_omp_array_sections): Pass pointer to clause
instead of clause. Add PNEXT return parameter for next clause in list
to process.
(finish_omp_clauses): Update calls to handle_omp_array_sections.
Handle cases where initial clause might be replaced.
Julian Brown [Wed, 23 Apr 2025 00:01:43 +0000 (00:01 +0000)]
OpenMP: Look up 'declare mapper' definitions at resolution time not parse time
This patch moves 'declare mapper' lookup for OpenMP clauses from parse
time to resolution time for Fortran, and adds diagnostics for missing
named mappers. This changes clause lookup in a particular case -- where
several 'declare mapper's are defined in a context, mappers declared
earlier may now instantiate mappers declared later, whereas previously
they would not. I think the new behaviour makes more sense -- at an
invocation site, all mappers are visible no matter the declaration order
in some particular block. I've adjusted tests to account for this.
I think the new arrangement better matches the Fortran FE's usual way of
doing things -- mapper lookup is a semantic concept, not a syntactical
one, so shouldn't be handled in the syntax-handling code.
The patch also fixes a case where the user explicitly writes 'default'
as the name on the mapper modifier for a clause.
2023-08-10 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* gfortran.h (gfc_omp_namelist_udm): Add MAPPER_ID field to store the
mapper name to use for lookup during resolution.
* match.cc (gfc_free_omp_namelist): Handle OMP_LIST_TO and
OMP_LIST_FROM when freeing mapper references.
* module.cc (load_omp_udms, write_omp_udm): Handle MAPPER_ID field.
* openmp.cc (gfc_match_omp_clauses): Handle explicitly-specified
'default' name. Don't do mapper lookup here, but record mapper name if
the user specifies one.
(resolve_omp_clauses): Do mapper lookup here instead. Report error for
missing named mapper.
gcc/testsuite/
* gfortran.dg/gomp/declare-mapper-31.f90: New test.
libgomp/
* testsuite/libgomp.fortran/declare-mapper-30.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-4.f90: Adjust test for new
lookup behaviour.
Sandra Loosemore [Mon, 12 May 2025 16:41:45 +0000 (16:41 +0000)]
OpenMP, Fortran: Handle errors in 'declare mapper' instantiation
The patch "OpenMP: Reprocess expanded clauses after 'declare mapper'
instantiation" added further error-checking to
gfc_omp_instantiate_mappers, which is called during the translation
phase of processing, but these errors were effectively ignored for
further processing of the code. This patch makes the translation
phase insert an error_mark_node in these situations instead of
generating normal tree code.
This patch fixes an ICE in gfortran.dg/gomp/declare-mapper-29.f90
where it was attempting to gimplify code that had already been
diagnosed as invalid in the front end.
gcc/fortran/ChangeLog
* gfortran.h (gfc_omp_instantiate_mappers): Adjust declaration
to return an error status instead of void.
* openmp.cc (gfc_omp_instantiate_mappers): Likewise for the
definition.
* trans-openmp.cc (gfc_trans_omp_target): Check return status of
call to gfc_omp_instantiate_mappers and insert an error_mark_node
on failure instead of continuing normal processing of the construct.
(gfc_trans_omp_target_data): Likewise.
(gfc_trans_omp_target_enter_data): Likewise.
(gfc_trans_omp_target_exit_data): Likewise.
Julian Brown [Thu, 24 Apr 2025 14:31:21 +0000 (14:31 +0000)]
OpenMP: Reprocess expanded clauses after 'declare mapper' instantiation
This patch reprocesses expanded clauses after 'declare mapper'
instantiation -- checking things such as duplicated clauses, illegal
use of strided accesses, and so forth. Two functions are broken out
of the 'resolve_omp_clauses' function and reused in a new function
'resolve_omp_mapper_clauses', called after mapper instantiation.
This improves diagnostic output.
2023-08-10 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* gfortran.h (gfc_omp_clauses): Add NS field.
* openmp.cc (verify_omp_clauses_symbol_dups,
omp_verify_map_motion_clauses): New functions, broken out of...
(resolve_omp_clauses): Here. Record namespace containing clauses.
Call above functions.
(resolve_omp_mapper_clauses): New function, using helper functions
broken out above.
(gfc_resolve_omp_directive): Add NS parameter to resolve_omp_clauses
calls.
(gfc_omp_instantiate_mappers): Call resolve_omp_mapper_clauses if we
instantiate any mappers.
gcc/testsuite/
* gfortran.dg/gomp/declare-mapper-26.f90: New test.
* gfortran.dg/gomp/declare-mapper-29.f90: New test.
Julian Brown [Tue, 22 Apr 2025 04:29:50 +0000 (04:29 +0000)]
OpenMP: Move Fortran 'declare mapper' instantiation code
This patch moves the code for explicit 'declare mapper' directive
instantiation in the Fortran front-end to openmp.cc from trans-openmp.cc.
The transformation takes place entirely in the front end's own
representation and doesn't involve middle-end trees at all. Also, having
the code in openmp.cc is more convenient for the following patch that
introduces the 'resolve_omp_mapper_clauses' function.
2023-08-10 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* gfortran.h (toc_directive): Move here.
(gfc_omp_instantiate_mappers, gfc_get_location): Add prototypes.
* openmp.cc (omp_split_map_op, omp_join_map_op, omp_map_decayed_kind,
omp_basic_map_kind_name, gfc_subst_replace, gfc_subst_prepend_ref,
gfc_subst_in_expr_1, gfc_subst_in_expr, gfc_subst_mapper_var): Move
here.
(gfc_omp_instantiate_mapper, gfc_omp_instantiate_mappers): Move here
and rename.
* trans-openmp.cc (toc_directive, omp_split_map_op, omp_join_map_op,
omp_map_decayed_kind, gfc_subst_replace, gfc_subst_prepend_ref,
gfc_subst_in_expr_1, gfc_subst_in_expr, gfc_subst_mapper_var,
gfc_trans_omp_instantiate_mapper, gfc_trans_omp_instantiate_mappers):
Remove from here.
(gfc_trans_omp_target, gfc_trans_omp_target_data,
gfc_trans_omp_target_enter_data, gfc_trans_omp_target_exit_data):
Rename calls to gfc_omp_instantiate_mappers.
Julian Brown [Tue, 22 Apr 2025 02:50:59 +0000 (02:50 +0000)]
OpenMP: Expand "declare mapper" mappers for target {enter,exit,} data directives
This patch allows 'declare mapper' mappers to be used on 'omp target
data', 'omp target enter data' and 'omp target exit data' directives.
For each of these, only explicit mappings are supported, unlike for
'omp target' directives where implicit uses of variables inside an
offload region might trigger mappers also.
C, C++ and Fortran are all supported.
The patch also adjusts 'map kind decay' to match OpenMP 5.2 semantics,
which is particularly important with regard to 'exit data' operations.
2023-07-06 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (c_omp_region_type): Add C_ORT_EXIT_DATA,
C_ORT_OMP_EXIT_DATA.
(c_omp_instantiate_mappers): Add region type parameter.
* c-omp.cc (omp_split_map_kind, omp_join_map_kind,
omp_map_decayed_kind): New functions.
(omp_instantiate_mapper): Add ORT parameter. Implement map kind decay
for instantiated mapper clauses.
(c_omp_instantiate_mappers): Add ORT parameter, pass to
omp_instantiate_mapper.
gcc/c/
* c-parser.cc (c_parser_omp_target_data): Instantiate mappers for
'omp target data'.
(c_parser_omp_target_enter_data): Instantiate mappers for 'omp target
enter data'.
(c_parser_omp_target_exit_data): Instantiate mappers for 'omp target
exit data'.
(c_parser_omp_target): Add c_omp_region_type argument to
c_omp_instantiate_mappers call.
* c-tree.h (c_omp_instantiate_mappers): Remove spurious prototype.
gcc/cp/
* parser.cc (cp_parser_omp_target_data): Instantiate mappers for 'omp
target data'.
(cp_parser_omp_target_enter_data): Instantiate mappers for 'omp target
enter data'.
(cp_parser_omp_target_exit_data): Instantiate mappers for 'omp target
exit data'.
(cp_parser_omp_target): Add c_omp_region_type argument to
c_omp_instantiate_mappers call.
* pt.cc (tsubst_omp_clauses): Instantiate mappers for OMP regions other
than just C_ORT_OMP_TARGET.
(tsubst_expr): Update call to tsubst_omp_clauses for OMP_TARGET_UPDATE,
OMP_TARGET_ENTER_DATA, OMP_TARGET_EXIT_DATA stanza.
* semantics.cc (cxx_omp_map_array_section): Avoid calling
build_array_ref for non-array/non-pointer bases (error reported
already).
gcc/fortran/
* trans-openmp.cc (omp_split_map_op, omp_join_map_op,
omp_map_decayed_kind): New functions.
(gfc_trans_omp_instantiate_mapper): Add CD parameter. Implement map
kind decay.
(gfc_trans_omp_instantiate_mappers): Add CD parameter. Pass to above
function.
(gfc_trans_omp_target_data): Instantiate mappers for 'omp target data'.
(gfc_trans_omp_target_enter_data): Instantiate mappers for 'omp target
enter data'.
(gfc_trans_omp_target_exit_data): Instantiate mappers for 'omp target
exit data'.
gcc/testsuite/
* c-c++-common/gomp/declare-mapper-15.c: New test.
* c-c++-common/gomp/declare-mapper-16.c: New test.
* g++.dg/gomp/declare-mapper-1.C: Adjust expected scan output.
* gfortran.dg/gomp/declare-mapper-22.f90: New test.
* gfortran.dg/gomp/declare-mapper-23.f90: New test.
Julian Brown [Wed, 23 Apr 2025 03:22:33 +0000 (03:22 +0000)]
OpenMP: Fortran "!$omp declare mapper" support
This patch implements "omp declare mapper" functionality for Fortran,
following the equivalent support for C and C++. This version of the
patch has been merged to og13 and contains various fixes for e.g.:
* Mappers with deferred-length strings
* Array descriptors not being appropriately transferred
to the offload target (see "OMP_MAP_POINTER_ONLY" and
gimplify.cc:omp_maybe_get_descriptor_from_ptr).
2023-06-30 Julian Brown <julian@codesourcery.com>
gcc/fortran/
* dump-parse-tree.cc (show_attr): Show omp_udm_artificial_var flag.
(show_omp_namelist): Support OMP_MAP_POINTER_ONLY and OMP_MAP_UNSET.
* f95-lang.cc (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define language hooks.
* gfortran.h (gfc_statement): Add ST_OMP_DECLARE_MAPPER.
(symbol_attribute): Add omp_udm_artificial_var attribute.
(gfc_omp_map_op): Add OMP_MAP_POINTER_ONLY and OMP_MAP_UNSET.
(gfc_omp_namelist): Add udm pointer to u2 union.
(gfc_omp_udm): New struct.
(gfc_omp_namelist_udm): New struct.
(gfc_symtree): Add omp_udm pointer.
(gfc_namespace): Add omp_udm_root symtree. Add omp_udm_ns flag.
(gfc_free_omp_namelist): Update prototype.
(gfc_free_omp_udm, gfc_omp_udm_find, gfc_find_omp_udm,
gfc_resolve_omp_udms): Add prototypes.
* match.cc (gfc_free_omp_namelist): Change FREE_NS and FREE_ALIGN
parameters to LIST number, to handle freeing user-defined mapper
namelists safely.
* match.h (gfc_match_omp_declare_mapper): Add prototype.
* module.cc (ab_attribute): Add AB_OMP_DECLARE_MAPPER_VAR.
(attr_bits): Add OMP_DECLARE_MAPPER_VAR.
(mio_symbol_attribute): Read/write AB_OMP_DECLARE_MAPPER_VAR attribute.
Set referenced attr on read.
(omp_map_clause_ops, omp_map_cardinality): New arrays.
(load_omp_udms, check_omp_declare_mappers): New functions.
(read_module): Load and check OMP declare mappers.
(write_omp_udm, write_omp_udms): New functions.
(write_module): Write OMP declare mappers.
* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_variable_list,
gfc_match_omp_to_link, gfc_match_omp_depend_sink,
gfc_match_omp_clause_reduction): Update calls to gfc_free_omp_namelist.
(gfc_free_omp_udm, gfc_find_omp_udm, gfc_omp_udm_find,
gfc_match_omp_declare_mapper): New functions.
(gfc_match_omp_clauses): Add DEFAULT_MAP_OP parameter. Update calls to
gfc_free_omp_namelist. Add declare mapper support.
(resolve_omp_clauses): Add declare mapper support. Update calls to
gfc_free_omp_namelist.
(gfc_resolve_omp_udm, gfc_resolve_omp_udms): New functions.
* parse.cc (decode_omp_directive): Add declare mapper support.
(case_omp_decl): Add ST_OMP_DECLARE_MAPPER case.
(gfc_ascii_statement): Add ST_OMP_DECLARE_MAPPER case.
* resolve.cc (resolve_types): Call gfc_resolve_omp_udms.
* st.cc (gfc_free_statement): Update call to gfc_free_omp_namelist.
* symbol.cc (free_omp_udm_tree): New function.
(gfc_free_namespace): Call above.
* trans-decl.cc (omp_declare_mapper_ns): New global.
(gfc_finish_var_decl, gfc_generate_function_code): Support declare
mappers.
(gfc_trans_deferred_vars): Ignore artificial declare-mapper vars.
* trans-openmp.cc (tree-iterator.h): Include.
(toc_directive): New enum.
(gfc_trans_omp_array_section): Change OP and OPENMP parameters to
toc_directive CD ('clause directive').
(gfc_omp_finish_mapper_clauses, gfc_omp_extract_mapper_directive,
gfc_omp_map_array_section): New functions.
(omp_clause_directive): New enum.
(gfc_trans_omp_clauses): Remove DECLARE_SIMD and OPENACC parameters.
Replace with toc_directive CD, defaulting to TOC_OPENMP. Add declare
mapper support and OMP_MAP_POINTER_ONLY support.
(gfc_trans_omp_construct, gfc_trans_oacc_executable_directive,
gfc_trans_oacc_combined_directive): Update calls to
gfc_trans_omp_clauses.
(gfc_subst_replace, gfc_subst_prepend_ref): New variables.
(gfc_subst_in_expr_1, gfc_subst_in_expr, gfc_subst_mapper_var,
gfc_trans_omp_instantiate_mapper, gfc_trans_omp_instantiate_mappers,
gfc_record_mapper_bindings_code_fn, gfc_record_mapper_bindings_expr_fn,
gfc_find_nested_mappers, gfc_record_mapper_bindings): New functions.
(gfc_typespec * hash traits): New template.
(omp_declare_mapper_ns): Extern declaration.
(gfc_trans_omp_target): Call gfc_trans_omp_instantiate_mappers and
gfc_record_mapper_bindings. Update calls to gfc_trans_omp_clauses.
(gfc_trans_omp_declare_simd, gfc_trans_omp_declare_variant): Update
calls to gfc_trans_omp_clauses.
(gfc_trans_omp_mapper_name, gfc_trans_omp_declare_mapper,
gfc_trans_omp_declare_mappers): New functions.
* trans-stmt.h (gfc_trans_omp_declare_mappers): Add prototype.
* trans.h (gfc_omp_finish_mapper_clauses,
gfc_omp_extract_mapper_directive, gfc_omp_map_array_section): Add
prototypes.
gcc/
* gimplify.cc (dwarf2out.h): Include.
(omp_maybe_get_descriptor_from_ptr): New function.
(build_omp_struct_comp_nodes): Use above function to locate array
descriptor when necessary.
(omp_mapping_group_data, omp_mapping_group_ptr,
omp_mapping_group_pset): New functions.
(omp_instantiate_mapper): Handle inlining of "declare mapper" function
bodies containing setup code (e.g. for Fortran). Handle pointers to
derived types. Handle GOMP_MAP_MAPPING_GROUPs.
* tree-pretty-print.cc (dump_omp_clause): Handle
GOMP_MAP_MAPPING_GROUP.
include/
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_MAPPING_GROUP.
gcc/testsuite/
* gfortran.dg/gomp/declare-mapper-1.f90: New test.
* gfortran.dg/gomp/declare-mapper-5.f90: New test.
* gfortran.dg/gomp/declare-mapper-14.f90: New test.
libgomp/
* testsuite/libgomp.fortran/declare-mapper-2.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-3.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-4.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-6.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-7.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-8.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-9.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-10.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-11.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-12.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-13.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-15.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-17.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-18.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-19.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-20.f90: New test.
* testsuite/libgomp.fortran/declare-mapper-21.f90: New test.
Julian Brown [Mon, 21 Apr 2025 02:08:35 +0000 (02:08 +0000)]
OpenMP: Support OpenMP 5.0 "declare mapper" directives for C
This patch adds support for "declare mapper" directives (and the "mapper"
modifier on "map" clauses) for C.
gcc/c/
* c-decl.cc (c_omp_mapper_id, c_omp_mapper_decl, c_omp_mapper_lookup,
c_omp_extract_mapper_directive, c_omp_map_array_section,
c_omp_scan_mapper_bindings_r, c_omp_scan_mapper_bindings): New
functions.
* c-objc-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks for C.
* c-parser.cc (c_parser_omp_clause_map): Add KIND parameter. Handle
mapper modifier.
(c_parser_omp_all_clauses): Update call to c_parser_omp_clause_map with
new kind argument.
(c_parser_omp_target): Instantiate explicit mappers and record bindings
for implicit mappers.
(c_parser_omp_declare_mapper): Parse "declare mapper" directives.
(c_parser_omp_declare): Support "declare mapper".
* c-tree.h (c_omp_finish_mapper_clauses, c_omp_mapper_lookup,
c_omp_extract_mapper_directive, c_omp_map_array_section,
c_omp_mapper_id, c_omp_mapper_decl, c_omp_scan_mapper_bindings,
c_omp_instantiate_mappers): Add prototypes.
* c-typeck.cc (c_finish_omp_clauses): Handle GOMP_MAP_PUSH_MAPPER_NAME
and GOMP_MAP_POP_MAPPER_NAME.
(c_omp_finish_mapper_clauses): New function (langhook).
gcc/testsuite/
* c-c++-common/gomp/declare-mapper-3.c: Enable for C.
* c-c++-common/gomp/declare-mapper-4.c: Likewise.
* c-c++-common/gomp/declare-mapper-5.c: Likewise.
* c-c++-common/gomp/declare-mapper-6.c: Likewise.
* c-c++-common/gomp/declare-mapper-7.c: Likewise.
* c-c++-common/gomp/declare-mapper-8.c: Likewise.
* c-c++-common/gomp/declare-mapper-9.c: Likewise.
* c-c++-common/gomp/declare-mapper-12.c: Enable for C.
* gcc.dg/gomp/declare-mapper-10.c: New test.
* gcc.dg/gomp/declare-mapper-11.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/declare-mapper-9.c: Enable for C.
* testsuite/libgomp.c-c++-common/declare-mapper-10.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-11.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-12.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-13.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-14.c: Likewise.
Julian Brown [Sun, 20 Apr 2025 21:59:22 +0000 (21:59 +0000)]
OpenMP: C++ "declare mapper" support
This patch adds support for OpenMP 5.0 "declare mapper" functionality
for C++. I've merged it to og13 based on the last version
posted upstream, with some minor changes due to the newly-added
'present' map modifier support. There's also a fix to splay-tree
traversal in gimplify.cc:omp_instantiate_implicit_mappers, and this patch
omits the rearrangement of gimplify.cc:gimplify_{scan,adjust}_omp_clauses
that I separated out into its own patch and applied (to og13) already.
2023-06-30 Julian Brown <julian@codesourcery.com>
gcc/c-family/
* c-common.h (c_omp_region_type): Add C_ORT_DECLARE_MAPPER and
C_ORT_OMP_DECLARE_MAPPER codes.
(omp_mapper_list): Add forward declaration.
(c_omp_find_nested_mappers, c_omp_instantiate_mappers): Add prototypes.
* c-omp.cc (c_omp_find_nested_mappers): New function.
(remap_mapper_decl_info): New struct.
(remap_mapper_decl_1, omp_instantiate_mapper,
c_omp_instantiate_mappers): New functions.
gcc/cp/
* constexpr.cc (reduced_constant_expression_p): Add OMP_DECLARE_MAPPER
case.
(cxx_eval_constant_expression, potential_constant_expression_1):
Likewise.
* cp-gimplify.cc (cxx_omp_finish_mapper_clauses): New function.
* cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks.
* cp-tree.h (lang_decl_base): Add omp_declare_mapper_p field. Recount
spare bits comment.
(DECL_OMP_DECLARE_MAPPER_P): New macro.
(omp_mapper_id): Add prototype.
(cp_check_omp_declare_mapper): Add prototype.
(omp_instantiate_mappers): Add prototype.
(cxx_omp_finish_mapper_clauses): Add prototype.
(cxx_omp_mapper_lookup): Add prototype.
(cxx_omp_extract_mapper_directive): Add prototype.
(cxx_omp_map_array_section): Add prototype.
* decl.cc (check_initializer): Add OpenMP declare mapper support.
(cp_finish_decl): Set DECL_INITIAL for OpenMP declare mapper var decls
as appropriate.
* decl2.cc (mark_used): Instantiate OpenMP "declare mapper" magic var
decls.
* error.cc (dump_omp_declare_mapper): New function.
(dump_simple_decl): Use above.
* parser.cc (cp_parser_omp_clause_map): Add KIND parameter. Support
"mapper" modifier.
(cp_parser_omp_all_clauses): Add KIND argument to
cp_parser_omp_clause_map call.
(cp_parser_omp_target): Call omp_instantiate_mappers before
finish_omp_clauses.
(cp_parser_omp_declare_mapper): New function.
(cp_parser_omp_declare): Add "declare mapper" support.
* pt.cc (tsubst_decl): Adjust name of "declare mapper" magic var decls
once we know their type.
(tsubst_omp_clauses): Call omp_instantiate_mappers before
finish_omp_clauses, for target regions.
(tsubst_expr): Support OMP_DECLARE_MAPPER nodes.
(instantiate_decl): Instantiate initialiser (i.e. definition) for OpenMP
declare mappers.
* semantics.cc (gimplify.h): Include.
(omp_mapper_id, omp_mapper_lookup, omp_extract_mapper_directive,
cxx_omp_map_array_section, cp_check_omp_declare_mapper): New functions.
(finish_omp_clauses): Delete GOMP_MAP_PUSH_MAPPER_NAME and
GOMP_MAP_POP_MAPPER_NAME artificial clauses.
(omp_target_walk_data): Add MAPPERS field.
(finish_omp_target_clauses_r): Scan for uses of struct/union/class type
variables.
(finish_omp_target_clauses): Create artificial mapper binding clauses
for used structs/unions/classes in offload region.
gcc/fortran/
* parse.cc (tree.h, fold-const.h, tree-hash-traits.h): Add includes
(for additions to omp-general.h).
gcc/
* gimplify.cc (gimplify_omp_ctx): Add IMPLICIT_MAPPERS field.
(new_omp_context): Initialise IMPLICIT_MAPPERS hash map.
(delete_omp_context): Delete IMPLICIT_MAPPERS hash map.
(instantiate_mapper_info): New structs.
(remap_mapper_decl_1, omp_mapper_copy_decl, omp_instantiate_mapper,
omp_instantiate_implicit_mappers): New functions.
(gimplify_scan_omp_clauses): Handle MAPPER_BINDING clauses.
(gimplify_adjust_omp_clauses): Instantiate implicit declared mappers.
(gimplify_omp_declare_mapper): New function.
(gimplify_expr): Call above function.
* langhooks-def.h (lhd_omp_finish_mapper_clauses,
lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive,
lhd_omp_map_array_section): Add prototypes.
(LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define macros.
(LANG_HOOK_DECLS): Add above macros.
* langhooks.cc (lhd_omp_finish_mapper_clauses,
lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive,
lhd_omp_map_array_section): New dummy functions.
* langhooks.h (lang_hooks_for_decls): Add OMP_FINISH_MAPPER_CLAUSES,
OMP_MAPPER_LOOKUP, OMP_EXTRACT_MAPPER_DIRECTIVE, OMP_MAP_ARRAY_SECTION
hooks.
* omp-general.h (omp_name_type<T>): Add templatized struct, hash type
traits (for omp_name_type<tree> specialization).
(omp_mapper_list<T>): Add struct.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE__MAPPER_BINDING_.
* tree-pretty-print.cc (dump_omp_clause): Support GOMP_MAP_UNSET,
GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
clauses. Support OMP_CLAUSE__MAPPER_BINDING_ and OMP_DECLARE_MAPPER.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
OMP_CLAUSE__MAPPER_BINDING_.
* tree.def (OMP_DECLARE_MAPPER): New tree code.
* tree.h (OMP_DECLARE_MAPPER_ID, OMP_DECLARE_MAPPER_DECL,
OMP_DECLARE_MAPPER_CLAUSES): New defines.
(OMP_CLAUSE__MAPPER_BINDING__ID, OMP_CLAUSE__MAPPER_BINDING__DECL,
OMP_CLAUSE__MAPPER_BINDING__MAPPER): New defines.
include/
* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_UNSET,
GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
clause types.
gcc/testsuite/
* c-c++-common/gomp/map-6.c: Update error scan output.
* c-c++-common/gomp/declare-mapper-3.c: New test (only enabled for C++
for now).
* c-c++-common/gomp/declare-mapper-4.c: Likewise.
* c-c++-common/gomp/declare-mapper-5.c: Likewise.
* c-c++-common/gomp/declare-mapper-6.c: Likewise.
* c-c++-common/gomp/declare-mapper-7.c: Likewise.
* c-c++-common/gomp/declare-mapper-8.c: Likewise.
* c-c++-common/gomp/declare-mapper-9.c: Likewise.
* c-c++-common/gomp/declare-mapper-12.c: Likewise.
* g++.dg/gomp/declare-mapper-1.C: New test.
* g++.dg/gomp/declare-mapper-2.C: New test.
libgomp/
* testsuite/libgomp.c++/declare-mapper-1.C: New test.
* testsuite/libgomp.c++/declare-mapper-2.C: New test.
* testsuite/libgomp.c++/declare-mapper-3.C: New test.
* testsuite/libgomp.c++/declare-mapper-4.C: New test.
* testsuite/libgomp.c++/declare-mapper-5.C: New test.
* testsuite/libgomp.c++/declare-mapper-6.C: New test.
* testsuite/libgomp.c++/declare-mapper-7.C: New test.
* testsuite/libgomp.c++/declare-mapper-8.C: New test.
* testsuite/libgomp.c-c++-common/declare-mapper-9.c: New test (only
enabled for C++ for now).
* testsuite/libgomp.c-c++-common/declare-mapper-10.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-11.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-12.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-13.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-14.c: Likewise.
Julian Brown [Sun, 20 Apr 2025 00:00:54 +0000 (00:00 +0000)]
OpenACC: Improve implicit mapping for non-lexically nested offload regions
This patch enables use of the OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P flag for
OpenACC.
This allows code like this to work correctly:
  int arr[100];
  [...]
  #pragma acc enter data copyin(arr[20:10])
  /* No explicit mapping of 'arr' here. */
  #pragma acc parallel
  { /* use of arr[20:10]... */ }
  #pragma acc exit data copyout(arr[20:10])
Otherwise, the implicit "copy" ("present_or_copy") on the parallel
corresponds to the whole array, and that fails at runtime when the
subarray is mapped.
The numbering of the GOMP_MAP_IMPLICIT bit clashes with the OpenACC
"non-contiguous" dynamic array support, so the GOMP_MAP_NONCONTIG_ARRAY_P
macro has been adjusted to account for that.
gcc/
* gimplify.cc (gimplify_adjust_omp_clauses_1): Set
OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P for OpenACC also.
gcc/testsuite/
* c-c++-common/goacc/combined-reduction.c: Adjust scan output.
* c-c++-common/goacc/implied-copy-1.c: Likewise.
* c-c++-common/goacc/reduction-1.c: Adjust patterns.
* c-c++-common/goacc/reduction-2.c: Likewise.
* c-c++-common/goacc/reduction-3.c: Likewise.
* c-c++-common/goacc/reduction-4.c: Likewise.
* c-c++-common/goacc/reduction-10.c: Likewise.
* gfortran.dg/goacc/common-block-3.f90: Likewise.
* gfortran.dg/goacc/implied-copy-1.f90: Likewise.
* gfortran.dg/goacc/loop-tree-1.f90: Likewise.
* gfortran.dg/goacc/private-explicit-kernels-1.f95: Likewise.
* gfortran.dg/goacc/private-predetermined-kernels-1.f95: Likewise.
include/
* gomp-constants.h (GOMP_MAP_NONCONTIG_ARRAY_P): Tweak condition.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/implicit-mapping-1.c: New test.
Co-Authored-By: Thomas Schwinge <tschwinge@baylibre.com>
Co-Authored-By: Sandra Loosemore <sloosemore@baylibre.com>
Julian Brown [Wed, 23 Apr 2025 03:11:29 +0000 (03:11 +0000)]
OpenACC: Allow implicit uses of assumed-size arrays in offload regions
This patch reimplements the functionality of the previously-reverted
patch "Assumed-size arrays with non-lexical data mappings". The purpose
is to support implicit uses of assumed-size arrays for Fortran when those
arrays have already been mapped on the target some other way (e.g. by
"acc enter data").
This relates to upstream OpenACC issue 489 (not yet resolved).
gcc/fortran/
* trans-openmp.cc (gfc_omp_finish_clause): Treat implicitly-mapped
assumed-size arrays as zero-sized for OpenACC, rather than an error.
gcc/testsuite/
* gfortran.dg/goacc/assumed-size.f90: Don't expect error.
libgomp/
* testsuite/libgomp.oacc-fortran/nonlexical-assumed-size-1.f90: New
test.
* testsuite/libgomp.oacc-fortran/nonlexical-assumed-size-2.f90: New
test.
Julian Brown [Wed, 23 Apr 2025 03:09:01 +0000 (03:09 +0000)]
OpenACC: "declare create" fixes wrt. "allocatable" variables
This patch fixes a case revealed by the previous patch where a synthetic
"acc data" region created for a "declare create" variable could interact
strangely with lexical inheritance behaviour. In fact, it doesn't seem
right to create the "acc data" region for allocatable variables at all
-- doing so means that a data region is likely to be created for an
unallocated variable.
The fix is not to add such variables to the synthetic "acc data" region
at all, and defer to the code that performs "enter data"/"exit data"
for them when allocated/deallocated on the host instead. Then, "declare
create" variables are implicitly turned into "present" clauses on in-scope
offload regions.
gcc/fortran/
* trans-openmp.cc (gfc_omp_finish_clause): Handle "declare create" for
scalar allocatable variables.
(gfc_trans_omp_clauses): Don't include allocatable vars in synthetic
"acc data" region created for "declare create" variables. Mark such
variables with the "oacc declare create" attribute instead. Don't
create ALWAYS_POINTER mapping for target-to-host updates of declare
create variables.
(gfc_trans_oacc_declare): Handle empty clause list.
gcc/
* gimplify.cc (gimplify_adjust_omp_clauses_1): Handle "oacc declare
create" attribute.
gcc/testsuite/
* c-c++-common/goacc/readonly-1.c: Adjust patterns.
libgomp/
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
Remove xfails.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
Remove xfails.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90:
Remove xfails.
* testsuite/libgomp.oacc-fortran/declare-create-1.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-create-2.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-create-3.f90: New test.
Co-Authored-By: Paul-Antoine Arras <parras@baylibre.com>
Co-Authored-By: Sandra Loosemore <sandra@baylibre.com>
Julian Brown [Sat, 19 Apr 2025 23:41:41 +0000 (23:41 +0000)]
OpenACC: Reimplement "inheritance" for lexically-nested offload regions
This patch reimplements "lexical inheritance" for OpenACC offload regions
inside "data" regions, allowing e.g. this to work:
  int *ptr;
  [...]
  #pragma acc data copyin(ptr[10:2])
  {
    #pragma acc parallel
    { ... }
  }
Here, the "copyin" is mirrored on the inner "acc parallel" as
"present(ptr[10:2])" -- allowing code within the parallel to use that
section of the array even though the mapping is implicit.
In terms of implementation, this works by expanding mapping nodes for
"acc data" to include pointer mappings that might be needed by inner
offload regions. The resulting mapping group is then copied to the inner
offload region as needed, rewriting the first node to "force_present".
The pointer mapping nodes are then removed from the "acc data" later
during gimplification.
For OpenMP, pointer mapping nodes on equivalent "omp target data" regions are
not needed, so remain suppressed during expansion.
gcc/c-family/
* c-omp.cc (c_omp_address_inspector::expand_array_base): Don't omit
pointer nodes for OpenACC.
gcc/
* gimplify.cc (omp_tsort_mark, omp_mapping_group): Move before
gimplify_omp_ctx. Add constructor to omp_mapping_group.
(gimplify_omp_ctx): Add DECL_DATA_CLAUSE field.
(new_omp_context, delete_omp_context): Initialise and free above field.
(omp_gather_mapping_groups_1): Use constructor for omp_mapping_group.
(gimplify_scan_omp_clauses): Record mappings that might be lexically
inherited. Don't remove
GOMP_MAP_FIRSTPRIVATE_POINTER/GOMP_MAP_FIRSTPRIVATE_REFERENCE yet.
(gomp_oacc_needs_data_present): New function.
(gimplify_adjust_omp_clauses_1): Implement lexical inheritance
behaviour for OpenACC.
(gimplify_adjust_omp_clauses): Remove
GOMP_MAP_FIRSTPRIVATE_POINTER/GOMP_MAP_FIRSTPRIVATE_REFERENCE here
instead, after lexical inheritance is done.
gcc/testsuite/
* c-c++-common/goacc/acc-data-chain.c: New test.
* gfortran.dg/goacc/pr70828.f90: Likewise.
* gfortran.dg/goacc/assumed-size.f90: Likewise.
libgomp/
* testsuite/libgomp.oacc-c-c++-common/pr70828.c: New test.
* testsuite/libgomp.oacc-c-c++-common/pr70828-2.c: Likewise.
* testsuite/libgomp.oacc-fortran/pr70828.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr70828-2.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr70828-3.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr70828-4.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr70828-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr70828-6.f90: Likewise.
Chung-Lin Tang [Sat, 19 Apr 2025 21:11:34 +0000 (21:11 +0000)]
Use OpenACC code to process OpenMP target regions
(forward ported from devel/omp/gcc-12)
This is a backport of:
https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619003.html
This patch implements '-fopenmp-target=acc', which enables internally handling
a subset of OpenMP target regions as OpenACC parallel regions. This basically
includes target, teams, parallel, distribute, for/do constructs, and atomics.
Essentially, we adjust the internal kinds to OpenACC type, and let OpenACC code
paths handle them, with various needed adjustments throughout middle-end and
nvptx backend. When using this "OMPACC" mode, if there are cases the patch
doesn't handle, it issues a warning, and reverts to normal processing for that
target region.
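For orientation, a target region of the kind this mode is aimed at might look
like the sketch below. It uses only constructs from the list above (target,
teams, distribute, parallel, for). The function name 'saxpy' and the map
clauses are illustrative assumptions; whether a given region is actually
accepted by the OMPACC path is decided by the patch, not by this example:

```c
#include <stddef.h>

/* Hypothetical example: y = a * x + y over n elements, written as a
   combined OpenMP target construct.  */
void
saxpy (size_t n, float a, const float *x, float *y)
{
  #pragma omp target teams distribute parallel for \
    map(to: x[0:n]) map(tofrom: y[0:n])
  for (size_t i = 0; i < n; i++)
    y[i] = a * x[i] + y[i];
}
```

Such a file would presumably be compiled with '-fopenmp -fopenmp-target=acc';
without -fopenmp the pragma is ignored and the loop runs sequentially.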
gcc/ChangeLog:
* builtins.cc (expand_builtin_omp_builtins): New function.
(expand_builtin): Add expand cases for BUILT_IN_GOMP_BARRIER,
BUILT_IN_OMP_GET_THREAD_NUM, BUILT_IN_OMP_GET_NUM_THREADS,
BUILT_IN_OMP_GET_TEAM_NUM, and BUILT_IN_OMP_GET_NUM_TEAMS using
expand_builtin_omp_builtins, enabled under -fopenmp-target=acc.
* cgraphunit.cc (analyze_functions): Add call to
omp_ompacc_attribute_tagging, enabled under -fopenmp-target=acc.
* common.opt (fopenmp-target=): Add new option and enums.
* config/nvptx/mkoffload.cc (main): Handle -fopenmp-target=.
* config/nvptx/nvptx-protos.h (nvptx_expand_omp_get_num_threads): New
prototype.
(nvptx_mem_shared_p): Likewise.
* config/nvptx/nvptx.cc (omp_num_threads_sym): New global static RTX
symbol for number of threads in team.
(omp_num_threads_align): New var for alignment of omp_num_threads_sym.
(need_omp_num_threads): New bool for if any function references
omp_num_threads_sym.
(nvptx_option_override): Initialize omp_num_threads_sym/align.
(write_as_kernel): Disable normal OpenMP kernel entry under OMPACC mode.
(nvptx_declare_function_name): Disable shim function under OMPACC mode.
Disable soft-stack under OMPACC mode. Add generation of neutering init
code under OMPACC mode.
(nvptx_output_set_softstack): Return "" under OMPACC mode.
(nvptx_expand_call): Set parallelism to vector for function calls with
"ompacc for" attached.
(nvptx_expand_oacc_fork): Set mode to GOMP_DIM_VECTOR under OMPACC mode.
(nvptx_expand_oacc_join): Likewise.
(nvptx_expand_omp_get_num_threads): New function.
(nvptx_mem_shared_p): New function.
(nvptx_mach_max_workers): Return 1 under OMPACC mode.
(nvptx_mach_vector_length): Return 32 under OMPACC mode.
(nvptx_single): Add adjustments for OMPACC mode, which has
parallel-construct fork/joins, and regions of code where neutering is
dynamically determined.
(nvptx_reorg): Enable neutering under OMPACC mode when "ompacc for"
attribute is attached to function. Disable uniform-simt when under
OMPACC mode.
(nvptx_file_end): Write __nvptx_omp_num_threads out when needed.
(nvptx_goacc_fork_join): Return true under OMPACC mode.
* config/nvptx/nvptx.h (struct GTY(()) machine_function): Add
omp_parallel_predicate and omp_fn_entry_num_threads_reg fields.
* config/nvptx/nvptx.md (unspecv): Add UNSPECV_GET_TID,
UNSPECV_GET_NTID, UNSPECV_GET_CTAID, UNSPECV_GET_NCTAID,
UNSPECV_OMP_PARALLEL_FORK, UNSPECV_OMP_PARALLEL_JOIN entries.
(nvptx_shared_mem_operand): New predicate.
(gomp_barrier): New expand pattern.
(omp_get_num_threads): New expand pattern.
(omp_get_num_teams): New insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_team_num): Likewise.
(get_ntid): Likewise.
(nvptx_omp_parallel_fork): Likewise.
(nvptx_omp_parallel_join): Likewise.
* expr.cc (expand_expr_real_1): Call expand_var_decl target hook.
* flag-types.h (omp_target_mode_kind): New flag value enum.
* gimplify.cc (struct gimplify_omp_ctx): Add 'bool ompacc' field.
(gimplify_scan_omp_clauses): Handle OMP_CLAUSE__OMPACC_.
(gimplify_adjust_omp_clauses): Likewise.
(gimplify_omp_ctx_ompacc_p): New function.
(gimplify_omp_for): Handle combined loops under OMPACC.
* lto-wrapper.cc (append_compiler_options): Add OPT_fopenmp_target_.
* omp-builtins.def (BUILT_IN_OMP_GET_THREAD_NUM): Remove CONST.
(BUILT_IN_OMP_GET_NUM_THREADS): Likewise.
* omp-expand.cc (remove_exit_barrier): Disable addressable-var
processing for parallel construct child functions under OMPACC mode.
(expand_oacc_for): Add OMPACC mode handling.
(get_target_arguments): Force thread_limit clause value to 1 under
OMPACC mode.
(expand_omp): Under OMPACC mode, avoid child function expanding of
GIMPLE_OMP_PARALLEL.
* omp-general.cc (omp_extract_for_data): Adjustments for OMPACC mode.
* omp-low.cc (struct omp_context): Add 'bool ompacc_p' field.
(scan_sharing_clauses): Handle OMP_CLAUSE__OMPACC_.
(ompacc_ctx_p): New function.
(scan_omp_parallel): Handle OMPACC mode, avoid creating child function.
(scan_omp_target): Tag "ompacc"/"ompacc for" attributes for target
construct child function, remove OMP_CLAUSE__OMPACC_ clauses.
(lower_oacc_head_mark): Handle OMPACC mode cases.
(lower_omp_for): Adjust OMP_FOR kind from OpenMP to OpenACC kinds, add
vector/gang clauses as needed. Add other OMPACC handling.
(lower_omp_taskreg): Add call to lower_oacc_head_tail for OMPACC case.
(lower_omp_target): Do OpenACC gang privatization under OMPACC case.
(lower_omp_teams): Forward OpenACC privatization variables to outer
target region under OMPACC mode.
(lower_omp_1): Do OpenACC gang privatization under OMPACC case for
GIMPLE_BIND.
* omp-offload.cc (ompacc_supported_clauses_p): New function.
(struct target_region_data): New struct type for tree walk.
(scan_fndecl_for_ompacc): New function.
(scan_omp_target_region_r): New function.
(scan_omp_target_construct_r): New function.
(omp_ompacc_attribute_tagging): New function.
(oacc_dim_call): Add OMPACC case handling.
(execute_oacc_device_lower): Make parts explicitly only OpenACC enabled.
(pass_oacc_device_lower::gate): Enable pass under OMPACC mode.
* omp-offload.h (omp_ompacc_attribute_tagging): New prototype.
* opts.cc (finish_options): Only allow -fopenmp-target= when -fopenmp
and no -fopenacc.
* target-insns.def (gomp_barrier): New defined insn pattern.
(omp_get_thread_num): Likewise.
(omp_get_num_threads): Likewise.
(omp_get_team_num): Likewise.
(omp_get_num_teams): Likewise.
* tree-core.h (enum omp_clause_code): Add new OMP_CLAUSE__OMPACC_ entry
for internal clause.
* tree-nested.cc (convert_nonlocal_omp_clauses): Handle
OMP_CLAUSE__OMPACC_.
* tree-pretty-print.cc (dump_omp_clause): Handle OMP_CLAUSE__OMPACC_.
* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE__OMPACC_ entry.
(omp_clause_code_name): Likewise.
* tree.h (OMP_CLAUSE__OMPACC__FOR): New macro for OMP_CLAUSE__OMPACC_.
libgomp/ChangeLog:
* config/nvptx/team.c (__nvptx_omp_num_threads): New global variable in
shared memory.
(cherry picked from commit 5f881613fa9128edae5bbfa4e19f9752809e4bd7)
Thomas Schwinge [Sat, 19 Apr 2025 20:57:56 +0000 (20:57 +0000)]
libgomp: Document OpenMP 'pinned' memory
libgomp/
* libgomp.texi (AMD Radeon, nvptx): Document OpenMP 'pinned'
memory.
Thomas Schwinge [Sat, 19 Apr 2025 20:52:08 +0000 (20:52 +0000)]
In 'libgomp/target.c:gomp_unmap_vars_internal', defer 'gomp_remove_var'
An upcoming change requires that 'gomp_remove_var' be deferred until after all
'gomp_copy_dev2host' calls have been handled.
Do this in the same way that commit 275c736e732d29934e4d22e8f030d5aae8c12a52
"libgomp: Structure element mapping for OpenMP 5.0" changed 'gomp_exit_data'.
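The two-phase pattern can be pictured with a small stand-alone sketch. This is
not the libgomp code: 'struct entry', 'unmap_entries' and the caller-supplied
queue are hypothetical stand-ins, and the copy-back and removal are only
simulated with flags:

```c
#include <stddef.h>

/* Hypothetical stand-in for a mapped variable; not a libgomp type.  */
struct entry
{
  int refcount;   /* remaining references to the mapping */
  int copy_back;  /* nonzero if data must be copied device -> host */
  int copied;     /* set once the simulated copy-back has happened */
  int removed;    /* set once the simulated removal has happened */
};

/* Process N entries; QUEUE must have room for N pointers.  Returns the
   number of entries that were removed.  */
size_t
unmap_entries (struct entry **entries, size_t n, struct entry **queue)
{
  size_t nqueued = 0;

  /* Main loop: perform every copy-back, but only queue removals.  */
  for (size_t i = 0; i < n; i++)
    {
      struct entry *e = entries[i];
      if (e->copy_back)
        e->copied = 1;
      if (--e->refcount == 0)
        queue[nqueued++] = e;
    }

  /* Deferred phase: all copy-backs are done, so removal is now safe.  */
  for (size_t i = 0; i < nqueued; i++)
    queue[i]->removed = 1;

  return nqueued;
}
```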
libgomp/
* target.c (gomp_unmap_vars_internal): Queue splay-tree keys for
removal after main loop.
Tobias Burnus [Wed, 2 Nov 2022 16:04:53 +0000 (17:04 +0100)]
Fortran/OpenMP: Testcase for DT struct-component with 'alloc' and array descr
Using 'map(alloc: var, dt%comp)' requires a 'to' mapping of the array
descriptor, as otherwise the bounds are not available in the target
region. The same applies to character strings.
Part of patch submitted to mainline:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604887.html
The code changes from that patch made it upstream in a different form
in commit 80bb0b8a81fdc5d0a1c88ae3febd593868daa752. This patch is only
the new testcase from the original posting. Some additional issues are
exposed by the testcase; those are '#if 0'ed and will be handled later.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/target-enter-data-3a.f90: New test.
Tobias Burnus [Wed, 2 Nov 2022 08:06:28 +0000 (09:06 +0100)]
OpenMP/Fortran: 'target update' with strides + DT components
OpenMP 5.0 permits the use of arrays with strides and derived-type
components as list items in the 'from'/'to' clauses of the
'target update' directive.
Partially committed to mainline as:
6629444170f85 OpenMP/Fortran: 'target update' with DT components
This patch contains the differences to the mainline version.
gcc/fortran/ChangeLog:
* openmp.cc (resolve_omp_clauses): Apply to OpenMP target update.
libgomp/ChangeLog:
* testsuite/libgomp.fortran/target-13.f90: Update test.
Andrew Stubbs [Fri, 21 Oct 2022 13:19:31 +0000 (14:19 +0100)]
vect: WORKAROUND vectorizer bug
This patch disables vectorization of memory accesses to non-default address
spaces where the pointer size is different to the usual pointer size. This
condition typically occurs in OpenACC programs on amdgcn, where LDS memory is
used for broadcasting gang-private variables between threads. In particular,
see libgomp.oacc-c-c++-common/private-variables.c
The problem is that the address space information is dropped from the various
types in the middle-end and eventually it triggers an ICE trying to do an
address conversion. That ICE can be avoided by defining
POINTERS_EXTEND_UNSIGNED, but that just produces wrong RTL code later on.
A correct solution would ensure that all the vectypes have the correct address
spaces, but I don't have time for that right now.
gcc/ChangeLog:
* tree-vect-data-refs.cc (vect_analyze_data_refs): Work around an
address-space bug.
Tobias Burnus [Sat, 19 Apr 2025 02:13:38 +0000 (02:13 +0000)]
OpenMP: Add uses_allocators support
This adds middle-end support for uses_allocators, wires Fortran to use it, and
adds C/C++ parsing support.
gcc/ChangeLog:
* builtin-types.def (BT_FN_VOID_PTRMODE): Add.
(BT_FN_PTRMODE_PTRMODE_INT_PTR): Add.
* gimplify.cc (gimplify_bind_expr): Diagnose missing
uses_allocators clause.
(gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses,
gimplify_omp_workshare): Handle uses_allocators.
* omp-builtins.def (BUILT_IN_OMP_INIT_ALLOCATOR,
BUILT_IN_OMP_DESTROY_ALLOCATOR): Add.
* omp-low.cc (scan_sharing_clauses): Handle OMP_CLAUSE_USES_ALLOCATORS
and OMP_CLAUSE_USES_ALLOCATORS_ALLOCATOR clauses.
* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_USES_ALLOCATORS.
* tree.cc (omp_clause_num_ops, omp_clause_code_name): Likewise.
* tree-pretty-print.cc (dump_omp_clause): Handle it.
* tree.h (OMP_CLAUSE_USES_ALLOCATORS_ALLOCATOR,
OMP_CLAUSE_USES_ALLOCATORS_MEMSPACE,
OMP_CLAUSE_USES_ALLOCATORS_TRAITS): New.
gcc/c-family/ChangeLog:
* c-omp.cc (c_omp_split_clauses): Handle uses_allocators.
* c-pragma.h (enum pragma_omp_clause): Add
PRAGMA_OMP_CLAUSE_USES_ALLOCATORS.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_clause_uses_allocators): New.
(c_parser_omp_clause_name, c_parser_omp_all_clauses,
OMP_TARGET_CLAUSE_MASK): Handle uses_allocators.
* c-typeck.cc (c_finish_omp_clauses): Likewise.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_clause_uses_allocators): New.
(cp_parser_omp_clause_name, cp_parser_omp_all_clauses,
OMP_TARGET_CLAUSE_MASK): Handle uses_allocators.
* semantics.cc (finish_omp_clauses): Likewise.
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_conv_array_initializer): Set PURPOSE
when building constructor for get_initialized_tmp_var.
* trans-openmp.cc (gfc_trans_omp_clauses): Handle uses_allocators.
* types.def (BT_FN_VOID_PTRMODE, BT_FN_PTRMODE_PTRMODE_INT_PTR): Add.
libgomp/ChangeLog:
* testsuite/libgomp.c++/c++.exp (check_effective_target_c,
check_effective_target_c++): Add.
* testsuite/libgomp.c/c.exp (check_effective_target_c,
check_effective_target_c++): Add.
* testsuite/libgomp.fortran/uses_allocators_2.f90: Remove 'sorry'.
* testsuite/libgomp.c-c++-common/uses_allocators-1.c: New test.
* testsuite/libgomp.c-c++-common/uses_allocators-2.c: New test.
* testsuite/libgomp.c-c++-common/uses_allocators-3.c: New test.
* testsuite/libgomp.c-c++-common/uses_allocators-4.c: New test.
* testsuite/libgomp.fortran/uses_allocators_3.f90: New test.
* testsuite/libgomp.fortran/uses_allocators_4.f90: New test.
* testsuite/libgomp.fortran/uses_allocators_5.f90: New test.
* testsuite/libgomp.fortran/uses_allocators_6.f90: New test.
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/allocate-1.f90: Add uses_allocators.
* gfortran.dg/gomp/scope-6.f90: Update dg-scan-tree-dump.
* c-c++-common/gomp/uses_allocators-1.c: New test.
* c-c++-common/gomp/uses_allocators-2.c: New test.
* gfortran.dg/gomp/uses_allocators-1.f90: New test.
Tobias Burnus [Fri, 21 Oct 2022 13:31:25 +0000 (15:31 +0200)]
omp-oacc-kernels-decompose.cc: fix -fcompare-debug with GIMPLE_DEBUG
GIMPLE_DEBUG statements were put in a parallel region of their own, which
is not only pointless but also breaks -fcompare-debug. With this commit,
they are handled like simple assignments: they are placed into the same
body as the loop, so that only one parallel region remains, just as
without debugging. This fixes the existing testcase
libgomp.oacc-c-c++-common/kernels-loop-g.c.
Note: GIMPLE_DEBUG statements are only accepted with -fcompare-debug; if
they appear otherwise, decompose_kernels_region_body rejects them with
a sorry (unchanged).
Also note that there are still many xfailed tests in the
c-c++-common/goacc/kernels-decompose-pr* testcases that were added
in mainline commit c14ea6a72fb1ae66e3d32ac8329558497c6e4403.
gcc/ChangeLog
* omp-oacc-kernels-decompose.cc (top_level_omp_for_in_stmt,
decompose_kernels_region_body): Handle GIMPLE_DEBUG like
simple assignment.
gcc/testsuite/ChangeLog
* c-c++-common/goacc/kernels-decompose-pr103836-1-2.c: Adjust xfails.
* c-c++-common/goacc/kernels-decompose-pr103836-1-3.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr103836-1-4.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104061-1-2.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104061-1-3.c: Likewise.
* c-c++-common/goacc/kernels-decompose-pr104061-1-4.c: Likewise.
Co-Authored-By: Sandra Loosemore <sloosemore@baylibre.com>
Andrew Stubbs [Fri, 18 Apr 2025 21:56:19 +0000 (21:56 +0000)]
libgomp: fine-grained pinned memory allocator
https://patchwork.sourceware.org/project/gcc/list/?series=35022
This patch introduces a new custom memory allocator for use with pinned
memory (in the case where the Cuda allocator isn't available). In future,
this allocator will also be used for Unified Shared Memory. Both memories
are incompatible with the system malloc because allocated memory cannot
share a page with memory allocated for other purposes.
This means that small allocations will no longer consume an entire page of
pinned memory. Unfortunately, it also means that pinned memory pages will
never be unmapped (although they may be reused).
The implementation is not perfect; there are various corner cases (especially
related to extending onto new pages) where allocations and reallocations may
be sub-optimal, but it should still be a step forward in support for small
allocations.
I have considered using libmemkind's "fixed" memory but rejected it for three
reasons: 1) libmemkind may not always be present at runtime, 2) there's no
currently documented means to extend a "fixed" kind one page at a time
(although the code appears to have an undocumented function that may do the
job, and/or extending libmemkind to support the MAP_LOCKED mmap flag with its
regular kinds would be straightforward), 3) USM benefits from having the
metadata located in different memory and using an external implementation makes
it hard to guarantee this.
libgomp/ChangeLog:
* Makefile.am (libgomp_la_SOURCES): Add usmpin-allocator.c.
* Makefile.in: Regenerate.
* config/linux/allocator.c: Include unistd.h.
(pin_ctx): New variable.
(ctxlock): New variable.
(linux_init_pin_ctx): New function.
(linux_memspace_alloc): Use usmpin-allocator for pinned memory.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise.
* libgomp.h (usmpin_init_context): New prototype.
(usmpin_register_memory): New prototype.
(usmpin_alloc): New prototype.
(usmpin_free): New prototype.
(usmpin_realloc): New prototype.
* testsuite/libgomp.c/alloc-pinned-8.c: New test.
* usmpin-allocator.c: New file.
Andrew Stubbs [Fri, 18 Apr 2025 21:48:33 +0000 (21:48 +0000)]
libgomp, nvptx: Cuda pinned memory
https://patchwork.sourceware.org/project/gcc/list/?series=35022
This patch was already approved, in the v3 posting by Tobias Burnus
(with one caveat about initialization location), but wasn't committed at
that time as I didn't want to disentangle it from the textual
dependencies on the other patches in the series.
------
Use Cuda to pin memory, instead of Linux mlock, when available.
There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.
The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX. At present, the other supported devices do
not have equivalent capabilities (or requirements).
libgomp/ChangeLog:
* config/linux/allocator.c: Include assert.h.
(using_device_for_page_locked): New variable.
(linux_memspace_alloc): Add init0 parameter. Support device pinning.
(linux_memspace_calloc): Set init0 to true.
(linux_memspace_free): Support device pinning.
(linux_memspace_realloc): Support device pinning.
(MEMSPACE_ALLOC): Set init0 to false.
* libgomp-plugin.h
(GOMP_OFFLOAD_page_locked_host_alloc): New prototype.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* libgomp.h (gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(struct gomp_device_descr): Add page_locked_host_alloc_func and
page_locked_host_free_func.
* libgomp.texi: Adjust the docs for the pinned trait.
* libgomp_g.h (GOMP_enable_pinned_mode): New prototype.
* plugin/plugin-nvptx.c
(GOMP_OFFLOAD_page_locked_host_alloc): New function.
(GOMP_OFFLOAD_page_locked_host_free): Likewise.
* target.c (device_for_page_locked): New variable.
(get_device_for_page_locked): New function.
(gomp_page_locked_host_alloc): Likewise.
(gomp_page_locked_host_free): Likewise.
(gomp_load_plugin_for_device): Add page_locked_host_alloc and
page_locked_host_free.
* testsuite/libgomp.c/alloc-pinned-1.c: Change expectations for NVPTX
devices.
* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Andrew Stubbs [Wed, 12 Jun 2024 11:42:21 +0000 (11:42 +0000)]
openmp: -foffload-memory=pinned
https://patchwork.sourceware.org/project/gcc/list/?series=35022
Implement the -foffload-memory=pinned option such that libgomp is
instructed to enable fully-pinned memory at start-up. The option is
intended to provide a performance boost to certain offload programs without
modifying the code.
This feature only works on Linux, at present, and simply calls mlockall to
enable always-on memory pinning. It requires that the ulimit feature is
set high enough to accommodate all the program's memory usage.
In this mode the ompx_gnu_pinned_memory_alloc feature is disabled as it is not
needed and may conflict.
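For readers unfamiliar with mlockall, the underlying system call can be
exercised on its own as in the sketch below. The helper name
'try_pin_all_memory' is hypothetical, and the choice of
MCL_CURRENT | MCL_FUTURE is an assumption made for illustration; the flags
libgomp actually passes are not stated here:

```c
#include <stdio.h>
#include <sys/mman.h>

/* Hypothetical helper: try to pin all current and future pages of the
   calling process.  Returns 0 on success and -1 on failure, which
   typically means the locked-memory limit (ulimit -l) is too low.  */
int
try_pin_all_memory (void)
{
  if (mlockall (MCL_CURRENT | MCL_FUTURE) != 0)
    {
      perror ("mlockall");
      return -1;
    }
  return 0;
}
```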
gcc/ChangeLog:
* omp-builtins.def (BUILT_IN_GOMP_ENABLE_PINNED_MODE): New.
* omp-low.cc (omp_enable_pinned_mode): New function.
(execute_lower_omp): Call omp_enable_pinned_mode.
libgomp/ChangeLog:
* config/linux/allocator.c (always_pinned_mode): New variable.
(GOMP_enable_pinned_mode): New function.
(linux_memspace_alloc): Disable pinning when always_pinned_mode set.
(linux_memspace_calloc): Likewise.
(linux_memspace_free): Likewise.
(linux_memspace_realloc): Likewise.
* libgomp.map: Add GOMP_enable_pinned_mode.
* testsuite/libgomp.c/alloc-pinned-7.c: New test.
* testsuite/libgomp.c-c++-common/alloc-pinned-1.c: New test.
Andrew Stubbs [Fri, 18 Apr 2025 21:39:54 +0000 (21:39 +0000)]
openmp: Add -foffload-memory
https://patchwork.sourceware.org/project/gcc/list/?series=35022
Add a new option. It's inactive until I add some follow-up patches.
gcc/ChangeLog:
* common.opt: Add -foffload-memory and its enum values.
* coretypes.h (enum offload_memory): New.
* doc/invoke.texi: Document -foffload-memory.
Kwok Cheung Yeung [Mon, 1 Mar 2021 22:15:30 +0000 (14:15 -0800)]
openmp: Scale type precision of collapsed iterator variable
This sets the type precision of the collapsed iterator variable to the
sum of the precisions of the collapsed loop variables, up to a maximum of
the precision of long long (i.e. 64 bits).
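The rule amounts to a saturating sum. A minimal sketch of the arithmetic
follows; the function name 'collapsed_precision' is hypothetical and this is
not the code in omp_extract_for_data:

```c
#include <limits.h>

/* Hypothetical helper: given the bit precisions of NLOOPS collapsed
   loop variables, return the precision for the combined iterator,
   capped at the precision of long long.  */
unsigned
collapsed_precision (const unsigned *prec, unsigned nloops)
{
  const unsigned max_prec = sizeof (long long) * CHAR_BIT;
  unsigned total = 0;

  for (unsigned i = 0; i < nloops; i++)
    {
      total += prec[i];
      if (total >= max_prec)
        return max_prec;  /* saturate rather than overflow */
    }
  return total;
}
```

For example, collapsing two loops over 16-bit variables gives a 32-bit
iterator, while three loops over 32-bit variables saturate at the precision
of long long.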
gcc/ChangeLog
* omp-expand.cc (expand_oacc_for): Convert .tile variable to
diff_type before multiplying.
* omp-general.cc (omp_extract_for_data): Use accumulated precision
of all collapsed for-loops as precision of iteration variable, up
to the precision of a long long.
libgomp/ChangeLog
* testsuite/libgomp.c-c++-common/collapse-4.c: New.
* testsuite/libgomp.fortran/collapse5.f90: New.
Andrew Stubbs [Tue, 3 Aug 2021 12:45:35 +0000 (13:45 +0100)]
libgomp amdgcn: Fix issues with dynamic OpenMP thread scaling
libgomp/ChangeLog:
* config/gcn/bar.h (gomp_barrier_init): Limit thread count to the
actual physical number.
* config/gcn/team.c (gomp_team_start): Don't attempt to set up
threads that do not exist.
Andrew Stubbs [Tue, 23 Feb 2021 21:35:08 +0000 (21:35 +0000)]
nvptx: remove erroneous stack deletion
The stacks are not supposed to be deleted every time memory is allocated, only
when there is insufficient memory. The unconditional call here seems to be in
error, and is causing a costly reallocation of the stacks before every launch.
libgomp/
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_alloc): Remove early call to
nvptx_stacks_free.
Chung-Lin Tang [Mon, 1 Feb 2021 11:16:47 +0000 (03:16 -0800)]
OpenMP 5.0: Allow multiple clauses mapping same variable
This is a merge of:
https://gcc.gnu.org/pipermail/gcc-patches/2020-December/562081.html
This patch now allows multiple clauses on the same construct to map
the same variable, which was not valid in OpenMP 4.5, but is now allowed
in 5.0.
This may be reverted or updated when a final patch is approved
for mainline.
gcc/c/ChangeLog
* c-typeck.cc (c_finish_omp_clauses): Adjust to allow duplicate
mapped variables for OpenMP.
gcc/cp/ChangeLog
* semantics.cc (finish_omp_clauses): Adjust to allow duplicate
mapped variables for OpenMP.
gcc/ChangeLog
* omp-low.cc (install_var_field): Add new 'tree key_expr = NULL_TREE'
default parameter. Set splay-tree lookup key to key_expr instead of
var if key_expr is non-NULL. Adjust call to install_parm_decl.
Update comments.
(scan_sharing_clauses): Use clause tree expression as splay-tree key
for map/to/from and OpenACC firstprivate cases when installing the
variable field into the send/receive record type.
(maybe_lookup_field_in_outer_ctx): Add code to search through
construct clauses instead of entirely based on splay-tree lookup.
(lower_oacc_reductions): Adjust to find map-clause of reduction
variable, then create receiver-ref.
(lower_omp_target): Adjust to lookup var field using clause expression.
gcc/testsuite/ChangeLog
* c-c++-common/gomp/clauses-2.c: Adjust testcase.
* c-c++-common/gomp/map-6.c: Adjust testcase.
Co-Authored-By: Paul-Antoine Arras <parras@baylibre.com>
Andrew Stubbs [Fri, 15 Jan 2021 11:26:46 +0000 (11:26 +0000)]
DWARF address space for variables
Add DWARF address class attributes for variables that exist outside the
generic address space. In particular, this is the case for gang-private
variables in OpenACC offload kernels.
gcc/ChangeLog:
* dwarf2out.cc (add_location_or_const_value_attribute): Set
DW_AT_address_class, if appropriate.
Andrew Stubbs [Sun, 6 Dec 2020 19:23:55 +0000 (19:23 +0000)]
DWARF: late code range fixup
Ensure that the parent DWARF subprograms of offload kernel functions have a
code range, and are therefore not discarded by GDB. This is only necessary
when the parent function does not actually exist in the final binary, which is
commonly the case within the offload device's binary.
gcc/
* dwarf2out.cc (notional_parents_list): New file variable.
(gen_subprogram_die): Record offload kernel functions in
notional_parents_list.
(fixup_notional_parents): New function.
(dwarf2out_finish): Call fixup_notional_parents.
(dwarf2out_c_finalize): Reset notional_parents_list.
Julian Brown [Fri, 18 Apr 2025 21:00:09 +0000 (21:00 +0000)]
openacc: Adjust loop lowering for AMD GCN
This patch adjusts OpenACC loop lowering in the AMD GCN target compiler
in such a way that the autovectorizer can vectorize the "vector" dimension
of those loops in more cases.
Rather than generating "SIMT" code that executes a scalar instruction
stream for each lane of a vector in lockstep, for GCN we model the GPU
like a typical CPU, with separate instructions to operate on scalar and
vector data. That means that unlike other offload targets, we rely on
the autovectorizer to handle the innermost OpenACC parallelism level,
which is "vector".
Because of this, the OpenACC builtin functions to return the current
vector lane and the vector width return 0 and 1 respectively, despite
the native vector being 64 elements wide.
This allows generated code to work with our chosen compilation model,
but the way loops are lowered in omp-offload.cc:oacc_xform_loop does not
understand the discrepancy between logical (OpenACC) and physical vector
sizes correctly. That means that if a loop is partitioned over e.g. the
worker AND vector dimensions, we actually lower with unit vector size --
meaning that if we then autovectorize, we end up trying to vectorize
over the "worker" dimension rather than the vector one! Then, because
the number of workers is not fixed at compile time, that means the
autovectorizer has a hard time analysing the loop and thus vectorization
often fails entirely.
We can fix this by deducing the true vector width in oacc_xform_loop,
and using that when we are on a "non-SIMT" offload target. We can then
rearrange how loops are lowered in that function so that the loop form
fed to the autovectorizer is more amenable to vectorization -- namely,
the innermost step is set to process each loop iteration sequentially.
For some benchmarks, allowing vectorization to succeed leads to quite
impressive performance improvements -- I've observed between 2.5x and
40x on one machine/GPU combination.
The low-level builtins available to user code (__builtin_goacc_parlevel_id
and __builtin_goacc_parlevel_size) continue to return 0/1 respectively
for the vector dimension for AMD GCN, even if their containing loop is
vectorized -- that's a quirk that we might possibly want to address at
some later date.
Only non-"chunking" loops are handled at present. "Chunking" loops are
still lowered as before.
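The intended loop shape can be sketched in plain C. This is only an
illustration of the idea, not the output of oacc_xform_loop: the function
name 'scale_partitioned', the simulated worker loop and the parameter 'vf'
(standing in for the true vector width) are all assumptions:

```c
/* Hypothetical illustration: each simulated worker takes blocks of VF
   contiguous iterations, and the innermost loop walks one block with
   unit stride, which is the form an autovectorizer handles well.  */
void
scale_partitioned (float *a, long n, long num_workers, long vf)
{
  for (long w = 0; w < num_workers; w++)
    for (long base = w * vf; base < n; base += num_workers * vf)
      {
        long end = base + vf < n ? base + vf : n;
        /* Innermost step: sequential, contiguous iterations.  */
        for (long i = base; i < end; i++)
          a[i] *= 2.0f;
      }
}
```

Every index in [0, n) is visited exactly once, because block k (starting at
k * vf) is handled by worker k modulo num_workers.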
gcc/ChangeLog
* omp-offload.cc (oacc_thread_numbers): Add VF_BY_VECTORIZER parameter.
Add overloaded wrapper for previous arguments & behaviour.
(oacc_xform_loop): Lower vector loops to iterate a multiple of
omp_max_vf times over contiguous steps on non-SIMT targets.
libgomp/ChangeLog
* testsuite/libgomp.oacc-c-c++-common/loop-gwv-1.c: Adjust for loop
lowering changes.
* testsuite/libgomp.oacc-c-c++-common/loop-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/loop-red-wv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-gwv-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-wv-1.c: Likewise.
Cesar Philippidis [Thu, 17 Apr 2025 14:10:11 +0000 (14:10 +0000)]
Fortran "declare create"/allocate support for OpenACC
This patch incorporates these commits from OG14 branch:
65be1389eeda9b3b97f6587721215c3f31bd7f98
9d43e819d88f97c7ade7f8c95c35ea3464ea7771
f2cf2b994c4d8c871fad5502ffb9aaee9ea4f4e0
2770ce41615557e595065ce0c5db71e9f3d82b0a
a29e58f4b314862a72730119f85e9125879abf0b
ffd990543f805ed448aaa355d190f37103f8f1f0
gcc/ChangeLog
* gimplify.cc (omp_group_base): Handle GOMP_MAP_DECLARE_ALLOCATE
and GOMP_MAP_DECLARE_DEALLOCATE.
(gimplify_adjust_omp_clauses): Likewise.
* omp-low.cc (scan_sharing_clauses): Update handling of OpenACC declare
create, declare copyin and declare deviceptr to have local lifetimes.
(convert_to_firstprivate_int): Handle pointer types.
(convert_from_firstprivate_int): Likewise. Create local storage for
the values being pointed to. Add new orig_type argument. Use
VIEW_CONVERT also for vectors.
(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
Add orig_type argument to convert_from_firstprivate_int call.
Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT. Don't privatize
firstprivate VLAs.
* tree-pretty-print.cc (dump_omp_clause): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
gcc/fortran/ChangeLog
* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
OMP_MAP_DECLARE_DEALLOCATE.
(gfc_omp_clauses): Add update_allocatable.
* trans-array.cc (gfc_array_allocate): Call
gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create
attribute set.
* trans-decl.cc (find_module_oacc_declare_clauses): Relax
oacc_declare_create to OMP_MAP_ALLOC, and oacc_declare_copyin to
OMP_MAP_TO, in order to match OpenACC 2.5 semantics.
* trans-openmp.cc (gfc_omp_check_optional_argument): Handle non-decl
case.
(gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER (for update
directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for
allocatable scalar decls. Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}
clauses.
(gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER
for allocatable scalar data clauses inside acc update directives.
(gfc_trans_oacc_declare_allocate): New function.
* trans-stmt.cc (gfc_trans_allocate): Call
gfc_trans_oacc_declare_allocate for decls with oacc_declare_create
attribute set.
(gfc_trans_deallocate): Likewise.
* trans.h (gfc_trans_oacc_declare_allocate): Declare.
gcc/testsuite/ChangeLog
* gfortran.dg/goacc/declare-allocatable-1.f90: New test.
* gfortran.dg/goacc/declare-3.f95: Adjust expected dump output.
include/ChangeLog
* gomp-constants.h (enum gomp_map_kind): Define
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} and GOMP_MAP_FLAG_SPECIAL_4.
libgomp/ChangeLog
* libgomp.h (gomp_acc_declare_allocate): Remove prototype.
* oacc-mem.c (gomp_acc_declare_allocate): New function.
(find_group_last): Handle GOMP_MAP_DECLARE_ALLOCATE and
GOMP_MAP_DECLARE_DEALLOCATE groupings.
(goacc_enter_data_internal): Fix kind check for
GOMP_MAP_DECLARE_ALLOCATE. Pass new pointer argument to
gomp_acc_declare_allocate. Unlock mutex before calling
gomp_acc_declare_allocate and relock it afterwards.
(goacc_exit_data_internal): Unlock device mutex around
gomp_acc_declare_allocate call. Pass new pointer argument. Handle
group pointer mapping for deallocate.
* testsuite/libgomp.oacc-fortran/allocatable-scalar.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-directive.f90:
Adjust.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1-runtime.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/declare-allocatable-1.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/declare-allocatable-2.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-4.f90: New test.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-directive.f90:
Adjust.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1-runtime.f90:
Likewise.
* testsuite/libgomp.oacc-fortran/declare-allocatable-array_descriptor-1.f90:
New test.
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Paul-Antoine Arras <parras@baylibre.com>
Julian Brown [Fri, 20 Sep 2019 20:53:10 +0000 (13:53 -0700)]
Handle references in OpenACC "private" clauses
Combination of OG14 commits
141a592bf147c91c28de7864fa12259687e827e3
8d7562192cc814c6d0d48b424d7751762871a37b
+ new testsuite fixes to add xfails for tests that already failed on OG14.
gcc/ChangeLog
* gimplify.cc (localize_reductions): Rewrite references for
OMP_CLAUSE_PRIVATE also. Do not create local variable for
privatized arrays as the size is not directly known by the type.
gcc/testsuite/ChangeLog
* gfortran.dg/goacc/privatization-1-compute-loop.f90: Add xfails.
* gfortran.dg/goacc/privatization-1-compute.f90: Likewise.
libgomp/ChangeLog
* testsuite/libgomp.oacc-c++/privatized-ref-3.C: Add xfails.
* testsuite/libgomp.oacc-fortran/optional-private.f90: Likewise.
* testsuite/libgomp.oacc-fortran/privatized-ref-1.f95: Likewise.
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
Co-Authored-By: Sandra Loosemore <sandra@baylibre.com>
Cesar Philippidis [Tue, 15 Apr 2025 15:56:38 +0000 (15:56 +0000)]
Reference reduction localization
gcc/ChangeLog
* gimplify.cc (privatize_reduction): New struct.
(localize_reductions_r, localize_reductions): New functions.
(gimplify_omp_for): Call localize_reductions.
(gimplify_omp_workshare): Likewise.
* omp-low.cc (lower_oacc_reductions): Handle localized reductions.
Create fewer temp vars.
* tree-core.h (omp_clause_code): Add OMP_CLAUSE_REDUCTION_PRIVATE_DECL
documentation.
* tree.cc (omp_clause_num_ops): Bump number of ops for
OMP_CLAUSE_REDUCTION to 6.
(walk_tree_1): Adjust accordingly.
* tree.h (OMP_CLAUSE_REDUCTION_PRIVATE_DECL): Add macro.
gcc/testsuite/ChangeLog
* gfortran.dg/goacc/modules.f95: Remove xfail on bogus warnings.
libgomp/ChangeLog
* testsuite/libgomp.oacc-fortran/optional-reduction.f90: Remove
xfail on bogus warnings.
* testsuite/libgomp.oacc-fortran/parallel-reduction.f90: Likewise.
* testsuite/libgomp.oacc-fortran/pr70643.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reduction-7.f90: Likewise.
* testsuite/libgomp.oacc-fortran/reference-reductions.f90: Likewise.
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Sandra Loosemore <sloosemore@baylibre.com>
Thomas Schwinge [Fri, 21 Jun 2019 17:40:38 +0000 (10:40 -0700)]
Add changes to profiling interface from OG8 branch
This bundles up the parts of the profiling code from the OG8 branch that were
not included in the upstream patch.
libgomp/ChangeLog
* Makefile.am (libgomp_la_SOURCES): Add
oacc-profiling-acc_register_library.c.
* Makefile.in: Regenerate.
* libgomp.texi: Remove paragraph about acc_register_library.
* oacc-init.c (get_property_any): Add profiling code.
* oacc-parallel.c (GOACC_parallel_keyed_internal): Set device_api for
profiling.
* oacc-profiling-acc_register_library.c: New file.
* oacc-profiling.c (goacc_profiling_initialize): Call
acc_register_library. Avoid duplicate registration.
(acc_register_library): Remove.
* config/nvptx/oacc-profiling-acc_register_library.c:
New empty file.
* config/nvptx/oacc-profiling.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-dispatch-1.c: Remove
call to acc_register_library.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-valid_bytes-1.c:
Likewise.
* testsuite/libgomp.oacc-c-c++-common/acc_prof-version-1.c: Likewise.
Co-Authored-By: Maciej W. Rozycki <macro@codesourcery.com>
Cesar Philippidis [Tue, 26 Feb 2019 23:59:03 +0000 (15:59 -0800)]
Enable firstprivate OpenACC reductions
gcc/ChangeLog
* gimplify.cc (omp_add_variable): Enable firstprivate reduction
variables.
gcc/testsuite/ChangeLog
* c-c++-common/goacc/reduction-10.c: New test.
libgomp/ChangeLog
* testsuite/libgomp.oacc-c-c++-common/privatize-reduction-1.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/privatize-reduction-2.c: New
test.
Co-Authored-By: Chung-Lin Tang <cltang@codesourcery.com>
Cesar Philippidis [Tue, 26 Feb 2019 23:55:23 +0000 (15:55 -0800)]
Don't mark OpenACC auto loops as independent inside acc parallel regions
gcc/ChangeLog
* omp-low.cc (lower_oacc_head_mark): Don't mark OpenACC auto
loops as independent inside acc parallel regions.
gcc/testsuite/ChangeLog
* c-c++-common/goacc/loop-auto-1.c: Adjust test case to conform to
the new behavior of the auto clause in OpenACC 2.5.
* c-c++-common/goacc/loop-auto-2.c: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* c-c++-common/goacc/loop-auto-3.c: New test.
* gfortran.dg/goacc/loop-auto-1.f90: New test.
libgomp/ChangeLog
* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Adjust test case
to conform to the new behavior of the auto clause in OpenACC 2.5.
Cesar Philippidis [Tue, 26 Feb 2019 23:10:21 +0000 (15:10 -0800)]
Enable GOMP_MAP_FIRSTPRIVATE_INT for OpenACC
Incorporates these commits from OG14 branch:
a743b0947593f38fd93fa1bc5f8dd3c50ba5c498
6f759e76f00d70bf1d44a25767c73ec2de855452
61d66e2e494655a34cca7289ae584d2644f1a65f
gcc/ChangeLog
* omp-low.cc (maybe_lookup_field_in_outer_ctx): New function.
(convert_to_firstprivate_int): New function.
(convert_from_firstprivate_int): New function.
(lower_omp_target): Enable GOMP_MAP_FIRSTPRIVATE_INT in OpenACC.
Remove unused variable.
libgomp/ChangeLog
* testsuite/libgomp.oacc-c++/firstprivate-int.C: New test.
* testsuite/libgomp.oacc-c-c++-common/firstprivate-int.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/data-firstprivate-1.c: XFAIL
execution test.
* testsuite/libgomp.oacc-fortran/firstprivate-int.f90: New test.
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Nathan Sidwell [Mon, 14 Apr 2025 16:38:55 +0000 (16:38 +0000)]
Default compute dimensions (compile time)
This patch was previously combined with an unrelated change and testcases
that were upstreamed separately ("Add '-Wopenacc-parallelism'",
22cff118f7526bec195ed6e41452980820fdf3a8).
gcc/ChangeLog
* doc/invoke.texi (fopenacc-dim): Document syntax for using
runtime value from environment variable.
* omp-offload.cc (oacc_parse_default_dims): Implement it.
libgomp/ChangeLog
* testsuite/libgomp.oacc-c-c++-common/loop-default-compile.c: New.
Co-Authored-By: Tom de Vries <tdevries@suse.de>
Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Cesar Philippidis [Tue, 26 Feb 2019 21:18:36 +0000 (13:18 -0800)]
Adjustments and additions to testcases
gcc/testsuite/ChangeLog
* g++.dg/goacc/loop-1.c: New test.
* g++.dg/goacc/loop-2.c: New test.
* g++.dg/goacc/loop-3.c: New test.
libgomp/ChangeLog
* testsuite/libgomp.oacc-fortran/data-3.f90: Update parallel
regions to denote variables copied in via acc enter data as
present.
* testsuite/libgomp.oacc-c-c++-common/subr.h: Reimplement.
* testsuite/libgomp.oacc-c-c++-common/subr.ptx: Regenerated PTX.
* testsuite/libgomp.oacc-c-c++-common/timer.h: Removed.
* testsuite/libgomp.oacc-c-c++-common/lib-69.c: Change async checks.
* testsuite/libgomp.oacc-c-c++-common/lib-70.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-72.c: Rework kernel i/f and
change async checks.
* testsuite/libgomp.oacc-c-c++-common/lib-73.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-74.c: Rework kernel i/f and
timing checks.
* testsuite/libgomp.oacc-c-c++-common/lib-75.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-76.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-78.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-81.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-93.c: New test.
Co-Authored-By: James Norris <jnorris@codesourcery.com>
Co-Authored-By: Tom de Vries <tom@codesourcery.com>
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Tobias Burnus <tburnus@baylibre.com>
Julian Brown [Sun, 13 Apr 2025 23:54:16 +0000 (23:54 +0000)]
Various OpenACC reduction enhancements - test cases
gcc/testsuite/ChangeLog
* c-c++-common/goacc/reduction-9.c: New.
* g++.dg/goacc/reductions-1.C: New.
* gcc.dg/goacc/loop-processing-1.c: Update.
libgomp/ChangeLog
* testsuite/libgomp.oacc-c-c++-common/par-reduction-3.c: New.
* testsuite/libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c: New.
* testsuite/libgomp.oacc-fortran/reduction-9.f90: New.
Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>
Co-Authored-By: Nathan Sidwell <nathan@acm.org>
Julian Brown [Sun, 13 Apr 2025 23:34:17 +0000 (23:34 +0000)]
Various OpenACC reduction enhancements - ME and nvptx changes
gcc/ChangeLog
* config/nvptx/nvptx.cc (nvptx_propagate_unified): New.
(nvptx_split_blocks): Call it for cond_uni insn.
(nvptx_expand_cond_uni): New.
(enum nvptx_builtins): Add NVPTX_BUILTIN_COND_UNI.
(nvptx_init_builtins): Initialize it.
(nvptx_expand_builtin): Handle NVPTX_BUILTIN_COND_UNI.
(nvptx_generate_vector_shuffle): Change integral SHIFT operand to
tree BITS operand.
(nvptx_vector_reduction): New.
(nvptx_adjust_reduction_type): New.
(nvptx_goacc_reduction_setup): Use it to adjust the type of ref_to_res.
(nvptx_goacc_reduction_init): Don't update LHS if it doesn't exist.
(nvptx_goacc_reduction_fini): Call nvptx_vector_reduction for vector.
Use it to adjust the type of ref_to_res.
(nvptx_goacc_reduction_teardown): Call nvptx_adjust_reduction_type.
* config/nvptx/nvptx.md (cond_uni): New pattern.
* gimplify.cc (gimplify_adjust_omp_clauses): Add DECL_P check
for OMP_CLAUSE_TASK_REDUCTION.
* omp-low.cc (lower_oacc_reductions): Handle
GOMP_MAP_FIRSTPRIVATE_POINTER.
* omp-offload.cc (default_goacc_reduction): Likewise.
Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>
Julian Brown [Sun, 13 Apr 2025 23:15:27 +0000 (23:15 +0000)]
Various OpenACC reduction enhancements - FE changes
gcc/c/ChangeLog
* c-parser.cc (c_parser_omp_variable_list): New c_omp_region_type
argument. Use it to specialize handling of OMP_CLAUSE_REDUCTION for
OpenACC.
(c_parser_omp_var_list_parens): Add region-type argument to call.
(c_parser_oacc_data_clause): Likewise.
(c_parser_oacc_data_clause_deviceptr): Likewise.
(c_parser_omp_clause_reduction): Change is_omp boolean parameter to
c_omp_region_type. Update call to c_parser_omp_variable_list.
(c_parser_omp_clause_map): Update call to
c_parser_omp_variable_list.
(c_parser_omp_clause_from_to): Likewise.
(c_parser_omp_clause_init): Likewise.
(c_parser_oacc_all_clauses): Update calls to
c_parser_omp_clause_reduction.
(c_parser_omp_all_clauses): Likewise.
(c_parser_oacc_cache): Update call to c_parser_omp_variable_list.
* c-typeck.cc (c_finish_omp_clauses): Emit an error on orphan OpenACC
gang reductions. Suppress user-defined reduction error for OpenACC.
gcc/cp/ChangeLog
* parser.cc (cp_parser_omp_var_list_no_open): New c_omp_region_type
argument. Use it to specialize handling of OMP_CLAUSE_REDUCTION for
OpenACC.
(cp_parser_omp_var_list): Add c_omp_region_type argument. Update call
to cp_parser_omp_var_list_no_open.
(cp_parser_oacc_data_clause): Update call to
cp_parser_omp_var_list_no_open.
(cp_parser_omp_clause_reduction): Change is_omp boolean parameter to
c_omp_region_type. Update call to cp_parser_omp_var_list_no_open.
(cp_parser_omp_clause_from_to): Update call to
cp_parser_omp_var_list_no_open.
(cp_parser_omp_clause_map): Likewise.
(cp_parser_omp_clause_init): Likewise.
(cp_parser_oacc_all_clauses): Update call to
cp_parser_omp_clause_reduction.
(cp_parser_omp_all_clauses): Likewise.
* semantics.cc (finish_omp_reduction_clause): Add c_omp_region_type
argument. Suppress user-defined reduction error for OpenACC.
(finish_omp_clauses): Update call to finish_omp_reduction_clause.
gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_omp_clause_copy_ctor): Permit reductions.
Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>
Co-Authored-By: Nathan Sidwell <nathan@acm.org>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Cesar Philippidis [Wed, 23 Apr 2025 03:01:28 +0000 (03:01 +0000)]
Add OpenACC Fortran support for deviceptr and variable in common blocks
This is a merge of these OG14 commits:
987bb4c25b4076eb54f043644bdf9988378be90d
9e8395708c0027ad1de871bae870c4b0185a74fd
2adb0ec35cd47b34d47c961f6ae46089e3e02cbc
4d29174a9602e6ea783ba0e9a7b1e38fb6913db5
gcc/fortran/ChangeLog
* openmp.cc (gfc_match_omp_map_clause): Re-write handling of the
deviceptr clause. Add new common_blocks argument. Propagate it to
gfc_match_omp_variable_list.
(gfc_match_omp_clauses): Update calls to gfc_match_omp_map_clause.
(resolve_positive_int_expr): Promote the warning to an error.
(check_array_not_assumed): Remove pointer check.
(resolve_oacc_nested_loops): Error on do concurrent loops.
* trans-openmp.cc (gfc_omp_finish_clause): Don't create pointer data
mappings for deviceptr clauses.
(gfc_trans_omp_clauses): Likewise.
gcc/ChangeLog
* gimplify.cc (enum gimplify_omp_var_data): Add GOVD_DEVICEPTR.
(oacc_default_clause): Privatize Fortran common blocks.
(omp_notice_variable): Add GOVD_DEVICEPTR attribute when appropriate.
Defer the expansion of DECL_VALUE_EXPR for common block decls.
(gimplify_scan_omp_clauses): Add GOVD_DEVICEPTR attribute when
appropriate.
(gimplify_adjust_omp_clauses_1): Set GOMP_MAP_FORCE_DEVICEPTR for
implicit deviceptr mappings.
gcc/testsuite/ChangeLog
* c-c++-common/goacc/deviceptr-4.c: Update.
* gfortran.dg/goacc/loop-2-kernels-tile.f95: Update.
* gfortran.dg/goacc/loop-2-parallel-tile.f95: Update.
* gfortran.dg/goacc/loop-2-serial-tile.f95: Update.
* gfortran.dg/goacc/sie.f95: Update.
* gfortran.dg/goacc/tile-1.f90: Update.
* gfortran.dg/gomp/num-teams-2.f90: Update.
* gfortran.dg/gomp/pr67500.f90: Update.
* gfortran.dg/gomp/pr77516.f90: Update.
libgomp/ChangeLog
* oacc-parallel.c (GOACC_parallel_keyed): Handle Fortran deviceptr
clause.
(GOACC_data_start): Likewise.
* testsuite/libgomp.oacc-fortran/deviceptr-1.f90: New test.
Co-Authored-By: James Norris <jnorris@codesourcery.com>
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>
Co-Authored-By: Thomas Schwinge <tschwinge@baylibre.com>
Thomas Schwinge [Sat, 12 Apr 2025 17:58:05 +0000 (17:58 +0000)]
OpenACC: Pass pre-allocated 'ptrblock' to 'goacc_noncontig_array_create_ptrblock' [PR76739]
... to simplify later changes. No functional change.
Follow-up for og12 commit
15d0f61a7fecdc8fd12857c40879ea3730f6d99f
"Merge non-contiguous array support patches".
libgomp/ChangeLog
PR other/76739
* target.c (gomp_map_vars_internal): Pass pre-allocated 'ptrblock'
to 'goacc_noncontig_array_create_ptrblock'.
* oacc-parallel.c (goacc_noncontig_array_create_ptrblock): Adjust.
* oacc-int.h (goacc_noncontig_array_create_ptrblock): Adjust.
Thomas Schwinge [Sat, 12 Apr 2025 17:54:45 +0000 (17:54 +0000)]
Given OpenACC 'async', defer 'free' of non-contiguous array support data structures [PR76739]
Fix-up for og12 commit
15d0f61a7fecdc8fd12857c40879ea3730f6d99f
"Merge non-contiguous array support patches".
libgomp/ChangeLog
PR other/76739
* oacc-parallel.c (GOACC_parallel_keyed): Given OpenACC 'async',
defer 'free' of non-contiguous array support data structures.
* target.c (gomp_map_vars_internal): Likewise.
Thomas Schwinge [Sat, 12 Apr 2025 17:52:13 +0000 (17:52 +0000)]
libgomp: Merge 'gomp_map_vars_openacc' into 'goacc_map_vars' [PR76739]
Upstream has 'goacc_map_vars'; merge the new 'gomp_map_vars_openacc' into it.
(Maybe the latter didn't exist yet when the former was originally added?)
No functional change.
Clean-up for og12 commit
15d0f61a7fecdc8fd12857c40879ea3730f6d99f
"Merge non-contiguous array support patches".
libgomp/ChangeLog
PR other/76739
* libgomp.h (goacc_map_vars): Add 'struct goacc_ncarray_info *'
formal parameter.
(gomp_map_vars_openacc): Remove.
* target.c (goacc_map_vars): Adjust.
(gomp_map_vars_openacc): Remove.
* oacc-mem.c (acc_map_data, goacc_enter_datum)
(goacc_enter_data_internal): Adjust.
* oacc-parallel.c (GOACC_parallel_keyed, GOACC_data_start):
Adjust.
Chung-Lin Tang [Sat, 12 Apr 2025 16:22:37 +0000 (16:22 +0000)]
Non-contiguous array support patches [PR76739]
This is based on OG14 commit
b143c1c447945ce05903ff1360ead97774dfce4b,
which was based on v4, posted upstream here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543437.html
It also incorporates a number of follow-up bug and bit-rot fixes, OG14
commits
e11726d3467543de45448097dde27ba34bf04bfe
87ea4de1c4a360d5d62357491a41811213f4528c
151fc161d0ed640048444ca18f9325e3d2e03e99
628a000bdbf63252c2ede13ccab8e99a19769866
11263c048d39ab1d6a11067b18674bf8307bbbf5
8c1068bbe3e52529bede5466a43af8d98f38dac2
gcc/c/ChangeLog
PR other/76739
* c-typeck.cc (handle_omp_array_sections_1): Add 'bool &non_contiguous'
parameter, adjust recursive call site, add cases for allowing
pointer-based multi-dimensional arrays for OpenACC. Reject
non-DECL base-pointer cases as unsupported.
(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
handle non-contiguous case to create dynamic array map.
gcc/cp/ChangeLog
PR other/76739
* semantics.cc (handle_omp_array_sections_1): Add 'bool &non_contiguous'
parameter, adjust recursive call site, add cases for allowing
pointer-based multi-dimensional arrays for OpenACC. Reject
non-DECL base-pointer cases as unsupported.
(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
handle non-contiguous case to create dynamic array map.
gcc/fortran/ChangeLog
PR other/76739
* f95-lang.cc (DEF_FUNCTION_TYPE_VAR_5): New symbol.
* types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
gcc/ChangeLog
PR other/76739
* builtin-types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
* gimplify.cc (omp_group_base): Handle GOMP_MAP_NONCONTIG_ARRAY_*.
(gimplify_scan_omp_clauses): Handle OMP_TARGET_UPDATE.
(gimplify_adjust_omp_clauses): Skip gimplification of
OMP_CLAUSE_SIZE of non-contiguous array maps (which is a TREE_LIST).
* omp-builtins.def (BUILT_IN_GOACC_DATA_START): Adjust function type
to new BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR.
* omp-expand.cc (expand_omp_target): Add non-contiguous array
descriptor pointers to variadic arguments.
* omp-low.cc (append_field_to_record_type): New function.
(create_noncontig_array_descr_type): Likewise.
(create_noncontig_array_descr_init_code): Likewise.
(scan_sharing_clauses): For non-contiguous array map kinds, check for
supported dimension structure, and install non-contiguous array
variable into current omp_context.
(reorder_noncontig_array_clauses): New function.
(scan_omp_target): Call reorder_noncontig_array_clauses to place
non-contiguous array map clauses at beginning of clause sequence.
(lower_omp_target): Add handling for non-contiguous array map kinds,
add all created non-contiguous array descriptors to
gimple_omp_target_data_arg.
* tree-pretty-print.cc (dump_omp_clause): Handle
GOMP_MAP_NONCONTIG_ARRAY_*.
gcc/testsuite/ChangeLog
PR other/76739
* c-c++-common/goacc/data-clause-1.c (foo): Remove expected message.
* c-c++-common/goacc/noncontig_array-1.c: New test.
* g++.dg/goacc/data-clause-1.C (foo): Remove expected message.
include/ChangeLog
PR other/76739
* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
(enum gomp_map_kind): Add GOMP_MAP_NONCONTIG_ARRAY,
GOMP_MAP_NONCONTIG_ARRAY_TO, GOMP_MAP_NONCONTIG_ARRAY_FROM,
GOMP_MAP_NONCONTIG_ARRAY_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO,
GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM,
GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM,
GOMP_MAP_NONCONTIG_ARRAY_ALLOC, GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC,
GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT.
(GOMP_MAP_NONCONTIG_ARRAY_P): Define.
libgomp/ChangeLog
PR other/76739
* libgomp.h (gomp_map_vars_openacc): New function declaration.
* libgomp_g.h (GOACC_data_start): Add variadic '...' to declaration.
* oacc-int.h (struct goacc_ncarray_dim): New struct declaration.
(struct goacc_ncarray_descr_type): Likewise.
(struct goacc_ncarray): Likewise.
(struct goacc_ncarray_info): Likewise.
(goacc_noncontig_array_create_ptrblock): New function declaration.
* oacc-parallel.c (goacc_noncontig_array_count_rows): New function.
(goacc_noncontig_array_compute_sizes): Likewise.
(goacc_noncontig_array_fill_rows_1): Likewise.
(goacc_noncontig_array_fill_rows): Likewise.
(goacc_process_noncontiguous_arrays): Likewise.
(goacc_noncontig_array_create_ptrblock): Likewise.
(GOACC_parallel_keyed): Use goacc_process_noncontiguous_arrays to
handle non-contiguous array descriptors at end of varargs, adjust
to use gomp_map_vars_openacc.
(GOACC_data_start): Likewise. Adjust function type to accept varargs.
* target.c (gomp_map_vars_internal): Add struct goacc_ncarray_info *
nca_info parameter, add handling code for non-contiguous arrays.
(gomp_map_vars_openacc): Add new function for specialization of
gomp_map_vars_internal for OpenACC structured region usage.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test.
* testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: Support
header for new tests.
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Paul-Antoine Arras <parras@baylibre.com>
Sandra Loosemore [Sat, 12 Apr 2025 15:43:34 +0000 (15:43 +0000)]
Identify OMP development branch in output of 'gcc --version'
gcc/ChangeLog
* Makefile.in (REVISION_s): Change default message.
Tobias Burnus [Thu, 15 May 2025 00:34:05 +0000 (00:34 +0000)]
git_update_version.py: Support vendor-branch version bumps
contrib/ChangeLog:
* gcc-changelog/git_repository.py (parse_git_revisions): Optional
Add optional exclude_branch_name argument.
* gcc-changelog/git_update_version.py: Add --suffix, --exclude-branch
and --last-commit to handle vendor branches.
Joseph Myers [Mon, 12 May 2025 17:56:18 +0000 (17:56 +0000)]
Update cpplib es.po
* es.po: Update.