Prevent ABI changes affect EnzymeAD #142544

Sa4dUs · 2025-06-15T16:12:03Z

This PR handles ABI changes for autodiff input arguments to improve Enzyme compatibility. A couple of cases (like statics and small arrays) are still not handled.

r? @ZuseZ4

workingjubilee · 2025-06-15T16:20:09Z

compiler/rustc_middle/src/middle/codegen_fn_attrs.rs

+        /// `#[rustc_autodiff_no_abi_opt]`: internal marker applied to `#[rustc_autodiff]` primal functions
+        /// whose argument layout may be sensitive to ABI-level optimizations. This marker prevents certain
+        /// optimizations that could otherwise break compatibility with Enzyme's expectations.
+        const RUSTC_AUTODIFF_NO_ABI_OPT = 1 << 16;


Don't say "certain optimizations", or the next person who comes along is going to make it so that these functions are treated as -O0. Identify the actual problem: LLVM will modify the ABI of functions if it can identify them as fully internalized.

A change that bad should be caught by a reviewer, but I review a lot of PRs and I enjoy it when the codebase informs people of what is actually going on so they are more likely to have made changes that are consistent with the existing situation.

Got it, thanks for the feedback. I’ll make sure to keep comments as specific as possible to avoid any ambiguity going forward.

workingjubilee · 2025-06-15T16:29:37Z

compiler/rustc_codegen_ssa/src/codegen_attrs.rs

+fn is_abi_opt_sensitive<'tcx>(tcx: TyCtxt<'tcx>, ty: Ty<'tcx>) -> bool {
+    match ty.kind() {
+        ty::Ref(_, inner, _) | ty::RawPtr(inner, _) => {
+            match inner.kind() {
+                ty::Slice(_) => {
+                    // Since we cannot guarantee that the slice length is large enough
+                    // to avoid optimization, we assume it is ABI-opt sensitive.
+                    return true;
+                }
+                ty::Array(elem_ty, len) => {
+                    let Some(len_val) = len.try_to_target_usize(tcx) else {
+                        return false;
+                    };
+
+                    let pci = PseudoCanonicalInput {
+                        typing_env: TypingEnv::fully_monomorphized(),
+                        value: *elem_ty,
+                    };
+
+                    if elem_ty.is_scalar() {
+                        let elem_size =
+                            tcx.layout_of(pci).ok().map(|layout| layout.size).unwrap_or(Size::ZERO);
+
+                        if elem_size.bytes() * len_val <= tcx.data_layout.pointer_size.bytes() * 2 {
+                            return true;
+                        }
+                    }
+                }
+                _ => {}
+            }
+
+            false
+        }
+        ty::FnPtr(_, _) => true,
+        _ => false,


This should only matter when Enzyme is on, right?

Why are you ignoring ty::Array when it is not through ty::Ref? Does Enzyme not even deal in simple aggregates like that? I'm not even sure this is the correct layer to be examining things like this at, since it's well above the LLVM IR type layer. Multiple types in Rust source can wind up being lowered to the naive equivalent of this in the IR.

Anyway, please name this fn as specific to Enzyme, at least.

Yes, this logic is only for Enzyme. I haven't added the ty::Array logic yet because I'm still not sure how to handle it: for example, [f32; 2] is lowered to i64. Since the number of args does not change, Enzyme may not have issues with that.

Okay, I have determined this is blocked on me to have a proper solution for it. I would like it if you opened an issue before this PR lands that points to https://github.com/rust-lang/rust/blame/86d0aef80403f095d8bbabf44d9fdecfcd45f076/compiler/rustc_target/src/callconv/mod.rs#L708 and your new code here, and says that a new variant of the adjust_for_rust_abi code that doesn't just mutate the arguments needs to exist so that this query can be answered without relying on mutable state that cannot be invoked idempotently.
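The size comparison in the diff above can be looked at in isolation. What follows is only a minimal sketch, not the compiler's code: `fits_in_two_pointers` and its plain-integer parameters are hypothetical stand-ins for the `layout_of` and `data_layout.pointer_size` queries, and all sizes are assumed to be in bytes.

```rust
/// Hypothetical helper (not part of rustc): returns true when an array of
/// `len` elements, each `elem_size` bytes wide, occupies at most two pointer
/// widths. This mirrors the `elem_size * len <= pointer_size * 2` comparison
/// from the diff.
fn fits_in_two_pointers(elem_size: u64, len: u64, pointer_size: u64) -> bool {
    // Checked arithmetic so a huge array length cannot overflow the product.
    match (elem_size.checked_mul(len), pointer_size.checked_mul(2)) {
        (Some(total), Some(limit)) => total <= limit,
        // An overflowing size is certainly not "small".
        _ => false,
    }
}
```

With an 8-byte pointer, `fits_in_two_pointers(4, 2, 8)` is true: `[f32; 2]` is 8 bytes, which fits the observation that it is lowered to a single `i64`. `fits_in_two_pointers(4, 5, 8)` is false, since 20 bytes exceeds the 16-byte limit.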

bjorn3 · 2025-06-15T16:41:04Z

compiler/rustc_middle/src/middle/codegen_fn_attrs.rs

@@ -175,6 +179,7 @@ impl CodegenFnAttrs {
        self.flags.contains(CodegenFnAttrFlags::NO_MANGLE)
            || self.flags.contains(CodegenFnAttrFlags::RUSTC_STD_INTERNAL_SYMBOL)
            || self.export_name.is_some()
+            || self.flags.contains(CodegenFnAttrFlags::RUSTC_AUTODIFF_NO_ABI_OPT)


This will also suppress the dead_code lint. I think the reason the symbol is still getting marked as dso_local even with this change is because it has SymbolExportLevel::Rust. Only for SymbolExportLevel::C do we tell LTO to export the symbol.

Okay, I'll look into that and try to ensure it has the minimal number of side effects possible. Thank you :)

bors · 2025-06-20T13:56:01Z

☔ The latest upstream changes (presumably #142770) made this pull request unmergeable. Please resolve the merge conflicts.

rustbot · 2025-06-24T18:48:08Z

Some changes occurred in compiler/rustc_monomorphize/src/partitioning/autodiff.rs

cc @ZuseZ4

ZuseZ4 · 2025-06-24T21:12:38Z

tests/codegen/autodiff/abi_handling.rs

+// debug-NEXT: %_2 = load float, ptr %0, align 4, !alias.scope !7, !noalias !4
+// debug-NEXT: %"'ipg2" = getelementptr inbounds float, ptr %"x'", i64 1
+// debug-NEXT: %1 = getelementptr inbounds nuw float, ptr %x, i64 1
+// debug-NEXT: %"_5'ipl" = load float, ptr %"'ipg2", align 4, !alias.scope !4, !noalias !7


remove , !alias.scope !4, !noalias !7 and similar scope and metadata annotations. align can stay.

They are fragile, numbers might change and we want to avoid test failures because of it.
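To make the request concrete, here is a small sketch of what trimming those annotations from a check line amounts to. `strip_metadata` is a hypothetical helper, not something in the test suite, and it assumes metadata always starts at the first `, !` on the line, which holds for the lines quoted above but is not a general IR parser.

```rust
/// Hypothetical helper: cut a FileCheck line at the first `, !`, which drops
/// trailing metadata such as `!alias.scope !7` while keeping `align 4`.
fn strip_metadata(line: &str) -> &str {
    match line.find(", !") {
        Some(idx) => &line[..idx],
        // No metadata annotations: leave the line untouched.
        None => line,
    }
}
```

Applied to the first quoted line, this yields `// debug-NEXT: %_2 = load float, ptr %0, align 4`. The `getelementptr` lines carry no metadata and come back unchanged.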

ZuseZ4 · 2025-06-24T21:17:18Z

tests/codegen/autodiff/abi_handling.rs

+//@ no-prefer-dynamic
+//@ needs-enzyme
+
+// This does only test the funtion attribute handling for autodiff.


This isn't about function attributes, is it? It's more about verifying that Rust types are lowered to LLVM-IR types in the way we expect and in a way Enzyme can handle. We also explicitly check release mode, to verify that LLVM's O3 pipeline does not rewrite function signatures into something that Enzyme cannot handle anymore.

It was initially, and I forgot to remove that. I'll fix it and add a more detailed comment about what this test is for.

ZuseZ4 · 2025-06-24T21:31:37Z

compiler/rustc_monomorphize/src/partitioning/autodiff.rs

+            .non_enum_variant()
+            .fields
+            .iter()
+            .map(|f| count_scalar_fields(tcx, f.ty(tcx, substs)))


Do you really want to recursively count and sum?

I think that anything behind a double indirection probably won't affect the size in the function ABI, will it?

After testing a bit, I think the recursive summation (or some other way of counting the "non-splittable" fields) is necessary, because if the aggregate has more than 2 fields when flattened, it behaves slightly differently, even when it is under the pointer size. I'll adjust it to handle these cases.
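The recursive counting discussed here can be sketched on a toy type model. This is only an illustration under stated assumptions: `ToyTy` is hypothetical and is not rustc's `Ty`, the function below is not the PR's `count_scalar_fields`, and counting a pointer as a single leaf without looking through it is an assumption that follows the guess about indirection above.

```rust
/// Toy type model, NOT rustc's `Ty`. All names here are hypothetical.
enum ToyTy {
    /// A scalar such as `f32` or `i64`.
    Scalar,
    /// A reference or raw pointer; counted as one leaf, pointee ignored.
    Pointer,
    /// A tuple or struct with the given field types.
    Aggregate(Vec<ToyTy>),
}

/// Count the leaf (non-splittable) fields of a type after flattening
/// nested aggregates.
fn count_leaf_fields(ty: &ToyTy) -> usize {
    match ty {
        ToyTy::Scalar | ToyTy::Pointer => 1,
        ToyTy::Aggregate(fields) => fields.iter().map(count_leaf_fields).sum(),
    }
}
```

In this model a struct holding an `f32` and an `(f32, f32)` tuple flattens to three leaves, so a check on the top-level field count alone would see two fields and miss the case described above.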

ZuseZ4 · 2025-06-24T21:32:31Z

compiler/rustc_monomorphize/src/partitioning/autodiff.rs

+
+        let is_product = |t: Ty<'tcx>| matches!(t.kind(), ty::Tuple(_) | ty::Adt(_, _));
+
+        if layout.size() <= pointer_size * 2 && is_product(*ty) {


Can you add a note that this is the magic number based on which LLVM might optimize?

ZuseZ4 · 2025-06-24T21:36:09Z

I am not 100% sure about the recursive summation here. More generally, @oli-obk, can you review this? I just don't know if PseudoCanonicalInput and fully_monomorphized are something we want here.

Sa4dUs · 2025-06-26T15:14:46Z

Once the solution is decent, I can optimize the minor things. I'm leaving that for the end so that I don't optimize something that is not yet correct.

rustbot assigned ZuseZ4 Jun 15, 2025

rustbot added A-attributes Area: Attributes (`#[…]`, `#![…]`) F-autodiff `#![feature(autodiff)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 15, 2025

workingjubilee reviewed Jun 15, 2025

bjorn3 reviewed Jun 15, 2025

bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 20, 2025

rust-cloud-vms bot force-pushed the prevent-abi-changes branch from 1035486 to e243a3c Compare June 20, 2025 18:30

ZuseZ4 marked this pull request as ready for review June 24, 2025 18:48

rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 24, 2025

ZuseZ4 reviewed Jun 24, 2025

oli-obk self-assigned this Jun 25, 2025

Sa4dUs added 4 commits June 26, 2025 13:39

Adjust autodiff activities for abi transformations (small tuples and structs)

727b8b5

Add codegen tests

a96d085

Note(Sa4dUs): As LLVM-IR opt passes are executed after passing LLVM to Enzyme, most of the cases have turned out to not be problematic. Anyways, we still test them to prevent any kind of regression.

Split tests for tidy checks

56dbc6a

Update activity adjustment logic

39d1efc

Update `count_scalar_fields`->`count_leaf_fields` to support more types
Add extra activities only if `count_scalar_fields` is leq 2
Logic can be optimized if needed
Removed metadata specific fields from test to avoid future fails.

rust-cloud-vms bot force-pushed the prevent-abi-changes branch from 07b10dd to 39d1efc Compare June 26, 2025 15:00

Update test header comment

d25ed22
