Skip to content

Prevent ABI changes affect EnzymeAD #142544

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

Sa4dUs
Copy link
Contributor

@Sa4dUs Sa4dUs commented Jun 15, 2025

This PR handles ABI changes for autodiff input arguments to improve Enzyme compatibility. A couple of cases (like statics and small arrays) are still not handled.

r? @ZuseZ4

@rustbot rustbot added A-attributes Area: Attributes (`#[…]`, `#![…]`) F-autodiff `#![feature(autodiff)]` T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jun 15, 2025
@rust-log-analyzer

This comment has been minimized.

Comment on lines 139 to 142
/// `#[rustc_autodiff_no_abi_opt]`: internal marker applied to `#[rustc_autodiff]` primal functions
/// whose argument layout may be sensitive to ABI-level optimizations. This marker prevents certain
/// optimizations that could otherwise break compatibility with Enzyme's expectations.
const RUSTC_AUTODIFF_NO_ABI_OPT = 1 << 16;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't say "certain optimizations", or the next person who comes along is going to make it so that these functions are treated as -O0. Identify the actual problem: LLVM will modify the ABI of functions if it can identify them as fully internalized.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A change that bad should be caught by a reviewer, but I review a lot of PRs and I enjoy it when the codebase informs people of what is actually going on so they are more likely to have made changes that are consistent with the existing situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thanks for the feedback. I’ll make sure to keep comments as specific as possible to avoid any ambiguity going forward.

Comment on lines 924 to 958
fn is_abi_opt_sensitive<'tcx>(tcx: TyCtxt<'tcx>, ty: Ty<'tcx>) -> bool {
match ty.kind() {
ty::Ref(_, inner, _) | ty::RawPtr(inner, _) => {
match inner.kind() {
ty::Slice(_) => {
// Since we cannot guarantee that the slice length is large enough
// to avoid optimization, we assume it is ABI-opt sensitive.
return true;
}
ty::Array(elem_ty, len) => {
let Some(len_val) = len.try_to_target_usize(tcx) else {
return false;
};

let pci = PseudoCanonicalInput {
typing_env: TypingEnv::fully_monomorphized(),
value: *elem_ty,
};

if elem_ty.is_scalar() {
let elem_size =
tcx.layout_of(pci).ok().map(|layout| layout.size).unwrap_or(Size::ZERO);

if elem_size.bytes() * len_val <= tcx.data_layout.pointer_size.bytes() * 2 {
return true;
}
}
}
_ => {}
}

false
}
ty::FnPtr(_, _) => true,
_ => false,
Copy link
Member

@workingjubilee workingjubilee Jun 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only matter when Enzyme is on, right?

Why are you ignoring ty::Array when it is not through ty::Ref? Does Enzyme not even deal in simple aggregates like that? I'm not even sure this is the correct layer to be examining things like this at, since it's well above the LLVM IR type layer. Multiple types in Rust source can wind up being lowered to the naive equivalent of this in the IR.

Anyway, please name this fn as specific to Enzyme, at least.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this logic is only for Enzyme. I didn't add the ty::Array logic yet because I'm still not sure on how to handle it as, for example, [f32; 2] is lowered to i64. As the number of args does not change, Enzyme may not have issues with that.

Copy link
Member

@workingjubilee workingjubilee Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I have determined this is blocked on me to have a proper solution for it. I would like it if you opened an issue before this PR lands that points to https://github.com/rust-lang/rust/blame/86d0aef80403f095d8bbabf44d9fdecfcd45f076/compiler/rustc_target/src/callconv/mod.rs#L708 and your new code here, and says that a new variant of the adjust_for_rust_abi code that doesn't just mutate the arguments needs to exist so that this query can be answered without relying on mutable state that cannot be invoked idempotently.

@@ -175,6 +179,7 @@ impl CodegenFnAttrs {
self.flags.contains(CodegenFnAttrFlags::NO_MANGLE)
|| self.flags.contains(CodegenFnAttrFlags::RUSTC_STD_INTERNAL_SYMBOL)
|| self.export_name.is_some()
|| self.flags.contains(CodegenFnAttrFlags::RUSTC_AUTODIFF_NO_ABI_OPT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also suppress the dead_code lint. I think the reason the symbol is still getting marked as dso_local even with this change is because it has SymbolExportLevel::Rust. Only for SymbolExportLevel::C do we tell LTO to export the symbol.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I'll look into that and try to ensure it has the minimal number of side effects possible. Thank you :)

@bors
Copy link
Collaborator

bors commented Jun 20, 2025

☔ The latest upstream changes (presumably #142770) made this pull request unmergeable. Please resolve the merge conflicts.

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jun 20, 2025
@rust-cloud-vms rust-cloud-vms bot force-pushed the prevent-abi-changes branch from 1035486 to e243a3c Compare June 20, 2025 18:30
@rust-log-analyzer

This comment has been minimized.

@ZuseZ4 ZuseZ4 marked this pull request as ready for review June 24, 2025 18:48
@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 24, 2025
@rustbot
Copy link
Collaborator

rustbot commented Jun 24, 2025

Some changes occurred in compiler/rustc_monomorphize/src/partitioning/autodiff.rs

cc @ZuseZ4

@rustbot

This comment has been minimized.

// debug-NEXT: %_2 = load float, ptr %0, align 4, !alias.scope !7, !noalias !4
// debug-NEXT: %"'ipg2" = getelementptr inbounds float, ptr %"x'", i64 1
// debug-NEXT: %1 = getelementptr inbounds nuw float, ptr %x, i64 1
// debug-NEXT: %"_5'ipl" = load float, ptr %"'ipg2", align 4, !alias.scope !4, !noalias !7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove , !alias.scope !4, !noalias !7 and similar scope and metadata annotations. align can stay.

They are fragile, numbers might change and we want to avoid test failures because of it.

//@ no-prefer-dynamic
//@ needs-enzyme

// This does only test the funtion attribute handling for autodiff.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't about function attributes, or? It's more about verifying that Rust types are lowered to LLVM-IR types in a way that we expect and which enzyme can handle. We also explicitely check release mode, to verify that LLVM's O3 pipeline does not rewrite function signatures into something that Enzyme can not handle anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was initially and forgot to remove that. I'll fix it and add a more detailed comment about what's this test for.

.non_enum_variant()
.fields
.iter()
.map(|f| count_scalar_fields(tcx, f.ty(tcx, substs)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you really want to recursively count and sum?

I think that anything behind a double indirection probably won't affect the size on the function abi, or?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After testing a bit, i think the recursive sumation (or any other way of counting the "non-splittable" fields) is necessary becase if the aggregate has more than 2 fields when flattened, it's behaving slightly different, even when under the pointer size. I'll adjust it to consider this cases.


let is_product = |t: Ty<'tcx>| matches!(t.kind(), ty::Tuple(_) | ty::Adt(_, _));

if layout.size() <= pointer_size * 2 && is_product(*ty) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a note that this is the magic number based on which LLVM might optimize?

@ZuseZ4
Copy link
Member

ZuseZ4 commented Jun 24, 2025

I am not 100% sure about the recursive summation here. In general also, @oli-obk can you review this, as I just don't know if PseudoCanonicalInput and fully_monomorphize is something we want here.

@oli-obk oli-obk self-assigned this Jun 25, 2025
Sa4dUs added 4 commits June 26, 2025 13:39
Note(Sa4dUs): As LLVM-IR opt passes are executed after passing LLVM to Enzyme, most of the cases have turned out to not be problematic. Anyways, we still test them to prevent any kind of regression.
Update `count_scalar_fields`->`count_leaf_fields` to support more types
Add extra activities only if `count_scalar_fields` is leq 2
Logic can be optimized if needed
Removed metadata specific fields from test to avoid future fails.
@rust-cloud-vms rust-cloud-vms bot force-pushed the prevent-abi-changes branch from 07b10dd to 39d1efc Compare June 26, 2025 15:00
@Sa4dUs
Copy link
Contributor Author

Sa4dUs commented Jun 26, 2025

Once the solution is decent, I can optimize minor things. I leave it for the end to not optimize on something that is not correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-attributes Area: Attributes (`#[…]`, `#![…]`) F-autodiff `#![feature(autodiff)]` S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
8 participants