-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Allow custom default address spaces and parse p-
specifications in the datalayout string
#143182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri These commits modify compiler targets. Some changes occurred in compiler/rustc_ast_lowering/src/format.rs cc @m-ou-se Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri, @RalfJung, @oli-obk, @lcnr Some changes occurred in cfg and check-cfg configuration cc @Urgau Some changes occurred to the CTFE machinery Some changes occurred in compiler/rustc_codegen_ssa |
@bors2 try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
Allow custom default address spaces and parse `p-` specifications in the datalayout string Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout). This patch tries to mitigate these problems by allowing target to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space: the potential performance drawbacks of not having ad-hoc fields will be tested in this PR's CI run. r? workingjubilee
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (a6d8fc2): comparison URL. Overall result: ❌ regressions - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -2.6%, secondary -8.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (secondary 2.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 695.013s -> 695.668s (0.09%) |
Some changes occurred in compiler/rustc_codegen_gcc rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead. cc @rust-lang/rust-analyzer Some changes occurred in compiler/rustc_codegen_cranelift cc @bjorn3 Some changes occurred in src/tools/clippy cc @rust-lang/clippy |
@bors2 try |
Allow custom default address spaces and parse `p-` specifications in the datalayout string Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout). This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications. The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run. r? workingjubilee
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
@rust-timer build 92687cf |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (92687cf): comparison URL. Overall result: ❌ regressions - please read the text belowBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)Results (primary -2.3%, secondary 2.1%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -1.1%, secondary 0.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeResults (secondary 0.0%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 694.715s -> 695.8s (0.16%) |
defc7a8
to
3f8b20b
Compare
@bors try |
This comment has been minimized.
This comment has been minimized.
Allow custom default address spaces and parse `p-` specifications in the datalayout string Some targets, such as CHERI, use as default an address space different from the "normal" default address space `0` (in the case of CHERI, [200 is used](https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-877.pdf)). Currently, `rustc` does not allow to specify custom address spaces and does not take into consideration [`p-` specifications in the datalayout string](https://llvm.org/docs/LangRef.html#langref-datalayout). This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space `0`) and adding the code to parse the `p-` specifications in `rustc_abi`. The main changes are that `TargetDataLayout` now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the two `pointer_size` and `pointer_align` fields in `TargetDataLayout` are replaced with an `FxHashMap` that holds info for all the possible address spaces, as parsed by the `p-` specifications. The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run. r? workingjubilee
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (d439e37): comparison URL. Overall result: ✅ improvements - no action neededBenchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf. @bors rollup=never Instruction countOur most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.
Max RSS (memory usage)This benchmark run did not return any relevant results for this metric. CyclesResults (primary 0.2%, secondary 0.4%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis benchmark run did not return any relevant results for this metric. Bootstrap: 461.796s -> 461.853s (0.01%) |
This comment has been minimized.
This comment has been minimized.
dl.pointer_size = parse_size(s, p)?; | ||
dl.pointer_align = parse_align(a, p)?; | ||
[p, s, a @ ..] if p.starts_with("p") => { | ||
// Some targets, such as CHERI, use the 'f' suffix in the p- spec to signal that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am planning to add more flags upstream for non-integral pointer properties so ignoring any letter for now is great for future proofing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally try to reduce future-proofing in this code: deviation from the LLVM datalayout string will cause rustc to error, quite deliberately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're making rustc correct with respect to address spaces, then if you add a character that would be relevant for how the platform is handled by codegen, why should we ignore it? And if we don't know what it means, why should we assume it is irrelevant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, just ignoring things we don't know about is generally bad since we don't know what we are ignoring, so there's a high risk we are ignoring something important.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change it so that only p
and pf
are stripped. Do you think that would make sense? Once more flags are added we can take them into account accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that seems fine. I wonder if there's something we should do in response to "pf" but nothing immediately occurs to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In CHERI targets the f
is used to signal that the pointer is non integral (or f
at). Upstream LLVM, in practice, just ignores it and sets to false
the according field in the pointer spec.
Another possibility would be to wait for f
to be taken into account at all until a target that uses it is introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair and it probably makes sense to wait until such uses exist. I am planning to introduce 'e', 'n', 'u' flags which are basically finger-grained properties for the current ni:<addrspace>
(llvm/llvm-project#105735), but to allow building rustc against the CHERI LLVM fork, ignoring 'f' would be quite nice since it doesn't affect the rest of rustc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ccc2083 removes the stripping of f
. Instead, upon seeing a pX
specification, an error that tells the user that the given pointer specification is unknown is returned.
compiler/rustc_abi/src/lib.rs
Outdated
} | ||
} | ||
} | ||
[p, s, _pr, i, a @ ..] if p.starts_with("p") => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[p, s, _pr, i, a @ ..] if p.starts_with("p") => { | |
[p, s, _pr, a, idc @ ..] if p.starts_with("p") => { |
Index size is the last component?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I missed that. Do you think it should it be, then,
[p, s, a , _pr, i @ ..] if p.starts_with("p") => {
instead?
For reference, the relevant bit from LLVM's datalayout string spec:
p[n]:<size>:<abi>[:<pref>[:<idx>]]
I can also add the pref
field to AddressSpaceInfo
, which perhaps would make even more sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
commit ccc2083 adds the parsing of :<pref>
with functions to retrieve that value: I set it to default to <abi>
when not specified.
I also had to slightly change the utility parse_align
function, which took as parameter a &[&str]
, whose semantics were actually to ignore the potential :<pref>:<more...>
following <abi>
.
@@ -1104,7 +1287,7 @@ impl Primitive { | |||
// FIXME(erikdesjardins): ignoring address space is technically wrong, pointers in | |||
// different address spaces can have different sizes | |||
// (but TargetDataLayout doesn't currently parse that part of the DL string) | |||
Pointer(_) => dl.pointer_size, | |||
Pointer(a) => dl.pointer_size_in(a), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this Todo can be removed now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@@ -1118,7 +1301,7 @@ impl Primitive { | |||
// FIXME(erikdesjardins): ignoring address space is technically wrong, pointers in | |||
// different address spaces can have different alignments | |||
// (but TargetDataLayout doesn't currently parse that part of the DL string) | |||
Pointer(_) => dl.pointer_align, | |||
Pointer(a) => dl.pointer_align_in(a), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove Todo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
@@ -298,6 +323,7 @@ impl TargetDataLayout { | |||
/// determined from llvm string. | |||
pub fn parse_from_llvm_datalayout_string<'a>( | |||
input: &'a str, | |||
default_address_space: AddressSpace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this should be part of the data layout but currently none of alloca/globals/program AS can be used for this, so passing it in seems reasonable.
This comment has been minimized.
This comment has been minimized.
…arse preferred alignment spec in `p-`
Some changes occurred in coverage instrumentation. cc @Zalathar |
Some targets, such as CHERI, use as default an address space different from the "normal" default address space
0
(in the case of CHERI, 200 is used). Currently,rustc
does not allow to specify custom address spaces and does not take into considerationp-
specifications in the datalayout string.This patch tries to mitigate these problems by allowing targets to define a custom default address space (while keeping the default value to address space
0
) and adding the code to parse thep-
specifications inrustc_abi
. The main changes are thatTargetDataLayout
now uses functions to refer to pointer-related informations, instead of having specific fields for the size and alignment of pointers in the default address space; furthermore, the twopointer_size
andpointer_align
fields inTargetDataLayout
are replaced with anFxHashMap
that holds info for all the possible address spaces, as parsed by thep-
specifications.The potential performance drawbacks of not having ad-hoc fields for the default address space will be tested in this PR's CI run.
r? workingjubilee