Remove fewer Storage calls in CopyProp and GVN #142531
Some changes occurred to MIR optimizations

cc @rust-lang/wg-mir-opt
@bors try @rust-timer queue
…try> Remove fewer Storage calls in `copy_prop`

Modify the `copy_prop` MIR optimization pass to remove fewer `Storage{Live,Dead}` calls, allowing for better optimizations by LLVM - see #141649.

### Details

This is my attempt to fix the mentioned issue (this is the first part, I also implemented a similar solution for GVN in [this branch](https://github.com/rust-lang/rust/compare/master...ohadravid:rust:better-storage-calls-gvn-v2?expand=1)).

The idea is to use the `MaybeStorageDead` analysis and remove only the storage calls of `head`s that are maybe-storage-dead when the associated `local` is accessed (or, conversely, keep the storage of `head`s that are for-sure alive in _every_ relevant access).

When combined with the GVN change, the final example in the issue (#141649 (comment)) is optimized as expected by LLVM. I also measured the effect on a few functions in `rav1d` (where I originally saw the issue) and observed reduced stack usage in several of them.

This is my first attempt at working with MIR optimizations, so it's possible this isn't the right approach, but all tests pass, and the resulting diffs appear correct.

r? tmiasko since he commented on the issue and pointed to these passes.
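To make the idea above concrete, here is a minimal sketch on a toy straight-line program. The `Stmt` enum, the `head_of` map, and `storage_to_remove` are hypothetical names invented for this illustration and are not rustc's MIR types or the PR's code. With no branches, "maybe storage-dead" degenerates to "currently storage-dead", so treat this as an intuition aid only.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, simplified statement kinds; not rustc's MIR types.
#[derive(Clone, Copy, Debug)]
enum Stmt {
    StorageLive(usize),
    StorageDead(usize),
    /// Any statement that reads or writes the given local.
    Use(usize),
}

/// Returns the heads whose storage statements would have to be removed:
/// those that are storage-dead at some use of another local in their copy class.
/// Assumption: every head in `head_of` has explicit storage statements.
fn storage_to_remove(body: &[Stmt], head_of: &HashMap<usize, usize>) -> BTreeSet<usize> {
    let mut live = BTreeSet::new();
    let mut remove = BTreeSet::new();
    for stmt in body {
        match *stmt {
            Stmt::StorageLive(local) => {
                live.insert(local);
            }
            Stmt::StorageDead(local) => {
                live.remove(&local);
            }
            Stmt::Use(local) => {
                if let Some(&head) = head_of.get(&local) {
                    // After copy propagation this use would refer to `head`,
                    // so `head` must have storage here for its storage to be kept.
                    if head != local && !live.contains(&head) {
                        remove.insert(head);
                    }
                }
            }
        }
    }
    remove
}
```

If every use of `_2` falls inside the storage range of its head `_1`, the returned set is empty and the storage statements of `_1` can stay; a single use of `_2` after `StorageDead(_1)` is enough to put `_1` in the set.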
☀️ Try build successful - checks-actions
Finished benchmarking commit (ef7d206): comparison URL.

Overall result: ❌ regressions - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never

Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage): Results (primary 0.7%, secondary 3.4%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles: Results (primary -0.6%, secondary -0.1%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size: Results (primary 0.0%, secondary 0.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 757.399s -> 756.065s (-0.18%)
@matthiaskrgr - I updated the implementation to stop re-checking a head once it is found to be maybe-dead, which should perform a bit better.
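A rough sketch of that early exit, under assumptions: `heads_to_remove` and the `is_maybe_dead` callback are hypothetical stand-ins for the visitor and the dataflow query, and `(head, location)` pairs stand in for the visited uses. Once a head is in the result set, further queries for it are skipped.

```rust
use std::collections::BTreeSet;

/// Hypothetical sketch: `uses` holds `(head, location)` pairs, and
/// `is_maybe_dead` stands in for a (potentially costly) dataflow query.
fn heads_to_remove(
    uses: &[(usize, usize)],
    mut is_maybe_dead: impl FnMut(usize, usize) -> bool,
) -> BTreeSet<usize> {
    let mut remove = BTreeSet::new();
    for &(head, location) in uses {
        // Early exit: a head that is already marked needs no further queries.
        if remove.contains(&head) {
            continue;
        }
        if is_maybe_dead(head, location) {
            remove.insert(head);
        }
    }
    remove
}
```

The result is the same as without the `continue`; only the number of queries changes.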
@bors try @rust-timer queue
Should this check happen in
☀️ Try build successful - checks-actions
I'm not sure how to make this work: using Is there a different way to do this?
Finished benchmarking commit (c0a2949): comparison URL.

Overall result: ❌✅ regressions and improvements - please read the text below

Benchmarking this pull request means it may be perf-sensitive – we'll automatically label it not fit for rolling up. You can override this, but we strongly advise not to, due to possible changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please do so in sufficient writing along with @bors rollup=never

Instruction count: Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage): Results (primary -0.1%, secondary -1.3%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles: Results (secondary -1.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size: Results (primary -0.0%, secondary 0.0%). A less reliable metric. May be of interest, but not used to determine the overall result above.
Bootstrap: 756.494s -> 757.685s (0.16%)
tests/mir-opt/copy-prop/copy_prop_borrowed_storage_not_removed.rs
I am not familiar with GVN, so I will leave review of that part to @cjgillot.
☔ The latest upstream changes (presumably #142870) made this pull request unmergeable. Please resolve the merge conflicts.
… to remove fewer storage statements
…r storage statements
26fc160 to 5c21ce3
654c4a2 to ab1da60
ab1da60 to e69bef8
@tmiasko - implemented all the changes 😄 I also updated the GVN code since they applied there as well (use a single I can also split this PR if needed, and I'll polish the git history when you think this looks good enough 🧹 PS
} else {
    // Remove the storage statements of all the reused locals.
    state.reused_locals.clone()
};
This logic is very close to the one in `copy_prop`. Could you merge them into a single `compute_storage_to_remove` function in `rustc_mir_transform::ssa`? It would take a set of relevant locals (copy-prop: `copy_classes[local] != local`, GVN: `reused_locals`) and compute `storage_to_remove`.
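A rough sketch of what such a shared helper could look like, using plain `std` collections instead of rustc's index types. The signature, the `(local, location)` access list, and the `maybe_unusable_at` callback are assumptions made for illustration; which local's storage ends up in the set is also simplified compared to the real passes.

```rust
use std::collections::BTreeSet;

/// Hypothetical shared helper; the signature is an assumption.
/// `accesses` lists `(local, location)` pairs, and `maybe_unusable_at`
/// stands in for a query against a dataflow analysis.
fn compute_storage_to_remove(
    relevant: &BTreeSet<usize>,
    accesses: &[(usize, usize)],
    maybe_unusable_at: impl Fn(usize, usize) -> bool,
) -> BTreeSet<usize> {
    let mut storage_to_remove = BTreeSet::new();
    for &(local, location) in accesses {
        if relevant.contains(&local)
            && !storage_to_remove.contains(&local)
            && maybe_unusable_at(local, location)
        {
            storage_to_remove.insert(local);
        }
    }
    storage_to_remove
}

/// Copy-prop side: locals for which `copy_classes[local] != local`.
fn copy_prop_relevant(copy_classes: &[usize]) -> BTreeSet<usize> {
    copy_classes
        .iter()
        .enumerate()
        .filter(|&(local, &head)| local != head)
        .map(|(local, _)| local)
        .collect()
}
```

In this sketch, copy-prop would pass `copy_prop_relevant(..)` as `relevant`, while GVN would pass its `reused_locals` set directly; everything after that is shared.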
    pub(crate) fn new() -> Self {
        Self {}
    }
}
This is not very useful. It can be created using the struct's name.
PlaceContext::MutatingUse(MutatingUseContext::AsmOutput)
| PlaceContext::MutatingUse(MutatingUseContext::Call)
| PlaceContext::MutatingUse(MutatingUseContext::Store)
| PlaceContext::MutatingUse(MutatingUseContext::Yield)
They do not require the local to be initialized, but they do require it to have storage. Mixing the two notions makes me uneasy.
I can probably add a test that triggers this. Is the right solution to use two separate analysis checks (so for these we'll check the `MaybeStorageDead`)?
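To illustrate the distinction being discussed, here is a tiny model with two independent flags per local. `LocalState`, `Access`, and `access_is_valid` are hypothetical names for this sketch only: a store-like access needs storage but not an initialized value, while a read needs both.

```rust
/// Hypothetical per-local state: storage and initialization are tracked separately.
#[derive(Clone, Copy, Debug)]
struct LocalState {
    has_storage: bool,
    initialized: bool,
}

/// Simplified access kinds: `Store` stands in for Store/Call/Yield/AsmOutput.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Access {
    Read,
    Store,
}

/// A store only requires storage; a read requires storage and an initialized value.
fn access_is_valid(state: LocalState, access: Access) -> bool {
    match access {
        Access::Store => state.has_storage,
        Access::Read => state.has_storage && state.initialized,
    }
}
```

A single "maybe uninitialized" check would accept a store into a local that has no storage at all, which is the mixing of notions raised above.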
// We don't need to check storage statements and statements for which the local doesn't need to be initialized.
match context {
    PlaceContext::MutatingUse(
        MutatingUseContext::Store
        | MutatingUseContext::Call
        | MutatingUseContext::Yield
        | MutatingUseContext::AsmOutput,
    )
    | PlaceContext::NonUse(_) => {
        return;
    }
    _ => {}
};
This entire match is unnecessary and can be removed.
Actually, the `PlaceContext::NonUse` part is potentially useful, but there are no tests. Could you add one? Something with the storage statements the other way around, like in:
#![feature(custom_mir, core_intrinsics)]
use std::intrinsics::mir::*;

#[custom_mir(dialect = "runtime")]
pub fn f() -> usize {
    mir! {
        let _1: usize;
        let _2: usize;
        let _3: usize;
        {
            StorageLive(_2);
            StorageLive(_1);
            _1 = 0;
            _2 = _1;
            _3 = _2;
            RET = _3 + _3;
            StorageDead(_1);
            StorageDead(_2);
            Return()
        }
    }
}
// If the local is borrowed, we cannot easily determine if it is used, so we have to remove the storage statements.
if self.borrowed_locals.contains(local) {
    self.storage_to_remove.insert(head);
    return;
}
Can you account for `borrowed_locals` outside the visitor? The specific location / use of a local is irrelevant in this case, so the visitor doesn't seem like the right place to consider this.
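One way to read this suggestion, sketched with plain `std` collections: seed the result set once, before any visiting, from the borrowed locals and the copy classes. `seed_with_borrowed` is a hypothetical name, and representing `copy_classes` as a slice indexed by local is an assumption of this sketch.

```rust
use std::collections::BTreeSet;

/// Hypothetical pre-pass: `copy_classes[local]` is the head of `local`.
/// Any head with a borrowed, non-head member loses its storage statements,
/// independently of where that member is used.
fn seed_with_borrowed(copy_classes: &[usize], borrowed: &BTreeSet<usize>) -> BTreeSet<usize> {
    let mut storage_to_remove = BTreeSet::new();
    for (local, &head) in copy_classes.iter().enumerate() {
        if local != head && borrowed.contains(&local) {
            storage_to_remove.insert(head);
        }
    }
    storage_to_remove
}
```

The visitor can then start from this set and only needs to handle the location-dependent checks.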
Modify the CopyProp and GVN MIR optimization passes to remove fewer `Storage{Live,Dead}` calls, allowing for better optimizations by LLVM - see #141649.

### Details

The idea is to use a new `MaybeUninitializedLocals` analysis and remove only the storage calls of locals that are maybe-uninit when accessed in a new location.
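For readers unfamiliar with "maybe" analyses, the following is a minimal forward dataflow sketch over a toy control-flow graph. It is not the PR's `MaybeUninitializedLocals` implementation: `Effect`, `Block`, and `maybe_uninit_on_entry` are hypothetical names, and the only point is that states are joined by set union, so a local initialized on just one incoming path stays maybe-uninit after the merge.

```rust
use std::collections::BTreeSet;

/// Hypothetical effects a block can have on a local.
#[derive(Clone, Copy, Debug)]
enum Effect {
    Init(usize),
    Deinit(usize),
}

/// Hypothetical basic block: effects in order, then successor block indices.
struct Block {
    effects: Vec<Effect>,
    succs: Vec<usize>,
}

/// For each block, computes the set of locals that may be uninitialized on entry.
/// Block 0 is the entry block, where every local starts out uninitialized.
fn maybe_uninit_on_entry(blocks: &[Block], num_locals: usize) -> Vec<BTreeSet<usize>> {
    let mut entry: Vec<BTreeSet<usize>> = vec![BTreeSet::new(); blocks.len()];
    if blocks.is_empty() {
        return entry;
    }
    let mut visited = vec![false; blocks.len()];
    entry[0] = (0..num_locals).collect();
    visited[0] = true;
    let mut worklist = vec![0usize];
    while let Some(b) = worklist.pop() {
        // Apply the block's effects to a copy of its entry state.
        let mut state = entry[b].clone();
        for effect in &blocks[b].effects {
            match *effect {
                Effect::Init(local) => {
                    state.remove(&local);
                }
                Effect::Deinit(local) => {
                    state.insert(local);
                }
            }
        }
        // Join into successors by union; revisit a successor if its state grew.
        for &succ in &blocks[b].succs {
            let before = entry[succ].len();
            entry[succ].extend(state.iter().copied());
            if entry[succ].len() != before || !visited[succ] {
                visited[succ] = true;
                worklist.push(succ);
            }
        }
    }
    entry
}
```

In a diamond-shaped graph where only one branch initializes a local, that local is reported as maybe-uninit at the join block, which is the situation where the storage statements would still be removed.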