Description
For reference: https://godbolt.org/z/dv6rEeYWe
Compile the following code with opt-level=s:
pub fn subps(a: &[f32; 4], b: &[f32; 4], c: &mut [f32; 4]) {
    let c0 = a[0] - b[0];
    let c1 = a[1] - b[1];
    let c2 = a[2] - b[2];
    let c3 = a[3] - b[3];
    c[0] = c0;
    c[1] = c1;
    c[2] = c2;
    c[3] = c3;
}
I expected:
subps:
        movups  xmm0, xmmword ptr [rdi]
        movups  xmm1, xmmword ptr [rsi]
        subps   xmm0, xmm1
        movups  xmmword ptr [rdx], xmm0
        ret
Instead, I got:
subps:
        movss   xmm0, dword ptr [rdi]
        movss   xmm1, dword ptr [rdi + 4]
        subss   xmm0, dword ptr [rsi]
        subss   xmm1, dword ptr [rsi + 4]
        movss   xmm2, dword ptr [rdi + 8]
        subss   xmm2, dword ptr [rsi + 8]
        movss   xmm3, dword ptr [rdi + 12]
        subss   xmm3, dword ptr [rsi + 12]
        movss   dword ptr [rdx], xmm0
        movss   dword ptr [rdx + 4], xmm1
        movss   dword ptr [rdx + 8], xmm2
        movss   dword ptr [rdx + 12], xmm3
        ret
This is roughly what clang generates at -O1. Since opt-level=s is supposed to optimize for size, emitting 13 scalar instructions where 5 packed ones would do is a missed optimization.
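For anyone who needs the packed form in the meantime, here is a minimal sketch of a possible workaround that requests the packed load, subtract, and store through the `std::arch` SSE intrinsics instead of relying on auto-vectorization. The name `subps_packed` is hypothetical, and I have not checked what this compiles to at opt-level=s; the scalar fallback only exists so that the snippet still builds on other targets.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm_loadu_ps, _mm_storeu_ps, _mm_sub_ps};

/// Hypothetical workaround: subtract four f32 lanes with explicit SSE intrinsics.
#[cfg(target_arch = "x86_64")]
pub fn subps_packed(a: &[f32; 4], b: &[f32; 4], c: &mut [f32; 4]) {
    // SAFETY: SSE is part of the x86_64 baseline, each reference points to
    // exactly four f32 values (16 bytes), and the unaligned load/store
    // intrinsics have no alignment requirement.
    unsafe {
        let va = _mm_loadu_ps(a.as_ptr());
        let vb = _mm_loadu_ps(b.as_ptr());
        _mm_storeu_ps(c.as_mut_ptr(), _mm_sub_ps(va, vb));
    }
}

/// Scalar fallback so the sketch also builds on non-x86_64 targets.
#[cfg(not(target_arch = "x86_64"))]
pub fn subps_packed(a: &[f32; 4], b: &[f32; 4], c: &mut [f32; 4]) {
    for i in 0..4 {
        c[i] = a[i] - b[i];
    }
}
```

This keeps the same signature as `subps` above, so it can be swapped in at call sites. It does not address the underlying codegen issue.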
Meta
rustc --version --verbose:
rustc 1.90.0-nightly (11ad40bb8 2025-06-28)
binary: rustc
commit-hash: 11ad40bb839ca16f74784b4ab72596ad85587298
commit-date: 2025-06-28
host: x86_64-unknown-linux-gnu
release: 1.90.0-nightly
LLVM version: 20.1.7
Compiler returned: 0