|
The return type of IR instructions is stored in the field
::type of struct instruction and this struct has no space
to hold the type of the operand(s). This is not a problem
for most instructions because there is an easy way to get
the operands' type. For example, for binops both types
must be the same so they are used interchangeably.
However, for compare instructions both types can be different
and there is no easy way to get the type of the operands.
Currently, this is ignored and creates some errors. It
also blocks simplifications that need this type information.
But compare instructions need only 2 operands; there is
thus one 'slot' left. So, use this slot for the operands' type.
This solves the current errors, allows new simplifications
and has very little impact on existing code. Of course,
this type information now needs to be tracked and adjusted
whenever the operands change or an instruction is changed
into a compare.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Because of the lack of type information, compare instructions are
not always handled correctly. So, add some testcases for this.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The current tags check-output-contains/excludes/pattern are
quite powerful and the new check-output-match is easy to use
but things can be even simpler. Indeed, a lot of IR simplifications/
canonicalizations can be tested by checking that the expression
to be tested is equivalent to another one. This is less precise
than some more specific tests but:
* it's a big advantage because it's less sensitive to 'noise'
like the exact number used by the pseudos or to the results
of some new simplifications or canonicalizations
* very often, this equivalence is what really matters and not
the exact transformation.
So, add a new ultra-simple-to-use tag: just ask that all functions
of the tests return the same specified value (usually 1):
check-output-returns: <value>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The current tags check-output-contains/excludes/pattern are
quite powerful, universal, but they often need 'complex' regular
expressions with escaping which makes them not so nice to read.
For testing IR results, a very common pattern is:
this instruction must have this (kind of) operand.
So, make a new tag for this. It does nothing that can't be done
with the previous ones; on the contrary, it is much
simpler to use:
check-output-match(instruction): operand
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* simplify and canonicalize unsigned compares
* basic unop simplifications
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some unsigned compares against 0 or 1 are equivalent to testing
(in)equality with 0 (x <= 0, x > 0, x < 1, x >= 1).
Canonicalize them to this latter, more common form.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some unsigned compares against 0 are always true or always false
(x < 0 or x >= 0). Simplify them.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add a few testcases for the simplification of unary operations.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* essential OP_ADD & OP_SUB simplifications
|
|
The patterns used here were based on looser semantics for OP_{SEXT,TRUNC}.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* teach sparse about -funsigned-bitfields
* let plain bitfields default to signed
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Not really a simplification in itself but it makes some other
simplifications a little easier (if only because there is one
less argument to be matched).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Do this simplification once for all associative binops.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
At expansion phase, when simplified, all constants are truncated
to the size of the operations that generate them.
This should be done during simplification too because:
*) if some constants are sometimes truncated and sometimes
sign-extended, CSE will miss some opportunities.
*) it's not possible to sign-extend them because it's not
always known if the constant is used in a signed context or not.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add some testcases about basic simplifications of additions
and subtractions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, Sparse treats 'plain' bitfields as unsigned.
However, this is inconsistent with how non-bitfield integers
are handled and with how GCC & clang handle bitfields.
So, teach sparse about '-funsigned-bitfields' and by default treat
these bitfields as signed, as done by GCC & clang and as done
for non-bitfield integers.
Also, avoid plain bitfields in IR related testcases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Sparse complains when a shift amount is too big for the size
of its operand or if it's negative.
However, it does this even for expressions that are never evaluated.
It's especially annoying in the kernel for type generic macros,
for example the ones in arch/*/include/asm/cmpxchg.h
So, remove all warnings done at expansion time and avoid any
simplification of such expressions. Do the same at linearization
and optimization time, but in this case mark the instructions as
'tainted' to inhibit any further simplifications. Finally, at the
end of the optimization phase, warn about the tainted instructions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
If the flag '-m64' is used on a 32-bit architecture/machine having
int32_t set to 'long', then these int32_t are forced to 64-bit ...
So, ignore the effect of -m64 on these archs and ignore
'64-bit only' tests on them.
Reported-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Tested-by: Uwe Kleine-König <uwe@kleine-koenig.org>
|
|
The test optim/cse-size fails on 32-bit because it needs
two integers of different size but uses int & long.
These two types have indeed different sizes on 64-bit
(LP64) but not on 32-bit (ILP32).
Fix this by using short & int.
Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* remove more complex phi-nodes
|
|
In a set of related phi-nodes and phi-sources, if all phi-sources
but one correspond to the target of one of the phi-nodes, then
no phi-node is needed and all %phis can be replaced by the unique
source.
For example, code like:
int test(void);
int foo(int a)
{
while (test())
a ^= 0;
return a;
}
used to produce an IR with a phi-node for 'a', like:
foo:
phisrc.32 %phi2(a) <- %arg1
br .L4
.L4:
phi.32 %r7(a) <- %phi2(a), %phi3(a)
call.32 %r1 <- test
cbr %r1, .L2, .L5
.L2:
phisrc.32 %phi3(a) <- %r7(a)
br .L4
.L5:
ret.32 %r7(a)
but since 'a ^= 0' is a no-op, the value of 'a' is in fact
never modified. This can be seen in the phi-node where its
second operand (%phi3) is the same as its target (%r7). So
the only possible value for 'a' is the one from the first
operand, its initial value (%arg1).
Once this trivial phi-node is removed, the IR is the expected:
foo:
br .L4
.L4:
call.32 %r1 <- test
cbr %r1, .L4, .L5
.L5:
ret.32 %arg1
Removing these trivial phi-nodes will usually trigger other
simplifications, especially those concerning the CFG.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Trivial phi-nodes are phi-nodes having a unique possible outcome.
So, there is nothing to join and the phi-node target can be replaced
by the unique value.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Accesses to volatiles must, of course, not be optimized away.
For this, we need to check the type associated with the memory access.
Currently this is done by checking if the type of the result of
the memop is volatile or not. Usually, the type of the result is
the same as the one of the access, so everything is good, but for
bitfields the memop is not done with the type of the bitfield
itself but with its base type. Since this base type is unrelated
to the access type, it is generally not marked as volatile even
when the access to the bitfield is volatile.
Fix this by using the true type of the access to set the field
struct instruction::is_volatile.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Accesses to bitfields must, of course, not be optimized away.
This is currently not the case.
Add a testcase for it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* do 'classical' SSA conversion (via the iterated dominance frontier).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* fix buggy recursion in kill_dead_stores()
* kill dead stores again after memops simplification is done.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* simplify TRUNC((x & M') | y, N)
* simplify AND(SHIFT(a | b, S), M)
* simplify TRUNC(SHIFT(a | b, S), N)
|
|
The simplification of TRUNC(SHIFT(a | b, S), N) can be done by
combining the effective mask corresponding to TRUNC(_, N) with
the one corresponding to SHIFT(_, S).
This also allows simplifying signed bitfields. For example, code like:
struct s {
signed int :2;
signed int f:3;
};
int bfs(struct s s, int a)
{
s.f = a;
return s.f;
}
is now simplified into the minimal:
bfs:
trunc.3 %r4 <- (32) %arg2
sext.32 %r11 <- (3) %r4
ret.32 %r11
The simplification is done by calling simplify_mask_shift() with
the mask corresponding to TRUNC(_, N).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The simplification of AND(SHIFT(a | b, S), M) can be done by combining
the mask M with the effective mask corresponding to SHIFT(_, S).
This instruction pattern is generated when accessing bitfields,
for example, code like:
struct u {
unsigned int :2;
unsigned int f:3;
};
int bfu(struct u s, int a)
{
s.f = a;
return s.f;
}
is now simplified into the minimal:
bfu:
and.32 %r11 <- %arg2, $7
ret.32 %r11
The simplification is done by introducing a small helper,
simplify_mask_shift(), doing the pattern matching and then calling
simplify_mask_shift_or() with the mask M.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An N-bit truncate is not much different from ANDing with
an N-bit mask, so some simplifications done for AND can
also be done for TRUNC. For example, for code like this:
char foo(int x, int y) { return (x & 0xffff) | y; }
the mask is unneeded and the function should be equivalent to:
char foo(int x, int y) { return x | y; }
The simplification in this patch does exactly this, giving:
foo:
or.32 %r4 <- %arg1, %arg2
trunc.8 %r5 <- (32) %r4
ret.8 %r5
while previously the mask was not optimized away:
foo:
and.32 %r2 <- %arg1, $0xffff
or.32 %r4 <- %r2, %arg2
trunc.8 %r5 <- (32) %r4
ret.8 %r5
This simplification is especially important for signed bitfields
because the TRUNC+ZEXT of unsigned bitfields is simplified into
an OP_AND but this is, of course, not the case for the TRUNC+SEXT
of signed bitfields.
Do the simplification by calling simplify_mask_or(), initially used
for OP_AND, but with the effective mask corresponding to TRUNC(x, N):
$mask(N).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
The instructions SHL(AND(x, M), S) can be simplified into SHL(x, S)
if (M << S) == (-1 << S).
For example, code like:
unsigned foo(unsigned x)
{
return (x & 0x000fffff) << 12;
}
is now optimized into:
foo:
shl.32 %r3 <- %arg1, $12
ret.32 %r3
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The instructions SHL(AND(x, M), S) can be simplified to 0
if (M << S) == 0.
For example code like:
unsigned foo(unsigned x)
{
return (x & 0xfff00000) << 12;
}
is now simplified into:
foo:
ret.32 $0
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The instructions LSR(AND(x, M), S) are already simplified into
AND(LSR(x, S), (M >> S)) but only if AND(x, M) has a single user.
However, if (M >> S) == (-1 >> S), the AND part is redundant and the
whole can always directly be simplified into LSR(x, S).
For example, code like:
unsigned foo(unsigned x)
{
unsigned t = (x & 0xfffff000);
return ((t >> 12) ^ (x >> 12)) & t;
}
is now optimized into:
foo:
ret.32 $0
because (t >> 12) is simplified into (x >> 12).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The instructions LSR(AND(x, M), S) are already simplified into
AND(LSR(x, S), (M >> S)) but only if AND(x, M) has a single user.
However, if (M >> S) == 0, they can always directly be simplified to 0.
For example code like:
unsigned foo(unsigned x)
{
unsigned t = (x & 0x00000fff);
return (t >> 12) & t;
}
is now simplified into:
foo:
ret.32 $0
while previously it was:
foo:
and.32 %r2 <- %arg1, $0xfff
lsr.32 %r4 <- %r2, $12
and.32 %r6 <- %r4, %r2
ret.32 %r6
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The pattern LSR(AND(x, M), S) is already generically simplified into
((x >> S) & (M >> S)) but only if the sub-expression AND(x, M) is not
shared with some other expression because the simplification modifies it.
But for some special cases the expression can be simplified even if
the sub-expression is shared because the simplification doesn't need
to modify this AND(x, M) part.
Add the testcases for LSR and the incoming SHL.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The same simplifications done for LSR can be done for SHL
(with the appropriate mask).
For example, code like:
int foo(int a, int b)
{
return ((a & 0xfff00000) | b) << 12;
}
is now optimized into:
foo:
shl.32 %r5 <- %arg2, $12
ret.32 %r5
while previously it was:
foo:
and.32 %r2 <- %arg1, $0xfff00000
or.32 %r4 <- %r2, %arg2
shl.32 %r5 <- %r4, $12
ret.32 %r5
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
If the effective mask (M) corresponding to OP(_, K) is different
from the combined mask (C & M), then the inner mask (C) can be
replaced by the smaller (C & M), giving:
OP((x | M'), K) with M' == (C & M).
For example, code like:
int foo(int x)
{
return (x | 0xfffffff0) & 0xfff;
}
is now simplified into:
foo:
or.32 %r2 <- %arg1, $0xff0
and.32 %r3 <- %r2, $0xfff
ret.32 %r3
while previously, the constant was not reduced:
foo:
or.32 %r2 <- %arg1, $0xfffffff0
and.32 %r3 <- %r2, $0xfff
ret.32 %r3
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In an expression like OP((x | C), K), if the effective mask (M)
corresponding to OP(_, K) is equal to the combined mask (C & M),
then the OR operation is unneeded and can be replaced by M itself,
giving: OP(M, K).
In mathematical terms:
0) ((x | C) & M) = ((x & M) | (C & M))
1) (C & M) = M
2) ((x & M) | (C & M)) = ((x & M) | M) = M
and so OP((x | C), K) -> OP(M, K).
For example, code like:
unsigned int foo(int x)
{
return (x | 7) & 2;
}
is now simplified into:
foo:
ret.32 $2
which previously was not optimized
foo:
or.32 %r2 <- %arg1, $7
and.32 %r3 <- %r2, $2
ret.32 %r3
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In an expression like OP((x | C), K), if the inner constant
(C) and the effective mask (M) corresponding to OP(_, K) are
disjoint ((C & M) == 0), then the OR with the constant is
unneeded and can be optimized away since the constant will be
'cleared' by the mask, giving: OP(x, K)
For example, code like:
int foo(int x)
{
return (x | 0xfffff000) & 0xfff;
}
is now optimized into:
foo:
and.32 %r3 <- %arg1, $0xfff
ret.32 %r3
while previously the OR mask was not optimized away:
foo:
or.32 %r2 <- %arg1, $0xfffff000
and.32 %r3 <- %r2, $0xfff
ret.32 %r3
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, in simplify_mask_or_and(), only the cases where
(M' & M) == 0 or (M' & M) == M are simplified. However, if the
combined mask (M' & M) is different from the original inner mask
(M'), this inner mask can be replaced by the smaller combined one,
giving: OP(((x & (M' & M)) | y), K).
For example, code like:
int foo(int x, int y)
{
return (((x & 0xfffffff0) | y) & 0xfff);
}
is now simplified into:
foo:
and.32 %r2 <- %arg1, $0xff0
or.32 %r4 <- %r2, %arg2
and.32 %r5 <- %r4, $0xfff
ret.32 %r5
while previously, the mask was not reduced:
foo:
and.32 %r2 <- %arg1, $0xfffffff0
...
Note: this is not a very effective simplification, unlike directly
removing an instruction; nevertheless, the smaller mask can
trigger other simplifications and may also be advantageous
for a subsequent code generation phase.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In an expression like OP(((x & M') | y), K), if the inner mask (M')
'covers' all the bits of the effective mask (M) corresponding to
OP(_, K), IOW if
(M' & M) == M, then the inner mask (M') is unneeded and can be
optimized away, giving: OP((x | y), K).
For example, with this simplification, code like:
int foo(int x, int y)
{
return (((x & 0x0fffffff) | y) & 0xfff);
}
is now optimized into the minimal:
foo:
or.32 %r4 <- %arg1, %arg2
and.32 %r5 <- %r4, $0xfff
ret.32 %r5
while previously, the inner mask was not optimized away:
foo:
and.32 %r2 <- %arg1, $0xfffffff
or.32 %r4 <- %r2, %arg2
and.32 %r5 <- %r4, $0xfff
ret.32 %r5
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In simplify_mask_or(), two calls to simplify_mask_or_and() are
made: one for each operand of the OR instruction.
These two calls are guarded by a test checking for the presence
of the AND instruction. If the test succeeds for the first operand,
the second operand is never considered for simplification,
even if no simplification has been made.
Fix this by testing the return value of simplify_mask_or_and()
and do the second call if no simplification could be
done in the first one.
For example, code like:
int foo(int x, int y, int a)
{
return ((a & 0xf000) | (x & y)) & 0x0fff;
}
was expectedly simplified into:
foo:
and.32 %r5 <- %arg1, %arg2
and.32 %r7 <- %r5, $0xfff
ret.32 %r7
while the same code with the operands of the OR swapped was not:
int foo(int x, int y, int a)
{
return ((x & y) | (a & 0xf000)) & 0x0fff;
}
resulted in the non-optimized:
foo:
and.32 %r3 <- %arg1, %arg2
and.32 %r5 <- %arg3, $0xf000
or.32 %r6 <- %r3, %r5
and.32 %r7 <- %r6, $0xfff
ret.32 %r7
but now the simplification can also be done:
foo:
and.32 %r3 <- %arg1, %arg2
and.32 %r7 <- %r3, $0xfff
ret.32 %r7
Note: it would be simpler to unconditionally do both calls
but this is unsafe because some of the concerned
instructions, needed in the second call, could have
been simplified away in the first one.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The extraction & insertion of bitfields is made of relatively
complex combinations of SHL/LSR/AND/OR and TRUNC/ZEXT/SEXT.
Add a few testcases showing the effectiveness of their
simplification and to catch possible future regressions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
There is a potential problem when the second side of the OR
is simplified away.
Add 2 testcases to catch possible regressions here.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* add simplification of TRUNC(TRUNC(x))
* add simplification of SHL(LSR(x, S), S)
|
|
The simplification of ZEXT(ZEXT(x)) was already added but
its dual, TRUNC(TRUNC(x)), was not.
Add it now.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The simplification of ((x << S) >> S) was already added
but its dual was forgotten.
Add it now.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add the testcase before doing the simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add the testcase before doing the simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The testcase was named with the wrong operation order
(inner op first instead of outer-first).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is especially useful when simplifying code
accessing bitfields.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This transformation is especially useful when simplifying code
accessing bitfields or for other masking manipulations.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is a common pattern for bitfield simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
If an expression is already a boolean (constant) expression,
converting it to a boolean expression is a no-op.
In this case, return early; this avoids creating intermediate
code that will need to be simplified away at some later stage.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the OP_SETCC instructions can only return a 0 or a 1,
a subsequent AND mask can often be heavily simplified.
Simplify the mask value.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the OP_SETCC instructions can only return a 0 or a 1,
a truncation won't change the value and the OP_SETCC
can be changed to directly return the truncated size.
Remove the unneeded truncate.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the OP_SETCC instructions can only return a 0 or a 1, a
sign-extension won't change the value if the size is wider than 1.
The OP_SETCC can thus be changed to directly return the extended size.
Remove the unneeded extension.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the OP_SETCC instructions can only return a 0 or a 1,
a zero-extension won't change the value and the OP_SETCC
can be changed to directly return the extended size.
Remove the unneeded extension.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Convert the OP_TRUNC into an OP_AND with the appropriate mask.
This has a greater chance to simplify with previous instructions
and to correspond to a register-sized operand.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the OP_SETCC instructions can only return a 0 or a 1,
a compare-not-equal-zero or compare-equal-1 is a no-op
if the operand has been masked with 1.
Remove the no-op comparison (if the size matches).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Conditional branches, or more exactly OP_CBR, can't accept
an arbitrary expression as condition; an integer value
is required.
Fix this by adding a comparison against zero.
|
|
Add some tests in preparation of some bug-fixing and
simplification in linearize_logical() and linearize_conditional().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* some simplifications of OP_SETNE & OP_SETEQ with 0 and 1
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
These two comparisons are no-ops when the operand is boolean.
Simplify them away.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A sign-extension has no effect on the result of a comparison
with 0 or with 1 when the original size is bigger than 1.
Optimize away the extension.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A zero-extension has no effect on the result of a comparison with 0 or 1.
Optimize away the extension.
Note: this simplification was already done, but only when the
original size was 1; it can in fact be done for all sizes.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add some basic testcases for these relatively common
simplification opportunities.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* several simplifications involving casts and/or bitfields
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* give a correct & sensible warning on negative or over-sized shifts.
* add conservative simplification of such shifts.
* do not optimize the meaningless shift:
* any shift with a negative count
* OP_ASRs with an over-sized shift count.
* try to give a correct negative/too-big error message during simplification.
* simplify chains of shifts.
* simplify ZEXT + ASR into ZEXT + LSR
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
After an OP_ZEXT, it is guaranteed that the sign bit is zero.
So, an arithmetic right-shift following an OP_ZEXT gives the
same result as a logical right-shift, which is simpler to handle
and further optimize.
So, when preceded by an OP_ZEXT, replace OP_ASR by OP_LSR.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since an LSR with an in-range shift count inserts a zero in
the MSB, a subsequent ASR is equivalent to an LSR of the same
count; equivalently, the combination of these two shifts is
equivalent to a single LSR with a shift count equal to the sum
of the two initial counts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A shift of a shift (of the same kind) is equivalent to a single
shift with a count equal to the sum of the two initial counts.
In addition:
* for LSRs & SHLs, if the sum is >= the instruction size, then
the result is zero.
* for ASRs, if the sum is >= the instruction size, then
the result is the same as a shift with a count of size - 1.
Implement these simplifications if both shift counts are in range.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Floating-point values are displayed using the printf
format "%Lf" but this is the format without exponent
(and with a default precision of 6 digits).
However, by its nature, this format is very imprecise.
For example, *all* values smaller than 0.5e-6 are displayed
as "0.000000".
Improve this by using the "%Le" format which always uses
an exponent and thus maximizes the precision.
Note: ultimately, we should display them exactly, for
example by using "%La", but this requires C99.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Like GCC, we don't want to touch over-sized ASRs, but over-sized
LSRs & SHLs degenerate to 0.
Add testcases covering all cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A sign-extension following a zero-extension can always
be simplified into a single zero-extension since the
intermediate value is guaranteed to have a zero sign bit.
Simplify away such instructions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A zero-extension following another zero-extension can always
be simplified into a single zero-extension.
Simplify away such instructions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A sign-extension following another sign-extension can always
be simplified into a single sign-extension.
Simplify away such instructions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An OP_AND with a constant value following an OP_ZEXT can often
be simplified:
* the constant mask can be made smaller
* the whole masking is redundant and can be removed.
Do these two simplifications depending on the initial mask
and the sizes of the instructions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An OP_AND with a constant value followed by a sign- or zero-extension
can often be moved after the extension.
This is good because it can trigger the cancellation of the OP[ZS]EXT
with a preceding OP_TRUNC, a common situation with bitfields.
Note: This 'simplification' is less desirable if there is
no preceding OP_TRUNC, for example, in a sequence like:
and.32 %t <- %a, $mask
*ext.64 %r <- %t
If needed, this can be filtered out at some later stage.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
sparse contains some code to simplify an AND masking
followed by a truncating cast.
However, this simplification is problematic because it
doesn't keep the sizes consistent. For example, code like:
and.32 %r3 <- %r2, $0x7fff
trunc.16 %r4 <- (32) %r3
will be discarded with %r3 used in place of %r4.
This is correct in the mathematical sense but %r4 had a size
of 16 while %r3 has a size of 32, so using %r3 in place of %r4
will make the sizes inconsistent with unexpected consequences.
We can more or less fix this by using another transformation
that preserves the sizes:
trunc.16 %r3 <- (32) %r2
and.16 %r4 <- %r3, $0x7fff
which in itself doesn't optimize anything but:
1) the mask may be smaller
2) it may trigger other simplifications with the TRUNC or the AND.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An OP_SEXT or an OP_ZEXT followed by a truncate to a size smaller
than the original size doesn't need the extension: the same result
can be obtained by doing the truncate directly on the original value.
Dually, an OP_SEXT or an OP_ZEXT followed by a truncate to a size
greater than the original size doesn't need the truncate: the same
result can be obtained by doing the extend directly on the original value.
Rearrange the inputs (src & orig_type) to bypass the unneeded operation.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An OP_ZEXT/OP_SEXT followed by an OP_TRUNC to the original size is a NOP.
Simplify away such OP_TRUNC.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
There are several difficulties, some related to the unclear semantics
of our IR instructions and/or type evaluation.
Add testcases trying to cover this area.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the mathematical sense, the result of a left-shift by an
amount bigger than the operand size equals zero.
Do the corresponding simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the mathematical sense, the result of LSR by an amount bigger
than the operand size equals zero.
Do the corresponding simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This activates the new SSA conversion that will be used
to replace simplify_symbol_usage(), which created invalid
SSA (phi-nodes were placed where the value was needed
instead of where the paths meet; also, and partially related,
it was possible for a phi-node to have more operands/sources
than its BB had parents).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A few tests are added and some have been renamed to better
reflect their purpose. Finally, some checks have been added
or tweaked.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In kill_dead_stores_bb(), after having scanned the current BB,
we may recurse into the parent BBs. But we must do this only if
we're sure that the store there may not be needed in another BB.
In other words, we must do this only if the current BB is the only
child of the parent.
However, if one of the parents has more than one child, instead
of trying the next parent, the function stops there and returns.
Furthermore, the check is done with an open-coded loop instead of
using the helper bb_list_size().
Fix this by using bb_list_size() to check if the parent has only
one child, do the recursion if it is the case and try the next
parent if not.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, dead stores are removed after the initial
promotion of symbols to pseudos (simplify_one_symbol()).
But more complex promotions are done later during memops
simplification (simplify_loads()) and dead stores should be
removed there too but aren't.
Fix this by calling kill_dead_stores() after memops
simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* cse: try the next pair instead of keeping the first instruction
* cse: compare casts only by kind and size, not by C type.
|
|
* optimize away OP_UTPTR & OP_PTRTU which are nops.
|
|
Now that all casts to or from a pointer are between a pointer
and a pointer-sized unsigned integer, from an optimization
PoV, they are all no-ops.
So, optimize them away at simplification time.
Note: casts between pointers (OP_PTRCAST) should also be
optimized away but the original type is used for a
number of things (for example in check_access()) and
can't be optimized away so simply (yet).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
validation/linear/* should not contain testcases that are
optimization dependent and validation/*.c should not contain
tests using 'test-linearize', only those using 'sparse'.
Move some cast-related testcases accordingly.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Now that OP_AND_BOOL and OP_OR_BOOL are always given boolean
operands, they are just a special case of 1-bit OP_AND & OP_OR.
To avoid having to repeat CSE, simplification patterns, etc.,
it is better to generate plain OP_AND & OP_OR instead.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function add_convert_to_bool() was added to give a
boolean context to logical expressions but did this only
for integers.
Fix this for floating-point expressions by adding the proper
comparison to 0.0.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Because of C's integer promotion, in code like 'a == 0',
the operand 'a' must be promoted to int. So, if 'a' is
of type 'bool', it results in following linearization:
zext.32 %t <- (1) %a
setne.32 %r <- %t, $0
While this promotion is required by the standard at C level,
here, from an operational PoV, the zero-extension is unneeded
since the result will be the same without it.
Change this by simplifying away such zero-extensions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is yet another simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is another simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is a simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A truncation followed by a zero-extension to the original size,
which is produced when loading and storing bitfields, is equivalent
to a simple AND masking. Often, this AND can then trigger even
more optimizations.
So, replace TRUNC + ZEXT instructions by the equivalent AND.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Let's suppose we have a few instructions: A, B, C & D,
all congruent according to insn_compare().
The current way the CSE is done is:
The first try will be try_to_cse(A,B). If this succeeds,
A & B have been merged (AB) and the next try will be done
with the next instruction:
try_to_cse(AB,C).
That's good. However, if it fails, the next try will be:
try_to_cse(A,C).
If this one also fails, the next one will be:
try_to_cse(A,D)
And if this one also fails, nothing else is done.
In other words, all attempts are done with A. If it happens that
A can't be eliminated (because it doesn't dominate B, C & D and
is not dominated by them), no eliminations are done because the
other pairs, not involving A are never tried.
Ideally, we should try all possible pairs: A-B, A-C & A-D but
also B-C, B-D & C-D. However this is quadratic and can be a bit
costly. An easy & cheap way to improve the current situation is,
in case of failure, to not reuse the original first instruction
but the other one. So instead of testing:
A-B, A-C, A-D, ...
we will test:
A-B, B-C, C-D, ...
with the result that a 'bad' instruction can no longer block
the following pairs.
So, when try_to_cse() fails, do not retry by keeping the first
instruction, but retry using the second one.
In practice, this seems to work fairly well.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the test the offset is the same for dst & src and thus
its calculation should be CSEed away but it is not (yet).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Casts to integer used to be done with only 2 instructions:
OP_CAST & OP_SCAST.
Those are not very convenient as they don't reflect the real
operations that need to be done.
This patch specializes these instructions into:
- OP_TRUNC, for casts to a smaller type
- OP_ZEXT, for casts that need a zero extension
- OP_SEXT, for casts that need a sign extension
- Integer-to-integer casts of the same size are considered
NOPs and are, in fact, never emitted.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike. This is
simple but rather inconvenient in later phases, as it corresponds
to different operations that obey different rules and
later need extra checks.
Change this by using a specific instruction (OP_UTPTR) for
[unsigned] integer-to-pointer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike.
This is simple but rather inconvenient as it corresponds to
different operations that obey different rules and
later need extra checks.
Change this by using a specific instruction (OP_UTPTR) for
unsigned integer-to-pointer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, casts from floats to integers are processed like
integer (or any other type) to integer casts. This is simple but
rather inconvenient as it corresponds to different operations
that obey different rules and later need extra checks.
Change this by directly using specific instructions:
- FCVTU for floats to unsigned integers
- FCVTS for floats to signed integers
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, all casts to a floating point type use OP_FPCAST.
This may be simple but is rather inconvenient, as it corresponds
to several quite different operations that later need extra
checks.
Change this by directly using different instructions for the
different cases:
- FCVTF for float-float conversions
- UCVTF for unsigned integer to floats
- SCVTF for signed integer to floats
and reject attempts to cast a pointer to a float.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* merge the tests about implicit & explicit casts into a
single file as there was a lot of redundancy.
* shuffle the tests to linear/ or optim/
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An integer-to-float cast of a constant is currently simplified away
as if it is an integer-to-integer cast. That's bad.
Fix this by refusing to simplify away any integer-to-float casts
like already done for float-to-integer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An integer-to-float cast of a constant is currently simplified away
as if it is an integer-to-integer cast. That's bad.
Create a testcase for it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some non-void functions in the testcases are missing a return.
Add the missing return or mark the function as returning void.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During the linearization of a function, returns are directly
linearized as phi-sources and the exit BB contains the
corresponding phi-node and the unique OP_RET.
There is also a kind of optimization that is done if there is
only a single return statement and thus a single phi-source:
the phi-source and the phi-node are simply ignored and the
unique value is directly used by the OP_RET instruction.
While this optimization makes sense, it also has some cons:
- the phi-node and the phi-source are created anyway and will
need to be removed during cleanup.
- the corresponding optimization needs to be done anyway during
simplification
- it's only a tiny special case which saves very little.
So, keep things simple and generic and leave this sort of
simplification for the cleanup/simplification phase.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A select instruction like:
select r <- x, 0, x
always gives zero as result, but the optimizer doesn't know this.
Change this by teaching the optimizer about it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Normal instruction simplification & CSE must not be done
on dead blocks (otherwise it's possible to have unsound
situations like an instruction defining its own
operand, with possible infinite loops as a consequence).
This is ensured by the main optimization loop but not after
BB packing or flow simplification.
Fix this by calling kill_unreachable_bbs() after BB packing
and flow simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add some tests showing missed optimization opportunities.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
REPEAT_SYMBOL_CLEANUP is set when there are changes to the
addressability of a symbol, typically when an OP_SYMADDR is removed.
However, currently most OP_SYMADDRs are 'simplified'/folded into
their use. For example:
symaddr.64 %r1 <- var
add.64 %r2 <- %r1, $4
is simplified into:
add.64 %r2 <- var, $4
One of the bad consequences of this 'simplification' is that
if the 'add' instruction is later optimized away, this corresponds
to an effective change to the addressability of the symbol. It
is exactly as if the 'symaddr' had been removed before being so
'simplified', but because the symaddr is not there anymore,
no change is recorded to the addressability of the symbol
and some further optimizations may be missed.
Change that by checking, each time a usage is removed from an
instruction, if the corresponding pseudo was a symbol, and
setting REPEAT_SYMBOL_CLEANUP if it was the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function address_taken() checks that the address of a variable
is not used for anything other than doing a memory access.
If the check fails, it means that the address can escape and
thus can be used to modify the variable indirectly, and the
memop simplification would then be unsafe.
The function does this check by assuming that any use by an
instruction other than a load or a store lets the address escape,
which is true.
However, it doesn't take into account that if the variable's address
is used, not as the address of a store, but as the value stored, then
this address also escapes.
Fix this by adding a check that the use of the variable's address
is effectively done as the address of stores & loads and not as
the value stored by the memop.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the patterns in the testcases are evaluated in the shell
script, the backslash used to escape characters special to the
pattern needs itself to be escaped. There are a few cases where
this wasn't done, partly because 'format -l' gave a single
escape in its template.
Fix all occurrences needing this double-escape as well as the
'format -l' template.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'kill-dead-loads' and 'kill-dead-stores' into tip
Each of these branches contains a fix for the missing removal of
value or address usage when unneeded loads or stores are killed
during symbol simplification.
No conflicts, but one of the tests fails in the branches while it
correctly succeeds after they are merged (so no code conflict but
a semantic conflict).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the initial simplify_symbols() and in simplify_memops()
when a store is simplified away, it's killed via kill_store()
where its ->bb is set to NULL and the usage is removed from
the value. However the usage is not removed from the address.
As a consequence, code related to the address calculation
is not optimized away as it should be since the value is
wrongly considered as needed.
Fix this by using kill_instruction_force() to remove these
stores.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Like other instructions producing a value, OP_LOADs can be dead.
But currently, dead OP_LOADs are not removed as dead_insn() does
for other instructions.
Fix this by checking at simplification time if OP_LOADs are
dead and calling kill_instruction() if it is the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Converted loads are dead and can be removed, but that
also means that the address usage needs to be adjusted,
which wasn't done.
Fix this by directly using kill_instruction().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Before the introduction of OP_SETFVAL, floating-point literals
were created via OP_SETVAL, whose CSE is done by comparing the
pointer of the corresponding expression without any
interpretation of this pointer.
As a consequence, even if two OP_SETVALs have two identical
expressions (values), in most cases the corresponding pointers
are not identical, completely inhibiting the CSE of OP_SETVALs.
Fix the CSE of floating-point literals by directly using
the value given by the new OP_SETFVAL.
Note: to respect some of the subtleties of floating-point,
the equality comparison of two literals is not done on
the floating-point value itself but bit-by-bit on its
binary representation (as such we can continue to make the
distinction between +0.0 & -0.0, handle NaNs, ...).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
OP_SETVAL is used to create floating-point and string literals
as well as labels-as-values. This multi-purpose aspect
sometimes makes things a bit more complicated.
Change this by using a new instruction for the direct
creation of floating-point literals without needing
to have an intermediate EXPR_FVALUE.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'size-unsized-arrays' and 'master' into tip
|
|
Since commit "1c182507c (fix support of floating-point compare)",
CSE wasn't done anymore on floating-point compare.
Fix this by adding the two missing 'case OP_FPCMP ...'
Fixes: 1c182507c3981aa20193c68d7cfd32d750b571cf
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A few testcases had typos in their 'check-...' tags or
the tag was plainly invalid.
Fix them in accordance with the doc.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Comparison of floating-point values can't be done
like that of integral values because of the possibility of
NaNs, which can't be ordered with normal values or even between
themselves.
The real difference appears once there is any "reasoning"
done with the result of the comparison. For example, once NaNs
are taken into account, "!(a < b)" and "(a >= b)" are not the same.
In fact, the usual comparison operators must be reinterpreted
as implicitly first testing if either operand is a NaN
and returning 'false' if it is the case. Thus "a < b" becomes
"!isnan(a) && !isnan(b) && (a < b)".
If we need to negate the comparison we get "!(a < b)", which
naturally becomes "isnan(a) || isnan(b) || (a >= b)".
We thus need two sets of operators for comparison of floats:
one for the "ordered" values (only true if neither operand
is a NaN) and one for the "unordered" values (also true if either
operand is a NaN). A negation of the comparison switches from one
set to the other.
So, introduce another set of instructions for the comparison
of floats.
Note: the C standard requires that:
*) "x == x" is false if x is a NaN,
*) "x != x" is true if x is a NaN,
and this is coherent with "x != x" <-> "!(x == x)".
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
OP_INLINED_CALLs are there only as a sort of annotation
for debugging purposes.
Their presence should thus not block the packing of
basic blocks.
Fix this by ignoring OP_INLINED_CALL when trying to pack
a basic block.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently only commutative instructions are canonicalized
(the "simpler" operand, often a constant, is forced, if present,
to be the second operand). This improves CSE (more cases are
considered as equivalent) and helps to reduce the number of patterns
to be handled at simplification.
Do this also for compare instructions, since in this case we can
swap the order of the operands if at the same time we also swap
the 'direction' of the comparison.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'testcases-mem2reg' into tip
|
|
This may seem like a missing-optimization problem but in truth,
the root cause is a problem with SSA conversion.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'fix-volatile-simplification', 'struct-asm-ops', 'restricted-pointers', 'fix-f2i-casts', 'symaddr-description', 'flush-stdout' and 'diet-simple' into tip
|
|
Some branch simplifications are only valid if the branch is free
of side-effects. The check is done in flow.c:bb_has_side_effects()
but currently doesn't take into account the fact that a volatile load
is also a side-effect.
Fix this by adding the appropriate check.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Note: there is still no semantic associated with 'restrict'
but this is a preparatory step.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'fix-missing-reload', 'fix-insert-branch', 'fix-NULL-type', 'testsuite-clean', 'fix-bitfield-init-v3' and 'fdump-linearize' into tip
* fix missing reload
* fix boolean context for OP_AND_BOOL & OP_OR_BOOL
* fix bitfields implicit zero initializer.
* fix: kill old branch in insert_branch()
* returns the correct type when evaluating NULL
* use internal size for __INT_MAX__ & friends
* testsuite: cleanup result files
* add support for '-fdump-linearize[=only]'
* add support for '-fmem-report'
* add support for '-dD'
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Current simplifications of 'x && 1 --> x' and its dual
'x || 0 --> x' are wrong because the '||' and '&&' operators
demand that their operands are first compared against zero
which then always give a boolean valued result.
For example: '3 && 1' is not equal to '3' but to '1' (or 'true').
The correct simplification is thus 'x && 1 --> x != 0' and
'x || 0 --> x != 0'.
Fix this by always first doing the comparison against zero
before generating the OP_AND_BOOL and OP_OR_BOOL instructions.
Note: of course, we could decide that the semantics of OP_AND_BOOL
and OP_OR_BOOL is that these ops themselves take care of
establishing a boolean context (which was, I think, why these
ops were created), but then these simplifications cannot be
done (or, when they are done, we need to add the comparison
against zero).
Fixes: b85ec4bb7f5b1c522d7c71782dbd9cf1c4c49b2f
Fixes: a0886db12307d2633b04ec44342099a2955794a5
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Simple if-then-else expressions can often be
replaced by an OP_SELECT. This is done in a very
generic way by looking not at the test/if part but
at OP_PHIs which contain exactly 2 values.
An implementation detail means that the address
of a valid OP_PHI's sources must not change, so the
list holding these values must not be repacked or the like.
As a consequence, when a value needs to be removed from the
list, this value is simply replaced in the list by a VOID.
These VOIDs must then be ignored when processing OP_PHI's
sources.
But for the if-conversion, these VOIDs are not ignored
and are thus considered like any other value, which in turn
causes some optimization opportunities to be missed.
For example code like:
int foo(int a)
{
if (a)
return 0;
else
return 1;
return 2;
}
will be linearized into something like:
...
phi.32 %r2 <- %phi1, %phi2, VOID
ret.32 %r2
where %phi1 & %phi2 correspond to the 'return 0' & 'return 1'
and the VOID correspond to the dead 'return 2'.
This code should trivially be converted to a select
but is not because the OP_PHI is considered as having 3
elements instead of 2.
Worse, if at some point we had a phi list with:
phi.32 %r2 <- %phi1, VOID
then the optimization would trigger, but the corresponding
code makes the assumption that the pseudos are the target
of an OP_PHISOURCE instruction, which VOIDs, of course,
are not, and a crash would ensue.
Fix this by filtering out these VOIDs before checking the
conditions of the if-conversion.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
By definition, a commutative operation gives the same result if its
operands are exchanged.
CSE can use this property when comparing instructions to find
more equivalent instructions and thus eliminate more of them.
Implement this by special-casing commutative ops in insn_compare().
Note: this comes for free when all instructions are properly
canonicalized, which is not yet the case.
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
The operators '||' and '&&' being idempotent, the expressions
'(x || x)' and '(x && x)' can be simplified to a test against zero.
Note: they could even be replaced by 'x' itself but only if 'x' is
already a boolean expression/has already been tested against zero.
If that is the case, the redundant test will be optimized away
in further steps.
For example, test-linearize on the following code:
int ior(int a) { return a || a; }
emitted the following instructions:
or-bool.32 %r3 <- %arg1, %arg1
after the patch, it now emits:
setne.32 %r3 <- %arg1, $0
which is easier to combine with others simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Expressions involving equality testing against zero are ubiquitous
and can often be simplified with previous comparisons.
For example, when using test-linearize on the following code:
_Bool foo(int a) { return !(a < 3); }
the following was emitted:
setlt.32 %r2 <- %arg1, $3
seteq.32 %r3 <- %r2, $0
setne.1 %r4 <- %r3, $0
ret.1 %r4
but this can be simplified into:
setge.1 %r4 <- %arg1, $3
ret.1 %r4
Implement this simplification and add associated test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Most binops can be simplified when their two operands are
identical. For example '(x ^ x)' can be simplified into '0'.
The cases '(x / x)' and '(x % x)' are not simplified since this
is correct only when x is not zero.
The cases '(x || x)' and '(x && x)' are not simplified because
it's only correct when the operands are booleans/have already
been compared against zero, and the linearization doesn't enforce that.
This patch adds the code for these simplifications as well as
their corresponding test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
There are simplifications for:
(x && 0) => 0
(x && 1) => x
(x || 0) => x
but the fourth case '(x || 1)' is missing.
This patch adds the missing simplification and a small test case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Currently those double operations are not simplified.
This patch adds those simplifications and some small test cases.
Note: the 'boolean not': '!(!x)' is not handled by this patch
because this operator is processed differently (it doesn't generate
an unop instruction but directly generates 'seteq' operations).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
For completeness, add the simplification for modulo, dual to
'x / 1 => x': 'x % 1 => 0'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
A previous patch added the simplification for multiply by -1
but we can do the same for the signed divide.
This patch adds this simplification.
Also add the corresponding test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Currently we simplify multiplication by 1 but nothing is
done for multiplication by -1 which is equivalent to the
negation of its first operand.
This patch adds this simplification.
Also add small test cases showing the simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Currently we simplify multiplication by 1, but nothing is done
for the similar division by 1.
This patch adds the missing simplification together with
its test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
This patch contains no functional changes.
It just moves the code for simplification of OP_MUL{U,S} with
constant operands in its own function in preparation for some
additional simplifications coming in the same serie.
Also add some test cases for the concerned simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|