| Age | Commit message (Collapse) | Author | Files | Lines |
|
This is another simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is a simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A truncation followed by a zero-extension to the original size,
which is produced when loading a storing bitfields, is equivalent
to a simple AND masking. Often, this AND can then trigger even
more optimizations.
So, replace TRUNC + ZEXT instructions by the equivalent AND.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Let's suppose we have a few instructions: A, B, C & D,
all congruent insn_compare()
The current way the CSE is done is:
The first try will be try_to_cse(A,B). If this succeeds,
A & B have been merged (AB) and the next try will be done
with the next instruction:
try_to_cse(AB,C).
That's good. However, if it fails, the next try will be:
try_to_cse(A,C).
If this one also fails, the next one will be:
try_to_cse(A,D)
And if this one also fails, nothing else is done.
In other words, all attempts are done with A. If it happens that
A can't be eliminated (because it doesn't dominate B, C & D and
is not dominated by them), no eliminations are done because the
other pairs, not involving A are never tried.
Ideally, we should try all possible pairs: A-B, A-C & A-D but
also B-C, B-D & C-D. However this is quadratic and can be a bit
costly. An easy & cheap way to improve the current situation is,
in case of failure, to not reuse the original first instruction
but the other one. So instead of testing:
A-B, A-C, A-D, ...
we will test:
A-B, B-C, C-D, ...
with the result that an 'bad' instruction can't block anymore
the following pairs.
So, when try_to_cse() fails, do not retry by keeping the first
instructions, but retry using the second one.
In practice, this seems to work fairly well.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the test the offset is the same for dst & src and thus
it's calculation should be CSEed away but it is not (yet).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Casts to integer used to be done with only 2 instructions:
OP_CAST & OP_SCAST.
Those are not very convenient as they don't reflect the real
operations that need to be done.
This patch specialize these instructions in:
- OP_TRUNC, for casts to a smaller type
- OP_ZEXT, for casts that need a zero extension
- OP_SEXT, for casts that need a sign extension
- Integer-to-integer casts of the same size are considered as
a NOPs and are, in fact, never emitted.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike. This is
simple but rather unconvenient in later phases as this
correspond to different operations that obeys to different
rules and which later need extra checks.
Change this by using a specific instructions (OP_UTPTR) for
[unsigned] integer to pointers.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike.
This is simple but rather unconvenient as it correspond to
different operations that obeys to different rules and
which later need extra checks.
Change this by using a specific instructions (OP_UTPTR) for
unsigned integer to pointers.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, casts from floats to integers are processed like
integers (or any other type) to integers. This is simple but
rather unconvenient as it correspond to different operations
that obeys to different rules and which later need extra checks.
Change this by directly using specific instructions:
- FCVTU for floats to unsigned integers
- FCVTS for floats to signed integers
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, all casts to a floating point type use OP_FPCAST.
This is maybe simple but rather uncovenient as it correspond
to several quite different operations that later need extra
checks.
Change this by directly using different instructions for the
different cases:
- FCVTF for float-float conversions
- UCVTF for unsigned integer to floats
- SCVTF for signed integer to floats
and reject attempts to cast a pointer to a float.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* merge the tests about implicit & explicit casts in a
single file as there was a lot of redundancy.
* shuffle the tests to linear/ or optim/
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A integer-to-float cast of a constant is currently simplified away
as if it is an integer-to-integer cast. That's bad.
Fix this by refusing to simplify away any integer-to-float casts
like already done for float-to-integer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A integer-to-float cast of a constant is currently simplified away
as if it is an integer-to-integer cast. That's bad.
Create a testcase for it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some non-void functions in the testcases miss a return.
Add the missing return or make the function as returning void.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During the linearization of a function, returns are directly
linearized as phi-sources and the exit BB contains the
corresponding phi-node and the unique OP_RET.
There is also a kind of optimization that is done if there is
only a single a return statement and thus a single phi-source:
the phi-source and the phi-node is simply ignored and the
unique value is directly used by the OP_RET instruction.
While this optimization make sense it also has some cons:
- the phi-node and the phi-source are created anyway and will
need to be removed during cleanup.
- the corresponding optimization need to be done anyway during
simplification
- it's only a tiny special case which save very litte.
So, keep things simple and generic and leave this sort of
simplification for the cleanup/simplification phase.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A select instruction like:
select r <- x, 0, x
always gives zero as result but the optimizer doesn't this.
Change this by teaching the optimizer about it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Normal instruction simplification & CSE must not be done
on dead block (otherwise it's possible to have unsound
situations like having an instruction defining its own
operand with possible infinite loops as consequence).
This is insured by the main optimization loop but not after
BB packing or flow simplification.
Fix this by calling kill_unreachabe_bbs() after BB packing
and flow simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add some tests showing missed optimization opportunities.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
REPEAT_SYMBOL_CLEANUP is set when there is changes to the
addressability of a symbol, typically when a OP_SYMADDR is removed.
However, currently most OP_SYMADDRs are 'simplified'/folded into
their use. For example:
symaddr.64 %r1 <- var
add.64 %r2 <- %r1, $4
is simplified into:
add.64 %r2 <- var, $4
One of the bad consequences of this 'simplification' is that
if the 'add' instruction is later optimized away, this correspond
to an effective change to the addressability of the symbol. This
is exactly as if the 'symaddr' has been removed before being so
'simplified', but because the symaddr is not there anymore
there is no change recorded to the addressability to the symbol
and some further optimizations may be missed.
Change that by checking at each time the usage is removed from an
instruction if the corresponding pseudo was a symbol and set
REPEAT_SYMBOL_CLEANUP if it was the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function address_taken() check that the address of a variable
is not used for anything else than for doing a memory access.
If the function fails, it means that the address can escape and
thus can be used to modify the variable indirectly and the
memop simplification would then be unsafe.
The function do this check by assuming that any uses by an instruction
other than a load or a store can escape the address, which is true.
However it doesn't take in account that if the variable's address is
used, not as the address of a store, but as the value stored, then
this address also escape.
Fix this by adding a check that the use of the variable's address
is effectively done as the address of stores & loads and not as
the value stored by the memop.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the patterns in the testcases are evaluated in the shell
script, the backslash used to escape characters special to the
pattern need itself to be escaped. Theer is a few cases where
it wasn't done so, partly because 'format -l' gave a single
escape in its template.
Fix all occurences neededing this double-escape as well as the
'format -l' template.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'kill-dead-loads' and 'kill-dead-stores' into tip
Each of these branches contains a fix for the missing removal of
value or address usage when unneeded loads or stores are killed
during symbol simplification.
No conflicts but one of the tests fails in the branches while it
correctly succeed after they are merged (so no code conflict but
a semantic conflict).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the initial simplify_symbols() and in simplify_memops()
when a store is simplified away, it's killed via kill_store()
where its ->bb is set to NULL and the usage is removed from
the value. However the usage is not removed from the address.
As consequence, code related to the address calculation
is not optimized away as it should be since the value is
wrongly considered as needed.
Fix this by using kill_instruction_force() to remove these
stores.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Like others instructions producing a value, OP_LOADs can be dead.
But currently, dead OP_LOAD are not removed as dead_insn() do
for others instructions.
Fix this by checking at simplification time if OP_LOADs are
dead and call kill_instruction() if it is the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Converted loads are dead and can be removed but that
also means that the address usage need to be adjusted
which wasn't done.
Fix this by directly using kill_instruction().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Before the introduction of OP_SETFVAL, floating-point were
created via OP_SETVAL whose CSE is done by comparing the
pointer of the corresponding expression without any
interpretation of this pointer.
As consequence, even if two OP_SETVAL have two identical
expressions (value), in most cases the corresponding pointers
are not identical, completly inhibiting the CSE of OP_SETVALs.
Fix the CSE of floating-point literals by directly using
the value given by the new OP_SETFVAL.
Note: to respect some of the subtilities of floating-point,
the equality comparison of two literals is not done on
the floating-point value itself but bit-by-bit on its
binary representation (as such we can continue to make the
distinction between +0.0 & -0.0, handle NaNs, ...).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
OP_SETVAL is used to create floating-point and string
as well as labels-as-values. This multi-purpose aspect
sometimes make things a bit more complicated.
Change this by using a new instruction for the direct
creation of floating-point literals without needing
to have an intermediate EXPR_FVALUE.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'size-unsized-arrays' and 'master' into tip
|
|
Since commit "1c182507c (fix support of floating-point compare)",
CSE wasn't done anymore on floating-point compare.
Fix this by adding the two missing 'case OP_FPCMP ...'
Fixes: 1c182507c3981aa20193c68d7cfd32d750b571cf
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A few testcases had typos in their 'check-...' tags or
the tag was plainly invalid.
Fix them in accordance to the doc.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Comparision of floating-point values can't be done
like for integral values because of the possibility to have
NaNs which can't be ordered with normal values or even between
themselves.
The real difference appears once there is any "reasoning"
done with the result of the comparison. For example, once NaNs
are taken in account: "!(a < b)" and "(a >= b)" are not the same.
In fact the usual comparison operators must be reinterpreted
as implicitely first testing if any of the operand is a Nan
and return 'false' if it is the case. Thus "a < b" becomes
"!isnan(a) && !isnan(b) && (a < b)".
If we need to negate the comparison we get "!(a < b)" which
naturally becomes "isnan(a) || isnan(b) || (a >= b)".
We thus need two sets of operators for comparison of floats:
one for the "ordered" values (only true if neither operand
is a Nan) and one for the "values" (also true if either
operand is a NaN). A negation of the comparison switch from one
of the set to the other.
So, introduce another set of instructions for the comparison
of floats.
Note: the C standard requires that:
*) "x == x" is false if x is a NaN,
*) "x != x" is true if x is a NaN,
and this is coherent with "x != x" <-> "!(x == x)".
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
OP_INLINED_CALL are there only as a sort of annotation
for debugging purpose.
Their presence should thus not block the packing of
basic blocks.
Fix this by ignoring OP_INLINED_CALL when trying to pack
a basic block.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently only commutative instructions are canonicalized
(the "simpler" operands, often a constant, is forced, if present
to be in the second operand). This improve CSE (more cases are
considered as equivalent) and help to reduce the number of "pattern"
to be handled at simplification.
Do this also for compare instructions since in thsi case we can
swap the order of the operands if at the same time we also swap
the 'direction' on the comparison.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'testcases-mem2reg' into tip
|
|
This may seems as an missing optimization problem but in truth,
the root cause is a problem with SSA conversion.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'fix-volatile-simplification', 'struct-asm-ops', 'restricted-pointers', 'fix-f2i-casts', 'symaddr-description', 'flush-stdout' and 'diet-simple' into tip
|
|
Some branch simplifications are only valid if the branch is free
of side-effects. The check is done in flow.c:bb_has_side_effects()
but currently deosn't take in account the fact that a volatile load
is also a side-effect.
Fix this by adding the appropriate check.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Note: there is still no semantic associated with 'restrict'
but this is a preparatory step.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'fix-missing-reload', 'fix-insert-branch', 'fix-NULL-type', 'testsuite-clean', 'fix-bitfield-init-v3' and 'fdump-linearize' into tip
* fix missing reload
* fix boolean context for OP_AND_BOOL & OP_OR_BOOL
* fix bitfields implicit zero initializer.
* fix: kill old branch in insert_branch()
* returns the correct type when evaluating NULL
* use internal size for __INT_MAX__ & friends
* testsuite: cleanup result files
* add support for '-fdump-linearize[=only]'
* add support for '-fmem-report'
* add support for '-dD'
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Current simplifications of 'x && 1 --> x' and its dual
'x || 0 --> x' are wrong because the '||' and '&&' operators
demand that their operands are first compared against zero
which then always give a boolean valued result.
For example: '3 && 1' is not equal to '3' but to '1' (or 'true').
The correct simplification is thus 'x && 1 --> x != 0' and
'x || 0 --> x != 0'.
Fix this by always first doing the comparison against zero
before generating the OP_AND_BOOL and OP_OR_BOOL instructions.
Note: of course, we could decide that the semantic of OP_AND_BOOL
and OP_OR_BOOL is that these ops take care themselves of
making a boolean context (which was, I think, why these
ops were created) but then these simplifications cannot be
done (or when they are done, we need to add the comparison
against zero).
Fixes: b85ec4bb7f5b1c522d7c71782dbd9cf1c4c49b2f
Fixes: a0886db12307d2633b04ec44342099a2955794a5
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Simple if-then-else expressions can often be
replaced by an OP_SELECT. This is done in a very
generic way by looking not at the test/if part but
at OP_PHIs which contain exactly 2 values.
An implementation detail makes that the address
of valid OP_PHI's sources must not change, so the
list holding these values must not be repacked and such.
As consequence, when a value need to be removed from the
list this value is simply replaced in the list by a VOID.
These VOIDs must then be ignored when processing OP_PHI's
sources.
But for the if-conversion, these VOID are not ignored
and are thus considered like any other value which in turn
make miss some optimization opportunities.
For example code like:
int foo(int a)
{
if (a)
return 0;
else
return 1;
return 2;
}
will be linearized into something like:
...
phi.32 %r2 <- %phi1, %phi2, VOID
ret.32 %r2
where %phi1 & %phi2 correspond to the 'return 0' & 'return 1'
and the VOID correspond to the dead 'return 2'.
This code should be trivially be converted to a select
but is not because the OP_PHI is considered as having 3
elements instead of 2.
Worse, if at some moment we would have a phi list with:
phi.32 %r2 <- %phi1, VOID
then the optimization would trigger but the corresponding
code make the assumption that the pseudos are the target
of an OP_PHISOURCE instruction, which VOIDs, of course,
are not and a crash should ensue.
Fix this by filtering out these VOIDs before checking the
conditions of the if-conversion.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
By definition a commutative operation give the same result if its
operands are exchanged.
CSE can use this property when comparing instructions to find
more equivalent instructions and thus eliminate more of them..
Implement this by special-casing commutative ops in insn_compare().
Note: this come for free when all instructions are properly
canonicalized, which is not yet the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
The operators '||' and '&&' being idempotent, the expressions
'(x || x)' and '(x && x)' can be simplified to a test against zero.
Note: they could even be replaced by 'x' itself but only if 'x' is
already a boolean expression/has already been tested against zero.
If it is the case, the redundant test this will be optimized away
in further steps.
For example, test-linearize on the following code:
int ior(int a) { return a || a; }
emitted the following instructions:
or-bool.32 %r3 <- %arg1, %arg1
after the patch, it now emits:
setne.32 %r3 <- %arg1, $0
which is easier to combine with others simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Expressions involving equality testing against zero are ubiquitious
and can often be simplified with previous comparisons.
For example, when using test-linearize on the following code:
_Bool foo(int a) { return !(a < 3); }
the following was emitted:
setlt.32 %r2 <- %arg1, $3
seteq.32 %r3 <- %r2, $0
setne.1 %r4 <- %r3, $0
ret.1 %r4
but this can be simplified into:
setge.1 %r4 <- %arg1, $3
ret.1 %r4
Implement this simplification and add associated test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Most binops can be simplified when their two operands are
identical. For example '(x ^ x)' can be simplified into '0'.
The cases '(x / x)' and '(x % x)' are not simplified since this
is correct only when x is not zero.
The cases '(x || x)' and '(x && x)' are not simplified because
it's only correct when the operands are booleans/have already
been compared against zero and the linearization don't enforce that.
This patch add the code for these simplifications as well as
their corresponding test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
There is simplifications for:
(x && 0) => 0
(x && 1) => x
(x || 0) => x
but the fourth case '(x || 1)' is missing.
This patch add the missing simplification and a small test case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Currently those double operations are not simplified.
This patch add those simplifications and some small test cases.
Note: the 'boolean not': '!(!x)' is not handled by this patch
because this operator is processed differently (it doesn't generate
an unop instruction but directly generates 'seteq' operations).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
For completeness, add the dual simplification 'x * 1 => x'
for modulo: 'x % 1 => 0'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
A previous patch added the simplification for multiply by -1
but we can do the same for the signed divide.
This patch add this simplification
Also add the corresponding test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Currently we simplify multiplication by 1 but nothing is
done for multiplication by -1 which is equivalent to the
negation of its first operand.
This patch add this simplification.
Also add small test cases showing the simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Currently we simplify multiplication by 1 but nothing
for the similar divide by 1.
This patch add the missing simplification together with
its test cases.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
This patch contains no functional changes.
It just moves the code for simplification of OP_MUL{U,S} with
constant operands in its own function in preparation for some
additional simplifications coming in the same serie.
Also add some test cases for the concerned simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|