aboutsummaryrefslogtreecommitdiffstatshomepage
path: root/cse.c
AgeCommit message (Collapse)AuthorFilesLines
2021-03-21replace add_instruction_to_end() by insert_last_instruction()Luc Van Oostenryck1-9/+1
add_instruction_to_end() and insert_last_instruction() do exactly the same thing but with the arguments in the opposite order. So, replace add_instruction_to_end() by insert_last_instruction(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-11-21add a new instruction for label-as-valueLuc Van Oostenryck1-0/+9
Convert OP_SETVAL of a label into a new instruction: OP_LABEL. There is 2 reasons to do this: *) there is slightly less checking to be done in later phases (since OP_SETVAL can be for labels but also strings) *) OP_SETVAL is CSEd but this is largely useless because this instruction is hashed on the expression's address and these are (most) often not shared. With a separate instruction for label expressions, their CSE is now OK because the hashing is done on the BB. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-25symaddr: s/insn->symbol/insn->src/Luc Van Oostenryck1-9/+2
OP_SYMADDR take a single operand 'symbol' but this instruction is very much like other unops and using the same operand's name allow to avoid some special cases. So, s/symbol/src/ for OP_SYMADDRs. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-25Merge branch 'ssa' into tipLuc Van Oostenryck1-24/+3
* do 'classical' SSA conversion (via the iterated dominance frontier). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-07-01dom: use domtree for bb_dominates()Luc Van Oostenryck1-24/+3
The function bb_dominates() do a dominance query on two BB but do this by walking the CFG tree. Now that we have the dominance tree, we don't need anymore to do this walk, we can use the dom tree itself. This patch just replace the calls bb_dominates() by domtree_dominates() since they have a slightly different interface. Note: in the next version, it'll be better/simpler to rename domtree_dominates() by bb_dominates(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-06-30Merge branch 'cse-cast' into tipLuc Van Oostenryck1-12/+18
* cse: try the next pair instead of keeping the first instruction * cse: compare casts only by kind a size, not by C type.
2018-06-30cse: let equivalent casts hash & compare identicallyLuc Van Oostenryck1-12/+16
Now that cast instructions are more finely grained, it's not needed to compare the original type of the casts, only the original size can matter. So, do not hash & compare the original types but only the orignal sizes. This allow much more casts instructions to be CSEed away. Note: like noted in the code, even the original size shouldn't matter as identical sources should implies identical original sizes but this can't yet be guaranted. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-06-28bool: remove OP_{AND,OR}_BOOL instructionsLuc Van Oostenryck1-3/+1
Now that these instructions are not generated anymore, we can remove all related code, defines and doc. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-06-26cse: move to next comparable instructionLuc Van Oostenryck1-0/+2
Let's suppose we have a few instructions: A, B, C & D, all congruent insn_compare() The current way the CSE is done is: The first try will be try_to_cse(A,B). If this succeeds, A & B have been merged (AB) and the next try will be done with the next instruction: try_to_cse(AB,C). That's good. However, if it fails, the next try will be: try_to_cse(A,C). If this one also fails, the next one will be: try_to_cse(A,D) And if this one also fails, nothing else is done. In other words, all attempts are done with A. If it happens that A can't be eliminated (because it doesn't dominate B, C & D and is not dominated by them), no eliminations are done because the other pairs, not involving A are never tried. Ideally, we should try all possible pairs: A-B, A-C & A-D but also B-C, B-D & C-D. However this is quadratic and can be a bit costly. An easy & cheap way to improve the current situation is, in case of failure, to not reuse the original first instruction but the other one. So instead of testing: A-B, A-C, A-D, ... we will test: A-B, B-C, C-D, ... with the result that an 'bad' instruction can't block anymore the following pairs. So, when try_to_cse() fails, do not retry by keeping the first instructions, but retry using the second one. In practice, this seems to work fairly well. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-06-23cast: specialize integer castsLuc Van Oostenryck1-4/+4
Casts to integer used to be done with only 2 instructions: OP_CAST & OP_SCAST. Those are not very convenient as they don't reflect the real operations that need to be done. This patch specialize these instructions in: - OP_TRUNC, for casts to a smaller type - OP_ZEXT, for casts that need a zero extension - OP_SEXT, for casts that need a sign extension - Integer-to-integer casts of the same size are considered as a NOPs and are, in fact, never emitted. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-06-23cast: specialize cast from pointersLuc Van Oostenryck1-2/+2
Currently all casts to pointers are processed alike. This is simple but rather unconvenient in later phases as this correspond to different operations that obeys to different rules and which later need extra checks. Change this by using a specific instructions (OP_UTPTR) for [unsigned] integer to pointers. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-06-23cast: specialize casts from unsigned to pointersLuc Van Oostenryck1-0/+2
Currently all casts to pointers are processed alike. This is simple but rather unconvenient as it correspond to different operations that obeys to different rules and which later need extra checks. Change this by using a specific instructions (OP_UTPTR) for unsigned integer to pointers. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-02-24remove unneeded cast in calls to free_ptr_list()Luc Van Oostenryck1-1/+1
free_ptr_list() is a macro doing the typechecking. It's thus not only unnneded but counter-productive to cast its argument to the untyped 'struct ptr_list*'. Remove these casts. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-02-24move inner optimization loop into optimize.cLuc Van Oostenryck1-36/+0
The code for the inner optimization loop was in the same file than the one for CSE. Now that the CSE have a well defined interface, we can move this inner loop together with the main optimization loop in optimize.c This move make the code better structured and make it easier to understand the optimization logic and make any experiment or needed changes to it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-02-24expose interface to CSELuc Van Oostenryck1-2/+3
Now that the CSE have a well defined interface, we can make it public, which will help us to untangle the CSE part from the whole optimization logic. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-02-24extract cse_eliminate() from cleanup_and_cse()Luc Van Oostenryck1-7/+13
Currently, the same function doing the CSE's elimination phase also contains the inner part of the optimization loop. Moving the CSE elimination part in a seperate function will allow to make a clear interface for CSE and move this inner optimization loop out of the CSE file. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-02-24cse: untangle simplification & hashingLuc Van Oostenryck1-8/+6
Currently, the simplification of individual instructions is done at the beginnig of the function doing the hashing for CSE. Separate the two help a little bit to understand the code there and making any changes if needed. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-02-20no need for signed & unsigned multiplicationLuc Van Oostenryck1-2/+2
Currently, we have OP_MULS & OP_MULU but unless it's full, widening multiplication both must give exactly the same result (the world run on 2's complement CPUs now, right?). Also, the IR doesn't have widening multiplication but only instruction where both operands and the result have the same size. So, since theer is no reasons to keep 2 instructions, merge OP_MULS & OP_MULU into a single one: OP_MUL. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-01-16CSE: support CSE of floating-point literalLuc Van Oostenryck1-0/+11
Before the introduction of OP_SETFVAL, floating-point were created via OP_SETVAL whose CSE is done by comparing the pointer of the corresponding expression without any interpretation of this pointer. As consequence, even if two OP_SETVAL have two identical expressions (value), in most cases the corresponding pointers are not identical, completly inhibiting the CSE of OP_SETVALs. Fix the CSE of floating-point literals by directly using the value given by the new OP_SETFVAL. Note: to respect some of the subtilities of floating-point, the equality comparison of two literals is not done on the floating-point value itself but bit-by-bit on its binary representation (as such we can continue to make the distinction between +0.0 & -0.0, handle NaNs, ...). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2017-12-21fix: restore CSE on floating-point comparesLuc Van Oostenryck1-1/+3
Since commit "1c182507c (fix support of floating-point compare)", CSE wasn't done anymore on floating-point compare. Fix this by adding the two missing 'case OP_FPCMP ...' Fixes: 1c182507c3981aa20193c68d7cfd32d750b571cf Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2017-11-18add support of floating-point specific arithmetic opsLuc Van Oostenryck1-0/+14
Floating-point arithmetic is quite different from the arithmetic on integers or the one of real numbers. In particular, most transformations, simplifications that can be done on integers are invalid when done on floats. For example: - associativity doesn't hold - distributivity doesn't hold - comparison is tricky & complex This is because (among others things): - limited precision, rounding everywhere - presence of signed zeroes - presence of infinities - presence of NaNs (signaling or quiet) - presence of numbers without inverse - several kind of exceptions. Since they don't follow the same rules as their integer counterpart, better to give them a specific opcode instead of having to test the type of the operands at each manipulation. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2017-07-31fix ptrlist corruption while killing unreachable BBsLuc Van Oostenryck1-0/+2
In commit (51cfbc90a5 "fix: kill unreachable BBs after killing a child") kill_unreachable_bbs() was called during simplification in order to avoid creating fake cycles between phis and issuing bogus "crazy programmer" warnings. However, simplification is done via cse.c:clean_up_insns() which is looping over all BBs while kill_unreachable_bbs() loops over all BBs *and* can delete some of them which may corrupt the looping in clean_up_insns(). Fix this by: 1) refuse to emit the "crazy programmer" warning if there is a potential dead BB 2) move kill_unreachable_bbs() in the main cleanup loop which will avoid nested ep->bbs loop. Note: this solution is preferable to some others because the "crazy programmer" condition happens very rarely. It this thus better to delay this check than to call kill_unreachable_bbs() preventively. Note: the reproducer is one with very broken syntax but nothing forbid the same situation to happen with a valid program. Fixes: 51cfbc90a5e1462fcd624a1598ecd985a508a5d6 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2017-02-23CSE: avoid hashing removed instructionsLuc Van Oostenryck1-0/+2
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Christopher Li <sparse@chrisli.org>
2017-02-23CSE: use commutativity to identify equivalent instructionsLuc Van Oostenryck1-8/+14
By definition a commutative operation give the same result if its operands are exchanged. CSE can use this property when comparing instructions to find more equivalent instructions and thus eliminate more of them.. Implement this by special-casing commutative ops in insn_compare(). Note: this come for free when all instructions are properly canonicalized, which is not yet the case. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Christopher Li <sparse@chrisli.org>
2017-02-16use kill_instruction() when killing any instructions during CSELuc Van Oostenryck1-6/+1
Instructions removed during CSE had their ->bb simply set to NULL and the usage of their operand was not adjusted. Fix that by calling kill_instruction(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Christopher Li <sparse@chrisli.org>
2017-02-16use kill_instruction() when killing an OP_PHI during CSELuc Van Oostenryck1-9/+1
cse_one_instruction() contains some code to be executed when an OP_PHI is eliminated. This code is roughly equivalent to what is done in kill_instruction() for OP_PHI but not complete since it doesn't recursively try to kill others instructions which are used only by this OP_PHI. Fix this and avoid duplicated code by replacing this code by a call to kill_instruction(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Signed-off-by: Christopher Li <sparse@chrisli.org>
2011-08-27cse: update PHI users when throwing away an instructionKamil Dudka1-0/+13
Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamil Dudka <kdudka@redhat.com> Signed-off-by: Christopher Li <sparse@chrisli.org>
2011-08-27cse: treat PHI-nodes as other instructionsKamil Dudka1-7/+0
Without this patch test-linearize fails on a simple example: static void test(int i) { while (i) { if (i) test(0); i++; } } It generates a conditional jump depending on an uninitialized value which is obviously not in the input code. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Kamil Dudka <kdudka@redhat.com> Signed-off-by: Christopher Li <sparse@chrisli.org>
2007-07-29cse: Size insn_hash_table more realistically, speeding up CSE significantlyJosh Triplett1-1/+1
Profiling showed that Sparse spent much of its time doing CSE, which involved traversing every bucket of insn_hash_table repeatedly. insn_hash_table had 64K buckets, but never more than a few dozen entries. Shrink insn_hash_table to 256 entries. This makes Sparse 2-4 times faster for me. Signed-off-by: Josh Triplett <josh@freedesktop.org>
2007-03-09Fix typos in commentsJosh Triplett1-4/+4
Signed-off-by: Josh Triplett <josh@freedesktop.org>
2005-09-24Do stupid and crappy CSE on casts.Linus Torvalds1-0/+25
It's better than nothing, but I need to re-do cast linearization some day. Right now we squirrell away the whole original type, but the fact is, we only need to save away a few bits of "type of cast". We need to know if it's sign-extending, and whether the source or destination are floating point values. And the source size, of course. But we don't really need to know the detailed original type. Oh, well. This makes at least trivial things look much better.
2005-04-07[PATCH] using 0 as NULL in sparseChristopher Li1-3/+3
This trivial patch remove the warning when I try to compile sparse in with sparse. Another warning I did not fix is: ../pre-process.c:554:18: warning: bad constant expression The offending line is: struct arg args[nargs];
2005-04-07Use the new per-instruction position information for betterLinus Torvalds1-1/+1
error reporting.
2005-04-07Split the binops where signedness matters into unsigned and signed.Linus Torvalds1-6/+10
This is OP_MUL/OP_DIV/OP_MOD/OP_SHR. We actually do the constant simplifications still wrong, but now the information is all there.
2005-04-07Split OP_SETVAL into OP_SETVAL (fp expressions and labels) and OP_SYMADDRLinus Torvalds1-2/+8
(symbol addresses). They are pretty different. Symbol addresses have special meaning during various phases, from symbol simplification to CSE.
2005-04-07Remove phi source merging.Linus Torvalds1-17/+1
It looked good on paper. Not so good when actually trying to generate code.
2005-04-07Expose "show_instruction()" for debugging.Linus Torvalds1-2/+0
We already did this, we just didn't have a visible prototype.
2005-04-07Be more aggressive on PHI-node CSE.Linus Torvalds1-3/+10
We don't care _where_ a PHI-node is, since the real location is at the phi sources. So unlike other instructions, we do not bother with any dominance analysis - we can just combine them blindly across basic block borders. It might make sense to totally disconnect the PHI nodes from the flow graph to make this lack of positional information even more explicit. That might also allow us to combine more basic blocks.
2005-04-07Make OP_PHISOURCE track the OP_PHI instructions that it defines.Linus Torvalds1-4/+2
This allows us to always see which pseudos are nonlocally affected by the phi source. We can only do this after the instruction flow is fixed, together with the OP_DEATHNOTE phase.
2005-04-07Yet another missed "entry" change point.Linus Torvalds1-1/+1
Umm. The compiler even warned about it, I don't see how I could miss it. I'm a maroon.
2005-04-07Remove OP_SETCC, make OP_SEL bigger instead.Linus Torvalds1-6/+10
This was originally done so that "struct instruction" would be smaller, but it has since grown anyway (for the memops), so splitting OP_SEL into two instructions no longer makes sense, and makes it harder on CSE.
2005-04-07Don't try to share parenthood fn between phi node removal and Linus Torvalds1-1/+15
regular CSE. The CSE case is totally different: since both children use the same pseudos, there can be no question that a common parent wouldn't have that pseudo live.
2005-04-07Start using instruction sizes properly.Linus Torvalds1-1/+3
We should drop symbol types by now, and base things on just raw sizes. Anything else just doesn't work from a CSE standpoint. Some things may care about actual types later, notably the type- based alias analysis. Those types still exist in the cast nodes. Also, make the printout be more readable.
2005-04-07Oops. Don't try to CSE OP_SEL, at least not until we learn toLinus Torvalds1-2/+2
CSE it together with the OP_SETCC that goes with it ;)
2005-04-07Do CSE over children with trivially common parents.Linus Torvalds1-1/+21
Hey, it's simple, and it happens.
2005-04-07Allow CSE to run after bb packing. Linus Torvalds1-1/+5
...but we can't merge phi-sources after packing. Or rather, we could, but we'll have to be more careful about not re-ordering them, which we aren't. Sort it out later.
2005-04-07Add "memop" simplification phase.Linus Torvalds1-1/+2
It's a bit more simple-minded than the symbol simplification, and it can be more costly. So we start off with the (cheap) symbol simplification algorithm, and then use this new more generic phase later. This allows us to remove extra loads (and, in theory, stores, but the dead store elimination is so simple-minded right now that it's effectively useless).
2005-04-07Be more careful about insn->bb pointers.Linus Torvalds1-0/+1
Make sure to clear them when turning operations into NOP's. And when doing OP_LOAD->OP_PHI conversion, just re-use the target pseudo, which means that we don't need to unnecessarily move all the uses over.
2005-04-07Make the CSE "repeat" logic be more fine-grained than justLinus Torvalds1-6/+8
repeating CSE itself. In particular, some symbol address simplifications imply that we should repeat symbol simplification, since things may have improved.
2005-04-07Move instruction simplification to new file "simplify.c".Linus Torvalds1-193/+18
Also, allow marking a pseudo unused by setting it to VOID, which just disables further renaming of that use. This is needed to make phi-source instructions (that can be shared among multiple phi-nodes thanks to CSE) be collectable. We used to just mark the phi source dead, but that was wrong, since _another_ phi node might be using it.
2005-04-07Another assert - verify bb validity.Linus Torvalds1-2/+2
2005-04-07Don't change instruction pseudo's when nopping them out.Linus Torvalds1-1/+0
It may be a nice readability thing, but with the usage lists we really mustn't change a location that may be on a list.
2005-04-07Damn. We can't actually sort the phi-node list of phi's nowLinus Torvalds1-21/+4
that we keep track of usage notes. The usage trackign tracks the address of the phi pseudo, so sorting the list would need to update that too. This sucks, because it means we can't turn the phi-nodes into canonical format, and thus not CSE phi-nodes as effectively.
2005-04-07Now that phi sources are just instructions, we can CSE them.Linus Torvalds1-0/+12
They's _slightly_ special in that they depend on their bb, so we can only combine them within one basic block, but that's easily handled by making the bb be part of the hashing and comparison.
2005-04-07Clear phi list when killing a phi-node instructionLinus Torvalds1-1/+3
2005-04-07Oops. Clean up some left-overs from phi removal.Linus Torvalds1-8/+10
The "half-way" point was making the phi be an instruction, and some of that half-way stuff remained.
2005-04-07Remove "struct phi", replace with instruction that generates a pseudo.Linus Torvalds1-24/+27
The instruction contains everything "struct phi" had, namely source bb and pseudo. And generating a new pseudo means that we not only have the pointer to the instruction that defines the phi-node (in pseudo->def), we also end up with automatic phi-node usage tracking (pseudo->users). The changes may look large, but almost all of this is just direct search-and-replace of the old phi-node usage.
2005-04-07Do if-conversion.Linus Torvalds1-0/+101
We can turn trivial phi-nodes into OP_SEL instructions instead, and generally improve code flow. This doesn't remove the empty blocks this results in yet, so it's not quite as dramatic as it could be, but it does the hard work. Empty block removal is trivial.
2005-04-07Re-do CSE if we did a phi-node simplification.Linus Torvalds1-12/+11
2005-04-07We need to pack the phi-list even if we simplify the phiLinus Torvalds1-2/+3
instruction.
2005-04-07Make phi-node normalization (part of CSE) do trivial simplification.Linus Torvalds1-2/+17
2005-04-07Oops. Don't try to CSE the dead instructions.Linus Torvalds1-0/+2
2005-04-07Kill trivially dead instructionsLinus Torvalds1-0/+55
That is - simple instructions with no users.
2005-04-07Make the "cse nop" a bit more informativeLinus Torvalds1-0/+1
Show the source that the thing was replaced with.
2005-04-07Make CSE convert instructions to OP_NOPLinus Torvalds1-2/+2
Instead of using the "convert_load_insn()" that converts them to OP_LNOP's, which ends up confusing instruction printout horribly.
2005-04-07Add simple-stupid dominance testing for CSE.Linus Torvalds1-2/+32
This makes us able to CSE stuff where one subexpression directly dominates another. But we still don't do any fancy code motion in the non- dominance case.
2005-04-07Pack the phi-list after removing duplicates.Linus Torvalds1-0/+3
2005-04-07Add initial CSE passLinus Torvalds1-0/+291
It ain't very smart yet, but give it time. In particular, right now it gathers information for the whole function, but it only does the actual replacement if the instructions are in the same basic block.