|
linearize_store_gen() is complex because it has to handle
the insertion of bitfields.
Isolate this operation in a separate function in order to
facilitate the incoming changes there.
|
|
linearize_load_gen() is complex because it not only has to do
the loading of a value from some address but it also has to
handle the extraction of bitfields.
Isolate this operation in a separate function in order to
facilitate the incoming changes there.
|
|
Some undefined operations, like shifting by an amount bigger
than the size, should not raise a warning during the optimization
phase because the corresponding warning has already been issued
during the expand phase.
Mark the corresponding instructions as tainted and don't warn if
already tainted.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Now that these instructions are not generated anymore,
we can remove all related code, defines and doc.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Now that OP_AND_BOOL and OP_OR_BOOL are always given boolean
operands, they are just a special case of 1 bit OP_AND & OP_OR.
To avoid having to repeat CSE, simplification patterns, etc.,
it is better to generate plain OP_AND & OP_OR instead.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function add_convert_to_bool() was added to give a
boolean context to logical expressions but did this only
for integers.
Fix this for floating-point expressions by adding the proper
comparison to 0.0.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The last instruction of linearize_load_gen() ensures that
loading a bitfield of size N results in an object of size N.
Also, we require that the usual binops & unops use the same type
for their operands and result. This means that before anything
can be done with the loaded bitfield it must first be sign- or zero-
extended in order to match the other operand's size.
The same situation exists when storing a bitfield but there
the extension isn't done. We can thus have some weird code like:
trunc.9 %r2 <- (32) %r1
shl.32 %r3 <- %r2, ...
where a bitfield of size 9 is mixed with a 32 bit shift.
Avoid such mixing of sizes and always zero extend the bitfield
before storing it (since this was the implicitly desired semantic).
The combination TRUNC + ZEXT can then be optimised later into
a simple masking operation.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Also, do not warn about them because the warnings should
already have been given at evaluation time.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Without this, casts to the same non-scalar type would be rejected
and replaced by VOID. Not good.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Casts to integer used to be done with only 2 instructions:
OP_CAST & OP_SCAST.
Those are not very convenient as they don't reflect the real
operations that need to be done.
This patch specializes these instructions into:
- OP_TRUNC, for casts to a smaller type
- OP_ZEXT, for casts that need a zero extension
- OP_SEXT, for casts that need a sign extension
- integer-to-integer casts of the same size are considered
no-ops and are, in fact, never emitted.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently casts from pointers can be done to any integer type.
However, casts to (or from) pointers are only meaningful if
it preserves the value and thus done between same-sized objects.
To avoid having to worry about sign/zero extension while doing
casts to pointers it's good to not have to deal with such casts.
Do this by first doing a cast to an unsigned integer of the same size
as a pointer and then, if needed, doing a cast to the final type.
As such we only have to support pointer casts to unsigned integers
of the same size and, on the other hand, the generic
integer-to-integer casts we need to support anyway.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
It's relatively common to cast a pointer to an unsigned long,
for example to do some bit operations.
It makes much less sense to cast a pointer to an integer smaller
(or bigger) than a pointer.
So, emit a diagnostic for this, under the control of a new
warning flag: -Wpointer-to-int-cast.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike. This is
simple but rather inconvenient in later phases as it corresponds
to different operations that obey different rules and
later need extra checks.
Change this by using a specific instruction (OP_UTPTR) for
[unsigned] integer-to-pointer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently pointer casts from/to void are treated as casts
to/from integer. The rationale being that a void pointer can't
anyway be dereferenced.
Allow this to continue by using a new 'machine type' for void
pointers, which then allows selecting the right type of cast:
OP_PTRCAST or OP_[S]CAST.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently casts to pointers can be done from any integer type.
However, casts to (or from) pointers are only meaningful if value
preserving and thus between objects of the same size.
To avoid having to worry about sign/zero extension while doing
casts to pointers it's good to only have to deal with the value
preserving ones.
Do this by first doing, if needed, a cast to an integer of the same
size as a pointer before doing the cast to a pointer.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike.
This is simple but rather inconvenient as it corresponds to
different operations that obey different rules and
later need extra checks.
Change this by using a specific instruction (OP_UTPTR) for
unsigned integer-to-pointer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, casts from floats to integers are processed like
casts from integers (or any other type) to integers. This is
simple but rather inconvenient as it corresponds to different
operations that obey different rules and later need extra checks.
Change this by directly using specific instructions:
- FCVTU for floats to unsigned integers
- FCVTS for floats to signed integers
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some casts, the ones which don't change the size or the resulting
'machine type', are no-ops.
Directly simplify away such casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, all casts to a floating point type use OP_FPCAST.
This is maybe simple but rather inconvenient as it corresponds
to several quite different operations that later need extra
checks.
Change this by directly using different instructions for the
different cases:
- FCVTF for float-float conversions
- UCVTF for unsigned integer to floats
- SCVTF for signed integer to floats
and reject attempts to cast a pointer to a float.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
As a first step, add a small function which classifies values into
'machine types':
- signed integer
- unsigned integer
- pointer
- floats
and use this to select the appropriate instruction.
Note: this is a preparatory patch and should not yet introduce
any changes in the generated IR.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Everywhere else in the code and comments 'unop' is used as short
for 'unary op'.
So, use 'unop' also for this function.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function add_uniop() needs the instruction's type but this
type was given indirectly via an expression. As a consequence,
this function cannot be used standalone.
Change the function to take the type as argument so that it can
also be used when no expression is available.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
No check is done whether the inc/dec value of context statements
is effectively a compile-time integer value: '0' is silently
used if it is not.
Change that by using get_expression_value() when linearizing
context statements (which has the added advantage to also
slightly simplify the code).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The comment stated that PSEUDO_SYMs have no usage.
However, PSEUDO_SYMs *have* usage that needs to be tracked correctly
(has_use_list() determines which kinds of pseudo have a
usage/use list).
Fix the comment.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During linearization, most kinds of instructions are not generated
if there is no active BB (which means that these instructions can
never be executed).
However, loads & stores are generated anyway. These dead loads
and stores will then need to be removed, which is a bit tricky:
- memops are special and more complex to remove than
instructions like 'add' and such.
- these instructions exist in 'phantom basic blocks': a BB
which has a null bb->ep and which doesn't belong to the
entrypoint's ep->bbs. Such blocks are considered as removed
and are never scanned or anything.
Keep things simple and avoid generating memops in inactive
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During the linearization of a function, returns are directly
linearized as phi-sources and the exit BB contains the
corresponding phi-node and the unique OP_RET.
There is also a kind of optimization that is done if there is
only a single return statement and thus a single phi-source:
the phi-source and the phi-node are simply ignored and the
unique value is directly used by the OP_RET instruction.
While this optimization makes sense it also has some cons:
- the phi-node and the phi-source are created anyway and will
need to be removed during cleanup.
- the corresponding optimization needs to be done anyway during
simplification.
- it's only a tiny special case which saves very little.
So, keep things simple and generic and leave this sort of
simplification for the cleanup/simplification phase.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently bb::context is initialized (to -1) when the
basic block is allocated.
But this field:
1) is only used when using the sparse tools;
2) when used, it's only quite late in the process;
3) this early initialization prevents us from reusing the space
of this field for another purpose, earlier in the process.
Change this by initializing this field much later, by the sparse tool
itself, just before needing it.
The real motivation being, of course, to be able to reuse the space
for some other upcoming field.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
No functional changes here. Just prepare for upcoming changes.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This gives us:
- a clearer name (than alloc_phi())
- more flexibility when we need the instruction and not the pseudo.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add a helper to display labels.
This allows us to:
*) abstract away the details of how to display them
*) gracefully handle the abnormal case of a null label.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In struct instruction, .target is normally used to hold
the result. Its value is thus produced/defined by instructions.
However, OP_COMPUTEDGOTO uses .target as an input. This
creates slight complications for code, like liveness analysis,
which cares about which fields are used and which are defined
by the instructions.
Change this by letting OP_COMPUTEDGOTO use .src for its
operand instead of .target.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some of the IR instructions have been defined but are
never generated.
Remove them as they have no purpose.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
No instructions have an opcode set to OP_[LS]NOP anymore
so we can now remove all remaining traces of these opcodes.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Once linearized, a few essential optimizations are done
but the logic of the succession of these optimizations
is open-coded at the end of linearize_fn().
Obviously, this logic has nothing to do with the
linearization and moving it to a separate file will help
to clarify this logic and change it when needed.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The idea being, of course, to allow some functions to return
a bool, making their possible return values clear.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, we have OP_MULS & OP_MULU but, short of doing full
widening multiplication, both must give exactly the same
result (the world runs on 2's complement CPUs now, right?).
Also, the IR doesn't have widening multiplication but
only instructions where both operands and the result have
the same size.
So, since there is no reason to keep 2 instructions,
merge OP_MULS & OP_MULU into a single one: OP_MUL.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In linearize.c, base_type() is not some kind of generic helper
like get_base_type() but is a more specialized thing used for
loads and stores and giving only the base type of bitfields.
Rename this helper to bitfield_base_type() to make this more
self-explanatory.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'struct access_data' has two members for the type of the access:
'result_type' & 'source_type', the second being the application
of base_type() on the first.
These two names are not very self-explanatory and are confusing
when trying to know which one must be used for loads vs stores.
Also, the distinction between these two types only exists for
bitfields.
Simplify things by using a single type, and use base_type() when
doing the linearization of bitfield accesses.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In show_instruction(), what is done to display OP_SYMADDR's
symbol is very close to what is done to display a PSEUDO_SYM.
In fact, what is done by show_pseudo() is more complete and
more informative.
So simply use show_pseudo() to display this part. This will
also avoid crashes caused by a NULL symbol.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
When displaying a PSEUDO_SYM, the corresponding symbol is
dereferenced but it's possible that this symbol is NULL
when a type error is present.
Fix this by explicitly checking against null ->sym and
emitting "<bad symbol>" if null.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
This warning is issued when trying to linearize a call
expression with a null expr->ctype. However, such null
ctypes are a consequence of earlier errors detected
during evaluation and for which a warning, specific
to the nature of the problem, has already been issued.
In short, this "call with no type" is non-informative
and redundant, so avoid useless noise and remove this
warning.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
OP_SETVAL is used to create floating-point and string literals
as well as labels-as-values. This multi-purpose aspect
sometimes makes things a bit more complicated.
Change this by using a new instruction for the direct
creation of floating-point literals without needing
to have an intermediate EXPR_FVALUE.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
These functions are clearly meant to be used locally.
Furthermore, some have no prototypes.
Make these functions static and remove the prototype
when one was present.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The expression for the condition is dereferenced
for its type even if it is NULL.
Fix this by returning early if the expression linearizes to VOID.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A function call via a function pointer can be written like:
fp(), (fp)() or (*fp)()
In the latter case the dereference is unneeded but legal and
idiomatic.
However, the linearization doesn't handle this unneeded deref
and leads to the generation of a load of the pointer:
int foo(int a, int (*fun)(int))
{
(*fun)(a);
}
gives something like:
foo:
load %r2 <- 0[%arg2]
call.32 %r3 <- %r2, %arg1
ret.32 %r3
This happens because, at linearization, the deref is dropped
but only if the sub-expression is a symbol; the test is not
done for other kinds of nodes.
Fix this by using is_func_type() to test the type of all call
expressions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Use the fact that fn->ctype can't be NULL if expr->ctype
is not NULL.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Floating-point arithmetic is quite different from the
arithmetic on integers or that of real numbers.
In particular, most transformations and simplifications that can
be done on integers are invalid when done on floats.
For example:
- associativity doesn't hold
- distributivity doesn't hold
- comparison is tricky & complex
This is because of (among other things):
- limited precision, rounding everywhere
- presence of signed zeroes
- presence of infinities
- presence of NaNs (signaling or quiet)
- presence of numbers without inverse
- several kinds of exceptions.
Since they don't follow the same rules as their integer
counterparts, better to give them specific opcodes
instead of having to test the type of the operands at
each manipulation.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Comparison of floating-point values can't be done
like for integral values because of the possibility of
NaNs, which can't be ordered with normal values or even between
themselves.
The real difference appears once there is any "reasoning"
done with the result of the comparison. For example, once NaNs
are taken into account, "!(a < b)" and "(a >= b)" are not the same.
In fact the usual comparison operators must be reinterpreted
as implicitly first testing if either operand is a NaN
and returning 'false' if it is the case. Thus "a < b" becomes
"!isnan(a) && !isnan(b) && (a < b)".
If we need to negate the comparison we get "!(a < b)" which
naturally becomes "isnan(a) || isnan(b) || (a >= b)".
We thus need two sets of operators for comparison of floats:
one for the "ordered" values (only true if neither operand
is a NaN) and one for the "unordered" values (also true if either
operand is a NaN). A negation of the comparison switches from
one set to the other.
So, introduce another set of instructions for the comparison
of floats.
Note: the C standard requires that:
*) "x == x" is false if x is a NaN,
*) "x != x" is true if x is a NaN,
and this is coherent with "x != x" <-> "!(x == x)".
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Using the pre or post increment or decrement operator on
floating-point values mixes the addition of a floating-point
value with an *integral* constant 1 or -1.
Fix this by checking if we're dealing with fp or not, and using
the proper fp constants (1.0 or -1.0) if it is the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently the different cases of a switch-statement, or more
exactly the 'struct multijmp' that holds the value of these cases,
accepted only values of 'int' type. Trying to use a wider value
results in the value being truncated, but any integer value should
be valid.
Fix this by using an unsigned 'long long' to hold these values.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
For consistency and for sparse-LLVM which needs it,
give them a type too.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, when a phi-node is converted into an OP_SEL
this instruction is given a size but not a type, but when
created directly it is given a type.
For consistency and for sparse-LLVM which needs it,
always give them a type.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, OP_PHISOURCEs are given a size but not a type.
For consistency and for sparse-LLVM which needs it,
give them a type too.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The linearized code, sparse's IR, has no use for C's complex type
system. Those types are checked in previous phases and the pseudos
don't have a type directly attached to them, as all the needed
typing info is conveyed by the instructions.
In particular, PSEUDO_VALs (used for integer and address constants)
are completely typeless.
There is a problem with this when calling a variadic function
with a constant argument, as in this case there is no type in the
function prototype (for the variadic part, of course) and there is
no defining instruction holding the type of the argument.
Possible but rejected solutions are:
* give a full type to all pseudos
This is silly as, for example 'int 0' and 'unsigned int 0'
are, really, the same constants.
* give simply a size to all pseudos
This seems to be simple enough but *currently* it negatively
impacts CSE (maybe because of some other deficiencies in
the type system, especially for casts).
* give a type to all function arguments via a new instruction
that would mimic argument passing (OP_ARG or OP_PUSH).
This solution is interesting, especially as it gives a placeholder
for real argument passing at code generation time, but:
0) they can be added, if needed, when some code generation
will be done.
1) it creates a superfluous instruction for every argument
of every function call
2) these instructions are essentially copy instructions in
disguise.
So, since the problem only exists for constants used in variadic
arguments (and currently, it's only a problem for sparse-llvm),
the solution selected is to add to OP_CALLs a list holding the
type of all arguments. More exactly, it reuses the field .fntype
which was used to store the type of the function, and changes
it to a list holding the function type *and* the type of all
effective arguments.
Reported-by: Dibyendu Majumdar <mobile@majumdar.org.uk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
OP_INLINED_CALLs are there only as a sort of annotation
for debugging purposes. It is thus wrong to associate
pseudo usage to them (even if the pseudos are the arguments
of the function now inlined).
Fix this by removing the use_pseudo() for each argument.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The value of this pointer is of no use unless you're
using a debugger (or just to see if two things are
identical or not) and its presence produces noise
when comparing the output of two runs for testing.
Change this by issuing it only if 'verbose' is set.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
As the normal result of branch simplification, OP_PHIs
can have some VOIDs in their phi_list, sometimes lots of them.
These lists can't be simplified, compacted or so because the
address of the pseudos is used for the pseudo tracking.
But it's annoying that these VOIDs are displayed by
show_instruction(); it sometimes makes things hard to read.
Change this by omitting to display them (when not verbose).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
as it will be used for dumping the IR not only just after
linearization but after other passes too.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
For experimenting with some optimization and the linearization
process, it's useful to see the raw result, without any kind
of optimization.
This patch tests PASS_OPTIM (-foptim-{enable,disable}) to see
if the whole optimization process can be skipped.
Note: ideally, this should be coupled with -O0 but this
could create problems with the 'sparse' tools when used
on files with an optimization level of 0 (since the sparse
tools depend on the optimization for the context checking
to be accurate, and in general diagnostics also need
at least basic optimization).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
ASM operands have the following syntax:
[<ident>] "<constraint>" '(' <expr> ')'
For some reason, during parsing this is stored
as a sequence of 3 expressions. This has some serious
disadvantages though:
- <ident> doesn't have the type of an expression
- it complicates processing when compared to having a specific
struct for it (need to loop & maintain some state).
- <ident> is optional and stored as a null pointer when not present
which is annoying, for example, if null pointers are used internally
in ptr-lists to mark removed pointers.
Fix this by using a specific structure to store the 3 elements
of ASM operands.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In commit (51cfbc90a5 "fix: kill unreachable BBs after killing a child")
kill_unreachable_bbs() was called during simplification in order
to avoid creating fake cycles between phis and issuing bogus
"crazy programmer" warnings.
However, simplification is done via cse.c:clean_up_insns(),
which loops over all BBs, while kill_unreachable_bbs()
loops over all BBs *and* can delete some of them, which may
corrupt the looping in clean_up_insns().
Fix this by:
1) refusing to emit the "crazy programmer" warning if there
is a potential dead BB
2) moving kill_unreachable_bbs() into the main cleanup loop,
which will avoid nested ep->bbs loops.
Note: this solution is preferable to some others because
the "crazy programmer" condition happens very rarely.
It is thus better to delay this check than to call
kill_unreachable_bbs() preventively.
Note: the reproducer is one with very broken syntax but nothing
forbids the same situation from happening with a valid program.
Fixes: 51cfbc90a5e1462fcd624a1598ecd985a508a5d6
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'fix-missing-reload', 'fix-insert-branch', 'fix-NULL-type', 'testsuite-clean', 'fix-bitfield-init-v3' and 'fdump-linearize' into tip
* fix missing reload
* fix boolean context for OP_AND_BOOL & OP_OR_BOOL
* fix bitfields implicit zero initializer.
* fix: kill old branch in insert_branch()
* returns the correct type when evaluating NULL
* use internal size for __INT_MAX__ & friends
* testsuite: cleanup result files
* add support for '-fdump-linearize[=only]'
* add support for '-fmem-report'
* add support for '-dD'
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The effect of this flag is to dump the IR just after the
linearization, before any simplification, and to stop
further processing if '=only' is given as argument.
The motivation of this flag is of course for debugging,
to be able to inspect the raw result of the linearization,
undisturbed by any simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In struct access_data, the field 'origval' seems to be a
leftover of some previous version and is set but never really used.
Change this by removing this field.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In struct access_data, the field 'alignment' is always
the one present in 'result_type' and is thus completely
redundant.
Change this by removing this field.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The C standard requires that, when initializing an aggregate, all
fields not explicitly initialized shall be implicitly zero-initialized
(more exactly "the same as objects that have static storage duration"
[6.7.9.21]).
Until now sparse didn't do this.
Fix this (when an initializer is present and the object not a scalar)
by first storing zeroes in the whole object before doing the
initialization of each field explicitly initialized.
Note 1: this patch initializes the *whole* aggregate while the
standard only requires that existing fields are initialized.
Thanks to Linus for noticing this.
Note 2: this implicit initialization is not needed if all fields are
explicitly initialized but is done anyway, for the moment.
Note 3: the code simplifies nicely when there is a single field that
is initialized, much less so when there are several.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In struct access_data, the fields: 'bit_offset', 'bit_size'
are always the ones corresponding to the 'result_type' and
are thus completely redundant.
Change this by removing these fields and directly using
the info from the 'result_type' field.
Note: the motivation for this change is the realization that the
initialization of bitfields is buggy because 'bit_size'
is never set for initializers. The bug could be solved by
initializing 'bit_size' & 'bit_offset' but it is much
simpler (and feels safer) to simply use the values from
'result_type'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
insert_branch() is called after an optimization has detected
an opportunity to convert a switch or a conditional branch
into an unconditional branch (for example because the condition
is a constant). It does this by removing the BB's last instruction
and then allocate a new unconditional branch which is added at
the end of the BB. The old instruction is simply discarded.
Since the discarded instruction is one with a condition we must
ensure that the associated usage is also removed (for example,
by calling kill_instruction()).
But currently kill_instruction() is called, just after the call
to insert_branch(), only at a single place. The 4 other places where
insert_branch() is called do nothing with the removed instruction
and its condition's usage. As a consequence, instructions that are
dead are not removed since they still wrongly have a user.
Fix this by adding a call to kill_instruction() in insert_branch()
itself (and make the function description more exact).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Current simplifications of 'x && 1 --> x' and its dual
'x || 0 --> x' are wrong because the '||' and '&&' operators
demand that their operands are first compared against zero,
which then always gives a boolean-valued result.
For example: '3 && 1' is not equal to '3' but to '1' (or 'true').
The correct simplification is thus 'x && 1 --> x != 0' and
'x || 0 --> x != 0'.
Fix this by always first doing the comparison against zero
before generating the OP_AND_BOOL and OP_OR_BOOL instructions.
Note: of course, we could decide that the semantic of OP_AND_BOOL
and OP_OR_BOOL is that these ops take care themselves of
making a boolean context (which was, I think, why these
ops were created) but then these simplifications cannot be
done (or when they are done, we need to add the comparison
against zero).
Fixes: b85ec4bb7f5b1c522d7c71782dbd9cf1c4c49b2f
Fixes: a0886db12307d2633b04ec44342099a2955794a5
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
When simplifying a switch into a simple branch all the now
unused children of the current BB must be removed.
If one of these children becomes orphaned, it is directly
killed (it will need to be killed sooner or later since it is
unreachable).
However, if one of the killed children is the header of a loop
where some variables are updated this may cause problems.
Indeed, by killing the header (which contains the phisrc of
the entry value of the variable) the whole loop may become
unreachable but is not killed yet, OTOH simplification of
the associated OP_PHI may create a cycle which may then be
detected later by simplify_one_memop() which will issue a
"crazy programmer" warning while the programmer was innocent.
This situation can be seen in code like:
int *p;
switch (i - i) { // will be optimized to 0
case 0: // will be the simple branch
return 0;
case 1: // will be optimized away
p = ptr;
do { // will be an unreachable loop
*p++ = 123;
} while (--i);
}
Fix this by calling kill_unreachable_bbs() after having
simplified the switch into a branch. This avoids creating
a cycle because of the removed phisrc in the header and,
as an added benefit, avoids wasting time trying to simplify
BBs that are unreachable.
In addition, it's now useless to call kill_bb() for each of
the switch's removed children, as kill_unreachable_bbs() will
do that too.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'sent/fix-cond-address' and 'careful-concat-user-list' into tip
|
|
When verbose is set to 2 or higher, show_instruction()
also displays removed instructions. Some of these instructions
can have their associated symbol removed and set to NULL,
which will cause a crash with 'test-linearize -vv'.
Fix this by not dereferencing the symbol's details of
removed instructions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
By mistake, a cast to a floating-point pointer was
created as OP_FPCAST instead of OP_PTRCAST.
Fix this by adding the missing 'else'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
OP_BR instructions exist in two quite different flavours:
conditional and unconditional. One has an operand (whose
usage must therefore be cared for and handled in liveness
analysis, ...), the other has not; one has two BB targets,
the other only one.
There is essentially no place in the code where both flavours
are handled the same way. Sometimes both must be handled, each
with its own specificities, but in most cases only one of them
is concerned and we need to filter out the other one.
In both cases this means we need to check which kind we're
dealing with.
There is already a problem with this because there are
several ways to test which kind an OP_BR is, and they
are not exactly equivalent:
1) testing if insn->cond is NULL
2) testing if one of insn->bb_true or ->bb_false is NULL.
There also exists a helper (is_branch_goto()) which does
the second test but which is never used.
It appears that the first test should not be used because
in some cases a conditional OP_BR is changed into
an unconditional one by (amongst other things) setting
its ->cond to VOID. We should thus always use the second
test (which needs two compares with NULL).
This could be corrected in several ways (like changing all
the places where the first test is used, using the helper
everywhere, or never setting ->cond to VOID), but the simplest
is to split them into two separate instructions:
OP_BR & OP_CBR, especially given the fact that in most cases
the OP_BR was first selected by a switch (opcode).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Loading a bitfield correctly takes into account the offset
of the bitfield inside the whole container integer.
But truncating it to the width of the bitfield is not done,
or is done only implicitly (because the correct size is not lost).
For example, with the following code:
struct bfu {
unsigned int a:3;
};
unsigned int get__bfu_a(struct bfu bf) { return bf.a; }
test-linearize gives as output something like:
get__bfu_a:
cast.32 %r2 <- (3) %arg1
ret.32 %r2
We can notice the (3) in the cast instruction but this is misleading,
as %arg1 is not 3 bits wide.
Fix this by adding the missing truncating cast.
This will then give something like:
get__bfu_a:
cast.3      %r2 <- (32) %arg1
cast.32     %r3 <- (3) %r2
ret.32      %r3
Note: the truncation could also be done by an AND mask, but the cast
is more logical since we're only changing the size here and not doing
any arithmetic operation.
Fixes: 1688f039c ("Re-do memory access linearization.")
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
commit a0962f7893 ("Fix cast instruction generation") correctly
used the source type instead of the destination type to decide
between OP_CAST & OP_SCAST but at the same time made the choice
between OP_PTRCAST & OP_FPCAST depend on the source's type too,
thus losing all information about the target type (except its size).
In other words, in the code here below:
int i ;
...
(long) i;
(void*) i;
(double) i;
(float) i;
the first three casts generate exactly the same code (for an LP64 arch):
scast.64 %r.. <- (32) %r..
and the last one will not generate any code at all.
Fix this by using the source type only to determine if the cast is an
OP_CAST or an OP_SCAST but using the target type to determine if the
cast is a cast to a pointer type (OP_PTRCAST), a floating-point type
(OP_FPCAST) or an integer type (OP_CAST or OP_SCAST).
Note: this is still not correct for floating-point to integer casts,
as it is necessary to know whether the destination is a signed or
unsigned integer type.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Normally a label statement is not also an expression, but it can be
in GCC's statement expressions, like: ({ ...; label: <expression>; })
Currently, during linearization, sparse discards the return value
of the linearization of the label's sub-statement, thus also
discarding the value of such statement expressions.
Trivially fix this by not discarding the return value but returning
it instead, like for other statements that are also expressions and
thus have a value.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
A compound assignment like, for example:
x /= a;
should have the same effect as the operation followed by the
assignment except that the left side should only be evaluated
once. So the statement above (assuming 'x' free of side-effects)
should have the same effect as:
x = x / a;
In particular, the usual conversions should be applied. So, if the
type of 'x' and 'a' is, respectively, 'unsigned int' (32 bit) and
'long' (64 bit), the statement above should be equivalent to:
x = (unsigned int) ((long) x / a);
But what is really done currently is something like:
x = x / (unsigned int) a;
In other words, the right-hand side is cast to the same type as the
lhs and the operation is always done with this type, neglecting the
usual conversions and thus forcing the operation to always be done
with the lhs type, here 'unsigned int' instead of 'long'.
For example, with the values:
unsigned int x = 1;
long a = -1;
We have:
x = 1 / (unsigned int) (-1L);
x = 1 / 0xffffffff;
x = 0;
instead of the expected:
x = (unsigned int) ((long)1 / -1L);
x = (unsigned int) (-1L);
x = 0xffffffff;
The patch fixes this by first calculating the type corresponding to
the usual conversions and then casting the right-hand side to this
type, which is fine since it's an rvalue anyway.
Later steps will then use the rhs type when doing the operation.
On the example above, the cast will be a no-op and the operation will
be done with the correct type:
x = x / (long) a;
which, at linearization, will become:
x = (unsigned int) ((long) x / (long) a);
and with unneeded casts optimized away:
x = (unsigned int) ((long) x / a);
Which will give us the expected result.
If the types were in the other order, the operation will also be done
with the correct type:
long x;
unsigned int a;
...
x /= a;
will become:
x = x / (long) a;
and, at linearization:
x = (long) ((long) x / (long) a);
and with unneeded casts optimized away:
x = (x / (long) a);
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
show_instruction() uses the field 'target' to display OP_SWITCH
instruction instead of 'cond' like OP_BR does.
It doesn't change anything since these two fields use the same
storage inside struct instruction, but it's better to use the right
field to stay consistent.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
test-linearize displays basic blocks' labels by using
'.L0x' plus the address of the bb struct. This is certainly
convenient as a UID but it has the disadvantage that these
names change at each run and are thus not comparable.
This complicates testing quite a bit.
Let's change this by giving a simple sequential number to each bb
and simply displaying them as: '.L1', '.L2', ...
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
When code contains an unused label, there is no need to create a new
basic block for the code that follows it, but that doesn't mean that
the following code is unreachable.
There is currently a bug related to this when processing a label statement
for a label that is never used: not only is the label ignored (and thus
no new basic block is created) but the whole statement is ignored.
In other words, the statement directly following an unused label is
simply ignored.
The patch fixes this by simply moving the code handling the statement out
of the conditional part that processes used labels.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Variables declared with C99 syntax inside a for statement should be
linearized at the top of the loop, rather than implicitly inside the loop body.
Signed-off-by: Emily Maier <emily@emilymaier.net>
Reviewed-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Tested-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
The result type of relational operators (e.g., x < y) and logical
operators (e.g., x && y) in C should be int, rather than bool.
For example, sparse incorrectly evaluates sizeof(x < y) to 1 (i.e.,
sizeof(_Bool)), when it should have been sizeof(int).
This patch fixes the result type of these operators in evaluation,
linearization, and the LLVM backend.
Acked-by: Christopher Li <sparse@chrisli.org>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
|
|
On Tue, Aug 30, 2011 at 10:43 AM, Jeff Garzik <jeff@garzik.org> wrote:
> * if someone knows how to access a function declaration, I can solve the
> varargs problem
Hmm. Right now we do not have access to the function declaration at
linearize time. We've checked that the arguments match, and we've cast
the arguments to the right types (evaluate.c), so the thinking was
that you just use the arguments as-is.
But if llvm needs the declaration of a function, we'd need to squirrel it away.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christopher Li <sparse@chrisli.org>
Cc: Jeff Garzik <jgarzik@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ penberg@kernel.org: Fix validation/context.c breakage. ]
Signed-off-by: Pekka Enberg <penberg@kernel.org>
|
|
Rather than do it in that huge 'linearize_statement()' function, split
out the return generation case. Avoids one level of indentation, and
makes for simpler and more straightforward functions.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Rather than do it in that huge 'linearize_statement()' function, split
out the switch generation case. Avoids one level of indentation, and
makes for simpler and more straightforward functions.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Rather than do it in that huge 'linearize_statement()' function, split
out the iterator case. Avoids one level of indentation, and makes for
simpler and more straightforward functions.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
Some cases were omitted with the patch from Namhyung Kim
(commit c5e425e in Chris Li's repo).
My curiosity led me to try out coccinelle/spatch as suggested by
Nicholas Mc Guire in reply to Kim's patch, but it*) only discovered
occurrences in show-parse.c, probably because of "const vs. non-const"
differences of array item types and the expression given to sizeof.
*) sequence to try coccinelle out on this case (with coccinelle installed):
$ wget http://coccinelle.lip6.fr/rules/array.cocci
$ sed 's/<linux\/kernel.h>/"lib.h"/' array.cocci > array-sparse.cocci
$ for i in $(find . -path ./validation -prune -o -name "*.c" -print); \
> do spatch -sp_file array-sparse.cocci $i; done
Besides progress messages, this will print out any "real" patch
generated according to the semantic patch in `array-sparse.cocci'
(it can also apply these changes directly, etc.).
Signed-off-by: Jan Pokorny <pokorny_jan@seznam.cz>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
This fixes an incorrect assumption that results in && using shortcut
logic on the true branch instead of the false branch.
Signed-off-by: Daniel De Graaf <danieldegraaf@gmail.com>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
It's unfortunate to use 'true' and 'false' as identifiers in a system
header. It clashes with corresponding macros from <stdbool.h> when
included before <sparse/linearize.h>.
Signed-off-by: Kamil Dudka <kdudka@redhat.com>
Acked-by: Hannes Eder <hannes@hanneseder.net>
Signed-off-by: Christopher Li <sparse@chrisli.org>
|
|
> Do you want to resend your change which revert the context changes?
> Make it base on Josh's git's tree and I will merge your changes in my
> branch.
Below. Or I can give it to you in git if you prefer. I still think we
should redo this in some form so that annotations with different
contexts can work properly, but I don't have time to take care of it
right now.
johannes
>From ca95b62edf1600a2b55ed9ca0515d049807a84fc Mon Sep 17 00:00:00 2001
From: Johannes Berg <johannes@sipsolutions.net>
Date: Tue, 23 Dec 2008 10:53:19 +0100
Subject: [PATCH] Revert context tracking code
|
|
Currently there is no generic way to derive phy
type information from the instruction flow.
Signed-off-by: David Given <dg@cowlark.com>
|
|
My optimisation to avoid recursion into BBs when checking contexts
led to a failure in a case like this:
static int warn_conditional(void)
{
if (condition)
return 0;
a();
if (condition == 0)
return 1;
r();
return 0;
}
because some blocks are called with different contexts and thus
need to be checked multiple times.
The obvious fix would be to decrease the recursion depth at the
end of the BB check function, but that, while correct, leads to
extremely long sparse runtimes on somewhat complex functions.
Thus, this patch also makes sparse cache which contexts it has
checked a block in and avoid the re-checking in that case.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
|
|
Whether it's a sign-extending cast or not depends on the source
of the cast, not destination. The final size of the cast depends
on the destination, of course.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This patch enables a very simple form of conditional context tracking,
namely something like
if (spin_trylock(...)) {
[...]
spin_unlock(...);
}
Note that
__ret = spin_trylock(...);
if (__ret) {
[...]
spin_unlock(...);
}
does /not/ work since that would require tracking the variable and doing
extra checks to ensure the variable isn't globally accessible or similar
which could lead to race conditions.
To declare a trylock, one uses:
int spin_trylock(...) __attribute__((conditional_context(spinlock,0,1,0)))
{...}
Note that doing this currently excludes that function itself from context
checking completely.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
|
|
The sparse man page promises that it will check this:
Functions with the extended attribute
__attribute__((context(expression,in_context,out_context))
require the context expression (for instance, a lock) to have the
value in_context (a constant nonnegative integer) when called,
and return with the value out_context (a constant nonnegative
integer).
It doesn't keep that promise though, nor can it, especially with
contexts that can be acquired recursively (like RCU in the kernel.)
This patch makes sparse track different contexts, and also follows
up on that promise, but with slightly different semantics:
* the "require the context to have the value" is changed to require
it to have /at least/ the value of 'in_context',
* an exact_context(...) attribute is introduced with the previously
described semantics (to be used for non-recursive contexts),
* the __context__ statement is extended to also include a required
context argument (same at-least semantics).
Unfortunately, I wasn't able to keep the same output, so now you'll
see different messages from sparse. In particular, when trying to
unlock a lock that isn't locked, you'll see a message pointing to the
unlock function rather than complaining about the basic block; you can
see that in the test suite changes.
This patch also contains test updates and a lot of new tests for the
new functionality. Except for the changed messages, old functionality
should not be affected.
However, the kernel use of __attribute__((context(...)) is actually
wrong, the kernel often does things like:
static void *dev_mc_seq_start(struct seq_file *seq, loff_t * pos)
__acquires(dev_base_lock)
{
[...]
read_lock(&dev_base_lock);
[...]
}
rather than
static void *dev_mc_seq_start(struct seq_file *seq, loff_t * pos)
__acquires(dev_base_lock)
{
[...]
__acquire__(dev_base_lock);
read_lock(&dev_base_lock);
[...]
}
(and possibly more when read_lock() is annotated appropriately, such
as dropping whatever context read_lock() returns to convert the context
to the dev_base_lock one.)
Currently, sparse doesn't care, but if it's going to check the context
of functions contained within another function then we need to put the
actual __acquire__ together with acquiring the context.
The great benefit of this patch is that you can now document at least
some locking assumptions in a machine-readable way:
before:
/* requires mylock held */
static void myfunc(void)
{...}
after:
static void myfunc(void)
__requires(mylock)
{...}
where, for sparse,
#define __requires(x) __attribute__((context(x,1,1)))
Doing so may result in lots of other functions that need to be annotated
along with it because they also have the same locking requirements, but
ultimately sparse can check a lot of locking assumptions that way.
I have already used this patch and identified a number of kernel bugs by
marking things to require certain locks or RCU protection and checking
sparse output. To do that, you need a few kernel patches which I'll
send separately.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
|
|
Turn FORCE_MOD into a storage class specifier (that's how it's
actually used, and that makes for much simpler logic).
Introduce explicit EXPR_FORCE_CAST for forced casts; handle it
properly.
Kill the idiocy in get_as(): we end up picking the oddest things
for address space - e.g. if we have int __attribute__((address_space(1))) *p,
we'll get warnings about removal of address space when we do things like
(unsigned short)*p. Fixed. BTW, that had caught a bunch of very odd
bogosities in the kernel and eliminated several false positives in there.
As a result, get_as() is gone now and evaluate_cast() got simpler.
Kill the similar idiocy in handling pointer assignments; while we are at it,
fix the qualifiers check for assignments to/from void * (you can't assign
const int * to void * - qualifiers on the left side should be no less than
on the right one; for normal codepath we get that checked, but the special
case of void * skips these checks).
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Improve the dot files generated by graph.c to put the basic blocks for
each function into their own labelled subgraph cluster and to
distinguish between branches and function calls using different edge
types. Requires a change to struct symbol to keep track of the
entrypoint associated with a given function identifier.
Currently, no attempt is made to handle indirect function calls -- they
are just ignored.
Signed-off-by: Dan Sheridan <djs@adelard.com>
|
|
Thanks to Damien Lespiau for reporting the problem.
Signed-off-by: Josh Triplett <josh@freedesktop.org>
|
|
Signed-off-by: Josh Triplett <josh@freedesktop.org>
|
|
Signed-off-by: Josh Triplett <josh@freedesktop.org>
|
|
Here is some diff comparing output between the sparse 0.2 and the tip of git.
-mm/mmap.c:1631:2: warning: context imbalance in 'expand_stack' - different lock
contexts for basic block
+include/linux/rmap.h:55:2: warning: context imbalance in 'expand_stack' - diffe
The change is introduced by the inline annotation instruction,
which marks bb->pos as belonging to the inline function.
This change makes it point back to the caller's position.
Signed-Off-By: Christopher Li <sparse@chrisli.org>
|
|
For inline functions, sparse inlines the function body at evaluation
time. It is then very hard to find the original function call. This
change preserves the original call as an annotation.
Signed-Off-By: Christopher Li <sparse@chrisli.org>
|
|
The liveness instruction takes up about 10% of the bytecode bloat file.
It is not very useful; it duplicates information that can be obtained
from the def/use chain.
This change disables the liveness instruction by default.
The caller can call track_pseudo_death() if needed.
Signed-Off-By: Christopher Li <sparse@chrisli.org>
|
|
Signed-off-by: Josh Triplett <josh@freedesktop.org>
|
|
Bump up the size of the instruction buffer. vt_ioctl.c has
a huge switch statement causing sparse to overflow the instruction
buffer.
Signed-Off-By: Christopher Li <sparse@chrisli.org>
|
|
The following can generate an anonymous symbol without an
initializer expression:
return (struct foo) {};
It would be nice if sparse did not crash on it.
Signed-Off-By: Christopher Li <sparse@chrisli.org>
|
|
Signed-off-by: Josh Triplett <josh@freedesktop.org>
|
|
The current way of tracking pseudo register users is by
keeping a list of the addresses of the pseudo_t members.
Such an address can be part of the instruction itself; the
worst case is a slot in the argument list of a call instruction.
As the comment for address_taken() says, using the container
to get the instruction pointer is wrong. It used to work for
instructions that relate to symbol addresses, but that is not true
any more. Even worse, it is very hard to track pseudo usage other
than symbol addresses. The only reason symbol addresses used to work
for call instructions is that a call instruction did not directly
use the symbol address.
I bit the bullet and just added the instruction pointer to pair with
the pseudo user pointer, so it will work in the case where the
user instruction is a call as well.
Testing:
I compared the linearize result with and without the patch on a few
of sparse's own source files. The linearization generates exactly
the same result except for the symbol address changes, which are
predictably different because the pseudo user structure allocates
memory.
Signed-Off-By: Christopher Li <sparse@chrisli.org>
|
|
A pseudo list contains more information. It can
get to the symbol as well as the usage information.
Now it is much easier to answer questions like
"What functions does this function call?".
Signed-Off-By: Christopher Li <sparse@chrisli.org>
|
|
sparse currently only tracks one global context for __context__ and
__attribute__((context)).
This adds support for parsing an additional argument to each of these
which gives a context expression. For __attribute__((context)), store
each context attribute as a separate context structure containing the
expression, the entry context, and the exit context, and keep a list of
these structures in the ctype. For __context__, store the context
expression in the context instruction. Modify the various frontends to
adapt to this change, without changing functionality.
This change should not affect parsing of programs which worked with
previous versions of sparse, unless those programs use comma expressions
as arguments to __context__ or __attribute__((context)), which seems
highly dubious and unlikely. sparse with -Wcontext generates identical
output with or without this change on Linux 2.6.18-rc4.
Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This removes the list of symbols for block statements, and instead makes
a declaration be a statement of its own.
This is necessary to correctly handle the case of mixed statements and
declarations, since the order of declarations and statements
is meaningful.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This is needed to translate SSA back to normal form.
Maybe we should split this into two distinct ops: the first ones
would correspond to the old OP_PHISOURCE (but if the same phisrc feeds
several phi-nodes, we need several distinct copy instructions); at this
stage they are the only instructions that can define a pseudo in
multiple places. The other ones would be the "normal" copies.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Instead of having a special case for "post_condition == pre_condition",
make a NULL post_condition mean that it's the same as the pre-condition.
That's how while/for loops work anyway.
This avoids the double warnings for conditionals that Oleg Nesterov
noted.
|
|
This makes some needlessly global code static so that sparse doesn't
complain when self-checking.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Avoid dereferencing a null pointer in add_asm_output() after a parse
error in an asm statement.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Avoid dereferencing a null pointer after parse errors in casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Avoid dereferencing a null pointer after parse errors in assignments.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
asm statement
Avoid dereferencing a null pointer in linearize_asm_statement() after a
parse error in an asm statement.
Note: the problem was already fixed (or was never there) for the
asm_inputs part. This patch uses the same protection for the output
part: use "".
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Sparse can segfault when code is present after a goto statement.
The problem arises because after a goto statement (same with break and
continue) ep->active is set to NULL but processing of code present after
this statement can dereference ep->active (alloc_phi()).
The following piece of code reproduces the problem:
static int foo(int a)
{
goto end;
return a ? : b;
end:
return 0;
}
From what I have tested, the whole "a ? : b;" is needed to reproduce the
problem, with the true part of the conditional absent and an undefined
second part; but the problem seems very general.
The following fixes the problem by checking for a NULL ep->active, but
I suppose that there is a nicer way to solve this problem.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
If there were earlier parse errors, we may not have some of the
expressions involved, so just check that first.
Reported by Luc Van Oostenryck.
|
|
sparse segfaults on the following input:
static int foo(int a)
{
return a ?? 1 : 0;
}
This adds a new validation file and fixes the segfault.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@looxix.net>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Noted by Roland Dreier.
|
|
You can mark a function as requiring a lock, or requiring being called
without a lock.
|
|
|
|
Rather than tying everything to the beginning basic block
position.
This should be able to pinpoint instruction issues much more
exactly and readably.
|
|
|
|
Otherwise we lose the information what the target
type is (we only have the bit-size of the target).
|
|
about addresses of local variables.
|
|
This is OP_MUL/OP_DIV/OP_MOD/OP_SHR.
We still do the constant simplifications wrong, but
now the information is all there.
|
|
assignment expression needs to be done in a wider type.
We never actually generate those kinds of expression trees
yet, but that's because our expression evaluation is a bit
buggy and will cast them to the target type too soon.
Another of the "Al Viro thinks of things mortal humans do
not" cases.
|
|
contain sub-initializers.
This really shouldn't be needed, since we should be flattening
the EXPR_INITIALIZER during evaluation, but there's something
missing. However, since flattening cannot handle the range case
(which we don't handle right now in linearization either), we do
conceptually need this for the generic case anyway.
|
|
The _result_ size of a comparison is always just a boolean,
but that isn't the interesting part. What we care about is
what the size of the thing we compare is.
|
|
(symbol addresses).
They are pretty different. Symbol addresses have special meaning during
various phases, from symbol simplification to CSE.
|
|
The asm_inputs/outputs "expression list" is not really an
expression list any more: it is a list of "triples", where
the first entry is the identifier name, the second one is
the constraint string, and the third one is the expression.
|
|
This also makes the OP_ASM data structures a bit more structured
in order to contain all the required information.
|
|
Not only does it make sense, but the back-end needs to
know what pseudos got allocated ;)
|
|
|
|
changes.
|
|
and labels. And comments, for that matter.
This eventually allows us to buffer them up, rather than print
them out directly. Which we'll need to do if we want to fix up
frame offsets etc (for register save areas). And other post-
processing.
Also, for comments, make "show_instruction()" return the string
rather than print it out.
|
|
Much prettier than "input_streams[x].name", since most
users really don't want to know about the internals of
how the preprocessor lays out its stream tracking.
|
|
It looked good on paper. Not so good when actually trying
to generate code.
|
|
We can just replace all users with the symbol pseudo
directly.
This means that we can no longer re-do symbol simplification
after CSE, and we need to rely on the generic memop simplification.
|
|
Noted by Sam Ravnborg
|
|
white-space.
|
|
reachability pass after CSE.
|
|
This allows us to always see which pseudos are nonlocally affected
by the phi source.
We can only do this after the instruction flow is fixed, together
with the OP_DEATHNOTE phase.
|
|
It's cute and useful for debugging, but it ends up doing the wrong
thing, since now CSE and liveness analysis will see the initial values
as the "dynamic" ones, and make the wrong kind of assumptions about
dominating stores etc.
|
|
Also fix the printout of such unnamed symbol pseudos.
|
|
|
|
We used the output list instead of the input list ;)
|
|
This doesn't actually save the register class information, so
you can't do proper register allocation, but it does all the
input reading and store to outputs.
Still need to teach the liveness analysis about the _usage_ rules.
|
|
might be parent of another block several times.
|
|
a special basic block.
This removes a lot of special cases, since now flow doesn't have any
special BB to worry about. It also gives a clear definition point for
argument pseudos, and thus makes liveness tracking lose a special
case.
|
|
We want to use it in almost any debugging scenario, so why not
just admit that.
|
|
This requires that we be able to handle "uses" pseudos without
a defining instruction.
|
|
Now that we have full pseudo usage lists, we can also add
deathnotes to pseudos in the instruction stream.
NOTE! We add the deathnote to _before_ the last instruction
that uses that pseudo. That looks a bit strange, but it's
actually what you want: when we traverse the instructions, we
want to know that the inputs are dead.
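A minimal sketch of that placement rule, with hypothetical names (not sparse's actual data structures): walk the stream once to find the link to the last user, then splice the deathnote in before it, so a forward traversal sees the note before the final use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified instruction stream: each insn uses at most
 * one pseudo (0 = uses nothing). */
struct insn {
	int opcode;
	int used_pseudo;
	struct insn *next;
};

enum { OP_DEATHNOTE = 99 };	/* made-up opcode value */

/* Insert a deathnote *before* the last instruction that uses `pseudo`,
 * so that when we walk the stream forward we already know the input
 * dies at that use. */
static void add_deathnote(struct insn **list, int pseudo)
{
	struct insn **last_use = NULL, **p;

	for (p = list; *p; p = &(*p)->next)
		if ((*p)->used_pseudo == pseudo)
			last_use = p;	/* link pointing at the last user */

	if (!last_use)
		return;

	struct insn *note = calloc(1, sizeof(*note));
	note->opcode = OP_DEATHNOTE;
	note->used_pseudo = pseudo;
	note->next = *last_use;		/* splice in before the last use */
	*last_use = note;
}
```

Passing a pointer-to-link (`struct insn **`) keeps the splice uniform even when the last user is the first instruction in the stream.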
|
|
|
|
|
|
This was originally done so that "struct instruction" would be
smaller, but it has since grown anyway (for the memops), so splitting
OP_SEL into two instructions no longer makes sense, and makes it
harder on CSE.
|
|
A simple "for (;;)" wouldn't linearize properly.
|
|
Flow simplification depends on liveness analysis, so we really can't
do it early any more. I'd missed one call to it when I re-organized
things, and this call would totally mess up the flow because it thought
there were no pseudos being defined or used anywhere..
|
|
The "constant conditional" stuff in flow simplification is now
handled by the trivial instruction simplification, so the only
thing that remained was the conditional flow based on following
constant PHI-nodes. And that one can be much more effectively
done after liveness analysis, when we can decide to skip even
non-empty blocks if the target we are skipping to doesn't care
about this particular block.
|
|
This does trivial simplification of casting to the same
typesize. HOWEVER. We split casts up into whether they
cast to a dereferencable pointer type or not, and we don't
simplify pointer casts. This should mean that if we
ever want to do type-based alias analysis, we can still
avoid casted accesses.
(If we do type-based alias analysis, we'll also need to make
a union access do a cast. Which we probably should do anyway).
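The decision described above can be sketched as a single predicate (hypothetical types and names, assuming sizes are tracked in bits): a cast is trivially droppable only when both sides have the same size and neither side is a dereferencable pointer type.

```c
#include <assert.h>

/* Hypothetical, minimal type descriptor. */
struct ctype {
	unsigned bit_size;
	int is_ptr;	/* dereferencable pointer type? */
};

/* A cast can be dropped when source and destination have the same
 * size and neither is a pointer type: pointer casts are kept so that
 * type-based alias analysis can still avoid casted accesses. */
static int cast_is_trivial(const struct ctype *from, const struct ctype *to)
{
	if (from->is_ptr || to->is_ptr)
		return 0;
	return from->bit_size == to->bit_size;
}
```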
|
|
No need to SIGSEGV just because the program is crap ;)
|
|
|
|
This is purely for debugging. It's only used to show what symbol
a pseudo might be (and I stress "might", since CSE can and does
screw it up) associated with.
Also print out the def list for a basic block when verbose.
It all makes it a bit easier to guess what's up.
|
|
uninteresting parts only with "-v -v". Dead blocks in particular.
Also, don't even put a pseudo on the OP_RET if we're a void
function. That makes the liveness analysis happier.
|
|
This should basically allow us to do register allocation. Or
maybe it's too broken. But the results look reasonable from a
quick manual peek.
|
|
We should drop symbol types by now, and base things on just raw
sizes. Anything else just doesn't work from a CSE standpoint.
Some things may care about actual types later, notably the type-
based alias analysis. Those types still exist in the cast nodes.
Also, make the printout be more readable.
|
|
all but one child.
|
|
Very useful for debugging, since otherwise it's often
impossible to find all the other basic blocks.
|
|
Let's try to make sure that once I get these flow bugs ironed
out they won't come creeping back in.
|
|
...but we can't merge phi-sources after packing. Or rather, we
could, but we'll have to be more careful about not re-ordering
them, which we aren't. Sort it out later.
|
|
It will assert if it can't find that many entries.
Zero means "all the ones you can find", aka old behaviour.
|
|
|
|
|
|
Just a semblance, though.
The rule at this point is that alignment etc doesn't matter (pure
code generation issue), and the only thing that matters is whether
accesses overlap. This probably doesn't get it right either, but
it gets closer.
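The overlap rule stated above reduces to a half-open interval intersection test; a minimal sketch under the assumption that accesses are tracked as a byte offset plus a size:

```c
#include <assert.h>

/* Two memory accesses (byte offset + size) conflict only if their
 * ranges overlap; alignment is a pure code-generation issue and does
 * not enter into it. */
static int accesses_overlap(unsigned off1, unsigned size1,
			    unsigned off2, unsigned size2)
{
	return off1 < off2 + size2 && off2 < off1 + size1;
}
```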
|
|
We don't have usage counters, we have usage lists these days.
|
|
track of which instructions use which pseudos.
It also verifies the usage, which shows a few bugs in
structure handling.
|
|
We forgot to set the "def" to point to the new OP_SEL
instruction.
|
|
(which was the bulk of it) a function of its own.
No code changes, but this gets rid of two levels of indentation.
|
|
repeating CSE itself.
In particular, some symbol address simplifications imply that
we should repeat symbol simplification, since things may have
improved.
|
|
Address gen still not killed, that's a bit more involved.
|
|
|
|
It now ends up being
CALL fn, arg, arg, arg
pseudo <- %retval
for a function that returns a value.
This is slightly closer to what real assembly would be like,
but more importantly, it removes the only valid use of VOID
pseudos from the printout, so now you can see when something
goes wrong by just grepping for VOID.
This happens in the kernel a lot. Apparently mainly because
of lack of support for inline assembly.
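A sketch of the two-instruction shape this describes, with invented names (this is not sparse's actual emitter): the OP_CALL itself defines nothing, and a non-void result is picked up by a separate copy from a dedicated %retval register, so no VOID pseudo ever appears as a call result.

```c
#include <assert.h>

enum opcode { OP_CALL = 1, OP_COPY = 2 };

#define PSEUDO_RETVAL (-1)	/* hypothetical marker for %retval */

struct op { enum opcode code; int src, dst; };

/* Linearize a call: always emit the OP_CALL; for a function that
 * returns a value, follow it with a copy from %retval into a fresh
 * pseudo. Returns the number of instructions emitted. */
static int emit_call(struct op *out, int fn, int returns_value, int new_pseudo)
{
	int n = 0;
	out[n++] = (struct op){ OP_CALL, fn, 0 };
	if (returns_value)
		out[n++] = (struct op){ OP_COPY, PSEUDO_RETVAL, new_pseudo };
	return n;
}
```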
|
|
|
|
This allows us to notice when a symbol address usage is
trivially dead, and optimize such a symbol better since
we don't have to worry about the address being used to
access it.
|
|
from child and parent lists.
|
|
Instead of faking it by doing bad things to the child list,
do it properly. The fake stuff really confused the if-conversion.
|
|
Also, remove OP_SEL from the "binop" list - it really really isn't,
since it has a third implied input in the OP_SETCC.
|