| Age | Commit message | Author | Files | Lines |
|
An OP_SEXT or an OP_ZEXT followed by a truncate to a size smaller
than the original size is unneeded: the same result can be obtained
by doing the truncate directly on the original value.
Dually, an OP_SEXT or an OP_ZEXT followed by a truncate to a size greater
than the original size doesn't need the truncate: the same result can be
obtained by doing the extension directly on the original value.
Rearrange the inputs (src & orig_type) to bypass the unneeded operation.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An OP_ZEXT/OP_SEXT followed by an OP_TRUNC to the original size is a NOP.
Simplify away such OP_TRUNC.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
There are several difficulties here, some related to the unclear
semantics of our IR instructions and/or type evaluation.
Add testcases trying to cover this area.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function check_shift_count() is used to check the validity
of shift counts, but the count is truncated to a (host) int.
This truncated value can thus miss very large (or very negative)
shift counts.
Fix this by using the full width shift count when doing the check.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The diagnostic emitted when shifting by a negative amount is confusing:
warning: shift too big (4294967295) for type ...
Change the message to the more informative:
warning: shift count is negative (-1)
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, during simplification, when checking whether the shift
amount of a shift instruction is in range, it's not the
instruction's size (which is the same as the type's width)
that is used but the 'operand size' as returned by operand_size().
operand_size() is a bit like a poor man's Value Range Propagation,
or VRP without the Propagation: it uses the knowledge of previous
instructions to deduce the effective size of the operand.
For example, if the operand is the result of a narrowing cast,
or has been ANDed with a known mask, or was a bitfield, the
size returned is the one of the cast or the mask and may be
smaller than its type.
A priori, using more precise knowledge is a good thing, but in
the present case it's not, because it causes warnings for things
that are totally legal, meaningful and used all the time.
For example, we don't want to warn on the following code:
struct bf { int u:8; };
int bf(struct bf *p) { return p->u << 24; }
Another situation where such warnings are not desirable is
when the shift instruction 'is far' from the one defining
the size, for example when the shift occurs in an inline function
and this inline function is called with a smaller type.
So, use the instruction size instead of the effective operand size
when checking the range of a shift instruction.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the mathematical sense, the result of a left-shift by an
amount bigger than the operand size equals zero.
Do the corresponding simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the mathematical sense, the result of LSR by an amount bigger
than the operand size equals zero.
Do the corresponding simplification.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During the simplification phase, ASR instructions are checked
and possibly simplified but LSR & SHL are not.
Rename simplify_asr() into simplify_shift() and do the check
& simplification for LSR & SHL too.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, arithmetic right shifts with a shift count bigger
than the operand size are simplified to zero.
While this makes a lot of sense for LSR and SHL, for ASR it's
much less so.
Remove the simplification and let the back-end generate the code
for a non-immediate count (which is what GCC does).
Note: Some other options would be:
- reduce the shift count modulo the operand size, like it was
done previously at expansion time and which corresponds to the
run-time behaviour of several CPU families (x86[-64], arm64,
mips, ...) but not all of them (arm, ppc, ...);
- truncate the count to N-1.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
At expansion time, a diagnostic is emitted for a shift with a
bad count in a shift expression but not in a shift-assign.
Fix this by also calling the checking function for shift-assigns.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The following patches change quite a bit of things regarding
shifts with bad count. Add testcases to ensure everything is
covered and catch possible future regressions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* cse: try the next pair instead of keeping the first instruction
* cse: compare casts only by kind and size, not by C type.
|
|
Currently, OP_SWITCHes are only handled by the default case in kill_insn().
As a consequence, when killed, OP_SWITCHes leave a fake usage on the
condition.
Fix this by removing the condition's usage when killing an OP_SWITCH.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* optimize away OP_UTPTR & OP_PTRTU which are nops.
|
|
Now that all casts to or from a pointer are between a pointer
and a pointer-sized unsigned integer, from an optimization
PoV they are all no-ops.
So, optimize them away at simplification time.
Note: casts between pointers (OP_PTRCAST) should also be
optimized away, but the original type is used for a
number of things (for example in check_access()) and
can't be optimized away as simply (yet).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
validation/linear/* should not contain testcases that are
optimization dependent and validation/*.c should not contain
tests using 'test-linearize', only those using 'sparse'.
Move some cast-related testcases accordingly.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Now that OP_AND_BOOL and OP_OR_BOOL are always given boolean
operands, they are just a special case of 1-bit OP_AND & OP_OR.
To avoid having to repeat CSE, simplification patterns, ...
it's better to generate plain OP_AND & OP_OR instead.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function add_convert_to_bool() was added to give a
boolean context to logical expressions but did this only
for integers.
Fix this for floating-point expressions by adding the proper
comparison to 0.0.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Because of C's integer promotion, in code like 'a == 0',
the operand 'a' must be promoted to int. So, if 'a' is
of type 'bool', it results in the following linearization:
zext.32 %t <- (1) %a
setne.32 %r <- %t, $0
While this promotion is required by the standard at C level,
here, from an operational PoV, the zero-extension is unneeded
since the result will be the same without it.
Change this by simplifying away such zero-extensions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is yet another simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is another simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is a simple identity with the potential to trigger
more simplifications.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The current SSA conversion kinda ignores undefined variables.
Those may then later be detected when they are part of a
LOAD + (ADD|SUB) cycle, but they can also create other cycles
which are not detected.
These cycles inhibit (or uselessly complicate) lots of optimizations.
For example, code like:
and.32 %r2 <- %r1, $1
and.32 %r3 <- %r2, $1
should be simplified into:
and.32 %r3 <- %r1, $1
but this simplification would behave horribly with 'cycles' like:
and.32 %r1 <- %r1, $1
This patch adds a testcase for a number of very simple situations
where such cycles can be created.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A truncation followed by a zero-extension to the original size,
which is produced when loading or storing bitfields, is equivalent
to a simple AND masking. Often, this AND can then trigger even
more optimizations.
So, replace TRUNC + ZEXT instruction pairs by the equivalent AND.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Let's suppose we have a few instructions: A, B, C & D,
all congruent according to insn_compare().
The current way the CSE is done is:
The first try will be try_to_cse(A,B). If this succeeds,
A & B have been merged (AB) and the next try will be done
with the next instruction:
try_to_cse(AB,C).
That's good. However, if it fails, the next try will be:
try_to_cse(A,C).
If this one also fails, the next one will be:
try_to_cse(A,D)
And if this one also fails, nothing else is done.
In other words, all attempts are done with A. If it happens that
A can't be eliminated (because it doesn't dominate B, C & D and
is not dominated by them), no eliminations are done because the
other pairs, those not involving A, are never tried.
Ideally, we should try all possible pairs: A-B, A-C & A-D but
also B-C, B-D & C-D. However this is quadratic and can be a bit
costly. An easy & cheap way to improve the current situation is,
in case of failure, to not reuse the original first instruction
but the other one. So instead of testing:
A-B, A-C, A-D, ...
we will test:
A-B, B-C, C-D, ...
with the result that a 'bad' instruction can no longer block
the following pairs.
So, when try_to_cse() fails, do not retry keeping the first
instruction, but retry using the second one.
In practice, this seems to work fairly well.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The last instruction of linearize_load_gen() ensures that
loading a bitfield of size N results in an object of size N.
Also, we require that the usual binops & unops use the same type
for their operands and result. This means that before anything
can be done on the loaded bitfield it must first be sign- or zero-
extended in order to match the other operand's size.
The same situation exists when storing a bitfield but there
the extension isn't done. We can thus have some weird code like:
trunc.9 %r2 <- (32) %r1
shl.32 %r3 <- %r2, ...
where a bitfield of size 9 is mixed with a 32-bit shift.
Avoid such mixing of sizes and always zero-extend the bitfield
before storing it (since this was the implicitly desired semantic).
The combination TRUNC + ZEXT can then be optimised later into
a simple masking operation.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the test the offset is the same for dst & src and thus
its calculation should be CSEed away, but it is not (yet).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Casts to integers used to be done with only 2 instructions:
OP_CAST & OP_SCAST.
Those are not very convenient as they don't reflect the real
operations that need to be done.
This patch specializes these instructions into:
- OP_TRUNC, for casts to a smaller type
- OP_ZEXT, for casts that need a zero extension
- OP_SEXT, for casts that need a sign extension
- integer-to-integer casts of the same size are considered
NOPs and are, in fact, never emitted.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently casts from pointers can be done to any integer type.
However, casts to (or from) pointers are only meaningful if
they preserve the value and are thus done between same-sized objects.
To avoid having to worry about sign/zero extension while doing
casts to pointers, it's good to not have to deal with such casts.
Do this by first doing a cast to an unsigned integer of the same size
as a pointer and then, if needed, doing the cast to the final type.
As such, we only have to support pointer casts to unsigned integers
of the same size, and on the other hand we have the generic
integer-to-integer casts we have to support anyway.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
It's relatively common to cast a pointer to an unsigned long,
for example to do some bit operations.
It makes much less sense to cast a pointer to an integer smaller
(or bigger) than a pointer.
So, emit a diagnostic for this, under the control of a new
warning flag: -Wpointer-to-int-cast.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike. This is
simple but rather inconvenient in later phases as this
corresponds to different operations that obey different
rules and later need extra checks.
Change this by using a specific instruction (OP_UTPTR) for
[unsigned] integer-to-pointer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently casts to pointers can be done from any integer type.
However, casts to (or from) pointers are only meaningful if value
preserving and thus between objects of the same size.
To avoid having to worry about sign/zero extension while doing
casts to pointers, it's good to only have to deal with the value
preserving ones.
Do this by first doing, if needed, a cast to an integer of the same
size as a pointer before doing the cast to a pointer.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently all casts to pointers are processed alike.
This is simple but rather inconvenient as it corresponds to
different operations that obey different rules and
later need extra checks.
Change this by using a specific instruction (OP_UTPTR) for
unsigned integer-to-pointer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, casts from floats to integers are processed like
casts from integers (or any other type) to integers. This is
simple but rather inconvenient as it corresponds to different
operations that obey different rules and later need extra checks.
Change this by directly using specific instructions:
- FCVTU for floats to unsigned integers
- FCVTS for floats to signed integers
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some casts, the ones which don't change the size or the resulting
'machine type', are no-ops.
Directly simplify away such casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, all casts to a floating point type use OP_FPCAST.
This is maybe simple but rather inconvenient as it corresponds
to several quite different operations that later need extra
checks.
Change this by directly using different instructions for the
different cases:
- FCVTF for float-float conversions
- UCVTF for unsigned integer to floats
- SCVTF for signed integer to floats
and reject attempts to cast a pointer to a float.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The sparse command (aka the 'checker') does a number of additional
checks when used with the -v flag (I strongly believe that this
option is rarely used but let me not digress about it here).
One of these additional checks is about casts.
Add some testcases in order to catch any problems here.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
There are special problems when a typeof() expression can't
be evaluated. Catch this here.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
* merge the tests about implicit & explicit casts in a
single file as there was a lot of redundancy.
* shuffle the tests to linear/ or optim/
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An integer-to-float cast of a constant is currently simplified away
as if it were an integer-to-integer cast. That's bad.
Fix this by refusing to simplify away any integer-to-float casts
like already done for float-to-integer casts.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
An integer-to-float cast of a constant is currently simplified away
as if it were an integer-to-integer cast. That's bad.
Create a testcase for it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This allows, for example, simulating a run of the testsuite on a 32-bit
machine, or running the testsuite with some extra debugging flags.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some non-void functions in the testcases miss a return.
Add the missing return or mark the function as returning void.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
When using sparse it's common to compile a file and directly
run sparse on the same file, as is done for the kernel.
In this case, error messages from sparse are interspersed with
those from the compiler. It's thus not always easy to know from
which tool they come.
Fix this by allowing all diagnostic messages to be prefixed
with a configurable string, by default "sparse". More exactly,
an error message that was emitted like:
file.c:<line>:<col>: error: this is invalid code
can now be emitted as:
file.c:<line>:<col>: sparse: error: this is invalid code
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Reviewed-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
|
|
This will merge the following topic branches:
* builtin-dyn:
Expansion of macros like __FILE__, __DATE__, ... must, contrary
to usual macros, be done dynamically.
This series improves support for them:
- consider them as defined (like GCC does)
- use a table + method for their expansion (avoids useless compares)
- add support for __INCLUDE_LEVEL__ & __BASE_FILE__
* builtin-predef:
Add a function, predefine(), which allows directly adding
the definition of a simple macro (without args and with
a single number or ident as definition).
Also do the conversion of the concerned predefined macros
and some cleanups.
* builtin-overflow:
Add support for builtins doing overflow checking.
* has-builtin:
Add support for the __has_builtin() macro.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Sparse has support for a subset of GCC's large collection of builtin
functions. As for GCC, it's not easy to know which builtins are
supported in which versions.
clang has a good solution to this problem: it adds the checking macro
__has_builtin(<name>) which evaluates to 1 if <name> is a builtin
function supported by the compiler and 0 otherwise.
It can be used like:
#if __has_builtin(__builtin_clz)
#define clz(x) __builtin_clz(x)
#else
...
#endif
It's possible or probable that GCC will have this soon too:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66970
Add support for this __has_builtin() macro by extending
the evaluation of preprocessor expressions very much like
it is done to support defined().
Note: Some function-like builtin features, like __builtin_offset(), are
considered as a kind of keyword/operator and processed as such.
These are *not* considered as builtins by __has_builtin().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
It seems they will be used in the kernel so add them.
Unlike __builtin_uadd_overflow() and friends, these don't take
a fixed type and thus can't simply be declared but need their
own evaluate() method to do the type checking.
Note: of course, no expansion is done on them.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add a few silly testcases doing some expansion of a builtin macro.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
There was some support for it but it was just a define
that expanded to the fixed name "base_file.c".
Implement the real thing by saving the input filename
and replacing __BASE_FILE__ by it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This macro, which is supported by GCC, wasn't yet supported by sparse.
Add support for it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
GCC considers these macros as being defined, sparse doesn't.
Also sparse uses a sequence of comparisons, one for each of
__DATE__, __FILE__, ..., which is not ideal.
Fix this by defining and using a table associating each such
symbol with a method doing the expansion.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
GCC considers __LINE__, __FILE__, ... as being defined.
Add a testcase for this.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'fix-builtin-expect' and 'int-const-expr' into tip
|
|
This uses Martin's awesome macro to test if sparse's
notion of integer-const-expr is the same as GCC's.
It also tests that the result of this macro is itself a
constant integer expression.
Awesome-macro-by: Martin Uecker <Martin.Uecker@med.uni-goettingen.de>
Test-originally-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
If an error occurs at the end of the input, for example because
of a missing terminating ';' or '}', the error message is like:
builtin:0:0: error: ...
IOW, the stream name & position are not displayed, because the
current token is eof_token_entry which has no position.
This can be confusing and for sure doesn't point at where the error is.
Fix this by giving to eof_token_entry the end-of-stream position.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
On code like 'goto <some reserved code>', bind_symbol() reports
an error and doesn't bind the label's ident to the goto's label
symbol.
However, at evaluation time, the ident is unconditionally
dereferenced.
Avoid the crash by checking for a null ident before dereferencing it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Type-wise, sparse processes __builtin_expect() as:
1) returning 'int'
2) taking any type in first & second arg
3) returning exactly its first argument, silently, even
if this conflicts with 1).
but this doesn't match how gcc declares it:
long __builtin_expect(long, long);
Fix this by giving the proper prototype to this builtin
and removing the bogus 'returns an int'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
While sparse can parse VLAs, their size was left as
'undetermined' (-1) and trying to use sizeof on VLAs
resulted in the diagnostic 'error: cannot size the expression'.
Fix this by adding the needed code to evaluate the expressions
corresponding to the sizeof of VLAs.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
If a label is defined several times, an error is issued about it.
Nevertheless, the label is used as is, and once the code is linearized
several BBs are created for the same label and this creates
inconsistencies. For example, some code will trigger assertion failures
in rewrite_parent_branch().
Avoid the inconsistencies by ignoring redefined labels.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Redefined labels create inconsistencies in BB processing.
Add a testcase for it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Statements with an empty expression, like:
__context__();
or
__context__(x,);
are silently accepted. Worse, since NULL expressions are usually
ignored because it is assumed they have already been properly
diagnosed, no warnings of any kind are given at some later
stage.
Fix this by explicitly checking for empty expressions and
emitting an error message if needed.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The expected syntax for the __context__ statement is:
__context__(<expression>);
or
__context__(<context>, <expression>);
but originally it was just:
__context__ <expression>
In other words the parentheses were not needed and are
still not needed when no context is given.
One problem with the current way of parsing these statements is
that very little validation is done. For example, code like:
__context__;
is silently accepted, as is:
__context__ a, b;
which is of course not the same as:
__context__(a,b);
And code like:
__context__(,1);
results in a confusing error message:
error: an expression is expected before ')'
error: Expected ) in expression
error: got ,
So, given that:
* the kernel has always used the syntax with parentheses,
* the two-argument form requires the parentheses and thus
a function-like syntax,
use a more direct, robust and simple parsing which enforces
the function-like syntax for both forms.
|
|
The expected syntax for the __context__ statement is:
__context__(<inc/dec value>);
or
__context__(<context>, <inc/dec value>);
The distinction between the two formats is made by checking if
the expression is a PREOP with '(' as op and with a comma
expression as inner expression.
However, code like:
__context__;
or
__context__(;
crashes while trying to test the non-existing expression
(after the PREOP or after the comma expression).
Fix this by testing if the expression is non-null before
dereferencing it.
Note: this fix has the merit of directly addressing the problem
but doesn't allow a diagnostic to be issued for the case:
__context__;
which is considered as perfectly valid.
The next patch will take care of this.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
No check is done on whether the inc/dec value of context statements
is effectively a compile-time integer value: '0' is silently
used if it is not.
Change that by using get_expression_value() when linearizing
context statements (which has the added advantage to also
slightly simplify the code).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently the parsing of the attribute 'context' is rather
complex and uses a loop which allows 1, 2, 3 or more arguments.
But the real syntax is only correct for 2 or 3 arguments.
Furthermore the parsing mixes calls to expect() with its own
error reporting. This is a problem because if the error has first
been reported by expect(), the returned token is 'bad_token'
which has no position, so you can have error logs like:
test.c:1:43: error: Expected ( after context attribute
test.c:1:43: error: got )
builtin:0:0: error: expected context input/output values
But the 'builtin:0:0' should really be 'test.c:1:43' or, even better,
there shouldn't be double error reporting.
Fix this by simplifying the parsing and only supporting 2 or 3 args.
Also, make the error messages slightly more explicit about the
nature of the error.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
It's certainly worthwhile to have some tests, but to not
slow down the testsuite and to not create a dependency
on python, this test needs to be run explicitly with:
./test-suite doc/cdoc.cdoc
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'fix-redef-typedef' and 'fixes' into tip
|
|
One of sparse's extensions to the C language is an operator
to check ranges. This operator takes 3 operands: the expression
to be checked and the bounds.
The syntax for this operator is such that the operands need to
be a 3-item comma-separated expression. This is a bit weird
and doesn't play along very well with macros, for example.
Change the syntax to a 3-argument function-like operator.
NB. Of course, this will break all existing uses of this
extension not using parentheses around the comma
expression, but there doesn't seem to be any.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
constant_symbol_value() needs several conditions before returning
a valid value.
However not all conditions are always met. One such case occurs when
a union is involved and the value we look for is of a different
type than the initializer.
Add two test cases showing (some of) the problems.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Ideally, the testcases should be universal, but it happens that
some of them need to test some specificities or are meaningless
or plainly wrong in some situations. In such cases, these tests
must be ignored.
Currently, the only mechanisms to ignore a test are:
1) ignoring the tests depending on a tool which cannot be compiled
(like, for example, those using sparse-llvm when LLVM is not
installed);
2) some rather coarse criteria using the name of the arch used
to run the tests.
Allow more flexibility by allowing some tests to be excluded based on
the success or failure of an arbitrary condition via _Static_assert().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
with help from Linus (many moons ago) and Luc this year.
A sparse addition to print all compound/composite global data symbols
with their sizes and alignment.
Usage: -vcompound
Example:
$ sparse -vcompound compound-sizes.c
compound-sizes.c:39:17: union un static [toplevel] un: compound size 192, alignment 8
compound-sizes.c:42:25: struct inventory static [toplevel] inven[100]: compound size 19200, alignment 8
compound-sizes.c:51:33: struct inventory static [toplevel] [usertype] invent[10]: compound size 1920, alignment 8
compound-sizes.c:58:25: float static [toplevel] floats[42]: compound size 168, alignment 4
compound-sizes.c:59:25: double static [toplevel] doubles[84]: compound size 672, alignment 8
and validation:
$ ./test-suite single compound-sizes.c
TEST compound-sizes (compound-sizes.c)
compound-sizes.c passed !
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Ideally, the testcases should be universal, but it happens that
some of them need to test some specificities or are meaningless
or plainly wrong in some situations. In such cases, these tests
must be ignored.
Currently, the only mechanisms to ignore a test are:
1) ignoring the tests depending on a tool which cannot be compiled
(like, for example, those using sparse-llvm when LLVM is not
installed);
2) some rather coarse criteria using the name of the arch used
to run the tests.
Allow more flexibility by allowing some tests to be excluded based on
the evaluation of some pre-processor expression at test-time.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
validation/{phase2/backslash,phase3/comments} are two ancient
testcases that predate ./test-suite and they are ignored by
the testsuite because they do not have a '.c' extension.
Change this by:
- renaming them with a '.c' extension
- moving them to validation/preprocessor/
- adding the testsuite tags & results to them
- removing comments about their previous status
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
check_duplicates() verifies that symbols are not redefined and
warns if they are.
However, this function is only called at evaluation-time and
then only for symbols corresponding to objects and functions.
So, typedefs can be redefined without any kind of diagnostic.
Fix this by calling check_duplicates() at parsing time on typedefs.
Note: this is C11's semantics, or GCC's C89/C99 semantics in non-pedantic mode.
Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, sparse doesn't issue a diagnostic when a typedef
is redefined.
Add some testcases for this.
Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, type_difference() ignores array sizes.
Add some testcases for this.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, type_difference() doesn't make a distinction between
enums & ints.
Add some testcases for this and mark the test as 'known-to-fail'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
It's easy to make some errors or some wrong assumptions
about the size and alignment of basic types, certainly so when
taking into account the -m32, -m64 & -msize-llp64 flags and
the default behaviour on different archs & environments.
Try to catch these errors by adding a testcase for the
size & alignment of int, long, long long & void*.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Recent changes to the min()/max() macros in include/linux/kernel.h
have added a lot of noise when compiling the kernel with Sparse checking
enabled. This mostly is due to the *huge* increase in the number of
sizeof(void) warnings, a large number of which can safely be ignored.
Add the -Wpointer-arith flag to enable/disable these warnings (along
with the warning when applying sizeof to function types as well as
warning about pointer arithmetic on these types exactly like the
GCC -Wpointer-arith flag) on demand; the warning itself has been disabled
by default to reduce the large influx of noise which was inadvertently
added by commit 3c8ba0d61d04ced9f8 (kernel.h: Retain constant expression
output for max()/min()).
Update the manpage to document the new flag and add a validation case
for sizeof(void).
CC: Kees Cook <keescook@chromium.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Martin Uecker <Martin.Uecker@med.uni-goettingen.de>
CC: Al Viro <viro@ZenIV.linux.org.uk>
CC: Christopher Li <sparse@chrisli.org>
CC: Joey Pabalinas <joeypabalinas@gmail.com>
CC: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Clean up the grammar/capitalization of the -Wsizeof-bool sections and
italicize the size (1) so that it is consistent with the surrounding
text.
CC: Al Viro <viro@ZenIV.linux.org.uk>
CC: Christopher Li <sparse@chrisli.org>
CC: Joey Pabalinas <joeypabalinas@gmail.com>
CC: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Joey Pabalinas <joeypabalinas@gmail.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In preparation for the upcoming introduction of -Wpointer-arith,
use -Wpointer-arith for the tests that will need it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During the linearization of a function, returns are directly
linearized as phi-sources and the exit BB contains the
corresponding phi-node and the unique OP_RET.
There is also a kind of optimization that is done if there is
only a single return statement and thus a single phi-source:
the phi-source and the phi-node are simply ignored and the
unique value is directly used by the OP_RET instruction.
While this optimization makes sense, it also has some cons:
- the phi-node and the phi-source are created anyway and will
need to be removed during cleanup;
- the corresponding optimization needs to be done anyway during
simplification;
- it's only a tiny special case which saves very little.
So, keep things simple and generic and leave this sort of
simplification for the cleanup/simplification phase.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A select instruction like:
select r <- x, 0, x
always gives zero as result but the optimizer doesn't know this.
Change this by teaching the optimizer about it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Normal instruction simplification & CSE must not be done
on dead blocks (otherwise it's possible to have unsound
situations like an instruction defining its own
operand, with possible infinite loops as a consequence).
This is ensured by the main optimization loop but not after
BB packing or flow simplification.
Fix this by calling kill_unreachable_bbs() after BB packing
and flow simplification.
|
|
Ensure that the tests for infinite optimization loops
never run forever by adding a timeout to them.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Add some tests showing missed optimization opportunities.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In simplify_loads(), phisrcs are created in find_dominating_parents()
and are then supposed to be used in rewrite_load_instruction().
However, it may happen (quite often) that find_dominating_parents()
finds a dominator for one of the branches, creates a phi-source for it,
records its usage and then doesn't find a dominator in one of the other
parent branches. In this case, the function returns early and the
created phisrcs are simply ignored. These phisrcs can't be simplified
away as dead instructions because they still have their usage recorded.
Fix this by explicitly removing these ignored phisrcs.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
REPEAT_SYMBOL_CLEANUP is set when there are changes to the
addressability of a symbol, typically when an OP_SYMADDR is removed.
However, currently most OP_SYMADDRs are 'simplified'/folded into
their use. For example:
symaddr.64 %r1 <- var
add.64 %r2 <- %r1, $4
is simplified into:
add.64 %r2 <- var, $4
One of the bad consequences of this 'simplification' is that
if the 'add' instruction is later optimized away, this corresponds
to an effective change to the addressability of the symbol. This
is exactly as if the 'symaddr' had been removed before being so
'simplified', but because the symaddr is not there anymore,
no change to the addressability of the symbol is recorded
and some further optimizations may be missed.
Change that by checking, each time a usage is removed from an
instruction, if the corresponding pseudo was a symbol and setting
REPEAT_SYMBOL_CLEANUP if it was.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function address_taken() checks that the address of a variable
is not used for anything other than doing a memory access.
If the check fails, it means that the address can escape and
thus can be used to modify the variable indirectly, and the
memop simplification would then be unsafe.
The function does this check by assuming that any use by an instruction
other than a load or a store lets the address escape, which is true.
However, it doesn't take into account that if the variable's address is
used, not as the address of a store, but as the value stored, then
this address also escapes.
Fix this by adding a check that the use of the variable's address
is effectively done as the address of stores & loads and not as
the value stored by the memop.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The function check_access() verifies that memory accesses are not
done out of bounds. This is good.
However, this function is called at each run of simplify_loads(),
which means that the same warning can be given multiple times,
which is annoying.
Fix this by using the newly added 'tainted' field to not warn
a second time on the same instruction.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since the patterns in the testcases are evaluated in the shell
script, the backslash used to escape characters special to the
pattern needs itself to be escaped. There are a few cases where
this wasn't done, partly because 'format -l' gave a single
escape in its template.
Fix all occurrences needing this double-escape as well as the
'format -l' template.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
"append" was used instead of "$append" when formatting
a new test, with the result that the infos for the test
system were always appended to the test file, which is
maybe often but not always desirable.
So, add the missing '$' when using the variable.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, we have OP_MULS & OP_MULU but, unless it's a full
widening multiplication, both must give exactly the same
result (the world runs on 2's complement CPUs now, right?).
Also, the IR doesn't have widening multiplication, only
instructions where both operands and the result have
the same size.
So, since there is no reason to keep 2 instructions,
merge OP_MULS & OP_MULU into a single one: OP_MUL.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
__builtin_isinf(), isnan() & isnormal() are all special cases
of __builtin_fpclassify().
Add a few cases testing that those are correctly expanded
when the argument is a constant.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
More specifically: for __builtin_nan(), _huge_val() & _inf()
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
These builtins should accept a single argument of any
floating-point type and should not do the usual promotion
of float to double.
Add the type and argument number check in the builtin's
evaluate method.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently, (most) builtin functions are declared like
normal function prototypes. Furthermore, these declarations
are done via the pre-buffer mechanism and thus need to be
tokenized & parsed like normal code.
This is far from being 'builtin' and involves unneeded
processing.
Change this by skipping this pre-buffer phase and directly creating
the appropriate symbol for them.
Note: the correct mechanism to be used to make them really builtin
is via init_builtins(), used when we have a real semantic
action for the builtin.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
These builtins are defined by gcc since 4.4. They are also now
used by the isinf, isfinite and isnan macros. So using them with a
newer gcc causes 'undefined identifier' errors.
Add the builtin definitions and some validation checks for these
functions.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'kill-dead-loads' and 'kill-dead-stores' into tip
Each of these branches contains a fix for the missing removal of
value or address usage when unneeded loads or stores are killed
during symbol simplification.
No conflicts, but one of the tests fails in the branches while it
correctly succeeds after they are merged (so no code conflict but
a semantic conflict).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The warning message for unknown attributes is a bit longish and
uses the word 'attribute' twice.
Change the message for something more direct, shorter and without
repetition.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Generally, we're not interested in those warnings, but we
can always explicitly ask for them if needed.
So make the flag '-Wunknown-attribute' off by default.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In the initial simplify_symbols() and in simplify_memops()
when a store is simplified away, it's killed via kill_store()
where its ->bb is set to NULL and the usage is removed from
the value. However the usage is not removed from the address.
As a consequence, the code related to the address calculation
is not optimized away as it should be, since the value is
wrongly considered as needed.
Fix this by using kill_instruction_force() to remove these
stores.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Like other instructions producing a value, OP_LOADs can be dead.
But currently, dead OP_LOADs are not removed as dead_insn() does
for other instructions.
Fix this by checking at simplification time if an OP_LOAD is
dead and calling kill_instruction() if it is the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In some situations, loads and other instructions can be
unreachable already when linearized, for example in code like:
void foo(int *ptr)
{
return;
*ptr;
}
Such loads are detected in find_dominating_stores() and must
be discarded. This is done and the load has its opcode set
to OP_LNOP (which is only useful for debugging) but its
address is left as being used by the load.
Fix this by removing the address usage.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Converted loads are dead and can be removed, but that
also means that the address usage needs to be adjusted,
which wasn't done.
Fix this by directly using kill_instruction().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Instructions with a null ->bb are instructions which have
been killed. As such, they must always be ignored,
but this is not always the case.
Fix this by adding a check for a null ->bb wherever there is
some looping over all the instructions of a basic block.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During simplify_one_symbol(), if possible, loads are replaced by
an OP_PHI and the corresponding OP_PHISOURCEs. To simplify things
further, if all the phisrcs correspond to a unique pseudo (often
because there is only a single phisrc), then it's useless to
create the OP_PHI: the created OP_PHISOURCEs can be removed and
the initial load can be converted to the unique pseudo.
However, if the unique pseudo was never used, the removal of
the OP_PHISOURCEs, done *before* the load conversion, will
kill the defining load (at this point the only users of the
pseudo were the OP_PHISOURCEs) which will then erroneously make
a VOID from the pseudo.
Fix this by doing the load conversion before removing the
unneeded OP_PHISOURCEs.
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
When the operand of typeof() is invalid, the corresponding
type is also invalid and can't be used. But currently,
when the corresponding SYM_TYPEOF symbol is examined,
this symbol is simply returned as is. This, of course,
creates problems for subsequent code since an examined type is
not supposed to be a SYM_TYPEOF anymore (one of the symptoms
will be warnings about "unknown type 11").
Fix this by changing the SYM_TYPEOF into a SYM_NODE (as it's
expected to be) pointing to bad_ctype. So further processing gently
continues and the 'bad_ctype' will do its job when needed.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The type 'bad_ctype' is only used after an error has been detected.
Since this error has also been reported, there is no reason
to issue more errors when a 'bad_ctype' is involved. This allows
focusing on the root cause of the error.
Fix this by checking in bad_expr_type() if one of the operands
already has a 'bad_ctype' and not issuing a diagnostic message
in this case.
Note: the kernel has a bunch of these situations where the
exact same warning is given several times in a row,
sometimes as much as a dozen time.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
When evaluating a conditional, the expression is first evaluated
and some further verifications and processing are done if
the returned type is not NULL.
However, the returned type can also be 'bad_ctype' and if it is
the case, the additional verifications will just give meaningless
additional warnings.
Fix this by using the new helper valid_type() instead of just
testing for a null ctype.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Commit "b5867a33b (fix evaluation of a function or array symbol in conditionals)"
added a missing call to degenerate(expr) but this must not be done
if the expression is erroneous.
Fix this by bypassing the call to degenerate() if the ctype is NULL.
Fixes: b5867a33b62c04811784c6fc233c601a4f2b0841
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Once an invalid type is present in some sub-expression,
every upper level of the expression has an invalid type,
but we should only warn about the innermost/root error.
However, this is not currently the case and invalid types
can create duplicated warnings, sometimes even a succession
of such warnings.
Add some testcases to catch such situations.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The option -Wptr-subtraction-blows warns when pointer subtraction
is done and the base size is not a power-of-2.
However, the current message doesn't give much context.
Change this by giving the base type and the size in the warning.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Before the introduction of OP_SETFVAL, floating-point were
created via OP_SETVAL whose CSE is done by comparing the
pointer of the corresponding expression without any
interpretation of this pointer.
As a consequence, even if two OP_SETVALs have two identical
expressions (values), in most cases the corresponding pointers
are not identical, completely inhibiting the CSE of OP_SETVALs.
Fix the CSE of floating-point literals by directly using
the value given by the new OP_SETFVAL.
Note: to respect some of the subtleties of floating-point,
the equality comparison of two literals is not done on
the floating-point value itself but bit-by-bit on its
binary representation (as such we can continue to make the
distinction between +0.0 & -0.0, handle NaNs, ...).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
OP_SETVAL is used to create floating-point and string
literals as well as labels-as-values. This multi-purpose aspect
sometimes makes things a bit more complicated.
Change this by using a new instruction for the direct
creation of floating-point literals without needing
to have an intermediate EXPR_FVALUE.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
'testcase-fix-missing-return', 'type-as-first-class', 'llvm-zero-init' and 'llvm-prototype' into tip
|
|
'size-unsized-arrays' and 'master' into tip
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Till now, sparse's plain chars were signed with no possibility
to change it. This is a problem when using sparse on code
for architectures like ARM where chars are by default unsigned
or simply for code compiled with GCC's '-f[no-][un]signed-char'.
Change this by parsing these options and adjusting the type
of plain chars accordingly.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
When an array is declared without an explicit size, an
implicit size is given by the number of elements in its initializer
if one is present.
Currently, in sparse, this implicit size is only associated with
the node corresponding to the initializer while the base type is
left unsized. This is a problem because the node is only used for
the modifiers & address-space, and the bit_size of a node is expected
to match the size of the basetype. So this implicit size can be used
when directly using the bit_size of the node, but the array itself
is essentially left unsized.
It's not enough to simply copy the bit_size of the node to the base
type because:
1) sym->array_size needs to be set in the node & the base type;
2) the base type can be shared between several declarators.
It's thus needed to copy the base type to unshare it before
setting sym->array_size.
Reported-by: Dibyendu Majumdar <mobile@majumdar.org.uk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Sparse, as an extension and with a special syntax, supports the
direct comparison of types, either equality modulo qualifiers for
'==' and '!=', or size comparison for '<', '>', '<=' and '>='.
Add some testcases to avoid possible regressions here.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The expression for the condition is dereferenced
for its type even if it is NULL.
Fix this by returning early if the expression linearizes to VOID.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Empty sub-expressions are normally caught as syntax error
in most expressions but this is not the case for parenthesized
expressions.
Fix this by adding a check at the end of parens_expressions()
and warning if needed.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In some testcases, some non-void functions are missing
a return statement. This is undetected because:
- no checks are done and so no warnings can be given
- there are some bugs in the linearization of returns
(the value returned by the last statement is implicitly used).
Fix the testcases by adding the missing return.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In sparse-llvm, when a reference to a function is made via
get_sym_value(), for example for a call, the reference is
created with LLVMAddFunction() and if needed later, the existing
reference is returned via LLVMGetNamedFunction().
This is the correct way to do it.
However, when emitting the code for a function definition, a fresh
reference is always made. If a previous reference to this function
already existed, the second one will have a slightly different name:
the given name suffixed by ".<somenumber>". LLVM does this for every
created reference, to disambiguate them. As a consequence, the
compiled function will not be named "<functionname>", as expected,
but "<functionname>.<somenumber>".
Fix this by always using get_sym_value() when emitting the code
for the function definition, as this will return the reference
for the given function name if it already exists.
This has the added bonus of removing some code duplication.
CC: Dibyendu Majumdar <mobile@majumdar.org.uk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since commit "1c182507c (fix support of floating-point compare)",
CSE wasn't done anymore on floating-point compare.
Fix this by adding the two missing 'case OP_FPCMP ...'
Fixes: 1c182507c3981aa20193c68d7cfd32d750b571cf
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During the expansion of a dereference, it's checked if the
initializer corresponding to the offset we're interested in
is a constant. If it's the case, the dereference can be avoided
and the constant given as initializer can be used instead.
However, it's not enough to check for the offset since, for
bitfields, there are (usually) several distinct fields at the
same offset. Currently, the first initializer matching the
offset is selected and, if a constant, its value is used
for the result of the dereferencing of the whole structure.
Fix this by refusing such expansion if the constant value
corresponds to a bitfield.
Reported-by: Dibyendu Majumdar <mobile@majumdar.org.uk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Reported-by: Dibyendu Majumdar <mobile@majumdar.org.uk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
For the '*' operator and functions, the C standard says:
"If the operand points to a function, the result is a
function designator; ... If the operand has type
‘pointer to type’, the result has type ‘type’".
but also (C11 6.3.2.1p4):
"(except with 'sizeof' ...) a function designator with type
‘function returning type’ is converted to an expression
that has type ‘pointer to function returning type’".
This means that a dereference of a function designator is
a no-op since the resulting expression is immediately converted
back to a pointer to the function.
This change effectively drops any dereference of function types
during their evaluation.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A function call via a function pointer can be written like:
fp(), (fp)() or (*fp)()
In the latter case the dereference is unneeded but legal and
idiomatic.
However, the linearization doesn't handle this unneeded deref
and leads to the generation of a load of the pointer:
int foo(int a, int (*fun)(int))
{
(*fun)(a);
}
gives something like:
foo:
load %r2 <- 0[%arg2]
call.32 %r3 <- %r2, %arg1
ret.32 %r3
This happens because, at linearization, the deref is dropped
but only if the sub-expression is a symbol; the test for
a node is not done.
Fix this by using is_func_type() to test the type of all call
expressions.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Examination of a pointer type doesn't examine the corresponding
base type (this base type may not yet be complete). So, this
examination must be done later, when the base type is needed.
However, in some cases it's possible to call evaluate_dereference()
while the base type is still unexamined.
Fix this by adding the missing examine_symbol_type() on the base type.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
evaluate_dereference() lacks an explicit examination of the
base type. Most of the time, the base type has already been
examined via another path, but in some cases it's not.
The symptom here is the dereferenced value having a null size.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
GCC's manual and POSIX say about the '-D' option something like:
'-D name[=value]' should be treated as if in the directive
'#define name value' (with '1' as default for the value),
including its tokenization.
So an option like '-DM(X, Y)=...' should be processed like a
directive '#define M(X, Y) ...'.
However, the current code treats a space as a separator between
the macro and its definition, just like the '='. As a consequence,
the above option is processed as if the directive were
'#define M(X, Y)=...', with 'M(X,' as the macro (name) and
'Y)=...' as its definition.
Fix this by no longer treating the space character specially,
thus only using '=' as the separator.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
|
|
During testing it's sometimes useful to force some default arguments
for all commands. An example of this is using '-m32', which essentially
allows running the testsuite on a 64-bit machine as if on a 32-bit one.
Allow this by using the environment variable 'SPARSE_TEST_ARGS' to
hold default arguments for the test commands.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Currently the testsuite uses 'eval echo $cmd' to expand
the name of the test file to be given on the command line.
This has the annoying consequence of going a bit too far in
the expansion of variables and destroying any quote and
whitespace escaping that would have been done.
Fix this by doing the eval later, when effectively executing
the command.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Commit 399c43889 (testsuite: get options from env too) allowed
the testsuite to take extra options from the environment but
did it in a crude way involving exec.
Change this by using 'set --' instead of doing an 'exec'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Till now, sparse was needlessly strict in what it accepted in
'-D' options. More specifically, it didn't accept:
1) a separated '-D' and the macro definition, like:
sparse -D MACRO[=definition] ...
2) a space between the '-D' and the macro name, like:
sparse '-D MACRO[=definition]' ...
Case 1) is clearly accepted by GCC, clang and should be
accepted by a POSIX c99. Case 2)'s status is less clear
but it is also accepted by GCC and clang (leaving any validation
to the corresponding internal #define).
Fix this by accepting a separated command line argument for '-D'
and the macro (and removing the check that rejected the macro
part if it started with a space).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some fixes & improvements to the testsuite; mainly:
- allow running the testsuite on all the tests of a subdir
- teach 'format' to directly append to the testcase
- validate the 'check-...' tags
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The 'format' command creates the information needed for the testcase
from the input file and outputs it on stdout. The developer must
then add this to the input file.
Let's do this automatically by adding an option '-a' to the 'format'
command to directly append the infos to the input file.
|
|
The getopt loop used to break by default and only some
options had to explicitly call 'shift' and 'continue'
to process further elements.
Change this to a 'normal' loop, shifting the next arg by default
and breaking out of the loop when needed.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
During development, it is very useful to be able to run only
some of the tests, maybe a whole class.
Help this by allowing the testsuite to be run on only a subdir
of the 'validation/' directory.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The text for the testsuite usage used 'none' as if it was an
option/keyword while it only meant the absence of arguments.
Make the text clearer by removing the 'none' and being explicit
about the absence of arguments.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is a preparatory step to allow running only a part
of the testsuite (a subdir).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is a preparatory step to allow running only a part
of the testsuite (a subdir).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
This is a preparatory step to allow running only a part
of the testsuite (a subdir).
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Making a typo in one of the 'check-...' tags can make
a testcase useless and thus incapable of detecting a regression.
Add some validation to these tags in order to detect wrong
tags.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
A few testcases had typos in their 'check-...' tags or
the tag was plainly invalid.
Fix them in accordance to the doc.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The flag 'quiet' is used to quiet unwanted error messages,
for example for testcases known to fail, but this flag is reset
too late, so that the beginning of the next testcase will run
with the value from the previous one.
Fix this by resetting the flag at the beginning of each testcase.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Allow this new helper to indicate which file triggered the
warning and replace the existing calls to 'echo "warning: ..."'.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
So, we can use them inside get_tag_value().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
It was too ugly (and a bit longish).
Remove it.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The old one is too ugly and has to die.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The exact syntax for commands is:
'check-command: ' <command> <args>...
and the command itself must *not* be prefixed with './'.
Fix the last three that had it wrong.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Expressions involving the logical-not '!' do not
call degenerate().
Since the result type is always 'int' and thus independent
of the expression being negated, this has no effect on the
type-checking but the linearization is wrong.
For example, code like:
int foo(void)
{
if (!arr) return 1;
return 0;
}
generates:
foo:
load %r6 <- 0[arr]
seteq.32 %r7 <- VOID, $0
ret.32 %r7
The 'load' being obviously wrong.
Fix this by adding the missing degenerate().
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
and unify the existing ones.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Some testcases have their command specified as './<command name>'
but the './' part is unneeded as all commands are first prefixed
with '../' before being run.
Furthermore, the presence of these './' inhibits simple
filtering of the disabled commands.
Fix this by stripping the './' where it was used.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Floating-point arithmetic is quite different from
arithmetic on integers or on real numbers.
In particular, most transformations and simplifications that
are valid on integers are invalid when done on floats.
For example:
- associativity doesn't hold
- distributivity doesn't hold
- comparison is tricky & complex
This is because of (among other things):
- limited precision, rounding everywhere
- presence of signed zeroes
- presence of infinities
- presence of NaNs (signaling or quiet)
- presence of numbers without an inverse
- several kinds of exceptions.
Since they don't follow the same rules as their integer
counterparts, it's better to give them a specific opcode
instead of having to test the type of the operands at
each manipulation.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Comparison of floating-point values can't be done
like for integral values because of the possibility of
NaNs, which can't be ordered with normal values or even between
themselves.
The real difference appears once any "reasoning" is
done with the result of the comparison. For example, once NaNs
are taken into account, "!(a < b)" and "(a >= b)" are not the same.
In fact, the usual comparison operators must be reinterpreted
as implicitly first testing if either operand is a NaN
and returning 'false' if it is the case. Thus "a < b" becomes
"!isnan(a) && !isnan(b) && (a < b)".
If we need to negate the comparison we get "!(a < b)", which
naturally becomes "isnan(a) || isnan(b) || (a >= b)".
We thus need two sets of comparison operators for floats:
one for the "ordered" comparisons (only true if neither operand
is a NaN) and one for the "unordered" ones (also true if either
operand is a NaN). Negating a comparison switches from one
set to the other.
So, introduce another set of instructions for the comparison
of floats.
Note: the C standard requires that:
*) "x == x" is false if x is a NaN,
*) "x != x" is true if x is a NaN,
and this is coherent with "x != x" <-> "!(x == x)".
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Using the pre- or post-increment or decrement operators on
floating-point values mixes the addition of a floating-point
value with an *integral* constant 1 or -1.
Fix this by checking if we're dealing with fp or not and using
the proper fp constants (1.0 or -1.0) if that is the case.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
LLVM automatically adds a numeric suffix to names it
creates automatically. So, if intermediate names must
be created for a pseudo whose name was, for example, "%R4",
these new names will be "%R41", "%R42".
This is quite annoying because we can't make the distinction
between these names and the original names (maybe of some other
pseudos whose names were "%R41" & "%R42").
Change this by adding a "." at the end of each name, as this
then allows us to see what the original name was.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
The linearized code, sparse's IR, has no use for C's complex type
system. Those types are checked in previous phases and the pseudos
don't have a type directly attached to them, as all the needed
typing info is conveyed by the instructions.
In particular, PSEUDO_VALs (used for integer and address constants)
are completely typeless.
There is a problem with this when calling a variadic function
with a constant argument, as in this case there is no type in the
function prototype (for the variadic part, of course) and there is
no defining instruction holding the type of the argument.
Fix this by using the type of the arguments explicitly given
in the OP_CALL instructions.
Reported-by: Dibyendu Majumdar <mobile@majumdar.org.uk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Pointer arithmetic and/or simplification can mix up pointer
and integer types.
Fix this by adding casts before all non-floating-point binops
and adjusting the result type if needed to match the instruction.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
Since sparse's constants are typeless, comparing a pointer with
an address constant lacks correct type information.
Fix this by casting the constant to the same type as the LHS.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|
|
In output_op_compare(), everything that is not of integer
type is treated as a float. Pointers disagree.
Fix this by rearranging the code and treating pointers like
integers, as required for LLVM's icmp.
Reported-by: Dibyendu Majumdar <mobile@majumdar.org.uk>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
|