path: root/validation
Age | Commit message | Author | Files | Lines
2020-06-12 | testsuite: plain chars are never compatible with [un]signed chars | Luc Van Oostenryck | 1 | -0/+19
In standard C, plain chars are either signed or unsigned but are only compatible with themselves, not with signed chars nor with unsigned ones. However, Sparse gets this wrong and makes them compatible with the corresponding sign-qualified chars. So, add a testcase for this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
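The rule above can be observed with a generic selection. This is a minimal sketch, not the testcase from the commit: the macro CHAR_KIND and the three helper functions are hypothetical names. Listing all three character types in one _Generic is only accepted because they are mutually incompatible.

```c
/* Hypothetical helper macro: names the character type of an expression.
 * Two compatible types may not appear in the same _Generic, so this
 * compiles only because char, signed char and unsigned char are three
 * distinct, incompatible types. */
#define CHAR_KIND(x) _Generic((x),              \
        char:          "char",                  \
        signed char:   "signed char",           \
        unsigned char: "unsigned char",         \
        default:       "other")

const char *kind_of_plain(void)    { char c = 'a';          return CHAR_KIND(c); }
const char *kind_of_signed(void)   { signed char c = 'a';   return CHAR_KIND(c); }
const char *kind_of_unsigned(void) { unsigned char c = 'a'; return CHAR_KIND(c); }
```

A checker that treats plain char as compatible with one of the sign-qualified types would wrongly report a duplicate association here.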
2020-06-09 | generic: fix crash when nothing match | Luc Van Oostenryck | 1 | -0/+23
The code for the generic selection doesn't take into account the fact that the default entry could be absent. Catch the case where nothing matches and issue an error. Fixes: c100a7ab2504f9e6fe6b6d3f9a010a8ea5ed30a3 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
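For illustration, here is a generic selection without a default entry. The macro TYPE_ID and the helpers are hypothetical names, not taken from the testcase. Every use shown matches one association; the failing case is only described in a comment because it does not compile.

```c
/* Hypothetical macro: deliberately has no 'default' association. */
#define TYPE_ID(x) _Generic((x), int: 1, long: 2, double: 3)

int id_of_int(void)    { return TYPE_ID(0); }
int id_of_long(void)   { return TYPE_ID(0L); }
int id_of_double(void) { return TYPE_ID(0.0); }

/* TYPE_ID(0.0f) would match nothing: float is not listed and there is
 * no default entry. That case must be diagnosed as an error. */
```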
2020-06-03 | univ-init: scalar initializer needs some additional checks | Luc Van Oostenryck | 2 | -0/+35
Currently, -Wno-universal-initializer is implemented by simply replacing '{ 0 }' by '{ }'. However, this is a bit too simple when it concerns scalars initialized with '{ 0 }' because:
* sparse & GCC issued warnings for empty scalar initializers
* initializing a pointer with '{ }' is extra bad.
So, restore the old behaviour for scalar initializers. This is done by leaving '{ 0 }' as-is at parse time and changing it to '{ }' only at evaluation time for compound initializers. Fixes: 537e3e2daebd37d69447e65535fc94e82b38fc18 Thanks-to: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-28 | add support for _Generic | Luc Van Oostenryck | 3 | -0/+240
It's only lightly tested but is fine for the latest kernels like https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/kcsan Note: a known difference with GCC is that Sparse doesn't make the distinction between 'signed char' and a plain 'char' (on platforms where plain chars are signed) since it's using the usual type compatibility as used for assignments. Reference: lore.kernel.org/r/20200527235442.GC1805@zn.tnic Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-27 | testsuite: add testcase for duplicated local definitions | Luc Van Oostenryck | 1 | -0/+28
Sparse warns when a top-level object is initialized multiple times but doesn't warn when it's a local object. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | Merge branch 'univ' | Luc Van Oostenryck | 2 | -0/+25
* conditionally accept { 0 } without warnings
2020-05-21 | Merge branch 'bad-goto' | Luc Van Oostenryck | 21 | -16/+463
* warn when jumping into statement expressions
* warn when using undefined labels
* warn on defined but unused labels

It's not allowed to do a goto into a statement expression. For example, it's not well defined what should happen if such an expression is not otherwise reachable and/or can be optimized away. For such situations GCC issues an error; clang doesn't and produces a valid IR, but Sparse produces an invalid IR with branches to nonexistent BBs.

The goals of the patches in this series are:
*) to detect such gotos at evaluation time;
*) to issue a sensible error message;
*) to avoid the linearization of functions with invalid gotos.

The implementation principle behind these is to add a new kind of scope (label_scope): one for the usual function scope of labels and one for each statement expression. This new scope, instead of being used as a real scope for the visibility of labels, is used to mark where labels are defined and where they're used.

Using this label scope as a real scope controlling the visibility of labels was quite appealing and was the initial drive for this implementation, but it has the problem of inner scopes shadowing earlier occurrences of identically named labels. This is of course desired for 'normal' symbols but for labels (which are normally visible in the whole function and which may be used before being declared/defined) it has the disadvantages of:
*) inhibiting the detection of misuses once an inner scope is closed;
*) allowing several distinct labels with the same name in a single function (this can be regarded as a feature but __label__ at block scope should be used for this);
*) creating differences about what is permissible or not between sparse and GCC or clang.
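The remark about __label__ at block scope can be illustrated with a small sketch. It relies on two GNU C extensions (statement expressions and __label__), so it assumes GCC or clang. The macro FIRST_NEGATIVE and the function name are hypothetical.

```c
/* Hypothetical macro. '__label__ done' makes the label local to the
 * statement expression, so the goto never crosses its boundary and
 * each expansion gets its own 'done'. */
#define FIRST_NEGATIVE(arr, n) ({                       \
        __label__ done;                                 \
        int idx_ = -1;                                  \
        for (int i_ = 0; i_ < (n); i_++) {              \
                if ((arr)[i_] < 0) {                    \
                        idx_ = i_;                      \
                        goto done;                      \
                }                                       \
        }                                               \
done:   ;                                               \
        idx_;                                           \
})

int sum_first_negatives(const int *a, int na, const int *b, int nb)
{
        /* Two expansions in one function: no label clash. */
        return FIRST_NEGATIVE(a, na) + FIRST_NEGATIVE(b, nb);
}
```

Without the __label__ declaration, the two expansions would define the same function-scope label twice.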
2020-05-21 | univ-init: conditionally accept { 0 } without warnings | Luc Van Oostenryck | 2 | -0/+25
In standard C, '{ 0 }' is valid to initialize any compound object. OTOH, Sparse allows '{ }' for the same purpose but: 1) '{ }' is not standard 2) Sparse warns when using '0' to initialize pointers. Some projects (git) legitimately like to be able to use the standard '{ 0 }' without the null-pointer warnings. So, add a new warning flag (-Wno-universal-initializer) to handle '{ 0 }' as '{ }', suppressing the warnings. Reference: https://lore.kernel.org/git/1df91aa4-dda5-64da-6ae3-5d65e50a55c5@ramsayjones.plus.com/ Reference: https://lore.kernel.org/git/e6796c60-a870-e761-3b07-b680f934c537@ramsayjones.plus.com/ Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
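As an illustration of the idiom in question, the following sketch zero-initializes a structure whose first member is a pointer, which is exactly the situation where the '0' would otherwise trigger the null-pointer warning. The struct and function names are hypothetical.

```c
#include <stddef.h>

struct conn {                   /* hypothetical type */
        const char *host;       /* first member is a pointer */
        int         port;
        int         flags[3];
};

int conn_is_zeroed(void)
{
        struct conn c = { 0 };  /* standard universal zero initializer */

        return c.host == NULL && c.port == 0 &&
               c.flags[0] == 0 && c.flags[1] == 0 && c.flags[2] == 0;
}
```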
2020-05-21 | bad-label: respect attribute((unused)) | Luc Van Oostenryck | 1 | -0/+6
Until now, attributes on labels were simply ignored. This was fine since nothing was done with them anyway. But now that Sparse can give a warning for unused labels it would be nice to also support the attribute 'unused' so that the warning is not issued when not desired. So, add a small helper around handle_attributes() and use this instead of skipping the attributes. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
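A label carrying the attribute might look like the sketch below. It uses the GNU label-attribute syntax and so assumes GCC or clang; the function and label names are hypothetical.

```c
int parse_digit(int ch)
{
        if (ch < '0' || ch > '9')
                goto bad;
        return ch - '0';
bad:
        return -1;
/* Never the target of a goto. The attribute states that this is
 * intentional; GNU C requires the semicolon after a label attribute. */
spare: __attribute__((unused));
        return -2;
}
```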
2020-05-21 | bad-label: check for unused labels | Luc Van Oostenryck | 1 | -1/+0
Issue a warning if a label is defined but not used. Note: this should take into account the attribute 'unused'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | bad-goto: check declaration of label expressions | Luc Van Oostenryck | 2 | -2/+0
Issue an error when taking the address of an undeclared label and mark the function as improper for linearization since the resulting IR would be invalid. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | bad-goto: jumping inside a statement expression is an error | Luc Van Oostenryck | 6 | -6/+0
It's invalid to jump inside a statement expression. So, detect such jumps, issue an error message and mark the function as useless for linearization since the resulting IR would be invalid. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | bad-goto: catch labels with reserved names | Luc Van Oostenryck | 1 | -1/+0
If a reserved name is used as the destination of a goto, its associated label won't be valid and at linearization time no BB can be created for it, resulting in an invalid IR. So, catch such gotos at evaluation time and mark the function to not be linearized. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | bad-goto: reorganize testcases and add some more | Luc Van Oostenryck | 18 | -13/+332
Reorganize the testcases related to the 'scope' of labels and add a few new ones. Also, some related testcases have some unreported errors other than the features being tested. This is a problem since such testcases can still fail after the feature being tested is fixed or implemented. So, fix these testcases or split them so that they each test a unique feature. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | bad-goto: add testcases for linearization of invalid labels | Luc Van Oostenryck | 1 | -0/+19
A goto to a reserved or an undeclared label will generate an IR with a branch to a non-existing BB. Bad. Add a testcase for these. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | bad-goto: add testcase for 'jump inside discarded expression statement' | Luc Van Oostenryck | 2 | -0/+57
A goto into a piece of code discarded at expand or linearize time will produce an invalid IR. Add a testcase for it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-21 | misc: fix testcase typeof-safe | Luc Van Oostenryck | 1 | -7/+20
This testcase was marked as known-to-fail but it was simply the expected error messages that were missing. So, slightly reorganize the test, add the expected messages and remove the 'known-to-fail' tag. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-19 | testsuite: add a few testcases for nested functions | Luc Van Oostenryck | 1 | -0/+43
Sparse doesn't really support nested functions but is able to parse them correctly. Add some testcases with them so that it continues to catch possible errors concerning them. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-19 | attribute: 'externally_visible' is just another 'declaration' modifier | Luc Van Oostenryck | 2 | -2/+0
Now that the distinction is made between type modifiers and 'declaration' modifiers, there is no more reason to parse this attribute differently from other attributes/modifiers. Even more so because this special casing caused this attribute to be ignored when placed after the declarator. So, use the generic code for 'declaration modifiers' to parse this attribute. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-05-17 | attribute: sort the table of modifier names | Luc Van Oostenryck | 2 | -4/+4
It's easier to search for an item if the table is sorted, and this avoids needless conflicts when new items are always added at the end of the table. So, sort the table but keep the storage modifiers first so that show_typename() & friends still display types as usual. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-04-13 | Merge branch 'fix-atomic-type' | Luc Van Oostenryck | 2 | -22/+38
* fix type compatibility of _Atomic types
2020-03-24 | add support for GCC's __auto_type | Luc Van Oostenryck | 2 | -0/+100
Despite the similarity with typeof, the approach taken here is relatively different. A specific symbol type (SYM_TYPEOF) is not used, instead a new flag is added to decl_state, another one in the declared symbol and a new internal type is used: 'autotype_ctype'. It's this new internal type that will be resolved to the definitive type at evaluation time. It seems to be working pretty well, maybe because it hasn't been tested well enough. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
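For readers unfamiliar with the extension, a typical use of __auto_type is a type-generic macro that evaluates each argument once. This sketch assumes GCC or clang (it also uses a statement expression); MAX_OF and the wrapper functions are hypothetical names.

```c
/* Hypothetical macro: the type of each temporary is deduced from its
 * initializer, much like typeof(a) but without repeating the operand. */
#define MAX_OF(a, b) ({                 \
        __auto_type a_ = (a);           \
        __auto_type b_ = (b);           \
        a_ > b_ ? a_ : b_;              \
})

int    max_int(int x, int y)          { return MAX_OF(x, y); }
double max_double(double x, double y) { return MAX_OF(x, y); }
```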
2020-03-20 | teach sparse to linearize __builtin_unreachable() | Luc Van Oostenryck | 3 | -3/+0
__builtin_unreachable() is one of the builtins that shouldn't be ignored at IR level since it directly impacts the CFG. So, use the infrastructure put in place in the previous patch to generate the OP_UNREACH instruction instead of generating a call to a non-existing function "__builtin_unreachable()". Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
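The effect on the control-flow graph is easiest to see in a function whose last statement can never be reached. A minimal sketch, assuming a compiler that provides the builtin (GCC, clang); the function name is hypothetical.

```c
int parity_weight(unsigned int v)
{
        switch (v & 1u) {
        case 0: return 10;
        case 1: return 20;
        }
        /* (v & 1u) is always 0 or 1, so control never gets here. The
         * builtin ends this path in the CFG; it is not a call to a
         * real function. */
        __builtin_unreachable();
}
```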
2020-03-20 | add an implicit __builtin_unreachable() for __noreturn | Luc Van Oostenryck | 1 | -1/+0
The semantic of a __noreturn function is that ... it doesn't return. So, insert an instruction OP_UNREACH after calls to such functions. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-03-20 | add testcases for OP_UNREACH | Luc Van Oostenryck | 4 | -7/+74
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-03-16 | cpp: fix redefinition of a macro during its own expansion | Luc Van Oostenryck | 1 | -0/+20
The presence of preprocessor directives within the arguments of a macro invocation is Undefined Behaviour but most of these directives, like the conditionals, are well-defined and harmless. OTOH, the redefinition of a macro during its own expansion makes much less sense. However, it can be given a reasonable meaning: * use the initial definition for the macro body * use the new definition for its arguments, in text order. It's what gcc & clang do but Sparse can't handle this because, during the expansion, a reference to the initial macro's body is not kept. What is used instead is what is currently associated with the macro. Fix this by using the body associated with the macro at the time of its invocation. Testcase-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-03-15 | cpp: remove extra newlines during macro expansion | Luc Van Oostenryck | 3 | -9/+16
During macro expansion, Sparse doesn't strip newlines from the arguments as required by 6.10.3p10 and done by gcc & clang. So, remove these newlines. Note: the current behaviour may make the preprocessed output more readable (and so may be considered as a feature). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
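The rule can be made visible with stringification: a newline inside the argument list counts as ordinary white space, and white space between tokens collapses to a single space in the resulting string. The macro STR and the function name are hypothetical.

```c
#define STR(x) #x

const char *spliced(void)
{
        /* The argument spans two lines; the line break is treated as
         * plain white space between the two tokens. */
        return STR(alpha
                   beta);
}
```

Here spliced() returns the string "alpha beta".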
2020-03-15 | cpp: silently allow conditional directives within a macro | Luc Van Oostenryck | 2 | -1/+41
The presence of preprocessor directives within the arguments of a macro invocation is Undefined Behaviour [6.10.3p11]. However, conditional directives are harmless here and are useful (and commonly used in the kernel). So, relax the warning by restricting it to non-conditional directives. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
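The pattern being tolerated looks like the sketch below. Strictly speaking this is undefined behaviour per the standard, as the message says; the sketch assumes the behaviour of GCC and clang, which process the conditional as usual. SUM3, HYPOTHETICAL_FEATURE and the function name are made up for the example.

```c
#define SUM3(a, b, c) ((a) + (b) + (c))

int feature_sum(void)
{
        /* HYPOTHETICAL_FEATURE is not defined here, so the #else
         * branch supplies the middle argument. */
        return SUM3(1,
#ifdef HYPOTHETICAL_FEATURE
                    2,
#else
                    3,
#endif
                    4);
}
```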
2020-03-15 | make "directive in macro's argument list" a warning | Oleg Nesterov | 1 | -4/+4
The presence of preprocessor directives within the arguments of a macro invocation is Undefined Behaviour [6.10.3p11]. Sparse issues an error for this but most often the result is well defined and is not a problem, processing can continue (for example, when the directive is one of the conditional ones). So, downgrade this sparse_error() to warning() (especially because issuing an error message can hide those coming later). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-02-09 | do the tree inlining during expansion phase | Luc Van Oostenryck | 1 | -1/+0
Currently, the tree inlining is done very early, during the evaluation phase. This means that the inlining is done even if the corresponding call belongs to a sub-expression that will be discarded during the expansion phase. Usually this is not a problem but in some pathological cases it can lead to a huge waste of memory and CPU time. So, move this inline expansion to ... the expansion phase. Also, re-expand the resulting expression since constant arguments may create new opportunities for simplification. Note: the motivation for this is a pathological case in the kernel where a combination of max_t() + const_ilog2() + roundup_pow_of_two() + cpumask_weight() + __const_hweight*() caused Sparse to use 2.3Gb of memory. With this patch the memory consumption is down to 247Mb. Link: https://marc.info/?l=linux-sparse&m=158098958501220 Link: https://lore.kernel.org/netdev/CAHk-=whvS9x5NKtOqcUgJeTY7dfdAHc Reported-by: Randy Dunlap <rdunlap@infradead.org> Originally-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-02-09 | inline: add some tests | Luc Van Oostenryck | 4 | -0/+108
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2020-02-06 | fix type compatibility of _Atomic | Luc Van Oostenryck | 2 | -22/+38
When _Atomic was introduced, it was treated, for most purposes, like the other qualifiers. However, it's best to consider _Atomic as a qualifier only for syntactic reasons. In particular, an _Atomic type may have a different size and alignment than its corresponding unqualified type. Also, an _Atomic type is never compatible with its corresponding unqualified type, and thus, for type checking, this qualifier must never be ignored. Fix this by removing MOD_ATOMIC from MOD_QUALIFIER. This, essentially, has the effect of no longer ignoring MOD_ATOMIC when comparing types. Fixes: ffe9f9fef003d29b65d29b8da5416aff72baff5a Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
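The incompatibility can be checked with a generic selection on pointer types. This is a sketch with hypothetical names (IS_ATOMIC_INT_PTR and the two helpers), assuming a C11 compiler. Pointers are used because lvalue conversion would drop _Atomic from the object itself but not from the pointed-to type.

```c
/* Both associations may coexist only because '_Atomic int' and 'int'
 * are not compatible types. */
#define IS_ATOMIC_INT_PTR(p) _Generic((p), _Atomic int *: 1, int *: 0)

int check_atomic(void)
{
        _Atomic int a = 0;
        return IS_ATOMIC_INT_PTR(&a);
}

int check_plain(void)
{
        int i = 0;
        return IS_ATOMIC_INT_PTR(&i);
}
```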
2019-12-17 | Merge branch 'msg-wrong-redecl' into next | Luc Van Oostenryck | 4 | -15/+60
* improve diagnostic message about wrong redeclaration
2019-12-17 | Merge branch 'expand-init' (early part) into next | Luc Van Oostenryck | 15 | -7/+256
* improve expansion of constant symbols
2019-12-17 | Merge branch 'top-level-init' into next | Luc Van Oostenryck | 1 | -2/+8
* fix testcase with non-constant initializer
2019-12-17 | fix testcase with non-constant initializer | Luc Van Oostenryck | 1 | -2/+8
These 2 top-level declarations had a non-constant initializer. Fix that by moving them into a function. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-15 | improve diagnostic message about wrong redeclaration | Luc Van Oostenryck | 4 | -15/+60
The current message is very long (in most cases the position of the previous declaration is past the 80th column) and, while saying that the types differ, doesn't show these types. Change this by splitting the message in 2 parts: - first, on the current position, the main message and the type of the current declaration. - then the type of the previous declaration on its own position. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-15 | testcase: remove trailing ';' in commands | Luc Van Oostenryck | 2 | -2/+2
Two testcases had their command wrongly terminated by ';'. Fix this by removing this ';'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | fix cost of dereference of symbols with complex type | Luc Van Oostenryck | 1 | -1/+0
Currently, in expand_dereference(), the dereference of a symbol with a complex type is considered as costly as that of a non-symbol because it's not recognised as a symbol. However, both cases should have exactly the same cost since their address calculation amounts to 'symbol + offset'. So, instead of taking into account a single level of symbol + offset, use a loop in order to handle symbol [+ offset]* Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
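The 'symbol [+ offset]*' shape can be checked numerically: the address of a nested member is the address of the symbol plus a chain of constant offsets. This sketch only illustrates that arithmetic, not Sparse's cost model; all type and function names are hypothetical.

```c
#include <stddef.h>

struct inner { int pad; int v[4]; };            /* hypothetical types */
struct outer { char tag; struct inner in; };

static struct outer obj;

/* Byte distance from the start of 'obj' to obj.in.v[2]. */
size_t nested_offset(void)
{
        return (size_t)((char *)&obj.in.v[2] - (char *)&obj);
}

/* The same distance written as a sum of per-level constant offsets. */
size_t summed_offset(void)
{
        return offsetof(struct outer, in) +
               offsetof(struct inner, v) +
               2 * sizeof(int);
}
```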
2019-12-10 | fix simplify_loads() when doing type punning | Luc Van Oostenryck | 2 | -2/+0
When doing loads simplification for a location where floats & integers are mixed, loads are systematically replaced with the value of their dominating memop (this checks if the corresponding write or load overlaps). However, this must not be done if the involved operations are doing some form of integer/float type punning. Fix this by refusing to replace the load of an integer with a previous float value, or the opposite. Note: another way to describe this problem would be to say that floats need to have their own memory operations: OP_FSTORE & OP_FLOAD or that instructions need to have some form of 'machine type' in addition to the size (like clang's i32/f32, ...). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
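A typical instance of such punning is reading the bit pattern of a float through a union. The sketch below is not the testcase from the commit; the names are hypothetical and the expected bit pattern assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>

union pun { float f; uint32_t u; };     /* hypothetical type */

uint32_t float_bits(float value)
{
        union pun p;

        p.f = value;    /* store as a float ... */
        return p.u;     /* ... load as an integer: the load must not be
                         * replaced by the float value stored above */
}
```

Under the IEEE 754 assumption, float_bits(1.0f) yields 0x3f800000.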
2019-12-10 | fix expansion of initializer (default) | Luc Van Oostenryck | 1 | -1/+0
Currently, constant_symbol_value() is doing the expansion of a constant initializer when an explicit one is found but nothing is done if the initializer is an implicit one. Fix this by: * adding a helper to look up the corresponding type from the offset; * using this helper to get the correct kind for the value: - a 0-valued EXPR_VALUE for integers - a 0.0-valued EXPR_FVALUE for floats. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
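In C terms, the implicit case is a member that has no initializer of its own and therefore takes the default zero value of its type. A small sketch with hypothetical names:

```c
struct cfg { int level; double scale; int extra; };     /* hypothetical */

static const struct cfg defaults = { .level = 3 };

int    cfg_level(void) { return defaults.level; }  /* explicit: 3 */
double cfg_scale(void) { return defaults.scale; }  /* implicit: 0.0 */
int    cfg_extra(void) { return defaults.extra; }  /* implicit: 0 */
```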
2019-12-10 | fix expansion of initializer (mismatching type) | Luc Van Oostenryck | 2 | -2/+0
Currently, the expansion of constant initializers is done whenever the offset in the initializer matches the one being expanded. However, it's not correct to do this expansion of an integer with the initializer for a float and vice-versa. Fix this by adding the corresponding tests to the other tests of the value. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | fix expansion of initializer (mismatching size) | Luc Van Oostenryck | 1 | -1/+0
Currently, the expansion of constant initializers is done whenever the offset in the initializer matches the one we're expanding. However, it's not correct to do this expansion if their sizes don't match since in this case the value of one doesn't represent the value of the other. Fix this by adding a check for the size. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | degenerated arrays & functions are addressable too | Luc Van Oostenryck | 2 | -1/+15
Symbols which have their address taken (with the 'addressof' operator: &) are marked as such (with the modifier MOD_ADDRESSABLE). But degenerated arrays and functions have their address implicitly taken. MOD_ADDRESSABLE is used to prevent replacing a symbol dereference with the value used to initialize it. For example, in code like:

    static int foo(void)
    {
        int x[2] = { 1, 2 };
        return x[1];
    }

the return expression can be replaced by 2. This is not the case if the array is first passed in a function call, like here:

    extern void def(void *, unsigned int);
    static int bar(void)
    {
        int x[2] = { 1, 2 };
        def(x, sizeof(x));
        return x[1];
    }

Fix this by marking degenerated arrays (and functions) as also being addressable. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
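The two functions from the message can be made runnable by supplying a body for def(). The body below is an assumption made for the example (the message only declares the function); it overwrites the second element so that the difference is observable.

```c
/* Assumed body for def(): overwrites the second int of the buffer. */
void def(void *p, unsigned int size)
{
        int *a = p;

        if (size >= 2 * sizeof(int))
                a[1] = 42;
}

int foo(void)
{
        int x[2] = { 1, 2 };
        return x[1];            /* nothing can alias x: always 2 */
}

int bar(void)
{
        int x[2] = { 1, 2 };
        def(x, sizeof(x));      /* x decays to &x[0]: its address escapes */
        return x[1];            /* must be reloaded, not folded to 2 */
}
```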
2019-12-10 | fix addressability marking in evaluate_addressof() | Luc Van Oostenryck | 1 | -1/+0
mark_addressable() is used to track if a symbol has its address taken but does not take into account the fact that a symbol can be accessed via one of its subfields. A failure occurs in cases like:

    struct { int a; } s = { 3 };
    ...
    def(&s.a);
    return s.a;

where 's' is not marked as being addressable and so the initializer will be expanded and the return expression will always be replaced by 3, while def() can redefine it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
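A runnable variant of the failing pattern, with a hypothetical callee standing in for def():

```c
struct box { int a; };                  /* hypothetical type */

void set_to_seven(int *p) { *p = 7; }   /* hypothetical callee */

int read_after_escape(void)
{
        struct box s = { 3 };

        set_to_seven(&s.a);     /* address of a member, hence of s */
        return s.a;             /* 7, not the initializer value 3 */
}
```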
2019-12-10 | add test for constant expansion of complex initializer | Luc Van Oostenryck | 3 | -0/+53
Constant expansion of symbols with a complex type is not done like for simpler ones. Only the first-level EXPR_INITIALIZER is handled. Add some testcases for this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | add test for dereference cost of symbol with complex type | Luc Van Oostenryck | 1 | -0/+21
Currently, in expand_dereference(), the dereference of a symbol with a complex type is considered as costly as that of a non-symbol because it's not recognised as a symbol. Add a testcase for this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | add test for union cast | Luc Van Oostenryck | 1 | -0/+27
Sparse can't do this yet. So, add a testcase for it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | add testcase for addressability of 'complex' symbols | Luc Van Oostenryck | 1 | -0/+24
Once a symbol has its address taken, a lot of simplifications must be avoided because the symbol can now be modified via a pointer. This is currently done but the symbol addressability does not take into account the fact that a symbol can be accessed via one of its subfields. Add a testcase to illustrate this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | add testcase for addressability of degenerated symbol | Luc Van Oostenryck | 1 | -0/+18
An array or a function that degenerates into a pointer has its address implicitly taken since the result is equivalent to '&array[0]' or '&fun'. So, the corresponding symbol needs to be marked as addressable, like when its address is explicitly taken. Add a testcase to illustrate this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | add testcase for expansion of default initializers | Luc Van Oostenryck | 2 | -0/+39
Currently, constant_symbol_value() is doing the expansion of a constant initializer when an explicit one is found but nothing is done for the default/implicit ones. Add a testcase to illustrate this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-10 | split testcases for type punning & constant initializer expansion | Luc Van Oostenryck | 5 | -5/+66
Several issues were covered by the same testcase. Fix this by splitting the testcases. Also, rename these testcases to a more descriptive name. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-09 | Merge branch 'premature-examine' into next | Luc Van Oostenryck | 1 | -0/+27
* fix premature examination of dereferenced object
2019-12-09 | fix premature examination of dereferenced object | Luc Van Oostenryck | 1 | -0/+27
In the fix 696b243a5ae0 ("fix: evaluate_dereference() unexamined base type"), the pointer's examination was done prematurely, before the non-dereferenceable types are filtered out. This allowed the base abstract types to be examined when the expression was in fact not dereferenceable. Fix that by moving the examination to the top of the SYM_PTR case since only pointers are concerned. Fixes: 696b243a5ae0 ("fix: evaluate_dereference() unexamined base type") Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-12-09 | Merge branch 'bitfield-size' | Luc Van Oostenryck | 1 | -0/+30
* improve diagnostic messages concerning bitfields
2019-11-30 | bitfield: display the bitfield name in error messages | Luc Van Oostenryck | 1 | -5/+5
Diagnostics related to a bitfield and issued after parsing didn't display the bitfield name because it was not available. Now that the name is available, use it in error messages since it helps to find the origin of the problem. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-30 | bitfield: oversized bitfields are errors | Luc Van Oostenryck | 1 | -1/+0
Till now, a bitfield with a width bigger than its base type only caused a warning but this should be considered as an error since it's generally impossible to emit correct IR code for it. Fix this by issuing an error instead and marking the width as invalid. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-30 | bitfield: add testcases for invalid bitfield width | Luc Van Oostenryck | 1 | -0/+31
Add some testcases before making related changes. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-28 | testsuite: avoid standard includes in the tests | Luc Van Oostenryck | 2 | -3/+2
These headers are often complex and full of implementation specificities. They have no place in the testsuite. So, remove these includes and replace them by the prototype of the function being used. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-28 | Merge branch 'arch-cleanup' into master | Luc Van Oostenryck | 1 | -0/+2
2019-11-28 | arch: add predefines for INT128 only on supported archs | Luc Van Oostenryck | 1 | -0/+2
The predefines for INT128 were added unconditionally for all archs but only the 64-bit ones support them. Fix this by issuing the predefines only on 64-bit archs. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-27 | Merge branch 'arm-hf' into master | Luc Van Oostenryck | 5 | -0/+40
2019-11-27 | fp-abi: teach sparse about -m{hard,soft}-float | Luc Van Oostenryck | 1 | -1/+0
Teach Sparse about these options. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-27 | fp-abi: teach sparse about -mfloat-abi on ARM | Luc Van Oostenryck | 4 | -4/+0
Teach sparse about the -mfloat-abi option and set the related predefines for ARM accordingly. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-27 | fp-abi: add tests for ARM's -mfloat-abi=... & -msoft-float | Luc Van Oostenryck | 5 | -0/+45
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-26 | Merge branch 'arch-cygwin' into master | Luc Van Oostenryck | 4 | -1/+27
2019-11-26 | Merge branch 'static-forward' into master | Luc Van Oostenryck | 1 | -9/+10
2019-11-21 | allow 'static' forward declaration | Luc Van Oostenryck | 1 | -9/+10
A function or an object can be forward-declared as 'static' and then defined with the keyword 'static' omitted. This is perfectly legal and relatively common. However, Sparse complains that the definition is not declared and asks the developer whether it should be static. This is weird because the function or object *is* declared and *is* static (or at least should be following the standard or GCC's rules). Fix this by letting a new declaration or definition 'inherit' the 'static-ness' of the previous declarations. This is a bit more complicated than simply copying MOD_STATIC and must be done when binding the new symbol because static or extern objects have different scopes. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
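The function case can be sketched as follows. The names are hypothetical. Only a function is shown: a function definition without a storage-class specifier behaves as if declared 'extern' and so takes the linkage of the earlier visible declaration (C11 6.2.2).

```c
static int next_id(void);       /* forward declaration: internal linkage */

int use_id(void)
{
        return next_id();
}

/* 'static' is omitted here, yet the function keeps the internal
 * linkage of the earlier declaration. */
int next_id(void)
{
        static int id;

        return ++id;
}
```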
2019-11-21 | let function definition inherit prototype attributes | Luc Van Oostenryck | 2 | -5/+1
It's common to declare a function with the attribute 'pure' or 'noreturn' and to omit the attribute in the function definition. It makes some sense since the information conveyed by these attributes is destined for the function's users, not the function itself. So, when checking declaration/definition, let the current symbol inherit any function attributes present in previous declarations. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
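The coding pattern looks like the sketch below, using the GNU attribute syntax (so GCC or clang is assumed). The function name is hypothetical.

```c
/* Prototype, as it might appear in a header: carries the attribute. */
__attribute__((pure)) int clamp_percent(int v);

/* Definition: the attribute is omitted, as is common. It is still
 * meant to apply, inherited from the prototype above. */
int clamp_percent(int v)
{
        return v < 0 ? 0 : (v > 100 ? 100 : v);
}
```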
2019-11-20 | propagate function modifiers only to functions | Luc Van Oostenryck | 2 | -2/+0
Function attributes need to be parsed differently than the usual specifiers. For example, in code like:

    #define __noreturn __attribute__((noreturn))
    __noreturn void foo(int a);

the __noreturn attribute should apply to the function type while a specifier like 'const' would apply to its return type. The situation is quite similar to how storage specifiers must not be handled by alloc_indirect_symbol(). However, the solution used for storage specifiers (apply the modifier bits only after the declarator is reached: cf. commit 233d4e17c ("function attributes apply to the function declaration")) can't be used here (because the storage modifiers can be applied to the outermost declarator and function attributes may be applied more deeply if function pointers are present).

Fix this by:
1) reverting the previous storage-specifier-like solution;
2) collecting function specifier MODs in a new separate field in the declaration context (f_modifiers);
3) applying these modifiers when the declarator for the function type is reached (note: they must not be applied to the SYM_FN itself since this corresponds to the function's return type; they must be applied to the parent node, which can be a SYM_NODE or a SYM_PTR);
4) also applying these modifiers to the declared symbol, if this symbol is a function declaration, to take into account attributes which are placed at the end of the declaration and not in front.

Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Fixes: 233d4e17c544e1de252aed8f409630599104dbc7 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-19 | add tests for function attributes | Luc Van Oostenryck | 9 | -10/+134
Function attributes need to be parsed differently than the usual specifiers. For example, in code like:

    #define __noreturn __attribute__((noreturn))
    __noreturn void foo(int a);

the __noreturn attribute should apply to the function type, while a specifier like 'const' would apply to its return type. It's even more clear when function pointers are involved:

    __noreturn void (*fptr)(void);

here too, the attribute should be applied to the function type, not to its return type, nor to the declared pointer type. Add some testcases to cover some of the situations concerning the parsing of these function pointers. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
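The function-pointer case can be exercised with a noreturn function that leaves via longjmp(), so that the example stays runnable. This is only a sketch under the GNU attribute syntax (GCC or clang assumed). NORETURN stands in for the __noreturn macro of the message, and the other names are hypothetical.

```c
#include <setjmp.h>

#define NORETURN __attribute__((noreturn))

static jmp_buf recover;
static int last_code;

NORETURN void bail(int code)
{
        last_code = code;
        longjmp(recover, 1);    /* never returns to the caller */
}

/* The attribute qualifies the pointed-to function type, not 'void'
 * and not the pointer object itself. */
NORETURN void (*on_error)(int) = bail;

int run_and_recover(int code)
{
        if (setjmp(recover) != 0)
                return last_code;
        on_error(code);
        /* not reached: the call through the pointer does not return */
}
```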
2019-11-15 | arch: teach sparse about -fshort-wchar | Luc Van Oostenryck | 1 | -0/+6
This is useful in cgcc for supporting Cygwin which doesn't use a 32-bit type for wchar_t. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-15 | function attributes apply to the function declaration | Luc Van Oostenryck | 1 | -0/+19
Function attributes relate to the function declaration they appear in. Sparse ignores most of these attributes but a few have a semantic value: 'pure', 'noreturn' & 'externally_visible'. Due to how Sparse parses attributes and how these attributes are stored for functions, the attributes 'pure' & 'noreturn' are applied not to the function itself but to its return type if the function returns a pointer. Fix this by extracting these attributes from the declaration context and ensuring they're applied to the declarator. Reported-by: John Levon <john.levon@joyent.com> Reported-by: Alex Kogan <alex.kogan@oracle.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-14 | arch: fix the signedness of plain chars | Luc Van Oostenryck | 3 | -1/+21
Some architectures, like ARM or PPC, use 'unsigned' for plain chars while others, like Intel's, use signed ones. Sparse understands -funsigned-char but by default uses the native signedness. Fix this by setting the proper signedness of plain chars for the archs that Sparse knows about. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
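The signedness chosen for a given target can be probed from C itself. A small sketch with hypothetical function names; the two functions should always agree.

```c
#include <limits.h>

/* 1 if plain char is signed for this target, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
        return (char)-1 < 0;
}

/* The same information as reported by <limits.h>. */
int limits_say_signed(void)
{
        return CHAR_MIN < 0;
}
```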
2019-11-10Merge branch 'eval-typeof' into nextLuc Van Oostenryck1-0/+10
* clarify lazy evaluation & conversion of SYM_TYPEOF
2019-11-10typeof: examine it at show-timeLuc Van Oostenryck1-1/+0
Unless an explicit call to examine_pointer_target() or get_base_type() is made, the base type of pointers is *not* examined via the usual recursive examine_symbol_type(). That means that it is possible to call show_typename() on a non-fully examined type, which is wrong (for example, because SYM_TYPEOFs may not be converted). So, call examine_pointer_target() on pointers when trying to display them. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-11-10typeof: add a test for unexamined typeofLuc Van Oostenryck1-0/+11
The base type of pointers is not examined when the pointer is. It needs to be done later when looked at. This may be a problem when show_typename() is used on a pointer which has not yet been 'deep-examined' and, for example, has a SYM_TYPEOF as its base type. Add a test case showing the problem. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-10-30arch: add an option to specify the desired arch: --arch=<arch>Luc Van Oostenryck4-0/+106
Sparse is universal in the sense that the same executable can be used for all architectures. For this, most arch-specific settings can be set with an option and the default values are taken from the host machine. This works nicely for native targets. However, for cross-compilation, while seeming to work relatively well (thanks to the kernel build system using -m32/-m64 for all archs, for example), things can never work 100% correctly. For example, when an x86-64 host machine is used for an ARM target, the kernel build system will call sparse with -m32, Sparse will 'autodetect' the target arch as i386 (x86-64 + -m32) and will then predefine the macro __i386__. Most of the time this is not a problem (at least for the kernel) unless, of course, the code contains something like: #ifdef __i386__ ... #elif __arm__ ... So, add an option --arch=<arch> to specify the target architecture. The native arch is still used if no such flag is given. Reported-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
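The kind of code affected by a wrongly predefined arch macro can be sketched as below. The function name and the returned strings are made up for illustration; only the predefined macros themselves are standard for GCC-like compilers.

```c
/* Returns a name for the architecture selected by the predefined macros.
 * If the checker predefines __i386__ while the real target is ARM,
 * the wrong branch is analysed. */
const char *target_arch_name(void)
{
#if defined(__i386__)
	return "i386";
#elif defined(__x86_64__)
	return "x86-64";
#elif defined(__arm__)
	return "arm";
#elif defined(__aarch64__)
	return "arm64";
#else
	return "unknown";
#endif
}
```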
2019-10-09"graph" segfaults on top-level asmLuc Van Oostenryck1-0/+1
The "graph" binary segfaults on this input: asm(""); with gdb saying (edited for clarity): Program received signal SIGSEGV, Segmentation fault. in graph_ep (ep=0x7ffff7f62010) at graph.c:52 (gdb) p ep->entry $1 = (struct instruction *) 0x0 Sadly, the commit that introduced this crash: 15fa4d60e ("topasm: top-level asm is special") was (part of a bigger series) meant to fix crashes because of such toplevel asm statements. Toplevel ASM statements are quite abnormal: * they are toplevel but anonymous symbols * they should be limited to basic ASM syntax but are not * they are given the type SYM_FN but are not functions * there is nothing to evaluate or expand about it. These cause quite a few problems including crashes, even before the above commit. So, before handling them more correctly and instead of adding a bunch of special cases here and there, temporarily take the more radical approach of stopping to add them to the list of toplevel symbols. Fixes: 15fa4d60ebba3025495bb34f0718764336d3dfe0 Reported-by: Vegard Nossum <vegard.nossum@gmail.com> Analyzed-by: Vegard Nossum <vegard.nossum@gmail.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-10-01make 'directive in argument list' clearerLuc Van Oostenryck1-4/+4
The warning 'directive in argument list' is about macros' arguments, not functions' ones. Make this clearer in the warning message. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-30Merge branch 'fix-expand-asm' into tipLuc Van Oostenryck14-33/+228
Currently, ASM operands aren't expanded or even evaluated. This causes Sparse to emit warnings about 'unknown expression' during the linearization of these operands if they contain, for example, calls to __builtin_types_compatible_p(). Note: the correct handling of ASM operands needs to make the distinction between 'memory' operands and 'normal' operands. For this, it is needed to look at the constraints and these are architecture specific. The patches in this series only consider the constraints m, v, o & Q as being for memory operands and, happily, these seem to cover most usage for the most common architectures. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-30Merge branch 'relax-constexpr' into tipLuc Van Oostenryck2-4/+10
2019-09-30Merge branch 'fix-bad-linear' into tipLuc Van Oostenryck2-0/+36
Expressions without a valid type should never be linearized since they have no (valid) type and haven't been expanded. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-30fix sign extension in casting enumsDan Carpenter1-1/+0
The function cast_value() needs the exact type of the old expression but when called via cast_enum_list() this type is incorrect because: - the same struct is used for the new and the old expression - the type of the new expression is adjusted before cast_value() is called. Fix this by adjusting the type of the new expression only after cast_value() has been called. Fixes: 604a148a73af ("enum: fix cast_enum_list()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-30add test for enum sign extensionLuc Van Oostenryck1-0/+13
In a declaration like: enum { a = 0x80000000, b = -1, } the underlying type should be long and b's value should be 0xffffffffffffffff (on a 64-bit machine) but is 0xffffffff. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
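The expected values can be checked with a short sketch, assuming GCC or Clang (which accept enumerator values outside the range of int as an extension before C23). The enum tag and the helper names are made up for illustration.

```c
enum big {
	a = 0x80000000,
	b = -1,
};

/* b must be sign-extended to the full width of the underlying type. */
unsigned long long b_bits(void)
{
	return (unsigned long long)b;
}

/* a must keep its positive value, which does not fit in a 32-bit int. */
long long a_value(void)
{
	return (long long)a;
}
```

Here b_bits() is expected to yield 0xffffffffffffffff, not 0xffffffff, and a_value() is expected to yield 2147483648.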
2019-09-30do not linearize invalid expressionLuc Van Oostenryck1-1/+0
Code like: int *r; r = ({ __builtin_types_compatible_p(long, long); }); triggers the following diagnostics: warning: incorrect type in assignment (different base types) expected int *r got long warning: unknown expression (4 0) warning: unknown expression (4 0) The first warning is expected but the other two are bogus. The origin of the problem could be considered as being how type incompatibilities are handled in assignment: If an incompatibility is found by compatible_assignment_types() - a warning is issued (not an error), - the source expression is cast to the destination type, - the returned value indicates a problem was detected. In the other uses of this function the returned value is simply ignored and normal processing continues. This seems logical since only a warning is issued and so (thanks to the cast) the resulting expression is at least type-coherent. However, in evaluate_assignment() the returned value is not ignored and the calling function directly returns. This leaves the resulting expression without a valid type, as if an error occurred, unable to be correctly processed further. However, the real problem is that an expression without a valid type should never be linearized. So, in linearize_expression(), refuse to linearize an expression without a valid type. Note: if one is interested in doing a maximum of processing, including expansion and linearization, check_assignment_types() should be modified to distinguish between recoverable and non-recoverable type errors (those for which the forced cast makes sense and those for which it doesn't) and compatible_assignment_types() modified accordingly (maybe issuing a warning in the first case and an error otherwise). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
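For reference, the builtin used in the example above is a GCC/Clang extension which is folded to an integer constant at compile time, which is why it must be expanded before linearization. A valid use can be sketched as follows; the wrapper names are made up for illustration.

```c
/* Evaluates to 1: both types are the same. */
int long_vs_long(void)
{
	return __builtin_types_compatible_p(long, long);
}

/* Evaluates to 0: 'int' and 'long' are distinct types, even when they have the same size. */
int int_vs_long(void)
{
	return __builtin_types_compatible_p(int, long);
}
```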
2019-09-28asm: arrays & functions in non-memory operand degenerate into pointersLuc Van Oostenryck1-1/+0
Non-memory asm operands are very much like function arguments. As such, any array (or function designator) used as an asm operand needs to degenerate into the corresponding pointer. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-27asm: fix missing expansion of asm statementsLuc Van Oostenryck1-1/+0
The operands of extended ASM need to be expanded, exactly like any other expression. For example, without this expansion, expressions with __builtin_types_compatible_p() can't be linearized and will issue a 'warning: unknown expression'. So, add the missing expansion of ASM operands. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-27asm: linearization of output memory operands is differentLuc Van Oostenryck1-1/+0
ASM memory operands are considered by GCC as some kind of implicit reference. Their linearization should thus not create any storage statement: the storage is done by the ASM code itself. Adjust the linearization of such operands accordingly. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-27asm: missing evaluation of asm statementsLuc Van Oostenryck1-1/+0
The operands of extended ASM need to have their type evaluated, exactly like any other expression. So, add the missing evaluation of ASM operands. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-27asm: check earlier that body & constraints are stringsLuc Van Oostenryck1-3/+3
The syntax of extended ASM statements requires that the bodies & constraints are given via a literal string. However, at parsing time more general expressions are accepted and it's checked only at evaluation time if these are effectively string literals. This has at least two drawbacks: *) evaluate_asm_statement() is slightly more complicated than needed, mixing these checks with the real evaluation code *) in case of error, the diagnostic is issued later than other syntactic warnings. Fix this by checking at parse-time that ASM bodies & constraints are string literals and not some arbitrary expressions. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-27asm: add test evaluation, expansion & linearization of ASM operandsLuc Van Oostenryck5-0/+174
ASM statements are quite complex. Add some tests to catch some potential errors. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-26string: use string_expression() in parse_static_assert()Luc Van Oostenryck1-3/+3
The error handling during the parsing of _Static_assert()'s message string is relatively complex. Simplify this by using the new helper string_expression(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-26expand: add missing expansion of compound literalsLuc Van Oostenryck2-2/+0
Compound literals, like all other expressions, need to be expanded before linearization, but this is currently not done. As a consequence, some builtins are unexpectedly still present, same for EXPR_TYPEs, ... with error messages like: warning: unknown expression at linearization. Fix this by adding the missing expansion of compound literals. Note: as explained in the code itself, it's not totally clear how compound literals can be identified after evaluation. The code here considers all anonymous symbols with an initializer as being a compound literal. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-26expand: add test for expansion of compound literalsLuc Van Oostenryck1-0/+27
Compound literals are currently not expanded. Add a test for this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-26shorter message for non-scalar in conditionalsLuc Van Oostenryck2-9/+9
The diagnostic message is a bit long with the non-really-informative part 'incorrect type' first and the explanation later in parentheses. Change this by using a shorter message "non-scalar type in ...". Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-26more consistent type info in error messagesLuc Van Oostenryck6-29/+29
Some error messages are displayed with auxiliary information about the concerned type(s). However, this type information is displayed in various ways: just the type, "[left/right] side has type ...", "got ...", ... Make these more consistent and simpler by just displaying types when the error message is unambiguous about the fact that the problem is a type problem (and/or make the message unambiguous when possible). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-02constexpr: relax constexprness of constant conditionalsLuc Van Oostenryck2-4/+10
Currently, sparse emits a warning when a conditional expression with a constant condition is used where an "Integer Constant Expression" is expected and only the false-side operand (which is not evaluated) is not constant. The standard is especially unclear about the situation when the unevaluated side is non-constant. However, GCC silently accepts those as ICEs when they evaluate to a compile-time known value (in other words, when the conditional and the corresponding true/false sub-expression are themselves constant). So, relax sparse to match GCC's behaviour. Reported-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-04-01fix allowing casts of AS pointers to uintptr_tLuc Van Oostenryck4-15/+57
The patch b3daa62b5 ("also accept casts of AS pointers to uintptr_t") is bogus and allows uintptr_t as the *source type* instead of the *target type*. This was helped by a previous bug, in patch d96da358c ("stricter warning for explicit cast to ulong"), where a test for Wcast_from_as was wrongly added for the source type. Fix this by: * adding the test for uintptr_t to the target type; * removing the test for Wcast_from_as from the source type, replacing it by a test of Wcast_to_as; * clarifying and extending the testcases. So, now, casts from uintptr_t to AS pointers are also allowed. Fixes: b3daa62b53109dba78c7937b3a6a0cd7d67865d5 Fixes: d96da358cfa0432f067a4e66940765883b80ee62 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-30also accept casts of AS pointers to uintptr_tLuc Van Oostenryck1-0/+60
Sparse will warn on casts removing the address space of a pointer if the destination type is not unsigned long. But the type 'uintptr_t' should be more suited for this. So, also accept casts of address-space qualified pointers to uintptr_t. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
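The kind of cast concerned can be sketched as below. The __user definition follows the usual kernel convention (address space 1, only active under sparse via __CHECKER__) and the function name is made up for illustration; neither is taken from the testcase itself.

```c
#include <stdint.h>

#ifdef __CHECKER__
# define __user __attribute__((noderef, address_space(1)))
#else
# define __user
#endif

/* The cast drops the address space: accepted when the destination is uintptr_t. */
uintptr_t user_ptr_to_int(const void __user *ptr)
{
	return (uintptr_t)ptr;
}
```

With a regular compiler the macro expands to nothing, so the same file builds unchanged with GCC or Clang.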
2019-03-05add test for evaluation of invalid assignmentsLuc Van Oostenryck2-0/+37
Due to the way compatible_assignment_types() handles type incompatibilities and how expressions with an invalid type are nevertheless processed by linearize_expression(), some invalid assignments return unwanted error messages (and working around them can create some others). Here are 2 relatively simple tests triggering the situation. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-03expand: 'conservative' must not bypass valid simplificationsThomas Weißschuh2-0/+76
During the expansion of shifts, the variable 'conservative' is used to inhibit any possible diagnostics (for example, because the needed information is if the expression is a constant or not). However, this must not inhibit the simplification of valid shift expressions. Unfortunately, by moving the validation inside check_shift_count(), this is what was done by commit 0b73dee01 ("big-shift: move the check into check_shift_count()"). Found through a false positive VLA detected in the Linux kernel. The array size was computed through min() on a shifted constant value and sparse complained about it. Fix this by changing the logic of check_shift_count(): 1) moving the test of 'conservative' inside check_shift_count() and only issuing warnings if set. 2) moving the warning part in a separate function: warn_shift_count() 3) let check_shift_count() return if the shift count is valid so that the simplification can be skipped if not. Fixes: 0b73dee0171a15800d0a4ae6225b602bf8961599 Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-28display extra info for type errors in compare & conditionalLuc Van Oostenryck1-2/+6
For "incompatible types in comparison expression" errors, only the kind of type difference is displayed. Displaying the types would make it easier to find the cause of the problem. The same is true for ternary conditionals. So, also display the left & right types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-18testsuite: fix bad escaping of '[' & ']'Luc Van Oostenryck2-2/+2
Fix escaping of square brackets in some test patterns. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-17Merge branch 'branch-v0.6'Luc Van Oostenryck2-9/+9
* explain cause of 'incorrect type in conditional' * manpage: fix doc of '-Wcast-from-as'
2019-02-07redecl: add test for attribute placement in function declaratorsRamsay Jones1-0/+31
Add a new test file which demonstrates some problems which can be seen on the git codebase. gcc does not complain about this file: $ gcc -Wall -c validation/function-redecl2.c $ ... but sparse does: $ sparse validation/function-redecl2.c validation/function-redecl2.c:6:5: error: symbol 'func0' redeclared with different type (originally declared at validation/function-redecl2.c:3) - different modifiers validation/function-redecl2.c:13:6: error: symbol 'func1' redeclared with different type (originally declared at validation/function-redecl2.c:11) - different modifiers validation/function-redecl2.c:21:6: error: symbol 'func2' redeclared with different type (originally declared at validation/function-redecl2.c:18) - different modifiers $ Note that func0 and func2 are essentially the same example, apart from the attribute used, to demonstrate that the issue isn't caused by the 'pure' attribute. Also, examples like func1 have occurred several times in git and, although they can be worked around (eg. See [1]), it would be preferable if this were not necessary. [1] (git) commit 3d7dd2d3b6 ("usage: add NORETURN to BUG() function definitions", 2017-05-21). Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-07validation: Add patterns FAIL, PASS, XPASS and XFAIL to testUwe Kleine-König1-6/+9
This simplifies finding the offending test when the build ended with KO: out of 584 tests, 527 passed, 57 failed 56 of them are known to fail Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-04target.c: ignore -m64 on archs where int32_t is a longLuc Van Oostenryck18-0/+19
If the flag '-m64' is used on a 32-bit architecture/machine having int32_t set to 'long', then these int32_t are forced to 64-bit ... So, ignore the effect of -m64 on these archs and ignore '64-bit only' tests on them. Reported-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Uwe Kleine-König <uwe@kleine-koenig.org>
2019-02-04testsuite: remove unneeded -m64 from command-lineLuc Van Oostenryck1-1/+1
The test was called with the flag '-m64' but doesn't need it. So, remove it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Uwe Kleine-König <uwe@kleine-koenig.org>
2018-12-29explain cause of 'incorrect type in conditional'Luc Van Oostenryck2-9/+9
A conditional only makes sense on a scalar type. If not, an error is issued but the message doesn't explain the cause. Fix this by adding the cause to the error message. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-20Merge branch 'show-type'Luc Van Oostenryck12-61/+61
* small improvements to show_typename()'s output: * strip trailing space * don't display '<noident>' * do not display base type's redundant specifiers * do not let string_ctype display like a base type 'string'
2018-12-19Merge branch 'bitwise-ptr'Luc Van Oostenryck2-0/+39
* warn on casts to/from bitwise pointers
2018-12-17show-parse: do not display base type's redundant specifiersLuc Van Oostenryck5-37/+37
In do_show_type(), builtin_typename() is used to display builtin (base) types and modifier_string() is used to display modifiers. However, most base types contain some intrinsic modifiers, the type specifiers. So, a type like 'unsigned long' is displayed as 'unsigned long [unsigned] [long]'. Fix this redundancy by not displaying the specifiers when displaying a base_type (or an enum). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17show-parse: don't display null ident in show_typename()Luc Van Oostenryck9-25/+25
Often show_typename() is used to display a type and the associated identifier is irrelevant but is displayed nevertheless. However, when the identifier is itself not present, it is still displayed as '<noident>', which is just noise and can be confusing. Fix this by displaying nothing for null identifiers in show_typename(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17add a flag to warn on casts to/from bitwise pointersLuc Van Oostenryck1-2/+1
Support for 'bitwise' integers is one of sparse's main extensions. However, casts to or from pointers to bitwise types can be done without incurring any sort of warnings although such casts can be as wrong as direct casts to or from bitwise integers themselves. Add the corresponding warnings and control them by a new flag -Wbitwise-pointer (defaulting to off as it creates tens of thousands of warnings in the kernel). CC: Thiebaud Weksteen <tweek@google.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17Add testcases for bitwise cast on pointerThiebaud Weksteen2-0/+40
since it seems that the strict type checking is not done on pointers to restricted types. Signed-off-by: Thiebaud Weksteen <tweek@google.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17Merge branch 'predefs' into tipv0.6.0-rc1Luc Van Oostenryck8-59/+93
* add predefined macros for __INTMAX_TYPE__, __INT_MAX__, ...
2018-12-17add predefined macros for [u]int32_tLuc Van Oostenryck1-0/+2
These are a pain. All LP64 archs use [u]int. Good. But some LP32 archs use [u]int and some others use [u]long. Some even use [u]int for some ABI and [u]long for some others (bare metal). This really needs to be target-specific to be correct. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17add predefined macros for [u]int64_tLuc Van Oostenryck1-0/+2
All LP32 archs use [u]llong and all LP64 use [u]long for these, except Darwin which seems to always use [u]llong. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17add predefined macros for [u]int{8,16}_tLuc Van Oostenryck1-0/+4
All LP64 & LP32 use [u]char and [u]short for these ones. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17add predefined macros for [u]intmaxLuc Van Oostenryck1-0/+2
Seems to use [u]long for all LP64 archs and [u]llong for all LP32 ones (except OpenBSD, but it seems to not define the corresponding macros). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17add predefined macros for [u]intptrLuc Van Oostenryck1-0/+2
Luckily, it seems all archs use for them the same types as size_t & ssize_t. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
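The predefined type macros from this series can be cross-checked against <stdint.h> with a sketch like the one below, assuming a compiler that predefines them (GCC and Clang do). The function names are hypothetical.

```c
#include <stdint.h>

/* Each predefined type macro should have the size of its <stdint.h> typedef. */
int predefs_match_stdint(void)
{
	return sizeof(__INT8_TYPE__)   == sizeof(int8_t)
	    && sizeof(__INT16_TYPE__)  == sizeof(int16_t)
	    && sizeof(__INT32_TYPE__)  == sizeof(int32_t)
	    && sizeof(__INT64_TYPE__)  == sizeof(int64_t)
	    && sizeof(__INTMAX_TYPE__) == sizeof(intmax_t)
	    && sizeof(__INTPTR_TYPE__) == sizeof(intptr_t);
}

/* intptr_t is expected to share its size with size_t on the usual ABIs. */
int intptr_has_size_of_size_t(void)
{
	return sizeof(__INTPTR_TYPE__) == sizeof(__SIZE_TYPE__);
}
```

Comparing sizes does not distinguish 'int' from 'long' on LP32, which is exactly the target-specific part these commits deal with; a stricter check would need _Generic or __builtin_types_compatible_p().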
2018-12-17make predefined_type_size() more genericLuc Van Oostenryck2-0/+12
This allows to have a single function to output the size, the type, the maximal value, ... Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-16show-parse: strip do_show_type()'s trailing spaceLuc Van Oostenryck1-2/+2
It's possible that the result of do_show_type() ends with a space. Strip this unneeded space. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14teach sparse about asm inlineLuc Van Oostenryck1-0/+52
GCC's trunk now allows specifying 'inline' with asm statements. This feature has been asked for by kernel devs and will most probably be used for the kernel. So, teach sparse about this syntax too. Note: for sparse, there is no semantic associated to this inline because sparse doesn't make any size-based inlining decisions. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14fix '__SIZE_TYPE__' for LLP64Luc Van Oostenryck1-1/+0
size_t_ctype is set to uint, ulong or ullong, depending on the architecture (ullong is only used for LLP64). However, when emitting '__SIZE_TYPE__', it's only compared to ulong or uint. Fix this by using a small helper directly using the right struct symbol * and using builtin_typename() to output the right type. This way we're guaranteed that '__SIZE_TYPE__' is kept coherent with the internal type: size_t_ctype. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14testsuite: test predef macros on LP32/LP64/LLP64Luc Van Oostenryck7-59/+70
Now these tests should succeed and be meaningful on all archs. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-12Merge branch 'as-named' into tipLuc Van Oostenryck5-20/+37
* prepare to identify & display the address spaces by name
2018-12-12as-name: allow ident as address_spaceLuc Van Oostenryck1-0/+17
Currently, address space 1 is displayed as '<asn:1>' and so on. Now that address spaces can be displayed by name, the address space number should just be an implementation detail and it would make more sense to be able to 'declare' these address spaces directly by name, like: #define __user __attribute__((noderef, address_space(__user))) Since directly using the name instead of a number creates some problems internally, allow this syntax but for the moment keep the address space number and use a table to look up the number from the name. References: https://marc.info/?l=linux-sparse&m=153627490128505 Idea-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
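A minimal sketch of how such a named address space could be used. The __force macro follows the usual kernel convention and read_user_int() is a made-up name; both are assumptions for illustration, and the annotations are only active when the file is processed by sparse (__CHECKER__ defined).

```c
#ifdef __CHECKER__
/* The inner '__user' is not expanded again, so it reaches sparse as an identifier. */
# define __user  __attribute__((noderef, address_space(__user)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

/* A __user pointer can't be dereferenced directly; the forced cast removes
 * the address space (and the noderef) explicitly. */
int read_user_int(const int __user *ptr)
{
	return *(const int __force *)ptr;
}
```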
2018-12-10Merge branch 'fix-non-const-case' into tipLuc Van Oostenryck1-0/+37
* fix linearization of non-constant switch-cases
2018-12-09as-name: add and use show_as()Luc Van Oostenryck4-20/+20
Use a function to display the address spaces. This will allow to display a real name instead of '<asn:1>'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-09Merge branch 'dump-macros'Luc Van Oostenryck2-0/+52
* fixes for -dD * add support for -dM Luc Van Oostenryck (2): dump-macro: break the loop at TOKEN_UNTAINT dump-macro: simplify processing of whitespace Ramsay Jones (5): pre-process: suppress trailing space when dumping macros pre-process: print macros containing # and ## correctly pre-process: don't put spaces in macro parameter list pre-process: print variable argument macros correctly pre-process: add the -dM option to dump macro definitions
2018-12-09don't allow newlines inside string literalsLuc Van Oostenryck2-4/+3
Sparse allows (but warns about) a bare newline (not preceded by a backslash) inside a string. Since this is invalid C, it's probable that a terminating '"' is missing just before the newline. In this case, allowing the newline implies accepting the following characters until the next '"' is found, which in most cases creates a lot of irrelevant warnings. Change this by disallowing newlines inside strings, exactly like already done for character constants. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-08add testcase for missing deliminator ' or "Luc Van Oostenryck1-0/+18
Add a testcase for the upcoming change of "Newline in string or character constant" vs. "missing delimiter". Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-01Conditionalize 'warning: non-ANSI function ...'John Levon4-0/+53
Sparse unconditionally issues warnings about non-ANSI function declarations & definitions. However, some environments have large amounts of legacy headers that are pre-ANSI, and can't easily be changed. These generate a lot of useless warnings. Fix this by using the options flags -Wstrict-prototypes & -Wold-style-definition to conditionalize these warnings. Signed-off-by: John Levon <levon@movementarian.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-01Use -Wimplicit-int when warning about missing K&R argument typesLuc Van Oostenryck1-0/+15
In legacy environments, a lot of warnings can be issued about arguments without an explicit type. Fix this by conditionalizing such warnings with the flag -Wimplicit-int, reducing the level of noise in such environments. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-01fix implicit K&R argument typesLuc Van Oostenryck1-0/+16
In an old-style function definition, if not explicitly specified, the type of an argument defaults to 'int'. Sparse issues an error for such arguments and leaves the type as 'incomplete'. This can then create a cascade of other warnings. Fix this by effectively giving the type 'int' to such arguments. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-29Ignore #ident directivesJohn Levon2-0/+24
Legacy code can be littered with the non-standard "#ident" directive; ignore it. Signed-off-by: John Levon <levon@movementarian.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24pre-process: add the -dM option to dump macro definitionsRamsay Jones2-0/+42
The current -dD option outputs the macro definitions, in addition to the pre-processed text. In contrast, the -dM option outputs only the macro definitions. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24pre-process: print variable argument macros correctlyRamsay Jones1-0/+5
The dump_macros() function fails to correctly output the definition of macros that have a variable argument list. For example, the following macros: #define unlocks(...) annotate(unlock_func(__VA_ARGS__)) #define apply(x,...) x(__VA_ARGS__) are output like so: #define unlocks(__VA_ARGS__) annotate(unlock_func(__VA_ARGS__)) #define apply(x,__VA_ARGS__) x(__VA_ARGS__) Add the code necessary to print the ellipsis in the argument list to the dump_macros() function and add the above macros to the 'dump-macros.c' test file. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
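For illustration, here is how the second macro from this commit message behaves when used. The helper functions add() and apply_add() are made up for this sketch.

```c
/* Variadic macro taken from the commit message above. */
#define apply(x,...) x(__VA_ARGS__)

static int add(int a, int b)
{
	return a + b;
}

/* apply(add, 1, 2) expands to add(1, 2). */
int apply_add(void)
{
	return apply(add, 1, 2);
}
```

The parameter list must be dumped with the ellipsis, since __VA_ARGS__ is only valid in the replacement list of a macro declared with '...'.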
2018-11-24pre-process: don't put spaces in macro parameter listRamsay Jones1-1/+1
The dump_macros() function adds a ", " separator between the arguments of a function-like macro. Using a simple "," separator, which aligns the output with gcc, leads to one less distraction when comparing the output of sparse and gcc. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24pre-process: print macros containing # and ## correctlyRamsay Jones1-0/+5
The dump_macro() function fails to correctly output the definitions of macros that contain the string operator '#', the concatenation operator '##' and any macro parameter in the definition token list. For example, the following macros: #define STRING(x) #x #define CONCAT(x,y) x ## y are output like so: #define STRING(x) unhandled token type '21' #define CONCAT(x, y) unhandled token type '22' unhandled token type '23' unhandled token type '22' Add the code necessary to handle those token types to the dump_macros() function and add the above macros to the 'dump-macros.c' test file. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
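The two macros from this commit message can be exercised with a short sketch. The helper functions and the identifiers passed to the macros are made up for illustration.

```c
#include <string.h>

/* Macros taken from the commit message above. */
#define STRING(x) #x
#define CONCAT(x,y) x ## y

/* STRING(hello) expands to the string literal "hello". */
int stringify_works(void)
{
	return strcmp(STRING(hello), "hello") == 0;
}

/* CONCAT(val, 1) pastes its arguments into the single identifier val1. */
int concat_result(void)
{
	int CONCAT(val, 1) = 42;
	return val1;
}
```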
2018-11-23constant: add -Wconstant-suffix warningRamsay Jones2-0/+30
Currently, when used on the kernel, sparse issues a bunch of warnings like: warning: constant 0x100000000 is so big it is long These warnings are issued when there is a discrepancy between the type as indicated by the suffix (or the absence of a suffix) and the real type as selected by the type suffix *and* the value of the constant. Since there is nothing incorrect with this discrepancy (no bits are lost), these warnings are more annoying than useful. So, make them depend on a new warning flag -Wconstant-suffix and make it off by default. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
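The discrepancy described above follows from the C rules for unsuffixed constants: the type is the first one in a fixed list that can hold the value. A small sketch (the function names are hypothetical):

```c
#include <limits.h>

/* 0x100000000 needs 33 bits, so its type is wider than a 32-bit int
 * even without any suffix: no bits are lost. */
int big_constant_bits(void)
{
	return (int)(sizeof(0x100000000) * CHAR_BIT);
}

/* A small unsuffixed constant simply has type 'int'. */
int small_constant_is_int(void)
{
	return sizeof(0x1000) == sizeof(int);
}
```

On an LP64 target the first function is expected to return 64, since the constant gets the type 'long' as reported in the warning.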
2018-11-22sparsei: add the --[no-]jit optionsRamsay Jones1-1/+1
On the cygwin platform, a 'sparsei' backend test, which uses the llvm 'lli' tool, fails due to a dynamic linking error: $ make check ... TEST sum from 1 to n (backend/sum.c) error: actual output text does not match expected output text. error: see backend/sum.c.output.* for further investigation. --- backend/sum.c.output.expected 2018-06-03 18:27:11.502760500 +0100 +++ backend/sum.c.output.got 2018-06-03 18:27:11.307670000 +0100 @@ -1,2 +0,0 @@ -15 -5050 error: actual error text does not match expected error text. error: see backend/sum.c.error.* for further investigation. --- backend/sum.c.error.expected 2018-06-03 18:27:11.562997400 +0100 +++ backend/sum.c.error.got 2018-06-03 18:27:11.481038800 +0100 @@ -0,0 +1 @@ +LLVM ERROR: Program used external function 'printf' which could not be resolved! error: Actual exit value does not match the expected one. error: expected 0, got 1. ... Out of 288 tests, 277 passed, 11 failed (10 of them are known to fail) make: *** [Makefile:236: check] Error 1 $ Note the 'LLVM ERROR' about the 'printf' external function which could not be resolved (linked). On Linux, it seems that the 'lli' tool (JIT compiler) can resolve the 'printf' symbol, with the help of the dynamic linker, since the tool itself is linked to the (dynamic) C library. On windows (hence also on cygwin), the 'lli' tool fails to resolve the external symbol, since it is not exported from the '.exe'. The 'lli' tool can be used as an interpreter, so that the JIT compiler is disabled, which also side-steps this external symbol linking problem. Add the --[no-]jit options to the 'sparsei' tool, which in turn uses (or not) the '-force-interpreter' option to 'lli'. In order to fix the failing test-case, simply pass the '--no-jit' option to 'sparsei'. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-20fix expansion of function designatorLuc Van Oostenryck1-1/+0
The expression corresponding to the function pointer of an indirect call can be arbitrarily complex. For example, it can contain a statement expression or another call, possibly inlined. These expressions must be expanded to ensure that sub-expressions involving 'sizeof()' or other operators taking a type as argument (like __builtin_types_compatible_p()) are no longer present (because these expressions always evaluate to a compile-time constant and so are not expected and thus not handled at linearization time). However, this is not currently enforced, possibly causing some failures during linearization with warnings like: warning: unknown expression (4 0) (which corresponds to EXPR_TYPE). Fix this, during the expansion of function calls, by also expanding the corresponding designator. References: https://lore.kernel.org/lkml/1542623503-3755-1-git-send-email-yamada.masahiro@socionext.com/ Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-11-20add testcase for missing function designator expansionLuc Van Oostenryck1-0/+23
Add a testcase showing that function designators are not expanded. References: https://lore.kernel.org/lkml/1542623503-3755-1-git-send-email-yamada.masahi> Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05Merge branch 'fix-enum-type' into tipLuc Van Oostenryck14-3/+305
2018-10-05enum: more specific error message for empty enumLuc Van Oostenryck1-1/+1
Currently, the error message issued for an empty enum is "bad enum definition". This is exactly the same message used when one of the enumerator is invalid. Fix this by using a specific error message. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05enum: default to unsignedLuc Van Oostenryck3-4/+3
GCC uses an unsigned type for enum's basetype unless one of the enumerators is negative. Using 'int' for plain simple enumerators and then using the same rule as for integer constants (int -> unsigned int -> long -> ...) should be more natural but doing so creates useless warnings when using sparse on the kernel code. So, do the same as GCC: * use the smallest type that fits all enumerators, * use at least int or unsigned int, * use a signed type only if one of the enumerators is negative. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
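The GCC rule can be probed with a cast, as in the sketch below. The enum names and example() are illustrative assumptions; the first assertion relies on GCC-compatible behaviour (the base type is implementation-defined), so it is not guaranteed with other compilers.

```c
#include <assert.h>

enum plain    { A, B, C };          /* no negative enumerator */
enum with_neg { N = -1, P = 1 };    /* one negative enumerator */

static void example(void)
{
    /* Assumes GCC's rule: the base type is unsigned int, so -1 wraps to a large value. */
    assert((enum plain)-1 > 0);
    /* A negative enumerator forces a signed base type. */
    assert((enum with_neg)-1 < 0);
}
```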
2018-10-05enum: warn when mixing different restricted typesLuc Van Oostenryck1-0/+20
Sparse supports enum initializers with bitwise types but this makes sense only if they are all the same type. Add a check and issue a warning if an enum is initialized with different restricted types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05enum: only warn (once) when mixing bitwisenessLuc Van Oostenryck1-0/+29
As an extension to the standard C types, sparse supports bitwise types (also called 'restricted') which should under no circumstances mix with other types. In the kernel, some enums are defined with such bitwise types as initializers; the goal being to have slightly more strict enums. While the semantics of such enums are not very clear, using a mix of bitwise and non-bitwise initializers completely defeats the desired stricter typing. Draw attention to such mixed initializations by issuing a single warning for each such declaration. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05enum: use the smallest type that fitLuc Van Oostenryck3-3/+0
The C standard requires that the type of enum constants is 'int' and lets the enum base/compatible type be implementation-defined. For this base type, instead of 'int', GCC uses the smallest type that can represent all the values of the enum (int, unsigned int, long, ...) Sparse has the same logic as GCC but if all the initializers have the same type, this type is used instead. This is a sensible choice but often gives different results than GCC. To stay more compatible with GCC, always use the same logic and thus only keep the common type as base type for restricted types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05enum: fix cast_enum_list()Luc Van Oostenryck1-1/+0
Sparse wants all of an enum's enumerators to have the same type. This is done by first determining the common type and then calling cast_enum_list(), which uses cast_value() on each member to cast them to the common type. However, cast_value() doesn't create a new expression and doesn't change the ctype of the target: the target expression is supposed to already have the right type and it's just the value that is transferred from the source expression and size-adjusted. It seems that in cast_enum_list() this has been overlooked, with the result that the value is correctly adjusted but keeps its original type. Fix this by updating, for each member, the desired type. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05enum: add testcase for base & enumerator typeLuc Van Oostenryck8-0/+227
Add various testcases for checking enum's base & enumerator type. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05enum: add testcase for type of enum membersLuc Van Oostenryck1-0/+15
Members of an enum should all have the same type but this is currently not the case. Add a testcase for it and mark it as 'known-to-fail'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05enum: fix UB when rshifting by full widthLuc Van Oostenryck1-1/+0
Shifting by an amount greater than or equal to the width of the type is Undefined Behaviour. In the present case, when type_is_ok() is called with a type as wide as an ullong (64 bits here), the bounds are shifted by 64 which is UB and at execution (on x86) the value is simply unchanged (since the shift is done with the amount modulo 64). This, of course, doesn't give the expected result and as a consequence valid enums can have an invalid base type (bad_ctype). Fix this by doing the shift with a small helper which returns 0 if the amount is equal to the maximum width. NB. Doing the shift in two steps could also be a solution, as could some clever trick, but since this code is in no way critical performance-wise, the solution here has the merit of being very explicit. Fixes: b598c1d75a9c455c85a894172329941300fcfb9f Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
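A helper of the kind described could look like the sketch below. The name rshift() and its exact form are assumptions made for illustration, not the code that was committed; this version also returns 0 for amounts larger than the width, which is slightly more defensive than the message requires.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: right shift that stays defined for a full-width amount. */
static unsigned long long rshift(unsigned long long val, unsigned int n)
{
    const unsigned int width = sizeof(val) * CHAR_BIT;

    if (n >= width)     /* a full-width (or larger) shift would be UB */
        return 0;
    return val >> n;
}

static void example(void)
{
    const unsigned int width = sizeof(unsigned long long) * CHAR_BIT;

    assert(rshift(~0ULL, width) == 0);      /* full width: defined result */
    assert(rshift(~0ULL, width - 1) == 1);  /* ordinary shifts are unchanged */
    assert(rshift(0x100ULL, 4) == 0x10ULL);
}
```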
2018-10-05enum: add testcase for UB in oversized shiftLuc Van Oostenryck1-0/+17
type_is_ok(), used to calculate the base type of enums, has a bug related to UB when doing a full width rshift. Add a testcase for this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-26print address space number for cast-from-AS warningsVincenzo Frascino2-3/+63
This patch prints the address space number when a warning "cast removes address space of expression" is triggered. This makes it easier to discriminate between different address spaces. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-10ssa: relax what can be promotedLuc Van Oostenryck1-2/+0
During SSA conversion, it is checked what can be promoted and what cannot. Obviously, ints, longs, pointers can be promoted, enums and bitfields can too. Complications arise with unions and structs. Currently unions are only accepted if they contain integers of the same size. For structs it's even more complicated because we want to convert simple bitfields. What should be accepted is structs containing either: * a single scalar * only bitfields and only if the total size is < long However the test was slightly more strict than that: it didn't allow a struct with a total size bigger than a long. As a consequence, on ILP32, a struct containing a single double wasn't promoted. Fix this by moving the test about the total size and applying it only if some bitfield was present. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-10test: make 32-bit version of failed testLuc Van Oostenryck2-2/+31
The test mem2reg/init-local.c succeeds on 64-bit but fails on 32-bit. Duplicate the test, one with -m64 and the other with -m32 and mark this one as known-to-fail. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-10test: use integers of different sizes, even on 32-bitLuc Van Oostenryck1-2/+2
The test optim/cse-size fails on 32-bit because it needs two integers of different size but uses int & long. These two types have indeed different sizes on 64-bit (LP64) but not on 32-bit (ILP32). Fix this by using short & int. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-10test: make test Waddress-space-strict succeed on 32-bitLuc Van Oostenryck1-26/+7
The test Waddress-space-strict made assumptions about the relative size of integers & pointers. Since this test was crafted on a 64-bit machine, the test was running fine for LP64 but failed on a 32-bit machine (or anything using ILP32, like using the -m32 option). However, since the test is about conversion of address-spaces, using integers of different size adds no value, and indeed brings problems. Fix this by limiting the conversions to a single integer type, the one with the same size as pointers on ILP32 & LP64: long. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-08fix linearization of non-constant switch-casesLuc Van Oostenryck1-1/+0
The linearization of switches & cases makes the assumption that the expressions for the cases are constants (EXPR_VALUE). So, the corresponding values are dereferenced without checks. However, if the code uses a non-constant case, this dereference produces a random value, probably one corresponding to some pointers belonging to the real type of the expression. Fix this by checking during linearization the constness of the expression and ignore the non-constant ones. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-08add testcase for non-constant switch-caseLuc Van Oostenryck1-0/+38
Switches with non-constant cases are currently linearized using as value the bit pattern present in the expression, creating more or less random multijmps. Add a basic testcase to catch this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06Merge branch 'rem-trivial-phi' into tipLuc Van Oostenryck1-0/+14
* remove more complex phi-nodes
2018-09-06Merge branches 'missing-return' and 'fix-logical-phi' into tipLuc Van Oostenryck13-90/+281
* fix linearization/SSA when missing a return * fix linearization/SSA of (nested) logical expressions
2018-09-06fix linearization of nested logical exprLuc Van Oostenryck4-93/+90
The linearization of nested logical expressions is not correct regarding the phi-nodes and their phi-sources. For example, code like: extern int a(void); int b(void); int c(void); static int foo(void) { return (a() && b()) && c(); } gives (optimized) IR like: foo: phisrc.32 %phi1 <- $0 call.32 %r1 <- a cbr %r1, .L4, .L3 .L4: call.32 %r3 <- b cbr %r3, .L2, .L3 .L2: call.32 %r5 <- c setne.32 %r7 <- %r5, $0 phisrc.32 %phi2 <- %r7 br .L3 .L3: phi.32 %r8 <- %phi2, %phi1 ret.32 %r8 The problem can already be seen by the fact that the phi-node in L3 has 2 operands while L3 has 3 parents. There is no phi-value for L4. The code is OK for non-nested logical expressions: linearize_cond_branch() takes the success/failure BB as argument, generates the code for those branches and there is a phi-node for each of them. However, with nested logical expressions, one of the BBs will be shared between the inner and the outer expression. The phisrc will 'cover' one of the BBs but only one of them. The solution is to add the phi-sources not before but after and add one for each of the parent BBs. This way, it can be guaranteed that each parent BB has its phisrc, whatever the complexity of the sub-expressions. With this change, the generated IR becomes: foo: call.32 %r2 <- a phisrc.32 %phi1 <- $0 cbr %r2, .L4, .L3 .L4: call.32 %r4 <- b phisrc.32 %phi2 <- $0 cbr %r4, .L2, .L3 .L2: call.32 %r6 <- c setne.32 %r8 <- %r6, $0 phisrc.32 %phi3 <- %r8 br .L3 .L3: phi.32 %r1 <- %phi1, %phi2, %phi3 ret.32 %r1 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
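The three ways of reaching the join block can be exercised at the source level, as in the following sketch. The stub bodies, the 'calls' bitmask and the run() and example() helpers are illustrative assumptions; only foo() comes from the message above.

```c
#include <assert.h>

static int ret_a, ret_b, ret_c;     /* values returned by the stubs */
static int calls;                   /* bitmask of the stubs that ran */

static int a(void) { calls |= 1; return ret_a; }
static int b(void) { calls |= 2; return ret_b; }
static int c(void) { calls |= 4; return ret_c; }

static int foo(void)
{
    return (a() && b()) && c();
}

/* Set the stub results, evaluate foo() and report its value. */
static int run(int va, int vb, int vc)
{
    ret_a = va; ret_b = vb; ret_c = vc;
    calls = 0;
    return foo();
}

static void example(void)
{
    /* a() is false: the join is reached from the entry block with 0. */
    assert(run(0, 1, 1) == 0 && calls == 1);
    /* b() is false: the join is reached from the second block with 0. */
    assert(run(1, 0, 1) == 0 && calls == 3);
    /* Both true: the join is reached from the last block with (c() != 0). */
    assert(run(1, 1, 5) == 1 && calls == 7);
}
```

Each assertion corresponds to one predecessor of the join block, which is why a valid phi-node there needs three operands.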
2018-09-06add tests for nested logical exprLuc Van Oostenryck1-0/+49
Nested logical expressions are not correctly linearized. Add a test for all possible combinations of 2 logical operators. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06fix ordering of phi-node operandLuc Van Oostenryck2-5/+4
The linearization of logical '&&' creates a phi-node with its operands in the wrong order relative to the parent BBs. Switch the order of the operands for logical '&&'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06add testcases for wrong ordering in phi-nodesLuc Van Oostenryck4-0/+55
In valid SSA there is a 1-to-1 correspondence between each operand of a phi-node and the parent BBs. However, currently, this is not always respected. Add testcases for the known problems. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06return nothing only in void functionsLuc Van Oostenryck1-1/+0
Currently, the code for the return is only generated if the function effectively returns a type or a value with a size greater than 0. But this means that a non-void function with an error in its return expression is considered as a void function as far as the generated IR is concerned, making things incoherent. Fix this by using the declared type instead of the type of the return expression. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06use UNDEF for missing returnsLuc Van Oostenryck5-5/+0
If a return statement is missing in the last block, the generated IR will be invalid because the number of operands in the exit phi-node will not match the number of parent BBs. Detect this situation and insert an UNDEF for the missing value. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06topasm: top-level asm is specialLuc Van Oostenryck1-0/+0
Top-level ASM statements are parsed as fake anonymous functions. Obviously, they have little in common with functions (for example, they don't have a return type) and mixing the two makes things more complicated than needed (for example, to detect a top-level ASM, we had to check that the corresponding symbol (name) had a null ident). Avoid potential problems by special-casing them and returning early in linearize_fn(). As a consequence, they no longer have an OP_ENTRY as first instruction and can be detected by testing ep->entry. Note: It would be more logical to catch them even earlier, in linearize_symbol(), but they also need an entrypoint and an active BB so that we can generate the single statement. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-05add testcases for missing return in last blockLuc Van Oostenryck6-0/+97
In this case the phi-node created for the return value ends up with a missing operand, violating the semantic of the phi-node: map one value with each predecessor. Add testcases for these missing returns. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01stricter warning for explicit cast to ulongLuc Van Oostenryck1-0/+56
sparse issues a warning when user pointers are cast to integer types, except to unsigned long, which is explicitly allowed. However it may happen that we would like to also be warned on casts to unsigned long. Fix this by adding a new warning flag, -Wcast-from-as (to mirror -Wcast-to-as), which extends -Waddress-space to all casts that remove an address space attribute (without using __force). References: https://lore.kernel.org/lkml/20180628102741.vk6vphfinlj3lvhv@armageddon.cambridge.arm.com/ Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01Merge branch 'dead-switch' into tipLuc Van Oostenryck1-0/+19
* fix linearization of unreachable switch + label
2018-09-01Merge branch 'has-attribute' into tipLuc Van Oostenryck1-0/+56
* add support for __has_attribute()
2018-09-01trivial-phi: remove more complex trivial phi-nodesLuc Van Oostenryck1-1/+0
In a set of related phi-nodes and phi-sources if all phi-sources but one correspond to the target of one of the phi-sources, then no phi-node is needed and all %phis can be replaced by the unique source. For example, code like: int test(void); int foo(int a) { while (test()) a ^= 0; return a; } used to produce an IR with a phi-node for 'a', like: foo: phisrc.32 %phi2(a) <- %arg1 br .L4 .L4: phi.32 %r7(a) <- %phi2(a), %phi3(a) call.32 %r1 <- test cbr %r1, .L2, .L5 .L2: phisrc.32 %phi3(a) <- %r7(a) br .L4 .L5: ret.32 %r7(a) but since 'a ^= 0' is a no-op, the value of 'a' is in fact never modified. This can be seen in the phi-node where its second operand (%phi3) is the same as its target (%r7). So the only possible value for 'a' is the one from the first operand, its initial value (%arg1). Once this trivial phi-node is removed, the IR is the expected: foo: br .L4 .L4: call.32 %r1 <- test cbr %r1, .L4, .L5 .L5: ret.32 %arg1 Removing these trivial phi-nodes will usually trigger other simplifications, especially those concerning the CFG. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
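The source-level claim, that foo() always returns its argument whatever the number of iterations, can be checked with a small sketch. The 'remaining' counter and this body of test() are illustrative stand-ins, since the message only declares test().

```c
#include <assert.h>

static int remaining;   /* iterations left; illustrative stand-in state for test() */

static int test(void)
{
    return remaining-- > 0;
}

static int foo(int a)
{
    while (test())
        a ^= 0;     /* no-op: 'a' never changes */
    return a;
}

static void example(void)
{
    remaining = 3;              /* loop body runs three times */
    assert(foo(42) == 42);
    remaining = 0;              /* loop body never runs */
    assert(foo(-7) == -7);
}
```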
2018-09-01trivial-phi: add testcase for unneeded trivial phi-nodesLuc Van Oostenryck1-0/+15
Trivial phi-nodes are phi-nodes having a unique possible outcome. So, there is nothing to join and the phi-node target can be replaced by the unique value. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01fix linearization of unreachable switch (with reachable label).Luc Van Oostenryck1-1/+0
An unreachable/inactive switch statement is currently not linearized. That's nice because it avoids creating useless instructions. However, the body of the statement can contain a label which can be reachable. If so, the resulting IR will contain a branch to a nonexistent BB. Bad. For example, code like: int foo(int a) { goto label; switch(a) { default: label: break; } return 0; } (which is just a complicated way to write: int foo(int a) { return 0; }) is linearized as: foo: br .L1 Fix this by linearizing the statement even if not active. Note: it seems that none of the other statements are discarded if inactive. Good. OTOH, statement expressions can also contain (reachable) labels and thus would need the same fix (which will need much more work). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01add tescase for unreachable label in switchLuc Van Oostenryck1-0/+20
or more exactly, an unreachable switch statement but containing a reachable label. This is valid code but is currently wrongly linearized. So, add a testcase for it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01has-attr: add support for __has_attribute()Luc Van Oostenryck1-1/+0
Sparse has support for a subset of GCC's large collection of attributes. It's not easy to know which versions support this or that attribute. However, since GCC5 there is a good solution to this problem: the magic macro __has_attribute(<name>) which evaluates to 1 if <name> is an attribute known to the compiler and 0 otherwise. Add support for this __has_attribute() macro by extending the already existing support for __has_builtin(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
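A typical use of the macro looks like the sketch below. MAYBE_UNUSED, next() and example() are hypothetical names chosen for illustration; the '#ifndef' fallback is the usual idiom for compilers that do not provide __has_attribute() at all.

```c
#include <assert.h>

/* Fallback for compilers without __has_attribute(): treat every attribute as unknown. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

#if __has_attribute(unused)
# define MAYBE_UNUSED __attribute__((unused))
#else
# define MAYBE_UNUSED
#endif

/* 'debug' is deliberately unused; MAYBE_UNUSED silences the warning when supported. */
static int next(int x, MAYBE_UNUSED int debug)
{
    return x + 1;
}

static void example(void)
{
    assert(next(1, 0) == 2);
}
```

Note that the macro is only evaluated inside '#if' here, which is the portable way to use it.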
2018-09-01has-attr: add testcase for __has_attribute()Luc Van Oostenryck1-0/+57
Add a testcase for the incoming support of __has_attribute(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-30Merge branch 'volatile-bitfield' and 'mode-pointer' into tipLuc Van Oostenryck2-0/+34
* fix: do not optimize away accesses to volatile bitfields * support mode(__pointer__) and mode(__byte__)
2018-08-25fix: do not optimize away accesses to volatile bitfieldsLuc Van Oostenryck1-1/+0
Accesses to volatiles must, of course, not be optimized away. For this, we need to check the type associated with the memory access. Currently this is done by checking if the type of the result of the memop is volatile or not. Usually, the type of the result is the same as the one of the access so everything is good but for bitfields, the memop is not done with the type of the bitfield itself but with its base type. Since this base type is unrelated to the access type, it is generally not marked as volatile even when the access to the bitfield is volatile. Fix this by using the true type of the access to set the field struct instruction::is_volatile. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-25add testcase for accesses to volatile bitfieldsLuc Van Oostenryck1-0/+17
Accesses to volatile bitfields must, of course, not be optimized away. This is currently not the case. Add a testcase for it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-25Merge branch 'ssa' into tipLuc Van Oostenryck32-71/+301
* do 'classical' SSA conversion (via the iterated dominance frontier). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-25testsuite: remove useless test for loop-linearizationLuc Van Oostenryck1-136/+0
This testcase was added a bit too quickly in order to have minimal testing of loop linearization. However, such a test, just comparing the raw output of test-linearize, is a big PITA because it's so sensitive to things like the pseudos' names, which themselves depend very much on details of the linearization and simplification. Also, this test didn't really test anything, it only allowed tracking changes. Remove it as it has no testing value. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-25Merge branch 'kill-dead-stores' into tipLuc Van Oostenryck4-0/+128
* fix buggy recursion in kill_dead_stores() * kill dead stores again after memops simplification is done. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-25add a testcase for enum using a modeLuc Van Oostenryck1-0/+18
Sparse can apply a mode on plain integer types. Add a known-to-fail testcase showing the problem. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-24Merge branches 'optim-trunc-or' and 'optim-mask-shift-or' into tipLuc Van Oostenryck4-4/+0
* simplify TRUNC((x & M') | y, N) * simplify AND(SHIFT(a | b, S), M) * simplify TRUNC(SHIFT(a | b, S), N)
2018-08-24simplify TRUNC(SHIFT(a | b, S), N)Luc Van Oostenryck2-2/+0
The simplification of TRUNC(SHIFT(a | b, S), N) can be done by combining the effective mask corresponding to TRUNC(_, N) with the one corresponding to SHIFT(_, S). This allows to also simplify signed bitfields. For example, code like: struct s { signed int :2; signed int f:3; }; int bfs(struct s s, int a) { s.f = a; return s.f; } is now simplified into the minimal: bfs: trunc.3 %r4 <- (32) %arg2 sext.32 %r11 <- (3) %r4 ret.32 %r11 The simplification is done by calling simplify_mask_shift() with the mask corresponding to TRUNC(_, N). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
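The 'trunc.3' + 'sext.32' pair corresponds to keeping the low 3 bits and sign-extending them, as the sketch below illustrates. trunc3_sext() and example() are illustrative names. Storing an out-of-range value into a signed bitfield is implementation-defined in C; the assertions assume the wrap-around behaviour documented by GCC.

```c
#include <assert.h>

struct s {
    signed int  :2;
    signed int f:3;
};

static int bfs(struct s s, int a)
{
    s.f = a;
    return s.f;
}

/* Hand-written equivalent: keep 3 bits, then sign-extend them. */
static int trunc3_sext(int a)
{
    return ((a & 7) ^ 4) - 4;
}

static void example(void)
{
    struct s s = { 0 };

    assert(bfs(s, 3) == 3);
    assert(bfs(s, 5) == -3);    /* 0b101 read back as a signed 3-bit value */
    assert(bfs(s, 5) == trunc3_sext(5));
    assert(bfs(s, -1) == trunc3_sext(-1));
}
```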
2018-08-24simplify AND(SHIFT(a | b, S), M)Luc Van Oostenryck2-2/+0
The simplification of AND(SHIFT(a | b, S), M) can be done by combining the mask M with the effective mask corresponding to SHIFT(_, S). This instruction pattern is generated when accessing bitfields, for example, code like: struct u { unsigned int :2; unsigned int f:3; }; int bfu(struct u s, int a) { s.f = a; return s.f; } is now simplified into the minimal: bfu: and.32 %r11 <- %arg2, $7 ret.32 %r11 The simplification is done by introducing a small helper, simplify_mask_shift(), doing the pattern matching and then calling simplify_mask_shift_or() with the mask M. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
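At the source level, the simplified IR says that storing into and reloading from the 3-bit unsigned field is the same as masking with 7. A small sketch, where bfu_simplified() and example() are illustrative names:

```c
#include <assert.h>

struct u {
    unsigned int  :2;
    unsigned int f:3;
};

static int bfu(struct u s, int a)
{
    s.f = a;
    return s.f;
}

/* What the simplified IR computes: and.32 %r11 <- %arg2, $7 */
static int bfu_simplified(int a)
{
    return a & 7;
}

static void example(void)
{
    struct u s = { 0 };

    assert(bfu(s, 13) == 5);
    assert(bfu(s, -1) == 7);
    assert(bfu(s, 13) == bfu_simplified(13));
    assert(bfu(s, -1) == bfu_simplified(-1));
}
```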
2018-08-22simplify TRUNC((x & M') | y, N)Luc Van Oostenryck4-4/+0
An N-bit truncate is not much different than ANDing with an N-bit mask and so some simplifications done for AND can also be done for TRUNC. For example for code like this: char foo(int x, int y) { return (x & 0xffff) | y; } the mask is unneeded and the function should be equivalent to: char foo(int x, int y) { return x | y; } The simplification in this patch does exactly this, giving: foo: or.32 %r4 <- %arg1, %arg2 trunc.8 %r5 <- (32) %r4 ret.8 %r5 while previously the mask was not optimized away: foo: and.32 %r2 <- %arg1, $0xffff or.32 %r4 <- %r2, %arg2 trunc.8 %r5 <- (32) %r4 ret.8 %r5 This simplification is especially important for signed bitfields because the TRUNC+ZEXT of unsigned bitfields is simplified into an OP_AND but this is, of course, not the case for the TRUNC+SEXT of signed bitfields. Do the simplification by calling simplify_mask_or(), initially used for OP_AND, but with the effective mask corresponding to TRUNC(x, N): $mask(N). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
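The equivalence of the two versions of foo() can be checked directly, as sketched here. foo_simplified() and example() are illustrative names; the conversion to a possibly signed 'char' is implementation-defined, but both functions go through the same conversion of the same low 8 bits.

```c
#include <assert.h>

static char foo(int x, int y)
{
    return (x & 0xffff) | y;
}

/* Same function without the mask, which cannot affect the low 8 bits. */
static char foo_simplified(int x, int y)
{
    return x | y;
}

static void example(void)
{
    assert(foo(0x12345678, 0x0f) == 0x7f);
    assert(foo(0x12345678, 0x0f) == foo_simplified(0x12345678, 0x0f));
    assert(foo(-1, 0) == foo_simplified(-1, 0));
}
```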
2018-08-22Merge branches 'optim-shift-and' and 'optim-bitfield' into tipLuc Van Oostenryck38-0/+628
2018-08-22simplify ((x & M) << S) when (M << S) == (-1 << S)Luc Van Oostenryck1-1/+0
The instructions SHL(AND(x, M), S) can be simplified into SHL(x, S) if (M << S) == (-1 << S). For example, code like: unsigned foo(unsigned x) { return (x & 0x000fffff) << 12; } is now optimized into: foo: shl.32 %r3 <- %arg1, $12 ret.32 %r3 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
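In the example, M << S is 0xfffff000, which is also -1 << S for a 32-bit value, so the mask only clears bits that the shift discards anyway. A minimal sketch using uint32_t to make the 32-bit width explicit (the function names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

static uint32_t shl_masked(uint32_t x)
{
    return (x & 0x000fffff) << 12;
}

/* Simplified form: the mask is redundant. */
static uint32_t shl_plain(uint32_t x)
{
    return x << 12;
}

static void example(void)
{
    assert(shl_masked(0xdeadbeef) == 0xdbeef000);
    assert(shl_masked(0xdeadbeef) == shl_plain(0xdeadbeef));
    assert(shl_masked(0xffffffff) == shl_plain(0xffffffff));
}
```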
2018-08-22simplify ((x & M) << S) when (M << S) == 0Luc Van Oostenryck1-1/+0
The instructions SHL(AND(x, M), S) can be simplified to 0 if (M << S) == 0. For example code like: unsigned foo(unsigned x) { return (x & 0xfff00000) << 12; } is now simplified into: foo: ret.32 $0 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-22simplify ((x & M) >> S) when (M >> S) == (-1 >> S)Luc Van Oostenryck1-1/+0
The instructions LSR(AND(x, M), S) are already simplified into AND(LSR(x, S), (M >> S)) but only if AND(x, M) has a single user. However, if (M >> S) == (-1 >> S), the AND part is redundant and the whole can always directly be simplified into LSR(x, S). For example, code like: unsigned foo(unsigned x) { unsigned t = (x & 0xfffff000); return ((t >> 12) ^ (x >> 12)) & t; } is now optimized into: foo: ret.32 $0 because (t >> 12) is simplified into (x >> 12). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-22simplify ((x & M) >> S) when (M >> S) == 0Luc Van Oostenryck1-1/+0
The instructions LSR(AND(x, M), S) are already simplified into AND(LSR(x, S), (M >> S)) but only if AND(x, M) has a single user. However, if (M >> S) == 0, they can always directly be simplified to 0. For example code like: unsigned foo(unsigned x) { unsigned t = (x & 0x00000fff); return (t >> 12) & t; } is now simplified into: foo: ret.32 $0 while previously it was: foo: and.32 %r2 <- %arg1, $0xfff lsr.32 %r4 <- %r2, $12 and.32 %r6 <- %r4, %r2 ret.32 %r6 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
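Here M >> S is 0xfff >> 12, which is 0, so the shifted value has no bit left and the final AND is always 0. A small sketch of the example, using uint32_t to make the 32-bit width explicit (example() is an illustrative name):

```c
#include <assert.h>
#include <stdint.h>

static uint32_t foo(uint32_t x)
{
    uint32_t t = x & 0x00000fff;    /* t is shared by both operands below */

    return (t >> 12) & t;           /* (t >> 12) is always 0 */
}

static void example(void)
{
    assert(foo(0xffffffff) == 0);
    assert(foo(0x00000abc) == 0);
}
```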
2018-08-22add testcases for {LSR,SHL}(AND(x, M), S) with shared AND(x, M)Luc Van Oostenryck4-0/+66
The pattern LSR(AND(x, M), S) is already generically simplified into ((x >> S) & (M >> S)) but only if the sub-expression AND(x, M) is not shared with some other expressions because the simplification modifies it. But for some special cases the expression can be simplified even if the sub-expression is shared because the simplification doesn't need to modify this AND(x, M) part. Add the testcases for LSR and the upcoming SHL. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>