sparse/sparse-dev.git - Sparse's development tree with unstable git history

Age	Commit message (Collapse)	Author	Files	Lines
2019-09-26	shorter message for non-scalar in conditionals	Luc Van Oostenryck	3	-10/+10
	The diagnostic message is a bit long, with the not-really-informative part 'incorrect type' first and the explanation later in parentheses. Change this by using a shorter message "non-scalar type in ...". Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-26	more consistent type info in error messages	Luc Van Oostenryck	7	-39/+39
	Some error messages are displayed with auxiliary information about the concerned type(s). However, this type information is displayed in various ways: just the type, "[left/right] side has type ...", "got ...", ... Make these more consistent and simpler by just displaying types when the error message is unambiguous about the fact that the problem is a type problem (and/or make the message unambiguous when possible). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-02	constexpr: relax constexprness of constant conditionals	Luc Van Oostenryck	3	-17/+25
	Currently, sparse emits a warning when a conditional expression with a constant condition is used where an "Integer Constant Expression" is expected and only the false-side operand (which is not evaluated) is not constant. The standard is especially unclear about this situation, where the unevaluated side is non-constant. However, GCC silently accepts those as ICEs when they evaluate to a compile-time known value (in other words, when the condition and the corresponding true/false sub-expression are themselves constant). So, relax sparse to match GCC's behaviour. Reported-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-09-02	problem building sparse 0.6.0 (sparse-llvm)	Randy Dunlap	1	-1/+1
	Certain macros have to be defined in order to use the llvm DataTypes.h header file. Fixes these build errors when building sparse-llvm: CC sparse-llvm.o In file included from /usr/include/llvm-c/Types.h:17:0, from /usr/include/llvm-c/ErrorHandling.h:17, from /usr/include/llvm-c/Core.h:18, from sparse-llvm.c:6: /usr/include/llvm/Support/DataTypes.h:57:3: error: #error "Must #define __STDC_LIMIT_MACROS before #including Support/DataTypes.h" # error "Must #define __STDC_LIMIT_MACROS before #including Support/DataTypes.h" ^ /usr/include/llvm/Support/DataTypes.h:61:3: error: #error "Must #define __STDC_CONSTANT_MACROS before " "#including Support/DataTypes.h" # error "Must #define __STDC_CONSTANT_MACROS before " \ ^ This is from using llvm 3.8.0. Suggested-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
2019-09-02	cgcc: fix wrong processing of -MD & -MMD	Luc Van Oostenryck	1	-1/+1
	In commit bb1bf7485 ("cgcc: gendeps for -MM, -MD & -MMD too"), the flags -MD & -MMD were treated as -M (and -MM): inhibit calling the checker/sparse because the command is only used to generate dependencies. But while this behaviour is correct for -MM, it's not for -MD & -MMD since these flags are only used to generate the dependencies in addition to the normal processing. Fixes: bb1bf748580d1794f8da7200ba83ccfc2f2f3a8a Reported-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-04-04	.gitignore: add temporary *~ files	Ben Dooks	1	-0/+1
	Ignore any ~ files left in the repository. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-04-01	fix allowing casts of AS pointers to uintptr_t	Luc Van Oostenryck	6	-17/+61
	The patch b3daa62b5 ("also accept casts of AS pointers to uintptr_t") is bogus and allows uintptr_t as the source type instead of the target type. This was helped by a previous bug, in patch d96da358c ("stricter warning for explicit cast to ulong"), where a test for Wcast_from_as was wrongly added for the source type. Fix this by: * adding the test for uintptr_t to the target type; * removing the test for Wcast_from_as from the source type, replacing it by a test of Wcast_to_as; * clarifying and extending the testcases. So, now, casts from uintptr_t to AS pointers are also allowed. Fixes: b3daa62b53109dba78c7937b3a6a0cd7d67865d5 Fixes: d96da358cfa0432f067a4e66940765883b80ee62 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-30	man: explain role of uintptr_t & unsigned long in casts from AS pointers	Luc Van Oostenryck	1	-3/+10
	Sparse will warn on casts removing the address space of a pointer if the destination type is not uintptr_t or unsigned long. But the special role of these 2 types is not explained in the man page. So, add an explanation for them in the description of -Waddress-space. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-30	also accept casts of AS pointers to uintptr_t	Luc Van Oostenryck	2	-1/+61
	Sparse will warn on casts removing the address space of a pointer if the destination type is not unsigned long. But the type 'uintptr_t' should be more suited for this. So, also accept casts of address-space qualified pointers to uintptr_t. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-27	evaluate: externally_visible functions don't need a declaration	Jann Horn	4	-4/+19
	sparse warns for non-static functions that don't have a separate declaration. The kernel contains several such functions that are marked as __attribute__((externally_visible)) to mark that they are called from assembly code. Assembly code doesn't need a header with a declaration to call a function. Therefore, suppress the warning for functions with __attribute__((externally_visible)). Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-05	add test for evaluation of invalid assignments	Luc Van Oostenryck	2	-0/+37
	Due to the way compatible_assignment_types() handles type incompatibilities and how expressions with an invalid type are nevertheless processed by linearize_expression(), some invalid assignments return unwanted error messages (and working around them can create some others). Here are 2 relatively simple tests triggering the situation. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-03	expand: add explanation to 'conservative'	Luc Van Oostenryck	1	-0/+5
	The variable 'conservative' is used to allow testing some characteristics of an expression while inhibiting any possible side-effects like issuing a warning or marking the expression as erroneous. But this role is not immediately apparent. So, add a comment to the variable declaration. Suggested-by: Thomas Weißschuh <thomas@t-8ch.de> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-03	expand: 'conservative' must not bypass valid simplifications	Thomas Weißschuh	3	-8/+91
	During the expansion of shifts, the variable 'conservative' is used to inhibit any possible diagnostics (for example, because the needed information is whether the expression is a constant or not). However, this must not inhibit the simplification of valid shift expressions. Unfortunately, by moving the validation inside check_shift_count(), this is what was done by commit 0b73dee01 ("big-shift: move the check into check_shift_count()"). Found through a false positive VLA detected in the Linux kernel. The array size was computed through min() on a shifted constant value and sparse complained about it. Fix this by changing the logic of check_shift_count(): 1) moving the test of 'conservative' inside check_shift_count() and only issuing warnings if it is not set. 2) moving the warning part into a separate function: warn_shift_count() 3) letting check_shift_count() return whether the shift count is valid so that the simplification can be skipped if not. Fixes: 0b73dee0171a15800d0a4ae6225b602bf8961599 Signed-off-by: Thomas Weißschuh <thomas@t-8ch.de> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-03-01	Sparse v0.6.1-rc1 (tag: v0.6.1-rc1)	Luc Van Oostenryck	1	-1/+1
	Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-28	remove confusing intermediate 'where' in evaluate_assignment()	Luc Van Oostenryck	1	-2/+1
	In evaluate_assignment(), a local variable (named 'where') contains the input expression (named 'expr', like in most other functions). This is doubly confusing because: *) both variables hold the same pointer. *) the name 'where' is normally used for a string with extra information for error messages. So, remove this intermediate var and use the original 'expr' instead. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-28	display extra info for type errors in compare & conditional	Luc Van Oostenryck	2	-4/+12
	For "incompatible types in comparison expression" errors, only the kind of type difference is displayed. Displaying the types would make it easier to find the cause of the problem. The same is true for ternary conditionals. So, also display the left & right types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	cgcc: use strict & warnings	Luc Van Oostenryck	1	-0/+3
	Better to detect undeclared or uninitialized vars early, so use the 'strict' & 'warnings' pragmas. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	Merge branch 'cgcc-dumpmachine' into tip	Luc Van Oostenryck	1	-11/+39
	* cgcc: add support for x86-x32 * cgcc: favor using 'gcc -dumpmachine' to determine specifics * cgcc: simpler handling of hard-float ARM * cgcc: add pseudo-archs for ppc64be/ppc64le * cgcc: -dumpmachine should be fetched with '$ccom'
2019-02-25	cgcc: add support for x86-x32	Luc Van Oostenryck	1	-1/+3
	Detect when the target is x86-x32 and pass the appropriate flag '-mx32'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	cgcc: favor using 'gcc -dumpmachine' to determine specifics	Uwe Kleine-König	1	-5/+28
	`uname -m` returns information about the host machine but this information is useless when cgcc is used with a non-native compiler since it's information about the target machine that is needed. So, first try to determine the target machine via `gcc -dumpmachine` and fall back to `uname -m`. Note: this should fix problems with Debian builds when an armhf builder is run in an arm64 environment. Originally-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	cgcc: simpler handling of hard-float ARM	Luc Van Oostenryck	1	-8/+7
	There is an ABI for ARM with hard floats and one for soft floats (as well as one for a sort of mix between hard & soft). For hard floats, the preprocessor symbol '__ARM_PCS_VFP' needs to be defined. This is added as an additional check in the code returning the 'specs' for ARM. To facilitate some upcoming changes and code reuse here, create a pseudo-arch 'arm+hf' using '-D__ARM_PCS_VFP=1' in addition to the usual options for ARM. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	cgcc: add pseudo-archs for ppc64be/ppc64le	Luc Van Oostenryck	1	-2/+6
	Platforms whose `uname` machine is 'ppc64' or 'ppc64le' need to have their endianness set (as well as the 'ELF' version). To facilitate some future changes and code reuse here, create entries for 2 pseudo-archs 'ppc64+be' & 'ppc64+le'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	cgcc: -dumpmachine should be fetched with '$ccom'	Luc Van Oostenryck	1	-1/+1
	The variable '$ccom' is used to hold the compiler command only while '$cc' holds the compiler and its options. So, use '$ccom' to fetch '-dumpmachine' instead of '$cc'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	Merge branch 'fix-cgcc-gendeps' into tip	Luc Van Oostenryck	1	-5/+5
	* cgcc: -MF, -MQ & -MT need an argument * cgcc: gendeps for -MM, -MD & -MMD too
2019-02-25	cgcc: define __APPLE_CC__ on OSX	Luc Van Oostenryck	1	-1/+1
	It seems that some header files somehow need this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-25	evaluate: sizeof(bool) could be larger than sizeof(char)	Luc Van Oostenryck	1	-1/+1
	The C standard doesn't require that the size of a _Bool is 1, its size is implementation defined. However, in evaluate_sizeof() the assumption is made that a bool is the same size as a char. Fix this wrong assumption by using the existing bits_in_bool. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-20	cgcc: -MF, -MQ & -MT need an argument	Luc Van Oostenryck	1	-4/+4
	These flags expect an argument. So, the following element in '@ARGV' must then not be considered as an option or an input file, exactly as is done for '-o FILE'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-20	cgcc: gendeps for -MM, -MD & -MMD too	Luc Van Oostenryck	1	-1/+1
	These flags must set '$gendeps', just like a plain '-M' does, since they imply '-M'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-18	testsuite: fix bad escaping of '[' & ']'	Luc Van Oostenryck	2	-2/+2
	Fix escaping of square brackets in some test patterns. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-17	Merge branch 'branch-v0.6'	Luc Van Oostenryck	4	-11/+11
	* explain cause of 'incorrect type in conditional' * manpage: fix doc of '-Wcast-from-as'
2019-02-17	build: honor CFLAGS & friends from environment	Uwe Kleine-König	1	-6/+9
	Debian build scripts pass CFLAGS in the environment. However, this is ignored by Sparse's Makefile since 'CFLAGS' is unconditionally initialized. Fix this by initializing CFLAGS to its default value using '?='. Do the same for PKG_CONFIG, DESTDIR, BINDIR, MANDIR and CHECKER_FLAGS. Note: It's useless to try to do the same for CC, LD & AR since they're builtin variables so '?= ...' is a no-op for them (unless make is called with -R). Note: This makes sparse native builds reproducible for Debian. Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-17	build: get rid of MAN1DIR	Luc Van Oostenryck	1	-5/+8
	MAN1DIR is one of the configurable build options but there seem to be few, if any, reasons to have such an option in addition to MANDIR. So, remove this variable and simplify the install rules by using an internal-only "$(bindir)" & "$(man1dir)" to replace "$(DESTDIR)$(BINDIR)" & "$(DESTDIR)$(MANDIR)/man1". Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-17	build: always use '-Wall -Wwrite-strings'	Luc Van Oostenryck	1	-1/+1
	Currently, these options are in the configurable part of CFLAGS, like '-O2' or '-g', but since they're just warnings they can be moved to the non-optional flags. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-17	build: CHECKER is not needed, remove it	Luc Van Oostenryck	1	-2/+1
	This variable is only used for selfcheck and there is no reason for it to be configurable from the command line or the environment. So, get rid of 'CHECKER' by inlining it in the check command. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-08	fix parallel install	Luc Van Oostenryck	1	-6/+3
	The current make rules for 'install' were mixing pure declarative and procedural style. As a consequence, the binaries or the manpages could be installed before their target directory was created. Fix this by removing the rule to create these dirs and use install with the '-D' option to create them. Also remove the first prerequisites '$(INST_PROGRAMS) $(INST_MAN1)' since these are not needed (the effective install rules already depend on them) and somewhat misleading (it's not because they're first in the dependencies list that they will be created before the next ones). Spotted-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-07	redecl: add test for attribute placement in function declarators	Ramsay Jones	1	-0/+31
	Add a new test file which demonstrates some problems which can be seen on the git codebase. gcc does not complain about this file: $ gcc -Wall -c validation/function-redecl2.c $ ... but sparse does: $ sparse validation/function-redecl2.c validation/function-redecl2.c:6:5: error: symbol 'func0' redeclared with different type (originally declared at validation/function-redecl2.c:3) - different modifiers validation/function-redecl2.c:13:6: error: symbol 'func1' redeclared with different type (originally declared at validation/function-redecl2.c:11) - different modifiers validation/function-redecl2.c:21:6: error: symbol 'func2' redeclared with different type (originally declared at validation/function-redecl2.c:18) - different modifiers $ Note that func0 and func2 are essentially the same example, apart from the attribute used, to demonstrate that the issue isn't caused by the 'pure' attribute. Also, examples like func1 have occurred several times in git and, although they can be worked around (eg. See [1]), it would be preferable if this were not necessary. [1] (git) commit 3d7dd2d3b6 ("usage: add NORETURN to BUG() function definitions", 2017-05-21). Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-07	validation: Add patterns FAIL, PASS, XPASS and XFAIL to test	Uwe Kleine-König	1	-6/+9
	This simplifies finding the offending test when the build ended with KO: out of 584 tests, 527 passed, 57 failed 56 of them are known to fail Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-07	cgcc: teach cgcc about Hurd/GNU	Luc Van Oostenryck	1	-0/+3
	cgcc fails if it doesn't know about the system/OS as returned by `uname -s`. This creates a build failure for Debian since Hurd is one of their non-official 'ports' but unknown to cgcc. So, teach cgcc about 'GNU' (the OS/system name returned on Hurd) and add the few predefines used to identify it. Reported-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-07	predefs: add arch-specific predefines	Luc Van Oostenryck	2	-17/+68
	Predefined macros like '__x86_64__', '__arm__', ... are used in system headers (and surely at other places too). So, when appropriate, define the following symbols, which seem to be needed by glibc: m68k: __m68k__; mips: __mips64, __mips; ppc: __ppc64__, __powerpc, __ppc__; riscv: __riscv__, __riscv_xlen__; s390: __zarch__; sparc: __sparc_v9__, __sparcv9; x86-64: __x86_64__, __x86_64. Also, the following symbols, which were previously only defined in cgcc, are now defined in Sparse itself: i386: __i386, __i386__; sparc: __sparc, __sparc__, __arch64__, __sparc64__, __sparcv9__; s390: __s390__, __s390x__; ppc: __PPC__, __powerpc__, __PPC64__, __powerpc64__; arm: __arm__; arm64: __aarch64__. Note: these are only tested on i386, x86-64, arm, arm64, mips64 (ABI O32), ppc, ppc64 (power7), ppc64el (power8) and sparc64, most of them on a not-so-new OS version. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-04	predefs: fix for MIPS system headers needing _MIPS_SZ{INT,LONG,PTR}	Luc Van Oostenryck	1	-0/+9
	System headers (at least glibc's ones) define and use __WORDSIZE which, on most archs, is defined depending on __LP64 or __ILP32. But on MIPS, __WORDSIZE is defined depending on the value of the builtin macro _MIPS_SZPTR. So, add the predefine for _MIPS_SZPTR on MIPS. Reported-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-04	target.c: ignore -m64 on archs where int32_t is a long	Luc Van Oostenryck	19	-0/+20
	If the flag '-m64' is used on a 32-bit architecture/machine having int32_t set to 'long', then these int32_t are forced to 64-bit ... So, ignore the effect of -m64 on these archs and ignore '64-bit only' tests on them. Reported-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Uwe Kleine-König <uwe@kleine-koenig.org>
2019-02-04	lib.c: move handle_arch_m64_finalize() to init_target()	Luc Van Oostenryck	2	-41/+36
	It must be done after init_target because some archs (PPC32, mips32, ...) have int32_t set to long. These 32-bit ints would then become 64-bit. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Uwe Kleine-König <uwe@kleine-koenig.org>
2019-02-04	lib.c: move predefines out of handle_arch_m64_finalize()	Luc Van Oostenryck	1	-12/+24
	In handle_arch_m64_finalize(), some types (like size_t_ctype) are set to support the flag '-m64' and native 64-bit archs. But some predefines are also issued. As a preparatory step to fix a bug related to an inconsistency between this function and init_target(), move the predefines to predefined_macros(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Uwe Kleine-König <uwe@kleine-koenig.org>
2019-02-04	testsuite: remove unneeded -m64 from command-line	Luc Van Oostenryck	1	-1/+1
	The test was called with the flag '-m64' but doesn't need it. So, remove it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Uwe Kleine-König <uwe@kleine-koenig.org>
2019-02-01	Makefile: default to LD = CC	Uwe Kleine-König	1	-1/+1
	Usually the compiler is used as linker. Assuming that if someone wants to change the compiler the linker should be changed, too, simplify that use case by using "$(CC)" as linker instead of the hard coded "gcc". This also matches the behaviour of make when using the built-in rules of GNU Make which include: LINK.o = $(CC) $(LDFLAGS) $(TARGET_ARCH) %: %.o $(LINK.o) $^ $(LOADLIBES) $(LDLIBS) -o $@ Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-02-01	machine.h: Fix MACH_NATIVE on m68k	Uwe Kleine-König	1	-1/+1
	This fixes a failure to compile on m68k as MACH_68K is undefined. Fixes: ce50c885b8b0 ("add detection of native platform") Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2019-01-10	is_null_pointer_constant(): replace magic constant by enum	Aurelien Aptel	1	-8/+14
	Replace the constants 0, 1 & 2 returned by is_null_pointer_constant() with self-describing enums. Signed-off-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-29	explain cause of 'incorrect type in conditional'	Luc Van Oostenryck	3	-10/+10
	A conditional only makes sense on a scalar type. If not, an error is issued but the message doesn't explain the cause. Fix this by adding the cause to the error message. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-28	manpage: fix doc of '-Wcast-from-as'	Luc Van Oostenryck	1	-1/+1
	It seems that the current doc for -Wcast-from-as was cut-and-pasted but not adjusted correctly. Fix the wording. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-26	Sparse v0.6.0 (tag: v0.6.0)	Luc Van Oostenryck	1	-1/+1

2018-12-26	add TODO list.	Luc Van Oostenryck	3	-52/+99
	The Documentation directory contains a 'project ideas' document. But some of the entries there are outdated, some are questionable and some more are simply not clear about the problem or the goal. Some important entries are also missing. So, remove what needs to be removed, reformulate unclear entries and add a bunch of new things that should be done. Also, rename the file to 'TODO.md' as this expresses more clearly the intent of the document. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-22	as-name: document that identifiers are OK for address spaces	Luc Van Oostenryck	1	-3/+4
	A previous series allowed using an identifier to denote an address space but this wasn't documented. Document it now in the manpage. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-22	doc: fix list formatting	Luc Van Oostenryck	1	-2/+3
	Sphinx gives a warning on if_convert_phi()'s autodoc because of some 'unknown indentation' caused by using the wrong marker ('#' instead of '*' or '#.'). Use the right markup: '*'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-21	expression.h: update comment to include other cast types	Tycho Andersen	1	-1/+2
	This part of the union is used with other cast types as well, and also by EXPR_{SIZEOF,PTRSIZEOF,ALIGNOF}, so let's include those in the comment. Signed-off-by: Tycho Andersen <tycho@tycho.ws> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-21	remove self-assignment of base_type	Luc Van Oostenryck	1	-1/+1
	The parsing of enums contains a self-assignment to base_type. But this self-assignment doesn't show very clearly that the variable doesn't need to change, and some compilers complain about it. So, replace that self-assignment by a null statement. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-21	remove -finline-functions from CFLAGS	Luc Van Oostenryck	1	-1/+1
	By default, sparse is compiled with -finline-functions but this flag has no effect on the generated code (since gcc's defaults at -O2 already do automatic inlining). So, remove this flag. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-21	remove unused regno()	Luc Van Oostenryck	1	-8/+0
	The function regno() has been unused for a very long time and was replaced by show_pseudo(). So, remove this function. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-20	remove redundant check of _Bool bitsize	Luc Van Oostenryck	1	-1/+1
	To test if a type is a variant of _Bool it is useless to test both is_bool_type(x) and 'x->bit_size == 1' since the first implies the second. So, remove the test of the bitsize. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-20	Merge branch 'cleanup'	Luc Van Oostenryck	3	-8/+9
	* small build cleanup related to LLVM * build: only need includedir from llvm-config * build: check if sparse-llvm needs libc++ * small cleanup * remove unneeded declarations in "compat.h" * remove unused arg in add_branch() * allocate BBs only after initial checks in linearize_short_conditional()
2018-12-20	Merge branch 'show-type'	Luc Van Oostenryck	13	-63/+69
	* small improvements to show_typename()'s output: * strip trailing space * don't display '<noident>' * do not display base type's redundant specifiers * do not let string_ctype display like a base type 'string'
2018-12-19	Merge branch 'bitwise-ptr'	Luc Van Oostenryck	6	-0/+67
	* warn on casts to/from bitwise pointers
2018-12-19	allocate BBs after the guards	Luc Van Oostenryck	1	-1/+3
	In linearize_short_conditional(), the 'merge' BB is directly allocated at function entry but then some checks can directly return without ever using this BB. Move the allocation after the checks have been made. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-19	remove unused arg in add_branch()	Luc Van Oostenryck	1	-4/+4
	add_branch() has an argument for an expression but this argument is not used anymore and doesn't seem to be of any possible use. So, remove this unneeded arg. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-19	remove unneeded declarations in "compat.h"	Luc Van Oostenryck	1	-2/+0
	struct stream & struct stat are defined in this file but were only used for identical_files() which has been removed years ago. So, remove these declarations. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-18	build: check if sparse-llvm needs libc++	Luc Van Oostenryck	1	-0/+1
	The output of 'llvm-config --system-libs' is not really complete as libc++ may be needed but not reported as such by this command. So, use the output of 'llvm-config --cxxflags' to check this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-18	build: only need includedir from llvm-config	Luc Van Oostenryck	1	-1/+1
	sparse-llvm doesn't need the full output of 'llvm-config --cflags', it only needs to know where LLVM's header files can be found. So, use 'llvm-config --includedir' instead. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-18	VERSION=0.6.0-rc1	Luc Van Oostenryck	1	-1/+1
	I forgot to update the version number in the Makefile. Here it is now. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	show-parse: remove string_ctype from typenames	Luc Van Oostenryck	1	-1/+0
	Currently, a string_ctype (only used for the declaration of builtin functions) is displayed as "string", not "char *". Fix this by removing the entry for string_ctype from typenames[] which should only contain the names of base types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	show-parse: do not display base type's redundant specifiers	Luc Van Oostenryck	6	-37/+39
	In do_show_type(), builtin_typename() is used to display builtin (base) types and modifier_string() is used to display modifiers. However, most base types contain some intrinsic modifiers, the type specifiers. So, a type like 'unsigned long' is displayed as 'unsigned long [unsigned] [long]'. Fix this redundancy by not displaying the specifiers when displaying a base_type (or an enum). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	show-parse: don't display null ident in show_typename()	Luc Van Oostenryck	10	-26/+27
	Often show_typename() is used to display a type and the associated identifier is irrelevant but is displayed nevertheless. However, when the identifier is itself not present, it is still displayed as '<noident>', which is just noise and can be confusing. Fix this by displaying nothing for null identifiers in show_typename(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add a flag to warn on casts to/from bitwise pointers	Luc Van Oostenryck	5	-2/+29
	Support for 'bitwise' integers is one of sparse's main extensions. However, casts to or from pointers to bitwise types can be done without incurring any sort of warnings although such casts can be as wrong as direct casts to or from bitwise integers themselves. Add the corresponding warnings and control them by a new flag -Wbitwise-pointer (defaulting to off as it creates tens of thousands of warnings in the kernel). CC: Thiebaud Weksteen <tweek@google.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	Add testcases for bitwise cast on pointer	Thiebaud Weksteen	2	-0/+40
	since it seems that the strict type checking is not done on pointers to restricted types. Signed-off-by: Thiebaud Weksteen <tweek@google.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	Merge branch 'predefs' into tip (tag: v0.6.0-rc1)	Luc Van Oostenryck	16	-163/+415
	* add predefined macros for __INTMAX_TYPE__, __INT_MAX__, ...
2018-12-17	add predefine_min() and use it for __{WCHAR,WINT}_MIN__	Luc Van Oostenryck	1	-2/+17
	wchar_t & wint_t seem to be the only integer types needing the _MIN__ macros. Extend predefine_ctype() to handle these and use it to define __WCHAR_MIN__ & __WINT_MIN__. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefine for __CHAR_UNSIGNED__	Luc Van Oostenryck	2	-1/+9
	This macro is needed by <limits.h> to get, among other things, CHAR_MAX from __SCHAR_MAX__. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	fix the size of long double	Luc Van Oostenryck	1	-2/+23
	The odd one here is, of course i386, with its 80-bit extended floats taking 12 bytes. Idem for __m68k__. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefined macros for char{16,32}_t	Luc Van Oostenryck	1	-0/+2
	These types are supposed to be defined the same as uint_least{16,32}_t. So define them as 'ushort' and 'uint'. Note: it seems that some archs define char32_t as 'ulong' although their 'uint' is 32bit ... Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefined macros for [u]int32_t	Luc Van Oostenryck	4	-0/+25
	These are a pain. All LP64 archs use [u]int. Good. But some LP32 archs use [u]int and some others use [u]long. Some even use [u]int for some ABIs and [u]long for others (bare metal). This really needs to be target-specific to be correct. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefined macros for [u]int64_t	Luc Van Oostenryck	4	-0/+18
	All LP32 archs use [u]llong and all LP64 archs use [u]long for these, except Darwin, which seems to always use [u]llong. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefined macros for [u]int{8,16}_t	Luc Van Oostenryck	2	-0/+9
	All LP64 & LP32 use [u]char and [u]short for these ones. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefined macros for [u]intmax	Luc Van Oostenryck	4	-0/+12
	Seems to use [u]long for all LP64 archs and [u]llong for all LP32 ones (except OpenBSD, but it seems not to define the corresponding macros). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefined macros for [u]intptr	Luc Van Oostenryck	2	-0/+4
	Luckily, it seems all archs use the same types for them as for size_t & ssize_t. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	add predefined macros for wint_t	Luc Van Oostenryck	3	-0/+6
	This type seems to use 'unsigned int' on all archs but for Darwin & FreeBSD. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	use the type for predefined_max()	Luc Van Oostenryck	1	-5/+5

2018-12-17	give a type to wchar	Luc Van Oostenryck	6	-9/+37
	This allows predefined_ctype() to be used on wchar_t. Note: currently __WCHAR_TYPE__ is defined to 'int' but this is incorrect on: i386, m68k, ppc32, ... Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-17	make predefined_type_size() more generic	Luc Van Oostenryck	3	-23/+50
	This allows a single function to output the size, the type, the maximal value, ... Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-16	show-parse: strip do_show_type()'s trailing space	Luc Van Oostenryck	2	-2/+6
	It's possible that the result of do_show_type() ends with a space. Strip this unneeded space. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14	remove duplicates from gcc-attr-list.h	Luc Van Oostenryck	1	-9/+0
	gcc-attr-list.h contains a list of 'known' attributes but some of them are already defined in keyword_table[]. So, remove these duplicated attributes from the list (including '__default__' which is not duplicated but doesn't appear to be an attribute). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14	teach sparse about asm inline	Luc Van Oostenryck	3	-7/+81
	GCC's trunk now allows specifying 'inline' with asm statements. This feature has been requested by kernel devs and will most probably be used for the kernel. So, teach sparse about this syntax too. Note: for sparse, there are no semantics associated with this inline because sparse doesn't make any size-based inlining decisions. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14	add builtin_type_suffix()	Luc Van Oostenryck	2	-31/+43
	With this helper, we can easily output constants with the correct type, like '0x123' for ints, '0x123UL' for unsigned longs, ... Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
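The idea of a suffix helper can be sketched as below. This is only an illustration: the names `int_kind`, `kind_suffix()` and `format_constant()` are hypothetical, and sparse's real builtin_type_suffix() works on its internal type symbols rather than on an enum like this one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of integer types, for illustration only. */
enum int_kind { K_INT, K_UINT, K_LONG, K_ULONG, K_LLONG, K_ULLONG };

/* Map an integer kind to the C suffix that selects that type. */
static const char *kind_suffix(enum int_kind kind)
{
	static const char *const suffixes[] = { "", "U", "L", "UL", "LL", "ULL" };

	assert(kind <= K_ULLONG);
	return suffixes[kind];
}

/* Format 'value' as a hexadecimal constant carrying the matching suffix. */
static int format_constant(char *buf, size_t size,
			   unsigned long long value, enum int_kind kind)
{
	return snprintf(buf, size, "0x%llx%s", value, kind_suffix(kind));
}
```

Keeping the suffix lookup separate from the formatting means that every place emitting a predefined constant can share the same table.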
2018-12-14	use bits_mask() for predefined_max()	Luc Van Oostenryck	1	-1/+2
	Creating a bit mask using '(1 << n) - 1' is undefined if n is as big as the width of an int. Use the safe helper bits_mask() to create the mask/value for predefined_max(). Note: predefined_max() is currently correct only for signed types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
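Why '(1 << n) - 1' is unsafe, and one way to avoid the problem, can be sketched as follows. The helper names are hypothetical and this is not necessarily how sparse's bits_mask() is written; it only shows a mask construction that never shifts by the full width of the type.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical stand-in for a helper like bits_mask(): return a value with
 * the low n bits set. Shifting an all-ones value right avoids the undefined
 * '1 << n' when n equals the width of the type.
 */
static unsigned long long low_bits_mask(unsigned int n)
{
	const unsigned int width = sizeof(unsigned long long) * CHAR_BIT;

	assert(n >= 1 && n <= width);
	return ~0ULL >> (width - n);
}

/* Maximum value of a signed type that is 'bits' wide (sign bit excluded). */
static unsigned long long signed_max(unsigned int bits)
{
	assert(bits >= 2);
	return low_bits_mask(bits - 1);
}
```

As the commit notes, a maximum computed this way is only right for signed types; an unsigned type would need all of its bits in the mask.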
2018-12-14	allow optional "_T" suffix to __SIZEOF_XXX__	Luc Van Oostenryck	1	-12/+12
	This makes the handling more generic. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14	fix '__SIZE_TYPE__' for LLP64	Luc Van Oostenryck	2	-11/+7
	size_t_ctype is set to uint, ulong or ullong, depending on the architecture (ullong is only used for LLP64). However, when emitting '__SIZE_TYPE__', it's only compared to ulong or uint. Fix this by using a small helper directly using the right struct symbol * and using builtin_typename() to output the right type. This way we're guaranteed that '__SIZE_TYPE__' is kept coherent with the internal type: size_t_ctype. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14	testsuite: test predef macros on LP32/LP64/LLP64	Luc Van Oostenryck	7	-59/+70
	Now these tests should succeed and be meaningful on all archs. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14	test endianness with __BYTE_ORDER__	Luc Van Oostenryck	1	-1/+1
	The detection of the native endianness is currently done by testing if __BIG_ENDIAN__ is defined. However, not all native big endian platforms define this macro. Test the endianness with __BYTE_ORDER__. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
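A minimal sketch of such a test is shown below. `__BYTE_ORDER__` and `__ORDER_BIG_ENDIAN__` are predefined by GCC and Clang; the names `NATIVE_BIG_ENDIAN` and `runtime_big_endian()` are made up for this example and are not taken from sparse's sources.

```c
#include <assert.h>

/* Compile-time detection based on __BYTE_ORDER__ (GCC and Clang). */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define NATIVE_BIG_ENDIAN 1
#else
#define NATIVE_BIG_ENDIAN 0
#endif

/* Run-time cross-check: look at the first byte of the integer 1. */
static int runtime_big_endian(void)
{
	const unsigned int probe = 1;

	return *(const unsigned char *)&probe == 0;
}
```

On a compiler that does not predefine `__BYTE_ORDER__` the sketch silently falls back to little endian, which a real implementation would probably want to handle explicitly.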
2018-12-14	Consolidate 'machine detection' into "machine.h"	Luc Van Oostenryck	2	-21/+24
	The file "lib.c" contains some defines and has some #ifdefery to detect the data model of the native machine (LP32/LP64). Same for the native endianness. Move these into "machine.h" where the platform detection is already done. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-14	add detection of native platform	Luc Van Oostenryck	1	-0/+54
	The underlying type of most builtin types (size_t, int32_t, ...), as well as their size, the endianness and other parameters are platform dependent. The minimum is to have these parameters correct on the native machine. Use the different predefined macros to detect the native machine. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-12	Merge branch 'as-named' into tip	Luc Van Oostenryck	11	-65/+152
	* prepare to identify & display the address spaces by name
2018-12-12	as-named: warn on bad address space	Luc Van Oostenryck	4	-14/+17
	Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-12	as-name: check for multiple address spaces at parsing time	Luc Van Oostenryck	1	-1/+6
	Warn on nonsensical declarations like: int __user __iomem *ptr; These can easily be checked at parsing time. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-12	as-name: allow ident as address_space	Luc Van Oostenryck	2	-8/+43
	Currently, address space 1 is displayed as '<asn:1>' and so on. Now that address spaces can be displayed by name, the address space number should just be an implementation detail and it would make more sense to be able to 'declare' these address spaces directly by name, like: #define __user __attribute__((noderef, address_space(__user))) Since directly using the name instead of a number creates some problems internally, allow this syntax but for the moment keep the address space number and use a table to look up the number from the name. References: https://marc.info/?l=linux-sparse&m=153627490128505 Idea-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
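How such a named address space might be used in a header is sketched below, following the Linux kernel convention of enabling the annotations only under sparse (which defines `__CHECKER__`). The function `read_checked()` is purely hypothetical and only shows where the annotation and a forced cast would go; real kernel code would use a proper user-access helper.

```c
#include <assert.h>

#ifdef __CHECKER__
/* Seen only by sparse: a named address space plus 'noderef'. */
# define __user  __attribute__((noderef, address_space(__user)))
# define __force __attribute__((force))
#else
/* A regular compiler sees plain pointers. */
# define __user
# define __force
#endif

/* Hypothetical accessor: the forced cast marks the one audited dereference. */
static int read_checked(const int __user *uptr)
{
	assert(uptr != 0);
	return *(const int __force *)uptr;
}
```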
2018-12-12	as-name: use idents for address spaces	Luc Van Oostenryck	5	-42/+70
	Currently, address spaces are identified by a number and displayed as '<asn:%d>'. It would be more useful to display a name like the one used in the code: '__user', '__iomem', ... Prepare this by using an identifier instead of the AS number. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-10	Merge branch 'fix-non-const-case' into tip	Luc Van Oostenryck	2	-2/+42
	* fix linearization of non-constant switch-cases
2018-12-09	as-name: add and use show_as()	Luc Van Oostenryck	8	-26/+38
	Use a function to display the address spaces. This will allow to display a real name instead of '<asn:1>'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-09	multi-buffer for idents	Luc Van Oostenryck	1	-1/+5
	Currently, show_ident() uses a single static buffer. It thus can't be used like: printf("%s %s", show_ident(a), show_ident(b)); Fix this by using multiple buffers like done for show_pseudo() and others. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
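The multi-buffer technique can be sketched like this. `show_name()` and `NBUF` are hypothetical and much simpler than sparse's own show_ident(); the point is only that rotating through several static buffers keeps earlier results valid while later calls are made.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NBUF 4	/* number of results that may be live at the same time */

/*
 * Hypothetical formatter: each call writes into the next of NBUF static
 * buffers, so up to NBUF returned strings can be used in one expression.
 */
static const char *show_name(const char *name)
{
	static char buffers[NBUF][64];
	static unsigned int next;
	char *buf = buffers[next];

	assert(name != NULL);
	next = (next + 1) % NBUF;
	snprintf(buf, sizeof(buffers[0]), "<%s>", name);
	return buf;
}
```

With this, `printf("%s %s", show_name(a), show_name(b))` prints two distinct strings. The trade-off is that a result is overwritten after NBUF further calls, so callers must not hold on to the pointers.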
2018-12-09	Merge branch 'maintainer' into tip	Luc Van Oostenryck	1	-2/+15
	* manpage: update maintainer info * manpage: add AUTHORS section
2018-12-09	Merge branch 'dump-macros'	Luc Van Oostenryck	5	-16/+105
	* fixes for -dD * add support for -dM Luc Van Oostenryck (2): dump-macro: break the loop at TOKEN_UNTAINT dump-macro: simplify processing of whitespace Ramsay Jones (5): pre-process: suppress trailing space when dumping macros pre-process: print macros containing # and ## correctly pre-process: don't put spaces in macro parameter list pre-process: print variable argument macros correctly pre-process: add the -dM option to dump macro definitions
2018-12-09	don't allow newlines inside string literals	Luc Van Oostenryck	3	-7/+6
	Sparse allows (but warns about) a bare newline (not preceded by a backslash) inside a string. Since this is invalid C, it's probable that a terminating '"' is missing just before the newline. In this case, allowing the newline implies accepting the following characters until the next '"' is found, which in most cases creates a lot of irrelevant warnings. Change this by disallowing newlines inside strings, exactly like already done for character constants. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-08	man: update maintainer info	Luc Van Oostenryck	1	-2/+1
	Update the info about the maintainer. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-08	man: add AUTHORS section	Luc Van Oostenryck	1	-0/+5
	Add a section about the authors. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-08	man: add section about reporting bugs	Luc Van Oostenryck	1	-0/+9
	Add a small section about contributing and reporting bugs. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-08	add testcase for missing deliminator ' or "	Luc Van Oostenryck	1	-0/+18
	Add a testcase for the upcoming "Newline in string or character constant" vs. "missing delimiter" change. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-06	cgcc: use 'i386' for the arch instead of 'i86'	Luc Van Oostenryck	1	-2/+2
	cgcc can be used when cross-compiling if the target architecture is given with '-target=<arch>'. However, the name that needs to be given for the i386 arch is 'i86'. Fix this by changing the name 'i86' into 'i386'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-01	Conditionalize 'warning: non-ANSI function ...'	John Levon	7	-4/+66
	Sparse unconditionally issues warnings about non-ANSI function declarations & definitions. However, some environments have large amounts of legacy headers that are pre-ANSI, and can't easily be changed. These generate a lot of useless warnings. Fix this by using the options flags -Wstrict-prototypes & -Wold-style-definition to conditionalize these warnings. Signed-off-by: John Levon <levon@movementarian.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-01	Accept comma-separated list for function declarations.	Luc Van Oostenryck	1	-1/+1
	The declaration of a function without prototype is currently silently accepted by sparse but a warning is issued for 'old-style' declarations: ... warning: non-ANSI function declaration ... However, the difference between these two cases is made by checking if a ';' directly follows the parentheses. So: int foo(); is silently accepted, while a warning is issued for: int foo(a) int a; but also for: int foo(), bar(); This last case, while unusual, is not less ANSI than a simple 'int foo();'. It's just detected so because there is no ';' directly after the first '()'. Fix this by also using ',' to detect the end of function declarations and their ANSIness. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-01	Use -Wimplicit-int when warning about missing K&R argument types	Luc Van Oostenryck	4	-1/+22
	In legacy environments, a lot of warnings can be issued about arguments without an explicit type. Fix this by conditionalizing such warnings with the flag -Wimplicit-int, reducing the level of noise in such environments. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-12-01	fix implicit K&R argument types	Luc Van Oostenryck	2	-1/+19
	In an old-style function definition, if not explicitly specified, the type of an argument defaults to 'int'. Sparse issues an error for such arguments and leaves the type as 'incomplete'. This can then create a cascade of other warnings. Fix this by effectively giving the type 'int' to such arguments. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-29	Ignore #ident directives	John Levon	3	-0/+30
	Legacy code can be littered with the non-standard "#ident" directive; ignore it. Signed-off-by: John Levon <levon@movementarian.org> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-25	dump-macro: simplify processing of whitespace	Luc Van Oostenryck	1	-6/+3
	When dumping the macros, two special cases are needed regarding whitespace: * just before the macro body * just after the macro body. This is caused in part by some misunderstanding about the role of TOKEN_UNTAINT. Happily, things can be simplified, which is what is done in this patch. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	dump-macro: break the loop at TOKEN_UNTAINT	Luc Van Oostenryck	1	-3/+1
	Since EOFs are preceded by an UNTAINT, and these UNTAINTs need to be special-cased to not print a whitespace and are otherwise ignored, it makes the handling of the whitespace slightly simpler to stop the loop not at EOF but at these UNTAINTs. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	pre-process: add the -dM option to dump macro definitions	Ramsay Jones	4	-9/+75
	The current -dD option outputs the macro definitions, in addition to the pre-processed text. In contrast, the -dM option outputs only the macro definitions. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	pre-process: print variable argument macros correctly	Ramsay Jones	2	-1/+15
	The dump_macros() function fails to correctly output the definition of macros that have a variable argument list. For example, the following macros: #define unlocks(...) annotate(unlock_func(__VA_ARGS__)) #define apply(x,...) x(__VA_ARGS__) are output like so: #define unlocks(__VA_ARGS__) annotate(unlock_func(__VA_ARGS__)) #define apply(x,__VA_ARGS__) x(__VA_ARGS__) Add the code necessary to print the ellipsis in the argument list to the dump_macros() function and add the above macros to the 'dump-macros.c' test file. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	pre-process: don't put spaces in macro parameter list	Ramsay Jones	2	-2/+2
	The dump_macros() function adds a ", " separator between the arguments of a function-like macro. Using a simple "," separator, which aligns the output with gcc, leads to one less distraction when comparing the output of sparse and gcc. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	pre-process: print macros containing # and ## correctly	Ramsay Jones	2	-0/+12
	The dump_macro() function fails to correctly output the definitions of macros that contain the string operator '#', the concatenation operator '##' and any macro parameter in the definition token list. For example, the following macros: #define STRING(x) #x #define CONCAT(x,y) x ## y are output like so: #define STRING(x) unhandled token type '21' #define CONCAT(x, y) unhandled token type '22' unhandled token type '23' unhandled token type '22' Add the code necessary to handle those token types to the dump_macros() function and add the above macros to the 'dump-macros.c' test file. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	cgcc: teach about '-x c'	Luc Van Oostenryck	1	-0/+8
	Currently, cgcc only checks input files if their names end with '.c' or if given as stdout. Other files are explicitly ignored. This generally corresponds to what is wanted but GCC allows arbitrary input files if the option '-x <language>' is given. Some projects use this mechanism, for example to use the C pre-processor on non-C files. This fails when cgcc is used as wrapper around sparse + GCC. Fix this by teaching cgcc about the '-x c' option. Reported-by: Antonio Ospite <ao2@ao2.it> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	cgcc: teach about '-o <file>'	Luc Van Oostenryck	1	-0/+8
	The option '-o', in itself, doesn't need to be handled specially by cgcc but this option takes an argument and option arguments need to be ignored by cgcc (otherwise they can be interpreted and filtered-out by cgcc). Avoid potential problems with -o's argument by simply ignoring it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	cgcc: add support to ignore argument(s) of options	Luc Van Oostenryck	1	-2/+9
	cgcc only does minimal processing and filtering of its command line and most options are simply forwarded to sparse and gcc. However, if one of the ignored options takes an argument that matches one of the non-ignored options, this argument will be processed as an option with undesirable effects. Allow options to specify the number of arguments they're taking and avoid any processing or filtering of these arguments while still forwarding them to sparse and gcc. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	teach sparse about '-x <language>'	Luc Van Oostenryck	1	-0/+8
	Sparse ignores unknown options but then doesn't know if they have arguments or not. So, if '-x c' is given as option, sparse will try to process a file named 'c'. Fix this by teaching sparse about the option '-x' and its argument. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	teach sparse about '-o <file>'	Luc Van Oostenryck	3	-0/+17
	Sparse knows about the '-o' option, parses it but does nothing with it. Change this by redirecting stdout to <file> unless <file> is '-' since sparse (the lib) outputs to stdout by default. But ignore this flag when sparse is used purely as a checker since in this case it's not supposed to output to stdout and would create an undesired empty file, possibly erasing the result of the compiler if one is used before sparse. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-24	pre-process: suppress trailing space when dumping macros	Ramsay Jones	1	-0/+2
	The dump_macro() function outputs a trailing space character for every macro. This makes comparing the '-dD' output from sparse to the similar output from gcc somewhat annoying. The space character arises from the presence of an <untaint> token directly before the <eof> token in the macro definition token list. The <untaint> token seems to always have the 'whitespace' flag set, which results in the output of a space in order to separate it from the current token. In order to suppress the unwanted space character, check if the next token is an <untaint> token and, if so, don't print the space. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-23	constant: add -Wconstant-suffix warning	Ramsay Jones	7	-2/+47
	Currently, when used on the kernel, sparse issues a bunch of warnings like: warning: constant 0x100000000 is so big it is long These warnings are issued when there is a discrepancy between the type as indicated by the suffix (or the absence of a suffix) and the real type as selected by the type suffix and the value of the constant. Since there is nothing incorrect with this discrepancy (no bits are lost), these warnings are more annoying than useful. So, make them depend on a new warning flag -Wconstant-suffix and make it off by default. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-22	sparsei: add the --[no-]jit options	Ramsay Jones	2	-2/+20
	On the cygwin platform, a 'sparsei' backend test, which uses the llvm 'lli' tool, fails due to a dynamic linking error: $ make check ... TEST sum from 1 to n (backend/sum.c) error: actual output text does not match expected output text. error: see backend/sum.c.output.* for further investigation. --- backend/sum.c.output.expected 2018-06-03 18:27:11.502760500 +0100 +++ backend/sum.c.output.got 2018-06-03 18:27:11.307670000 +0100 @@ -1,2 +0,0 @@ -15 -5050 error: actual error text does not match expected error text. error: see backend/sum.c.error.* for further investigation. --- backend/sum.c.error.expected 2018-06-03 18:27:11.562997400 +0100 +++ backend/sum.c.error.got 2018-06-03 18:27:11.481038800 +0100 @@ -0,0 +1 @@ +LLVM ERROR: Program used external function 'printf' which could not be resolved! error: Actual exit value does not match the expected one. error: expected 0, got 1. ... Out of 288 tests, 277 passed, 11 failed (10 of them are known to fail) make: *** [Makefile:236: check] Error 1 $ Note the 'LLVM ERROR' about the 'printf' external function which could not be resolved (linked). On Linux, it seems that the 'lli' tool (JIT compiler) can resolve the 'printf' symbol, with the help of the dynamic linker, since the tool itself is linked to the (dynamic) C library. On windows (hence also on cygwin), the 'lli' tool fails to resolve the external symbol, since it is not exported from the '.exe'. The 'lli' tool can be used as an interpreter, so that the JIT compiler is disabled, which also side-steps this external symbol linking problem. Add the --[no-]jit options to the 'sparsei' tool, which in turn uses (or not) the '-force-interpreter' option to 'lli'. In order to fix the failing test-case, simply pass the '--no-jit' option to 'sparsei'. Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-22	sparsec: use a compatible exception model on cygwin	Ramsay Jones	1	-1/+11
	On the cygwin platform, some of the backend tests fail due to the 'llc' tool using win32 'structured exception' assembler directives, which the platform assembler 'as' does not accept: $ make check ... TEST 'hello, world' code generation (backend/hello.c) error: actual error text does not match expected error text. error: see backend/hello.c.error.* for further investigation. --- backend/hello.c.error.expected 2018-06-03 17:14:30.550972400 +0100 +++ backend/hello.c.error.got 2018-06-03 17:14:30.478731900 +0100 @@ -0,0 +1,6 @@ +{standard input}: Assembler messages: +{standard input}:14: Error: invalid register for .seh_pushreg +{standard input}:14: Error: junk at end of line, first unrecognized character is `5' +{standard input}:20: Error: invalid register for .seh_setframe +{standard input}:20: Error: missing separator +mv: cannot stat '/tmp/tmp.oTA6mS.o': No such file or directory ... Out of 288 tests, 275 passed, 13 failed (10 of them are known to fail) make: *** [Makefile:236: check] Error 1 $ The exception model used by 'llc' can be changed from the command line to be one of 'default', 'dwarf', 'sjlj', 'arm' or 'wineh'. In this case the default is 'wineh' (windows exception handling). The 'sjlj' model is the older (setjmp,longjmp) stack-based model, which is no longer used on Linux. The newer 'dwarf' model uses a 'zero cost' table based method. (The 'arm' model is not relevant here). For more information, see [1]. After some experiments, using small test programs compiled with gcc and g++, comparing the output of tools like 'nm' and 'objdump', it seems that cygwin binutils are employing the 'sjlj' model. (Using the 'dwarf' model on 'llc' also works on simple programs). In order to fix the test failures, add an '-exception-model=sjlj' option to the 'llc' invocation when executing sparsec on the cygwin platform. 
[1] http://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-20	fix expansion of function designator	Luc Van Oostenryck	2	-1/+2
	The expression corresponding to the function pointer of an indirect call can be arbitrarily complex. For example, it can contain a statement expression or another call, possibly inlined. These expressions must be expanded to ensure that sub-expressions involving 'sizeof()' or other operators taking a type as argument (like __builtin_types_compatible_p()) are no longer present (because these expressions always evaluate to a compile-time constant and so are not expected and thus not handled at linearization time). However, this is not currently enforced, possibly causing some failures during linearization with warnings like: warning: unknown expression (4 0) (which corresponds to EXPR_TYPE). Fix this, during the expansion of function calls, by also expanding the corresponding designator. References: https://lore.kernel.org/lkml/1542623503-3755-1-git-send-email-yamada.masahiro@socionext.com/ Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com> Tested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2018-11-20	add testcase for missing function designator expansion	Luc Van Oostenryck	1	-0/+23
	Add a testcase showing that function designators are not expanded. References: https://lore.kernel.org/lkml/1542623503-3755-1-git-send-email-yamada.masahi> Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-11-02	tokenize: check show_string() for NULL pointer	Ben Dooks	1	-1/+1
	Fix an issue where show_string() is passed a NULL pointer by accident. This only happened during debugging, but would be a useful addition to the checks in this function. Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-30	ptrlist: add ptr_list_nth_entry()	Luc Van Oostenryck	2	-0/+23
	Usually ptr lists are accessed iteratively via the FOR/END macros but in a few cases we may need to access a given element in a list, for example when accessing a given argument of a function. Create a helper doing that instead of open coding it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
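An indexed lookup over a list stored in chunks can be sketched as follows. The `struct chunk` layout, `CHUNK_SIZE` and `chunk_list_nth()` are simplified assumptions made for this example (a NULL-terminated, singly linked list); sparse's real struct ptr_list and ptr_list_nth_entry() are organised differently.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_SIZE 4

/* Simplified, hypothetical chunked list: each node holds a few entries. */
struct chunk {
	struct chunk *next;
	unsigned int nr;		/* entries in use in this chunk */
	void *entries[CHUNK_SIZE];
};

/* Return the idx-th entry (0-based), or NULL if idx is out of range. */
static void *chunk_list_nth(const struct chunk *head, unsigned int idx)
{
	const struct chunk *c;

	for (c = head; c; c = c->next) {
		assert(c->nr <= CHUNK_SIZE);
		if (idx < c->nr)
			return c->entries[idx];
		idx -= c->nr;	/* skip this whole chunk */
	}
	return NULL;
}
```

Skipping whole chunks keeps the lookup proportional to the number of chunks rather than the number of entries, which is enough for occasional uses such as fetching a given function argument.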
2018-10-26	__attribute__((fallthrough)) can't simply be ignored	Luc Van Oostenryck	1	-1/+0
	Currently, sparse has the attribute 'fallthrough' in its list of known-but-ignored attributes (like almost all of GCC's attributes). But this attribute is a statement attribute, something which is currently not supported (it's interpreted as the attribute of an empty declaration which doesn't play well with -Wdeclaration-after-statement). Fix this by no longer considering this attribute as known. This will allow __has_attribute(fallthrough) to correctly play its role. Note: a more complete solution will need to parse this statement attribute and maybe be able to make the distinction between statement attributes, label attributes (also currently ignored), and type, function & variable attributes. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	Merge branch 'fix-enum-type' into tip	Luc Van Oostenryck	16	-64/+405

2018-10-05	enum: more specific error message for empty enum	Luc Van Oostenryck	2	-2/+2
	Currently, the error message issued for an empty enum is "bad enum definition". This is exactly the same message used when one of the enumerators is invalid. Fix this by using a specific error message. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: default to unsigned	Luc Van Oostenryck	4	-10/+9
	GCC uses an unsigned type for enum's basetype unless one of the enumerators is negative. Using 'int' for plain simple enumerators and then using the same rule as for integer constants (int -> unsigned int -> long -> ...) should be more natural but doing so creates useless warnings when using sparse on the kernel code. So, do the same as GCC: * use the smallest type that fits all enumerators, * use at least int or unsigned int, * use a signed type only if one of the enumerators is negative. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: keep enumerators as int if they fit	Luc Van Oostenryck	1	-0/+37
	In Standard C, enumerators have type 'int' and an unspecified base type. OTOH, GCC (and thus sparse) allows any integer type (and sparse also more or less allows bitwise types). After the enum's declaration is parsed, the enumerators are converted to the underlying type. Also, GCC (and thus sparse) uses an unsigned type unless one of the enumerators has a negative value. This is a problem, though, because when comparing simple integers with simple enumerators like: enum e { OK, ONE, TWO }; the integers will unexpectedly be promoted to unsigned. GCC avoids these promotions by not converting the enumerators that fit in an int. For GCC compatibility, do the same: do not convert enumerators that fit in an int. Note: this is somewhat hackish but without this some enum usages in the kernel give useless warnings with sparse. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
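The effect of the unwanted promotion can be shown with a small sketch. The function names are hypothetical, and the cast in the second function only emulates what would happen if the enumerator itself had been converted to an unsigned type.

```c
#include <assert.h>

enum e { OK, ONE, TWO };

/* With the enumerator kept as 'int', the comparison is a signed one. */
static int below_ok(int status)
{
	assert(sizeof(OK) == sizeof(int));	/* enumerators are 'int' in C */
	return status < OK;
}

/*
 * Emulation of an enumerator converted to 'unsigned int': 'status' is then
 * converted to unsigned as well, so a negative value becomes a huge one.
 */
static int below_ok_if_unsigned(int status)
{
	return status < (unsigned int)OK;
}
```

With a negative status such as -1, the first function reports that the value is below OK while the second one never does, which is the kind of surprise the commit avoids.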
2018-10-05	enum: rewrite bound checking	Luc Van Oostenryck	1	-50/+34
	To determine the base type of enums it is necessary to keep track of the range that the enumerators can take. However, this tracking seems to be more complex than needed. It's now simplified like this: -) a single 'struct range' keeps track of the biggest positive value and the smallest negative one (if any) -) the bound checking in itself is then quite similar to what was already done: ) adjust the bit size if the type is negative ) check that the positive bound is in range ) if the type is unsigned -> check that the negative bound is 0 ) if the type is signed -> check that the negative bound is in range Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: warn on bad enums	Luc Van Oostenryck	1	-2/+4
	During the parsing of enum definitions, if some invalid type combination is reached, the base type is forced to 'bad_ctype'. Good. However, this is done without a warning and it's only when the enum is used that some sign of a problem may appear, with no hint toward the true cause. Fix this by issuing a warning when the base type becomes invalid but only if the type of the enumerator is itself not already set to 'bad_ctype' (since in this case a more specific warning has already been issued). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: warn when mixing different restricted types	Luc Van Oostenryck	2	-0/+25
	Sparse supports enum initializers with bitwise types but this makes sense only if they are all the same type. Add a check and issue a warning if an enum is initialized with different restricted types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: only warn (once) when mixing bitwiseness	Luc Van Oostenryck	2	-0/+36
	As an extension to the standard C types, sparse supports bitwise types (also called 'restricted') which should in no circumstances mix with other types. In the kernel, some enums are defined with such bitwise types as initializers; the goal being to have slightly more strict enums. While the semantics of such enums are not very clear, using a mix of bitwise and non-bitwise initializers completely defeats the desired stricter typing. Attract some attention to such mixed initialization by issuing a single warning for each such declaration. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: use the values to determine the base type	Luc Van Oostenryck	1	-14/+1
	The C standard requires that the type of enum constants is 'int'. So a constant not representable as an int can't be used as the initializer of an enum. GCC extends this by using, instead of 'int', the smallest type that can represent all the values of the enum: first int, then unsigned int, long, ... For sparse, we need to take into account the bitwise integers. However, currently sparse doesn't do this based on the values but on the type, so if one of the initializers is, for example, 1L, the base type is forced to a size at least as wide as 'long'. Fix this by removing the call to bigger_enum_type(). Note that this is essentially a revert of commit "51d3e7239: Make sure we keep enum values in a sufficiently large type for parsing" which had the remark: "Make sure that the intermediate stages keep the intermediate types big enough to cover the full range." But this is not needed as during parsing, the values are kept at full width and with their original type & value. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
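As an illustration of the value-based choice, the two enums below are hypothetical examples, not taken from sparse's testsuite. The sizes mentioned in the comments are what GCC is expected to pick with its default options; other compilers or options such as -fshort-enums may choose differently.

```c
/* 1L has type 'long', but its value fits in an 'int':
 * a value-based choice keeps the enum int-sized. */
enum small_values {
	SMALL_A = 1L,
	SMALL_B = 2,
};

/* GCC extension (before C23): 1LL << 40 is not representable as an
 * 'int', so a wider base type is needed to hold it. */
enum big_values {
	BIG_A = 1,
	BIG_B = 1LL << 40,
};
```

With the type-based logic described above, the first enum would instead have been widened to at least 'long' just because of the 'L' suffix.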
2018-10-05	enum: use the smallest type that fit	Luc Van Oostenryck	4	-5/+0
	The C standard requires that the type of enum constants is 'int' and lets the enum base/compatible type be implementation defined. For this base type, instead of 'int', GCC uses the smallest type that can represent all the values of the enum (int, unsigned int, long, ...) Sparse has the same logic as GCC but if all the initializers have the same type, this type is used instead. This is a sensible choice but often gives different results than GCC. To stay more compatible with GCC, always use the same logic and thus only keep the common type as base type for restricted types. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: fix cast_enum_list()	Luc Van Oostenryck	2	-1/+1
	Sparse wants all of an enum's enumerators to have the same type. This is done by first determining the common type and then calling cast_enum_list() which uses cast_value() on each member to cast them to the common type. However, cast_value() doesn't create a new expression and doesn't change the ctype of the target: the target expression is supposed to already have the right type and it's just the value that is transferred from the source expression and size adjusted. It seems that in cast_enum_list() this has been overlooked with the result that the value is correctly adjusted but keeps its original type. Fix this by updating, for each member, the desired type. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: add testcase for base & enumerator type	Luc Van Oostenryck	8	-0/+227
	Add various testcases for checking enum's base & enumerator type. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: add testcase for type of enum members	Luc Van Oostenryck	1	-0/+15
	Members of an enum should all have the same type but this isn't so currently. Add a testcase for it and mark it as 'known-to-fail'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	enum: fix UB when rshifting by full width	Luc Van Oostenryck	2	-3/+15
	Shifting by an amount greater than or equal to the width of the type is Undefined Behaviour. In the present case, when type_is_ok() is called with a type as wide as an ullong (64 bits here), the bounds are shifted by 64 which is UB and at execution (on x86) the value is simply unchanged (since the shift is done with the amount modulo 64). This, of course, doesn't give the expected result and as a consequence valid enums can have an invalid base type (bad_ctype). Fix this by doing the shift with a small helper which returns 0 if the amount is equal to the maximum width. NB. Doing the shift in two steps could also be a solution, as could some clever trick, but since this code is in no way critical performance-wise, the solution here has the merit of being very explicit. Fixes: b598c1d75a9c455c85a894172329941300fcfb9f Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
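A helper of the kind described above could look like the following minimal sketch. The name rshift_full() is hypothetical and the code is not copied from sparse; it only shows the explicit special case for a full-width shift.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: a right shift that stays defined when the
 * amount equals the full width of the operand. */
static unsigned long long rshift_full(unsigned long long val, unsigned int amount)
{
	const unsigned int width = sizeof(val) * CHAR_BIT;

	assert(amount <= width);	/* larger amounts are not handled here */
	if (amount == width)
		return 0;		/* every bit has been shifted out */
	return val >> amount;
}
```

The explicit test costs one comparison per call, which is acceptable since, as the message notes, this code is not performance critical.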
2018-10-05	enum: add testcase for UB in oversized shift	Luc Van Oostenryck	1	-0/+17
	type_is_ok(), used to calculate the base type of enums, has a bug related to UB when doing a full width rshift. Add a testcase for this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-10-05	doc: is_int_type() returns false for SYM_RESTRICTs	Luc Van Oostenryck	1	-0/+5
	It isn't at all obvious that is_int_type() returns false for restricted/bitwise types. It's even quite counter-intuitive. So document this. Note: Fortunately, there aren't a lot of callers and the main callers are all in parse.c and are OK. OTOH, the callers in sparse-llvm.c are wrong and need another helper. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-26	print address space number for cast-from-AS warnings	Vincenzo Frascino	3	-4/+64
	This patch prints the address space number when a warning "cast removes address space of expression" is triggered. This makes it easier to discriminate between different address spaces. Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-10	ssa: relax what can be promoted	Luc Van Oostenryck	2	-4/+2
	During SSA conversion, it is checked what can be promoted and what cannot. Obviously, ints, longs, pointers can be promoted, enums and bitfields can too. Complications arise with unions and structs. Currently unions are only accepted if they contain integers of the same size. For structs it's even more complicated because we want to convert simple bitfields. What should be accepted is structs containing either: * a single scalar * only bitfields and only if the total size is < long However the test was slightly more strict than that: it didn't allow a struct with a total size bigger than a long. As a consequence, on ILP32, a struct containing a single double wasn't promoted. Fix this by moving the test about the total size so that it is done only if some bitfield is present. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
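The acceptance rule for structs can be sketched as a small predicate. Everything here is hypothetical: the 'struct member' description and the function name are invented for illustration, and the size limit is taken as "not bigger than a long", one possible reading of the message above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of a struct member. */
struct member {
	bool is_bitfield;
	bool is_scalar;
	unsigned int bits;		/* size of the member, in bits */
};

/* Sketch: accept a single scalar, or only bitfields whose total
 * size is not bigger than a long. */
static bool struct_is_promotable(const struct member *m, size_t n,
				 unsigned int long_bits)
{
	unsigned int total = 0;
	bool has_bitfield = false;

	for (size_t i = 0; i < n; i++) {
		if (m[i].is_bitfield)
			has_bitfield = true;
		else if (n != 1 || !m[i].is_scalar)
			return false;	/* neither a lone scalar nor a bitfield */
		total += m[i].bits;
	}
	/* the size limit only applies when bitfields are present */
	if (has_bitfield && total > long_bits)
		return false;
	return n > 0;
}
```

With the size test applied only to bitfields, a struct holding a single 64-bit double is accepted even when long is 32 bits wide.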
2018-09-10	test: make 32-bit version of failed test	Luc Van Oostenryck	2	-2/+31
	The test mem2reg/init-local.c succeeds on 64-bit but fails on 32-bit. Duplicate the test, one with -m64 and the other with -m32 and mark this one as known-to-fail. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-10	test: use integers of different sizes, even on 32-bit	Luc Van Oostenryck	1	-2/+2
	The test optim/cse-size fails on 32-bit because it needs two integers of different size but uses int & long. These two types have indeed different sizes on 64-bit (LP64) but not on 32-bit (ILP32). Fix this by using short & int. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-10	test: make test Waddress-space-strict succeed on 32-bit	Luc Van Oostenryck	1	-26/+7
	The test Waddress-space-strict made assumptions about the relative size of integers & pointers. Since this test was crafted on a 64-bit machine, the test was running fine for LP64 but failed on a 32-bit machine (or anything using ILP32, like using the -m32 option). However, since the test is about conversion of address-spaces, using integers of different size adds no value, and indeed brings problems. Fix this by limiting the conversions to a single integer type, the one with the same size as pointers on ILP32 & LP64: long. Reported-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-08	fix linearization of non-constant switch-cases	Luc Van Oostenryck	2	-3/+5
	The linearization of switches & cases makes the assumption that the expressions for the cases are constants (EXPR_VALUE). So, the corresponding values are dereferenced without checks. However, if the code uses a non-constant case, this dereference produces a random value, probably one corresponding to some pointers belonging to the real type of the expression. Fix this by checking during linearization the constness of the expression and ignore the non-constant ones. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-08	add testcase for non-constant switch-case	Luc Van Oostenryck	1	-0/+38
	Switches with non-constant cases are currently linearized using as value the bit pattern present in the expression, creating more or less random multijmps. Add a basic testcase to catch this. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	Merge branch 'rem-trivial-phi' into tip	Luc Van Oostenryck	3	-19/+66
	* remove more complex phi-nodes
2018-09-06	Merge branches 'missing-return' and 'fix-logical-phi' into tip	Luc Van Oostenryck	17	-143/+373
	* fix linearization/SSA when missing a return * fix linearization/SSA of (nested) logical expressions
2018-09-06	fix linearization of nested logical expr	Luc Van Oostenryck	5	-111/+121
	The linearization of nested logical expressions is not correct regarding the phi-nodes and their phi-sources. For example, code like: extern int a(void); int b(void); int c(void); static int foo(void) { return (a() && b()) && c(); } gives (optimized) IR like: foo: phisrc.32 %phi1 <- $0 call.32 %r1 <- a cbr %r1, .L4, .L3 .L4: call.32 %r3 <- b cbr %r3, .L2, .L3 .L2: call.32 %r5 <- c setne.32 %r7 <- %r5, $0 phisrc.32 %phi2 <- %r7 br .L3 .L3: phi.32 %r8 <- %phi2, %phi1 ret.32 %r8 The problem can already be seen by the fact that the phi-node in L3 has 2 operands while L3 has 3 parents. There is no phi-value for L4. The code is OK for non-nested logical expressions: linearize_cond_branch() takes the success/failure BB as argument, generates the code for those branches and there is a phi-node for each of them. However, with nested logical expressions, one of the BB will be shared between the inner and the outer expression. The phisrc will 'cover' one of the BB but only one of them. The solution is to add the phi-sources not before but after and add one for each of the parent BB. This way, it can be guaranteed that each parent BB has its phisrc, whatever the complexity of the sub- expressions. With this change, the generated IR becomes: foo: call.32 %r2 <- a phisrc.32 %phi1 <- $0 cbr %r2, .L4, .L3 .L4: call.32 %r4 <- b phisrc.32 %phi2 <- $0 cbr %r4, .L2, .L3 .L2: call.32 %r6 <- c setne.32 %r8 <- %r6, $0 phisrc.32 %phi3 <- %r8 br .L3 .L3: phi.32 %r1 <- %phi1, %phi2, %phi3 ret.32 %r1 Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	add tests for nested logical expr	Luc Van Oostenryck	1	-0/+49
	Nested logical expressions are not correctly linearized. Add a test for all possible combinations of 2 logical operators. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	fix ordering of phi-node operand	Luc Van Oostenryck	3	-7/+6
	The linearization of logical '&&' creates a phi-node with its operands in the wrong order relative to the parent BBs. Switch the order of the operands for logical '&&'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	add testcases for wrong ordering in phi-nodes	Luc Van Oostenryck	4	-0/+55
	In valid SSA there is a 1-to-1 correspondence between each operand of a phi-node and the parent BBs. However, currently, this is not always respected. Add testcases for the known problems. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	return nothing only in void functions	Luc Van Oostenryck	2	-4/+3
	Currently, the code for the return is only generated if the return expression effectively has a type or a value with a size greater than 0. But this means that a non-void function with an error in its return expression is considered as a void function as far as the generated IR is concerned, making things incoherent. Fix this by using the declared type instead of the type of the return expression. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	use a temp var for the return type/symbol	Luc Van Oostenryck	1	-1/+2
	No functional changes, just preparing for the next patch. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	use UNDEF for missing returns	Luc Van Oostenryck	6	-5/+8
	If a return statement is missing in the last block, the generated IR will be invalid because the number of operands in the exit phi-node will not match the number of parent BBs. Detect this situation and insert an UNDEF for the missing value. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	extract add_return() from linearize_return()	Luc Van Oostenryck	1	-11/+16
	This will allow reusing this code to generate valid IR in the case of a missing return statement. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	the return BB is never terminated	Luc Van Oostenryck	1	-8/+7
	After having called linearize_fn_statement(), it is tested if the active BB is terminated or not. But by construction, the active BB at this point is never terminated. So, remove the unneeded test. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	there is always an active BB after linearize_fn_statement()	Luc Van Oostenryck	1	-2/+2
	After having called linearize_fn_statement() the active BB is tested but at this point there is always an active BB as linearize_fn_statement() always creates one. So, remove this unneeded test. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	specialize linearize_compound_statement()	Luc Van Oostenryck	1	-10/+16
	linearize_compound_statement() contains code that is only needed for the body of a function (including an already inlined function). To make things conceptually clearer and to facilitate some incoming changes, remove this conditional part from linearize_compound_statement(), used for all compound statements, and move it into a new linearize_fn_statement(), used only for function's body statements, where it is unconditional. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-06	topasm: top-level asm is special	Luc Van Oostenryck	3	-3/+7
	Top-level ASM statements are parsed as fake anonymous functions. Obviously, they have little in common with functions (for example, they don't have a return type) and mixing the two makes things more complicated than needed (for example, to detect a top-level ASM, we had to check that the corresponding symbol (name) had a null ident). Avoid potential problems by special casing them and returning early in linearize_fn(). As a consequence, they no longer have an OP_ENTRY as first instruction and can be detected by testing ep->entry. Note: It would be more logical to catch them even earlier, in linearize_symbol() but they also need an entrypoint and an active BB so that we can generate the single statement. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-05	use a temp var for function's upper-level statement	Luc Van Oostenryck	1	-2/+3
	No functional changes, just preparing for the next patches. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-05	add testcases for missing return in last block	Luc Van Oostenryck	6	-0/+97
	In this case the phi-node created for the return value ends up with a missing operand, violating the semantic of the phi-node: map one value with each predecessor. Add testcases for these missing returns. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-05	add linearization as a pass	Luc Van Oostenryck	2	-0/+2
	If linearize_symbol() is called, it's meaningless to disable linearization. As such, linearization was not really considered as a pass. More exactly, there is no '-flinearize...' flag since -flinearize-enable & -flinearize-disable are both meaningless. However, -flinearize=last can be very useful for testing. So, recognize 'linearize' as a pass and leave -flinearize-{en,dis}able without effect. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	stricter warning for explicit cast to ulong	Luc Van Oostenryck	5	-2/+70
	sparse issues a warning when user pointers are cast to integer types except to unsigned longs which are explicitly allowed. However it may happen that we would like to also be warned on casts to unsigned long. Fix this by adding a new warning flag: -Wcast-from-as (to mirror -Wcast-to-as) which extends -Waddress-space to all casts that remove an address space attribute (without using __force). References: https://lore.kernel.org/lkml/20180628102741.vk6vphfinlj3lvhv@armageddon.cambridge.arm.com/ Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
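The kind of cast concerned can be illustrated with the sketch below. The __user and __force macros follow the usual kernel-style definitions and are assumptions here, as are the function names; with a regular compiler the annotations expand to nothing, so both functions behave identically at run time.

```c
/* Kernel-style annotations, only meaningful when sparse is checking. */
#ifdef __CHECKER__
# define __user		__attribute__((noderef, address_space(1)))
# define __force	__attribute__((force))
#else
# define __user
# define __force
#endif

/* Plain cast to unsigned long: silently allowed by -Waddress-space,
 * but expected to be reported once -Wcast-from-as is given. */
static unsigned long addr_plain(void __user *p)
{
	return (unsigned long)p;
}

/* Same cast with __force: the address space is dropped on purpose,
 * so no warning is expected. */
static unsigned long addr_forced(void __user *p)
{
	return (__force unsigned long)p;
}
```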
2018-09-01	Merge branch 'dead-switch' into tip	Luc Van Oostenryck	4	-8/+83
	* fix linearization of unreachable switch + label
2018-09-01	Merge branch 'has-attribute' into tip	Luc Van Oostenryck	5	-10/+85
	* add support for __has_attribute()
2018-09-01	trivial-phi: remove more complex trivial phi-nodes	Luc Van Oostenryck	2	-3/+17
	In a set of related phi-nodes and phi-sources if all phi-sources but one correspond to the target of one of the phi-sources, then no phi-node is needed and all %phis can be replaced by the unique source. For example, code like: int test(void); int foo(int a) { while (test()) a ^= 0; return a; } used to produce an IR with a phi-node for 'a', like: foo: phisrc.32 %phi2(a) <- %arg1 br .L4 .L4: phi.32 %r7(a) <- %phi2(a), %phi3(a) call.32 %r1 <- test cbr %r1, .L2, .L5 .L2: phisrc.32 %phi3(a) <- %r7(a) br .L4 .L5: ret.32 %r7(a) but since 'a ^= 0' is a no-op, the value of 'a' is in fact never modified. This can be seen in the phi-node where its second operand (%phi3) is the same as its target (%r7). So the only possible value for 'a' is the one from the first operand, its initial value (%arg1). Once this trivial phi-node is removed, the IR is the expected: foo: br .L4 .L4: call.32 %r1 <- test cbr %r1, .L4, .L5 .L5: ret.32 %arg1 Removing these trivial phi-nodes will usually trigger other simplifications, especially those concerning the CFG. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
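The core of the test can be sketched as below. This is a much simplified illustration and not sparse's code: 'struct pseudo' is a hypothetical stand-in, the sources are given as already dereferenced values, and the recursion through related phi-nodes is left out. It returns the unique value when the phi-node is trivial and NULL otherwise.

```c
#include <stddef.h>

/* Hypothetical, much simplified stand-in for sparse's pseudos. */
struct pseudo {
	int id;
};

/*
 * Return the unique value a phi-node can take, or NULL if the node is
 * not trivial. Sources equal to the phi-node's own target are ignored:
 * they only feed the value back into itself.
 */
static struct pseudo *trivial_phi_sketch(struct pseudo *target,
					 struct pseudo **srcs, size_t n)
{
	struct pseudo *unique = NULL;

	for (size_t i = 0; i < n; i++) {
		struct pseudo *src = srcs[i];

		if (src == target)
			continue;	/* self-reference, no new value */
		if (!unique)
			unique = src;
		else if (src != unique)
			return NULL;	/* two distinct values: not trivial */
	}
	return unique;
}
```

Applied to the example above, the sources of %r7 are %arg1 and %r7 itself, so the sketch returns %arg1.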
2018-09-01	trivial-phi: directly return the unique value	Luc Van Oostenryck	1	-17/+9
	In trivial_phi(), the fact that the phi-node is trivial or not is returned as an int and, if trivial, the unique value is returned via the pointer given as first argument. But these two results can easily be combined in a single one by returning the unique value if trivial and NULL otherwise. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	trivial-phi: use a temp var for the real source	Luc Van Oostenryck	1	-2/+7
	By design, all operands of a phi-node are defined by an OP_PHISRC. So, this phi-source needs to be dereferenced to get the real source. Since this value is used in several tests, use a temporary variable for it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	trivial-phi: early return	Luc Van Oostenryck	1	-1/+1
	Once it has been detected that not all values are the same, nothing can change this fact. So, the function trivial_phi() can return its result as soon as this condition is met. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	trivial-phi: extract trivial_phi() from clean_up_phi()	Luc Van Oostenryck	1	-3/+22
	This will allow us, at a later step, to recursively test the operands. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	trivial-phi: make clean_up_phi() more sequential	Luc Van Oostenryck	1	-4/+5
	Reorganize clean_up_phi() so that the tests in the loop are more sequential. It's functionally identical but will help to detect other, less trivial, trivial phi-nodes. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	trivial-phi: add testcase for unneeded trivial phi-nodes	Luc Van Oostenryck	1	-0/+15
	Trivial phi-nodes are phi-nodes having a unique possible outcome. So, there is nothing to join and the phi-node target can be replaced by the unique value. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	move DEF_OPCODE() to header file	Luc Van Oostenryck	2	-7/+8
	as it will be needed before it was defined in simplify.c and can be useful in other files too. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	fix linearization of unreachable switch (with reachable label).	Luc Van Oostenryck	2	-6/+6
	An unreachable/inactive switch statement is currently not linearized. That's nice because it avoids creating useless instructions. However, the body of the statement can contain a label which can be reachable. If so, the resulting IR will contain a branch to a nonexistent BB. Bad. For example, code like: int foo(int a) { goto label; switch(a) { default: label: break; } return 0; } (which is just a complicated way to write: int foo(int a) { return 0; }) is linearized as: foo: br .L1 Fix this by linearizing the statement even if not active. Note: it seems that none of the other statements are discarded if inactive. Good. OTOH, statement expressions can also contain (reachable) labels and thus would need the same fix (which will need much more work). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	add testcase for unreachable label in switch	Luc Van Oostenryck	1	-0/+20
	or more exactly, an unreachable switch statement but containing a reachable label. This is valid code but is currently wrongly linearized. So, add a testcase for it. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	ir-validate: validate return value	Luc Van Oostenryck	1	-0/+15
	A valid non-void function should not return VOID. VOID can only be returned if no return statements have been issued. Note: even if the expression is erroneous, and thus VOID, this returned value would be via a phi-node. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	ir-validate: ignore dead phis	Luc Van Oostenryck	1	-0/+3
	Dead phi-nodes should be eliminated early or, even better, never emitted. For the moment, ignore them during IR validation since they make the tests fail despite being no-ops. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	ir-validate: add validation branch to dead BB	Luc Van Oostenryck	2	-3/+40
	All branches must target an existing BB. Validate that it is the case for BR, CBR & SWITCH (COMPUTEDGOTO is left aside for the moment). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	has-attr: add support for __has_attribute()	Luc Van Oostenryck	4	-10/+26
	Sparse has support for a subset of GCC's large collection of attributes. It's not easy to know which versions support this or that attribute. However, since GCC5 there is a good solution to this problem: the magic macro __has_attribute(<name>) which evaluates to 1 if <name> is an attribute known to the compiler and 0 otherwise. Add support for this __has_attribute() macro by extending the already existing support for __has_builtin(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
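A typical use of this macro, sketched here with hypothetical macro and function names, is to fall back gracefully when an attribute or the macro itself is unavailable:

```c
/* Tools without __has_attribute() are treated as knowing no attribute. */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Use the attribute only when the tool processing this file knows it. */
#if __has_attribute(unused)
# define MAYBE_UNUSED __attribute__((unused))
# define HAVE_ATTR_UNUSED 1
#else
# define MAYBE_UNUSED
# define HAVE_ATTR_UNUSED 0
#endif

/* The annotation silences "defined but not used" warnings where supported. */
MAYBE_UNUSED static int answer(void)
{
	return 42;
}
```

The same source can then be processed by sparse, GCC or any other compiler without relying on version checks.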
2018-09-01	has-attr: add __designated_init__ & transparent_union	Luc Van Oostenryck	1	-0/+2
	Attributes can be used with the plain keyword or squeezed between a pair of double underscores. For some reason, 'designated_init' was not allowed with its underscores and '__transparent_union__' wasn't without them. So, allow '__designated_init__' & 'transparent_union'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	has-attr: move 'mode' next to '__mode__'	Luc Van Oostenryck	1	-1/+1
	In the list of keywords '__mode__' was just before the entries for modes but 'mode' was lost in the middle of some other attributes. Move 'mode' just before '__mode__'. Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-09-01	has-attr: add testcase for __has_attribute()	Luc Van Oostenryck	1	-0/+57
	Add a testcase for the incoming support of __has_attribute(). Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
2018-08-31	Merge branch 'opcode' into tip	Luc Van Oostenryck	10	-183/+177
	* consolidate instruction's properties into an opcode table
2018-08-30	Merge branch 'volatile-bitfield' and 'mode-pointer' into tip	Luc Van Oostenryck	8	-19/+77
	* fix: do not optimize away accesses to volatile bitfields * support mode(__pointer__) and mode(__byte__)