Abstract
We present a systematic representation of C++, called IPR, for complete semantic analysis and semantics-based program transformations. We describe the ideas and design principles that shaped the IPR. In particular, we describe how general type-based unification is key to minimal compact representation, fast type-safe traversal, and scalability. For example, the representation of a fairly typical non-trivial C++ program in GCC 3.4.2 was 32 times larger than its IPR representation; this led to significant improvements to GCC. IPR is general enough to handle real-world programs involving many translation units, archaic programming styles, and generic programming using C++0x extensions that affect the type system. The difficult issue of how to represent irregular (ad hoc) features in a systematic (non-ad hoc) manner is among the key contributions of this paper. The IPR data structure can represent all of C++ with just 157 simple node types; to compare the ISO C++ grammar has over 700 productions. The IPR is used for a variety of program analysis and transformation tasks, such as visualization, loop simplification, and concept extraction. Finally, we report impacts of this work on existing C++ compilers.
Similar content being viewed by others
References
Amarasinghe, S., Anderson, J., Lam, M., Tseng, C.-W.: An overview of the SUIF compiler for scalable parallel machines. In: Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, CA (1995)
Bagge, O.: CodeBoost: A framework for transforming C++ programs. Master’s thesis, University of Bergen, Bergen (2003)
Bagge, O., Kalleberg, K., Haveraaen, M., Visser, E.: Design of the CodeBoost transformation system for domain-specific optimisation of C++ programs. In: Binkley, D., Tonella, P. (eds.) Third International Workshop on Source Code Analysis and Manipulation (SCAM 2003), Amsterdam, The Netherlands, pp. 65–75. IEEE Computer Society Press, USA (2003)
Bessey A., Block K., Chelf B., Chou A., Fulton B., Hallem S., Henri-Gros C., Kamsky A., McPeak S., Engler D.: A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM 53(2), 66–75 (2010)
clang: a C language family frontend for LLVM. http://clang.llvm.org/
Coplien J.O.: Curiously recurring template patterns. C++ Rep. 7(2), 24–27 (1995)
Dos Reis, G., Stroustrup, B.: The Pivot. http://parasol.tamu.edu/pivot
Dos Reis, G., Stroustrup, B.: A formalism for C++. Technical Report N1885=05-0145, ISO/IEC SC22/JTC1/WG21 (2005)
Dos Reis, G., Stroustrup, B.: Specifying C++ concepts. In: Conference Record of POPL ’06: The 33th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Charleston, SC, USA, pp. 295–308 (2006)
Ellis M.A., Stroustrup B.: The Annotated C++ Reference Manual. Addison-Wesley, Reading (1990)
Ershov A.P.: On programming of arithmetic operations. Commun. ACM 1, 3–6 (1958)
Gamma E., Helm R., Johson R., Vlissides J.: Design Patterns. Addison-Wesley, Reading (1994)
GNU Compiler Collection. http://gcc.gnu.org/
Gregor, D., Järvi, J., Siek, J., Stroustrup, B., Dos Reis, G., Lumsdaine, A.: Concepts: linguistic support for generic programming in C++. In: OOPSLA ’06: Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-Oriented Programming Languages, Systems, and Applications, pp. 291–310. ACM Press, New York (2006)
International Organization for Standards. International Standard ISO/IEC 14882. Programming Languages—C++, 2nd edn (2003)
International Organization for Standards. ISO/IEC PDTR 18015. Technical Report on C++ Performance (2003)
ISO/IEC JTC1/SC22/WG21. Programming languages C++. Technical report, ISO (2010). http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3092.pdf
Järvi, J., Stroustrup, B., Dos Reis, G.: Decltype and auto (revision 4). http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2004/n1705.pdf. September 2004. ISO/IEC JTC1/SC22/WG21 no. 1705
Kalleberg, K.: User-configurable, high-level transformations with CodeBoost. Master’s thesis, University of Bergen, Bergen (2003)
Lattner, C., Adve, V.: LLVM: a compilation framework for lifelong program analysis and transformation. In: CGO ’04: Proceedings of the International Symposium on Code Generation and Optimization, Washington, DC, USA, p. 75. IEEE Computer Society, USA (2004)
Lee, S.-I., Johnson, T.A., Eigenmann, R.: Cetus—an extensible compiler infrastructure for source-to-source transformation. In: Proceedings of the 16th International Workshop on Languages and Compilers for Parallel Computing (LCPC), pp. 539–553 (2003)
Mitchell, J.C.: Type Systems for Programming Languages, pp. 365–458. Elsevier, Amsterdam (1990)
Necula, G.C., McPeak, S., Rahul, S.P., Weimer, W.: CIL: intermediate language and tools for analysis and tranformations of C programs. In: Proceedings of the 11th International Conference on Compiler Construction. Lecture Notes in Computer Science, vol. 2304, pp. 219–228. Springer, Berlin (2002). http://manju.cs.berkeley.edu/cil/
Pierkelbauer, P., Dechev, D., Stroustrup, B.: Source code rejuvenation is not refactoring. In: 36th International Conference on Current Trends in Theory and Practice of Computer Science (2010)
Pirkelbauer, P.: Programming language evolution and source code rejuvenation. PhD thesis, Texas A&M University (2010)
Pirkelbauer, P., Dechev, D., Stroustrup, B.: Support for the evolution of C++ generic functions. In: Software Language Engineering. Lecture Notes in Computer Science, vol. 6563, pp. 123–142. Springer, Berlin (2010)
Schordan, M., Quinlan, D.: A source-to-source architecture for user-defined optimizations. In: Proceeding of Joint Modular Languages Conference (JMLC’03). Lecture Notes in Computer Science, vol. 2789, pp. 214–223. Springer, Berlin (2003)
Schupp, S., Gregor, D., Musser, D., Liu, S.-M.: User-extensible simplification—type-based optimizer generators. In: Wihlem, R. (ed.) International Conference on Compiler Construction Lecture Notes in Computer Science. Springer, Berlin (2001)
Schupp S., Gregor D., Musser D., Liu S.-M.: Semantic and behavioural library transformations. Inf. Softw. Technol. 44(13), 797–810 (2002)
Stroustrup, B.: C++ applications. http://www.research.att.com/~bs/applications.html
Stroustrup, B.: A History of C++: 1979–1991. In: Proceedings of ACM Conference on History of Programming Languages (HOPL-2) (1993)
Stroustrup, B.: Evolving a language in and for the real world: C++ 1991–2006. In: ACM HOPL-III, San Diego, CA. ACM Press, New York (2007)
Stroustrup, B.: What is C++0x? CVu 4, 8–13; 5, 21–25 (2009)
The Edison Design Group. http://www.edg.com/
Wagner, L.A.: Traversal, case analysis, and lowering for C++ program analysis. Master’s thesis, Texas A&M University (2009)
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Dos Reis, G., Stroustrup, B. A Principled, Complete, and Efficient Representation of C++. Math.Comput.Sci. 5, 335–356 (2011). https://doi.org/10.1007/s11786-011-0094-1
Received:
Revised:
Accepted:
Published:
Issue date:
DOI: https://doi.org/10.1007/s11786-011-0094-1

