This is a getting-started tutorial for developing GCC. It is a starting point, not complete documentation for everything about GCC. This guide will teach you how to build GCC, make some simple modifications, run the testsuite, and compare two testsuite runs.

List of GCC courses in the series

Welcome to Getting Started Developing GCC

GCC can look big, clunky and sometimes even overwhelming, but it does not need to be. Yes, GCC has a long history (almost 40 years), and that is not a bad thing. This guide is designed to get you started working on GCC's optimizations and back-ends. It can also serve as a starting point for working on the front-ends, though it goes into more detail on the optimizations and back-ends and how to do things there.

This guide assumes a few things:

Even though this guide uses some git commands, it is up to the reader to gain experience with git separately. The guide will, however, cover how to submit a patch from start to finish and how to configure git to send patches via email using a Gmail account. In the near future we will support using a forge.

language GCC is written in

Most of GCC is written in C++14, though some front-ends are written in the language they implement, e.g. the Ada front-end is written in Ada and the D front-end is written in D. Many parts of GCC are written in a C style rather than idiomatic C++; this is due to GCC's long history, and only parts of GCC have moved over to using C++.

GCC also uses a few specialized languages; the simplifier (match-and-simplify) and the machine descriptions (.md) are examples of this. These will be covered in PLACEHOLDER and PLACEHOLDER. GCC uses a precise garbage collector (GC) for many of its memory allocations. Don't worry about the GC for now; it will be covered in PLACEHOLDER, and for the most part you don't need to think about it.

Basics

Hopefully this helps you get started developing GCC and submitting patches to the wider community.

Manuals

GCC has a few manuals that are good references, but since GCC is a large project they can feel overwhelming. The specific subpages will be referenced below, so don't worry about the manuals just yet.

Getting started with GCC

GCC can be built in a few different ways, which alone can be overwhelming. Here is a simplified way to build GCC so you can get started developing.

This guide assumes you are on an x86_64 or aarch64 (Armv8) GNU/Linux distro (like Fedora, Ubuntu, Gentoo, RHEL, CentOS).

aarch64 macOS is not currently supported by GCC (this will change; testing GCC there is harder than on Linux). x86_64 Windows is supported, but getting an environment set up there is not easy. Aarch64 Windows (Windows on Arm) is partially supported, but as with x86_64 Windows it is harder to get a working setup.

Installing the prerequisites

Below is a summary, taken from the GCC prerequisites page, of what is required to build GCC on the majority of distros.

Building GCC from source depends on some packages:

Building GCC from git requires a few more packages than building from a GCC release.

Testing GCC requires a few packages that are not normally installed.
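As a rough way to check whether the usual tools are present before configuring, the sketch below reports which commands are missing from PATH. The tool list is an assumption based on GCC's prerequisites documentation rather than something this guide spells out, and the helper name missing_tools is made up for illustration. Libraries such as GMP, MPFR and MPC are not commands, so they are not checked here.

```shell
# Print each named command that cannot be found on PATH, one per line.
# missing_tools is a hypothetical helper name used only in this sketch.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool list: a C++ compiler and make for any build, flex for
# building from git, and runtest (DejaGnu) plus expect for the testsuite.
missing_tools g++ make flex runtest expect
```

An empty result suggests the listed commands are all available; anything printed is worth installing through your distro's package manager before going further.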

Getting GCC sources

Downloading GCC sources

First get yourself familiar with Git, as it will be used to download the sources and to create and send patches (via email; more on that later [this will change]).

Download the sources via Git.

$ git clone https://gcc.gnu.org/git/gcc

Source structure

Source structure: GCC is structured in what might seem an overwhelming way, but here is a simplified view:

| toplevel directory | description |
| --- | --- |
| configure* | configure script |
| config* | scripts used by configure scripts including m4 |
| Makefile* | Makefile is generated |
| gcc | compiler sources are located including the testsuite |
| libcpp | library of the C/C++ preprocessor |
| libstdc++-v3 | the C++ runtime library (target library) |
| libgcc | generic runtime for the target |
| contrib | useful scripts, see below |
| libiberty | "portability" library |
| lto-plugin | the linker plugin that is used with LTO |

The contrib directory contains scripts which might be useful on your journey with the GCC sources, including a code whitespace format checker.

Note that the libiberty directory is the precursor to what is now gnulib, but GCC does not use gnulib just yet. libiberty also contains a few things that gnulib does not: a demangler and a simple object reader. Note that libgcc includes miscellaneous runtime functions: 128-bit divide, the unwinder for [C++] exceptions, etc.

GCC directory

GCC directory: The gcc directory contains several subdirectories and the main sources of the compiler. Here we list just a few to get you started; the internals manual lists more.

| file | description |
| --- | --- |
| configure*, config.gcc | configure files |
| Makefile.in | Makefile |
| match.pd | match and simplify patterns |
| gimple*, tree* | Gimple and Generic related files |
| c-family | C/C++ common part of the FEs |
| c | C front-end |
| cp | C++ front-end |

Building

GCC can be built in a few different ways.

We will describe the native bootstrap later on, since it is used when testing a change. Here we describe the simplest kind of build, one without bootstrap. This is the fastest way of building and is the best one to use while developing a change.

NOTE: a bootstrap is normally required for a change (the exception is a change for a cross-only target).

From the directory in which you cloned the GCC sources, the following will do the build:

$ CPUS=$(grep -c '^processor' /proc/cpuinfo)
$ cd gcc
$ mkdir objdir-min
$ cd objdir-min
$ ../configure --prefix=${HOME}/mybuild --disable-bootstrap --enable-languages=c,c++
$ make -j$CPUS all

Note the prefix here is not used unless you do a make install, and for these builds installing is not needed.

This will build GCC, including the compiler and the target libraries. Instead of all you could use all-gcc to build just the compiler. The build can fail if you didn't install the prerequisites.

After installing the prerequisites, start over by deleting the objdir-min directory and running the above commands again.

You don't need to use -j$CPUS but GCC builds faster if using all of the available cores.
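If /proc/cpuinfo is not available or you prefer not to parse it, the job count can be picked another way. This is a minimal sketch, assuming either nproc (GNU coreutils) or getconf is installed; the cpu_count helper name is made up for illustration.

```shell
# Pick a job count: prefer nproc, fall back to getconf, then to 1.
# cpu_count is a hypothetical helper name used only in this sketch.
cpu_count() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  elif getconf _NPROCESSORS_ONLN >/dev/null 2>&1; then
    getconf _NPROCESSORS_ONLN
  else
    echo 1
  fi
}

CPUS=$(cpu_count)
# Show the command that would be run rather than starting a build here.
echo "make -j$CPUS all"
```

On a machine that is short on memory it can be worth using a smaller number than the core count, since each parallel compile job needs its own memory.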

This enables just the C and C++ front-ends (and the fake LTO front-end). You can enable more languages if you want; this set simply gives the fastest and easiest build.

GMP/MPFR/MPC download (if needed)

If you can't find the GMP, MPFR or libmpc development packages, you can use the ./contrib/download_prerequisites script to download the sources for the versions of GMP/MPFR/MPC which are known to work with this version of GCC. That is:

$ cd gcc
$ ./contrib/download_prerequisites

This is also a nice way to link statically against these libraries when building cross compilers that you want to distribute. The script also downloads ISL, which is used by the Graphite loop framework (which could use folks developing it).

Running xgcc, cc1 and cc1plus

Now you have a build of GCC. Let's play around with it. Before we start, let's explain a few of the programs that were generated and will be installed.

Both cc1 and cc1plus are statically linked against the middle-end and the back-end of GCC, and also against the preprocessor.

The driver program's sources are in gcc.cc. It contains a specialized language (specs, as in specifications) which is interpreted at runtime; the specs describe how to invoke cc1/cc1plus, the assembler and the linker. Normally you don't need to change the specs; one case where you do is when adding an option that requires a runtime library (the sanitizers, for example).

Also, for 99% of uses of GCC, invoking cc1 and cc1plus directly is not recommended, since the driver adds some target-specific options and may even handle some options itself, like -march=native. When debugging GCC, however, invoking cc1 and cc1plus directly is normal.

With a simple t.c file say:

int f(void) { return 0; }

You can invoke cc1 directly (cc1plus is similar):

$ ./cc1 t.c

If you need to invoke the driver (xgcc/xg++), make sure you use the -B option to point to the current directory, like so:

$ ./xgcc -B. t.c -v

The -B option is to specify where to find the executables (e.g. cc1 and cc1plus). The -v option is to have the driver print out how it invoked cc1.

cc1 will output a few things but you can make it quiet by using the -quiet option.

cc1 accepts most (but not all) options that gcc normally gets. Some options like -S are not passed down to cc1 but rather control how the driver gcc is run.
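The invocations above can be sketched as a small script. It assumes it is run from the gcc subdirectory of the build tree (objdir-min/gcc), where cc1 and xgcc live; the CC1 and XGCC variables and the output file names are illustrative, and each step is skipped if the program is not found.

```shell
# Paths are assumptions: run this from objdir-min/gcc, or override CC1/XGCC.
CC1=${CC1:-./cc1}
XGCC=${XGCC:-./xgcc}

# The test input from this section.
printf 'int f(void) { return 0; }\n' > t.c

# cc1 turns C into assembly; -quiet suppresses its progress output.
if [ -x "$CC1" ]; then
  "$CC1" -quiet t.c -o t.s
fi

# The driver handles -S itself; -B. tells it to look for cc1 in the
# current directory instead of an installed location.
if [ -x "$XGCC" ]; then
  "$XGCC" -B. -S t.c -o t-driver.s
fi
```

Comparing t.s with t-driver.s is one way to see the effect of any extra options the driver passes to cc1.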

Making a simple change and incremental builds

GCC's build system is set up to allow incremental builds in many cases, so most of the time you can just invoke make again without doing a make clean beforehand. While inside the objdir-min/gcc directory, do a touch ../../gcc/tree.h and then invoke make -j$CPUS cc1 to rebuild cc1. The build will take a while because tree.h is included in many places; it is where most of the AST/IR tree structures are defined for the front-ends and the gimple optimizers.

Doing a full bootstrap/test

Now that you have made a simple change, let's talk about doing a bootstrap and testing.

Running a C/C++ specific testcase again

Comparing the testsuite results
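As a rough illustration of comparing two runs, the sketch below lists failures that appear in a second summary file but not in the first. It assumes the usual DejaGnu .sum layout of one "STATUS: test name" line per test; the two sample files and the new_failures helper are made up for this example and do not come from a real run.

```shell
# Hypothetical sample .sum files standing in for two testsuite runs.
tmp=$(mktemp -d)
cat > "$tmp/before.sum" <<'EOF'
PASS: gcc.dg/a.c (test for excess errors)
FAIL: gcc.dg/b.c execution test
PASS: gcc.dg/c.c execution test
EOF
cat > "$tmp/after.sum" <<'EOF'
PASS: gcc.dg/a.c (test for excess errors)
PASS: gcc.dg/b.c execution test
FAIL: gcc.dg/c.c execution test
EOF

# Print the FAIL lines present in the second file but not in the first.
# new_failures is a hypothetical helper name used only in this sketch.
new_failures() {
  grep '^FAIL:' "$1" | sort > "$tmp/old.fail"
  grep '^FAIL:' "$2" | sort > "$tmp/new.fail"
  comm -13 "$tmp/old.fail" "$tmp/new.fail"
}

new_failures "$tmp/before.sum" "$tmp/after.sum"
```

The contrib directory also has a compare_tests script meant for this job, which is likely the better choice for real runs.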

GCC Community

This is a summary of Community

email lists

IRC/matrix

GCConIRC

Others

Submitting the patch to community
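For the email route, a minimal sketch of the git send-email settings for a Gmail account might look like the following. The host, port and encryption values are the ones Google documents for its SMTP service, the address is a placeholder, and the settings are written to a scratch file so that nothing in your real configuration changes. Gmail generally expects an app password rather than your normal account password.

```shell
# Write to a scratch config file so the sketch does not touch ~/.gitconfig;
# replace "--file $cfg" with --global to apply the settings for real.
cfg=$(mktemp)
git config --file "$cfg" sendemail.smtpServer smtp.gmail.com
git config --file "$cfg" sendemail.smtpServerPort 587
git config --file "$cfg" sendemail.smtpEncryption tls
git config --file "$cfg" sendemail.smtpUser you@gmail.com   # placeholder address

# Read one value back to confirm it was stored.
git config --file "$cfg" --get sendemail.smtpServer
```

With settings like these in place, git send-email can mail the output of git format-patch to the patches mailing list.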

What next?

Other Tutorials, HOWTOs

Dealing with the source code

Structure Of GCC

None: GCC101 (last edited 2025-11-18 00:12:40 by AndrewPinski)