# Linux Minidump Code Lab

[TOC]

## About Minidumps

Minidump is a file format for storing parts of a program's state for later
inspection. [Microsoft's
documentation](https://docs.microsoft.com/en-us/windows/desktop/api/minidumpapiset/)
defines the format, though the [Rust
documentation](https://docs.rs/minidump/latest/minidump/format/index.html)
is sometimes easier to navigate. The minidump implementations and tools used by
Chrome are
[Breakpad](https://chromium.googlesource.com/breakpad/breakpad/+/refs/heads/main/README.md)
and
[Crashpad](https://chromium.googlesource.com/crashpad/crashpad/+/main/README.md).
However, the tools of interest here are from the Breakpad project.

## Create a Minidump

When Chrome crashes, it writes out a minidump file. The minidump file is written
under the application product directory. On Linux this is
`<XDG_CONFIG_HOME>/<app-name>/Crash Reports`. The default for `XDG_CONFIG_HOME`
is `~/.config`. Common `<app-name>`s are `chromium`, `google-chrome`,
`google-chrome-beta`, and `google-chrome-unstable`. A typical example is
`~/.config/google-chrome/Crash Reports`. When a minidump is uploaded, it is
moved between the `new`, `pending`, and `completed` subdirectories. The minidump
file is named something like `<uuid>.dmp`. If the minidump is uploaded to the
crash reporting system, the `<uuid>.meta` file will contain the crash report id.
Those with access can find the uploaded report at `go/crash/<report-id>`, where
the minidump file will be available with a name like
`upload_file_minidump-<report-id>.dmp`.

To create a minidump, you can use a local build of Chromium or a release
version of Chrome. Run the browser with the environment variable
`CHROME_HEADLESS=1`, which enables crash dumping but prevents crash dumps from
being uploaded and deleted. For example, run
`$ env CHROME_HEADLESS=1 ./out/debug/chrome-wrapper` or
`$ env CHROME_HEADLESS=1 /opt/google/chrome/google-chrome`. Navigate to
`chrome://crash` to trigger a crash in the renderer process, or reproduce your
current crash bug. A crash dump file should appear in the `Crash Reports`
directory.

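As a minimal sketch of the directory rule described above, the following Python
computes the `Crash Reports` directory and lists any `.dmp` files beneath it.
The function names `crash_reports_dir` and `find_minidumps` are hypothetical
helpers introduced for illustration, and the sketch assumes that an unset or
empty `XDG_CONFIG_HOME` falls back to `~/.config`.

```python
import os
from pathlib import Path


def crash_reports_dir(app_name, env=None):
    """Return <XDG_CONFIG_HOME>/<app-name>/Crash Reports for the given environment."""
    env = os.environ if env is None else env
    config_home = env.get("XDG_CONFIG_HOME")
    if not config_home:
        # Assumption: fall back to the default location, ~/.config.
        home = env.get("HOME") or str(Path.home())
        config_home = str(Path(home) / ".config")
    return Path(config_home) / app_name / "Crash Reports"


def find_minidumps(reports_dir):
    """List .dmp files in the directory and its subdirectories, sorted by path."""
    return sorted(Path(reports_dir).rglob("*.dmp"))
```

Calling `find_minidumps(crash_reports_dir("google-chrome"))` after triggering a
crash should then include the new dump, whichever of the `new`, `pending`, or
`completed` subdirectories it currently sits in.
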
## Inspect the Minidump

To get an idea about what is in a minidump file, install the Okteta hex editor
and add the [Minidump Structure
Definition](https://github.com/bungeman/structures/tree/main/okteta-minidump).
Open the minidump previously created and explore the information it contains.

One quirk to notice is that a thread context appears in two places: the
`ThreadListStream` contains `MINIDUMP_THREAD`s, each of which contains a
`MINIDUMP_THREAD_CONTEXT`, and the `ExceptionStream` also contains a
`MINIDUMP_THREAD_CONTEXT`. The thread list contains the thread contexts as they
existed when the crash reporter was running. The exception's thread context is
the state of the crashing thread at the time that it crashed, which is generally
the most interesting thread context. When using the Breakpad tools for Linux
(like `minidump_stackwalk` and `minidump-2-core`), the thread context from the
exception record is used in place of the thread context associated with the
corresponding thread.

Each `MINIDUMP_THREAD` contains a `StackMemoryRva`, which is a reference to a
copy of the stack on that thread at the time the crash handler was running.
Parsing a stack usefully requires additional debug information.
`minidump_stackwalk` or a debugger may be used to parse the stack memory to
create a usable trace.

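To see which streams a dump contains without a hex editor, the header and
stream directory can be read directly. The sketch below assumes the
`MINIDUMP_HEADER` and `MINIDUMP_DIRECTORY` layouts from the format
documentation linked earlier (a 32-byte little-endian header starting with the
signature `MDMP`, followed at `StreamDirectoryRva` by 12-byte directory
entries). The name `list_streams` is a hypothetical helper, and only a few
well-known stream types are given names.

```python
import struct

# A few values from the MINIDUMP_STREAM_TYPE enumeration; others print as numbers.
STREAM_NAMES = {
    3: "ThreadListStream",
    4: "ModuleListStream",
    5: "MemoryListStream",
    6: "ExceptionStream",
    7: "SystemInfoStream",
}

# MINIDUMP_HEADER: Signature, Version, NumberOfStreams, StreamDirectoryRva,
# CheckSum, TimeDateStamp, Flags (32 bytes, little-endian).
HEADER = struct.Struct("<4sIIIIIQ")
# MINIDUMP_DIRECTORY: StreamType, DataSize, Rva.
DIRECTORY = struct.Struct("<III")


def list_streams(data):
    """Return (stream_name, data_size, rva) for each stream directory entry."""
    signature, _version, count, dir_rva, _checksum, _timestamp, _flags = (
        HEADER.unpack_from(data, 0)
    )
    if signature != b"MDMP":
        raise ValueError("not a minidump: bad signature %r" % signature)
    streams = []
    for i in range(count):
        stream_type, size, rva = DIRECTORY.unpack_from(
            data, dir_rva + i * DIRECTORY.size
        )
        name = STREAM_NAMES.get(stream_type, "type %d" % stream_type)
        streams.append((name, size, rva))
    return streams
```

Passing the bytes of a dump, for example
`list_streams(open("mini.dmp", "rb").read())`, should show both a
`ThreadListStream` and an `ExceptionStream` for a crash dump, matching the
quirk described above.
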
## Get the Tools

From a Chromium checkout, run `ninja -C out/release minidump-2-core
minidump_stackwalk dump_syms`. From a [Breakpad
checkout](https://chromium.googlesource.com/breakpad/breakpad/), run `make`. It
can be useful to use Breakpad directly on machines where one does not already
have a Chromium checkout.

When working at this level, one will also want `readelf` and `objdump`, which
are available from most distributions.

## Get Executables and Symbols

In addition to the minidump, you will need the exact executables of Chromium or
Chrome which produced the minidump and those executables' symbols. If the
minidump was created locally, you already have the executables. Symbols for
Google Chrome's official builds are available from
`https://edgedl.me.gvt1.com/chrome/linux/symbols/google-chrome-debug-info-linux64-${VERSION}.zip`
where `${VERSION}` is any version of Google Chrome that has recently been served
to Stable, Beta, or Unstable (Dev) channels on Linux, like `114.0.5696.0`. Those
with access can find both executables and symbols for unreleased builds at
`go/chrome-symbols`.

For code outside of Chrome (like when the crash is happening in a shared
library), symbols for the files of interest must be found separately. If the
minidump was created locally, install the symbol packages from your
distribution. If not, you will need to track down the exact symbol files, which
can be an interesting exercise. For some distributions, the
[debuginfod](https://sourceware.org/elfutils/Debuginfod.html) system can be
quite helpful.

To ensure the correct binaries and debug symbols are used, the minidump contains
the build-id for each loaded module in the `ModuleListStream` in the
`CvRecordRva`'s `Signature`. This build-id is matched against a note
section of type `NT_GNU_BUILD_ID`, usually named `.note.gnu.build-id`, in the
executable and symbol files. This note can be inspected with `readelf -n
<file>`, like `readelf -n chrome` or `readelf -n chrome.debug`, and looking for
the `.note.gnu.build-id` section. `readelf` reports the `Build ID` as the flat
bytes in the note, but Breakpad binaries like `minidump_stackwalk` and
`dump_syms` will report and expect this truncated to a formatted Type 2 GUID
(without dashes). This means `readelf` will output a `<build-id>` like
`33221100554477668899AABBCCDDEEFFXXXXXXXX` but Breakpad binaries will expect and
report this as a `<build-uuid>` of `00112233445566778899AABBCCDDEEFF`.

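The conversion from `<build-id>` to `<build-uuid>` can be sketched in a few
lines of Python. This is an illustration inferred from the example values
above, not code taken from Breakpad: it keeps the first 16 bytes and
byte-swaps the first three GUID fields (4, 2, and 2 bytes). The function name
`build_id_to_build_uuid` is hypothetical, and the sketch rejects build-ids
shorter than 16 bytes rather than guessing how the tools treat them.

```python
def build_id_to_build_uuid(build_id_hex):
    """Convert a readelf-style build-id hex string to a Breakpad-style build-uuid."""
    raw = bytes.fromhex(build_id_hex)
    if len(raw) < 16:
        raise ValueError("build-id shorter than 16 bytes: %r" % build_id_hex)
    raw = raw[:16]           # truncate to the size of a GUID
    data1 = raw[0:4][::-1]   # 4-byte field, byte order reversed
    data2 = raw[4:6][::-1]   # 2-byte field, byte order reversed
    data3 = raw[6:8][::-1]   # 2-byte field, byte order reversed
    data4 = raw[8:16]        # remaining 8 bytes are kept in order
    return (data1 + data2 + data3 + data4).hex().upper()


# The example from the text, with arbitrary trailing bytes that get truncated.
example = build_id_to_build_uuid("33221100554477668899AABBCCDDEEFF01020304")
```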
The `.gnu_debuglink` section states which debug symbol file to use with a
stripped binary. For example, `readelf --string-dump=.gnu_debuglink chrome`
produces `chrome.debug`. This can be helpful to know for libraries with
an interesting debug symbol setup, like `libc.so.6`.

## Create Symbolized Stack

Given a minidump with the name `mini.dmp`, run

`minidump_stackwalk mini.dmp > mini.stackwalk.nosym`

This will produce a mostly unsymbolized summary of the crash. To symbolize, look
toward the bottom of the output for `WARNING: No symbols, <file>, <build-uuid>`.
For each `<file>` which is of interest, `mkdir -p symbols/<file>/<build-uuid>`
then `dump_syms <file> <directory-with-file.debug> >
symbols/<file>/<build-uuid>/<file>.sym`. Ensure this output `<file>.sym`
contains the expected `<build-uuid>`. Then re-run `minidump_stackwalk` with the
symbols directory, like `minidump_stackwalk mini.dmp symbols/ >
mini.stackwalk`.

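Collecting the warnings and working out where each `.sym` file belongs is easy
to script. The sketch below assumes the warning text has exactly the shape
quoted above (`WARNING: No symbols, <file>, <build-uuid>`, possibly surrounded
by other characters on the line). The names `missing_symbols` and `sym_path`
are hypothetical helpers, and running `dump_syms` itself is left to the reader.

```python
import re
from pathlib import PurePosixPath

# Assumed shape, taken from the text: "WARNING: No symbols, <file>, <build-uuid>".
_NO_SYMBOLS = re.compile(
    r"WARNING: No symbols, (?P<file>[^,]+), (?P<uuid>[0-9A-Fa-f]+)"
)


def missing_symbols(stackwalk_output):
    """Return unique (file, build_uuid) pairs in order of first appearance."""
    seen = []
    for line in stackwalk_output.splitlines():
        match = _NO_SYMBOLS.search(line)
        if match:
            pair = (match.group("file"), match.group("uuid"))
            if pair not in seen:
                seen.append(pair)
    return seen


def sym_path(symbols_dir, file_name, build_uuid):
    """Return symbols/<file>/<build-uuid>/<file>.sym for one module."""
    return PurePosixPath(symbols_dir) / file_name / build_uuid / (file_name + ".sym")
```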
The output of `minidump_stackwalk` is often quite useful and enough to track
down many issues. However, it does not use all of the information from DWARF,
so it is sometimes possible to get much better stack traces from a full
debugger like gdb. This is particularly true when functions have been
aggressively inlined.

## Create Core File

`minidump-2-core mini.dmp > mini.core`

## Loading into GDB

This works best if the binaries, symbols, and core files are all in different
directories to prevent gdb from automatically loading them into the wrong
locations. This is also generally necessary when using a system installed
version of Chrome. For full details see [the gdb
manual](https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html).
The easiest way is to rename and move the `.debug` files to a directory
structure like `<debugdir>/debug/.build-id/nn/nnnnnnnn.debug` where `nn` are the
first two hex characters of the `build-id`, and `nnnnnnnn` are the rest of the
hex characters of the `build-id`. Note that this `build-id` is exactly what is
reported by `readelf -n <binary> | grep "Build ID"` and not the `build-uuid`
used by Breakpad. Then in gdb use `show debug-file-directory` to get the
`<previous-directories>` and `set debug-file-directory
<previous-directories>:<debugdir>/debug`.

The `offset`s used in the commands below are the offsets of the corresponding
module from the output of `minidump_stackwalk` or (equivalently) the value of
`ModuleListStream::Modules[]::BaseOfImage` from the minidump file (which can be
read with the structure definition).

```
$ gdb
(gdb) file <executable>
(gdb) show debug-file-directory
<previous directories>
(gdb) set debug-file-directory <previous directories>:<debugdir>/debug
(gdb) symbol-file <executable> -o <executable-offset>
(gdb) core-file <mini.core>
```

Running the commands in this order avoids needing to load the symbols twice and
maps the `<executable>` to the expected location.

To add an additional shared library, use
`(gdb) add-symbol-file <shared-library> -o <shared-library-offset>`

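When the same dump is loaded repeatedly, it can be convenient to generate these
commands and save them to a file that gdb reads with `gdb -x <file>`. The
following sketch only assembles the command lines shown above, in the same
order. The function name `gdb_commands` and its parameters are hypothetical,
the caller is assumed to pass the complete `debug-file-directory` value
(previous directories included), and paths containing spaces are not handled.

```python
def gdb_commands(executable, executable_offset, core,
                 debug_file_directory=None, shared_libraries=()):
    """Return the gdb commands from this section as a list of lines."""
    lines = ["file %s" % executable]
    if debug_file_directory:
        # Expected to already be "<previous directories>:<debugdir>/debug".
        lines.append("set debug-file-directory %s" % debug_file_directory)
    lines.append("symbol-file %s -o %#x" % (executable, executable_offset))
    lines.append("core-file %s" % core)
    for path, offset in shared_libraries:
        # Each entry is a (shared-library path, load offset) pair.
        lines.append("add-symbol-file %s -o %#x" % (path, offset))
    return lines
```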
Source paths in Chrome builds are relative to the `out/<build>` directory. If you
have a Chromium checkout at or around when the Chrome build was created, it can
be added to the debugger search path, like

```
(gdb) directory <path-to-chromium>/chromium/src/out/<build>/
```

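The `.build-id` directory layout described at the start of this section can be
computed from the `readelf` build-id with a small helper. This is a sketch
under the naming rule given there (`nn` is the first two hex characters, the
file name is the remainder plus `.debug`). The function name
`build_id_debug_path` is hypothetical, and lowercasing the hex is an assumption
based on how `readelf` prints the `Build ID`.

```python
from pathlib import PurePosixPath


def build_id_debug_path(debug_dir, build_id):
    """Return <debugdir>/debug/.build-id/nn/nnnnnnnn.debug for a readelf build-id."""
    build_id = build_id.strip().lower()
    if len(build_id) < 3 or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError("not a hex build-id: %r" % build_id)
    return (
        PurePosixPath(debug_dir)
        / "debug"
        / ".build-id"
        / build_id[:2]
        / (build_id[2:] + ".debug")
    )
```

Copying `chrome.debug` to the returned path and adding `<debugdir>/debug` to
`debug-file-directory` should then let gdb find the symbols by build-id.
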