ubuntu-dev: reduce bloat#162
Conversation
|
Don't merge yet. We seems to have depended on recommended packages. |
jankeromnes
left a comment
There was a problem hiding this comment.
One comment in passing (will wait for us to be sure about this before approving or merging).
| && apt-get update -q \ | ||
| && apt-get upgrade -qy \ | ||
| && apt-get install -qy \ | ||
| --no-install-recommends \ |
There was a problem hiding this comment.
Nit: I'd prefer apt-get install options that aren't packages to be on the same line as apt-get install (just like -qy).
b602ed3 to
1a7550b
Compare
|
Note: dropped wget. We can re-add if it's widely used (personally I don't because it's not popular in Arch Linux). Edit: seems it broke some child dockerfiles. I will re-add and revert the curl changes. |
1a7550b to
11d4cfe
Compare
|
Pushed some other changes. If you don't like them I will swap them out. |
1e47c1c to
6fd5bbe
Compare
There was a problem hiding this comment.
Thanks a lot for fighting bloat! It's a thankless job that benefits everyone without much celebration.
I'm pretty excited about some of these improvements. However, in general, do you have data on how much bloat you're saving? (Is it just about image size? How many GB does this save?)
Also, philosophically ubuntu-dev is supposed to be the biggest image possible, because every other image and container will be based on it, so if everything possible is already included once in ubuntu-dev, it won't be included N_images + N_containers times in various images and containers. This is why I'm not too keen on aggressively deleting stuff from ubuntu-dev (I'm afraid they'll get re-added down-the-line in a more disk-space-heavy way).
| && gclient sync --delete --jobs=18 \ | ||
| && ninja -C out/Default chrome -j18 | ||
| && gclient sync --delete --jobs=`nproc` \ | ||
| && ninja -C out/Default chrome -j`nproc` |
There was a problem hiding this comment.
Cool trick! I didn't know about nproc. (Also, I read somewhere that the ideal -j would be nproc + 2 or similar, but I'm not sure and it shouldn't make much of a difference, so go for nproc! 👍)
| # One-line setup command from: | ||
| # https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Linux_Prerequisites#Most_Distros_-_One_Line_Bootstrap_Command | ||
| RUN sudo apt-get update -q \ | ||
| RUN sudo apt-get update \ |
There was a problem hiding this comment.
Nit: Why remove the -q or -qq on every apt-get update?
There was a problem hiding this comment.
It's not needed and in the Docker library images it's never used. Often when an error occurs it would be more useful if we had the context (which is suppressed with -q).
| build-essential \ | ||
| cmake \ | ||
| python-virtualenv \ | ||
| python-pip \ |
There was a problem hiding this comment.
Thanks for removing these packages! However, I guess our comment # Packages are from https://github.com/servo/servo/blob/master/README.md#on-debian-based-linuxes is no longer true (i.e. when we'll want to update this package list, it will look like Servo introduced the new dependencies g++ build-essential cmake python-virtualenv python-pip, which is unfortunate).
What is the effect of doing a apt-get install of a package that's already installed? If it's essentially a no-op, please keep the list in sync with Servo's README.md.
There was a problem hiding this comment.
That's a good point. Yes, apt-get install on installed packages is a noop. I will revert it back.
| x11vnc \ | ||
| xmlto \ | ||
| xvfb \ | ||
| && rm -rf /var/lib/apt/lists/* \ |
There was a problem hiding this comment.
What is the exact effect of systematically adding rm -rf /var/lib/apt/lists/* to our Dockerfiles? Won't this cause user-triggered sudo apt-get updates to be slower, and to consume disk space outside of any Copy-on-Write?
Speaking of disk space, how much space does rm -rf /var/lib/apt/lists/* actually save? (If it's say > 8 GB then I find it useful, but if it's below that, I consider the bloat to be negligible, whereas forcing every image to do rm -rf /var/lib/apt/lists/* is fragile and cumbersome.)
There was a problem hiding this comment.
Each Dockerfile RUN record creates a layer; in this case, as long as you always run the rm line every time you do an apt-get, the apt lists will never end up in an image layer.
| fluxbox \ | ||
| gdb \ | ||
| gettext \ | ||
| git \ |
There was a problem hiding this comment.
Thanks for converting our Git build to a PPA! But are you 100% certain that:
- The PPA
gitwill always be the latest stable (e.g.2.16.2today)? - The PPA will also set up
git-completion.bashandgit-prompt.shfor every user session? (We used to source them in~/.bashrc) - The resulting Git prompt will forcibly include colors?
If not, please address these nits.
There was a problem hiding this comment.
According to the description, it's backed by Ubuntu Git maintainers and claims to have the latest version. Thus, it should have the latest stable packaged rapidly.
The completion files will be installed globally, but the prompt doesn't get modified. I will take a look into this.
| valgrind \ | ||
| wget \ | ||
| x11vnc \ | ||
| xmlto \ |
| # Install basic development packages. | ||
| RUN __LLVM_VERSION__="6.0" \ | ||
| && apt-get update -q \ | ||
| && apt-get upgrade -qy \ |
There was a problem hiding this comment.
Why not upgrade packages from ubuntu-dev while we're at it?
There was a problem hiding this comment.
Upgrading packages will result in space overhead due to CoW. The ubuntu base image is normally kept almost updated, so there's no reason to hurry.
| ENV PATH="${PATH}:/home/user/.cargo/bin" | ||
| RUN rustup install nightly \ | ||
| && rustup completions bash | sudo tee /etc/bash_completion.d/rustup.bash-completion | ||
| && rustup completions bash | sudo tee /etc/bash_completion.d/rustup.bash-completion > /dev/null |
There was a problem hiding this comment.
Because this doesn't contribute to diagnostics and makes log less readable.
ishitatsuyuki
left a comment
There was a problem hiding this comment.
This reduces the main package set from 3.1GB -> 1.7GB, which is primarily due to we have eliminated TeX Live to be included (the full suite takes over 1 gigabyte).
Disabling the recommended packages is a common practice in Docker world, and we have ran CI to ensure that the basic build passes (I've added those missing ones explicitly). This may accidentally reduce the set of available packages, but I believe the impact is small as the primary thing we removed is TeX Live.
| fluxbox \ | ||
| gdb \ | ||
| gettext \ | ||
| git \ |
There was a problem hiding this comment.
According to the description, it's backed by Ubuntu Git maintainers and claims to have the latest version. Thus, it should have the latest stable packaged rapidly.
The completion files will be installed globally, but the prompt doesn't get modified. I will take a look into this.
| ENV PATH="${PATH}:/home/user/.cargo/bin" | ||
| RUN rustup install nightly \ | ||
| && rustup completions bash | sudo tee /etc/bash_completion.d/rustup.bash-completion | ||
| && rustup completions bash | sudo tee /etc/bash_completion.d/rustup.bash-completion > /dev/null |
There was a problem hiding this comment.
Because this doesn't contribute to diagnostics and makes log less readable.
| # Install basic development packages. | ||
| RUN __LLVM_VERSION__="6.0" \ | ||
| && apt-get update -q \ | ||
| && apt-get upgrade -qy \ |
There was a problem hiding this comment.
Upgrading packages will result in space overhead due to CoW. The ubuntu base image is normally kept almost updated, so there's no reason to hurry.
| build-essential \ | ||
| cmake \ | ||
| python-virtualenv \ | ||
| python-pip \ |
There was a problem hiding this comment.
That's a good point. Yes, apt-get install on installed packages is a noop. I will revert it back.
| # One-line setup command from: | ||
| # https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Linux_Prerequisites#Most_Distros_-_One_Line_Bootstrap_Command | ||
| RUN sudo apt-get update -q \ | ||
| RUN sudo apt-get update \ |
There was a problem hiding this comment.
It's not needed and in the Docker library images it's never used. Often when an error occurs it would be more useful if we had the context (which is suppressed with -q).
6fd5bbe to
5b8cae0
Compare
jankeromnes
left a comment
There was a problem hiding this comment.
Ok, I'm starting to really like this change! I'm mostly OK with the current state, but I just have a few minor questions.
This reduces the main package set from 3.1GB -> 1.7GB, which is primarily due to we have eliminated TeX Live to be included (the full suite takes over 1 gigabyte).
That's excellent, thanks! I suppose no developer will ever need TeX Live on Janitor, so it probably makes sense to remove it from ubuntu-dev, and generally shrink it from 6.6GB down to 5.2GB (a 21% save).
One question though, if we completely remove /var/lib/apt/lists/* from all our image layers, won't user-triggered apt-get update commands on Janitor run somewhat slower and use more disk space than if our images already had some copy-on-write cache? (It's not a super-frequent or crucial use case, but I'm curious.)
Disabling the recommended packages is a common practice in Docker world, and we have ran CI to ensure that the basic build passes (I've added those missing ones explicitly).
Good thinking! A few other dependent images to be aware of:
- Everything under janitortechnology/* (notably
discourseanddspace) - bnjbvr/kresus-janitor
- chocobozzz/peertube-dev
(Note that they might still be based on janx/ubuntu-dev, which we don't update anymore, so your change won't break them until they upgrade.)
xmlto
Why is this needed?
Just curious.
| libegl1-mesa-dev \ | ||
| libdbus-1-dev \ | ||
| libpulse-dev \ | ||
| libharfbuzz-dev \ |
There was a problem hiding this comment.
Cool drive-by update, thanks!
| # Add source for the latest Mercurial packages. | ||
| RUN echo "deb http://ppa.launchpad.net/mercurial-ppa/releases/ubuntu xenial main" > /etc/apt/sources.list.d/mercurial.list \ | ||
| && apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 41BD8711B1F0EC2B0D85B91CF59CE3A8323293EE | ||
| RUN add-apt-repository ppa:mercurial-ppa/releases |
There was a problem hiding this comment.
Note that this makes us a little more vulnerable to malicious key changes (e.g. if the PPA gets hacked).
|
Please also rebase your commits. After that I'll approve this pull request and we can merge it and start saving disk space and network transfer! |
asciidoc drags in the whole texlive suite, which weighs at about 500MB. apt repositories should be also cleaned, while the impact is not large.
And refactor apt sources handling
Excessive parallelism can be slow due to more context switch, memory requirements and IO.
5b8cae0 to
0b7d3c0
Compare
|
Removed |
jankeromnes
left a comment
There was a problem hiding this comment.
Looks good to me, thanks!
Except for dspace.dockerfile, where it looks like you reverted recent changes. Please fix that and then we can merge.
| @@ -1,29 +1,45 @@ | |||
| FROM janitortechnology/ubuntu-dev | |||
| FROM janx/ubuntu-dev | |||
There was a problem hiding this comment.
Oops, this doesn't look right.
fdeb063 to
5764e7b
Compare
jankeromnes
left a comment
There was a problem hiding this comment.
Looks good to me! Thanks for the great work! 🥇 💯 👍 🎉 🎆
asciidoc drags in the whole texlive suite, which weighs at about 500MB. apt repositories should be also cleaned, while the impact is not large.