In This Tutorial, You Will Learn:
- How to structure Dockerfiles efficiently for faster builds
- Best practices for creating lightweight and secure Docker images
- Common Dockerfile errors and how to troubleshoot them
- Techniques to optimize multi-stage builds

Software Requirements and Linux Command Line Conventions
| Category | Requirements, Conventions, or Software Version Used |
|---|---|
| System | Any Linux distribution |
| Software | Docker Engine (20.10.x or newer) |
| Other | Basic understanding of Docker concepts |
| Conventions | `#` – requires commands to be executed with root privileges, either directly as root or using `sudo`; `$` – requires commands to be executed as a regular non-privileged user |
Understanding Dockerfile Basics and Best Practices
WHAT IS A DOCKERFILE?
A Dockerfile is a text document containing instructions to build a Docker image. Each instruction creates a layer in the image, which can affect build time, image size, and security. Optimizing these instructions is key to efficient containerization.
Creating efficient Dockerfiles is essential for developing containerized applications that are lightweight, secure, and fast to build. The way you structure your Dockerfile directly impacts your development workflow and production deployment. Let’s explore how to build better Dockerfiles and avoid common mistakes.
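As a minimal illustration of the layer model, each filesystem-changing instruction below produces one image layer (the script name is illustrative):

```dockerfile
FROM alpine:3.18                      # base layers pulled from the registry
COPY app.sh /usr/local/bin/app.sh     # layer containing the copied file
RUN chmod +x /usr/local/bin/app.sh    # layer recording the permission change
CMD ["app.sh"]                        # metadata only; adds no filesystem layer
```

Even the `chmod` here creates a full layer, which is why later sections recommend combining related commands.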
Step-by-Step Instructions
- Use specific base images: Start with a minimal, specific base image
```dockerfile
FROM node:18-alpine
```
Always use specific version tags rather than `latest` to ensure reproducible builds. Alpine-based images are significantly smaller than their Debian/Ubuntu counterparts. The more specific your tag, the better for consistency and security updates.
- Order instructions by change frequency: Place instructions that change least at the top
```dockerfile
FROM node:18-alpine

# Tools that rarely change
RUN apk add --no-cache python3 make g++

# Dependencies that change occasionally
COPY package*.json ./
RUN npm ci

# Application code that changes frequently
COPY . .
```
Docker’s build cache invalidates all subsequent layers when a layer changes. By placing more stable instructions at the top, you maximize cache usage and minimize rebuild time. This will significantly speed up your development workflow.
- Combine related commands: Use `&&` to chain commands and reduce layers
```dockerfile
# Bad practice (creates 3 layers)
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*

# Good practice (creates 1 layer)
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
```
Each `RUN` instruction creates a new layer. Combining related commands reduces image size and improves build performance. Always clean up package manager caches to keep images small.
- Use a .dockerignore file: Exclude unnecessary files from the build context
```shell
$ cat .dockerignore
node_modules
npm-debug.log
Dockerfile
.git
.gitignore
README.md
```
A `.dockerignore` file works like `.gitignore`, preventing specified files from being sent to the Docker daemon during the build. This speeds up builds and prevents sensitive files from being included in your image.
- Implement multi-stage builds: Separate build and runtime environments
```dockerfile
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
CMD ["npm", "start"]
```
Multi-stage builds let you use one image for building (with all build tools) and another for running your application. This results in significantly smaller production images and improved security by not including build tools in the final image.
In the example above:
- The first stage, named `builder`, uses a full Node.js image which includes all build tools
- We install dependencies and build the application in this first stage
- The second stage starts fresh with a minimal Alpine-based image
- Using `COPY --from=builder`, we selectively copy only the build artifacts and runtime dependencies
- Everything else from the build stage is discarded, including `node_modules` with dev dependencies, source code, and build tools
Multi-stage builds are particularly valuable for compiled languages like Go, Rust, or Java, where the final binary can be copied to a minimal image. For example, a Go application might use:
```dockerfile
FROM golang:1.20 AS builder
WORKDIR /app
COPY go.* ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server

FROM alpine:3.18
RUN apk --no-cache add ca-certificates
COPY --from=builder /app/server /usr/local/bin/
CMD ["server"]
```
This approach can reduce image sizes by up to 99% in some cases (from 1GB+ to ~10MB). You can even use more than two stages when you need separate phases for testing, security scanning, or generating different artifacts.
- Set appropriate user permissions: Avoid running containers as root
```dockerfile
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
```
Running containers as root is a security risk. Create a non-privileged user and switch to it before running your application. This limits the potential damage if your container is compromised.
- Use ENTRYPOINT and CMD correctly: Understand their differences
```dockerfile
# For applications
ENTRYPOINT ["node", "app.js"]
CMD ["--production"]

# For utilities
ENTRYPOINT ["aws"]
CMD ["--help"]
```
`ENTRYPOINT` defines the executable that runs when the container starts, while `CMD` provides default arguments to that executable. Using them together makes your containers more flexible and user-friendly.
- Diagnose common errors: Understand build failures
```shell
$ docker build -t myapp .
```
Common build errors include:
- Base image not found: Verify the base image exists and you have proper access
- COPY/ADD failures: Ensure source paths exist and are correctly specified
- RUN command failures: Run the commands locally to debug, or use `docker build --progress=plain` for verbose output
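Taken together, the steps above might combine into a single Dockerfile along the lines of the sketch below. This is a hedged example: the project layout, the `dist/index.js` entry point, and the `appuser`/`appgroup` names are assumptions, not requirements.

```dockerfile
# Build stage: full toolchain, instructions ordered so stable layers cache well
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Runtime stage: minimal image, non-root user, only build artifacts
FROM node:18-alpine
WORKDIR /app
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
USER appuser
ENTRYPOINT ["node", "dist/index.js"]
```

Pair this with a `.dockerignore` excluding `node_modules` and `.git` so the build context stays small.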
OPTIMIZING DOCKER BUILD PERFORMANCE
When working with large applications, consider these additional optimizations. Use BuildKit by setting `DOCKER_BUILDKIT=1` before your build commands. Leverage build caching with `--cache-from` in CI/CD pipelines. For Node.js applications, use `npm ci` instead of `npm install` for faster, more reliable builds. Consider Docker layer caching services like BuildJet for CI/CD pipelines. These techniques can reduce build times by up to 80% for complex applications.
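In a CI/CD pipeline, the BuildKit and caching techniques above might be combined as in the session below. The registry path and image name are illustrative; note that for `--cache-from` to work with BuildKit, the cached image must have been built with inline cache metadata enabled.

```shell
# Enable BuildKit for this shell session
$ export DOCKER_BUILDKIT=1

# Pull the previous image so its layers can seed the cache
# (|| true keeps the first-ever build from failing when no image exists yet)
$ docker pull registry.example.com/myapp:latest || true

# Build, embedding inline cache metadata and reusing layers from the pulled image
$ docker build \
    --build-arg BUILDKIT_INLINE_CACHE=1 \
    --cache-from registry.example.com/myapp:latest \
    -t registry.example.com/myapp:latest .
```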
Conclusion
Mastering Dockerfile best practices helps create efficient, secure, and maintainable container images. By organizing instructions based on change frequency, combining related commands, implementing multi-stage builds, and following security practices, you can significantly improve your Docker workflow. Remember that optimizing Dockerfiles is an ongoing process—continuously monitor image sizes and build times, and refine your approach as your application evolves. With these practices, you’ll avoid common pitfalls and build Docker images that are both developer-friendly and production-ready.
Frequently Asked Questions (FAQ)
- Why is my Docker image so large?
  Large Docker images typically result from: using bulky base images (consider Alpine alternatives), not cleaning up package manager caches, including unnecessary build tools in the final image, or forgetting to use multi-stage builds. Use `docker history <image>` to see which layers contribute most to image size, and consider tools like `dive` for deeper analysis.
- How can I speed up my Docker builds?
  Optimize build speed by: using BuildKit, organizing Dockerfile instructions by change frequency, implementing layer caching, using a .dockerignore file to reduce the build context, and employing multi-stage builds. For CI/CD pipelines, consider caching strategies and parallel builds for microservices architectures.
- What's the difference between ADD and COPY in Dockerfiles?
  While both commands add files to your image, `ADD` has additional features: it can extract local tar archives and download files from URLs. However, `COPY` is preferred for simple file copying as it is more explicit. Use `ADD` only when you specifically need its extra functionality.
- Should I use multiple RUN instructions or chain commands?
  Generally, chain related commands within a single `RUN` instruction using `&&` to reduce the number of layers and the image size. However, during development, separate `RUN` instructions can improve build cache utilization. For production Dockerfiles, consolidate commands that are logically related (like package installation and cleanup).
- How do I debug a failing Docker build?
  To debug failing builds: use `docker build --progress=plain` for verbose output, build up to the failing instruction with `docker build --target=<stage>` for multi-stage builds, run a container from the last successful layer with `docker run -it <image_id> sh` to test commands interactively, or add `RUN ls -la` commands strategically to check file existence and permissions.
- Is it safe to use the latest tag in production?
  Using `latest` tags in production is strongly discouraged. They make builds non-reproducible and can break your application when upstream images change. Always use specific version tags (like `node:18.12.1-alpine`) for production environments. Consider implementing a strategy to regularly update and test newer versions while maintaining control over exactly what gets deployed.
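The image-size analysis mentioned in the first question might look like the session below; the image name is illustrative, and `dive` must be installed separately.

```shell
# List each layer's size alongside the instruction that created it
$ docker history --format "{{.Size}}\t{{.CreatedBy}}" myapp:latest

# Explore layers interactively, highlighting wasted space
$ dive myapp:latest
```

Layers created by `RUN` steps that install packages without cleaning caches typically stand out immediately in this output.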