Best practices for Docker in production

Episode #323, published Sat, Jul 3, 2021, recorded Mon, Jun 14, 2021

You've got your Python API or app running in a Docker container. Great! Are you ready to ship it to that hosted cluster service and head off to production? Not so fast. Have you considered how you'll manage evolving dependencies and address security updates over time? Not just for the base OS, but for installed system packages? How about your pip-installed dependencies? Are you running as root? If you don't know, the answer is yes.

We'll discuss these issues and many more with Itamar Turner-Trauring on this episode.

Episode Deep Dive

Guest introduction and background

Itamar Turner-Trauring is a seasoned Python developer and author who focuses on production-ready Docker packaging and Python performance optimizations. He created the Fil memory profiler for Python and has written extensively on Docker packaging best practices at pythonspeed.com. In this episode, Itamar shares a wealth of knowledge on deploying Python code in Docker—from ensuring better security, to managing dependencies, to making builds both smaller and faster.

What to Know If You're New to Python

If you’re new to Python, here are some basics to help you follow along with this episode:

  • Familiarity with Python’s virtual environments and pip install concepts is helpful when discussing Docker’s build stages.
  • Basic understanding of how Python packages (like Django or NumPy) can be installed will help clarify why pinned dependencies matter.
  • Some awareness of the command line and Linux file systems can be useful, as Docker images often rely on Debian-based or Alpine-based distributions.

Key points and takeaways

  1. Docker for consistent deployment
    Docker containers let you bundle up Python, system libraries, and your own code into a single artifact to run anywhere. This consistency reduces the "it works on my machine" issues by giving all developers—and production—the exact same environment.

  2. Security updates and forced rebuilds
    Simply rebuilding from the same Dockerfile doesn’t guarantee new OS-level patches get applied: because Docker caches layers, an unchanged build step is reused as-is. You must periodically force a fresh, cache-free build to pull in security fixes and OS updates. A scheduled rebuild (e.g., daily or weekly) helps keep images secure and up to date; see the sketch after this list.

  3. Don’t run your container as root
    By default, Docker containers run as root, which creates unnecessary security risk if the container is compromised. It’s straightforward to switch to a non-root user (e.g., a RUN adduser step plus a USER instruction in the Dockerfile), which limits the damage an attacker can do; see the sketch after this list.

  4. Container layering and caching
    Docker builds are split into layers, and each build step can be cached. While caching dramatically speeds up incremental builds, it can also silently prevent new patches or updates from being installed. Carefully ordering your Dockerfile (for example, copying requirements.txt before your application code) keeps the slow dependency layers cached while code changes rebuild quickly; see the sketch after this list.

  5. Alpine vs. Debian-based images
    Alpine Linux is popular for being extremely small. However, it uses a different C standard library (musl), so the prebuilt Python wheels on PyPI, which generally assume glibc, often won’t install, and packages must be compiled from source. For many Python projects (especially data science ones), this makes builds slow and images surprisingly large. Debian-based slim images (e.g., python:3.9-slim-buster) are usually more practical.

  6. Iterative approach to Dockerizing
    Itamar stresses treating Docker packaging as a process. Start with something that works, then layer on security best practices, continuous integration, correct version pinning, and eventually performance optimizations (like multi-stage builds or caching). That way, you can stop at any point and still have a working foundation.

  7. CI/CD builds for each commit or pull request
    Automating Docker builds in your CI/CD pipeline gives each pull request or feature branch its own Docker image. This preserves the stability of your main image tag and makes it easy to test, or even deploy, ephemeral versions of your app; see the sketch after this list.

  8. Version pinning and dependency updates
    There’s a balance between locking dependencies too tightly and grabbing the latest versions automatically. Pinning prevents random breakage, but you then need an ongoing strategy (like Dependabot or scheduled checks) to pick up security updates and new releases; see the pip-tools sketch after this list.

  9. Precompiling .pyc for faster startup
    By default, .pyc files (compiled Python bytecode) are generated on first import, and a container’s writable layer is discarded when it exits, so that work is redone on every fresh start. Precompiling them during the Docker build bakes the bytecode into the image and can speed up container startup, particularly for short-lived or serverless workloads; see the sketch after this list.

  10. Docker Compose for local development
    While the episode focuses on production concerns, Docker Compose is invaluable for spinning up dev environments that mirror production. It lets you run Postgres, Redis, or other services as separate containers with minimal fuss; see the sketch after this list.
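
To make the forced-rebuild point concrete, here is a minimal sketch of a scheduled rebuild command; myapp is a placeholder image name, and both flags are standard docker build options:

```bash
# run from cron or CI on a schedule (e.g., weekly):
# --pull re-downloads the base image; --no-cache ignores cached layers,
# so OS packages and pip dependencies get reinstalled with current patches
docker build --pull --no-cache -t myapp:latest .
```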
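
For the non-root user point, a minimal Dockerfile sketch assuming a Debian-based image (the user name appuser and the app.py entry point are illustrative):

```dockerfile
FROM python:3.9-slim-buster
COPY . /app
# create an unprivileged account; --gecos "" skips the interactive prompts
RUN adduser --disabled-password --gecos "" appuser
# everything from here on, including the running app, is non-root
USER appuser
CMD ["python", "/app/app.py"]
```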
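
For the layer-caching point, a sketch of the dependency-first ordering (the file names are common defaults, not prescribed in the episode):

```dockerfile
FROM python:3.9-slim-buster
WORKDIR /app
# dependencies change rarely: copying only requirements.txt first means this
# layer and the pip install below stay cached across ordinary code edits
COPY requirements.txt .
RUN pip install -r requirements.txt
# application code changes often: only the layers from here down rebuild
COPY . .
CMD ["python", "app.py"]
```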
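
For the CI/CD point, one common way to give every commit or pull request its own image is to tag with the commit hash; myapp is a placeholder, and most CI systems expose the SHA as an environment variable rather than requiring git:

```bash
# tag with the short commit hash so branch builds never overwrite
# the main tag; push to make ephemeral test deployments possible
SHA="$(git rev-parse --short HEAD)"
docker build -t myapp:"$SHA" .
docker push myapp:"$SHA"
```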
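
For the version-pinning point, a sketch using pip-tools, one common approach (the episode's broader point is that any pinning scheme needs a scheduled update process):

```bash
# requirements.in lists only your direct, loosely versioned dependencies
pip install pip-tools
pip-compile requirements.in            # writes a fully pinned requirements.txt
pip-compile --upgrade requirements.in  # run on a schedule to refresh the pins
```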
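
For the bytecode point, a sketch of precompiling during the build (paths are illustrative; pip already byte-compiles the packages it installs, so this mainly matters for your own code):

```dockerfile
FROM python:3.9-slim-buster
COPY . /app
RUN pip install -r /app/requirements.txt
# bake .pyc files into the image so they aren't regenerated on every start
RUN python -m compileall /app
CMD ["python", "/app/app.py"]
```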
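
For the Docker Compose point, a minimal docker-compose.yml sketch for local development (service names, the port, and the Postgres password are illustrative):

```yaml
services:
  web:
    build: .                # build the app image from the local Dockerfile
    ports:
      - "8000:8000"
    depends_on:
      - db
  db:
    image: postgres:13      # run Postgres as a sibling container
    environment:
      POSTGRES_PASSWORD: devpassword
```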

Interesting quotes and stories

"I don’t like Docker packaging. It’s not a thing I’m doing because it’s fun, it’s just extremely useful." — Itamar Turner-Trauring

"If you’re running as root in your container, the answer is ‘yes, you’re running as root.’ And that’s probably not what you want in production." — Michael Kennedy

"You have to set up these ongoing processes for things like security updates and dependency updates. It’s not just a one-off thing." — Itamar Turner-Trauring

Key definitions and terms

  • Layer Caching: A feature in Docker builds that reuses intermediate steps if they haven’t changed, speeding up repeated builds.
  • Multi-stage Builds: A Docker technique for building your app in one stage (with compilers, etc.) and copying only the final artifacts into a smaller runtime image; see the sketch after this list.
  • Musl vs. Glibc: Musl is a lightweight C standard library used in Alpine Linux; Glibc is the more common library in Debian/Ubuntu-based images. Many precompiled Python wheels assume Glibc, causing complications on Alpine.
  • Immutable Artifacts: The practice of treating container images as read-only snapshots. Once an image is built, you don’t edit it in place; you rebuild a new one.
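
A minimal multi-stage sketch, using a virtualenv to carry the installed packages between stages (the stage name build and the paths are illustrative):

```dockerfile
# build stage: can include compilers and other build-only tooling
FROM python:3.9-slim-buster AS build
RUN python -m venv /venv
COPY requirements.txt .
RUN /venv/bin/pip install -r requirements.txt

# runtime stage: start clean and copy in only the finished virtualenv
FROM python:3.9-slim-buster
COPY --from=build /venv /venv
COPY . /app
CMD ["/venv/bin/python", "/app/app.py"]
```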

Overall takeaway

Docker provides a robust, repeatable way to package and deploy Python apps to production. However, it requires a thoughtful process—from selecting the right base image to pinning dependencies to staying on top of security updates. The effort pays off with consistent environments, reproducible builds, and cleaner deployments. As you adopt these best practices, you’ll gain efficiency, stability, and confidence that your Python Docker containers are truly production-ready.

Links from the show

PyCon Talk: youtube.com
Docker packaging articles (code TALKPYTHON to get 15% off): pythonspeed.com
PSF+JetBrains 2020 Survey: jetbrains.com
Give me back my monolith article: craigkerstiens.com
TestContainers: github.com
Spacemacs: spacemacs.org
Rust bindings for Python: github.com
PyOxidizer: pyoxidizer.readthedocs.io
ahocorasick_rs: Quickly search for multiple substrings at once: github.com
Fil memory profiler: pythonspeed.com
Free ebook covering this process: pythonspeed.com

Talk Python Twilio + Flask course: talkpython.fm/twilio
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy
