
Python Performance for Data Science

Episode #474, published Mon, Aug 19, 2024, recorded Thu, Jul 18, 2024

Python performance has come a long way in recent years, and it's often data scientists, with their computational algorithms and large quantities of data, who care most about it. It's great to have Stan Seibert back on the show to talk about Python performance for data science. We cover a wide range of tools and techniques that will be valuable for many Python developers and data scientists.

Watch this episode on YouTube
Watch the live stream version

Episode Deep Dive

Guests Introduction and Background

Stan Seibert is a manager at Anaconda, where he oversees open-source development teams working on projects such as Numba, Jupyter, and other tools that push the boundaries of Python’s performance. With a deep background in physics and scientific computing, Stan has spent nearly a decade focusing on accelerating Python for data scientists and developers. His work at Anaconda includes guiding the Numba team to help users speed up core numerical algorithms in Python while maintaining a highly productive, Pythonic workflow.

What to Know If You're New to Python

If you're just getting started with Python but want to follow the discussion about data science and performance, here are some targeted resources:

Key Points and Takeaways

  1. Numba for High-Performance Python

    • Numba is a JIT compiler that transforms Python functions into fast, machine-level code for numerical work. It excels at accelerating custom algorithms without forcing developers to rewrite them in C or another lower-level language; a minimal sketch appears after this list.
    • Links and Tools:
  2. Profiling and Identifying Bottlenecks

    • Before optimizing, measure your code’s performance with profiling tools. Surprisingly, the code you suspect is slow often turns out not to be the actual bottleneck once measured; see the profiling sketch after this list.
    • Links and Tools:
  3. Python 3.13 and the Copy-and-Patch JIT

    • The Faster CPython team introduced a new JIT approach in Python 3.13 known as copy-and-patch. Machine-code templates ("stencils") for common execution patterns are generated when CPython itself is built, then copied and patched at runtime, so LLVM is not required at runtime.
    • Links and Tools:
  4. “Free-Threaded” Python and the Future of the GIL

    • PEP 703 makes the global interpreter lock (GIL) optional, unlocking true multithreading in Python without falling back to subprocesses. While the free-threaded build is experimental for now, it could eventually reduce the need for workarounds in CPU-intensive parallel code; a small threading sketch follows this list.
    • Links and Tools:
  5. Comparing Numba to Cython

    • Unlike Cython, which generally requires adding type declarations, Numba infers types at runtime and offers a single decorator for compilation. Numba’s “nopython” mode compiles the whole function without falling back to the Python interpreter, keeping your focus on Pythonic code while unlocking C/Fortran-level speeds for numeric algorithms.
    • Links and Tools:
  6. GPU Acceleration with CUDA

    • Numba has mature support for NVIDIA CUDA GPUs. You can write Python code that gets compiled into GPU kernels, a powerful option for speeding up large-scale computations once you have validated correctness on the CPU; see the CUDA sketch after this list.
    • Links and Tools:
  7. Memory Profiling

    • Python’s dynamic nature can hide significant memory usage in temporary objects (especially arrays). Memory profilers help identify these hotspots, which is critical when scaling up data-intensive applications; a small memory-tracking sketch follows this list.
    • Links and Tools:
  8. Static Python Research (SPy)

    • Although very much R&D, there’s work exploring a “two-phase” Python, where dynamic magic like metaprogramming happens first, and then a static phase locks everything down for compiling. This could open up even more optimization potential in Python without losing flexibility.
    • Links and Tools:
      • PyPy (Related JIT research in Python)
      • (No official public link for SPy yet, but it’s discussed in PyCon talks)
  9. Other Mentioned Tools and Services
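
To make the Numba point concrete, here is a minimal sketch of the kind of plain-Python numerical kernel that the `@njit` decorator can compile. It assumes numba and numpy are installed; the function and data are made up for illustration, not taken from the episode.

```python
import numpy as np
from numba import njit  # njit is Numba's nopython-mode JIT decorator


@njit  # compiled to machine code on first call; argument types are inferred at runtime
def pairwise_mean_distance(points):
    # Plain Python loops over a NumPy array: slow in the interpreter,
    # but compiled by Numba to roughly C-like speed.
    n = points.shape[0]
    total = 0.0
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = 0.0
            for k in range(points.shape[1]):
                diff = points[i, k] - points[j, k]
                d += diff * diff
            total += np.sqrt(d)
            count += 1
    return total / count


points = np.random.default_rng(0).random((500, 3))
# The first call triggers compilation; subsequent calls reuse the cached machine code.
print(pairwise_mean_distance(points))
```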
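
For the "measure before you optimize" advice, here is one simple way to profile a script with the standard library's cProfile and pstats modules. The `build_table` workload is invented purely so something shows up in the profile.

```python
import cProfile
import pstats


def build_table(n):
    # Deliberately naive string building so the profile has an obvious hotspot.
    rows = []
    for i in range(n):
        rows.append(",".join(str(i * j) for j in range(50)))
    return "\n".join(rows)


def main():
    build_table(20_000)


if __name__ == "__main__":
    # Profile the whole run, then print the functions with the most cumulative time.
    profiler = cProfile.Profile()
    profiler.enable()
    main()
    profiler.disable()
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)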
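
A rough sketch of what free-threaded Python is aiming at: CPU-bound work spread across plain threads. On a standard (GIL-enabled) build the threads take turns; on an experimental free-threaded 3.13 build they can run in parallel. The `sys._is_gil_enabled()` check only exists on newer Pythons, hence the `getattr` guard; the prime-counting workload is illustrative.

```python
import sys
import threading
import time


def count_primes(limit):
    # CPU-bound work that only scales across threads when the GIL is disabled.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def worker(results, index, limit):
    results[index] = count_primes(limit)


if __name__ == "__main__":
    # Reports False on a free-threaded build, True on a standard build.
    gil = getattr(sys, "_is_gil_enabled", lambda: True)()
    print(f"GIL enabled: {gil}")

    results = [0] * 4
    start = time.perf_counter()
    threads = [threading.Thread(target=worker, args=(results, i, 200_000)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(sum(results), f"{time.perf_counter() - start:.2f}s")
```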
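
For the CUDA point, here is a minimal sketch of a Numba CUDA kernel. It assumes an NVIDIA GPU with working CUDA drivers plus numba and numpy installed; the kernel and parameters are invented for illustration.

```python
import numpy as np
from numba import cuda


@cuda.jit
def scale_and_offset(x, out, a, b):
    # Each GPU thread handles one element of the array.
    i = cuda.grid(1)
    if i < x.shape[0]:
        out[i] = a * x[i] + b


x = np.arange(1_000_000, dtype=np.float32)
out = np.empty_like(x)

threads_per_block = 256
blocks = (x.shape[0] + threads_per_block - 1) // threads_per_block
# Numba copies the NumPy arrays to the GPU, runs the kernel, and copies the result back.
scale_and_offset[blocks, threads_per_block](x, out, np.float32(2.0), np.float32(1.0))
print(out[:5])
```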
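
Memray and Fil are the memory profilers mentioned on the show; as a dependency-free illustration of the same idea, this sketch uses the standard library's tracemalloc to expose the peak memory cost of a hidden temporary array (recent NumPy versions report their array allocations to tracemalloc). The `normalize` function is made up for the example.

```python
import tracemalloc

import numpy as np


def normalize(values):
    # The intermediate (values - values.mean()) allocates a full-size temporary
    # array, which is easy to miss when reading the code.
    return (values - values.mean()) / values.std()


tracemalloc.start()
data = np.random.default_rng(0).random(10_000_000)
normalize(data)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"current: {current / 1e6:.1f} MB, peak: {peak / 1e6:.1f} MB")
```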

Interesting Quotes and Stories

  • “You don’t want to optimize the wrong thing. Measure before you do anything else.” – Stan Seibert, emphasizing the importance of profiling.
  • “Numba compiles two functions—a wrapper for the Python interpreter boundary and the no-Python core. Once you’re inside that core, it’s basically like writing C.” – Stan, describing how Numba keeps Python code feeling Pythonic.

Key Definitions and Terms

  • Numba: A Python JIT compiler that optimizes numerical functions for CPU and GPU execution.
  • Global Interpreter Lock (GIL): A mutex in CPython that prevents multiple threads from executing Python bytecode simultaneously.
  • CUDA: NVIDIA’s parallel computing platform for GPUs, usable from Python via libraries like Numba or CuPy.
  • Cython: A superset of Python that compiles to C for performance, but often requires explicit type declarations.
  • Profiling: The process of measuring which parts of code consume the most time or memory.

Learning Resources

Here are a few places to learn more about Python performance, data science, and best practices:

Overall Takeaway

Python’s readability and massive ecosystem can coexist with serious performance gains if you choose the right tools and techniques. From optimizing a small numerical function with Numba to experimenting with truly parallel, free-threaded builds of Python, there’s no shortage of ways to speed up your data science workflows. By combining solid profiling habits with specialized libraries (and keeping an eye on new developments in the Python community), you can retain Python’s flexibility while harnessing the power of modern hardware.

Links from the show

Stan on Twitter: @seibert
Anaconda: anaconda.com
High Performance Python with Numba training: learning.anaconda.cloud
PEP 703: peps.python.org
Python 3.13 gets a JIT: tonybaloney.github.io
Numba: numba.pydata.org
LanceDB: lancedb.com
Profiling tips: docs.python.org
Memray: github.com
Fil: a Python memory profiler for data scientists and scientists: pythonspeed.com
Rust: rust-lang.org
Granian Server: github.com
PIXIE at SciPy 2024: github.com
Free threading Progress: py-free-threading.github.io
Free Threading Compatibility: py-free-threading.github.io
caniuse.com: caniuse.com
SPy, presented at PyCon 2024: us.pycon.org
Watch this episode on YouTube: youtube.com
Episode transcripts: talkpython.fm

--- Stay in touch with us ---
Subscribe to Talk Python on YouTube: youtube.com
Talk Python on Bluesky: @talkpython.fm at bsky.app
Talk Python on Mastodon: talkpython
Michael on Bluesky: @mkennedy.codes at bsky.app
Michael on Mastodon: mkennedy
